DSL for shell scripting

May 11, 2020

You need to go through quite a lot of ceremony in most programming languages to run a subcommand. This post covers my design of a domain-specific language to solve this problem for Janet.

First, let’s compare a few simple tasks you might perform in a typical script across some existing languages and tools.

Run a command or abort

Shell:

set -e
git clone "$url"

Python:

subprocess.check_call(["git", "clone", url])

Go:

cmd := exec.Command("git", "clone", url)
err := cmd.Run()
if err != nil {
  panic(err)
}

Read the output of a command or abort

Shell:

set -e
out="$(git status)"

Python:

out = subprocess.check_output(["git", "status"])

Go:

cmd := exec.Command("git", "status")
out, err := cmd.Output()
if err != nil {
  panic(err)
}

Running a multicore pipeline and saving the result

Shell:

set -e
set -o pipefail # Non-standard bash extension
find "$dir" | sort | uniq | gzip > "$out"

Go and Python:

Exercise for the reader ;)

Just know it’s long and/or annoying to write unless you execute ‘sh -c’ to cheat. If you do use ‘sh -c’, remember to properly shell-escape ‘dir’.
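For a sense of what that exercise involves, here is a rough Python sketch that avoids ‘sh -c’. The helper name run_pipeline and its interface are hypothetical, chosen only for illustration; the subprocess calls themselves are standard library. Each stage is started with its stdin wired to the previous stage's stdout, and every exit code is checked afterwards, which roughly mirrors pipefail.

```python
import subprocess


def run_pipeline(commands, out_path):
    """Run argv lists as a pipeline and write the final stdout to out_path.

    run_pipeline is a hypothetical helper for illustration. It raises
    CalledProcessError if any stage exits non-zero, similar to pipefail.
    """
    procs = []
    prev_stdout = None
    with open(out_path, "wb") as out:
        for i, argv in enumerate(commands):
            is_last = i == len(commands) - 1
            proc = subprocess.Popen(
                argv,
                stdin=prev_stdout,
                stdout=out if is_last else subprocess.PIPE,
            )
            # Close our copy of the upstream pipe so only the child holds it.
            if prev_stdout is not None:
                prev_stdout.close()
            prev_stdout = proc.stdout
            procs.append(proc)
        # All stages run concurrently; wait for each and collect exit codes.
        codes = [p.wait() for p in procs]
    for argv, code in zip(commands, codes):
        if code != 0:
            raise subprocess.CalledProcessError(code, argv)
    return codes
```

With this helper the shell example would become something like run_pipeline([["find", directory], ["sort"], ["uniq"], ["gzip"]], out_path). Because arguments are passed as lists, no shell escaping is needed, but it is still a fair amount of code compared to one line of shell.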

Designing a DSL

Lucky for us, Janet is a ‘lisp-like’ language, so we can write extensions to the language (when it makes sense!) as a normal library.

The following is a summary of the implementation process.

Wrap posix_spawn

First I wrote a C wrapper around posix_spawn, the POSIX API for spawning processes: https://github.com/andrewchambers/janet-posix-spawn

The usage of this is about on par with other languages:

(posix-spawn/run ["ls"])

Notably, it supports dup2 file actions for I/O redirection.

# Redirect stdout into /dev/null
(posix-spawn/run ["ls"] :file-actions [[:dup2 (file/open "/dev/null") stdout]])

Decide on a nicer syntax

Key ideas behind my thought process:

Be close to shell in expressivity.
Have more robust error handling than shell.
Have seamless interop between Janet values and shell arguments.
Be idiomatic to Janet.

Running a process and getting its exit code

(sh/run ls)

Running a process and aborting on error

(sh/$ ls)

Using Janet variables as arguments

We use the Janet ‘unquote’ operator ‘,’ to switch back to Janet values.

(def dir "/")
(sh/$ ls ,dir)

For those who haven’t used a lisp before, the unquote operator ,RHS is shorthand for (unquote RHS):

i.e. (sh/$ ls ,dir) is equivalent to (sh/$ ls (unquote dir)).

It will then be simple for our DSL to handle this unquote action.

Splice a Janet array or tuple into command args

(def args ["a" "b" "c"])
(sh/$ echo ;args)

See the official documentation for the splice ; operator. The important part is that it functions exactly as it does in standard Janet code.

Running a process with output as a string

(def out (sh/$< ls))

Running a process with io redirection

(def buf (buffer/new 0))
(def errors (buffer/new 0))
# > value means redirect into that.
# > [a b] means redirect a into b.
(sh/$ ls > ,buf > [stderr errors])

Running a pipeline

(sh/$ tar | gzip)

Implement your ideas

A Janet macro lets us transform arbitrary Janet code into a new code form. I won’t go into too much detail on how to write Janet/lisp macros, but they are a fairly lightweight way to extend the syntax of the host language.

The core of the implementation problem is converting Janet-compatible symbols into a shell-like specification we can actually run. I do this using a simple state machine that loops over the macro's arguments and converts them into an array of posix-spawn :file-actions or command arguments.
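To make the state machine idea concrete, here is a small Python sketch of the same shape of loop. It is not the library's implementation: Unquote, Stage and parse_form are hypothetical names, only ‘|’ and ‘>’ are handled, and values are looked up at run time from a dictionary, whereas the real macro does its work at expansion time and leaves unquoted forms to be evaluated by Janet.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Unquote:
    """Stands in for a Janet (unquote x) form; holds a variable name."""
    name: str


@dataclass
class Stage:
    """One command in a pipeline: its arguments and its redirections."""
    args: list = field(default_factory=list)
    redirects: list = field(default_factory=list)


def parse_form(tokens, env):
    """Turn a flat token list into a list of pipeline stages.

    Plain strings act like quoted symbols, Unquote items are looked up in
    env, "|" starts a new stage and ">" sends the next item to redirects.
    """
    stages = [Stage()]
    state = "arg"
    for tok in tokens:
        value = env[tok.name] if isinstance(tok, Unquote) else tok
        if state == "redirect":
            # The token right after ">" is the redirection target.
            stages[-1].redirects.append((">", value))
            state = "arg"
        elif tok == "|":
            stages.append(Stage())
        elif tok == ">":
            state = "redirect"
        else:
            stages[-1].args.append(value)
    if state == "redirect":
        raise ValueError("redirection is missing its target")
    return stages
```

Note that operators are recognised from the literal token, not from the looked-up value, so an unquoted variable that happens to contain "|" is still passed through as an ordinary argument. That property is a large part of what makes this kind of DSL safer than building a shell string.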

The whole DSL is implemented in about 200 lines of code, see it here.

One trick to allow redirection to/from buffers is to first coerce them to anonymous file descriptors, then after command completion read the result back from these files.
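The same trick can be sketched in Python, assuming stdout is the stream being captured. run_into_buffer is a hypothetical helper name; tempfile.TemporaryFile gives an anonymous file on POSIX systems, the child process writes into it, and the contents are copied into the buffer once the command has finished.

```python
import subprocess
import tempfile


def run_into_buffer(argv, buf):
    """Run argv with stdout redirected into the bytearray buf.

    Hypothetical helper: the child writes to an anonymous temporary file,
    and the contents are copied into buf once the command has finished.
    """
    with tempfile.TemporaryFile() as tmp:
        code = subprocess.call(argv, stdout=tmp)
        # The child advanced the shared file offset, so rewind before reading.
        tmp.seek(0)
        buf.extend(tmp.read())
    return code
```

Going through a file rather than a pipe means the parent never has to read while the child is still running, so there is no risk of a deadlock on a full pipe, at the cost of the output landing on disk first.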

Here’s a quick peek under the hood, using the macex function in the Janet REPL to show us the result of macro expansion.

janet:3:> (macex '(sh/run tar -cf - ,dir | gzip))
(<function run*> @["tar" "-cf" "-" dir] @["gzip"])

We simply desugared the original form into a function call of run* with proper quotation.

Original examples

Now, let’s revisit the examples above, this time using our new DSL:

Run a command or abort

(sh/$ git clone ,url)

Read the output of a command or abort

(def out (sh/$< git status))

Running a multicore pipeline and saving the result

(with [out (file/open "some/path" :wb)]
  (sh/$ find ,dir | sort | uniq | gzip > ,out))

BONUS ROUND - Pattern match on exit codes

(def url "https://google.com")
(match (sh/run curl ,url | gzip | xxd)
  [0 0 0] (print "ok")
  [1 _ _] (print "curl failed"))

This doesn’t have a shell equivalent I am aware of :).

Conclusion

Overall I think mainstream languages could do a whole lot better with ‘scripting’ or executing subcommands. I hope this post encourages some programmers to experiment with how much better life can be.

One of the things I love about Janet is how easy it is to bounce between ‘scripting mode’ and the ‘serious programmer mode’ these other languages are stuck in. The ability to write our own DSL takes this further.

Also remember that great care must be taken when writing a DSL. You may just make a total mess nobody understands. The reader can be the judge as to whether this mini language is a good or bad DSL, and whether the code using it is more clear or less clear.

Feel free to give Janet or my library a try. Also, consider making a scripting DSL for your own favourite language.

Thank you for reading.