I wanted to make some advanced logic available, easily configurable via a database, in a couple of apps that I’ve been working on recently.
Honestly, I could have just stored the code in the database and eval’d it — but no, I don’t want to take the risk of arbitrarily executing code. I could have done some gymnastics like running it in a network-less container with defined inputs and outputs. I could have made the configuration more capable. What I decided to do in the end, though, was to write an extremely compact domain-specific language (DSL).
To write this DSL, I chose a Lisp-style syntax due to its dead-simple parsing. The basic idea is to parse the string for tokens, generate an abstract syntax tree (AST), then just recurse through the AST and run whatever code is required.
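As a rough illustration of the tokenize-then-build-a-tree step, here is a minimal Python sketch. It is not the parser used in the apps mentioned above; the names `TOKEN_RE`, `tokenize`, and `read_node` are made up for this example, and it assumes that string literals use double quotes with no escape sequences.

```python
import re

# Assumption: a token is a parenthesis, a double-quoted string (no escape
# sequences), or a bare symbol made of any other non-space characters.
TOKEN_RE = re.compile(r'\(|\)|"[^"]*"|[^\s()"]+')


def tokenize(code):
    """Split the source string into a flat list of tokens."""
    tokens = []
    pos = 0
    while pos < len(code):
        if code[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(code, pos)
        if match is None:
            # For example, a string literal that is never closed.
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group())
        pos = match.end()
    return tokens


def read_node(tokens, pos):
    """Read one node starting at tokens[pos]; return (node, next_pos)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[pos]
    if token == ")":
        raise SyntaxError("unexpected ')'")
    if token != "(":
        # A leaf: strip the quotes from string literals, keep symbols as-is.
        if token.startswith('"'):
            return token[1:-1], pos + 1
        return token, pos + 1
    # A list: keep reading child nodes until the matching ')'.
    node = []
    pos += 1
    while pos < len(tokens) and tokens[pos] != ")":
        child, pos = read_node(tokens, pos)
        node.append(child)
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return node, pos + 1


def parse_ast(code):
    """Parse a whole expression into nested lists of strings."""
    tokens = tokenize(code)
    node, pos = read_node(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected tokens after the expression")
    return node
```

Because every list in the source maps to one Python list, the parser needs no grammar beyond matching parentheses, which is most of what makes the Lisp-style syntax so cheap to support.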
In one example, I wanted to have an extendible SQL WHERE clause, with a very limited set of operators: AND, OR, LIKE, and =.
(and (like (attr "person.name") (str "%Keita%")) (= (attr "person.city") (str "Tokyo")))
This example will generate the SQL WHERE clause:
((person.name LIKE '%Keita%') AND (person.city = 'Tokyo'))
Here’s pseudocode for how I write the parser / interpreter for this:
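    # escape_sql and escape_sql_for_attr_name stand in for whatever
    # quoting helpers your database library provides.
    COMMANDS = {
        "and": lambda left, right: f"({left} AND {right})",
        "or": lambda left, right: f"({left} OR {right})",
        "like": lambda left, right: f"({left} LIKE {right})",
        "=": lambda left, right: f"({left} = {right})",
        "str": lambda value: escape_sql(value),
        "attr": lambda name: escape_sql_for_attr_name(name),
    }

    # commands that take a raw token instead of a nested expression
    LITERAL_COMMANDS = {"str", "attr"}

    def parse_ast(code):
        # parse the "code" string into nested arrays:
        # "(1 (2 3))" becomes ["1", ["2", "3"]]
        ...

    def execute_node(ast):
        # a bare token outside of str/attr would end up in the SQL unescaped
        if isinstance(ast, str):
            raise ValueError(f"expected an expression, got {ast!r}")
        cmd = ast[0]
        argv = ast[1:]
        if cmd in LITERAL_COMMANDS:
            return COMMANDS[cmd](*argv)
        resolved_argv = [execute_node(x) for x in argv]
        return COMMANDS[cmd](*resolved_argv)

    def execute(code):
        ast = parse_ast(code)
        return execute_node(ast)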
As you can see, this is a very simple example that takes the DSL and transforms it into a SQL string. If you wanted to do parameterized queries, you might instead return a tuple with the query string as the first element and a map of parameters as the second.
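The parameterized variant could look something like the following Python sketch. It works directly on an already-parsed AST (nested lists of strings), and everything in it is an assumption made for illustration: the `ATTR_RE` check, the `OPERATORS` table, the `execute_ast` name, and the `:p0`-style named placeholders, which Python's `sqlite3` module accepts but other drivers may not.

```python
import re

# Assumption: attribute names look like "column" or "table.column".
ATTR_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")

# Binary commands map straight to a SQL operator.
OPERATORS = {"and": "AND", "or": "OR", "like": "LIKE", "=": "="}


def execute_node(ast, params):
    """Turn one AST node into a SQL fragment, collecting values in params."""
    if isinstance(ast, str) or not ast:
        # Bare tokens are only valid as the argument of str or attr.
        raise ValueError(f"expected an expression, got {ast!r}")
    cmd, *argv = ast
    if cmd == "str":
        (value,) = argv  # exactly one argument
        if not isinstance(value, str):
            raise ValueError("str expects a literal")
        name = f"p{len(params)}"
        params[name] = value  # the value never touches the SQL text
        return f":{name}"
    if cmd == "attr":
        (name,) = argv
        if not isinstance(name, str) or not ATTR_RE.fullmatch(name):
            raise ValueError(f"invalid attribute name: {name!r}")
        return name
    if not isinstance(cmd, str) or cmd not in OPERATORS:
        raise ValueError(f"unknown command: {cmd!r}")
    left, right = (execute_node(arg, params) for arg in argv)
    return f"({left} {OPERATORS[cmd]} {right})"


def execute_ast(ast):
    """Return (sql, params) for a parsed expression."""
    params = {}
    sql = execute_node(ast, params)
    return sql, params
```

For the example expression above, `execute_ast` would hand back the string `((person.name LIKE :p0) AND (person.city = :p1))` together with the map `{"p0": "%Keita%", "p1": "Tokyo"}`, so the values are passed to the database separately from the query text.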
The ability to map the language so closely to the AST, and being able to evaluate the AST just by recursion, makes this implementation simple and easy to write, easy to extend, and easy to embed in existing applications. While I probably won’t be switching to writing Common Lisp full time (for practical reasons), I definitely do get the appeal of the language itself.
This tool isn’t something I use all the time. It’s probably something that should be used very sparingly, and in specific circumstances. That said, it’s a good tool in my toolbox for those times when I want to have on-the-fly customizable logic without the security concerns of using eval, or the complexity of creating a sandboxed environment for potentially unsafe code.
Last note: while this solution may be more secure than eval, it is definitely not 100% secure. In the simple example above, we do escape strings so SQL injection shouldn’t be a problem, but it doesn’t check whether the column defined by the attr function is valid, or whether the user is allowed to query information based on that column (although something like that would be possible). I would not use something like this to process completely untrusted input.
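One way the column check could be added is with an allow-list that the attr command consults before emitting anything. The Python sketch below is hypothetical throughout: the `ALLOWED_COLUMNS` table, the role names, the `person.salary` column, and the `resolve_attr` function are invented for illustration and would need to reflect your real schema and permission model.

```python
# Hypothetical schema: which columns exist, and which roles may filter on them.
ALLOWED_COLUMNS = {
    "person.name": {"admin", "staff"},
    "person.city": {"admin", "staff"},
    "person.salary": {"admin"},
}


def resolve_attr(name, role):
    """Return the column name if it is known and the role may query it."""
    if not isinstance(name, str) or name not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {name!r}")
    if role not in ALLOWED_COLUMNS[name]:
        raise PermissionError(f"role {role!r} may not filter on {name!r}")
    # The name is one of the fixed keys above, so it is safe to emit as-is.
    return name
```

The attr entry in the command table would then call `resolve_attr` with the current user's role instead of only escaping the name. This narrows what a DSL expression can reach, but it does not by itself make the approach safe for completely untrusted input.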