QL Parameters are not Referentially Transparent
Summary:
(Eric pointed out a variant of this a couple of years ago, but I was resistant to the idea, because it's going to be a pain. But by now I have a firmer grasp on the problem, and why it's broken from a programming-theoretic POV.)
In QL, parameters are currently never pass-by-value. This is decidedly weird -- QL is possibly the only programming language I've ever encountered where that is so. There are good reasons for it (it's one of the reasons why QL is so expressive), but it's broken in theory and sometimes produces ill effects in practice. The problem is partly a loss of referential transparency, and partly a violation of the programmer's expectations.
In normal programming languages, parameters are usually pass-by-value: what you put into the parameter is exactly what is passed to the function. This is never the case in QL, and there is currently no way to say that you want it.
What appears to be happening is what I'll call "pass-evaluated": the parameter is evaluated in the calling context, and the result is passed to the function. This is the real secret sauce of QL, and allows it to be so concise. But it's not what actually happens.
In practice, parameters are almost always what I'll call "pass-by-expression" -- similar to Scala's by-name, but really not quite the same. In fact, the entire syntax tree is passed down to the function, and evaluated where and when the function feels like it. This is where we potentially violate referential transparency, and we certainly violate locality of code: the caller has no a priori way of knowing how this parameter will be evaluated. That's powerful, but often confusing: even I often have problems getting it right.
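To make the distinction concrete, here is a toy model in Python (not QL). It assumes an "expression" can be modeled as a function from a context to a value; all names in it are hypothetical and chosen only for illustration. It shows that the same expression gives different results depending on whether the caller or the callee evaluates it.

```python
# Toy model only: an "expression" is a function from a context to a value.
# None of these names come from QL; they are illustrative assumptions.

def name_expr(context):
    """Stands in for an expression that reads from whatever context it is given."""
    return context["name"]


def callee_pass_evaluated(value):
    # The callee receives a finished value; it cannot re-evaluate anything.
    return f"Hello, {value}"


def callee_pass_by_expression(expr):
    # The callee receives the expression itself and chooses its own context.
    inner_context = {"name": "callee's thing"}
    return f"Hello, {expr(inner_context)}"


caller_context = {"name": "caller's thing"}

# Pass-evaluated: the expression is evaluated at the call site, in the caller's context.
evaluated_result = callee_pass_evaluated(name_expr(caller_context))

# Pass-by-expression: the caller hands over the expression and loses control of the context.
by_expression_result = callee_pass_by_expression(name_expr)
```

In this sketch `evaluated_result` mentions the caller's thing and `by_expression_result` mentions the callee's thing. Nothing at the call site tells the caller which of the two will happen.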
Moreover, getting around this is hard: you can say that you want pass-evaluated by preceding the parameter with ~!, but this is so obscure it isn't even documented, and I almost never use it myself. And that's an issue, because I suspect it is desired 99% of the time. It is telling that the vast majority of built-in QL functions do this. (Which is why it looks like it works this way.) But that is not what happens in user-written QL functions, so it's easy to get it wrong. $_1, $_2, etc., as well as params in local functions, are pass-by-expression, and often wind up behaving very weirdly in practice.
So I think we need to rethink this. What I think we should have is:
- The default should be pass-evaluated. That's what you usually want, it's the way the built-in functions work, and it's at least theoretically referentially transparent, if a tad odd, so I believe it's theoretically sound.
- We should allow for pass-by-expression -- some built-ins absolutely rely on it (e.g., _sort, which is where it came from in the first place). There should be a way to declare pass-by-expression at the definition site, since it is the function itself that declares that it wants this sort of macro-level parameter.
- There should probably be a way to get pass-by-value at the call site -- that is, a way to say "take this parameter literally, don't evaluate it". Note that only the call site should care about the difference between pass-by-value and pass-evaluated; it is no business of the called function. This should be rarely needed (since most values aside from properties just evaluate to themselves, and properties should usually be pass-by-expression), but we should have it in mind as a possible feature of the language.
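The call-site half of this proposal can be sketched as follows. This is a Python toy, not QL: `Property`, `Quoted`, and `prepare_arg` are hypothetical names, and the model assumes that only properties evaluate to something other than themselves. The definition-site half (declaring a parameter as pass-by-expression) is left out here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Property:
    """Hypothetical stand-in for a property: evaluating it looks it up in the context."""
    name: str

    def evaluate(self, context):
        return context[self.name]


@dataclass(frozen=True)
class Quoted:
    """Hypothetical call-site marker meaning 'take this literally, don't evaluate it'."""
    value: object


def prepare_arg(arg, context):
    """Decide at the call site what the callee receives for one argument."""
    if isinstance(arg, Quoted):
        # Pass-by-value: the caller opted out of evaluation.
        return arg.value
    if isinstance(arg, Property):
        # Pass-evaluated (the proposed default): evaluate in the caller's context.
        return arg.evaluate(context)
    # Most other values simply evaluate to themselves.
    return arg


context = {"title": "My Page"}
default_result = prepare_arg(Property("title"), context)         # the property's value
quoted_result = prepare_arg(Quoted(Property("title")), context)  # the property itself
plain_result = prepare_arg(42, context)                          # unchanged
```

Either way, the callee just receives a value. Only the caller knows whether that value was produced by evaluation or passed literally, which matches the point that the distinction is no business of the called function.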
How do we get there from here? It's going to be a multi-step process, and needs more research and design, but roughly speaking, I think we will need to:
- Change the default handling of Signature for the built-ins. For each parameter, you can optionally declare that this parameter is pass-by-expression (the current behavior); for everything else, it will become pass-evaluated by default. The evaluation should happen before the call, and the type of the resulting value checked before the function call begins. Instead of calling process() on the parameter, most functions can just call a new getParam() function.
- Go through every built-in function with a fine-toothed comb. Any parameters that are currently being processed with the context should be switched to getParam(); anything else should be declared as pass-by-expression.
- Deprecate $_1, $_2, and replace them with explicitly pass-evaluated and pass-by-expression equivalents, so that you say what you want when you write the function. Pass-evaluated should be considered the "normal" case in the documentation, since it is easier to understand and more often what you want. (This might be the nudge needed to introduce Signatures to Functions, with named parameters and the full power that the internal functions enjoy. At that point, the $_1 style can be really deprecated and later dropped.)
- Depending on whether anybody but me is using local functions, possibly just change the default there to pass-evaluated, and add some sort of syntax to declare a parameter as pass-by-expression.
- Introduce some sort of syntax to mean pass-by-value, without evaluating in the local context. As mentioned above, this should rarely be necessary, but it's likely to be an important escape hatch for certain unusual circumstances. (It will, in practice, wind up replacing the use of ._self on many Property parameters.)
- And of course, test, test, test! This is dangerous stuff, and could easily break a lot of code, so it will need to be handled gradually and carefully.
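The Signature change in the first step above might look roughly like this. It is a sketch in Python rather than the real implementation: `Param`, `Invocation`, and `sort_by` are hypothetical, `get_param` and `process` only mirror the getParam() and process() names mentioned above, and expressions are again modeled as functions from a context to a value.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Expr = Callable[[Dict[str, Any]], Any]  # assumed model of an unevaluated parameter


@dataclass(frozen=True)
class Param:
    """Hypothetical per-parameter declaration in a signature."""
    name: str
    by_expression: bool = False          # opt in to the current behavior
    expected_type: Optional[type] = None


class Invocation:
    """Evaluates and type-checks pass-evaluated parameters before the call begins."""

    def __init__(self, signature, raw_args, caller_context):
        self._raw = {}
        self._values = {}
        for param, expr in zip(signature, raw_args):
            if param.by_expression:
                self._raw[param.name] = expr  # kept unevaluated for the callee
                continue
            value = expr(caller_context)      # evaluated in the caller's context
            if param.expected_type is not None and not isinstance(value, param.expected_type):
                raise TypeError(f"{param.name}: expected {param.expected_type.__name__}")
            self._values[param.name] = value

    def get_param(self, name):
        # The common case: the value is already evaluated and checked.
        return self._values[name]

    def process(self, name, context):
        # Only for parameters declared pass-by-expression.
        return self._raw[name](context)


def sort_by(inv):
    """A sort-like built-in: 'items' is pass-evaluated, 'key' is pass-by-expression."""
    items = inv.get_param("items")
    return sorted(items, key=lambda item: inv.process("key", {"item": item}))


signature = [Param("items", expected_type=list), Param("key", by_expression=True)]
raw_args = [lambda ctx: ctx["things"], lambda ctx: len(ctx["item"])]
sorted_items = sort_by(Invocation(signature, raw_args, {"things": ["ccc", "a", "bb"]}))
```

The point of the design is that a function like `sort_by` has to say explicitly which parameter it wants to evaluate for itself. Everything else arrives already evaluated and checked, so a type error surfaces before the function body runs.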