RFC: Interpolated string function calls #170
rofrankel wants to merge 8 commits into luau-lang:master
Add support for calling functions with interpolated strings without parentheses, enabling DSL patterns like structured logging where the template, interpolation values, and optional context are passed to the function. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
Edit: This pattern is now described in this RFC (not as a core part of the RFC, just as an illustration of how it might be used). For completeness, one alternative proposal that came up (I can flesh this out into a separate RFC if helpful): We could create a new type Message = {
    template: string,
    context: { [string]: any },
    -- Renders `template` using `context`
    toString: (self: Message) -> string
}
Primary functional differences from this RFC include:
Note: there are things I like about this approach, but I can't take credit for the idea - it was suggested by a colleague during a brainstorming session. |
|
The arguments passed to the function "called" in this way seem unintuitive, and also very specific. For this to be used in a DSL-y way (or even just by a fancy logger) it would also require that function to itself handle parsing of the
Would an approach like JavaScript's template literal tag functions make more sense? When you call log_info`Hello {name}!` in JavaScript, it would pass the
In a Luau-y way this might be better translated as taking two arguments, the first being a table of the string chunks (N values, N>=1) and the second being a table of the values to interpolate (N-1 values).
Edit: I just noticed this JS way of doing things was mentioned as a rejected alternative. The rejection reasons don't make sense to me. Not providing the original template string is good and honestly necessary if you want any hope of this API being possible without implementing an entire Luau parser, for the reasons outlined below. A table mapping names to values could easily be added if really desired, though I can't think of many instances outside of debuggers where it ends up necessary to know the names (as opposed to confusing, when the same template is being used but provided different locals for values). The third rejection reason makes no sense at all; JS's "arrays" are the same as Luau "tables". If multiple arguments is the actual concern, and it was just poorly worded, I already addressed that here.
As an example of where this makes way more sense, the RFC suggests:
To me this is absolutely not what you'd want, because now
which suggests any consumer of this API would need to implement an entire Luau parser to have any hope of reconstructing the desired string output! This conundrum also causes seemingly nonsensical restrictions to be applied to the API, such as just below, where we see print(dsl`Something { dsl`Something else` }`) made illegal, but one might expect many DSLs would require (or at least be made substantially nicer with) nesting.
Just so this comment isn't all doom and gloom, I think one approach that could be more ergonomic is a
With that, log "hello world" and log `hello world` would function identically unless the body of |
hgoldstein
left a comment
The main sticking points here are:
- The ambiguity noted by the extra parameter(s);
- The restrictions on what can be passed to interpolated strings.
> When a function is called with an interpolated string literal in this style, the call receives multiple arguments derived from the interpolated string:
>
> 1. **Formatted string**: The fully interpolated result (what you would get from the expression today)
> 2. **Template string**: The original template with placeholders intact, e.g. `"Hello {name}"`
As is, the way string interpolation appears to work is that we de-sugar:
print(`Hello {name}, it is {currtime}!`)
... to something like ...
print(string.format("Hello %*, it is %*!", name, currtime))
I'm imagining semantics like:
template `Hello {name}, it is {currtime}!`
... desugaring to something like ...
-- Could also be an array for the second argument
template("Hello %*, it is %*!", name, currtime)
Would that be enough for your use case? I think it also makes the "mistaken" case of ...
print `Hello, my name is {name}`
... a little less unfortunate.
Personally, one issue I have with this is that you still end up needing to write your own parser to consume this API. It's not too bad, but you can't just do a simple find and replace because you need to account for %-escaping in the passed format string. I think you could cook up a nasty gsub using "([^%%])%%%*", "%1" .. replace with an extra check for whether the %* is at the start of the string, but this feels pretty ugly to me.
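To illustrate the parsing burden described above, here is a minimal sketch of a scanner that splits such a format string into its literal chunks. It assumes the desugared string escapes a literal percent sign as `%%` and marks each interpolation with `%*`; `splitFormat` is a hypothetical helper name, not part of the RFC or of any library.

```luau
-- Hypothetical helper: splits a format string into the literal chunks that
-- surround each "%*" placeholder, turning "%%" back into a single "%".
local function splitFormat(fmt)
    local chunks = {}
    local current = {}
    local i = 1
    while i <= #fmt do
        local ch = string.sub(fmt, i, i)
        local nextCh = string.sub(fmt, i + 1, i + 1)
        if ch == "%" and nextCh == "%" then
            -- Escaped percent sign: keep one literal "%".
            table.insert(current, "%")
            i = i + 2
        elseif ch == "%" and nextCh == "*" then
            -- Placeholder: close the current literal chunk.
            table.insert(chunks, table.concat(current))
            current = {}
            i = i + 2
        else
            table.insert(current, ch)
            i = i + 1
        end
    end
    table.insert(chunks, table.concat(current))
    return chunks
end

local chunks = splitFormat("100%% of %* is %*!")
-- chunks is { "100% of ", " is ", "!" }
```

A left-to-right scan like this treats `"%%*"` as a literal `%*` rather than a placeholder, which is the case a plain find-and-replace gets wrong.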
Yeah there's a couple decent outcomes here. My immediate thought, well before I read the RFC, is you'd get the "decomposed" interpolation, so an array of alternating string and value parts, e.g.:
log `Foo {name} bar {1 + 2}`
... becomes something like ...
log({ "Foo ", name, " bar ", 3 })
I'll admit wanting to preserve the exact string came out of one of the goals of logging. Another option is that we can probably embed the actual text of the interpolated string, e.g. you'll get:
log `Foo {name} bar {1 + 2}`
... becoming ...
log("Foo {name} bar {1 + 2}", "Foo ", name, " bar ", 3)
... you can still reconstruct it by table.concat-ing everything but the first argument.
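As a rough sketch of that second option, a receiver could keep the first argument as the template and join the remaining parts to get the rendered message. The `log` function below is hypothetical, and the call is written out by hand in its assumed desugared form.

```luau
-- Hypothetical receiver: the first argument is the original template text,
-- the remaining arguments are the alternating literal and value parts.
local function log(template, ...)
    local rendered = {}
    for i = 1, select("#", ...) do
        -- tostring so that non-string values (numbers, booleans) can be joined.
        rendered[i] = tostring((select(i, ...)))
    end
    return template, table.concat(rendered)
end

local name = "Alice"
-- Hand-written stand-in for the assumed desugaring of: log `Foo {name} bar {1 + 2}`
local template, message = log("Foo {name} bar {1 + 2}", "Foo ", name, " bar ", 1 + 2)
```

Here `template` stays available as a stable key while `message` is the fully rendered text.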
From live discussion with Hunter, perhaps something like:
local a = 1
local function double(x: number)
    return x * 2
end
-- Desugars to `log:Info("The double of %* is %*", {1, 2}, {"a", "double(a)"})`
log:Info `The double of {a} is {double(a)}`
I've updated the draft, PTAL!
|
In response to recent suggestions, I definitely don't want to have to parse out format strings (%*). One interesting thing that brings up though is that if we ever let you provide format specifiers, e.g. |
Address reviewer comments by redesigning the desugaring approach:
- Use positional tables (format string with %s, values table, expressions table) instead of named interpolation table, eliminating the need for consumers to parse Luau expressions
- Remove expression restrictions: function calls, method calls, and repeated expressions are now all valid
- Remove optional trailing context table from grammar; show currying pattern instead for passing additional context
- Add SQL escaping and HTML templating motivation examples
- Add Message wrapper type for bridging to existing functions
- Expand alternatives section with rejected design rationale
Co-authored-by: Cursor <cursoragent@cursor.com>
|
I've updated this RFC with a substantially rewritten and simplified approach addressing feedback from @MagmaBurnsV, @Cooldude2606, @hgoldstein, and @Bottersnike. Thanks to @hgoldstein for the live brainstorming. I am much happier with this version than the original version I wrote up (which came out of a 30 minute group brainstorming session that perhaps needed to be longer :) ). The latest draft doesn't directly address @Kampfkarren's comment about potential future support for format specifiers, but perhaps it does so implicitly? E.g. if the |
|
One alternative not mentioned is that you can get pretty close (albeit not with a nice interpolated string) to this behaviour just using existing functionality:
local function Log(fmt: string)
    local function log_handler(args: { any })
        print("Log event. Format string:", fmt, "Evaluated:", string.format(fmt :: any, table.unpack(args)))
    end
    return log_handler
end
local user = "Bottersnike"
Log "Hello, %*" { user }
I like the changes, though
While not an issue directly with the RFC or its wording, it's worth noting that updating a function isn't possible in many cases. For example, |
…back Co-authored-by: Cursor <cursoragent@cursor.com>
Good call, updated.
Yes this is a valid point - that's why I describe the |
|
> This was considered and the proposed design shares the same spirit: decomposing the interpolated string into parts that don't require the consumer to parse Luau expressions. The proposed design differs in using a format string with `%*` placeholders instead of a string parts array, and in providing an additional expressions table with the source text of each interpolated expression. The format string approach:
>
> 1. Provides a single template string usable as a log aggregation key or cache key
I think we discussed it offline, but wouldn't it be enough to (effectively) reconstruct the format string by doing something like:
local function mylog(templateparts, values)
    local fmt = table.concat(templateparts, "%*")
    print(string.format(fmt :: any, table.unpack(values)))
end
... including metadata about the values in the template aside.
This kind of works, but with two drawbacks:
- There's collision risk for simple templates that have the same overall structure, e.g. {foo}: {bar} and {baz}: {qux}. False positives for aggregation/deduplication may in some cases be even worse than false negatives.
- Replacing the placeholder expressions with just "%*" may hurt readability... in some cases it may be obvious what the placeholders were, but in other cases, less so.
Speaking purely as the user, these drawbacks seem significant enough that I'd rather have the placeholder expressions as well. And users who are happy without the expressions can just ignore them. I know there may be some performance hit, but as we discussed, all the identified realistic use cases for this functionality are likely going to do something much more expensive anyway with the result (e.g. make a network request), so the language performance cost is not a primary concern.
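The collision risk can be seen in a small sketch. It assumes the two-table convention from the earlier comment (a table of literal parts plus a table of values); `aggregationKey` is a hypothetical name used only for illustration.

```luau
-- Hypothetical helper: rebuilds an aggregation key from the literal parts only,
-- as in the table.concat suggestion above.
local function aggregationKey(templateparts)
    return table.concat(templateparts, "%*")
end

-- Literal parts for `{foo}: {bar}`
local keyA = aggregationKey({ "", ": ", "" })
-- Literal parts for `{baz}: {qux}`
local keyB = aggregationKey({ "", ": ", "" })

-- Both keys are "%*: %*", so the two log sites cannot be told apart.
local collides = (keyA == keyB)
```

Because the expression text never reaches the function, nothing in the key distinguishes the two call sites.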
|
This new draft addresses the issues I had, although a new one has arisen. A clarification / brief note on how literals and whitespace are passed into the third argument should be included:
log `{42}`
-- this one is trivial
-- log("%*", { 42 }, { "42" })
log `{"Alice"}`
-- this one should note how escapes are handled
-- log("%*", { "Alice" }, { "\"Alice\"" })
log `{'Alice'}`
-- are single quotes maintained or converted to double quotes?
-- log("%*", { "Alice" }, { "\'Alice\'" })
-- log("%*", { "Alice" }, { "\"Alice\"" })
log `{ { ["foo"] = "bar", } }`
-- are spaces, quotes and commas maintained or trimmed?
-- log("%*", { { foo = "bar" } }, { "{foo=\"bar\"}" })
-- log("%*", { { foo = "bar" } }, { " { [\"foo\"] = \"bar\", } " })
log `{
    user . name
}`
-- are spaces and new lines maintained or trimmed?
-- log("%*", { "Alice" }, { "user.name" })
-- log("%*", { "Alice" }, { "\n\tuser . name\n" })
|
Specify that entries are verbatim source text with leading/trailing whitespace trimmed. Quoting style, internal spacing, and formatting are preserved as written. Addresses reviewer question about literals and whitespace. Co-authored-by: Cursor <cursoragent@cursor.com>
Thanks, good point, updated. |
Co-authored-by: Cursor <cursoragent@cursor.com>
|
> The template string enables aggregating logs by message pattern (e.g. grouping all "The double of %* is %*" messages), while the values and expression names provide searchable structured data for platforms like Splunk, Datadog, or Elasticsearch.
>
> Without this feature, developers must either manually construct all arguments (tedious and error-prone) or use a logging library that implements its own template parsing at runtime (duplicating language functionality).
> ... use a logging library that implements its own template parsing at runtime (duplicating language functionality).
I'm not sure that duplicating language functionality is the issue here: it's more that this is also error prone in its own way. For example:
log:Info("Hello, my name is: {{nome}}", { name = "Hunter" })
... whereas we already have a mechanism for constructing strings that can provide analysis input (intellisense, error checking): it's more missed opportunity than duplicating language features being a problem.
|
Would it be better (or worse?) if the desugaring instead included byte offsets into the original string for each substitution?
log`{timestamp}: Count is {count}. Total is {count + total}`
-- desugars to
log("{timestamp}: Count is {count}. Total is {count + total}", {timestamp, count, count + total},
    -- (offset, length) pairs for each substitution
    {{0, 11}, {22, 7}, {40, 15}}
)
This trivializes any custom parsing you might want to do.
|
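A minimal sketch of how a consumer might use such pairs, assuming 0-based offsets whose spans include the surrounding braces (the exact convention is not settled in this thread). `expressionTexts` is a hypothetical helper, and the offsets below were counted by hand for this template.

```luau
-- Hypothetical helper: recovers the source text of each interpolated
-- expression from (offset, length) pairs. Offsets are assumed 0-based and
-- each span is assumed to include the "{" and "}".
local function expressionTexts(template, spans)
    local texts = {}
    for i, span in ipairs(spans) do
        local offset, length = span[1], span[2]
        -- string.sub is 1-based and inclusive; skip the braces on both ends.
        texts[i] = string.sub(template, offset + 2, offset + length - 1)
    end
    return texts
end

local template = "{timestamp}: Count is {count}. Total is {count + total}"
local texts = expressionTexts(template, { { 0, 11 }, { 22, 7 }, { 40, 15 } })
-- texts is { "timestamp", "count", "count + total" }
```

With 1-based offsets the index arithmetic would shift by one, which is why the indexing base needs to be pinned down in the RFC text.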
I like this interface quite a lot. I'd like to see this approach added to the "Alternatives" section alongside a strong argument for why it is unviable. |
I'll leave it to you and the other reviewers to say how you all feel about this alternative from a language perspective, but I agree with you this would support the motivating use case (and is generically powerful). Happy to update the RFC to reflect this if that's the recommended path. |
FWIW this is pretty close to the original internal proposal (not publicly linkable but it's API-1080 on internal Jira). The feedback I got from the reviewers was to write this Luau RFC instead. (The main difference is that instead of |
…rings
- Replace expressions list with byte-offset/length pairs per andyfriesen's suggestion
- Template string now passed with {expressions} intact (serves as aggregation key)
- Add product requirements for log consumption: human-readable templates and key-value structured data
- Move expressions-list approach to Alternatives
- Add manual format string + curried values to Alternatives per reviewer request
Co-authored-by: Cursor <cursoragent@cursor.com>
On reflection, I just updated it to reflect this. |
…at strings
- Ambiguity: two natural log patterns that produce identical %* format strings
- Readability: complex pattern uninterpretable without source code lookup
- Note that call stack fingerprints can disambiguate but remain opaque
Co-authored-by: Cursor <cursoragent@cursor.com>
Cool. Could you add a minor elaboration: The offsets provided for the replacements need to be 1-based so that they can be used with existing builtin string utilities like |
Replace three-argument desugaring (template, values, offsets) with a simpler two-argument convention (template, values) plus two new core library functions: string.interpparse for extracting expression names and string.interp for rendering templates. string.interp supports both sequential and associative tables, enabling a single function to serve paren-free calls (positional values from locals) and parenthesized calls (named context objects). Move the previous byte-offsets design to alternatives, and add an @` metadata object syntax alternative. Made-with: Cursor
```luau
string.interpparse("Hello {name}") -- returns {"name"}
string.interpparse("{a} + {b} = {a + b}") -- returns {"a", "b", "a + b"}
string.interpparse("No interpolations") -- returns {}
```
This seems like a downgrade in functionality from the previous design: there's now no easy way for code to inspect the non-interpolated parts of an interpolated string. I honestly can't think of a good example use case for when you would want to, but I think it would be nice if interpparse also gave the offsets for those things. As-is, this function doesn't really seem especially useful; Luau doesn't have loadstring so the things returned from this function have very limited utility. It also doesn't scream "parsing the interpolation string" given the limited information it returns.
```luau
-- Sequential: values matched positionally to expressions
string.interp("Hello {name}", {"Alice"}) -- returns "Hello Alice"
string.interp("{a} + {b} = {a + b}", {1, 2, 3}) -- returns "1 + 2 = 3"
```
This is an unclear example: is the final expression "3" because 1 + 2 is 3 or because the positional value is "3"? I assume the latter is intended. Therefore an example should be used that shows that the contents of the expression are not a constraint on the value.
e.g. string.interp("{a} + {b} = {a + b}", {1, 2, "three"}) -- returns "1 + 2 = three"
```luau
string.interp("{a} + {b} = {a + b}", {1, 2, 3}) -- returns "1 + 2 = 3"

-- Associative: values matched by expression name
string.interp("Hello {name}", {name = "Alice"}) -- returns "Hello Alice"
```
What happens in the following scenarios?
string.interp("Hello {name}", {"bob", name = "Alice"})
string.interp("{a} + {b} = {a + b}", {b = 1, 2, "three"})
I would assume it matches by name, then any unmatched expressions consume positional values in order:
Hello Alice and 2 + 1 = three
As another scenario, what about string.interp("{2} {1}", {"First", "Second"}) (i.e. literal values that are also array indexes)?
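For concreteness, the resolution order assumed in the comment above (match by name first, then let unmatched expressions consume positional values in order) could be sketched as follows. This is not the RFC's reference implementation: `interp` is a hypothetical stand-in for the proposed `string.interp`, and its naive `{...}` pattern ignores escaped braces and braces inside expressions.

```luau
-- Hypothetical stand-in for the proposed string.interp, implementing only the
-- "by name first, then positional" order guessed at in the comment above.
local function interp(template, values)
    local nextIndex = 1
    -- Parentheses drop gsub's second return value (the substitution count).
    return (string.gsub(template, "{(.-)}", function(expr)
        local named = values[expr]
        if named ~= nil then
            return tostring(named)
        end
        -- No entry under the expression text: take the next positional value.
        local positional = values[nextIndex]
        nextIndex = nextIndex + 1
        return tostring(positional)
    end))
end

local first = interp("Hello {name}", { "bob", name = "Alice" })
local second = interp("{a} + {b} = {a + b}", { b = 1, 2, "three" })
```

Under this sketch `first` is "Hello Alice" and `second` is "2 + 1 = three". The `{2} {1}` case falls through to positional matching, because the expression text is the string "2" rather than the array index 2; whether that is the desired behaviour is exactly the open question.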
|
Edit, and TLDR: I don't think there's any good solution involving conversion of the interpolated string into any other string representation that is actually viable to consume. I believe the only reasonable solution is to pass the invoked function each chunk of the interpolated string (be that string chunks or value chunks) with associated metadata. Any string format including the original representation is ripe for parsing issues, or substantial duplication of existing language code. Any string format that removes the original representation (eg
The rest of this comment explains the issues with the currently proposed
Something that came to mind is that
print(`\{{1}\}`)
Which would print
If a user was to write
print `\{{1}\}`
Then that would desugar to
If we consider, for example, the string
Let's consider the reference implemented for
We're going to use
capture `\{{"Hello }} world"}\}`
As an interpolated string, this evaluates to
The reason for this is the example
At this time, I don't have an ideal solution that just drops into the existing RFC and fixes this all. The best I can think of off the top of my head would be to entirely change the signature of an interpolated string function call to return pre-chunked parts of the string. Something like
Edit: A different example I thought of that bypasses this problem while retaining all the useful functionality would be
fn `hello {name}, do you like {food}?`
-> fn("hello ", {repr="name", value="Bottersnike"}, ", do you like ", {repr="food", value="Oranges"}, "?")
|
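A rough sketch of a consumer for that pre-chunked calling convention: string arguments are literal chunks and table arguments carry `repr` (source text) and `value`, as in the example above. `describe` is a hypothetical name, and the call is written in its assumed desugared form.

```luau
-- Hypothetical consumer: rebuilds both the rendered message and a readable
-- template from pre-chunked arguments, without parsing any string format.
local function describe(...)
    local rendered, template = {}, {}
    for i = 1, select("#", ...) do
        local chunk = select(i, ...)
        if type(chunk) == "table" then
            -- Interpolated value with its source text attached.
            rendered[#rendered + 1] = tostring(chunk.value)
            template[#template + 1] = "{" .. chunk.repr .. "}"
        else
            -- Literal text between interpolations.
            rendered[#rendered + 1] = chunk
            template[#template + 1] = chunk
        end
    end
    return table.concat(rendered), table.concat(template)
end

local message, key = describe(
    "hello ", { repr = "name", value = "Bottersnike" },
    ", do you like ", { repr = "food", value = "Oranges" }, "?"
)
```

Since interpolated values always arrive wrapped in a table, a value that happens to be a string cannot be confused with a literal chunk.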
|
This RFC as is isn't type safe, which I would consider an absolute requirement for any new Luau feature. Given there's been all this work into making the type system as great as it is today, it'd be rather stupid to throw all that to the wayside. So I've thought of a possible type safe alternative to what this RFC is trying to achieve which adds a new 'keyword'
tagged function log:Info(`{rule_id: number}: {action: string} {src_ip: string}:{src_port: string} -> {dst_ip: string}:{dst_port: string} proto={proto: buffer} bytes={bytes: buffer}`)
    -- code here
end
Where an "interpolated" string is used to define the function's arguments, with the brackets containing function arguments with their type. Perhaps there could be a variable provided called
tagged function log:Info(`{rule_id: number}: {action: string} {src_ip: string}:{src_port: string} -> {dst_ip: string}:{dst_port: string} proto={proto: buffer} bytes={bytes: buffer}`)
    --[[
    this would de-sugar into:
    print(string.format("%*: %* %*:%* -> %*:%* proto=%* bytes=%*", rule_id, action, src_ip, src_port, dst_ip, dst_port, proto, bytes))
    ]]
    print(format())
end
|
|
Thanks for all the feedback and apologies for the churn/lack of responsiveness. Some of the updates are in response to internal conversations that are happening in parallel with this quasi-public process. (meta: there's probably room for improvement there) There might be one more substantive change to this RFC in the works, which might implicitly address some of the current feedback (and/or make it obsolete). If not then I'll respond point by point. Thanks again for the thoughtful comments. |
Summary
Allow calling functions with interpolated strings without parentheses, enabling DSL patterns like structured logging, SQL escaping, and HTML templating. The call desugars into two arguments: the original template string (with interpolation expressions intact) and a table of evaluated values.
Two new core library functions are also introduced: `string.interpparse` for extracting expression names from a template, and `string.interp` for rendering a template with values. These reuse the same code as normal string interpolation.
Motivation
The string interpolation RFC noted the restriction on parentheses-free calls was "likely temporary while we work through string interpolation DSLs." This RFC proposes lifting that restriction with semantics designed for structured logging and similar use cases.
Key design decisions
- `string.interpparse(template)` extracts expression names; `string.interp(template, values)` renders the template
- `string.interp` supports both sequential tables (positional) and associative tables (by expression name), enabling a single function to handle both paren-free calls and parenthesized calls with a named context object