RFC: Interpolated string function calls#170

Open
rofrankel wants to merge 8 commits into luau-lang:master from rofrankel:rfc-interpolated-string-function-calls
@rofrankel rofrankel commented Jan 29, 2026

Summary

Allow calling functions with interpolated strings without parentheses, enabling DSL patterns like structured logging, SQL escaping, and HTML templating. The call desugars into two arguments: the original template string (with interpolation expressions intact) and a table of evaluated values.

log:Info `Hello {name}`
-- Desugars to: log:Info("Hello {name}", {"Alice"})

Two new core library functions are also introduced: string.interpparse for extracting expression names from a template, and string.interp for rendering a template with values. These reuse the same code as normal string interpolation.

Motivation

The string interpolation RFC noted the restriction on parentheses-free calls was "likely temporary while we work through string interpolation DSLs." This RFC proposes lifting that restriction with semantics designed for structured logging and similar use cases.

Key design decisions

  • Parentheses-free interpolated string calls desugar to 2 arguments (template string, values table)
  • Any expression valid in a regular interpolated string is also valid here (no restrictions)
  • string.interpparse(template) extracts expression names; string.interp(template, values) renders the template
  • string.interp supports both sequential tables (positional) and associative tables (by expression name), enabling a single function to handle both paren-free calls and parenthesized calls with a named context object
  • Currying pattern enables passing additional context without grammar changes

Add support for calling functions with interpolated strings without
parentheses, enabling DSL patterns like structured logging where the
template, interpolation values, and optional context are passed to
the function.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@rofrankel
Author

rofrankel commented Feb 5, 2026

Edit: This pattern is now described in this RFC (not as a core part of the RFC, just as an illustration of how it might be used).


For completeness, one alternative proposal that came up (I can flesh this out into a separate RFC if helpful):

We could create a new Message type that wraps a string interpolation template (as a " string, presumably) and a context object, something like:

type Message = {
	template: string,
	context: { [string]: any },
	-- Renders `template` using `context`
	toString: (self: Message) -> string
}

Primary functional differences from this RFC include:

  • Full encapsulation - template rendering would use only the context table, not local variables.
    • This avoids the challenges/potential limitations around method/function calls inside interpolated strings.
  • Would be much easier to make backwards-compatible with print/warn/error.
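
A minimal sketch of how such a Message might render itself, assuming placeholders are plain identifiers that are looked up in the context table. The constructor name newMessage is hypothetical, and the code sticks to the Lua subset that Luau shares, so the type annotations are left out:

```lua
-- Hypothetical constructor for the Message shape described above.
local function newMessage(template, context)
    local message = { template = template, context = context }

    -- Renders `template` using only `context`; no local variables are consulted.
    function message.toString(self)
        local rendered = self.template:gsub("{([%a_][%w_]*)}", function(key)
            local value = self.context[key]
            if value == nil then
                return nil -- unknown placeholder: gsub keeps the original text
            end
            return tostring(value)
        end)
        return rendered
    end

    return message
end

local greeting = newMessage("Hello {name}, you are {age}", { name = "Alice", age = 30 })
print(greeting:toString())
```

Because rendering reads only from the context table, the same Message can be rendered later or elsewhere without access to the caller's locals.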

Note: there are things I like about this approach, but I can't take credit for the idea - it was suggested by a colleague during a brainstorming session.

@Bottersnike

Bottersnike commented Feb 6, 2026

The arguments passed to the function "called" in this way seem unintuitive, and also very specific. For this to be used in a DSL-y way (or even just by a fancy logger) it would also require that function to itself handle parsing of the { ... } blocks within the passed arguments.

Would an approach like JavaScript's template literal tag functions make more sense? When you call

log_info`Hello {name}!`

in JavaScript, it would pass the log_info function the arguments {"Hello ", "!"} followed by all the substitution values (name's value in this case) as the rest of the arguments. This means consumers of the syntax don't need to themselves worry about syntax parsing, and can just focus on either custom concatenation or complete mutation of what they were given.

In a luau-y way this might be better translated as taking two arguments, the first being a table of the string chunks (N values, N>=1) and the second being a table of the values to interpolate (N-1 values).
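
As a rough sketch of what a consumer of that two-table convention could look like (the function name render is made up for illustration, and the call at the bottom spells out by hand what the sugar would pass):

```lua
-- Hypothetical consumer of the (chunks, values) convention:
-- `chunks` holds N literal pieces, `values` holds the N-1 evaluated expressions.
local function render(chunks, values)
    local out = {}
    for i = 1, #chunks do
        out[#out + 1] = chunks[i]
        if i < #chunks then
            -- tostring() because values are not restricted to strings
            out[#out + 1] = tostring(values[i])
        end
    end
    return table.concat(out)
end

-- What log_info `Hello {name}!` would pass under this convention, with name = "Alice"
local message = render({ "Hello ", "!" }, { "Alice" })
print(message) -- Hello Alice!
```

The consumer never inspects the template text itself, so it needs no knowledge of brace or escape syntax.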


Edit: I just noticed this JS way of doing things was mentioned as a rejected alternative. The rejection reasons don't make sense to me.

Not providing the original template string is good and honestly necessary if you want any hope of this API being possible without implementing an entire luau parser, for the reasons outlined below.

A table mapping names to values could easily be added if really desired, though I can't think of many instances outside of debuggers where it is necessary to know the names (as opposed to confusing, when the same template is used with different locals providing the values).

The third rejection reason makes no sense at all; JS's "arrays" are the same as luau "tables". If multiple-arguments is the actual concern, and just poorly worded, I already addressed that here.


As an example of where this makes way more sense, the RFC suggests:

log:Info `{user.name} is {user.age} years old`
-- Interpolation table is {user = {name = "Alice", age = 30}}

This approach is more ergonomic for consuming code, which can use natural table access like context.user.name, and is more idiomatic to Lua/Luau's table-centric design.

To me this is absolutely not what you'd want, because log:Info now has to not only parse out the { ... } syntax from the format string, but also handle the complex expressions that could be within it. This becomes even more of a problem with the example of:

log:Info `Sum is {a + b}`
-- Desugars to: log:Info("Sum is 30", "Sum is {a + b}", {a = 10, b = 20})

which suggests any consumer of this API would need to implement an entire luau parser to have any hope of reconstructing the desired string output!

This conundrum also causes seemingly nonsensical restrictions to be applied to the API, such as just below where we see the ??? (can't use "increment()" as key twice with different values) error being thrown because we can no longer support any mutations within the template string. If we were to consider the DSL use-case, this then causes

print(dsl`Something { dsl`Something else` }`)  

to be illegal, but one might expect many DSLs would require (or at least be made substantially nicer with) nesting.


Just so this comment isn't all doom and gloom, I think one approach that could be more ergonomic is a string.getcomponents method that returns the arguments that would be passed to a luau-style JS tag function as described above (with normal strings just returning {"the string"}, {}); maybe even returning the original template string if you really want it for some reason, though as explained above you probably don't. While this would be a nice way to expose stuff, it would cause strings to carry pretty heavy baggage when being passed around in the VM under the hood, for a feature likely to be an incredibly rare code path, so I'm not sure if it would actually make sense to do.

With that,

log "hello world"

and

log `hello world`

would function identically unless the body of log wanted to call string.getcomponents on its passed argument.

@hgoldstein hgoldstein left a comment

The main sticking points here are:

  1. The ambiguity noted by the extra parameter(s);
  2. The restrictions on what can be passed to interpolated strings.

When a function is called with an interpolated string literal in this style, the call receives multiple arguments derived from the interpolated string:

1. **Formatted string**: The fully interpolated result (what you would get from the expression today)
2. **Template string**: The original template with placeholders intact, e.g. `"Hello {name}"`

As is, the way string interpolation appears to work is that we de-sugar:

print(`Hello {name}, it is {currtime}!`)

... to something like ...

print(string.format("Hello %*, it is %*!", name, currtime))

I'm imagining semantics like:

template `Hello {name}, it is {currtime}`

... desugaring to something like ...

-- Could also be an array for the second argument
template("Hello %*, it is %*!", name, currtime)

Would that be enough for your use case? I think it also makes the "mistaken" case of ...

print `Hello, my name is {name}`

... a little less unfortunate.

Personally, one issue I have with this is that you still end up needing to write your own parser to consume this API. It's not too bad, but you can't just do a simple find and replace because you need to account for %-escaping in the passed format string. I think you could cook up a nasty gsub using "([^%%])%%%*", "%1" .. replace with an extra check for if the %* is at the start of the string, but this feels pretty ugly to me.
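
For illustration, one way a consumer might handle the escaping is a single left-to-right pass that treats `%%` and `%*` as two-character tokens. This is only a sketch: substitute is a hypothetical name, and the format string is assumed to contain no other `%` sequences of interest.

```lua
-- Hypothetical helper: fills `%*` placeholders from `values`, honouring `%%`.
local function substitute(fmt, values)
    local index = 0
    local result = fmt:gsub("%%(.)", function(c)
        if c == "%" then
            return "%" -- escaped percent sign becomes a literal one
        elseif c == "*" then
            index = index + 1
            return tostring(values[index])
        end
        return nil -- any other sequence is left untouched
    end)
    return result
end

print(substitute("100%% of %* are online", { "users" }))
```

Because gsub scans without overlapping matches, `%%*` is read as an escaped percent followed by a literal `*`, not as a placeholder.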

Yeah there's a couple decent outcomes here. My immediate thought, well before I read the RFC, is you'd get the "decomposed" interpolation, so an array of alternating string and value parts, e.g.:

log `Foo {name} bar {1 + 2}`

... becomes something like ...

log({"Foo ", name, " bar ", 3 })

I'll admit wanting to preserve the exact string came out of one of the goals of logging. Another option is that we can probably embed the actual text of the interpolated string, e.g. you'll get:

log `Foo {name} bar {1 + 2}`

... becoming ...

log("Foo {name} bar {1 + 2}", "Foo ", name, " bar ", 3)

... you can still reconstruct it by table.concat-ing everything but the first argument.
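
A small sketch of that reconstruction, assuming the receiver gets the source text first and the alternating pieces as varargs. The receiver name log and its return values are illustrative only. Note that table.concat rejects values that are neither strings nor numbers, so the pieces are converted explicitly:

```lua
-- Hypothetical receiver: first argument is the source text,
-- the remaining arguments alternate literal pieces and evaluated values.
local function log(source, ...)
    local parts = {}
    for i = 1, select("#", ...) do
        -- convert explicitly so booleans, tables, nil etc. do not break table.concat
        parts[i] = tostring((select(i, ...)))
    end
    return source, table.concat(parts)
end

local name = "Ada"
local source, rendered = log("Foo {name} bar {1 + 2}", "Foo ", name, " bar ", 3)
print(source)
print(rendered)
```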

Author

@rofrankel rofrankel Feb 9, 2026

From live discussion with Hunter, perhaps something like:

local a = 1
local function double(x: number)
  return x * 2
end

-- Desugars to `log:Info("The double of %* is %*", {1, 2}, {"a", "double(a)"})`
log:Info `The double of {a} is {double(a)}`

Author

I've updated the draft, PTAL!

@Kampfkarren
Contributor

In response to recent suggestions, I definitely don't want to have to parse out format strings (%*). One interesting thing that brings up though is that if we ever let you provide format specifiers, e.g. {name} has {money, .2f} dollars, that it becomes much easier to combine with this system if you get it back as a string passed to a function, like f(name, " has ", string.format("%.2f", money), " dollars") than it is if you get it back as the format string it would create.

Address reviewer comments by redesigning the desugaring approach:

- Use positional tables (format string with %s, values table, expressions table) instead of named interpolation table, eliminating the need for consumers to parse Luau expressions

- Remove expression restrictions: function calls, method calls, and repeated expressions are now all valid

- Remove optional trailing context table from grammar; show currying pattern instead for passing additional context

- Add SQL escaping and HTML templating motivation examples

- Add Message wrapper type for bridging to existing functions

- Expand alternatives section with rejected design rationale

Co-authored-by: Cursor <cursoragent@cursor.com>
@rofrankel
Author

rofrankel commented Feb 10, 2026

I've updated this RFC with a substantially rewritten and simplified approach addressing feedback from @MagmaBurnsV, @Cooldude2606, @hgoldstein, and @Bottersnike. Thanks to @hgoldstein for the live brainstorming.

I am much happier with this version than the original version I wrote up (which came out of a 30 minute group brainstorming session that perhaps needed to be longer :) ).

The latest draft doesn't directly address @Kampfkarren's comment about potential future support for format specifiers, but perhaps it does so implicitly? E.g. if the expressions argument looked like {"name", "money, .2f"} would that work or is there some subtle problem?

@Bottersnike

One alternative not mentioned is that you can get pretty close (albeit not with a nice interpolated string) to this behaviour just using existing functionality:

local function Log(fmt: string)
  local function log_handler(args: { any })
    print("Log event. Format string:", fmt, "Evaluated:", string.format(fmt :: any, table.unpack(args)))
  end
  return log_handler
end

local user = "Bottersnike"
Log "Hello, %*" { user }

I like the changes, though %* would make more sense than %s given the types of the arguments aren't strictly limited to strings.


Functions designed for parentheses-free interpolated string calls would need to be written (or updated) to accept the three-argument format.

While not an issue directly with the RFC or its wording, it's worth noting that updating a function isn't possible in many cases. For example, print has no way of knowing if it was called with an interpolated string and should itself perform the formatting, or the user in fact called it as print(...) and passed those three values explicitly. I think for the most part any variadic function can't be updated, but any function that currently takes a single string argument could be augmented to support the additional arguments optionally.

…back

Co-authored-by: Cursor <cursoragent@cursor.com>
@rofrankel
Author

rofrankel commented Feb 10, 2026

One alternative not mentioned is that you can get pretty close (albeit not with a nice interpolated string) to this behaviour just using existing functionality:

local function Log(fmt: string)
  local function log_handler(args: { any })
    print("Log event. Format string:", fmt, "Evaluated:", string.format(fmt :: any, table.unpack(args)))
  end
  return log_handler
end

local user = "Bottersnike"
Log "Hello, %*" { user }

I like the changes, though %* would make more sense than %s given the types of the arguments aren't strictly limited to strings.

Good call, updated.

Functions designed for parentheses-free interpolated string calls would need to be written (or updated) to accept the three-argument format.

While not an issue directly with the RFC or its wording, it's worth noting that updating a function isn't possible in many cases. For example, print has no way of knowing if it was called with an interpolated string and should itself perform the formatting, or the user in fact called it as print(...) and passed those three values explicitly. I think for the most part any variadic function can't be updated, but any function that currently takes a single string argument could be augmented to support the additional arguments optionally.

Yes, this is a valid point - that's why I describe the Message approach as an alternative. I've rephrased slightly to try and make that clearer.


This was considered and the proposed design shares the same spirit: decomposing the interpolated string into parts that don't require the consumer to parse Luau expressions. The proposed design differs in using a format string with `%*` placeholders instead of a string parts array, and in providing an additional expressions table with the source text of each interpolated expression. The format string approach:

1. Provides a single template string usable as a log aggregation key or cache key

I think we discussed it offline, but wouldn't it be enough to (effectively) reconstruct the format string by doing something like:

local function mylog(templateparts, values)
    local fmt = table.concat(templateparts, "%*")
    print(string.format(fmt :: any, table.unpack(values)))
end

... including metadata about the values in the template aside.

Author

@rofrankel rofrankel Feb 11, 2026

This kind of works, but with two drawbacks:

  1. There's collision risk for simple templates that have the same overall structure, e.g. {foo}: {bar} and {baz}: {qux}. False positives for aggregation/deduplication may in some cases be even worse than false negatives.
  2. Replacing the placeholder expressions with just "%*" may hurt readability...in some cases it may be obvious what the placeholders were, but in other cases, less so.

Speaking purely as the user, these drawbacks seem significant enough that I'd rather have the placeholder expressions as well. And users who are happy without the expressions can just ignore them. I know there may be some performance hit, but as we discussed, all the identified realistic use cases for this functionality are likely going to do something much more expensive anyway with the result (e.g. make a network request), so the language performance cost is not a primary concern.
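
The collision in the first drawback can be seen in a few lines. This is only a sketch: aggregation_key is a hypothetical name, and the part tables are written out by hand to mirror what the two templates would decompose into.

```lua
-- Hypothetical key builder, following the table.concat reconstruction above.
local function aggregation_key(templateparts)
    return table.concat(templateparts, "%*")
end

-- `{foo}: {bar}` and `{baz}: {qux}` both decompose into the same literal parts.
local key_a = aggregation_key({ "", ": ", "" })
local key_b = aggregation_key({ "", ": ", "" })

-- The original source texts, by contrast, stay distinct.
local source_a, source_b = "{foo}: {bar}", "{baz}: {qux}"

print(key_a, key_a == key_b, source_a == source_b)
```

Both templates collapse to the same reconstructed key, while a key based on the source text keeps them apart.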

@Cooldude2606

This new draft addresses the issues I had, although a new one has arisen.

A clarification / brief note on how literals and whitespace are passed into the third argument should be included:

log `{42}`
-- this one is trivial
-- log("%*", { 42 }, { "42" })

log `{"Alice"}`
-- this one should note how escapes are handled
-- log("%*", { "Alice" }, { "\"Alice\"" })

log `{'Alice'}`
-- are single quotes maintained or converted to double quotes?
-- log("%*", { "Alice" }, { "\'Alice\'" })
-- log("%*", { "Alice" }, { "\"Alice\"" })

log `{ { ["foo"] = "bar", } }`
-- are spaces, quotes and commas maintained or trimmed?
-- log("%*", { { foo = "bar" } }, { "{foo=\"bar\"}" })
-- log("%*", { { foo = "bar" } }, { " { [\"foo\"] = \"bar\", } " })

log `{
    user . name
}`
-- are spaces and new lines maintained or trimmed?
-- log("%*", { "Alice" }, { "user.name" })
-- log("%*", { "Alice" }, { "\n\tuser . name\n" })

Specify that entries are verbatim source text with leading/trailing whitespace trimmed. Quoting style, internal spacing, and formatting are preserved as written. Addresses reviewer question about literals and whitespace.

Co-authored-by: Cursor <cursoragent@cursor.com>
@rofrankel
Author

This new draft addresses the issues I had, although a new one has arisen.

A clarification / brief note on how literals and whitespace are passed into the third argument should be included:

log `{42}`
-- this one is trivial
-- log("%*", { 42 }, { "42" })

log `{"Alice"}`
-- this one should note how escapes are handled
-- log("%*", { "Alice" }, { "\"Alice\"" })

log `{'Alice'}`
-- are single quotes maintained or converted to double quotes?
-- log("%*", { "Alice" }, { "\'Alice\'" })
-- log("%*", { "Alice" }, { "\"Alice\"" })

log `{ { ["foo"] = "bar", } }`
-- are spaces, quotes and commas maintained or trimmed?
-- log("%*", { { foo = "bar" } }, { "{foo=\"bar\"}" })
-- log("%*", { { foo = "bar" } }, { " { [\"foo\"] = \"bar\", } " })

log `{
    user . name
}`
-- are spaces and new lines maintained or trimmed?
-- log("%*", { "Alice" }, { "user.name" })
-- log("%*", { "Alice" }, { "\n\tuser . name\n" })

Thanks, good point, updated.

Co-authored-by: Cursor <cursoragent@cursor.com>

The template string enables aggregating logs by message pattern (e.g. grouping all "The double of %* is %*" messages), while the values and expression names provide searchable structured data for platforms like Splunk, Datadog, or Elasticsearch.

Without this feature, developers must either manually construct all arguments (tedious and error-prone) or use a logging library that implements its own template parsing at runtime (duplicating language functionality).

... use a logging library that implements its own template parsing at runtime (duplicating language functionality).

I'm not sure that duplicating language functionality is the issue here: it's more that this is also error prone in its own way. For example:

log:Info("Hello, my name is: {{nome}}", { name = "Hunter" })

... whereas we already have a mechanism for constructing strings that can provide analysis input (intellisense, error checking): it's more missed opportunity than duplicating language features being a problem.

@andyfriesen
Collaborator

Would it be better (or worse?) if the desugaring instead included byte offsets into the original string for each substitution?

log`{timestamp}: Count is {count}.  Total is {count + total}`
-- desugars to
log("{timestamp}: Count is {count}.  Total is {count + total}", {timestamp, count, count + total},
    -- (offset, length) pairs for each substitution
    {{0, 11}, {22, 7}, {45, 15}}
)

This trivializes any custom parsing you might want to do.

@andyfriesen
Collaborator

One alternative not mentioned is that you can get pretty close (albeit not with a nice interpolated string) to this behaviour just using existing functionality:

local function Log(fmt: string)
  local function log_handler(args: { any })
    print("Log event. Format string:", fmt, "Evaluated:", string.format(fmt :: any, table.unpack(args)))
  end
  return log_handler
end

local user = "Bottersnike"
Log "Hello, %*" { user }

I like this interface quite a lot.

I'd like to see this approach added to the "Alternatives" section alongside a strong argument for why it is unviable.

@rofrankel
Author

rofrankel commented Feb 23, 2026

Would it be better (or worse?) if the desugaring instead included byte offsets into the original string for each substitution?

log`{timestamp}: Count is {count}.  Total is {count + total}`
-- desugars to
log("{timestamp}: Count is {count}.  Total is {count + total}", {timestamp, count, count + total},
    -- (offset, length) pairs for each substitution
    {{0, 11}, {22, 7}, {45, 15}}
)

This trivializes any custom parsing you might want to do.

I'll leave it to you and the other reviewers to say how you all feel about this alternative from a language perspective, but I agree with you this would support the motivating use case (and is generically powerful).

Happy to update the RFC to reflect this if that's the recommended path.

@rofrankel
Author

rofrankel commented Feb 23, 2026

One alternative not mentioned is that you can get pretty close (albeit not with a nice interpolated string) to this behaviour just using existing functionality:

local function Log(fmt: string)
  local function log_handler(args: { any })
    print("Log event. Format string:", fmt, "Evaluated:", string.format(fmt :: any, table.unpack(args)))
  end
  return log_handler
end

local user = "Bottersnike"
Log "Hello, %*" { user }

I like this interface quite a lot.

I'd like to see this approach added to the "Alternatives" section alongside a strong argument for why it is unviable.

FWIW this is pretty close to the original internal proposal (not publicly linkable but it's API-1080 on internal Jira). The feedback I got from the reviewers was to write this Luau RFC instead. (The main difference is that instead of %* it uses named placeholders and takes an associative table instead of a list, because for log consumption having a self-describing template string is important. The only other material difference is that that proposal doesn't use a paren-free calling style.)

…rings

- Replace expressions list with byte-offset/length pairs per andyfriesen's suggestion

- Template string now passed with {expressions} intact (serves as aggregation key)

- Add product requirements for log consumption: human-readable templates and key-value structured data

- Move expressions-list approach to Alternatives

- Add manual format string + curried values to Alternatives per reviewer request

Co-authored-by: Cursor <cursoragent@cursor.com>
@rofrankel
Author

Would it be better (or worse?) if the desugaring instead included byte offsets into the original string for each substitution?

log`{timestamp}: Count is {count}.  Total is {count + total}`
-- desugars to
log("{timestamp}: Count is {count}.  Total is {count + total}", {timestamp, count, count + total},
    -- (offset, length) pairs for each substitution
    {{0, 11}, {22, 7}, {45, 15}}
)

This trivializes any custom parsing you might want to do.

I'll leave it to you and the other reviewers to say how you all feel about this alternative from a language perspective, but I agree with you this would support the motivating use case (and is generically powerful).

Happy to update the RFC to reflect this if that's the recommended path.

On reflection, I just updated it to reflect this.

…at strings

- Ambiguity: two natural log patterns that produce identical %* format strings

- Readability: complex pattern uninterpretable without source code lookup

- Note that call stack fingerprints can disambiguate but remain opaque

Co-authored-by: Cursor <cursoragent@cursor.com>
@andyfriesen
Collaborator

On reflection, I just updated it to reflect this.

Cool. Could you add a minor elaboration: The offsets provided for the replacements need to be 1-based so that they can be used with existing builtin string utilities like string.sub.
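
To illustrate why 1-based offsets are convenient, here is a small sketch that recovers each expression's text with string.sub. The helper name and the span values are made up for this example, and each span is assumed to cover the surrounding braces as well:

```lua
-- Hypothetical helper: `spans` is a list of {offset, length} pairs, 1-based,
-- each covering one `{...}` substitution including its braces.
local function expressions(template, spans)
    local names = {}
    for i, span in ipairs(spans) do
        local first, length = span[1], span[2]
        -- skip the opening brace and stop before the closing one
        names[i] = string.sub(template, first + 1, first + length - 2)
    end
    return names
end

local template = "Hello {name}, it is {currtime}!"
local names = expressions(template, { { 7, 6 }, { 21, 10 } })
print(names[1], names[2])
```

With 0-based offsets every call site would need a `+ 1` adjustment before handing the numbers to string.sub.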

Replace three-argument desugaring (template, values, offsets) with a simpler two-argument convention (template, values) plus two new core library functions: string.interpparse for extracting expression names and string.interp for rendering templates. string.interp supports both sequential and associative tables, enabling a single function to serve paren-free calls (positional values from locals) and parenthesized calls (named context objects). Move the previous byte-offsets design to alternatives, and add an @` metadata object syntax alternative.

Made-with: Cursor
```luau
string.interpparse("Hello {name}") -- returns {"name"}
string.interpparse("{a} + {b} = {a + b}") -- returns {"a", "b", "a + b"}
string.interpparse("No interpolations") -- returns {}
```

This seems like a downgrade in functionality from the previous draft: there's now no easy way for code to inspect the non-interpolated parts of an interpolated string. I honestly can't think of a good example use case for when you would want to, but I think it would be nice if interpparse also gave the offsets for those things. As-is, this function doesn't really seem especially useful; Luau doesn't have loadstring so the things returned from this function have very limited utility. It also doesn't scream "parsing the interpolation string" given the limited information it returns.

```luau
-- Sequential: values matched positionally to expressions
string.interp("Hello {name}", {"Alice"}) -- returns "Hello Alice"
string.interp("{a} + {b} = {a + b}", {1, 2, 3}) -- returns "1 + 2 = 3"
```

@Cooldude2606 Cooldude2606 Feb 26, 2026

This is an unclear example: is the final expression "3" because 1 + 2 is 3, or because the positional value is "3"? I assume the latter is intended. Therefore an example should be used that shows that the contents of the expression are not a constraint on the value.

e.g. string.interp("{a} + {b} = {a + b}", {1, 2, "three"}) -- returns "1 + 2 = three"

string.interp("{a} + {b} = {a + b}", {1, 2, 3}) -- returns "1 + 2 = 3"

-- Associative: values matched by expression name
string.interp("Hello {name}", {name = "Alice"}) -- returns "Hello Alice"

What happens in the following scenarios?
string.interp("Hello {name}", {"bob", name = "Alice"})
string.interp("{a} + {b} = {a + b}", {b = 1, 2, "three"})

I would assume it matches by name, then any unmatched expressions consume positional values in order:
Hello Alice and 2 + 1 = three
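
A rough sketch of that assumed precedence (match by name first, then consume positional values in order). The name interp_sketch is hypothetical, this is not the RFC's string.interp, and the naive brace matching deliberately ignores escapes and nested braces:

```lua
-- Hypothetical model of the assumed rule, for simple templates only.
local function interp_sketch(template, values)
    local next_positional = 0
    local result = template:gsub("{([^{}]*)}", function(expr)
        local named = values[expr]
        if named ~= nil then
            return tostring(named) -- a key equal to the expression text wins
        end
        next_positional = next_positional + 1
        return tostring(values[next_positional]) -- otherwise take the next array slot
    end)
    return result
end

print(interp_sketch("Hello {name}", { "bob", name = "Alice" }))
print(interp_sketch("{a} + {b} = {a + b}", { b = 1, 2, "three" }))
```

Under this model the expression text is always looked up as a string key, so a template such as `{2} {1}` would not collide with array indexes 2 and 1; whether the real function should behave that way is exactly the open question.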

@Bottersnike Bottersnike Feb 27, 2026

As another scenario, what about string.interp("{2} {1}", {"First", "Second"}) (i.e. literal values that are also array indexes)?

@Bottersnike

Bottersnike commented Feb 27, 2026

Edit, and TLDR: I don't think there's any good solution involving conversion of the interpolated string into any other string representation that is actually viable to consume. I believe the only reasonable solution is to pass the invoked function each chunk of the interpolated string (be that string chunks or value chunks) with associated metadata.

Any string format including the original representation is ripe for parsing issues, or substantial duplication of existing language code. Any string format that removes the original representation (eg %*) needs extra baggage to annotate what that value was (for structured logging).

The rest of this comment explains the issues with the currently proposed {{/}} escaping and where even the example implementation falls flat regarding this.


Something that came to mind is that {{ as an escape sequence is different to how {s are normally escaped. For example, to normally escape a brace in an interpolation string you have to do:

print(`\{{1}\}`)

Which would print {1}.

If a user was to write

print `\{{1}\}`

Then that would desugar to print("{{{1}}}", ...) if we're treating {{ as an in-string escape sequence for a non-interpolation-literal {. I'm not sure if there is a good way to make this nice and consistent with the rest of the language, but I thought it was worth raising at least as a mild concern. On a related note, {{ in an interpolated string is considered invalid syntax, with an error reminding users that "Double braces are not permitted within interpolated strings." This is useful both as a way to catch errors and to drastically ease parsing.

If we consider, for example, the string "{{{1}}}", matching the brackets is now a non-trivial operation. A naive scan for a non-duplicate { followed by a non-duplicate } is impossible to do with just string matching.

Let's consider the reference implementation of string.interp provided at https://github.com/luau-lang/luau/pull/2261/changes#diff-5754ebda9ef82adaa3498deff7066d2cef90204707aa9cb3067101238f49bebbR1726

We're going to use

capture `\{{"Hello }} world"}\}`

As an interpolated string, this evaluates to {Hello }} world}. If we parse that by hand, we first have a literal {, then a value of "Hello }} world" followed by a final literal }. If we think about escaping this into a string, we convert the literal braces into brace pairs, and end up with either "{{{\"Hello }} world\"}}}" or "{{{\"Hello }}}} world\"}}}" depending on if we also escape the braces within the value. Passing that "escaped" string into the example string.interp implementation we get an output of {Hello }}world}}world"}} or {Hello }}}}world}}world"}}. Needless to say these are both nonsense.

The reason for this is the example string.interp implementation naively counts { and } symbols to calculate the "depth" of nesting. This is thwarted by the reality of language syntax. A properly working implementation of string.interp rapidly approaches the complexity of the entire language's lexer.

At this time, I don't have an ideal solution that just drops into the existing RFC and fixes this all. The best I can think of off the top of my head would be to entirely change the signature of an interpolated string function call to pass pre-chunked parts of the string. Something like ({str1, str2, str3, str4}, {repr1, repr2, repr3}, {value1, value2, value3}) (always with N+1, N, and N items in those three tables respectively) would resolve this, allowing the lexing pass that has already been performed on the language to be utilised, rather than reconstructing a faux representation of the interpolated string for runtime use. This is obviously a large divergence from the currently suggested method.

Edit: A different example I thought of that bypasses this problem while retaining all the useful functionality would be

fn `hello {name}, do you like {food}?`
-> fn("hello ", {repr="name", value="Bottersnike"}, ", do you like ", {repr="food", value="Oranges"}, "?")
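
A sketch of what a receiver for that shape could look like, assuming literal chunks arrive as plain strings and every interpolated value arrives wrapped in a {repr, value} table. The body of fn and its two return values are illustrative only:

```lua
-- Hypothetical receiver for the chunked calling convention above.
local function fn(...)
    local rendered, fields = {}, {}
    for i = 1, select("#", ...) do
        local part = select(i, ...)
        if type(part) == "table" then
            -- interpolated value: render it and keep it as structured data
            rendered[#rendered + 1] = tostring(part.value)
            fields[part.repr] = part.value
        else
            -- literal chunk of the template
            rendered[#rendered + 1] = part
        end
    end
    return table.concat(rendered), fields
end

local message, fields = fn(
    "hello ", { repr = "name", value = "Bottersnike" },
    ", do you like ", { repr = "food", value = "Oranges" }, "?"
)
print(message)
```

The receiver gets both the rendered text and the name-to-value pairs without ever parsing braces or escapes.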

@gaymeowing
Contributor

This RFC as is isn't type safe, which I would consider an absolute requirement for any new luau feature. Given there's been all this work into making the type system as great as it is today, it'd be rather stupid to throw all that to the wayside.

So I've thought of a possible type safe alternative to what this RFC is trying to achieve which adds a new 'keyword' tagged that goes before the function keyword:

tagged function log:Info(`{rule_id: number}: {action: string} {src_ip: string}:{src_port: string} -> {dst_ip: string}:{dst_port: string} proto={proto: buffer} bytes={bytes: buffer}`)
     -- code here
end

Where an "interpolated" string is used to define the function's arguments, with the brackets containing function arguments with their types.
The non bracket stuff could be given via a string argument called full that works similarly to the self argument when a function is created by using : on a table.
There could also be an array argument that contains the stuff not in brackets given similar to self, although dunno what it'd be called, as I can't think of a good name for it.

Perhaps there could be a variable provided called format, which would be a format string version of the "interpolated" string used to define the arguments, and which you could pass to string.format to then format the arguments.
Alternatively format could be a function which the compiler sees and then de-sugars into a string.format call with all of the arguments:

tagged function log:Info(`{rule_id: number}: {action: string} {src_ip: string}:{src_port: string} -> {dst_ip: string}:{dst_port: string} proto={proto: buffer} bytes={bytes: buffer}`)
     --[[
          this would de-sugar into:
          print(string.format("%*: %* %*:%* -> %*:%* proto=%* bytes=%*", rule_id, action, src_ip, src_port, dst_ip, dst_port, proto, bytes))
     ]]
     print(format())
end

@rofrankel
Author

Thanks for all the feedback and apologies for the churn/lack of responsiveness. Some of the updates are in response to internal conversations that are happening in parallel with this quasi-public process. (meta: there's probably room for improvement there)

There might be one more substantive change to this RFC in the works, which might implicitly address some of the current feedback (and/or make it obsolete). If not then I'll respond point by point. Thanks again for the thoughtful comments.

8 participants