Description
Summary
Carbon already has reference expressions. We should add the ability to declare and match a binding to a reference expression.
The core idea would be:
    fn F(ptr: i32*) {
      // A reference binding `x`.
      let ref x: i32 = *ptr;
      // Use of `x` is a reference expression that refers to the same object as `*ptr`.
      x += 1;
    }
Key suggestion highlights:
- The keyword `ref` is used for this kind of binding.
- Remove `addr`, and use this for the object parameter.
- Allow this for any parameter (or nested within `let`, etc.).
- Reference binding names, when used later on, form reference expressions.
- No ability to take the address of the reference itself, `&x == ptr` above.
- No ability to use this as a class or struct field.
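For readers coming from C++, the core idea and the `&x == ptr` point can be sketched with an ordinary C++ reference. This is only an analogue, not Carbon code, and the helper `IncrementedOnce` is a hypothetical name added for illustration:

```cpp
#include <cassert>

// C++ analogue of the Carbon function `F` above.
void F(int* ptr) {
  // `x` names the same object as `*ptr`; no copy is made.
  int& x = *ptr;
  x += 1;
  // Taking the address of `x` yields the address of the referenced object,
  // not the address of some separate "reference object".
  assert(&x == ptr);
}

// Hypothetical helper: runs `F` once on a local and returns the result.
int IncrementedOnce(int start) {
  int value = start;
  F(&value);
  return value;
}
```

The analogy is loose: a C++ reference can be a class member, which the suggestion above explicitly rules out for `ref` bindings.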
Background
Reference bindings have come up multiple times:
- Better alternative to `addr self: Self*`
- Lambda captures
- Nested bindings within a destructured `var`
They also closely match the expression category.
How addresses interact with ref
The suggested model is that `ref` bindings mirror reference expressions in that they refer back to some underlying object. As a consequence, it should be possible to take the address of a `ref` binding and get the address of that object.
However, we expect reference expressions, and as a consequence `ref` bindings, to work more like Swift `inout` than like a pointer: there may be implicit copies or moves that occur prior to forming the reference expression or binding it to a name. The goal is that it should be possible for some types to implement `ref` parameters through move-in/move-out semantics.
When we have a `ref` binding specifically, we expect its address to be stable for the lifetime of the binding. There is no valid move-in/move-out semantic model for overlapping bindings: those must all reference the same underlying object, and their addresses must all match in addition to being stable. But for non-overlapping bindings such as parameters, a move-in/move-out model should be equally valid from the perspective of the `ref` binding, and the address within the function might be different from the address in the caller.
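The difference between the two models can be sketched in C++ by emulating move-in/move-out with a by-value parameter and a returned value. This is an illustration under assumptions, not how Carbon would implement `ref`; every name below is hypothetical:

```cpp
#include <string>
#include <utility>

// Outcome of the two calls; all names here are hypothetical.
struct AddressDemoResult {
  std::string by_reference;
  std::string by_move;
  bool reference_saw_caller_address;
  bool move_saw_caller_address;
};

// Reference model: the callee operates directly on the caller's object,
// so the address it observes matches the caller's.
bool AppendByReference(std::string& s, const std::string* caller_address) {
  s += "!";
  return &s == caller_address;
}

// Move-in/move-out model: the value is moved into a callee-owned object,
// mutated there, and moved back out through the return value.
std::string AppendByMove(std::string s, const std::string* caller_address,
                         bool* same_address) {
  s += "!";
  // `s` is a distinct live object, so its address differs from the caller's.
  *same_address = (&s == caller_address);
  return s;
}

AddressDemoResult RunAddressDemo() {
  AddressDemoResult result{"hi", "hi", false, false};
  result.reference_saw_caller_address =
      AppendByReference(result.by_reference, &result.by_reference);
  result.by_move = AppendByMove(std::move(result.by_move), &result.by_move,
                                &result.move_saw_caller_address);
  return result;
}
```

Both calls leave the caller with the same final value; only the address observed inside the callee differs, which is the freedom described above for non-overlapping bindings.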
At least in cases where a type permits move-in/move-out, the address of a `ref` parameter should be implicitly `nocapture` in LLVM's semantic model, for example. Whether we go further and restrict `ref` to be LLVM-`nocapture` more broadly is an open question that can likely also be an area for future work.
More general replacement for addr
We should consider `ref` as being available as a general part of our patterns: for any parameter and in local declarations.
However, this will provide a significantly more general and likely more ergonomic replacement for `addr`, and we should remove `addr` in favor of `ref self: Self` object parameters for methods that mutate (or take the address of) `self`. Potentially abbreviating the syntax further should be left as future work.
Improved C++ interop and migration
We expect this to improve interop and migration by allowing significantly more interface similarity between Carbon and C++. Previously, many things in C++ that used references on interface boundaries would be forced to switch to pointers. This adds ergonomic friction at a basic level, because of the forced change, and at a deeper level, because it makes it significantly harder to see the parallel usage across the boundary between C++ and Carbon. With reference bindings, the vast majority of this dissonance will be removed.
Open question: call site annotation
One important open question that should be answered here, at least initially, has to do with call-site annotation. When using pointers, there was often a "built-in" call-site annotation of `&` when passing mutable state into a function. We need to decide what to do in that case.
There are three possible answers in this leads issue:
1. We need a call-site annotation to move forward with reference bindings.
2. We don't need a call-site annotation decision to move forward directionally with reference bindings, and should consider that separately.
3. We don't need a call-site annotation to reject reference bindings.
If the result is (1), then we will need to work through what that annotation looks like (likely candidate: `F(ref <expr>)`).
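The "built-in" annotation in question is easiest to see in C++, where the same mutation can be exposed through a pointer or a reference. The sketch below is a C++ analogue with hypothetical function names; it does not show the candidate Carbon syntax:

```cpp
// Pointer parameter: callers must write `&`, which flags the mutation.
void ResetViaPointer(int* p) { *p = 0; }

// Reference parameter: the call looks identical to pass-by-value.
void ResetViaReference(int& r) { r = 0; }

// Hypothetical driver that returns the sum of both values after the calls.
int CallSiteDemo() {
  int a = 5;
  int b = 7;
  ResetViaPointer(&a);   // Mutation is visible at the call site.
  ResetViaReference(b);  // Nothing at the call site hints at mutation.
  return a + b;
}
```

A call-site annotation such as the candidate mentioned above would restore the visual cue that the second call lacks.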
Details of impact on the type system
These will ultimately be part of the type system, but the goal is for them to only be part of the type system through patterns used in the type system: function parameters, etc.
Specifically, we don't expect them to be part of the object types in Carbon, but only part of the expression categories and bindings within patterns. In this regard, they are very similar to value bindings -- we retain a great deal of implementation flexibility around layout, etc.
This specifically means we will need to incorporate `ref` bindings into the `Call` interface, and we will be adding complexity there that will need to be handled by overloading. The overloading impact specifically is likely future work, but will at least carry additional complexity to handle `ref`.
Details of lifetimes
We should ensure that reference expressions formed via reference bindings do not dangle.
So for any reference expression that has a known lifetime already in the language, such as those associated with temporaries or `var` declarations, we should either lifetime-extend (in the case of temporaries) or error (in the case of declarations) when trying to form a binding that would outlive the referenced object.
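C++ already has a form of the lifetime-extension half of this rule, which may serve as a reference point. The sketch below is a C++ analogue with hypothetical names, not a statement of the Carbon rules:

```cpp
#include <cstddef>
#include <string>

// Hypothetical factory that returns a temporary.
std::string MakeGreeting() { return "hello"; }

std::size_t ExtendedTemporaryLength() {
  // Binding a reference directly to a temporary extends the temporary's
  // lifetime to the lifetime of the reference.
  const std::string& greeting = MakeGreeting();
  return greeting.size();  // The temporary is still alive here.
}

// The declaration case, by contrast, is not rejected by C++; compilers
// typically only warn. A binding like the one sketched below would dangle:
//
//   const std::string& Dangling() {
//     std::string local = "oops";
//     return local;  // `local` is destroyed when the function returns.
//   }
```

The suggestion above goes further than C++ for declarations by making the dangling case an error rather than a warning.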
For reference expressions that currently have no known lifetime, such as dereferenced pointers, we should allow them today despite the unsafety. However, we should fully expect lifetime safety in Carbon to eventually introduce a way of reasoning about these lifetimes, and with that a requirement that the lifetime of the binding be satisfied. That should be explicitly expected as future work and part of getting an overall safety story for Carbon.
This does fundamentally mean that we now have another kind of "pointer", potentially adding complexity to any memory-safety story. However, I think this ship already sailed to some extent with value bindings. Fundamentally, bindings are allowed to have pointer-like semantics from a lifetime perspective, and so will need to be considered as a pointer-like thing as we build out lifetime safety.
Details that should be addressed in a proposal
When this goes to a proposal, there is a collection of important details that will need to be worked out. However, this issue suggests that we do not need these details to decide this issue directionally. This issue is about "should we have reference bindings in the language in some form", and any details that are necessary to resolve that should be pulled up and covered in this issue. We don't want to expend the (considerable) effort of building a proposal for this without reasonable alignment that we'll actually move forward.
Specific details that we'll defer to the proposal:
- Exact structure of how `ref` is attached within patterns w.r.t. destructuring, etc.
- Should we have a top-level `ref` introducer, or is `let ref` good enough?
- Exact specification of how this surfaces in `Call` or other interfaces.
- Adding `ref` lambda captures and all of the details that need to be resolved there; this may even be deferred into a second proposal.