Commit e257051

zygoloid and chandlerc authored
No predeclared identifiers, Core is a keyword (#4864)
Introduce a principle that the Carbon language should not encroach on the developer's namespace. Satisfy this principle by making `Core` a keyword.

---------

Co-authored-by: Chandler Carruth <[email protected]>
1 parent e4e6332 commit e257051

6 files changed, +309 -3 lines changed

docs/design/code_and_name_organization/README.md

Lines changed: 4 additions & 1 deletion
@@ -546,6 +546,7 @@ the caller.

 ```regex
 import IDENTIFIER (library NAME_PATH)?;
+import Core (library NAME_PATH)?;
 import library NAME_PATH;
 import library default;
 ```
@@ -554,7 +555,9 @@ An import with a package name `IDENTIFIER` declares a package entity named after
 the imported package, and makes API entities from the imported library available
 through it. `Main` cannot be imported from other packages; in other words, only
 `import library NAME_PATH` syntax can be used to import from `Main`. Imports of
-`Main//default` are invalid.
+`Main//default` are invalid. The keyword `Core` can be used as a package name in
+an import in order to import portions of the standard library that are not part
+of the prelude.

 The full name path is a concatenation of the names of the package entity, any
 namespace entities applied, and the final entity addressed. Child namespaces or
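
The four import forms in the grammar above can be sketched as a small classifier. This is only an illustration, not the toolchain's parser: the function name `classify_import`, the simplified identifier pattern, the partial keyword set, the treatment of `NAME_PATH` as a double-quoted string, and the choice to reject `Main` even when written with `r#` are all assumptions made for the example.

```python
import re

# Simplified stand-ins for the grammar's terminals (assumptions, not Carbon's lexer).
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
NAME_PATH = r'"[^"]+"'  # assumes a library name is written as a quoted string

# Partial keyword set, limited to words this commit mentions.
KEYWORDS = {"Core", "import", "library", "package", "base", "self", "Self"}

CORE_RE = re.compile(rf"import Core(?: library {NAME_PATH})?;")
LIBRARY_RE = re.compile(rf"import library {NAME_PATH};")
PACKAGE_RE = re.compile(rf"import (r#)?({IDENT})(?: library {NAME_PATH})?;")


def classify_import(decl):
    """Return which import form a declaration uses, or "invalid"."""
    text = decl.strip()
    if CORE_RE.fullmatch(text):
        return "core_package"  # keyword `Core`: the standard library
    if text == "import library default;":
        return "same_package_default"
    if LIBRARY_RE.fullmatch(text):
        return "same_package_library"
    match = PACKAGE_RE.fullmatch(text)
    if match:
        raw_prefix, name = match.groups()
        if name == "Main":
            return "invalid"  # `Main` cannot be imported from other packages
        if raw_prefix is None and name in KEYWORDS:
            return "invalid"  # a bare keyword is not an IDENTIFIER
        return "other_package"  # includes `r#Core`, an unrelated package
    return "invalid"
```

Under these assumptions, `import Core library "io";` is classified as `core_package`, while `import r#Core;` is an ordinary `other_package` import.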

docs/design/lexical_conventions/words.md

Lines changed: 3 additions & 1 deletion
@@ -39,7 +39,7 @@ in Unicode Normalization Form C (NFC).

 <!--
 Keep in sync:
-- utils/textmate/Syntaxes/Carbon.plist
+- utils/textmate/Syntaxes/carbon.tmLanguage.json
 - utils/tree_sitter/queries/highlights.scm
 -->

@@ -54,6 +54,7 @@ The following words are interpreted as keywords:
 - `auto`
 - `base`
 - `break`
+- `Core`
 - `case`
 - `choice`
 - `class`
@@ -92,6 +93,7 @@ The following words are interpreted as keywords:
 - `return`
 - `returned`
 - `Self`
+- `self`
 - `template`
 - `then`
 - `type`
docs/project/principles/namespace_cleanliness.md

Lines changed: 122 additions & 0 deletions
@@ -0,0 +1,122 @@
+# Principle: Namespace cleanliness
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+<!-- toc -->
+
+## Table of contents
+
+- [Background](#background)
+- [Principle](#principle)
+- [Applications of this principle](#applications-of-this-principle)
+- [Exceptions](#exceptions)
+- [Alternatives considered](#alternatives-considered)
+
+<!-- tocstop -->
+
+## Background
+
+Names and entities in a program can come from multiple sources -- from a local
+declaration, from an import, from the standard library, or from the prelude.
+Names can be imported from Carbon code or imported or derived from code written
+in another language such as C++ or an interface description language such as
+that of Protobuf, MIDL, or CORBA. Names can be selected for use in a program
+that language designers later decide they want to use as keywords. And in order
+to use a library, it is sometimes necessary to redeclare the same name that that
+library chose.
+
+This puts a lot of pressure on the language to support a free choice of naming
+for entities. Different languages make different choices in this space:
+
+- Many languages have a set of keywords that are not usable as identifiers,
+  with no workaround. If this set collides with a name needed by user code,
+  the user is left to solve this problem, often by rewriting the identifier in
+  some way (`klass` or `class_`), which sometimes conflicts with the general
+  naming convention used by the code. And conversely, suboptimal choices are
+  made for new language keywords to avoid causing problems for existing code.
+- C and C++ reserve a family of identifiers, such as those beginning with an
+  underscore and a capital letter. However, it's not clear which audiences the
+  reserved identifiers are for, and this leads to collisions between standard
+  library vendors and compiler authors, as well as between implementation
+  extensions and language extensions.
+  - MSVC provides a `__identifier(keyword)` extension that allows using a
+    keyword as an identifier. This extension is also implemented by Clang in
+    `-fms-extensions` mode.
+  - GCC provides an `__asm__(symbol)` extension that allows a specific
+    symbol to be assigned to an object or function, which provides ABI
+    compatibility but not source compatibility with code that uses a keyword
+    as a symbol name. This extension is also implemented by Clang.
+- Python reserves some identifiers but still allows them to be freely
+  overwritten (such as `bool`) and reserves some identifiers but rejects
+  assignment to them (such as `True`).
+- Rust provides a raw identifier syntax to allow most identifiers with
+  reserved meaning to be used by a program, but
+  [not all](https://internals.rust-lang.org/t/raw-identifiers-dont-work-for-all-identifiers/9094):
+  `self`, `Self`, `super`, `extern`, and `crate` cannot be used as raw
+  identifiers. Rust also predeclares a large number of library names in every
+  file, but allows them to be shadowed by user declarations with the same
+  name.
+- Swift provides a raw identifier syntax using backticks: `` `class` ``, and
+  is
+  [considering](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0451-escaped-identifiers.md)
+  extending this to allow arbitrary non-word-shaped character sequences
+  between the `` ` ``s.
+
+Carbon provides
+[raw identifier syntax](/docs/design/lexical_conventions/words.md#raw-identifiers),
+for example `r#for`, to allow using keywords as identifiers. Carbon also intends
+to have strict shadowing rules that may make predeclared identifiers that are
+_not_ keywords difficult or impossible to redeclare and use in inner scopes.
+
+## Principle
+
+In Carbon, the language does not encroach on the developer's namespace. There
+are no predeclared or reserved identifiers. In cases where the language gives
+special meaning to a word or to word-shaped syntax such as `i32`, that special
+meaning can always be undone with raw identifier syntax, `r#`.
+
+Conversely, when adding language keywords, we will not select an inferior
+keyword merely to avoid the risk of breaking existing programs. We will still
+take into account how often it is desirable to use the word as an identifier,
+including in domain-specific contexts, because that is a factor in whether it
+would make a good keyword, and will manage the rollout of new keywords to make
+it straightforward to migrate existing uses to `r#` or a different name.
+
+## Applications of this principle
+
+- Words like `final` and `base` that only have special meaning in a few
+  contexts, and could otherwise be made available as identifiers, are keywords
+  in Carbon. `{.base = ...}` and `{.r#base = ...}` specify different member
+  names.
+- Words like `self` that are declared by the developer but nonetheless have
+  special language-recognized meaning are keywords in Carbon. `[self:! Self]`
+  introduces a self parameter; `[r#self:! Self]` introduces a deduced
+  parameter.
+- Words like `Self` that are implicitly declared by the language in some
+  contexts are keywords, even though we could treat them as user-declared
+  identifiers that are merely implicitly declared in some cases.
+- Words like `i32` that are treated as type literals rather than keywords can
+  be used as identifiers with raw identifier syntax `r#i32`.
+- There are no predeclared identifiers imported from the prelude. If an entity
+  is important enough to be available by default, we should add a keyword, and
+  allow the name of the entity to be used for other purposes with `r#`.
+- The predeclared package name `Core` is a keyword. A package named `r#Core`
+  is an unrelated package, and `Core.foo` always refers to members of the
+  predeclared `Core` package.
+
+## Exceptions
+
+For now, we reserve the package names `Main` and `Cpp`. These names aren't
+predeclared in any scope, and the name `Main` is not even usable from within
+source files to refer to the main package. However, there is currently no way to
+avoid collisions between the package name `Cpp` and a top-level entity named
+`Cpp` if they are both used in the same source file.
+
+## Alternatives considered
+
+- [Have both predeclared identifiers and keywords](/proposals/p4864.md#have-both-predeclared-identifiers-and-keywords)
+- [Reserve words with a certain spelling](/proposals/p4864.md#reserve-words-with-a-certain-spelling)
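
The rule in the principle document, that any special meaning of a word can be undone with `r#`, can be sketched as a word classifier. This is a minimal illustration rather than the toolchain's lexer: the name `classify_word`, the partial keyword set (only words named in this commit), and the `[iuf]N` shape assumed for type literals such as `i32` are assumptions for the example.

```python
import re

# Partial keyword set, limited to words this commit names as keywords.
KEYWORDS = {
    "auto", "base", "break", "case", "choice", "class", "Core", "final",
    "for", "package", "return", "returned", "Self", "self", "template",
    "then", "type",
}

WORD_RE = re.compile(r"(r#)?([A-Za-z_][A-Za-z0-9_]*)")
# Assumed shape of word-like type literals such as `i32`.
TYPE_LITERAL_RE = re.compile(r"[iuf][1-9][0-9]*")


def classify_word(word):
    """Return (kind, name) for a single word-shaped token."""
    match = WORD_RE.fullmatch(word)
    if match is None:
        raise ValueError(f"not a word-shaped token: {word!r}")
    raw_prefix, name = match.groups()
    if raw_prefix:
        # `r#` strips any special meaning: the result is always a plain identifier.
        return ("identifier", name)
    if name in KEYWORDS:
        return ("keyword", name)
    if TYPE_LITERAL_RE.fullmatch(name):
        return ("type_literal", name)
    return ("identifier", name)
```

In this sketch `Core` and `r#Core` are distinct tokens decided purely by spelling, which matches the claim that no name lookup is needed to tell them apart.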

proposals/p4864.md

Lines changed: 177 additions & 0 deletions
@@ -0,0 +1,177 @@
+# No predeclared identifiers, `Core` is a keyword
+
+<!--
+Part of the Carbon Language project, under the Apache License v2.0 with LLVM
+Exceptions. See /LICENSE for license information.
+SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+-->
+
+[Pull request](https://github.com/carbon-language/carbon-lang/pull/4864)
+
+<!-- toc -->
+
+## Table of contents
+
+- [Abstract](#abstract)
+- [Problem](#problem)
+- [Background](#background)
+- [Proposal](#proposal)
+- [Details](#details)
+- [Rationale](#rationale)
+- [Future work](#future-work)
+  - [Package name `Cpp`](#package-name-cpp)
+  - [Package name `Main`](#package-name-main)
+- [Alternatives considered](#alternatives-considered)
+  - [Have both predeclared identifiers and keywords](#have-both-predeclared-identifiers-and-keywords)
+  - [Reserve words with a certain spelling](#reserve-words-with-a-certain-spelling)
+
+<!-- tocstop -->
+
+## Abstract
+
+Introduce a principle that the Carbon language should not encroach on the
+developer's namespace. Satisfy this principle by making `Core` a keyword.
+
+## Problem
+
+Ongoing design work needs rules for how to expose types such as a primitive
+array type to Carbon code, and in particular, if we choose to make it available
+by default, whether that should be accomplished by a keyword or a predeclared
+identifier.
+
+## Background
+
+See the
+[Background section of the added principle](/docs/project/principles/namespace_cleanliness.md#background).
+
+## Proposal
+
+We choose to not have any predeclared identifiers in Carbon. If a word has
+special meaning to the language, then that word is a keyword, and a plain
+identifier with no special meaning is always available using raw identifier
+syntax.
+
+## Details
+
+See [the principle document](/docs/project/principles/namespace_cleanliness.md)
+for details of the added principle. In addition, we make one change and one
+clarification:
+
+- `Core` is changed from being an identifier that happens to be the name of
+  the Carbon standard library, and happens to be predeclared in every source
+  file as naming that library, to being a keyword. The keyword can only be
+  used:
+
+  - When importing the `Core` package.
+  - When implementing the `Core` package as part of the language
+    implementation.
+  - As a keyword naming the `Core` package, much like the `package` keyword.
+
+  The identifier `r#Core` can be used freely and does not conflict with the
+  keyword. This includes use of `r#Core` as the name of a package. Language
+  constructs that are defined in terms of entities in the `Core` package refer
+  specifically to the package named with the _keyword_ `Core`, not to any
+  other entity named `Core`.
+
+- The `self` keyword is now included in the list of keywords. It is already
+  treated as a keyword by the toolchain.
+
+## Rationale
+
+- [Language tools and ecosystem](/docs/project/goals.md#language-tools-and-ecosystem)
+  - Code generation tools can have a uniform handling for all words with
+    special meaning, with no need to alter the spelling of names from other
+    languages.
+  - Language tools can determine the meaning of `Core.<name>` without
+    needing to do any name lookup or sophisticated analysis.
+- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution)
+  - Migration between versions of Carbon with a changed set of reserved
+    words can be done uniformly.
+  - Adding names to the prelude remains a non-breaking change. Adding new
+    predeclared names requires adding a keyword, with the same cost and
+    value tradeoffs regardless of whether the keyword names a library
+    declaration or introduces new language syntax.
+- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
+  - Syntax highlighting tools can easily distinguish between words with
+    special meaning and words with program-defined meaning.
+  - The meaning of core language constructs can be defined as a rewrite in
+    terms of `Core.<name>` without concern that `Core` may have some
+    different local interpretation.
+- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)
+  - All C++ identifiers are nameable from Carbon code without conflicts.
+    Virtual functions introduced in C++ can be overridden in Carbon
+    regardless of their name. C++ code can be migrated to Carbon even if its
+    name in C++ has special meaning in Carbon.
+- [Principle: Prefer providing only one way to do a given thing](/docs/project/principles/one_way.md)
+  - This proposal specifies that there is only one way to give words special
+    meaning in Carbon, and one way to resolve issues if that special meaning
+    conflicts with another desired meaning.
+
+## Future work
+
+### Package name `Cpp`
+
+The special package name `Cpp` that refers to code written in C++ is not made a
+keyword by this proposal, but this proposal is also not deciding that it should
+_not_ be a keyword. While this name has special meaning to the language, it's
+not predeclared in any context, so it's considered to be out of scope. A future
+proposal that describes the details of C++ import should determine whether this
+name becomes a keyword. Notably, making `Cpp` a keyword would also allow an
+`import Cpp` declaration to have custom syntax, which may be useful.
+
+### Package name `Main`
+
+The special package name `Main` that is currently reserved in all package name
+contexts is not made a keyword in this proposal either. There would be no
+meaning in making it a keyword, as it is never used as a special package name in
+Carbon source files. However, we could consider using an empty package name as
+the name of the main package, and unreserving the package name `Main`, if it
+becomes a concern that we reserve this name.
+
+## Alternatives considered
+
+### Have both predeclared identifiers and keywords
+
+We could provide both predeclared identifiers and keywords. Many languages
+follow this path. However, predeclared identifiers have some problems compared
+to keywords:
+
+- In order to locally declare a name matching a predeclared identifier, the
+  name would need to be shadowed.
+  - Such shadowing may be invalid, depending on how the name is used.
+  - Readability is harmed by using a name used as basic vocabulary with a
+    different, local meaning.
+  - Shadowing a predeclared identifier typically makes the original name
+    hard to access -- an alias or similar must be established in advance.
+- There need to be two different stories for how to deal with adding a new
+  word with special meaning to the language, depending on whether it is a
+  keyword.
+- For each word with special meaning, we must make an arbitrary decision as to
+  which kind it is, resulting in a largely meaningless distinction that
+  nonetheless is visible and would need to be known by developers in some
+  contexts.
+
+### Reserve words with a certain spelling
+
+We could reserve words with certain spellings for future use as keywords or as
+vendor extensions. Some languages do this:
+
+- C reserves words starting with an underscore followed by a capital letter or
+  an underscore.
+- C++ additionally reserves words containing a double underscore anywhere.
+- Python uses the `__name__` namespace for certain special names, and by
+  convention these names are reserved for that purpose.
+
+In Carbon we could accomplish this by saying that all words of the reserved
+forms are keywords, with no meaning ascribed to them yet.
+
+However, we do not have a clear need for such reserved words at this time, and
+we would not want to use such spellings when we do add language keywords later.
+Moreover, C++ programs frequently declare reserved words in practice, and we
+should expect the same in Carbon. Without enforcement, the names are not
+effectively reserved.
+
+If we find a need at a later time to introduce vendor-specific language
+extension keywords, we can revisit this, but should also consider alternatives
+such as a `k#foo` spelling to turn what is normally an identifier into a
+(potentially vendor-specific) keyword.
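
The spelling-based reservations listed under "Reserve words with a certain spelling" can be sketched as two predicates. The function names are hypothetical, and the sketch covers only the two rules quoted in the proposal text. Other C and C++ reservations, such as names beginning with a single underscore at file scope, are deliberately left out.

```python
def is_reserved_in_c(name):
    """True if `name` starts with an underscore followed by a capital letter
    or a second underscore (the C rule quoted in the proposal)."""
    if len(name) < 2 or name[0] != "_":
        return False
    second = name[1]
    return second == "_" or "A" <= second <= "Z"


def is_reserved_in_cpp(name):
    """C++ additionally reserves names containing a double underscore anywhere."""
    return is_reserved_in_c(name) or "__" in name
```

Nothing in either language stops a program from declaring such names, which is the proposal's point: without enforcement, a check like this is advisory only.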

utils/textmate/Syntaxes/carbon.tmLanguage.json

Lines changed: 1 addition & 1 deletion
@@ -223,7 +223,7 @@
 "patterns": [
   {
     "name": "support.class.carbon",
-    "match": "(?<=\\bpackage\\s)\\w+"
+    "match": "(?<=\\b(package|Core)\\s)\\w+"
   },
   {
     "name": "support.variable.carbon",

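The effect of the updated `match` pattern in the TextMate grammar can be explored with a small Python sketch. Python's `re` module only accepts fixed-width lookbehind, so the sketch splits the alternation into two separate lookbehinds; this is a rephrasing for illustration, not the pattern stored in the grammar file, and the helper name `highlighted_words` is hypothetical.

```python
import re

# Equivalent rephrasing of (?<=\b(package|Core)\s)\w+ using two
# fixed-width lookbehinds, as Python's `re` requires.
PATTERN = re.compile(r"(?:(?<=\bpackage\s)|(?<=\bCore\s))\w+")


def highlighted_words(line):
    """Return the words the pattern would select on one line of source."""
    return PATTERN.findall(line)
```

With this rephrasing, the word following `package` or `Core` plus one whitespace character is selected, while `Core.Int` selects nothing because `Core` is followed by a period rather than whitespace.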
utils/tree_sitter/queries/highlights.scm

Lines changed: 2 additions & 0 deletions
@@ -89,6 +89,7 @@
 "auto"
 "base"
 "break"
+; "Core"
 "case"
 "choice"
 "class"
@@ -126,6 +127,7 @@
 "return"
 "returned"
 "Self"
+; "self"
 "template"
 "then"
 "type"
