docs/adding-new-languages.md
4 additions & 4 deletions
@@ -8,14 +8,14 @@ Note that we recently transitioned the system to auto-generate strongly-typed AS
1. **Find or write a [tree-sitter](https://tree-sitter.github.io) parser for your language.** The tree-sitter [organization page](https://github.com/tree-sitter) has a number of parsers beyond those we currently support in Semantic; look there first to make sure you're not duplicating work. The tree-sitter [documentation on creating parsers](http://tree-sitter.github.io/tree-sitter/creating-parsers) provides an exhaustive look at the process of developing and debugging tree-sitter parsers. Though we do not support grammars written with other toolkits such as [ANTLR](https://www.antlr.org), translating an ANTLR or other BNF-style grammar into a tree-sitter grammar is usually straightforward.
2. **Create a Haskell library providing an interface to that C source.** The [`haskell-tree-sitter`](https://github.com/tree-sitter/haskell-tree-sitter) repository provides a Cabal package for each supported language. You can find an example of a pull request that adds such a package [here](https://github.com/tree-sitter/haskell-tree-sitter/pull/276/files). Each package includes a file providing:
-
- A bridged (via the FFI) reference to the top-level parser in the generated file ([example](https://github.com/tree-sitter/haskell-tree-sitter/blob/master/tree-sitter-json/TreeSitter/JSON.hs#L11)).
-
- A way to retrieve the [`tree-sitter` data](https://github.com/tree-sitter/haskell-tree-sitter/blob/master/tree-sitter-json/TreeSitter/JSON.hs#L13-L14) used to auto-generate syntax datatypes in the following steps. During parser generation, tree-sitter produces a `node-types.json` file that captures the structure of a language's grammar. The autogeneration described below in Step 4 derives datatypes from this structural representation. `node-types.json` is a data file in `haskell-tree-sitter` that is installed with the package. The function `getNodeTypesPath :: IO FilePath` is defined to locate this file, using `getDataFileName :: FilePath -> IO FilePath`, which is defined in the autogenerated `Paths_` module.
+
- A bridged (via the FFI) reference to the top-level parser in the generated file ([example](https://github.com/tree-sitter/haskell-tree-sitter/blob/master/tree-sitter-json/TreeSitter/JSON.hs#L11)).
+
- A way to retrieve the [`tree-sitter` data](https://github.com/tree-sitter/haskell-tree-sitter/blob/master/tree-sitter-json/TreeSitter/JSON.hs#L13-L14) used to auto-generate syntax datatypes in the following steps. During parser generation, tree-sitter produces a `node-types.json` file that captures the structure of a language's grammar. The autogeneration described below in Step 4 derives datatypes from this structural representation. `node-types.json` is a data file in `haskell-tree-sitter` that is installed with the package. The function `getNodeTypesPath :: IO FilePath` is defined to locate this file, using `getDataFileName :: FilePath -> IO FilePath`, which is defined in the autogenerated `Paths_` module.
3. **Create a Haskell library in Semantic to auto-generate precise ASTs.** Create a `semantic-[LANGUAGE]` package ([`semantic-python`](https://github.com/github/semantic/tree/master/semantic-python) is an example). Each package needs to provide the following API surfaces:
- `Language.[LANGUAGE].AST` - Derives Haskell datatypes from a language and its `node-types.json` file ([example](https://github.com/github/semantic/blob/master/semantic-python/src/Language/Python/AST.hs)).
- `Language.[LANGUAGE].Grammar` - Provides statically-known rules corresponding to symbols in the grammar for each syntax node, generated with the `mkStaticallyKnownRuleGrammarData` Template Haskell splice ([example](https://github.com/github/semantic/blob/master/semantic-python/src/Language/Python/Grammar.hs)).
- `Language.[LANGUAGE]` - Semantic functionality for programs in a language ([example](https://github.com/github/semantic/blob/master/semantic-python/src/Language/Python.hs)).
- `Language.[LANGUAGE].Tags` - Computes tags for code nav definitions and references found in source ([example](https://github.com/github/semantic/blob/master/semantic-python/src/Language/Python/Tags.hs)).
-
5. **Add tests for precise ASTs, tagging and graphing, and evaluating code written in that language.** Because tree-sitter grammars often change, we require extensive testing to avoid the unhappy situation of bitrotted languages that break as soon as a new grammar comes down the line. Here are examples of tests for [precise ASTs](https://github.com/github/semantic/blob/master/semantic-python/test/PreciseTest.hs), [tagging](https://github.com/github/semantic/blob/master/test/Tags/Spec.hs), and [graphing](https://github.com/github/semantic/blob/master/semantic-python/test-graphing/GraphTest.hs).
+
5. **Add tests for precise ASTs, tagging and graphing, and evaluating code written in that language.** Because tree-sitter grammars often change, we require extensive testing to avoid the unhappy situation of bitrotted languages that break as soon as a new grammar comes down the line. Here are examples of tests for [precise ASTs](https://github.com/github/semantic/blob/master/semantic-python/test/PreciseTest.hs), [tagging](https://github.com/github/semantic/blob/master/test/Tags/Spec.hs), and [graphing](https://github.com/github/semantic/blob/master/semantic-python/test-graphing/GraphTest.hs).
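As a rough illustration of why the `node-types.json` file from Step 2 carries enough information to derive datatypes, the sketch below reads a small hand-written sample in that file's format and prints a Haskell-flavoured summary of each named node. It is written in Python for brevity and is not how Semantic generates its ASTs; the sample node names and the helpers `camel`, `field_type`, `describe`, and `describe_all` are assumptions made up for this example.

```python
import json

# A hand-written sample in the shape of tree-sitter's node-types.json.
# The node names are illustrative, not copied from a real grammar.
NODE_TYPES = json.loads("""
[
  {
    "type": "pair",
    "named": true,
    "fields": {
      "key": {"multiple": false, "required": true,
              "types": [{"type": "string", "named": true}]},
      "value": {"multiple": false, "required": false,
                "types": [{"type": "number", "named": true}]}
    }
  },
  {"type": "string", "named": true},
  {"type": "number", "named": true},
  {"type": ":", "named": false}
]
""")


def camel(name):
    """Convert a snake_case node name to CamelCase."""
    return "".join(part.capitalize() for part in name.split("_"))


def field_type(spec):
    """Render one field, using `required` and `multiple` to pick a wrapper."""
    names = [camel(t["type"]) for t in spec["types"]]
    # Several alternatives are shown informally as (A | B); this is a
    # summary notation for the example, not valid Haskell.
    inner = names[0] if len(names) == 1 else "(" + " | ".join(names) + ")"
    if spec["multiple"]:
        return f"[{inner}]"
    if not spec["required"]:
        return f"Maybe {inner}"
    return inner


def describe(node):
    """Summarise one node as a record-like declaration."""
    name = camel(node["type"])
    fields = node.get("fields", {})
    if not fields:
        return f"data {name} = {name}"
    body = ", ".join(
        f"{fname} :: {field_type(spec)}" for fname, spec in sorted(fields.items())
    )
    return f"data {name} = {name} {{ {body} }}"


def describe_all(node_types):
    """Describe named nodes only; anonymous ones (punctuation) are skipped."""
    return [describe(n) for n in node_types if n["named"]]


if __name__ == "__main__":
    for line in describe_all(NODE_TYPES):
        print(line)
```

Running the script prints one declaration per named node, for example `data Pair = Pair { key :: String, value :: Maybe Number }`. The `required` and `multiple` flags decide whether a field is rendered bare, wrapped in `Maybe`, or as a list.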
To summarize, each interaction made possible by the Semantic CLI corresponds to one (or more) of the above steps:
@@ -30,4 +30,4 @@ To summarize, each interaction made possible by the Semantic CLI corresponds to
**This sounds hard.** You're right! It is currently a lot of work: just because the Semantic architecture is extensible in the expression-problem manner does not mean that adding new support is trivial.
-
**What recent changes have been made?** The Semantic authors have introduced a new architecture for language support and parsing, one that dispenses with the [assignment](https://github.com/github/semantic/blob/master/docs/assignment.md) step altogether. The `semantic-ast` package generates Haskell data types from tree-sitter grammars; these types are then translated into the [Semantic core language](https://github.com/github/semantic/blob/master/semantic-core/src/Data/Core.hs); all evaluators will then be written in terms of the Core language. Compared with the [historic process]() used to add new languages, these changes entirely obviate the process of 1) assigning types into an open union of syntax functors, and 2) implementing `Evaluatable` instances and adding value effects to describe the control flow of your language.
+
**What recent changes have been made?** The Semantic authors have introduced a new architecture for language support and parsing, one that dispenses with the [assignment](https://github.com/github/semantic/blob/master/docs/assignment.md) step altogether. The `semantic-ast` package generates Haskell data types from tree-sitter grammars. Compared with the [historic process]() used to add new languages, these changes entirely obviate the process of 1) assigning types into an open union of syntax functors, and 2) implementing `Evaluatable` instances and adding value effects to describe the control flow of your language.