Commit 5ce604a

Add documentation for escape sequences (#372)
* Describe valid escape sequences in string literals
* Describe differences from Haskell regarding strings and chars
1 parent 0187e2e commit 5ce604a

2 files changed, +48 -1 lines changed

language/Differences-from-Haskell.md

Lines changed: 8 additions & 0 deletions
@@ -52,6 +52,14 @@ ap :: forall m a b. (Monad m) => m (a -> b) -> m a -> m b

 There is a native `Number` type which represents JavaScript's standard IEEE 754 float and an `Int` which is restricted to the range of 32-bit integers. In JavaScript, the `Int` values and operations are generated with a `|0` postfix to achieve this, e.g. if you have variables `x`, `y`, and `z` of type `Int`, then the PureScript expression `(x + y) * z` would compile to `((x + y)|0 * z)|0`.

+#### Strings
+
+There is a native `String` type which is distinct from `Array Char`. A `String` consists of UTF-16 code units and may contain unpaired surrogates. Working with Unicode code points is supported by library functions. The set of supported escape sequences for string literals is different from both Haskell and JavaScript.
+
+#### Chars
+
+PureScript has a type `Char` which represents a UTF-16 code unit for compatibility with JavaScript. In contrast, the type `Char` in Haskell represents a Unicode code point.
+
 ### Unit

 PureScript has a type `Unit` used in place of Haskell's `()`. The `Prelude` module provides a value `unit` that inhabits this type.
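
The `|0` truncation and the UTF-16 code-unit model described in this hunk can be observed directly in JavaScript. The following TypeScript sketch is illustrative only and is not the compiler's actual output; `addThenMultiply` and `codeUnits` are hypothetical names introduced here. Because `*` binds more tightly than `|` in JavaScript, the sketch parenthesizes each `| 0` explicitly.

```typescript
// Hypothetical helper: models 32-bit Int arithmetic for `(x + y) * z`.
// `| 0` coerces a JavaScript number to a signed 32-bit integer.
// Explicit parentheses are needed because `*` has higher precedence than `|`.
function addThenMultiply(x: number, y: number, z: number): number {
  return (((x + y) | 0) * z) | 0;
}

// Hypothetical helper: lists the UTF-16 code units that make up a string.
function codeUnits(s: string): number[] {
  const units: number[] = [];
  for (let i = 0; i < s.length; i++) {
    units.push(s.charCodeAt(i));
  }
  return units;
}

const checkMark = "\u2713";          // U+2713, a single code unit
const grinningFace = "\ud83d\ude00"; // U+1F600, stored as a surrogate pair
const loneSurrogate = "\ud800";      // an unpaired surrogate is still a valid string

console.log(addThenMultiply(1, 2, 3));          // 9
console.log(addThenMultiply(2147483647, 1, 1)); // -2147483648 (wraps around)
console.log(codeUnits(checkMark).length);       // 1
console.log(codeUnits(grinningFace).length);    // 2
console.log(codeUnits(loneSurrogate).length);   // 1
```

A code point above U+FFFF occupies two code units, which is why a type representing a single code unit cannot hold every Unicode code point on its own.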

language/Syntax.md

Lines changed: 40 additions & 1 deletion
@@ -157,9 +157,48 @@ String literals are enclosed in double-quotes and may extend over multiple lines

 Line breaks will be omitted from the string when written this way.

+#### Escape sequences
+
+String literals can contain a variety of escape sequences. The following escape sequences insert a commonly used control character:
+
+``` purescript
+"\t" -- tab (U+0009)
+"\n" -- line feed (U+000A)
+"\r" -- carriage return (U+000D)
+```
+
+The following escape sequences insert a character which normally has special meaning in string literals:
+
+``` purescript
+"\\" -- backslash
+"\"" -- double-quote
+"\'" -- apostrophe
+```
+
+Hexadecimal escape sequences can be used to insert an arbitrary Unicode code point. They start with `\x` and can contain 1 to 6 hexadecimal digits.
+
+``` purescript
+"\x0" -- U+0000 (the lowest valid code point)
+"\x2713" -- U+2713 (check mark)
+"\x02713" -- U+2713 as well
+"\x10ffff" -- U+10FFFF (the highest valid code point)
+```
+
+If you want to include a hexadecimal digit after such an escape sequence you have two options. You can break the string into two parts:
+
+``` purescript
+"\x2713" <> "1"
+```
+
+or use leading zeros in the escape sequence to make sure it has six digits:
+
+``` purescript
+"\x0027131"
+```
+
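
The rule for `\x` escapes described above (1 to 6 hexadecimal digits, read greedily) can be modeled in a few lines. The following TypeScript sketch is not the PureScript compiler's lexer; `decodeHexEscapes` and `codePointToUtf16` are hypothetical names, and every escape other than `\x` is deliberately left untouched. It shows why padding to six digits separates the escape from a following hexadecimal digit.

```typescript
// Hypothetical helper: encodes a code point as UTF-16 code units
// (one unit below U+10000, otherwise a surrogate pair).
function codePointToUtf16(cp: number): string {
  if (cp > 0x10ffff) {
    throw new RangeError("code point out of range: " + cp.toString(16));
  }
  if (cp < 0x10000) {
    return String.fromCharCode(cp);
  }
  const offset = cp - 0x10000;
  return String.fromCharCode(0xd800 + (offset >> 10), 0xdc00 + (offset & 0x3ff));
}

// Hypothetical helper: replaces each `\x` escape (1 to 6 hex digits, greedy)
// with the code point it names. Any other backslash pair, including `\\`,
// is matched by the second alternative and returned unchanged.
function decodeHexEscapes(source: string): string {
  return source.replace(
    /\\(?:x([0-9a-fA-F]{1,6})|[\s\S])/g,
    (match: string, digits: string | undefined) =>
      digits === undefined ? match : codePointToUtf16(parseInt(digits, 16))
  );
}

// Five digits are all consumed: this is U+27131, not U+2713 followed by "1".
const greedy = decodeHexEscapes("\\x27131");
// Six digits end the escape, so the trailing "1" stays a literal character.
const padded = decodeHexEscapes("\\x0027131");

console.log(greedy === "\ud85c\udd31");  // true
console.log(padded === "\u2713" + "1");  // true
```

Under this model the split form `"\x2713" <> "1"` and the padded form `"\x0027131"` produce the same two code units.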
 #### Triple-quote Strings

-If line breaks are required in the output, they can be inserted with `\n`. Alternatively, you can use triple double-quotes to prevent special parsing of escaped symbols. This also allows the use of double quotes within the string with no need to escape them:
+You can use triple double-quotes to prevent special parsing of escaped symbols. This also allows the use of double-quotes within the string with no need to escape them:

 ``` purescript
 jsIsHello :: String
