Commit a5c28ff

🐛 FIX: parsing of unicode ordinals (#81)
Co-authored-by: Chris Sewell <[email protected]>
1 parent 042f403 commit a5c28ff

2 files changed: +13 -8 lines changed

markdown_it/common/utils.py

Lines changed: 6 additions & 8 deletions
@@ -92,15 +92,13 @@ def isValidEntityCode(c):
     return True
 
 
-def fromCodePoint(c):
-
-    if c > 0xFFFF:
-        c -= 0x10000
-        surrogate1 = 0xD800 + (c >> 10)
-        surrogate2 = 0xDC00 + (c & 0x3FF)
-
-        return "".join(map(chr, [surrogate1, surrogate2]))
+def fromCodePoint(c: int) -> str:
+    """Convert ordinal to unicode.
 
+    Note, in the original Javascript two string characters were required,
+    for codepoints larger than `0xFFFF`.
+    But Python 3 can represent any unicode codepoint in one character.
+    """
     return chr(c)
 
 
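
To illustrate why the surrogate-pair branch was removed, here is a minimal sketch that compares the two behaviours for U+1F4AC. The names `from_code_point_old` and `from_code_point_new` are hypothetical and used only for this comparison; the bodies mirror the removed and added lines of the diff above.

```python
def from_code_point_old(c: int) -> str:
    """Removed behaviour: split codepoints above 0xFFFF into a surrogate pair."""
    if c > 0xFFFF:
        c -= 0x10000
        surrogate1 = 0xD800 + (c >> 10)
        surrogate2 = 0xDC00 + (c & 0x3FF)
        return "".join(map(chr, [surrogate1, surrogate2]))
    return chr(c)


def from_code_point_new(c: int) -> str:
    """Added behaviour: Python 3 holds any codepoint in a single character."""
    return chr(c)


old = from_code_point_old(0x1F4AC)
new = from_code_point_new(0x1F4AC)

# The old result is two lone surrogate characters, not one emoji.
print([hex(ord(ch)) for ch in old])
print(len(old), len(new))

# Lone surrogates cannot be encoded as UTF-8, which surfaces as a UnicodeError.
try:
    old.encode("utf-8")
    old_encodes = True
except UnicodeEncodeError:
    old_encodes = False

print(old_encodes, new.encode("utf-8"))
```

The surrogate pair is a UTF-16 encoding detail that JavaScript strings expose directly. A Python 3 `str` is a sequence of codepoints, so the pair produces two invalid characters rather than one valid one.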

tests/test_port/fixtures/issue-fixes.md

Lines changed: 7 additions & 0 deletions
@@ -29,3 +29,10 @@
 <p>Another Block Quote</p>
 </blockquote>
 .
+
+#80 UnicodeError with codepoints larger than 0xFFFF
+.
+&#x1F4AC;
+.
+<p>💬</p>
+.
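
The new fixture pairs the input `&#x1F4AC;` with the expected output `<p>💬</p>`. As a rough check of that expectation, the sketch below decodes the numeric character reference with the standard library's `html.unescape`. This is a stand-in used here as an assumption, not the library's own entity handling or renderer, and the paragraph wrapping is written by hand.

```python
import html

source = "&#x1F4AC;"  # fixture input: a hexadecimal character reference
expected_html = "<p>💬</p>"  # fixture output

# Decode by hand: strip "&#x" and ";", parse the hex digits, convert with chr().
codepoint = int(source[3:-1], 16)
by_hand = chr(codepoint)

# Decode with the standard library (stand-in for the parser's entity step).
decoded = html.unescape(source)

# Wrap in a paragraph manually; a real renderer would emit this markup itself.
rendered = f"<p>{decoded}</p>"

print(hex(codepoint), by_hand == decoded, rendered == expected_html)
```

Both decoding routes yield a single character above `0xFFFF`, which is the case the fixture is meant to cover.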
