rdon-key commented Sep 1, 2025

Assumptions

The generated output is now sorted by rune, and each rune appears exactly once.
This matches the assumptions made by const2bit.GetGlyph,
which relies on binary search and does not support duplicate runes.
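
To illustrate why a binary-search lookup needs sorted input, here is a minimal Go sketch. It is not the const2bit code: the glyph type and the findGlyph helper are hypothetical stand-ins, and only the lookup strategy (binary search over runes) is taken from the description above.

```go
package main

import (
	"fmt"
	"sort"
)

// glyph is a hypothetical stand-in for one generated glyph entry.
type glyph struct {
	r    rune
	name string
}

// findGlyph looks up r with binary search. The result is only reliable
// when glyphs is sorted by rune in ascending order.
func findGlyph(glyphs []glyph, r rune) (glyph, bool) {
	i := sort.Search(len(glyphs), func(i int) bool { return glyphs[i].r >= r })
	if i < len(glyphs) && glyphs[i].r == r {
		return glyphs[i], true
	}
	return glyph{}, false
}

func main() {
	// Sorted by code point: U+0041, U+3042, U+3402.
	sorted := []glyph{{'A', "latin A"}, {'あ', "hiragana a"}, {'㐂', "cjk ext-a"}}
	_, ok := findGlyph(sorted, 'あ')
	fmt.Println("sorted table, found あ:", ok)

	// Same entries in a different order: the search can miss an existing rune.
	unsorted := []glyph{{'㐂', "cjk ext-a"}, {'A', "latin A"}, {'あ', "hiragana a"}}
	_, ok = findGlyph(unsorted, 'A')
	fmt.Println("unsorted table, found A:", ok)
}
```

With duplicates in the table, a search of this kind returns whichever copy it happens to land on, so the table has to be both sorted and duplicate-free.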

Summary

This PR fixes the unstable font generation when using multiple font files.

  • Added a duplicate glyph check:
    • The first font that provides the glyph will be used.
    • Subsequent duplicates are skipped with a warning to stderr.
  • Sorted glyphs by rune codepoint to make the output deterministic.
  • Added font file information in comments of the generated code:
    • Helps debugging when multiple fonts are used.
    • Clarifies font origin for license management.

Why

Previously, glyphs could be duplicated or appear in non-deterministic order,
which caused incorrect behavior because const2bit.GetGlyph assumes sorted, unique runes.

For example, NotoSansJP-Regular.ttf and NotoSansKR-Regular.ttf are CJK fonts
that embed very large character sets. Each font includes not only Japanese or
Korean characters but also full sets of Latin, Greek, Cyrillic, and other
European-script glyphs. As a result, the two fonts overlap heavily
(e.g. punctuation, Latin letters).

When these fonts are combined, many runes appear multiple times and the generated
order becomes non-deterministic. This PR resolves these issues by enforcing
uniqueness and sorting.

sago35 (Member) commented Oct 16, 2025

I understand the changes. So that I can verify the fix, please provide the following information:

  • Where to obtain the file
  • The command to run

rdon-key (Author) commented Oct 24, 2025

Thanks for the review.
Here is a Linux repro to validate multi-font combination and duplicate handling (JP/KR).
It produces deterministic output and logs duplicates as expected.
Please let me know if you want me to add checksum verification or run this in CI.

Reproduction Steps for Multi-Font Combination Test (Linux)

Step 1) Download and Extract Fonts

mkdir -p ~/tinyfontgen-test && cd ~/tinyfontgen-test

wget https://sourceforge.net/projects/source-han-sans.mirror/files/2.004R/SourceHanSansKR.zip
wget https://sourceforge.net/projects/source-han-sans.mirror/files/2.004R/SourceHanSansJP.zip

unzip -j SourceHanSansJP.zip "SubsetOTF/JP/SourceHanSansJP-Regular.otf" -d .
unzip -j SourceHanSansKR.zip "SubsetOTF/KR/SourceHanSansKR-Regular.otf" -d .

ls -lh SourceHanSans*-Regular.otf

Step 2) Create the Test String File (string.txt)

cat > string.txt << 'EOF'
JP-only glyphs
U+3402 㐂
U+3405 㐅
U+3406 㐆
U+3427 㐧
U+342C 㐬
U+342E 㐮
U+3468 㑨
U+346A 㑪
U+3488 㒈
U+3492 㒒

KR-only glyphs
U+1100 ᄀ
U+1101 ᄁ
U+1102 ᄂ
U+1103 ᄃ
U+1104 ᄄ
U+1105 ᄅ
U+1106 ᄆ
U+1107 ᄇ
U+1108 ᄈ
U+1109 ᄉ

JP/KR overlap
U+3042 あ
U+3044 い
U+3046 う
U+3048 え
U+304A お
EOF

Step 3) Generate JP-Only Font

tinyfontgen-ttf \
  -output gen_jp.go \
  -fontname JPTest \
  -string-file string.txt \
  SourceHanSansJP-Regular.otf

# Expected results:
# - gen_jp.go is created.
# - No duplicate warnings (single font).
# - The generated code contains comments like:
#   // from: SourceHanSansJP-Regular.otf

Step 4) Generate Combined JP + KR Font

tinyfontgen-ttf \
  -output gen_mix.go \
  -fontname MixedTest \
  -string-file string.txt \
  SourceHanSansJP-Regular.otf \
  SourceHanSansKR-Regular.otf \
  2> dup.log

# Expected results:
# 1. gen_mix.go is created.
# 2. dup.log contains duplicate warnings for characters present in both the JP and KR fonts, e.g.:
#    [warn] duplicate rune U+3042 'あ' from SourceHanSansKR-Regular.otf (kept: SourceHanSansJP-Regular.otf)
# 3. gen_mix.go entries are sorted by Unicode codepoint.
#    Re-running the command should produce identical output (deterministic).
# 4. Each glyph definition includes a provenance comment such as:
#    // from: SourceHanSansJP-Regular.otf
#    // from: SourceHanSansKR-Regular.otf

Step 5) Expected Outcomes Summary

| Test case | Expected behavior |
| --- | --- |
| JP-only glyphs | Present in gen_mix.go, no warnings. |
| KR-only glyphs | Present in gen_mix.go, no warnings. |
| JP/KR overlap | Reported as duplicate rune in dup.log; JP glyphs kept. |
| Sorting | Output is Unicode-sorted and stable across runs. |
| Font comments | Each glyph shows a // from: comment indicating its source font. |
