Potential noise from copyright detection #4381

@chinyeungli

File: busybox-1.37.0/docs/unicode_full-bmp.txt

Part of the content:

ힰힱힲힳힴힵힶힷힸힹힺힻힼힽힾힿퟀퟁퟂퟃퟄퟅퟆ퟇퟈퟉퟊ퟋퟌퟍퟎퟏퟐퟑퟒퟓퟔퟕퟖퟗퟘퟙퟚퟛퟜퟝퟞퟟퟠퟡퟢퟣퟤퟥퟦퟧퟨퟩퟪퟫퟬퟭퟮퟯ
ퟰퟱퟲퟳퟴퟵퟶퟷퟸퟹퟺퟻ퟼퟽퟾퟿

High Surrogates (U+D800-U+DB7F):

í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í �í  í ¡í ¢í £í ¤í ¥í ¦í §í ¨í ©í ªí «í ¬í ­í ®í ¯í °í ±í ²í ³í ´í µí ¶í ·í ¸í ¹í ºí »í ¼í ½í ¾í ¿
í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡�í¡ í¡¡í¡¢í¡£í¡¤í¡¥í¡¦í¡§í¡¨í¡©í¡ªí¡«í¡¬í¡­í¡®í¡¯í¡°í¡±í¡²í¡³í¡´í¡µí¡¶í¡·í¡¸í¡¹í¡ºí¡»í¡¼í¡½í¡¾í¡¿
í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢�í¢ í¢¡í¢¢í¢£í¢¤í¢¥í¢¦í¢§í¢¨í¢©í¢ªí¢«í¢¬í¢­í¢®í¢¯í¢°í¢±í¢²í¢³í¢´í¢µí¢¶í¢·í¢¸í¢¹í¢ºí¢»í¢¼í¢½í¢¾í¢¿
í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£�í£ í£¡í£¢í££í£¤í£¥í£¦í£§í£¨í£©í£ªí£«í£¬í£­í£®í£¯í£°í£±í£²í£³í£´í£µí£¶í£·í£¸í£¹í£ºí£»í£¼í£½í£¾í£¿
í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤�í¤ í¤¡í¤¢í¤£í¤¤í¤¥í¤¦í¤§í¤¨í¤©í¤ªí¤«í¤¬í¤­í¤®í¤¯í¤°í¤±í¤²í¤³í¤´í¤µí¤¶í¤·í¤¸í¤¹í¤ºí¤»í¤¼í¤½í¤¾í¤¿
í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥�í¥ í¥¡í¥¢í¥£í¥¤í¥¥í¥¦í¥§í¥¨í¥©í¥ªí¥«í¥¬í¥­í¥®í¥¯í¥°í¥±í¥²í¥³í¥´í¥µí¥¶í¥·í¥¸í¥¹í¥ºí¥»í¥¼í¥½í¥¾í¥¿
í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦�í¦ í¦¡í¦¢í¦£í¦¤í¦¥í¦¦í¦§í¦¨í¦©í¦ªí¦«í¦¬í¦­í¦®í¦¯í¦°í¦±í¦²í¦³í¦´í¦µí¦¶í¦·í¦¸í¦¹í¦ºí¦»í¦¼í¦½í¦¾í¦¿
í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§�í§ í§¡í§¢í§£í§¤í§¥í§¦í§§í§¨í§©í§ªí§«í§¬í§­í§®í§¯í§°í§±í§²í§³í§´í§µí§¶í§·í§¸í§¹í§ºí§»í§¼í§½í§¾í§¿
í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨�í¨ í¨¡í¨¢í¨£í¨¤í¨¥í¨¦í¨§í¨¨í¨©í¨ªí¨«í¨¬í¨­í¨®í¨¯í¨°í¨±í¨²í¨³í¨´í¨µí¨¶í¨·í¨¸í¨¹í¨ºí¨»í¨¼í¨½í¨¾í¨¿
í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í© í©¡í©¢í©£í©¤í©¥í©¦í©§í©¨í©©í©ªí©«í©¬í©­í©®í©¯í©°í©±í©²í©³í©´í©µí©¶í©·í©¸í©¹í©ºí©»í©¼í©½í©¾í©¿
íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª�íª íª¡íª¢íª£íª¤íª¥íª¦íª§íª¨íª©íªªíª«íª¬íª­íª®íª¯íª°íª±íª²íª³íª´íªµíª¶íª·íª¸íª¹íªºíª»íª¼íª½íª¾íª¿
í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í«�í« í«¡í«¢í«£í«¤í«¥í«¦í«§í«¨í«©í«ªí««í«¬í«­í«®í«¯í«°í«±í«²í«³í«´í«µí«¶í«·í«¸í«¹í«ºí«»í«¼í«½í«¾í«¿
í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬�í¬ í¬¡í¬¢í¬£í¬¤í¬¥í¬¦í¬§í¬¨í¬©í¬ªí¬«í¬¬í¬­í¬®í¬¯í¬°í¬±í¬²í¬³í¬´í¬µí¬¶í¬·í¬¸í¬¹í¬ºí¬»í¬¼í¬½í¬¾í¬¿
í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­�í­ í­¡í­¢í­£í­¤í­¥í­¦í­§í­¨í­©í­ªí­«í­¬í­­í­®í­¯í­°í­±í­²í­³í­´í­µí­¶í­·í­¸í­¹í­ºí­»í­¼í­½í­¾í­¿

High Private Use Surrogates (U+DB80-U+DBFF):

The following line is detected as containing a copyright statement:

í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í©�í© í©¡í©¢í©£í©¤í©¥í©¦í©§í©¨í©©í©ªí©«í©¬í©­í©®í©¯í©°í©±í©²í©³í©´í©µí©¶í©·í©¸í©¹í©ºí©»í©¼í©½í©¾í©¿

The returned result is copyright: (c) $?i (c) Y
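One plausible explanation, offered as an assumption and not verified against the scanner's code: the flagged line holds the surrogates U+DA40-U+DA7F, whose UTF-8-style byte form is ED A9 80..BF. Read as Latin-1, the byte 0xA9 is the copyright sign, so every character on that line yields one. The sketch below reproduces this; the helper name surrogate_as_latin1 is hypothetical and only for illustration.

```python
def surrogate_as_latin1(code_point: int) -> str:
    """Encode a lone surrogate the way the sample file stores it, then
    misread the bytes as Latin-1 (one displayed character per byte)."""
    # "surrogatepass" lets Python emit the 3-byte form for a lone surrogate,
    # which strict UTF-8 would otherwise reject.
    raw = chr(code_point).encode("utf-8", errors="surrogatepass")
    return raw.decode("latin-1")


COPYRIGHT_SIGN = "\u00a9"  # byte 0xA9 in Latin-1

# Rebuild the flagged row: U+DA40 through U+DA7F (64 code points).
flagged_row = "".join(surrogate_as_latin1(cp) for cp in range(0xDA40, 0xDA80))

# For comparison, the first row of the block: U+D800 through U+D83F.
first_row = "".join(surrogate_as_latin1(cp) for cp in range(0xD800, 0xD840))

print(flagged_row.count(COPYRIGHT_SIGN))
print(first_row.count(COPYRIGHT_SIGN))
```

Under this assumption, the flagged row is dense with copyright signs while the neighbouring rows contain only an occasional one, which would be consistent with only this row being reported.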

I'm not sure whether this can be improved.

This is the sample file:

unicode_full-bmp.txt
