Commit b6daf5c

Merge pull request #218 from bzz/tokenizer-flex-cgo
New, optional flex-based tokenizer
2 parents ab3c26b + 7e136ba commit b6daf5c

9 files changed, +2707 -15 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -8,3 +8,4 @@ Makefile.main
 build/
 vendor/
 java/lib/
+.vscode/

internal/tokenizer/common.go

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+// Package tokenizer implements file tokenization used by the enry content
+// classifier. This package is an implementation detail of enry and should not
+// be imported by other packages.
+package tokenizer
+
+// ByteLimit defines the maximum prefix of an input text that will be tokenized.
+const ByteLimit = 100000
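
The doc comment on ByteLimit says only a prefix of the input is tokenized. As a minimal sketch of what such a cap could look like, assuming a caller that slices the input before tokenizing: the helper name `truncate` is hypothetical and does not appear in the diff shown here, and the real tokenizer may apply the limit differently.

```go
package main

import "fmt"

// ByteLimit mirrors the constant added in internal/tokenizer/common.go.
const ByteLimit = 100000

// truncate is a hypothetical helper, not taken from this commit.
// It returns at most the first ByteLimit bytes of content and leaves
// shorter inputs untouched. The result shares memory with the input.
func truncate(content []byte) []byte {
	if len(content) > ByteLimit {
		return content[:ByteLimit]
	}
	return content
}

func main() {
	small := []byte("package main")
	large := make([]byte, ByteLimit+500)

	// Print the lengths that would be handed to a tokenizer.
	fmt.Println(len(truncate(small)), len(truncate(large)))
}
```

Cutting at a fixed byte offset can split a multi-byte UTF-8 character, so a tokenizer fed this prefix should tolerate an incomplete final rune.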
