File programming language detector and toolbox to ignore binary or vendored files. *enry*, started as a port to _Go_ of the original [linguist](https://github.com/github/linguist) _Ruby_ library, that has an improved *2x performance*.
3
+
Programming language detector and toolbox to ignore binary or vendored files. *enry* started as a port to _Go_ of the original [linguist](https://github.com/github/linguist) _Ruby_ library and has an improved *2x performance*.
4
4
5
-
* [Installation](#installation)
6
-
* [Examples](#examples)
7
5
* [CLI](#cli)
8
-
* [Java bindings](#java-bindings)
9
-
* [Python bindings](#python-bindings)
6
+
* [Library](#library)
7
+
* [Go](#go)
8
+
* [Java bindings](#java-bindings)
9
+
* [Python bindings](#python-bindings)
10
10
* [Divergences from linguist](#divergences-from-linguist)
11
11
* [Benchmarks](#benchmarks)
12
12
* [Why Enry?](#why-enry)
13
13
* [Development](#development)
14
14
* [Sync with github/linguist upstream](#sync-with-githublinguist-upstream)
Note that enry's CLI **_doesn't need a git repository to work_**, which intentionally differs from linguist.
160
-
161
95
## Java bindings
162
96
97
+
Generated Java bindings using a C shared library and JNI are available under [`java`](https://github.com/src-d/enry/blob/master/java).
98
+
99
+
A library is published on Maven as [tech.sourced:enry-java](https://mvnrepository.com/artifact/tech.sourced/enry-java) for macOS and Linux platforms. Windows support is planned under [src-d/enry#150](https://github.com/src-d/enry/issues/150).
163
100
164
-
Generated Java bindings using a C shared library and JNI are available under [`java`](https://github.com/src-d/enry/blob/master/java) and published on Maven at [tech.sourced:enry-java](https://mvnrepository.com/artifact/tech.sourced/enry-java) for macOS and linux.
101
+
## Python bindings
165
102
103
+
Generated Python bindings using a C shared library and cffi are WIP under [src-d/enry#154](https://github.com/src-d/enry/issues/154).
166
104
167
-
## Python bindings
168
-
Generated Python bindings using a C shared library and cffi are not available yet and are WIP under [src-d/enry#154](https://github.com/src-d/enry/issues/154).
105
+
A library will be published on PyPI as [enry](https://pypi.org/project/enry/) for
106
+
macOS and Linux platforms. Windows support is planned under [src-d/enry#150](https://github.com/src-d/enry/issues/150).
169
107
170
108
Divergences from linguist
171
109
------------
@@ -199,26 +137,27 @@ In all the cases above that have an issue number - we plan to update enry to mat
199
137
Benchmarks
200
138
------------
201
139
202
-
Enry's language detection has been compared with Linguist's one. In order to do that, Linguist's project directory [*linguist/samples*](https://github.com/github/linguist/tree/master/samples) was used as a set of files to run benchmarks against.
140
+
Enry's language detection has been compared with Linguist's on [*linguist/samples*](https://github.com/github/linguist/tree/master/samples).
The histogram shows the number of files detected (y-axis) per time interval bucket (x-axis). As one can see, most of the files were detected faster by enry.
146
+
The histogram shows the _number of files_ (y-axis) per _time interval bucket_ (x-axis).
147
+
Most of the files were detected faster by enry.
209
148
210
-
We found few cases where enry turns slower than linguist due to
211
-
Go regexp engine being slower than Ruby's, based on [oniguruma](https://github.com/kkos/oniguruma) library, written in C.
149
+
There are several cases where enry is slower than linguist due to
150
+
Go's regexp engine being slower than Ruby's, which is based on the [oniguruma](https://github.com/kkos/oniguruma) library, written in C.
212
151
213
152
See [instructions](#misc) for running enry with oniguruma.
214
153
215
154
216
155
Why Enry?
217
156
------------
218
157
219
-
In the movie [My Fair Lady](https://en.wikipedia.org/wiki/My_Fair_Lady), [Professor Henry Higgins](http://www.imdb.com/character/ch0011719/?ref_=tt_cl_t2) is one of the main characters. Henry is a linguist and at the very beginning of the movie enjoys guessing the origin of people based on their accent.
158
+
In the movie [My Fair Lady](https://en.wikipedia.org/wiki/My_Fair_Lady), [Professor Henry Higgins](http://www.imdb.com/character/ch0011719/) is a linguist who, at the very beginning of the movie, enjoys guessing where people come from based on their accents.
220
159
221
-
"Enry Iggins" is how [Eliza Doolittle](http://www.imdb.com/character/ch0011720/?ref_=tt_cl_t1), [pronounces](https://www.youtube.com/watch?v=pwNKyTktDIE) the name of the Professor during the first half of the movie.
160
+
"Enry Iggins" is how [Eliza Doolittle](http://www.imdb.com/character/ch0011720/) [pronounces](https://www.youtube.com/watch?v=pwNKyTktDIE) the name of the Professor.
222
161
223
162
## Development
224
163
@@ -228,7 +167,7 @@ To build enry's CLI run:
228
167
229
168
this will generate a binary in the project's root directory called `enry`.
230
169
231
-
To run the tests:
170
+
To run the tests, use:
232
171
233
172
make test
234
173
@@ -267,6 +206,7 @@ Separating all the necessary "manual" code changes to a different PR that includ
267
206
## Misc
268
207
269
208
<details>
209
+
<summary>Running a benchmark & faster regexp engine</summary>