Commit 87d8e7f

✨ Initial commit
0 parents  commit 87d8e7f

11 files changed

+697
-0
lines changed

.gitignore

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
1+
.vscode
2+
.prettierrc
3+
4+
# Byte-compiled / optimized / DLL files
5+
__pycache__/
6+
*.py[cod]
7+
*$py.class
8+
9+
# C extensions
10+
*.so
11+
12+
# Distribution / packaging
13+
.Python
14+
build/
15+
develop-eggs/
16+
dist/
17+
downloads/
18+
eggs/
19+
.eggs/
20+
lib/
21+
lib64/
22+
parts/
23+
sdist/
24+
var/
25+
wheels/
26+
pip-wheel-metadata/
27+
share/python-wheels/
28+
*.egg-info/
29+
.installed.cfg
30+
*.egg
31+
MANIFEST
32+
33+
# PyInstaller
34+
# Usually these files are written by a python script from a template
35+
# before PyInstaller builds the exe, so as to inject date/other infos into it.
36+
*.manifest
37+
*.spec
38+
39+
# Installer logs
40+
pip-log.txt
41+
pip-delete-this-directory.txt
42+
43+
# Unit test / coverage reports
44+
htmlcov/
45+
.tox/
46+
.nox/
47+
.coverage
48+
.coverage.*
49+
.cache
50+
nosetests.xml
51+
coverage.xml
52+
*.cover
53+
*.py,cover
54+
.hypothesis/
55+
.pytest_cache/
56+
57+
# Translations
58+
*.mo
59+
*.pot
60+
61+
# Django stuff:
62+
*.log
63+
local_settings.py
64+
db.sqlite3
65+
db.sqlite3-journal
66+
67+
# Flask stuff:
68+
instance/
69+
.webassets-cache
70+
71+
# Scrapy stuff:
72+
.scrapy
73+
74+
# Sphinx documentation
75+
docs/_build/
76+
77+
# PyBuilder
78+
target/
79+
80+
# Jupyter Notebook
81+
.ipynb_checkpoints
82+
83+
# IPython
84+
profile_default/
85+
ipython_config.py
86+
87+
# pyenv
88+
.python-version
89+
90+
# pipenv
91+
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
92+
# However, in case of collaboration, if having platform-specific dependencies or dependencies
93+
# having no cross-platform support, pipenv may install dependencies that don't work, or not
94+
# install all needed dependencies.
95+
#Pipfile.lock
96+
97+
# celery beat schedule file
98+
celerybeat-schedule
99+
100+
# SageMath parsed files
101+
*.sage.py
102+
103+
# Environments
104+
.env
105+
.venv
106+
env/
107+
venv/
108+
ENV/
109+
env.bak/
110+
venv.bak/
111+
112+
# Spyder project settings
113+
.spyderproject
114+
.spyproject
115+
116+
# Rope project settings
117+
.ropeproject
118+
119+
# mkdocs documentation
120+
/site
121+
122+
# mypy
123+
.mypy_cache/
124+
.dmypy.json
125+
dmypy.json
126+
127+
# Pyre type checker
128+
.pyre/

LICENSE

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
MIT License
2+
3+
Copyright (c) 2020 ExplosionAI GmbH
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy
6+
of this software and associated documentation files (the "Software"), to deal
7+
in the Software without restriction, including without limitation the rights
8+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9+
copies of the Software, and to permit persons to whom the Software is
10+
furnished to do so, subject to the following conditions:
11+
12+
The above copyright notice and this permission notice shall be included in all
13+
copies or substantial portions of the Software.
14+
15+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21+
SOFTWARE.

README.md

Lines changed: 238 additions & 0 deletions
@@ -0,0 +1,238 @@
1+
<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a>
2+
3+
# spacy-streamlit: spaCy components for Streamlit
4+
5+
This package contains utilities for visualizing [spaCy](https://spacy.io) models
6+
and building interactive spaCy-powered apps with
7+
[Streamlit](https://streamlit.io). It includes various building blocks you can
8+
use in your own Streamlit app, like visualizers for **syntactic dependencies**,
9+
**named entities**, **text classification**, **semantic similarity** via word
10+
vectors, token attributes, and more.
11+
12+
[![Current Release Version](https://img.shields.io/github/release/explosion/spacy-streamlit.svg?style=flat-square&logo=github)](https://github.com/explosion/spacy-streamlit/releases)
13+
[![pypi Version](https://img.shields.io/pypi/v/spacy-streamlit.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/spacy-streamlit/)
14+
15+
<img width="50%" align="right" src="https://user-images.githubusercontent.com/13643239/85388081-f2da8700-b545-11ea-9bd4-e303d3c5763c.png">
16+
17+
## 🚀 Quickstart
18+
19+
You can install `spacy-streamlit` from pip:
20+
21+
```bash
22+
pip install spacy-streamlit
23+
```
24+
25+
The package includes **building blocks** that call into Streamlit and set up all
26+
the required elements for you. You can either use the individual components
27+
directly and combine them with other elements in your app, or call the
28+
`visualizer` function to embed the whole visualizer.
29+
30+
```python
31+
# streamlit_app.py
32+
import spacy_streamlit
33+
34+
models = ["en_core_web_sm", "en_core_web_md"]
35+
default_text = "Sundar Pichai is the CEO of Google."
36+
spacy_streamlit.visualizer(models, default_text)
37+
```
38+
39+
You can then run your app with `streamlit run streamlit_app.py`.
40+
41+
### 📦 Example: [`01_out-of-the-box.py`](examples/01_out-of-the-box.py)
42+
43+
Use the embedded visualizer with custom settings out-of-the-box.
44+
45+
```bash
46+
streamlit run https://raw.githubusercontent.com/explosion/spacy-streamlit/master/examples/01_out-of-the-box.py
47+
```
48+
49+
### 👑 Example: [`02_custom.py`](examples/02_custom.py)
50+
51+
Use individual components in your existing app.
52+
53+
```bash
54+
streamlit run https://raw.githubusercontent.com/explosion/spacy-streamlit/master/examples/02_custom.py
55+
```
56+
57+
## 🎛 API
58+
59+
### Visualizer components
60+
61+
These functions can be used in your Streamlit app. They call into `streamlit`
62+
under the hood and set up the required elements.
63+
64+
#### <kbd>function</kbd> `visualizer`
65+
66+
Embed the full visualizer with selected components.
67+
68+
```python
69+
import spacy_streamlit
70+
71+
models = ["en_core_web_sm", "/path/to/model"]
72+
default_text = "Sundar Pichai is the CEO of Google."
73+
visualizers = ["ner", "textcat"]
74+
spacy_streamlit.visualizer(models, default_text, visualizers)
75+
```
76+
77+
| Argument | Type | Description |
78+
| --------------------- | ------------------- | ---------------------------------------------------------------------------------------------------------------------- |
79+
| `models` | List[str] | Names of loadable spaCy models (paths or package names). The models become selectable via a dropdown. |
80+
| `default_text` | str | Default text to analyze on load. Defaults to `""`. |
81+
| `visualizers` | List[str] | Names of visualizers to show. Defaults to `["parser", "ner", "textcat", "similarity", "tokens"]`. |
82+
| `ner_labels` | Optional[List[str]] | NER labels to include. If not set, all labels present in the `"ner"` pipeline component will be used. |
83+
| `ner_attrs` | List[str] | Span attributes shown in table of named entities. See [`visualizer.py`](spacy_streamlit/visualizer.py) for defaults. |
84+
| `token_attrs` | List[str] | Token attributes to show in token visualizer. See [`visualizer.py`](spacy_streamlit/visualizer.py) for defaults. |
85+
| `similarity_texts` | Tuple[str, str] | The default texts to compare in the similarity visualizer. Defaults to `("apple", "orange")`. |
86+
| `show_json_doc` | bool | Show button to toggle JSON representation of the `Doc`. Defaults to `True`. |
87+
| `show_model_meta` | bool | Show button to toggle model `meta.json`. Defaults to `True`. |
88+
| `sidebar_title` | Optional[str] | Title shown in the sidebar. Defaults to `None`. |
89+
| `sidebar_description` | Optional[str] | Description shown in the sidebar. Accepts Markdown-formatted text. |
90+
| `show_logo` | bool | Show the spaCy logo in the sidebar. Defaults to `True`. |
91+
| `color` | Optional[str] | Experimental: Primary color to use for some of the main UI elements (`None` to disable hack). Defaults to `"#09A3D5"`. |
92+
93+
#### <kbd>function</kbd> `visualize_parser`
94+
95+
Visualize the dependency parse and part-of-speech tags using spaCy's
96+
[`displacy` visualizer](https://spacy.io/usage/visualizers).
97+
98+
```python
99+
import spacy
100+
from spacy_streamlit import visualize_parser
101+
102+
nlp = spacy.load("en_core_web_sm")
103+
doc = nlp("This is a text")
104+
visualize_parser(doc)
105+
```
106+
107+
| Argument | Type | Description |
108+
| --------------- | ------------- | -------------------------------------------- |
109+
| `doc` | `Doc` | The spaCy `Doc` object to visualize. |
110+
| _keyword-only_ | | |
111+
| `title` | Optional[str] | Title of the visualizer block. |
112+
| `sidebar_title` | Optional[str] | Title of the config settings in the sidebar. |
113+
114+
#### <kbd>function</kbd> `visualize_ner`
115+
116+
Visualize the named entities in a `Doc` using spaCy's
117+
[`displacy` visualizer](https://spacy.io/usage/visualizers).
118+
119+
```python
120+
import spacy
121+
from spacy_streamlit import visualize_ner
122+
123+
nlp = spacy.load("en_core_web_sm")
124+
doc = nlp("Sundar Pichai is the CEO of Google.")
125+
visualize_ner(doc, labels=nlp.get_pipe("ner").labels)
126+
```
127+
128+
| Argument | Type | Description |
129+
| --------------- | ------------- | ----------------------------------------------------------------------------- |
130+
| `doc` | `Doc` | The spaCy `Doc` object to visualize. |
131+
| _keyword-only_ | | |
132+
| `labels` | Sequence[str] | The labels to show in the labels dropdown. |
133+
| `attrs` | List[str] | The span attributes to show in entity table. |
134+
| `show_table` | bool | Whether to show a table of entities and their attributes. Defaults to `True`. |
135+
| `title` | Optional[str] | Title of the visualizer block. |
136+
| `sidebar_title` | Optional[str] | Title of the config settings in the sidebar. |
137+
138+
#### <kbd>function</kbd> `visualize_textcat`
139+
140+
Visualize text categories predicted by a trained text classifier.
141+
142+
```python
143+
import spacy
144+
from spacy_streamlit import visualize_textcat
145+
146+
nlp = spacy.load("./my_textcat_model")
147+
doc = nlp("This is a text about a topic")
148+
visualize_textcat(doc)
149+
```
150+
151+
| Argument | Type | Description |
152+
| -------------- | ------------- | ------------------------------------ |
153+
| `doc` | `Doc` | The spaCy `Doc` object to visualize. |
154+
| _keyword-only_ | | |
155+
| `title` | Optional[str] | Title of the visualizer block. |
156+
157+
#### <kbd>function</kbd> `visualize_similarity`
158+
159+
Visualize semantic similarity using the model's word vectors. Will show a
160+
warning if no vectors are present in the model.
161+
162+
```python
163+
import spacy
164+
from spacy_streamlit import visualize_similarity
165+
166+
nlp = spacy.load("en_core_web_lg")
167+
visualize_similarity(nlp, ("pizza", "fries"))
168+
```
169+
170+
| Argument | Type | Description |
171+
| --------------- | --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- |
172+
| `nlp` | `Language` | The loaded `nlp` object with vectors. |
173+
| `default_texts` | Tuple[str, str] | The default texts to compare on load. Defaults to `("apple", "orange")`. |
174+
| _keyword-only_ | | |
175+
| `threshold` | float | Threshold for what's considered "similar". If the similarity score is greater than the threshold, the result is shown as similar. Defaults to `0.5`. |
176+
| `title` | Optional[str] | Title of the visualizer block. |
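
The threshold rule in the table above can be illustrated without loading a model. The snippet below is a minimal sketch, not the package's implementation: `cosine_similarity` and `is_similar` are hypothetical helpers introduced here, and the vectors are made-up stand-ins for real word vectors. It assumes the score is a cosine similarity, which is spaCy's default similarity estimate.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # No usable vector: report 0.0 instead of dividing by zero.
        return 0.0
    return dot / (norm_a * norm_b)


def is_similar(score: float, threshold: float = 0.5) -> bool:
    """Apply the documented rule: similar only if score is greater than threshold."""
    return score > threshold


# Made-up 3-dimensional vectors standing in for real word vectors.
pizza = [0.9, 0.1, 0.3]
fries = [0.8, 0.2, 0.4]
score = cosine_similarity(pizza, fries)
print(is_similar(score))
```

Because the comparison is strict, a score exactly equal to the threshold would not count as similar in this sketch.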
177+
178+
#### <kbd>function</kbd> `visualize_tokens`
179+
180+
Visualize the tokens in a `Doc` and their attributes.
181+
182+
```python
183+
import spacy
184+
from spacy_streamlit import visualize_tokens
185+
186+
nlp = spacy.load("en_core_web_sm")
187+
doc = nlp("This is a text")
188+
visualize_tokens(doc, attrs=["text", "pos_", "dep_", "ent_type_"])
189+
```
190+
191+
| Argument | Type | Description |
192+
| -------------- | ------------- | -------------------------------------------------------------------------------------------------------- |
193+
| `doc` | `Doc` | The spaCy `Doc` object to visualize. |
194+
| _keyword-only_ | | |
195+
| `attrs` | List[str] | The names of token attributes to use. See [`visualizer.py`](spacy_streamlit/visualizer.py) for defaults. |
196+
| `title` | Optional[str] | Title of the visualizer block. |
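
To make the role of `attrs` concrete, here is a minimal sketch of how a list of attribute names could be turned into table rows. It is an illustration under stated assumptions rather than the code in `visualizer.py`: `FakeToken` is a hypothetical stand-in for a spaCy token, and `token_rows` is a helper invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Sequence


@dataclass
class FakeToken:
    """Hypothetical stand-in for a spaCy token; only the fields used here."""

    text: str
    pos_: str
    dep_: str


def token_rows(tokens: Iterable[Any], attrs: Sequence[str]) -> List[List[str]]:
    """Build one row per token, with the requested attributes in the given order."""
    return [[str(getattr(token, attr)) for attr in attrs] for token in tokens]


tokens = [FakeToken("This", "PRON", "nsubj"), FakeToken("is", "AUX", "ROOT")]
rows = token_rows(tokens, ["text", "pos_", "dep_"])
print(rows)
```

Looking attributes up by name with `getattr` means the column set is driven entirely by the `attrs` list, so adding or removing a column needs no other change.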
197+
198+
### Cached helpers
199+
200+
These helpers attempt to cache loaded models and created `Doc` objects.
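
The idea behind caching a loaded model can be sketched with the standard library. This example swaps in `functools.lru_cache` purely for illustration; how the package itself caches is not shown in this README. `load_model_cached` and `load_calls` are hypothetical names, and the returned dictionary is a placeholder for a loaded `nlp` object.

```python
from functools import lru_cache

load_calls = []  # records every real (uncached) load, for demonstration


@lru_cache(maxsize=None)
def load_model_cached(name: str) -> dict:
    """Hypothetical loader: pretend to load a model and memoize it by name."""
    load_calls.append(name)
    return {"name": name}  # placeholder for a loaded `nlp` object


first = load_model_cached("en_core_web_sm")
second = load_model_cached("en_core_web_sm")  # served from the cache
print(first is second, load_calls)
```

Repeated calls with the same name return the same object, so an expensive load runs only once per model name.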
201+
202+
#### <kbd>function</kbd> `process_text`
203+
204+
Process a text with a model of a given name and create a `Doc` object. Calls
205+
into the `load_model` helper to load the model.
206+
207+
```python
208+
import streamlit as st
209+
from spacy_streamlit import process_text
210+
211+
spacy_model = st.sidebar.selectbox("Model name", ["en_core_web_sm", "en_core_web_md"])
212+
text = st.text_area("Text to analyze", "This is a text")
213+
doc = process_text(spacy_model, text)
214+
```
215+
216+
| Argument | Type | Description |
217+
| ------------ | ----- | ------------------------------------------------------- |
218+
| `model_name` | str | Loadable spaCy model name. Can be path or package name. |
219+
| `text` | str | The text to process. |
220+
| **RETURNS** | `Doc` | The processed document. |
221+
222+
#### <kbd>function</kbd> `load_model`
223+
224+
Load a spaCy model from a path or installed package and return a loaded `nlp`
225+
object.
226+
227+
```python
228+
import streamlit as st
229+
from spacy_streamlit import load_model
230+
231+
spacy_model = st.sidebar.selectbox("Model name", ["en_core_web_sm", "en_core_web_md"])
232+
nlp = load_model(spacy_model)
233+
```
234+
235+
| Argument | Type | Description |
236+
| ----------- | ---------- | -------------------------------------------------------- |
237+
| `name` | str | Loadable spaCy model name. Can be path or package name. |
238+
| **RETURNS** | `Language` | The loaded `nlp` object. |
