| 1 | +<a href="https://explosion.ai"><img src="https://explosion.ai/assets/img/logo.svg" width="125" height="125" align="right" /></a> |
| 2 | + |
| 3 | +# spacy-streamlit: spaCy components for Streamlit |
| 4 | + |
| 5 | +This package contains utilities for visualizing [spaCy](https://spacy.io) models |
| 6 | +and building interactive spaCy-powered apps with |
| 7 | +[Streamlit](https://streamlit.io). It includes various building blocks you can |
| 8 | +use in your own Streamlit app, like visualizers for **syntactic dependencies**, |
| 9 | +**named entities**, **text classification**, **semantic similarity** via word |
| 10 | +vectors, token attributes, and more. |
| 11 | + |
| 12 | +[GitHub releases](https://github.com/explosion/spacy-streamlit/releases) |
| 13 | +[PyPI package](https://pypi.org/project/spacy-streamlit/) |
| 14 | + |
| 15 | +<img width="50%" align="right" src="https://user-images.githubusercontent.com/13643239/85388081-f2da8700-b545-11ea-9bd4-e303d3c5763c.png"> |
| 16 | + |
| 17 | +## 🚀 Quickstart |
| 18 | + |
| 19 | +You can install `spacy-streamlit` with pip: |
| 20 | + |
| 21 | +```bash |
| 22 | +pip install spacy-streamlit |
| 23 | +``` |
| 24 | + |
| 25 | +The package includes **building blocks** that call into Streamlit and set up all |
| 26 | +the required elements for you. You can either use the individual components |
| 27 | +directly and combine them with other elements in your app, or call the |
| 28 | +`visualizer` function to embed the whole visualizer. |
| 29 | + |
| 30 | +```python |
| 31 | +# streamlit_app.py |
| 32 | +import spacy_streamlit |
| 33 | + |
| 34 | +models = ["en_core_web_sm", "en_core_web_md"] |
| 35 | +default_text = "Sundar Pichai is the CEO of Google." |
| 36 | +spacy_streamlit.visualizer(models, default_text) |
| 37 | +``` |
| 38 | + |
| 39 | +You can then run your app with `streamlit run streamlit_app.py`. |
| 40 | + |
| 41 | +### 📦 Example: [`01_out-of-the-box.py`](examples/01_out-of-the-box.py) |
| 42 | + |
| 43 | +Use the embedded visualizer with custom settings out-of-the-box. |
| 44 | + |
| 45 | +```bash |
| 46 | +streamlit run https://raw.githubusercontent.com/explosion/spacy-streamlit/master/examples/01_out-of-the-box.py |
| 47 | +``` |
| 48 | + |
| 49 | +### 👑 Example: [`02_custom.py`](examples/02_custom.py) |
| 50 | + |
| 51 | +Use individual components in your existing app. |
| 52 | + |
| 53 | +```bash |
| 54 | +streamlit run https://raw.githubusercontent.com/explosion/spacy-streamlit/master/examples/02_custom.py |
| 55 | +``` |
| 56 | + |
| 57 | +## 🎛 API |
| 58 | + |
| 59 | +### Visualizer components |
| 60 | + |
| 61 | +These functions can be used in your Streamlit app. They call into `streamlit` |
| 62 | +under the hood and set up the required elements. |
| 63 | + |
| 64 | +#### <kbd>function</kbd> `visualizer` |
| 65 | + |
| 66 | +Embed the full visualizer with selected components. |
| 67 | + |
| 68 | +```python |
| 69 | +import spacy_streamlit |
| 70 | + |
| 71 | +models = ["en_core_web_sm", "/path/to/model"] |
| 72 | +default_text = "Sundar Pichai is the CEO of Google." |
| 73 | +visualizers = ["ner", "textcat"] |
| 74 | +spacy_streamlit.visualizer(models, default_text, visualizers) |
| 75 | +``` |
| 76 | + |
| 77 | +| Argument | Type | Description | |
| 78 | +| --------------------- | ------------------- | ---------------------------------------------------------------------------------------------------------------------- | |
| 79 | +| `models` | List[str] | Names of loadable spaCy models (paths or package names). The models become selectable via a dropdown. | |
| 80 | +| `default_text` | str | Default text to analyze on load. Defaults to `""`. | |
| 81 | +| `visualizers` | List[str] | Names of visualizers to show. Defaults to `["parser", "ner", "textcat", "similarity", "tokens"]`. | |
| 82 | +| `ner_labels` | Optional[List[str]] | NER labels to include. If not set, all labels present in the `"ner"` pipeline component will be used. | |
| 83 | +| `ner_attrs` | List[str] | Span attributes shown in table of named entities. See [`visualizer.py`](spacy_streamlit/visualizer.py) for defaults. | |
| 84 | +| `token_attrs` | List[str] | Token attributes to show in token visualizer. See [`visualizer.py`](spacy_streamlit/visualizer.py) for defaults. | |
| 85 | +| `similarity_texts` | Tuple[str, str] | The default texts to compare in the similarity visualizer. Defaults to `("apple", "orange")`. | |
| 86 | +| `show_json_doc` | bool | Show button to toggle JSON representation of the `Doc`. Defaults to `True`. | |
| 87 | +| `show_model_meta` | bool | Show button to toggle model `meta.json`. Defaults to `True`. | |
| 88 | +| `sidebar_title` | Optional[str] | Title shown in the sidebar. Defaults to `None`. | |
| 89 | +| `sidebar_description` | Optional[str] | Description shown in the sidebar. Accepts Markdown-formatted text. | |
| 90 | +| `show_logo` | bool | Show the spaCy logo in the sidebar. Defaults to `True`. | |
| 91 | +| `color` | Optional[str] | Experimental: Primary color to use for some of the main UI elements (`None` to disable hack). Defaults to `"#09A3D5"`. | |
| 92 | + |
| 93 | +#### <kbd>function</kbd> `visualize_parser` |
| 94 | + |
| 95 | +Visualize the dependency parse and part-of-speech tags using spaCy's |
| 96 | +[`displacy` visualizer](https://spacy.io/usage/visualizers). |
| 97 | + |
| 98 | +```python |
| 99 | +import spacy |
| 100 | +from spacy_streamlit import visualize_parser |
| 101 | + |
| 102 | +nlp = spacy.load("en_core_web_sm") |
| 103 | +doc = nlp("This is a text") |
| 104 | +visualize_parser(doc) |
| 105 | +``` |
| 106 | + |
| 107 | +| Argument | Type | Description | |
| 108 | +| --------------- | ------------- | -------------------------------------------- | |
| 109 | +| `doc` | `Doc` | The spaCy `Doc` object to visualize. | |
| 110 | +| _keyword-only_ | | | |
| 111 | +| `title` | Optional[str] | Title of the visualizer block. | |
| 112 | +| `sidebar_title` | Optional[str] | Title of the config settings in the sidebar. | |
| 113 | + |
| 114 | +#### <kbd>function</kbd> `visualize_ner` |
| 115 | + |
| 116 | +Visualize the named entities in a `Doc` using spaCy's |
| 117 | +[`displacy` visualizer](https://spacy.io/usage/visualizers). |
| 118 | + |
| 119 | +```python |
| 120 | +import spacy |
| 121 | +from spacy_streamlit import visualize_ner |
| 122 | + |
| 123 | +nlp = spacy.load("en_core_web_sm") |
| 124 | +doc = nlp("Sundar Pichai is the CEO of Google.") |
| 125 | +visualize_ner(doc, labels=nlp.get_pipe("ner").labels) |
| 126 | +``` |
| 127 | + |
| 128 | +| Argument | Type | Description | |
| 129 | +| --------------- | ------------- | ----------------------------------------------------------------------------- | |
| 130 | +| `doc` | `Doc` | The spaCy `Doc` object to visualize. | |
| 131 | +| _keyword-only_ | | | |
| 132 | +| `labels` | Sequence[str] | The labels to show in the labels dropdown. | |
| 133 | +| `attrs` | List[str] | The span attributes to show in entity table. | |
| 134 | +| `show_table` | bool | Whether to show a table of entities and their attributes. Defaults to `True`. | |
| 135 | +| `title` | Optional[str] | Title of the visualizer block. | |
| 136 | +| `sidebar_title` | Optional[str] | Title of the config settings in the sidebar. | |
| 137 | + |
| 138 | +#### <kbd>function</kbd> `visualize_textcat` |
| 139 | + |
| 140 | +Visualize text categories predicted by a trained text classifier. |
| 141 | + |
| 142 | +```python |
| 143 | +import spacy |
| 144 | +from spacy_streamlit import visualize_textcat |
| 145 | + |
| 146 | +nlp = spacy.load("./my_textcat_model") |
| 147 | +doc = nlp("This is a text about a topic") |
| 148 | +visualize_textcat(doc) |
| 149 | +``` |
| 150 | + |
| 151 | +| Argument | Type | Description | |
| 152 | +| -------------- | ------------- | ------------------------------------ | |
| 153 | +| `doc` | `Doc` | The spaCy `Doc` object to visualize. | |
| 154 | +| _keyword-only_ | | | |
| 155 | +| `title` | Optional[str] | Title of the visualizer block. | |
| 156 | + |
| 157 | +#### <kbd>function</kbd> `visualize_similarity` |
| 158 | + |
| 159 | +Visualize semantic similarity using the model's word vectors. Shows a |
| 160 | +warning if the model contains no vectors. |
| 161 | + |
| 162 | +```python |
| 163 | +import spacy |
| 164 | +from spacy_streamlit import visualize_similarity |
| 165 | + |
| 166 | +nlp = spacy.load("en_core_web_lg") |
| 167 | +visualize_similarity(nlp, ("pizza", "fries")) |
| 168 | +``` |
| 169 | + |
| 170 | +| Argument | Type | Description | |
| 171 | +| --------------- | --------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------- | |
| 172 | +| `nlp` | `Language` | The loaded `nlp` object with vectors. | |
| 173 | +| `default_texts` | Tuple[str, str] | The default texts to compare on load. Defaults to `("apple", "orange")`. | |
| 174 | +| _keyword-only_ | | | |
| 175 | +| `threshold` | float | Threshold for what's considered "similar". If the similarity score is greater than the threshold, the result is shown as similar. Defaults to `0.5`. | |
| 176 | +| `title` | Optional[str] | Title of the visualizer block. | |
| 177 | + |
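The `threshold` rule described in the table above can be sketched in plain Python. This is only an illustration, assuming the score is a cosine similarity between two vectors; `cosine_similarity` and `is_similar` are hypothetical helper names and are not part of `spacy-streamlit`, and the toy vectors stand in for real word vectors.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_similar(score: float, threshold: float = 0.5) -> bool:
    """A score strictly greater than the threshold counts as similar."""
    return score > threshold


# Toy vectors (assumption: real vectors would come from the loaded model).
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(is_similar(same_direction), is_similar(orthogonal))
```

Because the comparison is strict, a score exactly equal to the threshold is not reported as similar in this sketch.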
| 178 | +#### <kbd>function</kbd> `visualize_tokens` |
| 179 | + |
| 180 | +Visualize the tokens in a `Doc` and their attributes. |
| 181 | + |
| 182 | +```python |
| 183 | +import spacy |
| 184 | +from spacy_streamlit import visualize_tokens |
| 185 | + |
| 186 | +nlp = spacy.load("en_core_web_sm") |
| 187 | +doc = nlp("This is a text") |
| 188 | +visualize_tokens(doc, attrs=["text", "pos_", "dep_", "ent_type_"]) |
| 189 | +``` |
| 190 | + |
| 191 | +| Argument | Type | Description | |
| 192 | +| -------------- | ------------- | -------------------------------------------------------------------------------------------------------- | |
| 193 | +| `doc` | `Doc` | The spaCy `Doc` object to visualize. | |
| 194 | +| _keyword-only_ | | | |
| 195 | +| `attrs` | List[str] | The names of token attributes to use. See [`visualizer.py`](spacy_streamlit/visualizer.py) for defaults. | |
| 196 | +| `title` | Optional[str] | Title of the visualizer block. | |
| 197 | + |
| 198 | +### Cached helpers |
| 199 | + |
| 200 | +These helpers attempt to cache loaded models and created `Doc` objects. |
| 201 | + |
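The idea behind the cached helpers can be sketched as follows: repeated calls with the same arguments reuse an earlier result instead of reloading the model or reprocessing the text. This sketch uses `functools.lru_cache` as a stand-in, which is an assumption; the package's actual caching mechanism is not described in this section. `load_model_cached`, `process_text_cached` and the dictionary "model" are hypothetical placeholders.

```python
from functools import lru_cache

# Records every real (uncached) load, for illustration only.
load_calls = []


@lru_cache(maxsize=None)
def load_model_cached(name: str) -> dict:
    """Hypothetical stand-in for an expensive model load."""
    load_calls.append(name)
    return {"name": name}


@lru_cache(maxsize=None)
def process_text_cached(model_name: str, text: str) -> tuple:
    """Hypothetical stand-in for creating a Doc: returns the model name and tokens."""
    nlp = load_model_cached(model_name)
    return (nlp["name"], tuple(text.split()))


first = process_text_cached("en_core_web_sm", "This is a text")
second = process_text_cached("en_core_web_sm", "This is a text")  # served from cache
other = process_text_cached("en_core_web_sm", "Another text")  # new text, same model

print(first is second, len(load_calls))
```

Keying the cache on the model name and the text means a new text triggers new processing, while the model itself is loaded only once.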
| 202 | +#### <kbd>function</kbd> `process_text` |
| 203 | + |
| 204 | +Process a text with a model of a given name and create a `Doc` object. Calls |
| 205 | +into the `load_model` helper to load the model. |
| 206 | + |
| 207 | +```python |
| 208 | +import streamlit as st |
| 209 | +from spacy_streamlit import process_text |
| 210 | + |
| 211 | +spacy_model = st.sidebar.selectbox("Model name", ["en_core_web_sm", "en_core_web_md"]) |
| 212 | +text = st.text_area("Text to analyze", "This is a text") |
| 213 | +doc = process_text(spacy_model, text) |
| 214 | +``` |
| 215 | + |
| 216 | +| Argument | Type | Description | |
| 217 | +| ------------ | ----- | ------------------------------------------------------- | |
| 218 | +| `model_name` | str | Loadable spaCy model name. Can be path or package name. | |
| 219 | +| `text` | str | The text to process. | |
| 220 | +| **RETURNS** | `Doc` | The processed document. | |
| 221 | + |
| 222 | +#### <kbd>function</kbd> `load_model` |
| 223 | + |
| 224 | +Load a spaCy model from a path or installed package and return a loaded `nlp` |
| 225 | +object. |
| 226 | + |
| 227 | +```python |
| 228 | +import streamlit as st |
| 229 | +from spacy_streamlit import load_model |
| 230 | + |
| 231 | +spacy_model = st.sidebar.selectbox("Model name", ["en_core_web_sm", "en_core_web_md"]) |
| 232 | +nlp = load_model(spacy_model) |
| 233 | +``` |
| 234 | + |
| 235 | +| Argument | Type | Description | |
| 236 | +| ----------- | ---------- | -------------------------------------------------------- | |
| 237 | +| `name` | str | Loadable spaCy model name. Can be path or package name. | |
| 238 | +| **RETURNS** | `Language` | The loaded `nlp` object. | |