<p align="center">

<img src="https://github.com/Layout-Parser/layout-parser/raw/master/.github/layout-parser.png" alt="Layout Parser Logo" width="35%">

<h2 align="center">
A unified toolkit for Deep Learning Based Document Image Analysis
</h2>
</p>

<p align=center>
<a href="https://pypi.org/project/layoutparser/"><img src="https://img.shields.io/pypi/v/layoutparser?color=%23099cec&label=PyPI%20package&logo=pypi&logoColor=white" title="The current version of Layout Parser"></a>
<a href="https://github.com/Layout-Parser/layout-parser/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/layoutparser" title="Layout Parser uses Apache 2 License"></a>
<img alt="PyPI - Downloads" src="https://img.shields.io/pypi/dm/layoutparser">
</p>

<p align=center>
<a href="https://arxiv.org/abs/2103.15348"><img src="https://img.shields.io/badge/paper-2103.15348-b31b1b.svg" title="Layout Parser Paper"></a>
<a href="https://layout-parser.github.io"><img src="https://img.shields.io/badge/website-layout--parser.github.io-informational.svg" title="Layout Parser Website"></a>
<a href="https://layout-parser.readthedocs.io/en/latest/"><img src="https://img.shields.io/badge/doc-layout--parser.readthedocs.io-light.svg" title="Layout Parser Documentation"></a>
</p>

---

## What is LayoutParser


LayoutParser provides a wide range of tools that streamline Document Image Analysis (DIA) tasks. Please check the LayoutParser [demo video](https://youtu.be/8yA5xB4Dg8c) (1 min) or [full talk](https://www.youtube.com/watch?v=YG0qepPgyGY) (15 min) for details. Here are some key features:

- LayoutParser provides a rich repository of deep learning models for layout detection as well as a set of unified APIs for using them. For example,

  <details>
  <summary>Perform DL layout detection in 4 lines of code</summary>

  ```python
  import layoutparser as lp
  model = lp.AutoLayoutModel('lp://EfficientDet/PubLayNet')
  # image = Image.open("path/to/image")
  layout = model.detect(image)
  ```

  </details>

- LayoutParser comes with a set of layout data structures with carefully designed APIs that are optimized for document image analysis tasks. For example,

  <details>
  <summary>Selecting layout/textual elements in the left column of a page</summary>

  ```python
  image_width = image.size[0]
  left_column = lp.Interval(0, image_width/2, axis='x')
  layout.filter_by(left_column, center=True) # select objects in the left column
  ```

  </details>

  <details>
  <summary>Performing OCR for each detected Layout Region</summary>

  ```python
  ocr_agent = lp.TesseractAgent()
  for layout_region in layout:
      image_segment = layout_region.crop(image)
      text = ocr_agent.detect(image_segment)
  ```

  </details>

  <details>
  <summary>Flexible APIs for visualizing the detected layouts</summary>

  ```python
  lp.draw_box(image, layout, box_width=1, show_element_id=True, box_alpha=0.25)
  ```

  </details>

  <details>
  <summary>Loading layout data stored in json, csv, and even PDFs</summary>

  ```python
  layout = lp.load_json("path/to/json")
  layout = lp.load_csv("path/to/csv")
  pdf_layout = lp.load_pdf("path/to/pdf")
  ```

  </details>

- LayoutParser is also an open platform that enables the sharing of layout detection models and DIA pipelines among the community.

  <details>
  <summary><a href="https://layout-parser.github.io/platform/">Check</a> the LayoutParser open platform</summary>
  </details>

  <details>
  <summary><a href="https://github.com/Layout-Parser/platform">Submit</a> your models/pipelines to LayoutParser</summary>
  </details>
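
To make the interval-filtering idea above concrete, here is a standalone plain-Python sketch of what selecting elements whose *center* falls inside an x-axis interval amounts to. The `Box` class and `filter_by_interval` helper are illustrative stand-ins, not layoutparser APIs:

```python
# Standalone sketch of center-based interval filtering, mirroring the effect of
# layout.filter_by(lp.Interval(start, end, axis='x'), center=True).
# `Box` and `filter_by_interval` are illustrative names, not layoutparser APIs.
from dataclasses import dataclass

@dataclass
class Box:
    x_1: float
    y_1: float
    x_2: float
    y_2: float

    @property
    def x_center(self):
        # Horizontal midpoint of the box
        return (self.x_1 + self.x_2) / 2

def filter_by_interval(boxes, start, end):
    # Keep only boxes whose horizontal center lies inside [start, end]
    return [b for b in boxes if start <= b.x_center <= end]

# A box in the left half of a 500px-wide page, and one in the right half
boxes = [Box(10, 0, 90, 20), Box(400, 0, 600, 20)]
left = filter_by_interval(boxes, 0, 250)  # selects only the first box
```
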

## Installation


After several major updates, layoutparser provides various functionalities and deep learning models from different backends. It is still easy to install, though: the installation is designed so that you can choose to install only the dependencies your project needs:

```bash
pip install layoutparser                  # Install the base layoutparser library
pip install "layoutparser[layoutmodels]"  # Install DL layout model toolkit
pip install "layoutparser[ocr]"           # Install OCR toolkit
```

Please check [installation.md](installation.md) for additional details on layoutparser installation.
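
Because each extra is optional, it can be handy to check at runtime which optional backends are actually importable before using them. A minimal sketch; the helper name is hypothetical, not part of layoutparser, and the module-to-extra mapping shown in the comment is an assumption:

```python
import importlib.util

def backend_available(module_name: str) -> bool:
    # True if the optional dependency can be imported in this environment,
    # without actually importing it
    return importlib.util.find_spec(module_name) is not None

# For example, "pytesseract" backs the [ocr] extra and
# "torch" backs the [layoutmodels] extra (assumed mapping)
if backend_available("pytesseract"):
    pass  # safe to construct an OCR agent here
```
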

## Examples


We provide a series of examples to help you get started with the `layoutparser` library:

1. [Table OCR and Results Parsing](https://github.com/Layout-Parser/layout-parser/blob/master/examples/OCR%20Tables%20and%20Parse%20the%20Output.ipynb): `layoutparser` can be used to conveniently OCR documents and convert the output into structured data.

2. [Deep Layout Parsing Example](https://github.com/Layout-Parser/layout-parser/blob/master/examples/Deep%20Layout%20Parsing.ipynb): With the help of deep learning, `layoutparser` supports the analysis of very complex documents and the processing of hierarchical structure in their layouts.
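
The notebooks walk through complete pipelines; as a taste of the "results parsing" step, here is a standalone plain-Python sketch (helper name and tolerance are illustrative, not layoutparser APIs) that orders OCR'd regions into reading order, top-to-bottom and left-to-right:

```python
# Standalone sketch: ordering OCR'd text regions into reading order.
# `regions` stands in for (x, y, text) triples produced by an OCR pass;
# `reading_order` and `row_tolerance` are illustrative, not layoutparser APIs.

def reading_order(regions, row_tolerance=10):
    # Group regions whose top edges lie within `row_tolerance` pixels into one
    # row, then emit rows top-to-bottom with cells sorted left-to-right.
    rows = []
    for (x, y, text) in sorted(regions, key=lambda r: r[1]):
        if rows and abs(rows[-1][0] - y) <= row_tolerance:
            rows[-1][1].append((x, text))   # same row: add another cell
        else:
            rows.append((y, [(x, text)]))   # new row starts here
    return [[t for _, t in sorted(cells)] for _, cells in rows]

regions = [(200, 12, "B1"), (20, 10, "A1"), (20, 52, "A2"), (200, 50, "B2")]
table = reading_order(regions)  # [["A1", "B1"], ["A2", "B2"]]
```
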
## Contributing

We encourage you to contribute to Layout Parser! Please check out the [Contributing guidelines](.github/CONTRIBUTING.md) to learn how to proceed. Join us!