62 | 62 | "source": [
63 | 63 | "## Installation\n",
64 | 64 | "\n",
65 |    | - "First, let's install the neccessary packages:\n",
   | 65 | + "First, let's install the necessary packages:\n",
66 | 66 | "\n",
67 | 67 | "- [fastdup](https://github.com/visual-layer/fastdup) - To analyze issues in the dataset.\n",
68 | 68 | "- [Recognize Anything](https://github.com/xinyu1205/recognize-anything) - To use the RAM and Tag2Text models.\n",
118 | 118 | "metadata": {},
119 | 119 | "source": [
120 | 120 | "## Download Dataset\n",
121 |     | - "Download the [coco-minitrain](https://github.com/giddyyupp/coco-minitrain) dataset - a curated mini training set consisting of 20% of COCO 2017 training dataset. The coco-minitrain consists of 25,000 images and annoatations."
    | 121 | + "Download the [coco-minitrain](https://github.com/giddyyupp/coco-minitrain) dataset - a curated mini training set containing 20% of the COCO 2017 training dataset: 25,000 images and their annotations."
122 | 122 | ]
123 | 123 | },
124 | 124 | {
271 | 271 | "source": [
272 | 272 | "## Zero-Shot Classification with RAM and Tag2Text\n",
273 | 273 | "\n",
274 |     | - "Within fastdup you can readily use the zero-shot classifier models such as [Recognize Anything Model (RAM)](https://github.com/xinyu1205/recognize-anything) and [Tag2Text](https://github.com/xinyu1205/recognize-anything). Both Tag2Text and RAM exihibit strong recognition ability.\n",
    | 274 | + "Within fastdup you can readily use zero-shot classifier models such as [Recognize Anything Model (RAM)](https://github.com/xinyu1205/recognize-anything) and [Tag2Text](https://github.com/xinyu1205/recognize-anything). Both Tag2Text and RAM exhibit strong recognition ability.\n",
275 | 275 | "\n",
276 | 276 | "+ RAM is an image tagging model that can recognize any common category with high accuracy, outperforming CLIP and BLIP.\n",
277 | 277 | "+ Tag2Text is a vision-language model guided by tagging, which supports captioning, retrieval, and tagging."
1182 | 1182 | "id": "59c8e8d0-1c00-403b-84d9-226458b9268a",
1183 | 1183 | "metadata": {},
1184 | 1184 | "source": [
1185 |      | - "Once, done you'll notice that 3 new columns are appened into the DataFrame namely - `grounding_dino_bboxes`, `grounding_dino_scores`, and `grounding_dino_labels`. "
     | 1185 | + "Once done, you'll notice that three new columns are appended to the DataFrame, namely `grounding_dino_bboxes`, `grounding_dino_scores`, and `grounding_dino_labels`."
1186 | 1186 | ]
1187 | 1187 | },
1188 | 1188 | {
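As a sanity check on the shape of that output, the enriched DataFrame can be inspected with plain pandas. This is a minimal sketch with mock data: the filenames, boxes, scores, and labels below are illustrative placeholders, not actual Grounding DINO output — only the three column names come from the notebook.

```python
import pandas as pd

# Mock of the enriched DataFrame: the three grounding_dino_* columns each
# hold one list per row (image). All values here are placeholders, not
# real model output.
df = pd.DataFrame({
    "filename": ["image_0001.jpg", "image_0002.jpg"],
    "grounding_dino_bboxes": [[[10, 20, 110, 220]], []],
    "grounding_dino_scores": [[0.87], []],
    "grounding_dino_labels": [["person"], []],
})

# Inspect the per-image detections.
print(df.loc[0, "grounding_dino_labels"])  # → ['person']
```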
|
1897 | 1897 | "id": "7a979b19-eaef-422b-944b-0285115e24d6",
|
1898 | 1898 | "metadata": {},
|
1899 | 1899 | "source": [
|
1900 |
| - "Not all images contain \"face\", \"eye\" and \"hair\", let's remove the columns with no detections and plot the colums with detections." |
| 1900 | + "Not all images contain \"face\", \"eye\" and \"hair\", let's remove the columns with no detections and plot the column with detections." |
1901 | 1901 | ]
|
1902 | 1902 | },
|
1903 | 1903 | {
|
|
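The filtering step itself is plain pandas. A minimal sketch, assuming the enriched DataFrame carries the list-valued `grounding_dino_labels` column described earlier — the filenames and detections here are mocked for illustration:

```python
import pandas as pd

# Mock enriched DataFrame; values are illustrative, not real detections.
df = pd.DataFrame({
    "filename": ["a.jpg", "b.jpg", "c.jpg"],
    "grounding_dino_labels": [["face", "hair"], [], ["eye"]],
})

# Keep only rows (images) where at least one prompt was detected.
has_detections = df["grounding_dino_labels"].map(len) > 0
df_detected = df[has_detections].reset_index(drop=True)
print(df_detected["filename"].tolist())  # → ['a.jpg', 'c.jpg']
```

The same boolean mask can be inverted (`~has_detections`) to review the images where none of the prompts matched.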