README.md: 148 additions & 2 deletions
@@ -15,6 +15,7 @@ A Next.js application that uses advanced AI technology to analyze, interpret, an
 
 ### 📄 **Professional Document Analysis**
 - Advanced AI algorithms analyze documents and extract key information
+- **OCR Processing**: Optional advanced OCR using Datalab Marker API for scanned documents and images
 - **AI-Powered Chat**: Interactive chat interface for document-specific questions and insights
 - **Role-Based Authentication**: Separate interfaces for employees and employers using Clerk
 - **Document Management**: Upload, organize, and manage documents with category support
@@ -51,6 +52,117 @@ The system provides comprehensive analysis including:
 
 ## 📖 Usage Examples
 
+### OCR Processing for Scanned Documents
+
+PDR AI includes optional advanced OCR (Optical Character Recognition) capabilities for processing scanned documents, images, and PDFs with poor text extraction:
+
+#### When to Use OCR
+- **Scanned Documents**: Physical documents that have been scanned to PDF
+- **Image-based PDFs**: PDFs that contain images of text rather than actual text
+- **Poor Quality Documents**: Documents with low-quality text that standard extraction can't read
+- **Handwritten Content**: Documents with handwritten notes or forms (with AI assistance)
+- **Mixed Content**: Documents combining text, images, tables, and diagrams
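As an illustration of the "Image-based PDFs" and "Poor Quality Documents" cases listed above, the sketch below flags a document for OCR when standard extraction yields very little text per page. This heuristic is an assumption made for illustration; the function name `looksScanned` and the 50-character threshold are hypothetical and do not come from PDR AI.

```typescript
// Hypothetical heuristic (not taken from PDR AI): treat a document as a
// candidate for OCR when standard text extraction returns almost nothing.
function looksScanned(pageTexts: string[], minCharsPerPage = 50): boolean {
  // No pages extracted at all: assume the standard path could not read it.
  if (pageTexts.length === 0) return true;

  // Average number of characters per page after trimming surrounding whitespace.
  const totalChars = pageTexts.reduce((sum, text) => sum + text.trim().length, 0);
  return totalChars / pageTexts.length < minCharsPerPage;
}
```

A check like this could be used to suggest enabling OCR before upload; the threshold would need tuning against real documents.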
65
+
66
+
#### How It Works
67
+
68
+
**Backend Infrastructure:**
69
+
1.**Environment Configuration**: Set `DATALAB_API_KEY` in your `.env` file (optional)
70
+
2.**Database Schema**: Tracks OCR status with fields:
71
+
-`ocrEnabled`: Boolean flag indicating if OCR was requested
72
+
-`ocrProcessed`: Boolean flag indicating if OCR completed successfully
73
+
-`ocrMetadata`: JSON field storing OCR processing details (page count, processing time, etc.)
74
+
75
+
3.**OCR Service Module** (`src/app/api/services/ocrService.ts`):
76
+
- Complete Datalab Marker API integration
77
+
- Asynchronous submission and polling architecture
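The "asynchronous submission and polling" pattern named above can be sketched roughly as follows. This is a minimal sketch, not the contents of `ocrService.ts`: the `OcrJobStatus` shape, the `runOcrJob` name, and the `submit`/`check` callbacks are all hypothetical, and no real Datalab Marker endpoint or response format is assumed. The HTTP calls are injected so the control flow can be read on its own.

```typescript
// Hypothetical job states; the real Datalab Marker response format may differ.
type OcrJobStatus =
  | { state: "pending" }
  | { state: "complete"; markdown: string; pageCount: number }
  | { state: "failed"; error: string };

interface PollOptions {
  intervalMs: number;  // delay between status checks
  maxAttempts: number; // give up after this many checks
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Submits a job once, then checks its status until it completes, fails,
// or the attempt budget runs out. `submit` and `check` are caller-supplied
// (assumed) wrappers around whatever HTTP client the service uses.
async function runOcrJob(
  submit: () => Promise<string>,
  check: (jobId: string) => Promise<OcrJobStatus>,
  { intervalMs, maxAttempts }: PollOptions,
): Promise<{ markdown: string; pageCount: number }> {
  const jobId = await submit();

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const status = await check(jobId);
    if (status.state === "complete") {
      return { markdown: status.markdown, pageCount: status.pageCount };
    }
    if (status.state === "failed") {
      throw new Error(`OCR job ${jobId} failed: ${status.error}`);
    }
    // Still pending: wait before the next check, unless this was the last one.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }

  throw new Error(`OCR job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

Bounding the number of attempts keeps an upload request from waiting indefinitely on a stalled job; the caller can record the failure and leave `ocrProcessed` as `false`.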
- `/api/uploadDocument` - Document upload with OCR support
 - `/api` - Backend API endpoints for all functionality
 - `/server/db` - Database schema and configuration
 ```
@@ -646,7 +778,12 @@ Key directories:
 ### Predictive Document Analysis
 - `POST /api/predictive-document-analysis` - Analyze documents for missing content and recommendations
 - `GET /api/fetchDocument` - Retrieve document content for analysis
-- `POST /api/uploadDocument` - Upload documents for processing
+
+### Document Upload & Processing
+- `POST /api/uploadDocument` - Upload documents for processing (supports OCR via `enableOCR` parameter)
+  - Standard path: Uses PDFLoader for digital PDFs
+  - OCR path: Uses Datalab Marker API for scanned documents
+  - Returns document metadata including OCR processing status
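The choice between the two paths described above might look something like the sketch below. The field names `ocrEnabled`, `ocrProcessed`, and `ocrMetadata` and the `enableOCR` parameter come from this README; everything else (the function names, the `OcrMetadata` shape, and the decision to fall back to the standard path when `DATALAB_API_KEY` is missing) is an assumption for illustration, not the actual route handler.

```typescript
// Hypothetical metadata shape; the README only says "page count, processing time, etc."
interface OcrMetadata {
  pageCount: number;
  processingTimeMs: number;
}

// Field names mirror the database schema fields listed in the README.
interface OcrStatus {
  ocrEnabled: boolean;
  ocrProcessed: boolean;
  ocrMetadata: OcrMetadata | null;
}

type ExtractionPath = "standard" | "ocr";

// Assumed policy: use the OCR path only when it was requested AND a key is
// configured; otherwise fall back to the standard extraction path.
function chooseExtractionPath(enableOCR: boolean, apiKey: string | undefined): ExtractionPath {
  const hasKey = apiKey !== undefined && apiKey.trim().length > 0;
  return enableOCR && hasKey ? "ocr" : "standard";
}

// Builds the OCR status portion of the returned document metadata.
function buildOcrStatus(
  enableOCR: boolean,
  path: ExtractionPath,
  metadata: OcrMetadata | null,
): OcrStatus {
  const ocrProcessed = path === "ocr" && metadata !== null;
  return {
    ocrEnabled: enableOCR,
    ocrProcessed,
    ocrMetadata: ocrProcessed ? metadata : null,
  };
}
```

Keeping `ocrEnabled` (what was requested) separate from `ocrProcessed` (what actually happened) lets a client tell a skipped or failed OCR run apart from one that was never asked for.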
 
 ### AI Chat & Q&A
 - `POST /api/LangChain` - AI-powered document Q&A
@@ -687,6 +824,7 @@ Key directories:
 | `LANGCHAIN_TRACING_V2` | Enable LangSmith tracing for LangChain operations. Set to `true` to enable. Get API key from [LangSmith](https://smith.langchain.com/) | ❌ | `true` or `false` |
 | `LANGCHAIN_API_KEY` | LangChain API key for LangSmith tracing and monitoring. Required if `LANGCHAIN_TRACING_V2=true`. Get from [LangSmith](https://smith.langchain.com/) | ❌ | `lsv2_...` |
 | `TAVILY_API_KEY` | Tavily Search API key for enhanced web search in document analysis. Get from [Tavily](https://tavily.com/) | ❌ | `tvly-...` |
+| `DATALAB_API_KEY` | Datalab Marker API key for advanced OCR processing of scanned documents. Get from [Datalab](https://www.datalab.to/) | ❌ | `your_datalab_key` |
 | `UPLOADTHING_SECRET` | UploadThing secret key for file uploads. Get from [UploadThing Dashboard](https://uploadthing.com/) | ✅ | `sk_live_...` |
 | `UPLOADTHING_APP_ID` | UploadThing application ID. Get from [UploadThing Dashboard](https://uploadthing.com/) | ✅ | `your_app_id` |
 | `NODE_ENV` | Environment mode. Must be one of: `development`, `test`, `production` | ✅ | `development` |
@@ -700,6 +838,7 @@ Key directories:
 - **AI Features**: `OPENAI_API_KEY` (used for embeddings, chat, and document analysis)
 - **AI Observability**: `LANGCHAIN_TRACING_V2`, `LANGCHAIN_API_KEY` (for LangSmith tracing and monitoring)
 - **Search Features**: `TAVILY_API_KEY` (for enhanced web search in document analysis)
+- **OCR Processing**: `DATALAB_API_KEY` (for advanced OCR of scanned documents)