Hi, I've been reading the instructions on how to fine-tune the PaddleOCR-VL model, and I have some questions about preparing the fine-tuning dataset: https://github.com/PaddlePaddle/ERNIE/blob/release/v1.4/docs/paddleocr_vl_sft.md
- Let's say I have a single-page PDF image with some text, tables, and images (see above). How should I generate the fine-tuning data in this case? Do I have to separate the text, tables, and images and create a fine-tuning dataset for each task?
- Is it possible to train the PaddleOCR-VL model from scratch using ERNIE?
- If my fine-tuning dataset contains only one task (say, table recognition), how do you think this will affect overall model performance?
Thank you so much!