insufficient train.py with specific data folder #2531
Unanswered
JoJoistheBestOne asked this question in General
Replies: 1 comment
-
Use webdataset or tfds if you need scale. TSV/CSV datasets don't really scale any better than folders, and there is too much variation in schemas to support them universally.
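For anyone hitting the inode limit mentioned in the question, here is a rough sketch of packing samples into a tar shard, so that many images live in one file on disk. It uses only the Python standard library and follows the WebDataset convention that files sharing a basename form one sample. The `write_shard` helper, the `.jpg`/`.cls` extensions, and the shard name in the usage note are assumptions for illustration, not something this repository provides.

```python
import io
import tarfile


def write_shard(shard_path, samples):
    """Write (key, image_bytes, class_index) samples into one tar shard.

    Hypothetical helper: files that share a basename (000000.jpg and
    000000.cls) are grouped as one sample by WebDataset-style readers.
    """
    with tarfile.open(shard_path, "w") as tar:
        for key, image_bytes, class_index in samples:
            members = (
                (f"{key}.jpg", image_bytes),
                (f"{key}.cls", str(class_index).encode("ascii")),
            )
            for name, payload in members:
                # Each member needs an explicit size before it is added.
                info = tarfile.TarInfo(name=name)
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
```

Calling `write_shard("train-000000.tar", samples)` once per few thousand samples would produce a set of shards. The shard size and naming scheme are up to you, and how the shards are then passed to train.py depends on the repository's own dataset options.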
0 replies
-
Currently, train.py requires a data folder with 'train' and 'val' subfolders, each containing one subfolder per class. This is highly inefficient because my dataset is very large. Furthermore, I'm not only using this data to train a classification model; I need to train other models as well, and this rigid structure makes it very inconvenient for them to use the same data.
Due to the sheer size of the dataset, I can't even create symbolic links, because that would exceed the inode limit.
Does this repository support inputting train.csv and val.csv? Why is it designed to be so difficult to use?
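To make the request concrete, this is roughly the kind of CSV-backed dataset being asked about, sketched with the standard library only. It is not part of this repository's train.py. The class name `CsvImageDataset` and the `path` and `label` column names are assumptions, which is exactly the schema variation that makes a universal version hard. It exposes `__len__` and `__getitem__`, the protocol that PyTorch map-style datasets follow.

```python
import csv
from pathlib import Path


class CsvImageDataset:
    """Hypothetical map-style dataset over a CSV with `path` and `label` columns."""

    def __init__(self, csv_path, root=".", loader=None):
        self.root = Path(root)
        # Default loader returns raw bytes; pass an image decoder if needed.
        self.loader = loader or (lambda p: Path(p).read_bytes())
        with open(csv_path, newline="") as f:
            # Assumed schema: relative file path plus an integer class label.
            self.rows = [(r["path"], int(r["label"])) for r in csv.DictReader(f)]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        rel_path, label = self.rows[index]
        return self.loader(self.root / rel_path), label
```

With this, `CsvImageDataset("train.csv", root="/data")` would read files in place without copying or symlinking them. The paths here are placeholders, and wiring such a class into the training script would still need custom glue code.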