Description
🚀 The feature
Fake datasets let us quickly check that an instantiated model runs end to end. This is useful for testing as well as for faster prototyping.
We already have FakeData in torchvision for this, but as of now it supports only image classification.
Motivation, pitch
We should have multiple FakeData classes, such as ImageClassificationFakeData and ObjectDetectionFakeData.
If we do that, we should deprecate FakeData in favor of ImageClassificationFakeData.
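To make the idea concrete, here is a rough sketch of what an ObjectDetectionFakeData could look like. Nothing here exists in torchvision: the class body, the constructor arguments (size, image_size, num_classes, max_boxes, random_offset) and the choice of tensor images are all assumptions for discussion. The target layout (a dict with "boxes" in xyxy format and int64 "labels", with 0 reserved for background) follows what the torchvision detection models expect.

```python
import torch
from torch.utils.data import Dataset


class ObjectDetectionFakeData(Dataset):
    """Hypothetical fake detection dataset (a sketch, not a torchvision API)."""

    def __init__(self, size=100, image_size=(3, 224, 224), num_classes=10,
                 max_boxes=5, random_offset=0):
        self.size = size
        self.image_size = image_size
        self.num_classes = num_classes  # assumed to include background (label 0)
        self.max_boxes = max_boxes
        self.random_offset = random_offset

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError(f"index {index} out of range for size {self.size}")

        # A private generator seeded per index keeps samples reproducible
        # without touching the global RNG state.
        g = torch.Generator().manual_seed(index + self.random_offset)
        _, height, width = self.image_size

        image = torch.rand(self.image_size, generator=g)
        num_boxes = int(torch.randint(1, self.max_boxes + 1, (1,), generator=g))

        # Top-left corners leave room for a box at least one pixel wide and tall.
        x1 = torch.rand(num_boxes, generator=g) * (width - 1)
        y1 = torch.rand(num_boxes, generator=g) * (height - 1)
        x2 = (x1 + 1 + torch.rand(num_boxes, generator=g) * (width - 1 - x1)).clamp(max=float(width))
        y2 = (y1 + 1 + torch.rand(num_boxes, generator=g) * (height - 1 - y1)).clamp(max=float(height))

        target = {
            "boxes": torch.stack((x1, y1, x2, y2), dim=1),  # xyxy, float32
            # Foreground labels only: 1 .. num_classes - 1 (int64).
            "labels": torch.randint(1, self.num_classes, (num_boxes,), generator=g),
        }
        return image, target


dataset = ObjectDetectionFakeData(size=4, image_size=(3, 32, 48))
image, target = dataset[0]
```

Returning tensors instead of PIL images is a simplification for this sketch; a real implementation would probably mirror FakeData and accept transforms.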
We could also think about supporting different annotation formats, e.g. bounding boxes in xywh as well as xyxy format, and masks as binary masks or boolean masks. (Not sure about this; needs discussion.)
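For reference, the format differences mentioned above are small. The snippet below is only an illustration: the helper names xyxy_to_xywh and xywh_to_xyxy are made up for this issue, and it assumes that "binary mask" means a uint8 tensor of 0s and 1s while "boolean mask" means a torch.bool tensor.

```python
import torch


def xyxy_to_xywh(boxes):
    # (x1, y1, x2, y2) -> (x, y, width, height), with (x, y) the top-left corner.
    x1, y1, x2, y2 = boxes.unbind(-1)
    return torch.stack((x1, y1, x2 - x1, y2 - y1), dim=-1)


def xywh_to_xyxy(boxes):
    # (x, y, width, height) -> (x1, y1, x2, y2).
    x, y, w, h = boxes.unbind(-1)
    return torch.stack((x, y, x + w, y + h), dim=-1)


boxes_xyxy = torch.tensor([[10.0, 20.0, 50.0, 80.0]])
boxes_xywh = xyxy_to_xywh(boxes_xyxy)  # tensor([[10., 20., 40., 60.]])

# The same 2x2 foreground region as a boolean mask and as a 0/1 uint8 mask.
bool_mask = torch.zeros(4, 4, dtype=torch.bool)
bool_mask[1:3, 1:3] = True
binary_mask = bool_mask.to(torch.uint8)
```

Since the conversions are this cheap, one option would be for the fake datasets to generate a single canonical format and take the output format as a parameter.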
Alternatives
Other libraries maintain something similar to test their models. These libraries mostly wrap torchvision models in their own framework's equivalents and test them on small sample datasets.
https://github.com/Lightning-AI/lightning-bolts/blob/master/tests/models/test_detection.py
https://github.com/Lightning-AI/lightning-flash/blob/master/tests/image/detection/test_model.py
https://github.com/oke-aditya/quickvision/blob/master/tests/dataset_utils.py
Additional context
@pmeier please chip in your thoughts!
cc @pmeier