What is the issue?
I am trying to run the benchmark locally after adding a new vector database, LanceDB, specifically to compare filtered search performance. When I run the benchmark with this command:
poetry run python3 run.py --engines "lancedb-*"
I get the following error:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/Yudhiesh/Projects/vector-db-benchmark/datasets/yandex-1B-200-angular/yandex_t2i_gt_100k/vectors.npy'
I noticed that the dataset that is downloaded is incomplete:
datasets/yandex-1B-200-angular
└── yandex_t2i_gt_100k
    └── tests.jsonl

2 directories, 1 file
Based on the definition of AnnCompoundReader, the dataset directory should contain a vectors.npy file in addition to the .jsonl files, so the download is missing the actual vectors:
class AnnCompoundReader(JSONReader):
    """
    A reader created specifically to read the format used in
    https://github.com/qdrant/ann-filtering-benchmark-datasets, in which vectors
    and their metadata are stored in separate files.
    """

    VECTORS_FILE = "vectors.npy"
    QUERIES_FILE = "tests.jsonl"
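For anyone trying to reproduce this, here is a minimal sketch of a check for an incomplete download. It assumes the two file names in the class attributes above are the ones the reader needs; the helper name `missing_dataset_files` is hypothetical and not part of the benchmark code.

```python
from pathlib import Path

# File names copied from the AnnCompoundReader class attributes quoted above.
# Assumption: these are the files that must exist in the dataset directory.
EXPECTED_FILES = ("vectors.npy", "tests.jsonl")


def missing_dataset_files(dataset_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Path relative to the repository root, matching the listing above.
    dataset_dir = "datasets/yandex-1B-200-angular/yandex_t2i_gt_100k"
    missing = missing_dataset_files(dataset_dir)
    if missing:
        print(f"{dataset_dir} is missing: {', '.join(missing)}")
    else:
        print(f"{dataset_dir} contains all expected files")
```

Run from the repository root, this should report vectors.npy as missing for the directory listing shown earlier, which matches the FileNotFoundError.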