Recall (R@k) way lower than the one obtained in papers #502

Here is what I do:

- Get a test dataset of image-caption pairs (IIITD-20K dataset)
- Calculate embeddings with my fine-tuned CLIP
- Calculate the cosine distance between each text and all images
- Get the k closest images to the text; if the corresponding image is among them, add 1 to the score
- Get the recall by dividing the score by the length of the test dataset

This is my recall at k. I obtain an R@1 of 17%, while most papers that fine-tune CLIP report at least 60% recall at 1. Any idea what I could be doing wrong?
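
For reference, the steps above can be sketched roughly as follows. This is a minimal NumPy sketch, not my exact code: the function name `recall_at_k` and the arguments `text_emb`, `image_emb`, and `gt_index` (the index of the matching image for each caption) are hypothetical names used for illustration. It ranks by cosine similarity (higher is closer), which gives the same ordering as smallest cosine distance.

```python
import numpy as np

def recall_at_k(text_emb, image_emb, gt_index, k):
    """Text-to-image recall@k.

    text_emb:  (n_text, d) array of caption embeddings
    image_emb: (n_image, d) array of image embeddings
    gt_index:  (n_text,) array, index of the correct image for each caption
    """
    # L2-normalize so the dot product equals cosine similarity
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)

    # Similarity of every caption to every image: shape (n_text, n_image)
    sim = text_emb @ image_emb.T

    # Indices of the k most similar images for each caption
    topk = np.argsort(-sim, axis=1)[:, :k]

    # A caption counts as a hit if its ground-truth image is in the top k
    hits = (topk == gt_index[:, None]).any(axis=1)

    # Divide the number of hits by the number of captions
    return hits.mean()
```

Calling `recall_at_k(text_emb, image_emb, gt_index, k=1)` returns a fraction between 0 and 1, so 17% corresponds to 0.17.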
