RotAttentionPool2d Performance Discrepancy and Comparison with naver-ai/rope-vit #2528

Answered by rwightman
ryan-minato asked this question in Q&A

@ryan-minato I haven't looked too closely at the naver impl. There are often subtle differences between ROPE implementations, though they are usually equivalent. It might be possible to port those ViTs to timm using an existing ViT as a base, or to add a new model if it's sufficiently different.
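
To illustrate how two ROPE implementations can differ in layout yet remain equivalent, here is a minimal, generic sketch. It is not code from timm or naver-ai/rope-vit; the function names, the 1D positions, and the frequency base of 10000 are all assumptions made for the example. One variant rotates adjacent channel pairs, the other rotates channel `i` with channel `i + dim/2`. After a fixed channel permutation they give the same result, so attention scores are unchanged.

```python
import math


def rope_angles(pos, dim, base=10000.0):
    # One rotation angle per channel pair (assumed frequency schedule).
    return [pos * base ** (-2.0 * i / dim) for i in range(dim // 2)]


def rope_interleaved(x, pos):
    # Variant A: rotate adjacent pairs (x[0], x[1]), (x[2], x[3]), ...
    out = list(x)
    for i, theta in enumerate(rope_angles(pos, len(x))):
        a, b = x[2 * i], x[2 * i + 1]
        out[2 * i] = a * math.cos(theta) - b * math.sin(theta)
        out[2 * i + 1] = a * math.sin(theta) + b * math.cos(theta)
    return out


def rope_half_split(x, pos):
    # Variant B: rotate pairs (x[i], x[i + dim/2]).
    half = len(x) // 2
    out = list(x)
    for i, theta in enumerate(rope_angles(pos, len(x))):
        a, b = x[i], x[i + half]
        out[i] = a * math.cos(theta) - b * math.sin(theta)
        out[i + half] = a * math.sin(theta) + b * math.cos(theta)
    return out


def to_half_layout(x):
    # Channel permutation mapping variant A's layout to variant B's:
    # even-indexed channels first, then odd-indexed channels.
    return list(x[0::2]) + list(x[1::2])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.75, -0.5, 1.0]

# Attention score under variant A, with q at position 3 and k at position 7.
score_a = dot(rope_interleaved(q, 3), rope_interleaved(k, 7))
# Same score under variant B once the channels are permuted consistently.
score_b = dot(rope_half_split(to_half_layout(q), 3),
              rope_half_split(to_half_layout(k), 7))
print(math.isclose(score_a, score_b, rel_tol=1e-9, abs_tol=1e-12))
```

In practice this means that porting weights between two such variants would need the matching channel permutation applied to the query and key projections. Whether that applies to any specific pair of implementations has to be checked against their actual code.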

The comment there was specific to the ROPE attention pool. I tried it once as a replacement for a standard attention pool with a ResNet model or similar, and it didn't generalize well to other resolutions. That was some time ago, though, and I think it might have been before I added resolution scaling support to ROPE.

However, the ROPE embedding impl (

Answer selected by ryan-minato