# This is because it instantiates its attention layer from torch.nn.MultiheadAttention, which calls
# `torch.nn.functional.multi_head_attention_forward` with the weights and bias. Since the hook is never
# triggered with a forward pass call, the weights stay on the CPU. There are more examples where we skip
# this test because of MHA (for example, HunyuanDiT, because of its AttentionPooling layer).
pass
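# A minimal standalone sketch (not part of this diff, assuming a recent PyTorch) of the behaviour
# described above. A plain forward pre-hook on the inner projection stands in for an offloading
# hook: nn.MultiheadAttention passes out_proj.weight / out_proj.bias directly to
# torch.nn.functional.multi_head_attention_forward, so out_proj.forward is never called, the hook
# never fires, and a hook attached there could not move the weights off the CPU.
import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=8, num_heads=2, batch_first=True)

fired = []
mha.out_proj.register_forward_pre_hook(lambda module, args: fired.append(True))

x = torch.randn(1, 4, 8)
mha(x, x, x)  # out_proj.forward is bypassed; the functional API receives the weights directly

print(fired)  # [] -> the hook never ran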
# TODO(aryan): Create a dummy gemma model with smol vocab size
@unittest.skip(
"A very small vocab size is used for fast tests. So, any kind of prompt other than the empty default used in other tests will lead to a embedding lookup error. This test uses a long prompt that causes the error."
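# A minimal standalone sketch (an illustration, not part of this diff) of the failure mode named in
# the skip reason: with a tiny vocabulary, token ids produced by a real prompt exceed
# num_embeddings and the embedding lookup fails.
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=8, embedding_dim=4)  # "very small vocab size"
token_ids = torch.tensor([[3, 250]])  # a long prompt yields ids outside [0, 8)

try:
    emb(token_ids)
except IndexError as err:
    print(err)  # index out of range in self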