@@ -392,6 +392,7 @@ All model architecture families include variants with pretrained weights. There
* Inception-ResNet-V2 and Inception-V4 - https://arxiv.org/abs/1602.07261
* Lambda Networks - https://arxiv.org/abs/2102.08602
* LeViT (Vision Transformer in ConvNet's Clothing) - https://arxiv.org/abs/2104.01136
+ * MambaOut - https://arxiv.org/abs/2405.07992
* MaxViT (Multi-Axis Vision Transformer) - https://arxiv.org/abs/2204.01697
* MetaFormer (PoolFormer-v2, ConvFormer, CAFormer) - https://arxiv.org/abs/2210.13452
* MLP-Mixer - https://arxiv.org/abs/2105.01601
@@ -453,13 +454,16 @@ All model architecture families include variants with pretrained weights. There
* XCiT (Cross-Covariance Image Transformers) - https://arxiv.org/abs/2106.09681
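
All of the families above can be instantiated with pretrained weights through the model factory. Below is a minimal, hedged sketch (assuming a recent `timm` install; the `'*maxvit*'` filter pattern and the choice of the first match are purely illustrative):

```python
import timm

# Discover registered architectures that have pretrained weights available;
# the wildcard filter is just an example, any pattern (or none) works.
pretrained_names = timm.list_models('*maxvit*', pretrained=True)
print(pretrained_names[:5])

# Instantiate one of the matches with its pretrained weights and switch to eval mode.
model = timm.create_model(pretrained_names[0], pretrained=True)
model = model.eval()
```
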
### Optimizers
+ To see the full list of optimizers w/ descriptions: `timm.optim.list_optimizers(with_description=True)`
- Included optimizers available via `create_optimizer` / `create_optimizer_v2` factory methods:
+ Included optimizers available via `timm.optim.create_optimizer_v2` factory method (see the usage sketch after this list):
* `adabelief` an implementation of AdaBelief adapted from https://github.com/juntang-zhuang/Adabelief-Optimizer - https://arxiv.org/abs/2010.07468
* `adafactor` adapted from [FAIRSeq impl](https://github.com/pytorch/fairseq/blob/master/fairseq/optim/adafactor.py) - https://arxiv.org/abs/1804.04235
+ * `adafactorbv` adapted from [Big Vision](https://github.com/google-research/big_vision/blob/main/big_vision/optax.py) - https://arxiv.org/abs/2106.04560
* `adahessian` by [David Samuel](https://github.com/davda54/ada-hessian) - https://arxiv.org/abs/2006.00719
* `adamp` and `sgdp` by [Naver ClovAI](https://github.com/clovaai) - https://arxiv.org/abs/2006.08217
* `adan` an implementation of Adan adapted from https://github.com/sail-sg/Adan - https://arxiv.org/abs/2208.06677
+ * `adopt` adapted from https://github.com/iShohei220/adopt - https://arxiv.org/abs/2411.02853
* `lamb` an implementation of Lamb and LambC (w/ trust-clipping) cleaned up and modified to support use with XLA - https://arxiv.org/abs/1904.00962
* `lars` an implementation of LARS and LARC (w/ trust-clipping) - https://arxiv.org/abs/1708.03888
* `lion` an implementation of Lion adapted from https://github.com/google/automl/tree/master/lion - https://arxiv.org/abs/2302.06675
@@ -472,7 +476,8 @@ Included optimizers available via `create_optimizer` / `create_optimizer_v2` fac
* `rmsprop_tf` adapted from PyTorch RMSProp by myself. Reproduces much improved Tensorflow RMSProp behaviour
* `sgdw` an implementation of SGD w/ decoupled weight-decay
* `fused<name>` optimizers by name with [NVIDIA Apex](https://github.com/NVIDIA/apex/tree/master/apex/optimizers) installed
- * `bits<name>` optimizers by name with [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) installed
+ * `bnb<name>` optimizers by name with [BitsAndBytes](https://github.com/TimDettmers/bitsandbytes) installed
+ * `adam`, `adamw`, `rmsprop`, `adadelta`, `adagrad`, and `sgd` pass through to `torch.optim` implementations
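
As a rough usage sketch of the two entry points above (assuming a recent `timm` install; the ResNet-50 model and the AdamW hyperparameters are illustrative only, not recommendations):

```python
import timm
from timm.optim import create_optimizer_v2, list_optimizers

# Enumerate registered optimizer names; with_description=True also includes
# a short description for each entry in recent timm versions.
for entry in list_optimizers(with_description=True):
    print(entry)

# Build an optimizer by name for a model; by default the factory filters
# bias/norm parameters out of weight decay.
model = timm.create_model('resnet50', pretrained=False)
optimizer = create_optimizer_v2(model, opt='adamw', lr=1e-3, weight_decay=0.05)

# Per the list above, prefixed names (e.g. 'fused...' with NVIDIA Apex installed,
# 'bnb...' with bitsandbytes installed) select those alternative implementations.
```
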
### Augmentations
* Random Erasing from [Zhun Zhong](https://github.com/zhunzhong07/Random-Erasing/blob/master/transforms.py) - https://arxiv.org/abs/1708.04896