timm is a PyTorch-based collection of models, pre-trained weights, and utilities focused on the state-of-the-art (SOTA) in computer vision.
The library was created in 2019 by Ross Wightman. As of version 0.6.11, timm offers 765 models with weights pretrained on the ImageNet dataset. It’s a comprehensive training library that is beloved by the computer vision community and was recently named the top trending library on Papers with Code for 2021!
The timm library is primarily used for image classification.
Not only does this library have over 700 pre-trained SOTA image classification models, it also lets you use your own data loaders, optimizers, and schedulers. There are also scripts for reproducing and fine-tuning deep learning models on custom datasets.
You can easily load a pre-trained model, get a list of models with pre-trained weights, and search for model architectures using a regex-like wildcard syntax.
There are a number of training, validation, inference, and checkpoint-cleaning scripts in the GitHub repo. These make it easy for you to reproduce results and fine-tune models on custom datasets.
An incredibly useful feature is the ability to work with input images that have varying numbers of channels, something that poses a problem in most other libraries. timm handles this by summing the weights of the initial convolutional layer when the input has fewer than three channels, or intelligently replicating those weights up to the desired number of channels otherwise.
You can only use it for classification.
There aren’t a ton of example recipes for training. Most of the models don’t have their training recipes available for inspection.
Though the library has a ton of features, I found it difficult to figure out where to get started, particularly when applying it to custom use cases.
I feel like the documentation could be better. There’s a really helpful Medium post, a practitioner’s guide, that goes into a lot of detail, but sometimes you just want something that gets straight to the point.
That being said…here’s an example of making a prediction and performing transfer learning with a timm model.