Ross Wightman

rwightman

AI & ML interests

Computer vision, transfer learning, semi/self supervised learning, robotics.

Articles

Searching for better (Full) ImageNet ViT Baselines

MobileNet Baselines

MobileNet-V4 (now in timm)

rwightman's activity

upvoted a collection 1 day ago

timm tiny test models

A collection of very small (~300-500k parameter) models at 160x160 resolution, for testing purposes. Trained on ImageNet-1k. • 12 items • Updated 1 day ago • 1

upvoted 2 articles 2 months ago

Article

MobileNet Baselines

Jul 26

• 23

Article

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Jul 25

• 18

upvoted a collection 2 months ago

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24 • 49

upvoted 2 papers 3 months ago

PaliGemma: A versatile 3B VLM for transfer

Paper • 2407.07726 • Published Jul 10 • 64

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24 • 55

upvoted a collection 3 months ago

Cambrian Data

3 items • Updated Jun 25 • 8

upvoted a paper 3 months ago

An Image is Worth 32 Tokens for Reconstruction and Generation

Paper • 2406.07550 • Published Jun 11 • 55

upvoted 2 collections 3 months ago

MobileCLIP Models + DataCompDR Data

MobileCLIP: Mobile-friendly image-text models with SOTA zero-shot capabilities. DataCompDR: Improved datasets for training image-text SOTA models. • 22 items • Updated Jun 20 • 21

MobileNetV4 pretrained weights

Weights for MobileNet-V4 pretrained in timm • 17 items • Updated 2 days ago • 13

upvoted 2 papers 4 months ago

MobileNetV4 -- Universal Models for the Mobile Ecosystem

Paper • 2404.10518 • Published Apr 16 • 2

On the Efficiency of Convolutional Neural Networks

Paper • 2404.03617 • Published Apr 4 • 4

upvoted 3 articles 4 months ago

Article

MobileNet-V4 (now in timm)

Jun 17

• 37

Article

Multimodal Augmentation for Documents: Recovering “Comprehension” in “Reading and Comprehension” task

May 16

• 17

Article

PaliGemma – Google's Cutting-Edge Open Vision Language Model

May 14

• 195

upvoted 2 collections 4 months ago

PaliGemma Release

Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 133

PaliGemma FT Models

108 items • Updated Jul 31 • 27

upvoted a collection 5 months ago

Searching for Better ViT Baselines

Exploring ViT hparams and model shapes for the GPU poor (between tiny and base). • 25 items • Updated Aug 21 • 12

upvoted a collection 6 months ago

PDF Document / OCR Datasets

Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30 • 47

upvoted 2 collections 8 months ago

AIM

AIM: Autoregressive Image Models • 5 items • Updated Jun 19 • 48

plant-image-datasets

Image datasets about the kingdom Plantae. • 4 items • Updated Feb 29 • 2

upvoted 2 collections 9 months ago

Fine-Tune Image Classification Benchmark Datasets

Datasets for fine-tune benchmarking, hparam tuning. All vetted and tested with timm scripts. • 3 items • Updated Jun 12 • 1

All the ImageNets

Noteworthy instances of ImageNet on the Hub. Vetted and tested with timm train and validation scripts. • 7 items • Updated Jun 12 • 4

upvoted 5 collections 10 months ago

Fastest timm models > 75.3% IN-1k Top-1 (Original ResNet-50)

Fastest image classification models with more than 75.3% top-1 accuracy on ImageNet-1k. • 21 items • Updated Jul 26 • 4

timm Top-20 ImageNet-1k Models

The 20 best models on ImageNet-1k validation set, all pretrained on datasets larger than ImageNet and fine-tuned on ImageNet-1k. • 17 items • Updated Jun 12 • 7

timm Top-20 Fastest Models

Not the most accurate, but the highest throughput image classification models in timm • 20 items • Updated Jun 12 • 14

timm ImageNet-12k Models

timm has a number of unique and exclusive models trained on an 11821-class (12k) subset of the full ImageNet-22k • 27 items • Updated Jun 12 • 2

timm Takes on the Classics

timm includes the most popular convolutional and vision transformer models, many with new weights from updated training recipes. • 24 items • Updated Jul 26 • 3

upvoted 2 collections 11 months ago

zephyr story

sources mentioned by hf.co/thomwolf tweet: x.com/Thom_Wolf/status/1720503998518640703 • 8 items • Updated Jan 24 • 15

Pythia Scaling Suite

Pythia is the first LLM suite designed specifically to enable scientific research on LLMs. To learn more see https://github.com/EleutherAI/pythia • 18 items • Updated Nov 21, 2023 • 22

upvoted a paper 11 months ago

Data Filtering Networks

Paper • 2309.17425 • Published Sep 29, 2023 • 6

upvoted 2 collections 11 months ago

WILDS

WILDS is a benchmark of in-the-wild distribution shifts spanning diverse data modalities and applications. • 10 items • Updated Aug 21 • 4

Daily Papers

1 item • Updated Oct 26, 2023 • 57

upvoted a paper 11 months ago

Sigmoid Loss for Language Image Pre-Training

Paper • 2303.15343 • Published Mar 27, 2023 • 4

upvoted 3 collections 12 months ago

INaturalist-2021 Fine-tunes

Fine-tune experiments for various `timm` models on the INaturalist 2021 Challenge dataset (https://github.com/visipedia/inat_comp/tree/master/2021) • 5 items • Updated Oct 25, 2023 • 6

OpenCLIP DataComp

OpenCLIP models trained on DataComp (https://huggingface.co/papers/2304.14108). • 6 items • Updated Oct 9, 2023 • 6

OpenCLIP LAION-2B

OpenCLIP models trained on LAION-2B • 19 items • Updated Sep 10, 2023 • 18