Marqo-FashionCLIP and Marqo-FashionSigLIP (Collection): SOTA multimodal models for fashion product embeddings -> https://github.com/marqo-ai/marqo-FashionCLIP/ • 11 items • Updated 27 days ago • 6
SAM2Point: Segment Any 3D as Videos in Zero-shot and Promptable Manners • Paper • 2408.16768 • Published 27 days ago • 26
CogVLM2: Visual Language Models for Image and Video Understanding • Paper • 2408.16500 • Published 27 days ago • 56
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation • Paper • 2408.15881 • Published 28 days ago • 20
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models • Paper • 2408.08872 • Published Aug 16 • 96