A Preliminary Study of o1 in Medicine: Are We Closer to an AI Doctor? Paper • 2409.15277 • Published 1 day ago • 28
RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning Paper • 2409.14674 • Published 2 days ago • 34
YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models Paper • 2409.13592 • Published 5 days ago • 39
Imagine yourself: Tuning-Free Personalized Image Generation Paper • 2409.13346 • Published 5 days ago • 57
InstantDrag: Improving Interactivity in Drag-based Image Editing Paper • 2409.08857 • Published 12 days ago • 29
Apollo: Band-sequence Modeling for High-Quality Audio Restoration Paper • 2409.08514 • Published 12 days ago • 8
ReCLAP: Improving Zero Shot Audio Classification by Describing Sounds Paper • 2409.09213 • Published 11 days ago • 10
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval Paper • 2409.10516 • Published 9 days ago • 28
Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale Paper • 2409.08264 • Published 13 days ago • 40
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines Paper • 2409.12959 • Published 6 days ago • 33
Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey Paper • 2409.11564 • Published 7 days ago • 17
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation Paper • 2409.09214 • Published 11 days ago • 43
Article A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes Aug 17, 2022 • 56
PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation Paper • 2409.06820 • Published 14 days ago • 56
Gated Slot Attention for Efficient Linear-Time Sequence Modeling Paper • 2409.07146 • Published 14 days ago • 19
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding Paper • 2408.15545 • Published 28 days ago • 32
LLaMA-Omni: Seamless Speech Interaction with Large Language Models Paper • 2409.06666 • Published 15 days ago • 52
INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding Paper • 2409.06210 • Published 15 days ago • 24
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct Paper • 2409.05840 • Published 16 days ago • 43
Towards a Unified View of Preference Learning for Large Language Models: A Survey Paper • 2409.02795 • Published 21 days ago • 70
Configurable Foundation Models: Building LLMs from a Modular Perspective Paper • 2409.02877 • Published 21 days ago • 27
CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation Paper • 2409.03643 • Published 20 days ago • 18
Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining Paper • 2409.02326 • Published 21 days ago • 16
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA Paper • 2409.02897 • Published 21 days ago • 43
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models Paper • 2409.00509 • Published 25 days ago • 38
Medical SAM 2: Segment medical images as video via Segment Anything Model 2 Paper • 2408.00874 • Published Aug 1 • 40
CogVLM2: Visual Language Models for Image and Video Understanding Paper • 2408.16500 • Published 27 days ago • 56
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published 27 days ago • 45
BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline Paper • 2408.15079 • Published 29 days ago • 51
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published 29 days ago • 36
Writing in the Margins: Better Inference Pattern for Long Context Retrieval Paper • 2408.14906 • Published 29 days ago • 137
Learning to Move Like Professional Counter-Strike Players Paper • 2408.13934 • Published about 1 month ago • 21
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences Paper • 2408.14468 • Published 30 days ago • 33
Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler Paper • 2408.13359 • Published Aug 23 • 21
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On Paper • 2407.08348 • Published Jul 11 • 51
Autoregressive Speech Synthesis without Vector Quantization Paper • 2407.08551 • Published Jul 11 • 13
Towards Robust Speech Representation Learning for Thousands of Languages Paper • 2407.00837 • Published Jun 30 • 10
A Closer Look into Mixture-of-Experts in Large Language Models Paper • 2406.18219 • Published Jun 26 • 15