Collections
Collections including paper arxiv:2309.01826
-
A Biomedical Entity Extraction Pipeline for Oncology Health Records in Portuguese
Paper • 2304.08999 • Published • 2
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 82
Robust Open-Vocabulary Translation from Visual Text Representations
Paper • 2104.08211 • Published • 1
Poro 34B and the Blessing of Multilinguality
Paper • 2404.01856 • Published • 12
-
Scaling MLPs: A Tale of Inductive Bias
Paper • 2306.13575 • Published • 14
Trap of Feature Diversity in the Learning of MLPs
Paper • 2112.00980 • Published • 1
Understanding the Spectral Bias of Coordinate Based MLPs Via Training Dynamics
Paper • 2301.05816 • Published • 1
RaftMLP: How Much Can Be Done Without Attention and with Less Spatial Locality?
Paper • 2108.04384 • Published • 1
-
Matryoshka Diffusion Models
Paper • 2310.15111 • Published • 40
SortedNet, a Place for Every Network and Every Network in its Place: Towards a Generalized Solution for Training Many-in-One Neural Networks
Paper • 2309.00255 • Published • 1
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
Paper • 2309.08968 • Published • 22
Matryoshka Representation Learning
Paper • 2205.13147 • Published • 8
-
Language Modeling Is Compression
Paper • 2309.10668 • Published • 82
Baichuan 2: Open Large-scale Language Models
Paper • 2309.10305 • Published • 18
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 38
LMDX: Language Model-based Document Information Extraction and Localization
Paper • 2309.10952 • Published • 64
-
One Wide Feedforward is All You Need
Paper • 2309.01826 • Published • 31
Gated recurrent neural networks discover attention
Paper • 2309.01775 • Published • 7
FLM-101B: An Open LLM and How to Train It with $100K Budget
Paper • 2309.03852 • Published • 43
Large Language Models as Optimizers
Paper • 2309.03409 • Published • 75
-
TheBirdLegacy/FreeLoaderLM
Text Generation • Updated
CofeAI/FLM-101B
Text Generation • Updated • 15 • 92
FLM-101B: An Open LLM and How to Train It with $100K Budget
Paper • 2309.03852 • Published • 43
Composable Function-preserving Expansions for Transformer Architectures
Paper • 2308.06103 • Published • 19
-
Large Language Models as Optimizers
Paper • 2309.03409 • Published • 75
One Wide Feedforward is All You Need
Paper • 2309.01826 • Published • 31
Self-Alignment with Instruction Backtranslation
Paper • 2308.06259 • Published • 40
Shepherd: A Critic for Language Model Generation
Paper • 2308.04592 • Published • 29