A Controlled Study on Long Context Extension and Generalization in LLMs • Paper • 2409.12181
Gated Slot Attention for Efficient Linear-Time Sequence Modeling • Paper • 2409.07146
Parallelizing Linear Transformers with the Delta Rule over Sequence Length • Paper • 2406.06484 • Published Jun 10
based • Collection • Language model checkpoints trained at the 360M and 1.3B parameter scales for up to 50B tokens on the Pile corpus, released for research purposes. • 14 items • Updated May 14