Adaptive Layer-skipping in Pre-trained LLMs (00:11:11)
FineWeb2: One Pipeline to Scale Them All — Adapting Pre-Training Data Processing to Every Language (00:13:58)
Hidden in plain sight: VLMs overlook their visual representations (00:13:46)
Understanding R1-Zero-Like Training: A Critical Perspective (00:11:44)
Luke Zettlemoyer - Mixed-modal Language Modeling (01:00:30)
Tom Griffiths - Mapping the Jagged Edges of AI with Cognitive Science (00:57:53)
CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation (00:12:27)
Don't lie to your friends: Learning what you know from collaborative self-play (00:15:31)
PrefPalette: Personalized Preference Modeling with Latent Attributes (00:16:56)
Fluid Language Model Benchmarking (00:15:39)
Gillian Hadfield - Alignment is social: lessons from human alignment for AI (00:58:16)
Readability ≠ Learnability: Rethinking the Role of Simplicity in Training Small Language Models (00:13:04)
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs (00:12:42)
MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing (00:12:45)
Language models align with brain regions that represent concepts across modalities (00:11:05)
Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective (00:15:25)
Single-Pass Document Scanning for Question Answering (00:13:24)
Shared Global and Local Geometry of Language Model Embeddings (00:11:55)
ICQuant: Index Coding enables Low-bit LLM Quantization (00:13:01)
Shirley Ho - Building a Polymathic Foundation Model for Science (00:52:13)
Nicholas Carlini - Are LLMs worth it? (00:59:05)
Auxiliary task demands mask the capabilities of smaller language models (00:12:38)
Hannaneh Hajishirzi - OLMo: Accelerating the Science of Language Modeling (COLM) (00:51:03)
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens (00:11:47)
Multilinguality and LLMs Special Session (00:56:35)
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks (00:11:47)
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models (00:11:25)
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation (00:08:58)
Tuning Language Models by Proxy (00:12:41)
Transformer Circuit Evaluation Metrics Are Not Robust (00:13:29)