Adaptive Layer-skipping in Pre-trained LLMs (00:11:11)
FineWeb2: One Pipeline to Scale Them All — Adapting Pre-Training Data Processing to Every Language (00:13:58)
Hidden in plain sight: VLMs overlook their visual representations (00:13:46)
Understanding R1-Zero-Like Training: A Critical Perspective (00:11:44)
Luke Zettlemoyer - Mixed-modal Language Modeling (01:00:30)
Tom Griffiths - Mapping the Jagged Edges of AI with Cognitive Science (00:57:53)
CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation (00:12:27)
Don't lie to your friends: Learning what you know from collaborative self-play (00:15:31)
PrefPalette: Personalized Preference Modeling with Latent Attributes (00:16:56)
Fluid Language Model Benchmarking (00:15:39)
Gillian Hadfield - Alignment is social: lessons from human alignment for AI (00:58:16)
Readability ≠ Learnability: Rethinking the Role of Simplicity in Training Small Language Models (00:13:04)
Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs (00:12:42)
MixAssist: An Audio-Language Dataset for Co-Creative AI Assistance in Music Mixing (00:12:45)
Language models align with brain regions that represent concepts across modalities (00:11:05)
Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective (00:15:25)
Single-Pass Document Scanning for Question Answering (00:13:24)
Shared Global and Local Geometry of Language Model Embeddings (00:11:55)
ICQuant: Index Coding enables Low-bit LLM Quantization (00:13:01)
Shirley Ho - Building a Polymathic Foundation Model for Science (00:52:13)
Nicholas Carlini - Are LLMs worth it? (00:59:05)
Auxiliary task demands mask the capabilities of smaller language models (00:12:38)
Hannaneh Hajishirzi - OLMo: Accelerating the Science of Language Modeling (COLM) (00:51:03)
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens (00:11:47)
Multilinguality and LLMs Special Session (00:56:35)
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks (00:11:47)
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models (00:11:25)
Self-Taught Optimizer (STOP): Recursively Self-Improving Code Generation (00:08:58)
Tuning Language Models by Proxy (00:12:41)
Transformer Circuit Evaluation Metrics Are Not Robust (00:13:29)