38.3 - Erik Jenner on Learned Look-Ahead
Lots of people in the AI safety space worry about models being able to make deliberate, multi-step plans. But can we already see this in existing neural nets? In this episode, I talk with Erik Jenner about his work looking at internal look-ahead within chess-playing neural networks.
Patreon: https://www.patreon.com/axrpodcast
Ko-fi: https://ko-fi.com/axrpodcast
The transcript: https://axrp.net/episode/2024/12/12/episode-38_3-erik-jenner-learned-look-ahead.html
FAR.AI: https://far.ai/
FAR.AI on X (aka Twitter): https://x.com/farairesearch
FAR.AI on YouTube: https://www.youtube.com/@FARAIResearch
The Alignment Workshop: https://www.alignment-workshop.com/
Topics we discuss, and timestamps:
00:57 - How chess neural nets look into the future
04:29 - The dataset and basic methodology
05:23 - Testing for branching futures?
07:57 - Which experiments demonstrate what
10:43 - How the ablation experiments work
12:38 - Effect sizes
15:23 - X-risk relevance
18:08 - Follow-up work
21:29 - How much planning does the network do?
Research we mention:
Evidence of Learned Look-Ahead in a Chess-Playing Neural Network: https://arxiv.org/abs/2406.00877
Understanding the learned look-ahead behavior of chess neural networks (a development of the follow-up research Erik mentions): https://openreview.net/forum?id=Tl8EzmgsEp
Linear Latent World Models in Simple Transformers: A Case Study on Othello-GPT: https://arxiv.org/abs/2310.07582
Episode art by Hamish Doodles: hamishdoodles.com