[Linkpost] “METR: Measuring AI Ability To Complete Long Tasks” By Zach Stein-Perlman LessWrong (Curated & Popular) podcast


Content provided by LessWrong. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by LessWrong or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined at https://tr.player.fm/legal.


29 days ago, 1:19


This is a link post. Summary: We propose measuring AI performance in terms of the length of tasks AI agents can complete. We show that this metric has increased exponentially and consistently over the past 6 years, with a doubling time of around 7 months. Extrapolating this trend predicts that, in under a decade, we will see AI agents that can independently complete a large fraction of software tasks that currently take humans days or weeks.
Full paper | GitHub repo
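
To make the arithmetic behind the summary concrete, here is a minimal Python sketch of what a fixed doubling time implies. Only the 7-month doubling time and the 6-year window come from the summary. The function names, the starting horizon of 1 hour, and the 40-hour "work week" target are illustrative assumptions, not figures from the paper.

```python
import math


def growth_factor(elapsed_months: float, doubling_months: float = 7.0) -> float:
    """Multiplicative growth after `elapsed_months` at a fixed doubling time."""
    return 2.0 ** (elapsed_months / doubling_months)


def months_to_reach(current_hours: float, target_hours: float,
                    doubling_months: float = 7.0) -> float:
    """Months until the task-length horizon grows from current to target.

    Assumes the exponential trend continues unchanged, which is the
    extrapolation the summary describes, not a guarantee.
    """
    if current_hours <= 0 or target_hours <= 0:
        raise ValueError("task lengths must be positive")
    return doubling_months * math.log2(target_hours / current_hours)


if __name__ == "__main__":
    # Growth implied over the 6-year window mentioned in the summary.
    print(f"Growth over 72 months: {growth_factor(72):.0f}x")

    # Hypothetical values: a 1-hour horizon today, a 40-hour target.
    print(f"Months from 1h to 40h: {months_to_reach(1, 40):.1f}")
```

Under these assumed inputs, going from a 1-hour to a 40-hour horizon takes a little over 37 months, roughly three years. Different starting points change the answer only by a multiple of the doubling time, since each doubling of the gap adds 7 months.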
---
First published:
March 19th, 2025
Source:
https://www.lesswrong.com/posts/deesrjitvXM4xYGZd/metr-measuring-ai-ability-to-complete-long-tasks
Linkpost URL:
https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
---
Narrated by TYPE III AUDIO.
---


497 episodes

