Content provided by The Nonlinear Fund. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The Nonlinear Fund or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined at https://tr.player.fm/legal.

LW - On Llama-3 and Dwarkesh Patel's Podcast with Zuckerberg by Zvi

1:12:11
 
Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: On Llama-3 and Dwarkesh Patel's Podcast with Zuckerberg, published by Zvi on April 22, 2024 on LessWrong.

It was all quiet. Then it wasn't. Note the timestamps on both of these. Dwarkesh Patel did a podcast with Mark Zuckerberg on the 18th. It was timed to coincide with the release of much of Llama-3, very much the approach of telling your story directly. Dwarkesh is now the true tech media. A meteoric rise, and well earned.

This is two related posts in one. First I cover the podcast, then I cover Llama-3 itself. My notes are edited to incorporate context from later explorations of Llama-3, as I judged that the readability benefits exceeded the purity costs.

Podcast Notes: Llama-3 Capabilities

(1:00) They start with Llama 3 and the new L3-powered version of Meta AI. Zuckerberg says "With Llama 3, we think now that Meta AI is the most intelligent, freely-available assistant that people can use." If this means 'free as in speech' then the statement is clearly false. So I presume he means 'free as in beer.' Is that claim true? Is Meta AI now smarter than GPT-3.5, Claude 2 and Gemini Pro 1.0? As I write this it is too soon to tell. Gemini Pro 1.0 and Claude 3 Sonnet are slightly ahead of Llama-3 70B on the Arena leaderboard. But it is close. The statement seems like a claim one can make within 'reasonable hype.' Also, Meta integrates Google and Bing for real-time knowledge, so the question there is whether that process is any good, since most browser use by LLMs is not good.

(1:30) Meta are going in big on their UIs, putting it at the top of Facebook, Instagram and Messenger. That makes sense if they have a good product that is robust, and safe in the mundane sense. If it is not, this is going to be at the top of chat lists for teenagers automatically, so whoo boy. Even if it is safe, there are enough people who really do not like AI that this is probably a whoo boy anyway. Popcorn time.

(1:45) They will have the ability to animate images, and it generates high quality images as you type, updating them in real time as you add details. I can confirm this feature is cool. He promises multimodality, more 'multi-linguality' and bigger context windows.

(3:00) Now the technical stuff. Llama-3 follows tradition in training models in three sizes: here 8b and 70b, which were released on 4/18, and a 405b that is still training. He says 405b is already around 85 MMLU and they expect leading benchmarks. The 8b Llama-3 is almost as good as the 70b Llama-2.

The Need for Inference

(5:15) What went wrong earlier for Meta and how did they fix it? He highlights Reels, with its push to recommend 'unconnected content,' meaning things you did not ask for, and not having enough compute for that. They were behind. So they ordered double the GPUs they needed. They didn't realize the type of model they would want to train.

(7:30) Back in 2006, what would Zuck have sold for when he turned down $1 billion? He says he realized if he sold he'd just build another similar company, so why sell? It wasn't about the number; he wasn't in a position to evaluate the number. And I think that is actually wise there. You can realize that you do not want to accept any offer someone would actually make.

(9:15) When did making AGI become a key priority? Zuck points out Facebook AI Research (FAIR) is 10 years old as a research group. Over that time it has become clear you need AGI, he says, to support all their other products. He notes that training models on coding generalizes and helps their performance elsewhere, and that was a top focus for Llama-3. So Meta needs to solve AGI because if they don't 'their products will be lame.' It seems increasingly likely, as we will see in several ways, that Zuck does not actually believe in 'real' AGI. By 'AGI' he means somewhat more capable AI.

(13:40) What will the Llama that makes cool produ...

1652 episodes
