
Content provided by Hugo Bowne-Anderson. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Hugo Bowne-Anderson or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined at https://tr.player.fm/legal.

Episode 40: What Every LLM Developer Needs to Know About GPUs

1:43:34
 

Hugo speaks with Charles Frye, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of hardware for LLMs and AI workflows, this episode is for you.

Charles and Hugo dive into the practical side of GPUs—from running inference on large models, to fine-tuning and even training from scratch. They unpack the real pain points developers face, like figuring out:

  • How much VRAM you actually need.
  • Why memory—not compute—ends up being the bottleneck.
  • How to make quick, back-of-the-envelope calculations to size up hardware for your tasks.
  • And where things like fine-tuning, quantization, and retrieval-augmented generation (RAG) fit into the mix.
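As an illustration of the kind of back-of-the-envelope sizing listed above, here is a minimal Python sketch. It is not taken from the episode: the function names are hypothetical, it counts model weights only (no KV cache, activations, or optimizer state), and the 1000 GB/s memory bandwidth is an assumed figure for an unnamed GPU.

```python
# Bytes per parameter for common weight precisions (standard sizes).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(n_params: float, precision: str = "fp16") -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9


def max_decode_tokens_per_sec(
    n_params: float, precision: str, bandwidth_gb_s: float
) -> float:
    """Rough upper bound on single-stream decode speed.

    Assumes every weight is read from GPU memory once per generated token,
    so memory bandwidth, not compute, sets the ceiling.
    """
    return bandwidth_gb_s / weight_memory_gb(n_params, precision)


if __name__ == "__main__":
    n_params = 7e9  # a hypothetical 7-billion-parameter model
    assumed_bandwidth_gb_s = 1000.0  # assumption, not a real GPU spec
    for precision in ("fp16", "int8", "int4"):
        gb = weight_memory_gb(n_params, precision)
        tps = max_decode_tokens_per_sec(n_params, precision, assumed_bandwidth_gb_s)
        print(f"{precision}: {gb:.1f} GB of weights, at most ~{tps:.0f} tokens/s")
```

Under these assumptions, quantizing the weights shrinks the VRAM footprint and raises the bandwidth-bound ceiling at the same time, which is one way to see why memory rather than compute tends to be the constraint.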

One thing Hugo really appreciates is that Charles and the Modal team recently put together the GPU Glossary, a resource that breaks down GPU internals in a way that’s actually useful for developers. We reference it a few times throughout the episode, so check it out in the show notes below.

🔧 Charles also does a demo during the episode—some of it is visual, but we talk through the key points so you’ll still get value from the audio. If you’d like to see the demo in action, check out the livestream linked below.

Hugo is teaching the "Building LLM Applications for Data Scientists and Software Engineers" course with Stefan Krawczyk (ex-StitchFix) in January. Charles is giving a guest lecture on hardware for LLMs, and Modal is giving all students $1K worth of compute credits (use the code VG25 for $200 off).

LINKS


