A 7-minute LLM explainer (part of a Computer History Museum exhibit)
Added 2024-10-23 14:32:30 +0000 UTC
Earlier this year, the Computer History Museum in Mountain View asked if I could help put together a short visual explainer for how large language models work. At that time, I was planning to cover LLMs and transformers on 3blue1brown anyway, and I love the museum, so this was an easy yes.
The target audience is a bit different from the usual 3b1b project, in that we want to assume no technical background and minimal familiarity with machine learning.
At first, I thought this would simply be an abridged version of the more detailed explainers I was already making, but ultimately it proved to be a satisfying outlet for emphasizing the most important ideas that those more technical explainers may have glossed over.
The museum is just kicking off its LLM exhibit now, and I plan to post this to the channel this weekend. In the meantime, enjoy the early view!
Comments
I know it’s far from perfect, but what did you think of the video on attention I made? What points would you most like to see explained better?
3blue1brown
2024-11-12 04:02:40 +0000 UTC
Nice. Thank you!
Hitoshi Yamauchi
2024-10-28 22:35:32 +0000 UTC
Excellent introduction without overwhelming the consumer. Just enough to pique interest to want to learn more (the point of a museum).
Casey Miller
2024-10-26 18:02:08 +0000 UTC
I live so close to the Computer History Museum but haven’t been there before! Definitely need to check it out.
Kyle M. Kabasares
2024-10-26 17:40:04 +0000 UTC
This is truly amazing! I'm convinced, as I've been for the past year or so, that with systems like this, AI has already passed the Turing Test. However, although I think the latest AI systems are highly intelligent, I don't think they're sentient, since they still follow classical, i.e., deterministic, algorithms, so they're still effectively our slaves and they don't care about that. However, I do believe that quantum computers could become sentient, and in fact, I think quantum mechanical uncertainty is ultimately the source of consciousness, so if we ever develop quantum computers with these capabilities, we'll need to treat them with respect, as we should with our fellow human beings. What are your thoughts about all this?
David Terr
2024-10-25 02:17:32 +0000 UTC
I wonder if we’ll soon (or ever) see an LLM that can dynamically train itself, i.e., one that can continue to adjust its weights after pre-training. After all, the human brain is continuously adjusting its own synapses (esp. during sleep). Perhaps this would be too computationally expensive to be practical? But for machines that can truly learn on the fly, I think that would be a prerequisite.
Albert Farve
2024-10-24 20:19:58 +0000 UTC
That's pretty good. But I still haven't had attention explained to my satisfaction yet. I miss Martin Gardner; he would have done that.
A Patreon of the Ahts
2024-10-24 14:54:39 +0000 UTC
This should be "Part 0".
Fran Abenza
2024-10-24 00:50:40 +0000 UTC
One tiny note: near the end you refer to "the last vector in this sequence", but I don't think you had defined "vector" earlier in the video (unless I missed it); you just said "list of numbers".
Rick Rubenstein
2024-10-24 00:13:45 +0000 UTC
Very nice! I live quite near the CHM, but haven't been in in ages. I should rectify that.
Rick Rubenstein
2024-10-24 00:11:06 +0000 UTC
Brilliant! Your contribution to improving global human knowledge and understanding is unmatched. I'm so grateful for everything you've done. Thank you!
Ama
2024-10-23 22:46:23 +0000 UTC
Amazing!!! If more people understood these short seven minutes of simple, clear explanation, we would have considerably less craziness in the AI media.
Richard Hackathorn
2024-10-23 21:21:56 +0000 UTC
Wonderful. As usual.
Gabriel Bergqvist
2024-10-23 19:49:44 +0000 UTC
Brilliant (as always).
Daniel Raynaud
2024-10-23 16:47:01 +0000 UTC
Nicely done. One of the better, or the best thus far, of the explainers of LLM-based AI. As an, I don't know, unimpressed observer of "AI" based on LLMs, I cannot help but recall the many stories, rarely good, from throughout its actual history. First was fuzzy logic, which programmers made, and which was a MUCH simpler version of the pre-training (essentially, one column of numbers only). Then that begat neural networks (more columns), which begat Machine Learning (even more columns). Then, more recently, Machine Learning using the technique called "AI" begat LLMs. As a programmer for a quarter+ century, the problem I've always had with it is that nobody can explain why it pops out the wrong answer. Never have. Never will. You simply cannot guarantee it will do anything correctly. The tendency to hallucinate means you should never trust the answer. The answers are ONLY a statistically high value that "would sound good in a conversation", which is quite a bit different from "the correct answer". Which explains why various chatbots were convinced to be racists a few years ago. As they say, but that's just me.
dcy665 .
2024-10-23 16:16:57 +0000 UTC
It is just a great explanation again. THX
Gerd Hintz
2024-10-23 15:22:00 +0000 UTC
Thank you
Ahmad Alawneh
2024-10-23 15:08:07 +0000 UTC