Replies: 1 comment
-
That's a legit question! I wrote the book to help people understand how LLMs really work. There are so many hand-wavy explanations out there, like "LLMs generate the next word," that never explain how this process actually works. In my opinion, the best way to understand these inner workings is to build them (on a small scale). I agree, this book is not meant to teach you how to use third-party libraries like Hugging Face; I don't want to write about these libraries because I think the best way to learn about them is to read their own documentation. (And a book on third-party libraries would quickly be outdated.)
-
I studied all the content of the book and tried all the code included in the main chapters. But when I started to use LLMs for practical work, such as a simple Chinese chatbot over a very narrow knowledge domain, I could hardly find any applicable Chinese GPT-2 models, even on Hugging Face. When I did find a model that could meet my needs, its usage, fine-tuning method, and scripts were far different from what is illustrated in the book.
So what's the benefit of studying this book? To learn the foundations of LLMs, as well as some terminology?