Commit 0a0ee32

Merge pull request #470 from jermspeaks/content/weekly-w24-and-rag
Week 24 weekly notes and post about RAG
2 parents cdde2a2 + afe13b3 commit 0a0ee32

File tree

2 files changed

+189
-0
lines changed

src/content/writing/2024-06-18-rag.md

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
1+
---
2+
description: An explainer for Retrieval-Augmented Generation (RAG). Breaking down what it is and how people are implementing it.
3+
draft: false
4+
pubDate: "2024-06-18T23:00:00.000Z"
5+
tags: ["learning", "ai"]
6+
title: Retrieval-Augmented Generation (RAG)
7+
heroImage: https://cdn.hashnode.com/res/hashnode/image/upload/v1713375387925/9246942a-79e4-4d94-b032-a85f10480a99.png
8+
heroImageAlt: Diagram of one implementation of RAG from LangChain
9+
---
10+
11+
RAG is a framework that lets an LLM retrieve data from external sources, augment the prompt with it, and generate a response based on that added data.
12+
13+
Breaking down the acronym, RAG means the following:
14+
15+
- **R**etrieval mechanism: turn your query into a vector and run a vector search from a database that has pre-encoded documents and passages
16+
- **A**ugmentation: Combine retrieved documents with the initial prompt to create an augmented prompt
17+
- Answer **G**eneration: With the new prompt, the LLM will generate a response informed by the external knowledge contained in the retrieved texts
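
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production setup: `embed` is a toy bag-of-words stand-in for a real embedding model, `generate` is a placeholder for an actual LLM call, and `DOCS` is a made-up document list. All of these names are assumptions for the example.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Retrieval: rank documents by similarity to the query, keep the top k."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
    return ranked[:k]


def augment(query: str, passages: list[str]) -> str:
    """Augmentation: combine retrieved passages with the original question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generation: placeholder for an LLM call (hypothetical, returns a stub)."""
    return f"[LLM response to a {len(prompt)}-character prompt]"


# Hypothetical knowledge base for the example.
DOCS = [
    "Peregrine falcons nest on San Jose City Hall.",
    "RAG combines retrieval with text generation.",
    "Vector databases store embeddings for similarity search.",
]


def answer(query: str) -> str:
    """Run retrieve -> augment -> generate for one query."""
    return generate(augment(query, retrieve(query, DOCS, k=1)))
```

Calling `answer("where do falcons nest")` retrieves the falcon sentence, wraps it into the prompt, and hands that prompt to the stubbed generator. In a real system the retrieval step would query a vector database rather than re-embedding every document per query.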
18+
19+
It started with this paper: [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401?ref=cohere-ai.ghost.io).
20+
21+
This method is particularly valuable in fields like chatbot development, where the ability to provide precise answers derived from extensive databases of knowledge is crucial.
22+
23+
RAG fundamentally enhances the natural language understanding and generation capabilities of models by allowing them to access and leverage a vast amount of external knowledge. The approach is built upon the synergy between two main components: a retrieval system and a generative model. The retrieval system first identifies relevant information from a knowledge base, which the generative model then uses to craft responses that are not only accurate but also rich in detail and scope.
24+
25+
### Types of RAG
26+
27+
1. Vector-based RAG - the most common type of RAG.
28+
29+
You convert text into "embeddings" and store them in a vector database.
30+
31+
![Overview showing RAG with a Vector DB](https://writer.com/wp-content/uploads/2024/02/Image-1.png)
32+
33+
> Vector databases enable search functions that are much better than typical keyword searches. If users are looking for data that has semantic similarity, a vector database can often help them find those data points, even if there isn’t a literal keyword match
34+
>
35+
> -- Deanna Dong, [Vector databases, graph databases, and knowledge graphs - Writer](https://writer.com/blog/vector-databases-graph-databases-knowledge-graphs/)
36+
37+
The downside is that context can be lost, especially the relational context between data points. Documents are split into chunks, and retrieval ranks those chunks by how near their vectors are to the query vector. See KNN (k-nearest neighbors) and ANN (approximate nearest neighbor) search.
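
Ranking by nearness can be illustrated with an exact k-nearest-neighbors search. The sketch below scores every stored vector against the query, which is what KNN does; ANN indexes trade some accuracy to avoid that full scan. The chunk ids and three-dimensional vectors are invented for the example, and real embeddings have hundreds or thousands of dimensions.

```python
import heapq
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query_vec: list[float], index: list[tuple[str, list[float]]], k: int):
    """Exact KNN: compare the query with every entry and keep the k most similar."""
    return heapq.nlargest(
        k, index, key=lambda item: cosine_similarity(query_vec, item[1])
    )


# Hypothetical index of (chunk id, embedding) pairs.
INDEX = [
    ("chunk-a", [1.0, 0.0, 0.0]),
    ("chunk-b", [0.9, 0.1, 0.0]),
    ("chunk-c", [0.0, 1.0, 0.0]),
    ("chunk-d", [0.0, 0.0, 1.0]),
]

neighbors = knn([1.0, 0.0, 0.1], INDEX, k=2)
neighbor_ids = [chunk_id for chunk_id, _ in neighbors]
```

Here `neighbor_ids` holds the two chunks pointing in nearly the same direction as the query. Nothing in the result says how those chunks relate to each other, which is the lost relational context mentioned above.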
38+
39+
2. Graph-based RAG
40+
41+
Instead of using a vector database, you use a graph database. A graph DB stores data in its links as well as in its nodes, which lets retrieval return relational information.
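
As a rough sketch of the idea, the snippet below keeps labeled links in memory and retrieves every relationship within a few hops of a starting entity. The `Graph` class, its method names, and the sample facts are all assumptions made for illustration; an actual graph database would expose its own query language instead.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory graph where each link carries a relation label."""

    def __init__(self) -> None:
        # node -> list of (relation, target node)
        self.edges: dict[str, list[tuple[str, str]]] = defaultdict(list)

    def add_edge(self, source: str, relation: str, target: str) -> None:
        self.edges[source].append((relation, target))

    def neighborhood(self, start: str, depth: int = 1) -> list[tuple[str, str, str]]:
        """Breadth-first walk returning (source, relation, target) triples."""
        triples = []
        frontier = [start]
        seen = {start}
        for _ in range(depth):
            next_frontier = []
            for node in frontier:
                for relation, target in self.edges.get(node, []):
                    triples.append((node, relation, target))
                    if target not in seen:
                        seen.add(target)
                        next_frontier.append(target)
            frontier = next_frontier
        return triples


# Hypothetical facts for the example.
graph = Graph()
graph.add_edge("RAG", "uses", "retriever")
graph.add_edge("RAG", "uses", "generator")
graph.add_edge("retriever", "queries", "vector database")

facts = graph.neighborhood("RAG", depth=2)
context = "; ".join(f"{s} {r} {t}" for s, r, t in facts)
```

The resulting `context` string spells out the relationships explicitly, so it can be added to a prompt in the same way as retrieved text chunks.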
42+
43+
![Example showing relationships in a Graph DB](https://writer.com/wp-content/uploads/2024/02/Image-2.png)
44+
45+
Sam Julien on X Thread - [What’s graph-based RAG (retrieval-augmented generation) and why should you care?](https://twitter.com/samjulien/status/1801634334723432462)
46+
47+
![Using RAG with a Graph DB](https://pbs.twimg.com/media/GQCw4OHXwAA0_uR?format=jpg&name=large)
48+
49+
3. Knowledge graphs
50+
51+
These can outperform vector and graph databases because they preserve semantic relationships and encode structural information.
52+
53+
### Relevant sources for follow-up
54+
55+
- [Five Reasons Enterprises Are Choosing RAG](https://cohere.com/blog/five-reasons-enterprises-are-choosing-rag)
56+
- LangChain GitHub for [RAG from Scratch](https://github.com/langchain-ai/rag-from-scratch)
57+
- Akshay 🚀 on X thread - [RAGs, clearly explained](https://twitter.com/akshay_pachaar/status/1791446077696266334)
58+
59+
<iframe
60+
class="aspect-video w-full my-2"
61+
src="https://www.youtube.com/embed/T-D1OfcDW1M"
62+
title="What is Retrieval-Augmented Generation (RAG)?"
63+
frameborder="0"
64+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
65+
allowfullscreen></iframe>
66+
67+
<iframe
68+
class="aspect-video w-full my-2"
69+
src="https://www.youtube.com/embed/rhZgXNdhWDY"
70+
title="Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)"
71+
frameborder="0"
72+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
73+
allowfullscreen></iframe>
Lines changed: 116 additions & 0 deletions
@@ -0,0 +1,116 @@
1+
---
2+
description: "Keeping a dev journal, book recommendations, how computers reduce efficiency, introduction to jhanas, some podcast notes about AI companies, and how do dogs see color."
3+
draft: false
4+
pubDate: "2024-06-18T23:30:00.000Z"
5+
tags: ["weekly", "reflection"]
6+
title: 2024 Week 24 - Weekly Notes
7+
heroImage: https://images.unsplash.com/photo-1506485338023-6ce5f36692df?ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D&auto=format&fit=crop&w=2370&q=80
8+
heroImageAlt: Unsplash image from Jazmin Quaynor showing a weekly calendar
9+
---
10+
11+
### Reflections
12+
13+
- Used a `jscodeshift` codemod, which ChatGPT helped me create, to migrate `defaultProps` usage for React 19
14+
- I skimmed through "[The Definitive Guide to Google Vertex AI](https://www.packtpub.com/en-us/product/the-definitive-guide-to-google-vertex-ai-9781801815260)" from Packt Publishing by Jasmeet Bhatia and Kartik Chaudhary. The things I learned about RAG were added to my [new blog post](/blog/2024-06-18-rag/)
15+
- I met some bird watchers at SJSU this week. Looks like those [City Hall Peregrine Falcons](https://www.sanjoseca.gov/news-stories/city-hall-falcons) are back.
16+
17+
### Dev and Tech-y Tech
18+
19+
- Stack Overflow Blog - [You should keep a developer’s journal](https://stackoverflow.blog/2024/05/22/you-should-keep-a-developer-s-journal/?utm_source=tldrwebdev)
20+
- I use my Obsidian Daily Notes. Each of those daily notes is reviewed on a weekly basis (hello Weekly Notes) and may be added to its own Obsidian note. Sometimes they get added as [Streams](/curation/stream).
21+
- GitHub - [Payments 101 for a Developer](https://github.com/juspay/hyperswitch/wiki/Payments-101-for-a-Developer?utm_source=tldrnewsletter)
22+
- Ben Kuhn - [Essays on programming I think about a lot](https://www.benkuhn.net/progessays/?utm_source=tldrwebdev)
23+
- This sounds like something good to write about instead of my lindy library / timeless treasure trove
24+
- [Inside Bluesky’s Engineering Culture](https://newsletter.pragmaticengineer.com/p/bluesky-engineering-culture)
25+
- [ScyllaDB | Monstrously Fast + Scalable NoSQL](https://www.scylladb.com/)
26+
- [Expo Documentation](https://docs.expo.dev/)
27+
- Fortune - [How Amazon blew Alexa’s shot to dominate AI, according to employees who worked on it](https://fortune.com/2024/06/12/amazon-insiders-why-new-alexa-llm-generative-ai-conversational-chatbot-missing-in-action/?utm_source=tldrnewsletter)
28+
29+
### Recommendations
30+
31+
- Book: Frostbite: How Refrigeration Changed Our Food, Our Planet, and Ourselves. By Nicola Twilley. [Amazon](https://www.amazon.com/Frostbite-Refrigeration-Changed-Planet-Ourselves/dp/0735223289)
32+
> An engaging and far-reaching exploration of refrigeration, tracing its evolution from scientific mystery to globe-spanning infrastructure, and an essential investigation into how it has remade our entire relationship with food—for better and for worse
33+
- Anime: [Frieren: Beyond Journey's End](https://www.crunchyroll.com/series/GG5H5XQX4/frieren-beyond-journeys-end)
34+
- Book: [Invitation to a Banquet: The Story of Chinese Food](http://www.fuchsiadunlop.com/books/invitation-to-a-banquet-the-story-of-chinese-food/) by Fuchsia Dunlop
35+
- Book: Simple Marketing For Smart People: The One Question You Need to Win Customers without Gimmicks, Hype, or Hard Selling. By Billy Broas and Tiago Forte. [Amazon](https://www.amazon.com/Simple-Marketing-Smart-People-Customers/dp/B0D66H733H/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr=&ck_subscriber_id=1900202893&utm_source=convertkit&utm_medium=email&utm_campaign=The%20Ultimate%20Tool%20for%20Thought)
36+
- I started this book this week
37+
- Book: How to Baby: A No-Advice-Given Guide to Motherhood. By Liana Finck. [Amazon](https://www.amazon.com/How-Baby-No-Advice-Given-Motherhood-Drawings-ebook/dp/B0CC1H99QG)
38+
- Blog: [Interconnected](https://interconnected.org/home/) a blog by Matt Webb
39+
40+
### Science
41+
42+
- Quanta Magazine - [Most Life on Earth is Dormant, After Pulling an ‘Emergency Brake’](https://www.quantamagazine.org/most-life-on-earth-is-dormant-after-pulling-an-emergency-brake-20240605/?utm_source=tldrnewsletter)
43+
- Locklin on science - [Computers reduce efficiency: Case Studies of the Solow Paradox](https://scottlocklin.wordpress.com/2023/11/21/computers-reduce-efficiency-case-studies-of-the-solow-paradox/)
44+
- Associated [HN thread](https://news.ycombinator.com/item?id=40233938)
45+
- [Generalizing to Unseen Domains in Diabetic Retinopathy with Disentangled Representations](https://arxiv.org/abs/2406.06384v1?utm_source=tldrai)
46+
47+
### Obits
48+
49+
- [Françoise Hardy, Moody French Pop Star, Dies at 80](https://www.nytimes.com/2024/06/12/arts/music/francoise-hardy-dead.html?campaign_id=9&emc=edit_nn_20240614&instance_id=126227&nl=the-morning&regi_id=197092347&segment_id=169560&te=1&user_id=53888c42b17ce2b613ad43a8e73d64ef)
50+
- [Tous les garçons et les filles](https://www.youtube.com/watch?v=XPkBMqehr5k)
51+
52+
### Podcast Notes
53+
54+
- [Cohere CEO Aidan Gomez sees AI’s pathway to profitability - The Verge](https://www.theverge.com/24173858/ai-cohere-aidan-gomez-money-revenue-llm-transformers-enterprise-stochastic-parrot)
55+
- [Attention is all you need | Proceedings of the 31st International Conference on Neural Information Processing Systems](https://dl.acm.org/doi/10.5555/3295222.3295349) - Aidan is one of the authors
56+
- [On the Dangers of Stochastic Parrots | Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency](https://dl.acm.org/doi/10.1145/3442188.3445922)
57+
- Aidan doesn't think this danger is bad.
58+
> That’s more than just parroting back what you’ve already seen. I think that these models don’t just parrot back what they’ve seen. I think that they’re able to extrapolate beyond what we’ve shown them, to recognize patterns in the data and apply those patterns to new inputs that they’ve never seen before. Definitively, at this stage, we can say we’re past the stochastic parrot hypothesis.
59+
- Stochastic Parrots hypothesis
60+
> The claim of that paper is that these [models] are just repeating words back at us, and there isn’t some deeper intelligence. And actually, by repeating things back to us, they will express the bias that the things are trained on.
61+
- What does **stochastic** mean in AI?
62+
- In AI and machine learning, "stochastic" refers to a variable process where the outcome involves some randomness and has some uncertainty. It is a mathematical term closely related to "randomness" and "probabilistic" and can be contrasted to the idea of "deterministic." Stochastic processes and algorithms make use of randomness during optimization and learning, which allows them to avoid getting stuck and achieve results that deterministic algorithms cannot.
63+
- [How AI is eating Finance — with Mike Conover of Brightwave](https://www.latent.space/p/brightwave)
64+
- Databricks: Dolly
65+
- [Free Dolly: Introducing the World's First Open and Commercially Viable Instruction-Tuned LLM - The Databricks Blog](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm)
66+
- [AI for the Future of Financial Research | Brightwave](https://www.brightwave.io/)
67+
- Brightwave shared some tips on leveraging LLMs as Judges:
68+
- ![The judge model](https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd863aad4-e0a0-4e8e-9685-af3e9b91c9b0_1118x860.png)
69+
- **Human vs LLM reviews**: while they work with human annotators to create high quality datasets, that data isn’t just used to fine tune models but also as a reference basis for future LLM reviews. _Having a set of trusted data to use as calibration helps you trust the LLM judgement even more._
70+
- **Ensemble consistency checking**: rather than using an LLM as judge for one output, _you use different LLMs to generate a result for the same task, and then use another LLM to highlight where those generations differ._ Do the two outputs differ meaningfully? Do they have different beliefs about the implications of something? If there are a lot of discrepancies between generations coming from different models, you then do additional passes to try and resolve them.
71+
- **Entailment verification**: for each unique insight that they generate, they take the output and separately ask LLMs to verify factuality of information based on the original sources. In the actual product, user can then highlight any piece of text and ask it to 1) “Tell Me More” 2) “Show Sources”. _Since there’s no way to guarantee factuality of 100% of outputs, and humans have good intuition for things that look out of the ordinary, giving the user access to the review tool helps them build trust in it._
72+
> It’s been clear in the last year that the half-life of a model is much shorter than the half-life of a dataset
73+
74+
### Other Things
75+
76+
- [USA Cricket stuns Pakistan in World Cup T20 upset](https://sports.yahoo.com/usa-cricket-stuns-pakistan-in-world-cup-t20-upset-204701050.html)
77+
- As someone who's worked with Indians and Pakistanis, this sounds disruptive
78+
- Food Dive - [Chobani founder and CEO buys Anchor Brewing](https://www.fooddive.com/news/chobani-founder-hamdi-ulukaya-buys-anchor-brewing/717726/)
79+
- I used to walk by the Anchor Brewing building all of the time. This is refreshing to hear
80+
- [Bike Index - Bike registration that works](https://bikeindex.org/)
81+
- I will want to register my bike on here in case it gets stolen
82+
- [The Microsoft Excel superstars throw down in Vegas](https://www.theverge.com/c/24133822/microsoft-excel-spreadsheet-competition-championship)
83+
- The New York Times - [Why 1999 Was Hollywood’s Greatest Year](https://www.nytimes.com/2019/05/31/books/review/hollywoods-greatest-year-brian-raftery.html#commentsContainer)
84+
- Nadia Asparouhova - [How to do the jhanas](https://nadia.xyz/jhanas?utm_source=tldrnewsletter)
85+
86+
### Notable Videos
87+
88+
Fireship - When your serverless computing bill goes parabolic
89+
90+
<iframe
91+
class="aspect-video w-full my-2"
92+
src="https://www.youtube.com/embed/SCIfWhAheVw"
93+
title="When your serverless computing bill goes parabolic"
94+
frameborder="0"
95+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
96+
allowfullscreen></iframe>
97+
98+
Will Larson - How should you adopt LLMs in your product?
99+
100+
<iframe
101+
class="aspect-video w-full my-2"
102+
src="https://www.youtube.com/embed/EVPY9koFceU"
103+
title="How should you adopt LLMs in your product?"
104+
frameborder="0"
105+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
106+
allowfullscreen></iframe>
107+
108+
Howtown - What your dog sees (w/ Cleo Abram)
109+
110+
<iframe
111+
class="aspect-video w-full my-2"
112+
src="https://www.youtube.com/embed/EJXG-5mZfJM"
113+
title="What your dog sees (w/ Cleo Abram)"
114+
frameborder="0"
115+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
116+
allowfullscreen></iframe>
