Merge pull request #470 from jermspeaks/content/weekly-w24-and-rag

jermspeaks · web-flow · commit 0a0ee3213536 · 2024-06-18T15:15:42.000-07:00
Week 24 weekly notes and post about RAG
diff --git a/src/content/writing/2024-06-18-rag.md b/src/content/writing/2024-06-18-rag.md
@@ -0,0 +1,73 @@
+---
+description: An explainer for Retrieval-Augmented Generation (RAG). Breaking down what it is and how people are implementing it.
+draft: false
+pubDate: "2024-06-18T23:00:00.000Z"
+tags: ["learning", "ai"]
+title: Retrieval-Augmented Generation (RAG)
+heroImage: https://cdn.hashnode.com/res/hashnode/image/upload/v1713375387925/9246942a-79e4-4d94-b032-a85f10480a99.png
+heroImageAlt: Diagram of one implementation of RAG from LangChain
+---
+
+RAG is a framework that helps LLMs retrieve information from other datasets, augment the prompt with it, and generate a response based on the added data source.
+
+Breaking down the acronym, RAG means the following:
+
+- **R**etrieval mechanism: Turn your query into a vector and run a vector search against a database of pre-encoded documents and passages
+- **A**ugmentation: Combine retrieved documents with the initial prompt to create an augmented prompt
+- Answer **G**eneration: With the new prompt, the LLM will generate a response informed by the external knowledge contained in the retrieved texts
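+
+The three steps above can be sketched in a few lines of Python. This is a toy illustration, not how any particular library does it: `DOCUMENTS`, `embed`, `retrieve`, and `augment` are hypothetical names, and the "embedding" is just a word count standing in for a real embedding model. The generation step is left out, since it depends on which LLM you call.
+
+```python
+import math
+import re
+from collections import Counter
+
+# Hypothetical toy corpus; a real system would pre-encode far more passages.
+DOCUMENTS = [
+    "Peregrine falcons nest on tall buildings in cities.",
+    "Vector databases store embeddings for similarity search.",
+    "Refrigeration changed how food is shipped and stored.",
+]
+
+def embed(text: str) -> Counter:
+    """Toy stand-in for an embedding model: lowercase word counts."""
+    return Counter(re.findall(r"[a-z]+", text.lower()))
+
+def cosine(a: Counter, b: Counter) -> float:
+    """Cosine similarity between two sparse count vectors."""
+    dot = sum(a[w] * b[w] for w in a)
+    norm_a = math.sqrt(sum(v * v for v in a.values()))
+    norm_b = math.sqrt(sum(v * v for v in b.values()))
+    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
+
+def retrieve(query: str, k: int = 1) -> list[str]:
+    """Retrieval: rank the documents by similarity to the query vector."""
+    q = embed(query)
+    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
+    return ranked[:k]
+
+def augment(query: str, passages: list[str]) -> str:
+    """Augmentation: combine retrieved passages with the original prompt."""
+    context = "\n".join(f"- {p}" for p in passages)
+    return f"Context:\n{context}\n\nQuestion: {query}"
+
+question = "Where do falcons nest?"
+prompt = augment(question, retrieve(question))
+# Generation: `prompt` would now be sent to whichever LLM you use.
+print(prompt)
+```
+
+In a real setup the documents would be embedded once, ahead of time, and stored in a vector database instead of being re-embedded on every query.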
+
+It started with this paper: [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://arxiv.org/abs/2005.11401?ref=cohere-ai.ghost.io).
+
+This method is particularly valuable in fields like chatbot development, where the ability to provide precise answers derived from extensive databases of knowledge is crucial.
+
+RAG fundamentally enhances the natural language understanding and generation capabilities of models by allowing them to access and leverage a vast amount of external knowledge. The approach is built upon the synergy between two main components: a retrieval system and a generative model. The retrieval system first identifies relevant information from a knowledge base, which the generative model then uses to craft responses that are not only accurate but also rich in detail and scope.
+
+### Types of RAG
+
+1. Vector-based RAG - the most common type of RAG.
+
+You convert text into "embeddings" and store them in a vector database.
+
+![Overview showing RAG with a Vector DB](https://writer.com/wp-content/uploads/2024/02/Image-1.png)
+
+> Vector databases enable search functions that are much better than typical keyword searches. If users are looking for data that has semantic similarity, a vector database can often help them find those data points, even if there isn’t a literal keyword match
+>
+> -- Deanna Dong, [Vector databases, graph databases, and knowledge graphs - Writer](https://writer.com/blog/vector-databases-graph-databases-knowledge-graphs/)
+
+The downside is that context can be lost, especially the relational context between data points. Text chunks are stored as vectors, and retrieval ranks them by similarity, meaning how near they are to the query vector. See KNN (k-nearest neighbors) and ANN (approximate nearest neighbors).
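+
+To make "nearness" concrete, here is a minimal sketch of exact KNN over a handful of made-up vectors. The `CHUNKS` data and the `knn` helper are hypothetical, and the 2-D vectors are only for readability (real embeddings have hundreds of dimensions). Exact KNN compares the query against every stored vector; ANN indexes skip most of those comparisons and accept slightly less exact results in return for speed.
+
+```python
+import heapq
+import math
+
+# Hypothetical 2-D "embeddings" keyed by the chunk of text they represent.
+CHUNKS = {
+    "falcons nest on city hall": (0.9, 0.1),
+    "bird watchers at the university": (0.8, 0.3),
+    "serverless billing surprises": (0.1, 0.9),
+}
+
+def knn(query: tuple[float, float], k: int) -> list[str]:
+    """Exact KNN: return the k chunks whose vectors are closest to the query."""
+    return heapq.nsmallest(
+        k, CHUNKS, key=lambda text: math.dist(query, CHUNKS[text])
+    )
+
+print(knn((1.0, 0.0), k=2))
+```
+
+Notice that the result is only a list of nearby chunks. Nothing in it says how those chunks relate to each other, which is the lost context described above.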
+
+2. Graph-based RAG
+
+Instead of using a vector database, you use a graph database. A graph DB can hold the same vector information, but the links between data points also store data. This lets retrieval return relational information along with the matching data points.
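+
+As a rough sketch of the idea, assume the graph is a list of edges where each link carries a relation label. The `EDGES` data and the `related_facts` helper are hypothetical, and a real graph database would use its own query language instead of a Python loop.
+
+```python
+# Hypothetical graph: each edge stores a relation label alongside the link.
+EDGES = [
+    ("peregrine falcon", "nests on", "San Jose City Hall"),
+    ("San Jose City Hall", "is located in", "San Jose"),
+    ("bird watchers", "observe", "peregrine falcon"),
+]
+
+def related_facts(entity: str, hops: int = 1) -> list[str]:
+    """Collect facts reachable from `entity` by following outgoing edges."""
+    facts: list[str] = []
+    frontier = {entity}
+    for _ in range(hops):
+        next_frontier = set()
+        for source, relation, target in EDGES:
+            if source in frontier:
+                facts.append(f"{source} {relation} {target}")
+                next_frontier.add(target)
+        frontier = next_frontier
+    return facts
+
+print(related_facts("peregrine falcon", hops=2))
+```
+
+The retrieved facts keep the relationship between the data points, so the augmented prompt can include how things are connected and not just which passages look similar.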
+
+![Example showing relationships in a Graph DB](https://writer.com/wp-content/uploads/2024/02/Image-2.png)
+
+Sam Julien on X Thread - [What’s graph-based RAG (retrieval-augmented generation) and why should you care?](https://twitter.com/samjulien/status/1801634334723432462)
+
+![Using RAG with a Graph DB](https://pbs.twimg.com/media/GQCw4OHXwAA0_uR?format=jpg&name=large)
+
+3. Knowledge graphs
+
+These can outperform plain vector and graph databases because they preserve semantic relationships and encode structural information.
+
+### Relevant sources for follow-up
+
+- [Five Reasons Enterprises Are Choosing RAG](https://cohere.com/blog/five-reasons-enterprises-are-choosing-rag)
+- LangChain GitHub for [RAG from Scratch](https://github.com/langchain-ai/rag-from-scratch)
+- Akshay 🚀 on X thread - [RAGs, clearly explained](https://twitter.com/akshay_pachaar/status/1791446077696266334)
+
+<iframe
+  class="aspect-video w-full my-2"
+  src="https://www.youtube.com/embed/T-D1OfcDW1M"
+  title="What is Retrieval-Augmented Generation (RAG)?"
+  frameborder="0"
+  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
+  allowfullscreen></iframe>
+
+<iframe
+  class="aspect-video w-full my-2"
+  src="https://www.youtube.com/embed/rhZgXNdhWDY"
+  title="Retrieval Augmented Generation (RAG) Explained: Embedding, Sentence BERT, Vector Database (HNSW)"
+  frameborder="0"
+  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
+  allowfullscreen></iframe>
diff --git a/src/content/writing/2024-06-18-w24-weekly-notes.md b/src/content/writing/2024-06-18-w24-weekly-notes.md
@@ -0,0 +1,116 @@
+---
+description: "Keeping a dev journal, book recommendations, how computers reduce efficiency, introduction to jhanas, some podcast notes about AI companies, and how dogs see color."
+draft: false
+pubDate: "2024-06-18T23:30:00.000Z"
+tags: ["weekly", "reflection"]
+title: 2024 Week 24 - Weekly Notes
+heroImage: https://images.unsplash.com/photo-1506485338023-6ce5f36692df?ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D&auto=format&fit=crop&w=2370&q=80
+heroImageAlt: Unsplash image from Jazmin Quaynor showing a weekly calendar
+---
+
+### Reflections
+
+- Used a `jscodeshift` codemod that ChatGPT helped me write to migrate away from `defaultProps` for React 19
+- I skimmed through "[The Definitive Guide to Google Vertex AI](https://www.packtpub.com/en-us/product/the-definitive-guide-to-google-vertex-ai-9781801815260)" from Packt Publishing by Jasmeet Bhatia and Kartik Chaudhary. The things I learned about RAG were added to my [new blog post](/blog/2024-06-18-rag/)
+- I met some bird watchers at SJSU this week. Looks like those [City Hall Peregrine Falcons](https://www.sanjoseca.gov/news-stories/city-hall-falcons) are back.
+
+### Dev and Tech-y Tech
+
+- Stack Overflow Blog - [You should keep a developer’s journal](https://stackoverflow.blog/2024/05/22/you-should-keep-a-developer-s-journal/?utm_source=tldrwebdev)
+  - I use my Obsidian Daily Notes. Each of those daily notes is reviewed on a weekly basis (hello Weekly Notes) and may be added to its own Obsidian note. Sometimes they get added as [Streams](/curation/stream).
+- GitHub - [Payments 101 for a Developer](https://github.com/juspay/hyperswitch/wiki/Payments-101-for-a-Developer?utm_source=tldrnewsletter)
+- Ben Kuhn - [Essays on programming I think about a lot](https://www.benkuhn.net/progessays/?utm_source=tldrwebdev)
+  - This sounds like something good to write about instead of my lindy library / timeless treasure trove
+- [Inside Bluesky’s Engineering Culture](https://newsletter.pragmaticengineer.com/p/bluesky-engineering-culture)
+  - [ScyllaDB | Monstrously Fast + Scalable NoSQL](https://www.scylladb.com/)
+  - [Expo Documentation](https://docs.expo.dev/)
+- Fortune - [How Amazon blew Alexa’s shot to dominate AI, according to employees who worked on it](https://fortune.com/2024/06/12/amazon-insiders-why-new-alexa-llm-generative-ai-conversational-chatbot-missing-in-action/?utm_source=tldrnewsletter)
+
+### Recommendations
+
+- Book: Frostbite: How Refrigeration Changed Our Food, Our Planet, and Ourselves. By Nicola Twilley. [Amazon](https://www.amazon.com/Frostbite-Refrigeration-Changed-Planet-Ourselves/dp/0735223289)
+  > An engaging and far-reaching exploration of refrigeration, tracing its evolution from scientific mystery to globe-spanning infrastructure, and an essential investigation into how it has remade our entire relationship with food—for better and for worse
+- Anime: [Frieren: Beyond Journey's End](https://www.crunchyroll.com/series/GG5H5XQX4/frieren-beyond-journeys-end)
+- Book: [Invitation to a Banquet: The Story of Chinese Food](http://www.fuchsiadunlop.com/books/invitation-to-a-banquet-the-story-of-chinese-food/) by Fuchsia Dunlop
+- Book: Simple Marketing For Smart People: The One Question You Need to Win Customers without Gimmicks, Hype, or Hard Selling. By Billy Broas and Tiago Forte. [Amazon](https://www.amazon.com/Simple-Marketing-Smart-People-Customers/dp/B0D66H733H/ref=tmm_pap_swatch_0?_encoding=UTF8&qid=&sr=&ck_subscriber_id=1900202893&utm_source=convertkit&utm_medium=email&utm_campaign=The%20Ultimate%20Tool%20for%20Thought)
+  - I started this book this week
+- Book: How to Baby: A No-Advice-Given Guide to Motherhood. By Liana Finck. [Amazon](https://www.amazon.com/How-Baby-No-Advice-Given-Motherhood-Drawings-ebook/dp/B0CC1H99QG)
+- Blog: [Interconnected](https://interconnected.org/home/) a blog by Matt Webb
+
+### Science
+
+- Quanta Magazine - [Most Life on Earth is Dormant, After Pulling an ‘Emergency Brake’](https://www.quantamagazine.org/most-life-on-earth-is-dormant-after-pulling-an-emergency-brake-20240605/?utm_source=tldrnewsletter)
+- Locklin on science - [Computers reduce efficiency: Case Studies of the Solow Paradox](https://scottlocklin.wordpress.com/2023/11/21/computers-reduce-efficiency-case-studies-of-the-solow-paradox/)
+  - Associated [HN thread](https://news.ycombinator.com/item?id=40233938)
+- [Generalizing to Unseen Domains in Diabetic Retinopathy with Disentangled Representations](https://arxiv.org/abs/2406.06384v1?utm_source=tldrai)
+
+### Obits
+
+- [Françoise Hardy, Moody French Pop Star, Dies at 80](https://www.nytimes.com/2024/06/12/arts/music/francoise-hardy-dead.html?campaign_id=9&emc=edit_nn_20240614&instance_id=126227&nl=the-morning&regi_id=197092347&segment_id=169560&te=1&user_id=53888c42b17ce2b613ad43a8e73d64ef)
+  - [Tous les garçons et les filles](https://www.youtube.com/watch?v=XPkBMqehr5k)
+
+### Podcast Notes
+
+- [Cohere CEO Aidan Gomez sees AI’s pathway to profitability - The Verge](https://www.theverge.com/24173858/ai-cohere-aidan-gomez-money-revenue-llm-transformers-enterprise-stochastic-parrot)
+  - [Attention is all you need | Proceedings of the 31st International Conference on Neural Information Processing Systems](https://dl.acm.org/doi/10.5555/3295222.3295349) - Aidan is one of the authors
+  - [On the Dangers of Stochastic Parrots | Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency](https://dl.acm.org/doi/10.1145/3442188.3445922)
+    - Aidan doesn't agree with the stochastic parrot framing.
+      > That’s more than just parroting back what you’ve already seen. I think that these models don’t just parrot back what they’ve seen. I think that they’re able to extrapolate beyond what we’ve shown them, to recognize patterns in the data and apply those patterns to new inputs that they’ve never seen before. Definitively, at this stage, we can say we’re past the stochastic parrot hypothesis.
+    - Stochastic Parrots hypothesis
+      > The claim of that paper is that these [models] are just repeating words back at us, and there isn’t some deeper intelligence. And actually, by repeating things back to us, they will express the bias that the things are trained on.
+    - What does **stochastic** mean in AI?
+      - In AI and machine learning, "stochastic" refers to a variable process where the outcome involves some randomness and has some uncertainty. It is a mathematical term closely related to "randomness" and "probabilistic" and can be contrasted to the idea of "deterministic." Stochastic processes and algorithms make use of randomness during optimization and learning, which allows them to avoid getting stuck and achieve results that deterministic algorithms cannot.
+- [How AI is eating Finance — with Mike Conover of Brightwave](https://www.latent.space/p/brightwave)
+  - Databricks: Dolly
+    - [Free Dolly: Introducing the World's First Open and Commercially Viable Instruction-Tuned LLM - The Databricks Blog](https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm)
+  - [AI for the Future of Financial Research | Brightwave](https://www.brightwave.io/)
+  - Brightwave shared some tips on leveraging LLMs as Judges:
+    - ![The judge model](https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd863aad4-e0a0-4e8e-9685-af3e9b91c9b0_1118x860.png)
+  - **Human vs LLM reviews**: while they work with human annotators to create high quality datasets, that data isn’t just used to fine tune models but also as a reference basis for future LLM reviews. _Having a set of trusted data to use as calibration helps you trust the LLM judgement even more._
+  - **Ensemble consistency checking**: rather than using an LLM as judge for one output, _you use different LLMs to generate a result for the same task, and then use another LLM to highlight where those generations differ._ Do the two outputs differ meaningfully? Do they have different beliefs about the implications of something? If there are a lot of discrepancies between generations coming from different models, you then do additional passes to try and resolve them.
+  - **Entailment verification**: for each unique insight that they generate, they take the output and separately ask LLMs to verify factuality of information based on the original sources. In the actual product, user can then highlight any piece of text and ask it to 1) “Tell Me More” 2) “Show Sources”. _Since there’s no way to guarantee factuality of 100% of outputs, and humans have good intuition for things that look out of the ordinary, giving the user access to the review tool helps them build trust in it._
+    > It’s been clear in the last year that the half-life of a model is much shorter than the half-life of a dataset
+
+### Other Things
+
+- [USA Cricket stuns Pakistan in World Cup T20 upset](https://sports.yahoo.com/usa-cricket-stuns-pakistan-in-world-cup-t20-upset-204701050.html)
+  - As someone who's worked with Indians and Pakistanis, this sounds disruptive
+- Food Dive - [Chobani founder and CEO buys Anchor Brewing](https://www.fooddive.com/news/chobani-founder-hamdi-ulukaya-buys-anchor-brewing/717726/)
+  - I used to walk by the Anchor Brewing building all of the time. This is refreshing to hear
+- [Bike Index - Bike registration that works](https://bikeindex.org/)
+  - I will want to register my bike on here in case it gets stolen
+- [The Microsoft Excel superstars throw down in Vegas](https://www.theverge.com/c/24133822/microsoft-excel-spreadsheet-competition-championship)
+- The New York Times - [Why 1999 Was Hollywood’s Greatest Year](https://www.nytimes.com/2019/05/31/books/review/hollywoods-greatest-year-brian-raftery.html#commentsContainer)
+- Nadia Asparouhova - [How to do the jhanas](https://nadia.xyz/jhanas?utm_source=tldrnewsletter)
+
+### Notable Videos
+
+Fireship - When your serverless computing bill goes parabolic
+
+<iframe
+  class="aspect-video w-full my-2"
+  src="https://www.youtube.com/embed/SCIfWhAheVw"
+  title="When your serverless computing bill goes parabolic"
+  frameborder="0"
+  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
+  allowfullscreen></iframe>
+
+Will Larson - How should you adopt LLMs in your product?
+
+<iframe
+  class="aspect-video w-full my-2"
+  src="https://www.youtube.com/embed/EVPY9koFceU"
+  title="How should you adopt LLMs in your product?"
+  frameborder="0"
+  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
+  allowfullscreen></iframe>
+
+Howtown - What your dog sees (w/ Cleo Abram)
+
+<iframe
+  class="aspect-video w-full my-2"
+  src="https://www.youtube.com/embed/EJXG-5mZfJM"
+  title="What your dog sees (w/ Cleo Abram)"
+  frameborder="0"
+  allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
+  allowfullscreen></iframe>