RAG - Basic RAG using llama3, langchain and chromadb

embedding

RAG

A primer on RAG

Author

fastdaima

Published

July 31, 2024

What the hell is it?

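RAG (retrieval-augmented generation) is built on an information retrieval (IR) mechanism: the retrieved documents are handed to a language model as context for generating an answer.
Information retrieval is the process of retrieving the information in a dataset that is relevant to your query.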

Input: Query and Array of documents (dataset)
Output: Top N relevant documents
RAG uses bi-encoder embeddings and cosine similarity search to retrieve the top N documents:
First, the query is converted into embeddings (floating-point representations), i.e. an array of token vectors, which are then pooled into a single vector, say Q.
Second, each document in the input array is converted into embeddings in the same way and pooled into a single vector per document; call the resulting list of document vectors Docs.
Then, the cosine similarity between Q and each document vector in Docs is computed, giving cosine_similarity_score_list, which is sorted in descending order.
Finally, the top N documents from cosine_similarity_score_list are returned as the relevant documents.
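
The retrieval steps above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: the "encoder" here is a toy one-hot token embedding with mean pooling, not a real bi-encoder model, and the helper names (tokenize, build_vocab, embed_tokens, mean_pool, retrieve) are hypothetical, not part of langchain or chromadb. A real setup would swap embed_tokens for a trained embedding model, but the pool, score, sort, and slice flow stays the same.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(texts):
    """Assign each distinct token an index (toy stand-in for a model vocabulary)."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def embed_tokens(text, vocab):
    """Toy encoder (assumption): one one-hot vector per token."""
    vectors = []
    for tok in tokenize(text):
        vec = [0.0] * len(vocab)
        vec[vocab[tok]] = 1.0
        vectors.append(vec)
    return vectors


def mean_pool(vectors, dim):
    """Pool an array of token vectors into a single vector by averaging."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_n=2):
    """Return the top_n (score, document) pairs, highest score first."""
    vocab = build_vocab(list(documents) + [query])
    dim = len(vocab)
    # Step 1: query -> token vectors -> single pooled vector Q
    q = mean_pool(embed_tokens(query, vocab), dim)
    # Step 2: each document -> token vectors -> one pooled vector per document
    docs = [mean_pool(embed_tokens(d, vocab), dim) for d in documents]
    # Step 3: score every document against Q and sort in descending order
    cosine_similarity_score_list = [
        (cosine_similarity(q, doc_vec), doc)
        for doc_vec, doc in zip(docs, documents)
    ]
    cosine_similarity_score_list.sort(key=lambda pair: pair[0], reverse=True)
    # Step 4: keep only the top N
    return cosine_similarity_score_list[:top_n]


documents = [
    "Chroma is a vector database for storing embeddings",
    "Llama 3 is a large language model",
    "Cosine similarity compares the angle between two vectors",
]
results = retrieve("which database stores embeddings", documents, top_n=2)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

With this toy encoder, similarity only reflects exact word overlap, so the Chroma sentence ranks first for the sample query because it shares "database" and "embeddings" with it. A trained bi-encoder would also pick up on meaning (e.g. "stores" vs "storing"), which is the whole point of using learned embeddings.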