Semantic Search Basics

Semantic search finds info by looking at meaning, not just words. It's a key part of how Retrieval-Augmented Generation (RAG) works. Unlike older keyword search, which only matches the exact words you type, semantic search tries to capture what you're really asking for.
The core idea of semantic search is that words with similar meanings should be treated as related. For example, if you search for "car," semantic search knows that "automobile" and "vehicle" are related words. This helps it find more relevant info.
Mathematical relevance
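To do this, semantic search uses vectors. Remember, vectors are lists of numbers that represent words or phrases. Words with similar meanings have similar vectors, meaning the vectors sit close together or point in nearly the same direction. This lets computers measure how close in meaning different words or phrases are, typically with a score such as cosine similarity.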
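To make "similar vectors" concrete, here is a minimal sketch of cosine similarity, one common way to score how close two vectors are. The three vectors are made up by hand for illustration; they are not output from any real embedding model, and the names `toy_vectors` and `cosine_similarity` are just placeholders.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (an assumption for this example, not real embeddings).
toy_vectors = {
    "car": [0.9, 0.8, 0.1],
    "automobile": [0.85, 0.75, 0.15],
    "banana": [0.1, 0.2, 0.9],
}

car_vs_automobile = cosine_similarity(toy_vectors["car"], toy_vectors["automobile"])
car_vs_banana = cosine_similarity(toy_vectors["car"], toy_vectors["banana"])

print(f"car vs automobile: {car_vs_automobile:.3f}")
print(f"car vs banana:     {car_vs_banana:.3f}")
```

With these toy numbers, the first score comes out much higher than the second, which is the behavior you want from vectors that encode meaning.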
When you do a semantic search, your search terms get turned into vectors. Then, the search looks for documents or chunks of text with similar vectors. This helps find relevant info even if it doesn't use the exact words you searched for.
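The steps above can be sketched in a few lines. This is a toy, not a real system: `toy_embed` and its `CONCEPTS` table are hypothetical stand-ins for a trained embedding model, written by hand so the example runs on its own. The point is only to show the flow: embed the query, embed each chunk, score them, and keep the best matches.

```python
import math

# Hypothetical hand-written concept table standing in for a trained
# embedding model: each known word maps to one "meaning" dimension.
CONCEPTS = {
    "car": 0, "automobile": 0, "vehicle": 0,
    "repair": 1, "fix": 1, "mechanic": 1,
    "recipe": 2, "bake": 2, "bread": 2,
}
NUM_DIMS = 3

def toy_embed(text):
    """Count how often each concept appears in the text (toy embedding)."""
    vector = [0.0] * NUM_DIMS
    for word in text.lower().split():
        word = word.strip(".,?!")
        if word in CONCEPTS:
            vector[CONCEPTS[word]] += 1.0
    return vector

def cosine_similarity(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # no known concepts in one of the texts
    return dot / (norm_a * norm_b)

def semantic_search(query, chunks, top_k=2):
    """Return the top_k (score, chunk) pairs, best match first."""
    query_vector = toy_embed(query)
    scored = [(cosine_similarity(query_vector, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

chunks = [
    "A mechanic can fix your automobile.",
    "Bake the bread using this recipe.",
    "Every vehicle needs regular checks.",
]
results = semantic_search("car repair", chunks)
for score, chunk in results:
    print(f"{score:.2f}  {chunk}")
```

Neither of the two chunks returned contains the words "car" or "repair", yet both rank above the bread recipe because they share concepts with the query. A real system would get the same effect from a learned embedding model rather than a lookup table.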
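Semantic search depends on embeddings. Embedding is the process of turning words, phrases, or longer passages into vectors, usually with a trained model. Good embeddings are crucial for semantic search to work well, because the search can only be as accurate as the vectors it compares.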
One big advantage of semantic search is that it can understand context better. When the embedding model takes the surrounding words into account, it can tell the difference between words that are spelled the same but mean different things. For example, it can tell if "bank" means a place for money or the side of a river.
Meaning is key
Semantic search also helps with questions in different languages. Because it looks at meaning, it can often find relevant info even if it's in a different language from the search terms, as long as the embedding model was trained on both languages.
In RAG systems, semantic search is used to find relevant chunks of text to help answer questions. It's a big improvement over older methods because it can find info that's related but doesn't use the exact same words as the question.
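Here is a rough sketch of how that retrieval step can feed a RAG prompt. Everything in it is an assumption made for illustration: the chunk vectors and the question vector are hand-made rather than produced by an embedding model, and `retrieve` and `build_prompt` are hypothetical helper names, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, index, top_k):
    """Return the text of the top_k chunks most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]

def build_prompt(question, context_chunks):
    """Put the retrieved chunks in front of the question for the model."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

# Hand-made vectors (assumed for this example, not real embeddings).
index = [
    {"text": "Refunds are issued within 14 days of a return.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Our office is closed on public holidays.", "vector": [0.1, 0.9, 0.1]},
    {"text": "Returned items must be unused.", "vector": [0.8, 0.0, 0.3]},
]
question = "How long until I get my money back?"
question_vector = [0.95, 0.05, 0.1]  # pretend output of an embedding model

prompt = build_prompt(question, retrieve(question_vector, index, top_k=2))
print(prompt)
```

The question never uses the word "refund", but the two refund-related chunks end up in the prompt and the office-hours chunk is left out. The finished prompt would then go to a language model to produce the answer.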
However, semantic search isn't perfect. Sometimes it might miss relevant info if the meaning is expressed in an unusual way. It can also sometimes return info that seems related but isn't actually helpful.
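One simple way to cut down on results that only look related is to drop anything below a minimum similarity score. The sketch below assumes the search has already produced (score, chunk) pairs; the scores, the chunk texts, and the 0.5 cutoff are all made up for illustration. In practice a cutoff has to be tuned for the embedding model and the data, and it won't catch every unhelpful result.

```python
def filter_by_score(scored_chunks, min_score=0.5):
    """Keep only the (score, chunk) pairs at or above min_score."""
    return [(score, chunk) for score, chunk in scored_chunks if score >= min_score]

# Made-up scores standing in for real similarity results.
scored_chunks = [
    (0.91, "How to reset your password."),
    (0.62, "Changing your account email."),
    (0.18, "Company holiday schedule."),
]

kept = filter_by_score(scored_chunks, min_score=0.5)
for score, chunk in kept:
    print(f"{score:.2f}  {chunk}")
```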
Time to scale
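To work well, semantic search needs a lot of data and computing power. The embedding model has to be trained on large amounts of text, and the more text it has to learn from, the better it can understand meanings and relationships between words. Comparing a query against a very large collection of vectors also takes more computing power as the collection grows.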
As AI gets better, semantic search keeps improving. Newer models can understand more complex relationships between words and ideas. This makes RAG systems better at finding the right info to answer questions.
In summary, semantic search is a powerful tool that helps RAG systems understand and find info based on meaning, not just exact words. It's a big part of what makes modern AI search and question-answering systems work so well.