
Building Production RAG Systems in .NET 10: The Complete Guide to Embeddings

DEV Community · by Vikrant Bagal · April 1, 2026 · 10 min read

The Hallucination Problem

Your company spent $50K building an internal chatbot. It tells customers "yes, we ship internationally" when you only ship to the US. Your support team is drowning in corrections.

Sound familiar?

This happens because an LLM on its own generates responses from patterns in its training data, not from your actual data. When it lacks the facts, it hallucinates and confidently states false information.

RAG (Retrieval-Augmented Generation) addresses this. Instead of hoping the LLM knows about your data, you retrieve the relevant documents and feed them to the model along with the question.

What Are Embeddings?

Think of embeddings as a way to convert text into mathematics.

The Simple Version

```
Text: "The quick brown fox"
        ↓
Embedding (float array, 1536 dimensions)
[0.234, -0.156, 0.892, ..., 0.421]
        ↓
This vector captures semantic meaning
```

Why Vectors Matter

Two sentences with different words can have similar embeddings if they mean the same thing:

```
Sentence A: "Our Q3 revenue exceeded $5 million"
Embedding A: [0.234, -0.156, 0.892, ...]

Sentence B: "Q3 generated more than $5M in sales"
Embedding B: [0.235, -0.154, 0.894, ...]

← Very similar! The model understands they mean the same thing.
```

But this completely different sentence:

```
Sentence C: "I like coffee"
Embedding C: [0.892, 0.234, -0.156, ...]

← Very different vector! Different meaning.
```

This is how RAG systems find relevant documents by meaning, not just keyword matches.
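To make "similar embeddings" concrete, here is a minimal sketch of cosine similarity, the measure most vector stores use to compare embeddings. The three-element vectors are truncated stand-ins for the example values above (real embeddings have hundreds or thousands of dimensions), and `CosineSimilarity` is an illustrative helper, not part of any library.

```csharp
using System;

// Cosine similarity: dot(a, b) / (|a| * |b|).
// Ranges from -1 to 1; values near 1 mean the vectors point the same way.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }

    // A zero vector has no direction; treat it as "no similarity".
    if (normA == 0 || normB == 0) return 0;
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Truncated, illustrative versions of the vectors shown above (assumption).
float[] sentenceA = { 0.234f, -0.156f, 0.892f };
float[] sentenceB = { 0.235f, -0.154f, 0.894f };
float[] sentenceC = { 0.892f, 0.234f, -0.156f };

Console.WriteLine($"A vs B: {CosineSimilarity(sentenceA, sentenceB):F3}");
Console.WriteLine($"A vs C: {CosineSimilarity(sentenceA, sentenceC):F3}");
```

With these sample values, A vs B comes out very close to 1 and A vs C close to 0, which matches the intuition in the examples.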

The RAG Pipeline in .NET 10

Step 1: Generate Embeddings from Your Documents

```csharp
// In .NET 10 with Microsoft.Extensions.AI.
// VectorStore and VectorDocument are placeholder types for your vector database client.
using Microsoft.Extensions.AI;

public class DocumentEmbedder
{
    private readonly IEmbeddingGenerator<string, Embedding<float>> _embeddingClient;
    private readonly VectorStore _vectorStore;

    public DocumentEmbedder(
        IEmbeddingGenerator<string, Embedding<float>> client, VectorStore store)
    {
        _embeddingClient = client;
        _vectorStore = store;
    }

    // Embed your documents once
    public async Task IndexDocumentsAsync(List<string> documents)
    {
        var embeddings = await _embeddingClient.GenerateAsync(documents);

        var vectors = embeddings.Select((e, i) => new VectorDocument
        {
            Id = Guid.NewGuid().ToString(),
            Content = documents[i],
            Vector = e.Vector.ToArray(),
            Metadata = new Dictionary<string, object> { ["Source"] = "DocumentLibrary" }
        }).ToList();

        await _vectorStore.UpsertAsync(vectors);
    }
}
```

Key point: You embed documents once and store them. For a given model and version, the same text produces effectively the same vector, so you only need to re-embed when a document or the embedding model changes.
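One way to act on "embed once" is to keep a hash of each document's text and skip the embedding call when the hash has not changed. The sketch below is only an illustration of that idea: `ContentHash`, `ShouldEmbed`, and the in-memory dictionary are hypothetical names, and a real system would persist the hashes next to the vectors.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

// SHA-256 of the document text, as an uppercase hex string.
static string ContentHash(string text) =>
    Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text)));

// Returns true when the document is new or its text changed since the last run,
// and records the new hash. Returns false when nothing changed.
static bool ShouldEmbed(Dictionary<string, string> knownHashes, string docId, string text)
{
    var hash = ContentHash(text);
    if (knownHashes.TryGetValue(docId, out var previous) && previous == hash)
        return false; // unchanged: reuse the stored vector

    knownHashes[docId] = hash; // remember this version
    return true;
}

// Hypothetical in-memory hash store; persist it in practice.
var hashes = new Dictionary<string, string>();

Console.WriteLine(ShouldEmbed(hashes, "policy.md", "We ship to the US only."));      // True
Console.WriteLine(ShouldEmbed(hashes, "policy.md", "We ship to the US only."));      // False
Console.WriteLine(ShouldEmbed(hashes, "policy.md", "We ship to the US and Canada.")); // True
```

A hash only detects that the text changed. If you switch embedding models, every document needs re-embedding regardless of its hash.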

Step 2: When User Asks, Search Semantically

```csharp
using Microsoft.Extensions.AI;

public class RAGResponseGenerator
{
    private readonly VectorStore _vectorStore; // placeholder vector database client
    private readonly IEmbeddingGenerator<string, Embedding<float>> _embeddingClient;
    private readonly IChatClient _chatClient;

    public RAGResponseGenerator(VectorStore store,
        IEmbeddingGenerator<string, Embedding<float>> embedder, IChatClient chat)
    {
        _vectorStore = store;
        _embeddingClient = embedder;
        _chatClient = chat;
    }

    public async Task<string> AnswerAsync(string userQuestion)
    {
        // 1. Embed the question
        var queryEmbedding = await _embeddingClient.GenerateAsync(new[] { userQuestion });

        // 2. Search vector database for similar documents
        var relevantDocs = await _vectorStore.SearchAsync(
            vector: queryEmbedding[0].Vector.ToArray(),
            topK: 5,
            threshold: 0.7 // Minimum similarity score
        );

        // 3. Build context from relevant documents
        var context = string.Join("\n\n", relevantDocs
            .Select(d => $"Source: {d.Metadata["Source"]}\n{d.Content}"));

        // 4. Generate response grounded in real data
        var response = await _chatClient.GetResponseAsync(new[]
        {
            new ChatMessage(ChatRole.System,
                "You are a helpful assistant. Answer using ONLY the provided context. " +
                "If the context doesn't contain the answer, say 'I don't have that information.'"),
            new ChatMessage(ChatRole.User,
                $"Context:\n{context}\n\nQuestion: {userQuestion}")
        });

        return response.Text;
    }
}
```
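What the search step does can be sketched without any database: score every stored vector against the query, drop matches below the threshold, and keep the best `topK`. This is a brute-force illustration with hand-made three-dimensional vectors, not how a production vector store indexes data; `Cosine`, `Search`, and the sample documents are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return normA == 0 || normB == 0 ? 0 : dot / Math.Sqrt(normA * normB);
}

// Brute-force top-K: score everything, filter by threshold, keep the best K.
static List<(string Content, double Score)> Search(
    IReadOnlyList<(string Content, float[] Vector)> index,
    float[] query, int topK, double threshold)
{
    return index
        .Select(doc => (doc.Content, Score: Cosine(doc.Vector, query)))
        .Where(hit => hit.Score >= threshold)
        .OrderByDescending(hit => hit.Score)
        .Take(topK)
        .ToList();
}

// Hand-made vectors standing in for real embeddings (assumption).
var store = new List<(string Content, float[] Vector)>
{
    ("We ship to US addresses only.", new[] { 0.9f, 0.1f, 0.0f }),
    ("Returns are accepted within 30 days.", new[] { 0.0f, 1.0f, 0.0f }),
    ("Domestic orders arrive in 3-5 days.", new[] { 0.8f, 0.6f, 0.0f }),
};

var hits = Search(store, query: new[] { 1.0f, 0.0f, 0.0f }, topK: 5, threshold: 0.7);

// An empty result is the signal to answer "I don't have that information."
Console.WriteLine(hits.Count == 0
    ? "I don't have that information."
    : string.Join(Environment.NewLine, hits.Select(h => $"{h.Score:F2}  {h.Content}")));
```

Here the two shipping-related entries pass the 0.7 threshold and the returns policy does not. Real vector databases replace the linear scan with an approximate nearest-neighbor index so the search stays fast at scale.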

Real-World Use Cases

1. Enterprise Document Search

Problem: "Find all contracts where we agreed to 30-day payment terms"

Keyword search fails. It finds "30 days" but also matches "30-day warranty" in unrelated docs.

RAG solution:

```csharp
// Semantic search understands intent
var searchResults = await _vectorStore.SearchAsync(
    query: "payment terms agreements",
    topK: 20
);

// Returns contracts actually discussing payment terms
// Not just keyword matches
```

2. Customer Support Automation

Problem: Support tickets are repetitive. Your FAQ is massive.

RAG solution:

```csharp
public class SupportChatbot
{
    // _vectorStore (placeholder type) and _chatClient (IChatClient) are injected
    // the same way as in RAGResponseGenerator.
    public async Task<string> AnswerSupportQuestionAsync(string question)
    {
        // Search FAQ, past tickets, knowledge base
        var relevantArticles = await _vectorStore.SearchAsync(
            query: question,
            filter: new { Type = "FaqOrTicket" }
        );

        // Generate response from actual support history
        var context = string.Join("\n\n", relevantArticles.Select(a => a.Content));
        var response = await _chatClient.GetResponseAsync(
            $"Support history:\n{context}\n\nCustomer asks: {question}");

        return response.Text;
    }
}
```

Result: Consistent answers based on real support history, not hallucinated solutions.

3. Technical Documentation Assistant

Problem: Your API docs are 500 pages. Developers give up.

RAG solution:

```csharp
// "How do I paginate API results?"
// Search finds: Authentication docs, Pagination section, Examples
// Returns: Exactly what the developer needs

var docSearch = await _vectorStore.SearchAsync(
    query: "pagination API results",
    filter: new { DocumentType = "ApiDocs" },
    topK: 3
);
```

4. Code Analysis & Documentation

Problem: Onboarding takes weeks. New devs can't find relevant code examples.

RAG solution:

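```csharp
public class CodebaseAssistant
{
    // Embed your entire codebase, then query it.
    // _codeVectorStore is the same placeholder vector store type used above;
    // this sketch assumes its search returns VectorDocument results.
    public async Task<IReadOnlyList<VectorDocument>> FindExamplesAsync()
    {
        // "Show me examples of dependency injection usage"
        var examples = await _codeVectorStore.SearchAsync(
            query: "dependency injection usage examples",
            topK: 10
        );

        // Returns actual code from your repo
        return examples;
    }
}
```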

DO's and DON'Ts for RAG in .NET

✅ DO

  • Chunk documents smartly. Chunks of 512-1024 tokens are a common starting point; tune the size for your content. Too small = lost context. Too large = diluted relevance and bigger prompts.

```csharp
var chunks = ChunkDocument(doc, chunkSize: 512, overlap: 100);
```

  • Store metadata. Source, date, version - makes results traceable.

```csharp
var vector = new VectorDocument
{
    Content = text,
    Vector = embedding,
    Metadata = new { Source = "SalesReport", Date = DateTime.Now }
};
```

  • Monitor similarity scores. Not all search results are good results.

```csharp
var results = await _vectorStore.SearchAsync(query, topK: 5);
var confident = results.Where(r => r.SimilarityScore > 0.75);
```

  • Regenerate embeddings when documents change significantly.
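The chunking advice above can be sketched as a sliding window with overlap. The article's `ChunkDocument` is not shown, so `ChunkByWords` below is a hypothetical stand-in, and it counts whitespace-separated words rather than model tokens. A real pipeline would use the tokenizer that matches its embedding model.

```csharp
using System;
using System.Collections.Generic;

// Splits text into chunks of up to chunkSize words. Each chunk repeats the
// last `overlap` words of the previous one so context carries across boundaries.
static List<string> ChunkByWords(string text, int chunkSize, int overlap)
{
    if (chunkSize <= 0)
        throw new ArgumentOutOfRangeException(nameof(chunkSize));
    if (overlap < 0 || overlap >= chunkSize)
        throw new ArgumentOutOfRangeException(nameof(overlap));

    var words = text.Split(new[] { ' ', '\t', '\r', '\n' },
        StringSplitOptions.RemoveEmptyEntries);
    var chunks = new List<string>();
    int step = chunkSize - overlap; // distance between chunk starts

    for (int start = 0; start < words.Length; start += step)
    {
        int count = Math.Min(chunkSize, words.Length - start);
        chunks.Add(string.Join(" ", words, start, count));
        if (start + count == words.Length) break; // this chunk reached the end
    }
    return chunks;
}

var pieces = ChunkByWords(
    "one two three four five six seven eight nine ten", chunkSize: 4, overlap: 1);

foreach (var piece in pieces)
    Console.WriteLine(piece);
// one two three four
// four five six seven
// seven eight nine ten
```

Overlap costs extra storage and embedding calls, since the overlapping words are embedded twice, so it is worth measuring whether it actually improves retrieval on your documents.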

❌ DON'T

  • Embed raw PDFs. Extract text first. Preserve structure.

```csharp
// Bad
var embedding = await client.GenerateAsync(pdfBytes);

// Good
var text = ExtractTextFromPdf(pdf);
var embedding = await client.GenerateAsync(text);
```

  • Trust low similarity scores. A score around 0.45 usually signals a weak match, though the right cutoff depends on your embedding model and similarity metric.

```csharp
// Bad: Use anything over 0.5
// Good: Use results > 0.7, fall back to "I don't know"
```

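  • Use outdated embeddings for new documents. Vectors produced by different models or model versions are not comparable, so mixing them in one index gives inconsistent results.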

  • Forget about cost. Embedding a million documents is expensive. Plan your chunk strategy.

Vector Databases for .NET

| Database | .NET Support | Best For | Cost |
|---|---|---|---|
| Azure Cosmos DB | ✅ Native | Enterprise, serverless | |
| Azure OpenAI | ✅ Built-in | Quick start, OpenAI models | |
| pgvector (PostgreSQL) | ✅ Npgsql | Self-hosted, low cost | $ |
| Milvus | ✅ Community | Open source, scalable | $ |
| Pinecone | ✅ REST API | Managed, serverless | |

Minimal Example with Cosmos DB

```csharp
services.AddAzureOpenAIClient(endpoint, credentials);
services.AddScoped<DocumentEmbedder>();
services.AddScoped<RAGResponseGenerator>();

// Dependency injection handles the rest
```

Measuring RAG Quality

Retrieval Metrics

Precision: Of top-5 results, how many are relevant?

```csharp
var relevant = searchResults.Count(r => r.IsRelevant);
// Cast to double: int / int would truncate to 0 or 1.
var precision = (double)relevant / searchResults.Count; // Target: > 0.8
```

Recall: Of all relevant documents, did we find them?

```csharp
var foundRelevant = relevantDocuments
    .Count(d => searchResults.Contains(d));
// Cast to double: int / int would truncate to 0 or 1.
var recall = (double)foundRelevant / totalRelevantDocuments; // Target: > 0.7
```
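Both metrics can be computed from two things: the ordered list of document IDs a search returned, and a hand-labeled set of IDs that are actually relevant to the query. The sketch below assumes exactly that setup. `PrecisionAtK`, `Recall`, and the sample IDs are illustrative, and building the labeled set is the real work in practice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Precision@K: of the first K retrieved documents, what fraction is relevant?
static double PrecisionAtK(IReadOnlyList<string> retrieved, ISet<string> relevant, int k)
{
    var topK = retrieved.Take(k).ToList();
    if (topK.Count == 0) return 0;
    int hitCount = topK.Count(id => relevant.Contains(id));
    return (double)hitCount / topK.Count;
}

// Recall: of all relevant documents, what fraction did the search return?
static double Recall(IReadOnlyList<string> retrieved, ISet<string> relevant)
{
    if (relevant.Count == 0) return 0;
    int found = relevant.Count(id => retrieved.Contains(id));
    return (double)found / relevant.Count;
}

// Sample data (assumption): five search results, four labeled-relevant documents.
var searchResultIds = new List<string> { "doc1", "doc2", "doc3", "doc4", "doc5" };
var labeledRelevant = new HashSet<string> { "doc1", "doc3", "doc7", "doc9" };

// 2 of the 5 retrieved documents are relevant -> precision 0.4
Console.WriteLine($"Precision@5: {PrecisionAtK(searchResultIds, labeledRelevant, 5)}");
// 2 of the 4 relevant documents were retrieved -> recall 0.5
Console.WriteLine($"Recall: {Recall(searchResultIds, labeledRelevant)}");
```

Averaging these numbers over a fixed set of test questions gives a baseline you can re-run whenever you change chunk size, thresholds, or the embedding model.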

Conclusion

RAG reduces hallucinations by grounding AI in your actual data.

Key Takeaways:

  • Embeddings = Text as math. They capture semantic meaning.

  • RAG pipeline = Search → Feed → Generate. Find relevant docs, include them, answer based on reality.

  • .NET 10 + Microsoft.Extensions.AI makes this native and simple.

  • Vector databases store and search embeddings at scale.

  • Production readiness requires a chunking strategy, metadata, and similarity thresholds.

Next Steps:

  • Review Generative AI for Beginners .NET v2 - Lesson 3 covers RAG

  • Choose your vector database (start with pgvector for simplicity)

  • Extract and chunk your documents

  • Build your first RAG pipeline

Resources

  • Microsoft.Extensions.AI Docs

  • Generative AI for Beginners .NET v2

  • RAG vs Fine-tuning

  • pgvector for PostgreSQL

  • Azure OpenAI Embeddings

What's your biggest question about RAG? Drop it below!
