Vector Databases - Getting Started

Overview

LlmTornado integrates with vector databases for semantic search, embedding storage, and retrieval-augmented generation (RAG). A vector database stores high-dimensional vectors and lets you query them efficiently by similarity.

Quick Start

csharp

// Install the packages:
// dotnet add package LlmTornado.VectorDatabases
// dotnet add package LlmTornado.VectorDatabases.ChromaDB

using LlmTornado;
using LlmTornado.VectorDatabases;

// Initialize the API client
TornadoApi api = new TornadoApi("your-api-key");

// Create embeddings
List<float[]> embeddings = await api.Embeddings.CreateEmbedding(
    new List<string> { "Hello world", "AI is amazing" },
    ChatModel.OpenAi.Embedding.Ada002
);

// Store in vector database (see ChromaDB documentation)

What are Vector Databases?

Vector databases store data as high-dimensional vectors (embeddings) that capture semantic meaning. This enables:

Semantic Search: Find similar content by meaning, not just keywords
RAG: Retrieve relevant context for AI responses
Similarity Matching: Find similar items across large datasets
Clustering: Group related items automatically
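
"Similar by meaning" is usually measured with a vector metric such as cosine similarity. The sketch below is a minimal illustration with hand-written toy vectors. The VectorMath helper is hypothetical and not part of LlmTornado, and real embeddings have hundreds or thousands of dimensions.

```csharp
using System;

// Toy 3-dimensional vectors standing in for real embeddings.
float[] cat = { 0.9f, 0.1f, 0.0f };
float[] kitten = { 0.8f, 0.2f, 0.1f };
float[] invoice = { 0.0f, 0.1f, 0.9f };

Console.WriteLine($"cat vs kitten:  {VectorMath.CosineSimilarity(cat, kitten):F3}");
Console.WriteLine($"cat vs invoice: {VectorMath.CosineSimilarity(cat, invoice):F3}");

// Hypothetical helper for illustration; not an LlmTornado API.
public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same number of dimensions.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // A zero vector has no direction; treat it as unrelated to everything.
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Vectors that point in nearly the same direction score close to 1, while unrelated vectors score close to 0.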

Creating Embeddings

Text Embeddings

csharp

TornadoApi api = new TornadoApi("your-api-key");

// Single text
float[] embedding = await api.Embeddings.CreateEmbedding(
    "The quick brown fox",
    ChatModel.OpenAi.Embedding.Ada002
);

// Multiple texts
List<float[]> embeddings = await api.Embeddings.CreateEmbedding(
    new List<string> 
    { 
        "First document",
        "Second document",
        "Third document"
    },
    ChatModel.OpenAi.Embedding.Ada002
);

Multimodal Embeddings

csharp

// Image and text embeddings (if supported by provider)
// Useful for image search and multimodal RAG

Use Cases

Semantic Search

Store document embeddings and find similar content:

csharp

// 1. Embed documents
// 2. Store in vector database
// 3. Embed query
// 4. Find similar vectors
// 5. Return matching documents
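
The steps above can be sketched without any database as a brute-force scan over an in-memory list. This is only an illustration under stated assumptions: the vectors are toy values standing in for real embeddings, and InMemorySearch is a hypothetical helper, not an LlmTornado or ChromaDB API. A vector database typically replaces the linear scan with an index so that queries stay fast on large collections.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Steps 1-2: documents with precomputed (toy) embeddings.
var index = new List<(string Text, float[] Vector)>
{
    ("How to reset your password",  new[] { 0.9f, 0.1f, 0.0f }),
    ("Quarterly revenue report",    new[] { 0.0f, 0.2f, 0.9f }),
    ("Recovering a locked account", new[] { 0.8f, 0.3f, 0.1f }),
};

// Step 3: the embedded query (toy vector).
float[] query = { 0.85f, 0.2f, 0.05f };

// Steps 4-5: rank by similarity and return the best matches.
foreach (var hit in InMemorySearch.TopK(index, query, k: 2))
    Console.WriteLine($"{hit.Score:F3}  {hit.Text}");

// Hypothetical helper for illustration only.
public static class InMemorySearch
{
    public static List<(string Text, double Score)> TopK(
        IEnumerable<(string Text, float[] Vector)> index, float[] query, int k)
    {
        return index
            .Select(doc => (doc.Text, Score: Cosine(doc.Vector, query)))
            .OrderByDescending(hit => hit.Score)
            .Take(k)
            .ToList();
    }

    private static double Cosine(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return normA == 0 || normB == 0 ? 0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

The query and the documents must be embedded with the same model; vectors from different models are not comparable.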

RAG (Retrieval-Augmented Generation)

Enhance AI responses with relevant context:

csharp

// 1. Embed user query
// 2. Retrieve relevant documents
// 3. Pass documents as context to AI
// 4. Generate informed response
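
Step 3 is mostly string assembly. The following is a minimal sketch of turning retrieved documents into a prompt. The RagPrompt helper and the prompt wording are hypothetical, the retrieved documents are hard-coded, and sending the prompt to a chat model (step 4) is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Steps 1-2 would produce these documents from a vector search; hard-coded here.
var retrieved = new List<string>
{
    "Passwords can be reset from the account settings page.",
    "Locked accounts are unlocked automatically after 30 minutes.",
};

string prompt = RagPrompt.Build("How do I reset my password?", retrieved);
Console.WriteLine(prompt);

// Hypothetical helper for illustration; not an LlmTornado API.
public static class RagPrompt
{
    // Builds a prompt that places numbered context passages before the question.
    public static string Build(string question, IReadOnlyList<string> contextDocs)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Answer the question using only the context below.");
        sb.AppendLine("If the context is insufficient, say so.");
        sb.AppendLine();
        sb.AppendLine("Context:");
        for (int i = 0; i < contextDocs.Count; i++)
            sb.AppendLine($"[{i + 1}] {contextDocs[i]}");
        sb.AppendLine();
        sb.Append("Question: ").Append(question);
        return sb.ToString();
    }
}
```

Numbering the passages makes it possible to ask the model to cite which passage supports its answer.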

Document Clustering

Group similar documents automatically:

csharp

// 1. Create embeddings for all documents
// 2. Use clustering algorithm
// 3. Group by similarity
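
As one possible reading of steps 2 and 3, the sketch below uses greedy threshold ("leader") clustering: each vector joins the first cluster whose first member is similar enough, otherwise it starts a new cluster. This is a deliberately simple stand-in, not k-means or any algorithm prescribed by LlmTornado. The LeaderClustering helper, the toy vectors, and the 0.8 threshold are all assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Step 1: documents with precomputed (toy) embeddings.
var docs = new (string Text, float[] Vector)[]
{
    ("Reset your password",   new[] { 0.9f, 0.1f, 0.0f }),
    ("Unlock your account",   new[] { 0.8f, 0.2f, 0.1f }),
    ("Q3 revenue summary",    new[] { 0.0f, 0.1f, 0.9f }),
    ("Annual financial plan", new[] { 0.1f, 0.0f, 0.95f }),
};

// Steps 2-3: cluster by cosine similarity and print each group.
var clusters = LeaderClustering.Cluster(docs.Select(d => d.Vector).ToList(), threshold: 0.8);
for (int c = 0; c < clusters.Count; c++)
    Console.WriteLine($"Cluster {c}: {string.Join("; ", clusters[c].Select(i => docs[i].Text))}");

// Hypothetical helper for illustration only.
public static class LeaderClustering
{
    // Returns clusters as lists of indices into the input list.
    public static List<List<int>> Cluster(IReadOnlyList<float[]> vectors, double threshold)
    {
        var clusters = new List<List<int>>();
        for (int i = 0; i < vectors.Count; i++)
        {
            // The first member of each cluster acts as its leader.
            var home = clusters.FirstOrDefault(
                cluster => Cosine(vectors[cluster[0]], vectors[i]) >= threshold);
            if (home == null)
            {
                home = new List<int>();
                clusters.Add(home);
            }
            home.Add(i);
        }
        return clusters;
    }

    private static double Cosine(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return normA == 0 || normB == 0 ? 0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

The result depends on input order and on the threshold, so for production workloads a dedicated clustering algorithm such as k-means is usually a better fit.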

Supported Providers

LlmTornado supports various embedding providers:

OpenAI: text-embedding-ada-002, text-embedding-3-small, text-embedding-3-large
Google: Gecko embeddings
Cohere: Embed v3
Voyage AI: Voyage embeddings
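
Model choice affects storage as well as quality. By default, text-embedding-ada-002 and text-embedding-3-small return 1536-dimensional vectors and text-embedding-3-large returns 3072. The rough estimate below counts only raw 32-bit floats and ignores index and metadata overhead, which vary by database. EmbeddingStorage is a hypothetical helper.

```csharp
using System;

const long documents = 1_000_000;

foreach (int dims in new[] { 1536, 3072 })
{
    long bytes = EmbeddingStorage.EstimateBytes(documents, dims);
    double gib = bytes / (1024.0 * 1024.0 * 1024.0);
    Console.WriteLine($"{documents:N0} vectors x {dims} dims = {gib:F2} GiB");
}

// Hypothetical helper for illustration only.
public static class EmbeddingStorage
{
    // Raw size of the vectors alone: count * dimensions * 4 bytes per float.
    public static long EstimateBytes(long vectorCount, int dimensions)
        => vectorCount * dimensions * sizeof(float);
}
```

One million 1536-dimensional vectors come to roughly 5.7 GiB of raw floats, and doubling the dimensions doubles that figure.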

Best Practices

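Choose an embedding model that fits your use case (language coverage, domain, cost)
Batch embedding requests to reduce the number of API round trips
Cache embeddings so that unchanged text is not embedded twice
Weigh embedding dimensions against storage and query performance
Normalize vectors when the similarity metric requires it (for example, when dot product is used in place of cosine similarity)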
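
Two of these practices, caching and normalization, can be sketched in a few lines. Both types below are hypothetical and not part of LlmTornado. The cache takes a synchronous delegate as a stand-in for a real (normally asynchronous) embedding call, and it keys on the exact input text.

```csharp
using System;
using System.Collections.Generic;

// Normalization: scale a vector to unit length so dot product equals cosine similarity.
float[] unit = EmbeddingUtils.Normalize(new[] { 3f, 4f });
Console.WriteLine($"Normalized: [{unit[0]}, {unit[1]}]");

// Caching: the stand-in embedder counts how often it is actually invoked.
int calls = 0;
var cache = new EmbeddingCache(text =>
{
    calls++;
    return new[] { (float)text.Length, 1f }; // fake embedding for demonstration
});

cache.Get("hello");
cache.Get("hello"); // served from the cache, embedder not called again
Console.WriteLine($"Embedder calls: {calls}");

// Hypothetical helpers for illustration only.
public static class EmbeddingUtils
{
    public static float[] Normalize(float[] vector)
    {
        double sumOfSquares = 0;
        foreach (float x in vector) sumOfSquares += x * x;

        double norm = Math.Sqrt(sumOfSquares);
        if (norm == 0) return (float[])vector.Clone(); // zero vector cannot be scaled

        var result = new float[vector.Length];
        for (int i = 0; i < vector.Length; i++)
            result[i] = (float)(vector[i] / norm);
        return result;
    }
}

public sealed class EmbeddingCache
{
    private readonly Dictionary<string, float[]> _cache = new Dictionary<string, float[]>();
    private readonly Func<string, float[]> _embed;

    public EmbeddingCache(Func<string, float[]> embed) => _embed = embed;

    // Returns the cached vector for the text, computing it on first use.
    public float[] Get(string text)
    {
        if (!_cache.TryGetValue(text, out var vector))
        {
            vector = _embed(text);
            _cache[text] = vector;
        }
        return vector;
    }
}
```

If you switch embedding models, clear the cache or include the model name in the key, since vectors from different models are not interchangeable.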

Related Topics

ChromaDB - Using ChromaDB with LlmTornado
Chat Basics
