ContextLite

The SQLite of AI Context

Fast local context for any AI system. One Go binary. Zero dependencies. No cloud, no vectors, just intelligent text search that actually works.

# Start the context server
$ contextlite serve --port 8080
✓ ContextLite running on http://localhost:8080
# Any AI tool can now get context via HTTP
$ curl -X POST http://localhost:8080/api/v1/context \
-H 'Content-Type: application/json' \
-d '{"query": "How does authentication work?", "max_tokens": 4000}'
# Returns relevant context from your codebase
{"context": "...", "documents": ["auth.go", "middleware.go"], "tokens": 3847}

What We Actually Built

ContextLite is a production-ready HTTP sidecar that provides context to AI systems. It's not a vector database replacement—it's a different approach using proven technology.

Core Technology

  • SQLite FTS5 for full-text search with Porter stemming
  • 7-Dimensional Scoring for intelligent document ranking
  • Multi-Level Caching (memory → SQLite → snapshots)
  • HTTP API that any tool can call
  • Single Go Binary with zero dependencies

Why It's Fast

SQLite FTS5 on local disk is inherently fast. We're not computing cosine similarity on high-dimensional vectors—we're doing indexed text search with intelligent ranking.

Performance characteristics:
• Cached queries: microseconds
• Cold queries: single-digit milliseconds
• No network latency
• Minimal memory footprint
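
As a rough illustration, the hot path reduces to the kind of FTS5 query sketched below, using Python's built-in sqlite3 module. ContextLite's actual schema and ranking are internal; treat this as a sketch of the underlying primitive, not our code.

import sqlite3

# Sketch: an FTS5 table with Porter stemming, ranked with bm25().
db = sqlite3.connect(':memory:')
db.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body, tokenize='porter')")
db.executemany('INSERT INTO docs VALUES (?, ?)', [
    ('auth.go', 'JWT authentication and session middleware'),
    ('cache.go', 'multi level cache with snapshot persistence'),
])

# Porter stemming lets 'authenticate' match 'authentication'.
rows = db.execute(
    'SELECT path, bm25(docs) FROM docs WHERE docs MATCH ? ORDER BY bm25(docs)',
    ('authenticate',),
).fetchall()
print(rows)  # lower bm25 = better match; auth.go ranks first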

The Problem with Current Solutions

Vector Databases

  • Require expensive embeddings
  • Cloud-based = latency + privacy concerns
  • Monthly bills ($100-500+)
  • Complex Docker/K8s setup
  • Semantic ≠ relevant for code

Local Vector DBs

  • Still need embedding models
  • High memory usage (GBs)
  • Slow indexing
  • Configuration complexity
  • Version compatibility issues

Custom Solutions

  • Reinventing the wheel
  • Maintenance burden
  • No standardization
  • Performance unknowns
  • Integration complexity

We Built Different

Text search with intelligent ranking beats embeddings for code context. Everything runs locally, so there is no network latency, no privacy exposure, and no recurring bill. One binary means download and run: no complex setup.

Our Approach: Simple, Fast, Local

🔍 SQLite FTS5: Full-text search with Porter stemming. Proven, fast, reliable.

🎯 7D Scoring: Multi-dimensional ranking for better context selection.

Smart Caching: Three-layer cache for microsecond responses.

🔒 100% Local: No cloud, no API keys, your data stays private.

Technical Architecture

// What's actually running
type ContextLite struct {
    Storage     *SQLiteStore   // FTS5-powered storage
    Quantum     *QuantumEngine // 7D scoring algorithm
    Cache       *MultiCache    // L1/L2/L3 caching
    HTTPServer  *Server        // Simple REST API
}

// The API is dead simple
POST /api/v1/documents       // Add document
GET  /api/v1/documents       // List documents
POST /api/v1/context         // Get context for query
GET  /health                 // Service health
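
A quick smoke test of that surface (endpoint paths from the listing above; request and response fields follow the examples on this page):

import requests

BASE = 'http://localhost:8080'

# Index one document, check health, then request context.
requests.post(f'{BASE}/api/v1/documents',
              json={'id': 'auth.go', 'content': 'func Login() error { ... }'})
assert requests.get(f'{BASE}/health').ok

ctx = requests.post(f'{BASE}/api/v1/context',
                    json={'query': 'login flow', 'max_tokens': 2000}).json()
print(ctx['documents'], ctx['tokens'])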

7-Dimensional Scoring

Documents are evaluated across seven dimensions simultaneously:

  1. Relevance: Query-document match strength
  2. Coherence: Inter-document compatibility
  3. Recency: Time-weighted importance
  4. Diversity: Anti-redundancy measure
  5. Entanglement: Cross-reference detection
  6. Probability: Statistical selection
  7. Uncertainty: Controlled randomness
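
The exact weighting is internal to the engine, but conceptually each document gets a per-dimension score that is folded into a single rank. A hypothetical weighted-sum sketch (dimension names from the list above; the weights are invented for illustration):

# Hypothetical weights -- the real values live inside the scoring engine.
WEIGHTS = {
    'relevance': 0.30, 'coherence': 0.15, 'recency': 0.15, 'diversity': 0.15,
    'entanglement': 0.10, 'probability': 0.10, 'uncertainty': 0.05,
}

def rank(scores: dict[str, float]) -> float:
    """Fold seven per-dimension scores (each in [0, 1]) into one number."""
    return sum(w * scores.get(dim, 0.0) for dim, w in WEIGHTS.items())

print(rank({'relevance': 0.9, 'recency': 0.4, 'diversity': 0.7}))  # 0.435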

Cache Architecture

L1 Memory Cache ~0.001ms
↓ miss
L2 SQLite Cache ~0.1ms
↓ miss
L3 Snapshots ~1ms
↓ miss
Compute Fresh 2-10ms
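
A lookup falls through the layers and back-fills on the way out. A minimal read-through sketch of that pattern (not ContextLite's internal code):

def get_context(query, layers, compute):
    """layers: fastest-first caches with get/set; compute: the cold path."""
    for i, cache in enumerate(layers):
        result = cache.get(query)
        if result is not None:
            for faster in layers[:i]:   # promote into the faster layers
                faster.set(query, result)
            return result
    result = compute(query)             # 2-10ms cold path
    for cache in layers:
        cache.set(query, result)
    return result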

How to Use ContextLite

1. Start the Server

$ contextlite serve --port 8080
✓ ContextLite running on http://localhost:8080
✓ Database initialized: ~/.contextlite/context.db

2. Add Your Documents

import requests
from pathlib import Path

# Add project files to the context index (here: every Go file under src/)
for file in Path('src').rglob('*.go'):
    requests.post('http://localhost:8080/api/v1/documents',
                  json={
                      'id': str(file),
                      'content': file.read_text(),
                      'metadata': {'language': file.suffix.lstrip('.')}
                  })

3. Get Context for Your AI

// From any application
const response = await fetch('http://localhost:8080/api/v1/context', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    query: userQuestion,
    max_tokens: 4000
  })
});
const ctx = await response.json();

// Use with your LLM
const answer = await llm.complete({
  prompt: `Context:\n${ctx.context}\n\nQ: ${userQuestion}`
});

Works With Everything

Any tool that can make HTTP requests can use ContextLite. No SDKs, no special integration.

Ollama Integration

import requests

# Get context from ContextLite
ctx = requests.post('http://localhost:8080/api/v1/context',
    json={'query': question, 'max_tokens': 4000})

# Send to Ollama with context  
response = requests.post('http://localhost:11434/api/generate',
    json={
        'model': 'llama2',
        'prompt': f"{ctx.json()['context']}\n\nQ: {question}"
    })

Continue.dev Config

{
  "contextProviders": [
    {
      "name": "contextlite",
      "params": {
        "endpoint": "http://localhost:8080/api/v1/context",
        "maxTokens": 4000
      }
    }
  ]
}

Any Language

# Bash
curl -X POST http://localhost:8080/api/v1/context \
  -H 'Content-Type: application/json' \
  -d '{"query": "How does X work?"}'

# Any HTTP client in any language works

Custom Integration

// Your app
const getContext = async (query) =>
  fetch('http://localhost:8080/api/v1/context', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, max_tokens: 4000 })
  }).then(r => r.json());

Simple, Fair Pricing

Start with the free Community Edition. Upgrade when you need commercial licensing.

Community

Free
  • Full source code
  • MIT license
  • All features included
  • Community support
  • Perfect for open source

Professional (Most Popular)

$99 one-time
  • Commercial use license
  • Email support
  • 1 year of updates
  • Setup assistance
  • Invoice available

Enterprise

$999 one-time
  • Unlimited commercial use
  • Priority support with SLA
  • Custom features possible
  • Training session
  • PO accepted

Frequently Asked Questions

How fast is it really?

SQLite FTS5 queries on local disk are inherently fast. Cached queries return in microseconds, cold queries in single-digit milliseconds. We'll publish detailed benchmarks with methodology soon.

How is this different from vector databases?

Vector databases use embeddings and cosine similarity. We use full-text search with intelligent multi-dimensional scoring. For code context, keyword matching with smart ranking often works better than semantic similarity.

What's "quantum-inspired" mean?

We use concepts from quantum mechanics (superposition, entanglement, probabilistic collapse) in our scoring algorithm. It's not quantum computing—it's applying quantum principles to document ranking.

Is my data private?

100% yes. Everything runs locally on your machine. No network calls, no telemetry, no data leaves your computer. Check the source code if you want to verify.

Can it scale to large codebases?

SQLite can handle millions of documents efficiently. The FTS5 index scales well, and our caching layers ensure fast responses even with large document sets.

Which AI tools does it work with?

Any tool that can make HTTP requests. This includes Ollama, Continue.dev, custom scripts, VS Code extensions, and any LLM framework. No special SDKs or integration needed.

Ready to Get Started?

Download ContextLite and give your AI instant access to local context.

Quick Start

# Download for your platform
wget https://github.com/yourusername/contextlite/releases/latest/download/contextlite-linux-amd64

# Make executable
chmod +x contextlite-linux-amd64

# Start the server
./contextlite-linux-amd64 serve

# You're ready to go!