Ship AI Agents in Minutes, Not Months
RecallBricks Runtime turns any LLM into a production-ready agent. Persistent memory. Automatic context. Self-improving intelligence. Zero configuration required.
No credit card required • Production-ready in 5 minutes
npm install @recallbricks/runtime

import { AgentRuntime } from '@recallbricks/runtime';

const agent = new AgentRuntime({
  llmApiKey: process.env.ANTHROPIC_API_KEY,
  apiKey: process.env.RECALLBRICKS_API_KEY
});

const response = await agent.chat('My favorite color is blue');

// Later...
const recall = await agent.chat("What's my favorite color?");
// "Your favorite color is blue!"

// Memory, context, identity — all automatic.

Works with Claude, GPT, Gemini, Ollama, and any LLM
RecallBricks Runtime is the cognitive operating system that turns any LLM into a production-ready agent — with persistent memory, automatic context, and self-improving intelligence.
The problem
AI agents have a memory problem.
They can't remember, learn, or explain what they know.
✕Your agent forgets everything
Context windows reset. Every conversation starts from zero — no memory of past decisions or preferences.
✕You're building memory from scratch
Wiring embeddings, vector DBs, and retrieval logic just to give your agent basic awareness.
✕Retrieval isn't understanding
Vector similarity finds matches, not meaning. Your agent retrieves documents but doesn't learn from them.
✕Your agent can't explain itself
It can't tell you what it knows, why it's confident, or when it should ask for help.
Why RecallBricks Runtime?
Everything your agent needs, built in
Stop building infrastructure. Start shipping agents.
Zero Configuration Memory
Save and recall automatically. No manual management, no vector databases to configure. Just works.
Works With Any LLM
Claude, GPT, Gemini, Ollama, or custom models. Switch providers without changing code.
Self-Improving Intelligence
Memories automatically upgrade from storage to enriched to expertise based on usage patterns.
Enterprise-Grade From Day One
Identity validation, error handling, retry logic, and automatic context management built in.
Production-Ready in 5 Minutes
One npm install, one API call. Your agent has persistent memory and automatic context.
Simple .chat() API
No orchestration code. No retrieval pipelines. Just call .chat() and everything is handled.
Works with Claude, GPT, Gemini, Ollama • Production-ready in 5 minutes
Built for Real Products
One API for any AI agent
Same simple interface. Different use cases. Production-ready in minutes.
Customer Support
Chatbot that remembers customer history, preferences, and past issues. Never asks the same questions twice.
const supportBot = new AgentRuntime({
agentId: 'customer-support',
userId: customerId,
llmApiKey: process.env.ANTHROPIC_API_KEY,
apiKey: process.env.RECALLBRICKS_API_KEY
});
await supportBot.chat('How can I help you today?');

Personal AI Assistant
Assistant that learns your preferences, remembers your decisions, and anticipates what you need.
const assistant = new AgentRuntime({
agentId: 'personal-ai',
userId: userId,
llmApiKey: process.env.ANTHROPIC_API_KEY,
apiKey: process.env.RECALLBRICKS_API_KEY
});
await assistant.chat('What should I work on today?');

Code Assistant
Developer tool that understands your codebase patterns, remembers past reviews, and gives consistent feedback.
const codeBot = new AgentRuntime({
agentId: 'code-assistant',
userId: developerId,
llmApiKey: process.env.ANTHROPIC_API_KEY,
apiKey: process.env.RECALLBRICKS_API_KEY
});
await codeBot.chat('Review this pull request');

The RecallBricks Runtime
The cognitive OS for AI agents
Built for developers who want to ship agents fast.
One API: .chat()
No orchestration code. No retrieval pipelines. Just call agent.chat() and everything is handled.
Your data stays private
Your agent's memory stays private. Zero external logging, full encryption, complete data sovereignty.
Production-ready
Enterprise infrastructure that just works. Sub-50ms latency, always available.
Open source and ready to use
View on GitHub

How it works
Memory that gets smarter with use
The more your agent uses a memory, the more intelligent it becomes — automatically.
You Bring Your LLM
Use any LLM for your agent's reasoning.
RecallBricks works with all of them.
RecallBricks Memory
Your agent saves and retrieves. We automatically enrich memories based on usage.
- Storage: every memory starts here • FREE
- Enriched: retrieved 2+ times
- Expertise: retrieved 5+ times

Claude Haiku & Sonnet enrich memories in the background
Your Agent Gets More Intelligent
Zero configuration. Zero maintenance.
The more it runs, the more intelligent it becomes.
// Just call .chat() — memory is automatic
await agent.chat("I prefer dark mode")
// Later, context is recalled automatically
await agent.chat("What are my preferences?")

How Memories Get Smarter
Simple Pricing
100,000 operations/month included
Then just $0.10 per 100 operations after that. Tier upgrades included.
Common Questions
Does RecallBricks provide the LLM for my agent?
No. You bring your own LLM (OpenAI, Claude, Gemini, etc.) for your agent's reasoning. RecallBricks uses Claude Haiku and Sonnet internally to enrich memories — this is included in your plan.
What counts as an operation?
An operation is any API call: save, search, recall, get, delete, etc. You get 100,000 operations per month included. After that, it's $0.10 per 100 additional operations.
Do tier upgrades cost extra?
No. Tier upgrades are free and included. You only pay extra if you exceed 100,000 operations/month ($0.10 per 100 additional operations).
How do I upgrade a memory's tier?
You don't need to. RecallBricks automatically upgrades memories based on usage. Memories your agent uses frequently become more intelligent automatically. Zero configuration.
What happens to a memory my agent rarely uses?
It stays at Tier 1 (free storage). You're not paying for enrichment you don't need. Only frequently-used memories get enhanced.
Every AI agent I built forgot everything between runs. I got tired of wiring up embeddings, vector DBs, and retrieval logic just to make it remember.
So I built RecallBricks — persistent memory for AI agents that just works. No infrastructure, no embeddings, production-ready in 5 minutes.
Dec 2025
Public Launch
3 SDKs
Python, TypeScript, LangChain
Open Source
All SDKs on GitHub
Developer ecosystem
Build on RecallBricks
Open SDKs, full API access, and integrations you can extend.
Open SDKs
Python, TypeScript, LangChain — fully open, well-documented, ready to extend.
Full API Access
Build custom integrations. Every endpoint is documented and accessible.
Have an integration idea? We want to hear it.
Pricing
Simple, transparent pricing.
Pay for operations. Storage is unlimited and free.
Free
No credit card required
For developers exploring AI memory
- 10,000 operations/month
- Unlimited storage
- 1 agent
- Vector search
- Manual enrichment only
- Community support
Builder
Most popular
For developers building AI agents
- 100,000 operations/month
- Unlimited storage
- 5 agents
- 1,000 Haiku + 100 Sonnet enrichments
- Working memory & goals
- Email support
Pro
For teams in production
- 500,000 operations/month
- Unlimited storage
- 25 agents
- 5,000 Haiku + 500 Sonnet enrichments
- Priority support
- Advanced analytics
Team
For organizations scaling agents
- 2M operations/month
- Unlimited agents
- 20,000 Haiku + 2,000 Sonnet enrichments
- Dedicated support
- Custom integrations
- Team management
Enterprise
For enterprises at scale
- Everything in Team
- Custom operation limits
- SSO & audit logs
- Dedicated account manager
- Custom SLAs
- On-premise option
What counts as an operation?
Save
Each memory saved = 1 operation
Recall
Each search or recall = 1 operation
Enrich
First retrieval extracts metadata = 1 operation (cached forever)
Typical agent: 5,000-20,000 ops/month • Heavy production: 50,000-100,000 ops/month
📊 Predictable pricing
Builder includes 100,000 operations per month. You'll get notified as you approach your limit. Need more? Contact sales to discuss higher-tier plans.
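For a concrete sense of the overage math at the stated rate of $0.10 per 100 additional operations: an agent that uses 130,000 operations in a month is 30,000 over the included 100,000, which is 300 blocks of 100 operations, or $30 in overage for that month.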
Works with your stack
Support
Frequently asked questions
What is RecallBricks Runtime?
RecallBricks Runtime is an npm package that turns any LLM into a production-ready agent. It handles persistent memory, automatic context management, and self-improving intelligence — all with a simple .chat() API.
Which LLMs are supported?
RecallBricks Runtime works with Claude, GPT, Gemini, Ollama, and any OpenAI-compatible API. Switch providers without changing your code.
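As a rough sketch of what switching providers could look like using only the constructor options shown in the quickstart (the OPENAI_API_KEY variable and the assumption that the runtime infers the provider from the key you pass are ours, not documented here):

import { AgentRuntime } from '@recallbricks/runtime';

// Same options as the quickstart; only the LLM key changes.
// Sketch only: how the runtime detects the provider is assumed, not documented here.
const claudeAgent = new AgentRuntime({
  llmApiKey: process.env.ANTHROPIC_API_KEY, // Claude
  apiKey: process.env.RECALLBRICKS_API_KEY
});

const gptAgent = new AgentRuntime({
  llmApiKey: process.env.OPENAI_API_KEY,    // GPT (assumed env var)
  apiKey: process.env.RECALLBRICKS_API_KEY
});

// Application code stays the same regardless of provider.
await claudeAgent.chat('Summarize my open tickets');
await gptAgent.chat('Summarize my open tickets');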
How is this different from LangChain?
LangChain requires you to configure vector databases, build retrieval pipelines, and manage orchestration. RecallBricks Runtime handles all of this automatically — you just call .chat() and everything works.
Do you train on my data?
Never. Your data is yours. We use LLMs to enrich memories, but we never train models on your data.
How long does it take to get started?
About 5 minutes. Install the package with npm install @recallbricks/runtime, add your API keys, and call agent.chat(). Your agent has persistent memory immediately.
What happens if RecallBricks goes down?
We focus on reliability with monitoring and redundancy. If the API is down, your agent falls back to basic recall (no enrichment) until we're back.
How does pricing work?
You pay for operations (save, recall, enrich). Storage is unlimited and free. The Builder plan includes 100,000 operations/month. Most agents use 5,000-20,000 ops/month.
Advanced Options
Need framework integration or manual API control? RecallBricks also provides:
TypeScript SDK
Full API control for custom implementations
Python SDK
Native Python integration
LangChain Integration
Memory for LangChain agents
CLI Tool
Command-line memory management
Security
Your data stays private.
Built with privacy and security as first-class citizens.
Tenant isolation
Your data is isolated and protected. Never shared across accounts.
Encrypted in transit
End-to-end encryption for all API calls.
Zero external logging
No third-party analytics or tracking.
Production-ready
Monitoring and reliability focus. Built for production workloads.
Your Agent Needs Memory.
We Built It.
Stop configuring vector databases and building retrieval pipelines.
RecallBricks Runtime gives your agent persistent memory, automatic context, and self-improving intelligence — in one npm install.
npm install @recallbricks/runtime