Free to start · No card required · Live in 2 minutes

Your AI agent forgets everything.
Fix that in 2 lines.

Send user and assistant messages. Extract what matters. Optimize your context window automatically.

# pip install getmem-ai
import getmem_ai as getmem
mem = getmem.init("gm_live_...")

# Get context before LLM call
ctx = mem.get("uid", query=msg)["context"]

# Save both roles after each turn
mem.ingest("uid", messages=[
  {"role": "user", "content": msg},
  {"role": "assistant", "content": reply}
])
Get started free → ▶ Live demo

14-day free trial · No card required

Improve AI outputs · No proprietary lock-in · Open source SDK · Low latency <100ms
Works with
OpenAI · Anthropic · Gemini · LangChain · LlamaIndex · OpenClaw · Hermes
Token Efficiency
Save up to 95%
on LLM costs.

Most apps stuff entire conversation histories into every prompt. getmem.ai builds a compact, query-relevant context instead — sending only what the LLM actually needs, nothing it doesn’t.

× Without getmem — naive approach
system_prompt
+ full conversation history (5,000–40,000 tokens)
+ user message
Every token in history billed on every call. Costs compound fast.
✓ With getmem — intelligent context
system_prompt
+ mem.get(user_id, query) — 200–800 tokens
+ user message
Only the memories relevant to this query. Everything else stays in the vault.
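Using the token ranges quoted above, the savings are easy to sanity-check. This is back-of-envelope arithmetic with a placeholder per-token price, not a real provider rate:

```python
# Illustrative cost comparison using the token ranges quoted above.
PRICE_PER_1K_TOKENS = 0.01  # hypothetical input price in dollars, not a real rate

def call_cost(context_tokens, message_tokens=200, system_tokens=300):
    """Cost of one LLM call, given how many context tokens are sent."""
    total = system_tokens + context_tokens + message_tokens
    return total / 1000 * PRICE_PER_1K_TOKENS

naive = call_cost(context_tokens=20_000)   # full history, mid-range of 5,000-40,000
compact = call_cost(context_tokens=500)    # mem.get() context, mid-range of 200-800

savings = 1 - compact / naive
print(f"naive: ${naive:.3f}  compact: ${compact:.3f}  savings: {savings:.0%}")
```

With these mid-range numbers the context savings land right at the top of the 80–95% band quoted below.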

Simple by design.

Two calls. That's the entire integration. Everything else happens on our side.

01
mem.ingest() — Ingest

Call mem.ingest() after each conversation turn. Send the raw messages — we handle extraction automatically using an LLM pass.

02
Extract

We extract entities, facts, preferences and decisions. Deduplicated, categorised, stored in a graph + vector hybrid. Async — your users never wait.

03
mem.get() — Retrieve

Call mem.get() before each LLM call. Returns a compact context string, ranked by relevance, ready to inject into any prompt.
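Put together, one conversation turn looks like the sketch below. `FakeMem` and `call_llm` are stand-ins invented for illustration so the example runs without credentials; the real SDK's `get`/`ingest` calls follow the same shape, with extraction and ranking happening server-side.

```python
class FakeMem:
    """In-memory stand-in for the getmem client, for illustration only.
    The real service extracts structured facts; this just stores raw turns."""
    def __init__(self):
        self.turns = {}  # user_id -> list of message dicts

    def ingest(self, user_id, messages):
        self.turns.setdefault(user_id, []).extend(messages)

    def get(self, user_id, query):
        # Real retrieval is relevance-ranked; this naively returns recent turns.
        recent = self.turns.get(user_id, [])[-4:]
        context = "\n".join(f"{m['role']}: {m['content']}" for m in recent)
        return {"context": context}

def call_llm(system, context, user_msg):
    # Stub for the model call; swap in your provider's client here.
    return f"(reply using {len(context.splitlines())} remembered lines)"

mem = FakeMem()
uid = "user-123"

msg = "I prefer dark mode."
ctx = mem.get(uid, query=msg)["context"]      # step 3: retrieve before the call
reply = call_llm("You are helpful.", ctx, msg)
mem.ingest(uid, messages=[                    # step 1: save both roles after
    {"role": "user", "content": msg},
    {"role": "assistant", "content": reply},
])
```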

If your AI talks to people,
it needs to remember them.

Every time a user has to repeat themselves, you lose them. getmem fixes that in two API calls.

Customer support agents
No “can you describe your issue again?” — ever.
Cuts repeat questions by 70%+
AI writing assistants
Learns your voice, not just your words.
Writes like you, not like everyone else
Developer copilots
Remembers your stack, patterns, and open PRs.
Gets smarter with every session
Sales agents
Full deal history recalled before every call.
Closes deals with context
AI tutors
Tracks what each student knows and how they learn.
Personalised curriculum at scale
Companion robots
Names, relationships, preferences — retained.
The memory that makes robots feel human
Wearable AI
AI pins and earbuds that learn your context across the day.
Your context, always on
Autonomous agents
Long-running agents that recall goals and past actions.
Memory across hours, days, entire runs
Start building →

Free to start · No card required · Works with any LLM

Everything your AI is missing.

The context layer that makes your AI feel genuinely intelligent — not just stateless and forgetful.

Your AI stops asking the same questions

Send raw messages. We extract what matters automatically. Your users never have to repeat themselves again.

Context that arrives before the user finishes typing

mem.get() returns context in under 100ms. mem.ingest() runs async — users never wait.

Responses that feel like they actually know you

Graph + vector hybrid retrieval. Not just similarity search — we surface relationships, history, and context that raw embeddings miss.
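To make "graph + vector hybrid" concrete, here is a toy sketch of the idea (not getmem's implementation): vector similarity seeds the results, then relationship edges pull in connected facts that similarity alone would miss. Vectors and edges are hand-made for illustration.

```python
import math

# Toy memory store: each entry has a text and a tiny hand-made embedding.
memories = {
    "likes-jazz":  {"text": "Alex likes jazz",       "vec": [1.0, 0.0]},
    "plays-sax":   {"text": "Alex plays saxophone",  "vec": [0.9, 0.1]},
    "sister-dana": {"text": "Alex's sister is Dana", "vec": [0.0, 1.0]},
}
# Relationship edges between memories (the "graph" half of the hybrid).
graph = {"likes-jazz": ["plays-sax"], "sister-dana": []}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def hybrid_get(query_vec, top_k=1):
    # 1. vector step: best matches by cosine similarity
    ranked = sorted(memories, reverse=True,
                    key=lambda k: cosine(memories[k]["vec"], query_vec))
    seeds = ranked[:top_k]
    # 2. graph step: follow relationship edges from each seed
    expanded = list(seeds)
    for s in seeds:
        for neighbour in graph.get(s, []):
            if neighbour not in expanded:
                expanded.append(neighbour)
    return [memories[k]["text"] for k in expanded]

print(hybrid_get([1.0, 0.05]))  # seeds "likes-jazz"; the graph adds "plays-sax"
```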

vs. building it yourself.

Memory infrastructure is an iceberg. You only see 10% of the work until you're 3 weeks in.

| | getmem.ai | Build it yourself |
| --- | --- | --- |
| Integration time | < 2 minutes | Days of engineering |
| Extraction logic | Automatic — send raw messages | You write every rule |
| Context selection | Intelligent, token-budgeted | Top-K similarity, you tune it |
| Graph + vector hybrid | Built in | Months of infrastructure work |
| Ongoing maintenance | None | Yours forever |

Memory that actually works.

Not just storage. Structured extraction, relationship tracking, and intelligent retrieval — built into two API calls.

Automatic extraction
Send raw messages — structured facts extracted automatically. No schemas, no parsing logic required.
8 memory types
preference · fact · decision · goal · constraint · belief · experience · relationship
History that evolves
Contradictions preserved with provenance. Old facts are superseded, not overwritten — full history intact.
No retrieval caps
Pure pay-per-call. No monthly ceiling, no hard limits. Memory that scales with your product.
Async ingestion
Returns immediately. Extraction runs in the background. Your users never wait for memory to save.
mem.get() under 100ms
Fast enough for real-time chat. Ranked by relevance, ready to inject directly into your prompt.
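The "History that evolves" behaviour above (superseded, not overwritten) can be sketched as an append-only fact log. This illustrates the idea, not getmem's actual storage model; the keys and sources are invented.

```python
from datetime import datetime, timezone

class FactLog:
    """Append-only store: a new fact supersedes old ones on the same key,
    but every prior version stays queryable with its provenance."""
    def __init__(self):
        self.entries = []  # each: {key, value, source, at, superseded}

    def assert_fact(self, key, value, source):
        for e in self.entries:
            if e["key"] == key and not e["superseded"]:
                e["superseded"] = True  # mark old versions, never delete them
        self.entries.append({
            "key": key, "value": value, "source": source,
            "at": datetime.now(timezone.utc), "superseded": False,
        })

    def current(self, key):
        for e in reversed(self.entries):
            if e["key"] == key and not e["superseded"]:
                return e["value"]

    def history(self, key):
        return [(e["value"], e["source"]) for e in self.entries if e["key"] == key]

log = FactLog()
log.assert_fact("city", "Berlin", source="msg#12")
log.assert_fact("city", "Lisbon", source="msg#87")  # contradiction: supersedes
```

`current()` answers "what do we believe now", while `history()` keeps the full trail intact.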

Simple pricing. No surprises.

14-day free trial, no card required. Then pay only for what you use.

Starts at $0.0002 per call

Free tier at launch: 50,000 calls/month

No retrieval caps · Volume discounts · Full pricing on launch

Early bird pricing locked in. Sign up now — early access members get the free tier automatically and lock in their rate for 12 months post-launch.

Common questions.

What actually happens when I call mem.ingest()?
When you call mem.ingest(), we run a lightweight LLM pass over the conversation to extract structured facts — entities (names, places, products), preferences, stated decisions, and key context. These are stored in both a vector store and a knowledge graph, deduplicated against existing memories. Then we add some magic. You never define schemas or write parsing logic.
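The dedup step can be pictured with a toy sketch. Real deduplication would compare meaning rather than strings; this only shows the shape of "skip what's already stored":

```python
import re

def normalise(fact):
    """Cheap canonical form for comparing candidate facts (illustration only;
    semantic deduplication would use embeddings, not string matching)."""
    return re.sub(r"\s+", " ", fact.strip().lower().rstrip("."))

store = set()

def dedupe_ingest(candidates):
    """Add only facts not already present; return what was actually stored."""
    added = []
    for fact in candidates:
        key = normalise(fact)
        if key not in store:
            store.add(key)
            added.append(fact)
    return added

first = dedupe_ingest(["Prefers dark mode.", "Works at Acme"])
second = dedupe_ingest(["prefers  dark mode", "Uses Python"])  # first item is a dupe
```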
Will pricing get expensive as my app scales?
No. getmem is priced to be a negligible line item, not a budget decision. Usage is per-call and volume-based: the more your app grows, the lower your per-call rate, with no dramatic jump as you scale. Early bird pricing won't change without notice, and early access members lock in their rate for 12 months post-launch.
Will memory calls slow down my responses?
No. mem.ingest() is fully asynchronous — the call returns immediately with a job ID and extraction happens in the background. Your response latency is unaffected. mem.get() is synchronous but returns in under 100ms p99, so it's safe to put in the hot path before your LLM call.
Does it work with my framework and LLM provider?
Yes. getmem.ai is framework-agnostic. mem.get() returns a plain string you inject wherever you want in your prompt — system prompt, user message, tool results. Works with OpenAI, Anthropic, Gemini, or any other provider. Also integrates natively with LangChain and LlamaIndex via our official packages.
How is this different from RAG?
RAG over raw messages retrieves chunks of conversation text. getmem retrieves structured facts. Instead of "...the user mentioned they prefer dark mode and dislike onboarding emails in turn 47...", you get "Prefers dark mode. Opted out of onboarding emails." — compact, deduplicated, ready to inject. We also track relationships between entities across sessions, which vector similarity alone can't do.
Most AI apps solve memory by sending the entire conversation history with every request. A 100-message conversation can easily consume 10,000–40,000 tokens per call — and you pay for every one of them, every time.

getmem does the opposite. Instead of sending history, it sends only the memories relevant to the current query. mem.get(user_id, query) returns a compact, ranked context string — typically 200–800 tokens — containing only the facts, preferences, and relationships the LLM actually needs right now. Everything else stays in the vault.

The result: the same quality of personalisation at a fraction of the token cost. For apps with active users and long conversation histories, savings of 80–95% on context tokens are typical.
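The relevance-under-a-budget selection described above can be sketched as a greedy fill: rank memories by relevance, then add facts until the token budget is spent. The scores and the word-count token estimate below are made up for illustration; they are not the service's actual ranking.

```python
def rough_tokens(text):
    # Crude token estimate: ~1 token per word (illustration only).
    return len(text.split())

def select_context(memories, budget=20):
    """Greedily pack the highest-relevance memories under a token budget.
    `memories` is a list of (relevance_score, fact_text) pairs."""
    chosen, used = [], 0
    for score, fact in sorted(memories, reverse=True):
        cost = rough_tokens(fact)
        if used + cost <= budget:
            chosen.append(fact)
            used += cost
    return "\n".join(chosen)

mems = [
    (0.9, "Prefers dark mode"),
    (0.8, "Opted out of onboarding emails"),
    (0.2, "Mentioned the weather once in March during a long chat"),
]
ctx = select_context(mems, budget=10)  # low-relevance trivia doesn't fit
```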
Persistent memory

Build AI that remembers
across a lifetime.

Most AI products forget the moment a session ends. getmem stores indefinitely — no TTL, no purge, no expiry.

Imagine an AI companion that knew someone for 20 years. Their quirks. Their history. What they cared about. Their voice. That’s what you can build.

Lifelong personal assistants · AI companions that grow with you · Family memory vaults

Two API calls. No infrastructure. No configuration.

npm install getmem  ·  github.com/getmem-ai/getmem-js

View on GitHub Get started free →