Send user and assistant messages. Extract what matters. Optimize your context window automatically.
```python
# pip install getmem-ai
import getmem_ai as getmem

mem = getmem.init("gm_live_...")

# Get context before the LLM call
ctx = mem.get("uid", query=msg)["context"]

# Save both roles after each turn
mem.ingest("uid", messages=[
    {"role": "user", "content": msg},
    {"role": "assistant", "content": reply},
])
```
14-day free trial · No card required
Most apps stuff entire conversation histories into every prompt. getmem.ai builds a compact, query-relevant context instead — sending only what the LLM actually needs, nothing it doesn’t.
Without getmem:

system_prompt
+ full conversation history (5,000–40,000 tokens)
+ user message

With getmem:

system_prompt
+ mem.get(user_id, query) — 200–800 tokens
+ user message
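For a rough sense of the difference, here is a back-of-envelope comparison using the midpoints of the two ranges above (the midpoint figures are illustrative, not measurements):

```python
# Per-call prompt budget, using midpoints of the ranges quoted above.
history_tokens = 22_500  # midpoint of 5,000-40,000
memory_tokens = 500      # midpoint of 200-800

savings = 1 - memory_tokens / history_tokens
print(f"{savings:.1%} fewer prompt tokens per call")
```

At typical per-token prices, that difference compounds on every single call.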
Two calls. That's the entire integration. Everything else happens on our side.
Call mem.ingest() after each conversation turn. Send the raw messages — we handle extraction automatically using an LLM pass.
We extract entities, facts, preferences and decisions. Deduplicated, categorised, stored in a graph + vector hybrid. Async — your users never wait.
Call mem.get() before each LLM call. Returns a compact context string, ranked by relevance, ready to inject into any prompt.
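Wired together, the two calls bracket each conversation turn. A minimal sketch of the loop; `FakeMem` and `call_llm` are illustrative stubs standing in for the real getmem_ai client and your model provider:

```python
class FakeMem:
    """In-memory stand-in mimicking the two-call getmem surface."""

    def __init__(self):
        self.store = {}

    def get(self, user_id, query):
        # The real client returns a compact, relevance-ranked context string;
        # this stub just returns everything it has seen.
        return {"context": "\n".join(self.store.get(user_id, []))}

    def ingest(self, user_id, messages):
        # The real client extracts structured facts asynchronously;
        # this stub just appends the raw messages.
        self.store.setdefault(user_id, []).extend(
            f"{m['role']}: {m['content']}" for m in messages
        )

def call_llm(system, user):
    return f"(reply to: {user})"  # placeholder for your model provider

def chat_turn(mem, user_id, msg):
    ctx = mem.get(user_id, query=msg)["context"]   # before the LLM call
    reply = call_llm(f"What you know:\n{ctx}", msg)
    mem.ingest(user_id, messages=[                 # after the turn
        {"role": "user", "content": msg},
        {"role": "assistant", "content": reply},
    ])
    return reply

mem = FakeMem()
chat_turn(mem, "uid", "I'm allergic to peanuts")
```

On the next turn, `mem.get("uid", query=...)` already knows about the allergy without the full history in the prompt.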
Every time a user has to repeat themselves, you lose them. getmem fixes that in two API calls.
Free to start · No card required · Works with any LLM
The context layer that makes your AI feel genuinely intelligent — not just stateless and forgetful.
Send raw messages. We extract what matters automatically. Your users never have to repeat themselves again.
mem.get() returns context in under 100ms. mem.ingest() runs async — users never wait.
Graph + vector hybrid retrieval. Not just similarity search — we surface relationships, history, and context that raw embeddings miss.
Memory infrastructure is an iceberg. You only see 10% of the work until you're 3 weeks in.
Not just storage. Structured extraction, relationship tracking, and intelligent retrieval — built into two API calls.
14-day free trial, no card required. Then pay only for what you use.
Starts at $0.0002 per call
Free tier at launch: 50,000 calls/month
No retrieval caps · Volume discounts · Full pricing on launch
When you call mem.ingest(), we run a lightweight LLM pass over the conversation to extract structured facts — entities (names, places, products), preferences, stated decisions, and key context. These are stored in both a vector store and a knowledge graph, and deduplicated against existing memories. You never define schemas or write parsing logic.
mem.ingest() is fully asynchronous — the call returns immediately with a job ID and extraction happens in the background. Your response latency is unaffected. mem.get() is synchronous but returns in under 100ms p99, so it's safe to put in the hot path before your LLM call.
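In practice that means ingest is fire-and-forget while get sits inline on the hot path. A sketch of the call pattern with a stub client; the exact job-ID response shape is an assumption for illustration, not the documented schema:

```python
class StubMem:
    """Stub mimicking the latency profile described above (not the real client)."""

    def get(self, user_id, query):
        # Synchronous, designed for the hot path before the LLM call.
        return {"context": "user prefers dark mode"}

    def ingest(self, user_id, messages):
        # Returns immediately; extraction would continue server-side.
        # This response shape is assumed for illustration.
        return {"job_id": "job_abc", "status": "queued"}

mem = StubMem()

# Hot path: fetch context inline, then call your model.
ctx = mem.get("uid", query="settings")["context"]

# After responding: fire-and-forget; no polling of the job ID is required.
job = mem.ingest("uid", messages=[{"role": "user", "content": "hi"}])
```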
mem.get() returns a plain string you inject wherever you want in your prompt — system prompt, user message, tool results. Works with OpenAI, Anthropic, Gemini, or any other provider. Also integrates natively with LangChain and LlamaIndex via our official packages.
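Because the return value is a plain string, injection is just string assembly. A sketch in OpenAI-style chat format; the `ctx` value is an invented example of what mem.get() might return:

```python
ctx = "Name: Dana. Prefers concise answers. Building a Rust CLI."  # illustrative

messages = [
    {
        "role": "system",
        "content": f"You are a helpful assistant.\n\n"
                   f"What you know about this user:\n{ctx}",
    },
    {"role": "user", "content": "How should I structure my argument parser?"},
]
# `messages` can go unchanged to any chat-completions-style API.
```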
mem.get(user_id, query) returns a compact, ranked context string — typically 200–800 tokens — containing only the facts, preferences, and relationships the LLM actually needs right now. Everything else stays in the vault.
Most AI products forget the moment a session ends. getmem stores indefinitely — no TTL, no purge, no expiry.
Imagine an AI companion that knew someone for 20 years. Their quirks. Their history. What they cared about. Their voice. That’s what you can build.
Two API calls. No infrastructure. No configuration.
npm install getmem · github.com/getmem-ai/getmem-js