LLM Integration
Membrane is designed to sit between your orchestration layer and model calls. Capture execution traces as typed memory, retrieve a trust-gated graph neighborhood before prompting, then reinforce the records that actually helped.
The 4-Step Pattern
Capture
Store tool outputs, observations, agent turns, and working state with captureMemory. Membrane creates a primary record and can link entities or related records.
Retrieve graph context
Before a model call, use retrieveGraph with a task descriptor and trust context. The response includes ranked root memories and connected graph nodes.
Prompt
Project the graph into concise prompt context. Include record IDs so the model can cite sources and your application can reinforce useful records later.
Reinforce or penalize
After the model output is validated, reinforce useful records or penalize misleading records. Salience changes drive future ranking.
Full TypeScript + OpenAI Example
import OpenAI from "openai";
import {
MembraneClient,
Sensitivity,
SourceKind,
type GraphNode,
} from "@bennettschwartz/membrane";
const memory = new MembraneClient("localhost:9090", {
apiKey: process.env.MEMBRANE_API_KEY,
});
const llm = new OpenAI({
apiKey: process.env.LLM_API_KEY,
// OpenAI-compatible providers are supported here, for example:
// baseURL: "https://openrouter.ai/api/v1",
});
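// Retrieve a trust-gated graph neighborhood for the task before prompting.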
const graph = await memory.retrieveGraph("plan auth service rollout", {
trust: {
max_sensitivity: Sensitivity.MEDIUM,
authenticated: true,
actor_id: "planner-agent",
scopes: ["project-auth"],
},
memoryTypes: ["entity", "semantic", "competence", "working", "plan_graph"],
rootLimit: 10,
nodeLimit: 25,
edgeLimit: 100,
maxHops: 1,
});
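// Project each node into a compact JSON line so the model can cite record ids.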
function formatNode(node: GraphNode): string {
return JSON.stringify({
id: node.record.id,
type: node.record.type,
root: node.root,
hop: node.hop,
confidence: node.record.confidence,
salience: node.record.salience,
payload: node.record.payload,
});
}
const context = graph.nodes.map(formatNode).join("\n");
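// Prompt the model with the projected memory context as evidence.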
const completion = await llm.chat.completions.create({
model: "gpt-5.5",
messages: [
{ role: "system", content: "Use memory context as evidence. Cite record ids." },
{ role: "user", content: `Task: plan auth service rollout\n\nMemory:\n${context}` },
],
});
const answer = completion.choices[0]?.message?.content ?? "";
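// Capture the generated plan so future retrievals can build on it.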
const planCapture = await memory.captureMemory(
{
task: "plan auth service rollout",
answer,
},
{
source: "planner-agent",
sourceKind: SourceKind.AGENT_TURN,
reasonToRemember: "Persist the generated rollout plan and its supporting context",
summary: answer.slice(0, 500),
tags: ["llm", "plan", "auth"],
scope: "project-auth",
sensitivity: Sensitivity.MEDIUM,
}
);
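// Reinforce the captured plan after it passes review, boosting its future ranking.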
await memory.reinforce(
planCapture.primary_record.id,
"planner-agent",
"plan was accepted after review"
);
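// Release the client connection when the workflow completes.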
memory.close();
Capture During Execution
Use the same capture method for all runtime evidence; vary sourceKind and the content shape to match what happened.
// After a tool runs
const toolCapture = await memory.captureMemory(
{
tool_name: "go test",
args: { packages: ["./pkg/auth"] },
result: { exit_code: 0, stdout: "ok ./pkg/auth" },
},
{
sourceKind: SourceKind.TOOL_OUTPUT,
source: "auth-agent",
reasonToRemember: "Successful auth package verification",
summary: "Auth package tests passed",
tags: ["auth", "tests"],
scope: "project-auth",
sensitivity: Sensitivity.LOW,
}
);
// After a useful observation
const observationCapture = await memory.captureMemory(
{
subject: "auth service",
predicate: "uses_retry_policy",
object: { max_attempts: 3, backoff: "exponential" },
},
{
sourceKind: SourceKind.OBSERVATION,
source: "auth-agent",
reasonToRemember: "Retry policy affects debugging and rollout planning",
summary: "Auth service retries requests up to three times",
tags: ["auth", "runtime"],
scope: "project-auth",
sensitivity: Sensitivity.LOW,
}
);
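Agent turns follow the same shape. A minimal sketch, reusing SourceKind.AGENT_TURN from the example above; the content fields here are illustrative, not a required schema:
// After an agent turn
const turnCapture = await memory.captureMemory(
  {
    input: "Should the rollout wait for the retry-policy fix?",
    output: "Yes. Ship the retry fix first, then roll out in two stages.",
  },
  {
    sourceKind: SourceKind.AGENT_TURN,
    source: "auth-agent",
    reasonToRemember: "Decision to sequence the retry fix before the rollout",
    summary: "Agent recommended shipping the retry fix before the rollout",
    tags: ["auth", "decision"],
    scope: "project-auth",
    sensitivity: Sensitivity.LOW,
  }
);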
Retrieve Before Prompting
Pass a concrete task descriptor and bounded graph limits. The root nodes are the highest-signal records; connected nodes add entities, facts, and provenance.
const graph = await memory.retrieveGraph("debug auth retry failures", {
trust: {
max_sensitivity: Sensitivity.MEDIUM,
authenticated: true,
actor_id: "auth-agent",
scopes: ["project-auth"],
},
memoryTypes: ["entity", "semantic", "competence", "episodic"],
rootLimit: 8,
nodeLimit: 20,
edgeLimit: 80,
maxHops: 1,
});
If you only need the most relevant records, project root nodes:
const rootRecords = graph.nodes
.filter((node) => node.root)
.map((node) => node.record);
If entity and provenance context matter, include all graph nodes and edges:
const context = JSON.stringify({
nodes: graph.nodes,
edges: graph.edges,
root_ids: graph.root_ids,
});
Build The Prompt
Use compact JSON lines when prompts need source IDs:
const memoryContext = graph.nodes
.map((node) => {
const record = node.record;
return JSON.stringify({
id: record.id,
type: record.type,
root: node.root,
hop: node.hop,
payload: record.payload,
});
})
.join("\n");
const messages = [
{ role: "system", content: "Use memory context as evidence. Cite record ids." },
{ role: "user", content: `Task: debug auth retry failures\n\nMemory:\n${memoryContext}` },
];
Reinforce On Success
Reinforce records that contributed to a useful answer:
for (const id of graph.root_ids) {
await memory.reinforce(id, "auth-agent", "used in accepted debugging plan");
}
Reduce salience when a record misled the agent, where recordId is the id of the offending record:
await memory.penalize(recordId, 0.2, "auth-agent", "not relevant to this incident");
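Putting both together, a minimal sketch of the post-validation step. It assumes answer is the model output from the prompt step and that isAccepted is your own review check; both names are hypothetical:
// Hypothetical review hook: replace with your application's validation logic.
const accepted = isAccepted(answer);
for (const id of graph.root_ids) {
  if (accepted) {
    await memory.reinforce(id, "auth-agent", "cited in accepted answer");
  } else {
    // Penalizing every root is a blunt default; prefer penalizing only the
    // records the model actually cited when you track citations.
    await memory.penalize(id, 0.2, "auth-agent", "answer rejected in review");
  }
}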
Tips For Effective Integration
Scope Records To Projects
Use scope on capture and trust.scopes on retrieval to isolate memories:
await memory.captureMemory(
{ text: "Frontend build uses pnpm" },
{
sourceKind: SourceKind.OBSERVATION,
scope: "project-frontend",
sensitivity: Sensitivity.LOW,
}
);
const graph = await memory.retrieveGraph("deploy frontend", {
trust: {
max_sensitivity: Sensitivity.LOW,
authenticated: true,
actor_id: "frontend-agent",
scopes: ["project-frontend"],
},
});
Choose Memory Types Per Task
| Task | Recommended types |
|---|---|
| Planning | entity, semantic, competence, plan_graph |
| Debugging | entity, episodic, competence, semantic |
| Context restoration | working, entity, semantic |
| Self-correction | competence, episodic |
| Preference retrieval | entity, semantic |
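One way to apply this table in code is a lookup keyed by task category. The map below is illustrative and not part of the Membrane API:
// Illustrative mapping from task category to recommended memory types.
const MEMORY_TYPES_BY_TASK: Record<string, string[]> = {
  planning: ["entity", "semantic", "competence", "plan_graph"],
  debugging: ["entity", "episodic", "competence", "semantic"],
  context_restoration: ["working", "entity", "semantic"],
  self_correction: ["competence", "episodic"],
  preference_retrieval: ["entity", "semantic"],
};
const planningGraph = await memory.retrieveGraph("plan auth service rollout", {
  trust: {
    max_sensitivity: Sensitivity.MEDIUM,
    authenticated: true,
    actor_id: "planner-agent",
    scopes: ["project-auth"],
  },
  memoryTypes: MEMORY_TYPES_BY_TASK["planning"],
  rootLimit: 8,
  nodeLimit: 20,
  maxHops: 1,
});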
Keep Prompt Context Bounded
Use rootLimit, nodeLimit, edgeLimit, and maxHops instead of retrieving a large flat list. Start with rootLimit: 8, nodeLimit: 20, and maxHops: 1, then tune by prompt size.
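A minimal sketch of tuning by prompt size, shrinking nodeLimit until the serialized context fits a rough character budget; the budget and halving strategy are illustrative:
// Retry retrieval with a smaller node limit until the serialized context
// fits a rough character budget (illustrative numbers).
const CONTEXT_BUDGET_CHARS = 12_000;
let nodeLimit = 20;
let context = "";
do {
  const graph = await memory.retrieveGraph("debug auth retry failures", {
    trust: {
      max_sensitivity: Sensitivity.MEDIUM,
      authenticated: true,
      actor_id: "auth-agent",
      scopes: ["project-auth"],
    },
    rootLimit: 8,
    nodeLimit,
    maxHops: 1,
  });
  context = graph.nodes.map((node) => JSON.stringify(node.record)).join("\n");
  nodeLimit = Math.floor(nodeLimit / 2);
} while (context.length > CONTEXT_BUDGET_CHARS && nodeLimit > 0);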
LLM-Backed Interpretation
captureMemory uses a separate ingest-side interpreter configuration. Enable it when you want capture-time summaries, mentions, relation candidates, entity resolution, and proposed linked facts.
backend: postgres
postgres_dsn: "postgres://membrane:membrane@localhost:5432/membrane?sslmode=disable"
ingest_llm_enabled: true
ingest_llm_endpoint: "https://api.openai.com/v1/chat/completions"
ingest_llm_model: "gpt-5-mini"
# ingest_llm_api_key: "" # or set MEMBRANE_INGEST_LLM_API_KEY
The background consolidation scheduler is configured separately through the llm_* settings. Set those when you want asynchronous episodic-to-semantic extraction:
llm_endpoint: "https://api.openai.com/v1/chat/completions"
llm_model: "gpt-5-mini"
# llm_api_key: "" # or set MEMBRANE_LLM_API_KEY
The consolidation scheduler promotes eligible episodic traces into semantic facts, competence records, and plan graphs over time.
Embedding-Backed Retrieval
With Postgres + pgvector and an embedding endpoint configured, graph roots are ranked with hybrid vector and salience scoring.
backend: postgres
postgres_dsn: "postgres://membrane:membrane@localhost:5432/membrane?sslmode=disable"
embedding_endpoint: "https://api.openai.com/v1/embeddings"
embedding_model: "text-embedding-3-small"
embedding_dimensions: 1536
# embedding_api_key: "" # or set MEMBRANE_EMBEDDING_API_KEY
Specific task descriptors improve retrieval. Prefer "debug auth retry failures in project-auth" over "help".