Agent Memory with LangChain4j and Oracle AI Database

One of the quickest ways to make an impressive agent demo is to prepare a clever prompt. One of the quickest ways to make that same agent fall apart in production is to give it no durable memory.

In this article, we'll build a small, memory-backed assistant with LangChain4j and Oracle AI Database. The assistant can search prior incidents, runbooks, decisions, and shift handoffs to answer questions. It can also write new memories back to the database so they become searchable in any session. Additionally, all user, agent, and tool messages are logged to a database table for observability and auditing.

Database feature overview
Run the sample
Chat Memory vs Durable Memory
Hybrid retrieval: semantic + full-text search
Lightweight reranking
LangChain4j agent
Memory writeback
Recording user, agent, and tool messages
Why database memory is useful for agents
Code pointers
Where you can take this next

Database feature overview

The agent is built with modern Oracle AI Database features:

persistent JSON memory documents in Oracle AI Database
vector embeddings in a VECTOR column
Oracle Text search over the same JSON document
hybrid ranking that blends semantic and exact-match retrieval
append-only transcript logging by conversation ID

Using these features, the agent (a fictional operations assistant) can answer questions about runbooks, incident reviews, change requests, and shift handoffs from its persistent memory. Because the memory is database-backed, multiple agents in concurrent sessions can access the same data safely.

Run the sample

You will need Java 21+, Maven, Docker, and an OpenAI API key.

From the module root, run the tests:

export OPENAI_API_KEY=<your key>
mvn test

To run the live terminal app, pass your database connection string, user, and password as arguments:

export OPENAI_API_KEY=<your key>
mvn compile exec:java \
  -Dexec.args="jdbc:oracle:thin:@localhost:1521/freepdb1 testuser testpwd"

Once it starts, try prompts like:

What happened during the checkout incident after CHG2145?
Which runbook section should I use for the checkout rollback?
Draft a next-shift handoff and remember it.

Chat Memory vs Durable Memory

Chat memory and durable memory solve different problems. Chat memory keeps the current conversation coherent within a single session. Durable operational memory has different requirements:

it should survive process restarts
it should be queryable across conversations from distributed, concurrent agents
it should support structured metadata like service, environment, incident ID, and change ticket
it should be searchable both semantically and exactly
it should allow writeback when the agent learns something worth preserving

That starts to look a lot more like a database problem than a prompt engineering problem.
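To make the contrast concrete, here is a minimal sketch of what in-process chat memory often amounts to: a bounded window of recent messages. WindowedChatMemory is a hypothetical class written for illustration, not part of the sample or of LangChain4j, and it assumes a simple "keep the last N messages" policy.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for in-process chat memory: a bounded window of recent messages.
final class WindowedChatMemory {
    private final int maxMessages;
    private final Deque<String> window = new ArrayDeque<>();

    WindowedChatMemory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    // Adding a message beyond capacity evicts the oldest one.
    void add(String message) {
        if (window.size() == maxMessages) {
            window.removeFirst();
        }
        window.addLast(message);
    }

    // Snapshot of the messages still in the window, oldest first.
    List<String> messages() {
        return List.copyOf(window);
    }
}
```

Everything in this window lives on the heap of one process. Older messages are evicted, nothing survives a restart, and no other agent can query it, which is why none of the requirements listed above are met by chat memory alone.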

Hybrid retrieval: semantic + full-text search

The MemoryRepository runs two queries, which are fused into one ranked list:

Vector search over the embedding column using cosine distance.
Oracle Text search over the JSON payload using json_textcontains.

Here is the vector query:

select id,
       memory_kind,
       title,
       memory_doc,
       (1 - vector_distance(embedding, ?, COSINE)) as vector_score
from agent_memories
order by vector_score desc, id
fetch first ? rows only

And here is the text query:

select id,
       memory_kind,
       title,
       memory_doc,
       score(1) as text_score
from agent_memories
where json_textcontains(memory_doc, '$', ?, 1)
order by score(1) desc, id
fetch first ? rows only

Pure vector search is often too fuzzy for ticket IDs. Pure text search is often too brittle for paraphrases. Hybrid retrieval handles both.
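The vector_score expression in the vector query above, 1 - vector_distance(..., COSINE), is the cosine similarity of the two vectors, because cosine distance is defined as one minus cosine similarity. The following sketch shows that relationship in plain Java. CosineScore is a hypothetical helper for illustration only; in the sample the database computes the distance.

```java
// Hypothetical helper that mirrors the arithmetic behind vector_score.
final class CosineScore {
    private CosineScore() {}

    // Cosine similarity: dot(a, b) / (|a| * |b|).
    static double similarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Cosine distance is one minus cosine similarity.
    static double distance(float[] a, float[] b) {
        return 1.0 - similarity(a, b);
    }

    // Same shape as the SQL expression: 1 - distance.
    static double vectorScore(float[] a, float[] b) {
        return 1.0 - distance(a, b);
    }
}
```

Vectors pointing the same way score close to 1, and orthogonal vectors score 0, so ordering by vector_score descending returns the nearest memories first.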

Lightweight reranking

Once both branches return hits, MemorySearchRanker merges the results with deterministic weights:

a bonus when the incident ID or change ticket matches directly
a bonus for keyword overlap in the indexed memory text
a combined matchedBy indicator of VECTOR, TEXT, or BOTH

The deterministic ranker could be replaced by an LLM judge or a more complex re-ranking system. For this sample, I kept it intentionally lightweight and low-latency.
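To show the general shape of such a ranker, here is a minimal sketch of deterministic fusion. It is not the sample's MemorySearchRanker: the type names (BranchHit, RankedHit, HybridRankerSketch), the weights, and the bonus values are all assumptions made for illustration, and both branch scores are assumed to be normalized to the range 0 to 1 before fusion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

enum MatchedBy { VECTOR, TEXT, BOTH }

// One row from either search branch; score is assumed normalized to [0, 1].
record BranchHit(long id, String indexedText, String incidentId, String changeTicket, double score) {}

record RankedHit(long id, MatchedBy matchedBy, double score) {}

final class HybridRankerSketch {
    // Illustrative weights only; the sample's real values are not shown in the article.
    static final double VECTOR_WEIGHT = 0.6;
    static final double TEXT_WEIGHT = 0.4;
    static final double ID_BONUS = 0.5;
    static final double KEYWORD_BONUS = 0.05;

    private HybridRankerSketch() {}

    static List<RankedHit> rank(String query, List<BranchHit> vectorHits, List<BranchHit> textHits) {
        Map<Long, BranchHit> docs = new LinkedHashMap<>();
        Map<Long, Double> vectorScores = new LinkedHashMap<>();
        Map<Long, Double> textScores = new LinkedHashMap<>();
        for (BranchHit hit : vectorHits) {
            docs.put(hit.id(), hit);
            vectorScores.put(hit.id(), hit.score());
        }
        for (BranchHit hit : textHits) {
            docs.putIfAbsent(hit.id(), hit);
            textScores.put(hit.id(), hit.score());
        }

        String lowerQuery = query.toLowerCase(Locale.ROOT);
        Set<String> queryTerms = new HashSet<>(Arrays.asList(lowerQuery.split("\\W+")));
        queryTerms.remove("");

        List<RankedHit> ranked = new ArrayList<>();
        for (BranchHit doc : docs.values()) {
            boolean inVector = vectorScores.containsKey(doc.id());
            boolean inText = textScores.containsKey(doc.id());
            MatchedBy matchedBy = (inVector && inText) ? MatchedBy.BOTH
                    : inVector ? MatchedBy.VECTOR : MatchedBy.TEXT;

            // Weighted blend of the two branch scores; a missing branch contributes 0.
            double score = VECTOR_WEIGHT * vectorScores.getOrDefault(doc.id(), 0.0)
                    + TEXT_WEIGHT * textScores.getOrDefault(doc.id(), 0.0);

            // Bonus when the query names the incident ID or change ticket directly.
            if (mentions(lowerQuery, doc.incidentId()) || mentions(lowerQuery, doc.changeTicket())) {
                score += ID_BONUS;
            }

            // Small bonus per distinct query term found in the indexed text.
            long overlap = Arrays.stream(doc.indexedText().toLowerCase(Locale.ROOT).split("\\W+"))
                    .distinct()
                    .filter(queryTerms::contains)
                    .count();
            score += KEYWORD_BONUS * overlap;

            ranked.add(new RankedHit(doc.id(), matchedBy, score));
        }

        // Highest score first; ties broken by id so the order is reproducible.
        ranked.sort((a, b) -> {
            int byScore = Double.compare(b.score(), a.score());
            return byScore != 0 ? byScore : Long.compare(a.id(), b.id());
        });
        return ranked;
    }

    private static boolean mentions(String lowerQuery, String identifier) {
        return identifier != null && !identifier.isBlank()
                && lowerQuery.contains(identifier.toLowerCase(Locale.ROOT));
    }
}
```

Because every step is plain arithmetic over the two result lists, the same inputs always produce the same order, which makes ranking behavior easy to assert in tests.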

LangChain4j agent

The LangChain4j agent implementation is quite small, using a single interface:

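import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import dev.langchain4j.service.V;

public interface OpsMemoryAssistant {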
    @SystemMessage("""
            You are an operations handoff assistant backed by Oracle AI Database memory.
            Use searchMemories when prior incidents, runbooks, handoffs, decisions, or change history are relevant.
            When you rely on memory results, include the references in the form [M123].
            If the user asks you to remember or preserve a new handoff or decision, call storeMemory after drafting it.
            Keep answers concise and operational. Mention incident IDs and change tickets when they matter.
            """)
    @UserMessage("{{message}}")
    String chat(@V("message") String userMessage);
}

That is the right level of abstraction for this sample.

LangChain4j handles chat orchestration and tool wiring. Oracle AI Database handles durable memory, search, and transcript persistence. Each layer is doing the job it is actually good at.

Memory writeback

The sample keeps two memory stores:

a curated durable memory store for retrieval
an append-only transcript for observability and auditing

The sample also writes new durable memories through the storeMemory tool when the user explicitly asks the assistant to preserve a handoff or decision.

That matters because an agent memory system should not just be a read-only archive. If a useful conclusion comes out of a conversation, the system should be able to keep it.

In this sample, writeback creates a new MemoryDocument, generates an embedding, and inserts both the JSON payload and vector into agent_memories. Because the JSON search index is configured with sync (on commit), newly stored handoffs are searchable immediately after commit.

That last detail is important. Delayed indexing is exactly the kind of thing that makes an agent feel unreliable.
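The three writeback steps described above (build the document, embed it, insert both) can be sketched with in-memory stand-ins. Everything here is hypothetical: MemoryDocumentSketch is not the sample's MemoryDocument, the embedder is any function from text to a float array, and an ArrayList stands in for the agent_memories table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical shape; the sample's real MemoryDocument fields are not shown in the article.
record MemoryDocumentSketch(String kind, String title, String body) {}

// One stored row: the document plus its embedding, as in agent_memories.
record StoredMemory(long id, MemoryDocumentSketch document, float[] embedding) {}

final class WritebackSketch {
    private final Function<String, float[]> embedder;
    private final List<StoredMemory> table = new ArrayList<>(); // stands in for agent_memories
    private long nextId = 1;

    WritebackSketch(Function<String, float[]> embedder) {
        this.embedder = embedder;
    }

    StoredMemory storeMemory(String kind, String title, String body) {
        // 1. Create the new memory document.
        MemoryDocumentSketch document = new MemoryDocumentSketch(kind, title, body);
        // 2. Generate an embedding from the text that should be searchable.
        float[] embedding = embedder.apply(title + "\n" + body);
        // 3. Insert the payload and the vector together as one row.
        StoredMemory row = new StoredMemory(nextId++, document, embedding);
        table.add(row);
        return row;
    }

    List<StoredMemory> rows() {
        return List.copyOf(table);
    }
}
```

In the real sample, step 3 is a database insert followed by a commit, and the commit is what makes the new row visible to both the vector and text branches.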

Recording user, agent, and tool messages

With our database connection, it's easy to record chat sessions in the database. To do this with LangChain4j, we implement the ChatMemory interface in the LoggingChatMemory.java class.

Each session gets its own unique conversation ID, and user/agent/tool messages are written to the agent_conversation_log table.

That table captures:

conversation_id
message_seq
role and message type
message text
tool name and tool call ID when relevant
optional JSON context
creation timestamp

The transcript log is kept separate from the curated durable memory store. That distinction tends to get blurred in agent demos. It should not.
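As a rough picture of the append-only log, here is an in-memory sketch that assigns a per-conversation message_seq. LogEntry and TranscriptLogSketch are hypothetical names, only a subset of the columns listed above is modeled, and the sample itself writes to the agent_conversation_log table instead of a list.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical row shape covering a subset of the logged columns.
record LogEntry(String conversationId, int messageSeq, String role, String messageText,
                String toolName, Instant createdAt) {}

final class TranscriptLogSketch {
    private final List<LogEntry> entries = new ArrayList<>();
    private final Map<String, Integer> lastSeq = new HashMap<>();

    // Appends a message; existing entries are never updated or removed.
    synchronized LogEntry append(String conversationId, String role, String text, String toolName) {
        // Sequence numbers start at 1 and increase independently per conversation.
        int seq = lastSeq.merge(conversationId, 1, Integer::sum);
        LogEntry entry = new LogEntry(conversationId, seq, role, text, toolName, Instant.now());
        entries.add(entry);
        return entry;
    }

    // Replays one conversation in the order it was written.
    synchronized List<LogEntry> conversation(String conversationId) {
        return entries.stream()
                .filter(e -> e.conversationId().equals(conversationId))
                .toList();
    }
}
```

Keying the sequence on the conversation ID means a single conversation can be replayed in order without relying on timestamps alone.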

Why database memory is useful for agents

Chat windows and flat files can't scale the same way a database can. A database-backed memory layer gives you:

durable storage
structured metadata
many types of retrieval: semantic, text, relationship, graph, etc.
transactional writes and concurrency
better auditability

Databases can help you progress from agent demos to real applications that effectively utilize agent memory.

Code pointers

If you want to explore the implementation, start here:

README.md -> app overview
OpsMemoryAgentApplication.java -> Main class and agent loop
MemoryRepository.java -> Memory retrieval for text and vector search
MemoryTools.java -> LangChain4j tool bindings to search and store memories
LoggingChatMemory.java -> LangChain4j ChatMemory implementation to log chat interactions
MemoryRepositoryIntegrationTest.java -> test using Oracle AI Database Free and Testcontainers

The tests validate the behavior that matters

The integration tests are worth reading because they verify the actual retrieval patterns we care about:

exact text search finds the checkout incident for CHG2145 and INC4721
vector search finds the same incident from a paraphrased outage description
hybrid fusion marks the strongest result as matched by both channels
a stored handoff can be found on the next combined search

Where you can take this next

If you'd like to extend this sample, here are a few ideas to play with:

Add "forgetting" with recency ranking so newer memories are ranked as more relevant.
Parameterize scoring and filtering mechanisms to make the app more flexible.
Add another agent tool that uses an LLM to judge search results.
Add approval/rejection when storing memories. Maintain a log of failures so the agent knows what not to do.
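As a starting point for the first idea, recency ranking can be sketched as an exponential decay applied to a memory's relevance score. This is one possible approach and not something the sample implements; RecencySketch and the half-life parameter are assumptions for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: halves a score for every halfLife of age.
final class RecencySketch {
    private RecencySketch() {}

    static double decayedScore(double baseScore, Instant createdAt, Instant now, Duration halfLife) {
        if (halfLife.isZero() || halfLife.isNegative()) {
            throw new IllegalArgumentException("halfLife must be positive");
        }
        // Memories dated in the future are treated as brand new.
        double ageMillis = Math.max(0L, Duration.between(createdAt, now).toMillis());
        double halfLives = ageMillis / halfLife.toMillis();
        return baseScore * Math.pow(0.5, halfLives);
    }
}
```

With a 30-day half-life, a memory that is 30 days old keeps half of its score and one that is 60 days old keeps a quarter, so old memories fade gradually instead of disappearing at a hard cutoff.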
