Level Up OutSystems Agent Memory with Hindsight

Throughout my diverse career, I've accumulated a wealth of experience in various capacities, both technically and personally. The constant desire to create innovative software solutions led me to the world of Low-Code and the OutSystems platform. I remain captivated by how closely OutSystems aligns with traditional software development, offering a seamless experience devoid of limitations. While my managerial responsibilities primarily revolve around leading and inspiring my teams, my passion for solution development with OutSystems remains unwavering. My personal focus extends to integrating our solutions with leading technologies such as Amazon Web Services, Microsoft 365, Azure, and more. In 2023, I earned recognition as an OutSystems Most Valuable Professional, one of only 80 worldwide, and concurrently became an AWS Community Builder.
This article is for agent developers working with ODC Agent Workbench who want to raise the accuracy and quality of their agentic memory. It walks through how memory and grounding is scaffolded by Agent Workbench today, where those defaults fall short, and how Hindsight — an open-source memory system for AI agents — can fill the gaps.
Hindsight is built by Vectorize and released under the MIT license. Vectorize offers Hindsight as a managed cloud service, but it can also be provisioned on-premise or in a private cloud — giving you full control over where your agent's memory lives.
How Memory Works Today in ODC Agent Workbench
When you build an agent in Agent Workbench, a lot happens behind the scenes every time a user sends a message. The agent sends the user's message, along with supporting context, to a Large Language Model (LLM) such as GPT-4 or Claude. The LLM generates a response based on what it receives in that single call. Crucially, the LLM itself has no memory between calls — but Agent Workbench handles that for you.
Out of the box, Agent Workbench gives you two built-in mechanisms that together provide an agent with context: grounding data and conversation memory. This article explains how those mechanisms work, where they fall short in more advanced scenarios, and what you can do about it.
Grounding Data
Grounding data is domain knowledge the agent draws on to produce accurate, contextually relevant answers — product documentation, company policies, or any structured knowledge base your organization maintains.
In the current approach this works as follows:
A search index holds your documents in vectorised form — Agent Workbench supports Azure AI Search and AWS Kendra out of the box, and other search providers can be integrated via a custom connector.
When a new user message arrives, Agent Workbench runs a retrieval query against this index using the message text as the search input.
The top results are injected into the system prompt as context before the LLM call is made.
This is a classic Retrieval-Augmented Generation (RAG) pattern. It works well for factual lookups where the question closely matches indexed content.
Conversation Memory
Conversation memory gives the agent awareness of what has already been said in a session. The current approach stores conversation turns in an Entity:
Each inbound user message is written as a record to an ODC Entity.
Each final agent answer is also written as a record to the same Entity.
Before every new LLM call, the full history of records for the current session is loaded and injected into the message list, so the LLM sees the complete thread.
This is a direct, transparent, and easy-to-debug pattern. The conversation history lives inside your ODC database and is fully under your control.
Where the Default Falls Short
Understanding where this design breaks down requires separating two distinct use cases that have very different memory needs.
Short, Single-Session Interactions
An agent helps a user complete a task within one session: filling in a form, generating a document, answering a product question. The conversation is self-contained and disposable once done.
How the default approach performs here: Well. The Entity-based memory captures the thread accurately, the RAG lookup surfaces relevant domain content, and the session is short enough that injecting the full history doesn't become a token problem.
A quick note on tokens, since they'll come up repeatedly: a token is the unit of text an LLM processes — roughly ¾ of a word. Every LLM call has a maximum number of tokens it can accept as input and produce as output, called the context window. Everything sent to the LLM — the system prompt, the grounding data, the conversation history, and the user's new message — must fit inside that window. When it doesn't, something has to be cut.
Disadvantages here: Minimal. The main gap is that the RAG search is a single-strategy vector lookup, so nuanced or indirect questions — "What is the cancellation policy if I purchased on credit?" — may miss relevant chunks.
Long-Running, Multi-Session Agents with User Context
An agent supports the same user over days, weeks, or months — a personal assistant or a support agent that knows your history. This is where the default approach starts to show real strain.
Disadvantage 1 — Growing context window pressure
Every turn adds records to the Entity. As the history grows, so does the token cost of injecting all prior messages into every LLM call. Long-running users will eventually hit context window limits, and even before that, inference costs climb linearly with conversation length.
Disadvantage 2 — No knowledge consolidation
The history is a raw log. If a user mentioned in session 3 that they prefer TypeScript, and again in session 12, the agent has no consolidated belief about that preference — it has two separate Entity records buried in a growing list. Over time, the signal gets diluted by noise. The agent has to re-read everything to find what it already "knows."
Disadvantage 3 — No cross-session user memory
The Entity stores conversation turns. It doesn't naturally surface what the agent has learned about a user across sessions. There is no built-in mechanism to say "given everything I've seen about this user, they tend to prefer X." That kind of derived, persistent knowledge requires custom logic — custom queries, custom summarisation, custom storage.
Disadvantage 4 — Single-strategy retrieval for grounding
A single vector query against your search index is effective for direct lookups, but it struggles with:
Temporal queries — "What changed in the policy last spring?"
Indirect connections — "What does our refund policy say about orders placed during a promotion?" — two memories that must be linked
Exact-term matching combined with semantic similarity — names, product codes, and natural language in the same question
A single retrieval strategy picks up one kind of match well and misses the others.
Disadvantage 5 — Memory is opaque to the agent
The raw turn-by-turn log can't be queried intelligently by the agent itself at runtime. There is no built-in tool the agent can invoke mid-conversation to look up what it knows about a user. The agent receives what the system prompt gives it and nothing more — it can't go looking for something it suspects it should know.
What Hindsight Is — and Why It Fits
Hindsight (by Vectorize) is a memory system designed specifically for AI agents. Rather than storing raw text or conversation logs, it analyses content at ingestion time and extracts structured memories from it. Those memories are stored in a memory bank and can be retrieved later using four parallel search strategies — semantic similarity, keyword (BM25), graph traversal, and temporal — fused into a single ranked result. This four-strategy retrieval system is called TEMPR.
Hindsight organizes what it knows into four memory types:
World Fact — Objective facts about people, places, and things. Example: "Alice works at Google as a software engineer."
Experience Fact — Personal events and user-specific facts. Example: "I recommended the Enterprise plan to this user during the onboarding call."
Observation — Consolidated, deduplicated beliefs derived from evidence across many facts. Example: "This user consistently prefers concise responses and dislikes jargon."
Mental Model — User-curated summaries for common queries. Example: "Team communication best practices." Mental Models are the highest-priority source during reasoning — checked before Observations and raw facts.
Beyond storage and retrieval, Hindsight provides a Retain API (to write memories), a Recall API (to retrieve them), and a Reflect API (for agent-driven reasoning over memory). Every memory can be tagged at ingest time — for example, user:alice — and retrieval can be scoped to those tags. A single shared memory bank can safely serve multiple users without cross-contamination.
These memory types are introduced in the next section.
Hindsight Memory Concepts Deep Dive
World Facts
World facts are memories about people, places, and things. They capture what is true about entities in the world — who someone is, what something does, how a system works. When you retain your product documentation, company policies, or any reference knowledge, Hindsight's extraction pipeline reads the content and produces world facts from it — statements about things that exist and how they behave.
In an ODC context, retaining content like product documentation, pricing pages, or policy documents will produce world facts such as:
"The standard return period is 30 days for all consumer products"
"The Professional plan supports up to 25 users"
"Two-factor authentication is required for admin accounts"
Experience Facts
Experience facts are personal events and user-specific facts — they capture what happened, who was involved, and when. When you retain a conversation between a user and your agent, Hindsight extracts experience facts from it: what was said, what was decided, what was recommended, and what the user themselves did or experienced.
The key distinction from world facts: experience facts are about occurrences in time, not timeless truths about entities. "The agent recommended the Professional plan to this user during the onboarding call on [date]" is an experience fact — it happened. "The Professional plan supports up to 25 users" is a world fact — it is true independent of any event.
In an ODC context, retaining agent conversations will produce experience facts such as:
"The agent recommended the Professional plan during the onboarding session"
"The user's open ticket #892 was escalated to tier 2 support"
"The user asked about GDPR compliance for EU data residency"
Because experience facts are timestamped events, temporal recall queries like "What did I tell this user last month?" work out of the box. Tagged with user: at retain time, they are scoped to a specific user and never visible in other users' recall queries.
Important: Hindsight classifies memories automatically based on what the content says — you don't specify the type explicitly in the retain call. The same retained conversation can produce both world facts (e.g. something the user revealed about themselves: "Alice works at Google") and experience facts (e.g. what the agent did: "I recommended the Enterprise plan"). The Context parameter is your primary lever for improving extraction quality — passing Context="support conversation" or Context="product documentation" actively shapes how the extraction LLM interprets the content, helping it distinguish meaning that would otherwise be ambiguous. It does not directly control which memory type a fact becomes; that is determined by the content itself.
Observations and Mental Models
Beyond world facts and experience facts, Hindsight provides two higher-order memory types that unlock more sophisticated agent behavior. Observations are consolidated, deduplicated beliefs that Hindsight builds automatically in the background from accumulated facts — things like "this user consistently prefers concise responses." Mental Models are human-authored summaries for common queries that the agent can draw on immediately. During reasoning, Hindsight checks these sources in priority order: Mental Models first, then Observations, then raw facts.
Both are powerful capabilities, but covering them in full is outside the scope of this article. A dedicated tutorial on Observations and Mental Models is planned for the future.
How It All Fits Together
The memory types serve different roles at different points in the agent's request cycle. Understanding when each one is used — and how — turns the individual concepts into a coherent flow.
At the start of every LLM call: World facts and experience facts go into the prompt
When a user sends a message, your agent flow makes two Recall API calls before invoking the LLM, and the results are placed in different parts of the LLM call.
First recall — knowledge base grounding
Query with types ["world", "experience"], filtered by type:source (the tag applied when your knowledge base documents were originally ingested). This surfaces relevant domain knowledge — product docs, policies, FAQs — and replaces the traditional RAG retrieval from your search index. The results are injected into the system prompt as grounding data, sitting alongside the agent's instructions.
Second recall — user interaction history
Query with type ["experience"], filtered by user:<userid>. This surfaces the interaction history specific to the current user — what the agent has previously said, recommended, and resolved with them. These results are appended to the user message rather than the system prompt, placing the personal context closest to the question being asked.
Recall returns results within a token budget (defaulting to 4,096 tokens) rather than a fixed result count — Hindsight fills the budget with the highest-ranked memories, which integrates naturally with the token constraints of your LLM's context window.
At this point the LLM has domain knowledge in its system prompt and relevant personal interaction history appended to the user's message — enough to answer most questions.
During the conversation: Observations and Mental Models via Reflect
For questions that require deeper reasoning over consolidated user knowledge — preferences, long-term patterns, derived beliefs — your agent flow can call Hindsight's Reflect API. Unlike Recall, which returns raw facts for your agent to reason over, Reflect runs its own internal agentic loop: it autonomously gathers evidence, checks Mental Models and Observations in priority order, and returns a synthesized answer with citations. This is where Observations and Mental Models come into play. A full walkthrough of that pattern is covered in the forthcoming Observations and Mental Models tutorial.
After the LLM responds: New memories are retained
Once the agent has produced a response, your flow calls the Retain API to record what just happened — the user's question, the agent's answer, and the context of the interaction. Hindsight extracts the meaningful memories from this content and classifies them automatically. Over time, its background consolidation process builds and updates higher-order memories from the accumulating facts.
One important timing constraint: do not retain and recall in the same turn. Retain is a write operation and the extracted memories are not available for recall immediately. The correct pattern is to retain at the end of a turn and recall at the start of the next one.
This means the agent's memory becomes more comprehensive with every conversation — not because you write custom summarization logic, but because the retain-and-consolidate cycle runs continuously in the background.
Scoping Memory with Tags
A single Hindsight memory bank can hold memories for your entire application — world knowledge, every user's interaction history, every agent's behavior log. What prevents those memories from bleeding into each other is tags.
Every memory retained in Hindsight can carry one or more tags. At recall time, you filter by those tags to control exactly which memories are eligible to be returned. This is the mechanism that makes a shared memory bank safe for multi-user, multi-agent deployments.
The Three Tag Scopes Worth Knowing
Tags are free-form strings. There is no enforced naming convention — but a consistent, hierarchical naming pattern pays off quickly. The three scopes most relevant in an ODC Agent Workbench context are user, agent, and session.
user:
This is the most important tag scope. It binds a retained memory to a specific user and ensures their conversational history is never visible to anyone else. Every interaction a user has ever had with your agent — preferences, decisions, history — stays inside this scope.
Apply this tag when retaining conversations, and use it as a recall filter when querying experience facts for that user. This is the baseline for any agent that serves multiple users from a single memory bank.
Example: user:f0d63c15-dfd2-4e0c-9598-bc96d0240246 or user:alice@example.com
agent:
ODC applications can host multiple agents — a sales agent, a support agent, an onboarding agent. A user interacting with the support agent shouldn't pollute the memory of the sales agent with support-specific context, and vice versa.
Adding an agent: tag alongside the user: tag is a recommended convention for scoping memory to a specific user–agent combination. When recalling experience facts for the support agent, you filter on both user: and agent:support, returning only what the support agent itself has seen. Note that agent: is not a built-in Hindsight scope — it is a naming convention you apply consistently yourself, just like any other tag.
Example: agent:support or agent:onboarding
session:
Session tags have a different character. They are not primarily a privacy boundary — they are a relevance boundary. The current session is always the most relevant context for the active LLM call. Memories from three months ago are still valuable, but they should rank below what happened ten minutes ago.
By tagging retained memories with the current session ID, you can recall the current session's memories with high priority — either by filtering exclusively on the session tag for short-term context, or by combining the session tag with the user tag and adjusting your result mix accordingly.
Session tags are also useful for cleanup. In some use cases, you want to purge or deprioritise old session memories after a session closes, keeping the memory bank lean and focused on durable, cross-session knowledge.
Example: session:sess-20250524-abc123
Combining Tags
Tags compose. A single retain call can carry multiple tags simultaneously:
tags: ["user:f0d63c15-dfd2-4e0c-9598-bc96d0240246", "agent:support", "session:sess-20250524-abc123"]
At recall time, TagsMatch "any_strict" returns memories that match at least one of the specified tags, and excludes memories that carry no tags at all. This is OR logic across the specified tags, with untagged memories always excluded. This means you can design granular recall queries, but be aware of what each filter actually returns:
All memories for this user across all agents — Filter: ["user:f0d63c15-dfd2-4e0c-9598-bc96d0240246"]. Returns memories tagged with user:f0d63c15-dfd2-4e0c-9598-bc96d0240246.
Current session memories only — Filter: ["session:sess-20250524-abc123"]. Returns memories tagged with that session ID.
To match multiple assigned tags set TagsMatch to all_strict.
- All memories for this user and only for the support agent - Filter: ["user:f0d63c15-dfd2-4e0c-9598-bc96d0240246", "agent:support"]
Trade-Offs and Considerations
Hindsight addresses real limitations in the default ODC memory approach, but it is not a free upgrade. Adopting it introduces trade-offs that are worth evaluating before you commit.
Added Architectural Dependency
Hindsight is an external service. Your agent's memory now depends on a third-party API being available, performant, and maintained. This means:
Availability risk. If the Hindsight API is unreachable, your agent loses access to its long-term memory. You need to decide how the agent should behave in that scenario — fall back to the Entity log? Respond without memory context? Refuse to answer?
Latency
Every Retain and Recall call adds network round-trip time to your agent's response cycle. For short, single-session interactions (Use Case A), where the default approach already performs well, this added latency may not be justified. The benefit of Hindsight is most pronounced in Use Case B, where the alternative — loading an ever-growing Entity log — becomes slow in its own right.
Background consolidation of higher-order memory types happens asynchronously, so it doesn't block the agent's response.
Cost
Hindsight adds a cost layer on top of your existing LLM and ODC infrastructure costs. You're paying for API calls (Retain, Recall), storage of memories in the memory bank, and the compute behind multi-strategy retrieval and background memory consolidation. For high-volume applications with many users and long interaction histories, these costs can be meaningful and should be modeled before rollout.
On the other hand, Hindsight can reduce LLM token costs by replacing full conversation history injection with targeted, relevant memory retrieval — so the net cost impact depends on your usage pattern.
Debugging and Transparency
The default ODC approach stores raw conversation turns in an Entity you fully control. You can query it, inspect it, and reason about exactly what the LLM saw. Hindsight's memory extraction is more opaque — the system decides what constitutes a "memory." If the agent behaves unexpectedly, tracing the cause back through extracted facts is harder than reading a raw message log.
Not a Full Replacement for All Scenarios
For Use Case A — short, single-session interactions with straightforward grounding needs — the default ODC approach is simple, fast, and sufficient. Adding Hindsight in these scenarios introduces complexity without proportional benefit. The strongest case for Hindsight is Use Case B, and hybrid approaches (default memory for simple agents, Hindsight for long-lived ones) are perfectly valid.
Attaching a Hindsight Memory Bank to an OutSystems Agent
Now that we understand Hindsight, let's explore how to attach a memory bank to an OutSystems Agent.
You'll find two assets on OutSystems Forge:
Hindsight API - At the time of writing, this connector library offers REST wrapper actions for the Retain and Recall Hindsight operations.
Hindsight Demo Agent - This sample app demonstrates the use of the Hindsight API connector library, illustrating how to modify and extend the default agent flow to integrate with a Hindsight memory bank.
Please install both assets in your environment. We'll examine the implementation details using the Demo Agent.
Prerequisites
Before proceeding make sure that you have access to a Hindsight instance. Vectorize offers a Pay-as-you-go instance, meaning that you only pay for usage. See pricing information here: Pricing — Hindsight Agent Memory | Vectorize.
Create a Memory Bank
In the Hindsight console create a new memory bank
Add Documents to the Memory Bank
Add text content to the memory bank by copying your text or markdown into the Content field. You may optionally set an Event Date, which is crucial if aligning the content temporally is necessary. Also, consider setting a Context. Be sure to check the "Process in background (async)" option.
In the Tags tab, add the tag type:source. This step is crucial as it helps distinguish original information from memory retained during user interactions with the agent.
Finally, click "Add Document" to begin the memory ingestion process. Continue adding more documents using this method.
Configure Agent in ODC Portal
In the ODC Portal, go to the Agents menu. Choose the Hindsight Demo Agent and navigate to the Configuration tab.
Set your Hindsight API endpoint in Consumed REST APIs, either using the Vectorize-hosted API endpoint or your own.
In Settings, enter your API Key and the name of your memory bank.
Haiku Model
The Demo Agent defaults to the Trial Haiku model. You can either add the Trial model to your environment in the ODC Portal or select a different model of your choice.
Implementation
Let's go over the implementation details. Open the Hindsight Demo Agent in ODC Studio and access the AgentFlow action under Logic - Server Actions - AgentFlows.
When creating the demo application, I aimed to maintain the default agent flow as much as possible to facilitate your exploration of the details. The agent flow includes: LoadKnowledge - This action retrieves world and experience facts from the memory bank, filtered by the tag type:source. This ensures that only curated memories, ingested outside the agent flow, are recalled.
BuildMessages - This action is based on the original default with some modifications. Notably, memory bank experiences limited to the current user (filtered by the tag user:GetUserId()) are added to the user message of the prompt.
StoreMemory - Instead of saving the conversation history to a Memory entity, both the user input and the model's response are stored in the memory bank, scoped to the tag user:GetUserId().
LoadKnowledge Action
Inspect the properties of the Recall action.
This action queries the memory bank for world and experience memories, filtered by the tag type:source.
Next, examine how the GroundingData output is constructed by iterating over the results. GroundingData is then incorporated into the system prompt in BuildMessages.
LoadUserExperience Action
This action retrieves all experiences from the memory bank filtered by the tag user:GetUserId(), ensuring only the current user's experiences are recalled. It then constructs the Experiences output, considering only the results up to the specified MaxItems property, which defaults to 5.
User experiences are incorporated into the user message prompt in BuildMessages.
BuildMessages Action
This action constructs the final model prompt. Inspect the individual items, especially
The SystemMessage assignment containing the overall system prompt
The AddGroundingDataToSystemMessage assignment which will contain the recalled memories from LoadKnowledge.
The Experience assignment and ListAppendUserExperience action that adds the user experiences recalled in LoadUserExperience to the model prompt.
StoreMemory Action
This action is straightforward. Instead of writing the user input and model response to a Memory entity, it retains both in the Hindsight memory bank.
Review the properties of both actions, noting that each is limited to the tag user:GetUserId().
Try the Demo Agent
Click the Test agent button in ODC Studio to create a new Test app. Engage in a chat with your ingested documents. Monitor your Hindsight console for new experiences. Debug the agent flow to observe how the prompt is constructed and which memories are recalled from the memory bank.
Summary
This article walked you through the memory architecture behind ODC Agent Workbench and showed you where it works well, where it breaks down, and how Hindsight can address the gaps.
Here's what you've learned:
How Agent Workbench handles memory today. You now understand the two built-in mechanisms — grounding data (RAG-based retrieval from a search index) and conversation memory (turn-by-turn storage in an ODC Entity) — and how they combine to give an LLM context on every call.
Where those defaults fall short. You can distinguish between short, single-session interactions (where the defaults perform well) and long-running, multi-session agents (where growing context windows, lack of knowledge consolidation, absent cross-session memory, single-strategy retrieval, and opaque memory become real problems).
What Hindsight is and how it works. You now know that Hindsight extracts structured memories at ingestion time, classifies them into four types — World Facts, Experience Facts, Observations, and Mental Models — and retrieves them using four parallel search strategies (semantic, keyword, graph, and temporal) fused via its TEMPR system.
How memory fits into the agent request cycle. You learned the pattern: Recall domain knowledge and user history before the LLM call, optionally use Reflect for deeper reasoning during the conversation, and Retain new memories after the LLM responds — with the constraint that retained memories are not available for recall in the same turn.
How tags scope memory safely. You understand how
user:,agent:, andsession:tags let a single shared memory bank serve multiple users and agents without cross-contamination, and how tag filters compose at recall time.The trade-offs of adopting Hindsight. You can now weigh the benefits against the costs: added architectural dependency, latency from API round-trips, additional spend on API calls and storage, reduced transparency compared to a raw Entity log, and the fact that simpler agents may not need it at all.
How to wire it up in practice. You walked through the Hindsight Demo Agent available on OutSystems Forge — configuring the memory bank, loading knowledge, building the prompt with grounding data and user experiences, storing new memories after each turn, and testing the result.
You should now have enough context to evaluate whether Hindsight is the right fit for your agent, and a concrete starting point for integrating it if it is.
Thank you for reading. I hope you found the information helpful and gained new insights on the topic. Hindsight has much more to offer, and I plan to write another article soon, covering more advanced techniques.





