The current AI paradigm is inherently probabilistic. What that means is that AI-powered web agents fail – a lot.[1] And they fail in unpredictable ways. It is therefore extremely hard to build and commercialize products around AI. Yet LLMs are incredibly powerful. How do you harness their power while taming their unpredictability?
This is the problem that HDR set out to solve. We’re building a framework for launching web agents to automate arbitrary tasks – everything from complex research projects to AI-powered e-commerce. The cornerstone of our framework is an entirely novel product concept and a new contribution to the agent developer’s toolkit: the Collective Memory Index (CMI). The goal of the CMI is to offer developers an off-the-shelf solution to the agent reliability problem by feeding their agents “memories” of how to complete complex workflows. How do we do this? There are three steps: 1) observe; 2) record; 3) recall.
First, we observe agents using a browser to accomplish the tasks that their users set for them. When they succeed, we record the set of actions they took to reach the desired end state as a memory. Each memory takes the shape of a set of state-action pairs, mapped to the web pages through which the AI navigated toward its goal. In cases where AIs persistently fail to accomplish their goals, we allow a human user to drive the browser to generate the necessary memory. Last, we make those memories searchable at web scale, so that any agent that needs to accomplish the same or a substantially similar task can recall them. It is in this sense that the memories are collective. If no memory exists for a given website, we search for sites with a similar structure and offer their memories to agents as cognates.
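The record and recall steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not HDR's implementation: the names `Step`, `Memory`, and `MemoryIndex`, the set-of-features "structure" fingerprint, the Jaccard similarity measure, and the 0.5 threshold are all hypothetical, and tasks are matched by exact name for simplicity even though the real index is described as handling substantially similar tasks too.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Step:
    """One state-action pair: the page the agent was on and what it did there."""
    page_url: str
    action: str


@dataclass
class Memory:
    """A successful run: the steps that reached the goal for one task on one site."""
    site: str
    task: str
    steps: list
    # Hypothetical fingerprint of the site's structure (e.g. a set of page features).
    structure: frozenset = field(default_factory=frozenset)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Similarity of two feature sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


class MemoryIndex:
    """Toy in-memory index: record successful runs, recall them later."""

    def __init__(self, min_similarity: float = 0.5):
        self._memories = []
        self.min_similarity = min_similarity  # assumed cutoff for a "cognate"

    def record(self, memory: Memory) -> None:
        self._memories.append(memory)

    def recall(self, site: str, task: str, structure: frozenset) -> Optional[Memory]:
        # 1) Prefer a memory recorded on the same site for the same task.
        for memory in self._memories:
            if memory.site == site and memory.task == task:
                return memory
        # 2) Otherwise fall back to the most structurally similar site (a cognate).
        best, best_score = None, self.min_similarity
        for memory in self._memories:
            if memory.task != task:
                continue
            score = jaccard(memory.structure, structure)
            if score >= best_score:
                best, best_score = memory, score
        return best
```

In this sketch, an agent visiting a shop it has never seen would still recall a checkout memory from another shop whose structure fingerprint overlaps enough, and would get `None` when nothing comparable has been recorded.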
In short, every agent that uses our product contributes memories of successful actions that future agents can recall, and those agents, in turn, contribute more. This generates a profound network effect: the more people use our product, the better it becomes, which draws in more users, and so on. It also puts us in a unique position within the AI ecosystem. We do not compete with state-of-the-art models. Better models simply make our product better.
The CMI also solves other problems that contribute to customer churn. Because developers can use state-of-the-art models to pathfind towards a goal and record a memory of the successful action sequence, they can subsequently employ smaller, cheaper, more specialized models to accomplish the same complex tasks. Additionally, the CMI can guide agents along the shortest path from their starting point to their ending point, reducing latency. In other words, using the Collective Memory Index, developers can launch web agents that reliably perform complex tasks at lower cost and with lower latency.
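One way to picture the shortest-path idea is to merge recorded memories into a graph of page transitions and search it for the route with the fewest actions. The sketch below uses a plain breadth-first search. How the CMI actually computes paths is not stated here, so the function name `shortest_action_path` and the `(from_page, action, to_page)` transition format are assumptions made purely for illustration.

```python
from collections import deque


def shortest_action_path(transitions, start, goal):
    """Return the shortest list of actions leading from start to goal.

    transitions: iterable of (from_page, action, to_page) tuples, assumed to be
    pooled from previously recorded memories. Returns None if goal is unreachable.
    """
    # Build an adjacency list: page -> [(action, next_page), ...]
    graph = {}
    for src, action, dst in transitions:
        graph.setdefault(src, []).append((action, dst))

    # Breadth-first search visits pages in order of distance from start,
    # so the first time we reach the goal we have used the fewest actions.
    queue = deque([start])
    came_from = {start: None}  # page -> (previous_page, action) or None for start
    while queue:
        page = queue.popleft()
        if page == goal:
            actions = []
            while came_from[page] is not None:
                previous, action = came_from[page]
                actions.append(action)
                page = previous
            return actions[::-1]
        for action, next_page in graph.get(page, []):
            if next_page not in came_from:
                came_from[next_page] = (page, action)
                queue.append(next_page)
    return None


# Two recorded routes to the same product page; the search route is shorter.
recorded = [
    ("home", "open search", "search"),
    ("search", "click result", "product"),
    ("home", "open category", "category"),
    ("category", "open subcategory", "subcategory"),
    ("subcategory", "click item", "product"),
    ("product", "add to cart", "cart"),
]
route = shortest_action_path(recorded, "home", "cart")
```

Fewer actions is only a proxy for lower latency. A weighted search over measured page-load or action times would be a natural refinement, but it is beyond what the text describes.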
1. Benchmarking of state-of-the-art LLMs suggests that they fail at complex tasks approximately 86% of the time.