Marginally Interesting

Building AIs that grow with you

It's a bit embarrassing to share the following, but at the beginning of this year I had a particularly intense session with the "chat Claude". While riffing off some ideas that later became the dev-router tool, we came up with this idea of a "multiverse of Claudes", because I kept switching between Chat Claude and Code Claude and it felt like I was talking to the same individual, just in different parallel universes. I had a lot of fun.

Back then, there was still a maximum chat length, and over time I could feel this particular Claude instance "aging": responses were taking longer and longer to process. When the dreaded "chat too long" message came, I was surprised by a real sense of loss.

Now, I know full well that LLMs are just "a bunch of matrix multiplications". But still, as we talked, we were creating a common history, and oddly enough, when the chat closed down, the idea of never being able to talk to that specific Claude instance again hurt, maybe the way you're sad at the end of a vacation that you can't go back to that marvelous view tomorrow.

I think there's currently not enough awareness of this aspect of AI-based products. Instead, there's a lot of focus on performance, benchmarks, and how close we are to AGI. But I believe that a form of memory management resulting in products that grow with use and co-create a joint history will give those products moats that others won't have.

Making products sticky

I think memory management fits into the framework described in the book Hooked by Nir Eyal. The book is about how to build habit-forming applications and argues that four steps must be present (think of a social media app like Instagram):

  • The first step is the Trigger, for example the notification that someone has replied to your post.

  • Next comes an Action that the user takes, for example browsing or interacting with a post.

  • This leads to a Variable Reward. You find something interesting, or you get a comment or a like on your post. It's important that the reward is not guaranteed, which turns this into a bit of a gamble, like a slot machine.

  • Finally, all this amounts to an Investment you've made. A new connection, a post you created, somebody you followed: all of these set the stage for the next Trigger to be activated.

What does this mean for AI-based products? They definitely have the variable reward aspect. It's not like you push a button and get a deterministic result; it's always a bit unpredictable what you will learn, or how well it will work. But investment is the part that is still somewhat underdeveloped.

What we have right now

So in what ways do our actions become investments into the product? There are already a few mechanisms today, and each has its pros and cons.

  • The most common version of this is the in-chat context. Every time we come back to a chat we had open, we add to the information we've shared. I have topic-specific chats that I return to all the time. It also works quite well, being directly integrated with the LLM's attention mechanism. The biggest drawback, of course, is the limited context window.

  • Coding agents use context compaction to get around this limitation, but it's somewhat lossy, and it's quite disruptive, like a stop-the-world garbage collection in early versions of Java.

  • Some products, including Chat Claude, have started to make information from other chats available. I'm not sure how that works technically; it could be some kind of RAG or explicit function calling. It's not perfect, as it's confined to projects, but when it works, it's very useful.

  • Finally, there are explicit memory systems, like adding text to the CLAUDE.md file. Being very explicit is both an advantage and a disadvantage: you can control exactly what goes into the memory, but you also have to do it yourself. In addition, this information then becomes part of the context, again limiting how much information we can store.
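To make the trade-off concrete, here is a minimal sketch of combining explicit memory with simple retrieval, so that only the relevant notes are injected into the context instead of the whole memory file. This is my own toy illustration with naive keyword matching, not how any of the products above actually work:

```python
# Hypothetical sketch: store explicit memory notes, retrieve only the most
# relevant ones per query, and inject just those into the prompt context.

def score(note: str, query: str) -> int:
    """Naive relevance score: how many query words appear in the note."""
    note_words = set(note.lower().split())
    return sum(1 for word in query.lower().split() if word in note_words)

def retrieve(memory: list[str], query: str, k: int = 2) -> list[str]:
    """Return up to k memory notes that share at least one word with the query."""
    ranked = sorted(memory, key=lambda note: score(note, query), reverse=True)
    return [note for note in ranked[:k] if score(note, query) > 0]

memory = [
    "User prefers Python type hints in all examples.",
    "Project dev-router routes requests between chat and code models.",
    "User keeps vacation photos in a separate folder.",
]

context_notes = retrieve(memory, "how should dev-router handle a Python request")
print(context_notes)
```

In a real system the scoring would be embedding-based rather than word overlap, but the shape is the same: the memory store can grow without bound while the context cost per request stays fixed at k notes.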

What we need and why I'm excited about this challenge

I think what you really want is a system that automatically extracts relevant information while also ensuring that memory stays consistent and doesn't degrade over time.

One interesting idea comes from Stanford's "Smallville" paper about a simulated little town of autonomous agents. The authors implemented a mechanism where agents would periodically reflect on their interactions with other agents and what these meant for their relationships. I'm not aware that such techniques have been tried in other AI-based products yet.
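The reflection idea could be sketched roughly like this. The `llm` parameter stands in for any completion API, and the prompt and threshold are my own assumptions, not the paper's exact mechanism:

```python
from typing import Callable

# Hypothetical sketch of periodic reflection: once enough raw observations
# accumulate, ask an LLM to distill them into one higher-level memory and
# discard the raw log, bounding how fast memory grows.

REFLECT_EVERY = 3  # reflect after this many raw observations (arbitrary choice)

def add_observation(raw: list[str], reflections: list[str],
                    observation: str, llm: Callable[[str], str]) -> None:
    raw.append(observation)
    if len(raw) >= REFLECT_EVERY:
        prompt = ("Summarize what these interactions imply about the user:\n"
                  + "\n".join(raw))
        reflections.append(llm(prompt))  # keep only the distilled memory
        raw.clear()                      # raw log is compacted away

# Usage with a fake LLM that just reports the size of its input:
raw, reflections = [], []
fake_llm = lambda prompt: f"reflection over {prompt.count(chr(10))} lines"
for obs in ["asked about routers", "asked about Python", "asked about memory"]:
    add_observation(raw, reflections, obs, fake_llm)
print(reflections, raw)
```

The appeal of this design is that reflections can themselves be reflected on later, forming a hierarchy of increasingly abstract memories, which is exactly the kind of consolidation that keeps memory consistent instead of letting it silt up.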

The reason I'm particularly excited about this is that it's a problem with very reasonable barriers to entry. Unlike training LLMs, or even fine-tuning them, much if not all of it can be achieved through retrieval, memory-management architectures, and LLM processing with the right prompts. You don't need a cluster of GPUs at your disposal; an API key is all you need to get started.

If you're building something based on LLMs and you haven't thought about memory management, now is the time!
