February 26, 2025

How Decagon builds customer support AI agents that really work

If you’ve looked into AI for customer support, you’ve probably heard of Decagon: they’ve raised over $100M and built a product used by the likes of Rippling, Notion, Duolingo, and Eventbrite. I sat down with Jesse, Decagon’s CEO and co-founder, to talk about how they actually build customer support AI agents, from a logic flows framework to testing and evals.

On the role of human agents in a customer support world that’s driven by AI:

“We’ve given a lot of thought to this, because we believe that human agents are going to be a core part of how our product gets used. One thing we’ve seen is that there’s a lot of new work that emerges: for example, supervising the AI, QA-ing and sampling conversations. There will also probably be people whose job it is to build logic into the AI, using our logic framework.”

On how deploying customer support agents relies on figuring out logic flows:

“A lot of the value that we’ve built is in handling logic flows. For example, let’s say you’re working with a credit card company and the flow is hey, I lost my card, I need a new one. You probably have a lot of steps involved: looking up their account, confirming the card, figuring out their address…all this stuff that needs to go in there. We provide the building blocks for you to teach the AI to do this in a very reliable way, and give them something they can run within a couple of days.”
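To make the idea concrete, here’s a minimal sketch of what a step-based logic flow for the lost-card example could look like. The `Flow`/`Step` structure and the handler functions are hypothetical illustrations, not Decagon’s actual framework, which isn’t public.

```python
# Hypothetical sketch of a "lost card" logic flow. All names here
# (Step, Flow, the handlers) are illustrative assumptions.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    name: str
    run: Callable[[dict], dict]  # takes conversation context, returns updates

@dataclass
class Flow:
    name: str
    steps: list[Step] = field(default_factory=list)

    def execute(self, context: dict) -> dict:
        # Run each step in order, threading the context through.
        for step in self.steps:
            context.update(step.run(context))
        return context

def look_up_account(ctx: dict) -> dict:
    # In a real deployment this would call the company's own systems.
    return {"account_id": f"acct-for-{ctx['customer_email']}"}

def confirm_card(ctx: dict) -> dict:
    # Confirm which card the customer means (e.g., by last four digits).
    return {"card_confirmed": True}

def collect_shipping_address(ctx: dict) -> dict:
    # Verify the address on file before reissuing the card.
    return {"address_verified": True}

lost_card_flow = Flow(
    name="lost_card_replacement",
    steps=[
        Step("look_up_account", look_up_account),
        Step("confirm_card", confirm_card),
        Step("collect_shipping_address", collect_shipping_address),
    ],
)

result = lost_card_flow.execute({"customer_email": "jane@example.com"})
print(result)
```

The appeal of building blocks like these is that each step is deterministic and auditable, so the AI follows a defined path rather than improvising a multi-step process.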

On what needs to improve in state-of-the-art models:

“For text, the thing that we really care about for models is how good they are at following instructions. If we had a model that was perfect here, that would be great for our space, because most of what we do is describing what needs to be done and having the AI listen. On voice, the problem right now is latency. If you’re on chat and the model responds within 5-6 seconds, that’s fine, but on voice it’s not.”

On how they test model improvements before they hit production:

“This is a pretty well-researched problem. The first thing we do is standard evals and regression testing, where we simulate things and make sure the answers are what we’d expect. Another thing we do is red teaming, where we create malicious tests to see if things break. The nice thing about our space is that it’s easy to incrementalize things. So we can roll a change out to a small percentage of our users; it’s not an all-or-nothing thing.”
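Two of the techniques Jesse describes, regression evals over simulated conversations and percentage-based rollouts, are standard practice, and here’s an illustrative sketch of both. The eval cases, the `classify_intent` stand-in, and the hash-bucket rollout scheme are assumptions for the example, not Decagon’s implementation.

```python
# Illustrative sketch: (1) a regression eval that replays simulated inputs
# against expected answers, and (2) a percentage-based rollout gate.
# Everything here is a hypothetical stand-in.

import hashlib

# (1) Regression eval: replay canned inputs and compare to expectations.
EVAL_CASES = [
    {"input": "I lost my card", "expected_intent": "lost_card_replacement"},
    {"input": "What's my balance?", "expected_intent": "balance_inquiry"},
]

def classify_intent(text: str) -> str:
    # Stand-in for the model under test.
    return "lost_card_replacement" if "lost" in text.lower() else "balance_inquiry"

def run_regression_suite() -> bool:
    failures = [c for c in EVAL_CASES
                if classify_intent(c["input"]) != c["expected_intent"]]
    return not failures

# (2) Incremental rollout: deterministically bucket users so a change
# reaches only a small, stable percentage of traffic.
def in_rollout(user_id: str, percent: int) -> bool:
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < percent

if run_regression_suite():
    for uid in ["user-1", "user-2", "user-3"]:
        variant = "new_model" if in_rollout(uid, 5) else "current_model"
        print(uid, "->", variant)
```

Hashing the user ID rather than picking users at random keeps each user in the same bucket across sessions, which is what makes a gradual rollout measurable.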