ROI For AI – Prosus Toqan – Talk of Many Things

Here’s an excellent ROI analysis and computation for enterprise AI software, copy-pasted from the Eye on AI newsletter dated 8 January 2025 via Jeremy Kahn (@jeremyakahn) @FORTUNE (@fortunemagazine).

Prosus Toqan System
One of the people I spoke to was Euro Beinat, executive vice president and global head of AI and data at Prosus, the Netherlands-based technology investment firm whose portfolio includes dozens of tech startups worldwide. I like talking to Beinat because Prosus’s diverse portfolio gives him a good vantage point from which to gauge how AI is being adopted across different kinds of companies, from food delivery apps to ecommerce plays to fintechs, and across different job functions. I also like speaking to him because he is unusually candid about what has worked, and what hasn’t.

Beinat said that in the past year, Prosus has rolled out its Toqan AI system to 25,000 employees across its various portfolio companies. Employees can use Toqan to do everything from answering questions about employment policies to drafting marketing surveys to assisting human customer support agents in finding the right documentation to answer customer queries. One of Prosus’s companies, OLX Poland, has created an agentic AI system, called OLX Magic, that helps walk sellers through the process of posting a listing or, in the case of a potential buyer, helps them shop, letting them specify what they are looking for in natural language and have a “conversation” about the options with an AI chatbot, rather than using a traditional search.

Using multiple models and an “agentic workflow”
Of course, one of the things that has held AI adoption in business back, as Marcus rightly points out, is reliability. Few business use cases can tolerate the 10% to 25% level of inaccuracies many large language models (LLMs) generate if used without any other interventions. Through an iterative process of improvement (including building better guardrails and updating the AI models it was using), Prosus gradually brought Toqan’s hallucination rate down from 10% in 2022 to 2.5%. But to get it down further, Beinat says, Prosus had to change how the entire system is engineered to build a more “agentic workflow.”

That process involves having an AI model that reasons about the nature of the question it’s being asked and decides whether the question can be given to an LLM to answer directly, or whether it requires the agentic workflow. If it does, the model breaks the task into discrete parts and gives each part to a different AI “agent” (either a model that has been fine-tuned for a specific task or an LLM that has been prompted to play a particular role and perhaps given a specific software tool to help complete that task). Then there is a “reflection phase,” where an AI model checks the overall result of this workflow for errors, repeating the entire process if any are found. Using this system, Prosus has reduced hallucinations to 1.5%.
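
The route, decompose, delegate, reflect loop described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Toqan's actual design: the `Workflow` class, its field names, the `max_attempts` cap, and the stub "models" passed in at the bottom are all hypothetical, and in a real system each callable would wrap a model call.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical sketch: every name below is an illustrative assumption.


@dataclass
class Workflow:
    needs_agents: Callable[[str], bool]     # router: direct answer or agentic path?
    answer_directly: Callable[[str], str]   # plain single-LLM answer
    decompose: Callable[[str], List[str]]   # break the task into discrete parts
    run_agent: Callable[[str], str]         # a specialised agent handles one part
    has_errors: Callable[[str, str], bool]  # reflection: check the combined result
    max_attempts: int = 3                   # assumed cap so the loop always ends

    def handle(self, question: str) -> str:
        if not self.needs_agents(question):
            return self.answer_directly(question)
        result = ""
        for _ in range(self.max_attempts):
            parts = self.decompose(question)
            result = " ".join(self.run_agent(part) for part in parts)
            if not self.has_errors(question, result):
                return result
        return result  # attempts exhausted: return the last result


# Stub "models" so the sketch runs without any external service.
attempts: List[str] = []


def flaky_check(question: str, result: str) -> bool:
    attempts.append(result)
    return len(attempts) < 2  # pretend the first pass contains an error


wf = Workflow(
    needs_agents=lambda q: " and " in q,
    answer_directly=lambda q: f"direct:{q}",
    decompose=lambda q: q.split(" and "),
    run_agent=lambda part: f"[{part}]",
    has_errors=flaky_check,
)

print(wf.handle("what is our leave policy"))
print(wf.handle("summarise sales and draft a survey"))
```

The second question goes through the agentic path twice, because the stubbed reflection step rejects the first pass. That repetition is what drives the extra token usage discussed next.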

But, Beinat warns, “it is slower to do this and a lot more expensive in terms of token usage” than simply giving the question straight to an LLM and having it answer. Overall, the number of tokens used per query has increased by 2.5 times. Meanwhile, the average price per token has, thanks to price wars among cloud providers, slightly more than halved. So, on average, the system is only about 10% more expensive today than it was in early 2023.
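
The arithmetic behind that 10% figure is worth spelling out. Cost per query is tokens per query times price per token, so the two multipliers simply multiply. The sketch below uses only the figures quoted above (2.5 times the tokens, roughly 10% higher cost). The function names are made up for illustration, and the 0.44 price multiplier is inferred from those two figures rather than reported by Prosus.

```python
def relative_cost(token_multiplier: float, price_multiplier: float) -> float:
    """Cost per query relative to the baseline (1.0 means unchanged)."""
    return token_multiplier * price_multiplier


def implied_price_multiplier(token_multiplier: float, cost_multiplier: float) -> float:
    """Price-per-token multiplier implied by the token and cost multipliers."""
    return cost_multiplier / token_multiplier


# If prices had exactly halved, 2.5x the tokens would cost 25% more.
print(relative_cost(2.5, 0.5))  # 1.25

# A cost increase of about 10% implies prices fell to about 44% of their old level.
print(round(implied_price_multiplier(2.5, 1.10), 2))  # 0.44
```

In other words, "slightly more than halved" corresponds to a price drop of roughly 56%, under the assumption that the 2.5 times and 10% figures are both measured against the same baseline.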

Measuring ROI
The lower hallucination rate is probably worth the cost, he says. When Toqan was initially rolled out, it was embraced mostly by engineers, while people in other domains, such as human resources and legal, were reluctant to use it. Beinat says he thinks this was because engineers, due to the nature of their work, often had an intuitive sense of when they could trust the model’s output, whereas in other areas, detecting hallucinations was more difficult and the chance of errors made people hesitant to use Toqan. Now, with the lower hallucination rate, the majority of Toqan users are from non-engineering roles. Still, Beinat warns, managers should not expect AI’s impacts to be apparent immediately after a system is introduced. Prosus has found that on average it takes six months of learning and experimentation for users to figure out how to use these new AI tools most effectively in their particular role, he says.

And, even then, Beinat acknowledges that figuring out the return on investment from AI is difficult. So far, he says, Prosus data shows that Toqan saves about 48 minutes on average per user per day. That’s not nothing, but he says the problem is that those 48 minutes “are spread all over the place. There are all these microbursts of productivity.” And the value of those saved minutes varies a lot depending on the use case. Prosus has calculated that right now, the cost of those 48 saved minutes per day is about $12 per user per month, which he says is definitely worth it.
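
To put those two numbers on the same footing, here is a rough back-of-the-envelope calculation. The 48 minutes and the $12 come from the figures above. The 21 working days per month is my own assumption, not a Prosus number, and the function names are purely illustrative.

```python
def hours_saved_per_month(minutes_per_day: float, working_days: int) -> float:
    """Total hours saved per user per month."""
    return minutes_per_day * working_days / 60


def cost_per_saved_hour(monthly_cost: float, minutes_per_day: float,
                        working_days: int) -> float:
    """Monthly cost per user divided by the hours saved per user."""
    return monthly_cost / hours_saved_per_month(minutes_per_day, working_days)


WORKING_DAYS = 21  # assumption: typical working days in a month

print(round(hours_saved_per_month(48, WORKING_DAYS), 1))      # 16.8
print(round(cost_per_saved_hour(12, 48, WORKING_DAYS), 2))    # 0.71
```

Under that assumption each saved hour costs well under a dollar, so the tool pays for itself if even a small fraction of those scattered minutes turns into useful work. The calculation cannot say how much of that time is actually recovered, which is the hard part Beinat describes.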

Reducing the cost of growth
Still, 48 minutes each day doesn’t seem like a game changer. And that’s why he says he often likes to highlight individual use cases, where AI’s transformative impact is more apparent. He points to iFood, a Brazil-based food delivery app Prosus owns. iFood told its employees that if they had a data analytics question, they should try asking Toqan before sending it on to a human data analyst. The company discovered that 70% of these questions could be solved by Toqan. iFood still employs plenty of data analysts who handle the questions Toqan can’t, but now their backlog has been reduced and the capacity of those human data analysts is less of a bottleneck. And, of course, savings such as this mean that iFood can grow without hiring as many new employees; in essence, AI reduces the cost per dollar of revenue generated.