
Instabase founder and CEO Anant Bhardwaj discusses the revolutionary impact of LLMs on analyzing unstructured data and documents, and shares his vision for how AI agents could take things even further.
Anant Bhardwaj
So robotic process automation is literally: if a human had to do something, you basically open some browser or whatever, take some data, put it into some other system, click some button, and all that stuff. So it records the human's clicks on that desktop and tries to keep repeating them. So you kind of get that automated. And the hard part that they had is you can't do robotic process automation for unstructured data because it's not fixed. They change it. So anything will be very, very brittle. The bet that we are taking is that AI will drive automation in a significant way. RPA would be fully eaten by AI automation, and the future is likely going to be more of decentralized federated execution.
Derek
Thanks for listening to the a16z AI podcast. I'm Derek, and I hope you're ready to talk unstructured data. For a long time, optimally managing and utilizing, and even being able to locate, unstructured data was a holy grail of enterprise IT. And what is unstructured data? As this episode's guest, Instabase founder and CEO Anant Bhardwaj, explains, it's basically everything that's not nicely housed in rows and columns in a SQL database: text files, bank statements, passport photos, you name it. It's the stuff that's critical for any number of business operations, but that until recently was quite difficult to process or even search for without significant manual effort. So in this episode, Anant sits down with a16z infra partner Guido Appenzeller to talk through Instabase's history with automating the management of unstructured data, from Anant's early research at MIT through to the revolutionary advances brought by large language models. He shares some exciting new use cases, like an Indian bank approving loans via WhatsApp, and, as you just heard, his vision of and strategy for building a future where AI agents can make the leap from analyzing documents to acting on them. You'll hear it all, starting with some of Anant's personal journey, after these disclosures. As a reminder, please note that the content here is for informational purposes only, should not be taken as legal, business, tax or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures.
Anant Bhardwaj
I'll just give you a little bit of history. When I was doing research at MIT, I think big data was a big thing. In 2015, everybody was doing this. And so first, let me just define unstructured data, because people have different definitions of unstructured data. So my definition is very simple. Anything that cannot be put into nice database tables where you can run SQL, anything that is not that, is unstructured data.
Guido Appenzeller
So like a PDF document or an image or...
Anant Bhardwaj
Or anything. Yeah, anything that cannot be put into a nice table that you can run a query on. And we already knew how to answer questions when data is nicely in a structured format. So at MIT, the question that we were trying to ask was, how do you answer questions when data is not in that format? When it's very heterogeneous, when it basically doesn't have any schema. You don't even know what questions are relevant or not. So that was the key sort of hypothesis. And we were building this product called DataHub. And this has the ability to mount different kinds of things. So you could mount file systems, you could mount databases, and you could mount something called an application node, because some data also lives in random applications. And can you ask any question? So that was a big research project. I was like, this could be very valuable. So I dropped out, I didn't solve the whole problem, and came here to Silicon Valley, and then I started talking to a bunch of companies. Tell me, what is your unstructured data problem? Because we had to figure out the business, or where to sell, and where the real value is for organizations, especially enterprises. And we got pulled into this gnarly problem, which is, here are all my images and documents in Excel and PowerPoint, and can you help me answer questions? So my first question was, why do you even care? What question do you want to answer? We need to understand that. And they're like, we do a bunch of processes that receive a ton of unstructured data and we have to make a decision. Like, for example, in immigration, once somebody applies for immigration, they submit a bunch of things and they have to make a decision whether they should give you a visa or not. Or you apply for a loan. You submit a bunch of things and they have to make a decision whether you should get a loan or not. So we were like, sounds interesting. So let's think about how to solve it. And you won't believe it. The techniques at that time were very rudimentary. 
So there were four common techniques that people used. Number one, what they call templates, where they will simply say, here is a template for a passport, and if you want the passport number, go look 10 pixels below and 10 pixels from the right and draw a 20-pixel-long box, and whatever you find is your passport number.
Guido Appenzeller
Good luck with that.
Anant Bhardwaj
Yeah, it's very, very brittle. Right? Because as soon as you scan differently, things will break. The second technique was basically people writing different kinds of rules, like go and look for the keyword "period beginning" and anything right of that is the start date or something. Doesn't work. Like, it just breaks. A third technique: people were trying to train these ML models by writing features for a specific document type. And what features do you write? Like, it's just very, very hard. So those also didn't work. So we basically at that time started doing research, which we killed in two years, on what's called program synthesis, which is, we were basically like, if I had access to amazingly intelligent people, how would I solve the unstructured data problem? I would ask them to write code on the fly. So can I basically ask the computer to synthesize a program on the fly? It's very, very hard for a computer to write a program. At that time, LLMs weren't a thing, but we were like, most of the data can be extracted from documents and all that by writing some form of regular expression and those kinds of things. So let's do the synthesis of these regular expressions based on whatever input-output combinations you give, and that is the answer. And that worked reasonably well as long as your input is in a similar kind of structure. Because the problem with a program is it's deterministic. So if your input changes, it will break. It still produced reliable results, but not good enough that we could solve many problems. But we could solve some part of the problems. This was 2017, and the transformer paper came, and I think, with the transformer paper, they also released a model called BERT at that time. So we were super excited, because that was like state of the art and the best sort of model to understand natural language. So we basically applied BERT on these unstructured documents. We took a bunch of those tokens and put them in. And that produced really bad results. 
Really, really bad results. At that time, actually, I sent a note, which Martin would have a copy of, in which we were like, seems like this problem is not solvable unless somebody solves AI-complete problems, which they call AGI. But we were like, there is nothing else that is promising enough. So what do we do? So we basically tried a creative approach. If you look at the BERT language model, they were encoding the token with the position of the word in the sentence, and that's how the attention mechanism would work and would do the fill-mask problem. So we were like, what if, in addition to the position of the word in the sentence, we also start encoding the X and Y coordinates too? So we basically took 110 million documents, took every single word or token, and encoded it with the position in the sentence, but more importantly the X and Y coordinates, and then tried to basically solve the fill-mask problem by blocking out a box and seeing if that box can be filled by the model, and trained a model which is similar to BERT. We call this InstaLM. And that produced great results, because the attention is now not just looking at the sequence of tokens, but also the XY coordinates in the two-dimensional space, which is really, really cool from the perspective of document layout understanding.
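A minimal sketch of the idea described above, assuming a BERT-style input layer in which each token's vector is the sum of a word embedding, a sequence-position embedding, and two extra embeddings for bucketed X and Y page coordinates. It is not Instabase's implementation; the class name, the grid size, the embedding width, and the random tables are all hypothetical stand-ins for learned parameters.

```python
import random

DIM = 8    # embedding width (illustrative assumption)
GRID = 32  # coordinate buckets per page axis (illustrative assumption)

def make_table(rows, dim, rng):
    """Random table standing in for a learned embedding matrix."""
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(rows)]

def bucket(coord, page_size, grid=GRID):
    """Map a pixel coordinate to a coarse grid index in [0, grid - 1]."""
    return min(grid - 1, int(coord / page_size * grid))

class LayoutEmbedder:
    def __init__(self, vocab, max_len=512, seed=0):
        rng = random.Random(seed)
        self.vocab = {word: i for i, word in enumerate(vocab)}
        self.tok = make_table(len(vocab), DIM, rng)
        self.pos = make_table(max_len, DIM, rng)
        self.x = make_table(GRID, DIM, rng)
        self.y = make_table(GRID, DIM, rng)

    def embed(self, words, page_w, page_h):
        """words: list of (text, x, y) tuples in reading order."""
        vectors = []
        for i, (text, x, y) in enumerate(words):
            parts = (
                self.tok[self.vocab[text]],    # what the token is
                self.pos[i],                   # where it sits in the sequence
                self.x[bucket(x, page_w)],     # where it sits horizontally
                self.y[bucket(y, page_h)],     # where it sits vertically
            )
            vectors.append([sum(vals) for vals in zip(*parts)])
        return vectors

embedder = LayoutEmbedder(["Total", "100"])
top = embedder.embed([("Total", 50, 100)], page_w=1000, page_h=1000)
bottom = embedder.embed([("Total", 50, 900)], page_w=1000, page_h=1000)
```

Here the same word at the same sequence position gets a different input vector depending on where it sits on the page, which is what lets attention take layout into account.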
Guido Appenzeller
And I think it's fair to say that has become, I mean this is much later, but today it is sort of a standard technique almost. Right? If you're looking at two-dimensional data, you have some rotary encoding of X and Y or something like that.
Anant Bhardwaj
Yeah, yeah, yeah. So at that time that was not the case. So actually Rafal, who was one of our ML engineers, you will see two or three of those papers at the top of the arena during that time. So we were very happy. We started winning a lot of deals. We tripled our revenue that year, 2021 to 2022. But then OpenAI launched ChatGPT in November 2022.
Guido Appenzeller
It turns out the bitter lesson held: size matters.
Anant Bhardwaj
And we were like, oh man. Basically you could actually pass the documents. And at that time they didn't support documents in the first release, but you could basically take the text with the positions preserved and pass it in. And it did a reasonably good job. And we were like, is this the end of Instabase? It seems like you can now solve this whole problem. And then we realized that there is just a ton of things. And I think there is a paper by Databricks on compound AI systems, which says that LLMs are very good, but you need a bunch of systems before and after them for that to be reliable. And we can get into the details, but that is the history of how we got to where we are today.
Guido Appenzeller
Yeah, amazing. Very small personal anecdote. I have a lot of PDF files. Every piece of paper I get, I just scan and dump into a folder. And I recently wrote myself a little tool where basically first I ask an LLM to come up with a hierarchy of documents. You know, we're a family of five, here are some things about our family, and then give me a document hierarchy. And then basically it takes a document, gives just the summary of the document to an LLM, and asks which folder this should go into. That's an amazingly efficient sorting algorithm. It's really impressive what LLMs can do. So today you have a solution that basically allows enterprises or companies to work with unstructured data. Can you talk a little bit about what this does or what some of the use cases are?
Anant Bhardwaj
So the use case is pretty simple. I'll take a simple example of a bank that wants to do lending or an insurance company that wants to basically process your claims. So let's take one of these two use cases. So when people apply for, let's say, a home loan, the application is literally a 100-page-long packet and you don't even know where what is. It could be that the first 10 pages are their bank statement. Here's a shoebox of documents, and in between there might be a cat's picture. In between there might be some random letter from somebody. And so the issue, I think, is there is no one structure. What the bank says is, I need something that can verify your income. I need something that verifies your identity. So it's not that they tell you, here is my passport and here is my driver's license. It's, here is the application packet, go and process it. You have to do this reliably because you cannot make a single error. You can just think about, how do you solve this? So there are two techniques, and that's what one of the papers that we wrote is about, I think: LLMs are not all you need. Because one thing that you can do is put that into some stuff and ask the LLM the question. But the problem is if it goes beyond the context window, then that's a problem. You can do RAG, which is a technique where you put that into some vector database, figure out for a given question what the relevant chunks are that could be useful, and then produce that. But how do you know you did not miss something? You might get precision, but if you missed something then that's a problem. And LLMs are great, but they make surprising errors. So for example, let's say you have a 10-page-long bank statement with tables. Somehow they will get a lot of things right, but miss like four random cells with values, and you don't even know that they missed them. And that just changes the whole thing. So these are very surprising kinds of errors. So we looked at, how do you solve this reliably? 
Because the reliability part is important, because these are complex decisions that banks or insurance companies or immigration make. So the right way to solve this is: how do you know how to split this particular packet into the bunch of things we care about? So you have to analyze every single thing in detail. Once you have done this, then how do you get all of the structures that we care about? Like, for example, we run a separate table detection algorithm rather than passing the whole thing to the LLM. Because how do you know you didn't miss four things? How do you make sure all the cells are correct? Similar thing for checkboxes and signatures and other things that basically matter. Once you have classified, then what are the relevant schemas that we need? Then you basically go and do those things. How do you validate that each of those things is correct? Then write validations and then do cross-validation. Because is the pay stub saying the same thing that the W-2 does? Because if not, then that's a problem. So basically what we provide is this interface where people can build all of those things without writing a single line of code. And then you build this application, and now you can run this application as part of a deployment which will integrate with your upstream and downstream. So now you can do lending in less than five seconds, whereas earlier that would have taken several, several weeks. One very interesting use case is the intelligence use case, for example, and that's where I talk about why the approaches are critically important. So let's say you are a country and you want to collect a bunch of intelligence. You collect a bunch of intelligence data and you want to answer if there is any threat to the country. And you receive, let's say, millions of documents per day. One way is to dump that all into some RAG system and ask a question. But how do you know you didn't miss anything? Because they care about that. 
And maybe the right way to answer that question is not putting all the documents into a search, but rather looking at every single page of every document. Look for the things that you care about, which is terrorism threat or money laundering or whatever, and then extract that, put that into a database, and run a SQL query. For the things that match, then go and do the deeper analysis. Because now you guarantee completeness. So I think what we have seen is that while RAG is good for casual search, having a complex workflow under the hood that is explainable, that is auditable, that is guaranteed to be accurate and correct, is important for solving many of these enterprise problems. So that's what we do. We basically help enterprises take any kind of unstructured data and make decisions from it, for use cases that need to be reliable, 100% complete, and accurate. There are cases where we can make errors, and in that case we have to pass to humans, like, hey, seems like something is wrong. Can you go and look at it?
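The "look at every page, extract, put it in a database, run SQL" pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the system discussed in the conversation: `WATCHLIST`, `extract_topics`, and `scan_everything` are hypothetical names, and a keyword match stands in for whatever per-page model call a real pipeline would make.

```python
import sqlite3

# Hypothetical topics of interest; a real system would use a model here.
WATCHLIST = ("money laundering", "terrorism")

def extract_topics(page_text):
    """Stand-in for a per-page model call: list watchlist topics on a page."""
    lowered = page_text.lower()
    return [topic for topic in WATCHLIST if topic in lowered]

def scan_everything(documents):
    """Visit every page of every document and store findings in SQLite.

    documents: dict mapping doc_id -> list of page texts.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE scanned (doc_id TEXT, page INTEGER)")
    db.execute("CREATE TABLE findings (doc_id TEXT, page INTEGER, topic TEXT)")
    for doc_id, pages in documents.items():
        for page_no, text in enumerate(pages, start=1):
            # Recording each visited page makes completeness checkable.
            db.execute("INSERT INTO scanned VALUES (?, ?)", (doc_id, page_no))
            for topic in extract_topics(text):
                db.execute("INSERT INTO findings VALUES (?, ?, ?)",
                           (doc_id, page_no, topic))
    return db

docs = {
    "memo-1": ["Quarterly travel report.", "Nothing unusual."],
    "memo-2": ["Cover page.", "Possible money laundering via shell firms."],
}
db = scan_everything(docs)
hits = db.execute(
    "SELECT doc_id, page FROM findings WHERE topic = ? ORDER BY doc_id, page",
    ("money laundering",),
).fetchall()
pages_scanned = db.execute("SELECT COUNT(*) FROM scanned").fetchone()[0]
```

The `scanned` table is the point of the design: unlike a top-k retrieval step, the pipeline can show that every page was examined before the SQL query narrows things down for deeper analysis.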
Guido Appenzeller
Totally. And look, I mean, I think this is the trend with current AI systems, right? I've not encountered an AI system yet that is perfect, and by some metric I think we never will. Right? I think what you need is finding things with reasonable error rates and then a good escalation path to humans to deal with those.
Anant Bhardwaj
Right, exactly. And even humans, humans are not 100% correct. Right? So you have to build the right processes to catch it. So that's why I think sometimes when people say this AI didn't work, it's just that AI is not supposed to work reliably 100% of the time. You have to build a system around it, and that is going to be a lot of the investment that you will see across the board, which is, how do we build the right systems around AI and LLMs that solve the problem?
Guido Appenzeller
Is there a shift in how enterprises, or in general, I think, consumers of AI, think about reliability? I mean, look, classically, if I'm a chief compliance officer in a bank or so, I have a new piece of software and my take is this software can never do X because that puts us out of compliance. I recently spoke to a bank that basically said, well, we tried that. It doesn't work with AI, right? So now we're saying a well-trained human gets us out of compliance about X times every X hours or so. Right? And so the AI has to be 10x better and then we're going to sign off on it. You cannot have absolute perfection. So we have to change the acceptance criteria. Is that something you're seeing as well?
Anant Bhardwaj
I think more important is predictability. I think people are fine with errors as long as errors are predictable. When errors are not predictable, that's where the problem is. So when basically somebody makes an error and you don't even know the error was made, that's when it's a problem. Because humans will make 3 to 4% errors. But if you put in a second human by default, the chance of that is low, and it compounds. With AI the issue is that they're pretty accurate, they're very good, but they make mistakes in a surprisingly unpredictable way. And that is a bigger problem. And that's where I think the tooling and systems around it, to detect the errors, to be able to explain when the error was made, to be able to figure out how to catch them, or building a system that allows you to minimize that effect, that is the critical part. So I think in general what we have seen is that enterprises are fine using AI as long as we show them predictability. They don't care about 99% accuracy. You can be 90% accurate or even 80% accurate, but just tell us which 20% need to be reviewed or which 20% need to go somewhere. And that requires a lot of systems around these tools to get there. So I think we sometimes misunderstand what enterprises want. They don't want 100% accuracy, they want predictability.
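The "just tell us which 20% need to be reviewed" idea can be illustrated with a small routing sketch. It assumes each extracted field carries some confidence score, which is an assumption of this example rather than something stated in the conversation; `ExtractedField`, `route`, and the 0.9 threshold are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # hypothetical score in [0, 1] from the extraction step

def route(fields, threshold=0.9):
    """Split fields into an auto-accepted list and a human-review queue."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review

fields = [
    ExtractedField("account_holder", "A. Example", 0.98),
    ExtractedField("closing_balance", "1,204.55", 0.62),
    ExtractedField("statement_date", "2024-01-31", 0.95),
]
accepted, needs_review = route(fields)
```

The overall accuracy of the extractor matters less here than whether the review queue reliably contains the fields that are actually wrong, which is the predictability being asked for.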
Guido Appenzeller
Yeah, that makes sense. Is this the future, essentially, that when an organization receives a document, typically a human will no longer see the document, but will primarily look at an AI-generated summary? You know, AI will prepare it, and I can reason about it at a higher layer.
Anant Bhardwaj
Whenever unstructured data like documents comes in, humans will still see some kind of dashboard with whatever the stuff is, and they will go and double-click only on the thing of interest, and AI will do a lot of things to minimize their time to get to that thing of interest very, very quickly. Google is a great example. When you search, you don't read every single thing Google gives you. It's like, here are maybe three or four things of interest that you want to double-click and do research on. And I think AI will play a similarly important role, where in many cases...
Guido Appenzeller
It gets rid of the boilerplate. It reduces the thing to the absolute essential.
Anant Bhardwaj
Essential.
Guido Appenzeller
Are we looking at a world where my system will take my couple of key points or key phrases and generate a PDF document? Then your system will take the PDF document, reduce it back on the couple of key points and phrases?
Anant Bhardwaj
Phrases, exactly.
Guido Appenzeller
That's, I guess not a bad way to operate in the future. What is the most interesting use case you've seen for your technologies? Anything sort of out of the ordinary.
Anant Bhardwaj
I think what we are seeing is customers being a lot more creative than we had ever imagined. So just think of this: I was working with a bank in India, and now, given that AI has become reasonably reliable, they are offering entire lending over WhatsApp. So you go to WhatsApp, you say, hey, I'm a business and I want a loan. And then on WhatsApp you get a response back saying, hey, can you upload these things, your last 30 days of, you know, your P&L statement and whatever those things look like. And you basically submit these piecemeal, and it responds like, oh, this looks good, can you also do this? And I've never seen lending being done conversationally over WhatsApp. This is insane. The customer experience is fundamentally very different. And I do believe that over the coming years it is going to change the user experience in a very, very significant way. Currently, I think a lot of people think AI is a technology and ask how we can use it inside software. I think the biggest impact would be from the degree of affordance that it gives you. You can build a completely new class of interaction with your customers that would never have been possible. And we are seeing more and more of those. Currently, all of these processes like insurance claims are pretty painful processes, right? And I think America is slightly more conservative in those things. But if you go to the developing world, where digitization is more of a new thing and people are already using all the stuff on the phone, things are just moving in a way where AI makes you feel like you're talking to humans. Nobody loved chatbots before, but now you feel good because they are basically conversing with you with pretty human-like behavior. And that interface is coupled with all the customer interaction that they have. Of course, one of the big use cases that everybody's trying to go after is the call center. 
But just think of all the other things too, like how do you open an account, how do you do lending, how do you do processing? It will have a significant impact on how the user experience is going to change.
Guido Appenzeller
Yeah, totally. And I think there's even an opportunity here to take some processes which currently are very much, I take a lot of documents, I throw them over the wall, and back comes a response, and really turn them into something more interactive, right? Where it's like, hey Guido, tell me more about your specific use case. Okay, then I need these documents, and I send them. It's like, well, that document is missing something. And you can do this interactively with very, very short latency. Everything, even immigration, right?
Anant Bhardwaj
Like you send the stuff and you don't even know. Two months later you hear like your stuff is rejected or we need something like this. All of those things can fundamentally be changed.
Guido Appenzeller
I just got a letter back from the... I submitted a long application with lots of supporting documents. I got a form letter back saying the documentation is not complete, without any mention of what is not complete. Amazing. I was like, what does this mean?
Anant Bhardwaj
This can be just much more interactive, because now you can do things in real time. And so I'm pretty optimistic about the impact of this on every single business, on how they interact with their customers.
Guido Appenzeller
That makes sense. What do you see as the main barriers for companies to adopt this? It's like, you know, I mean, I've seen many a classic enterprise adopting AI. There are discussions around, you know, compliance and legal and where does my data go, and a long list of sort of concerns that are being expressed. What are the top sort of items that you've seen?
Anant Bhardwaj
Enterprises are not historically known for moving very quickly. So that's number one. So I think expecting that, like, you know...
Guido Appenzeller
I would say they're moving a little quicker in the AI revolution than they did previously.
Anant Bhardwaj
Exactly. In general, I think each of these large enterprises has to get approval from their compliance committee and the regulations committee and all of them, basically. And none of them really understand things. And sometimes you get regulations, or questions, that might not even be applicable. Like, for example, tell me, every time you change the feature, how did... LLMs, like, LLM developers don't change features, right? But when you get all of those things, that basically is a massive time sink. But I think the two key things that they care about are: how do you guarantee that my data is safe and secure? So that's number one. And second is, how do you give me auditability and predictability? Those are the two most important. If you boil down all their questions, they eventually boil down to those two things. Nobody wants AI making a decision, even if it is correct, if they cannot explain, here is the set of steps that it took. Because if something wrong happened, they have to explain. In the human world you can explain: something came in, it went to these five different teams, where they did this part, and this particular error was made, and that's what we will correct in the future so that this kind of mistake would not happen. If AI becomes a black box with no instrumentation of how things get done internally, that basically has a hard time, especially for customer-centric use cases. For simple casual search and those kinds of things, it's fine. But the runtime has to be something that is auditable, and you should be able to find, if something went wrong, where it went wrong. And they don't tell you directly; they ask questions that eventually boil down to this. But that's what we have seen as a major requirement.
Guido Appenzeller
Makes sense. Let me switch tack here a little bit. One of the, I think, hottest buzzwords at the moment is agents, right? And so it's an overused term. It's sometimes used as a marketing term for a glorified set of prompts, basically. Right? But we're also seeing it as essentially a different user interface paradigm, where I no longer walk through a transaction step by step but basically I give a high-level instruction to an agent, and the agent acts autonomously. We even see it as a software design paradigm where I now have multiple agents that work together and make decisions more autonomously. How do you think this will change how enterprises process data, how we work with unstructured data, and this entire space?
Anant Bhardwaj
So let's look at what we already know has worked well. What we already know has worked well is that enterprises already know how to run some workflow that is created by some developer, where they define a bunch of steps using some workflow management tool and you can run it. So people already know how to run this. The argument you can make is, can we just tell the agent, give me the answer, and they do it? The problem with agents currently is that if you give them the same goal and the same set of tools, they might choose different paths two different times. So they are not guaranteed to deterministically always go down one path. And in general people don't like runtime inconsistencies. So the runtime has to be consistent. So I think where I have seen things work well within enterprises is during build time, when somebody has to define the control path and the logic and all of those kinds of things. You can maybe have an agent produce the first draft, which is like, hey, this is how I plan to execute. This is what it looks like. Because otherwise a human might have taken a long period of time. Pretty similar to Cursor, right? If you want to build something, it can write the first draft of the code. The human can look, make some minor edits, but then you run that code deterministically. Yeah, exactly. So my point is, I think it's the same way. So I do not believe that the autonomous agent would be a runtime phenomenon. However, it would be a build-time or compile-time phenomenon, which basically means that during the build phase they can do 90% of the work, humans make some changes, and that's a huge, huge, huge value. Because the reason why things don't scale at enterprises is that there is either a lack of enough developers or skills or drive or whatever. If AI agents can do things and make it so easy that you can build those, then once it is approved, we know what is running, and it is auditable. 
And you can also add steps and checkpoints, whatever is needed. Like, for example, Cursor generated the code, but you want more logging, so you can add logs in between, whatever those things could be. So once you have that, you have a deterministic artifact that can run in production. So that's where I think the world is going to move, which is the compile-time phenomenon and the runtime phenomenon. The runtime phenomenon has to be deterministic, something that is auditable, debuggable. You know exactly what is happening. You should be able to see the logs and all that kind of stuff. At compile time, agents can play an important role, because they can help with the reasoning and create the first draft, where a human can participate with the agent to produce the artifact that is going to run.
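One way to picture the build-time versus runtime split described above: a plan (which an agent might draft and a human might edit) is fingerprinted at approval, and the runtime only executes that exact plan, step by step, with a log. This is a sketch under assumptions; `STEPS`, `freeze`, and `run` are hypothetical names, and the two toy steps stand in for real workflow stages.

```python
import hashlib
import json

# Registry of deterministic steps the runtime is allowed to execute.
STEPS = {
    "split_packet": lambda state: {**state, "parts": state["packet"].split("|")},
    "count_parts": lambda state: {**state, "n_parts": len(state["parts"])},
}

def freeze(plan):
    """Fingerprint a plan at approval time so later changes are detectable."""
    return hashlib.sha256(json.dumps(plan).encode()).hexdigest()

def run(plan, approved_hash, state):
    """Execute only the approved plan, in order, and return state plus a log."""
    if freeze(plan) != approved_hash:
        raise ValueError("plan changed after approval")
    log = []
    for name in plan:
        state = STEPS[name](state)
        log.append(name)  # audit trail of exactly what ran
    return state, log

# Build time: a drafted plan is reviewed, then approved and frozen.
plan = ["split_packet", "count_parts"]
approved = freeze(plan)

# Runtime: the same input always takes the same path.
result, audit_log = run(plan, approved, {"packet": "id|paystub|w2"})
```

Any reasoning about which steps to use happens before `freeze`; after that, the runtime has no freedom to pick a different path, which is what makes it auditable.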
Guido Appenzeller
It makes a lot of sense. I mean, this is a super hot debate at the moment, right? And I think we've seen everything, from this AGI vision where it's like, no, this is going to be a fully agentic loop and it decides when it wants to terminate, decides what tools to use, and, you know, you just give it your credit card and let it run. Right? And I personally agree. I don't think we're there yet. Right? These most freeform agentic systems that we've seen, they typically don't work yet. This approach of saying, let the LLM generate the flow but then freeze the flow once it works, I think, at least in the short term, it's a...
Anant Bhardwaj
Much, much more programmatic vision. Also, basically, I think we can take a lot of lessons from what works in the human world. Let's assume every human is an agent. You don't allow every single employee in your company to make autonomous decisions. No. Some person at the top says, here is the set of things that we are going to do. You can only do this set of things. And so basically the runtime is pretty deterministic. Most of the reasoning and agency and all that cool stuff is used.
Guido Appenzeller
So LLM process re-engineering is a thing now, I guess. That's fantastic. So, looking forward, what are you excited about in your space? I mean, with AI at the moment it is hard to predict what's happening in six months, right? But if you try to stretch your crystal ball to the absolute limits here, what things do you think we'll see 12 months out, two years out in your space?
Anant Bhardwaj
So we've been debating and reasoning on this for quite a period of time. And maybe my answer would be slightly controversial, because different people have different views of what the future would be. So I do believe that AI will continue to improve in its capabilities, and I think it will play an important role at compile time, building things, reasoning, and all that, although the runtime is going to be much more deterministic and predictable and controllable. Now the question is, what is going to be the execution pattern? There are two different views of the world. One is: does AI make my data management problem easier, in that it allows you, at compile time, to move all the things into one place and be able to answer and do things? Or do you basically keep the tooling in the world the way it is, siloed everywhere, and AI becomes smart enough to have multi-agent communication, where each agent can do things and figure out, you know, if one makes an error, how that affects things and how to do the communication? So we have been working on this idea of federated AI execution, where, as an organization, you can define these thousands of agents in a very federated way, but they are dynamically able to discover other agents through some platform or whatever that could be, and then able to communicate. So if you give a bigger goal, basically you don't need one central person to decide everything. Dynamically, all the agents can discover each other, they all consider the capabilities, then you can figure out the control path, then you can figure out how to run. So we are trying to build a federated decentralized automation framework, which basically means, can I take any process in any organization and figure out the federated decentralized execution framework that can run it? And that's where I believe the automation world would move. There are still a lot of open questions, a lot of unknowns, a lot...
Guido Appenzeller
A lot of work to do.
Anant Bhardwaj
Yeah, but the bet that we are taking is that AI will drive automation in a significant way. RPA would be fully eaten by AI automation and the future is likely going to be more of decentralized federated execution. And so that's what...
Guido Appenzeller
One hell of a vision there. I'm excited about it. So AI is progressing very rapidly. How have the technical advances of AI impacted what you can deliver to your end customers? I mean, they must be changing basically constantly. Is that right?
Anant Bhardwaj
Yeah, yeah. So earlier we focused primarily on the unstructured data problem as part of the automation, because that's one of the long poles in the tent: how do you even understand them? Because once you get data in a structured format, you know how to do the next steps. So we primarily focused until now on this: if you get a bunch of unstructured data, how to get you the things that you need to make the next step of the decision. We did not touch the next step of the decision. Like, let's say you are a lending company or you are an insurance company. Once you get all the data, you might have to trigger some other tool, like a lending system or some sort of fraud system or whatever the risk system is, and things like that, because that requires knowing about those systems, how to interpret the results and all that kind of stuff. So we are like: data in, we will do everything, give you valuable data out. And after that you are responsible for all the other integrations. And the way these guys solved those problems was by using this technology called RPA. You might have heard of it: robotic process automation. So robotic process automation is literally, if a human had to do something, you basically open some browser or whatever, take some data, put it into some other system, click some button and all that stuff. So it records the human's clicks on the desktop and tries to keep repeating them. So you kind of get that automated. And the hard part that they had is you can't do robotic process automation for unstructured data because it's not fixed; it changes. So anything will be very, very brittle. But if the things are exactly the same, then you can actually record the screen and replay. It has been very, very brittle. Like, the problem with RPA is, even though they add value... I think there are some big players there, UiPath, Automation Anywhere, and many of them have reasonably massive market caps.
Now with AI, the argument that we make, and we might be wrong, is: once the data comes out, which we are very, very good at up to that point, can we also start operating those other systems? Now this makes a massive assumption, which is that AI will help us operate those systems. And there are some interesting protocols that have come out, such as Model Context Protocol (MCP), which allows you to dynamically discover capabilities and call those functions. It still has a ton of problems, such as: do all the systems even support MCP? What if they don't?
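As a rough illustration of the "dynamically discover capabilities, call those functions" pattern mentioned here: MCP is built on JSON-RPC 2.0, where a client lists a server's tools with a `tools/list` request and invokes one with `tools/call`. The sketch below only builds those messages and answers them from an in-memory stand-in. The `FakeServer` class, the `lookup_credit_score` tool and its schema are hypothetical and are not something described in the conversation.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator


def list_tools_request() -> dict:
    """JSON-RPC request asking a server which tools it exposes."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": "tools/list"}


def call_tool_request(name: str, arguments: dict) -> dict:
    """JSON-RPC request invoking one tool by name."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


class FakeServer:
    """In-memory stand-in for a tool server (hypothetical, illustration only)."""

    def __init__(self):
        self._tools = {
            "lookup_credit_score": {
                "description": "Return a made-up credit score for an applicant.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"applicant_id": {"type": "string"}},
                    "required": ["applicant_id"],
                },
                # Toy handler: the score is derived from the id length.
                "handler": lambda args: {"score": 700 + len(args["applicant_id"])},
            }
        }

    def handle(self, request: dict) -> dict:
        base = {"jsonrpc": "2.0", "id": request["id"]}
        if request["method"] == "tools/list":
            tools = [
                {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
                for n, t in self._tools.items()
            ]
            return {**base, "result": {"tools": tools}}
        if request["method"] == "tools/call":
            tool = self._tools.get(request["params"]["name"])
            if tool is None:
                return {**base, "error": {"code": -32602, "message": "Unknown tool"}}
            payload = tool["handler"](request["params"]["arguments"])
            return {**base, "result": {"content": [{"type": "text", "text": json.dumps(payload)}]}}
        return {**base, "error": {"code": -32601, "message": "Method not found"}}


server = FakeServer()
discovered = server.handle(list_tools_request())["result"]["tools"]
answer = server.handle(call_tool_request(discovered[0]["name"], {"applicant_id": "abc"}))
```

The caller never hard-codes the tool name; it takes it from the discovery response. The open questions raised in the conversation (systems without MCP support, authentication, detecting breakage) are outside what this sketch covers.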
Guido Appenzeller
They sort of punted on authentication, but we'll figure that out over time.
Anant Bhardwaj
Then authentication, then how do you know if something breaks? One of the arguments that we are making is that maybe in the future, as we go broader, can we do the entire end-to-end workflow? So once data comes out, do we have a way to plan and reason during compile time, which an AI agent can do: how to operate those systems, how to call them, how to get the data, then call some other system, and if something goes wrong, how to involve humans. So create all that stuff with the AI agent during compile time and then extend our offering to do this entire thing end to end. Like, can RPA be fully replaced with AI automation? RPA had some stuff that is easier to solve because some user logs in, so it always runs in the context of the user, if you're clicking on the desktop and all that. One of the hacks that we believe might work is called identity pass-through: can we assume a user identity that is provided during runtime, and then let that user identity get passed to all of the MCPs?
Guido Appenzeller
Do I always want an agent to have the same capabilities that I have? An agent today is a good intern, right? So I trust the intern up to a point. I don't necessarily want unlimited spending on my credit card. I may want to cap that at $50 or something like that.
Anant Bhardwaj
And you can decide that during compile time. So basically you can say, hey, even if the user context is this, as soon as it gets to this operating tool, maybe we create some derived, half-fake identity that will have fewer permissions or things like that. So the good thing is, and that's why I said the AI agent should only be used during compile time, that it gives humans all the control over what the runtime behavior should be.
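One way to picture this exchange (identity pass-through, plus a reduced identity with a spending cap) is the minimal sketch below. It assumes a made-up `Identity` record and a `derive_restricted` helper; the permission names and dollar amounts are illustrative only and nothing here reflects an actual Instabase mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identity:
    user: str
    permissions: frozenset
    spend_limit_usd: float


def derive_restricted(parent: Identity, allowed: set, spend_cap_usd: float) -> Identity:
    """Derive an agent identity that can never exceed the parent's rights."""
    return Identity(
        user=f"{parent.user}/agent",
        # Intersection: the agent only keeps permissions the parent already has.
        permissions=parent.permissions & frozenset(allowed),
        # The cap can lower the parent's limit but never raise it.
        spend_limit_usd=min(parent.spend_limit_usd, spend_cap_usd),
    )


def authorize(identity: Identity, action: str, amount_usd: float = 0.0) -> bool:
    """Fixed runtime check: the action must be permitted and within the limit."""
    return action in identity.permissions and amount_usd <= identity.spend_limit_usd


# Hypothetical user and limits, fixed at build time rather than chosen by the agent.
alice = Identity("alice", frozenset({"read_statements", "pay", "close_account"}), 10_000.0)
agent = derive_restricted(alice, {"read_statements", "pay"}, spend_cap_usd=50.0)
```

Because the restricted identity is computed once from explicit inputs, a human can review it before anything runs, which matches the point about keeping these choices at compile time.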
Guido Appenzeller
Yeah, makes sense.
Anant Bhardwaj
The problem comes when an AI agent is making runtime decisions, because then you have no control over where things are going. So the separation is critically important. So during the initial build, you can choose if you want to curb what agency they have and what limits and constraints they have. And that's what it will go and do during the runtime. Right.
Guido Appenzeller
Anant, thanks for being here today. That was absolutely amazing. I think we're on a very exciting journey with AI. And looking back, I think the last big wave that I was a part of was probably the dot-com boom. And I think one lesson learned for enterprises in general back in those days is that when these big technological shifts happen, you have to jump on the wave early. It may be complicated, maybe still a little weird. Your compliance, your legal folks, they don't know how to deal with it. But if you don't, you may end up like Barnes and Noble. The downside is substantial. And I think it is clear today that there's a huge opportunity for enterprises here to both have more efficient workflows for themselves, but also to have a much, much better end customer experience and partner experience.
Anant Bhardwaj
And in addition, it does three things, which is it saves you a lot of cost.
Guido Appenzeller
It does.
Anant Bhardwaj
It allows you to do things much, much faster. And the third one is that it fundamentally changes customer experience in a very significant way. So I think there are all the business reasons for enterprises to adopt these things now. It's just about how to make this work. I don't think I have any question on whether this will work or whether this is the right decision. It's about how to make it work. That is the bigger question, I think.
Guido Appenzeller
Fantastic. Well, thank you so much.
Derek
And that's it for this episode. Maybe it's the years I spent covering the data space and thinking about big data, but I thought that was a great discussion. If you agree, please do share the podcast and rate it wherever you listen and keep listening for more great stuff in the weeks to come.
Date: June 6, 2025
Host: Guido Appenzeller (a16z)
Guest: Anant Bhardwaj (Instabase Founder & CEO)
Special Notes: Co-Host Derek introduces and concludes; transcript skips ads/disclosures.
This episode explores the evolution and future of automating unstructured data using large language models (LLMs) and AI agents, focusing on enterprise applications. Instabase founder Anant Bhardwaj recounts his research journey (from MIT to Silicon Valley), provides a technical history of approaches to unstructured data, and discusses real-world use cases transforming business processes. The conversation addresses technical breakthroughs, challenges with AI adoption (including reliability and compliance), and a vision for federated, agent-driven automation in the enterprise.
Unstructured Data Defined:
Anything that can't be neatly stored in SQL tables—PDFs, images, scans, emails, mixed documents (02:15).
"Anything that cannot be put into nice database tables where you can run SQL, anything that is not that is unstructured data."
— Anant Bhardwaj, 02:15
Enterprise Challenge:
Businesses have critical processes relying on unstructured information (immigration forms, loan packets, insurance paperwork).
"The techniques at that time were very rudimentary… As soon as you scan differently things will break."
— Anant Bhardwaj, 04:41
Technical Breakthroughs:
LLMs like GPT-3/ChatGPT:
How Instabase Approaches the Problem:
"LLMs is not all you need because …if it goes beyond the context window, then that's a problem…How do you know you didn't miss anything?"
— Anant Bhardwaj, 09:40
Error Handling and Human-in-the-loop:
"You have to build systems around [AI]... it's going to be a lot of investment."
— Anant Bhardwaj, 14:16
Customer Use Case:
“I've never seen lending being done conversationally over WhatsApp. This is insane...the customer experience is fundamentally very different.”
— Anant Bhardwaj, 17:53
Shift in Acceptance Criteria:
“They don't care about 99% accuracy. You can be 90% accurate or even 80%... just tell us which 20% need to be reviewed.”
— Anant Bhardwaj, 15:21
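The "tell us which 20% need to be reviewed" idea can be sketched as a simple routing step. This is a minimal illustration that assumes each extracted field comes with a confidence score between 0 and 1; the `Extraction` record, the 0.9 threshold and the sample values are made up, and the episode does not say how Instabase actually flags items for review.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    doc_id: str
    field: str
    value: str
    confidence: float  # assumed to lie in [0, 1]


def route(extractions, threshold=0.9):
    """Split results into auto-accepted items and items for human review."""
    auto, review = [], []
    for item in extractions:
        (auto if item.confidence >= threshold else review).append(item)
    return auto, review


# Made-up sample data for illustration.
batch = [
    Extraction("doc-1", "income", "5000", 0.98),
    Extraction("doc-2", "income", "4200", 0.95),
    Extraction("doc-3", "income", "7l00", 0.72),
    Extraction("doc-4", "income", "3900", 0.91),
    Extraction("doc-5", "income", "", 0.40),
]
auto, review = route(batch)
review_rate = len(review) / len(batch)
```

The useful property is the one in the quote: errors are tolerable as long as the system reliably says which items it is unsure about, so the review queue stays predictable.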
Barriers to Adoption:
What are Agents?
Build-time vs. Runtime Autonomy:
Agentic workflows are promising for drafting solutions, but determinism is needed at runtime for auditability.
“The agents…are not guaranteed to deterministically always go in one path. In general people don't like runtime inconsistencies…runtime has to be consistent.”
— Anant Bhardwaj, 23:36
Best practice: Use AI/agents to generate the initial solution ("compile time"), human review/approval, then run deterministically in production.
“I do not believe that autonomous agent would be a runtime phenomena. However, there would be a build time or compile time phenomena…”
— Anant Bhardwaj, 24:21
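A small sketch of the compile-time versus runtime split described above: something drafts a plan once, a human approves it, and production only replays the approved steps. Everything here is assumed for illustration. `draft_plan` is a stand-in for a build-time agent (a real one might call an LLM), and the step names, the income threshold and the sample document are hypothetical.

```python
import hashlib
import json

# Deterministic step implementations available at runtime (hypothetical examples).
STEP_REGISTRY = {
    "extract_income": lambda ctx: {**ctx, "income": float(ctx["document"]["stated_income"])},
    "check_threshold": lambda ctx: {**ctx, "eligible": ctx["income"] >= 3000},
}


def draft_plan(goal: str) -> list:
    """Stand-in for the build-time agent; returns a fixed plan here."""
    return ["extract_income", "check_threshold"]


def fingerprint(plan) -> str:
    """Stable hash of the plan, used to detect changes after approval."""
    return hashlib.sha256(json.dumps(plan).encode()).hexdigest()


def approve(plan, reviewer: str) -> dict:
    """Human sign-off: reject unknown steps, then freeze the plan with its hash."""
    unknown = [step for step in plan if step not in STEP_REGISTRY]
    if unknown:
        raise ValueError(f"unknown steps: {unknown}")
    return {"plan": list(plan), "approved_by": reviewer, "sha256": fingerprint(plan)}


def run(approved: dict, ctx: dict) -> dict:
    """Runtime: no agent involved, only the approved steps in the approved order."""
    if fingerprint(approved["plan"]) != approved["sha256"]:
        raise RuntimeError("plan changed after approval")
    for step in approved["plan"]:
        ctx = STEP_REGISTRY[step](ctx)
    return ctx


approved = approve(draft_plan("decide loan eligibility"), reviewer="ops-lead")
first = run(approved, {"document": {"stated_income": "5000"}})
second = run(approved, {"document": {"stated_income": "5000"}})
```

Running the same approved plan on the same input always takes the same path, which is the runtime consistency the quotes ask for; any variability is confined to the drafting step before approval.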
Vision: Federated AI / Decentralized Automation
"AI will drive automation in a significant way. RPA would be fully eaten by AI automation and the future is likely going to be more of decentralized federated execution."
— Anant Bhardwaj, 29:12 & 29:33
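To make the federated idea more concrete (agents advertise capabilities, discover each other, and a control path is worked out from a goal), here is a naive sketch. It models the discovery platform as a single in-memory `Registry`, which is a simplification and not actually decentralized. The `AgentCard` fields and the agent names are hypothetical, and the planner simply picks the first registered provider for each capability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCard:
    name: str
    needs: frozenset      # capabilities/data the agent requires as input
    provides: frozenset   # capabilities/data the agent produces


class Registry:
    """Stand-in for the discovery platform (single process, illustration only)."""

    def __init__(self):
        self._agents = []

    def register(self, card: AgentCard) -> None:
        self._agents.append(card)

    def discover(self, capability: str) -> list:
        return [a for a in self._agents if capability in a.provides]


def plan_control_path(registry: Registry, available: set, goal: str) -> list:
    """Work backward from the goal to an ordered list of agent names."""
    have, path = set(available), []

    def resolve(capability, stack):
        if capability in have:
            return
        if capability in stack:
            raise LookupError(f"cyclic dependency on {capability!r}")
        candidates = registry.discover(capability)
        if not candidates:
            raise LookupError(f"no agent provides {capability!r}")
        agent = candidates[0]  # naive choice: first registered provider
        for need in sorted(agent.needs):
            resolve(need, stack | {capability})
        path.append(agent.name)
        have.update(agent.provides)

    resolve(goal, frozenset())
    return path


registry = Registry()
registry.register(AgentCard("doc_extractor", frozenset({"loan_packet"}), frozenset({"applicant_fields"})))
registry.register(AgentCard("fraud_checker", frozenset({"applicant_fields"}), frozenset({"fraud_score"})))
registry.register(AgentCard("risk_scorer", frozenset({"applicant_fields", "fraud_score"}), frozenset({"risk_decision"})))
registry.register(AgentCard("payroll_agent", frozenset({"timesheets"}), frozenset({"payslips"})))

path = plan_control_path(registry, {"loan_packet"}, "risk_decision")
```

No step in the path is hard-coded by a central author; it falls out of what each agent declares. Error handling between agents and truly distributed discovery, both flagged as open questions in the conversation, are not addressed here.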
[04:41] “The techniques at that time were very rudimentary. Templates…as soon as you scan differently, things will break.”
— Anant Bhardwaj
[07:47] “That produced great results because the attention is now not just looking at the sequence of tokens, but also XY coordinate in the two dimensional space, which is really, really cool from the perspective of document layout understanding.”
— Anant Bhardwaj
[08:18] “Lesson held: size matters.”
— Guido Appenzeller
[09:40] “LLMs is not all you need … the right way to solve this is how do you know how to split this particular packet into a bunch of things we care about.…You need a complex workflow under the hood that is explainable, that is auditable, that is guaranteed to be accurate and correct.”
— Anant Bhardwaj
[15:21] “More important is predictability. I think people are fine with errors as long as errors are predictable. When errors are not predictable, that's where the problem is.”
— Anant Bhardwaj
[17:53] “I've never seen lending being done conversationally over WhatsApp. This is insane.…the customer experience is fundamentally very different.”
— Anant Bhardwaj
[23:36] “If you just give agents the same goal and same set of tools…they are not guaranteed to deterministically always go in one path. In general, people don’t like runtime inconsistencies.…runtime has to be consistent.”
— Anant Bhardwaj
[29:33] “The bet that we are taking is that AI will drive automation in a significant way. RPA would be fully eaten by AI automation and the future is likely going to be more of decentralized federated execution.”
— Anant Bhardwaj
[35:01] “It allows you to do things much, much faster. And the third one is fundamentally changes customer experience in a very significant way. So I think there are all the business reasons for enterprises to adopt these things now. It's just about how to make this work.”
— Anant Bhardwaj
This episode provides a comprehensive look at the technological evolution and enterprise adoption of AI for unstructured data. Bhardwaj emphasizes that blending advanced LLM techniques with engineered workflow systems, human oversight, and emerging agent-driven automation can fundamentally transform enterprise operations. He articulates a shift from brittle, rule-based automation to federated, AI-powered workflows that are reliable, explainable, and customer-centric—signaling a future where routine enterprise labor is increasingly handled by intelligent, collaborative digital agents.