Pawel Velar
Podcast Host (Changelog Announcer)
The podcast that makes artificial intelligence practical, productive, and accessible to all. If you like this show, you will love the Changelog. It's news on Mondays, deep technical interviews on Wednesdays, and on Fridays, an awesome talk show for your weekend enjoyment. Find us by searching for the Changelog wherever you get your podcasts. Thanks to our partners at Fly.io. Launch your AI apps in five minutes or less. Learn how at Fly.io.
Daniel Wignak
Well, welcome to another episode of the Practical AI Podcast. This is Daniel Wignak. I'm CEO of Prediction Guard, and I'm really excited today to dig a little bit more into GenAI orchestration, agents, coding assistants, all of those things, with my guest Pawel Velar, who is chief technologist at EPAM Systems. Welcome, Pawel. Great to have you here.
Pawel Velar
Thank you. Hello. Hello.
Daniel Wignak
Yeah, yeah. Well, I mean, there's a lot of topics. Even before we kicked off the show, we were chatting in the background about some really interesting things. I'm wondering if you could just kind of level-set us, since people may or may not have heard of EPAM. I think one of the things that I saw that you all were working on was this GenAI orchestration platform, DIAL. Maybe before we get into some of the specifics about that and other things that you're interested in, maybe just give us a background of what EPAM is. I know you mentioned even in our discussions that some of what you're doing right now maybe wouldn't have even been possible a couple of years ago, and so things are developing rapidly. Just level-set the kind of background of this area where you're working.
Pawel Velar
Sure, yeah. So EPAM is a professional services organization. We're global, we're in 50-something countries; 50,000 people globally work with clients. We have been around for, I think, 32 years to date. So we do a lot of different things, as you can imagine. And what I was mentioning about doing things that would not be possible is doing things with GenAI. Today we do a lot of work for our own clients, and we also do work for ourselves, applying the same technology, because EPAM historically as a company has been running on software that we ourselves built. The philosophy has always been that things that do not differentiate you, like accounting software or a CRM, you go and buy off the shelf; things that differentiate you, how we actually work, how we operate, how we execute projects, how we hire people, how we create teams, how we deploy teams, all of that software has always been our own, since as early as the late '90s. And we keep iterating on that software for ourselves, and that software today is very much AI-first. A lot of things we do, we do with AI, and really do, because AI in its current form exists.
Daniel Wignak
Interesting. Yeah. You know, I think when we initially were prompted to reach out to you, part of it was around this orchestration platform. So talk a little bit, maybe generally, not necessarily about the platform per se, although we'll get into that, but just GenAI orchestration generally. You talked about some of these things that are becoming possible. Where does orchestration fit in that, and what do you mean by orchestration?
Pawel Velar
You're probably thinking of DIAL. You can Google it. We do a lot of applied innovation in general as a company, and this is one of the good examples of applied innovation in AI. The best way to think of DIAL would be... you all know ChatGPT, right? ChatGPT isn't an LLM. It's an application that connects to an LLM and gives you certain functionalities. It can be as simple as just chatting and asking questions. It can be a little more complex, uploading documents and speaking to them, like talk-to-my-documents. It can be even more complex when you start connecting your own tools to it. Right? We see our clients not only do this, but also want something like this for their own business processes. And this orchestration engine becomes: how do I make it so that I don't have 20 different teams doing the same similar things over and over again in their own silos? How do I connect my teams and their AIs and their thoughts and results into a consolidated ecosystem? Because of GenAI and because of what we can do with conversation and text, it becomes sort of conversation-first. You can think of conversation-first application mashups, almost, right? You talk, express a problem, and what comes back is not just the answer. Maybe what comes back is UI elements, buttons you can click, forms you can fill out, things you can do, as well as things that are done for you by agents automatically. So DIAL in that sense is... well, by the way, it is open source; you can also go look, download, and play with it. But it is a ChatGPT-like conversational application that has many capabilities that go beyond. We have DIAL apps; they predate MCP, but the idea is that DIAL itself has a contract, an API that you implement. You basically come back with a streaming API that can receive a user prompt, and whatever you do, you do, and you come back to DIAL with not just text.
It's a much more powerful payload, with UI elements, interactive elements, things that DIAL will display for me, the user, to continue my interaction. And DIAL becomes this sort of center of mass for how your company can build, implement, and integrate AI into a single point of entry. And then DIAL goes... well, from day one, DIAL was a load-balancing, model-agnostic proxy. Right? Every model, every deployment has limits, you know, tokens per minute, tokens per day, requests per minute, whatever. If you're a large organization with large, different workflows, your AI appetite will likely go well beyond a single model deployment. You'd like to load-balance across multiple, and then you'd like to try different models, ideally with the same API for you, the consumer. So DIAL started like that: a load-balancing, model-agnostic proxy, a single point of entry. We can log everything that is prompted in the organization, and we can do analysis on that separately, because it's very helpful to know what kind of problems your teams are trying to solve. And then it evolved into this application hosting ecosystem. Now it's evolving clearly towards what MCP can bring in, because now you can connect a lot more things to it through MCP. I think it's running at 20-something clients by now.
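As an aside for readers: the DIAL app contract described here, an endpoint that takes a user prompt and returns more than plain text, can be sketched roughly as below. The field names and `attachments` structure are illustrative assumptions, not DIAL's actual API; see epam-rail.com for the real contract.

```python
# Illustrative sketch of a DIAL-style application contract: the app receives
# a user prompt and responds with text plus UI "attachments" the front end
# can render (buttons, forms, charts). Field names here are assumptions,
# not DIAL's actual schema.

def handle_prompt(prompt: str) -> dict:
    """Toy handler: returns an answer plus an interactive UI element."""
    return {
        "content": f"Here is what I found for: {prompt}",
        "attachments": [
            {
                "type": "button",          # UI element the client renders
                "title": "Run this query",
                "action": "rerun_query",   # id the client sends back on click
            }
        ],
    }

result = handle_prompt("show failed deployments")
print(result["content"])
print(len(result["attachments"]))
```

The point of the richer payload is that the conversation stays the interface while the app hands the front end structured elements to render, rather than flattening everything into prose.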
Daniel Wignak
So just a couple of follow up questions. It's been in the news a lot but just so people understand if maybe they haven't seen it, what are you referring to with MCP and kind of how that relates to some of this API interface that you're enabling?
Pawel Velar
Well, the easiest is to Google it. You can Google it; you're going to find it. It came out of Anthropic, the Claude folks. Let me tell you how I think about this, because that's more helpful than what it actually is. I think about it in very simple terms. MCP allows you to connect the existing software world to LLMs. And I don't want to hype it too much, because it's not yet a global standard or anything; it's very early, early days. It's been months, right? But what HTML, browsers, and HTTP enabled was connecting us, people, to software all over the world. MCP does that, but for LLMs. So if I today want to be able to prompt my application that is in front of an LLM to do things with additional tools... let's say I want to be able to search the file system based on what I prompted, and find a file and something in that file. Right? So my application needs to be able to do that. My option is: I can write that function, I can then tell my LLM, hey, here's this function you can call; if you want to call it, I'm going to call it for you.
Daniel Wignak
Great.
Pawel Velar
That's one function. What if I need to do something else? I want to go talk to my CRM system and get something out of there; I'm going to write that function. If I'm going to write all the functions I can think of, it's going to take me years, probably hundreds of years. So instead, what I can do today, I can say, hey, my LLM application, can you talk a protocol? Because there's a protocol called MCP. I'm going to bring you MCP servers that other people have built, for my CRM system, for my file system, for my CLI. There are MCP servers for everything. IntelliJ exposes itself as an MCP server, to do things that the IDE can do. Now you can orchestrate those things through the LLM. So you connect all these MCP servers, through an MCP client, this application in front of the LLM, to the LLM; you expose the tools to the LLM. The LLM can now ask the client to call a tool, and through the same MCP protocol the client calls the server, the server does the function that has been written in that server, and boom, the LLM gets results. So it's this connective tissue that did not exist three months ago. Three months ago, everybody was writing their own. And right now everybody, as far as I can tell, is writing MCP servers, and those who talk to LLMs, they consume MCP servers.
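To make that loop concrete, here is an SDK-free sketch of the dance described above: the client advertises tools, the model asks for a tool call, and the client executes it and returns the result. The tool registry stands in for connected MCP servers, and the model is a stub; a real implementation would use an MCP client library and an actual LLM.

```python
# Minimal sketch of the tool-call loop behind MCP. The "model" is a stub
# and the registry stands in for MCP servers; both are invented for
# illustration.

TOOLS = {
    "search_files": lambda query: [f for f in ["notes.txt", "report.md"] if query in f],
    "crm_lookup": lambda name: {"name": name, "status": "active"},
}

def fake_model(prompt: str) -> dict:
    """Stand-in for an LLM deciding which exposed tool to call."""
    if "file" in prompt:
        return {"tool": "search_files", "args": {"query": "report"}}
    return {"tool": "crm_lookup", "args": {"name": "Acme"}}

def run_turn(prompt: str):
    call = fake_model(prompt)                     # model picks a tool
    result = TOOLS[call["tool"]](**call["args"])  # client executes on its behalf
    return call["tool"], result

tool, result = run_turn("find that file about the report")
print(tool, result)  # search_files ['report.md']
```

The key property is that the application never hard-codes the function list: whatever servers are plugged in at runtime become tools the model can request.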
Daniel Wignak
Yeah. And maybe, just to expand people's understanding of some of the possibilities, I like the example that you gave of searching file systems. What are some of the things that you've seen implemented in DIAL, things that are being orchestrated, in general terms?
Pawel Velar
Let me give you a higher level and much more sort of fruitful example. Okay.
Daniel Wignak
Yeah.
Pawel Velar
We have our own agentic developer. It's called AI/Run CodeMie, because AI/Run has multiple different agentic systems; CodeMie is specifically coding-oriented. We have others oriented at other parts of the SDLC workflow. By the way, you can go to SWE-bench and look at the verified list. I believe CodeMie as of now takes fifth; it's number five on the list of all the agents who compete for solving open source defects and such. So CodeMie as an agentic system has many different assistants in it. DIAL, as a generic front door, as a ChatGPT of sorts, would like to be able to run those assistants for you as you talk to DIAL. And until MCP, it really couldn't, other than: hey, CodeMie, implement an API for all of your assistants, and let me learn to call all of your APIs. Now the story is: hey, CodeMie, give me an MCP server for you, which is what they have done. DIAL as an MCP client can now connect to all the CodeMie features, all the assistants, expose them as tools to an LLM, and orchestrate them for me. So I come into the chat, I ask for something; that something includes reading a code base and making architecture sketches or proposals or an evaluation. The LLM will ask CodeMie assistants to go and read that code base, because there is a feature in CodeMie that does it. And DIAL needs to only orchestrate; it doesn't need to rebuild or build from scratch. That's the idea. So this is the example.
Daniel Wignak
Yeah. Could you talk a little bit? I'm asking selfish questions, because sometimes I get asked these myself, and I'm always curious how people answer. One of the questions that I get asked a lot with respect to this topic is: okay, I have tool or function or assistant 1, and then I have assistant 2, and then I have a few, right? And it's fairly easy to route between them, because they're very distinct. But now imagine that I could call one of a thousand assistants or functions, or later on, ten thousand. How are the scaling and the routing actually affected as you expand the space of things that you can do?
Pawel Velar
So that, I think... and again, I can't know and I don't know, but I think that is still the secret sauce, in a way. That is still why there are all of these coding agents on SWE-bench. All of them work with, let's say, Claude Sonnet 3.5 or Sonnet 3.7, or GPT-4o. The LLM is the same, and yet the results are clearly different; some score 10 points higher than the others. You go to Cursor, the IDE, you ask it something, it does something; you switch the mode to Max. They've very recently introduced, on Sonnet 3.7 and now on Gemini 2.5, a Max mode, which is pay-per-use versus their normal monthly plans, because Max will do more iterations, will spend more tokens, will be more expensive, will likely run through more complex orchestrations of prompts and tools and whatnot to give you better results. So how you build the pyramid of choices for your LLM... because, yeah, you will not give the LLM a thousand tools. If you as a human look at a thousand options, you lose yourself; you know, even a hundred options. Again, I don't know, but I expect an LLM to have the same sort of oops, overwhelmed effect. You don't want to give it a thousand tools. You want to give it groups. You want to say, hey, pick a group, and then within that... so you want to do this basically like a pyramid, like a tree. But how you build it and how you prompt it and how you do this, that's still on you. This is the application that connects the MCPs, the tools that it itself has, the prompt that the user has given, the system instructions, and some of the chain of thought the LLM can build. And this is going to be a very interesting balance: what do you ask the LLM to build? How much of this sequencing of steps will be in your hands, versus how much are you going to delegate to the LLM and ask it to come up with a sequence of steps? And from what I've seen over the last year, you're better off delegating more to LLMs, because they get better at it.
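The grouping idea, pick a group first and only then a tool within it, can be sketched as a two-level router. The keyword matching below is a crude stand-in for the LLM's choice, and the group and tool names are invented:

```python
# Sketch of the "pyramid" of tool choices: instead of showing the model a
# thousand tools at once, it first picks a group, then a tool inside that
# group, so each decision is over a short list.

def pick(options: dict, prompt: str) -> str:
    """Stand-in for an LLM choosing among a short list of labeled options."""
    for name, keywords in options.items():
        if any(k in prompt for k in keywords):
            return name
    return next(iter(options))  # fall back to the first option

GROUPS = {
    "files": ["file", "search"],
    "crm": ["account", "deal"],
}
TOOLS = {
    "files": {"search_files": ["search"], "read_file": ["read"]},
    "crm": {"lookup_account": ["account"], "list_deals": ["deal"]},
}

def route(prompt: str):
    group = pick(GROUPS, prompt)        # first level: pick a group
    tool = pick(TOOLS[group], prompt)   # second level: pick within the group
    return group, tool

print(route("look up the account for Acme"))  # ('crm', 'lookup_account')
```

With a real LLM making each choice, the same shape holds: two small prompts, each over a handful of options, instead of one prompt over the full flat catalog.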
So the more you control the sequence yourself, the more sort of inflexible it becomes. You're better off delegating to an LLM, but you don't expect it to just figure it out from one prompt. Daniel, I can give you that example that I gave in the beginning, if you want, about the failure.
Daniel Wignak
Yeah, go for it.
Pawel Velar
So I use AI. I build with AI, right? But I also use AI as a developer. I'm on Cursor as my primary IDE these days. I use the AI/Run CodeMie that I mentioned. I play around with other things as they come up, like Claude Code. But I also record what I do, little snippets, 5-10 minute videos, for my engineering audience at EPAM, for folks to just look at what it is that I'm doing, learn from how I do it, try to think the same way, try to replicate, get on board with using AI. So I set out to do a task I wanted to record; I wanted to, on record, get a productivity increase with a timer. My plan was: I'm going to estimate how long it's going to take me, announce, let's say, two hours, and do it with an agent. And I always pause my video when the agent is thinking, because that's a boring step, but the timer is going to keep ticking. And at the end I'm going to arrive at, let's say, an hour, maybe 40 minutes, out of two. Boom, that's the productivity gain. And 30 minutes in, I completely failed. I had to scrap everything that the LLM and agents wrote for me and start from scratch. And my problem was I over-prompted. I thought I knew what I wanted the agent to do. There were three steps, like: copy this, write this, refactor this, and you're done. And it did; it iterated for 10 minutes. It was the CodeMie agentic developer that we have. When I scrapped it and started doing it myself, I did half of it, stopped, and realized that the other half was not needed. It was stupid of me to ask. So the correct approach would have been to iterate: do the first half, stop, rethink, and then decide what to do next. But the agent was given the instruction to go all the way, so it went all the way. And this is the other thing with a thousand instructions, right? You don't want an agent to be asked to do something that you think you know, but you only really will...
Daniel Wignak
Know as you iterate through in these cases as well. So, like, I find your experience with, you know, balancing how you prompt it, how far the agent goes... all of this is intuition that you're kind of learning. One of the things that was interesting: we just had Kyle Daigle, the COO of GitHub, on, and we were talking about agents and coding assistants. One of his thoughts was also around the orchestration after you have generated some code, right? It's one thing to create a project, create something new, but most of software development kind of happens past that point, right? And I'm curious, as someone who is really trialing these tools day in and day out, kind of as your daily driver... I think that's on people's minds: oh, cool, I can go into this tool and generate a new project, whatever it is; you always see the demo of creating a new video game or whatever the thing is, right? But ultimately, I have a code base that is very massive, right? I am maintaining it over time, and most of the work is more on that operational side. So in your experience with this set of tooling, what has been your learning? Any insights there? Any thoughts on where that side of things is heading, especially since you're dealing with, I'm sure, real-world use cases with your customers who have large code bases?
Pawel Velar
Well, I'm so glad that you asked, because what I do is actually that latter aspect. I have a monorepo with like 20 different things in it that could have been separate repos of their own. So I have a large code base that I work with, and I actually saw our own developer agent occasionally choke, because it attempts to read too much and it just chokes on tokens and limits and things it can do per minute or per hour. So that's one thing. But what I find myself doing with Cursor, for example: I actually pinpoint it very actively, very often, because I want it to work with these files. When it's something specific, I'll just point the files at it, and I'm going to prompt it in the context of these three or four files, and that limits how much it's going to go out. But really, back to your question. To me it's not about code bases that much. Well, maybe if I do something greenfield and fun, it's going to write it, I'm going to run it, and if it works, it's all I need; it's correct, it works, great. But it's still a mental shift, it's still early. I'm still looking at and thinking of the code base that I write with my agents as a code base that will be supported by other people, likely with agents, but people still. So correct by itself is not good enough. I want it to be aesthetically the same. I want it to follow the same patterns. I want it to make sense for my other developers who will come in after me. I want it to be as if it's the code that I have written, or at least more or less. And that slows me down a little bit, clearly, I'm sure. But the other thing is, I am the bottleneck. An agent will take minutes, single-digit minutes, if not less, to spit out whatever it spits out. And oftentimes in code bases it's not a single file; it's edits in multiple places. Then I have to come in and read it. Here's the difference.
When I write myself, my brain has a timeline. I was thinking as I was typing. I know how I arrived at what I have arrived at. I may decide that it's bull, you know, scrap it and start over; that happens, we're all developers. But I know how I arrived at where I am. When I look at what the agent produced for me, I have no idea how it arrived there. I need to reverse-engineer: why? What did it do? It takes time. I tried recording that, and I can't, because I can't speak and think at the same time. This is the bottleneck, literally. The other thing is, when I was doing that video with a timer, I expected certain outcomes. I also knew that if it works, I'm going to say this at the end; I'm going to say, guys, look, it took me, let's say, 30 minutes out of an hour. So it's 2x, right? Literally 2x productivity improvement. Amazing, isn't it? But here's the thing. Within the 30 minutes that I've spent, the percentage of time I spent critically thinking was much higher than normal, and the percentage of time I spent doing boilerplate was much lower, because the agents did that. I really critically thought about what to ask, how to prompt, and then analyzed what it did, thinking about what to do next. Do I edit? Do I re-prompt? Can I sustain the same higher percentage of critical thinking for the full day to get 2x in a day? Probably I can't. So what's probably going to happen? I'm probably not going to get 2x, but I'm going to use the time in between, as agents work, to do something else. My day will likely get broken down into more, smaller sections. My overall daily productivity is likely to increase. I'm likely to do more things in parallel; maybe I'll do some research, maybe I'll answer more emails. Right? But it's going to be more chaotic, and likely more taxing. I don't think we've learned yet. I don't think we've had enough experience yet. I don't think many people talk about this yet.
People talk about, oh my God, look what I've built with the agent. I wonder how they're going to talk about having worked for, like, six months with agents, and how the six months they've done with agents are better than six months without, and how they feel at the end of the day. And think about being in the zone, right? We all, I hope, as engineers, like to, you know, disconnect all emails, whatever, get the music on, IDE in front of you, and you're in it for like two hours. With agents, you just can't. You prompt an agent, it goes off doing something. What do you do? Do you pull up your phone? And then your productivity increases one way, your screen time increases the other way; it's not a good idea. What can you do? What do you do in this minute and a half, or three, and you don't know how long? Well, you can see the outcomes coming up, but the agent is still spinning. So I'm sorry, it's a long answer to a question, but that's what I'm thinking about constantly, and that's what I don't yet have answers for.
Daniel Wignak
Yeah.
Pawel Velar
But I really hope to eventually, through experiments and recording and thinking, arrive at at least what it means for me because I cannot even tell you what it means for me yet.
Daniel Wignak
Yeah, yeah. I mean, I experienced this yesterday, too, because I'm preparing various things for investors, you know, updating some competitive analysis and that sort of thing. When you have, whatever it was, I think it was 116 companies, and I think, oh, I'm going to update all of these things for all of these companies... obviously, I'm going to use an AI agent to do this. This is not something I want to do manually, putting in all of these things and searching websites. So I did that. But to your point, I could figure out how to do a piece of that and get it running, and then I see it running and I realize that this will take however long it is, right? Ten minutes, or whatever the time frame is. And then you context-switch out of that to something else, which for me I think was email. I'm like, oh, this is going to run; I'm going to go answer some emails or something like that, which in one way was productive. But then I had to context-switch back, and, oh, why did it output all these things? It happened that I wasn't watching the output, and in one case, when I ran it, I was like, oh, I really should have had it output this column or this field. But I didn't think of that before, and I wasn't looking, because I had turned away from the agent, back to my email. I think this is a really interesting set of problems. Yeah, it's a new way of working that hasn't been parsed out yet.
Pawel Velar
And I try not to do it. Like, I try. But then you sit idle. You literally sit idle, and it doesn't feel good. It feels like, oh my God, why am I not doing anything?
Daniel Wignak
Yeah, it's an interesting dynamic, that's for sure. And I've definitely seen people show having multiple agents working on different projects at the same time. And when I see someone with two screens and things popping up all over the place, there's no way I could, in my brain, sort of monitor all of that that's going on, right?
Pawel Velar
It must be very taxing, first. And second, half of those merge requests, pull requests from the agents, will be, let's say, subpar.
Daniel Wignak
Yeah.
Pawel Velar
Frustration will rise, too. You would think, man, I would have done it already myself, much better. Like, what is this? Emotionally, it is a very different way of working.
Daniel Wignak
Yes.
Pawel Velar
And I really... well, I keep thinking about it, I can't forget it. I advise people to also think, not just about productivity gains, not just about delegating to agents and enjoying the results; think about how it changes the dynamic of your day, and how you think about it afterwards. Right?
Daniel Wignak
Yeah, yeah. That's interesting. So I know we're circling way back to that interesting discussion, but I do want to make sure people can find some of what you're doing with DIAL. You mentioned the open source piece of this. What's needed from the user perspective to spin this up and start testing it? And for those out there that are interested in trying some things with the project, what would you tell them as a starting point, and what is the process like to get a system like this up and running?
Pawel Velar
I'm actually not sure I can tell for DIAL specifically. Nobody is running local DIALs; it's not something you run locally.
Daniel Wignak
Gotcha.
Pawel Velar
It's something that you run sort of centrally in an organization. The size can be different, but you expose it to your people through, like, a URL that they all can go to, and they sort of use AI through DIAL and do things through DIAL.
Daniel Wignak
Interesting.
Pawel Velar
One of the apps we built as an example earlier, it was last year, was like talk-to-your-data. If you look at the analytics players, the Snowflakes of the world, they all have something like this today: a semantic layer, which you work on, and then through the semantic layer, through prompting and through some query conversions and connectors to data warehouses and data lakes, you get yourself a chat with your data, like analytical reports, graphs, tables. So we built that; it was built into DIAL. So you go to DIAL, and then, again, imagine ChatGPT. Imagine a ChatGPT that allows you to choose what model you talk to, right? Not just OpenAI models, but all the other models that exist, as well as applications. You go to this ChatGPT, which is in our case DIAL, you select this Data Heart AI, we call it, which is our talk-to-your-data app, and you start talking to it. This is still your DIAL experience, but you're really talking to an app that then talks to the semantic layer, builds queries based on your questions, runs them, gets data back, and visualizes it in DIAL, because DIAL has all these visualization capabilities; I explained how it's not just text coming back. It builds your charts, and you can interact with them. But again, you don't run DIAL locally. If you want to explore what it is, go to, I think it's epam-rail.com, and you're going to read about what it is, and you're going to find all the links, hopefully documentation, how-tos, you know. But also, most companies we work with want more than just, hey, how do we install it? They want: now we want to build with it. And that's where we come in with professional services. We can build things for their DIAL so that they can do the AI that matters to them, in their context, with their data, with their workflows, with their restrictions on things they can and cannot do, and yada, yada, yada.
Daniel Wignak
Yeah. And I'm wondering, for this kind of zoo of underlying applications or assistants, because you've obviously been working in this area for some time: do you have any insights or learnings around easy wins for underlying functions or agents that can be tied into this sort of orchestration layer, or maybe the more challenging ones? Things that you've learned over time in developing and working with these, in terms of what you could highlight as easy types of wins? You mentioned the workflow stuff around some of what isn't yet figured out, but more on the orchestration layer and the function calling: what are some areas of challenge, or things that might not be figured out yet, that you think are interesting to explore in the future?
Pawel Velar
Let me think, because my first thought was... so you're asking about connecting tools and functions to an LLM, and which of the functions, or what type of connectivity, is easier?
Daniel Wignak
Yeah. Is there anything that's out of scope or more of a challenge currently, or is it fair game? I guess it's whatever you can build into that function, into the assistant. But yeah, what limitations are there, or challenges, in that mode of developing these underlying functions or tools?
Pawel Velar
I see. So it's kind of a twofold answer. If you take the technicality aspect, like how do I build a tool that does X, the complexity is really in X. If you want to go and query a database, how hard is that? Well, not hard, right? I mean, connectivity to the database; if you have a query, you run it, you get results back. So the technicality of querying a database is not hard to do. Making it useful, and making the result useful in the context of the user's prompt and conversation, is a lot more challenging. So I'm running a service. It actually has a public web page, called api.epam.com. It's our own, so you will not really get past the front page, but you'll understand what it is. It's a collection of APIs that my team has built that exposes a lot of data. Remember I said EPAM runs on internal software? All of those applications stream their data and their events out into a global data hub; think a big, big, big Kafka cluster. But that's Kafka, so you can read data out of it as a Kafka consumer. But if you want something more modern, you know, API, search, look up this, that... so we have an API service over all of the data. And somebody came to me today and said, hey, have you heard of MCP? I'm like, yes, of course I have. Why don't you guys build MCP for api.epam.com? My answer is: it is easy to build. api.epam.com speaks RSQL. I can build a server that will take your query and create RSQL; an LLM will be able to do that easily, run it, give back the data. But, I said, it's not going to be useful, because these are single-dataset APIs. Your questions are likely analytical. You likely want to ask something that expects me to do a summary by month, this, this, and this, and give you... that's a very different question. So you asked me about MCP to an API: easy to do. Making it useful for your actual use case: much harder to do. I likely need to do a lot more than just connect a tool to an LLM.
I need to understand what you're asking, figure out the orchestration that is required, maybe custom apps, maybe something else. And then you start hitting authentication, legacy apps, all the other roadblocks. And in a way, the talk-to-your-data is an amazing prototype that we built, and I have a video about it, but we sort of stopped, because we clearly sensed how steep the curve is to get it to actual production. Because what we envisioned we could do was analytics democratized, so you don't have to go to the analytics team, ask them to build you a new Power BI report, and have them spend a week doing so. You can just come into DIAL and say, hey, show me this, this, this, and this. And yes, we technically can do it, but to be able to do this for all kinds of questions you can ask about our data, that's a much harder thing to do.
Daniel Wignak
Yeah, yeah. And to your point, the underlying systems might have limitations. In analytics-related use cases that we've encountered with our customers, I'll often just ask the question: hey, if you gave this database schema to a reasonably educated college intern, whatever that is, and asked what columns would be relevant to query based on this natural language query, you can pretty easily tease it out. Well, I look at all these columns, I have Field 157 and CustomNewfield; there's no way for just someone, cold, to know anything about that. And so it's not really a limitation of what's possible in terms of the technicality, like you said; it's more that you're not always set up for success in terms of utility, like you mentioned.
Pawel Velar
And for data, that's where a semantic layer comes in. If you have descriptions of your columns and your tables with business meaning, then connecting that semantic layer, with some data samples, to the LLM will allow it to write the query that you thought was impossible to write, because it is impossible without it. The semantic layer sort of explains the data that you have in business terms, in the language in which the questions will be asked of your assistant. And that's what allows us to do this talk-to-your-data analytics.
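The semantic layer Pawel describes can be as simple as attaching business descriptions to cryptic table and column names and folding them into the text-to-SQL prompt. A minimal sketch follows; the schema, the descriptions, and the prompt shape are all illustrative, not EPAM's actual implementation.

```python
# Hypothetical semantic layer: business meaning attached to raw column names,
# including the kind of opaque fields ("Field 157") mentioned in the episode.
SEMANTIC_LAYER = {
    "table": "fact_sales",
    "description": "One row per completed customer order.",
    "columns": {
        "field_157": "Net revenue in USD after discounts.",
        "custom_new_field": "Sales region code (EU, US, APAC).",
        "order_ts": "Order completion timestamp (UTC).",
    },
}

def build_sql_prompt(question: str, layer: dict) -> str:
    """Fold table and column descriptions into a text-to-SQL prompt."""
    cols = "\n".join(f"- {name}: {desc}" for name, desc in layer["columns"].items())
    return (
        f"Table {layer['table']}: {layer['description']}\n"
        f"Columns:\n{cols}\n\n"
        f"Write a SQL query answering: {question}"
    )

prompt = build_sql_prompt("Monthly revenue by region?", SEMANTIC_LAYER)
```

Without the description lines, the model would have to guess that `field_157` is revenue; with them, the mapping from the business question to the raw columns is explicit in the prompt.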
Daniel Wignak
Yeah. Well, I know we've talked about a lot of things, and you are probably seeing a good number of use cases across your clients at EPAM, and also your own experiments with Dial and other things. I'm wondering, as you lay in bed at night, or whenever you're thinking about the future of AI, maybe it's all the time, or maybe it's not at night. But bringing it all the way back to the beginning, you see what is possible to do now, which even six months or a year ago was not possible. What is most exciting or most interesting for you to see how it plays out in the next six to twelve months? What is constantly on your mind of where things are going? It sounds like how we work with these tools is one of those things; we already talked about that a little bit. But what else is exciting for you, or encouraging, in terms of how you see these things developing?
Pawel Velar
My answer may surprise you. When I think about it, I don't anticipate any new greatness to come. I actually mostly worry. And I worry because I know that my thinking is linear. Like most of us, even though looking back we know that technology has been evolving rather exponentially, our ability to project into the future and think about what's coming next is linear. So I am unlikely to properly anticipate, get ready for, and then expect and wait for what's to come. I am sure to be surprised. And I guess, like everybody else, I'll be doing my best to hold on, to not fall off. So I worry, seeing how the entry barriers rise; it's harder for more junior people to get in. Today, when I'm asked what skills people should focus on to be better prepared for the future, I always answer with the same things: fundamentals, and then critical systems thinking. Fundamentals you can read about a lot, but you really master them when you work with them yourself, not when someone else works with them for you. And not having them is likely going to keep you from being able to properly curate and orchestrate all these powerful AI agents. And when they get so powerful that they don't need you to curate and orchestrate them, then what does that do to you as an engineer? Maybe that's not the right thinking, but this is what I think about at night, like you asked. When I think about AI and what's coming, I am excited; as an engineer, I like using all of this. I just don't know how it's going to reshape the industry and how it's going to change my work in the years to come.
Daniel Wignak
Yeah. Well, even in talking through with you some of the work that you and I have been doing with agents, it has triggered a lot of questions in our own minds about what the proper way of working around this is. And I think that's a widespread issue that people are going to have to navigate. So yeah, I think it's very valid, and we'll be interested to see how it develops. We'd love to have you back on the show to hear your learnings again in six or twelve months, of how it's shaking out for you. Really appreciate you joining. It's been a great conversation.
Pawel Velar
Thank you very much. It's been a pleasure.
Podcast Host (Changelog Announcer)
All right, that is our show for this week. If you haven't checked out our Changelog newsletter, head to changelog.com/news. There you'll find 29 reasons, yes, 29 reasons, why you should subscribe. I'll tell you reason number 17: you might actually start looking forward to Mondays.
Pawel Velar
Sounds like somebody's got a case of the Mondays.
Podcast Host (Changelog Announcer)
28 more reasons are waiting for you at changelog.com/news. Thanks again to our partners at Fly.io, to Breakmaster Cylinder for the beats, and to you for listening. That is all for now, but we'll talk to you again next time.
Episode Date: April 14, 2025
Host: Daniel Wignak (CEO, Prediction Guard)
Guest: Pawel Velar (Chief Technologist, EPAM Systems)
This episode explores the practical realities and innovations behind orchestrating large language model (LLM) agents, APIs, and the emerging MCP (Model Context Protocol) servers. Host Daniel Wignak is joined by Pawel Velar from EPAM Systems for a deep dive into real-world implementations, technical integration challenges, and the evolving dynamics of working with AI agents, focusing on EPAM's open-source orchestration platform, Dial, and the broader impact of such technologies in enterprise environments.
What is Dial?
An open-source, conversational AI orchestration platform that acts as a "ChatGPT-like" interface for enterprise, but with additional capabilities—streamlining multiple models, tools, APIs, and organizational workflows into a unified point of access.
Features:
Centralized Logging & Analytics:
Single entry point allows detailed logging and analysis of all prompts—enabling organizational insight.
"Dial becomes this sort of center mass of how your company can build, implement, integrate AI into this single point of entry."
— Pawel Velar (05:34)
What is MCP?
Analogy to how HTTP/HTML connected users and software globally; MCP connects LLMs to the world of software tools and services with a common protocol.
"MCP allows to connect the existing software world to LLMs..."
— Pawel Velar (07:45)
Plug-and-Play Tooling:
MCP servers are emerging for all sorts of services (CRMs, file systems, IDEs), making LLM integration far more scalable and modular.
Rapid Adoption:
"As far as I can tell, everybody's writing MCP servers...and those who talk to LLMs, they consume MCP servers."
— Pawel Velar (10:02)
Agentic Developer Example:
EPAM AI/Run's CodeMie (coding agent) integrates with Dial via an MCP server, enabling orchestration through a simple conversational interface.
"Dial, acting as a generic front door... can now connect to all CodeMie features...expose them as tools to an LLM and orchestrate them for me."
— Pawel Velar (11:30)
Managing Expanding Toolsets:
As the number of available tools/assistants grows, effective routing and context management becomes key.
Optimal Delegation to LLMs:
The emergent best practice is to let LLMs orchestrate steps where possible, as they’re rapidly improving at managing complexity.
"You're better off delegating to LLMs because they get better at it. But you don't expect it to just figure out from one prompt."
— Pawel Velar (15:14)
Case Study – Pitfalls & Learnings:
"When I look at what agent produced for me I have no idea how it arrived at where I am. I need to reverse engineer."
— Pawel Velar (20:33)
Changing Workflow Dynamics:
Quote Highlight:
"The percentage of time I spent critically thinking was much higher than normal. Percentage of time I spent doing boilerplate is much lower because the agents did this."
— Pawel Velar (21:43)
Easier Integrations:
Simple data queries and API connections are straightforward technically; the challenge is making outputs relevant and user-friendly.
Harder Cases:
Achieving true utility—especially in analytics and complex orchestration—requires domain understanding, semantic modeling, and building higher-level abstractions over raw data/APIs.
"Technically can do it, but to be able to do this for all kinds of questions you can ask about our data, that's a much harder thing to do."
— Pawel Velar (36:55)
| Time | Segment Description |
|-------------|----------------------------------------------------------------|
| 02:00–03:17 | EPAM’s history and “AI-first” internal evolution |
| 03:49–07:28 | GenAI orchestration, introduction to Dial platform |
| 07:28–10:13 | What is MCP? Real-world analogy and rapid adoption |
| 10:36–12:34 | Agentic assistants and orchestrating CodeMie with Dial |
| 13:25–16:02 | The challenge of scaling/routing among many agents/tools |
| 16:03–25:01 | Human/agent collaboration pitfalls, productivity dynamics |
| 28:09–31:25 | Deploying Dial and "talk to your data" examples |
| 32:27–37:58 | Integration challenges, semantic layers, utility vs. tech |
| 39:09–41:56 | AI's exponential future, industry anxieties and excitement |
Both Daniel and Pawel speak candidly from hands-on experience, blending technical detail with self-reflection and practical advice. The conversation is open, inquisitive, and at times philosophical about the impact of AI on productivity and the future of software engineering.
For more on Dial and MCP-based orchestration, visit epam-rail.com.