
David
The key here is that we want to be able to forecast the relative attractiveness, or the relative return, for a broad universe of stocks over the next month. Defining that universe of features, the human is better at. Now, with our investment horizon, what the machine learning, the AI, is way better at is understanding which of those features should be most emphasized and which of those features work most effectively together. What we're increasingly finding, though, is that as the horizon gets longer and longer, the real benefit of using the machine learning over and above just building a traditional factor model, in terms of how much of the return you can forecast with it, gets lower. If the ratio of up/down is above this, and you're within 10 days of the reporting period, and the short interest is below this level, and the CEO has been in the seat for at least six months: it's a lot of those combinations of things working together. That is the power of the model.
Host
David, thank you very much for joining us on Excess Returns.
David
Well, thank you very much for having me.
Host
As head of quantitative investing at Pictet Asset Management, you play a very important role at the firm in terms of leading and shaping how the firm approaches building systematic strategies. And that includes strategies that use artificial intelligence and machine learning for stock selection. So I think today what we wanted to do is sit down with you and talk about how you are leveraging AI and machine learning in your investment process, what it can and can't solve for, how you are thinking about this intersection of human judgment and a machine-driven investment process, and how you think things may evolve in the future. That's going to be the broader framework that we'll get into with you today, and then we'll talk specifically about how these things are being used in building the portfolios that you guys are running. I should note that Pictet recently launched a series of US-based ETFs. You can Google that, but the URL is etf.am.pictet.com for the US strategies. And I think a few of those strategies actually are using or have an AI orientation or focus. So a lot of what we're going to talk about today is powering how you guys are building those portfolios. So again, thank you very much for joining us. I think our audience is going to find this conversation very, very interesting. Where I wanted to start with you is a little bit more broadly, just to set the stage. When we think about AI, it is a very broad term. So when you think about it in an investing context, can you explain what you mean, what you think about when you hear the words artificial intelligence used to describe what powers an investment strategy or process?
David
Yeah, I mean we get asked this sort of like this high level starting point quite a bit. So I mean AI, artificial intelligence, I simply think it means a machine doing some kind of function or role that a human would generally do. So it's whether it's, whether it's generating something, whether it's sensing something, it's doing something in a human like way. In practice though, generally what we're talking about, when most people are talking about AI and finance, they mean machine learning. Not all AI is machine learning, but machine learning is a very large subset of it. And what we generally mean there is rather than a human programming an algorithm to do a task, an algorithm is trained on data to learn to do that task.
Host
Can you kind of bring that through? Like how does that work with like the training aspect of it? Because that's a very important part of machine learning. So sort of talk through how the system actually does that.
David
Okay, so again, I think we're using the right kind of terminology here. I sometimes think the learning aspect confuses people because it gives the impression that most machine learning is very free and unsupervised and there isn't a lot of structure to it. But I like it once we start using the word train, because I think it highlights the structure around a lot of what's going on here. So using the example of what we do in our world: essentially a big part of the training is defining what you want to train the algorithm to do. And for us, powering that ETF that you mentioned, Pequan is the one that is run by my team, of the three that Pictet have launched. The key here is that we want to be able to forecast the relative attractiveness, or the relative return, for a broad universe of stocks over the next month. Now to do that, we define a large amount of input information. Jargon-wise we would call these features; they tell us something about the stock or the company. We then define an output that we want to train the model to be able to forecast going forward. And that output, again, is a one-month specific return on every company in the universe. And then rather than a portfolio manager assigning weights to those features or characteristics of the company and testing how a model would look with that weighting scheme, we actually take decades of information, the point-in-time features and the rolling one-month forward returns, and an algorithm is trained on the data to understand how you most effectively move from that input to that output.
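As a rough illustration of the training setup described here, the sketch below pairs features known at month t with the return over the following month. It is not the team's pipeline: the function name, the toy tickers, and in particular the choice to express the "relative" target as the stock's return minus the universe average for that month are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# (month index, ticker, features known at that month, relative forward return)
Row = Tuple[int, str, List[float], float]


def build_training_rows(
    prices: Dict[str, List[float]],
    features: Dict[str, List[List[float]]],
    horizon: int = 1,
) -> List[Row]:
    """Pair point-in-time features at month t with the return over (t, t + horizon].

    Assumption: the target is demeaned across the universe each month, so the
    model is asked to rank stocks against each other rather than to call the market.
    """
    n_months = min(len(series) for series in prices.values())
    rows: List[Row] = []
    for t in range(n_months - horizon):  # the last `horizon` months have no target yet
        raw = {s: p[t + horizon] / p[t] - 1.0 for s, p in prices.items()}
        universe_mean = sum(raw.values()) / len(raw)
        for ticker in sorted(prices):
            rows.append((t, ticker, features[ticker][t], raw[ticker] - universe_mean))
    return rows


# Toy data: two hypothetical tickers, three month-end prices, one feature each.
prices = {"AAA": [100.0, 110.0, 121.0], "BBB": [100.0, 100.0, 90.0]}
features = {"AAA": [[1.0], [2.0], [3.0]], "BBB": [[0.5], [0.6], [0.7]]}
rows = build_training_rows(prices, features)
```

The important property is that each row only uses features available at month t, while the target looks strictly forward, which is what "point in time" is guarding against.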
Host
I think we'll try to peel back the onion on a lot of what you just said. But before we get to that, I just want to ask you, because I think a lot of people that are listening to this have probably used some type of large language model, whether that be ChatGPT, Gemini or Claude. And I do think some investors, both individual investors and professional investors, are actually also using those large language models to select stocks. So how would you compare or contrast the machine learning method versus the ChatGPT, large language model method? How would you describe that?
David
I mean, so the first thing, in very shorthand, is that you've got to have the right technique for the specific task at hand. Now a large language model, as many of us are becoming much more aware, is a form of generative AI. It's a form of generative AI that is trained on text, and then it generates text as its output. It's not making a prediction, it's not classifying anything in a particular way. It's making a generative output. For what we're specifically wanting to do, and again, I defined it quite tightly, the approach that we're using is based around regression trees or decision trees. And again, we can go into a lot more detail about what that exactly means. They are very well suited to making broad forecasts, particularly numerical forecasts. They can be trained with differing types of data, but they offer benefits in the financial field well above LLMs in that they are very effective for making predictions. They're very stable approaches. LLMs, and the types of machine learning that you use to train LLMs, are not particularly stable. They're not always going to work in the same way. And they're very hard to interpret. The ChatGPTs of this world, I think, are getting better if you challenge them on why they've come up with their answer. But that type of approach is really not bedded into LLMs. With the types of approach that we're using, being able to interpret and understand why it comes up with its answers is a core benefit of that approach.
Host
Yeah, it's almost like LLMs will give you almost what you'd like. If you say, give me a portfolio that is made up of stocks, sometimes it almost feeds you back the prompts that you're giving it.
Co-host
Yeah.
Host
So it's not giving you a true objective thing, because it's tilted towards the bias that you have prompted it with. But just back to something you said earlier. Walk us through a simple decision tree, this process that is embedded in the machine learning. Give us an example of that with something like a financial data input, something that would go into that decision tree, I guess.
David
Okay.
Host
Process.
David
So we can talk more about the decision tree process that we're using and the specifics of it as we go through this. But in simple terms, it's almost exactly as it sounds. I think most of us would get the idea of a decision tree. You have splitting points, like different branches within the tree. Each of those splitting points is set by one of these features. And again, we can come into more detail on this, but for those of us that are in the quant world, "feature" is machine learning jargon. In practice, what we generally mean is a signal, a characteristic of the company or the stock. So you can think of a simple decision tree as having a feature or a signal at every splitting point, every branch in the tree. And then how you split, whether it is positive or negative for the forecast, is going to be set at a score of that feature. It could be something like: negative is the bottom one-third of the distribution and positive is the other two-thirds. It could be an actual defined level in the distribution of the score. So for every feature there is an assigned splitting point. And you can start to use and understand the effectiveness of this individual decision tree and, in the approach that we're using, then use that to start refining other decision trees that you can combine together.
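To make the splitting idea concrete, here is a minimal sketch of a one-split tree (a "stump") that cuts a feature at the one-third point of its distribution, followed by a simple loop in which each new stump is fitted to what the previous ones got wrong. Reading "refining other decision trees" as residual-fitting boosting is an assumption, as are all function names, the learning rate, and the toy data; the interview does not specify the algorithm.

```python
from typing import List, Sequence, Tuple

# (feature index, threshold, value below or at threshold, value above threshold)
Stump = Tuple[int, float, float, float]


def quantile(values: Sequence[float], q: float) -> float:
    """Crude empirical quantile: the element at position q in the sorted values."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


def fit_stump(X: List[List[float]], y: List[float], q: float = 1 / 3) -> Stump:
    """Split every feature at its q-quantile and keep the split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        thr = quantile([row[j] for row in X], q)
        left = [t for row, t in zip(X, y) if row[j] <= thr]
        right = [t for row, t in zip(X, y) if row[j] > thr]
        if not left or not right:
            continue  # this split separates nothing
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lv) ** 2 for t in left) + sum((t - rv) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, (j, thr, lv, rv))
    if best is None:
        raise ValueError("no feature produced a usable split")
    return best[1]


def predict_stump(stump: Stump, row: List[float]) -> float:
    j, thr, lv, rv = stump
    return lv if row[j] <= thr else rv


def fit_boosted(X: List[List[float]], y: List[float], n_trees: int = 3, lr: float = 0.5) -> List[Stump]:
    """Fit each new stump to the residual left by the stumps before it."""
    trees: List[Stump] = []
    residual = list(y)
    for _ in range(n_trees):
        stump = fit_stump(X, residual)
        trees.append(stump)
        residual = [r - lr * predict_stump(stump, row) for r, row in zip(residual, X)]
    return trees


def predict_boosted(trees: List[Stump], row: List[float], lr: float = 0.5) -> float:
    return sum(lr * predict_stump(stump, row) for stump in trees)


# Toy data: feature 0 separates the bottom third from the rest, feature 1 is noise.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0], [5.0, 3.0], [6.0, 6.0]]
y = [-1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
first_stump = fit_stump(X, y)
trees = fit_boosted(X, y)
```

On this toy data the first stump picks feature 0 and splits at its one-third point, and the combined forecast moves closer to the target with each additional stump. A production model would use deeper trees and search over split points rather than fixing one quantile.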
Co-host
Going back to the beginning, I want to talk about the data you referred to because I think that's probably an interesting choice you have to make, which is how much data am I putting in here? You know, I mean, you could probably do something very, you know, narrow and put just fundamental data or something like that. And you could probably get into things like the cars and number of cars in the Walmart parking lot or some of this more advanced data. Like how do you think about what to put in there in the first place?
David
Yeah, I mean, I think this is going to be a good way to benchmark what we do a little bit here, but it is simplifying it. Maybe you have to take a little bit of a step back here. So much of what we're going to talk about today is very technology focused, very data heavy, but a lot of the decisions made by the people that develop this and use it have a big impact on the way that the algorithm learns and then is structured. And probably one of the biggest macro decisions on what type of machine learning you're using is the data. I think we can simplify the decision that we made down to: do we want to go the more traditional data route that quants have been using for 30 or 40 years, your prices, your fundamental information, your sell-side forecasts, your positioning-type data? Or do you want to go the more alternative data route, as you alluded to: the satellite imagery, the smartphone locations, the social media posting, those types of things that have garnered a lot of excitement in the quant space? Now, through a lot of testing and refining our approach, we like to train our models over around a 15-year look back, a 15-year period of data. That means we have multiple different economic and market cycles within it. Our models can learn from a lot of historical information. They can learn about the persistence of the relationships that we're going to end up trading off. Now, a lot of the more alternative data does not have that kind of history. It might have three years, it might have five years of history. It also is unlikely to have full breadth. Using more traditional data, you're generally going to have a feature score for every company in your universe, and you'll probably have that every trading day, right back for that 15 years. If we start incorporating more alternative data, we don't have the history, we don't have the breadth. It creates a lot of challenges. So we have focused much more on that traditional data side, again because of the history, because of the breadth, and because a lot of those signals have a strong rationale for why they should work and a history of delivering alpha in traditional approaches as well.
Co-host
That rationale is what I wanted to ask you about next because that seems to be a little bit of a debate within the quant communities. We've always been told we should use factors that are intuitive, so we should use things where we can understand why this should influence stock prices. But I think a lot of people in the new age of machine learning are going a different direction, which is saying if it's in the data that is true, I don't care if I understand why it influences stock prices. So how do you think about that balance in terms of putting data in there that should have something to do with stock prices?
David
Yeah, I mean, I've got a sort of wry smile here. One of the reasons that I find this so interesting is you talked about the quant industry having this debate. I would say it's even going on in my own team. If I think about the backgrounds in my team, who works on these types of strategies, who has developed these strategies, we've got more traditional quants who have built factor models historically and are very, very wedded to the idea of that rationality, that explainability of the features that we would use. At the other extreme, we've got some of the computer scientists, data scientists, who are very much about the machine learning approach: let's just give it lots and lots of data and see what it comes up with. And then we've also got experts, and I would characterize this as more of a physics background, somewhere in the middle, who have got a lot of experience of building large models for other types of tasks. So we've probably ended up coming down somewhere in the middle. A lot of the features that we use are essentially signals. They are things that have been researched by us or academia and have a strong rationale for why they should work. There is a fundamental underpinning for why they work. There is a behavioral underpinning for why they should work. But we also have a great belief that we want to use as many of them as possible, and we are willing to put in additional features that would have less of a rationale or less of a history of working as signals, but that could potentially have a strong conditioning benefit in how they work together with other, more traditional signals. So I would say we've come out somewhere in the middle, but the majority of the features that we use do have some kind of rationale behind them.
Co-host
Do you think as an industry we'll go more and more towards not needing a rationale over time? I remember we talked to Cliff Asness. He was very resistant to doing any of this in AQR's process, but he's slowly been getting more and more comfortable with a little bit of this, the "I don't understand why this works, but it works" type of thing. Do you think as people get more comfortable with these techniques we're going to get away from this rationale over time?
David
Yeah, I mean, I think your horizon of investment is a part of this. I talked about how our forecasting model with this strategy is around one month. In some of our AI strategies we do combine in some models that we train with a multi-month horizon, but let's just view us as a one-month-horizon type of strategy for this answer. I think if you are, as quite a lot of the teams focusing on machine learning are, more at the high-frequency end of the market, intraday, then at that point the rationale becomes less important. I do think you can just feed the machines a lot of data, and you have to have that kind of acceptance around it. As you go further out on the horizon, I think the more that rationale will remain important to you. So I do think there are still going to be different schools within this. From our side, the way that I would characterize it is that a lot of the signals, the features that we train the model with, have the rationale. But then there are the relationships that the machine learning is understanding between those features, the interactions, the nonlinear elements, which is where we believe we're generating a lot of our alpha. I think we have to accept that with a lot of those relationships, you and I could maybe look at the thousands of relationships that the machine learning identified, and we could probably put a solid story on 10 or 20% of them for why they work, and then for a lot of them we don't understand why they work. So we've had to get comfortable with that kind of breakdown. We like rationale in our inputs, but increasingly, with what the machine finds in the relationships between those different pieces, we do just have to accept that they're a little bit less obvious.
Co-host
You mentioned you guys use a one-month time frame here. How do you think about time frame with this? Because traditional factors, value, momentum, would be over longer time frames. Do you think this type of technology is more suited for shorter time frames, or is that just the way you chose to implement it?
David
Yeah, I mean, a little bit of both is the short answer. Certainly in the refining of this approach we tried lots of different things. So let's use horizon. We trained five-day models, we trained 10-day models, we trained 20-day models. As I mentioned, in some of our strategies we do combine in some six-month trained models. What we're increasingly finding, though, is that as the horizon gets longer and longer, the real benefit of using the machine learning over and above just building a traditional factor model, in terms of how much of the return you can forecast with it, gets lower. So it does seem the horizon is a big part of that. The shorter the horizon becomes, the more beneficial the machine learning element becomes. For us, we had a history as a team of running factor strategies that emphasize more the quality and low-vol factors, which are really at the longer end of the quant spectrum. We wanted to have something that was complementary alongside that, so that we could build different strategies. So once we were into that shorter end of the market, we did find in all of our analysis that the machine learning just became much more helpful over that horizon.
Host
Do you have any theory as to why?
David
Yes, but some of it's conjecture, I think, even amongst us a little bit at this point, and I think people could justifiably argue about this. I think when you go out to a longer horizon, and maybe this would sound counterintuitive, there are maybe fewer things driving the relative return. You can pin it down to economic cycle elements and very specific company fundamental elements, things that are more aligned to style elements or common elements. Once you get down to the shorter horizon, again, counterintuitively, lots of different things are driving returns. It gets quite noisy at that point. And that noise is just very hard to understand. It's clearly really hard to understand as humans. But even with more traditional quant approaches, it's very hard to disentangle a lot of that noise that's going on. And machine learning does give you that ability and that capability.
Co-host
Do you think machine learning will make traditional factor strategies better? I mean in the US here there's tons of value ETFs and momentum ETFs and multi factor ETFs. I mean, do you think for that type of thing this will make those things better or do you think it doesn't add value for those types of strategies?
David
I think it has a lot of potential. But it comes back to one of the earlier questions. My team's example is the easiest way to illustrate this. I mentioned we run factor strategies. They particularly emphasize the quality and low-vol element, with a little bit of value. And then we run these machine-learned strategies that are forced to be factor neutral as much as possible, both in the way that we train the model and in the way we construct portfolios. At the moment, while we are very collaborative as a team, those strategies have been kept very separate. So one uses AI and machine learning, one doesn't. And again, it's because we've spent so much time training our machine learning engine for one very specific task. However, we are increasingly, as a team, investigating the idea: can the factors that we're using in our more factor-driven strategy be improved with other types of machine learning? I put a lot of hope, as I know a lot of quants do, in questions like: can you use LLMs to make better analyst forecasts? Can you analyze the sell-side reports that are written? Can you analyze the earnings call transcripts using LLMs? Can you analyze news? So I think there is real potential on the factor side to make improvements to some of the momentum models or the value models or the quality models using those types of things. But again, that is going to be a very different machine learning technique or tool than the one that we've built to make this prediction model.
Co-host
One of the misconceptions I think that's out there about machine learning is this idea that it's just data mining. As quants we're always taught, there's this whole saying, you torture the data until it confesses. And we're taught not to keep testing things over and over and over again to get the result you want. And some people, I think, have a misconception that that's what machine learning does. So could you explain why that's not the case?
David
Yeah, I mean, I can, but there is also a little bit of it that kind of is data mining. That's the funny thing that you almost have to admit to. Clearly the power of the model that we've trained is that it's learning from data, so you're inherently building that element into it. But clearly you have to put a huge amount of guardrails in place to overcome and check for some of the challenges that this potentially creates. And the one that you're alluding to here is overfitting: over-forcing a model to look too much like the past, so that it's not necessarily going to work once you're using it live. So, a few things on how we look to overcome this. Firstly, the fact that a lot of the features we're using have this rationale to them is, I think, a good starting point. Then there are the types of technique that we use when we train the models over 15 years. Within that 15-year period we train on 12 years and then validate on three years that we pull out of the training set. We train some models on one 12 years and validate on three, we'll train another part on a different 12 and validate on a different three, and we do multiple different combinations of that, again trying to make it as robust as possible. Also, while we're basing the training on the last 15 years, we do exactly the same type of training outside that 15 years as well, to see whether the relationships that the model can identify and then generate alpha from were pretty stable 20 or 30 years ago. So there are lots of different techniques, statistical techniques, often pioneered outside finance, that you can use to make this training as robust as possible. But clearly, at its starting point, you are learning relationships from data.
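The 12-years-train, 3-years-validate scheme can be sketched as follows. This is only one way to realize it: holding out contiguous, non-overlapping three-year blocks is an assumption (the interview says only that different combinations are used), and the function name and year range are illustrative.

```python
from typing import List, Tuple


def block_splits(years: List[int], val_len: int = 3) -> List[Tuple[List[int], List[int]]]:
    """Hold out each contiguous block of `val_len` years in turn; train on the rest."""
    splits = []
    for start in range(0, len(years) - val_len + 1, val_len):
        val = years[start:start + val_len]
        train = [y for y in years if y not in val]
        splits.append((train, val))
    return splits


years = list(range(2010, 2025))  # a 15-year window, chosen arbitrarily for the example
splits = block_splits(years)
for train, val in splits:
    print(f"validate on {val[0]}-{val[-1]}, train on the other {len(train)} years")
```

Each of the five splits trains on 12 years and validates on the remaining three. Because the target is a forward return, a real setup would probably also need to drop observations that straddle the boundary between a training block and a validation block, which this sketch does not do.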
Co-host
One of the things I've been thinking about a lot is what do humans do better and what do these models do better? What does AI do better? And that's probably going to change over time, but I'm thinking like where you are right now, like what is, what does your team do better than these AI systems can do? And what do the AI systems do better than your team can do?
David
Yeah, I mean, I ask myself this a lot as well, thinking about how I run the team and how that might need to evolve over time. Firstly, if we think about the process that we're using here, the feature engineering, creating the signals, the features, is still predominantly a human-done approach within our team. I talked about how some of these ideas that we're using come from academia. Some of them we research ourselves. Some ideas can come from an interaction that someone is having with someone else in the firm, and Pictet does a lot of different things. But generally that type of thing is coming from us. The broad group of features that we choose to train the model with is defined by us. And I still think at this point, defining that universe of features is something the human is better at. Now, with our investment horizon, what the machine learning, the AI, is way better at is understanding which of those features should be most emphasized, which of those features work most effectively together, and really, essentially, structuring the model itself. So with the horizon that we're talking about here, the machine is better at structuring the model by learning from that historical data. Then, in the way that we take that and construct a portfolio, clearly that is a very automated process. Taking this information, interacting it with risk views, cost views and constraints, and then building a portfolio: that optimization is clearly much more effective in a machine-driven way, even if it's not necessarily an AI way. Then the final piece for us is checking a final portfolio before we go and trade it. While we have many dashboards that give us lots of useful information, I still like that a portfolio management team does that and has that final check, in case there is some news, some piece of information, that the model is unaware of. So the start and the end for us are a bit more human, but the piece in the middle is really where we see the real benefit of the machine.
Co-host
So do you not tell the models anything in terms of what data is more important when you feed it in? For instance, you're feeding in fundamentals and you think those should have a lot to do with stock prices, and you're also feeding in some peripheral data set. Do you say anything about this being more important than that, or do you let it figure that out?
David
I mean, so broadly we're letting it figure it out. Now there is an element of course that we are slightly forcing this by just the number of kind of features that we're using here. So if we think of the breakdown, around 400 features, like 100, 125 are price based. Similar number we build from sell side forecasts, we probably build, we build another hundred from more accounting based type information and the remainder is from kind of investor positioning, calendar information, from more qualitative type information about companies. So clearly some of those final pieces that I've mentioned there, they're smaller, so you have forced a little bit. The bigger groupings of features potentially have a higher likelihood of importance. But again, what is incredibly powerful about these decision tree based models is what it is learning is not just the importance of these features, which means that they would be further up in the decision making process. So they have a larger impact on the path that a stock would follow to make the forecast. But what other features cluster in with it? So what we tend to see is that there are certain features in the feature set that do not have a very high number of them, but they can work very effectively with the more common features and have a lot of conditioning elements to them, a lot of relationships that are built with their involvement. So again, we really find learning that from the data is much more effective than us trying to force that too hard.
Co-host
And is the data you're using an ongoing process? So for instance, you have a live model sitting here and then behind the scenes you have more of a test model where you're continually testing different data sets to see if it should be in the live model.
David
So we will add data sets and features over time. While Pequan, Pictet's active ETF for the international market, has launched quite recently, we've run this strategy, either long-short or long-only, in other vehicles, live now for over two years, over two and a half years. In that time, the feature set we're using has grown from around 250 to 400. Predominantly that is engineering features from existing data sets, but we are also testing other data sets and building other features from them as well. So yes, in the background we are constantly refining and developing the model that we use. As for how that then comes in live, whether new features are incorporated or not: every three months we do a full retraining of the model. The model is trained on 15 years of data, and every three months we roll it forward by three months. We drop the oldest three months, we add on three new months, and we do the training completely fresh. We set the parameters the same, so the structure around the training is the same as we were comfortable with previously. It's done without any knowledge that the previous model existed; it's completely retrained. And if we want to incorporate new features, we can at that point. But building it again from the ground up and making sure that the model looks very similar is a nice safety check that you've not done anything stupid in that training.
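The rolling retraining schedule (a 15-year window, moved forward three months at each retrain) can be sketched with simple month arithmetic. The function names and the example end date are hypothetical; only the 180-month window and the three-month step come from the interview.

```python
from typing import List, Tuple

YearMonth = Tuple[int, int]  # (year, month), month in 1..12


def month_add(ym: YearMonth, n: int) -> YearMonth:
    """Shift a (year, month) pair by n months; n may be negative."""
    index = ym[0] * 12 + (ym[1] - 1) + n
    return (index // 12, index % 12 + 1)


def retrain_windows(
    first_end: YearMonth,
    n_retrains: int,
    window_months: int = 180,  # 15 years of monthly data
    step_months: int = 3,      # quarterly retraining
) -> List[Tuple[YearMonth, YearMonth]]:
    """Return the inclusive (start, end) training window for each scheduled retrain."""
    windows = []
    for k in range(n_retrains):
        end = month_add(first_end, k * step_months)
        start = month_add(end, -(window_months - 1))
        windows.append((start, end))
    return windows


# Example: three consecutive retrains, the first ending in an arbitrary month.
windows = retrain_windows((2024, 12), 3)
for start, end in windows:
    print(f"train on {start[0]}-{start[1]:02d} through {end[0]}-{end[1]:02d}")
```

Each window drops the oldest three months and adds the newest three, so the window length stays fixed at 180 months. The comparison of the new model against the previous one, which the interview describes as a safety check, would sit outside this scheduling step.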
Co-host
I'm just curious, as you do these retraining runs, are you seeing the technology leap a lot over time? In the US with these LLMs, you know, we're seeing everybody jumping on top of each other and these massive changes, these massive innovations. As you keep looking at this, is what you're able to do, or how fast you're able to do it, changing rapidly over time?
David
I mean, certainly in the time that we've been doing this, the last five or six years, which has been an evolution from the first academic papers the team wrote on this, through to a more applied approach of training our own models, to then running live capital on it, the time it takes us to train this has definitely fallen. So we're finding efficiencies. Our hardware has also improved over time, so we have more computer processing chips, more graphics processing chips to speed these things up. So the approach is becoming more efficient and we can do these things quicker. But I think what is really interesting for us is how stable a lot of the relationships that we find actually are. So yes, the machine learning world is improving, and the efficiency in the way that we train models is improving. But in something that we consider so dynamic, the financial markets, the equity markets, many of the relationships that we find through the data analysis, the relationships between these different features, have been static for a long while. Versions of this model that we could train with 90s data have a lot of the same relationships between the features in it to now. And these relationships are still very, very effective. I would say that has been the biggest surprise to us through this approach: how stable a lot of these relationships we discover are.
Co-host
And I would think that's important in terms of thinking about things that will work over the long term. Like if every run you did, you got completely different results, you'd probably have less confidence than you do now because you're seeing similar things each time.
David
Exactly. And if someone asked me to decompose the benefit over and above a traditional factor model, setting aside the fact that they're slightly different things, particularly given the different horizons they aim at: as people who work in quant, and people like yourselves who have looked at this a lot, we know that many traditional factors have a decay element to them over time. That decay can be quite significant in some factors and slower in others. It's not necessarily a 45-degree angle; it can be a little more up and down than that. But generally there is a decay element to it. What's really exciting for us is that for the 40 to 50% of the return of our model that comes from capturing these non-linear relationships, these interactions between the features, the decay is just minimal in comparison to a linear exposure to a traditional factor. So that is really exciting for us in terms of the potential in this.
Co-host
Going back to the human versus AI thing, and I know none of us know the answer to this, but I'm thinking about, as we go well into the future, how is that going to change? Is AI going to be able to take over all the things we do well? Like you talked about, you guys are better at figuring out what to feed into the model. Are we all going to be sitting on a beach at some point while these models are just running and competing with each other? I'm just trying to think, if you look forward, how do you think about how this might play out?
David
I mean, I've worked in the quant industry for 25 years, and what I've seen over that time, and my colleagues and other people in the industry will have seen something very similar, is that almost every part of a quant process has become more automated. In almost every aspect of it, there have been ways to improve and get more technology in there. I've described some of the things that are more human-led for us, where there are stopping points in the process. Another stopping point for us is that our machine-learned model, our trained model, produces our relative return forecast. That forecast is then used within a relatively traditional mean-variance optimization to build the portfolio, again adding in risk views and cost views. We have research underway where we actually train a model that is risk and cost aware, so it outputs holdings for us rather than just a forecast that we then take into portfolio construction. Certainly people using machine learning at much higher frequency than us would have to work in this way given the time constraint. So this is just an example; I do think the steps are probably going to get more and more automated. But at the same time, a lot of my time is spent with our client base. And while people really appreciate the returns that we're able to deliver with this type of strategy, they do want to know that there are human elements to it. They do want to know that there is very strong human oversight, that there is guardrails and structure and supervision around how these models are trained. So I think you're always going to have a little bit of that side pushing back: they wouldn't want it to be the machine front to back.
Co-host
Yeah, it's interesting because when I talk to clients, one of the things that they often say is, you know, I use a quant strategy because I want to take emotion out of the investment process, human emotion. And that's very true. You do take a lot of it out. But what I always say to them is you don't completely take it out right now, because you want someone sitting behind that strategy who knows what they're doing, who decides what goes in there. You know, I could panic when my strategy is not working and I can start pulling stuff out of it. So right now the human sitting on top is a very important element, I think, of these strategies. And I wonder how that will change over time, and I don't think any of us necessarily know the answer.
David
Yeah, I think the human emotion and bias point is very interesting, and clearly quant was structured to overcome that. But we have to be fair to ourselves about the way that we work. If a human is building the model and allocating weight to groups of signals or certain factors, bias comes back in that way as well. So one of the reasons I do really like what we're doing with the machine learning side of things is that it takes away that piece where the human can get back into it and involve themselves and their views. We're taking that aspect away by learning from the data the feature importance and the way to structure these different things. So I think this is a step forward in further removing a source of bias that exists in traditional quant approaches and that maybe we overlook a little.
Co-host
I want to come back to ChatGPT. Justin mentioned it earlier, but a lot of people are now thinking about the idea of using it to build and test strategies. So, you know, something like: you are Warren Buffett, you know everything Warren Buffett has ever known, select stocks like Warren Buffett and create a strategy that'll work going forward. And I'm wondering, as someone inside of this, how you think about the ability of those models to do that, and whether they're really intended to do that type of thing successfully, with strategies that might work going forward.
David
Yeah, I really remain very skeptical that that is what they're going to be able to do effectively. I see it from my own usage: ChatGPT has got way better, and there's clearly more reasoning within it now than just sort of full generation. But it hasn't been trained to do that very specific task. It hasn't been trained to build a portfolio. It hasn't been trained to do, if we use the Buffett example, that in-depth analysis on individual companies. But whether it has that or not, the challenge that I still mostly have with the LLMs and the generative side of things being effective in finance is just the lack of interpretability. You could question it on why it's coming to that decision and it would give you an answer, but really being able to interpret how it came to that, I don't see that in the LLMs. Now, with the approach that we're using, we've kept things at a reasonably high level, but we've talked about the decision-tree-based approach. We end up having thousands and thousands of decision trees as our end model. They're trained with gradient boosting, a technique where the trees are trained iteratively, one after the other, each learning from the mistakes of the previous trees. But the end model is still thousands and thousands of decision trees, and those trees are interpretable. So for every position that comes out of our model on a given day, on a given stock, we can understand which features, in which combination, are driving the view of that model. If instead we trained the model using a neural network, deep learning, which is the way LLMs are trained, as we did when we were starting to investigate this space and look at different ways of doing this, you don't have that interpretability.
We actually found that with neural networks we were able to produce forecasts of a similar accuracy to the boosted tree models that we use, but then we wouldn't have any way of interpreting them. So trying to find that combination of accuracy, stability and interpretability, I think that is going to keep a lot of people away from using the LLM approach.
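To make "iteratively trained, one after the other, learning from the mistakes of the previous tree" concrete, here is a minimal gradient boosting sketch in plain Python. It is not the firm's implementation: it assumes squared-error loss, uses depth-one trees (stumps) rather than the deeper trees that would be needed to capture multi-feature interactions, and the toy data, learning rate and function names are all hypothetical.

```python
def fit_stump(X, residuals):
    """Best single split (feature index, threshold) by squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        raise ValueError("no valid split: every feature is constant")
    _, j, t, lm, rm = best
    return {"feature": j, "threshold": t, "left": lm, "right": rm}

def stump_value(stump, row):
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]

def fit_boosted(X, y, n_trees=50, learning_rate=0.1):
    """Each new stump is fitted to the residuals left by the stumps before it."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    trees = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        trees.append(stump)
        preds = [p + learning_rate * stump_value(stump, row)
                 for p, row in zip(preds, X)]
    return {"base": base, "rate": learning_rate, "trees": trees}

def predict(model, row):
    total = sum(stump_value(s, row) for s in model["trees"])
    return model["base"] + model["rate"] * total

def split_counts(model, n_features):
    """A crude interpretability readout: how often each feature is split on."""
    counts = [0] * n_features
    for s in model["trees"]:
        counts[s["feature"]] += 1
    return counts

# Toy data: feature 0 drives the target, feature 1 is noise.
X = [[0.0, 5.0], [1.0, 2.0], [2.0, 4.0], [3.0, 1.0]]
y = [-1.0, -1.0, 1.0, 1.0]
model = fit_boosted(X, y)
print(split_counts(model, n_features=2))
```

Because the fitted model is just a list of explicit splits, the features and thresholds behind any single forecast can be read off directly, which is the interpretability property being contrasted with neural networks here.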
Host
Just one point of clarification. When you say accuracy, you mean, like, if you buy 10 positions, you might get 6 out of 10 correct? Is that what you mean, exactly?
David
So for us it's a relative forecast. The way that we would judge that, and this is simplifying a little bit, is the correlation between the distribution of the forecasts and the distribution of the actual final returns over the horizon within that group. That for us is how we're assessing accuracy.
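One common way to measure this kind of cross-sectional accuracy is a rank correlation between forecasts and realized returns across the universe on a given date. The sketch below assumes Spearman rank correlation as the measure; the conversation only says "correlation", so that choice, the function names and the five-stock example are assumptions for illustration.

```python
def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def rank_correlation(forecasts, realized):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    return pearson(ranks(forecasts), ranks(realized))

# Hypothetical one-month forecasts and realized returns for five stocks.
forecasts = [0.03, -0.01, 0.02, -0.02, 0.00]
realized = [0.05, -0.03, -0.01, -0.04, 0.01]
ic = rank_correlation(forecasts, realized)
print(round(ic, 3))
```

A measure like this rewards getting the ordering of stocks right rather than the size of any individual forecast, which matches the description of the forecast as relative.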
Co-host
And I think one of the challenges is that when we build models, we want the model to know only what it would have known at the time. And that can be tough with these LLMs because they know everything as of now. You can't go back and say, forget everything you've learned since 1990 and build me a model in 1990. Because obviously if I know everything I know now, I could build an incredibly successful model, but it's probably not going to work going forward. So I think that's a challenge with these types of things.
David
And again, I think that's why I just don't see a lot of quants using it. Certainly on the quant side, given the history that we have of understanding look-ahead bias, restated data, all these types of things, and having a solid understanding of the way to correctly do backtests, I think it creates a barrier that I don't really see being overcome to actually use it to build portfolios or make forecasts. If we contrast that with what we're able to do when we run simulations for this strategy: the model that we use on a given day, even if we're simulating right back to the 1990s, has its 15-year look-back at that point. It doesn't know anything about the data that we've had since. It doesn't know about any relationships that have developed since. So you really get a robustness to that simulation that, as you described, you wouldn't with an LLM-type model.
Host
So let's just talk about how this all comes into an actual portfolio. So first off, are you buying a broad swath of, you know, stocks or is this more used as a more focused, concentrated type of, you know, portfolio that you're constructing off of this system?
David
No, it's very much about building a diversified portfolio. So with this forecasting model, we train it over 15 years, as I've described a couple of times. We then use that to make our daily one-month-forward forecasts for the next three months, and then we retrain on 15 years. So we've defined the structure of it there. That model's power, as we've also touched on, is its cross-sectional forecasting power. Its forecasting accuracy on an individual name, saying that this name is going to go up 5% or down 5% over the next month, is not fantastic, but it's much better at saying where in the distribution that name will sit. So what does that mean for the way that we then want to build a portfolio? We want to take that cross-sectional power and best reflect it in a portfolio. We do that in two ways. In long only, we do that in enhanced index: lower tracking error strategies that hold a lot of the index names and just use this forecasting model to tilt a proportion of the stocks above the benchmark weight and tilt a proportion of the stocks below benchmark weight, or not hold them. And then if we want to run a higher-octane type of strategy, we do that with leverage in long-short. We remove the benchmark, we still maintain our diversification, and we use leverage to lever up the portfolio so we can combine higher risk and diversification. So the power here is very much in a diversified forecast and then a diversified portfolio.
Host
And I think in that enhanced index, correct me if I'm wrong here, it's factor neutral, sector neutral and geographically neutral. So you really are isolating, over time as a track record gets established, the alpha, which hopefully will be a function completely of the stock selection process.
David
Exactly. And that comes out in two dimensions. When we train the model, I've talked about our features as inputs, the hundreds of features, and our output is then that 20-trading-day forecast. What we train the model on is the specific return. So we statistically clean those historical returns of the market beta, the sector or industry beta, and the country and style betas as well. So it's learning to forecast just that piece of the stock's return, the idiosyncratic part, the part that is unique to the company. And then when we build the portfolio we have a lot of guardrails on all those dimensions as well. So exactly as you've described, in enhanced index we don't want to deviate much from the benchmark on any of those common dimensions. And if our clients have the ability to analyze the performance that we deliver to them, they will find that the vast majority of it, well above 95%, comes from stock-specific alpha and not from any of those common elements.
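As a rough illustration of what "cleaning" returns of common components can mean, the sketch below strips a market component using each stock's beta and then demeans what is left within each sector. This is a deliberately simplified two-step stand-in, not the statistical procedure used by the firm: a full risk model would estimate market, sector, country and style effects jointly, and every name and number below is hypothetical.

```python
from collections import defaultdict

def specific_returns(returns, betas, sectors, market_return):
    """Approximate stock-specific returns for one period.

    returns, betas and sectors are dicts keyed by ticker.
    """
    # Step 1: remove the market component implied by each stock's beta.
    ex_market = {s: returns[s] - betas[s] * market_return for s in returns}

    # Step 2: remove the remaining sector move by demeaning within sector.
    by_sector = defaultdict(list)
    for s, r in ex_market.items():
        by_sector[sectors[s]].append(r)
    sector_mean = {k: sum(v) / len(v) for k, v in by_sector.items()}

    return {s: ex_market[s] - sector_mean[sectors[s]] for s in ex_market}

# Hypothetical one-period data for four stocks in two sectors.
returns = {"A": 0.04, "B": 0.02, "C": -0.01, "D": 0.01}
betas = {"A": 1.2, "B": 0.8, "C": 1.0, "D": 1.0}
sectors = {"A": "tech", "B": "tech", "C": "banks", "D": "banks"}
spec = specific_returns(returns, betas, sectors, market_return=0.01)
for ticker, value in spec.items():
    print(ticker, round(value, 4))
```

Training on a target like this means the model is only asked to explain the part of each return that market and sector moves do not, which is why the resulting performance would be expected to show up mainly as stock-specific alpha.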
Host
So are you guys running this on the S&P 500, or what index are you currently running it on?
David
So the evolution to this point: the first strategy, which we did with a vehicle domiciled here in Europe, was against an MSCI World benchmark. Then PQNT, which we've launched as an active ETF in the US, has an EAFE benchmark, so World ex US and Canada. We will likely bring other strategies with the same approach to the US market. Our research has shown, and we're already running money for individual clients with US and with EM, and we intend to do this in Europe, that this is a very transferable approach as long as you have a kind of minimum universe size. So as much as I would love it, there's probably never going to be an AI-driven UK strategy. It's not broad enough. We're not going to run a Swiss version of this strategy. But as long as we've got probably four or five hundred names, then it looks to be pretty effective.
Host
And do you find the persistence, and I'm sorry if you just answered this, to hold across different geographic areas and markets? I guess the model would sort of vet that out if there was any difference. I'm thinking like international versus US or emerging versus developed and how that plays out. I guess the model would kind of figure that out.
David
I mean, yes, but we also wanted to make sure that was the case in the way that we trained and tested it. Again, the power of this, as I hope we're getting across, is finding the relationships between the different features, the way that they work together most effectively to forecast the return. Now, we had some prior thought that a lot of these relationships would maybe work in the US, maybe work in Europe, but maybe be less effective in emerging markets, for example, whether because of the dynamics of those markets, the dominance of different investor types, or just the maturity of the market, whatever it might be. But what we found is these relationships were very, very transferable. The way that we checked that is we ran simulations where we took the global model that we'd built and were live with, and we saw how that structure of model would work in something like EM, or just in the US. Then we trained versions of the model where the training could only see the subset of the universe that we would then run it on. Then we trained variations of the model where we slightly changed the feature set according to our view of the dynamics of the market. And what we found is that the most effective thing to use in these different regions was the globally trained model with the global feature set. So again, it said these relationships were very stable, and you actually can learn things in one market that carry over: even though the model wasn't trained specifically on that subset of the market, it works very effectively there.
Co-host
How do you think about rebalancing something like this? I would assume it'd be something that would happen slowly over time, and it would be a function of the change in the projected return from the model, balanced against the trading costs. But how do you do that, and how much of it is done by a person and how much is completely inside of the model?
David
Okay, the first part you answered almost perfectly, so maybe I'll just repeat what you said. The model does update daily. Given the amount of money that we're running in the strategy at the moment, and the fact that the model forecasts are quite similar from day to day, we do at the moment rebalance weekly. That is regular enough for us: we're capturing some evolving views in the model and we're not going to spend too much in transaction costs to do that, which is obviously a key consideration as it is quite a fast strategy. As the strategy grows over time, we've been analyzing whether we will need to rebalance more frequently, and I think that probably is the case. Clearly the model is not changing more often, but we would need to chop our trades up a little more. There is certainly an optimal level of rebalancing that you need to do. It is a human that has made that final decision, but we analyze and test a huge amount of data to get to that decision.
Host
Is there, and I don't know if this is the right terminology, one conditioning effect or signal, if those two things are the same, I'm not sure. I'm just trying to get at something that really surprised you when you looked at it and you said, wow, that is something I never thought would matter. Obviously you're putting the data in there because, like you said, it's data that should be important to the future performance of the stock prices. But I'm just wondering if there's something that came out of the model where you and your team were just like, I never would have thought that would have that much impact on returns.
David
Yeah, okay, so I kind of need to answer this in two ways. The first is that, as I described earlier, quite a lot will surprise us, a vast number of these conditioning elements. Let me just describe it, given how you hit on it at the start. We have these 400 features or signals, and what we can think of as the conditioning elements are which of those work together, almost like IF functions. So you might have a traditional signal like the ratio of sell-side analyst upgrades to downgrades over the last month, and in a traditional model you can just trade on that as an individual isolated signal. But can the model learn from data what other features will improve the performance of that signal, if this is happening and that is happening and that is happening? To go back to what I said at the start, we can probably put a story on maybe 10 or 20 of these little groups of features that work together very effectively, and then a large number of them do not have an obvious reason why they work. So the short, easy out for me is to say lots of it surprises us. Actually, it's easier to give people an idea of how this works with something that is less surprising: calendar information works very well with analyst-type information. The further away you are from a company reporting its official numbers, the better off you are following the sell-side forecast, and the closer you get to the official numbers, the less relevant it is. Now, in my mind that has a rationality to it. The market is probably going to respond to live numbers, so I'll care less about the forecast. But then there are thousands, tens of thousands of relationships in how these different pieces fit together that just don't have that obvious story to them, and that is where we're getting a lot of the alpha from.
So probably the thing that's closest to answering both those pieces is that the calendar element does work with a lot of different aspects, probably more than we would expect.
Host
Yeah, and that calendar effect is, like you said, the further out you are, analysts' recommendations and buys and sells might be more important. And as you hone in, and this is just the one example, just repeating it back, it gets less important. That sort of idea.
David
Right, exactly. But generally what ends up being used here is not pairs of features. It's like six features together.
Host
Right.
David
So it will be like: if the ratio of upgrades to downgrades is above this, and you're within 10 days of the reporting period, and the short interest is below this level, and the CEO has been in the seat for at least six months. It's a lot of those combinations of things working together. That is the power of the model.
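The rule described here reads like a single root-to-leaf path in a decision tree, and can be written out as a chain of conditions. In the sketch below only the 10-day and six-month figures come from the conversation; the other thresholds, the feature names and the example stock are hypothetical.

```python
# One hypothetical decision path, expressed as (description, test) pairs.
# Only "within 10 days" and "at least six months" are from the transcript;
# the ratio and short-interest thresholds are placeholders.
CONDITIONS = [
    ("upgrade/downgrade ratio above 1.5",
     lambda s: s["upgrade_downgrade_ratio"] > 1.5),
    ("within 10 days of reporting",
     lambda s: s["days_to_report"] <= 10),
    ("short interest below 5%",
     lambda s: s["short_interest"] < 0.05),
    ("CEO in seat at least 6 months",
     lambda s: s["ceo_tenure_months"] >= 6),
]

def explain_path(stock):
    """Return (all conditions hold, list of (description, passed))."""
    checks = [(desc, test(stock)) for desc, test in CONDITIONS]
    return all(ok for _, ok in checks), checks

stock = {
    "upgrade_downgrade_ratio": 2.0,
    "days_to_report": 7,
    "short_interest": 0.03,
    "ceo_tenure_months": 18,
}
fires, checks = explain_path(stock)
for desc, ok in checks:
    print("yes" if ok else "no ", desc)
print("path taken:", fires)
```

In a boosted tree model a forecast is the sum of many such paths, so listing which conditions a stock satisfied is one way to see which features, in which combination, are driving a given view.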
Host
Yeah. Interesting. Be interesting to like, look at that in detail and kind of, you know. Yeah, that's, that's cool.
Co-host
I'm just wondering, is there anything that surprised you in terms of not being used? For instance, if we think about traditional factors, are valuation factors not used that often in here, or is there anything like that where traditional factors are not used as much as you thought they might be?
David
I mean, in that feature set, as I mentioned, there are about a hundred features that are more fundamental in nature, but we tend to force them not to be traditional in the way that we construct them. So we're not necessarily using the dividend yield of a company, but how the dividend changes over a certain horizon. We're not necessarily looking at the return on assets, but at how volatile the return on assets of a particular company has been. So we do force the construction a little bit, so that it moves away from that traditional factor bias. But maybe I'll turn it around, because what is a little surprising for us is actually the other way around. Even when we train the model with the specific return as the target, if we don't then put some guardrails on those style factors in the portfolio construction, some of it still sneaks back in. We have found, and I don't think this is a bias arising because our factor models tend to be a bit more quality in nature, that if we don't force that style neutrality at the portfolio construction level, even when we've trained on the specific return, we get a bit of quality. So if we back that out, what does that say? It says that certain quality elements do have forecasting power, or at least a conditioning element to them, in forecasting that specific return. So that has probably been a bit of a surprise for us.
Host
We know it's getting late in the day, Geneva time, for you, David. So this has been a great conversation. Thank you so much. We like to ask all of our guests two standard closing questions. And the first is: what is the one thing you believe about investing that you think most of your peers would disagree with you on?
David
I'm going to answer this a little bit more as: how would I offer something different to our competitors? A big one for me is that I'm not a big fan of talking about what your edge is, because I think it can make you quite arrogant in thinking that you have a moat in a specific area. I think the power of what we've built is that we've done very in-depth work at every stage of what we do: the feature generation, the training, the portfolio construction. And I think we've got very, very good at that. So I would differ with some competitors who want to tell you they've got a specific edge in one area. I think you need to refine and be as effective as you can in all aspects of what you're doing.
Host
And the last question is, based on your experience in the markets, what's the one lesson you would teach your average investor?
David
So most of my friends don't work in finance, and the one I give when asked things like this is: don't over trade. To bring this back to my world, and Jack asked about this quite a lot: trading costs, and understanding the speed and horizon of your forecasts, that element is so incredibly important, and it's so easy to just give up your alpha. So that's what I always tell my friends: don't trade too much.
Host
Great discussion, David. Thank you very much for joining us. We appreciate it.
David
Thank you very much for having me.
Host
Thank you for tuning in to this episode. If you found this discussion interesting and valuable, please subscribe on your favorite audio platform or on YouTube. You can also follow all the podcasts in the Excess Returns network at excessreturnspod.com. If you have any feedback or questions, you can contact us@excessreturnspodmail.com. No information on this podcast should be construed as investment advice. Securities discussed in the podcast may be holdings of the firms of the hosts or their clients.
Podcast: Excess Returns
Date: December 17, 2025
Guest: David Wright, Head of Quantitative Investing at Pictet Asset Management
Hosts: Jack Forehand, Justin Carbonneau, Matt Zeigler
This episode explores the integration of artificial intelligence (AI) and machine learning (ML) into quantitative investing, guided by the practical experience of David Wright. As head of quant investing at Pictet Asset Management, Wright discusses how ML models, particularly decision tree-based methods, are transforming stock selection, portfolio construction, and the overall investment process. The conversation covers the strengths and limitations of machine-driven strategies, the complementary roles of human judgment, and how technological advances might shape the future of systematic investing.
"AI, artificial intelligence, I simply think it means a machine doing some kind of function or role that a human would generally do..." —David
"They’re very stable approaches...LLMs are not particularly stable...and they’re very hard to interpret." —David
"Using more traditional data, you’re generally going to have a feature score for every company in your universe...If we start incorporating more alternative data, we don’t have the history, we don’t have the breadth..." —David
"Once you get down to the shorter horizon, again, sort of counterintuitively, lots of different things are driving returns. It gets quite noisy at that point...machine learning does give you that ability..." —David
"Versions of this model that we could train with 90s data have a lot of the same relationships between the features in it to now." —David
"They do want to know that there is very strong human oversight, that there is guardrails and structure and supervision around how these models are trained..." —David
"I really remain very skeptical that [LLMs] are going to be able to do [investment strategy construction] effectively..." —David
"It’s a lot of those combinations of things working together. That is the power of the model." —David
"Don’t over trade...Trading costs and understanding the speed and horizon of your forecasts...is so incredibly important and so easy to just give up your alpha." —David
David Wright’s perspective underscores that while machine learning offers a powerful edge in parsing noisy data and uncovering multi-factor relationships—particularly over short horizons—human oversight remains vital in feature selection and risk management. The future is likely to see increasing automation and broader model application, but with strong human guardrails, an emphasis on interpretability, and a continued focus on trading discipline.