What Does AI Buy Us or Cost Us? Views From the Financial Industry - Harvard Data Science Review Podcast


Harvard Data Science Review Podcast Summary

Title: What Does AI Buy Us or Cost Us? Views From the Financial Industry
Host/Author: Harvard Data Science Review
Release Date: February 23, 2024

Introduction

In the February 23, 2024 episode of the Harvard Data Science Review Podcast, hosts Liberty Vittert and Shelley Mack explore the evolving landscape of artificial intelligence (AI) within the financial industry. Featuring insights from two esteemed guests—Christina Chi, CEO of Data Bento, and Victor Lo, Senior Vice President of Data Science and Artificial Intelligence at Fidelity Investments—the episode delves into how AI is reshaping investment strategies, the challenges it presents, and the ethical considerations that accompany its integration.

Guest Introductions

[00:03] Liberty Vittert opens the episode by highlighting the long history of data use in finance, from hedge funds employing regression analysis to fine-tune portfolios to individuals using summary statistics in pursuit of the next groundbreaking startup. She introduces Christina Chi and Victor Lo as the guests who will discuss how the recent surge in AI is changing that landscape.

[00:58] Liberty Vittert commends Christina's journey from an intern who was rejected from several job offers to the founder of multiple companies and a member of the Forbes 30 Under 30 list, then asks which projects she is currently immersed in.

Current Projects and Trends in AI and Finance

[01:29] Christina Chi provides an overview of her role as co-founder and CEO of Data Bento, a market data provider serving finance, fintech, and other applications. She describes the company as the "data backbone" for these users and notes that AI tools answering finance-related questions may well be trained on data that Data Bento provides.

Christina reflects on her decade-long experience running a hedge fund, observing the cyclical nature of trends like AI, cryptocurrency, NFTs, meme stocks, ESG (Environmental, Social, and Governance) criteria, and alternative data. She points out that while some trends endure, others fade, underscoring the dynamic nature of the financial industry.

Data Science Skills and the Rise of Causal AI

[02:43] Shelley Mack turns to Victor Lo, appreciating his article on the "10 Challenges of Data Science from an Industry Perspective." She seeks his insights on the rapid advancements in AI, particularly regarding the skill sets required for data scientists navigating these changes.

[03:25] Victor Lo responds by differentiating between generative AI (e.g., large language models like ChatGPT) and the broader field of data science, which encompasses both soft and hard skills. He outlines the data science pyramid:

Descriptive Analytics: Understanding past events.
Predictive Analytics: Forecasting future events.
Prescriptive Analytics: Guiding future actions based on predictions.

Victor introduces Causal AI as an emerging field focused on understanding and optimizing the relationship between actions and outcomes, highlighting its potential to enhance decision-making in areas like marketing through experimental design and causal inference.
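
To make the testing step concrete, here is a minimal Python sketch of the kind of A/B comparison Victor alludes to when he talks about testing marketing content. It is not taken from the episode: the function name `two_proportion_z_test`, the choice of a pooled two-proportion z-test, and the conversion counts are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between variants A and B.

    Assumes customers were randomly assigned to the two variants, which is
    what lets the difference be read as a causal effect of the variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_b - p_a, z, p_value


# Hypothetical counts: 10,000 customers per marketing variant.
lift, z, p_value = two_proportion_z_test(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p_value:.3f}")
```

With many generated content variants, the same comparison would be repeated per variant or per customer group, which is where more careful experimental design (and correction for multiple comparisons) becomes necessary.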

AI in Investment: Current Usage and Future Directions

[08:43] Liberty Vittert prompts Christina and Victor to discuss how AI is being used in investing today, both by individual investors and by large institutions, and where they see it heading in finance.

[09:34] Christina Chi shares her observations from a recent financial conference, noting that while many believe AI will revolutionize finance, full-scale adoption has lagged. She explains that unlike closed systems like chess or Go, financial markets are highly dynamic and influenced by ever-changing rules and irrational behaviors. This complexity makes it challenging for AI to consistently outperform human strategies. Christina recounts her experience with high-frequency trading algorithms that required constant tweaking, as market conditions evolved and competitors adapted, demonstrating the limitations of AI in maintaining long-term superiority.

Challenges and Lessons in AI Adoption in Finance

[13:27] Shelley Mack references a quote by Jeff Bezos: when the data and the anecdotes disagree, he usually believes the anecdotes, because the data may have been analyzed or interpreted wrongly. Christina elaborates on the pitfalls of relying solely on historical data with the story of a hedge fund that flopped even though it was started by top researchers who had built successful algorithms elsewhere. She highlights overfitting and the gap between simulated success (backtesting) and live trading performance, reinforcing the idea that reality often diverges from theoretical models.
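
The investor's remark that he had never seen a bad backtest can be illustrated with a small simulation. The following Python sketch is an assumption-laden toy, not anything described by the guests: the function name `simulate_backtest_selection`, the number of strategies, the 250-day window, and the 1% daily volatility are all hypothetical. Every simulated strategy is pure noise with zero true edge, yet picking the best one by its in-sample result still yields an attractive-looking backtest.

```python
import random
from statistics import mean


def simulate_backtest_selection(n_strategies=200, n_days=250, daily_vol=0.01, seed=0):
    """Pick the best of many zero-edge strategies by in-sample mean daily return.

    Returns (best in-sample mean, that strategy's out-of-sample mean,
    average in-sample mean across all strategies).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_strategies):
        # Each strategy's daily returns are random noise with zero expected value.
        in_sample = [rng.gauss(0.0, daily_vol) for _ in range(n_days)]
        out_sample = [rng.gauss(0.0, daily_vol) for _ in range(n_days)]
        results.append((mean(in_sample), mean(out_sample)))
    # Selection step: keep only the strategy whose backtest looked best.
    best_in, best_out = max(results, key=lambda r: r[0])
    avg_in = mean(r[0] for r in results)
    return best_in, best_out, avg_in


best_in, best_out, avg_in = simulate_backtest_selection()
print(f"best in-sample mean daily return: {best_in:.5f}")
print(f"same strategy out-of-sample:      {best_out:.5f}")
print(f"average in-sample across all:     {avg_in:.5f}")
```

The selected strategy's in-sample figure is inflated purely by the act of choosing the maximum among many tries, which is one simple way overfitting enters a pitch-deck backtest; its out-of-sample figure is just another draw of noise.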

[16:39] Shelley Mack picks up Christina's point, noting that academic simulation studies are "doomed to be successful" because papers rarely report a simulation in which the authors' own method fails. Invoking the lesson that "there's no free lunch," the host then asks Victor which risks AI carries, given that AI systems, like humans, will make mistakes, and potentially very large ones.

AI Risks and Ethical Considerations

[17:50] Victor Lo transitions the conversation to the broader risks associated with AI, such as disinformation through deep fakes and algorithmic discrimination leading to fairness and bias issues. He cites how large language models trained on imperfect historical data can perpetuate biases, such as gender or racial biases, inadvertently affecting decision-making processes.

[20:01] Victor Lo discusses industry best practices and government frameworks aimed at mitigating these risks. He references the White House Blueprint for an AI Bill of Rights (October 2022), whose principles include safe and effective systems, protection against algorithmic discrimination, and explanation of automated outcomes. Victor highlights the difficulty of explaining AI decisions, especially with "black-box" models, and notes that fairness testing and bias mitigation are still evolving, with generative AI requiring further research on explainability, bias, and hallucinations.
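
Victor does not name a specific fairness test, so the following Python sketch is only one possible illustration. It assumes a simple demographic parity check, in which the share of positive decisions is compared across groups; the function names `selection_rates` and `demographic_parity_gap`, the group labels, and the decision data are hypothetical.

```python
def selection_rates(decisions, groups):
    """Share of positive decisions (1 = approved, 0 = declined) within each group."""
    totals, positives = {}, {}
    for decision, group in zip(decisions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + decision
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(decisions, groups):
    """Largest difference in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical model outputs for eight applicants in two groups.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

print(selection_rates(decisions, groups))
print(demographic_parity_gap(decisions, groups))
```

A gap near zero on this metric does not by itself show that a model is fair; as Victor notes, judging fairness is not easy, and other criteria can disagree with this one.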

Cultural Impact on AI Ethics and Regulations

[27:08] Victor Lo addresses the influence of cultural differences on AI ethics and regulations. He observes that while countries may prioritize aspects like data privacy differently, many global regulations converge on key principles such as data privacy, intellectual property, and algorithmic discrimination. The primary variation lies in balancing AI safety with innovation, with each country determining its approach based on cultural and economic priorities.

[28:02] Christina Chi concurs, noting that in finance, regulatory standards like those from the Securities and Exchange Commission (SEC) and the CFA Institute's Ethics Handbook provide a relatively standardized ethical framework across different jurisdictions. However, she acknowledges the gray areas and varying interpretations that can lead to ethical dilemmas and legal challenges, especially as AI becomes more integrated into financial practices.

Personal Lessons and Advice from Guests

[31:56] Christina Chi shares a personal lesson emphasizing the importance of sometimes disregarding statistical probabilities in favor of gut instincts. Reflecting on her early career, she explains how dismissing discouraging data allowed her to pursue ventures that, despite eventual failures, provided valuable experiences and unique opportunities. Christina advises ambitious individuals to follow their passions even when data suggests otherwise, highlighting the intrinsic value of personal drive and intuition.

[35:28] Victor Lo complements Christina's advice by stressing the necessity of continuous learning. He advocates for a multidisciplinary approach, integrating technical skills with knowledge from fields like philosophy, law, and social sciences. Victor encourages data scientists to remain perpetual students, adapting to the evolving landscape of AI and its applications.

Magic Wand Question and Concluding Thoughts

In the finale, host Shelley Mack poses a "magic wand" question to Christina and Victor.

[36:56] Christina Chi wishes for the investor market to become more accessible and affordable to everyone, not just elite or well-funded entities. She underscores the importance of democratizing data and opportunities in finance, coupled with robust education on the associated risks.

[37:58] Victor Lo echoes the significance of education, advocating for more efficient and comprehensive learning methods to equip individuals with the necessary skills to navigate the complexities of AI and finance.

Both guests agree that addressing the challenges and ethical considerations of AI requires a balanced approach of innovation, regulation, and education to harness the benefits while mitigating the risks.

[38:16] Victor Lo concludes by reinforcing the need for proper education and informed decision-making as AI continues to influence the financial sector.

[38:50] The hosts thank Christina and Victor for their contributions and wrap up the episode with a call to stay informed and cautious in the face of AI advancements.

Notable Quotes

Christina Chi [09:34]:
"It's almost like sometimes in life, it's like you have to just listen to your gut instinct on, like, what wakes you up in the morning. And that might not always correspond with what the data wants you to do, but you just have to do it, and that's okay."
Victor Lo [27:08]:
"AI ethics involves both technical approaches like fairness testing and philosophical approaches like utilitarianism and deontology, requiring expertise from legal, risk compliance, and ethical domains."
Christina Chi [31:56]:
"Sometimes you have to just ignore the data and follow your gut instinct, because the human is still the most complicated machine."

Conclusion

This episode of the Harvard Data Science Review Podcast offers a comprehensive exploration of AI's benefits and challenges within the financial industry. Through candid discussions, Christina Chi and Victor Lo illuminate the nuanced interplay between technological advancements, ethical considerations, and the human element in finance. The conversation underscores the imperative for continuous learning, ethical vigilance, and embracing both data-driven insights and human intuition to navigate the future of AI in finance.


Transcript

[00:03]
Liberty Vittert
Welcome to the Harvard Data Science Review Podcast. I'm Liberty Vittert, the feature editor of the Harvard Data Science Review, and joining me is my co-host and editor-in-chief, Shelley Mack. The finance industry has a rich history of leveraging data for predictive purposes. From hedge funds employing regression analysis to fine-tune portfolios, to individuals utilizing summary statistics in pursuit of the next groundbreaking startup, data has been a cornerstone of investment strategies for years. However, with the recent surge in artificial intelligence, how is this landscape evolving? Join us as we delve into this intriguing question and more with our esteemed guests, Christina Chi, the CEO of Data Bento, and Victor Lo, Senior Vice President of Data Science and Artificial Intelligence at Fidelity Investments. Stay tuned for a fascinating conversation right here on the Harvard Data Science Review Podcast.
[00:58]
Liberty Vittert
Christina, let me start with you. You know, your story is such a fabulous one. As a person that went from being an intern that was rejected from several job offers, which is a wonderful story, to the founder of multiple companies and a member of the Forbes 30 under 30 list, your work really has seemed to progress sort of at lightning speed, at least from the outside. So could you bring us up to date? You know, what are the projects you're really currently immersed in?
[01:29]
Christina Chi
Yeah, thanks, Liberty, for having me. So currently I am the co-founder and CEO of Data Bento. We are a market data provider. So in other words, think of us as the data backbone of not just finance, but also fintech and other applications. Oftentimes we have a lot of AI users and customers. So if you're searching for various finance-related questions in some sort of AI tool, chances are they might actually be training their models off of the data that we're providing here. But I guess what's crazy about my background, what people always want to ask about, is my first company, the hedge fund that I ran for 10 years. So basically I uniquely, I guess, saw the birth and death of my own fund over the course of a decade. But the funny thing is that, over the course of this past decade, seeing all of the trends that come and go in the financial industry has been very fascinating, and I'm sure we'll talk about that a little bit. But you know, topics like AI, crypto, NFTs, meme stocks, what else, ESG, alternative data, you know, these trends, some of them stay and some just kind of die off. And it's been very fascinating to see that journey and to continue to kind of be a part of that, but from the data lens today.
[02:43]
Shelley Mack
Well, Victor, I have a similar question for you. But let me thank you first for writing that beautiful article about the 10 challenges of data science from an industry perspective. The question, similar to you, is: in this rapidly changing world, what are you working on? Taking your perspective particularly, I know that you are very keen about having data scientists have the right skill set. And I assume that with this rapid change, with all the rise of cryptocurrency and augmented reality, all this stuff, what's on your mind? What are you working on now? What's your advice for all the people who want to get into this space? What would you tell them?
[03:25]
Victor Lo
Sure. Thank you so much for having me. First, I just wanted to add that my opinions here do not represent my employer. To answer your question, I would first think about what is the biggest AI development these days. I think most people would say it's generative AI or large language models such as ChatGPT. It's a very useful tool that can help you with a lot of tasks. But the field of data science actually covers a very wide spectrum, including both the soft skill side and also the hard skill side. Under the soft skills, any experienced data scientist usually would be involved in some kind of analytic consulting, which involves many stages, from generating ideas, initiating a project, proposing a solution, identifying resources, and also executing a project, like fitting a model and so on, and then deploying the model. So all these stages require a lot of skills and experience. It's not just the hard skills, but it involves a lot of soft skills such as communication skills and presentation skills, teamwork. On the hard skill side, there are many, many things that are actually included in data science. In addition to the most obvious things, you need to know how to use a sophisticated machine and write some algorithms. But people usually think of a pyramid. And if you think of a pyramid, the lowest level or the most foundational level is the descriptive analytics. It's about what happened in the past. And then the middle layer is predictive analytics, which is about what will happen. And the highest layer is the prescriptive analytics level, which is about, okay, what you should be doing. So: what happened, what will happen, and what you should be doing. In a very simple example like weather forecasting, I'm sitting in the Boston area. It snows from time to time during winter. Knowing whether it snowed yesterday, it doesn't help me. Knowing whether it will snow or not tomorrow will help me.
So that's a form of predictive analytics. If you know it's going to snow with a high probability tomorrow, then you have a decision. Let's say tomorrow is a work day. Then you can make a call whether you should go to the office or you should work from home. You go to the office, there may be two outcomes that you want to think about in terms of productivity and safety. And likewise, you stay at home. You may be safer, but maybe less productive sometimes. So knowing the link between your action and the outcome is a very powerful thing. It's a form of causal inference or causal AI. It allows you to optimize your future action. And that, of course, is a form of causal AI. And causal AI is now a rising field. If you look at Gartner's latest hype curve, generative AI is at the peak right now. Causal AI is on the rise. And that makes a lot of sense because a lot of things that we do require causality, even for generative AI. If we are using generative AI to generate a lot of marketing content, there will be much more marketing content, many more variations, and you need to test to see which one would work the best: the best for your target population or best for each and every customer group. And you still need to do a lot of testing, which would require either experimental design, also known as A/B testing, or some kind of causal inference to help you optimize your future marketing efforts.
[07:00]
Shelley Mack
Victor, let me follow up very quickly. You mentioned this causal AI. It sounds very intriguing to me, but I want to push a little bit here. I mean, we know that generative AI is a real thing, right? It is able to do things like we humans do, though we cannot do those things at that speed. But do you really think the AIs can really do the causal inference? Because we humans are struggling with doing causal inference. And we know, you know too well, the whole worry about mixing up association versus causation and those problems. Do you actually see there's a possibility here that there will be some breakthroughs, like the ones generative AI brought to us, on the causal side?
[07:42]
Victor Lo
Yeah, it's a very emergent field. Causal AI is emerging enough, and large language models are emerging enough. If you combine the two things together, it is even more emerging. It's evolving research right now. So one idea is, if you can mine something about the content of some text or videos, you may be able to infer causality in some way, and that might help you draw the causal diagram, and it might help you facilitate how to do proper causal AI or causal inference based on the insights from the large language model's output. So that's one area, I think.
[08:22]
Shelley Mack
I now understand what you're saying. You're saying the traditional causal inference is, statistically speaking, using numbers, using variables. But you say here, you probably can draw causality directly from these images, these text data, because the large language model builds enough of a model, just like a traditional statistical model. I get the picture.
[08:43]
Liberty Vittert
Well, I love it whenever I... I get terrified when I say something and Shelley says, well, let me push back on that a little bit. I'm like, oh, God, just, it's all yours, Shelley. Whatever you think is the right thing. 'Cause I'm never the right one. So, Victor, I'm very impressed. I can never answer anything when Shelley says, I'm gonna push back on it. So that was very impressive to me. But I think both of you sort of touched on this a little bit just now. But just to start with you, Christina, and then, Victor, please chime in. How are you seeing these big changes happening with AI in the investment game? How is AI being used right now, both for sort of personal, individual investors and larger groups in general? And where do you really see it heading in the world of finance? What's the future look like for AI?
[09:34]
Christina Chi
It's a funny thing you asked that. I was actually at a financial conference just about a week ago, and there was a panel on AI. I wasn't on the panel, but it was interesting because, you know, a lot of the panelists were saying, look, AI is going to be the future of finance, right? Kind of applying a lot of the concepts that Victor just introduced to us here. Why not take financial data from history, right, from the past, and be able to make predictions on the future of what's going to happen in the stock market, for example, based on that historical data. And what's interesting, though, is that, you know, because I've been in the industry for a while, and I've been to these conferences for a while, this kind of stuff has been shouted over and over again at conferences for decades, by the way. You know, even like 10 plus years ago, this has been a thing that, you know, people have said, hey, AI is going to take over in the next 10 years even. And this is like back in, you know, 2008, they're like, AI is the next big thing, you know, is going to be the future of finance. We're going to have fully autonomous hedge funds that just trade based on AI. But then, you know, fast forward to today, and that really hasn't happened. I think the adoption has been a lot less. Yes, there still is a lot of adoption. There's a lot of funds like our fund, we use a lot of machine learning related technologies, but I wouldn't call it AI. Maybe some regression analysis, things like that, but again, not enough to be fully autonomous AI that I think a lot of the public had imagined. And there's a reason for that. So just in case people are curious on kind of why isn't there that much adoption compared to other industries. Right. Like you look at ChatGPT and it is so incredibly successful when it comes to, you know, generating just language learning models in general. Right.
But then you look at finance and you're like, how come the top funds in the world these days, yes, they use a lot of technologies, but they don't necessarily rely fully on AI or not as much as we would expect. And the reason why is actually because in the financial industry data is often changing. The rules are changing so frequently. Compared to like a game of Go or chess, right, where AI has completely learned to dominate those kinds of games, those are closed ended games, right? You either win or you lose. The rules have been fairly similar for, you know, not only decades, like sometimes centuries, depending on how old the games are for a really long time. And so there's a lot of data to train on versus in finance, you know, you have maybe companies that are brand new on the markets, right? There's not a lot of data on those companies or participants in the markets aren't always rational as well. And there's a lot of issues there too, you know, and then also on top of that, not everyone has the same goal in the financial industry as well. And so just given that the rules are constantly changing and there's, you know, so much going on in this space here, it actually has been very difficult, at least for now, for an AI to completely be able to, you know, know everything that's going on and then be able to predict what's happening in the markets. Even for us, by the way, my fund focused on high frequency trading and when we were working on that, a lot of our algos, yes, they would do a really good job of like short term predictions. But then if you brought it out to the long term, those algos would die after three or four months. And we were constantly having to revise and fix that data over time. And then the other problem was other market participants were also catching on to those algos. They were also learning what we were up to and building smarter algos on top of it. 
And so it was like a constant race of like, you know, who can be smarter and, you know, who can stay on top of things. But it's going to be very rare, if not, I would say, still impossible today, to build an AI that just, you know, forever can beat the stock market year after year versus the human brain. And I think maybe my biggest lesson learned over the years is that the human is the most complicated machine. Despite us working on AI over so many years and seeing its pros and cons, I would say, you know, still the best strategies that we had over the years were just created by the humans, and they were as smart or as dumb as the people behind them. And so I'm not going to say I'm the smart one, but that was the reality of the situation.
[13:27]
Shelley Mack
You know, it's so funny you say that. I just yesterday heard a quote by Jeff Bezos and someone had asked him, you know, if AI is going to take over this whole thing. And he said, you know, which I thought was a funny quote from him. He said, if the data and the anecdotes are saying something different, I usually believe the anecdotes because the data, not because the data is necessarily totally wrong, but it's been analyzed wrong, or it's been interpreted wrong, or it doesn't. But I thought that was fascinating that he, you know, he's like, I believe the human over the. Or the anecdotes over the data.
[13:59]
Christina Chi
Yeah, I was gonna, I just wanted to add a funny story, which is you guys have probably heard of some of the, you know, better performing hedge funds over the years, like Bridgewater, Renaissance Technologies. There's books written about, you know, their founders and stuff like that over the years. And I have heard at one of these funds, I won't mention the name of it, but some of their top researchers left, and these are like, I think one of them had won like some sort of prize, I forgot what it's called, like the Nobel Prize, but like in math, and like in their respective fields, like they're some of the top researchers in their fields. They were the ones who built some of the original algos at their original companies before they left and tried to start their own thing. And that fund that they started was a total flop. I mean, it just did not work out at all. And the reason why was actually because even though they had taken a lot of the strategies that they had learned over the years, those strategies didn't last very long. And so again, it's just a lesson in finance that you can take something there, and even once you implement it, three months later it's already too late. The markets have caught up and what they were building was no longer feasible. And then what you learn is that at a lot of these top performing funds, the researchers are constantly fine-tuning and improving and fixing their existing strategies over and over again. And so it's just really interesting to see kind of the constant work that's involved in this space. And you're right, it's almost like even though something looks theoretically really great on paper, sometimes you put it into practice and it just doesn't work. And that's something we often face, by the way, in finance.
Oh, this is a funny thing is also when we were fundraising from investors before we had any track record, we'd show like a page on our pitch deck with our backtest results, which is basically testing our strategy on data in the past, right. And our investors would, they would just ignore that page. I remember one of our more aggressive investors like ripped up that page and was like, don't show me this ever again. I've never seen a bad back test in my history as an investor, right. I've seen thousands of pitch decks. Every single pitch deck shows me a wonderful back test. But then when you actually trade it, I bet you anything 99 out of 100 of you guys are going to fail and it's not going to work out. And so there's kind of like what Victor mentioned, by the way, between training on historical data versus deploying something in real life. I think there's still a little bit of a, at least in finance especially, there's a lot of discrepancy and a lot of differences between, you know, the past and the future. But also, of course, there's a lot of data science issues that we should mention. There's a lot of overfitting. Right. For your backtests and your back testing models. That happens a lot. And that was a big reason why some of these recent quant funds actually failed over time was because of the overfitting. Long story short is that reality oftentimes isn't as glamorous as the simulation.
[16:39]
Shelley Mack
Well, that's a fantastic point. You remind me, you said these backtests in reality do not work. In the academic world, there's another phrase we use often. Papers present simulation studies, and we will say simulation studies are doomed to be successful. You never see a paper simulate something and say, hey, my method failed. You know, they do simulate it to say other people's methods have failed. Right? So it's a very similar lesson here. Now, speaking of lessons, I would like to ask Victor: for me, one of the few lessons in life is there's no free lunch. There's always a downside somewhere. And we have talked about the potential of AI, but AI, obviously, just like a human, one way or the other, will make a mistake. And the difference is probably that when it's making a mistake, it probably could be a huge mistake. So from your perspective, what are the potential risks associated with these AIs, whether they are used for finance or for others? And what precautions should firms take to mitigate these risks effectively? What should they do?
[17:50]
Victor Lo
Yeah, thank you for the great question. Globally, many governments and organizations have been talking about AI safety, AI risk, AI ethics or similar kind of terms, including responsible AI and trustworthy AI. The most prominent issue right now is actually you probably heard of disinformation or misinformation caused by deep fakes in the news almost daily. Now another key area is algorithmic discrimination. What it is is about fairness and bias. If you think of large language models where Wikipedia data is the key data source for training the model. So how good is Wikipedia data? Well, first, Wikipedia data may not be completely 100% accurate. Second, it is just an accumulation of past data, even if it's 100% accurate. So whatever happened in the past may or may not reflect current or the future. One obvious example is in the past, at least in the US most doctors were men. Now in medical schools, more than half of medical students are female. So when you allow LLMs to train on past Wikipedia data, they would easily classify a doctor as a male automatically, which may or may not be correct and in many cases not correct. So clearly there's a gender bias here. Gender bias, racial bias. Those could be intentionally or in many cases unintentionally caused by models like that. So those are some of the risks.
[19:24]
Shelley Mack
But again, I'm going to push you, as Liberty hates when I use the word "push." No, seriously, I just think about, do you have any thoughts, particularly from your kind of studies, thinking about all the skill sets a data scientist should be aware of? Anything like, in the day-to-day operations, suppose you have a data scientist engaged in using these AI tools, or making a decision based on them, or devising policies: what should they watch for? What are the things they should do to try to reduce this kind of risk?
[20:01]
Victor Lo
Yeah, that's a great question. There are many industry best practices and government documents around already which list many principles and guidelines and frameworks that we can use. One example is the White House Blueprint for an AI Bill of Rights that was released in October 2022, which is a very solid framework which documents some of these principles extremely well. And these include making sure your AI system is safe and effective through some kind of testing, and making sure that your model is not having any algorithmic discrimination issue, which requires some form of fairness testing. It's not an easy topic. Sometimes it's not easy to judge whether a model is fair or not. And it also requires or recommends some kind of explanation of outcomes produced by AI or automated systems, which itself is also not simple because many AI models are just large black boxes. But there are documented ways to handle fairness testing and how to mitigate bias. It's just an evolving area right now, but there are some ways that are getting more and more standard. But it's getting much harder for generative AI, which requires much more research to understand how to explain those models and how to reduce the bias and hallucinations and so on.
[21:28]
Shelley Mack
I think that brings me to both of you, and Christina, I'd love to start with you. As we've just said, and as Victor just said, with any sort of technological advancement, we obviously have these ethical considerations. But what are the ethical considerations that come up when we think about how to use AI in the financial sector? Particularly, what are the issues that you all see in the financial sector and how can they really affect the everyday person?
[21:58]
Christina Chi
Oh, there are so many. I mean, every day, you know, even for my current company, we get pitched by startups that are trying to build AI for finance. And, you know, one of the questions we would always ask them is like, look, if I bought your product and let's say I traded off of it, or one of my employees or interns, someone traded off of it and lost all their money, who is to blame for that? If we're relying on your AI and your AI told us something that was false and we ended up using that or relying on that and automating a strategy based off of it, who's at fault here? Right? So there's a lot of questions about kind of ownership and kind of, you know, what's involved there. And then there's, you know, there's all kinds of other questions too, like, yeah, who gets credit as well. Like when something does well, you know. So it's almost like, when it does poorly, who's at fault? When it does well, who gets credit? And it's similar to other spaces, I'm sure. Right. If you're working in the art space and you see AI-generated art, you can be very upset when you see, you know, the mangled signatures of the original artist. I don't know if you saw the Facebook filters, right? With people posting photos of themselves like a superhero.
[23:07]
Shelley Mack
Yeah.
[23:08]
Christina Chi
And then you see there's, like, someone's signature on the side. It's like mangled up because some AI was training off of someone else's work. And it's like, well, who gets credit? And you know, is it fair in this world that the original artist, you know, whose works inspired that style, right, didn't get the proper credit for that? So for us in finance in particular, there are similar issues. For us, it would be issues on, for example, when AI trains off of data in the financial space, where is that data coming from and who's kind of getting credit for that data? Right. And also, is the data accurate? And it's similar, Xiao, to what you mentioned about there being no such thing as a free lunch, and that's very much true in finance as well. I wouldn't say we're a perfectly efficient market. There's a lot of irrationality going on. But I think the markets are still, given just how many players there are, efficient enough to the point where it's fairly obvious when you're building an AI product off of very bad data, by the way, and there's a lot of cheap data out there. Right. You can get free data in finance. Yahoo Finance, I guess, is a common source people bring up. But then the challenge there is, is that data truly accurate? You know, where are they getting their data from and are there gaps in the data? Is it delayed? You know, yes, it's often delayed. And so what we typically discover is in finance, you get what you pay for. You know, a lot of, especially retail financial traders. Like, if you ask my parents, they'd be like, I thought financial data was free. You know, it should be available everywhere. But in reality, you know, that data is actually very expensive if you get it directly from the source, meaning the New York Stock Exchange or Nasdaq. You know, if you're going there, it's actually fairly expensive.
And so, yeah, it's almost like you get what you pay for in this industry as well as in others. But yeah, there's a lot of ethical considerations. I do anticipate there's going to be quite a few lawsuits coming out. That's one thing for sure I can say. I'm bad at predictions otherwise, but definitely there's going to be more and more lawsuits. Not just, you know, of course, within AI in general, but also within the financial industry, people who are going to be, you know, analysts who are going to blame their AI for their mistakes. And then we're going to have to see, okay, what's going to happen there and who takes credit for work.
[25:15]
Victor Lo
Victor, you want to follow up on that question as well?
[25:18]
Absolutely. So AI risks are actually across all industry sectors. It's not just for finance. Governments across the world are already working on regulations and guidelines. And I think the industry can do a lot of things here too. Almost by definition, AI ethics involves AI and ethics. It's one of the very few areas that actually requires both left brain and right brain thinking and expertise. It definitely includes technical approaches, such as how to assess whether a model is biased and how to handle bias mitigation and so on. That's for algorithmic discrimination. But it also involves philosophical or ethical approaches to managing AI. And because many of the problems could be gray areas, that would require expertise from a wide variety of areas such as legal, risk, compliance, privacy, ethics, and so on. Now, AI is evolving really fast, so AI ethics or safety has to catch up also. So that's why it's also evolving very fast, and there is a ton of research going on. Let me just give one example here. In ethics, two of the key theories are called deontology and utilitarianism. Deontology, also known as universalism, is about doing the right thing with the right intention. And this can be applied to checking whether your data is appropriate, whether the training data is appropriate, and whether your model features or predictors are reasonable. And utilitarianism is about measuring the outcome. It's about whether your outcome seems to be fair and accurate, whether it has customer inclusion, and so on. So one can actually apply ethics theories in conjunction with the technical side of AI. That's why AI ethics is so interesting. It covers both left brain and right brain thinking.
[27:08]
Victor, you mentioned that governments all over the world are thinking about these issues, and industry is thinking about these issues. How much are these ethical considerations related to culture? Because cultures can be very different. For example, at least in Western society, certainly in the US, data privacy is a very big issue in the AI space. I'm now teaching a course on data privacy, but I also come from an Asian culture. I know that in certain Asian countries, privacy is not... it's not that people don't like privacy, but it's not something they can really rely on. So they will weigh other things much more, like how do we keep society harmonious, instead of data privacy. So I'm just curious what you're thinking about how much these ethical considerations should be culturally sensitive.
[28:03]
That's a very interesting question, Shelley. I've been reading and following AI regulations across the world, at least some of those are pre-regulations, and actually they are more similar than people thought. So many of the key regulations actually are all pointing to data privacy, intellectual property, copyright data, and so on. They are actually quite similar, and algorithmic discrimination especially is mentioned in multiple countries' proposed regulations at least. So they are actually very similar. There may be some degree of difference, mostly about how you balance innovation or adoption against safety or control. So every country has to face how much weight to put on AI safety versus innovation and adoption. They have to balance those two aspects. But other than that, underneath, many of their prerequisites are actually very similar.
[29:03]
Christina Chi
You know, it's interesting because in finance, just like Victor mentioned, it actually is surprisingly, I wouldn't say it's all the same, but it's fairly similar in terms of the AI-related regulations across various countries and jurisdictions for the most part. And there's a reason why. Actually, there's a few kind of governing bodies that sort of have a lot of authority over what happens. Like the SEC is one of them, for example. And when the SEC kind of puts out a new regulation and has some new policy that comes out, typically a lot of other countries will follow suit in the upcoming years and adopt that too. Same for hedge funds with the Cayman Islands. We actually incorporated our fund in the British Virgin Islands, BVI, instead. But every time a new regulation came around and came in, you could probably expect within one or two years that BVI would also adopt it too. And so it was nice in that we had almost a few years to prepare for stuff and to see the impacts of certain regulations realistically before it came over to us. In terms of ethics as well, by the way, as a fund we were required to adhere to a standard set of ethics. For our fund we just used the CFA ethics standards, actually. They have a CFA ethics handbook and that's actually used fairly, I don't want to say everyone uses it, but fairly universally. You know, there's people from various countries who all adhere to that set of ethics in their work as well in the financial space. And there's a lot of questions in that handbook on what's, you know, material non-public information, for example. That's insider trading, right? You can't trade based off of information that's, you know, considered insider information. That just makes sense and I think is fairly standard. And, you know, of course, if it's an AI as well, right, you can't be training an AI off of that kind of information either.
So, yeah, overall I think it's been, you know, fairly standardized, but of course, there's a lot of gray area and there's a lot of room where, you know, people will interpret things slightly differently. And that's why it leads to various issues, by the way, today in our industry.
[31:01]
Victor Lo
Well, thank you. It's reassuring to see that there are lots of commonalities for these issues across, you know, different countries, different cultures. Now, Christina, you are very open about the lessons you have learned, which is terrific, I think, particularly for the younger generations. You know, there is always this tendency when looking at successful people that everything written is mostly about their success. And it's kind of a bit misleading, because we know how we all struggle all the time. And I know that particularly during this time, when the world changes so fast and in many different ways, we are making more mistakes just because it's an unfamiliar situation. So can you share a story about, you know, whether you call them the best lessons or worst lessons, something that you want to offer to readers or listeners out there that can prevent them from making similar mistakes?
[31:57]
Christina Chi
Well, since this is a data science podcast, I'll say my biggest lesson in relation to life and data in general is that sometimes you have to just ignore the data, or ignore the statistics. Right. And I say this because, look, when I first started my first company, I was what, 20, 20 years old. And, you know, now I'm like mid-30s, I'm a lot older. But back then, you know, look, statistically, the chances of success, if I had looked at the data and seen how high that failure rate is, which is almost 100% at my, you know, in terms of my age and background and given my lack of experience, you know, would I have done this? Right? I probably wouldn't have. And I probably would have gone on to, you know, my path was I really wanted to go work in finance on a trading floor. And, you know, who knows? Today, you know, I probably would be an analyst on Wall Street on a very different path, right, if I had listened to the data and actually looked at the data and taken it seriously. And I say this to anyone who wants to potentially start a business in the industry. It's like, yeah, your chances of success are almost zero. You know, if you had listened to the data, I'm sure, you know, the data will tell you, okay, go back to school, right? Go work in the industry for 30, 40 years. Wait till you're retired. Wait till your kids are retired, you know. So it's almost like sometimes in life, when you follow the data and it's telling you to do one thing... like, same thing, by the way, like, 10 years ago, all the data was pointing to study computer science, right, as a major and take that path, become a software engineer, because it's a very lucrative path. Okay, well, my youngest brother, he's 12 years younger than me, he studied computer science because he followed that advice. And now, you know, he's telling me none of his classmates can find a job. And also, no one's happy.
You know, like, it's like one of those things where, do you really like what you're doing? Do you, like, sleep and dream about code? Like, you know, some people do, and that, you know, kudos to you. Mad respect, but I personally don't. And so it's almost like, I hate to say this generic kind of follow your dreams and, you know, whatever it is, and kind of ignore what the statistics are saying, but, you know, I'm sure out there, when I used to go to conferences 10 years ago, you know, when I was about to start my fund and stuff, the common advice on the street was like, don't do it, right? And at the end of the day, yes, even though my fund failed 10 years later, I'm still grateful that I had that experience and got to live that kind of unique life that was kind of largely ignoring what all the data was telling me to do and still going for it anyway. And, yeah, so I think that would probably be my biggest piece of advice there. Like, look, if you're listening to this podcast, you're very ambitious. I'm sure you have a lot of great goals in life, whether that's a startup or a large company or side projects, whatever it is, right? And so sometimes you just have to listen to your gut instinct on, like, what wakes you up in the morning. And that might not always correspond with what the data wants you to do, but you just have to do it, and that's okay. So that would be my biggest piece of advice for this audience today.
[34:58]
Victor Lo
Well, thank you very much. Victor started this podcast saying his views do not necessarily represent his employer. Now I have to issue a disclaimer: the views expressed on this podcast do not necessarily represent the whole data science profession. I'm going to get lots of my data colleagues saying, "You're podcasting something that said ignore the data." Of course, we get that. Thank you very much, Christina. That's wonderful advice. Victor, do you have any story to share?
[35:28]
Yeah. My own advice is that I've always found I have to keep learning. I have to study all the time, and I continue on the path of just being a student all the time. Part of it is because data science, or whatever we do in this quant area, involves a lot of things. It's not just the quantitative or technical side, which itself is a lot. It's a combination of math, statistics, computer science, and so on. But it also involves many less quantitative sides, such as philosophy, legal, risk, compliance, and so on, and social sciences. So I just have to continue to study all of those things, and I would advise everyone to do that, because some of those skills are really useful and some of the knowledge is really useful, especially when we get into using data science and governing AI and so on. And it's not easy, but it's very interesting to broaden your skills and your expertise.
[36:26]
Shelley Mack
I love this. I feel like this is advice I need to think about in my life. But as always, we could have 8 million questions for the two of you, but we have to wrap it up, and as Shelley said, we wrap it up with our magic wand question. So if you all would both answer. Christina, I'll put you on the spot first. If you could wave your magic wand and change one thing about the investor market, what would it be?
[36:57]
Christina Chi
Wow. I think for me, it would be making this market more accessible and affordable for everyone. Data should be available to everyone, not just to high-frequency trading firms. Opportunities in finance as well. Making sure they're more available, not just to elite, Ivy League-educated people, which I think a lot of us are, but still, it should be available to everyone and more accessible. And it's gotten better over the years, by the way, right? Like today there's what, like traders on Robinhood who are in high school, you know. Like, 10-plus years ago, this wasn't a thing, you know, and so it's definitely gotten better. But also, of course, there come risks with that, right? So education, making sure that we're educating this next generation properly on the risks involved in going into, whether that's finance or AI or data. It's so, so important to also have that education component along with making it more available to everyone. Victor?
[37:59]
Victor Lo
Yeah, I don't have anything to add beyond Christina's point about education. There are just so many things to learn about, whether it's the financial market or anything related. So just keep getting the education. I think that's always a good thing. I don't think it's a magic wand, but it may be the most obvious.
[38:17]
Well, I guess that if we have a magic wand for education, we want to educate a lot more people in a more efficient way. Right. But thank you to both of you, Christina and Victor, for a really insightful conversation and for lots of advice. There are great things happening with AI. There's lots of risk as well. So we all need to deal with both of them with caution, with, you know, proper education, and to inform ourselves. So thank you again for really a great conversation.
[38:50]
Shelley Mack
Thank you both so much.
[38:52]
Christina Chi
Thank you.
[38:53]
Victor Lo
Thank you for having us.
[38:57]
Liberty Vittert
Thank you for listening to this week's episode of the Harvard Data Science Review podcast. To stay updated with all things HDSR, you can visit our website at HDSR or follow us on Twitter and Instagram. A special thanks to our executive producer, Rebecca McLeod, and producers Tina Toby Mack and Arianwin Frank. This has been the Harvard Data Science Review: everything data science, and data science for everyone.