
Liberty Vittert
Hello and welcome to the Harvard Data Science Review Podcast. I'm Liberty Vittert, the feature editor of the Harvard Data Science Review, and hosting our episode today is our Editor-in-Chief, Xiaoli Meng. This month we wanted to take some time to talk in depth about what exactly generative AI is, what it can do, and what it can't do. In this special episode, derived from a webinar titled Future Shock: Grappling with the Generative AI Revolution, Xiaoli Meng collaborates with the Harvard Graduate School of Arts and Sciences to tackle the topic of generative AI with the help of esteemed panelists, the three co-editors of this month's special issue of the Harvard Data Science Review: Francine Berman, Ralf Herbrich, and David Leslie. Stay tuned for all of this and more on the Harvard Data Science Review Podcast.
Jen Flynn
Hello everyone. I'm Jen Flynn, Senior Director of Global Outreach at the Kenneth C. Griffin Graduate School of Arts and Sciences. We have folks joining us from all across the United States and from 28 other countries around the world. And on behalf of Harvard Griffin GSAS, I want to thank you for taking time out of your day today to join us as we learn from our distinguished colleagues from the Harvard Data Science Review about the many dimensions of future shock and how, as individuals and societies, we're grappling with the generative AI revolution. Our interlocutor for the day is Xiao-Li Meng, Ph.D. '90 in statistics. He is the founding Editor of the Harvard Data Science Review and the Whipple V. N. Jones Professor of Statistics at Harvard, and he is the former Dean of the Graduate School of Arts and Sciences. Meng was named the Best Statistician under the age of 40 by the Committee of Presidents of Statistical Societies in 2001, and he's the recipient of numerous awards and honors for his more than 150 publications in at least a dozen theoretical and methodological areas, as well as in the areas of pedagogy and professional development. Francine Berman is the Director of Public Interest Technology and Stuart Rice Research Professor in the Manning College of Information and Computer Sciences at the University of Massachusetts Amherst. She is a Faculty Associate at the Berkman Klein Center for Internet and Society at Harvard University and was selected as the 2019-2020 Katherine Hampson Bessell Fellow at the Radcliffe Institute for Advanced Study at Harvard University. Ralf Herbrich is a Professor of Computer Science at the Hasso Plattner Institute and the University of Potsdam and Chair of the Research Group on Artificial Intelligence and Sustainability.
Previously he served as Senior Vice President of Artificial Intelligence at Zalando and Director of Machine Learning at Amazon in Berlin, after starting and leading Facebook's Unified Ranking and Allocation team from 2011 to 2012. And last but certainly not least, David Leslie is Director of Ethics and Responsible Innovation Research at the Alan Turing Institute and Professor of Ethics, Technology and Society at Queen Mary University of London. He previously taught at Princeton's University Center for Human Values, Yale's Program in Ethics, Politics and Economics, and at Harvard's Committee on Degrees in Social Studies, where he received over a dozen teaching awards. And now I am delighted to turn things over to Xiao-Li Meng. Xiao-Li, over to you.
Xiaoli Meng
Thank you all for joining this event. I know that we are probably all in different time zones. I'm actually hosting from Australia, early in the morning, and I'm particularly happy to host the event for alumni. As Jen mentioned, I was the former Dean of the Graduate School of Arts and Sciences. During my deanship, one of the highlights of being a dean was the opportunity to meet alumni all over the world, and I hope that I have actually met some of you. Welcome again. After stepping down as Dean, I was given the opportunity to start the Harvard Data Science Review, and here is just a very quick introduction, because it's a great intellectual resource for all of you. It's free of charge; you can check it anytime. We publish articles from perspectives to applications. We have all the different columns, and most relevant for this event, we have a special issue calling for articles on generative AI. These three colleagues of mine here are the three co-editors for this special issue. And I'm extremely pleased that today we will hear directly from them what they think, starting from the question I'm going to ask all of them right away. We all talk about generative AI, but what does that really mean? And I'd like to start the question with Fran.
Francine Berman
It's great to see everybody there and everybody in the audience. Generative AI is artificial intelligence capable of generating things. So it's other media and patterns and all of that. The way it works is interesting, because it's trying to pick up patterns and characterize what it is about them that makes, say, the image the image. And so generative AI has all kinds of interesting technical ways of doing that and of getting the accuracy you want of what that image might be.
Xiaoli Meng
Ralf, the same question?
Ralf Herbrich
Yeah, kind of similarly, I think generative AI in itself is not that new as a mathematical model. I think what's really novel about it right now is the amount of data that is used to extract these patterns. So previously we had relatively small corpora of maybe some handwritten notes. But because we have models that start to match, in terms of the number of parameters, the capacity that we have in the human brain, and because we have an amount of textual and picture information that exceeds anyone's single lifetime, the patterns it does learn are extremely powerful and expressive. And so for me, generative AI is sort of sequence models of data on steroids, because of the amount of data that we produce that comes from our society.
Xiaoli Meng
Thank you very much. And David?
David Leslie
I mean, I would add to this maybe a governance perspective or even a social science perspective and say when we think of generative AI, we need to think of its conditions of possibility. So we need to think modally. So what's needed for a generative AI system to sort of run and operate in the world? And if we think this way, we start to think about various levels of infrastructure. So generative AI is not just the kind of character of the technology itself. It's compute infrastructure, right? Everything from the fiber optic cables that transfer information to the underlying GPUs and all the infrastructure that's allowing for information processing. It's data infrastructure: it's the proprietary data sets, it's the data centers, the modes of curating across the data pipeline. It's skills and expertise infrastructure. So here we're thinking about those, you know, very small numbers of our population who are actually equipped to handle the kind of software development and to handle the statistical and the mathematical details. The other bit of definition that I would add here is if we set generative AI next to the conventional machine learning systems, where we understand the lifecycle of these systems as, you know, roughly involving project design, model development and system deployment, what we would understand as being different in generative AI is that these systems are multiphase. And so it's not just that you have one simple sociotechnical life cycle of design, development and deployment. You have, first off, the data that feeds into the training of a sort of base model, a foundation model, which is the kind of large, you know, the basis of what a generative AI system is. But then that system can have multiple downstream applications.
And so what you have is you have the design development, deployment of the base model, but then you have what we call amplification, or you have got the fine tuning of that base model in accordance with whatever desired functions the application builder might have. Right? So one thing in terms of governance we need to think about here is that this is a distributed sort of, you know, supply chain. And we need to think about all of these complex moments along this multi phase supply chain.
Xiaoli Meng
Thank you, David. And then let's get to the second question. And again it's to all of you, what do you feel like are the three biggest risks and the three biggest opportunities with generative AI? And let's start with Fran again.
Francine Berman
Yeah, this is an interesting question. We could probably spend the whole webinar on this. I think one of the biggest risks is tech dominance and misbehavior overriding the public interest. So if you think about everything you read, it's the drama you're seeing out of Silicon Valley, it's what companies are doing, it's the impact of innovation and turning that into tools and services. But AI is much more than that. And the management of AI that will help humanity and society thrive is much more than that. And I think we don't put enough time and effort and scrutiny into what we need to do in order to take on these wonderful innovations. You know, it's not an exactly right analogy, but think about atomic power: nuclear power can power cities and it can destroy humanity. And what have we done about that? We've put all kinds of guardrails on that. We have standards, we have laws, we have regulation. We're trying to integrate it into society with as high benefit and as low risk as possible. We haven't done that with AI. And so that, I think, is something that encompasses probably all of the risks. Poor oversight and management, no humans in the loop: I think those are really problems with AI. Opportunities, I think, are super interesting. If you think about AI, we are already seeing, in the right kinds of circumstances, opportunities for more efficiency, for more customization when it matters, more tools for assisting us. Think about the way people are now reading images for health, and the kinds of things AI can do to kind of go through a whole set of cases, which are helping doctors sort of figure out what's going on. I think it's giving us, in a challenging way, but I think it's still an opportunity, the chance to restructure work time and responsibilities for the betterment of humanity.
And I think getting there is hard, but I think if we can utilize AI as a tool and as an assist to do a lot of the things that perhaps we have people doing now, what does that mean for the workforce, and can we restructure things in a way that's actually better? The last thing I want to say, which really intrigues me, is that I was listening to Stuart Russell's Commonwealth, you know, speech the other day, and he said that he thought that humans have a competitive advantage in a lot of things that AI will never do. AI will never have empathy. AI will never have emotions. AI will never have interpersonal relationships. And if you think about kind of restructuring things down the line in a way that really leverages our capacity for those kinds of things, our capacity for ethical behavior, and kind of creates a hybrid world in which AI does some things and we do others, it gives us a chance to be the best of ourselves. So I find all these things opportunities and intriguing, and of course, the devil's in the details.
Xiaoli Meng
Thank you very much, Fran. And the same question to Ralf.
Ralf Herbrich
Yeah, so I want to talk first about two of the risks I see that Fran hasn't mentioned. So one of them is that we see the systems producing very human-like and believable predictions in the form of text or images. But what I think we humans are also very good at is expressing uncertainty. One measure of someone's seniority is how much they know what they don't know. And these systems right now are not built algorithmically to reliably quantify when they're uncertain. And what that leads to is too high a degree of belief in the truth of the statements or of the visual effects that get created. And that's a big risk. And, you know, as an opportunity, it's also a great research area. I think we as a research community have demonstrated that we can learn the patterns from these humongous amounts of text and image data, but we haven't yet leveraged all the methods that also allow us to quantify this uncertainty. The other aspect I see is that these systems right now, and again, this is a snapshot of research as we stand, consume a lot of energy as they do these calculations. In the days when I worked in the companies that were mentioned, one of the key considerations was always: what does it cost to compute an intelligent prediction, whether it's a web search result or an advertisement? Right now, the cost often stands in no relationship. And the cost, you can measure it in dollars, or you can measure literally what it is in terms of energy. So something that I think is also a bit of a risk is to use these models, in their current way of being implemented and run, without doing the energy balance.
Now, the opportunity that I see: when I started in '95 to look for the first time into AI methods, and then in the 2000s into their applications in industrial contexts, an enormous amount of effort was going into what we call feature engineering. You have a data set, it's textual or it's images, and you, as the engineer of the AI system, first engineer how you represent the data to the algorithms that make predictions. I think the great opportunity we have is with the patterns that are abundant in the way that text is structured or images are structured. We have a very good representation of visual and textual data that we can build upon for domain-specific applications, representations that previously took decades, if not longer, of human intelligence to produce, from software engineers and modelers and statisticians. So I think that's a great opportunity, to leverage those representations for further applications in a much more domain-specific way.
Xiaoli Meng
Thank you, Ralf. And the same question to David.
David Leslie
I want to go back first to something that Fran said, which is that, you know, we have this kind of unique leverage of having empathy and capacity for solidarity and interpersonal relationships. And I would identify the first risk just in this, in the sense that these systems, as they're being trained through things like reinforcement learning from human feedback, are really being trained to be anthropomorphically almost deceptive when it comes to the relationship that one has with the system itself, right? So when you have anthropomorphic deception, it can lead to behavioral manipulation, it can lead to harms to human dignity and a person's sense of moral or psychological integrity. So you have, for instance, conversational agents like GPT that are being trained to carry out human-like dialogue. And this can actually deceptively lead people to believe they're interacting with real, warm-blooded agents. And that can really undermine, in a sense, the interpersonal relationship that we sort of come to assume is part of our interaction. Many, many years ago there was a researcher named Joseph Weizenbaum who described what's called the ELIZA effect, which is this tendency for human beings, when interacting with the simplest of chatbots, to just project kind of human-like qualities into the system. And that, in a sense, really does make us vulnerable to losing a bit of that authentic interpersonal relationship. Another thing I'd just mention, in terms of a larger-scale set of risks, is the risk to the integrity of information ecosystems. So we know that the irresponsible or malicious development of Gen AI technologies could lead to the scaled production of disinformation, propaganda, and false but true-sounding information, which could potentially flood the digital public square with misleading and non-factual content.
I mean, it is so important right now to think about this amidst all of these major elections coming up. In the biggest election, I think, in human history, about a billion people will show up at the polls quite soon. And we can think of the kind of potential scaled use of these systems as posing a threat to those broader democratic processes. Also, if you think about the integrity of information ecosystems, we can think about how it's not just the undermining practices, but it's also the downstream effects on our data ecosystem. Right? So we see now that the volume of synthetically generated content out there in our information ecosystem is likely going to create the kinds of data pollution that will contaminate downstream data sets and really challenge future attempts to draw on this kind of, you know, public space of data in order to train systems. There have been things written about model collapse, how eventually a polluted downstream data set will sort of narrow into the mean, and all of the kind of unique outlying ideas that are out there won't be picked up by the models. There'll be a collapse of the models. So we really need to think about that. One final thing, Xiao-Li. What we've seen, I think, over the last year is a kind of commercialization or industrialization of these generative AI systems, but in a way which, I think, just looking at what's happened recently with OpenAI and Sam Altman and Microsoft, has really led to a kind of unchecked consolidation or centralization of power amidst a few large tech companies to control the data infrastructures, the compute infrastructures, the kind of skills engineering and software engineering infrastructures. And if we sort of step back from a social scientific point of view, this can really undermine ultimately the public interest. This could lead to increasing wealth polarization, increasing global inequality, uncontrolled labor disruption, elimination of vulnerable industries.
And so we really need to be careful that as we now think about how to regulate these systems, we really think about that broader ecosystem dynamic of power differential, and how there is a difference between small and medium-sized companies trying to use these systems to advance whatever purpose and the way that these bigger players are actually controlling and intermediating the broader environment. Now, I do have one nice set of opportunities that I would highlight, and this really has to do with scientific discovery, because I do think, for me, the most liberating possibility with generative AI and transformer models and these large-scale models is that they can assist us to be more creative. So in a sense, if we can think of these systems as computational vehicles of observation and assay and even analysis, these systems can allow scientists to heuristically probe unbounded search spaces and patterns in high-dimensional data that would otherwise be inaccessible to human-scale examination and thinking. Right. What these systems are doing is they're picking up patterns in very high-dimensional data that would simply exceed our own capacity to identify those patterns. And so one thing that these systems are going to do is they're going to allow us to have new kinds of lenses into different worlds in the natural world. So think, for instance, about how these systems have already been used to discover new chemical compounds; they've been used to discover new engineered materials in virtue of this capacity to heuristically search and identify high-dimensional patterns.
Xiaoli Meng
Thank you very much, David. My next question is to David again. To what extent are these advancements presenting contemporary society with a danger of the future shock?
David Leslie
Right, Xiao-Li. So this is the title of our collected volume. And I think when we were all, you know, initially talking about this, one of the first things that came to mind about this kind of industrial revolution of AI, if we could call it that, right, this moment, this Gen AI moment, is something that Alvin Toffler described as future shock, which is to say that when technological change, if you will, outpaces the kind of norms, understandings and ways of governing our behavior that exist at any given time, we are subject, as Toffler would say, to too much change in too short a time. And really what future shock does is it presents us with this issue of: are we ready for this? Do we have the institutional structures, do we have the governance structures in place to responsibly manage this type of change? And I mean, we're kind of seeing that we are feeling this kind of what Toffler would call shattering stress on our bigger systems, rushing to, for instance, regulate AI in time to head off at the pass, if you will, really significant labor disruption or significant disruption of information ecosystems. And so when we talk about future shock, it's important that we don't treat "the AI," in scare quotes, as being out of our hands, right? As if the horse is out of the barn and it's going to do what it does. I don't think we should see it that way, because at the end of the day, we always have our hands on the wheel of the technology, because it's humans who make the technology, right? So we have to be aware that we can affect future shock by hastily producing technology. But that doesn't mean that the technology itself is out of our hands. And so I think we need to just be careful not to be what we would call technologically deterministic and say, well, there's nothing we can do about this. We need to do the best we can.
And I would say, I think that's maybe one of the reasons why we've been motivated to produce this edited volume amidst all of the busyness that this time presents to everyone.
Xiaoli Meng
Well, thank you, David. And I should thank all of you again for taking on this tremendous task. Let me ask Fran, as a specific example: are we experiencing the future shock? And how does generative AI shock the academic world regarding the accepted norms and the practice of scientific research, teaching, scholarship and academic publication, and all the things that we do in the academic world?
Francine Berman
Yeah, I did want to reinforce one of David's points, which is that by definition, disruptive technologies disrupt, and they really cause us to do things differently. And that transition period is always challenging. And it's not just AI. I mean, think of nuclear energy and how we had to integrate things. Think of the industrial revolution, when iron and chemical manufacturing changed everything. It changed where people lived. It changed the kind of laws they had. You had some of your first child labor laws and environmental protection laws because of the rise of factories. And so we are going to need to do the same things. Now, if you look today at our institutions, in particular our academic institutions, AI is truly a disruptive technology, and in particular, ChatGPT. It's changing the way we teach and learn. It's changing how we do research. And something you might not be thinking about is that it changes how we run our universities, because our universities are businesses and they're organizations. And there are lots of things our universities do at scale. They admit students, they do performance evaluation, they hire, et cetera. And that's changing the way a lot of people are doing these things. And so, like other organizations, you know, universities are trying to grapple with this. I'm on an all-university AI task force for the University of Massachusetts Amherst, with a lot of other people, about how we are going to be dealing with it, how we get ahead of the curve. And, you know, there are ways that you're not going to deal with it. You can't put the genie back in the bottle. You just can't pretend we don't have these things and say we're not going to use AI, because essentially all of our students will be using AI in their professional lives. All of our faculty are using AI in various ways. Our administration, too. So we can't put the genie back in the bottle. We have to do our jobs with the best tools available and with checks and balances.
So we use them ethically, equitably, etc. And you know, we're not going to wait until we know everything about AI, because we will never be in that space. And so what universities are trying to grapple with is how do you introduce AI in the classroom? How do you introduce AI in research? And how do you introduce AI in administration? And in the classroom, you read all the time about the, like, ChatGPT got into Wharton by taking, you know, all of these kinds of crazy things. But I will tell you, I use ChatGPT in my Introduction to Public Interest Technology class, and I tell the students three things. I tell them, number one, you need to know how to be a good writer and you need to know how to generate ideas yourself. You will not always have these AI tools. These AI tools can help you tremendously, but you need to know how to do this yourself. A well-written email is magic, and you need to know how to do that. Number two, ChatGPT is okay at helping you with this, but it's often wrong and it's not an interesting writer, at least not yet. And so you really need to use it as a co-pilot with you as the pilot, as a developmental editor, maybe as an idea generator to go along with yours. But you have to be in control. And the other thing I've told my students, which is absolutely true, is that nothing you put in these tools is private. So if you want to talk about sensitive material, if you want to talk about things that will go in the giant database, you have to be careful. And as an instructor, what that means is I have torqued my assignments a bit. I ask much more about students' personal experiences, I ask what their opinions on various things are, and to do that with this context. And I think a lot of people are doing similar sorts of things. But the other thing I wanted to mention is that universities have to admit students at scale, they have to hire at scale, and all organizations are going to have to worry about that in the future.
So in a sense, universities are kind of the point of the spear in this. But I think all of us will have to worry about it.
Xiaoli Meng
I certainly do as a faculty member. And thank you, Fran. The next question is for Ralf: how does this moment of transformational technological change compare to other major transformations in human history, for example, the emergence of the World Wide Web three decades back?
Ralf Herbrich
It's an interesting question because it forces me to predict its future impact, which we've already seen for the World Wide Web. I think one of the aspects of generative AI, in particular large language models, that makes it so appealing is that the interface is much more natural. You put in questions, and the response is not a set of labels; it's text, and it's text written in your language, which you understand. That is what David referred to: it can almost give you the feeling you're talking to a human. It's not a system. It has life. Now, why do I say this is important? Because I think it makes the technology much more available in many more jobs in society. So will it have the same transformational effect? The World Wide Web was impactful because it allowed us to connect, you know, beyond physical distances. So I was able to get information that is not just, you know, a few hops away in my network. I got the information through a wire, even though it was in a different part of the planet, in a different time. Now I get to information that is contained in parts of the Internet that are hard to find, and I get it in a form that I understand without being a programmer. If I simply can read or write, which many, many people can do, I get access to that information that is buried somewhere deep in the whole body of knowledge that is called the Internet, with all its risks and all its dangers when this is wrong information or when this is biased information. But I can get to it. So will it have the same technological impact in society? I think it does. With the large language models, I enable the same ease of finding something that's not physically far away from me, but that's hidden through the notions of time or hidden through the depth of words created in parts of the Internet. And therefore, I do think it will be quite transformational. Maybe I'll give you an example, continuing with what Fran just said.
I have used Copilot extensively in my class for the last six months. So how has it changed things? Well, I can now get to much harder problems, much more difficult software problems, because, first of all, I use it, so it helps me a great deal in the preparation of code that I need for the assignments. I teach computer science classes. But I can also make the assignments a lot more complex. And is this like cheating? I think to a degree it's not. Because when I started, there wasn't any fancy IDE. We had to type into a system called vi or Emacs. And, you know, your development speed of code writing was way slower because you made a lot of mistakes. You didn't have a proper debugger. Fast forward 20 years. These tools are available now, and they're accepted, and they're not considered cheating. When you program, you have your refactoring code macros in your IDE, you have your syntax highlighting, and you have your automatic formatting and completion of method calls. This is just a continuation of this tool chain, and what it has allowed me and students in the past 20 years is to tackle harder problems in a short amount of time. And I can do that now, and so I do use it. So has it changed the teaching of computer science? It has. Is it that we, you know, don't need computer scientists in the future? Absolutely not. They'll just be able to tackle harder problems in a short amount of time.
Xiaoli Meng
Thank you, Ralf. Let me move from the technology and the use of AI to another really important issue, which is AI governance and the law. And this question will be for, you know, David: what's your view on the changes in the law and in AI governance, and in particular, how do you envision the law and AI governance evolving in the US, the EU, and internationally? Because generative AI as a whole is certainly a global phenomenon.
David Leslie
Yeah. So I think that we've seen over the last few months a kind of really huge increase in the activity around international action or international cooperation and collaboration on thinking about AI governance. So we had the UK AI Safety Summit, which produced this Bletchley Declaration. The Partnership on AI brought together a kind of multi-stakeholder group to produce guidance on foundation models. The G7, which has a Japanese presidency, has something called the Hiroshima Process, which produced a whole code of conduct, an international code of conduct for producing the systems. And so we see this kind of slew of multi-state, multi-stakeholder initiatives, mostly voluntary, at the international level. And that's an important thing to flag up, because we simply don't have a set of international structures that can sufficiently put controls on the technology. The other issue that we're seeing right now with this international environment is that the people who are at the table, who are talking about this and who are writing these guidances, largely, in a sense, overrepresent the positions, interests and policies really of a select few geopolitical and private sector stakeholders. In a sense, right now, the international policy and governance conversation tends to center the views of the Global North and the kind of prominent tech actors that exist in high-income countries. And so we really do have a risk now that the voices and views of those who are from the non-West and what's historically been called the Global South are being neglected. And we need to be concerned about this. We need to have a more even pitching of the global conversation on this. One other thing that I'd point out here is I think there's been a tendency, especially in the UK, where I do a lot of my work, to set a kind of pro-innovation, minimal-regulation mentality, again, of advancing these systems.
And so the thought is the more regulation you have, the less innovation you'll have. And I think that often is a false dichotomy, because with good solid regulation, where you've got standards that support the regulatory action and the governance constraints, you've got this creation of certainty in markets and you've got this expansion of public trust. And so oftentimes having good solid regulation that enables the generation of solid science does enable innovation rather than preventing the advancement of innovation. More recently now, I think there is some sort of positive and constructive action on this front. We're really thickening the approach across the different sectors, across the different government departments, to increase readiness to address competition issues in the broader ecosystem. So I do think that progress is being made. And the last thing I'd mention: in parallel, we've got the EU AI Act, which just this past couple of weeks has been pretty much brought to a political agreement, at least. It won't come into effect for a year or more. But we have a solid basis now of a kind of risk-based but rights-aware regulatory framework. And I think this is where we need to go. We need to be risk-based in the sense that the risks, the scale and the scope of the systems establish the proportionality of the governance intervention, at the same time as it's values- and rights-based thinking that defines those risks and defines the way in which we consider the impacts on society. So we need to walk and chew gum. We need to both understand the risks and respond proportionately to the risks. But we also need to think first and foremost about fundamental rights and freedoms, you know, human rights and the broader set of values that will drive the innovations forward toward a more sustainable society.
Francine Berman
I wanted to jump in on that just a little bit because I think a lot of what David says is really important for people to know and I think they don't think about that a lot. AI all over the world is really different. And what you're seeing from the EU, what you're seeing from the US, what you're seeing from China are all really different and they have a lot to do with power and politics and what the cultural notion of equity is and who's on first: is business more important, are individuals more important, what are the penalties, what are the rewards? And you see a variety of different ways of thinking about law, thinking about policy, thinking about the private sector all over the world. I think it's also important to note, and this gets back to something Ralph said earlier, that power matters, and now I'm talking about electrical power. For us to be able to run these large language models, we need huge amounts of data and we need huge computers that eat up a lot of power. And so if you think about these disruptive technologies, and you can even include Bitcoin in that, the computational impacts of these are tremendous. And it means that big companies can get ahead. It's harder for small companies and it makes a difference. I want to say one last thing because I think we all hear the kind of innovation versus regulation controversy a lot. But the fact is nothing is done on a clean slate. I mean, we have laws of gravity. If we don't pay attention to those, that's not good. And so there are always restraints on the environment and we innovate sometimes because of that and sometimes in spite of that. And I think that will be true for AI as well. As there is more and more regulation and there are more and more constraints based on cultural and political and other kinds of things, we will still innovate. We'll just innovate in a slightly different ecosystem.
Xiaoli Meng
Thank you, Fran. That's a perfect segue to the next question. I know it's on many people's minds: how should we adjust our society and educational institutions to ensure AI empowers human equality? In particular, companies now really have much more computing resources than, for example, academic institutions. And there are a lot of these equality issues, both for individuals and across different sectors. How would you assess the likely impact of generative AI on promoting greater equality or protecting privilege?
David Leslie
I mean, I think we need to step back for a second and understand the amount of potential good that these technologies could do if guided by the public interest. Right. Which is to say, you know, we see that these systems could help address public health challenges, address challenges in clinical medicine, challenges confronting a biodiversity drain, environmental sustainability, and also address issues of poverty and educational inequities. I mean, there just is an endless list of possible public benefits that could come from the responsible and society-led use of the technology. And I'm talking about this when we talk about equality because if we take that sort of public interest starting point, we would then recognize that these technologies could function as public utilities. And if we understood these as public interest technologies, then we would have absolutely no reason not to marshal their potentials to address the issues rather than to increase wealth polarization due to the concentration of technological and financial power. And so, yeah, I mean, Xiaoli, I think that we see the capacity of our environments to really do a lot of good in the world. And yet, as you were saying, the distribution of compute resources is not equitable. And oftentimes now the researchers themselves are subject to the agenda-setting power of larger tech companies because they're dependent upon those tech companies for their compute and for their data. And so we don't want to sort of portray too much of a high relief here because I think public interests are in everybody's benefit. And so first and foremost we need to pursue this innovation in a bias-mitigation and discrimination-aware manner. Right. We need to address the inequities first and foremost in the data sets and in representation. And from there then move to try to see where the public benefits lie. But we need to address inequities first and foremost.
Francine Berman
Yeah, just jumping in on what David's talking about. A lot of the stuff we use is stuff that is not critical infrastructure for us. But imagine life without the Internet. And increasingly there are things that are born digital. There are ways that you have to use it and you have to just accept. Imagine going to your library and using encyclopedias and never using search. A lot of this stuff is critical infrastructure. And we have different rules for critical infrastructure. We have public interest rules for that. We ask questions like, is this cost effective? Is this fair? Does this advantage some groups and disadvantage others? And we try to fix it so that it is equal opportunity for everyone within the cost basis of everyone, you know, as fair as we can make it. Because you're right, you know, human beings make these things and often we embed inequalities even when we don't want that. But I think we have to pay special attention to the parts of AI and the parts of technology in general that we use as critical infrastructure. And I think we have to start applying different rules to it and different cost models to it, because we need these things in order to progress as a society.
Ralph Hayobrisch
Maybe I want to add more. It's not exactly on the use of those systems, but on the research on those systems. And the interesting thing is that this isn't a new phenomenon. For the past 20 years, private companies were privy to the large amounts of data that's relevant to their business. So therefore they're willing to invest. And you know, we've seen a brain drain in the field of machine learning and AI, not just since 2022, but since 2010. And the reason was that the data, which was customers' data, whether they searched the Internet, shopped virtually, or connected through the Internet on social channels, was present at the companies, wasn't present in public, and you couldn't make it public because it was private data. I think what we see here now is almost a bit easier to solve: the amount of computation that you require for getting these emergent effects is larger than a single researcher and the research budget can afford. But in the context of a large corporation, that can be fulfilled and there's business value attached to it. So I do think if we wanted to get to equity in terms of the research on those methods, we need to think about public infrastructure that can be used for the purpose of research and research only. That's currently not present. So I think some chairs are well equipped, some universities have the ability to tap into, let's say, $100,000 of computation. Others cannot do that, but at least you can make that equal, as opposed to what happened in the past 20 years or 10 years, where you couldn't possibly release the data. Here these models are trained on public data, but expensive in cost of compute. So it seems more solvable than getting equity in terms of the research on private data.
Xiaoli Meng
Thank you to all. There are two more questions we want to cover. I know that one of them is on almost everybody's mind across different age groups: well, will generative AI take away my job? Do I still have a job in the future? I know that lots of my students are pretty anxious about it.
Francine Berman
Yeah, it's really interesting. And I think during disruptive technologies, we always go through these kinds of things. Will it take away your job? Well, it depends on what your job is. So I don't know. But I do think that it will be really important for us to work with AI the same way that the pilot of the plane you just flew on works with AI to make sure that the plane is... We need to vet the answers. We need to sort of work with it as a tool. I think that it's likely that AI can enhance your job. Will it take away your job? That's pretty risky. So there's a good question about that kind of thing. But I do think that as we kind of smoothly move into a better relationship with AI, we really have to look at it enhancing jobs, and some jobs it will take away. But then the question is, we will need humans to be dealing with these AIs, so we will create some new jobs out of that as well.
Xiaoli Meng
Ralph?
Ralph Hayobrisch
Yeah, I want to continue where Fran left off. I completely agree. In some jobs, you just get more productive because you have a more powerful tool. Others might go away. But I think as a society, what's important is that we don't leave behind the people who have learned the previous job without the technology, so that we educate them. We don't just educate them in school or we don't just educate them at uni. But the growth of people's jobs will be affected because the tool becomes available. As a society, we need to continue educating them whether they're 30, 40, 50, or even 60. Because just having these more powerful tools, but no training on how to use them, or no training on new jobs that emerge because the tools replace big parts of a job, is societally unfair.
David Leslie
Just to add on to those two, I would say we have to ask the question: who benefits from automation and the labor transformation? Right. What we know about these systems is that these are cognitive surrogates. They're stand-ins for thinking functions. Right. And as there's more automation of thinking functions, as we see with generative AI, we'll have a larger and larger, if you will, bloodless labor force. Right. It'll be automated cognitive systems out in the world. And when you move in that direction, the question is who's actually benefiting from that automation? Because if the benefits are accruing to very few people, then we won't have better outcomes that will free those who are being displaced. And so we need to really think about the broader dynamics of, if there is labor displacement, how can society create opportunities for people to better contribute to the life of the community through their creativity, through their talent. And those are social problems that we can't just treat in an economic frame. We need to think of those as social issues.
Xiaoli Meng
Thank you, David. And for the last question, I would ask: if people want to engage further, can you suggest an action item or ways in which they can make a difference, and/or could you suggest good resources for those wanting to educate themselves? Maybe start with Fran again?
Francine Berman
Yeah. I would say we want to be in a world where humans are in charge and AI is a great help. So we control them, they don't control us. I would say to each individually, buyer beware. We have to think about the tools and services that we use. We want to make sure we're protected, we want to make sure our data is good. We want to make sure that AI is used appropriately, and really check your answers when you use ChatGPT. Don't believe everything ChatGPT tells you.
Ralph Hayobrisch
So I think your question was where they can educate themselves further. I would say, working on the special issue of HDSR on Future Shock, we paid attention that it's broadly accessible. You know, this is not the methods deep dive that you would get at the current NeurIPS conference. So that's a good source if you want to learn more and find it accessible.
David Leslie
I would say first and foremost, don't believe the hype. These are statistical models, mathematical systems, and we should not anthropomorphize the systems. We should really understand these as tools that are in the human hand, as Fran was saying, and are to be directed in accordance with our goals and purposes. At the end of the day, our future should and will be a human future that's democratically decided rather than technocratically decided. So I would just say we need to start from our humanity rather than the technology.
Xiaoli Meng
Well, thank you all very much. I want to first thank all three of my co-editors for your fabulous answers. Hopefully we can continue this conversation. I want to thank GSAS and Jen for giving us this opportunity to engage the alumni, and I certainly welcome everybody to read the Harvard Data Science Review, as Ralph mentioned. But I also hope that in the next conversation we have, we're going to have AI join us.
Liberty Vittert
Thank you so much for listening to this week's episode of the Harvard Data Science Review podcast. To stay updated with all things HDSR, you can visit our website at hdsr.mitpress.mit.edu or follow us on Twitter and Instagram @hdsr. I'm Liberty Vittert, the feature editor of the Harvard Data Science Review, and our host of this episode is our Editor in Chief and my co-host, Xiaoli Meng. A special thank you to our executive producer, Rebecca McLeod, and producers Tina Toby Mack and Arianwen Frank. If you liked this episode, please leave us a review on Spotify, Apple or wherever you get your podcasts. This has been the Harvard Data Science Review. Everything Data Science and Data Science for everyone.
Harvard Data Science Review Podcast: Future Shock – Grappling With the Generative AI Revolution
Release Date: May 31, 2024
In the May 31, 2024 episode of the Harvard Data Science Review Podcast, host Liberty Vittert introduces the discussion led by Xiaoli Meng, the Editor in Chief of the Harvard Data Science Review. The episode, derived from a webinar titled Future Shock: Grappling With the Generative AI Revolution, delves into the complexities of generative AI, exploring its capabilities, risks, opportunities, and its profound impact on society and various institutions.
Francine Berman initiates the conversation by defining generative AI as "artificial intelligence capable of generating [various] media and patterns" ([04:50]). She emphasizes its ability to discern and replicate intricate patterns to create accurate representations, such as images.
Ralph Hayobrisch expands on this by highlighting the novelty of generative AI's current state, not in its mathematical foundations but in "the amount of data that is used to extract these patterns" ([05:24]). He likens generative AI to "sequence models of data on steroids," attributed to the vast and diverse data generated by society today.
David Leslie adds a governance and social science perspective, explaining that generative AI encompasses not just the technology but also the "compute infrastructure," "data infrastructure," and the "skills and expertise infrastructure" required for its operation ([06:16]). He underscores the multi-phased lifecycle of generative AI systems, distinguishing them from conventional machine learning models by their foundation and downstream applications, necessitating a comprehensive governance approach.
Francine Berman identifies several key risks:
Ralph Hayobrisch points out:
David Leslie highlights:
Francine Berman envisions:
Ralph Hayobrisch suggests:
David Leslie proposes:
David Leslie discusses the concept of future shock, where technological advancements outpace societal norms and governance structures, leading to "shattering stress on our bigger systems" ([21:08]). He emphasizes the need for proactive governance to manage rapid AI-driven changes and prevent societal destabilization.
Francine Berman addresses how generative AI is transforming academia:
Ralph Hayobrisch compares the generative AI revolution to the emergence of the World Wide Web:
David Leslie outlines the current landscape of AI governance:
Francine Berman adds:
Ralph Hayobrisch emphasizes:
David Leslie advocates for viewing AI as a tool for public good, capable of addressing critical societal issues like public health, environmental sustainability, and educational inequities ([39:39]). He stresses the importance of bias mitigation and equitable data representation to ensure AI benefits all segments of society.
Francine Berman likens AI to critical infrastructure, arguing that it should be governed with public interest principles to ensure fairness and equal opportunity ([41:59]).
Ralph Hayobrisch highlights the necessity of equitable access to computational resources for research, proposing public infrastructure to democratize AI development ([43:19]).
Francine Berman reflects on the dual impact of AI on employment:
Ralph Hayobrisch echoes the need for lifelong learning:
David Leslie raises concerns about:
Francine Berman advises individuals to:
Ralph Hayobrisch recommends:
David Leslie emphasizes:
The episode concludes with Xiaoli Meng expressing gratitude to the panelists and encouraging continued dialogue on generative AI's multifaceted impact. Liberty Vittert wraps up by directing listeners to the Harvard Data Science Review's resources for further engagement.
Francine Berman ([04:50]): "Generative AI is artificial intelligence capable of generating things. So it's other media and patterns and all of that."
Ralph Hayobrisch ([05:24]): "Generative AI is sort of sequence models of data on steroids because of the amount of data that we produce that comes from our society."
David Leslie ([06:16]): "Generative AI is not just the kind of character of the technology itself, it's compute infrastructure, right?"
Francine Berman ([08:57]): "AI is much more than that. And the management of AI that will help humanity and society thrive is much more than that."
Ralph Hayobrisch ([12:16]): "These systems right now are not built algorithmically to reliably quantify when they're uncertain."
David Leslie ([15:03]): "Anthropomorphic deception can lead to behavioral manipulation, it can lead to harms of human dignity and a person's sense of moral or psychological integrity."
Francine Berman ([23:48]): "We cannot pretend we don't have these [AI] things and say we're not going to use AI because essentially all of our students will be using AI in their professional lives."
Ralph Hayobrisch ([28:27]): "With the large language models, I enabled the same ease of finding something that's not physically far away from me, but that's hidden through the notions of time or hidden through the depth of words created in parts of the Internet."
David Leslie ([32:35]): "We need to walk and chew gum. We need to both understand the risks and respond proportionately to the risks."
Francine Berman ([41:59]): "Imagine life without the Internet. And increasingly there are things that are born digital."
David Leslie ([47:11]): "We need to really think about the broader dynamics of if there is labor displacement, how can society create opportunities for people to better contribute to the life of the community through their creativity, through their talent."
Listeners are encouraged to delve deeper into the topics discussed by exploring the Harvard Data Science Review, particularly the special issue on future shock. For ongoing updates and resources, visit HDSR at MIT Press or follow them on Twitter and Instagram.
This summary encapsulates the rich dialogue and insights presented in the episode, providing a comprehensive overview for those who have yet to listen.