Tracy Alloway
Bloomberg Audio Studios Podcasts Radio News.
Joe Weisenthal
Hello, and welcome to another episode of the Odd Lots podcast. I'm Joe Weisenthal.
Tracy Alloway
And I'm Tracy Alloway.
Joe Weisenthal
So, Tracy, you know, you ever come across some writing and you can't articulate exactly why, but you're like, I'm pretty sure AI wrote this. Does this happen too much?
Tracy Alloway
So, full disclosure, I haven't really thought about it that much. Yeah. Because the thing is, I probably should think about it more, but there's a lot of bad writing out there and I've become sort of inured to it. And I also think that, I don't know, trying to figure out whether or not something was generated by AI nowadays, if you actually dedicate a lot of your own time to doing that, that is a huge mental burden to be attempting. Especially as you and I are in the journalism industry. How many of the pitches that we get from PRs right now do you think are being generated by AI? Imagine if you're reading each one of those and trying to figure it out on a daily basis.
Joe Weisenthal
You know what I suppose I think about it the most is someone will respond to a tweet.
Tracy Alloway
Yeah.
Joe Weisenthal
And I'll be like, well, if this is a real person, then maybe this person deserves some engagement and they ask a question. Or I want to respond.
Tracy Alloway
But if it's a real.
Joe Weisenthal
If there's a person in the bot, and obviously I don't. And that's where I'm like, you know what I want to figure it out. I would like to know the answer. You know, I have a controversial view about AI writing, by the way, which is that it's pretty good. I mean, like, by and large. And I said this, I think maybe in a recent episode, when you consider the fact that I don't know, the majority of the population, like, doesn't know where to put a comma within a sentence.
Tracy Alloway
Well, this is my point.
Joe Weisenthal
It's actually pretty good. I mean, one thing I'll say about AI is it never gets the placement of a comma wrong. On some level, it's perfect.
Tracy Alloway
Did you do that? I think it was in the New York Times. The. The test.
Joe Weisenthal
Yeah. I kind of hated that.
Tracy Alloway
Okay. Why?
Joe Weisenthal
Well, because. I'll tell you why. First of all, there's only five examples. There's not very many. Two, it asked the reader, which do you prefer? But I think.
Tracy Alloway
And they were different subjects as well.
Joe Weisenthal
Yeah. And also, I think most people probably treated that as, can you guess which one is a human? Because everyone wants to say they prefer the human. I didn't think it was, like, a great test. Nonetheless, look, not only is it often indistinguishable, it is often fine writing. Sometimes AI can come up with a really remarkable turn of phrase.
Tracy Alloway
Yeah.
Joe Weisenthal
But I still, by and large, don't like it. You read, like, a thing, especially a long text that's AI, and even if you can't articulate it, it's like, this feels AI. It has a certain sickly sweetness to it that is often annoying.
Tracy Alloway
What I notice about it is it doesn't do style very well. Right. So if you ask it to write something in the style of a writer, if you choose anything other than something really obvious, like Shakespeare, it really. It suffers. But the text that it actually outputs is pretty clear. Yeah, right. For basic understanding, totally. It's probably better than a lot of what's on the Internet.
Joe Weisenthal
The real people who are going to have to worry about this are, like, teachers, obviously, in universities, and lawyers, student lawyers and maybe law. It's fine. But there are some times it's like, okay, did someone write this or not? And there has to be. It'd be nice if, like, we could know the answer.
Tracy Alloway
Well, the other thing that's starting to happen is, have you seen any books out there that actually come with a disclosure or disclaimer that say, this book has been written only by humans? No AI used at all. I saw that for the first time on a book that we actually read for an Odd Lots episode. I don't think it's come out yet, but that kind of threw me.
Joe Weisenthal
Yeah, no, it's more and more. Anyway, as we enter a world in which the vast majority, if not already, of words written are written by AI, there's going to be interest in this question of whether we know. Anyway, there's this company called Pangram Labs, and they have a little thing, and you can pay for it, but there's also a free service where you can drop, like, a text in and it'll say the odds that it was written by a human or AI. And I'm pretty impressed by it. I, like, did some samples of my own writing and then AI outputs. It got 'em all right. But then I went further. Like, I tried to stump it. So what I did was I took a piece of AI writing and then I had it translated into Chinese.
Tracy Alloway
Okay.
Joe Weisenthal
And then I had it translate that into high Chinese. So it's like, okay, imagine this is being written by a more formal register. And then I had that translated into Hebrew, and then I had that translated into English. So the original thing, through this series of AI telephone, through various translations, and then I put that output back into Pangram and it got that right. It said it was AI. So even after a series of sort of transformations designed to obfuscate the original style of the piece to see if, you know, eventually it would emerge in something else. So I was pretty impressed. It seems to work. And, you know, I think that's interesting for a couple reasons, which is maybe there is something that you can just tell. But two, it sort of worries me because, you know, there have been articles and they'll say like, this is written by AI. And I think one of my big fears would be that I write something. I like to use an em dash. I've always been an em dash fan.
Tracy Alloway
I love em dashes. That's how people talk. I'm sorry.
Joe Weisenthal
And then what if it says you wrote this by AI? And I'm like, I didn't. And then here is this black box that is suddenly a judge, jury and executioner for my career, potentially. You wrote this via AI, the lab says, so you are now done. Like, that worries me. So I think this raises a lot of very interesting questions about these model detection things, and I want to learn more about how it works.
Tracy Alloway
Well, there's also a lot of philosophical questions about just what we value in writing. True as well. No one's going to yell at you for using spellcheck or something like that, right? Like, it's kind of crazy to think that reputational risk is going to hinge on whether or not you might have used a platform, a chat platform, to, like, do some basic copy editing.
Joe Weisenthal
Totally. Well, very happy to say, we do in fact have the perfect guest. We're going to be speaking with Max Spiro. He is the founder and CEO of Pangram Labs, and he can answer all of our questions. So, Max, thank you so much for coming on Odd Lots.
Max Spiro
Thanks for having me.
Joe Weisenthal
How do you know it's right? So someone puts in a piece of text, and we'll get into the method in a second. But someone puts in a piece of text and it says human AI. What makes you believe that? You have a very good track record on this question.
Max Spiro
So when we started Pangram, we started by doing this thing we call a human baseline, which is how well can we, as a human, predict whether something's AI or not? That's the first step at learning is this problem tractable? How hard or easy is it? And I found, me personally, I was able to get about 90% accuracy. And so we figured an AI model should be able to do much better than that.
Tracy Alloway
So I have a bunch of methodology questions which we can get into. But just before we get into any of that, why is AI slop bad, in your opinion? Why does it need to be tracked and identified?
Max Spiro
I think the problem is it's just so easy to generate, and so it's very difficult to know what is the intent behind it. Basically, right now, I think we're actually pretty lucky. We live in a world where the signal to noise ratio on the Internet and in our information channels is pretty high. We have pretty high signal to noise, but any bad actor can come in and just flood our information channels with AI slop. That looks legitimate. It looks like somebody put actual effort and thought into it, but really it was just like a single prompt, which could have also been automated.
Joe Weisenthal
This is something that I think about a lot, which is that there was a point in time, and maybe still is the point in time, where if you read something that was grammatically correct or the punctuation was strong or the spelling was strong, there was reason to think that the person who wrote it was a person of certain seriousness and a certain intelligence behind it. And I think that the issue that you're identifying is that that link is now being severed so that we can't use these heuristics anymore, such as the strict quality of the prose, to know, in fact, whether this was published by someone who was, like a serious actor, intelligent or not.
Tracy Alloway
And now you have people inserting typos into their comics.
Safeway/Albertsons Advertisement Voice
I know.
Joe Weisenthal
Well, that too, to prove that they
Tracy Alloway
are, in fact, established.
Joe Weisenthal
But, wait, sorry, just to go back to my original question. So you mentioned, okay, you were able to get it 90% right, but now it's been used a lot more, and you have people paying for the software, presumably teachers and journalists, et cetera. Given all of that, getting from 90% to 100, I mean, getting one out of 10 wrong is clearly an unacceptable error rate for a piece of commercial software that could call someone an AI creator. So you have to do a lot better than 90%. Talk to us about, like, what you've seen so far in your data since releasing it as commercial software that makes you believe the software is doing a correct job of allocating between the two categories.
Max Spiro
So. So we've built really comprehensive evals. And so our evaluations. There's two kinds of errors. There's a false positive, which is when something is written by a human, and then we say that it's written by an AI. And there's a false negative, which is if it was AI written and we don't catch it. And so we track our numbers for both of these. And for human writing, we're actually pretty fortunate. We have millions and millions of samples, so we can get a false positive number that we have a very high degree of confidence in. And our number right now is about 1 in 10,000. Okay, so if we scan 10,000 documents, on average, one will come back as AI when it was actually human.
Joe Weisenthal
And what about in the other direction?
Max Spiro
False negative. I would say around 99% accuracy. So around 1% false negative rate. I think this depends a little bit more on how adversarial the prompting is, how much they're trying to.
Joe Weisenthal
What I did exactly. Send it through multiple filtrations to obfuscate the original output. That would be an example of adversarial prompting.
Max Spiro
Exactly. But in, like, the general case where we're just looking at straight outputs from AI, it's above 99%.
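[Editor's illustration] To put the two error rates just quoted into concrete terms, here is a small arithmetic sketch. The rates are the ones stated in the conversation (about 1 in 10,000 false positives, about 1% false negatives). The batch size, the share of AI-written documents, and the function name `expected_errors` are assumptions made up for the example, not figures or code from Pangram.

```python
def expected_errors(n_docs, ai_share, fpr=1 / 10_000, fnr=0.01):
    """Expected error counts for a batch of scanned documents.

    fpr and fnr default to the rates quoted in the conversation;
    n_docs and ai_share are illustrative inputs, not real data.
    """
    n_ai = n_docs * ai_share
    n_human = n_docs - n_ai
    false_pos = n_human * fpr      # human-written, flagged as AI
    false_neg = n_ai * fnr         # AI-written, not caught
    true_pos = n_ai - false_neg    # AI-written, correctly flagged
    flagged = true_pos + false_pos
    # Share of flagged documents that really are AI-written.
    precision = true_pos / flagged if flagged else float("nan")
    return {
        "false_pos": false_pos,
        "false_neg": false_neg,
        "precision": precision,
    }


if __name__ == "__main__":
    print(expected_errors(n_docs=100_000, ai_share=0.10))
```

Under these assumed inputs (100,000 documents, 10% of them AI-written), the expected counts are roughly 9 human documents wrongly flagged and 100 AI documents missed. The point of the sketch is that the two rates apply to different populations, so the mix of human and AI text changes how many errors of each kind show up.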
Joe Weisenthal
Okay.
Tracy Alloway
Okay. So what is your model looking for exactly, when it's evaluating a text? Because, as we mentioned in the intro, you know, syntax and grammar tend to be pretty good on AI generated copy. The style is sometimes more of an identifier, I would argue. To your point, Joe, like, sometimes it reads very saccharine and kind of overly earnest in some ways. So what exactly are you focusing on here? What are the tells?
Max Spiro
Yeah, so the style and the word choices are definitely part of it. But I think what a lot of people don't realize is they're actually making a lot of decisions when they write a piece of text. So every. There's, you know, dozens or hundreds of ways to phrase every single phrase. And over the course of 50 or 100 or 200 words, you're making thousands of decisions, actually. And so what we're doing is we're learning the patterns and how these frontier models make these decisions. And if the vast majority of these decisions line up with how the frontier models are doing it, then it's vanishingly unlikely that this was written by a human. You would have to just happen to make the same exact decisions that the LLM does hundreds of times.
Tracy Alloway
Interesting.
Joe Weisenthal
Okay, but this is a really important point. So everyone at this point has some feel for, like, the em dash tell, right? But my understanding is it's not like you go in and hard code, if you see a bunch of em dashes. This is the thing: these decisions, in many cases, I imagine neither you nor the model itself can articulate in English what the decisions are. All you know is that the decision pattern exists. Is this correct?
Max Spiro
This is correct.
Joe Weisenthal
Okay, can you explain? So therefore, what does it mean that your model has learned these decision patterns?
Max Spiro
So what we're doing on the very broad scale is we're training a deep learning model. So it's a pretty big black box, but it has the base model of a language model. And then instead of predicting the next token, it's predicting whether the text is AI or not.
Joe Weisenthal
Okay.
Max Spiro
And how we train it is we train on tens of millions of examples. So it sees millions and millions of human examples. And for each human example, we also show it an AI example. So, for example, let's say one of these is a 5 star review for Denny's that's 78 words long. Then we'll ask an AI to write a 5 star review about Denny's that's 78 words Long in the style of the first one. And obviously these two will be different. And so our model is able to learn through contrast what is the difference between these two?
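[Editor's illustration] The mirrored-pair idea described here can be sketched in a few lines. This is a minimal sketch under stated assumptions, not Pangram's pipeline: the function names, the prompt wording, and the `generate` callable (a stand-in for whatever LLM call produces the synthetic text) are all hypothetical.

```python
def build_mirror_prompt(human_text, kind, topic):
    """Build a prompt asking for a synthetic twin of a human example.

    The twin matches the original's kind, topic and word count, so the
    main remaining difference is who (or what) wrote it.
    """
    n_words = len(human_text.split())
    return (
        f"Write a {kind} about {topic} that is {n_words} words long, "
        f"in the style of the following example:\n\n{human_text}"
    )


def make_training_pair(human_text, kind, topic, generate):
    """Return two labeled examples: (text, 0) human and (text, 1) AI.

    `generate` is a caller-supplied function prompt -> text; it is a
    placeholder for a real model call and is an assumption of this sketch.
    """
    ai_text = generate(build_mirror_prompt(human_text, kind, topic))
    return [(human_text, 0), (ai_text, 1)]


if __name__ == "__main__":
    fake_llm = lambda prompt: "Synthetic review text would go here."
    review = "Great pancakes and friendly staff, would come back again."
    for text, label in make_training_pair(review, "5 star review", "Denny's", fake_llm):
        print(label, text)
```

Pairing each human example with a synthetic twin of the same kind, topic and length means a classifier trained on the pairs cannot lean on subject matter or length and has to pick up on how the text is written.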
Joe Weisenthal
And the important thing, sorry, just to be clear here, is that you and I might not be able to articulate the difference. There will be some difference in maybe the sentence length, there will be some difference in word choice, there'll be some difference in punctuation, syntax, whatever, but you and I wouldn't obviously spot it. However, after millions of examples of these side by sides, the model learns what the difference is.
Max Spiro
Exactly. I think the best that a human can do is look for some of these really obvious tells. Like ChatGPT loves that it's not just X, it's Y framing. Earlier models really liked some specific words like tapestry and intricate and delve.
Joe Weisenthal
Yeah, delve, tapestry.
Max Spiro
Yeah. But I think by training Pangram we're able to go much deeper than this and look deeper than the high-level signs, the, like, document-level signs.
Tracy Alloway
so one thing this kind of reminds me of, and I'm thinking how to phrase this, but it reminds me of, you know those exercises people used to do where you would take a bunch of different faces and meld them all together and come up with like one face that was supposedly attractive. So like, to what extent is this basically a distributional detector in the sense that you're looking for certain paths that you think AI would choose? And I guess could you get a false positive just from someone who's choosing the average of the average of the average in a way to state a particular sentence.
Max Spiro
Maybe. There's a reason our false positive rate is 1 in 10,000 and not zero. It's because sometimes we look at the false positive and it's like, oh, it reads exactly like an AI generated review or essay, except that it was written in 2019. So it's probably a human who just happened to find the exact mode collapsed type of way that problems, right? Yeah, I would say. Yeah, I think it's a good way to think about the distribution of writing, or writing as a distribution where there's the space of all human writing and then AI writing is really just like a small point within this space. It's very. No matter how much you prompt it, it doesn't go that far from where it was trained to be.
Tracy Alloway
Yeah, okay.
Joe Weisenthal
What's the black box? So I built a little model myself. I built this thing that detects you can upload text and it says whether it's more resemblance of the written word or the spoken word.
Max Spiro
Oh, I saw that.
Joe Weisenthal
Yeah, yeah. And I used BERT, which is, like, one of these things, an open source one from Google. What is the core model that you trained on? Or is it something? Or did you build it yourself? Talk to us about that.
Max Spiro
Our very first model was actually built on BERT. But for future models, we needed to up our capacity.
Joe Weisenthal
Oh, come, we'll explain that.
Max Spiro
So basically we were running into capacity limits with our model. It was capping out at a certain false positive, false negative rate. It wasn't learning the deeper signals. So we had to 10x and then 100x the parameter count. So that can learn really deeply, like how these frontier models write.
Tracy Alloway
Have you noticed any interesting differences between how the models write? And actually, is your model trained to identify different models as well as whether or not this is just broadly AI generated?
Max Spiro
So we don't specifically train it on different models. We don't say like, hey, this one is Claude 3 and this one is Chat, or GPT-5. What we've done, we've done some interpretability work to look at basically the output embeddings of the model. And we find that it actually learns which model the text came from. So you could see little clusters. This is the Claude cluster and all of the Claudes cluster around here. And then these are the DeepSeek and Qwen, and then this is ChatGPT, and they all kind of cluster into different spaces in embedding space. So clearly the model is able to learn what the difference is between these frontier models.
Tracy Alloway
Actually, since you mentioned Qwen, I'm very interested. Is there anything distinct in terms of how Qwen generates text versus platforms that have been developed in the US?
Max Spiro
I think Qwen is unique because it's trained on a lot more Chinese and multilingual tokens than other models. So I've heard from Chinese friends that it's much better at being conversationally fluent in Chinese. Beyond that, I don't know that I can tell. It would be hard for me to look at a text and say, like, I know that's Qwen. But I think somebody who's more familiar with it might be able to.
Tracy Alloway
Huh.
Joe Weisenthal
Let's talk about sort of some of the philosophical or societal implications of this work. Have you had anyone whose text has been judged to be AI written by Pangram? And they're like, I swear to God, this isn't. And they, like, really insist, and what do you think about this situation? What do you do or talk to us about that?
Max Spiro
I've had a couple times this happened. There have been times where I genuinely believe that this is just a false positive. We scanned hundreds of millions of documents, so at a certain scale, this will happen. But I also get people who, all the time, they're just like, AI detectors don't work. It's a total fraud. And then whatever they're putting out on LinkedIn is just 100% AI generated, and they're just mad that they're getting called out. And then you look back farther into their past and their history, everything they're putting out is AI generated until about 2023 for everyone. If you look historically, there's a lot of slop accounts that are putting out total slop. And you can tell either they weren't posting as much before, and if you scan back in time, then you see that they were writing human text at some point.
Joe Weisenthal
So there's a number of accounts out there that basically right around the beginning of 2023, where if you scan the entire corpus of their work, it very clearly shows a switch right around early 2023.
Max Spiro
Yeah, it really depends on the account. I think one thing we saw that was interesting was there is a writer for the Guardian that was covering the Winter Olympics and somebody was like, hey, this article is total AI slop. Ran it through Pangram. It was AI. The Guardian was like, no, of course our writers don't use AI. So we scanned this single writer's history and we found that they really did start picking up AI mid to late 2024 and were using it more and more in their articles.
Tracy Alloway
I mean, just to play devil's advocate for a second, does intent matter when it comes to identifying AI slop in the sense that, okay, I get you can have a bad actor who's maybe trying to influence how people feel about a particular topic, and maybe they've created a bunch of bots on Twitter X and they're using AI to just flood the zone with a bunch of AI slop supporting their particular viewpoints. On the other hand, if you're a journalist and your business is to write, you know, like basic understandable copy about a news topic. Just to be clear, I'm not advocating this at all, but that intent is very different to, I'm going to try to influence something by just, you know, sheer volume.
Max Spiro
Yeah, I mean, definitely. These are like one is a lot more severe than the other. But I think at the same time, if you're a journalist and you're using AI to basically shirk your work and not do your work, I think that's also a problem. And I think it's a reputational risk to the outlet because people can tell and people are going to call you out and there's a lot of people who don't want to read AI slop kind of regardless of where it's from.
Joe Weisenthal
Yeah, this is definitely true. Are you ever going to run out of human material to train on? Right. Like, you could be pretty confident that if you find some piece of text that was published on the Internet prior to 2023, but certainly prior to like 2019 or something like that, you can be extremely sure that this was human generated. Do you worry that in the future that it's going to be harder to even establish the provenance of your training data?
Max Spiro
Yeah, it's definitely a concern for us.
Joe Weisenthal
Talk to us about how you're thinking about that.
Max Spiro
So we have a near infinite data reservoir of pre-2023 data. There's just more than enough for us to train on for a long, long time. But part of the problem is we also want to train on modern text. We want to. There's all this talk about if somebody's writing about LLMs or about AI. We don't want to incorrectly flag that as AI because our training data has no sense of this topic. So I think we're looking at different ways to do this, but most of them are just figuring out who is a trusted actor, who do we know is putting out human written content? And we could use our model for that to some degree. And then so we have known actors, we know they're putting out human written content and then we could use their data as well.
Tracy Alloway
Slightly random question, but using your model, are you able to quantify like what percentage of the Internet at the moment is AI slop?
Max Spiro
It's about 40%.
Joe Weisenthal
Wow. Based on what? You just. How'd you get that number?
Max Spiro
So a lot of the Internet is just like SEO written articles and yeah, it's articles written for search, basically so that your website comes up more often in search because it's targeting certain keywords and a lot of that industry has switched over to using AI because then instead of having to pay writers, you could churn out articles for pennies on the dollar. But I think that kind of results in a lot of the Internet being AI written. It's a little bit. It's also kind of platform dependent. It's about 40% from an Internet page perspective. About a year and a half ago, we looked at Medium and found that over 50% of newly written Medium articles were AI generated, which was a crazy high number.
Joe Weisenthal
What about Reddit, Reddit?
Max Spiro
It was 7% a year ago, I believe a little over 10% today.
Tracy Alloway
Wait, actually this reminds me. So I'm on Reddit a lot and I really enjoy it nowadays as a platform, but I do worry about how much of it is being generated by AI. And the thing I don't necessarily understand is what are the economic incentives to actually write a bunch of AI generated posts on Reddit and get upvoted? Like, why does that system or motivation even exist?
Max Spiro
So there are startups, I'm not going to name names because I don't want to promote them, but they will sell a promise to companies that we're going to get you organic mentions on Reddit. We're going to run our AI bots that seem organic and they're just going to naturally recommend your product or just mention your product in the comments or in a post. And so I've seen evidence of this. We can find these. They're basically bot farms that are mostly engaging seemingly organically, just like doing a short reply and then sometimes they're doing this brand mention. And so that's why these posts are very valuable.
Tracy Alloway
That's really interesting.
Joe Weisenthal
I have to also imagine it's valuable because all of the models train on Reddit. Right. And if you want your product's name to appear in model output, it's like, what is the best nose hair trimmer? Or whatever. And there's a bunch of bots that on Reddit talked about this nose hair trimmer and then that's probably more likely to show up in a ChatGPT request, right? Yeah, yeah.
Max Spiro
It's been weirdly gamed. You know, you used to just Google best nose hair trimmer and now there's like a thousand articles.
Tracy Alloway
Well, the Reddit search results like show up first nowadays. Yeah, that's where people are looking.
Max Spiro
Yeah. And then people started searching best nose trimmer Reddit to get the Reddit comments on it. And now it's. People have realized that that's what people are searching for. So you need to populate Reddit with your advertisements.
Joe Weisenthal
I'm on the Men's Health.
Tracy Alloway
Are you looking for nose hair trimmers?
Joe Weisenthal
The Panasonic Ear and Nose Hair Trimmer is the number one choice on Men's Health pros. Easy to hold, anyway. It's not.
Max Spiro
Yeah, it's all these affiliate links just destroyed the Internet.
Joe Weisenthal
I know, It's. It's really too bad, but whatever. Talk to us more about the whole pipeline. So I was. I'm very fascinated by this idea. It's like, okay, you see this review for Denny's. You have the AI model, try to replicate it as best as it could, and there'll be these subtle differences. Talk to us, though, about, like, the whole pipeline. What are the other tests that you're using to get the true. You know, because what I imagine you're trying to do is get the most similar data sets with an almost imperceptible difference to really stress test.
Max Spiro
Absolutely.
Joe Weisenthal
So talk to us, really, about this whole pipeline.
Max Spiro
Yeah. So what we're really trying to do here is we're.
Joe Weisenthal
As a model maker, myself, I'm trying. No, no, sorry. Keep going.
Max Spiro
Yeah, as an AI expert.
Joe Weisenthal
Yeah, yeah, as an AI expert, I need to hear some tips of the field.
Max Spiro
Yeah. So what we're really looking for is examples that are as close to the boundary between human and AI as possible so that our model learns better. Something that's very obviously AI is our model's not learning as much. Same thing for something that's obviously human. And so step one is creating this dataset with synthetic mirrors of human examples. And then we train a model, and then step two is something called active learning. So we then take this model and use it to scan a much larger corpus of data and look for errors, false positives, false negatives. And then we pull those back into our training set and are able to train a much better model because it's seen these errors, and these errors, we believe, are just much closer to the boundary between human and AI.
Joe Weisenthal
So, wait, sorry, just to be clear, the first pass is like, okay, you have known human writing and known AI writing. You train a model, and then the next pass is, once again, known human and known AI writing. So you already know the answer of each of these, and therefore you could come up with a list of which you got wrong, and then that gets fed back into the first version.
Max Spiro
Exactly.
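[Editor's illustration] The two-step loop just described (train, scan a larger labeled corpus for mistakes, feed the mistakes back, retrain) can be sketched with a deliberately tiny stand-in model. Everything here is an assumption made for the example: each "document" is reduced to a single number, the "model" is just a cutoff between the two class averages, and the function names are hypothetical. A toy like this is not guaranteed to improve on every dataset; it only shows the shape of the loop.

```python
def train_threshold(examples):
    """Toy 'model': a cutoff halfway between the two class means.

    examples is a list of (feature, label) with label 0 = human, 1 = AI.
    """
    human = [x for x, y in examples if y == 0]
    ai = [x for x, y in examples if y == 1]
    cutoff = (sum(human) / len(human) + sum(ai) / len(ai)) / 2
    return lambda x: 1 if x > cutoff else 0


def mine_errors(predict, pool):
    """Scan a labeled pool and keep the examples the model gets wrong."""
    return [(x, y) for x, y in pool if predict(x) != y]


def active_learning(seed, pool, rounds=3):
    """Train, collect errors from the pool, add them, retrain."""
    train_set = list(seed)
    predict = train_threshold(train_set)
    for _ in range(rounds):
        errors = [e for e in mine_errors(predict, pool) if e not in train_set]
        if not errors:
            break  # nothing new to learn from this pool
        train_set.extend(errors)
        predict = train_threshold(train_set)
    return predict, train_set


if __name__ == "__main__":
    seed = [(0.0, 0), (1.0, 1)]
    pool = [(0.6, 0), (0.7, 1), (0.2, 0), (0.9, 1)]
    predict, train_set = active_learning(seed, pool)
    print(len(train_set), mine_errors(predict, pool))
```

The examples that get pulled back in are, by construction, the ones the current model misjudges, which is why they tend to sit near the boundary between the two classes.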
Joe Weisenthal
Got it.
Max Spiro
And so once we retrain, the model gets much, much better. And then we could do this as many times as we want, to kind of just have a self-improving model that gets better with every training run. I can also go a little bit more into how we deal with AI edits, because I think that's an increasingly important problem. Like, I think most writing will be AI assisted in the future. I think it's already in Google Docs and it's in my Google keyboard.
Tracy Alloway
Grammarly arguably has been doing this for a while.
Max Spiro
Exactly.
Joe Weisenthal
Yeah.
Max Spiro
Grammarly uses LLMs on the backend. And we don't want to just say, like, all writing is AI now. We want to be able to differentiate between AI assisted and AI generated. So what we do is we also have different prompts. So for our human review of Denny's, rather than saying generate a review like this, we could say help improve this, make it more formal, clean up the grammar. And so we have a long list of AI editing prompts, and then we're able to look at basically the cosine difference, the distance between the original human text and the edited one.
Joe Weisenthal
The distance in hyper multidimensional space.
Max Spiro
Exactly. So how much did AI change this text? And then we're able to train our model to say we're just going to put a point on this distance and say this is moderate AI assistance, this is light AI assistance and this is heavy AI assistance.
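[Editor's illustration] The idea of measuring how far an edit moved a text, then bucketing that distance into light, moderate and heavy assistance, can be sketched as follows. This is a rough sketch under loud assumptions: it uses simple word-count vectors where a real system would presumably use learned embeddings, and the cutoffs 0.15 and 0.5 and the function names are invented for the example.

```python
import math
from collections import Counter


def cosine_distance(text_a, text_b):
    """1 - cosine similarity between word-count vectors of two texts."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty text as maximally different
    return 1.0 - dot / (norm_a * norm_b)


def assistance_level(original, edited, light=0.15, heavy=0.5):
    """Bucket the edit distance; the cutoffs are made-up placeholders."""
    d = cosine_distance(original, edited)
    if d < light:
        return "light"
    if d < heavy:
        return "moderate"
    return "heavy"


if __name__ == "__main__":
    original = "the food was good"
    print(assistance_level(original, "the food was good"))
    print(assistance_level(original, "the food was truly excellent"))
    print(assistance_level(original, "an unforgettable culinary tapestry"))
```

Labels produced this way could then serve as training targets, which matches the description of putting a point on the distance and calling it light, moderate or heavy assistance.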
Tracy Alloway
Interesting. I'm going to do something I don't think I've ever done before, which is ask a founder about their corporate mission. But you've set up this company and when you think about what you're trying to do here, is it just basic AI detection in the sense that there might be a few groups of people, like teachers that find this very valuable? Or is the mission something broader where you're actually trying to improve the Internet and what people see on it?
Max Spiro
I believe the technology of being able to detect AI generated content is immensely valuable. And it's valuable not just for teachers, but for basically everybody in every profession: lawyers, publishers, just an individual who consumes content on the Internet. I think it's valuable for all these people. But ultimately, yeah, our high level goal is to help mitigate some of these negative effects of growing AI content.
Tracy Alloway
But for instance, just using the product review example is the vision that, like a Yelp, for instance, would want to use this technology to make sure that its system isn't being gamed? Or is the vision like, if I am a particularly diligent consumer who has a lot of time on my hands and I'm looking to go out to a restaurant, I can run all these individual restaurant reviews through Pangram and then actually figure out if it's real hype or not.
Max Spiro
So I think right now it's a lot of the former. We work with platforms. One of our biggest customers is Quora, and they run a bunch of content through Pangram. But we have a lot of different platforms that use Pangram to help moderate and find AI bad actors and get them off their platform. But I also think, yeah, the individual consumer case has been growing a lot, and we're really interested in pushing here.
Joe Weisenthal
The free version of Pangram.com, like you get a handful of tests a day or something like that. If someone had an unlimited number of Pangram responses and maybe had access to the Pangram API at infinite scale, could they theoretically learn a prompt that they would then be able to put into an AI to generate human style writing?
Max Spiro
I actually had a friend do that. He put his Claude Code on a loop. I gave him some API credits and then his Claude Code just basically worked overnight, writing a prompt, trying to get it to output something that's human written, or that came back from Pangram as human written. It got there, but the text was pretty incoherent. So yeah, it was producing more or less long gibberish. It was grammatically incorrect. A lot of the words just didn't really make sense.
Joe Weisenthal
Because this was my first thought when I saw it. I was like, that would be a fun experiment to see if you could take all the outputs, find the difference, and just keep iterating on the prompt you would have to give the AI in order to eventually get an output that looked to Pangram like it was human generated.
Max Spiro
Yeah, I think there's a way to do it if you also had like an LLM judge on coherency and use like Pangram and the coherency judge both to score your text. I think this is definitely possible and I'm excited for someone to try to do it because we could make our model a lot better and more robust if this existed.
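The two-judge idea mentioned here, scoring each candidate text with both a detector and a coherence judge, can be sketched like this. Everything below is hypothetical: `human_score` and `coherence_score` are placeholder callables returning values between 0 and 1, the toy lookup tables replace real API calls, and `min_coherence` is an invented cutoff.

```python
def combined_score(text, human_score, coherence_score, min_coherence=0.5):
    """Reward text that looks human, but only if it also stays coherent."""
    coherence = coherence_score(text)
    if coherence < min_coherence:
        return 0.0  # gibberish gets no credit, however "human" it looks
    return human_score(text) * coherence

def best_candidate(candidates, human_score, coherence_score):
    """Pick the candidate with the highest combined score."""
    return max(candidates,
               key=lambda t: combined_score(t, human_score, coherence_score))

# Toy stand-ins for the two judges (made-up numbers, not real outputs).
GIBBERISH = "glorp fnord pancake the the"
FLUENT = "The pancakes were fluffy and the coffee was hot."
TOY_HUMAN = {GIBBERISH: 0.99, FLUENT: 0.60}
TOY_COHERENCE = {GIBBERISH: 0.10, FLUENT: 0.90}

winner = best_candidate([GIBBERISH, FLUENT], TOY_HUMAN.get, TOY_COHERENCE.get)
print(winner)
```

With the detector alone, the gibberish candidate would win, which matches the overnight experiment described above. Adding the coherence gate changes which candidate the search favors.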
Tracy Alloway
Joe, I want to know what your personal, like, token budget is nowadays that you're even, like, contemplating some of this stuff.
Joe Weisenthal
You know what? I feel like I have the Claude Max plan, you know, and I don't work, like, when I'm at work, I don't work on any of my vibe coding projects. And, you know, like, when we were kids, I don't know if you remember, like, if you didn't eat all your food, like, someone would say, oh, there's like starving kids in the world.
Tracy Alloway
I'm like, oh, starving vibe coders that need the tokens.
Joe Weisenthal
It's like, oh, you didn't, like, I have this four-hour token window and I'm almost never maxing it out. And I'm just thinking, it's like there are kids on the other side of the world that wish they had your tokens, and you're not using all of your tokens for the window. How dare you. I feel a little guilty when I don't max out my Claude Max token program.
Max Spiro
I also have Claude Max and yeah, most days I'm not doing much coding at all. I'm not maxing it out. And then some days I'm going way over budget.
Joe Weisenthal
Guilty about that though. So can I ask you, like, writing is kind of interesting, but like, what are the prospects of this being able to work on, say, and you must get this a lot. Image and video generation, is it at all theoretically similar? Is there a reason to think that it will be replicable? Or is this just a different beast of a problem?
Max Spiro
I think the approach is definitely doable. I think some of the economics change, especially if we look at video and the cost of generating video today. Okay, we can't generate video at the same scale that we can generate text. And so we might need a kind of different approach. But I also believe that if we're able to solve this for image plus maybe audio, that could be enough to just solve it for video as well. Zero shot.
Tracy Alloway
Could you ever envision, I don't know, launching some sort of certification program for video? Because this seems to be, my dad's a boomer, spends a lot of time on Facebook. This seems to be what society needs, right? A video that comes with a little thing that says, this is not AI generated and someone has actually rubber-stamped that.
Max Spiro
So there's an organization called C2PA and I think they're doing pretty good work on content provenance. Basically, they are working with phone makers and hardware makers to basically embed hardware signatures to prove that image and video were truly taken from the hardware.
Tracy Alloway
Like watermarks, basically.
Max Spiro
Yeah, exactly. So rather than marking the AI outputs, we're instead embedding a proof of authenticity in the thing that's real and was captured in real life.
Tracy Alloway
That's interesting.
Joe Weisenthal
All right, so big picture, where's the Internet going? You mentioned 40% of the Internet is already AI generated. But maybe that's not the end of the world if it's just a bunch of SEO pages that I never read. I don't know, whatever. But give us some high-level thoughts about the trajectory of the Internet, regardless of the uptake of Pangram and other AI detection models.
Max Spiro
I'm a little bit worried about the state of the Internet, I'm going to be honest. I think right now so much of it is still built around trust and norms, in a way that we're not really well equipped to suddenly deal with an onslaught of bots at a completely different scale than we've dealt with before. There's maybe a good case and a bad case. I would say the bad case is the Internet goes the way of dead Internet theory. Just like every space that's open and accessible is just flooded by bots. And then the only place people are able to communicate authentically is in very walled garden closed servers, like Discord servers, for example, where everybody's identity is known and, you know, you don't have bots in there. So that's maybe the bad scenario.
Joe Weisenthal
Can I tell you an insane thought that I've had?
Tracy Alloway
Go on.
Joe Weisenthal
We're gonna get a kick out of this.
Tracy Alloway
Just so have you heard of.
Joe Weisenthal
I forget what they call this idea for the bad actors. It's called heaven mode or heaven banning. Have you heard of that? So there's this thought that one way you could deal with bad actors on the Internet is that suddenly they're on a version of, say, Twitter in which there are only bots, and everyone always agrees with them on everything, and it drives them crazy and stuff like that. And they would never know it, because they're like, oh, that's cool. And then it's, like, slowly, like, yeah. This is like, oh, you could punish people by putting them on an Internet where they will never get any fighting.
Max Spiro
You get Heaven banned and put into basically jail. You're talking to a bunch of people.
Joe Weisenthal
That's right. That's right. But you're heaven banned. But I thought, and again, this is, you know, like, I built this little AI model myself and I, like, showed it to my friends, and they're like, oh, it's really cool, Joe. I'm really impressed, like, that you're able to do this. And I was like, are people being honest with me? Have I been heaven banned? Because, like, you can be honest with me if it sucks. And I sort of have this fear.
Tracy Alloway
The biggest humble brag.
Joe Weisenthal
No, I'm sorry.
Tracy Alloway
I did this thing and everyone thought it was great.
Joe Weisenthal
I'm just saying, like, I'm worried that people are being nice to me because, like, oh, cool, yeah, that's impressive, you did that. And I have this, like, deep anxiety that people aren't giving it to me straight about it. I know that sounds like a humble brag, but it's really not.
Max Spiro
That's why you can never get, like, too successful. Like Kanye West, surrounded by a bunch of yes men.
Joe Weisenthal
He never gets any. Oh, this is his first try at doing something with vibe coding. I'm like, deeply anxious. Like, no, you can just tell me if it sucks. That's fine. That's my worry.
Tracy Alloway
Don't worry about this. If I tweet that I'm eating a steak, I will get like 100 people
Joe Weisenthal
criticizing that you didn't cook the meat.
Tracy Alloway
Yeah.
Joe Weisenthal
So that's the other thing, which is that the two things you are never allowed to tweet about are meat preparation and enjoying life. Because if you ever enjoy life and if you ever prepare meat, people will flip out at you on the Internet. Those are the two things that you are not allowed to do online.
Tracy Alloway
Very true. Sort of related question, but just going back to the methodology, if you're focused on this sort of like, path dependent idea, I'm kind of envisioning it as like a giant decision tree. Right. Is there a possibility that as the models get better and better, and we know that they're already injecting like, some degree of randomness into their output? Although I know there's going to be a pedant out there who like, messages me and says, like, well, you know, computers can't do, like, true randomness. But, you know, setting that aside, setting that aside, like, we know that they're adjusting. They're becoming more sophisticated at an incredible rate. We know that they're trying to adjust and inject some randomness in order to avoid exactly this kind of detection. Do you worry about their own adaptation at all?
Max Spiro
I have noticed that the models, as they get more capable, I believe their output distribution gets more complex. It's harder to learn with a simple model, which is why we've been increasing our model size to capture a higher complexity function that can capture the LLM outputs. So I think we may have to continue to make our models better. We're going to have to work to keep up with it. But we can't just rest on our laurels.
Joe Weisenthal
What are burstiness and perplexity?
Max Spiro
Yeah. So this is a metric that's used by some AI detectors, but not Pangram.
Joe Weisenthal
Okay.
Max Spiro
And so I can explain a bit about how it works. So perplexity is basically a measure of.
Joe Weisenthal
And this is not perplexity. AI the website. This is a technical term. Okay, good.
Max Spiro
This is a metric. This is a measure of how confusing a piece of text is to a language model. So basically, for every token we can calculate some perplexity, which is basically how expected is this. For example, if it's "I went home to my pet" and then the next token is "chinchilla," that would be a much higher perplexity token than "my pet dog." So LLM outputs tend to be low perplexity. They're not going to produce outputs that are surprising to themselves. So this is a decent way to get an AI detector that's around 90 to 95% accurate, but it has some problems. The main one is that you can't improve upon it. Basically it has false positives. Text written by non-native English speakers often is low perplexity, just because when you're learning, they don't
Joe Weisenthal
take as many risks. Exactly. Oh, interesting.
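The perplexity metric described above has a standard definition: take the probability a language model assigned to each token, average the negative log probabilities, and exponentiate. The sketch below applies that formula to the pet example. The probabilities are made up for illustration and do not come from any real model.

```python
import math

def surprisal(p):
    """How unexpected one token is, given the probability the model assigned it."""
    return -math.log(p)

def perplexity(token_probs):
    """exp of the average negative log probability across the tokens."""
    avg_nll = sum(surprisal(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# Hypothetical per-token probabilities for "I went home to my pet ___".
prefix = [0.2, 0.3, 0.4, 0.6, 0.5, 0.3]
with_dog = prefix + [0.4]           # "dog" is an expected continuation
with_chinchilla = prefix + [0.001]  # "chinchilla" is a surprising one

print(perplexity(with_dog) < perplexity(with_chinchilla))  # True
```

A detector built on this idea would flag text whose perplexity falls below some cutoff, which is also why cautious, predictable human writing can trigger false positives.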
Max Spiro
Yeah. So that's why a lot of the early AI detectors had a bunch of false positives with ESL speakers. It's because their text was low perplexity. So I think this is a very cool metric, but it is not the path for Pangram. Instead we went the deep learning approach, so we can do better.
Joe Weisenthal
What's burstiness? Is that just the opposite side of the coin?
Max Spiro
Yeah, burstiness is basically actually. Yeah, I don't know if I can define it.
Joe Weisenthal
Okay, fine. Yeah. Okay.
Tracy Alloway
Burstiness just sounds like one of those, like, sort of, I guess, manosphere terms, doesn't it? Like, oh yeah, he's been looksmaxxing with high burstiness or something like that.
Joe Weisenthal
Yeah, that's great.
Max Spiro
Yeah, I think it might just be like a measure of like sentence length and.
Joe Weisenthal
Got it.
Max Spiro
Like how the ups and downs of the text.
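No precise definition of burstiness is given in the conversation, so the following is only one plausible reading of "the ups and downs of the text": how much sentence length varies, expressed as the standard deviation of sentence lengths divided by their mean. Both the formula and the naive sentence splitter are assumptions made for this sketch.

```python
import re
from statistics import mean, pstdev

def sentence_lengths(text):
    """Word counts per sentence, splitting naively on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Coefficient of variation of sentence length (an assumed definition)."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # variation is undefined for a single sentence
    return pstdev(lengths) / mean(lengths)

flat = "One two three. Four five six. Seven eight nine."
varied = "No. I went to the store and bought many different things today. Why?"
print(burstiness(flat), burstiness(varied))
```

Under this reading, text with evenly sized sentences scores near zero and text that mixes very short and very long sentences scores higher.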
Tracy Alloway
If we assume that the world is collectively concerned about AI slop and wants to do something about it, what would be like the single biggest change to the system either in terms of the economics of the Internet or regulation or technology, what you're developing that would actually help reduce slop?
Max Spiro
Yeah, I think the biggest one is norms. So there have been a couple great blog posts written about how it is rude to send other people undisclosed AI outputs. And I completely agree here. I think if somebody asks a question on the Internet and then somebody else goes and puts it into ChatGPT and then pastes the answer, it's kind of rude. I was going here to ask the opinions of my friends or my followers, not ChatGPT. I could have done that myself. And so I think building this norm is something that, it's very new technology, so we need to do it quickly. But I think this would help a lot for society.
Joe Weisenthal
Well then actually this gets to a question that I have, which is I feel as though the major Internet platforms are actually moving in the exact opposite direction. I mean, I'm stunned. Maybe I accidentally clicked on something at some point, but the frequency with which I get an email and then I open it up to respond in Gmail and there's that ghost text there that's like, do you just want Gemini to respond to this? I've never done that. I think that would be extremely rude. I've never responded to any email with an AI response. But they're basically telling you to do that. They're doing the exact opposite. They're blowing up these norms. And so I'm curious from your perspective, you mentioned you work with Quora, but from your impression, do the major Internet platforms think this is a problem worth solving, or from their perspective is it like, you know what, the more content, the better?
Tracy Alloway
There's mixed incentives.
Joe Weisenthal
There's mixed incentives for the big company.
Max Spiro
It's funny because, like, Google seems to be playing both sides. So like, on one hand they had that advertisement which people kind of blew up about, where it's like, oh, children can now send their heroes notes on how much they respect them by using AI instead of writing the note themselves. And this is wrong. This is societally bad. But at the same time, they're working very hard to deal with the AI slop on the Internet in search results to make sure people get served real content and not AI slop content. So I think obviously there's a lot of incentives at play around product people who are incentivized to push AI because that is the corporate mandate. But yeah, I think overall, even in my sphere of a bunch of people who are AI researchers, the general consensus is that AI is a powerful tool, but slop is bad.
Tracy Alloway
This reminds me, my parents used to make me do these handmade greeting cards for Christmas for all of my relatives and stuff. And it was supposed to be a demonstration of my commitment to communicating with family. No, it traumatized me forever, and I hate greeting cards as a result of doing this, just spending hours manufacturing these things. But then secondly, the funniest thing was once we got E cards, my parents immediately switched to using E cards. And now, this is also the funniest thing, my dad uses E cards. He figured out that the E card system can tell him whether or not you opened it. So he just uses it as, like, day to day communication.
Joe Weisenthal
Now.
Max Spiro
That's so funny.
Joe Weisenthal
Just send an email to your daughter and do it via E Card.
Tracy Alloway
It's like, I noticed you haven't opened up my E Card for International Hot Dog Day. Please let me know what's going on.
Joe Weisenthal
I had terrible handwriting as a kid, and my mother made me write all of these handwritten notes to thank people for the gifts I got for my bar mitzvah. Yeah, I hated it. But you know what? I have deep connections with all of those people that have lasted over the years. And that miserable one week where I just wrote and I got, you know, hand cramps, I think it paid off.
Tracy Alloway
All right, well, imagine doing that for like 16 years, basically, in a never ending stream.
Joe Weisenthal
Max Spiro, thank you so much for coming on Odd Lots. That was a lot of fun. I'm fascinated by this conversation.
Max Spiro
Thanks so much for having me. Yeah, really exciting to talk about this, and I think slop is a growing problem, so hopefully we're able to deal with it.
Tracy Alloway
40% of the Internet, I can't tell if I'm surprised by that or not.
Joe Weisenthal
And what's it gonna be next year at this time?
Max Spiro
Oh, man, I don't know.
Joe Weisenthal
It'll be like, hard to say for sure.
Max Spiro
Yeah, almost certainly crazy.
Joe Weisenthal
All right, thanks for coming on Odd Lots.
Max Spiro
Thanks.
Joe Weisenthal
Tracy, I love that conversation. I just think it's like a really fun puzzle, right?
Tracy Alloway
No, totally.
Joe Weisenthal
It's very, like, it seems like a fun question to solve. And I'm fascinated by this idea of how, like, with both humans and AI, there is inevitably going to be this gap between what we know and what we can articulate. Because you and I both, setting aside AI versus text, there are things that we both know. For example, this is newsworthy, this is a good episode of a podcast, this is a credible-sounding guest, and this isn't. The gap between that and then being able to explain why, it's like, well, you just sort of know it. Right? You just sort of have this feeling.
Tracy Alloway
There's an intuition. Yeah.
Joe Weisenthal
And that intuition is built up from numerous examples, which is the same way, in a sense, that the AI is trained. It's like these things that you only know from patterns and you can see them without fully being able to articulate exactly what's going on.
Tracy Alloway
Well, the other question I would have on that is, is it even gonna matter in the long run? If you think about it, like, so much of the Internet is already built on bots and the sort of, like, false attention economy. Like, if our entire worldview becomes shaped by AI-driven drivel.
Joe Weisenthal
Yeah.
Tracy Alloway
Does it matter if, like, the economics of the Internet are still attached to individual bot accounts and things like that? I don't know if I'm explaining this.
Joe Weisenthal
No, no, I think it makes a lot of sense. And I do think, like, it is important. Like, we're gonna have to change the entire way we think. It's what Max said at the beginning, and I've thought about this, which is that it used to be that if you came across a piece of writing and the punctuation was excellent and the spelling was excellent and it was, like, cogent sounding, you're like, okay, this has been written by a smart person. I will take the content seriously. Right? And now there is this complete severance of sort of, like, craft and output. Because you could, and you can do this, like, ask Claude to write an argument in favor of the most absurd proposition imaginable. Ask Claude to write an argument for me that the reason why Reagan wanted to do tax cuts in the early 1980s related to these reports of UFO sightings in the 1970s. And it will write something that not only is grammatically correct, it'll actually, like, strain to come up with the best version of this argument. And again, prior to that, having read it, you'd think, like, oh, maybe this person took this argument seriously. But now this argument is just created ex nihilo. Okay, so we're gonna have to really, like, change our heuristics about this stuff.
Tracy Alloway
We've created an unlimited stream of basically, cranks with really good grammar.
Joe Weisenthal
Yeah, that's right. That's right. Because it used to be we knew the crank because they had bad grammar, or they would email us and, like, half the words would be in yellow and the other half would be underlined.
Tracy Alloway
Green ink was the classic example.
Joe Weisenthal
These are the tools that we used to just, like, oh, this person's a crank. They, like, you know, half the words are in all caps and stuff like that. Those don't work anymore.
Tracy Alloway
All right, on that note, shall we leave it there?
Joe Weisenthal
Let's leave it there.
Tracy Alloway
This has been another episode of the Odd Lots podcast. I'm Tracy Alloway. You can follow me at Tracy Alloway.
Joe Weisenthal
And I'm Joe Weisenthal. You can follow me at the Stalwart. Follow our guest, Max Spiro. He's at MaxSpero underscore. Follow our producers Carmen Rodriguez at CarmenArmen, Dashiell Bennett at dashbot and Kail Brooks at KailBrooks. And for more Odd Lots content, go to bloomberg.com/oddlots, where we have a daily newsletter and all of our episodes, and you can chat about all these topics 24/7 in our Discord, discord.gg/oddlots.
Tracy Alloway
And if you enjoy Odd Lots, if you like it when we talk about how the Internet is 40% slop, then please leave us a positive review on your favorite podcast platform. And remember, if you are a Bloomberg subscriber, you can listen to all of our episodes absolutely ad free. All you need to do is find the Bloomberg Channel on Apple Podcasts and follow the instructions there. Thanks for listening.
Date: April 2, 2026
Hosts: Joe Weisenthal & Tracy Alloway
Guest: Max Spiro (Founder & CEO, Pangram Labs)
This episode dives deep into the question: How can we tell if a piece of writing was generated by AI? With the rise of AI-generated content across the internet, Joe and Tracy welcome Max Spiro, founder of Pangram Labs, a company that builds AI-detection technology. The conversation traverses topics from methodology and AI "tells," to the implications for journalism, trust, and the very nature of the internet itself as more content is AI-written.
This episode underscores the message that AI-generated writing is pervasive, often indistinguishable from human work, and poses a challenge to how we evaluate, trust, and navigate information online. Technology like Pangram may provide a stopgap, but the long-term solution—according to the guest—lies in building social norms and expectations about content authenticity.
Quote to remember:
"We’ve created an unlimited stream of basically cranks with really good grammar."
— Tracy Alloway ([48:42])
(For further content, visit Odd Lots at Bloomberg.)