A
Welcome to the Family Tree Magazine podcast. This is the show from America's number one genealogy magazine. I'm Andrew Cook, editor of Family Tree Magazine. Later this month, on April 25, we celebrate DNA Day, the anniversary of researchers discovering DNA's double helix structure back in 1953. Over the past few decades, genealogists have been making their own discoveries with DNA too, with more than 50 million people having taken tests through companies like AncestryDNA, 23andMe and MyHeritage DNA. An innovator in his own right, my guest today, Jonny Perl, is the founder of DNA Painter, one of the leading third party tools for genetic genealogy. Welcome to the podcast, Jonny.
B
Thank you very much, Andrew. Thanks for having me on.
A
In your own words, what is DNA Painter and what tools does it offer users?
B
DNA Painter is a website available at dnapainter.com and I built it really for any genealogist who's working with DNA. So there are tools which are reasonably accessible, so visualizing your direct line, investigating what the relationship you might have with a DNA match is. And then there are some more complicated tools, perhaps for more advanced or DNA enthusiast people, if you like. So for solving family mysteries, there are tools which are used by a lot of people in the adoption community, for example, for finding unknown parents. And then there's chromosome mapping, which, I mean, it's not super advanced, but it's kind of geeky, if you like. So it's maybe not for absolutely everyone, but it's a very engaging way of creating a kind of DNA, a sort of genetic companion to your pedigree chart, if you like. So that's one of the main offerings too.
A
Yeah. And just to be clear, users don't upload their raw DNA to DNA Painter, right? Can you talk a little bit about how that works?
B
Sure, yeah. This is, I think, a misconception for a lot of people because if you're a genealogist and you're coming into DNA, you don't necessarily know how it works. You just know there's some magic somewhere. And so some people see the visuals which you make on my site and they just think, oh, that's great. If I just provide this file, it'll magically turn into that. And there are various reasons why that isn't the case. From a selfish point of view, obviously I don't want to be handling lots of people's genetic data from a security perspective, but actually, practically speaking, you don't really need that data necessarily. So if you and I, Andrew, found that we shared 82 centimorgans of DNA, then we can investigate that with just that number, 82, for example. And then if in fact we wanted to map that DNA in a chromosome map, then we would need a bit more data, but we wouldn't need the raw data. We would just need to know which chromosomes we matched on and what the start and end positions were, for example. So, yeah, no raw DNA. Although I have got a new feature that will potentially use it a little bit that I could talk to you about later. So I have to break my mantra of seven years or nine years, no raw DNA. I will be having an option to use a tiny bit of raw DNA potentially later on.
A
Yeah, that's a really important clarification. I think that you're not kind of giving away the whole store when you use your tool. It's really, really just the pieces of data that you want to work with. And they're data that the websites already give you.
B
Yeah, for sure. So the only exception to that prominently being the segment data. Unfortunately you can't get it at Ancestry, but luckily we have quite a lot of other big DNA databases that do give you that.
A
Now you mentioned that shared centimorgan value. I know one especially valuable tool at the site, something you worked on with Blaine Bettinger, is the Shared cM (centimorgan) Project. And that shows the possible relationships based on how much DNA you share with a match. In layman's terms, how does shared DNA correlate with genetic relationship? And why is it important that you consider not just one relationship possibility, but multiple?
B
That's a good question. Yeah. So we know that we get half of our DNA from each parent, right? That's kind of reasonably set in stone. We could argue about little tiny numbers here or there, and Y and X chromosomes, but it's pretty much half from mom and half from dad, right? But beyond that, it really varies. And for me, the best way of learning about this has been, practically speaking, that I've tested so many people in my family since, and among those people are lots of sets of siblings. So I can tell you that one of my nieces, she inherited 33 or 34% of her DNA from my mother and only 16 or 17% from my father, for example. So once you get beyond that first generation, there is a lot of randomness. So if you imagine going back a few generations, if that happens at every generation, there can be an enormous variance, right? So if you get to, say, a third cousin, you could find yourself sharing, say, 220 centimorgans or something like about 3%. Or you could share nothing at all. Yeah. So it stands to reason, if you just have a number representing the number of centimorgans you share with someone, you're going to need more than just that to figure out what the relationship is. Even if it's a parent child type thing, you know, maybe it's an identical twin. You always need something else. And luckily there's generally a lot of information around. You might have a picture of the person, you might have their location, you might have their age. But yeah, you've always got to bear in mind that genetic inheritance is inherently random and also that family trees aren't always as ordered and organized as you expect. There can be undocumented relationships, undocumented children, there can be half relationships. So it's very important to consider every possibility, as you say. Yeah.
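The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration and not anything taken from DNA Painter or the Shared cM Project: the genome total of 7,000 cM is an assumed round number (testing companies report somewhat different totals), and the function names are hypothetical.

```python
# Assumed round figure for the total comparable genome length in centimorgans.
# Testing companies use somewhat different totals, so this is illustrative only.
ASSUMED_TOTAL_CM = 7000.0


def percent_shared(shared_cm: float, total_cm: float = ASSUMED_TOTAL_CM) -> float:
    """Convert a shared centimorgan amount into a rough percentage of DNA shared."""
    return 100.0 * shared_cm / total_cm


def average_percent_from_ancestor(generations_back: int) -> float:
    """Average share inherited from one ancestor: it halves with each generation."""
    return 100.0 * 0.5 ** generations_back


if __name__ == "__main__":
    print(f"220 cM is roughly {percent_shared(220):.1f}% under the assumed total")
    print(f"Average from a parent: {average_percent_from_ancestor(1):.1f}%")
    print(f"Average from a grandparent: {average_percent_from_ancestor(2):.1f}%")
    # Approximate figures from the niece example in the conversation.
    from_grandmother, from_grandfather = 33.0, 17.0
    print(f"Grandparent pair total: {from_grandmother + from_grandfather:.1f}%")
```

The average from a grandparent is 25%, yet the niece in the example is far from that on both sides. The two amounts still add up to the 50% that came through one parent, which is why the randomness shows up in the split rather than in the total.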
A
And that variance compounds. It's not just from one generation to another. It's, okay, now you inherited less DNA from your parent than is statistically average, right? And so your children will inherit less DNA from them as well.
B
Exactly. Yeah. Yeah. There was one situation I made a note of that I saw earlier, where there's a second cousin once removed and my father shares nothing with him at all. But then there's a fifth cousin once removed my father shares 35 centimorgans with, for example. And that's just a kind of elegant, interesting proof of what we've just been discussing. Really. Yeah.
A
And the sites will give you a relationship estimate, but it's one of several options. They don't necessarily want you to know that. And they'll give a range, sometimes third to fifth cousin, or...
B
I don't think it's that they don't want you to know. I think they don't want to commit because they don't really know. And it's a tricky one because I completely understand where they're coming from. If you're making a match list, you don't want to say, well, it could be any of these 15 things. They kind of want to break it down for you so it's reasonably simple. They don't want to fill the page up with relationships. But what that leads some people to do is say, well, Ancestry says I've got this third cousin or fourth cousin. And actually, yeah, that's not the whole story. And the sites have really come on with that over the last few years. If you click on that amount shared, Ancestry, for example, will give you a list of probabilities and relationships. I mean, I don't always agree with it. I think they tend to still be a bit optimistic, as if they think, oh, if I tell him it's a fourth cousin, he'll be excited and he'll get other people to test. But MyHeritage also, theirs has already come on a lot. They've got probability tools in there. So, yeah, I think you do always need to look beyond the testing site and open your mind. But the testing sites have improved a lot, actually, on that front.
A
I know MyHeritage, too, has simplified the language they use. Instead of first cousin once removed, for example, it's the child of your cousin.
B
Yeah, sure.
A
The terms that people would actually use every day, they're good like that.
B
Yeah. It's funny for me, I think I flipped in my brain so many years ago that it almost confuses me more. I'd rather you just say second cousin twice removed. I know what you're talking about. But, yeah, it's great to make it more accessible because I know that for a lot of people it hasn't clicked yet. So.
A
Yeah, yeah. So if our listeners go to the Shared cM Project, they'll see a kind of family tree that shows the relationships. So there's you, the test taker, kind of in the center, and then a grid kind of structure that represents different relatives. And there's one number given for each relationship. That's the mean, the average. But beneath that is that range that we've been talking about. And so you can sort of expect to see anywhere between those values. And you really do, like you said, need more documentation to know, is it a first cousin once removed or a second cousin or a half. You know, there are lots of different possibilities.
B
There are. And there's something very important you should do if you're looking at one of those boxes, Andrew, and that is to click on it. Have you ever done that? If you click on it, you get a histogram, right? So that gives you the kind of distribution of what people said the amount shared for that relationship was. So if you imagine you're looking at second cousin and say you've got kind of a lowish amount, maybe you've got 50 centimorgans, you're kind of down the bottom, you can click on it and you can see in a flash, am I kind of still near the middle or am I way off to the side? Do I need to worry about this? Do I need to try and validate it and investigate it more? And I've been trying to get people to click on that box for years. So I'm just mentioning it now because it's important.
A
Yeah, no, I don't think I did know that. So that's really helpful.
B
There we go. You never came to my booth at RootsTech last year. I was telling everyone who listened all about that.
A
Yeah, well, that's why we like talking to the people who built the site, because that's exactly the kind of insight that people might miss and that, you know...
B
You really check on the relationship for more info.
A
And it should be said, too, that the data there was built using real relationships. It's not a purely statistical exercise. And this is aggregated.
B
Indeed. Yeah. So I think the current release has got 60,000 data points and when Dr. Bettinger has time, there will be a release that will have even more data points. And obviously, the more data we have, the better. I do occasionally get emails from people that say, oh, well, you know, I've got this second cousin and we share less. And I say, I'm not trying to say you don't. There can be outliers, but it's still a very valuable tool as it is, I think, just because it shows the spread. Inevitably, you know, there may be people who put in the wrong relationship or who put in the wrong amount, but there's enough data that it gets kind of broadened out quite well.
A
Good. And earlier you mentioned that sometimes when people see reports from your site, they get a little like, oh, overwhelmed. There's a lot of colors or a lot of different things to be looking at. And I think one benefit that DNA Painter has is that it does give a deeper level of analysis, a different kind of analysis. How would you suggest that someone new to DNA get started on your site?
B
Yeah, that's a good question. Yeah, the site has grown organically, frankly. I built an application for chromosome mapping because that was my passion back in 2017, still is. And then I ended up building the Shared cM tool. And then more things came along and I think, yeah, it can be overwhelming, but also you can be like, well, how would all these things really help me? So one piece of work I've managed to do recently is the orange button. So if you go to dnapainter.com there's an orange button that says, new to DNA Painter, start here. And I kind of try to sum up six of the main features on the site. And I do it in a specific way. I say, what is it, you know, what does it do? What can you get out of it? What do you need in order to use it? This is all as brief as possible. There are a few pictures, and what other resources are there? And then there's also a little heat map, if you like, that shows how relatively easy or advanced a tool is, because I think that's important for people to understand before they get started. Some people will hurtle into a tool that's been recommended to them. I'm thinking of a tool I have called What Are the Odds?, which is an incredibly popular tool for solving family mysteries. But you kind of have to know what you're dealing with. So that was labeled advanced, for example. Then I've got tools for visualizing your direct line. I've got the Shared cM tool, I've got the matrix tool, which is for visualizing the amount that a big group of people share with each other. And these are relatively accessible. I mean, of course they're slightly data oriented and geeky, you could argue, but they're much more accessible than those other tools. So yeah, I would recommend people click on that start here button and have a look at those little overlays to understand what the different features do and what you can get out of them. That would be a good first step.
A
Sounds very helpful. Just almost like a guided tour of the site. Here's what this does and why you would want it. Here's what this does and why you would want it.
B
That's my goal and I should say for years I've been trying to explain the site to people and I don't think I'm there yet, but I think I am getting better at it. So hopefully, hopefully I've made some strides.
A
We'll be sure to list that specific page in the show notes too so that our listeners can go and see that and kind of plan their course around the site and see, you know, what all there is to offer. The What Are the Odds? tool I have been recommended many times before, and what makes it so interesting for people who aren't familiar with it is that you can plot out different family tree arrangements based on shared DNA and the tool will sort of provide a score for how likely that is. Am I summarizing that correctly?
B
Yeah, pretty much is the answer, yes. What What Are the Odds? relies on is that you start with a known tree, right? So I can see two obvious scenarios. One might be you, Andrew, have a mystery DNA match and he's reasonably close or she's reasonably close and you're thinking, how do they fit in? So you take your known tree and you think, okay, well, where could this person fit in? Could this person have been their grandfather or father? Where could they have been? And the other is that actually you're trying to solve a brick wall, right? So you have a bunch of matches you find, and you can see there must be a common ancestor for you and these people, and you can see how they're all related to each other. So you start with their tree and you try and fit yourself into different places in their tree, and the tool will tell you how likely each of those places is. So what it relies on, actually, is some solid genealogy to start with, whether it's your tree or, more often, someone else's tree. And then, yeah, it'll help you consider whereabouts in that tree the DNA suggests that you would best fit, if you like. And the tricky thing for me here as a developer is I put these scores on the screen, Andrew, and I say, you know, 80% here, 90% there. And these scores can change because if you add another DNA match, it completely changes the calculation. So I have to be very measured in the way I describe what those scores do. Because I don't want people to say, this website, this guy in Britain made it, says that this is true. Because, you know, you're not outsourcing your brain to me. I'm providing a tool that helps you to see things clearly and do that thinking, if you know what I mean.
A
And even in relative terms, even if it's not 80% accurate or whatever, that it's still more likely than the options beneath it, which I think is helpful for helping gauge these different possibilities.
B
I think so, yeah. There's a certain amount of experience you need to use a tool like that sensibly, I think. But, yeah, I'm very, very glad to have built it. I collaborated with someone else on that. It was someone else's idea, to be clear, Leah Larkin. And we worked together going all the way back to 2017, and then I had a new version of it back in 2024. So it's been really, really interesting. It's been a great journey.
A
That takes us to 2017. When you started DNA Painter. That was sort of the height of DNA testing's popularity, where each month the companies were releasing these crazy numbers of how many new test takers they had. As someone who's been here throughout that whole kind of genetic genealogy wave, what changes have you noticed in the field since then and what trends are you seeing now?
B
I mean, an enormous number of changes, obviously. I think the main thing that was happening then, as you say, was that those databases were massively booming in numbers, weren't they? But actually I think that the sites didn't quite know what to do to make it easier for us to interpret matches. So in a sense that gave me a bit of a window to build DNA Painter because I was trying to make sense of it all myself. But yeah, we saw a lot of very important changes just around that time. Clustering became a really big thing. People were starting to talk about it around 2017 and then 2018, I think, was when Dana Leeds came out with the Leeds Method. And then people started to automate clustering and that was enormously powerful. There was a bit of a tussle about data because people like Ancestry didn't want bots going onto their website to gather data and there were kind of privacy issues, if you like. But what we've seen this year is a massive resurgence of that in 2025 and 2026. We've got Ancestry and 23andMe. They've kind of looked at the history of clustering and they've thought, well, what could be done better? And they've come out with incredibly useful tools that I think, I mean, if only I could make more time, I'd probably be on it absolutely all the time. The main innovation is that you can actually choose a specific tester. So if, Andrew, you were my third cousin, say, on a specific line and we shared 220 centimorgans, I could say, okay, give me clusters based on Andrew and it will actually help me explore just that line, which is incredibly powerful. Like I said, I could lose, you know, the next five years doing it probably if I wanted to. So yeah, that's the really big change I've seen. But yeah, we've had lots of different ways to help us organize our matches on the websites with color coded groups and that kind of thing. Clustering is a big one. It's absolutely massive.
A
And again, for listeners who might not know, clustering is identifying groups of related matches that sort of seem to be circling around a common ancestor, so that there are these patterns, these networks that you can start to piece together and visualize to try to understand. Okay, this group of my matches seem to be also related to each other, whereas these other ones are related to a different group of people.
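The basic idea of grouping matches who also match each other can be sketched as follows. This is a toy illustration only: it simply finds connected groups in a list of shared-match pairs, and it is not the Leeds Method or the algorithm used by any testing site. The function name and the sample names are hypothetical.

```python
from collections import defaultdict


def cluster_matches(shared_pairs):
    """Group matches into clusters where each person is linked to the rest
    of the group through a chain of shared matches (connected components)."""
    graph = defaultdict(set)
    for a, b in shared_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Walk outward from this person to collect everyone reachable.
        stack = [start]
        group = set()
        while stack:
            person = stack.pop()
            if person in seen:
                continue
            seen.add(person)
            group.add(person)
            stack.extend(graph[person] - seen)
        clusters.append(sorted(group))
    return clusters


if __name__ == "__main__":
    # Hypothetical pairs of matches who also appear on each other's match lists.
    pairs = [("Ann", "Bob"), ("Bob", "Cara"), ("Dev", "Eli")]
    for group in cluster_matches(pairs):
        print(group)
```

With the sample pairs, Ann, Bob and Cara fall into one group and Dev and Eli into another, which mirrors the idea of one set of matches pointing at one ancestral line and another set pointing somewhere else. Real tools typically also apply centimorgan thresholds and handle matches who belong to more than one group.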
B
Indeed. And it's kind of a delightful puzzle once you can get over that hump of being dazzled by this graphic which kind of moves around and you're like, what is this? How can it help me? When you can actually calmly say, okay, well, this group is definitely this line, and this group's definitely this line, this one I'm not sure, but, yeah, I think some of these people seem to be related to each other, then you can start to kind of crack away at it. It's just a different approach to the puzzle, which is highly interesting to me.
A
Do you find that test takers now seem to know more about how the technology works and what they can do with it, or do you feel. Are you still sort of answering a lot of the more basic questions when you're talking to people?
B
Oh, the basic questions definitely still come up. I guess the vast majority of the people who were obvious DNA testers, people who were obsessed with family history, had had their brick walls for decades. And they were like, well, you know, this is one more thing that can help. Most of them have now tested. The new testers are more of a trickle than they were. But, yeah, for those new testers, they're going through exactly the same loops that I was going through myself nine or ten years ago, Andrew. You know, they're thinking, well, what does this mean? And if that's that, why is this? And there are various aspects of DNA that are inherently confusing. So you see all these precise numbers and you think, well, you know, you think of people in a lab who are just going to be able to give you a list of scientifically proven facts, but everything's a bit more fuzzy and nuanced than you want it to be. And, yeah, people are still coming on and learning that. Obviously, a lot of the people I know have been doing it for almost as long as me or longer, so they know more than me.
A
We had Diahan Southard on an episode earlier this year, and she was talking about changes in AncestryDNA specifically, but she pointed out, what other product do you purchase that 10 years later, you're still getting updates for? You know, if I tested in 2016, a lot of the tools that Ancestry or MyHeritage roll out now also apply to me, even though, you know, I wouldn't necessarily have paid them anything since then.
B
It's very interesting. And I think they're still working out the business model, obviously, as well. We saw 23andMe. It would be reductive to say that they just didn't charge enough money because obviously all kinds of Things were going on, but now 23andMe have started up as a non profit. They're being reasonably hard nosed about their add on product. You have to pay for each test and I kind of understand that they want to keep working and making a good product. I do get it.
A
And Ancestry and MyHeritage, moving more tools behind some sort of subscription too. That's one way that they're kind of getting their investment back.
B
Yeah, I mean, I think it must have been a loss leader for the companies to start with and now they're maybe knuckling down and figuring out what the best way forward is. But exciting times. There's still lots of new things happening. We have MyHeritage and Family Tree DNA moving their autosomal tests to kind of next generation sequencing. So we don't even know what the implications of that will be. But it's potentially exciting. There's more information to be used, possibly more precise comparisons. It's exciting times.
A
Absolutely. So what's next in the pipeline for DNA Painter?
B
That's a good question. I have to decide. What I do normally when RootsTech happens is I've got it together and I release a big new thing that I can talk about. And this year, I don't know if it's the arrival of AI, but I started to kind of panic and try to do about five things at once and I didn't quite finish any of them. Why don't I talk about the thing I referred to earlier, which is the ability to load your raw DNA file in. Shock, horror. I'm not going to hold your DNA file. I don't actually want your DNA file, but I'm going to extract just a few values. If you were a really, really seasoned DNA Painter user who was into chromosome mapping and has followed along closely, you might have noticed a feature which I released back in 2020, which allows you, if you go to a tiny little checkbox, to turn on a feature called traits. And what that does is it puts these little lollipops above certain locations on your chromosome map and says, well, this is the bit in your map where you have the gene that determines if you have wet or dry earwax or bitter taste, or if you're an early or a late sleeper, or if you have curly hair. Right. So I put this in there. It doesn't do anything really clever. It just says, well, this is where that marker is. So whatever your value is there, you can figure out which ancestor you got it from. So what I realized I could do was load in the DNA file, pull out the actual values that you have for those traits and then offer an interpretation of them. So I can say, well, it looks like you've got the gene for curly hair. And then because we're in your chromosome map, I can do more than the testing companies can do. I can say it looks like you have the gene for curly hair and it looks like you got it from your grandmother Rose, for example.
So I'm hoping that that will make chromosome mapping seem more attractive to people because they'll think, well, wait a minute, it's not just figuring out which bits of my DNA I got from where. I can actually figure out which of my traits I got from different people as well.
A
Anything that can make it feel more personal, that.
B
I hope so, yeah.
A
More concrete that. Okay, I have a different, yeah, different values at whatever on such and such chromosome. But people really get hair color or eye color, I realize.
B
Exactly.
A
And hair like that might be more complicated than just one marker. But.
B
Well, yeah, I have to be careful with my science here because I don't want to be overpromising or telling people stuff that isn't true. So I think it's inevitably quite top level. But there are certain traits which can be pinned to specific locations and I think it's fun. I mean, as soon as I built it, and I have it in testing now, I started to load up members of my family to check out, oh, so who is a late sleeper and where did they get it from, this kind of thing. So I think it adds a little bit more intrigue. It's obviously kind of light, but I think being able to actually relate it to an ancestor because you're mapping your chromosomes anyway makes it a bit more compelling. So, yeah, that's the big one. I've got some other tree visualization stuff that I'm really happy about that I'm hoping to get out as well. But yeah, and I guess I'm doing what everyone else is doing, which is considering how AI might be able to do something interesting with DNA. But yeah, there's the kind of specter of not wanting to take my DNA matches' names and put them into a system where I don't know where they're going to go. I think, I don't know, do we wait until we can operate these AI systems on our very powerful laptops or is there a clever thing that can be done? I'm sure there's loads of clever things that can be done now. I guess I'm just too busy looking at my clusters to have figured it out yet.
A
Hard to find the time for other stuff. Yeah. Well, thank you so much, Jonny. I really appreciate you coming by the podcast here. And obviously our listeners can find you at dnapainter.com but where else can we hear what you're up to?
B
And yeah, you can join. I have got a free newsletter, so if you go to dnapainter.com mailing list, I've got a monthly newsletter that goes out with a kind of summary of big things happening in the DNA world and the genealogy world in general, plus a few sort of weird history things that happen to attract my interest. Then I've got quite an active Facebook group, the DNA Painter User Group, and then there's another group for What Are the Odds? over there on Facebook. And then you can find me on places like LinkedIn and Bluesky and Twitter. But yeah, those first ones I mentioned are the main ones.
A
All right, well, thank you very much, Jonny.
B
Pleasure. Thank you again.
A
Thanks for joining me in this month's episode of the Family Tree Magazine podcast. You can find the show notes from this episode and all episodes at familytreemagazine.com/podcasts. While on our website, you can also sign up for our free email newsletter where you'll receive free genealogy resources each weekday, including links to new podcast episodes as they're released. Until next time, have fun climbing your family tree.
Episode: Making Use of DNA Painter – An Interview with Jonny Perl
Host: Andrew Cook, Family Tree Magazine
Guest: Jonny Perl, Founder of DNA Painter
Date: April 1, 2026
This episode features an in-depth interview with Jonny Perl, the founder of DNA Painter, one of the leading third-party tools for genetic genealogy research. The conversation explores the origins, features, and future of DNA Painter, key concepts in DNA matching and inheritance, and the evolving landscape of genetic genealogy tools. Perl also offers practical advice for beginners and previews new features under development.
On chromosome mapping's appeal:
“It’s kind of geeky, if you like…but it’s a very engaging way of creating a kind of DNA, a sort of genetic companion to your pedigree chart.” – Jonny Perl, 00:50
On the randomness of inheritance:
“One of my nieces…inherited 33 or 34% of her DNA from my mother and only 16 or 17% from my father…Once you get beyond that first generation, there is a lot of randomness.” – Jonny Perl, 04:01
On clicking for more insight:
“If you click on it, you get a histogram, right. That gives you the kind of distribution of what people said the amount shared for that relationship was...I’ve been trying to get people to click on that box for years.” – Jonny Perl, 08:43
On the future of user data:
“I don’t actually want your DNA file, but I’m going to extract just a few values…so I can say, well, it looks like you’ve got the gene for curly hair. And…you got it from your grandmother Rose.” – Jonny Perl, 21:48
On the excitement of current trends:
“Exciting times. There’s still lots of new things happening. We have MyHeritage and Family Tree DNA are moving their autosomal tests to kind of next generation sequencing. So we don’t even know what the implications of that will be. But it’s potentially exciting.” – Jonny Perl, 21:15
On educating new users:
“For those new testers, they’re going through exactly the same loops that I was going through myself...they’re thinking, well, what does this mean?...Everything’s a bit more fuzzy and nuanced than you want it to be.” – Jonny Perl, 19:15
This episode offers a lively and practical glimpse into how DNA Painter supports genealogists, demystifying the complexity of genetic inheritance and giving newcomers and experts alike tools to explore and visualize their DNA matches. Jonny Perl’s insights highlight both exciting recent advances and the ongoing need for accessible education as the field continues to develop.