
Sean Falconer
Data visualization is increasingly important as organizations prioritize data-driven decision-making. Tools that transform complex datasets into intuitive, interpretable visualizations are arguably just as critical as the data itself. Robert Kassara is a data visualization developer at Observable, a platform for creating interactive data visualizations that makes extensive use of the popular D3 JavaScript library. Robert previously worked at companies including Salesforce and Tableau and has deep experience in data visualization and data visualization tools. He joins the show to talk about modern data visualization and his work at Observable. This episode is hosted by Sean Falconer. Check the show notes for more information on Sean's work and where to find him.
Robert, welcome to the show.
Robert Kassara
Hi. Thanks for having me.
Sean Falconer
Yeah, absolutely. Thanks for being here. You know, I worked in an information visualization lab once upon a time, so it's going to be fun to revisit this field and maybe resurface some long-forgotten knowledge that I once had.
Robert Kassara
For sure.
Sean Falconer
Yeah. So I wanted to start off with a bit of background on you. You know, you were once an academic, but you've now made the migration into industry. What motivated that transition and how has that experience been?
Robert Kassara
Sure, yeah. So, I mean, it's almost like ancient history at this point. I was a professor at UNC Charlotte many years ago, until 2012. That's when I was doing a sabbatical at Tableau, and at that time Tableau was just starting to think about doing more research. So that's when I made the switch into industry. I was in industrial research at Tableau for 10 years, and then about three years ago I switched over to Observable, and I'm now doing developer relations and sort of product education at Observable.
Sean Falconer
How would you think about, especially in Tableau, where you're sort of a little bit on the research side, how's that experience of doing that in industry versus in an academic setting different?
Robert Kassara
Well, I don't want to ramble too much about all the things that annoy me about academia, but it tends to be that in academia there is a lot of work that you do as a professor that has to do with administration and getting grants, getting money essentially to pay your students. And then the students are the ones that do the actual research work to a large extent, of course under your guidance and with you sort of managing them. But a lot of what you do is really administrative, much more than content. And that's what really drew me to Tableau at that point, because I was just doing my own work and I had my schedule open and I could just do the work rather than having to go to all kinds of meetings and having to teach as well. Teaching takes up a ton of work and a lot of time. Even though I did enjoy it most of the time, it also was just a lot of work and time spent. And so in industry you tend to just be able to focus on the research. Also, of course, academic research is all about publishing papers, and industry research can be quite different. It depends on where you are. Some places are very focused on publishing, like Microsoft Research, I think. I don't know if that's still the case, but they certainly used to be very heavy on publishing. And then at other places, like Tableau, research was sort of a mix. You wanted to have some sort of impact, and that could be publication, or that could be getting your work into the product. And that of course was what I was after at that point: to have more impact on real users and the real product rather than publishing something and then moving on to the next thing and it never really making it into people's hands, which is what happens with a lot of academic research.
Sean Falconer
Yeah, I kind of got turned off from academics for a similar reason. I did a PhD and a postdoc, and I think I had this thought originally where it was like, oh well, if I get to a place where I'm a professor, I'll be able to sort of choose whatever I want to work on. And in some sense maybe that's the case. But really all the fun work ends up being done by students. And it is a lot of administrative work. You're sort of under-resourced from the standpoint of running a lab. It's like you're running a company in some aspects, but you're the sole person in charge of, you know, making the funding happen while also teaching, while also trying to get your students to graduate and do things that are important and things like that. So there is this, I think, downside that isn't necessarily apparent until you reach that stage.
Robert Kassara
That is absolutely true. Yeah. And as you were basically saying at the end there, I think it's also a matter of getting over this hump. The startup part of that really takes a long time, and it's really hard to get over. And then of course some people are really good at this: they have a pipeline of grant proposals and they know that they have money coming in for this and that, and they keep this mill of papers and grant proposals running. I just wasn't that good at that. It's something that you have to have a talent for to really be good at, and then it can work very well, for sure. I don't want to say that academia is miserable, because some people manage it really well and are extremely productive. So that's certainly true.
Sean Falconer
It seems like you still publish a fair amount. So how do you balance making those kind of research contributions while also having a job that maybe is a less research forward job?
Robert Kassara
Well, I publish a little bit. It's been essentially a paper a year, and those are mostly opinion pieces rather than really research. It's tricky. And I do also still do some service, so there is reviewing for conferences and journals that I still do a little bit of, because it helps to be part of the community a bit and also to be aware of what's happening: you see what's going to be published in six months or in a year if you're one of the people reviewing that and being part of that. It's mostly my spare time right now. It's not actually really part of my job description.
Sean Falconer
Does that help with your sort of credibility within your day job as an educator in the space?
Robert Kassara
I think so, yeah, for sure. I think people appreciate seeing somebody who's still part of this community of research and being out there and sort of like being visible as somebody who's doing that work. Still, that certainly helps with credibility. I think this is a big reason why research groups like Tableau Research, like Microsoft Research publish because it's credibility. It shows that the company is trying to do something that's bigger than itself. It's not just making money and building products. It's about being part of a community and of a larger kind of society in some sense.
Sean Falconer
What originally sparked your interest in sort of like data visualization, information visualization, data apps?
Robert Kassara
Well, I've always been fascinated by graphics in general, I think, things that are visual. And especially once I saw that you could actually make visuals from data, I was like, well, that's the thing to do, because this is so obviously a good idea: we can use our visual abilities to look at numbers, even large amounts of numbers, and see patterns and understand what's going on. It's just really fascinating when you first look at a scatter plot and you see the correlation, you see the outliers, you see things in there that are interesting. And especially if it's data that you care about, that has some meaning for yourself or for your work, you can now actually use that to figure out what to do next or how to change your behavior or things like that. And I guess the other thing is communication. I really am interested in how to tell people something about the world using visuals, because it can be very impactful, but it can also be tricky to read, so there's always this trade-off. But I think as a way of communicating, visuals can be incredibly powerful.
Sean Falconer
How do you think about the difference between, I guess, information visualization as a science versus, hey, I produced a nice graph? What is that depth difference, I guess, is what I'm looking for there. How are these things different?
Robert Kassara
That is a good question. I think data visualization tends to have a very nice and tight coupling between research and practice. So I guess to answer your question, the practical side is like, how do I make something, how do I make a chart, a data visualization, an interactive piece, some kind of data app out of my data or for a specific set of data, whatever that is. And the research side is more about what are the right ways to represent the data, what are good ways of interaction, what are things we know about the use of color, the use of different kinds of visual encodings, and how can we also bring in recommendation systems and other things like that that help you with that. So there's a lot of work that's being done there. And of course now, with AI being this big topic, that's becoming something that people are trying to do and figure out what you can do there. But recommendation systems have been part of data visualization for a long time, going back to the 1980s. There's a paper from, I think, 1985 that was doing that, and there might be even older ones. And in data visualization it's relatively easy to take ideas from research and put them into a product, or even the other way around, because a lot of new things that get released also have publications associated with them. So it's nice to see the thinking behind work that's being done on a new feature somewhere, which you can then actually read about, along with some of the mechanics behind it as well. So I think for dataviz it's relatively easy to make that connection from both sides, both from the industry or product side and from the research side. And that's also, I think, why this is a fairly vibrant space where there is a lot of awareness and crosstalk between both communities, or both parts of this community.
Sean Falconer
Yeah. So the visual might be sort of the, if I'm understanding correctly anyway, the visual is like the output. Whereas sort of from a research perspective is what was the thinking that goes into creating the visual to be something that's actually useful, tells a certain story, or unlocks someone's ability to explore data in a new way?
Robert Kassara
That is true, but I also think there is a lot of additional things that you have to be aware of and consider when you're building systems that you want to actually sell to people. So, for example, data access is a big problem in practice that you don't usually deal with in research because you just have the data and it's usually not that much data. So usually research datasets tend to be relatively small. But in practice, in real work there is, or I should say in industry, I don't want to say real too much here, but in industry, as you're using something, you may be talking to a large database or a data warehouse or something that's way beyond what you could just load into your little app. And so you have to think a lot more about how do you push down work into the database, or how do you query the database correctly, or where do you even get the data from? What's the shape? Like, do you have to clean it up first? Do you have to reshape your data? There are a lot of steps before you can even build a visual. And then the visual can be the end result, but it can also be a step along the way because you might be building something and it could be elaborate or it could be very simple to answer a quick question or to answer a first question. And then that leads you to, well, what's the next thing that I need to do? And then you kind of go along the way there. So in my experience, the visual often actually, well, it depends on the use case, of course, but very often the visuals are sort of like stepping stones. They're not necessarily the end product unless you're doing presentation of some kind, like because you're talking to an executive or you're presenting to a board or whatever, then you're going to go through a lot of visuals first, but then you build sort of like the products that actually get shown to somebody else. 
But that's also, I guess, a difference in visualization between academia and industry, where academia tends to emphasize the visual as the product more, because they're just less embedded in the larger questions of what's the overall work that you're trying to do this for? Like, what's your job? What's your task? Whereas in industry, there's much more focus on, well, what's the next step? And how do I get from here to the next question or the next answer?
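The "push work down into the database" idea from this answer can be illustrated with a minimal sketch. It assumes a hypothetical `sales` table with `region` and `revenue` columns and uses Python's built-in `sqlite3` module as a stand-in for a real warehouse; none of these names come from the conversation. The point is only that the aggregation runs inside the database, so the app receives one row per region instead of every record.

```python
import sqlite3


def build_demo_db(rows):
    """Create an in-memory table of (region, revenue) rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return conn


def revenue_by_region(conn):
    """Let the database do the aggregation; only one row per region is returned."""
    cursor = conn.execute(
        "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    )
    return cursor.fetchall()


# Tiny stand-in for a table that would be far too large to load into the app.
rows = [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 1.5)]
conn = build_demo_db(rows)
summary = revenue_by_region(conn)
print(summary)
```

The chart then only needs `summary`, which stays small no matter how many rows the table holds.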
Sean Falconer
And then given that, you could have something that looks really good, is very aesthetically pleasing, but maybe is a really bad user experience where it's just not that useful. How do you think about balancing those two things? Something that's aesthetically pleasing can go a long way in terms of making someone feel a certain way, but you can also build it in such a way that it's not particularly useful. So how do you balance those two things?
Robert Kassara
Yeah, that is the crucial question, because there's always this trade-off. Especially if you're trying to get somebody's attention, you might want to do something fancier, or even something that's technically not correct, but it gets their attention, and that's really what it's about. If you just make bar charts and line charts because those are the correct way to show whatever data it is, but nobody cares because they all look the same, well, you haven't really done the right thing to get people to do something about whatever it is that you're trying to get their attention for. So that's a big tension there, where the question is often, well, this isn't the way you should do it, but in reality you have to get people to actually pay attention to something, and for that you need to do something that's unusual at least, and that has to stand out in some way. That doesn't necessarily excuse all kinds of bad charts. But I think it's too easy to have a very dogmatic view of dataviz that is very disconnected from the reality of how people really use data. Coming from academia first, my thinking, of course, used to be that there are rules and ways of doing things. But seeing what people do, I think you just naturally change and see that the goal, like what the outcome is, really justifies what you build to a large extent. I'm not excusing everything, but if it gets people's attention, if it shows them the right data and doesn't lie to them, I think it's perfectly reasonable. And oftentimes it's better to have people do things that are a bit unusual and a bit more fun, perhaps, than insisting on the rules and just making some more bar charts that nobody's going to look at.
Sean Falconer
Right. Yeah. You almost need some sort of anti-pattern to maybe draw attention. And also there's an aspect of, if it's fun then you'll actually use it, versus if it feels like a chore, even if it's technically correct, it might not be very useful.
Robert Kassara
For sure.
Sean Falconer
What are some of the interesting, maybe non obvious, findings that have come out of information visualization research work?
Robert Kassara
Let me see if I can think of a few quickly. I'm not sure if this is really research so much as practical use, and I'll think of a research example. But one thing that I like to cite as something that people do in practice, which also goes back to your earlier question about things that you're supposed to do or not, is that people use treemaps in ways that aren't really what they were intended for, but that aren't wrong in any way. Just to briefly explain what a treemap is: a treemap is a rectangular space that you subdivide so that each piece represents some number. The easiest example is a file system. You have files in some kind of hierarchy, and each level of your folders contains files. Those files have sizes. You take those sizes and add them up, and then each folder and each file becomes a rectangle in this representation, and you can see each level and how big those files are. So this is actually a very useful little tool for that. And that's what it was developed for originally: to figure out what's taking up all the space on my hard disk. It was built for deep trees, meaning trees that have lots of levels, like a folder hierarchy that goes many, many levels deep. But it turns out it's actually really hard to read the depth on a treemap. There are different ways of showing it, and none of them are that great, so it's really hard to know what level you're on. But the size comparison is very useful. And so what people have started doing is they just make a treemap instead of a pie chart to show part-to-whole relationships. So you have your departments' revenue numbers and how they all add up to the total, and it's a treemap. People like those because there are a lot of prejudices against pie charts, but also because they make big, nice rectangles that take up the space and that you can easily compare, and so you get a good view of that data.
So that's one thing that's not so much research, but it came out of the practical use of a research thing for something it wasn't really intended for, but that works extremely well. So that's been kind of interesting.
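The file-system description above can be sketched in a few lines. This is a minimal illustration using the simple "slice and dice" layout, which alternates the split direction at each level; it is one of several treemap layouts and is an assumption here, not something named in the conversation. The dictionary format (`name`, `size`, `children`) and the example tree are hypothetical.

```python
def total_size(node):
    """Size of a file, or the summed size of everything under a folder."""
    if "children" in node:
        return sum(total_size(child) for child in node["children"])
    return node["size"]


def slice_and_dice(node, x, y, width, height, depth=0, out=None):
    """Give every node a rectangle whose area is proportional to its total size."""
    if out is None:
        out = []
    out.append((node["name"], depth, x, y, width, height))
    total = total_size(node)
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in node.get("children", []):
        share = total_size(child) / total if total else 0.0
        if depth % 2 == 0:
            # Even levels: split the parent rectangle from left to right.
            slice_and_dice(child, x + offset * width, y,
                           share * width, height, depth + 1, out)
        else:
            # Odd levels: split the parent rectangle from top to bottom.
            slice_and_dice(child, x, y + offset * height,
                           width, share * height, depth + 1, out)
        offset += share
    return out


# Hypothetical folder tree with file sizes in arbitrary units.
tree = {
    "name": "root",
    "children": [
        {"name": "docs", "children": [
            {"name": "a.txt", "size": 10},
            {"name": "b.txt", "size": 30},
        ]},
        {"name": "video.mp4", "size": 60},
    ],
}

rects = slice_and_dice(tree, 0.0, 0.0, 100.0, 50.0)
for name, depth, x, y, w, h in rects:
    print(f"{'  ' * depth}{name}: x={x:.1f} y={y:.1f} w={w:.1f} h={h:.1f}")
```

Every rectangle's area is the node's share of the whole, which is the part-to-whole reading described above. The depth is only implied by nesting, which hints at why depth is hard to read off the picture.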
Sean Falconer
Yeah, it's probably the most practical use case of a treemap that I've seen. I did a bunch of work on treemaps in my PhD, and I think one of the challenges there is they come up a lot in academics and research, and I think they get proposed, probably at least back then, in industry, but then end up disappearing by the time you roll something to production, or they choose some other option. I think it does come back to this: they're very effective at showing sort of the volume of data, but then you try to jam additional dimensions into it by using color and maybe other forms of visuals, and then you kind of lose the actual affordance of the visual. Whereas the primary thing that it gives you is sort of a volume, kind of like a pie chart, but like a hierarchical pie chart.
Robert Kassara
Yeah, yeah, for sure. It's very much. You could treat them essentially like what's called a waffle chart, which is like a rectangular pie chart essentially. And that can work very well depending on what you're using them for.
Sean Falconer
Capital One's tech team isn't just talking about multi-agentic AI. They already deployed one. It's called Chat Concierge, and it's simplifying car shopping. Using self-reflection and layered reasoning with live API checks, it doesn't just help buyers find a car they love, it helps schedule a test drive, get pre-approved for financing, and estimate trade-in value. Advanced, intuitive, and deployed. That's how they stack. That's technology at Capital One. What are some of the most common missteps that you see when you see products in industry? And you shake your head at it basically and you're like, oh my God, I can't believe I'm seeing this again.
Robert Kassara
Yeah, I think one thing that is very common that I've seen is people really overemphasize the glitziness of things and try to make things pretty before they think of how they actually fit together to be useful. And I think that probably has to do with trying to impress people, because, you know, this is an amazing new thing and look at how amazing it is. I think this is especially true for animation. There's a lot that's very easy to do these days with animation on the web and everywhere, and it can be used very well. So I'm actually very much pro animation. There are some people who really hate it. I think it's very, very helpful, especially for transitions. But it can also be overdone. When you have animated textures on things, or just everything being animated, everything bouncing and having cute little things going on, it just becomes too much. And that's something that I've seen in a few places that I think is a really big mistake. You have to be very, very careful with these things because they're very visually salient. Animation is a lot like color. Color draws your attention and you have to use it with care. Then it can be very useful, but it has to be used with a lot of care to not be totally distracting or overwhelming. And the same is true for animation, actually even more so, because it always draws your attention. If something moves or bounces, that's used as an attention mechanism, to get your attention to something. But if stuff always bounces, it just gets distracting and overpowering and overwhelming.
Sean Falconer
Yeah. If everything is attention-grabbing, then basically nothing is attention-grabbing, right? I mean, it's a little bit like, you know, if you're putting together a PowerPoint and slides: animations and transitions can be effective tools, but if every single thing is animated and has a transition, then it becomes very, very obnoxious.
Robert Kassara
Yeah, that's a good comparison. Yeah, because that's also where I've seen it. Yeah, for sure.
Sean Falconer
What are some of the trends in design that have kind of come and gone, in terms of being in fashion and out of fashion?
Robert Kassara
So I think there was a whole sort of school of doing things in the late 90s, early 2000s perhaps, that had lots of use of gradients. A lot of this has to do with technology. I think it became easier to do gradients and shading and stuff like that. And so people did a lot of that. And now we've kind of moved to the opposite. Everything's very, kind of two dimensional. Very, very much like just plain colors and nothing else. And I think that, to me is a little bit of an over response because it makes everything kind of look the same and also takes the fun out of things. So I'm A big fan of RJ Andrews and his work on looking at historical data visualizations. And those, of course, those are hand drawn like charts and maps and all kinds of things. And so they used like hatching and little patterns and things like that. And those can be very effective to get your attention to something. And so I keep wondering, what can we bring back that's a bit like that, that's a little bit more playful, a little bit more textural and sort of like tangible that isn't just solid colors and nothing's allowed on the chart other than your two colors and your white background. And everything sort of ends up looking very, very similar.
Sean Falconer
Yeah, I mean, I think it's almost like anything where we end up with these over corrections of something where we kind of do something, we do it to death basically. And then we go the complete opposite direction and then that becomes the thing and then we do that to death and then we go in another direction. So it's kind of like fashion, they say what's in style? And then goes out of style and it's in style again.
Robert Kassara
Yeah, definitely.
Sean Falconer
What is an example of a problem that was solved through an information visualization technique that maybe wouldn't have been able to be solved otherwise?
Robert Kassara
So one example that we had at Observable: we make these charts of traffic. We look at our traffic on the Observable website and try to figure out essentially where it comes from and what it is for. And these are very dense plots. This is something that we could also talk about, perhaps, this distinction between trying to have everything be a summary plot versus showing essentially every single data point. But one of the charts that we have is very much every single data point from our logs, to show essentially all the data for every day when a certain path was hit, and they're color-coded. We actually have an example on our website of what this looks like. And the thing that you see when you do that is this very dense display, basically a very dense point cloud. It just seems random at first, but then you see some patterns, and sometimes you see interesting, kind of weird lines going through, or just darker clusters where there is a certain thing that's being hit more than others. And again, the color here is essentially the path. So we have certain paths that we pick out, and those get different colors assigned to them. And so we found very interesting patterns. Some things we were actually looking for, because sometimes there was a lot of traffic and things were slowing down, and we were like, well, what's causing this? And we found people scraping the site. And so we were able to shut that down if they're causing too much trouble, because they're just hitting certain paths all the time that are expensive and slow, and so they're taking up a lot of resources. But we also found some other patterns that we just weren't aware of, and we were like, oh, people do this? We didn't know that. So for example, and this is also kind of a scraping thing, people were looking at all of the profile images of our users and scraping those. And so we were like, well, why are you doing this?
And then in general it's just traffic, like the amount of traffic coming in. So you see sort of like where that goes. And that can help to kind of balance out or to figure out where to put resources or where to maybe like reroute things so that they're more efficient. So there are lots of little things you can find by just looking at these patterns. And in this case they had to really be individual data points. Really. Like, it's hard to see this in a summary chart, but you get lots of information from these things that give you sort of like an overview of that kind of data. Another example is also like, what do people search for? We have this search explorer that lets you see how when people type into a search field, what do they type next? So as they search, because there is a preview that's happening as the search is running and so it shows us common search terms. And also if you start searching this thing, well, where are you going? Was this likely? And so this could actually help build a recommendation system that would help you find the things faster. And that's another thing where you have to really dig into all of the records and dig out those things that are common among them that are otherwise hard to find because they're really weird sort of like sub patterns in a huge morass of data. But you're picking out those things that are structured that you can kind of find and pull up.
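The "what do they type next" idea behind the search explorer can be sketched as simple transition counting. This is only an illustration, not a description of how Observable's tool works: the session format (a list of successive queries per user session), the function names, and the example queries are all hypothetical.

```python
from collections import Counter, defaultdict


def next_query_counts(sessions):
    """Count, for each query, which query followed it within the same session."""
    transitions = defaultdict(Counter)
    for queries in sessions:
        for current, following in zip(queries, queries[1:]):
            transitions[current][following] += 1
    return transitions


def most_likely_next(transitions, query, k=3):
    """Return up to k queries that most often followed the given query."""
    return [q for q, _ in transitions.get(query, Counter()).most_common(k)]


# Hypothetical search sessions: each inner list is what one user typed, in order.
sessions = [
    ["d", "d3", "d3 bar"],
    ["d", "d3", "d3 map"],
    ["d", "da", "data"],
    ["d", "d3", "d3 bar"],
]

transitions = next_query_counts(sessions)
suggestions = most_likely_next(transitions, "d3", k=2)
print(suggestions)
```

Counts like these are the kind of structured sub-pattern that is buried in raw logs; a recommendation feature would need much more than this, but the counting step is the starting point.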
Sean Falconer
In terms of using visuals to help people essentially recognize different patterns within their data, how does that compare or contrast to using some sort of statistical method that I could use around pattern recognition that just sort of gives me the answer.
Robert Kassara
Well, that's the question. Does it actually give you the answer? Right. So when it's more exploration of the data and more open-ended, then data visualization is clearly the better way to do it, because you don't exactly know what you're looking for. It's hard to know beforehand what pattern it is that you want to pull out. My example with the traffic is like, well, we see some of these lines going through our clouds of points, and so we can ask, well, what is that? Or when there's suddenly, I don't know, a little cloud of one color among the other colors that usually isn't there. That's the kind of thing that you pick out visually very easily, where you wouldn't know what to look for. So if you can formulate your question in very precise terms, then you probably just want a query or a statistical method of some kind. But if the goal is to say, well, I don't know what's happening, I want to find out what the issue is, or I want to find out what the cause is, or I just want to see what's happening today, are there new things jumping out at me that I don't know about? It's these unknown unknowns. That's what visualization is great at, because you can see them and you can then start digging into those patterns that you didn't expect or that just jump out at you.
Sean Falconer
How do you make certain visualizations understandable? Or is that a challenge for the layperson? If we go back to the treemap example, right, if I don't really understand what the size of these rectangles represents, and that I could click on it or interact with it, then it might be hard for me to map that mentally to whatever things I can do with it and how it can help me problem solve. So can you talk a little bit about the challenges there?
Robert Kassara
Oh, for sure, yeah. So this is a very common thing, especially in data journalism, but it's also true everywhere. I guess you can assume, I mean, hope that you can assume, that people will understand the basics of bar charts and line charts. Those are the safe charts, so you can assume that people will know those. If you're using something like a scatter plot or a treemap or a Sankey diagram, then it gets a lot more important to make sure that you either know that the people that you're showing this to know what that is, or that you provide enough context. In data journalism, I think they do a really good job of what they call the annotation layer, where they add annotations and examples and guide you through it and say, well, here this is a big thing, and this is twice as much as that thing, and it's important because of that. In the data journalism case, usually what you get is people walk you through the data as much as they walk you through the visualization. And that can be a good way to just teach you something along the way: because you're usually interested in the content more than the viz, you're going to go along with that and pick it up. There's been a big discussion for a long time about, well, can you use more complex, more unusual charts in a news graphic, for example, when you don't know if people will want to spend the time to read it or will just find it confusing? And so it depends very much on the publication and on your audience whether you expect them to do that. And also, I guess, on the kind of coverage or the kind of news you're after. If it's breaking news, you probably don't want to do that, because you want people to quickly pick it up and understand what's going on.
If it's more of a complex feature piece where you want people to spend time and really dig deeper into it and understand what you're trying to tell them, then you're going to do something more complex. But you have to explain that. And so in a business use case, since of course data journalism is interesting, but it's not everybody's work. Of course, in the business use case, I think the important question is always who's your audience and are you going to be there to explain something to them? Or is this more of a presentation style thing? Or is this something you're sending to people and then you have to figure out if you're going to have some kind of explainer or if you're going to use a simple chart just to be safe.
Sean Falconer
Yeah, I mean, clearly there's different sort of categories and expectations. Like if I'm interacting with a tool like Tableau or something from Palantir, one of these types of companies, there's probably an assumption around like, okay, this person's gone through some sort of training to understand sort of how to use this and drive analytics to some extent, because these are exploratory tools that I can use to kind of come up with hypotheses to run further investigations and things like that. Versus like, hey, I'm going to just send this to somebody who's maybe, I don't even know what the role is, or in sort of the journalism case or, you know, a particular business user. Then the expectations about usability are going to be different.
Robert Kassara
Yeah, for sure. And that's where, sort of like, whether you're talking to your immediate team and the people who are working with those tools and who are versed in those visualizations and in the data too, versus people who are sort of the recipients of, the consumers of, your work, who may or may not know either the visuals or the data. And so it can be important there also to add more context to what does this data actually mean? Like, what are we looking at here? Make sure that people are clear on the context and what's being measured. And, I don't know, what the relevant goals or KPIs are, for sure.
Sean Falconer
The fact that we have sort of more compute, more memory available to us now to be able to render things quicker, deliver more data faster, has it unlocked certain types of visualizations that previously were just not possible?
Robert Kassara
Yeah, for sure. So there are examples like I was just thinking of. I have this mental image in front of me, but there's this chart of our network traffic, or website traffic, I guess, on Observable, where it shows you a large number. And I don't actually remember what the time horizon is, but it's like a couple of weeks or so. And you have essentially all the traffic being rendered onto that, which is a lot. Like, you have a lot of points being drawn on there, like millions of points. And this is all happening in the browser. So it has to download that data set. This is like a Parquet file that it downloads, which is a very efficient way of storing it. But still, we haven't had that for that long. And then now it's possible to just render that. In that case, I guess this is probably Canvas, but you're rendering in the browser, so there's enough power certainly on your desktop machine to deal with several million records and rendering those. And it takes, I don't know, a couple seconds maybe to do it. So it's very fast. And then, you know, you can still point to things and it can query from the location what that record actually is and show you more information. So it's not just that you can render an image, but you can actually, you still have that data available and you can still query it and filter and do all kinds of things, like just in the browser. So that's really extremely powerful. And we tend to kind of underappreciate just how much power we have in our laptops, even in our phones. I mean, they're extremely powerful computers and they can do a lot. And so that's been, I think this is still not quite fully utilized and fully understood, but there's a lot that we can do there and especially with interaction like, because we can have the data right here. And even downloading a few million records is quite fast these days given networks, but also just given storage formats like Parquet and DuckDB. 
DuckDB is a very powerful way of doing this because it has its own format that's also very efficient. But then also you can run it in the browser and it's just fast and efficient and you can just run queries and do it interactively, move a slider and that runs a query and renders its output and it all is done in fractions of a second, even over fairly large data sets. So yeah, absolutely, there's a lot of power we have today.
Sean Falconer
Are there certain visuals that have become more mainstream over the last decade or so? If you think about graphs, for example, you have your standard pie charts, your bar charts, things that kids are learning in school and that you can find in any spreadsheet type of software. But has there been a new sort of introduction to that space that has gone mainstream that wasn't there before?
Robert Kassara
Well, nothing that I can think of right now that's very recent other than the tree map. So the tree map certainly I think has had a big impact there. But of course the tree map itself is from the mid-90s, but I think it took a while for people to realize what it can be used for other than these deep trees, as I was saying earlier, but I'm not sure if there is a specific. Well, I guess one thing that is much more common these days because it's just much easier to do them is maps. So maps used to be pretty difficult to do on dashboards and data apps and now it's really easy to do that because they're very fast. It's kind of hard to imagine a time before, like Google Maps and sort of the way we used to do maps on the web, but it used to be very painful and they had to be rendered somewhere else. And so today it's a bit like what I was saying earlier about being able to render things quickly. You can have a lot of map data, just base map data and render that quickly in your browser and then render data on top of that and then be able to move that or zoom in and out and things like that, and it renders and just happens right there and it's just seamless and fast. So that's something that I think is more common now because it's so much faster and because maps, for better or worse, are very popular, I think, as a representation, even though they're not necessarily the best representation for a lot of data. Because I think a lot of people overestimate how important location is for a lot of data. But once you have location, it can be helpful to have that sort of as context. But yeah, that's certainly something that's become more popular I think because it's just so much easier today.
Sean Falconer
Yeah. And that's also a good example of something that without the growth in sort of compute and storage and high speed Internet probably wouldn't have been possible 20 years ago or something like that. Yeah, I mean anybody who lived through the MapQuest days and even previously, before even that existed, there was a real dark ages for a lot of people there. So we're living in the height of technology now. We have it much easier.
Robert Kassara
Yeah, I think people who haven't lived through that can't imagine what that was like.
Sean Falconer
Yeah, exactly. When you used to have to go to a library to look things up. So you post a lot of your thoughts and writings on Eager Eyes. Can you talk a little bit about that site? Why did you start that?
Robert Kassara
I mean, honestly, it has been a little bit inactive or at least kind of dormant for a little while. But yeah, Eager Eyes has been my website, my blog for a long time. I started that, I don't know, 18 years ago now, so it's been a while. I actually don't remember exactly what year I even started, but so this was my way to sort of get word out about my research and my work and my thinking. It was just like, what's happening in my head about dataviz? I want to talk about it. And I think this was probably an early sort of like way of trying to maybe rebel against academia and also kind of trying to kind of go beyond it. Because academic publishing takes a lot of time. It takes a long time to get things out from when you do them. So getting things out to just people who can read a blog, it takes seconds technically, or it's much, much faster. You can write a blog post in a couple hours versus a paper that takes weeks or months to write, and then it takes months, if not years to go through review processes and publishing and whatever. So it was just a way to get things out faster and also to tell people about my research who weren't in the academic research community. So a lot of people who are interested in dataviz, and this is also to your question earlier, I think about the relationship between what do people expect from research versus what research does. A lot of people want to do work using data visualization that is informed by what's happening in research, but they don't know what's out there. So if I don't know that, well, I mean, how do I find that information? And so you're not going to read all those papers because some of them are actually hard to get access to. And it's a lot of effort to find the right papers. 
But if you follow a few people who blog, and there are academics who blog, and so I was one of the few, I guess, back then it's still not that common, but you can now learn about what's happening, both what that person is doing, like what I was doing in particular, but also I wrote about other people's work when I was at conferences or read a paper about something that I found interesting. So but that was just my way to kind of get word out and say, hey, there's interesting work happening in this space and here is some of it. Here's my work and here's some other people's work. And here are also just random thoughts that I had around how to use pie charts and what else to do. So that's what it's all about.
Sean Falconer
What's one of your most popular posts?
Robert Kassara
My most popular post, I wrote a review of Edward Tufte's course that was not very kind because I was just not very excited about what he was presenting. And that got a lot of traffic and a lot of comments. So that's my most commented on and perennially sort of like my most popular post. But the other ones are about specific techniques. Like I've written about pie charts, I've written about treemaps, and those are things that people find and seem to find useful. Where I talk a bit about the background, how these things work, how to use them, things like that. It tends to be the bread and butter, or maybe not bread and butter, but the background posts that tell you a little bit about how something works and then the practical side of it as well, that seem to do well.
Sean Falconer
What would you say is the biggest problem that people in the data visualization field are trying to solve now? Is there a class of open or unsolved problems that people are really keen to try to wrap their heads around?
Robert Kassara
There are a lot of questions that people are working on. I think one thing that's been a topic for the last couple years is figuring out if there is a way to use this wave of generative AI for data visualization in some way. And of course also how to use dataviz to work with these AI sort of tools and understanding models, explaining models, helping with decision making based on something that some AI model or LLM produces. So if you can understand some of the background behind that, maybe that'll help you understand where it's coming from. But I think especially this question about, well, where is it all going if perhaps it's possible for some AI model, like I was saying earlier, finding those patterns, right? Those patterns that are unexpected? Well, if there are ways to pull those out using some machine learning thing, perhaps, well, then that would be interesting to know because then I don't know that would inform what data visualization is good for or not. And so I think that there are a lot of people working, or at least some people working on that, especially because it's such a hot topic right now. But there's these ongoing questions about how do you interact with the data visualization, what's the right way to dig deeper into something and also how to even just build something from scratch. Like if you're starting to dig into data, where do you start? How do you help people figure out where to look and what to build from there? And that of course all kind of flows together with this whole AI discussion. So those are some of the topics that I can think of right now.
Sean Falconer
Do you think that is probably going to be a big area of focus in the space for the foreseeable future? What is generative AI's impact on information visualization? How can you leverage it there? What are some of the new types of things that you might be able to do that you couldn't do previously?
Robert Kassara
And I have to say that that's actually a part where I'm not super aware of what's happening right now. I know that there have been people working on this, but I can't really speak to what specifically has come out because I'm a little bit behind on my reading. But it's certainly, I'm sure it will be a topic going forward and especially once people are able to build their own and train their own models, which I don't know if anybody has done that so far. I've seen people use ChatGPT to help just write D3 code or help them write code of some sort to create visualizations. And that's also been incorporated into some products, I think. But that aside, I think there's a bigger question of like, well, if we had some kind of corpus of visualizations and what they're good for, what they show, can we then find a way to kind of query that? And I don't know how much work has been done in that space, so I don't think I can comment on that.
Sean Falconer
Yeah, it'd be interesting if you could look at, here's some of the data that I want to make available for analysis or something like that, and create more of an exploratory experience of, okay, what sort of visualization makes the most sense to support the exploration of this data?
Robert Kassara
As I was saying earlier, these recommendation systems have been around for a while, so this is not necessarily a new question, but it's certainly whether there are better ways of doing this that I think is going to be the interesting thing to see and to watch. Because a lot of this so far has been focused on the data structure, basically. So what data types do I have? Are these categorical, are these numerical? What are they called? So those are things that, are they date fields, for example, are they time, are they currencies, things like that? So then you get a sense of what will you do with that and which ones are the most likely to be important. So when you have a time or date field, then that's very often important and so you can draw conclusions from that. And then once you've picked a few things, or once the user has picked a few things that they are interested in, then you can say, well, there are certain rules that let you build charts. So you can just build a bar chart because you have a categorical and a numerical dimension, or you have two numerical dimensions. Then you maybe do a scatter plot or whatever. But this is not a new thing, really. This has been around for a good while. So as I was mentioning earlier, there's this APT system from like 1985 that was doing that, that was basically looking at data types and building charts based on that or encoding things dependent on what they were. And this has been incorporated into Tableau and all kinds of chart tools and builders. I think Google Sheets has a way to recommend charts, for example, that's relatively recent, as in a few years, but that is based on a similar idea that it has a recommendation system that then builds charts for you. But so far they all tend to be sort of correct. They build charts that may be useful, but whether they're actually useful or not is a totally different question. 
Because very often they just show you stuff where you're like, okay, well, I didn't actually ask for this. I didn't need this particular thing. But they don't know what's important to you and what actually helps you understand your data or answer your question. So I think that's really where the question will be, can we guide these systems so that they produce more useful output than they have so far?
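The type-driven rules described in this answer can be sketched in a few lines. This is a minimal illustration, not the algorithm used by APT, Tableau, or Google Sheets; the `Field` shape, the `recommendChart` name, and the specific rule order are all assumptions made for the example.

```typescript
// Hypothetical field types a recommender might infer from a data set.
type FieldType = "categorical" | "numerical" | "temporal";

interface Field {
  name: string;
  type: FieldType;
}

type ChartKind = "line" | "bar" | "scatter" | "histogram" | "table";

// Illustrative rules only: pick a chart from the types of the selected fields.
// Rule order is an assumption; a date or time field is treated as most important.
function recommendChart(fields: Field[]): ChartKind {
  const count = (t: FieldType): number =>
    fields.filter((f) => f.type === t).length;
  const numerical = count("numerical");
  const categorical = count("categorical");
  const temporal = count("temporal");

  if (temporal >= 1 && numerical >= 1) return "line"; // value over time
  if (categorical >= 1 && numerical >= 1) return "bar"; // value per category
  if (numerical >= 2) return "scatter"; // two measures against each other
  if (numerical === 1) return "histogram"; // distribution of one measure
  return "table"; // nothing to encode visually, fall back to a table
}

// Usage: a categorical and a numerical field together suggest a bar chart.
console.log(
  recommendChart([
    { name: "region", type: "categorical" },
    { name: "sales", type: "numerical" },
  ])
);
```

A sketch like this is "sort of correct" in exactly the sense described above: it produces a valid chart for the selected fields, but it has no notion of which question the user is actually trying to answer.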
Sean Falconer
Well, awesome. Is there anything else you'd like to share?
Robert Kassara
Yeah, I mean, we could talk a bit about maybe D3 or the Observable sort of dataviz tools, if that's of interest.
Sean Falconer
Absolutely. I mean, I'm familiar with D3, but maybe just quickly give a little bit of background on that and then what are some of the things that you're focused on as a company with the investment in D3?
Robert Kassara
So D3 is this library that is a data visualization library that came out of Mike Bostock's work during his PhD. I think it stands for Data Driven Documents. This is just background on D3. And it became a very popular way of doing data visualization on the web, because that was really hard, especially back then. When this came out in 2011, people were still sort of using Flash, actually. This was the tail end of Flash, I think, and people were looking for something new and better because Flash wasn't working on their iPhones and it was just also going away at that point. So what D3 made possible was to essentially tie SVG elements to data. So you build a data visualization in an SVG object within a website and tie those objects, these elements of an SVG, like your bar rectangles or your scatter plot dots, tie those to data values. And what that does is it means that when the data values change, your visuals change or can change if you do it right. And so you can have a chart that animates between filter states, for example, or that can morph between a bar chart and a scatter plot and a line chart or whatever. So you can do lots of very complex data visualizations, but it's also a fairly complex library. So even building a bar chart is sort of like a fair amount of work. And so what we've been doing at Observable is try to build tools that help you when it's not so much about building complex charts like you usually do with D3, like very fancy, very elaborate things, but simple charts. And so we have a new library that's called Observable Plot that's built around that. So it's a much more direct mapping of data into visuals. And then you can't build everything that D3 can do, but what you can build is much faster, much easier to build that way. And so it's a way to do exploration and exploratory analysis versus the bespoke and very cool stuff that you can do with D3. 
So you have kind of different ways of working and building things that are still web native and very much part of whatever web based system you want to build.
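The idea of tying SVG elements to data values can be illustrated without D3 itself. The sketch below is plain TypeScript that derives one `<rect>` per data value through a linear scale; `linearScale` and `renderBars` are hypothetical helper names for this example, not D3 or Observable Plot functions, and labels are assumed to be trusted text that needs no escaping.

```typescript
interface Datum {
  label: string;
  value: number;
}

// Linear scale: maps the interval [0, domainMax] onto [0, rangeMax] pixels.
function linearScale(domainMax: number, rangeMax: number): (v: number) => number {
  return (v) => (domainMax === 0 ? 0 : (v / domainMax) * rangeMax);
}

// Each datum becomes one SVG rectangle whose width is derived from its value.
// Re-running this with changed data yields changed visuals, which is the
// core of the data-to-element mapping (minus D3's joins and transitions).
function renderBars(data: Datum[], width = 200, barHeight = 20): string {
  const max = Math.max(0, ...data.map((d) => d.value));
  const x = linearScale(max, width);
  const rects = data.map(
    (d, i) =>
      `<rect x="0" y="${i * barHeight}" width="${x(d.value)}" height="${barHeight - 2}">` +
      `<title>${d.label}</title></rect>`
  );
  return `<svg width="${width}" height="${data.length * barHeight}">${rects.join("")}</svg>`;
}

// Usage: the longest bar spans the full width, the other is scaled to match.
console.log(renderBars([{ label: "a", value: 5 }, { label: "b", value: 10 }]));
```

What this leaves out is what makes D3 both powerful and a fair amount of work: tracking which elements enter, update, and exit as the data changes, and animating between those states.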
Sean Falconer
Given that D3 has been around now for well over a decade, how has it had to evolve its approach to continue to stay relevant?
Robert Kassara
Well, Mike has done a lot of work. So Mike Bostock, who built D3, he has done a lot of work to essentially keep it up to date with web standards and JavaScript standards. I think he changed the way the modules work. He broke it up into modules, so now you can just take the pieces you want. So D3 has become this sort of library of utilities that can do all kinds of things. Even if you don't use it for your visuals, you might be using it for your array operations, because it has lots of cool operations for that like grouping and nesting and stuff like that. And he's also built a lot more. And, well, he and a few other collaborators have built some more layout components, for example, and maps too. So mapping is a big deal with D3. You can have any projection you want. You can have all kinds of really interesting interactions with maps that are all possible to build with D3 and relatively easy actually now. And then layouts. And so this means things like treemaps and even pie charts, because building those from scratch from sectors of a circle is kind of a pain. And so there are layout systems for that that will do the stacking and whatever for you, that build all kinds of different charts for you so you can more easily build things where you don't have to do all the operations yourself. And of course, so I guess layouts and maps, but also force directed layouts. So like graphs and things like that. So those are things that are less common in BI tools but those are things that D3 is also really good at because you have to actually run sort of like a simulation to have the layout be computed over time. And so it has ways of doing that.
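To see why a layout helper is welcome for something as simple as a pie chart, here is a minimal sketch of the bookkeeping involved: turning values into consecutive start and end angles. It is a simplified stand-in written for this example (the `pieLayout` name and `Arc` shape are assumptions), not D3's own pie generator, which additionally handles options such as sorting and padding.

```typescript
interface Arc {
  label: string;
  startAngle: number; // radians, measured from 0
  endAngle: number; // radians
}

// Stack values around the circle: each slice starts where the previous one
// ended and spans a share of 2π proportional to its value.
function pieLayout(data: { label: string; value: number }[]): Arc[] {
  const total = data.reduce((sum, d) => sum + d.value, 0);
  let angle = 0;
  return data.map((d) => {
    const startAngle = angle;
    angle += total === 0 ? 0 : (d.value / total) * 2 * Math.PI;
    return { label: d.label, startAngle, endAngle: angle };
  });
}

// Usage: a 1:3 split gives a quarter-circle slice followed by the remainder.
console.log(pieLayout([{ label: "a", value: 1 }, { label: "b", value: 3 }]));
```

Turning these angles into actual SVG paths is the next chore, which is the other half of what a layout and shape library takes off your hands.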
Sean Falconer
Yeah, I mean that's a common sort of graphs or common visual with workflow orchestration tools, sort of the no code orchestration frameworks and stuff like that that exist. And now actually you're even seeing a lot of that stuff in the Gen AI space of these sort of low code agentic frameworks where you're stitching basically nodes and edges together.
Robert Kassara
Yeah.
Sean Falconer
I wouldn't be surprised if there's a significant growth in investment in those kind of graphics as we continue to build more and more Gen AI devtools. Well, Robert, thanks so much for being here. I really enjoyed this.
Robert Kassara
Thank you, that was great.
Sean Falconer
Cheers.
Podcast: Software Engineering Daily
Host: Sean Falconer
Guest: Robert Kosara (Observable, formerly Tableau, Salesforce, UNC Charlotte)
Date: September 2, 2025
In this engaging episode, Sean Falconer and Robert Kosara explore the field of modern data visualization. They cover Robert’s career journey from academia to industry, key principles and challenges in data visualization practice, the interplay between aesthetics and utility, the evolution of data visualization technology (including D3 and Observable), and current and future trends—especially those around AI and generative technologies.
On Academia vs. Industry:
"In industry you tend to just be able to focus on that. And also of course academic research is all about publishing papers and industry research can be quite different." (02:10, Robert)
On Engaging Visuals:
"If it gets people's attention, if it shows them the right data and doesn't lie to them, I think it's perfectly reasonable." (12:39, Robert)
On Usability:
"If you're using something like a scatter plot or a tree map or a Sankey diagram, then it gets a lot more important to make sure that you either know that the people that you're showing this to know what that is or that you provide enough context." (26:27, Robert)
On Power of Modern Tools:
"There's a lot that we can do there and especially with interaction like, because we can have the data right here." (30:08, Robert)
On Recommender Systems Limitations:
"They build charts that may be useful, but whether they're actually useful or not is a totally different question." (42:12, Robert)
| Timestamp | Segment |
|-----------|---------|
| 01:15 | Background: Transition from academia to industry |
| 06:41 | Motivation and philosophy behind data visualization |
| 09:46 | Practical vs. research perspectives on visualization |
| 12:12 | Balancing aesthetics and utility |
| 14:34 | Unexpected uses/case studies (e.g., treemaps) |
| 18:06 | Common missteps: overemphasis on “glitz” and animation |
| 19:58 | Design trends: gradients, minimalism, retro styles |
| 21:47 | Observable’s data traffic visualization case study |
| 24:54 | Visualization vs. statistical pattern recognition |
| 26:27 | Usability challenges and the annotation layer |
| 30:08 | Advances in compute enabling richer visualizations |
| 33:48 | Mainstream adoption of maps |
| 35:56 | Eager Eyes blog purpose and popular content |
| 39:09 | Unsolved problems: generative AI's impact on visualization |
| 44:41 | D3 origins, evolution, and Observable Plot |
| 46:56 | Modular D3 and new layouts/maps |
Robert and Sean conclude by touching on the future evolution of visualization tools, the integration of AI/ML, and the value of clear, accessible communication in data-centric roles. The episode is both an expert masterclass and a practical guide to the changing landscape of data visualization.
This summary distills the heart of the episode for both practitioners seeking technical insights and newcomers interested in the evolving world of data visualization.