#btconf Munich, Germany 15 - 17 Jan 2018

Nadieh Bremer

Nadieh Bremer is a graduated Astronomer, turned Data Scientist, turned self-taught Data Visualization Designer. After working for a consultancy & fintech company where she discovered her passion for the visualization of data, she's now working as a freelancing data visualization designer under the name “Visual Cinnamon”. She focuses on uniquely crafted (interactive) data visualizations that both engage and enlighten their audience. Secretly, she wouldn't mind venturing into data & generative art as well.

Data Sketches: A Year of Exotic Data Visualisations

“data sketches” was a year-long collaboration between Nadieh Bremer and Shirley Wu, both freelancing data visualisation designers. Each month they chose a topic and visualised it in an overly elaborate & geeky manner. But besides sharing the end result, they also wrote extensively about the creation process.

In this talk, Nadieh will share her most important lessons learned in the fundamental areas of data, sketching & coding the visual onto the screen. About how some months became favourites, what mistakes were made along the way, and how they were overcome.

She’ll especially highlight that many visualizations had humble, ugly duckling beginnings, but that through many (embarrassing) iterations they were turned into unique and, hopefully, compelling results. And finally, she’ll touch upon the impact & benefit of collaborating heavily with someone (halfway across the globe) for a year.

Transcription

[Music]

Nadieh Bremer: Hi, everyone. I don’t actually have that much trouble saying no to things that I don’t think I’ll enjoy; at least I think I am. Saying no to things that seem awesome, I’m exceptionally bad at, and that’s why I’ve had a to-do list with personal projects beside my laptop since December of 2014 when I discovered my passion for the visualization of data.

I do cross things off every now and then, but new items appear just as fast. But, in hindsight, I don’t mind this permanent lack of having enough free time left to actually get bored. I think they really help me find opportunities that I didn’t even know I was looking for, and they also played a really big role in me being able to pursue a freelance career three years later.

My favorite personal project is the one that I want to talk to you about today. It is a yearlong collaboration that I did with Shirley Wu, who is, like me, a freelancing data visualization designer, although she’s based in San Francisco and I’m from Amsterdam. To set the scene, a bit of background. Shirley and me first met virtually in a data vis Slack channel and then met in real life a few months later at OpenVis Conf where we both had the honor to speak, and we really hit it off during those three days.

A few weeks later, I was publishing tutorials about the different aspects of my talk, and Shirley really jumped on them and started asking me all kinds of questions. Somewhere during those chats, we started lamenting the fact that we hadn’t created as many more advanced data visualization projects in the previous year as we wanted to. We’d been busy doing other things: tutorials, workshops, blah-blah, blah.

Somehow, out of the blue, Shirley asks me, “Well, do you want to collaborate and create stuff?” I think it took me mere seconds to reply with an all capital YES! That’s how Data Sketches was born.

In the following week, we figured out that we both like the idea where we would create a visualization each month around a specific topic and do that for a year to see how two people would create two visuals starting from the same seed, but then diverge again on different paths based on our own interests and history. Besides sharing this end result, we also really wanted to write about the creation and design process. We split that up into the three pillars that we find most important: data, sketching, and coding.

Initially, we thought we could pull data sketches off with about five to six hours a week. But, as usual, real life really doesn’t care about plans, especially coding plans. Since starting in July of 2016, we’ve clocked many, many hours into creating our visualizations each month. During this talk, I’d like to take you through some of the lessons that I learned, challenges faced, and the insights that I gathered along the way.

To start off with the most fundamental aspect of data visualization, the data itself. We often get the question, “How did you find your data?” It’s not typically the data that leads us. It’s the topic of each month that first provides a spark of an idea, an insight that we might want to reveal and how we would visualize that.

For November, for example, the topic was books - pretty broad. But, I really wanted to focus on fantasy books and, even more specifically, the themes and titles of fantasy books. That gives you something to search for. After I know that more concrete angle, I do nothing more special than just Google the Web using my more specific idea combined with the words “data” or “dataset” and then have the patience to click on every link on the first two or three pages of results.

This has led me to Google spreadsheets containing thousands of rows of Olympic medal winners or GitHub repos with wonderfully unique datasets such as one about all of the words spoken in the Lord of the Rings movies, and another one that contains a family tree of 3,000 people connected to royalty going back more than a millennium.

I’ve also learned that website design says nothing about the quality of the data it shares. There are also websites that contain structured information, but not in some ready to download format. Instead, I have to scrape the logical layout of the website and put the information that it contains into a file with the help of some code.

It’s not time for the music yet.

Audience: [Laughter]

Nadieh: [Laughter] IMDb has an advanced search that returns a list of movies. Each of these movies is contained within the same set of divs and other elements in terms of styling. I can, therefore, download the HTML structure and then use a script to search for all of the results that follow a certain styling. An easy example, all of the movie titles could be contained within a div with the class “title”.
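As a rough illustration of that scraping step, here is a minimal sketch in Python using only the standard library. The talk does not say which language or library was used, and the sample HTML, the `TitleScraper` class, and the `title` class name are assumptions made up for this example, not the real markup of any site.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Collect the text inside every <div class="title"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while we are inside a matching div
        self.titles = []
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1  # a div nested inside a title div
        elif "title" in classes:
            self.depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.titles.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


# Hypothetical, hand-written HTML standing in for a downloaded results page.
sample_html = """
<div class="result"><div class="title"><a href="#">The Fellowship of the Ring</a></div></div>
<div class="result"><div class="title"><a href="#">The Two Towers</a></div></div>
"""

scraper = TitleScraper()
scraper.feed(sample_html)
titles = scraper.titles
```

The same idea carries over to any page where the items of interest share a tag and class; only the matching rule in `handle_starttag` would need to change.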

Another example could be that, on Amazon, I was able to find the list of the top 100 best-selling fantasy authors. Well, there are also APIs from which you can request information, but I have to admit I don’t often use these because they can be a bit of a hassle to set up. Nevertheless, the wealth of information can sometimes be too good to ignore. Using those 100 names that I scraped from Amazon, I then used the Goodreads API to request the information of the top 10 best-rated books of each of these authors, gathering info about the number of ratings, the average rating, and of course the titles of these books.

Another way to actually get data is to just ask others for advice. For April, our topic was community. I really wanted to do something about earth, something that would kind of fit the World Wildlife Fund, but I couldn’t get my angle any more concrete than that, so I just asked on Twitter, and I got a lot of interesting links, and then I started browsing, which led me to new links and so on and so on, until I finally came across this. What you see here is the greenness as measured from space on a particular week in June.

When I saw this, I knew I’d found my hook. I wanted to visualize this same information, but then animate it smoothly throughout the year and try my own spin on the visual styling, which eventually turned it into this result. Although it proved technically quite complex because it was a lot of data, the visual itself is very minimalistic. I think that a title and a simple legend are enough for most to understand what they’re actually looking at.

Finally, you can also create a dataset completely manually; no code required. For our nostalgia month, I dove into Dragon Ball Z. I’ll get back to the visual later, but I could find these lists on the Dragon Ball wiki pages that contained all of the fights that had happened during this anime. I just copied and pasted these lists into Excel and split them apart into the characters and other information. It took me about two hours for 200 fights, which I still think is faster than if I tried to create a script to handle all the nuances in the text of these lists.

Or when, really, I really tried, and I spent two hours finding a proper data set on the characteristics of butterfly species. In the end, I had to resign myself that the best option seemed to be contained within a website called Gardens with Wings. Again, I felt that scraping a sort of semi-structured website would take more time than the manual approach, so I just clicked through every option and put the information that I needed into a final file of 87 different butterfly species.

What I hope to have shown you with all of these examples is that there is not just one specific way to find data. It’s not just hardcore data analysts that have these magical skills. Data can be found in so many different ways from googling and finding the straightforward CSV or Excel file to scraping it or even creating it yourself. But, be aware that you often still have to do some manual adjustments in preparation to get it into the right shape that is needed for your visual.

To get a bit deeper, then, into what those kinds of adjustments can be, during August our theme was the Olympics because we are both big fans and it made a lot of sense. I ended up visualizing all 5,000 gold medal winners since the very first games in 1896. Each of these circles here is a group of similar sports. We have watersports and ball sports, and each slice or feather within a circle.

Wow. That went really fast. Let’s try that again. That’s better.

All right, so each feather is an actual sport. From the circles, we have the first edition on the inside going out to 2016. Then on the reddish background, we have the female events. On the blue background, we have the male events. Finally, each medal itself is given the color of the continent in which the country lies that won, so Europe is blue, Americas is red, Africa is black, and so on.

I found the data for this piece from two articles published by The Guardian for the 2012 games in London. After getting a rough shape of the visual on my screen, I noticed that some very obvious medals were missing from 2012 like hockey. Suddenly, my confidence in this data set had dropped drastically, even coming from such a respectable source. I had to get a sense of the overall accuracy, but I didn’t want to have to manually check all 5,000 medal winners, so instead, I found a proxy.

On Wikipedia, I could find lists that contained the number of events that had occurred during each edition, which I then compared to the number of gold medals I had in my dataset. If there was a discrepancy, I investigated further to figure out where and why. That’s how I found out that for some of the editions, the horses were also in the dataset, which makes for an interesting read to suddenly see Princess, Sissy, and Lady Mirka as women winning gold in the Olympics.

Audience: [Laughter]

Nadieh: Well, eventually, I managed to figure out each discrepancy and make adjustments to get it to the point where I trusted my data again. The lesson here was that you should really get a sense of accuracy and completeness of your data because missing data can be harder to find than wrong data sometimes. You don’t have to check every value but think about taking sums, averages, and counts, and comparing these to plain common sense. Can it be higher than 100%? Does that make sense? Even better, compare them to a different data source. Next, going down a bit deeper into data preparation and how this can be connected to creating more exotic or unique visualizations.
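The count comparison described for the Olympic data can be sketched roughly as follows. This is a Python illustration with invented rows and invented reference counts, not the actual dataset or the script used for the project. It also simplifies by treating one row as one gold medal, which would not hold for team events.

```python
from collections import Counter

# Hypothetical rows: (edition year, event, winner).
medals = [
    (2012, "100m men", "A"),
    (2012, "100m women", "B"),
    (2016, "100m men", "A"),
    (2016, "100m women", "C"),
    (2016, "Eventing", "D"),
    (2016, "Eventing", "Horse of D"),  # a horse listed as a winner
]

# Hypothetical reference counts from a second source: events per edition.
expected_events = {2012: 3, 2016: 3}


def find_discrepancies(rows, expected):
    """Return {edition: (found, expected)} for editions whose counts differ."""
    found = Counter(year for year, _event, _winner in rows)
    return {
        year: (found.get(year, 0), n)
        for year, n in expected.items()
        if found.get(year, 0) != n
    }


discrepancies = find_discrepancies(medals, expected_events)
```

An edition with too few rows hints at missing medals, one with too many hints at extra entries such as the horses; either way only the flagged editions need a manual look.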

The standard bar and line chart are so straightforward and normal that you can create them with any tool that lets you do data vis. You simply supply the data and the tool or program does the visual heavy lifting for you, creating axis, scales, and so on. But, the more that you start to deviate from these standard charts, the more that you’ll have to supply other aspects of your visual as well, such as where on the screen should the data be placed. In essence, the Olympic Feathers Project is really just a whole lot of rotated and stacked bar charts. They all follow the same structure and concept, but they also depend on each other.

At first, I tried to calculate all of the initial rotations of the circles and feathers in JavaScript. But, after having written 30 lines of code and still not achieving something I knew I could do in two lines in R, which is my favorite data prep and analysis tool, I just pulled all of these calculations into R as well. Even if they were so-called visual variables, by which I mean that they have nothing to do with the data, but only with how it’s laid out on the screen. I precalculated the initial rotation that each of these circles would need to have so that eventually the center would be at the bottom, and how far each of these inner slices would have to rotate based on their predecessors.

The only placement variable that I kept calculating in JavaScript to keep it dynamic was the year scale from the center outward because then I could shrink and expand the circles based on their screen size. But, even each medal’s offset from the center was something I calculated beforehand. Even if they have nothing to do with the data, it’s perfectly fine to pre-calculate these visual variables and attach them to your data.

Sometimes, well, it’s actually quite useful for fixed data sets. The reason is that sometimes it’s just way easier to calculate these values in a different tool than the tool that you end up visualizing it with, such as JavaScript or Tableau. For those visuals that end up in the browser, it can save you a lot of browser calculations, making the visual faster to load. A personal benefit is that it makes for a much more readable JavaScript file.
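To make the idea of pre-calculated visual variables concrete, here is a small sketch. The talk did this in R; the Python below, the slice widths, and the choice of 180 degrees as "bottom" are all assumptions for illustration. Each slice's rotation is the sum of the widths of its predecessors, shifted so the whole group is centered on one angle, and the result is written out next to the data.

```python
import csv
import io
from itertools import accumulate

# Hypothetical angular widths (in degrees) for the slices in one circle.
slices = [("swimming", 40.0), ("diving", 20.0), ("rowing", 30.0)]


def add_rotations(slices, center_angle=180.0):
    """Attach a start angle to each slice so the group is centered on center_angle."""
    widths = [w for _name, w in slices]
    total = sum(widths)
    # Offset of a slice = summed widths of all slices before it.
    offsets = [0.0] + list(accumulate(widths))[:-1]
    start = center_angle - total / 2
    return [
        {"name": name, "width": w, "rotation": start + off}
        for (name, w), off in zip(slices, offsets)
    ]


rows = add_rotations(slices)

# Store the visual variables alongside the data, ready for the browser to load.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "width", "rotation"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
```

The drawing code then only has to read a `rotation` column instead of recomputing it on every page load.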

Since starting, Shirley and me have filled many pages of our notebooks with sketches because it helps us think and lay out ideas beforehand. My sketches are often very simple, only focusing on the main abstract shape that I want to fit my data into. Colors, layout, and details, these are things I only vaguely think about but don’t act on until I have the data on my screen. There’s no use spending any time thinking about these things until I’ve figured out that the shape will actually work once I’ve put the data into it.

For the Olympics piece, I was inspired by the shape of a peacock feather, placing emphasis on the more recent editions. But, I had no idea if that would look alright once I finally placed all 5,000 medals together, so I had to see if the general shape would work before moving on. It took a few tries to get it to the point where, actually, things started to make sense, but I saw that luckily it did show potential with the actual data.

Networks are quite tricky. You can’t really create an abstract shape any more complex than just circles connected by lines because, for networks, the visual form that works best is so dependent on the actual connections that are in your data. For October, I went into royalty. I’ve always been intrigued by how intermarried the royals really are. Are they all cousins twice removed?

Luckily, I found this genealogy dataset that contained a gigantic family tree of the European royal houses. It was from 1992, so I had to add one or two more generations in the main line of succession, which was a fun night on Wikipedia. [Laughter] But, here is the end result with the 3,000 people. The current royal leaders are the bigger circles, and everybody is connected to their parents, their partner, and their children.

Oh, come on. Real quickly.

You can hover over a person and then see how far their six degrees of separation reaches into this Web.

It’s just not working right now. I feel just hating my laptop. They should go a bit faster.

Anyway, you can click on a person, and any other person, and then see sort of how many jumps it takes to get from one person to another person because this entire Web is connected, so everybody in the European royal family is family. Anyway--

But, when I started out with this dataset, I had no idea what it contained, so I just placed all of these people on the screen using the most basic network settings, and then this happened. It’s sort of an explosion of points and lines moving out of my screen. All right, let’s rein in that gravity a bit because these networks are often simulated by using gravity.

Then I was left with something that, in network analysis, is called a hairball. It’s quite useless. Well, maybe I can color all of these people by year of birth. [Laughter] That wasn’t really helping either. But, thankfully, in a browser, you can have gravity depend on a variable, so I pulled the Web apart by year of birth as well, which was a little bit more interesting, but it was still a rather un-insightful bundle.
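The idea of letting the pull depend on a variable can be sketched in a very reduced form. The Python below is not D3's force simulation and leaves out the link and repulsion forces entirely; it only shows the positional part, where every node is nudged a fraction of the way toward a target derived from its birth year. The year range, canvas width, and strength are invented values.

```python
def year_to_x(year, first=1000, last=2000, width=800):
    """Linear scale from birth year to a horizontal pixel position."""
    return (year - first) / (last - first) * width


def pull_toward_targets(positions, targets, strength=0.1, steps=100):
    """Repeatedly move each position a fraction of the way to its target."""
    positions = list(positions)
    for _ in range(steps):
        positions = [x + (t - x) * strength for x, t in zip(positions, targets)]
    return positions


birth_years = [1100, 1500, 1900]        # hypothetical people
start = [400.0, 400.0, 400.0]           # everyone starts in the middle
targets = [year_to_x(y) for y in birth_years]
final = pull_toward_targets(start, targets)
```

In a real network layout this pull competes with the links between people, which is why the result is a stretched-out web rather than a neat timeline.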

At this point, I had already invested several hours into playing with the network settings, trying different kinds of connections, and adjusting my data. I was really ready to just give up and try a different angle like how much are the royals spending these days or something. But, I gave it one last shot, and that’s when I decided to focus on the current royal leaders.

I placed these in a line, and then I let the vertical gravity depend on which of these leaders you are most closely related to. That’s when I finally saw it - insights. For example, that the Queen of Denmark, who is moving around here somewhere, is actually very essential to the Web, whereas the Prince of Monaco line, which is appearing here now somewhere, his line is separated from the rest of Europe almost 200 years before. We’ll get there.

It was only around this time that I finally started worrying about more of the design aspects. Networks often remind me of constellations and, with my astronomy background, I have a bias for all things space. That’s why I turned it into a starry night. But, I could have never designed this visualization beforehand in Illustrator. I had to go hand-in-hand with the actual data and apply my design choices to all of the data simultaneously to make sure that the final result was both engaging and insightful.

But, getting back to actual sketching, for September I made my most personal data visualization ever. Our topic was travel, and I thought, well, I’ll make a visualization of all of the vacations I’ve ever been on. With the help of my parents and browsing through all of my analog childhood photos and my travel journals, then I managed to create a list with a variety of info about my vacations.

Initially, I wanted to do something very simple. Just a row for every year that I’ve been alive and then colored blocks on the periods when I was on vacation where these blocks would be decorated to give an idea of, well, where did I go, who I was with, how much did I enjoy it, and so on. I drew the sketch around that idea. But, when I looked a little bit more closely to the sketch, I suddenly noticed that my mind had glossed over something rather important. Namely that I’m only on vacation for a max of four to five weeks a year, which isn’t even 10% of a year. I mean that isn’t bad, in general, but in this sketch it looks more like I’m on vacation for a quarter or half of the year.

If I were to create this visualization with the actual data, it would be mostly empty, which wasn’t what I had in mind. So, I drew some new sketches and, in the end, came up with a rather unusual approach. I decided to squish any month in which I hadn’t been on vacation. That would make it particularly hard to compare months across the years, but that wasn’t my main point anyway.
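A minimal sketch of that squishing idea, assuming a simple left-to-right layout: months with a vacation get a full-width slot and all other months a narrow one. The two widths and the example year are made up, and the actual piece may have been laid out quite differently.

```python
def month_positions(has_vacation, full_width=30.0, squished_width=5.0):
    """Return (x, width) per month; months without a vacation get a narrow slot."""
    positions = []
    x = 0.0
    for active in has_vacation:
        width = full_width if active else squished_width
        positions.append((x, width))
        x += width
    return positions


# Hypothetical year: vacations in February, July, and August only.
year = [False, True, False, False, False, False,
        True, True, False, False, False, False]
layout = month_positions(year)
total_width = layout[-1][0] + layout[-1][1]
```

Because every year gets its own set of offsets, the same month no longer lines up vertically across years, which is exactly the trade-off mentioned above.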

I wanted to visualize trends in my vacations themselves, so mostly sun-driven in early childhood to culture in my teens, and nature these days. By first sketching things out, it made me realize that my initial idea wasn’t going to work and it guided me towards this new approach. Here is the final result of 30 years of my vacations, but I have to admit that my favorite part of this month was just reminiscing with my parents over old photos and things we remembered from those vacations.

As another example for the Olympics piece, after seeing an image of a peacock feather, I started drawing some feather shapes. Initially, I filled them completely with color. But, while sketching this, I suddenly remembered. Wait a minute. The events, they change from edition to edition, so how would that actually work?

The next day, I tried to explain my rough idea to a friend. Again, while sketching out parts of it, I stumbled upon more illogical thinking errors that my mind was just glossing over. Only by drawing these shapes several times, catching my thinking errors and trying it again, did I get to a shape that made sense on paper. Instead of going straight from the data or idea in your head to a computer, draw out your design on paper first. It’s the ideal way to quickly catch thinking errors, and you don’t have to be some sort of artist to do data vis design. It’s mostly circles, rectangles, and simple curves anyway. The thing is, if you can’t make it work logically on paper, it will definitely not work on a computer with the actual data, and it’ll save you lots of hours.

Well, another thing I tried to investigate while sketching is how to add extra details, how to add more context around the main insight that I want to convey. For example, even though the Olympics piece is already pretty high in the number of data points, I couldn’t resist adding information about the Olympic and world records because every athlete there tries to break at least the first if not the latter. I placed a small white dot on each medal that resulted in a record. Here we have Usain Bolt’s 200-meter dash in Beijing, for example.

A way for me to think about adding extra details is to think about the visual channels that are still free after I have the main chart standing. Let me explain that a bit more with another example. For the past 19 years, during exactly one week of the year, more than half of the Netherlands listens to the same radio station. I know we’re a small country, but that’s still pretty unique.

Nevertheless, this happens during the final week in the year when the 2,000 best songs ever are aired, counting down to the new year. It’s quite a thing for a Dutch person, so I asked Shirley if our topic for December could be music, so I could tackle these 2,000 songs. I wanted to visualize what decade was most popular in terms of song release year.

Here we have the base visual where each song is a circle, and they’re clustered to sit at the year of release from the ‘60s until today. The size represents the position in the top 2,000, and the darker the color the higher the position it reached in the weekly top 40s. But, you can see sort of that the most popular decade is ‘70s and ‘80s, and people are sort of trying to forget the early ‘00s, apparently.

There is so much more information in this data that can supply contexts such as song or artist name. But, to be honest, I also found the current visual to be a bit boring, so I wanted to add more. Then I think about, well, what visual channels do I still have? Well, here I could add a stroke to these circles or have an extra mark on top or use annotations.

I drew a sketch, a big sketch with some vague ideas of how that might look. Because, in 2016, sadly, David Bowie and Prince died, I wanted to mark all of their songs with a stroke and then tell the people how it changed with respect to the list of 2015. But, I could also use the rankings of these 2,000 songs in different ways such as, who was the most popular band; what song from 2016 is most popular; which one is the highest riser or newcomer; or to single out that Pokémon song. It was the year of Pokémon Go. Or, to even mark the top ten songs more clearly by adding something extra on top.

By adding these extra bits of information, I felt that the visual could be more fully understood, enjoyed, and it made for a visually more interesting piece as well. Even if your chart is making the main insight of your data clear to your audience, try and think about adding extra details. Use remaining visual channels to add new variables that can supply context, and which can give the truly interesting reader even more ways to dive into and understand the information.

As expected, most of our time is really spent on getting the data on the screen. Here are some of my perhaps less obvious coding lessons. For our very first month, the topic was movies. It was pretty clear to me I wanted to do something with the Lord of the Rings, which is my favorite trilogy. I thought that with the popularity of these movies that there would be loads of data to be found online, which didn’t turn out to be true at all. But, it did contain one true gem of a data set. Somebody had created a set that contained the number of words spoken by each character in each scene of all three extended editions of the Lord of the Rings. How amazing is that?!

Audience: [Laughter]

Nadieh: I knew! I knew I had to do something with that. [Laughter] Thinking a bit, I thought, well, wouldn’t it be interesting to see how many words each member of the fellowship spoke at the different locations in the movies? But, you may be able to see--well, maybe not--there is actually -- this is the data set, but there is actually no location information in there. So, with the help of the scripts found online and my own memory, having seen the movies way too often, I manually added location information to all 800 rows of the fellowship members. Yeah. [Laughter]

Audience: [Laughter]

Nadieh: I drew some sketches and, eventually, I came to this idea where each member of the fellowship would be placed in the center, and the locations would be spread around them in a circle. Then they would be sort of connected by these strings where the width of the string on the outside represents the number of words spoken by that character at that location.

Sadly, this chart form doesn’t actually exist, so there’s no tool that can help me do it. But, it reminded me of a chart that did exist called a chord diagram. I thought, well, can I sort of try and see if I can transform a chord diagram into my sketch? Here we have a basic, stripped of all text, chord diagram.

The most fundamental thing to me was to see if I could somehow make these chords flow towards the center instead, which eventually took less time than anticipated, which is pretty rare for me in coding. Well, getting the actual data in there from Lord of the Rings, some more appropriate colors, and since we have nine members of the fellowship, I’m making sure that the central portions end up at the right location. But, this was looking way too squished, so maybe I can pull them apart.

Okay. That worked, but now these strings, especially at the top and bottom, were looking a bit odd. That’s when I finally decided to learn how to create SVG paths because practically all of my visualizations are created using D3, which creates SVGs. That is the thing that took the longest during this project to actually learn, but it was quite fun. These look more natural, I felt.

That’s how, finally, I ended up with this new chart that was mutated from D3’s chord diagram. Because we have so many of these strings visible, I thought, well, I need to have some sort of interaction, so we can hover over a location and see the people that spoke there or, vice versa, if you hover over one of the characters, you can see what locations he spoke. But, I also added a small piece of insight that I’d found for this character from this data. My favorite one is actually this one where Boromir, who is really only alive during one movie, still manages to speak more than Legolas does in three.

Audience: [Laughter]

Nadieh: I didn’t know that beforehand, and I don’t even want to mention how long it took me to try and find the Elvish translations of these locations, and I still don’t know if they’re correct. [Laughter] It’s bothering me.

Many people have done amazing things that you can use. Even if you think you’re creating something new, you don’t always have to start from scratch. Just try and find the thing that most closely resembles your idea or design and start adjusting that, you know, remix what’s out there already.

I guess the thing that I learned most about during that year was SVG paths and how they can be connected to creating so many more possibilities on how to shape your visual, from a simple curve instead of straight lines in my royalty network, to sweeping arcs in my visualization about fantasy books, to the feather shapes that I really, really created for the Olympic Feathers Project, but that never made the final cut, and the strings that we saw just now with Lord of the Rings, and these--I don’t even know what to call them--readily flowing lines connecting the loops between two circles. If anybody has a shorter thing than that, I’d love to hear.

Another project that is very much based around custom SVG shapes has to do with our nostalgia one. I decided to dive back into something that I was crazy about during my teens, Dragon Ball Z. For those that sadly don’t know it, Dragon Ball Z is an anime that revolves around fighting. I thought, well, wouldn’t it be fitting to actually make a visualization that shows all of the fights that occurred during this anime, and to show who was fighting whom and what state were they in, Super Saiyan, and was there anything special about the fight?

Here we have the base of the visual. Each cluster of circles is a fight. They are ranked in order of occurrence, and from left to right is the sagas, which is sort of similar to story arcs or seasons in a way. To more easily follow a character from fight-to-fight, I wanted to connect their fights by a line. Since a straight line doesn’t really make any sense here, I started out with a collection of so-called quadratic Bezier curves that gives you the option to pull on these sort of line sections by moving anchor points. To make it visually more interesting, I pulled harder on these anchor points when the distance between two fights was farther apart.
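A rough sketch of such a connecting curve, written in Python purely to build the SVG path string. It uses the SVG `Q` (quadratic Bezier) command and moves the control point sideways from the midpoint in proportion to the distance between the two fights. The `pull` factor and the coordinates are assumptions, not values from the project.

```python
import math


def quadratic_curve(x0, y0, x1, y1, pull=0.3):
    """SVG path for a quadratic Bezier whose bulge grows with the distance."""
    dx, dy = x1 - x0, y1 - y0
    distance = math.hypot(dx, dy)
    if distance == 0:
        return f"M{x0:.1f},{y0:.1f}"
    # Midpoint of the straight line between the two points.
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    # Unit vector perpendicular to that line.
    nx, ny = -dy / distance, dx / distance
    # Shift the control point sideways, further for longer connections.
    cx, cy = mx + nx * distance * pull, my + ny * distance * pull
    return f"M{x0:.1f},{y0:.1f} Q{cx:.1f},{cy:.1f} {x1:.1f},{y1:.1f}"


short = quadratic_curve(0, 0, 0, 100)
long = quadratic_curve(0, 0, 0, 400)
```

The resulting strings can be used directly as the `d` attribute of an SVG `path` element.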

Then I got some great advice to actually use the side of a fight to denote good and bad guys. Good guys swoosh on the left side and bad guys on the right, which then shows you that, for this character, for Vegeta, he started out as a bad guy, moved around a bit, and then mostly became one of the good guys. So, doing this for all of the characters and, actually, this definitely showed me insights into characters that I’d never really realized, but I just wasn’t liking the single thickness lines. They weren’t conveying the dynamic nature of these fights.

Instead, I thought, I need to create a shape that I can fill with a color that sort of mimics a stroke of varying thickness. As usual, I draw that out on paper, and then I try and deconstruct it in terms of its mathematical or, in this case, SVG path elements, which here came down to flipping the path back up again and using different amounts of swoosh. [Laughter] That’s a technical term.
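One way to read "flipping the path back up again" is a closed shape made of two curves between the same end points, each with a different bulge. The sketch below is an interpretation rather than the project's actual path code; the `outer` and `inner` offsets are invented, and it assumes a mostly vertical connection so the control points are only shifted horizontally.

```python
def swoosh_path(x0, y0, x1, y1, outer=60.0, inner=40.0):
    """Closed SVG path: out along one curve, back along a flatter one."""
    mx, my = (x0 + x1) / 2, (y0 + y1) / 2
    return (
        f"M{x0:.1f},{y0:.1f} "
        f"Q{mx + outer:.1f},{my:.1f} {x1:.1f},{y1:.1f} "  # outgoing curve
        f"Q{mx + inner:.1f},{my:.1f} {x0:.1f},{y0:.1f} "  # flipped back to the start
        "Z"
    )


path = swoosh_path(0, 0, 0, 200)
```

Filling this path gives a shape that is pointed at both fights and thickest halfway, where the gap between the two curves is half the difference between `outer` and `inner`.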

Audience: [Laughter]

Nadieh: Now, with that implementation, I felt that the lines connecting these characters between fights became a lot more visually interesting and fitting. I’ve been talking about SVG paths here because they are so central to the work that I do, but the main lesson is that you should really thoroughly understand your favorite or the main tool that you’re working with if you want to go beyond the examples. I was using D3 for two or three years before I finally took the time to learn how these SVG shapes are actually created. It opened up a whole new world for me in terms of creativity and possibilities, so I wish I’d done that sooner.

Another thing we had a lot of fun with during the year was math. Here is my addition to our collaboration with Google News Lab. Being a non-native English speaker, I wanted to focus on translations. What do other languages want to have translated into English by using Google Translate, from the most translated word of ten chosen languages to the top ten for these languages and, finally, the similarities between these languages?

For the first visual, I wanted to string together these most translated words by a swirling path that would represent the 100 most translated words overall. Preferably, I wanted something organic looking like this. But, for the life of me, I couldn’t figure out how to create something that would mathematically create a similarly organic swirl and that would update and be responsive to both desktop and mobile sites.

Eventually, I went with the idea of beads on a string slowly zigzagging down. Of course, I still end up making errors in my mathematical functions on paper that reveal themselves once you’ve coded it out, thankfully. But, eventually, I got the layout working. Now, using screen size, I can either have it be four, three, or two beads wide.
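
A rough Python sketch of such a layout, assuming the string simply runs left to right on one row and right to left on the next. The breakpoints in `beads_per_row` and the `spacing` value are made-up numbers for illustration; the talk only says the layout is four, three, or two beads wide depending on screen size.

```python
def beads_per_row(screen_width):
    """Pick how many beads fit side by side (breakpoints are assumptions)."""
    if screen_width >= 900:
        return 4
    if screen_width >= 600:
        return 3
    return 2


def zigzag_positions(n_beads, screen_width, spacing=100):
    """Return (x, y) positions for beads on a string that zigzags downward.

    Even rows run left to right and odd rows run right to left, so each
    bead stays next to the previous one along the string.
    """
    per_row = beads_per_row(screen_width)
    positions = []
    for i in range(n_beads):
        row, col = divmod(i, per_row)
        if row % 2 == 1:
            # Reverse the direction on every other row.
            col = per_row - 1 - col
        positions.append((col * spacing, row * spacing))
    return positions


# Six beads on a narrow (mobile) screen: two beads wide.
print(zigzag_positions(6, screen_width=500, spacing=10))
```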

The last visual on the page also proved to be a rather interesting math and logic puzzle itself. In it, each word that two languages have in common in their translation in the top ten is represented by a line. Initially, I wanted to have all of these visuals be completely built up out of the words themselves, so I wanted to replace these lines with the words they represented. But, once I finally had that working, I immediately saw that it was one big mess, so I had to compromise and eventually only place these words on the lines connected to a central chosen one.

But, it was really important to make sure that these words were placed in the most upright manner possible for readability. That led to some quite convoluted calculations for the text paths on which they are drawn, which became glaringly obvious when you clicked one of these languages to move it towards the center. Not quite what you might expect to be going on there with those lines. I again went to my notebook, and it took a few more pages to figure out how to implement a solution. But, eventually, I got there, and now it should be working the way you might expect it to work, although not as smoothly in my presentation.
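
The simplest version of the upright-text problem can be sketched as follows, in Python and for straight lines only. Text placed along an SVG path follows the direction in which the path is drawn, so a line drawn from right to left carries upside-down text. The function name `upright_text_path` is hypothetical, and the calculations in the actual visual were more involved than this.

```python
def upright_text_path(x0, y0, x1, y1):
    """Return an SVG path for a straight line that always runs left to right.

    If the line would be drawn from right to left, the end points are
    swapped, so text placed along it reads left to right and is not
    upside down.
    """
    if x1 < x0:
        x0, y0, x1, y1 = x1, y1, x0, y0
    return f"M{x0:.1f},{y0:.1f} L{x1:.1f},{y1:.1f}"


# A line from the center to a language on the left gets flipped.
print(upright_text_path(100, 0, 0, 50))
```

Swapping the end points like this is also the kind of change that can make an animated transition jump, since the start and end of the path trade places.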

What I’m doing really is more of a hack. When somebody clicks on one of these outer languages, I fade out the words and then I immediately replace all of these lines with their final state, but reverse engineer it to look like the initial state, and then they smoothly transition to the final state. The lesson here is very simple. Learn to love math and especially geometry because they are often your best friends in finding solutions to these visual problems.

I want to end with one of my favorite lessons. During February, our topic was nature. I’d always wanted to do something more along the lines of generative or data art. The apparent randomness of nature felt like a perfect match. It also reminded me of butterflies, how their path also feels kind of random to me.

I wanted to mimic these butterfly-like paths across the screen and then use data from different butterfly species to guide the appearance. The species’ main color would give the color of the path. The species itself would define sort of the line style, and then the wingspan, the thickness of these paths. Then, using lots of semi-random number generators, together with inspiration from the works of Jared Tarbell and Inconvergent, I let my butterflies free across the screen.
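
As a loose illustration of a semi-random path styled by data, here is a small Python sketch. It is not the algorithm from the talk: it is a plain random walk whose heading drifts a little at every step. The function names, the `max_turn` and `step_len` parameters, and the example species values are all invented for illustration.

```python
import math
import random


def butterfly_path(steps, seed, step_len=5.0, max_turn=0.6):
    """Return a list of (x, y) points for a wandering, butterfly-like path.

    At every step the heading changes by a random angle in
    [-max_turn, max_turn] radians, and the walker moves step_len forward.
    A fixed seed makes the path reproducible.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    heading = rng.uniform(0, 2 * math.pi)
    points = [(x, y)]
    for _ in range(steps):
        heading += rng.uniform(-max_turn, max_turn)
        x += step_len * math.cos(heading)
        y += step_len * math.sin(heading)
        points.append((x, y))
    return points


def styled_path(species, steps=50, seed=0):
    """Combine a random path with styling taken from (made-up) species data."""
    return {
        "stroke": species["color"],                  # main color -> path color
        "stroke_width": species["wingspan_cm"] / 2,  # wingspan -> thickness
        "points": butterfly_path(steps, seed),
    }


# Placeholder values, not real butterfly data.
example_species = {"color": "#e69f00", "wingspan_cm": 6.0}
print(styled_path(example_species, steps=3, seed=42)["stroke_width"])
```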

This is the only month in which I made no attempt to make the data insightful. It’s really just creating something that is based on data, but that delights its audience, to keep them mesmerized as the screen fills up with more and more butterfly paths. Keeping your audience engaged and delighted is an important factor in keeping your audience in the visualization, especially for the more complex ones. You can do that in lots of diverse and subtle ways.

For example, I was once on a flight back to Amsterdam, so no wi-fi and I, therefore, couldn’t really do anything essential, so I just created an animated legend of my visualization about fantasy books more for fun. Other nonessential things that I’ve added are animated gifs of the most memorable moments in Dragon Ball Z, or having hovers with the lowest level of information that I had for music nerds in my top 2,000 visual, or turning the top ten songs into tiny vinyls, and having annotations of weird and silly events that happened during the history of the Olympic games such as Henry Pearce having to stop for ducks in the rowing event, but still managing to win gold.

Even though getting your data on the screen in such a manner as to make it insightful is key, it’s the other things you add, such as annotations and animations, weird legends, gifs, and more, that can make it truly unique and special, and even more of a delight to investigate. So, take some time to think about these aspects as well.

Then we had collaboration. Maybe one of the best things about it is that you’re not in it alone. Even though Shirley and I both created our visualizations separately, we shared lots of things throughout the process, from discussing initial ideas to sharing in the joy of finding an appropriate data set and, of course, sending across loads of photos and screenshots of our works in progress.

We started out as two people who sort of knew each other through Slack, and we had a good time at a conference. But, in the end, we made an incredible friendship during that year. We talk to each other practically daily now, and we even work together professionally on our bigger client projects.

If you are thinking of embarking on an ambitious project, I can definitely advise you to search for a partner. I think that partnering up will keep you going more easily. I was always motivated to work another evening on one of my projects when I saw Shirley’s screenshots of her progress, but I also didn’t want to let Shirley down, so I was way more motivated to stay on track.

It will be very useful if at least one of you is responsible, so at least you don’t both slack off. But make sure it’s someone that you respect and someone that you trust, or that you think you can learn to trust, because that is crucial in giving and receiving feedback. No matter how enthusiastic you might be about starting or how much fun you’re already having, at some point these ambitious projects will get hard and will require a level of dedication. Try again if you’re having a bad day, but at least try again because, with each new app and visual and demo that you create, you’ll learn new skills, and those are skills that you can develop even further on your next project. That’s why these personal projects are definitely worth the time investment, in my opinion.

Although I’ve been taking you along on a design journey through my lessons and visuals, Shirley also made a visualization each month with her own spin on a topic. Here’s a sneak peek of two of my favorites. The first one is her take on our movie month, called Film Flowers, in which she scraped IMDB for the top summer blockbuster movies for every year since she’s been alive and turned them into flowers. Each of these elements of the flower represents some value, from the size being the rating, to the color signifying the genres, and the petal shape being the age rating.

By using that technique, each movie can turn into its own unique flower. I love this project because it’s made such a wonderfully artistic output, but you can still sort of hang it on a wall and continue to compare movies across the years. That one, Batman & Robin, is her own personal favorite because it is so tiny. [Laughter]

The other one is Hamilton. Even though this was published for our books month in November, she had been working on this for three months prior to that. That’s how crazy she is about Hamilton. I can definitely advise you to read her write-up of this month as well to see her dedication to the data gathering alone.

As you scroll through this page, Shirley takes you through all of the lines of this musical. How are these people connected, and which songs do they sing together? She ends by explaining more about the character arc and the developments of, well, one character in particular. I love this project because of just a general sense of delight that you feel by scrolling through and interacting with the different elements on this page. I can definitely advise you to also search for her other projects because they are each a fun and unique and gorgeous take on our projects.

In that time, I learned that you can find data in the weirdest places, that it’s not blasphemy to recalculate visual variables, and that sketching helps weed out thinking errors, but that you can also sketch with code. That SVG paths are amazing, and math is too, but I really already knew that, of course. And that really small and subtle things can create an overall sense of delight for your audience.

We didn’t set out to be confronted with or learn all of these things. We just wanted to have fun. In that, we definitely succeeded. The thing is, actually, even though the year has already come and gone, we still have a month or two to go, so a few empty spots there. [Laughter] That’s because even though we were having fun, during the final stretch we noticed that we’d taken on too much, and we decided to take things a bit more slowly. So, if you want to, you can still follow us in our final month or two of coding and data preparation in managing to turn these topics into fun and weird and often overly elaborate data visualizations.

Thank you very much for your attention.

Audience: [Applause]