Transcription
Harry Roberts: In the first half of the talk, I want to discuss the moral importance of kind of global performance, Web performance kind of outside of the Western bubble. Then in the second half of the talk we’re going to dive into some techniques to actually start delivering faster websites. They aren’t going to be sort of small, kind of practical tips. They’re going to be fairly large, philosophical and almost like existential views and opinions that we need to change in order to start delivering reliable, fast, enjoyable, and inclusive--I guess, has been the theme for the last two days--inclusive user experiences. Performance is just as much about accessibility as it is about increased revenue or increased conversions. That’s what I want to try and focus on today.
Yeah, my name is Harry. I’m a consultant performance engineer from the north of England. I’m currently calling myself a performance engineer because that’s what I want to do, but my job title changes about three times a year. I work with typically quite large companies. I’ve been very, very, very fortunate in my career to work with some fascinating people, projects, [and] organizations.
I’ve been very lucky, indeed, but one particular company up on this screen right now is a company called Trainline. Trainline are a British retailer who sell train tickets. They’re moving slowly into European markets. Trainline hired me sort of the middle of last year to help them with some performance efforts. The Trainline reduced latency by just 0.3 seconds and customers ended up spending an extra 8.1 million pounds a year; 0.3 seconds led to an increase in revenue of 8.1 million pounds. It’s an absolutely fascinating project, and we’re dealing with absolutely phenomenal numbers.
Another company, not a client of mine, unfortunately: Netflix saw a 43% decrease in their bandwidth bill simply by turning on gzip. This is a bit of a problem, right? Really, what you should say is, “Netflix was previously wasting 43% of their bandwidth bill by not having gzip turned on.” But, here’s the fact of it. Forty-three percent is phenomenal; it’s an enormous number. Imagine saving 43% of a bill at Netflix’s kind of scale. We’re talking about enormous figures here.
GQ, the lifestyle magazine, they cut load times by a very impressive 80%. In return, they saw an 80% increase in organic traffic. I’ve kind of missed the word “organic” off of this slide, but the important thing here is this wasn’t paid for search traffic. This wasn’t social media driven traffic. Organic traffic went up by 80%. Out of that 80%, time on site increased by 32%, effectively one-third.
The previous two case studies dealt with money, making more money or saving money. This case study, I like to mention to any clients who aren’t online retail, who aren’t e-commerce because what this shows is that faster experiences lead to more engaged users, more loyal users. Also, I kind of hate to mention it, but GQ, they do actually run ads on their site. So, if you can increase ad impressions by 32%, ultimately you are going to make more money.
Each of these case studies was picked from a site called wpostats.com. Write this one down if you ever struggle to sell performance to your client or your manager, which is most of us. Who would like to build faster websites, but their client just doesn’t want to pay for it or their manager just isn’t interested in funding it? There’s got to be more than ten of you. I don’t believe that.
Tell people what they want to hear. This is a brilliant resource by Tammy Everts and Tim Kadlec. It’s full of resources, case studies, reports all pertaining to performance. You can kind of cross-reference things. You can say, “Show me every report from 2016 that dealt with increased conversions on e-commerce sites.” You can get a list of every applicable case study and use this as background information or proof to take to your client, to take to your managers.
The reason I picked these three particular case studies is because each one shows us something different.
- The first one showed us that performance efforts will make us more money. Every case study ever has shown that a faster website makes more money. There is not one case study that certainly I’ve ever seen that says users spend less on faster websites.
- It would also save us money. One really interesting thing I accomplished with Trainline is we reduced their bandwidth overhead by 62%. I think I definitely earned my keep that day because, just by focusing on performance, we reduced throughput by 62.5%.
- Thirdly, we know this already; it makes users happier. Nobody enjoys using a slow website.
I want to talk about something different. I don’t want to talk about the financial. I want to talk about less Western and less capitalist kind of worldviews. I genuinely believe there is a moral reason to build fast websites. It’s more than just, how much money can we make for our shareholders? It’s about, how can we best serve truly global audiences? How can we get our information in front of people who perhaps are less wealthy or have less access to the infrastructure that we do?
I want to start by telling you three different stories, three small anecdotes. One thing I want you to remember as I tell you each of these stories is that this didn’t happen five years ago or ten years ago. Each one of these things happened to me last year in 2017. These things happened recently. This is happening now.
The first one, a really simple one: I was running an event with a friend of mine. He runs a really nice coffee shop in my city. We were running a small event together, and I sent him an important email, and he just didn’t reply.
I said, “Oh, shit! We need to get this thing sorted.” I said, “Hey, look. Did you get my email?” I was that guy, “Did you get my email? Did you get my email?”
He just didn’t reply, so I ended up going into the coffee shop to see him in person and said, “Look. You haven’t replied to this email. Are we still going ahead?”
He was like, “I’m really sorry. I was on holiday in Thailand, and the data connection out there was so bad that I could get a push notification telling me I’d received an email, but the data connection was nowhere near strong enough to actually open my email to read it.”
I hadn’t sent a huge, 20-meg attachment. It was a plain text email. But, the data connection available to him in Thailand was not strong enough to open it.
The stories get better than that. That’s the intro. Don’t worry. They get better.
This one fascinated me. Someone sent me an email, and this happens quite often. People email me asking for advice, or tips and tricks, and whatever. I sent quite a long response to this person. I spent about two hours writing a long-considered reply, and he never got back to me. I was like, well, that’s kind of rude. I soon forgot about it and moved on with my life.
Two weeks later, he sent me a reply saying, “I’m really sorry for my delayed response. I didn’t mean to be rude. I’m on a really bad connection, so I couldn’t reply to you.”
I said, “Look. I will forgive everything. I will forgive you if you tell me everything you mean by a bad connection.”
It was fascinating. He said, “I’m currently at my parents’ place in Rajasthan, India. My parents don’t have a computer, so they only consume Internet through smartphones or through their smartphone. We rely on Internet services by telco providers, which in our town are still 2G. Some claim to be 3G, but I’ve never seen that working. Right now, I’ve connected my laptop via a wi-fi hotspot, and opening Gmail in the basic HTML version takes between 30 and 60 seconds. For any other sites, I tend to use mobile Chrome anyway because it makes preemptive data savings for me.”
Two fascinating things about this. I’ve had to open; I’ve had to use the basic HTML version of Gmail maybe five, ten times in my entire life. When I do it’s like, “Uh, it’s just the worst thing that’s ever happened to me.” This guy has to use it all the time and still takes up to a minute to open. How bad must that be for 100% of the time he’s using this light version, which still takes a minute?
The second thing, even more interesting to my mind, when we go home, we’ve hopefully got a router in the corner of the room that beams out wi-fi to our house. If we’re very lucky, that router is onto a fiber connection. It’s backed by a fiber connection.
This person doesn’t have a router in the corner of the room. They’ve got a commodity Android device. Instead of having a high-powered broadband connection, they’ve got a permanent 2G connection.
You know when the Internet goes down in your office and you have to start tethering your phone for an afternoon? You might be on 4G and, even still, an afternoon tethering 4G is really frustrating. Imagine a lifetime spent tethering 2G. That’s what this guy’s family have to do.
A third little story, someone sent me a DM. I’m not called @Harry on Twitter, so I don’t know who got the notification for this, but someone sent me a DM saying, “Hey, look. I’m a Nepalese developer. Can you give me some advice about code?” and blah-blah-blah.
I said, “Yeah, sure. Here’s your answer. But, as an aside, whilst I have your attention, my analytics told me that Nepal is a problem region for my website. Nepal is apparently a very slow area to visit my website from. Is that true?”
His reply almost knocked me out. He said, “No, no, I don’t think so. I click on your site and it loads within a minute,” and that doesn’t feel slow, right? Imagine a minute load time not feeling slow.
Here in the middle of Germany, if we experienced a one-minute load time, we’d assume the site was down. We’d assume they were having an outage, and we’d probably go elsewhere.
The second thing is, my site is incredibly highly optimized. It has to be. It’s my job to sell fast websites. If you’re visiting my site from, say, Dublin or West Coast USA, it would be fully loaded, fully rendered within 1.3 seconds. The exact same website on the exact same hosting on the exact same code base takes a minute for this person, over 45 times slower just because of where he lives. That’s the geographic penalty, the geographic tax that a lot of people in the world have to pay.
You may have noticed each one of these anecdotes, these stories, they all occurred in developing regions, in the East and the Far East. I want to talk about an initiative called The Next Billion Users. Hopefully, many of you will have heard of this already. It’s an initiative that’s been mainly spearheaded by Google, which aims to get the next billion Internet users online. There are a lot of people who don’t even use the Internet yet, roughly a billion of them. There are concerted efforts from large providers to make it as seamless and easy for them as possible. If you just Google Next Billion Users, hopefully, you’ll find this result by Quartz, which is a good canonical resource with case studies, reports, and documentation. It’s absolutely worth reading.
You may have also seen this diagram before. Anybody seen this map before? A couple of us. This blows my mind. More people live inside of the highlighted area than outside of it. There are more people under that white blob than all of the rest combined.
Now, when you begin to realize that this covers India, China, and Indonesia, you’ve got very, very densely populated parts of the world. But, most of our next billion users live right here in these emerging economies. These are the people who are coming online next. These are the people who want just as much access to our information as we get. They’re no different. Culturally there are differences, but still, there is a thirst for knowledge. There’s a need for knowledge. Currently, these people are being underserved.
I’ll try to quickly whizz through some statistics from some of the countries here. Bangladesh, a 3.5 meg average connection with just 15% of people online at all. With 3.9 million broadband subscriptions, only 2.4% of the population have what we would consider a stable broadband Internet connection. But, with 134 million cellular subscriptions, about 83% of the population have a mobile device. They have a mobile connection. That’s 34 times more people getting online with a mobile connection--high latency, low bandwidth--than they are getting online with broadband/wi-fi.
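The "34 times" figure for Bangladesh follows from dividing the two subscription counts quoted above. A minimal sketch of that arithmetic, assuming the counts are in millions as stated (the function name is hypothetical, chosen only for illustration):

```python
def mobile_to_broadband_ratio(cellular_subscriptions: float,
                              broadband_subscriptions: float) -> float:
    """Return how many cellular subscriptions exist per broadband subscription."""
    if broadband_subscriptions <= 0:
        raise ValueError("broadband_subscriptions must be positive")
    return cellular_subscriptions / broadband_subscriptions


# Figures quoted in the talk for Bangladesh, in millions of subscriptions.
bangladesh_ratio = mobile_to_broadband_ratio(134, 3.9)
print(f"Bangladesh: roughly {bangladesh_ratio:.0f} cellular subscriptions per broadband one")
```

The same helper can be applied to any country's subscription counts. Ratios derived from the rounded percentages quoted elsewhere in the talk may differ slightly from the spoken figures.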
India, again 3.5 meg connection, a quarter of people online; 1.3% of the population has a stable broadband connection. That’s just 1.3%. With a billion cellular subscriptions--and that’s not a typo. That’s billion with a B--80% of people or 79% of people get online with a high latency, low bandwidth device. That’s 58 times more people on mobile than broadband.
Don’t worry. I’m not going to do every country in the world. I’ve got a couple more.
Two and a half meg connection makes Pakistan the slowest. Eighteen percent of people are online. Only 18% of people are online. That’s all. One percent of the population, just one percent, one person in every hundred, has access to a broadband connection. But, 67% are on a cellular connection--these numbers are getting bigger--70 times more people.
Indonesia is a really interesting one. It’s got the fastest connectivity out of the region, 4.5 meg average speeds. Up to a fifth of people are online. Only 1.1% of the population have a dedicated broadband connection. However, a weird phenomenon happens here. With 338 million cellular subscriptions, 132% of the population has a mobile device.
Now, presumably, these are cellular subscriptions. What this means is that people have, on average, more than one sim card, more than one contract. I own two sim cards. I’ve got one that works in the U.K., and I’ve got one that works in the rest of the world.
Now, Indonesia is the fastest growing country in this fastest growing region, which is the Far East. They’re an incredibly online country. People use lots of social networks, lots of social media. This explains, perhaps, why we’ve got this phenomenon of 121 times more people are on high latency, low bandwidth connections than they are on a dedicated broadband connection.
Averaging this out, 3.5 meg is this region’s average speed. Only a fifth of people are online at all. Eighty percent of people are still waiting to come online.
1.5% or 1.45% have broadband, and a staggering 90% are on cellular. Contrast that with us; we’ve got 14 meg average speeds. 87.6% of people are online. This is one of the highest numbers I’ve actually seen. 37% of us have broadband, and 116% on cellular. Again, here in Germany, I’m guessing that, on average, people have slightly more than one cellular connection.
What does this tell us? The first thing it tells me is that we’re misunderstanding what mobile first means because, to most developers, mobile first means I write media queries. I start a sketch with the narrowest thing possible. That is not what mobile first means. Mobile first now should genuinely mean I test on a 2G connection before anything else. I treat broadband, fiber, or a wired connection as an enhancement. Mobile first should mean I start on an actual mobile device with a truly mobile connection. We’re building for a completely different profile of user.
Now, almost every time I work with a new client on a performance optimization effort, the first thing I get asked is, “How fast is fast enough? How fast should our website be?” It’s a reasonable question. I’d want to know the same thing.
The bad news is, I don’t have an answer. You can’t really answer a question like, “How fast is fast enough?” because it depends where your demographic is. It depends who you’re serving. It depends how much Nepal is an important region to you.
What we can do is we can start to run benchmarks and get a feel for things over time. But, the biggest single bit of advice I can give for any company wanting to make their website faster is to just start by being faster than your nearest competitor. One of the first questions I ask any new performance client is, “Who is your nearest competitor? Who do we need to beat? Who do we need to be better than?” because performance is a competitive advantage. If we can’t put a second amount, we can’t say, “We want to be loaded within three seconds,” what we can say is, “We just need to make sure we’re consistently faster than our nearest competitor.” That gives us that advantage.
I found a tool last year called Dareboost. It’s sort of a Parisian startup just outside of Paris. They’ve got a great Web performance comparison tool. You can drop in two URLs. It will run a series of benchmarks against both of them and give you a pretty nice report about what falls out at the other end. You can set continuous benchmarking here just to ensure that you are absolutely trouncing your nearest competitor. You need to make sure you’re faster than them.
A slightly more involved tool is a tool called SpeedCurve, which sets up repetitive, historical benchmarks, and it captures this data over time. This is actually from the SpeedCurve demo project. You can see that Huffington Post is markedly slower than the other websites in the test and that The New York Times and The Guardian are consistently fighting it out for who is fastest.
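The "are we consistently faster than our nearest competitor?" check can be sketched in a few lines. This is only an illustration under stated assumptions: the site names and load times are made up, the function names are hypothetical, and this is not how Dareboost or SpeedCurve work internally.

```python
from statistics import median


def fastest_site(samples_by_site: dict[str, list[float]]) -> str:
    """Return the site with the lowest median load time (in seconds)."""
    if not samples_by_site:
        raise ValueError("need at least one site")
    return min(samples_by_site, key=lambda site: median(samples_by_site[site]))


def beats_competitors(ours: str, samples_by_site: dict[str, list[float]]) -> bool:
    """True if our median load time is lower than every other site's median."""
    our_median = median(samples_by_site[ours])
    return all(
        our_median < median(times)
        for site, times in samples_by_site.items()
        if site != ours
    )


# Hypothetical load times in seconds from repeated benchmark runs.
example_runs = {
    "our-site.example": [2.1, 2.4, 2.0, 2.3],
    "competitor-a.example": [3.0, 2.9, 3.4, 3.1],
    "competitor-b.example": [2.2, 2.6, 2.5, 2.8],
}

print(fastest_site(example_runs))
print(beats_competitors("our-site.example", example_runs))
```

Using the median rather than a single run is a design choice here: it keeps one unusually slow or fast sample from deciding who is "fastest".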
Charts like this are amazing for nontechnical stakeholders because nontechnical stakeholders don’t care about DNS lookup times. They don’t care about Time to First Byte or start render. They don’t care about any of that stuff, and nor should they. It’s our job to care about that. What they care about is, are we the fastest? This is a really great way of convincing or beginning to convince nontechnical stakeholders to care about performance.
Next, I want to talk about how we get there. How do we start achieving this? I’ve got three little tips, well, three and a half because the first tip is kind of for free. They’re not like little BuzzFeed kind of these three crazy tips for a fast website. These aren’t going to be quick, short-term wins. These are fundamental shifts in attitude that I think every company needs to make to start delivering more resilient, more reliable, and faster websites.
The first one is really easy. Step zero is just to want a faster website. Normally I go into a client and they say, “We really want to make the website faster.” I don’t say it to their faces, but I just think, “Well, do it. Go and make a faster website. You don’t need permission.”
This sounds really tongue in cheek. It sounds like I’m being a bit facetious, but I genuinely mean this. As soon as you truly focus on making a faster website, as soon as the company says officially that we are going to build a faster website, it’s going to happen. It becomes a lot easier to build a faster website when you actually try and do it. It’s never going to happen by accident. You won’t just have a faster website if you don’t focus on it. The single biggest fundamental shift a company can make is just to want to build a faster website.
Okay. The first proper step is understanding the problem. I mean truly understanding the problem. Stress test your site from as many different points of view as possible. Say you’re working in a nice office in the middle of Munich with good infrastructure. You’ve got a nice new sort of MacBook Pro, maybe. You’ve got maybe even a wired fiber connection.
If you're fast in that environment, the bad news is that’s not fast at all. You’re getting so many helping hands there. If you’re fast in that environment, it does not count.
Conversely, if you’re fast in the middle of a field on a 2G connection, you’re going to be fast everywhere. This is one of the most reassuring things I tell my clients. If you can make yourself fast on an old phone, on a 2G or 3G connection, everything on top of that comes for free. You’ll be fast everywhere if you optimize for lowest common denominator. Start with the hardest task and everything else happens for free.
A few years ago, Facebook ran something called 2G Tuesdays. Who has heard of this? Yeah, quite a lot of us. That’s good. Facebook engineers typically live in California. They live very sheltered lives, in general. They’ve got nice, fast machines, good quality infrastructure, so they had no idea if their site was fast or not. It felt fast to them, but of course, it would. They were in an office.
What they introduced was an initiative whereby every Tuesday they throttled their connection artificially to a 2G connection. Any Facebook staff visiting Facebook properties on a Tuesday would get an artificially throttled connection. It soon taught them how bad things really were. It soon taught them that people who are different from us don’t get to use our site in a very nice way. It’s not very nice being different than us.
Now, I personally, like me, Harry, the person, I lack the skills to actually set something like this up, IP ring-fencing and artificial throttling. I don’t know how to do any of that. But, what I do know is that there’s a tool called Charles Proxy. Anyone who was in my workshop on Monday will already have this installed.
Charles Proxy is a free tool which can do specific endpoint throttling. You’ve seen it in Chrome. You’ve got your thing, a dropdown to throttle a 2G or a 3G connection.
What Charles does is it actually starts throttling your machine. It will give you a machine-wide feel of a 2G connection. Or, instead of throttling your entire machine, you can just throttle specific endpoints, so localhost perhaps, the staging version of your website, or even just specific third parties. You can measure what happens if our site is running nice and fast, but Google Fonts is running slowly. I would recommend downloading Charles Proxy immediately and starting to play around with this because it can really open your eyes. It’s a much more forensic, much better quality throttling device than the stuff built into dev tools.
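To get a feel for why a throttled connection is so eye-opening, a back-of-the-envelope model helps: total time is roughly the round trips multiplied by latency, plus payload size divided by bandwidth. The sketch below is only that rough model. The profile numbers are illustrative assumptions, not measurements and not the presets Charles or Chrome use, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkProfile:
    name: str
    round_trip_ms: float        # latency of one request/response round trip
    kilobits_per_second: float  # downstream bandwidth


def estimated_transfer_seconds(profile: NetworkProfile,
                               payload_kilobytes: float,
                               round_trips: int = 1) -> float:
    """Very rough estimate: latency cost plus raw download time."""
    latency_seconds = round_trips * profile.round_trip_ms / 1000
    download_seconds = payload_kilobytes * 8 / profile.kilobits_per_second
    return latency_seconds + download_seconds


# Assumed, illustrative profiles (not real presets from any tool).
SLOW_2G = NetworkProfile("slow 2G (assumed)", round_trip_ms=800, kilobits_per_second=250)
OFFICE_FIBRE = NetworkProfile("office fibre (assumed)", round_trip_ms=20, kilobits_per_second=100_000)

# A hypothetical 500 KB page needing four round trips.
slow = estimated_transfer_seconds(SLOW_2G, payload_kilobytes=500, round_trips=4)
fast = estimated_transfer_seconds(OFFICE_FIBRE, payload_kilobytes=500, round_trips=4)
print(f"{SLOW_2G.name}: {slow:.1f}s, {OFFICE_FIBRE.name}: {fast:.2f}s")
```

Real networks add packet loss, TCP slow start, and connection setup on top of this, so the model understates the problem. That is part of the reason to test with a real throttling proxy rather than rely on arithmetic.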
The kind of bad news here, though, unfortunately, is this is just the beginning. It’s been mentioned a few times already in previous talks, but it’s not just connection speed anymore. Once we’ve solved the problem of the network, we’ve got a whole new array of problems to deal with, namely device capabilities.
This is an iPhone 7. Anyone else got one? I’ve got an iPhone 7 in my pocket. Anyone else got an iPhone 7? Who has got a X or an 8? Right. Yeah. Yeah. Quite a lot of hands, right? We’re in the West. That happens.
Who has got a pretty new Android device, a relatively new Android device? Good. Good for you. Good for us, right?
Vitaly mentioned this in his talk. Current advice coming from Google’s performance engineers tells us that the most representative device worldwide is more likely to be the Moto G4. It’s a substantially lower-powered piece of kit, it’s substantially cheaper, and it’s way more prevalent when you look at global averages.
Quick benchmarking on a site called Geekaphone.com told me that Moto G4 is about half as fast as an iPhone 7+. Vitaly’s benchmarks showed or his screenshot earlier showed even wider disparity. A slightly more forensic analysis tool called PhoneArena.com showed us that on nearly every single test the iPhone is about three to four times faster than this phone. Imagine; just because of what device you have, every website feels about four times slower. A site that feels slow on your iPhone is going to feel almost completely unusable on a Moto G4.
Moto G4s are affordable. I think they’re maybe 150 euro, maybe a little more. If you have a device lab already at work, speak to whoever controls the credit card and add a G4 to that device lab because you’ve probably got lots of screen sizes and resolution densities, but most people don’t have a device capability kind of variance in their device lab. Buy a cheap commodity Android device and test websites on that.
In the meantime, we can just use Chrome’s throttling. I guess most people have seen this. Yeah, of course. Right, so this is a really good start. It’s not perfect. I’m about to explain why. But, this is a good start. If you’ve not got much time or you don’t have a device lab, certainly use this. With the CPU throttling, it’s pretty good. With the network throttling, we are better off using something like Charles. There’s packet level shaping rather than application layer kind of artificial throttling.
But, there is no replacement for these real devices. This is where I become a huge hypocrite because I don’t actually own a Moto G4. What I do own is a Nexus 5, which is kind of even older. This is a phone I bought in 2013 because I wanted to try Android for a while. I just carry it around all the time in my bag as my mobile testing device.
I was working for a client last year, and they’d just completely re-platformed their entire project. They’d completely rewritten everything from the ground up, spent months and months and months and hundreds and hundreds of thousands of euro rebuilding everything. Then, when they finished, they emailed me and said, “Hey, can you come in and see if everything is fast or not?”
I was like, “Uh, you poor things.”
Audience: [Laughter]
Harry: Pro tip: Always hire a consultant before you need one because spending hundreds of thousands of euro rebuilding a website for someone to say, “You need to rebuild it again,” is a very expensive way of doing things.
What they’d done is they rebuilt m.product.com. I’m not going to name the company, but they’d rebuilt their site specifically literally only for mobile devices. As soon as I plugged in my Nexus 5, I found that their entire client-rendered React application took 1.8 seconds just to evaluate that many hundreds of thousands of lines of JavaScript. Another client of mine, Trainline, I mentioned at the beginning, 0.3 seconds is worth 8 million pounds to them. This client is throwing away 1.8 seconds just parsing several hundred thousand lines of JavaScript. This was a site that was only built for mobile devices, and this is the first time it had been tested on one.
Their developers, the extent of their mobile testing was mainly, “Oh, resize my browser window. It looks good.” They were in central London with brand new MacBooks. Other engineers were using big, beefy sort of Linux like desktop towers with 32 gig of RAM each, very fast processors, incredibly good wi-fi, incredibly strong wi-fi. As soon as an actual mobile device hit this site, we struggled.
Yeah, build up a realistic idea of conditions, realistic speeds, realistic network conditions, realistic processing power, and optimize for the lowest common denominator.
Step two: Know what’s going on. This is actually really difficult because this involves a lot of effort. This involves talking to other people in the business. It involves calling meetings and interrogating people you may have never met before because this happens to me all the time.
I build a website. I don’t build the entire website myself, but I work on a Web project, I’ll push my last release of the day, I’ll push the chair back, and I’ll go home thinking, “Today was good. I did some good work.” Then I look at our live website and there’s just crap everywhere. I’ve got no idea where this came from. What does this script do? I didn’t put this on there. Where did that come from? Which team is in charge of this? Is this a marketing thing or is this dev ops? Who put this there?
Are we even using this thing? Why have we got five different analytics packages on our website? We can’t possibly be using all five of them. Who feels this pain? Who’s had this experience? Yeah, nearly all of us.
There are a number of reasons this happens. I want to talk a little bit about third parties. Know what’s going on because other people and teams add things to the site all the time, often without you knowing. Tag managers, analytics, social sharing widgets, beacons, retargeting, all of this stuff has a marked impact on performance. There’s also kind of a Schrodinger-y Heisenberg-y effect where the more analytics you add, presumably to optimize experiences, the slower the experience will become.
Call meetings. Find out where everything is. Find out where things are coming from. Go and ask your marketing department, what tools are we using to track people? What tools are we using for analytics? Ask your marketing department to give you workshops on those tools so you can understand why they’re necessary. Find out if there’s a cheaper, better tool, or a more expensive, better tool that you could happily switch over to. Try and work out exactly who is doing what so that you can begin to debug these problems.
This wasn’t technically a client. Someone just reached out to me for some assistance trying to optimize their website. They said, “Look, my website feels kind of slow, and I’ve seen some of your talks. I’ve been to one of your workshops. Can you just help me look at what’s going on?”
I ran this person’s website through WebPageTest. This is absolutely astounding. The green we’ve got highlighted here was the website. This is what the developers built. This is what the developers had coded. This is what they believed was going to users. However, what was actually getting downloaded was a little bit more like this.
Oh, it gets worse.
Audience: [Laughter]
Harry: Holy shit! The green was what all the developers believed they were doing every day. The developers believed they were doing a good job, and they were doing a good job. But, the red is what got sent to users. The red is actually just disrespectful because no one wants any of this. Nobody asked for this. No user ever intended to download this stuff.
If you look at a monetary cost, a pure monetary cost of loading this website, it cost more to download all these dirty, nasty, retargeting, tracking scripts, these horrible invasions of privacy. The user is paying for that. How disgusting is it that we do that to them?
This is mainly caused by tag managers. Does anyone not know what a tag manager is? I’ll just tell you real quick. Okay, okay, for the people, yeah. I hadn’t heard of a tag manager until a couple of years ago.
Imagine this scenario: You’re a developer. You’re at work. Somebody from marketing comes to you and says, “Hey, can you just install this analytics tool? We just bought it. Can you install?”
You’re like, “Uh, fine. I’ll do it after lunch.” You just put script source equals something, something, something. They’re happy, but then they come to you the next day and say, “Hey, can you actually just add this tracking tool now? We’ve got another thing.” You’re like, “Look. I’m really busy. Can you speak to the scrum master and maybe get it prioritized because I can’t really sneak it in?”
What ends up happening is marketing get annoyed because, why is it taking so long to add these new things? So somebody invented something called a tag manager. With a tag manager, all the developer does is install one line of JavaScript. That JavaScript then talks to a service like maybe Google Tag Manager or Adobe tag manager, which gives nontechnical users an admin panel.
Through the admin panel, they can install random bits of JavaScript. They can say, “Hmm, I want to measure how many clicks are on this button.” It’ll do it through a WYSIWYG. This effectively becomes production JavaScript, not production quality JavaScript. Oh, no, no, no, but production JavaScript nonetheless. It goes live.
What you’ve got is an entire raft of snippets of JS, all this code that can get live and go live to customers without a developer ever seeing it. Now, tag managers do answer a genuine business concern. Sites that make money need to track things. They need analytics. But, because of how it happens, it often goes unchecked.
Know your liabilities. For years, we’ve called them assets. They’re not assets. They’re liabilities.
Third parties can and will cripple our performance. A client of mine was using an A/B testing tool, which was the devil. It was a client-side A/B testing tool. Never use those. What this did is it would kind of completely block rendering whilst it worked out what test it wanted to run on the client. Then once it’d done that bit of JavaScript, it would then paint the view to the page. What ended up happening is--I’ve got a screenshot coming up, I think--their optimization tool was causing 98% of the runtime overhead. An optimization tool was causing 98% of the runtime overhead. There’s some painful irony there.
Vitaly mentioned earlier that I wrote an article about this. I’ll share a link to that after the talk. If you want to find out who is causing you the most problems in the performance panel in dev tools, you go to performance, bottom-up, group by domain, and it will literally tell you which third-party domains are contributing how much runtime overhead.
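As a rough illustration of what that group-by-domain view adds up, here is a minimal Python sketch. It is not how dev tools works internally. The function name `overhead_by_domain`, the URLs, and the timings are all made up for the example; the idea is simply to sum script runtime per hostname and rank the hostnames by their share of the total.

```python
from collections import defaultdict
from urllib.parse import urlparse


def overhead_by_domain(entries):
    """Sum script runtime per domain; return (domain, ms, share) rows, largest first."""
    totals = defaultdict(float)
    for url, runtime_ms in entries:
        totals[urlparse(url).hostname or "(unknown)"] += runtime_ms
    grand_total = sum(totals.values())
    rows = [
        (domain, ms, ms / grand_total if grand_total else 0.0)
        for domain, ms in totals.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up sample data: (script URL, runtime in milliseconds).
sample = [
    ("https://www.example.com/app.js", 40.0),
    ("https://ab-testing.example.net/experiments.js", 1960.0),
    ("https://ab-testing.example.net/variants.js", 0.0),
]

for domain, ms, share in overhead_by_domain(sample):
    print(f"{domain:30} {ms:8.1f} ms {share:6.1%}")
```

A table like this makes it hard to argue about which third party is costing the most.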
This particular client that I just mentioned, what we did is they hired me to help them rebuild and re-platform. We knew that performance was going to be one of our foremost concerns. They’re an e-commerce company. They turn over about a billion pounds worth of product a year. It’s worth a lot to them.
The first thing we did is we ripped out all of the crap. We ripped out all of the tracking code, the analytics. We went with as simple a build as possible. We ended up achieving 0.8-second start render. So, the first paint was at 0.8 seconds. The speed index was around 2,400.
For an e-commerce build, this was, at the time, probably a first in class. This was unheard of. We were so much faster than our nearest competitors just because we’d removed all this crap.
The marketing team was delighted because all organic metrics had gone up. Number of visits had gone up. Add to cart had gone up. Revenue had gone up. Conversions had gone up. Time on site had gone down because it should, right? If you owned a corner shop and someone spends 20 minutes in there buying one drink, you failed. Don’t measure time on site as a good metric for e-commerce. You want people to turn up, give you money, disappear again.
All of these metrics were looking favorable. The marketing team were delighted. They said, “Hey, look. Can you now add the tracking stuff in, so we can work out what’s going on and make it faster?” We’re like, “No. The only reason it’s fast is because we got rid of all that stuff.”
They’re like, “No, no, no, no. We know what we’re doing. We need to add this stuff, and we’re going to target things and retarget and track things and measure things. We want to be A/B testing things.” Unfortunately, the CTO and I, we lost this battle. We tried, and we tried, and we tried, but we were told, “No. We need this stuff put back in.”
The first paint went from 0.8 seconds up to 2.1. We added a 1.3-second bottleneck in front of our first paint. We’d gone from being a class leader to being bang average just because of this insistence on adding third parties.
I need to hurry up, actually.
Identifying third parties: I’m going to quickly go through some tooling. If you’ve got Chrome with dev tools experiments enabled, Marcy hit upon this briefly. Loads of cool stuff in here. There’s this option for network request group support, a nice, catchy title. You can turn this on, and it will start to group third party requests in your network panel into nice, manageable chunks. Again, if you were in my workshop on Monday, we looked at this already.
The slides are available, so don’t worry about memorizing this stuff.
There’s another column. It’s sometimes there, sometimes isn’t. It is experimental. There’s this product column. This product column appeared in the network panel, which now gives you a list of all the third parties. You can see, hopefully, here we’ve got this new column, which tells me which products are being called. So many times, I look at files and I’m like, “I’ve got no idea where this file came from, no idea whatsoever.” This column now tells me exactly which provider.
Looking at BBC.com, I noticed a huge problem here. There’s a certain file, which had a 35 second TCP connection. That 35 second TCP connection was blocking. This was actually tethered to the load event. This is a blocking script. The load event fired around 39 seconds because of this one erroneous file, which happens to come from something called the Rubicon Project. I’d never heard of the Rubicon Project, so I just Googled it. Of course, it’s an ad provider.
Armed with this information, we can either choose new ad providers. We can open support tickets with them. We can open GitHub issues or however it is that we feed this back. But, this is how we identify which third parties are causing us problems. Third parties will make us vulnerable.
Another really good example: this is a client of mine back in the U.K. They use Adobe Tag Manager. Now, this isn’t a problem specifically with Adobe Tag Manager. Google Fonts has the same problem. Loads of third parties have this problem.
What happens in the most extreme scenario? What happens when Adobe Tag Manager goes offline? What happens if they have a complete outage?
What happens is this. We show a user a blank screen for 1.3 minutes. If this file goes missing or Adobe Tag Manager has an outage, Chrome’s built-in timeout doesn’t fire until about 80 seconds. What happens here is the user sees a completely empty white page for 1.3 minutes. That’s enough time for them to assume the site is down and go and take their business elsewhere.
Again, Vitaly mentioned this in his talk. WebPagetest exposes blackhole server tools that we can use to route third-party traffic through an endpoint that effectively simulates an outage. Again, the slides are available, but add this to your hosts file when you get back to the office and see just how vulnerable you are to third-party outages.
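The hosts-file entries themselves are not reproduced in this transcript, so the following is only a minimal Python sketch of what generating them might look like. The helper name `blackhole_hosts_lines` and the hostnames are hypothetical. The IP address is a placeholder from the reserved documentation range, not the real blackhole address, which you would take from the slides or from WebPagetest.

```python
def blackhole_hosts_lines(blackhole_ip, third_party_hosts):
    """Build hosts-file lines that point each third-party hostname at blackhole_ip."""
    return [f"{blackhole_ip} {host}" for host in third_party_hosts]


# PLACEHOLDER address from the documentation range (192.0.2.0/24); replace it
# with the blackhole address that WebPagetest publishes.
BLACKHOLE_IP = "192.0.2.1"

# Hypothetical third-party hostnames; list the ones your own pages load.
hosts = ["tags.example-tag-manager.com", "fonts.example-cdn.com"]

print("\n".join(blackhole_hosts_lines(BLACKHOLE_IP, hosts)))
```

With lines like these in place, loading the site shows how long a user would stare at a blank page if that provider went down. Remember to remove them afterwards.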
Don’t prioritize your own metrics over your users’ experiences. There’s no point having all the tracking in the world if users can’t even get to the website.
Step three: Measure everything. Measure absolutely everything. We need to measure the before and measure the after because this tells us two very important things. Measuring the before allows us to know what’s actually wrong. If you don’t measure the before, you don’t know where to concentrate your efforts. If you don’t know what is currently wrong, how are you going to know what to fix? Two, you measure the after so that you can prove that you did the right thing or, hopefully not, but you know if you made something worse.
Google Analytics--we looked at this again in my workshop--is a really, really -- well, it’s free, right? It turns out Google Analytics captures performance data all the time. I’ve had analytics on my site for ten years now - geez. It’s been capturing performance data the whole time.
This is an enormous gif. This gif will beachball my Mac for a second, so I’m going to take a drink.
If you go to behavior, site speed, page timings, you’re going to find a wealth of information.
The map overlay is exactly how I found out that Nepal was a problem region. You can soon start to build up geographic ideas of what’s happening on your site. This isn’t the best data available. It’s not very sort of sanitized, and it’s also using mean results rather than median. But, if you’ve got Analytics installed, go and check this stuff out. It’s fascinating. Then you can do things like cross-reference which pages in which particular countries, so you might know that the homepage is only really slow in India. It’s actually quite fast here in Germany.
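To see why mean results rather than median matters, here is a small Python sketch with made-up load times (not data from the talk). A couple of very slow visits drag the mean far away from what a typical visitor experienced, while the median barely moves.

```python
from statistics import mean, median

# Made-up page load times in seconds: mostly fast, with two very slow visits.
load_times = [1.1, 1.2, 1.2, 1.3, 1.4, 1.5, 14.0, 22.0]

print(f"mean:   {mean(load_times):.2f} s")
print(f"median: {median(load_times):.2f} s")
```

Neither number is wrong, but they answer different questions, so it helps to know which one a report is showing.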
Here’s an interesting one. We’re talking a lot about the Far East. I found out that Brazil was a problem for me, and Brazil is also included in this list of emerging economies. There’s an actual definition of what an emerging economy is, and Brazil falls into that category, so I decided to look into things.
It turns out if you want to purchase 500 megabytes of data in Brazil, you’d have to work 8.6 hours of minimum wage. That’s a full day’s work. Imagine being a minimum wage worker and you have to work one day a month to afford 500 meg. We’re really, really dealing with sensitive stuff here.
If we’re sending enormous payloads over the wire, we’re actually eating into someone’s salary. This is costing people genuine amounts of money.
I’ve got a little travel sim card that I mentioned earlier. Just out of interest, I wanted to work out how much this travel sim card would have cost me in data usage if I was Brazilian. Fifteen gig was 28 days of minimum wage work. If I used 15 gig each year, I would have to spend one month of that year working just to afford my data allowance - very, very, very expensive stuff.
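The arithmetic can be sketched in Python by scaling the 8.6 hours per 500 megabytes linearly. That is an assumption: the talk does not say how the 28 days was worked out, nor how long a working day was taken to be. The function names are made up for the example.

```python
HOURS_PER_500_MB = 8.6  # figure quoted in the talk for Brazil


def hours_of_work(megabytes, hours_per_500_mb=HOURS_PER_500_MB):
    """Minimum-wage hours needed to pay for the given amount of data (linear scaling)."""
    return megabytes / 500 * hours_per_500_mb


def days_of_work(megabytes, hours_per_day):
    """The same cost expressed in working days of an assumed length."""
    return hours_of_work(megabytes) / hours_per_day


fifteen_gig = 15 * 1000  # assumes 1 GB = 1000 MB
print(f"{hours_of_work(fifteen_gig):.0f} hours")
print(f"{days_of_work(fifteen_gig, hours_per_day=8):.1f} days at 8 hours a day")
```

Under these assumptions 15 gig comes to 258 hours of work. How many days that is depends on the length of the working day: the 28 days quoted would correspond to a day of a little over nine hours.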
Anyway, Brazil: I set up some stuff on SpeedCurve. Has anyone heard of SpeedCurve? For those who haven’t, look it up. It’s an amazing tool.
I measured the before. I knew Brazil was a problem. Next, I needed to measure the after. I set up some tooling. You can see that I’m testing my site from Brazil. I implemented what I thought might be a potential fix. Sure enough, it worked. This is the importance of measuring the before and after. If I didn’t know Brazil was a problem, it would have stayed a problem. If I wasn’t measuring the after, I wouldn’t know when I’d fixed it. Yeah, SpeedCurve; brilliant.
Then you can set up budgeting. If anyone has looked into performance budgets before, SpeedCurve makes it very easy to set those up. Graphs like this are great for showing your manager because a manager doesn’t necessarily want to know about load times or speed index. What they do want to see is a graph like this. It’s very easy to see that something good happened that day.
Yeah, you can use this to set up performance budgets. Basically, you’re just monitoring with alerts. I find that people in management don’t like the word “budget.”
I’ve got my sites up on SpeedCurve. It’s a very kind of 2018 site with bands of content everywhere. It’s not a very complex site, but it’s got a bunch of third parties. It’s got an analytics package. It’s got a social media widget. It’s got an ad provider on there.
I mentioned this before. My site from, I think this is, Ireland is fully rendered within 1.2 seconds, 1.28 seconds. It’d start to render. It renders in one pass, so 1.28 seconds to render my site.
For visually complete, I’d set a budget of 3 seconds. We can see that, in Ireland, I am consistently under budget because Ireland, Dublin specifically, is a very, very well-connected city. It’s kind of a European tech hub, I guess.
Moving over to Brazil, you can see that 50% of the time I’m going over budget. Those are two very different looking graphs. The exact same code, the exact same CDN, the exact same infrastructure goes over budget 50% of the time purely because of loading it from Brazil. Again, this is the geographic tax that people have to pay.
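A budget in this sense is just a threshold plus a check. Here is a minimal Python sketch of that check, using the 3-second visually complete budget from the talk. The timings are made up to mirror the shape of the two graphs; they are not the speaker's measurements, and this is not how SpeedCurve implements its alerts.

```python
def over_budget_share(samples_s, budget_s):
    """Fraction of measurements that exceed the budget."""
    if not samples_s:
        raise ValueError("need at least one measurement")
    return sum(1 for s in samples_s if s > budget_s) / len(samples_s)


BUDGET_S = 3.0  # the visually complete budget mentioned in the talk

# Made-up visually complete timings in seconds, not the speaker's data.
timings = {
    "Ireland": [2.1, 2.3, 2.2, 2.4, 2.0, 2.5],
    "Brazil": [2.8, 3.4, 2.9, 3.6, 3.1, 2.7],
}

for region, samples in timings.items():
    share = over_budget_share(samples, BUDGET_S)
    status = "ALERT" if share > 0 else "ok"
    print(f"{region}: {share:.0%} of runs over {BUDGET_S} s [{status}]")
```

Running the same check per region is what exposes the difference: identical code, very different results.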
The last thing I did in SpeedCurve was actually set up some custom profiling. This is a profile that I called “Very Bad Network.” We’ve got 150 kilobits a second downlink, not kilobytes, kilobits. We’ve got a latency of half a second, and we’ve got a packet loss of 10%.
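To put the kilobits-versus-kilobytes point into numbers, here is a small Python sketch of the best-case transfer time on that downlink. It deliberately ignores latency, packet loss, and protocol overhead, so real loads would be slower. The page weights are hypothetical.

```python
def transfer_seconds(payload_kilobytes, downlink_kilobits_per_s):
    """Ideal transfer time: convert kilobytes to kilobits (x8), divide by the link rate."""
    return payload_kilobytes * 8 / downlink_kilobits_per_s


DOWNLINK_KBPS = 150  # the "Very Bad Network" downlink from the talk

for payload_kb in (100, 500, 1000):  # hypothetical page weights
    print(f"{payload_kb} KB -> {transfer_seconds(payload_kb, DOWNLINK_KBPS):.1f} s")
```

The factor of eight is the whole point: a link rated in kilobits moves far fewer kilobytes per second than the number suggests.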
Packet loss is absolutely vital to understanding how performance works. Whenever a packet goes missing, you have to start TCP retransmission again. You get head-of-line blocking on the server. It’s just incredibly expensive, and mobile is absolutely riddled with packet loss. This is very important stuff when we’re building for flaky connections.
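A back-of-the-envelope Python sketch shows why even 10% packet loss hurts so much. It assumes each packet is lost independently, which is a simplification, since real loss tends to come in bursts. The packet counts are hypothetical.

```python
def chance_of_any_loss(packets, loss_rate):
    """Probability that at least one of `packets` is lost, assuming independent losses."""
    return 1 - (1 - loss_rate) ** packets


def expected_sends_per_packet(loss_rate):
    """Average number of transmissions until one packet gets through."""
    return 1 / (1 - loss_rate)


LOSS_RATE = 0.10  # the 10% packet loss from the profile in the talk

for packets in (1, 10, 50):  # hypothetical response sizes, in packets
    print(f"{packets} packets: {chance_of_any_loss(packets, LOSS_RATE):.0%} chance of a loss")
print(f"{expected_sends_per_packet(LOSS_RATE):.2f} sends per packet on average")
```

Under that assumption, a response spread across a few dozen packets is almost certain to hit at least one loss, and every loss means waiting for a retransmission.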
As soon as I set up a true mobile connection, things looked very, very, very, very, very different. My 1.28 seconds from Dublin suddenly became 9.5 seconds on an actual mobile connection with only 10% packet loss. Another thing you’ll notice about this graph is, there is no consistency. There’s no uniformity to it. It is just erratic, and this is very, very hard to design around.
When you compare it to everything else, as well, when you compare Brazil to Ireland to whatever else it was -- no, sorry. The different devices, sorry, from within Brazil -- no, this is Ireland. I’ve got everything on this slide wrong.
When you start to compare the very bad network to more consistent networks that don’t suffer packet loss, look how wildly different things are. This is stuff we need to be aware of. Having the tooling set up to measure this at least makes us understand the problems we’re trying to solve.
I want to close. I’ve got 18 seconds to close a 99-slide talk.
Care. Just start caring. The easiest way to start achieving this stuff is to just try. Set out as a business, as a team to begin focusing your efforts on performance. Secondly, understand. Understand your demographics, your customers, their capabilities, their geographic locales. Finally, measure absolutely everything. Measure before; measure after. Understand when you’re heading in the right direction.
All the statistics and data in this talk were 100% verified accurate. No fake news here. If you want to double-check them, these are the places to look.
The last thing I want to do is just thank you all for your time. Thank you very much.
Audience: [Applause]