Part 1: Introduction to web analytics

You might be asking yourself what web analytics is and why it is important. I am glad you are asking, and this part will, I hope, answer that. A topic that often comes along for the ride is search engine optimisation and other tricks for reaching out.

Web analytics is not about simple tricks or shortcuts! Rather, web analytics is your way of getting to know your website, your users and how they use the website. It is a methodical way to evaluate whether changes bear fruit and what can be made even better.

The website is not just a digital place filled with pages, images and documents. Usually it has a purpose, or at least its creator has a hope for it. Something you wish the website will contribute, something that should happen. It could be selling goods, facilitating contact between people, informing citizens about public services or their rights, and much more.

Working with web analytics is about improving, or perhaps simplifying, what the website is meant to enable. Making it intuitive to find your way there, and easy to, for example, complete all the sub-steps in a purchase process. Web analytics is thus all the tools and tasks that contribute to how you optimise a website!

One might think that web analytics involves collecting masses of data and then searching for the needle in the haystack. That is not quite how it works. Web analytics is about using collected data to gain insight into the user's experience of a website. With the intention of improving the experience. Making the website more useful.

The short answer to what web analytics is, I think, is given quite well by Wikipedia:

Web analytics is the measurement, collection, analysis and reporting of web data for purposes of understanding and optimizing web usage.

It is easy for the focus to end up on analysing collected visitor statistics when doing web or intranet analysis. However, all forms of evaluation are included in the concept – not just how a website is used. I increasingly hear industry colleagues, in addition to their visitor statistics, using other types of tools to get an overview of their websites. Such an overview lets you easily filter out pages with various problems: for example, which pages do not live up to the organisation's ambition regarding accessibility, where unnecessarily high-resolution images are used, where people leave the website, and more.

Much is about things other than the visitor statistics practically everyone collects. Furthermore, there are other aspects than the strictly commercial and persistent e-commerce analogies. But first we shall go through the basics of how you work with your web statistics, process your data, and which methods and tools everyone needs to know. Much of this you can use in other web analytics contexts, provided sufficient data is available.

Introduction to working measurably – start with a baseline measurement

There are a number of activities for getting started with working measurably on your website. One thing is to arrive at business goals that should guide the website's development, which we will get into shortly; another is to get a handle on the current state. If you already have a website, you do something called a baseline measurement. It involves making your very first measurement of the current state, the measurement that all future changes will be described against. The values that you and your colleagues will chuckle about during coffee breaks in a few years, in the best case, or what you use in the evaluation to quality-test an updated website. The website's reference point, quite simply.

Now perhaps I am stating the obvious, but why not, it is just as well so no one happens to miss it. For it to be meaningful and fair to compare how the website performs over time, you need to have control over the circumstances that affect the measurement.

My advice is to design at least one page on the website that is the one you measure the website's technical quality against as your baseline. Ideally you have a test page per type of page, say one for the start page, another for product pages, one for articles, and so on. It is against these pages that you over time should check that the website is continuously getting better, or at least not getting worse and worse. These are the pages whose addresses you enter into tools going forward to check how quickly the pages respond, finish loading, whether they are considered mobile-friendly, etc.

Remember that the example content on these test pages must also persist over time. If it does, you can actually set very concrete – and measurable – acceptance criteria when you commission an entirely new website. It is no more than fair to expect that a new website is better than the previous one, right?
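
A baseline measurement like this can be automated with a small script. The sketch below, in Python, fetches a designated test page, records response time and page size, and appends the result to a diary file. The URLs and the file name are hypothetical placeholders, and a real setup would of course measure more metrics than these two.

```python
import json
import time
import urllib.request
from datetime import date
from pathlib import Path

def measure(url):
    """Fetch one test page and return a baseline record:
    response time in milliseconds and page size in bytes."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        body = response.read()
    elapsed_ms = round((time.perf_counter() - start) * 1000)
    return {"date": date.today().isoformat(), "url": url,
            "response_ms": elapsed_ms, "size_bytes": len(body)}

def append_to_diary(record, path="baseline.json"):
    """Append a measurement to the diary file, so the series can
    always be compared against the very first (baseline) entry."""
    diary = json.loads(Path(path).read_text()) if Path(path).exists() else []
    diary.append(record)
    Path(path).write_text(json.dumps(diary, indent=2))
    return diary

# Hypothetical test pages -- replace with your own designated ones:
# for url in ["https://example.com/", "https://example.com/products/test"]:
#     append_to_diary(measure(url))
```

Run against the same test pages at regular intervals, the diary becomes exactly the kind of reference series you can hold an updated website against.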

To keep track of whether editorial content or content served from other systems maintains the right quality, one approach can be to monitor the most recently created pages. Otherwise the website risks getting worse over time.

I would say that all communication about improvements to a website should contain a comparison against both your baseline measurement and the most recent reference point before the change. Then you can see in concrete numbers that nothing got worse, and whether a promised improvement actually occurred. Why does one have official and designated test pages? Well, because the supplier can check their delivery against the test page before they consider themselves done and shout "Finished!"

These measurements can be about so much more than what you get from your web statistics. Among other things, it can be:

  1. Compliance with established accessibility goals according to WCAG.
  2. Linguistic matters, such as which abbreviations are accepted.
  3. How responsive the design is, for example with regard to image resolution.
  4. What response time the web server is allowed to have under tough rural conditions via 3G.
  5. What external actors like Google and Pingdom think of your performance for impatient visitors.

More examples of measurable acceptance criteria can be found in the last part of the book, which addresses what activities you can work on to get a handle on your website's quality. Before we have come that far in the subject of web analytics, we shall discuss methods, documentation and goal-setting.

Documentation you should have before a new web project starts

There is much to think about before starting a new web project. These things tend to change over time, making it difficult to reuse the previous web project's requirements straight away. I thought I would bring up a few points I would claim are all too rarely discussed before a project gets going in earnest. This is a quick introduction to documentation; there will be a deep dive later where style guides, design patterns and performance budgets are discussed at greater length. This part is what everyone involved in a web project should know.

The headings below can be in a document that follows the project through to delivery – and is then handed over to management by your own control body (your own web analyst?). This is done to ensure that you do not continuously deviate from the initial agreement during the ongoing maintenance of the website. The point is thus not that the website should be shiny and beautiful on launch day but at least as good three months later.

I know you have heard it before – you are never “done” with a website. It is more important how well it works after a year than how pretty it was as a prototype. Just as you must continuously improve the website to prevent it from decommissioning itself, you should regularly revise the measurable requirements you set for the website. You should constantly raise the bar as the world around you changes.

Should you use someone else's content delivery network?

Content delivery networks (often called CDN – Content Delivery Networks) are used to avoid having files locally on your own website. In many cases this provides faster download times or other performance benefits for users, while being both cheap and easy to implement on your website. The most common variant is probably fetching tracking code for web analytics tools like Google Analytics this way. Therein lies also the biggest reason why you need to think through your use of content delivery networks. They encroach more or less on the user's personal privacy because you involve a third party in the communication with your user.

You should choose your website's third-party services with great care. It is you as the publisher of a website who must consider your users' privacy, which parties you invite to contribute to your website and thus how usage data can go astray. It is entirely possible to set up your own content delivery networks, large and small.

Other examples besides Google Analytics are websites that fetch fonts from Google Fonts or have a social feed from Facebook integrated. Many have buttons to share posts on social media, or let WordPress.com handle images for their own WordPress sites. Almost always when you receive code to paste in from a service, this issue comes as part of the package. Think embed codes from video services. As soon as an image or other visual is to be displayed, it is often fetched from the third-party service's web servers without the visitor actively needing to interact with the service.

You can run your own content delivery network, but there you obviously cannot place Facebook's news feed or Google Analytics tracking code. But much else. Like images, video, code libraries like jQuery, or other things used to make the website function as intended. Then you can reclaim some of the benefits that a content delivery network offers. For those of you running a very small website, you can check out web hosting services like Loopia's Autobahn. Such services are specially designed to quickly send files that are rarely or never updated – that is, static files. Or if you judge that large-scale third-party services are not a major concern for your particular visitors, then check out Cloudflare and others that are often easy to integrate with most content management systems (CMS).
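
Before documenting your decision, it can help to inventory which third parties a page actually pulls resources from. Below is a minimal sketch in Python, assuming you already have the page's HTML as a string; the page fragment and hostnames are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ThirdPartyAudit(HTMLParser):
    """Collects the external hostnames a page pulls resources from,
    so the choice of third parties becomes a documented decision."""
    def __init__(self, own_host):
        super().__init__()
        self.own_host = own_host
        self.third_parties = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).netloc
                if host and host != self.own_host:
                    self.third_parties.add(host)

# Hypothetical page fragment -- in practice, feed in the downloaded HTML:
html = ('<script src="https://www.google-analytics.com/analytics.js"></script>'
        '<img src="/images/logo.png">'
        '<link href="https://fonts.googleapis.com/css?family=Roboto" rel="stylesheet">')
audit = ThirdPartyAudit(own_host="www.example.com")
audit.feed(html)
print(sorted(audit.third_parties))
# ['fonts.googleapis.com', 'www.google-analytics.com']
```

Every hostname in that list is a party you have invited into the communication with your user – a good starting point for the privacy discussion above.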

I suggest dividing this question into two parts: how do we handle web analytics, and how do we handle content delivery networks for other needs.

For web analytics, Google Analytics is extremely common. During 2015, Google Analytics was found on 8 out of 10 Swedish municipal websites, for example. If you view privacy mostly as a legal matter, it is increasingly common for contracts with web analytics tool providers to make it clear that you own your user data. If you want to protect privacy especially around collecting visitor statistics, there is the alternative Matomo1 which you can install on your own web server. This makes it harder for foreign countries' surveillance agencies to keep track of your users.

Then there are very many niche tools that offer other useful things for web analytics, like being able to record visitors' sessions on the website, survey tools, A/B tests, checking how far down users scroll and other things we will go through later. The first and perhaps most important decision is which of the big players you allow to get a good overview. If you use Google Analytics, it is inserted on every page and you will probably keep it for a very long time. Then Google gets much more insight that way than if you embed a few scattered YouTube clips (Google owns YouTube). Make an active decision and document how you reasoned.

Regarding the second sub-question, the one about content delivery networks to support the design of the website, it is about when you choose to design your website in a way that benefits from others' services. The main argument for this is that these third parties can often send content faster to your user than you can. Thus the experience becomes faster, your website is offloaded and can therefore serve more users.

Some examples where you stand to benefit from using content delivery networks for better user performance are:

  • Streaming video or audio from a data centre placed as close to each user as possible.
  • Letting Google, Microsoft or another larger company send common files that are not unique to your website. For example fonts, code libraries like jQuery, and more.
  • Placing your own static files such as images, documents, etc. on a content delivery network or web server that is specialised in fast file transfer.

Then one should remember that this is a new dependency you are exposing your website to, that more cooks become involved in getting the page to display correctly. This was something many major media outlets experienced when Facebook had an outage with its Like button during 2013.

Wow a Facebook bug has taken down CNN, the Washington Post, Huffington Post, Slate, BuzzFeed, Gawker and Kickstarter...
- John Herrman, BuzzFeed2

Many times you have probably not made a particularly active choice, or at least not a particularly well-thought-out choice, it just became what it became. I have myself evaluated projects where the question of why there were so many external dependencies was answered by the developer along the lines of “we usually do it that way”. In other words, the question had never been raised with the commissioner. That is why it can be good to preempt these automatic non-decisions with documentation.

What acceptance criteria will there be?

It is not always the case that you have any acceptance criteria at all. Sometimes you do not have support for them in the contract with the supplier. When I worked as a developer, we sometimes had fairly extensive documentation on how to define when a software project is finished, but such documentation does not cover all aspects of a website. If you have any documentation that defines good craftsmanship, or find one you like, discuss it with the supplier, or internally, and arrive at which parts the project must live up to.

Living up to the law may seem obvious. But the legislation is both something that gets updated and at the same time is not as black and white in interpretation as one would wish. When it comes to websites, it is probably primarily the Electronic Communications Act (LEK), where one paragraph is popularly called the cookie law, the Personal Data Act (PuL) and the Act on responsibility for electronic bulletin boards (1998:112, called the BBS Act) that are relevant. What you should think about before the web project, and document, is whether you are going to use cookies at all. And if there will be cookies, how you choose to view cookie messages and their consent requirements. During 2016, at the time of writing, the EU has new data protection legislation in the works. It will eventually replace the Swedish PuL legislation, so the map is being redrawn.

A newcomer is how the Discrimination Act was sharpened on 1 January 2015. From 2015 onwards, it is grounds for discrimination if a website works worse for someone with a disability. Unlike LEK and PuL, this is legislation you need to work more actively with since it places requirements on everything that is published. It is for example not okay that users must have perfect vision or hearing to access the content. You must also be careful about what demands you place on the user's motor skills, cognitive ability, etc. One way to start sorting out accessibility is to set a desired level. Fortunately, there is a standardised framework for evaluating accessibility in the form of Web Content Accessibility Guidelines (WCAG). With WCAG you can choose different levels of how high accessibility you are after, for example WCAG 2.0 level AA, where level A is more relaxed and level AAA is stricter.

WCAG does not solve all your challenges with accessibility, but it is a very good start and definitely something you should include in your acceptance criteria towards your supplier. If you have a Swedish supplier, you can always check if they know about webbriktlinjer.se – the Guide for Web Development. It is the continuation of the 24-hour agency, but is definitely also sensible for the private sector. The first guideline, with the highest priority, is to follow WCAG 2.0 level AA, but there are a whole lot of other wise guidelines there.
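
One of the WCAG 2.0 requirements lends itself directly to an automated check: success criterion 1.4.3 requires a contrast ratio of at least 4.5:1 between normal body text and its background at level AA. The ratio is defined from the relative luminance of the two colours, and can be computed like this:

```python
def relative_luminance(rgb):
    """Relative luminance per the WCAG 2.0 definition, from 0-255 sRGB values."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio (L1 + 0.05) / (L2 + 0.05); level AA requires
    at least 4.5 for normal body text."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0
print(contrast_ratio((118, 118, 118), (255, 255, 255)) >= 4.5)
```

Running the colour pairs from your style guide through a check like this gives you an easy, objective acceptance criterion towards the supplier.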

I myself have often supplemented the IT-navel-gazing requirement lists with a list of measurable delivery criteria for what ultimately ends up with a user of a website. What developers call frontend, but it also has a lot to do with web performance. A long but by no means complete list of activities can be found at the end of this book.

Documented acceptance from all suppliers?

Have all suppliers received the assignment document and all other project documentation they need? Including your own IT department? My experience is that it is not always the case that the assignment's exact formulations and requirements make it to all involved. Among other things, we messed up at my workplace with how the much-despised cookie message should look and function on all our websites. We are probably an extreme case, with our almost three thousand web editors for the hundred or so websites we run. In other words, spot checks are needed to verify that an assignment document I have written has truly taken full effect everywhere.

So how do you prevent someone from saying they did not know? Good question. Perhaps by having clearer, and more formal, documentation that describes exactly what you mean and with whom each assignment must be verified upon delivery. This is surely not a problem if you only have a few websites or few people involved. But it probably does not hurt to clarify this anyway.

You need to stick to this documentation yourself. Later in the book you will get an introduction to performance budgets, where exactly this type of requirement can be placed. The word “budget” is of the utmost importance. Like any other budget, it is something you should try to economise with: plan to have a small surplus left over, and accept that a minor overdraft is not the end of the world.
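
A performance budget can be expressed as exactly this kind of check. In the sketch below, both the budget figures and the measured values are invented for illustration; the tolerance constant captures the idea that a minor overdraft is acceptable.

```python
# Hypothetical budget and measured values -- adjust to your own website.
BUDGET = {
    "page_weight_kb": 500,  # total transferred bytes
    "requests": 40,         # number of HTTP requests
    "response_ms": 200,     # server response time
}
OVERDRAFT_TOLERANCE = 0.05  # a minor overdraft is not the end of the world

def check_budget(measured, budget=BUDGET, tolerance=OVERDRAFT_TOLERANCE):
    """Compare measured values to the budget. Returns (ok, overruns) where
    overruns lists every metric outside the tolerated overdraft."""
    overruns = [(metric, value, limit)
                for metric, limit in budget.items()
                for value in [measured.get(metric, 0)]
                if value > limit * (1 + tolerance)]
    return (not overruns, overruns)

ok, overruns = check_budget({"page_weight_kb": 510, "requests": 52, "response_ms": 180})
print(ok)        # False -- 52 requests exceed 40 even with the tolerance
print(overruns)  # [('requests', 52, 40)]
```

A check like this can run against your designated test pages on every delivery, so a budget overrun is caught before anyone shouts “Finished!”.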

Personas are useful for web analytics too

All too often when I have seen organisations' personas, they are smiling caricatures of people from the intended target group. It is surely becoming more common that your personas also have weaknesses. But if you do not already have one, I can definitely recommend having one or more personas that together cover all common disabilities.

Why? Because all of us who in everyday life are not considered to live with a disability can still be affected by one, temporarily or permanently. Say someone is in shock or personal crisis – well then it is good if what has been built was designed without requiring a high cognitive level. At the end of a tough workday, most of us are probably noticeably cognitively impaired, at least I often am. That we all have trouble with vision became clear with mobile behaviour and the use of screens outdoors. Weak contrasts do not show up at all on the mobile when you are outdoors on a sunny day. People with visual impairments have this difficulty even under optimal lighting conditions.

Example persona: Tipsy Sanjay

Those of you who work with personas are welcome to include Sanjay, or spice up an existing persona with his characteristics (but choose a suitable name that works within your organisation). I have not worked with the Swedish alcohol monopoly Systembolaget, but if Sweden were not so anxious about alcohol policy, you could safely have counted on them having a persona who was at least a bit tipsy. Right?

So what characteristics does Sanjay have? What distinguishes him?

  • Language: Understands a little Swedish, but reads excellently in Hindi.
  • Device: Modern mobile, with a cracked display giving strange colours around the cracks.
  • Subscription: Sanjay has several Indian SIM cards, but also a Swedish one with a monthly data allowance of 500 MB.
  • Location: Outdoors, middle of the day, Midsummer, in sunshine (believe it or not).
  • State: A bit tipsy – it is Midsummer after all.
  • Geography: On the outskirts of 3G coverage in north-western Dalarna, half in a radio shadow.

Sanjay is in Dalarna visiting friends and celebrating a classic Swedish Midsummer. Somewhat inebriated, he cracked his screen when he had the mobile in his back pocket when he fell during a game of Irish Christmas Eve. The screen now gives strange colour variations around the cracks. His cognitive ability is not at its peak, his motor skills are not fully up to scratch either, and reception is really poor where he is.

Image 1: Among others, Tele2 offered data allowances of only 500 MB during 2016.

Sanjay is of course a concentrate of the challenge you face with your website, especially if you run a website with crisis information, such as the poison information centre or other services that help him help himself when he perhaps needs it the most. There is probably a lot to learn from him. Especially the contrast he offers compared to Ergonomic Egon.

Typical anti-persona: Ergonomic Egon

Ergonomic Egon is always on top, well rested and never emotionally affected. Egon only uses the web at the office, with carefully tuned computer accessories he has meticulously chosen himself. He of course has a high-resolution large screen with a wide viewing angle and no windows causing glare on the screen. A fast wired connection without any limitations.

Egon's characteristics are all too common as an assumption when designing your personas. The normal user is of course somewhere between Sanjay and Egon, but both extremes exist and we especially may need to put some thought into things for Sanjay's sake.

A week of empathy for your users

To at least feel a little empathy with your users, you can do as Facebook announced in 2015 – they introduced a weekday where you can live with the conditions your important target groups have. In Facebook's case, their development staff only have 2G speed when they browse – you know that hysterically slow connection we in Sweden had before 3G started to take hold shortly after the turn of the millennium.

People are coming online at a fast rate in emerging markets. In most cases, they are doing so on mobile via 2G connections. But on a typical 2G network, it can take several minutes to download a webpage.[…] Today we're taking another step toward better understanding by implementing “2G Tuesdays” for Facebook employees. On Tuesdays, employees will get a pop-up that gives them the option to simulate a 2G connection. We hope this will help us understand how people with 2G connectivity use our product, so we can address issues and pain points in future builds.
- Chris Marra, code.facebook.com3

Image 2: Peter Antonius on what a week of user care can look like.

Or as Peter Antonius4 suggested on Twitter, you can have an entire week with this type of insight into users' reality. His proposal is as follows:

  • Monday: Screen Reader Monday
    A day when you get the content on your screen read to you. Software for this probably already exists in your computer/device if you activate the accessibility features. Do not forget to turn off the screen, otherwise it will probably be hard not to cheat.
  • Tuesday: 2G Tuesday
    The day for having a really slow internet connection. This is when you notice which content you would rather not wait a long time for.
  • Wednesday: Keyboard Wednesday
    On Wednesdays you may only navigate using the keyboard, meaning the mouse pointer gets a day off. Then you will discover when the tab order is wrong on websites, whether that notification can be closed with a space bar or not.
  • Thursday: Colour Vision Deficiency Thursday
    Thursday is for not using colour or shade as a carrier of meaning – there are those who have reduced colour vision.
  • Friday: Mobile Friday
    This day you will have a small screen you control with your fingers. Tough luck if something is not responsively built or if zooming is disabled.

You can of course adapt these days to your own target groups. As a creator of websites with primarily a Swedish audience, the most obvious adjustment is to change Tuesday to a “slow 3G Tuesday” to simulate all of rural Sweden and when the network becomes overloaded at some popular location.

The reason this is important is that if usability is too low, nothing works for the users. That does not bode well for goal fulfilment on the website.

What stakeholders are there and what competencies are involved?

Who the stakeholders are depends very much on the organisation's size, type and perhaps primarily how far along the digital transformation has come. The most common stakeholders are those with managerial responsibility in various subject areas. Such as marketing manager, HR manager, CFO, etc. We will get more into that later, but there is no point in starting to report your findings to these people before you are well prepared. Especially not a lot of numbers – so-called “data puking”. However, you can contact them to ask for business goals that exist within their field, if they have any KPIs already developed. That is, if you do not already have this information.

I thought I would describe what a web analytics team can look like. But do not despair – it is probably more common than anyone wants to admit that you barely have a single full-time resource working on the web presence in a strategic and analytical way.

  • Marketing analyst – is the one who translates the business's long-term and short-term goals. Works with goal-setting for campaigns, etc. It requires both strategic and tactical efforts.
  • Conversion optimiser – is the one who works on how to succeed in converting users, that is, making loyal users out of spontaneous visitors and customers out of passers-by.
  • Technical analyst/web developer – works with specification and perhaps execution so that it is possible to collect data, to later analyse. If you do not develop your own website, this role becomes the person close at hand to tell you what is possible to do and document how it should be done.
  • Data scientist – a role still without a Swedish translation. It is a self-sufficient technician whose greatest strengths lie in being able to collect data, process it, design reports and create visualisations. Probably knows some statistics and explanatory models for how data should look.

Beyond this, there is a perfectly ordinary marketing department, or communications department if it is in the public sector. Note that no web strategist was included in this short list, nor any editor. This is because they are rather at the receiving end of this team's conclusions than having anything obvious to contribute. Not everyone is so specialised in their roles, but for goodness' sake do not let web analytics become solely a question about content. What is also needed is of course someone who leads this team, someone who ensures that tasks are distributed, and so on.

Data quality

What is the most important thing of all when you are to analyse something and make decisions that affect your future? A rather leading question, but of course it is the basis for the decision that matters. If your data is substandard, it is practically meaningless to try to draw any conclusions based on it. It is of utmost importance that someone working with web analytics within the business has a good understanding of how your data is collected and can verify that the model matches reality.

Inspect your data and conduct a data audit regularly, so that you do not make decisions on a substandard basis. You must know what your data looks like and how it is collected. This can be harder than you first think, hence it is wise to keep a diary with careful notes about what you do. Especially if your efforts affect which data is collected. It can also be helpful if you at a later point need to recreate the same web analytics environment. If you are worried about your data quality and want to use a proven model, check out Brian Clifton's book Successful Analytics5 and everything he writes in chapter four, including about Quality Score.

Something I have personally experienced is that different parts of a website tracked usage to completely different accounts for web statistics. This means, among other things, that when you link from the start page to a subpage, it counts as separate websites. All your data becomes murky and difficult to work with. The cause in that case was well-intentioned: each stakeholder (read: silo in the organisation) got their own account with web statistics and did not have to see the rest of the organisation. What we lost was the ability to get an overview of the entire website, which would surely be more meaningful in my opinion.

The solution would of course have been to double-register the usage of the website. To track each page view primarily to an overarching account but also to each stakeholder's own account. Or if we had used Adobe Analytics, we could have used a built-in function to create a subset for the internal stakeholders who wanted to avoid seeing everything else. The tracking can be fixed afterwards, but you cannot conjure up lost data. That is why it is important to think things through before you start.

An increasingly larger problem is that users block web analytics tools. Then an uncertainty arises about whether it is a homogeneous group you are missing in your attempts to reach insight, or if it is just fewer users evenly distributed across all the groups you know about. One way to try to find out is to start working with log analysis, a technique we will look more at in the third part of the book.
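
Log analysis is covered properly in the third part of the book, but the principle can be shown already here: the web server writes a line for every request regardless of what the visitor's browser blocks. A minimal sketch, assuming the common Apache/Nginx combined log format (the sample lines are fabricated):

```python
import re
from collections import Counter

# Apache/Nginx "combined" log format, simplified to the fields we need.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3}) ')

def page_views(log_lines):
    """Count successful GET requests per path straight from the server log --
    these hits exist even for users who block client-side analytics scripts."""
    views = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group(2) == "GET" and match.group(4) == "200":
            views[match.group(3)] += 1
    return views

sample = [
    '203.0.113.7 - - [24/Jun/2016:12:01:02 +0200] "GET /contact HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [24/Jun/2016:12:01:05 +0200] "GET /contact HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.9 - - [24/Jun/2016:12:02:00 +0200] "GET /missing HTTP/1.1" 404 312 "-" "Mozilla/5.0"',
]
print(page_views(sample))  # Counter({'/contact': 2})
```

Comparing counts like these with what your analytics tool reports gives a rough idea of how large the blocking group is.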

Filtering

Filtering can be done at several different levels. To begin with, you can choose to filter out what is even collected for your web analytics. For example, it is common to exclude logged-in web editors and employees from what is collected by tools like Google Analytics. In Google Analytics specifically, there is a way to filter out certain users after the fact. This requires you to write a fair amount of documentation so you can afterwards verify that the filtering was correct and at best be able to adjust it. A variant many used early on was to exclude certain ranges of IP numbers, that is, the internet addresses office workers shared when they visited the web. This method of course needed to be supplemented when employees started browsing the organisation's website via their mobile phones.
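
Excluding office IP ranges is straightforward to express in code. The sketch below uses Python's ipaddress module; the network ranges are documentation addresses standing in for your organisation's real ones.

```python
import ipaddress

# Hypothetical office networks -- replace with your organisation's real ranges.
OFFICE_NETWORKS = [ipaddress.ip_network("192.0.2.0/24"),
                   ipaddress.ip_network("198.51.100.0/25")]

def is_internal(ip_string):
    """True if the visitor's IP belongs to one of the office ranges
    that should be excluded from the collected statistics."""
    ip = ipaddress.ip_address(ip_string)
    return any(ip in network for network in OFFICE_NETWORKS)

print(is_internal("192.0.2.45"))    # True -- office visitor, filter out
print(is_internal("203.0.113.10"))  # False -- keep in the statistics
```

As noted above, a list like this needs supplementing once employees start visiting the website from their mobile phones, outside the office ranges.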

Image 3: ai-writer.com has successfully spammed its way onto the referring sites list.

Another concern is handling fake data. Those of you who use Google Analytics have surely seen those who try to spam you by setting a sender URL in the referrer list. There are ways to try to filter out the spammers, but you need to make sure you do not skew your data in a way that makes it unusable. There are a great many tips on how to avoid spam in your web analytics, among them filtering out everything with the wrong web address6. But to avoid embarrassing yourself, it can be wise to test-run your filter wishes first as a segment – more about segmentation below. Segmentation never changes data, it only gives a selective presentation – like a preview of what a data-destroying filter would yield.

Additionally, you can have view filters – purely temporary filters that mean you do not have to see certain types of data in the presentation in front of you. A completely different kind of temporary filter is called segmentation.

About segmentation and why it is important

Your users are not a homogeneous group, nor can you assume that a single group is homogeneous. To explore the variation within a group, you use segments. Segmentation never changes or manipulates the data source you are looking at, something that filtering sometimes does. Segmentation is about grouping users by one or more common characteristics. Segments of users are interesting to compare with each other. For example, what differs between customers on a mobile device and those with a second-to-last version of Internet Explorer? Perhaps an opportunity for improvement is hidden in similar comparisons.

Examples of interesting segments can be those whose usage patterns reveal them as:

  • Job seekers. How should you guide them further on the website to get them to submit an application?
  • Potential customers. They are surely in a different phase of how they need to be nurtured compared to existing customers. You perhaps cannot assume that they are uninterested in the organisation's support (just as not all mobile users are only after contact details), but they probably need different treatment than the loyal regular customer who has been convinced for a long time.
  • Existing customers. They may need a more personal experience and quick access to their contact channels.

Sometimes you may encounter the word filter, and in some contexts it means the same thing as segmentation. If you want to be a language purist, a filter mainly concerns what should be removed, but segmentation rather describes what should remain. Think of it as either filtering out 13,873 types of mobile devices, or segmenting your data to only show users who connected with a Fairphone mobile.
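As a minimal sketch of that distinction (the visit data is invented), a filter names what to remove while a segment names what to keep:

```python
# Invented visit records.
visits = [
    {"device": "Fairphone", "country": "SE"},
    {"device": "iPhone", "country": "NO"},
    {"device": "Fairphone", "country": "NO"},
]

def filter_out(rows, device):
    """A filter describes what should be removed."""
    return [r for r in rows if r["device"] != device]

def segment(rows, device):
    """A segment describes what should remain."""
    return [r for r in rows if r["device"] == device]

print(len(filter_out(visits, "iPhone")))    # 2
print(len(segment(visits, "Fairphone")))    # 2
```

Both calls yield the same rows here, which is exactly the point: the difference is in how you express the selection, not necessarily in the result.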

Surprisingly often, you need to challenge the untested assumption that all your users have similar needs and resemble one another. The assumption that everyone is in your target group, or that all users are fundamentally the same, usually falls apart once you look at your segments. There is usually great variation.

There is also often an unclear picture of who you are attracting. It is good to report who is using what you offer, something worth conveying to your stakeholders.

Segmentation involves grouping statistics by common factors. One segment can be mobile device users, another users from Norway. Never working with segmentation is very risky; without it you will probably not reach much insight at all.

Segmentation is a structured way to move beyond the low-hanging fruit once you feel you have inspected the large data sets from a holistic perspective. Without segments you do not know for whom you want to create an improvement, and it is also easier to follow up a change if you have an intended target group – a segment of users.

Without getting too deep into statistics, a word of caution is appropriate about only looking at statistics at an overall level, or having business goals that are unspecific about whom you mean.

The risk of looking at averages (and not segmenting) is that the average is not always representative of the majority of users. Everything is viewed from such an averaged standpoint that there are no meaningful conclusions to draw. To give an example, it is entirely possible to fail to touch bottom in a lake that is on average only one metre deep. That value says very little about how deep it is at its deepest point.

Image 4: A normally distributed curve. The majority of data resembles the average.

An average value is only meaningful if the data is normally distributed, or if the average value has predictable deviations. In other words, the average value “10 page views per visit” is not a particularly meaningful value if half of users make only one page view and the rest nearly 20. It does not describe reality. What is interesting is rather that there are at least two very different groups in this data.
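The page-view example can be reproduced in a few lines of Python (the numbers are invented to match the scenario):

```python
from statistics import mean

# Invented page-view counts: half the visits see 1 page, the rest 19.
page_views = [1] * 50 + [19] * 50

print(mean(page_views))  # 10.0 – describes neither group

bouncers = [v for v in page_views if v == 1]
engaged = [v for v in page_views if v > 1]
print(mean(bouncers), mean(engaged))  # 1 19 – two very different groups
```

The overall mean of 10 is a value that not a single visit actually had; segmenting into the two groups is what reveals the structure.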

To remedy this and gain clarity about which divergent groupings you can follow, segmentation is used. Then you can divide your users into different groups that can be explored separately. Or if you would rather compare them with each other. Segmentation is about capturing differences and deviations in the hunt for improvement potential. Differences in behaviour, geographic location, which products people have looked at, and more.

Which segments are worth working with?

The segments that come ready-made in your analytics tool are those so obvious that the vendor believes everyone needs them. It is not those segments that give you a competitive advantage for user favour. Rather, it is the business-specific segments that do the most good. These may even be so complex that you cannot get your analytics tool to list them. You may need to extend with more tools.

When you set up your business goals, there is a point in focusing on a few individual segments per iteration of the analysis work. In the example above, such a segment could be filtering out users who only make a single page view in order to instead see trends in usage among those who more actively use the website. Another segment can be to only look at those who turn around at the door, those who make only a single page view. This is a group that may be worth trying to convert with some campaign that only people in that segment see. A segment can also be your most valuable users. Their behaviour can be interesting to compare with the average user.

As said, averages do not always give a particularly fair picture of reality. Sometimes it is the fifth of users with the most problems you should focus on, despite the majority having very few problems. With web analytics you can work with multiple simultaneous groups of users in a structured way.

It can be difficult to assess which segments are sensible, not least when the business has not been around long. Many times you may need to get to know your customers, users and intended target group first in order to know which segments are most important for the business.

Segments are not necessarily static. When you are back in the reflection phase of your analysis work, you should not regard existing segments as sacred. They are just a logical division for exploring and testing improvements. If the business already works with personas (caricatures of your different users), the development work around segmentation can provide valuable feedback on what characterises each persona.

As you have understood by now, what is published on the website needs to align with the business goals and intentions the website has. Today it is not enough to claim a general interest or treat the website as your own information dump just because you enjoy producing content. The content must bear fruit!

Method for continuous web analytics

Increasingly, businesses on the web are being forced to live up to the requirements that older and more established professional fields have had for a long time. Having a website can, just as in the 1990s, be considered something you have because everyone else has one. A big difference from then is that users now have considerably higher expectations. It usually is not enough to welcome someone and give them some contact details.

Today a website is often quite costly to develop and maintain. By striving towards the business goals also on the website, you can steer your efforts and be part of the organisation's regular operations – and live by the same guiding principles.

Shortly, a suggested method for working with web analytics will follow. The next four chapters each describe the steps that make up the method, what things to consider at that stage of the work, and examples of what the result can be.

I cannot take full credit for this method, even if it may feel obvious to some people of a more structured nature than myself. Rather, I would like to explain that it is a mixture of what I have read about the subject over the years and how I myself have worked with continuous improvements of my own websites.

Short version of the method

Image 5: Start with business goals, then you are on your way.

Before we go through the parts of the method, I want to give a quick overview so that in each step you know what comes next.

  1. Work with business goals, KPIs and metrics.
  2. Collect your data; while collecting data, you develop reports.
  3. Analyse and communicate the findings.
  4. Implement improvements on the website.

So what are KPIs? A KPI (Key Performance Indicator) is a measurable goal the business has set up to describe success. Note the word 'key': there are also plain 'performance indicators', which measure something positive but not necessarily something worth steering the business towards.

A KPI can for example be an average order value, proportion of returning customers, or whatever is objectively good for the business.

Not all KPIs yield answers within one cycle of this process, because different amounts of time may be needed to collect data depending on what you measure. Some KPIs can be analysed less frequently, but I suggest you still reserve time in the calendar for each step. If you are on step three, you analyse everything that at the time has accumulated enough data.

Step 1: Working with measurable business goals

First and foremost, you must have a guiding principle. If you do not know where you want to go, it is hard to verify that you are on the right track. Does the website have a reason for existing at all? When Web Service Awards (WSA) in 20157 surveyed Swedish web managers about the point of their website, 27 percent said there was no stated purpose. In a way the result was even worse than that: only 54 percent consider themselves to have a defined purpose for the website's individual subpages.

About half, then, cannot explain why a certain subpage exists, what benefit it should contribute. Without goals, it becomes difficult to steer your fortunes, or follow up work through web analytics.

Remember, too, that goals, however important, are not worth much if you do not work to achieve them – then they are worthless. This is what you have your KPIs for; they should motivate you to follow up on the business goals, and this may need to become a process in order to actually happen.

Think of the goals as the reason why the website was built in the first place. Hopefully there was a reason and a goal in advance; otherwise it is time to address that now. What your goals are naturally varies somewhat depending on what type of business you run.

Four types of businesses

I once saw a fairly sensible (and very categorical) classification of what types of websites exist. It suggested the following broad categorisation, that a website is primarily about:

  1. Commerce
  2. Content
  3. Creating connections between people, or between organisations
  4. Self-service

Four types of websites that differ greatly. Together they illustrate that goals can vary considerably between organisations and that success is defined in somewhat different ways. For those working with commerce, a goal is easier to explain if it concerns revenue, sales or something else with a monetary value. An organisation working with content, such as the media, can to a greater extent allow itself to measure how many in an intended target group have seen certain content. For the connection-creator, it is all about relationships, and the self-service provider tries to solve tangible problems.

Sometimes industry writers say that one should find one's “objectively true metrics”. Yes, I understand that many get stuck on that term. By “objectively true” is meant that others besides you – those you work with, or people in the same professional role – should agree that your metrics are meaningful. That they describe whether something of importance or value has changed.

What does a value-creating visit look like? Macro and micro level.

What value should the business derive from the website? It is not just about what goals you have; those goals must become measurable. If you do thorough preparatory work, you can arrive at being able to tie, somewhat loosely presumably, some goals to something measurable on your website. It is extremely important to try to figure out what value can arise from the right kind of usage of the website. As usual, this is much easier if you conduct commerce on your website, which is why these examples are constantly brought up. However, even in other contexts, such as for a municipality, there can be clear ways to relate so-called e-services on the website to an actual value. For every completed building permit application submitted digitally, there is potentially an avoided expense for the municipality. But also a benefit for the individual citizen who could complete a bureaucratic task remotely, without having to plan a break in their everyday life.

Another example I often use is the one from the healthcare sector, which is very familiar to me since I happen to have worked extensively in this field. What is the achieved value of a user on 1177 Vårdguiden's e-services? If they log in and request a prescription renewal? First and foremost, you should remember that the less time and thought a citizen needs to spend on being in the care system, or identifying as a patient, the better for the individual. Being able to administer your healthcare remotely and in less time is very valuable. Furthermore, if a person does not show up in healthcare, the risk of infection decreases, there are fewer demands on premises or staff in certain places.

Goals can be divided into the following categories (most important first):

  1. Macro goals – the business's vision or grand plans (engaged customers, perceived as a thought leader, attractive employer, etc.)
  2. Goals – your KPIs, or key performance indicators. An absolute measure for evaluating success. If you want to subdivide KPIs, they can describe:
    2.1. Completion of a goal – the user completed all sub-steps successfully. If a goal is for the user to create an account, a completion KPI is that they create an account.
    2.2. Progress towards a goal – a step in the right direction, but the user did not quite make it all the way. Progress would be that a passing user chooses to read more about the benefits of creating an account. Close, but no cigar.
  3. Micro goals – measurable sub-steps that through hypothesis can be tied to a user confirming the macro goal (such as signing up for a newsletter, going from a landing page to a detail page). Here, a confirmed interest should be provable. An interest signal that markedly distinguishes a customer from someone just browsing and killing time.
  4. Metrics – the other values you collect. Often mostly out of curiosity, or to support the conclusions.

The point you work most with is number two. The first is about why you do anything at all – what it is all about. The micro goals and progress KPIs evaluate positive events, but they are mostly interesting for spotting trends, such as whether the trend has changed since you redesigned a function. The problem with putting your micro goals into your web analytics tool as something you measure is that they can overshadow the “real” goals. The more important goals do not occur quite as often, but they are ultimately what you are after; if no one ever completes a goal, it does not really matter how many newsletter subscribers you have collected.

I realise that it may not always be easy to distinguish between goals and micro goals, and the exact dividing line may not be directly crucial. Try to keep it so that a goal is what distinguishes a genuinely engaged person from an interested party. An example could be the micro goal of leading users towards the job advertisements on the website; it gives an indication that something of some form of value has occurred during the visit – it was at least not completely meaningless. To differentiate between the micro goal of window-shopping job advertisements, a fulfilled goal is that the user actually submitted an application for a job. A confirmed interest and a mutual goal has occurred.

Micro goals give an indication of interest; a goal should confirm some form of mutual benefit or goal fulfilment.

Last, and perhaps least after all, metrics. These are anything under the sun you might be curious about, such as which sources your most valuable users come from. Metrics are usually not goals in themselves, more like part of the backstory of the other goals.

Rank the goals amongst each other

An option in this work is to give each goal a relative value. Say that goals must be valued between zero and ten. Do you put a zero in value on a new newsletter subscriber, a seven on a new registered user and ten on a completed transaction? A lot of zeros is still zero in value. In this way you can relativise the benefit of lots of Facebook likes, thousands of contact details, lots of campaigns and so on with some other activity that perhaps creates the lion's share of all value (with or without the help of all the other things).
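A minimal sketch of such relative valuation, with invented goal names and values:

```python
# Hypothetical relative values (0–10) per goal type.
GOAL_VALUES = {
    "newsletter_signup": 0,
    "registered_user": 7,
    "completed_transaction": 10,
}

def total_value(events):
    """Sum the relative value of a list of recorded goal events."""
    return sum(GOAL_VALUES[e] for e in events)

# A thousand zero-valued events are still worth nothing:
print(total_value(["newsletter_signup"] * 1000))                   # 0
print(total_value(["registered_user", "completed_transaction"]))   # 17
```

The point of the exercise is exactly what the arithmetic shows: volume in a zero-valued activity never outweighs a single event that actually creates value.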

If you are instead in the situation where the benefit of the website is questioned, you can turn the reasoning around. On one hand you can counter with what value arises because of the website, but also what untapped potential for improvement there is by working more structurally with web analytics and getting to know the users' experience.

To translate the different levels of goals, perhaps it is time for an example. 1177 Vårdguiden, Sweden's counties' joint website for providing healthcare information to the public, could have the following goals:

  1. The macro goal is the business's goal that all residents who may at some point need contact with healthcare know about 1177.se and have an account on 1177 Vårdguiden's e-services.
  2. KPIs can include “percentage of population in the catchment area that has a resident account”, or completion KPIs like “percentage who manage to complete the entire prescription renewal process”.
  3. Micro goals are probably quite many, such as a user window-shopping for a healthcare centre to switch to, reading about their rights, etc.
  4. Metrics become type of device used during visit, location, time and everything else that can support or describe the conclusions.

This is not rocket science, so one can relax somewhat about the rigour, but being sloppy can backfire. The simplest, and probably wisest, is to use the goals, visions and such that the business has already formulated.

Not all business goals are measurable. But with a little imagination (and sometimes with the help of a skilled web developer), you can often come up with some way to at least partially get a picture of how well a goal is met on the website.

Try to ensure that the business goals selected – and in some cases refined to become measurable – are ones that can be acted upon. A drastic change in the metric should not be met with lukewarm interest or uncertainty about whether it is significant.

If inspiration for what your website should deliver to the business runs dry, you can think about how the website can make things easier for users while simultaneously creating value for the business. Also do not forget to look in the business's vision, plans and other documentation, as there may already be defined key performance indicators that can be reused. Break down each item to something that should be possible to isolate and measure. One challenge can be that the measurable business goals must work together with the long-term goals. It is therefore about trying to foresee the long-term consequences of the goals you set up. Trying to avoid negative side effects. To see how others measure their business, you can check out listings online with KPI libraries8.

Image 6: Speaking of trying to foresee future consequences9.
Image 7: This variant works for quite a while.

Another explanatory model you can borrow for categorising goals is the classic AIDAS10 from advertising. That is, the customer journey's conversion process consists of the following steps, from top to bottom:

  • A – Attention
  • I – Interest
  • D – Desire
  • A – Action
  • S – Satisfaction

Here, through advertising and marketing communication, you try to move a customer step by step. A perhaps more neutral variant is the one web analyst Avinash Kaushik has suggested11. A division of your intended target group depending on where in the consideration process they are:

  1. See – A person who matches an intended target group.
  2. Think – A person in the intended target group who is also aware of a need your service can help with.
  3. Do – A person aware of a need your service can help with, and who is also ready to act right now.
  4. Care (i.e. pamper the customer or spoil them), sometimes called “Coddle” – Those people who are regulars, or at least returning customers who have converted more than once.

Besides needing to define measurable ways to follow up on your goals, it becomes obvious with these models that you need several different types of goals. Probably at least one per category. If you are stuck with goals that measure a digital equivalent of a print publication's circulation, perhaps one of these models is a liberating way to discuss other angles on goal-setting.

I have at least encountered web managers whose only defined goals would fall into the See category above. That type of goal is very hard to defend on your own website, as it is probably the reason the user ended up on your web page in the first place – once there, there are no measurable goals for what constitutes a useful or successful visit.

It is definitely outside the scope of this book, but if you talk to a marketer, they would probably mention words like channel strategy, or perhaps channel mix. It is about the fact that your website is not necessarily involved in all parts of the conversion process. It may well be that you work with external bloggers, buy keywords on search engines or similar to build awareness. Then perhaps the last two sub-steps of the conversion process take place on your website, app, or a combination of whatever touchpoints you have with the person of interest.

Beyond the business goals, you can of course have some use for “vanity metrics”, like the number of page views per visit and similar. However, it is difficult to show why such a result is unambiguously good or bad. These figures should not be where you put great effort, as many page views (or similar) are hardly a goal worth striving for in themselves. They can be pleasant anecdotes among us web nerds, but they give very little insight into how the website becomes successful.

Examples of business goals translated into measurable goals – KPIs

  • At least 90% of those who begin the registration process should manage to create a user account.
  • At least every other case report to customer support should be made via the website's purpose-built form, and fewer than 10% should be made via email.
  • Users should find that the search function gives relevant results.
  • Customers should consider it easy to find contact details for customer service and the nearest store.
  • At least 90% of mobile users should consider the website to be very usable in a mobile scenario.
  • At least 95% of material on the website should have been reviewed/revised in the past year.
  • All pages shall comply with accessibility requirements in WCAG 2.0 level AA.
  • Completed purchases at campaign prices should have under 5% returns.
  • At least 25% of customers recruited through keyword advertising should make a second order within six months.

As you notice in these examples of business goals, they are not limited to statistics that end up in the web statistics tool, and that is the whole point. Some have a threshold, such as that at least a certain proportion should consider something or manage to achieve something. In other cases, there is no specific target, and instead you work with continuous improvement. How you choose to mix these approaches is up to you.

You need to work with both quantitative values, like those you get from the website statistics, and qualitative research, that is, the users' subjective opinions. And depending on the business goals, you may also need to look at other systems or do your own research.

Examples of refining business goals

To give you some ideas on how a business goal can be refined into something measurable, here are a number of examples. First up is perhaps the most difficult one. The somewhat too vague and abstract macro goal. After that follows a more concrete and easily measurable goal.

“We should be perceived as a customer-focused and competent company”

The wording itself makes it clear that this is a macro goal; surely something similar can be found in many organisations' vision documents or business plans.

Here you simply have to capitulate to the fact that you cannot measure a perception quantitatively via website statistics. However, you can work with website surveys – you know, those boxes asking if you have a moment to answer some questions about the website. If you conduct surveys or other forms of market research, you can track this macro goal's development over time, simply by rather boldly asking the question.

But how on earth do you manage that with individual users' sessions on the website? Well, one way is to start inventorying whether there is already content that relates to the goal. If the content is also “active”, meaning you can capture user signals about their approval, it becomes easier. If you completely lack such content, you need to consider whether this really is a goal that should influence the website's content. If you conclude that such content should be created, the content needs to be designed so you can follow up that the reception/usage is in line with the goal. You need to figure out how to place a call to action, i.e. something that can register the goal or micro goal as achieved.

Call to action, often abbreviated CTA, is a concept you will stumble upon regularly. A CTA is usually a button. The button's function can be that the user likes the page's content, rates it, shares a post on social networks, adds a product to the shopping cart, and more.

This sort of content can then be grouped via your web analytics tool so you can follow its fate and how it is received. Micro goals, or progress KPIs for that matter, for this group of content are whether it is shared by users. Or whether page views on these pages are part of other measurable business goals being fulfilled. An example of a successful user session could be that the user entered the website via a landing page about the competencies the organisation possesses and that the user submitted a spontaneous job application on another part of the website.

“Our communication should be comprehensible”

This macro goal is something the public sector gladly has stated; one does not want to allow the internal jargon to affect the general public. The goal is not very measurable in its current form, but considerably closer than the previous one. Here we are talking about the content, the website's message, and that it should suit the intended target group's language skills.

Image 8: Have you tried searching for things you do not want to exist on your website?

How do you find out if you are comprehensible? One way is to work with content analysis. Most websites already have a search function and every competent search function has an index with everything it knows about the website. This index can be inspected to find known linguistic aberrations and correct them. To keep track of common blunders over time, perhaps a developer can create an extension that performs searches for all “bad words”, where one hopes to get no results.
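A hypothetical sketch of such a "bad words" check, assuming the search index can be read as a mapping from page URL to text (all names and content here are invented):

```python
# A hypothetical, simplified search index: page URL -> page text.
search_index = {
    "/start": "Welcome to the municipality's website.",
    "/childcare": "Apply for a preschool place here.",
}

# Words the organisation does not want published (invented examples).
BAD_WORDS = ["internal jargon", "lorem ipsum"]

def pages_with_bad_words(index, bad_words):
    """Search the index for each unwanted phrase; an empty result is the goal."""
    hits = {}
    for word in bad_words:
        matches = [url for url, text in index.items() if word in text.lower()]
        if matches:
            hits[word] = matches
    return hits

print(pages_with_bad_words(search_index, BAD_WORDS))  # {} is what you hope for
```

Run regularly (for example as a scheduled job), a report like this catches linguistic blunders before users do.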

If you have a search editor, it is their task to check the language use regularly. A search function is remarkably good at revealing how bad the existing content is.

Speaking of search functions, you can also tackle the guilty conscience with search analytics (something that has its own entire chapter later in the book). The most obvious benefit of search analytics is to look at the most common search queries and how well they match words used on the website. When someone cannot find something, they may try to search for it. What makes the search function unique in this context is that there you find out what the user considers something to be called.

Even if the debate about ‘dagis’ versus ‘förskola’ (informal versus formal Swedish for ‘preschool’) will never end, it is important to work with synonym management and structure content in a way that gives the user a chance to find what they are looking for. A related KPI is the number of zero results in the search function, meaning when there is no matching content. This sometimes has to do with language confusion. As a municipality, you need to ensure that searches for ‘dagis’ also return results for ‘förskola’. Perhaps there are similar differences in language use in your industry too?
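A minimal sketch of synonym expansion at query time (the synonym table and page content are invented for illustration):

```python
# Hypothetical synonym groups; 'dagis' and 'förskola' should find the same pages.
SYNONYMS = {
    "dagis": {"dagis", "förskola"},
    "förskola": {"dagis", "förskola"},
}

# A hypothetical page store: URL -> text.
pages = {
    "/childcare": "Ansök om plats i förskola",
}

def search(query):
    """Expand the query with known synonyms before matching."""
    terms = SYNONYMS.get(query, {query})
    return [url for url, text in pages.items()
            if any(t in text.lower() for t in terms)]

print(search("dagis"))  # ['/childcare'] – no zero result despite the informal word
```

Real search engines handle this through configurable synonym dictionaries rather than code, but the principle is the same: the user's word, not the editor's, decides what should match.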

Another KPI is what level of readability you aspire to. There are several methods for measuring this, none of which is likely to impress a linguist. One is the readability index (LIX, test it yourself online12), which attempts to indicate how advanced a text is. LIX is not a perfect method but can indicate a text's complexity. Unlike distributing cheat sheets to web editors, LIX gives you something measurable for seeing whether the trend is moving towards more complex or simpler sentences.

A further variant, but for English text, is Flesch reading ease13 which gives a hint of whether the text risks being complex. Neither here is the method exact enough to start paying editors based on the score, but it may still be worth monitoring how the texts develop, whether it differs between different groupings of content. Say you build a tool that analyses this; then it becomes interesting to evaluate whether simpler texts perform better or worse than complex ones. Read more about A/B tests later in the book, but it is possible to evaluate whether one variant is preferable.
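The LIX formula – average sentence length plus the percentage of words longer than six characters – can be implemented in a few lines. This is a simplified tokenisation for illustration, not a production-grade readability tool:

```python
import re

def lix(text: str) -> float:
    """Readability index (LIX): words per sentence + percentage of long words."""
    words = re.findall(r"\w+", text)
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    long_words = [w for w in words if len(w) > 6]
    return len(words) / sentences + 100 * len(long_words) / len(words)

simple = "The cat sat. It slept."
complex_text = "Administrative considerations complicated organisational restructuring significantly."
print(lix(simple) < lix(complex_text))  # True
```

A script like this run over groups of content is enough to spot whether the trend is moving towards simpler or more complex sentences, which is the modest use the text argues for.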

“Our Scandinavian customers should perceive us as a Scandinavian brand”
Image 9: Zlatan and a Volvo in a winter landscape (source: Volvo Cars).

I have never had insight into what formerly Swedish-owned companies set as goals when they stop being owned or managed from the part of the world where they once started. Think Volvo Cars, owned alternately by Americans and Chinese, but going out with an extensive advertising campaign about being “Made by Sweden”. You can think what you like about such things, but the message and what they want people to believe about the Volvo brand is undeniable – Volvo is Swedish!

Together with Zlatan Ibrahimovic we have done a celebration to Sweden. It's our country's unique nature that inspires and challenge the people at Volvo when they develop their cars. It's also here, at home, in the magnificent wilderness that they find their strength. Just as Zlatan does.
- Advertising agency F&B in their description of the assignment14

But how do you measure whether people take in such a message, and whether there are certain user segments where you need to work a bit extra? This is where qualitative methods come in. Interviews of various kinds, surveys, guerrilla testing while waiting for the morning coffee at the nearest coffee shop.

As you have understood, it is not particularly meaningful to make this type of effort only once. What should you do with the knowledge that 72% think something? It is of course the case that value arises through a changed perception, and at some point you perhaps only need to maintain the position you are satisfied with.

There are some guidelines when designing questions to ask your users. As you ask, so shall you receive. I usually exemplify with how one thinks the Swedish people would answer the question “Do you think Sweden's head of state should be democratically elected?” It is a leading and at the same time misleading question, since many who want to keep the monarchy would answer yes, but through their answer indicate that they want to depose the king as head of state.

This type of problem is very common if you are careless with your questions, but also when it comes to which answer options are given. Some who work a lot with this are the media, and Swedish Radio's Eko editorial team has produced an excellent checklist15 that highlights common problems with survey responses. You should be able to answer yes to seven questions, namely:

  1. Is the sample random? Otherwise the margin of error risks becoming very large.
  2. Have enough people responded? For sweeping generalisations, you normally need 1,000 respondents, or if it is a smaller group, perhaps you should ask everyone.
  3. Have enough of those asked participated? If it is under 60%, it is time to worry that those who did not respond might have answered differently from those who participated.
  4. Was the survey conducted by telephone, by letter or through visits? Eko specifically does not approve of web panels, but web surveys can also be questionable if you cannot demonstrate that the respondents were selected randomly.
  5. Is the change you want to report statistically reliable? The famous margin of error can play tricks on you.
  6. Can you trust the sender? In contexts like the web, you need to be questioning about how the data collection has been carried out. If you have a third party helping, you need to trust that they know what they are doing and are giving you correct information.
  7. Are the questions neutral and sensibly formulated? The person formulating the questions can unwittingly reveal their own opinion through how the questions are phrased. The wording of a question can influence the answer you get. A question should be neutral and free of value judgements.
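To get a feel for point 5 – the margin of error – here is a minimal Python sketch of the standard approximation for a proportion measured on a simple random sample; the numbers are illustrative:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a proportion p measured on a
    simple random sample of n respondents (z = 1.96 for ~95% confidence)."""
    return z * math.sqrt(p * (1 - p) / n)

# Worst case (p = 0.5) with 1,000 respondents:
print(round(margin_of_error(0.5, 1000) * 100, 1))  # 3.1 (percentage points)
```

With 1,000 respondents the worst-case margin is roughly ±3 percentage points, which is why that figure recurs as a rule of thumb for sweeping generalisations.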

But as mentioned earlier, this is not rocket science; be vigilant about what conclusions you base on data of lesser accuracy. Do not make sweeping changes based on a statistically non-verified proportion of respondents. If you really want to be on the safe side (and spend lots of time), ask a statistician what they think of your method…

Measurable values and quality requirements

Besides the business's goals, there are other things you want to follow up, namely the metrics you have prioritised. These can concern how you expect the website to behave in order to be representative of the business. Dagens Nyheter ran an article series in autumn 2015 pointing out the ways the Swedish public sector let third parties, such as Google, Facebook and others, listen in when citizens used their websites. In some extreme cases, the third parties were organisations that made no secret of selling the information onwards. Having them on pages dealing with sexual orientation, sexual health, ethnicity or other sensitive personal data was of course not good at all.

Better is if you are clear about what values you have and what they might mean for the design of the website. Furthermore, there are a number of quality requirements you should set for yourself and conduct regular follow-ups so that things at least do not get worse the further away launch day is in the rear-view mirror.

What is expected of a modern, efficient and useful website that follows best practice is changeable. This places demands on frequently doing research. The last part of the book covers a long but certainly not complete list of activities you can start looking at to get a handle on the current state.

Beyond the above-mentioned metrics, there are also those that tell the user's story and let you gain insight into the user experience – both their frustrations and successes. As you have figured out by now, these are not goals for the website, but they are still important for a complete picture of how the website functions and can be optimised. You should keep these metrics with you throughout this method for web analytics.

Below you have a suggested set of points to discuss with the web team and with suppliers:

  1. Should we use or even allow third-party content delivery networks? Even if you may not be entirely consistent in this, it is good to document what exceptions you are willing to make. Many will surely conclude that regardless, they want to use Google Analytics, but that they will not design the website to include more actors. Temporarily, you can surely add tools to follow up on the user experience, get help from third parties on individual subpages to carry out A/B tests, and more. The important thing is to make an active choice.
  2. Which hygiene factors/quality requirements are prioritised? Check the last part of the book for a bunch of suggestions, but you will probably think of more. These can be documented in something called a style guide, a topic we will address in the next part of the book.
  3. What compromises are agreed upon, and what improvements are planned? Document which points should be met with the first iteration of the website. It may be that for budget reasons or similar, you need to postpone certain requirements to the future. This is a plan for the website's continuous improvements and how you can measure that the plan has been fulfilled.
  4. Agree on what should be included in a baseline measurement! If you have not already done a baseline measurement, or a measurement plan with historical comparison values, it is time to start.
  5. Create a representative “test page” that remains the same after each update of the website! IT providers, developers and consultants should verify their deliveries against this page to show that their contribution did not worsen quality – rather, one should expect improvements.
  6. Have documented buy-in from all suppliers! Whether you make it solemn or formal is up to you and the nature of the relationship with those who contribute to the website, but it is important that everyone understands the expectations that exist.

Checklist for a good business goal for web analytics

There are a number of criteria that indicate whether the business goals you have selected are relevant, namely that:

  1. There is a stated stakeholder. If you do not know who should care, it is rather pointless to start measuring.
  2. The stakeholder can become interested in a significant change. If a major change in the metric/KPI is met with a shrug, comments like “so what?” or incomprehension, you have a problem.
  3. The metric has an impact on the business's goal fulfilment. If it has little to do with the business's path to success, it is not a ‘Key’ Performance Indicator.
  4. Can be acted upon. A metric/KPI needs to relate to something tangible – for example, the point at which it is no longer profitable to continue an advertising campaign on a certain social network, or when a part of the website no longer contributes to the business's goals because very few value-creating users benefit from that content. There are also purely informative metrics/KPIs, but they are rarely interesting for anything other than understanding your users.
  5. Does not cause negative side effects. A good goal is thought through enough so that problems do not arise elsewhere in the business. For example, that increased sales mostly caused extra work, more returns and customers who felt deceived.

Also remember that very few care about the numbers! They want to hear insights and recommendations backed by collected facts – not a lot of numbers or data talk. Rehearse your presentation beforehand, verify your report with a colleague and make further preparations before you hopefully approach management.

Step 2: Develop reports and methods for upcoming analysis

In step two, it is time to start working out how data should be collected, how to ensure the quality of that collection is good enough, and how it should be presented. As already mentioned, reports are more about text than numbers. Even a few visualisations can be overwhelming the first time someone outside the web team's core takes in the content. A report should be a story that the recipient can remember and retell!

The concept of a report is delightfully vague. What I mean by it is really not that complicated; it is about how someone should be able to take in what has been inspected. Sometimes it is called a dashboard (which probably corresponds to the Swedish ‘kontrollpanel’). It may not be more complicated than creating a new tab in an Excel sheet where you document the results of the work.

Image 10: Example of a dashboard with many tabs, from the County Administrative Board of Örebro.

I have seen masses of different ways to build reports over the years. Most have been good. What you need to think about is who you are addressing. Who will take in the report and what activities you hope it will lead to. We who work with collecting data, or wade in website statistics all day long, might think we can act on data, but we are probably lying to ourselves. Most likely, we build a story about the data we are continuously fed and then we automatically react to the deviations we perceive.

A person who rarely sees the same statistics will, on the other hand, feel that we are spewing numbers at them if we merely present figures instead of telling the story behind the collected data. This is easy to forget. So easy that it has been given a name in English – data puking :)

So the first step is of course to figure out who will benefit from the information we will collect and later report on. What prior knowledge do those stakeholders have in statistics, web and technology? If you have not planned to adapt to the stakeholders, it is rather pointless to work with web analytics. They are the ones who can help you make good things happen based on the evidence you dig up.

Image 11: Example of a tool where stakeholders themselves can inspect collected data.

Two fundamental forms of reporting are sending a compiled report afterwards and providing ongoing visibility. The difference is also whether the stakeholder wants a finished deliverable to bring to a board meeting or wants to actively follow the development of the website in real time themselves. Most likely, you will end up with a combination of both.

A rough overview for the work in this step is as follows:

  1. Collect data.
  2. Assess the quality of the collected data.
  3. Refine the content, i.e. clean it up so that it is ready for upcoming work. This can involve converting geographic positions from one format to another.
  4. Mix in more data sources and dimensions. Say that you augment geographic positions with average order value.
  5. Pre-processing. Develop models for how to draw conclusions on potentially very large data sets. Perhaps you need to make the content easier to work with?
  6. Filter out the data you need. This can involve only collecting or compiling data about a certain user segment.
  7. Exploration, visualisation and data analysis. To know that you have the right content and verify all previous steps, visualisation and general exploration of the data set can be wise.
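The steps above can be sketched in a few lines of Python; the records and field names are hypothetical stand-ins for whatever your statistics tool exports:

```python
# Hypothetical raw records; in practice these come from your
# statistics tool's export or API (step 1: collect).
raw_hits = [
    {"user": "a", "country": "se", "orders": 2, "revenue": 700},
    {"user": "b", "country": "SE", "orders": 1, "revenue": 250},
    {"user": "c", "country": None, "orders": 0, "revenue": 0},
    {"user": "d", "country": "no", "orders": 3, "revenue": 1200},
]

# Step 2 - quality: drop records missing required fields.
clean = [h for h in raw_hits if h["country"]]

# Step 3 - refine: convert one format to another (here: country codes).
for h in clean:
    h["country"] = h["country"].upper()

# Step 4 - enrich with a derived dimension: average order value.
for h in clean:
    h["avg_order_value"] = h["revenue"] / h["orders"] if h["orders"] else 0.0

# Step 6 - filter out the segment you need: buyers only.
buyers = [h for h in clean if h["orders"] > 0]

# Step 7 - explore: a first summary before deeper analysis.
print(len(buyers), round(sum(h["avg_order_value"] for h in buyers) / len(buyers), 2))
```

Each stage leaves the data a little more refined than the last, which is exactly the point made above.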

Note also that your data becomes more valuable after each step; it is somewhat of a refinement process. If you are working as a generalist and get stuck, it can be a comfort to know that there is expertise in each sub-step. If you work in a larger organisation, this expertise surely exists internally; otherwise people are usually unexpectedly helpful when you reach out on social media (with short questions).

A new professional role that started emerging around 2015 is the data scientist: a multi-tasker who is good at statistics and technical systems, and who can get technical work done on their own even when the data source is very large. Beyond these, you may need to talk to those who know the content on a website, but preferably also a web developer, to get a handle on how data should be collected correctly. If you are in a tight spot, it is a developer in general that you are after.

Sometimes it can require making adjustments to your technical platform. Beyond the business goal's own data, it is good to collect data that tells the story around the business goal – you do want to know why something got better, not just that it happened.

Collecting data can be difficult

Image 12: How do you log automatic error messages on a website?

In theory, it is probably easy to collect the data you are interested in. But the larger the technical environment gets, the more complex the components that together deliver the website become. Where I work, at the county council, we are probably an extreme case: our technical systems have high protection value, as we handle extremely sensitive personal data, so for us it is not always a given that we want to take advantage of all the data we have. The same concern probably exists for most organisations, albeit on a smaller scale, now that practically everyone is starting to place great value on collected data.

Somewhat depending on what business goals you have, you may need to dig into this almost infinite technical complexity. It is worst when you want to find out in detail why a deviation occurred. These data sources are primarily relevant when you afterwards want to be able to investigate why people disappeared just before they completed a business goal on the website.

At my workplace, it has happened from time to time that important content has ended up in the publishing system's recycle bin. When a non-logged-in user requests a page that is in the recycle bin, they land on the publishing system's login page. In our case, that page is not logged by the website statistics. Thus users disappear into a black hole, from a web analytics perspective, and we completely lose sight of the user's experience.

One way that many website statistics systems offer for capturing things that are not a regular page view is events and virtual page views. It involves registering, behind the scenes, that something occurred, good or bad. In many ways, it is practical to be able to collect signal data in the same platform as your website statistics.

If you cannot manage to make various error pages register the user in the regular website statistics, then the analyst's access to technical logs becomes crucial. Speaking of the difficulty of collecting data, it is important for the work's credibility that you can answer the question of how your data was collected, what factors affect it, and so on. It can be sensible to document what constitutes a virtual page view, when you choose to register it as events instead, and so on. It may not be so important exactly how you choose to do it as long as you are consistent; otherwise you can yourself cause unwanted variation in your own data source.

Log analysis may see a resurgence. It was what we used in the old days on the web since not enough people had JavaScript support in their browsers. Today the problem is rather that many users block web analytics tools to protect their privacy. For the masses, this appeared during autumn 2015 when Apple released iOS 9 which had support for so-called content blockers. These allow you to block Google Analytics, Matomo, and most other tools that try to keep track of what the user does on a website.

Initially, it was mostly shady or unknown providers of these blockers. But after a few months, the open and fairly credible organisation Mozilla came out with Focus by Firefox. Suddenly there was an open alternative without a commercial or hidden ulterior motive.

Even before these content blockers were built into Apple's mobile system, there have of course been special browsers. And for computers, ad blockers have existed for a very long time – they can also be used against web analytics tools.

Image 13: Focus by Firefox makes it easy to block various surveillance.

The existence of this type of tool is worth thinking about when you design your data collection. Will the tool you choose cause all these users to be unable to use the website? That could be enough to go bankrupt if you are really unlucky. My recommendation is that you try out these blocking tools yourself and carefully follow what impact they can have on your website and how they affect the tools you choose to use. You may have experienced yourself that you cannot scroll down on a website? At least I have. The question one should ask is: how likely is it that the few who figure out that it is due to their content blocker actually choose to do something about it? Perhaps it is more likely that they do not bother and visit a competitor instead?

The bottom line of your work around data collection is that you need to check that you have a reasonably complete picture of what can happen for users who visit your website. Do an inventory of which systems contribute to displaying the pages that can be reached and check that you have visibility into each system. In a larger organisation, it can typically involve having the following systems contributing to the website's content:

  • The content management system itself, the CMS, where editorial material is handled.
  • A separate document management system (ECM – Enterprise Content Management) for keeping track of documents, brochures, forms, and more.
  • An image bank in an image management tool – a so-called MAM (Media Asset Management).
  • Video clips, fonts and other third-party files that are not unique to the organisation, hosted on a content delivery network (CDN) to speed up transfer.
  • Various odd solutions from external actors that are loaded via the, excuse my bluntness, completely worthless technology Iframe, i.e. a small peephole in the page that loads content from a third party.
  • Third parties that via APIs contribute with currency conversions, card payments or content such as current product texts.

You need to have a handle on this. If nothing else, you will as soon as you start investigating how it is that it looks the way it does (or why something does not seem to work as intended). It is sometimes an interesting challenge to manage to measure a user's journey between multiple different systems, but often you get a long way by registering a virtual page view before the website hands the user over to, say, the document management system.

Tools for analysis

There are very many tools for working with web analytics. Furthermore, new ones are constantly being added. Some are small solutions that help with individual tasks, like building reports; others are large environments that offer you and the stakeholders a fairly complete view of the current state.

The most well-known type of tool is those that help with website statistics. The biggest is probably Google Analytics, but Adobe offers a counterpart in their Adobe Analytics which has its strengths. For those who do not want to involve third parties, there is Matomo. It is an open-source solution that you can install on your own website and thus protect personal privacy. We use it at my workplace, and in a post-Snowden society there are probably more and more public actors starting to take an interest in Matomo.

These tools really only show the most obvious information: the click stream on your website. Where did the users come from, what did they do and did anything of value happen? The views of information these tools offer are very generic – things so obvious that everyone can probably benefit from them. What you need in order to keep track of your specific website, you may have to figure out and supplement yourself.

There are more and more tools for helping yourself with analysing collected information. One tool I have found useful is called Tableau. It is a way to explore the data you have collected in the hope that deviations and patterns can give you insight into what can be improved to make the website a little bit better for the users. Here you find new views of the data you have accumulated. It depends a bit on what the goal you are measuring is about or what kind of data you are looking at. Later, in the more advanced part of the book, we will go through some of the useful tools that exist but have among many of us ended up in the shadow of website statistics systems like Google Analytics.

There are a number of well-used techniques you can look at for how your reports can be designed, and shortly we will go through some of the opportunities available. It can be useful to have a web developer at hand until you yourself know how complicated your solution will be to realise in the system the website runs on.

A large part of the benefit of the report work and methods is the learning process. Working with what benefit you should strive for, getting to know your users' behaviours and needs is something that means each iteration of this work starts at a new higher level.

Yet another concept you need to know is CTR (Click Through Rate), which refers to how large a proportion of users chose to click on a certain link in a search result, or follow the call to action on a page they visited. The goal is to have a high and predictable CTR as it means the user understands what they are being served and accepts continuing towards the goal.
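As a sanity check on the definition, CTR is simply clicks divided by impressions; here is a minimal sketch with made-up numbers:

```python
def ctr(clicks: int, impressions: int) -> float:
    """Click-through rate as a percentage of impressions."""
    return 100 * clicks / impressions if impressions else 0.0

# A link shown 2,000 times in search results and clicked 150 times:
print(ctr(150, 2000))  # 7.5
```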

Conversion funnel

Image 14: Large drop-off in the first step, almost none at all in the last. From Google Analytics for 1177.se

The method that is probably most common when it comes to reporting goal fulfilment is to use a conversion funnel. It is particularly well suited for visualising how large a proportion of users drop off in a process with multiple sub-steps. You define a starting point and then measure the click-through rate at each sub-step – that is, how many users you lose along the way towards the goal.

If you have a low click-through rate, it can among other things indicate that you need to improve usability in the step where many users disappear, or that the next step is not enticing enough for the user. If you have not worked with conversion funnels at all, you can be in for a shock the first time. There may be enormous potential for improvement in reducing the drop-off rate at a single step in a multi-step process towards a common goal.

Imagine the business goal for an intranet is that at least 90% should manage to submit their leave request. If then 25% of them disappear at the first sub-step, you get to try to figure out what it is due to, or possibly arrive at a more reasonable goal.

It is of interest to segment out those users who do not complete all steps in a conversion funnel to look for patterns. Are there any common denominators for where they drop off? Where they go? Is it possibly an unclear call to action that makes users not understand what they are expected to do to proceed?

It does not necessarily need to be many sub-steps for a conversion funnel to be suitable as a visualisation when you report the web analytics results to your stakeholders. A simple example of a conversion funnel can be how large a proportion who after a site search actually click on any of the results. That could be one of several ways to evaluate whether an improved relevance model for the search function has been successful. What a conversion funnel can tell you, unlike click-through rate, is what the most commonly used exits are.

A conversion funnel is not limited to your website but can very well be the benchmark for impact of email newsletters or even something in the physical world.

A/B testing to compare two different alternatives

Image 15: A/B testing is like a seesaw comparing alternative A against alternative B.

A/B testing is a method for seeing which of two alternative ways to design something works best on real users. This is hopefully a prestige-free test of people's hypotheses to see what works best. It is somewhat of a competition that runs for a limited time by randomly presenting alternative A and alternative B to users. Afterwards, you inspect how the two alternatives performed and whether you can declare a winner that gets the trust to continue existing (until a new challenger appears).

It can be an excellent (and harmless) suggestion when a pushy person proposes a certain change. Completely disarmingly, you exclaim that it is an excellent suggestion that you are eager to evaluate on live users :)

With an A/B test, you should try to answer three questions: who, what, and why.

  • Who is the change aimed at? It is a certain segment of users, in other words a subset that has something specific in common.
  • What in the users' behaviour do you hope will change? It can for example be about more people choosing to write a product review on the website after a completed purchase.
  • Why should the change happen? There needs to be a testable hypothesis about how the goal/improvement can be achieved, which is usually a change in design.

An example of an A/B test for a website is to check whether a right column with extra information works better for the segment of users connecting via a desktop computer than placing the material below the article's body text.

A/B testing can really make a difference. One of the more inspiring examples I have heard is when Obama's presidential campaign in 2012 did A/B tests to find the best way to bring in as many donations as possible. After they conducted 240 A/B tests16, they had increased the conversion rate by 49%. If you succeed with that level of optimisation, it can make a really big difference.

Random selection for test group and control group

Keep in mind that it can be confusing, and can worsen the test's quality, if as a user you get different versions during the test period. Try to design the test so that you consistently show one alternative to a certain group of users. Furthermore, it is preferable that the selection is random. This can be by distinguishing users by their geographic location, rank in the value chain, or perhaps depending on whether it is their first visit or not. Beyond that, it is randomised whether you end up in the test group or the control group. Whether you get the new proposal or the existing one, alternative A or B.
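One common way to get a consistent yet effectively random split is to hash a stable user identifier; this Python sketch is one possible approach, not a prescription, and the experiment name is invented:

```python
import hashlib

def assign_group(user_id: str, experiment: str = "exp-1") -> str:
    """Deterministically assign a user to group 'A' or 'B', so the same
    user always sees the same alternative during the test period."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"

# The same user id always lands in the same group:
print(assign_group("user-42") == assign_group("user-42"))  # True
```

Because the assignment is a pure function of the user id, the same visitor sees the same alternative on every visit, while across many users the split stays close to 50/50.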

The reason you want this semi-scientific setup is so that you can draw conclusions based on credible data. If you have enough users to test on, say 5,000 users, then 2,500 are randomly selected for the test group – those who get a new design proposal or text tested on them. When the test period is over, you compare the test group's and control group's results; which group was most successful against the goal you wanted to achieve? If the difference between the two groups is greater than negligible, you have found something. But it can absolutely be the case that you conclude that the old design alternative still works best.
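Whether the difference between the groups is “greater than negligible” can be checked with a standard two-proportion z-test; the conversion counts below are invented for illustration:

```python
import math

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """z statistic for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)  # pooled conversion rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# 2,500 users per group: 200 conversions in control, 250 in the test group.
z = two_proportion_z(200, 2500, 250, 2500)
print(round(z, 2))  # 2.47
```

A |z| above roughly 1.96 corresponds to about 95% confidence that the difference is real rather than noise, matching the margin-of-error reasoning earlier in the book.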

Even if the test concludes that the two alternatives perform roughly equally, you have learned what room for manoeuvre you have and perhaps dare to make a more drastic change for the next test.

Things you may want to test include:

  • Different layouts, such as whether it works better with a side column on wide screens or if it should be placed below the article. In this way you can investigate the phenomenon of “banner blindness”.
  • Microcopy, for example which button texts work best for both getting a click and for the user to complete the entire process – abandoned shopping carts have no real value.
  • Colour alternatives; it is well known that people from different cultures react differently to colours17. In competitive contexts, it is demonstrably more likely18 to win if you wear red clothing. So what colour do you choose for a buy button? Blue, red, green, or other? Test!
  • Imagery; among other things, you can evaluate which type of image works for ads on a certain ad placement, but also whether it is the ocean view or the pool that sells hotel packages by the Mediterranean.
  • Usability factors. An interesting thing that can however be a bit complicated to test is whether you can improve the experience of a multi-step process by reducing or increasing the number of sub-steps. That users cannot be bothered to click more than three steps is an untested assumption until you have actually investigated the matter. Perhaps the process becomes clearer if you introduce an additional sub-step. By comparing the alternatives' conversion funnels, you can see which alternative retained the most users to the end.

Just as with conversion funnels, A/B testing can and should be used in other contexts than strictly on the website itself. In email campaigns, it is common to send out two alternative messages, for example on an offer of some kind. When you afterwards analyse what impact each alternative had, thoughts arise about what you might do better in a future campaign.

Multivariate testing to test multiple things simultaneously

If you want to and dare to complicate things somewhat, you can carry out so-called multivariate testing. This means testing multiple adjustments simultaneously as a package, but otherwise it is a mix of other methods – perhaps primarily A/B testing.

An example is when you initially design landing pages on your website. Which alternative worked best of, for example:

  • Alternative A with a large decorative image, bold headline and a single call to action?
  • Alternative B which, without an image or bold headline, uses the space to offer three distinct calls to action?
  • Alternative C with copy text personalised for the user based on their click pattern, how they arrived at the page, and so on?

The yardstick for which alternative to continue working with is the highest click-through rate on the calls to action you have set up. Or even better, if possible, the actual profitability of the user's session. The first round of this test would give answers about how many calls to action users can handle on a single page. Depending on the results, you can design the next test to try to optimise further.
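Picking the provisional winner from such a test can be as simple as comparing click-through rates per alternative, assuming the variants received comparable traffic; the figures below are made up:

```python
# Hypothetical results for three landing-page alternatives after a test period.
variants = {
    "A": {"views": 5000, "clicks": 400},
    "B": {"views": 5000, "clicks": 520},
    "C": {"views": 5000, "clicks": 465},
}

# Click-through rate per variant, then pick the best performer.
rates = {name: v["clicks"] / v["views"] for name, v in variants.items()}
winner = max(rates, key=rates.get)
print(winner, f"{rates[winner]:.1%}")  # B 10.4%
```

As with A/B testing, a small lead should still be checked for statistical reliability before you declare a winner.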

In this method too, segmentation is used to isolate subsets of the information collected, perhaps primarily to tell the story around what you are testing.

Checking quality factors

Not all metrics can be answered with the help of collected statistics about users' clicks and behaviour. You should also keep other forms of quality indicators in mind. Say that the communications department needs to follow up on published material, for example by listing all material that has not been updated in a certain number of months, or:

  • List heavy images in order of size in the hope of being able to optimise them.
  • Check if there is outdated technology hiding somewhere. For example, Flash files or strange document formats.
  • Web pages with page titles that do not follow current best practice in search engine optimisation, something that hinders the ability to find them via search engines.
  • Documents that have an inappropriate name, for example “New Word Document”, lack a title, or exhibit other common mistakes that make the content difficult to search for or embarrassing when someone finds it.
  • Material that lacks sufficient metadata – for example, a description of the page or keywords for the internal search engine.
  • Pages that respond very slowly. Slow page views can be due to many things but often it is about complexity in the system for compiling everything that is to be displayed to the users. But sometimes you simply have a web server that is too sluggish.
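Checks like these are easy to script once you have an inventory of the content; the rows below are hypothetical stand-ins for data extracted from a CMS or a search engine's index:

```python
from datetime import date

# Hypothetical inventory rows, e.g. extracted from the CMS or the
# internal search engine's index.
pages = [
    {"url": "/a", "updated": date(2015, 1, 10), "kb": 40,   "keywords": []},
    {"url": "/b", "updated": date(2016, 3, 1),  "kb": 2100, "keywords": ["leave"]},
    {"url": "/c", "updated": date(2014, 6, 5),  "kb": 15,   "keywords": []},
]

today = date(2016, 6, 1)  # fixed date so the example is reproducible

# Pages not updated in over a year.
stale = [p["url"] for p in pages if (today - p["updated"]).days > 365]

# Heavy pages, largest first, as candidates for optimisation.
heavy = [p["url"] for p in sorted(pages, key=lambda p: p["kb"], reverse=True)
         if p["kb"] > 1000]

# Pages that cannot compete in the internal search engine.
no_keywords = [p["url"] for p in pages if not p["keywords"]]

print(stale)        # ['/a', '/c']
print(heavy)        # ['/b']
print(no_keywords)  # ['/a', '/c']
```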

At my workplace, we have over a million searchable pages/documents. With the help of everything the search engine knows about the pages it has indexed, we can extract a great deal of interesting statistics. Among other things, that 89% of the content on the intranet lacks keywords – in other words, 9 out of 10 pages cannot compete for the top spots on the internal search engine. No wonder the search engine has difficulty selecting what is most relevant. It is not much better on our external websites.

Image 16: Optimizr.com provides reports via email.

Feel free to take inspiration from the reports that commercial tools have; it is usually both easy and free to sign up for a trial period with most tools. When you develop your own template for reporting, it is of course important to focus on the value-creating events. Not just all the potential concerns, challenges and notices that are obvious enough that standard tools highlight them.

For those of you who do not already have suitable software that can inspect the entire website, or at least large parts of it, there are tools you can run yourself or together with a more technically inclined colleague. One I have personally used and liked is called SEO Toolkit, an add-on for Windows Server. That server version of the Windows operating system is found within practically every larger organisation and is probably not unusual among developers in your vicinity.

Image 17: SEO Toolkit can crawl through your website and provide a great deal of tips.

Examples of other sources where you can find data of a qualitative or quantitative nature include:

  • Customer service or the IT department may have logs of cases concerning difficulties using the website, or other cases where the website failed to help the user.
  • Usability studies have sometimes been conducted at my workplace on subsystems without my knowledge, and yet, thankfully, they covered the parts I work with. So it does not hurt to ask around a little.
  • Transaction data, which can go by many different names depending on the knowledge of the person you are talking to. For example, it can be called a service platform, Enterprise Service Bus (ESB), integration platform, APIs or similar. This is a form of digital telephone exchange and information desk combined.

The next step is of course to inspect, compare and analyse all the data you have collected.

Step 3: Analyse

The goal of the analysis phase is to understand why users behave in a certain way or think as they do. What you have done so far is to gather evidence for what happened when, and where. Now it is time to try to answer the question why. Here you get the chance to see what obstacles prevent users from completing the desirable activities on the website and what you can tackle to optimise the value of the website.

What you quickly notice distinguishes different users, especially if you talk to them, is that their satisfaction is affected by at least two factors:

  1. The user's expectations before the visit. Users can be differently motivated; perhaps you had already convinced the person before they arrived.
  2. The actual user experience. A very motivated user puts up with a bit more than one who needs to be worked on a little to convert. It can be about how easy it is to understand what to do. How many sub-steps something has. Whether one can even use the website under the conditions one finds oneself in and what device one is using.

The first thing you must consider in the analysis work is whether you have enough data to draw any conclusions. Even if you conclude that you probably have too little data for big, far-reaching conclusions, your data surely gives you indications of how things stand and tips on how you can redo the test on a larger scale. But do not get bogged down! There is a risk that you spend more time analysing than improving the website. If you are in doubt, contact a statistician and mention keywords like confidence interval (that is, what the margin of error is) to catch their attention. Or google it and read up on the subject. :)

Also be aware of any seasonal variations that can affect the test. In my experience, there is information and support that is needed more at certain times of the year. For example, data from my workplace supports that it is common for people to forget their passwords during the summer holidays. This is also an example of how your conclusions in web analytics can also support internal communications with the priorities in editorial work on intranets. If people have extra trouble with passwords during late July and all of August, there is probably an efficiency gain in placing relevant tips on the intranet's starting points for how employees should act when they have forgotten their password. External users can surely have a similar difficulty the first time they need to log in after a long holiday.

If you are going to place this seasonal information anyway, you might as well do an A/B test to evaluate the best possible wording instead of just writing something up. I would personally test both using a question as the title, for example “Forgotten your password?” but also the more active “Get a new password”. If you do it as an image teaser, you can test which form of image or illustration performs best. Here we get into the hypotheses about intended improvements that are connected to answering the why question and the desire to optimise.

The data you have collected can sometimes contain unwanted variation or extreme examples. If you want to be completely on the safe side, it is as mentioned a good time to bring in a statistician, but often you can use historical data to figure out what constitutes an unlikely scenario. If that is the case, it is perfectly fine to filter or segment out these odd cases. What you achieve is statistics that are admittedly not as exact, but you can see trends over time. The exactness is really most crucial in comparative tests between competing alternatives, if no alternative wins a landslide victory.
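Filtering out the odd cases can be as simple as applying a rule of thumb to historical data. A sketch using the common interquartile-range heuristic (the factor k=1.5 is a convention, not a law, and the load times below are made up):

```python
def filter_outliers(values, k=1.5):
    """Drop values outside [Q1 - k*IQR, Q3 + k*IQR],
    a common rule of thumb for extreme observations."""
    s = sorted(values)

    def quantile(q):
        # Linear interpolation between the nearest sorted values.
        i = q * (len(s) - 1)
        lo, hi = int(i), min(int(i) + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (i - lo)

    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Page load times in seconds; one visitor left a tab open on a train.
load_times = [1.2, 1.4, 1.1, 1.3, 1.5, 30.0]
print(filter_outliers(load_times))
```

As the text says, the resulting statistics are less exact but the trend over time becomes easier to see.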

Inspect the few who convert – or everyone else?

Based on the goals and user segments identified, now is when you ask which group has the greatest potential for improvement. Should you try to gain insight into those who converted, or the perhaps 98% who did not? I assume that you too find it interesting to try to convert more of the large mass of users. But of course, one should not forget to do even better with those who have already confirmed their interest in the business goals by actually converting.

Increasing the proportion that converts by a mere percentage point can mean a doubling, if you had one percent converting before the improvement and two percent after. It can be far more difficult to double the benefit/revenue/etc. from existing users or customers.

The often small proportion that converts are those who have confirmed themselves as interested in what the business offers. But the rest also largely consists of interested users who, for some reason, did not get to the next step. In this group you find those who abandon shopping carts, browse the range, read up, and use comparison functions but do not complete anything (yet).

In the analysis phase, you need to check where someone drops off in processes, see where the friction occurs and come up with suggestions for concrete measures for the next step in the work process. Analysing is about, based on data and structured evaluation, arriving at conclusions about how well something has performed, as well as hypotheses for how something can be improved further.

The short-term results of the analysis

Based on the data you have collected, processed and analysed, you should document the findings and lessons learned. It can be good to note with what certainty the analysis points in a certain direction; there is nothing wrong with being humble about this not being an exact science. Also write down which methods were used for each test. A log, quite simply.

Keep in mind that behind an unambiguous A/B test, there may well be data that clearly explains why one variant performed better. For example, it may turn out that there is a big difference between mobile users and those with desktop computers visiting a certain subpage. Perhaps the design element that is the page's call to action is not as easy to spot on a small screen? That may be what the segmentation shows, but it could be worth testing again to see whether that assumption was correct. Perhaps the test can have bearing on how other parts of the website can work even better?

Feel free to note which tests may need to be improved and run through another cycle of the entire analysis work. It is not exactly unusual that you do not get results that are clear enough to draw any big conclusions, but that knowledge too is not worthless.

As the last point in the analysis phase, you compile the report based on the template you worked with in the preceding step and fill it in with insights and conclusions (being careful not to insert more data or numbers than absolutely necessary). This report is partly something to archive, but primarily it is now that you compare findings with the stakeholders who “own” the website or its various parts.

Do not contact the stakeholders and say “We have 17.39% more visitors compared to the previous period”, or other similar statements. Before presenting or handing over the report, you and a colleague can present it to each other, taking turns playing devil's advocate – the one who complains and questions everything under the sun. What you report on should give insight and be possible to act upon. It should pass a test I usually call the “so what?” test. If you cannot answer the counter-question of why anyone should care, then what you want to report on is not worthy of anyone else's attention.

Examples of things to report on that both give insight and can be acted upon include:

  • “If the trend continues, our Facebook campaign will stop being profitable in two weeks (calculated on a customer's average profitability over 5 years).”
  • “Over half of our competitors now have better performance than us. There are three simpler measures we would like to test and evaluate, to begin with.”
  • “We have redesigned the shopping cart process and evaluated it. On mobile devices, completions have doubled, but it dropped by 7% on desktops. Overall roughly break-even, but as we expect more mobile traffic, we suggest keeping the design.”
  • “Only a few of our landing pages follow best practice in SEO. We suggest a redesign of the CMS so it is not possible to forget the description text and that page titles are never over 70 characters long.”

Step 4: Implement improvements on the website

Based on hypotheses and conclusions from the analysis phase, you prioritise which improvements to tackle first. Some work may have a big impact while being a small effort; other things may be complex because they have external dependencies – regardless, you now have a list of things to address.

Examples of work in the improvement phase include moving buttons to increase usability, something that may only be needed for those with small screens. If you have run A/B tests “manually”, i.e. without automatic support in the content management system, it is now you ensure that only the winning alternative is used.

The last thing you do in the improvement phase is to add to the wish list with the improvements you prioritised away this time. That list can prove useful going forward when you happen to get some time over, perhaps can combine certain activities that resemble each other, or as inspiration next time you redesign the website.

Some of the business goals you have set up can take quite a long time to influence regardless of what improvements you make, so have reasonable expectations for how quickly you can start seeing results.

...and then what?

After completed improvements, you run another cycle and start over with step one and your business goals. This work is something that is never finished; what governs is rather the level of ambition. The level of ambition determines how many cycles you can manage per year.

Become a certified web analyst?

If you feel insecure in your role, there is good help to be had, beyond this book. Even if you have educated yourself or worked with statistics, have a background as a web developer or in other ways feel you have a handle on a lot, it can be good to be able to refer to something concrete. Or to ensure that the experience you have acquired does not have big gaps in any important area.

Something that makes you a web analyst even in the eyes of a sceptic is earning the title. For example, for those of you who use Google Analytics, there is their Academy, a free web-based education where you learn both theory and practice. There you can learn the ropes of Google Analytics specifically and ultimately have your knowledge confirmed by becoming certified on the platform. Even if you do not use Google Analytics, the education and its certification can be useful.

Having a certification in an area is often the quality stamp that opens doors, something that shows you have at least achieved an okay level according to a third party.

Common pitfalls when working with website statistics

There are of course many different ways to fail with your analysis work, far more than I have the ability to research or you have the stamina to read about. I have selected two main groups that, at least in my experience, are very easy to fall into: goals of questionable quality, and getting lost in your statistics.

Vanity metrics – metrics or goals that give a good gut feeling, but lack actual value

A common thinking error is what are called “vanity metrics”, that is, metrics that appeal to one's vanity. What appeals to one's vanity varies, I suspect, depending on one's professional identity. Often it is comparisons of values whose connection to an unambiguously good result is far-fetched.

Examples I have heard often over the years are how many page views you have, or the number of unique visitors. Just because the print industry used to get away with vague figures like what print run something was released in does not make the value a reasonable business goal worth striving towards.

It is questionable what benefit can be achieved by setting goals for the average number of page views on a website. Are many page views per visitor unambiguously good? Are they bad? You need to be self-critical of your goals; then perhaps you will avoid others being critical for you. Reflecting on your goals is built into the method's workflow. It is entirely natural to tweak the goals the more you learn about your website, the users' behaviour and the content you offer.

The relationship between web strategy and systematic work with web analytics is what helps you keep order in your mix of goals, so that they do not just give a good gut feeling.

Web analytics guru Avinash Kaushik has, in addition to See-Think-Do-Care, introduced the “Ladder of Awesomeness” to visualise this. Your web strategy is about how you climb high on the ladder, while web analytics is about how you follow up the results, improve and experiment.

Image 18: Avinash's ladder illustrates the steps that need to be taken to create a good experience on the web.

At the bottom of the ladder are the hygiene factors that this book focuses on in its concluding part; at the very top you find marketing at a high level. Need I mention that it is meaningless to be good at marketing if the bottom two steps are substandard?

Agenda-amplification effect

Letting one's own agenda steer can feel similar to vanity metrics. I would argue, however, that this is about how neutral and open you are to learning something new. With statistics and research, you often find arguments whose only value is that they confirm an answer you have already decided on. To recycle an example, let us say that yours truly, who is sceptical of the royal family's democratic value, asks the public the following question: “Do you think Sweden's head of state should be democratically elected?”

What do you think the answer would be? I would guess that more people would answer yes to that question compared to if the question had been formulated like this: “Should Sweden depose the king and have a democratically elected president instead?”

As you ask, so shall you receive. And depending on one's ingrained thought patterns, it can be difficult to be completely objective, or even ask questions that are not leading. It is good to have several people doing this work and perhaps even introduce an activity where you challenge your assumptions, explore other explanatory models, and more. There may be other insights there that are not in line with one's own agenda.

Common mistakes with statistics

The most obvious mistake is collecting data whose quality is not good enough. Perhaps it is also not representative of reality. As soon as you are uncertain about your data quality, you are no longer doing statistics; you are making guesses based on a substandard foundation. Then it is back to the drawing board to figure out how to collect data in a good and orderly way. Using third-party tools means handing the problem to someone else, but even then it is good to know a fair amount about how they work. For example, websites with a large proportion of iPhone users could probably see a before-and-after effect when iOS 9 was released, as more people then started blocking tools such as Google Analytics. Say that your intended target group are likely users of content blocking – what does that tell you about your data collection?

Furthermore, the sample must be large enough for you to be able to draw conclusions. The margin of error is enormously larger if you have a handful of users in your sample compared to if you have hundreds of thousands. If you have a handful of users and a majority of this small number encounter a certain problem, it is not really about statistics. But even anecdotal evidence can be good enough to act on.

Image 19: Confidence interval shown with a red line.

Compare with flipping a coin. With a perfect coin, the chance of getting heads is 50% and the chance of getting tails is 50%. If you flip ten times, the share of heads is likely to deviate more from 50% than if you flip twenty-five or a thousand times. One way to be open and honest about your uncertainty is called a confidence interval: a mathematical estimate of how often you will be wrong, based on the available data. With a 95% confidence level, you expect to be wrong about one time in twenty. To visualise this in reports, it is common to draw a red line on bars to indicate the range of uncertainty. That makes two bars easier to compare: when their red lines overlap, you may not have evidence to distinguish between them.
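The coin example can be made concrete: the expected deviation of the observed share of heads, the standard error, shrinks with the square root of the number of flips. A small sketch:

```python
import math

def coin_standard_error(n_flips):
    """Standard error of the observed share of heads for a fair coin:
    sqrt(p * (1 - p) / n) with p = 0.5."""
    return math.sqrt(0.5 * 0.5 / n_flips)

for n in (10, 25, 1000):
    print(n, round(coin_standard_error(n), 3))
    # roughly 0.158, 0.1 and 0.016 respectively
```

Ten flips can easily land at 30% or 70% heads; a thousand flips will very rarely stray more than a few percentage points from 50%.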

Correlation says nothing about causation!
Image 20: Butter sales and divorces in the state of Maine show a clear correlation.

Just because two graphs follow each other over time does not mean they have a common cause. One of my favourites in this regard is the creative religion with the god “the Flying Spaghetti Monster”, in which pirates are celebrated. There is namely a correlation between the decreasing number of pirates and climate change. Therefore, one should celebrate “Talk Like a Pirate Day” to reduce climate change.

Then there are weak causal relationships, such as that between ice cream sales and the weather.

What a statistician calls statistical significance may be worth reflecting on. It is about one value needing to deviate sufficiently from another for you to be able to say that the difference is not due to chance. Say you run an A/B test comparing the click-through rate of two design alternatives. If alternative A has a 23% click-through rate and alternative B has 25%, it is admittedly true that alternative B performed two percentage points better in the test. But to declare alternative B the winner, we need to know how many users were shown alternative A versus B (the sample size). The short version of hypothesis testing is that the more people who took part in each alternative, the less risky it is to declare alternative B the winner.

A big risk factor here is whether the assignment of users to alternatives A and B was actually random. Otherwise, the results can be skewed by more than the meagre two percentage points we have as “evidence” that alternative B is better. Make sure your A/B tests do not become some statistician's horror example of the future. Act when you have a fairly large sample and a large difference between the alternatives' performance (or hire a statistician when you are uncertain).
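To see how sample size decides the matter, a standard two-proportion z-test can be sketched with made-up counts matching the 23% versus 25% example (z values above roughly 1.96 are conventionally treated as significant at the 95% level):

```python
import math

def ab_z_score(conv_a, n_a, conv_b, n_b):
    """z statistic for the difference between two conversion rates
    (pooled two-proportion z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p = (conv_a + conv_b) / (n_a + n_b)            # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# The same 23% vs 25% click-through rates, on two very different sample sizes:
print(ab_z_score(46, 200, 50, 200))          # small sample: not convincing
print(ab_z_score(4600, 20000, 5000, 20000))  # large sample: convincing
```

With 200 users per alternative, the two-point difference is well within the noise; with 20,000 per alternative, the same difference is strong evidence for alternative B.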

It can be wise to have a statistician to consult, or not to read too much into data you are not comfortable with.

Summary

The benefit of each restart of the analysis work is that you get the chance to look critically at your business goals to see if you really still think they are worth having. You will probably see them with slightly new eyes over time and they may need to be improved, supplemented or clarified.

Now follows a section about documenting your performance budget, that is, what quality level the website should live up to and how to make design work more standardised and possible to follow up. Part of that work includes deciding on a long list of hygiene factors, making design a bit more engineering-like and the importance of documenting everything.

Continue reading – Part 2: An analytical perspective on design, performance and content