What you need to know about Web Analytics

by Ben Lorica (last updated Aug/2011)

The goal of this document is to provide a quick, up-to-date introduction to web analytics, loosely defined as measuring and analyzing user behavior on your site in order to increase revenue and lower (marketing) costs while improving customer satisfaction. A data-driven web strategy starts with embracing some of the best practices discovered and refined by web analytics practitioners. Implementation details vary depending on what web analytics tools you use. I'm most familiar with Google Analytics (henceforth referred to as GA), so many of the examples will be discussed from the GA perspective.

The key ideas cut across what tools you use and what type of web sites you operate. In order to understand what's happening on your web pages, you need to implement systems & develop a culture steeped in measurement, analytics, and experimentation. Because all you have to do these days is insert (JavaScript) tracking code on your individual pages, you should at a minimum be capturing baseline data that's already available in your log files. In practice, there are lots of subtle details that take place under the covers, and one of my goals is to point out some common pitfalls.

My plan is to try to keep this document updated, so if you have any suggestions or comments, feel free to contact me. If you're interested in learning more about the subjects discussed below, follow the hyperlinks to the original sources.

  • The Metrics
  • Campaign and Conversion Analysis
  • Best Practices

    Metrics

    "Without data, you are just another person with an opinion ..." (Andreas Schleicher of the OECD Directorate for Education)

    STANDARD METRICS: Any discussion of web analytics begins with a description of available data. If the goal is to understand how users came to your site, and what they do once they land, then you'll need to know what relevant data can be captured. When a user is on one of the pages on your site, web analytics tools typically record the information listed below. [Web analytics tools can identify (and sessionize) users who navigate using multiple tabs on the same browser.]

  • the page the user is viewing
  • the current date/time
  • the IP address (identity) of the user
  • the geographic location of the user
  • the browser/device of the user
  • origin of the user: could be a referring web site or through (an organic or paid) search or direct (user typed in URL)
  • data from cookies: number of times the user has been to this site

    Based on the data described above, it's easy to envision reports for sources, volume, geo-location, and seasonality of your traffic. Among web analysts there is a set of metrics that have become standard:

  • Unique Visitors, Page Views per Visitor, Pages viewed until "conversion": Make sure you understand how your web analytics tool defines simple metrics like clicks, visits, and visitors, as well as page views and unique views. Most web analytics tools define a visit as an individual session (a session is usually reset after 30 minutes of inactivity, but details vary by vendor). Because users of web analytics tools create and run many reports on different slices of their traffic, "discrepancies" will sometimes arise, many of which can be "resolved" if you understand how your web analytics tool processes and aggregates your data. (As an example, an important concept to understand in GA is the difference between filters and advanced segments.)

  • Top exit pages

  • Bounce Rate: Usually the percentage of users who see only one page (or only a few pages) on your site. A more sophisticated definition would count hits as opposed to page views.

  • Page/Site Duration: When a user visits only one page on your site and exits, most web analytics tools set the time on page/site to be zero. (The workaround in GA is to use event tracking.) Computing the duration spent on a page/site when a user employs multiple tabs is even trickier. GA uses a method called linearization -- meaning GA attempts to create one session across the multiple tabs.

  • Conversion/ROI analysis (see below)
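
    To make the definitions above concrete, here is a minimal sketch of how visits, unique visitors, and bounce rate could be derived from raw hits. It is not how GA or any particular vendor computes them: the 30-minute cutoff, the (visitor_id, timestamp, page) hit format, and the function names are all assumptions made for illustration, and a bounce is taken to be a visit with exactly one page view.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity cutoff; the real value varies by vendor.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(hits):
    """Group (visitor_id, timestamp, page) hits into visits.

    A new visit starts whenever a visitor has been inactive for longer
    than SESSION_TIMEOUT. Returns a list of (visitor_id, [pages]) tuples.
    """
    by_visitor = defaultdict(list)
    for visitor_id, ts, page in hits:
        by_visitor[visitor_id].append((ts, page))

    visits = []
    for visitor_id, events in by_visitor.items():
        events.sort()  # chronological order per visitor
        current = []
        last_ts = None
        for ts, page in events:
            if last_ts is not None and ts - last_ts > SESSION_TIMEOUT:
                visits.append((visitor_id, current))
                current = []
            current.append(page)
            last_ts = ts
        visits.append((visitor_id, current))
    return visits


def summarize(visits):
    """Compute a few standard metrics from a non-empty list of visits."""
    unique_visitors = len({visitor_id for visitor_id, _ in visits})
    page_views = sum(len(pages) for _, pages in visits)
    bounces = sum(1 for _, pages in visits if len(pages) == 1)
    return {
        "visits": len(visits),
        "unique_visitors": unique_visitors,
        "page_views_per_visitor": page_views / unique_visitors,
        "bounce_rate": bounces / len(visits),
    }


# Hypothetical sample data: visitor "a" returns after a two-hour gap.
hits = [
    ("a", datetime(2011, 8, 1, 9, 0), "/home"),
    ("a", datetime(2011, 8, 1, 9, 5), "/pricing"),
    ("a", datetime(2011, 8, 1, 11, 0), "/home"),
    ("b", datetime(2011, 8, 1, 10, 0), "/blog"),
]
report = summarize(sessionize(hits))
```

    With this sample, visitor "a" produces two visits and visitor "b" one, so visits (3) and unique visitors (2) differ. This is exactly the kind of "discrepancy" that is resolved once you know how your tool defines each metric.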

    REFINEMENTS: Most web analytics vendors go beyond the standard metrics listed above. Many have added (and continue to add) features that reflect the changing nature of online behavior:

  • Filters: These are business rules that you define ahead of time, to make your web analytics reports more accurate and understandable. The most common example is to eliminate certain IP addresses (e.g., traffic from your own employees) from your counts. Within GA, many users create filters to standardize names & (referring) keywords, and lowercase any query-string parameters.

  • Segmentation: Most web analytics tools let you isolate and compare different segments of traffic to your site, using combinations of the standard metrics.

  • Behavioral Targeting & Reporting: At times you may want to have the flexibility to target users based on their recent or long-term behavior on your site. For more, see the section on Campaign and Conversion Analysis.

  • Event Tracking: Tracking what your users do on your site is a prerequisite to behavioral targeting. Within GA, Event Tracking lets you record (& categorize for subsequent reports) user interactions with elements of your website, from routine file downloads, to embedded AJAX or Flash-driven items.

  • Real-time numbers: As more web content gets generated in "real-time", low-latency business intelligence has become increasingly important. Web analytics has traditionally operated on a weekly to daily turnaround. GA doesn't process data in real-time, and it's best to avoid making decisions based on the intra-day numbers it reports. Adobe's Omniture product claims to provide real-time data processing and reports. A startup called ChartBeat claims it can tell you how many users are on your pages "right now".

  • Site Speed: Site owners are finally recognizing the impact of site speed on revenue and, in response, web analytics providers have added performance tracking. As an example, site speed tracking is easy to enable in GA. The web analytics providers are also conscious of their own impact on site speed: GA released asynchronous tracking in Dec/2009.

  • In-page analytics: Some tools allow you to visualize user interactions down to individual elements within a page (e.g., GA and Open Web Analytics). Technically speaking, the same data is available using traditional reports, but viewing the numbers visually on your pages makes for improved comprehension and decision-making. The main problem with these visual presentation formats is that many don't archive your older pages: web pages change over time, and some UI elements highlighted by such tools may no longer be relevant.

  • Social Engagement: Many web pages now have a number of social sharing widgets/buttons, such as Twitter's "tweet this", Google's +1, and Facebook's "like". Web analytics providers have started providing reports that measure how many users use those widgets to share one of your pages. (Here is how you do Social Plugin tracking in GA.)

  • Mobile Devices: Startups like PercentMobile more accurately track users who visit your site from mobile devices.

  • Reports based on random samples: When you start attracting lots of traffic, your web analytics tool will start to use random samples to quickly generate some of your (long-term) historical reports. Proper statistical sampling should generate reports of sufficient accuracy, in a reasonable amount of time. Consult your web analytics vendor to find out its approach to sampling: GA has the following explanation for how/when it does sampling.

  • Reports that span multiple sessions: The idea is to measure and compare segments and metrics that aren't limited to a single session. Examples include (1) number of sessions (from first interaction) until "conversion", and (2) number of days from first session until "conversion". (GA supports this via visitor-level custom variables. Within GA there are built-in ecommerce reports for both time & visits to transactions.)

  • Privacy and Opt outs: Web analytics vendors should not only publish their privacy policy, they should also let users opt out of being tracked: GA has a browser plug-in for users who want to opt out. Presently the number of users who opt out is quite small, and unlikely to skew your reports.

  • Cross-domain tracking: Sometimes users interact with you across multiple domains. Consider, for example, sites that use a third-party shopping cart, or companies that have multiple web properties. As mentioned in the above introduction, GA and many other tools support first-party but not third-party cookies, and third-party cookies are what you need in this particular instance. In the case of GA, there are cases when you can track users across multiple domains (both domains use GA AND accept query-string parameters), but the workaround involves some tricky JavaScript coding (see for example Chapter 10 of O'Reilly's GA book).
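
    The filters described in this list are business rules applied before reporting. As a rough sketch of the idea (not GA's filter engine), the snippet below drops hits from internal IP addresses and lowercases query-string parameters. The network range, the (ip, url) hit format, and the function names are assumptions chosen for illustration.

```python
import ipaddress
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical "employee" range (a documentation-only address block).
INTERNAL_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_internal(ip):
    """Return True if the IP address falls inside an excluded network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in INTERNAL_NETWORKS)


def normalize_url(url):
    """Lowercase query-string parameter names and values."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = [(key.lower(), value.lower()) for key, value in pairs]
    return urlunsplit(parts._replace(query=urlencode(query)))


def apply_filters(hits):
    """Drop internal traffic, then standardize the remaining URLs."""
    return [(ip, normalize_url(url)) for ip, url in hits if not is_internal(ip)]


hits = [
    ("192.0.2.17", "http://www.example.com/?Keyword=Shoes"),
    ("198.51.100.4", "http://www.example.com/search?Keyword=Running+Shoes"),
]
filtered = apply_filters(hits)
```

    Only the second hit survives, and its keyword is reported as "running shoes" regardless of how the user capitalized it, which keeps "Shoes" and "shoes" from showing up as separate rows in a report.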

    SITE ARCHITECTURE: The last item above (cross-domain tracking) is just one of several considerations involving site architecture that may affect or skew the numbers generated by your reports. Here are a few more:

  • Frames and iFrames: You need to check what your web analytics tool recommends, but GA recommends you install tracking codes on frames and iFrames.

  • Query-string parameters: GA and other tools use query-strings in URLs to keep track of marketing campaigns and other initiatives. To leverage those features, you'll need to be able to append query-strings to your URLs.

  • Content that may not generate page views: GA has a mechanism for tracking Flash and Ajax.

  • Redirects: As in the case of GA, many tools recommend you place tracking codes on the redirecting page as well as on the landing page.

  • Cross-domain tracking: see the Refinements discussion above.

  • Tracking multiple sub-domains: You need to check how your web analytics tool distinguishes between sub-domains: e.g., www.example.com and blog.example.com. See here for the GA workaround.

    Campaign and Conversion Analysis

    When you get comfortable generating and consuming simple usage reports, the next step is to track slightly more complex metrics. I'm referring to actionable metrics that measure success (usually the completion of some task, such as a transaction or signing up for a newsletter). Since web analytics tools capture detailed activities of individual users, you can measure factors that lead users to complete or abandon a specific transaction. Another important instrument in campaign analysis is the URL: by tagging your URLs, your web analytics tool can generate reports that business users can easily consume. Most tracking URL builders create query-strings that include parameters for the campaign name, medium (email, cpc, banner), source (referrer), and content (to differentiate between multiple ads). But note that link tagging goes beyond the analysis of campaigns. Depending on how you set up your query strings, you can use GA & other web analytics tools to track (session) events such as interactions with specific UI elements (e.g., a travel site might want to track people who interacted with their calendar widget). If you tag your URLs ahead of time, GA will generate reports where you can slice & dice across those dimensions.

    Here are some considerations when tagging your links:

  • Users of both GA and Google AdWords should take advantage of AdWords autotagging, which lets you distinguish between organic and paid Google search traffic. In addition, taking the extra step of linking the two accounts will allow you to use AdWords cost data to calculate campaign ROI. As of the end of August/2011, you can link multiple AdWords accounts to the same GA account.

  • To make the resulting reports more readable, query-string parameter values should be short & consistent.

  • Use link tagging even for offline campaigns: If your URL is displayed on a poster, it goes without saying that you want to keep it short and simple. You can either use a custom URL shortener to compress URLs laden with query strings, or stick with a catchy and memorable URL. In the latter case you'll have to embed (campaign URL) tags on your landing pages (see for example Chapter 9 of O'Reilly's GA book).

  • Social Media & URL Shorteners: If you plan to share a URL on Twitter and other microblogs, it's a good idea to tag the original URL with tracking parameters before shortening it. Otherwise, your web analytics tool may not be able to provide accurate referrer information. Depending on the web analytics tool you're using, there might even be a URL shortener that automatically adds tracking parameters on the fly.

  • Use link tagging to distinguish multiple links on one page that point to the same destination page on your site: Without tags, web analytics tools are unable to tell apart distinct links on (say) your homepage that lead to the same page. Tagging such links can help you optimize page layouts.
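
    As an illustration of the tagging advice above, here is a small sketch of a tracking-URL builder. It uses GA's utm_source, utm_medium, utm_campaign, and utm_content campaign parameters; the function name, the sample URL, and the normalization rule (lowercase, spaces replaced by hyphens) are assumptions of mine rather than anything GA requires.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_url(url, source, medium, campaign, content=None):
    """Append GA-style campaign parameters to a URL.

    Values are lowercased and spaces become hyphens so that reports
    stay short and consistent (an assumed house convention).
    """
    params = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    }
    if content:
        params["utm_content"] = content
    params = {
        key: value.strip().lower().replace(" ", "-")
        for key, value in params.items()
    }

    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))  # keep any existing parameters
    query.update(params)
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical email campaign link; tag it BEFORE passing it to a URL shortener.
tagged = tag_url(
    "http://www.example.com/landing?ref=1",
    source="newsletter",
    medium="Email",
    campaign="Fall Sale",
    content="top-banner",
)
```

    Giving two links on the same page different utm_content values (for example "top-banner" and "footer-link") is one way to tell them apart in reports even though they lead to the same destination.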

    The most common application of link tagging is tracking overall campaign effectiveness. Below are some techniques & tools frequently used for the detailed analysis of conversions:

  • Goals and Funnels: A Goal is the task you want users to complete, while a funnel is the sequence of tasks/pages that leads them towards the Goal. Goals and Funnels are supported and easy to set up within GA. If you're interested in analyzing campaigns that have already ended, a startup called PadiTrack lets you perform Goal/Funnel analysis on your historical GA data. Another interesting feature of PadiTrack is that it lets you perform analysis across sessions (so-called pan session analysis).

  • Structured path analysis refers to the detailed analysis of the steps during a visitor session (funnels are a special, more structured example). I agree with many experts who think that structured path analysis is generally something that should be skipped. The visualization and analytic tools just don't support the use of resources on this type of analysis. The exception is if you have a highly structured funnel of pages.

  • An alternative to path analysis is to create simple reports that identify influential pages: Here is an example of a simple GA report with 3 columns: list of pages, page views, and completions.

  • Last-click/sales attribution: Evaluating campaigns usually requires assigning conversions to the right ads, landing pages, etc. If a user comes to your site because of Campaign A but doesn't "convert", and returns later because of Campaign B and subsequently "converts", which campaign gets credit for the conversion? By default GA awards it to Campaign B, but allows for a simple coding change in cases when you want to give credit to Campaign A. In general, assigning credit for conversions can get tricky, and the correct answer really depends on the circumstances. The situation gets even more difficult when the user crosses domains, as GA and other tools only support first-party cookies.

  • Transaction and Revenue Calculations: Many tools also let you record and set goals based on transactions & revenue, which you should definitely take advantage of. These numbers will be viewed with interest throughout an organization, so make sure you understand how your web analytics tool calculates revenue. Expect some discrepancy between the (absolute) numbers from your backend systems and the numbers produced by your web analytics solution. Web analytics tools rely on tracking code located on your order completion page, and for a tiny fraction of users this code may not properly load. The discrepancies should be small enough not to affect overall trends.

  • Calculating Conversion Rates: Similarly, make sure you understand how your web analytics tool computes conversion rates. Because it's defined to be (number of transactions) divided by (number of visits), in GA the e-commerce conversion rate can exceed 100%!

  • Behavioral Targeting: At times you may want to have the flexibility to target users based on their previous behavior on your site. A concept coined by online advertisers, Behavioral Targeting refers to the use of identifiers (cookies, IP addresses) and event tracking to create profiles of users. You can do this in Omniture, and within GA with the aid of third-party solution providers.

  • Landing pages: I listed some tips and best practices in my article on SEO and Search Marketing.

  • Tools for designing and testing landing pages: Web analytics tool providers also offer tools for optimizing landing pages (GA has a sister product called Website Optimizer). There are also a bunch of startups ready to help you with your landing pages. Examples include Unbounce, which hosts landing pages (if you want to speed up your dev/testing process); Optimizely, for simple A/B testing; Concept Feedback, which lets you solicit feedback from experienced web designers; and [x+1], which claims that its dynamically assembled landing pages increase conversion rates.
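
    The attribution question and the conversion-rate definition above can both be illustrated with a few lines of code. This is only a sketch under simplifying assumptions: visits arrive as chronological (visitor_id, campaign, converted) tuples, and "first" and "last" are my own labels for the two crediting rules discussed, not GA settings.

```python
from collections import Counter


def credit_conversions(visits, model="last"):
    """Count conversions per campaign under a simple attribution rule.

    visits: chronological (visitor_id, campaign, converted) tuples.
    model:  "last" credits the campaign of the converting visit,
            "first" credits the campaign that first brought the visitor.
    """
    if model not in ("first", "last"):
        raise ValueError("model must be 'first' or 'last'")
    first_campaign = {}
    credit = Counter()
    for visitor_id, campaign, converted in visits:
        first_campaign.setdefault(visitor_id, campaign)
        if converted:
            winner = campaign if model == "last" else first_campaign[visitor_id]
            credit[winner] += 1
    return credit


def conversion_rate(transactions, visits):
    """Transactions divided by visits, as a percentage."""
    return 100.0 * transactions / visits


# Hypothetical data: u1 arrives via Campaign A, converts later via Campaign B.
visits = [
    ("u1", "Campaign A", False),
    ("u1", "Campaign B", True),
    ("u2", "Campaign B", True),
]
last_click = credit_conversions(visits, model="last")
first_click = credit_conversions(visits, model="first")

# Two visits that produced three transactions give a rate above 100%.
rate = conversion_rate(transactions=3, visits=2)
```

    Under the last-click rule Campaign B gets both conversions; under the first-click rule Campaign A and Campaign B get one each. The same data supports different conclusions, which is why the right rule depends on the circumstances.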

    Direct marketing using email almost always points recipients to pages on your site. As a channel, email can be tricky to manage, and marketers need to ensure email messages point to pages that spur "conversion". Here are a few web analytics metrics that one can use for email campaigns; they can also be tailored for other types of marketing activities (such as search marketing).

  • Effectiveness of your Landing Pages: Bounce Rate = (# of email campaign visits with a single Page View) / (# of email campaign visits)

  • Engagement with your Brand: Depth of Visit = percent of email campaign visits that last longer than N pages

  • Favorable outcomes: Actions Completed = percent of email campaign visits that (watched a video or filled out a form or purchased or ...)

  • Assign Monetary value to Favorable outcomes: Average Economic Value per Email Sent = (total economic value) / (# of emails sent)
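
    The four formulas above translate directly into code. The sketch below assumes each email-campaign visit is a dict with hypothetical keys "page_views", "completed_action", and "value"; the threshold N and the sample numbers are made up for illustration.

```python
def email_campaign_metrics(visits, emails_sent, depth_threshold):
    """Compute the four email-campaign metrics listed above.

    visits: list of dicts with keys "page_views" (int),
            "completed_action" (bool), and "value" (float).
    depth_threshold: the N in "visits that last longer than N pages".
    """
    if not visits or emails_sent <= 0:
        raise ValueError("need at least one visit and one email sent")
    n = len(visits)
    return {
        "bounce_rate": sum(v["page_views"] == 1 for v in visits) / n,
        "depth_of_visit": sum(v["page_views"] > depth_threshold for v in visits) / n,
        "actions_completed": sum(v["completed_action"] for v in visits) / n,
        "value_per_email": sum(v["value"] for v in visits) / emails_sent,
    }


# Hypothetical campaign: 100 emails sent, 4 resulting visits.
visits = [
    {"page_views": 1, "completed_action": False, "value": 0.0},
    {"page_views": 5, "completed_action": True, "value": 40.0},
    {"page_views": 3, "completed_action": False, "value": 0.0},
    {"page_views": 1, "completed_action": False, "value": 0.0},
]
metrics = email_campaign_metrics(visits, emails_sent=100, depth_threshold=2)
```

    Note that the first three metrics are normalized by campaign visits while the last is normalized by emails sent, so a campaign with few click-throughs can look good on the first three and still have a low economic value per email.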

    Best Practices

    "To find out what happens when you change something, it is necessary to change it." (Box, Hunter, and Hunter (1978) )

    Here are some other useful tips from Avinash Kaushik and other web analytics experts:

  • Start by looking at the sources of traffic: What you want as much as possible is a balanced mix, which for a typical site would be 40-50% from Search, 20% Direct, 20-30% referrals, and 10% campaigns. In the most recent GA newsletter, among the sites that enabled anonymous data sharing, the breakdown of traffic sources was 37% Direct, 19% Referral, and 28% from Search engines.

  • Next look at engagement: frequency of visits and recency of last visit; time duration of visits; days/visits until desired outcome (e.g., purchase); conversation rate (comments, social sharing)

  • Is money spent on marketing delivering results?: Examine data on previous marketing campaigns, specifically conversion rates and revenue, to identify what can be done more effectively and efficiently. Specifically, dig into your web analytics report to see if the cost of acquiring traffic can be reduced.

  • Focus on revenue and other key outcomes: Besides reducing costs, you need to understand how your site generates revenue (or in the case of a non-commerce site, meets its key outcomes), and identify ways you can grow it.

  • Identify the top landing pages and compare their bounce rates: It's imperative that you do this. Besides your home page, the top 20-30 entry pages are how users first experience your site. Surprisingly, many companies don't even bother to track bounce rates on their sites. (On most sites, the bounce rate is in the 40-60% range.)

  • Examine search keywords and keyword referrals: experiment with alternate visualization strategies (such as tag clouds) that may not be available in your web analytics tool.

  • Influential pages: Within GA, calculate the $Index value of pages (contribution to revenue).

  • Mine site search logs to gain an understanding of users' interests and intentions: GA lets you do this fairly easily, so you won't even have to go to the I.T. department (also see for example Chapter 5 of O'Reilly's GA book).

  • When allocating resources for web analytics, consider the 10/90 rule and overinvest in people:
    If you have a budget of $100 to make smart decisions about your websites ... invest $10 in tools and vendor implementation and spend $90 on Analysts with big brains.
  • Use link tagging to track Direct traffic: Direct visitors are ones who typed your URL into their browser (or used a bookmark). As such they represent users who might be existing customers, or those familiar with your brand, or those who are taking part in one of your offline campaigns. These are users who come mostly of their own accord, at a lower marketing cost, so you need to understand them as best you can.

  • To make your reports more useful and readable, use filters: See above discussion.

  • Given the choice between visitors and unique visitors, use the latter in your reports: In fact, ignore visitors and stick with unique visitors in all your reports. In GA, unique visitors is the number of distinct, persistent cookies. More importantly, it's available as a metric in any custom report you might want to write.

  • If you need to, don't hesitate to go beyond the default reports: Sometimes your data/reporting needs aren't served by what's currently available through the dashboard provided by your web analytics tool. Most of these tools have an API, many third-party integrated solutions, and certified consultants. In many cases, a careful reading and usage of the API does the trick.

  • Within GA, set up a profile filter for host names that you actually own.

  • Don't forget to track conversion rates from other channels: consider Ifbyphone and Mongoose Metrics for measuring phone conversions.
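
    The first tip above (look at your mix of traffic sources) is easy to check once you have a source label for each visit. The sketch below is a minimal illustration: the source labels and sample counts are hypothetical, and the labels in a real export will depend on your tool.

```python
from collections import Counter


def source_mix(visit_sources):
    """Return each traffic source's share of visits, in percent."""
    counts = Counter(visit_sources)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {source: round(100.0 * n / total, 1) for source, n in counts.items()}


# Hypothetical export: one source label per visit.
sources = ["search"] * 45 + ["direct"] * 20 + ["referral"] * 25 + ["campaign"] * 10
mix = source_mix(sources)
```

    Comparing the resulting percentages against the rough ranges quoted in the first tip is a quick way to spot over-reliance on a single channel.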

    NOTE: Reproduction & reuse allowed under Creative Commons Attribution.