I look at digital data all day and often take for granted how good I’ve got it. When running an analysis on the impact of a marketing campaign or an A/B test, there are hundreds of thousands of data points at my fingertips just waiting for me to extract meaning and take action.
The beauty of digital analytics is that you can actually look at how data is collected in real time.
Think about that for a second: if you’re doing analysis within a data warehouse you may have no idea, or visibility, into where that data is actually coming from or how it is collected. Digital analytics is different – I can actually go to a site, pull up a packet sniffer and watch as information is collected and sent off to tools we use every day like Google Analytics, Adobe’s Marketing Cloud or DoubleClick.
I firmly believe that having a fundamental understanding of how your data is collected will make you a better analyst – one of Cardinal Path’s core beliefs is “To understand data, you must engage in the world” and this is a perfect example of that coming to life. Problem is, there’s a lot going on behind the scenes to make the reports you see in the tools we use every day possible. I’m going to break it all down and hopefully shine a light on how it all works.
A note before we begin, I’m simplifying a lot below. The way Facebook collects data is different from Google Analytics which is different from Adobe but all of these tools are still relying on a similar methodology.
LET’S START WITH THE BROWSER
Tagging a website
Data can be collected from multiple sources, the most common places we grab data from are:
- The HTTP headers – for things like browser information, referrer information, etc.
- On page code – often times we collect data from the page itself, things like a friendly name for the page, the section it belongs to on the site, etc.
- Cookies – see below
- When I debug tags on a website I rarely use the network panel though, there’s too much work involved. There are a few great tools that I like to use to help me see these requests quickly – two of my favorites are WASP and ObservePoint.
COOKIES LET US TRACK ACROSS PAGESCookies are text files that lives within your browser. Often times they contain long alphanumeric strings that represent a visitor ID. The web is often referred to as a “stateless” place – meaning that the server you’re requesting information from doesn’t know anything about you before or after that request is made. What I mean by that is when you request a page to be loaded in your browser the web server will send you all the information to load that page, but it won’t know what you’re doing on that page or even if you’ve navigated away (unless we explicitly tell it so).Cookies are the bridge across pages as well as sessions. They allow us to tell a web server that the same browser that requested page A also requested page B. This is useful for keeping you logged into a website, or storing items within a shopping cart. Cookies also allow analytics tools to track what a visitor is doing across pages and sessions. You can view your own cookies in Chrome quite easily, just go here: chrome://settings/cookies and you can see their contents as well as expiration dates and what domains they’ve been set on.
FIRST VS. THIRD PARTY COOKIESAs mentioned above, all cookies are just text files that can be written and read by a web server or the code being executed on a website. When a cookie is written to your browser it is assigned a domain that can read the cookie. This is where we get the term first party and third party cookies from.A first party cookie is written to the domain that you’re currently on, so in the case of www.cardinalpath.com first party cookies would be written to the domain of cardinalpath.com. A third party cookie is written to a domain different than the one you are currently on.
- The domain of the cookie becomes important because it determines who can read the contents of the cookie. In the case of CNN, any cookies written to the domain of CNN.com are considered to be first party cookies – because the cookie domain matches the domain of the website the user is on. Where as a cookie written to advertising.com is considered a third party cookie because that domain is different from the current site the user is on. Third party cookies are useful to advertisers because their content can be read across domains, meaning whenever advertising.com code is loaded on any other website the user visits that code can read the value of that cookie.It’s important to note that cookies cannot be read across domains, meaning code from CNN.com cannot read the values of the advertising.com cookies and the advartising.com cookies can’t read the CNN.com ones.
- Virtual Internship