Many people believe that data is data: that there aren’t any significant differences between the types of data used within a business, whether financial, sales, marketing, or anything else. You will hear this particularly from vendors of generic business intelligence and data warehousing solutions. They need to believe it, since their solutions aren’t tailored to any particular type of data.

The reality is that there are differences, significant ones, between the various types of data. Marketing data, in particular, stands apart. This is in large part due to the adoption of digital marketing and the fact that most marketing organizations now use a variety of third-party platforms and services, each of which generates its own set of data. No other functional area is so dependent on external data to run its business. This dependency gives marketing data a number of unique characteristics that create challenges when bringing the data together for measurement purposes:

Data is highly diverse: As noted above, most marketing organizations use a wide variety of third-party platforms and services, each of which generates its own set of data. Here is a list of some of the different types of data used within marketing…

  • Performance data
  • Activity data (clickstream, pixel / tag data, etc.)
  • Creative and ad assets
  • Customer data
  • Audience data
  • Social listening data
  • Survey data
  • Business impact data (sales, for example)

Data is fragmented, even within a single platform: Marketing data is inevitably fragmented because of the dependency on third-party platforms and services. What is surprising, and unfortunate, is that data can be fragmented even within a single platform. For example…

  • Facebook and Twitter each have different APIs to access data related to advertising activity versus non-advertising activity. So if an organization wants to analyze the combined activity, it needs to “join” the data itself.
  • Most large companies have multiple accounts within the same marketing platform (different AdWords accounts or different Facebook pages for different brands or regions), and there is no easy way to look at all of the data across the accounts.
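
To make the multi-account problem concrete, here is a minimal Python sketch of stitching per-account reports into one combined daily view. The account names, the row layout, and the `combine_accounts` helper are all hypothetical and do not reflect any platform's actual export format; the sketch also assumes every account reports the same metrics with the same definitions.

```python
from collections import defaultdict

def combine_accounts(reports):
    """Sum daily metrics across accounts.

    `reports` maps a (hypothetical) account name to a list of rows, where each
    row is a dict with 'date', 'impressions', and 'clicks' keys.
    """
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0})
    for rows in reports.values():
        for row in rows:
            day = totals[row["date"]]
            day["impressions"] += row["impressions"]
            day["clicks"] += row["clicks"]
    return dict(totals)

# Example with two made-up brand accounts on the same platform.
reports = {
    "brand_us": [{"date": "2015-03-01", "impressions": 100, "clicks": 5}],
    "brand_uk": [
        {"date": "2015-03-01", "impressions": 50, "clicks": 2},
        {"date": "2015-03-02", "impressions": 10, "clicks": 1},
    ],
}
combined = combine_accounts(reports)
```

In practice the hard part is usually not the summing but getting the rows into a common shape first, which is where the issues described next come in.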

Similar data is treated differently: Different third-party platforms and services use different terminology and definitions for the “same” data. A classic example of this is “time.” There is no consistent representation of time across the platforms. Platforms start their day at different times, so the definition of “today” varies. Also, some platforms store time based on GMT (Greenwich Mean Time) and some don’t. All of this makes it very difficult to combine time-based data from different platforms.
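
As a rough illustration of the time problem, the Python sketch below converts timestamps from two platforms to a single reference (UTC) before comparing them. The platform names and their offsets are invented for the example, and fixed offsets ignore daylight saving time; a real pipeline would more likely use named time zones (for instance via the standard `zoneinfo` module).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-platform UTC offsets, for illustration only.
PLATFORM_UTC_OFFSET_HOURS = {
    "platform_a": 0,   # reports in GMT/UTC
    "platform_b": -8,  # reports in a fixed UTC-8 account time zone
}

def to_utc(platform, local_timestamp):
    """Convert a naive 'YYYY-MM-DD HH:MM' timestamp from a platform to aware UTC."""
    offset = timezone(timedelta(hours=PLATFORM_UTC_OFFSET_HOURS[platform]))
    local = datetime.strptime(local_timestamp, "%Y-%m-%d %H:%M").replace(tzinfo=offset)
    return local.astimezone(timezone.utc)

# The same wall-clock reading lands on different UTC days.
a = to_utc("platform_a", "2015-03-01 23:30")
b = to_utc("platform_b", "2015-03-01 23:30")
```

Here `a` falls on March 1 in UTC while `b` falls on March 2, which is exactly the kind of disagreement about “today” that makes cross-platform daily totals hard to line up.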

Data sources change often: The providers of the third-party platforms and services change them fairly regularly. Keeping up with these changes is no easy task, particularly when data is accessed programmatically via APIs, where a change can break an existing integration.

Data sources can be unreliable: At times, the APIs of the third-party platforms and services go down and become inaccessible. Sometimes the data generated during the downtime gets filled in later, but sometimes it does not.
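
One simple defensive measure is to track which days have actually been received, so that gaps left by an outage can be requested again later. The Python sketch below shows only the gap detection step; the `missing_days` helper is hypothetical, and whether a re-request succeeds depends on whether the platform backfills the data at all.

```python
from datetime import date, timedelta

def missing_days(received_days, start, end):
    """Return the dates in [start, end] for which no data has been received."""
    have = set(received_days)
    days = []
    current = start
    while current <= end:
        if current not in have:
            days.append(current)
        current += timedelta(days=1)
    return days

# Example: data for March 3 and March 5 never arrived.
received = [date(2015, 3, 1), date(2015, 3, 2), date(2015, 3, 4)]
gaps = missing_days(received, date(2015, 3, 1), date(2015, 3, 5))
```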

Data is dimensionalized differently: Third-party platforms and services provide their users with many different ways to slice and dice data; the data is highly dimensionalized. For example, in Google Analytics, visits can be scrutinized by Audience dimensions (demographics, interests, geo, technology, etc.), Acquisition dimensions (AdWords, SEO, social, etc.), Behavior dimensions (site content, site speed, site search, etc.), and Conversions dimensions (goals, ecommerce, multi-channel funnels, and attribution). The problem is that not all platforms define their dimensions the same way. This creates problems when trying to analyze data across different sources.
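
A common first step is a per-platform lookup that renames each platform's dimensions to a shared vocabulary. The Python sketch below uses made-up platform and field names; it handles only naming differences, not the harder case where two platforms define a dimension differently.

```python
# Hypothetical platform and field names, for illustration only.
DIMENSION_MAP = {
    "platform_a": {"country": "country", "device_category": "device"},
    "platform_b": {"geo_country": "country", "device_type": "device"},
}

def normalize_row(platform, row):
    """Rename a row's platform-specific keys to the shared names."""
    mapping = DIMENSION_MAP[platform]
    # Keys without a mapping (metrics such as "visits") pass through unchanged.
    return {mapping.get(key, key): value for key, value in row.items()}

row = normalize_row(
    "platform_b", {"geo_country": "DE", "device_type": "mobile", "visits": 12}
)
```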

Data does not have a consistent hierarchical structure: Third-party platforms and services organize their data along different hierarchical levels. For example, the hierarchy within Google AdWords starts with an Account, which has Campaigns, which have Ad Groups, which have Ads and Keywords. As with the dimensions issue above, the hierarchical structure is not consistent across platforms, making measurement across different sources challenging.
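
One way to cope with differing hierarchies is to describe each platform's levels as data and write the roll-up logic once. The Python sketch below does this for the AdWords levels named above (simplified to end at the ad); the `roll_up` helper and the sample rows are hypothetical, and another platform would supply its own tuple of level names.

```python
from collections import defaultdict

# Levels taken from the AdWords example in the text, top to bottom.
ADWORDS_LEVELS = ("account", "campaign", "ad_group", "ad")

def roll_up(rows, levels, to_level):
    """Aggregate leaf-level click counts up to the requested level.

    Each row is (path, clicks), where path has one entry per level in `levels`.
    """
    depth = levels.index(to_level) + 1
    totals = defaultdict(int)
    for path, clicks in rows:
        totals[path[:depth]] += clicks
    return dict(totals)

rows = [
    (("acct", "c1", "g1", "ad1"), 3),
    (("acct", "c1", "g2", "ad2"), 4),
    (("acct", "c2", "g3", "ad3"), 5),
]
by_campaign = roll_up(rows, ADWORDS_LEVELS, "campaign")
by_account = roll_up(rows, ADWORDS_LEVELS, "account")
```

Keeping the hierarchy as a plain tuple means the same function can aggregate a platform with three levels or five, although comparing, say, a “campaign” on one platform with a similarly named level on another still requires a judgment about whether they mean the same thing.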

Data can be incomplete: To pinpoint the reason for something exceptional happening, it is important to have complete data at every level in the hierarchy, from the most detailed level up to the aggregated view at the top. However, this is easier said than done. Sometimes there are gaps in the data due to operational issues such as the API outages mentioned above. In addition, the third-party platform and service providers do not supply certain metrics, for whatever reason. For example, Facebook breaks down some metrics by country, but not all of them. If you want the remaining metrics broken down by country, you have to do it yourself. There are numerous examples like this among the various third-party platforms and services.

Data is very nuanced: It is critical to understand how each third-party platform and service handles its data. There are MANY not-so-obvious subtleties. Here is one good example… In YouTube, it is easy to get the total video views for a channel (a channel is often associated with a brand and has many videos in it); however, it is not so easy to get the daily video views for a channel. On the surface, it would seem easy to calculate the daily video views: just keep the total for each day and subtract the previous day’s total from the current day’s total, right? Wrong. If you use that approach, you will get certain days where the daily video views are negative. This is because the total does not include views of any videos that have been removed from the channel. So if a very popular video gets removed from a channel on a particular day, the total video views could go down significantly that day. It is still possible to figure out the daily video views, but that would take up an entire blog post 🙂 The important point here is that data from each third-party platform and service has these nuances. Developing the expertise to know the subtleties of each platform is an overwhelming task.
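
For readers who want a feel for the YouTube example, here is one possible way to avoid the negative days. It is not necessarily the approach hinted at above. The Python sketch assumes you can take a daily snapshot of cumulative views per video (not just per channel) and then sums the per-video changes, skipping any video that has disappeared. The function names and snapshot format are hypothetical.

```python
def naive_daily_views(yesterday, today):
    """Channel total today minus channel total yesterday (can go negative)."""
    return sum(today.values()) - sum(yesterday.values())

def daily_channel_views(yesterday, today):
    """Sum per-video view increases between two cumulative snapshots.

    `yesterday` and `today` map video id -> cumulative views. Videos missing
    from `today` (removed) are skipped instead of counting as a loss.
    """
    total = 0
    for video_id, views_today in today.items():
        views_before = yesterday.get(video_id, 0)  # new videos start from zero
        # Assumption: a counter that goes down is treated as zero new views.
        total += max(views_today - views_before, 0)
    return total

# Video "b" is removed and video "c" is added between the two snapshots.
yesterday = {"a": 1000, "b": 50000}
today = {"a": 1100, "c": 30}
naive = naive_daily_views(yesterday, today)
better = daily_channel_views(yesterday, today)
```

With these made-up numbers the naive calculation gives -49870 while the per-video version gives 130. The sketch still has blind spots: any views the removed video received on its final day are lost, and a video that first appears with existing views has all of them attributed to that day.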

Because of the issues raised above, considerable work is often required to turn data from the various third-party platforms and services into a high-quality, integrated set of marketing data. The process is very similar to taking crude oil from a variety of sources and turning it into something that has value, like high-octane gasoline. Data needs to be enriched, cleansed, and refined to make it useful. Going through such a process when data comes from so many different sources, each with its own nuances, is what sets marketing data apart from other types of business data.