When data science becomes a data audit

Giles Cottle, Mon 27 November 2017

We were engaged by a large vertically integrated media company to help them use data science across different divisions of the business. They wanted to optimize their marketing activity and to improve upon core KPIs like retention and customer engagement, and had recently launched several new products and initiatives, which they wanted to use data to assess and support.

However, we know from experience that until we've taken a look at the data available, we can't work out how to get the most significant benefit. Over the years we've established a streamlined data audit process to quickly understand how customers can optimize their data architecture to support these broader goals.

Reviewing how the business uses the data

Before we take a look at the data itself and the technology used to process the data, we look at how the business is using the data and what they aspire to achieve. Working with the business units, our data science consulting team documented the datasets they had, how data was used, and the range of existing reports. A good starting point is what business-as-usual reporting looks like today and how the business wants it to look.

We followed this with a workshop to work through the goals of each of the business units, and the data points needed to support them. The sessions were designed to surface data requirements that may not have been immediately apparent to the team. Our team also shared best practices from other companies, to inspire the business units.

It rapidly became clear that there were some gaps between what the teams wanted to do and the data they had.

The first problem area was competitive information. Without understanding their position relative to competitors, it was impossible to benchmark services and establish suitable competitive strategies. Although detailed competitive data was essential to the team, there was no means of capturing and accessing it.

Secondly, customers were not grouped by acquisition source (SEO, organic, Facebook, etc.) and there was no means of capturing historical subscription status. Although they had a full picture of what a subscriber was doing at any moment, there was no context. A subscriber who was a premium subscriber yesterday but not today was treated no differently from a subscriber who had only just joined.
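To illustrate the kind of history that was missing, here is a minimal Python sketch of a subscription-history record that keeps one row per status period instead of overwriting the current status. It is not the client's schema: the names StatusPeriod, record_status and was_ever, and all of the fields, are assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class StatusPeriod:
    """One row of a hypothetical subscription-history table."""
    subscriber_id: str
    status: str                      # e.g. "premium", "basic", "cancelled"
    acquisition_source: str          # e.g. "seo", "organic", "facebook"
    valid_from: date
    valid_to: Optional[date] = None  # None means the period is still open


def record_status(history: List[StatusPeriod], subscriber_id: str,
                  status: str, on: date, source: str = "unknown") -> None:
    """Close the subscriber's open period if the status changed, then open a new one."""
    current = next(
        (p for p in history
         if p.subscriber_id == subscriber_id and p.valid_to is None),
        None,
    )
    if current is not None:
        if current.status == status:
            return  # nothing changed, keep the open period as it is
        current.valid_to = on
        # The acquisition source is fixed at sign-up, so carry it forward.
        source = current.acquisition_source
    history.append(StatusPeriod(subscriber_id, status, source, on))


def was_ever(history: List[StatusPeriod], subscriber_id: str, status: str) -> bool:
    """True if the subscriber held the given status at any point in time."""
    return any(
        p.subscriber_id == subscriber_id and p.status == status for p in history
    )


# Illustrative data only.
history: List[StatusPeriod] = []
record_status(history, "s1", "premium", date(2017, 1, 1), source="facebook")
record_status(history, "s1", "basic", date(2017, 6, 1))
record_status(history, "s2", "basic", date(2017, 6, 1), source="seo")
```

With rows like these, a subscriber who downgraded from premium (s1) can be told apart from one who has only ever been on the basic tier (s2), and both can be grouped by acquisition source.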

Technical data audit

With a detailed knowledge of what the business wanted to achieve, and where some of the data gaps might be, we then started to review the business’ technical set-up. A kick-off session with the data engineering team gave us in-depth knowledge of the business’ data architecture and set-up.

We started with the database technologies used; the systems that feed into and out of the warehouse; the volume of data stored and the frequency of access; and the tools used for ETL, visualization, reporting, and data mining.

Our data science consulting team then reviewed every element of the data pipeline: the ingestion process, the cleansing process, the transformation process, data storage, the database structure, and all reports.

Comparing the processes against industry best practice and the experience of our team, it became clear that the client had problems with data quality. The data warehouse contained data without filtering for cleanliness, so quality and junk data were stored together with no way to tell them apart. There were also specific issues around content metadata, preventing the marketing team from carrying out granular content recommendations and segmented offers.
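One way to stop quality and junk data from being indistinguishable is to tag each row with the checks it fails at load time, rather than storing everything unmarked. The Python sketch below shows the idea under stated assumptions: the field names in REQUIRED_FIELDS and the two checks are hypothetical and are not the rules used in this audit.

```python
from datetime import datetime

# Hypothetical required fields for a viewing-event row.
REQUIRED_FIELDS = ("subscriber_id", "content_id", "event_time")


def quality_issues(row):
    """Return a list of human-readable problems found in one row."""
    issues = [f"missing {field}" for field in REQUIRED_FIELDS if not row.get(field)]
    timestamp = row.get("event_time")
    if timestamp:
        try:
            datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            issues.append("unparseable event_time")
    return issues


def tag_rows(rows):
    """Attach the issues and an is_clean flag to each row instead of dropping anything."""
    tagged = []
    for row in rows:
        issues = quality_issues(row)
        tagged.append({**row, "quality_issues": issues, "is_clean": not issues})
    return tagged
```

Keeping the flagged rows, rather than deleting them, means reports can filter on is_clean while the data engineering team can still see what is failing and why.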

We also found some technical "quick-wins" that could be implemented immediately. These included updating some of the sort keys for better performance, and providing the ability to create summary and aggregate tables in the database, to aid overall performance. We also showed how several critical reports could be re-written so that they run more efficiently and quickly.
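As a rough illustration of the summary-table idea, the Python sketch below uses the standard-library sqlite3 module as a stand-in for the client's warehouse, which is not named here. The table names, columns and sample rows are all invented for the example, and sort keys are not shown because SQLite has no equivalent.

```python
import sqlite3

# In-memory SQLite database standing in for the real warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE viewing_events (
        subscriber_id   TEXT,
        event_date      TEXT,
        minutes_watched INTEGER
    );

    -- Illustrative rows only.
    INSERT INTO viewing_events VALUES
        ('s1', '2017-11-01', 30),
        ('s1', '2017-11-01', 15),
        ('s2', '2017-11-01', 60),
        ('s1', '2017-11-02', 20);

    -- Pre-aggregate once so that reports read a small table
    -- instead of scanning every raw event.
    CREATE TABLE daily_viewing_summary AS
    SELECT event_date,
           COUNT(DISTINCT subscriber_id) AS active_subscribers,
           SUM(minutes_watched)          AS total_minutes
    FROM viewing_events
    GROUP BY event_date;
""")

summary_rows = conn.execute(
    "SELECT event_date, active_subscribers, total_minutes "
    "FROM daily_viewing_summary ORDER BY event_date"
).fetchall()
```

A report that previously grouped the raw events on every run can then select directly from daily_viewing_summary, with the summary rebuilt or appended to on whatever schedule the business needs.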

Bringing it all together

Before we undertook the project, the client believed that there were two main problems. They knew there were lots of internal initiatives that needed to use data, but no real cohesiveness regarding a data strategy. They also thought there was a misalignment between where the business wanted to go and what was collected and stored in the data warehouse.

Although these were broadly true, we found more specific and actionable items that provided immediate benefits and initiated longer-term projects in the other areas.

The client could immediately implement the database optimizations and re-write the slow-running reports. There is now much less of a delay in getting reports to C-level executives, and members of the BI team whose time was spent creating worksheets have been freed up to carry out more advanced analytics.

The historical information on subscribers is now incorporated into the data warehouse, and the team is able to target upsell offers to different subscribers based on how long they have used the service, whether they have dipped in and out of the service in the past, and so on.
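A simple way to picture this kind of targeting is a function that turns a subscriber's history into a segment label. The Python sketch below is purely illustrative: the segment names, the 365-day threshold and the upsell_segment function are assumptions made for this example, not the client's actual rules.

```python
from datetime import date

LONG_TENURE_DAYS = 365  # hypothetical threshold, not taken from the project


def upsell_segment(active_periods, today):
    """Pick a segment from a list of (start, end) subscription periods.

    An end of None means the subscriber is still subscribed.
    """
    tenure_days = sum(
        ((end or today) - start).days for start, end in active_periods
    )
    if len(active_periods) > 1:
        return "returning"    # has dipped in and out of the service before
    if tenure_days >= LONG_TENURE_DAYS:
        return "long_tenure"  # one continuous, long-running subscription
    return "new"


today = date(2017, 11, 27)
loyal = upsell_segment([(date(2016, 1, 1), None)], today)
recent = upsell_segment([(date(2017, 11, 1), None)], today)
back_again = upsell_segment(
    [(date(2016, 1, 1), date(2016, 6, 1)), (date(2017, 9, 1), None)], today
)
```

Each label can then be mapped to a different offer, which is only possible once the warehouse holds the subscription periods rather than just the current status.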

Customer loyalty and user retention have both improved as a direct result of the team being able to target offers accurately to different parts of its audience.

As a next stage, the client is looking to onboard competitive intelligence directly into their platforms and has commissioned the data cleansing and enhanced metadata projects to address the final points of the audit.

Dativa is a global consulting firm providing data consulting and engineering services to companies that want to build and implement strategies to put data to work. We work with primary data generators, businesses harvesting their own internal data, data-centric service providers, data brokers, agencies, media buyers and media sellers.

145 Marina Boulevard
San Rafael
California
94901

Registered in Delaware

Thames Tower
Station Road
Reading
RG1 1LX

Registered in England & Wales, number 10202531