Businesses across the country (and the globe) are fast becoming more proficient with capturing, storing and using data to enhance their decision making—and you can too.
In this guide, we'll teach you the fundamentals of data and how data science works, the best practice data gathering techniques to use, and how you can start using data to:
The world of data can be daunting, but with the right expertise to guide you it doesn't have to be out of reach.
Short on time? Download a copy of this guide to read offline!
Learn the difference between data and information in business, and discover what your enterprise needs to leverage your market research to its fullest.
Put simply, data is a number, picture, statement, etc that is unprocessed.
It might be a name, the content of an email, an address, a sales figure; anything of that nature. If you printed out a bundle of receipts from your latest transactions and observed the total value, you would have a set of data.
Every business generates data, whether it’s through the cash register, an email list, a loyalty card or even just casual conversations with customers.
It isn’t just numbers in a system. You can get both structured data (numbers), and unstructured data (images, audio, writing, conversations). It is quite similar to the difference between quantitative and qualitative data.
Data in its raw form, as you can imagine, isn’t particularly useful. Knowing that some customer somewhere at some point paid $10 for something only tells you one thing: how much they paid. Once you begin gathering other data points, such as who, when, where, why and for what, and expand that out to your whole customer base, you start getting something more helpful for your strategy.
You start getting information.
Information is processed data. It has meaning and context. It isn’t just a single data point, it’s a series of data points that have been melded into something that informs or allows you to act.
Take that $10 purchase from earlier. Once you know the specifics of that transaction, and other similar transactions, you might start noticing that one particular product is performing well at a specific time of day, or is being bought alongside other items. Each and every single one of those numbers, sales figures, timings, customer numbers; those are data points.
Once you group them, collect them, organise them, analyse them and manage them, you get information. From information, you can see trends or connections or unfulfilled niches. From information, you can start strategising. From information, you get business intelligence.
Computers need data. Humans need information. The question you need to ask of your business is whether it has the capability to both gather data and changing data into information both efficiently and accurately. It’s the difference between conducting research and generating actionable insights.
From data, information. From information, business intelligence. Every business generates data, large and small, and every business can take advantage of data—it is how you process data that makes the difference.
However, all too often, businesses are working from broken, inaccurate or unimportant data. The information you need will change entirely on your growth stage and what industry you are in, and as a result, the data that you focus on will be unique as well.
All of these questions affect the information that you need, and thus the data you need.
Moreover, the data you gather has to be both reliable and valid. Market research, i.e. data gathering, doesn’t stem from a single, short survey from a select group of customers. It is an in-depth, complex process, in which analysts separate the bad data from the good, the important from the unimportant; the wheat from the chaff, in other words.
Only by working with ‘good’ data can you hope to create ‘good’ information—the cornerstone of any solid business strategy.
Data builds information, and information builds strategic success. Without the first, you can’t have the second or the third. A good business is built upon great market research, which can gather and analyse all of the data that your company is currently gathering, separating the useful from the useless.
Get good information. Start with great data.
While data science is something of a buzz word, businesses and brands often aren’t aware of what data scientists actually do—or the insights they can offer. We’re here to rectify that.
From predicting which of your customers are most likely to leave to identifying your most valuable customer segments, all data science projects follow the same lifecycle.
Define the main and sub objectives of the analysis. This will steer the project going forward. Any context/business knowledge around the objectives is also collected.
Acquiring the relevant/required data is crucial. Typically, this is where a data science team will talk to you to understand and obtain the correct data for the project.
At this stage, the data cleaned and explored for general insights. This step gives data analysts a strong understanding of the intricacies of the data, which then guides model development.
Models are developed, tested and compared to optimise performance against the project’s objectives. In most cases (depending on the type of model used), deeper insights are gained during this stage.
What do we mean by models? By models we mean advanced algorithms that utilise data to identify and predict relationships between behaviours, attitudes, and outcomes. These can involve machine learning, a type of artificial intelligence that automatically adapts and learns depending on the task it has been set.
At this stage, analysts will evaluate the model’s results using historic interpolation and forecasting. The models are then integrated into the client’s ongoing operations and strategy in some form.
Model performance and other metrics are monitored through a dashboard. Because the models force population change, the performance of these models often decay over time, hence the need for model re-calibration.
Repeat steps 2 to 6 to re-calibrate as often as required.
Data is only useful if it is accurate, meaningfully structured, and accessible.
In most cases, the first hurdle companies face is getting their data to meet the above criteria. Many simply don’t know how to get it there.
Data cleaning and structuring is mandatory for any modelling work. Without client-side data in a workable format, most of the deliverables are unattainable. Cleaning and structuring is a big job (sometimes months or years in the making) and as such, is typically a project on its own.
This type of data science work is ideal for businesses who:
Dynamic dashboards customised to display subject-specific data.
Power BI Dashboarding is a good solution for clients wanting to get more meaning out of their transactional/system data.
Currently, we develop them to display cross-sectional and/or longitudinal survey data gathered from our brand trackers.
To see trends and changes over time requires regular data updates. With each update, the data has to be converted into a structured format (which the DS team performs). This means that the way the data is received, collected, and processed has to be repeatable.
This type of data science work is ideal for businesses who:
Uses statistical analysis to groups data points into similar populations to help clients tailor and target their marketing.
Cluster analysis can provide powerful psychographic insights into customer segments. Using attitudinal and/or behavioural segmentation data, we group a client’s customer population into like-minded subsets based on their demography and behaviour observed by the client.
This type of data science work is ideal for businesses that:
Assesses the likelihood of an event occurring.
By drawing on historic behavioural data, we can create a predictive model that can predict binary outcomes (i.e. an event happening or an event not happening). It can be used to assess whether a customer will churn, whether they will purchase and whether a customer will react to communication, which helps companies develop a highly targeted approach for engaging these customers. This type of modelling can also quantify the drivers of the event being predicted.
Like cluster analysis, this task also typically moves through the following four stages:
This type of data science work is ideal for businesses that:
A strain of econometric modelling that measures how effective advertising spend/activity is on a target variable, such as sales.
The result is an equation that receives inputs (advertising activity and other seasonal components) and estimates the target variable (e.g. sales).
This task is a more advanced modelling task and is generally slowed due to data acquisition because the client collects the data at a summarised level. Because data is only top-level, it usually means less cleaning is required.
This type of data science work is ideal for businesses that:
Data modelling falls into three broad categories:
1. Descriptive models.
Purpose: To understand the past, interpret and explain.
Measure: What happened/is currently happening?
2. Predictive models.
Purpose: To understand the future, predict accuracy and continue evolution.
Measure: What is to come given the current course?
3. Normative models.
Purpose: To answer a question posed through scenario prediction.
Measure: How can we achieve a specific outcome based on what we know?
Each category almost always relies on the previous category being completed first
(i.e. 1 before 2, 2 before 3). Descriptive models are a key component of a predictive modelling solution and predictive models can be utilised in a prescriptive way, making it a normative model.
How far we delve into this hierarchy depends on how far you are willing to go.
Excellent marketing starts with excellent data—clean, complete and up-to-date.
Many companies usually update their customer data over a period of time, and do so reactively rather than proactively. To be able to act fast on the insights derived from your data, this data needs to be complete and well maintained in the first place. This will help immensely with generating sales and ultimately driving more revenue for your business.
To achieve this, simply follow these four steps to quality control your data.
It sounds so straightforward and simple but many businesses get this wrong. Having the exact data you need fully completed means you can make the most of your customer database and use it to identify key trends and customer behaviour.
Complete data makes it easier to develop more targeted marketing strategies that narrow in on your target audience. This will ensure your sales process is more personalised and thereby more effective. For example, in the B2B sphere, knowing key demographic information such as company name, location, job position, industry and number of company staff allows you to be more targeted in your marketing efforts, so make sure these fields are always filled in by your customers.
By now we all know that communicating in a personal way with our existing customers means that we can generate more sales. These people have already bought something from you so they are more inclined to listen to (or read) what you have say.
It’s a great way to stay in touch and ensure they keep your brand top of mind whilst you are informing them about new products, offers and events. This way, you can be specific about the message you’re sending, increasing the chances of further sales.
The key to doing this the right way though is to have as much detailed data as possible about your audience and using it in the right way.
If you're in the B2B space, using info about their company size by staff and industry demonstrates to your customer that you know them as you send relevant info their way.
Having duplicate data of customers is an easy mistake to make but can cause headaches as it gives you a false view of your customers’ information and purchase behaviour. This can happen when you capture data from different points in your business.
It may even confuse revenue figures as it doesn’t give you a true picture of what they’ve bought. To properly clean a database from duplicates can take time and the best way to do this is to perform a de-dupe. To save you the time and hassle, you should have a data expert do this for you.
As digital as we’ve become these days, email contact just isn’t enough sometimes. This depends on your business model, so you will know if the best way to communicate with your key decision makers is simply to just call them.
Ensure that each of your customer contacts has a working phone number. Again—simple but effective.
If you’re new to collecting customer data, it’s not always clear what data you need to collect and what the rules around data collection are. Below are six common data collection questions we frequently encounter.
The answer depends on what you plan to use the data for.
For surveying purposes
If you’re looking to survey your customers or clients for customer satisfaction and feedback, you’ll want the basics, such as:
For CRM purposes
For customer data that will go in your CRM system for and sales and marketing purposes, you’ll also want to collect more complete data such as:
By having this data in your CRM systems, you can harness interaction data between your business and your customer/client to better understand your consumers’ needs and behaviours.
Having this data means you can communicate in a personal way, such as following up with customers on their order and delivering personalised product recommendations based on past purchases.
Incomplete information is a sign of poor data quality. The best method to get your sales reps to fill in the data properly is two-fold:
1. Make the important fields mandatory in your online forms.
You’ll want to have the following mandatory fields:
2. Engage your sales team in the development of these forms and integrate the accuracy and competition of these forms with the team’s KPIs.
Prevent duplicate records by comparing the email address of the contacts. These will be unique for each individual.
In many CRM systems today, you now have an option to check whether the newly added record already exists in the account.
For this scenario, instead of deleting one record and potentially losing important data from one that isn't present in the other, merge the two contacts into a single entity instead.
The biggest mistake when collecting data is to ask for too much at one time and overwhelming your customers. To good news is that persona data is not the be all and end all. While it is important, a business on the journey of collecting data should have a large focus on their organic data collection capabilities. This organic data stems from customers’ interactions with your platforms, such as your website, app, social media, email, online and in-store transaction software and loyalty programmes. This is the data that will provide you with the most long-term payback.
Important note: You should also consider the ethics of how you use your customers’ data. While it is perfectly legal to collect sensitive data if customers agree to it, you should consider how you're leveraging that information and whether it is appropriate.
For example, lower tier lenders often harness data to target lower income households for hire purchase loans. This is legally fine to do (it is their target market) but ethically speaking, giving those families access to high interest loans to buy expensive consumer goods is not helping them. Instead, it puts them under more financial stress.
In short: with great data comes great responsibility.
In New Zealand there are three main data protection and privacy laws to be aware of:
Important note: while New Zealand is obviously not part of the European Union or California, if your business is collecting data from citizens of these regions, these laws still apply to you, particularly if you have international customers.
We will go into more detail on these regulations below.
If you’re collecting customer data, you need to know the regulations around it.
GDPR is a regulation that requires businesses to protect the personal data and privacy of EU citizens for transactions that occur within EU member states.
Briefly, the GDPR gives European Union citizens the right to:
Even if you’re not in the European Union, the law still applies if you’re collecting customer data on EU citizen, for example, if they provide their name and email to sign up to your newsletter.
Much like the GDPR, if your business collects any customer data from Californian residents, then this law applies to you.
The key differences between the two laws center around:
Civil remedies for individuals—under the GDPR, an action can be brought for any violation of the law, while under the CCPA actions can only be brought for failure of security measures and in the context of data breaches.
In short, if you follow the GDPR, you’ll be compliant with most of the CCPA. However, it is critical that you are aware of and check your compliance in the sections where the two regulations differ. We recommend using this guide to ensure you are compliant with both laws.
The Privacy Act of 2020 came into effect on 1 December 2020 and was designed to replace the 1993 act of the same name. Intended to align New Zealand’s privacy laws with the likes of the GDPR, the 2020 Act applies to both local and overseas organisations that conduct business in New Zealand.
The key updates include:
You can read about these changes in more detail here.
Below are three data science scenarios where businesses use data science and data modelling to identify and connect with their most valuable customers.
Retailer X has over 1.5 million loyalty card members nationwide. However, an annual churn rate of approximately 60 per cent is causing concern, particularly with new competitors entering the market. Retailer X wants to look at ways to reduce their churn rate nationwide in a cost-effective manner. To do this, they supply 2 years’ worth of data on a subset of their loyalty members to a data science team.
Using the data available, a number of models are developed and tested to determine which is the most effective at identifying customers on the brink of churning. The winning model provides high-level insights on attribute effectiveness by way of churn indicators (such as how often they shopped in-store). This goes on to form the basis of a low-cost targeting campaign designed to improve yields of approximately five per cent per annum.
The final model and strategy proves to:
Bank X is currently facing the challenge of now knowing who their customers are or how they behave—and who they are losing or making money on. There is uncertainty about what needs to be done to target customer groups based on their needs and wants. They supply data scientists with four years of data for each of their customers along with product ownership data and some demographic information.
Segmentation is utilised to better understand the behaviours, profiles and profitability of Bank X’s customer groups, resulting in four segments being identified within the customer base of Bank X. The segment profiles provide Bank X with possible leverage points to inform their business and marketing strategy going forward.
The customer segmentation proves to:
With the segmentation setting the groundwork for target marketing, the sales and marketing departments of Bank X now want know what products should be promoted to particular customers. Using the same dataset as the customer segmentation analysis, a second analysis is done to steer product offering and also to shed more light on attrition issues Bank X is suffering from.
A number of models are developed to power cross-selling strategies and each customer is scored on their likelihood to accept each possible product offered to them. A churn model is also developed to minimise attrition while the product offering models help maximise customer potential.
The propensity models proves to:
Take your data science learning a step further with our free transactional data guide: Transform with Transactional Data.
 Sisense, State of BI & Analytics Report 2020: Special COVID-19 Edition, 2020.