Building profitable customer relationships with data mining
Source: www.spss.com
Copyright SPSS, Inc. 2004
You've built your customer information and marketing data warehouse. Now how do
you make good use of the data it contains?
Customer relationship management (CRM) helps companies improve the profitability of their
interactions with customers while at the same time making those interactions appear
friendlier through individualization. To succeed with CRM, companies need to match
products and campaigns to prospects and customers; in other words, to intelligently
manage the customer life cycle.
Until recently, most CRM software focused on simplifying the organization and management
of customer information. Such software, called operational CRM, focuses on creating a
customer database that presents a consistent picture of the customer's relationship
with the company and providing that information in specific applications. These include
sales force automation and customer service applications, in which the company
"touches" the customer.
However, the sheer volume of customer information and increasingly complex interactions
with customers have propelled data mining to the forefront of making customer
relationships profitable. Data mining is a process that uses a variety of data analysis
and modeling techniques to discover patterns and relationships in data that are used to
understand what your customers want and predict what they will do. Data mining can help
you select the right prospects on whom to focus, offer the right additional products to
your existing customers and identify good customers who may be about to leave. This
results in improved revenue because of a greatly improved ability to respond to each
individual contact in the best way and reduced costs due to properly allocated resources.
CRM applications that use data mining are called analytic CRM.
Data mining
The first and simplest analytical step in data mining is to describe the data. For
example, you can summarize the data's statistical attributes (such as means and standard
deviations), visually review data using charts and graphs and look at the distribution of
field values in your data.
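As a rough illustration of this descriptive step, the sketch below summarizes a handful of customer records using only the Python standard library. The records, field names and values are invented for the example and do not come from any real database.

```python
import statistics
from collections import Counter

# Hypothetical customer records; field names and values are illustrative only.
customers = [
    {"age": 34, "region": "north", "annual_spend": 1200.0},
    {"age": 51, "region": "south", "annual_spend": 300.0},
    {"age": 27, "region": "north", "annual_spend": 950.0},
    {"age": 45, "region": "west", "annual_spend": 2100.0},
]

spend = [c["annual_spend"] for c in customers]
mean_spend = statistics.mean(spend)
stdev_spend = statistics.stdev(spend)  # sample standard deviation

# Distribution of a categorical field: how many customers fall in each region.
region_counts = Counter(c["region"] for c in customers)

print(f"mean spend: {mean_spend:.2f}, std dev: {stdev_spend:.2f}")
print("region distribution:", dict(region_counts))
```

In practice you would run summaries like these over every field of interest, and pair them with charts, before deciding which fields deserve a closer look.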
But data description alone cannot provide an action plan. You must build a predictive
model based on patterns determined from known results and then test that model on results
outside the original sample. A good model should never be confused with reality (you know
a road map isn't a perfect representation of the actual road), but it can be a useful
guide to understanding your business.
Data mining can be used for both classification and regression problems. In classification
problems you're predicting what category something falls into: for example,
whether or not a person is a good credit risk or which of several offers someone is most
likely to accept. In regression problems, you're predicting a number, such as the
probability that a person will respond to an offer.
In CRM, data mining is frequently used to assign a score to a particular customer or
prospect indicating the likelihood that the individual behaves the way you want. For
example, a score could measure the propensity to respond to a particular offer or to
switch to a competitor's product. It is also frequently used to identify a set of
characteristics (called a profile) that segments customers into groups with similar
behaviors, such as buying a particular product.
A special type of classification can recommend items based on similar interests held by
groups of customers. This is sometimes called collaborative filtering.
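To make the idea concrete, here is a minimal sketch of one simple form of collaborative filtering: recommending items bought by customers whose purchase histories overlap with the target customer's. The customer names, the items and the choice of Jaccard similarity are all assumptions made for illustration; real systems use far more data and more refined similarity measures.

```python
# Hypothetical purchase histories; names and items are illustrative only.
purchases = {
    "alice": {"printer", "ink", "paper"},
    "bob": {"printer", "ink"},
    "carol": {"laptop", "mouse"},
}

def jaccard(a, b):
    """Share of items two customers have in common (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(target, purchases):
    """Rank items the target has not bought, weighted by similarity to other buyers."""
    own = purchases[target]
    scores = {}
    for other, items in purchases.items():
        if other == target:
            continue
        similarity = jaccard(own, items)
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + similarity
    # Keep only items backed by at least one similar customer, best first.
    ranked = [item for item, score in scores.items() if score > 0]
    return sorted(ranked, key=lambda item: (-scores[item], item))

print(recommend("bob", purchases))
```

Here bob shares two purchases with alice and none with carol, so only alice's remaining item is recommended to him.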
The data mining technology used for solving classification, regression and collaborative
filtering problems is briefly described in the appendix at the end of this paper.
Defining CRM
Customer relationship management in its broadest sense simply means managing all
customer interactions. In practice, this requires using information about your customers
and prospects to more effectively interact with your customers in all stages of your
relationship with them. We refer to these stages as the customer life cycle.
The customer life cycle has three stages:
1. Acquiring customers
2. Increasing the value of customers
3. Retaining good customers
Data mining can improve your profitability in each of these stages when you integrate it
with operational CRM systems or implement it as independent applications.
Applying data mining to CRM
In order to build good models for your CRM system, there are a number of steps you
must follow. The Two Crows data mining process model described below is similar to other
process models such as the CRISP-DM model, differing mostly in the emphasis it places on
the different steps.
Keep in mind that while the steps appear in a list, the data mining process is not linear:
you will inevitably need to loop back to previous steps. For example, what you
learn in the "explore data" step (step 3) may require you to add new data to the
data mining database. The initial models you build may provide insights that lead you to
create new variables.
The basic steps of data mining for effective CRM are:
1. Define business problem
2. Build marketing database
3. Explore data
4. Prepare data for modeling
5. Build model
6. Evaluate model
7. Deploy model and results
Let's go through these steps to better understand the process.
1. Define the business problem. Each CRM application has one or more business objectives
for which you need to build the appropriate model. Depending on your specific goal, such as
increasing the response rate or increasing the value of a response, you build a very
different model. An effective statement of the problem includes a way to measure the
results of your CRM project.
2. Build a marketing database. Steps two through four constitute the core of the data
preparation. Together, they take more time and effort than all the other steps combined.
There may be repeated iterations of the data preparation and model building steps as you
learn something from the model that suggests you modify the data. These data preparation
steps may take anywhere from 50 to 90 percent of the time and effort for the entire data
mining process.
You will need to build a marketing database because your operational databases and
corporate data warehouse often don't contain the data you need in the form you need
it. Furthermore, your CRM applications may interfere with the speedy and effective
execution of these systems.
When you build your marketing database you need to clean it up: if you want good
models, you must have clean data. The data you need may reside in multiple databases such
as the customer database, product database and transaction databases. This means you need
to integrate and consolidate the data into a single marketing database and reconcile
differences in data values from the various sources. Improperly reconciled data is a major
source of quality problems. There are often large differences in the way data is defined
and used in different databases. Some inconsistencies may be easy to uncover, such as
different addresses for the same customer. However, these problems are often subtle. For
example, the same customer may have different names or, worse, multiple customer
identification numbers.
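One way to start hunting for customers who appear under several identification numbers is to group records on a normalized matching key. The sketch below is a deliberately naive heuristic: the records, the field names and the choice of name plus postcode as the key are assumptions for illustration, and real reconciliation usually needs fuzzier matching and manual review.

```python
from collections import defaultdict

# Hypothetical records merged from two source systems; values are illustrative only.
records = [
    {"customer_id": "C001", "name": "Jane  Doe", "postcode": "60606"},
    {"customer_id": "W-884", "name": "jane doe", "postcode": "60606"},
    {"customer_id": "C002", "name": "John Smith", "postcode": "10001"},
]

def match_key(record):
    """Normalize case and whitespace so trivially different spellings compare equal."""
    name = " ".join(record["name"].lower().split())
    return (name, record["postcode"].strip())

groups = defaultdict(list)
for record in records:
    groups[match_key(record)].append(record["customer_id"])

# Any key that maps to more than one ID is a candidate duplicate to review.
suspected_duplicates = {key: ids for key, ids in groups.items() if len(ids) > 1}
print(suspected_duplicates)
```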
3. Explore the data. Before you can build good predictive models, you must understand your
data. Start by gathering a variety of numerical summaries (including descriptive
statistics such as averages, standard deviations and so forth) and looking at the
distribution of the data. You may want to produce cross tabulations (pivot tables) for
multi-dimensional data.
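A cross tabulation can be sketched with nothing more than a counter keyed on pairs of field values. The regions and outcomes below are invented for the example.

```python
from collections import Counter

# Hypothetical (region, outcome) observations; values are illustrative only.
observations = [
    ("north", "responded"), ("north", "no response"), ("north", "responded"),
    ("south", "no response"), ("south", "responded"),
]

# Each cell of the cross tabulation is the count of one (region, outcome) pair.
crosstab = Counter(observations)
regions = sorted({region for region, _ in observations})
outcomes = sorted({outcome for _, outcome in observations})

# Print one row per region and one column per outcome.
print("region".ljust(8) + "".join(o.rjust(14) for o in outcomes))
for region in regions:
    cells = "".join(str(crosstab[(region, o)]).rjust(14) for o in outcomes)
    print(region.ljust(8) + cells)
```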
Graphing and visualization tools are a vital aid in data preparation and their importance
for effective data analysis can't be overemphasized. Data visualization most often
provides the "Aha!" that leads to new insights and success. Some common and very
useful graphical data displays are histograms and box plots, which display distributions of
values. You may also want to look at scatter plots in two or three dimensions of different
pairs of variables. The ability to add a third, overlay variable greatly increases the
usefulness of some types of graphs.
4. Prepare data for modeling. This is the final data preparation step before building
models and the step where the most "art" comes in. There are four main parts to this step:
First, you want to select the variables on which to build the model. Ideally, you take all
the variables you have, feed them to the data mining tool and let the data mining tool
find those that are the best predictors. In practice, this is very involved. One reason is
that the time it takes to build a model increases with the number of variables. Another
reason is that blindly including extraneous columns can lead to models with less, rather
than more, predictive power.
The next step is to construct new predictors derived from the raw data. For example,
forecasting credit risk using a debt-to-income ratio rather than just debt and income as
predictor variables may yield results that are more accurate and also easier to
understand.
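A derived predictor such as the debt-to-income ratio can be computed in a few lines. In this sketch the applicant records are invented, and returning None when income is missing or zero is just one possible convention, not a recommendation from the source.

```python
def debt_to_income(debt, income):
    """Return debt divided by income, or None when income is missing or zero."""
    if not income:
        return None
    return debt / income

# Hypothetical applicants; amounts are illustrative only.
applicants = [
    {"debt": 12000.0, "income": 48000.0},
    {"debt": 30000.0, "income": 40000.0},
    {"debt": 5000.0, "income": 0.0},
]

# Add the derived predictor alongside the raw fields.
for applicant in applicants:
    applicant["debt_to_income"] = debt_to_income(applicant["debt"], applicant["income"])

print([a["debt_to_income"] for a in applicants])
```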
Next, you may decide to select a subset or sample of your data on which to build models.
If you have a lot of data, using all of it may take too long or require
buying a bigger computer than you'd like. Working with a properly selected random
sample usually results in no loss of information for most CRM problems. Given a choice of
either investigating a few models built on all the data or investigating more models built
on a sample, the latter approach usually helps you develop a more accurate and robust
model of the problem.
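Drawing a simple random sample is straightforward with Python's standard library. The customer IDs, the 10 percent sample size and the seed below are arbitrary choices for illustration.

```python
import random

# Hypothetical customer IDs standing in for rows of a marketing database.
customer_ids = list(range(1, 10001))

rng = random.Random(42)  # fixed seed so the same sample can be drawn again
sample_ids = rng.sample(customer_ids, k=1000)  # 10 percent, without replacement

print(len(sample_ids), "customers sampled")
```

Sampling without replacement guarantees that no customer appears twice, and fixing the seed lets you rebuild the same sample when you revisit a model later.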
Last, you need to transform variables in accordance with the requirements of the algorithm
you choose to build your model.
5. Data mining model building. The most important thing to remember about model building
is that it is an iterative process. You need to explore alternative models to find the one
that is most useful in solving your business problem. What you learn when searching for a
good model may lead you to go back and make some changes to the data you are using or even
modify your problem statement.
Most CRM applications are based on a protocol called supervised learning. You start with
customer information for which the desired outcome is already known. For example, you may
have historical data from a previous mailing list that is very similar to the one you are
currently using. Or, you may have to conduct a test mailing to determine how people will
respond to an offer. You then split this data into two groups. On the first group, you
train or estimate your model. You then test it on the remainder of the data. A model is
built when the cycle of training and testing is completed.
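The split into a training group and a test group might be sketched as follows. The mailing history is fabricated for the example, and the 70/30 proportion and the function name train_test_split are assumptions, not something the process prescribes.

```python
import random

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and hold out a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical results of a previous mailing, where the outcome is already known.
mailing_history = [{"customer_id": i, "responded": i % 20 == 0} for i in range(200)]

train_rows, test_rows = train_test_split(mailing_history)
print(len(train_rows), "rows for training,", len(test_rows), "rows held out for testing")
```

Shuffling before splitting matters when the source data is ordered (for example, by sign-up date), because otherwise the two groups could differ systematically.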
6. Evaluate your results. Perhaps the most overrated metric for evaluating your results is
accuracy. Suppose you have an offer to which only one percent of the people respond. A
model that predicts nobody will respond is 99 percent accurate and 100 percent
useless. Another measure that is frequently used is lift. Lift measures the improvement
achieved by a predictive model. However, lift does not take into account cost and revenue
so it is often preferable to look at profit or ROI. Depending on whether you choose to
maximize lift, profit or ROI, you choose a different percentage of your mailing list to
whom you send solicitations.
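The difference between accuracy, lift and profit is easier to see with numbers. The sketch below assumes 1,000 customers ranked by model score with 10 responders (a one percent response rate), a mailing cost of 1 per piece and revenue of 50 per response. All of these figures, and the ranks at which responders fall, are invented. Lift is computed here as the response rate in the mailed group divided by the overall response rate, which is one common definition.

```python
n_customers = 1000
# Hypothetical ranks (0 = highest model score) at which the responders fall.
responder_ranks = {0, 1, 2, 3, 4, 5, 150, 300, 600, 900}
# responded[i] is the known outcome for the customer ranked i by the model.
responded = [rank in responder_ranks for rank in range(n_customers)]

# A "model" that predicts nobody responds is right for every non-responder.
accuracy_all_no = sum(1 for r in responded if not r) / n_customers

def lift_at(top_k):
    """Response rate among the top_k scored customers relative to the overall rate."""
    hits_in_top = sum(responded[:top_k])
    total_hits = sum(responded)
    return (hits_in_top * n_customers) / (top_k * total_hits)

def profit_at(top_k, cost_per_piece=1, revenue_per_response=50):
    """Revenue from responders in the top_k minus the cost of mailing top_k pieces."""
    return sum(responded[:top_k]) * revenue_per_response - top_k * cost_per_piece

print("accuracy of predicting no response:", accuracy_all_no)
print("lift in top 100:", lift_at(100))
print("profit mailing top 100:", profit_at(100))
print("profit mailing everyone:", profit_at(n_customers))
```

Under these assumed numbers, mailing only the top-scored tenth of the list is profitable while mailing everyone loses money, which is why the cutoff you choose depends on the measure you optimize.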
7. Incorporating data mining in your CRM solution. In building a CRM application, data
mining is often a small, albeit critical, part of the final product. For example,
predictive patterns found through data mining may be combined with the knowledge of domain
experts and incorporated in a large application used by many different kinds of people.
The way data mining is actually built into the application is determined by the nature of
your customer interaction. There are two main ways you interact with your customers: they
contact you (inbound) or you contact them (outbound). The deployment requirements are
quite different.
Outbound interactions are characterized by your company originating the contact,
such as through a direct mail campaign. Thus, you select people to contact by applying the
model to your customer database. Another type of outbound campaign is an advertising
campaign. In this case, you match the profiles of good prospects shown by your model to
the profile of the people your advertisement would reach.
For inbound transactions, such as a telephone order, an Internet order or a customer
service call, the application must respond in real time. Therefore, the data mining model
is embedded in the application and actively recommends an action. In either case, one key
issue you must deal with in applying a model to new data is the transformations you used
in building the model. Thus if the input data (whether from a transaction or a database)
contains age, income and gender fields, but the model requires the age-to-income ratio and
gender has been changed into two binary variables, you must transform your input data
accordingly. The ease with which you can embed these transformations becomes one of the
most important productivity factors when you want to rapidly deploy many models.
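The example in the preceding paragraph can be sketched as a single transformation function that is reused both when the model is built and when it is applied to new input. The field names, the output names and the handling of zero income are assumptions made for illustration.

```python
def transform(record):
    """Turn raw age, income and gender fields into the inputs the model expects."""
    income = record["income"]
    # Assumed convention: fall back to 0.0 when income is missing or zero.
    age_to_income = record["age"] / income if income else 0.0
    gender = record["gender"].strip().lower()
    return {
        "age_to_income": age_to_income,
        # Gender is recoded as two binary variables, as described above.
        "gender_female": 1 if gender == "female" else 0,
        "gender_male": 1 if gender == "male" else 0,
    }

# Hypothetical input arriving from a transaction or a database row.
incoming = {"age": 40, "income": 80000, "gender": "Female"}
model_input = transform(incoming)
print(model_input)
```

Keeping the transformation in one place, rather than reimplementing it in each application, is one way to reduce the risk that the deployed model sees differently prepared data than it was trained on.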