Last week, we introduced you to Propensity Modeling and how it can help your association make data-guided decisions while providing great value to your customers. We’ll now dig into some of the technical detail and steps to implement Propensity Modeling.
Step 1. Prepare Your Data
Consistent, complete, and accurate data is the foundation of predictive modeling. Your data should ultimately have one very wide row per record, with a dependent variable of 1 or 0 indicating whether the business action was taken, along with a variety of independent variables holding their values at the time of the transaction.
Categorical data should be converted to “dummy” variables where values are transformed into individual columns as opposed to row-based data that is ideal for data exploration. Fortunately, the ability to quickly access high-quality and timely data regardless of source from an environment such as a dimensional data model makes the process much easier.
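As a rough illustration of the dummy-variable conversion described above, here is a minimal Python sketch using only the standard library. The `to_dummies` helper, the column names, and the member records are hypothetical and exist only for this example.

```python
def to_dummies(rows, column):
    """Replace one categorical column with one 0/1 column per distinct value."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every column except the categorical one being expanded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical member records; "renewed" plays the role of the dependent variable.
members = [
    {"member_type": "student", "years_member": 1, "renewed": 0},
    {"member_type": "professional", "years_member": 6, "renewed": 1},
    {"member_type": "retired", "years_member": 20, "renewed": 1},
]

wide_rows = to_dummies(members, "member_type")
for row in wide_rows:
    print(row)
```

When the result feeds a regression model, one of the dummy columns is typically dropped so the remaining columns are not perfectly collinear.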
Step 2. Select Your Variables
Incorporating the right mix of features is vital to the success of any predictive model. While it’s great to have many variables available as candidates, having too many can actually harm model accuracy.
Several automated stepwise techniques are available to propose variables by iterating through different combinations while considering measures such as significance and model error. Relying on automated processes alone is not recommended: statistics should be tempered with business expertise to weed out variables that are not meaningful and to choose between highly correlated variables. Another challenge is the potential for overfitting, where the variables selected from the sample data turn out not to be the best choices for unseen data.
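One check that can support this review is flagging pairs of candidate variables that are highly correlated, so a person can decide which one to keep. The sketch below is an assumption-laden illustration: the variable names, the sample values, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (assumes neither is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair whose |r| meets the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical candidate variables, one value per member.
candidates = {
    "years_member": [1, 3, 5, 8, 12, 20],
    "events_attended": [0, 2, 1, 4, 2, 3],
    "total_dues_paid": [100, 310, 490, 820, 1190, 2010],
}

flagged = correlated_pairs(candidates)
print(flagged)
```

The code only surfaces the pairs. Choosing which variable of a flagged pair to keep is the business judgment described above.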
Step 3. Select Your Modeling Technique
Next, you will want to select a modeling technique. You will likely be deciding between a linear regression model and a logistic regression model.
Linear Regression models predict continuous outcomes that can take nearly any value, such as time, money, or large counts. Propensity Modeling generally leverages Logistic Regression models, which produce probability-based scores within a fixed range of 0 to 1. The underlying algorithms used to fit the two kinds of models are very different as well.
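To make the 0-to-1 range concrete, the sketch below shows how a logistic model turns a weighted sum of variables into a score. The intercept and coefficients are hypothetical values chosen for illustration; in practice they would come from fitting a model to your data.

```python
from math import exp


def logistic(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + exp(-z))


# Hypothetical fitted values, not taken from any real model.
intercept = -2.0
coefficients = {"years_member": 0.15, "events_attended": 0.5}


def propensity(record):
    """Score one record: a linear combination passed through the logistic function."""
    z = intercept + sum(coefficients[name] * record[name] for name in coefficients)
    return logistic(z)


score = propensity({"years_member": 6, "events_attended": 2})
print(round(score, 3))
```

A linear regression would report the weighted sum `z` directly, which is unbounded. The logistic function is what keeps the result interpretable as a probability.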
Logistic Regression is often perceived as an approach to estimate binary outcomes by rounding to 0 or 1, but a score of .51 is very different from a score of .99. A common approach is to rank records by score and assign them to deciles, or 10 bins that each hold roughly the same number of records.
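A minimal sketch of decile assignment follows, assuming the bins are formed by ranking records so that each holds about the same number of records, and that decile 1 holds the highest scores. The scores are made up for the example.

```python
def assign_deciles(scores):
    """Return a decile from 1 (highest scores) to 10 (lowest) for each score, by rank."""
    # Indices of the scores, ordered from highest score to lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    deciles = [0] * len(scores)
    for rank, index in enumerate(order):
        deciles[index] = rank * 10 // len(scores) + 1
    return deciles


# Hypothetical propensity scores for 20 records.
scores = [0.99, 0.51, 0.12, 0.75, 0.33, 0.08, 0.91, 0.64, 0.27, 0.45,
          0.58, 0.02, 0.81, 0.39, 0.70, 0.18, 0.95, 0.49, 0.22, 0.86]

deciles = assign_deciles(scores)
for score, decile in zip(scores, deciles):
    print(score, decile)
```

With these sample values, the records scoring .99 and .51 would both round to 1, yet they land in different deciles.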
Step 4. Determine If You Need to Use Any Other Analytic Techniques
You can use several other advanced analytic techniques to accomplish goals similar to Propensity Modeling.
- Clustering is a form of unsupervised learning as the model is not based on a specific outcome or dependent variable, but simply groups records such as individuals. The groups can result in customer segments that are ideal for certain products or marketing approaches.
- Collaborative Filtering is based solely on the actions of groups of users as opposed to individual characteristics. This is a common approach for recommendation systems based on actions such as purchases, product ratings, or web activity.
- Decision Trees traverse a path of variables with branch “splits” based on the contributions of variables to ultimate outcomes. This technique can be effective when a very small set of variables leads to outcomes influenced by downstream groups of variables.
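As one illustration from the list above, Collaborative Filtering can be sketched very simply: suggest items purchased by people whose purchases overlap with yours. The `recommend` function, member names, and products below are hypothetical, and real recommendation systems use far more refined similarity measures.

```python
from collections import Counter


def recommend(purchases, member, top_n=2):
    """Suggest items bought by members who share at least one purchase with `member`."""
    owned = purchases[member]
    votes = Counter()
    for other, items in purchases.items():
        if other == member:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue  # No shared purchases, so this member's history is ignored.
        for item in items - owned:
            votes[item] += overlap  # Weight each suggestion by the amount of overlap.
    return [item for item, _ in votes.most_common(top_n)]


# Hypothetical purchase history keyed by member.
purchases = {
    "alice": {"annual_conference", "webinar_series"},
    "bob": {"annual_conference", "webinar_series", "certification"},
    "carol": {"annual_conference", "certification", "journal"},
    "dave": {"journal"},
}

suggestions = recommend(purchases, "alice")
print(suggestions)
```

Note that the sketch uses only actions (purchases) and never looks at individual characteristics such as age or job title.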
You can also combine models, where the results of one model are the input to another, to create an ensemble model.
The decile scores generally represent a range from “sure thing” to “lost cause”. You can use the different decile groups to guide approaches such as the effort to retain individuals, pricing strategies, and marketing messages.
Step 5. Determine Measurement Approach
The Lift of a Propensity Model is the ratio of the response rate among individuals selected using the model to the response rate among “random” individuals. For example, if 6% of model-selected individuals respond compared with 3% of randomly selected individuals, the lift is 2. An ideal way to derive this measure is to maintain a control group for comparison to a similar group leveraging the Propensity Model. It can be a difficult decision to risk potential revenue on a control group, so a common approach is to simply compare before-and-after results.
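The lift calculation can be sketched in a few lines of Python. The outcome lists are hypothetical: 1 means the individual took the action and 0 means they did not.

```python
def response_rate(outcomes):
    """Share of individuals who took the action (outcomes are 1 or 0)."""
    return sum(outcomes) / len(outcomes)


def lift(model_outcomes, control_outcomes):
    """Ratio of the model group's response rate to the control group's."""
    return response_rate(model_outcomes) / response_rate(control_outcomes)


# Hypothetical campaign results.
model_group = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]    # 6 of 10 responded
control_group = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]  # 3 of 10 responded

campaign_lift = lift(model_group, control_group)
print(campaign_lift)
```

A lift above 1 suggests the model-selected group responded at a higher rate than the control group. Real groups need to be much larger than ten individuals for the comparison to be meaningful.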
Step 6. Consider How You Will Take Action
Before using any analytics model, it’s a good idea to consider how you can take action on the information. What decisions will you make as a result of the information? Similarly, how will you measure the results of the action and use it to inform your model?
For example, you can use a propensity model to reduce expenses. Targeting individuals differently based on their propensity to take action can optimize costs in different ways. Costs might be direct costs, such as actual print mailings or list rentals, or costs can be indirect, such as many non-personalized emails that contribute to information overload. You will want to establish a baseline and a goal for cost reduction to measure success of the model.
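The baseline-versus-goal comparison can be sketched as simple arithmetic. The example assumes each member already has a decile from 1 (most likely to act) to 10 (least likely). The member counts, the cost per piece, and the choice to mail only the top three deciles are hypothetical.

```python
def mailing_cost(deciles, cost_per_piece, max_decile=10):
    """Cost of mailing every record whose decile is at or below max_decile."""
    return cost_per_piece * sum(1 for d in deciles if d <= max_decile)


# Hypothetical membership: 1,000 members, 100 in each decile.
member_deciles = [d for d in range(1, 11) for _ in range(100)]

baseline = mailing_cost(member_deciles, cost_per_piece=1.25)                # mail everyone
targeted = mailing_cost(member_deciles, cost_per_piece=1.25, max_decile=3)  # top 3 deciles only
savings_pct = 100 * (baseline - targeted) / baseline

print(baseline, targeted, savings_pct)
```

This covers only the cost side. The responses given up by not mailing the lower deciles would need to be measured as well before judging the campaign a success.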
Step 7. Identify Your Tool
Several options are available to implement Propensity Modeling.
- R Programming: A popular open-source statistical programming language with many mature packages to perform the techniques underlying Propensity Modeling.
- Alteryx Software: A software platform offering pre-built tools for different modeling techniques and business scenarios.
- Amazon Machine Learning: A cloud-based service, part of the comprehensive Amazon Web Services environment, that provides visual wizards for tools to perform Propensity Modeling.
This may seem like a lot of steps, but once you have all of your comprehensive data easily accessible along with an available user-friendly tool, all you will need is your imagination to better understand your association’s customer journeys to make valuable data-guided decisions.