Archive for Predictive Analytics

How Associations Are Successfully Using Artificial Intelligence

With AI no longer science fiction, associations are using advanced technologies to convert mountains of data into actionable insights.

At the recent EMERGENT event, hosted by Association Trends, we had the opportunity to jointly present case studies with ASAE’s Senior Director of Business Analytics, Christin Berry.

These success stories include how ASAE has:

Combined artificial intelligence and text analytics to enhance customer engagement, understand evolving trends, and improve product offerings

DOUBLED online engagement, with unique open and click-to-open rates more than doubling, by using AI to personalize newsletters

Reduced the need for surveys, identified what’s trending, and measured community activity through Community Exploration

Leveraged Expertise Search and Matching to better identify experts and bring people with similar interests together

I’m Matt Lesnak, VP of Product Development & Technology at Association Analytics, and I hope to demystify these emerging technologies to jumpstart your endeavors in association innovation.

Text and Other Analytics

Associations turn to analytics and visual discovery for answers to common questions including:

  • How many members do we have for each member type?
  • How many weeks are people registering in advance of the annual meeting?
  • How much are sales this year for the top products?

Questions about text content can be very different, and less specific.  For example:

  • What is it about?
  • What are the key terms?
  • How can I categorize the content?
  • Who and where is it about?
  • How similar is it to other content?
  • How is the writer feeling?

It is widely estimated that 70% of analytics effort is spent on data wrangling.

This high proportion is no different for text analytics but can be well worth the effort. Text analytics involves unique challenges including:

  • Term ambiguity: Bank of a river vs. a bank with money vs. an airplane movement
  • Equivalent terms: Eat vs. ate, run vs. running
  • High volume: Rapidly growing social data
  • Different structure: Doesn’t really have rows, columns, and measures
  • Significant data wrangling: Must be transformed into usable format

Like the ever-growing data from association source systems that might flow to a data warehouse, text content of interest might include community discussions, articles or other publications/books, session/speaker proposals, journal submissions, and voice calls or messages.

Possible uses include enhancing your content strategy, providing customized resources, extracting trending topics for CEOs, and identifying region-specific challenges.

Learn More

 

Personalized Newsletter

ASAE is working with rasa.io to automatically identify topics of newsletter content as part of a pilot that significantly improved user engagement.  ASAE and rasa.io first tracked newsletters interactions over time to understand individual preferences and trending topics.  Individuals then received personalized newsletters based on demonstrated preferences.

The effort has been very successful: unique open and click-to-open rates have more than doubled for the personalized newsletters.

Underlying technology includes Google, IBM Watson, and Amazon Web Services, combined with other machine learning tools developed by rasa.io.


Community Exploration

ASAE leverages a near-real-time integration of over 10 million community data points combined with an enterprise data warehouse to analyze over 50,000 pieces of discussion content and over 50,000 site searches.  The integration is offered as part of the Association Analytics Acumen product through a partnership with Higher Logic.

Information extracted includes named entities, key phrases, term relevancy, and sentiment analysis.  This capability provides several impactful benefits.

Quick wins:

  • Visualize search terms
  • What’s trending
  • Staff and volunteer use
  • Reduce need for surveys

Longer-term opportunities:

  • Using the “aboutness” of posts to inform content strategy
  • Identifying key expertise areas
  • Connecting like-minded individuals

Underlying technology includes AWS Comprehend, Python, and Hadoop with Mahout.

Learn More


Expertise Search and Matching

Another application of text analytics that we’ve implemented involves enabling associations to better identify experts and bring together people with similar interests.  In addition to structured data from multiple sources, text from content including meeting abstracts and paper manuscripts provides insights into potential individual interests and expertise.

This incorporates data extracted from content using approaches including content similarity, term relevancy, validation of selected tags, and identifying potential collaborators.

Underlying technology includes Python and Hadoop with Mahout.


Approaches and Technology

We’ve written extensively about the importance of transforming data into a format optimized for analytics, such as a dimensional data model implemented as a data warehouse.

Thinking back to the common association questions involving membership, event registration, and product sales, these are based on discrete data such as member type, event, and day.

Text data is structured for analysis using a different approach, but it is fundamentally similar: each term becomes a field, much as member type is a field in a membership table.

Picture a matrix with each document as a row and each term as a column.

This is referred to as “vector space representation”.  With thousands of commonly used words in the English language, that can be a big matrix.  Fortunately, we have ways to reduce this size and complexity.

First, some basic text preparation:

  • Tokenization – splitting into words and sentences
  • Stop Word Removal – removing words such as “a”, “and”, “the”
  • Stemming – reduction to root word
  • Lemmatization – morphological analysis to reduce words to their base (dictionary) form
  • Spelling Correction – like common spell-checkers

Another classic approach is known as “Term Frequency–Inverse Document Frequency (TF-IDF)”.  We use TF-IDF to reduce the data to include the most important terms using the calculated scores.  TF-IDF is different from many other techniques as it considers the entire population of potential content as opposed to isolated individual instances.
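As a rough illustration, here is a minimal base R sketch (using a made-up three-document corpus) of the document-term matrix and TF-IDF weighting described above:

# Toy corpus: three short "documents"
docs <- c("members attend the annual meeting",
          "the annual meeting drives member engagement",
          "renewal invoices were mailed to members")

# Tokenization and stop word removal (tiny stop list, for illustration only)
stop_words <- c("the", "to", "were")
tokens <- lapply(strsplit(tolower(docs), "\\s+"),
                 function(x) x[!x %in% stop_words])

# Document-term matrix: one row per document, one column per term
terms <- sort(unique(unlist(tokens)))
dtm <- t(sapply(tokens, function(x) table(factor(x, levels = terms))))

# TF-IDF: term frequency weighted by inverse document frequency
tf    <- dtm / rowSums(dtm)
idf   <- log(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)
round(tfidf, 2)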


Other key foundational processing:

  • Part-of-Speech Tagging: Noun, verb, adjective
  • Named Entity Recognition: Person, place, organization
  • Structure Parsing: Sentence component relationships
  • Synonym Assignment: Discrete list of synonyms
  • Word Embedding: Words converted to numbers

The use of Word Embedding, also referred to as Word Vectors, is particularly interesting.  For example, the word embedding similarity of “question” and “answer” is over 0.93.  This isn’t necessarily intuitive, and it is not feasible to manually maintain rules for different term combinations.

A team of researchers at Google created a group of models known as Word2vec that has been implemented in development languages including Python, Java, and C.

Here are common analysis techniques:

  • Text Classification: Assignment to pre-defined groups, which generally requires a set of pre-classified content
  • Topic Modeling: Derives topics from text content
  • Text Clustering: Separating content into similar groups
  • Sentiment Analysis: Categorizing opinions with measures for positive, negative, and neutral
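For instance, here is a minimal text clustering sketch in base R, reusing the toy TF-IDF matrix from the earlier sketch (real projects would use purpose-built packages and far more documents):

# Group documents into 2 clusters based on their TF-IDF vectors
set.seed(42)
clusters <- kmeans(tfidf, centers = 2)
clusters$cluster   # cluster assignment for each document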


Finding and Measuring Results

With traditional data queries and interactive visualizations, we generally specify the data we want by selecting values, numeric ranges, or portions of strings.  This is very binary – either the data matches the criteria, or it does not.

We filter and curate text using similarity measures that estimate “distance” between text content.  Examples include point-based Euclidean distance, vector-based cosine distance, and set-based Jaccard similarity.
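In base R, these three measures might be sketched as follows (the vectors and term sets are made up):

a <- c(1, 0, 2, 1)   # term counts for document A
b <- c(0, 1, 2, 1)   # term counts for document B

euclidean <- sqrt(sum((a - b)^2))
cosine    <- sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Jaccard similarity compares the sets of terms each document contains
set_a <- c("member", "annual", "meeting")
set_b <- c("member", "renewal", "meeting")
jaccard <- length(intersect(set_a, set_b)) / length(union(set_a, set_b))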

Once we identify desired content, how do we measure overall results?  This is referred to as relevance and is made up of measures known as precision and recall.  Precision is the fraction of relevant instances among the retrieved instances, and recall is the fraction of relevant instances that have been retrieved over the total amount of relevant instances.  The balance between these measures is based on a tradeoff between ensuring all content is included and only including content of interest.  This should be driven by the business scenario.
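As a small worked example (the document IDs are hypothetical), precision and recall reduce to simple set arithmetic:

relevant  <- c("doc1", "doc2", "doc3", "doc4")   # all content of interest
retrieved <- c("doc2", "doc3", "doc7")           # what the query returned

true_positives <- length(intersect(retrieved, relevant))
precision <- true_positives / length(retrieved)   # 2/3: how much of what we retrieved is relevant
recall    <- true_positives / length(relevant)    # 2/4: how much of the relevant content we retrieved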

This overall approach to text analytics is like that used for recommendation engines based on collaborative filtering driven by preferences of “similar” users and “similar” products.


APIs to the Rescue

Fortunately, there are web-based Application Programming Interfaces (APIs) that we’ve used and that can help you get started.  Amazon and IBM both offer online instances for interactive experimentation.

This is a lot of information, but the takeaways are that there are big opportunities for associations to mine their troves of text data and that it is easy to get started using web-based APIs to rapidly provide valuable insights.

Learn More

 

Matt Lesnak, VP of Product Development & Technology
Association Analytics

Coolest New Power BI Features Revealed!

Recently, the Microsoft Business Applications Summit 2019 highlighted new Power BI features, and these are the coolest features to note IMO:

1. New Power BI App Workspace Experience in Preview
Power BI App Workspaces were introduced to enable collaboration amongst the data/business analysts within an organization. The new experience introduces numerous improvements to better enable a data-driven culture, including:

•   Managing access using security groups, distribution lists, and Office 365 Groups
•   Not automatically creating an Office 365 group
•   APIs for Admins, as well as new tools for Power BI Admins to effectively manage workspaces

2. Printing Reports via Export to PDF
You can now easily print or email copies of reports by exporting all visible pages to PDF.

3. Bookmark Groups
Now you have a way to organize bookmarks into groups for easier navigation.

4. Python Integration in Preview
Now data scientists can use Python in addition to R within Power BI Desktop.

5. New Visual Header
More flexibility and formatting options have been added to the header section of each visual.

6. Tooltips for Table and Matrix Visuals
Report page tooltips are now available for the table and matrix visuals.

7. Many to Many Relationships in Preview
You will now be able to join tables using a cardinality of “Many to Many” – prior to this feature, at least one of the columns involved in the relationship had to contain unique values.

And now I’ve saved the best for last!

8. Composite Models in Preview
With this feature, you’ll now be able to seamlessly combine data from one or more DirectQuery sources, and/or combine data from a mix of DirectQuery sources and imported data. For example, you can build a model that combines sales data from an enterprise data warehouse using DirectQuery, with data on sales targets that is in a departmental SQL Server database using DirectQuery, along with some data imported from a spreadsheet.

As you can see there are many new features to digest but it would be well worth your while to follow the links provided.

On a closing note, I’d like to give you a teaser for two new features coming up soon that will have a huge impact on self-service data prep and querying for big data:

  • Dataflows
  • Aggregates

Stay tuned!
Mario Di Giovanni, BASc, MBA, CBIP
Director, Business Analytics

More about Mario

 

Tableau’s R Integration

Ever wondered how data scientists and data analysts use Tableau for predictive analytics? The ability to integrate R into Tableau is powerful functionality. Even for those familiar with R, it can be tricky to get started. Here’s how to get started with the R integration.

Step 1. Set Up R on Your Computer

First, you will need to have a user interface for R on your computer. We recommend R Studio Desktop.

Step 2. Install the Rserve Package

Next, you will need to install the Rserve package. To do this, click on Packages -> Install. Then, type in Rserve and it will find the package for you to install.

Step 3. Set Up Rserve Connection

Now you will need to run the following code to start up the Rserve connection:
library(Rserve)   # load the Rserve package
Rserve()          # start the Rserve server so Tableau can connect to R

Step 4. Set Up the External Connection in Tableau

There is one more thing you will need to do prior to writing R in Tableau, and for this you will need to switch over to Tableau. Tableau needs to have the external connection set up in order to run R.  Go to Help -> Settings and Performance -> Manage External Connections.
 
In the pop-up, type in localhost for the Server name. Click on Test Connection to verify it is now connected.

Step 5. Start Using R Integration

At this point, we can now start taking advantage of the R integration.  The integration uses calculated fields to pass R code. There are four different types of calculations used in the R integration:

  1. SCRIPT_BOOL
  2. SCRIPT_INT
  3. SCRIPT_REAL
  4. SCRIPT_STR

Which one you use depends on what type of value you expect to get as a result of your R Code.  SCRIPT_BOOL would be used if you expected a TRUE/FALSE value returned.  SCRIPT_INT would be used if you expected to have an integer returned.  SCRIPT_REAL would be used if you expected a numeric value returned.  SCRIPT_STR would be used if you expected a string value to be returned.
The basic set-up of any R calculated field is as follows:
SCRIPT_REAL (
"R code",
Tableau fields being passed in
)
The R code is enclosed in quote marks, and the parentheses enclose both the R code and any Tableau measures/dimensions that will be used inside the R code. You can pass in multiple Tableau fields; just separate the field names with commas.
Two important items to know: inside the R code, you do not use the Tableau field names, you refer to them as .arg1, .arg2, and so on; and you cannot mix aggregate and non-aggregate arguments.  Here is an example below.
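The original screenshot is not reproduced here, but a calculated field of that shape (the comparison logic is purely illustrative) would look something like this:

SCRIPT_BOOL(
  ".arg1 > 0 & .arg2 == 'Furniture'",
  SUM([Profit]), ATTR([Department])
)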
Within my R code, I would need to refer to SUM([Profit]) as .arg1 and ATTR([Department]) as .arg2.  Also, I made Department an attribute in order to use both it and Profit.

Example of R and Tableau in Action

Now that you have the basics of the calculated field, here’s a real-life example using the Superstore dataset. We’ll be looking at the correlation between Profit and Discount.  The returned value will be numeric, so I will be using SCRIPT_REAL.
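The calculated field from the screenshot is not reproduced here, but assuming R's built-in cor function, it would look something like this:

SCRIPT_REAL(
  "cor(.arg1, .arg2)",
  SUM([Profit]), SUM([Discount])
)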
Now, use that field to visualize the correlation coefficient for each combination of Customer Segment and Supplier. A value close to -1 indicates a negative linear relationship between the variables. A value close to +1 indicates a positive linear relationship between the variables.
This is just a starter in using the R integration. Hopefully, this will help you get started using this at your own association. If you need help developing predictive models or using R, contact us.

Top Posts of 2016

We looked at our Google Analytics data to bring you our most popular blog posts from 2016 based on total hits. If you missed these popular posts, here’s another chance to read them.

Tableau 9.3: Smarter Version Control and Better Use of Color

Back in March 2016, Jasmin Ritchie published this summary of her favorite new features in Tableau Desktop 9.3. New features included the ability to have workbook versions stored on the server when a new one is published so that it can be reverted if necessary. Here are her other three favorite new features:

  1. Newly designed dialog for data source publishing
  2. Excluding grand totals from shading so the eye isn’t drawn directly to those obviously higher numbers
  3. Applying colors to sheets. This is very helpful for organization and communication. We’ve started using color on the sheet name to organize and identify which worksheets are not ready to be published.

Top 5 New Features in Tableau 10

Tableau 10 was released in August 2016. Tamsen Haught provided a review of her favorite features in Tableau 10, including:

  • Revision history improvements and previews
  • Device specific dashboards which enable a single dashboard to target multiple device sizes automatically
  • Workbook formatting which is a glorious time saver that lets you apply a style to an entire workbook
  • Cross data source filtering which makes filtering across data sources much easier with an option called “all related data sources”
  • Clustering, which is essentially a simplified way to build segmented groups from variables like exercise and diet and find which locations are similar to one another.

Also worth mentioning: when you upgrade to Tableau 10, you get the new features from 9.3 as well as those from any other earlier versions you’re missing.

Using Propensity Modeling to Drive Revenue and Increase Engagement

Propensity modeling lets you “look at past behaviors in order to make predictions about your customers.” Associations can use propensity modeling to drive revenue and increase engagement. Models are able to predict the likelihood that someone will buy, churn or lapse, or unsubscribe. There’s a lot of potential with this type of advanced analytics. We predict more associations will invest in advanced analytics like propensity modeling in the year to come.

Five Advanced Analytics Ideas for Associations

At the ASAE Technology Conference this past month, Kelly Baker (Chief Analytics Officer at Association Analytics®) and Galina Kozachenko (Director, Strategic Data Analytics, Association for Financial Professionals) presented a fascinating session on advanced analytics.

What is Advanced Analytics?

Gartner defines advanced analytics as “the autonomous or semi-autonomous examination of data or content using sophisticated techniques and tools, typically beyond those of traditional business intelligence (BI), to discover deeper insights, make predictions, or generate recommendations.”

Five Advanced Analytics Ideas for Associations

During the session, Kelly and Galina shared five ideas for how associations can use advanced analytics to drive engagement and revenue:

  1. Meeting Attendance Model to predict which members will attend an upcoming meeting
  2. Decision Tree to visually show how different paths affect conversion times.
  3. Interactive Tool created from a regression model that helps individuals see how different decisions will impact outcomes.
  4. 20-Year Revenue Model that calculates net present value of a new member and thus total membership revenue, including what-if scenarios from simulation results for each variable.
  5. Purchasing Likelihood Model to predict whether/when an individual or company will purchase an item.

Download Kelly and Galina’s complete presentation to learn more about how the Association for Financial Professionals is using advanced analytics to advance their mission.

How Your Association Can Implement Propensity Modeling

Last week, we introduced you to Propensity Modeling and how it can help your association make data-guided decisions while providing great value to your customers. We’ll now dig into some of the technical detail and steps to implement Propensity Modeling.

Step 1. Prepare Your Data

Consistent, complete, and accurate data is the foundation of predictive modeling. Your data should ultimately look like very wide rows, each with a dependent variable of 1 or 0 indicating whether the business action was taken, along with a variety of independent variables holding values as of the time of the transaction.
Categorical data should be converted to “dummy” variables, where values are transformed into individual columns, as opposed to the row-based data that is ideal for data exploration.  Fortunately, the ability to quickly access high-quality and timely data, regardless of source, from an environment such as a dimensional data model makes the process much easier.
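A minimal base R sketch of dummy coding (the member_type values are hypothetical):

# One categorical column...
members <- data.frame(member_type = c("Student", "Regular", "Retired", "Regular"))

# ...expanded into one 0/1 column per level (drop the intercept to keep all levels)
dummies <- model.matrix(~ member_type - 1, data = members)
dummies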

Step 2. Select Your Variables

Incorporating the right mix of features is vital to the success of any predictive model. While it’s great to have many variables available as candidates, having too many can actually harm model accuracy.
Several automated stepwise techniques are available to propose variables by iterating through different combinations while considering measures such as significance and model error. Relying solely on automated processes is not recommended, as statistics should be tempered with business expertise to identify variables that are not meaningful or to choose between highly correlated variables. Another challenge is the potential for overfitting, meaning the variables selected based on the sample data are not the best for unseen data.
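Here is a hedged sketch of automated stepwise selection using base R's glm and step functions; the column names and simulated values are stand-ins for your own prepared data:

# Simulated stand-in for a prepared member data set
set.seed(1)
member_data <- data.frame(
  events_attended  = rpois(200, 2),
  years_member     = sample(1:10, 200, replace = TRUE),
  committee_member = rbinom(200, 1, 0.3),
  age              = round(rnorm(200, 45, 10))
)
member_data$renewed <- rbinom(200, 1, plogis(-1 + 0.5 * member_data$events_attended))

# Fit the full logistic model, then let stepwise AIC propose a reduced set of variables
full_model    <- glm(renewed ~ ., data = member_data, family = binomial)
reduced_model <- step(full_model, direction = "both")
summary(reduced_model)   # review with business judgment before accepting the selection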

Step 3. Select Your Modeling Technique

Next, you will want to select a modeling technique. You will likely be deciding between a linear regression model and a logistic regression model.
Linear Regression models predict outcomes on a nearly infinite continuous scale, such as time, money, or large counts. Propensity Modeling generally leverages Logistic Regression models to derive probability-based scores within a fixed range of 0 to 1. The underlying algorithms used to create the two types of models are very different as well.
Logistic Regression is often perceived as an approach to estimate binary outcomes by rounding to 0 or 1, but a score of .51 is very different from a score of .99.  A common approach is to assign records to categories using deciles, or 10 bins with equal ranges.
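Continuing the sketch above, scoring and decile assignment might look like this:

# Probability-based propensity score between 0 and 1 for each record
member_data$score <- predict(reduced_model, newdata = member_data, type = "response")

# Assign each record to one of 10 bins with equal score ranges
member_data$decile <- cut(member_data$score, breaks = seq(0, 1, 0.1),
                          labels = 1:10, include.lowest = TRUE)
table(member_data$decile)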

Step 4. Determine If You Need to Use Any Other Analytic Techniques

You can use several other advanced analytic techniques to accomplish goals similar to Propensity Modeling.

  • Clustering is a form of unsupervised learning as the model is not based on a specific outcome or dependent variable, but simply groups records such as individuals.  The groups can result in customer segments that are ideal for certain products or marketing approaches.
  • Collaborative Filtering is based solely on the actions of groups of users as opposed to individual characteristics.  This is a common approach for recommendation systems based on actions such as purchases, product ratings, or web activity.
  • Decision Trees traverse a path of variables with branch “splits” based on the contributions of variables to ultimate outcomes.  This technique can be effective when a very small set of variables lead to outcomes influenced by downstream groups of variables.
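As one example, here is a quick decision tree sketch using the rpart package (assuming it is installed), reusing the simulated member_data from the earlier sketch:

library(rpart)

# Classification tree: which variable splits best separate renewals from non-renewals?
tree <- rpart(renewed ~ events_attended + years_member + committee_member + age,
              data = member_data, method = "class")
print(tree)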

You can also combine models, where the results of one model are the input to another, to create ensemble models.
The decile scores generally represent a range from “sure thing” to “lost cause”. You can use the different decile groups to guide approaches such as the effort to retain individuals, pricing strategies, and marketing messages.

Step 5. Determine Measurement Approach

The Lift of a Propensity Model represents the ratio of the rate achieved by applying the model to the rate achieved with “random” individuals. An ideal way to derive this measure is to maintain a control group for comparison to a similar group leveraging the Propensity Model. It can be a difficult decision to risk potential revenue on a control group, so a common approach is to simply compare before-and-after results.
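As a simple numeric illustration (both rates are hypothetical), lift is just a ratio:

targeted_rate <- 0.18   # response rate among individuals selected using the model
baseline_rate <- 0.06   # response rate among randomly selected individuals
lift <- targeted_rate / baseline_rate   # 3.0: model-targeted outreach performs 3x better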

Step 6. Consider How You Will Take Action

Before using any analytics model, it’s a good idea to consider how you can take action on the information. What decisions will you make as a result of the information? Similarly, how will you measure the results of the action and use it to inform your model?
For example, you can use a propensity model to reduce expenses. Targeting individuals differently based on their propensity to take action can optimize costs in different ways. Costs might be direct costs, such as actual print mailings or list rentals, or costs can be indirect, such as many non-personalized emails that contribute to information overload. You will want to establish a baseline and a goal for cost reduction to measure success of the model.

Step 7. Identify Your Tool

A range of different options are available to implement Propensity Modeling.

  • R Programming: A popular open-source statistical programming language with many mature packages to perform the techniques underlying Propensity Modeling.
  • Alteryx Software: A software platform offering pre-built tools for different modeling techniques and business scenarios.
  • Amazon Machine Learning: A cloud-based service, part of the comprehensive Amazon Web Services environment, that provides visual wizards to perform Propensity Modeling.

This may seem like a lot of steps, but once you have all of your comprehensive data easily accessible along with an available user-friendly tool, all you will need is your imagination to better understand your association’s customer journeys to make valuable data-guided decisions.

Using Propensity Modeling to Drive Revenue and Increase Engagement

At the 2016 ASAE Annual Meeting & Expo, Gwen Fortune-Blakely (Enterprise-wide Marketing Director) and Leslie Katz (Marketing Director) with the American Speech-Language-Hearing Association (ASHA) presented an amazing session on how ASHA is using propensity modeling to move people up the continuum of engagement to drive revenue and membership. Here’s a quick overview of what you need to know about propensity modeling and how it can help your association.

What is Propensity Modeling?

Propensity Models look at past behaviors in order to make predictions about your customers.
It is complementary to segmentation, but different. When segmenting, you cluster customers based on shared traits or behaviors. In marketing, propensity modeling goes a step beyond segmentation by focusing on likely behavior or action. Where segmentation provides insight into customer behavior, propensity modeling provides foresight. It allows you to target customers based on likely behavior as opposed to past behavior.
There are three main types of models: propensity to buy, propensity to churn, and propensity to unsubscribe.

  1. Propensity to Buy model looks at customers who are ready to purchase and those who need a little more incentive in order to complete the purchase.
  2. Propensity to Churn model looks for your at-risk customers.
  3. Propensity to Unsubscribe model looks for those customers who have been over-saturated by your marketing efforts and are on the verge of unsubscribing.

How can Propensity Modeling help your Association?

Think about an association that is about to send membership renewal notices. In the past, they sent out a packet of materials by mail to all current members. The packet includes an invoice and an expensive, professionally designed brochure that espouses the value of membership. The association’s retention rate is about 86%, which is respectable, but what would happen if the association applied a propensity model to better understand its customers?

  1. Increase Revenue. A propensity to churn model would “score” current members and could help identify those members who are at risk. The association staff can use that information to create custom campaigns for at-risk members. This might include in-person visits or phone calls and other personal touchpoints that would help secure renewal.
  2. Decrease Expenses. Propensity modeling also helps associations determine who to target and how, which can help reduce expenses. In this case, the staff might use the model to identify those members who don’t require a brochure and would simply renew after receiving an invoice. Similarly, a propensity model can identify those customers who need extra attention. It may not be cost-effective to have staff call every member, but what if staff knew which members would likely respond best to a personal phone call?

You can imagine other examples where a propensity model can help your association. For example, associations can use propensity modeling to facilitate market penetration by identifying customers most likely to buy. Or you can use propensity modeling to anticipate how much a customer is likely to spend. This can help determine pricing and product offers.
We often draw inspiration from the corporate world. MasterCard Advisors shared an interesting white paper on how you can use behavioral scoring to add precision to targeted marketing.

How do you get started?

5 step process
At Association Analytics, we follow a five-step process for data analytics, including propensity modeling.

  1. Scope. Define your business objectives and prioritize them. We recommend starting small and focusing on developing a model for one specific objective first. This will keep you from becoming overwhelmed. When you try to fix everything, you normally will end up fixing nothing.
  2. Collect. Spend time with your data before doing any modeling. Inventory your data sources and make sure you understand how data sources will help you answer your questions. Then, integrate data into a central location. We recommend a data warehouse and dimensional data models, but you can directly connect data sources to a business intelligence tool like Tableau.
  3. Clean. Don’t spend time on your model before making sure you have clean, complete data. Be sure to identify data anomalies and then correct any issues.
  4. Analyze. Visualize your data to understand likely behaviors in your model. Don’t get trapped by confirmation bias. Keep an open mind and be open to new patterns or information.
  5. Communicate and Take Action. Share the results of your propensity model with key stakeholders. Take action on the results to improve your marketing efforts and advance your association’s mission.

Propensity Modeling can be a valuable tool to better understand your customers and predict their behavior. It can help you improve their experience with your association.

Don’t Be Afraid to Ask ‘What If?’

The oft-cited Gartner image depicting an analytics maturity model shows different forms of analytics that associations can use to understand customers and make decisions with confidence. We’ve previously discussed how Predictive Analytics can provide valuable insight into your association business, but how can you move towards Prescriptive Analytics to answer ‘how can we make it happen?’ One way to get there is through “What-if” Analysis.

What is “What-if” Analysis?

“What-if” Analysis is the process of changing the scenarios or variables to see how those changes will affect the outcome. Associations might use this when they have limited data for making a decision or they’re considering launching a major new program. This type of analysis can help you make decisions with confidence.
With “What-if” Analysis, you begin with the end in mind while exploring a world of possibilities in your association’s data. It is a great way for your association to apply models developed for Predictive Analytics to move towards prescriptive analytics. “What-if Analysis” incorporates predictive and other models demonstrating data relationships and allows you to measure the potential impact of different strategies. Here are potential questions that What-if Analysis can help answer:

  • How will different levels of membership dues impact overall revenue?
  • Will changing the location of a conference increase attendance?
  • What marketing channel allocation will maximize conversion rates?
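As a toy illustration in R (the dues levels, member count, and assumed dues-to-renewal relationship are all hypothetical), a what-if sweep over membership dues might look like this:

dues_levels <- seq(200, 400, by = 25)

# Hypothetical assumption: each $1 of dues above $300 reduces the renewal rate by 0.05%
renewal_rate      <- 0.85 - 0.0005 * (dues_levels - 300)
projected_members <- 10000 * renewal_rate
projected_revenue <- projected_members * dues_levels

data.frame(dues = dues_levels, renewal_rate, projected_revenue)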

Potential Challenges of “What-if” Analysis

Implementing multiple models and making data assumptions present certain challenges, such as:

  • Data relationships might not be linear – customers eventually encounter diminishing returns as their activity increases
  • Other data relationships may emerge – increasing meeting attendance could decrease training course attendance
  • Price elasticity is not uniform at untested levels – the impact of price on customer decisions may not be easily estimated

Another key consideration is understanding when fundamental changes over time change previously discovered data relationships. Although the past is often the best predictor of the future, this is not always the case.  You can identify instances of data changing over time by consistently monitoring and exploring Descriptive Analytics based on historical data.
These challenges demonstrate why you need analysis beyond basic spreadsheet features.

Getting Started

You can perform basic “What-if” Analysis in Microsoft Excel. However, you can take your “What-if” Analysis even further with these tips:

  • Get a Data Visualization Tool. You will want the power of interactive data visualization using tools, such as Tableau, to rapidly adjust data inputs and understand resulting changes.
  • Validate Data. You need to continuously re-validate your models and measure their effectiveness over time. Be sure to include this when you are considering resources. Also, not all data is created equal. You can use sensitivity analysis to identify the impact of individual variables on different outcomes.
  • Encourage “What-if” Questions. “What-if” Analysis works best in an innovative culture where intellectual curiosity is encouraged. Reward staff for experimenting and questioning long-held beliefs.

You can move your association towards Prescriptive Analytics to truly have conversations with data and create the future. So, now think about your own “what-if” questions!

How to Harness the Power of Recommendation

Taking a customer-focused approach to data analytics helps provide optimal value, enhance engagement and understand the overall customer journey. Individuals’ actions provide valuable information that goes further than what is collected with surveys and online profiles. Additionally, actions uncover hidden patterns that can be used to build a recommendation system to guide customers toward other interests.
Here are the most common approaches to creating recommendation systems:

  • Collaborative filtering. This is based on data about similar users or similar items. It includes these techniques:
    • Item-based: Recommends items that are most similar to the user’s activity
    • User-based: Recommends items that are liked by similar users
  • Content-based filtering: Makes suggestions based on user profiles and similar item characteristics
  • Hybrid filtering: Combines different techniques

Recommendation system results are similar to those on sites that suggest products and people, like Amazon and LinkedIn. Collaborative filtering leads to more of a self-learning process, since it is entirely based on actual activity and not data provided by users. There are scenarios where the other approaches are more appropriate, which we’ll address soon.
Similarity between users or items is measured by “distance” calculations from those long-ago geometry and trigonometry classes. You can use the results with a visualization tool such as Tableau, creating a similarity matrix and quickly identifying relationships.
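Here is a minimal base R sketch of the idea (the purchase matrix and item names are made up): build a small users-by-items matrix, compute item-to-item cosine similarity, and use it to score items a user has not yet touched.

# Rows are users, columns are items; 1 = purchased/interacted, 0 = did not
purchases <- matrix(c(1, 0, 1, 0,
                      1, 1, 0, 0,
                      0, 1, 1, 1,
                      1, 0, 1, 1),
                    nrow = 4, byrow = TRUE,
                    dimnames = list(paste0("user", 1:4),
                                    c("annual_mtg", "webinar", "journal", "donation")))

# Item-to-item cosine similarity matrix
cosine_sim <- function(m) {
  cross <- crossprod(m)            # item x item co-occurrence counts
  norms <- sqrt(diag(cross))
  cross / outer(norms, norms)
}
item_similarity <- round(cosine_sim(purchases), 2)
item_similarity   # this matrix can be visualized in a tool such as Tableau

# Item-based recommendation scores for one user: similarity-weighted sum of their activity
scores <- item_similarity %*% purchases["user1", ]
scores[purchases["user1", ] == 0, , drop = FALSE]   # rank only items user1 has not touched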
It is sometimes helpful to group individuals and items into categories, which can be done by combining similarity scoring with data mining techniques like cluster analysis and decision trees.
Recommendation systems generally require data structured by columns instead of the row-based data that is best for interactive data discovery. Similar to text analytics, the items themselves — meetings, publications, donations, and content — become the columns, which can make for a large matrix. This column-based structure is what specialized R packages expect for the recommendation system features described in this post.
These algorithms generally need binary values, like a “yes” if someone purchased an item and a “no” if they did not. But if users can rate items on a scale of 1-5, what does a score of 3 mean? Normalizing scores based on individual and overall ratings is a good way to answer this question.
The data requirements are really not as onerous as they may sound. Once data is in the right format for the R analysis tools, your imagination can take over to drive actionable association analytics. Content-based filtering works well for new users, and a hybrid approach can help prevent a “filter bubble” where some people get a too-narrow set of interests from similar recommendations.
Data from meeting registrations, membership history, donations, publication purchases, content interaction, web navigation, survey responses and profile characteristics can be used to guide association customers. Additionally, recommendations can bring people with common interests together. This new insight can be used to enhance all customer interactions, ranging from email marketing to dynamic website presentation to event sessions.

A Beginner’s Guide to Analysis with R

Many associations want to do more advanced analytics projects using R — a programming language used for statistics — but are not sure how to start.
Before starting this kind of analysis, you need to define the goal. It is best to make this a S.M.A.R.T. goal, which means it is Specific, Measurable, Attainable, Relevant, and Time-Bound.
The detailed S.M.A.R.T. goal will become your dependent variable, which is what you are trying to measure in your analysis. Here’s what a transformed basic goal looks like:
Basic goal: Increase membership retention.
S.M.A.R.T. goal: Determine what program changes will increase next year’s membership retention for first-year members by 10 percent, compared to the two previous years.
After defining the dependent variable, you need to determine the independent variables you are measuring. These are the factors you think may be influencing whether you reach your detailed goal. In this type of analysis, you will have multiple independent variables. In fact, the more independent variables, the better.
As you analyze the data, you will be able to narrow down the independent variables to those that have the highest impact on your goal. For example:

  • Dependent variable
    • Renewal (Did the member renew or not?)
  • Independent variables
    • Participation in chapter events
    • Is the member at a university
    • Participation in committees
    • Gender
    • Age
    • Workplace type and size
    • Location
    • Number and type of events attended

After determining your goal and what may be influencing it, you need to figure out what pool of data you will examine to look for answers. For our example, we would need to start with first-year members who could renew.
However, you may need to filter your data more. For example, if you know that there was a huge change in the renewal process in middle of the year, you may want to remove people who joined before then. Or maybe you have free memberships that automatically renewed each year, so these people should not be included in your pool.
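As a rough sketch of where this heads (the column names and simulated values are hypothetical stand-ins for your membership data), the renewal analysis in R might begin like this:

set.seed(2024)

# Simulated stand-in for first-year member data
first_year <- data.frame(
  renewed          = rbinom(500, 1, 0.7),
  chapter_events   = rpois(500, 1),
  committee_member = rbinom(500, 1, 0.2),
  age              = round(rnorm(500, 40, 12)),
  events_attended  = rpois(500, 2),
  free_membership  = rbinom(500, 1, 0.05)
)

# Filter the pool as discussed above (for example, exclude free memberships)
pool <- subset(first_year, free_membership == 0)

# Logistic regression: which factors are associated with renewal?
fit <- glm(renewed ~ chapter_events + committee_member + age + events_attended,
           data = pool, family = binomial)
summary(fit)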
Later in the blog, we will talk about preparing your data and how to run and interpret descriptive statistics in R.