Many associations want to take on more advanced analytics projects using R, a programming language for statistics, but are not sure how to start.
Before starting this kind of analysis, you need to define the goal. It is best to make this a S.M.A.R.T. goal, which means it is Specific, Measurable, Attainable, Relevant, and Time-Bound.
The detailed S.M.A.R.T. goal determines your dependent variable, which is the outcome you are trying to measure in your analysis. Here’s how a basic goal becomes a S.M.A.R.T. goal:
Basic goal: Increase membership retention.
S.M.A.R.T. goal: Determine what program changes will increase next year’s membership retention for first-year members by 10 percent, compared to the two previous years.
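To make a target like this concrete, it can help to compute the baseline first. The R sketch below uses made-up counts for the two previous years (placeholders, not real data). It also shows two possible readings of "by 10 percent": a relative increase over the baseline, or an increase of 10 percentage points. Which one applies is an assumption you would need to settle for your own goal.

```r
# Hypothetical counts of first-year members for the two previous years.
# These numbers are placeholders for illustration only.
eligible <- c(year1 = 400, year2 = 450)  # first-year members who could renew
renewed  <- c(year1 = 240, year2 = 279)  # of those, how many renewed

# Baseline retention rate, pooled across both years.
baseline_rate <- sum(renewed) / sum(eligible)

# Reading 1: "10 percent" as a relative increase over the baseline.
target_relative <- baseline_rate * 1.10

# Reading 2: "10 percent" as an increase of 10 percentage points.
target_points <- baseline_rate + 0.10

# Show all three rates side by side, rounded for readability.
round(c(baseline        = baseline_rate,
        target_relative = target_relative,
        target_points   = target_points), 3)
```

Writing down which reading you mean keeps the goal measurable once next year's renewal numbers come in.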
After defining the dependent variable, you need to determine the independent variables you will measure. These are the factors you think may influence whether you reach your detailed goal. This type of analysis involves multiple independent variables, and at this early stage it is better to start with more candidates than fewer.
As you analyze the data, you will be able to narrow down the independent variables to those that have the highest impact on your goal. For example:
- Dependent variable
  - Renewal (Did the member renew or not?)
- Independent variables
  - Participation in chapter events
  - Is the member at a university
  - Participation in committees
  - Gender
  - Age
  - Workplace type and size
  - Location
  - Number and type of events attended
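One way to picture these variables is as columns in a single table with one row per member. The R sketch below builds a tiny made-up data frame. The column names and values are assumptions for illustration only and would be replaced by fields from your own membership database.

```r
# Hypothetical member-level data: one row per first-year member.
# Column names and values are invented for illustration.
members <- data.frame(
  member_id      = 1:6,
  renewed        = c(TRUE, FALSE, TRUE, TRUE, FALSE, TRUE),    # dependent variable
  chapter_events = c(3, 0, 1, 5, 0, 2),                        # events attended
  on_committee   = c(TRUE, FALSE, FALSE, TRUE, FALSE, FALSE),
  at_university  = c(FALSE, TRUE, FALSE, FALSE, TRUE, FALSE),
  age            = c(34, 23, 45, 51, 22, 38),
  workplace_type = factor(c("Corporate", "University", "Nonprofit",
                            "Corporate", "University", "Government"))
)

# Inspect the structure: column types and the first few values.
str(members)

# Overall renewal rate (the mean of a logical vector is the share of TRUE values).
overall_rate <- mean(members$renewed)

# Renewal rate split by one candidate independent variable.
rate_by_committee <- tapply(members$renewed, members$on_committee, mean)
rate_by_committee
```

Comparing the renewal rate across the values of each independent variable, as in the last step, is a simple first look at which factors might matter. With only six invented rows, the numbers here mean nothing on their own.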
After determining your goal and what may be influencing it, you need to figure out what pool of data you will examine to look for answers. For our example, we would need to start with first-year members who could renew.
However, you may need to filter your data further. For example, if you know that there was a major change in the renewal process in the middle of the year, you may want to remove people who joined before then. Or maybe you have free memberships that renew automatically each year, in which case those members should not be included in your pool.
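The filtering described above can be sketched in R with the base `subset()` function. The data frame, the column names, and the July 1 cutoff date are all hypothetical, chosen only to illustrate the two exclusions mentioned.

```r
# Hypothetical first-year members who were eligible to renew.
# All names, dates, and values are invented for illustration.
members <- data.frame(
  member_id       = 1:6,
  join_date       = as.Date(c("2023-02-10", "2023-05-21", "2023-07-15",
                              "2023-08-03", "2023-09-30", "2023-11-12")),
  membership_type = c("Paid", "Free", "Paid", "Paid", "Free", "Paid"),
  renewed         = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Assumed date on which the renewal process changed.
process_change <- as.Date("2023-07-01")

# Keep members who joined on or after the change and who do not hold
# a free membership that renews automatically.
analysis_pool <- subset(
  members,
  join_date >= process_change & membership_type != "Free"
)

# Check how many members remain before going further.
nrow(analysis_pool)
```

Checking the row count after each filter is a quick way to confirm that you still have enough members left to analyze.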
Later in the blog, we will talk about preparing your data and how to run and interpret descriptive statistics in R.