Archive for big data

An Approach to Analytics both Hamilton and Jefferson Could Embrace

Happy 4th of July!  What a great time to think about data independence, democratization, and governance for your association.  In this post we’ll talk about the balance between the central management of data by IT and data directly managed by association staff.
Leading analytics tools provide great capabilities to empower people to make data-guided decisions. The ability to analyze diverse data from a breadth of sources in a usable way for association staff is a key feature of these tools. Examples include Power BI Content Packs and Tableau Data Connectors. These range from pre-built data sources for specific applications such as Dynamics, Google Analytics, and Salesforce, to less common “NoSQL” sources such as JSON, MarkLogic, and Hadoop. These tools rapidly make data from specific applications available in formats for easy reporting, but they can still lead to data silos. Tools such as Power BI and Tableau provide dashboard and drill-through capabilities to help bring these different sources together.

Downstream Data Integration

This method of downstream integration is commonly described as “data blending” or “late binding”. One application of this approach is a data lake that brings all data into the environment but integrates specific parts of the data only when needed for analysis. This approach does present some risks, as the external data sources are not pre-processed to enhance data quality and ensure conformance. In addition, business staff can misinterpret data relationships, which can lead to incorrect decisions. This makes formal training, adoption, and governance processes even more vital to analytics success.
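To make the late-binding idea concrete, here is a minimal sketch in Python with pandas. The file names and columns are hypothetical; the point is that the two sources are normalized and joined only at analysis time, with no upstream conformance step, so checking the match rate becomes the analyst’s responsibility.

```python
import pandas as pd

# Hypothetical extracts: a conformed warehouse export of members and a raw
# web-analytics CSV that has not been cleaned or conformed upstream.
members = pd.read_csv("members.csv")        # assumed columns: member_id, email
web_visits = pd.read_csv("web_visits.csv")  # assumed columns: email, page, visit_date

# Late binding: normalize and join only at analysis time, not in an ETL layer.
members["email"] = members["email"].str.strip().str.lower()
web_visits["email"] = web_visits["email"].str.strip().str.lower()
blended = members.merge(web_visits, on="email", how="left")

# Because the sources were never conformed, unmatched rows are a real risk;
# checking the match rate guards against the misread relationships noted above.
match_rate = blended["page"].notna().mean()
print(f"Match rate: {match_rate:.0%}")
```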

What about the Data Warehouse?

When should you consider using content packs and connectors, and how does this relate to a data warehouse and your association? The key is understanding that they do not replace a data warehouse; they are actually an extension of it. Let’s look at a few scenarios and approaches.

  • Key factors to consider when combining data are how closely the data is linked to common data points from other sources, the complexity of the data, and the uniqueness of the audience. For example, people throughout the association want profitability measures based on detailed cost data from Dynamics, while the finance group has reporting needs unique to their group. An optimal approach is to bring cost data into the data warehouse while linking content pack data by GL codes and dates; a sketch of this linking appears after this list. This enables finance staff to visualize data from multiple sources while drilling into detail as part of their analysis.
  • Another consideration is the timeliness of data needed to guide decisions. While the data warehouse may be refreshed daily or every few hours, staff may need the immediate, real-time ability to review data such as meeting registrations, this morning’s email campaign, or why web content has just gone viral. This is like the traditional “HOLAP”, or Hybrid Online Analytical Processing, approach, where data is pre-aggregated while direct links to detailed source data are preserved. It is important to note that analytical reporting should not directly access source systems on a regular basis, but it can be used for scenarios such as reviewing exceptions and individual transaction data.
  • In some cases, you might not be sure how business staff will use data and it is worthwhile for them to explore data prior to integration into the data warehouse. For example, marketing staff might want to compare basic web analytics measures from Google Analytics against other data sources over time. In the meantime, plans can be made to expand web analytics to capture individual engagement, align the information architecture with a taxonomy, and track external clicks through a sales funnel. As these features are completed, you can use a phased approach to better align web analytics and promote Google Analytics data into the data warehouse. This also helps with adoption as it rapidly provides business staff with priority data while introducing data discovery and visualizations based on actual association data.
  • Another important factor is preparing for advanced analytics. Most of what we’ve described involves interactive data discovery using visualizations. In the case of advanced analytics, the data must be in a tightly integrated environment such as a data warehouse to build predictive models and generate results to drive action.
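As a rough illustration of the first scenario above, the sketch below links a hypothetical content-pack extract to warehouse cost data on the shared keys, GL code and date. All frame names, column names, and values are invented for illustration.

```python
import pandas as pd

# Stand-ins for warehouse cost facts and a Dynamics content-pack extract.
warehouse_costs = pd.DataFrame({
    "gl_code": ["4100", "4100", "5200"],
    "date": pd.to_datetime(["2016-01-31", "2016-02-29", "2016-01-31"]),
    "cost": [12500.0, 9800.0, 4300.0],
})
content_pack = pd.DataFrame({
    "gl_code": ["4100", "5200"],
    "date": pd.to_datetime(["2016-01-31", "2016-01-31"]),
    "budget": [11000.0, 5000.0],
})

# Link content-pack data to the warehouse by the shared conformed keys
# (GL code and date) rather than copying every detail into the warehouse.
linked = warehouse_costs.merge(content_pack, on=["gl_code", "date"], how="left")
linked["variance"] = linked["cost"] - linked["budget"]
print(linked)
```

Linking on conformed keys like these lets finance staff blend budget and cost views without loading every content-pack detail into the warehouse.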

It’s not about the Tools

The common element is that using data from sources internal and external to your association requires accurate relationships between these sources, a common understanding of data, and confidence in data to drive data-guided decisions. This makes developing an analytics strategy with processes and governance even more important. As we’ve said on many occasions: it’s not about the tools, it’s the people.
Your association’s approach to data democratization doesn’t need to rise to the level of a constitutional convention or lead to overly passionate disputes.

Association Analytics: Begin with the End in Mind

Often association leaders ask me, “Where is the best place for us to begin our data analytics initiative?”  I like this question and am reminded of Stephen Covey’s advice that before we undertake a new initiative, we should “begin with the end in mind”.  When it comes to data analytics for associations, this means starting with your strategic plan.
One association’s strategic plan, for example, specified that they would grow meeting attendance by 10% in 3 years.  This was their largest source of revenue.
In the past the association made decisions by looking at historical trends and was unable to confidently estimate expected attendance, revenue, venue amenities needed, onsite staffing levels and more.  In fact, the more events they held, the greater the risk of an incorrect estimate.
Yet because of the strategic initiative to increase attendance, the first thing they tried was to increase the number of events and market them by email campaigns and direct mail.
The visualization below is a simple bar chart showing the number of events by year. The higher the bar, the more events. The color shading of the bars indicates the number of registrants: the darker the green, the more registrants. We can quickly see that 2010 had the most events, but not the most registrants.  In other words, although the association increased the number of events, this did not increase the number of attendees. In fact, holding more events actually caused a decrease in meeting profitability as a whole, since there were more expenses associated with fewer registrants.
[Figure: bar chart of the number of events by year, with bars shaded by registrant count]
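For readers who want to build this kind of chart themselves, here is a minimal matplotlib sketch. The yearly counts are invented; only the technique mirrors the chart described above, with bar height showing the number of events and green shading showing registrants.

```python
import matplotlib.pyplot as plt
from matplotlib import cm

# Invented numbers for illustration only.
years = [2008, 2009, 2010, 2011, 2012]
events = [18, 22, 30, 24, 16]
registrants = [4200, 4500, 4100, 4800, 5600]

# Bar height = number of events; darker green = more registrants.
norm = plt.Normalize(min(registrants), max(registrants))
colors = cm.Greens(norm(registrants))

fig, ax = plt.subplots()
ax.bar(years, events, color=colors, edgecolor="gray")
ax.set_xlabel("Year")
ax.set_ylabel("Number of events")
ax.set_title("Events per year, shaded by registrant count")
plt.show()
```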
By combining smaller, similar events into larger meetings, and thereby reducing the total number of events, the organization was able to dramatically exceed the target set by the board.  In other words, by holding fewer events, they had more participants.

What is the Cost of Not using Data?

When making business decisions it’s important to weigh the costs of:

  1. Bad decisions
  2. Decision delay
  3. Lost opportunities
  4. Damage to image/reputation

Often these costs are much greater than the costs associated with analyzing data. Taking a “decision tree” approach can be a logical way to evaluate where to start:

  • Are there areas in your strategic plan where data can answer questions, provide insight and direction?
  • Which areas do your customers care about? Will it make a difference to them by adding more value and improving their experience, and will it advance your mission?
  • Do you have the data – or can you get it?
  • Will you be able to take action on the results of the analysis?


The Value of Data Discovery for Associations

The Magic Quadrant

In February 2013, Gartner Inc. released an important report entitled Magic Quadrant for Business Intelligence and Analytics Platforms, which details the current state of the business intelligence (BI) market and evaluates the strengths and weaknesses of several of the top vendors. It’s interesting to note that in this report, Gartner emphasized the emergence of data discovery into the “mainstream business intelligence and analytics architecture”, something we have been highlighting at DSK Solutions for years.
What is data discovery? Associations and nonprofits are sitting on large quantities of data and don’t always realize the value of this powerful asset. The old days of spray and pray are gone. Remember direct mailing blasts? How ineffective! Associations were shooting in the dark and wasting resources that could have been allocated to better serve members. Unfortunately, some associations still rely on this marketing approach, but there is a better way: segmented target marketing based on data.
All of your data – including CRM or AMS (customer data), general ledger and budget (financial data), and Google Analytics (web data) – can be pooled together to illuminate your member strategy. Think of each data source as a small flashlight that reveals a little bit of the path in front of you. When your data sources are pulled together, the path becomes much clearer. When analyzing your data with data discovery, it becomes possible to discover things you did not know before.

Necessary Steps

Clients frequently come to us seeking guidance on how to begin the task of leveraging their data to inform better decision making. Before you can embark on data discovery, you have to do two things:

  • Ask the right questions.  What is meaningful to your organization? What are you trying to find out about your members, prospects, products, services and profit?
  • Clean your data. If your data is filled with duplicates, inaccuracies, inconsistencies, and other forms of noise, your analysis will be flawed. Remember: Garbage in, garbage out.  Quality data as an input allows for accurate analysis as an output, which results in the improved ability to make good decisions. (A quick profiling sketch follows this list.)
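As a starting point for the cleanup step, here is a quick data-quality profile in Python with pandas. The file name and columns are hypothetical; the idea is simply to count the duplicates and gaps before any analysis begins.

```python
import pandas as pd

# A quick data-quality profile, assuming a hypothetical member extract.
members = pd.read_csv("members.csv")  # assumed columns: id, name, email, join_date

report = {
    "rows": len(members),
    # Normalize case before counting duplicate email addresses.
    "duplicate_emails": int(members["email"].str.lower().duplicated().sum()),
    # Missing values per column reveal fields you never really collected.
    "missing_by_column": members.isna().sum().to_dict(),
}
print(report)
```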

These two steps form the foundation of the data discovery process. Almost always, the answers you derive from your data will lead to more questions. It’s okay to ask why. In fact, you should be asking why! Start by asking questions like these:

  • How dependent is your association on dues revenue?
  • What is the price elasticity of membership (Full Rate vs. Discounted Rate)? See the sketch after this list.
  • Which members are at risk for not renewing?
  • How far (in miles) will registrants travel to attend a meeting?
  • Which products or services have the highest profit?
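As a back-of-the-envelope illustration of the elasticity question above, here is a small Python sketch. The dues rates and member counts are invented; the formula is just the percent change in membership divided by the percent change in price.

```python
# Price elasticity of membership: percent change in membership count
# divided by percent change in dues. All numbers below are made up.
full_rate, discounted_rate = 300.0, 240.0         # annual dues in dollars
members_at_full, members_at_discount = 900, 1050  # hypothetical member counts

pct_change_qty = (members_at_discount - members_at_full) / members_at_full
pct_change_price = (discounted_rate - full_rate) / full_rate

elasticity = pct_change_qty / pct_change_price
print(f"Elasticity: {elasticity:.2f}")  # about -0.83; magnitude below 1
# suggests demand is relatively inelastic, so discounting alone may cut revenue.
```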

Then start asking “why”.  Remember the idea of the Ishikawa (or fishbone) diagram?  It’s an easy and useful way to begin thinking in terms of cause and effect – you ask “why” five times, until you arrive at the root cause of an effect.  Now, with interactive data discovery, you ask these questions directly by interacting with the data in a visual way!  At DSK we describe it as “having a conversation with your data”.  For example, a certification department of an association wanted to look at its pass/fail ratio for an exam.  Using data discovery, staff discovered that many more college-aged people were registering, and doing poorly, than in the past.  In the process of asking “why” the failure rate was increasing, they discovered an opportunity not only to publish a new study guide, but also to tap an entirely new source of prospective members, and they created a new membership type to serve the college market.
Data discovery is an iterative process in which you ask questions of your data in an interactive way. Drilling down both vertically and horizontally into your data allows you not only to answer the questions you know you have, but also to shed light on the unknown-unknowns, enabling associations to make better decisions.


The Analytics Convergence

Data-guided decisions permeate our everyday lives as individuals, but how can you harness that power for your association and your members? The field of data analytics and big data is exploding with opportunity. Businesses are encroaching on areas that used to be the private domain of associations – content, networking, events, etc. – because they are employing the power of analytics. But the processes and tools needed to analyze and interpret data are now much less expensive and easier to use than before. Are you ready to have a conversation with your data? Each level of business intelligence has a unique language to make your data speak!
The following presentation was created by Debbie King, CEO of DSK Solutions, Inc., and David DeLorenzo, CIO of the National League of Cities. This information was first featured at the 2013 ASAE Finance, HR, and Business Operations Conference.

Business Intelligence Trends 2013

Although the term “Business Intelligence” is so overused that it is almost meaningless, what is important to know is that advances in data science and analytics are affecting everyone – every day.  Tableau outlines important trends in this field for 2013. What do these trends mean to your association?

Data is an Asset

Data is one of the most important assets an association has because it defines each association’s uniqueness. You have data on members and prospects, their interests and purchases, your events, speakers, your content, social media, press, your staff, budget, strategic plan, and much more. But is your data accurate and are you using it fully? Your data is an asset and should be carefully cultivated, managed and refined into information which will allow you to better serve your community and ensure you remain viable in today’s competitive landscape.
Although data is one of the most important ‘raw materials’ of the modern world, most organizations do not treat it that way. In fact, according to The Data Warehousing Institute, the cost of poor data quality in America is six hundred billion dollars every year. Data quality issues are also the cause of many failed IT projects.
Your data is talking to you – are you listening?
Associations have known for a long time that data is essential for market segmentation. However, there is so much more that can be done to harness data and use it as a strategic asset. Hidden within your data are stories about which members are at risk of not renewing, which prospects are likely to join, who might make a good speaker, where the best location for your next event is, the level to which you can raise rates without a decrease in member count, your best strategy for global expansion, and much more. We would be wise to listen to the stories our data is telling us, and to make sure the data on which they are based is accurate.
The insights you glean from your data are only as good as the underlying data itself. It’s obvious that if the input is flawed, the output will be misleading. When it comes to data, there is a direct correlation between the quality of the data and the accuracy of the analysis. I’m no longer surprised at the high number of duplicate records, and the high percentage of incomplete, inaccurate and inconsistent data we find when we begin to analyze an association’s data. Because it is difficult to quantify the value of data in the same way we can measure cash, buildings and people, the activities designed to manage and protect data as an asset are often low on the priority list. That is, until a business intelligence or analytics project is undertaken. Then suddenly data quality management (DQM) takes center stage.
DQM is a Partnership between Business and IT
Business responsibilities include: 1) determining and defining the business rules that govern the data, and 2) verifying data quality.  IT responsibilities include: 1) establishing the architecture, technical facilities, systems, and databases, and 2) managing the processes that acquire, maintain, and disseminate data.
DQM is a Program, Not a Project
DQM is not really a “project” because it doesn’t “end”. Think of DQM as a program consisting of the following activities:

  • Committing to and managing change – are the benefits clear and is everyone on board?
  • Describing the data quality requirements – what is the acceptable level of quality?
  • Documenting the technical requirements – how exactly will we clean it up and keep it clean?
  • Testing, validating, refining – is our DQM program working?  How can we make it better?

DQM is Proactive and Reactive
The proactive aspects of DQM include: establishing the structure of a DQM team, identifying the standard operating procedures (SOPs) that support the business, defining “acceptable quality”, and implementing a technical environment.  The reactive aspects include identifying and addressing existing data quality issues. This includes missing, inaccurate or duplicate data. For example:

  1. Important data may be missing because you have never collected it. The information you have on a member may allow you to send a renewal, but it’s not enough for you to determine their level of interest in the new programs you are rolling out in the coming year. Or the information you have on your publications is enough to be able to sell them online, but because the content is not tagged in a way that matches customer interest codes, you can’t serve up recommendations as part of the value your association offers. Associations must have not only accurate data but also more data in order to fully understand the contextual landscape in which their members and prospects operate.
  2. When organizations merge, data from the two separate organizations needs to be combined, and it can often be very time-consuming to determine which aspects of the record to retain and which to retire. A determination must also be made about how to handle the historical financial transactions of the merged company.
  3. With the ability for visitors to create their own records online, duplicate records are on the rise. The Jon Smith who purchased a publication online is really the same Jonathan Smith who attended the last three events and whose membership is in the grace period. Because he used a different email address, a duplicate record was created, and you missed the opportunity to remind him of the value of his membership when he registered for the event. A small duplicate-screening sketch follows this list.
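Here is a minimal duplicate-screening sketch in Python, echoing the Jon Smith example. The contact records are hypothetical and the similarity threshold is arbitrary; a real deduplication pass would add more signals (postal address, phone, purchase history) and route matches to a human for review.

```python
from difflib import SequenceMatcher

import pandas as pd

# Hypothetical contact records, echoing the Jon Smith example above.
contacts = pd.DataFrame({
    "id": [101, 102, 103],
    "name": ["Jon Smith", "Jonathan Smith", "Mary Jones"],
    "email": ["jon@home.example", "jsmith@work.example", "mary@example.org"],
})

# Step 1: exact matches on normalized email catch the easy duplicates.
contacts["email_key"] = contacts["email"].str.strip().str.lower()
exact_dupes = contacts[contacts.duplicated("email_key", keep=False)]

# Step 2: name similarity flags likely duplicates hiding behind different
# addresses; the 0.7 threshold is arbitrary and matches need human review.
def similar(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

rows = contacts.to_dict("records")
candidates = [
    (r1["id"], r2["id"], round(similar(r1["name"], r2["name"]), 2))
    for i, r1 in enumerate(rows)
    for r2 in rows[i + 1:]
]
print(exact_dupes)
print([c for c in candidates if c[2] >= 0.7])  # flags Jon vs. Jonathan Smith
```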

Sometimes it’s not until a data quality issue surfaces in a publicly embarrassing way that an organization decides to truly tackle the problem: a board report contains erroneous data; an association cannot reply quickly to a request for statistics from an outside source, thereby losing the PR opportunity; the CEO cannot determine the primary contact for an organization in the system. It’s usually only after several situations like these that DQM receives serious attention, and it is unfortunate that it often starts with a search for blame. This engenders fear, which threatens the success of a DQM initiative. It is essential that DQM programs begin with acceptance of the current state and commitment to a better future. A promise of “amnesty” with regard to what happened in the past can go a long way toward fostering buy-in for the program.
How do you Eat an Elephant?
The easiest way to start a DQM program is to start small. Identify an area that requires attention and focus first on that. In order to obtain support from key stakeholders, show how the program ties in with the association’s strategic plan.  After you identify the primary focus (for some it might be company names; for others, demographics), set an initial timeframe (such as 3 months). Make the first project of the program manageable so you can obtain a relatively quick win and work the kinks out of your program.
Steps for your First DQM Initiative:

  1. Create a new position or assign primary DQM responsibilities to an individual
  2. Build a cross functional team and communicate the value of the program
  3. Decide how to measure quality (for example, the number of records reviewed and cleaned)
  4. Set a goal (a target number of records)
  5. Reference the goal in the performance evaluation of the individuals on the team
  6. Evaluate progress
  7. Revise

Data is one of the most important assets an association has because it is unique in its detail and context and can be used strategically to ensure the association remains relevant and viable. When analyzed and leveraged properly, it can provide a competitive advantage in attracting and retaining members and in creating new sources of non-dues revenue. It is important that the underlying data be accurate and complete, and a well-organized DQM program should be considered essential. Worldwide attention is being given to the importance of making “data-driven decisions”. Private industry and government have been using data to guide their decisions for many years, and now is the time for associations to recognize data as the valuable asset it is.

Agile Business Intelligence

Business Intelligence (BI) projects that incorporate key aspects of Agile processes dramatically increase the probability of a successful outcome. 
I wonder why business intelligence (BI) projects have a reputation for being slow, painful, and ineffective – and why they often fail to deliver on the promise to improve data-driven decision-making.  I believe part of the answer is in the approach: the waterfall, linear, command-and-control model of the traditional System Development Life Cycle (SDLC) that is still pervasive in most technology projects today.  There is a better way!
One of the core principles of Agile is embracing a realistic attitude about the unknown.  It is interesting that at the beginning of a traditional technology project, when the least is actually known about an organization and its business rules, environment, variables, players, questions, and requirements, the greatest amount of effort is made to lock in the scope, the cost, and the schedule.  It’s understandable that we want to limit risk, but in reality the pressure to protect ourselves can lead to excessive time spent on analysis, which often still results in unclear requirements, leading to mismatched expectations, change orders, and cost overruns.  This is a well-known phenomenon – at the very point where we have the least information, we try to create the most rigid terms.  See the “Cone of Uncertainty” concept.
I think part of the reason for this paradox stems from an intrinsic lack of trust.  Stephen M. R. Covey explains in his book, “The Speed of Trust”, that trust has two components: character and competence.  In each situation in which you are asked to trust, you must have both.  For example, if your best friend is a CPA, you might trust them as a friend, have complete confidence in their character, and trust them to handle your taxes, but you will not trust their competence to perform surgery on a family member.  It’s the same in business.  We might have confidence in a vendor’s base software product, but not trust their ability to understand our needs or implement the solution well.  And trust has to be earned.  Once an organization has trust, the speed at which change can be communicated and accommodated dramatically increases.  And this increase in speed translates into an improved outcome and a reduced cost, both of which are by-products of the clear communication that is possible when trust is present.
What does all this have to do with business intelligence?  I believe BI projects lend themselves to an agile, iterative approach, and this approach requires trust in order to work.  I’m not a big fan of some of the Agile terminology – terms like “product backlog” (doesn’t “backlog” sound negative?) and “sprint” (is it a race?).  But I do fully embrace the concept of working solutions vs. endless analysis, communication and collaboration instead of rigid process enforcement, and responding to change vs. “hold your feet to the fire” denials of needed change requests.  In general, it’s the concept of “outcome” vs. “output” that is so inspiring to me about Agile.  I’ve seen examples where a technology project met all of the formal “outputs” specified in the contract, yet sadly failed to deliver the most important thing – the “outcome” that the organization was really trying to achieve.  For example, the CRM implementation that was delivered on time and on budget but that none of the staff would use, or the BI project that resulted in dashboards that measured the wrong things.  These are not examples of successful projects because the true desired outcome was not achieved.
How can Agile concepts be used in BI? 

  1. Identify an initial high-profile “win” and complete a small but important aspect of the project to inspire the team and generate enthusiasm, engagement, and feedback
  2. Facilitate data discovery: create a hypothesis -> investigate and experiment -> learn -> ask new questions and repeat the process
  3. Value the learning and the teamwork that is intrinsic to the process and which builds trust and speeds the ability to adapt to change

In a future post I’ll debunk some of the common myths that surround the topic of agile processes.