Data is an Asset

Data is one of the most important assets an association has because it defines each association’s uniqueness. You have data on members and prospects, their interests and purchases, your events, speakers, your content, social media, press, your staff, budget, strategic plan, and much more. But is your data accurate, and are you using it fully? Your data is an asset: it should be carefully cultivated, managed, and refined into information that allows you to better serve your community and remain viable in today’s competitive landscape.
Although data is one of the most important ‘raw materials’ of the modern world, most organizations do not treat it that way. In fact, according to The Data Warehousing Institute, poor data quality costs American businesses six hundred billion dollars every year. Data quality issues are also the cause of many failed IT projects.
Your data is talking to you, are you listening?
Associations have known for a long time that data is essential for market segmentation. However, there is so much more that can be done to harness data and use it as a strategic asset. Hidden within your data are stories about which members are at risk of not renewing, which prospects are likely to join, who might make a good speaker, where the best location for your next event is, how far you can raise rates without a decrease in member count, your best strategy for global expansion, and much more. We would be wise to listen to the stories our data is telling us, and to make sure the data on which they are based is accurate.
The insights you glean from your data are only as good as the underlying data itself: if the input is flawed, the output will be misleading, and the accuracy of any analysis is directly tied to the quality of the data behind it. I’m no longer surprised at the high number of duplicate records and the high percentage of incomplete, inaccurate and inconsistent data we find when we begin to analyze an association’s data. Because it is difficult to quantify the value of data the way we can measure cash, buildings and people, the activities designed to manage and protect data as an asset are often low on the priority list. That is, until a business intelligence or analytics project is undertaken. Then suddenly data quality management (DQM) takes center stage.
DQM is a Partnership between Business and IT
Business responsibilities include:

  1. Determining and defining the business rules that govern the data
  2. Verifying the data quality

IT responsibilities include:

  1. Establishing the architecture, technical facilities, systems, and databases
  2. Managing the processes that acquire, maintain, and disseminate the data
DQM is a Program, Not a Project
DQM is not really a “project” because it doesn’t “end”. Think of DQM as a program consisting of the following activities:

  • Committing to and managing change – are the benefits clear and is everyone on board?
  • Describing the data quality requirements – what is the acceptable level of quality?
  • Documenting the technical requirements – how exactly will we clean it up and keep it clean?
  • Testing, validating, refining – is our DQM program working?  How can we make it better?

DQM is Proactive and Reactive
The proactive aspects of DQM include: establishing the structure of a DQM team, identifying the standard operating procedures (SOPs) that support the business, defining “acceptable quality”, and implementing a technical environment.  The reactive aspects include identifying and addressing existing data quality issues. This includes missing, inaccurate or duplicate data. For example:

  1. Important data may be missing because you have never collected it. The information you have on a member may allow you to send a renewal notice, but it’s not enough to determine their level of interest in the new programs you are rolling out in the coming year. Or the information you have on your publications is enough to sell them online, but because the content is not tagged in a way that matches customer interest codes, you can’t serve up recommendations as part of the value your association offers. Associations need not only accurate data but also more complete data in order to fully understand the contextual landscape in which their members and prospects operate.
  2. When organizations merge, data from the two separate organizations needs to be combined, and it can often be very time-consuming to determine which aspects of the record to retain and which to retire. A determination must also be made about how to handle the historical financial transactions of the merged company.
  3. With visitors able to create their own records online, duplicate records are on the rise. The Jon Smith who purchased a publication online is really the same Jonathan Smith who attended the last three events and whose membership is in the grace period. But because he used a different email address, a duplicate record was created, and you missed the opportunity to remind him of the value of his membership when he registered for the event.

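The Jon Smith / Jonathan Smith scenario above can often be caught with fuzzy matching on names even when email addresses differ. This is a minimal sketch using Python’s standard difflib; the record fields, threshold, and sample emails are illustrative assumptions, not a prescription for any particular membership system:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1] describing how closely two strings match."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def likely_duplicates(rec_a: dict, rec_b: dict, threshold: float = 0.75) -> bool:
    """Flag two records as probable duplicates when names are similar,
    even if the email addresses differ (hypothetical field names)."""
    same_email = rec_a["email"].lower() == rec_b["email"].lower()
    name_score = similarity(rec_a["name"], rec_b["name"])
    return same_email or name_score >= threshold

# Illustrative records only.
a = {"name": "Jon Smith", "email": "jon@home.example"}
b = {"name": "Jonathan Smith", "email": "jsmith@work.example"}
print(likely_duplicates(a, b))  # → True
```

A real deduplication pass would compare many more fields (address, phone, organization) and route borderline scores to a human reviewer rather than merging automatically.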
Sometimes it’s not until a data quality issue surfaces in a publicly embarrassing way that an organization decides to truly tackle the problem: a board report contains erroneous data; an association cannot respond quickly to a request for statistics from an outside source, thereby losing the PR opportunity; the CEO cannot determine the primary contact for an organization in the system. It’s usually only after several situations like these that DQM receives serious attention, and unfortunately it often starts with a search for blame. Blame engenders fear, which threatens the success of a DQM initiative. It is essential that DQM programs begin with acceptance of the current state and commitment to a better future. A promise of “amnesty” with regard to what happened in the past can go a long way toward fostering buy-in for the program.
How do you Eat an Elephant?
The easiest way to start a DQM program is to start small. Identify an area that requires attention and focus first on that. In order to obtain support from key stakeholders, show how the program ties in with the association’s strategic plan.  After you identify the primary focus (for some it might not be company names, it might be demographics), set an initial timeframe (such as 3 months). Make the first project of the program manageable so you can obtain a relatively quick win and work the kinks out of your program.
Steps for your First DQM Initiative:

  1. Create a new position or assign primary DQM responsibilities to an individual
  2. Build a cross functional team and communicate the value of the program
  3. Decide how to measure quality (for example, the number of records reviewed and cleaned)
  4. Set a goal (a target number of records)
  5. Reference the goal in the performance evaluation of the individuals on the team
  6. Evaluate progress
  7. Revise

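Step 3 above, deciding how to measure quality, can start as simply as scoring record completeness. The sketch below assumes hypothetical member records with three required fields; the field names and the metric itself are illustrative only, and each association would substitute its own business rules:

```python
# Fields the (hypothetical) business rules say every member record must have.
REQUIRED_FIELDS = ["name", "email", "interest_code"]

def completeness(record: dict) -> float:
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f))
    return filled / len(REQUIRED_FIELDS)

def quality_report(records: list) -> dict:
    """Summarize completeness across a batch, supporting goal tracking (step 4)."""
    scores = [completeness(r) for r in records]
    return {
        "records_reviewed": len(records),
        "avg_completeness": round(sum(scores) / len(scores), 2),
        "records_needing_cleanup": sum(1 for s in scores if s < 1.0),
    }

# Illustrative data only.
records = [
    {"name": "Ana Ruiz", "email": "ana@example.org", "interest_code": "EDU"},
    {"name": "Lee Park", "email": "", "interest_code": "TECH"},
]
print(quality_report(records))
```

Tracking a number like average completeness month over month gives the cross-functional team in steps 2 and 6 something concrete to evaluate and revise against.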
Data is one of the most important assets an association has because it is unique in its detail and context and can be used strategically to ensure the association remains relevant and viable. When analyzed and leveraged properly, it can provide a competitive advantage in attracting and retaining members and in creating new sources of non-dues revenue. The underlying data must be accurate and complete, and a well-organized DQM program should be considered essential. Worldwide attention is being given to the importance of making “data-driven decisions”. Private industry and government have been using data to guide their decisions for many years, and now is the time for associations to recognize data as the valuable asset it is.

Agile Business Intelligence

Business Intelligence (BI) projects that incorporate key aspects of Agile processes dramatically increase the probability of a successful outcome. 
Why do business intelligence (BI) projects have a reputation for being slow, painful and ineffective, and why do they so often fail to deliver on the promise of improved data-driven decision-making? I believe part of the answer lies in the approach: the waterfall, linear, command-and-control model of the traditional System Development Life Cycle (SDLC) that is still pervasive in most technology projects today. There is a better way!
One of the core principles of Agile is embracing a realistic attitude about the unknown. It is interesting that at the beginning of a traditional technology project, when the least is actually known about an organization and its business rules, environment, variables, players, questions and requirements, the greatest amount of effort is made to lock in the scope, the cost and the schedule. It’s understandable that we want to limit risk, but in reality the pressure to protect ourselves can lead to excessive time spent on analysis, which often still results in unclear requirements, leading to mismatched expectations, change orders and cost overruns. This is a well-known phenomenon: at the very point where we have the least information, we try to create the most rigid terms. See the “Cone of Uncertainty” concept.
I think part of the reason for this paradox stems from an intrinsic lack of trust. Stephen M. R. Covey explains in his book, “The Speed of Trust”, that trust has two components: character and competence. In each situation in which you are asked to trust, you must have both. For example, if your best friend is a CPA, you might trust them as a friend, have complete confidence in their character, and trust them to handle your taxes, but you will not trust their competency to perform surgery on a family member. It’s the same in business. We might have confidence in a vendor’s base software product, but not trust their ability to understand our needs or implement the solution well. And trust has to be earned. Once an organization has trust, the speed at which change can be communicated and accommodated dramatically increases. This increase in speed translates into an improved outcome and a reduced cost, both by-products of the clear communication that is possible when trust is present.
What does all this have to do with business intelligence? I believe BI projects lend themselves to an agile, iterative approach, and this approach requires trust in order to work. I’m not a big fan of some of the Agile terminology – terms like “product backlog” (doesn’t “backlog” sound negative?) and “sprint” (is it a race?). But I do fully embrace the concept of working solutions vs. endless analysis, communication and collaboration instead of rigid process enforcement, and responding to change vs. “hold your feet to the fire” denials of needed change requests. In general, it’s the concept of “outcome” vs. “output” that is so inspiring to me about Agile. I’ve seen examples where a technology project met all of the formal “outputs” specified in the contract, yet sadly failed to deliver the most important thing: the “outcome” the organization was really trying to achieve. For example, the CRM implementation that was delivered on time and on budget but that none of the staff would use, or the BI project that resulted in dashboards that measured the wrong things. These are not successful projects, because the true desired outcome was not achieved.
How can Agile concepts be used in BI? 

  1. Identify an initial high-profile “win”: complete a small but important aspect of the project to inspire the team and generate enthusiasm, engagement and feedback
  2. Facilitate data discovery: create a hypothesis -> investigate and experiment -> learn -> ask new questions and repeat the process
  3. Value the learning and teamwork that are intrinsic to the process, which build trust and speed the ability to adapt to change

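The discovery cycle in item 2 can be pictured as a queue of hypotheses that experiments both answer and replenish. This toy sketch assumes a hypothetical run_experiment function standing in for real analysis against association data; the hypothesis strings are invented examples:

```python
def discovery_loop(initial_hypothesis, run_experiment, max_iterations=5):
    """Hypothesize -> investigate and experiment -> learn -> ask new questions -> repeat."""
    queue = [initial_hypothesis]
    findings = []
    for _ in range(max_iterations):
        if not queue:
            break
        hypothesis = queue.pop(0)
        supported, new_questions = run_experiment(hypothesis)  # investigate and experiment
        findings.append((hypothesis, supported))               # learn
        queue.extend(new_questions)                            # ask new questions
    return findings

def run_experiment(hypothesis):
    """Hypothetical stand-in for a real analysis step."""
    if hypothesis == "renewals drop after year 3":
        return True, ["does event attendance predict renewal?"]
    return False, []

results = discovery_loop("renewals drop after year 3", run_experiment)
print(results)
```

The point of the structure is that each experiment is small and feeds the next question, which keeps the team learning in short iterations rather than waiting on one long analysis phase.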
In a future post I’ll debunk some of the common myths that surround the topic of agile processes.