When starting a data warehouse or other business intelligence (BI) initiative, several aspects need extra attention because BI work differs inherently from other types of implementations. For instance, unlike enterprise-wide software implementations run by huge teams, many BI projects are executed by smaller teams that receive continuous business feedback. The work is typically very creative, since it isn’t always anchored to one application, yet it must maintain high data integrity. As a result, some of the aspects I outline below don’t typically get the emphasis they deserve. With this iterative, creative, and data management-oriented approach in mind, consider these often under-appreciated factors of BI success.
When initiating a BI project, a realistic plan is needed. Don’t worry about not having a 100% solution in mind; simply begin by “road-mapping” some intermediate goals together with the business users. The roadmap should not be so detailed that it cannot be adapted to the needs of the business over time. Remember, project plans contain the tactical detail, which should trace back to the roadmap’s high-level strategic goals. Consider the following when building a solid BI strategy:
Remember the line often attributed to English writer, mathematician, and logician Lewis Carroll: “If you don’t know where you are going, any road will take you there.”
BI efforts should aim to build a “single version of the truth”: a non-redundant and consistent recording of all of a corporation’s business data. This is nearly impossible to achieve in full, so the goal is to get as close as possible. The key question is whether you can construct a high-integrity relational data model from the available data within the BI project budget and timeframe. Data sources, particularly much older ones, often suffer from data validation gaps and sometimes from unauthorized data definition changes made over time. In addition, many companies lack logical data consolidation because mergers, acquisitions, or other situations forced systems to be hurriedly band-aided together. Data profiling reveals the quality of data from these sources. It involves analyzing the data needed, collecting relevant statistics, and discovering problematic anomalies such as start dates that fall after their associated end dates, data type inconsistencies, duplication errors, or other defects. Data analysis extends throughout any BI project. However, data profiling is best started after the target data sources are isolated and PRIOR to designing or building anything. Finding out too late about major data quality problems can adversely affect ETL and data warehouse design, even to the point of causing project cancellation. Overall, robust upfront data profiling:
Automated data profiling tools can speed this process along significantly and should be seriously considered, especially for larger projects. The bottom line is this: The longer the wait to get into the nitty-gritty of data anomalies, the higher the risk of not meeting project goals and endangering the project altogether.
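As a rough illustration of the kinds of checks described above, here is a minimal Python sketch that counts three of the anomalies mentioned: start dates after end dates, data type inconsistencies, and duplicates. The sample rows, the field names (`id`, `start`, `end`, `amount`), and the `profile` function are all hypothetical and stand in for whatever the real source tables contain. A dedicated profiling tool would cover far more than this.

```python
from collections import Counter
from datetime import date

# Hypothetical sample rows; the field names are illustrative assumptions.
rows = [
    {"id": 1, "start": date(2020, 1, 1), "end": date(2020, 6, 30), "amount": 100.0},
    {"id": 2, "start": date(2021, 5, 1), "end": date(2021, 1, 1), "amount": 250.0},
    {"id": 2, "start": date(2021, 5, 1), "end": date(2021, 1, 1), "amount": 250.0},
    {"id": 3, "start": date(2022, 3, 1), "end": date(2022, 9, 1), "amount": "n/a"},
]


def profile(rows):
    """Collect simple statistics about three common data anomalies."""
    # Rows whose start date falls after the associated end date.
    start_after_end = [r["id"] for r in rows if r["start"] > r["end"]]
    # Rows whose amount is not numeric (a data type inconsistency).
    non_numeric = [r["id"] for r in rows if not isinstance(r["amount"], (int, float))]
    # Identifiers that appear more than once (possible duplication errors).
    id_counts = Counter(r["id"] for r in rows)
    duplicate_ids = sorted(key for key, count in id_counts.items() if count > 1)
    return {
        "row_count": len(rows),
        "start_after_end": start_after_end,
        "non_numeric_amount": non_numeric,
        "duplicate_ids": duplicate_ids,
    }


report = profile(rows)
print(report)
```

Even a small report like this, run against each candidate source before design begins, gives the team and the business users something concrete to discuss.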
Early data profiling will likely expose “dirty data” that must be cleansed prior to loading into a data warehouse or other data storage structure. But where and when should that happen? The usual assumption is that the upstream data owners should discover and cleanse dirty data as completely as possible at the source once they are made aware of data issues. That looks great on paper, but it is unrealistic given typical time and budget constraints. These complicating factors make it almost impossible to get multiple upstream data owners to align their efforts on cleaning data at the source:
This only gets worse as the number of disparate data sources and owners increases. Unless data owners cooperate quickly and effectively, the project team should move rapidly to build additional cleansing functionality into the data staging process. The sooner this is settled with the data owners, the sooner the additional cleansing can be fleshed out and deployed.
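To make the idea of cleansing in the staging process concrete, the following Python sketch standardizes a text field and sets aside rows that fail basic rules instead of loading them. It is only one possible shape for such a step. The function name `cleanse_staged_rows`, the field names, and the two rejection rules are assumptions chosen for illustration, and real rules would come out of the profiling results and discussions with the data owners.

```python
from datetime import date


def cleanse_staged_rows(rows):
    """Split staged rows into (clean, rejected) lists.

    Hypothetical rules: trim and upper-case the name, reject repeated ids,
    and reject rows whose start date is after the end date.
    """
    clean, rejected, seen_ids = [], [], set()
    for row in rows:
        # Standardize the name without mutating the original staged row.
        fixed = dict(row, name=row["name"].strip().upper())
        if fixed["id"] in seen_ids:
            rejected.append((fixed, "duplicate id"))
        elif fixed["start"] > fixed["end"]:
            rejected.append((fixed, "start after end"))
        else:
            seen_ids.add(fixed["id"])
            clean.append(fixed)
    return clean, rejected


# Hypothetical staged rows from an upstream source.
staged = [
    {"id": 1, "name": " acme ", "start": date(2020, 1, 1), "end": date(2020, 12, 31)},
    {"id": 1, "name": "Acme", "start": date(2020, 1, 1), "end": date(2020, 12, 31)},
    {"id": 2, "name": "globex", "start": date(2021, 6, 1), "end": date(2021, 1, 1)},
]

clean, rejected = cleanse_staged_rows(staged)
print(len(clean), "clean rows;", len(rejected), "rejected rows")
```

Keeping the rejected rows together with a reason, instead of silently dropping them, gives the upstream data owners a list they can act on if they later fix the data at the source.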
Every IT project can benefit from a right-sized change management program, and BI projects are no different. Remember to add these points to the BI project change management program:
Finally, BI projects usually center on smaller, highly skilled teams working closely with business end users through iterative deployment cycles. Couple that with increased demand for fast time-to-value IT deployments, and it’s clear that this type of project lends itself well to an agile rather than a “big bang” project methodology. To be sure, classic agile isn’t the best fit for every BI initiative, and a standard waterfall or hybrid (i.e., “wagile”) project methodology may be needed, particularly when building out, say, the data warehouse foundation. Keeping the work tightly focused on small teams, an iterative deployment cycle, and high user “touch” will likely garner quicker successes and the longer-term management support that comes with them. One word of caution: it is critical to ensure that fast time-to-value means deployment to Production, not the random building of analytical prototypes that are never deployed.
While there are other important rules of thumb to observe in BI deployments, experience shows that the ones above should be strongly considered for any BI project. Do you have any other critical observations regarding BI projects that should be mentioned? Please comment and let’s discuss.
Eric Noack is a Senior Manager in the Data Analytics Practice of Cyber Group, Inc. He has a Bachelor’s Degree in Mechanical Engineering from Texas A&M University and a Master of Business Administration from The University of Texas at Austin. His career spans over 20 years as an IT leader and consultant, and he is passionate about implementing analytics and demonstrating its significant financial benefits.