To be or not to be for the data warehouse

There are discussions about the future of the data warehouse in this age of Digitalization and Big Data technology. Is there a need for the data warehouse anymore? Or is the solution a data lake in the cloud containing all our data – big and small – where the business rules are implemented in the data visualization tool?

Before looking at the solution, it is important to look at the challenges first, and then at how those challenges can best be solved.

In this post we will argue that the most common challenges with analytics (still) are:

  • Data Quality
  • Access to data
  • Data responsibility or ownership
  • Data governance and control
  • Continuous change of all of the above

The main focus in this post will be on Quality, Access and Change.

Most BI blog posts today focus on front-end tools and the never-ending list of new functionality for data connections, integrations, blending and so on. The focus here is instead on the importance of a solid back end: the data foundation and data preparation. We will look into three common scenarios where the data warehouse (DWH) or enterprise data warehouse (EDW), implemented using a data warehouse automation (DWA) tool, plays a big role in solving the core reporting and analytics challenges.

The latest front-end development tools have become easy to use, with a strong focus on self-service: “Access all the data with this amazing connector and get an answer to all analytic questions”. The tools are user-friendly, and developing reports and report models is easy once access to the data is established. But if the front end is connected to data with poor quality, limited access, no responsible owner or a lack of governance, the result will not be as desired. This is especially true when the factor of Change is brought in. Unfortunately, many companies are learning this lesson.

Scenario 1: Legacy Architecture

(We just want our reports to work)

The challenges

In this scenario, the company already has a data warehouse, implemented about 10 years ago, but does not get the value it needs from it. Loading the data takes all night, and the load slows down the source systems.

The scripts doing the updates were written a long time ago by consultants. No current employee really knows how the data loads work, and nobody wants to risk rewriting them. The result is an unscalable and inflexible architecture.

The organization has a low level of BI maturity, no real internal owner of the solution, and a limited budget for manually rewriting or rebuilding the whole solution.

Solution

The first step is to analyze the root cause of the load issues: study the old and complicated code and identify where and why the load issues exist.

A proof-of-concept approach is recommended here, covering a limited scope of the solution, to demonstrate how and why a DWA tool would handle the data load issues better. With a small redesign and a few modifications, the solution can be migrated to fit the DWA tool, which provides flexibility, changeability, room for growth, and automated, optimized data loads.

To succeed with such a migration, it is important to think evolution, not revolution. Solve the “as is” solution first, to show that it is possible to get stable, efficient, configurable data loads and that the reports will work with updated data every day. Then expand the solution brick by brick.
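One common reason a nightly load runs long and burdens the source systems is that every table is reloaded in full. The article does not say how a DWA tool optimizes its loads, so the following is only a minimal sketch of one widely used alternative, an incremental load driven by a "last modified" watermark. The `orders` table, its columns and the function names are hypothetical.

```python
import sqlite3

# Hypothetical table layout, used for both the source system and the warehouse.
DDL = "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT)"


def make_db():
    """Create an in-memory database holding the hypothetical orders table."""
    db = sqlite3.connect(":memory:")
    db.execute(DDL)
    return db


def incremental_load(source, target):
    """Copy only rows changed since the previous load and return how many were copied."""
    # The watermark is the newest modification timestamp already in the target.
    watermark = target.execute(
        "SELECT COALESCE(MAX(modified_at), '') FROM orders"
    ).fetchone()[0]
    # Ask the source only for newer rows, which keeps the source query small.
    changed = source.execute(
        "SELECT id, amount, modified_at FROM orders WHERE modified_at > ?",
        (watermark,),
    ).fetchall()
    # Insert new rows and overwrite existing ones that share a primary key.
    target.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", changed)
    target.commit()
    return len(changed)


source, target = make_db(), make_db()
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 100.0, "2024-01-01"), (2, 50.0, "2024-01-02")],
)
print(incremental_load(source, target))  # first run copies both rows
source.execute(
    "UPDATE orders SET amount = 120.0, modified_at = '2024-01-03' WHERE id = 1"
)
print(incremental_load(source, target))  # second run copies only the changed row
```

The sketch assumes the source reliably maintains a modification timestamp; where it does not, other change-detection techniques would be needed.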

Effects

Fast and stable updates – The reports work!

Less risk and lower cost to maintain the solution

Easy to make changes to and expand the data loads

The company will be ready for the next step: Data Analysis

Scenario 2: Data Preparation for Digitalization

(We just want some data from the data warehouse)

The Challenges

In this scenario, the company has an EDW, but the backlog at the BI department is huge, and getting new data needs prioritized is difficult because the developers are overwhelmed by their ongoing tasks. Even small changes take a long time to implement. In other words, the EDW is not as flexible as it should be and can no longer keep up with changing data requirements.

The new digitalization department demands new data now and is frustrated with the rigid and slow EDW. Its mission is to develop customized “apps” for various user groups in the organization, and it needs access to data now.

Solution

Use DWA to extract data directly from the source systems or from the existing EDW to create a data ‘store’ for digitalization. Extracting data is configuration only, with no coding. With the built-in metadata adapters in the DWA tool, the data repository is enriched with usable metadata, e.g. understandable table and column names and implemented relationships in the data model.
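To illustrate what "understandable table and column names" can mean in practice, here is a minimal sketch that turns a small metadata mapping into a view with readable names. It is not how any particular DWA tool works; the source table `CUSTMAST`, its columns and the mapping dictionaries are all hypothetical.

```python
import sqlite3

# Hypothetical metadata: technical source names mapped to business-friendly names.
TABLE_NAMES = {"CUSTMAST": "Customer"}
COLUMN_NAMES = {
    "CUSTMAST": {
        "CSTNO": "CustomerNumber",
        "CSTNM": "CustomerName",
        "CTRYCD": "CountryCode",
    }
}


def friendly_view_sql(source_table):
    """Build a CREATE VIEW statement that exposes a source table under readable names."""
    columns = COLUMN_NAMES[source_table]
    select_list = ", ".join(f"{src} AS {dst}" for src, dst in columns.items())
    return (
        f"CREATE VIEW {TABLE_NAMES[source_table]} AS "
        f"SELECT {select_list} FROM {source_table}"
    )


# Small demonstration against an in-memory database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE CUSTMAST (CSTNO TEXT, CSTNM TEXT, CTRYCD TEXT)")
db.execute("INSERT INTO CUSTMAST VALUES ('1001', 'Acme', 'NO')")
db.execute(friendly_view_sql("CUSTMAST"))
cursor = db.execute("SELECT * FROM Customer")
print([column[0] for column in cursor.description])  # readable column names
print(cursor.fetchall())
```

Because the names come from metadata rather than hand-written code, adding a new source table in this sketch only means adding entries to the two dictionaries.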

With a solution like this, the digitalization department can empower technical staff inside their own department to use self-service data preparation and extract the data they need for their apps and customer dialog.

The EDW would still serve as a reporting source for the business users, but it would coexist with the changeable data-on-demand needs of the digitalization department. Further down the line, the DWA tool can of course also be applied to the EDW. But again, a key to success is evolution, not revolution. A ‘cold turkey’ technology change on an EDW cannot be done overnight, but do not let that constrain the rapidly changing data needs of the rest of the organization.

As a side note, governance is needed between the two data hubs so that the same business rules are not defined and maintained in both solutions.

Effects

Efficient access to on-prem, in-house, small data

Self-service for the “apps” developers

Flexible use of data

Scenario 3: Need an Info-Hub (data warehouse) FAST

(We want FAST access to structured, controlled and flexible data)

The challenges

In this scenario, the company does not have a data warehouse to meet its basic reporting and analytics needs. This often follows a merger or acquisition, where structured data is needed ‘yesterday’.

The information requirements originate from a wide range of applications, and data must be integrated and consolidated for reporting. Reports specific to a single application are not sufficient, and the cost of custom reports inside each application is too high.

The organization has a low BI maturity level. The main concern is to get access to qualified and timely information – FAST. And since the information needs are not yet defined, a flexible and scalable solution is needed.

Solution

By starting with a DWA tool, the extraction of data is automated with no coding, and a qualified metadata repository comes “out of the box”. All integrations are handled in the data warehouse, which means the source systems do not have to integrate with each other (hub and spoke). It is easy to configure the data loads from inside the DWA tool, and the solution is fully documented. The data then needs to be integrated and transformed into a star schema, enabling drill-down and drill-through to detailed data for analytics.
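As a rough illustration of what a star schema with drill-down looks like, the sketch below builds one fact table and one dimension table and then aggregates sales first per category and then per product within a category. The table names, columns and sample rows are hypothetical and deliberately tiny.

```python
import sqlite3


def build_star_schema():
    """Create a minimal star schema: one fact table joined to one dimension table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, product TEXT, category TEXT);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product (product_id), amount REAL);
        INSERT INTO dim_product VALUES
            (1, 'Hammer', 'Tools'), (2, 'Saw', 'Tools'), (3, 'Paint', 'Decor');
        INSERT INTO fact_sales VALUES (1, 10.0), (1, 15.0), (2, 20.0), (3, 5.0);
        """
    )
    return db


def sales_by(db, level, category=None):
    """Sum sales at a dimension level, optionally restricted to one category."""
    if level not in ("category", "product"):
        raise ValueError(f"unknown level: {level}")
    sql = (
        f"SELECT d.{level}, SUM(f.amount) FROM fact_sales f "
        "JOIN dim_product d ON d.product_id = f.product_id "
    )
    params = ()
    if category is not None:
        sql += "WHERE d.category = ? "
        params = (category,)
    sql += f"GROUP BY d.{level} ORDER BY d.{level}"
    return db.execute(sql, params).fetchall()


db = build_star_schema()
print(sales_by(db, "category"))          # summary level: one row per category
print(sales_by(db, "product", "Tools"))  # drill-down: products within Tools
```

The same fact table answers both questions; only the dimension attribute used for grouping changes, which is what makes drill-down cheap to offer in a reporting tool.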

This enables the company to implement an enterprise data warehouse, or ‘information bank’ to use a friendlier name, where data from many different applications is loaded, combined and managed by the DWA tool.

Effects

Efficient access to data

Expandable and flexible solution

Designed to account for new technologies (Cloud & MDM)

Full technical documentation

Survival of the fittest

With these scenarios in mind, you should evaluate how these core challenges are met in your organization. If you already have a data warehouse, perfect. But is it giving you the desired value, considering the factor of Change? The DWH/EDW and analytics platform needs to be smartened up for competitive advantage. If everybody is doing it, you need to do it smarter!

By using efficient automation tools to collect and prepare the data, you will also be able to make more data available faster. Data extractions and loads are configurable and easy to change and expand. The data warehouse enables governance of what data is available to end users, whether in raw format or transformed.

When finding solutions to your core information challenges, start by implementing efficient and sufficient data preparation to increase data quality and access.

Second, implement reports and analytics, custom apps, dashboards and KPIs.

And automate all that can be automated!

Is the data warehouse challenged?

Yes, it is!

The data warehouse as we know it is challenged by cloud and Big Data solutions, because the data analytics tool vendors preach that a lake and an analytical tool are all that is needed. If companies buy into these arguments, the traditional data warehouse is challenged.

But, is it really?

In this article we argue that a data warehouse or a prepared data hub is needed, because:

With more data, there is a greater need for governance!

With more data, there is a greater need for structure!

With more data, there is a greater need for responsibility!

With more users, there is a greater need for intuitive, structured data!

Remember that the basic needs do not disappear, but DWH solutions need to adapt to the growing urgency of data demands.

So, the more data, the higher the risk of chaos and the greater the need for DWA.

And with less time spent on data preparation, there is more time to spend in the analytics area.


Anja Loug Helland is a co-founder of BI Builders with more than 10 years of experience in implementing data warehouse solutions. Her speciality is implementing solutions using the data warehouse automation tool Xpert BI, and she was a speaker at the Gartner Analytics Summit conference in London in 2017.

Espen Langbråten is Senior BI Advisor and Solution Architect at BI Builders. He has almost 20 years of experience in BI and analytics and has worked in various companies, helping them achieve their BI goals. For the last 6 years he led the BI and Data Warehouse department at Europris, with a focus on mobile BI and analytics. He is a sought-after speaker at conferences, most recently Big Data World in London and Make Data Smart Again in Oslo.

About BI Builders