Data enables a company to gain insights into its ongoing business. Business intelligence provides such insights by tracking the performance and value creation of the delivered products and services. By collecting and processing representative data, advanced analytics makes it possible to make predictions and recommendations. Finally, if enough data is gathered, machine learning can make it possible to build models that deliver fast predictions and recommendations in an automated fashion.
It is of paramount importance that data quality is as high as possible when the data is collected and stored, because the quality of the predictions will be no better than the quality of the source data. It is also a top priority to reach a high (if not complete) level of automation in data handling if a production-grade and scalable solution is to be obtained. Furthermore, data governance has to be an integrated part of the solution if it is to be scalable and enable data democratization.
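As an illustration of enforcing quality at collection time in an automated way, the following is a minimal sketch of a validation gate. The record type `SaleRecord`, its fields, and the specific rules are hypothetical and chosen only for the example; a real solution would derive its rules from the governance policies that apply to the data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SaleRecord:
    """Hypothetical record type used only for this illustration."""
    order_id: str
    amount: Optional[float]
    sold_on: Optional[date]


def validate(record: SaleRecord) -> list[str]:
    """Return a list of quality problems; an empty list means the record passes."""
    problems = []
    if not record.order_id:
        problems.append("missing order_id")
    if record.amount is None:
        problems.append("missing amount")
    elif record.amount < 0:
        problems.append("negative amount")
    if record.sold_on is None:
        problems.append("missing sold_on")
    elif record.sold_on > date.today():
        problems.append("sold_on is in the future")
    return problems


def split_by_quality(records):
    """Separate records that pass all checks from those that need review."""
    accepted, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            # Keep the reasons next to the record so the rejection can be audited.
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected
```

Running such a gate before data is stored keeps rejected records out of downstream analyses while preserving the reasons for rejection, which is one way automation and governance can support each other.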
Business intelligence
A core responsibility of business intelligence is to perform a given set of analyses on a regular basis and track their outcomes. To enable such analyses, one or more focused data marts will in the general case be an optimal solution, because this setup allows for full automation simply by populating the data mart with the relevant time-varying data.
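The idea of automating a recurring analysis by repopulating a data mart can be sketched as follows. This is only an illustration using Python's built-in `sqlite3` module; the table `daily_sales`, its columns, and the function `refresh_sales_mart` are hypothetical names, and a production data mart would typically live in a dedicated database.

```python
import sqlite3


def refresh_sales_mart(conn, rows):
    """Load new daily rows into the mart and return monthly revenue per product.

    `rows` is an iterable of (sale_date, product, revenue) tuples,
    with sale_date given as an ISO string such as "2024-01-31".
    """
    conn.execute(
        """CREATE TABLE IF NOT EXISTS daily_sales (
               sale_date TEXT NOT NULL,
               product   TEXT NOT NULL,
               revenue   REAL NOT NULL,
               PRIMARY KEY (sale_date, product)
           )"""
    )
    # INSERT OR REPLACE keeps the load idempotent: re-running it for the
    # same day and product overwrites the row instead of duplicating it.
    conn.executemany(
        "INSERT OR REPLACE INTO daily_sales (sale_date, product, revenue) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    # The recurring analysis itself: the same query runs after every load.
    return conn.execute(
        """SELECT substr(sale_date, 1, 7) AS month, product, SUM(revenue)
           FROM daily_sales
           GROUP BY month, product
           ORDER BY month, product"""
    ).fetchall()
```

Because the analysis query is fixed and only the data changes, a scheduler can call a function like this after each load without any manual step.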
Advanced analytics
Advanced analytics usually requires data from different business departments.
A data warehouse caters for such needs by ensuring that the most up-to-date data exposed by the departments is available. Since the analyses tend to be one-offs, they can involve a significant amount of manual data wrangling when fetching the data from the central data source.
Machine learning
The input to machine learning solutions tends not to be restricted to structured (e.g. tabular) data. A data lake provides the ability to handle both structured and unstructured (e.g. image, audio or video) data. Furthermore, a data lake is also meant to serve as the source repository for the historical data that is required when developing and training machine learning solutions.
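To make the data lake idea concrete, the sketch below lands raw objects of any type unchanged under a date-partitioned folder and appends an entry to a simple catalog file. It uses a local directory as a stand-in for the lake; the layout, the `catalog.jsonl` file, and the function `land_object` are assumptions made for the example, not a prescribed design.

```python
import json
from datetime import date
from pathlib import Path


def land_object(lake_root, source, name, payload, media_type, ingested_on):
    """Store raw bytes as-is and record where they went.

    The payload is written without transformation, so tabular files,
    images, audio and video are all handled the same way.
    """
    root = Path(lake_root)
    # Partition by source and ingestion date so history is never overwritten.
    target_dir = root / source / ingested_on.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(payload)

    # Append a catalog entry so the object can be found again later.
    entry = {
        "path": target.relative_to(root).as_posix(),
        "media_type": media_type,
        "size_bytes": len(payload),
    }
    with open(root / "catalog.jsonl", "a", encoding="utf-8") as catalog:
        catalog.write(json.dumps(entry) + "\n")
    return target
```

For instance, a call such as `land_object("lake", "support_calls", "call_001.wav", audio_bytes, "audio/wav", date(2024, 1, 15))` would keep the original recording available for later model training, alongside any tabular extracts landed the same way.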