Nowadays, Big Data is more than just a buzzword. This set of strategies for efficient data processing is already in place in many business sectors: banking, insurance, manufacturing, e-commerce, retail, etc. Companies using these strategies perform better and take fewer risks. This is why the Big Data market is growing by 10% per year and is estimated to reach $84 billion in 2026 (Big Data Vendor Revenue and Market Forecast, 2011-2026).
However, the technical solutions required to establish these data strategies are mostly unknown to the general public and require the intervention of data experts. Indeed, a good data strategy requires knowledge of frameworks such as Hadoop and Spark, data flow aggregators, cloud solutions, dedicated programming languages and data analytics libraries, as well as strong skills in statistics, visualization and explaining results to non-specialists.
Companies wishing to start their data transition thus face a confusing situation. A data strategy is needed to stay competitive, but the task is daunting: where to begin? Why? How? With which technology? With whom? As a result, many companies have failed in their data transition and lost huge amounts of money along the way.
In this article, we present the options available to a company wishing to start a data transition, as well as the best strategy for making that transition successful.
How to get started
The first step for a company wishing to set up an efficient data strategy is to surround itself with people skilled in data analytics. New professions have emerged, such as data scientist, data analyst and data architect. The required skills are often cross-disciplinary, and we will address this topic in an upcoming article. A good data expert can explain the pros and cons of each data analysis: beware of ready-to-use solutions that promise predictive analytics at the click of a button.
To surround itself with the right people, a company has two choices: either hiring a data team or enlisting a consulting agency.
Hiring a data team
Hiring a data scientist or a data team seems, a priori, to be a good option: “I need to set up a data strategy, so I hire competent people”. But a new issue then arises: which qualifications to look for, to do what, and at what cost? It is very difficult to know which candidate to hire without being a data analytics expert oneself, and companies therefore often fail to hire the best profiles.
It is also important to understand that good data analysis requires good communication between the company’s departments, because it is essential to combine data knowledge with field knowledge. It is thus unrealistic to hire a team of data experts, tell them “come on, analyse!” and expect a return on investment. A data transition is a shift in the company’s overall vision, which becomes data-centered. We will address this point later in this article.
What about the freelance option? A freelancer is usually highly qualified and is a good option for a project that the in-house data team does not have time to carry out. However, this solution is not suited to the situation described in this article, namely a company wishing to set up a data strategy.
Enlisting a consulting agency
Whether the final goal is to hire a data team or to outsource the data analytics, it makes sense to call in a consulting agency. Consultants are usually skilled, but one should watch out for low-cost agencies, especially without a good knowledge of data analytics. A good consulting service helps the company understand the quality of its existing data, what can be expected from these data, how they should be structured, etc. Following this initial diagnosis, the company can set up the recommended strategies and then move on to the data analyses per se. To that end, the company can either hire a data team, now knowing which profiles to hire and what they should analyse, or choose to outsource the data analyses, or even do both: for example, by having an in-house data team deploy BI software or a sophisticated predictive model developed by an external team.
Which option to choose for data analyses?
The company needs to know which kind of analyses it wants to perform before choosing an option, especially given the software available on the market. First and foremost, it is important to know whether the analyses will be descriptive, predictive, or both.
What are these? Descriptive analyses provide information on what happened in the past and give an overview of the company’s activity (for example: during the first half of 2018, 60% of my sales revenue came from customers aged 45 to 55 living in an urban area). Predictive analyses forecast what will happen in the future (for example: in the upcoming week, this machine on the production line has a 69% chance of breaking down). A specific category of predictive analyses is called prescriptive analyses: they predict the effect of future decisions and prescribe actions accordingly, so that the best decisions can be made.
Nowadays, more than 80% of business data analytics is descriptive. However, predictive analyses are so powerful that their use is growing faster than that of descriptive analyses (estimated increase of 13% by 2021, source: Forbes).
Now let’s see which solution is best suited to each type of data analysis.
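As an illustration of a descriptive analysis like the revenue example above, here is a minimal Python sketch. The sales records, field layout and the `revenue_share` helper are all invented for this illustration; a real analysis would read the company's own data, typically through a BI tool.

```python
# Hypothetical sales records: (customer_age, area, revenue). Invented for illustration.
SALES = [
    (48, "urban", 300.0),
    (52, "urban", 300.0),
    (30, "urban", 150.0),
    (61, "rural", 150.0),
    (47, "rural", 100.0),
]


def revenue_share(sales, min_age, max_age, area):
    """Return the fraction of total revenue coming from one customer segment."""
    total = sum(revenue for _, _, revenue in sales)
    if total == 0:
        return 0.0
    segment = sum(
        revenue
        for age, customer_area, revenue in sales
        if min_age <= age <= max_age and customer_area == area
    )
    return segment / total


share = revenue_share(SALES, 45, 55, "urban")
print(f"{share:.0%} of revenue comes from urban customers aged 45 to 55")
```

The function only summarizes what has already happened; it says nothing about next month's sales, which is where predictive analyses come in.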
Although some basic descriptive analyses can be performed with CRM software, in most cases the best solution is to opt for Business Intelligence (BI) software. Every BI product on the market has its strengths: server reliability, attractive visualizations, support for various data formats, etc. These tools are a first step towards a data-centered company, allowing decisions to be made with an objective, immediate and exhaustive overview of the relevant indicators.
It is important to point out that ready-to-use BI software cannot deliver predictive analyses. While it allows variables to be visualized, it is unable to capture the mathematical relations between these variables.
Predictive analyses can answer questions such as: “if I raise the advertising budget by x% for a given product, how much will the sales of this product increase?”. To answer this kind of question, a mathematical model has to be built. This model will be able to answer only one question, with a given dataset, in a given context.
This is why it is impossible to perform predictive analyses with ready-to-use software: this type of analysis has to be adapted to the dataset, the context and the challenges of each company.
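To make the idea of "building a mathematical model" concrete, here is a deliberately simple Python sketch for the advertising question. It assumes, purely for illustration, that sales depend linearly on the advertising budget; the figures and the helper names `fit_line` and `predicted_sales_increase` are hypothetical, and a real model would need far more data, more variables and proper validation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predicted_sales_increase(budget, raise_pct, slope):
    """Extra sales the linear model predicts if the budget is raised by raise_pct %."""
    return slope * budget * raise_pct / 100


# Hypothetical history for one product (invented figures).
AD_BUDGET = [10.0, 20.0, 30.0, 40.0]       # advertising budget, in thousands
UNITS_SOLD = [130.0, 160.0, 190.0, 220.0]  # units sold over the same periods

intercept, slope = fit_line(AD_BUDGET, UNITS_SOLD)
extra = predicted_sales_increase(budget=40.0, raise_pct=20.0, slope=slope)
print(f"Predicted extra units for a 20% budget raise: {extra:.1f}")
```

Note that the fitted model answers exactly one question (sales versus budget, for this product, on this history), which matches the point made above about a model being tied to one dataset and one context.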
The best predictive analyses are based on Machine Learning methods. We discuss these methods in detail in this article. Advanced data analytics skills are needed to create good predictive models. This is why predictive analytics is often outsourced to Machine Learning specialists (except in some large companies, which is not the situation of a company starting its data transition).
However, there are software tools called Model Builders that can “perform Machine Learning” in a quasi-automated fashion. Essentially, the input data and the value to be predicted are given to the software, which automatically generates an appropriate mathematical model. Although they still require data science expertise, these tools are remarkable when properly used. However, while they perform well at building a model, they completely leave out the essential data pre-processing step. This step is so important that in Kaggle competitions (data science competitions), the winners focus on two things: data pre-processing and optimization of the model’s hyper-parameters. For now, model builders do not have the ingenuity of a good data scientist.
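The two steps mentioned above, data pre-treatment (often called pre-processing) and hyper-parameter optimization, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer data are invented, the model is a basic k-nearest-neighbours classifier written by hand, and the functions `standardize`, `knn_predict`, `leave_one_out_accuracy` and `grid_search` are hypothetical helpers, not the API of any model builder.

```python
from itertools import product
from statistics import mean, pstdev

# Hypothetical customers: (yearly_spend, visits_per_month) -> churned (1) or not (0).
ROWS = [
    ((1200.0, 1.0), 1), ((1500.0, 2.0), 1), ((900.0, 1.0), 1), ((2000.0, 2.0), 1),
    ((1100.0, 8.0), 0), ((1600.0, 9.0), 0), ((1000.0, 7.0), 0), ((1900.0, 10.0), 0),
]


def standardize(features):
    """Pre-processing: rescale each column to zero mean and unit variance."""
    columns = list(zip(*features))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # avoid dividing by zero
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in features]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if votes * 2 > k else 0


def leave_one_out_accuracy(features, labels, k):
    """Predict each row from all the other rows and return the share of correct answers."""
    hits = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += knn_predict(train_x, train_y, features[i], k) == labels[i]
    return hits / len(features)


def grid_search(rows, k_values=(1, 3, 5), scaling_options=(False, True)):
    """Try every (scaling, k) combination and return the best one with all scores."""
    raw = [x for x, _ in rows]
    labels = [y for _, y in rows]
    results = {}
    for scale, k in product(scaling_options, k_values):
        # Simplification: scaling statistics use all rows; a stricter version
        # would recompute them inside each leave-one-out fold.
        features = standardize(raw) if scale else raw
        results[(scale, k)] = leave_one_out_accuracy(features, labels, k)
    best = max(results, key=results.get)
    return best, results


best, results = grid_search(ROWS)
for (scale, k), accuracy in results.items():
    print(f"scaling={scale!s:5} k={k} accuracy={accuracy:.2f}")
print("best combination (scaling, k):", best)
```

The search treats "scale the data or not" as one more choice to evaluate alongside `k`, which is one simple way to see how much the pre-processing step can matter compared with the model settings.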
Adopt a data-centered vision for the company
For a company to achieve a successful data transition, it needs a complete vision of its data strategy and has to adopt data-centered operations. This concept often worries people because they naturally tend to oppose data and human beings, whereas the two cannot actually be separated (on this topic, you can have a look at our article about Dataism). A good data strategy will never replace a good sales representative or marketing manager with data scientists. On the contrary, a winning situation emerges from connecting field experience and corporate and market knowledge with good data analyses; only in this way will a data transition be successful.
The best way to set up an efficient data strategy is to move step by step, with end-to-end support, so that everyone in the company benefits from this step forward. Instead of setting up a huge infrastructure, with a radical paradigm shift that can be difficult for management and employees to handle, we advise a gradual transition.
First ask the question: “What is our challenge today? Which question should be answered quickly?”. The challenge can take many forms, such as a decrease in sales, difficulty purchasing raw materials at the right moment to strike a good balance between purchase price and stock, the inability to predict breakdowns on the production line and the long delays that follow, etc. A data solution should always be set up to answer a question or a challenge, and not “because one has to do Big Data”; otherwise the company risks investing huge amounts of money for an unsatisfactory result.
Once the question or challenge is defined, the company can be supported in formulating it in mathematical terms and in establishing a diagnosis of the data already available and of any missing data. For this step, it is essential that all the employees concerned are involved in the process. For predictive analytics, the more feedback and expertise are brought in, the better and more precise the model will be. Once built, the model needs to be embedded in an easy-to-use software interface that can be used by the department concerned: accounting, marketing, production, etc. The software can be deployed online or directly on the company’s premises.
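The diagnosis of available and missing data mentioned above can start very simply. The Python sketch below checks, for a hypothetical breakdown-prediction challenge, how well each field the model would need is actually filled in. The records, the field names and the 90% coverage threshold are assumptions made for illustration only.

```python
def diagnose(records, required_fields):
    """Return, for each required field, the fraction of records where it is filled in."""
    report = {}
    for field in required_fields:
        filled = sum(1 for record in records if record.get(field) not in (None, ""))
        report[field] = filled / len(records) if records else 0.0
    return report


def missing_fields(report, min_coverage=0.9):
    """List the fields whose coverage is below the chosen threshold."""
    return sorted(field for field, coverage in report.items() if coverage < min_coverage)


# Hypothetical maintenance records exported from the production line.
RECORDS = [
    {"machine_id": "M1", "temperature": 71.5, "last_breakdown": "2018-03-02"},
    {"machine_id": "M2", "temperature": None, "last_breakdown": ""},
    {"machine_id": "M3", "temperature": 68.0},
    {"machine_id": "M4", "temperature": 70.2, "last_breakdown": None},
]
REQUIRED = ["machine_id", "temperature", "last_breakdown", "vibration"]

report = diagnose(RECORDS, REQUIRED)
for field, coverage in report.items():
    print(f"{field}: {coverage:.0%} filled")
print("fields to collect or complete:", missing_fields(report))
```

Which fields are "required" is exactly the kind of input that has to come from the people in the field, which is why the departments concerned need to be involved at this stage.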
This process may take several months, and it allows the company to become familiar with data-centric management in an efficient way.
Once the first challenge has been addressed with data analytics, another can be tackled. Thus, step by step, a complete data solution is set up in a smart and efficient way.
To initiate a data transition, every company should get advice and 360° support, whether the plan is to hire a data team or to outsource the data analyses.
Before doing anything, it is important to know which challenges or questions will be addressed, which data are available and whether these data can answer the question. If they cannot, the missing data have to be identified.
For descriptive analyses, the company can use BI software, with or without hiring a data team.
For predictive analyses, which are mathematically more complex, ready-to-use software cannot be used. In this case, the company has to find specialists able to build suitable mathematical models for each challenge.
If there were only one thing to keep in mind, it would be to begin with one specific, clearly formulated challenge and to address it with data analytics, and then to continue in this way until the company is mature in terms of data analytics.
An abrupt “all-data” transition should be avoided, as it might be inefficient, paralyzing for employees and very expensive.