Data science: we are only just beginning
31.08.2016 – Christoph Schuler
Report for the newsletter of IngCH Engineers Shape our Future, August 2016
Big data, or 'data science', is one of the major trends within information technology and beyond. This explains why the Swiss Federal Institute of Technology in Zurich (ETH Zürich) and the Swiss Federal Institute of Technology in Lausanne (EPFL) have joined forces to establish a centre for data science. But what are the implications of the use of data within business for industrial companies in Switzerland? What is their current level of progress and how can they move forward in this regard? We have put together a guide that takes retail firms as an example.
A company's progress when it comes to data science can be categorised according to the following three stages:
1. Data collection
Every big data project starts with the systematic collection of the data points needed as a basis for innovation or optimisation processes. For this stage, companies will need to rely on electrical engineering (for hardware or sensors) and software.
Looking at the retail example, it's clear that data is already being collected within this sector (e.g. on purchasing behaviour and stock levels). In fact, more data is often collected than is currently required. One challenge at this stage is ensuring that the right data is being collected so as to avoid putting unnecessary strain on the infrastructure. It is therefore important to first clarify your objectives, in order to set out how the data will be beneficial. Example: All of a retailer's products are recorded in the digital product management system. Could replenishment orders be optimised by refining them on the basis of product expiry dates, for example?
This example also demonstrates, however, that it is necessary to make an investment even at the data collection stage. Here, sensors would need to be installed on countless products, and recording lots of different data points is a highly complex process. At this stage, IT specialists will work closely with system engineers and whoever is responsible for the infrastructure.
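To make the expiry-date question from the retail example more concrete, here is a minimal sketch in Python. It is not a description of any retailer's actual system: the `Batch` record, the function names and all figures are hypothetical, and demand is assumed to be constant, with the oldest stock sold first. The sketch compares an order quantity based on stock totals alone with one that also accounts for units expiring before they can be sold.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Batch:
    """One delivery of a product still in stock (hypothetical record)."""
    quantity: int
    expiry: date  # last day on which the units may be sold


def naive_order(batches, daily_demand, horizon_days):
    """Order quantity based on stock totals only, ignoring expiry dates."""
    in_stock = sum(b.quantity for b in batches)
    return max(0, daily_demand * horizon_days - in_stock)


def expiry_aware_order(batches, daily_demand, horizon_days, today):
    """Order quantity that accounts for units expiring before they are sold.

    Assumes constant daily demand and that the batch with the earliest
    expiry date is always sold first.
    """
    # Work on [expiry, quantity] copies so the caller's data is untouched.
    stock = sorted(([b.expiry, b.quantity] for b in batches), key=lambda s: s[0])
    shortfall = 0
    for offset in range(horizon_days):
        day = today + timedelta(days=offset)
        # Drop empty batches and batches that are past their expiry date.
        stock = [s for s in stock if s[0] >= day and s[1] > 0]
        needed = daily_demand
        for s in stock:
            sold = min(s[1], needed)
            s[1] -= sold
            needed -= sold
            if needed == 0:
                break
        # Demand that could not be met from stock has to be ordered.
        shortfall += needed
    return shortfall


# Invented figures for illustration only.
today = date(2016, 8, 1)
batches = [Batch(12, date(2016, 8, 2)), Batch(18, date(2016, 8, 10))]

print(naive_order(batches, daily_demand=4, horizon_days=7))                # 0
print(expiry_aware_order(batches, daily_demand=4, horizon_days=7, today=today))  # 2
```

With these invented figures, the total stock of 30 units appears to cover a week's demand of 28 units, so the naive calculation orders nothing. Four units of the first batch expire unsold, however, which leaves a shortfall of two units that only becomes visible once expiry dates are available as data.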
2. Data organisation
Once the necessary data has been collected, it is crucial that it is organised effectively and stored in appropriate structures. The way the mass of data is managed needs to make analysis possible at a later stage. Data series can sometimes stay relevant for years (e.g. purchasing behaviour on bank holidays). At the organisation stage, the performance and availability of large, complex datasets are of particular importance. Addressing these issues is one of the core competencies of IT engineers.
3. Data analysis
The real added value that big data brings isn't seen until the third stage. Once the necessary data has been collected and organised, the art of data mining comes down to analysing the data and linking it up to other relevant data from a wide range of sources. This type of project will always be interdisciplinary, with IT specialists working with statisticians or mathematicians. Business expertise is another fundamental requirement here, as it ensures that the right questions are asked. Sometimes, it may even be necessary to be a bit unconventional in the methods used. Companies like Apple and Google are leading the way here. VIP customers in Apple stores are identified by their personalised Apple Watches and then given preferential treatment, whilst Google uses smartphone tracking to map visitor numbers and opening times. Here in Switzerland, we are only just beginning to analyse this kind of data in a systematic way.
What is the current level of progress in Switzerland?
The majority of advanced industrial companies in Switzerland, not just those in the retail sector, are currently at the second stage. In other words, they have created an infrastructure and are currently working out how to store and organise the data. However, many companies are actually still at the first stage, including some within the retail sector. For example, although many products do now have a barcode on them, the expiry date cannot be read by a machine. Making it machine-readable would bring a potential additional benefit for product management and the ordering process. The use of data within business certainly has a lot of potential, but there are still some obstacles to overcome.