Data Science Explained

November 18, 2019

Data Analytics for Business has been around since the 1980s and is now a practice familiar to businesses of all kinds. Data Science has more recently been defined as an evolution of Analytics, and it’s considered an umbrella term that groups several tactical components. A few examples of these components are Artificial Intelligence, Machine Learning, ETL and data cleanup, Data Visualisation, Algorithms and Infrastructure.

However, the best way to understand Data Science is as a process that uses data to understand things such as your customers, competitors and business processes.

The process starts from a data input and has “understanding” as its ultimate goal. Data Science goes beyond business intelligence and data analytics. It delves into why and how a specific action was taken (e.g. a purchase decision) and tries to understand the context.

Data itself is just the beginning. Understanding data leads to insights by finding the patterns and trends hidden within it. With this premise, you should prioritise identifying the data you need to gain that understanding. For example, you might want to combine your structured data (usually found in your database) and unstructured data (usually from external sources, or hidden in internal documents like contracts).

From Healthcare to Manufacturing, Data Science is helping teams optimise their understanding and use of their databases. A database is usually used to store a collection of facts about entities, such as customers, sales transactions or web interactions.

Businesses might spend a lot of money, time and effort managing this repository, but that doesn’t necessarily convert into value. In fact, data, information and insights aren’t the same thing.

Imagine a hierarchy of needs shaped like a pyramid, with raw, unprocessed structured and unstructured data at the bottom. Information sits in the middle of the pyramid; it’s where data is prepared and processed, aggregated and organised into a machine-understandable format. The next level up holds the insights generated by analysing the information. Data and information are essential in supporting the discovery of insights that can influence decisions and drive change.
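To make the difference between data, information and insight more concrete, here is a minimal Python sketch. The transaction records, the customer names and the helper functions `to_information` and `to_insight` are all hypothetical, made up for illustration; a real project would read from a database and use far richer analysis.

```python
from collections import defaultdict

# Hypothetical raw data: one record per sales transaction (bottom of the pyramid).
raw_transactions = [
    {"customer": "alice", "amount": 120.0},
    {"customer": "bob", "amount": 35.5},
    {"customer": "alice", "amount": 80.0},
    {"customer": "carol", "amount": 15.0},
    {"customer": "bob", "amount": 20.0},
]


def to_information(transactions):
    """Aggregate raw transactions into total spend per customer (middle of the pyramid)."""
    totals = defaultdict(float)
    for record in transactions:
        totals[record["customer"]] += record["amount"]
    return dict(totals)


def to_insight(spend_by_customer):
    """Derive a simple insight: the top customer and their share of total revenue."""
    total = sum(spend_by_customer.values())
    top = max(spend_by_customer, key=spend_by_customer.get)
    return top, spend_by_customer[top] / total


information = to_information(raw_transactions)
top_customer, share = to_insight(information)
print(f"{top_customer} accounts for {share:.0%} of revenue")
```

The raw list on its own says little; the aggregated totals are easier to read, and the final statement is the kind of finding that could influence a decision.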

At the peak of the pyramid lies Artificial Intelligence (Machine Learning): the “automated & actionable insights” that are often invisible to the human eye.

Big Data & Data Skills

Aggregating different forms of data can be messy and can result in a huge volume of data, also known as Big Data. Gartner’s definition of Big Data is:

“high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making, and process automation”

The life-cycle of usable data usually involves collection, pre-processing, storage, retrieval, post-processing, analysis, visualisation and so on. Traditional applications cannot effectively process such high volumes of data. The Data Analytics tactics mentioned above are applied to obtain valuable, on-demand results. For example, Machine Learning algorithms run through a number of data sets to look for meaningful correlations between them.
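As a rough illustration of what "looking for correlations" can mean in its simplest form, the sketch below computes the Pearson correlation coefficient between pairs of numeric series. The data (`ad_spend`, `sales`, `store_temperature`) is invented for the example, and real Machine Learning pipelines go well beyond this single statistic.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical weekly figures, made up for illustration.
ad_spend = [10, 20, 30, 40, 50]
sales = [12, 24, 33, 46, 55]
store_temperature = [21, 19, 22, 20, 21]

print("ad spend vs sales:", round(pearson(ad_spend, sales), 3))
print("ad spend vs temperature:", round(pearson(ad_spend, store_temperature), 3))
```

A coefficient close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or none. A strong correlation is a prompt for further investigation rather than proof of cause and effect.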

Data Scientists support businesses in finding the right models and data processing techniques, and in applying them correctly. Building an internal Data Science team is hard, both because of the time required to find profiles with the right expertise and because the growing gap between supply and demand for Data Science skills makes hiring extremely competitive.

Outsourcing Data Science solutions offers benefits including access to global talent pools, essential generalist skills and expertise that can solve real and specific business needs. In addition, businesses can leverage third parties’ larger volumes of data (e.g. unstructured data) and combine skills and cutting-edge technologies not yet available to them.

However, the harder the data is to reach, the more complex a project becomes. People and Data are the two things to consider before starting an AI project.

Machine Learning Projects

Having a plan for the entire project is also essential. From the project management perspective, there are Data Science methodologies built to provide a lifecycle to Data Science projects. The “Team Data Science Process” method, also known as TDSP, outlines 5 steps that are usually taken when executing a project.

Data Science Lifecycle
  1. Business and requirements understanding. The business need is identified and business goals are defined. At this stage, teams will have to determine whether they need additional data from different data sources. It’s essential to make sure that all project parties clearly understand what they are trying to achieve or solve by implementing the technology.
  2. Data Acquisition and Understanding. The current state of data is assessed, explored, pre-processed and cleaned. At this stage, Data Scientists usually have a better idea of whether existing data is sufficient or not.
  3. Data Modelling. Feature engineering is performed on the cleaned dataset to generate a new, improved data set that facilitates model training. After ensuring the dataset is composed of (mostly) informative features, several models are trained and evaluated, and the best one is selected to be deployed. At this stage, it is important to distinguish between generating the input data needed by a Machine Learning model for a Proof-Of-Value and doing so continuously and at scale.
  4. Deployment. The data pipeline and the winning model are deployed to a production or production-like environment. Model predictions can be served either in real time or on a batch basis; this choice should be made at this stage.
  5. Customer Acceptance. In this last stage, two important tasks are performed: system validation and project hand-off. The aim of this stage is to confirm that the deployed model meets the client’s needs and expectations. The project hand-off includes delivering the project to the person responsible for running the system in production, along with project reports and documentation. At this final stage it is important that any third parties involved provide the right training to all team members.
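The idea of training several models, evaluating them and picking a winner can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the data is invented, the two candidates (`fit_mean_baseline` and `fit_line`) and the helper `select_winner` are hypothetical names, and a real project would typically rely on a Machine Learning library and a more careful validation scheme.

```python
def fit_mean_baseline(xs, ys):
    """Candidate 1: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares straight line through the training data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model, xs, ys):
    """Average squared difference between predictions and true values."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_winner(candidates, train, validation):
    """Train every candidate, score it on held-out data, return the best one."""
    models = {name: fit(*train) for name, fit in candidates.items()}
    scores = {
        name: mean_squared_error(model, *validation) for name, model in models.items()
    }
    winner = min(scores, key=scores.get)
    return winner, models[winner], scores


# Hypothetical data, split into a training set and a held-out validation set.
train = ([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 7.8, 10.1, 12.0])
validation = ([7, 8], [14.1, 15.9])

candidates = {"mean_baseline": fit_mean_baseline, "line": fit_line}
winner, model, scores = select_winner(candidates, train, validation)
print("winner:", winner, "scores:", scores)

# Batch-style predictions: score a whole list of new inputs in one go.
batch_predictions = [model(x) for x in [9, 10]]
print("batch predictions:", batch_predictions)
```

The final lines show the batch option mentioned in the Deployment step; a real-time setup would instead call the winning model once per incoming request.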

How Gravity 9 can help

Our Data Science and Machine Learning Practice helps clients at every step, from creating a strategy through to building and managing Data Science solutions.

© 2020 Gravity 9 Solutions Ltd