Data Preparation and Why Should You Care?

Data Preparation is the process of gathering, combining, structuring and organizing data so it can be analysed as part of business intelligence (BI) and business analytics (BA) programs.

The components of data preparation include data discovery, profiling, cleansing, validation and transformation; it often also involves pulling together data from different internal systems and external sources.

Source: http://searchbusinessanalytics.techtarget.com/definition/data-preparation

But isn’t that ETL? Yes and no. Let me explain.

I believe the difference between Data Preparation and ETL lies in who does it and the tools used in the process.

Data Preparation is like ETL, but developed by Business or Data Analysts or by Business Users, and in most cases the tools used to implement it do not require any coding skills.

Data Preparation is usually the initial step in analytics, when analysts do not yet know what they are looking for but do know where to start looking. It exists to minimise the time required to bring data together into a meaningful, ready-to-use dataset. Can you imagine the time needed to build or rebuild an ETL process when user requirements for data change so rapidly?

Data Preparation has been around for many years as a complement to the self-service data visualisation technologies that are now widespread in most organisations.

Personally, I became aware of Data Preparation when I noticed that most of these tools provide a toolkit for business users to access data, cleanse and transform it, and assemble it into a data model that feeds the visualisations. At the same time, I identified what I see as a flaw in these self-service BI tools: the data model containing clean, well-shaped data remains within the boundaries of the specific toolset and is not available to any other system.
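To make the access, cleanse, transform and combine steps more concrete, here is a minimal Python sketch using only the standard library. The record layouts, the field names (customer_id, name, amount) and the helper functions are all hypothetical, invented for illustration. A self-service tool would normally perform these steps through its user interface rather than through code.

```python
# Hypothetical raw extracts from two systems (all names and values are illustrative).
crm_rows = [
    {"customer_id": "1", "name": "  Alice "},
    {"customer_id": "2", "name": "bob"},
    {"customer_id": "2", "name": "bob"},      # duplicate record
    {"customer_id": "",  "name": "unknown"},  # missing key
]
sales_rows = [
    {"customer_id": "1", "amount": "120.50"},
    {"customer_id": "2", "amount": "80"},
    {"customer_id": "1", "amount": "30"},
]


def cleanse_customers(rows):
    """Cleanse: trim and normalise names, drop rows without a key, remove duplicates."""
    seen = set()
    clean = []
    for row in rows:
        key = row["customer_id"].strip()
        if not key or key in seen:
            continue
        seen.add(key)
        clean.append({"customer_id": key, "name": row["name"].strip().title()})
    return clean


def total_sales(rows):
    """Transform: convert amounts to numbers and aggregate them per customer."""
    totals = {}
    for row in rows:
        key = row["customer_id"]
        totals[key] = totals.get(key, 0.0) + float(row["amount"])
    return totals


def prepare_dataset(customers, sales):
    """Combine: attach each cleansed customer's sales total to form one dataset."""
    totals = total_sales(sales)
    return [
        {**customer, "total_sales": totals.get(customer["customer_id"], 0.0)}
        for customer in cleanse_customers(customers)
    ]


dataset = prepare_dataset(crm_rows, sales_rows)
```

The resulting dataset is the kind of clean, well-shaped model that typically stays locked inside one tool. Here it is just a list of dictionaries that any other system could consume.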

Why Should You Care?

The next time you design a solution in a self-service data visualisation tool, or if you want to implement a Data Preparation strategy in your organisation, consider the following:

Datasets (or Data Models) should be reusable: Minimise the cost and effort required to transform raw data into ready-to-use datasets. Avoid duplication of work, and allow users and systems to share previously prepared datasets.
A central repository of datasets: To implement reusability, the prepared datasets or data models should exist in a single central place. This makes them easier to manage, and everyone knows where to find the data they require for their analytics.
Use only approved datasets: Data Preparation tools put the power of ETL in users’ hands, but without governance, every user can create their own version of the truth. Make sure only approved datasets are made available to other data consumers in the organisation.
From Data Preparation to ETL: Data Preparation tools can have limitations when handling large numbers of records or when scheduling their automatic execution. When problems arise, use the Data Preparation steps as technical specifications for the implementation of a more robust ETL process.
Secure your datasets: Be mindful of the users accessing the datasets and the activities they perform with them, internally and externally.
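As a rough illustration of how reuse, a central repository, approval and access control could fit together, here is a small Python sketch. The PreparedDataset and DatasetCatalog classes, their fields and the user names are assumptions made up for this example. They do not correspond to any particular Data Preparation product.

```python
from dataclasses import dataclass, field


@dataclass
class PreparedDataset:
    """A prepared dataset plus the governance metadata attached to it (hypothetical)."""
    name: str
    owner: str
    rows: list
    approved: bool = False
    allowed_users: set = field(default_factory=set)


class DatasetCatalog:
    """A single central place where prepared datasets are registered and shared."""

    def __init__(self):
        self._datasets = {}

    def register(self, dataset):
        # Refuse duplicates so users reuse existing work instead of redoing it.
        if dataset.name in self._datasets:
            raise ValueError(f"dataset {dataset.name!r} already exists; reuse it")
        self._datasets[dataset.name] = dataset

    def approve(self, name):
        # In practice a data steward would perform this step after a review.
        self._datasets[name].approved = True

    def get(self, name, user):
        dataset = self._datasets[name]
        if not dataset.approved:
            raise PermissionError(f"{name!r} is not approved for sharing")
        if user not in dataset.allowed_users:
            raise PermissionError(f"{user!r} may not read {name!r}")
        return dataset.rows


catalog = DatasetCatalog()
catalog.register(
    PreparedDataset(
        name="sales_by_customer",
        owner="analyst_a",
        rows=[{"customer_id": "1", "total_sales": 150.5}],
        allowed_users={"analyst_b"},
    )
)
```

Until catalog.approve("sales_by_customer") is called, any call to catalog.get raises PermissionError, which mirrors the idea that only approved datasets reach other data consumers.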

Data Preparation helps individuals and organisations add the necessary agility to the process of transforming data into information. It also allows IT and Business Users to collaborate in that process: IT by providing a reliable data platform where data is available when needed, and Business Users by making sure the prepared datasets implement the business logic required to make the right decisions.
