Big Data

Introduction

Big data has been trending strongly over the last few years. We live in the era of the “Internet of Things”, where data is produced every fraction of a second and the rate of data growth has increased rapidly. As reported by Forbes, it was estimated that by 2020 every person on earth would produce 1.7 MB of data per second. We use various applications and websites for purposes such as online streaming, e-commerce, transactions, and social media. These activities produce large data sets (known as big data) that are complex, unorganized, and difficult to store.

Big data is used in various analytics approaches such as machine learning, data mining, and predictive analytics, which are all part of big data analytics. Applying them can reduce costs, improve decision making, support new product development, and so forth. Big data comes in different forms: structured, unstructured, and semi-structured. Big data analytics is used in sectors such as banking, healthcare, manufacturing, and retail.

Big Data Analytics

Types of Big Data

      • Structured – Data that is stored in an ordered manner and can be accessed easily is known as structured data. Structured data is easy to process, and SQL is typically used to manage it. An example is an employee information table, where details such as name, ID number, designation, and salary are stored in an orderly fashion. Its source can be either human-generated or machine-generated. Data from ticket reservation systems, online appointments, feedback forms, transactions, GPS tracking, etc. is structured data.
      • Unstructured – Most big data is unstructured. It is heterogeneous in nature, for example a combination of audio, video, and images. Its source can be either human-generated or machine-generated. It cannot be organized easily, and arranging it takes a lot of time. NoSQL systems are commonly used to manage this kind of data. Data from email, Facebook, Twitter, digital surveillance, satellite images, etc. is unstructured data.
      • Semi-structured – Data that combines structured and unstructured forms is known as semi-structured data. Although semi-structured entities belong to the same class, they may have different attributes. An XML file is a common example of this form of data.

[Figure: a simple visualization of how data is generated through smartphone use.]
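
The three forms described above can be contrasted with a minimal Python sketch. The employee records, XML snippet, and email text are made-up sample data for illustration only, and an in-memory SQLite table stands in for whatever SQL system would be used in practice.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: fixed columns, queried with SQL (hypothetical employee table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER, name TEXT, designation TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Analyst", 52000.0), (2, "Ravi", "Engineer", 61000.0)],
)
engineers = conn.execute(
    "SELECT name FROM employee WHERE designation = ?", ("Engineer",)
).fetchall()

# Semi-structured: records of the same class whose attributes differ.
xml_doc = """
<employees>
  <employee id="1"><name>Asha</name><phone>555-0100</phone></employee>
  <employee id="2"><name>Ravi</name><email>ravi@example.com</email></employee>
</employees>
"""
root = ET.fromstring(xml_doc)
fields_per_employee = [sorted(child.tag for child in emp) for emp in root]

# Unstructured: free text with no schema; any structure must be extracted.
email_body = "Hi team, the shipment was delayed again. Please call me back."
word_count = len(email_body.split())

print(engineers)            # rows matching the SQL filter
print(fields_per_employee)  # each employee exposes a different set of fields
print(word_count)           # the only "structure" here is what we compute
```

The structured table can be filtered directly by column, the XML records have to be inspected one by one because their fields differ, and the email text offers nothing to query until some processing is applied to it.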

Characteristics of Big Data

      • Variety – Variety refers to the different forms data can take. The data we gather can be structured, unstructured, or semi-structured: for example, social media data, GPS data, inventory records, and images.
      • Velocity – Velocity refers to how quickly data is generated. As noted earlier, data is produced every fraction of a second, and the rate of data growth has increased rapidly.
      • Volume – Because data keeps growing at high velocity, sufficient storage is needed to keep all of it. Dealing with big data therefore requires large storage capacity.
      • Variability – Variability refers to variation in the data. Collected data is often inconsistent and keeps changing, which hampers analysis.
      • Value – The main aim is to derive value from data; collected data is wasted unless value is derived from it. Value asks the question “How worthwhile is the collected data?”
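
To make the link between velocity and volume concrete, the sketch below converts a per-second generation rate into a daily storage figure. It takes the 1.7 MB per second per person estimate quoted in the introduction at face value; the function name and the decimal convention (1 GB = 1000 MB) are assumptions made for this illustration.

```python
MB_PER_SECOND_PER_PERSON = 1.7  # estimate quoted in the introduction
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def daily_volume_gb(people: int,
                    mb_per_second: float = MB_PER_SECOND_PER_PERSON) -> float:
    """Return the data produced per day in GB, assuming 1 GB = 1000 MB."""
    return people * mb_per_second * SECONDS_PER_DAY / 1000


# One person at 1.7 MB/s: 1.7 * 86,400 = 146,880 MB, i.e. about 146.88 GB/day.
print(round(daily_volume_gb(1), 2))
# A hypothetical group of 1,000 people at the same rate.
print(round(daily_volume_gb(1000), 2))
```

Even under this rough estimate, a steady per-second rate adds up to a large daily volume, which is why storage capacity is treated as a defining concern for big data.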

