Big Data

Definition


Big data is a field that involves analyzing and managing huge amounts of data.

Description


As with smaller data sets, the usual aim of working with big data is to derive insights from the data.

There isn’t a specific size to determine if a data set is big enough to be considered big data.

A data set can be considered big data if the organization has difficulty managing it with traditional methods, software and databases.

Characteristics of Big Data

Volume

This refers to the quantity of data.

Velocity

This refers to the speed at which the data is received and needs processing.

Data from real-time sources usually requires much faster management and processing capabilities, especially when the insights from the data need to be extracted quickly.
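One common way to keep up with fast-arriving data is to update a result as each reading arrives instead of waiting for a batch. The sketch below is only an illustration: the `SlidingWindowAverage` class, the 10-second window and the sensor readings are all hypothetical.

```python
from collections import deque


class SlidingWindowAverage:
    """Keeps the average of the readings from the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        # Record the new reading as soon as it arrives.
        self.events.append((timestamp, value))
        self.total += value
        # Drop readings that have fallen out of the time window.
        while self.events[0][0] <= timestamp - self.window_seconds:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)


# Hypothetical sensor readings: (seconds since start, value).
readings = [(0, 10.0), (1, 20.0), (2, 30.0), (12, 40.0)]
window = SlidingWindowAverage(window_seconds=10)
averages = [window.add(t, v) for t, v in readings]
print(averages)  # [10.0, 15.0, 20.0, 40.0]
```

Each call to `add` returns an up-to-date average, so an insight is available the moment the data arrives. By the fourth reading, the first three are more than 10 seconds old and no longer count.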

Variety

This refers to the type of data. The common types are:

  • Text
  • Numbers
  • Audio
  • Imagery
  • Video

Another way to categorize data is structured vs unstructured data.

Structured data is organized and formatted so that it is easy to search, process and analyze.

Unstructured data has no pre-defined organization or format. This makes it harder to search, process and analyze.
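To illustrate the difference, the sketch below answers the same question ("how much did Ana pay in total?") from a structured list of records and from free text. The names, amounts and the regular expression are made up for the example.

```python
import re

# Structured: every record has the same named fields, so a query is simple.
orders = [
    {"customer": "Ana", "amount": 30.0},
    {"customer": "Ben", "amount": 45.5},
    {"customer": "Ana", "amount": 12.5},
]
total_structured = sum(o["amount"] for o in orders if o["customer"] == "Ana")

# Unstructured: the same facts written as free text.
notes = "Ana paid $30.00 on Monday. Ben sent $45.50 later. Then Ana paid another $12.50."

# A hand-written pattern: "Ana", then a dollar amount in the same sentence.
# It is fragile and would miss other phrasings such as "Ana was charged 30 dollars".
pattern = re.compile(r"Ana\b[^.$]*\$(\d+(?:\.\d+)?)")
total_unstructured = sum(float(amount) for amount in pattern.findall(notes))

print(total_structured, total_unstructured)  # 42.5 42.5
```

Both approaches give the same total here, but the free-text version only works because the pattern happens to fit how these sentences are written.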

Veracity

This refers to the accuracy of the data.

Value

This refers to how much useful insight can be derived from the data.

Variability

This refers to the consistency of the flow of data. The creation of some data peaks during certain times, days or months, but slows down during other times.
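One simple way to put a number on variability is to compare the busiest period with the average. This is a rough sketch using hypothetical hourly event counts, and the peak-to-average ratio is just one possible measure.

```python
from statistics import mean

# Hypothetical number of events received in four sampled hours of a day.
hourly_events = {"00:00": 100, "06:00": 200, "12:00": 900, "18:00": 400}

average = mean(list(hourly_events.values()))
peak_hour = max(hourly_events, key=hourly_events.get)
peak_to_average = hourly_events[peak_hour] / average

print(peak_hour, peak_to_average)  # 12:00 2.25
```

A ratio close to 1 means the data arrives at a steady rate. A higher ratio means the system has to cope with peaks well above its typical load.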

Complexity

This refers to how complex it is to clean, match, link and manage the data. This characteristic is especially important when there are multiple data sources.
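As a small illustration of cleaning, matching and linking, the sketch below joins records from two hypothetical sources (a CRM system and a billing system) using the email address. The field names and the `normalize_email` helper are assumptions made for the example.

```python
def normalize_email(email):
    """Clean step: remove stray spaces and ignore letter case."""
    return email.strip().lower()


# Two hypothetical sources that format the same email differently.
crm_records = [
    {"email": " Ana@Example.com ", "name": "Ana"},
    {"email": "ben@example.com", "name": "Ben"},
]
billing_records = [
    {"email": "ana@example.com", "balance": 120},
    {"email": "CARL@example.com", "balance": 75},
]

# Match step: index one source by the cleaned key.
billing_by_email = {normalize_email(r["email"]): r for r in billing_records}

# Link step: combine records that share a key and note those that do not.
linked = []
unmatched = []
for record in crm_records:
    key = normalize_email(record["email"])
    match = billing_by_email.get(key)
    if match is None:
        unmatched.append(record["name"])
    else:
        linked.append({"name": record["name"], "email": key, "balance": match["balance"]})

print(linked)     # [{'name': 'Ana', 'email': 'ana@example.com', 'balance': 120}]
print(unmatched)  # ['Ben']
```

Without the cleaning step, Ana's two records would not match at all. Real data sources usually disagree in many more ways than letter case and spacing, which is where most of the complexity comes from.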

Big Data in Industries

Big data is common in the following industries:

  • Manufacturing
  • Media
  • Government
  • Social Media
  • Finance
  • Healthcare
  • Insurance
  • Technology

Examples of Big Data in Action


  • Millions of surveillance cameras capture videos of the public across the country. Machine learning is then used to identify faces.
  • Spotify tracks the data of its users. It then analyzes this data to recommend music they might like.
  • Uber generates and uses a huge amount of data about its drivers, their vehicles, their locations and every trip made by every vehicle. This data is analyzed to predict demand, supply and driver locations, and to decide whether to add a surcharge.
