Let’s Understand the Data…
What is Data?
Data consists of raw facts and figures that must be processed further to yield valuable insights. As is often observed nowadays, “Data has become the fuel of the future”.
Data is the raw material of the 21st century. It can be used to power everything from businesses to governments to social movements. But not all data is created equal. Data that is valuable for one domain or problem may be useless for another. This is why it is important to understand the different types of data and how they can be used.
Types of Data
Data can be classified into different types based on its characteristics and properties. These classifications help us understand the nature of the data and choose appropriate methods of analysis and interpretation.
1. Numerical Data:
Numerical data, also known as quantitative data, is data expressed as numbers rather than in words or descriptions. It is always measurable, so you can perform mathematical and arithmetical operations on it, such as adding values together, and you can express it in decimal or fraction form.
a. Discrete Data: Discrete data represents whole numbers or counts that have a finite or countable number of values. E.g. number of bank accounts, number of cars, etc.
b. Continuous Data: Continuous data represents measurements that can take any value within a certain range. E.g. height, weight, distance, etc.
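As a minimal sketch of this distinction, the Python snippet below stores counts as integers and measurements as floats. The sample values and variable names are hypothetical, chosen only for illustration.

```python
from statistics import mean

# Hypothetical sample values, for illustration only.
cars_per_household = [0, 1, 2, 1, 3]        # discrete: whole-number counts
heights_cm = [162.5, 170.2, 158.9, 181.0]   # continuous: any value within a range

# Arithmetic works on both kinds of numerical data.
total_cars = sum(cars_per_household)
average_height_cm = mean(heights_cm)

# Discrete counts are whole numbers; continuous values may carry fractions.
all_counts_are_whole = all(isinstance(n, int) for n in cars_per_household)
```

Note that the average of discrete counts can itself be fractional (1.4 cars per household here), even though no single household can own 1.4 cars.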
2. Categorical Data:
Categorical data represents qualitative variables that can be grouped into specific categories or classes.
a. Nominal Data: Nominal data consists of categories that do not have any specific order or ranking. These categories are simply labels or names. E.g. car brands (Toyota, BMW, Audi, etc.), hair color (Brown, Black, Blonde, etc.), marital status (Single, Married, Divorced), etc.
b. Ordinal Data: Ordinal data represents categories with a natural order or ranking. The categories have a relative position or value. E.g. educational level (Primary, Secondary, College) or food satisfaction ratings (poor, average, good, excellent).
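The practical difference shows up in which operations make sense. The sketch below, using hypothetical survey values, counts nominal labels to find the most frequent one, and uses an explicit rank mapping (an assumption of this example, not a standard API) to sort ordinal ratings and pick the middle one.

```python
from collections import Counter

# Nominal: labels with no inherent order, so counting is the natural operation.
hair_colors = ["Brown", "Black", "Blonde", "Black", "Brown", "Black"]
most_common_color = Counter(hair_colors).most_common(1)[0][0]

# Ordinal: labels with a natural order, made explicit through a rank mapping.
RATING_ORDER = ["poor", "average", "good", "excellent"]
rank = {label: position for position, label in enumerate(RATING_ORDER)}

ratings = ["good", "poor", "excellent", "average", "good"]
sorted_ratings = sorted(ratings, key=lambda label: rank[label])

# With an odd number of ratings, the median is the middle element after sorting.
median_rating = sorted_ratings[len(sorted_ratings) // 2]
```

Sorting or taking a median of `hair_colors` would have no meaning, which is exactly what separates nominal from ordinal data.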
3. Dichotomous Data:
Dichotomous or binary data occurs when you can place an observation into only two categories. It tells you that an event occurred or that an item has a particular characteristic. For instance, an inspection process produces binary pass/fail results. Or, when a customer enters a store, there are two possible outcomes: sale or no sale. Binary data is commonly used in fields such as computer science and statistics to represent logical or boolean variables.
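As a small sketch of the pass/fail example, binary outcomes can be stored as booleans. The inspection results below are hypothetical.

```python
# Hypothetical inspection results: True means pass, False means fail.
inspection_passed = [True, False, True, True, False]

# Python treats True as 1 and False as 0, so summing counts the passes.
pass_count = sum(inspection_passed)
fail_count = len(inspection_passed) - pass_count
pass_rate = pass_count / len(inspection_passed)
```

Because each observation is either 0 or 1, the mean of binary data is simply the proportion of observations in the "true" category.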
4. Textual Data:
Textual data refers to unstructured data in the form of text, such as sentences, paragraphs, or Word documents. Textual data often requires Natural Language Processing (NLP) techniques to extract meaningful insights or perform analysis.
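A very simple first step in working with text is to split it into words (tokens) and count them. The sketch below uses only a regular expression and a counter; the review text is hypothetical, and real NLP pipelines typically go well beyond this.

```python
import re
from collections import Counter

# Hypothetical customer review, for illustration only.
review = "The food was good. The service was slow, but the staff was friendly."

# Lowercase the text and keep runs of letters (and apostrophes) as tokens.
tokens = re.findall(r"[a-z']+", review.lower())

# Count how often each token appears.
word_counts = Counter(tokens)
```

Turning free text into counts like these is one way unstructured data becomes something that numerical methods can work with.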
5. Temporal Data:
Temporal data refers to information that changes over time. It includes data points associated with specific timestamps or time intervals. Temporal data is commonly used in finance, weather forecasting, and stock market analysis.
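To illustrate data points tied to timestamps, the sketch below parses a few hypothetical temperature readings, orders them chronologically, and measures the time interval they cover.

```python
from datetime import datetime

# Hypothetical (timestamp, temperature in °C) readings, deliberately out of order.
readings = [
    ("2024-01-01T12:00:00", 21.5),
    ("2024-01-01T06:00:00", 18.0),
    ("2024-01-01T18:00:00", 19.2),
]

# Parse ISO 8601 strings into datetime objects and sort chronologically.
timeline = sorted((datetime.fromisoformat(ts), value) for ts, value in readings)

# The interval covered is the difference between the last and first timestamps.
span = timeline[-1][0] - timeline[0][0]
span_hours = span.total_seconds() / 3600
latest_value = timeline[-1][1]
```

The ordering matters here in a way it does not for the other data types: shuffling the readings would destroy the information about how the temperature evolved.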
6. Spatial Data:
Spatial data is any type of data that directly or indirectly references a specific geographical area or location. Sometimes called geospatial data or geographic information, spatial data can also numerically represent a physical object in a geographic coordinate system. However, spatial data is much more than a spatial component of a map. Analyzing this data provides a better understanding of how each variable impacts individuals, communities, populations, etc.
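As one example of working with locations in a geographic coordinate system, the sketch below estimates the great-circle distance between two latitude/longitude points with the haversine formula. It assumes a spherical Earth with a mean radius of 6371 km, which is an approximation, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)

    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator is roughly 111 km.
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Dedicated geospatial libraries handle projections and more accurate Earth models; this sketch only shows that coordinates are numbers with geographic meaning attached.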
7. Image & Video Data:
Image and video data are visual information captured as single images or as sequences of images (called frames) that are displayed at a given frequency. Stopping at a specific frame of the sequence yields a single video frame, i.e. an image. On some tasks, image and video analytics applications built on deep learning can match or even surpass human abilities to recognize patterns in images and videos.
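The relationship between images and video can be sketched with plain Python lists: a grayscale image as a grid of pixel intensities (0 to 255), and a video as a list of such frames played at a fixed frame rate. The tiny 2×2 frames and the frame rate below are hypothetical; real code would normally use an array or imaging library.

```python
# A grayscale image as rows of pixel intensities (0 = black, 255 = white).
frame_a = [[0, 64],
           [128, 255]]
frame_b = [[255, 255],
           [0, 0]]

# A video is a sequence of frames displayed at a given frequency.
video = [frame_a, frame_b]
FRAMES_PER_SECOND = 2  # hypothetical frame rate


def mean_brightness(image):
    """Average pixel intensity across the whole image."""
    pixels = [value for row in image for value in row]
    return sum(pixels) / len(pixels)


# Stopping at a specific frame yields a single image.
still = video[1]
duration_seconds = len(video) / FRAMES_PER_SECOND
brightness_per_frame = [mean_brightness(frame) for frame in video]
```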
8. Speech Data:
Any digital information containing speech or music that is stored on and played through a system is known as an audio file or sound file. Gathering and recording sounds, either for human enjoyment or for use in automated learning systems, is known as “audio data collection”. For example, voice recognition systems rely heavily on audio data, which has risen in prominence in recent years for use in AI and ML systems.
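Digitally, audio is typically stored as a sequence of amplitude samples taken at a fixed sample rate. The sketch below generates a short synthetic tone rather than loading a real recording; the sample rate, frequency, and duration are assumptions chosen for illustration.

```python
import math

SAMPLE_RATE_HZ = 8000   # samples per second (assumed for this example)
FREQUENCY_HZ = 440.0    # pitch of the synthetic tone
DURATION_S = 0.01       # length of the clip in seconds

num_samples = round(SAMPLE_RATE_HZ * DURATION_S)

# Each sample is the amplitude of the sound wave at one instant, in [-1.0, 1.0].
samples = [
    math.sin(2 * math.pi * FREQUENCY_HZ * i / SAMPLE_RATE_HZ)
    for i in range(num_samples)
]

peak_amplitude = max(abs(s) for s in samples)
clip_length_s = len(samples) / SAMPLE_RATE_HZ
```

A speech recording has the same shape as this list, only with far more samples and a much less regular waveform.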