Data Science Glossary

The most important terms

The Data Science Glossary



Image recognition is performed by software that relies on data mining algorithms. The software distinguishes between the elements in an image and then classifies them. The underlying algorithms include, for example, neural networks. Image recognition is used for:

  • Facial recognition
  • Apps that identify plant, animal or fungus species from pictures
  • Conversion of scanned handwriting into digital text

Image recognition makes the information contained in images available for use and further processing. In companies, for example, image recognition can automate quality assurance by detecting cracks or damage in products.
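The quality assurance idea can be illustrated with a deliberately simple sketch. It does not use a neural network; it swaps in a plain brightness threshold as a stand-in, assuming that damage shows up as dark pixels in a grayscale scan. The function names, the threshold values and the tiny 4x4 images are hypothetical and only serve the illustration.

```python
def find_dark_pixels(image, threshold=50):
    """Return (row, column) positions whose brightness is below the threshold."""
    return [
        (r, c)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value < threshold
    ]


def classify_part(image, threshold=50, max_dark_pixels=2):
    """Label a part 'defective' if it has more dark pixels than allowed."""
    dark = find_dark_pixels(image, threshold)
    return "defective" if len(dark) > max_dark_pixels else "ok"


# Hypothetical 4x4 grayscale scans (0 = black, 255 = white)
clean_part = [
    [200, 210, 205, 198],
    [200, 210, 205, 198],
    [200, 210, 205, 198],
    [200, 210, 205, 198],
]
cracked_part = [
    [200, 210, 205, 198],
    [200, 30, 205, 198],
    [200, 210, 25, 198],
    [200, 210, 205, 20],
]

print(classify_part(clean_part))
print(classify_part(cracked_part))
```

A production system would typically replace the threshold rule with a trained model, but the two steps stay the same: first locate the relevant elements in the image, then classify.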


Clustering is a form of unsupervised learning. An algorithm segments the data and divides it into groups (clusters) based on similar characteristics. Each data point is a single member of its cluster. Clustering methods include, for example:

  • K-means clustering
  • Hierarchical clustering
  • Density-Based Spatial Clustering of Applications with Noise (DBSCAN)

Cluster analysis helps companies identify patterns in data sets. The processing and segmentation of data sets enables diverse use cases. For example, it can be used to sort customer groups in order to adapt offers specifically to demand or to optimise work processes through recognised patterns.
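As a minimal sketch of how such a segmentation could work, the following example implements plain k-means on a handful of customers. The customer data, the two features and the function name `k_means` are assumptions made up for illustration; in practice a library implementation would normally be used.

```python
import math
import random


def k_means(points, k, iterations=20, seed=0):
    """Group 2-D points into k clusters with plain k-means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical customers: (orders per year, average basket value in EUR)
customers = [(2, 20), (3, 25), (1, 18), (20, 210), (22, 190), (19, 205)]
labels, centroids = k_means(customers, k=2)
print(labels)
print(centroids)
```

The three occasional buyers and the three frequent, high-value buyers end up in separate clusters, which is the kind of customer grouping described above. Which cluster receives the label 0 or 1 depends on the random starting points.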


Data mining generally describes the processing and examination of large sets of data or information for hidden patterns, insights and structures. Processes and methods from various fields are used for this purpose:

  • Machine learning
  • Statistics
  • Database systems

Data mining is an important step in data processing and helps to gather useful information for the analysis of data sets. It is of interest to companies because it reveals hidden potential, trends and insights in existing data and can also uncover previously unknown connections within the data.
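One simple way to surface such connections is to count which items occur together. The sketch below does this for shopping baskets; the baskets, the function name `frequent_pairs` and the `min_support` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support transactions."""
    pair_counts = Counter()
    for basket in transactions:
        # Count each unordered pair once per transaction.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Hypothetical shopping baskets, invented for illustration
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["butter", "jam"],
]
pairs = frequent_pairs(baskets, min_support=2)
print(pairs)
```

Here the pairs bread/butter and butter/jam each appear in two baskets, while the other combinations appear only once and are filtered out.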

Related topics:
Text mining, web mining, image recognition

Descriptive analytics is data analysis that uses existing data, including real-time data, as a basis to answer specific questions about what has happened. It is characterised by traditional business intelligence and visualisations such as:

  • pie charts
  • bar charts
  • tables

The graphical representation of results helps to present complex data in an appealing and easy-to-understand way.

Descriptive analytics uses data aggregation and data mining to gain insights into the past. The aforementioned methods make it possible to extract correlations between individual data sets that would otherwise remain hidden.
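Data aggregation of this kind can be sketched in a few lines. The example below summarises past revenue per region into a small table; the sales records and the function name `summarise_by_region` are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical past sales records: (region, month, revenue in EUR)
sales = [
    ("North", "Jan", 1200.0),
    ("North", "Feb", 1500.0),
    ("South", "Jan", 800.0),
    ("South", "Feb", 1000.0),
]


def summarise_by_region(records):
    """Aggregate revenue per region into total and average."""
    by_region = defaultdict(list)
    for region, _month, revenue in records:
        by_region[region].append(revenue)
    return {
        region: {"total": sum(values), "average": mean(values)}
        for region, values in by_region.items()
    }


summary = summarise_by_region(sales)
for region, stats in summary.items():
    print(f"{region:<6} total={stats['total']:>8.2f} average={stats['average']:>8.2f}")
```

The resulting totals and averages are the kind of figures that would then feed a table or bar chart.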


Industry 4.0 refers to the trend in digitalisation towards automation and data exchange in manufacturing technology. Also referred to as the fourth industrial revolution, it involves algorithm-based mechanisms in cyber-physical systems. These systems network physical machines with software components in order to optimise manufacturing processes.

Examples are:

  • automated greenhouses
  • environmental monitoring
  • autonomous vehicle systems

Companies benefit in many ways from Industry 4.0 measures:

  • increasing efficiency in production
  • reducing production costs
  • increasing flexibility
  • monitoring more efficiently


Open data is the idea that certain data sets should be freely accessible to the public. This means that the data can be used and republished without restrictions from patents or copyright.

Examples of open data sources are:

  • Human Genome Project
  • Dataverse network
  • Open government data (e.g. GovData in Germany)

Companies can use free data portals to optimise their own processes and drive innovation. Conversely, companies can also make their own data sets available as open data. This increases transparency and, through the participation of third parties, can provide ideas for the further development of processes and products.

Stefanie Supper
