Glossary


This glossary summarizes knowledge from the Competence Center Corporate Data Quality (CC CDQ) and provides key definitions for data-related terms. It consolidates the body of knowledge that has been developed by the CC CDQ research team in collaboration with data experts from 40+ Fortune 500 companies since 2006.


A

Analytical data

Analytical data is a particular subtype of enterprise data. It is derived from business operations and transaction data and is mainly used to meet standard reporting and analytics requirements by applying descriptive analytics.

Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
Full working report: Download



Related topics

Master data, Media data, Metadata, Reference data, Advanced analytical data, Observational data, Transactional data, Enterprise data

Analytical data product

Analytical data products are a particular subtype of data product. They include metrics, dashboards or reports that deliver key insights using basic analytics to support decision making. Examples include:

  • Metrics such as email open rate, which shows how many emails have been opened as a proportion of total emails sent
  • Dashboards such as tail-spend analysis, which shows the smallest vendors of a company and how much has been ordered from them
  • Reports such as carbon emission reporting, which shows how much CO2 has been produced and emitted by an organization

    Source: Hasan, M.R. and Legner, C. (2023) ‘Understanding Data Products: Motivation, Definition, And Categories’, in Proceedings of the Thirty-first European Conference on Information Systems (ECIS 2023). Kristiansand, Norway, pp. 1–17.
    Full paper: Download



    Related topics

    Advanced analytical data product, Basic data product, Data product

    Advanced analytical data

    Advanced analytical data is a particular subtype of enterprise data. It is created by applying data science methods that go beyond purely descriptive analytics (i.e., towards predictive/prescriptive analytics). It is used to identify patterns or correlations in complex (structured and unstructured) data sets, such as text, images, geospatial or sensor data.

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download



    Related topics

    Master data, Media data, Metadata, Reference data, Analytical data, Observational data, Transactional data, Enterprise data

    Advanced analytical data product

    Advanced analytical data products are a particular subtype of data product. They use sophisticated methods to create prescriptive acumen & self-learning capabilities. Examples include:

  • AI/ML models such as a sales forecast that predicts the volume of sales over a certain period of time based on relevant parameters
  • (Semi-)automated products such as a predictive maintenance suite that predicts possible upcoming machine problems and notifies the responsible person to conduct repairs

    Source: Hasan, M.R. and Legner, C. (2023) ‘Understanding Data Products: Motivation, Definition, And Categories’, in Proceedings of the Thirty-first European Conference on Information Systems (ECIS 2023). Kristiansand, Norway, pp. 1–17.
    Full paper: Download



    Related topics

    Analytical data product, Basic data product, Data product

    Artificial intelligence

    John McCarthy coined the term "artificial intelligence" (AI) at the renowned Dartmouth workshop in 1956, defining it as “the science and engineering of making intelligent machines”.

    Source: McCorduck, 2004

    B

    Basic data product

    Basic data products are a particular subtype of data product. They are ready-to-use datasets that give foundational insight into the domain(s) represented by the data. Examples include:

  • Customer master data that provides relevant data about customers such as address, location, name, etc.
  • Aggregated datasets that contain shop floor sensor data from different machines
  • Enriched datasets in which basic data from within the organization is enhanced with other data, mainly from external sources, to offer enriched insights

    Source: Hasan, M.R. and Legner, C. (2023) ‘Understanding Data Products: Motivation, Definition, And Categories’, in Proceedings of the Thirty-first European Conference on Information Systems (ECIS 2023). Kristiansand, Norway, pp. 1–17.
    Full paper: Download



    Related topics

    Advanced analytical data product, Analytical data product, Data product

    Big Data

    Big data are data that are so large and diverse that they require cost-effective, innovative forms of data collection, storage, management, analysis and visualization. Big data are typically characterized by three V's: velocity, volume and variety. Velocity is the speed at which the data are created and the speed at which they should be analyzed and used. Volume refers to the size of the data, which is typically in the range of terabytes to exabytes. Variety refers to the range of data types in scope, from more traditional structured sources (spreadsheets, SQL database tables) to semi-structured data (XML, JSON, Semantic Web data) and unstructured data (images, texts, files). In recent years, three more V's have been added to the traditional three-V framework to characterize big data: variability, veracity and value. Veracity refers to the differing levels of reliability and truthfulness of big data sources, while variability describes the high frequency of changes within a data source. Finally, value describes the fact that, while single data points may not be of high value, the value of big data comes from analyzing huge amounts of data and the trends within and between datasets.

    Source: Amir Gandomi, Murtaza Haider, Beyond the hype: Big data concepts, methods, and analytics, International Journal of Information Management, Volume 35, Issue 2, 2015, Pages 137-144, https://doi.org/10.1016/j.ijinfomgt.2014.10.007.


    Business Analytics

    Business analytics is defined as the exploration and investigation of past business data to gain valuable insights and drive business planning. These activities depend on a sufficient volume of data as well as on a sufficient level of data quality. This requires data managers to integrate and reconcile data across various sources (i.e. from various business units, divisions, departments, branches, and information systems), with the goal of compiling a complete picture of the company’s past and current state for deriving future scenarios.

    Source: Legner, Christine; Pentek, Tobias; Ofner, Martin; Labadie, Clément: CDQ Trend Study: Trends in corporate data management, 2017 (https://www.cdq.com/events-insights/publications/cdq-trend-study)


    Business Capabilities

    Business capabilities define a set of data-based skills, routines, and resources a company needs to have in order to achieve its business goals through data monetization. The Business Capability design area specifies what data-related business capabilities are required, which of these are already in place to some extent and need to be enhanced, and which ones need to be established from scratch.

    Source: Dissertation Tobias


    Business Engineering

    The method-oriented, model-based theory of construction for companies in the Information Age.

    Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)


    Business Object

    A Business Object represents a real or imagined object of value generation that can be used, changed or analyzed in business processes. It describes a recurring set of information used in multiple business contexts and in at least one data domain. It is specified by attributes.

    Source: Schmidt, Alexander (dissertation)


    Business rule

    A business rule is a statement that defines or constrains some aspect of the business. It is intended to assert business structure or to control or influence the behavior of the business. Business rules may be defined as business definitions for business use (to represent policies, practices and procedures), or defined as executable business rule statements for use in rule-driven systems, or both.

    Source: The Business Rules Group (BRG) and the Object Management Group (OMG)


    Business value

    Refers to the impact of data management on business with regard to financials, business processes, customers, and organizational growth.

    Source: Legner, Christine; Pentek, Tobias: Data Excellence Model: Short Description and Basic Terminology, 2017 (https://cc-wiki.cdq.com/Data_Excellence_Model)


    C

    CAD

    Computer-aided design, meaning designing a product with the help of information technology.

    Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)


    Cloud services

    Data-related services delivered via the Internet in an on-demand model.

    Source: Legner, Christine; Pentek, Tobias; Ofner, Martin; Labadie, Clément: CDQ Trend Study: Trends in corporate data management, 2017 (https://www.cdq.com/events-insights/publications/cdq-trend-study)


    Core business object

    The central actors (business partners, customers, suppliers and employees), products (incl. materials) and operating materials (systems, etc.) of a company and its ecosystem. These objects are represented as master data for purposes of IT.

    Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)


    D

    Data analyst

    The Data Analyst is a core data & analytics role in the CC CDQ Reference Model for Data & Analytics Governance. (S)he is responsible for the implementation (development and deployment) and maintenance of reports and ad hoc analyses.

    Source: Reference Model for Data & Analytics Governance


    Data applications

    In the CDQ Data Excellence Model, the data applications design area covers planning, implementing, and maintaining software that is designed to manage data and data products in order to achieve and maintain data excellence. The design area specifies the applications for managing (master) data, managing data quality, and cataloging/curating data.

    Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.


    Data architect

    The Data Architect is a core data & analytics role in the CC CDQ Reference Model for Data & Analytics Governance. (S)he is responsible for designing, creating, deploying and managing conceptual and logical data models and for the mapping to physical data models. (S)he is also accountable for the implementation and maintenance of data pipelines.

    Source: Reference Model for Data & Analytics Governance


    Data catalog

    A Data Catalog is an integrated platform for data curation that matches data supply and demand. It offers users functions to register data; to retrieve and use data; and to assess and analyze data. A Data Catalog should therefore provide a data inventory (for data supply) and features for data discovery (for data demand) as key components. Additional features should support data governance, data assessment, and data analytics, along with appropriate features for catalog administration and data collaboration.

    Source: Fadler, Martin; Korte, Tobias; Legner, Christine; Otto, Boris; Spiekermann, Markus: Data Catalogs: integrated platforms for matching data supply and demand, 2018


    Data citizen

    Data citizens represent employees who rely on data for their daily work but are not data specialists.

    Source: Reference Model for Data & Analytics Governance


    Data contract

    A data contract is a formal agreement between the producer(s) and consumer(s) of data products which guarantees their provision at a desired level of service in return for adhering to conditions that facilitate the products’ reliable usage.

    Source: Hasan, M.R. and Legner, C. (2024) ‘Improving Consumer-Provider Interaction With Data Products: Insights From Traditional Industries’, in Proceedings of the Thirty-second European Conference on Information Systems (ECIS 2024). Paphos, Cyprus, pp. 1–17.
    Full paper: Download



    Related topics

    Data product, Data product lifecycle

    Data democratization

    The enterprise’s capability to motivate and empower a wider range of employees—not just data experts—to understand, find, access, use, and share data in a secure and compliant way.

    Source: Lefebvre et al. (2021)


    Data excellence

    Data excellence is an umbrella term that defines properties of data, comprising data quality (defined as “fitness for purpose”) as well as additional dimensions, such as regulatory compliance, data security, or data privacy.

    Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.


    Data governance

    A company-wide framework that determines which decisions must be made and who should make them. This includes the definition of roles, responsibilities, obligations and rights in handling the company’s resource data. In this, data governance pursues the goal of maximizing the value of the data in the company. While data governance determines how decisions should be made, data management makes the actual decisions and implements them.

    Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)


    Data integration

    Data integration is the task of presenting a unified view of data owned by heterogeneous and distributed data sources. The need for data integration may stem from (1) technological heterogeneities (different database technologies), (2) schema heterogeneities (different data models and data representations) and (3) instance-level heterogeneities (conflicting values in different sources for the same data object). Data can be integrated physically or virtually; in the virtual case, the data remain in the source systems but are accessed through a uniform view.

    Source: Data and Information Quality (2016), Carlo Batini, Monica Scannapieco
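
    To make the idea of a virtual, uniform view over sources with different schemas more concrete, here is a minimal Python sketch. The two source systems, their attribute names, the mapping tables and the function `unified_view` are all hypothetical and serve only as an illustration; they are not taken from the cited source, and instance-level conflicts are not handled.

```python
from typing import Dict, Iterator, List

# Two hypothetical source systems that describe customers with different schemas.
crm_rows = [{"cust_id": "C1", "full_name": "Acme AG", "ctry": "CH"}]
erp_rows = [{"customer_no": "C2", "name_1": "Globex GmbH", "land": "DE"}]

# Per-source mapping from local attribute names to the unified schema.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "crm": {"cust_id": "id", "full_name": "name", "ctry": "country"},
    "erp": {"customer_no": "id", "name_1": "name", "land": "country"},
}
SOURCES: Dict[str, List[Dict[str, str]]] = {"crm": crm_rows, "erp": erp_rows}


def unified_view() -> Iterator[Dict[str, str]]:
    """Yield records in the unified schema; the data stay in the sources."""
    for source, rows in SOURCES.items():
        mapping = MAPPINGS[source]
        for row in rows:
            # Rename each local attribute to its unified counterpart.
            unified = {target: row[local] for local, target in mapping.items()}
            unified["source"] = source  # keep track of where the record lives
            yield unified


for record in unified_view():
    print(record)
```

    Because `unified_view` is a generator that reads the source rows on demand, nothing is copied into a central store, which corresponds to the virtual variant. A physical integration would instead materialize the mapped records in a separate target system.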


    Data lifecycle

    In the CDQ Data Excellence Model, the data lifecycle comprises all processes regarding the creation, acquisition, storage, maintenance, use, archiving, and deletion of data. For a given data object, it defines and documents the data sources, data supply chains, data consumers, and data use contexts.

    Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.


    Data literacy

    The continuous learning of core skills, knowledge, attitude and values required to interpret data in a critical manner, and derive meaningful and actionable business insights.




    Data management

    Data management aims at the efficient usage of data in companies. It makes decisions and executes measures that affect the company-wide handling of data (whereas data governance creates the framework for such through the definition of responsibilities and so forth). It comprises all tasks related to the data lifecycle on a strategic, governing, and technical level: the formulation of a data strategy, the definition of data management processes, standards, and measures, the assignment of roles and responsibilities, the description of the data lifecycle and architecture – covering data models and data modeling standards –, and the management of applications and systems.

    Source: Pentek, T., Legner, C. and Otto, B. 2017. 'Towards a Reference Model for Data Management in the Digital Economy'. In: Maedche, A., vom Brocke, J., Hevner, A. (eds.) Designing the Digital Transformation: DESRIST 2017 Research in Progress Proceedings of the 12th International Conference on Design Science Research in Information Systems and Technology. Karlsruhe, Germany. 30 May - 1 Jun. Karlsruhe: Karlsruher Institut für Technologie (KIT), pp. 51-66


    Data management capabilities

    In the CDQ Data Excellence Model, the data management capabilities design area defines a set of skills, routines, and resources a company needs to have in order to accomplish data excellence that results in business value.

    Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.


    Data owner

    The data owner is a core data & analytics role in the CC CDQ Reference Model for Data & Analytics Governance. Two different types of data owner are usually distinguished in practice: data definition owner and data content owner.

    The data definition owner is a decentralized data governance role which is typically assigned to senior business executives with global outreach (e.g. Global head of sales). (S)he is accountable for the data definition in specific areas of responsibility (e.g. a specific data domain like product or customer). Here, (s)he ensures that business requirements are fulfilled and data is compliantly accessed and used. Her/his tasks include collecting/defining data requirements and delegating the detailing of a data definition to a data steward.

    The data content owner is a decentralized data governance role which is assigned to local business executives/team leaders with operational responsibilities. (S)he is accountable for data creation and maintenance (Data lifecycle) according to the data definition for a specific area of responsibility. (S)he coordinates the creation and maintenance of data by data editors.

    Source: Reference Model for Data & Analytics Governance


    Data product

    A data product is a managed artifact which satisfies recurring information needs and creates value through transforming and packaging relevant data elements into consumable form. Data products have particular subtypes: Basic data product, Analytical data product, Advanced analytical data product. Firms build data products in order to enhance access and reuse of data, to improve its governance and ownership as well as to reduce the time-to-insight. Data products have five main characteristics:

  • Data products fulfill recurring information needs
  • Data products must have a well-defined consumer base
  • Data products generate tangible value that can be tracked and measured
  • Data products are built from data that comes from reliable sources
  • Data products must be delivered in a consumable form

    Source: Hasan, M.R. and Legner, C. (2023) ‘Understanding Data Products: Motivation, Definition, And Categories’, in Proceedings of the Thirty-first European Conference on Information Systems (ECIS 2023). Kristiansand, Norway, pp. 1–17.
    Full paper: Download
    Further reading: Data product research briefing



    Related topics

    Data product lifecycle, Data product portfolio management, Data product canvas, Data product owner, Data product manager

    Data product canvas

    The data product canvas is a visual inquiry tool that supports organizations in designing and documenting data products. It addresses three key dimensions in the design of data products: desirability (do consumers want it), feasibility (can we deliver it) and viability (is it worth it).

    Source: Hasan, M.R. and Legner, C., (2023) 'Data Product Canvas: A visual inquiry tool supporting data product design', in International Conference on Design Science Research in Information Systems and Technology (pp. 191-205). Cham: Springer Nature Switzerland.
    Full paper: Download
    Download template: Data product canvas template



    Related topics

    Data product, Basic data product, Analytical data product, Advanced analytical data product

    Data product lifecycle

    The data product lifecycle provides an actionable guideline on selection, creation and maintenance of data products in organizations. It consists of six phases: Ideation & qualification, Data sourcing, Development & testing, Deployment, Consumption & monitoring, Retirement.

    Source: Hasan, M.R. and Legner, C. (2024) ‘Improving Consumer-Provider Interaction With Data Products: Insights From Traditional Industries’, in Proceedings of the Thirty-second European Conference on Information Systems (ECIS 2024). Paphos, Cyprus, pp. 1–17.
    Full paper: Download



    Related topics

    Basic data product, Advanced analytical data product, Analytical data product, data products

    Data product manager

    A data product manager is a data role that is accountable for the creation, implementation and maintenance of data products. The data product manager closely oversees the data team consisting of various other roles such as data analyst, data scientist, data engineer and data architect.

    Source: Hasan, M.R. and Legner, C. (2024) ‘Improving Consumer-Provider Interaction With Data Products: Insights From Traditional Industries’, in Proceedings of the Thirty-second European Conference on Information Systems (ECIS 2024). Paphos, Cyprus, pp. 1–17.
    Full paper: Download



    Related topics

    Data product owner, data products

    Data product owner

    A data product owner is a data role that represents the business interests and is accountable for the specification of business requirements of data products. In many cases, the data product owner is also the sponsor of the data product and has a final say in its acceptance.

    Source: Hasan, M.R. and Legner, C. (2024) ‘Improving Consumer-Provider Interaction With Data Products: Insights From Traditional Industries’, in Proceedings of the Thirty-second European Conference on Information Systems (ECIS 2024). Paphos, Cyprus, pp. 1–17.
    Full paper: Download



    Related topics

    Data product manager, data products

    Data product portfolio management

    Data product portfolio management is the process of systematically selecting product ideas, continuously assessing and optimizing the portfolio to maximize its value over time through alignment with organizational goals.

    Source: Adapted from Eckert, T. and Hüsig, S., 2022. Innovation portfolio management: A systematic review and research agenda in regards to digital service innovations. Management Review Quarterly, 72(1), pp.187-230.


    Related topics

    Data product

    Data quality

    Data quality is a multi-dimensional, context-dependent concept that cannot be described and measured by a single characteristic, but rather by various data quality dimensions. The desired level of data quality is determined by the requirements of the business processes and functions that use the data, such as Purchasing, Sales or Reporting. A low level of data quality reduces the value of the company's data assets because their usability is minimal. Companies therefore strive to achieve the level of data quality required by the business strategy by means of data quality management.

    Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)


    Related topics

    Data quality dimensions, Data quality Key Performance Indicators, Data quality tool

    Data quality dimensions

    A data quality dimension is a measurable feature or characteristic of data. The most important dimensions along which data quality can be assessed are:

  • Correctness: factual agreement of the data with the properties of the real-world object that it represents.
  • Consistency: agreement of several versions of the data related to the same real objects, which are stored in various information systems.
  • Completeness: complete existence of all values or attributes of a record that are necessary.
  • Actuality: agreement of the data at all times with the current status of the real object and adjustment of the data in a timely manner as soon as the real object has been changed.
  • Availability: the ability of the data user to access the data at the desired point in time.

    Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)
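
    As an illustration of how a dimension such as completeness might be quantified, the following minimal Python sketch computes the share of records in which all required attributes are filled. The record layout, the attribute names and the helper `completeness` are hypothetical; they are one possible operationalization and are not taken from the cited source.

```python
from typing import Iterable, Mapping, Sequence


def completeness(records: Iterable[Mapping[str, object]],
                 required: Sequence[str]) -> float:
    """Return the share of records whose required attributes are all filled."""
    records = list(records)
    if not records:
        return 1.0  # assumption: an empty data set counts as complete
    complete = sum(
        all(rec.get(attr) not in (None, "") for attr in required)
        for rec in records
    )
    return complete / len(records)


# Hypothetical customer records used only for illustration.
customers = [
    {"name": "Acme AG", "city": "Zurich", "country": "CH"},
    {"name": "Globex", "city": "", "country": "DE"},        # city missing
    {"name": "Initech", "city": "Austin", "country": None},  # country missing
    {"name": "Umbrella", "city": "Lyon", "country": "FR"},
]

score = completeness(customers, required=["name", "city", "country"])
print(f"Completeness: {score:.0%}")
```

    Treating empty strings and `None` as missing is a design choice made for this sketch; which values count as missing depends on the business context.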


    Related topics

    Data quality Key Performance Indicators, Data quality, Data quality tool

    Data quality Key Performance Indicators

    A quantitative measure of data quality. A data quality measurement system measures the values for the quality of data at measurement points at a certain frequency of measurement. Data quality key performance indicators operationalize data quality dimensions. One example is the validation of a data element based on business rules.

    Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015


    Data quality management

    The mandate of Data Quality Management (DQM) is to analyze, improve and ensure the quality of the data. DQM generally differentiates between preventive and reactive measures. Preventive DQM measures target the avoidance of defects in the data with negative effects on the quality of the data. In contrast, reactive DQM measures target the discovery of existing defects in the data and their correction.

    Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)


    Related topics

    Data quality Key Performance Indicators, Data quality, Data quality dimensions, Data quality tool

    Data quality measurements

    Periodic examination of the data quality of the central records as part of DQM. For example, the data quality of the most important attributes could be measured based on defined business rules on a monthly basis. A record that does not fulfill all rules will be considered defective.

    Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)
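
    A rule-based measurement of this kind could be sketched as follows. This is a minimal Python illustration under stated assumptions: the three business rules, the supplier records and the function names `violated_rules` and `measure` are hypothetical and not taken from the cited source. A record counts as defective as soon as it violates at least one rule.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], bool]


def name_present(record: Record) -> bool:
    return bool(record.get("name", "").strip())


def country_is_two_letters(record: Record) -> bool:
    country = record.get("country", "")
    return len(country) == 2 and country.isalpha()


def postal_code_numeric(record: Record) -> bool:
    return record.get("postal_code", "").isdigit()


# Hypothetical business rules; each returns True when the record complies.
RULES: Dict[str, Rule] = {
    "name_present": name_present,
    "country_is_two_letters": country_is_two_letters,
    "postal_code_numeric": postal_code_numeric,
}


def violated_rules(record: Record) -> List[str]:
    """Return the names of all rules the record does not fulfill."""
    return [name for name, rule in RULES.items() if not rule(record)]


def measure(records: List[Record]) -> float:
    """Return the share of defect-free records (1.0 if there are none)."""
    if not records:
        return 1.0
    defective = sum(1 for r in records if violated_rules(r))
    return 1 - defective / len(records)


suppliers = [
    {"name": "Acme AG", "country": "CH", "postal_code": "8000"},
    {"name": "", "country": "DE", "postal_code": "10115"},
    {"name": "Globex", "country": "Germany", "postal_code": "80331"},
    {"name": "Initech", "country": "US", "postal_code": "73301"},
]

for supplier in suppliers:
    print(supplier["name"] or "<missing>", violated_rules(supplier))
print(f"Defect-free records: {measure(suppliers):.0%}")
```

    Running such a check at a fixed interval, for example monthly, and storing the resulting share per run would yield a simple time series for monitoring.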


    Related topics

    Data quality Key Performance Indicators, Data quality, Data quality dimensions, Data quality tool

    Data quality tool

    Data quality tools are solutions that help identify and fix data quality issues which affect the performance of other applications that use data to support decision making.

    Source: Barateiro, J. and Galhardas, H., (2005). A survey of data quality tools. Datenbank-Spektrum, 14(15-21), p.48.


    Related topics

    Data quality Key Performance Indicators, Data quality, Data quality dimensions

    Deep Learning

    Deep learning networks are neural networks with many layers. The layered network can process extensive amounts of data [and therefore] requires a great deal of computing power, which raises concerns about its economic and environmental sustainability.

    Source: Brown, 2021

    E

    Enterprise data

    Enterprise data describes all data that are created, maintained and used by enterprises. The enterprise data taxonomy developed in the Competence Center Corporate Data Quality distinguishes eight different categories of enterprise data and depicts their relationships: Master data, Transactional data, Observational data, Media data, Analytical data, Advanced analytical data, Metadata, and Reference data.

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download



    External data

    External data refers to any type of data that is captured, processed, and provided from outside the company. The major external data types include open, paid, shared and web data. External data can be used to complement internal data and help to improve advanced analysis, optimize business processes (e.g. with geolocation, weather, or traffic data), reduce internal data maintenance efforts (e.g. to enrich or validate internal data), and create new services. However, despite their increasing relevance, external data remain an untapped resource for most companies.

    Source: Krasikov, Pavel; Eurich, Markus; Legner, Christine: External Data CC CDQ Working Report, 2020


    F

    First time right

    A principle of preventive data quality management according to which data should be acquired by an information system as correctly as possible in order to avoid retroactive correction (at generally higher levels of expenditure).

    Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)


    G

    Generative AI

    Generative AI can be thought of as a machine learning model that is trained to create new data, rather than making a prediction about a specific dataset. A generative AI system is one that learns to generate more objects that look like the data it was trained on.

    Source: Zewe, 2023

    H

    I

    Internet of Things (IoT)

    The "Internet of Things" refers to the idea of an extended Internet that, in addition to classic computers and mobile devices, also integrates any physical objects into its infrastructure by means of sensors and actuators, thus turning them into providers or consumers of a wide variety of digital services.

    Source: Fleisch, E. & Thiesse, F. Enzyklopaedie der Wirtschaftsinformatik: https://www.enzyklopaedie-der-wirtschaftsinformatik.de/wi-enzyklopaedie/lexikon/technologien-methoden/Rechnernetz/Internet/Internet-der-Dinge


    J

    K

    L

    Linked open data

    Linked Open Data defines a vision of globally accessible and linked data on the internet based on the RDF standards of the semantic web. This structured web data is interlinked with other data and can be accessed through semantic queries. Linked open data is released under an open license, which does not impede its reuse for free.

    Source: W3C, Tim Berners-Lee


    M

    Machine learning

    Machine learning is a subfield of artificial intelligence and covers algorithms and techniques that allow machines to learn from data. Two main categories of machine learning techniques are supervised machine learning (SML) and unsupervised machine learning (USML).

    Source: Kalota, 2024

    Master data

    Master data is the most fundamental enterprise data subtype. Master data represent core business objects (e.g. customers, suppliers, or products) which are agreed upon and shared across the enterprise. They remain largely unaltered and are often referenced and reused in business documents and data analyses. They must be unambiguously identifiable and interpretable across the entire organization (i.e., across organizational departments, divisions, and units).

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download



    Related topics

    Advanced analytical data, Media data, Metadata, Reference data, Analytical data, Observational data, Transactional data, Enterprise data

    Master Data management

    All of the activities, methods and (IT) tools for modeling, managing and providing master data as well as its data quality management. The goal is to provide and ensure a company-wide truth about the core business objects (single source of truth) and thereby to support data users in various business processes throughout the company.

    Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)


    Media data

    Media data is a particular enterprise data subtype that represents documents, digital images, geospatial data, and multimedia (video/audio) files. Media data is mainly unstructured in nature.

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download



    Related topics

    Master data, Advanced analytical data, Metadata, Reference data, Analytical data, Observational data, Transactional data, Enterprise data

    Metadata

    Metadata is »data about data«. It is a particular enterprise data subtype that describes the properties of other data. Typically, metadata enables retrieval and maintenance of »data containers« (e.g., documents or files) by means of identifying, classifying or descriptive attributes. Metadata helps an organization understand its data and contributes to the ability to process, maintain, integrate, secure, audit, and govern it. Common metadata attributes include context (i.e. the environment in which the data lives), terminology (i.e. definitions and descriptions), administrative information (i.e. when the data was created and by whom) and governance (i.e. ownership and level of confidentiality).

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download
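
    The four attribute groups named above (context, terminology, administrative information, governance) can be pictured as fields of a small catalog record. The following Python sketch is only an illustration: the class MetadataRecord, its field names, the sample file name and all values are hypothetical and are not taken from the CC CDQ handbook.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MetadataRecord:
    """Hypothetical metadata record describing one data container."""
    source_system: str      # context: environment in which the data lives
    definition: str         # terminology: what the data means
    created_on: date        # administrative: when the data was created
    created_by: str         # administrative: who created it
    owner: str              # governance: who is accountable for the data
    confidentiality: str    # governance: level of confidentiality


# Illustrative catalog keyed by the name of the data container (a file here).
catalog = {
    "customer_master.csv": MetadataRecord(
        source_system="ERP",
        definition="One row per customer agreed upon across the enterprise",
        created_on=date(2021, 3, 1),
        created_by="data.steward",
        owner="Sales",
        confidentiality="confidential",
    ),
    "country_codes.csv": MetadataRecord(
        source_system="Reference data hub",
        definition="Two-letter country codes",
        created_on=date(2020, 1, 15),
        created_by="data.steward",
        owner="Data office",
        confidentiality="public",
    ),
}


def find_by_confidentiality(records: dict, level: str) -> list:
    """Retrieve data containers by a governance attribute of their metadata."""
    return sorted(name for name, meta in records.items()
                  if meta.confidentiality == level)


confidential_files = find_by_confidentiality(catalog, "confidential")
```

    The lookup illustrates the retrieval role of metadata: the files themselves are never opened, only their descriptive attributes are consulted.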



    Related topics

    Master data, Media data, Advanced analytical data, Reference data, Analytical data, Observational data, Transactional data, Enterprise data

    N

    Neural network

    Neural networks are a commonly used, specific class of machine learning algorithms. Artificial neural networks are modeled on the human brain, in which thousands or millions of processing nodes are interconnected and organized into layers. In an artificial neural network, cells, or nodes, are connected, with each cell processing inputs and producing an output that is sent to other neurons.

    Source: Brown, 2021
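
    As a rough illustration of nodes that process inputs and pass an output on to the next layer, the following Python sketch computes a forward pass through a tiny two-layer network. The weights, biases, layer sizes and the choice of a sigmoid activation are arbitrary assumptions made for this example; they do not come from the cited source, and no training is shown.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights, bias):
    """One node: weighted sum of its inputs plus a bias, then an activation."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


def layer_output(inputs, layer):
    """One layer: every node receives the same inputs and emits one output."""
    return [node_output(inputs, weights, bias) for weights, bias in layer]


# Each node is a (weights, bias) pair; the values are made up for illustration.
hidden_layer = [([0.5, -0.5], 0.0), ([1.0, 1.0], -1.0)]
output_layer = [([1.0, -1.0], 0.0)]

# The outputs of the hidden layer become the inputs of the output layer.
hidden = layer_output([1.0, 1.0], hidden_layer)
prediction = layer_output(hidden, output_layer)
```

    In a real network the weights and biases are not set by hand but learned from data, which is what places neural networks within machine learning.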

    O

    Observational data

    Observational data is a particular enterprise data subtype that captures experiences and behavior at a very detailed, fine-grained level. It is generated by humans or things. Observational data includes IoT/sensor data from connected devices (often in the form of data streams), web data generated by user activities on social media platforms or commercial websites, as well as survey data from questionnaires.

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download



    Related topics

    Master data, Media data, Metadata, Reference data, Analytical data, Advanced analytical data, Transactional data, Enterprise data

    Open data

    Open data can be defined as “data that is freely available, and can be used as well as republished by everyone without restrictions from copyright or patents”. As a specific type of external data, open data holds great business potential and is expected to fuel advanced analytics, optimize business processes, enrich data management, or even enable new services.

    Source: Krasikov, P., Legner, C., & Eurich, M. (2021). Sourcing the Right Open Data: A Design Science Research Approach for the Enterprise Context.

    Braunschweig, K., Eberius, J., Thiele, M., & Lehner, W. (2012). The State of Open Data. Limits of Current Open Data Platforms.


    P

    Paid data

    Paid data, also known as commercially available data, refers to the datasets available directly from specialized data providers (or brokers) and data marketplaces, and offered at a certain cost. It is a specific type of external data and is typically coupled with specific services which facilitate its use, such as identification and classification of data by categories, description of the intended use, metadata documentation, and integration services.

    Source: Krasikov, Pavel; Eurich, Markus; Legner, Christine: External Data CC CDQ Working Report, 2020


    People, roles and responsibilities

    In the CDQ Data Excellence Model, the people, roles, and responsibilities design area defines the culture, organization, roles, boards, and interactions for data management. As data is generated, managed, and used in many different parts of an organization, a dedicated data management organization supports the orchestration and alignment of enterprise-wide data management activities. Consequently, data can only be managed consistently if ownership and responsibilities are assigned, the people who hold them are trained, and all employees have a data-driven mindset.

    Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.


    Performance management

    In the CDQ Data Excellence Model, the performance management design area defines how to plan, implement, and control all activities for measuring, assessing, improving, and ensuring data management performance, data excellence, and business value.

    Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.


    Personal data

    From a regulatory perspective, personal data can be defined as “data enabling direct or indirect identification of a single physical person, data that is specific to a single physical person without enabling identification, data that can be linked to a physical person, data regarding which anonymization techniques cannot completely mitigate the risk of re-identification” (Debet et al. 2015). From a practical perspective, most companies collect personal data about their customers, employees, suppliers and vendors. Customer data is typically a particular area of concern; it can be defined as “a set of data that represents and is associated with the identity, activities and service offering associated with a unique individual” (Tapsell et al. 2018).

    Source: Debet, A., Massot, J., Metallinos, N., Danis-Fantôme, A., Lesobre, O.: Informatique et libertés. La protection des données à caractère personnel en droit français et européen (2015). Tapsell, J., Akram, R.N., Markantonakis, K: Consumer-Centric Data Control, Tracking and Transparency (2018).


    Processes and methods

    In the CDQ Data Excellence Model, the processes and methods design area defines relevant data management procedures on a strategic, governance, and operational level and specifies which tasks are to be executed by whom and in what order.

    Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.


    Q

    R

    Reference data

    Reference data is a particular enterprise data subtype used to characterize, categorize, validate or constrain other data. The most basic reference data are codes or key-value lists, but they can also be more complex and incorporate hierarchies or vocabularies. Reference data can be defined and created internally (e.g., customer classifications, product groups) or received from external sources (e.g., country or currency codes defined by ISO standards, product classifications defined by e-commerce standards).

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download
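
    How reference data validates or constrains other data can be sketched with two small code lists, one external and one internal. The Python example below is a minimal illustration under stated assumptions: the record layout, the function validate_customer and the customer classes are hypothetical, and the country list is only a three-code excerpt in the style of ISO 3166-1 alpha-2.

```python
# External reference data: excerpt of two-letter country codes (ISO 3166-1 style).
COUNTRY_CODES = {"CH", "DE", "FR"}

# Internal reference data: a hypothetical customer classification as a key-value list.
CUSTOMER_CLASSES = {"A": "Key account", "B": "Standard", "C": "Occasional"}


def validate_customer(record: dict) -> list:
    """Return the list of reference data violations found in one record."""
    errors = []
    if record.get("country") not in COUNTRY_CODES:
        errors.append("unknown country code: %r" % record.get("country"))
    if record.get("customer_class") not in CUSTOMER_CLASSES:
        errors.append("unknown customer class: %r" % record.get("customer_class"))
    return errors


valid_record = {"name": "Example AG", "country": "CH", "customer_class": "A"}
invalid_record = {"name": "Sample GmbH", "country": "XX", "customer_class": "Z"}

valid_errors = validate_customer(valid_record)
invalid_errors = validate_customer(invalid_record)
```

    Keeping the permitted values in one shared list, rather than hard-coding them in each application, is what allows the same codes to be interpreted consistently across systems.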



    Related topics

    Master data, Media data, Metadata, Advanced analytical data, Analytical data, Observational data, Transactional data, Enterprise data

    Regulation

    A regulation is a document written in natural language containing a set of guidelines specifying constraints and preferences pertaining to the desired structure and behavior of an enterprise. Examples of regulations are a law (e.g., the General Data Protection Regulation - GDPR), a standardization document, a contract, etc. A regulation specifies the domain elements it applies to and oftentimes has implications for data management.

    Source: El Kharbili, M.: Business Process Regulatory Compliance Management Solution Frameworks: A Comparative Evaluation (2012).


    Regulatory compliance management (RCM)

    Regulatory Compliance Management (RCM) is the problem of ensuring that enterprises (data, processes, organization, etc.) are structured and behave in accordance with the regulations that apply, i.e., with the guidelines specified in the regulations.

    Source: El Kharbili, M.: Business Process Regulatory Compliance Management Solution Frameworks: A Comparative Evaluation (2012).


    Regulatory guideline

    A regulatory guideline specifies the expected behavior and structure of enterprise domain elements. It additionally defines tolerated and non-tolerated deviations from the ideal behavior and structure, and also defines the possible exceptional cases. A regulation may also specify how the enterprise ought to or may react to deviations from ideal behavior and structure.

    Source: El Kharbili, M.: Business Process Regulatory Compliance Management Solution Frameworks: A Comparative Evaluation (2012).


    S

    Shared data

    Shared data refers to external data which is shared between companies within dedicated business ecosystems. Examples of sharing and exchange environments include the Global Data Synchronization Network (GDSN) provided by GS1 and the CDQ Data Sharing Community.

    Source: Krasikov, Pavel; Eurich, Markus; Legner, Christine: External Data CC CDQ Working Report, 2020




    Social media data

    Web data refers to the data made available on the Web (e.g., online sources, websites) and also shared by users (e.g., user-generated content, reactions, comments) of social media platforms, including the metadata (e.g. location, time, language, biographical data). Web data is one of the subtypes of external data.

    Source: Krasikov, Pavel; Eurich, Markus; Legner, Christine: External Data CC CDQ Working Report, 2021


    T

    Transactional data

    Transactional data is a particular enterprise data subtype that is created by business processes and documents key business events or the result of a business activity. Transactional data often references master data, but in contrast to master data, it naturally changes during its lifecycle (e.g. status changes). Furthermore, the volume of transactional data (e.g. number of sales orders) increases with ongoing business activity. Examples are sales or purchase orders, invoices, delivery notes or incidents.

    Source: Martin, F., Walter, V., Hasan, M.R., Legner, C. (2021): CC CDQ Data Quality Handbook
    Full working report: Download
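
    The relationship described above, in which transactional records reference master data, change status over their lifecycle and grow in number, can be sketched as follows. The Python classes Customer and SalesOrder, the identifiers and the status values are hypothetical and serve only to illustrate the distinction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Master data: a core business object that remains largely unaltered."""
    customer_id: str
    name: str


@dataclass
class SalesOrder:
    """Transactional data: documents a business event and references master data."""
    order_id: str
    customer_id: str          # reference to the customer master record
    status: str = "created"   # changes during the lifecycle of the order


customers = {"C-001": Customer("C-001", "Example AG")}
orders = []


def place_order(order_id: str, customer_id: str) -> SalesOrder:
    """Create a sales order only if the referenced master record exists."""
    if customer_id not in customers:
        raise KeyError("unknown customer: " + customer_id)
    order = SalesOrder(order_id, customer_id)
    orders.append(order)
    return order


first_order = place_order("SO-1", "C-001")
first_order.status = "shipped"   # status change during the lifecycle
place_order("SO-2", "C-001")     # volume grows with ongoing business activity
```

    After two business events there are two order records but still a single, unchanged customer record, which mirrors the contrast between the two data types.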



    Related topics

    Master data, Media data, Metadata, Reference data, Analytical data, Observational data, Advanced analytical data, Enterprise data

    U

    V

    W

    X

    Y

    Z