Glossary
This glossary summarizes knowledge from the Competence Center Corporate Data Quality (CC CDQ) and provides key definitions for data-related terms. It consolidates the body of knowledge that has been developed by the CC CDQ research team in collaboration with data experts from 40+ Fortune 500 companies since 2006.
A
Analytical data is a particular subtype of enterprise data. It is derived from business operations and transactional data and is mainly used to meet standard reporting and analytics requirements by applying descriptive analytics.
Related topics
Master data, Media data, Metadata, Reference data, Advanced analytical data, Observational data, Transactional data, Enterprise data
Analytical data products are a particular subtype of data product. They are created using descriptive analytics and deliver key insights to support decision making. Examples include:
Related topics
Advanced analytical data product, Basic data product, Data product
Advanced analytical data is a particular subtype of enterprise data. It is created by applying methods of data science that go beyond purely descriptive analytics (i.e., towards predictive/prescriptive analytics). It is used to identify patterns or correlations in complex (i.e., structured and unstructured) datasets, such as text, images, geospatial or sensor data.
Related topics
Master data, Media data, Metadata, Reference data, Analytical data, Observational data, Transactional data, Enterprise data
Advanced analytical data products are a particular subtype of data product. They use machine learning methods to create predictive and prescriptive knowledge and improve self-learning capabilities. Examples include:
The term "Artificial intelligence" (AI) was first coined by John McCarthy at the renowned Dartmouth workshop in 1956, defining it as “the science and engineering of making intelligent machines”.
Source: McCorduck, 2004
B
Basic data products are a particular subtype of data product. They are ready-to-use datasets giving foundational insights of the domain(s) represented by the data. Examples include:
Related topics
Advanced analytical data product, Analytical data product, Data product
Big data are data that are so large and diverse that they require cost-effective, innovative forms of data collection, storage, management, analysis and visualization. Big data are typically characterized by three V's: volume, velocity and variety. Velocity is the speed at which the data are created and the speed at which they should be analyzed and used. Volume refers to the size of the data, which is typically in the range of terabytes to exabytes. Variety refers to the range of data types in scope, from traditional structured sources (spreadsheets, SQL database tables) to semi-structured data (XML, JSON, Semantic Web data) and unstructured data (images, texts, files). In recent years, three more V's have been added to the traditional three V's framework to characterize big data: variability, veracity and value. Veracity refers to the differing levels of reliability and truthfulness of big data sources, while variability describes the high frequency of changes within a data source. Finally, value describes the fact that, while single data points may not be of high value, value comes from analyzing huge amounts of data and the trends within and between datasets.
Source: Gandomi, Amir; Haider, Murtaza: Beyond the hype: Big data concepts, methods, and analytics, International Journal of Information Management, Volume 35, Issue 2, 2015, pp. 137-144 (https://doi.org/10.1016/j.ijinfomgt.2014.10.007)
Business analytics is defined as the exploration and investigation of past business data to gain valuable insights and drive business planning. These activities depend on a sufficient volume of data as well as on a sufficient level of data quality. This requires data managers to integrate and reconcile data across various sources (i.e. from various business units, divisions, departments, branches, and information systems), with the goal of compiling a complete picture of the company’s past and current state for deriving future scenarios.
Source: Legner, Christine; Pentek, Tobias; Ofner, Martin; Labadie, Clément: CDQ Trend Study: Trends in corporate data management, 2017 (https://www.cdq.com/events-insights/publications/cdq-trend-study)
Business capabilities define a set of data-based skills, routines, and resources a company needs to have in order to achieve its business goals through data monetization. The Business Capability design area specifies what data-related business capabilities are required, which of these are already in place to some extent and need to be enhanced, and which ones need to be established from scratch.
Source: Dissertation Tobias
Business engineering is the method-oriented, model-based theory of construction for companies in the Information Age.
Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)
A business object represents a real or imagined object of value generation that can be used, changed or analyzed in business processes. It describes a recurring set of information used in multiple business contexts and in at least one data domain. It is specified by attributes.
Source: Schmidt, Alexander (dissertation)
A business rule is a statement that defines or constrains some aspect of the business. It is intended to assert business structure or to control or influence the behavior of the business. Business rules may be defined as business definitions for business use (to represent policies, practices and procedures), or defined as executable business rule statements for use in rule-driven systems, or both.
Source: The Business Rules Group (BRG) and the Object Management Group (OMG)
Business value refers to the impact of data management on business with regard to financials, business processes, customers, and organizational growth.
Source: Legner, Christine; Pentek, Tobias: Data Excellence Model: Short Description and Basic Terminology, 2017 (https://cc-wiki.cdq.com/Data_Excellence_Model)
C
CAD (computer-aided design) means designing a product with the help of information technology.
Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)
Cloud services are data-related services delivered via the Internet in an on-demand model.
Source: Legner, Christine; Pentek, Tobias; Ofner, Martin; Labadie, Clément: CDQ Trend Study: Trends in corporate data management, 2017 (https://www.cdq.com/events-insights/publications/cdq-trend-study)
Core business objects are the central actors (business partners, customers, suppliers and employees), products (incl. materials) and operating materials (systems, etc.) of a company and its ecosystem. These objects are represented as master data for IT purposes.
Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)
D
The Data Analyst is a core data & analytics role in the CC CDQ Reference Model for Data & Analytics Governance. (S)he is responsible for the implementation (development and deployment) and maintenance of reports and ad hoc analyses.
Source: Reference Model for Data & Analytics Governance
In the CDQ Data Excellence Model, the data applications design area covers planning, implementing, and maintaining software that is designed to manage data and data products in order to achieve and maintain data excellence. The design area specifies the applications for managing (master) data, managing data quality, and cataloging/curating data.
Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.
The Data Architect is a core data & analytics role in the CC CDQ Reference Model for Data & Analytics Governance. (S)he is responsible for designing, creating, deploying and managing conceptual and logical data models and for the mapping to physical data models. (S)he is also accountable for the implementation and maintenance of data pipelines.
Source: Reference Model for Data & Analytics Governance
A Data Catalog is an integrated platform for data curation, matching data supply and demand. It offers users functions to register data; to retrieve and use data; and to assess and analyze data. A Data Catalog therefore should provide a data inventory (for data supply) and features for data discovery (for data demand) as key components. Additional features should support data governance, data assessment, and data analytics, along with appropriate features for catalog administration and data collaboration.
Source: Fadler, Martin; Korte, Tobias; Legner, Christine; Otto, Boris; Spiekermann, Markus: Data Catalogs: integrated platforms for matching data supply and demand, 2018
Data citizens represent employees who rely on data for their daily work but are not data specialists.
Source: Reference Model for Data & Analytics Governance
A data contract is a formal agreement between the producer(s) and consumer(s) of data products which guarantees their provision at a desired level of service in return for adhering to conditions that facilitate the products’ reliable usage. A data contract may contain metadata on the following: Structural, Administrative, Data quality, Ownership, Pipeline, Service level agreements, Licensing, Pricing, Access.
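As a minimal sketch, the metadata categories listed above could be captured in a simple record type. The class name `DataContract`, its field names, the helper `missing_sections`, and all example values are hypothetical illustrations and not part of any CC CDQ specification.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class DataContract:
    """Hypothetical container mirroring the metadata categories of a data contract."""
    structural: dict = field(default_factory=dict)       # e.g. schema, field types
    administrative: dict = field(default_factory=dict)   # e.g. version, validity period
    data_quality: dict = field(default_factory=dict)
    ownership: dict = field(default_factory=dict)
    pipeline: dict = field(default_factory=dict)
    service_level_agreements: dict = field(default_factory=dict)
    licensing: dict = field(default_factory=dict)
    pricing: dict = field(default_factory=dict)
    access: dict = field(default_factory=dict)

    def missing_sections(self) -> list[str]:
        """Return the names of metadata categories that are still empty."""
        return [name for name, value in asdict(self).items() if not value]


# Example values are invented for illustration only.
contract = DataContract(
    structural={"fields": {"customer_id": "string", "country": "string"}},
    ownership={"producer": "sales-data-team"},
    service_level_agreements={"refresh": "daily"},
)
print(contract.missing_sections())
```

A check like `missing_sections` is one possible way for producers and consumers to see which parts of the agreement have not been filled in yet.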
Data democratization is the enterprise’s capability to motivate and empower a wider range of employees (not just data experts) to understand, find, access, use, and share data in a secure and compliant way.
Source: Lefebvre et al. (2021)
Data excellence is an umbrella term that defines properties of data, comprising data quality (defined as “fitness for purpose”) as well as additional dimensions, such as regulatory compliance, data security, or data privacy.
Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.
Data governance is a company-wide framework that determines which decisions must be made and who should make them. This includes the definition of roles, responsibilities, obligations and rights in handling data as a company resource. In this, data governance pursues the goal of maximizing the value of the data in the company. While data governance determines how decisions should be made, data management makes the actual decisions and implements them.
Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)
Data integration is the task of presenting a unified view of data owned by heterogeneous and distributed data sources. The need for data integration may stem from (1) technological heterogeneities (different database technologies), (2) schema heterogeneities (different data models and data representations) and (3) instance-level heterogeneities (conflicting values in different sources for the same data object). Data can be integrated physically or virtually; in the virtual case, the data remain in the source systems but are accessed through a uniform view.
Source: Data and Information Quality (2016), Carlo Batini, Monica Scannapieco
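The following minimal sketch illustrates virtual integration over two sources with different schemas, plus a check for instance-level conflicts. The source systems, attribute names, mappings, and function names are all hypothetical; the data stay in their source lists and are only mapped to a common schema when the view is read.

```python
# Two hypothetical source systems with different schemas (schema heterogeneity).
crm_rows = [{"cust_id": "C1", "cust_name": "ACME AG", "ctry": "CH"}]
erp_rows = [{"id": "C1", "name": "ACME AG", "country": "DE"}]

# Mappings from each source schema to the unified schema.
CRM_MAPPING = {"cust_id": "id", "cust_name": "name", "ctry": "country"}
ERP_MAPPING = {"id": "id", "name": "name", "country": "country"}


def to_unified(row: dict, mapping: dict) -> dict:
    """Rename the attributes of a source row to the unified schema."""
    return {mapping[key]: value for key, value in row.items()}


def unified_view():
    """Yield rows from all sources in the unified schema, tagged with their origin."""
    sources = (("crm", crm_rows, CRM_MAPPING), ("erp", erp_rows, ERP_MAPPING))
    for source, rows, mapping in sources:
        for row in rows:
            yield {"source": source, **to_unified(row, mapping)}


def conflicts(view) -> list[tuple]:
    """Report attributes whose values differ between sources for the same id."""
    first_seen = {}
    found = []
    for row in view:
        first = first_seen.setdefault(row["id"], row)
        for attr in ("name", "country"):
            if first[attr] != row[attr]:
                found.append((row["id"], attr, first[attr], row[attr]))
    return found


print(conflicts(unified_view()))
```

Here the two systems disagree on the country of customer `C1`, which is an example of an instance-level heterogeneity that the unified view makes visible.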
In the CDQ Data Excellence Model, the data lifecycle comprises all processes regarding the creation, acquisition, storage, maintenance, use, archiving, and deletion of data. For a given data object, it defines and documents the data sources, data supply chains, data consumers, and data use contexts.
Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.
Data literacy is the continuous learning of core skills, knowledge, attitude and values required to interpret data in a critical manner, and derive meaningful and actionable business insights.
Data management aims at the efficient usage of data in companies. It makes decisions and executes measures that affect the company-wide handling of data (whereas data governance creates the framework for such through the definition of responsibilities and so forth). It comprises all tasks related to the data lifecycle on a strategic, governing, and technical level: the formulation of a data strategy, the definition of data management processes, standards, and measures, the assignment of roles and responsibilities, the description of the data lifecycle and architecture – covering data models and data modeling standards –, and the management of applications and systems.
Source: Pentek, T., Legner, C. and Otto, B. 2017. 'Towards a Reference Model for Data Management in the Digital Economy'. In: Maedche, A., vom Brocke, J., Hevner, A. (eds.) Designing the Digital Transformation: DESRIST 2017 Research in Progress Proceedings of the 12th International Conference on Design Science Research in Information Systems and Technology. Karlsruhe, Germany. 30 May - 1 Jun. Karlsruhe: Karlsruher Institut für Technologie (KIT), pp. 51-66
In the CDQ Data Excellence Model, the data management capabilities design area defines a set of skills, routines, and resources a company needs to have in order to accomplish data excellence that results in business value.
Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.
The data owner is a core data & analytics role in the CC CDQ Reference Model for Data & Analytics Governance. Two different role types of data owner are usually distinguished in practice: data definition owner and data content owner.
The data definition owner is a decentralized data governance role which is typically assigned to senior business executives with global outreach (e.g. Global head of sales). (S)he is accountable for the data definition in specific areas of responsibility (e.g. a specific data domain like product or customer). Here, (s)he ensures that business requirements are fulfilled and data is compliantly accessed and used. Her/his tasks include collecting/defining data requirements and delegating the detailing of a data definition to a data steward.
The data content owner is a decentralized data governance role which is assigned to local business executives/team leaders with operational responsibilities. (S)he is accountable for data creation and maintenance (data lifecycle) according to the data definition for a specific area of responsibility. (S)he coordinates the creation and maintenance of data by data editors.
Source: Reference Model for Data & Analytics Governance
A data product is a managed artifact which satisfies recurring information needs and creates value through transforming and packaging relevant data elements into consumable form. Data products have particular subtypes: Basic data product, Analytical data product, Advanced analytical data product. Firms build data products in order to enhance access and reuse of data, to improve its governance and ownership as well as to reduce the time-to-insight. Data products have five main characteristics:
Source: Hasan, M.R. and Legner, C. (2023) ‘Understanding Data Products: Motivation, Definition, And Categories’, in Proceedings of the Thirty-first European Conference on Information Systems (ECIS2023). Kristiansand, Norway, pp. 1–17.
Further reading: CC CDQ Research Briefing - Data product
Related topics
Data product lifecycle, Data product portfolio management, Data product canvas, Data product owner, Data product manager
The Data product canvas is a visual inquiry tool that supports organizations in designing and documenting data products. It addresses three key dimensions in the design of data products: desirability (do consumers want it), feasibility (can we deliver it) and viability (is it worth it).
Source: 'Data Product Canvas: A visual inquiry tool supporting data product design', in International Conference on Design Science Research in Information Systems and Technology (pp. 191-205). Cham: Springer Nature Switzerland
Download template: Data product canvas template
Related topics
Data product, Basic data product, Analytical data product, Advanced analytical data product
The data product lifecycle is an end-to-end approach which oversees the evolution of data products from cradle to grave within organizations. It consists of six phases: Ideation & qualification, Data sourcing, Development & testing, Deployment, Consumption & monitoring, Retirement.
Related topics
Basic data product, Advanced analytical data product, Analytical data product, Data products, Data product manager, Data product owner, Data product portfolio management
A data product manager is a data role that is accountable for the creation, implementation and maintenance of data products. The data product manager closely collaborates with the data team, which consists of various other roles such as data owner, data analyst, data scientist, data engineer and data architect. The data product manager works closely with the data product owner to ensure that data products remain fully functional and relevant for the business throughout their lifecycle.
A data product owner is a data role that represents the business interests and is accountable for the specification of business requirements of data products. In many cases, the data product owner is also the sponsor of the data product and has a final say in its acceptance. The data product owner collaborates with the data product manager throughout the data product lifecycle to ensure that business requirements are addressed while creating the data products.
Data product portfolio management is the process of systematically selecting product ideas, continuously assessing and optimizing the portfolio to maximize its value over time through alignment with organizational goals. It mainly consists of three phases: selection, monitoring and optimization. Data product portfolio management allows organizations to create transparency of all their data products, efficiently allocate resources to the right products, manage intricate dependencies between data products and ensure fitness to strategic, technical and regulatory requirements.
Source: Adapted from Eckert, T. and Hüsig, S., 2022. Innovation portfolio management: A systematic review and research agenda in regards to digital service innovations. Management Review Quarterly, 72(1), pp.187-230.
Data quality is a multi-dimensional, context-dependent concept that cannot be described and measured by a single characteristic, but rather by various data quality dimensions. The desired level of data quality is thereby oriented toward the requirements of the business processes and functions that use the data, such as Purchasing, Sales or Reporting. A low level of data quality reduces the value of the data assets in the company, because their usability is minimal. Companies therefore strive to achieve the quality of data required by the business strategy by means of data quality management.
Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)
Related topics
Data quality dimensions, Data quality Key Performance Indicators (KPIs), Data quality tool
A data quality dimension is a measurable feature or characteristic of data. The most important dimensions whose data quality can be assessed are:
- Correctness: factual agreement of the data with the properties of the real world object that it represents.
- Consistency: agreement of several versions of the data related to the same real objects, which are stored in various information systems.
- Completeness: complete existence of all values or attributes of a record that are necessary.
- Actuality: agreement of the data at all times with the current status of the real object and adjustment of the data in a timely manner as soon as the real object has been changed.
- Availability: the ability of the data user to access the data at the desired point in time.
Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)
Related topics
Data quality Key Performance Indicators, Data quality, Data quality tool
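To make two of the dimensions above more concrete, the following minimal sketch computes a completeness score for one record and a consistency score between two versions of the same record. The function names, the scoring as simple ratios, and the example records are assumptions made for illustration; the source does not prescribe a particular formula.

```python
def completeness(record: dict, required: list[str]) -> float:
    """Share of required attributes that carry a non-empty value."""
    filled = sum(1 for attr in required if record.get(attr) not in (None, ""))
    return filled / len(required)


def consistency(version_a: dict, version_b: dict) -> float:
    """Share of shared attributes on which two versions of a record agree."""
    shared = version_a.keys() & version_b.keys()
    if not shared:
        return 1.0
    agreeing = sum(1 for attr in shared if version_a[attr] == version_b[attr])
    return agreeing / len(shared)


# The same (hypothetical) customer as stored in two information systems.
crm = {"name": "ACME AG", "country": "CH", "vat_id": ""}
erp = {"name": "ACME AG", "country": "DE", "vat_id": ""}

print(completeness(crm, ["name", "country", "vat_id"]))  # 2 of 3 attributes filled
print(consistency(crm, erp))                             # 2 of 3 attributes agree
```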
A data quality Key Performance Indicator (data quality KPI) is a quantitative measure of data quality. A data quality measurement system measures the values for the quality of data at measurement points at a certain frequency of measurement. Data quality key performance indicators operationalize data quality dimensions. One example is the validation of a data element based on business rules.
Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015
The mandate of Data Quality Management (DQM) is to analyze, improve and ensure the quality of the data. DQM generally differentiates between preventive and reactive measures. Preventive DQM measures target the avoidance of defects in the data with negative effects on the quality of the data. In contrast, reactive DQM measures target the discovery of existing defects in the data and their correction.
Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)
Related topics
Data quality Key Performance Indicators, Data quality, Data quality dimensions, Data quality tool
Periodic examination of the data quality of the central records as part of DQM. For example, the data quality of the most important attributes could be measured based on defined business rules on a monthly basis. A record that does not fulfill all rules will be considered defective.
Source: Otto, B., Österle, H. (2015), Corporate Data Quality: Prerequisite for Successful Business Models (http://www.cdq-buch.de/)
Related topics
Data quality Key Performance Indicators, Data quality, Data quality dimensions, Data quality tool
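The rule-based measurement described above, in which a record that does not fulfill all rules is considered defective, can be sketched as follows. The two business rules, the record layout, and the function names are hypothetical; real rules would come from the business definitions of the attributes being measured.

```python
from typing import Callable

Rule = Callable[[dict], bool]

# Hypothetical business rules for a customer record.
RULES: dict[str, Rule] = {
    "name_present": lambda r: bool(r.get("name")),
    "country_is_two_letters": lambda r: len(r.get("country", "")) == 2,
}


def is_defective(record: dict, rules: dict[str, Rule]) -> bool:
    """A record is defective if it violates at least one rule."""
    return not all(rule(record) for rule in rules.values())


def quality_kpi(records: list[dict], rules: dict[str, Rule]) -> float:
    """Share of records that fulfill all rules (1.0 means no defects)."""
    if not records:
        return 1.0
    defect_free = sum(1 for r in records if not is_defective(r, rules))
    return defect_free / len(records)


customers = [
    {"name": "ACME AG", "country": "CH"},
    {"name": "", "country": "DE"},            # violates name_present
    {"name": "Globex", "country": "Germany"}, # violates country_is_two_letters
    {"name": "Initech", "country": "US"},
]
print(quality_kpi(customers, RULES))
```

Running such a function on a fixed schedule (for example monthly) and storing the result would give a simple time series for a data quality KPI.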
Data quality tools are solutions that help identify and fix data quality issues which affect the performance of various other applications that use data to support decision making.
Source: Barateiro, J. and Galhardas, H., (2005). A survey of data quality tools. Datenbank-Spektrum, 14(15-21), p.48.
Related topics
Data quality Key Performance Indicators, Data quality, Data quality dimensions
Deep learning networks are neural networks with many layers. The layered network can process extensive amounts of data [and therefore] requires a great deal of computing power, which raises concerns about its economic and environmental sustainability.
Source: Brown, 2021
E
Enterprise data describes all data that are created, maintained and used by enterprises. The enterprise data taxonomy developed in the Competence Center Corporate Data Quality distinguishes eight different categories of enterprise data and depicts their relationships: Master data, Transactional data, Observational data, Media data, Analytical data, Advanced analytical data, Metadata, and Reference data.
External data refers to any type of data that is captured, processed, and provided from outside the company. The major external data types include open, paid, shared and web data. External data can be used to complement internal data and help to improve advanced analysis, optimize business processes (e.g. with geolocation, weather, or traffic data), reduce internal data maintenance efforts (e.g. to enrich or validate internal data), and create new services. However, despite their increasing relevance, external data remain an untapped resource for most companies.
Source: Krasikov, Pavel; Eurich, Markus; Legner Christine: External Data CC CDQ Working Report, 2020
F
First time right is a principle of preventive data quality management according to which data should be acquired by an information system as correctly as possible in order to avoid retroactive correction (at generally higher levels of expenditure).
Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)
G
Generative AI can be thought of as a machine learning model that is trained to create new data, rather than making a prediction about a specific dataset. A generative AI system is one that learns to generate more objects that look like the data it was trained on.
Source: Zewe, 2023
H
I
The "Internet of Things" refers to the idea of an extended Internet that, in addition to classic computers and mobile devices, also integrates any physical objects into its infrastructure by means of sensors and actuators, thus turning them into providers or consumers of a wide variety of digital services.
Source: Fleisch, E. & Thiesse, F. Enzyklopaedie der Wirtschaftsinformatik: https://www.enzyklopaedie-der-wirtschaftsinformatik.de/wi-enzyklopaedie/lexikon/technologien-methoden/Rechnernetz/Internet/Internet-der-Dinge
J
K
L
Linked Open Data defines a vision of globally accessible and linked data on the internet based on the RDF standards of the semantic web. This structured web data is interlinked with other data and can be accessed through semantic queries. Linked open data is released under an open license, which does not impede its reuse for free.
Source: W3C, Tim Berners-Lee
M
Machine learning is a subfield of artificial intelligence and covers algorithms and techniques that allow machines to learn from data. Two main categories of machine learning techniques are supervised machine learning (SML) and unsupervised machine learning (USML).
Source: Kalota, 2024
Master data is the most fundamental enterprise data subtype. Master data represents core business objects (e.g., customers, suppliers, or products) which are agreed upon and shared across the enterprise. They remain largely unaltered and are often referenced and reused in business documents and data analysis. They must be unambiguously identifiable and interpretable across the entire organization (i.e., across organizational departments, divisions, and units).
Related topics
Advanced analytical data, Media data, Metadata, Reference data, Analytical data, Observational data, Transactional data, Enterprise data
Master data management comprises all of the activities, methods and (IT) tools for modeling, managing and providing master data as well as its data quality management. The goal is to provide and ensure a company-wide truth about the core business objects (single source of truth) and thereby to support data users in various business processes throughout the company.
Source: Otto, Boris; Österle, Hubert: Corporate Data Quality: Prerequisite for Successful Business Models, 2015 (http://www.cdq-buch.de/)
Media data is a particular enterprise data subtype that represents documents, digital images, geospatial data, and multimedia (video/audio) files. Media data is mainly unstructured in nature.
Related topics
Master data, Advanced analytical data, Metadata, Reference data, Analytical data, Observational data, Transactional data, Enterprise data
Metadata is "data about data". This is a particular enterprise data subtype that aims to facilitate access, management and sharing of large sets of structured and/or unstructured data. There are six categories of metadata:
Source: Labadie, C., Eurich, M. and Legner, C., 2020. Empowering Data Consumers to Work with Data: Data Documentation for the Enterprise Context. In Wirtschaftsinformatik (Zentrale Tracks) (pp. 1428-1442).
Related topics
Master data, Media data, Advanced analytical data, Reference data, Analytical data, Observational data, Transactional data, Enterprise data
N
Neural networks are a commonly used, specific class of machine learning algorithms. Artificial neural networks are modeled on the human brain, in which thousands or millions of processing nodes are interconnected and organized into layers. In an artificial neural network, cells, or nodes, are connected, with each cell processing inputs and producing an output that is sent to other neurons.
Source: Brown, 2021
O
Observational data is a particular enterprise data subtype that captures experiences and behavior at a very detailed, fine-grained level. It is generated by humans or things. Observational data includes IoT/sensor data from connected devices (often in the form of data streams), web data generated by user activities on social media platforms or commercial websites, as well as survey data from questionnaires.
Related topics
Master data, Media data, Metadata, Reference data, Analytical data, Advanced analytical data, Transactional data, Enterprise data
Open data can be defined as “data that is freely available, and can be used as well as republished by everyone without restrictions from copyright or patents”. As a specific type of external data, open data holds great business potential and is expected to fuel advanced analytics, optimize business processes, enrich data management, or even enable new services.
Source: Krasikov, P., Legner, C., & Eurich, M. (2021). Sourcing the Right Open Data: A Design Science Research Approach for the Enterprise Context.
Braunschweig, K., Eberius, J., Thiele, M., & Lehner, W. (2012). The State of Open Data. Limits of Current Open Data Platforms.
P
Paid data, also known as commercially available data, refers to the datasets available directly from specialized data providers (or brokers) and data marketplaces, and offered at a certain cost. It is a specific type of external data and is typically coupled with specific services which facilitate its use, such as identification and classification of data by categories, description of the intended use, metadata documentation, and integration services.
Source: Krasikov, Pavel; Eurich, Markus; Legner Christine: External Data CC CDQ Working Report, 2020
In the CDQ Data Excellence Model, the people, roles, and responsibilities design area defines the culture, organization, roles, boards, and interactions for data management. As data is generated, managed, and used in many different parts of an organization, a dedicated data management organization supports the orchestration and alignment of enterprise-wide data management activities. Consequently, data can only be managed consistently if ownership and responsibilities are assigned, the people holding them are trained, and all employees have a data-driven mindset.
Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.
In the CDQ Data Excellence Model, the performance management design area defines how to plan, implement, and control all activities for measuring, assessing, improving, and ensuring data management performance, data excellence, and business value.
Source: Pentek, T; Legner, C. & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.
From a regulatory perspective, personal data can be defined as “data enabling direct or indirect identification of a single physical person, data that is specific to a single physical person without enabling identification, data that can be linked to a physical person, data regarding which anonymization techniques cannot completely mitigate the risk of re-identification” (Debet et al. 2015). From a practical perspective, most companies collect personal data about their customers, employees, suppliers and vendors. Customer data is typically a particular area of concern; it can be defined as “a set of data that represents and is associated with the identity, activities and service offering associated with a unique individual” (Tapsell et al. 2018).
Source: Debet, A., Massot, J., Metallinos, N., Danis-Fantôme, A., Lesobre, O.: Informatique et libertés. La protection des données à caractère personnel en droit français et européen (2015).
Tapsell, J., Akram, R.N., Markantonakis, K.: Consumer-Centric Data Control, Tracking and Transparency (2018).
In the CDQ Data Excellence Model, the processes and methods design area defines relevant data management procedures on a strategic, governance, and operational level and specifies which tasks are to be executed by whom and in what order.
Source: Pentek, T., Legner, C., & Otto, B. (2020). Data Excellence Model – Reference Model for Managing Data Assets. CC CDQ Working Report.
Q
R
Reference data is a particular enterprise data subtype used to characterize, categorize, validate or constrain other data. The most basic reference data are codes or key-value lists, but they can also be more complex and incorporate hierarchies or vocabularies. Reference data can be defined and created internally (e.g., customer classifications, product groups) or received from external sources (e.g., country or currency codes defined by ISO standards, product classifications defined by e-commerce standards).
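As an illustration of how reference data can validate other data, the following minimal sketch checks order records against two code lists. The code lists, field names, and the `validate_order` function are hypothetical and chosen only for this example; the currency codes are a small excerpt in the style of ISO 4217.

```python
# Hypothetical reference data: one externally defined code list (currency
# codes in the style of ISO 4217) and one internally defined classification.
CURRENCY_CODES = {"EUR": "Euro", "USD": "US Dollar", "CHF": "Swiss Franc"}
CUSTOMER_CLASSES = {"A": "Key account", "B": "Standard", "C": "Occasional"}


def validate_order(order: dict) -> list:
    """Return a list of validation errors for one order record."""
    errors = []
    # The reference data constrains which values the fields may take.
    if order.get("currency") not in CURRENCY_CODES:
        errors.append(f"unknown currency code: {order.get('currency')!r}")
    if order.get("customer_class") not in CUSTOMER_CLASSES:
        errors.append(f"unknown customer class: {order.get('customer_class')!r}")
    return errors


# Illustrative records; the second one uses a code that is not in the list.
orders = [
    {"id": 1, "currency": "EUR", "customer_class": "A"},
    {"id": 2, "currency": "EURO", "customer_class": "B"},
]
results = {o["id"]: validate_order(o) for o in orders}
```

Here `results` maps each order id to its list of errors, so an empty list means the record conforms to the reference data.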
Related topics
Master data, Media data, Metadata, Advanced analytical data, Analytical data, Observational data, Transactional data, Enterprise data
A regulation is a document written in natural language containing a set of guidelines specifying constraints and preferences pertaining to the desired structure and behavior of an enterprise. Examples of regulations are a law (e.g., the General Data Protection Regulation - GDPR), a standardization document, a contract, etc. A regulation specifies the domain elements it applies to and oftentimes has implications for data management.
Source: El Kharbili, M.: Business Process Regulatory Compliance Management Solution Frameworks: A Comparative Evaluation (2012).
Regulatory Compliance Management (RCM) is the problem of ensuring that enterprises (data, processes, organization, etc.) are structured and behave in accordance with the regulations that apply, i.e., with the guidelines specified in the regulations.
Source: El Kharbili, M.: Business Process Regulatory Compliance Management Solution Frameworks: A Comparative Evaluation (2012).
A regulatory guideline specifies the expected behavior and structure of enterprise domain elements. It additionally defines tolerated and non-tolerated deviations from the ideal behavior and structure, as well as possible exceptional cases. A regulation may also specify how the enterprise ought to or may react to deviations from ideal behavior and structure.
Source: El Kharbili, M.: Business Process Regulatory Compliance Management Solution Frameworks: A Comparative Evaluation (2012).
S
Shared data refers to external data that is shared between companies within dedicated business ecosystems. Examples of sharing and exchange environments include the Global Data Synchronization Network (GDSN) provided by GS1 and the CDQ Data Sharing Community.
Source: Krasikov, Pavel; Eurich, Markus; Legner, Christine: External Data CC CDQ Working Report, 2020
Web data refers to data made available on the Web (e.g., online sources, websites) as well as data shared by users of social media platforms (e.g., user-generated content, reactions, comments), including the associated metadata (e.g., location, time, language, biographical data). Web data is one of the subtypes of external data.
Source: Krasikov, Pavel; Eurich, Markus; Legner, Christine: External Data CC CDQ Working Report, 2021
T
Transactional data is a particular enterprise data subtype that is created by business processes and documents key business events or the results of business activities. Transactional data often references master data, but in contrast to master data, it naturally changes during its lifecycle (e.g., through status changes). Furthermore, the volume of transactional data (e.g., the number of sales orders) increases with ongoing business activities. Examples are sales or purchase orders, invoices, delivery notes or incidents.
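The relationship between transactional data and master data can be sketched as follows. This is a minimal illustration under assumed names: the `Customer` and `SalesOrder` classes, their fields, and the status values are hypothetical and not taken from any specific system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Customer:
    """Master data: comparatively stable, referenced by many transactions."""
    customer_id: str
    name: str


@dataclass
class SalesOrder:
    """Transactional data: documents a business event and changes status."""
    order_id: str
    customer_id: str  # reference to a master data record
    status: str = "created"
    history: list = field(default_factory=lambda: ["created"])

    def advance(self, new_status: str) -> None:
        # Record each status change that occurs during the order lifecycle.
        self.status = new_status
        self.history.append(new_status)


customers = {"C-1": Customer("C-1", "Example Corp")}

# The number of orders grows with ongoing business activity,
# while the referenced customer record stays the same.
sales_orders = [SalesOrder("SO-1", "C-1"), SalesOrder("SO-2", "C-1")]
sales_orders[0].advance("shipped")
sales_orders[0].advance("invoiced")
```

After these calls, the first order carries the full sequence of its status changes in `history`, while the second order is still in its initial state.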
Related topics
Master data, Media data, Metadata, Reference data, Analytical data, Observational data, Advanced analytical data, Enterprise data