GB/T 36344-2018: Information technology - Evaluation indicators for data quality
Status: Valid
GB
NATIONAL STANDARD OF THE
PEOPLE’S REPUBLIC OF CHINA
ICS 35.240.01
L 70
Information technology -
Evaluation indicators for data quality
ISSUED ON: JUNE 07, 2018
IMPLEMENTED ON: JANUARY 01, 2019
Issued by: State Administration for Market Regulation;
Standardization Administration of the People's Republic of China.
Table of Contents
Foreword
1 Scope
2 Terms and definitions
3 Frame of indicators
4 Overview
5 Indicator description
Annex A (informative) Data quality evaluation process
Bibliography
1 Scope
This Standard specifies the frame and description of evaluation indicators for
data quality.
This Standard applies to data quality evaluation at each stage of the data
lifecycle.
2 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
2.1 data
reinterpretable representation of information in a formalized manner, suitable for
communication, interpretation or processing
NOTE. Data can be processed manually or automatically.
[GB/T 5271.1-2000, Definition 01.01.02]
2.2 meta data
data about data or data elements (possibly including their data descriptions),
and data about data ownership, access paths, access rights, and data
variability
[GB/T 5271.17-2010, Definition 17.06.05]
2.3 data quality
the degree to which the characteristics of data meet explicit and implied
requirements when used under specified conditions
2.4 raw data
various data stored by the end user that has not been processed or simplified
NOTE. Raw data takes many forms, such as text data, image data, audio data, or a mixture
of several types of data.
2.5 data lifecycle
a set of processes that turn raw data into actionable knowledge
2.6 data set
an identifiable collection of data that has a certain theme and can be
processed by computer
2.7 data model
graphical and textual representation of an analysis that identifies the data the
organization needs to accomplish its mission, functions, goals, objectives, and
strategy, and to manage and evaluate the organization
NOTE 1. Data can be represented at different levels of abstraction. From high to low, a distinction is
commonly made between conceptual models (models composed of concepts related to certain efforts),
logical models, and physical models.
NOTE 2. The formal description of the boundaries within which the data model is used is called the
context mode.
NOTE 3. The data model identifies entities, domains (attributes), and relationships (associations) with
other data, providing a conceptual view of the relationships among data.
EXAMPLE 1. In a semantic data model consisting of a diagram of boxes and lines, each box represents a
set of things that are meaningful to the business, such as "people" or "actions", and each line describes
the relationship between a pair of such entities.
EXAMPLE 2. A data model expressed with a specific data management technique, such as relational tables
or Extensible Markup Language (XML), is a logical data model.
2.8 data standard
rules and benchmarks for naming, definition, structure, and value specification
of data
3 Frame of indicators
Refer to Figure 1 for the frame of evaluation indicators of data quality.
NOTE 1. The data model is a means of visually describing the organization's data structure and
is the norm for data representation.
NOTE 2. When evaluating data quality, check that the data model definitions, and the way the
data are organized, are clear and understandable.
A = The number of elements in the data set that meet the data model requirements.
B = The number of elements in the data set being evaluated.
0103 Meta data
A measure of the degree to which data conforms to its meta data.
NOTE. Meta data labels, describes, or characterizes other data to make information easier to
retrieve or use. When evaluating data quality, check whether an interpretable meta data
document is available.
Example. A meta data dictionary that lists each field's name, description, type, value range,
etc. is a meta data document.
X=A/B
where,
A = The number of elements in the data set that meet the meta data requirements.
B = The number of elements in the data set being evaluated.
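As a rough illustration of the ratio above, the following sketch treats a meta data dictionary as a mapping from field name to expected type and counts the records that match it. The dictionary layout, the field names, and the functions `conforms` and `metadata_conformance` are assumptions made for this example; the Standard does not prescribe how conformance is checked.

```python
from typing import Any

# Hypothetical meta data dictionary: field name -> expected Python type.
META_DATA = {"id": int, "name": str, "gender": str}


def conforms(record: dict[str, Any], meta: dict[str, type]) -> bool:
    """Return True if the record has exactly the described fields with the described types."""
    if record.keys() != meta.keys():
        return False
    return all(isinstance(record[field], expected) for field, expected in meta.items())


def metadata_conformance(records: list[dict[str, Any]], meta: dict[str, type]) -> float:
    """X = A/B: A = records that conform to the meta data, B = records evaluated."""
    if not records:
        raise ValueError("cannot evaluate an empty data set")
    a = sum(1 for record in records if conforms(record, meta))
    return a / len(records)


records = [
    {"id": 1, "name": "Li", "gender": "female"},
    {"id": "2", "name": "Wang", "gender": "male"},  # id has the wrong type
    {"id": 3, "name": "Zhao"},                      # gender is missing
    {"id": 4, "name": "Chen", "gender": "male"},
]
x = metadata_conformance(records, META_DATA)  # 2 conforming records out of 4
```

Here an "element" is taken to be a whole record; counting individual fields instead would only change what A and B enumerate, not the shape of the calculation.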
0104 Business rules
A measure of the degree to which data conforms to business rules.
NOTE 1. Business rules are authoritative principles or guidelines that describe business
interactions and establish rules for actions and for the resulting behavior and integrity of data.
NOTE 2. When evaluating data quality, check whether well-documented business rules exist.
X=A/B
where,
A = The number of elements in the data set that satisfy the business rules.
B = The number of elements in the data set being evaluated.
Authoritative reference data (authoritative reference source)
Reference data is a set of values or classifications used by systems, applications, databases,
processes, and reports, as well as by transaction records and master records.
NOTE. When evaluating data quality, a list of the reference data needs to be collected.
X=A/B
where,
A = The number of elements in the data set that satisfy the reference data rules.
B = The number of elements in the data set being evaluated.
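One possible reading of "satisfy the reference data rules" is that each value must appear in the collected reference list. The sketch below assumes that reading; the country-code list and the function `reference_data_compliance` are hypothetical and are not taken from the Standard.

```python
from collections.abc import Collection, Iterable

# Hypothetical reference data list: country codes the organization accepts.
REFERENCE_COUNTRY_CODES = frozenset({"CN", "US", "DE", "JP"})


def reference_data_compliance(values: Iterable[str], reference: Collection[str]) -> float:
    """X = A/B: A = values found in the reference list, B = values evaluated."""
    values = list(values)
    if not values:
        raise ValueError("cannot evaluate an empty data set")
    a = sum(1 for value in values if value in reference)
    return a / len(values)


codes = ["CN", "US", "XX", "cn"]  # "XX" is unknown; "cn" differs in case
x = reference_data_compliance(codes, REFERENCE_COUNTRY_CODES)  # 2 matches out of 4
```

The comparison is deliberately exact, so "cn" does not match "CN". Whether values should be normalized before the lookup is a decision for the evaluator.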
A = The number of elements in the data set that meet the data correctness requirements.
B = The number of elements in the data set being evaluated.
0302 Data format compliance
Whether the data format (including data type, value range, data length, accuracy, etc.) meets
the expected requirements.
Example. Values other than male/female must not appear in the gender column. An ID number must
not contain punctuation. Some restrictions on character encoding also need to be enforced by
specifying the format of the content.
X=A/B
where,
A = The number of elements in the data set that meet the format requirements.
B = The number of elements in the data set being evaluated.
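The two checks named in the example (gender limited to male/female, no punctuation in the ID number) can be sketched as follows. The field names, the assumption that an ID number consists only of ASCII letters and digits, and the functions `format_ok` and `format_compliance` are illustrative choices, not requirements of the Standard.

```python
import re

ALLOWED_GENDERS = {"male", "female"}
# Assumption: a valid ID number contains only ASCII letters and digits.
ID_PATTERN = re.compile(r"[0-9A-Za-z]+")


def format_ok(record: dict[str, str]) -> bool:
    """Check the gender value and the ID number format of one record."""
    gender_ok = record.get("gender") in ALLOWED_GENDERS
    id_ok = ID_PATTERN.fullmatch(record.get("id_number", "")) is not None
    return gender_ok and id_ok


def format_compliance(records: list[dict[str, str]]) -> float:
    """X = A/B: A = records that meet the format requirements, B = records evaluated."""
    if not records:
        raise ValueError("cannot evaluate an empty data set")
    a = sum(1 for record in records if format_ok(record))
    return a / len(records)


records = [
    {"gender": "male", "id_number": "11010519900101123X"},
    {"gender": "unknown", "id_number": "110105199001011234"},  # gender not allowed
    {"gender": "female", "id_number": "1101-0519-9001"},       # punctuation in ID
    {"gender": "female", "id_number": "110105199001011234"},
]
x = format_compliance(records)  # 2 compliant records out of 4
```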
0303 Data repetition rate
A measure of unexpected repetition in specific fields, records, files, or data sets.
X=A/B
where,
A = The number of elements in the data set that are repeated.
B = The number of elements in the data set being evaluated.
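The Standard does not spell out how repeated elements are counted. The sketch below assumes that the first occurrence of a value is legitimate and that every further copy counts towards A; the function name `repetition_rate` and the sample data are hypothetical.

```python
from collections import Counter
from collections.abc import Hashable, Iterable


def repetition_rate(values: Iterable[Hashable]) -> float:
    """X = A/B: A = copies beyond the first occurrence, B = elements evaluated."""
    counts = Counter(values)
    b = sum(counts.values())
    if b == 0:
        raise ValueError("cannot evaluate an empty data set")
    a = sum(n - 1 for n in counts.values())
    return a / b


emails = [
    "a@example.com", "b@example.com", "a@example.com", "c@example.com",
    "a@example.com", "b@example.com", "d@example.com", "e@example.com",
]
x = repetition_rate(emails)  # 3 extra copies out of 8 elements
```

Under the alternative reading, in which every member of a duplicated group counts, A would be 5 for the same data, so the chosen convention should be recorded alongside the result.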
0304 Data uniqueness
A measure of the uniqueness of a particular field, record, file, or data set.
X=A/B
where,
A = The number of elements in the data set that satisfy the uniqueness requirement.
B = The number of elements in the data set being evaluated.
0305 Dirty data occurrence rate
A measure of invalid data that falls outside the correct fields, records, files, or data sets.
Example. Dirty data may be produced when a transaction is rolled back if the rollback mechanism
is weak or imperfect.
X=A/B
where,
A = The number of elements in the data set with dirty data.
B = The number of elements in the data set being evaluated.
...... Source: Above contents are excerpted from the PDF -- translated/reviewed by: www.chinesestandard.net / Wayne Zheng et al.