
GB/T 36344-2018 PDF English


Standard ID: GB/T 36344-2018
Name of Chinese Standard: Information technology - Evaluation indicators for data quality
Status: Valid



GB/T 36344-2018

GB
NATIONAL STANDARD OF THE PEOPLE’S REPUBLIC OF CHINA

ICS 35.240.01
L 70

Information technology - Evaluation indicators for data quality

ISSUED ON: JUNE 07, 2018
IMPLEMENTED ON: JANUARY 01, 2019

Issued by: State Administration for Market Regulation; Standardization Administration of the People's Republic of China.

Table of Contents
Foreword
1 Scope
2 Terms and definitions
3 Frame of indicators
4 Overview
5 Indicator description
Annex A (informative) Data quality evaluation process
Bibliography

1 Scope

This Standard specifies the frame and description of evaluation indicators for data quality.

This Standard applies to data quality evaluation at each stage of the data lifecycle.

2 Terms and definitions

For the purposes of this document, the following terms and definitions apply.

2.1 data
a reinterpretable representation of information in a formalized manner, suitable for communication, interpretation, or processing
NOTE: Data can be processed manually or automatically.
[GB/T 5271.1-2000, Definition 01.01.02]

2.2 metadata
data about data or data elements (possibly including their data descriptions), and data about data ownership, access paths, access rights, and data variability
[GB/T 5271.17-2010, Definition 17.06.05]

2.3 data quality
the extent to which the characteristics of data meet stated and implied requirements when used under specified conditions

2.4 raw data
various unprocessed or simplified data stored by the end user
NOTE: Raw data exists in multiple forms, such as text data, image data, audio data, or a mixture of several types of data.
2.5 data lifecycle
a set of processes that turn raw data into actionable knowledge

2.6 data set
a collection of data that has an identifiable theme and can be processed by computer

2.7 data model
a graphical and textual representation of an analysis that identifies the data an organization needs to fulfil its mission, functions, goals, objectives, and strategy, and to manage and evaluate the organization
NOTE 1: When data is represented at different levels of abstraction, from high to low, a distinction is often made between conceptual models (models composed of concepts related to certain efforts), logical models, and physical models.
NOTE 2: The formal description of the boundaries within which the data model is used is called the context mode.
NOTE 3: The data model identifies entities, domains (attributes), and relationships (associations) with other data, providing a conceptual view of the data and of the relationships among data.
EXAMPLE 1: In a semantic data model drawn as a block diagram, each box represents a set of entities that are meaningful to the business, such as "people" or "actions", and lines describe the relationships between pairs of such entities.
EXAMPLE 2: A logical data model applies a specific data management technique, such as relational tables or Extensible Markup Language (XML).

2.8 data standard
rules and benchmarks for the naming, definition, structure, and value specification of data

3 Frame of indicators

Refer to Figure 1 for the frame of evaluation indicators of data quality.

NOTE 1: The data model is a means of visually describing the organization's data structure and is the norm for data representation.
NOTE 2: When evaluating data quality, check that the data model definitions are clear and understandable, and check how the data are organized.

A = The number of elements in the data set that meet the data model requirements.
B = The number of elements in the data set being evaluated.
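The counts A and B defined above combine into a ratio, which the other indicators in this Standard write as X = A/B. The sketch below shows one possible way to compute such a ratio in Python. The function `indicator_ratio`, the `REQUIRED_FIELDS` rule, and the sample records are hypothetical illustrations and are not taken from the Standard, which does not prescribe how conformance to a data model is tested.

```python
from fractions import Fraction
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def indicator_ratio(elements: Iterable[T], conforms: Callable[[T], bool]) -> Fraction:
    """Return X = A/B for one indicator.

    B is the number of elements evaluated; A is the number of those
    elements for which `conforms` returns True.
    """
    items = list(elements)
    if not items:
        raise ValueError("B is zero: the data set being evaluated is empty")
    a = sum(1 for item in items if conforms(item))
    return Fraction(a, len(items))


# Hypothetical data model requirement: a record has exactly these fields.
REQUIRED_FIELDS = {"id", "name", "gender"}


def meets_model(record: dict) -> bool:
    return set(record) == REQUIRED_FIELDS


records = [
    {"id": 1, "name": "Li", "gender": "male"},
    {"id": 2, "name": "Wang"},                                    # missing field
    {"id": 3, "name": "Zhao", "gender": "female"},
    {"id": 4, "name": "Chen", "gender": "female", "extra": "x"},  # extra field
]

x = indicator_ratio(records, meets_model)
print(x)  # 1/2
```

Using `Fraction` keeps A and B visible in the result; convert with `float(x)` if a decimal score is preferred.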
0103 Metadata
The degree to which the data conform to the metadata.
NOTE: Metadata labels, describes, or portrays other data to make information easier to retrieve or use. When evaluating data quality, check whether an interpretable metadata document is available.
Example: A metadata dictionary that lists each field's name, description, type, value range, and so on is a metadata document.
X = A/B
where,
A = The number of elements in the data set that meet the metadata requirements.
B = The number of elements in the data set being evaluated.

0104 Business rules
The degree to which the data conform to the business rules.
NOTE 1: Business rules are authoritative principles or guidelines that describe business interactions and establish rules for actions and for the resulting data behavior and integrity.
NOTE 2: When evaluating data quality, check whether well-documented business rules are available.
X = A/B
where,
A = The number of elements in the data set that satisfy the business rules.
B = The number of elements in the data set being evaluated.

Authoritative reference data (authoritative reference source)
Reference data is a collection or classification of values used by systems, applications, databases, processes, reports, transaction records, and master records.
NOTE: A list of reference data needs to be collected when evaluating data quality.
X = A/B
where,
A = The number of elements in the data set that satisfy the reference data rules.
B = The number of elements in the data set being evaluated.

A = The number of elements in the data set that meet the data correctness requirements.
B = The number of elements in the data set being evaluated.

0302 Data format compliance
Whether the data format (including data type, value range, data length, precision, etc.) meets the expected requirements.
Example: The gender column cannot contain values other than male/female. An ID number cannot contain punctuation. Some restrictions on character encoding likewise need to be enforced by specifying the format of the content.
X = A/B
where,
A = The number of elements in the data set that meet the format requirements.
B = The number of elements in the data set being evaluated.

0303 Data repetition rate
Measurement of unexpected repetition of specific fields, records, files, or data sets.
X = A/B
where,
A = The number of elements in the repeated data set.
B = The number of elements in the data set being evaluated.

0304 Data uniqueness
Measurement of the uniqueness of a particular field, record, file, or data set.
X = A/B
where,
A = The number of elements in the data set that satisfy the uniqueness requirement.
B = The number of elements in the data set being evaluated.

0305 Dirty data occurrence rate
Measurement of invalid data that falls outside the correct fields, records, files, or data sets.
Example: Dirty data may occur when a transaction rolls back under a weak or imperfect rollback mechanism.
X = A/B
where,
A = The number of elements in the data set with dirty data.
B = The number of elements in the data set being evaluated.

......
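As an illustration of how indicators 0302, 0303, and 0304 might be computed, the following Python sketch applies X = A/B to a small in-memory table. Everything in it is an assumption made for the example: the function names, the sample rows, the alphanumeric ID pattern, and in particular the counting rules for A in the repetition and uniqueness indicators, which the excerpt does not spell out.

```python
import re
from collections import Counter

# Hypothetical format rules modelled on the 0302 example.
ALLOWED_GENDERS = {"male", "female"}
ID_PATTERN = re.compile(r"[0-9A-Za-z]+")  # letters and digits only, no punctuation


def format_compliance(rows):
    """0302: share of rows whose fields meet the format rules (assumes B > 0)."""
    a = sum(
        1
        for r in rows
        if r["gender"] in ALLOWED_GENDERS and ID_PATTERN.fullmatch(r["id"])
    )
    return a / len(rows)


def repetition_rate(values):
    """0303: assumption, A counts every occurrence after the first of a value."""
    counts = Counter(values)
    a = sum(n - 1 for n in counts.values())
    return a / len(values)


def uniqueness(values):
    """0304: assumption, A counts elements whose value occurs exactly once."""
    counts = Counter(values)
    a = sum(1 for v in values if counts[v] == 1)
    return a / len(values)


rows = [
    {"id": "A001", "gender": "male"},
    {"id": "A002", "gender": "female"},
    {"id": "A-03", "gender": "female"},   # punctuation in the ID
    {"id": "A002", "gender": "unknown"},  # disallowed value, repeated ID
]
ids = [r["id"] for r in rows]

print(format_compliance(rows))  # 0.5
print(repetition_rate(ids))     # 0.25
print(uniqueness(ids))          # 0.5
```

Other counting conventions are equally defensible, for example counting every member of a duplicated group as repeated; whichever is chosen should be recorded alongside the score so that results remain comparable between evaluations.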
 
Source: Above contents are excerpted from the PDF -- translated/reviewed by: www.chinesestandard.net / Wayne Zheng et al.