
GB/T 38673-2020
GB
NATIONAL STANDARD OF THE
PEOPLE’S REPUBLIC OF CHINA
ICS 35.240
L 67
Information technology - Big data - Basic
requirements for big data systems
ISSUED ON: APRIL 28, 2020
IMPLEMENTED ON: NOVEMBER 01, 2020
Issued by: State Administration for Market Regulation;
Standardization Administration of the People’s Republic of
China.
Table of Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations
5 Big data system framework
6 Functional requirements
7 Non-functional requirements
1 Scope
This Standard specifies the functional requirements and non-functional
requirements of big data systems.
This Standard is applicable to the design, selection, acceptance and testing
of various types of big data systems.
2 Normative references
The following documents are indispensable for the application of this document.
For dated references, only the dated version applies to this document. For
undated references, the latest edition (including all amendments) applies to this
document.
GB/T 35295-2017, Information technology - Big data - Terminology
GB/T 35589-2017, Information technology - Big data - Technical reference
model
3 Terms and definitions
Terms and definitions determined by GB/T 35295-2017 and the following ones
are applicable to this document. For ease of use, some of the terms and
definitions in GB/T 35295-2017 are repeated below.
3.1 Big data system
The system that implements all or part of the big data reference architecture.
[GB/T 35295-2017, Definition 2.1.14]
3.2 Distributed computing
A computing mode that covers the storage layer and the processing layer and
is used to implement multi-type programming algorithm models.
c) It shall provide column conversion, row conversion and table conversion
functions of structured data;
d) It shall provide data loading function, to support the loading of cleaned and
converted data to the data analysis module;
e) It should provide data comparison function before and after cleaning;
f) It should support data conversion function of unstructured data.
6.3 Data storage module
The data storage module requirements are as follows:
a) It shall provide data storage function, to support the storage of structured
data, unstructured data and semi-structured data.
b) It shall provide the function of exchanging data or files with relational
databases and other file systems.
c) Support distributed file storage, to realize the following functions:
1) It shall support basic operations of the file system, including upload,
download, read and write, copy, move, delete, rename, permission
modification, etc.;
2) It shall support multi-copy storage and recovery functions of data blocks;
3) It should support the function of fast retrieval of files, and support the
unified retrieval, cataloging, adding and deleting operations of data
resources;
4) It should support data compression storage function.
d) Support distributed column data storage, to achieve the following functions:
1) It shall support the function of storing data in the form of key-value;
2) It should support user authority management functions that are based
on tables, column families, and columns. Authority management
operations include read, write, and create.
e) Support distributed structured data storage, to achieve the following
functions:
1) It should support distributed storage of structured data, to ensure the
scalability and consistency of data storage;
1) Built-in graph data query API, support synchronous or asynchronous
computing model to write iterative algorithms;
2) Online graph analysis and query function;
3) Graph data expression that is based on the attribute graph model,
including the label and attribute type definition on the node/edge;
4) Built-in common graph index calculation function, to describe the
topological structure characteristics of graphs.
d) It should support memory computing, to realize the following functions:
1) Provide data processing capabilities through distributed memory
computing and DAG execution engine;
2) Support multiple data types, including data processing of structured data,
unstructured data, and semi-structured data.
e) It should support a unified batch and stream computing framework, to
achieve the following functions:
1) A unified SQL query language for batch and stream processing;
2) Streaming SQL in multiple scenarios, such as location information
analysis, etc.;
3) Common time windows, including jumping windows, sliding windows, etc.
f) It should support automatic scheduling of tasks according to the
dependencies between tasks.
g) It should support the description of multi-task dependencies within the job
in the form of a directed acyclic graph.
h) It should provide the ability to dispatch complex tasks.
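Items f) and g) above describe dependency-driven scheduling over a directed
acyclic graph. The following is a minimal sketch of that idea, not a method
prescribed by this Standard. It uses the Python standard library module
graphlib; the task names and the schedule() helper are hypothetical and serve
only as an illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical job: each task maps to the set of tasks it depends on.
job = {
    "extract": set(),
    "clean": {"extract"},
    "convert": {"clean"},
    "stats": {"clean"},
    "load": {"convert"},
    "report": {"load", "stats"},
}


def schedule(dag):
    """Return batches of tasks; every task in a batch has all dependencies met."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks that could run in parallel
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock dependents
    return batches


batches = schedule(job)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

Tasks in the same batch do not depend on each other, so a scheduler could
dispatch them concurrently. A cyclic dependency is rejected before any task
runs.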
6.5 Data analysis module
The data analysis module requirements are as follows:
a) Support data query, to realize the following functions:
1) It shall provide the function of querying through a standard database
connection interface;
2) It shall provide the function of querying through the REST API query
interface;
3) It should support data statistics on real-time streams;
4) It should support the sorting of streaming data;
5) It should support the association with static tables;
6) It should support the associated processing of multiple data streams.
f) It should support interactive on-line analysis, to achieve the following
functions:
1) Perform distributed on-line analysis of data through structured query
language, such as OLAP;
2) Perform ad hoc query of data through structured query language;
3) Use visualization middleware to display data analysis results;
4) Define the calculation formula and parameter configuration during the
interactive analysis process;
5) Automatically save and roll back during interactive analysis;
6) Save and publish analysis results during interactive analysis;
7) Interactive data analysis based on on-line analysis.
g) It should support visual process editing operations, to achieve the
following functions:
1) Perform process editing and revision by dragging and dropping;
2) Support a workflow dispatch trigger mechanism with a configurable
trigger time or trigger event;
3) Support the persistent storage of process editing results.
6.6 Data visualization module
The requirements of the visualization module are as follows:
a) It should support the use of conventional charts to display data, such as
tables, bar charts, pie charts, line charts, heat maps;
b) It should support the API of third-party data visualization tools.
6.7 Data access module
d) It shall provide service management functions, including the management
of big data system component services;
e) It should provide the health check management function, to support the
realization of cluster health check through a graphical interface.
7 Non-functional requirements
7.1 Reliability requirements
7.1.1 High availability
High availability requirements are as follows:
a) It shall provide the system automatic fault detection and management
functions;
b) It shall ensure that no system component is a single point of
failure;
c) When any node of the cluster fails, there shall be no service interruption,
data loss or data inconsistency;
d) When any unit of the cluster fails, the system operation shall not be
affected;
e) It shall guarantee that the system runs continuously and without failure
over long periods.
7.1.2 Data redundant storage and distribution
Data redundant storage and distribution requirements are as follows:
a) It shall provide a multi-copy storage function for metadata; the failure of
any node shall not affect the system's ability to continue to provide services;
b) It shall provide the master copy planning function that is based on partition
fault tolerance, with the ability to plan the physical distribution of each copy
data in advance.
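Item b) above asks that the physical distribution of each copy can be planned
in advance. The following is a minimal sketch of one possible planning rule,
under the assumption that copies of a partition are placed on distinct racks.
The function plan_replicas, the rack and node names, and the round-robin rule
are all hypothetical; the Standard does not prescribe a placement algorithm.

```python
def plan_replicas(partitions, nodes_by_rack, copies=3):
    """Plan a master and replicas per partition, one copy per distinct rack.

    Assumes every rack lists at least one node.
    """
    racks = sorted(nodes_by_rack)
    if copies > len(racks):
        raise ValueError("need at least one rack per copy")
    plan = {}
    for i, partition in enumerate(partitions):
        chosen = []
        for k in range(copies):
            # Rotate the starting rack per partition to spread the masters.
            rack = racks[(i + k) % len(racks)]
            nodes = nodes_by_rack[rack]
            chosen.append(nodes[i % len(nodes)])
        plan[partition] = {"master": chosen[0], "replicas": chosen[1:]}
    return plan


# Hypothetical cluster layout.
nodes_by_rack = {
    "rack-a": ["a1", "a2"],
    "rack-b": ["b1", "b2"],
    "rack-c": ["c1"],
}
plan = plan_replicas(["p0", "p1", "p2"], nodes_by_rack, copies=2)
for partition, placement in plan.items():
    print(partition, placement)
```

Because each copy of a partition sits on a different rack, losing one rack
leaves at least one copy of every partition reachable.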
7.1.3 Data backup and recovery
The data backup and recovery requirements are as follows:
a) It shall provide distributed file storage backup and recovery functions;
b) It shall configure authority for users according to the principle of minimizing
authority;
c) It shall support the allocation of authority for users according to the
granularity of the data table level and the data column level;
d) It shall support the allocation of authority for users according to different
operation types (such as adding, deleting, modifying, querying, executing).
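Items b) to d) above combine three ideas: minimal authority, table-level and
column-level granularity, and per-operation grants. The following is a minimal
sketch of how these could fit together. The Grants class, its method names, and
the operation labels are hypothetical illustrations, not an interface defined
by this Standard.

```python
# Illustrative labels for the operation types listed in item d).
OPERATIONS = {"add", "delete", "modify", "query", "execute"}


class Grants:
    """In-memory authority table keyed by (user, table, column)."""

    def __init__(self):
        # A column of None denotes a grant on the whole table.
        self._grants = {}

    def grant(self, user, table, operation, column=None):
        if operation not in OPERATIONS:
            raise ValueError(f"unknown operation: {operation}")
        self._grants.setdefault((user, table, column), set()).add(operation)

    def allowed(self, user, table, operation, column=None):
        # Minimal authority: deny unless an explicit grant exists at
        # table level or at the requested column.
        table_level = self._grants.get((user, table, None), set())
        column_level = self._grants.get((user, table, column), set())
        return operation in table_level or operation in column_level


grants = Grants()
grants.grant("alice", "orders", "query")                   # table-level grant
grants.grant("bob", "orders", "modify", column="status")   # column-level grant

print(grants.allowed("alice", "orders", "query", column="amount"))  # True
print(grants.allowed("bob", "orders", "modify", column="amount"))   # False
```

Nothing is permitted by default, so a user gains an operation only through an
explicit grant at the matching granularity.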
7.3.3 Log management
The log management requirements are as follows:
a) It shall provide the function of recording system operation logs, to record
important operations of users;
b) It shall ensure that the system operation log cannot be deleted, modified
or overwritten;
c) The operation log shall include date, time, operator information, operation
type, operation description and operation result;
d) It shall provide functions of statistics, query, analysis and report generation
of system operation logs.
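The fields in item c) and the append-only behaviour in item b) can be sketched
as follows. This is a minimal in-memory illustration under stated assumptions:
the class and field names are hypothetical, date and time are held in a single
timestamp, and a real system would also need protection at the storage layer,
which this sketch does not provide.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class OperationLogEntry:
    timestamp: datetime      # covers both date and time
    operator: str            # operator information
    operation_type: str
    description: str
    result: str


class OperationLog:
    """Append-only log: offers no method to delete, modify or overwrite."""

    def __init__(self):
        self._entries = []

    def record(self, operator, operation_type, description, result):
        entry = OperationLogEntry(
            datetime.now(timezone.utc), operator, operation_type, description, result
        )
        self._entries.append(entry)
        return entry

    def query(self, **criteria):
        # Return entries whose fields equal every given criterion.
        return [
            e for e in self._entries
            if all(getattr(e, name) == value for name, value in criteria.items())
        ]

    def count_by_type(self):
        # Simple statistic: number of entries per operation type.
        return dict(Counter(e.operation_type for e in self._entries))


log = OperationLog()
log.record("alice", "delete", "dropped table tmp_orders", "success")
log.record("bob", "modify", "changed quota for queue q1", "failure")
log.record("alice", "modify", "granted query on orders", "success")

print(log.count_by_type())
print(len(log.query(operator="alice")))
```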
7.3.4 Data security
The data security requirements are as follows:
a) It shall provide data storage encryption and decryption functions, to
support database-level data encryption;
b) It shall provide encrypted transmission function of system sensitive data,
and the encryption key can be replaced;
c) It should support data encryption at the data column level.
7.4 Scalability requirements
The system scalability requirements are as follows:
a) It shall provide online cluster expansion and reduction functions;
b) It shall provide offline cluster expansion and reduction functions.
7.5 Maintainability requirements
The system maintainability requirements are as follows:
......

BASIC DATA
Standard ID: GB/T 38673-2020 (GB/T38673-2020)
Description (Translated English): Information technology -- Big data -- Basic requirements for big data systems
Sector / Industry: National Standard (Recommended)
Classification of Chinese Standard: L67
Classification of International Standard: 35.240
Word Count Estimation: 14,111
Date of Issue: 2020-04-28
Date of Implementation: 2020-11-01
Quoted Standard: GB/T 35295-2017; GB/T 35589-2017
Drafting Organization: China Electronics Standardization Institute, Huawei Technologies Co., Ltd., Peking University, Renmin University of China, ZTE Corporation, Inspur Electronic Information Industry Co., Ltd., Alibaba Cloud Computing Co., Ltd., Tianjin Nanda General Data Technology Co., Ltd., Beijing Percentage Information Technology Co., Ltd., Fudan University, Nanjing University, Southeast University, Beijing Hezhongning Information Technology Co., Ltd., Beijing Tus Blockchain Technology Development Co., Ltd.
Administrative Organization: National Information Technology Standardization Technical Committee (SAC/TC 28)
Proposing organization: National Information Technology Standardization Technical Committee (SAC/TC 28)
Issuing agency(ies): State Administration for Market Regulation, National Standardization Administration
Summary: This standard specifies the functional and non-functional requirements for big data systems. This standard is applicable to the design, selection, acceptance and testing of various types of big data systems.