
GB/T 38673-2020 PDF in English


GB/T 38673-2020 (GB/T38673-2020, GBT 38673-2020, GBT38673-2020)
Standard ID: GB/T 38673-2020
Contents [version]: English
USD: 205
[PDF] delivered in: 0-9 seconds. Auto-delivery.
Name of Chinese Standard: Information technology -- Big data -- Basic requirements for big data systems
Status: Valid


GB/T 38673-2020
GB
NATIONAL STANDARD OF THE PEOPLE’S REPUBLIC OF CHINA
ICS 35.240
L 67

Information technology - Big data - Basic requirements for big data systems

ISSUED ON: APRIL 28, 2020
IMPLEMENTED ON: NOVEMBER 01, 2020
Issued by: State Administration for Market Regulation; Standardization Administration of the People’s Republic of China.

Table of Contents
Foreword
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations
5 Big data system framework
6 Functional requirements
7 Non-functional requirements

Information technology - Big data - Basic requirements for big data systems

1 Scope
This Standard specifies the functional requirements and non-functional requirements of big data systems.
This Standard is applicable to the design, model selection, acceptance and testing of various big data system requirements.

2 Normative references
The following documents are indispensable for the application of this document. For dated references, only the dated version applies to this document. For undated references, the latest edition (including all amendments) applies to this document.
GB/T 35295-2017, Information technology - Big data - Terminology
GB/T 35589-2017, Information technology - Big data - Technical reference model

3 Terms and definitions
Terms and definitions determined by GB/T 35295-2017 and the following ones are applicable to this document. For ease of use, some of the terms and definitions in GB/T 35295-2017 are repeated below.

3.1 Big data system
The system that implements all or part of the big data reference architecture.
[GB/T 35295-2017, Definition 2.1.14]

3.2 Distributed computing
A computing mode that covers the storage layer and the processing layer and is used to implement multi-type programming algorithm models.
c) It shall provide column conversion, row conversion and table conversion functions of structured data;
d) It shall provide data loading function, to support the loading of cleaned and converted data to the data analysis module;
e) It should provide data comparison function before and after cleaning;
f) It should support data conversion function of unstructured data.

6.3 Data storage module
The data storage module requirements are as follows:
a) It shall provide data storage function, to support the storage of structured data, unstructured data and semi-structured data.
b) It shall provide the function of exchanging data or files with relational databases and other file systems.
c) Support distributed file storage, to realize the following functions:
   1) It shall support basic operations of the file system, including upload, download, read and write, copy, move, delete, rename, permission modification, etc.;
   2) It shall support multi-copy storage and recovery functions of data blocks;
   3) It should support the function of fast retrieval of files, and support the unified retrieval, cataloging, adding and deleting operations of data resources;
   4) It should support data compression storage function.
d) Support distributed column data storage, to achieve the following functions:
   1) It shall support the function of storing data in the form of key-value;
   2) It should support user authority management functions that are based on tables, column families, and columns. Authority management operations include read, write, and create.
e) Support distributed structured data storage, to achieve the following functions:
   1) It should support distributed storage of structured data, to ensure the scalability and consistency of data storage;
......
   1) Built-in graph data query API, support synchronous or asynchronous computing model to write iterative algorithms;
   2) Online graph analysis and query function;
   3) Graph data expression that is based on the attribute graph model, including the label and attribute type definition on the node/edge;
   4) Built-in common graph index calculation function, to describe the topological structure characteristics of graphs.
d) It should support memory computing, to realize the following functions:
   1) Provide data processing capabilities through distributed memory computing and DAG execution engine;
   2) Support multiple data types, including data processing of structured data, unstructured data, and semi-structured data.
e) It should support the batch stream integration computing framework, to achieve the following functions:
   1) Batch stream integration unified query SQL language;
   2) Streaming SQL in multiple scenarios, such as location information analysis, etc.;
   3) Common time windows, including jumping windows, sliding windows, etc.
f) It should support automatic scheduling of tasks according to the dependencies between tasks.
g) It should support the description of multi-task dependencies within the job in the form of a directed acyclic graph.
h) It should provide the ability to dispatch complex tasks.
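
As an illustration of items f) and g) above, the following minimal sketch shows one way a job's task dependencies could be described as a directed acyclic graph and turned into an execution order. It is not taken from the Standard: the function name schedule, the task names and the batch-by-batch output format are all assumptions made for this example, and the ordering relies on Python's standard graphlib module.

```python
from graphlib import TopologicalSorter


def schedule(dependencies):
    """Group tasks into batches that can run once their prerequisites finish.

    `dependencies` maps each task name to the set of tasks it depends on
    (a hypothetical job description, not a format defined by the Standard).
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose prerequisites are done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock dependents
    return batches


# Hypothetical job: two extractions feed a join, which feeds a report.
job = {
    "join": {"extract_a", "extract_b"},
    "report": {"join"},
}
print(schedule(job))
```

Tasks that share a batch have no dependency on each other, so a scheduler could dispatch them in parallel; a cyclic description is rejected before anything runs.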
6.5 Data analysis module
The data analysis module requirements are as follows:
a) Support data query, to realize the following functions:
   1) It shall provide the function of querying through a standard database connection interface;
   2) It shall provide the function of querying through the REST API query interface;
   3) It should support data statistics on real-time streams;
   4) It should support the sorting of streaming data;
   5) It should support the association with static tables;
   6) It should support the associated processing of multiple data streams.
f) It should support interactive on-line analysis, to achieve the following functions:
   1) Perform distributed on-line analysis of data through structured query language, such as OLAP;
   2) Perform ad hoc query of data through structured query language;
   3) Use visualization middleware to display data analysis results;
   4) Define the calculation formula and parameter configuration during the interactive analysis process;
   5) Automatically save and roll back during interactive analysis;
   6) Save and publish analysis results during interactive analysis;
   7) Interactive data analysis based on online on-line analysis.
g) It should support visual process editing operations, to achieve the following functions:
   1) Perform process editing and revision through drag;
   2) Support workflow dispatch trigger mechanism, configurable trigger time or trigger event;
   3) Support the persistent storage of process editing results.

6.6 Data visualization module
The requirements of the visualization module are as follows:
a) It should support the use of conventional charts to display data, such as tables, bar charts, pie charts, line charts, heat maps;
b) It should support the API of third-party data visualization tools.
6.7 Data access module
d) It shall provide service management functions, including the management of big data system component services;
e) It should provide the health check management function, to support the realization of cluster health check through a graphical interface.

7 Non-functional requirements

7.1 Reliability requirements

7.1.1 High availability
High availability requirements are as follows:
a) It shall provide the system automatic fault detection and management functions;
b) It shall ensure that there is no single point failure risk for system components;
c) When any node of the cluster fails, there shall be no service interruption, data loss or data inconsistency;
d) When any unit of the cluster fails, the system operation shall not be affected;
e) It shall guarantee that the system operates without any problems for a long time without interruption.

7.1.2 Data redundant storage and distribution
Data redundancy storage and distribution requirements are as follows:
a) It shall provide the metadata multi-copy memory function; the failure of any node will not affect the system's ability to continue to provide services;
b) It shall provide the master copy planning function that is based on partition fault tolerance, with the ability to plan the physical distribution of each copy data in advance.

7.1.3 Data backup and recovery
The data backup and recovery requirements are as follows:
a) It shall provide distributed file storage backup and recovery functions;
b) It shall configure authority for users according to the principle of minimizing authority;
c) It shall support the allocation of authority for users according to the granularity of the data table level and the data column level;
d) It shall support the allocation of authority for users according to different operation types (such as adding, deleting, modifying, checking, executing).
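
The authority items above (minimum authority, table-level and column-level granularity, allocation per operation type) can be pictured with the small sketch below. It is an illustration under stated assumptions, not an interface defined by the Standard: the class name Grants, its methods and the operation names are hypothetical, with "query" standing in for the "checking" operation. Access is denied unless a matching grant exists, which is one simple way to approximate the minimum-authority principle.

```python
# Hypothetical operation names mirroring the examples given in the text.
OPERATIONS = {"add", "delete", "modify", "query", "execute"}


class Grants:
    """In-memory authority table with default-deny semantics (illustrative only)."""

    def __init__(self):
        # Key: (user, table, column); column None means the whole table.
        self._grants = {}

    def grant(self, user, table, operation, column=None):
        if operation not in OPERATIONS:
            raise ValueError(f"unknown operation: {operation}")
        self._grants.setdefault((user, table, column), set()).add(operation)

    def is_allowed(self, user, table, operation, column=None):
        # A table-level grant covers every column of that table.
        if operation in self._grants.get((user, table, None), set()):
            return True
        # Otherwise only an explicit column-level grant is accepted.
        if column is not None:
            return operation in self._grants.get((user, table, column), set())
        return False


grants = Grants()
grants.grant("alice", "orders", "query", column="amount")  # column-level grant
grants.grant("bob", "orders", "modify")                     # table-level grant
print(grants.is_allowed("alice", "orders", "query", column="amount"))
print(grants.is_allowed("alice", "orders", "query", column="customer"))
```

Keeping table-level and column-level grants in one keyed structure makes the narrower grant the default choice: a user gets a whole table only when that is granted explicitly.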
7.3.3 Log management
The log management requirements are as follows:
a) It shall provide the function of recording system operation logs, to record important operations of users;
b) It shall ensure that the system operation log cannot be deleted, modified or overwritten;
c) The operation log shall include date, time, operator information, operation type, operation description and operation result;
d) It shall provide functions of statistics, query, analysis and report generation of system operation logs.

7.3.4 Data security
The data security requirements are as follows:
a) It shall provide data storage encryption and decryption functions, to support database-level data encryption;
b) It shall provide encrypted transmission function of system sensitive data, and the encryption key can be replaced;
c) It should support data encryption at the data column level.

7.4 Scalability requirements
The system scalability requirements are as follows:
a) It shall provide online cluster expansion and reduction functions;
b) It shall provide offline cluster expansion and reduction functions.

7.5 Maintainability requirements
The system maintainability requirements are as follows:
......
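
Returning to the log fields listed in 7.3.3 c), the sketch below shows one possible shape for an operation log record and an append-only store with a simple query function. All names here (OperationLogEntry, OperationLog, record, query) are hypothetical and chosen for this example. The frozen record and the absence of delete or update methods only illustrate the intent of 7.3.3 b); a real system would also need storage-level protection, which this sketch does not attempt.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class OperationLogEntry:
    timestamp: datetime   # covers both date and time
    operator: str         # operator information
    operation_type: str
    description: str
    result: str


class OperationLog:
    """Append-only, in-memory log (illustrative; exposes no delete or update)."""

    def __init__(self):
        self._entries = []

    def record(self, operator, operation_type, description, result):
        entry = OperationLogEntry(
            datetime.now(timezone.utc), operator, operation_type, description, result
        )
        self._entries.append(entry)
        return entry

    def query(self, operator=None, operation_type=None):
        # Return an immutable snapshot of the entries matching the filters.
        return tuple(
            e for e in self._entries
            if (operator is None or e.operator == operator)
            and (operation_type is None or e.operation_type == operation_type)
        )


log = OperationLog()
log.record("alice", "delete", "dropped table t1", "success")
log.record("bob", "modify", "changed column type", "failure")
print(len(log.query(operation_type="delete")))
```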
 
Source: Above contents are excerpted from the PDF -- translated/reviewed by: www.chinesestandard.net / Wayne Zheng et al.