GB/T 37964-2019 English PDF
US$679.00 · In stock
Delivery: 6 days or fewer. A true-PDF full copy in English will be manually translated and delivered via email.
GB/T 37964-2019: Information security technology - Guide for de-identifying personal information
Status: Valid
Basic data
Standard ID: GB/T 37964-2019 (GB/T37964-2019)
Description (Translated English): Information security technology - Guide for de-identifying personal information
Sector / Industry: National Standard (Recommended)
Classification of Chinese Standard: L80
Classification of International Standard: 35.040
Word Count Estimation: 34,386
Date of Issue: 2019-08-30
Date of Implementation: 2020-03-01
Issuing agency(ies): State Administration for Market Regulation, China National Standardization Administration

This is a DRAFT version for illustration, not a final translation. A full copy of the true-PDF English version (including equations, symbols, images, flow charts, tables, figures, etc.) will be manually and carefully translated upon your order.

ICS 35.040
L80
National Standard of the People's Republic of China
Information Security Technology - Guidelines for De-identification of Personal Information
Issued on 2019-08-30; implemented on 2020-03-01
Issued by the State Administration for Market Regulation and the China National Standardization Administration

Table of contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overview
4.1 De-identification goals
4.2 Principles of de-identification
4.3 Re-identification risk
4.4 Impact of de-identification
4.5 The impact of different public sharing types on de-identification
5 De-identification process
5.1 Overview
5.2 Determine goals
5.3 Identify identifiers
5.4 Process identifiers
5.5 Verification and approval
5.6 Monitoring and review
6 Role responsibilities and personnel management
6.1 Role responsibilities
6.2 Personnel management
Appendix A (informative appendix) Commonly used de-identification techniques
Appendix B (informative appendix) Commonly used de-identification models
Appendix C (informative appendix) Selection of de-identification models and technologies
Appendix D (informative appendix) Challenges of de-identification
References

Foreword
This standard was drafted in accordance with the rules given in GB/T 1.1-2009.
Please note that certain contents of this document may involve patents. The issuing agency of this document is not responsible for identifying these patents.
This standard was proposed by, and is under the jurisdiction of, the National Information Security Standardization Technical Committee (SAC/TC260).
Drafting organizations of this standard: Tsinghua University, Venus Star Information Technology Group Co., Ltd., Zhejiang Ant Small and Micro Financial Services Group Co., Ltd., Alibaba (Beijing) Software Service Co., Ltd., Beijing Qi Anxin Technology Co., Ltd., Beijing Tianrongxin Network Security Technology Co., Ltd., the Software Research Institute of the Chinese Academy of Sciences, China Software Evaluation Center, Shanghai Computer Software Technology Development Center, Beijing Digital Certification Co., Ltd., Xidian University, Hunan Kechuang Information Technology Co., Ltd., China Electronics Standardization Institute, Shaanxi Provincial Information Engineering Research Institute.
The main drafters of this standard: Jin Tao, Xie Anming, Chen Xing, Bai Xiaoyuan, Zheng Xinhua, Liu Xiangang, Chen Wenjie, Liu Yuling, Song Pengju, Zhao Liang, Song Lingdi, Ye Xiaojun, Wang Jianmin, Fang Ming, Pei Qingqi, Pan Zhengtai.

Introduction
In the era of big data, cloud computing, and the Internet of Everything, data-based applications are becoming ever more widespread, and this also brings enormous personal information security problems. This guide for the de-identification of personal information was formulated to protect the security of personal information while promoting the sharing and use of data.
This standard aims to draw on the latest research results on personal information de-identification at home and abroad, distill the best practices currently prevailing in the industry, and study the goals, principles, technologies, models, processes and organizational measures of personal information de-identification, in order to propose a guide for de-identifying personal information that resists security risks scientifically and effectively and meets the needs of informatization development.

The data sets to be de-identified that this standard focuses on are microdata (data sets represented as a set of records, which can logically be represented as a table). De-identification does not only delete or transform the direct identifiers and quasi-identifiers in the data set. It can also take the later application scenario into account when considering the re-identification risk of the data set, so that an appropriate de-identification model and technical measures are selected and an appropriate evaluation of the effect is carried out.

Data sets that are not microdata can either be converted into microdata for processing or be processed with reference to the goals, principles and methods of this standard. For example, for tabular data that contains multiple records about the same person, the multiple records can be spliced into one to form microdata in which each person has only one record.

Information Security Technology - Guidelines for De-identification of Personal Information

1 Scope
This standard describes the goals and principles of personal information de-identification and proposes a de-identification process and management measures.
This standard provides specific guidance on the de-identification of personal information in microdata. It applies to organizations carrying out personal information de-identification.
It also applies to organizations such as network security-related authorities and third-party assessment agencies carrying out personal information security supervision, management and assessment.

2 Normative references
The following documents are indispensable for the application of this document. For dated references, only the dated version applies to this document. For undated references, the latest version (including all amendments) applies to this document.
GB/T 25069-2010 Information Security Technical Terms

3 Terms and definitions
The terms and definitions in GB/T 25069-2010 and the following apply to this document.

3.1 Personal information
Any information, recorded electronically or by other means, that can identify a specific natural person, alone or in combination with other information, or that reflects the activities of a specific natural person.
[GB/T 35273-2017, definition 3.1]

3.2 Personal information subject
The natural person identified by personal information.
[GB/T 35273-2017, definition 3.3]

3.3 De-identification
The process of technically processing personal information so that the personal information subject cannot be identified without the use of additional information.
[GB/T 35273-2017, definition 3.14]
Note: De-identification removes the association between identifiers and the personal information subject.

3.4 Microdata
A structured data set in which each record (row) corresponds to one personal information subject and each field (column) in the record corresponds to one attribute.

3.5 Aggregate data
Data that characterizes a group of personal information subjects.
Note: For example, a collection of various statistical values.

3.6 Identifier
One or more attributes in microdata that can be used to uniquely identify the personal information subject.
Note: Identifiers are divided into direct identifiers and quasi-identifiers.
3.7 Direct identifier
An attribute in microdata that can, on its own, identify the personal information subject in a specific environment.
Note 1: The specific environment refers to the specific scenario in which the personal information is used. For example, within a particular school, a particular student can be directly identified by the student ID.
Note 2: Common direct identifiers include name, ID number, passport number, driver's license number, address, email address, phone number, fax number, bank card number, license plate number, vehicle identification number, social insurance number, health card number, medical record number, device identifier, biometric code, Internet Protocol (IP) address, and uniform resource locator (URL).

3.8 Quasi-identifier
An attribute in microdata that, combined with other attributes, can uniquely identify the personal information subject.
Note: Common quasi-identifiers include gender, date of birth or age, event date (such as admission, surgery, discharge, visit), location (such as zip code, building name, region), ethnic origin, country of birth, language, aboriginal status, visible minority status, occupation, marital status, education level, years of schooling, criminal history, total income, and religious beliefs.

3.9 Re-identification
The process of re-associating a de-identified data set with the original personal information subject or a group of personal information subjects.

3.10 Sensitive attribute
An attribute in the data set that needs to be protected, because the leakage, modification, destruction or loss of its value would harm the individual.
Note: Its value must be prevented from being associated with any personal information subject during a potential re-identification attack.

3.11 Usefulness
The property of data having a concrete meaning and practical value for an application.
Note:
De-identified data is used in a wide range of applications, and each application requires the de-identified data to have certain characteristics in order to achieve its purpose. These characteristics therefore need to be preserved after de-identification.

3.12 Complete public sharing
Once the data is released it is difficult to recall. Such data is generally released directly to the public over the Internet.

3.13 Controlled public sharing
The use of the data is constrained through a data use agreement.
Note: For example, the agreement prohibits the information receiver from launching re-identification attacks on individuals in the data set and from linking the data to external data sets or information.

3.14 Territory public sharing
Sharing within a physical or virtual territory. The data cannot flow out of the territory.

3.15 De-identification technology
Technology that reduces the degree of association between the information in a data set and the personal information subject.
Note 1: This means reducing the distinguishability of information so that it cannot be mapped to a specific individual. At an even lower degree of distinguishability, it cannot be determined whether different pieces of information correspond to the same individual. In practice, it is often required that the number of people to whom a piece of information may correspond exceed a certain threshold.
Note 2: Disconnecting the association with the personal information subject means separating the individual's other information from the identifying information.

3.16 De-identification model
A method that applies de-identification technology and can calculate the re-identification risk.

4 Overview
4.1 De-identification goals
De-identification goals include:
a) Delete or transform direct identifiers and quasi-identifiers, so that attackers cannot identify the original personal information subject from these attributes, either directly or in combination with other information;
b) Control the risk of re-identification: select appropriate models and technologies according to the available data and the application scenario, keep the re-identification risk within an acceptable range, ensure that the re-identification risk does not increase as new data is released, and ensure that potential collusion among data recipients does not increase the re-identification risk;
c) On the premise of controlling the re-identification risk, select appropriate de-identification models and technologies in light of business objectives and data characteristics, so that the de-identified data set meets its intended purpose (is useful) as far as possible.

4.2 Principles of de-identification
The following principles should be followed when de-identifying a data set:
a) Compliance. The work should satisfy the relevant provisions of China's laws, regulations and standards on the protection of personal information, and should keep track of changes to the relevant laws, regulations and standards;
b) Priority for personal information security protection. Personal information should be appropriately de-identified according to business objectives and security protection requirements, so that the de-identified data retains application value while the security of personal information is protected;
c) Combination of technology and management. Develop appropriate strategies based on the work objectives, select appropriate models and technologies, and use technical and management measures together to achieve the best results. This includes setting up specific positions with clearly defined responsibilities, and taking effective security measures for the auxiliary information (such as keys and mapping tables) produced during de-identification;
d) Full use of software tools.
For the de-identification of large-scale data sets, the use of software tools should be considered to improve de-identification efficiency and ensure effectiveness;
e) Continuous improvement. After the de-identification work is completed, it should be evaluated and regularly re-evaluated, assessing its effectiveness (including re-identification risk and usefulness) and efficiency against the work objectives, and the methods, techniques and tools should be continuously improved. The related work should be documented.

4.3 Re-identification risk
4.3.1 Re-identification methods
Common methods of re-identification are as follows:
a) Separation. Extract all records belonging to the same personal information subject;
b) Association. Link the information about the same personal information subject across different data sets;
c) Inference. Infer the value of an attribute, with a certain probability, from the values of other attributes.

4.3.2 Re-identification attacks
Common re-identification attacks include:
a) Re-identifying a record as belonging to a specific personal information subject;
b) Re-identifying the personal information subject of a specific record;
c) Associating as many records as possible with their corresponding personal information subjects;
d) Determining whether a specific personal information subject is present in the data set;
e) Inferring the sensitive attribute associated with a set of other attributes.

4.4 Impact of de-identification
De-identifying a data set changes the original data set and may affect the usefulness of the data. Business applications that use de-identified data sets should be fully aware of this and should consider the possible impact of the changes to the data set.

4.5 The impact of different public sharing types on de-identification
Before de-identification work begins, the public sharing type of the data needs to be determined according to the application requirements. The re-identification risks that different public sharing types may give rise to, and the corresponding de-identification requirements, are shown in Table 1.
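The "separation" method in 4.3.1 and the threshold mentioned in Note 1 of 3.15 can be illustrated with a small check. The sketch below is not taken from the standard: it assumes a data set held as a list of dictionaries, a hypothetical choice of quasi-identifier columns, and a hypothetical threshold of 2. It counts how many records share each combination of quasi-identifier values and reports the records whose combination is shared by fewer records than the threshold, which are the easiest to single out.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def records_below_threshold(records, quasi_identifiers, threshold):
    """Return records whose quasi-identifier combination is shared by fewer
    than `threshold` records (hypothetical risk criterion, not from the standard)."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return [
        r for r in records
        if sizes[tuple(r[q] for q in quasi_identifiers)] < threshold
    ]


# Hypothetical microdata with direct identifiers already removed.
records = [
    {"gender": "F", "birth_year": 1980, "zip": "100084", "diagnosis": "flu"},
    {"gender": "F", "birth_year": 1980, "zip": "100084", "diagnosis": "asthma"},
    {"gender": "M", "birth_year": 1975, "zip": "310000", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ("gender", "birth_year", "zip")  # assumed column names

at_risk = records_below_threshold(records, QUASI_IDENTIFIERS, threshold=2)
print(len(at_risk))  # 1: only the third record has a unique combination
```

A record flagged this way could be linked to an external data set that carries the same three attributes, which is the "association" method in 4.3.1. Which threshold is acceptable depends on the public sharing type and is a policy decision, not something this sketch can settle.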
5 De-identification process
5.1 Overview
The de-identification process can usually be divided into four steps: determining goals, identifying identifiers, processing identifiers, and verification and approval. Effective monitoring and review should take place during and after the implementation of these steps, as shown in Figure 1.
Figure 1 De-identification process

5.2 Determine goals
5.2.1 Overview
The goal-determination step includes determining the de-identification objects, establishing the de-identification goals, and formulating the work plan.

5.2.2 Determine de-identification objects
Determining the de-identification objects means determining the scope of the data sets that need to be de-identified. Which data belongs to the de-identification objects should be decided on the basis of the following factors:
a) Regulatory standards. Understand the relevant policies, laws, regulations and standards of the country, region or industry, and whether the data to be collected or released is subject to de-identification requirements.
b) Organizational strategy. Understand whether the data falls into the important or sensitive data categories listed by the organization, and whether the data application requires de-identification.
c) Data source. Understand whether de-identification-related commitments were made when the data was collected.
d) Business background. Understand the business characteristics of the information systems related to the data source, the business content and the business processes, and whether disclosing the data involves personal information security risks.
e) Data usage. Understand the purpose for which the data is to be released and whether it involves personal information security risks.
f) Relevance. Understand the history of data disclosure and de-identification, and whether the data to be disclosed is related to the historical data.
5.2.3 Establish de-identification goals
Establishing de-identification goals includes determining the level at which re-identification risk becomes unacceptable and the minimum requirements for data usefulness. Factors to consider include:
a) Data usage. Understand the purpose of the de-identified data and the functions and characteristics of the business systems involved, and consider the impact of de-identification, in order to determine the minimum requirements for data usefulness.
b) Data source. Understand the commitments made when the data was acquired and what personal information is involved.
c) Public sharing category. If personal information is de-identified for data release, understand whether the data will be shared through complete public sharing, controlled public sharing or territory public sharing, as well as the security measures for browsing and using the data.
d) Risk level. Understand the data attributes and business characteristics, the re-identification risk assessment model to be adopted, and the risk level that has been set.
e) De-identification models and technologies. Understand the protection or de-identification standards applicable to the data, as well as the de-identification models and technologies that may be used.

5.2.4 Develop a work plan
Develop an implementation plan for the de-identification of personal information. The plan includes the purpose, goals, data objects, public sharing method, implementation team, implementation scheme, stakeholders, emergency measures and schedule of the de-identification work.
After this content has been determined, the de-identification implementation plan should be approved and supported by the senior management of the organization.

5.3 Identify identifiers
5.3.1 Overview
Methods of identifying identifiers include the table look-up method, the rule determination method and the manual analysis method.
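As an illustration of the first of these methods, the table look-up method, here is a minimal sketch. It is not taken from the standard: the metadata rows, the field-name aliases and the function name `identify_by_lookup` are all hypothetical, and a real metadata table would carry more columns than the three shown. The idea is only that each field name of the data set is compared with a pre-built table of known identifier field names.

```python
# Hypothetical identifier metadata table. The rows and aliases below are
# illustrative assumptions, not content from the standard.
IDENTIFIER_METADATA = [
    {"name": "name", "kind": "direct",
     "field_names": {"name", "full_name", "xingming"}},
    {"name": "ID number", "kind": "direct",
     "field_names": {"id_number", "id_card", "sfzh"}},
    {"name": "date of birth", "kind": "quasi",
     "field_names": {"birth_date", "dob", "birthday"}},
    {"name": "gender", "kind": "quasi",
     "field_names": {"gender", "sex"}},
]


def identify_by_lookup(column_names, metadata=IDENTIFIER_METADATA):
    """Compare each column name with the metadata table and return
    {column_name: (identifier name, kind)} for the columns that match."""
    found = {}
    for column in column_names:
        key = column.strip().lower()  # normalize case and whitespace
        for entry in metadata:
            if key in entry["field_names"]:
                found[column] = (entry["name"], entry["kind"])
                break
    return found


# Hypothetical column names from a relational table.
columns = ["Full_Name", "id_card", "sex", "visit_count"]
matches = identify_by_lookup(columns)
for column, (identifier, kind) in matches.items():
    print(f"{column}: {identifier} ({kind} identifier)")
```

Exact matching on field names only works when the schema is already well understood. Columns with unfamiliar names, such as `visit_count` here, pass through unflagged and would need one of the other two methods.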
5.3.2 Table look-up method
The table look-up method means establishing a metadata table in advance that stores identifier information. When identifiers are being identified, the attribute name or field name of each attribute in the data under examination is compared one by one with the records in the metadata table, and the identifier data is identified in this way.
The identifier metadata table should include the identifier name, meaning, format requirements, commonly used data types and commonly used field names.
The table look-up method is suitable for de-identification scenarios in which the format and attributes of the data set are already clear, for example where a relational database is used and identifier fields such as name and ID number are already explicit in the table structure.

5.3.3 Rule determination method
The rule determination method means building a software program that analyzes the regularities of the data set and automatically discovers identifier data from them.
Organizations can analyze business chara......

Tips & Frequently Asked Questions:

Question 1: How long will the true-PDF of GB/T 37964-2019_English take to be delivered?
Answer: Upon your order, we will start to translate GB/T 37964-2019_English as soon as possible, and keep you informed of the progress. The lead time is typically 4 ~ 6 working days. The lengthier the document, the longer the lead time.

Question 2: Can I share the purchased PDF of GB/T 37964-2019_English with my colleagues?
Answer: Yes. The purchased PDF of GB/T 37964-2019_English will be deemed to be sold to your employer/organization who actually pays for it, including your colleagues and your employer's intranet.

Question 3: Does the price include tax/VAT?
Answer: Yes. Our tax invoice, downloaded/delivered in 9 seconds, includes all tax/VAT and complies with 100+ countries' tax regulations (tax exempted in 100+ countries) -- See Avoidance of Double Taxation Agreements (DTAs): List of DTAs signed between Singapore and 100+ countries

Question 4: Do you accept my currency other than USD?
Answer: Yes. If you need your currency to be printed on the invoice, please write an email to Sales@ChineseStandard.net. Within 2 working hours, we will create a special link for you to pay in any currency. Otherwise, follow the normal steps: Add to Cart -- Checkout -- Select your currency to pay.