GB/T 41813.1-2022 English PDF US$339.00 · In stock
Delivery: <= 4 days. A true-PDF full copy in English will be manually translated and delivered via email.
GB/T 41813.1-2022: Information technology - Intelligent speech interaction testing method - Part 1: Speech recognition
Status: Valid
Basic data
Standard ID: GB/T 41813.1-2022 (GB/T41813.1-2022)
Description (Translated English): Information technology - Intelligent speech interaction testing method - Part 1: Speech recognition
Sector / Industry: National Standard (Recommended)
Classification of Chinese Standard: L77
Classification of International Standard: 35.240.01
Word Count Estimation: 18,142
Date of Issue: 2022-10-14
Date of Implementation: 2023-05-01
Issuing agency(ies): State Administration for Market Regulation, China National Standardization Administration

GB/T 41813.1-2022: Information technology - Intelligent speech interaction testing method - Part 1: Speech recognition
This is a DRAFT version for illustration, not a final translation. A full copy of the true-PDF English version (including equations, symbols, images, flow charts, tables, figures, etc.) will be manually and carefully translated upon your order.

ICS 35.240.01
CCS L77
National Standard of the People's Republic of China
Information technology - Intelligent speech interaction testing method - Part 1: Speech recognition
Date of implementation: 2023-05-01
Issued by the State Administration for Market Regulation and the National Standardization Administration

Contents
Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Overview
5 Test preparation and execution
5.1 Test dataset
5.2 Test tools
5.3 Test equipment
5.4 Test environment
5.5 Test execution
5.6 Test results
6 Functional test method
6.1 Voice signal collection
6.2 Speech to text
6.3 Voice wake-up
6.4 Front-end signal processing
6.5 Speaker separation
6.6 Language information recognition
6.7 Post-processing of speech recognition
7 Performance test method
7.1 Speech recognition effect
7.2 Speech recognition efficiency
7.3 Voice wake-up effect
7.4 Front-end signal processing effect
7.5 Speaker separation effect
7.6 Language information recognition effect
7.7 System stability
References

Foreword
This document was drafted in accordance with the provisions of GB/T 1.1-2020 "Guidelines for Standardization Work - Part 1: Structure and Drafting Rules of Standardization Documents".
This document is Part 1 of GB/T 41813 "Information technology - Intelligent speech interaction testing method". The following parts of GB/T 41813 have been published:
--- Part 1: Speech recognition;
--- Part 2: Semantic understanding.
Please note that some content of this document may involve patents. The issuing agency of this document assumes no responsibility for identifying patents.
This document was proposed by, and is under the jurisdiction of, the National Information Technology Standardization Technical Committee (SAC/TC 28).
Drafting organizations: China Electronics Standardization Institute; iFLYTEK Co., Ltd.; Xiaomi Communication Technology Co., Ltd.; China For Terminal Co., Ltd.; Shenzhen Ubisoft Technology Co., Ltd.; China Telecom Group Co., Ltd.; Speed Technology Co., Ltd.; Institute of Automation, Chinese Academy of Sciences; Institute of Biomedical Engineering, Chinese Academy of Medical Sciences; Harbin Institute of Technology; Hisense Video Technology Co., Ltd.; Immediate Consumer Finance Co., Ltd.; Tencent Technology (Beijing) Co., Ltd.; Shenyang SIASUN Robot Automation Co., Ltd.; Shenzhen Human Horse Interactive Technology Co., Ltd.; Ping An Technology (Shenzhen) Co., Ltd.; Anhui Mickey Technology Co., Ltd.; Jingfeng Technology (Shenzhen) Co., Ltd.; Beijing Jietong Huasheng Technology Co., Ltd.; Beijing Baidu Netcom Technology Co., Ltd.; Shenzhen Beikeraisheng Technology Co., Ltd.; Alibaba Cloud Computing Co., Ltd.; Yuncong Technology Group Co., Ltd.; NetEase (Hangzhou) Network Co., Ltd.; Nanjing Yunwen Network Technology Co., Ltd.; Lenovo (Beijing) Co., Ltd.; Fuzhou Data Technology Research Institute Co., Ltd.; National Network Software Product Quality Supervision and Inspection Center (Jinan); China Automotive Research Institute (Tianjin) Automotive Engineering Research Institute Co., Ltd.; South China University of Technology; Shandong Provincial Computing Center (National Supercomputing Jinan Center); Zhongke Jiyuan (Hangzhou) Intelligent Technology Co., Ltd.; Shensi Electronic Technology Co., Ltd.; Zhengzhou Zhongye Technology Co., Ltd.; China Automotive Data (Tianjin) Co., Ltd.; China Electric Appliance Research Institute Co., Ltd.; Shanghai Computer Software Technology Development Center; Beijing Love Digital Wisdom Technology Co., Ltd.
Main drafters: Dong Jian, Xu Yang, Wu Guogang, Ma Wanzhong, Zhu Yajun, Jia Yijun, Zhou Lijun, Song Wenlin, Yuan Jie, Yang Zhen, Tian Dingshu, Qian Yanmin, Tao Jianhua, Hua Yunfei, Pu Jiangbo, Liu Bin, Li Haifeng, Wang Feng, Yang Chunyong, Sudan, Zhang Feng, Feng Haihong, Liu Guotao, Ren Junmin, Chen Nan, Xing Qizhou, Wei Tao, Li Xiaoru, Huang Shilei, Wang Miaomiao, Li Jun, Hu Guanglong, Yang Meng, Meng Xianming, Wen Zhengqi, Lu Fei, Fang Bin, Wang Yue, Jing Kun, Li Jie, Zhang Ying, Cai Lizhi, Xu Xiangmin, Gao Yongchao, Zhang Qingqing.

Introduction
Intelligent voice interaction is widely used in many fields, including smart home, smart customer service, mobile terminals, vehicle terminals, smart education, smart medical care, smart office, and service robots, and has become one of the important modes of human-computer interaction. As intelligent voice interaction penetrates ever deeper into all aspects of production and daily life, the system reference framework, basic technical requirements, and Internet interface requirements of intelligent voice interaction need to be uniformly specified.
In this regard, the country has formulated basic national standards to support intelligent voice interaction systems. On this basis, testing methods and evaluation criteria are also needed to assess the capability of intelligent voice interaction systems and to provide basic methods and a basis for evaluating related products and services. GB/T 41813 "Information technology - Intelligent speech interaction testing method" provides basic and general testing methods for GB/T 36464 (all parts) "Information technology - Intelligent speech interaction system".
Intelligent voice interaction includes three basic links: speech recognition, semantic understanding, and speech synthesis. The test objects, test items, test environments, and test methods involved in each link are different. GB/T 41813 aims to establish and describe general test items and general test methods applicable to each link of intelligent voice interaction. It consists of three parts:
--- Part 1: Speech recognition. The purpose is to provide general test items and general test methods for speech recognition in intelligent voice interaction applications.
--- Part 2: Semantic understanding. The purpose is to provide general test items and general test methods for semantic understanding in intelligent voice interaction applications.
--- Part 3: Speech synthesis. The purpose is to provide general test items and general test methods for speech synthesis in intelligent voice interaction applications.

Information technology - Intelligent speech interaction testing method - Part 1: Speech recognition

1 Scope
This document describes the general test items and general test methods for speech recognition systems in intelligent voice interaction testing.
This document applies to the design and implementation of tests of speech recognition systems for intelligent speech interaction applications by intelligent speech service providers, users, and third-party testing agencies.

2 Normative references
The contents of the following documents constitute essential provisions of this document through normative references in the text. For dated references, only the edition corresponding to that date applies to this document; for undated references, the latest edition (including all amendments) applies to this document.
GB/T 21023 General technical specification for Chinese speech recognition system
GB/T 36464 (all parts) Information technology - Intelligent speech interaction system

3 Terms and definitions
The terms and definitions defined in GB/T 36464 (all parts) and the following terms and definitions apply to this document.
3.1 speech recognition
The process of converting human voice signals into words or instructions.
[Source: GB/T 36464.1-2020, 3.7]
3.2 speaker separation (speaker diarization)
The process of performing speaker segmentation and speaker clustering for multiple speakers in an audio stream containing valid speech signals.
NOTE: The purpose of speaker separation is generally to classify and track multiple speakers present in a space.
3.3 speaker segmentation
Finding the temporal boundaries of speaker changes among multiple speakers and splitting the audio stream into multiple speech segments according to these boundaries.
3.4 speaker clustering
Grouping together the speech segments that belong to the same speaker.
3.5 speech coding (speech encoding; speech waveform coding)
The transformation of a digitized speech signal into a discrete sequence of data elements, according to a scheme that allows the speech signal to be reasonably reconstructed.
NOTE: Speech digitization can be combined with coding for speech compression.
Therefore, the term "speech coding" often refers to such combined operations.
[Source: GB/T 5271.29-2006, 29.01.23]
......

Tips & Frequently Asked Questions
Question 1: How long will it take to deliver the true-PDF of GB/T 41813.1-2022_English?
Answer: Upon your order, we will start to translate GB/T 41813.1-2022_English as soon as possible and keep you informed of the progress. The lead time is typically 2 ~ 4 working days. The longer the document, the longer the lead time.
Question 2: Can I share the purchased PDF of GB/T 41813.1-2022_English with my colleagues?
Answer: Yes. The purchased PDF of GB/T 41813.1-2022_English will be deemed to be sold to the employer/organization that actually pays for it, including your colleagues and your employer's intranet.
Question 3: Does the price include tax/VAT?
Answer: Yes. Our tax invoice, downloaded/delivered in 9 seconds, includes all tax/VAT and complies with 100+ countries' tax regulations (tax exempted in 100+ countries). See Avoidance of Double Taxation Agreements (DTAs): List of DTAs signed between Singapore and 100+ countries.
Question 4: Do you accept my currency other than USD?
Answer: Yes. If you need your currency to be printed on the invoice, please write an email to Sales@ChineseStandard.net. Within 2 working hours, we will create a special link for you to pay in any currency. Otherwise, follow the normal steps: Add to Cart -- Checkout -- Select your currency to pay.