GB/T 45288.2-2025 PDF English

GB/T 45288.2-2025: Artificial intelligence - Large-scale model - Part 2: Testing and evaluation for metrics and methods
Status: Valid

Basic data

Standard ID: GB/T 45288.2-2025 (GB/T45288.2-2025)
Description (Translated English): Artificial intelligence - Large-scale model - Part 2: Testing and evaluation for metrics and methods
Sector / Industry: National Standard (Recommended)
Classification of Chinese Standard: L70
Classification of International Standard: 35.240
Word Count Estimation: 30,396
Date of Issue: 2025-02-28
Date of Implementation: 2025-02-28
Issuing agency(ies): State Administration for Market Regulation, China National Standardization Administration

--- This is a DRAFT version for illustration, not a final translation. The full English true-PDF (including equations, symbols, images, flow charts, tables, and figures) is translated manually upon order.
GB/T 45288.2-2025 (English version)
ICS 35.240
CCS L70
National Standard of the People's Republic of China
Artificial Intelligence Big Model - Part 2: Evaluation indicators and methods
Artificial intelligence - Large-scale model - Part 2: Testing and evaluation for metrics and methods
Issued on 2025-02-28; implemented on 2025-02-28
Issued by the State Administration for Market Regulation and the National Standardization Administration

Table of Contents

Foreword
Introduction
1 Scope
2 Normative references
3 Terms and definitions
4 Abbreviations
5 Evaluation indicators
  5.1 Comprehension ability evaluation indicators
  5.2 Generation capability evaluation indicators
6 Evaluation methods
  6.1 Overview
  6.2 Evaluation dataset
  6.3 Evaluation environment
  6.4 Evaluation tools
  6.5 Evaluation implementation
Appendix A (Informative) Evaluation index calculation method
  A.1 Objective evaluation method
  A.2 Subjective evaluation method
References

Foreword

This document was drafted in accordance with the provisions of GB/T 1.1-2020 "Guidelines for standardization work - Part 1: Structure and drafting rules for standardization documents".

This document is Part 2 of GB/T 45288 "Artificial intelligence - Large-scale model". GB/T 45288 has been published in the following parts:
--- Part 1: General requirements;
--- Part 2: Evaluation indicators and methods;
--- Part 3: Service capability maturity assessment.

Please note that some of the contents of this document may involve patents. The issuing organization of this document does not assume responsibility for identifying patents.

This document was proposed by and is under the jurisdiction of the National Information Technology Standardization Technical Committee (SAC/TC28).

This document was drafted by: China Electronics Technology Standardization Institute, Shanghai Artificial Intelligence Innovation Center, Institute of Automation, Chinese Academy of Sciences, Ant Group Co., Ltd., Beijing University of Aeronautics and Astronautics, Tsinghua University, Hangzhou Lianhui Technology Co., Ltd., China Railway Construction Corporation Co., Ltd., Beijing Baidu Netcom Technology Co., Ltd., China Southern Power Grid Co., Ltd., China Mobile Communications Co., Ltd. Research Institute, China Energy Investment Group Information Technology Co., Ltd., Huawei Cloud Computing Technology Co., Ltd., Shanghai SenseTime Intelligent Technology Co., Ltd., Alibaba Cloud Computing Co., Ltd., Shenzhen Tencent Computer Systems Co., Ltd., Beijing Qihoo Technology Co., Ltd., Beijing Zhiyuan Artificial Intelligence Research Institute, China Railway Fifth Survey and Design Institute Group Co., Ltd., Beijing Zhipu Huazhang Technology Co., Ltd., Inspur Cloud Information Technology Co., Ltd., iFlytek Co., Ltd., China Electric Power Research Institute Co., Ltd., Tianjin University, China Telecom Research Institute, China Central Radio and Television China Central Television, Beijing Baichuan Intelligent Technology Co., Ltd., Tongfang Knowledge Network Digital Publishing Technology Co., Ltd., Beijing Zhongguancun Laboratory, Shanghai Harbin Artificial Intelligence Industry Association, China Southern Power Grid Research Institute Co., Ltd., Xidian University, Southwest University of Science and Technology, Harbin University of Science and Technology, Institute of Software, Chinese Academy of Sciences, Wuhan Institute of Artificial Intelligence, Peking University, Qingdao Hisense Electronic Technology Service Co., Ltd., Beijing DeepGlint Information Technology Co., Ltd., Beijing University of Technology, China Southern Power Grid Artificial Intelligence Technology Co., Ltd., China Telecom Group Co., Ltd., Tianyi Cloud Technology Co., Ltd., Beijing Software Product Quality Testing and Inspection Center Co., Ltd., Beijing Century Good Future Education Technology Co., Ltd., Beijing Xiaomi Mobile Software Co., Ltd., Beijing Zhixin Microelectronics Technology Co., Ltd., China Mobile Communications Group Co., Ltd., Cloud Zhisheng Intelligent Technology Co., Ltd., Beijing Zhongguancun Kejin Technology Co., Ltd., Qingdao Haier Technology Co., Ltd., Hangzhou Hikvision Digital Technology Co., Ltd., BOE Technology Group Co., Ltd., Kunlun Digital Intelligence Technology Co., Ltd., Inspur Electronic Information Industry Co., Ltd., Inspur Software Technology Co., Ltd., Mashang Consumer Finance Co., Ltd., Pengcheng Laboratory, Pingtouge (Shanghai) Semiconductor Technology Co., Ltd., Qilin Hesheng Network Technology Co., Ltd., Shandong Inspur Science Research Institute Co., Ltd., Shandong Artificial Intelligence Research Institute, Shanghai Computer Software Technology Development Center, Shanghai Artificial Intelligence Research Institute Co., Ltd., Beijing Ansheng Technology Co., Ltd., Shanghai Suiyuan Technology Co., Ltd., Shanghai Tianshu Zhixin Semiconductor Co., Ltd., Shenzhen Qianhai Weizhong Bank Co., Ltd., Shenzhen Simo Information Technology Co., Ltd., Northwestern Polytechnical University, Siemens (China) Co., Ltd., CloudWalk Technology Group Co., Ltd., Shanghai Wenyue Information Technology Co., Ltd., Zhejiang Dahua Technology Co., Ltd., Wanda Information Co., Ltd., Shanghai Xuanwu Information Technology Co., Ltd., China Mobile Internet Co., Ltd., Sichuan Changhong Electronics Holding Group Co., Ltd.

The main drafters of this document are:
Huang Xiancui, Sun Chuanxing, Ma Shanshan, Li Dong, Yu Dianhai, Long Yun, Liu Weidong, Jing Dichun, Zheng Zimu, Jiang Hui, Peng Juntao, Hu Zhichao, Zhang Xiangzheng, Yang Xi, Zheng Zhong, Feng Tao, Zheng Jiajia, Liu Cong, Zhou Fei, Chen Xi, Li Jianxin, Xiong Deyi, Yang Mingchuan, Wang Feng, Mei Jianping, Chen Weipeng, Zhang Hongwei, Zhang Songyang, Peng Jin, Liu Jing, Liu Aishan, Wang Jiakai, Gao Donghui, Ma Tongsen, Zhang Tianlin, Gao Tiezhu, Chen Xi, Liang Zhihong, He Gang, Yu Wenxin, Yang Muyun, Meng Lingzhong, Zhu Guibo, Wang Jinqiao, Zheng Ruolin, Shen Zhiyue, Nie Jiandi, Ren Haifeng, Shi Xian, Wu Xihong, Liu Shang, Liu Weiwei, Shi Congcong, Ding Peng, Liu Xiaoou, Xiang Chao, Xue Dejun, Wang Longyue, Liu Wei, Hu Quanyi, Sun Haoyuan, Sun Lin, Zhao Bimei, Xuan Richeng, Zhao Chunhao, Suo Siliang, Chen Liming, Jiang Yixin, Wu Shanshan, Gao Pengjun, Kong Hao, Xue Yunzhi, Liu Zitao, Yu Lei, Zheng Zhe, Deng Chao, Liang Jiaen, Cui Mingfei, E Lei, Ren Ye, Zhang Zhigang, Chen Hongzhi, Wu Shaohua, Wang Kechen, Feng Yue, Li Rui, Li Jinwei, Long Zhenyue, Gao Hui, Zhang Xu, Duan Qiang, Shan Ke, Chen Mingang, Song Haitao, Liu Yifan, Wang Sishan, Yu Xuesong, Li Bin, Zhang Chi, Zhang Tao, Sheng Ruogu, Sun Jin, Rui Ziwen, Kong Weisheng, Tong Qing, Yang Dengfeng, Sun Wenqing, Zhu Lin, Yang Lan.

Introduction

Large models have become an important technical means for the development of artificial intelligence and play an important role in leading industrial transformation. Relevant institutions have successively researched and developed more than 100 large model products and evaluation rankings, which makes it difficult for users to effectively evaluate the technical level of artificial intelligence products. GB/T 45288 "Artificial intelligence - Large-scale model" aims to specify the technical requirements, evaluation indicators and service capabilities of general-purpose large models. It is proposed to consist of five parts:
--- Part 1: General requirements. The purpose is to establish a reference architecture for large models and specify general technical requirements.
--- Part 2: Evaluation indicators and methods. The purpose is to establish the evaluation indicators of large models and describe the evaluation methods.
--- Part 3: Service capability maturity assessment. The purpose is to provide the maturity levels of large model service capability and the assessment method.
--- Part 4: Computer vision large models. The purpose is to define the concept and function of computer vision large models and specify the technical requirements and test methods.
--- Part 5: Multimodal large models. The purpose is to define the concept and function of multimodal large models and specify the technical requirements and test methods.

Artificial intelligence - Large-scale model - Part 2: Evaluation indicators and methods

1 Scope

This document establishes the evaluation indicators for large AI models and describes the methods for evaluating them. This document is applicable to model providers, application service providers, and application consumers for evaluating and testing the capabilities of large models, and for guiding the design, development and application of large models.

2 Normative references

The contents of the following documents constitute essential provisions of this document through normative references in the text. For dated references, only the edition corresponding to that date applies to this document; for undated references, the latest edition (including all amendments) applies to this document.

GB/T 42755-2023 Artificial Intelligence - Data Labeling Procedure for Machine Learning
GB/T 45288.1 Artificial Intelligence - Large Model - Part 1: General Requirements

3 Terms and definitions

The terms and definitions defined in GB/T 45288.1 apply to this document.

4 Abbreviations

The following abbreviations apply to this document.

API: Application Programming Interface
BLEU: Bilingual Evaluation Understudy
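This excerpt does not include the standard's own calculation method for BLEU (Appendix A is not shown), so the following is only a minimal sketch of the commonly cited definition: clipped n-gram precision for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. The function names `ngram_counts` and `bleu`, the single-reference setup, whitespace tokenization and the absence of smoothing are all assumptions made for illustration, not details taken from GB/T 45288.2-2025.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, without smoothing.

    Hypothetical helper for illustration; returns 0.0 as soon as any
    n-gram order has no match, because the geometric mean is then zero.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
print(bleu("the cat sat on the mat".split(), reference))
print(bleu("the cat sat on".split(), reference))
```

In the second call every n-gram of the candidate appears in the reference, so the score is reduced only by the brevity penalty. Production evaluations usually rely on an established implementation with corpus-level statistics and smoothing rather than a hand-written function like this one.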

5 Evaluation Indicators

5.1 Comprehension Ability Evaluation Indicators

5.1.1 Overview

The evaluation of large model comprehension ability is divided into a single-modal dimension and a multimodal dimension. The single-modal dimension mainly includes text, image, and audio. The multimodal dimension mainly includes four secondary dimensions: image-text, text-audio, image-audio, and image-text-audio. The types of tasks are shown in Table 1.
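One way to read the four multimodal secondary dimensions is as every combination of two or more of the three single modalities. The short sketch below enumerates them under that reading; the names `SINGLE_MODAL` and `multimodal_dimensions` are hypothetical and do not come from the standard.

```python
from itertools import combinations

# Single-modal dimensions named in 5.1.1 (identifier chosen for this sketch).
SINGLE_MODAL = ("image", "text", "audio")


def multimodal_dimensions(modalities=SINGLE_MODAL):
    """Return every combination of two or more modalities."""
    return [
        combo
        for size in range(2, len(modalities) + 1)
        for combo in combinations(modalities, size)
    ]


for combo in multimodal_dimensions():
    print(" + ".join(combo))
```

With three modalities this yields three pairs and one triple, which matches the count of four secondary dimensions given in the text.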
...
