GB/T 33994-2017: Information and documentation -- WARC file format
Status: Valid
Basic data
Standard ID: GB/T 33994-2017 (GB/T33994-2017)
Description (Translated English): Information and documentation -- WARC file format
Sector / Industry: National Standard (Recommended)
Classification of Chinese Standard: A14
Classification of International Standard: 35.240.30
Word Count Estimation: 30,350
Date of Issue: 2017-07-12
Date of Implementation: 2018-02-01
Issuing agency(ies): General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China; Standardization Administration of the People's Republic of China

This is a DRAFT version for illustration, not a final translation.

National Standard of the People's Republic of China
ICS 35.240.30
A14
GB/T 33994-2017
Information and documentation -- WARC file format
(ISO 28500:2009, IDT)
Issued 2017-07-12; implemented 2018-02-01
Issued by the General Administration of Quality Supervision, Inspection and Quarantine of the People's Republic of China and the Standardization Administration of the People's Republic of China

Foreword

This standard is drafted in accordance with the rules given in GB/T 1.1-2009.

This standard is identical (IDT) to ISO 28500:2009 "Information and documentation -- WARC file format" and adopts it by translation.

The Chinese document that corresponds to an international document normatively referenced in this standard is as follows:
--- GB/T 7408-2005 Data elements and interchange formats -- Information interchange -- Representation of dates and times (ISO 8601:2000, IDT).

This standard makes the following editorial changes:
--- Added the abbreviations LWS, MIME, and US-ASCII (see 3.2);
--- To improve readability, replaced some examples with domestic examples while retaining the examples from the international standard (see Appendix B).

This standard was proposed by the National Committee for Standardization of Information and Documentation (SAC/TC4).

Drafting organizations: National Library, Chinese Academy of Sciences Document Information Center, China National Defense Science and Technology Information Center, China Science and Technology Information Research Institute, Beijing Wanfang Data Co., Ltd.

Main drafters: Mao Yajun, Li Chunming, Wu Zhenxin, Zhen Qin, Qu Yunpeng, Zhang Xiaodan, Zhang Lan, Yang He, Dun Wenjie, Zhang Biao.

Introduction

Every day, websites and web pages appear on and disappear from the internet. For more than a decade, memory storage organizations have tried to use web-scale tools such as web crawlers to find the most appropriate way to collect and keep track of this vast quantity of important information. At the same time, these organizations face a growing need to preserve digital resources that were not captured from the web (e.g., entire runs of electronic journals, or data generated by environmental sensors). There is therefore a demand for a file format that can simply and safely carry a large number of constituent data objects in a single file for storage, management, and exchange.

The WARC (Web ARChive) file format provides a convention for concatenating multiple resource records (data objects) into one long file, where each resource record consists of a set of simple text headers and an arbitrary block of data content. The WARC format is an extension of the ARC file format.
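
The record layout described above (text headers followed by an arbitrary content block, with records concatenated into one file) can be illustrated with a minimal Python sketch. The details it relies on are not stated in this excerpt and are taken as assumptions from ISO 28500: a `WARC/1.0` version line, `Name: value` header fields terminated by CRLF, a blank line before the block, two CRLFs after it, and the four mandatory fields shown. The helper names `build_record` and `parse_records` and the URL `http://example.com/` are hypothetical, chosen only for illustration.

```python
import uuid
from datetime import datetime, timezone

CRLF = b"\r\n"


def build_record(record_type, block, extra_headers=None):
    """Serialize one record: version line, header fields, blank line, block."""
    headers = {
        "WARC-Type": record_type,
        "WARC-Record-ID": f"<urn:uuid:{uuid.uuid4()}>",
        "WARC-Date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Content-Length": str(len(block)),  # length of the block in bytes
    }
    headers.update(extra_headers or {})
    head = b"WARC/1.0" + CRLF
    for name, value in headers.items():
        head += f"{name}: {value}".encode("utf-8") + CRLF
    # A blank line ends the headers; two CRLFs follow the content block.
    return head + CRLF + block + CRLF + CRLF


def parse_records(data):
    """Yield (version, fields, block) for each record in a concatenated file.

    Simplified: header names are matched case-sensitively and folded
    (continuation) header lines are not handled.
    """
    pos = 0
    while pos < len(data):
        head_end = data.index(CRLF + CRLF, pos)
        lines = data[pos:head_end].decode("utf-8").split("\r\n")
        version, fields = lines[0], {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            fields[name.strip()] = value.strip()
        start = head_end + 4
        length = int(fields["Content-Length"])
        # The block is delimited by Content-Length, so it may hold any bytes.
        block = data[start:start + length]
        yield version, fields, block
        pos = start + length + 4  # skip the two CRLFs after the block


# Two records concatenated into one "file", then read back.
archive = (
    build_record("resource", b"hello", {"WARC-Target-URI": "http://example.com/"})
    + build_record("metadata", b"lang: en\r\n\r\nx")
)
records = list(parse_records(archive))
```

Because each block is delimited by its declared `Content-Length` rather than by a terminator, the second record's block can itself contain a blank line without confusing the reader.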

The WARC format is intended as a standard way to organize, manage, and store collections of hundreds of millions of digital resources gathered from the web and elsewhere. It can be used to build applications for harvesting (such as the open-source Heritrix web crawler), managing, accessing, and exchanging content.

In addition to the primary content recorded by ARC, the extended WARC format also accommodates related secondary content, such as assigned metadata, abbreviated duplicate-detection events, later transformations, and segmentation of large resources.

Information and documentation -- WARC file format

1 Scope

This standard specifies the WARC file format, which is designed to:
--- store payload content and control information from mainstream Internet application-layer protocols such as HTTP, DNS, and FTP;
--- store arbitrary metadata associated with other stored data (such as subject classification, language, and encoding);
--- support data compression and maintain the integrity of data records;
--- store all control information from the harvesting protocol (such as request headers), not just the response information;
--- store the results of data transformations associated with other stored data;
--- store duplicate-detection events associated with other stored data (to reduce storage consumption when identical or substantially similar resources are present);
--- be extensible without disrupting existing functionality;
--- support truncation or segmentation of long records where required.

2 Normative references

The following documents are indispensable for the application of this document. For dated references, only the dated edition applies to this document. For undated references, the latest edition (including all amendments) applies to this document.

ISO 8601 Data elements and interchange formats -- Information interchange -- Representation of dates and times
RFC 1035 Domain names -- Implementation and specification
RFC 1884 IP Version 6 Addressing Architecture
RFC 2045 Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
RFC 2540 Detached Domain Name System (DNS) Information
RFC 2616 Hypertext Transfer Protocol -- HTTP/1.1
RFC 2822 Internet Message Format
RFC 3629 UTF-8, a transformation format of ISO 10646
RFC 3986 Uniform Resource Identifier (URI): Generic Syntax
RFC 4027 Domain Name System Media Types
W3CDTF Date and Time Formats (note submitted to the W3C)

3 Terms, definitions and abbreviations

3.1 Terms and definitions

The following terms and definitions apply to this document.

3.1.1
WARC record
Basic constituent of a WARC file; a WARC file consists of a sequence of WARC records.

......