
[HTTP] HTTP Protocol Response Headers: Content-Length, Transfer-Encoding and Content-Encoding

2024-07-30 11:42:13

0 Introduction

  • In a recent project, the network transfer of a Web API response (payload: 7 MB-40 MB, 50,000-200,000 data items) took a large amount of time, from as short as 5 s to as long as 25 s, and front-end rendering took 9 s-60 s.
  • Leaving the front-end problems aside for now, I believe the back end still has room for optimization:
    • 1) Enable an HTTP content compression policy [most important].
    • 2) Adjust the data structure (borrowing JDBC's layout: split the payload into fields: List<String> + values: List<List<Object>>, two separate datasets, with each row's values guaranteed to follow the field order, which removes the field names repeated on every object and reduces network bandwidth).

  • This article focuses on analyzing and implementing an HTTP content compression policy, centered on three HTTP response headers:
    • Content-Length : the length of the content (after compression, if compression is enabled).
    • Transfer-Encoding : how the message body is encoded for transfer.
    • Content-Encoding : how the entity content itself is encoded (typically, compressed).

1 Overview

1.1 Transfer-Encoding

  • Transfer-Encoding is an HTTP header field that literally translates to "transfer encoding".
  • Transfer-Encoding is used to change the format of the message. Not only does it not reduce the size of the transferred content, it can even make the transfer larger, so what is it for? That is what this article is about. For now, remember that Content-Encoding and Transfer-Encoding are complementary: a single HTTP message may very well undergo both at the same time.

1.2 Content-Encoding

  • In fact, the HTTP protocol has another encoding-related header: Content-Encoding.
  • Content-Encoding is typically used to apply compression encoding to the entity content, with the goal of optimizing transmission. For example, compressing text files with gzip can reduce their size dramatically.
  • Content encoding is usually applied selectively. For example, it is normally not enabled for jpg / png files, because those image formats are already highly compressed; compressing them again gains little and wastes CPU.
Content-Encoding / Transfer-Encoding values:
  • gzip (recommended) : the entity is GNU zip encoded.
  • compress : the entity uses the Unix file compression program.
  • deflate : the entity is compressed in zlib format.
  • br (recommended) : the response data is encoded with Brotli compression. Example: Transfer-Encoding: br / Content-Encoding: br
  • identity : the entity is not encoded. This is the default when there is no Content-Encoding header.

HTTP defines a number of standard content encoding types and allows more to be added in the form of extended encodings.
The various codes are standardized by the Internet Assigned Numbers Authority (IANA), which assigns unique designators to each content coding algorithm.
The Content-Encoding header uses these standardized designations to describe the algorithms used in encoding.

gzip, compress, and deflate encoding are lossless compression algorithms used to reduce the size of transmitted messages without loss of information.
Of these algorithms, gzip is usually the most efficient and widely used.

1.3 Accept-Encoding : Client Declares Acceptable Encoding

  • The Accept-Encoding field contains a comma-separated list of supported encodings. Here are some examples:
Accept-Encoding: compress, gzip
Accept-Encoding:
Accept-Encoding: *
Accept-Encoding: compress;q=0.5, gzip;q=1.0
Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
  • The client can attach a q (quality) value to each encoding to indicate its priority. q values range from 0.0 to 1.0, where 0.0 means the client does not want the indicated encoding at all, and 1.0 marks the most preferred encoding.
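To make the q-value rules concrete, here is a hypothetical helper (not part of any standard library, just a sketch of the negotiation logic) that parses an Accept-Encoding header and picks the best encoding the server supports:

```javascript
// Hypothetical helper: pick the best content coding from an Accept-Encoding
// header, honoring q-values. A missing q means 1.0; q=0 means "unacceptable".
function negotiateEncoding(acceptEncoding, supported) {
    const prefs = acceptEncoding.split(',')
        .map(function(part) {
            const [name, ...params] = part.trim().split(';');
            let q = 1.0; // default preference when no q-value is given
            for (const p of params) {
                const [k, v] = p.trim().split('=');
                if (k === 'q') q = parseFloat(v);
            }
            return { name: name.trim(), q: q };
        })
        .filter(function(p) { return p.q > 0; })             // q=0: not acceptable
        .sort(function(a, b) { return b.q - a.q; });         // highest q first

    for (const p of prefs) {
        if (supported.includes(p.name)) return p.name;
        if (p.name === '*') return supported[0];             // wildcard: server picks
    }
    return 'identity'; // nothing matched: send uncompressed
}

console.log(negotiateEncoding('compress;q=0.5, gzip;q=1.0', ['gzip', 'br'])); // gzip
```

With the last example header above ('gzip;q=1.0, identity; q=0.5, *;q=0') and a server that only supports br, this sketch falls back to identity, since *;q=0 forbids everything not explicitly listed.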

1.4 Persistent Connection

  • Setting Transfer-Encoding aside for a moment, let's look at another important concept in the HTTP protocol: Persistent Connection (colloquially, a long-lived connection).
  • We know HTTP runs on top of a TCP connection, and TCP has handshake and slow-start characteristics, so to maximize HTTP performance, using persistent connections is especially important. For this reason, the HTTP protocol introduced the corresponding mechanisms.
    • HTTP/1.0's persistent connection mechanism was introduced late, via the Connection: keep-alive header; both the server and the client can use it to tell the other side not to close the TCP connection after sending data, so that it can be reused later.
    • HTTP/1.1 specifies that all connections must be persistent unless Connection: close is explicitly added to the header.
      • So in effect the Connection header field in HTTP/1.1 no longer has a keep-alive value, but for historical reasons many web servers and browsers keep the habit of sending Connection: keep-alive on HTTP/1.1 long-lived connections.

1.5 Content-Length: tells the browser the length of the (encoded) response entity

When there is no Content-Length: the request hangs as pending

  • The browser can reuse an already-open idle persistent connection, skipping the slow handshake and avoiding TCP's slow-start congestion adaptation phase, which sounds wonderful. To dig into the behavior of persistent connections, I decided to write the simplest web server I could for testing. Node provides an http module for quickly creating an HTTP web server, but I needed more control, so I created a TCP server with the net module:
require('net')
	.createServer(function(sock) { 
		sock.on('data', function(data) { 
			sock.write('HTTP/1.1 200 OK\r\n'); 
			sock.write('\r\n'); 
			sock.write('hello world!'); 
			sock.end(); 
		}); 
	}).listen(9090, '127.0.0.1');

After starting the service, accessing 127.0.0.1:9090 in a browser correctly outputs the specified content, and everything works fine. Now remove the sock.end() line to make it a persistent connection, restart the service, and try again. This time the result is strange: no output appears, and checking the request in the Network panel shows its status stuck at pending.

This is because, for a non-persistent connection, the browser can use whether the connection has been closed to determine the boundary of the request or response entity; for a persistent connection this obviously does not work. In the example above, even though I have sent all the data, the browser does not know that. It has no way of telling whether more data will arrive on the open connection, so it just keeps waiting.

Introducing Content-Length: honestly telling the browser the length of the response entity

The easiest fix for the problem above is to calculate the length of the content entity and tell the other side via the header. That is exactly what Content-Length is for. Reworking the example:

require('net')
	.createServer(function(sock) { 
		sock.on('data', function(data) { 
			sock.write('HTTP/1.1 200 OK\r\n'); 
			sock.write('Content-Length: 12\r\n');
			sock.write('\r\n'); 
			sock.write('hello world!'); 
		}); 
	}).listen(9090, '127.0.0.1');

As you can see, this time the data is sent without closing the TCP connection, yet the browser can output the content and end the request normally, because it can tell from the Content-Length header where the response entity ends. What happens if Content-Length does not match the actual entity length? Try it yourself if you are curious. Usually, a Content-Length shorter than the actual length causes the content to be truncated, while one longer than the entity content causes pending.

Since the Content-Length field must truthfully reflect the entity length, a problem arises in practice: sometimes the length is not that easy to obtain, for example when the entity comes from a network file or is generated by a dynamic program. In that case, to get an accurate length, the only option is to open a large enough buffer and wait until all the content has been generated before calculating it. That means more memory overhead on the server, and it can also make the client wait longer.
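Note that even when the content is fully available, Content-Length must count the bytes of the encoded entity, not the characters; a sketch of the distinction in Node:

```javascript
// Content-Length is the byte length of the encoded entity, not the character
// count. Multi-byte UTF-8 characters break the naive string-length approach.
const body = 'hello 世界'; // 6 ASCII characters plus 2 three-byte CJK characters

console.log(body.length);                     // 8 characters
console.log(Buffer.byteLength(body, 'utf8')); // 12 bytes: the correct Content-Length
```

This is why the earlier example hard-codes Content-Length: 12 for the 12-byte ASCII string 'hello world!'; for ASCII the two counts happen to coincide.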

1.6 Transfer-Encoding: chunked: entity boundaries are known without relying on length information in the header

TTFB (Time To First Byte)

  • In web performance optimization there is an important metric called TTFB (Time To First Byte): the time from when the client sends the request to when it receives the first byte of the response.
  • This metric is visible in the Network panel built into most browsers (e.g. Chrome - Network - a request - Timing - Waiting for server response). The shorter the TTFB, the sooner the user sees page content, and the better the experience. Predictably, having the server buffer all the content just to compute the response entity length runs counter to a short TTFB.
  • However, in an HTTP message the entity must come after the header; the order cannot be reversed. So we need a new mechanism: knowing the boundaries of the entity without relying on the length information in the header.

Transfer-Encoding: knowing entity boundaries without relying on length information in the header

  • The protagonist of this article finally reappears: Transfer-Encoding is exactly what solves the problem above. Historically, Transfer-Encoding could take several values, and a TE header was introduced to negotiate which transfer encoding to use. However, the latest HTTP specification defines only one transfer encoding: chunked.

  • Chunked encoding is fairly simple. After adding Transfer-Encoding: chunked to the header, the message is encoded in chunks, and the response entity in the message is replaced by a series of chunks. Each chunk contains a hexadecimal length value and the data; the length value occupies its own line, and the length does not include the trailing CRLF (\r\n) of that line, nor the final CRLF of the chunk data. The last chunk must have a length value of 0, with no corresponding chunk data, which marks the end of the entity.

  • Reworking the previous code to this format:

require('net').createServer(function(sock) {
    sock.on('data', function(data) {
        sock.write('HTTP/1.1 200 OK\r\n');
        sock.write('Transfer-Encoding: chunked\r\n');
        sock.write('\r\n');

        sock.write('b\r\n');
        sock.write('01234567890\r\n');

        sock.write('5\r\n');
        sock.write('12345\r\n');

        sock.write('0\r\n');
        sock.write('\r\n');
    });
}).listen(9090, '127.0.0.1');

In the example above, I indicated in the response header that the following entity would be chunk-encoded, then output an 11-byte chunk, then a 5-byte chunk, and finally a zero-length chunk to indicate that the data transfer was complete. Accessing this service with a browser gives the correct result. As you can see, this simple chunking strategy solves the problem posed earlier quite well.
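The chunk framing the server writes by hand above can be captured in a tiny helper (hypothetical, not from the original article):

```javascript
// Frame a string as one HTTP/1.1 chunk: hex byte length + CRLF + data + CRLF.
// Byte length, not character count, so multi-byte UTF-8 data is framed correctly.
function chunk(data) {
    return Buffer.byteLength(data).toString(16) + '\r\n' + data + '\r\n';
}
const lastChunk = '0\r\n\r\n'; // a zero-length chunk terminates the entity

console.log(JSON.stringify(chunk('01234567890'))); // "b\r\n01234567890\r\n"
console.log(JSON.stringify(chunk('12345')));       // "5\r\n12345\r\n"
```

Writing chunk('01234567890') + chunk('12345') + lastChunk produces exactly the byte sequence the hand-written server above sends after its headers.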

The aforementioned Content-Encoding and Transfer-Encoding are often used in combination: the content is compressed (Content-Encoding) and the compressed output is then transmitted in chunks (Transfer-Encoding).

Here is the response to a telnet request against a test page, with the chunked content gzip-encoded:

> telnet 106.187.88.156 80

GET / HTTP/1.1
Host: qgy18.
Accept-Encoding: gzip

HTTP/1.1 200 OK
Server: nginx
Date: Sun, 03 May 2015 17:25:23 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip

1f
�H���W(�/�I�J

0

You can also see similar results with the HTTP packet grabber Fiddler, so if you're interested you can try it yourself.

Transfer-Encoding: chunked together with Content-Length

  • Transfer-Encoding: chunked and Content-Length are both header fields describing the message body, and they never appear in the same header.
  • When chunked encoding is used, the header contains Transfer-Encoding: chunked but no longer includes the Content-Length field; even if Content-Length is set explicitly, it is ignored.

In HTTP, we usually rely on HttpCode/HttpStatus to determine whether an HTTP request was successful or not, for example:

HTTP: Status 200 - Success, the server returned the page successfully
HTTP: Status 304 - Success, page not modified
HTTP: Status 404 - Failed, requested page does not exist
HTTP: Status 503 - Failed, service unavailable

… …

Extension: how developers' programs judge whether an HTTP request succeeded (optional reading)

But developers can be surprisingly imaginative at times. Some of our developers decided to use Content-Length to judge whether an HTTP request succeeded: when Content-Length is less than or equal to 0, or equal to 162, the request is considered to have failed.

Treating Content-Length less than or equal to 0 as a failed HTTP request is understandable, though wrong: those developers incorrectly assumed that the Content-Length field must always be present in an HTTP response header.

Why is the request also considered failed when Content-Length is 162? Because the length of the 404 page on the company's server happens to be 162. Surprise, surprise!
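A more robust success check relies on the status code rather than Content-Length (a minimal sketch; the 2xx convention comes from the HTTP specification, and 304 from the list above would need its own branch if it should count as success):

```javascript
// Judge success by HTTP status class, never by Content-Length:
// chunked responses legitimately carry no Content-Length at all.
function isSuccess(statusCode) {
    return statusCode >= 200 && statusCode < 300; // 2xx = success
}

console.log(isSuccess(200)); // true
console.log(isSuccess(404)); // false: the 162-byte 404 page is caught correctly
console.log(isSuccess(503)); // false
```

This correctly classifies the company's 404 page by its status code instead of its accidental 162-byte length.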

2 Web Server Configuration

  • Enabling compression and related settings

Nginx

gzip on;
gzip_min_length 1k;

gzip_buffers 4 16k;

#gzip_http_version 1.0;

gzip_comp_level 2;

gzip_types text/xml text/plain text/css text/js application/javascript application/json;

gzip_vary on;

gzip_disable "MSIE [1-6]\.";
  • When the above is configured in the Nginx configuration file (e.g. in a location block or similar scope), the Nginx server enables compression (gzip) for the specified file types to optimize transmission and reduce the transferred volume.

  • Chunked transfer can split a compressed entity into multiple parts; in that case the resource is compressed as a whole and the compressed output is transmitted in chunks. With compression enabled, chunking allows the server to send data while it is still compressing, instead of finishing compression first (so that the compressed size is known) before transmitting, which lets the user receive data sooner and yields a better TTFB.

  • When gzip transmission is enabled, the response header gains Content-Encoding: gzip to mark how the content is encoded. Also, the Nginx server chunks compressed content by default, without chunked_transfer_encoding having to be enabled explicitly.

  • How do you turn off chunked transfer in Nginx? Add the line "chunked_transfer_encoding off;" to the location block of the Nginx configuration file:

location / {
    chunked_transfer_encoding       off;
}

SpringBoot (embedded Tomcat)

  • Spring Boot does not enable gzip compression by default; we need to turn it on manually by adding a few lines to the configuration file:
server: 
  compression: 
    enabled: true 
    mime-types: application/json,application/xml,text/html,text/plain,text/css,application/x-javascript
  • Note: for mime-types, Spring Boot 2.0+ uses the following defaults, so we usually do not need to add this configuration explicitly:
// #mimeTypes
/**
 * Comma-separated list of MIME types that should be compressed.
 */
private String[] mimeTypes = new String[] { "text/html", "text/xml", "text/plain",
		"text/css", "text/javascript", "application/javascript", "application/json",
		"application/xml" };
  • Although the configuration above enables gzip compression, be warned that not all responses are gzip-compressed: by default, only content larger than 2048 bytes is compressed.
  • If we need to change this threshold, we can modify the configuration:
server:
  compression:
    min-response-size: 1024

X References

  • Transfer-Encoding in the HTTP protocol [Recommended]
  • Transfer-Encoding in the HTTP Protocol - cnblogs
  • Introducing the browser console's Network panel - CSDN [Recommended]
    • Chrome - developer tools - Network - (a request) - Size (response header size + response data)
  • Introduction to HTTP Transfer-Encoding - CSDN
  • Spring Boot tutorial, web series: enabling gzip data compression - cnblogs [Recommended]
  • Content-Encoding: content encoding - CSDN
  • What is Content-Encoding: br? - 51CTO [Recommended]