0 Introduction
- In a recent project, the network transfer of a Web API response (payload size: 7 MB-40 MB, item count: 50,000-200,000) took a long time, anywhere from 5 s to 25 s, and front-end rendering took another 9 s-60 s.
- Leaving the front-end issues aside for now, I think there is still plenty of room for optimization on the back end:
- 1) Enable HTTP content compression policy [most important].
- 2) Restructure the payload (similar to a JDBC result set): split it into fields: list<string> plus values: list<list<object>>, where each row's values follow the order of the field list. This avoids repeating the field names for every object and saves network bandwidth (see the sketch at the end of this section).
- Here, the main focus is on analyzing and implementing the HTTP content compression policy. The relevant HTTP response headers are:
- Content-Length: the length of the content (after compression, if compression is enabled).
- Transfer-Encoding: how the message is framed for transfer (e.g. chunked).
- Content-Encoding: how the content is encoded (e.g. gzip).
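As a side note, here is a hypothetical sketch of the restructured payload from point 2) above (the field names are made up purely for illustration):

// Before: every item repeats its field names
[
  { "id": 1, "name": "apple",  "price": 9.9 },
  { "id": 2, "name": "banana", "price": 4.5 }
]

// After: the field names are sent once, each row holds only the ordered values
{
  "fields": ["id", "name", "price"],
  "values": [
    [1, "apple",  9.9],
    [2, "banana", 4.5]
  ]
}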
1 Overview
1.1 Transfer-Encoding
- Transfer-Encoding is an HTTP header field; it literally means "transfer encoding".
- Transfer-Encoding changes the way the message is framed for transfer: it does not reduce the size of the payload, and may even make the transfer slightly larger. So what is it for? That is what this article explains. For now, keep in mind that Content-Encoding and Transfer-Encoding complement each other, and a single HTTP message may well use both at the same time.
1.2 Content-Encoding
- The HTTP protocol has another encoding-related header: Content-Encoding.
- Content-Encoding is typically used to apply a compression encoding to the entity body in order to optimize transmission. For example, compressing text files with gzip can reduce their size dramatically.
- Content encoding is usually applied selectively. For example, it is normally not enabled for jpg / png files, because those formats are already highly compressed; compressing them again achieves little and wastes CPU.
Content-Encoding value | Description
---|---
gzip (recommended) | The entity is encoded with GNU zip
compress | The entity uses the Unix file compression program
deflate | The entity is compressed in zlib format
br (recommended) | The response data is encoded with Brotli compression, e.g. Content-Encoding: br
identity | The entity is not encoded; this is the default when no Content-Encoding header is present
HTTP defines a number of standard content-encoding types and allows more to be added as extension encodings. The encodings are standardized by the Internet Assigned Numbers Authority (IANA), which assigns a unique token to each content-coding algorithm, and the Content-Encoding header uses these standardized tokens to describe the algorithm applied. gzip, compress, and deflate are lossless compression algorithms that reduce the size of the transmitted message without losing information; of these, gzip is usually the most effective and the most widely used.
1.3 Accept-Encoding: the client declares which encodings it can accept
- The Accept-Encoding field contains a comma-separated list of the encodings the client supports. Here are some examples:
Accept-Encoding: compress, gzip
Accept-Encoding:
Accept-Encoding: *
Accept-Encoding: compress;q=0.5, gzip;q=1.0
Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0
- The client can attach a q (quality) value to each encoding to indicate its priority. q values range from 0.0 to 1.0: 0.0 means the client does not want that encoding at all, and 1.0 marks the most preferred encoding.
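To make the negotiation concrete, here is a minimal server-side sketch (my own illustration, not part of the original article; the simplified parsing ignores some RFC 7231 corner cases) that picks the client's most preferred encoding among those the server can produce:

// Pick the best encoding from an Accept-Encoding header value.
// supported: encodings the server can produce, in order of server preference.
function pickEncoding(acceptEncoding, supported) {
    const prefs = (acceptEncoding || '')
        .split(',')
        .map(function(part) {
            const pieces = part.trim().split(';');
            const name = pieces[0].trim();
            let q = 1.0; // default quality when none is given
            pieces.slice(1).forEach(function(p) {
                const m = p.trim().match(/^q=([0-9.]+)$/);
                if (m) q = parseFloat(m[1]);
            });
            return { name: name, q: q };
        })
        .filter(function(e) { return e.name && e.q > 0; })
        .sort(function(a, b) { return b.q - a.q; });

    for (const e of prefs) {
        if (e.name === '*') return supported[0];
        if (supported.indexOf(e.name) !== -1) return e.name;
    }
    return 'identity'; // nothing acceptable: send the entity unencoded
}

console.log(pickEncoding('gzip;q=1.0, identity; q=0.5, *;q=0', ['gzip', 'br']));
// -> 'gzip'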
1.4 Persistent Connection
- Leaving Transfer-Encoding aside for a moment, let's look at another important concept in the HTTP protocol: the persistent connection (colloquially, the long connection).
- We know that HTTP runs on top of TCP connections, and TCP has its handshake and slow-start characteristics. To get the best performance out of HTTP, using persistent connections is especially important, and the HTTP protocol introduced mechanisms for exactly this purpose.
- The persistent connection mechanism of HTTP/1.0 was retrofitted later, through the Connection: keep-alive header. Both the server and the client can use it to tell the other side not to close the TCP connection after the data has been sent, so that the connection can be reused later.
- HTTP/1.1 specifies that all connections are persistent unless Connection: close is explicitly added to the headers. So, strictly speaking, the Connection header no longer needs the keep-alive value in HTTP/1.1, but for historical reasons many Web servers and browsers keep the habit of sending Connection: keep-alive on HTTP/1.1 long connections.
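As a quick illustration (hypothetical host and resource, not from the original article), a keep-alive exchange looks roughly like this:

GET /index.html HTTP/1.1
Host: www.example.com
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1024
Connection: keep-alive

(1024-byte entity body ... the TCP connection then stays open for the next request)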
1.5 Content-Length: tells the browser the length of the (encoded) response entity
When there is no Content-Length: the request stays pending
- By reusing an idle persistent connection that is already open, the browser can skip the slow TCP handshake and also avoid the TCP slow-start congestion adaptation phase, which sounds wonderful. To dig into the behavior of persistent connections, I decided to write the simplest Web server I could for testing. Node provides the http module for quickly creating an HTTP server, but I needed more control, so I used the net module to create a raw TCP server:
require('net')
    .createServer(function(sock) {
        sock.on('data', function(data) {
            sock.write('HTTP/1.1 200 OK\r\n');
            sock.write('\r\n');
            sock.write('hello world!');
            sock.end();
        });
    }).listen(9090, '127.0.0.1');
After starting the service, accessing 127.0.0.1:9090 in a browser correctly outputs the specified content, and everything works fine. Now remove the sock.end() line to make it a persistent connection, restart the service, and try again. This time the result is a bit strange: nothing is rendered, and when I check the request status in the Network panel, it stays pending forever.
This is because, for a non-persistent connection, the browser can use whether the connection has been closed to determine the boundary of the request or response entity; for a persistent connection this obviously no longer works. In the example above, even though I have sent all the data, the browser does not know that: it has no way of telling whether more data will arrive on this open connection, so it just keeps waiting.
Introducing Content-Length: tell the browser the true length of the response entity
The most obvious fix for the problem above is to calculate the length of the content entity and tell the other side through a header. That is exactly what Content-Length is for. Let's revise the example above:
require('net')
    .createServer(function(sock) {
        sock.on('data', function(data) {
            sock.write('HTTP/1.1 200 OK\r\n');
            sock.write('Content-Length: 12\r\n');
            sock.write('\r\n');
            sock.write('hello world!');
        });
    }).listen(9090, '127.0.0.1');
As you can see, this time the data is sent without closing the TCP connection, yet the browser can output the content and finish the request normally, because it can use the length information in Content-Length to determine where the response entity ends. What happens if Content-Length does not match the actual length of the entity? If you are interested, try it yourself. Usually, if Content-Length is shorter than the actual length, the content gets truncated; if it is longer than the entity, the request ends up pending again.
Since the Content-Length field must truthfully reflect the length of the entity, in practice the length is sometimes not easy to obtain, for example when the entity comes from a network file or is generated dynamically by a program. In that case, to get an accurate length you would have to allocate a buffer large enough to hold everything and wait until all the content has been generated before calculating it. On the one hand this requires more memory; on the other hand it also makes the client wait longer.
1.6 Transfer-Encoding: chunked: determine entity boundaries without relying on length information in the header
TTFB (Time To First Byte)
- In Web performance optimization there is an important metric called TTFB (Time To First Byte), the time from when the client sends the request until it receives the first byte of the response.
- This metric can be seen in the Network panel built into most browsers (e.g. Chrome - Network - a request - Timing - Waiting for server response). The shorter the TTFB, the sooner the user sees page content and the better the experience. Predictably, having the server buffer all the content just to calculate the length of the response entity runs counter to a short TTFB.
- However, in an HTTP message the entity must come after the headers, and the order cannot be reversed, so we need a new mechanism: determine the boundary of the entity without relying on length information in the headers.
Transfer-Encoding: determine entity boundaries without relying on length information in the header
- The protagonist of this article finally reappears: Transfer-Encoding is exactly what solves the problem above. Historically, Transfer-Encoding could take several values, and a TE request header was introduced to negotiate which transfer encoding to use. However, the latest HTTP specification defines only one transfer encoding: chunked.
- Chunked encoding is fairly simple. After adding Transfer-Encoding: chunked to the headers, the message body is transmitted as a series of chunks instead of a single entity. Each chunk consists of a length value in hexadecimal on its own line, followed by the chunk data; the length does not include the CRLF (\r\n) that terminates the length line, nor the CRLF that terminates the chunk data. The last chunk must have a length of 0 with no corresponding chunk data, which marks the end of the entity.
- Let's rework the previous code according to this format:
require('net').createServer(function(sock) {
    sock.on('data', function(data) {
        sock.write('HTTP/1.1 200 OK\r\n');
        sock.write('Transfer-Encoding: chunked\r\n');
        sock.write('\r\n');
        sock.write('b\r\n');           // chunk size: 0x0b = 11 bytes
        sock.write('01234567890\r\n'); // chunk data
        sock.write('5\r\n');           // chunk size: 5 bytes
        sock.write('12345\r\n');       // chunk data
        sock.write('0\r\n');           // last chunk: size 0 marks the end of the entity
        sock.write('\r\n');
    });
}).listen(9090, '127.0.0.1');
In the example above, I indicate in the response header that the entity that follows is chunk-encoded, then output an 11-byte chunk, then a 5-byte chunk, and finally a zero-length chunk to signal that all the data has been sent. Accessing this service with a browser gives the correct result. As you can see, this simple chunking strategy nicely solves the problem raised earlier.
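For reference, the raw response this server writes to the socket looks like the following (every line shown here ends with \r\n on the wire, including the final empty line):

HTTP/1.1 200 OK
Transfer-Encoding: chunked

b
01234567890
5
12345
0
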
The Content-Encoding and Transfer-Encoding mentioned earlier are often used in combination: in practice the entity is first compressed with Content-Encoding, and the compressed output is then transmitted in chunks via Transfer-Encoding.
Here is the response to a telnet request against a test page, where the gzip-encoded content is sent in chunks:
> telnet 106.187.88.156 80
GET / HTTP/1.1
Host: qgy18.
Accept-Encoding: gzip
HTTP/1.1 200 OK
Server: nginx
Date: Sun, 03 May 2015 17:25:23 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip
1f
�H���W(�/�I�J
0
You can see similar results with an HTTP capture tool such as Fiddler; try it yourself if you are interested.
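As a rough sketch of how the two mechanisms combine in code (my own illustration, not from the referenced article): Node's http module falls back to Transfer-Encoding: chunked whenever no Content-Length is set, so piping a gzip stream into the response produces exactly the gzip-then-chunk behaviour shown above. A real server should of course check the client's Accept-Encoding first.

const http = require('http');
const zlib = require('zlib');

http.createServer(function(req, res) {
    // Content-Encoding tells the client how the entity is compressed;
    // because no Content-Length is set, Node uses chunked transfer encoding.
    res.writeHead(200, {
        'Content-Type': 'text/plain',
        'Content-Encoding': 'gzip'
    });

    const gzip = zlib.createGzip();
    gzip.pipe(res);               // compressed bytes are flushed to the client as chunks
    gzip.write('hello world!');
    gzip.end();                   // ends the gzip stream, which in turn ends the response
}).listen(9090, '127.0.0.1');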
Transfer-Encoding: chunked together with Content-Length
- Transfer-Encoding: chunked and Content-Length are both headers that delimit the message body, and they do not appear in the headers at the same time.
- When chunked encoding is used, the headers contain Transfer-Encoding: chunked but no longer include a Content-Length field; even if that field is set deliberately, it will be ignored.
In HTTP, we usually rely on HttpCode/HttpStatus to determine whether an HTTP request was successful or not, for example:
HTTP: Status 200 - Success, the server returned the page successfully
HTTP: Status 304 - Success, page not modified
HTTP: Status 404 - Failed, requested page does not exist
HTTP: Status 503 - Failed, service unavailable
… …
Extension: determining success when a developer's program initiates an HTTP request (optional reading)
But developers can be surprisingly imaginative at times. Some of our developers decided to use Content-Length to determine whether an HTTP request succeeded: when Content-Length was less than or equal to 0, or equal to 162, the request was considered to have failed.
Treating a Content-Length less than or equal to 0 as a failure is somewhat understandable, although it rests on the incorrect assumption that the Content-Length field must always be present in the HTTP response headers.
Why treat a Content-Length of 162 as a failure as well? Because the 404 page on the company's server happens to be exactly 162 bytes long. Surprise!
2 Web Server Configuration
- Enabling compression and related configuration on common Web servers.
Nginx
gzip on;
gzip_min_length 1k;
gzip_buffers 4 16k;
#gzip_http_version 1.0;
gzip_comp_level 2;
gzip_types text/xml text/plain text/css text/js application/javascript application/json;
gzip_vary on;
gzip_disable "MSIE [1-6]\.";
- When the configuration above is placed in the appropriate block of the Nginx configuration file (http, server, or location), Nginx enables gzip compression for the specified MIME types, optimizing transfers and reducing the amount of data sent.
- Chunked encoding can split compressed content into multiple parts: the resource is compressed as a whole and the compressed output is transmitted in chunks. With compression, chunking makes it possible to send data while it is still being compressed, instead of having to finish compressing first just to learn the compressed size before transmitting anything; the user starts receiving data sooner and the TTFB improves.
- For responses transmitted with gzip enabled, the message headers gain Content-Encoding: gzip to mark how the content is encoded. In addition, Nginx applies chunked transfer to compressed content by default; there is no need to explicitly turn on chunked_transfer_encoding.
- How do you turn off chunked transfer in Nginx? Simply add the line "chunked_transfer_encoding off;" to the relevant location block of the configuration file:
location / {
chunked_transfer_encoding off;
}
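A quick way to verify the effect (assuming curl is installed; the address below is just an example) is to request the page with an Accept-Encoding header and inspect the response headers:

curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://127.0.0.1/
# Expected headers include something like:
#   Content-Encoding: gzip
#   Transfer-Encoding: chunked    (when chunked transfer is in use)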
Spring Boot (embedded Tomcat)
- Spring Boot does not enable gzip compression by default; we need to turn it on manually by adding a few lines to the configuration file:
server:
compression:
enabled: true
mime-types: application/json,application/xml,text/html,text/plain,text/css,application/x-javascript
- Note: in Spring Boot 2.0+, mime-types has the following default value, so we usually do not need to add that setting explicitly:
// #mimeTypes
/**
 * Comma-separated list of MIME types that should be compressed.
 */
private String[] mimeTypes = new String[] { "text/html", "text/xml", "text/plain",
        "text/css", "text/javascript", "application/javascript", "application/json",
        "application/xml" };
- Although the configuration above enables gzip compression, be warned that not every response is compressed: by default, only content larger than 2048 bytes is compressed.
- If we need to change this threshold, we can do so in the configuration:
server:
compression:
min-response-size: 1024
X References
- Transfer-Encoding in the HTTP Protocol [Recommended]
- Transfer-Encoding in the HTTP Protocol - Blog Park (cnblogs)
- Introduction to the browser console's Network panel - CSDN [Recommended]
  (Chrome - Developer Tools - Network - a request - Size = response header size + response data)
- Introduction to HTTP Transfer-Encoding - CSDN
- SpringBoot Tutorial Web Series: Enabling GZIP Data Compression - Blog Park (cnblogs) [Recommended]
- Content-Encoding: Content Encoding - CSDN
- What is Content-Encoding: br? - 51CTO [Recommended]