I. Introduction
The previous posts gave a rough introduction to what network programming is and what it is used for in practice. Today we focus on one of the key topics it involves and go through it in detail.
In all of web programming there is one application-layer protocol we cannot skip: HTTP, the Hypertext Transfer Protocol.
It is involved almost every time we browse the web, so let's learn it today.
II. HTTP
2.1 Definition of HTTP
HTTP (Hypertext Transfer Protocol) is an important application-layer protocol. It runs on top of TCP (HTTP 3.0, covered later, uses QUIC over UDP instead) and is designed mainly for communication between web browsers and web servers; it can transfer hypertext and multimedia content. When we open a web page in a browser, the page is loaded through HTTP/HTTPS requests.
2.2 HTTP Response Status Code
An HTTP status code is a numeric code that describes the result of an HTTP request; it lets us quickly locate where a request went wrong.
Status codes are three-digit numbers whose first digit, 1 to 5, indicates the class of the response. The 1XX codes, which we seldom see, are informational: the server has received the request and is still processing it:
100 (Continue): The client can continue with the request. Usually sent after the client has transmitted the initial part of the request, to indicate that the server has received it and the client should send the rest.
101 (Switching Protocols): The server is switching to the protocol requested by the client. This is used when a client asks for a protocol change (such as upgrading an HTTP/1.1 connection to WebSocket).
2XX (success status code)
200 (OK): The server has successfully processed the request. This is the default status in a servlet: the response is 200 unless the setStatus method is called.
201 (Created): The request was successfully processed and one or more new resources were created on the server side. For example, a new user is created via a POST request.
204 (No Content): The server successfully processed the request and returned no content.
205 (Reset Content): The server successfully processed the request and returned no content; the client should reset the document view, for example by clearing the form content.
206 (Partial Content): The server successfully processed a partial (range) GET request.
3XX (redirection status code)
300 (Multiple Choices): The requested resource has several possible responses. The server can select one based on the requester or provide a list for the requester to choose from.
301 (Moved Permanently): The requested page has been permanently moved to a new location. The response carries the new location, and the requester is automatically redirected to it.
302 (Found, temporary move): The server is currently answering the request from a different location, but the requester should continue to use the original location for future requests. The requester is automatically redirected to the new location.
304 (Not Modified): The requested page has not been modified since the last request, so no page content is returned.
305 (Use Proxy): The requester can only access the requested page through the specified proxy. This code is now deprecated.
4XX (client error status code)
400 (Bad Request): The server cannot understand the request because its syntax is malformed.
401 (Unauthorized): The page requires authentication.
403 (Forbidden): The server understood the request but refuses to process it. Generally used for requests the client is not permitted to make.
404 (Not Found): The requested resource was not found on the server. For example, you request information about a user and the server cannot find that user.
406 (Not Acceptable): The server cannot respond with content that matches the content characteristics specified in the request.
408 (Request Timeout): The server timed out while waiting for the request.
414 (URI Too Long): The requested URI is too long for the server to process.
5XX (server error status code)
500 (Internal Server Error): The server encountered an error and could not complete the request.
503 (Service Unavailable): The server is currently unavailable (because of overload or maintenance downtime). This is usually a temporary state.
504 (Gateway Timeout): The server, acting as a gateway or proxy, did not receive a response from the upstream server in time.
505 (HTTP Version Not Supported): The server does not support the HTTP protocol version used in the request.
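As a small illustration of how the first digit determines the class of a status code, here is a minimal Python sketch. The function name describe_status and the STATUS_CLASSES table are made up for this example; the reason phrases come from Python's standard http.HTTPStatus enum.

```python
from http import HTTPStatus

# Map the first digit of a status code to the classes described above.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code: int) -> str:
    """Return e.g. '404 Not Found (client error)' for a status code."""
    status_class = STATUS_CLASSES.get(code // 100)
    if status_class is None:
        raise ValueError(f"not an HTTP status code: {code}")
    try:
        phrase = HTTPStatus(code).phrase  # standard reason phrase, if known
    except ValueError:
        phrase = "Unknown"  # valid class, but not a code Python knows about
    return f"{code} {phrase} ({status_class})"


for code in (100, 200, 301, 404, 503):
    print(describe_status(code))
```

Because only the first digit is inspected, a client can still handle a code it has never seen by falling back to the behavior of its class.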
2.2 HTTP request message
An HTTP request message has three parts: the request line, made up of ①, ② and ③; the request headers, ④; and the message body, ⑤.
① is the request method. HTTP/1.1 defines 8 request methods: GET, POST, PUT, DELETE, HEAD, OPTIONS, TRACE and CONNECT (PATCH was added later by a separate specification); the two most common are GET and POST.
② is the URL path of the request, which, together with the Host field in the message headers, forms the complete request URL.
③ is the protocol name and version number.
④ is the HTTP message header section. It contains several fields in the format "Name: Value", from which the server obtains information about the client.
⑤ is the message body. For a form submission, it encodes the values of the form components as key-value pairs in a formatted string such as param1=value1&param2=value2, which carries the data for multiple request parameters. Request parameters can be passed not only in the message body but also in the request URL, encoded in the same way, for example "/chapter15/?param1=value1&param2=value2".
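To make the five parts concrete, the following Python sketch splits a hand-written request message into its request line, headers and body. The sample message, the host example.com and the function name parse_request are assumptions for illustration; the sketch also assumes a form-encoded body and a plain http:// URL, and it is not a full HTTP parser.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request(raw: bytes) -> dict:
    """Split a raw HTTP/1.1 request into request line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")  # a blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)  # parts 1, 2 and 3
    headers = {}
    for line in lines[1:]:  # part 4: "Name: value" pairs
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        # Host header + request target give the complete URL (http assumed).
        "url": f"http://{headers.get('host', '')}{target}",
        "query": parse_qs(urlsplit(target).query),  # parameters in the URL
        "form": parse_qs(body.decode("ascii")),  # part 5, form-encoded body
    }


# Hypothetical sample message, written by hand for this example.
RAW = (
    b"POST /chapter15/?page=2 HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 27\r\n"
    b"\r\n"
    b"param1=value1&param2=value2"
)

request = parse_request(RAW)
print(request["method"], request["url"], request["form"])
```

Note that parse_qs returns a list for every key, because the same parameter name may appear more than once in a query string or form body.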
2.3 HTTP VS HTTPS
Let's compare HTTP and HTTPS in the four areas below.
Port number: The default port for HTTP is 80, while the default port for HTTPS is 443.
URL prefix: The URL prefix is http:// for HTTP and https:// for HTTPS.
Security and resource consumption: HTTP runs directly on top of TCP; everything is transmitted in plaintext, and the client and the server cannot verify each other's identity. HTTPS is HTTP running on top of SSL/TLS, which in turn runs on top of TCP. The transmitted content is encrypted with symmetric encryption, and the symmetric key itself is protected with asymmetric encryption using the server's key pair. HTTPS is therefore much more secure than HTTP, but this series of operations also makes HTTPS consume more server resources.
SEO: Search engines often favor websites that use HTTPS because it provides greater security and user privacy protection. Websites that use HTTPS may be prioritized in search results, which can have an impact on SEO.
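The port and prefix rules can be sketched in a few lines of Python. The helper names effective_port and is_encrypted are hypothetical; HTTP_PORT and HTTPS_PORT are constants from the standard http.client module, and the URLs are placeholders.

```python
from http.client import HTTP_PORT, HTTPS_PORT
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": HTTP_PORT, "https": HTTPS_PORT}  # 80 and 443


def effective_port(url: str) -> int:
    """Return the port a client would connect to for an http(s) URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # An explicit port in the URL overrides the default for the scheme.
    return parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]


def is_encrypted(url: str) -> bool:
    """True when the URL prefix says the connection is wrapped in TLS."""
    return urlsplit(url).scheme == "https"


for url in ("http://example.com/", "https://example.com/login",
            "https://example.com:8443/admin"):
    print(url, effective_port(url), is_encrypted(url))
```

The last URL shows that 80 and 443 are only defaults: a server may listen for HTTPS on any port, as long as the URL says so.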
2.4 Different versions of HTTP
Since HTTP 1.0 was published in May 1996, HTTP has gone through decades of development and produced versions 1.0, 1.1, 2.0 and 3.0, moving forward with the times.
2.4.1 HTTP1.0 VS HTTP1.1
Connection method: HTTP 1.0 uses short connections by default and needs the keep-alive parameter (the Connection: keep-alive header) to establish a long-lived connection, while HTTP 1.1 supports long-lived connections by default.
Status codes: HTTP 1.1 added many new status codes to the original set, including error codes, such as 100 (Continue), 409 (Conflict) and 410 (Gone).
Caching mechanism: HTTP 1.0 mainly uses the If-Modified-Since and Expires headers as the basis for cache decisions, while HTTP 1.1 introduces more cache control options, such as the entity tag (ETag) and the optional If-Unmodified-Since, If-Match and If-None-Match headers.
Bandwidth: HTTP 1.0 cannot request only part of an object and does not support resuming an interrupted transfer. HTTP 1.1 introduces the Range request header, which allows a client to request only part of a resource; the response code is then 206 (Partial Content). This lets developers make fuller use of bandwidth and connections.
Host header handling: HTTP 1.1 introduces the Host header field, which allows multiple domain names to be hosted on the same IP address and thus supports virtual hosting. HTTP 1.0 has no Host header field and cannot support virtual hosting.
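The Range mechanism described above can be sketched as follows. This is a simplified model, not a real server: the function name serve_range is hypothetical, only the single-range form "bytes=start-end" (or the open-ended "bytes=start-") is handled, and any other form is answered with the full resource.

```python
import re

_RANGE = re.compile(r"bytes=(\d+)-(\d*)$")


def serve_range(data: bytes, range_header=None):
    """Return (status, headers, body) for an optional single byte range."""
    full = (200, {"Content-Length": str(len(data))}, data)
    match = _RANGE.match(range_header.strip()) if range_header else None
    if match is None:
        return full  # no Range header, or a form this sketch does not handle
    start = int(match.group(1))
    if match.group(2) and int(match.group(2)) < start:
        return full  # an inverted range is invalid, so the header is ignored
    if start >= len(data):
        # The range starts beyond the end of the resource.
        return 416, {"Content-Range": f"bytes */{len(data)}"}, b""
    # Open-ended or overlong ranges stop at the last byte of the resource.
    end = len(data) - 1
    if match.group(2):
        end = min(int(match.group(2)), end)
    body = data[start:end + 1]  # both ends of an HTTP byte range are inclusive
    headers = {
        "Content-Range": f"bytes {start}-{end}/{len(data)}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body


resource = b"0123456789" * 10  # a 100-byte stand-in for a file
print(serve_range(resource, "bytes=0-9"))
print(serve_range(resource, "bytes=95-"))
```

A download that stopped after 95 bytes can resume by sending "bytes=95-" and appending the 206 body to what it already has, which is how resumable transfer works.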
2.4.2 HTTP1.1 VS HTTP2.0
Multiplexing: HTTP 2.0 allows multiple requests and responses to be transmitted at the same time over a single connection without interfering with each other. HTTP 1.1 processes the requests on a connection serially, so handling requests in parallel requires multiple connections, and browsers limit the number of TCP connections per domain to about 6 to 8 to control resource consumption. This greatly limits the processing speed of HTTP 1.1.
Binary framing: HTTP 2.0 transmits data in binary frames, while HTTP 1.1 uses text-formatted messages. Binary frames are more compact and efficient to parse, reducing the amount of data transmitted and the bandwidth consumed.
Header compression: HTTP 1.1 supports compression of the body but not of the headers. HTTP 2.0 supports header compression using HPACK, an algorithm designed specifically for that purpose, which reduces network overhead.
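To show what a binary frame looks like and why frames make multiplexing possible, here is a minimal Python sketch of the fixed 9-byte frame header defined by the HTTP/2 specification (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). The function names are hypothetical, only two frame types are listed, and the HEADERS payload is a placeholder rather than real HPACK output.

```python
import struct

FRAME_TYPES = {0x0: "DATA", 0x1: "HEADERS"}  # only the two types used here
END_STREAM = 0x1  # flag: last frame the sender will send on this stream


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the fixed 9-byte HTTP/2 frame header."""
    length = len(payload)
    if length >= 1 << 24 or not 0 <= stream_id < 1 << 31:
        raise ValueError("payload too long or stream id out of range")
    # 24-bit length, 8-bit type, 8-bit flags, 1 reserved bit + 31-bit stream id
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id)
    return header + payload


def unpack_frames(stream: bytes):
    """Yield (type name, flags, stream id, payload) for each frame in a buffer."""
    offset = 0
    while offset < len(stream):
        high, low, frame_type, flags, stream_id = struct.unpack_from(">BHBBI", stream, offset)
        length = (high << 16) | low
        stream_id &= 0x7FFFFFFF  # drop the reserved bit
        payload = stream[offset + 9:offset + 9 + length]
        yield FRAME_TYPES.get(frame_type, hex(frame_type)), flags, stream_id, payload
        offset += 9 + length


# Frames of two requests (streams 1 and 3) interleaved on one connection.
wire = (
    pack_frame(0x1, 0, 1, b"<headers of request A>")
    + pack_frame(0x1, 0, 3, b"<headers of request B>")
    + pack_frame(0x0, END_STREAM, 3, b"body B")
    + pack_frame(0x0, END_STREAM, 1, b"body A")
)
for frame in unpack_frames(wire):
    print(frame)
```

Because every frame carries its own stream identifier, the receiver can reassemble each request or response independently, no matter how the frames are interleaved.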
2.4.3 HTTP2.0 VS HTTP3.0
Transport protocol: HTTP 2.0 is based on TCP, while HTTP 3.0 is built on QUIC (Quick UDP Internet Connections), a protocol that runs on top of UDP and provides reliable transmission, security comparable to TLS/SSL, and low connection and transmission latency. You can think of QUIC as an upgraded UDP with added features such as encryption and retransmission. HTTP 3.0 was previously known as HTTP-over-QUIC, and the name shows that its biggest change is the use of QUIC.
Connection establishment: HTTP 2.0 needs the classic TCP three-way handshake, and since a secure HTTPS connection also needs a TLS handshake, setup takes about 3 RTTs in total. QUIC integrates TLS 1.3, which supports 0-RTT handshakes in addition to 1-RTT handshakes, so establishing a connection takes only 0 or 1 RTT. In the best case (resuming with a server the client has talked to before), QUIC needs no additional round trip before sending data.
Header compression: HTTP 2.0 uses the HPACK algorithm for header compression, while HTTP 3.0 uses QPACK, a header compression algorithm adapted to QUIC.
Fault tolerance: HTTP 3.0 has a better error recovery mechanism, which allows faster recovery and retransmission when network problems such as packet loss and delay occur. HTTP 2.0, on the other hand, relies on TCP's error recovery mechanism.
Security: In HTTP 2.0, TLS encrypts and authenticates the entire HTTP session, including all HTTP headers and payload data. TLS works on top of TCP: it encrypts the application-layer data carried over the TCP connection but does not encrypt the TCP header or the TLS record-layer header, so an attacker can tamper with the TCP header in transit to interfere with communication. In HTTP 3.0, QUIC authenticates the entire packet and encrypts almost all of it (header fields as well as body), which closes this gap.
Connection migration: HTTP 3.0 supports connection migration because QUIC identifies a connection by a connection ID (64 bits in the original QUIC design). As long as the ID stays the same, the connection is not interrupted and survives changes in the network environment (e.g., switching from Wi-Fi to mobile data). A TCP connection, on the other hand, is identified by the four-tuple (source IP, source port, destination IP, destination port), and once any value in the four-tuple changes, the connection is no longer usable.
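The difference in how the two transports identify a connection can be shown with a toy model in Python. Everything here is an assumption made for illustration (the table names, the addresses, the 8-byte connection ID); a real QUIC stack also validates the new network path before it continues sending.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourTuple:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Toy server-side tables: TCP keys a connection by its four-tuple,
# QUIC keys it by a connection ID carried in the packets.
tcp_connections = {}
quic_connections = {}


def open_both(addr: FourTuple, connection_id: bytes, session: str) -> None:
    """Register the same session under both identification schemes."""
    tcp_connections[addr] = session
    quic_connections[connection_id] = session


def lookup_tcp(addr: FourTuple):
    return tcp_connections.get(addr)


def lookup_quic(addr: FourTuple, connection_id: bytes):
    # The address is ignored: only the connection ID selects the session.
    return quic_connections.get(connection_id)


wifi = FourTuple("192.168.1.20", 50000, "203.0.113.10", 443)
mobile = FourTuple("10.20.30.40", 41000, "203.0.113.10", 443)
cid = bytes.fromhex("0123456789abcdef")  # 8 bytes = 64 bits

open_both(wifi, cid, "session-A")
print(lookup_tcp(mobile))        # None: the four-tuple changed
print(lookup_quic(mobile, cid))  # session-A: the connection ID is unchanged
```

After the switch from Wi-Fi to mobile data the client's source address and port are different, so the four-tuple lookup fails, while the lookup by connection ID still finds the session.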
III. Summary
That is all for today's study of HTTP. For Java developers, this level of understanding of HTTP is usually enough, but for network engineers HTTP is vital knowledge that deserves deeper study; the book "Illustrated HTTP" is recommended.