Foreword
The previous article covered how to extend the custom protocol with new request types. This article introduces file transfer built on that custom protocol, which involves data chunking and zero copy.
Before designing this for the custom protocol, let us first look at how the HTTP protocol handles file transfer.
How the HTTP protocol does it
Here we mainly discuss HTTP/1.1, the most widely used version of the protocol.
About data chunking
HTTP/1.1 is a text-based protocol. Its Content-Length header field specifies the length of the response body. Because Content-Length is written as text rather than as a fixed-width binary field, there is in theory no upper limit on the length it can express. Therefore, in most cases, HTTP can transmit an entire file in a single response.
Larger files can usually still be downloaded with a single request, which is common practice for many websites and services. However, if the file is particularly large, or if download efficiency matters (for example, to support resumable or parallel downloads), the file has to be split into parts at the application layer. For example, the server can first return information about how the file is segmented, and the client then requests the different parts of the file one by one.
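One common way to request parts of a file in HTTP/1.1 is the Range request header. The sketch below only computes the header values for each part; it assumes the server supports byte ranges and that the client already knows the file size. RangePlanner and rangeHeaders are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class RangePlanner {
    /** Returns one "bytes=start-end" Range header value per part; end is inclusive. */
    static List<String> rangeHeaders(long fileSize, long partSize) {
        if (fileSize < 0 || partSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and partSize > 0");
        }
        List<String> ranges = new ArrayList<>();
        for (long start = 0; start < fileSize; start += partSize) {
            // The last part may be shorter than partSize
            long end = Math.min(start + partSize, fileSize) - 1;
            ranges.add("bytes=" + start + "-" + end);
        }
        return ranges;
    }
}
```

Each returned value can be sent as the Range header of a separate request, which is what makes resumable and parallel downloads possible.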
About zero copy
HTTP client libraries usually do not expose the underlying socket connection, so the application cannot operate on the socket directly to perform zero-copy transmission.
In most cases, the data must first be copied into the process's memory and then handed to the HTTP client.
Because of this limitation of HTTP client libraries, zero copy is not directly applicable in applications that transfer files over HTTP.
Custom protocol
About data chunking
In the custom protocol, we can control the transmission process more flexibly. For example, we use only 3 bytes to represent the length of the message body, so the largest body a single message can carry is just under 16 MB (2^24 - 1 = 16,777,215 bytes).
Content beyond this limit must be split into chunks so that each data block fits within the protocol's length limit.
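The limit and the resulting chunk count can be sketched as follows. This is a minimal illustration, assuming the 3-byte length is written big-endian (high byte first); LengthField and its method names are hypothetical.

```java
final class LengthField {
    /** Largest value a 3-byte unsigned length can hold: 2^24 - 1 = 16,777,215. */
    static final int MAX_BODY_LENGTH = (1 << 24) - 1;

    /** Encodes a body length as 3 bytes, high byte first. */
    static byte[] encode(int length) {
        if (length < 0 || length > MAX_BODY_LENGTH) {
            throw new IllegalArgumentException("length does not fit in 3 bytes: " + length);
        }
        return new byte[]{(byte) (length >>> 16), (byte) (length >>> 8), (byte) length};
    }

    /** Decodes 3 bytes (high byte first) back into a length. */
    static int decode(byte[] b) {
        return ((b[0] & 0xFF) << 16) | ((b[1] & 0xFF) << 8) | (b[2] & 0xFF);
    }

    /** Number of chunks needed for a file, rounding up for a trailing partial chunk. */
    static long chunkCount(long fileSize, long chunkSize) {
        return (fileSize + chunkSize - 1) / chunkSize;
    }
}
```

Note that the chunk size must be chosen somewhat below MAX_BODY_LENGTH, because the body also has to carry the metadata that travels with each chunk.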
About zero copy
With the custom protocol we work with the socket directly, so zero copy can be used: file data is sent from the file to the socket without first being copied into the process's memory, which improves transmission efficiency and reduces CPU load.
Preliminary design
How to build a packet?
The message body is a complete protobuf BaseResponse message:
- msgId: the request ID
- headers: file name + file size + total number of chunks + chunk index
- data: the file chunk data (bytes)
message BaseResponse {
    required int32 msgId = 1;
    repeated Header headers = 2;
    optional bytes data = 3;
}
The message body is sent in two parts:
1. First send the metadata (the msgId and headers of BaseResponse)
2. Then send the file data
Server side:
1. Determine the file range to send and obtain chunkSize
2. Build a BaseResponse (containing only msgId and headers)
3. Calculate the message length = size of the serialized BaseResponse + chunkSize
4. Send the message header
5. Send the BaseResponse
6. Send the file chunk with zero copy
Client:
1. Parse the message body as a complete BaseResponse.
A conflict? Protobuf and zero copy
During processing we run into a problem: protobuf parsing requires a specific encoding format, so raw file content appended after the metadata cannot be used directly as part of a protobuf message.
For protobuf to recognize the file content, the file data must take part in the encoding, and encoding requires loading the data into process memory. This defeats the purpose of zero copy.
How to deal with this problem?
Add another length field! The message body is divided into three parts:
- 2 bytes: the length of the proto message (the metadata is small, so 2 bytes are enough)
- N bytes: the proto message (msgId + headers)
- M bytes: the file chunk data
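The three-part body can be sketched as follows. This is a minimal illustration that treats the proto message as opaque bytes and assumes the 2-byte length is written high byte first; BodySplitter and its method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class BodySplitter {
    /** Builds everything that precedes the file chunk: 2-byte length + metadata bytes. */
    static byte[] bodyPrefix(byte[] meta) {
        if (meta.length > 0xFFFF) {
            throw new IllegalArgumentException("metadata does not fit a 2-byte length");
        }
        ByteBuffer buf = ByteBuffer.allocate(2 + meta.length); // big-endian by default
        buf.putShort((short) meta.length);
        buf.put(meta);
        return buf.array();
    }

    /** Splits a received body into { metadata bytes, file chunk bytes }. */
    static byte[][] split(byte[] body) {
        if (body.length < 2) {
            throw new IllegalArgumentException("body too short for the length field");
        }
        int metaLength = ((body[0] & 0xFF) << 8) | (body[1] & 0xFF);
        if (2 + metaLength > body.length) {
            throw new IllegalArgumentException("metadata length exceeds body length");
        }
        byte[] meta = Arrays.copyOfRange(body, 2, 2 + metaLength);
        byte[] chunk = Arrays.copyOfRange(body, 2 + metaLength, body.length);
        return new byte[][]{meta, chunk};
    }
}
```

On the receiving side, only the metadata bytes need to go through protobuf parsing; the chunk bytes are used as they arrive. The sender never has to load the chunk into memory, because only the prefix is built in the process and the chunk follows it on the wire.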
Processing logic
1) Server-side code
Java's zero-copy API is FileChannel.transferTo(long position, long count, WritableByteChannel target).
However, Netty's Channel does not implement WritableByteChannel. To use zero copy with Netty, you must use the FileRegion it provides, which internally also calls FileChannel's transferTo.
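For reference, the plain JDK call can be sketched outside Netty as follows. This is a minimal sketch assuming a blocking target channel; RegionSender and sendRegion are hypothetical names. Whether the transfer is truly zero copy depends on the operating system and on the target being a socket channel; for other targets the JDK falls back to an ordinary copy.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class RegionSender {
    /** Sends up to count bytes of the file, starting at position, and returns the bytes sent. */
    static long sendRegion(Path file, long position, long count, WritableByteChannel target)
            throws IOException {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ)) {
            long sent = 0;
            // transferTo may move fewer bytes than requested, so call it in a loop
            while (sent < count) {
                long n = source.transferTo(position + sent, count - sent, target);
                if (n <= 0) {
                    break; // nothing more transferred, e.g. the end of the file was reached
                }
                sent += n;
            }
            return sent;
        }
    }
}
```

With a non-blocking socket, a return value of 0 can also mean that the send buffer is full, so real code would have to wait for writability instead of stopping. Handling that case is part of what Netty's FileRegion support takes care of.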
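import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.DefaultFileRegion;
import io.netty.channel.FileRegion;
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

// MAX_CHUNK_SIZE must leave room for the metadata: 2 + proto length + chunkSize <= 2^24 - 1
public void handleDownloadRequest(BaseRequest baseRequest, ChannelHandlerContext ctx) throws Exception {
    File file = new File("F:\\");
    long fileLength = file.length();
    System.out.println("file length:" + fileLength);
    long offset = 0;
    int chunkIndex = 0;
    // Round up so that a trailing partial chunk is counted
    int totalChunks = (int) Math.ceil((double) fileLength / MAX_CHUNK_SIZE);
    boolean firstPackage = true;
    while (offset < fileLength) {
        // Open the file once per chunk: DefaultFileRegion closes its FileChannel when released
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        FileChannel fileChannel = raf.getChannel();
        System.out.println("open:" + fileChannel.isOpen());
        // Size of the current file chunk
        long chunkSize = Math.min(MAX_CHUNK_SIZE, fileLength - offset);
        System.out.println("chunkSize:" + chunkSize);
        // Create a FileRegion to transmit the current file chunk
        FileRegion fileRegion = new DefaultFileRegion(fileChannel, offset, chunkSize);
        List<Header> headers = new ArrayList<>();
        // Only the first packet carries the file-level information
        if (firstPackage) {
            headers.add(Header.newBuilder().setKey("fileName").setValue(file.getName()).build());
            headers.add(Header.newBuilder().setKey("fileSize").setValue(String.valueOf(fileLength)).build());
            headers.add(Header.newBuilder().setKey("totalChunks").setValue(String.valueOf(totalChunks)).build());
        }
        headers.add(Header.newBuilder().setKey("chunkIndex").setValue(String.valueOf(chunkIndex)).build());
        // First half of the message body (msgId + headers)
        BaseResponse response = BaseResponse.newBuilder()
                .setMsgId(baseRequest.getMsgId())
                .addAllHeaders(headers)
                .build();
        byte[] payloadHeadBytes = response.toByteArray();
        long bodyLength = 2 + payloadHeadBytes.length + chunkSize;
        // Body length in three bytes, high byte first
        byte[] lengthBytes = new byte[3];
        lengthBytes[0] = (byte) (bodyLength >> 16);
        lengthBytes[1] = (byte) (bodyLength >> 8);
        lengthBytes[2] = (byte) bodyLength;
        // Protobuf length in two bytes, high byte first
        long length2 = payloadHeadBytes.length;
        byte[] lengthBytes2 = new byte[2];
        lengthBytes2[0] = (byte) (length2 >> 8);
        lengthBytes2[1] = (byte) (length2);
        // Send the message header + the first half of the message body
        ByteBuf byteBuf = Unpooled.wrappedBuffer(new byte[]{5}, lengthBytes, lengthBytes2, payloadHeadBytes);
        ChannelFuture f1 = ctx.channel().writeAndFlush(byteBuf);
        // sync() blocks, so this method must not run on the channel's event-loop thread
        f1.sync();
        // System.out.println("f1:" + f1.isSuccess());
        // Zero-copy write of the file data: the content never enters user-space memory
        // and is copied directly to the socket send buffer
        ChannelFuture f2 = ctx.channel().writeAndFlush(fileRegion);
        f2.sync();
        // System.out.println("f2:" + f2.isSuccess());
        firstPackage = false;
        // Advance to the next chunk
        offset += chunkSize;
        chunkIndex++;
        System.out.println("Write:" + bodyLength);
        raf.close();
    }
}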
2) Client code
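import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import lombok.AllArgsConstructor;
import lombok.Data;

public class DownloadManager {
    private Map<Integer, FileDownContext> waitingMap = new ConcurrentHashMap<>();

    public void addToMap(Integer msgId, CompletableFuture<String> waiter) {
        waitingMap.put(msgId, new FileDownContext(null, null, 0L, 0.0d, waiter));
    }

    public void onResponse(BaseResponse response) {
        // System.out.println("Receive:" + response.getMsgId());
        Integer msgId = response.getMsgId();
        FileDownContext context = waitingMap.get(msgId);
        if (Objects.isNull(context)) {
            return;
        }
        // Only the first packet carries these two pieces of information
        for (Header header : response.getHeadersList()) {
            if (Objects.equals(header.getKey(), "fileName")) {
                context.setFileName(header.getValue());
            }
            if (Objects.equals(header.getKey(), "totalChunks")) {
                context.setTotalChunks(Long.parseLong(header.getValue()));
            }
        }
        // Update the receiving progress
        context.receivedChunks++;
        context.progress = (double) context.receivedChunks / context.totalChunks;
        try {
            // Create the file if it does not exist yet
            Path filePath = Paths.get("F:\\clientDownload\\" + context.fileName);
            if (!Files.exists(filePath)) {
                Files.createFile(filePath);
            }
            // Append the chunk to the file
            Files.write(filePath, response.getData().toByteArray(), StandardOpenOption.APPEND);
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Complete the request and release the context once every chunk has arrived
        if (Objects.equals(context.receivedChunks, context.totalChunks)) {
            // The completion value is an assumption; the file name is used here
            context.waiter.complete(context.fileName);
            waitingMap.remove(msgId);
        }
    }

    @Data
    @AllArgsConstructor
    class FileDownContext {
        String fileName;
        Long totalChunks;
        Long receivedChunks;
        Double progress;
        CompletableFuture<String> waiter;
    }
}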