[Code] Custom communication protocol: implementing zero-copy file transfer

Foreword

The previous article covered how to extend the custom protocol with new request types. This article introduces file transfer built on that protocol, which involves data chunking and zero copy.

Before designing the custom protocol, let us first look at how the HTTP protocol handles file transfer.

The implementation of the HTTP protocol

Here, we mainly discuss the most widely used version, HTTP/1.1.

About data chunking

HTTP/1.1 is a text-based protocol. Its Content-Length header field specifies the length of the response body. Because the value is written as decimal text, there is in theory no limit on the length, so in most cases HTTP can transfer an entire file in one response.

Larger files can usually still be downloaded in a single request, which is common practice for many websites and services. However, if the file is particularly large, or if download efficiency matters (resumable downloads, parallel downloads, and so on), the file has to be split at the application layer. For example, the server can first return the file's segment information, and the client then requests the individual parts one by one.
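
As an illustration of requesting a file part by part, the sketch below computes the values a client could put in the standard HTTP Range header. The class and method names (HttpRangePlanner, planRanges) are hypothetical and not from any library; only the "bytes=start-end" syntax with an inclusive end offset comes from HTTP itself.

```java
import java.util.ArrayList;
import java.util.List;

final class HttpRangePlanner {
    /**
     * Returns Range header values that together cover a file of fileSize bytes,
     * each part spanning at most partSize bytes. Both offsets in "bytes=a-b" are inclusive.
     */
    static List<String> planRanges(long fileSize, long partSize) {
        if (fileSize < 0 || partSize <= 0) {
            throw new IllegalArgumentException("fileSize must be >= 0 and partSize > 0");
        }
        List<String> ranges = new ArrayList<>();
        for (long start = 0; start < fileSize; start += partSize) {
            long end = Math.min(start + partSize, fileSize) - 1; // inclusive end offset
            ranges.add("bytes=" + start + "-" + end);
        }
        return ranges;
    }
}
```

For a 10-byte file split into 4-byte parts, planRanges(10, 4) yields bytes=0-3, bytes=4-7 and bytes=8-9.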

About zero copy

HTTP client libraries usually do not expose the underlying socket connection, so the application layer cannot operate on the socket directly to perform zero-copy transfer.

In most cases the data must first be copied into the process's memory and then handed to the HTTP client.

Because of this limitation of HTTP client libraries, zero copy is not directly available to applications that go through them.

Custom protocol

About data chunking

With a custom protocol we can control the transfer process more flexibly. For example, our protocol uses only 3 bytes for the length of the message body, so a single message body can hold at most 2^24 - 1 bytes (just under 16 MB).

Content beyond this limit must be split into chunks so that every chunk fits within the protocol's length limit.
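
The chunk arithmetic can be sketched as follows. ChunkMath and its method names are hypothetical; the only value taken from the protocol is the 3-byte limit of 2^24 - 1 bytes. The chunk count is rounded up so that a final partial chunk is counted.

```java
final class ChunkMath {
    /** Largest body length a 3-byte length field can express: 16,777,215 bytes. */
    static final int MAX_BODY_LENGTH = (1 << 24) - 1;

    /** Number of chunks needed for fileLength bytes, rounding up for a partial last chunk. */
    static long totalChunks(long fileLength, long maxChunkSize) {
        if (fileLength < 0 || maxChunkSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and maxChunkSize > 0");
        }
        return (fileLength + maxChunkSize - 1) / maxChunkSize;
    }

    /** Size of the chunk at chunkIndex (0-based); only the last chunk may be shorter. */
    static long chunkSize(long fileLength, long maxChunkSize, long chunkIndex) {
        long offset = chunkIndex * maxChunkSize;
        if (chunkIndex < 0 || offset >= fileLength) {
            throw new IllegalArgumentException("chunkIndex out of range: " + chunkIndex);
        }
        return Math.min(maxChunkSize, fileLength - offset);
    }
}
```

A 25-byte file with a 10-byte chunk limit gives totalChunks(25, 10) = 3, and the last chunk, chunkSize(25, 10, 2), is 5 bytes.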

About zero copy

With the custom protocol we have direct access to the socket, so zero copy can be used to avoid extra copies of the file data through user-space memory, which improves transfer efficiency and reduces CPU load.
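
To make the idea concrete, here is a minimal JDK-only sketch built on FileChannel.transferTo, which lets the operating system move the bytes without passing them through a user-space buffer where the platform supports it. ZeroCopyDemo and transferRange are hypothetical names; the target can be any WritableByteChannel, such as a SocketChannel.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ZeroCopyDemo {
    /**
     * Transfers up to count bytes of source, starting at position, into target.
     * Returns the number of bytes actually transferred.
     */
    static long transferRange(Path source, long position, long count,
                              WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long sent = 0;
            while (sent < count) {
                // transferTo may move fewer bytes than requested, so loop until done
                long n = in.transferTo(position + sent, count - sent, target);
                if (n <= 0) {
                    break; // end of file reached, or the target accepted nothing
                }
                sent += n;
            }
            return sent;
        }
    }
}
```

The loop matters because a single transferTo call is allowed to transfer only part of the requested range.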

Preliminary design

How to build a packet?

The message is a complete Protobuf BaseResponse message:

msgId: request ID
headers: file name + file size + number of chunks + chunk index
data: file chunk data

message BaseResponse {
    required int32 msgId = 1;
    repeated Header headers = 2;
    optional bytes data = 3;
}

The message body is sent in two parts:

1. Send the metadata (BaseResponse's msgId + headers)

2. Then send the file data

Server side:

1. Determine the file range to send and obtain chunkSize

2. Build the BaseResponse (containing only msgId and headers)

3. Calculate the message length = size of the BaseResponse + chunkSize

4. Send the message header

5. Send the BaseResponse

6. Send the file chunk with zero copy

Client:

1. Parse the message as a complete BaseResponse.

Conflict? Protobuf and zero copy

During processing we run into a problem: Protobuf parsing requires a specific encoding format, and file content that is simply appended after the message cannot be treated as part of the Protobuf message.

If Protobuf is to recognize the file content, the file data must take part in the encoding, and encoding requires loading it into process memory. This defeats zero copy.

How to deal with this problem?

Add another length! The message body is divided into three parts:

2 bytes: the length of the Proto message (the metadata is small, so 2 bytes are enough)
N bytes: the Proto message (msgId + headers)
M bytes: the file chunk data
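
The three-part layout can be sketched without Netty or Protobuf. The sketch assumes the message header is one type byte followed by the 3-byte body length, which is how the server code in this article writes it; ChunkFrame, encodePrefix and splitBody are hypothetical names, and how a client decoder actually combines the two parts into a BaseResponse is not covered here.

```java
import java.nio.ByteBuffer;

final class ChunkFrame {
    static final int MAX_BODY_LENGTH = (1 << 24) - 1;  // limit of the 3-byte length field
    static final int MAX_PROTO_LENGTH = (1 << 16) - 1; // limit of the 2-byte length field

    /** Builds the bytes that precede the Proto message: type, body length, Proto length. */
    static byte[] encodePrefix(byte type, int protoLength, long chunkSize) {
        if (protoLength < 0 || protoLength > MAX_PROTO_LENGTH) {
            throw new IllegalArgumentException("Proto message too large: " + protoLength);
        }
        long bodyLength = 2L + protoLength + chunkSize;
        if (chunkSize < 0 || bodyLength > MAX_BODY_LENGTH) {
            throw new IllegalArgumentException("Body too large: " + bodyLength);
        }
        return new byte[] {
            type,
            (byte) (bodyLength >> 16), (byte) (bodyLength >> 8), (byte) bodyLength, // big-endian
            (byte) (protoLength >> 8), (byte) protoLength                            // big-endian
        };
    }

    /** Splits a complete message body into two views: {Proto bytes, file chunk bytes}. */
    static ByteBuffer[] splitBody(ByteBuffer body) {
        ByteBuffer view = body.duplicate();         // leaves the caller's position untouched
        int protoLength = view.getShort() & 0xFFFF; // unsigned 16-bit, big-endian
        ByteBuffer proto = view.slice();
        proto.limit(protoLength);
        view.position(view.position() + protoLength);
        ByteBuffer chunk = view.slice();
        return new ByteBuffer[] {proto, chunk};
    }
}
```

Because the Proto length is stored explicitly, the receiver can hand exactly the first N bytes to the Protobuf parser and treat everything after them as raw file data, so the file bytes never need to be Protobuf-encoded on the sending side.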

Processing logic

1) Server-side code
Java's zero-copy API is FileChannel.transferTo(long position, long count, WritableByteChannel target).
But Netty's Channel is not a subclass of WritableByteChannel. To use zero copy, you must use the FileRegion provided by Netty, which calls FileChannel's transferTo under the hood.

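import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.DefaultFileRegion;
import io.netty.channel.FileRegion;

import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.util.ArrayList;
import java.util.List;

    // MAX_CHUNK_SIZE is a constant of the enclosing class. 2 + metadata length + MAX_CHUNK_SIZE
    // must not exceed 2^24 - 1, the largest value the 3-byte length field can hold.
    // sync() blocks, so this method must not run on the channel's event loop thread.
    public void handleDownloadRequest(BaseRequest baseRequest, ChannelHandlerContext ctx) throws Exception {
        File file = new File("F:\\");
        long fileLength = file.length();
        System.out.println("file length" + fileLength);
        long offset = 0;

        int chunkIndex = 0;
        // Round up so that a final partial chunk is counted
        int totalChunks = (int) Math.ceil((double) fileLength / MAX_CHUNK_SIZE);
        boolean firstPackage = true;

        while (offset < fileLength) {
            // Netty closes the FileChannel when the FileRegion is released, so reopen it for every chunk
            RandomAccessFile raf = new RandomAccessFile(file, "r");
            FileChannel fileChannel = raf.getChannel();
            System.out.println("open:" + fileChannel.isOpen());
            // Size of the current file chunk
            long chunkSize = Math.min(MAX_CHUNK_SIZE, fileLength - offset);
            System.out.println("chunkSize:" + chunkSize);

            // Create a FileRegion to transmit the current file chunk
            FileRegion fileRegion = new DefaultFileRegion(fileChannel, offset, chunkSize);

            // File-level metadata goes only into the first packet; chunkIndex goes into every packet
            List<Header> headers = new ArrayList<>();
            if (firstPackage) {
                headers.add(Header.newBuilder().setKey("fileName").setValue(file.getName()).build());
                headers.add(Header.newBuilder().setKey("fileSize").setValue(String.valueOf(fileLength)).build());
                headers.add(Header.newBuilder().setKey("totalChunks").setValue(String.valueOf(totalChunks)).build());
            }
            headers.add(Header.newBuilder().setKey("chunkIndex").setValue(String.valueOf(chunkIndex)).build());

            // First half of the message body (msgId + headers)
            BaseResponse response = BaseResponse.newBuilder()
                    .setMsgId(baseRequest.getMsgId())
                    .addAllHeaders(headers)
                    .build();
            byte[] payloadHeadBytes = response.toByteArray();
            // Body = 2-byte Protobuf length + Protobuf message + file chunk
            long bodyLength = 2 + payloadHeadBytes.length + chunkSize;

            // 3-byte big-endian body length for the message header
            byte[] lengthBytes = new byte[3];
            lengthBytes[0] = (byte) (bodyLength >> 16);
            lengthBytes[1] = (byte) (bodyLength >> 8);
            lengthBytes[2] = (byte) bodyLength;

            // 2-byte big-endian length of the Protobuf message
            int length2 = payloadHeadBytes.length;
            byte[] lengthBytes2 = new byte[2];
            lengthBytes2[0] = (byte) (length2 >> 8);
            lengthBytes2[1] = (byte) length2;

            // Send the message header plus the first half of the message body
            ByteBuf byteBuf = Unpooled.wrappedBuffer(new byte[]{5}, lengthBytes, lengthBytes2, payloadHeadBytes);
            ChannelFuture f1 = ctx.channel().writeAndFlush(byteBuf);
            f1.sync();
//            System.out.println("f1:" + f1.isSuccess());
            // Zero-copy write of the file data (the content never enters user-space memory;
            // it is copied straight into the socket send buffer)
            ChannelFuture f2 = ctx.writeAndFlush(fileRegion);
            f2.sync();
//            System.out.println("f2:" + f2.isSuccess());

            firstPackage = false;
            // Advance to the next chunk
            offset += chunkSize;
            chunkIndex++;
            System.out.println("Write:" + bodyLength);

            raf.close();
        }

    }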

2) Client code

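import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

import lombok.AllArgsConstructor;
import lombok.Data;

public class DownloadManager {
    // Downloads in progress, keyed by the msgId of the request
    private Map<Integer, FileDownContext> waitingMap = new ConcurrentHashMap<>();

    public void addToMap(Integer msgId, CompletableFuture<String> waiter) {
        waitingMap.put(msgId, new FileDownContext(null, null, 0L, 0.0d, waiter));
    }

    public void onResponse(BaseResponse response) {
//        System.out.println("Receive:" + response.getMsgId());
        Integer msgId = response.getMsgId();
        FileDownContext context = waitingMap.get(msgId);
        if (Objects.isNull(context)) {
            return;
        }
        // Only the first packet carries these two headers
        for (Header header : response.getHeadersList()) {
            if (Objects.equals(header.getKey(), "fileName")) {
                context.setFileName(header.getValue());
            }
            if (Objects.equals(header.getKey(), "totalChunks")) {
                context.setTotalChunks(Long.valueOf(header.getValue()));
            }
        }
        // Update the receiving progress
        context.receivedChunks++;
        context.progress = (double) context.receivedChunks / context.totalChunks;

        try {
            // Create the file if it does not exist yet
            Path filePath = Paths.get("F:\\clientDownload\\" + context.fileName);
            if (!Files.exists(filePath)) {
                Files.createFile(filePath);
            }
            // Append the chunk to the file
            Files.write(filePath, response.getData().toByteArray(), StandardOpenOption.APPEND);
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Complete the request and release the context
        if (Objects.equals(context.receivedChunks, context.totalChunks)) {
            // Completing with the file name is an assumption; the value was lost from the listing
            context.waiter.complete(context.fileName);
            waitingMap.remove(msgId);
        }
    }

    @Data
    @AllArgsConstructor
    class FileDownContext {
        String fileName;
        Long totalChunks;
        Long receivedChunks;
        Double progress;
        CompletableFuture<String> waiter;
    }
}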
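
The chunk-appending and completion logic can also be tried in isolation, without Protobuf or Lombok. The sketch below is not the DownloadManager above: ChunkAppender is a hypothetical name, it completes a CompletableFuture<Path> rather than a CompletableFuture<String>, and it assumes chunks arrive in order. It opens the file with CREATE and APPEND together, so no separate existence check is needed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

final class ChunkAppender {
    private final Path target;
    private final long totalChunks;
    private final CompletableFuture<Path> done = new CompletableFuture<>();
    private long receivedChunks;

    ChunkAppender(Path target, long totalChunks) {
        this.target = target;
        this.totalChunks = totalChunks;
    }

    /** Appends one chunk to the target file; assumes chunks arrive in order. */
    synchronized void onChunk(byte[] data) {
        try {
            // CREATE makes the file on the first chunk, APPEND adds every chunk at the end
            Files.write(target, data, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            done.completeExceptionally(e); // report the failure to whoever is waiting
            return;
        }
        receivedChunks++;
        if (receivedChunks == totalChunks) {
            done.complete(target);
        }
    }

    /** Fraction of chunks received so far, between 0.0 and 1.0. */
    synchronized double progress() {
        return (double) receivedChunks / totalChunks;
    }

    CompletableFuture<Path> done() {
        return done;
    }
}
```

Unlike printing the stack trace, completing the future exceptionally lets the caller that is waiting for the download see that a write failed.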