In my spare time I used .NET to write a free online customer service system: the XUNWEI online customer service and marketing system.
From time to time, friends ask me about performance, and it just so happens that I have a real client with 2,000 visitors online. With the client's permission, I recorded a video.
The XUNWEI online customer service system copes with this load easily on a very low-spec server, and still delivers:
Millisecond message delivery and millisecond response to operations
Performance
Taking the official production environment as an example: it handles more than 160,000 HTTPS requests and more than 250,000 PV requests per day, while the main server-side process uses less than 300 MB of memory and less than 5% of the server's CPU.
Handles more than 160,000 HTTPS requests per day:
Handles more than 250,000 PV requests per day:
The server-side main program memory footprint is less than 300MB:
Server CPU (Intel Xeon Platinum 8163, 4 cores @ 2.5 GHz) usage holds steady at about 5%:
Security
- The visitor side connects over HTTPS and WSS (secure WebSocket), so data is encrypted throughout the transmission.
- Messages on the agent side are transmitted with AES encryption (Advanced Encryption Standard, the U.S. federal government block cipher standard); a minimal .NET sketch follows below.
- It supports 100% private deployment on your own servers.
Captured network traffic shows that messages travel as ciphertext:
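To make the AES point concrete, here is a minimal sketch of what AES message encryption looks like in .NET with the built-in Aes class. This is an illustration only, not the system's actual wire format: the MessageCipher name, the CBC mode with a per-message IV prepended to the ciphertext, and the idea of a per-deployment key are assumptions made for the example.

using System;
using System.Security.Cryptography;
using System.Text;

static class MessageCipher
{
    // Encrypt a message with AES-CBC and prepend the random IV to the ciphertext.
    public static byte[] Encrypt(string plaintext, byte[] key)
    {
        using var aes = Aes.Create();
        aes.Key = key;        // e.g. a 256-bit key configured per deployment (assumption)
        aes.GenerateIV();     // fresh IV for every message

        using var encryptor = aes.CreateEncryptor();
        byte[] plainBytes = Encoding.UTF8.GetBytes(plaintext);
        byte[] cipherBytes = encryptor.TransformFinalBlock(plainBytes, 0, plainBytes.Length);

        // Prepend the IV so the receiver can decrypt
        byte[] payload = new byte[aes.IV.Length + cipherBytes.Length];
        Buffer.BlockCopy(aes.IV, 0, payload, 0, aes.IV.Length);
        Buffer.BlockCopy(cipherBytes, 0, payload, aes.IV.Length, cipherBytes.Length);
        return payload;
    }

    // Reverse of Encrypt: read the IV from the front of the payload, then decrypt the rest.
    public static string Decrypt(byte[] payload, byte[] key)
    {
        using var aes = Aes.Create();
        aes.Key = key;

        byte[] iv = new byte[aes.BlockSize / 8];
        Buffer.BlockCopy(payload, 0, iv, 0, iv.Length);
        aes.IV = iv;

        using var decryptor = aes.CreateDecryptor();
        byte[] plainBytes = decryptor.TransformFinalBlock(payload, iv.Length, payload.Length - iv.Length);
        return Encoding.UTF8.GetString(plainBytes);
    }
}

In a setup like this, the sender encrypts with the shared key before pushing a message over the connection and the receiver decrypts on arrival; in practice the ciphertext should also be authenticated (for example with an HMAC, or by using AES-GCM instead of CBC).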
What it looks like
Agent side
Visitor side
How does it work? Technical Details
TCP Server using NetworkStream
Typical code written in .NET before Pipelines is shown below:
async Task ProcessLinesAsync(NetworkStream stream)
{
    var buffer = new byte[1024];
    await stream.ReadAsync(buffer, 0, buffer.Length);

    // Process a single line from the buffer
    ProcessLine(buffer);
}
This code may work correctly when tested locally, but it has several potential errors:
- A single ReadAsync call may not receive the entire message (up to the end of line).
- It ignores the return value of stream.ReadAsync(), which says how many bytes were actually read into the buffer. (Translator's note: the buffer is not necessarily filled completely.)
- It does not handle the case where a single ReadAsync call returns multiple messages.
These are some of the common pitfalls of reading streaming data. To fix them, we need to make a few changes:
- We need to buffer the incoming data until we find a newline.
- We need to parse all of the lines that end up in the buffer.
async Task ProcessLinesAsync(NetworkStream stream)
{
    var buffer = new byte[1024];
    var bytesBuffered = 0;
    var bytesConsumed = 0;

    while (true)
    {
        var bytesRead = await stream.ReadAsync(buffer, bytesBuffered, buffer.Length - bytesBuffered);
        if (bytesRead == 0)
        {
            // EOF
            break;
        }

        // Keep track of the amount of buffered bytes
        bytesBuffered += bytesRead;

        var linePosition = -1;

        do
        {
            // Look for an EOL in the buffered data
            linePosition = Array.IndexOf(buffer, (byte)'\n', bytesConsumed, bytesBuffered - bytesConsumed);

            if (linePosition >= 0)
            {
                // Calculate the length of the line based on the offset
                var lineLength = linePosition - bytesConsumed;

                // Process the line
                ProcessLine(buffer, bytesConsumed, lineLength);

                // Move bytesConsumed to skip past the line we consumed (including the \n)
                bytesConsumed += lineLength + 1;
            }
        }
        while (linePosition >= 0);
    }
}
Again, this might work in local development, but a line could be longer than 1 KiB (1024 bytes), so we need to resize the input buffer until we find a newline.
On top of that, we are allocating new buffers on the heap as longer lines come in. We can avoid these repeated allocations while parsing longer lines from the client by renting buffers from ArrayPool<byte>:
async Task ProcessLinesAsync(NetworkStream stream)
{
    byte[] buffer = ArrayPool<byte>.Shared.Rent(1024);
    var bytesBuffered = 0;
    var bytesConsumed = 0;

    while (true)
    {
        // Calculate the amount of bytes remaining in the buffer
        var bytesRemaining = buffer.Length - bytesBuffered;

        if (bytesRemaining == 0)
        {
            // Double the buffer size and copy the previously buffered data into the new buffer
            var newBuffer = ArrayPool<byte>.Shared.Rent(buffer.Length * 2);
            Buffer.BlockCopy(buffer, 0, newBuffer, 0, buffer.Length);

            // Return the old buffer to the pool
            ArrayPool<byte>.Shared.Return(buffer);
            buffer = newBuffer;
            bytesRemaining = buffer.Length - bytesBuffered;
        }

        var bytesRead = await stream.ReadAsync(buffer, bytesBuffered, bytesRemaining);
        if (bytesRead == 0)
        {
            // EOF
            break;
        }

        // Keep track of the amount of buffered bytes
        bytesBuffered += bytesRead;

        var linePosition = -1;

        do
        {
            // Look for an EOL in the buffered data
            linePosition = Array.IndexOf(buffer, (byte)'\n', bytesConsumed, bytesBuffered - bytesConsumed);

            if (linePosition >= 0)
            {
                // Calculate the length of the line based on the offset
                var lineLength = linePosition - bytesConsumed;

                // Process the line
                ProcessLine(buffer, bytesConsumed, lineLength);

                // Move bytesConsumed to skip past the line we consumed (including the \n)
                bytesConsumed += lineLength + 1;
            }
        }
        while (linePosition >= 0);
    }
}
This works, but now we are resizing the buffer, which results in extra buffer copies. It also uses more memory, because the logic never shrinks the buffer after lines are processed. To avoid this, we can store a sequence of buffers instead of resizing a single buffer every time it exceeds 1 KiB.
In addition, we don't allocate a bigger buffer until the current 1 KiB buffer is completely full. This means we end up passing smaller and smaller free regions to ReadAsync, which results in more calls into the operating system.
To mitigate this, we allocate a new buffer whenever fewer than 512 bytes remain in the existing one:
public class BufferSegment
{
    public byte[] Buffer { get; set; }
    public int Count { get; set; }

    public int Remaining => Buffer.Length - Count;
}

async Task ProcessLinesAsync(NetworkStream stream)
{
    const int minimumBufferSize = 512;

    var segments = new List<BufferSegment>();
    var bytesConsumed = 0;
    var bytesConsumedBufferIndex = 0;
    var segment = new BufferSegment { Buffer = ArrayPool<byte>.Shared.Rent(1024) };

    segments.Add(segment);

    while (true)
    {
        // If the current segment has too little space left, allocate a new segment
        if (segment.Remaining < minimumBufferSize)
        {
            segment = new BufferSegment { Buffer = ArrayPool<byte>.Shared.Rent(1024) };
            segments.Add(segment);
        }

        var bytesRead = await stream.ReadAsync(segment.Buffer, segment.Count, segment.Remaining);
        if (bytesRead == 0)
        {
            break;
        }

        // Keep track of the amount of buffered bytes
        segment.Count += bytesRead;

        while (true)
        {
            // Look for an EOL in the list of segments
            var (segmentIndex, segmentOffset) = IndexOf(segments, (byte)'\n', bytesConsumedBufferIndex, bytesConsumed);

            if (segmentIndex >= 0)
            {
                // Process the line
                ProcessLine(segments, segmentIndex, segmentOffset);

                bytesConsumedBufferIndex = segmentIndex;
                bytesConsumed = segmentOffset + 1;
            }
            else
            {
                break;
            }
        }

        // Drop fully consumed segments from the list so we don't look at them again
        for (var i = bytesConsumedBufferIndex; i >= 0; --i)
        {
            var consumedSegment = segments[i];

            // Return all segments unless this is the current segment
            if (consumedSegment != segment)
            {
                ArrayPool<byte>.Shared.Return(consumedSegment.Buffer);
                segments.RemoveAt(i);
            }
        }
    }
}

(int segmentIndex, int segmentOffset) IndexOf(List<BufferSegment> segments, byte value, int startBufferIndex, int startSegmentOffset)
{
    var first = true;

    for (var i = startBufferIndex; i < segments.Count; ++i)
    {
        var segment = segments[i];

        // Start from the correct offset
        var offset = first ? startSegmentOffset : 0;

        var index = Array.IndexOf(segment.Buffer, value, offset, segment.Count - offset);

        if (index >= 0)
        {
            // Return the buffer index and the index within that segment where the EOL was found
            return (i, index);
        }

        first = false;
    }

    return (-1, -1);
}
This code just got a lot more complicated. While searching for the delimiter, we are also keeping track of the sequence of filled buffers; to do that, we use a List<BufferSegment> to represent the buffered data as we look for the newline delimiter.
Our server now handles partial messages, and it uses pooled memory to reduce overall memory consumption, but there are more changes we need to make:
- The byte[] buffers rented from ArrayPool<byte> are plain old managed arrays. This means that whenever we execute ReadAsync or WriteAsync, those buffers are pinned for the lifetime of the asynchronous operation (in order to interoperate with the native IO APIs of the operating system). This has performance implications for the GC, because pinned memory cannot be moved, which can lead to heap fragmentation. Depending on how long the asynchronous operations stay pending, the pool implementation may need to change.
- Throughput can be optimized by decoupling the read logic from the processing logic. That creates a batching effect which lets the parsing logic consume larger blocks of buffers instead of reading more data only after parsing a single line. This introduces some additional complexity (a sketch of this split follows the list below):
- We need two loops that run independently of each other: one that reads from the socket and one that parses the buffers.
- We need a way to signal the parsing logic when data becomes available.
- We need to decide what happens if the read loop pulls data from the socket "too fast". If the parsing logic cannot keep up, we need a way to throttle the read loop. This is commonly called "flow control" or "backpressure".
- We need to make sure things are thread-safe. We are now sharing a set of buffers between the read loop and the parsing loop, and those loops run independently on different threads.
- The memory management logic is now spread across two different pieces of code: the code that rents buffers from the pool while reading from the socket, and the code that returns them from the parsing logic.
- We need to be extremely careful about how we return buffers after the parsing logic is done with them. If we are not careful, we may return a buffer that the socket read logic is still writing to.
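These requirements (two independent loops, signaling, backpressure, shared buffer lifetime) are exactly what System.IO.Pipelines was designed to take off our hands. As a rough illustration only, and not the customer service system's actual code, here is a minimal sketch of the same line-reading loop written against a Pipe on .NET Core 2.1 or later; the ProcessLine overload that accepts a ReadOnlySequence<byte> is an assumption of the example.

using System;
using System.Buffers;
using System.IO.Pipelines;
using System.Net.Sockets;
using System.Threading.Tasks;

async Task ProcessLinesAsync(NetworkStream stream)
{
    var pipe = new Pipe();
    Task writing = FillPipeAsync(stream, pipe.Writer);
    Task reading = ReadPipeAsync(pipe.Reader);

    await Task.WhenAll(reading, writing);
}

async Task FillPipeAsync(NetworkStream stream, PipeWriter writer)
{
    const int minimumBufferSize = 512;

    while (true)
    {
        // Rent at least 512 bytes from the PipeWriter's internal pool
        Memory<byte> memory = writer.GetMemory(minimumBufferSize);

        int bytesRead = await stream.ReadAsync(memory);
        if (bytesRead == 0)
        {
            // EOF
            break;
        }

        // Tell the PipeWriter how much was written into the rented memory
        writer.Advance(bytesRead);

        // Make the data available to the PipeReader; FlushAsync also applies backpressure
        FlushResult result = await writer.FlushAsync();
        if (result.IsCompleted)
        {
            break;
        }
    }

    // Signal the reader that no more data is coming
    writer.Complete();
}

async Task ReadPipeAsync(PipeReader reader)
{
    while (true)
    {
        ReadResult result = await reader.ReadAsync();
        ReadOnlySequence<byte> buffer = result.Buffer;
        SequencePosition? position;

        do
        {
            // Look for an EOL in the buffered data
            position = buffer.PositionOf((byte)'\n');

            if (position != null)
            {
                // Process the line (hypothetical overload taking a ReadOnlySequence<byte>)
                ProcessLine(buffer.Slice(0, position.Value));

                // Skip past the line we consumed (including the \n)
                buffer = buffer.Slice(buffer.GetPosition(1, position.Value));
            }
        }
        while (position != null);

        // Tell the PipeReader how much was consumed and how much was examined
        reader.AdvanceTo(buffer.Start, buffer.End);

        if (result.IsCompleted)
        {
            break;
        }
    }

    // Mark the reader as complete
    reader.Complete();
}

In this sketch, FillPipeAsync pumps data from the NetworkStream into the PipeWriter, ReadPipeAsync pulls it back out of the PipeReader and slices off complete lines, and the Pipe itself handles buffer pooling, signaling between the two loops, and flow control (configurable via PipeOptions.PauseWriterThreshold and ResumeWriterThreshold).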
Try it online, or download the full private-deployment package:
I hope to make it open, open source, and shared, and to strive to create an excellent open source product for the .NET community.