System.IO.Pipelines: .NET上高性能IO (2)

日期：2021-05-29 栏目：程序人生浏览：次

译者注：这段代码太复杂了，懒得翻译注释了，大家将就看吧

public class BufferSegment { public byte[] Buffer { get; set; } public int Count { get; set; } public int Remaining => Buffer.Length - Count; } async Task ProcessLinesAsync(NetworkStream stream) { const int minimumBufferSize = 512; var segments = new List<BufferSegment>(); var bytesConsumed = 0; var bytesConsumedBufferIndex = 0; var segment = new BufferSegment { Buffer = ArrayPool<byte>.Shared.Rent(1024) }; segments.Add(segment); while (true) { // Calculate the amount of bytes remaining in the buffer if (segment.Remaining < minimumBufferSize) { // Allocate a new segment segment = new BufferSegment { Buffer = ArrayPool<byte>.Shared.Rent(1024) }; segments.Add(segment); } var bytesRead = await stream.ReadAsync(segment.Buffer, segment.Count, segment.Remaining); if (bytesRead == 0) { break; } // Keep track of the amount of buffered bytes segment.Count += bytesRead; while (true) { // Look for a EOL in the list of segments var (segmentIndex, segmentOffset) = IndexOf(segments, (byte)'\n', bytesConsumedBufferIndex, bytesConsumed); if (segmentIndex >= 0) { // Process the line ProcessLine(segments, segmentIndex, segmentOffset); bytesConsumedBufferIndex = segmentOffset; bytesConsumed = segmentOffset + 1; } else { break; } } // Drop fully consumed segments from the list so we don't look at them again for (var i = bytesConsumedBufferIndex; i >= 0; --i) { var consumedSegment = segments[i]; // Return all segments unless this is the current segment if (consumedSegment != segment) { ArrayPool<byte>.Shared.Return(consumedSegment.Buffer); segments.RemoveAt(i); } } } } (int segmentIndex, int segmentOffest) IndexOf(List<BufferSegment> segments, byte value, int startBufferIndex, int startSegmentOffset) { var first = true; for (var i = startBufferIndex; i < segments.Count; ++i) { var segment = segments[i]; // Start from the correct offset var offset = first ? startSegmentOffset : 0; var index = Array.IndexOf(segment.Buffer, value, offset, segment.Count - offset); if (index >= 0) { // Return the buffer index and the index within that segment where EOL was found return (i, index); } first = false; } return (-1, -1); }

此代码只是得到很多更加复杂。当我们正在寻找分隔符时，我们同时跟踪已填充的缓冲区序列。为此，我们此处使用List<BufferSegment>查找新行分隔符时表示缓冲数据。其结果是，ProcessLine和IndexOf现在接受List<BufferSegment>作为参数，而不是一个byte[]，offset和count。我们的解析逻辑现在需要处理一个或多个缓冲区序列。

我们的服务器现在处理部分消息，它使用池化内存来减少总体内存消耗，但我们还需要进行更多更改：

我们使用的byte[]和ArrayPool<byte>的只是普通的托管数组。这意味着无论何时我们执行ReadAsync或WriteAsync，这些缓冲区都会在异步操作的生命周期内被固定（以便与操作系统上的本机IO API互操作）。这对GC有性能影响，因为无法移动固定内存，这可能导致堆碎片。根据异步操作挂起的时间长短，池的实现可能需要更改。

可以通过解耦读取逻辑和处理逻辑来优化吞吐量。这会创建一个批处理效果，使解析逻辑可以使用更大的缓冲区块，而不是仅在解析单个行后才读取更多数据。这引入了一些额外的复杂性

我们需要两个彼此独立运行的循环。一个读取Socket和一个解析缓冲区。

当数据可用时，我们需要一种方法来向解析逻辑发出信号。

我们需要决定如果循环读取Socket“太快”会发生什么。如果解析逻辑无法跟上，我们需要一种方法来限制读取循环（逻辑）。这通常被称为“流量控制”或“背压”。

我们需要确保事情是线程安全的。我们现在在读取循环和解析循环之间共享多个缓冲区，并且这些缓冲区在不同的线程上独立运行。

内存管理逻辑现在分布在两个不同的代码段中，从填充缓冲区池的代码是从套接字读取的，而从缓冲区池取数据的代码是解析逻辑。

我们需要非常小心在解析逻辑完成之后我们如何处理缓冲区序列。如果我们不小心，我们可能会返回一个仍由Socket读取逻辑写入的缓冲区序列。

复杂性已经到了极端（我们甚至没有涵盖所有案例）。高性能网络应用通常意味着编写非常复杂的代码，以便从系统中获得更高的性能。

System.IO.Pipelines的目标是使这种类型的代码更容易编写。

使用System.IO.Pipelines的TCP服务器

转载注明出处：https://www.heiqu.com/wpjgss.html

System.IO.Pipelines: .NET上高性能IO (2)

相关推荐