X-Stream Paper: 《X-Stream: Edge-centric Graph Processing using Streaming Partitions》
In a previous article, "The Paper Graph: Awesome-Graphs Samples 200 Graph System Papers", we introduced the paper-mapping project Awesome-Graphs and shared Google's Pregel and OSDI 2012's PowerGraph. This time we bring you X-Stream, another classic graph computation framework paper, published at SOSP 2013, which builds a single-machine, external-memory-based Scatter-Gather graph processing framework.
Readers interested in graph computing technology are welcome to learn more, and to follow and contribute to the open-source paper-mapping projects:
- Awesome-Graphs:/TuGraph-family/Awesome-Graphs
- OSGraph:/TuGraph-family/OSGraph
Thanks in advance to everyone who gives the projects a Star, and now let's get right into the content!
Summary
- X-Stream is a single-machine shared-memory graph processing system that can process both in-memory and out-of-core graphs.
- Features:
- Edge-centric computation model.
- Streaming sequential access to an unordered edge list instead of random access.
1. Introduction
Traditional vertex-centric processing:
- The scatter function propagates a vertex's state to its neighboring vertices.
- The gather function accumulates incoming updates and recomputes the vertex's state (a minimal sketch follows this list).
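To make the model concrete, here is a minimal vertex-centric Scatter-Gather iteration in Python; the adjacency-list input (`out_edges`), the distance map `state`, and the shortest-path update rule are illustrative assumptions rather than X-Stream's API:

```python
# Minimal vertex-centric Scatter-Gather sketch (illustrative, not X-Stream's API).
# Vertex-centric: iterate over vertices and walk each vertex's edge list,
# which requires an index (adjacency list) over the edges.

def vertex_centric_iteration(state, out_edges):
    # Scatter: each vertex pushes its state along its outgoing edges.
    updates = {}  # target vertex -> list of proposed values
    for v, dist in state.items():
        for (target, weight) in out_edges.get(v, []):
            updates.setdefault(target, []).append(dist + weight)

    # Gather: each vertex accumulates its updates and recomputes its state.
    new_state = dict(state)
    for v, proposals in updates.items():
        new_state[v] = min(new_state[v], min(proposals))
    return new_state


# Example: one single-source shortest-paths step on a tiny graph.
INF = float("inf")
out_edges = {0: [(1, 1.0), (2, 4.0)], 1: [(2, 1.0)]}
state = {0: 0.0, 1: INF, 2: INF}
print(vertex_centric_iteration(state, out_edges))
```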
Ratio of sequential to random access bandwidth on different storage media:
- Disk: 500x
- SSD: 30x
- Memory: 1.8x - 4.6x
X-Stream's edge-centric processing:
- scatter/gather iterate over edges/updates rather than over vertices.
- Random access to the vertex set is mitigated with streaming partitions.
- Edges are placed in the same partition as their source vertices.
X-Stream's main contributions:
- The edge-centric processing model.
- Streaming partitions.
- Good scalability across different storage media.
- High performance.
2. X-Stream processing model
API design:
- Scatter: computes an update for an edge's target vertex from the edge and its source vertex's state.
- Gather: recomputes a target vertex's state from the updates it receives; a sketch of both callbacks follows.
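A sketch of what edge-centric Scatter/Gather callbacks might look like, again using a shortest-path-style rule; the `Edge`/`Update` records and function signatures are assumptions for illustration, not X-Stream's actual C++ interface:

```python
# Edge-centric callbacks (illustrative sketch, not X-Stream's exact interface).
# Scatter sees one edge plus the state of its source vertex; Gather sees one
# update plus the state of its target vertex. Neither callback ever walks an
# adjacency list, so edges can be consumed in any (streaming) order.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Edge:
    src: int
    dst: int
    weight: float

@dataclass
class Update:
    target: int
    value: float

def scatter(edge: Edge, src_state: float) -> Optional[Update]:
    # Emit an update for the target vertex, or nothing if the source
    # has no useful state to propagate yet.
    if src_state == float("inf"):
        return None
    return Update(target=edge.dst, value=src_state + edge.weight)

def gather(update: Update, dst_state: float) -> float:
    # Recompute the target vertex's state from one incoming update.
    return min(dst_state, update.value)
```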
2.1 Streams
X-Stream performs Scatter and Gather over streams: edges and updates are accessed strictly sequentially, while vertices are accessed randomly.
2.2 Streaming Partitioning
Each streaming partition consists of:
- Vertex set: the subset of vertices assigned to the partition.
- Edge list: the edges whose source vertex lies in the partition's vertex set.
- Update list: the updates whose target vertex lies in the partition's vertex set (see the partitioning sketch below).
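A sketch of how vertices and edges could be split into streaming partitions; the contiguous range assignment of vertex IDs and the `partition_of` helper are assumptions for illustration:

```python
# Building streaming partitions (illustrative sketch).
# Vertices are split into K contiguous ranges; each edge goes to the partition
# that owns its source vertex, so scatter on a partition only needs that
# partition's vertex set in fast storage.

def partition_of(vertex_id: int, num_vertices: int, k: int) -> int:
    per_part = (num_vertices + k - 1) // k  # ceil division
    return vertex_id // per_part

def build_partitions(num_vertices: int, edges, k: int):
    parts = [{"vertices": [], "edges": [], "updates": []} for _ in range(k)]
    for v in range(num_vertices):
        parts[partition_of(v, num_vertices, k)]["vertices"].append(v)
    for (src, dst, w) in edges:
        parts[partition_of(src, num_vertices, k)]["edges"].append((src, dst, w))
    return parts

# Example: 6 vertices, 2 partitions.
edges = [(0, 3, 1.0), (4, 1, 2.0), (2, 5, 0.5)]
for i, p in enumerate(build_partitions(6, edges, 2)):
    print(i, p["vertices"], p["edges"])
```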
2.3 Scatter-Gather on Partitions
Each iteration runs three phases over the streaming partitions: Scatter (stream a partition's edge list and produce updates), Shuffle (route each update to the partition that owns its target vertex), and Gather (stream each partition's update list and apply it to that partition's vertices), as sketched below.
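A minimal sketch of one edge-centric iteration over streaming partitions, reusing the partition layout and `partition_of` helper from the previous sketch, with the shortest-path rule inlined (all names are illustrative):

```python
# One edge-centric iteration over streaming partitions (illustrative sketch).
def edge_centric_iteration(parts, state, num_vertices, k):
    # Phase 1+2: scatter each partition's edges, shuffling each update to the
    # partition that owns its target vertex.
    for p in parts:
        p["updates"].clear()
    for p in parts:
        for (src, dst, w) in p["edges"]:            # sequential edge stream
            if state[src] == float("inf"):           # random vertex access
                continue
            parts[partition_of(dst, num_vertices, k)]["updates"].append(
                (dst, state[src] + w))
    # Phase 3: gather each partition's updates into its vertex states.
    for p in parts:
        for (dst, value) in p["updates"]:            # sequential update stream
            state[dst] = min(state[dst], value)      # random vertex access
    return state
```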
2.4 Size and number of partitions
- On the one hand, for each partition's vertex set to fit in fast storage, the number of partitions must not be too small.
- On the other hand, to make the most of the sequential read/write bandwidth of slow storage, the number of partitions must not be too large.
- Partitioning is therefore done by fixing the size of each partition's vertex set.
2.5 API Limitations and Extensions
- While it is not possible to iterate over the edges of a single vertex, it is possible to iterate over all vertices and supply custom per-vertex functions.
- The engine is not limited to the Scatter-Gather model; it can also support semi-streaming algorithms, the W-Stream model, and so on.
3. External memory-based streaming engine
Each streaming partition maintains three disk files: a vertex file, an edge file, and an update file.
The difficulty lies in keeping the Shuffle phase sequential. X-Stream merges the Scatter and Shuffle phases: updates produced by Scatter are written into an in-memory buffer, and whenever the buffer fills up, an in-memory shuffle is performed and the results are appended to each target partition's update file on disk.
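A sketch of that merged scatter-and-shuffle flow: updates are buffered in memory and spilled to per-partition update files when the buffer fills. The file naming, text format, and buffer-capacity constant are illustrative, not the paper's on-disk format:

```python
# Merged scatter + shuffle with an in-memory update buffer (illustrative sketch).
import os
from collections import defaultdict

BUFFER_CAPACITY = 1_000_000  # max buffered updates before spilling (assumed)

def flush_buffer(buffer, out_dir):
    # In-memory shuffle: group buffered updates by target partition,
    # then append each group to that partition's update file.
    by_part = defaultdict(list)
    for (part, dst, value) in buffer:
        by_part[part].append((dst, value))
    for part, updates in by_part.items():
        with open(os.path.join(out_dir, f"updates_{part}.txt"), "a") as f:
            for dst, value in updates:
                f.write(f"{dst} {value}\n")
    buffer.clear()

def scatter_and_shuffle(edge_stream, state, partition_of_vertex, out_dir):
    buffer = []
    for (src, dst, w) in edge_stream:                # sequential edge stream
        if state[src] == float("inf"):
            continue
        buffer.append((partition_of_vertex(dst), dst, state[src] + w))
        if len(buffer) >= BUFFER_CAPACITY:
            flush_buffer(buffer, out_dir)
    flush_buffer(buffer, out_dir)                    # spill whatever remains
```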
3.1 In-memory data structures
Stream buffer design:
- A stream buffer is a large contiguous region of memory split into per-partition chunks, with an index recording how much of each chunk is filled (sketched below).
- Two stream buffers are used: one holds the updates produced by Scatter, the other holds the result of the in-memory shuffle.
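A sketch of a stream buffer as one flat array divided into K chunks plus per-chunk fill counters; the class and field names are assumptions for illustration:

```python
# Stream buffer: one flat array split into K fixed-size chunks, plus an
# index recording how many entries each chunk currently holds (illustrative).
class StreamBuffer:
    def __init__(self, k: int, chunk_capacity: int):
        self.k = k
        self.chunk_capacity = chunk_capacity
        # Flat storage: chunk i occupies slots [i * cap, (i + 1) * cap).
        self.slots = [None] * (k * chunk_capacity)
        self.fill = [0] * k  # index array: entries used in each chunk

    def append(self, part: int, item) -> bool:
        """Append an item to partition part's chunk; False if the chunk is full."""
        if self.fill[part] >= self.chunk_capacity:
            return False
        self.slots[part * self.chunk_capacity + self.fill[part]] = item
        self.fill[part] += 1
        return True

    def chunk(self, part: int):
        """Return the filled portion of partition part's chunk."""
        base = part * self.chunk_capacity
        return self.slots[base:base + self.fill[part]]

    def reset(self):
        self.fill = [0] * self.k
```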
3.2 Operations
The initial partitioning of the edge list can itself be implemented with the same in-memory shuffle mechanism.
3.3 Disk IO
- X-Stream reads and writes stream buffers with asynchronous direct I/O, bypassing the OS page cache (4K pages).
- Read-ahead and large block writes improve disk utilization, but require an additional stream buffer, as sketched after this list.
- RAID is used to separate reads from writes across drives.
- On SSDs, truncation is implemented using the TRIM operation.
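A sketch of read-ahead with one extra buffer: while the caller consumes the current block, a background thread prefetches the next one. The 16 MB block size, Python threading, and buffered file reads are stand-ins; X-Stream itself issues asynchronous direct I/O:

```python
# Double-buffered read-ahead over a large on-disk stream (illustrative sketch).
import threading

BLOCK_SIZE = 16 * 1024 * 1024  # 16 MB blocks (assumed)

def stream_blocks(path: str):
    """Yield blocks of a file, prefetching the next block on a background thread."""
    with open(path, "rb") as f:
        def read_block():
            return f.read(BLOCK_SIZE)

        current = read_block()
        while current:
            next_block = {}
            t = threading.Thread(
                target=lambda: next_block.setdefault("data", read_block()))
            t.start()            # prefetch the next block...
            yield current        # ...while the caller processes this one
            t.join()
            current = next_block["data"]
```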
3.4 Number of partitions
Assuming that updates are uniformly distributed across partitions, a memory constraint on the number of partitions can be written in terms of the following quantities (reconstructed after the list):
- N: total size of the vertex set.
- S: the I/O request size needed to reach maximum disk bandwidth.
- K: the number of streaming partitions.
- M: the total amount of main memory.
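A hedged reconstruction of that constraint from the definitions above, assuming memory must hold one partition's vertex set and that each of the K stream-buffer chunks should be at least S bytes to sustain full-bandwidth I/O (the paper's exact constant factors may differ):

$$
\frac{N}{K} \le M
\qquad\text{and}\qquad
\frac{M}{K} \ge S
\quad\Longrightarrow\quad
\frac{N}{M} \;\le\; K \;\le\; \frac{M}{S}
$$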
4. Memory-based streaming engine
4.1 Parallel Scatter-Gather
- Each thread writes its output into a private buffer and flushes it in batches to the shared output stream buffer.
- Work stealing is used to avoid load skew across threads, as sketched below.
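A sketch of parallel scatter with thread-private output buffers and a shared chunk queue from which idle threads keep pulling work; the chunk size, queue, and batch-flush threshold are assumptions for illustration:

```python
# Parallel scatter with private output buffers and chunk stealing (illustrative).
import threading, queue

CHUNK = 4096  # edges per work unit (assumed)

def parallel_scatter(edges, state, num_threads=4):
    work = queue.Queue()
    for i in range(0, len(edges), CHUNK):
        work.put(edges[i:i + CHUNK])          # pre-split edge stream into chunks
    shared_out, lock = [], threading.Lock()

    def worker():
        private = []                          # thread-private output buffer
        while True:
            try:
                chunk = work.get_nowait()     # idle threads keep grabbing work
            except queue.Empty:
                break
            for (src, dst, w) in chunk:
                if state[src] != float("inf"):
                    private.append((dst, state[src] + w))
            if len(private) >= CHUNK:         # flush the private buffer in batches
                with lock:
                    shared_out.extend(private)
                private.clear()
        with lock:
            shared_out.extend(private)        # final flush

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads: t.start()
    for t in threads: t.join()
    return shared_out
```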
4.2 Parallel multistage shuffle
- Partitions are organized as a tree with fan-out F; each level of the tree corresponds to one shuffle step.
- For K partitions, a total of log_F(K) shuffle steps are therefore required.
- The shuffle is implemented with two stream buffers that alternate between the input and output roles (a simplified sketch follows this list).
- The paper sets F according to the number of cache lines available in the CPU cache.
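A simplified sketch of a multi-pass shuffle with fan-out F using two buffers that swap roles each pass; the stable radix-style routing below is an illustrative stand-in for X-Stream's partition tree:

```python
# Multi-pass shuffle with fan-out F over two alternating buffers (illustrative).
import math

def multistage_shuffle(items, key_of, k, f):
    """Group items by key_of(item) in [0, k) using about log_F(K) passes."""
    steps = max(1, math.ceil(math.log(k, f)))
    src = list(items)
    for step in range(steps):
        # Stable radix pass: route by one base-F digit of the key per step;
        # after all passes the items are fully sorted (hence grouped) by key.
        shift = f ** step
        buckets = [[] for _ in range(f)]
        for item in src:
            buckets[(key_of(item) // shift) % f].append(item)
        # The flattened buckets become the input of the next pass
        # (the two buffers swapping input/output roles).
        src = [item for bucket in buckets for item in bucket]
    return src

# Example: group 16 updates into K=16 partitions with fan-out F=4 (2 passes).
updates = [(v % 16, v) for v in range(31, -1, -2)]
print(multistage_shuffle(updates, key_of=lambda u: u[0], k=16, f=4))
```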
4.3 Layering of Disk Streams
The in-memory engine sits logically on top of the out-of-core engine: the out-of-core engine is free to choose how many partitions to hand to the in-memory engine at a time, so as to make maximal use of memory and compute resources.
5. Evaluation
- With a 256 MB buffer size, memory bandwidth reaches a maximum of about 25 GB/s with 16 cores.
- A 16 MB I/O request size is used.