Global min-cut problem
The global min-cut problem is a classic problem in graph theory: partition the vertex set of a graph by removing edges. Specifically, given a weighted undirected graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges, and each edge $e \in E$ has a weight $w(e)$, the goal is to find a partition $(S, T)$ of $V$ into two non-empty subsets such that the total weight of the edges crossing from $S$ to $T$ is minimized. The cut set $\text{cut}(S)$ is defined as the collection of all edges with one endpoint in $S$ and the other in $T$.
Formally, the global min-cut problem asks to solve the following optimization problem:

$$\min_{\emptyset \ne S \subsetneq V} \text{cut}(S) = \min_{\emptyset \ne S \subsetneq V} \sum_{\substack{(u, v) \in E \\ u \in S,\; v \in V \setminus S}} w(u, v)$$
Relationship with max-flow min-cut
In many applications, the minimum cut problem is closely related to the maximum flow problem. The max-flow min-cut theorem states that in a flow network, the maximum flow from a source to a sink equals the capacity of the minimum cut separating the source from the sink. This theorem gives a computational tool for min-cut problems: we can obtain a minimum cut by solving for a maximum flow.
However, this method applies only to network-flow settings, where a specific source and sink are fixed and the flow is computed with a max-flow algorithm. The global min-cut problem is more general: it is defined on an undirected graph, does not depend on any particular source-sink pair, and requires finding the optimum over all possible partitions of the vertices. Solving it is therefore harder than solving the s-t min-cut problem in a flow network.
- Minimum cut in a flow network: in the maximum flow problem, the minimum cut is a cut set between a specific source $s$ and sink $t$, whose removal disconnects all flow from $s$ to $t$. The max-flow min-cut theorem provides an effective way to solve it: compute the maximum flow from $s$ to $t$, which equals the minimum cut capacity.
- Global minimum cut: unlike the s-t minimum cut, the global min-cut problem does not depend on a specific source-sink pair but considers all possible vertex partitions of the graph. We want the partition whose cut set has minimum weight among all partitions, not just those separating a fixed pair of vertices. The global min-cut problem is thus a more general graph optimization problem, usually harder than the min-cut problem in network flows.
For the s-t min-cut problem on a flow network, the classic method is based on the max-flow min-cut theorem: directly run a maximum flow algorithm and read off the minimum cut.
The global min-cut problem is not limited to a cut between one source-sink pair; it asks for the optimal partition of the whole graph. The most brute-force method is therefore to enumerate all $O(|V|^2)$ source-sink pairs, run a maximum flow algorithm for each, and take the smallest of the resulting cuts. In fact, thanks to a property of minimum cuts, it suffices to fix one vertex $s$ as the source: any global minimum cut must separate $s$ from some other vertex, so only $|V| - 1$ pairs need to be enumerated.
Even so, this method faces serious efficiency problems in practice: max-flow algorithms are expensive, and running one per source-sink pair becomes prohibitive on large graphs. A sketch of the flow-based approach follows.
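The following Python sketch illustrates the flow-based approach under a few assumptions not fixed by the text: the graph is connected, vertices are labelled $0..n-1$, and a simple Edmonds-Karp max-flow (chosen for brevity, not speed) is run for each of the $n - 1$ source-sink pairs. The names `max_flow` and `global_min_cut_by_flow` are illustrative.

```python
from collections import deque, defaultdict

def max_flow(adj, s, t):
    """Edmonds-Karp max flow. adj[u][v] is the (symmetric) capacity of
    the undirected edge {u, v}; each edge acts as two directed arcs."""
    cap = {u: dict(nbrs) for u, nbrs in adj.items()}  # residual capacities
    flow = 0
    while True:
        # BFS for a shortest augmenting path from s to t.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow                    # no augmenting path left
        # Recover the path, find its bottleneck, and push flow along it.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck        # forward residual shrinks
            cap[v][u] += bottleneck        # backward residual grows
        flow += bottleneck

def global_min_cut_by_flow(n, edges):
    """Fix vertex 0 as the source: a global minimum cut must separate 0
    from some other vertex t, so n - 1 max-flow runs suffice."""
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = adj[u].get(v, 0) + w   # parallel edges add up
        adj[v][u] = adj[v].get(u, 0) + w
    return min(max_flow(adj, 0, t) for t in range(1, n))

# Two unit-weight triangles joined by one bridge: the global min cut is 1.
edges = [(0, 1, 1), (1, 2, 1), (2, 0, 1),
         (3, 4, 1), (4, 5, 1), (5, 3, 1), (2, 3, 1)]
print(global_min_cut_by_flow(6, edges))    # -> 1
```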
Random contraction algorithm
The random contraction algorithm (Karger's algorithm) is a randomized algorithm for the global min-cut problem. Its core idea is to repeatedly contract edges of the graph, shrinking it until only two super-nodes remain, which induces a partition of the original vertices into two parts. It is a Monte Carlo algorithm: a single run returns a valid cut, but that cut is the minimum cut only with some probability.
Running process
The basic steps of the naive random contraction algorithm are as follows (see the code sketch after the list):
- Initialization: given an undirected weighted graph $G = (V, E)$, where each edge has a weight.
- Randomly select and contract an edge: pick a random edge $e = (u, v)$ (heuristically, with probability proportional to its weight, so heavier edges are more likely to be chosen) and merge its two endpoints $u$ and $v$ into a new node $u'$. When merging, every edge incident to $u$ or $v$ is reconnected to $u'$ (parallel edges to a common neighbor have their weights summed), and the contracted edge $e$ itself, now a self-loop, is removed.
- Repeat: keep contracting random edges until only two nodes remain. The total weight of the edges between these two nodes is taken as a candidate cut.
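Below is a minimal Python sketch of one run, under assumptions not fixed by the text: vertices are labelled $0..n-1$, the graph is connected, edges are `(u, v, w)` tuples, and a union-find structure tracks which super-node each original vertex belongs to. The helper name `contract_until` is illustrative; it stops at a target number $t$ of super-nodes so that the recursive variant later in this article can reuse it.

```python
import random

def contract_until(n, edges, t):
    """Contract random edges (chosen with probability proportional to
    weight, the step-2 heuristic) until only t super-nodes remain.
    Assumes a connected graph. Returns (t, new_edges) with super-nodes
    relabelled 0..t-1 and parallel edges merged by summing weights."""
    parent = list(range(n))                 # union-find over super-nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    pool = list(edges)                      # current non-self-loop edges
    remaining = n
    while remaining > t:
        # Sample an edge with probability proportional to its weight.
        r = random.uniform(0.0, sum(w for _, _, w in pool))
        for u, v, w in pool:
            r -= w
            if r <= 0.0:
                break
        parent[find(u)] = find(v)           # contract the chosen edge
        remaining -= 1
        # Edges now inside a single super-node are self-loops: drop them.
        pool = [e for e in pool if find(e[0]) != find(e[1])]

    label, merged = {}, {}
    for u, v, w in pool:
        a = label.setdefault(find(u), len(label))
        b = label.setdefault(find(v), len(label))
        key = (min(a, b), max(a, b))
        merged[key] = merged.get(key, 0) + w  # sum parallel edges
    return len(label), [(a, b, w) for (a, b), w in merged.items()]

def karger_cut(n, edges):
    """One full run: contract down to 2 super-nodes and return the total
    weight of the edges between them (a candidate, not guaranteed, cut)."""
    _, rest = contract_until(n, edges, 2)
    return sum(w for _, _, w in rest)

# A 4-cycle with unit weights: the global minimum cut is 2.
print(min(karger_cut(4, [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)])
          for _ in range(50)))              # -> 2 with high probability
```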
Correctness of the algorithm
The random contraction algorithm can be viewed as a process of "compressing" the graph structure. Imagine a very large map marking all the roads between cities; you want a set of roads whose removal splits the map into two parts with the smallest total length. By repeatedly merging parts that are unlikely to matter, the map gradually shrinks, and what remains at the end is a candidate for the cheapest split. In the weighted setting, since a minimum cut tends to consist of edges of small total weight, sampling edges with probability proportional to weight makes the cut edges less likely to be contracted, so they tend to survive until the end.
Each contraction preserves the minimum cut as long as none of its edges is contracted: merging two vertices that lie on the same side of a cut does not change that cut. So if every contraction happens to avoid the edges of a minimum cut, the edges left between the final two nodes are exactly that minimum cut.
Because the contracted edge is chosen at random, different runs may produce different cuts, so the output of a single run is random and not guaranteed to be optimal. However, as shown in the probability analysis below, a single run finds a true minimum cut with probability at least $\frac{2}{n(n-1)}$, so by running the algorithm enough times the true optimum is obtained with high probability.
Complexity of a single run
The cost of each contraction depends on the number of edges and the data structures used. Suppose the graph has $n$ nodes and $m$ edges: selecting an edge takes $O(m)$ time, merging the two endpoints takes $O(1)$, and updating the graph structure takes at most $O(m)$.
As contraction proceeds, the number of nodes decreases by one per step, and the process runs $n - 2$ times (until two nodes remain), so the total time of a single run is about $O(mn)$.
Number of runs needed for a high-probability optimum
Although a single run of the random contraction algorithm may fail to return the minimum cut, it succeeds with a probability that can be bounded from below. To boost the success probability, the algorithm is run many times and the best cut found is kept.
Within one run every step is random, so the minimum cut may be destroyed by an unlucky contraction. Over many independent runs, however, the probability that at least one run returns a true minimum cut grows quickly. In practice the algorithm is run $O(n^2)$ times for a constant success probability, and $O(n^2 \log n)$ times to succeed with high probability.
Specifically, it can be proved that if contraction is run until only $t$ nodes remain, a fixed minimum cut $S^*$ survives (none of its edges is contracted) with probability at least $\frac{\binom{t}{2}}{\binom{n}{2}}$. By choosing an appropriate number of independent runs, we can ensure that $S^*$ survives to the end in at least one run with high probability.
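The bound follows from a telescoping product. When $k$ super-nodes remain and the minimum cut weight is $c$, every super-node has weighted degree at least $c$ (otherwise that single node would give a smaller cut), so the total edge weight is at least $\frac{kc}{2}$ and a weight-proportional sample hits a cut edge with probability at most $\frac{2}{k}$:

$$
\Pr[S^* \text{ survives down to } t \text{ nodes}]
\ge \prod_{k=t+1}^{n}\left(1 - \frac{2}{k}\right)
= \prod_{k=t+1}^{n}\frac{k-2}{k}
= \frac{t(t-1)}{n(n-1)}
= \frac{\binom{t}{2}}{\binom{n}{2}}.
$$

With $t = 2$ this is $\frac{2}{n(n-1)}$, and repeating the whole run $\binom{n}{2}\ln n$ times drives the failure probability down to at most $\left(1 - \frac{2}{n(n-1)}\right)^{\binom{n}{2}\ln n} \le e^{-\ln n} = \frac{1}{n}$.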
Recursive random contraction algorithm
The efficiency of the random contraction algorithm is still unsatisfactory. The recursive random contraction algorithm (the Karger-Stein algorithm) is an optimization of the naive version: it organizes the contraction process recursively, improving both the running time and the quality of the returned cut. Its core idea is to shrink the graph in stages, recursing on smaller and smaller contracted graphs until the optimal minimum cut can be found in them directly.
Running process
The recursive algorithm runs like the naive one but is organized recursively. It rests on the observation that the first few contractions, performed while the graph is still large, are very unlikely to destroy the minimum cut; the risk concentrates in the late contractions, so we need not restart from the full graph for every trial.
- Initialization: given an undirected weighted graph $G = (V, E)$, where each edge has a weight.
- Recursive contraction:
    - If the number of nodes $n$ is at most a small threshold (say 6), compute the minimum cut by exhaustive search.
    - Otherwise, still contract, but not all the way down: run the contraction procedure twice independently, each time reducing the number of nodes to about $\frac{n}{\sqrt{2}}$. This yields two different contracted subgraphs; recursively solve both and take the smaller of the two answers.
- Recursion stopping condition:
    - When the number of nodes is at most the threshold, the global minimum cut of that subgraph is found directly by exhaustive search; otherwise contraction and recursion continue.
    - When the recursion finishes, the minimum cut found is returned (see the code sketch after this list).
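A compact Python sketch of the recursion, reusing the `contract_until` helper from the earlier sketch (the threshold 6, the target size $\lceil 1 + n/\sqrt{2} \rceil$, and the function name are illustrative choices):

```python
import math

def karger_stein(n, edges):
    """One run of the recursive random contraction scheme."""
    if n <= 6:
        # Base case: brute-force every 2-partition. Vertex 0 is fixed on
        # one side; excluding the all-ones mask keeps the other side
        # non-empty.
        best = math.inf
        for mask in range((1 << (n - 1)) - 1):
            side = [True] + [bool((mask >> i) & 1) for i in range(n - 1)]
            best = min(best,
                       sum(w for u, v, w in edges if side[u] != side[v]))
        return best
    t = math.ceil(1 + n / math.sqrt(2))    # shrink to about n / sqrt(2)
    # Two independent partial contractions, each solved recursively;
    # return the smaller of the two candidate cuts.
    return min(karger_stein(*contract_until(n, edges, t))
               for _ in range(2))
```

Stopping at about $\frac{n}{\sqrt{2}}$ nodes is deliberate: by the survival bound above, a fixed minimum cut survives one partial contraction with probability about $\binom{t}{2}/\binom{n}{2} \approx \frac{1}{2}$, which is exactly why two independent branches per level are enough.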
Correctness
As with the naive algorithm, a contraction is harmless as long as it avoids the edges of a minimum cut. Stopping each phase at about $\frac{n}{\sqrt{2}}$ nodes keeps the probability that a fixed minimum cut survives the phase near $\frac{1}{2}$, and running two independent phases compensates for that risk, so the final result is still the correct minimum cut with good probability. When the number of nodes drops to a small threshold (say 6), the minimum cut is computed exactly by exhaustive search.
The recursive random contraction algorithm can be understood as divide and conquer: the graph is repeatedly replaced by two independently contracted, smaller copies, each simplified by random contraction while the minimum cut is preserved with constant probability. Finally the algorithm compares the minimum cuts found in the different subproblems and returns the best one as the final solution.
Complexity
The running time is governed by the recursion depth and the work per level. Each recursive step reduces the graph from $n$ to about $\frac{n}{\sqrt{2}}$ nodes, so the recursion depth is $O(\log n)$. A single partial contraction costs $O(n^2)$: at most $O(m)$ per contraction over $O(n)$ contractions, with $m = O(n^2)$.
Therefore the overall time of one run of the recursive algorithm satisfies the recurrence

$$T(n) = 2\,T\!\left(\frac{n}{\sqrt{2}}\right) + O(n^2) = O(n^2 \log n).$$
Compared with the naive random contraction algorithm, the recursive version reduces the problem size gradually through recursion, avoiding repeated recomputation of the low-risk early contractions and thereby improving efficiency.
Like the naive algorithm, the recursive random contraction algorithm relies on randomness, and its result may differ between runs. A single run succeeds with probability $\Omega(1/\log n)$, so it is typically repeated $O(\log^2 n)$ times to find the global minimum cut with high probability.
Simplified reference code
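A minimal driver tying together the sketches above, assuming `contract_until`, `karger_cut`, and `karger_stein` from the earlier code blocks are in scope; the run counts follow the probability analysis in this article and are illustrative choices:

```python
import math

def global_min_cut(n, edges, recursive=True):
    """Repeat independent runs and keep the best cut found:
    O(log^2 n) runs of karger_stein, or O(n^2 log n) runs of karger_cut."""
    if recursive:
        runs = max(1, math.ceil(math.log(max(n, 2)) ** 2))
        return min(karger_stein(n, edges) for _ in range(runs))
    runs = math.ceil(n * n * math.log(max(n, 2)))
    return min(karger_cut(n, edges) for _ in range(runs))

# Two unit-weight 4-cycles joined by a single bridge: the min cut is 1.
edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1),
         (4, 5, 1), (5, 6, 1), (6, 7, 1), (7, 4, 1),
         (3, 4, 1)]
print(global_min_cut(8, edges))   # -> 1 with high probability
```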