Location>code7788 >text

Detailed analysis of balanced trees - red-black tree balancing correction Graphic explanation (with code) (10,000-word article)

Popularity:5 ℃/2024-08-27 00:00:05

catalogs
  • red and black trees
    • summarize
    • Nature/rules
      • Main rule.
      • Properties of derivation.
    • Basic implementation of a red-black tree
      • struct RBTreeNode
      • class RBTree
    • Red and black tree insertion
    • Red and Black Tree Insertion Amendment Preamble
      • When do I need to change colors: When do I need to change colors?
        • The Basis of Color Change.
      • Why do you need rotation and color change
        • Discoloration.
        • revolve
    • All situations requiring amendment
      • Recognize the simplest case first
        • 1. Uncle is a red node
          • Caution.
        • 2. No uncle node
        • 3. Uncle is a black node
      • Derivative structure underlying all structures (left as an example)
        • Simple case derivation
        • Derivation of secondary amendments
      • Summary:
    • Insertion of amended code implementation

red and black trees

summarize

A red-black tree, is a binary search tree, but with an additional storage bit at each node to indicate the color of the node, which can be either Red or Black. By restricting the way in which the individual nodes on any path from root to leaf are colored, the red-black tree ensures that no path will be two times longer than any other, and thus is close to balanced.

Nature/rules

Details.Mangrove_Baidu Encyclopedia ()

Main rules:

In order to facilitate the understanding and memorization, the red-black tree rules are broken down, the following three points are the main rules that I believe that the understanding of the red-black tree is the most important. The bolded words are the key words, remember these key words first, to learn the red-black tree will be much easier to understand.

  • The root node is black.
  • Any two neighboring nodesCannot be red at the same time. That is, the child of a red node is black . (No consecutive red nodes can occur)
  • between any node and its reachable leaf node,containsSame number of black nodes. (The same black node on each path)

Other rules.

Default rule: Every node is either red or black.

Addendum: Leaf nodes are black and do not store data, also known as NIL nodes.nil (computer language)_Baidu Encyclopedia ()

Transcription:By replacing all leaf nodes of a red-black tree with NIL nodes, it can be ensured that every node of a red-black tree has at least one child node, thus simplifying the implementation of the operation.The presence of NIL nodes helps to maintain the structure and properties of the red-black tree, in particular when dealing with the boundary case, and avoids special treatment of the leaf nodes by determining whether or not a node's child node is a NIL node.

Properties of derivation.

The derivation rules are the red-black tree rules plus some rules derived from binary tree properties, which are used later to prove and understand some operations of red-black trees.

  1. Derivation of Property 1.
  • One PathIn all possible cases, the number of longest path nodes is not more than twice the shortest path (consecutive red nodes can make the longest path more than twice the shortest path).

    The longest possible path from root to leaf is no more than twice as long as the shortest possible path.

  • The red node is black if it has two children. (There may be 0,1,2 black children.)

  • Modify the balance of black nodes in all branching paths that are not affected by changes in the color of ancestor nodes (fix commonly used)

  1. Derivation of Property 2.
  • Shortest path: all black nodes

  • Longest path: red and black (one black, one red, one black, one red, one red, one red)The last non-NIL node can be red)

  • The red-black tree with the red node removed is close to a full binary tree. (Removing the red nodes directly might not be a binary tree, it won't hold.)

  • When there is only one root node (black), the second node can only be red (satisfying the same number of black nodes), forcing the insertion of only red nodes.

  • The default color for new nodes is red, and the rules for red are less stringent than those for black.

  1. Derivation of Property 3.

Let there be N black nodes

  • The shortest path length is:$log_2(N) $

  • The length of the longest path is :$2log_2(N)$

  • The number of all nodes of a red-black tree in the[N,2N]Between. (N for all black, 2N for all red and black)

  • Performance, assuming there are 1 billion nodes, AVL tree can be searched up to 30 times, RB tree can be searched up to 60 times.

Note: the article only to understand the main function of the red-black tree (insertion correction) implementation of the main, did not realize the NIL node processing and other cases, is not a rigorous implementation of the red-black tree.

Basic implementation of a red-black tree

The basic functionality is almost identical to that of an AVL tree. The following is a brief description of the

struct RBTreeNode

template<class K, class V>
struct RBTreeNode {
    //trident chain (math.)
    RBTreeNode* _left;
    RBTreeNode* _right;
    RBTreeNode* _parent;

    std::pair<K,V> _kv;
    Color _col;
    
    RBTreeNode(const decltype(_kv)& kv)
    : _left(nullptr)
    ,_right(nullptr)
    ,_parent(nullptr)
    ,_kv(kv)
    ,_col(Color::RED) //Default is red,Because the rules are the loosest.
    {}
};

class RBTree

enum class Color { RED, BLACK };

template<class K,class V>
class RBTree {
public:
	 bool Insert(const std::pair<K,V>& kv);
private:
    using Node = RBTreeNode<K,V>;
    Node* _root;
};

Red and black tree insertion

The focus of the insertion is the same as for the AVL tree, in that the insertion is corrected.

    bool Insert(const std::pair<K,V>& kv) {
        if (_root == nullptr) {
            _root = new Node(kv);
            _root->_col = Color::BLACK;
            return true;
        }

        Node* cur = _root;
        Node* parent = _root;
        while (cur) {
            if ( > cur->_kv.first) {
                parent = cur;
                cur = cur -> _right;
            }
            else if ( < cur->_kv.first) {
                parent = cur;
                cur = cur -> _left;
            }
            else {
                //identical
                return false;
            }
        } //whilecomparison process [end]

        //I didn't find it.,additional
        cur = new Node(kv); //curaddress change,can only be usedkvcarry out a comparison(Always usekvThat's it.)
        //Maintenance of trident chains
        if (cur->_kv->first > parent->_kv.first) {
            parent->_right = cur;
        }
        else {
            parent->_left = cur;
        }
    
        //Checking and adjusting the red-black tree
        FixInsert();

    }

Red and Black Tree Insertion Amendment Preamble

In the realization of the red-black tree insertion before, we know that the red-black tree inserted into the new node, there must be a situation that does not meet the rules of the red-black tree, so we will need to first, the red-black tree correction operation is mainly through the rotation and color change to achieve.

When do I need to change colors: When do I need to change colors?

Only if the father is red, it needs to change color, and it must change color. (Red nodes are not neighboring rule)

The Basis of Color Change.

The insertion of red nodes has the loosest rules and does not require adjustment of other paths, so the insertion nodes are not color changeable, thusI can only turn my father black.The next step is to deal with how the grandfather and uncle nodes change color when the father turns black.

Bottom line: When the father is red, the color must be changed, and it is the father who changes color. (Expansion: Both the father and the father are exchanging colors. It will be explained in detail later)

Why do you need rotation and color change

Discoloration.

According to the law of binary trees: a full binary tree always has two children at a node, and always has a 1:2 relationship.
Through this relationship, the quantity matching operation can be realized, i.e.1 black father:2 red children can be converted to1 red father:2 black children

image-20240826172745480

This color change has less effect on the color of all branching paths of the parent node, and does not easily break the rule of equal number of black nodes in red-black trees (for short).

revolve

First, one black node, two red nodes. Observe that this can be balanced by a certain rotation operation.

image-20240826173044634

Red-black tree in some cases, direct color change operation is difficult to meet the rules, or more complex; and through a certain rotation operation, it will be more simple, so the red-black tree needs to cycle. See below for details.

In addition, the specific rotation operation I described in detail in another blog, this post will not describe too much.AVL tree

All situations requiring amendment

All the cases that the red-black tree needs to fix are derived from the following abstract tree.

image-20240826182022937

A red-black tree can only be realized if you understand the basic modifications of a red-black tree. The following is a step-by-step explanation.

Recognize the simplest case first

The simplest case can be regarded as a correction between the farthest nodes, but also after the insertion of the first correction, the simplest case is also the easiest to understand. Red-black tree correction and AVL tree, the same, a lot of complex situations, handle the simple and then go to see the abstract situation is much better to understand.

1. Uncle is a red node

Description: Uncle and father are both red andBoth uncle and father are farthest nodes of non-NIL nodes.

Reversal: If the uncle still has children, it must be black, that is, there is one more black node; in order to satisfy the rule that the number of black nodes in each path is the same, the parent node must also have a black child node, and there must be two of them, and then it will not be possible to insert a new node, so this situation is impossible.

  • ZuoZuo (LL type)

Insert to the left of the parent node

image-20240826181253215

Operation: Father turns black, Master turns red (Father and Master exchange colors), Uncle turns black.

image-20240826181434994

Sentimental interpretation.

The father and son are both red, and only the father can and must become black. Since there is one more black node, there must be one less black node on the path.

Which one is missing? You can't go down, because the bottom has already been processed (the correction operation is iterated upwards), so you have to look up.

Because all the nodes on the path before insertion are to meet the rules of the red-black tree, so the grandfather node must be black; and because the grandfather from the father is the closest node, the smallest impact on the other nodes, and therefore chose to set the grandfather node to red, that is, the father and the master node color exchange.

When grandpa's color is a red node, uncle's path has one less black node, so uncle's node must be black.

  • Around (LR type)

Operation: Father turns black, Master turns red (Father and Master exchange colors), Uncle turns black.

image-20240826181722794

Because the parent and uncle are the farthest nodes, the adjustment process has no effect on the child nodes, and there is no need to rotate or other additional operations, so it is exactly the same as the LL type.

Caution.

Grandfather is not a root, if there is an uncle and he is red, after adding a new node and a round of correction, grandfather will become red. At this point, if the grandparent is also red, you need to continue to correct.

It is only in this case that further corrections may be necessary.

2. No uncle node

Operationally, the case of no uncle is the same as the case of a black uncle. However, the no-uncle case is easier to understand because it is the first round of corrections.

A mismatch in the number of black nodes in the case where the uncle is black, indicating that the situation is the result of the last adjustment (intermediate/unbalanced state).

Without uncle, according to the length rule, this is already the longest path (grandfather is the last non-NIL black node), so the new node must be the farthest node, i.e., the first correction after insertion

  • Left-Left (LL type): Grandpa Right Single Spin (Lower Height), exchanging father's and grandfather's colors

image-20240826184249784

  1. Why the rotation?

In case the uncle is empty, inserting a red node may violate the length rule (the longest path in a path is not more than twice the shortest path).

image-20240825171014847

(Example above)

In the previous AVL tree we know that the height can be reduced by rotating the subtree (the process of rotation is described in detail in the AVL tree article, and will not be described specifically in this article, . Similarly, a red-black tree can be rotated to reduce its height after violating the length rule. After verification, the rotation process can effectively solve the problem of violating the length rule.

  1. How do you rotate it?

    In LL type, the grandfather node is left-rotated, after which the grandfather node becomes the left child of the parent node, height-1; then the colors of the parent nodes are exchanged, and the red-black tree is balanced.

  2. Why does it change color after spinning down? How does it change color?

    Rotate down, the father node is the ancestor, is red; but the grandfather is black, that is, the father node as the root of the two paths black node number of imbalance, one more than the other less than one, in this case, the exchange of the father and the master node color can be balanced.

  • Left/Right (LR type): Father single spin left (to LL type), Grandfather single spin right (to lower height), exchange father and grandfather colors.

image-20240826184510086

3. Uncle is a black node

(The mismatch in the number of black nodes itself suggests that this situation must have been caused by the last adjustment)

Straight to the picture.

image-20240826211618991

Derivative structure underlying all structures (left as an example)

After recognizing the simple structures, we have a general understanding of the basics of red-black tree modification. Now let's analyze how these structures came to be. The following diagram is an abstract structure from which all the cases that need to be corrected are derived.

image-20240826214715790

Rectangle▯ is a subtree of arbitrary height, where x is the possible insertion position,y is determined by x (according to the red-black tree rule).

Simple case derivation

When x is the inserted node

image-20240826211940998

(∆ denotes a node, where a,b are the possible insertion locations of the new node.c can only be red or no nodes。)

  • When c is a red node, the following is derived

image-20240826212610923

Apparently it's in the simple structure of the above, Uncle Red's case.

  • When c is a black node

image-20240826213106299

As above, this case is an intermediate state of correction, caused by the previous round of corrections.

  • When c is empty

image-20240826213428777

Derivation of secondary amendments

When the rectangle x is not an insertion node and according to the

  1. The Red-Black Tree Rule
  2. and analyzed in the simple case, only if the uncle is a red node, it makes the grandfather node red, which in turn may affect the ancestor node

Perform the downward derivation once to get this figure.

image-20240826220932973

where c is any one of the cases i/j/k/l in the following figure, d analyzed specifically below

image-20240826212112629

Insertion at all four positions of the farthest node will cause the grandfather node to turn red. Take the leftmost insertion as an example

image-20240826221608433

  • When d is red

image-20240826223043192

  • There are two basic cases when d is black

image-20240826223510388

where a may be four of i/j/k/l and b and c can only be empty or a red node

  • When d is null, we learned in the simple case analysis that the adjustment method is the same as if d were black and can be reused, so we will not analyze it again.

Summary:

And then there's level 3, level 4... etc. It's enough that we know how level 2 came to be, and applying the methodology of level 1, with iterative corrections, will accomplish the final equilibrium.

Insertion of amended code implementation