Introduction
There are really only two kinds of data structures: arrays and linked lists. Everything else is a derivative or combination of the two.
The essence of an algorithm is exhaustive enumeration.
Array
Arrays fall into two categories: static arrays and dynamic arrays.
A static array is, in essence, a contiguous block of memory. Because the memory is contiguous, an element's address can be computed from its index as an offset, which makes element access fast.
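The offset computation can be made concrete. If the first element lives at address base and each element occupies elemSize bytes, then element i lives at base + i * elemSize: one multiply and one add, no matter how large i is. A small sketch (Java, purely illustrative; addressOf and the addresses are made up for the demo):

```java
// Illustrative only: how index -> address works for a contiguous array.
// base is the (hypothetical) address of element 0; every element occupies elemSize bytes.
public class OffsetDemo {
    static long addressOf(long base, long elemSize, long index) {
        return base + index * elemSize;  // one multiply + one add: O(1)
    }

    public static void main(String[] args) {
        long base = 0x1000;   // hypothetical start address
        long elemSize = 4;    // e.g. a 32-bit int
        System.out.println(Long.toHexString(addressOf(base, elemSize, 3))); // 100c
    }
}
```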
Dynamic arrays are a wrapper around static arrays that makes element operations more convenient. With dynamic arrays in hand, stacks, hash tables, and queues can later be implemented more elegantly.
Static array
- Superpower of arrays: random access. Given any index, the element's memory address can be computed directly, and the computer's memory addressing is O(1), so random access on an array is also O(1).
- Limitation of arrays: the size is fixed. When the array is full, or an element must be inserted or deleted in the middle, all the elements after it have to be moved, so the time complexity rises to O(N).
Dynamic array
Dynamic arrays cannot solve the static array's O(N) problem. They only hide the resizing and element-shifting from you and offer an easier-to-use API.
The superpower of random access comes from the array's contiguous memory, and contiguous memory inevitably brings the problems of element shifting and resizing.
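Doubling the capacity on each resize is what keeps appends cheap on average: appending n elements triggers copies totalling 1 + 2 + 4 + ... < 2n, so each append is amortized O(1). A quick way to check this (Java sketch; copiesForAppends is a made-up helper that just counts copies):

```java
// Count the element copies performed while appending n items to a doubling array.
// The total stays below 2n, so each append is amortized O(1).
public class AmortizedDemo {
    static long copiesForAppends(int n) {
        int capacity = 1;
        int size = 0;
        long copies = 0;
        for (int i = 0; i < n; i++) {
            if (size == capacity) {
                copies += size;   // moving existing elements into the bigger array
                capacity *= 2;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        System.out.println(copiesForAppends(1000)); // 1023, well under 2 * 1000
    }
}
```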
A simple dynamic array
using System.Collections.Generic; // for EqualityComparer<T>

public class MyList<T>
{
    // The underlying array that actually stores the data
    private T[] arr = new T[5];

    // Number of elements currently stored
    public int Count { get; private set; }

    /// <summary>
    /// Add an element at the end
    /// </summary>
    public void Add(T item)
    {
        if (Count == arr.Length)
        {
            // Expand capacity (double it)
            Resize(arr.Length * 2);
        }
        arr[Count] = item;
        Count++;
    }

    /// <summary>
    /// Delete the element at the given index
    /// </summary>
    public void RemoveAt(int idx)
    {
        if (arr.Length > 5 && Count == arr.Length / 4)
        {
            // Shrink capacity (halve it)
            Resize(arr.Length / 2);
        }
        Count--;
        for (int i = idx; i < Count; i++)
        {
            arr[i] = arr[i + 1];
        }
        arr[Count] = default(T);
    }

    public void Remove(T item)
    {
        var idx = FindIndex(item);
        if (idx >= 0)
        {
            RemoveAt(idx);
        }
    }

    /// <summary>
    /// Update the element at the given index
    /// </summary>
    public void Put(int idx, T newItem)
    {
        arr[idx] = newItem;
    }

    /// <summary>
    /// Find the index of an element, or -1 if it is absent
    /// </summary>
    public int FindIndex(T item)
    {
        for (int i = 0; i < Count; i++)
        {
            if (EqualityComparer<T>.Default.Equals(arr[i], item))
                return i;
        }
        return -1;
    }

    /// <summary>
    /// Expand/shrink: copy the elements into an array of the new capacity
    /// </summary>
    private void Resize(int newCapacity)
    {
        var newArray = new T[newCapacity];
        for (var i = 0; i < Count; i++)
        {
            newArray[i] = arr[i];
        }
        arr = newArray;
    }
}
Array variant: the ring array
One might ask: isn't an array a contiguous piece of memory? How can it be a ring?
Physically it can't be, but logically it can.
The core trick is the modulo operation.
public static void Run()
{
    // requires: using System; using System.Threading;
    var arr = new int[] { 1, 2, 3, 4, 5, 6 };
    var i = 0;
    while (true)
    {
        Console.WriteLine(arr[i]);
        // The key code is here: when i reaches the last index,
        // (i + 1) % arr.Length wraps back to 0, closing the loop logically
        i = (i + 1) % arr.Length;
        if (i == 0)
        {
            Console.WriteLine("A full loop is completed, i returns to zero");
            Thread.Sleep(1000);
        }
    }
}
The key to a ring array is that it maintains two pointers, start and end: start points to the index of the first valid element, and end points to the position just past the last valid element.
What problem does the ring array solve? It optimizes insertion and deletion at the head of the array from O(N) down to O(1).
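The start/end scheme can be sketched as a fixed-capacity ring queue. The following is a minimal, illustrative Java sketch (the names RingQueue/enqueue/dequeue are made up for this example, not from the original text); both pointers advance with the same modulo trick, so head operations never shift elements:

```java
// Minimal fixed-capacity ring queue sketch.
// start = index of the first valid element; end = one past the last valid element.
class RingQueue {
    private final int[] arr;
    private int start, end;
    private int count;

    RingQueue(int capacity) { arr = new int[capacity]; }

    // O(1): write at end, then advance it with wrap-around
    void enqueue(int item) {
        if (count == arr.length) throw new IllegalStateException("full");
        arr[end] = item;
        end = (end + 1) % arr.length;
        count++;
    }

    // O(1): read at start, then advance it with wrap-around — no element shifting
    int dequeue() {
        if (count == 0) throw new IllegalStateException("empty");
        int item = arr[start];
        start = (start + 1) % arr.length;
        count--;
        return item;
    }

    int size() { return count; }
}

public class RingDemo {
    public static void main(String[] args) {
        RingQueue q = new RingQueue(3);
        q.enqueue(1); q.enqueue(2); q.enqueue(3);
        System.out.println(q.dequeue()); // 1
        q.enqueue(4);                    // end wraps back to slot 0
        System.out.println(q.dequeue()); // 2
        System.out.println(q.dequeue()); // 3
        System.out.println(q.dequeue()); // 4
    }
}
```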
Linked List
Linked lists come in two forms: the singly linked list and the doubly linked list. A singly linked node holds one pointer, to the next element; a doubly linked node holds two, to the previous and the next element.
There is no other difference; functionally, it comes down to whether the list can also be traversed backwards.
Why do we need linked lists
As mentioned earlier, an array is essentially a contiguous block of memory; when elements are shifted or the capacity grows, they must be moved one by one, which is expensive.
Is there a data structure that breaks through this memory constraint? This is where linked lists come in. A linked list requires no contiguous memory: its nodes can be allocated anywhere, connected to each other through next/prev links that string the scattered elements into a chain.
Doing this brings two benefits:
- Better memory utilization: nodes can be allocated anywhere, which reduces the impact of memory fragmentation.
- Cheap growth and movement: adding or removing a node only requires re-pointing next/prev, with no element shifting and no resizing.
But everything comes at a cost. Because a linked list is not contiguous, it cannot use fast random access to locate an element; it can only walk the nodes one by one. The query complexity of a linked list is therefore O(N).
A simple linked list
using System;
using System.Text;

public class MyLinkedList<T>
{
    public static void Run()
    {
        var linked = new MyLinkedList<string>();
        linked.AddLast("a");
        linked.AddLast("b");
        linked.AddLast("c");
        linked.AddLast("d");
        linked.Add(1, "bc");
        linked.Put(1, "aaaa");
        Console.WriteLine(linked.ToString());
    }

    /// <summary>
    /// Virtual head and tail nodes bring two benefits:
    /// 1. They always exist, whether the list is empty or not, which avoids a lot of boundary-case handling.
    /// 2. Inserting at the tail stays O(1); without a known tail node it would degrade to O(N),
    ///    because you would have to traverse from the head all the way to the tail.
    /// </summary>
    private Node _head, _tail;

    public int Count { get; private set; }

    public MyLinkedList()
    {
        _tail = new Node();
        _head = new Node();
        _head.Next = _tail;
        _tail.Prev = _head;
    }

    public void AddLast(T item)
    {
        var prev = _tail.Prev;
        var next = _tail;
        var node = new Node(item);
        node.Next = next;
        node.Prev = prev;
        prev.Next = node;
        next.Prev = node;
        Count++;
    }

    public void AddFirst(T item)
    {
        var prev = _head;
        var next = _head.Next;
        var node = new Node(item);
        node.Prev = prev;
        node.Next = next;
        prev.Next = node;
        next.Prev = node;
        Count++;
    }

    public void Add(int idx, T item)
    {
        var t = Get(idx);
        var next = t.Next;
        var prev = t;
        var node = new Node(item);
        node.Next = next;
        node.Prev = prev;
        prev.Next = node;
        next.Prev = node;
        Count++;
    }

    public void Remove(int idx)
    {
        var t = Get(idx);
        var prev = t.Prev;
        var next = t.Next;
        prev.Next = next;
        next.Prev = prev;
        t.Next = t.Prev = null;
        Count--;
    }

    public void Put(int idx, T item)
    {
        var t = Get(idx);
        t.Value = item;
    }

    private Node Get(int idx)
    {
        var node = _head.Next;
        // Optimization opportunity: compare idx against Count to decide whether to
        // traverse from the head or from the tail.
        for (int i = 0; i < idx; i++)
        {
            node = node.Next;
        }
        return node;
    }

    public override string ToString()
    {
        var sb = new StringBuilder();
        var node = _head.Next;
        while (node != null && node.Next != null)
        {
            sb.Append($"{node.Value}<->");
            node = node.Next;
        }
        sb.Append("null");
        return sb.ToString();
    }

    private class Node
    {
        public T? Value { get; set; }
        public Node Next { get; set; }
        public Node Prev { get; set; }

        public Node()
        {
            Value = default(T);
        }

        public Node(T value)
        {
            Value = value;
        }
    }
}
Linked list variant: the skip list
In the simple example above, a query costs O(N) while an insert costs O(1).
The cost is dominated by queries, which can only start at the head and walk node by node to the target.
So queries are what we focus on optimizing.
In the example above we used virtual head and tail nodes, trading space for time to improve insertion efficiency. The same idea can be applied to improve query efficiency.
Core principle of the skip list
index 0  1  2  3  4  5  6  7  8  9
node  a->b->c->d->e->f->g->h->i->j
At this point, to reach node h you have to start at index 0 and traverse all the way to index 7.
Now imagine you could learn the location of index 6 in advance: a single Next step would then take you straight to h.
index level 3  0-----------------------8-----10
index level 2  0-----------4-----------8-----10
index level 1  0-----2-----4-----6-----8-----10
index level 0  0--1--2--3--4--5--6--7--8--9--10
node level     a->b->c->d->e->f->g->h->i->j->k
On top of the original linked list, the skip list adds multiple layers of index. Each layer up, the number of index nodes is halved, so the height of the index structure is O(log N).
- Start from the highest index level: target index 7 falls in the interval [0, 8]
- Search from node 0 one level down: 7 falls in [4, 8], and we obtain the address of node 4
- Search from node 4 one level down: 7 falls in [6, 8], and we obtain the address of node 6
- Search from node 6 at the bottom level: 7 falls in [6, 7], and node 7 is finally found
The search passes through O(log N) index levels, so the time complexity is O(log N).
A real skip list is relatively complex to implement: insertions and deletions must dynamically adjust the indexes, keeping each level roughly half the size of the one below (in practice this is done by assigning each new node a random level), otherwise the time complexity degenerates back to O(N).
It is a bit like a self-balancing binary search tree, but simpler.
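The layered search described above can be sketched in code. The following is a minimal, illustrative Java sketch of just the search path, over a deterministically built index matching the diagram (a real skip list assigns levels randomly; SkipNode, build, and contains are names made up for this example):

```java
import java.util.*;

// Minimal skip-list *search* sketch over a prebuilt, deterministic index.
class SkipNode {
    int key;
    SkipNode[] next;   // next[l] = successor at index level l
    SkipNode(int key, int levels) { this.key = key; next = new SkipNode[levels]; }
}

public class SkipDemo {
    static final int LEVELS = 4;

    // Build keys 0..n where key i appears on level l iff i % (1 << l) == 0,
    // so each level holds half the nodes of the level below it.
    static SkipNode build(int n) {
        SkipNode head = new SkipNode(Integer.MIN_VALUE, LEVELS);
        SkipNode[] tails = new SkipNode[LEVELS];
        Arrays.fill(tails, head);
        for (int i = 0; i <= n; i++) {
            int levels = 1;
            while (levels < LEVELS && i % (1 << levels) == 0) levels++;
            SkipNode node = new SkipNode(i, levels);
            for (int l = 0; l < levels; l++) { tails[l].next[l] = node; tails[l] = node; }
        }
        return head;
    }

    // Start at the top level, move right while the next key <= target,
    // drop one level when the next key would overshoot: O(log N) levels visited.
    static boolean contains(SkipNode head, int target) {
        SkipNode cur = head;
        for (int l = LEVELS - 1; l >= 0; l--) {
            while (cur.next[l] != null && cur.next[l].key <= target) cur = cur.next[l];
        }
        return cur.key == target;
    }

    public static void main(String[] args) {
        SkipNode head = build(10);
        // Searching for 7 narrows [0,8] -> [4,8] -> [6,8] -> [6,7], as in the text
        System.out.println(contains(head, 7));   // true
        System.out.println(contains(head, 11));  // false
    }
}
```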