Arrays
1) Overview
Definition
In computer science, an array is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key.
Because the elements of an array are stored contiguously, the address of any element can be calculated from its index. For example:
int[] array = {1, 2, 3, 4, 5};
Given the starting address of the array's data, \(BaseAddress\), the address of the element at index \(i\) can be calculated as \(BaseAddress + i * size\)
- \(i\) is the index, which starts at 0 in Java, C, and similar languages.
- \(size\) is the number of bytes each element occupies, e.g. an \(int\) occupies \(4\) bytes and a \(double\) occupies \(8\).
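The address formula can be sketched in Java as follows. This is only an illustration: Java does not expose raw addresses, so the helper `elementAddress` and the base address `0x1000` are hypothetical names and values chosen for the example.

```java
public class ElementAddress {
    // Hypothetical helper: address of the element at `index`, given the
    // address of the first element and the size of one element in bytes.
    static long elementAddress(long baseAddress, int index, int elementSize) {
        return baseAddress + (long) index * elementSize;
    }

    public static void main(String[] args) {
        long base = 0x1000L; // assumed starting address of the array's data
        // int elements occupy 4 bytes each: 0x1000 + 3 * 4
        System.out.println(Long.toHexString(elementAddress(base, 3, Integer.BYTES))); // 100c
        // double elements occupy 8 bytes each: 0x1000 + 3 * 8
        System.out.println(Long.toHexString(elementAddress(base, 3, Double.BYTES)));  // 1018
    }
}
```

Because the computation is a single multiplication and addition, its cost does not depend on the length of the array.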
Quiz
byte[] array = {1, 2, 3, 4, 5};
Given that the array's data starts at address 0x7138f94c8, what is the address of the element 3?
A: The element 3 is at index 2 and a byte occupies 1 byte, so its address is 0x7138f94c8 + 2 * 1 = 0x7138f94ca
Space usage
An array object in Java is laid out as follows
- 8-byte mark word
- 4-byte class pointer (when compressed class pointers are in use)
- 4-byte array length (stored as a signed int, so an array can hold at most \(2^{31}-1\) elements)
- Array elements + alignment padding (every object size in Java is an integer multiple of 8 bytes [^12]; any shortfall is made up with padding bytes)
For example,
int[] array = {1, 2, 3, 4, 5};
is 40 bytes in size, made up as follows
8 (mark word) + 4 (class pointer) + 4 (length) + 5*4 (elements) + 4 (alignment) = 40
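The size calculation can be written down as a small estimate function. This is a sketch under the layout described above (16-byte header with compressed class pointers, 8-byte alignment); `arraySizeBytes` is a hypothetical helper, and the real size depends on the JVM and its settings.

```java
public class ArraySize {
    // Assumed layout: 8-byte mark word + 4-byte class pointer + 4-byte length.
    static final int HEADER_BYTES = 8 + 4 + 4;

    // Estimated size in bytes of an array object, rounded up to a multiple of 8.
    static long arraySizeBytes(int length, int elementSize) {
        long raw = HEADER_BYTES + (long) length * elementSize;
        return (raw + 7) / 8 * 8; // add alignment padding
    }

    public static void main(String[] args) {
        System.out.println(arraySizeBytes(5, Integer.BYTES)); // int[5]:  16 + 20 = 36, padded to 40
        System.out.println(arraySizeBytes(5, Byte.BYTES));    // byte[5]: 16 + 5 = 21, padded to 24
    }
}
```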
Random access performance
Looking up an element by its index takes \(O(1)\) time.
2) Dynamic Arrays
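Java version
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.stream.IntStream;

public class DynamicArray implements Iterable<Integer> {
    private int size = 0;     // logical size: number of elements in use
    private int capacity = 8; // capacity of the backing array
    private int[] array = {}; // allocated lazily on the first add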
    /**
     * Appends an element at the end, i.e. at position [size]
     *
     * @param element the element to add
     */
    public void addLast(int element) {
        add(size, element);
    }
    /**
     * Inserts an element at a position in [0 .. size]
     *
     * @param index   index position
     * @param element the element to add
     */
    public void add(int index, int element) {
        checkAndGrow();
        // insertion logic
        if (index >= 0 && index < size) {
            // shift elements one slot to the right to free the insertion position
            System.arraycopy(array, index,
                    array, index + 1, size - index);
        }
        array[index] = element;
        size++;
    }
    private void checkAndGrow() {
        // capacity check
        if (size == 0) {
            array = new int[capacity];
        } else if (size == capacity) {
            // grow the array; common growth factors are 1.5, 1.618 and 2
            capacity += capacity >> 1;
            int[] newArray = new int[capacity];
            System.arraycopy(array, 0,
                    newArray, 0, size);
            array = newArray;
        }
    }
    /**
     * Removes the element at a position in [0 .. size)
     *
     * @param index index position
     * @return the removed element
     */
    public int remove(int index) { // [0..size)
        int removed = array[index];
        if (index < size - 1) {
            // shift the following elements one slot to the left
            System.arraycopy(array, index + 1,
                    array, index, size - index - 1);
        }
        size--;
        return removed;
    }
    /**
     * Looks up an element
     *
     * @param index index position, within the interval [0..size)
     * @return the element at that index position
     */
    public int get(int index) {
        return array[index];
    }
    /**
     * Traversal method 1
     *
     * @param consumer the operation to perform during traversal; its input is each element
     */
    public void foreach(Consumer<Integer> consumer) {
        for (int i = 0; i < size; i++) {
            // supplies array[i] as the argument
            // returns void
            consumer.accept(array[i]);
        }
    }
    /**
     * Traversal method 2 - iterator traversal
     */
    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            int i = 0;
            @Override
            public boolean hasNext() { // is there a next element?
                return i < size;
            }
            @Override
            public Integer next() { // return the current element and move to the next one
                return array[i++];
            }
        };
    }
    /**
     * Traversal method 3 - stream traversal
     *
     * @return a stream of the elements
     */
    public IntStream stream() {
        return IntStream.of(Arrays.copyOfRange(array, 0, size));
    }
}
- For simplicity, these implementations skip index validation and assume that every index passed in is legal.
Insertion and deletion performance
At the head, the time complexity is \(O(n)\)
In the middle, the time complexity is \(O(n)\)
At the tail, the time complexity is \(O(1)\) (amortized)
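Why appending at the tail is amortized \(O(1)\) can be checked with a small simulation. The sketch below does not use the class above; it only replays the same growth rule (initial capacity 8, growth by `capacity >> 1`) and counts how many element copies the expansions cause. `copiesForAppends` is a hypothetical helper name.

```java
public class GrowthCost {
    // Counts the element copies caused by expansions while appending n elements
    // to an array that starts at capacity 8 and grows by half when full.
    static long copiesForAppends(int n) {
        int capacity = 8;
        int size = 0;
        long copies = 0;
        for (int k = 0; k < n; k++) {
            if (size == capacity) {
                copies += size;            // every existing element is copied once
                capacity += capacity >> 1; // same growth rule as checkAndGrow()
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1_000, 1_000_000}) {
            long copies = copiesForAppends(n);
            System.out.printf("n=%d copies=%d copies/n=%.2f%n", n, copies, (double) copies / n);
        }
    }
}
```

The total number of copies grows in proportion to n, so the average cost per append stays bounded by a constant even though an individual append that triggers an expansion costs \(O(n)\).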
3) Two-dimensional arrays
int[][] array = {
    {11, 12, 13, 14, 15},
    {21, 22, 23, 24, 25},
    {31, 32, 33, 34, 35},
};
The memory layout is as follows
- The two-dimensional array takes up 32 bytes, where array[0], array[1], and array[2] hold references to the three one-dimensional arrays.
- Each of the three one-dimensional arrays takes up 40 bytes.
- They are laid out contiguously in memory.
More generally, for a two-dimensional array \(Array[m][n]\)
- \(m\) is the length of the outer array, which can be thought of as the number of rows
- \(n\) is the length of each inner array, which can be thought of as the number of columns
- Accessing \(Array[i][j]\), where \(0\leq i \lt m, 0\leq j \lt n\), is equivalent to
  - first finding the \(i\)-th inner array (the row)
  - then finding the \(j\)-th element within that inner array (the column)
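The two lookup steps can be made explicit in Java. This is a minimal sketch; `get` is a hypothetical helper that performs the same access as `array[i][j]`, split into its two parts.

```java
public class RowThenColumn {
    // Reads array[i][j] in two explicit steps.
    static int get(int[][] array, int i, int j) {
        int[] row = array[i]; // step 1: follow the reference to the i-th inner array
        return row[j];        // step 2: index the j-th element of that inner array
    }

    public static void main(String[] args) {
        int[][] array = {
                {11, 12, 13, 14, 15},
                {21, 22, 23, 24, 25},
                {31, 32, 33, 34, 35},
        };
        System.out.println(get(array, 1, 2));                // 23
        System.out.println(get(array, 1, 2) == array[1][2]); // true
    }
}
```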
Quiz
In a Java environment with default settings (class pointer and reference compression are left at their defaults, i.e. enabled), consider the following two-dimensional array
byte[][] array = {
    {11, 12, 13, 14, 15},
    {21, 22, 23, 24, 25},
    {31, 32, 33, 34, 35},
};
Given that the array object starts at address 0x1000, what is the address of the element 23?
Answer:
- Starting address: 0x1000
- Outer array size: 16-byte object header + 3 elements * 4 bytes per reference + 4 bytes of alignment padding = 32 = 0x20
- First inner array size: 16-byte object header + 5 elements * 1 byte per element + 3 bytes of alignment padding = 24 = 0x18
- Second inner array: 16-byte object header = 0x10, and the element 23 is at index 2
- Final result = 0x1000 + 0x20 + 0x18 + 0x10 + 2*1 = 0x104a
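The same arithmetic can be reproduced in code. The sketch assumes, as the quiz does, a 16-byte array header, 4-byte references, 8-byte alignment, and that the outer array and the inner arrays sit back to back in memory; a real JVM does not guarantee that placement. `addressOf` and `alignTo8` are hypothetical helpers.

```java
public class QuizAddress {
    static final int HEADER_BYTES = 16;   // assumed: mark word + compressed class pointer + length
    static final int REFERENCE_BYTES = 4; // assumed: compressed references

    static long alignTo8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // Address of element [i][j] of a byte[rows][columns] array, assuming the outer
    // array and the inner arrays are laid out back to back starting at `start`.
    static long addressOf(long start, int rows, int columns, int i, int j) {
        long outerSize = alignTo8(HEADER_BYTES + (long) rows * REFERENCE_BYTES);
        long innerSize = alignTo8(HEADER_BYTES + (long) columns * Byte.BYTES);
        return start + outerSize + i * innerSize + HEADER_BYTES + j;
    }

    public static void main(String[] args) {
        // the element 23 is at [1][2]
        System.out.println("0x" + Long.toHexString(addressOf(0x1000L, 3, 5, 1, 2))); // 0x104a
    }
}
```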
4) The principle of locality
Only spatial locality is discussed here
- After the CPU reads data from memory (slow), it places the data in the cache (fast). If later computations use that data and it can be found in the cache, there is no need to read it from memory again.
- The smallest storage unit of the cache is the cache line, usually 64 bytes. Reading only a small amount of data at a time is not cost-effective, so at least 64 bytes are read to fill a whole cache line. Reading one piece of data therefore also brings in its neighboring data. This is what is called spatial locality.
Impact on efficiency
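Compare the execution efficiency of the following two methods, ij and ji
int rows = 1000000;
int columns = 14;
int[][] a = new int[rows][columns];
// StopWatch is assumed here to be Spring's org.springframework.util.StopWatch
StopWatch sw = new StopWatch();
sw.start("ij");
ij(a, rows, columns);
sw.stop();
sw.start("ji");
ji(a, rows, columns);
sw.stop();
System.out.println(sw.prettyPrint());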
The ij method
public static void ij(int[][] a, int rows, int columns) {
    long sum = 0L;
    for (int i = 0; i < rows; i++) {        // outer loop over rows
        for (int j = 0; j < columns; j++) { // inner loop over columns
            sum += a[i][j];
        }
    }
    System.out.println(sum);
}
The ji method
public static void ji(int[][] a, int rows, int columns) {
    long sum = 0L;
    for (int j = 0; j < columns; j++) {  // outer loop over columns
        for (int i = 0; i < rows; i++) { // inner loop over rows
            sum += a[i][j];
        }
    }
    System.out.println(sum);
}
Results
0
0
StopWatch '': running time = 96283300 ns
---------------------------------------------
ns % Task name
---------------------------------------------
016196200 017% ij
080087100 083% ji
You can see that ij is much more efficient than ji. Why?
- The cache is finite, so when new data comes in, some of the older cache lines are overwritten
- If cached data is not fully used before it is overwritten, efficiency suffers
Take the execution of ji as an example. The first iteration of the inner loop reads \([0,0]\). Because of spatial locality, reading \([0,0]\) also brings \([0,1] ... [0,13]\) into the cache, as shown in the figure
Unfortunately, the second iteration of the inner loop needs \([1,0]\), which is not in the cache, so the following data is read in as well
This is clearly wasteful: \([0,1] ... [0,13]\) and \([1,1] ... [1,13]\) have been read into the cache but are not used in time. Because the cache is finite, by the time the ninth iteration of the inner loop executes, the first row of cached data has been overwritten by the new data \([8,0] ... [8,13]\). If it is needed again later, for example \([0,1]\), it has to be read from memory once more.
The ij method can be analyzed in the same way: it makes full use of the data that locality has already brought into the cache.
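For readers who want to try the comparison without any extra library, here is a self-contained sketch that times the two traversal orders with `System.nanoTime()`. The class name `LocalityDemo` is hypothetical, the methods return the sum instead of printing it, and a single un-warmed run like this is only a rough indication, not a rigorous benchmark; the actual numbers depend on the machine and the JVM.

```java
public class LocalityDemo {
    // Row-major traversal: the inner loop walks along one row.
    static long ij(int[][] a, int rows, int columns) {
        long sum = 0L;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < columns; j++) {
                sum += a[i][j];
            }
        }
        return sum;
    }

    // Column-major traversal: the inner loop jumps from row to row.
    static long ji(int[][] a, int rows, int columns) {
        long sum = 0L;
        for (int j = 0; j < columns; j++) {
            for (int i = 0; i < rows; i++) {
                sum += a[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int rows = 1_000_000;
        int columns = 14;
        int[][] a = new int[rows][columns];

        long t0 = System.nanoTime();
        long s1 = ij(a, rows, columns);
        long t1 = System.nanoTime();
        long s2 = ji(a, rows, columns);
        long t2 = System.nanoTime();

        System.out.println("ij: " + (t1 - t0) + " ns, sum = " + s1);
        System.out.println("ji: " + (t2 - t1) + " ns, sum = " + s2);
    }
}
```

Both methods visit exactly the same elements and compute the same sum; only the order of the visits differs.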
Generalizing
- The same principle of locality also applies to I/O reads and writes.
- Arrays can take full advantage of the principle of locality, but what about linked lists?
  A: Linked lists cannot, because the elements of a linked list are not stored next to each other.
5) Bounds checking
In Java, reads and writes of array elements are bounds-checked, with logic similar to the following code
bool is_within_bounds(int index) const
{
    return 0 <= index && index < length();
}
- Code Location:
openjdk\src\hotspot\share\oops\
This check does not need to be called by the programmer; the JVM performs it for us.
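From the Java side, the effect of this automatic check is visible as an `ArrayIndexOutOfBoundsException`. The following minimal sketch demonstrates it; `isRejected` is a hypothetical helper written for the example.

```java
public class BoundsCheckDemo {
    // Returns true if reading array[index] is rejected by the JVM's bounds check.
    static boolean isRejected(int[] array, int index) {
        try {
            int ignored = array[index];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        int[] array = {1, 2, 3, 4, 5};
        System.out.println(isRejected(array, 4));  // false: last valid index
        System.out.println(isRejected(array, 5));  // true: index == length
        System.out.println(isRejected(array, -1)); // true: negative index
    }
}
```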
Exercises
E01. Merge sorted arrays - corresponds to LeetCode 88
Merge the sorted elements of two intervals within one array
Example
[1, 5, 6, 2, 4, 10, 11]
can be viewed as two sorted intervals
[1, 5, 6] and [2, 4, 10, 11]
After merging, the result is still stored in the original space
[1, 2, 4, 5, 6, 10, 11]
Method 1
Recursion
- Each recursive call copies the smaller element into the result array
merge(left=[1,5,6],right=[2,4,10,11],a2=[]){
    merge(left=[5,6],right=[2,4,10,11],a2=[1]){
        merge(left=[5,6],right=[4,10,11],a2=[1,2]){
            merge(left=[5,6],right=[10,11],a2=[1,2,4]){
                merge(left=[6],right=[10,11],a2=[1,2,4,5]){
                    merge(left=[],right=[10,11],a2=[1,2,4,5,6]){
                        // copy 10, 11
                    }
                }
            }
        }
    }
}
Code
public static void merge(int[] a1, int i, int iEnd, int j, int jEnd,
                         int[] a2, int k) {
    if (i > iEnd) {
        // left interval exhausted: copy the rest of the right interval
        System.arraycopy(a1, j, a2, k, jEnd - j + 1);
        return;
    }
    if (j > jEnd) {
        // right interval exhausted: copy the rest of the left interval
        System.arraycopy(a1, i, a2, k, iEnd - i + 1);
        return;
    }
    if (a1[i] < a1[j]) {
        a2[k] = a1[i];
        merge(a1, i + 1, iEnd, j, jEnd, a2, k + 1);
    } else {
        a2[k] = a1[j];
        merge(a1, i, iEnd, j + 1, jEnd, a2, k + 1);
    }
}
Test
int[] a1 = {1, 5, 6, 2, 4, 10, 11};
int[] a2 = new int[a1.length];
merge(a1, 0, 2, 3, 6, a2, 0);
Method 2
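Code
public static void merge(int[] a1, int i, int iEnd,
                         int j, int jEnd,
                         int[] a2) {
    int k = i;
    // take the smaller head element until one interval is exhausted
    while (i <= iEnd && j <= jEnd) {
        if (a1[i] < a1[j]) {
            a2[k] = a1[i];
            i++;
        } else {
            a2[k] = a1[j];
            j++;
        }
        k++;
    }
    if (i > iEnd) {
        // left interval exhausted: copy the rest of the right interval
        System.arraycopy(a1, j, a2, k, jEnd - j + 1);
    }
    if (j > jEnd) {
        // right interval exhausted: copy the rest of the left interval
        System.arraycopy(a1, i, a2, k, iEnd - i + 1);
    }
}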
Test
int[] a1 = {1, 5, 6, 2, 4, 10, 11};
int[] a2 = new int[a1.length];
merge(a1, 0, 2, 3, 6, a2);
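To end up with the result stored in the original space, as the exercise asks, the merged data in the temporary array still has to be copied back. The sketch below wraps the iterative merge in a hypothetical helper `mergeInPlace` that does this; it assumes the two intervals are adjacent (the second starts right after the first ends).

```java
import java.util.Arrays;

public class MergeDemo {
    // Iterative merge of a1[i..iEnd] and a1[j..jEnd] into a2, starting at a2[i].
    static void merge(int[] a1, int i, int iEnd, int j, int jEnd, int[] a2) {
        int k = i;
        while (i <= iEnd && j <= jEnd) {
            if (a1[i] < a1[j]) {
                a2[k++] = a1[i++];
            } else {
                a2[k++] = a1[j++];
            }
        }
        if (i > iEnd) {
            System.arraycopy(a1, j, a2, k, jEnd - j + 1);
        }
        if (j > jEnd) {
            System.arraycopy(a1, i, a2, k, iEnd - i + 1);
        }
    }

    // Hypothetical wrapper: merges two adjacent intervals and copies the result back into a1.
    static void mergeInPlace(int[] a1, int i, int iEnd, int j, int jEnd) {
        int[] a2 = new int[a1.length];
        merge(a1, i, iEnd, j, jEnd, a2);
        System.arraycopy(a2, i, a1, i, jEnd - i + 1);
    }

    public static void main(String[] args) {
        int[] a1 = {1, 5, 6, 2, 4, 10, 11};
        mergeInPlace(a1, 0, 2, 3, 6);
        System.out.println(Arrays.toString(a1)); // [1, 2, 4, 5, 6, 10, 11]
    }
}
```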