CRC algorithm principle, derivation and implementation

CRC, Cyclic Redundancy Check, Cyclic redundancy check

1. Basic principles

The essence of CRC is division, treating the data to be examined as a very large (long) divisor, choosing a divisor (called poly in some literature) on either side, and finally obtaining a remainder that is the CRC checksum value.

Methods of Determination:

Separate the message from the checksum. Computes the checksum of the message (after appending W zeros) and compares the two checksums.
Put the checksum value at the end of the data and do a checksum (without appending zeros) for the whole batch to see if the result is zero!

1.1 Why use CRC

More common is the cumulative sum check, but with the following drawbacks:

80 together with80 00 .. 00The results of the calculations are consistent, i.e., if the data are interspersed with the00is undetectable and unfriendly to the detection of indeterminate lengths
Because of the cumulative sum, the80 00There are very many combinations where the checksums are equal, such as70 10, 79 06 wait a minute!

So what would cause a CRC to fail?

TODO

2. Preparation for derivation

2.1 Unrounded addition and subtraction

Adding two numbers in CRC arithmetic is the same as adding numbers in normal binary arithmetic, except there is no rounding.

Unprogressive addition and subtraction are actually different-or operations (different-or, unlike or together: unlike is 1, same is 0)

This means that each pair of corresponding bits determines the corresponding output bit without reference to any other bit position. For example

    10011011
   +11001010
    --------
    01010001
    --------

Addition in 4 cases:

    0+0=0
    0+1=1
    1+0=1
    1+1=0  (no carry)

Subtraction is similar:

    10011011
   -11001010
    --------
    01010001
    --------

with

    0-0=0
    0-1=1  (wraparound)
    1-0=1
    1-1=0

In this way, we are equivalent to combining addition and subtraction into one algorithm, or it can be understood that addition and subtraction are here called a kind of reciprocal operation, for example, we can add or subtract the same amount, we can go from 1010 to 1001:

    1001 = 1010 + 0011
    1001 = 1010 - 0011

So in unrounded addition and subtraction, 1010 can no longer be considered greater than 1001;

What's the benefit of doing that?

You'll notice that no matter how long the data bit is, it no longer depends on the state of the previous or the next bit in the operation, unlike addition with rounding and subtraction with borrowing, which you can understand as running a parallel calculation:

Addition with rounding, the result of the calculation of the high place needs to be added to the result of the low place generated by the rounding, which leads to the need to calculate the low place, and then only after the calculation of the high place; for example, the following example, if with rounding, then you must start from the most right to calculate, and then in turn calculated to the most left to get the result. But if we remove the rounding, we can start counting from that side, and of course we can also count multiple bits at the same time (parallel computing).
```
          1011                         1011
        + 1101                       + 1101
          ----                         ----        
        1 1000 (with carry)            0110 (no carry)
```
The same is true of subtraction, without further ado.

2.2 Unprogressive multiplication

With addition defined, we can multiply and divide. Multiplication is simple, except that we use XOR for addition.

    1101
  x 1011
    ----
    1101
   1101.
  0000..
 1101...
 -------
 1111111  Note: The sum uses CRC addition
 -------

2.3 Unrounded division

Division is similar, except that there are two things to keep in mind:

When the highest digit of both the divisor and the divisor is 1, it is treated as if theline up in correct orderup, you can start XOR operations without comparing data sizes, such as1001receivables1011Divide, as to whether the result of the quotient is 1 or 0, no one pays attention and is happy with themselves, because the algorithm is pressed into service;

Subtraction of divisors and divisors requires the use of unrounded subtraction, the XOR operation;

           1 = the Shang dynasty, 16th 11th century BC (nobody cares about the quotient)
       ______
 1011 ) 1001 divisor (math.)
 =Poly 1011
        ----
        0010

3. Algorithm derivation

Even if we know that the CRC algorithm is based on division, we can't use the division operation directly; for one thing, the data to be checked is very long and we don't have such a large register; for another, do you know how division is implemented in the MCU?

3.1 Human Resource Modeling

Now let's assume that a message data is1101011011, choose a divisor that is10011, divide the message by poly using the CRC algorithm:

            1100001010 = Quotient (nobody cares about the quotient)
       _______________
10011 ) 11010110110000 = Augmented message (1101011011 + 0000)
 =Poly  10011,,.,,..|.
        -----,,.,,|...
         10011,.,,|.|.
         10011,.,,|...
         -----,.,,|.|.
              10110...
              10011.|.
              -----...
               010100.
                10011.
                -----.
                 01110
                 00000
                 -----
                  1110 = Remainder = THE CHECKSUM!!!!

The role of the high bit to the left of the divisor poly is really for show (it's actually 0011 that's involved in the operation), and the purpose is to take out the current highest divisor, essentially making poly and the divisorline up in correct order, and then start the XOR operation.

So what's consideredline up in correct orderWhat about it? From the example, when the highest digit of both the divisor and the divisor is 1, it is considered aligned.

The idea of converting it to an algorithm is that you can also understand it as a long string of data constantly shifted from the right into a register, and when the value of the leftmost overflow of the register is 1, then the data in the current register can be iso-orthogonal to poly, which, in terms of the algorithm, is roughly the same:

                  3   2   1   0   Bits
                +---+---+---+---+
       Pop! <-- |   |   |   |   | <----- Augmented message
                +---+---+---+---+

             1    0   0   1   1   = The Poly

The description in algorithmic language is:

Register clear
right-most patch of data W bit 0 // W is the number of bits in the CRC check value
when(data) {
    shift the register left by 1 bit and read the next bit of data to bit 0 of the register
    if (overflow occurs while shifting register left){
        register ^= poly; // here poly=0011, as in the example above
    }
}
The value of the register is the checksum value

In C:

// CRC8 generates a polynomial
#define POLYNOMIAL 0x07

// Calculate the CRC8 checksum value
uint8_t crc8_data(const uint8_t dat8) {
    uint8_t crc = dat8;
    for (j = 8; j; j--) {
        if (crc & 0x80)
            crc = (crc << 1) ^ POLYNOMIAL.
        else = (crc
            crc <<= 1; }
    }
    else crc <<= 1; }
}

But this method is too dumb to perform calculations by bits, and the efficiency leaves much to be desired.

3.2 Calculating CRC4 using the Table driver

3.2.1 Calculation of 4-bit data

For the sake of description, let's take an exampleW=4both (... and...)poly=3case, for example, if we compute a3The CRC value of5, which we write as the computational procedure for XOR:

0011 0000 // complement W = 4 zeros (value 1)
,,10 011 // poly alignment (value 2)
---------
0001 0110
,,,1 0011 // poly alignment (value 3)
---------
0000 0101 // CRC value (value 4)

The above computation has gone through N iterations of the operation (actually how many iterations we don't care), which is equivalent to

CRC value = value 1 ^ value 2 ^ value 3
      = value 1 ^ (value 2 ^ value 3)
      = value 1 ^ lookup value // make `lookup value` = value 2 ^ value 3

Note that in the CRC calculation, 4 zeros are added to the end, but we are clear that any XOR operation with numbers and zeros is itself, so adding zeros does not affect the final CRC value, it is just equivalent to extracting the CRC value. The CRC calculation is equivalent to a series of shifts and XOR operations, so the expression above actually reads:

CRC value = (value 1 ^ 0) ^ value 2 ^ value 3
      = (value 1 ^ 0) ^ (value 2 ^ value 3)
      = (value 1 ^ 0) ^ lookup value
      = value 1 ^ lookup value // make `lookup value` = value 2 ^ value 3

That is, we can realize that it is possible to put0~15The value of the CRC of the first pre-calculated, and then saved, so that the next time the calculation can be calculated directly look up the table to calculate, which is very good to understand.

3.2.2 Calculation of 8-bit data

Imagine an 8-bit byte can be split into two 4-bit data, if we can utilize the lookup table method, isn't it possible to get an 8-bit CRC value by two calculations? Specifically how to do it, let's take an exampleW=4both (... and...)poly=3case, for example, if we compute a33hof the CRC value, we write the procedure for calculating the XOR:

0011 0011 0000 // complement W = 4 zeros (value 1)
,,10 011 // poly alignment (value 2)
--------------
0001 0101 0000
,,,1 0011 // poly alignment (value 3)
--------------
0000 0110 0000 // change back to 4-bit CRC calculation (value 4)

The following calculation will be familiar to us, going back to Calculating 4-bit data as6 The CRC value of the

0110 0000 // complement W = 4 zeros
,100 11 // poly alignment
---------
0010 1100
,,10 011 // poly alignment
---------
0000 1010 // CRC value

One interesting thing we found was that the original 4-bit data3The CRC value of5But when33hThe high 4-bit CRC value is calculated first but is6The data is not the same as the previous one (fortunately not, if the data is the same no matter what data follows it, what is the use of checking), what is the reason for this?

First, let's look at the difference between the 8-bit computation and the original 4-bit computation in the end complement:

A 4-bit CRC calculation with a zero at the end does not affect the result;

When 8-bit CRC is calculated, the end of the complement is followed by the lower 4-bit data, which will affect the original calculation results:

For the sake of description, we have put the 8-bitValue 1The high 4-bit data is written as H4 and the low 4-bit data is written as L4.

`value4` = `value1` ^ value2 ^ value3
    = `value1` ^ `lookup table value` // make `lookup table value` = value2 ^ value3
    = `(H4<<4 ^ L4)` ^ `Lookup Table Value`
    = `(H4<<4)` ^ `lookup table value` ^ L4 // (H4<<4) is actually the CRC value of H4 with zeroes at the end.

So, we can get 2 segments of 4-bit computational flow:

The high 4-bit value of the removed byte is H4
The H4 value is looked up and calculated to get the value TMP1
Put the value of TMP1isomorphicThe value L4 of the lower 4 bits yields the value TMP2
The value of TMP2 is then used to perform a lookup table calculation to obtain the value CRC

4. Algorithmic improvements

4.1 CRC8 calculations

Now we can be based on the CRC4 calculation process analogous to the CRC8 calculation, in fact, the main difference is that the number of register bits from 4 to 8 bits, a typical CRC8 calculation model is as follows, now you should be able to read and understand.

#include <>
#include <>

// CRC8Generating polynomials
#define POLYNOMIAL 0x07

// initializationCRC8lookup table
void init_crc8_table(void) {
    uint8_t i, j;
    for (i = 0; i < 256; i++) {
        uint8_t crc = i;
        for (j = 8; j; j--) {
            if (crc & 0x80)
               crc = (crc << 1) ^ POLYNOMIAL;
            else
               crc <<= 1;
        }
        crc8_table[i] = crc;
    }
}

// countCRC8calibration value
uint8_t crc8(const void *data, size_t len) {
    const uint8_t *byte = data;
    uint8_t crc = 0x00;

    for (; len > 0; len--) {
        crc = crc8_table[(crc ^ *byte++) & 0xFF];
    }

    return crc;
}

int main(int argc, char *argv[]) {
    int fd;
    uint8_t buffer;
    size_t bytes_read;
    uint8_t crc;

    if (argc != 2) {
        fprintf(stderr, "Usage: %s filename\n", argv);
        exit(1);
    }

    fd = open(argv, O_RDONLY);
    bytes_read = read(fd, buffer, sizeof(buffer));
    crc = crc8(buffer, bytes_read);
    printf("CRC: 0x%02X\n", crc);<q refer="1"></q><span class="_q_s_"></span>

    close(fd);
    return 0;
}

4.2 CRC8 calculations -- improved

The above algorithm is still not good enough, because the Table's table is too large occupies 256 bytes, for the FLASH space is tight for the MCU is not very friendly, can not be an 8-bit data is split into two times the calculation of 4-bit data, so is not it possible to engage in the16Table of bytes now? Let's try it!

Actually uses 32 bytes

idx	L4	H4	H4 Description
0	0	0	(L4<<4) ^ 07*0
1	07	70	(L4<<4) ^ 07*0
2	0E	E0	(L4<<4) ^ 07*0
3	09	90	(L4<<4) ^ 07*0
4	1C	C7	(L4<<4) ^ 07
5	1B	B7	(L4<<4) ^ 07
6	12	27	(L4<<4) ^ 07
7	15	57	(L4<<4) ^ 07
8	38	89	(L4<<4) ^ ((07<<1) ^ 07)
9	3F	F9	(L4<<4) ^ ((07<<1) ^ 07)
A	36	69	(L4<<4) ^ ((07<<1) ^ 07)
B	31	19	(L4<<4) ^ ((07<<1) ^ 07)
C	24	4E	(L4<<4) ^ (07<<1)
D	23	3E	(L4<<4) ^ (07<<1)
E	2A	AE	(L4<<4) ^ (07<<1)
F	2D	DE	(L4<<4) ^ (07<<1)

// countCRC8calibration value
uint8_t crc8(const void *data, size_t len) {
    const uint8_t *byte = data;
    uint8_t crc = 0x00;

    for (; len > 0; len--) {
        crc = crc8_table_h4[(crc ^ *byte)>>4] ^
              crc8_table_l4[(crc ^ *byte) & 0xF] ;
              
              byte++;
    }

    return crc;
}

Validation:

CRC8 calculates a single byte
```
crc8(88) = 38 ^ 89 = B1
```

CRC8 calculates multibyte

crc8(8888) = crc8(B1 ^ 88) = crc8(39) = 90^3F=AF
crc8(1234) = crc8(crc8(12) ^ 34) = crc8(7E ^ 34 = 4A) = C7^36=F1

4.3 CRC16 calculations -- improved

Further, can we implement the CRC16 algorithm using the same principles?W=16, poly=8005

idx	X[3:0]	X[7:4]	X[7:4] Description	X[11:8]	X[11:8] description	X[15:12]	X [15:12] Description
0	0	0
1	8005	8063	X[3:0]<<4 ^ 8033	8603	X[3:0]<<8 ^ 8303	E003	X[3:0]<<12 ^ B003
2	800F	80C3	X[3:0]<<4 ^ 8033	8C03	X[3:0]<<8 ^ 8303	4003	X[3:0]<<12 ^ B003
3	0A	00A0	X[3:0]<<4	A00	X[3:0]<<8	A000	X[3:0]<<12
4	801B	8183	X[3:0]<<4 ^ 8033	9803	X[3:0]<<8 ^ 8303	8006	X[3:0]<<12 ^ B003 ^ 8005
5	1E	1E0	X[3:0]<<4	1E00	X[3:0]<<8	6005	X[3:0]<<12 ^ 8005
6	14	140	X[3:0]<<4	1400	X[3:0]<<8	C005	X[3:0]<<12 ^ 8005
7	8011	8123	X[3:0]<<4 ^ 8033	9203	X[3:0]<<8 ^ 8303	2006	X[3:0]<<12 ^ B003 ^ 8005
8	8033	8303	X[3:0]<<4 ^ 8033	B003	X[3:0]<<8 ^ 8303	8009	X[3:0]<<12 ^ B003 ^ A
9	36	360	X[3:0]<<4	3600	X[3:0]<<8	600A	X[3:0]<<12 ^ A
A	3C	3C0	X[3:0]<<4	3C00	X[3:0]<<8	C00A	X[3:0]<<12 ^ A
B	8039	83A3	X[3:0]<<4 ^ 8033	BA03	X[3:0]<<8 ^ 8303	2009	X[3:0]<<12 ^ B003 ^ A
C	28	280	X[3:0]<<4	2800	X[3:0]<<8	000F	X[3:0]<<12 ^ 800F
D	802D	82E3	X[3:0]<<4 ^ 8033	AE03	X[3:0]<<8 ^ 8303	E00C	X[3:0]<<12 ^ B003 ^ 800F
E	8027	8243	X[3:0]<<4 ^ 8033	A403	X[3:0]<<8 ^ 8303	400C	X[3:0]<<12 ^ B003 ^ 800F
F	22	220	X[3:0]<<4	2200	X[3:0]<<8	A00F	X[3:0]<<12 ^ 800F

Validation:

CRC16 Calculated Double Byte

for example, calculatingABCDof CRC16:

crc16(A000) = C00A
crc16(0B00) = BA03
crc16(00C0) = 0280
crc16(000D) = 802D
happeningcrc16(ABCD) = C00A ^ BA03 ^ 0280 ^ 802D = F8A4

CRC16 Calculated Quadruple Byte

What if the data is concatenated, such asABCD

  crc(ABCD) = F8A4
  
  crc(ABCD1234) 
= crc(F8A4 ^ 1234) = crc(EA90) = 400C ^ 3C00 ^ 360 ^ 0 = 7F6C