
FPGA CDC: Multi-bit Clock Domain Crossing Synchronization with the Handshake Mechanism in One Article

2024-11-16 16:27:27

I. Background
Moving data across clock domains is a common problem in FPGA development. It comes in two cases:

    1. Slow clock domain to fast clock domain synchronization: simply register the signal twice ("two beats", i.e. a two-flop synchronizer) in the fast clock domain. The RTL is as follows:
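
A minimal sketch of such a two-beat synchronizer for a single-bit signal is given here. The module name sync_2ff and all of its port names are hypothetical and chosen only for illustration; the reset style (asynchronous, active low) is assumed to match the handshake module later in this article.

```verilog
// Sketch only: sync_2ff and its port names are illustrative assumptions.
module sync_2ff (
    input  wire i_clk_dst , // destination (fast) clock
    input  wire i_rst_n   , // asynchronous reset, active low
    input  wire i_async   , // signal coming from the other clock domain
    output wire o_sync      // copy that is safe to use in the i_clk_dst domain
);

    reg r_beat1 ; // first beat: may go metastable, must not fan out to logic
    reg r_beat2 ; // second beat: has had a full clock period to settle

    always @(posedge i_clk_dst or negedge i_rst_n) begin
        if (~i_rst_n) begin
            r_beat1 <= 1'b0;
            r_beat2 <= 1'b0;
        end
        else begin
            r_beat1 <= i_async;
            r_beat2 <= r_beat1;
        end
    end

    assign o_sync = r_beat2;

endmodule
```

Only o_sync should be used by downstream logic. A structure like this suits single-bit level signals; a multi-bit bus cannot simply be passed through one such synchronizer per bit, which is the reason for the handshake described later.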

    The principle of two-beat synchronization: when first learning FPGA design, everyone hears that registering ("beating") a signal effectively guards against metastability, and that two beats is the usual choice. The mathematical intuition is that if one beat has a 1/1000 probability of producing an error, then two beats have a 1/1,000,000 probability, which is statistically very close to zero.
    From the point of view of the circuit itself, if a signal's update rate is not synchronized with that of the signals in the current clock domain, then any operation that combines it with signals in the current clock domain may produce glitches or violate the setup or hold time of the downstream register. Therefore, to ensure that a newly introduced signal from a foreign clock domain does not wreak havoc on the current clock domain, it must be synchronized, and the method is to sample it with the clock of the current domain. Registering the asynchronous signal twice lets its level settle into a more "robust" range, and avoids the unstable logic levels that setup-time violations and multi-way fan-out would otherwise cause. The following is excerpted from The FPGA Way:

"Why sample twice?
A single sample already synchronizes the signal from the foreign clock domain, so why use two-stage sampling? From a logical point of view, two-stage sampling is the same as a shift register: it cannot change the logic value of the signal, and it adds delay to the signal on its way in. Yet it is not superfluous but very necessary, for the following reasons. Consider the register that performs the first-stage sampling: can its setup and hold times be met? Obviously, in some cases they cannot, because its input does not change in step with the clock of the current domain, so a violation will always occur sooner or later. For example, when asynsignal goes from 0 to 1 and clk sees three rising edges around that moment, the output unsafesignal may be 001 or it may be 011. The second sample violates the setup or hold time, so it cannot be known whether it is 0 or 1. The consolation is that both 001 and 011 correctly capture the change in the original signal. You may think: asynchronous logic can never be captured with exact timing anyway, so if the change is captured correctly, is that not enough? Logically, yes, it is enough; in terms of drive strength, perhaps not. As we all know, the internal operating voltage of an FPGA is generally 1.5V; that is, ideally logic 1 corresponds to 1.5V and logic 0 corresponds to 0V.
But reality is harsh: logic 1 cannot be exactly 1.5V and logic 0 cannot be exactly 0V. In practice, the industry may accept anything above 0.5V as logic 1 and anything below it as logic 0, which is why digital signals resist interference better than analog signals. So if you are asked whether two flip-flop outputs inside the FPGA that are both logic 1 have the same physical voltage, the answer is obviously: not necessarily. Why does a flip-flop give the correct output only if its input meets the setup and hold time requirements? The reason is simple: the digital circuit needs enough time to charge or discharge, so that an output of logic 1 gets closer to 1.5V and an output of logic 0 gets closer to 0V. Conversely, the closer the physical level of a logic 1 or 0 is to 1.5V or 0V, the more easily it can fully charge or discharge the next flip-flop within the allotted time, so the output of that next flip-flop is also "stronger". Now look back at the physical voltage of unsafesignal. The flip-flop that outputs unsafesignal has quite possibly had its setup or hold time violated, so its charging or discharging is far from complete, and the physical voltage of its logic 1 or logic 0 output is poor.
For example, if unsafesignal is logic 1, its physical voltage may well be 0.6V. If unsafesignal is used in many places in the current clock domain, the fan-out capability of a 0.6V level is obviously poor, so by the time it reaches the various places that use it, the physical voltage may have become 0.4V, 0.5V, 0.55V and so on. Different flip-flops will then recognize unsafesignal as different logic levels, and so the error is born. To avoid this situation, we sample unsafesignal again. Because unsafesignal and safesignal are synchronous, the flip-flop that outputs safesignal is given far more than its required setup time. Even if the 0.6V level charges relatively slowly, the charging time is ample and the charging current is guaranteed, so safesignal can still reach a more robust level. safesignal can then be connected wherever it is needed, and there will be no more problems with its fan-out capability."

    2. Fast clock domain to slow clock domain synchronization
    The first, simple way to cross between unsynchronized clocks is to add a FIFO between the fast and slow clock domains.
    The second way is the handshake mechanism (spelled "hand shanking" in the code below), whose timing diagram can be represented by the following figure:

    Once the data is valid, the host asserts the synchronization request req and holds it until it detects the slave's ACK signal; req is then pulled low, marking the end of one synchronization. The req signal is sampled and synchronized in the slave, and after two beats the slave samples the host's DATA signal, completing the data synchronization from the fast clock domain to the slow clock domain.
    The code is as follows:
// *****************************************************************
// Copyright (C) xx Corporation
// File name: hand_shanking.v
// Author: Dongyang
// Date: 2024-11/16
// Version: 1.0
// Abstract: CDC multi bit sync,use hand shanking to sync data
// *****************************************************************
`timescale 1 ns/1 ns
module hand_shanking_module# (
    parameter integer DATA_WIDTH = 8
    )
(
    input i_clk_f , //Fast (source) clock
    input i_sys_rst_n , //External asynchronous reset signal, active low
    input [DATA_WIDTH-1 : 0] i_src_data , //External input data, in the i_clk_f domain
    input i_src_data_valid , //Data valid flag


    input i_clk_s , //Slow (destination) clock
    output o_des_ack , //Acknowledge signal; its rising edge can be used as the valid flag for o_des_data in the i_clk_s domain
    output reg [DATA_WIDTH-1 : 0] o_des_data //Data after synchronization into the i_clk_s domain
);

//******************** signal define ***********************
    reg r_src_req ;
    reg r_src_ack_sync1 ;
    reg r_src_ack_sync2 ;
    
    reg r_des_req_sync1 ;
    reg r_des_req_sync2 ;
    
    reg r_des_ack ;

//************** combinational logic ***********************
assign o_des_ack = r_des_ack;
// step 1 , generate r_src_req
always @(posedge i_clk_f or negedge i_sys_rst_n) begin
    if (~i_sys_rst_n) begin
        r_src_req <= 1'b0;
    end
    else begin
        if (i_src_data_valid) begin //once datavalid , generate r_src_req
            r_src_req <= 1'b1;
        end
        else if (r_src_ack_sync2) begin // when i_clk_s domain ack successfully,r_src_req reset
            r_src_req <= 1'b0;
        end
    end
end

//step 2, in the i_clk_s domain, sync r_src_req from the i_clk_f domain (two beats) and generate the ack signal
always @(posedge i_clk_s or negedge i_sys_rst_n) begin
    if (~i_sys_rst_n) begin
        r_des_req_sync1 <= 1'b0;
        r_des_req_sync2 <= 1'b0;
        r_des_ack <= 1'b0;
    end
    else begin
        r_des_req_sync1 <= r_src_req ;
        r_des_req_sync2 <= r_des_req_sync1;
        r_des_ack <= r_des_req_sync2;
    end
end
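//step 3, while r_des_req_sync2 is set, capture i_src_data into o_des_data
//note: i_src_data itself is not synchronized, so the source must hold it
//stable from i_src_data_valid until the ack has come back to the i_clk_f domain
always @(posedge i_clk_s or negedge i_sys_rst_n) begin
     if (~i_sys_rst_n) begin
        o_des_data <= 'b0;
     end
     else begin
        if(r_des_req_sync2) begin
            o_des_data <= i_src_data;
        end
     end
end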


//step 4 ,sync r_des_ack to i_clk_f domain
always@(posedge i_clk_f or negedge i_sys_rst_n) begin
    if(~i_sys_rst_n) begin
        r_src_ack_sync1 <= 1'b0;
        r_src_ack_sync2 <= 1'b0;
    end
    else begin
        r_src_ack_sync1 <= r_des_ack;
        r_src_ack_sync2 <= r_src_ack_sync1;

    end
end

endmodule
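
As the port comment on o_des_ack notes, the rising edge of the ack can mark o_des_data as valid in the i_clk_s domain. One possible way to turn that edge into a one-cycle strobe is sketched here; the module name des_valid_pulse and its port names are hypothetical and not part of the original design.

```verilog
// Sketch only: des_valid_pulse and its port names are illustrative assumptions.
module des_valid_pulse (
    input  wire i_clk_s     , // slow (destination) clock
    input  wire i_sys_rst_n , // asynchronous reset, active low
    input  wire i_des_ack   , // connect to o_des_ack of hand_shanking_module
    output wire o_des_valid   // high for one i_clk_s cycle per transfer
);

    reg r_des_ack_d ; // ack delayed by one i_clk_s cycle

    always @(posedge i_clk_s or negedge i_sys_rst_n) begin
        if (~i_sys_rst_n) begin
            r_des_ack_d <= 1'b0;
        end
        else begin
            r_des_ack_d <= i_des_ack;
        end
    end

    // rising edge: ack is high now but was low one cycle earlier
    assign o_des_valid = i_des_ack & ~r_des_ack_d;

endmodule
```

In the module above, r_des_ack and o_des_data are both updated on the same i_clk_s edge, the one at which r_des_req_sync2 is first seen high, so o_des_data already holds the new value during the cycle in which o_des_valid is high.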

TestBench:

`timescale 1 ns/1 ns
module tb_hand_shanking();

parameter integer DATA_WIDTH =  8;

reg                         clk_f      = 'b0;
reg                         clk_s      = 'b0;
reg                         sys_rst_n  = 'b0;
reg   [DATA_WIDTH- 1 : 0 ]  src_data   = 'b0;
reg                         data_valid = 'b0;

always # 10   clk_f = ~ clk_f;
always # 30   clk_s = ~ clk_s;

initial begin
    clk_f      = 'b0;
    clk_s      = 'b0;
    sys_rst_n  = 'b0;
    src_data   = 'b0;
    data_valid = 'b0;
    #50
    sys_rst_n <= 1'b1;
    #100
    src_data <= 8'h5A;
    data_valid <= 1'b1;
    #20
    data_valid <= 1'b0;
    #500
    src_data <= 8'h6A;
    data_valid <= 1'b1;
    #20
    data_valid <= 1'b0;
    #500
    $finish; // stop the simulation after the second transfer has completed
end

hand_shanking_module  
#(
        .DATA_WIDTH(DATA_WIDTH)
)
 U_hand_shanking_module_0 
(
    .i_clk_f             (clk_f),
    .i_sys_rst_n         (sys_rst_n),
    .i_src_data          (src_data),
    .i_src_data_valid    (data_valid),
    .i_clk_s             (clk_s),
    .o_des_ack           (),
    .o_des_data          ()
 
);

endmodule
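
The testbench above leaves o_des_ack and o_des_data unconnected, so the result has to be judged by eye from the waveform. A self-checking variant could look like the following sketch. The names tb_hand_shanking_check, send_and_check, des_ack and des_data are assumptions introduced for illustration; the clock periods and test values are taken from the testbench above.

```verilog
`timescale 1 ns/1 ns
// Sketch only: a self-checking variant of the testbench, names are illustrative.
module tb_hand_shanking_check();

localparam integer DATA_WIDTH = 8;

reg                       clk_f      = 1'b0;
reg                       clk_s      = 1'b0;
reg                       sys_rst_n  = 1'b0;
reg  [DATA_WIDTH-1 : 0]   src_data   = 'b0;
reg                       data_valid = 1'b0;
wire                      des_ack;
wire [DATA_WIDTH-1 : 0]   des_data;

always #10 clk_f = ~clk_f; // fast clock, 20 ns period
always #30 clk_s = ~clk_s; // slow clock, 60 ns period

hand_shanking_module #(
    .DATA_WIDTH(DATA_WIDTH)
) U_dut (
    .i_clk_f          (clk_f),
    .i_sys_rst_n      (sys_rst_n),
    .i_src_data       (src_data),
    .i_src_data_valid (data_valid),
    .i_clk_s          (clk_s),
    .o_des_ack        (des_ack),
    .o_des_data       (des_data)
);

// Send one word, wait for the ack, compare, then wait for the handshake to go idle.
task send_and_check(input [DATA_WIDTH-1 : 0] value);
    begin
        @(posedge clk_f);
        src_data   <= value;        // held stable for the whole handshake
        data_valid <= 1'b1;
        @(posedge clk_f);
        data_valid <= 1'b0;         // one-cycle valid pulse
        @(posedge des_ack);
        #1;
        if (des_data !== value)
            $display("FAIL: expected %h, got %h", value, des_data);
        else
            $display("PASS: %h", des_data);
        @(negedge des_ack);          // req has been released and seen by the slave
        repeat (3) @(posedge clk_f); // let the ack synchronizer in the fast domain clear
    end
endtask

initial begin
    #55 sys_rst_n = 1'b1;
    send_and_check(8'h5A);
    send_and_check(8'h6A);
    $finish;
end

endmodule
```

Waiting for the ack to fall, plus a few fast clock cycles, before sending the next word matters here: in the module above a new i_src_data_valid that arrives while the old ack is still visible in the i_clk_f domain would have its request cleared too early.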

[Figure: simulated waveforms]