NJU ICS2024 PA Assignment Tips (II)
Since I am not an NJU student in the 2024 cohort, I am posting these tips openly, but only as a reference for readers who are not taking this course for credit in the current academic year. If you are taking this course for credit right now, please close this page immediately!
RTFSC problem solving
Read the RISC-V manual very, very carefully for this part, or you will suffer later.
Here we briefly describe, from the framework's point of view, what happens before instruction fetch and after execution, so that readers can follow the overall flow of NEMU together with the course documentation.
- Instruction fetch
  - Each riscv32 instruction is 4 bytes long. The instructions are laid out one after another in a bin file, which is read and written as a binary file; NEMU defines a positional argument to receive the path of the bin file to load.
  - In the interactive console, si or c runs the cpu_exec function in a file under $NEMU_HOME/src/cpu/, which calls the execute function to run the requested number of instructions.
  - The execute function creates a Decode structure: pc is the address of the instruction being decoded, snpc is the address immediately following pc, dnpc is the address pc moves to after the instruction has actually executed, inst holds the binary encoding of the instruction, and logbuf stores the disassembly of the instruction. It then calls exec_once repeatedly to execute the requested number of instructions in sequence.
  - The exec_once function maintains the contents of the Decode structure and calls isa_exec_once to fetch, decode, and execute.
- After execution (updating pc)
  - exec_once records the instruction information and its disassembly.
  - execute calls trace_and_difftest, which checks the behavior of our implementation in NEMU and prints the trace log (every xTRACE should log its results here). After each call to exec_once, execute checks the state of NEMU and stops executing instructions on an exception.
  - cpu_exec prints the result of the execution.
How riscv32 loads 32-bit constants despite the instruction length limit
The constant is loaded in two pieces: the high bits are loaded first, and the low bits are then merged into the register with an immediate instruction. This works within the fixed 32-bit instruction width, which leaves no room to encode a full 32-bit constant in a single instruction. The riscv32 solution is given below.
- lui (Load Upper Immediate): loads the upper 20 bits of a 32-bit constant into the upper part of a register. For example, lui t0, 0x12345 loads 0x12345000 into register t0.
- addi (Add Immediate): adds a 12-bit immediate to the upper bits loaded by lui. For example, addi t0, t0, 0x678 sets the value in register t0 to 0x12345678.
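The split above can be sketched in C. One detail matters here: addi sign-extends its 12-bit immediate, so when bit 11 of the constant is set, the lui part has to be one larger to compensate. The type LuiAddi and the functions split_const and emulate_lui_addi are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper type: the lui/addi immediates for one 32-bit constant. */
typedef struct {
    uint32_t hi20;  /* immediate for lui (upper 20 bits) */
    int32_t  lo12;  /* immediate for addi, in [-2048, 2047] */
} LuiAddi;

static LuiAddi split_const(uint32_t value) {
    LuiAddi r;
    /* Adding 0x800 carries into bit 12 exactly when bit 11 is set, which
     * compensates for addi sign-extending its immediate. */
    r.hi20 = ((value + 0x800u) >> 12) & 0xFFFFFu;
    r.lo12 = (int32_t)(value & 0xFFFu);
    if (r.lo12 >= 0x800) r.lo12 -= 0x1000;
    return r;
}

/* What the register holds after "lui rd, hi20; addi rd, rd, lo12". */
static uint32_t emulate_lui_addi(LuiAddi p) {
    uint32_t reg = p.hi20 << 12;       /* lui */
    return reg + (uint32_t)p.lo12;     /* addi wraps modulo 2^32 */
}
```

For 0x12345678 this yields lui 0x12345 and addi 0x678, matching the example above; for 0x12345FFF it yields lui 0x12346 and addi -1.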
How riscv32 instructions are implemented
This is the core of this subsection, and there are several easy pitfalls (which require studying the official documentation carefully):
- Always check whether an operand is signed or unsigned!
- Always follow the documentation's description for the operand width you are developing for (64-bit or 32-bit)!
- Always implement the exceptional cases as described in the documentation!
To pass the whole instruction test set in this section, we need to implement six instruction types: R, I, S, B, U, and J. (If the instruction tests pass, there should be nothing wrong with a particular instruction's implementation; if you hit a bug later, have confidence in the instructions you implemented and do not come back here to trace it.) For each type, we add a TYPE_* case in decode_operand together with the way its immediate is extracted (refer to the chart below).
To extract immediates, the NEMU framework gives us some handy macros:
- BITS(x, hi, lo): gets bits [lo, hi] of the binary number x.
- SEXT(x, len): sign-extends the low len bits of x. The implementation uses a bit field: a signed bit field of width len keeps exactly len bits of x, including the sign bit, and converting it to a wider type then performs the sign extension.
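A minimal sketch of how such macros can be written and used, assuming GCC or Clang (the bit-field version of SEXT relies on statement expressions). The framework's own definitions may differ; BITMASK, imm_i, and imm_u are names introduced here for illustration.

```c
#include <stdint.h>

/* Sketch modeled on the descriptions above; not copied from the framework. */
#define BITMASK(bits) ((1ull << (bits)) - 1)
#define BITS(x, hi, lo) (((x) >> (lo)) & BITMASK((hi) - (lo) + 1))
/* A signed bit field of width len keeps the sign bit of the low len bits. */
#define SEXT(x, len) ({ struct { int64_t n : len; } s_ = { .n = (int64_t)(x) }; (uint64_t)s_.n; })

/* I-type immediate: instruction bits [31:20], sign-extended from 12 bits. */
static inline uint32_t imm_i(uint32_t inst) {
    return (uint32_t)SEXT(BITS(inst, 31, 20), 12);
}

/* U-type immediate: instruction bits [31:12] placed in the upper 20 bits. */
static inline uint32_t imm_u(uint32_t inst) {
    return (uint32_t)(BITS(inst, 31, 12) << 12);
}
```

For example, imm_i applied to the encoding of addi t0, t0, -1 (0xFFF28293) gives 0xFFFFFFFF, and imm_u applied to lui t0, 0x12345 (0x123452B7) gives 0x12345000.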
After that, implementing instructions is a matter of finding the missing instruction from the error message and pinning down the specific instruction from its opcode and funct* fields.
Programs, Runtime Environment and AM Problem Handling
How to run NEMU in batch mode
Since the Makefile in the AM folder must compile NEMU and pass it arguments at some point, we need to find that step and then figure out how to pass in the batch-mode flag -b.
Line 97 of $AM_HOME/Makefile reads -include $(AM_HOME)/scripts/$(ARCH).mk, so it includes $(AM_HOME)/scripts/$(ARCH).mk. Since our ARCH=riscv32-nemu, we look in $(AM_HOME)/scripts/ next and find that this file in turn includes a file under $AM_HOME/scripts/platform/.
Line 31 of that file reads $(MAKE) -C $(NEMU_HOME) ISA=$(ISA) run ARGS="$(NEMUFLAGS)" IMG=$(IMAGE).bin, which shows that the arguments passed to NEMU come from the NEMUFLAGS variable. If you look carefully, you will notice that in the Makefile of our test programs the command-line ARGS is passed on to the Makefile's ARGS variable, so we need to forward this variable to the Makefile's NEMUFLAGS.
But you will soon find that this change alone stops NEMU from producing the output controlled by the -l xxx/ argument. Looking for where that argument is defined shows that it is in a file under $AM_HOME/scripts/platform/, but that the variable is modified there without the override keyword. As a result the += has no effect on a variable passed on the command line, because a command-line variable overrides a variable of the same name defined in the Makefile. The override keyword is needed so that the Makefile can append to a command-line variable of the same name.
How to implement sprintf
- sprintf: takes the variable argument list, hands it to vsprintf, and writes the result to a string.
- printf: takes the variable argument list, hands it to vsprintf to write the result to a string, and then outputs that string one character at a time with putch.
So the key step in implementing sprintf is implementing vsprintf.
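The two wrappers described above can be sketched as follows. This sketch uses the libc vsprintf and putchar as stand-ins for klib's vsprintf and putch; the names my_sprintf and my_printf and the 1024-byte buffer are assumptions made for the illustration.

```c
#include <stdarg.h>
#include <stdio.h>   /* vsprintf and putchar stand in for klib's vsprintf and putch */

/* Hypothetical name, chosen to avoid clashing with the libc sprintf. */
int my_sprintf(char *out, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);                 /* collect the variable arguments */
    int len = vsprintf(out, fmt, ap);  /* all formatting happens in vsprintf */
    va_end(ap);
    return len;
}

int my_printf(const char *fmt, ...) {
    char buf[1024];  /* fixed size is an assumption; longer output would overflow */
    va_list ap;
    va_start(ap, fmt);
    int len = vsprintf(buf, fmt, ap);
    va_end(ap);
    for (int i = 0; i < len; i++) putchar(buf[i]);  /* one character at a time */
    return len;
}
```

Both functions only package the arguments; the formatting logic lives entirely in vsprintf, which is why it is the part worth getting right.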
vsprintf is a formatted-output function that writes formatted data to a string. Its design is to process the placeholders in the format string one by one, convert each variable argument to its string form according to its type, and append the result to the output buffer out. The design is as follows:
1. Input parameters
- char *out: pointer to the output buffer that receives the formatted string.
- const char *fmt: the format string, containing ordinary characters and format placeholders (e.g. %d, %s).
- va_list ap: the variable argument list holding the data to format.
2. Iterate over the format string
Use while (*fmt != '\0') to walk through the format string fmt character by character. There are two cases:
- If the current character is not %, copy it to the output buffer as an ordinary character and update the buffer pointer and the total output length.
- If the current character is %, enter the formatting logic.
3. Parsing the format placeholder
On encountering %, the function enters parsing mode and parses the formatting modifiers in order:
- Flag parsing: a switch statement handles the common flags, such as - (left-align), + (always show the sign), 0 (pad with zeros), and # (alternate form). Each flag is mapped to one bit and stored in the flags variable.
- Width parsing: if a width is given in the format specifier (e.g. %10d means an output width of at least 10), it is either a number written directly or a * that takes the value from the va_list.
- Precision parsing: if a precision modifier is present (e.g. %.2f means 2 decimal places), the precision value is parsed. Again, it is either a number written directly or a * that takes the value from the va_list.
- Type specifier parsing: the last character of the format specifier is the type specifier (e.g. d, s, x), which determines the data type of the current argument. A different handler is called to format the argument depending on the type specifier.
4. Formatting by type
- Integer types (d, i, u, x, o): call the int_to_str function, which converts the integer value to a string in the requested base (decimal, octal, hexadecimal, and so on) and formats the output according to the given width and precision.
- Character type (c): write the char argument directly to the buffer.
- String type (s): output the string with the given width and precision; left and right alignment are supported.
- Pointer type (p): convert the pointer address to hexadecimal and output it with a 0x prefix.
- Special cases:
  - %: output a literal % character.
  - n: store the number of characters output so far through the corresponding pointer argument.
5. Appending the output
After each placeholder is processed, the formatted result is appended to the output buffer out through pointer operations, and the current buffer pointer buf and the total length total_len are updated.
6. Returning the result
When the whole format string has been processed, append the string terminator \0 to the end of the output buffer and return the length of the output.
How to implement the string library
Use man 3 <function_name> and implement each function carefully according to the manual, otherwise problems may appear. There are also a couple of easy pitfalls here:
- strncpy: if there is no terminator within the first n bytes of src, no string terminator \0 is appended to dest.
- strcmp: the characters must be compared as unsigned char, so convert them to unsigned when reading; otherwise comparisons involving characters with codes 128 and above go wrong.
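A minimal sketch of the two functions with both pitfalls handled, following the behavior described in man 3 strncpy and man 3 strcmp. The names my_strncpy and my_strcmp are hypothetical, chosen so the sketch does not clash with libc.

```c
#include <stddef.h>

/* Hypothetical names, chosen to avoid clashing with libc. */
char *my_strncpy(char *dst, const char *src, size_t n) {
    size_t i = 0;
    /* Copy at most n characters, stopping at the terminator of src. */
    for (; i < n && src[i] != '\0'; i++) dst[i] = src[i];
    /* If src was shorter than n, pad the rest with NUL bytes. If it was
     * not, nothing is appended and dst is left unterminated. */
    for (; i < n; i++) dst[i] = '\0';
    return dst;
}

int my_strcmp(const char *s1, const char *s2) {
    /* Compare as unsigned char so that bytes >= 128 sort after ASCII. */
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;
    while (*a != '\0' && *a == *b) { a++; b++; }
    return (int)*a - (int)*b;
}
```

With a plain (signed) char comparison, my_strcmp("\x80", "a") would come out negative on platforms where char is signed; the unsigned cast makes it positive.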
How to write a test program for the klib library
Use nested for loops to enumerate the start and end positions of the two given strings, as well as the operation length n (if any); this generally covers all cases, including the boundary cases from the pitfalls mentioned above.
How itrace's ring buffer is implemented
My implementation creates a new file under $NEMU_HOME/src/cpu/ that maintains a ring queue and exposes two functions: void record_inst(uint64_t pc, const char *asm_code), which takes the pc and the disassembly string of the instruction being recorded, and void info_inst_records(), which prints the current contents of the queue.
In the trace_and_difftest function in a file under $NEMU_HOME/src/cpu/, the place that originally printed the instruction now records it into the buffer. In the assert_fail_msg function in the same file, which the NEMU framework's ASSERT calls for output when it aborts, call info_inst_records.
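A minimal sketch of such a ring queue, using the two function signatures given above. The capacity, the string length limit, the InstRecord type, and the output format are assumptions made for the illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RING_SIZE 16   /* assumed capacity */
#define ASM_LEN   64   /* assumed maximum length of one disassembly string */

typedef struct {
    uint64_t pc;
    char asm_code[ASM_LEN];
} InstRecord;

static InstRecord ring[RING_SIZE];
static int ring_next = 0;    /* slot the next record is written to */
static int ring_count = 0;   /* number of valid records, at most RING_SIZE */

void record_inst(uint64_t pc, const char *asm_code) {
    ring[ring_next].pc = pc;
    strncpy(ring[ring_next].asm_code, asm_code, ASM_LEN - 1);
    ring[ring_next].asm_code[ASM_LEN - 1] = '\0';
    ring_next = (ring_next + 1) % RING_SIZE;   /* overwrite the oldest once full */
    if (ring_count < RING_SIZE) ring_count++;
}

void info_inst_records(void) {
    int start = (ring_next - ring_count + RING_SIZE) % RING_SIZE;  /* oldest record */
    for (int i = 0; i < ring_count; i++) {
        const InstRecord *r = &ring[(start + i) % RING_SIZE];
        /* Mark the most recent instruction with an arrow. */
        printf("%s 0x%08llx: %s\n", i == ring_count - 1 ? "-->" : "   ",
               (unsigned long long)r->pc, r->asm_code);
    }
}
```

Because only the last RING_SIZE instructions are kept, the cost per instruction is constant and nothing is printed until something goes wrong.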
How mtrace is implemented
In paddr_read and paddr_write in a file under $NEMU_HOME/src/memory/, it is enough to log the access after the function has completed successfully.
How ftrace is implemented
First we need to understand the structure of an ELF file. Use riscv64-linux-gnu-readelf -a to get the information about the ELF file, then use the hd command to verify what we obtained; this gives a clear picture of the ELF layout, which is briefly introduced below.
Overall composition of the ELF file structure
The ELF file consists of three main parts:
- ELF Header: describes the organization of the whole file, including basic information such as the file type, target architecture, and entry point.
  The ELF header sits at the very beginning of the file and holds metadata about the whole ELF file. It has a fixed structure, 64 bytes in the 64-bit format and 52 bytes in the 32-bit format, and mainly contains the following fields:
  - Magic Number: identifies the file as ELF; the first 4 bytes are the fixed values 0x7F, E, L, F.
  - Class: the architecture class of the file, 32-bit (ELFCLASS32) or 64-bit (ELFCLASS64).
  - Data: the data encoding, that is, the byte order (little-endian LSB or big-endian MSB).
  - Version: the version of the file format, usually 1.
  - OS ABI: the ABI (Application Binary Interface) of the target operating system.
  - File type: whether this is an executable file (ET_EXEC), a relocatable file (ET_REL), or a shared library (ET_DYN).
  - Target architecture: e.g. x86 (EM_386) or ARM (EM_ARM).
  - Entry point address: the virtual address at which program execution begins.
  - Program Header table offset: the offset of the program header table.
  - Section Header table offset: the offset of the section header table.
  - Flags: processor-specific flags.
  - Header size: the size of the ELF header.
  - The size and number of entries of the Program Header table.
  - The size and number of entries of the Section Header table.
  - The index of the section header string table in the section header table.
- Program Header Table: describes how the file is mapped to segments in memory, for loading the executable at execution time.
  Each program header entry describes the attributes of one memory segment. The program header table is used at execution time to determine which segments are loaded into memory.
- Section Header Table: describes the attributes of the individual sections in the file, for linking and relocation.
  Each entry in the section header table describes one section of the file; sections matter during the linking and relocation phases. Each section has its own role, such as storing code, data, the symbol table, or relocation information. A section header entry typically contains the following information:
  - Section Name: an index into the section name table (section header string table).
  - Section Type: describes the content of the section, such as relocation entries (SHT_RELA), a symbol table (SHT_SYMTAB), or a string table (SHT_STRTAB).
  - Section Flags: describes the attributes of the section, such as writable, executable, or allocated in memory.
  - Section Address: the virtual address of the section if it is loaded into memory.
  - Section Offset: the offset of the section in the file.
  - Section Size: the size of the section in the file.
  - Link: an association with another section, e.g. the symbol table is associated with its string table.
  - Info: additional information; the meaning of this field depends on the section type.
  - Address Alignment: the alignment requirement of the section in the file or in memory.
  - Entry Size: the size of each entry if the section contains fixed-size entries (e.g. a symbol table).
  A typical ELF file contains the following sections:
  - .text: the code segment.
  - .data: initialized data.
  - .bss: uninitialized data.
  - .symtab: the symbol table, containing information about all symbols and their addresses.
  - .strtab: the string table, storing symbol names and section names.
  - . or . : the relocation table, with relocation information for the code segment.
Let us first make the following points clear:
- From the symbol table we get a function's address and the offset of its name in the string table; the function name is obtained by looking up that offset in the string table.
- The section name table, the symbol table, and the string table are all called tables, but each of them is in fact a section.
- The section headers of the section name table, symbol table, and string table are stored in the section header table. The section header table is a contiguous array whose exact location (offset) and size are given by the ELF header.
- The ELF header is located at the very beginning of the file, at offset zero.
- The information we need is marked in bold in the structure description above.
Flow for building the mapping from function names to function start addresses
- Open and read the ELF file
  Open the given ELF file with fopen. If the file cannot be opened, print an error message and terminate the program. Read the ELF header (Elf32_Ehdr) with fread to get basic information about the file, such as the offset of the section header table and the number of sections.
- Verify that the file is in ELF format
  Check the magic number field (e_ident) in the ELF header to make sure this is a valid ELF file. If the magic number does not match, print an error message and terminate the program.
- Read the section header table
  Based on the section header table offset, the number of sections, and the size of each section header (Elf32_Shdr) given in the ELF header, allocate enough memory for the section header table and read it from the file with fseek and fread.
- Read the section name table
  The section name table is a special section that stores the names of all sections. Find it in the section header table by its index (shstrtab) and read its contents.
- Find the symbol table and the string table
  Iterate over all section header entries, looking for the headers of .symtab (symbol table) and .strtab (string table), which store the symbols and the symbol names respectively. (If desired, the section type in each section header entry can be used to double-check that the right table was found.) If either section is not found, print an error message and terminate the program.
- Read the symbol table and the string table
  Read the symbol table using the offset and size in the .symtab section header, and read the symbol name string table using the offset and size in the .strtab section header.
- Add function symbols to the symbol table
  Iterate over each symbol (Elf32_Sym) in the symbol table and check whether its type is STT_FUNC (i.e. a function). If it is, call add_symbol to add the symbol's name, address, and size to the in-memory symbol table.
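The flow above can be sketched in C using the types from the Linux <elf.h> header. This is a simplified illustration, not the author's code: list_elf32_functions and read_at are hypothetical names, the sketch prints each function symbol instead of calling add_symbol, and it does not fully validate a malformed file (for example, it trusts name offsets inside the string tables).

```c
#include <elf.h>      /* Linux header providing Elf32_Ehdr, Elf32_Shdr, Elf32_Sym */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read `size` bytes at file offset `off` into a fresh buffer (NULL on failure). */
static void *read_at(FILE *fp, long off, size_t size) {
    void *buf = malloc(size ? size : 1);
    if (buf == NULL) return NULL;
    if (fseek(fp, off, SEEK_SET) != 0 || fread(buf, 1, size, fp) != size) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Prints every STT_FUNC symbol; returns their count, or -1 on error. */
int list_elf32_functions(const char *path) {
    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return -1;

    int count = -1;
    size_t nsyms = 0;
    Elf32_Ehdr eh;
    Elf32_Shdr *shdrs = NULL, *symtab = NULL, *strtab = NULL;
    Elf32_Sym *syms = NULL;
    char *shstr = NULL, *names = NULL;

    if (fread(&eh, sizeof(eh), 1, fp) != 1) goto out;
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) goto out;   /* magic number */
    if (eh.e_ident[EI_CLASS] != ELFCLASS32) goto out;         /* 32-bit only */
    if (eh.e_shentsize != sizeof(Elf32_Shdr) || eh.e_shstrndx >= eh.e_shnum) goto out;

    /* Section header table, then the section name table it points to. */
    shdrs = read_at(fp, (long)eh.e_shoff, (size_t)eh.e_shnum * sizeof(Elf32_Shdr));
    if (shdrs == NULL) goto out;
    shstr = read_at(fp, (long)shdrs[eh.e_shstrndx].sh_offset, shdrs[eh.e_shstrndx].sh_size);
    if (shstr == NULL) goto out;

    for (int i = 0; i < eh.e_shnum; i++) {
        const char *name = shstr + shdrs[i].sh_name;
        if (shdrs[i].sh_type == SHT_SYMTAB && strcmp(name, ".symtab") == 0) symtab = &shdrs[i];
        if (shdrs[i].sh_type == SHT_STRTAB && strcmp(name, ".strtab") == 0) strtab = &shdrs[i];
    }
    if (symtab == NULL || strtab == NULL) goto out;

    syms = read_at(fp, (long)symtab->sh_offset, symtab->sh_size);
    names = read_at(fp, (long)strtab->sh_offset, strtab->sh_size);
    if (syms == NULL || names == NULL) goto out;

    count = 0;
    nsyms = symtab->sh_size / sizeof(Elf32_Sym);
    for (size_t i = 0; i < nsyms; i++) {
        if (ELF32_ST_TYPE(syms[i].st_info) != STT_FUNC) continue;
        printf("%08x %6u %s\n", (unsigned)syms[i].st_value,
               (unsigned)syms[i].st_size, names + syms[i].st_name);
        count++;
    }

out:
    free(names); free(syms); free(shstr); free(shdrs);
    fclose(fp);
    return count;
}
```

Checking the section type as well as the section name corresponds to the optional double-check mentioned in the flow.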
Implementing ftrace
- Define a function const char *find_symbol(uint64_t addr) that looks up, in the saved mapping table, the function corresponding to a memory address.
- Instructions that perform function jumps: jal jumps directly to the target address for a function call. Its target is given by an immediate, an offset already known at compile time, so it is a direct call. For jalr the target address comes from a register, which can only be determined at run time, so it is an indirect call. Also, since it is not known at compile time which address a function will return to, the return address (the address of the instruction after the call) has to be saved in the return address register ra on each call, and the function then returns with jalr x0, 0(ra).
- So we know: jal is a direct call; jalr with an immediate of 0, rs1 equal to the return address register ra, and rd equal to x0 is a function return; any other jalr is an indirect call.
How to implement DiffTest
Check whether the register values in ref_r are the same as those in the global variable cpu; do not forget the pc register.
Input-output problem solving
The relationship between AM and NEMU is that NEMU emulates a set of peripheral interfaces by calling C library functions, so we need to read the files under $NEMU_HOME/src/device to see how NEMU does MMIO; then, in AM, we can read and write the right information for NEMU to process.
What is volatile used for?
For example, suppose p points to a register of a hardware device. Without the volatile qualifier, the compiler may assume that the value of this register is not changed from outside while the program runs, so it may optimize repeated reads through p into a single read, cache the value in a CPU register, and never read from that address again. This is a reasonable optimization for ordinary variables, but it causes problems for device registers: the value of a device register can change at any time (for example, when the hardware device updates its state), so the value cached by the compiler is no longer up to date and the program reads the device state incorrectly and misbehaves.
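A minimal sketch of the polling pattern this describes. The function wait_until_ready, the status register, and the poll limit are hypothetical; only the volatile qualifier on the pointer is the point of the example.

```c
#include <stdint.h>

/* Spin until the (hypothetical) status register reads `ready`, or until
 * `max_polls` reads have been made. Returns 1 on success, 0 on timeout. */
int wait_until_ready(volatile uint8_t *status, uint8_t ready, unsigned max_polls) {
    for (unsigned i = 0; i < max_polls; i++) {
        /* Because of volatile, the compiler must emit a real load here on
         * every iteration instead of reusing a value it read earlier. */
        if (*status == ready) return 1;
    }
    return 0;
}
```

If the parameter were a plain uint8_t *, an optimizing compiler would be free to read *status once before the loop and compare against that single value on every iteration.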
How to implement the clock
Reading the NEMU source code shows that the NEMU emulation obtains the current time and stores it in the 16 bytes at rtc_port_base. Its rtc_io_handler also shows that NEMU fetches the latest time only when the high part is read. So in our implementation we always read the high part first, in order to refresh the device's cached value and get the latest time.
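The read order can be sketched as below. The register layout (a 64-bit microsecond counter split into a low word at index 0 and a high word at index 1) and the name read_uptime_us are assumptions for the illustration; check the actual layout against the NEMU source.

```c
#include <stdint.h>

/* `rtc` points to two 32-bit registers: rtc[0] is the low half and rtc[1]
 * the high half of a 64-bit counter (layout assumed for this sketch). */
uint64_t read_uptime_us(volatile uint32_t *rtc) {
    uint32_t hi = rtc[1];   /* read first: this access makes the device refresh the time */
    uint32_t lo = rtc[0];   /* then read the low half of the same snapshot */
    return ((uint64_t)hi << 32) | lo;
}
```

Reading the low half first would combine a stale low word with a freshly updated high word.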
How to implement malloc
Make sure the addresses you hand out are aligned. Since my computer is 64-bit I use 8-byte alignment, that is, the size of each allocation is rounded up to a multiple of 8.
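A minimal bump-allocator sketch of this rounding rule. The static heap_area, its 4096-byte size, and the name bump_malloc are assumptions for the illustration; in AM the heap bounds come from the runtime rather than from a static array.

```c
#include <stddef.h>
#include <stdint.h>

#define ALIGNMENT 8u
/* Round x up to the next multiple of a (a must be a power of two). */
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((uintptr_t)(a) - 1))

/* Hypothetical static heap, itself aligned to 8 bytes (C11 _Alignas). */
static _Alignas(8) unsigned char heap_area[4096];
static size_t heap_brk = 0;   /* offset of the first free byte */

void *bump_malloc(size_t size) {
    size_t need = (size_t)ROUNDUP((uintptr_t)size, ALIGNMENT);
    /* Reject wrap-around in the rounding and requests that do not fit. */
    if (need < size || need > sizeof(heap_area) - heap_brk) return NULL;
    void *p = &heap_area[heap_brk];
    heap_brk += need;   /* stays a multiple of 8, so the next block is aligned too */
    return p;
}
```

Because every size is rounded up to a multiple of 8 and the heap starts aligned, every returned address is 8-byte aligned without any further bookkeeping.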
How to implement DTRACE
From the documentation we know that NEMU performs MMIO operations through map_read and map_write, so it is enough to log in these two functions.
How to implement the keyboard
Reading the send_key function in the NEMU source shows that the message for each keystroke is the key's code from keymap combined with a key-down mask by bitwise OR. In AM we parse out these two pieces of information and write them to kbd.
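The parsing step can be sketched as follows. The mask value 0x8000, the KeyEvent type, and the name decode_key are assumptions for the illustration; take the real mask from the NEMU source and the real record type from AM.

```c
#include <stdbool.h>
#include <stdint.h>

#define KEYDOWN_MASK 0x8000u   /* assumed mask bit; check it against the NEMU source */

typedef struct {   /* hypothetical stand-in for AM's keyboard record */
    bool keydown;
    int keycode;
} KeyEvent;

KeyEvent decode_key(uint32_t raw) {
    KeyEvent ev;
    ev.keydown = (raw & KEYDOWN_MASK) != 0;    /* the mask bit: press or release */
    ev.keycode = (int)(raw & ~KEYDOWN_MASK);   /* the remaining bits: which key */
    return ev;
}
```

Since the two fields were combined with a bitwise OR, a bitwise AND with the mask and with its complement separates them again.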
How to implement VGA
Reading the NEMU source shows the order in which the width and height of the display are stored at vgactl_port_base. In AM we need to extract this information and record it in the config, because the test programs in am-test need it.
__am_gpu_fbdraw has to write the pixel data supplied in ctl to the VGA's registers correctly. This is a simple two-dimensional array mapping: with x, y as the upper-left corner of the rectangle, write the image data of a w*h rectangle. If the control information also requests synchronization, we must write that to the VGA's registers promptly as well.
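The two-dimensional mapping can be sketched as below, with a plain array standing in for the framebuffer. The function name fb_draw, its parameters, and the clipping of out-of-screen pixels are assumptions for the illustration; the synchronization step is not shown.

```c
#include <stdint.h>

/* Copy a w*h block of pixels into a row-major framebuffer of size
 * screen_w*screen_h, with (x, y) as the upper-left corner of the block.
 * Pixels that fall outside the screen are skipped. */
void fb_draw(uint32_t *fb, int screen_w, int screen_h,
             int x, int y, const uint32_t *pixels, int w, int h) {
    for (int row = 0; row < h; row++) {
        int fy = y + row;
        if (fy < 0 || fy >= screen_h) continue;
        for (int col = 0; col < w; col++) {
            int fx = x + col;
            if (fx < 0 || fx >= screen_w) continue;
            /* Row-major mapping: screen pixel (fx, fy) lives at fy * screen_w + fx. */
            fb[fy * screen_w + fx] = pixels[row * w + col];
        }
    }
}
```

Note that the source block is indexed with its own width w, while the destination is indexed with the screen width.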
How to implement a sound card
Since NEMU uses the SDL library to emulate the sound card, every time the sound card is configured we also need to initialize SDL again with the configuration that was just written (because the configuration may change) and set its callback function. The SDL callback defines how we handle the audio data supplied through the registers. Since the buffer is limited, data is written into it as a circular queue, so the callback has to read the data from the buffer and write it to the stream correctly.
In AM's __am_audio_play, we need to maintain correctly the circular queue buffer that we write data into.
Source code
/* Some representative excerpts; this is not a complete listing */
INSTPAT_START();
INSTPAT("0000000 ????? ????? 000 ????? 01100 11", add , R, R(rd) = (int32_t)((int32_t)src1 + (int32_t)src2));
INSTPAT("0000001 ????? ????? 000 ????? 01100 11", mul , R, R(rd) = (int32_t)((int64_t)(int32_t)src1 * (int64_t)(int32_t)src2));
INSTPAT("0000001 ????? ????? 001 ????? 01100 11", mulh , R, R(rd) = (int32_t)(((int64_t)(int32_t)src1 * (int64_t)(int32_t)src2) >> 32));
INSTPAT("0000001 ????? ????? 010 ????? 01100 11", mulhsu , R, R(rd) = ((int64_t)(int32_t)src1 * (uint64_t)(uint32_t)src2) >> 32);
INSTPAT("0000001 ????? ????? 011 ????? 01100 11", mulhu , R, R(rd) = (uint32_t)(((uint64_t)(uint32_t)src1 * (uint64_t)(uint32_t)src2) >> 32));
INSTPAT("0100000 ????? ????? 000 ????? 01100 11", sub , R, R(rd) = src1 - src2);
INSTPAT("0000000 ????? ????? 010 ????? 01100 11", slt , R, R(rd) = (int32_t)src1 < (int32_t)src2);
INSTPAT("0000000 ????? ????? 011 ????? 01100 11", sltu , R, R(rd) = (uint32_t)src1 < (uint32_t)src2);
INSTPAT("0000000 ????? ????? 100 ????? 01100 11", xor , R, R(rd) = src1 ^ src2);
INSTPAT("0000001 ????? ????? 100 ????? 01100 11", div , R, R(rd) = src2 ? (src1 == (1 << 31) && src2 == -1) ? (1 << 31) : (int32_t)((int32_t)src1 / (int32_t)src2) : -1);
INSTPAT("0000001 ????? ????? 101 ????? 01100 11", divu , R, R(rd) = src2 ? (uint32_t)((uint32_t)src1 / (uint32_t)src2) : (uint32_t)((1ll << 32) - 1));
INSTPAT("0000000 ????? ????? 001 ????? 01100 11", sll , R, R(rd) = (uint32_t)src1 << BITS(src2, 4, 0));
INSTPAT("0000000 ????? ????? 101 ????? 01100 11", srl , R, R(rd) = (uint32_t)src1 >> BITS(src2, 4, 0));
INSTPAT("0100000 ????? ????? 101 ????? 01100 11", sra , R, R(rd) = (int32_t)src1 >> BITS(src2, 4, 0));
INSTPAT("0000000 ????? ????? 110 ????? 01100 11", or , R, R(rd) = src1 | src2);
INSTPAT("0000001 ????? ????? 110 ????? 01100 11", rem , R, R(rd) = src2 ? (src1 == (1 << 31) && src2 == -1) ? 0 : (int32_t)src1 % (int32_t)src2 : src1);
INSTPAT("0000001 ????? ????? 111 ????? 01100 11", remu , R, R(rd) = src2 ? (uint32_t)src1 % (uint32_t)src2 : src1);
INSTPAT("0000000 ????? ????? 111 ????? 01100 11", and , R, R(rd) = src1 & src2);
INSTPAT("??????? ????? ????? 000 ????? 00000 11", lb , I, R(rd) = SEXT(Mr((uint32_t)((int32_t)src1 + (int32_t)imm), 1), 8));
INSTPAT("??????? ????? ????? 001 ????? 00000 11", lh , I, R(rd) = SEXT(Mr((uint32_t)((int32_t)src1 + (int32_t)imm), 2), 16));
INSTPAT("??????? ????? ????? 010 ????? 00000 11", lw , I, R(rd) = SEXT(Mr((uint32_t)((int32_t)src1 + (int32_t)imm), 4), 32));
INSTPAT("??????? ????? ????? 100 ????? 00000 11", lbu , I, R(rd) = Mr((uint32_t)((int32_t)src1 + (int32_t)imm), 1));
INSTPAT("??????? ????? ????? 101 ????? 00000 11", lhu , I, R(rd) = Mr((uint32_t)((int32_t)src1 + (int32_t)imm), 2));
INSTPAT("??????? ????? ????? 000 ????? 00100 11", addi , I, R(rd) = (int32_t)((int32_t)src1 + (int32_t)imm)); // mv
INSTPAT("??????? ????? ????? 010 ????? 00100 11", slti , I, R(rd) = (int32_t)src1 < (int32_t)imm);
INSTPAT("??????? ????? ????? 011 ????? 00100 11", sltiu , I, R(rd) = (uint32_t)src1 < (uint32_t)imm);
INSTPAT("??????? ????? ????? 100 ????? 00100 11", xori , I, R(rd) = src1 ^ imm);
INSTPAT("000000? ????? ????? 001 ????? 00100 11", slli , I, R(rd) = (uint32_t)src1 << BITS(imm, 4, 0));
INSTPAT("000000? ????? ????? 101 ????? 00100 11", srli , I, R(rd) = (uint32_t)src1 >> BITS(imm, 4, 0));
INSTPAT("010000? ????? ????? 101 ????? 00100 11", srai , I, R(rd) = (int32_t)src1 >> BITS(imm, 4, 0));
INSTPAT("??????? ????? ????? 110 ????? 00100 11", ori , I, R(rd) = src1 | imm);
INSTPAT("??????? ????? ????? 111 ????? 00100 11", andi , I, R(rd) = src1 & imm);
// INSTPAT("??????? ????? ????? 000 ????? 00011 11", fence , I, R(rd) = src1 & imm);
INSTPAT("??????? ????? ????? 000 ????? 11001 11", jalr , I, R(rd) = s->snpc; s->dnpc = (uint32_t)((int32_t)src1 + (int32_t)imm) & -2ull; check_jalr(rd, imm, s)); // ret
INSTPAT("??????? ????? ????? 000 ????? 01000 11", sb , S, Mw((uint32_t)((int32_t)src1 + (int32_t)imm), 1, src2));
INSTPAT("??????? ????? ????? 001 ????? 01000 11", sh , S, Mw((uint32_t)((int32_t)src1 + (int32_t)imm), 2, src2));
INSTPAT("??????? ????? ????? 010 ????? 01000 11", sw , S, Mw((uint32_t)((int32_t)src1 + (int32_t)imm), 4, src2));
INSTPAT("??????? ????? ????? 000 ????? 11000 11", beq , B, s->dnpc = src1 == src2 ? (uint32_t)((int32_t)s->pc + (int32_t)imm) : s->dnpc);
INSTPAT("??????? ????? ????? 001 ????? 11000 11", bne , B, s->dnpc = src1 != src2 ? (uint32_t)((int32_t)s->pc + (int32_t)imm) : s->dnpc);
INSTPAT("??????? ????? ????? 100 ????? 11000 11", blt , B, s->dnpc = (int32_t)src1 < (int32_t)src2 ? (uint32_t)((int32_t)s->pc + (int32_t)imm) : s->dnpc);
INSTPAT("??????? ????? ????? 101 ????? 11000 11", bge , B, s->dnpc = (int32_t)src1 >= (int32_t)src2 ? (uint32_t)((int32_t)s->pc + (int32_t)imm) : s->dnpc);
INSTPAT("??????? ????? ????? 110 ????? 11000 11", bltu , B, s->dnpc = (uint32_t)src1 < (uint32_t)src2 ? (uint32_t)((int32_t)s->pc + (int32_t)imm) : s->dnpc);
INSTPAT("??????? ????? ????? 111 ????? 11000 11", bgeu , B, s->dnpc = (uint32_t)src1 >= (uint32_t)src2 ? (uint32_t)((int32_t)s->pc + (int32_t)imm) : s->dnpc);
INSTPAT("??????? ????? ????? ??? ????? 00101 11", auipc , U, R(rd) = s->pc + imm);
INSTPAT("??????? ????? ????? ??? ????? 01101 11", lui , U, R(rd) = imm);
INSTPAT("??????? ????? ????? ??? ????? 11011 11", jal , J, R(rd) = s->snpc; s->dnpc = s->pc + imm; check_jal(s));
INSTPAT("0000000 00001 00000 000 00000 11100 11", ebreak , N, NEMUTRAP(s->pc, R(10))); // R(10) is $a0
INSTPAT("??????? ????? ????? ??? ????? ????? ??", inv , N, INV(s->pc));
void check_jal(Decode *s) {
  if (elf_loaded) {
    const char *func_name = find_symbol(s->dnpc);
    if (func_name != NULL) {
      call_depth++;
      log_write("F\t%#010x\t%*sCall to %s @ %#010x\n", s->pc, call_depth * 2, "", func_name, s->dnpc);
    }
  }
}
void check_jalr(word_t rd, word_t imm, Decode *s) {
  uint32_t i = s->;
  int rs1 = BITS(i, 19, 15);
  if (elf_loaded) {
    if (rd == 0 && rs1 == 1 && imm == 0) {
      // Function return: jalr x0, 0(ra)
      const char *func_name = find_symbol(s->pc);
      if (func_name != NULL) {
        log_write("F\t%#010x\t%*sReturn from %s\n", s->pc, call_depth * 2, "", func_name);
      }
      if (call_depth > 0) call_depth--;
    }
    else {
      // Indirect function call
      const char *func_name = find_symbol(s->dnpc);
      if (func_name != NULL) {
        call_depth++;
        log_write("F\t%#010x\t%*sIndirect call to %s @ %#010x\n", s->pc, call_depth * 2, "", func_name, s->dnpc);
      }
    }
  }
}
/* Some representative excerpts; this is not a complete listing */
int vsprintf(char *out, const char *fmt, va_list ap) {
char* buf = out;
int total_len = 0;
while (*fmt != '\0') {
if (*fmt != '%') {
*buf++ = *fmt++;
total_len++;
} else {
fmt++;
// Parse flags
int flags = 0;
int width = 0;
int precision = -1;
// int length = 0;
int specifier = 0;
while (*fmt == '-' || *fmt == '+' || *fmt == ' ' || *fmt == '#' || *fmt == '0') {
switch (*fmt) {
case '#': flags |= 0x02; break; // Hash
case '+': flags |= 0x04; break; // Plus sign
case ' ': flags |= 0x08; break; // Space
case '-': flags |= 0x10; break; // Left align
case '0': flags |= 0x20; break; // Zero padding
}
fmt++;
}
// Parse width
if (*fmt == '*') {
width = va_arg(ap, int);
fmt++;
} else {
while (*fmt >= '0' && *fmt <= '9') {
width = width * 10 + (*fmt++ - '0');
}
}
// Parse precision
if (*fmt == '.') {
fmt++;
precision = 0;
if (*fmt == '*') {
precision = va_arg(ap, int);
fmt++;
} else {
while (*fmt >= '0' && *fmt <= '9') {
precision = precision * 10 + (*fmt++ - '0');
}
}
}
// // Parse length modifier
// if (*fmt == 'h' || *fmt == 'l') {
// length = *fmt++;
// }
// Parse specifier
specifier = *fmt++;
// Handle specifier
if (specifier == 'd' || specifier == 'i') {
int value = va_arg(ap, int);
char temp[65];
int len = int_to_str(value, temp, 10, 0, width, precision > 0 ? precision : 0, flags);
for (int i = 0; i < len; i++) {
*buf++ = temp[i];
}
total_len += len;
} else if (specifier == 'o') {
unsigned int value = va_arg(ap, unsigned int);
char temp[65];
int len = int_to_str(value, temp, 8, 1, width, precision > 0 ? precision : 0, flags);
for (int i = 0; i < len; i++) {
*buf++ = temp[i];
}
total_len += len;
} else if (specifier == 'u') {
unsigned int value = va_arg(ap, unsigned int);
char temp[65];
int len = int_to_str(value, temp, 10, 1, width, precision > 0 ? precision : 0, flags);
for (int i = 0; i < len; i++) {
*buf++ = temp[i];
}
total_len += len;
} else if (specifier == 'x' || specifier == 'X') {
unsigned int value = va_arg(ap, unsigned int);
char temp[65];
if (specifier == 'X') flags |= 0x01; // Uppercase
int len = int_to_str(value, temp, 16, 1, width, precision > 0 ? precision : 0, flags);
for (int i = 0; i < len; i++) {
*buf++ = temp[i];
}
total_len += len;
} else if (specifier == 'c') {
char c = (char)va_arg(ap, int);
*buf++ = c;
total_len++;
} else if (specifier == 's') {
char *str = va_arg(ap, char*);
int len = strlen(str);
if (precision >= 0 && precision < len) len = precision; // precision can only shorten the string
if (width > len && !(flags & 0x10)) { // Right align
for (int i = 0; i < width - len; i++) {
*buf++ = ' ';
total_len++;
}
}
for (int i = 0; i < len && str[i] != '\0'; i++) {
*buf++ = str[i];
total_len++;
}
if (width > len && (flags & 0x10)) { // Left align
for (int i = 0; i < width - len; i++) {
*buf++ = ' ';
total_len++;
}
}
} else if (specifier == 'p') {
void *ptr = va_arg(ap, void*);
unsigned long value = (unsigned long)ptr;
char temp[65];
flags |= 0x02; // Force '0x'
int len = int_to_str(value, temp, 16, 1, width, precision > 0 ? precision : 0, flags);
for (int i = 0; i < len; i++) {
*buf++ = temp[i];
}
total_len += len;
} else if (specifier == 'f' || specifier == 'F' || specifier == 'e' || specifier == 'E' || specifier == 'g' || specifier == 'G') {
// Floating-point conversions are not supported
assert(0);
// double value = va_arg(ap, double);
// char temp[320];
// int len = double_to_str(value, temp, precision >= 0 ? precision : 6, specifier, flags, width);
// for (int i = 0; i < len; i++) {
// *buf++ = temp[i];
// }
// total_len += len;
} else if (specifier == '%') {
*buf++ = '%';
total_len++;
} else if (specifier == 'n') {
int *ptr = va_arg(ap, int*);
*ptr = total_len;
} else {
// Invalid specifier
assert(0);
}
}
}
*buf = '\0';
return buf - out;
}
int sprintf(char *out, const char *fmt, ...) {
va_list args;
va_start(args, fmt);
int total_len = vsprintf(out, fmt, args);
va_end(args);
return total_len;
}
int snprintf(char *out, size_t n, const char *fmt, ...) {
panic("Not implemented");
}
int vsnprintf(char *out, size_t n, const char *fmt, va_list ap) {
panic("Not implemented");
}
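`snprintf` and `vsnprintf` are left unimplemented above. C99 requires them to store at most `n - 1` characters, to terminate the output whenever `n > 0`, and to return the length the full output would have had. One way to get there is to route every `*buf++ = c` in `vsprintf` through a bounded writer. This is a minimal sketch of that idea only; `BoundedOut`, `bounded_putc` and `bounded_finish` are hypothetical names.

```c
#include <stddef.h>

typedef struct {
  char *out;    /* destination buffer */
  size_t cap;   /* capacity of out, including the terminating '\0' */
  size_t len;   /* characters produced so far, whether stored or not */
} BoundedOut;

/* Store c only while there is still room for it and the terminator. */
static void bounded_putc(BoundedOut *b, char c) {
  if (b->len + 1 < b->cap) b->out[b->len] = c;
  b->len++;
}

/* Terminate the buffer and return the untruncated length. */
static size_t bounded_finish(BoundedOut *b) {
  if (b->cap > 0) {
    size_t end = b->len < b->cap ? b->len : b->cap - 1;
    b->out[end] = '\0';
  }
  return b->len;
}
```

Writing "hello" into a 4-byte buffer this way leaves "hel" in the buffer and reports a length of 5, which lets the caller detect truncation.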
static int int_to_str(int value, char *buffer, int base, int is_unsigned, int width, int precision, int flags) {
  // Characters are generated in reverse order into temp and flipped at the end.
  // NOTE: temp and the caller's buffer hold 65 bytes, so width and precision must stay below that.
  char temp[65];
  int i = 0;
  unsigned int uvalue = value;
  int is_negative = 0;
  if (!is_unsigned && value < 0) {
    is_negative = 1;
    uvalue = 0u - (unsigned int)value; // negate as unsigned: -INT_MIN overflows in signed arithmetic
  }
  if (uvalue == 0) {
    temp[i++] = '0';
  } else {
    while (uvalue != 0) {
      int digit = uvalue % base;
      if (digit < 10) {
        temp[i++] = digit + '0';
      } else if (flags & 0x01) { // Uppercase for X
        temp[i++] = digit - 10 + 'A';
      } else {
        temp[i++] = digit - 10 + 'a';
      }
      uvalue /= base;
    }
  }
  // Handle precision (minimum number of digits)
  while (i < precision) {
    temp[i++] = '0';
  }
  // Decide on the sign character and the '#' prefix first, so padding can account for them
  char sign = 0;
  if (is_negative) sign = '-';
  else if (flags & 0x04) sign = '+'; // '+' flag
  else if (flags & 0x08) sign = ' '; // ' ' flag
  int prefix_len = 0;
  if (flags & 0x02) {
    if (base == 8) prefix_len = (temp[i - 1] != '0'); // a leading zero is enough for octal
    else if (base == 16) prefix_len = 2;              // "0x" or "0X"
    else assert(0);
  }
  int padding = width - i - prefix_len - (sign != 0);
  if (padding < 0) padding = 0;
  int left_align = flags & 0x10;
  // Zero padding goes between the sign/prefix and the digits
  if ((flags & 0x20) && !left_align) {
    while (padding > 0) { temp[i++] = '0'; padding--; }
  }
  // The prefix is stored reversed because temp is reversed below
  if (prefix_len == 1) {
    temp[i++] = '0';
  } else if (prefix_len == 2) {
    temp[i++] = (flags & 0x01) ? 'X' : 'x';
    temp[i++] = '0';
  }
  if (sign) temp[i++] = sign;
  // Space padding for right alignment goes in front of everything
  if (!left_align) {
    while (padding > 0) { temp[i++] = ' '; padding--; }
  }
  reverse_str(temp, i);
  // Copy to buffer
  for (int j = 0; j < i; j++) {
    *buffer++ = temp[j];
  }
  // Right padding if left align; it counts toward the returned length
  while (padding > 0) { *buffer++ = ' '; i++; padding--; }
  return i;
}
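`int_to_str` calls `reverse_str`, which the excerpt does not show. A minimal sketch, assuming it reverses the first `len` bytes in place; the signature is inferred from the call `reverse_str(temp, i)` and may differ from the actual helper.

```c
/* Reverse the first len bytes of str in place (assumed behavior). */
static void reverse_str(char *str, int len) {
  for (int lo = 0, hi = len - 1; lo < hi; lo++, hi--) {
    char tmp = str[lo];
    str[lo] = str[hi];
    str[hi] = tmp;
  }
}
```

The digits are produced least significant first, so the buffer `"321-"` becomes `"-123"` after the call.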
/* Representative excerpt, not the complete code */
void record_inst(uint64_t pc, const char *asm_code) {
inst_buf[inst_buf_pos].pc = pc;
// Safely copy the assembly code to prevent buffer overflow
strncpy(inst_buf[inst_buf_pos].asm_code, asm_code, sizeof(inst_buf[inst_buf_pos].asm_code) - 1);
inst_buf[inst_buf_pos].asm_code[sizeof(inst_buf[inst_buf_pos].asm_code) - 1] = '\0'; // Ensure null-termination
inst_buf_pos = (inst_buf_pos + 1) % INST_BUF_SIZE; // Update write position to create a ring buffer
if (inst_buf_count < INST_BUF_SIZE) {
inst_buf_count++; // Increment the count of recorded instructions
}
}
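`record_inst` only writes into the ring buffer. To print it when something goes wrong, the entries have to be read back starting from the oldest one. The sketch below repeats the recording logic so that it compiles on its own; the `InstEntry` type, the 128-byte text field, the tiny `INST_BUF_SIZE` of 4, and the helpers `nth_oldest` and `dump_inst_buf` are assumptions for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define INST_BUF_SIZE 4              /* assumed capacity, kept small on purpose */

typedef struct {
  uint64_t pc;
  char asm_code[128];                /* assumed size */
} InstEntry;

static InstEntry inst_buf[INST_BUF_SIZE];
static int inst_buf_pos = 0;         /* next slot to write */
static int inst_buf_count = 0;       /* number of valid entries */

/* Same logic as record_inst above, repeated so the sketch stands alone. */
void record_inst(uint64_t pc, const char *asm_code) {
  InstEntry *e = &inst_buf[inst_buf_pos];
  e->pc = pc;
  strncpy(e->asm_code, asm_code, sizeof(e->asm_code) - 1);
  e->asm_code[sizeof(e->asm_code) - 1] = '\0';
  inst_buf_pos = (inst_buf_pos + 1) % INST_BUF_SIZE;
  if (inst_buf_count < INST_BUF_SIZE) inst_buf_count++;
}

/* Index of the k-th oldest entry (k = 0 is the oldest one still stored). */
int nth_oldest(int k) {
  int start = (inst_buf_pos - inst_buf_count + INST_BUF_SIZE) % INST_BUF_SIZE;
  return (start + k) % INST_BUF_SIZE;
}

/* Print the buffer oldest first and mark the most recent instruction. */
void dump_inst_buf(void) {
  for (int k = 0; k < inst_buf_count; k++) {
    const InstEntry *e = &inst_buf[nth_oldest(k)];
    printf("%s 0x%08llx: %s\n", k == inst_buf_count - 1 ? "-->" : "   ",
           (unsigned long long)e->pc, e->asm_code);
  }
}
```

Until the buffer is full the oldest entry sits at index 0; once it wraps, the oldest entry is the one at the write position.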
/* */
#include <elf.h>
typedef struct {
char *name;
uint64_t addr;
uint64_t size;
} FunctionSymbol;
static FunctionSymbol *symbol_table = NULL;
static int symbol_count = 0;
static int symbol_capacity = 0;
bool elf_loaded = false;
void add_symbol(const char *name, uint64_t addr, uint64_t size) {
if (symbol_count >= symbol_capacity) {
symbol_capacity = symbol_capacity == 0 ? 1024 : symbol_capacity * 2;
symbol_table = realloc(symbol_table, symbol_capacity * sizeof(FunctionSymbol));
}
symbol_table[symbol_count].name = strdup(name);
symbol_table[symbol_count].addr = addr;
symbol_table[symbol_count].size = size;
symbol_count++;
}
const char *find_symbol(uint64_t addr) {
if (!elf_loaded) return NULL;
for (int i = 0; i < symbol_count; i++) {
if (addr >= symbol_table[i].addr && addr < symbol_table[i].addr + symbol_table[i].size) {
return symbol_table[i].name;
}
}
return NULL;
}
void init_elf(const char *elf_file) {
if (elf_file == NULL) {
elf_loaded = false;
return;
}
FILE *fp = fopen(elf_file, "rb");
Assert(fp, "Cannot open '%s'", elf_file);
elf_loaded = true;
// Read ELF header
Elf32_Ehdr ehdr;
if (fread(&ehdr, sizeof(Elf32_Ehdr), 1, fp) != 1) {
printf("Failed to read ELF header\n");
exit(1);
}
// Verify ELF magic number
if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0) {
printf("Not an ELF file\n");
exit(1);
}
// Read section headers
Elf32_Shdr *shdrs = malloc(ehdr.e_shentsize * ehdr.e_shnum);
fseek(fp, ehdr.e_shoff, SEEK_SET);
if (fread(shdrs, ehdr.e_shentsize, ehdr.e_shnum, fp) != ehdr.e_shnum) {
printf("Failed to read section headers\n");
exit(1);
}
// Read section header string table
Elf32_Shdr shstr_shdr = shdrs[ehdr.e_shstrndx];
char *shstrtab = malloc(shstr_shdr.sh_size);
fseek(fp, shstr_shdr.sh_offset, SEEK_SET);
if (fread(shstrtab, shstr_shdr.sh_size, 1, fp) != 1) {
printf("Failed to read section header string table\n");
exit(1);
}
// Find .symtab and .strtab sections
Elf32_Shdr *symtab_shdr = NULL;
Elf32_Shdr *strtab_shdr = NULL;
for (int i = 0; i < ehdr.e_shnum; i++) {
char *section_name = &shstrtab[shdrs[i].sh_name];
if (shdrs[i].sh_type == SHT_SYMTAB && strcmp(section_name, ".symtab") == 0) {
symtab_shdr = &shdrs[i];
} else if (shdrs[i].sh_type == SHT_STRTAB && strcmp(section_name, ".strtab") == 0) {
strtab_shdr = &shdrs[i];
}
}
if (symtab_shdr == NULL || strtab_shdr == NULL) {
printf("Failed to find .symtab or .strtab in ELF file\n");
exit(1);
}
// Read symbol table
int sym_count = symtab_shdr->sh_size / symtab_shdr->sh_entsize;
Elf32_Sym *symtab = malloc(symtab_shdr->sh_size);
fseek(fp, symtab_shdr->sh_offset, SEEK_SET);
if (fread(symtab, symtab_shdr->sh_entsize, sym_count, fp) != sym_count) {
printf("Failed to read symbol table\n");
exit(1);
}
// Read string table
char *strtab = malloc(strtab_shdr->sh_size);
fseek(fp, strtab_shdr->sh_offset, SEEK_SET);
if (fread(strtab, strtab_shdr->sh_size, 1, fp) != 1) {
printf("Failed to read string table\n");
exit(1);
}
// Store function symbols
for (int i = 0; i < sym_count; i++) {
Elf32_Sym sym = symtab[i];
char *name = &strtab[sym.st_name];
if (ELF32_ST_TYPE(sym.st_info) == STT_FUNC) {
add_symbol(name, sym.st_value, sym.st_size);
}
}
// Free allocated memory
free(shdrs);
free(shstrtab);
free(symtab);
free(strtab);
fclose(fp);
}
/* Representative excerpt, not the complete code */
void __am_gpu_config(AM_GPU_CONFIG_T *cfg) {
uint32_t data = inl(VGACTL_ADDR);
int width = (data >> 16) & 0xffff;
int height = data & 0xffff;
int vmemsz = width * height * sizeof(uint32_t);
*cfg = (AM_GPU_CONFIG_T) {
.present = true, .has_accel = false,
.width = width, .height = height,
.vmemsz = vmemsz
};
}
void __am_gpu_fbdraw(AM_GPU_FBDRAW_T *ctl) {
int x = ctl->x, y = ctl->y, w = ctl->w, h = ctl->h;
int width = inl(VGACTL_ADDR) >> 16, height = inl(VGACTL_ADDR) & 0xffff;
uint32_t *pixels = ctl->pixels;
uint32_t *fb = (uint32_t *)(uintptr_t)FB_ADDR;
for (int i = y; i < y + h && i < height; i++) {
for (int j = x; j < x + w && j < width; j++) {
fb[i * width + j] = pixels[(i - y) * w + (j - x)];
}
}
if (ctl->sync) {
outl(SYNC_ADDR, 1);
}
}
/* am/: representative excerpt, not the complete code */
void __am_audio_play(AM_AUDIO_PLAY_T *ctl) {
uint8_t *audio_data = (ctl->buf).start;
uint32_t sbuf_size = inl(AUDIO_SBUF_SIZE_ADDR);
uint32_t len = (ctl->buf).end - (ctl->buf).start;
uint8_t *ab = (uint8_t *)(uintptr_t)AUDIO_SBUF_ADDR;
for(int i = 0; i < len; i++){
ab[sbuf_pos] = audio_data[i];
sbuf_pos = (sbuf_pos + 1) % sbuf_size;
}
outl(AUDIO_COUNT_ADDR, inl(AUDIO_COUNT_ADDR) + len);
}
/* nemu/: representative excerpt, not the complete code */
void init_sound();
static void audio_io_handler(uint32_t offset, int len, bool is_write) {
if(audio_base[reg_init] == 1){
init_sound();
audio_base[reg_init] = 0;
}
}
void sdl_audio_callback(void *userdata, uint8_t *stream, int len){
SDL_memset(stream, 0, len);
uint32_t used_cnt = audio_base[reg_count];
len = len > used_cnt ? used_cnt : len;
uint32_t sbuf_size = audio_base[reg_sbuf_size];
if( (sbuf_pos + len) > sbuf_size ){
SDL_MixAudio(stream, sbuf + sbuf_pos, sbuf_size - sbuf_pos , SDL_MIX_MAXVOLUME);
SDL_MixAudio(stream + (sbuf_size - sbuf_pos), sbuf, len - (sbuf_size - sbuf_pos), SDL_MIX_MAXVOLUME);
}
else
SDL_MixAudio(stream, sbuf + sbuf_pos, len , SDL_MIX_MAXVOLUME);
sbuf_pos = (sbuf_pos + len) % sbuf_size;
audio_base[reg_count] -= len;
}
void init_sound() {
SDL_AudioSpec s = {};
s.format = AUDIO_S16SYS;
s.userdata = NULL;
s.freq = audio_base[reg_freq];
s.channels = audio_base[reg_channels];
s.samples = audio_base[reg_samples];
s.callback = sdl_audio_callback;
SDL_InitSubSystem(SDL_INIT_AUDIO);
SDL_OpenAudio(&s, NULL);
SDL_PauseAudio(0);
}
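Both sides of the audio device treat the stream buffer as a ring: the AM side advances a write position modulo the buffer size, and the SDL callback reads in up to two pieces when the data wraps past the end. The wrap-around read can be tried in isolation. This is a minimal sketch that uses `memcpy` in place of `SDL_MixAudio`; `ring_read` is a hypothetical helper and it assumes `len <= cap`.

```c
#include <stdint.h>
#include <string.h>

/* Copy len bytes out of a ring buffer of size cap, starting at *pos and
 * wrapping to the start when the end is reached, then advance *pos.
 * Assumes len <= cap. */
void ring_read(uint8_t *dst, const uint8_t *ring, uint32_t cap,
               uint32_t *pos, uint32_t len) {
  uint32_t first = cap - *pos;              /* bytes available before the wrap */
  if (first > len) first = len;
  memcpy(dst, ring + *pos, first);
  memcpy(dst + first, ring, len - first);   /* remainder from the start, may be 0 bytes */
  *pos = (*pos + len) % cap;
}
```

Reading 4 bytes from position 6 of an 8-byte ring yields the bytes at indices 6, 7, 0, 1 and leaves the position at 2.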