Reset the Machine Exit Qemu gdb and Start Them Again Examine the 8 Words
Part 1: PC Bootstrap
Getting Started with x86 assembly
Exercise 1. Familiarize yourself with the assembly language materials available on the 6.828 reference page. You don't have to read them now, but you'll almost certainly want to refer to some of this material when reading and writing x86 assembly.
We do recommend reading the section "The Syntax" in Brennan's Guide to Inline Assembly. It gives a good (and quite brief) description of the AT&T assembly syntax we'll be using with the GNU assembler in JOS.
I have already learned this in the Introduction to Computer Organization and Computer Organization courses.
Simulating the x86
I have been trying to set up the simulation environment on my MacBook Pro. I referred to the MIT 6.828 tools page to set up my environment. But this web page guides us to use the "evil" make install, which might cause conflicts with other package managers. I prefer to use Homebrew as my default package manager. I have written a script to install binutils using Homebrew. Building gcc is a greater challenge. Luckily, Leedy has provided me with his solution.
The PC's Physical Address Space
The ROM BIOS
Note: I have to use i386-jos-elf-gdb in order to correctly load the .gdbinit file. I'm confused why gdb under a Linux distribution can directly load symbols from "i386-os-elf". Perhaps it is because Linux binaries are also organised in ELF?
Exercise 2. Use GDB's si (Step Instruction) command to trace into the ROM BIOS for a few more instructions, and try to guess what it might be doing. You might want to look at Phil Storrs I/O Ports Description, as well as other materials on the 6.828 reference materials page. No need to figure out all the details - just the general idea of what the BIOS is doing first.
TODO: Read it later
Part 2: The Boot Loader
When I wanted to use the disassemble command to dig into the assembly code inside the BIOS, I found that gdb cannot disassemble the code and outputs "No function contains program counter for selected frame.". I found that I must specify the address range of the code I want to disassemble. So I came up with this command:
    disas /r ((($cs)<<4)+$eip), ((($cs)<<4)+$eip)+0xF
It calculates the current code address and disassembles the following 16 bytes of code.
Exercise 3. Take a look at the lab tools guide, especially the section on GDB commands. Even if you're familiar with GDB, this includes some esoteric GDB commands that are useful for OS work.
Set a breakpoint at address 0x7c00, which is where the boot sector will be loaded. Continue execution until that breakpoint. Trace through the code in boot/boot.S, using the source code and the disassembly file obj/boot/boot.asm to keep track of where you are. Also use the x/i command in GDB to disassemble sequences of instructions in the boot loader, and compare the original boot loader source code with both the disassembly in obj/boot/boot.asm and GDB.
Trace into bootmain() in boot/main.c, and then into readsect(). Identify the exact assembly instructions that correspond to each of the statements in readsect(). Trace through the rest of readsect() and back out into bootmain(), and identify the begin and end of the for loop that reads the remaining sectors of the kernel from the disk. Find out what code will run when the loop is finished, set a breakpoint there, and continue to that breakpoint. Then step through the remainder of the boot loader.
Questions:
- At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?
    .code16                     # Assemble for 16-bit mode
    ...
      # Switch from real to protected mode, using a bootstrap GDT
      # and segment translation that makes virtual addresses
      # identical to their physical addresses, so that the
      # effective memory map does not change during the switch.
      lgdt    gdtdesc
      movl    %cr0, %eax
      orl     $CR0_PE_ON, %eax    # this code selects protected mode (32-bit mode) for the processor
      movl    %eax, %cr0
      # Jump to next instruction, but in 32-bit code segment.
      # Switches processor into 32-bit mode.
      ljmp    $PROT_MODE_CSEG, $protcseg    # jump to actual 32-bit code
      .code32                     # Assemble for 32-bit mode
    ...
The orl $CR0_PE_ON, %eax sets the protection-enable bit, and the following movl %eax, %cr0 writes it into %cr0, which switches the processor from 16-bit real mode to protected mode. The ljmp $PROT_MODE_CSEG, $protcseg then reloads %cs so that the instruction pointer actually points into a 32-bit code segment; from there on the processor executes 32-bit code.
- What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded?
    // call the entry point from the ELF header
    // note: does not return!
    ((void (*)(void)) (ELFHDR->e_entry))();    // Last code of boot loader
Then %eip jumps to *0x10018, where the first line of the kernel is run by the CPU.
The first line of kernel code is located in lab/kern/entry.S.
    # Load the physical address of entry_pgdir into cr3.  entry_pgdir
    # is defined in entrypgdir.c.
    movl    $(RELOC(entry_pgdir)), %eax
    movl    %eax, %cr3
- Where is the first instruction of the kernel?
I looked at the compiled boot loader and found that the following line corresponds to the last line of the boot loader.
    // call the entry point from the ELF header
    // note: does not return!
    ((void (*)(void)) (ELFHDR->e_entry))();
        7d63:   ff 15 18 00 01 00       call   *0x10018
By debugging the kernel, I found that the pointer stored at memory location 0x10018 actually points to 0x10000c, where the entry label is located in /lab/kern/entry.S.
    .globl entry
    entry:
        movw    $0x1234,0x472           # warm boot
        # We haven't set up virtual memory yet, so we're running from
        # the physical address the boot loader loaded the kernel at: 1MB
        # (plus a few bytes).  However, the C code is linked to run at
        # KERNBASE+1MB.  Hence, we set up a trivial page directory that
        # translates virtual addresses [KERNBASE, KERNBASE+4MB) to
        # physical addresses [0, 4MB).  This 4MB region will be
        # sufficient until we set up our real page table in mem_init
        # in lab 2.
        # Load the physical address of entry_pgdir into cr3.  entry_pgdir
        # is defined in entrypgdir.c.
        movl    $(RELOC(entry_pgdir)), %eax
        movl    %eax, %cr3
        # Turn on paging.
        movl    %cr0, %eax
        orl     $(CR0_PE|CR0_PG|CR0_WP), %eax
        movl    %eax, %cr0
It seems that it is about memory paging.
- How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?
The following code reads the ELF sectors into memory:
    struct Proghdr *ph, *eph;
    // read 1st page off disk
    readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);    // Load the first page, which includes the ELF header and program header table.
    // is this a valid ELF?
    if (ELFHDR->e_magic != ELF_MAGIC)    // Make sure that it is an ELF file.
        goto bad;
    // load each program segment (ignores ph flags)
    ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);    // The first segment
    eph = ph + ELFHDR->e_phnum;    // The end of the segments
    for (; ph < eph; ph++)    // Loop through all the segments and load them into their proper locations in memory
        // p_pa is the load address of this segment (as well
        // as the physical address)
        readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
As stated in the comments in the code, the last few lines of the boot loader load every segment of the kernel into memory:
- The boot loader loads the first page of the kernel ELF file into memory.
- The boot loader reads the start and end addresses of the program header table.
- The boot loader loops through each entry of the program header table to load each segment of the kernel into memory.
As a result, as long as the ELF header and the program header table of the kernel ELF file fit into one page, the boot loader can load every segment of the kernel into memory.
Loading the Kernel
Exercise 4. Read about programming with pointers in C. The best reference for the C language is The C Programming Language by Brian Kernighan and Dennis Ritchie (known as 'K&R'). We recommend that students purchase this book (here is an Amazon Link) or find one of MIT's 7 copies.
Read 5.1 (Pointers and Addresses) through 5.5 (Character Pointers and Functions) in K&R. Then download the code for pointers.c, run it, and make sure you understand where all of the printed values come from. In particular, make sure you understand where the pointer addresses in lines 1 and 6 come from, how all the values in lines 2 through 4 get there, and why the values printed in line 5 are seemingly corrupted.
There are other references on pointers in C (e.g., A tutorial by Ted Jensen that cites K&R heavily), though not as strongly recommended.
Warning: Unless you are already thoroughly versed in C, do not skip or even skim this reading exercise. If you do not really understand pointers in C, you will suffer untold pain and misery in subsequent labs, and then eventually come to understand them the hard way. Trust us; you don't want to find out what "the hard way" is.
I believe this is already covered in Introduction to Computing class.
Exercise 5. Trace through the first few instructions of the boot loader again and identify the first instruction that would "break" or otherwise do the wrong thing if you were to get the boot loader's link address wrong. Then change the link address in boot/Makefrag to something wrong, run make clean, recompile the lab with make, and trace into the boot loader again to see what happens. Don't forget to change the link address back and make clean again afterward!
Done. In my case I changed the address 0x7C00 to 0x7C01 and the emulator got stuck in an infinite loop 😝.
Exercise 6. We can examine memory using GDB's x command. The GDB manual has full details, but for now, it is enough to know that the command x/Nx ADDR prints N words of memory at ADDR. (Note that both 'x's in the command are lowercase.) Warning: The size of a word is not a universal standard. In GNU assembly, a word is two bytes (the 'w' in xorw, which stands for word, means 2 bytes).
Reset the machine (exit QEMU/GDB and start them again). Examine the 8 words of memory at 0x00100000 at the point the BIOS enters the boot loader, and then again at the point the boot loader enters the kernel. Why are they different? What is there at the second breakpoint? (You do not really need to use QEMU to answer this question. Just think.)
Before the boot loader runs, the memory at 0x100000 is empty (00 00 00 00 00 00 00 00). Just before the boot loader calls the kernel entry function, the memory at 0x100000 is filled with data from the kernel (02 b0 ad 1b 00 00 00 00).
This difference is caused by the boot loader loading data from the kernel ELF file according to its ELF header. According to the linker script for the kernel (/lab/kern/kernel.ld):
    .text : AT(0x100000) {
        *(.text .stub .text.* .gnu.linkonce.t.*)
    }
The .text (code) segment is set to be loaded at location 0x100000.
Part 3: The Kernel
Using virtual memory to work around position dependence
Exercise 7. Use QEMU and GDB to trace into the JOS kernel and stop at the movl %eax, %cr0. Examine memory at 0x00100000 and at 0xf0100000. Now, single step over that instruction using the stepi GDB command. Again, examine memory at 0x00100000 and at 0xf0100000. Make sure you understand what just happened.
Before the instruction is executed, 0x00100000 points to the beginning of the kernel while 0xf0100000 points to blank memory. Afterward, the addresses 0xf0100000 and 0x00100000 both point to the same location in the kernel code, at physical address 0x00100000.
The movl %eax, %cr0 loads the new %cr0, which enables paging with the page table at entry_pgdir. After that, every memory address is interpreted as a virtual address. According to /lab/kern/entrypgdir.c, both [0, 4MB) and [KERNBASE, KERNBASE+4MB) in the virtual address space are mapped to [0, 4MB) in the physical address space.
What is the first instruction after the new mapping is established that would fail to work properly if the mapping weren't in place? Comment out the movl %eax, %cr0 in kern/entry.S, trace into it, and see if you were right.
I presume that it would be movl $0x0,%ebp. This line is just below the following code:
    # Now paging is enabled, but we're still running at a low EIP
    # (why is this okay?).  Jump up above KERNBASE before entering
    # C code.
    mov     $relocated, %eax
    jmp     *%eax
This code changes %eip from a low address to the proper virtual address for the kernel. Because we have stopped the switching process (paging is never enabled), the CPU still treats the address in %eip as a physical memory address, which leads to wrong results.
Formatted Printing to the Console
Exercise 8
We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form "%o". Find and fill in this code fragment.
    diff --git a/lib/printfmt.c b/lib/printfmt.c
    index 28e01c9..b1de635 100644
    --- a/lib/printfmt.c
    +++ b/lib/printfmt.c
    @@ -205,11 +205,9 @@ vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap)
                    // (unsigned) octal
                    case 'o':
    -                       // Replace this with your code.
    -                       putch('X', putdat);
    -                       putch('X', putdat);
    -                       putch('X', putdat);
    -                       break;
    +                       num = getuint(&ap, lflag);
    +                       base = 8;
    +                       goto number;
                    // pointer
                    case 'p':
Be able to answer the following questions:
- Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?
console.c handles the hardware part and printf.c handles the formatting part of string printing. The interface between them is cputchar(int), which is used in printf.c and implemented in console.c. They are bridged by the header file inc/stdio.h.
- Explain the following from console.c:
    if (crt_pos >= CRT_SIZE) {
        int i;
        memcpy(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
        for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
            crt_buf[i] = 0x0700 | ' ';
        crt_pos -= CRT_COLS;
    }
This code handles the situation where the screen is filled with characters. When crt_pos reaches CRT_SIZE, this code moves every line except the first one up by one line and fills the last line with blank characters.
For the following questions you might wish to consult the notes for Lecture 1. These notes cover GCC's calling convention on the x86.
Trace the execution of the following code step-by-step:
    int x = 1, y = 3, z = 4;
    cprintf("x %d, y %x, z %d\n", x, y, z);
- In the call to cprintf(), to what does fmt point? To what does ap point?
fmt points to the format string, while ap points to the first variadic argument in the list, namely x.
- List (in order of execution) each call to `cons_putc`, `va_arg`, and `vcprintf`. For `cons_putc`, list its argument as well. For `va_arg`, list what `ap` points to before and after the call. For `vcprintf` list the values of its two arguments.
    - cprintf("x %d, y %x, z %d\n", x, y, z); is called.
    - cprintf calls int vcprintf(const char *fmt, va_list ap); fmt points to "x %d, y %x, z %d\n" and ap points to x.
    - vcprintf calls void vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap); putch points to the function putch; putdat points to a counter initialized to 0; fmt and ap are the same as above.
    - vprintfmt calls putch(int) to print each character.
Run the following code.
    unsigned int i = 0x00646c72;
    cprintf("H%x Wo%s", 57616, &i);
What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here's an ASCII table that maps bytes to characters.
The output depends on the fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value? Here's a description of little- and big-endian and a more whimsical description.
The output is He110 World. 57616 is 0xe110 in hexadecimal, so %x prints e110. Because the x86 is little-endian, i is stored in memory as the bytes 72 6c 64 00, which %s reads as the string "rld". On a big-endian machine i would have to be 0x726c6400 to give the same output, while 57616 would not need to change.
- In the following code, what is going to be printed after y=? (note: the answer is not a specific value.) Why does this happen?
In my run, y is printed as 267380146. This is because the va_list contains only one argument, and cprintf still tries to access a second one that was never passed. As a result, it reads whatever happens to be in the adjacent memory on the stack and prints it out.
- Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?
The function could simply take its arguments in reverse order.
Challenge
Enhance the console to allow text to be printed in different colors. The traditional way to do this is to make it interpret ANSI escape sequences embedded in the text strings printed to the console, but you may use any mechanism you like. There is plenty of information on the 6.828 reference page and elsewhere on the web on programming the VGA display hardware. If you're feeling really adventurous, you could try switching the VGA hardware into a graphics mode and making the console draw text onto the graphical frame buffer.
I have implemented full ANSI foreground/background color support. I referred to FFmpeg's ASCII/ANSI art decoder to convert ANSI color indexes to CGA color indexes. I implemented a finite state machine to recognize ANSI codes and extract the color code. I placed the recognition logic in the putch function to provide the same behaviour as printf in the standard C library (printf does not require ANSI codes to be placed in the format string; ANSI codes in a literal string passed into printf also work properly). Notably, my implementation supports displaying colored characters in both the native console (serial port) and the CGA interface.
The running result is shown below:
Here is the patch: