floppy format
-------------

512 bytes   MBR stage 1 simple boot loader and magic boot string.
            Loads stage 2 directly following stage 1. Also assumes stage 2
            fits on one track of the floppy, so we don't need a complicated
            loading method probing sectors per track.
1024 bytes  stage 2 boot loader, interprets the tar format one sector after
            stage 2 and reads the files into memory (vmlinuz, ramdisk.img)
N           tar file format (no compression, we expect the files to be
            well compressed).
            2 blocks .ustar format, file names are easily accessible
            (vmlinuz, ramdisk.img). We sacrifice 512 bytes for easier
            reading in multiple disks (for instance a kernel disk, an initial
            ramdisk, a driver disk for SCSI, a root file system, etc.).
            We could even do multi-floppy kernels, so we can read a kernel
            distributed on more than one floppy.

ustar/tar
---------

offset            length  description                          example
byte 0            100     filename in ascii, zero-term string  "bzImage"
byte 0x7c (124)   12      length in octal, zero-term string    "00004014360"
byte 0x94 (148)   8       checksum in octal, zero-term string  "012757" with an ending space for some reason
                          sum the header bytes with the checksum bytes as spaces (0x20)
byte 0x101 (257)  6       UStar indicator, zero-term string    "ustar"
                          one of the easiest ways to detect a tar header sector

kernel
------

newest kernel: 5.18.1

ramdisk
-------

find . | cpio -H newc -o -R root:root | zstd > ../ramdisk.img

user land
---------

C library: dietlibc, musl
options: busybox, toybox, sbase/ubase/uinit/mdev

make V=1 VERBOSE=1 CC=musl-gcc

building the user land
----------------------

Archlinux mkinitcpio is way too big, but maybe it can be trimmed to size

ramdisk: buildroot, mkroot, or by hand

memory layout
-------------

0x07c00 - 0x08fff   boot loader
0x09000 - 0x091ff   floppy read buffer
0x0e000 - 0x09200   stack of real mode kernel
0x10000 - 0x101ff   Linux zero page (first part)
0x10200 - 0x103ff   zero page (part two), real mode entry point at 0x10200
0x10400 - xxx       continue code of real mode kernel
0x1e000 - 0x1e0ff   cmd line for kernel
0x100000 - xxx      protected mode kernel code (at 1 MB)
0x800000 - xxx      ram disk (at 8 MB)

state machine
-------------

TAR state machine: reading metadata, reading data; we know whether we are
in the kernel, ramdisk, etc.

kernel substates:
- sector 1: read number of real mode sectors
- sector 2: read and check params, set params
- sector >2: always read and copy data from floppy to destination area

error codes
-----------

error codes consist of an error class (DISK, A20, KERN) and a code

ERR DISK 0x01   stage 1 read error while reading stage 2
ERR DISK 0x02   stage 1 short read error (we didn't read as many stage 2 sectors as expected)
ERR DISK 0x03   reading and interpreting tar state machine error
ERR DISK 0xXX   other read errors (BIOS int 0x13 codes), stage 2
ERR A20  0x01   A20 address line not enabled
ERR KERN 0x01   kernel read state machine error
ERR KERN 0x02   kernel signature 'HdrS' not found
ERR KERN 0x03   kernel boot protocol too old
ERR KERN 0x04   kernel cannot be started (or better, we return from the real mode jump)

Linux IA-32 boot sequence
-------------------------

- load Kernel boot sector at 0x10000 (first 512 bytes)
- read 0x10000+0x1f1 number of sectors => minimal 4 sectors (if 0 is in 1f1), number of setup sectors
- read 0x10200 (second part of the zero page)
- compare 0x10202 to linux header 'HdrS', must be equal
- compare 0x10206 to linux boot protocol version, don't allow anything below 0x215 (the newest one) for now
- set various zero page data
  - test for KASLR enabled: 0x10211 has bit 1 set? (this we might not want to do for old i486 kernels and systems)
  - set 0xFF for non-registered boot loader in 0x10210
  - set 0x80 in loadflags 0x10211 - CAN_USE_HEAP (bit 7)
  - LOADED_HIGH? where do we load protected mode code?
  - set heap_end_ptr 0x10224 to 0xde00
    ; heap_end = 0xe000
    ; heap_end_ptr = heap_end - 0x200 = 0xde00
    mov word [es:0x224], 0xde00 ; heap_end_ptr
    "Set this field to the offset (from the beginning of the real-mode code)
    of the end of the setup stack/heap, minus 0x0200."
  - set 0x10228 to 0x1e000
    mov dword [es:0x228], 0x1e000 ; set cmd_line_ptr
    also copy your command line to 0x1e000, for now from the boot loader
    data segment (initialized data) area.
    "At offset 0x0020 (word), “cmd_line_magic”, enter the magic number 0xA33F.
    At offset 0x0022 (word), “cmd_line_offset”, enter the offset of the kernel
    command line (relative to the start of the real-mode kernel).
    The kernel command line must be within the memory region covered by
    setup_move_size, so you may need to adjust this field."
- read to 0x10400 N-1 sectors (as many as we calculated above) as the real mode kernel part
- 0x101f4 is the size of the 32-bit protected mode kernel code in 16-byte paragraphs
  -> transform to 512 byte sectors to read
- eventually get the preferred loading location for the kernel
- load the protected mode part to 0x100000 by loading it to low memory and copying it to high memory in unreal mode
- print kernel version number, 020E, offset, but we must load the complete kernel first
- at end of kernel PM code read, check if we have the same size as the tar entry
- run_kernel (real mode)
  cli
  mov ax, 0x1000
  mov ds, ax
  mov es, ax
  mov fs, ax
  mov gs, ax
  mov ss, ax
  mov sp, 0xe000
  jmp 0x1020:0
- eventually get the preferred loading location for the ramdisk or the
  highest possible location (should make the kernel happy), but then we have
  to know a little bit about the memory layout and size of the machine..
- read ram image
  - read octal size in tar metadata of ramdisk, convert to decimal
  - set address and size in kernel zero page
    - 0x218/4 ramdisk image address
    - 0x21c/4 ramdisk image size

Bochs commands
--------------

# have a look at the boot.map file for the address of a symbol
# set breakpoint
b 0x7F93
# dump memory in floppy read buffer
x /30b 0x0008800
# dump real mode kernel code/data
x /30b 0x0010000

bugs
----

I see the 'early console in setup mode' message in bochs and qemu, so the
kernel is definitely booting in real mode, but then jumps to a machine reset
without saying anything..

C code main gets executed

# Jump to C code (should not return)
	calll	main

We see the early console message if we pass 'debug' to the command line
options, so the error must be after this:

	if (cmdline_find_option_bool("debug"))
		puts("early console in setup code\n");

	/* End of heap check */
	init_heap();

Bochs debugging on a 64-bit host with a gdb in a 32-bit LXC container.
target remote tcp:10.0.3.1:1234

problem was a shift-right 5 instead of shift-right 4, we read too few sectors
and jumped into an uninitialized area of the memory when entering PM!

--

traps: init[1] trap invalid opcode ip:8049120 sp:bfc47a3c error:0 in busybox[8049000+4b000]
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
Kernel Offset: disabled
---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 ]---

objdump -d ramdisk/bin/busybox | less
 8049120:       f3 0f 1e fb             endbr32

this is really nice, it's an ELF binary, the address is mapped in virtual
memory exactly as is.

-fcf-protection
it should really not be enabled per default for march-i486:
-fcf-protection=none

still there, so has musl been built with CET enabled?

for now hack it out by hand:
f3 0f 1e fb -> 90

works (we just have to find where this creeps in, could be all of the i486
subarchitecture!)

uname -a
Linux (none) 5.18.1-arch1-1 #5 Sat Jul 30 13:55:53 CEST 2022 i486 GNU/Linux

cat /proc/cpuinfo
model name      : 486 DX/2

free -m
              total        used        free      shared  buff/cache   available
Mem:          61584        1924       59048           0         612       57572
Swap:             0           0           0

more illegal opcodes:

<6>traps: fdisk[46] trap invalid opcode ip:804d25b sp:bf96ec30 error:0 in busybox[8049000+4b000]

udhcpc: socket: Address family not supported by protocol

ip addr add 192.168.1.100/24 dev eth0
ip: can't find device 'eth0'

ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:00:E8:CD:05:88
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
          Interrupt:10 Base address:0x300

lo        Link encap:Local Loopback
          LOOPBACK  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

<6>ne ne.0 (unnamed net_device) (uninitialized): NE*000 ethercard probe at 0x300:
<4>ne ne.0 (unnamed net_device) (uninitialized): interrupt from stopped card
<6>ne ne.0 eth0: NE2000 found at 0x300, using IRQ 10.

change floppy0 floppy2.img

real floppy: a read on floppy 2 results in ERR, so we have to add retries and
also call the reset BIOS function every time.

first idea: format disk space with a filesystem, add a swapfile, etc.
use wget to copy the ISO, then use a loopback on it and start the installation
from there. copying the ISO on the 486 is not fast, but acceptable (half an
hour or so).

/mnt, main medium, already formatted, contains a swapfile and now also the iso

losetup /dev/loop0 /mnt/archinstall-i486
mount /dev/loop0 /mnt2
mount --bind /mnt /mnt2/mnt
chroot /mnt2
mount -t proc /proc proc
mount -t sysfs /sys sys
mount -t devtmpfs /dev dev
mount -t devpts /dev/pts /dev/pts

pacstrap normally:

pacstrap /mnt base linux grub dhclient joe vi

problems with shm and unshare in pacstrap. Maybe the kernel is missing
features (SHM for sure)?
futex facility unexpected error code (ditto, futex support missing)
we also might have to change some tmpfs stuff (/run, etc)

systemd or permission/cgroup issues, this is quite a disaster to install from
a simple kernel, busybox system:

Could not set capabilities on /usr/bin/newuidmap: Operation not supported
Could not set capabilities on /usr/bin/newgidmap: Operation not supported
/usr/lib/tmpfiles.d/static-nodes-permissions.conf:12: Failed to resolve group 'audio'.
/usr/lib/tmpfiles.d/static-nodes-permissions.conf:13: Failed to resolve group 'audio'.
/usr/lib/tmpfiles.d/static-nodes-permissions.conf:14: Failed to resolve group 'disk'.
/usr/lib/tmpfiles.d/static-nodes-permissions.conf:17: Failed to resolve group 'kvm'.
/usr/lib/tmpfiles.d/static-nodes-permissions.conf:18: Failed to resolve group 'kvm'.
/usr/lib/tmpfiles.d/static-nodes-permissions.conf:19: Failed to resolve group 'kvm'.
/usr/lib/tmpfiles.d/systemd-network.conf:10: Failed to resolve user 'systemd-network': No such process
/usr/lib/tmpfiles.d/systemd-network.conf:11: Failed to resolve user 'systemd-network': No such process
/usr/lib/tmpfiles.d/systemd-network.conf:12: Failed to resolve user 'systemd-network': No such process
/usr/lib/tmpfiles.d/systemd-network.conf:13: Failed to resolve user 'systemd-network': No such process
/usr/lib/tmpfiles.d/systemd.conf:22: Failed to resolve group 'systemd-journal'.
/usr/lib/tmpfiles.d/systemd.conf:23: Failed to resolve group 'systemd-journal'.
Failed to parse ACL "d:group::r-x,d:group:adm:r-x,d:group:wheel:r-x,group::r-x,group:adm:r-x,group:wheel:g
Failed to parse ACL "d:group:adm:r-x,d:group:wheel:r-x,group:adm:r-x,group:wheel:r-x": Invalid argument. g
Failed to parse ACL "group:adm:r--,group:wheel:r--": Invalid argument. Ignoring
/usr/lib/tmpfiles.d/systemd.conf:28: Failed to resolve group 'systemd-journal'.
/usr/lib/tmpfiles.d/systemd.conf:29: Failed to resolve group 'systemd-journal'.
/usr/lib/tmpfiles.d/systemd.conf:30: Failed to resolve group 'systemd-journal'.
Failed to parse ACL "d:group::r-x,d:group:adm:r-x,d:group:wheel:r-x,group::r-x,group:adm:r-x,group:wheel:g
Failed to parse ACL "d:group:adm:r-x,d:group:wheel:r-x,group:adm:r-x,group:wheel:r-x": Invalid argument. g
Failed to parse ACL "group:adm:r--,group:wheel:r--": Invalid argument. Ignoring
/usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:2: Failed to resolve user 'tss': No such process
Failed to parse ACL "default:group:tss:rwx": Invalid argument. Ignoring
/usr/lib/tmpfiles.d/tpm2-tss-fapi.conf:4: Failed to resolve user 'tss': No such process
Failed to parse ACL "default:group:tss:rwx": Invalid argument. Ignoring
/usr/lib/tmpfiles.d/var.conf:15: Failed to resolve group 'utmp'.
/usr/lib/tmpfiles.d/var.conf:16: Failed to resolve group 'utmp'.
/usr/lib/tmpfiles.d/var.conf:17: Failed to resolve group 'utmp'.
Failed to open directory 'lock': No such file or directory
Failed to validate path /run/lock/subsys: No such file or directory
Failed to open directory 'initramfs': No such file or directory
Failed to open directory 'nscd': No such file or directory
Failed to open directory 'faillock': No such file or directory
Failed to open directory 'user': No such file or directory
Failed to validate path /run/systemd/ask-password: No such file or directory
Failed to validate path /run/systemd/seats: No such file or directory
Failed to validate path /run/systemd/sessions: No such file or directory
Failed to validate path /run/systemd/users: No such file or directory
Failed to validate path /run/systemd/machines: No such file or directory
Failed to validate path /run/systemd/shutdown: No such file or directory
Failed to open directory 'log': No such file or directory
Cannot set file attributes for '/var/log/journal', maybe due to incompatibility in specified attributes, .
Cannot set file attributes for '/var/log/journal/remote', maybe due to incompatibility in specified attri.
error: command failed to execute correctly
install: cannot stat '/dev/stdin': No such file or directory
error: command failed to execute correctly

/dev/fd is a problem for hooks and mkinitcpio

ln -s /proc/self/fd /dev/fd

/usr/lib/initcpio/functions: line 712: /dev/stdin: No such file or directory

ln -s /proc/self/fd/0 /dev/stdin
ln -s /proc/self/fd/1 /dev/stdout
ln -s /proc/self/fd/2 /dev/stderr

mkinitcpio takes ages, even with swap! especially the fallback image..

modprobe ext2
mkdir /mnt
mount /dev/sda1 /mnt
modprobe ne io=0x300 irq=10
ip addr add 192.168.1.100/24 dev eth0
ip link set up dev eth0
ip route add default via 192.168.1.1 dev eth0
cd /mnt/boot
rm vmlinuz-floppy
wget http://archlinux32.andreasbaumann.cc/other/bzImage
mv bzImage vmlinuz-floppy

systemd needs a communication protocol, either unix domain sockets or a
loopback on ipv6, also it requires cgroups, otherwise it panics!
systemd: Failed to allocate manager object, function not implemented

minimal sane config:

+config FORCE_MINIMALLY_SANE_CONFIG
+	bool
+	default y
+
+	# so that capset() works (sudo, etc.):
+	select SECURITY
+	select SECURITY_CAPABILITIES
+	select BINFMT_ELF
+
+	select SYSFS
+	select SYSFS_DEPRECATED
+	select PROC_FS
+	select FUTEX
+
+	# newer systemd silently relies on the presence of the epoll system call:
+	select EPOLL
+	select ANON_INODES
+
+	# newer systemd silently hangs durig early init without these:
+	select PROC_SYSCTL
+	select SYSCTL
+	select POSIX_MQUEUE
+	select POSIX_MQUEUE_SYSCTL
+
+	# systemd needs this syscall:
+	select FHANDLE
+
+	# systemd needs devtmpfs: "systemd[1]: Failed to mount devtmpfs at /dev: No such device"
+	select DEVTMPFS
+
+	# systemd needs tmpfs: "systemd[1]: Failed to mount tmpfs at /sys/fs/cgroup: No such file or directory"
+	select SHMEM
+	select TMPFS
+
+	# systemd needs timerfd syscalls: "[ 8.198625] systemd[1]: Failed to create timerfd: Function not implemented^"
+	select TIMERFD
+
+	# systemd needs signalfd support: "[ 45.536725] systemd[1]: Failed to allocate manager object: Function not implemented"
+	select SIGNALFD
+
+	# systemd hangs during bootup without cgroup support:
+	select CGROUPS
+
+	# systemd fails during bootup without this option, with a nonsensical message: "[DEPEND] Dependency failed for File System Check on /dev/sda1."
+	select FILE_LOCKING
+
+	# systemd fails during bootup without this option:
+	select FSNOTIFY
+	select INOTIFY_USER
+
+	# won't boot otherwise:
+	select RD_GZIP
+	select BLK_DEV_INITRD
+
+	# old F6 userspace needs vsyscalls:
+	select X86_VSYSCALL_EMULATION if X86_64
+	select IA32_EMULATION if X86_64

=> https://lwn.net/Articles/672587/

Now we cannot change the password:

passwd: Authentication token lock busy
passwd: password unchanged

Maybe we need the kernel with file locking or so?
Or could we use crypt6 from busybox to set it?
So far I tried to boot via floppy and then chroot into the partially
installed Archlinux32
=> yep, that was it, takes 10 seconds for a crypt 6 shadow entry :-)

Now loading 'ne' produces a null pointer in the kernel?
-> let's hope this is just the lack of memory/swap
=> unmapped page in PF (packet filtering)?

references
----------

- kernel boot up in all its details, really nice documentation:
  - https://0xax.gitbooks.io/linux-insides/content/Booting/linux-bootstrap-1.html
  - https://0xax.gitbooks.io/linux-insides/content/Booting/linux-bootstrap-2.html
- debug kernel with bochs
  - https://bochs.sourceforge.io/doc/docbook/user/debugging-with-gdb.html
  - https://www.kernel.org/doc/html/v4.12/dev-tools/gdb-kernel-debugging.html
  - https://www.cs.princeton.edu/courses/archive/fall09/cos318/precepts/bochs_gdb.html
- Linux boot protocol
  - https://docs.kernel.org/x86/boot.html
  - https://www.spinics.net/lists/linux-integrity/msg14580.html: version string
- get available memory
  - http://www.uruk.org/orig-grub/mem64mb.html
  - https://wiki.osdev.org/Detecting_Memory_(x86)
- create ramdisk.img: https://people.freedesktop.org/~narmstrong/meson_drm_doc/admin-guide/initrd.html
- tar format
  - https://wiki.osdev.org/USTAR
  - https://en.wikipedia.org/wiki/Tar_(computing)#UStar_format
  - https://github.com/calccrypto/tar
  - https://github.com/Papierkorb/tarfs
- other minimal bootloader projects
  - https://github.com/wikkyk/mlb
  - https://github.com/owenson/tiny-linux-bootloader and
    https://github.com/guineawheek/tiny-floppy-bootloader
  - http://dc0d32.blogspot.com/2010/06/real-mode-in-c-with-gcc-writing.html
    (Small C and 16-bit code, leads to a quite big boot loader, in the end
    we didn't use C but Unreal mode 16/32-bittish assembly)
  - https://wiki.syslinux.org/wiki/index.php?title=The_Syslinux_Project
  - Lilo (but the code is hard to read and looks quite chaotic)
  - Linux 1.x old boot floppy code
- PC ROM
  - https://members.tripod.com/vitaly_filatov/ng/asm/
- systemd:
  - https://lwn.net/Articles/672587/: minimal required features of the
    kernel, or which crazy undocumented kernel features form the minimal set
    to enable to make systemd happy
- other projects:
  - https://www.insentricity.com/a.cl/283

todos
-----

- have an early console also for serial (uart8250 in assembly, yuck)
- fix CET code generation (endbr32) in Arch32 i486 toolchain
- fix MMX in fdisk (musl)
- network drivers crash with NPE in kernel (ne on real hardware, 8139cp on qemu)
- find optimal set of kernel and busybox parameters just enough to enter
  a chroot to install a copied iso image
- test more A20 switching stuff on real hardware
- script ramdisk generation (currently there is a manually crafted ramdisk directory)
- have a nicer build process
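
The ustar header fields from the ustar/tar section (file name at offset 0,
octal length at 0x7c, octal checksum at 0x94, "ustar" indicator at 0x101) can
be cross-checked on the build host before writing a floppy image. What follows
is only a minimal Python sketch of that check, not the stage 2 code, which is
assembly. The names parse_ustar_header and make_demo_header are hypothetical
helpers invented for this illustration; the demo header is produced with the
tarfile module from the Python standard library.

```python
import tarfile

SECTOR = 512  # a tar header occupies exactly one 512-byte sector


def parse_ustar_header(block: bytes):
    """Return (name, size, data_sectors) for a ustar header, else None.

    Hypothetical helper: mirrors the checks described in the notes, it is
    not a transcription of the boot loader's assembly.
    """
    if len(block) != SECTOR:
        raise ValueError("a tar header is exactly one 512-byte sector")
    # offset 0x101 (257): the "ustar" indicator marks a header sector
    if block[257:262] != b"ustar":
        return None
    # offset 0x94 (148): checksum as octal text, padded with NUL and space
    stored = int(block[148:156].strip(b"\0 ").decode("ascii") or "0", 8)
    # sum all header bytes, counting the 8 checksum bytes as spaces (0x20)
    computed = sum(block[:148]) + 8 * 0x20 + sum(block[156:])
    if stored != computed:
        raise ValueError("tar header checksum mismatch")
    # offset 0: zero-terminated file name
    name = block[:100].split(b"\0", 1)[0].decode("ascii")
    # offset 0x7c (124): file length as zero-terminated octal text
    size = int(block[124:136].strip(b"\0 ").decode("ascii") or "0", 8)
    # file data follows the header, rounded up to whole sectors
    data_sectors = (size + SECTOR - 1) // SECTOR
    return name, size, data_sectors


def make_demo_header(name: str, size: int) -> bytes:
    """Build one ustar header sector with the standard library (demo only)."""
    info = tarfile.TarInfo(name)
    info.size = size
    return info.tobuf(format=tarfile.USTAR_FORMAT)


if __name__ == "__main__":
    # same name and octal length as the example column in the notes
    print(parse_ustar_header(make_demo_header("bzImage", 0o4014360)))
```

An all-zero sector (the end-of-archive marker) has no "ustar" indicator, so
the sketch returns None for it instead of raising an error.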