Hacker News

The Journey Before main()

314 points by amitprasad

by fweimer

3 subcomments

> The ELF file contains a dynamic section which tells the kernel which shared libraries to load, and another section which tells the kernel to dynamically “relocate” pointers to those functions, so everything checks out.
This is not how dynamic linking works on GNU/Linux. The kernel processes the program headers for the main program (mapping the PT_LOAD segments, without relocating them) and notices the PT_INTERP program interpreter (the path to the dynamic linker) among the program headers. The kernel then loads the dynamic linker in much the same way as the main program (again without relocation) and transfers control to its entry point. It's up to the dynamic linker to self-relocate, load the referenced shared objects (this time using plain mmap and mprotect; the kernel ELF loader is not used for that), relocate them and the main program, and then transfer control to the main program.
The scheme is not that dissimilar to the #! shebang lines, with the dynamic linker taking the role of the script interpreter, except that ELF is a binary format.

by mmsc

2 subcomments

It's also possible to pack a whole codebase into "before main()" - or with no main() at all. I was recently experimenting with doing this, as well as with a whole codebase that only uses main() and calls itself over and over. Good fun: https://joshua.hu/packing-codebase-into-single-function-disr...

by archmaster

1 subcomments

This is awesome! To anyone interested in learning more about this, I wrote https://cpu.land/ a couple years ago. It doesn't go as in-depth into e.g. memory layout as OP does but does cover multitasking and how the code is loaded in the first place.

by khaledh

3 subcomments

> A note on interpreters: If the executable file starts with a shebang (#!), the kernel will use the shebang-specified interpreter to run the program. For example, #!/usr/bin/python3 will run the program using the Python interpreter, #!/bin/bash will run the program using the Bash shell, etc.
This caused me a lot of pain while trying to debug a 3rd party Java application that was trying to launch an executable script, and throwing an IO error "java.io.IOException: error=2, No such file or directory." I was puzzled because I knew the script was right there (using its full path) and it had the executable bit set. It turns out that the shebang in the script was wrong, so the OS was complaining (actual error from a shell would be "The file specified the interpreter '/foo/bar', which is not an executable command."), but the Java error was completely misleading :|
Note: If you wonder why I didn't see this error by running the script myself: I did, and it ran fine locally. But the application was running on a remote host that had a different path for the interpreter.

by vbezhenar

6 subcomments

I wonder how many C projects prefer to avoid standard library, just invoking Linux syscalls directly. Much more fun to write software this way, IMO.

by turbert

1 subcomments

It's been a while since I've touched this stuff but my recollection is the ELF interpreter (ldso, not the kernel) is responsible for everything after mapping the initial ELF's segments.
iirc execve maps PT_LOAD segments from the program header, populates the aux vector on the stack, and jumps straight to the ELF interpreter's entry point. Any linked objects are loaded in userspace by the ELF interpreter. The kernel has no knowledge of the PLT/GOT.

by nneonneo

0 subcomment

For a fun example of a crash that can occur before main() even starts: https://stackoverflow.com/questions/12570374/floating-point-...
The poster was receiving a SIGFPE (floating point exception) on a C program that is simply “int main() { return 0; }”. A fun little mystery to dive into!

by bignerd_95

3 subcomments

As someone who teaches this stuff at university, I see students getting confused every single year by how textbooks draw memory. The problem is mostly visual, not conceptual.
Most diagrams in books and slides use an old hardware-centric convention: they draw higher addresses at the top of the page and lower addresses at the bottom. People sometimes justify this with an analogy like “floors in a building go up,” so address 0x7fffffffe000 is drawn “higher” than 0x400000.
But this is backwards from how humans read almost everything today. When you look at code in VS Code or any other IDE, line 1 is at the top, then line 2 is below it, then 3, 4, etc. Numbers go up as you go down. Your brain learns: “down = bigger index.”
Memory in a real Linux process actually matches the VS Code model much more closely than the textbook diagrams suggest.
You can see it yourself with:
cat /proc/$$/maps
(pick any PID instead of $$).
```
    ...
[0x00000000] lower addresses
    ...
[0x00620000] HEAP start
[0x00643000] HEAP extended ↓ (more allocations => higher addresses)
    ...
[0x7ffd8c3f7000] STACK top (<- stack pointer)
                  ↑ the stack pointer starts here and moves upward
                  (toward lower addresses) when you push
[0x7ffd8c418000] STACK start
    ...
[0xffffffffff600000] higher addresses
    ...
```
The output is printed from low addresses to high addresses. At the top of the output you'll usually see the binary, shared libs, heap, etc. Those all live at lower virtual addresses. Farther down in the output you'll eventually see the stack, which lives at a higher virtual address. In other words: as you scroll down, the addresses get bigger. Exactly like scrolling down in an editor gives you bigger line numbers.
The phrases “the heap grows up” and “the stack grows down” aren't wrong. They're just describing what happens to the numeric addresses: the heap expands toward higher addresses, and the stack moves into lower addresses.
The real problem is how we draw it. We label “up” on the page as “higher address,” which is the opposite of how people read code or even how /proc/<pid>/maps is printed. So students have to mentally flip the diagram before they can even think about what the stack and heap are doing.
If we just drew memory like an editor (low addresses at the top, high addresses further down) it would click instantly. Scroll down, addresses go up, and the stack sits at the bottom. At that point it’s no longer “the stack grows down”: it’s just the stack pointer being decremented, moving to lower addresses (which, in the diagram, means moving upward).

by hagbard_c

1 subcomments

On the subject of symbols:
> Yeah, that’s it. Now, 2308 may be slightly bloated because we link against musl instead of glibc, but the point still stands: There’s a lot of stuff going on behind the scenes here.
Slightly bloated is a slight understatement. The same program linked to glibc tops out at 36 symbols in .symtab:
```
    $ readelf -a hello|grep "'.symtab'"
    Symbol table '.symtab' contains 36 entries:
```

by itopaloglu83

0 subcomment

I like doing this with old microcontrollers like the PIC16 series etc. You get to see how the stack pointer, timers, variables, etc. are all configured.

by Animats

1 subcomments

From the title, I thought this was going to be about the parts of a program that run before the main function is entered. Static objects have to be constructed. Quite a bit of code can run. Order of initialization can be a problem. What happens if you try to do I/O from a static constructor? Does that even work?

by yawpitch

0 subcomment

You’ve got a broken link in your markdown, round about the phrase “lang_start function (defined here)”.

by ramanvarma

0 subcomment

did you see the relocations for the main binary applied before or after the linker resolves its own symbols? the ordering always feels like black magic when you step through it in a debugger

by matheusmoreira

0 subcomment

Hacking this stuff is so fun!!
> Depending on your program, _start may be the only thing between the entrypoint and your main function
I once developed a liblinux project entirely built around this idea.
I wanted to get rid of libc and all of its initialization, complexity and global state. The C library is so complex it has a primitive form of package management built into it:
https://blogs.oracle.com/solaris/post/init-and-fini-processi...
So I made _start functions which did nothing but pass argc, argv, envp and auxv to the actual main function:
https://github.com/matheusmoreira/liblinux/blob/master/start...
https://github.com/matheusmoreira/liblinux/blob/master/start...
You can get surprisingly far with just this, and it's actually possible to understand what's going on. Biggest pain point was the lack of C library utility functions like number/string conversion. I simply wrote my own.
https://github.com/matheusmoreira/liblinux/tree/master/examp...
Linux is the only operating system that lets us do this. In other systems, the C library is part of the kernel interface. Bypassing it like this can and does break things. Go developers once discovered this the hard way.
https://www.matheusmoreira.com/articles/linux-system-calls
The kernel has its own nolibc infrastructure now, no doubt much better than my project.
https://github.com/torvalds/linux/tree/master/tools/include/...
I encourage everyone to use it.
Note also that _start is an arbitrary symbol. The name is not special at all. It's just some linker default. The ELF header contains a pointer to the entry point, not a symbol. Feel free to choose a nice name!