Identifying the "main" function

A decompiler or disassembler such as Ghidra does not always reveal the address of the main function directly. This can happen for a few reasons:

  • Stripped symbols: the symbol table was removed, so main has no name

  • main is invoked indirectly: the entry point passes its address to __libc_start_main, which runs the .init_array constructors first and only then calls main

In this section, I will discuss a few techniques to discover the starting address of the main function.

Debugging (GDB)

1. Discover entry address

a. readelf

  • -l flag: Displays the information contained in the file's segment headers, if it has any

    • long forms of -l: --program-headers or --segments

  • We can view the entry point:

$ readelf -l <bin_file> | grep -i Entry
Entry point 0xxxx
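
The same value can be read straight from the ELF header when readelf is not available. Below is a minimal sketch, assuming the file is a well-formed ELF; the function name read_entry_point is my own and not part of any tool. It relies on the e_entry field sitting at byte offset 24 in both 32-bit and 64-bit ELF headers.

```python
import struct
import sys


def read_entry_point(header: bytes) -> int:
    """Return e_entry from the first bytes of an ELF file (hypothetical helper)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big endian
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4).
    fmt = endian + ("Q" if is_64bit else "I")
    (entry,) = struct.unpack_from(fmt, header, 24)
    return entry


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python entry.py <bin_file>
    with open(sys.argv[1], "rb") as f:
        print(hex(read_entry_point(f.read(64))))
```

For a position-independent executable, the printed value is an offset from the load base rather than the final runtime address.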

b. objdump

  • -d/--disassemble or -D/--disassemble-all: performs disassembly (-d covers only the sections expected to contain code, -D covers all sections)

  • -M/--disassembler-options: pass target specific information to the disassembler

    • We pass the value intel, which tells objdump to print assembly in Intel syntax instead of the default AT&T syntax

  • Pipe the output through grep '__libc_start_main' to locate the call to __libc_start_main, the libc function that the entry point calls by default and that eventually calls the main function
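
Once the call is located, the address of main is usually whatever was loaded into rdi just before it, since rdi carries the first argument on x86-64. The sketch below automates that lookup over saved objdump output. It assumes GNU objdump with Intel syntax and an x86-64 binary. The function name find_main_candidate and the regular expressions are my own, so treat the result as a candidate to verify rather than a definitive answer.

```python
import re

# An instruction that writes rdi, e.g. "lea rdi,[rip+0xca]" or "mov rdi,0x401136".
RDI_WRITE = re.compile(r"\b(?:lea|mov)\s+rdi,")
# objdump annotates RIP-relative operands with "# <target address> <symbol>".
RIP_TARGET = re.compile(r"#\s*([0-9a-f]+)\s*<")
# Non-PIE binaries usually load the address as an immediate operand.
IMMEDIATE = re.compile(r"rdi,\s*(0x[0-9a-f]+)\s*$")


def find_main_candidate(disassembly: str):
    """Return the address loaded into rdi before the call to
    __libc_start_main, or None if no such load is found."""
    lines = disassembly.splitlines()
    for i, line in enumerate(lines):
        if "call" in line and "__libc_start_main" in line:
            # Walk backwards to the nearest instruction that writes rdi.
            for prev in reversed(lines[:i]):
                if RDI_WRITE.search(prev):
                    m = RIP_TARGET.search(prev) or IMMEDIATE.search(prev)
                    return int(m.group(1), 16) if m else None
            return None
    return None
```

To use it, save the objdump output to a file and pass the file's contents to find_main_candidate. The returned address can then be checked in GDB or Ghidra.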

2. Analyse disassembly

  • In GDB, run stepi from the entry point until just before the first call instruction

    • This will typically be the __libc_start_main function

    • The first argument (passed in the rdi register on x86-64) will be the address of the main function

Ghidra

  1. Look for the "entry" function

  2. Inside entry, look for the call to __libc_start_main

  • The first argument to the __libc_start_main function will be the main function

  • In the image above, it will be the FUN_001008a1 function
