AOSP Expert & Production Engineering

Tombstones (Native Crashes)

Understanding Native Crashes in Android

In the Android Open Source Project (AOSP), native crashes occur when C or C++ code performs an illegal operation, such as accessing unallocated memory, executing an invalid instruction, or violating memory alignment constraints. When this happens, the kernel sends a signal to the process, and Android's crash handling mechanisms kick in to generate a comprehensive dump known as a "tombstone".

Tombstones are crucial for analyzing and fixing native crashes. They are typically found in the /data/tombstones/ directory on a device. A single tombstone file contains a snapshot of the process at the time of the crash, including register states, memory maps, logcat output, and, most importantly, the call stack (backtrace).

Tombstone File Anatomy

A tombstone file follows a structured format designed to provide maximum context. Here is a breakdown of its key sections:

1. Build and Device Information

The top of the tombstone contains metadata about the device, build ID, and process information.

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/raven/raven:13/TP1A.220624.014/8836240:user/release-keys'
Revision: 'MP1.0'
ABI: 'arm64'
Timestamp: 2023-10-14 15:30:22.123456789+0800
Process uptime: 42s
Cmdline: com.example.nativeapp
pid: 12345, tid: 12345, name: main  >>> com.example.nativeapp <<<
uid: 10123

This section is vital for verifying that you are analyzing a crash from the correct build and understanding which process and thread failed.
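As a rough illustration, the header fields can be extracted with a few lines of Python before any deeper analysis. This is a minimal sketch that assumes the exact layout of the sample above; `parse_header` and `HEADER_RE` are hypothetical names, not part of any AOSP tool.

```python
import re

# Matches the "pid: ..., tid: ..., name: ...  >>> process <<<" header line.
HEADER_RE = re.compile(
    r"pid: (?P<pid>\d+), tid: (?P<tid>\d+), "
    r"name: (?P<thread>.+?)\s+>>> (?P<process>.+?) <<<"
)


def parse_header(text):
    """Collect the build fingerprint, ABI, and crashing process/thread."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Build fingerprint:"):
            # Split on the first colon only; the fingerprint itself contains colons.
            info["fingerprint"] = line.split(":", 1)[1].strip().strip("'")
        elif line.startswith("ABI:"):
            info["abi"] = line.split(":", 1)[1].strip().strip("'")
        else:
            m = HEADER_RE.search(line)
            if m:
                info.update(
                    pid=int(m["pid"]),
                    tid=int(m["tid"]),
                    thread=m["thread"],
                    process=m["process"],
                )
    return info
```

Comparing the returned fingerprint against the build you have symbols for is a cheap way to avoid symbolizing against the wrong binaries.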

2. Signal Information and Abort Message

This tells you why the process crashed.

signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
Cause: null pointer dereference

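If the process aborted with a message, for example through a failed assertion or a fatal logging call such as LOG_ALWAYS_FATAL or LOG(FATAL), the tombstone also records that message (typically alongside a SIGABRT signal line rather than SIGSEGV):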

Abort message: 'Assertion failed: x > 0'
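The signal line has a regular shape, so it can be split into its parts mechanically. The sketch below assumes the format shown above; it also assumes that a tombstone without a meaningful fault address prints a run of dashes in its place. `parse_signal` is a hypothetical helper name.

```python
import re

# signal <number> (<NAME>), code <number> (<CODE NAME>), fault addr <hex or dashes>
SIGNAL_RE = re.compile(
    r"signal (?P<signo>\d+) \((?P<signame>\w+)\), "
    r"code (?P<code>-?\d+) \((?P<codename>[^)]*)\), "
    r"fault addr (?P<addr>0x[0-9a-fA-F]+|-+)"
)


def parse_signal(line):
    """Return the signal details as a dict, or None if the line does not match."""
    m = SIGNAL_RE.search(line)
    if m is None:
        return None
    addr = m["addr"]
    return {
        "signo": int(m["signo"]),
        "signame": m["signame"],
        "code": int(m["code"]),
        "codename": m["codename"],
        # Assumption: dashes mean "no fault address" (e.g. for SIGABRT).
        "fault_addr": int(addr, 16) if addr.startswith("0x") else None,
    }
```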

3. Registers

The register dump shows the CPU state at the exact moment of the crash.

x0  0000000000000000  x1  0000007fc1b2c3d0  x2  0000000000000000  x3  0000000000000000
...
sp  0000007fc1b2c3c0  lr  0000007fc1c5d6e4  pc  0000007fc1c5d700
  • pc (Program Counter): Points to the exact instruction that caused the crash.
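  • lr (Link Register): Holds the return address written by the most recent call (branch-with-link) instruction. If the crash is in a leaf function, this address lies inside the caller; in a non-leaf function it may already have been overwritten by a later call.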
  • sp (Stack Pointer): Points to the current top of the stack.
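To work with these values programmatically, the register dump can be turned into a dictionary. This is a small sketch that assumes the arm64 layout shown above (register name followed by 16 hex digits); `parse_registers` is a hypothetical name.

```python
import re

# A register name (x0..x30, sp, lr, pc) followed by a 64-bit value in hex.
REG_RE = re.compile(r"\b(x\d{1,2}|sp|lr|pc)\s+([0-9a-fA-F]{16})\b")


def parse_registers(text):
    """Map register names to integer values, in the order they appear."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}
```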

4. Backtrace

The backtrace unwinds the call stack, showing the sequence of function calls that led to the crash.

backtrace:
      #00 pc 0000000000001700  /data/app/~~xxxxx==/com.example.nativeapp-yyyyy==/lib/arm64/libnative.so (crash_function+16)
      #01 pc 00000000000016e0  /data/app/~~xxxxx==/com.example.nativeapp-yyyyy==/lib/arm64/libnative.so (trigger_crash+32)
      #02 pc 00000000000018f4  /data/app/~~xxxxx==/com.example.nativeapp-yyyyy==/lib/arm64/libnative.so (Java_com_example_nativeapp_MainActivity_stringFromJNI+40)
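Each backtrace line carries a frame number, a library-relative pc, the library path, and optionally a symbol with a byte offset. The following sketch parses that shape; it assumes library paths contain no spaces, and `Frame` and `parse_backtrace` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    frame_no: int
    pc: int                  # offset relative to the library, as printed
    library: str
    symbol: Optional[str]    # None when the build is stripped
    offset: Optional[int]    # byte offset into the symbol, if present


FRAME_RE = re.compile(
    r"#(?P<frame_no>\d+) pc (?P<pc>[0-9a-fA-F]+)\s+(?P<library>\S+)"
    r"(?:\s+\((?P<symbol>.+)\+(?P<offset>\d+)\))?"
)


def parse_backtrace(text):
    """Return one Frame per '#NN pc ...' line found in the text."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(Frame(
                frame_no=int(m["frame_no"]),
                pc=int(m["pc"], 16),
                library=m["library"],
                symbol=m["symbol"],
                offset=int(m["offset"]) if m["offset"] else None,
            ))
    return frames
```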

5. Memory Maps

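The memory map section (introduced by a "memory map" header in the tombstone) lists every mapped region of the process, including each loaded library, together with its address range and permissions. It is essential for converting between the absolute addresses seen in the register dump and the library-relative pc values shown in the backtrace.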
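The conversion itself is simple arithmetic. The sketch below is a simplified version that assumes the library is a standalone file on disk (not loaded directly out of an APK); the mapping tuples and the base address used in the usage note are hypothetical, chosen only to be consistent with the sample register dump and backtrace.

```python
def find_mapping(mappings, address):
    """Return the (start, end, file_offset, name) tuple containing address, or None.

    `mappings` is a list of such tuples; `end` is exclusive.
    """
    for start, end, file_offset, name in mappings:
        if start <= address < end:
            return start, end, file_offset, name
    return None


def relative_pc(absolute_pc, map_start, map_offset=0, load_bias=0):
    """Convert an absolute address into a library-relative pc.

    Simplification: distance into the mapping, plus the mapping's file
    offset, plus the ELF load bias (both are often zero).
    """
    return absolute_pc - map_start + map_offset + load_bias
```

For example, if libnative.so were mapped at the (assumed) base 0x7fc1c5c000, the pc value 0x7fc1c5d700 from the register dump would map to 0x1700, which matches frame #00 of the backtrace.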

Signal Analysis

Understanding POSIX signals is fundamental to diagnosing native crashes. The most common signals you will encounter are:

SIGSEGV (Signal 11: Segmentation Violation)

This is the most common crash. It means the process tried to access memory it does not own or does not have permission to access.

  • SEGV_MAPERR: The memory address is not mapped (e.g., null pointer dereference).
  • SEGV_ACCERR: The memory is mapped, but the access permissions are violated (e.g., trying to write to read-only memory).

SIGABRT (Signal 6: Abort)

Triggered when a process intentionally terminates itself by calling abort(). This usually happens when an assertion fails or an unrecoverable error is detected by the code itself (e.g., libc detecting a heap corruption).

SIGBUS (Signal 7: Bus Error)

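Similar to SIGSEGV, but raised when the access itself cannot be completed, typically because of a misaligned access or a problem with the underlying memory (for example, touching a memory-mapped file beyond its end).

  • BUS_ADRALN: Invalid address alignment. On arm64, ordinary loads and stores to normal memory usually tolerate unaligned addresses, so this code more often points to a misaligned atomic or exclusive access, or to a misaligned stack pointer.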

SIGILL (Signal 4: Illegal Instruction)

The CPU encountered an instruction it could not understand. This often happens if memory containing code gets corrupted, or if an arm64 binary tries to execute an instruction not supported by the specific CPU core.
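The signal and code pairs described above lend themselves to a small lookup table for first-pass triage. This is only a sketch: the hint strings paraphrase this section, and `HINTS` and `triage` are hypothetical names.

```python
# (signal name, code name) -> first thing to check. A code of None is the
# fallback used for any code of that signal.
HINTS = {
    ("SIGSEGV", "SEGV_MAPERR"): "address not mapped; check for null or dangling pointers",
    ("SIGSEGV", "SEGV_ACCERR"): "mapped but access not permitted; check for writes to read-only memory",
    ("SIGBUS", "BUS_ADRALN"): "invalid address alignment",
    ("SIGABRT", None): "process called abort(); look for an 'Abort message' line",
    ("SIGILL", None): "illegal instruction; check for corrupted code or unsupported CPU features",
}


def triage(signame, codename=None):
    """Return a short hint for a signal/code pair."""
    exact = HINTS.get((signame, codename))
    if exact:
        return exact
    return HINTS.get((signame, None), "no hint available for this signal")
```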

Register Dump Interpretation

When analyzing a crash, the register dump can provide immediate clues, especially for null pointer dereferences. In ARM64:

  • If fault addr is 0x0 and the pc points to a load/store instruction, look at the registers involved in that instruction. One of them (often x0 or x1) will likely be 0000000000000000.
  • The lr (Link Register) is especially useful if the backtrace is truncated or corrupted. It holds the most recent return address, so when the crash happens in a leaf function it identifies the call site in the caller.
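The first check can be automated: a faulting access through a null or near-null base pointer usually leaves a register whose value sits at, or slightly below, the fault address (the difference being a field or array offset). The sketch below applies that heuristic; the 4 KiB window and the name `registers_near` are assumptions made for illustration.

```python
def registers_near(registers, fault_addr, window=0x1000):
    """List general-purpose registers within `window` bytes at or below fault_addr.

    `registers` maps names such as "x0" to integer values. sp, lr, and pc
    are skipped because they are not candidate base pointers here.
    """
    skip = {"sp", "lr", "pc"}
    return [
        name
        for name, value in registers.items()
        if name not in skip and 0 <= fault_addr - value < window
    ]
```

A register returned here is only a suspect; confirming it requires disassembling the instruction at pc and checking which register it actually uses as the base.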

Stack Unwinding in Tombstones

Android uses libunwindstack to walk the stack and generate the backtrace. For each frame it reads unwind information from the ELF binary (.eh_frame or .debug_frame, and .ARM.exidx on 32-bit ARM) to determine how to restore the caller's registers, including the program counter. If this information is missing or was stripped incorrectly, the backtrace may stop early, and frames for which no symbol information is available are printed without a function name.

Symbolizing with ndk-stack and addr2line

Tombstones from unstripped or symbol-rich builds might contain function names directly. However, for production (stripped) builds, the backtrace will only contain memory offsets, as seen here:

      #00 pc 0000000000001700  /data/app/.../libnative.so

To map these offsets to source code files and line numbers, you need the unstripped version of libnative.so containing debug symbols.

Using ndk-stack

The easiest way to symbolize a tombstone is using ndk-stack, a tool provided in the Android NDK.

ndk-stack -sym /path/to/unstripped/libraries/ -dump /path/to/tombstone

You can also pipe logcat directly into it:

adb logcat | ndk-stack -sym /path/to/unstripped/libraries/

Using addr2line

For targeted analysis of a specific frame, addr2line (or llvm-addr2line) is useful. You provide the unstripped library and the pc offset:

llvm-addr2line -C -f -e /path/to/unstripped/libnative.so 0000000000001700

This prints the function name, followed by the source file and line number:

crash_function
/Users/developer/project/app/src/main/cpp/native-lib.cpp:15
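When a backtrace has many frames, it helps to generate one addr2line invocation per library rather than typing each address by hand. The sketch below only builds the argument lists and does not run anything. It assumes the unstripped libraries sit in a single directory under the same file names as on the device; `addr2line_commands` is a hypothetical helper.

```python
import os
import posixpath


def addr2line_commands(frames, symbol_dir, tool="llvm-addr2line"):
    """Build one command (as an argv list) per library.

    `frames` is an iterable of (pc, device_path) pairs taken from the
    backtrace; pcs are grouped by library in order of first appearance.
    """
    by_library = {}
    for pc, device_path in frames:
        by_library.setdefault(device_path, []).append(pc)

    commands = []
    for device_path, pcs in by_library.items():
        # Device paths are POSIX paths; only the file name is reused locally.
        local = os.path.join(symbol_dir, posixpath.basename(device_path))
        addresses = [f"0x{pc:x}" for pc in pcs]
        commands.append([tool, "-C", "-f", "-e", local] + addresses)
    return commands
```

Each returned list can be passed to subprocess.run() as-is, which avoids shell quoting problems with unusual paths.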

debuggerd and crash_dump Internals

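The crash reporting architecture in AOSP is built from three cooperating pieces of the debuggerd component: a signal handler that runs inside every native process, the crash_dump helper, and the tombstoned daemon.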

  1. Signal Handler Registration: At process startup, the dynamic linker installs the debuggerd signal handler for fatal signals such as SIGSEGV, SIGABRT, SIGBUS, and SIGILL (see debuggerd_init() in system/core/debuggerd/handler/debuggerd_handler.cpp).
  2. Crash Interception: When a fatal signal is delivered, the kernel invokes this handler on the crashing thread.
  3. Delegation: The handler does as little work as possible inside the damaged process. It clones a helper thread, which forks and executes crash_dump64 (or crash_dump32).
  4. crash_dump Execution: This separate process attaches to the threads of the crashing process using ptrace and captures their registers and memory.
  5. Dumping: crash_dump connects to the tombstoned daemon (over /dev/socket/tombstoned_crash) to obtain an output file, unwinds the stacks using libunwindstack, and writes the formatted output to a new tombstone file in /data/tombstones/.
  6. Termination: Once the dump is complete, crash_dump detaches and exits. The handler then restores the default signal disposition and lets the signal fire again, so the kernel terminates the process with the original signal.

Using a separate process (crash_dump) ensures that the dumping logic runs safely, as the original process is in an unstable state and cannot be trusted to execute complex logic like memory allocation or stack unwinding safely.