Perfetto Tracing Mastery

Perfetto is Android's system-wide profiling and tracing tool and the successor to the legacy Systrace. It collects high-frequency telemetry from the kernel and from userspace processes with low overhead.

Comprehensive Trace Collection Config

Perfetto relies on a protobuf-based configuration to define what data sources to enable. A comprehensive configuration might look like this:

buffers {
    size_kb: 65536
    fill_policy: RING_BUFFER
}
data_sources {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "sched/sched_switch"
            ftrace_events: "power/cpu_frequency"
            ftrace_events: "power/cpu_idle"
            atrace_categories: "gfx"
            atrace_categories: "view"
            atrace_categories: "binder_driver"
            atrace_categories: "dalvik"
        }
    }
}

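You can pipe this config to the device over adb and start tracing. The --txt flag tells perfetto that the config is in protobuf text format rather than binary:

cat config.pbtx | adb shell perfetto -c - --txt -o /data/misc/perfetto-traces/trace.perfetto-trace

The config above sets no duration_ms, so the session has no time limit and runs until it is stopped. Add a top-level duration_ms field to bound it.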
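When the list of events changes from one investigation to the next, it can help to generate the config text instead of editing it by hand. The following is a minimal sketch under stated assumptions: build_trace_config is a hypothetical helper (not part of Perfetto), and the 10-second duration_ms default is an arbitrary choice for illustration.

```python
def build_trace_config(ftrace_events, atrace_categories,
                       buffer_kb=65536, duration_ms=10000):
    """Return a Perfetto trace config in protobuf text format.

    Hypothetical helper: it only assembles text. The field names mirror
    the config shown above, plus a top-level duration_ms.
    """
    lines = [
        "buffers {",
        f"    size_kb: {buffer_kb}",
        "    fill_policy: RING_BUFFER",
        "}",
        "data_sources {",
        "    config {",
        '        name: "linux.ftrace"',
        "        ftrace_config {",
    ]
    # One repeated field per kernel ftrace event and per atrace category.
    lines += [f'            ftrace_events: "{e}"' for e in ftrace_events]
    lines += [f'            atrace_categories: "{c}"' for c in atrace_categories]
    lines += ["        }", "    }", "}"]
    # Assumed default: stop the session after duration_ms milliseconds.
    lines.append(f"duration_ms: {duration_ms}")
    return "\n".join(lines) + "\n"


config_text = build_trace_config(
    ["sched/sched_switch", "power/cpu_frequency", "power/cpu_idle"],
    ["gfx", "view", "binder_driver", "dalvik"],
)
```

Writing config_text to config.pbtx produces a file that can be piped to perfetto in the same way as the hand-written config.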

Diagnosing Frame Jank via Frame Timeline

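When analyzing a Perfetto trace in the UI (ui.perfetto.dev), the frame timeline tracks are essential for UI debugging. These tracks come from the android.surfaceflinger.frametimeline data source (Android 12 and later), which must be enabled in addition to the ftrace config shown earlier. The UI thread emits Choreographer#doFrame slices. A frame that overruns its deadline (roughly one vsync interval, e.g., 16.6 ms at 60 fps) is marked in red. To find the root cause, click the red frame and inspect the tracks below it: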

  • CPU Track: Was the UI thread actually running, or was it descheduled?
  • Lock Contention: Is the thread blocked on a monitor (shown as a slice whose name begins with "monitor contention with owner")?
  • RenderThread: Did the UI thread finish quickly, but the RenderThread took 30ms to process the DisplayList?
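The three questions above amount to asking which contributor dominates a late frame. The sketch below illustrates that reasoning on hand-entered numbers. It is a crude heuristic, not something Perfetto computes: FrameSample and triage are hypothetical names, and the per-frame timings are assumed to have been read off the trace manually.

```python
from dataclasses import dataclass


@dataclass
class FrameSample:
    total_ms: float        # wall-clock duration of the frame
    ui_running_ms: float   # UI thread actually on a CPU
    ui_runnable_ms: float  # UI thread ready to run but descheduled
    ui_blocked_ms: float   # UI thread waiting on a contended monitor
    render_ms: float       # RenderThread time for the same frame


def triage(frame, refresh_hz=60.0):
    """Return a label for the largest contributor to a late frame."""
    budget_ms = 1000.0 / refresh_hz  # about 16.67 ms at 60 Hz
    if frame.total_ms <= budget_ms:
        return "on time"
    contributors = {
        "UI thread work": frame.ui_running_ms,
        "descheduled": frame.ui_runnable_ms,
        "lock contention": frame.ui_blocked_ms,
        "RenderThread": frame.render_ms,
    }
    # Assumption: the single largest bucket is the most likely culprit.
    return max(contributors, key=contributors.get)


# Invented numbers: a quick UI thread followed by a 30 ms RenderThread pass.
slow_render = FrameSample(total_ms=40.0, ui_running_ms=5.0,
                          ui_runnable_ms=1.0, ui_blocked_ms=2.0,
                          render_ms=30.0)
verdict = triage(slow_render)
```

The largest bucket is only a starting point. A frame can be late for several reasons at once, so the tracks themselves remain the source of truth.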

Memory Analysis via Perfetto Heap Profiling

Perfetto can capture native memory allocations using heapprofd. Because heapprofd samples allocations rather than recording every one, it incurs much lower overhead than older tools such as malloc_debug.

By enabling the android.heapprofd data source, Perfetto intercepts malloc and free calls in the target process and records call stacks for the sampled allocations. In the Perfetto UI, you can use the Flame Graph viewer to visually identify which functions are allocating memory that is not being freed.
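To make the idea of allocated but not freed concrete, the sketch below rolls unreleased bytes up a call stack the way a flame graph does: each function is charged for everything allocated beneath it. This is an illustration only. unreleased_by_function is a hypothetical helper, and the function names and byte counts are invented rather than taken from a real profile.

```python
from collections import defaultdict


def unreleased_by_function(samples):
    """Aggregate unreleased bytes per function, flame-graph style.

    samples: iterable of (callstack, allocated_bytes, freed_bytes),
    where callstack is a root-first tuple of function names.
    Returns (function, unreleased_bytes) pairs, largest first.
    """
    totals = defaultdict(int)
    for stack, allocated, freed in samples:
        unreleased = allocated - freed
        # set() so a recursive function is charged once per stack.
        for function in set(stack):
            totals[function] += unreleased
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Invented samples for illustration.
samples = [
    (("main", "loadBitmaps", "decode"), 4096, 0),     # never freed
    (("main", "loadBitmaps", "decode"), 8192, 8192),  # fully freed
    (("main", "parseConfig"), 1024, 1024),            # fully freed
    (("main", "cacheInsert"), 2048, 512),             # partly freed
]
ranking = unreleased_by_function(samples)
```

In this toy data, main carries the full 5632 unreleased bytes because every stack passes through it, while decode and cacheInsert show where those bytes actually originate. Reading a flame graph works the same way: follow the widest bar downward until it stops narrowing.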

Tracking Binder Transactions

Binder transactions are notorious for causing subtle latency issues. Perfetto traces Binder calls across process boundaries using the binder_driver atrace category, which enables the kernel's binder ftrace events.

In the Perfetto UI, when you click on a binder transaction slice in the client process, Perfetto automatically draws an arrow connecting it to the binder reply slice in the target server process. This allows you to measure exactly how much time was spent traversing the kernel driver versus executing the remote method in the server application.
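The split described above is simple arithmetic on the two slice durations. The sketch below assumes you have read the client transaction duration and the server reply duration (both in nanoseconds) from the UI. binder_breakdown is a hypothetical helper, and the remainder it reports covers everything outside the server method, including scheduling delay as well as driver time.

```python
def binder_breakdown(client_dur_ns, server_dur_ns):
    """Split a binder call into server execution and everything else.

    client_dur_ns: duration of the client-side transaction slice.
    server_dur_ns: duration of the matching server-side reply slice.
    Returns durations in milliseconds.
    """
    if server_dur_ns > client_dur_ns:
        raise ValueError("server slice cannot outlast the client slice")
    overhead_ns = client_dur_ns - server_dur_ns
    return {
        "server_ms": server_dur_ns / 1e6,
        # Driver transit plus wake-up and scheduling latency (assumption:
        # the two slices belong to the same synchronous transaction).
        "overhead_ms": overhead_ns / 1e6,
        "overhead_share": overhead_ns / client_dur_ns,
    }


# Invented durations: 12.5 ms seen by the client, 9 ms spent in the server.
breakdown = binder_breakdown(12_500_000, 9_000_000)
```

A high overhead share usually points away from the server's code and toward scheduling or thread-pool starvation, which the CPU tracks can confirm.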

You can also write SQL queries against the trace to find the longest Binder calls:

SELECT slice.name, slice.dur / 1e6 AS duration_ms
FROM slice
WHERE slice.name LIKE 'binder%'
ORDER BY slice.dur DESC
LIMIT 10;
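To see what this query returns without a trace at hand, the sketch below runs the same SQL against an in-memory SQLite table that imitates a small subset of the slice table (only name and dur). The rows are invented, and longest_binder_slices is a hypothetical helper. A real trace would be queried through the Perfetto UI or trace processor instead.

```python
import sqlite3

# Same query as above; dur is in nanoseconds, so / 1e6 yields milliseconds.
QUERY = """
SELECT slice.name, slice.dur / 1e6 AS duration_ms
FROM slice
WHERE slice.name LIKE 'binder%'
ORDER BY slice.dur DESC
LIMIT 10;
"""


def longest_binder_slices(rows):
    """Run QUERY against a toy slice table built from (name, dur_ns) rows."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumption: only the two columns the query touches are modelled.
        conn.execute("CREATE TABLE slice (name TEXT, dur INTEGER)")
        conn.executemany("INSERT INTO slice VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


# Invented rows for illustration.
toy_rows = [
    ("binder transaction", 12_500_000),
    ("binder reply", 9_000_000),
    ("Choreographer#doFrame", 31_000_000),
    ("binder transaction async", 400_000),
]
top = longest_binder_slices(toy_rows)
```

The Choreographer#doFrame row is filtered out by the LIKE clause even though it is the longest slice, which shows that the name filter is what limits the results to Binder activity.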