Perfetto Tracing Mastery
Perfetto is the next-generation system profiling and tracing tool in Android, replacing the legacy Systrace. It allows for system-wide, high-frequency telemetry collection with minimal overhead.
Comprehensive Trace Collection Config
Perfetto relies on a protobuf-based configuration to define what data sources to enable. A comprehensive configuration might look like this:
buffers {
  size_kb: 65536
  fill_policy: RING_BUFFER
}
data_sources {
  config {
    name: "linux.ftrace"
    ftrace_config {
      ftrace_events: "sched/sched_switch"
      ftrace_events: "power/cpu_frequency"
      ftrace_events: "power/cpu_idle"
      atrace_categories: "gfx"
      atrace_categories: "view"
      atrace_categories: "binder_driver"
      atrace_categories: "dalvik"
    }
  }
}
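When the list of events changes often, it can be convenient to generate the text config instead of editing it by hand. The following is a minimal sketch, not part of Perfetto itself: build_trace_config is a hypothetical helper name, and the optional duration_ms argument is an addition that is not in the config above (it maps to the duration_ms field of the trace config and is omitted unless requested).

```python
def build_trace_config(ftrace_events, atrace_categories,
                       buffer_kb=65536, duration_ms=None):
    """Return a text-format Perfetto trace config as a string.

    Hypothetical helper: it only mirrors the structure of the hand-written
    config shown above and does not validate event or category names.
    """
    lines = [
        "buffers {",
        f"  size_kb: {buffer_kb}",
        "  fill_policy: RING_BUFFER",
        "}",
    ]
    # Assumption: callers may want a fixed trace length; omitted by default.
    if duration_ms is not None:
        lines.append(f"duration_ms: {duration_ms}")
    lines += [
        "data_sources {",
        "  config {",
        '    name: "linux.ftrace"',
        "    ftrace_config {",
    ]
    lines += [f'      ftrace_events: "{event}"' for event in ftrace_events]
    lines += [f'      atrace_categories: "{cat}"' for cat in atrace_categories]
    lines += ["    }", "  }", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_trace_config(
        ["sched/sched_switch", "power/cpu_frequency", "power/cpu_idle"],
        ["gfx", "view", "binder_driver", "dalvik"],
    ))
```

Running the script prints a config equivalent to the one above, which can be saved as config.pbtx.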
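You can pipe this config to the device over stdin and start tracing. The --txt flag tells perfetto that the config is in text format rather than binary-encoded protobuf:
cat config.pbtx | adb shell perfetto -c - --txt -o /data/misc/perfetto-traces/trace.perfetto-trace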
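The same invocation can be scripted. This is a rough sketch that assumes adb is on the PATH and a single device is attached; perfetto_command, pull_command, and record_trace are hypothetical helper names, and the local file name trace.perfetto-trace is an arbitrary choice.

```python
import subprocess

DEVICE_TRACE = "/data/misc/perfetto-traces/trace.perfetto-trace"


def perfetto_command(out_path=DEVICE_TRACE):
    """Build the adb argv that reads a text config from stdin (-c - --txt)."""
    return ["adb", "shell", "perfetto", "-c", "-", "--txt", "-o", out_path]


def pull_command(out_path=DEVICE_TRACE, local_path="trace.perfetto-trace"):
    """Build the adb argv that copies the finished trace to the host."""
    return ["adb", "pull", out_path, local_path]


def record_trace(config_text, out_path=DEVICE_TRACE):
    """Feed the config over stdin, wait for tracing to end, then pull the file.

    Assumption: the config sets duration_ms or the trace is stopped by other
    means; otherwise the first call blocks until perfetto exits.
    """
    subprocess.run(perfetto_command(out_path),
                   input=config_text.encode("utf-8"), check=True)
    subprocess.run(pull_command(out_path), check=True)
```

Keeping the argv construction separate from the subprocess call makes the command easy to inspect or log before anything is sent to the device.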
Diagnosing Frame Jank via Frame Timeline
When analyzing a Perfetto trace in the UI (ui.perfetto.dev), the "Frame Timeline" track is essential for UI debugging.
The UI thread emits Choreographer#doFrame slices. If a frame takes longer than the vsync interval (about 16.67 ms at 60 fps), it is marked in red.
To find the root cause, click the red frame and look at the lower tracks:
- CPU Track: Was the UI thread actually running, or was it descheduled?
- Lock Contention: Is the thread blocked in a monitor state (indicated by monitor contention with owner)?
- RenderThread: Did the UI thread finish quickly, but the RenderThread took 30ms to process the DisplayList?
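The vsync budget mentioned above is simple arithmetic: 1000 ms divided by the refresh rate. The sketch below uses hypothetical helper names and a made-up list of frame durations to flag frames that exceed that budget. It is a simplification: it compares only total frame duration against the budget and does not reproduce the classification logic the Frame Timeline track actually applies.

```python
def frame_budget_ms(refresh_hz):
    """Time available per frame, in milliseconds, at the given refresh rate."""
    return 1000.0 / refresh_hz


def janky_frames(durations_ms, refresh_hz=60.0):
    """Return (index, duration) pairs for frames that exceed the vsync budget."""
    budget = frame_budget_ms(refresh_hz)
    return [(i, d) for i, d in enumerate(durations_ms) if d > budget]


if __name__ == "__main__":
    # Illustrative durations only, not taken from a real trace.
    sample = [8.0, 16.0, 30.0, 17.0]
    print(round(frame_budget_ms(60), 2))  # 16.67
    print(janky_frames(sample))           # [(2, 30.0), (3, 17.0)]
```

At 90 Hz or 120 Hz the budget shrinks to roughly 11.1 ms and 8.3 ms, so a frame that is acceptable on a 60 Hz panel may still be flagged on a faster display.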
Memory Analysis via Perfetto Heap Profiling
Perfetto can capture native memory allocations using heapprofd. Unlike older tools such as malloc_debug, which track every allocation, heapprofd samples allocations, so it incurs much lower overhead.
By enabling the android.heapprofd data source, Perfetto intercepts malloc and free calls and records the call stack for a sample of allocations. In the Perfetto UI, you can use the Flame Graph viewer to visually identify which functions are allocating memory that is not being freed.
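To make the idea of "allocated but not freed" concrete, here is a toy model of the aggregation behind such a flame graph. It is not heapprofd's implementation: the event tuples, the unreleased_by_stack name, and the sample stacks are all invented for illustration, and sampling is ignored entirely.

```python
from collections import defaultdict


def unreleased_by_stack(events):
    """Sum the bytes still live at the end of the event stream, per call stack.

    events is a sequence of either
      ("malloc", address, size, stack)  where stack is a tuple of frame names
      ("free", address)
    """
    live = {}  # address -> (stack, size) for allocations not yet freed
    for event in events:
        if event[0] == "malloc":
            _, address, size, stack = event
            live[address] = (stack, size)
        elif event[0] == "free":
            # Ignore frees of addresses we never saw allocated.
            live.pop(event[1], None)

    totals = defaultdict(int)
    for stack, size in live.values():
        totals[stack] += size
    return dict(totals)


if __name__ == "__main__":
    decode = ("main", "loadImage", "decode")
    parse = ("main", "parseConfig")
    events = [
        ("malloc", 0x1000, 4096, decode),
        ("malloc", 0x2000, 128, parse),
        ("free", 0x2000),
        ("malloc", 0x3000, 4096, decode),
    ]
    print(unreleased_by_stack(events))
```

In this example only the decode stack retains memory (8192 bytes), which is the kind of stack that would stand out in the flame graph.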
Tracking Binder Transactions
Binder transactions are notorious for causing subtle latency issues. Perfetto traces Binder calls across process boundaries using the binder_driver atrace category.
In the Perfetto UI, when you click on a binder transaction slice in the client process, Perfetto automatically draws an arrow connecting it to the binder reply slice in the target server process. This allows you to measure exactly how much time was spent traversing the kernel driver versus executing the remote method in the server application.
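Once the two linked slices are identified, the split is a subtraction. The sketch below assumes you have read the client-side transaction duration and the server-side reply duration (both in nanoseconds, as Perfetto reports them) from the UI or a query; binder_breakdown is a hypothetical helper name. Note that the remainder covers everything outside the server's handler, which may include scheduling delay as well as time in the kernel driver.

```python
def binder_breakdown(client_dur_ns, server_dur_ns):
    """Split a binder call into server execution time and remaining overhead.

    client_dur_ns: duration of the transaction slice in the calling process.
    server_dur_ns: duration of the linked reply slice in the server process.
    """
    if server_dur_ns > client_dur_ns:
        raise ValueError("server slice cannot be longer than the client call")
    overhead_ns = client_dur_ns - server_dur_ns
    return {
        "server_ms": server_dur_ns / 1e6,
        "overhead_ms": overhead_ns / 1e6,
    }


if __name__ == "__main__":
    # Illustrative numbers only: a 5 ms call with 3.5 ms spent in the server.
    print(binder_breakdown(5_000_000, 3_500_000))
```

A large overhead relative to server time suggests looking at thread scheduling on either side rather than at the remote method itself.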
You can also write SQL queries against the trace to find the longest Binder calls:
SELECT slice.name, slice.dur / 1e6 AS duration_ms
FROM slice
WHERE slice.name LIKE 'binder%'
ORDER BY slice.dur DESC
LIMIT 10;
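To see what the query above returns without loading a trace, it can be run against a stand-in table. This sketch uses Python's built-in sqlite3 module with a two-column slice table, which is only a small subset of the real trace processor schema; the sample rows are invented. Durations are stored in nanoseconds, matching Perfetto's dur column, so dividing by 1e6 yields milliseconds.

```python
import sqlite3

# The query from the text, unchanged.
QUERY = """
SELECT slice.name, slice.dur / 1e6 AS duration_ms
FROM slice
WHERE slice.name LIKE 'binder%'
ORDER BY slice.dur DESC
LIMIT 10;
"""


def longest_binder_slices(rows):
    """Run QUERY against an in-memory stand-in for the slice table.

    rows is an iterable of (name, dur_ns) tuples.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE slice (name TEXT, dur INTEGER)")
        conn.executemany("INSERT INTO slice VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("binder transaction", 4_000_000),
        ("Choreographer#doFrame", 20_000_000),
        ("binder reply", 1_500_000),
    ]
    for name, duration_ms in longest_binder_slices(sample):
        print(f"{name}: {duration_ms} ms")
```

The non-binder slice is filtered out even though it is the longest, and the remaining rows come back sorted from longest to shortest.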