AOSP Expert & Production Engineering

GPU Profiling

GPU Counter Profiling

When optimizing graphics in AOSP, you must look beyond the CPU. The GPU operates asynchronously, and CPU-side timings do not reflect actual rendering costs. GPU hardware counters provide precise metrics on how the GPU utilizes its execution units, texture cache, and bandwidth.

Each System-on-Chip (SoC) vendor provides specialized tools to read these hardware counters:

  • Qualcomm Snapdragon Profiler: Crucial for Adreno GPUs.
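  • Arm Mobile Studio (since renamed Arm Performance Studio): Its Streamline profiler reads hardware counters on Mali GPUs. The older Mali Graphics Debugger is a frame debugger, not a counter profiler.
  • PowerVR tools (PVRTune): Used for Imagination GPUs.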

These tools typically connect over adb and sample the GPU's hardware counter registers, letting you track metrics such as ALU utilization, texture fetch stalls, and memory bandwidth saturation in real time.

RenderDoc for GPU Frame Capture

RenderDoc is a widely used open-source tool for frame capture and graphics debugging. It captures a single frame of an OpenGL ES or Vulkan application and lets you step through every draw call in that frame.

To use RenderDoc on an Android device:

  1. Ensure the target application declares android:debuggable="true" in its manifest, or use a rooted device.
  2. Launch the RenderDoc host application.
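  3. Select the Android device as the replay context (it appears in the context list at the bottom-left of the main window once adb can see it) and wait for RenderDoc to start its remote server on the device.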
  4. Launch the target package through RenderDoc.
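Step 1 can be checked from the host before launching RenderDoc. The sketch below is a minimal helper that assumes the package dump prints application flags on a line of the form flags=[ DEBUGGABLE ... ] (older releases print pkgFlags=[ ... ]); the exact layout can vary between Android releases, and the function name is hypothetical.

```shell
# Reads `dumpsys package <pkg>` output on stdin and reports whether the
# DEBUGGABLE application flag is present.
# Assumption: flags appear on a "flags=[ ... ]" or "pkgFlags=[ ... ]" line.
is_debuggable() {
    if grep -E '[Ff]lags=\[.*DEBUGGABLE' >/dev/null; then
        echo "debuggable"
    else
        echo "not debuggable"
    fi
}
```

With a device attached, it could be used as: adb shell dumpsys package com.example.app | is_debuggable (com.example.app is a placeholder package name).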

RenderDoc allows you to inspect:

  • Pipeline State: View the exact shaders, blending modes, and depth tests active for a draw call.
  • Textures and Buffers: Inspect the contents of render targets, uniform buffers, and vertex data.

Shader Bottleneck Identification

Shaders are small programs that run on the GPU's shader cores. Because a fragment shader executes once per covered pixel, a poorly optimized one can dominate frame time and stall the rest of the rendering pipeline.

Identifying shader bottlenecks involves checking if the GPU is:

  • ALU Bound: The shader performs too many complex mathematical operations. Mitigation involves simplifying math or using lower precision types like mediump instead of highp.
  • Texture/Memory Bound: The shader is waiting on memory reads. Mitigation involves reducing texture resolution, enabling texture compression (e.g., ASTC), or optimizing UV access patterns.
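The two cases above can be turned into a rough triage rule once you have counter readings. The sketch below is illustrative only: the function name, its inputs (ALU utilization and texture stall time, both as whole-number percentages read manually from a vendor profiler), and the default 60% threshold are all assumptions, not values defined by any vendor tool.

```shell
# Classifies a workload from two counter readings (integer percentages).
#   $1 = ALU utilization, $2 = texture/memory stall percentage
#   $3 = optional threshold (hypothetical default: 60)
classify_bottleneck() {
    alu_pct=$1
    tex_stall_pct=$2
    threshold=${3:-60}
    if [ "$alu_pct" -ge "$threshold" ] && [ "$alu_pct" -ge "$tex_stall_pct" ]; then
        echo "ALU bound"
    elif [ "$tex_stall_pct" -ge "$threshold" ]; then
        echo "texture/memory bound"
    else
        echo "inconclusive"
    fi
}
```

For example, classify_bottleneck 85 20 reports an ALU-bound workload, which points toward simplifying math or lowering precision rather than shrinking textures.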

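Vendor tooling often includes a shader analysis view or an offline compiler (for example, Arm's Mali Offline Compiler) that estimates the instruction and cycle cost of a GLSL or SPIR-V shader. The level of detail varies by vendor, and not every tool reports costs per source line.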

GPU Memory Usage Analysis

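Excessive GPU memory usage can lead to Out-Of-Memory (OOM) kills and system-wide jank. On most Android devices the GPU shares system RAM with the CPU, so large graphics allocations force the kernel to reclaim or compress pages elsewhere (Android typically uses zram rather than disk swap). In AOSP, you can track graphics buffers allocated as GraphicBuffer objects through the gralloc allocator.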

To inspect system-wide graphics memory allocations, use the dumpsys command against the meminfo or SurfaceFlinger services:

# Check overall memory info, look for "Graphics" and "GL mtrack"
adb shell dumpsys meminfo

# Check SurfaceFlinger's specific GraphicBuffer allocations
adb shell dumpsys SurfaceFlinger
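To track these numbers over time rather than reading them by eye, the relevant rows can be extracted with awk. The sketch below assumes the per-package form of the command (dumpsys meminfo followed by a package name), where the App Summary contains a "Graphics:" row and the category table contains a "GL mtrack" row, each with a value in KB as the first number. The function names are hypothetical and the column layout can differ between Android releases.

```shell
# Each function reads `dumpsys meminfo <package>` output on stdin.

# Prints the first number on the App Summary "Graphics:" row (KB).
graphics_kb() {
    awk '!found && /^[ \t]*Graphics:/ { print $2; found = 1 }'
}

# Prints the first number on the "GL mtrack" row (KB).
# The anchored pattern skips the separate "EGL mtrack" row.
gl_mtrack_kb() {
    awk '!found && /^[ \t]*GL mtrack/ { print $3; found = 1 }'
}
```

With a device attached, a possible invocation is: adb shell dumpsys meminfo com.example.app | graphics_kb (com.example.app is a placeholder package name).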

A common AOSP optimization strategy involves ensuring UI elements do not allocate overly large textures and that off-screen buffers are promptly destroyed or reused.