Overview
MediaCodec is the primary Android API for accessing low-level media encoders and decoders. It is available to Java and Kotlin code, with a C counterpart (AMediaCodec) exposed through the NDK for native code. It acts as a wrapper around the native Stagefright engine and the Codec2 framework, allowing application developers to feed compressed data in and get uncompressed data out (or vice versa) with tight control over the buffer flow.
The MediaCodec API: Encode and Decode
MediaCodec operates on a buffer-in, buffer-out model. The application acts as a client that continuously interacts with the codec's input and output queues.
- Request an empty input buffer from the codec.
- Fill the buffer with data (e.g., a compressed H.264 NAL unit or raw PCM audio).
- Queue the input buffer back to the codec for processing.
- Dequeue a filled output buffer from the codec (e.g., the decoded YUV frame or compressed AAC data).
- Consume the data and release the output buffer back to the codec.
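The five steps above can be sketched with a toy model in plain Java. This is not the Android API: ToyCodec, ToyCodecDemo, and the "upper-case the bytes" processing are hypothetical stand-ins chosen only to show how buffer indices move between the client and the codec. Unlike a real codec, the toy does its work synchronously inside queueInputBuffer.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;

/** Toy model of the buffer-in, buffer-out cycle. Hypothetical; not the Android API. */
class ToyCodec {
    private final byte[][] inputBuffers;
    private final byte[][] outputBuffers;
    private final int[] outputSizes;
    private final ArrayDeque<Integer> freeInputs = new ArrayDeque<>();
    private final ArrayDeque<Integer> freeOutputs = new ArrayDeque<>();
    private final ArrayDeque<Integer> filledOutputs = new ArrayDeque<>();

    ToyCodec(int bufferCount, int capacity) {
        inputBuffers = new byte[bufferCount][capacity];
        outputBuffers = new byte[bufferCount][capacity];
        outputSizes = new int[bufferCount];
        for (int i = 0; i < bufferCount; i++) {
            freeInputs.add(i);
            freeOutputs.add(i);
        }
    }

    /** Step 1: index of an empty input buffer, or -1 if the client holds them all. */
    int dequeueInputBuffer() {
        Integer index = freeInputs.poll();
        return index == null ? -1 : index;
    }

    byte[] getInputBuffer(int index) { return inputBuffers[index]; }

    /** Step 3: hand the filled buffer back. "Processing" here is ASCII upper-casing. */
    void queueInputBuffer(int index, int size) {
        Integer out = freeOutputs.poll();
        if (out == null) throw new IllegalStateException("no free output buffer");
        for (int i = 0; i < size; i++) {
            outputBuffers[out][i] = (byte) Character.toUpperCase(inputBuffers[index][i]);
        }
        outputSizes[out] = size;
        filledOutputs.add(out);
        freeInputs.add(index); // the input buffer is empty again
    }

    /** Step 4: index of a filled output buffer, or -1 if nothing is ready. */
    int dequeueOutputBuffer() {
        Integer index = filledOutputs.poll();
        return index == null ? -1 : index;
    }

    byte[] getOutputBuffer(int index) { return outputBuffers[index]; }

    int getOutputSize(int index) { return outputSizes[index]; }

    /** Step 5: give the output buffer back so the codec can reuse it. */
    void releaseOutputBuffer(int index) { freeOutputs.add(index); }
}

class ToyCodecDemo {
    /** Pushes text through the toy codec in chunks and collects the output. */
    static String process(String text, int chunk) {
        ToyCodec codec = new ToyCodec(2, chunk);
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        StringBuilder result = new StringBuilder();
        int offset = 0;
        while (offset < data.length) {
            int in = codec.dequeueInputBuffer();                       // step 1
            if (in >= 0) {
                int size = Math.min(chunk, data.length - offset);
                System.arraycopy(data, offset, codec.getInputBuffer(in), 0, size); // step 2
                codec.queueInputBuffer(in, size);                      // step 3
                offset += size;
            }
            int out = codec.dequeueOutputBuffer();                     // step 4
            if (out >= 0) {
                result.append(new String(codec.getOutputBuffer(out), 0,
                        codec.getOutputSize(out), StandardCharsets.US_ASCII));
                codec.releaseOutputBuffer(out);                        // step 5
            }
        }
        return result.toString();
    }
}
```

The point of the model is ownership: a buffer index belongs either to the codec or to the client, never both, and forgetting step 5 eventually starves the codec of output buffers.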
MediaCodec State Machine
To use MediaCodec correctly, developers must understand its strict internal state machine.
- Uninitialized: The codec has just been created, for example via MediaCodec.createDecoderByType().
- Configured: The app calls configure(), supplying the MediaFormat (resolution, bitrate, and other keys) and a target Surface (if applicable).
- Flushed: The app calls start(). The codec is now ready to receive data.
- Running: Buffers are actively being processed.
- End-of-Stream (EOS): The app flags an input buffer with BUFFER_FLAG_END_OF_STREAM. The codec processes remaining buffers and flags the final output buffer with EOS.
- Released: The codec is destroyed, and hardware resources are freed.
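One way to internalize these states is to model them as a small transition table. The sketch below is a simplified, hypothetical model in plain Java (CodecState and CodecStateModel are illustrative names, not Android classes). It covers the transitions listed above, plus flush() returning to Flushed, and it assumes Released is terminal and reachable from any other state. The real MediaCodec has additional paths (such as stop(), reset(), and an error state) that are left out here.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum CodecState { UNINITIALIZED, CONFIGURED, FLUSHED, RUNNING, END_OF_STREAM, RELEASED }

/** Simplified, hypothetical model of the lifecycle described above. */
class CodecStateModel {
    private static final Map<CodecState, Set<CodecState>> ALLOWED =
            new EnumMap<>(CodecState.class);

    static {
        ALLOWED.put(CodecState.UNINITIALIZED, EnumSet.of(CodecState.CONFIGURED)); // configure()
        ALLOWED.put(CodecState.CONFIGURED, EnumSet.of(CodecState.FLUSHED));       // start()
        ALLOWED.put(CodecState.FLUSHED, EnumSet.of(CodecState.RUNNING));          // first buffer
        ALLOWED.put(CodecState.RUNNING,
                EnumSet.of(CodecState.END_OF_STREAM, CodecState.FLUSHED));        // EOS or flush()
        ALLOWED.put(CodecState.END_OF_STREAM, EnumSet.of(CodecState.FLUSHED));    // flush()
        ALLOWED.put(CodecState.RELEASED, EnumSet.noneOf(CodecState.class));       // terminal
    }

    private CodecState state = CodecState.UNINITIALIZED;

    CodecState state() { return state; }

    /** Moves to the next state, or throws if the model does not allow the transition. */
    void moveTo(CodecState next) {
        boolean allowed = (next == CodecState.RELEASED)
                ? state != CodecState.RELEASED        // release is modeled as valid from any live state
                : ALLOWED.get(state).contains(next);
        if (!allowed) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }
}
```

The real API behaves in a similar spirit: calling a method in the wrong state (for example, queueing input before start()) raises an IllegalStateException rather than being silently ignored.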
MediaCodec Surface Mode for Video
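While MediaCodec can output raw video frames (YUV) into ByteBuffer objects for the CPU to process, this requires mapping or copying every decoded frame into CPU-accessible memory, which is expensive at video resolutions and frame rates. For video playback, MediaCodec is almost always used in Surface Mode.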
Zero-Copy Rendering
During the configure() step, the app provides a Surface (e.g., from a SurfaceView, or one created from a TextureView's SurfaceTexture).
MediaCodec codec = MediaCodec.createDecoderByType("video/avc");
codec.configure(format, surface, null, 0);
codec.start();
When using a Surface:
- The output buffers returned by dequeueOutputBuffer do not contain actual pixel data. They act as tokens.
- The hardware decoder writes the decoded frame directly into a GraphicBuffer owned by the Surface.
- When the app calls releaseOutputBuffer(index, true /* render */), it tells MediaCodec to send that GraphicBuffer to the Surface's consumer (SurfaceFlinger, in the case of a SurfaceView) for display.
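This creates a pipeline in which decoded frames are never copied through CPU memory: Network -> MediaExtractor -> MediaCodec (Hardware VPU) -> SurfaceFlinger (Hardware Display Controller). Only the much smaller compressed input passes through app-visible buffers.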
Asynchronous Mode vs Synchronous Mode
Historically, apps used a while loop to poll dequeueInputBuffer and dequeueOutputBuffer (Synchronous mode). Starting in Android 5.0 (API level 21), Asynchronous mode via callbacks became the preferred approach.
Asynchronous Implementation
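// Register the callback before calling configure().
codec.setCallback(new MediaCodec.Callback() {
    @Override
    public void onInputBufferAvailable(MediaCodec mc, int inputBufferId) {
        // 1. Get the empty buffer
        ByteBuffer inputBuffer = mc.getInputBuffer(inputBufferId);
        // 2. Read data from file/network into inputBuffer; this step yields
        //    size (bytes written) and pts (presentation time in microseconds)
        // 3. Queue the buffer
        mc.queueInputBuffer(inputBufferId, 0, size, pts, 0);
    }

    @Override
    public void onOutputBufferAvailable(MediaCodec mc, int outputBufferId, MediaCodec.BufferInfo info) {
        // Frame is ready. Render it to the Surface.
        mc.releaseOutputBuffer(outputBufferId, true);
    }

    @Override
    public void onOutputFormatChanged(MediaCodec mc, MediaFormat format) {
        // The output format (e.g., resolution) changed
    }

    @Override
    public void onError(MediaCodec mc, MediaCodec.CodecException e) {
        // Handle or report the codec error
    }
});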
Asynchronous mode eliminates polling, reduces CPU wakeups, and lets the framework notify the application as soon as a buffer becomes available, which tends to give smoother playback and better power efficiency.
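The pts argument passed to queueInputBuffer in the callback above is a presentation timestamp in microseconds. When samples come from MediaExtractor, its getSampleTime() already supplies this value. For a source with no container timestamps, one common approach is to derive it from the frame index, assuming a constant frame rate. The helper below is a minimal sketch under that assumption; PtsExample and presentationTimeUs are hypothetical names, not part of the Android API.

```java
/** Hypothetical helper: derives timestamps for a constant-frame-rate stream. */
class PtsExample {
    /** Presentation time in microseconds for the given zero-based frame index. */
    static long presentationTimeUs(long frameIndex, int framesPerSecond) {
        // Multiply before dividing so rounding error does not accumulate per frame.
        return frameIndex * 1_000_000L / framesPerSecond;
    }
}
```

Computing each timestamp from the frame index, rather than repeatedly adding a rounded per-frame duration, keeps long streams from drifting.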