Opened 5 weeks ago

Closed 7 days ago

#10936 closed defect (fixed)

H264 decoder skips frames when seeking to (non-IDR) recovery point near end of stream

Reported by: arch1t3cht Owned by:
Priority: normal Component: avcodec
Version: git-master Keywords: h264
Cc: Blocked By:
Blocking: Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:

Given an H264 file with a recovery point SEI less than 2 * max_num_reorder_frames frames before the end of the video stream in decoding order (and no IDR frames afterwards), seeking to this recovery point and decoding from there will cause the decoder to drop some frames. With multiple threads, only one frame will be dropped, with only one thread it will be about max_num_reorder_frames frames.

How to reproduce:

Consider the following C code, compiled with latest libavformat/libavcodec (8ca57fcf9ed327a6c2d9c5345be9b7e0724ca048)

#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>

int main(int argc, char *argv[]) {
    if (argc != 5) {
        printf("Usage: %s <filename> <threads> <has_b_frames> <seek pts>", argv[0]);
        return -1;
    }

    char *filename = argv[1];
    int threads = atoi(argv[2]);
    int has_b_frames = atoi(argv[3]);
    int seek_pts = atoi(argv[4]);

    int ret = 0;
    AVFormatContext *format_ctx = NULL;

    ret = avformat_open_input(&format_ctx, filename, NULL, NULL);
    if (ret < 0) {
        printf("Failed to open file\n");
        return -1;
    }

    if (avformat_find_stream_info(format_ctx, NULL) < 0) {
        printf("Failed to find stream info\n");
        avformat_close_input(&format_ctx);
    }

    int video_stream = -1;
    for (size_t i = 0; i < format_ctx->nb_streams; i++) {
        if (format_ctx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
            video_stream = i;
            break;
        }
    }
    
    av_seek_frame(format_ctx, video_stream, seek_pts, 0);

    printf("Opening stream %d\n", video_stream);

    const AVCodec *codec = avcodec_find_decoder(format_ctx->streams[video_stream]->codecpar->codec_id);
    AVCodecContext *codec_ctx = avcodec_alloc_context3(codec);

    ret = avcodec_parameters_to_context(codec_ctx, format_ctx->streams[video_stream]->codecpar);
    if (ret < 0) {
        printf("Failed to copy codec parameters\n");
        return -1;
    }

    codec_ctx->thread_count = threads;
    codec_ctx->has_b_frames = has_b_frames;

    ret = avcodec_open2(codec_ctx, codec, NULL);
    if (ret < 0) {
        printf("Failed to open decoder\n");
        return -1;
    }

    AVPacket *pkt = av_packet_alloc();
    int received_frames = 0;

    while (1) {
        AVFrame *frame = av_frame_alloc();
        int ret = avcodec_receive_frame(codec_ctx, frame);
        if (ret == 0) {
            printf("        Received frame %d with PTS=%ld \n", received_frames, frame->pts);
            received_frames++;
        } else if (ret == AVERROR(EAGAIN)) {
            printf("        EAGAIN when receiving frame\n");
        } else if (ret == AVERROR_EOF) {
            printf("        EOF\n");
            break;
        } else {
            return -1;
        }

        av_frame_free(&frame);

        while (1) {
            ret = av_read_frame(format_ctx, pkt);

            if (ret < 0 && ret != AVERROR_EOF)
                return -1;
            if (ret == AVERROR_EOF || pkt->stream_index == video_stream)
                break;

            av_packet_unref(pkt);
        }

        ret = avcodec_send_packet(codec_ctx, pkt);

        if (ret != AVERROR(EAGAIN)) {
            if (pkt->data) {
                printf("Sent packet DTS=%ld PTS=%ld\n", pkt->dts, pkt->pts);
            } else {
                printf("Sent empty packet\n");
            }
        } else {
            printf("Got EAGAIN when sending packet\n");
        }
        av_packet_unref(pkt);
    }

    av_packet_free(&pkt);
    avformat_close_input(&format_ctx);
    return 0;
}

This just opens the given file, seeks to the given PTS, sets codec_ctx->has_b_frames (equivalently, one could change the max_num_reorder_frames field in the file's bitstream, but this is quicker) and the number of threads to the given value, and then runs a standard decoding loop.

Also consider the attached file interlaced_h264.mkv. The file is an open-gop file with 126 frames and has a recovery point (among others) at PTS 3840 (frame 93 in decoding order, counting from 0, the last frame has PTS 5000). The value of max_num_reorder_frames in the bitstream is 2 (but instead of repeatedly changing the value in the bitstream we simulate this here by setting codec_ctx->has_b_frames).

Running ./myprogram interlaced_h264.mkv 16 2 0 shows that all frames are output as usual. When seeking to PTS 3840 with ./myprogram interlaced_h264.mkv 16 2 3840, all frames after PTS 3840 are properly output, resulting in 29 output frames:

[...]
        Received frame 27 with PTS=4960 
Sent empty packet
        Received frame 28 with PTS=5000 
Sent empty packet
        EOF

However, when setting has_b_frames=15 instead with ./myprogram interlaced_h264.mkv 16 15 3840, a frame is skipped:

[...]
Sent empty packet
        Received frame 14 with PTS=4400 
Sent empty packet
        Received frame 15 with PTS=4440 
Sent empty packet
        Received frame 16 with PTS=4520 
Sent empty packet
        Received frame 17 with PTS=4560 
[...]
        Received frame 28 with PTS=5000 
Sent empty packet
        EOF

We see that the frame with PTS 4480 is no longer output. The same happens with two threads (./myprogram interlaced_h264.mkv 2 15 3840). With just one thread (./myprogram interlaced_h264.mkv 1 15 3840`), the decoder stops completely after frame 4440 and returns EOF after that:

[...]
        Received frame 14 with PTS=4400 
Sent empty packet
        Received frame 15 with PTS=4440 
Sent empty packet
        EOF

Analysis:

The problematic frame with PTS 4480 is skipped because its recovered value is 0, so it's not output by finalize_frame in h264dec.c. Normally, frames to be output are marked as recovered in h264_select_output_frame as soon as the H264Context's frame_recovered field is set (which happens as soon as the recovery point's frame is output by h264_select_output_frame). They're also marked as recovered in h264_field_start when h->frame_recovered is set.
However, the frame with PTS 4480 is decoded and added to the delayed picture buffer before the recovery point frame is output, so it's not set to recovered in h264_field_start. But the frame is only output during draining after all packets are consumed, so it's output with send_next_delayed_frame and not h264_select_output_frame. In send_next_delayed_frame, the frame is not set to recovered, so it's not output by finalize_frame. Depending on the threading logic, this either skips the frame or stops decoding entirely.

So, the precise bug is probably that send_next_delayed_frame does not set recovered, but I'll leave discussing specific patches to the mailing list.

Attachments (1)

interlaced_h264.mkv (343.0 KB ) - added by arch1t3cht 5 weeks ago.

Download all attachments as: .zip

Change History (3)

by arch1t3cht, 5 weeks ago

Attachment: interlaced_h264.mkv added

comment:1 by arch1t3cht, 5 weeks ago

[comment deleted after noticing it was incorrect]

Last edited 5 weeks ago by arch1t3cht (previous) (diff)

comment:2 by arch1t3cht, 7 days ago

Resolution: fixed
Status: newclosed
Note: See TracTickets for help on using tickets.