Opened 6 months ago

Last modified 9 days ago

#10642 new enhancement

[hwaccel] AV1 hardware decoding for Apple M3

Reported by: Nomis101
Owned by:
Priority: normal
Component: avcodec
Version: git-master
Keywords: hwaccel videotoolbox av1
Cc:
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Now that the Apple M3 family supports AV1 hardware decoding, it would be nice to add this to the hwaccel framework (videotoolbox.c). CoreMedia provides kCMVideoCodecType_AV1 for this.

https://www.apple.com/newsroom/2023/10/apple-unveils-m3-m3-pro-and-m3-max-the-most-advanced-chips-for-a-personal-computer/
https://developer.apple.com/documentation/coremedia/kcmvideocodectype_av1/

Change History (9)

comment:1 by jeeb, 6 months ago

Unfortunately Apple has not added an AV1 software decoder to VT, even though they ship dav1d as part of Safari for AVIF.

Thus this (very hacky) code was written completely blind, and the only verification done was that the CFDataRef should contain what appears to be a valid av1C: https://github.com/jeeb/ffmpeg/commits/videotoolbox_av1

Only builds as static due to the av1C generation not being properly shared between libraries (as I don't want to start that work until I at least know that things work on some level).

Could be checked with:

ffmpeg -v verbose -hwaccel videotoolbox -i INPUT -map 0:v -f null -

On my 2015 Intel MacBook Pro I just get:

[av1 @ 0x7fbf6e00a1c0] avc1C: size: 17, first bytes: 81 08 4d 00
[av1 @ 0x7fbf6e00a1c0] VideoToolbox decoder for this format not found.
[av1 @ 0x7fbf6e00a1c0] Failed setup for format videotoolbox_vld: hwaccel initialisation returned error.
[av1 @ 0x7fbf6e00a1c0] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 0x7fbf6e00a1c0] Failed to get pixel format.

comment:2 by Nomis101, 5 months ago

Thanks for your input. This will not work on Intel, only on M3 and later (and maybe the latest iPhone). Would you be able to test this on Apple M3 hardware?

comment:3 by Arkelic, 5 months ago

I was able to set up this branch on my M3 MacBook Pro and here is the output when using this sample file:

ffmpeg -v verbose -hwaccel videotoolbox -i /Users/arkelic/Downloads/spbtv_sample_bipbop_av1_960x540_25fps.mp4  -map 0:v -f null -
ffmpeg version N-112731-g8d79cdeccb Copyright (c) 2000-2023 the FFmpeg developers
  built with Apple clang version 15.0.0 (clang-1500.1.0.2.2)
  configuration:
  libavutil      58. 32.100 / 58. 32.100
  libavcodec     60. 33.100 / 60. 33.100
  libavformat    60. 17.100 / 60. 17.100
  libavdevice    60.  4.100 / 60.  4.100
  libavfilter     9. 13.100 /  9. 13.100
  libswscale      7.  6.100 /  7.  6.100
  libswresample   4. 13.100 /  4. 13.100
Selecting decoder 'av1' because of requested hwaccel method videotoolbox
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/Users/arkelic/Downloads/spbtv_sample_bipbop_av1_960x540_25fps.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: avc1isommp42avc1
    creation_time   : 2018-10-05T14:40:45.000000Z
  Duration: 00:00:15.04, start: 0.000000, bitrate: 130 kb/s
  Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 32000 Hz, mono, fltp, 8 kb/s (default)
    Metadata:
      handler_name    : snd
      vendor_id       : [0][0][0][0]
  Stream #0:1[0x2](und): Video: av1 (Main), 1 reference frame (av01 / 0x31307661), yuv420p(tv), 960x540, 116 kb/s, 25 fps, 25 tbr, 90k tbn (default)
    Metadata:
      handler_name    : vid
      vendor_id       : [0][0][0][0]
[out#0/null @ 0x6000038506c0] Adding streams from explicit maps...
[vost#0:0/wrapped_avframe @ 0x12e605db0] Created video stream from input stream 0:1
Stream mapping:
  Stream #0:1 -> #0:0 (av1 (native) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
[av1 @ 0x12e609fb0] avc1C: size: 17, first bytes: 81 04 0c 00
[av1 @ 0x12e609fb0] vt decoder cb: output image buffer is null: -12911
[av1 @ 0x12e609fb0] HW accel end frame fail.
[vist#0:1/av1 @ 0x12e609e60] Error submitting packet to decoder: Unknown error occurred
[av1 @ 0x12e609fb0] VideoToolbox decoder needs reconfig, restarting..
[av1 @ 0x12e609fb0] avc1C: size: 17, first bytes: 81 04 0c 00
[av1 @ 0x12e609fb0] vt decoder cb: output image buffer is null: -12911
[av1 @ 0x12e609fb0] HW accel end frame fail.
[vist#0:1/av1 @ 0x12e609e60] Error submitting packet to decoder: Unknown error occurred
[1]    64153 segmentation fault  ffmpeg -v verbose -hwaccel videotoolbox -i  -map 0:v -f null -
Last edited 5 months ago by Arkelic

comment:4 by NG, 3 months ago

Will we get AV1 on M3 soon?
Personally, I would use it for Moonlight game streaming.

comment:5 by Aaron Graham, 3 months ago

It seems like FFmpeg doesn't support AV1 decoding using the macOS VideoToolbox API yet.

There are a few merge requests and suggestions here, but no final call to merge them:
https://github.com/jeeb/ffmpeg/commits/videotoolbox_av1

comment:6 by quinkblack, 3 months ago

The most difficult thing for developers is that we don't have the hardware to test on. Apple doesn't provide a software implementation for products before M3. It's hard, if not impossible, for an individual developer to get access to the latest hardware.

comment:7 by Aaron Graham, 6 weeks ago

Any luck with this port?

comment:8 by nenkoru, 3 weeks ago

Currently working on this.
I'm trying to overcome the -12911 issue, but no luck so far.
I've already looked at how Chromium and Moonlight implemented this, plus one thread on the Apple developer forums.
I tried adding more info to the configuration that is passed to VT, with no luck.
One thing I noticed is that getting the Sequence Header OBU always returns the same length and bytes for different sample videos: av1C = {length = 17, bytes = 0x81040c000a0b00000024cf7f0dbfff3008};
But I am not sure what length and bytes should be returned. I suppose the length should be 20, but I'm not sure what each bit refers to, so I'm reading the ISO documentation for now. I will try debugging the parsing on an NVENC-enabled host.

https://forums.developer.apple.com/forums/thread/739953
https://github.com/moonlight-stream/moonlight-ios/issues/585
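
To see what each bit of that av1C refers to, here is a minimal C sketch that unpacks the fixed 4-byte header of the record. It assumes the AV1CodecConfigurationRecord layout from the AV1 ISOBMFF binding (marker and version in byte 0, seq_profile and seq_level_idx_0 in byte 1, tier, bit-depth and chroma flags in byte 2). The names Av1cHeader and av1c_parse_header are hypothetical and are not FFmpeg API. Byte 3 (the initial presentation delay fields) is not decoded here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of the fixed 4-byte av1C header (hypothetical struct, for illustration). */
typedef struct Av1cHeader {
    int marker;                  /* expected to be 1 */
    int version;                 /* expected to be 1 */
    int seq_profile;
    int seq_level_idx_0;
    int seq_tier_0;
    int high_bitdepth;
    int twelve_bit;
    int monochrome;
    int chroma_subsampling_x;
    int chroma_subsampling_y;
    int chroma_sample_position;
    size_t config_obus_size;     /* bytes that follow the 4-byte header */
} Av1cHeader;

/* Returns 0 on success, -1 if the buffer is shorter than 4 bytes or the
 * marker/version bits are not the expected 1/1. */
int av1c_parse_header(const uint8_t *buf, size_t size, Av1cHeader *out)
{
    assert(buf && out);
    if (size < 4)
        return -1;

    out->marker                 = buf[0] >> 7;
    out->version                = buf[0] & 0x7f;
    out->seq_profile            = buf[1] >> 5;
    out->seq_level_idx_0        = buf[1] & 0x1f;
    out->seq_tier_0             = (buf[2] >> 7) & 1;
    out->high_bitdepth          = (buf[2] >> 6) & 1;
    out->twelve_bit             = (buf[2] >> 5) & 1;
    out->monochrome             = (buf[2] >> 4) & 1;
    out->chroma_subsampling_x   = (buf[2] >> 3) & 1;
    out->chroma_subsampling_y   = (buf[2] >> 2) & 1;
    out->chroma_sample_position = buf[2] & 3;
    out->config_obus_size       = size - 4;

    if (out->marker != 1 || out->version != 1)
        return -1;
    return 0;
}
```

If that layout assumption holds, the 17 bytes quoted above decode to seq_profile 0, seq_level_idx_0 4, 8-bit 4:2:0, with 13 bytes of configOBUs after the header. Those 13 bytes start with 0x0a 0x0b, which reads as a Sequence Header OBU header followed by a size field of 11, so 4 + 2 + 11 = 17 is at least internally consistent.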

update1:
Following this quote from the forum:

I then got caught out with a decode error because I didn't realise that I also had to pass in the Sequence Header OBU with the first frame data I attempted to decode. It wasn't enough that I had already given the same Sequence Header OBU when creating the video format description (via the extensions).

I ended up checking the OBU of the frame and found that max_frame_width_minus_1 is essentially zero in videotoolbox_av1_end_frame.
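
One way to check the forum's point is to scan the data of the first packet for a Sequence Header OBU before it is handed to VT. The following is a rough C sketch of such a scan, assuming every OBU carries a size field (obu_has_size_field = 1) and using the OBU type value 1 for a sequence header from the AV1 specification. The function names leb128_read and obus_contain_sequence_header are hypothetical helpers, not FFmpeg or VideoToolbox API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OBU_SEQUENCE_HEADER 1

/* Decode an unsigned LEB128 value of at most 8 bytes.
 * Returns the number of bytes consumed, or 0 on truncated/overlong input. */
size_t leb128_read(const uint8_t *buf, size_t size, uint64_t *value)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8 && i < size; i++) {
        v |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if (!(buf[i] & 0x80)) {
            *value = v;
            return i + 1;
        }
    }
    return 0;
}

/* Walk a buffer of size-prefixed OBUs.
 * Returns 1 if a Sequence Header OBU is found, 0 if none is present,
 * and -1 if the data is malformed or an OBU has no size field. */
int obus_contain_sequence_header(const uint8_t *buf, size_t size)
{
    size_t pos = 0;

    assert(buf || size == 0);
    while (pos < size) {
        uint8_t header       = buf[pos];
        int type             = (header >> 3) & 0x0f;
        int has_extension    = (header >> 2) & 1;
        int has_size         = (header >> 1) & 1;
        size_t header_len    = 1 + (size_t)has_extension;
        uint64_t payload_len = 0;
        size_t n;

        /* Reject a set forbidden bit or an OBU without a size field. */
        if ((header & 0x80) || !has_size)
            return -1;
        if (size - pos < header_len)
            return -1;

        n = leb128_read(buf + pos + header_len, size - pos - header_len,
                        &payload_len);
        if (n == 0 || payload_len > size - pos - header_len - n)
            return -1;

        if (type == OBU_SEQUENCE_HEADER)
            return 1;
        pos += header_len + n + (size_t)payload_len;
    }
    return 0;
}
```

If this returns 0 for the first packet, prepending the Sequence Header OBU stored after the 4-byte av1C header would be one thing to try, in line with the forum quote.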

update2:
I compiled Chromium from source and compared the configuration it creates for AV1 video. The av1C is the same as the one ff_isom.. generates. Other configuration changes do not affect the -12911 error. Next I'm going to debug the first frame as generated in both FFmpeg and Chrome to compare them.

Last edited 3 weeks ago by nenkoru

comment:9 by padenot, 9 days ago

Hi, we've recently done this in Firefox in https://bugzilla.mozilla.org/show_bug.cgi?id=1871796 (diff at https://phabricator.services.mozilla.com/D205841, complete source at https://searchfox.org/mozilla-central/source/dom/media/platforms/apple/AppleVTDecoder.cpp). Let us know (e.g. on Matrix at matrix.to/#/#media:mozilla.org or by email at padenot@mozilla.com) if you need us to dump some values, e.g. some extradata.

Firefox is fairly easy to build and work with if you want to do it yourself to compare, please don't hesitate to ask (for that or anything else).
