Opened 9 years ago
Last modified 5 years ago
#5080 open enhancement
Excessive HTTP GETs reading MP4 from web server
Reported by: | Neal McConnell | Owned by: | |
---|---|---|---|
Priority: | wish | Component: | avformat |
Version: | git-master | Keywords: | http mov |
Cc: | Blocked By: | ||
Blocking: | Reproduced by developer: | no | |
Analyzed by developer: | no |
Description
OS: RHEL 3.14.33-26.47.amzn1.x86_64
Build: ffmpeg-2.8.1-64bit-static
Obtained from: http://johnvansickle.com/ffmpeg/
Summary of the bug:
When FFmpeg/FFprobe reads an MP4 file from a web server that allows partial download, the application may use an excessive number of GETs. This occurs with files in which audio/video frames are interleaved where audio-video frames exceed a difference in file offset, presumably exceeding some I/O buffer size. I have encountered many files that bring out this behavior, so it's not a corner case.
Doubling probesize to 10000000 does not decrease the number of GETs. All these GETs can cause a dramatic slowdown of encoding speed(4fps vs. 90fps), depending upon the response time of the service/server.
To develop my presumption of frame "position", I used FFprobe's to print that information.
./ffprobe -hide_banner -v trace -show_entries packet=pos,pts_time,codec_type "http://transcodes.s3.amazonaws.com/content/iPhone4_upload_via_3G.MP4" -of compact 2>&1 | grep 'Content-Range\|codec_type'
And an excerpt from the attached output of that command:
packet|codec_type=audio|pts_time=3.436553|pos=42024 packet|codec_type=video|pts_time=2.466667|pos=61374 [http @ 0x3f7a4a0] header='Content-Range: bytes 31559-684616/684617' packet|codec_type=audio|pts_time=3.482993|pos=42182 packet|codec_type=audio|pts_time=3.529433|pos=42317 packet|codec_type=video|pts_time=2.533333|pos=66016 packet|codec_type=audio|pts_time=3.575873|pos=42473 packet|codec_type=video|pts_time=2.600000|pos=66407 packet|codec_type=audio|pts_time=3.622313|pos=42618 packet|codec_type=video|pts_time=2.666667|pos=67537 packet|codec_type=audio|pts_time=3.668753|pos=48642 packet|codec_type=audio|pts_time=3.715193|pos=48785 packet|codec_type=video|pts_time=2.733333|pos=68161 [http @ 0x3f7a4a0] header='Content-Range: bytes 48926-684616/684617' packet|codec_type=audio|pts_time=3.761633|pos=48926 packet|codec_type=video|pts_time=2.800000|pos=69005 packet|codec_type=audio|pts_time=3.808073|pos=49058
How to reproduce:
Please note that this is extremely dependent on the source file and must be over HTTP, which is why I made it available.
./ffmpeg -hide_banner -v trace -report -i "https://transcodes.s3.amazonaws.com/content/iPhone4_upload_via_3G.MP4" -y output.mp4 2>&1 | grep 'Content-Range\|frame='
Log Output:
I had to use a fairly verbose logging level to show the http requests and frames processed. Please see the attached report to this bug.
Here's an excerpt, though:
[https @ 0x48f2640] request: GET /content/iPhone4_upload_via_3G.MP4 HTTP/1.1 User-Agent: Lavf/56.40.101 Accept: */* Range: bytes=65677- Connection: close Host: transcodes.s3.amazonaws.com Icy-MetaData: 1 [libx264 @ 0x48fca20] frame= 0 QP=23.32 NAL=3 Slice:I Poc:0 I:112 P:0 SKIP:0 size=6416 bytes [libx264 @ 0x48fca20] frame= 1 QP=23.35 NAL=2 Slice:P Poc:4 I:0 P:105 SKIP:7 size=1053 bytes [libx264 @ 0x48fca20] frame= 2 QP=31.00 NAL=0 Slice:B Poc:2 I:0 P:65 SKIP:47 size=87 bytes [libx264 @ 0x48fca20] frame= 3 QP=23.65 NAL=2 Slice:P Poc:8 I:0 P:99 SKIP:13 size=957 bytes [libx264 @ 0x48fca20] frame= 4 QP=31.00 NAL=0 Slice:B Poc:6 I:0 P:56 SKIP:55 size=79 bytes [libx264 @ 0x48fca20] frame= 5 QP=23.66 NAL=2 Slice:P Poc:12 I:0 P:104 SKIP:8 size=1133 bytes [libx264 @ 0x48fca20] frame= 6 QP=29.67 NAL=0 Slice:B Poc:10 I:0 P:60 SKIP:47 size=134 bytes [libx264 @ 0x48fca20] frame= 7 QP=23.84 NAL=2 Slice:P Poc:16 I:0 P:98 SKIP:14 size=884 bytes [libx264 @ 0x48fca20] frame= 8 QP=31.00 NAL=0 Slice:B Poc:14 I:0 P:62 SKIP:49 size=84 bytes [libx264 @ 0x48fca20] frame= 9 QP=23.73 NAL=2 Slice:P Poc:20 I:0 P:108 SKIP:4 size=1056 bytes [https @ 0x48f2640] request: GET /content/iPhone4_upload_via_3G.MP4 HTTP/1.1 User-Agent: Lavf/56.40.101 Accept: */* Range: bytes=82048- Connection: close Host: transcodes.s3.amazonaws.com Icy-MetaData: 1 [libx264 @ 0x48fca20] frame= 10 QP=31.00 NAL=0 Slice:B Poc:18 I:0 P:48 SKIP:62 size=68 bytes [libx264 @ 0x48fca20] frame= 11 QP=23.79 NAL=2 Slice:P Poc:24 I:0 P:107 SKIP:5 size=1079 bytes [libx264 @ 0x48fca20] frame= 12 QP=31.50 NAL=0 Slice:B Poc:22 I:0 P:48 SKIP:64 size=56 bytes [libx264 @ 0x48fca20] frame= 13 QP=23.96 NAL=2 Slice:P Poc:28 I:0 P:105 SKIP:7 size=971 bytes [libx264 @ 0x48fca20] frame= 14 QP=30.33 NAL=0 Slice:B Poc:26 I:0 P:69 SKIP:39 size=137 bytes [libx264 @ 0x48fca20] frame= 15 QP=23.62 NAL=2 Slice:P Poc:32 I:0 P:112 SKIP:0 size=1858 bytes [libx264 @ 0x48fca20] frame= 16 QP=29.60 NAL=0 Slice:B Poc:30 I:0 P:64 SKIP:46 size=196 bytes [libx264 @ 0x48fca20] frame= 17 QP=23.32 NAL=2 Slice:P Poc:36 I:0 P:105 SKIP:7 size=1228 bytes [libx264 @ 0x48fca20] frame= 18 QP=29.05 NAL=0 Slice:B Poc:34 I:0 P:83 SKIP:27 size=170 bytes [libx264 @ 0x48fca20] frame= 19 QP=23.24 NAL=2 Slice:P Poc:40 I:0 P:107 SKIP:5 size=1039 bytes [libx264 @ 0x48fca20] frame= 20 QP=29.67 NAL=0 Slice:B Poc:38 I:0 P:74 SKIP:33 size=181 bytes [libx264 @ 0x48fca20] frame= 21 QP=23.45 NAL=2 Slice:P Poc:44 I:0 P:110 SKIP:2 size=1133 bytes [https @ 0x48f2640] request: GET /content/iPhone4_upload_via_3G.MP4 HTTP/1.1 User-Agent: Lavf/56.40.101 Accept: */* Range: bytes=90655- Connection: close Host: transcodes.s3.amazonaws.com Icy-MetaData: 1
Attachments (4)
Change History (20)
by , 9 years ago
Attachment: | Pos_differences_and_GETs.png added |
---|
comment:1 by , 9 years ago
Keywords: | mov added |
---|
Is this issue reproducible with current FFmpeg git head?
comment:2 by , 9 years ago
This issue is reproducible with current FFmpeg git head, and has been present for a couple years back.
comment:3 by , 9 years ago
Please provide command line and console output for a current version of FFmpeg (as you have done for many reports before).
comment:4 by , 9 years ago
Version: | 2.8.3 → git-master |
---|
comment:5 by , 9 years ago
I replaced the attached reports with ones from todays build. The command has not changed.
by , 9 years ago
Attachment: | inefficient_http_amz.txt added |
---|
Output from FFprobe showing GETs and A/V frame positions
follow-up: 7 comment:6 by , 9 years ago
You can see from the list of A/V positions that the file has a drift in multiplexing. By that I mean that while at the beginning of the list the difference in position between audio and video chunks of the same timestamp are about 6k (audio with pts 0 at pos 44, video with pts 0 at pos 6210) this increases towards the end of the list to about 30k (audio with pts 34.3 at pos 647105, video with pts 34.3 at pos 675027).
I have also seen such files in the wild on numerous occasions but they are hard to handle nicely. For smaller files as in your case just increasing the size of some I/O buffers might help. But for movies of hours length which drift apart by many megabytes the currently employed solution is not so bad, handling it without additional GETs would require complex code and would increase buffer size requirements by a lot.
On the content side, such files can be improved by remultiplexing or using http://linux.die.net/man/3/mp4optimize
comment:7 by , 9 years ago
Replying to stefang:
I have also seen such files in the wild on numerous occasions but they are hard to handle nicely. For smaller files as in your case just increasing the size of some I/O buffers might help. But for movies of hours length which drift apart by many megabytes the currently employed solution is not so bad, handling it without additional GETs would require complex code and would increase buffer size requirements by a lot.
So is this an invalid ticket?
comment:8 by , 9 years ago
Please review the attached PNG("Pos_differences_and_GETs.png") showing the sample position differences throughout this file. It does not show a steady increase, but rather a cyclic one. For this reason, I do not think it is invalid.
comment:9 by , 9 years ago
nealzebub, you are right that the differences do get smaller again, I only looked at the trimmed list and not the PNG before I wrote my post. So in this case there would not be a big buffering requirement to handle the file smoothly.
However, handling such files correctly still is difficult to implement. If you look at the abstraction of a demuxer e.g. 'mov' - which gets the order 'give me the next packet in time' - and the protocol e.g. 'http' - which understands 'give me the next few bytes' and 'seek to position x' - you have a problem with badly interleaved files.
The demuxer would need some logic to detect the gap between the current audio position and the current video position in the file and buffer it by itself. Only then would a seek back to the lower of the two positions be avoided. And an additional buffer I would say is not a desirable thing in a demuxer and would raise the question how big such a buffer is allowed to grow.
Anyway, I think your report is absolutely valid. I would just argue that the current behaviour is not a bug and would relabel it as an enhancement request to handle badly interleaved files in a smoother way.
comment:10 by , 9 years ago
Component: | undetermined → avformat |
---|---|
Priority: | normal → wish |
Status: | new → open |
Type: | defect → enhancement |
comment:11 by , 9 years ago
It's easier to handle when audio is earlier, since it's typically lower bitrate than video.
I have occasionally (rarely) found ffmpeg producing files, when using http input, where the audio is significantly delayed. I didn't look at byte positions, just relative timestamp positions as it seems media players use that instead of byte position.
This resulted in files where common media players didn't have audio. Seeking to the middle of the file would turn audio back on. (totem video player also has problems when audio length is more than three times longer than video, things like that.)
In rare cases, ffmpeg will produce a file that it can't recognize itself, even when using probesize options as it suggests, but it isn't common enough to have reported it yet.
I was going to suggest using another program like wget to download the file, but that actually leads to a bunch of error messages in ffmpeg.
ffmpeg -analyzeduration 50M -probesize 50M -i <(wget -qO - http://transcodes.s3.amazonaws.com/content/iPhone4_upload_via_3G.MP4)
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x1270000] stream 0, offset 0x2c: partial file
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x1270000] Could not find codec parameters for stream 1 (Video: h264 (avc1 / 0x31637661), none(tv, bt709), 224x128, 133 kb/s): unspecified pixel format
Consider increasing the value for the 'analyzeduration' and 'probesize' options
(Setting an output format, like pulse or opengl, leads to other, different, errors.)
Maybe the log output explains it, but ffmpeg/ffprobe might be using a buffer based on timestamps, just like media players often use, and not based on file size. The video resolution is pretty small; the encoder or muxer might just be adding packets when the file size exceeds a certain amount, not when timestamps exceed a certain amount. (Actually, it seems the maximum difference is ~1 second; the variation in position might be due to video frame size.) My comment is decreasing in usefulness, but why is the audio listed as being later in time, but earlier in file position? I had thought that -show_packets, unlike -show_frames, listed frames in position order, but it doesn't do so with this file.
comment:12 by , 6 years ago
Couldn't this also be solved in http.c with a configurable buffer which can help avoid excessive seeking?
If someone is willing to work with this on a consultancy basis please contact me at robert@boffins.se.
comment:13 by , 6 years ago
One could also write a new wrapping protocol with a configurable buffer(s), which works on many underlying protocols including http. See Cache protocol and tee protocol for examples.
https://ffmpeg.org/ffmpeg-protocols.html#toc-cache
https://ffmpeg.org/ffmpeg-protocols.html#tee
New protocol approach could be much simpler than modifying the existing http.c
comment:15 by , 6 years ago
I doubt that cache protocol would resolve this(Because of the way requests are made from mp4 demuxer). But you could try it out to be sure.
comment:16 by , 5 years ago
The original sample file doesn't seem to be available any more. Here is a video with a similar problem:
https://k2c42.dy.fi/tmp/ffmpeg_5080/
Plot of position differences read and HTTP GETs