Opened 9 years ago
Last modified 9 years ago
#4984 new defect
ffmpeg amerge and amix filter delay when working with RTSP
| Reported by: | leogsa | Owned by: | |
|---|---|---|---|
| Priority: | normal | Component: | undetermined |
| Version: | unspecified | Keywords: | RTSP |
| Cc: | | Blocked By: | |
| Blocking: | | Reproduced by developer: | no |
| Analyzed by developer: | no | | |
Description
ffmpeg amerge and amix filter delay
I need to take audio streams from several IP cameras and merge them into
one stream, so that they sound simultaneously.
I tried the "amix" filter (for testing purposes I take the audio stream twice
from the same camera; I also tried 2 cameras and the result is the same):
ffmpeg -i rtsp://user:pass@172.22.5.202 -i rtsp://user:pass@172.22.5.202
-map 0:a -map 1:a -filter_complex
amix=inputs=2:duration=first:dropout_transition=3 -ar 22050 -vn -f flv
rtmp://172.22.45.38:1935/live/stream1
Result: I say "hello" and hear the first "hello" in the speakers, then 1
second later I hear the second "hello", instead of hearing both "hello"s
simultaneously.
I also tried the "amerge" filter:
ffmpeg -i rtsp://user:pass@172.22.5.202 -i rtsp://user:pass@172.22.5.202
-map 0:a -map 1:a -filter_complex amerge -ar 22050 -vn -f flv
rtmp://172.22.45.38:1935/live/stream1
Result: the same as in the first example, but now I hear the first "hello"
in the left speaker and 1 second later the second "hello" in the right speaker,
instead of hearing both "hello"s in both speakers simultaneously.
Here is the full command-line output for both variants. amix:
ffmpeg -i rtsp://admin:12345@172.22.5.202 -i rtsp://admin:12345@172.22.5.202
-map 0:a -map 1:a -filter_complex
amix=inputs=2:duration=longest:dropout_transition=0 -vn -ar 22050 -f flv
rtmp://172.22.45.38:1935/live/stream1
ffmpeg version N-76031-g9099079 Copyright (c) 2000-2015 the FFmpeg developers
built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-16)
configuration: --enable-gpl --enable-libx264 --enable-libmp3lame
--enable-nonfree --enable-version3
libavutil 55. 4.100 / 55. 4.100
libavcodec 57. 6.100 / 57. 6.100
libavformat 57. 4.100 / 57. 4.100
libavdevice 57. 0.100 / 57. 0.100
libavfilter 6. 11.100 / 6. 11.100
libswscale 4. 0.100 / 4. 0.100
libswresample 2. 0.100 / 2. 0.100
libpostproc 54. 0.100 / 54. 0.100
Input #0, rtsp, from 'rtsp://admin:12345@172.22.5.202':
Metadata:
title : Media Presentation
Duration: N/A, start: 0.032000, bitrate: N/A
Stream #0:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25
tbr, 90k tbn, 40 tbc
Stream #0:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
Stream #0:2: Data: none
Input #1, rtsp, from 'rtsp://admin:12345@172.22.5.202':
Metadata:
title : Media Presentation
Duration: N/A, start: 0.032000, bitrate: N/A
Stream #1:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25
tbr, 90k tbn, 40 tbc
Stream #1:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
Stream #1:2: Data: none
Output #0, flv, to 'rtmp://172.22.45.38:1935/live/stream1':
Metadata:
title : Media Presentation
encoder : Lavf57.4.100
Stream #0:0: Audio: mp3 (libmp3lame) ([2][0][0][0] / 0x0002), 22050
Hz, mono, fltp (default)
Metadata:
encoder : Lavc57.6.100 libmp3lame
Stream mapping:
Stream #0:1 (g726) -> amix:input0
Stream #1:1 (g726) -> amix:input1
amix -> Stream #0:0 (libmp3lame)
Press [q] to stop, ? for help
[rtsp @ 0x2689600] Thread message queue blocking; consider raising the
thread_queue_size option (current value: 8)
[rtsp @ 0x2727c60] Thread message queue blocking; consider raising the
thread_queue_size option (current value: 8)
[rtsp @ 0x2689600] max delay reached. need to consume packet
[NULL @ 0x268c500] RTP: missed 38 packets
[rtsp @ 0x2689600] max delay reached. need to consume packet
[NULL @ 0x268d460] RTP: missed 4 packets
[flv @ 0x2958360] Failed to update header with correct duration.
[flv @ 0x2958360] Failed to update header with correct filesize.
size= 28kB time=00:00:06.18 bitrate= 36.7kbits/s
video:0kB audio:24kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 16.331224%
and amerge:
# ffmpeg -i rtsp://admin:12345@172.22.5.202 -i rtsp://admin:12345@172.22.5.202
-map 0:a -map 1:a -filter_complex amerge -vn -ar 22050 -f flv
rtmp://172.22.45.38:1935/live/stream1
ffmpeg version N-76031-g9099079 Copyright (c) 2000-2015 the FFmpeg
developers
built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-16)
configuration: --enable-gpl --enable-libx264 --enable-libmp3lame
--enable-nonfree --enable-version3
libavutil 55. 4.100 / 55. 4.100
libavcodec 57. 6.100 / 57. 6.100
libavformat 57. 4.100 / 57. 4.100
libavdevice 57. 0.100 / 57. 0.100
libavfilter 6. 11.100 / 6. 11.100
libswscale 4. 0.100 / 4. 0.100
libswresample 2. 0.100 / 2. 0.100
libpostproc 54. 0.100 / 54. 0.100
Input #0, rtsp, from 'rtsp://admin:12345@172.22.5.202':
Metadata:
title : Media Presentation
Duration: N/A, start: 0.064000, bitrate: N/A
Stream #0:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25
tbr, 90k tbn, 40 tbc
Stream #0:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
Stream #0:2: Data: none
Input #1, rtsp, from 'rtsp://admin:12345@172.22.5.202':
Metadata:
title : Media Presentation
Duration: N/A, start: 0.032000, bitrate: N/A
Stream #1:0: Video: h264 (Baseline), yuv420p, 1280x720, 20 fps, 25
tbr, 90k tbn, 40 tbc
Stream #1:1: Audio: adpcm_g726, 8000 Hz, mono, s16, 16 kb/s
Stream #1:2: Data: none
[Parsed_amerge_0 @ 0x3069cc0] No channel layout for input 1
[Parsed_amerge_0 @ 0x3069cc0] Input channel layouts overlap: output
layout will be determined by the number of distinct input channels
Output #0, flv, to 'rtmp://172.22.45.38:1935/live/stream1':
Metadata:
title : Media Presentation
encoder : Lavf57.4.100
Stream #0:0: Audio: mp3 (libmp3lame) ([2][0][0][0] / 0x0002), 22050
Hz, stereo, s16p (default)
Metadata:
encoder : Lavc57.6.100 libmp3lame
Stream mapping:
Stream #0:1 (g726) -> amerge:in0
Stream #1:1 (g726) -> amerge:in1
amerge -> Stream #0:0 (libmp3lame)
Press [q] to stop, ? for help
[rtsp @ 0x2f71640] Thread message queue blocking; consider raising the
thread_queue_size option (current value: 8)
[rtsp @ 0x300fb40] Thread message queue blocking; consider raising the
thread_queue_size option (current value: 8)
[rtsp @ 0x2f71640] max delay reached. need to consume packet
[NULL @ 0x2f744a0] RTP: missed 18 packets
[flv @ 0x3058b00] Failed to update header with correct duration.
[flv @ 0x3058b00] Failed to update header with correct filesize.
size= 39kB time=00:00:04.54 bitrate= 70.2kbits/s
video:0kB audio:36kB subtitle:0kB other streams:0kB global headers:0kB
muxing overhead: 8.330614%
UPDATE 30 Oct 2015: I found an interesting detail when connecting 2 cameras
(they have different microphones, so I can hear the difference between them):
the order of the "hello"s from the different cams depends on the ORDER OF INPUTS.
With the command
ffmpeg -i rtsp://cam2 -i rtsp://cam1 -map 0:a -map 1:a -filter_complex
amix=inputs=2:duration=longest:dropout_transition=0 -vn -ar 22050 -f flv
rtmp://172.22.45.38:1935/live/stream1
I hear "hello" from the 1st cam and then, 1 second later, "hello" from the 2nd cam.
With the command
ffmpeg -i rtsp://cam1 -i rtsp://cam2 -map 0:a -map 1:a -filter_complex
amix=inputs=2:duration=longest:dropout_transition=0 -vn -ar 22050 -f flv
rtmp://172.22.45.38:1935/live/stream1
I hear "hello" from the 2nd cam and then, 1 second later, "hello" from the 1st cam.
So, as I understand it, ffmpeg does not take the inputs simultaneously, but in the
order in which the inputs are given.
P.S. FILES are mixed and merged perfectly with the same commands.
Change History (3)
comment:1 by , 9 years ago
comment:2 by , 9 years ago
Cigaes, thank you for the answer.
Why then does this command work perfectly with files? And why is the difference so big, a full second? Does it mean that ffmpeg starts taking data from the 1st camera and only 1 second later from the 2nd one?
comment:3 by , 9 years ago
I cannot inspect your files; I can only assume that they start at the same instant.
As for the recording from the cameras, this depends on the cameras themselves, on top of the probing performed by ffmpeg. You can try to measure the delay by running ffmpeg to read from a single camera and timing the interval between the instant you submit the command and the instant it starts printing progress.
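The single-camera measurement described above could be automated roughly as follows. This is only a sketch: the helper name time_to_marker is made up for illustration, and it assumes that ffmpeg's progress lines appear on stderr and contain the text "time=", as they do in the logs pasted in the description.

```python
import subprocess
import time


def time_to_marker(cmd, marker=b"time="):
    """Run cmd and return the seconds elapsed until `marker` first shows up on stderr.

    Returns None if the process ends without ever printing the marker.
    Note: this blocks until the marker is seen or the process closes stderr.
    """
    start = time.monotonic()
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
    )
    seen = b""
    try:
        while True:
            chunk = proc.stderr.read1(4096)  # returns as soon as some data is available
            if not chunk:  # EOF: the process finished without printing progress
                return None
            seen += chunk  # accumulate so a marker split across reads is still found
            if marker in seen:
                return time.monotonic() - start
    finally:
        if proc.poll() is None:
            proc.kill()  # the measurement is done; stop the capture
        proc.wait()
        proc.stderr.close()


# Example call (not executed here), one camera at a time:
# time_to_marker(["ffmpeg", "-i", "rtsp://cam1", "-vn", "-f", "null", "-"])
```

Running it once per camera and comparing the two numbers should give a rough idea of how much later the second input starts delivering audio than the first.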
I believe this is expected. Neither amix nor amerge (preferred) takes the input timestamps into account. Furthermore, the command line you use subtracts the initial timestamp of each stream, and since the captures do not start at exactly the same time, there is a shift.
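To make the shift concrete, here is a small worked example of the arithmetic. The numbers are invented for illustration (they are not measurements from the cameras), and the model is deliberately simplified: it only assumes that the first sample of every input ends up at position zero in the mix.

```python
def position_in_mix(event_time, first_sample_time):
    """Offset of an event from the start of one input, once that input's
    first captured sample has been placed at zero."""
    return event_time - first_sample_time


# Assumed wall-clock times in seconds; input1 starts capturing 1 s after input0.
first_sample = {"input0": 10.0, "input1": 11.0}
hello = 15.0  # the instant the word is actually spoken

positions = {name: position_in_mix(hello, t0) for name, t0 in first_sample.items()}
shift = positions["input0"] - positions["input1"]

print(positions)  # {'input0': 5.0, 'input1': 4.0}
print(shift)      # 1.0
```

Under these assumptions the same "hello" sits 5 s into input0 but only 4 s into input1, so the input that started later is heard first and the other one follows 1 s afterwards. With files that both begin at the same instant, the two first-sample times are equal and the shift is zero.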
I suspect that to get this working, you would need to use the -copyts option, then find a way to subtract the same initial timestamp from both streams, and finally use aresample to sync the audio to its timestamps.
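An untested sketch of what such a command line might look like, written as a Python helper that only builds the argument list and does not run ffmpeg. The function name build_sync_mix_cmd and the common_offset parameter are hypothetical. The use of asetpts to subtract one shared offset from every input and of aresample=async=1 to resync the samples to their timestamps is one guess at how to apply the three steps above, not a confirmed fix. Whether it helps also depends on the two RTSP sessions reporting timestamps on a comparable clock, which is not established in this ticket.

```python
import shlex


def build_sync_mix_cmd(inputs, output, common_offset=0.0):
    """Build an ffmpeg argument list that keeps input timestamps (-copyts),
    subtracts the same offset (in seconds) from every audio stream, and
    lets aresample stretch/pad the audio to match its timestamps before amix."""
    args = ["ffmpeg", "-copyts"]
    for url in inputs:
        args += ["-i", url]

    chains = []
    labels = []
    for i in range(len(inputs)):
        label = f"[a{i}]"
        # Same offset for every input, so their relative timing is preserved.
        chains.append(
            f"[{i}:a]asetpts=PTS-{common_offset}/TB,aresample=async=1{label}"
        )
        labels.append(label)

    mix = f"amix=inputs={len(inputs)}:duration=longest:dropout_transition=0"
    graph = ";".join(chains) + ";" + "".join(labels) + mix

    args += ["-filter_complex", graph, "-vn", "-ar", "22050", "-f", "flv", output]
    return args


cmd = build_sync_mix_cmd(
    ["rtsp://cam1", "rtsp://cam2"],
    "rtmp://172.22.45.38:1935/live/stream1",
    common_offset=2.0,  # assumed value; it would have to be determined per setup
)
print(shlex.join(cmd))  # shell-quoted form, for copy and paste
```

The input URLs and the output address are the ones used in the description; the offset of 2.0 is a placeholder.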