Opened 11 months ago

Closed 11 months ago

Last modified 11 months ago

#10399 closed defect (invalid)

Output presents "crackling" static sounds

Reported by: drive4code
Owned by:
Priority: normal
Component: undetermined
Version: unspecified
Keywords:
Cc: drive4code
Blocked By:
Blocking:
Reproduced by developer: no
Analyzed by developer: no

Description

Summary of the bug:
The output file contains crackling artifacts

What I Was Trying To Accomplish:
I was running a GitHub project called cleanvid, which aims to censor profanity out of videos. It generates the command shown under "How to reproduce", which mutes the audio in the parts where profanity is found. Here are the steps it takes, as detailed in the project's README:

"
cleanvid is a little script to mute profanity in video files in a few simple steps:

1. The user provides as input a video file and matching .srt subtitle file. If subtitles are not provided explicitly, they will be extracted from the video file if possible; if not, subliminal is used to attempt to download the best matching .srt file.

2. pysrt is used to parse the .srt file, and each entry is checked against a list of profanity or other words or phrases you'd like muted. Mappings can be provided (e.g., map "sh*t" to "poop"), otherwise the word will be replaced with *.

3. A new "clean" .srt file is created, with only those phrases containing the censored/replaced objectional language.

4. ffmpeg is used to create a cleaned video file. This file contains the original video stream, but the audio stream is muted during the segments containing objectional language. The audio stream is re-encoded as AAC and remultiplexed back together with the video. Optionally, the clean .srt file can be embedded in the cleaned video file as a subtitle track.
"

Problem Encountered:
There are occasional artifacts in the output. Even with "-aac_tns 0", which seems to help mitigate the problem, long videos still contain a large stretch of artifacting (I've seen a 20-minute stretch in a 1 hour 30-minute video); it then improves through the rest of the video but never fully disappears. Without the flag, the problem is constant throughout the video.

Additional Notes:
The problem seems to be somewhat mitigated by the "-aac_tns 0" flag and worsened by the "-b:a 640k" flag.

How to reproduce:
Run the following on the uploaded input and listen for the artifacts that pop up. Because the clip is short, it may take multiple attempts.

% ffmpeg -y -i input.mp4 -sn -c:v copy -af "volume=enable='between(t,0.260,0.380)':volume=0,volume=enable='between(t,1.320,3.020)':volume=0,volume=enable='between(t,30.860,31.460)':volume=0" -c:a aac -report output.mp4
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
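
For reference, the -af argument in the command above follows a regular pattern: one volume filter per muted interval, joined with commas. Below is a minimal Python sketch of how such a string could be assembled. The function name build_mute_filter and the way intervals are passed in are illustrative assumptions, not cleanvid's actual code.

```python
def build_mute_filter(intervals):
    """Return an ffmpeg -af string that sets the volume to 0 in each interval.

    `intervals` is a list of (start, end) times in seconds. Times are
    formatted to millisecond precision, matching the command in this report.
    """
    parts = []
    for start, end in intervals:
        if end <= start:
            raise ValueError(f"empty or reversed interval: {start}-{end}")
        parts.append(
            f"volume=enable='between(t,{start:.3f},{end:.3f})':volume=0"
        )
    return ",".join(parts)


if __name__ == "__main__":
    # The three intervals from the reproduction command.
    print(build_mute_filter([(0.260, 0.380), (1.320, 3.020), (30.860, 31.460)]))
```

Each generated filter switches the gain straight to 0 at the start of its interval and straight back at the end, with no ramp in between.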

Loglevel Output:
The full output was far too large, so I can only provide the first and last parts:

% ffmpeg -v 9 -loglevel 99 -i input.mp4
ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
  configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Splitting the commandline.
Reading option '-v' ... matched as option 'v' (set logging level) with argument '9'.
Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
Reading option '-i' ... matched as input url with argument 'input.mp4'.
Finished splitting the commandline.
Parsing a group of options: global .
Applying option v (set logging level) with argument 9.
Successfully parsed a group of options.
Parsing a group of options: input url input.mp4.
Successfully parsed a group of options.
Opening an input file: input.mp4.
[NULL @ 0x55967fe54240] Opening 'input.mp4' for reading
[file @ 0x55967fe54ec0] Setting default whitelist 'file,crypto,data'
Probing mov,mp4,m4a,3gp,3g2,mj2 score:100 size:2048
Probing mp3 score:1 size:2048
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] Format mov,mp4,m4a,3gp,3g2,mj2 probed with size=2048 and score=100
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'ftyp' parent:'root' sz: 24 8 23353518
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] ISO: File Type Major Brand: mp42
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'uuid' parent:'root' sz: 40 32 23353518
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'mdat' parent:'root' sz: 23319625 72 23353518
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'moov' parent:'root' sz: 33829 23319697 23353518
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'mvhd' parent:'moov' sz: 108 8 33821
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] time scale = 48000
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'trak' parent:'moov' sz: 19339 116 33821
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'tkhd' parent:'trak' sz: 92 8 19331
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'mdia' parent:'trak' sz: 19239 100 19331
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'mdhd' parent:'mdia' sz: 32 8 19231
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'hdlr' parent:'mdia' sz: 45 40 19231
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] ctype=[0][0][0][0]
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] stype=vide
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'minf' parent:'mdia' sz: 19154 85 19231
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'vmhd' parent:'minf' sz: 20 8 19146
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'dinf' parent:'minf' sz: 36 28 19146
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'dref' parent:'dinf' sz: 28 8 28
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] Unknown dref type 0x206c7275 size 12
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'stbl' parent:'minf' sz: 19090 64 19146
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'stsd' parent:'stbl' sz: 150 8 19082
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] size=134 4CC=avc1 codec_type=0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'avcC' parent:'stsd' sz: 48 8 48
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'stts' parent:'stbl' sz: 32 158 19082
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] track[0].stts.entries = 2
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] sample_count=4150, sample_duration=1000
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] sample_count=1, sample_duration=740
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'ctts' parent:'stbl' sz: 24 190 19082
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] track[0].ctts.entries = 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] count=4151, duration=3300
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] dts shift 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'stsc' parent:'stbl' sz: 1072 214 19082
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] track[0].stsc.entries = 88
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'stsz' parent:'stbl' sz: 16624 1286 19082
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] sample_size = 0 sample_count = 4151
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'stco' parent:'stbl' sz: 884 17910 19082
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] type:'stss' parent:'stbl' sz: 296 18794 19082
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] keyframe_count = 70
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 0, offset 50, dts 0, size 63932, distance 0, keyframe 1
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 1, offset fa0c, dts 1000, size 218, distance 1, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 2, offset fae6, dts 2000, size 344, distance 2, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 3, offset fc3e, dts 3000, size 117, distance 3, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 4, offset fcb3, dts 4000, size 367, distance 4, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 5, offset fe22, dts 5000, size 226, distance 5, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 6, offset ff04, dts 6000, size 1268, distance 6, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 7, offset 103f8, dts 7000, size 1152, distance 7, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 8, offset 10878, dts 8000, size 8271, distance 8, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 9, offset 128c7, dts 9000, size 5689, distance 9, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 10, offset 13f00, dts 10000, size 8869, distance 10, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 11, offset 161a5, dts 11000, size 3581, distance 11, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 12, offset 16fa2, dts 12000, size 7453, distance 12, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 13, offset 18cbf, dts 13000, size 7036, distance 13, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 14, offset 1a83b, dts 14000, size 9793, distance 14, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 15, offset 1ce7c, dts 15000, size 4045, distance 15, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 16, offset 1f6a5, dts 16000, size 8956, distance 16, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 17, offset 219a1, dts 17000, size 9253, distance 17, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 18, offset 23dc6, dts 18000, size 10984, distance 18, keyframe 0
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] AVIndex stream 0, sample 19, offset 268ae, dts 19000, size 5354, distance 19, keyframe 0



...





[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] All info found
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] stream 0: start_time: 0.055 duration: 69.179
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] stream 1: start_time: 0 duration: 69.248
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] format: start_time: 0 duration: 69.259 (estimate from stream) bitrate=2697 kb/s
[mov,mp4,m4a,3gp,3g2,mj2 @ 0x55967fe54240] After avformat_find_stream_info() pos: 122805 bytes read:196065 seeks:2 frames:17
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : mp42
    minor_version   : 0
    compatible_brands: mp41isom
    creation_time   : 2023-06-03T08:37:14.000000Z
  Duration: 00:01:09.26, start: 0.000000, bitrate: 2697 kb/s
  Stream #0:0(und), 16, 1/60000: Video: h264 (Main), 1 reference frame (avc1 / 0x31637661), yuv420p(left), 1920x1080 (1920x1088) [SAR 1:1 DAR 16:9], 0/1, 2533 kb/s, 60 fps, 60 tbr, 60k tbn, 120 tbc (default)
    Metadata:
      creation_time   : 2023-06-03T08:37:14.000000Z
      handler_name    : VideoHandler
      vendor_id       : [0][0][0][0]
      encoder         : AVC Coding
  Stream #0:1(und), 1, 1/48000: Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 163 kb/s (default)
    Metadata:
      creation_time   : 2023-06-03T08:37:14.000000Z
      handler_name    : SoundHandler
      vendor_id       : [0][0][0][0]
Successfully opened the file.
At least one output file must be specified
[AVIOContext @ 0x55967fe5d280] Statistics: 196065 bytes read, 2 seeks

Attachments (1)

ffmpeg-20230603-104845.log (14.3 KB) - added by drive4code 11 months ago.
Log file from ffmpeg when running the command


Change History (3)

by drive4code, 11 months ago

Attachment: ffmpeg-20230603-104845.log added

Log file from ffmpeg when running the command

comment:1 by Elon Musk, 11 months ago

Resolution: invalid
Status: new → closed

This is an invalid bug report; this Trac is not a support channel.
Also, the volume filter operates per audio frame, and audio frames usually hold more than a single sample per channel.
If you switch the volume abruptly between 1 and 0, it will cause artifacts.
Your solution to your problem is invalid. You either need the afade filter to fade audio in/out, or you have to wait for the aoverlay filter, which will simplify this.
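
To illustrate the point about abrupt volume changes, here is a minimal, self-contained Python sketch that applies a hard mute and a linearly faded mute to a test tone and compares the largest sample-to-sample jump in each result. It does not call ffmpeg. The 440 Hz tone, the 10 ms fade length, and all function names are assumptions chosen for the demonstration.

```python
import math

SAMPLE_RATE = 48000   # Hz, same as the audio stream in this report
TONE_HZ = 440.0       # arbitrary test tone (assumption for the demo)


def tone(num_samples):
    """Generate a mono sine wave as a list of floats in [-1, 1]."""
    step = 2.0 * math.pi * TONE_HZ / SAMPLE_RATE
    return [math.sin(step * i) for i in range(num_samples)]


def hard_gain(i, start, end):
    """Gain that jumps straight from 1 to 0 and back."""
    return 0.0 if start <= i < end else 1.0


def faded_gain(i, start, end, fade):
    """Gain that ramps linearly over `fade` samples on each side of the mute."""
    if start <= i < end:
        return 0.0
    if start - fade <= i < start:
        return (start - i) / fade          # fade out: 1 -> 0
    if end <= i < end + fade:
        return (i - end + 1) / fade        # fade in: 0 -> 1
    return 1.0


def max_step(samples):
    """Largest jump between consecutive samples (a rough click indicator)."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


def compare(start=4828, end=9627, fade=480, length=14400):
    """Return (hard, faded) max steps for one muted segment.

    The default start/end indices fall near peaks of the tone, where a
    hard cut is most audible; fade=480 samples is 10 ms at 48 kHz.
    """
    x = tone(length)
    hard = [s * hard_gain(i, start, end) for i, s in enumerate(x)]
    soft = [s * faded_gain(i, start, end, fade) for i, s in enumerate(x)]
    return max_step(hard), max_step(soft)


if __name__ == "__main__":
    hard, soft = compare()
    print(f"hard mute max step:  {hard:.4f}")
    print(f"faded mute max step: {soft:.4f}")
```

With these defaults, the hard mute produces a jump close to full scale at the cut points, while the faded version stays near the tone's natural step size. The sketch only models gain applied to raw samples; it says nothing about how the AAC encoder reacts to either signal.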

comment:2 by Elon Musk, 11 months ago

Keywords: AAC volume removed
Priority: important → normal