Guidelines for high quality lossy audio encoding
This guide helps people who are new to lossy audio encoding choose suitable encoders and settings.
When should I transcode audio?
Avoid transcoding from a lossy format to the same or another lossy format when possible. Transcode to lossy from the lossless source (if you have it), or just copy the lossy source audio track instead of transcoding.
Another option, if you have a lossless source, is to transcode it to another lossless codec such as FLAC or TrueHD (the latter needs -strict -2).
Generation loss
Transcoding from a lossy format like MP3, AAC, Vorbis, Opus, WMA, etc. to the same or a different lossy format can degrade the audio quality even if the bitrate stays the same or increases. This degradation might not be audible to you, but it might be audible to others.
This post on hydrogenaudio.org demonstrates what will happen if you re-encode a file 100 times: http://www.hydrogenaud.io/forums/index.php?showtopic=100067
Copying audio tracks
If the target container format supports the audio codec of the source file then consider just muxing it into the output file without re-encoding. MKV supports virtually any audio codec. This can be achieved by specifying 'copy' as the audio codec.
Example:
Transcoding a WebM file (with VP8 video/Vorbis audio) to an MKV file (with H.264 video/unaltered Vorbis audio):
ffmpeg -i someFile.webm -c:a copy -c:v libx264 outFile.mkv
In some cases this might not be possible, because the target device/player doesn't support the codec or the target container format doesn't support the codec. Another reason to transcode might be that the source audio track is too big (it has a higher bitrate than what you want to use in the output file).
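As a rough sketch of the decision described above, the following shell function prints the audio options to pass to ffmpeg: `-c:a copy` when the source codec is one the target accepts, otherwise options for a re-encode. The function name `audio_opts`, its arguments, and the libopus fallback are illustrative assumptions, not part of FFmpeg.

```shell
# Hypothetical helper: decide between stream copy and re-encoding.
# $1 = codec name of the source audio track (e.g. "vorbis")
# $2 = space-separated list of codecs the target container/player accepts
# $3 = fallback encoder options (assumed default: libopus at 64 kbit/s)
audio_opts() {
    src_codec=$1
    accepted=$2
    fallback=${3:-"-c:a libopus -b:a 64k"}
    for codec in $accepted; do
        if [ "$codec" = "$src_codec" ]; then
            printf '%s\n' "-c:a copy"
            return 0
        fi
    done
    printf '%s\n' "$fallback"
}

# The source codec name can be read with ffprobe, for example:
#   ffprobe -v error -select_streams a:0 -show_entries stream=codec_name \
#           -of default=noprint_wrappers=1:nokey=1 someFile.webm
```

It could then be used as `ffmpeg -i someFile.webm $(audio_opts vorbis "vorbis opus") -c:v libx264 outFile.mkv`.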
Audio encoders FFmpeg can use
FFmpeg can encode to a wide variety of lossy audio formats.
Here are some popular lossy formats with encoders listed that FFmpeg can use:
- Dolby Digital: ac3
- Dolby Digital Plus: eac3
- TrueHD 0xFBA: truehd
- MP2: libtwolame, mp2
- Windows Media Audio 1: wmav1
- Windows Media Audio 2: wmav2
- AAC LC: libfdk_aac, aac
- HE-AAC: libfdk_aac
- Vorbis: libvorbis, vorbis
- MP3: libmp3lame, libshine
- Opus: libopus
Based on quality produced from high to low:
libopus > libvorbis >= libfdk_aac > libmp3lame >= eac3/ac3 > aac > libtwolame > vorbis > mp2 > wmav2/wmav1
The >= sign means greater or the same quality.
This list is just a general guide and there may be cases where a codec listed to the right will perform better than one listed to the left at certain bitrates.
The highest quality internal/native encoder available in FFmpeg without any external libraries is aac.
Please note it is not recommended to use the experimental vorbis encoder for Vorbis encoding; use libvorbis instead.
Please note that wmav1 and wmav2 don't seem to be able to reach transparency at any given bitrate.
Container formats
Only certain audio codecs will be able to fit in your target output file.
Container | Audio formats supported |
MKV/MKA | Opus, Vorbis, MP2, MP3, LC-AAC, HE-AAC, WMAv1, WMAv2, AC3, E-AC3, TrueHD |
MP4/M4A | Opus, MP2, MP3, LC-AAC, HE-AAC, AC3, E-AC3, TrueHD |
FLV/F4V | MP3, LC-AAC, HE-AAC |
3GP/3G2 | LC-AAC, HE-AAC |
MPG | MP2, MP3 |
PS/TS Stream | MP2, MP3, LC-AAC, HE-AAC, AC3, TrueHD |
M2TS | AC3, E-AC3, TrueHD |
VOB | MP2, AC3 |
RMVB | Vorbis, HE-AAC |
WebM | Vorbis, Opus |
OGG | Vorbis, Opus |
There are more container formats available than those listed above, such as MXF. Also, according to Dolby, E-AC3 is only officially supported in MP4 (for example, E-AC3 needs an edit list to remove the padding of 256 initial silent samples).
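The table above can be turned into a small lookup for scripts. The sketch below is only an illustration: the function name `container_supports` is hypothetical, codec names follow what ffprobe reports, and LC-AAC and HE-AAC are both collapsed into `aac`, which loses the distinction the table makes for RMVB.

```shell
# Hypothetical lookup that mirrors the container table.
# $1 = container (lowercase file extension), $2 = codec name as ffprobe reports it.
# Assumption of this sketch: LC-AAC and HE-AAC are both treated as "aac".
# Returns 0 if supported, 1 if not, 2 if the container is not in the table.
container_supports() {
    case $1 in
        mkv|mka)  list="opus vorbis mp2 mp3 aac wmav1 wmav2 ac3 eac3 truehd" ;;
        mp4|m4a)  list="opus mp2 mp3 aac ac3 eac3 truehd" ;;
        flv|f4v)  list="mp3 aac" ;;
        3gp|3g2)  list="aac" ;;
        mpg)      list="mp2 mp3" ;;
        ps|ts)    list="mp2 mp3 aac ac3 truehd" ;;
        m2ts)     list="ac3 eac3 truehd" ;;
        vob)      list="mp2 ac3" ;;
        rmvb)     list="vorbis aac" ;;
        webm|ogg) list="vorbis opus" ;;
        *)        return 2 ;;
    esac
    case " $list " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `container_supports webm opus && echo "copy is possible"` prints the message, while the same check for `aac` in `webm` fails.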
Recommended minimum bitrates to use
The bitrates listed here assume 2-channel stereo and a sample rate of 44.1kHz or 48kHz. Mono, speech, and quiet audio may require fewer bits.
- libopus – usable range ≥ 32Kbps. Recommended range ≥ 64Kbps
- libfdk_aac default AAC LC profile – recommended range ≥ 128Kbps; see AAC Encoding Guide.
- libfdk_aac -profile:a aac_he_v2 – usable range ≤ 48Kbps CBR. Transparency: Does not reach transparency. Use AAC LC instead to achieve transparency
- libfdk_aac -profile:a aac_he – usable range ≥ 48Kbps and ≤ 80Kbps CBR. Transparency: Does not reach transparency. Use AAC LC instead to achieve transparency
- libvorbis – usable range ≥ 96Kbps. Recommended range -aq 4 (≥ 128Kbps)
- libmp3lame – usable range ≥ 128Kbps. Recommended range -aq 2 (≥ 192Kbps)
- ac3 or eac3 – usable range ≥ 160Kbps. Recommended range ≥ 160Kbps
Example of usage: ffmpeg -i input.wav -c:a ac3 -b:a 160k output.m4a
- aac – usable range ≥ 32Kbps (depending on profile and audio). Recommended range ≥ 128Kbps
Example of usage: ffmpeg -i input.wav -c:a aac -b:a 128k output.m4a
- libtwolame – usable range ≥ 192Kbps. Recommended range ≥ 256Kbps
- mp2 – usable range ≥ 320Kbps. Recommended range ≥ 320Kbps
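For scripting, the recommended settings in the list above could be collected in one place. The helper below is a hypothetical sketch (the name `recommended_audio_opts` is not part of FFmpeg); it simply prints the lower bound of each recommended range, assuming stereo input, and leaves out the HE-AAC profiles.

```shell
# Hypothetical helper: print the recommended minimum options for an encoder.
# $1 = encoder name as used with -c:a
recommended_audio_opts() {
    case $1 in
        libopus)    printf '%s\n' "-c:a libopus -b:a 64k" ;;
        libfdk_aac) printf '%s\n' "-c:a libfdk_aac -b:a 128k" ;;
        libvorbis)  printf '%s\n' "-c:a libvorbis -aq 4" ;;
        libmp3lame) printf '%s\n' "-c:a libmp3lame -aq 2" ;;
        ac3|eac3)   printf '%s\n' "-c:a $1 -b:a 160k" ;;
        aac)        printf '%s\n' "-c:a aac -b:a 128k" ;;
        libtwolame) printf '%s\n' "-c:a libtwolame -b:a 256k" ;;
        mp2)        printf '%s\n' "-c:a mp2 -b:a 320k" ;;
        *)          return 1 ;;   # encoder not covered by the list
    esac
}
```

Example of usage: `ffmpeg -i input.wav $(recommended_audio_opts libvorbis) output.ogg`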
The vorbis and wmav1/wmav2 encoders are not worth using.
The wmav1/wmav2 encoders do not reach transparency at any bitrate.
The vorbis encoder does not honor the bitrate specified on the FFmpeg command line. On some samples it does sound reasonable, but the resulting bitrate is very high.
To calculate the bitrate to use for multi-channel audio: (bitrate for stereo) x (channels / 2).
Example for 5.1 (6 channels) Vorbis audio: 128Kbps x (6 / 2) = 384Kbps
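The same arithmetic can be done in the shell. This is a minimal sketch; the function name `multichannel_kbps` is made up for illustration. It multiplies before dividing so that odd channel counts are not truncated too early.

```shell
# Hypothetical helper: scale a stereo bitrate to another channel count.
# $1 = stereo bitrate in kbit/s, $2 = number of channels
multichannel_kbps() {
    echo $(( $1 * $2 / 2 ))
}

multichannel_kbps 128 6   # the 5.1 Vorbis example: prints 384
```

The result could be passed on as, for instance, `-b:a "$(multichannel_kbps 128 6)k"`.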
When compatibility with hardware players doesn't matter, use libopus in an MKV container if libfdk_aac isn't available.
When compatibility with hardware players does matter, use libmp3lame or ac3 in an MP4/MKV container if libfdk_aac isn't available.
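The two rules above could be scripted roughly as follows. This sketch reads them as "prefer libfdk_aac whenever the build has it", which is an interpretation of the text; the function name `pick_encoder` and its arguments are hypothetical.

```shell
# Hypothetical selection helper.
# $1 = "yes" if hardware-player compatibility matters, otherwise "no"
# $2 = space-separated names of the encoders this ffmpeg build provides
pick_encoder() {
    case " $2 " in
        *" libfdk_aac "*) echo "libfdk_aac"; return 0 ;;
    esac
    if [ "$1" = "yes" ]; then
        echo "libmp3lame"   # ac3 is the other option named above
    else
        echo "libopus"
    fi
}

# One way to build the encoder list (the header lines of the output add
# a few harmless extra words):
#   encoders=$(ffmpeg -hide_banner -encoders | awk '{print $2}' | tr '\n' ' ')
#   pick_encoder no "$encoders"
```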
Transparency means the encoded audio sounds indistinguishable from the audio in the source file.
Some codecs have a more efficient variable bitrate (VBR) mode, which targets a given, constant quality level and lets the bitrate vary, whereas constant bitrate (CBR) mode holds the bitrate fixed and lets the quality vary. The figures above are for CBR. VBR is more efficient than CBR but may not be as hardware-compatible.
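To make the CBR/VBR distinction concrete, here is a small sketch for libmp3lame only, where `-b:a` requests a constant bitrate and `-q:a` (also spelled `-aq`) requests a VBR quality level. The wrapper name `mp3_rate_opts` is hypothetical; other encoders expose VBR through different options.

```shell
# Hypothetical helper for libmp3lame.
# $1 = "cbr" or "vbr"
# $2 = bitrate such as 192k (cbr) or quality level such as 2 (vbr)
mp3_rate_opts() {
    case $1 in
        cbr) printf '%s\n' "-c:a libmp3lame -b:a $2" ;;
        vbr) printf '%s\n' "-c:a libmp3lame -q:a $2" ;;
        *)   return 1 ;;
    esac
}
```

Example of usage: `ffmpeg -i input.wav $(mp3_rate_opts vbr 2) output.mp3`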
Resources
- Opus – Codec Landscape
- xiph.org – 64kbit/sec stereo multiformat listening test including Opus, aoTuV Vorbis, two HE-AAC encoders, and a 48kbit/sec AAC-LC low anchor
- Results of the public multiformat listening test (July 2014) – Opus, AAC and Ogg Vorbis at 96 kbps against a classic MP3 128 kbps
- Google Listening Test #1 – comparing Opus, Speex, and others
- Google Listening Test #2 – comparing Opus, Speex, and others