There is a reason why the Anime community has collectively ditched VLC in favor of MPV and MPC-HC. Color reproduction, modern codec support, ASS subtitle rendering, and even audio codecs are janky or even broken on VLC. 98% of all Anime encode release playback problems are caused by the user using VLC.
We even have a dedicated pastebin on a quick run down of what is wrong: https://rentry.co/vee-ell-cee
And this pastebin doesn't even have all the issues. VLC has a long-standing issue of not playing back 5.1 surround sound Opus correctly or at all. VLC is still using FFmpeg 4.x. We're on FFmpeg 8.x these days.
I can not even use VLC to take screenshots of videos I encode because the color rendering on everything is wrong. BT.709 is very much NOT new and predates VLC itself.
And you can say "VLC is easy to install and the UI is easy." Yeah so is IINA for macOS, Celluloid for Linux, and MPV.net for Windows which all use MPV underneath. Other better and easy video players exist today.
We are not in 2012 anymore. We are no longer just using AVC/H264 + AAC or AC-3 (Dolby Audio) MP4s for every video. We are playing back HEVC, VP9, and AV1 with HDR metadata in MKV/webm containers with audio codecs like Opus or HE-AACv3 or TrueHD in surround channels, BT.2020 colorspaces. VLC's current release is made of libraries and FFmpeg versions that predate some of these codecs/formats/metadata types. Even the VLC 4.0 nightly alpha is not keeping up. 4.0 is several years late to releasing and when it does, it may not even matter.
Would have been nice if these "MUST KNOW BEFORE" pieces of advice were structured in a way so one could easily come back and use them as a reference, like just a list, but instead it's like an over-dinner conversation with your "expert and correct but socially-annoying" work colleague, who refuses to elaborate on the hows and whys, but still has very strong opinions.
Not sure why it takes a dump on VLC - it’s been the most stable and friendly video player for Windows for a long time (it matters that ordinary users, like school teachers, can use it without special training. I don’t care how ideological you are about Linux or video players or whatever lol).
It seems like the main criticisms I am getting for this article are because it's escaped past its main target audience, so let me clarify a few things.
This post was born out of me hanging out in communities where people would make their own shortened edits of TV series and, in particular, anime, often to cut out filler or padding. Many people there would make many of the mistakes mentioned in the post, in particular reencoding at every step without knowing how to actually control efficiency/quality. I spent a lot of time helping out individual people one-on-one, but eventually wrote the linked article to collect all of my advice in one place. That way I (or other people I know) can just link to it like "Read the section on containers here," and then answer any follow-up questions, instead of having to explain from scratch each time.
> It seems really weirdly written. / ranty format
So, yes, it does. It was born out of one-to-one explanations on Discord. I wouldn't be surprised if it seems condescending to a more advanced reader, but if I rant about some point to hammer it down it's because it's a mistake I've seen people make often enough that it has to be reinforced this much. I wouldn't write a professional article this way.
The other point many people seem to get hung up about is the "hate" on VLC. Let me clarify that I do not "hate" VLC at all, I just don't recommend it. VLC is only mentioned once in the entire page, exactly because I didn't want to slot in an intermission purely to list a bunch of VLC issues. I felt like that would qualify more as "hate."
That said, yes, pretty much anyone I know in the fansubbing or encoding community does not recommend VLC because of various assorted issues. The rentry post [1] is often shared to list those, though I don't like how it does not give sources or reproducible examples for the issues it lists. I really do want to go through it and make proper samples and bug reports for all of these issues, I just didn't have the time yet.
Let me also clarify that I have nothing against the VLC developers. VideoLan does great work even outside of VLC, and every interaction I've had with their developers has been great. I just do not recommend the tool.
If folks want to get involved, there's also a chat community that's pretty active: https://video-dev.org.
https://github.com/occivink/mpv-scripts
There is also a way to losslessly cut preserving the original encoding but you give up the precision of the cuts due to keyframes. The MPV script above can do that too: script-opts/encode_slice.conf
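For anyone wondering what "giving up precision due to keyframes" looks like in practice, here is a minimal sketch. It is not how the mpv script does it (I have not checked its internals). It just assumes you already have a sorted list of keyframe timestamps in seconds (hypothetical input here, something a tool like ffprobe could give you) and builds a stream-copy ffmpeg command. The function names are made up for illustration.

```python
from bisect import bisect_right


def snap_to_keyframe(keyframes, requested):
    """Return the latest keyframe time that is <= requested.

    `keyframes` is assumed to be a sorted list of timestamps in seconds.
    A stream copy can only start cleanly on a keyframe, so the cut
    start has to move back to one.
    """
    index = bisect_right(keyframes, requested) - 1
    return keyframes[max(index, 0)]


def stream_copy_cut_command(source, start, end, output):
    """Build (but do not run) an ffmpeg command that cuts without re-encoding.

    `-c copy` copies the encoded streams as-is, so there is no quality
    loss, but the result is only as precise as the keyframe layout allows.
    """
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",
        "-to", f"{end:.3f}",
        "-i", source,
        "-c", "copy",
        output,
    ]


# Hypothetical keyframe layout: one keyframe every 2 seconds.
keyframes = [0.0, 2.0, 4.0, 6.0, 8.0]
start = snap_to_keyframe(keyframes, 4.7)  # requested 4.7s, lands on 4.0s
command = stream_copy_cut_command("input.mkv", start, 8.0, "cut.mkv")
```

Pass `command` to `subprocess.run` if you actually want to execute it. If you need the cut to start at exactly 4.7s, you have to re-encode at least the part before the next keyframe.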
Why? I only know Topaz and I always thought it had its narrow but legitimate use cases for upscaling and equalizing quality?
It's a simple tool which is great for many things, it has filters and there are most of the formats. I think it uses ffmpeg under the hood.
It's an old tool but it's fine for most things, when ffmpeg is too fastidious to use. ffmpeg is still what I use, but some more complex tasks are just more comfortable with avidemux.
These days I'm much more inclined to try and transparently encode the source material, tag it appropriately in the media container, and let the player adjust the image on the fly. Though I admit, I still spend hours playing around with Vapoursynth filter settings and AV1 parameters to try and get a good quality/compression ratio.
I have to say that the biggest improvement to the experience of watching my videos was when I got an OLED TV. Even some garbage VHS rip can look interesting when the night sky has been adjusted to true black.
Given the increasing abilities of TVs and processing abilities and feature sets of players, I'm not much persuaded to upgrade my DVD collection to Blu-Ray. Though I admit some of that is that I enjoy the challenge of getting a good video file out of my DVDs.
I partially disagree with the use of ASS subtitles. For a lot of traditional movies, using SRT files is sensible because more players support it, and because it's often sensible to give the player the option of how to render the text (because the viewing environment informs what is e.g. the appropriate font size).
My impression is that their audience equates file size with quality, so the bigger the file, the more "value" they got from the creator. This is frustrating because bigger files mean hitting transfer limits, slower downloads, slower copies, more space taken, etc...
The way I understand it, we've got the YCbCr that is being converted to an RGB value which directly corresponds to how bright we drive the R, G, and B subpixels. So wouldn't the entire range already be available? As in, post-conversion to RGB you've got 256 levels for each channel which can be anywhere from 0 to 255 or 0% to 100%? We could go to 10-bit color which would then give you finer control with 1024 levels per channel instead of 256, but you still have the same range of 0% to 100%. Does the YCbCr -> RGB conversion not use the full 0-255 range in RGB?
Naturally, we can stick brighter backlights in our monitors to make the difference between light and dark more significant, but that wouldn't change the on-disk or on-the-wire formats. Those formats have changed (video files are specifically HDR or SDR and operating systems need to support HDR to drive HDR monitors), so clearly I am missing something but all of my searches only find people comparing the final image without digging into the technical details behind the shift. Anyone care to explain or have links to a good source of information on the topic?
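On the narrow question of whether the YCbCr -> RGB conversion fills the whole 0-255 range, here is a minimal sketch, assuming 8-bit limited-range ("TV range") BT.709 video, where luma runs from 16 to 235 and chroma is centered on 128. The function name is made up for illustration, and the output is gamma-encoded R'G'B', not linear light.

```python
# BT.709 luma coefficients.
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def limited_ycbcr_to_full_rgb(y, cb, cr):
    """Convert one 8-bit limited-range BT.709 Y'CbCr pixel to full-range R'G'B'.

    Limited range puts black at Y'=16 and white at Y'=235, with chroma
    spanning 16..240 around a neutral value of 128.
    """
    yn = (y - 16) / 219      # luma normalized to 0..1
    cbn = (cb - 128) / 224   # chroma normalized to -0.5..0.5
    crn = (cr - 128) / 224

    r = yn + 2 * (1 - KR) * crn
    b = yn + 2 * (1 - KB) * cbn
    g = (yn - KR * r - KB * b) / KG

    # Clamp out-of-gamut values, then scale to the full 0..255 range.
    return tuple(round(255 * min(max(c, 0.0), 1.0)) for c in (r, g, b))


black = limited_ycbcr_to_full_rgb(16, 128, 128)
white = limited_ycbcr_to_full_rgb(235, 128, 128)
```

So yes, after conversion the RGB values do span the full 0 to 255. This sketch only covers the range mapping. It says nothing about the transfer function or the color primaries, which is where SDR and HDR signals actually differ.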
Edit: from some googling it looks like encoding is encoding, whether it’s used for recording or rendering footage. In that case the same quality arguments the article is making should apply for recording too. I only did a cursory search though and have not had a chance to test so if anyone knows better feel free to respond
ffmpeg seems ridiculously complicated, but in fact it's amazing the amount of work that happens under the hood when you do
ffmpeg -i input.mp4 output.webm
and tbh they've made the interface about as smooth as can be given the scope of the problem.
Grump grump grumpity grump. Same experience with every dashcam I've bought over the years.
The idea that YCbCr is only here because of "legacy reasons", and that we only discard half of chrominance because of equally "legacy reasons" is bonkers, though.
My biggest misconception, bar none, was around what a codec is exactly, and how well specified they are. I'd keep hearing downright mythical sounding claims, such as how different hardware and software encoders, and even decoders, produce different quality outputs.
This sounded absolutely mental to me. I thought that when someone said AVC / H.264, then there was some specification somewhere, that was then implemented, and that's it. I could not for the life of me even begin to fathom where differences in quality might seep in. Chief among these was when somebody claimed using single threaded encoding instead of multi threaded encoding was superior. I legitimately considered I was being messed with, or that the person I was talking to simply didn't know what they were talking about.
My initial thoughts on this were that okay, maybe there's a specification, and the various codec implementations just "creatively interpret" these. This made intuitive sense to me because "de jure" and "de facto" distinctions are immensely common in the real world, be it for laws, standards, what have you. So I'd start differentiating and going "okay so this is H.264 but <implementation name>". I was pretty happy with this, but eventually, something felt off enough to make me start digging again.
And then, not even a very long time ago, the mystery unraveled. What the various codec specifications actually describe, and what these codecs actually "are", is the on-disk bitstream format, and how to decode it. Just the decode. Never the encode. This applies to video, image, and sound formats; all lossy media formats. Except for telephony, all these codecs only ever specify the end result and how to decode that, but not the way to get there.
And so suddenly, the differences between implementations made sense. It isn't that they're flouting the standard: for the encoding step, there simply isn't one. The various codec implementations are left to compete on finding the "best" way to compress information to the same cross-compatibly decode-able bitstream. It is the individual encoders' responsibility to craft a so-called psychovisual or psychoacoustic model, and then build a compute-efficient encoder that can get you the most bang for the buck. This is how you get differences between different hardware and software encoders, and how you can get differences even between single and multi-threaded codepaths of the same encoder. Some of the approaches they chose might simply not work or work well with multi threading.
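To make that concrete, here is a toy analogy, not a real codec. The "standard" is just the `decode` function and the stream layout it expects: a list of `(value, run_length)` pairs. The two encoders are made up for illustration. Both produce streams the same decoder accepts, but they make different size/quality trade-offs.

```python
def decode(stream):
    """The 'standard': expand (value, run_length) pairs into samples."""
    samples = []
    for value, run_length in stream:
        samples.extend([value] * run_length)
    return samples


def encode_exact(samples):
    """Encoder A: merge only identical neighbors, so decoding is lossless."""
    stream = []
    for sample in samples:
        if stream and stream[-1][0] == sample:
            stream[-1] = (sample, stream[-1][1] + 1)
        else:
            stream.append((sample, 1))
    return stream


def encode_lossy(samples, tolerance=2):
    """Encoder B: also merge neighbors within `tolerance` of the run's value.

    This yields a smaller stream at the cost of some error, loosely like
    an encoder deciding which detail the viewer will not miss.
    """
    stream = []
    for sample in samples:
        if stream and abs(stream[-1][0] - sample) <= tolerance:
            stream[-1] = (stream[-1][0], stream[-1][1] + 1)
        else:
            stream.append((sample, 1))
    return stream


samples = [10, 10, 11, 12, 50, 51, 51, 90]
exact_stream = encode_exact(samples)   # 6 pairs, decodes to the input
lossy_stream = encode_lossy(samples)   # 3 pairs, decodes to an approximation
```

Nothing in `decode` says how the pairs must be chosen, which is the whole point: any encoder can be as clever or as lazy as it likes, as long as the output is a valid stream.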
One question that escaped me then was how can e.g. "HEVC / H.265" be "more optimal" than "AVC / H.264" if all these standards define is the end result and how to decode that end result. The answer is actually kinda trivial: more features. Literally just more knobs to tweak. These of course introduce some overhead, so the question becomes, can you reliably beat this overhead to achieve parity, or gain efficiency. The OP claims this is not a foregone conclusion, but doesn't substantiate. In my anecdotal experience, it is: parity or even efficiency gain is pretty much guaranteed.
Finally, I mentioned differences between decoder output quality. That is a bit more boring. It is usually a matter of fault tolerance, and indeed, standards violations, such as supporting a 10 bit format in H.264 when the standard (supposedly, never checked) only specifies 8-bit. And of course, just basic incorrectness / bugs.
Regarding subbing then, unless you're burning in subs (called hard-subs), all this malarkey about encoding doesn't actually matter. The only thing you really need to know about is subtitle formats and media containers. OP's writing is not really for you.
[1] Technically the term codec refers to a specific program that can encode and decode a certain format.
No need to know anything about the video file anymore.
(Of course if you're hosting billions of videos on a website like YouTube it is a different story, but at that point you need to learn a _lot_ more e.g. about hardware accelerators, etc.)
If I want the best possible quality image at a precisely specified time, what would I do?
Can I increase quality if I have some leeway regarding the time (to use the closest keyframe)?
Is there a way to "undo" motion blur and get a sharp picture?
On a separate note, also not mentioned: LLMs are really good at generating ffmpeg commands. Just discuss your source file and goals for a video with ChatGPT and you can typically one-shot a targeted command even if you aren't familiar with the ffmpeg CLI.
No, that's just nonsense for any guide targeting beginners, it's not fine, it's too error-prone and complicated and requires entering the whole unfriendly land of the CLI!
> If you must use a GUI
Of course you must! It's much better to provide beginners with your presets in Handbrake that avoid the footguns you mention (or teach them how to avoid them on their own) rather than ask them to descend into the dark pit of ffmpeg "basics"
> Before you start complaining about how complicated ffmpeg is and how arcane its syntax is, do yourself a favor and read the start of its documentation. It turns out that reading the (f.) manual actually helps a lot!
It turns out that wrapping the bad UI in a simpler typed GUI interface wastes less of the collective time than asking everyone to read dozens of pages of documentation!
> Actual video coding formats are formats like H.264 (also known as AVC) or H.265 (also known as HEVC). Sometimes they're also called codecs, short for "encode, decode".
Codec is coder/decoder. It's not the format.
There's a footnote claiming people mix the two terms up (a video format is apparently equal to a video codec according to this "expert"), yet acknowledging the difference is seemingly something only nitpickers do. Sheesh. If you want to educate, educate with precision, and don't spread your misinformation!