Hacker News

What you need to know before touching a video file

368 points by qbow883

by Drybones

12 subcomments

Nearly this entire HN comment section is upset about VLC being mentioned once and not recommended. If you can not understand why this very minor (but loud?) note was made, then you probably do not do any serious video encoding or you would know why it sucks today and is well past its prime. VLC is glorified because it was a video player that used to be amazing back in the day, but hasn't been for several years now. It is the Firefox of media players.
There is a reason why the Anime community has collectively ditched VLC in favor of MPV and MPC-HC. Color reproduction, modern codec support, ASS subtitle rendering, and even audio codecs are janky or even broken on VLC. 98% of all Anime encode release playback problems are caused by the user using VLC.
We even have a dedicated pastebin on a quick run down of what is wrong: https://rentry.co/vee-ell-cee
And this pastebin doesn't even have all the issues. VLC has a long-standing issue of not playing back 5.1 surround sound Opus correctly or at all. VLC is still using FFmpeg 4.x. We're on FFmpeg 8.x these days.
I can not even use VLC to take screenshots of videos I encode because the color rendering on everything is wrong. BT.709 is very much NOT new and predates VLC itself.
And you can say "VLC is easy to install and the UI is easy." Yeah so is IINA for macOS, Celluloid for Linux, and MPV.net for Windows which all use MPV underneath. Other better and easy video players exist today.
We are not in 2012 anymore. We are no longer just using AVC/H264 + AAC or AC-3 (Dolby Audio) MP4s for every video. We are playing back HEVC, VP9, and AV1 with HDR metadata in MKV/WebM containers with audio codecs like Opus or HE-AACv3 or TrueHD in surround channels, and BT.2020 colorspaces. VLC's current release is made of libraries and FFmpeg versions that predate some of these codecs/formats/metadata types. Even the VLC 4.0 nightly alpha is not keeping up. 4.0 is several years late to releasing and when it does, it may not even matter.

by embedding-shape

5 subcomments

It seems really weirdly written. It's written with a lot of authority, like saying "Don't use VLC" and "Don't use Y" yet provides no reasoning for those things. Just putting "Trust me, just don't" doesn't suddenly mean I trust the author more, it probably has the opposite effect. Some sections seem to differ based on if the reader knows/doesn't know something, but I thought the article was supposed to be for the latter.
Would have been nice if these "MUST KNOW BEFORE" pieces of advice were structured so that one could easily come back and use them as a reference, like just a list, but instead it's like an over-dinner conversation with your "expert and correct but socially-annoying" work colleague, who refuses to elaborate on the hows and whys but still has very strong opinions.

by EdNutting

6 subcomments

Interesting read, it’s a shame the ranty format makes it 3x longer than necessary.
Not sure why it takes a dump on VLC - it’s been the most stable and friendly video player for Windows for a long time (it matters that ordinary users, like school teachers, can use it without special training. I don’t care how ideological you are about Linux or video players or whatever lol).

by happytoexplain

7 subcomments

I'm always amazed when I see how many people are unfamiliar with VLC hate. It was notorious (to the point of it being a popular meme topic) for video artifacts, slow/buggy seeking, bloated/clumsy UI/menus, having very little format support out of the box, and buggy subtitles. I assume nowadays it's much better, since it seems popular, but its reputation will stick with me forever.

by arch1t3cht

0 subcomment

Original post author here.
It seems like the main criticisms I am getting for this article are because it's escaped past its main target audience, so let me clarify a few things.
This post was born out of me hanging out in communities where people would make their own shortened edits of TV series and, in particular, anime, often to cut out filler or padding. Many people there would make many of the mistakes mentioned in the post, in particular reencoding at every step without knowing how to actually control efficiency/quality. I spent a lot of time helping out individual people one-on-one, but eventually wrote the linked article to collect all of my advice in one place. That way I (or other people I know) can just link to it like "Read the section on containers here," and then answer any follow-up questions, instead of having to explain from scratch each time.
> It seems really weirdly written. / ranty format
So, yes, it does. It was born out of one-to-one explanations on Discord. I wouldn't be surprised if it may seem condescending to a more advanced reader, but if I rant about some point to hammer it down it's because it's a mistake I've seen people make often enough that it has to be reinforced this much. I wouldn't write a professional article this way.
The other point many people seem to get hung up about is the "hate" on VLC. Let me clarify that I do not "hate" VLC at all, I just don't recommend it. VLC is only mentioned once in the entire page, exactly because I didn't want to slot in an intermission purely to list a bunch of VLC issues. I felt like that would qualify more as "hate."
That said, yes, pretty much anyone I know in the fansubbing or encoding community does not recommend VLC because of various assorted issues. The rentry post [1] is often shared to list those, though I don't like how it does not give sources or reproducible examples for the issues it lists. I really do want to go through it and make proper samples and bug reports for all of these issues, I just didn't have the time yet.
Let me also clarify that I have nothing against the VLC developers. VideoLan does great work even outside of VLC, and every interaction I've had with their developers has been great. I just do not recommend the tool.
[1] https://rentry.co/vee-ell-cee

by mmcclure

0 subcomment

Shameless plug for anyone that wants to go deeper on specific video topics: I've been organizing a conference for video devs for 11 years now and there's a wealth of info in the recordings. A talk from the most recent one on hacking a Sega Genesis to stream video might not seem that practical, but there were some fascinating bits on compression (or, rather, not being able to use actual compression). https://www.youtube.com/watch?v=GZdxdpw-3nI
If folks want to get involved, there's also a chat community that's pretty active: https://video-dev.org.

by coppsilgold

0 subcomment

MPV plugins can actually do frame-perfect cuts and crops for you (+ whatever ffmpeg filters you want), something that would generally require the hassle of opening editing software. And those cuts can be done in h264 lossless (for additional processing later at no additional quality loss from this step).
https://github.com/occivink/mpv-scripts
There is also a way to losslessly cut preserving the original encoding but you give up the precision of the cuts due to keyframes. The MPV script above can do that too: script-opts/encode_slice.conf
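For readers who want to see the trade-off outside of mpv, the two approaches map onto plain ffmpeg invocations roughly as follows. This is a minimal sketch and not what the linked script runs internally; the helper names, file names, and timestamps are hypothetical, and the functions only build argument lists, so nothing is executed.

```python
def stream_copy_cut(src, start, end, dst):
    """Cut without re-encoding.

    The original encoding is preserved, but the start of the cut snaps to a
    keyframe, so the result is not frame-exact.
    """
    return ["ffmpeg", "-ss", start, "-to", end, "-i", src, "-c", "copy", dst]


def lossless_reencode_cut(src, start, end, dst):
    """Frame-accurate cut that re-encodes video with x264 in lossless mode."""
    return [
        "ffmpeg", "-i", src,
        "-ss", start, "-to", end,        # output-side trim: decode, then cut
        "-c:v", "libx264", "-qp", "0",   # -qp 0 selects lossless x264
        "-c:a", "copy",                  # leave the audio stream untouched
        dst,
    ]


print(" ".join(stream_copy_cut("in.mkv", "00:01:00", "00:02:30", "cut.mkv")))
print(" ".join(lossless_reencode_cut("in.mkv", "00:01:00", "00:02:30", "cut.mkv")))
```

Either list can be passed to `subprocess.run` to actually perform the cut. The lossless variant produces a much larger intermediate file, which is the price for editing it further without additional quality loss.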

by weinzierl

3 subcomments

"Don't use Topaz AI, Anime4k, RealESRGAN, RIFE, etc. Trust me, just don't."
Why? I only know Topaz and I always thought it had its narrow but legitimate use cases for upscaling and equalizing quality?

by jokoon

2 subcomments

I wish he talked about avidemux.
It's a simple tool which is great for many things, it has filters and there are most of the formats. I think it uses ffmpeg under the hood.
It's an old tool but it's fine for most things, when ffmpeg is too fastidious to use. ffmpeg is still what I use, but some more complex tasks are just more comfortable with avidemux.

by liampulles

0 subcomment

I remember using Gordian Knot to create avi files from my DVDs back when XviD was the pragmatic method for encoding videos, and the whole goal was to get movies under 700mb so that you could write them to a CD. Avisynth and community filters were largely geared towards undoing all sorts of crap done to an image because artifacts that were relatively unnoticeable on a general CRT television were quite apparent on a computer monitor, as well as to then prepare the video to look good once it has been highly compressed with XviD or DivX.
These days I'm much more inclined to try and transparently encode the source material, tag it appropriately in the media container, and let the player adjust the image on the fly. Though I admit, I still spend hours playing around with Vapoursynth filter settings and AV1 parameters to try and get a good quality/compression ratio.
I have to say that the biggest improvement to the experience of watching my videos was when I got an OLED TV. Even some garbage VHS rip can look interesting when the night sky has been adjusted to true black.
Given the increasing abilities of TVs and processing abilities and feature sets of players, I'm not much persuaded to upgrade my DVD collection to Blu-Ray. Though I admit some of that is that I enjoy the challenge of getting a good video file out of my DVDs.
I partially disagree with the use of ASS subtitles. For a lot of traditional movies, using SRT files is sensible because more players support it, and because it's often sensible to give the player the option of how to render the text (because the viewing environment informs what is e.g. the appropriate font size).

by socalgal2

2 subcomments

Tangential but, at least for me, I find lots of video creators making 2-3 gig videos when I notice no difference in quality after re-encoding them to 1/4th the size or less.
My impression is, their audience equates file size with quality so the bigger the file the more "value" they got from the creator. This is frustrating because bigger files means hitting transfer limits, slower to download, slower to copy, taking more space, etc...
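As a rough aid for reasoning about this, file size is just average bitrate times duration, so a file at a quarter of the size is a file at a quarter of the average bitrate. The sketch below does that arithmetic; the 2.5 GB and 20-minute figures are made-up example values, and the function names are hypothetical.

```python
def average_bitrate_kbps(size_bytes, duration_s):
    """Average total bitrate (video + audio + overhead) in kilobits per second."""
    return size_bytes * 8 / duration_s / 1000


def size_bytes_for(bitrate_kbps, duration_s):
    """File size in bytes that a given average bitrate produces over a duration."""
    return bitrate_kbps * 1000 * duration_s / 8


duration = 20 * 60                                  # hypothetical 20-minute video
original = average_bitrate_kbps(2.5e9, duration)    # hypothetical 2.5 GB upload
reencoded = original / 4                            # "1/4th the size"

print(f"original:  {original:.0f} kbps")
print(f"reencoded: {reencoded:.0f} kbps, {size_bytes_for(reencoded, duration) / 1e9:.3f} GB")
```

Whether the lower bitrate is visibly worse depends on the content and on the encoder and settings used, which the numbers alone cannot tell you.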

by craftkiller

7 subcomments

Something I've never been able to find satisfactory information on (and unfortunately this article also declares it out of scope) is what the actual hard on-the-wire and on-disk differences between SDR and HDR are. Like yes, I know HDR = high dynamic range = bigger difference between light and dark, but what technical changes were needed to accomplish this?
The way I understand it, we've got the YCbCr that is being converted to an RGB value which directly corresponds to how bright we drive the R, G, and B subpixels. So wouldn't the entire range already be available? As in, post-conversion to RGB you've got 256 levels for each channel which can be anywhere from 0 to 255 or 0% to 100%? We could go to 10-bit color which would then give you finer control with 1024 levels per channel instead of 256, but you still have the same range of 0% to 100%. Does the YCbCr -> RGB conversion not use the full 0-255 range in RGB?
Naturally, we can stick brighter backlights in our monitors to make the difference between light and dark more significant, but that wouldn't change the on-disk or on-the-wire formats. Those formats have changed (video files are specifically HDR or SDR and operating systems need to support HDR to drive HDR monitors), so clearly I am missing something but all of my searches only find people comparing the final image without digging into the technical details behind the shift. Anyone care to explain or have links to a good source of information on the topic?
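One part of the answer can be sketched in code. The stored numbers are still integer code values, but an HDR stream signals a different transfer function (PQ or HLG) and usually BT.2020 primaries, so the same code value means something different. With PQ (SMPTE ST 2084), code values map to absolute luminance up to 10,000 cd/m², whereas SDR code values are relative to whatever brightness the display is set to. The constants below are the ones from the PQ definition; the function names are my own, and the 10-bit mapping uses full range purely for illustration (video commonly uses limited range, e.g. luma 16-235 in 8-bit).

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_to_nits(signal):
    """PQ EOTF: normalized code value in [0, 1] -> absolute luminance in cd/m^2."""
    p = signal ** (1 / M2)
    return 10000 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def nits_to_pq(nits):
    """Inverse PQ: absolute luminance in cd/m^2 -> normalized code value in [0, 1]."""
    y = (nits / 10000) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2


for nits in (0.1, 100, 1000, 10000):
    code10 = round(nits_to_pq(nits) * 1023)  # full-range 10-bit value, illustration only
    print(f"{nits:>7} cd/m^2 -> 10-bit code {code10}")
```

Running it shows that roughly half of the code range is spent on luminance up to about 100 cd/m², with the rest reserved for highlights. That allocation is a property of the curve rather than of the bit depth, and 10 bits are used mainly so that the steps between adjacent codes stay small enough not to show as banding.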

by ro_bit

2 subcomments

I edit videos on a hobbyist level (mostly using davinci resolve to edit clips of me dying in video games to upload to a shareX host to show to friends). The big takeaway for me was reading that for quality/efficiency libx264 is better than nvenc for rendering h264 video. All this time I’ve assumed nvenc is better because it used shiny GPU technology! Is libx264 better for recording high quality videos too? I know it will run on CPU unlike NVENC but I doubt that’s an issue for my use case.
Edit: from some googling it looks like encoding is encoding, whether it’s used for recording or rendering footage. In that case the same quality arguments the article is making should apply for recording too. I only did a cursory search though and have not had a chance to test so if anyone knows better feel free to respond

by webdevver

2 subcomments

video format world is one where you nope out pretty quick once you realize how many moving pieces there are.
ffmpeg seems ridiculously complicated, but in fact it's amazing the amount of work that happens under the hood when you do
```
    ffmpeg -i input.mp4 output.webm
```
and tbh they've made the interface about as smooth as can be given the scope of the problem.

by kwar13

0 subcomment

Pretty good writeup but not sure why VLC is not recommended...?

by WalterBright

0 subcomment

I bought a new dashcam. It generates .mp4 files. I tried to play them back with my new Roku media player, and it says invalid format. (They will play with Windows media player.)
Grump grump grumpity grump. Same experience with every dashcam I've bought over the years.

by jorl17

2 subcomments

I thought it was a good read, although with a couple of mistakes and a somewhat (IMO) childish sense of entitlement. This reads a bit like something a young teen who is heavy into tech wrote. I'm sure I could have authored something with the same overall tone and vibe when I was younger (perhaps not same quality, though!). Either way, it's a very decent read!
The idea that YCbCr is only here because of "legacy reasons", and that we only discard half of chrominance because of equally "legacy reasons" is bonkers, though.
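For context on what the discarding amounts to numerically, here is a small sketch that counts raw samples per frame under the common subsampling schemes. It is arithmetic only and does not settle the "legacy" argument either way; the function name is hypothetical.

```python
def samples_per_frame(width, height, subsampling="4:2:0"):
    """Number of raw Y, Cb and Cr samples in one frame."""
    # (horizontal divisor, vertical divisor) applied to each chroma plane
    h_div, v_div = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}[subsampling]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)  # Cb plane + Cr plane
    return luma + chroma


for scheme in ("4:4:4", "4:2:2", "4:2:0"):
    print(scheme, samples_per_frame(1920, 1080, scheme))
```

For a 1920x1080 frame, 4:2:0 keeps every luma sample but only one Cb and one Cr sample per 2x2 block, so three quarters of the chroma samples go and the total sample count is halved before the encoder has done any compression at all.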

by Jabrov

3 subcomments

What's wrong with VLC?

by buzer

0 subcomment

I would disagree somewhat on his stance that video quality is not affected by container format (especially on the part "Here is a list of things that people commonly associate with a video's quality"). Different container formats have different limitations regarding what video (and audio) formats they support. And while subtitle support doesn't directly affect video quality, it does do so indirectly. If you cannot add subtitles without hardsubbing or subtitle formats are so limited that you end up needing hardsubbing anyway then the choice of the container format ends up affecting the video quality.

by perching_aix

5 subcomments

I've had a lot of misconceptions that I had to contend with over the years myself as well. Maybe this thread is a good opportunity to air the biggest one of those. Additionally, I'll touch on subbing at the end, since the post specifically calls it out.
My biggest misconception, bar none, was around what a codec is exactly, and how well specified they are. I'd keep hearing downright mythical sounding claims, such as how different hardware and software encoders, and even decoders, produce different quality outputs.
This sounded absolutely mental to me. I thought that when someone said AVC / H.264, then there was some specification somewhere, that was then implemented, and that's it. I could not for the life of me even begin to fathom where differences in quality might seep in. Chief among these was when somebody claimed using single threaded encoding instead of multi threaded encoding was superior. I legitimately considered I was being messed with, or that the person I was talking to simply didn't know what they were talking about.
My initial thoughts on this were that okay, maybe there's a specification, and the various codec implementations just "creatively interpret" these. This made intuitive sense to me because "de jure" and "de facto" distinctions are immensely common in the real world, be it for laws, standards, what have you. So I'd start differentiating and going "okay so this is H.264 but <implementation name>". I was pretty happy with this, but eventually, something felt off enough to make me start digging again.
And then, not even a very long time ago, the mystery unraveled. What the various codec specifications actually describe, and what these codecs actually "are", is the on-disk bitstream format, and how to decode it. Just the decode. Never the encode. This applies to video, image, and sound formats; all lossy media formats. Except for telephony, all these codecs only ever specify the end result and how to decode that, but not the way to get there.
And so suddenly, the differences between implementations made sense. It isn't that they're flouting the standard: for the encoding step, there simply isn't one. The various codec implementations are to compete on finding the "best" way to compress information to the same cross-compatibly decode-able bitstream. It is the individual encoders' responsibility to craft a so-called psychovisual or psychoacoustic model, and then build a compute-efficient encoder that can get you the most bang for the buck. This is how you get differences between different hardware and software encoders, and how you can get differences even between single and multi-threaded codepaths of the same encoder. Some of the approaches they chose might simply not work or work well with multi threading.
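The "only the decoder is specified" idea can be shown with a toy format. This is a made-up, lossless run-length scheme and nothing like a real video codec; it only illustrates that two encoders can emit different valid streams for the same input while a single decoder handles both.

```python
def decode(stream):
    """The 'specification': a stream is a list of (count, value) pairs."""
    out = []
    for count, value in stream:
        out.extend([value] * count)
    return out


def naive_encoder(samples):
    """Valid but wasteful: emits one pair per sample."""
    return [(1, s) for s in samples]


def run_merging_encoder(samples):
    """Also valid, and smaller: merges consecutive equal samples into one pair."""
    stream = []
    for s in samples:
        if stream and stream[-1][1] == s:
            stream[-1] = (stream[-1][0] + 1, s)
        else:
            stream.append((1, s))
    return stream


samples = [7, 7, 7, 7, 2, 2, 9]
a, b = naive_encoder(samples), run_merging_encoder(samples)
assert decode(a) == decode(b) == samples  # same decoder, same result
print(len(a), len(b))  # 7 pairs vs 3 pairs
```

In a lossy codec the encoder additionally chooses what to throw away, so competing encoders differ in quality as well as in size.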
One question that escaped me then was how can e.g. "HEVC / H.265" be "more optimal" than "AVC / H.264" if all these standards define is the end result and how to decode that end result. The answer is actually kinda trivial: more features. Literally just more knobs to tweak. These of course introduce some overhead, so the question becomes, can you reliably beat this overhead to achieve parity, or gain efficiency. The OP claims this is not a foregone conclusion, but doesn't substantiate. In my anecdotal experience, it is: parity or even efficiency gain is pretty much guaranteed.
Finally, I mentioned differences between decoder output quality. That is a bit more boring. It is usually a matter of fault tolerance, and indeed, standards violations, such as supporting a 10 bit format in H.264 when the standard (supposedly, never checked) only specifies 8-bit. And of course, just basic incorrectness / bugs.
Regarding subbing then, unless you're burning in subs (called hard-subs), all this malarkey about encoding doesn't actually matter. The only thing you really need to know about is subtitle formats and media containers. OP's writing is not really for you.

by swiftcoder

1 subcomments

Really good quickstart guide

by zzo38computer

1 subcomments

Sometimes I would want to convert from MPEG-TS H.264 to DVD video format, or other conversions, so there are reasons to do so. However, once I had got desynchronized audio, and I don't know if that is because of the original source, because of the conversion, or because some segments have not been recorded. (Also, it could not retain the EIA-608 captions, but that seems to be a limitation with FFmpeg, rather than something I did.)

by tmaly

0 subcomment

This is a great write up. Thank you for sharing.

by hamonrye

0 subcomment

H.264 (AVC) video, for example as produced by the x264 encoder, is stored in container formats such as .mkv or .mp4; codecs are what encode and decode it.
[1] Technically the term codec refers to a specific program that can encode and decode a certain format.

by vivzkestrel

0 subcomment

would be nice if someone like epic spaceman actually broke down how videos are encoded, stored, processed and how encoding algorithms work visually, i am bad at understanding things by reading about them

by g4zj

4 subcomments

I'm curious what the issue is with using Handbrake? I use it all the time on macOS and it's generally a simple and effective tool for my purposes.

by pandemic_region

1 subcomments

Could have used this in the nineties, where hunting a specific codec to play that video you downloaded off a BBS was an actual thing.

by amelius

0 subcomment

Nowadays, I just ask an LLM to give me the ffmpeg command that I need.
No need to know anything about the video file anymore.
(Of course if you're hosting billions of videos on a website like YouTube it is a different story, but at that point you need to learn a _lot_ more e.g. about hardware accelerators, etc.)

by weinzierl

3 subcomments

The article talks about image comparisons but does not say what the best way to extract an image is.
If I want the best possible quality image at a precisely specified time, what would I do?
Can I increase quality if I have some leeway regarding the time (to use the closest keyframe)?
Is there a way to "undo" motion blur and get a sharp picture?
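For the first question, one common approach is to have ffmpeg seek to the timestamp and write a single frame to a lossless image format. The sketch below only builds the command; the helper name, file names, and timestamp are hypothetical, and it does not address the keyframe or motion blur questions.

```python
def extract_frame_cmd(src, timestamp, dst="frame.png"):
    """Build an ffmpeg command that writes one frame at `timestamp` as a PNG."""
    return [
        "ffmpeg",
        "-ss", timestamp,   # seek to the requested time before reading the input
        "-i", src,
        "-frames:v", "1",   # stop after a single video frame
        dst,                # PNG adds no further lossy compression to the decoded frame
    ]


print(" ".join(extract_frame_cmd("episode.mkv", "00:12:34.567")))
```

The colors in the result still depend on how the YCbCr to RGB conversion is carried out, so a screenshot from a correctly configured player may differ from this output.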

by jdprgm

2 subcomments

Just writing off AI upscaling completely is bs. It's not some magic bullet to use on every video and there is a learning curve on how to apply but there are absolutely scenarios where you can get shockingly good results. I think a lot of people make judgements on it based on super small sample sizes.
On a separate note, also not mentioned: LLMs are really good at generating ffmpeg commands. Just discuss your source file and goals for a video with ChatGPT and you can typically one-shot a targeted command even if you aren't familiar with the ffmpeg CLI.

by eviks

0 subcomment

> I would recommend you to just learn basic ffmpeg usage instead
> but ffmpeg is fine for beginners
No, that's just nonsense for any guide targeting beginners, it's not fine, it's too error-prone and complicated and requires entering the whole unfriendly land of the cli!
> If you must use a GUI
Of course you must! It's much better to provide beginners with your presets in Handbrake that avoid the footguns you mention (or teach them how to avoid them on their own) rather than ask them to descend into the dark pit of ffmpeg "basics"
> Before you start complaining about how complicated ffmpeg is and how arcane its syntax is, do yourself a favor and read the start of its documentation. It turns out that reading the (f.) manual actually helps a lot!
It turns out that wrapping the bad UI in a simpler typed GUI interface wastes less of the collective time than asking everyone to read dozens of pages of documentation!

by netsharc

2 subcomments

The second technical definition in this document is wrong. Great way to put the "the author is opinionated but is clueless" marker right near the top.
> Actual video coding formats are formats like H.264 (also known as AVC) or H.265 (also known as HEVC). Sometimes they're also called codecs, short for "encode, decode".
Codec is coder/decoder. It's not the format.
There's a footnote claiming people mix the 2 terms up (a video format is apparently equal to a video codec according to this "expert") but apparently acknowledging the difference is seemingly only what nitpickers do. Sheesh. If you want to educate, educate with precision, and don't spread your misinformation!

by effnorwood

0 subcomment

They are very hot