FRESH

Hacker News

Home

Drawvg Filter for FFmpeg

165 points by nolta

by jasode

2 subcomments

drawvg is very useful. Before drawvg, I was always fine using the stable FFmpeg releases such as 8.0.1 but when I saw drawvg added to the master branch[1], I didn't want to wait for the next stable release and immediately re-built ffmpeg from master to start using it.
My main use case is modifying youtube videos of tech tutorials where the speaker overlays a video of themselves in a corner of the video. drawvg is used to blackout that area of the video. I'm sure some viewers like having a visible talking head shown on the same screen as the code but I find the constant motion of someone's lips moving and eyes blinking in my peripheral vision extremely distracting. Our vision is very tuned into paying attention to faces so the brain constantly fighting that urge so it can concentrate on the code. (A low-tech solution is to just put a yellow sticky know on the monitor to cover up the speaker but that means you can't easily resize/move the window playing the video ... so ffmpeg to the rescue.)
If the overlay was a rectangle, you can use the older drawbox filter and don't need drawvg. However, some content creaters use circles and that's where drawvg works better. Instead of creating a separate .vgs file, I just use the inline syntax like this:
```
  ffmpeg -i input.webm -filter_complex "[0:v]drawvg='circle 3388 1670 400 setcolor black fill'[v2];[0:a]atempo=1.5[a2]" -map "[v2]" -map "[a2]" output.mp4
```
That puts a black filled circle on the bottom right corner of a 4k vid to cover up the speaker. Different vids from different creators will require different x,y,radius coordinates.
(The author of the drawvg code in the git log appears to be the same as the author of this thread's article.)
[1] https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/016d767c8e9d...

by sorenjan

0 subcomment

The cropdetect example made me wonder if they're thinking about including support for yolo or similar models. They're including Whisper for text to speech already, I think yolo would enable things like automatic face censoring and general frame content aware editing. Or maybe Segment anything, and have more fine grained masks available.
On the other hand, when I compared the binaries (ffmpeg, ffprobe, ffplay) I downloaded the other day with the ones I had installed since around September, they where almost 100 MB larger. I don't remember the exact size of the old ones but the new ones are 640 MB, the old ones well under 600 MB. The only difference in included libraries was Cairo and the JPEG-XS lib. So while I think a bunch of new ML models would be really cool, maybe they don't want to go down that route. But some kind of pluggable system with accelerated ML models would be helpful I think.

by torginus

4 subcomments

One of the strangest discoveries of my life was that vector graphics is a solved problem, and the solution is turtle graphics that I was taught in primary school.

by fercircularbuf

0 subcomment

This could be a great and easy way to add annotations to technical instruction videos in a way that is self-documenting and programmable.

by jasonjmcghee

0 subcomment

Didn't know this was a thing- thanks for posting.
Is there something similar that supports shaders? Like metal / wgsl / glsl or something?
Sounds like a fun project...

by dotnot

0 subcomment

Nice readme example, I would also like to see time of execution for these tasks, I mean how much time for handling filters adds comparing to non using it

by dingdingdang

0 subcomment

How far are we from having ffmpeg make automatic rotoscope versions of our videos?!

by mikkupikku

0 subcomment

Looks very cool. I wonder if drawvg filters can be modified in real time using zmq.

by arc-in-space

0 subcomment

Oh, I was not expecting an entire DSL for this. This looks really useful.

by shevy-java

1 subcomments

That's quite cool. I guess ffmpeg would kind of technically be a replacement for avisynth at this point.
Still, I find the syntax it uses horrible:
```
  ffmpeg -an -ss 12 -t 3 -i bigbuckbunny.mov -vf 'crop=iw-1, drawvg=file=progress.vgs, format=yuv420p' -c:v libvpx-vp9 output.webm
```
I understand that most of this comes from simplicity of use from the shell, so if you take this point of view, the above makes a lot of sense.
My poor, feeble brain, though, has a hard time deducing all of this. Yes, I can kind of know what it does to some extent ... start at 12 seconds right? during 3 seconds ... apply the specified filter in the specified format, use libvpx-vp9 as the video codec ... but the above example is somewhat simple. There are total monsters in actual use when it comes to the filter subsystem in ffmpeg. Avisynth was fairly easy on my brain; ffmpeg does not, and nobody among the ffmpeg dev team seems to think that complicated uses are an issue. I even wrote a small ruby script that expands shortcut options as above, into the corresponding long names, simply because the long names are a bit easier to remember. Even that fails when it comes to complex filters used.
It's a shame because ffmpeg is otherwise really great.