Crafting an Audiovisual Mashup using FFmpeg

Tutorial

Let's have some fun! (Note: You don't need to have fun if you don't want to.)

Have you ever made or seen a video recording and wanted to set it to a different audio program? Video is usually the primary organizational criterion, because that is how most people sort things. If you use a search engine, unless you are looking for a music video, you will typically search for the name of a visual presentation. When a video is remarkable, though, you can sometimes apply a game design principle to make it more fun.

Here's an explanation of this easy process. In this tutorial, I'm going to show you how to flip the primary organizational criterion to audio; that is, we are going to replace the audio program in an existing video file.

FFmpeg is an audiovisual processing program for Linux, UNIX, and other operating systems. It can take one or more input files and combine and transform the video and audio tracks of any input.

In this example, we are going to take two separate files, one video and one audio, and output to a resulting media file you can view on a computer and in a web browser. (Be aware that file formats vary, and any single container file may have several video tracks such as DVD angles, or several audio tracks for multiple languages.)

You're going to learn how to install FFmpeg, how to query the software's dependencies on any libraries, and how to use the software you've obtained to do something fun. This was tested on Ubuntu Linux in September of 2018.

A video file and an audio file are necessary to complete this tutorial. The example command uses richard.mp4 and campgranada.m4a.

First, make sure you have FFmpeg installed. Since I have it installed already, I will also show you how to query a package you already have installed so you can view its installation status.

Important: Packages on dpkg-based systems (such as Debian or Ubuntu) can have dependencies! This is a fact of how software is packaged on these systems. In FFmpeg's case, the dependencies are codec libraries and other supporting software that encode and decode the files you feed to it.

This is why you usually see apt-get ask if you want to download and install multiple packages when you ask it to install one package: the package you requested knows it needs other packages to complete a task for you.


$ sudo apt-get install ffmpeg

(Your computer will prompt you for your password to install this software, unless you have entered it recently. As I noted above, I already have it installed, so let's continue.)

$ dpkg -s ffmpeg

Package: ffmpeg
Status: install ok installed
Priority: optional
Section: video
Installed-Size: 2223
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: amd64
Multi-Arch: foreign
Version: 7:3.4.4-0ubuntu0.18.04.1
Replaces: libav-tools (<< 6:12~~), qt-faststart (<< 7:2.7.1-3~)
Depends: libavcodec57 (>= 7:3.4.4) | libavcodec-extra57 (>= 7:3.4.4),
libavdevice57 (>= 7:3.4.4), libavfilter6 (>= 7:3.4.4) | 
libavfilter-extra6 (>= 7:3.4.4), libavformat57 (>= 7:3.4.4), 
libavresample3 (>= 7:3.4.4), libavutil55 (>= 7:3.4.4),
libc6 (>= 2.14), libpostproc54 (>= 7:3.4.4), libsdl2-2.0-0 (>= 2.0.8),
libswresample2 (>= 7:3.4.4), libswscale4 (>= 7:3.4.4)
Suggests: ffmpeg-doc
Breaks: libav-tools (<< 6:12~~), qt-faststart (<< 7:2.7.1-3~)
Conffiles:
 /etc/ffserver.conf a384a0e47a2facb870217cc2f5123af7
Description: Tools for transcoding, streaming and playing of multimedia files
 FFmpeg is the leading multimedia framework, able to decode, encode, transcode,
 mux, demux, stream, filter and play pretty much anything that humans and
 machines have created. It supports the most obscure ancient formats up to the
 cutting edge.
 .
 This package contains:
  * ffmpeg: a command line tool to convert multimedia files between formats
  * ffserver: a multimedia streaming server for live broadcasts
  * ffplay: a simple media player based on SDL and the FFmpeg libraries
  * ffprobe: a simple multimedia stream analyzer
  * qt-faststart: a utility to rearrange Quicktime files
Homepage: https://ffmpeg.org/
Original-Maintainer: Debian Multimedia Maintainers 
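
The Depends field in the output above can also be printed on its own. This is a minimal sketch, assuming a dpkg-based system; the helper name have is made up for this example, and the fallback messages are only there so the commands fail gently on other systems.

```shell
# Return success if the named command exists on PATH.
have() { command -v "$1" >/dev/null 2>&1; }

if have dpkg-query && have apt-cache; then
    # Print only the Depends field of the installed ffmpeg package.
    dpkg-query -W -f='${Depends}\n' ffmpeg 2>/dev/null \
        || echo "ffmpeg does not appear to be installed"
    # List the dependencies as APT sees them, one per line.
    apt-cache depends ffmpeg 2>/dev/null \
        || echo "apt-cache has no record of ffmpeg"
else
    echo "dpkg-query or apt-cache not found; is this a dpkg-based system?"
fi
```

Each name in that list is itself a package, which is why apt-get offers to install several packages at once.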

$ ffmpeg \
       -i richard.mp4 \
       -i campgranada.m4a \
       -filter_complex "[0:v]scale=320:240;[1:a]atempo=1.04" \
       -t 1:19.32 \
       -c:v libvpx \
       -b:v 256k \
       -b:a 32k \
                      stallman.webm

Why am I showing a command that has multiple lines?

Linux and UNIX shells use this silly \ thing to continue one line of a command onto the next. (Other systems such as Microsoft Windows use it as a directory separator, but here it does not denote a directory.) You will note that I placed the output filename at the end of the command; I even indented it differently from the other lines so you would notice it. I didn't add another \ after it, because the command ends there: when you press Enter at that point, your computer will start running your command. You don't need the backslashes if you type everything on one line, but they make it easier for me to show you this.
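
To see line continuation on its own, without FFmpeg, here is a small sketch using echo. The variable name joined is just for illustration.

```shell
# Each backslash at the end of a line joins it to the next line,
# so echo receives three arguments as if typed on a single line.
joined=$(echo one \
    two \
    three)
echo "$joined"    # prints: one two three
```

The last line of the command has no backslash, so that is where the shell stops reading and starts running.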

You have a choice. You can run the command as it is, you can change some of the settings first, or you can read on and learn more. Remember that any command you give needs correct filenames to run properly.

Let's look at the -filter_complex option and what we are using it to do.

-filter_complex "[0:v]scale=320:240;[1:a]atempo=1.04"

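The two parts are separated by a semicolon. [0:v] selects the video of the first input file (richard.mp4), and scale=320:240 resizes it to 320x240 pixels. [1:a] selects the audio of the second input file (campgranada.m4a), and atempo=1.04 plays it 4% faster.

Why does 0 mean the first input and 1 mean the second in the above explanation?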

Computers begin counting from the number 0, not the number 1. However, when we count the first in a list, we usually use our first "number one" finger. Please understand the difference in how a computer counts and how you might count, as it will help you if you want to learn how to program your computer.
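
The same filter can be written with explicit labels and -map options, which makes the zero-based input numbering easier to see. This is a sketch under assumptions: the labels [v] and [a], the helper build_filter, and the output name stallman-mapped.webm are chosen for this example and are not part of the original command.

```shell
# Build the filter graph: input 0's video is scaled, input 1's audio is sped up.
# Arguments: width, height, tempo factor.
build_filter() {
    printf '[0:v]scale=%s:%s[v];[1:a]atempo=%s[a]' "$1" "$2" "$3"
}

filter=$(build_filter 320 240 1.04)
echo "$filter"

# Run FFmpeg only when the program and both example inputs are present.
# -n tells FFmpeg not to overwrite an existing output file.
if command -v ffmpeg >/dev/null 2>&1 && [ -f richard.mp4 ] && [ -f campgranada.m4a ]; then
    ffmpeg -n \
        -i richard.mp4 \
        -i campgranada.m4a \
        -filter_complex "$filter" \
        -map "[v]" -map "[a]" \
        -t 1:19.32 \
        -c:v libvpx -b:v 256k -b:a 32k \
        stallman-mapped.webm
fi
```

Swapping the order of the two -i lines would swap the numbers too: the audio file would become input 0 and the video file input 1.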

There are other options; however, this is not a walkthrough, it's a tutorial. The lines with the k numbers (-b:v and -b:a) set the bit rate, and therefore the accuracy, of the video and audio. The -t option, the thing that looks like a time, limits the output to 1 minute 19.32 seconds; it is necessary since the two files are different lengths.
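
The bit rates and the duration together give a rough idea of how large the output will be. The sketch below is only an estimate, assuming the encoder stays close to the requested rates and ignoring container overhead; the function name estimate_bytes is made up for this example.

```shell
# Estimate output size in bytes from the bit rates and the duration.
# Arguments: video bits/s, audio bits/s, duration in hundredths of a second.
estimate_bytes() {
    echo $(( ($1 + $2) / 8 * $3 / 100 ))
}

# 256k video + 32k audio for 1:19.32 (79.32 seconds = 7932 hundredths)
estimate_bytes 256000 32000 7932    # prints: 2855520
```

That is close to the 2951232 bytes that stallman.webm occupies in this tutorial's directory listing.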

$ ls -l

-rw------- 1 drw drw  6722117 Sep 21 23:38  campgranada.m4a
-rw-rw-r-- 1 drw drw 11585571 Sep 21 20:08  richard.mp4
-rw-rw-r-- 1 drw drw  2951232 Sep 22 01:24  stallman.webm

$ file *

campgranada.m4a: ISO Media, Apple iTunes ALAC/AAC-LC (.M4A) Audio
richard.mp4:     ISO Media, MP4 Base Media v1 [IS0 14496-12:2003]
stallman.webm:   WebM
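
The file command only names the container. ffprobe, which is listed in the package description earlier, can show the streams inside a file. This is a minimal sketch; the function name probe_streams is an assumption for this example, and no output is shown because it depends on your files and FFmpeg version.

```shell
# Print the index, codec name, and type of each stream in a media file.
# Returns non-zero if ffprobe or the file is missing.
probe_streams() {
    command -v ffprobe >/dev/null 2>&1 || return 1
    [ -f "$1" ] || return 1
    ffprobe -v error \
        -show_entries stream=index,codec_name,codec_type \
        -of csv=p=0 "$1"
}

probe_streams stallman.webm || echo "ffprobe or stallman.webm not available"
```

The stream indexes it prints start at 0, just like the input numbers in the filter.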

Without further delay, I present to you the result of this tutorial:


stallman.webm: 320x240 pixels, 3 megabytes, 1 minute 19 seconds

What you've accomplished

lol :)

Exercises which remain for the reader include changing the start time of the audio or video (so as to create a mashup using a subset of either or both inputs), and adjusting the bit rate settings to change the output file attributes (e.g. targeting a higher or lower megabyte file size so your output file could fit inside a downloadable application or a game system cartridge). You will find that planning ahead by going beyond what is required to complete a task will allow you to build a toolset that can handle nearly any file that reaches your computer. Disk space is not cheap for all, but if you have some, use it for something useful.
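
As a starting point for both exercises, here is a sketch that works out a bit rate budget for a target file size and uses -ss to start each input at a different time. Everything specific here is an assumption for illustration: the 1000000-byte target, the 10 and 5 second start offsets, the 30 second length, the function name total_bitrate, and the output name excerpt.webm.

```shell
# Total bit rate (bits per second) needed to reach a target size.
# Arguments: target size in bytes, duration in whole seconds.
total_bitrate() {
    echo $(( $1 * 8 / $2 ))
}

budget=$(total_bitrate 1000000 79)   # about 1 megabyte over 79 seconds
audio=32000
video=$(( budget - audio ))          # what is left over for the video
echo "video=$video audio=$audio"     # prints: video=69265 audio=32000

# An -ss placed before an -i seeks within that input only.
if command -v ffmpeg >/dev/null 2>&1 && [ -f richard.mp4 ] && [ -f campgranada.m4a ]; then
    ffmpeg -n \
        -ss 10 -i richard.mp4 \
        -ss 5 -i campgranada.m4a \
        -filter_complex "[0:v]scale=320:240;[1:a]atempo=1.04" \
        -t 30 \
        -c:v libvpx -b:v "$video" -b:a "$audio" \
        excerpt.webm
fi
```

The budget here was computed for 79 seconds, so a 30 second excerpt at these rates will come out well under the target size.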

If you have a different result, or if you would like to share a new option for FFmpeg, I would appreciate hearing from you.

Synchronization for fun: How?!

Did you wonder how it's possible to sync two streams and achieve an interesting result?

For most, this may be a mystery; however, I have explained this process to myself with drum beats and timing.

The first time I can remember thinking this might be a useful ability to develop was many years ago, after finishing the Nintendo video game Super Mario Bros. 3. (Takashi Tezuka, Shigeru Miyamoto, R&D4)

When you finish this game, there is a fun song that plays in the background before you reset it and the curtain opens again. Pressing reset, however, does not play the song, but you can see visually how the music used at the end of the game may have originally been made for the opening introduction.


mario3.webm: 320x240 pixels, 5.7 megabytes, 54 seconds

I like to whistle songs to myself as my grandmother did, so I'm sure that's what I was doing wherever I reset the console that day. Of course, in those days, there was no easy way to rewind after pressing reset to see if the audio matched up, and I had no computer that could produce a video like this. So if there is any question why someone who loves drumming is so knowledgeable in computer games, this may explain some of that.

These were the simple pleasures we had playing video games without other distractions all those years ago: whereas some would aim for high scores or shortcuts through the games, I would find something unique to appreciate and walk away happy. (I like to think of Mario as red Luigi.) The opening to this game teaches you everything you need to know about game design and hardware limitations on this platform, and if there were music present with the demonstration of this small world, I don't know if I would have noticed that lesson.

As gaming was very social back then (I didn't have an NES at home growing up), and this particular section is 25+ years in the making, I hope you enjoy my result!


Contact E-mail: Douglas Winslow <winslowdoug@gmail.com>