STDOUT.puts self.projects: multimedia

March 7, 2011

HTML5 native audio support comparison

Here are the results of a quick test of how various browsers on OS X 10.6 (and an iPhone 3GS) play HTML5 <audio>.

UPDATE 25.5.2011: I updated the data and found that audio support has improved in many browsers. I left the old data in for comparison.

Browser / OS	mp3	wav	ogg
Android 2.2	X	X	X
Chrome 9.0.597	ok	X¹	ok
Chrome 11	ok	X¹	ok
Firefox 3.6.13	X	ok	X²
Firefox 4.0.1	X	ok	ok
iOS 4.2	ok	X	X
Opera 11.1	X	ok³	ok
Safari 5.0.3	ok	X	X
Safari 5.0.5	ok	ok³	X

¹ plays, but is buggy
² hangs on load
³ is able to seek!

As with <video>, you need to encode to several formats to support all browsers. I have no data for IE9. Unfortunately, Android 2.2 cannot play any audio type at all! This is supposed to be fixed in Android 2.3. See this link for more information on HTML5 audio on Android. Also, Opera 11.1 could play ogg files, but I noticed that when they were encoded at higher bitrates, playback systematically stopped in the middle of the track. This seems to be fixed in Opera 11.11. Chrome's bugginess with wav files appears upon seeking: playback stops and cannot be restarted without reloading the page.
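To see what "several formats" means in practice, here is a small Python sketch that searches the table for the smallest sets of formats that give every browser at least one playable file. The dictionary is my own transcription of the newer rows; Chrome's buggy wav playback is counted as unsupported, and Android 2.2 is left out because it plays nothing. The function name is made up for this example.

```python
from itertools import combinations

# Formats marked "ok" in the newer rows of the table.
# Assumption: "plays, but is buggy" (Chrome, wav) counts as not supported.
SUPPORT = {
    "Chrome 11": {"mp3", "ogg"},
    "Firefox 4.0.1": {"wav", "ogg"},
    "iOS 4.2": {"mp3"},
    "Opera 11.1": {"wav", "ogg"},
    "Safari 5.0.5": {"mp3", "wav"},
}

def smallest_format_sets(support):
    """Return all smallest format sets that cover every browser in `support`."""
    formats = sorted(set().union(*support.values()))
    for size in range(1, len(formats) + 1):
        hits = [set(combo) for combo in combinations(formats, size)
                if all(playable & set(combo) for playable in support.values())]
        if hits:
            return hits
    return []

for formats in smallest_format_sets(SUPPORT):
    print(sorted(formats))
```

With this data no single format is enough; mp3 plus either ogg or wav covers all five browsers.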

The tests were conducted by creating a very simple HTML5 document with this content:

      <!DOCTYPE html>
      <html>
      mp3: <audio src="test.mp3" controls></audio>
      wav: <audio src="test.wav" controls></audio>
      ogg: <audio src="02.ogg" controls></audio>
      </html>

The files were served locally from the same directory as the HTML document by an ad-hoc Python web server (on Python 3 the equivalent module is http.server):

      python -m SimpleHTTPServer

February 27, 2011

Audio CD mastering with open source tools

You want to burn an audio CD from a bunch of audio files from various sources, not direct CD rips? You can use open source tools to check the properties of the tracks and prepare them properly to avoid potential pitfalls and ensure a better listening experience.

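The tools used in this memo: sndfile-info (from sndfile-programs), SoX, normalize-audio, wavbreaker, shntool and cuebreakpoints (from cuetools).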

Some of these do have overlapping functionality. With the exception of wavbreaker (needs ALSA/OSS), all of them are available on both OS X and Linux.

Installation on Ubuntu:

  $ sudo apt-get install sndfile-programs sox normalize-audio wavbreaker shntool cuetools

Please note that the following changes to wav files are destructive so be sure to operate on copies of the originals.

Normalize

Check if the tracks need normalization.

  $ sndfile-info *.wav | egrep 'File|Signal Max'

    File : 01.wav
    Signal Max  : 16462 (-5.98 dB)
    File : 02.wav
    Signal Max  : 19946 (-4.31 dB)
    File : 03.wav
    Signal Max  : 24188 (-2.64 dB)
    File : 04.wav
    Signal Max  : 12092 (-8.66 dB)
    File : 05-06.wav
    Signal Max  : 16781 (-5.81 dB)
    File : 07.wav
    Signal Max  : 32768 (0.00 dB)
    File : 08-09.wav
    Signal Max  : 27442 (-1.54 dB)

So there is one track with a really loud peak (07.wav) and a few somewhat quieter tracks. This is what normalize-audio suggests doing (gain column):

  $ normalize-audio -n *.wav

      level        peak         gain
    -12,6922dBFS -5,9791dBFS  0,6922dB   01.wav
    -14,3614dBFS -4,3119dBFS  2,3614dB   02.wav
    -15,2635dBFS -2,6370dBFS  3,2635dB   03.wav
    -22,4749dBFS -8,6588dBFS  10,4749dB  04.wav
    -15,9599dBFS -5,8126dBFS  3,9599dB   05-06.wav
    -13,3062dBFS 0,0000dBFS   1,3062dB   07.wav
    -13,3358dBFS -1,5407dBFS  1,3358dB   08-09.wav

To go with automatic settings:

  $ normalize-audio *.wav

    Applying adjustment of 0,69dB to 01.wav...
    Applying adjustment of 2,36dB to 02.wav...
    Applying adjustment of 3,26dB to 03.wav...
    Applying adjustment of 10,47dB to 04.wav...
    Applying adjustment of 3,96dB to 05-06.wav...
    Applying adjustment of 1,31dB to 07.wav...
    Applying adjustment of 1,34dB to 08-09.wav...

I really didn't like what this did to track 04.wav, so I restored it from the backup and applied a smaller gain manually:

  $ normalize-audio -g 2,5dB 04.wav

    Applying adjustment of 2,500000dB...
     04.wav            100% done, ETA 00:00:00 (batch 100% done, ETA 00:00:00)

See normalize-audio --help for more controls. Normalization does not change the signal-to-noise ratio, but trust your ears and speakers when adjusting volume. The desired peak level is in the range -1,0 .. -6,0 dB. See Wikipedia.
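For reference, the dB figures printed next to "Signal Max" follow from the usual amplitude formula, 20 * log10(peak / full scale). A minimal Python sketch, assuming 16-bit audio with a full scale of 32768 (the value reported for 07.wav); the function names are my own:

```python
import math

FULL_SCALE = 32768  # assumption: largest 16-bit sample magnitude

def peak_dbfs(signal_max, full_scale=FULL_SCALE):
    """Peak level in dB relative to full scale."""
    return 20 * math.log10(signal_max / full_scale)

def gain_factor(gain_db):
    """Linear amplitude multiplier that corresponds to a gain in dB."""
    return 10 ** (gain_db / 20)

print(round(peak_dbfs(16462), 2))  # 01.wav: -5.98
print(round(peak_dbfs(12092), 2))  # 04.wav: -8.66
```

The second function shows why large gains are risky: every 6 dB roughly doubles the sample values, so a quiet track with a few loud peaks can clip quickly.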

Resample to Red Book CD standard

Convert the tracks to CD audio format (stereo, 16 bit, 44100 Hz). First check their current sample rate:

  $ sndfile-info *.wav | egrep 'File| Channels| Sample Rate| Bit Width'

    File : 01.wav
      Channels      : 2
      Sample Rate   : 48000
      Bit Width     : 16
    File : 02.wav
      Channels      : 2
      Sample Rate   : 48000
      Bit Width     : 16
    File : 03.wav
      Channels      : 2
      Sample Rate   : 48000
      Bit Width     : 16
    File : 04.wav
      Channels      : 2
      Sample Rate   : 44100
      Bit Width     : 16
    File : 05-06.wav
      Channels      : 2
      Sample Rate   : 48000
      Bit Width     : 16
    File : 07.wav
      Channels      : 2
      Sample Rate   : 44100
      Bit Width     : 16
    File : 08-09.wav
      Channels      : 2
      Sample Rate   : 44100
      Bit Width     : 16

Here I have a bunch of files with an incompatible sampling rate. SoX can do many things to audio files, including resampling.

  $ sox inputfile.wav -r 44100 -b 16 -c 2 outputfile.wav

I converted the 48 kHz files with this ad-hoc loop:

  $ IFS=$'\n' ;\
    for f in $(\
      sndfile-info *.wav |\
      egrep 'File|^Sample Rate' |\
      grep -B 1 48000 |\
      awk -F' : ' /File/{'print $2'}); do \
        sox "${f}" -r 44100 -b 16 -c 2 "${f}-44k1.wav" && rm -f "${f}" ; done
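The selection step of that loop (finding the files that are not yet at 44100 Hz) can also be sketched in Python with the standard wave module. This is only an illustration with made-up function names; it reads plain PCM WAV headers and leaves the actual conversion to sox.

```python
import glob
import wave

CD_RATE = 44100  # Red Book sample rate in Hz

def needs_resampling(path, target_rate=CD_RATE):
    """True if the WAV file at `path` is not at the target sample rate."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate() != target_rate

def files_to_resample(pattern="*.wav"):
    """Sorted list of WAV files matching `pattern` that still need resampling."""
    return [path for path in sorted(glob.glob(pattern)) if needs_resampling(path)]
```

Calling files_to_resample() in the working directory returns the names that would then be passed to sox one by one.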

Eliminate sector boundary errors (SBE)

To make sure there won't be any clicks between tracks when the audio is continuous over a track change (a segue), you need to ensure the proper length of each track. On disc, tracks begin at full sectors, and a single sector holds 1/75 s of audio. If the wav of the previous track is slightly too short, the CD burner will insert some silence where audio is missing to fill out the sector. This is enough to cause an annoying click in some CD players.
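The sector arithmetic is easy to work out. A short Python sketch, assuming Red Book audio (44100 Hz, stereo, 16 bit); the function names are only for illustration:

```python
SAMPLE_RATE = 44100      # Hz
CHANNELS = 2             # stereo
BYTES_PER_SAMPLE = 2     # 16 bit
SECTORS_PER_SECOND = 75

def sector_bytes():
    """Bytes of audio data in one CD sector."""
    return SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE // SECTORS_PER_SECOND

def padding_ms(missing_bytes):
    """Duration in milliseconds of the silence that fills `missing_bytes`."""
    bytes_per_second = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE
    return 1000.0 * missing_bytes / bytes_per_second

print(sector_bytes())               # 2352
print(round(padding_ms(1176), 2))   # half a sector: 6.67
```

So a track must contain a multiple of 2352 bytes of audio, and even the worst case adds only about 13 ms of silence, which is still audible in the middle of continuous music.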

shntool is a great tool for extracting information and manipulating audio files. I will not give an example here, but see the output from shntool info *.wav to examine the properties of your wave files.

Examine the output of the shntool len command to discover problems:

  $ shntool len *.wav

     length     expanded size    cdr  WAVE problems  fmt   ratio  filename
      3:43.65       39490508 B   -b-   --   -----    wav  1.0000  01.wav-44k1.wav
      5:40.02       59979756 B   -b-   --   -----    wav  1.0000  02.wav-44k1.wav
      4:26.66       47076708 B   -b-   --   -----    wav  1.0000  03.wav-44k1.wav
      2:22.08       25067660 B   ---   --   -----    wav  1.0000  04.wav
      5:53.48       62382468 B   -b-   --   -----    wav  1.0000  05-06.wav-44k1.wav
      4:36.65       48838628 B   -b-   --   -----    wav  1.0000  07.wav
      7:39.68       81128396 B   -b-   --   -----    wav  1.0000  08-09.wav
     34:23.22      363964124 B                            1.0000  (7 files)

The b in the cdr column indicates an SBE. A hyphen in its place indicates that everything is OK.
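The length and size columns can be cross-checked against each other. The sketch below reads the length as minutes:seconds.sectors (75 sectors per second) and assumes a plain 44-byte WAV header; both are my interpretation, although they do reproduce the size listed for 04.wav.

```python
SECTOR_BYTES = 2352   # one CD sector of 16-bit stereo audio at 44100 Hz
HEADER_BYTES = 44     # assumption: plain WAV header without extra chunks

def length_to_sectors(length):
    """Convert an 'm:ss.ff' length (ff = sectors, 75 per second) to sectors."""
    minutes, rest = length.split(":")
    seconds, sectors = rest.split(".")
    return (int(minutes) * 60 + int(seconds)) * 75 + int(sectors)

def expected_file_size(length):
    """File size in bytes of a WAV that holds exactly `length` of CD audio."""
    return length_to_sectors(length) * SECTOR_BYTES + HEADER_BYTES

print(expected_file_size("2:22.08"))  # 04.wav: 25067660
```

For 01.wav-44k1.wav (3:43.65) the same calculation gives 39490124 bytes, which is 384 bytes less than the listed size. Those 384 bytes are the partial sector that triggers the b flag.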

I wrote a small script called sbeok to check wav files for SBEs. You can use either the script or shntool.

  $ sbeok *.wav

    01.wav-44k1.wav: FAILED, missing 1968 bytes
    02.wav-44k1.wav: FAILED, missing 992 bytes
    03.wav-44k1.wav: FAILED, missing 968 bytes
    04.wav: OK
    05-06.wav-44k1.wav: FAILED, missing 2024 bytes
    07.wav: FAILED, missing 696 bytes
    08-09.wav: FAILED, missing 1536 bytes

It seems that every track except 04.wav has an improper length, so a small amount of silence would be added to the end of these tracks while burning the CD. Additionally, two files, 05-06.wav and 08-09.wav, still need to be cut, and they have music during the segue.
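The sbeok script is not reproduced in this memo. A minimal Python sketch of the same kind of check, assuming a 44-byte WAV header (an assumption that matches the shntool sizes and the missing byte counts above), might look like this:

```python
import os

SECTOR_BYTES = 2352   # 1/75 s of 16-bit stereo audio at 44100 Hz
HEADER_BYTES = 44     # assumption: plain WAV header without extra chunks

def missing_bytes(file_size, header=HEADER_BYTES):
    """Bytes of padding needed to fill the last sector (0 means no SBE)."""
    remainder = (file_size - header) % SECTOR_BYTES
    return (SECTOR_BYTES - remainder) % SECTOR_BYTES

def report(path):
    """One result line for the file at `path`, in the style of the output above."""
    missing = missing_bytes(os.path.getsize(path))
    if missing == 0:
        return f"{path}: OK"
    return f"{path}: FAILED, missing {missing} bytes"

print(missing_bytes(39490508))  # size of 01.wav-44k1.wav: 1968
print(missing_bytes(25067660))  # size of 04.wav: 0
```

A real script should read the size of the data chunk from the WAV header instead of assuming 44 bytes, since files with extra metadata chunks would otherwise be misjudged.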

The brute-force approach to fixing the sector boundaries is to merge all wavs into one and insert the track markers manually. This has the benefit that you don't need to cut tracks in earlier steps of your workflow, but can leave it until last.

Merge all tracks together with shnjoin.

  $ shnjoin *.wav

    Joining [01.wav-44k1.wav] (3:43.65) --> [joined.wav] (34:23.22) : 100% OK
    Joining [02.wav-44k1.wav] (5:40.02) --> [joined.wav] (34:23.22) : 100% OK
    Joining [03.wav-44k1.wav] (4:26.66) --> [joined.wav] (34:23.22) : 100% OK
    Joining [04.wav] (2:22.08) --> [joined.wav] (34:23.22) : 100% OK
    Joining [05-06.wav-44k1.wav] (5:53.48) --> [joined.wav] (34:23.22) : 100% OK
    Joining [07.wav] (4:36.65) --> [joined.wav] (34:23.22) : 100% OK
    Joining [08-09.wav] (7:39.68) --> [joined.wav] (34:23.22) : 100% OK
    Post-padded output file with 1128 zero-bytes.

Use wavbreaker to set track split markers.

  $ wavbreaker joined.wav

This opens up a friendly GUI for inserting your track marks. Seek to the proper position in the topmost waveform view, and adjust the exact cut point in the second-to-top view. Click "Add" to enter a marker; the interface is very easy to use. You could split the tracks from within wavbreaker, but once you have the marker file, it is possible to script the whole process. Choose "Export to TOC", save to the file master.toc and exit the program.

Use shnsplit with cuebreakpoints to split the files based on the TOC. Write the master CD cast to directory "master":

  $ mkdir master
  $ cuebreakpoints -i toc master.toc | shnsplit -t %n -d master joined.wav

    Splitting [joined.wav] (34:23.22) --> [master/01.wav] (3:44.44) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/02.wav] (5:40.54) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/03.wav] (4:25.48) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/04.wav] (2:23.67) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/05.wav] (3:23.38) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/06.wav] (2:28.05) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/07.wav] (4:36.48) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/08.wav] (1:18.24) : 100% OK
    Splitting [joined.wav] (34:23.22) --> [master/09.wav] (6:21.69) : 100% OK

How about those sector sizes now?

  $ sbeok master/*.wav

    master/01.wav: OK
    master/02.wav: OK
    master/03.wav: OK
    master/04.wav: OK
    master/05.wav: OK
    master/06.wav: OK
    master/07.wav: OK
    master/08.wav: OK
    master/09.wav: OK

  $ shntool len master/*.wav

    length     expanded size    cdr  WAVE problems  fmt   ratio  filename
     3:44.44       39617132 B   ---   --   -----    wav  1.0000  master/01.wav
     5:40.22       60027788 B   ---   --   -----    wav  1.0000  master/02.wav
     4:26.05       46934204 B   ---   --   -----    wav  1.0000  master/03.wav
     2:23.67       25382828 B   ---   --   -----    wav  1.0000  master/04.wav
     3:21.15       35491724 B   ---   --   -----    wav  1.0000  master/05.wav
     2:30.28       26525900 B   ---   --   -----    wav  1.0000  master/06.wav
     4:36.48       48799340 B   ---   --   -----    wav  1.0000  master/07.wav
     1:18.24       13815692 B   ---   --   -----    wav  1.0000  master/08.wav
     6:21.69       67370732 B   ---   --   -----    wav  1.0000  master/09.wav
    34:23.22      363965340 B                            1.0000  (9 files)

Looks good. How are the volume levels?

  $ sndfile-info master/*.wav | egrep 'File|Signal Max'

    File : master/01.wav
    Signal Max  : 22620 (-3.22 dB)
    File : master/02.wav
    Signal Max  : 28166 (-1.31 dB)
    File : master/03.wav
    Signal Max  : 29384 (-0.95 dB)
    File : master/04.wav
    Signal Max  : 16501 (-5.96 dB)
    File : master/05.wav
    Signal Max  : 26729 (-1.77 dB)
    File : master/06.wav
    Signal Max  : 20980 (-3.87 dB)
    File : master/07.wav
    Signal Max  : 29205 (-1.00 dB)
    File : master/08.wav
    Signal Max  : 17139 (-5.63 dB)
    File : master/09.wav
    Signal Max  : 27442 (-1.54 dB)

All of them are within the acceptable range, and the volume is perceived to be sensible across the tracks during playback. The perceived volume depends on the dynamic range of the audio, along with the equipment and the listener as well.

Burn to CD

Since you're already on the command line, on Linux you can use cdrecord to burn the CD.

  $ cdrecord -v speed=1 dev=0,0,0 -dao -audio master/*.wav

April 2, 2009

Streaming from webcam

A little while ago I purchased a cheap webcam: a Logitech QuickCam E3500. It works in Linux with the in-kernel 'uvcvideo' driver (plug-and-play), and I was interested in tinkering with a live video stream. It is very simple to view (a) or record (b) a video stream:

(a)  mplayer tv://
(b)  mencoder -quiet tv:// -tv noaudio -ovc lavc -o "webcam-$(date "+%Y%m%d %H:%M").avi"

Streaming live video to the internet is a bit more complicated. At first I wanted to use FlowPlayer, a neat Flash program that has elaborate JavaScript controls, to view the stream over RTMP, which is a proprietary protocol. Setting up an RTMP server, on the other hand, seemed impossible in the time that I was willing to invest. There seemed to be only a few options: the Apple Darwin Streaming Server and Red5. In short, Darwin did not compile and Red5 was way too much work to configure. Away with RTMP; behold, VLC to the rescue!

vlc -I dummy --no-audio --no-sout-audio v4l2:///dev/video0:width=320:height=240 \
    --sout='#transcode{venc=ffmpeg,vcodec=x264,vb=256,vt=128}:std{access=mmsh,mux=asfh,dst=:8080/stream.asf}'

MMS, WTF?! A deprecated proprietary protocol and the ASF container! Oh well, this was the first (and worst) simple streaming method I came across, and I thought it would be worth a try. This seems to be a popular way of doing HTML-embedded video. My internet connection is slow, so I opted for the x264 codec, which, by the way, is not supported by IE or WMP. That kind of cancels out the theoretical advantage of MMS, namely that it works for the regular Windows user. With Firefox + the VLC Mozilla plugin this actually seems to work. IE users can display the stream in VLC.

<OBJECT ID="MediaPlayer" WIDTH="640" HEIGHT="480"
    CLASSID="CLSID:22D6f312-B0F6-11D0-94AB-0080C74C7E95"
    STANDBY="Loading Windows Media Player components..."
    TYPE="application/x-oleobject"
    CODEBASE="http://activex.microsoft.com/activex/controls/mplayer/en/nsmp2inf.cab#Version=6,4,7,1112">
    <PARAM name="autoStart" value="True">
    <PARAM name='showControls' value="False">
    <PARAM name="filename" value="mms://:8080/stream.asf">
    <EMBED TYPE="application/x-mplayer2"
        SRC="mmsh://:8080/stream.asf"
        NAME="MediaPlayer"
        WIDTH=640
        HEIGHT=480>
    </EMBED>
  </OBJECT>

March 25, 2009

Hacking MFserver

The Maximum T-8000 PVR is an interesting device; it is a consumer-grade product and comes with a custom Linux OS. Marusys keeps the server's source code proprietary (which I believe they can, as it uses libupnp, which is licensed under BSD), but they offer it for download. The server (version 0.0.1) needed some patching to compile on GCC 4. The fixes were relatively simple. The major problem was that VLC-0.9 no longer accepts EXTVLCOPTS in playlists, and so far MFserver had relied on this. Fortunately VLC has a powerful way to chain up the needed transcode and networking modules from a single command, so I got it working by defining a few class variables and using sprintf() to construct the VLC command.

This means I can open the MFclient on the television screen, browse the server's video collection with the remote, and select a title to play on the TV. The PVR receives an MPEG2 TS stream, such as in DVB broadcasts. I had a couple of test videos:

a) DVB recordings from the PVR itself (ts) and from Kaffeine (m2t)
b) Youtube videos (flv)
c) YLE videos (wmv)
d) DVD container units (?) (vob)
e) random AVIs from the internet

Videos that were recorded on the PVR, transferred to a PC, and streamed back without transcoding gave, as expected, the best result, with almost no skipping in audio or video. There should be no loss in video or audio quality. The Youtube videos generally seem to work quite reasonably. Some videos have low bitrates, and it is obvious on the TV screen. Some of the random AVIs pulled from the internet worked very well, others did not work at all. I haven't yet got any picture from the Kaffeine recordings.

After some experimenting I figured out that the DVB recordings should not be transcoded. FFmpeg does seem to do "the wrong thing" when it is asked to transcode from MPEG2 to MPEG2, and actually does demux and re-encode! The transcode options should be set by MFserver and be based on the video file suffix. In Ruby this would be as simple as:

transcode = 
    case filename[/\.([^\.]+)$/,1]
    when 'ts'
      ''
    else
      'transcode{vcodec=mp2v,acodec=mpga,.....'
    end

However, MFserver is written in C (although the file suffix is .cpp). I haven't touched C in years. Since then I have learned to do some programming in Java, Bash, Perl, Ruby, JavaScript and Python (in that chronological order). These are high-level, (mostly) object-oriented languages. Now I had to face char[], char* and the paradigm shift from higher-level abstractions to byte-level processing of an array, trying to remember how pointers work, what happens when I return a local variable, and so on. There was an option to switch to C++ and use the String class without including any additional headers, but I was intrigued by solving this in C. After three hours of intense reading and experimenting with a simple testing script, I had a function that parsed the suffix from the char* filename.

#include <cstring>   // strlen, strcpy

// Replaces filename in place with its suffix (the text after the last dot).
void VLCMgr::parseSuffix(char *filename)
{
       char _suffix[5] = "";               // room for 4 characters + '\0'
       int len = strlen(filename);
       int i=0;
       int seek = (len > 6) ? len-6 : 0;   // scan only the last 6 characters
       int dotreached=0;
       // get suffix
       for ( ; seek < len ; seek++) {
               if (filename[seek]=='.')
               {
                       // start over at every dot, so only the last suffix is kept
                       dotreached=1;
                       i=0;
               }
               else if ((dotreached==1) && i<4)
               {
                       _suffix[i] = filename[seek];
                       i++;
               }
       }
       _suffix[i]='\0'; // mark the end of suffix string
       strcpy(filename,_suffix); // the suffix is never longer than filename
}

That's a lot of work just to get such a simple task done. Remember the Ruby equivalent was

filename[/\.([^\.]+)$/,1]

It makes me even more amazed how people can program whole operating systems and desktop environments with this language. I also dug up example sources of socket communication and merged them in to send remote control commands to VLC's remote control interface, which was opened on a TCP socket. This proved to be somewhat tricky, but now the commands pause, faster and slower are sent to the VLC socket. However, the speed-up function does not seem to work while transcoding. This is an ongoing project and I'll post updates and the patches later, when I have a reasonable set of functionality finished.
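For comparison with the C socket code (which is not shown here), this is roughly what sending those commands looks like in Python. It is only a sketch: it assumes VLC's rc interface is listening on a TCP port (for example, started with the --rc-host option), and the host, port and function name are placeholders of mine, not values taken from MFserver.

```python
import socket

def send_rc_commands(host, port, commands, timeout=2.0):
    """Send newline-terminated text commands to a TCP remote control socket."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        for command in commands:
            conn.sendall(command.encode("ascii") + b"\n")

# Hypothetical usage; host and port depend on how VLC was started:
# send_rc_commands("localhost", 4212, ["pause", "faster", "slower"])
```

The function does not read VLC's replies; a more complete client would also receive and check the response after each command.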