A Rip Storing Time

Adventures in the world of parallel DVD ripping and encoding

Published on 07 December 2016

From a youth of misspent money, I have a moderately large DVD collection. Some 600 movies in half a dozen DVD racks take up a disproportionately large amount of space in my dining room. This year, my partner and I decided to host Xmas dinner for the family, meaning a dining table that could seat a dozen was in order and... well, you can see where this is heading. The DVDs had to go.

Now, even though I barely watch them anymore, there are still a couple of amazing films in my collection that aren't available from the (far too changeable) online streaming services and, as such, I was reluctant to simply banish them all to the attic. Instead, I decided to undertake the monumental task of ripping them all to HDD before boxing them all up.

Software

There are innumerable software solutions for ripping and transcoding DVDs... if you want to do one at a time. As you can probably appreciate, this wasn't a possibility for me or I'd still be ripping discs well into the New Year (not to mention through the Xmas dinner I'm actually doing this for). So a more 'roll-your-own' solution was required.

A bit of googling revealed this post, which described a mechanism for ripping discs as soon as you put them into the drive. While it was still targeting people who wanted to rip discs one at a time, it did point me towards the two pieces of software I ultimately used:

  1. MakeMKV - for ripping the entire contents of the DVD, and
  2. HandBrake - for transcoding source files into compressed H.264 format.

Both these utilities come with command-line interfaces (details here and here respectively), allowing them to be automated. Furthermore, MakeMKV can run multiple instances, allowing you to do your drive ripping (the long, slow process) in parallel from multiple drives.
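To give a flavour of what automating the two tools looks like, here is a minimal Python sketch that only builds the command lines. It is not taken from DriveRipper. The executable names (makemkvcon, HandBrakeCLI) depend on how the tools are installed, and the helper names, the x264 encoder choice and the quality value of 20 are assumptions made for illustration.

```python
from pathlib import Path


def makemkv_rip_command(disc_index: int, dest_dir: Path) -> list[str]:
    """Build a makemkvcon call that rips every title on one disc to MKV files."""
    # "-r" asks makemkvcon for machine-readable ("robot") output;
    # "all" selects every title on the disc.
    return ["makemkvcon", "-r", "mkv", f"disc:{disc_index}", "all", str(dest_dir)]


def handbrake_transcode_command(source: Path, target: Path, quality: int = 20) -> list[str]:
    """Build a HandBrakeCLI call that transcodes one file to H.264."""
    # "-e x264" picks the H.264 encoder; "-q" is constant quality.
    # The value 20 is an assumed example, not a recommendation from the post.
    return [
        "HandBrakeCLI",
        "-i", str(source),
        "-o", str(target),
        "-e", "x264",
        "-q", str(quality),
    ]
```

Either list can be handed to subprocess.run(command, check=True). Because each rip is a separate process pointed at a different disc index, several can run side by side, one per drive.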

Hardware

I'd been intending to buy a new server for my home to be used as a Docker host because my current server is so old it simply doesn't support the required virtualization instructions. I'd had my eye on the Dell T20 Xeon E3 but was waiting for it to drop back below the £200 (after cashback) mark. However, realising I could (temporarily) use it as a DVD ripping machine, I decided to bite the bullet and bought the T20 with an additional 4GB of RAM for £324 (less 2% Quidco and £80 cashback).

It arrived a couple of days later and I must say I'm impressed. It's a lot of machine for the money, well built, has a copious number of USB sockets and is surprisingly fast. I installed Windows Server 2016 on this as it's the OS I intend to use for the Docker host and I thought it'd be a good practice run.

To this I added the following:

  1. 2 x 1TB internal HDDs - configured as RAID-0, providing a fast destination for the drive rips.
  2. 4 x USB DVD-ROM drives - drives for ripping from - a couple I already had plus a couple from Amazon.
  3. 1 x 4TB external USB HDD - destination for transcoded rips.

All in, it's quite a monster.

Putting it together

Unlike most of the out-of-the-box software I'd found, I needed software that would do the following:

  1. Wait for a disc to be inserted in one of the drives
  2. Determine that the disc is actually a DVD video
  3. Rip the disc to a destination folder based on the volume label of the DVD
  4. Allow up to four concurrent rip operations
  5. Queue ripped folders for transcoding allowing only a single transcoding operation to run at a time.
  6. Save the transcoded movie to the external drive.

While there may be software out there that does this, a couple of evenings' googling didn't reveal it, so I decided a DIY job was in order. Besides, I thought the multiple producer/single consumer nature of the multiple rips/single transcoding would be a great fit for playing with Dataflow.
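The multiple producer/single consumer shape can be sketched without Dataflow at all. The Python below is an illustration of the pattern rather than DriveRipper's code: run_pipeline, rip and transcode are hypothetical names, and the two callables stand in for whatever actually drives the ripping and transcoding tools. Up to four rips run at once and feed a queue that a single worker drains.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_RIPS = 4  # one per DVD drive


def run_pipeline(labels, rip, transcode):
    """Rip up to four discs at once; transcode the results one at a time."""
    ripped = queue.Queue()   # hand-off between the rippers and the transcoder
    finished = []
    done = object()          # sentinel telling the consumer to stop

    def transcode_worker():
        # The single consumer: only one transcode ever runs at a time.
        while True:
            item = ripped.get()
            if item is done:
                break
            finished.append(transcode(item))

    def rip_and_enqueue(label):
        # A producer: rip one disc, then queue the result for transcoding.
        ripped.put(rip(label))

    consumer = threading.Thread(target=transcode_worker)
    consumer.start()
    try:
        with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_RIPS) as rippers:
            # list() drains the iterator so a failed rip raises here.
            list(rippers.map(rip_and_enqueue, labels))
    finally:
        ripped.put(done)     # always release the consumer, even after an error
        consumer.join()
    return finished
```

One deliberate difference from the behaviour described for Dataflow in this post: this sketch queues discs for transcoding in the order their rips finish, not the order they were started, so one slow drive does not hold up the others.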

In relatively short order, I created DriveRipper... and it's worked pretty well. I've been happily ripping four DVDs at a time, averaging around 12-15 DVDs an hour, with only a few small problems:

  1. Because Dataflow enforces the order of tasks throughout the pipeline (one of its biggest strengths!), a slow rip or drive can hold up transcoding such that a queue builds up. Not a massive problem, as the encoding process on the Xeon is actually pretty rapid (averaging 450fps), so it catches up with the relatively slow ripping process (about 20 minutes per disc) with ease.
  2. Some DVDs - particularly the old ones - are single sided or low quality or lack additional material or all of the above. This means that the contents of the disc are smaller than the 4.2GB cut-off I used for determining that a particular disc is likely to be a DVD. This can be sorted with a simple code change but I decided to put these discs to one side for now and come back to them.
  3. Some DVDs - again, particularly the old ones - don't have a rational volume name, using terms like 'e19245' or, even worse, 'DVDVideo'. This has meant that, after the transcoding is complete, I need to rename the folders so that the contents don't get overwritten by a subsequent rip.

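The last two problems lend themselves to small code changes. The Python sketch below shows one possible approach, not what DriveRipper does: instead of a size cut-off it checks for the VIDEO_TS folder that DVD-Video discs carry, and it claims a fresh destination folder by adding a numeric suffix when a label such as 'DVDVideo' has already been used. Both function names and the "UNKNOWN" fallback are made up for this example, and the file check assumes the disc is mounted with upper-case names (or on a case-insensitive file system, as on Windows).

```python
from pathlib import Path


def looks_like_dvd_video(disc_root: Path) -> bool:
    """Treat a disc as DVD-Video if VIDEO_TS/VIDEO_TS.IFO exists on it."""
    return (disc_root / "VIDEO_TS" / "VIDEO_TS.IFO").is_file()


def claim_destination(parent: Path, volume_label: str) -> Path:
    """Create and return a new folder for this rip, never reusing an old one."""
    label = volume_label.strip() or "UNKNOWN"  # assumed fallback for blank labels
    suffix = 1
    while True:
        name = label if suffix == 1 else f"{label}_{suffix}"
        candidate = parent / name
        try:
            # mkdir fails if the folder already exists, so two concurrent
            # rips with the same label cannot both claim the same folder.
            candidate.mkdir(parents=True)
            return candidate
        except FileExistsError:
            suffix += 1
```

With this in place, a second disc labelled 'DVDVideo' would land in 'DVDVideo_2' rather than overwriting the first, although the folders would still need sensible names afterwards.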
So far I'm 150 DVDs through and have just been told that the new dining room furniture is being delivered on Monday.

Rip little machine, rip like wind!! Oh, wait...