Earlier this year I asked for suggestions on Google+ about dealing with increasingly large image collections. In our house, we have two DSLRs, four phones that take pictures, and two point-and-shoot cameras. The images from these are scattered across several hard drives and online backup accounts; over the past several years they’ve been inconsistently backed up. We have a network-attached storage device that houses all images, but due to poor backup processes in the past, we have several cases of duplicate images.
Adding to the complexity, we paid Scandigital to scan years of print photos – everything from our honeymoon to our first cross-country drive to our first house. This added several thousand images to our archive – a good thing, to be sure, as we now had electronic copies of pictures we hadn’t looked at in years. But the challenge of managing those images – now numbering close to 50,000 – was becoming insurmountable.
I hadn’t gotten around to actually implementing a solution – we had a busy summer and I wasn’t convinced I really wanted to tackle this. Then my son had a school assignment last week requiring him to find a dozen pictures from his childhood to share with classmates… and actually finding pictures for him was a nightmare. After more than an hour of poking through our archive, we hadn’t found more than five he was happy with. I was frustrated, he was annoyed, and it became clear this wasn’t sustainable. It was time to dive in.
The solution I more or less settled on was what I documented this summer: dedicate one desktop computer to organizing the image catalog. This past weekend, I picked up a computer at Best Buy, and I’m already happy with the progress (though I expect it’ll be months before I feel like I’m done). Here’s what I did:
- Bought a Gateway desktop with a dual-core Intel processor and a 1TB hard drive. Total cost? $350. (Twenty-years-ago me will stare at that line for a long, long time. It’s OK, me-from-the-past; computers are commodities but gas is now $5/gallon.) It features an HDMI port, so I parked the computer behind the television in our family room and plugged it into one of the TV’s available HDMI connections; when I want to display photo albums easily, I can just pull them up on the TV. (Note: when I first connected the computer to the TV, the computer’s display extended beyond the boundaries of the screen. This blog post helped me figure out the problem: I had to adjust the TV’s settings to stop zooming in; once I did that, I was all set.)
- Added a Logitech wireless keyboard so I could operate the computer from the couch; it includes an integrated trackpad, and so far I’m pretty happy with it.
- Copied over all of the images from the NAS drive to the PC. Installed Picasa, and let it find all of the pictures. All told, there are slightly over 50,000 images taking up 200 gigs of disk space (I think there might be more, actually, but I haven’t finished confirming that everything made it over yet). Thanks to the fast processor, indexing these images took Picasa just a few minutes; the last time I tried this, on a laptop, it took hours and never completed. Hardware matters!
This is where it started getting magical: after just a couple of hours, Picasa had found thousands of faces across our images, and grouped them very accurately. All of a sudden, I could see photos of my six-year-old daughter, from her birth to this past summer vacation. There’s my twelve-year-old son – at his third birthday party, on his first day of kindergarten, leaving for his first overnight Scout camp – in one place. And my ten-year-old son – the day he was born, his first airplane ride, the day he learned to ride a two-wheeler. It wasn’t just the kids: my wife and I are there too, as are the grandparents (including my grandparents, both of whom have died), extended family, and friends.
Like I said, I’m nowhere near done. This is a solid foundation, but I have a long way to go. Here’s what I think I need to do to get this under control:
- De-dupe the catalog. Picasa has a nice “show duplicates” feature, but since it shows both copies of the picture that’s duplicated, removing the dupes while leaving one copy is a time-consuming affair. This article from Digital Inspiration looks like it’ll help; according to Picasa I have more than 4,000 duplicates.
- Confirm I have all the pictures. I haven’t done a full audit of where all of the family’s pictures are hiding; in my Picasa account, in my wife’s, on the kids’ SD cards, etc.
- Simplify synchronization from those sources. Once I have all of the images, the next step is to ensure that going forward the new images will get included in the master Picasa collection. Crashplan on the Mac will likely satisfy this for both my wife and me; I’m looking into solutions for Android (Dropbox with its instant-upload option may be a good go-between here, though I haven’t started looking at how best to do this across several devices).
- Install VNC on the photo server. While I’m able to operate the computer from the couch, that’s not the most useful way to do actual work. It’s great for lean-back viewing of the pictures, but doing lots of manipulation can get tedious. I’m going to install VNC so that I can access the computer from my laptop when I’m at home, which should make it easier to do the heavy lifting when needed.
- Turn on cloud sync. I’ve got a lot of unused disk space on my Google Drive account, so once I have the local catalog in a good place, I’m going to enable Picasa’s cloud synchronization, which will give me not only a reliable backup of all images but also an easy way to share them. For the most part that means sharing with my wife, but I’ll probably also share with family who may like the ability to browse through all of our images.
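
For the de-dupe step, one option outside Picasa is a small script that groups files with identical contents. What follows is only a rough sketch, not how Picasa or the Digital Inspiration article does it: it assumes the whole archive sits under a single folder, the function names (`file_digest`, `find_duplicates`) and the extension list are made up for illustration, and it only catches byte-for-byte copies, so resized or re-saved versions of a picture will slip through. It reports duplicates and deletes nothing.

```python
import hashlib
import sys
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; adjust to match the archive.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".bmp"}


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of groups; each group is a sorted list of identical files."""
    # First pass: bucket by file size, since files of different sizes
    # cannot be identical. This avoids hashing most of the archive.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print each group: the first path is the one to keep, the rest are copies.
    for group in find_duplicates(sys.argv[1]):
        print("keep:", group[0])
        for extra in group[1:]:
            print("  duplicate:", extra)
```

Running it with the photo folder as the only argument prints each set of identical files, which makes it easier to review the copies before removing anything by hand.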