Merlin’s weekly podcast with Dan Benjamin. We talk about creativity, independence, and making things you love.
“What’s 43 Folders?”
43Folders.com is Merlin Mann’s website about finding the time and attention to do your best creative work.
Paperless on a budget
Carl Ranson | Feb 21 2008
I've been thinking about getting rid of some old paper files and want to get them onto a PC for the lowest cost possible. I wanted to kick this idea around a bit and see if it fights back...

The problem: I have a cheap scanner, and my budget doesn't stretch to a nice automated scanner like the ScanSnap.

The solution: scan the pages as images and OCR them. OCR isn't perfect, as you all know, but as long as it produces usable search text I don't care too much.

The clever bit (I think) is that I want to store them as MHT files containing both the scanned page images and the OCR'd text. MHT is the multipart MIME format that IE uses to save a complete web page as a single file. The text will make the file searchable via Google Desktop or whatever, but the original image will still be there. I don't need to pull the documents into other systems or anything, just to be able to retrieve them if necessary.

Sure, the files will be bigger than a PDF would have been, but who really cares these days with storage so cheap?

If storing them like this is viable, I'd write a small app that acts as a front end to scan, OCR, and compile the files.

Thoughts? Problems and pitfalls?

Cheers,
S8
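For anyone who wants to try the "compile" step, here is a minimal sketch in Python, using only the standard library's `email` package. It assumes the scanning and OCR have already happened elsewhere and you are holding the image bytes and the recognized text for each page. The function name `build_mht`, the `page1`-style content IDs, and the choice of quoted-printable for the HTML part are all assumptions made for illustration, not anything MHT requires.

```python
from email import charset
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import html


def build_mht(pages, title="Scanned document"):
    """Build MHT (multipart/related) bytes from scanned pages.

    pages: list of (image_bytes, image_subtype, ocr_text) tuples,
           e.g. (png_bytes, "png", "text from the OCR step").
    build_mht and the content IDs used here are hypothetical names.
    """
    root = MIMEMultipart("related", type="text/html")
    root["Subject"] = title

    body = []
    image_parts = []
    for i, (img, subtype, text) in enumerate(pages, start=1):
        cid = "page%d" % i
        # The HTML shows the page image, followed by the OCR text.
        body.append('<img src="cid:%s" alt="page %d">' % (cid, i))
        body.append("<pre>%s</pre>" % html.escape(text))
        part = MIMEImage(img, _subtype=subtype)  # base64-encoded by default
        part["Content-ID"] = "<%s>" % cid
        image_parts.append(part)

    doc = "<html><head><title>%s</title></head><body>%s</body></html>" % (
        html.escape(title),
        "\n".join(body),
    )

    # Assumption: quoted-printable keeps most of the text readable in the
    # raw file, unlike the base64 that MIMEText picks for UTF-8 by default.
    qp_utf8 = charset.Charset("utf-8")
    qp_utf8.body_encoding = charset.QP
    root.attach(MIMEText(doc, "html", qp_utf8))

    for part in image_parts:
        root.attach(part)
    return root.as_bytes()
```

Writing the result out would look something like `open("invoice.mht", "wb").write(build_mht(pages))`. Two things are worth testing before committing a whole filing cabinet to this: whether your desktop search tool actually indexes the text inside an MHT file, and whether your browser displays the `cid:` image references as expected.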