Well, semi-automated scanning, to be fair. I got to thinking that if anything happened to my filing cabinet I wouldn’t have a record of anything, so it was time to nerd out and find a way to do this in Linux.
To jump right in, here’s a download link to scandoc. It’s just a simple bash script. It’s pretty well commented but I’ll explain things a bit here.
#!/bin/bash
# map the first argument from the command line
# to $file for easier reading
file=$1;
# N is just a counter for the file name
n=1;
# Wait for input, any input works but enter is a pretty common key to
# hit
read -n 1 -p "Hit enter to scan.";
for i in `seq 1 1000`; do
# Scan images.
# format for output file is "filename"+"pg"+"page number"
# this can be formatted as you will.
echo Scanning $1pg$n.pnm;
# Run scanimage itself. Plustek backend, access with libusb, bus 1,
# device 14. "Auto" does not work with libusb specified, semi
# automatic detection could probably be done. If you only have one
# scanner just call scanimage without -d specifying the device, it'll
# find it. Resolution of 150dpi, full platen scan area (change x+y for
# smaller scan areas), the lide 25/plustek keeps the calibration data in
# cache so this speeds things up in batch scanning, default for this
# scanner is color.
scanimage -p -d plustek:libusb:001:014 --resolution 150 -x 215 -y 297 --calibration-cache=yes > "${file}pg${n}.pnm";
# Nothing fancy needs to be done post scan with this scanner.
echo Converting to pdf;
convert "${file}pg${n}.pnm" "${file}pg${n}.pdf";
echo Next;
read -n 1 -p "Hit enter to scan.";
let n=$n+1; #increase n so the page number goes up.
# Second scanner, if you don’t have one just delete all this, though it
# does give some examples on post scan processing.
echo Scanning $1pg$n;
# Genesys is the devil! This is the only reason we're scanning at
# 150dpi. 300dpi grayscale doesn't always work, 300dpi color always craps
# out the backend. 600dpi color is overkill for simple document scanning.
# 150dpi color is "enough" for a clear image with the necessary color
# definition. scanimage --help --device-name=Name will tell you the
# details of what the scanner/backend can do. Genesys appears incapable
# of doing 200dpi at all. It also fails quietly, all you'll get on the
# next scan is "sane: end of file". Yeah, that's helpful. No caching
# of calibration data, so even though the lide 35 is faster at scanning,
# it takes longer to start. Slightly faster in the end on usb1. Lide35
# is a USB2 scanner btw.
sudo scanimage -p -d genesys:libusb:001:015 --mode Color --resolution 150 -x 218 -y 299 > "${file}pg${n}.pnm";
echo Converting to pdf;
convert "${file}pg${n}.pnm" "${file}pg${n}.pdf";
# This scanner is dark, convert pdf and pnm into a better image.
# You can do any post scan manipulation here, or separately from another
# script. I do it now so I don't have to come back and guess which files
# need to be lightened up.
convert "${file}pg${n}.pdf" -modulate 120% "${file}pg${n}.pdf";
convert "${file}pg${n}.pnm" -modulate 120% "${file}pg${n}.pnm";
echo Next;
read -n 1 -p "Hit enter to scan.";
let n=$n+1;
done;
Actually, there’s not much to explain since it’s all in the comments already. Just a few points: I found the genesys backend to be a real bear, and it turns out that the plustek backend has brightness and contrast built in. Fuck. I’ve spent hours on this damned genesys crap. Also, as the comments say, you don’t have to hit enter; any key will do. And you can hold down a key and the script will keep scanning until the keyboard buffer is empty. Great when you’re scanning a bunch of same-sized documents.
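The script comments mention that semi-automatic detection of the scanner "could probably be done". Here is a minimal sketch of one way to do it, not something the script actually does. It assumes your scanimage supports the -f (formatted device list) option with %d for the device name and %n for a newline; pick_device is a made-up helper name.

```shell
#!/bin/bash
# pick_device BACKEND (hypothetical helper): read one SANE device name per
# line on stdin and print the first one that belongs to BACKEND,
# e.g. plustek or genesys. Returns non-zero if nothing matches.
pick_device() {
    local backend=$1 dev
    while IFS= read -r dev; do
        case $dev in
            "$backend":*) printf '%s\n' "$dev"; return 0 ;;
        esac
    done
    return 1
}

# Usage with a scanner attached (assumption: scanimage -f is available):
#   dev=$(scanimage -f '%d%n' | pick_device plustek) &&
#       scanimage -p -d "$dev" --resolution 150 > page.pnm
```

The point is that the libusb bus and device numbers change when you replug the scanner, so looking the name up by backend saves editing the script each time.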
One further note: convert is a bit odd and really increases the file size, from 6 to 13 MB! ppmbrighten does the same thing but keeps the file size the same. For example, ppmbrighten -v 25 test.pnm > test2.pnm will increase the brightness by 25%. It’s slightly faster too. I just didn’t get around to updating the script to use it instead. It’s part of the netpbm package if you can’t find it on your system.
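If you want to swap ppmbrighten into the script, something like the following might do. This is only a sketch: brighten_pnm and the BRIGHTEN override variable are names I made up for illustration, and the -v 25 call is the same one shown above. ppmbrighten writes to stdout, so the function goes through a temp file rather than redirecting onto its own input.

```shell
#!/bin/bash
# brighten_pnm FILE [PERCENT] (hypothetical helper): brighten FILE in place.
# The result is written to a temp file first, because redirecting the
# output straight onto FILE would truncate it before it is read.
# BRIGHTEN is an assumed override so another command can be substituted.
brighten_pnm() {
    local src=$1 pct=${2:-25} tmp
    tmp=$(mktemp) || return 1
    if "${BRIGHTEN:-ppmbrighten}" -v "$pct" "$src" > "$tmp"; then
        mv "$tmp" "$src"
    else
        # Leave the original untouched if brightening failed.
        rm -f "$tmp"
        return 1
    fi
}

# Usage inside the scan loop (needs netpbm installed):
#   brighten_pnm "${file}pg${n}.pnm" 25
```

Note that the replaced file takes on the temp file's permissions, which may be stricter than the original's.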