# Scan multi-page documents directly to pdf quickly.

## PowerFactor

I don't know if anyone else has been as frustrated as I have by the lack of easy-to-use software focused on document scanning on Linux.  I'm not talking about OCR, just scanning documents into a portable multi-page image format.  In fact, the only software I've found that does exactly what I wanted is Adobe Acrobat on Windows.  But with that I had to go through the Windows TWAIN driver interface for my scanner, which seems designed to make the process as slow and clumsy as possible.  Still, that was the method I used for the last couple of years on the occasions when it was useful.

Finally, after I bought a new printer/scanner a couple of months ago (Epson CX3200, it's nice), I decided it was time to figure out how to do the job on Linux.  By then I knew all the command-line tools to do what I wanted were available; it was "just" a matter of writing a little script to tie it all together.  Being the amateur I am, it took me a couple of days to figure it all out, but I got it working. This was back in January.

The other day I was playing around with controlling it with kdialog and I thought maybe someone else would find it useful (the non-KDE-dependent version, that is), so I figured why not post it. I did try to make it a little more user friendly. It's still an ugly hack, but it works for me.  Not only that, but once it's set up it's better at its specific purpose than anything else I've tried.  :Cool: 

The script depends on the following packages.

sane-frontends

imagemagick

netpbm

ghostscript

If your scanner has a decent 1-bit(Lineart) mode (or if you can actually get convert's threshold function to work for you) then you can modify the script slightly and get rid of the netpbm dependency.
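For the `convert` route mentioned above, ImageMagick's `-threshold` option accepts a percentage, which should avoid depending on how the particular build scales pixel values. Here is a minimal Python sketch that builds the argument list from the same 0 to 1 value the script uses. The function name and the file names are made up for illustration, and I haven't checked the result against any particular scanner's output.

```python
def threshold_args(src, dst, threshold=0.55):
    """Build a convert command that turns a grayscale scan into 1-bit black and white."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    percent = f"{threshold * 100:g}%"   # e.g. 0.55 becomes "55%"
    return ["convert", src, "-threshold", percent, dst]


# Hypothetical file names, just to show the resulting command line
print(" ".join(threshold_args("pg01.pgm", "pg01.pbm")))
```

The list form can be handed to `subprocess.run` as-is, which sidesteps shell quoting problems with odd file names.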

You need to know how to use the scanimage program with your scanner, as you will need to modify the SCANDEVICE and SCANCMD variables to fit.  The rest of the configuration is pretty self explanatory I think.
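To make the configuration step a bit more concrete, here is a rough Python sketch of how the scan command gets assembled from those settings. It only uses the scanimage options that appear in the script (`-d`, `--mode`, `-x`, `-y`, `--resolution`); the function name is made up, and which mode names and geometry options your backend accepts is something you have to check yourself, for example with `scanimage --help -d <your device>`.

```python
def build_scan_command(device, mode="Gray", x_mm=212.5, y_mm=275, resolution=300):
    """Return the scanimage argument list used to scan one page to stdout."""
    cmd = ["scanimage"]
    if device:                      # leave empty to let scanimage pick the default scanner
        cmd += ["-d", device]
    cmd += ["--mode", mode,
            "-x", str(x_mm), "-y", str(y_mm),
            "--resolution", str(resolution)]
    return cmd


# The device name is the one from the script; yours will differ
print(" ".join(build_scan_command("epson:/dev/usb/scanner0")))
```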

To use it, just put your first page in the scanner, then run the script with the name of the file to save as the argument.  It will immediately scan the first page, then prompt you for more. The rest is gravy.

It's not very robust. If your scanner has a warm-up period, make sure it has finished before you start; otherwise scanimage may time out and the script gets a little confused.  And it's not designed to work with scanners that have an ADF.

Anyway, hope someone can use it. Even if just for inspiration.  :Laughing: 

EDIT: Later versions are posted further down the thread. Chrwei posted one that should work with an ADF (I don't have the hardware to try it), and I've posted the Python version I've been using for a while.  This version is left here mainly for reference.

```
#!/bin/bash
#
# scan-pdf version 2
# April 27, 2004
# Copyright 2004 Zacchaeus Pearsall (zap4260 at yahoo.com)
# Distributed under the terms of the GNU General Public License v2

# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper
THRESHOLD=0.55

# Scan resolution
RES=300

# Set these for the size paper you are scanning
# defaults are X=212.5, Y=275; for US letter paper
X=212.5
Y=275

# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"

# Set this to the appropriate sane device for your scanner
SCANDEVICE="epson:/dev/usb/scanner0"

# Leave this as Gray unless you have a scanner that has
# a good usable 1-bit mode
SCANMODE=Gray

# If your scanner has a good 1-bit mode and you plan
# to use it then uncomment this. I have no way to test this.
# You may need to make some other modifications as well.
#SCANNER_HAS_BW="yes"

# Modify the scan command as needed to fit your scanner
SCANCMD="scanimage -d ${SCANDEVICE} --mode ${SCANMODE} \
    -x ${X} -y ${Y} --resolution ${RES}"

if [ -z "${1}" ]
then
    echo "USAGE: scan-pdf file.pdf"
    exit 0
fi

if [ -e "${1}" ]
then
    echo "${1}: file exists"
    echo "Press any key to continue, Ctrl-c to quit"
    read -s -n 1 junk
fi

PDFFILE=${1}
TMPDIR=`mktemp -td scan-pdf.XXXXXXXXXX`

# Remove the temp directory on exit; turn signals into a normal exit
trap "rm -fr ${TMPDIR}" 0
trap "exit 2" 1 2 3 15

i=1
MORE_PAGES="y"
echo "To scan more sheets press space; press q when done"

while [ -z "${MORE_PAGES}" ] || [ "${MORE_PAGES}" != "q" ]
do
    # Zero-pad the page number so the pg*.pbm glob sorts in scan order
    if [ ${i} -lt 10 ]
    then
        PG="pg0${i}"
    else
        PG="pg${i}"
    fi

    if [ -n "${SCANNER_HAS_BW}" ]
    then
        ${SCANCMD} > "${TMPDIR}/${PG}.pbm"
    else
        # Scan in gray and threshold down to 1-bit with netpbm
        ${SCANCMD} | pgmtopbm -threshold -value ${THRESHOLD} > \
            "${TMPDIR}/${PG}.pbm"
    fi

    read -s -p "More?" -n 1 MORE_PAGES
    echo ' '
    let i++
done

# Join the pages into one postscript file, then convert that to PDF
convert "${TMPDIR}"/pg*.pbm -adjoin -page ${PAGESIZE} "${TMPDIR}/pgs.ps"
ps2pdf13 "${TMPDIR}/pgs.ps" "${TMPDIR}/pgs.pdf"
mv -i "${TMPDIR}/pgs.pdf" "${PDFFILE}"
```

Last edited by PowerFactor on Sun Feb 11, 2007 4:42 pm; edited 1 time in total

----------

## FatherBusa

Dude, you're a genius.  This is just what I was looking for.  Thanks!

----------

## chrwei

very nice, here's my enhancements :)

summary:

- Added command line options with defaults

- Added ADF support with a command line toggle to use the flatbed.  Can be set to use the flatbed by default with a command line toggle to use the ADF.

- Changed to use scanimage's batch mode and prompt so that timeouts shouldn't be an issue.  ADF doesn't use the prompt

- Made the scanner device name optional, as scanimage will normally detect your scanner automatically.

scanners tried:

- HP Officejet 6110

TODO:

- add more paper size options

- NetPBM says pgmtopbm is deprecated as of 7/2004 and to use pamditherbw instead.  I plan on only doing color or full greyscale documents so I'm not touching this.

bugs:

- "mode" seems to be scanner specific, some want "Grey" others want "Greyscale".  - needs testing

- might be an issue with providing -x and -y when using ADF, I need to test more

things-i-wish-worked-better:

- too many temp files!

and the code:

```

#!/bin/bash
#
# scan-pdf version 3
# April 27, 2004
# Copyright 2004 Zacchaeus Pearsall (zap4260 at yahoo.com)
# Distributed under the terms of the GNU General Public License v2
#
# February 15, 2005
# Chris Weiss
# - Added ADF and color support
# - Changed to use sane's built in batch mode
# - Added command line options

### defaults - set these so you don't have to supply them
###            on the command line every time

# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper, only used for BW scans
THRESHOLD=0.55

# Scan resolution
RES=300

# Set this to the appropriate sane device for your scanner
# if you have more than one or sane doesn't autodetect your scanner
# SCANDEVICE="epson:/dev/usb/scanner0"
SCANDEVICE=""

# use ADF in no-prompt batch mode.  Add the options your printer needs for this
ADF=Y
ADFOPTS="--batch-scan=yes"  #hpoj

# for black and white choose lineart (scanned as greyscale and
# thresholded unless SCANNER_HAS_BW is Y)
# for greyscale PDFs choose greyscale
# for full color PDFs choose color
SCANMODE="color"

# If your scanner has a good 1-bit mode and you plan
# to use it then change this to Y
SCANNER_HAS_BW=N

# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE="Letter"
#TODO: add X and Y sizes for more paper

# additional options
ADDOPT=""

### end defaults - you shouldn't need to modify anything below here

myname=`basename "$0"`

usage() {
cat<<EOF
$myname scans documents from your flatbed or ADF scanner and stores them in a multi page pdf.

Usage: $myname [Options] filename.pdf

   Options:
   -page "size"      Page size for the PDF.  See "man convert" for possibilities
   -mode "mode"      lineart, greyscale, or color.
   -1bit [Y/N]       If your scanner has a good 1-bit mode and you want lineart, use Y here.
   -adf [Y/N]        Use ADF in no-prompt batch mode (Y/N) - edit this script and set your scanner's options
   -res dpi          Resolution to scan at in DPI
   -opts "options"   Additional options to pass to the 'scanimage' program
   -threshold 0.55   Value between 0 and 1 to pass to 'pgmtopbm'
   -h                Help.  This info.

Open $myname in your favorite editor to change the default values.
$myname requires: sane-frontends, imagemagick, netpbm, and ghostscript
EOF
exit 0
}

while [ $# -ne 0 ];
do
    case "$1" in
   -page)       shift;   PAGESIZE=$1 ;;
   -mode)       shift;   SCANMODE=$1 ;;
   -1bit)       shift;   SCANNER_HAS_BW=$1 ;;
   -res)        shift;   RES=$1 ;;
   -threshold)  shift;   THRESHOLD=$1 ;;
   -opts)       shift;   ADDOPT=$1 ;;
   -adf)        shift;   ADF=$1 ;;
   -h)          usage ;;
   *)           PDFFILE=${1} ;;
    esac
    shift
done

if [ -z "${PDFFILE}" ]; then
   usage
fi

if [ -e "${PDFFILE}" ]
then
    echo "${PDFFILE}: file exists"
    echo "Press any key to overwrite, Ctrl-c to quit"
    read -s -n 1 junk
fi

# The script works inside the output directory, so from here on
# refer to the PDF by its file name only
outdir=`dirname "$PDFFILE"`
pdfname=`basename "$PDFFILE"`

OPTIONS=""
BITCONVERT=""

case "$SCANMODE" in
color)
   OPTIONS=" --mode color"
   ;;
grey|gray|greyscale|grayscale)
   OPTIONS=" --mode Greyscale"
   ;;
lineart)
   if [ "$SCANNER_HAS_BW" = "Y" ] || [ "$SCANNER_HAS_BW" = "y" ]; then
      OPTIONS=" --mode Lineart"
   else
      BITCONVERT="Y"
      OPTIONS=" --mode Greyscale"
   fi
   ;;
esac

OPTIONS="$OPTIONS --resolution $RES"

if [ "$SCANDEVICE" != "" ]; then
   OPTIONS="-d $SCANDEVICE $OPTIONS"
fi

if [ "$ADF" = "Y" ] || [ "$ADF" = "y" ]; then
   OPTIONS="$OPTIONS $ADFOPTS"
else
   OPTIONS="$OPTIONS --batch-prompt"
fi

# Only append the extra options when some were given
if [ -n "$ADDOPT" ]; then
   OPTIONS="$OPTIONS $ADDOPT"
fi

case "${PAGESIZE}" in
Letter)
   OPTIONS="$OPTIONS -x 212.5 -y 275" ;;
esac

origdir=`pwd`
cd "$outdir"

#echo "scanimage $OPTIONS -b "
# Batch mode writes out1.pnm, out2.pnm, ... into the current directory
scanimage $OPTIONS -b

if [ "$BITCONVERT" != "" ]; then
   echo "Converting greyscale to lineart"
   for f in out*.pnm; do
      #echo "pgmtopbm -threshold -value ${THRESHOLD} < $f > $f.pbm"
      pgmtopbm -threshold -value ${THRESHOLD} < "$f" > "$f.pbm"
      rm -f "$f"
   done
   #echo "convert out*.pbm -adjoin -page ${PAGESIZE} ${pdfname}.ps"
   echo "creating postscript"
   convert out*.pbm -adjoin -page ${PAGESIZE} "${pdfname}.ps"
   rm -f out*.pbm
else
   #echo "convert out*.pnm -adjoin -page ${PAGESIZE} ${pdfname}.ps"
   echo "creating postscript"
   convert out*.pnm -adjoin -page ${PAGESIZE} "${pdfname}.ps"
   rm -f out*.pnm
fi

#echo "ps2pdf13 ${pdfname}.ps ${pdfname}"
echo "Convert postscript to PDF"
ps2pdf13 "${pdfname}.ps" "${pdfname}"
rm -f "${pdfname}.ps"

cd "$origdir"

```
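On the scanner-specific mode names noted in the bugs list: one way around guessing is to read the choices the backend itself advertises and pick the matching spelling. The Python sketch below assumes the device options listing contains a line of the form `--mode Lineart|Gray|Color [Color]`, which is how scanimage's option help usually looks but is not guaranteed for every backend. The function names, the keyword table and the sample line are all made up for illustration.

```python
import re

# Hypothetical keyword table: user-facing name -> substrings seen in backend mode names
KEYWORDS = {
    "color": ("color", "colour"),
    "gray": ("gray", "grey"),
    "lineart": ("lineart", "black"),
}


def available_modes(options_text):
    """Extract the choices from a line like '--mode Lineart|Gray|Color [Color]'."""
    match = re.search(r"--mode\s+(.+?)\s+\[", options_text)
    return match.group(1).split("|") if match else []


def pick_mode(options_text, wanted):
    """Return the backend's own spelling for 'color', 'gray' or 'lineart', or None."""
    for mode in available_modes(options_text):
        if any(word in mode.lower() for word in KEYWORDS[wanted]):
            return mode
    return None


SAMPLE = "    --mode Lineart|Grayscale|Color [Color]\n"   # illustrative, not from a real device
print(pick_mode(SAMPLE, "gray"))
```

In real use the text would come from running scanimage with its help option for your device; that step is left out here so the sketch stays runnable without a scanner attached.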

----------

## r.abbott

This thing is great!  Thanks.

----------

## gcediel

One (maybe silly) question: How can I make scanimage stop scanning more pages? I have tried several keys, but I can't stop it.

----------

## r.abbott

Use <Ctrl-D>   :Smile: 

----------

## chrwei

I haven't used it in a while, but I think it tells you that on screen, at least it did on mine.  You should run it in a terminal and not just from a "run" dialog.

----------

## gcediel

Well, CTRL+D doesn't work for me.

BTW: very nice stuff!

----------

## djmaze

CTRL+C works for me. (Try it two times, if it doesn't work.)

----------

## gcediel

Thanks, it works, although not a clean way.

----------

## zatalian

this script used to work for me but now convert gives me trouble...

convert -page letter converts the original image to a blank postscript file. Converting without the -page option works but then the pdf document is not in the correct format. Is this happening to anybody else? Any solutions?

----------

## bludger

My HP 3500c doesn't have the mode function at all. This means that it can only output colour images.  How would you convert something like this to black and white?

Also I had a number of tiff files and managed to convert them into a multi page pdf with:

  convert <tif1> <tif2> <tif3> file.pdf

Why not just convert like this, leaving out the intermediate ps stage?

----------

## bludger

I solved my problem with the following:

scanimage -d <device> --resolution 150|ppmtopgm|pamthreshold -simple >tempscanfile1.pbm

convert -compress fax tempscanfile*.pbm outfile.pdf

This produced a 33kB file with resolution 150 and a 60kB file with resolution 300.  The 150 res version was readable, but a bit ugly and the 300 version was excellent.
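When there are many page files, it is worth making sure they reach `convert` in scan order, since a plain glob puts page 10 before page 2. The following Python sketch builds the same `convert -compress fax` command as above with the pages sorted numerically. The sorting helper and function names are my own additions, not part of the method described in this post.

```python
import re


def natural_key(name):
    """Sort key that compares digit runs as numbers, so page 2 comes before page 10."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


def build_pdf_command(pages, outfile):
    """Return a convert command that packs 1-bit pages into one fax-compressed PDF."""
    return ["convert", "-compress", "fax"] + sorted(pages, key=natural_key) + [outfile]


pages = ["tempscanfile10.pbm", "tempscanfile2.pbm", "tempscanfile1.pbm"]
print(" ".join(build_pdf_command(pages, "outfile.pdf")))
```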

----------

## bludger

I have been using the above method successfully and conveniently for the last few months now.  One problem that I have found is that when I try to convert multiple pbm files into one multi page pdf, I can quickly run out of memory if I get above 6 pages or so.  Does anyone have any suggestions as to how to get around this?

----------

## martoss

Isn't xsane doing the same?

My xsane version has an option to scan "pages" to a pdf. Works pretty well AFAIR. I don't see a big difference.

Xsane has also other nice features like just "copying stuff" and "emailing stuff". Anyways, your script sounds also nice  :Smile: 

----------

## PowerFactor

It seems I have been lax in keeping up with this. Better late than never I guess.

 *zatalian wrote:*   

> this script used to work for me but now convert gives me trouble... 
> 
> convert -page letter converts the original image to a blank postscript file. Converting without the -page option works but then the pdf document is not in the correct format. Is this happening to anybody else? Any solutions?

 

I've run into this several times. There seems to be some interdependency between imagemagick and ghostscript that causes a problem when I upgrade one or the other. Usually recompiling imagemagick after a ghostscript upgrade fixes it.

 *bludger wrote:*   

> I have been using the above method successfully and conveniently for the last few months now. One problem that I have found is that when I try to convert multiple pbm files into one multi page pdf, I can quickly run out of memory if I get above 6 pages or so. Does anyone have any suggestions as to how to get around this?

 

I ran into this a while back too. You need to use the "-limit Memory" and possibly the "-limit Map" options for convert to limit its RAM usage. Usually 1/4 of my physical RAM seems to work well enough.  It does take a long time to convert though.
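As a rough illustration of the quarter-of-RAM rule of thumb, here is a small Python sketch that works out the value and builds the `-limit` options. The function names are made up. It spells the unit out as `MiB` on the assumption that your ImageMagick accepts unit suffixes (recent releases do, and read a bare number as bytes), and it reads the RAM size through `os.sysconf`, which works on Linux.

```python
import os


def physical_ram_bytes():
    """Total physical memory as reported by sysconf (works on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def limit_args(total_bytes, fraction=0.25):
    """Return -limit options capping convert's memory and map use at a fraction of RAM."""
    mib = max(1, int(total_bytes * fraction) // (1024 * 1024))
    return ["-limit", "memory", f"{mib}MiB", "-limit", "map", f"{mib}MiB"]


# Example with a fixed 8 GiB machine; pass physical_ram_bytes() to use the real figure
print(" ".join(limit_args(8 * 1024 ** 3)))
```

The options go right after `convert` on the command line, before the input files.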

 *martoss wrote:*   

> Isn't xsane doing the same? ...

 

It is now and I'm glad to see it.  It didn't have those options 3 years ago when I posted this, though.  It still looks like this script might be more convenient for some tasks.  Xsane is probably less buggy though.  :Wink: 

----------

## PowerFactor

I've also made some changes since that original version.  I converted it to Python and added some ncurses "eyecandy" using dialog. I also got rid of the netpbm dependency. I had intended to rewrite it as a "proper" modular program with a separate config file and such but never got very far with it.  It's not something I use very often anyway.

Anyhow, here's my latest working version.  Plenty of bugs I'm sure but it mostly works when I need it.

Dependencies have changed a little:

dev-lang/python

media-gfx/sane-frontends

media-gfx/imagemagick

virtual/ghostscript

dev-util/dialog

```
#!/usr/bin/env python3
#
# scan-pdf version 8
# Sep 13, 2004
# Copyright 2004 Zack Pearsall (zap4260@yahoo.com)
# Distributed under the terms of the GNU General Public License v2

import sys
import os
import signal
import shutil
import math
import subprocess
from tempfile import mkdtemp

# Set this to a value between 0 and 1
# You may have to play with it some to get good scans
# of colored paper
THRESHOLD = "0.55"

# Scan resolution
RES = "300"

# Set these for the size paper you are scanning
# defaults are X=212.5, Y=275; for US letter paper
X = "212.5"
Y = "275"

# Paper size for pdf output. See "man convert" for possibilities
PAGESIZE = "Letter"

# Set this to the appropriate sane device for your scanner
SCANDEVICE = "epson:"

# Leave this as Gray unless you have a scanner that has
# a good usable 1-bit mode
SCANMODE = "Gray"

# Modify the scan command as needed to fit your scanner
SCANCMD = ("scanimage -d " + SCANDEVICE + " --mode " + SCANMODE +
           " -x " + X + " -y " + Y + " --resolution " + RES)

# ImageMagick limits. Recent ImageMagick releases read a bare
# number as bytes, so the unit is given explicitly.
MEMLIMIT = "128MiB"
MAPLIMIT = "256MiB"

# End of configuration options

TMPDIR = None


def cleanup(signum=0, stkframe=None):
    # Remove the temp directory, then exit (also used as the signal handler)
    if TMPDIR and os.path.isdir(TMPDIR):
        shutil.rmtree(TMPDIR, ignore_errors=True)
    sys.exit(signum)


def imove(src, dest, ask=True):
    # Move src to dest, asking first if dest already exists
    if ask and os.path.isfile(dest):
        userin = input("overwrite file '" + dest + "'? ")
        if userin == "y" or userin == "yes":
            shutil.move(src, dest)
    else:
        shutil.move(src, dest)


def scanpage(scancmd, pgname):
    buffsize = 1024

    progress = os.popen('dialog --gauge "Scanning..." 6 60', "w")
    scan = subprocess.Popen(scancmd, shell=True, stdout=subprocess.PIPE)
    data = scan.stdout

    # The PNM magic number decides the file extension
    magic = data.readline()
    if not magic:
        # scanimage produced no output (for example, the scanner timed out)
        progress.close()
        scan.wait()
        return
    if magic == b"P4\n":
        outfilename = pgname + ".pbm"
    elif magic == b"P5\n":
        outfilename = pgname + ".pgm"
    else:
        outfilename = pgname + ".ppm"

    outfile = open(outfilename, "wb")
    outfile.write(magic)

    # Copy any comment lines, then the "width height" line
    buff = data.readline()
    while buff.startswith(b"#") or buff.startswith(b"\n"):
        outfile.write(buff)
        buff = data.readline()

    hw = buff.split()
    outfile.write(buff)
    width, height = int(hw[0]), int(hw[1])

    # Work out how many bytes of image data to expect, for the gauge
    if magic == b"P4\n":
        # 1 bit per pixel, each row padded to a whole byte
        bmsize = ((width + 7) // 8) * height
    else:
        # Gray (P5) and color (P6) images carry a maxval line
        buffsize = 8192
        buff = data.readline()
        bytespersample = 1 if int(buff) < 256 else 2
        samples = 1 if magic == b"P5\n" else 3
        bmsize = width * height * samples * bytespersample
        outfile.write(buff)

    bytescopied = 0
    prevpercent = 0
    buff = data.read(buffsize)
    while len(buff) != 0:
        bytescopied += len(buff)
        percent = min(100, int(math.ceil(bytescopied / bmsize * 100)))
        if percent > prevpercent:
            prevpercent = percent
            progress.write(str(prevpercent) + "\n")
            progress.flush()

        outfile.write(buff)
        buff = data.read(buffsize)

    progress.close()
    scan.wait()
    outfile.close()


# start of main program
if len(sys.argv) <= 1:
    print("USAGE: scan-pdf file.pdf\n")
    sys.exit(0)

PDFFILE = sys.argv[1]

if os.path.isfile(PDFFILE):
    print('\a')
    if subprocess.call(["dialog", "--yesno",
                        PDFFILE + ": File exists! Continue?", "15", "60"]) != 0:
        os.system("clear")
        sys.exit(0)
elif os.path.isdir(PDFFILE):
    print("\a'" + PDFFILE + "' is a directory!")
    sys.exit(0)

TMPDIR = mkdtemp("scan-pdf")

signal.signal(signal.SIGHUP, cleanup)
signal.signal(signal.SIGINT, cleanup)
signal.signal(signal.SIGQUIT, cleanup)
signal.signal(signal.SIGTERM, cleanup)

os.system('dialog --msgbox "Insert first page" 5 24')

i = 1
retval = 0
while retval == 0:
    PG = "pg" + str(i).zfill(5)
    scanpage(SCANCMD, TMPDIR + "/" + PG)
    i += 1
    retval = os.system('dialog --yesno "Scan another page?" 5 25')

os.system('dialog --infobox "Converting..." 3 23')

os.system("convert -limit Memory " + MEMLIMIT + " -limit Map " + MAPLIMIT +
          " -threshold " + str(int(float(THRESHOLD) * 65535)) + " " +
          TMPDIR + "/pg*.pgm -adjoin -page " + PAGESIZE + " " + TMPDIR + "/pgs.ps")

os.system("ps2pdf13 " + TMPDIR + "/pgs.ps " + TMPDIR + "/pgs.pdf")
os.system("clear")

imove(TMPDIR + "/pgs.pdf", PDFFILE)
cleanup()

```
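The progress gauge in the script above depends on knowing how many bytes of image data to expect after the PNM header. As a standalone illustration, here is a small Python sketch of that calculation for the three raw formats scanimage can emit (bitmap, gray and color). The function names are made up for this example, and the rule that bitmap rows are padded to whole bytes comes from the PBM format rather than from the script.

```python
def pnm_payload_size(magic, width, height, maxval=255):
    """Bytes of raster data that follow the header of a raw PBM/PGM/PPM image."""
    if magic == "P4":                       # bitmap: 1 bit per pixel, rows padded to bytes
        return ((width + 7) // 8) * height
    bytes_per_sample = 1 if maxval < 256 else 2
    samples = {"P5": 1, "P6": 3}[magic]     # gray has 1 sample per pixel, color has 3
    return width * height * samples * bytes_per_sample


def percent_done(copied, total):
    """Progress value for the gauge, clamped to the 0..100 range."""
    return min(100, copied * 100 // total) if total else 0


# A 100 x 50 pixel 8-bit gray image carries 5000 bytes of data
print(pnm_payload_size("P5", 100, 50), percent_done(2500, 5000))
```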

----------

## bludger

 *PowerFactor wrote:*   

> I ran into this a while back too. You need to use the "-limit Memory" and possibly the "-limit Map" options for convert to limit its RAM usage. Usually 1/4 of my physical RAM seems to work well enough.  It does take a long time to convert though.

Thanks for this. I just found this out independently today and was returning to the thread to post my results, but it appears that you beat me to it.

I have just one question though. I used only the memory limit option. What does the map limit option actually do? The documentation seems rather sparse.

----------

## PowerFactor

As I understand it the Map limit option limits the amount of filespace that can be mmaped for pixel cache.

http://en.wikipedia.org/wiki/Memory-mapped_file

I think there's probably no need to use the Map limit on most systems.  I think I just put it in mine because I had no clue how mmapping worked back then.  It doesn't seem to make any performance difference when I remove it.

----------

## bludger

To get the scan device, I had been performing the following:

SCANDEVICE=$(scanimage -L|grep hp3500|awk -F '`' '{print $2}'|awk -F \' '{print $1}')

(my device is an hp3500)

This would read the correct usb port. From your script, I see that it might be possible to use just "hp3500:".  I'll give that a try.
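For anyone doing this lookup from Python rather than with grep and awk, the same idea can be sketched as below. It assumes `scanimage -L` prints one line per scanner in the form "device `name' is a description"; the two sample lines are in that form. The function names are made up for illustration.

```python
import re

# One line per scanner: device `<name>' is a <description>
DEVICE_LINE = re.compile(r"device `([^']+)' is an? (.+)")


def parse_devices(listing):
    """Return (device_name, description) pairs from `scanimage -L` output."""
    return DEVICE_LINE.findall(listing)


def find_device(listing, keyword):
    """Return the first device whose name or description contains keyword."""
    for name, description in parse_devices(listing):
        if keyword.lower() in (name + " " + description).lower():
            return name
    return None


SAMPLE = (
    "device `v4l:/dev/video0' is a Noname stk11xx virtual device\n"
    "device `plustek:libusb:002:002' is a Canon LiDE25 USB flatbed scanner\n"
)
print(find_device(SAMPLE, "plustek"))
```

To use it for real, the listing would come from something like `subprocess.run(["scanimage", "-L"], capture_output=True, text=True).stdout`.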

----------

## csim

Hi,

i have a small suggestion:

i think it would be cool to have the basic parameters accessible via some kind of menu for example:

scanimage -L lists all available devices, it would be cool to select them via dropdown menu...

```

device `v4l:/dev/video0' is a Noname stk11xx virtual device

device `plustek:libusb:002:002' is a Canon LiDE25 USB flatbed scanner

```

scanimage --resolution=300 -x 210 -y 297 -d plustek:libusb:002:002 > /home/user1/image.pnm

Basically, having two dropdown menus specifying paper size (A4 would translate to -x 210 -y 297) and DPI would also be cool.
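The paper size translation could be as simple as a lookup table. Here is a minimal Python sketch, where the table and function names are made up. The A4 figures are the ones given above; Letter and Legal use the nominal paper dimensions in millimetres, which are slightly larger than the 212.5 x 275 the scripts in this thread use for Letter.

```python
# Nominal paper sizes in millimetres (width, height)
PAPER_SIZES_MM = {
    "A4": (210, 297),
    "Letter": (215.9, 279.4),
    "Legal": (215.9, 355.6),
}


def size_args(paper):
    """Translate a paper name into the -x/-y options for scanimage."""
    try:
        width, height = PAPER_SIZES_MM[paper]
    except KeyError:
        raise ValueError(f"unknown paper size: {paper}") from None
    return ["-x", str(width), "-y", str(height)]


print(" ".join(["scanimage", "--resolution=300"] + size_args("A4")))
```

A menu front end would then only need to offer the keys of the table and pass the chosen one through.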

Let me dream a bit about this, having such simple (preferably GTK+ based interface) with a Finish button and a place where you can name your pdf would be great.

```

Select device:     Select Paper Size:   Select Resolution in DPI:

[Canon Lide 25]    [A4]                 [300]

[Your Name of pdf] . pdf   

[/your/location]                     [Choose location...]               

                                          [Scan]    [Finish]

```

----------

## redwood

I was googling for Zacchaeus Pearsall's original version of this script, when I found this page.

I too used his script as a starting point when writing a shell script for batch document scanning using scanadf.

My version "bscan" is available at http://www.acjlaw.net:8080/~jeremy/Ricoh/usage_bscan.html

It uses a configuration file, ~/.bscanrc 

where one can list all your scanners in a bash array, 

with devices names as shown by "scanimage -L"

and the default scanner being SCANDEVICE="${scanners[0]}"

Importantly, specifying the scanner names in ~/.bscanrc saves time 

since the script then skips finding the scanners using "scanimage -L" 

One can also  specify which scanners are true duplex, 

so the script will scan fake duplex mode when true duplex is not available.

One can also specify lp printer instances so one can scan directly to printer;

e.g. if you scan a document in duplex mode on letter-sized paper, 

it will be printed in duplex from the appropriate tray holding letter-sized paper.

By default the script scans from the ADF in grayscale @300dpi and saves to format PDF.

So to scan a letter-sized document from the ADF @300dpi grayscale, 

then compress using lzw, binarize using djvu and save to OUTFILE.pdf

one would use:

bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw -BW OUTFILE

or for legal-sized paper

bscan --mode=8-bit --shades=2 --page=Legal --comp=lzw -BW OUTFILE

or letter-sized paper from the FlatBed:

bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw --source=FB -BW OUTFILE

To simplify things, I usually define some aliases for black/white, grayscale and color scanning:

alias b='bscan --mode=1-bit --page=Letter --comp=lzw'

alias bl='bscan --mode=1-bit --page=Legal --comp=lzw'

alias B='bscan --mode=8-bit --shades=2 --page=Letter --comp=lzw'

alias BL='bscan --mode=8-bit --shades=2 --page=Legal --comp=lzw'

alias C='bscan --mode=color --shades=32 --page=Letter --comp=lzw'

alias CL='bscan --mode=color --shades=32 --page=Legal --comp=lzw'

alias truecolor='bscan --mode=color --shades=truecolor --page=Letter --comp=lzw'

Then to scan in b/w from the ADF @300dpi grayscale a letter-sized document:

b OUTFILE

for legal-sized:

bl OUTFILE

To scan in grayscale and binarize using djvu wavelet compression:

For letter:

B -BW OUTFILE

For Legal:

BL -BW OUTFILE

For letter using pnmtools' truecolor shades:

truecolor -c44 --djvutopdf=25 OUTFILE

For letter using duplexing and djvu binarization:

B -duplex -BW OUTFILE

Or to rotate the document 180 degrees:

B --rot=r180 OUTFILE

To save to another format, use --format={pnm,tif,pdf,ps,djv} or alternatively,

-pnm <equivalent to --format=pnm>

-tif     <equivalent to --format=tif>, 

and similarly for the other output options:

-pdf, -ps, -djv

Shortcut options, like the above switches take a single '-'

and arguments requiring a value  have the form '--option=value'

One can specify various binarization algorithms, 

such as those from Fred Weinhaus http://www.fmwconcepts.com/imagemagick/index.html

using the option  --thresh={bw, constant, 2color, fuzzy, isodata, kmeans, sahoo, triangle, }

where the various binarization scripts must be in your $PATH.

If you use xsane or gscan2pdf to scan some images because, e.g. you need to crop the image

or tweak the contrast/brightness/gamma settings, 

you can save the images as OUTFILE.%d.pnm 

e.g. OUTFILE.0001.pnm, OUTFILE.0002.pnm, ...

Then you can use bscan with the option "-noscan" to skip the scanning, 

and instead just process the images:

e.g., to rotate the images 180 degrees and binarize using djvu compression:

B -noscan -BW --rot=180 OUTFILE

which would process the series of images and create one multipage OUTFILE.pdf

One can also deskew images using unpaper from http://unpaper.berlios.de/

The options to "unpaper" are hardwired into bscan because the options are just too numerous

to specify on the commandline. 

so it might be best to just make a local copy of bscan, 

and modify the line which runs unpaper using whatever unpaper options you need.

Alternatively, you could add an option for unpaper settings

so that you could scan, e.g. B --unpaper=setting1 -BW OUTFILE

where setting1 would be specified in ~/.bscanrc or hardwired into bscan.

To photocopy, i.e. scan then print to printer:

For letter printed to PRINTERLETTER

B -prn --n=<number of copies>

For legal printed to PRINTERLEGAL

BL -prn --n=<#copies>

Or for duplex letter to PRINTERLTRDUP

B -duplex -prn --n=<#copies>

And legal duplex to PRINTERLGLDUP

BL -duplex -prn --n=<#copies>

You just need to define the lp printer instances in /etc/cups/lpoptions or ~/.cups/lpoptions

However, I find that KDE keeps modifying/deleting any printer instances in ~/.cups/lpoptions

so I've given up and just use /etc/cups/lpoptions, which KDE leaves untouched.

You can define lp printer instances using lpoptions,

but I find it easier to just directly edit /etc/cups/lpoptions

e.g. for my Xerox Phaser8860  print queue

I can define a letter,color,simplex queue:

Dest Phaser8860/letter Duplex=None fitplot=false InputSlot=Tray2 media=letter MediaType=Auto OutputMode=Enhanced PageRegion=letter PageSize=letter

And in my ~/.bscanrc, I add the name of the printer destination:

PRINTERLETTER="Phaser8860/letter"

And similarly for a color duplex-letter queue:

Dest Phaser8860/ltrdup Duplex=DuplexNoTumble fitplot=false InputSlot=Tray2 media=letter MediaType=Auto OutputMode=Enhanced PageRegion=letter PageSize=letter sides=two-sided-long-edge

with the destination

PRINTERLTRDUP="Phaser8860/ltrdup"

in my ~/.bscanrc

"bscan" will choose the appropriate letter/legal simplex/duplex printer destinations depending on whether the scan was letter/legal, simplex/duplex.

----------

## undrwater

 *redwood wrote:*   

> I was googling for Zacchaeus Pearsall's original version of this script, when I found this page.
> 
> I too used his script as a starting point when writing a shell script for batch document scanning using scanadf.
> 
> My version "bscan" is available at... 

 

Thank you for this!  Brother had provided some scripts but they used a tool that no longer works.  I will have to see if I can use this.

----------

