NAME
csepdjvu - DjVu encoder for separated data files.
 
SYNOPSIS
csepdjvu [options] [sepfiles]... outputdjvufile
 
DESCRIPTION
This program creates a DjVuDocument file outputdjvufile from separated
  data files sepfiles. It can read separated data from the standard input
  when given a single dash instead of the separated data file names. This
  feature is intended for pre-processing programs that push separated data into
  csepdjvu via a pipe.
 
Each separated data file represents one or more page images. When the program
  arguments specify multiple pages, all the pages are encoded and saved as a
  bundled multi-page document. When the program arguments specify a single page,
  the page is encoded and saved as a single page file.
 
OPTIONS
  - -d n
 
  - Specify the resolution information encoded into the output
      file, expressed in dots per inch. The resolution information encoded in
      DjVu files determines how the decoder scales the image on a particular
      display. Meaningful resolutions range from 25 to 6000. The default value
      is 300 dpi.
 
  - -q n,...,n
 
  
  - -q n+...+n
 
  - Specify the encoding quality of the IW44 encoded background
      layer. The option argument contains several integers (one per chunk)
      separated by either commas or pluses. This option is similar to option
      -slice of program c44. Please refer to the c44(1) man
      page for additional details. The default quality specification is -q
      72,83,93,103.
    
 
    This option does not apply to uniformly white backgrounds that were not
      specified by the separated data but are called for by the DjVu
      specification. Such background images always come at the lowest possible
      resolution and with a standard quality setting that ensures color
      uniformity.
  - -t
 
  - Program csepdjvu interprets certain comments in the
      separated file to construct a hidden text layer in the DjVu file. By
      default, this layer records the location of each word for highlighting
      purposes. This option reduces the file size by recording only the
      location of each line.
 
  - -v
 
  - Display a brief message describing each page.
 
  - -vv
 
  - Display extensive informational messages during encoding.
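
As an illustration of how the options above combine, the following minimal sketch builds an argument list for csepdjvu. The helper name build_csepdjvu_command and its parameters are hypothetical and not part of the program. Only the options documented above are used.

```python
def build_csepdjvu_command(sepfiles, output, dpi=None, quality=None,
                           lines_only=False, verbosity=0):
    """Return an argv list for csepdjvu (hypothetical helper, sketch only)."""
    cmd = ["csepdjvu"]
    if dpi is not None:
        # The man page states that meaningful resolutions range from 25 to 6000.
        if not 25 <= dpi <= 6000:
            raise ValueError("resolution should be between 25 and 6000 dpi")
        cmd += ["-d", str(dpi)]
    if quality:
        # One integer per chunk, joined with commas (pluses are also accepted).
        cmd += ["-q", ",".join(str(q) for q in quality)]
    if lines_only:
        cmd.append("-t")      # record line locations only in the text layer
    if verbosity == 1:
        cmd.append("-v")
    elif verbosity >= 2:
        cmd.append("-vv")
    cmd += list(sepfiles)     # pass ["-"] to read separated data from stdin
    cmd.append(output)
    return cmd
```

The resulting list can be handed to subprocess.run. When sepfiles is ["-"], the separated data would be supplied on the standard input of the process.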
    
 
   
Each separated data file contains a concatenation of one or more separated page
  images. Each page is logically represented by a foreground image with a
  transparent color and by a background image visible through the transparent
  pixels. The data for each separated page image is the concatenation of the
  following data blocks:
  - *
 
  - A foreground image encoded using either the "Color RLE
      format" or the "Bitonal RLE format". These formats are
      described later in this section.
 
  - *
 
  - An optional background image encoded as a "Portable
      Pixmap" ( PPM ). This well known format is summarized
      later in this section. The absence of a background image simply indicates
      that a uniformly white background should be assumed.
 
  - *
 
  - An arbitrary number of comment lines starting with
      character "#" and terminated by a linefeed character. Comment
      lines whose first word starts with a capital letter have special meanings
      documented later in this document.
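
The concatenation described above can be sketched as follows. The function name assemble_page is hypothetical. It assumes the foreground and background blocks have already been encoded as bytes, and that each comment is passed as a single line without its trailing linefeed.

```python
def assemble_page(foreground, background=None, comments=()):
    """Concatenate the data blocks of one separated page (sketch only)."""
    parts = [foreground]                 # Color RLE or Bitonal RLE data
    if background is not None:
        parts.append(background)         # optional PPM data
    for line in comments:
        if not line.startswith("#") or "\n" in line:
            raise ValueError("a comment must be one line starting with '#'")
        parts.append(line.encode("utf-8") + b"\n")
    return b"".join(parts)
```

Several pages produced this way can be concatenated to form a multi-page separated data file.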
 
The dimensions (width and height) of the background image must be obtained by
  rounding up the quotient of the foreground image dimensions by an integer
  reduction factor ranging from 1 to 12. Assume, for instance, that the width of
  the foreground is 2507 and the reduction factor is 3. The width of the
  background image will be the integer quotient (2507+2)/3, that is, 836.
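
The rounding rule can be checked with a short calculation. This is a sketch; the function name background_size is hypothetical, and the foreground height of 3300 in the usage note is an arbitrary example value.

```python
def background_size(fg_width, fg_height, reduction):
    """Return (width, height) of the background for a given reduction factor."""
    if not 1 <= reduction <= 12:
        raise ValueError("reduction factor must range from 1 to 12")
    # Adding (reduction - 1) before the integer division rounds the quotient up.
    width = (fg_width + reduction - 1) // reduction
    height = (fg_height + reduction - 1) // reduction
    return width, height
```

For the example above, background_size(2507, 3300, 3) yields a width of 836 and a height of 1100.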
 
The Color RLE format is a simple run-length encoding scheme for color images
  with a limited number of distinct colors. The data always begin with a text
  header composed of the two characters "R6", the number of columns,
  the number of rows, and the number of color palette entries. All numbers are
  expressed in decimal ASCII. These four items are separated by
  blank characters (space, tab, carriage return, or linefeed) or by comment
  lines introduced by character "#". The last number is followed by
  exactly one character which usually is a linefeed character.
 
The header is followed by the color palette containing three bytes per color
  entry. The bytes represent the red, green, and blue components of the color.
 
The palette is followed by a collection of four-byte integers (most significant
  byte first) representing runs of pixels with an identical color. The twelve
  upper bits of each integer indicate the index of the run color in the
  palette. The twenty lower bits of the integer indicate the run length. Color
  indices greater than 0xff0 are reserved. Color index 0xfff is used for
  transparent runs. Each row is represented by a sequence of runs whose lengths
  add up to the image width. Rows are encoded starting with the top row and
  progressing toward the bottom row.
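
The layout described above can be illustrated with a small encoder. This is a sketch, not the code used by csepdjvu. The function name encode_color_rle and the representation of rows as lists of (color index, run length) pairs are assumptions made for the example, as is the choice of single spaces and a final linefeed in the header.

```python
import struct

TRANSPARENT = 0xFFF  # color index used for transparent runs


def encode_color_rle(width, height, palette, rows):
    """Encode a Color RLE image (sketch only).

    palette: list of (red, green, blue) byte triples.
    rows: list of rows, top row first; each row is a list of
          (color_index, run_length) pairs.
    """
    if len(rows) != height:
        raise ValueError("number of rows does not match the height")
    if len(palette) > 0xFF1:
        raise ValueError("indices greater than 0xff0 are reserved")
    out = bytearray(f"R6 {width} {height} {len(palette)}\n".encode("ascii"))
    for red, green, blue in palette:
        out += bytes((red, green, blue))
    for runs in rows:
        if sum(length for _, length in runs) != width:
            raise ValueError("run lengths must add up to the image width")
        for index, length in runs:
            if index != TRANSPARENT and not 0 <= index < len(palette):
                raise ValueError("invalid color index")
            if not 0 <= length < (1 << 20):
                raise ValueError("run length must fit in twenty bits")
            # Twelve upper bits: color index. Twenty lower bits: run length.
            out += struct.pack(">I", (index << 20) | length)
    return bytes(out)
```

For instance, a 4x1 image with one transparent pixel followed by three red pixels would be encode_color_rle(4, 1, [(255, 0, 0)], [[(TRANSPARENT, 1), (0, 3)]]).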
 
The Bitonal RLE format is a simple run-length encoding scheme for bitonal
  images. The data always begin with a text header composed of the two
  characters "R4", the number of columns, and the number of rows. All
  numbers are expressed in decimal ASCII. These three items are
  separated by blank characters (space, tab, carriage return, or linefeed) or by
  comment lines introduced by character "#". The last number is
  followed by exactly one character which usually is a linefeed character.
 
The rest of the file encodes a sequence of numbers representing the lengths of
  alternating runs of transparent and black pixels. Lines are encoded starting
  with the top line and progressing toward the bottom line. Each line starts
  with a transparent run. The decoder knows that a line is finished when the
  sum of the run lengths for that line is equal to the number of columns in the
  image.
  Numbers in range 0 to 191 are represented by a single byte in range 0x00 to
  0xbf. Numbers in range 192 to 16383 are represented by a two byte sequence:
  the first byte, in range 0xc0 to 0xff, encodes the six most significant bits
  of the number, the second byte encodes the remaining eight bits of the number.
  This scheme allows for runs of length zero, which are useful when a line
  starts with a black pixel, and when a very long run (whose length exceeds
  16383) must be split into smaller runs.
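
The number encoding and the splitting of long runs can be sketched as follows. The function names encode_run and encode_bitonal_row are hypothetical. The row encoder assumes that the run lengths are given in alternating order starting with a transparent run, which may have length zero.

```python
MAX_RUN = 16383  # largest run length that fits in the two-byte form


def encode_run(n):
    """Encode one run length in the one-byte or two-byte form."""
    if n < 0 or n > MAX_RUN:
        raise ValueError("run length out of range")
    if n <= 191:
        return bytes([n])                      # single byte 0x00..0xbf
    # First byte 0xc0..0xff carries the six most significant bits.
    return bytes([0xC0 | (n >> 8), n & 0xFF])


def encode_bitonal_row(runs):
    """Encode the alternating runs of one line (sketch only)."""
    out = bytearray()
    for n in runs:
        while n > MAX_RUN:
            # Split a long run: emit the maximum, then a zero-length run
            # of the other color so that the alternation is preserved.
            out += encode_run(MAX_RUN)
            out += encode_run(0)
            n -= MAX_RUN
        out += encode_run(n)
    return bytes(out)
```

A line that starts with a black pixel is written with a leading zero-length run, for example encode_bitonal_row([0, 5, 3]) for five black pixels followed by three transparent pixels.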
 
The Portable Pixmap format is a well known format for representing color images.
  Check the ppm(5) man page for complete information.
 
The data always begin with a text header composed of the two characters
  "P6", the number of columns, the number of rows, and the maximal
  value of a color component (usually 255). All numbers are expressed in decimal
  ASCII. These four items are separated by blank characters
  (space, tab, carriage return, or linefeed) or by comment lines introduced by
  character "#". The last number is followed by exactly one character
  which usually is a linefeed character.
 
The rest of the file encodes all the pixels. Each pixel is represented by three
  bytes representing the red, green, and blue components of the pixel. Pixels
  are ordered left to right, top to bottom.
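
A background image in this format can be produced with a few lines of code. This is a sketch; the function name encode_ppm is hypothetical, and it assumes a maximal component value of 255 so that each component fits in one byte.

```python
def encode_ppm(width, height, pixels):
    """Encode a binary PPM ("P6") image with a maximal value of 255.

    pixels: sequence of (red, green, blue) triples ordered left to right,
            top to bottom.
    """
    pixels = list(pixels)
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the dimensions")
    out = bytearray(f"P6 {width} {height} 255\n".encode("ascii"))
    for red, green, blue in pixels:
        out += bytes((red, green, blue))
    return bytes(out)
```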
 
Each page is followed by an arbitrary number of comment lines starting with
  character "#" and terminated by a linefeed character. Comment lines
  whose first word starts with a capital letter have special meanings. The
  following constructs are currently defined:
  - *
 
  - # T px:py dx:dy wxh+x+y (string)
    
 
    This construct indicates that the piece of text string must be
      associated with an area of size wxh at position
      x,y relative to the lower left corner of the page. The
      string is UTF-8 encoded. Special characters can be escaped as in
      PostScript using the backslash character. Integers px and
      py represent the position of the current point on the text baseline
      before the text was drawn. The drawing operation then moves the current
      point by dx and dy pixels. When such comments are present,
      csepdjvu produces a hidden text layer for the corresponding
      pages.
  - *
 
  - # L wxh+x+y (url)
    
 
    This construct indicates that a hyperlink to URL url should be
      associated with an area of size wxh at position
      x,y. When such comments are present, csepdjvu
      produces pages with an annotation chunk containing the specified
      hyperlinks.
  - *
 
  - # B count (string) (#pageno)
    
 
    This construct provides outline information for the document. An outline
      entry entitled string is associated with page pageno.
      Integer count indicates how many of the following outline entries
      must be attached to the current entry as subentries. When such comments
      are present in the first page, csepdjvu produces a navigation chunk
      with the specified outline.
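
The three comment constructs can be generated with small formatting helpers. This is a sketch under stated assumptions: the function names are hypothetical, fields are separated by single spaces, and only the backslash and the two parentheses are escaped, which is a conservative reading of the PostScript escaping rule. Each returned line would still need to be UTF-8 encoded and terminated by a linefeed.

```python
def ps_escape(text):
    """Escape backslashes and parentheses with a backslash (assumption)."""
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def text_comment(px, py, dx, dy, w, h, x, y, string):
    """Format a '# T' comment for the hidden text layer."""
    return f"# T {px}:{py} {dx}:{dy} {w}x{h}+{x}+{y} ({ps_escape(string)})"


def link_comment(w, h, x, y, url):
    """Format a '# L' comment describing a hyperlink area."""
    return f"# L {w}x{h}+{x}+{y} ({ps_escape(url)})"


def outline_comment(count, title, pageno):
    """Format a '# B' comment describing one outline entry."""
    return f"# B {count} ({ps_escape(title)}) (#{pageno})"
```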
     
   
CREDITS
This program was initially written by Léon Bottou
  <leonb@users.sourceforge.net> and was improved by Bill Riemers
  <docbill@sourceforge.net> and many others.
 
SEE ALSO
djvu(1), ppm(5), c44(1)