[lug] Combining pdf documents

Kenneth D. Weinert mc at morat.net
Tue Jan 14 08:14:40 MST 2003

On Tue, 14 Jan 2003 15:30:56 +0100
rm at fabula.de wrote:

> On Mon, Jan 13, 2003 at 11:17:05AM -0700, J. Wayde Allen wrote:
> > On Mon, 13 Jan 2003 rm at fabula.de wrote:
>            However: looking at your specific requirements 
> (esp. modifying content like page numbers) i'd strongly advise against
> using pdf as a submission format. PDF is a display format -- almost no
> structural information is present. You'd have to request that all your 
> authors use some special _visual_ markup to tag relevant bits of information
> (like: use Adobe-Comic-Sans for page numbers so our program can identify them
> during processing).

	Not true, actually. There *can* be (note the emphasis) a lot
of structural information in a PDF document. There aren't many tools
that deal with a PDF on this level, to be sure. There is at least one
besides Adobe's, but none that are open source that I'm aware of, and
I've done a lot of looking. (If someone knows about an open source
library that deals with a PDF on the object level, I'd appreciate
hearing about it.)

	Pages can contain Page Labels, so you could renumber all the
pages by adding a PageLabels entry in the document catalog.

	Note that this is not necessarily an easy thing to do - it
might require processing each of the page dictionaries to remove any
existing page label, but most PDFs just rely on the page index as the
page number if it's displayed.
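
	To make the PageLabels idea a bit more concrete, here is a rough
sketch in Python that only builds the text of such a catalog entry. It
assumes the usual layout of a page label number tree: a /Nums array of
zero-based page indices, each followed by a label dictionary with a
style (/S) and a starting number (/St). The function name and the tuple
format are made up for illustration, and actually writing the entry
into a document's catalog (and updating the cross-reference table) is
left out entirely.

```python
def page_labels_entry(ranges):
    """Build a /PageLabels catalog entry as PDF source text.

    ranges is a list of (first_page_index, style, start_number) tuples
    (a hypothetical format chosen for this sketch): first_page_index is
    zero-based, style is one of the PDF numbering styles (D decimal,
    r/R roman, a/A letters), and start_number is the value of the first
    label in that range.
    """
    valid_styles = {"D", "R", "r", "A", "a"}
    nums = []
    # The number tree keys must appear in ascending order.
    for first_page_index, style, start_number in sorted(ranges):
        if style not in valid_styles:
            raise ValueError(f"unknown numbering style: {style!r}")
        nums.append(f"{first_page_index} << /S /{style} /St {start_number} >>")
    return "/PageLabels << /Nums [ " + " ".join(nums) + " ] >>"


if __name__ == "__main__":
    # First four pages labelled i, ii, iii, iv; the rest 1, 2, 3, ...
    print(page_labels_entry([(0, "r", 1), (4, "D", 1)]))
```

	Keep in mind that labels like these only affect what a viewer
reports as the page number; anything printed on the page itself is a
separate problem.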

> > > In theory it should be possible to combine pdf documents by
> > > reading their dictionaries (the last object in a file - the
> > > toplevel/root object so to say) and adding all object trees
> > > to a newly created root object (but you would need to renumber
> > > all objects to avoid duplicated object IDs). Doable, but most
> > > likely not fun ....

	Definitely not fun, and a lot more work than you'd think :)
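
	Just to illustrate the renumbering step from the quoted text,
here is a toy sketch in Python. It does not read or write real PDF
files: each document is assumed to be already parsed into a plain dict
mapping object numbers to values, with indirect references represented
by a made-up Ref class. Every name here is hypothetical. A real merge
would also have to build a new catalog and page tree, handle generation
numbers and streams, and write a fresh cross-reference table, which is
where most of the work is.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Stand-in for an indirect reference such as '12 0 R' (hypothetical)."""
    num: int


def shift_refs(value, offset):
    """Return a copy of value with every Ref moved up by offset."""
    if isinstance(value, Ref):
        return Ref(value.num + offset)
    if isinstance(value, dict):
        return {key: shift_refs(item, offset) for key, item in value.items()}
    if isinstance(value, list):
        return [shift_refs(item, offset) for item in value]
    return value  # numbers, strings, names: nothing to renumber


def merge_object_tables(docs):
    """Combine several {object_number: value} tables without ID clashes."""
    merged = {}
    offset = 0
    for objects in docs:
        for num, value in objects.items():
            merged[num + offset] = shift_refs(value, offset)
        # The next document starts numbering after the highest ID used so far.
        if objects:
            offset += max(objects)
    return merged


if __name__ == "__main__":
    doc = {
        1: {"Type": "Catalog", "Pages": Ref(2)},
        2: {"Type": "Pages", "Kids": [Ref(3)]},
        3: {"Type": "Page", "Parent": Ref(2)},
    }
    for num, value in merge_object_tables([doc, doc]).items():
        print(num, value)
```

	The result still has two catalogs and two page trees, so it is
not yet a valid single document - it only shows that the object IDs no
longer collide.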

/~\ The ASCII        Ken Weinert   Ken.Weinert at ihs.com 
\ / Ribbon Campaign  303-858-6956 (V) 303-705-4258 (F)
 X  Against HTML     GnuPG: 9274F1CE  GnuPG available at http://www.gnupg.org/
/ \ Email!           1D87 3720 BB77 4489 A928  79D6 F8EC DD76 9274 F1CE
Black holes are where God is dividing by zero.
