I need a way to programmatically count the number of pages in PDF files. I'm working in a Windows environment, and I'd rather not use any third-party libraries.

Any suggestions?

1 Answer

If you don't want to use any third-party libraries, then you'll need to parse the internals of the PDF yourself.

Some background: each page in a PDF is represented by a page object. The page object is a dictionary which includes references to the page's content and other attributes. The individual page objects are tied together in a structure called the page tree.

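To count the number of pages, you can scan the PDF, as if it were a text file, for /Type /Page entries. Match /Page exactly: the page tree nodes are marked /Type /Pages and must not be counted, and the whitespace between /Type and /Page is optional. For a simple, uncompressed file the number of /Type /Page entries will equal the number of pages in the document. Be aware that files written as PDF 1.5 or later may store page objects inside compressed object streams, where a plain text scan won't find them.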

Here's an example of what one of these page objects might look like:

10 0 obj    % <-- Page object
<</Type /Page
/Parent 5 0 R
/Resources 20 0 R
/Contents 40 0 R
>>
endobj
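
As a rough illustration, the scan described above could be sketched in Python using only the standard library. The names PAGE_RE, count_pages and count_pages_in_file are made up for this example, and it assumes the page objects are stored uncompressed in the file; it is a sketch, not a robust PDF parser.

```python
import re

# Matches "/Type /Page" (whitespace between the two names is optional),
# but not "/Type /Pages" or any other longer name starting with "Page".
PAGE_RE = re.compile(rb"/Type\s*/Page(?![A-Za-z0-9])")


def count_pages(pdf_bytes: bytes) -> int:
    """Count page objects found by a plain byte scan.

    Assumption: page dictionaries are not hidden inside compressed
    object streams, otherwise this scan will undercount.
    """
    return len(PAGE_RE.findall(pdf_bytes))


def count_pages_in_file(path: str) -> int:
    # Read in binary mode: a PDF is not plain text and may contain
    # arbitrary bytes in its streams.
    with open(path, "rb") as f:
        return count_pages(f.read())
```

Calling count_pages_in_file on a path of your own would return the count for that file. If the result is 0 for a file that clearly has pages, the page objects are probably compressed and this approach won't work without decoding the streams first.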

For more information, I'd recommend taking a look at the PDF specification, in particular the sections on the page tree and page objects.

Thanks for the answer, very useful! – PDF Seeder Dec 15 at 1:57
