Ebook Tips (PDF & TXT)

If you’re following the blog, you read a while back that I’m using the SONY PRS 505 for my reading fix.

One of the biggest issues with any ebook reader is the whole business of how to take a PDF file that has been formatted for letter size, or some obscure size, and get it formatted properly for your ereader.

There are tons of sites that have books in either ePub or LRF formats (which work perfectly on the SONY), and lots of sites with the Mobi format for the Kindle.  But suppose you come across files that are in PDF and you want a simple way to get the font size right, and not have the footers showing up in the middle of the page on the ebook.

Here’s what works for me:

I use Calibre to manage and convert my ebooks, but like any conversion program it has trouble converting PDFs properly.  So this is sort of a pre-conversion to get a PDF ready for Calibre.


One way:

Open the PDF file in ACROBAT PROFESSIONAL (not the reader, but the expensive one).  I still have an old version, ACROBAT PRO 6.0, which came with an early version of the Adobe Photoshop Suite.  There may be other PDF converters that will work as well.

– Save As HTML (I put all my HTML books in their own directory).  When you save the book, make sure the settings tab is set to give it tags if it doesn’t find any.

– Basically, that’s it.  Now add it to your Calibre library and convert it to a format for your specific device.  In my case, I convert all my files to LRF, and in Calibre you can set the font size.  So my books are all set to 10.5 pt.

And from there you just put that converted file (not the PDF file) onto your ebook reader.  All done.  I’ve done a bunch of books this way without any issues.
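If you have a pile of HTML books to convert, the Calibre step can also be scripted. Calibre ships a command-line tool called ebook-convert, and the sketch below just builds and runs that command from Python. It assumes ebook-convert is on your PATH; the file names and the helper function names are made up for illustration, and the 10.5 is simply my own font size setting.

```python
import subprocess


def build_convert_command(src, dst, font_size_pt=10.5):
    """Build the ebook-convert argument list for one book.

    The output format is taken from the extension of dst (e.g. .lrf).
    --base-font-size sets the base font size in points.
    """
    return ["ebook-convert", src, dst, "--base-font-size", str(font_size_pt)]


def convert(src, dst, font_size_pt=10.5):
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    subprocess.run(build_convert_command(src, dst, font_size_pt), check=True)


if __name__ == "__main__":
    # Hypothetical file names, just to show the shape of the command.
    print(" ".join(build_convert_command("book.html", "book.lrf")))
```

Calling convert("book.html", "book.lrf") would do the actual conversion. I haven't tried every option this way, so treat it as a starting point rather than a recipe.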

Second Way:

If you don’t have ACROBAT PROFESSIONAL, open the PDF file in a recent Adobe Reader (I’m using 9.0 at the time of this writing).  Save it as a TXT file.

Now you may be able to directly import that TXT file and do the rest of your conversion to get the right font size etc. in Calibre.  But sometimes the conversion to TXT sticks TAB characters (hidden) into the TXT file and the formatting gets screwed up.  So:

I use Open Office to do a global replace of all the tab characters with newlines, and this works perfectly.  You can use any word processor that can find the tab character and replace it with a newline character.  In Open Office, open Find & Replace and turn Regular expressions on, then replace \t with \n (tab with newline).
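If you'd rather not open a word processor at all, the same tab-to-newline replacement can be done with a few lines of Python. This is only a minimal sketch: the function names are my own, and it assumes the TXT file is UTF-8 (you may need a different encoding depending on how the file was saved).

```python
from pathlib import Path


def tabs_to_newlines(text):
    """Replace every tab character with a newline."""
    return text.replace("\t", "\n")


def fix_txt_file(src, dst, encoding="utf-8"):
    """Read src, swap tabs for newlines, and write the result to dst."""
    raw = Path(src).read_text(encoding=encoding, errors="replace")
    Path(dst).write_text(tabs_to_newlines(raw), encoding=encoding)
```

For example, fix_txt_file("book.txt", "book_fixed.txt") leaves the original alone and writes the cleaned copy next to it, which you can then import into Calibre.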

Then into Calibre to complete the transformation as necessary.

I do have one question.  I know that the Kindle Store will do PDF conversions.  If anyone has a Kindle and has done this conversion – does this result in a properly formatted file for the Kindle?


Published by


My name is Dave Beckerman. I am a fine art photographer working in New York City.