Part of a multipart tutorial on How To Publish An Ebook: Convert to HTML.
You may not know this, but an ebook is a lot like a web page. You're looking at a web page right now, and what makes it work is a markup language called HTML. The people who keep track of internet standards can tell you the difference between HTML4, HTML5, XHTML, XML and several other similar data format standards. You probably don't care. I'll be inexact in my terminology: when I say HTML I may be playing fast-and-loose and really mean XHTML instead. Will this hurt anything?
I hope not. I'd rather you get a few niggling details wrong and get the overall concept right.
Let's suppose you've gone through the process of writing a book. And that book just happens to be in DOCX format. This is the file format that Microsoft Word uses. If you use another word processor, you can probably get it converted to DOC or DOCX format easily enough. Otherwise, ask and we'll work out that contingency.
There are a lot of ways to convert a DOCX file into HTML, and they all work fairly well. However, they tend to generate bloated HTML code. The same formatting can generally be represented in HTML in many different ways, and a Word document can carry a lot of odd formatting that anyone might have put in for any reason. But that's not you, because you're writing an ebook.
That's why I like a clean, lean, light-weight HTML translation that's relatively minimalist. (And if your ebook design is not minimalist, you're doing something wrong.)
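To make "bloated" versus "lean" concrete, here is a minimal Python sketch of the kind of cleanup I mean. It is not Rick Boatright's code and I don't know how his translator works internally; the names LeanHTML and slim, and the lists of tags and attributes to drop or keep, are my own assumptions for illustration. It uses only Python's standard html.parser module.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: these tags usually carry only Word styling, so the tags are
# dropped while the text inside them is kept.
DROP_TAGS = {"span", "font"}
# Assumption: these are the only attributes worth keeping; everything else
# (class, style, lang, ...) is discarded.
KEEP_ATTRS = {"href", "src", "alt", "id"}


def format_attrs(attrs):
    """Rebuild the attribute string from the attributes we chose to keep."""
    return "".join(
        f' {name}="{escape(value or "", quote=True)}"'
        for name, value in attrs
        if name in KEEP_ATTRS
    )


class LeanHTML(HTMLParser):
    """Re-emit HTML without styling tags, styling attributes, or comments."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in DROP_TAGS:
            self.parts.append(f"<{tag}{format_attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are written XHTML-style.
        if tag not in DROP_TAGS:
            self.parts.append(f"<{tag}{format_attrs(attrs)} />")

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape &, <, >.
        self.parts.append(escape(data, quote=False))


def slim(html_text):
    """Return a leaner version of html_text."""
    parser = LeanHTML()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


bloated = (
    '<p class="MsoNormal" style="margin:0in">'
    '<span lang="EN-US">Hello <b>world</b></span></p>'
)
print(slim(bloated))  # prints <p>Hello <b>world</b></p>
```

The paragraph and the bold survive; the Word-specific class, style and span wrappers do not. That is about all the markup a minimalist ebook needs.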
That's where Rick Boatright's translation comes in handy for me. Go here to see what I mean.
You'll see two boxes. One on the left and another on the right.
Go into Word. Hit Control-A to select everything. Then hit Control-C to copy everything.
Go into your browser and Rick's translation page. Click in the left box and hit Control-V to paste everything.
Select the checkboxes you want, then click the button marked "Clean up Word Text" and wait for a few minutes--depending on how long your document is. If it barfs, break up your ebook into chapters and try again.
As you get each piece of your ebook translated to HTML, click in the right box and hit Control-A to select everything, then Control-C to copy everything. Then paste what you copied into a Notepad file and save it with an extension of HTML in your project directory.
Now is a good time to look for badly translated symbols like smart-quotes, copyright or trademark symbols and other bits of noise that'll hurt the appearance of your ebook. It's best to find these errors as early in the workflow as possible to avoid rework.
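If eyeballing a long chapter for stray symbols sounds tedious, a small script can point you at the places worth checking. This is only a sketch under my own assumptions: the function name find_suspect_characters is made up, and it simply flags every character outside plain ASCII, which catches smart-quotes, copyright and trademark symbols, and the replacement character that shows up when an encoding goes wrong. Flagged characters are not necessarily errors; they're just the spots to look at.

```python
import unicodedata


def find_suspect_characters(text):
    """Return (line, column, character, Unicode name) for each non-ASCII character."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                findings.append((line_no, col, ch, name))
    return findings


# Sample text with smart-quotes and a copyright sign, written as escapes
# so the script itself stays plain ASCII.
sample = "He said \u201chello\u201d\nCopyright \u00a9 the author"
for line_no, col, ch, name in find_suspect_characters(sample):
    print(f"line {line_no}, column {col}: {ch!r} ({name})")
```

To check a real chapter, read the saved file into a string first, for example with open("chapter01.html", encoding="utf-8").read(), where chapter01.html is a hypothetical file name, and pass that string to the function.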
Do you have to use Rick Boatright's translator? No. Can you use other HTML translators? Probably. I'm only telling you what worked for me. And you might have some other way that works better. I certainly haven't cornered the market on truth. Can you avoid Word altogether and use another tool that generates clean HTML automatically? I dunno. Haven't tried.
Let me know if you have tried something different--like, say, Scrivener.
(You can find the bullet-point outline of How To Publish An Ebook here.)