Scrivener .mobi file size versus hand-coded html

tomwood
Posts: 98
Joined: Sun Dec 31, 2017 5:28 pm
Platform: Windows

Mon Jun 11, 2018 11:15 am Post

I attended Robin Sullivan's presentation on how to code an ebook from a Word source file:

https://www.meetup.com/DC-Write-To-Publ ... 249564513/

I said that I use Scrivener to produce a .mobi, that it is a lot easier than the hand-coded method, and asked why anyone would go that route. (I write fiction, so the book is simple: a cover image followed by nothing but text.) She said she thinks the Scrivener file is likely to be larger than the hand-coded file, enough to affect the download fee that Amazon charges.

Has anyone compared the size of a Scrivener .mobi with a hand-coded version? Would the difference in file size be large enough to show up in the Amazon download fee?
I have a very odd feeling about this...

AmberV
Posts: 21834
Joined: Sun Jun 18, 2006 4:30 am
Platform: Mac + Linux
Location: Santiago de Compostela, Galiza

Wed Jun 13, 2018 6:31 pm Post

In general it’s going to be very difficult to beat a hand-coded book with machine-generated HTML/CSS. A human knows when to cut corners and how best to do so. Beyond that, the current iteration of Scrivener produces a lot of duplication in its CSS output, since it generates a new .css file for each formal section within the ebook structure. In other words, if you have fifteen identically formatted chapters in a novel, you’ll still get fifteen .css files all largely saying the same thing over and over.
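For anyone who wants to check how much CSS duplication their own compiled book contains, here is a minimal sketch in Python. It assumes the ebook's HTML/CSS sources have already been unpacked into a folder, and it only catches stylesheets that are byte-for-byte identical; files that "largely" repeat each other with small differences would need a smarter comparison. The function name `find_duplicate_css` and the folder name in the usage note are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_css(root):
    """Group the .css files under `root` by content hash.

    Returns a list of groups; each group is a list of paths whose
    contents are byte-for-byte identical. Only groups with more than
    one file are returned.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*.css")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicate_css("unpacked_book")` on an unpacked copy of the book returns one list per set of identical stylesheets, which gives a rough idea of how many bytes a single shared .css file could save.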

Of course, your question is how much you would save by hand-coding. I have no idea; answering that with hard data would be very time-consuming, and the answer would also depend to a degree upon the skill of the writer. :)

I would say a good balance between automation and good clean code would be with a Markdown-based workflow. Generate an ePub with a conversion tool like Pandoc or MultiMarkdown, and you’d be hard-pressed to beat that with hand-coding. I don’t know if we’re planning to take the same approach with the 3.0 beta as we have on the Mac, but if so, we actually use this method ourselves internally now. When people use Scrivener to generate an ePub 3 or KF8, their work is converted to MultiMarkdown internally, and then the result is passed through a bunch of processing that adds compile logic back into the HTML semantics. It’s not quite as clean as raw Markdown output, but very close, and not in the same realm as rich text to HTML conversion.

By the way: I’m a proponent of hand-coding whenever one can. When I was maintaining the Literature & Latte website, before the recent revamp, that was 100% hand-coded from the ground up. When I put together a newsletter that we send out, I write each line of it in a text editor, saving every byte that I can, because that really matters when you’re sending out thousands of them. If I were publishing eBooks, I would absolutely spend time editing the raw source directly and making it as efficient as possible. I would use Markdown (and Pandoc) to get the basic structure in place and all of the busy work out of the way.

That said, I don’t know how much of that has to do with saving every kilobyte. That is certainly part of why that is my opinion, but another part is that I care about every strut of what I’m making, not just how it looks on the surface. It’s a bit ideological, you might say.
.:.
Ioa Petra'ka
“Whole sight, or all the rest is desolation.” —John Fowles

tomwood
Posts: 98
Joined: Sun Dec 31, 2017 5:28 pm
Platform: Windows

Thu Jun 14, 2018 4:26 pm Post

Thank you for the thoughtful response!

At some point I'll take a run at a hand-coded file just to satisfy my curiosity. And I do appreciate the desire to know the struts and structure.

Extrapolating from my current word count, it appears I'll end up around 1.5 MB for the final .mobi coming out of Scrivener. KDP says they will do some compression on the cover image, which is the single largest item affecting file weight, so it may end up smaller.

Research on the web indicates that the median Kindle file is about 1.9 MB, so I'm in the ballpark. They charge $0.15/MB for the download fee, rounded to the nearest KB, so keeping it under 2 MB is my initial goal.
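To put rough numbers on that, here is a small Python sketch of the fee arithmetic as described in this post: $0.15 per megabyte, with the size rounded to the nearest kilobyte. The function name `delivery_fee` is made up for illustration, and treating 1 MB as 1024 KB is an assumption; Amazon's exact rounding and current rates should be checked against the KDP documentation.

```python
def delivery_fee(size_bytes, rate_per_mb=0.15):
    """Estimate the per-download fee in dollars for a file of `size_bytes`.

    Assumptions (not confirmed against KDP): the size is rounded to the
    nearest kilobyte, 1 KB = 1024 bytes, and 1 MB = 1024 KB.
    """
    size_kb = round(size_bytes / 1024)   # round to the nearest KB
    size_mb = size_kb / 1024             # convert KB to MB
    return size_mb * rate_per_mb


# Example: compare a 1.5 MB file with a 2 MB file.
for megabytes in (1.5, 2.0):
    fee = delivery_fee(int(megabytes * 1024 * 1024))
    print(f"{megabytes} MB -> about ${fee:.3f} per download")
```

Under these assumptions, each additional 100 KB adds about a cent and a half to the fee, so trimming a few kilobytes of CSS matters far less than the size of the cover image.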

AmberV
Posts: 21834
Joined: Sun Jun 18, 2006 4:30 am
Platform: Mac + Linux
Location: Santiago de Compostela, Galiza

Thu Jun 14, 2018 6:35 pm Post

Oh, with Kindle, you may already know this and be factoring it in, but be aware that the .mobi file the KindleGen utility creates will be roughly twice as large as the actual ebook file that is sent to readers, and it is the smaller delivered file that you will be charged for. The .mobi contains both the old legacy MobiPocket format and the newer spec. When you upload to KDP it splits them apart (or that happens when it sends, I’m not exactly sure).
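As a back-of-the-envelope illustration of that rule of thumb, the following Python sketch estimates the delivered size from the size of the KindleGen .mobi. The one-half ratio is only the rough "twice as large" figure given above, not a measured value, and the function name `estimated_delivered_bytes` is hypothetical.

```python
def estimated_delivered_bytes(kindlegen_mobi_bytes, delivered_fraction=0.5):
    """Roughly estimate the size of the file actually sent to readers.

    `delivered_fraction` is an assumption based on the rule of thumb that
    the KindleGen .mobi is about twice the size of the delivered book.
    """
    return int(kindlegen_mobi_bytes * delivered_fraction)


# Example: a 1.5 MB KindleGen .mobi would be delivered at roughly 0.75 MB.
mobi_size = int(1.5 * 1024 * 1024)
print(estimated_delivered_bytes(mobi_size) / (1024 * 1024), "MB (estimate)")
```

The real ratio depends on the book, so the size KDP reports after upload is the figure to trust.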

That 2 MB median figure seems awfully high to me. Looking through the contents of my Kindle reader, I don’t even have any books that large, and that includes a few omnibuses and all-in-one trilogies! The largest books are those with several illustrations. Most of the typically sized novels are in the 500 KB range. Perhaps the figures you’ve seen are reporting the oversized KindleGen .mobi.

tomwood
Posts: 98
Joined: Sun Dec 31, 2017 5:28 pm
Platform: Windows

Thu Jun 14, 2018 7:13 pm Post

I was wondering about that, whether the KindleGen .mobi contained both formats. Good to know.

I can't find the link now, but you may be right about that median size data referring to the oversized .mobi.

Thanks!