Kindlegen error about multiple titles with Pandoc epub export

ptmkenny
Posts: 37
Joined: Fri May 04, 2018 8:43 am
Platform: Mac + Windows

Thu May 17, 2018 2:45 pm Post

Error from Kindle Previewer when attempting to open an ePub file compiled from Scrivener:

Code:

*************************************************************
 Amazon kindlegen(MAC OSX) V2.9 build 0830-03578f
 A command line e-book compiler
 Copyright Amazon.com and its Affiliates 2015
*************************************************************

Info:I9026:option: (hidden) amazon creator tool or pipeline
Error(opfparser):E20006: There are more than one title defined in OPF metadata. But none of them is refined with "title-type" as "main" title. Refer http://idpf.org/epub/30/spec/epub30-publications.html#sec-opf-dctitle for more info.
Error(prcgen):E21011: The book title was not set. Please set the title before generating the mobi.


Content of content.opf from the ePub:

Code:

<?xml version="1.0" encoding="utf-8"?>
<package version="3.0" unique-identifier="epub-id-1" prefix="ibooks: http://vocabulary.itunes.apple.com/rdf/ibooks/vocabulary-extensions-1.0/" xmlns="http://www.idpf.org/2007/opf">
  <metadata xmlns:opf="http://www.idpf.org/2007/opf" xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="epub-id-1">urn:uuid:D2A497E8-EB7F-4759-9658-2A5CB9E29D9A</dc:identifier>
    <dc:title id="epub-title-1">NovelTemplate</dc:title>
    <dc:title id="epub-title-2">NovelTemplate</dc:title>
    <dc:date id="epub-date">2018-05-17T14:37:40Z</dc:date>
    <dc:language>en</dc:language>
    <dc:creator id="epub-creator-1">PK</dc:creator>
    <meta property="role" scheme="marc:relators" refines="#epub-creator-1">aut</meta>
    <dc:creator id="epub-creator-2">PK</dc:creator>
    <meta property="dcterms:modified">2018-05-17T14:37:40Z</meta>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="nav" href="Text/nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="style" href="Styles/stylesheet1.css" media-type="text/css"/>
    <item id="title_page_xhtml" href="Text/title_page.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch001_xhtml" href="Text/ch001.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch002_xhtml" href="Text/ch002.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch003_xhtml" href="Text/ch003.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch004_xhtml" href="Text/ch004.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch005_xhtml" href="Text/ch005.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="title_page_xhtml" linear="yes"/>
    <itemref idref="ch001_xhtml"/>
    <itemref idref="ch002_xhtml"/>
    <itemref idref="ch003_xhtml"/>
    <itemref idref="ch004_xhtml"/>
    <itemref idref="ch005_xhtml"/>
  </spine>
  <guide>
    <reference type="toc" title="NovelTemplate" href="Text/nav.xhtml"/>
  </guide>
</package>


The title appears twice, as kindlegen complains:

Code:

    <dc:title id="epub-title-1">NovelTemplate</dc:title>
    <dc:title id="epub-title-2">NovelTemplate</dc:title>
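
For anyone wanting to check an OPF file for this condition before handing it to Kindle Previewer, here is a minimal sketch in Python. It counts the dc:title elements and looks for a "title-type" refinement with the value "main", which is what the kindlegen message asks for. The function name check_titles and the trimmed SAMPLE string are made up for illustration; this is not how kindlegen itself performs the check.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def check_titles(opf_xml: str) -> tuple[int, bool]:
    """Return (number of dc:title elements, whether one is refined as "main").

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(opf_xml)
    metadata = root.find(f"{{{OPF_NS}}}metadata")
    titles = metadata.findall(f"{{{DC_NS}}}title")
    # A refinement points at a title through refines="#<id of the title>".
    title_refs = {"#" + t.get("id") for t in titles if t.get("id")}
    has_main = any(
        meta.get("property") == "title-type"
        and meta.get("refines") in title_refs
        and (meta.text or "").strip() == "main"
        for meta in metadata.findall(f"{{{OPF_NS}}}meta")
    )
    return len(titles), has_main


# Trimmed-down version of the metadata shown above (assumed sample input).
SAMPLE = """<package version="3.0" xmlns="http://www.idpf.org/2007/opf">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title id="epub-title-1">NovelTemplate</dc:title>
    <dc:title id="epub-title-2">NovelTemplate</dc:title>
  </metadata>
</package>"""

count, has_main = check_titles(SAMPLE)
print(count, has_main)  # prints: 2 False
```

Two titles with no "main" refinement is exactly the combination that E20006 reports.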


How to reproduce

1. Create a basic novel.
2. On the Compile screen, set the "Title" and "Authors" and leave other metadata blank.
3. Under Metadata on the Project Formats screen, make sure everything is blank.
4. Compile from Pandoc to ePub. (The problem occurs with both ePub 2 and ePub 3.)

Versions:

Scrivener: 3.0.2
OS X: 10.13.4
Pandoc: 2.2.1, installed via package
Kindle Previewer: 3.22.0

KB
Site Admin
Posts: 20240
Joined: Tue Jun 13, 2006 11:23 pm
Platform: Mac
Location: Truro, Cornwall
Contact:

Thu May 17, 2018 6:23 pm Post

This seems to be a bug in Pandoc. You can check that Scrivener's side of things is all fine by ticking "Save source files in a folder with exported file" in the Compile options. If you open the resulting .xml and .txt files, you will see that there is no problem. These are the files that Scrivener is passing to Pandoc.

A Google search confirms that this is indeed a Pandoc issue:

https://groups.google.com/forum/#!topic ... iR_wXqLHxk

It might be worth trying to convert the ePub using KindleGen to see if that works (although I would guess that Kindle Previewer uses KindleGen internally).
"You can't waltz in here, use my toaster, and start spouting universal truths without qualification."

nontroppo
Posts: 1053
Joined: Mon Mar 05, 2007 5:22 pm
Platform: Mac
Location: Airstrip One

Fri May 18, 2018 7:00 am Post

This is not a bug in Pandoc. If you compile to Multimarkdown (pandoc flavour), and use pandoc directly:

Code:

pandoc -t epub -o test.epub test.md


Then only one dc:title is created. What seems to be happening is that the metadata is not being mapped to the correct YAML fields. So for example this metadata:

Screen Shot 2018-05-18 at 14.49.44_SMALL.png


... gets converted to this invalid metadata (Pandoc requires YAML formatting):

Code:

% TEST
% Joanna Doe


Whereas when I compile to Pandoc flavoured Multimarkdown I get this by default:

Code:

---
Title: TEST 
Author: Joanna Doe
---


This is better but still not valid. Pandoc metadata is case sensitive and it should be:

Code:

---
title: TEST 
author: Joanna Doe
---


This then generates an EPUB without problems. It should be very easy for Keith to fix this, as it already works for Pandoc output via the Multimarkdown option, with the proviso that the title and author keys should be lowercase by default. The user can change the case of title/author manually, but it would be nice if Scrivener set this by default. Because MMD also supports YAML-style metadata, I did ask previously whether it wouldn't be easier for Keith to use YAML by default; that would mean just one metadata style to convert, irrespective of the compile type...
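
Until that happens, the manual case fix could be scripted. Below is a minimal Python sketch that lowercases the top-level keys in a leading metadata block. The function name lowercase_metadata_keys is made up, and lowercasing every top-level key is an assumption that suits title and author but might not suit custom, user-defined fields. It works on plain text and does not parse YAML properly.

```python
def lowercase_metadata_keys(text: str) -> str:
    """Lowercase top-level keys in a leading '---' metadata block.

    Hypothetical helper; assumes simple "Key: value" lines and leaves
    indented (nested) lines and the document body untouched.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return text  # no leading metadata block, nothing to do
    out = [lines[0]]
    in_block = True
    for line in lines[1:]:
        if in_block and line.strip() in ("---", "..."):
            in_block = False  # closing delimiter of the block
            out.append(line)
        elif in_block and ":" in line and not line[0].isspace():
            key, rest = line.split(":", 1)
            out.append(key.lower() + ":" + rest)
        else:
            out.append(line)
    return "\n".join(out)


SAMPLE = "---\nTitle: TEST\nAuthor: Joanna Doe\n---\n\nBody Text: unchanged"
result = lowercase_metadata_keys(SAMPLE)
print(result)
```

With the sample above, the two keys inside the block become title and author, while the "Body Text:" line after the block is left alone.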

KB
Site Admin
Posts: 20240
Joined: Tue Jun 13, 2006 11:23 pm
Platform: Mac
Location: Truro, Cornwall
Contact:

Fri May 18, 2018 2:10 pm Post

???? Since when did Pandoc require YAML formatting? Scrivener has always output Pandoc metadata using the percent-sign title block, as per the docs:

https://pandoc.org/MANUAL.html#metadata-blocks

According to the docs, either the percent-based metadata (pandoc_title_block) or YAML metadata blocks (yaml_metadata_block) can be used, so I'm not sure you are right that this isn't a bug in Pandoc.

EDIT: Never mind, I see that the documentation specifies a different (YAML) sort of metadata for ePub, which is a bit silly.

KB
Site Admin
Posts: 20240
Joined: Tue Jun 13, 2006 11:23 pm
Platform: Mac
Location: Truro, Cornwall
Contact:

Fri May 18, 2018 2:30 pm Post

Actually, it turns out that this has nothing at all to do with the format of the metadata provided. The problem is that Scrivener is including a metadata block *at all*, because Scrivener is already providing the metadata as an XML file using --epub-metadata (the same XML metadata that is generated for non-Pandoc epub files). Because Scrivener is providing a metadata block inside the Pandoc file too, Pandoc merges that information with the provided metadata.xml file, resulting in duplicate entries. I have fixed this for the next update by simply not providing any Pandoc metadata when exporting to epub, since the metadata.xml file is already being fed in.

KB
Site Admin
Posts: 20240
Joined: Tue Jun 13, 2006 11:23 pm
Platform: Mac
Location: Truro, Cornwall
Contact:

Fri May 18, 2018 4:42 pm Post

This really *does* seem to be a bug in Pandoc. If I omit the metadata from the top of the text, then no title page is generated. But if I provide that metadata to generate a title page, then it also gets merged into the metadata.xml file I provide and results in this problem. That doesn't seem like intentional behaviour to me.

nontroppo
Posts: 1053
Joined: Mon Mar 05, 2007 5:22 pm
Platform: Mac
Location: Airstrip One

Fri May 18, 2018 5:06 pm Post

Keith, you are correct; I'd forgotten that % metadata was a default extension (I assumed it was an MMD compatibility extension). Everyone I know using Pandoc uses YAML as it is much more flexible and extensible, and integrates with the Pandoc templates and output more closely. Anyway, glad that you figured out the real issue.

ptmkenny
Posts: 37
Joined: Fri May 04, 2018 8:43 am
Platform: Mac + Windows

Sat May 19, 2018 9:50 am Post

KB wrote:This really *does* seem to be a bug in Pandoc. If I omit the metadata from the top of the text, then no title page is generated. But if I provide that metadata to generate a title page, then it also gets merged into the metadata.xml file I provide and results in this problem. That doesn't seem like intentional behaviour to me.


I'm not sure if this is helpful, but this is from the Pandoc manual (https://pandoc.org/MANUAL.html#extension-yaml_metadata_block):

A document may contain multiple metadata blocks. The metadata fields will be combined through a left-biased union: if two metadata blocks attempt to set the same field, the value from the first block will be taken.


The "document" here appears to refer to everything being compiled in one pass (meaning that multiple files are treated as one document). So, this does appear to be a bug in Pandoc: according to the documentation, when the same value is set multiple times, Pandoc should prioritize one of the metadata blocks and use that value, not duplicate the values.
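
To make the quoted "left-biased union" concrete, here is a small Python sketch of that merge rule applied to plain dictionaries. The function name and the sample field values are made up, and it only models the rule as the manual words it for metadata blocks inside a document. It says nothing about how Pandoc combines a block with an --epub-metadata file, which is the case in question here.

```python
def left_biased_union(*blocks: dict) -> dict:
    """Merge metadata blocks so that the first block to set a field wins."""
    merged: dict = {}
    for block in blocks:
        for key, value in block.items():
            # setdefault keeps an existing value, so earlier blocks take priority.
            merged.setdefault(key, value)
    return merged


first = {"title": "From first block"}
second = {"title": "From second block", "author": "Joanna Doe"}

merged = left_biased_union(first, second)
print(merged)  # prints: {'title': 'From first block', 'author': 'Joanna Doe'}
```

Under this rule a field that is set twice ends up with a single value, which is why two dc:title entries in the output look inconsistent with the documented behaviour.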

KB
Site Admin
Posts: 20240
Joined: Tue Jun 13, 2006 11:23 pm
Platform: Mac
Location: Truro, Cornwall
Contact:

Sat May 19, 2018 11:46 am Post

I've worked around the issue as follows:

1. In the metadata.xml file passed in to --epub-metadata, I no longer include the title or author elements for Pandoc > ePub.

2. I add the title and authors in the metadata block at the top of the Pandoc .txt file (using YAML for epub format here in case multiple authors are added).

This ensures that the title page gets created from the metadata block and that there is no duplication in the OPF file. All this is done for 3.0.3, which should be released next week (fingers crossed).
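
Until 3.0.3 is out, the first step of the workaround could in principle be applied by hand to the saved source files. The following Python sketch illustrates the idea and is not Scrivener's actual code. It assumes the metadata file is a bare sequence of Dublin Core elements with no root element, which is the form the Pandoc manual describes for --epub-metadata. The function name and the FRAGMENT sample are made up.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialize with the usual "dc:" prefix


def drop_title_and_creator(fragment: str) -> str:
    """Return the fragment without its dc:title and dc:creator elements.

    Hypothetical helper: wraps the rootless fragment in a temporary root
    element so that it can be parsed, then serializes the kept children.
    """
    wrapped = f'<metadata xmlns:dc="{DC_NS}">{fragment}</metadata>'
    root = ET.fromstring(wrapped)
    dropped = {f"{{{DC_NS}}}title", f"{{{DC_NS}}}creator"}
    kept = [child for child in root if child.tag not in dropped]
    return "\n".join(ET.tostring(c, encoding="unicode").strip() for c in kept)


# Assumed sample input, loosely based on the OPF shown earlier in the thread.
FRAGMENT = """
<dc:title>NovelTemplate</dc:title>
<dc:creator>PK</dc:creator>
<dc:language>en</dc:language>
"""

filtered = drop_title_and_creator(FRAGMENT)
print(filtered)
```

Note that ElementTree repeats the xmlns:dc declaration on each serialized element. Whether Pandoc accepts the fragment in that form has not been tested here.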

All the best,
Keith