What is an EPUB file?

Anyone involved in eBooks will by now have come across the EPUB file type and may be wondering what it is. This article aims to answer a few questions:
1. What is an EPUB file?
2. How does an EPUB work?
3. How do you open an EPUB?
4. How do you create an EPUB?
5. What does the EPUB file give the trainer?

  1. What is an EPUB file?

Let’s start with a general overview. You will be familiar with many file types, from .doc and .xls files through to the .pdf. Each file type has a different structure and is created and read in different programmes. For eBooks there are a number of file formats in use, namely:

                              PDF

                              EPUB

                              MOBI

                              AZW or AZW3

                              IBA

                              .txt

The PDF was developed by Adobe and is a fixed-layout format which ‘locks’ the arrangement of text and graphics created in other programmes, allowing transfer without distortion. It is not an image file as such, but its pages do not reflow to fit the screen. A PDF can be read in many different PDF readers. It can also have a degree of interactivity (see below).

The MOBI file format originated with Mobipocket and was the format Amazon originally used for its Kindle product. Unlike a PDF it is ‘reflowable’. This means that when the text is presented on a screen, the words and spaces of the page are automatically reorganised to suit the screen of the reader. Amazon later changed the DRM protection of the books it sold and, although using much the same file structure, moved to the AZW format.
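
To make ‘reflowable’ concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not how any real reading system is implemented: the `reflow` helper and the two screen widths are assumptions made up for this example, and a real reader works with fonts and pixels rather than character counts.

```python
import textwrap


def reflow(text: str, screen_width: int) -> str:
    """Re-wrap a paragraph for a screen `screen_width` characters wide."""
    # Discard the existing line breaks, then wrap again for the new width.
    return textwrap.fill(" ".join(text.split()), width=screen_width)


paragraph = (
    "A reflowable eBook does not fix where each line ends. "
    "The reading system decides that once it knows the screen size."
)

# Same content on two hypothetical devices: a phone and a tablet.
for width in (30, 60):
    print(f"--- {width} characters wide ---")
    print(reflow(paragraph, width))
```

The words are identical in both outputs; only the line breaks move. A fixed-layout format such as PDF keeps the line breaks where the author put them.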

The AZW and later AZW3 formats are what is required for reading on a Kindle and because of their proprietary nature are only available from the Amazon store.

The other proprietary format is the IBA, which you are unlikely to stumble across since it is created by iBooks Author for use solely in the iTunes store.

The EPUB is now in its third incarnation as EPUB 3. Because EPUB is free to use, an open standard, and vendor-independent, it has grown to become the most common eBook format. It was initially developed by the International Digital Publishing Forum (which is now part of the World Wide Web Consortium, the W3C), and the W3C still maintains the file standards: https://www.w3.org/

The EPUB allows for text, images, video, web links, interactive forms, interactive images and a host of other features.

The EPUB is also the most widely supported eBook format and can be read on a variety of devices, including computers, smartphones, tablets, and most ereaders (except Kindles). EPUB files are reflowable. DRM is not built into every EPUB, but the format can carry strong DRM copy protection applied by the distributor.

So in summary:

Format   Platforms               Reflowable   Fixed Layout   DRM   Interactivity
.txt     All                     Yes          No             No    No
.azw     Kindle                  Yes          Yes            Yes   Yes
.EPUB    All but Kindle          Yes          Yes            Yes   Yes
.mobi    All but Nook and Sony   Yes          Yes            Yes   Yes
.iba     iBooks                  Yes          Yes            Yes   Yes
.pdf     All                     No           Yes            Yes   Limited

Various Sources but especially: https://learn.g2.com/ebook-formats

 

  2. How does an EPUB work?

The best phrase used to describe an EPUB is that it is a ‘web site in a box’ – or to quote W3 – ‘The EPUB format provides a means of representing, packaging and encoding structured and semantically enhanced Web content — including HTML, CSS, SVG and other resources — for distribution in a single-file container’ https://www.w3.org/

So let’s unpack that technical jargon. A web site is structured as a series of HTML files which carry the text of the site, a style sheet which defines the colours, fonts, heading styles etc., and a series of other files including the images and, for any animation or interactivity, JavaScript files. To make sense of it all and view the finished result you need a web browser like Chrome.

An EPUB is very similar and could be represented by the schematic below:

[Figure: EPUB schematic]

(https://www.edrlab.org/open-standards/anatomy-of-an-EPUB-3-file/)

All zipped together, the EPUB has a similar structure to the components of a web site. The mimetype file is merely a small text file that tells the reader ‘this is an EPUB’. The main content is typically housed in a directory named “OEBPS”. Why such a strange name? This is simply historical: Open eBook Publication Structure was the name of a legacy ebook format which has been superseded by the EPUB format. In the OEBPS directory are the text files (HTML), the fonts, the images, any video content and any interactive elements in JavaScript files. This is all bundled with the essential .opf file. The OPF is the package document: it carries the book’s metadata, a manifest of all the other files and the reading order (the ‘spine’). The diagram also shows a table of contents file, the .ncx. In EPUB 3 the table of contents is an HTML navigation document, so the .ncx is not always required; it is a legacy of the earlier EPUB 2 format, whose readers need this form of indexing.

Finally, the whole structure is tied together by some metadata files: a META-INF directory holds a container.xml file which tells the reader where to find the .opf. Thus we have a ‘website’ with all the functions and possibilities of a website but contained within a single file. This can be saved and read offline in an EPUB reader, without the need for a web browser.
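
Because an EPUB is a zip container, its structure can be examined with a few lines of Python and nothing beyond the standard library. The sketch below is offered as an illustration rather than a definitive tool: `describe_epub` is a name invented for this example, and it assumes a well-formed, unencrypted EPUB. It reads the mimetype file, follows META-INF/container.xml to the .opf file, and lists the manifest.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the container file and the package (.opf) file.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def describe_epub(source):
    """Return (mimetype, path of the .opf file, manifest entries).

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        # The small text file that says 'this is an EPUB'.
        mimetype = archive.read("mimetype").decode("ascii").strip()

        # container.xml points the reader at the package (.opf) file.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")

        # The manifest inside the .opf lists every other file in the book.
        package = ET.fromstring(archive.read(opf_path))
        manifest = [
            (item.get("href"), item.get("media-type"))
            for item in package.findall("opf:manifest/opf:item", OPF_NS)
        ]
    return mimetype, opf_path, manifest
```

Calling `describe_epub("book.epub")` on a file of your own (the file name here is only a placeholder) should return `application/epub+zip` as the first value, followed by the location of the .opf file and the list of HTML, image and font files it declares.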

  3. How do you open an EPUB?

The easiest way to open an EPUB is to use Calibre, which is free (https://calibre-ebook.com/). Calibre opens an EPUB and also allows the EPUB structure to be examined; where there are issues within the HTML code, these can be edited. Most people involved in the eBook industry will have a copy of this software close at hand. Other similar tools include:

                          EPUBor Reader - https://www.epubor.com/

                          Sumatra PDF Reader - https://www.sumatrapdfreader.org/free-pdf-reader

                          Freda - https://freda-epub-ebook-reader.en.softonic.com/

                          Icecream eBook Reader - https://icecreamapps.com/Ebook-Reader/

                          Neat Reader - https://www.neat-reader.com/

                          Sigil - https://sigil-ebook.com/

                          Kobo - https://www.kobo.com/gb/en

Whilst the above tools will allow viewing, and often editing, of the EPUB and are invaluable for the publisher, they do not always offer a great user interface, and if you are distributing an eBook you are reliant on the recipient having one of these readers, in rather the same way that you would be reliant on the recipient of a .doc file having Word on their computer. For the training industry to share eBooks, there is a requirement to distribute reader access along with a compatible book.

  4. How do I create an EPUB?

Many of the tools listed above will allow the import of .doc files and PDFs for conversion into an EPUB, but there is an issue.

Since there is a set of standards around the structure of the EPUB file, it would be thought that any reader would open any file. However, this appears not to be the case. Because of the complexity of the files in the container and the nature of the indexing, not all readers will open all EPUBs. A good example is the EPUB created by Adobe InDesign: these work well in Adobe Digital Editions but can fail in other readers because of the structure of the contents file.

Thus you will find that many of the readers which have a great user interface will be associated with compatible EPUB creation software. Alternatively, and most commonly, the EPUB can be created by a third-party organisation that has sufficient technical knowledge to alter the file structure if required to suit the reader.
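
For the curious, the bare bones of an EPUB can also be assembled by hand. The Python sketch below builds a one-chapter EPUB 3 in memory using only the standard library. It is a minimal illustration under stated assumptions, not a production workflow: the title, identifier, date and file names are placeholders invented for this example, and the result has not been checked against any particular reader.

```python
import io
import zipfile

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package document: metadata, manifest and spine. All values are placeholders.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Training Book</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2024-01-01T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc">
      <ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol>
    </nav>
  </body>
</html>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><h1>Chapter 1</h1><p>Hello, EPUB.</p></body>
</html>
"""


def build_epub() -> bytes:
    """Return the bytes of a minimal one-chapter EPUB 3 file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must be the first file and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for name, body in (
            ("META-INF/container.xml", CONTAINER_XML),
            ("OEBPS/content.opf", CONTENT_OPF),
            ("OEBPS/nav.xhtml", NAV_XHTML),
            ("OEBPS/chapter1.xhtml", CHAPTER_XHTML),
        ):
            archive.writestr(name, body, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Writing the returned bytes to a file ending in .epub gives something a reader can be asked to open. Small details such as the order of the files in the zip and the layout of the contents file are exactly the kind of thing that can differ between creation tools, which is one reason a file that opens in one reader may not open in another.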

  5. What does the EPUB file give the trainer?

 

Whilst a PDF can be distributed and read widely, and can be created with some interactivity including interactive forms and web links, it has two disadvantages. Firstly, the interactivity is limited compared to an EPUB, so in practice no video and no JavaScript-driven interactivity. Secondly, if the reader has a DRM overlay you may find the interactivity of the PDF disappears. So a PDF will give you a lot of functionality but little protection other than a password.

On the other hand, the EPUB can have full DRM protection, including control of printing, copying and number of devices, and also have full interactivity including video, hot spot images, quizzes and interactive forms.

For further information and discussion, give us a call.

By: David Platt