Revision as of 20:35, 10 May 2009
The following is a comparison of e-book formats used to create and publish e-books.
A writer or publisher has many options when choosing a format for production. While the average end user may simply want to read books, every format has its proponents and champions, and debates over "which format is best" can become intense. The multitude of e-book formats is sometimes referred to as the "Tower of eBabel". Each format has its own advantages and disadvantages for the reader. Formats available include, but are by no means limited to:
Plain text files
Published as .txt
E-books in plain text exist and are very small in size. For example, the Bible is about 4 MB.
Hypertext Markup Language
Published as .htm or .html
HTML is the markup language used for most web pages. E-books using HTML can be read using a Web browser. The specifications to the format are freely available from the W3C.
As a markup language, HTML adds specially marked meta elements to otherwise plain text encoded using character sets like ASCII or UTF-8. Suitably formatted files can therefore be, and sometimes are, written by hand using a plain text editor or programmer's editor. Many HTML generator applications exist to ease this process and often require less detailed knowledge of the format.
HTML is not a particularly efficient format to store information, requiring more storage space for a given work than many other formats, even if images are not used to illustrate it. The format does not describe pages and has no facility to store multiple things (images, etc.) in a single file. Often e-books in this format will store one chapter per file.
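As a rough illustration of how little is needed to turn plain text into an HTML chapter file, the following Python sketch wraps blank-line-separated paragraphs in minimal markup. The function name text_to_html and the page skeleton are assumptions made for this example, not part of any particular generator application.

```python
from html import escape


def text_to_html(title: str, text: str) -> str:
    """Wrap plain-text paragraphs (separated by blank lines) in minimal HTML."""
    # Split on blank lines and drop empty fragments.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Escape &, < and > so the text cannot be mistaken for markup.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f'<head><meta charset="utf-8"><title>{escape(title)}</title></head>\n'
        f"<body>\n<h1>{escape(title)}</h1>\n{body}\n</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    print(text_to_html("Chapter 1", "It was a dark night.\n\nThe rain fell."))
```

Calling the function once per chapter and writing each result to its own file gives the one-chapter-per-file layout described above.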
Amazon Kindle (AZW) Format
Published as .azw
With the launch of the Kindle e-book reader, Amazon.com created the AZW format. It is based on the Mobipocket standard, with a slightly different serial number scheme (it uses an asterisk instead of a dollar sign) and its own DRM formatting. Because e-books bought on the Kindle are delivered wirelessly over EvDO (Amazon calls the system Whispernet), the user does not see the AZW files during the download process.
Open Electronic Book Package Format
Published as .opf
OPF is an XML-based e-book format created by E-Book Systems.
TomeRaider
Published as .tr2 or .tr3
The TomeRaider e-book format is a proprietary format. There are versions of TomeRaider for Windows, Windows Mobile (aka Pocket PC), Palm, Symbian and more. Several Wikipedias are available as TomeRaider files with all articles unabridged, some even with nearly all images. Capabilities of the TomeRaider3 e-book reader vary considerably per platform: the Windows and Windows Mobile editions support full HTML and CSS, while the Palm edition supports limited HTML (e.g. no tables, no fonts) and no CSS. For Symbian there is only the older TomeRaider2 format, which does not render images or offer category search facilities. Despite these differences, any TomeRaider e-book can be browsed on all platforms. TomeRaider is popular among readers because of its large base of free documents: according to its own records, the TomeRaider website has over 4000 free e-books. The IMDB movie database is also available as a regularly updated TomeRaider e-book. The TomeRaider developers have recently produced the full English Wikipedia (data up to December 2007) as an e-book, which is a 3.3 GB file. It can be downloaded for a bandwidth fee; the October 2006 edition, a 1 GB file, is free.
Arghos Diffusion
Published as .arg
The ARG format is an XML-based proprietary format developed by the French firm Arghos Diffusion.
ARG files use a proprietary DRM and encryption method and are readable only in the Arghos Player.
It supports various input formats for text, audio or video, such as PDF, WMA, MP3, WMV, and allows multiple interactive functions such as bookmarking, advanced plain-text searching, dynamic text highlighting, etc.
Flip Books
A "Flip Book" is a type of e-book distinguished by virtual pages that actually "flip", much like turning pages of paper in a real book or magazine. The first dynamic Flip Book Reader was developed in 2003/2004 by Interaxive Media for Nishe Media (Canada) and was therefore called "Nishe Pages". The first version was produced in part by Cybaris (Canada) and was first publicly showcased in August 2004. Soon thereafter, many imitation "flip books" started appearing thanks to technological advances in Macromedia Flash, mostly hardcoded using Flash components. The original software remains unique in that it is powered by a complete server-based CMS that allows the books to be created, published, and viewed remotely from a web server without requiring any custom software to be installed. Nishe Media became defunct in 2004, leaving the unfinished software to Interaxive Media, which continued its development in Hong Kong. Though not widely used outside of Asia, it is now at version 3.0 and is arguably the most advanced server-based e-book platform. It remains privately held by the original developer, Ryan Sutherland, owner and founder of Interaxive Media.
NISO Z39.86 Format
Commonly known as DAISY
DAISY is an XML-based e-book format created by the DAISY international consortium of libraries for people with print disabilities.
DAISY implementations have focused on two main types: audio e-books and text e-books. A subset of the DAISY format has been adopted by law in the United States as the National Instructional Material Accessibility Standard, and K-12 textbooks and instructional materials are now required to be provided to students with disabilities. [1]
FictionBook
Published as .fb2
FictionBook is a popular XML-based e-book format, supported by free readers such as Haali Reader and FBReader. See http://haali.cs.msu.ru/pocketpc/FictionBook_description.html
Text Encoding Initiative
TEI Lite is the most popular of the TEI-based (and thus XML-based or SGML-based) electronic text formats.
Plucker
Plucker is a free e-book reader application with its own associated file format and software to automatically generate Plucker files from HTML files, web sites or RSS feeds. The format is a compressed HTML archive, somewhat like Microsoft's CHM.
CHM Format
Published as .chm
Also known as Microsoft Compressed HTML Help
CHM format is a proprietary format based on HTML. Multiple pages and embedded graphics are distributed along with proprietary metadata as a single compressed file. In contrast, in HTML, a site consists of multiple HTML files and associated image files in standardized formats.
Portable Document Format (PDF)
Published as .pdf
A file format created by Adobe Systems, initially to provide a standard form for storing and editing printed publishable documents. The format derives from PostScript by removing language features like loops and adding support for things like compression and passwords. Because PDF documents can easily be viewed and printed by users on a variety of computer platforms, they are very common on the World Wide Web. The specification of the format is freely available from Adobe.
PDF files typically contain brochures, product manuals, magazine articles, up to entire books, as they can embed fonts, images, and other documents. A PDF file contains one or more page images, each of which you can zoom in on or out from.
Since the format is designed to reproduce page images, the text cannot be re-flowed to fit the screen width. PDF files designed for printing on standard paper sizes are therefore less easily viewed on screens with limited size or resolution, such as those found on mobile phones and PDAs. Adobe has addressed this by adding a re-flow facility to its Acrobat Reader software, but for this to work the document must be marked for re-flowing at creation [2], which means existing PDF documents will not benefit. The Windows Mobile (aka Pocket PC) version of Adobe Acrobat will automatically attempt to tag a PDF for reflow during the synchronization process. This tagging process will not work on most locked PDF documents. When using Windows Vista with a Windows Mobile device and Adobe Acrobat, this tagging process must occur before the device is synchronized.
Multiple Adobe products support creating PDF files, such as Adobe Acrobat and Acrobat Capture, as do third-party products such as PDFCreator, OpenOffice.org, and FOP, and several programming libraries. Acrobat Reader (now simply called Adobe Reader) is Adobe's product used to view PDF files, with third party viewers such as xpdf also available.
Later versions of the specification add support for forms, comments, hypertext links, and even interactive elements such as buttons for forms entry and for triggering sound and video. Such features may not be supported by older or third-party viewers, and some are not transferable to print.
PDF files are supported on the following e-book readers: iRex iLiad, iRex DR1000, Sony Reader, Bookeen Cybook, Foxit eSlick and Amazon Kindle DX.
PostScript
Published as .ps
PostScript is a page description language used in the electronic and desktop publishing areas for defining the contents and layout of a printed page, which can be used by a rendering program to assemble and create the actual output bitmap. Many office printers directly support interpreting PostScript and printing the result. As a result, the format also sees wide use in the Unix world.
DjVu
Published as .djvu
DjVu is a format that specialises in and particularly excels at storing scanned images. It includes advanced compressors optimized for low-color images, such as text documents. Individual files may contain one or more pages.
The format has long remained in obscurity, but now that free tools to manipulate the format are available, that is starting to change.
The contained page images are divided into separate layers (such as a multi-color, low-resolution, lossily compressed background layer and a few-color, high-resolution, tightly compressed foreground layer), each compressed with the best available method. The format is designed to decompress very quickly, even faster than vector-based formats.
The advantage of DjVu is that it is possible to take a high-resolution scan (300-400 DPI), good enough for both on-screen reading and printing, and store it very efficiently. Several dozens of 300 DPI black-and-white scans can be stored in less than a megabyte.
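To put the "several dozens of scans in less than a megabyte" figure in perspective, the following Python sketch estimates the uncompressed size of the same scans. The US Letter page size (8.5 by 11 inches), the count of 36 pages and the 1,000,000-byte megabyte are assumptions chosen for illustration; row padding used by real bitmap formats is ignored.

```python
def raw_bilevel_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Size in bytes of an uncompressed 1-bit-per-pixel scan."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # One bit per pixel, rounded up to a whole number of bytes.
    return (width_px * height_px + 7) // 8


PAGES = 36                 # assumed reading of "several dozens"
MEGABYTE = 1_000_000       # assumed size budget in bytes

page_bytes = raw_bilevel_bytes(8.5, 11, 300)   # assumed US Letter page
total_bytes = PAGES * page_bytes
implied_ratio = total_bytes / MEGABYTE

if __name__ == "__main__":
    print(f"one raw page:  {page_bytes} bytes")
    print(f"{PAGES} raw pages: {total_bytes} bytes")
    print(f"implied compression ratio: about {implied_ratio:.0f}:1")
```

Under these assumptions a single raw page already exceeds a megabyte, so fitting dozens of pages into that space implies a compression ratio in the tens to one.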
Microsoft LIT
Published as .lit
DRM-protected LIT files are only readable in the proprietary Microsoft Reader program, as the .LIT format, otherwise similar to Microsoft's CHM format, includes Digital Rights Management features. Other third party readers, such as Lexcycle Stanza, can read unprotected LIT files.
There are also tools such as Convert Lit, which can convert .lit files to HTML files or OEBPS files.
Microsoft Reader uses patented ClearType display technology to improve readability. Navigation in Reader works with a keyboard, mouse, stylus, or through electronic bookmarks. The Catalogue Library records Reader books in a personalized "home page". A user can add annotations and notes to any page, create large-print e-books with a single command, or create free-form drawings on the Reader pages. A built-in dictionary allows the user to look up words.
eReader (formerly Palm Digital Media/Peanut Press)
Published as .pdb
eReader is a freeware program for viewing Palm Digital Media electronic books. Versions are available for PalmOS, iPhone, Symbian, Windows Mobile Pocket PC/Smartphone, desktop Windows, and Macintosh. The reader shows text one page at a time as paper books do. eReader supports embedded hyperlinks and images. Additionally the Stanza application for the iPhone and iPod Touch can read both encrypted and unencrypted eReader files.
The company's web site, ereader.com, offers a wide selection of eReader-formatted e-books for purchase and download, and also a few for free. The paid-for books are encrypted, with the key being the purchaser's full name and credit card number. This information is not stored in the e-book, though: a one-way hash is used, so there is no risk of the user's information being extracted.
The program supports features like bookmarks and footnotes, enabling the user to mark any page with a bookmark, and any part of the text with a footnote-like commentary. Footnotes can later be exported as a Memo document.
The company also offers two Windows/MacOS programs for producing ebooks: the free Dropbook, and the paid-for eBook Studio. Dropbook is simply a file-oriented PML-to-PDB converter, and eBook Studio incorporates a WYSIWYG editor. PML (Palm Markup Language) is basically text with embedded formatting tags, so feeding a pure text file into eBook Studio or Dropbook also works.
There is also support for an integrated reference dictionary (with many options up to and including a 476,000-word Merriam-Webster Dictionary, including pronunciation keys) so that any word in the text can be highlighted and looked up on the dictionary instantly. Commercial fonts can also be individually purchased and downloaded at the company's web site, ereader.com.
Desktop Author
Published as .dnl or .exe
Desktop Author is an electronic publishing suite that allows creation of digital web books with virtual turning pages. Digital web books of any publication type can be written in this format, including brochures, e-books, digital photo albums, e-cards, digital diaries, online resumes, quizzes, exams, tests, forms and surveys. DesktopAuthor packages the e-book into a ".dnl" or ".exe" book. Each can be a single, plain stand-alone executable file which does not require any other programs to view it. DNL files can be viewed inside a web browser or stand-alone via the DNL Reader.
DNL Reader
The DNL format is an e-book format that replicates its real-life counterpart, the book with turning pages. The DNL e-book was developed by DNAML Pty Limited, an Australian company established in 1999. A DNL e-book can be produced using DeskTop Author or DeskTop Communicator.
Newton eBook
Published as .pkg
Commonly known as an Apple Newton book; a single Newton package file can contain multiple books (for example, the three books of a trilogy might be packaged together).
All systems running the Newton operating system (the most common include the Newton MessagePads, eMates, Siemens Secretary Stations, Motorola Marcos, Digital Ocean Seahorses and Tarpons) have built-in support for viewing Newton books. The Newton package format was released to the public by Newton, Inc. prior to that company's absorption into Apple Computer. The format is thus arguably open and various people have written readers for it (writing a Newton book converter has even been assigned as a university-level class project[1]).
Newton books have no support for DRM or encryption. They do support internal links, potentially multiple tables of contents and indexes, embedded grayscale images, and even some scripting capability (for example, it's possible to make a book in which the reader can influence the outcome).[2]
Newton books utilize Unicode and are thus available in numerous languages.
An individual Newton book may actually contain multiple views representing the same content in different ways (such as for different screen resolutions).
APABI
Published as ".xeb" or ".ceb".
APABI is a format devised by Founder Electronics. It is a popular format for Chinese e-books. It can be read using the Apabi Reader software, and produced using Apabi Publisher. Both .xeb and .ceb files are encoded binary files. The iLiad e-book device includes an Apabi viewer.
iPod Notes
Notes is a feature of the iPod that allows short text notes to be displayed on the iPod screen. As the size limit for one note is 4096 bytes, there are tools that split a longer plain text file into a series of notes. Basic HTML is allowed, but otherwise the format is plain text only.
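The splitting performed by such tools can be sketched in a few lines of Python. This is an illustrative sketch rather than the method of any specific tool: the function name split_into_notes is hypothetical, and measuring the 4096-byte limit in UTF-8 bytes is an assumption.

```python
def split_into_notes(text: str, limit: int = 4096) -> list[str]:
    """Split text into pieces of at most `limit` bytes (UTF-8 assumed).

    Pieces end after whitespace where possible and never split a character.
    """
    if limit < 4:
        raise ValueError("limit must hold at least one UTF-8 character")
    notes = []
    current = []       # characters of the note being built
    size = 0           # encoded size of `current` in bytes
    last_space = -1    # index in `current` of the latest whitespace, if any
    for ch in text:
        ch_size = len(ch.encode("utf-8"))
        while current and size + ch_size > limit:
            # Cut just after the last whitespace, or hard-cut if there is none.
            cut = last_space + 1 if last_space >= 0 else len(current)
            notes.append("".join(current[:cut]))
            current = current[cut:]
            size = len("".join(current).encode("utf-8"))
            last_space = -1
        if ch.isspace():
            last_space = len(current)
        current.append(ch)
        size += ch_size
    if current:
        notes.append("".join(current))
    return notes


if __name__ == "__main__":
    parts = split_into_notes("word " * 3000)
    print(len(parts), [len(p.encode("utf-8")) for p in parts])
```

Joining the returned pieces reproduces the original text, so nothing is lost at the boundaries; a real tool would additionally write each piece to its own file and might link consecutive notes together.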
Libris
Libris is a Java-based e-book reader for mobile devices such as cell phones. Libris will run on most Java-enabled devices that support MIDP. The reader formats books to fit the device screen, and shows one page at a time using high-quality anti-aliased fonts. Books may employ encryption or be unrestricted. Libris content may be produced using the MakeLibris tool. The Libris reader also supports the PalmDoc format.
Mobipocket
Published as .prc or .mobi.
The Mobipocket e-book format is based on the Open eBook standard using XHTML, and can include JavaScript and frames. It also supports native SQL queries to be used with embedded databases. There is a corresponding e-book reader. A free e-book of the German Wikipedia has been published in Mobipocket format; see [3].
The Mobipocket Reader has a home page library. Readers can add blank pages in any part of a book and add free-hand drawings. Annotations (highlights, bookmarks, corrections, notes, and drawings) can be applied, organized, and recalled from a single location. Mobipocket Reader has electronic bookmarks and a built-in dictionary.
The reader has a full-screen mode for reading and supports many PDAs, communicators, and smartphones. Mobipocket products support most Windows, Symbian, BlackBerry and Palm operating systems. On Linux and Macintosh, applications like Okular and FBReader can be used to read non-encrypted files.
The Amazon Kindle's AZW format is basically just the Mobipocket format with a slightly different serial number scheme (it uses an asterisk instead of a Dollar sign).
Mobipocket is working on an .epub to .mobi converter called mobigen. See [4].
IDPF/EPUB
Published as .epub
The .epub or OEBPS format is an open standard for eBooks created by the International Digital Publishing Forum (IDPF). It combines three IDPF open standards:
- Open Publication Structure (OPS) 2.0, which describes the content markup (either XHTML or Daisy DTBook)
- Open Packaging Format (OPF) 2.0, which describes the structure of an .epub in XML
- OEBPS Container Format (OCF) 1.0, which bundles files together (as a renamed ZIP file)
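Because the container is an ordinary ZIP archive, the relationship between the three standards can be explored with general-purpose tools. The following Python sketch opens an .epub, checks the container's mimetype entry, and reads META-INF/container.xml to locate the OPF package document. The function name find_package_document is hypothetical, and the sketch assumes an unencrypted file.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by META-INF/container.xml in the OCF container.
CONTAINER_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_package_document(epub_file) -> str:
    """Return the path, inside the archive, of the OPF package document.

    `epub_file` may be a file name or a binary file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        # The container identifies itself through a "mimetype" entry.
        mimetype = archive.read("mimetype").decode("ascii").strip()
        if mimetype != "application/epub+zip":
            raise ValueError(f"unexpected mimetype: {mimetype!r}")
        # container.xml names the OPF file that describes the publication.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("ocf:rootfiles/ocf:rootfile", CONTAINER_NS)
        if rootfile is None:
            raise ValueError("container.xml lists no rootfile")
        return rootfile.attrib["full-path"]
```

The returned path (for example, a file ending in .opf) can then be read from the same archive to find the OPS content documents listed in its manifest.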
Currently, the format can be read by Adobe Digital Editions, Lexcycle Stanza, BookGlutton, AZARDI and the Mozilla Firefox plugin OpenBerg Lector. Several other reader software programs are currently implementing support for the format, such as dotReader, FBReader, Mobipocket and Okular. Another software .epub reader, Lucidor, is in beta. As of 7/23/2008, an update to the Sony Reader will allow the device to read .epub documents. In 2008 BookGlutton launched a server-side HTML-to-EPUB converter.[5]
uBook (from Gowerpoint software) also provides basic support for the format, but cannot read DRM-protected .epub files.
Adobe Digital Editions uses the .epub format for its e-books, with DRM protection provided through Adobe's proprietary ADEPT mechanism. The recently developed INEPT framework and scripts, the result of reverse engineering, circumvent this DRM system.
SSReader
Published as .pdg
The digital book format used by 超星数字图书馆[6], a popular digital library company in China. It is a proprietary raster image compression and binding format, with plugin modules that perform OCR at reading time. The company scanned a huge number of Chinese books in the China National Library, and these form the bulk of its service's stock. The detailed format is not published. Other commercial e-book formats are also used in Chinese digital libraries.
Multimedia Books
A multimedia e-book is media and book content that combines different content forms. The term can be used as a noun (a medium with multiple content forms) or as an adjective describing a medium as having multiple content forms. It is used in contrast to media that use only the traditional forms of the printed or text book. A multimedia e-book includes a combination of text, audio, still images, animation, video, and interactive content.
A multimedia e-book is typically created on the basis of a work of literary fiction, to which audio-visual and interactive content is added, making it a new creative form. The user (formerly the reader) can take part in the events that happen to the characters and experience the effect of the musical and graphic parts of the narration.
Perceiving content in several media forms considerably deepens the expressive power of the work.
Currently, combining several forms of media in this way is possible only on the basis of Adobe Flash technology. The Flip Book technique is used to preserve the sequential presentation of the traditional book.
References
- ^ http://www.daisy.org/z3986/2005/errataDaisy_2005.html
- ^ http://www.adobe.com/ap/epaper/tips/acr5reflow/index.html Reflow the contents of Adobe PDF documents: Tutorial
See also
- e-book device
- ebookwise-1150 ebook reader device [7]
- ebook reader articles at Mobile Read Wiki [8]