Audiobooks 2.0

The Augmented Audiobook Experience. How Amazon and Apple Could Upgrade a Decades-Old Format

Audiobooks have been around since the era of tapes and CDs. Their spread took place in multiple stages, each leveraging a new technological wave.
Three key turning points have significantly boosted their relevance:

  1. Internet growth in the mid-nineties, when Audible launched. Till then, audiobooks were only available in physical retail locations.
  2. iPod and iTunes Store expansion post-2003, when multiple books could easily be kept in a pocket and consumed on the go.
  3. The evolution of the smartphone ecosystem post-2008, when iPhone and Android devices offered mobile stores where content could be purchased and downloaded without a separate desktop step.

My first introduction to the format was in the late nineties, but I never got hooked: sitting at a computer to listen to a book seemed to offer little advantage over the print version.

It was only in 2016 that I decided to give it another try and since then, here’s what happened…

The ability to listen while driving and walking unlocked a massive daily “reading” window that was never available before.

I couldn’t have read the same number of print books during that period. Aside from the practical side, the voice and style of some narrators made the books surprisingly come to life, and delivered a more enjoyable experience.

However, audiobooks do have drawbacks, and the central issues are most apparent in non-fiction titles.

In this post, I will share a few thoughts about how to solve the most important shortcomings and upgrade the audiobook experience.

Smartphone operating systems have evolved drastically since the first iPhone launched. We now have advanced frameworks for building media-rich apps with features, such as augmented reality, that were hard to imagine just a few years back.

Yet the audiobook format is exactly the same as it was in the nineties: a plain audio recording split into chapters labeled by numbers.

Recently, Audible started adding the actual chapter names, which resolved one of my major complaints. Now, at least, high-level navigation within the book is possible.

The inadequacies of the audio format become apparent with content rich in numerical data, graphs, sketches, or any other visual representation. Questionnaires and anything else that requires input are equally unsuitable.

Presently, that problem is addressed by offering a companion PDF file that includes the extra material. That method has two downsides:

  1. It requires a different device to download the file.
  2. The context around the content is lost. You can’t view the graphs while you listen and once you have access to the visual file, there’s no simple way to navigate back to the related audio section.

There were attempts in the past to build eBooks 2.0 by enriching the mostly text content with audio, video, and interactive 3D animations. Companies such as Inkling started with academic textbooks and designed their solution around the iPad. The format never took off, though, and they had to change their model to a B2B software platform for corporate internal training manuals.

“Audiobooks Are the New Ebooks, Except They Might Keep Growing”

This piece by Boris Kachka summarizes the history and current trends of the digital publishing sector where audiobooks are the fastest-growing segment.

What if the more attractive approach were to go in the opposite direction?

Start from mostly audio content and enrich it with visuals and interactivity in the relevant parts.

Let’s look at the example of the Audible App.

Most of the display real estate is untapped and only shows the static cover of the book.

That area would be much more useful if it became active during the parts that relate to numerical tables, graphs, or other visual representations.

Furthermore, non-fiction titles often include web links. It is a hassle to manually bookmark the audio segment, go back, listen again, and take note of addresses. Besides, long URL strings are not practical to write down.

Here are the 3 enhancements that could considerably augment the current audiobook experience.

1. Synchronized Content

When a link is mentioned, it would be worthwhile to present a synchronized notification in the app's central space with various actions to choose from:

  • Open link in the browser.
  • Add to bookmarks or to a favorite reader app.
  • For media URLs, play directly or open separately.
  • If it’s a reference to another book, show the cover with an option to add it to the Audible/Amazon wish list.
  • For podcasts, offer to add the episode or show to the default or favorite client.
  • When people are mentioned, show their written names and photos with a link to their Wikipedia or personal pages.
  • Various books have a summary section at the end of each chapter. Why not make the actual text of those sections available in the app? Recap bullet points are sometimes easier to remember in text form.
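
As a rough illustration of how such synchronized notifications could work, here is a minimal Python sketch. It assumes the publisher ships a list of timed "cues" alongside the audio; the `Cue` type, its fields, the `kind` labels, and the sample data are all hypothetical and do not describe any existing Audible or Apple format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    """A hypothetical piece of extra content tied to a span of the audio."""
    start: float   # seconds from the start of the audiobook
    end: float     # seconds; the cue is hidden once playback reaches this point
    kind: str      # e.g. "link", "graph", "person" (labels are assumptions)
    payload: str   # URL, image path, or name to show in the cover area


def active_cues(cues, position):
    """Return the cues whose [start, end) window contains the playback position.

    `cues` must be sorted by start time.
    """
    starts = [c.start for c in cues]
    # Only cues that started at or before `position` can be active.
    candidates = cues[:bisect_right(starts, position)]
    return [c for c in candidates if position < c.end]


# Made-up sample data for illustration only.
cues = [
    Cue(120.0, 150.0, "link", "https://example.com/report"),
    Cue(140.0, 200.0, "graph", "figures/chapter1-sales.png"),
    Cue(900.0, 930.0, "person", "Jane Doe"),
]

for cue in active_cues(cues, 145.0):
    print(cue.kind, cue.payload)
```

The player would call `active_cues` as playback advances and render whatever it returns in place of the static cover. The same list, read in the other direction (from a cue to its `start` time), would let a listener jump from a graph or link back to the related audio.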

2. Alternative Navigations

Currently, only chapter-level navigation is available. Another useful way to browse could be from visual assets.

The option to start from a list of graphs, tables, photos, or links and then listen to the related audio segments would be handy, especially after finishing the book. It’s a practical way to review sections of interest or the audio portions that were played when one could not look at the screen.

Another convenient feature would be an index. As mentioned in the article “The Lost Art of the Index,” even eBooks rarely have one. The argument is that the digital format has search instead.

Search is great when one knows precisely what to look for.

…The other problem with relying on search instead of an index is that you lose the benefit of synonyms and related terms. An indexer takes all that into consideration so you’re much more likely to find everything you’re looking for with a good index than a simple text search…. — Joe Wikert

Audiobooks have neither an index nor a search function.

The text from which the audio was recorded is already available. There would be no need to use speech recognition to make the content searchable. It’s a matter of syncing/mapping the two sources.
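
To make the syncing idea concrete, here is a small Python sketch. It assumes some alignment step (not shown here) has already paired each word of the book's text with the moment the narrator says it; the `TimedWord` type, the `find_phrase` helper, and the sample transcript are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    text: str
    start: float  # seconds into the audio where this word is spoken (assumed given)


def normalize(word):
    """Lowercase a word and drop surrounding ASCII punctuation."""
    return word.strip(".,;:!?\"'()").lower()


def find_phrase(words, phrase):
    """Return the audio start time of every occurrence of `phrase` in the text."""
    target = [normalize(w) for w in phrase.split()]
    if not target:
        return []
    tokens = [normalize(w.text) for w in words]
    n = len(target)
    return [
        words[i].start
        for i in range(len(tokens) - n + 1)
        if tokens[i:i + n] == target
    ]


# Made-up aligned transcript for illustration only.
transcript = [
    TimedWord("Compound", 12.0), TimedWord("interest", 12.6),
    TimedWord("grows", 13.1), TimedWord("quickly.", 13.5),
    TimedWord("Later,", 95.0), TimedWord("compound", 95.4),
    TimedWord("interest,", 95.9), TimedWord("again.", 96.5),
]

print(find_phrase(transcript, "compound interest"))
```

Each returned timestamp is a point the player could seek to, so a plain text search becomes an audio search without any speech recognition at query time.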

Adding advanced search would be huge if done right:

…When you search for a phrase in an ebook the results are shown in chronological order. You see all the occurrences from the beginning of the book to the end. Imagine if Google worked that way. So when you type in a phrase Google tells you the first (oldest) site to use that phrase, then the next oldest site that used it, etc. Users would laugh and reject it, yet that’s exactly what we’re forced to accept in ebook search.
What I really want is relevance-based results. Show me the location in the book with the highest density of that phrase and prioritize occurrences of it in a heading over occurrences in body text. I’m sure there are other attributes that could be rolled into an effective ebook search algorithm but I’ll take just those two features for starters... — Joe Wikert

Lastly, why limit search only to the current book?

In the digital format, books don’t have to be kept siloed.

For fiction titles, it wouldn’t make sense to run a query across multiple books. However, when it comes to business, reference guides, and how-to content, a cross-book search within the personal library would be extremely valuable.

Multiple books cover similar topics, and the knowledge is now isolated within the individual titles. An advanced search ranking similar to what Joe Wikert suggests above would be even more potent by surfacing insights from multiple sources.
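
Here is a minimal Python sketch of such a ranking, applied across a personal library. It uses only the two signals from the quote above: how dense the phrase is in a section, and whether it appears in the heading. The `Section` type, the `HEADING_BOOST` weight, and the sample library are assumptions made for illustration; a real ranking would need tuning and more signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    book: str
    heading: str
    body: str
    start: float  # audio offset of the section in its book, in seconds


HEADING_BOOST = 5.0  # arbitrary weight, chosen only for this example


def count_occurrences(text, phrase):
    return text.lower().count(phrase.lower())


def score(section, phrase):
    """Occurrences per 100 body words, plus a bonus for each heading match."""
    words = len(section.body.split())
    density = count_occurrences(section.body, phrase) / words if words else 0.0
    return density * 100 + HEADING_BOOST * count_occurrences(section.heading, phrase)


def search(sections, phrase):
    """Return matching sections from all books, most relevant first."""
    if not phrase.strip():
        return []
    scored = [(score(s, phrase), s) for s in sections]
    hits = [(sc, s) for sc, s in scored if sc > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in hits]


# Made-up library for illustration only.
library = [
    Section("Book B", "Hiring",
            "Hire slowly. Cash flow matters less than culture in the long run here.",
            40.0),
    Section("Book B", "Marketing", "Tell stories.", 900.0),
    Section("Book A", "Cash flow basics",
            "Cash flow is the lifeblood of a business. Track cash flow weekly.",
            310.0),
]

for s in search(library, "cash flow"):
    print(s.book, "-", s.heading, "@", s.start)
```

Because every section carries its book title and audio offset, results from different titles can sit in one list and each one still leads straight to a playable position.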

Finally, I would love to see a dynamic, topic-based playlist creator. That feature could leverage the cross-book search function. It would certainly be more challenging to build with currently available NLP technology: it would need to figure out the length of the sections that cover a given topic in each book, then work out the most relevant ones to combine. The other difficulty is that, in some cases, content may lose its meaning when taken out of its original book context. Still, it’s worth building a basic version that could progressively evolve.
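
A basic version could be as simple as the Python sketch below. It deliberately skips the hard part: it assumes the topic segments and their relevance scores have already been produced by some upstream step, and only shows a greedy way to combine them within a listening-time budget. The `Segment` type, `build_playlist`, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    book: str
    start: float      # seconds
    end: float        # seconds
    relevance: float  # assumed to come from an upstream topic-matching step

    @property
    def duration(self):
        return self.end - self.start


def build_playlist(segments, max_seconds, min_relevance=0.0):
    """Greedily pick the most relevant segments that fit in the time budget."""
    playlist = []
    remaining = max_seconds
    for seg in sorted(segments, key=lambda s: s.relevance, reverse=True):
        if seg.relevance <= min_relevance:
            break  # everything after this point is even less relevant
        if seg.duration <= remaining:
            playlist.append(seg)
            remaining -= seg.duration
    return playlist


# Made-up candidate segments for illustration only.
candidates = [
    Segment("Book A", 0.0, 300.0, 0.9),
    Segment("Book B", 100.0, 500.0, 0.8),
    Segment("Book C", 0.0, 200.0, 0.5),
    Segment("Book D", 50.0, 80.0, 0.0),
]

for seg in build_playlist(candidates, max_seconds=600):
    print(seg.book, seg.start, seg.end)
```

A greedy pick like this ignores the context problem mentioned above; a segment may still need a few seconds of lead-in from its original book to make sense on its own.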

3. Voice Notes

eBooks already support note-taking and text highlighting. Those features are still lacking in the audio format.

The notes should ideally be stored as voice memos and then post-processed into text format to be searchable. Real-time speech recognition has significantly improved and works well in many cases. Nevertheless, the quality degrades rapidly in the presence of background noise. In the car, for example, short commands do work, but longer-form dictation is almost unusable.

The transcribed voice could later be used as alternative navigation to dive back into the related sections of the book.

Furthermore, when listening via home speakers such as the Amazon Echo, voice notes would be the most natural method to record personal insights.

The case of fiction books.

So far, most of the suggested enhancements are not relevant to the fiction genre.

Indeed, adding visuals to fiction books would start to get into movie production territory.

There is, however, one enrichment that could make a fiction title more engaging and entertaining: adding a soundtrack with music and sound effects.

The example below from GraphicAudio gives an idea of the result.

GraphicAudio "A Movie In Your Mind" - Full Cast Dramatized Audio Book Entertainment

The presence and volume of the sound effects could be tuned in the settings to suit individual preferences.


Audiobooks are becoming more popular than ever. Nonetheless, both the current format and available software players are rather primitive. Augmenting the technology would have clear benefits from a user’s perspective, but the effort would require investments from both the content distribution platforms and the authors.

Audible and iTunes are best positioned to lead such a development, but do they have enough incentive to do so?