Remote Audio Data Encoding for Podcasts

This document chronicles a pilot in 2017. Please visit rad.npr.org for updated information.

This proposal outlines a draft specification for a custom ID3 tag, referred to below as a Remote Audio Data (RAD) tag. The RAD tag can be encoded with listening event metadata to allow clients to report valuable listening data to podcast content producers and sponsors. 

NPR tested this concept in 2017 within the NPR One app, both Android and iOS. The spec below is intended to be a baseline, with a formal implementation dependent upon industry feedback. 

If you would like to ask questions or offer feedback to NPR on RAD, please use this form.


NPR One Pilot

Overview

In the spring of 2017, NPR tested the RAD method of adding a custom ID3 tag to selected audio files to encode supplemental listening event metadata for podcast episodes.

For the scope of our pilot, NPR tested approximately 20 mp4 episodes of How I Built This and TED Radio Hour. The ID3 tags were encoded by hand and NPR One served as the test client. A server was set up to receive the listening events, which were then available for download by Splunk, our tool for custom listening analytics. Due to the limited sample size, capturing conclusive listening data for these episodes was out of scope for this initial pilot.

This pilot was intended to provide a proof of concept for the technical implementation of the RAD method, and to determine both the required level of effort and the accuracy of the captured data. To measure accuracy, we compared the RAD-reported events against the listening data that we already receive from NPR One.

Outcomes

Comparing the RAD event data against baseline NPR One data involved some complexities. The pilot used mp4 files because that format accounts for the majority of listening within NPR One for these podcasts; however, a subset of listening took place over mp3 and HLS, which we could not include.

We sought to verify accuracy by comparing observed drop-off between the two data sets, and the following results confirmed our expectations:

  • Podcast completion rates from the RAD events aligned with NPR One events.
  • Sponsorship funnels (the drop-off from ad break 1 to ad break 2, and so on) and advertisement completion rates behaved as expected.

Technical Implementation

For testing purposes, a custom Remote Audio Data (RAD) tag was created and JSON containing the specified metadata was encoded directly into the custom tag within the ID3 tag. The NPR One client was programmed to parse the metadata when the encoded audio file was downloaded, and an observer was implemented to watch for listening events and then construct a URL to send the event data to a remote server for tracking purposes. The server logs were then made available via download for reporting and analysis. Note: RAD tags do not affect audio playback for clients who do not implement an observer for listening events.

Example metadata for RAD custom tag
{
  "remoteAudioData":
  {
    "podcastId":510298,
    "episodeId":497679856,
    "trackingUrl":"https://tracking.publisher.org/remote_audio_data"
  },
    "events":
    [
      {
        "label":"podcastDownload",
        "eventTime":"00:00:00.000",
        "adId":0,
        "creativeId":0,
        "adPosition":0,
        "eventNum":0
      },
      {
        "label":"podcastStart",
        "eventTime":"00:00:05.000",
        "adId":0,
        "creativeId":0,
        "adPosition":0,
        "eventNum":1
      },
      {
        "label":"adStart",
        "eventTime":"00:05:00.000",
        "adId":123456,
        "creativeId":1234567,
        "adPosition":1,
        "eventNum":2
      },
      {
        "label":"adEnd",
        "eventTime":"00:05:15.000",
        "adId":123456,
        "creativeId":1234567,
        "adPosition":1,
        "eventNum":3
      },
      {
        "label":"adStart",
        "eventTime":"00:09:15.000",
        "adId":123457,
        "creativeId":1234568,
        "adPosition":2,
        "eventNum":4
      },
      {
        "label":"adEnd",
        "eventTime":"00:09:30.000",
        "adId":123457,
        "creativeId":1234568,
        "adPosition":2,
        "eventNum":5
      },
      {
        "label":"adStart",
        "eventTime":"00:12:15.000",
        "adId":123458,
        "creativeId":1234569,
        "adPosition":3,
        "eventNum":6
      },
      {
        "label":"adEnd",
        "eventTime":"00:12:30.000",
        "adId":123458,
        "creativeId":1234569,
        "adPosition":3,
        "eventNum":7
      },
      {
        "label":"podcast98",
        "eventTime":"00:27:45.000",
        "adId":0,
        "creativeId":0,
        "adPosition":0,
        "eventNum":8
      }
    ]
  }
Annotated metadata for RAD custom tag
The top-level object in the encoded JSON includes podcast and episode IDs, as well as the tracking URL that should receive event notifications from the client.
{
"remoteAudioData":
{
"podcastId":510298,
"episodeId":497679856,
"trackingUrl":"https://tracking.publisher.org/remote_audio_data"
}
...
}
“eventTime” matches the timestamp format for cue points set by the client-implemented observer.
{
"events":
[
{
"label":"podcastDownload",
"eventTime":"00:00:00.000",
"adId":0,
"creativeId":0,
"adPosition":0,
"eventNum":0
},
“adStart” and “adEnd” event labels should repeat as needed within the "events" array to represent the number of sponsorship spots featured within the episode.
{
"label":"adStart",
"eventTime":"00:05:00.000",
"adId":123456,
"creativeId":1234567,
"adPosition":1,
"eventNum":2
},
{
"label":"adEnd",
"eventTime":"00:05:15.000",
"adId":123456,
"creativeId":1234567,
"adPosition":1,
"eventNum":3
},

Notes on client implementation

The current client implementation framework observes the current audio session, parses the tags encoded into the currently playing podcast audio file, and implements an event listener to compare the current playback position to the event times parsed from the encoded media file. 
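
The comparison step described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the NPR One implementation; the function names (event_time_to_seconds, load_cue_points, events_between) are hypothetical, and the sketch assumes the client has already extracted the "events" array from the RAD tag.

```python
def event_time_to_seconds(event_time: str) -> float:
    """Convert an "HH:MM:SS.mmm" eventTime string into seconds."""
    hours, minutes, seconds = event_time.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def load_cue_points(events: list[dict]) -> list[tuple[float, dict]]:
    """Pair each event with its time in seconds, sorted by time."""
    pairs = [(event_time_to_seconds(e["eventTime"]), e) for e in events]
    return sorted(pairs, key=lambda pair: pair[0])


def events_between(cue_points, previous_position: float, current_position: float):
    """Return events whose cue point falls in (previous_position, current_position].

    Assumption: the player reports its position periodically, so each
    tick checks the window since the last tick. Passing a negative
    previous_position on the first tick includes an event at 00:00:00.000.
    """
    return [
        event
        for seconds, event in cue_points
        if previous_position < seconds <= current_position
    ]


# Hypothetical usage with two events from the example metadata.
cue_points = load_cue_points([
    {"label": "adStart", "eventTime": "00:05:00.000", "eventNum": 2},
    {"label": "podcastStart", "eventTime": "00:00:05.000", "eventNum": 1},
])
for event in events_between(cue_points, 4.0, 6.0):
    print(event["label"])
```

Converting the timestamps once at parse time keeps the per-tick comparison to simple numeric checks.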

Listening events

A series of listening events is saved as objects within the "events" array of the encoded RAD metadata (see example above). Event labels can be customized as desired, but character limits in the ID3 tag may limit the number of events that can be included. For the initial pilot, NPR limited the number of encoded events to 8 per episode. The playback events being tracked for this preliminary trial included the following:

Event Label

Description

“podcastDownload”

The eventTime for events with this label should always equal "00:00:00.000". This event indicates that a podcast has been downloaded by the client, but does not confirm the episode has been played.

“podcastStart”

The eventTime for events with this label should always be set to "00:00:05.000". This event confirms that the episode has started to play.

“adStart”

The eventTime matches the start time for the piece of sponsorship with the "adId" indicated in the event object.

Multiple adStart events may be included in the "events" array to match the number of sponsorship pieces included in this episode.

“adEnd”

The eventTime for "adEnd" corresponds to the end time for the piece of sponsorship with the "adId" indicated in the event object.

Multiple adEnd events may be included in the "events" array to match the number of sponsorship pieces included in this episode.

“podcast98”

The eventTime for events with this label should be equal to 98% of the duration of this episode.
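
As a worked illustration of the 98% rule, the sketch below derives a "podcast98" eventTime from an episode duration. The helper names are hypothetical, and the choice to work in whole milliseconds and round down is an assumption made to avoid floating-point drift, not something the pilot specified.

```python
def format_event_time(total_ms: int) -> str:
    """Format a millisecond offset as an "HH:MM:SS.mmm" eventTime string."""
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def podcast98_event_time(duration_ms: int) -> str:
    """Return the eventTime at 98% of the episode duration (rounded down)."""
    return format_event_time(duration_ms * 98 // 100)


# A 30-minute episode (1,800,000 ms) places the event at 29 minutes 24 seconds.
print(podcast98_event_time(1_800_000))  # 00:29:24.000
```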

Additional events

Additional events have been proposed since the pilot, including:

  • playStart/playStop
  • podcast25, podcast50 and podcast75 to measure content listening length

Reporting listening events

When the client detects a match between current playback position and an "eventTime", it makes a GET request to the provided tracking URL, appending the metadata for that event as parameters to the base "trackingUrl". The remote web server that receives this GET request should log the constructed URLs for future reporting and analysis. 

There are three data sources that contribute to the constructed URL for a GET request:  

  1. The encoded audio file, which provides the JSON metadata in the RAD tag that includes the base tracking URL, the podcast and episode IDs, and an array of event objects.

  2. The client application, which supplies the application ID for the “application” param.

  3. The platform, which supplies the device-specific advertising ID for the “sessionStart” param, for example, the UUID on Apple devices.

Example of constructed URL to report a listening event to the web server:
"https://tracking.publisher.org/remote_audio_data?application=org.client.podcastdatametrics&sessionStart="+advertisingID+"&podcastId=510298&episodeId=497679856&label=podcastDownload&eventTime=00:00:00.000&adId=0&creativeId=0&adPosition=0&eventNum=0"
Example RAD metadata that provided the key-value pairs for the constructed URL above:
{
    "podcastId":510298,
    "episodeId":497679856,
    "trackingUrl":"https://tracking.publisher.org/remote_audio_data"
  }
  {
    "events":
    [
     {
        "label":"podcastDownload",
        "eventTime":"00:00:00.000",
        "adId":0,
        "creativeId":0,
        "adPosition":0,
        "eventNum":0
      },
   …
]
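
The URL construction shown above could be sketched as follows, using only Python's standard library. The function name build_tracking_url and the placeholder advertising ID are hypothetical. Leaving ":" unescaped is an assumption made so that the output matches the example URL; a receiving server would normally accept the percent-encoded form as well.

```python
from urllib.parse import urlencode


def build_tracking_url(tracking_url: str, podcast_id: int, episode_id: int,
                       event: dict, application: str, session_start: str) -> str:
    """Append client, platform, and event metadata to the base trackingUrl."""
    params = {
        "application": application,     # supplied by the client application
        "sessionStart": session_start,  # supplied by the platform
        "podcastId": podcast_id,        # from the RAD tag
        "episodeId": episode_id,        # from the RAD tag
        **event,                        # label, eventTime, adId, and so on
    }
    # safe=":" keeps the colons in eventTime readable, as in the example URL.
    return tracking_url + "?" + urlencode(params, safe=":")


url = build_tracking_url(
    "https://tracking.publisher.org/remote_audio_data",
    510298,
    497679856,
    {"label": "podcastDownload", "eventTime": "00:00:00.000", "adId": 0,
     "creativeId": 0, "adPosition": 0, "eventNum": 0},
    application="org.client.podcastdatametrics",
    session_start="EXAMPLE-ADVERTISING-ID",  # placeholder value
)
print(url)
```

On the server side, the logged query string can be split back into fields with urllib.parse.parse_qs for reporting.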
Platform-specific notes: iOS

The iOS media player supports the parsing of custom fields in the ID3 tags by default, so no extra effort is needed on the part of the client to locate and parse the metadata encoded into a tag within the file.

Platform-specific notes: Android

Android clients must first locate the RAD metadata atom within the audio file before the data can be parsed:

[moov] [trak] [udta] [meta] [ilst] [----] [data] "RAD"

NPR One uses ExoPlayer 2’s MetadataUtil class to parse the metadata from this tag, but the specifics of this implementation will vary by platform.
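
To illustrate what locating that atom involves, here is a rough Python sketch that walks the box path listed above. It is not how ExoPlayer does it. It rests on several assumptions: that "meta" carries 4 bytes of version and flags before its children, that the "data" atom starts with an 8-byte type and locale header, and that the first "----" atom found is the RAD one (a real client would also need to check the atom's name). All function names are hypothetical.

```python
import struct

RAD_PATH = [b"moov", b"trak", b"udta", b"meta", b"ilst", b"----", b"data"]
# Assumption: 'meta' is a full box, so its children start 4 bytes in.
EXTRA_HEADER_BYTES = {b"meta": 4}


def iter_boxes(data: bytes):
    """Yield (type, payload) for each box found at the top level of data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if offset + 16 > len(data):
                break
            size = struct.unpack_from(">Q", data, offset + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # malformed box; stop scanning
        yield box_type, data[offset + header:offset + size]
        offset += size


def find_box(data: bytes, path: list[bytes]):
    """Descend through nested boxes along path; return the final payload."""
    if not path:
        return data
    for box_type, payload in iter_boxes(data):
        if box_type == path[0]:
            payload = payload[EXTRA_HEADER_BYTES.get(box_type, 0):]
            found = find_box(payload, path[1:])
            if found is not None:
                return found
    return None


def extract_rad_json(mp4_bytes: bytes):
    """Return the RAD JSON text, or None if the atom path is absent."""
    payload = find_box(mp4_bytes, RAD_PATH)
    if payload is None:
        return None
    return payload[8:].decode("utf-8")  # skip assumed type + locale header
```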

Client-side performance implications 

The RAD pilot used NPR One as its test client, which is essentially a streaming application. Publishers and implementers of RAD clients should be aware of the lifecycle of cellular radio hardware, particularly in non-streaming applications. From a sleeping state, a network request wakes the radio subsystems (an energy-expensive operation) and makes the request; the radio then enters an idle state for a platform-specific amount of time (typically around 90 seconds). If no further connections are made within that idle window, the radio powers back down to the sleeping state. If a new request is made just after the window closes (at the 91st second, for example), the full energy cost of re-waking the radio must be paid again, and repeating this pattern leads to significant battery drain.

With these factors in mind, publishers should be careful not to include a large number of unimportant events, and non-streaming applications should consider caching events in memory or on disk and sending them in batches.
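
One way to picture the caching and batching suggestion is the in-memory sketch below. The class name EventBatcher, the batch size of 10, and the send_batch callback are all hypothetical; a real client would also persist pending events to disk and flush when the network is already active.

```python
from typing import Callable


class EventBatcher:
    """Hold constructed tracking URLs and hand them off in groups."""

    def __init__(self, send_batch: Callable[[list[str]], None], max_batch: int = 10):
        self._send_batch = send_batch  # caller-supplied network function
        self._max_batch = max_batch
        self._pending: list[str] = []

    def add(self, tracking_url: str) -> None:
        """Queue one event; flush automatically once the batch is full."""
        self._pending.append(tracking_url)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def flush(self) -> None:
        """Send everything queued so far in a single radio wake-up."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._send_batch(batch)
```

Calling flush() when playback pauses or the app moves to the background would keep events from being held indefinitely.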

Frequently Asked Questions

What software did we use to write to the ID3 tags?

We used TagScanner to write to the ID3 tags for the pilot.

If a user manually seeks forward, do we send the events they have passed but haven't actually listened to?

For the NPR pilot, yes. All prior events up to the elapsed time were sent (including all ad breaks). This choice reflected the limited scope and time available to implement the prototype, and it could be modified in future implementations.

If a user rewinds to a location in a podcast that is before an event after having passed and hit that event, do we send the event again?

If that is within the same listening event/session, no.

If they were to listen to the episode again would we send the event again?

Yes, if it is a separate listening event/session.
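
The seek, rewind, and replay behavior described in the three answers above can be summarized in a small sketch. The class name ListeningSession is hypothetical, and cue points are given as (eventNum, seconds) pairs for brevity; this models the pilot's described behavior rather than reproducing its code.

```python
class ListeningSession:
    """Track which events have been reported during one listening session."""

    def __init__(self, cue_points: list[tuple[int, float]]):
        # Each cue point is (eventNum, time in seconds), kept in time order.
        self._cue_points = sorted(cue_points, key=lambda cue: cue[1])
        self._reported: set[int] = set()

    def on_position(self, position_seconds: float) -> list[int]:
        """Return eventNums to report now: everything at or before the
        position that this session has not reported yet."""
        due = [
            num
            for num, seconds in self._cue_points
            if seconds <= position_seconds and num not in self._reported
        ]
        self._reported.update(due)
        return due


cues = [(0, 0.0), (1, 5.0), (2, 300.0), (3, 315.0)]
session = ListeningSession(cues)
print(session.on_position(1.0))    # [0]
print(session.on_position(320.0))  # seek forward: [1, 2, 3]
print(session.on_position(10.0))   # rewind in the same session: []
print(ListeningSession(cues).on_position(6.0))  # new session: [0, 1]
```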

Can RAD work with dynamic ad/sponsorship insertion?

Yes. Ad vendors and publishers need to know the lengths of the dynamically inserted spots, and the insertion system must keep track of the running time codes and update the RAD tags accordingly.

Remote Audio Data Encoding Workflow

[Figure: RAD encoding workflow graphic]
