
Figure out the coupling between audio focus/session, audio playback and remote control events #9

Closed · foolip opened this issue Feb 20, 2015 · 8 comments

@foolip (Member) commented Feb 20, 2015

This has big implications for the shape of the API.

Android:

  • The Audio Focus API allows apps to pause, resume and duck audio as appropriate. Ideally, one should request focus and start audio playback if the request is successful, but it seems possible to play without audio focus and to get audio focus without playing.
  • The old registerMediaButtonEventReceiver and a newer MediaSession API allow apps to handle media buttons. Both appear to be orthogonal to audio focus and audio playback.

iOS:

  • The Audio Session API mediates which app is playing and how they deal with interruptions. The documentation says ‘For app sound to play, or for recording to work, your audio session must be active.’ and ‘The system has final authority to activate or deactivate any audio session present on a device.’
  • The Remote Control Events API allows apps to handle media buttons. Crucially, ‘Your app must be the “Now Playing” app. Restated, even if your app is the first responder and you have turned on event delivery, your app does not receive remote control events until it begins playing audio.’ (It's not clear to me if “Now Playing” means having an active audio session, or also having a playing media player.)
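For concreteness, here is a rough sketch of those two iOS steps (session activation and requesting remote control events) in modern Swift syntax; whether activation alone suffices for event delivery is exactly the open question in this thread, and error handling is elided:

```swift
import AVFoundation
import UIKit

// Sketch: activate an audio session, then ask for remote control
// (media button) events. Activating the session is what may
// interrupt other apps' audio.
func startPlaybackSession() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback)
    try session.setActive(true)
    UIApplication.shared.beginReceivingRemoteControlEvents()
}
```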

CC @sicking, @jernoble, @richtr, @marcoscaceres. Anyone else?

@foolip (Member, Author) commented Feb 20, 2015

@jernoble, you clarified some of this in #1 (comment), but some more pedantic detail on the iOS coupling of these issues would be helpful. Specifically:

  1. Is it possible to activate and deactivate an audio session (thus interrupting other apps) without playing any audio?
  2. When an audio session is interrupted, is any audio automatically paused?
  3. Is ducking automatic and does an app know when its audio is being ducked?
  4. Is an active audio session enough to receive remote control events, or must there also be audio playback?
  5. Does starting audio playback implicitly activate an audio session?
  6. It looks like AVAudioSession is a per-app singleton. Would this make it difficult to have something more fine-grained than a per-tab audio focus/session concept?

There are probably more nuances I'm not aware of; basically, I'd like to learn what kind of Web platform API would be possible to implement in iOS Safari, assuming no private APIs that other apps don't have.

@jernoble commented

@foolip

(It's not clear to me if “Now Playing” means having an active audio session, or also having a playing media player.)

The former. The higher-level APIs like AVPlayer and MPMoviePlayerController will set up and activate an audio session on your behalf.

  1. Is it possible to activate and deactivate an audio session (thus interrupting other apps) without playing any audio?

Yes, modulo priority. A music-playing app will not be allowed to interrupt a phone call, for example.

  2. When an audio session is interrupted, is any audio automatically paused?

If you interact with an audio session through an AVPlayer or MPMoviePlayerController, then yes, they will pause when they receive an audio session interruption notification. If you interact with an audio session through an AVAudioSession, it's up to you to implement a "pause" behavior when you receive said notification.
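To illustrate the AVAudioSession case Jer describes, here is a sketch of observing interruption notifications and implementing pause/resume by hand; `pausePlayback`/`resumePlayback` are hypothetical app functions, stubbed here only so the block is self-contained:

```swift
import AVFoundation

func pausePlayback() { /* app-specific */ }
func resumePlayback() { /* app-specific */ }

// Sketch: apps using AVAudioSession directly must observe interruption
// notifications and implement their own pause behavior.
func observeInterruptions() {
    NotificationCenter.default.addObserver(
        forName: AVAudioSession.interruptionNotification,
        object: AVAudioSession.sharedInstance(),
        queue: .main
    ) { note in
        guard let raw = note.userInfo?[AVAudioSessionInterruptionTypeKey] as? UInt,
              let type = AVAudioSession.InterruptionType(rawValue: raw)
        else { return }
        switch type {
        case .began:
            // Another session interrupted us: pause our own audio.
            pausePlayback()
        case .ended:
            // Interruption over; optionally resume.
            resumePlayback()
        @unknown default:
            break
        }
    }
}
```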

  3. Is ducking automatic...

Yes, because ducking is controlled by the interrupting session. E.g., a navigation app will cause the audio from a music app to be ducked when speaking turn-by-turn directions.

... and does an app know when its audio is being ducked?

No, I don't believe so.
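A sketch of the interrupting side, which matches the point above that ducking is controlled by the interrupting session (the ducked app gets no notification); the navigation-prompt scenario is an assumption for illustration:

```swift
import AVFoundation

// Sketch: a session configured with .duckOthers lowers other apps'
// audio while it is active, e.g. for a spoken turn-by-turn prompt.
func speakPrompt() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playback, options: [.duckOthers])
    try session.setActive(true)
    // ...play the spoken prompt here...
    // Deactivating with this option lets other audio return to
    // full volume.
    try session.setActive(false, options: [.notifyOthersOnDeactivation])
}
```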

  4. Is an active audio session enough to receive remote control events, or must there also be audio playback?

No, I think an active audio session is enough, though I haven't verified this.

  5. Does starting audio playback implicitly activate an audio session?

See the first comment; only if you use a high-level playback API. If your app plays audio via a low-level audio API without activating an audio session manually, not only will that app not receive remote control events, but no audio will be produced by the hardware.
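A minimal sketch of that low-level case, assuming AVAudioEngine as the lower-level API (the same applies to Audio Units): without the explicit `setActive(true)`, the engine runs but nothing reaches the hardware.

```swift
import AVFoundation

// Sketch: lower-level playback requires manually activating the
// audio session before audio will reach the hardware.
func startEngine() throws {
    try AVAudioSession.sharedInstance().setActive(true)
    let engine = AVAudioEngine()
    let player = AVAudioPlayerNode()
    engine.attach(player)
    engine.connect(player, to: engine.mainMixerNode, format: nil)
    try engine.start()
    player.play() // schedule buffers/files on `player` to hear audio
}
```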

  6. It looks like AVAudioSession is a per-app singleton. Would this make it difficult to have something more fine-grained than a per-tab audio focus/session concept?

So this is no longer about iOS in general and instead is about WebKit on iOS specifically: I haven't seen a difficulty in implementing (non-web exposed) remote control events for and

@foolip (Member, Author) commented Feb 23, 2015

When an audio session is interrupted, is any audio automatically paused?

If you interact with an audio session through an AVAudioSession, it's up to you to implement a "pause" behavior when you receive said notification.

Does this mean that it's technically possible to continue producing audio even when your session becomes inactive? I assume that there's forced muting for phone calls, but what about when one music player is interrupted by another? (Assuming they both use the low-level audio APIs.)

Is an active audio session enough to receive remote control events, or must there also be audio playback?

No, I think an active audio session is enough, though I haven't verified this.

This makes me optimistic. Do you think a similar model would work for a Web-exposed API, where one can get remote control events by activating a session that doesn't need to be tied to an HTMLMediaElement or AudioContext internally? (Good defaults for existing uses of HTMLMediaElement are important, of course.)

@foolip (Member, Author) commented Feb 23, 2015

I neglected lock screen controls in my first comment. Here's what I found:

On Android, it used to be handled by RemoteControlClient, which has been deprecated in favor of MediaSession.

On iOS, it's part of Now Playing Information which appears to be tied to Remote Control Events.

I'm not certain, but it looks like coupling lock screen controls to remote control events is the way to go. As a user, I'd certainly appreciate it if any app that's playing while the screen is locked can be stopped from the lock screen.
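For reference, a sketch of how lock screen metadata and controls are wired up on iOS through MPNowPlayingInfoCenter and MPRemoteCommandCenter; `pausePlayback` is a hypothetical app function stubbed in for self-containment:

```swift
import MediaPlayer

func pausePlayback() { /* app-specific */ }

// Sketch: publish Now Playing metadata for the lock screen and handle
// its pause button, which arrives as a remote control event.
func publishNowPlaying(title: String, artist: String) {
    MPNowPlayingInfoCenter.default().nowPlayingInfo = [
        MPMediaItemPropertyTitle: title,
        MPMediaItemPropertyArtist: artist,
    ]
    _ = MPRemoteCommandCenter.shared().pauseCommand.addTarget { _ in
        pausePlayback()
        return .success
    }
}
```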

foolip added a commit to foolip/mediasession that referenced this issue Mar 2, 2015
This is motivated by @jernoble's explanation of the iOS model, as it
does not seem like media playback is required, only an active media
session. This hasn't been verfied, however:
w3c#9 (comment)
@foolip (Member, Author) commented Mar 27, 2015

Some more rambling ideas in https://github.com/whatwg/media-keys/blob/766130421a85e6a101c99e63570a31b5ce300323/MediaSession.md#integration-with-audiocontext-and-htmlmediaelement

@jernoble, have you been able to figure out precisely what the restrictions are on iOS? In my mind, the “an active MediaSession is required in order to start playing audio” model seems somewhat reasonable, but it appears to be the reverse of what iOS does.

@foolip (Member, Author) commented May 20, 2015

Closing this now. The spec as it is answers these questions as follows: to activate a media session you need to attempt to play a media element, and only an active media session of kind "content" will get audio focus and media key events. (The transient kinds will get transient audio focus but no UI or events.)

foolip closed this as completed May 20, 2015
@doomdavve (Contributor) commented

What we've gathered when testing this on iOS (using AVAudioSession and AVAudioPlayer):

  • As Jer says above, you do need an active session to play audio. So not the other way around; your session doesn't become active by playing.
  • You may interrupt and take part in the audio focus system without actually (yet) playing yourself. Setting your session to active pauses iTunes.
  • You may receive remote control events like play from the lock-screen without actually playing/before playing.
  • Whether you are actually playing seems to affect UI and metadata: if you set your session to active and provide metadata like artist and title, the metadata will not show up until you actually start to play.

The last point may force us to special-case metadata handling if we ever separate media session activation from media playback in a web-exposed way.

@doomdavve (Contributor) commented

Additional observations:

  • Requesting remote control events for your app will hook up the play button in the lock screen UI, provided no other app is already playing. This is independent of your media session being active and of whether you have (or had) any playing media.
  • If another app is playing in the background (iTunes or so), activating the media session pauses the other app, but it doesn't give you control of the lock screen and won't give you remote control events until you actually start playing; i.e. the other app is still playable using the lock screen, by toggling the paused state. This holds until you start playing.
