The OCP Sprint #15

Open
JarbasAl opened this issue Apr 11, 2023 · 2 comments

@JarbasAl
Member

JarbasAl commented Apr 11, 2023

sprint to get OCP into its next stage; this should accompany ovos-core 0.0.8 if possible, but since OCP has an independent release cycle it is ok to do this during the 0.0.9 dev cycle

please suggest additions

quick refresher on OCP playback types (OCP result setting):

from enum import IntEnum

class PlaybackType(IntEnum):
    SKILL = 0  # skills handle playback whatever way they see fit
    VIDEO = 1  # video results
    AUDIO = 2  # results should be played audio only, if possible using GUI
    AUDIO_SERVICE = 3  # results should be played without using the GUI
    MPRIS = 4  # external MPRIS compliant player
    WEBVIEW = 5  # GUI webview, render a url instead of media player
    UNDEFINED = 100  # data not available, hopefully status will be updated soon

refresher on playback mode (OCP setting)

class PlaybackMode(IntEnum):
    AUTO = 0  # play each entry as considered appropriate, i.e. make it happen the best way possible
    AUDIO_ONLY = 10  # only consider audio entries
    VIDEO_ONLY = 20  # only consider video entries
    FORCE_AUDIO = 30  # cast video to audio unconditionally (audio can still play in gui)
    FORCE_AUDIOSERVICE = 40  # cast everything to audio service backend, gui will not be used
    EVENTS_ONLY = 50  # only emit ocp events, do not display or play anything. allows integration with external interfaces
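
As a rough illustration of how the playback mode could interact with a result's playback type, here is a minimal sketch based only on the comments in the two enums above. `resolve_playback` is a hypothetical helper, not an existing OCP function, and the choices of returning `None` for dropped entries and of leaving `SKILL` results untouched are assumptions of this sketch.

```python
from enum import IntEnum
from typing import Optional


class PlaybackType(IntEnum):  # subset of the enum quoted above
    SKILL = 0
    VIDEO = 1
    AUDIO = 2
    AUDIO_SERVICE = 3


class PlaybackMode(IntEnum):
    AUTO = 0
    AUDIO_ONLY = 10
    VIDEO_ONLY = 20
    FORCE_AUDIO = 30
    FORCE_AUDIOSERVICE = 40
    EVENTS_ONLY = 50


AUDIO_TYPES = (PlaybackType.AUDIO, PlaybackType.AUDIO_SERVICE)


def resolve_playback(mode: PlaybackMode,
                     playback: PlaybackType) -> Optional[PlaybackType]:
    """Hypothetical helper: playback type to use, or None if nothing is played."""
    if mode == PlaybackMode.EVENTS_ONLY:
        # only events are emitted, nothing is displayed or played
        return None
    if mode == PlaybackMode.AUDIO_ONLY:
        return playback if playback in AUDIO_TYPES else None
    if mode == PlaybackMode.VIDEO_ONLY:
        return playback if playback == PlaybackType.VIDEO else None
    if mode == PlaybackMode.FORCE_AUDIO:
        # cast video to audio, audio may still be shown in the GUI
        return PlaybackType.AUDIO if playback == PlaybackType.VIDEO else playback
    if mode == PlaybackMode.FORCE_AUDIOSERVICE:
        # assumption: only VIDEO/AUDIO entries are cast, SKILL is left alone
        if playback in (PlaybackType.VIDEO, PlaybackType.AUDIO):
            return PlaybackType.AUDIO_SERVICE
        return playback
    return playback  # AUTO: keep whatever the result asked for
```

For example, under `FORCE_AUDIO` a `VIDEO` result would come back as `AUDIO`, while under `AUDIO_ONLY` it would be dropped.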

refresher on media_type (OCP result setting), used in NLP/Search stage

class MediaType(IntEnum):
    GENERIC = 0
    AUDIO = 1
    MUSIC = 2
    VIDEO = 3
    AUDIOBOOK = 4
    GAME = 5
    PODCAST = 6
    RADIO = 7
    NEWS = 8
    TV = 9
    MOVIE = 10
    TRAILER = 11
    VISUAL_STORY = 13
    BEHIND_THE_SCENES = 14
    DOCUMENTARY = 15
    RADIO_THEATRE = 16
    SHORT_FILM = 17
    SILENT_MOVIE = 18
    BLACK_WHITE_MOVIE = 20
    CARTOON = 21

    ADULT = 69  # filter these results by default, don't show in featured media
    HENTAI = 70  # filter these results by default, don't show in featured media
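
To make the "filter these results by default" comments concrete, a minimal sketch follows. The `featured_media` helper, the `allow_adult` flag and the dict-shaped results are all hypothetical names used for illustration, not part of OCP.

```python
from enum import IntEnum


class MediaType(IntEnum):  # subset of the enum quoted above
    GENERIC = 0
    MUSIC = 2
    MOVIE = 10
    ADULT = 69
    HENTAI = 70


# media types that are hidden unless explicitly enabled
FILTERED_BY_DEFAULT = {MediaType.ADULT, MediaType.HENTAI}


def featured_media(results, allow_adult=False):
    """Hypothetical helper: drop filtered media types unless allowed."""
    if allow_adult:
        return list(results)
    return [r for r in results if r["media_type"] not in FILTERED_BY_DEFAULT]
```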

dev strategy:

  • create video service
  • extract ocp gui into a video service plugin
  • make OCP headless
  • develop server for simplified testing (host OCP skills in a separate dev machine)
  • work on the new NLP apis to add fundamental concepts needed for other tasks
  • develop new plugins + skills that depend on them in parallel
  • end2end test sprint
  • skill maintenance and development

NLP

refresher on how OCP intent matching currently works

- OCP registers itself via padatious intents "play {query}" / "next {song}" etc
- "{query}" is parsed internally by OCP
- OCP contains an internal intent parser based on Padacioso
- OCP will look for specific media type requests and perform a search
- if there are no results, or no media type was matched, OCP will send a generic search
- individual OCP skills decide what media types they want to answer
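
The flow above can be sketched roughly as follows. This is only an illustration: a regular expression and a keyword table stand in for the Padatious/Padacioso intent parsing, and `extract_query`, `classify`, `search` and `music_skill` are hypothetical names, not OCP APIs.

```python
import re
from enum import IntEnum


class MediaType(IntEnum):  # subset of the enum quoted above
    GENERIC = 0
    MUSIC = 2
    PODCAST = 6
    MOVIE = 10


# hypothetical keyword table standing in for the internal intent parser
MEDIA_KEYWORDS = {
    "movie": MediaType.MOVIE,
    "podcast": MediaType.PODCAST,
    "music": MediaType.MUSIC,
    "song": MediaType.MUSIC,
}


def extract_query(utterance):
    """Stand-in for the 'play {query}' intent, returns None if it does not match."""
    match = re.match(r"^play (?P<query>.+)$", utterance.strip().lower())
    return match.group("query") if match else None


def classify(query):
    """Look for a specific media type request inside the query."""
    for keyword, media_type in MEDIA_KEYWORDS.items():
        if re.search(rf"\b{keyword}\b", query):
            return media_type
    return MediaType.GENERIC


def search(utterance, skills):
    """Search with the matched media type, fall back to a generic search."""
    query = extract_query(utterance)
    if query is None:
        return []
    media_type = classify(query)
    results = [r for skill in skills for r in skill(query, media_type)]
    if not results and media_type != MediaType.GENERIC:
        results = [r for skill in skills for r in skill(query, MediaType.GENERIC)]
    return results


def music_skill(query, media_type):
    """Example skill: decides by itself which media types it answers."""
    if media_type in (MediaType.MUSIC, MediaType.GENERIC):
        return [f"music result for {query}"]
    return []
```

Here each skill is modelled as a plain callable taking the query and the media type, which keeps the decision of what to answer inside the skill.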

Audio Service

refresher on AudioService

- AudioService entries come from mycroft.conf (explicitly enabled)
- There is an ordered list of preferred plugins to be used
- each plugin reports what kind of uri/file extension it can handle  
    - file: , http: , https: , spotify: , mp3 , wav , ...
- OCP will try plugins by preference order until one can handle the requested uri
- OCP keeps track of the playlist (mix and match media types)
- plugins only handle play/pause/resume/seek when told by OCP
  • device based audio plugins -> allow location_alias (eg kitchen / living room...):
    • spotify plugin https://github.com/OpenVoiceOS/ovos-audio-plugin-spotify -> provide dynamic AudioService entries, 1 per device
    • sonos plugin -> provide dynamic AudioService entries, 1 per device
    • DLNA plugin -> provide dynamic AudioService entries, 1 per device
    • hardware based audio plugins
      • external soundcard inputs (eg, cassette player connected to usb mic input, see video demo) -> provide multiple AudioService entries (from config), on/off commands only + ocp search result
      • new MediaType.ANALOG_AUDIO_INPUT
      • old radio connected to smart plug -> provide multiple AudioService entries (from config), on/off commands only + ocp search result
      • new PlaybackType.EXTERNAL_AUDIO_OUTPUT
      • bluetooth speaker (connect + delegate to some software player such as vlc) -> provide multiple AudioService entries (from config)
      • pulseaudio speaker (connect to remote pulseaudio sink + delegate to some software player such as vlc) -> provide multiple AudioService entries (from config)
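
The preference-ordered plugin selection described in the AudioService refresher could look roughly like this. It is a minimal sketch: `AudioPlugin`, `supports` and `select_plugin` are hypothetical names, and the way schemes and file extensions are extracted from the uri is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AudioPlugin:
    """Hypothetical plugin description: what it reports it can handle."""
    name: str
    schemes: Tuple[str, ...] = ()      # eg "http", "spotify"
    extensions: Tuple[str, ...] = ()   # eg "mp3", "wav"


def supports(plugin: AudioPlugin, uri: str) -> bool:
    """Match on the uri scheme first, then on the file extension."""
    scheme = uri.split(":", 1)[0].lower() if ":" in uri else ""
    if scheme and scheme in plugin.schemes:
        return True
    if "." in uri:
        return uri.rsplit(".", 1)[-1].lower() in plugin.extensions
    return False


def select_plugin(uri: str, plugins: List[AudioPlugin],
                  preferred: List[str]) -> Optional[AudioPlugin]:
    """Try plugins in preference order until one can handle the uri."""
    by_name = {p.name: p for p in plugins}
    for name in preferred:
        plugin = by_name.get(name)
        if plugin is not None and supports(plugin, uri):
            return plugin
    return None
```

With dynamic entries (1 per device), each device would simply show up as one more named entry in `plugins`, and a location_alias could map to that name.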

Video Service

refresher on how OCP currently handles PlaybackType.VIDEO

- OCP checks if GUI is available
- if GUI not available, ignore PlaybackType.VIDEO results
   - some media types, such as MediaType.MUSIC, might be cast to an audio stream  
      (eg, youtube music/podcast/audiobook)
- issue playback to GUI to handle both audio (with animations) and video if possible
  • introduce video service, similar to audio plugins but capable of video
    • fully optional -> OCP is headless by default
    • GUI frameworks may provide a VideoPlugin if they can handle playback
    • can also handle audio, if plugin provides companion visuals
    • device based video plugins -> allow location_alias (eg kitchen / living room...):
      • chromecast plugin -> provide dynamic VideoService entries, 1 per device
    • application based video plugins -> app names instead of locations (eg vlc / mplayer...):
    • hardware based video plugins -> on/off commands only
      • hdmi/RCA/composite video capture (VHS/DVD players, consoles....) - video demo
      • new MediaType.ANALOG_VIDEO_INPUT
      • old TV connected to smart plug
      • new PlaybackType.EXTERNAL_VIDEO_OUTPUT
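
The current handling of PlaybackType.VIDEO without a GUI, as described in the refresher, can be sketched like this. `Result`, `usable_results` and the `CASTABLE_TO_AUDIO` set are hypothetical, and casting to `AUDIO_SERVICE` specifically is an assumption of the sketch (the refresher only says "an audio stream").

```python
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import List


class PlaybackType(IntEnum):  # subset of the enum quoted earlier
    VIDEO = 1
    AUDIO = 2
    AUDIO_SERVICE = 3


class MediaType(IntEnum):  # subset of the enum quoted earlier
    MUSIC = 2
    AUDIOBOOK = 4
    PODCAST = 6
    MOVIE = 10


@dataclass(frozen=True)
class Result:
    """Hypothetical search result, reduced to the fields needed here."""
    uri: str
    media_type: MediaType
    playback: PlaybackType


# assumption: taken from the music/podcast/audiobook example in the refresher
CASTABLE_TO_AUDIO = {MediaType.MUSIC, MediaType.PODCAST, MediaType.AUDIOBOOK}


def usable_results(results: List[Result], gui_available: bool) -> List[Result]:
    """Without a GUI, ignore video results unless they can be cast to audio."""
    if gui_available:
        return list(results)
    kept = []
    for result in results:
        if result.playback != PlaybackType.VIDEO:
            kept.append(result)
        elif result.media_type in CASTABLE_TO_AUDIO:
            kept.append(replace(result, playback=PlaybackType.AUDIO_SERVICE))
    return kept
```

With a video service in place, the `gui_available` check would presumably become "is any video plugin able to handle this result".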

Web Player Service

refresher on how OCP currently handles PlaybackType.WEBVIEW

- OCP checks if GUI is available
- if GUI not available, ignore PlaybackType.WEBVIEW results
- use native QML browser + execute javascript on page load

skills can provide OCP results to be opened in a browser and can additionally request javascript to be executed on page load (eg, to make video fullscreen, or to remove html elements or ads). this is meant to handle web platforms that don't offer direct streams, include DRM, or require login, such as netflix, youtube, etc

Session support

Favorite Songs

  • allow "liking" a song
    • GUI click
    • intent, "I like that song"
  • automatic playlist of liked songs
    • once user voice recognition is in, do this per user -> "play my favorite music" -> playlist for that user
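
A minimal sketch of the "liked songs" playlist, kept per user so it could later be tied to voice recognition. The `Favorites` class and the `"default"` user name are hypothetical; persistence and the GUI/intent triggers are left out.

```python
class Favorites:
    """Hypothetical in-memory store of liked songs, one playlist per user."""

    def __init__(self):
        self._liked = {}  # user name -> ordered list of liked uris

    def like(self, uri, user="default"):
        """Called from a GUI click or an 'I like that song' intent."""
        playlist = self._liked.setdefault(user, [])
        if uri not in playlist:  # liking twice does not duplicate the entry
            playlist.append(uri)

    def playlist(self, user="default"):
        """Automatic playlist for 'play my favorite music'."""
        return list(self._liked.get(user, []))
```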

Performance

Misc Bugs

Server

  • create REST endpoint for standalone OCP
    • load selected OCP skills
    • search media results
    • NLP classifiers

Skills

UnitTests

  • reach 75% coverage for OCP
  • unittests for media_type intent matching
  • end2end tests for concurrent session support
  • end2end tests for 1 skill of each media_type
  • end2end tests for "like" functionality
@JarbasAl JarbasAl self-assigned this Apr 11, 2023
@JarbasAl JarbasAl pinned this issue Apr 11, 2023
@JarbasAl JarbasAl changed the title from "roadmap" to "The OCP Sprint" Oct 3, 2023
@forslund

Lots of good things in here. One thing I thought of when doing the first draft of the Spotify skill was how the check for whether a uri is supported is done: it checks if the uri starts with the uri-type reported by the audio service followed by ://.

In the spotify skill / audio backend I ran into the issue that spotify "uris" start with spotify:xxx, which could not be verified.

I suggest extending the audioservice with a method can_handle() or uri_is_supported() that returns True if the uri (or track-reference) can be handled. A default implementation can be made replicating today's behavior, and skills needing other checks may override it.
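
A minimal sketch of what that suggestion could look like. The classes below are simplified stand-ins rather than the real audio service base class, and the exact prefix check in the Spotify override is an assumption.

```python
class AudioBackend:
    """Simplified stand-in for an audio service backend."""

    def supported_uris(self):
        """uri-types reported by the backend, eg ["http", "https", "file"]."""
        return []

    def can_handle(self, uri):
        """Default: replicate today's check, uri-type followed by ://."""
        return any(uri.startswith(f"{scheme}://")
                   for scheme in self.supported_uris())


class HttpBackend(AudioBackend):
    def supported_uris(self):
        return ["http", "https"]


class SpotifyBackend(AudioBackend):
    def supported_uris(self):
        return ["spotify"]

    def can_handle(self, uri):
        # spotify references look like spotify:track:xxx, without the //
        return uri.startswith("spotify:") or super().can_handle(uri)
```

Backends that are happy with the current behavior would not need to change anything, since the default `can_handle()` keeps using `supported_uris()`.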

@NeonJarbas
Contributor

initial docs OpenVoiceOS/ovos-technical-manual#14

@JarbasAl JarbasAl transferred this issue from OpenVoiceOS/ovos-ocp-audio-plugin Feb 25, 2024