Sean DuBois edited this page Apr 1, 2020 · 18 revisions

Pion WebRTC Media API

This document details a completely new media API for Pion WebRTC. The current media API has deficiencies that prevent it from being used in some production workloads. This document doesn't aim to modify or extend the existing API; we are looking at it with fresh eyes.

API Requirements

API Users

If you can think of more use cases please provide them, this list is not exhaustive!

Sending pre-recorded content to viewer(s)

A user has an audio/video file on disk and wants to send the content to many viewers. There will be no congestion control, but there will be some loss handling (NACK). If the remote viewer doesn't support the codec we offer, handshaking will fail.

Relaying RTP Traffic (with no feedback)

A user has an existing RTP feed (e.g. an RTSP camera) and wants to send the content to many viewers. There will be no congestion control, but there will be some loss handling (NACK). If the remote viewer doesn't support the codec we offer, handshaking will fail.

Sending live generated content

A user will be encoding content and sending it to many viewers; this could be an MCU, or capturing a webcam or desktop (like github.com/nerdism/neko). There will be congestion control and packet loss handling (NACK/PLI). The user should be informed of the codecs the remote supports, and then be able to generate what is requested on the fly.

Ingesting WebRTC for Later Playback

A user wants to save media from a remote peer to disk. This could be for later playback or some other async task. We need to ensure the best experience possible by providing loss handling and congestion control. Latency doesn't matter as much.

Ingesting WebRTC for Live Playback

A user wants to consume media from a remote peer live. This could be used for processing (like GoCV) or live playback. We need to ensure the best experience possible by providing loss handling and congestion control. We will also need to be careful not to add much latency, as this could hurt the entire experience.

Relaying WebRTC Traffic

Users should be able to build the classical SFU use cases. For each peer you will have one PeerConnection and transfer all tracks across it. If possible we should support simulcast and SVC; if neither is supported we should just request the lowest bitrate that works for all peers. Beyond that we should pass everything through and let de-jittering happen on each receiver's side. This needs more research.
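The "pass everything through" approach can be sketched as a simple fan-out: every packet from a publisher is copied, untouched, into each subscriber's queue. All names here are illustrative, not the proposed API:

```go
package main

import "fmt"

// fanOut delivers every packet from one publisher to every subscriber
// without touching the payload, leaving de-jittering to the receivers.
func fanOut(packets [][]byte, subscribers []chan []byte) {
	for _, pkt := range packets {
		for _, sub := range subscribers {
			sub <- pkt // a real SFU would drop rather than block on a slow subscriber
		}
	}
	for _, sub := range subscribers {
		close(sub)
	}
}

func main() {
	subs := []chan []byte{make(chan []byte, 4), make(chan []byte, 4)}
	fanOut([][]byte{[]byte("rtp-1"), []byte("rtp-2")}, subs)
	for i, sub := range subs {
		for pkt := range sub {
			fmt.Printf("subscriber %d got %s\n", i, pkt)
		}
	}
}
```

The open questions (simulcast layer selection, SVC, per-subscriber bitrate) all sit on top of this loop, which is why the section flags it as needing more research.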

Code that works in native and web

Users should be able to write idiomatic WebRTC code that works in both their native and Web applications. They should be able to call getUserMedia and have it work across both platforms. This portability is also very important for our ability to test.

API Features

Without yet specifying how the APIs work, this is what usage will look like at a high level.

Set supported codecs at PeerConnection Level

A user on startup will declare what codecs they will support.

They can push a list of codecs, and can control:

  • Type (Audio/Video)
  • Name (H264, Opus, VP8)
  • Payload Format Specifier (H264 Profile)

They can also set additional audio attributes and additional video attributes (experimental settings per media type).

TODO-- Do we allow people to set mapping (for rtx) and rtcp-fb? I think we can handle this all internally.
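A sketch of what codec registration at the PeerConnection level could look like. The type and method names here are placeholders for discussion, not the proposed API:

```go
package main

import "fmt"

// CodecType distinguishes audio from video registrations.
type CodecType int

const (
	Audio CodecType = iota
	Video
)

// Codec carries the three user-controlled fields from the list above.
type Codec struct {
	Type CodecType
	Name string // H264, Opus, VP8
	Fmtp string // payload format specifier, e.g. an H264 profile
}

// MediaEngine holds the codecs a PeerConnection will offer; rtx and
// rtcp-fb mappings would be derived internally rather than set by users.
type MediaEngine struct{ codecs []Codec }

func (m *MediaEngine) RegisterCodec(c Codec) {
	m.codecs = append(m.codecs, c)
}

func main() {
	var m MediaEngine
	m.RegisterCodec(Codec{Video, "H264", "profile-level-id=42e01f"})
	m.RegisterCodec(Codec{Audio, "Opus", ""})
	fmt.Println(len(m.codecs)) // 2
}
```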

Create MediaStream/MediaStreamTrack
