360° Video Streaming and Playback Approaches
360° video has become popular, and we have been working on solutions for the efficient delivery of high-resolution content, in particular to TV devices in addition to Head-Mounted Displays (HMDs).
On this page, we summarise the different approaches to 360° video streaming and rendering, highlighting their respective challenges, advantages and disadvantages. The approaches are also showcased on our 360° Video Playout Demopage. We also show how interactive video technologies can enhance 360° video playback and the user experience, especially on TV devices.
- TV devices: This comprises TV sets, set top boxes, streaming devices like FireTV, AppleTV etc.
360° Video Streaming
A 360° video is recorded with an array of cameras to capture the full sphere of all possible viewing directions. The video format uses a projection method to store the sphere as a 'flat' video. The resolution of the Field of View (FOV, the section a viewer is looking at) depends directly on the resolution of the overall 360° video, since the FOV is always a fraction of it. To obtain a high resolution in the FOV (which is the image quality experienced on the client device), we therefore need a much higher resolution for the whole sphere.
The higher the 360° video resolution, the more bandwidth and computing resources are required. But even with a fast PC on hand, how do we get this amount of data onto our device over a standard internet connection?
To allow users to navigate in the video and choose a desired FOV, the appropriate picture has to be extracted and projected (according to the selected projection method); this is called rendering. There are two 360° rendering approaches, Client Side Rendering and Server Side Rendering, which are presented in detail in the following subsections.
- FOV: The Field of View is the part of a 360° video that is visible on a screen at a certain time
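To make the rendering step more concrete, the following minimal sketch maps a viewing direction to a pixel position in the stored 'flat' frame. It assumes an equirectangular projection (360° wide, 180° high) and an angle convention in which yaw 0° and pitch 0° point at the centre of the frame. The function name and the convention are illustrative assumptions, not part of the solution described on this page. A real renderer would repeat a mapping of this kind for every output pixel of the FOV and interpolate between source pixels.

```python
def direction_to_equirect_pixel(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to (x, y) in an equirectangular frame.

    Assumed convention (illustrative only):
      yaw_deg   in [-180, 180), 0 = horizontal centre of the frame
      pitch_deg in [-90, 90],   +90 = straight up (top row of the frame)
    """
    # Longitude maps linearly to the horizontal axis.
    u = (yaw_deg / 360.0 + 0.5) * width
    # Latitude maps linearly to the vertical axis, top row = +90 degrees.
    v = (0.5 - pitch_deg / 180.0) * height
    # Wrap around horizontally, clamp at the poles vertically.
    x = int(u) % width
    y = min(max(int(v), 0), height - 1)
    return x, y


# Looking straight ahead lands in the centre of a 3840x1920 frame.
print(direction_to_equirect_pixel(0, 0, 3840, 1920))
```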
Client Side Rendering
Client Side Rendering is the common method today to stream and play back 360° videos on devices such as Head-Mounted Displays (HMDs) and mobile devices. For example, YouTube and Facebook use this approach. In Client Side Rendering, the playout and distribution of 360° video is done in the same way as traditional video playout. It uses the same content distribution servers and network protocols as other video formats.
This approach requires streaming the full 360° content to the end-user device, where, however, only about 10% (depending on the FOV angle) is actually presented to the viewer, while the other 90% is discarded on the client side. This is a huge waste of bandwidth and limits the quality of the requested FOV (as there are upper limits for the total resolution of the full video, such as up to 8K).
- Angle: The viewing angle describes the portion of the 360° video that is visible, measured as an angle. For example, an angle of 90° means that 1/4 of the horizontal sphere is visible (360°/4 = 90°).
Streaming the whole 360° video to the client poses a number of challenges for the client. To achieve a certain quality of the actually presented picture (FOV), the whole 360° video needs to contain approx. 10 times the data actually presented, which has implications and challenges related to:
- network bandwidth, and
- processing capabilities
Both have a serious impact on performance as well as on image quality!
The following table shows the impact of the 360° video resolution (4K, 16K and 24K) and FOV angle (60°, 90° and 120°) on the FOV video resolution and wasted bandwidth.
As depicted in the table:
- row #1: a FOV with HD resolution can be achieved from a 4K 360° video and 120° wide FOV angle
- row #5: a FOV with 4K resolution can be achieved from a 16K 360° video and 90° wide FOV angle
- row #9: a FOV with 4K resolution can be achieved from a 24K 360° video and 60° wide FOV angle
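The relationship behind these rows can be sketched with simple arithmetic. The snippet below is an illustration under stated assumptions, not the original table: it assumes frame widths of 3840, 15360 and 23040 pixels for 4K, 16K and 24K, a FOV width proportional to the FOV angle, and a 16:9 FOV. Under these assumptions it reproduces the resolutions quoted in the text (for example HD for 4K at 120°, and 4K for 16K at 90° and for 24K at 60°).

```python
# Assumed frame widths in pixels for the labels used in the text.
WIDTH_PX = {"4K": 3840, "16K": 15360, "24K": 23040}


def fov_resolution(video_width_px, fov_angle_deg, aspect=(16, 9)):
    """Return (width, height) of the FOV cut out of a 360 degree video.

    Assumes the FOV width is the share fov_angle/360 of the full width
    and that the FOV has the given aspect ratio (16:9 by default).
    """
    fov_width = video_width_px * fov_angle_deg // 360
    fov_height = fov_width * aspect[1] // aspect[0]
    return fov_width, fov_height


for label in ("4K", "16K", "24K"):
    for angle in (120, 90, 60):
        w, h = fov_resolution(WIDTH_PX[label], angle)
        print(f"{label:>3} video, {angle:>3} deg FOV -> {w}x{h}")
```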
This is a full equirectangular image, 360° wide and 180° high, covering the full sphere.
A small angle results in 'zooming' into the image, which reduces the resolution of the visible area. A large viewing angle results in a 'fish eye'-like effect where the image gets distorted. The most commonly used values are between 60° and 100°.
You can clearly see that the visible area is only a fraction of the whole video.
- To serve a 4K TV with a 4K FOV at a 60° FOV angle, you need a 24K 360° video.
- To stream a 24K video to the client, an internet connection with a bandwidth of approximately 400 Mbit/s is necessary.
- Even if this amount of data can be transferred as a stream, the client will not be able to render such a large video due to limited processing power.
- More than 97% of the bandwidth is wasted, since only a fraction can be seen at any given time.
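The wasted share can be estimated with a short calculation. The sketch below assumes that the FOV covers the fraction angle/360 of the frame in each dimension, that is, the FOV and the full frame share the same aspect ratio, and that bandwidth scales with the number of pixels. This is a simplification, since the actual bitrate also depends on the codec and the content. The function names are illustrative.

```python
def pixel_ratio(fov_angle_deg):
    """How many times more pixels the full frame has than the FOV.

    Assumes the FOV covers fov_angle/360 of the frame per dimension.
    """
    per_axis = 360.0 / fov_angle_deg
    return per_axis ** 2


def wasted_share(fov_angle_deg):
    """Fraction of the transmitted pixels that lie outside the FOV."""
    return 1.0 - 1.0 / pixel_ratio(fov_angle_deg)


for angle in (120, 90, 60):
    print(f"{angle:>3} deg FOV: {wasted_share(angle):.1%} of the pixels are not shown")
```

For a 60° FOV this yields a factor of 36 between the full frame and the FOV, or roughly 97% of the pixels outside the visible area.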
Even the playback of an 8K video is a challenge for most modern laptops. Currently, with Client Side Rendering the only option is to settle for low quality in order to make the approach feasible at all. To overcome the challenges of Client Side Rendering, we developed the Server Side Rendering approach.
Server Side Rendering
The Server Side Rendering approach addresses the challenges of Client Side Rendering by reducing the required network bandwidth (more than 400 Mbit/s for a 24K video) and the processing requirements (rendering a FOV from the full 24K 360° video). The core idea, as depicted in the following architecture diagram, is to process the 360° video and compute the FOV video content on the server, and to stream only this video to the client. The client has a standard video player that plays back the FOV video. The player is embedded in a client control framework, which sends all navigation commands to the server to change the FOV. As a result, the bandwidth and processing requirements are reduced to the level needed for normal non-360° videos of the same resolution.
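As a rough illustration of the control loop, the sketch below shows how a server-side session could keep track of the current FOV and update it when a navigation command arrives from the client. The message format (a dictionary with `d_yaw` and `d_pitch` in degrees) and all names are hypothetical and chosen for illustration only. The actual protocol between the client control framework and the server is not specified on this page.

```python
from dataclasses import dataclass


@dataclass
class FovState:
    """Viewing direction and opening angle of one client session."""
    yaw_deg: float = 0.0
    pitch_deg: float = 0.0
    fov_angle_deg: float = 90.0


def apply_command(state, command):
    """Return the new FovState after a navigation command.

    `command` is a hypothetical message such as {"d_yaw": -30.0}.
    Yaw wraps around the full circle, pitch is clamped at the poles.
    """
    yaw = (state.yaw_deg + command.get("d_yaw", 0.0)) % 360.0
    pitch = state.pitch_deg + command.get("d_pitch", 0.0)
    pitch = max(-90.0, min(90.0, pitch))
    return FovState(yaw, pitch, state.fov_angle_deg)


# The client pressed "left" on the remote control: turn 30 degrees.
state = apply_command(FovState(), {"d_yaw": -30.0})
print(state)
```

The server would then render the next frames of the FOV video from the updated state, while the client keeps playing an ordinary video stream.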
Let us take row #9 from the table above as an example. To provide a 4K FOV with Server Side Rendering, we need a bandwidth of around 20 Mbit/s to stream a 4K video instead of a 24K video. With Client Side Rendering, we would need to stream 36 times as much content (the 24K video) to present the same 4K FOV (with slight variation due to the video codec used and other factors). In other words, if the bandwidth only allows streaming 4K video, comparing rows #3 and #9 shows that the maximum possible FOV resolution with Client Side Rendering is SD (640x360), while the maximum possible FOV resolution with Server Side Rendering is 4K (3840x2160).
A limitation of Server Side Rendering is scalability, since one rendering instance (with GPU) is needed per client. To solve this issue, we have developed a solution that extends the Server Side Rendering approach by rendering different combinations of FOVs in advance and making them available to clients via the existing video streaming infrastructure and Content Delivery Networks (CDNs), as depicted in the following architecture diagram.
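One way to picture the pre-rendering idea is a client that snaps the requested viewing direction to the closest FOV that was rendered in advance and then requests that stream from the CDN. The sketch below is purely illustrative: the 30° grid, the restriction to yaw only and the file naming are assumptions made for this example and do not describe the actual set of pre-rendered FOVs.

```python
def nearest_prerendered_yaw(requested_yaw_deg, step_deg=30):
    """Snap a requested yaw to the closest pre-rendered view centre.

    Assumes views were rendered every `step_deg` degrees around the
    horizon and that `step_deg` divides 360 evenly.
    """
    index = round((requested_yaw_deg % 360) / step_deg)
    return (index * step_deg) % 360


def stream_name(requested_yaw_deg, step_deg=30):
    """Build a hypothetical file name for the matching FOV stream."""
    yaw = nearest_prerendered_yaw(requested_yaw_deg, step_deg)
    return f"fov_yaw{yaw:03d}.mp4"


print(stream_name(100))
print(stream_name(350))
```

Because every client selects from the same finite set of streams, the streams can be cached and delivered like any other video, without a dedicated rendering instance per viewer.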