Twilio Announces New APIs to Enable Multi-User Augmented Reality Applications

Twilio (NYSE: TWLO), the leading cloud communications platform, today announced new capabilities to enable multi-user augmented reality (AR) as part of its Programmable Video platform. Developers will now be able to create more engaging communications for their customers that combine real-world immediacy with rich virtual content. To learn more, please visit:

Augmented reality applications use a device’s camera and microphone to overlay virtual content onto real-world environments through a video stream, enabling users to see and experience digital objects and information as if they were physically present. In the past, augmented reality applications have largely been built for a single user because of the complexity of implementing a shared or collaborative experience. To connect multi-user voice and video to an AR app, developers first had to solve audio and video routing and streaming, as well as data synchronization, all in real time.

Twilio Programmable Video now solves these problems and enables developers to create entirely new kinds of augmented reality apps with the following communications capabilities:

  • Global media server infrastructure
    Twilio’s Programmable Video platform provides globally distributed, low-latency media servers that offer complete client-side control over audio and video layout and spatialization, down to the individual client device. Twilio’s approach offers vastly lower latency, improved quality, and superior mobile integration relative to legacy multipoint control unit (MCU) media server technology, because it avoids transcoding or mixing audio and video on the server. This gives developers maximum flexibility to control layout and spatialization in their applications.
  • DataTrack API
    The new Twilio DataTrack API shares important metadata between endpoints without the need to set up a separate communications channel. The DataTrack can transport events about AR objects, details for audio and video spatialization, and more. Metadata is shared in real time so the AR experience can react to changes immediately and provide a more immersive environment for the user.
  • Media Sync API
    Media Sync APIs let developers synchronize AR metadata with real-time media, enabling accurate playback of video and augmented reality objects. Media Sync is coming soon to Twilio’s SDKs for iOS and Android. Developers can sign up to learn more at
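
As an illustration of the kind of metadata a data track could carry, the following TypeScript sketch encodes AR object events as JSON text, validates them on receipt, and applies them to a local copy of the shared scene. The message shape (ArObjectEvent), the DataSender interface, and the helper functions are hypothetical names chosen for this example and are not part of Twilio's SDKs; the sketch only assumes a channel that can send and deliver strings.

```typescript
// Hypothetical message shape for one AR object event (an assumption for
// illustration, not a Twilio type).
type Vec3 = [number, number, number];

interface ArObjectEvent {
  kind: "placed" | "moved" | "removed";
  objectId: string;
  position: Vec3;   // x, y, z in the shared scene
  sentAtMs: number; // sender clock in milliseconds
}

// Stand-in for any channel that can send a string to the other endpoints.
interface DataSender {
  send(data: string): void;
}

// Local copy of the shared scene: object id -> last known position.
type Scene = Record<string, Vec3>;

// Serialize an event and hand it to the channel.
function publishEvent(sender: DataSender, event: ArObjectEvent): void {
  sender.send(JSON.stringify(event));
}

// Parse and validate an incoming message; returns null for anything malformed.
function decodeEvent(raw: string): ArObjectEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const candidate = parsed as Record<string, unknown>;
  const kinds = ["placed", "moved", "removed"];
  const pos = candidate.position;
  const valid =
    typeof candidate.kind === "string" &&
    kinds.indexOf(candidate.kind) !== -1 &&
    typeof candidate.objectId === "string" &&
    Array.isArray(pos) &&
    pos.length === 3 &&
    pos.every((n) => typeof n === "number") &&
    typeof candidate.sentAtMs === "number";
  return valid ? (parsed as ArObjectEvent) : null;
}

// Update the local scene so every participant converges on the same state.
function applyEvent(scene: Scene, event: ArObjectEvent): void {
  if (event.kind === "removed") {
    delete scene[event.objectId];
  } else {
    scene[event.objectId] = event.position;
  }
}

// Receiving side: decode, then apply; malformed messages are ignored.
function handleMessage(scene: Scene, raw: string): boolean {
  const event = decodeEvent(raw);
  if (event === null) {
    return false;
  }
  applyEvent(scene, event);
  return true;
}
```

In an application, the object passed as DataSender would be whatever the SDK exposes for sending data, and handleMessage would be called from that SDK's message callback. Carrying a timestamp in each event is one possible way to line metadata up with media later; how Media Sync does this is not described here.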

These new capabilities in Twilio’s Programmable Video platform work alongside AR capabilities such as Apple’s ARKit and Google’s ARCore.

“Augmented reality is completely transforming the way we understand and interact with the world around us, especially in terms of remote collaboration,” said Rob Brazier, director of product for Twilio’s Programmable Video platform. “Imagine if the next time you called your cable company, you could simply show them a video of the flashing lights on your cable box and they could instantly know what’s wrong, and then guide you through the process of wiring it up correctly by overlaying the correct cabling on-screen in your physical space. The prospects for improving remote support and sales conversations is incredible.”

Augmented reality startup Streem chose Twilio Programmable Video to power communications for their home services app, which is demonstrated in this video. The new Twilio Programmable Video capabilities will enable developers like Streem to create a wide variety of AR communications applications for consumers and businesses alike.

“As one of the first companies implementing remote AR, our unique challenges have required strong technical solutions. Twilio’s top-notch team and products have met the challenge, and been nothing less than remarkable,” said Sean Adkinson, CTO and co-founder of Streem. “With first-class support and collaboration from the very beginning, Twilio is instrumental to Streem’s success, and they will play a large role as we continue to quickly scale and change the future of communication within augmented reality environments.”

Some of the augmented reality communications use cases that are now possible include:

  • Remote collaboration on virtual 3-D assets, for example in architectural, medical, or industrial settings
  • Real-time translation of audio content into a user’s native language, with subtitles visualized directly on the video stream
  • Syncing real-life emotional, physical and verbal data to virtual avatars in a gaming or telepresence environment

Twilio’s Programmable Video platform makes it easy to add real-time, multi-user audio and video communications to any web or mobile application. In addition to the new augmented reality APIs described above, Programmable Video supports both peer-to-peer and group rooms, enabling up to 50 participants to connect via audio and video. Twilio’s Programmable Video provides SDKs for iOS, Android and JavaScript, and is based on WebRTC, the dominant industry standard for real-time communications.

Developers can learn more about Twilio’s Programmable Video, download sample multi-user applications, and sign up at

About Twilio
Twilio’s mission is to fuel the future of communications. Developers and businesses use Twilio to make communications relevant and contextual by embedding messaging, voice, and video capabilities directly into their software applications. Founded in 2008, Twilio has over 800 employees, with headquarters in San Francisco and other offices in Bogotá, Dublin, Hong Kong, London, Madrid, Malmö, Mountain View, Munich, New York City, Singapore and Tallinn.