how much data does discord voice chat take

Discord Voice server contains two components: a signaling component and a media relay component called the selective forwarding unit or SFU. In practice, our entire message schema contains 16 columns, but the average message only has 4 values set. It has to replicate deletes to other nodes and do it even if other nodes are temporarily unavailable. We then sequentially query partitions until enough messages are collected. Discord continues to grow faster than we expected and so does our user-generated content. Which helps us understand as to how does Discord make money? Deleting a column and writing null to a column are the exact same thing. A big part of Discord is about sharing your gaming experiences with your friends. They both generate a tombstone. All of our signaling servers are written in Elixir allowing lots of code re-use. Access raw audio data to perform voice activity detection and share both game audio and video. It quickly became clear that our reads were extremely random and our read/write ratio was about 50/50. Implement our own volume control to avoid changing your global operating system volume. It has now been just over a year since we made the switch and, despite “the big surprise,” it has been smooth sailing. We are constantly monitoring voice quality (clients report quality metrics to our backend servers). Before choosing a new database, we had to understand our read/write patterns and why we were having problems with our current solution. There are several backend services that make voice possible, but we will focus on three of them — Discord Gateway, Discord Guilds and Discord Voice. The solution to this was simple: only write non-null values to Cassandra. During less "outdoor" months i hover around 150-200ish MB of data. And also you … Back in the day, players who wish to come together on parties, guilds, or raids can only use the methods available in-game to organize themselves. This is a lot of data that is ever increasing in velocity, size, and must remain available. We are investing heavily in our voice and video clients and infrastructure to make Discord the best voice and video experience for gamers. We setup our code to double read/write to MongoDB and Cassandra. Although SFU crashes are rare, we use the same mechanism for zero-downtime SFU updates. It really depends what you're doing, if you tend to load images lots, embeds enabled, voice chat etc. For clarity, we will use the term “guild” to represent a collection of users and channels — they are called “servers” in the client. Since all writes in Cassandra are upserts, that means you are generating a tombstone even when writing null for the first time. It is a required field! Once You’re in Server Settings: Slide yourself on over to the “Server Region” option on the right side. We decided to bucket our messages by time. This means our browser app relies on the WebRTC implementation offered by the browser. Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Every audio/video communication in Discord is multiparty. Figuring out that the message is corrupt and deleting it from database. The Discord Guilds server watches the service discovery system and assigns the least utilized voice server in the given region to the guild. Clients send periodic ping messages to ensure that the firewall remains open all the time. Introducing a new system into production is always scary so it’s a good idea to try to test it without impacting users. So how did this bite us? It became clear we had to somehow bound the size of partitions because a single Discord channel can exist for years and perpetually grow in size. You could easily have listviews hosting server icons and friendlists, another listview for chat threads, and on native code the same would take what... 20~50MB memory? This means that the communication is always initiated by the client, which reduces both client and backend complexity and also increases resilience against errors. In practice this has proved to be fine because for active Discords enough messages are usually found in the first partition and they are the majority. The downside is that if your voice drops below that level, the sound cuts off completely. Once 50 channels are reached, no more channels can be created inside that category. The original version of Discord was built in just under two months in early 2015. We want to avoid doing this one and don’t think we will have to do it. Arguably, one of the best databases for iterating quickly is MongoDB. Everything on Discord was stored in a single MongoDB replica set and this was intentional, but we also planned everything for easy migration to a new database (we knew we were not going to use MongoDB sharding because it is complicated to use and not known for stability). Slack vs Discord: Integrated File Sharing. 2. level 1. Write the whole message back when editing the message. Explore, If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. Everything on Discord was stored in a single MongoDB replica set and this was intentional, but we also planned everything for easy migration to a new database (we knew we were not going to use MongoDB sharding because it is complicated to use and not known for stability). Our homegrown SFU (written in C++) is responsible for forwarding audio and video traffic within channels. Cassandra 3 has a. You can now select a voice channel of your choice. Every client is notified about the new voice server and creates a voice WebSocket connection to the new voice server to start media setup. The channels can be configured a number of ways, and set aside for specific purposes, such as announcements, contests, or music. When a client displays “Voice Connected,” it means that the client successfully exchanged UDP messages with the selected Discord Voice server. The problem is that even though this is a small amount of messages it makes it harder to serve this data to the users. It took only 15mb to 1hr & 30min call. The partition contains multiple rows within it and a row within a partition is identified by the second K, which is the clustering key. You can think of a partition as an ordered dictionary. Within Game Subscriptions – Discord Nitro. The client opens a second WebSocket connection to the voice server (we call this the voice WebSocket connection) which is used for setting up media forwarding and speaking indication. When a client displays “Awaiting Endpoint,” it means that the Discord Guilds server is looking for the best Discord Voice server. This meant that if a user caused this query again then at worst Cassandra would be scanning only in the most recent bucket. We exchange all necessary information between client and server (less than 1200 bytes round trip) and SDP is synthesized from this information at the clients. In a year, this kind of server is unlikely to reach 1,000 messages. Once 500 channels are reached, no more channels can be created. The text rooms are very much like forums (though if you want an actual gaming forum, you should look elsewhere). Discord is a voice, video and text communication service to talk and hang out with your friends and communities. When a Discord Voice server dies, it fails the periodic ping and gets removed from the service discovery system. We have a handful of engineers working on both client and server components and looking after operations as well. Medium is an open platform where 170 million readers come to find insightful and dynamic thinking. Select “Save Changes,” and you’re good to go! Every audio/video communication in Discord is multiparty. WebRTC is available in all modern browsers and also as a native library to embed into applications. If the SFU crashes, it is restarted right away causing minimal service interruption (few dropped packets). We lowered the lifespan of tombstones from 10 days down to 2 days because we run. We are adding hardware video encoding to our desktop application for a better experience. Clients tear down their voice WebSocket connection to the old Discord Voice server and create a new voice WebSocket connection to the new Discord Voice server. Nothing was surprising, we got exactly what we expected. Our SFU is tailored to our use case offering maximum performance and thus the lowest cost. It implements a transport and encryption for both browser and standalone applications and translates between the two as it forwards media packets. Clients are also notified about the selected Discord Voice server. They have thousands of members sending thousands of messages a day and easily rack up millions of messages a year. Minimal Data Usage Since Discord’s a chat app, it doesn’t require too much data or high internet speed for optimal performance. How can it be null? We started digging and found a Discord channel that was taking 20 seconds to load. Importing messages into Cassandra went without a hitch and we were ready to try in production. Discord is a newcomer to the chat scene, but it’s made a big splash. For example, administrators can disable audio/video for offending participants. Large partitions put a lot of GC pressure on Cassandra during compaction, cluster expansion, and more. But by simply switching on the “low data usage” mode, users can save much more data, consuming only 134KB per minute. We don’t have dedicated DevOps engineers yet (only 4 backend engineers), so having a system we don’t have to worry about has been great. This provisioning includes lots of redundancy to handle data center failures and DDoS attacks. Remember that messages were indexed in MongoDB using channel_id and created_at? Discord Gateway, Discord Guilds and Discord Voice are all horizontally scalable. The best way to describe Cassandra to a newcomer is that it is a KKV store. Supporting large group channels (we have seen Doubling those to account for the two-way flow, that’s a total of 7.25 MB/minute minimum, 27 MB/minute maximum. This is actually part of our company culture: build quickly to prove out a product feature, b… The first thing to mention is that Discord is focused on voice chat. 1. If you are passionate about working with audio and video, check out our jobs page! Discord’s privacy policy says they may retain messages and VoIP data to help maintain services but that does not mean actual voice calls. Arguably, one of the best databases for iterating quickly is MongoDB. Cassandra advertises that it can support 2GB partitions! When a DDoS attack subsides, the server is added back to the service discovery system for serving voice traffic. Thanks to all our engineering efforts on both the client and server architecture, we are able to serve more than 2.6 million concurrent voice users with egress traffic of more than 220 Gbps (bits-per-second) and 120 Mpps (packets-per-second). Having a large partition also means the data in it cannot be distributed around the cluster. Discord lets friends chat with each other either one-to-one or as a group via a server. You can also write to any node and it will resolve conflicts automatically using “last write wins” semantics on a per column basis. Our Collection and Use of Personal Information: We collect the following categories of personal information: identifiers (such as your username, the email address you use to sign up, and your phone number, if you’ve chosen to provide it); commercial information (a record of what you’ve bought from Discord, if anything); financial data (payment information, if you’ve bought anything from Discord… We are currently running a 12 node cluster with a replica factor of 3 and will just continue to add new Cassandra nodes as needed. Due to the success of this project we have since moved the rest of our live production data to Cassandra and that has also been a success. How Much Money Does Discord Make Per Year? Large companies such as Netflix and Apple have thousands of Cassandra nodes. In Discord, voice/video communication is initiated by entering a voice channel or call. We decided early on to store all chat history forever so users can come back at any time and have their data available on any device. These properties combined allow for very powerful data modeling. The native WebRTC library lets you implement your own transport layer using the webrtc::TransportAPI. The client updates its voice state object using the gateway WebSocket connection. Discord Business Model Discord has more than 87 million users , and is planning to rule over the $1.7 billion worth of voice chat … Get an enhanced Discord experience for one low monthly cost. They almost always are requesting messages sent in the last hour and they are requesting them often. Slack and Discord both the tools, allows the users to share a wide range of files across their respective applications. In a follow-up to this post we will explore how we make billions of messages searchable. channel_id became the partition key since all queries operate on a channel, but created_at didn’t make a great clustering key because two messages can have the same creation time. The first K is the partition key and is used to determine which node the data lives on and where it is found on disk. First, WebRTC relies on the Session Description Protocol (SDP) to negotiate audio/video information between participants (which can be close to ten kilobytes in size round-trip). Reduce your bandwidth and CPU consumption during periods of silence — even very large voice channels only have a few concurrent speakers at any given time. Our desktop, iOS, and Android applications, however, make use of a single C++ media engine built on top of the WebRTC native library — specifically tailored to the needs of our users. The signaling component constantly monitors the SFU. The main reason is that gamers need high-quality, lag-free voice chat to communicate with each other when playing different games. 4 years ago. We looked at the largest channels on Discord and determined if we stored about 10 days of messages within a bucket that we could comfortably stay under 100MB. We noticed Cassandra was running 10 second “stop-the-world” GC constantly but we had no idea why. Minecraft is a close second, with 569,000 members We went from over 100 million total messages to more than 120 million messages a day, with performance and stability staying consistent. Supporting large group channels (we have seen 1000 people taking turns speaking) requires client-server networking architecture because peer-to-peer networking becomes prohibitively expensive as the number of participants increases. Apparently, just because it can be done, it doesn’t mean it should. Explore, If you have a story to tell, knowledge to share, or a perspective to offer — welcome home. Cassandra is an AP database which means it trades strong consistency for availability which is something we wanted. You can use it to send direct messages to friends, have video calls with them, voice chat and even screen share. We were writing 12 tombstones into Cassandra most of the time for no reason. The client also notices server failure due to the severed voice WebSocket connection and requests a voice server ping through the gateway WebSocket connection. This creates a choppy effect at times, and can sound like a faulty connection. Related data is stored contiguously on disk providing minimum seeks and easy distribution around the cluster. More often if the automatic input sensitivity is … The two Ks comprise the primary key. The Discord Guilds server confirms the failure, consults the service discovery system, and assigns a new Discord Voice server to the guild. WebRTC is a specification for real-time communication comprised of networking, audio, and video components standardized by both World Wide Web Consortium and Internet Engineering Task Force. An old Reddit post by a Discord staff member tried to clarify the situation around privacy. Discord is a chat app released in May 2013 by Discord Inc. (originally called Hammer and Chisel Inc.) that’s made by gamers, for gamers. Lastly, WebRTC uses the Secure Real-time Transport Protocol (SRTP) for media encryption. Provide system-wide push to talk functionality. While solving this problem, we noticed we were being very inefficient with our writes. This includes the voice backend server address and port, encryption method and keys, codec, and stream identification (about 1000 bytes). We just recently added a South Africa region. It is an anti-pattern to read-before-write (reads are more expensive) in Cassandra and therefore everything that Cassandra does is essentially an upsert even if you provide only certain columns. Luckily every ID on Discord is actually a Snowflake (chronologically sortable), so we were able to use them instead. Discord enables a noise gate by default for voice chat. Cassandra does this by treating deletes as a form of write called a “tombstone.” On read, it just skips over tombstones it comes across. Build a system to archive unused channels to flat files on Google Cloud Storage and load them back on-demand. In Discord’s free plan, the user can share files to the limit of 8MB and can extend this limit up to … These decisions enabled us to massively scale our operation with a small team and limited resources. Since it was public we joined it to take a look. Although we released video chat and screen sharing about a year ago, you can currently only use it with direct messaging. The Puzzles & Dragons Subreddit public Discord server was the culprit. A better way is to stop data-hungry apps from using too much data in the first place. Using the WebRTC native library allows us to use a lower level API from WebRTC (webrtc::Call) to create both send stream and receive stream. Learn more, Follow the writers, publications, and topics that matter to you, and you’ll see them on your homepage and in your inbox. What will change or stay the same about Discord, especially the decision to not track and sell user data, if Microsoft does acquire Discord? As such its annual revenues are mostly estimated. With a camera in every smartphone and traversing the globe increasingly common, video-calling is a great way to keep in touch with loved ones. Immediately after launching we started getting errors in our bug tracker telling us that author_id was null. The only way our team can support all these platforms is to take advantage of code re-use and WebRTC. A bunch of servers regions will pop up. The messages were stored in a MongoDB collection with a single compound index on channel_id and created_at. Configuring the Automatic Input Sensitivity Settings. We knew that in the coming year we would add even more ways for users to issue random reads: the ability to view your mentions for the last 30 days then jump to that point in history, viewing plus jumping to pinned messages, and full-text search. The original version of Discord was built in just under two months in early 2015. Discord is a VoIP, instant messaging and digital distribution platform designed for creating communities. This breaks down as follows: 2016 – $5 million; 2017 – $10 million; 2018 – $30 million; 2019 – $70 million; 2020 – $120 million; How Much Money Can Discord Make in the Future? 1Gb of data will be spent up within a day of you aren't all that careful. On the top left corner of your screen, tap the “ Three horizontal lines ” icon. We exchange a minimal amount of information when joining a voice channel. Curtailing your phone use as you near your data cap at the end of each month is no way to live. Discord’s audio and video features are implemented using WebRTC. It was time to migrate to a database more suited to the task. In the future, we want to use this information to automatically detect and reduce voice degradation issues. It is a voluntary program that focuses on influencing to take part in purchases. When you join a voice channel, you are assigned to a Discord Voice server. It said: Deleted Messages are deleted permanently. The latest reports in 2013 suggests that Skype uses 200-300mb of data (up & down combined) per hour of usage on a video call, and around 40-50mb per hour on a voice only call. When a user loaded this channel, even though there was only 1 message, Cassandra had to effectively scan millions of message tombstones (generating garbage faster than the JVM could collect it). Because of that the data is usually in the disk cache. This post gives a brief overview of the different technologies Discord uses to make audio/video communications a seamless reality. You’re free to experiment with all of the server regions freely. Voice chat heavy Discord servers send almost no messages. While Cassandra has schemas not unlike a relational database, they are cheap to alter and do not impose any temporary performance impact. Cassandra is known to have faster writes than reads and we observed exactly that. Finally, the SFU is also responsible for handling the real-time control transport protocol (RTCP), which is used for video quality optimization. Since those days the network speeds were slow and people had limited bandwidth, video chat was almost impossible without interruptions. Upgrade our message cluster from Cassandra 2 to Cassandra 3. If you have been paying attention you might remember how Cassandra handles deletes using tombstones (mentioned in Eventual Consistency). The noise gate helps eliminate electronic and background noise by cutting off all sound below a certain threshold. When offending users are moderated (server mute), their audio packets are dropped. In an instant, the IP phone converts your voice into digital data. Write on Medium, Windows automatically reduces volume of all applications when communications device is used, Captivate Your Community with Stage Channels, How Discord Scaled Elixir to 5,000,000 Concurrent Users, Inside 8-bit Music Theory’s Growing Video Game Soundtrack Server, How Discord Maintains Performance While Adding Features, Celebrating Black Community Leaders on Discord, Circumvent auto-ducking behavior of the default communications device on Windows. The clustering key acts as both a primary key within the partition and how the rows are sorted. This means that certain features work better in the installed application than in the browser. If you wish to adjust your voice chat settings, then click on “ Voice setting s”.