Linux Plumbers Conference virtual town hall

Posted Jun 23, 2020 18:45 UTC (Tue) by SiB (subscriber, #4048)
In reply to: Linux Plumbers Conference virtual town hall by corbet
Parent article: Linux Plumbers Conference virtual town hall

I am operating two BigBlueButton instances. We seem to consistently run into scalability limits at around 200 participants in a single session. I could not identify any resource that was limited.
Recently we had two conferences on one server with around 50 participants each, where CPU cycles became a bottleneck.
That is a virtual server with four dedicated CPU cores, 16 GB of RAM, and a 1 Gbit/s network link. The second instance has six cores and 32 GB.
Both instances are installed on top of an Ubuntu 16.04 image provided by the hosting service.
I am looking forward to any tips and tricks on how to scale this up.

Linux Plumbers Conference virtual town hall

Posted Jun 23, 2020 19:13 UTC (Tue) by mbunkus (subscriber, #87248)

We're running BBB clusters with Scalelite (meaning we have many meetings, but none of them is too big). Some more or less random thoughts/points.

Those numbers are pretty much in line with what everyone else writes everywhere, including the BigBlueButton docs & FAQ itself. 200 people in a single meeting is pretty much the limit, even for beefy machines.

If you want to scale up more, there's the Scalelite project which acts as a load-balancer (meaning you can achieve more meetings in total, but not more people per meeting as all participants of a meeting are handled on the same machine). Note that Scalelite's load balancer is currently very, very coarse; it distributes the next incoming meeting based on the number of active meetings on each node, not on the actual load or the number of active participants. There are patches for that, though. Nevertheless, even when scaling horizontally your nodes must be beefy enough to carry Yet Another Full Meeting (the load balancer doesn't know how many people will join your meeting). Or to put it differently: it's massively better to have four cluster nodes with eight CPU cores instead of eight nodes with four cores, as one big meeting can easily saturate a four-core machine.
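
To illustrate why balancing on meeting count alone is coarse, here is a minimal Python sketch. It is not Scalelite's code; the `Node` class, its fields, and both picker functions are hypothetical names made up for this example, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of one cluster node (not a Scalelite type)."""
    name: str
    meetings: int      # number of active meetings on the node
    participants: int  # total participants across those meetings


def pick_by_meetings(nodes):
    """Coarse strategy: the node with the fewest active meetings wins."""
    return min(nodes, key=lambda n: n.meetings)


def pick_by_participants(nodes):
    """Finer strategy: the node with the fewest participants wins."""
    return min(nodes, key=lambda n: n.participants)


# Node "a" hosts one big meeting, node "b" hosts three small ones.
nodes = [
    Node("a", meetings=1, participants=150),
    Node("b", meetings=3, participants=30),
]

print(pick_by_meetings(nodes).name)      # a
print(pick_by_participants(nodes).name)  # b
```

Counting meetings sends the next meeting to the node that is already carrying the big one; counting participants does not. Neither strategy can know how many people will join the new meeting, which is why each node still needs headroom for a full meeting.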

In general: when in doubt, use more CPU cores. RAM isn't as important. Always use dedicated CPU cores in virtual environments (that should be obvious for every latency-sensitive application). 6 cores/32 GB is definitely a bad ratio; 8 cores/10 GB is what we use at the moment for our cluster nodes. Thinking about going 12/12.

As for networking: a 1 Gbit/s link suffices. You'll pretty much always run out of CPU before you run out of bandwidth.

Keep in mind that incoming audio & video streams induce different amounts of load. A person joining in "listening only" mode without video doesn't impose much load, if any. A user with active video & audio induces ~3.2 times the load of a user having only audio active. Muted audio counts as having audio active. If you use phone-based dial-in, each active phone call counts as a regular "audio active" user.
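
Those ratios can be turned into a rough back-of-the-envelope estimate. The following Python sketch expresses load in units of "one audio-active user". The factor 3.2 is the one quoted above; treating listen-only as exactly zero is a simplifying assumption (it is "not much, if any"), and the function and constant names are made up for illustration.

```python
# Relative load per user, in units of "one audio-active user".
LISTEN_ONLY_WEIGHT = 0.0   # assumption: "not much load, if any" rounded down to zero
AUDIO_WEIGHT = 1.0         # audio active; muted microphones and phone dial-ins count too
VIDEO_AUDIO_WEIGHT = 3.2   # roughly 3.2 times an audio-only user


def meeting_load(listen_only=0, audio=0, video_audio=0):
    """Estimate the relative load of a meeting in audio-user units."""
    return (listen_only * LISTEN_ONLY_WEIGHT
            + audio * AUDIO_WEIGHT
            + video_audio * VIDEO_AUDIO_WEIGHT)


# A lecture with 50 listen-only participants versus a meeting
# where 50 people joined with (possibly muted) microphones.
print(meeting_load(listen_only=50))
print(meeting_load(audio=50))
print(meeting_load(audio=40, video_audio=10))
```

The estimate says nothing about how many units a given machine can carry; that still has to be measured on the actual hardware.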

Only run your servers up to 80, maybe 85% CPU usage. Otherwise you'll probably get load spikes resulting in very noticeable audio interruptions (which are much, much more annoying than short video interruptions).
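
One way to check that budget is to sample `/proc/stat` twice and compute the busy share in between. The Python sketch below assumes a Linux host and the usual field order of the aggregate `cpu` line (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the 80% default are choices made for this example, not part of any BBB tooling.

```python
def read_cpu_times(path="/proc/stat"):
    """Return the first eight counters of the aggregate 'cpu' line (Linux only)."""
    with open(path) as f:
        fields = f.readline().split()
    # fields[0] is the label "cpu"; guest counters are skipped because
    # the kernel already includes them in user/nice.
    return [int(x) for x in fields[1:9]]


def cpu_usage_percent(before, after):
    """Busy CPU share in percent between two samples of read_cpu_times()."""
    total = sum(after) - sum(before)
    # idle + iowait are treated as "not busy"
    idle = (after[3] + after[4]) - (before[3] + before[4])
    if total <= 0:
        return 0.0
    return 100.0 * (total - idle) / total


def within_budget(usage_percent, limit=80.0):
    """True while usage stays at or below the chosen ceiling."""
    return usage_percent <= limit


# Invented samples, so the example also runs on non-Linux machines.
before = [100, 0, 50, 850, 0, 0, 0, 0]
after = [300, 0, 150, 1550, 0, 0, 0, 0]
usage = cpu_usage_percent(before, after)
print(usage)                 # 30.0
print(within_budget(usage))  # True
```

On a real node one would call `read_cpu_times()` twice, a few seconds apart, and feed both samples to `cpu_usage_percent()`. Short sampling intervals matter here because brief spikes are what cause the audible dropouts.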

Keep in mind that after recording a meeting the node it was recorded on will re-encode the video with ffmpeg (e.g. in order to provide both WebM and MP4 versions, depending on configuration, of course). This will induce further load. You can modify the corresponding scripts to at least nice those processes so that FreeSWITCH (which does all audio muxing) has higher priority.
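
The idea of nicing the encoder can be sketched in a few lines of Python, independent of whatever language the actual processing scripts are written in. The helper name `niced` and the ffmpeg arguments are made up for illustration; the only real tool assumed is the standard `nice` command with its `-n` option.

```python
import shlex


def niced(cmd, level=19):
    """Prefix a command list with 'nice -n LEVEL' (19 is the lowest priority)."""
    return ["nice", "-n", str(level), *cmd]


# Hypothetical re-encoding command; the file names are placeholders.
encode = ["ffmpeg", "-i", "recording.webm", "recording.mp4"]

# Show the resulting command line instead of running it here.
print(shlex.join(niced(encode)))
# nice -n 19 ffmpeg -i recording.webm recording.mp4

# To actually run it: subprocess.run(niced(encode), check=True)
```

A niced encoder still uses all the CPU it can get when the machine is otherwise idle; it only yields when something with higher priority, such as FreeSWITCH, wants to run.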

Linux Plumbers Conference virtual town hall

Posted Jun 23, 2020 19:21 UTC (Tue) by mbunkus (subscriber, #87248)

I meant to start with: "We're running BBB clusters with Scalelite _for schools_"; without that the part about many but small meetings doesn't make much sense, I guess.

I also meant to say that we're running our clusters on virtualized hardware.

Linux Plumbers Conference virtual town hall

Posted Jun 23, 2020 19:47 UTC (Tue) by SiB (subscriber, #4048)

Thanks!
We typically do not use much video. The news I take away is that muted audio costs much more than listen-only. That may have been the problem last week, when one lecture with an audience of 50 listen-only participants was disrupted by a meeting with 50 audio participants.
The excess RAM is bundled with the cores, nothing I can do about that.