To test the capacity of dialers and web servers in a cluster: keep loading agents onto one of each until that server begins to act erratically. Then back off by 20% and see if that fixes the problem. Repeat the test a few times to be sure where your threshold is. But do NOT make the mistake of assuming this threshold is set in stone. There are too many variables. You will still get a good idea of the basic limit for that single server, and you can then apply the same limit to all identical servers to estimate your system capacity.
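The arithmetic behind that estimate is simple enough to sketch in a few lines of Python. This is only an illustration: the function names and the example figure of 50 agents are made up, and the 20% back-off is the rule of thumb from above, not a measured value.

```python
def safe_agents_per_server(erratic_at: int, backoff_percent: int = 20) -> int:
    """Agents to run on one server after backing off from the erratic point.

    erratic_at      -- agent count at which the test server started misbehaving
    backoff_percent -- how far to back off (20 is the rule of thumb, not a law)
    """
    if erratic_at < 0 or not 0 <= backoff_percent < 100:
        raise ValueError("erratic_at must be >= 0 and backoff_percent in 0..99")
    # Integer math so the result always rounds down to a whole agent.
    return erratic_at * (100 - backoff_percent) // 100


def cluster_capacity(erratic_at: int, identical_servers: int,
                     backoff_percent: int = 20) -> int:
    """Rough capacity estimate: per-server limit times identical servers."""
    return safe_agents_per_server(erratic_at, backoff_percent) * identical_servers


# Hypothetical numbers: one dialer went erratic at 50 agents, 3 identical dialers.
print(cluster_capacity(50, 3))  # 120
```

Treat the result as a starting point for the next round of testing, not as a guarantee.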
If you can, capture the logs during the "erratic behavior" moment so you can investigate where your limitation came from. Chase it down and see if this is a hard barrier or something that can be raised (such as additional open files or network ports or meetme rooms). For instance, if the limitation was dropped packets, you can improve your network to avoid dropping packets and then test again.
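As one example of checking whether a limit can be raised, here is a rough Python sketch that reads the open-file limit. It uses the standard `resource` module (Unix only) and reports the limits of the process running the script, which is not necessarily what your Asterisk or web server process is running with. The `near_limit` helper and its 10% margin are assumptions for illustration.

```python
import resource


def open_file_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def near_limit(in_use: int, limit: int, margin_percent: int = 10) -> bool:
    """True when usage is within margin_percent of the limit.

    The 10% default margin is an arbitrary illustration, not a recommendation.
    """
    if limit == resource.RLIM_INFINITY:
        return False  # unlimited, so this is not your barrier
    return in_use * 100 >= limit * (100 - margin_percent)


soft, hard = open_file_limits()
print(f"open files: soft={soft} hard={hard}")
```

If the soft limit is below the hard limit, the barrier can usually be raised. If you are already at the hard limit, raising it takes a system-level change.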
Avoid running more than one role on a single server as you get more servers in the cluster.
Single server: DB/Dialer/Web/Archive
Two servers: DB/Web/Archive | Dialer
Three servers: DB/Web/Archive | Dialer | Dialer
Four servers: DB/Archive | Dialer | Dialer | Web
Five servers: DB/Archive | Dialer | Dialer | Dialer | Web
Six servers: DB/Archive | Dialer | Dialer | Dialer | Web | Web
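If you want the layouts above in a form a script can use (for planning or documentation), a plain lookup table is enough. This is just the same list written as Python data. The names `LAYOUTS` and `count_role` are made up for this sketch.

```python
# Each entry is one server; roles sharing a server are joined with "/".
LAYOUTS = {
    1: ["DB/Dialer/Web/Archive"],
    2: ["DB/Web/Archive", "Dialer"],
    3: ["DB/Web/Archive", "Dialer", "Dialer"],
    4: ["DB/Archive", "Dialer", "Dialer", "Web"],
    5: ["DB/Archive", "Dialer", "Dialer", "Dialer", "Web"],
    6: ["DB/Archive", "Dialer", "Dialer", "Dialer", "Web", "Web"],
}


def count_role(server_count: int, role: str) -> int:
    """How many servers in the given layout carry the named role."""
    return sum(role in server.split("/") for server in LAYOUTS[server_count])


print(count_role(6, "Dialer"))  # 3
```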
The Archive server can actually be anywhere in the system. In fact, it can be a completely unrelated server in a different location running just Web/FTP without any loss of functionality.
If your DB must share a role, consider Web rather than Dialer on the DB server. It is not necessary to disable a role on a server that isn't using it, but you can change configuration options to cut the waste from an unused role, and disabling the role entirely does free up some resources. Not running Asterisk on the DB server doesn't free up huge resources, but it does free up SOME. If your DB server is your chokepoint (common), then freeing up resources on it is necessary.
Keep an eye on the load average (using htop or uptime) on all servers during production. As long as the load average stays at or below half of the CPU core count, that server is not overloaded yet. For example: staying under 4.0 on an 8-core system is smooth sailing. Once you exceed half of the core count (sustained for more than a few seconds), that server is nearing full load. It will progress from half to full MUCH faster than it did from idle (0.1?) to half. And once it approaches full load (8.0 on an 8-core system), it will overload quickly. That's not to say an 8-core system will crash and burn if it's running at 16.0, because I've seen systems survive that many times. But any hiccup after full load can cause a crash. For instance, running a report or increasing a dial ratio. Anything can trigger failure at that point.
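The half-of-core-count rule is easy to turn into a quick check. A minimal Python sketch follows. The `load_status` function and its labels are made up here, and the thresholds are the rules of thumb from this post, not hard limits. `os.getloadavg()` is standard Python but only works on Unix-like systems.

```python
import os


def load_status(load: float, cores: int) -> str:
    """Classify a load average against the core count (rule of thumb only)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if load <= cores / 2:
        return "ok"
    if load < cores:
        return "nearing full load"
    return "at or past full load"


try:
    # The 1-minute average smooths out spikes that last only a few seconds.
    one_min, five_min, fifteen_min = os.getloadavg()
    print(load_status(one_min, os.cpu_count() or 1))
except (OSError, AttributeError):
    print("load average not available on this platform")
```

Run it from cron or a monitoring script on each server and watch for anything that stops saying "ok" during production hours.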
Happy Hunting!