Web stops responding and times out
At times our web interface stops responding and requests time out. I have increased the MaxClients setting per the suggestions in vici-server-tuning.conf:
StartServers 450
MinSpareServers 250
MaxSpareServers 500
ServerLimit 768
MaxClients 768
MaxRequestsPerChild 1000
KeepAliveTimeout 15
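As a rough sanity check on these values, the sketch below estimates how much RAM a full set of prefork children could need and how many children would fit on one of the 8 GB boxes. The per-child size (20 MB) and the amount reserved for Asterisk and the OS (4096 MB) are assumed figures for illustration only, not measurements from these servers; the real per-child resident size would need to be read from ps or top.

```python
def prefork_memory_mb(max_clients: int, per_child_mb: float) -> float:
    """Worst-case RAM (in MB) if every allowed Apache child is running."""
    return max_clients * per_child_mb


def max_clients_for(ram_mb: float, reserved_mb: float, per_child_mb: float) -> int:
    """Largest child count that fits in RAM after reserving memory for other services."""
    return int((ram_mb - reserved_mb) // per_child_mb)


# Assumed values, for illustration only.
PER_CHILD_MB = 20      # hypothetical resident size of one httpd child
RESERVED_MB = 4096     # hypothetical allowance for Asterisk, the OS, and cache
DIALER_RAM_MB = 7998   # 8190116k from the dialer top output, rounded down to MB

worst_case = prefork_memory_mb(768, PER_CHILD_MB)
fits = max_clients_for(DIALER_RAM_MB, RESERVED_MB, PER_CHILD_MB)
print(f"768 children at {PER_CHILD_MB} MB each: {worst_case:.0f} MB")
print(f"Children that fit on an 8 GB dialer under these assumptions: {fits}")
```

If the measured per-child size is anywhere near the assumed figure, 768 children would not fit on the 8 GB dialers that now also serve web, so a lower limit on those boxes may be worth testing.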
We have about 70 agents, and the dialers are placing 220+ calls at a time. I have recently changed our dialers to also serve web. Our top output is as follows:
Web/Mysql
top - 16:01:46 up 16:25, 2 users, load average: 0.33, 0.35, 0.33
Tasks: 846 total, 1 running, 845 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.7%us, 0.4%sy, 0.0%ni, 96.8%id, 0.0%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 64543M total, 13343M used, 51200M free, 114M buffers
Swap: 4063M total, 0M used, 4063M free, 10224M cached
---------------------------------------------------------------------------------------
load average: 0.28, 0.34, 0.32 mysqld 5.2.13-MariaDB-log up 0 day(s), 17:24 hrs
40 threads: 4 running, 44 cached. Queries/slow: 1.6K/0 Cache Hit: 99.94%
Opened tables: 0 RRN: 401 TLW: 3.2M SFJ: 0 SMP: 0 QPS: 0
ID USER HOST DB TIME COMMAND STATE INFO
1 slave 192.168.40.24:49 59065 Binlog Has sent all
65 cron 192.168.40.28:58 asterisk Query statistics SELECT ... FROM call_log where caller_code='V6021601500008630000' and server_ip='192.168.40.28' o
2405304 mysqltop localhost Query show full processlist
---
Dialer 1
top - 16:02:41 up 3 days, 5:31, 4 users, load average: 0.20, 0.20, 0.29
Tasks: 623 total, 3 running, 609 sleeping, 0 stopped, 11 zombie
Cpu(s): 2.2%us, 1.0%sy, 0.0%ni, 96.6%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8190120k total, 7870052k used, 320068k free, 234268k buffers
Swap: 4176892k total, 0k used, 4176892k free, 6702716k cached
Dialer 2
top - 16:03:00 up 3 days, 5:32, 3 users, load average: 0.16, 0.10, 0.15
Tasks: 597 total, 1 running, 593 sleeping, 0 stopped, 3 zombie
Cpu(s): 1.7%us, 1.1%sy, 0.0%ni, 97.1%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8190116k total, 7952540k used, 237576k free, 245972k buffers
Swap: 4176892k total, 200k used, 4176692k free, 6832976k cached
Dialer 3/Web
top - 16:03:19 up 4 days, 6:57, 2 users, load average: 1.51, 0.77, 0.72
Tasks: 602 total, 1 running, 591 sleeping, 0 stopped, 10 zombie
Cpu(s): 4.3%us, 1.4%sy, 0.0%ni, 93.6%id, 0.1%wa, 0.0%hi, 0.6%si, 0.0%st
Mem: 8190116k total, 7971888k used, 218228k free, 144504k buffers
Swap: 4176892k total, 9360k used, 4167532k free, 5038260k cached
Dialer 4/Web
top - 16:03:46 up 4 days, 6:58, 2 users, load average: 0.67, 0.95, 1.04
Tasks: 620 total, 2 running, 618 sleeping, 0 stopped, 0 zombie
Cpu(s): 9.0%us, 3.9%sy, 0.0%ni, 86.5%id, 0.1%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 8190116k total, 8010164k used, 179952k free, 126304k buffers
Swap: 4176892k total, 14304k used, 4162588k free, 5054024k cached
Dialer 5/Web
top - 16:04:23 up 4 days, 6:58, 2 users, load average: 1.19, 1.54, 1.44
Tasks: 649 total, 3 running, 623 sleeping, 0 stopped, 23 zombie
Cpu(s): 15.6%us, 3.1%sy, 0.0%ni, 80.5%id, 0.5%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 8190116k total, 7952352k used, 237764k free, 125892k buffers
Swap: 4176892k total, 21576k used, 4155316k free, 5011864k cached
Dialer 6/Web
top - 16:05:11 up 4 days, 6:58, 3 users, load average: 0.03, 0.05, 0.05
Tasks: 577 total, 2 running, 575 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.2%us, 1.2%sy, 0.0%ni, 97.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 16078M total, 3639M used, 12439M free, 120M buffers
Swap: 4078M total, 0M used, 4078M free, 1047M cached
The 6th dialer is obviously not being used much, but Dialers 3 and 5 are getting beaten up. The dialers continually have zombie processes, and they are always VD_amd.agi. I don't have dialer 2 & 3 in the web proxy cluster because of an issue with a custom script connecting to the DB.
I am planning on adding more RAM to the 8 GB dialers to try to help with swap usage.
I also have the log table archive script running at the default 2 months, but the log tables still hold about 12-18 million records each.
Any suggestions to combat this problem are greatly appreciated.
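To confirm which commands the zombies belong to, a small tally of the ps output can help. This is a minimal sketch, assuming the input comes from `ps -eo stat=,comm=` on a Linux dialer; the function name and the sample text are hypothetical, not captured from these servers.

```python
from collections import Counter


def count_zombies(ps_output: str) -> Counter:
    """Tally zombie processes by command name.

    Expects one process per line: a state field, whitespace, then the command.
    """
    counts = Counter()
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        stat, comm = parts
        if stat.startswith("Z"):  # state Z marks a zombie (defunct) process
            name = comm.replace(" <defunct>", "").strip()
            counts[name] += 1
    return counts


# Hypothetical sample input, for illustration only.
sample = (
    "S    httpd\n"
    "Z    VD_amd.agi <defunct>\n"
    "Z+   VD_amd.agi <defunct>\n"
    "Ss   asterisk\n"
)
for name, n in count_zombies(sample).most_common():
    print(f"{n:4d}  {name}")
```

On a live dialer the same function could be fed the text returned by `subprocess.run(["ps", "-eo", "stat=,comm="], capture_output=True, text=True).stdout`.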