Intermittent "Can't connect to MySQL server" and dead calls

Post by alo » Wed Jan 08, 2014 1:58 pm

Good Morning Friends and Experts.

We are having trouble with some SQL connect errors, along with DEAD calls showing on the realtime report while agents are still in a call.

We have several asterisk servers and 2 web servers.

My thought is that our network switch is not able to handle all the connections to the database from all the servers.

The reason I think this is that I have been getting the following errors from the dialing servers:

DBI connect('asterisk:IP.IP.IP.IP:3306','cron',...) failed: Can't connect to MySQL server on 'IP.IP.IP.IP' (110) at /usr/share/astguiclient/ADMIN_keepalive_ALL.pl line 706.
Couldn't connect to database: Can't connect to MySQL server on 'IP.IP.IP.IP' (110) at /usr/share/astguiclient/ADMIN_keepalive_ALL.pl line 706.

But this is intermittent.

Also this one:

DBI connect('asterisk:IP.IP.IP.IP:3306','cron',...) failed: Can't connect to MySQL server on 'IP.IP.IP.IP' (110) at /usr/share/astguiclient/AST_CRON_audio_1_move_mix.pl line 142.
Couldn't connect to database: Can't connect to MySQL server on 'IP.IP.IP.IP' (110) at /usr/share/astguiclient/AST_CRON_audio_1_move_mix.pl line 142.

Also, in mtop I sometimes notice queries waiting 4 or more seconds, like the one below. I'm not sure it's related, but my thought is that the database may not be communicating with the web server/Asterisk server that sent the query because the network switch is overloaded:

Command: Query State: Waiting for table level lock

SELECT list_id, gmt_offset_now, called_since_last_reset, phone_code,
phone_number, address3, alt_phone, called_count, security_phrase
FROM vicidial_list
WHERE lead_id='940055'
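
For anyone wanting to measure how often queries pile up behind a table lock rather than catching them by eye in mtop, the process list can be filtered by state. This is only a rough sketch: the function name lock_waiters and the 4-second default are made up for illustration, and it assumes the mysql client can log in without prompting (for example via ~/.my.cnf).

```shell
#!/usr/bin/env bash
# lock_waiters: read tab-separated SHOW FULL PROCESSLIST rows on stdin and
# print id, seconds waited and query text for every thread that has been in
# "Waiting for table level lock" for at least N seconds (default 4).
# Columns: 1=Id 2=User 3=Host 4=db 5=Command 6=Time 7=State 8=Info
lock_waiters() {
    local min_secs="${1:-4}"
    awk -F'\t' -v min="$min_secs" \
        '$7 == "Waiting for table level lock" && $6 + 0 >= min {
             print $1 "\t" $6 "\t" $8
         }'
}

# Example call, left commented out so the sketch has no side effects:
# mysql -B -N -e 'SHOW FULL PROCESSLIST' | lock_waiters 4
```

Running it from a loop every few seconds during a busy period would show which tables and queries are the usual victims.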


All Servers were installed with ViciBox5.x86_64-5.0.3.iso and use SVN 2016

The DB server has 16 cores and 16 GB of RAM, with RAID 10 SSDs.

My only other thought is that my.cnf for MariaDB or server-tuning.conf for Apache is not configured correctly for this much load.

-----my.cnf-----

[mysqld]
port      = 3306
socket      = /var/run/mysql/mysql.sock
datadir   = /var/lib/mysql
skip-external-locking
key_buffer_size = 1400M
max_allowed_packet = 2M
table_open_cache = 8192
sort_buffer_size = 4M
net_buffer_length = 8K
read_buffer_size = 4M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 64M
thread_cache_size = 50
query_cache_size = 0
thread_concurrency=8
skip-name-resolve
connect_timeout=60
long_query_time=3
log_slow_queries
max_connections=868
open_files_limit=24576
max_heap_table_size=32M
expire_logs_days=3
default-storage-engine=MyISAM
table_definition_cache=8192
table_cache=8192
myisam_recover
myisam_repair_threads=1


# This will disable networking
#skip-networking



----server-tuning.conf----

<IfModule prefork.c>
   # number of server processes to start
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#startservers
   StartServers         5
   # minimum number of server processes which are kept spare
   # http://httpd.apache.org/docs/2.2/mod/prefork.html#minspareservers
   MinSpareServers      5
   # maximum number of server processes which are kept spare
   # http://httpd.apache.org/docs/2.2/mod/prefork.html#maxspareservers
   MaxSpareServers     10
   # highest possible MaxClients setting for the lifetime of the Apache process.
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#serverlimit
   ServerLimit        250
   # maximum number of server processes allowed to start
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxclients
   MaxClients         250
   # maximum number of requests a server process serves
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild
   MaxRequestsPerChild  10000
</IfModule>

# worker MPM
<IfModule worker.c>
   # initial number of server processes to start
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#startservers
   StartServers         3
   # minimum number of worker threads which are kept spare
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#minsparethreads
   MinSpareThreads     25
   # maximum number of worker threads which are kept spare
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxsparethreads
   MaxSpareThreads     75
   # upper limit on the configurable number of threads per child process
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#threadlimit
   ThreadLimit         64
   # maximum number of simultaneous client connections
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxclients
   MaxClients         250
   # number of worker threads created by each child process
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#threadsperchild
   ThreadsPerChild     25
   # maximum number of requests a server process serves
   # http://httpd.apache.org/docs/2.2/mod/mpm_common.html#maxrequestsperchild
   MaxRequestsPerChild  10000
</IfModule>


#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On

#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100

#
# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 15


I hope I have provided all the information needed.
Any ideas would be greatly appreciated.
I really appreciate being part of such a great community.
alo
 
Posts: 197
Joined: Wed Jun 20, 2012 10:21 am

Post by williamconley » Wed Jan 08, 2014 8:34 pm

alo wrote: We have several asterisk servers and 2 web servers

1) Welcome to the Party! 8-)

2) As you are obviously new here, I have some suggestions to help us all help you:

When you post, please post your entire configuration including (but not limited to) your installation method and vicidial version with build.

This IS a requirement for posting along with reading the stickies (at the top of each forum) and the manager's manual (available on EFLO.net, both free and paid versions)

You should also post: Asterisk version, telephony hardware (model number is helpful here), cluster information if you have one, and whether any other software is installed in the box. If your installation method is "from scratch" you must post your operating system and should also post the .iso version from which you installed your original operating system. If your installation is "Hosted" list the site name of the host.

If this is a "Cloud" or "Virtual" server, please note the technology involved along with the version of that technology (ie: VMware Server Version 2.0.2). If it is not, merely stating the Motherboard model # and CPU would be helpful.

Similar to This:

Vicibox X.X from .iso | Vicidial X.X.X-XXX Build XXXXXX-XXXX | Asterisk X.X.X | Single Server | No Digium/Sangoma Hardware | No Extra Software After Installation | Intel DG35EC | Core2Quad Q6600
_____________

3) What you've shown is an interesting symptom. But without some basic information ... it'll be hard to even guess. (Um ... "several asterisk servers"?)

4) Do you have the "huge" mysql configuration in place? Those would be the my.cnf settings to allow massive mysql connections for a large cluster. These settings are available for auto-install during Vicibox's iso install method.

5) Have you captured packets during a fail or taken other steps to see WHY the fail occurred? Dropped packet is a different cause entirely than a refused connection due to a connection limit being reached. Your mysql configuration will determine whether a failed connection should be logged. If so, your mysql log could contain some ... valuable information 8-) (and if not ... start looking for dropped packets!)
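
As a starting point for telling those causes apart, the number in parentheses at the end of the "Can't connect to MySQL server" message is the operating system's errno, and on Linux 110 means the connection attempt timed out rather than being refused. A server that has hit max_connections normally answers with its own "Too many connections" error (1040) instead. The sketch below only decodes the number; the function names are invented for illustration.

```shell
#!/usr/bin/env bash
# extract_errno: pull the trailing "(NNN)" out of a DBI/MySQL connect error
# line read from stdin. Prints nothing if the line does not match.
extract_errno() {
    sed -n "s/.*Can't connect to MySQL server on '[^']*' (\([0-9][0-9]*\)).*/\1/p"
}

# explain_connect_errno: map a Linux errno from such a message to a hint
# about where to look next.
explain_connect_errno() {
    case "$1" in
        110) echo "ETIMEDOUT: no reply to the TCP handshake; look for dropped packets" ;;
        111) echo "ECONNREFUSED: host answered but nothing accepted the connection on that port" ;;
        113) echo "EHOSTUNREACH: no route to the database host" ;;
        *)   echo "errno $1: see errno(3) for its meaning" ;;
    esac
}

# Example using one of the lines from the first post:
line="Couldn't connect to database: Can't connect to MySQL server on 'IP.IP.IP.IP' (110) at /usr/share/astguiclient/ADMIN_keepalive_ALL.pl line 706."
explain_connect_errno "$(printf '%s\n' "$line" | extract_errno)"
```

A timeout like this points toward the network path or a firewall silently dropping packets, which is why a packet capture during a failure is worth the trouble.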
Vicidial Installation and Repair, plus Hosting and Colocation
Newest Product: Vicidial Agent Only Beep - Beta
http://www.PoundTeam.com # 352-269-0000 # +44(203) 769-2294
williamconley
 
Posts: 20258
Joined: Wed Oct 31, 2007 4:17 pm
Location: Davenport, FL (By Disney!)

Post by alo » Thu Jan 09, 2014 1:41 pm

So sorry, sir. I thought I had included all that information, but I understand now how this may be relevant. We have 9 Asterisk servers, all 8-core with 16 GB RAM and SSDs.

However, I was able to catch this in the logs: "nf_conntrack: table full, dropping packet."

I raised nf_conntrack_max with:
/sbin/sysctl -w net.netfilter.nf_conntrack_max=196608


and it seemed to help a LOT.
In case anyone else has similar issues. It appears you may also need to raise the hash table size too...
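
To check how close a server is to the limit before packets start dropping, the current entry count can be compared with the maximum. A minimal sketch, assuming the nf_conntrack module is loaded so the /proc entries exist; the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# conntrack_pct: integer percentage of the connection-tracking table in use.
conntrack_pct() {
    local count="$1" max="$2"
    echo $(( count * 100 / max ))
}

# report_conntrack: print live usage if the kernel exposes it on this host.
report_conntrack() {
    local dir=/proc/sys/net/netfilter count max
    if [ -r "$dir/nf_conntrack_count" ] && [ -r "$dir/nf_conntrack_max" ]; then
        count=$(cat "$dir/nf_conntrack_count")
        max=$(cat "$dir/nf_conntrack_max")
        if [ "$max" -gt 0 ]; then
            echo "conntrack: $count of $max entries ($(conntrack_pct "$count" "$max")%)"
        fi
    else
        echo "nf_conntrack does not appear to be loaded on this host"
    fi
}

report_conntrack
```

The hash table size mentioned above is exposed at /sys/module/nf_conntrack/parameters/hashsize. Note that a value set with sysctl -w does not survive a reboot unless it is also written to the sysctl configuration.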

I also quickly installed ViciBox 5 on a VM (using ViciBox5.x86_64-5.0.3.preload.iso) to find the my-huge config, but the installer did not ask me about it. However, I did find my-bigvici.cnf in the /usr/src/astguiclient/conf directory and am using the settings from that.

Any idea where I can find the HUGE my.cnf? Never mind, I just found it at /usr/share/mysql/my-huge.cnf, but it seems less aggressive than the Vici one, so I'm not sure it's the right file...

Post by williamconley » Thu Jan 09, 2014 4:28 pm

You should modify your firewall to NOT track or have any effect on the 'inter-server' communications. Most firewalls have the ability to ignore (allow all traffic) on individual network cards. If your servers have a public IP (track/firewall) and an internal IP (NO track/NO firewall), you can improve performance by NOT having a firewall in place for local communications. This applies to inter-server communications as well as agent communications (although agents should be blocked from sensitive ports such as 22 and 3306).
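
On a box where the firewall is plain iptables, one way to do this is with NOTRACK rules in the raw table. The sketch below only prints the rules so they can be reviewed first; eth1 is an assumed name for the internal network card, and the function name is invented. If the firewall is managed by a distribution tool, the equivalent setting belongs in that tool's configuration instead.

```shell
#!/usr/bin/env bash
# notrack_rules: print (do not apply) raw-table rules that exempt all traffic
# on one interface from connection tracking.
notrack_rules() {
    local iface="$1"
    echo "iptables -t raw -A PREROUTING -i $iface -j NOTRACK"
    echo "iptables -t raw -A OUTPUT -o $iface -j NOTRACK"
}

# eth1 is an assumption: substitute the interface that carries the
# inter-server traffic.
notrack_rules eth1
```

Untracked packets no longer match rules that rely on connection state, so the filter rules for that interface need to accept the traffic explicitly, and any port blocking for agents has to be written as plain port rules.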

Post by Vince-0 » Fri Jan 10, 2014 7:52 am

In addition to William's recommendations, take a look at tcp_tw_recycle on your web servers. Reference this thread:

viewtopic.php?f=2&t=26800&p=94900#p94900

Vin.
Vince-0
 
Posts: 272
Joined: Fri Mar 02, 2012 4:27 pm
Location: South Africa

Post by williamconley » Fri Jan 10, 2014 10:14 pm

Vince-0 wrote: In addition to William's recommendations, take a look at tcp_tw_recycle on your web servers. Reference this thread:

viewtopic.php?f=2&t=26800&p=94900#p94900

Vin.

And if that is useful (or not), please do report your findings. It's not a "usual" problem, but when it crops up it causes IT staff to lose hair. So ... join the fun and help everyone keep the hair.

Post by alo » Mon Jan 13, 2014 5:41 pm

Thank you guys so much for the advice.

I am actually fairly confident that your advice, William, will help the inter-server communications, so I am going to work on implementing it as soon as possible.

In the meantime, increasing nf_conntrack_max helped a lot. I also ran sysctl -w net.ipv4.tcp_tw_reuse=1 per Vince's recommendation.
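
Since sysctl -w only changes the running kernel, both settings would be lost at the next reboot unless they are also written to a sysctl configuration file. A rough sketch of doing that without piling up duplicate lines on repeated runs; the helper name set_sysctl_conf is made up, and /etc/sysctl.conf is assumed to be the file the distribution reads at boot.

```shell
#!/usr/bin/env bash
# set_sysctl_conf: write "key = value" into a sysctl-style file, dropping any
# existing assignment of the same key first so reruns stay idempotent.
set_sysctl_conf() {
    local file="$1" key="$2" value="$3" tmp
    tmp=$(mktemp)
    if [ -f "$file" ]; then
        # Keep every line whose name (text before "=", blanks removed)
        # differs from the key; comments and other settings pass through.
        awk -F= -v k="$key" \
            '{ name = $1; gsub(/[ \t]/, "", name); if (name != k) print }' \
            "$file" > "$tmp"
    fi
    echo "$key = $value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example calls, commented out because they need root:
# set_sysctl_conf /etc/sysctl.conf net.netfilter.nf_conntrack_max 196608
# set_sysctl_conf /etc/sysctl.conf net.ipv4.tcp_tw_reuse 1
# sysctl -p
```

One thing to verify after a reboot: the nf_conntrack key may not apply if the module is loaded after the sysctl file is processed, so it is worth re-checking the live value with sysctl net.netfilter.nf_conntrack_max.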

I was also thinking the "dead calls" were caused by the vicidial_agent_log table locks I was seeing, so I ran /usr/share/astguiclient/ADMIN_archive_log_tables.pl --months=1
and that did help quite a bit (with the locks, that is; the system is not displaying fewer dead calls).

I also see vicidial_list getting a lot of table locks with 4 million leads. I think tuning the my.cnf file could help with this.

I installed vicibox5.0.3.iso on a virtual machine but was unable to find the my-HUGE.cnf. I did find my-bigvici.cnf in the /usr/src/astguiclient/conf directory.

Does anyone know the settings in the huge file, or have some awesome Vici-tuned my.cnf settings for a 16-core, 16 GB, RAID 10 SSD server that they care to share? :)

Post by williamconley » Mon Jan 13, 2014 7:28 pm

It may be very helpful for "the next guy" if you defined this a bit:
alo wrote: Increasing nf_conntrack_max helped a lot

As far as big vs huge, I may have simply misquoted the big file's name. But if you diff that against the normal one, you may find that the differences will give you a hint of changes you could make to take it farther.

also: 32G ram may not be a bad idea ...

Post by alo » Mon Jan 13, 2014 10:56 pm

Ah yes, I am sorry. I was clearer in the earlier post:

I was able to catch this in the logs: "nf_conntrack: table full, dropping packet."

I raised nf_conntrack_max with:
/sbin/sysctl -w net.netfilter.nf_conntrack_max=196608


and it seemed to help a LOT.
In case anyone else has similar issues. It appears you may also need to raise the hash table size too...


While running htop I notice that Mem is usually at 2192/16032 and nothing is ever in swap, so I assumed I was nowhere near using too much RAM, unless I am not reading htop correctly. top, however, shows:
KiB Mem: 16417296 total, 16184440 used, 232856 free, 135808 buffers.
This may not be a Vicidial question, but am I reading htop backwards? Am I actually running quite low on memory instead of having more than enough, as I previously thought?
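
For what it's worth, the two tools are most likely counting differently rather than disagreeing: the "used" figure on that top line includes buffers and the page cache, while the htop number leaves them out. Cache is given back when programs need the memory, so a large cache is not the same as running low. The sketch below separates the two from /proc/meminfo; the function name and output format are invented for illustration.

```shell
#!/usr/bin/env bash
# mem_summary: split memory into what programs hold versus what the kernel is
# using as reclaimable buffers/cache. Reads a /proc/meminfo-format file
# (values in kB); defaults to the live /proc/meminfo.
mem_summary() {
    awk '
        $1 == "MemTotal:" { total   = $2 }
        $1 == "MemFree:"  { free    = $2 }
        $1 == "Buffers:"  { buffers = $2 }
        $1 == "Cached:"   { cached  = $2 }
        END {
            apps = total - free - buffers - cached
            printf "apps_kb=%d cache_kb=%d free_kb=%d\n", apps, buffers + cached, free
        }
    ' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    mem_summary
fi
```

On a MyISAM database server a big cache figure is expected, since the data files are cached by the operating system while key_buffer_size only covers indexes.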

Post by williamconley » Tue Jan 14, 2014 12:28 pm

Actually, I was referring to "it helped a lot" being a bit vague. What exactly did you see? Before, you crashed once every hour, and after, you crashed every six hours? Or XX% fewer logged errors ...? Or ...?

Post by Vince-0 » Tue Jan 14, 2014 2:45 pm

alo,
Have you run a mysqltuner script and checked its suggested configuration? Mine says my RAM allocation is way over the limit, but I never see usage actually get that high.

A rough comparison with my MySQL server tuning variables:
max_heap_table_size: mine is way bigger. You can check the size of the tables in memory somehow.
query_cache_size: the same, at 0. I found that increasing it can eventually lead to dead call stats on the real-time screen.
thread_concurrency: I remember this being twice the number of cores you have, so 2*16=32.

My Apache server tuning variables are all about 10 times yours above. This example system may cater for 200 agents logged in and working in practice.

Looks like you may have more than one cause for the problems.

Post by alo » Fri Jan 17, 2014 7:52 pm

Hey Guys.

Vince-0, thanks for the advice. I will take a look at that this weekend.

It looks like the dead calls pop up when the vicidial_agent_log table is locked (on our system, anyway).

The table gets locked and queries back up after the cron job for AST_cleanup_agent_log.pl runs.
I have been running the ADMIN_archive_log_tables.pl --months=1 script every night to keep that table down but the table still locks up.

I have disabled this cron for now.
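
To put a number on whether a change like this actually reduces lock contention, the server's own counters can be compared before and after. A minimal sketch using the Table_locks_immediate and Table_locks_waited status variables; the function name lock_wait_pct is made up, and the mysql client is again assumed to log in without prompting.

```shell
#!/usr/bin/env bash
# lock_wait_pct: percentage of table-lock requests that had to wait, computed
# from "name<TAB>value" rows as printed by SHOW GLOBAL STATUS.
lock_wait_pct() {
    awk -F'\t' '
        $1 == "Table_locks_immediate" { immediate = $2 }
        $1 == "Table_locks_waited"    { waited    = $2 }
        END {
            total = immediate + waited
            if (total > 0) printf "%.2f\n", waited * 100 / total
            else           print "0.00"
        }
    '
}

# Example call, commented out so the sketch has no side effects:
# mysql -B -N -e "SHOW GLOBAL STATUS LIKE 'Table_locks%'" | lock_wait_pct
```

Both counters are cumulative since the server started, so the useful comparison is between two samples taken over the same length of time with the cron job enabled and disabled.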

More to come...