Table Lock Problem

Any and all non-support discussions

Moderators: gerski, enjay, williamconley, Op3r, Staydog, gardo, mflorell, MJCoate, mcargile, Kumba, Michael_N

Table Lock Problem

Postby liigi » Fri Nov 17, 2017 9:18 am

Hello everyone
I come to you to help me diagnose a problem I have in my platform.

My platform is in:
1 db server 2CPUsix core 40gb ram
1 web server
10 tlf server

200 agents

VERSION: 2.12-563a
BUILD: 160801-2119



SERVER DESCRIPTION
srvdbvd01 DataBase Only - DO NOT DELETE 172.16.4.120 AND 0 - 100% 0 1% 2578
 / usr / src / astguiclient / trunk
Path:.
Working Copy Root Path: / usr / src / astguiclient / trunk
URL: svn: //svn.eflo.net/agc_2-X/trunk
Relative URL: ^ / agc_2-X / trunk
Repository Root: svn: //svn.eflo.net
Repository UUID: 3d104415-ff17-0410-8863-d5cf3c621b8a
Revision: 2578
Node Kind: directory
Schedule: normal
Last Changed Author: mattf
Last Changed Rev: 2578
Last Changed Date: 2016-08-09 11:17:57 -0400 (Tue, 09 Aug 2016)



The problem is that during the management the BD hangs, which affects the entire campaign and calls. doing an exhaustive review I realized that when this happens the tables entered a state of table lock and until it does not finish the process that is blocked does not stop executing the rest of instructions .. This instruction that generates the block is usually a report or some update of a list.

According to the recommendations I have enough hardware to handle the amount of agents, but I do not understand why the BD hangs.

Do I need to tune up to the BD?
liigi
 
Posts: 75
Joined: Wed Jul 13, 2016 4:39 pm

Re: Table Lock Problem

Postby mflorell » Fri Nov 17, 2017 3:49 pm

How many records in the vicidial_list table?

What kind of hard drives and RAID card are you using on your database server?

Do you have slow query logging enabled in mysql?
mflorell
Site Admin
 
Posts: 18387
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida

Re: Table Lock Problem

Postby liigi » Fri Nov 17, 2017 4:12 pm

vicidial_list table : 3.500.000 rows

i have 4 disk drive sas 15.000 RPM in RAID 10 . Hewlett-Packard Company Smart HBA H240

and yes a have enable the slow query
liigi
 
Posts: 75
Joined: Wed Jul 13, 2016 4:39 pm

Re: Table Lock Problem

Postby mflorell » Fri Nov 17, 2017 4:38 pm

What are some of the long running locking queries in your slow query log?

How often do you archive your log tables, and how log do you keep your active logs?
mflorell
Site Admin
 
Posts: 18387
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida

Re: Table Lock Problem

Postby liigi » Fri Nov 17, 2017 9:25 pm

These are some output from the slow query log:

srvdbvd01:~ # mysqldumpslow -a -s c -t 5

Reading mysql slow query log from /var/log/mysql/mysqld-slow.log
Count: 147  Time=6.91s (1015s)  Lock=1.59s (233s)  Rows_sent=0.0 (0), Rows_examined=3431420.3 (504418786), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
  SELECT count(*),status, sum(length_in_sec) from vicidial_log where user='' and call_date >= '2017-11-16 0:00:01'  and call_date <= '2017-11-16 23:59:59'  group by status order by status

Count: 144  Time=6.09s (876s)  Lock=1.14s (164s)  Rows_sent=1.0 (144), Rows_examined=2943730.6 (423897200), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
  SELECT count(*),status, sum(length_in_sec) from vicidial_log where user='' and call_date >= '2017-11-14 0:00:01'  and call_date <= '2017-11-14 23:59:59'  group by status order by status

Count: 53  Time=7.39s (391s)  Lock=2.45s (129s)  Rows_sent=0.0 (0), Rows_examined=3703603.8 (196290999), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
  SELECT count(*),status, sum(length_in_sec) from vicidial_log where user='' and call_date >= '2017-11-17 0:00:01'  and call_date <= '2017-11-17 23:59:59'  group by status order by status

Count: 44  Time=7.00s (307s)  Lock=1.78s (78s)  Rows_sent=0.0 (0), Rows_examined=3218343.7 (141607124), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
  SELECT count(*),status, sum(length_in_sec) from vicidial_log where user='' and call_date >= '2017-11-15 0:00:01'  and call_date <= '2017-11-15 23:59:59'  group by status order by status

Count: 40  Time=7.58s (303s)  Lock=4.16s (166s)  Rows_sent=3.0 (121), Rows_examined=3707553.5 (148302142), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
  SELECT count(*),status, sum(length_in_sec) from vicidial_log where user='26470476' and call_date >= '2017-11-17 0:00:01'  and call_date <= '2017-11-17 23:59:59'  group by status order by status


Also when we reset Lead-Called-Status for a list, the vicidial_list table generates slow query creating a high volume of transactions with waiting for lock table avoiding that the dialer can establish calls affecting the whole campaign in the system.



We apply the scripts "ADMIN_archive_log_tables.pl" with the options --daily --carrier-daily applied in this way:

20 22 * * * /usr/share/astguiclient/ADMIN_archive_log_tables.pl --daily --carrier-daily

The tables "vicidial_log" and "vicidial_agent_log" remain active for a month and are archived at  the ends of the month. Normally these two tables during a month reach an accumulate about 5 million records.
liigi
 
Posts: 75
Joined: Wed Jul 13, 2016 4:39 pm

Re: Table Lock Problem

Postby mflorell » Sat Nov 18, 2017 8:32 am

It looks like you are not specifying the amount of the logs to keep:
[--days=XX] = number of days to archive past, default is 732(2 years)

If you want to keep a month's worth, you should specify "--days=31" in your archive crontab entry.

Also, "--daily" does not touch the "vicidial_log" table, which seems to be the one that is causing locks.
mflorell
Site Admin
 
Posts: 18387
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida

Re: Table Lock Problem

Postby liigi » Mon Nov 20, 2017 8:58 am

Hi Matt,

Thank you for your soon response, sorry that previously i did not mention that we do the archiving of the tables "vicidial_log"and "vicidial_agent_log" through other scripts which passes the data to the archive tables on the last day of the month leaving the tables empty for the begining of next month, This is the crontab that executes the task:

### DB Maintenance Vicidial_Log and Vicidial_Agent_Log End of Month
00 23 * * * /root/vicidial_archive_logs_month.php
55 23 * * * [ $(date +\%d) -eq $(echo $(cal) | awk '{print $NF}') ] && /root/bd_mantenimiento.sh

This is the actual size of the

Table Name Rows Count Table Size (MB) Archive
vicidial_log 3834540 476.10 Monthly
vicidial_list 2642990 622.48
vicidial_agent_log 1020541 156.95 Monthly
call_log 12493 3.73 Daily
vicidial_dial_log 7213 1.22 Daily
vicidial_carrier_log 7130 1.15 Daily


We started to have slow query caused by the tables "vicidal_log" and "vicidial_list" since we grew from 130 agents to 200 agents, but we have followed the hardware recommendations in the database server for 200 agents. I have the following questions:

1.- What is the recommended size of the "vicidial_list" and "vicidial_log" tables for 200 agents in outgoing campaigns with a 3: 1 ratio calls

2.- If I have slow query in those tables, Do I need to increase more memory and cpu? What is the recommended size? What could be the solution in order to reduce this issue.

Thanks a lot fro your time.

Regards
liigi
 
Posts: 75
Joined: Wed Jul 13, 2016 4:39 pm

Re: Table Lock Problem

Postby mflorell » Mon Nov 20, 2017 9:41 am

You didn't show any slow queries logged for vicidial_list, could you please show those?

More memory and CPU can always help on the DB side.
mflorell
Site Admin
 
Posts: 18387
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida

Re: Table Lock Problem

Postby liigi » Mon Nov 20, 2017 1:51 pm

Hi Matt,

These are some of the slow query from vicidial_list:

Count: 2 Time=411.26s (822s) Lock=0.08s (0s) Rows_sent=0.0 (0), Rows_examined=142063199.5 (284126399), Rows_affected=0.0 (0), cron[cron]@[172.16.18.29]
delete from custom_9638 where lead_id in (select max(lead_id) from vicidial_list where list_id=N and date_format(entry_date, )=date_format(N, )
and status = group by phone_number having count(phone_number)>N)

Count: 5 Time=18.79s (93s) Lock=535.61s (2678s) Rows_sent=3132249.4 (15661247), Rows_examined=3132249.4 (15661247), Rows_affected=0.0 (0), cron[cron]@[172.16.4.53]
select * from asterisk.vicidial_list

Count: 6 Time=16.64s (99s) Lock=0.00s (0s) Rows_sent=3087730.3 (18526382), Rows_examined=3087730.3 (18526382), Rows_affected=0.0 (0), root[root]@localhost
SELECT /*!N SQL_NO_CACHE */ * FROM `vicidial_list`

Count: 8 Time=15.28s (122s) Lock=2.05s (16s) Rows_sent=4389.0 (35112), Rows_examined=3380350.4 (27042803), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
SELECT vl.call_date,vl.phone_number,vi.status,vl.user,vu.full_name,vl.campaign_id,vi.vendor_lead_code,vi.source_id,vi.list_id,vi.gmt_offset_now,vi.phone_code,vi.phone_number,vi.title,vi.first_name,vi.middle_initial,vi.last_name,vi.address1,vi.address2,vi.address3,vi.city,vi.state,vi.province,vi.postal_code,vi.country_code,vi.gender,vi.date_of_birth,vi.alt_phone,vi.email,vi.security_phrase,vi.comments,vl.length_in_sec,vl.user_group,vl.alt_dial,vi.rank,vi.owner,vi.lead_id,vl.uniqueid,vi.entry_list_id,UNIX_TIMESTAMP(vl.call_date) from vicidial_users vu,vicidial_log vl,vicidial_list vi where vl.call_date >= and vl.call_date <= and vu.user=vl.user and vi.lead_id=vl.lead_id and vl.campaign_id IN( ) order by vl.call_date desc limit N

Count: 9 Time=8.96s (80s) Lock=3.02s (27s) Rows_sent=1.0 (9), Rows_examined=2266743.7 (20400693), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
SELECT count(*) FROM vicidial_list where called_since_last_reset= and status IN( , , , , , , ) and list_id

Count: 12 Time=8.55s (102s) Lock=0.00s (0s) Rows_sent=1410.8 (16930), Rows_examined=3089141.2 (37069694), Rows_affected=0.0 (0), cron[cron]@localhost
select distinct phone_code from vicidial_list

Count: 9 Time=8.34s (75s) Lock=0.66s (5s) Rows_sent=3984.6 (35861), Rows_examined=3362708.3 (30264375), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
SELECT vl.call_date,vl.phone_number,vl.status,vl.user,vu.full_name,vl.campaign_id,vi.vendor_lead_code,vi.source_id,vi.list_id,vi.gmt_offset_now,vi.phone_code,vi.phone_number,vi.title,vi.first_name,vi.middle_initial,vi.last_name,vi.address1,vi.address2,vi.address3,vi.city,vi.state,vi.province,vi.postal_code,vi.country_code,vi.gender,vi.date_of_birth,vi.alt_phone,vi.email,vi.security_phrase,vi.comments,vl.length_in_sec,vl.user_group,vl.alt_dial,vi.rank,vi.owner,vi.lead_id,vl.uniqueid,vi.entry_list_id from vicidial_users vu,vicidial_log vl,vicidial_list vi where vl.call_date >= and vl.call_date <= and vu.user=vl.user and vi.lead_id=vl.lead_id and vl.campaign_id IN( ) order by vl.call_date limit N

Count: 506 Time=8.32s (4209s) Lock=0.05s (24s) Rows_sent=0.0 (0), Rows_examined=3383894.0 (1712250364), Rows_affected=0.0 (0), cron[cron]@localhost
SELECT lead_id,list_id,gmt_offset_now,phone_number,state,status,modify_date,user,vendor_lead_code FROM vicidial_list where ( ( (called_since_last_reset IN( , , , , )) and (status= ) and (( (gmt_offset_now= ) and (last_local_call_time < "S") ) or ( (gmt_offset_now= ) and (last_local_call_time < "S") ) ) ) ) and ((list_id IN( ))) and lead_id NOT IN( , , , , , , , ) and ((gmt_offset_now IN( , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ) ) ) order by lead_id asc limit N

Count: 4 Time=7.71s (30s) Lock=1.68s (6s) Rows_sent=1.0 (4), Rows_examined=2177959.0 (8711836), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
SELECT count(*) FROM vicidial_list where called_since_last_reset= and status IN( , , , , , , ) and list_id IN( , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ) and (list_id= or ((list_id= and gmt_offset_now NOT and (called_count < N) and list_id NOT IN( )

Count: 7 Time=7.69s (53s) Lock=1.11s (7s) Rows_sent=1.0 (7), Rows_examined=2267768.3 (15874378), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
SELECT count(*) FROM vicidial_list where called_since_last_reset= and status IN( , , , , , , ) and list_id I

Count: 1 Time=7.15s (7s) Lock=0.00s (0s) Rows_sent=1.0 (1), Rows_examined=2216237.0 (2216237), Rows_affected=0.0 (0), cron[cron]@[172.16.4.129]
SELECT count(*) FROM vicidial_list where called_since_last_reset= and status IN( , , , , , , ) and list_id IN( , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ) ) ) and (called_count < N) and list_id NOT IN( )

Count: 563 Time=7.15s (4025s) Lock=0.38s (216s) Rows_sent=0.0 (2), Rows_examined=2894039.5 (1629344266), Rows_affected=0.0 (0), cron[cron]@localhost
SELECT lead_id,list_id,gmt_offset_now,phone_number,state,status,modify_date,user,vendor_lead_code FROM vicidial_list where ( ( (called_since_last_reset IN( , , , , )) and (status= ) and (



Please help know what are the recomended size from vicidial_list and vicidial_log in order to avoid slow query, the next month we are going to grow from 200 to 300 agents.

Regards
liigi
 
Posts: 75
Joined: Wed Jul 13, 2016 4:39 pm

Re: Table Lock Problem

Postby mflorell » Mon Nov 20, 2017 3:02 pm

It's not as much the size of vicidial_list(although you should keep it below 10 million, or even better below 6 million), it's the problem of having too many leads in the active lists for a campaign. You should try to keep no more than 400,000 leads in the active lists within a campaign if you want it to run smoothly.
mflorell
Site Admin
 
Posts: 18387
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida

Re: Table Lock Problem

Postby liigi » Tue Nov 21, 2017 9:54 am

Thanks Matt,

I will follow your recomendation, i actually have more than 400.000 active leads in a campaign, but i work with list mix where i have 15 active list asigned to an outbound campaign but only 5 of those list are on list mix within the campaign, my question is: if in the campaign are taken into account only the list within the list mix like active leads or also all the other lists assigned to the campaign while being active?

Thanks
Regards
liigi
 
Posts: 75
Joined: Wed Jul 13, 2016 4:39 pm

Re: Table Lock Problem

Postby mflorell » Tue Nov 21, 2017 10:19 am

The lists that are set to active=Y are all you really need to worry about as far as the recommended 400,000 limit. As for List Mix, we almost never recommend anyone use it because it is hard to manage and usually results in no improvement in dialing efficiency, and yes it does put additional load on your database by using it.
mflorell
Site Admin
 
Posts: 18387
Joined: Wed Jun 07, 2006 2:45 pm
Location: Florida


Return to General Discussion

Who is online

Users browsing this forum: No registered users and 54 guests