DB Server upgrade in cluster failed
Posted: Fri Feb 14, 2020 2:18 pm
A few days ago, I replaced my DB server with a bigger box: I went from a 24-core processor / 64GB RAM / 500GB drive to 40 cores / 128GB / 12TB. My existing servers were on an older SVN / schema, so I kept those versions, at least initially. I planned on updating the SVNs to the latest version, or close to it, once the new DB server proved reliable.
Due to an issue with the CMOS / CD-ROM driver, I couldn't install from the v7 ISO on CD on the new box. I may have been able to use a bootable USB copy, but instead I used the v8 ISO, which didn't have the issue. To keep the versions consistent across the cluster, I installed the same SVN (2973) and DB schema (1542). I moved the asterisk database to the new box, changed the old box's IP, changed the new box's IP to the old DB's IP (so no IP update should have been required), and started it up. All servers registered with each other and test calls worked, but when we went live, it was a mess: the asterisk service crashing, MySQL extremely slow on queries that took a couple of seconds on the previous server, and more.
So, last night, we moved the database back to the original box, flip-flopped the IP addresses between the old and new DB servers again, and rebooted. All looked good, but again we had major issues: when an agent logs in, their phone rings (IAX or SIP), but when they answer, they do not get the "Only member" message and the call drops almost immediately. The solution seemed to be to re-point any agent phones I wanted to work to one of the other servers.
All of these issues make me think that, even though I used the same SVN/schema, maybe there were changes in the MySQL schema / values purely based on the ISO version I installed from? Or CONF file changes that were introduced by v8?
Here is some other info about our system:
VERSION: 2.14-670a
BUILD: 180424-1521
I have been up for about 60 hours at this point, so - if I left anything out, or if it doesn't make any sense, please ask me and I will do my best to re-word/elaborate/etc as needed to include the information necessary to help me...
I have just about everyone working again, with all extensions registered to a single server, but if anyone could point me in the direction of what I should be researching, I really would appreciate it.
David