hansg wrote:... I am still getting frequent crashes. No idea what is causing them.
... which is crashing the database
It is a Cloud Server.
"Cloud server" doesn't answer the question: is it virtual or physical? Both can reside in a hosting facility such as Amazon or elsewhere. Vicidial does not run virtualized. Google it and enjoy the battle, but be prepared for a serious time investment and likely failure. lol.
... Still crashing frequently.
kjburto wrote: I am having the same issue with one of my servers.
If an application is "crashing", then it would either be "stuck" or no longer running and in most cases it would leave an error trail of some sort.
I'm not seeing any error logs, or any information about processes that are no longer running or that are running but no longer functioning (both are common in this situation). In other words: no troubleshooting, just guesses.
The original poster showed a red server in the "Reports" main page, but kjburto did not show or mention such an entry.
Normally when the server "goes red" it is because one of the "screen -ls" entries is missing: one of the processes running in one of those screens has exited (abnormally) and the screen auto-closed. In some cases the script running in the screen simply stops working without exiting. Either way, the services that script provides are necessary, and there's a "watchdog" field that is no longer being updated, which turns the server red.
So the first thing we need to know is: Are all your processes running? If you're not sure, try this:
- Code: Select all
screen -ls
If you do not know which ones *should* be running, reboot and run that command again when everything is operating normally. The scripts are run based on the server's /etc/astguiclient.conf setting for "VARactive_keepalives".
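If it helps, the baseline comparison described above can be sketched roughly like this: save the screen names while everything is healthy, then list whichever ones have gone missing later. The function names and file paths here (screen_names, missing_screens, /root/screens.baseline, /tmp/screens.now) are made up for illustration; only "screen -ls" itself comes from the steps above.

```shell
# Read `screen -ls` output on stdin and print just the session names.
# Session lines start with "<pid>.<name>"; header/footer lines are ignored.
screen_names() {
    awk '$1 ~ /^[0-9]+\./ { sub(/^[0-9]+\./, "", $1); print $1 }' | LC_ALL=C sort -u
}

# Print names that are in the baseline file but not in the current file.
# Both files must be in the order screen_names produces (LC_ALL=C sort).
missing_screens() {
    LC_ALL=C comm -23 "$1" "$2"
}

# Usage (hypothetical paths):
#   screen -ls | screen_names > /root/screens.baseline   # right after a clean reboot
#   screen -ls | screen_names > /tmp/screens.now         # when the server goes red
#   missing_screens /root/screens.baseline /tmp/screens.now
```

Anything printed by the last command is a screen that was there when the server was healthy and is gone now.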
Once we know which script is missing (if that's the case), we can figure out why it's not running by starting it manually with the same command the keepalive script would use. Add one thing to the command: " --debugX" and you may get lucky and find out why it's crashing. You may need to modify the keepalive so it runs the script with the --debugX directive; then leave the screen open in a terminal and wait for the crash (the scrollback history may be useful). Often this information is also in a log for the screen in question (those missing logs I was talking about).
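If you'd rather not sit and watch a terminal, one option is a small wrapper that keeps the output and the exit status of the debug run in a file. This is only a sketch: run_logged, the log path, and the script path in the usage line are hypothetical placeholders, so substitute the exact command your keepalive uses.

```shell
# Run a command, append its stdout and stderr to a log file, and record
# the exit status when it stops, so a crash leaves a trail behind.
run_logged() {
    log_file="$1"
    shift
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') starting: $*" >> "$log_file"
    "$@" >> "$log_file" 2>&1
    status=$?
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') exited with status $status" >> "$log_file"
    return "$status"
}

# Usage (placeholder paths, not real Vicidial paths):
#   run_logged /tmp/missing_script_debug.log /path/to/the_missing_script --debugX
```

When the script dies, the last few lines before the "exited with status" marker are the place to start reading.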
If the script that is missing is the "asterisk" script, and asterisk is crashing, try using the /etc/asterisk/modules.conf file to set "noload" directives for any asterisk modules you are not using (as a hip shot) or check the asterisk logs to find out if the first error is in any way related to a module.
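For the log check, something like the following pulls out the first error line so you can see whether it names a module. It is a sketch that assumes the log is a plain text file and that error lines contain "ERROR[" as in Asterisk's usual log format; the function name first_error and the log path in the usage line are assumptions, since the real location depends on your logger.conf.

```shell
# Print the first line containing "ERROR[" from the given log file,
# prefixed with its line number, then stop reading.
first_error() {
    awk '/ERROR\[/ { print NR ": " $0; exit }' "$1"
}

# Usage (assumed log location, adjust to your logger.conf):
#   first_error /var/log/asterisk/messages
```

If the source file named on that line belongs to a module you don't use, that module is a good first candidate for a noload entry.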
An older server's noload list:
- Code: Select all
noload => pbx_gtkconsole.so
;load => pbx_gtkconsole.so
;
load => res_musiconhold.so
;
; Load either OSS or ALSA, not both
; By default, load OSS only (automatically) and do not load ALSA
;
noload => chan_alsa.so
;noload => chan_oss.so
noload => chan_woomera.so
noload => chan_capi.so
noload => cdr_csv.so
A newer server's noload list:
- Code: Select all
; Load one of: chan_oss, alsa, or console (portaudio).
; By default, load chan_oss only (automatically).
;
noload => chan_alsa.so
noload => chan_oss.so
noload => chan_console.so
;
noload => cdr_csv.so
noload => chan_ooh323.so
noload => chan_woomera.so
noload => chan_capi.so
noload => res_config_sqlite.so
noload => app_cdr.so
noload => cdr_manager.so
noload => cdr_sqlite3_custom.so
noload => func_cdr.so
noload => app_forkcdr.so
noload => cdr_custom.so
noload => cdr_sqlite.so
noload => cdr_syslog.so
noload => res_ael_share.so
noload => pbx_lua.so
noload => res_speech.so
noload => res_jabber.so
noload => res_fax.so
noload => res_smdi.so
noload => pbx_ael.so
noload => app_ices.so
noload => app_festival.so
noload => pbx_realtime.so
noload => func_realtime.so
noload => chan_skinny.so
noload => format_jpeg.so
noload => format_vox.so
noload => app_sms.so
noload => app_talkdetect.so
noload => chan_agent.so
noload => app_zapateller.so
noload => app_nbscat.so
noload => app_queue.so
noload => cel_sqlite3_custom.so
noload => app_disa.so
noload => chan_gtalk.so
noload => app_image.so
noload => app_dictate.so
noload => app_url.so
noload => res_phoneprov.so
noload => func_pitchshift.so
noload => func_blacklist.so
noload => app_page.so
noload => res_http_post.so
noload => app_directory.so
noload => app_test.so
noload => app_flash.so
noload => chan_unistim.so
noload => app_sendtext.so
noload => app_minivm.so
noload => chan_jingle.so
You can also set asterisk to perform a core dump at the crash moment and read the core to find the error, but that's a wee bit involved. Usually disabling modules that aren't needed will solve an asterisk crash scenario.
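Before editing, it can be handy to see which modules your own modules.conf already disables, so you can compare against the lists above. A rough sketch follows; list_noload is a made-up helper name, and it simply treats any line starting with ";" as a comment.

```shell
# Print the module named on each active "noload" line of a
# modules.conf-style file. Commented-out lines (starting with ";") are skipped.
list_noload() {
    awk '
        /^[ \t]*;/ { next }                      # whole-line comment
        /^[ \t]*noload[ \t]*=>?/ {               # "noload => x.so" or "noload = x.so"
            line = $0
            sub(/;.*/, "", line)                 # drop any trailing comment
            sub(/^[ \t]*noload[ \t]*=>?[ \t]*/, "", line)
            sub(/[ \t]+$/, "", line)             # trim trailing whitespace
            if (line != "") print line
        }
    ' "$1"
}

# Usage:
#   list_noload /etc/asterisk/modules.conf
```

Keep a copy of modules.conf before adding entries, so a bad guess is easy to undo.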