asterisk segfault at 0
Posted: Tue Jul 09, 2019 10:28 am
Perhaps someone can help me:
New installation of ViciBox 8.1 on a Dell R610 server with 6 SSD drives in a RAID 10 configuration.
uname -a output:
Linux pepper 4.4.179-99-default #1 SMP Tue May 14 18:07:16 UTC 2019 (c775d39) x86_64 x86_64 x86_64 GNU/Linux
Once I put a significant load on it (15 agents, predictive 3:1), Asterisk segfaults.
dmesg shows:
[ 3249.645571] ------------[ cut here ]------------
[ 3249.645581] WARNING: CPU: 4 PID: 9773 at ../kernel/sched/core.c:8172 __might_sleep+0x76/0x80()
[ 3249.645585] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff810cb7cb>] prepare_to_wait+0x2b/0x80
[ 3249.645632] Modules linked in: ip_set_hash_net(O) ip_set_hash_ip(O) ip_set(O) nfnetlink dahdi(O) crc_ccitt af_packet iscsi_ibft iscsi_boot_sysfs xt_tcpudp msr iptable_filter ip_tables x_tables intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drbg iTCO_wdt ansi_cprng ipmi_ssif iTCO_vendor_support gpio_ich lpc_ich aesni_intel aes_x86_64 lrw gf128mul igb ptp pps_core pcspkr mfd_core ipmi_si ipmi_devintf wmi fjes joydev bnx2 acpi_cpufreq dcdbas dca glue_helper i7core_edac button ablk_helper ipmi_msghandler edac_core cryptd shpchp processor ext4 crc16 jbd2 mbcache sr_mod cdrom uas hid_generic ata_generic usb_storage usbhid i2c_algo_bit drm_kms_helper(O) syscopyarea sysfillrect uhci_hcd ehci_pci sysimgblt fb_sys_fops ehci_hcd ttm(O) ata_piix
[ 3249.645641] sd_mod usbcore libata drm(O) serio_raw usb_common megaraid_sas sg scsi_mod autofs4
[ 3249.645644] CPU: 4 PID: 9773 Comm: asterisk Tainted: G IO 4.4.179-99-default #1
[ 3249.645645] Hardware name: Dell Inc. PowerEdge R610/0F0XJ6, BIOS 6.4.0 07/23/2013
[ 3249.645647] 0000000000000000 ffffffff81349e57 ffff880e11757cc8 ffffffff81a27b6e
[ 3249.645649] ffffffff810863a1 ffffffffa047a2d0 ffff880e11757d18 0000000000001743
[ 3249.645651] 0000000000000000 0000000000000002 ffffffff8108641c ffffffff81a18070
[ 3249.645651] Call Trace:
[ 3249.645667] [<ffffffff8101b0a9>] dump_trace+0x59/0x350
[ 3249.645671] [<ffffffff8101b49a>] show_stack_log_lvl+0xfa/0x180
[ 3249.645674] [<ffffffff8101c291>] show_stack+0x21/0x40
[ 3249.645680] [<ffffffff81349e57>] dump_stack+0x5c/0x85
[ 3249.645687] [<ffffffff810863a1>] warn_slowpath_common+0x81/0xb0
[ 3249.645691] [<ffffffff8108641c>] warn_slowpath_fmt+0x4c/0x50
[ 3249.645694] [<ffffffff810abfa6>] __might_sleep+0x76/0x80
[ 3249.645702] [<ffffffff811cb5a4>] __might_fault+0x34/0x40
[ 3249.645712] [<ffffffffa047204b>] dahdi_chanandpseudo_ioctl+0x3db/0x1790 [dahdi]
[ 3249.645729] [<ffffffffa047455a>] dahdi_unlocked_ioctl+0x31a/0x14e0 [dahdi]
[ 3249.645735] [<ffffffff8122eef7>] do_vfs_ioctl+0x337/0x5f0
[ 3249.645748] [<ffffffff8122f224>] SyS_ioctl+0x74/0x80
[ 3249.645755] [<ffffffff8164bb25>] entry_SYSCALL_64_fastpath+0x24/0xed
[ 3249.648761] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x24/0xed
[ 3249.648762] Leftover inexact backtrace:
[ 3249.648779] ---[ end trace 4f0560146804eb1e ]---
[ 5165.553988] perf interrupt took too long (2508 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 9074.460721] asterisk[26757]: segfault at 0 ip 000000000052917a sp 00007f7c2072fe90 error 4 in asterisk[400000+2b2000]
The Asterisk segfault happened much later than the kernel warning, but I think the warning points to a CPU issue.
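For reference, the fields in the segfault line above can be decoded mechanically. The following is a minimal sketch, assuming the usual x86 layout of that message (fault address, instruction pointer, hexadecimal page-fault error code, then the mapping as base+size). The names `decode_segfault` and `SEGFAULT_RE` are hypothetical helpers, not part of any tool. The decoded values only describe what the process did at the moment of the fault; they do not by themselves confirm or rule out a hardware problem.

```python
import re

# Hypothetical helper: matches the x86 "segfault at ..." line printed by the kernel.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r" in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# The line from the dmesg output in this post.
LINE = ("[ 9074.460721] asterisk[26757]: segfault at 0 ip 000000000052917a "
        "sp 00007f7c2072fe90 error 4 in asterisk[400000+2b2000]")


def decode_segfault(line):
    """Split a kernel segfault line into its numeric fields."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    err = int(m.group("err"), 16)   # error code is printed in hex
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "process": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),  # address that was accessed
        "ip": ip,                                # faulting instruction
        "object": m.group("obj"),
        "offset": ip - base,                     # offset inside the mapping
        # x86 page-fault error code bits:
        "page_present": bool(err & 0x1),  # 0 = page not present
        "write": bool(err & 0x2),         # 0 = read access
        "user_mode": bool(err & 0x4),     # 1 = fault came from user space
    }


if __name__ == "__main__":
    for key, value in decode_segfault(LINE).items():
        shown = hex(value) if key in ("fault_addr", "ip", "offset") else value
        print(f"{key:>12}: {shown}")
```

Read this way, the line describes a user-mode read from address 0 at ip 0x52917a, which is the same address gdb reports for frame #0 in ast_frame_adjust_volume.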
The last two threads of the Asterisk core dump backtrace show:
[New LWP 2932]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/sbin/asterisk -vvvvvvvvvvvvvvvvvvvvvgcT'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000000000052917a in ast_frame_adjust_volume ()
[Current thread is 1 (Thread 0x7f7c20734700 (LWP 26757))]
#0 0x000000000052917a in ast_frame_adjust_volume ()
No symbol table info available.
#1 0x00007f7c3bbcf666 in conf_run () from /usr/lib64/asterisk/modules/app_meetme.so
No symbol table info available.
#2 0x00007f7c3bbd3363 in conf_exec () from /usr/lib64/asterisk/modules/app_meetme.so
No symbol table info available.
#3 0x000000000058d479 in pbx_exec ()
No symbol table info available.
#4 0x0000000000581e7c in pbx_extension_helper.constprop ()
No symbol table info available.
#5 0x0000000000583e8a in __ast_pbx_run ()
No symbol table info available.
#6 0x0000000000586d1d in ast_pbx_run ()
No symbol table info available.
#7 0x000000000047b8b5 in ast_bridge_run_after_goto ()
No symbol table info available.
#8 0x00000000004734ad in bridge_channel_ind_thread ()
No symbol table info available.
#9 0x00000000005fa47a in dummy_start ()
No symbol table info available.
#10 0x00007f7cec2a0724 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#11 0x00007f7ceb80ae8d in clone () from /lib64/libc.so.6
No symbol table info available.
Thread 188 (Thread 0x7f7c226c3700 (LWP 2932)):
#0 0x00007f7cec2a50ff in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f7c43b54532 in iax2_process_thread () from /usr/lib64/asterisk/modules/chan_iax2.so
#2 0x00000000005fa47a in dummy_start ()
#3 0x00007f7cec2a0724 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f7ceb80ae8d in clone () from /lib64/libc.so.6
I ran a stress test on all cores for 30 minutes with no new dmesg errors.
I stressed the RAM as well, again without new dmesg errors.
Is my hardware bad? If so, which part is it: CPU 4?