Summary: ASTERISK-25323: Asterisk: ongoing segfaults uncovered by CHAOS_DEBUG
Reporter: Scott Griepentrog (sgriepentrog)
Labels:
Date Opened: 2015-08-14 08:18:19
Date Closed:
Priority: Minor
Regression?:
Status: Open/New
Components: General
Versions: 13.0.0
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: (0) backtrace-core.11062.txt
             (1) backtrace-core.12412.txt
             (2) backtrace-core.12729.txt
             (3) backtrace-core.29894.txt
             (4) full-log-core.12729.txt
Description: Ongoing use of CHAOS_DEBUG, which randomly simulates failed allocations, is uncovering extremely unlikely scenarios on a test server that can result in a segfault.

Rather than create a new issue for each case, I'm grouping them under this single issue as they are found.
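
For context, the technique boils down to wrapping the allocator so it occasionally returns NULL, forcing the error paths that a real out-of-memory condition would exercise. A minimal sketch of the idea, using made-up names ({{chaos_should_fail()}}, {{chaos_malloc()}}) rather than Asterisk's actual CHAOS_DEBUG hooks:

{noformat}
#include <stdlib.h>

/* Sketch only: fail roughly 1 in 1000 allocations at random so that
 * the error paths a real out-of-memory would hit get exercised.
 * These names are made up; Asterisk's CHAOS_DEBUG hooks differ. */
static int chaos_should_fail(void)
{
	return (rand() % 1000) == 0;
}

static void *chaos_malloc(size_t size)
{
	if (chaos_should_fail()) {
		return NULL; /* simulate an allocation failure */
	}
	return malloc(size);
}
{noformat}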
Comments:

By: Scott Griepentrog (sgriepentrog) 2015-08-17 10:20:42.618-0500

Instance #1: crash in ast_str_hash due to null str

{noformat}
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `asterisk -fvvvvvdddddgn'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fe9dc111f4a in ast_str_hash (str=0x0) at /root/13-c57b78d4c94e592d069521123a24fcb80524a893/include/asterisk/strings.h:1180
1180 while (*str)
#0  0x00007fe9dc111f4a in ast_str_hash (str=0x0) at /root/13-c57b78d4c94e592d069521123a24fcb80524a893/include/asterisk/strings.h:1180
       hash = 5381
#1  0x00007fe9dc111f99 in sorcery_memory_hash (obj=0x0, flags=64) at res_sorcery_memory.c:88
       id = 0x0
#2  0x000000000045ffb6 in hash_ao2_find_first (self=0x1ed5568, flags=65, arg=0x0, state=0x7fe9d270f5e0) at astobj2_hash.c:388
       node = 0x1ed5510
       bucket_cur = 0
       cmp = 7185311
#3  0x000000000045e34f in internal_ao2_traverse (self=0x1ed5568, flags=65, cb_fn=0x7fe9dc111f9b <sorcery_memory_cmp>, arg=0x0, data=0x0, type=AO2_CALLBACK_DEFAULT, tag=0x0, file=0x0, line=0, func=0x0) at astobj2_container.c:341
       ret = 0x0
       cb_default = 0x7fe9dc111f9b <sorcery_memory_cmp>
       cb_withdata = 0x0
       node = 0x7fe9d270f7b0
       traversal_state = 0x7fe9d270f5e0
       orig_lock = AO2_LOCK_REQ_MUTEX
       multi_container = 0x0
       multi_iterator = 0x0
       __PRETTY_FUNCTION__ = "internal_ao2_traverse"
#4  0x000000000045e6bb in __ao2_callback (c=0x1ed5568, flags=65, cb_fn=0x7fe9dc111f9b <sorcery_memory_cmp>, arg=0x0) at astobj2_container.c:452
No locals.
#5  0x000000000045e86e in __ao2_find (c=0x1ed5568, arg=0x0, flags=65) at astobj2_container.c:493
       arged = 0x0
       __PRETTY_FUNCTION__ = "__ao2_find"
#6  0x00007fe9dc11240a in sorcery_memory_update (sorcery=0x1e77d28, data=0x1ed5568, object=0x7fe9f4003ca0) at res_sorcery_memory.c:193
       existing = 0x0
       __PRETTY_FUNCTION__ = "sorcery_memory_update"
#7  0x00000000005c16e7 in sorcery_wizard_update (obj=0x1ed4ae8, arg=0x7fe9d270f880, flags=0) at sorcery.c:2045
       object_wizard = 0x1ed4ae8
       details = 0x7fe9d270f880
       __PRETTY_FUNCTION__ = "sorcery_wizard_update"
#8  0x00000000005c1829 in ast_sorcery_update (sorcery=0x1e77d28, object=0x7fe9f4003ca0) at sorcery.c:2068
       details = 0x7fe9f4003ca0
       object_type = 0x1ed4b68
       object_wizard = 0x0
       found_wizard = 0x1ed4ae8
       i = 0
       sdetails = {sorcery = 0x1e77d28, obj = 0x7fe9f4003ca0}
       __PRETTY_FUNCTION__ = "ast_sorcery_update"
#9  0x00007fe9d32cd802 in update_contact_status (contact=0x1fae9b0, value=value@entry=AVAILABLE) at res_pjsip/pjsip_options.c:162
       status = 0x7fe9f4001dd0
       update = 0x7fe9f4003ca0
       __PRETTY_FUNCTION__ = "update_contact_status"
#10 0x00007fe9d32cdd62 in qualify_contact_cb (token=0x1fae9b0, e=<optimized out>) at res_pjsip/pjsip_options.c:303
       contact = 0x1fae9b0
       __PRETTY_FUNCTION__ = "qualify_contact_cb"
#11 0x00007fe9d32c8e00 in send_request_cb (token=0x7fe9f4000c90, e=0x7fe9d270f9c0) at res_pjsip.c:3224
       req_data = 0x7fe9f4000c90
       tsx = 0x1c24e78
       challenge = 0x7fe9e0051828
       tdata = 0x7fe9f4000cc0
       supplement = <optimized out>
       endpoint = <optimized out>
       res = <optimized out>
       __PRETTY_FUNCTION__ = "send_request_cb"
#12 0x00007fe9d32c8657 in endpt_send_request_cb (token=0x7fe9f4000ce0, e=0x7fe9d270f9c0) at res_pjsip.c:3005
       req_wrapper = 0x7fe9f4000ce0
       __PRETTY_FUNCTION__ = "endpt_send_request_cb"
#13 0x00007fe9df28697a in tsx_set_state () from /lib64/libpjsip.so.2
No symbol table info available.
#14 0x00007fe9df2885f6 in tsx_on_state_proceeding_uac () from /lib64/libpjsip.so.2
No symbol table info available.
#15 0x00007fe9df28883d in tsx_on_state_calling () from /lib64/libpjsip.so.2
No symbol table info available.
#16 0x00007fe9df289d0f in pjsip_tsx_recv_msg () from /lib64/libpjsip.so.2
No symbol table info available.
#17 0x00007fe9df289db5 in mod_tsx_layer_on_rx_response () from /lib64/libpjsip.so.2
No symbol table info available.
#18 0x00007fe9df27445f in pjsip_endpt_process_rx_data () from /lib64/libpjsip.so.2
No symbol table info available.
#19 0x00007fe9d32d27b9 in distribute (data=0x7fe9e0051828) at res_pjsip/pjsip_distributor.c:439
       param = {start_prio = 0, start_mod = 0x7fe9d34f0760 <distributor_mod>, idx_after_start = 1, silent = 0}
       handled = 0
       rdata = 0x7fe9e0051828
       is_request = 0
       is_ack = 0
       endpoint = <optimized out>
#20 0x00000000005e0e92 in ast_taskprocessor_execute (tps=0x1e762b8) at taskprocessor.c:768
       local = {local_data = 0x7fe9d27109c0, data = 0x5f6c79 <ast_threadstorage_set_ptr+60>}
       t = 0x7fe9e04f25d0
       size = 0
       __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#21 0x00000000005ecccd in execute_tasks (data=0x1e762b8) at threadpool.c:1269
       tps = 0x1e762b8
#22 0x00000000005e0e92 in ast_taskprocessor_execute (tps=0x1e765b8) at taskprocessor.c:768
       local = {local_data = 0x55cc0c76, data = 0x1e60a80}
       t = 0x7fe9e04b7e20
       size = 0
       __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#23 0x00000000005eb144 in threadpool_execute (pool=0x1e60ad8) at threadpool.c:351
       __PRETTY_FUNCTION__ = "threadpool_execute"
#24 0x00000000005ec662 in worker_active (worker=0x7fe9e8006bc8) at threadpool.c:1075
       alive = 0
#25 0x00000000005ec41f in worker_start (arg=0x7fe9e8006bc8) at threadpool.c:995
       worker = 0x7fe9e8006bc8
       __PRETTY_FUNCTION__ = "worker_start"
#26 0x00000000005f855b in dummy_start (data=0x7fe9e801c030) at utils.c:1237
       __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, 7449420058881167766, 0, 140642234730944, 140642234730240, 20, 7449420058906333590, -7443873705584488042}, __mask_was_saved = 0}}, __pad = {0x7fe9d270fdf0, 0x0, 0x0, 0x7fe9ff6162e8 <__pthread_keys+8>}}
       __cancel_routine = 0x4510a7 <ast_unregister_thread>
       __cancel_arg = 0x7fe9d2710700
       __not_first_call = 0
       ret = 0x7fe9fe9a7860 <internal_trans_names.8316>
       a = {start_routine = 0x5ec398 <worker_start>, data = 0x7fe9e8006bc8, name = 0x7fe9e8002fc0 "worker_start         started at [ 1049] threadpool.c worker_thread_start()"}
#27 0x00007fe9ff406df5 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#28 0x00007fe9fe6e71ad in clone () from /lib64/libc.so.6
{noformat}

By: Richard Mudgett (rmudgett) 2015-08-17 10:39:51.825-0500

[~sgriepentrog] The crash is caused because {{ast_sorcery_alloc()}} does not check for failure the return value of the {{ast_strdup()}} that assigns the sorcery id member string, so on allocation failure the id is left NULL.
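
For illustration, the shape of the missing check, as a generic sketch (the object type and field names here are assumptions, not the real sorcery internals):

{noformat}
#include <stdlib.h>
#include <string.h>

/* Hypothetical shape of the fix; type and field names are assumed.
 * The point is simply that the strdup assigning the id must be
 * checked, or a later hash of the NULL id will segfault. */
struct sorcery_object {
	char *id;
};

static struct sorcery_object *object_alloc(const char *id)
{
	struct sorcery_object *obj = calloc(1, sizeof(*obj));

	if (!obj) {
		return NULL;
	}
	obj->id = strdup(id);
	if (!obj->id) {
		free(obj);   /* drop the partially built object... */
		return NULL; /* ...and propagate the allocation failure */
	}
	return obj;
}
{noformat}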

By: Scott Griepentrog (sgriepentrog) 2015-08-18 17:26:17.805-0500

Instance #2: assert from base_process_dial_end

[^backtrace-core.12412.txt]

This may be considered less a fixable bug than an intended crash, as it is an assert that is not normally triggered unless DO_CRASH is enabled.
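
As a rough sketch of how such a compile-time-gated assert works (a simplified stand-in named {{my_assert}}, not Asterisk's actual {{ast_assert()}}):

{noformat}
#include <stdio.h>
#include <stdlib.h>

/* Simplified sketch of a DO_CRASH-gated assert. */
#ifdef DO_CRASH
#define my_assert(cond) \
	do { \
		if (!(cond)) { \
			fprintf(stderr, "assertion '%s' failed at %s:%d\n", \
				#cond, __FILE__, __LINE__); \
			abort(); /* deliberately crash so a core is captured */ \
		} \
	} while (0)
#else
#define my_assert(cond) /* compiled out: a no-op in normal builds */
#endif
{noformat}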

By: Scott Griepentrog (sgriepentrog) 2015-08-18 17:50:46.391-0500

Instance #3: crash on null contact_hdr uri

[^backtrace-core.29894.txt]

By: Scott Griepentrog (sgriepentrog) 2015-08-26 14:15:52.596-0500

Instance #4: reference to pvt when channel is NULL

[^backtrace-core.12729.txt] [^full-log-core.12729.txt]

By: Scott Griepentrog (sgriepentrog) 2015-08-26 15:25:18.138-0500

Instance #5: strndup alloc failed in get_media_encryption_type

[^backtrace-core.11062.txt]
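
For reference, the defensive pattern involved is simply checking the copy before use; a minimal sketch with assumed names, not the actual {{get_media_encryption_type()}} code:

{noformat}
#include <stdlib.h>
#include <string.h>

/* Sketch only: strndup() returns NULL under memory pressure, so the
 * copy must be checked before use. Names here are assumptions. */
static int copy_attribute(const char *attr, size_t len, char **out)
{
	char *copy = strndup(attr, len);

	if (!copy) {
		return -1; /* propagate the failure instead of crashing */
	}
	*out = copy;
	return 0;
}
{noformat}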

By: Scott Griepentrog (sgriepentrog) 2015-09-08 10:46:14.236-0500

I'm including malloc-failure-related crashes on this issue even where they were not triggered by CHAOS_DEBUG but by an honest malloc failure, detected via crashes on a RAM-limited (no swap) test machine under stress.
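
One way to reproduce that kind of honest failure deliberately (an assumption about a possible setup, not necessarily what was used here) is to cap the process address space with {{setrlimit()}} before starting the program under test:

{noformat}
#include <stdio.h>
#include <sys/resource.h>

/* Sketch: cap the address space at 256 MB so mallocs in this process
 * (and anything it execs) start failing once the cap is reached. */
int main(void)
{
	struct rlimit lim = { .rlim_cur = 256 * 1024 * 1024,
			      .rlim_max = 256 * 1024 * 1024 };

	if (setrlimit(RLIMIT_AS, &lim)) {
		perror("setrlimit");
		return 1;
	}
	/* exec the program under test here */
	return 0;
}
{noformat}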