Summary: | ASTERISK-24538: SRTP p->tag corruption | ||
Reporter: | Badalian Vyacheslav (slavon) | Labels: | |
Date Opened: | 2014-11-19 09:29:02.000-0600 | Date Closed: | 2015-05-22 10:33:04 |
Priority: | Critical | Regression? | |
Status: | Closed/Complete | Components: | Channels/chan_sip/General |
Versions: | 11.13.1 11.14.1 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) core.m1-asterisk01.tcsbank.ru-2014-11-18T18:46:15+0300.txt ( 1) core.m1-asterisk01.tcsbank.ru-2014-11-18T19:27:42+0300.txt ( 2) core.m1-asterisk01.tcsbank.ru-2014-11-19T13:03:47+0300.txt | |
Description: | Three crashes today, all in the same place.
Backtraces will be added below. | ||
Comments: | By: Badalian Vyacheslav (slavon) 2014-11-19 09:34:33.849-0600
This version includes the patch from ASTERISK-24472 (ws_rewrite.diff). The crash is in the SIP SDP code, not in the WS code that the patch changed, so I believe my patch does not cause a regression in the SDP code. We can't test a vanilla version of Asterisk: without the patch the server crashes after fewer than 50 calls, or 2-5 minutes of work. Please do not discard this issue with the reason "not a vanilla version". Also, this is a production server that does not use WSS and SRTP; it works with plain WS (the server is waiting for fixes for the libsrtp and OpenSSL leaks).

By: Badalian Vyacheslav (slavon) 2014-11-19 09:36:40.642-0600
Before the first crash the server handled 16000+ calls...

By: Matt Jordan (mjordan) 2014-11-19 09:40:59.146-0600
In {{gdb}}, please print out the following:
{noformat}
# frame 3
# print *p
# print p->tag
# print p->local_key64
{noformat}

By: Badalian Vyacheslav (slavon) 2014-11-19 09:42:50.787-0600
{code}
(gdb) frame 3
#3 0x00007ffbd21d7b5a in sdp_crypto_offer (p=0x7ffba44962e0, taglen=80) at sip/sdp_crypto.c:304
304 if (ast_asprintf(&p->a_crypto, "a=crypto:%s AES_CM_128_HMAC_SHA1_%i inline:%s\r\n",
(gdb) print *p
$1 = {a_crypto = 0x0, local_key = "\004d\252\234\233\070p*\230\214X\265\254\344?\255p<jYHy\211\233\063\260\237H)\005", tag = 0xd0840bee6bd56333 <Address 0xd0840bee6bd56333 out of bounds>, local_key64 = "BGSqnJs4cCqYjFi1rOQ/rXA8allIeYmbM7CfSCkF", remote_key = '\000' <repeats 29 times>}
(gdb) print p->tag
$2 = 0xd0840bee6bd56333 <Address 0xd0840bee6bd56333 out of bounds>
(gdb) print p->local_key64
$3 = "BGSqnJs4cCqYjFi1rOQ/rXA8allIeYmbM7CfSCkF"
{code}

By: Badalian Vyacheslav (slavon) 2014-11-19 09:45:23.096-0600
Another core:
{code}
(gdb) frame 3
#3 0x00007f6f86133b5a in sdp_crypto_offer (p=0x7f6f1566d8e0, taglen=80) at sip/sdp_crypto.c:304
304 if (ast_asprintf(&p->a_crypto, "a=crypto:%s AES_CM_128_HMAC_SHA1_%i inline:%s\r\n",
(gdb) print *p
$1 = {a_crypto = 0x0, local_key = "#\203\v\022\016\t\rI#_\251\276\217\201\257\337\306\366\377\363Չy\236\320i=\320,\202", tag = 0xd87 <Address 0xd87 out of bounds>, local_key64 = "I4MLEg4JDUkjX6m+j4Gv38b2//PViXme0Gk90CyC", remote_key = '\000' <repeats 29 times>}
(gdb) print p->tag
$2 = 0xd87 <Address 0xd87 out of bounds>
(gdb) print p->local_key64
$3 = "I4MLEg4JDUkjX6m+j4Gv38b2//PViXme0Gk90CyC"
{code}

By: Andrey Ovchinnikov (Andrey O) 2014-11-20 11:00:26.434-0600
Are there any updates?

By: Badalian Vyacheslav (slavon) 2014-11-21 01:14:42.669-0600
https://github.com/cisco/libsrtp/commit/8ba46ebc3daf228c82e79c907bfba661f092e09e
I think this should fix it.

By: Rusty Newton (rnewton) 2014-11-21 16:10:44.036-0600
[~slavon] What version of libsrtp are you using? If you haven't tested with 1.5.0 can you do that and report back here? Thanks!

By: Andrey Ovchinnikov (Andrey O) 2014-11-22 04:40:30.280-0600
The current version of libsrtp is:
libsrtp.x86_64 1.4.4-10.20101004cvs.el6 @epel
libsrtp-devel.x86_64 1.4.4-10.20101004cvs.el6 @epel

By: Badalian Vyacheslav (slavon) 2014-11-22 06:25:11.939-0600
We are now testing libsrtp 1.5.0. I will report back after testing it.

By: Badalian Vyacheslav (slavon) 2014-11-22 06:30:36.431-0600
100 calls so far... no leaks... but at startup Asterisk shows this valgrind report (repeated 6 times):
{code}
==44910== Conditional jump or move depends on uninitialised value(s)
==44910==    at 0xAF78401: cipher_type_test (cipher.c:150)
==44910==    by 0xAF7F7C8: crypto_kernel_load_cipher_type (crypto_kernel.c:336)
==44910==    by 0xAF7F943: crypto_kernel_init (crypto_kernel.c:169)
==44910==    by 0xAF750F8: srtp_init (srtp.c:1716)
==44910==    by 0xAD6D8E7: res_srtp_init (res_srtp.c:561)
==44910==    by 0xAD6D98D: load_module (res_srtp.c:584)
==44910==    by 0x503240: start_resource (loader.c:861)
==44910==    by 0x503CFF: load_resource_list (loader.c:1063)
==44910==    by 0x504332: load_modules (loader.c:1211)
==44910==    by 0x44C2DC: main (asterisk.c:4337)
==44910== Uninitialised value was created by a heap allocation
==44910==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==44910==    by 0xAF7F9E5: crypto_alloc (alloc.c:102)
==44910==    by 0xAF7C641: aes_cbc_alloc (aes_cbc.c:74)
==44910==    by 0xAF7833F: cipher_type_test (cipher.c:115)
==44910==    by 0xAF7F7C8: crypto_kernel_load_cipher_type (crypto_kernel.c:336)
==44910==    by 0xAF7F943: crypto_kernel_init (crypto_kernel.c:169)
==44910==    by 0xAF750F8: srtp_init (srtp.c:1716)
==44910==    by 0xAD6D8E7: res_srtp_init (res_srtp.c:561)
==44910==    by 0xAD6D98D: load_module (res_srtp.c:584)
==44910==    by 0x503240: start_resource (loader.c:861)
==44910==    by 0x503CFF: load_resource_list (loader.c:1063)
==44910==    by 0x504332: load_modules (loader.c:1211)
==44910==
{code}

By: Badalian Vyacheslav (slavon) 2014-11-24 06:08:38.803-0600
We see that 1.5.0 has no valgrind leaks, only the init bug that I posted before.

By: Andrey Ovchinnikov (Andrey O) 2014-11-26 02:37:43.575-0600
Are there any updates? This bug is very important. The crash occurs 1-7 times every day.
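Both gdb dumps above print the members of the crypto structure in the order `a_crypto`, `local_key`, `tag`, `local_key64`, `remote_key`, which suggests that `tag` sits directly behind the 30-byte `local_key` buffer. The sketch below mirrors that order to show where `tag` would land in memory. The struct name `crypto_layout`, the helper names, and the array sizes (30, 41 and 30, inferred from the lengths of the printed strings) are assumptions for illustration and are not taken from the Asterisk source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumed sizes: the dumps show 30-byte keys and a 40-character base64 string. */
#define KEY_LEN 30
#define KEY64_LEN 41

/* Hypothetical mirror of the field order printed by gdb. */
struct crypto_layout {
	char *a_crypto;
	unsigned char local_key[KEY_LEN];
	char *tag;
	char local_key64[KEY64_LEN];
	unsigned char remote_key[KEY_LEN];
};

/* Number of padding bytes between the end of local_key and the start of tag. */
size_t gap_before_tag(void)
{
	size_t key_end = offsetof(struct crypto_layout, local_key) + KEY_LEN;
	size_t tag_start = offsetof(struct crypto_layout, tag);

	/* Members are laid out in declaration order, so tag cannot start earlier. */
	assert(tag_start >= key_end);
	return tag_start - key_end;
}

/* Print the offsets so the adjacency is easy to see. */
void print_layout(void)
{
	printf("local_key: offset %zu, %d bytes\n",
	       offsetof(struct crypto_layout, local_key), KEY_LEN);
	printf("tag:       offset %zu, %zu bytes\n",
	       offsetof(struct crypto_layout, tag), sizeof(char *));
	printf("padding between them: %zu bytes\n", gap_before_tag());
}
```

Calling `print_layout()` from a `main` on a typical 64-bit build should report only a few bytes of alignment padding between the two members, so under these assumptions a write that runs slightly past the end of `local_key` would land in `tag`.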
By: Badalian Vyacheslav (slavon) 2014-11-27 09:08:58.376-0600
Tested with 1.5.0:
$1 = {a_crypto = 0x0, local_key = "\325N\343\354\226+\327\016\360e\032\225\353\221\026\266\066j\245\220\246\302\344xpܤ)z}", tag = 0x88accb537a3804f <Address 0x88accb537a3804f out of bounds>, local_key64 = "1U7j7JYr1w7wZRqV65EWtjZqpZCmwuR4cNykKXp9", remote_key = '\000' <repeats 29 times>}
Same error...

By: Badalian Vyacheslav (slavon) 2014-11-28 20:25:12.389-0600
Hm... this is in your part of the code... the tag is not generated by libsrtp....

By: Badalian Vyacheslav (slavon) 2014-11-28 20:42:43.378-0600
This is because you allocate in sdp_crypto_setup but do not initialize {{p->tag}}. You must set:
{code}
p->tag=NULL;
{code}

By: Badalian Vyacheslav (slavon) 2014-11-28 20:53:16.703-0600
If {{p->tag != NULL}}, the following code does not run:
{code}
if (!p->tag) {
	ast_debug(1, "Accepting crypto tag %s\n", tag);
	p->tag = ast_strdup(tag);
	if (!p->tag) {
		ast_log(LOG_ERROR, "Could not allocate memory for tag\n");
		return -1;
	}
}
{code}
This can happen if the memory allocated for {{p}} reuses an existing region of memory, so {{p->tag}} may hold a non-NULL value.

By: Badalian Vyacheslav (slavon) 2014-11-28 20:59:28.484-0600
Patch attached.

By: Badalian Vyacheslav (slavon) 2014-12-01 06:25:26.036-0600
Tested with 20 000 calls. The bug is no longer present in the valgrind output.

By: Joshua C. Colp (jcolp) 2014-12-01 06:28:59.129-0600
Your fix is likely just covering something else up. The ast_sdp_crypto structure is allocated using ast_calloc. This zeros the allocated memory, which sets p->tag to NULL.

By: Badalian Vyacheslav (slavon) 2014-12-01 06:37:15.885-0600
Hmmm... but there are no more crashes or valgrind errors. Maybe local_key overflows into tag before the point in the code where it is set to NULL?

By: Joshua C. Colp (jcolp) 2014-12-01 06:40:50.396-0600
If so that would be the real issue.

By: Badalian Vyacheslav (slavon) 2014-12-02 20:16:13.670-0600
I added more debug info after every line that may change the {{p}} structure.
I will try to find what is causing the noise.

By: Andrey Ovchinnikov (Andrey O) 2014-12-03 09:10:50.069-0600
Do you have any idea what could be the reason?

By: Andrey Ovchinnikov (Andrey O) 2014-12-08 08:53:17.502-0600
What kind of information do you need from us?

By: Matt Jordan (mjordan) 2015-02-25 16:00:31.533-0600
An alternative way of fixing this has just been merged into 11/13 in r432239/r432258. In that particular patch, the {{tag}} field was changed to be an integer. This may prevent this crash. If you can, please test with the latest from the appropriate branch.
At the same time, if something is writing past the length of the {{local_key}} buffer - which would be a problem in {{libsrtp}} - then I'd expect some strange behaviour regardless. We would be less likely to crash is all.

By: Badalian Vyacheslav (slavon) 2015-03-06 10:43:37.668-0600
Hm... in 11.16.0 tag is still char. More info, maybe it helps. Another crash location:
gcc 4.8.2 ASAN
{code}
[root@vm-asterisk02t obs]# ./asan_symbolize.py.txt < /tmp/asan_dump.8289
=================================================================
==8289== ERROR: AddressSanitizer: SEGV on unknown address 0x000000003038 (pc 0x7f3874665e12 sp 0x7f3853dbbc90 bp 0x000000003048 T31)
AddressSanitizer can not provide additional info.
#0 0x7f3874665e11 in ?? ??:0
#1 0x7f3874673377 in ?? ??:0
#2 0x7f385f83ca2f in sdp_crypto_destroy /home/obs/asterisk-11.16.0/channels/sip/sdp_crypto.c:68
#3 0x7f385f840cb8 in sip_srtp_destroy /home/obs/asterisk-11.16.0/channels/sip/srtp.c:51
#4 0x7f385f6f1571 in __sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6445
#5 0x7f385f6f35be in sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6678
#6 0x7f385f6f346a in sip_destroy_fn /home/obs/asterisk-11.16.0/channels/chan_sip.c:6667
#7 0x4827c6 in internal_ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:466
#8 0x482c54 in __ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:548
#9 0x7f385f6c91d3 in dialog_unref_debug /home/obs/asterisk-11.16.0/channels/chan_sip.c:2332
#10 0x7f385f6cf4c5 in dialog_unlink_all /home/obs/asterisk-11.16.0/channels/chan_sip.c:3309
#11 0x7f385f77cb43 in dialog_needdestroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:19569
#12 0x484239 in internal_ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1107
#13 0x4848de in __ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1212
#14 0x7f385f7d9b3d in do_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29159
#15 0x7173c5 in dummy_start /home/obs/asterisk-11.16.0/main/utils.c:1223
#16 0x7f3874676ba7 in ?? ??:0
#17 0x7f38733da9d0 in start_thread pthread_create.c:0
#18 0x7f38741af8fc in __clone ??:0
Thread T31 created by T0 here:
#0 0x7f3874668b6b in ?? ??:0
#1 0x71775f in ast_pthread_create_stack /home/obs/asterisk-11.16.0/main/utils.c:1276
#2 0x7f385f7d9e99 in restart_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29207
#3 0x7f385f81e95c in load_module /home/obs/asterisk-11.16.0/channels/chan_sip.c:34799
#4 0x5f3488 in start_resource /home/obs/asterisk-11.16.0/main/loader.c:861
#5 0x5f4e11 in load_resource_list /home/obs/asterisk-11.16.0/main/loader.c:1063
#6 0x5f5ba1 in load_modules /home/obs/asterisk-11.16.0/main/loader.c:1216
#7 0x4814bc in main /home/obs/asterisk-11.16.0/main/asterisk.c:4337
#8 0x7f38740e5d5c in __libc_start_main ??:0
==8289== ABORTING
{code}
GCC 4.9.1 ASAN
{code}
[root@vm-asterisk02t obs]# cat /tmp/asan_dump.32422
=================================================================
==32422==ERROR: AddressSanitizer: SEGV on unknown address 0x00002095719c (pc 0x7f3f8c2c486d sp 0x7f3f8d21abf0 bp 0x0000209571ac T43)
#0 0x7f3f8c2c486c (/usr/lib64/libasan.so.1+0x1e86c)
#1 0x7f3f8c2fa585 in __interceptor_free (/usr/lib64/libasan.so.1+0x54585)
#2 0x7f3f74d90b8f in sdp_crypto_destroy sip/sdp_crypto.c:68
#3 0x7f3f74d94920 in sip_srtp_destroy sip/srtp.c:51
#4 0x7f3f74c3ddf8 in __sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6445
#5 0x7f3f74c400b0 in sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6678
#6 0x7f3f74c3fedb in sip_destroy_fn /home/obs/asterisk-11.16.0/channels/chan_sip.c:6667
#7 0x482ac2 in internal_ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:466
#8 0x482f37 in __ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:548
#9 0x7f3f74c136de in dialog_unref_debug /home/obs/asterisk-11.16.0/channels/chan_sip.c:2332
#10 0x7f3f74c19ae0 in dialog_unlink_all /home/obs/asterisk-11.16.0/channels/chan_sip.c:3309
#11 0x7f3f74cd13f9 in dialog_needdestroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:19569
#12 0x4844a4 in internal_ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1107
#13 0x484a37 in __ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1212
#14 0x7f3f74d30928 in do_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29159
#15 0x72cb74 in dummy_start /home/obs/asterisk-11.16.0/main/utils.c:1223
#16 0x7f3f8b0229d0 in start_thread (/lib64/libpthread.so.0+0x79d0)
#17 0x7f3f8bdf78fc in clone (/lib64/libc.so.6+0xe88fc)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
Thread T43 created by T0 here:
#0 0x7f3f8c2c9c6a in pthread_create (/usr/lib64/libasan.so.1+0x23c6a)
#1 0x72cf7d in ast_pthread_create_stack /home/obs/asterisk-11.16.0/main/utils.c:1276
#2 0x7f3f74d30cf1 in restart_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29207
#3 0x7f3f74d7647e in load_module /home/obs/asterisk-11.16.0/channels/chan_sip.c:34799
#4 0x60419f in start_resource /home/obs/asterisk-11.16.0/main/loader.c:861
#5 0x605c38 in load_resource_list /home/obs/asterisk-11.16.0/main/loader.c:1063
#6 0x606995 in load_modules /home/obs/asterisk-11.16.0/main/loader.c:1216
#7 0x48183c in main /home/obs/asterisk-11.16.0/main/asterisk.c:4337
#8 0x7f3f8bd2dd5c in __libc_start_main (/lib64/libc.so.6+0x1ed5c)
==32422==ABORTING
{code}

By: Matt Jordan (mjordan) 2015-03-15 20:50:17
{quote}
An alternative way of fixing this has just been merged into 11/13 in r432239/r432258.
{quote}
That isn't in 11.16.0. Please re-test with the tip of the 11 branch.

By: Matt Jordan (mjordan) 2015-05-22 10:33:04.166-0500
Closing out as Fixed, as the code was restructured to prevent this from occurring. If you can reproduce this with the latest Asterisk 11 release (11.18.0-rc1), then please comment on here and I'll reopen the issue. |
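The two explanations debated in the comments can be contrasted in a small sketch. A zeroed allocation leaves the tag pointer NULL, as Joshua Colp points out, whereas a write that runs past `local_key` leaves stray bytes in it, which is the hypothesis raised afterwards. The sketch also illustrates why an integer tag, as in the merged change Matt Jordan describes, would make a crash less likely without removing an underlying overrun. All struct and function names here are hypothetical, plain `calloc` stands in for `ast_calloc`, the 30-byte key length is inferred from the dumps, and the overrun is only simulated by writing through a byte view of the whole allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define KEY_LEN 30 /* assumed key length, matching the 30-byte keys in the dumps */

/* Hypothetical reduced structures: tag as a pointer and tag as an integer. */
struct crypto_ptr_tag {
	unsigned char local_key[KEY_LEN];
	char *tag;
};

struct crypto_int_tag {
	unsigned char local_key[KEY_LEN];
	int tag;
};

/*
 * Simulate a writer that fills the key and keeps going up to byte `end`
 * of the object. The write stays inside the allocation, so the
 * simulation itself is well defined.
 */
static void overrun_key(void *obj, size_t end)
{
	assert(obj != NULL);
	memset(obj, 0xAA, end);
}

/* Returns 1 when a zeroed allocation leaves the pointer tag NULL. */
int tag_null_after_calloc(void)
{
	struct crypto_ptr_tag *p = calloc(1, sizeof(*p));
	int is_null;

	if (!p) {
		return -1;
	}
	/* All-bits-zero reads back as NULL on mainstream platforms. */
	is_null = (p->tag == NULL);
	free(p);
	return is_null;
}

/* Returns 1 when the simulated overrun leaves the pointer tag non-NULL. */
int ptr_tag_clobbered(void)
{
	struct crypto_ptr_tag *p = calloc(1, sizeof(*p));
	char *null_tag = NULL;
	int clobbered;

	if (!p) {
		return -1;
	}
	overrun_key(p, offsetof(struct crypto_ptr_tag, tag) + sizeof(p->tag));
	/* Compare raw bytes; the stray pointer is never dereferenced or freed. */
	clobbered = (memcmp(&p->tag, &null_tag, sizeof(p->tag)) != 0);
	free(p);
	return clobbered;
}

/* Returns 1 when the same overrun leaves the integer tag non-zero. */
int int_tag_clobbered(void)
{
	struct crypto_int_tag *p = calloc(1, sizeof(*p));
	int clobbered;

	if (!p) {
		return -1;
	}
	overrun_key(p, offsetof(struct crypto_int_tag, tag) + sizeof(p->tag));
	/* The value is wrong, but an integer is never passed to free(). */
	clobbered = (p->tag != 0);
	free(p);
	return clobbered;
}
```

On a typical platform all three functions would be expected to return 1. The pointer variant is the dangerous one: a non-NULL stray value skips a guard such as `if (!p->tag)` and later reaches `free()`, which is where both AddressSanitizer traces fault. The integer variant still carries a corrupted value, so the overrun itself would remain to be found.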