Summary: ASTERISK-24538: SRTP p->tag corruption
Reporter: Badalian Vyacheslav (slavon)
Labels:
Date Opened: 2014-11-19 09:29:02.000-0600
Date Closed: 2015-05-22 10:33:04
Priority: Critical
Regression?:
Status: Closed/Complete
Components: Channels/chan_sip/General
Versions: 11.13.1 11.14.1
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
( 0) core.m1-asterisk01.tcsbank.ru-2014-11-18T18:46:15+0300.txt
( 1) core.m1-asterisk01.tcsbank.ru-2014-11-18T19:27:42+0300.txt
( 2) core.m1-asterisk01.tcsbank.ru-2014-11-19T13:03:47+0300.txt
Description: Three crashes today, all in the same place.
Backtraces will be added below.
Comments:
By: Badalian Vyacheslav (slavon) 2014-11-19 09:34:33.849-0600

This build includes the patch from ASTERISK-24472 (ws_rewrite.diff).
The crash is in the SIP SDP code, not in the WS code that the patch changed.
I believe my patch does not cause a regression in the SDP code.

We can't test a vanilla version of Asterisk.
Without the patch, the server crashes after fewer than 50 calls or after 2-5 minutes of work.
Please do not discard this issue with the reason "not a vanilla version".

Also, this is a production server that does not use WSS and SRTP; it runs in the WS version (the server is waiting for fixes to the libsrtp and OpenSSL leaks).

By: Badalian Vyacheslav (slavon) 2014-11-19 09:36:40.642-0600

Before the first crash the server handled 16000+ calls...

By: Matt Jordan (mjordan) 2014-11-19 09:40:59.146-0600

In {{gdb}}, please print out the following:

{noformat}
# frame 3
# print *p
# print p->tag
# print p->local_key64
{noformat}

By: Badalian Vyacheslav (slavon) 2014-11-19 09:42:50.787-0600

{code}
(gdb) frame 3
#3  0x00007ffbd21d7b5a in sdp_crypto_offer (p=0x7ffba44962e0, taglen=80) at sip/sdp_crypto.c:304
304             if (ast_asprintf(&p->a_crypto, "a=crypto:%s AES_CM_128_HMAC_SHA1_%i inline:%s\r\n",
(gdb) print *p
$1 = {a_crypto = 0x0, local_key = "\004d\252\234\233\070p*\230\214X\265\254\344?\255p<jYHy\211\233\063\260\237H)\005", tag = 0xd0840bee6bd56333 <Address 0xd0840bee6bd56333 out of bounds>,
 local_key64 = "BGSqnJs4cCqYjFi1rOQ/rXA8allIeYmbM7CfSCkF", remote_key = '\000' <repeats 29 times>}
(gdb) print p->tag
$2 = 0xd0840bee6bd56333 <Address 0xd0840bee6bd56333 out of bounds>
(gdb) print p->local_key64
$3 = "BGSqnJs4cCqYjFi1rOQ/rXA8allIeYmbM7CfSCkF"

{code}

By: Badalian Vyacheslav (slavon) 2014-11-19 09:45:23.096-0600

Another core

{code}
(gdb) frame 3
#3  0x00007f6f86133b5a in sdp_crypto_offer (p=0x7f6f1566d8e0, taglen=80) at sip/sdp_crypto.c:304
304             if (ast_asprintf(&p->a_crypto, "a=crypto:%s AES_CM_128_HMAC_SHA1_%i inline:%s\r\n",
(gdb) print *p
$1 = {a_crypto = 0x0, local_key = "#\203\v\022\016\t\rI#_\251\276\217\201\257\337\306\366\377\363Չy\236\320i=\320,\202", tag = 0xd87 <Address 0xd87 out of bounds>, local_key64 = "I4MLEg4JDUkjX6m+j4Gv38b2//PViXme0Gk90CyC",
 remote_key = '\000' <repeats 29 times>}
(gdb) print p->tag
$2 = 0xd87 <Address 0xd87 out of bounds>
(gdb) print p->local_key64
$3 = "I4MLEg4JDUkjX6m+j4Gv38b2//PViXme0Gk90CyC"
{code}

By: Andrey Ovchinnikov (Andrey O) 2014-11-20 11:00:26.434-0600

Are there any updates?

By: Badalian Vyacheslav (slavon) 2014-11-21 01:14:42.669-0600

https://github.com/cisco/libsrtp/commit/8ba46ebc3daf228c82e79c907bfba661f092e09e

I think this should fix it.

By: Rusty Newton (rnewton) 2014-11-21 16:10:44.036-0600

[~slavon] What version of libsrtp are you using?

If you haven't tested with 1.5.0 can you do that and report back here?

Thanks!

By: Andrey Ovchinnikov (Andrey O) 2014-11-22 04:40:30.280-0600

The current version of libsrtp is:
libsrtp.x86_64                       1.4.4-10.20101004cvs.el6  @epel
libsrtp-devel.x86_64             1.4.4-10.20101004cvs.el6  @epel


By: Badalian Vyacheslav (slavon) 2014-11-22 06:25:11.939-0600

We are now testing libsrtp 1.5.0.
I will report back after testing it.

By: Badalian Vyacheslav (slavon) 2014-11-22 06:30:36.431-0600

100 calls now... no leaks... but at startup Asterisk produces this Valgrind report (repeated 6 times):

{code}
==44910== Conditional jump or move depends on uninitialised value(s)
==44910==    at 0xAF78401: cipher_type_test (cipher.c:150)
==44910==    by 0xAF7F7C8: crypto_kernel_load_cipher_type (crypto_kernel.c:336)
==44910==    by 0xAF7F943: crypto_kernel_init (crypto_kernel.c:169)
==44910==    by 0xAF750F8: srtp_init (srtp.c:1716)
==44910==    by 0xAD6D8E7: res_srtp_init (res_srtp.c:561)
==44910==    by 0xAD6D98D: load_module (res_srtp.c:584)
==44910==    by 0x503240: start_resource (loader.c:861)
==44910==    by 0x503CFF: load_resource_list (loader.c:1063)
==44910==    by 0x504332: load_modules (loader.c:1211)
==44910==    by 0x44C2DC: main (asterisk.c:4337)
==44910==  Uninitialised value was created by a heap allocation
==44910==    at 0x4A06A2E: malloc (vg_replace_malloc.c:270)
==44910==    by 0xAF7F9E5: crypto_alloc (alloc.c:102)
==44910==    by 0xAF7C641: aes_cbc_alloc (aes_cbc.c:74)
==44910==    by 0xAF7833F: cipher_type_test (cipher.c:115)
==44910==    by 0xAF7F7C8: crypto_kernel_load_cipher_type (crypto_kernel.c:336)
==44910==    by 0xAF7F943: crypto_kernel_init (crypto_kernel.c:169)
==44910==    by 0xAF750F8: srtp_init (srtp.c:1716)
==44910==    by 0xAD6D8E7: res_srtp_init (res_srtp.c:561)
==44910==    by 0xAD6D98D: load_module (res_srtp.c:584)
==44910==    by 0x503240: start_resource (loader.c:861)
==44910==    by 0x503CFF: load_resource_list (loader.c:1063)
==44910==    by 0x504332: load_modules (loader.c:1211)
==44910==
{code}

By: Badalian Vyacheslav (slavon) 2014-11-24 06:08:38.803-0600

We see that 1.5.0 has no Valgrind leaks, only the init bug that I posted before.

By: Andrey Ovchinnikov (Andrey O) 2014-11-26 02:37:43.575-0600

Are there any updates?
This bug is very important. The crash occurs 1-7 times every day.

By: Badalian Vyacheslav (slavon) 2014-11-27 09:08:58.376-0600

Tested with 1.5.0:

1 = {a_crypto = 0x0, local_key = "\325N\343\354\226+\327\016\360e\032\225\353\221\026\266\066j\245\220\246\302\344xpܤ)z}", tag = 0x88accb537a3804f <Address 0x88accb537a3804f out of bounds>,
 local_key64 = "1U7j7JYr1w7wZRqV65EWtjZqpZCmwuR4cNykKXp9", remote_key = '\000' <repeats 29 times>}


The same error...

By: Badalian Vyacheslav (slavon) 2014-11-28 20:25:12.389-0600

Hm... this is your part of the code... the tag is not generated by libsrtp...

By: Badalian Vyacheslav (slavon) 2014-11-28 20:42:43.378-0600

This is because you allocate in {{sdp_crypto_setup}} but do not initialize {{p->tag}}.
You must set:
{code}
p->tag=NULL;
{code}

By: Badalian Vyacheslav (slavon) 2014-11-28 20:53:16.703-0600

If {{p->tag != NULL}}, the next part of the code does not work:

{code}
       if (!p->tag) {
               ast_debug(1, "Accepting crypto tag %s\n", tag);
               p->tag = ast_strdup(tag);
               if (!p->tag) {
                       ast_log(LOG_ERROR, "Could not allocate memory for tag\n");
                       return -1;
               }
       }

{code}

This can happen if the memory allocated for {{p}} reuses an existing region of memory, so {{p->tag}} may hold a non-NULL value.
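
To illustrate the concern above, here is a minimal sketch of a guard of the same shape as the quoted snippet. The names {{demo_crypto}} and {{demo_accept_tag}} are hypothetical and are not Asterisk code; plain {{malloc}} stands in for {{ast_strdup}}. It only shows that a non-NULL {{tag}}, whether valid or stale, makes the guard skip the assignment.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the crypto structure; only the tag field matters here. */
struct demo_crypto {
    char *tag;
};

/*
 * Store a copy of `tag` only if no tag is set yet.
 * Returns 1 if the tag was stored, 0 if the guard skipped it because
 * p->tag was already non-NULL, and -1 if the allocation failed.
 */
static int demo_accept_tag(struct demo_crypto *p, const char *tag)
{
    size_t len;

    if (p->tag) {
        /* A stale or corrupted non-NULL pointer ends up here as well. */
        return 0;
    }

    len = strlen(tag) + 1;
    p->tag = malloc(len);
    if (!p->tag) {
        return -1;
    }
    memcpy(p->tag, tag, len);
    return 1;
}
```

If {{tag}} holds a garbage address when the guard runs, the copy is skipped and any later use or {{free}} of that pointer is undefined behaviour, which would be consistent with the out-of-bounds addresses in the gdb output.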

By: Badalian Vyacheslav (slavon) 2014-11-28 20:59:28.484-0600

Patch attached

By: Badalian Vyacheslav (slavon) 2014-12-01 06:25:26.036-0600

Tested with 20,000 calls. The bug no longer appears in the Valgrind output.

By: Joshua C. Colp (jcolp) 2014-12-01 06:28:59.129-0600

Your fix is likely just covering something else up. The ast_sdp_crypto structure is allocated using ast_calloc. This zeros the allocated memory, which sets p->tag to NULL.
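
The point about zeroed allocation can be sketched with the standard {{calloc}}. This is an assumption-laden illustration: {{demo_crypto}} and {{demo_crypto_alloc}} are hypothetical names, the 30-byte key length is inferred from the gdb dumps in this issue, and the sketch relies on the null pointer being represented as all-zero bits, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <stdlib.h>

#define DEMO_KEY_LEN 30 /* assumption: key size inferred from the gdb output */

/* Hypothetical stand-in for the crypto structure; not the Asterisk definition. */
struct demo_crypto {
    char *a_crypto;
    unsigned char local_key[DEMO_KEY_LEN];
    char *tag;
};

/*
 * Allocate one zero-filled structure. calloc() sets every byte to zero,
 * so on platforms where NULL is all-zero bits both pointers start as NULL.
 * The caller owns the result and releases it with free().
 */
static struct demo_crypto *demo_crypto_alloc(void)
{
    return calloc(1, sizeof(struct demo_crypto));
}
```

Under these assumptions an explicit {{p->tag = NULL;}} right after the allocation changes nothing, so a non-NULL value seen later has to come from a write that happens after the allocation.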

By: Badalian Vyacheslav (slavon) 2014-12-01 06:37:15.885-0600

Hmmm... but there are no more crashes or Valgrind errors... maybe {{local_key}} overflows into {{tag}} before the code sets it to NULL?

By: Joshua C. Colp (jcolp) 2014-12-01 06:40:50.396-0600

If so that would be the real issue.
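
The overflow hypothesis raised above can be sketched as follows. This is not a reproduction of the bug: {{demo_crypto}} and {{demo_write_key}} are hypothetical, the field order copies what gdb printed ({{a_crypto}}, {{local_key}}, {{tag}}), and the 30-byte key length is an assumption. The struct is treated as raw bytes so the demonstration itself stays within defined behaviour.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_KEY_LEN 30 /* assumption: key size inferred from the gdb output */

/* Hypothetical layout mirroring the field order printed by gdb. */
struct demo_crypto {
    char *a_crypto;
    unsigned char local_key[DEMO_KEY_LEN];
    char *tag;
};

/*
 * Copy n bytes to the position where local_key starts, addressing the
 * struct as raw bytes. Returns 1 if the bytes that hold `tag` changed,
 * 0 if they did not, and -1 if n would run past the end of the struct.
 */
static int demo_write_key(struct demo_crypto *p, const unsigned char *src, size_t n)
{
    const size_t key_off = offsetof(struct demo_crypto, local_key);
    unsigned char before[sizeof(p->tag)];

    if (n > sizeof(*p) - key_off) {
        return -1;
    }

    /* Snapshot the tag bytes, write the key, then compare. */
    memcpy(before, &p->tag, sizeof(before));
    memcpy((unsigned char *)p + key_off, src, n);
    return memcmp(before, &p->tag, sizeof(before)) != 0;
}
```

Writing exactly {{DEMO_KEY_LEN}} bytes leaves {{tag}} untouched, while a longer write reaches the adjacent pointer. Whether anything in the real code path writes more than the key length is not established in this issue.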

By: Badalian Vyacheslav (slavon) 2014-12-02 20:16:13.670-0600

I added more logging after every line that may change the {{p}} structure. I will try to find what is causing the noise.

By: Andrey Ovchinnikov (Andrey O) 2014-12-03 09:10:50.069-0600

Do you have any idea what could be the reason?

By: Andrey Ovchinnikov (Andrey O) 2014-12-08 08:53:17.502-0600

What kind of information do you need from us?

By: Matt Jordan (mjordan) 2015-02-25 16:00:31.533-0600

An alternative way of fixing this has just been merged into 11/13 in r432239/r432258. In that particular patch, the {{tag}} field was changed to be an integer. This may prevent this crash. If you can, please test with the latest from the appropriate branch.

At the same time, if something is writing past the length of the {{local_key}} buffer - which would be a problem in {{libsrtp}} - then I'd expect some strange behaviour regardless. We would be less likely to crash is all.
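
As a rough sketch of why an integer tag is harder to crash on, the following stores the tag as an {{int}} and formats an {{a=crypto}} line with the same layout as the format string visible in the gdb frame. It is not the change from r432239/r432258: {{demo_parse_tag}} and {{demo_format_crypto}} are hypothetical helpers written only for illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a decimal crypto tag into *out.
 * Returns 0 on success, -1 if the text is not a non-negative int.
 */
static int demo_parse_tag(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE || v < 0 || v > INT_MAX) {
        return -1;
    }
    *out = (int)v;
    return 0;
}

/*
 * Format the a=crypto line into buf. Returns the number of characters
 * written, or -1 if the line does not fit or formatting fails.
 */
static int demo_format_crypto(char *buf, size_t size, int tag, int taglen, const char *key64)
{
    int n = snprintf(buf, size, "a=crypto:%d AES_CM_128_HMAC_SHA1_%d inline:%s\r\n",
                     tag, taglen, key64);

    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    return n;
}
```

With an integer there is no pointer to dereference or free, so a corrupted value would at worst produce a wrong tag number in the SDP, which matches the remark that the code would be less likely to crash rather than free of the underlying problem.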

By: Badalian Vyacheslav (slavon) 2015-03-06 10:42:18.675-0600

Hm... in 11.16.0 {{tag}} is a char.

More info, maybe it helps. Another crash location:

{code}
[root@vm-asterisk02t obs]# ./asan_symbolize.py.txt < /tmp/asan_dump.8289
=================================================================
==8289== ERROR: AddressSanitizer: SEGV on unknown address 0x000000003038 (pc 0x7f3874665e12 sp 0x7f3853dbbc90 bp 0x000000003048 T31)
AddressSanitizer can not provide additional info.
   #0 0x7f3874665e11 in ?? ??:0
   #1 0x7f3874673377 in ?? ??:0
   #2 0x7f385f83ca2f in sdp_crypto_destroy /home/obs/asterisk-11.16.0/channels/sip/sdp_crypto.c:68
   #3 0x7f385f840cb8 in sip_srtp_destroy /home/obs/asterisk-11.16.0/channels/sip/srtp.c:51
   #4 0x7f385f6f1571 in __sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6445
   #5 0x7f385f6f35be in sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6678
   #6 0x7f385f6f346a in sip_destroy_fn /home/obs/asterisk-11.16.0/channels/chan_sip.c:6667
   #7 0x4827c6 in internal_ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:466
   #8 0x482c54 in __ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:548
   #9 0x7f385f6c91d3 in dialog_unref_debug /home/obs/asterisk-11.16.0/channels/chan_sip.c:2332
   #10 0x7f385f6cf4c5 in dialog_unlink_all /home/obs/asterisk-11.16.0/channels/chan_sip.c:3309
   #11 0x7f385f77cb43 in dialog_needdestroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:19569
   #12 0x484239 in internal_ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1107
   #13 0x4848de in __ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1212
   #14 0x7f385f7d9b3d in do_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29159
   #15 0x7173c5 in dummy_start /home/obs/asterisk-11.16.0/main/utils.c:1223
   #16 0x7f3874676ba7 in ?? ??:0
   #17 0x7f38733da9d0 in start_thread pthread_create.c:0
   #18 0x7f38741af8fc in __clone ??:0
Thread T31 created by T0 here:
   #0 0x7f3874668b6b in ?? ??:0
   #1 0x71775f in ast_pthread_create_stack /home/obs/asterisk-11.16.0/main/utils.c:1276
   #2 0x7f385f7d9e99 in restart_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29207
   #3 0x7f385f81e95c in load_module /home/obs/asterisk-11.16.0/channels/chan_sip.c:34799
   #4 0x5f3488 in start_resource /home/obs/asterisk-11.16.0/main/loader.c:861
   #5 0x5f4e11 in load_resource_list /home/obs/asterisk-11.16.0/main/loader.c:1063
   #6 0x5f5ba1 in load_modules /home/obs/asterisk-11.16.0/main/loader.c:1216
   #7 0x4814bc in main /home/obs/asterisk-11.16.0/main/asterisk.c:4337
   #8 0x7f38740e5d5c in __libc_start_main ??:0
==8289== ABORTING

{code}


By: Badalian Vyacheslav (slavon) 2015-03-06 10:43:37.668-0600

Hm... in 11.16.0 {{tag}} is a char.

More info, maybe it helps. Another crash location:

GCC 4.8.2 ASAN
{code}
[root@vm-asterisk02t obs]# ./asan_symbolize.py.txt < /tmp/asan_dump.8289
=================================================================
==8289== ERROR: AddressSanitizer: SEGV on unknown address 0x000000003038 (pc 0x7f3874665e12 sp 0x7f3853dbbc90 bp 0x000000003048 T31)
AddressSanitizer can not provide additional info.
   #0 0x7f3874665e11 in ?? ??:0
   #1 0x7f3874673377 in ?? ??:0
   #2 0x7f385f83ca2f in sdp_crypto_destroy /home/obs/asterisk-11.16.0/channels/sip/sdp_crypto.c:68
   #3 0x7f385f840cb8 in sip_srtp_destroy /home/obs/asterisk-11.16.0/channels/sip/srtp.c:51
   #4 0x7f385f6f1571 in __sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6445
   #5 0x7f385f6f35be in sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6678
   #6 0x7f385f6f346a in sip_destroy_fn /home/obs/asterisk-11.16.0/channels/chan_sip.c:6667
   #7 0x4827c6 in internal_ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:466
   #8 0x482c54 in __ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:548
   #9 0x7f385f6c91d3 in dialog_unref_debug /home/obs/asterisk-11.16.0/channels/chan_sip.c:2332
   #10 0x7f385f6cf4c5 in dialog_unlink_all /home/obs/asterisk-11.16.0/channels/chan_sip.c:3309
   #11 0x7f385f77cb43 in dialog_needdestroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:19569
   #12 0x484239 in internal_ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1107
   #13 0x4848de in __ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1212
   #14 0x7f385f7d9b3d in do_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29159
   #15 0x7173c5 in dummy_start /home/obs/asterisk-11.16.0/main/utils.c:1223
   #16 0x7f3874676ba7 in ?? ??:0
   #17 0x7f38733da9d0 in start_thread pthread_create.c:0
   #18 0x7f38741af8fc in __clone ??:0
Thread T31 created by T0 here:
   #0 0x7f3874668b6b in ?? ??:0
   #1 0x71775f in ast_pthread_create_stack /home/obs/asterisk-11.16.0/main/utils.c:1276
   #2 0x7f385f7d9e99 in restart_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29207
   #3 0x7f385f81e95c in load_module /home/obs/asterisk-11.16.0/channels/chan_sip.c:34799
   #4 0x5f3488 in start_resource /home/obs/asterisk-11.16.0/main/loader.c:861
   #5 0x5f4e11 in load_resource_list /home/obs/asterisk-11.16.0/main/loader.c:1063
   #6 0x5f5ba1 in load_modules /home/obs/asterisk-11.16.0/main/loader.c:1216
   #7 0x4814bc in main /home/obs/asterisk-11.16.0/main/asterisk.c:4337
   #8 0x7f38740e5d5c in __libc_start_main ??:0
==8289== ABORTING

{code}

GCC 4.9.1 ASAN
{code}
[root@vm-asterisk02t obs]# cat /tmp/asan_dump.32422
=================================================================
==32422==ERROR: AddressSanitizer: SEGV on unknown address 0x00002095719c (pc 0x7f3f8c2c486d sp 0x7f3f8d21abf0 bp 0x0000209571ac T43)
   #0 0x7f3f8c2c486c (/usr/lib64/libasan.so.1+0x1e86c)
   #1 0x7f3f8c2fa585 in __interceptor_free (/usr/lib64/libasan.so.1+0x54585)
   #2 0x7f3f74d90b8f in sdp_crypto_destroy sip/sdp_crypto.c:68
   #3 0x7f3f74d94920 in sip_srtp_destroy sip/srtp.c:51
   #4 0x7f3f74c3ddf8 in __sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6445
   #5 0x7f3f74c400b0 in sip_destroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:6678
   #6 0x7f3f74c3fedb in sip_destroy_fn /home/obs/asterisk-11.16.0/channels/chan_sip.c:6667
   #7 0x482ac2 in internal_ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:466
   #8 0x482f37 in __ao2_ref /home/obs/asterisk-11.16.0/main/astobj2.c:548
   #9 0x7f3f74c136de in dialog_unref_debug /home/obs/asterisk-11.16.0/channels/chan_sip.c:2332
   #10 0x7f3f74c19ae0 in dialog_unlink_all /home/obs/asterisk-11.16.0/channels/chan_sip.c:3309
   #11 0x7f3f74cd13f9 in dialog_needdestroy /home/obs/asterisk-11.16.0/channels/chan_sip.c:19569
   #12 0x4844a4 in internal_ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1107
   #13 0x484a37 in __ao2_callback /home/obs/asterisk-11.16.0/main/astobj2.c:1212
   #14 0x7f3f74d30928 in do_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29159
   #15 0x72cb74 in dummy_start /home/obs/asterisk-11.16.0/main/utils.c:1223
   #16 0x7f3f8b0229d0 in start_thread (/lib64/libpthread.so.0+0x79d0)
   #17 0x7f3f8bdf78fc in clone (/lib64/libc.so.6+0xe88fc)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
Thread T43 created by T0 here:
   #0 0x7f3f8c2c9c6a in pthread_create (/usr/lib64/libasan.so.1+0x23c6a)
   #1 0x72cf7d in ast_pthread_create_stack /home/obs/asterisk-11.16.0/main/utils.c:1276
   #2 0x7f3f74d30cf1 in restart_monitor /home/obs/asterisk-11.16.0/channels/chan_sip.c:29207
   #3 0x7f3f74d7647e in load_module /home/obs/asterisk-11.16.0/channels/chan_sip.c:34799
   #4 0x60419f in start_resource /home/obs/asterisk-11.16.0/main/loader.c:861
   #5 0x605c38 in load_resource_list /home/obs/asterisk-11.16.0/main/loader.c:1063
   #6 0x606995 in load_modules /home/obs/asterisk-11.16.0/main/loader.c:1216
   #7 0x48183c in main /home/obs/asterisk-11.16.0/main/asterisk.c:4337
   #8 0x7f3f8bd2dd5c in __libc_start_main (/lib64/libc.so.6+0x1ed5c)

==32422==ABORTING

{code}

By: Matt Jordan (mjordan) 2015-03-15 20:50:17

{quote}
An alternative way of fixing this has just been merged into 11/13 in r432239/r432258.
{quote}

That isn't in 11.16.0. Please re-test with the tip of the 11 branch.

By: Matt Jordan (mjordan) 2015-05-22 10:33:04.166-0500

Closing out as Fixed, as the code was restructured to prevent this from occurring.

If you can reproduce this with the latest Asterisk 11 release (11.18.0-rc1), then please comment on here and I'll reopen the issue.