[Home]

Summary:ASTERISK-24368: res_pjsip_pubsub: Subscription persistence causes crash when re-constructing stored subscription
Reporter:Matt Jordan (mjordan)Labels:
Date Opened:2014-09-29 14:52:05Date Closed:2014-10-05 19:08:42
Priority:CriticalRegression?
Status:Closed/CompleteComponents:Resources/res_pjsip_pubsub
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) ASTERISK-24368-13.diff
( 1) ASTERISK-24368-13-root.diff
( 2) ASTERISK-24368-bt.txt
( 3) pjsip.conf
Description:When some subscriptions are persisted to the AstDB, there appear to be code paths where the re-construction of the subscription is rejected due to an invalid transport. This causes an endless cycle of crashes on startup, preventing Asterisk from ever starting. Worse, since this is stored in the AstDB by default, there exists no way to bail out of it.

Note that there are multiple places where this can crash. Logs are attached; however, the relevant backtraces are as follows.

h1. Example Database Entry

{noformat}
digium-test-03*CLI> database show
/registrar/contact/2001;@sip:2001@132.177.252.58:44778;transport=tls;registering_acc=digium-test-03_digium_sipit_net: {"outbound_proxy":"","expiration_time":"1412000075","path":"","qualify_frequency":"0","user_agent":"Jitsi2.4.4997Linux","uri":"sip:2001@132.177.252.58:44778;transport=tls;registering_acc=digium-test-03_digium_sipit_net"}
/registrar/contact/2001;@sip:2001@132.177.252.58:55716;transport=tls;registering_acc=digium-test-03_digium_sipit_net: {"outbound_proxy":"","expiration_time":"1411999632","path":"","qualify_frequency":"0","user_agent":"Jitsi2.4.4997Linux","uri":"sip:2001@132.177.252.58:55716;transport=tls;registering_acc=digium-test-03_digium_sipit_net"}
/subscription_persistence/subscription_persistence/63c90551-9245-4706-b792-bafae4e1d638: {"cseq":"19090","tag":"75b6d557-0c06-4579-b5da-4865d1380060","transport_key":"TLS","packet":"SUBSCRIBE sip:2001@digium-test-03.digium.sipit.net SIP/2.0\r\nCall-ID: 3ab5b4f06f8f1069ac184120a6c3ba05@0:0:0:0:0:0:0:0\r\nCSeq: 1 SUBSCRIBE\r\nFrom: \"2001\" <sip:2001@digium-test-03.digium.sipit.net>;tag=4f2aac56\r\nTo: \"2001\" <sip:2001@digium-test-03.digium.sipit.net>\r\nVia: SIP/2.0/TLS 132.177.252.58:44778;branch=z9hG4bK-353937-d9d17e7588e3e91e6eba5bba62fed0d9\r\nMax-Forwards: 70\r\nContact: \"2001\" <sip:2001@132.177.252.58:44778;transport=tls;registering_acc=digium-test-03_digium_sipit_net>\r\nUser-Agent: Jitsi2.4.4997Linux\r\nEvent: message-summary\r\nAccept: application/simple-message-summary\r\nExpires: 3600\r\nContent-Length: 0\r\n\r\n","src_port":"44778","local_port":"5061","local_name":"2606:4100:3880:2520::6:3","expires":"1412003075","endpoint":"2001","src_name":"::ffff:132.177.252.58"}
{noformat}

h1. Location Number One

In this particular off-nominal, the subscription is rejected on creation by PJPROJECT. This returns back a NULL {{sub}} pointer, which we then dereference to hilarity.

{noformat}
#0  shutdown_subscriptions (sub=0x0) at res_pjsip_pubsub.c:1145
1145 if (AST_VECTOR_SIZE(&sub->children) > 0) {
(gdb) bt
#0  shutdown_subscriptions (sub=0x0) at res_pjsip_pubsub.c:1145
#1  0x00007fbcbc0115cd in subscription_tree_destructor (obj=0x7fbcc448d2b8) at res_pjsip_pubsub.c:1171
#2  0x000000000045ff0c in internal_ao2_ref (user_data=0x7fbcc448d2b8, delta=-1, file=0x615494 "astobj2.c", line=513, func=0x615720 <__FUNCTION__.8363> "__ao2_ref")
   at astobj2.c:442
#3  0x0000000000460230 in __ao2_ref (user_data=0x7fbcc448d2b8, delta=-1) at astobj2.c:513
#4  0x00007fbcbc00fcfb in create_subscription_tree (handler=handler@entry=0x7fbca7563380 <presence_handler>, endpoint=endpoint@entry=0x2ad08c0,
   rdata=rdata@entry=0x7fbcfc180840, resource=resource@entry=0x7fbcfc1807f0 "1001", generator=generator@entry=0x7fbcaf390180 <pidf_body_generator>,
   tree=tree@entry=0x7fbcfc180830) at res_pjsip_pubsub.c:1249
#5  0x00007fbcbc01086b in subscription_persistence_recreate (obj=0x7fbcc46cd6d0, arg=<optimized out>, flags=<optimized out>) at res_pjsip_pubsub.c:1359
#6  0x000000000046121b in internal_ao2_traverse (self=0x7fbcc454a0c0, flags=OBJ_NODATA, cb_fn=0x7fbcbc0104e0 <subscription_persistence_recreate>, arg=0x7fbcc4275530,
   data=0x0, type=AO2_CALLBACK_DEFAULT, tag=0x0, file=0x0, line=0, func=0x0) at astobj2_container.c:351
#7  0x0000000000461538 in __ao2_callback (c=0x7fbcc454a0c0, flags=OBJ_NODATA, cb_fn=0x7fbcbc0104e0 <subscription_persistence_recreate>, arg=0x7fbcc4275530)
   at astobj2_container.c:452
#8  0x00007fbcbc00f551 in subscription_persistence_load (data=<optimized out>) at res_pjsip_pubsub.c:1384
#9  0x00000000005e9ee1 in ast_taskprocessor_execute (tps=0x17b0358) at taskprocessor.c:766
#10 0x00000000005f150d in threadpool_execute (pool=0x1667bb8) at threadpool.c:351
#11 0x00000000005f2b47 in worker_active (worker=0x17b98e8) at threadpool.c:1072
#12 0x00000000005f28e3 in worker_start (arg=0x17b98e8) at threadpool.c:992
#13 0x00000000005fe89b in dummy_start (data=0x16b8e30) at utils.c:1201
#14 0x00007fbcffa16182 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#15 0x00007fbcfecd1fbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
{noformat}

With suitable protection, the logs show the following:
{noformat}
[Sep 29 03:30:57] ERROR[8069]: res_pjsip.c:2190 ast_sip_create_dialog_uas: Could not create dialog with endpoint 2001. Invalid URI (PJSIP_EINVALIDURI)
[Sep 29 03:30:57] WARNING[8069]: res_pjsip_pubsub.c:1252 create_subscription_tree: Unable to create dialog for SIP subscription
[Sep 29 03:30:57] ERROR[8069]: res_pjsip_pubsub.c:1366 subscription_persistence_recreate: FAILED to re-create subscription
{noformat}

h1. Crash Location Number Two

This one occurred less frequently, but with the same error condition: the PJSIP stack refused to create the subscription. More interestingly, however, is that this subscription did 'exist', but did not have a 'root' for the tree. This also caused a crash in {{res_pjsip_pubsub}}. I didn't manage to get a backtrace for that, but the attached patch shows where the subscription managed to exist but not have the root to the tree. The JSON in the database on this issue was used for this problem.
Comments:By: Matt Jordan (mjordan) 2014-09-29 14:55:10.337-0500

Note that this issue was found at SIPit.