[Home]

Summary:ASTERISK-30085: res_parking: Dynamic creation takes a long time with large number of extensions
Reporter:N A (InterLinked)Labels:
Date Opened:2022-05-30 11:36:18Date Closed:2022-05-31 13:07:15
Priority:MinorRegression?
Status:Closed/CompleteComponents:Resources/res_parking
Versions:18.9.0 Frequency of
Occurrence
Constant
Related
Issues:
Environment:Attachments:( 0) akriekcyui.txt
( 1) mcfbcownho.txt
Description:Asterisk deadlocks when parking a call.

Because the deadlock is in the dialplan, Asterisk must be restarted to clear the deadlock.

{noformat}
[rdct]
context => parkedcalls
parkpos => 1-9999999
parkingtime = 90
comebacktoorigin = no
comebackcontext = park-recall
comebackdialtime = 9999999
parkedmusicclass = default
{noformat}
Comments:By: Asterisk Team (asteriskteam) 2022-05-30 11:36:18.826-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: N A (InterLinked) 2022-05-30 11:36:37.125-0500

Core dump and core show locks attached

By: Sean Bright (seanbright) 2022-05-30 11:44:48.031-0500

Do you know what {{res_rdct.so}} is? I am not seeing a file by that name in the 18 branch, but it shows up in the lock backtrace you provided.

By: N A (InterLinked) 2022-05-30 11:54:49.181-0500

That is a custom module here locally.

The relevant code from it in the backtrace is this, which can be seen in the stack frame in the backtrace:

{noformat}
pbx_builtin_setvar_helper(chan, "__PARKINGDYNAMIC", "rdct"); /* Parking lot template */
pbx_builtin_setvar_helper(chan, "__PARKINGDYNCONTEXT", l->parkcontext); /* Dynamic dialplan park context */
pbx_builtin_setvar_helper(chan, "__PARKINGDYNPOS", "1-9999999"); /* Doesn't seem to get picked up for some reason... */
/* res_parking uses int for parking spaces, so full TN can't be used. Use 7-digit # instead. */
parkingspot = ast_channel_exten(chan) + 3;
ast_debug(2, "Parking spot: %s,%s\n", l->parkcontext, parkingspot);
pbx_builtin_setvar_helper(chan, "__PARKINGEXTEN", parkingspot); /* Parked # against */
pbx_builtin_setvar_helper(chan, "__PARKINGDYNEXTEN", parkingspot); /* Parked # against */

/* Try to actually park the call */
res = exec_app(chan, "Park", park_args, 0);
if (res) {
return res;
}
{noformat}

By: Joshua C. Colp (jcolp) 2022-05-30 12:03:01.881-0500

I don't see a deadlock in the backtrace, or in the "core show locks output". What happens if you get multiple backtraces? Do the threads change?

As well, any issues need to be reproduced without any involvement of an external module.

By: N A (InterLinked) 2022-05-30 12:04:01.787-0500

Here is the backtrace without the external module. It appears the same:

By: N A (InterLinked) 2022-05-30 12:04:56.170-0500

Dialplan for the last backtrace:

{noformat}
exten => _X!,1,MSet(__PARKINGDYNAMIC=rdct,__PARKINGDYNCONTEXT=centrex-1-park,__PARKINGDYNPOS=1-999999,__PARKINGEXTEN=2702129,__PARKINGDYNEXTEN=2702129)
same => n,Park(centrex-1-park,st(10)c(rdct-park-recall,4002702129,1))
{noformat}

Same res_parking.conf as before.

By: N A (InterLinked) 2022-05-30 12:07:14.473-0500

Well, maybe not a deadlock proper, per se, but there is this:

{noformat}
[2022-05-30 17:05:35] WARNING[14398][C-00000001]: channel.c:1084 __ast_queue_frame: Exceptionally long voice queue length queuing to Local/2425@rdct-originate-local-00000002;1
[2022-05-30 17:05:36] WARNING[14398][C-00000001]: channel.c:1084 __ast_queue_frame: Exceptionally long voice queue length queuing to Local/2425@rdct-originate-local-00000002;1
[2022-05-30 17:05:37] WARNING[14398][C-00000001]: channel.c:1084 __ast_queue_frame: Exceptionally long voice queue length queuing to Local/2425@rdct-originate-local-00000002;1
{noformat}

core show locks:
{noformat}

=======================================================================
=== 18.9.0
=== Currently Held Locks
=======================================================================
===
=== <pending> <lock#> (<file>): <lock type> <line num> <function> <lock name> <lock addr> (times locked)
===
=== Thread ID: 0x7f8fd7ed3700 LWP:14400 (pbx_thread           started at [ 4730] pbx.c ast_pbx_start())
=== ---> Lock #0 (pbx.c): WRLOCK 8982 ast_wrlock_context &con->lock 0x7f8ffc0cc980 (1)
       asterisk __ast_rwlock_wrlock()
       asterisk ast_wrlock_context()
       [0x7f8fe258d777] res_parking.so res_parking.c:796 parking_lot_cfg_create_extensions()
       [0x7f8fe258edc8] res_parking.so res_parking.c:1036 create_dynamic_lot_full()
       [0x7f8fe258f014] res_parking.so res_parking.c:1062 parking_create_dynamic_lot()
       [0x7f8fe25941cc] res_parking.so parking_applications.c:504 park_common_setup()
       [0x7f8fe25943d5] res_parking.so parking_applications.c:540 park_application_setup()
       [0x7f8fe2594563] res_parking.so parking_applications.c:564 park_app_exec()
       asterisk pbx_exec()
       asterisk <unknown>()
       asterisk ast_spawn_extension()
       asterisk <unknown>()
       asterisk <unknown>()
       asterisk <unknown>()
       [0x7f90048d1fa3] libpthread.so.0 pthread_create.c:487 start_thread()
       [0x7f9004365eff] libc.so.6 clone.S:97 clone()
=== -------------------------------------------------------------------
===
=======================================================================

{noformat}

Additionally, the channel can't be soft hung up. *Something* is stuck. It is stuck executing the Park application.

By: Joshua C. Colp (jcolp) 2022-05-30 12:11:39.218-0500

Okay, so this also involves a Local channel.

Please create a self contained block of dialplan logic with specific instructions for reproduction that you have tested and confirmed causes the issue so that if someone looks at this they are able to also use it.

By: N A (InterLinked) 2022-05-30 12:12:39.095-0500

Additionally, if you run "dialplan reload" that does create a deadlock.
The dialplan has a WRLOCK held indefinitely so as soon as something else needs that, then you have proper deadlock.

By: N A (InterLinked) 2022-05-30 12:17:12.386-0500

This reproduces the deadlock for me after getting run twice:

channel originate Local/2@rdct-park application Wait 11

{noformat}
[rdct-park]
exten => _X!,1,Wait(3)
same => n,MSet(__PARKINGDYNAMIC=rdct,__PARKINGDYNCONTEXT=centrex-1-park,__PARKINGDYNPOS=1-999999,__PARKINGEXTEN=2702129,__PARKINGDYNEXTEN=2702129)
same => n,Park(centrex-1-park,st(10)c(rdct-park-recall,4002702129,1))
{noformat}

{noformat}
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:956 __ast_rwlock_wrlock: pbx.c line 8982 (ast_wrlock_context): Deadlock? waited 15 sec for writelock '&con->lock'?
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(__ast_bt_get_addresses+0x1d) [0x56383d4486dd]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(__ast_rwlock_wrlock+0xb5) [0x56383d5052be]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(ast_wrlock_context+0x32) [0x56383d53978c]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /usr/lib/asterisk/modules/res_parking.so(+0x7777) [0x7fd243422777]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /usr/lib/asterisk/modules/res_parking.so(+0x8dc8) [0x7fd243423dc8]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /usr/lib/asterisk/modules/res_parking.so(+0x9014) [0x7fd243424014]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /usr/lib/asterisk/modules/res_parking.so(+0xe1cc) [0x7fd2434291cc]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /usr/lib/asterisk/modules/res_parking.so(+0xe3d5) [0x7fd2434293d5]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /usr/lib/asterisk/modules/res_parking.so(+0xe563) [0x7fd243429563]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(pbx_exec+0x138) [0x56383d53d287]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(+0x161a75) [0x56383d525a75]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(ast_spawn_extension+0x50) [0x56383d529d7e]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(+0x166b0d) [0x56383d52ab0d]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(+0x1684b6) [0x56383d52c4b6]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: asterisk(+0x214e03) [0x56383d5d8e03]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /lib/x86_64-linux-gnu/libpthread.so.0(+0x7fa3) [0x7fd265767fa3]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:76 __dump_backtrace: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fd2651fbeff]
[2022-05-30 17:15:12] ERROR[15087][C-00000002]: lock.c:964 __ast_rwlock_wrlock: pbx.c line 8982 (ast_wrlock_context): '&con->lock' was locked  here.
{noformat}

Same res_parking.conf as above.

Obviously, I should state this is with DETECT_DEADLOCKS and DEBUG_THREADS. That said, it's consistently and repeatedly showing a deadlock on the con->lock WRLOCK in the dialplan, when adding the dynamic parking location.

By: Joshua C. Colp (jcolp) 2022-05-30 12:23:15.443-0500

Why didn't you state originally that this included the use of DETECT_DEADLOCKS and DEBUG_THREADS? Oh, I assume you thought it would be obvious due to the output of "core show locks".

Does this happen without them?

As I previously stated that is an important data point for things.

By: Joshua C. Colp (jcolp) 2022-05-30 12:26:09.514-0500

And as well, the given reproduction instructions don't work due to lack of "centrex-1-park" parking lot. I changed the name of the one in the initial description to "centrex-1-park" and without the compiler flags set it works fine.

By: Joshua C. Colp (jcolp) 2022-05-30 12:30:01.377-0500

I rebuilt with DETECT_DEADLOCKS and DEBUG_THREADS set and also can't reproduce the issue with the given instructions under latest 18.

By: N A (InterLinked) 2022-05-30 12:30:13.886-0500

Sorry, yes, I thought it might have been obvious since core show locks is only present with DEBUG_THREADS, and I put it at the top of the backtrace.

centrex-1-park is dynamic. It doesn't exist (explicitly) in the dialplan.

I forgot you also need this in res_parking.conf to enable dynamic lots:
{noformat}
[general]
parkeddynamic = yes
{noformat}

I suppose I can recompile without those options to see, but I think the backtraces show this is consistent, not just a timing fluke caused by compiler options. Even if unlikely to happen otherwise, still a possible issue and a confirmed bug.

By: Joshua C. Colp (jcolp) 2022-05-30 12:37:33.626-0500

Issues aren't just a binary thing. It's not just "possible issue and a confirmed bug". Understanding the scope and impact over all is important. For example, the "parkeddynamic" option is not even used by default which means that this would apply to a limited subset of the Asterisk community, in fact you may be the only user which is why the issue exists.

By: Joshua C. Colp (jcolp) 2022-05-30 12:46:15.288-0500

There is no deadlock or infinite loop here. On creation of a dynamic context, every possible extension is created for the parking lot. You've set a large number of extensions, so it takes a long time to create all of the extensions.

By: N A (InterLinked) 2022-05-30 14:00:30.475-0500

The issue appears to happen even without those options, interestingly.

By: N A (InterLinked) 2022-05-30 14:01:37.989-0500

Ah, I see, yes, I suppose creating a million extensions would take a while ;)

I guess I will need to find a better way to allow arbitrary parking extensions like such to be used.

By: Kevin Harwell (kharwell) 2022-05-31 12:00:16.532-0500

[~InterLinked] Is this something you think you'll try to take on/patch?

By: N A (InterLinked) 2022-05-31 12:47:31.964-0500

As Joshua said, this was more user error than a real issue, I think.
If parkpos is not specified, then the extensions aren't dynamically created (and I didn't want them created anyways, so preventing that prevents this from happening).

That said even in the neighborhood of 100k extensions it didn't ever seem to finish, so I wonder if it takes too long, then it gets stuck. At the very least, maybe it should prevent users from dynamically creating millions of extensions and shooting themselves in the foot.

If it's deemed to be a problem, the issue can stay open, otherwise it can be closed; I'm not sure I have anything further to contribute.

By: Kevin Harwell (kharwell) 2022-05-31 13:07:15.511-0500

I don't disagree on a potential warning, or that there could be improvements done to the loading process but given the likelihood of the setup and scenario I'll close for now. It can be reopened later if it becomes more of a problem.