Summary: ASTERISK-05843: [post 1.4] improper handling of contexts with same name
Reporter: Luigi Rizzo (rizzo)
Labels:
Date Opened: 2005-12-14 15:39:11.000-0600
Date Closed: 2008-03-12 17:47:41
Priority: Major
Regression?: No
Status: Closed/Complete
Components: Core/Configuration
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments:
Description:
[I have put reproducibility=always because the problem is deterministic,
and severity=major because it is an undetected configuration error which
may result in serious and hard-to-detect misbehaviours of the dialplan.
Then, your mileage may vary]

When using regcontext=xyz where xyz is the name of a context already
existing in extensions.conf (or probably some other config file as well),
asterisk will create two instances of the context xyz.
However the extension lookup code will stop the search after the
first instance, thus resulting in unexpected results.

A stripped down example is below, where the _5. entry is in extensions.conf,
and the 551 entry is the result of regcontext=local-users regexten=551 in
sip.conf for a peer.

The source of the problem is that regcontext creates immediately
the empty context in the global list, whereas pbx_config later
builds contexts in a temporary list, then calling
ast_merge_contexts_and_delete() at line 1776 to merge the two lists.

Unfortunately the list merging only puts the local list in front of
the existing contexts, without checks for duplicates.
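
To make the failure mode concrete, here is a minimal C sketch of the two behaviours described above. The struct ctx type and the merge_prepend() and find_exten() helpers are hypothetical simplifications, not the actual pbx.c code; each context holds a single extension and matching is an exact string compare.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a dialplan context. */
struct ctx {
    const char *name;
    const char *registrar;
    const char *exten;      /* one extension per context, for brevity */
    struct ctx *next;
};

/* Merge as described in the report: the local list is put in front of
 * the global one, with no check for duplicate context names. */
struct ctx *merge_prepend(struct ctx *local, struct ctx *global)
{
    struct ctx *tail = local;

    if (!local)
        return global;
    while (tail->next)
        tail = tail->next;
    tail->next = global;
    return local;
}

/* Lookup that stops at the first context whose name matches, even if a
 * later context with the same name holds the extension. */
const struct ctx *find_exten(const struct ctx *head, const char *context,
                             const char *exten)
{
    const struct ctx *c;

    for (c = head; c; c = c->next) {
        if (strcmp(c->name, context) == 0)
            return strcmp(c->exten, exten) == 0 ? c : NULL;
    }
    return NULL;
}
```

With a 'local-users' context from SIP already in the global list and a second 'local-users' staged by pbx_config, merge_prepend() yields two contexts of the same name, and find_exten(head, "local-users", "551") returns NULL even though the SIP context holds 551.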

I have no idea what the proper fix is, nor what the behaviour is
on an 'extensions reload' or similar.

Surely, at the very least the code should produce a big warning message
in case we found one such misconfiguration (i.e. multiple contexts with
the same name), if merging the two is not possible or too expensive.

On a related topic: most functions that compare context names are
case-sensitive, however a few of them are not, e.g.
ast_context_create()
complete_show_dialplan_context()
__ast_context_destroy()
and possibly more.
Apart from the inconsistency that needs to be fixed, there is also
the issue that most of asterisk is case-insensitive when it comes to
names, so i think you should state clearly what is the policy
and why contexts are dealt with in a different way.



****** ADDITIONAL INFORMATION ******

*CLI> show dialplan local-users
[ Context 'local-users' created by 'pbx_config' ]
 '_5.' =>          2. Dial(SIP/${EXTEN})                         [pbx_config]

[ Context 'local-users' created by 'SIP' ]  
 '551' =>          1. Noop(551)                                  [SIP]

*CLI> dial 551@local-users
No such extension '551' in context 'local-users'
*CLI>
Comments:

By: Luigi Rizzo (rizzo) 2005-12-14 15:40:45.000-0600

BTW i do have the disclaimer on file... and the problem seems to be
a long standing one, not related to a particular SVN version.

By: Olle Johansson (oej) 2005-12-15 05:24:34.000-0600

Let's try to take one problem per issue report. I would suggest focusing on the multiple contexts in this one and opening another to discuss the rules for context handling.

By: Luigi Rizzo (rizzo) 2005-12-15 05:58:40.000-0600

i generally try (modulo mistakes) to keep issues separate,
but in this case i reported them together intentionally.
The two issues are strongly related, because the fix has
to decide when two contexts have the same name, and this is either
case-sensitive or case-insensitive, and a decision has to be made.
The obvious solution would be to define a
function (or macro) ast_ctx_match() and use it consistently wherever
we try a context name match,
so we can revisit the decision very quickly in the future.
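
A minimal sketch of what such a helper could look like follows. The name ast_ctx_match() comes from the suggestion above; the CTX_MATCH_CASE_SENSITIVE switch and the choice of case-sensitive matching as the default are assumptions made purely for illustration, not a statement of Asterisk policy.

```c
#include <string.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

/* Assumed policy switch: 1 = case-sensitive, 0 = case-insensitive. */
#define CTX_MATCH_CASE_SENSITIVE 1

/* Returns nonzero when the two names denote the same context.
 * Every context-name comparison would go through this one function,
 * so the policy can be changed in a single place. */
int ast_ctx_match(const char *a, const char *b)
{
#if CTX_MATCH_CASE_SENSITIVE
    return strcmp(a, b) == 0;
#else
    return strcasecmp(a, b) == 0;
#endif
}
```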

By: Matt O'Gorman (mogorman) 2006-01-17 11:02:44.000-0600

Rizzo, I tried duplicating this behavior on our lab machine and was unable to do so. The 2 contexts merged fine.
[ Context 'default' created by 'pbx_config' ]
 '1000' =>         1. VoicemailMain()                            [pbx_config]
 '1234' =>         1. Noop(polytest2)                            [SIP]
                   2. dial(sip/linphone)                         [pbx_config]
 '4321' =>         1. voicemailmain()                            [pbx_config]
 '501' =>          1. Dial(SIP/polytest2)                        [pbx_config]
 '601' =>          1. Dial(SIP/polytest)                         [pbx_config]
 '6025' =>         hint: Zap/3                                   [pbx_config]
 '6252' =>         hint: Zap/1                                   [pbx_config]
 '6266' =>         hint: Zap/2                                   [pbx_config]
 '7000' =>         1. Dial(sip/linphone)                         [pbx_config]
 'linphone' =>     1. Noop(linphone)                             [SIP]
 'polytest' =>     1. Noop(polytest)                             [SIP]
 Include =>        'parkedcalls'                                 [pbx_config]


The only issue at all is that my regexten flattened a line in my config file, which some might not consider a bug. Are you still able to replicate this as of SVN trunk 8132?

By: Luigi Rizzo (rizzo) 2006-01-17 11:21:03.000-0600

yes, see below. Thing is, nothing has changed in the relevant code,
so the analysis above (of which i am pretty confident now) still applies,
please re-read it: the code does not check for duplicate contexts,
simply appends two lists.
The only reason why you might not see the problem is if pbx_config runs
before the external module, e.g. (just guessing) if you manually
load chan_sip after pbx_config has run, maybe?

*CLI> show version
Asterisk SVN-trunk-r8127M built by luigi @ prova.iet.unipi.it on a i386 running FreeBSD on 2006-01-17 19:29:47 UTC
*CLI> show dialplan local-users
[ Context 'local-users' (1) created by 'pbx_config' ]
 '_99' =>          2. Noop(test)                                 [pbx_config]

[ Context 'local-users' (1) created by 'SIP' ]
 '551' =>          1. Noop(551)                                  [SIP]
 '552' =>          1. Noop(552)                                  [SIP]

-= 3 extensions (3 priorities) in 2 contexts. =-

By: Luigi Rizzo (rizzo) 2006-01-17 17:02:39.000-0600

I see two possible fixes:
- the easy way is to allow only one registrar per context, and
 report an error (in add_extension()) when one tries to register  
 an extension in a context with a registrar different from the
 existing one.

 Then the merge function could simply replace (entirely) the
 contexts with the same name from the same registrar, or even all
 contexts from the same registrar (this is how the code works now,
 except that it doesn't check for the multiple-registrar case).

- the alternative, more expensive, to allow entries from different
 registrars in the same context, is to have the merge function
 call the equivalent of ast_add_extension2() (but using the
 already allocated entry) on each element <ctx,ext,pri> of the
 list to merge, replacing existing entries.
 Then the second parameter to ast_merge_contexts_and_delete()
 becomes useless.
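
The first ("easy") option could be sketched roughly as follows. The struct ctx layout, the fixed-size extension array, and this add_extension() signature are hypothetical and much simpler than the real code; the point is only the registrar check that rejects a foreign registrar with a warning.

```c
#include <stdio.h>
#include <string.h>

#define MAX_EXTENS 8

/* Hypothetical context owned by exactly one registrar. */
struct ctx {
    const char *name;
    const char *registrar;
    const char *extens[MAX_EXTENS];
    int count;
};

/* Returns 0 on success, -1 if 'registrar' does not own the context
 * (or the context is full). */
int add_extension(struct ctx *c, const char *exten, const char *registrar)
{
    if (strcmp(c->registrar, registrar) != 0) {
        fprintf(stderr,
                "WARNING: context '%s' belongs to '%s'; refusing '%s' from '%s'\n",
                c->name, c->registrar, exten, registrar);
        return -1;
    }
    if (c->count >= MAX_EXTENS)
        return -1;
    c->extens[c->count++] = exten;
    return 0;
}
```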

By: Matt O'Gorman (mogorman) 2006-01-17 18:09:11.000-0600

Hey rizzo, in commits 8162 and 8163 I changed the default load order of the modules so that pbx_config and pbx_ael get loaded before channel structures, as that is the way it should be anyway. That should make your issue disappear.

By: Luigi Rizzo (rizzo) 2006-01-18 00:29:06.000-0600

just a note to remember that we need to revisit the issue.

The change of load order only fixes the problem temporarily,
because if you add an extension to the dialplan for a context
that already "belongs" to another registrar, and issue an
"extensions reload", you will see the problem again - a new
context is created with the same name.

I really believe that a proper fix involves one of the two
approaches that I mentioned.

By: Leif Madsen (lmadsen) 2006-05-02 22:53:04

/housekeeping

Since rizzo even noted that this needs to be revisited, I'm bringing it up for discussion.

By: jmls (jmls) 2006-10-31 03:41:15.000-0600

/housekeeping

rizzo, regarding 0039786: any more thoughts on the fix required ?

By: jmls (jmls) 2006-11-19 13:29:26.000-0600

hey - another 20 days have passed. PING PING PING
:)

By: Serge Vecher (serge-v) 2007-02-28 13:46:52.000-0600

perhaps Mr. Dialplan Wizard can rescue this bug

By: Steve Murphy (murf) 2007-03-01 12:26:29.000-0600

OK, been looking at the code. I have to add one more requirement to merge_contexts_and_delete: that it hold the locks for less than a frame time. Since the freeing and destruction of list elements is the most time-consuming part of the operation, (or, at least, WAS), I propose this algorithm:

4 lists are involved:
   1. the list of contexts to merge into the dialplan (extcontexts)
   2. the existing contexts (contexts)
   3. a list containing extens to free
   4. a list containing just contexts to free

The algorithm would go something like this:
   1. get the conlock & hintlock
   2. preserve the watchers as before
   3. traverse the dp, and unlink all exten/prio that match registrar. Do Not
      remove any contexts (yet). Unlink them from the contexts list, and
      link them instead to the list of extens to free.
   4. traverse the dp again, and for any empty contexts, that match registrar,
      unlink from contexts, and link to the contexts to free list. this and #3
      might be tied into a single traversal.
   5. Now, for each context in extcontexts, search for a match in contexts.
        (THIS MAKES ME NERVOUS IF THE DP IS BIG!)
      if found:
        either the context or something in it has a different registrar.
        go thru the contexts entry, and relink the exten/prios into the
        matching extcontext's entry. If there are exten
        collisions, (THIS SEARCH MAKES ME NERVOUS IF THE DP IS BIG!)
        then take all the contexts prios for that exten, and insert them into
        the collided extcontexts exten. Issue a warning only if there are
        collisions. Keep the extcontexts version. After all the prios are
        merged, then put the contexts exten into the free list.

        Now, Move the now empty contexts' context into the free context list.
        Then move the merged context into the  contexts list from the
        extcontext list.
      if not found:
        link the context from extcontexts to contexts. This is the "quick and
        easy" path.

   6. Restore the watchers, as is now being done.
   
   7. Unlock the above locks.

   8. Destroy the stuff in the to-be-freed lists

   9. Return.

This will keep the regcontexts. If the dp is big, or the extcontexts big, this
operation will run dangerously slow! Having O(1) search times for all 3 types of search (context, exten, prio) would be a big plus.

Will this be sufficient? The key is getting part 5 right.
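
The core idea of steps 3, 7 and 8 (unlink under the lock, destroy after unlocking) can be sketched in C as below. This is an illustration under simplifying assumptions, not the proposed patch: a single flat list of struct exten stands in for the dialplan, one mutex stands in for conlock and hintlock, and purge_registrar() and push_exten() are hypothetical names.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct exten {
    char *name;
    const char *registrar;
    struct exten *next;
};

pthread_mutex_t conlock = PTHREAD_MUTEX_INITIALIZER;

/* Unlink every entry owned by 'registrar' while holding the lock, but
 * defer the slower freeing until the lock is released.
 * Returns the number of entries destroyed. */
int purge_registrar(struct exten **head, const char *registrar)
{
    struct exten *to_free = NULL, **pp, *e;
    int freed = 0;

    pthread_mutex_lock(&conlock);
    pp = head;
    while ((e = *pp)) {
        if (strcmp(e->registrar, registrar) == 0) {
            *pp = e->next;      /* unlink from the live list */
            e->next = to_free;  /* link onto the to-be-freed list */
            to_free = e;
        } else {
            pp = &e->next;
        }
    }
    pthread_mutex_unlock(&conlock);

    while ((e = to_free)) {     /* destruction happens outside the lock */
        to_free = e->next;
        free(e->name);
        free(e);
        freed++;
    }
    return freed;
}

/* Helper for building a list: push a new entry at the head.
 * On allocation failure the list is returned unchanged. */
struct exten *push_exten(struct exten *head, const char *name,
                         const char *registrar)
{
    struct exten *e = malloc(sizeof(*e));

    if (!e)
        return head;
    e->name = malloc(strlen(name) + 1);
    if (!e->name) {
        free(e);
        return head;
    }
    strcpy(e->name, name);
    e->registrar = registrar;
    e->next = head;
    return e;
}
```

The lock is held only for pointer relinking; the cost of free() no longer counts against the time the dialplan is locked.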

By: Brandon Kruse (bkruse) 2008-01-27 23:38:19.000-0600

Hey Guys,

Throwing out some housekeeping. It has been almost a year.

What is the status on this issue.

Thanks!

-bk

By: Steve Murphy (murf) 2008-03-05 11:39:40.000-0600

OK, I've published both intention and then completion of fixes to this bug on the asterisk-dev, both of which letters got absolutely no response (deer in the headlights?)

They are:

http://lists.digium.com/pipermail/asterisk-dev/2008-February/032065.html

and:

http://lists.digium.com/pipermail/asterisk-dev/2008-March/032124.html

I have the fixes in team/murf/bug6002

Please review and test! I will commit these fixes to trunk soon if there are no objections.

By: Digium Subversion (svnbot) 2008-03-07 12:54:02.000-0600

Repository: asterisk
Revision: 106757

U   trunk/apps/app_dial.c
U   trunk/apps/app_meetme.c
U   trunk/apps/app_queue.c
U   trunk/channels/chan_iax2.c
U   trunk/channels/chan_sip.c
U   trunk/channels/chan_skinny.c
U   trunk/include/asterisk/pbx.h
U   trunk/include/asterisk/pval.h
U   trunk/main/features.c
U   trunk/main/pbx.c
U   trunk/pbx/pbx_ael.c
U   trunk/pbx/pbx_config.c
U   trunk/res/ael/ael.flex
U   trunk/res/ael/ael.tab.c
U   trunk/res/ael/ael.tab.h
U   trunk/res/ael/ael.y
U   trunk/res/ael/ael_lex.c
U   trunk/res/ael/pval.c
U   trunk/utils/Makefile
U   trunk/utils/ael_main.c
U   trunk/utils/conf2ael.c
U   trunk/utils/extconf.c

------------------------------------------------------------------------
r106757 | murf | 2008-03-07 12:53:59 -0600 (Fri, 07 Mar 2008) | 126 lines

(closes issue ASTERISK-5843)
Reported by: rizzo
Tested by: murf

Proposal of the changes to be made, and then an announcement of how they were accomplished:

http://lists.digium.com/pipermail/asterisk-dev/2008-February/032065.html

and:

http://lists.digium.com/pipermail/asterisk-dev/2008-March/032124.html

Here is a recap, file by file, of what I have done:

pbx/pbx_config.c
pbx/pbx_ael.c

All funcs that were passed a ptr to the context list, now will ALSO be passed a hashtab ptr to the same set.
Why? because (for the time being), the dialplan is stored in both, to facilitate a quick, low-cost move to
hash-tables to speed up dialplan processing. If it was deemed necessary to pass the context LIST, well, it
is just as necessary to have the TABLE available. This is because the list/table in question might not be
the global one, but temporary ones we would use to stage the dialplan on, and then swap into the global
position when things are ready.

We now have one external function for apps to use, "ast_context_find_or_create()" instead of the pre-existing
"find" and "create", as all existing usages used both in tandem anyway.

pbx_config, and pbx_ael, will stage the reloaded dialplan into local lists and tables, and
then call merge_contexts_and_delete, which will merge (now) existing contexts and
priorities from other registrars into this local set by copying them. Then, merge_contexts_and_delete will
lock down the contexts, swap the lists and tables, and unlock (real quick), and then
destroy the old dialplan.
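
As a rough illustration of that stage, merge, swap and destroy sequence, here is a small C sketch. It is not the committed code: the flat struct entry list, the single mutex, and the names entry_new(), merge_and_swap() and has_entry() are hypothetical, and it assumes only one reloader runs at a time, so the copy pass is done without a read lock.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct entry {
    char context[32];
    char exten[32];
    char registrar[16];
    struct entry *next;
};

struct entry *dialplan;   /* the live, global list */
pthread_mutex_t conlock = PTHREAD_MUTEX_INITIALIZER;

/* Allocate a copy of the given fields and push it in front of 'next'.
 * On allocation failure 'next' is returned unchanged. */
struct entry *entry_new(const char *context, const char *exten,
                        const char *registrar, struct entry *next)
{
    struct entry *e = calloc(1, sizeof(*e));

    if (!e)
        return next;
    snprintf(e->context, sizeof(e->context), "%s", context);
    snprintf(e->exten, sizeof(e->exten), "%s", exten);
    snprintf(e->registrar, sizeof(e->registrar), "%s", registrar);
    e->next = next;
    return e;
}

void merge_and_swap(struct entry *staged, const char *registrar)
{
    struct entry *e, *old;

    /* 1. Copy entries owned by other registrars into the staged set. */
    for (e = dialplan; e; e = e->next) {
        if (strcmp(e->registrar, registrar) != 0)
            staged = entry_new(e->context, e->exten, e->registrar, staged);
    }

    /* 2. Swap the staged set into the global position (short lock). */
    pthread_mutex_lock(&conlock);
    old = dialplan;
    dialplan = staged;
    pthread_mutex_unlock(&conlock);

    /* 3. Destroy the old dialplan outside the lock. */
    while ((e = old)) {
        old = e->next;
        free(e);
    }
}

/* Returns 1 if the live dialplan has 'exten' in 'context'. */
int has_entry(const char *context, const char *exten)
{
    const struct entry *e;

    for (e = dialplan; e; e = e->next) {
        if (!strcmp(e->context, context) && !strcmp(e->exten, exten))
            return 1;
    }
    return 0;
}
```

After a reload by "pbx_config", entries registered by "SIP" survive because they were copied into the staged set, while stale pbx_config entries disappear with the old list.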



chan_sip.c
chan_iax2.c
chan_skinny.c

All the channel drivers that would add regcontexts now use ast_context_find_or_create().

chan_sip also includes a small fix to get rid of warnings about removing priorities that never got entered.


apps/app_meetme.c
apps/app_dial.c
apps/app_queue.c

All the apps that added a context/exten/priority were also modified to use ast_context_find_or_create instead.


include/asterisk/pbx.h

ast_context_create() is removed. ast_context_find_or_create() is the new method.
ast_context_find_or_create()  interface gets the hashtab added.
ast_merge_contexts_and_delete() gets the local hashtab arg added.
ast_wrlock_contexts_version() is added so you can detect if someone else got a writelock between your readlocking and writelocking.
ast_hashtab_compare_contexts was made public for use in pbx_config/pbx_ael
ast_hashtab_hash_contexts was likewise made public.


include/asterisk/pval.h

ast_compile_ael2() interface changed to include the local hashtab table ptr.


main/features.c

For the sake of the parking context, we use ast_context_find_or_create().



main/pbx.c

I changed all the "tree" names to "table" instead. That's because the original
implementation was based on binary trees (I had a free library). Then I moved
to hashtabs. Now, the names move forward too.

refcount field added to contexts, so you can keep track of how many modules
wanted this context to exist.
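
A minimal sketch of how find-or-create and a reference count can work together is shown below. It assumes that finding an existing context increments its count and that a context is destroyed only when the last interested module releases it; struct ctx, ctx_find_or_create() and ctx_release() are hypothetical names, and a plain linked list is used instead of a hashtab.

```c
#include <stdlib.h>
#include <string.h>

struct ctx {
    char name[80];
    int refcount;          /* how many modules wanted this context */
    struct ctx *next;
};

/* Return the context called 'name', creating it if needed.
 * Each call counts as one more reference. Returns NULL on failure. */
struct ctx *ctx_find_or_create(struct ctx **list, const char *name)
{
    struct ctx *c;

    for (c = *list; c; c = c->next) {
        if (strcmp(c->name, name) == 0) {
            c->refcount++;
            return c;
        }
    }
    if (strlen(name) >= sizeof(c->name))
        return NULL;
    c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    strcpy(c->name, name);
    c->refcount = 1;
    c->next = *list;
    *list = c;
    return c;
}

/* Drop one reference; unlink and free the context when none remain.
 * Returns the remaining count, or -1 if the context was not found. */
int ctx_release(struct ctx **list, const char *name)
{
    struct ctx **pp, *c;

    for (pp = list; (c = *pp); pp = &c->next) {
        if (strcmp(c->name, name) == 0) {
            if (--c->refcount > 0)
                return c->refcount;
            *pp = c->next;
            free(c);
            return 0;
        }
    }
    return -1;
}
```

With this shape, a regcontext created by a channel driver and the same context defined in the dialplan resolve to a single object instead of two contexts with the same name.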

Some log messages that are warnings were inflated from LOG_NOTICE to LOG_WARNING.

Added some calls to ast_verb(3,...) for debug messages

Lots of little mods to ast_context_remove_extension2, which is now exercised in ways
it was not previously; one definite bug fixed.

find_or_create was upgraded to handle both local lists/tables as well as the globals.

context_merge() was added to do the per-context merging of the old/present contexts/extens/prios into the new/proposed local list/tables

ast_merge_contexts_and_delete() was heavily modified.

ast_add_extension2() was also upgraded to handle changes.

the context_destroy() code was re-engineered to handle the new way of doing things,
by exten/prio instead of by context.



res/ael/pval.c
res/ael/ael.tab.c
res/ael/ael.tab.h
res/ael/ael.y
res/ael/ael_lex.c
res/ael/ael.flex
utils/ael_main.c
utils/extconf.c
utils/conf2ael.c
utils/Makefile

Had to change the interface to ast_compile_ael2(), to include the hashtab ptr.
This ended up involving several external apps.  The main gotcha was I had to
include lock.h and hashtab.h in several places.


As a side note, I tested this stuff pretty thoroughly, I replicated the problems
originally reported by Luigi, and made triply sure that reloads worked, and everything
worked thru "stop gracefully". I found and fixed a few bugs as I was merging into
trunk, that did not appear in my tests of bug6002.

How's this for verbose commit messages?



------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=106757

By: Digium Subversion (svnbot) 2008-03-12 17:47:41

Repository: asterisk
Revision: 108351

_U  branches/1.6.0/

------------------------------------------------------------------------
r108351 | russell | 2008-03-12 17:47:37 -0500 (Wed, 12 Mar 2008) | 133 lines

Blocked revisions 106757 via svnmerge

........
r106757 | murf | 2008-03-07 12:57:57 -0600 (Fri, 07 Mar 2008) | 126 lines


........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=108351