[Home]

Summary:ASTERISK-16900: [patch] VoicemailMain Exits Without Warning
Reporter:Leo Brown (netfuse)Labels:
Date Opened:2010-11-02 06:46:30Date Closed:2010-11-12 14:53:11.000-0600
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Applications/app_voicemail
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:( 0) app_voicemail-issue18240-1.6.2.diff.txt
( 1) bug18240.1-6-2.diff.txt
( 2) bug18240-1.6.2-diff2.txt
( 3) bug18240-trunk-2.diff.txt
Description:Hi

I've seen problems with VoicemailMain crashing in the past, from not being able to lock paths, open audio files etc.

However, I have a VoiceMailMain which is opening a mailbox, but existing after verifying a correct password.

The console trace is below, but because Asterisk is not crashing, I can not provide a backtrace. Could you please give me some guidance on how to further debug, as debug level 10 (shown below) only shows it locking/unlocking paths and then exiting.

I have checked all permissions and even removed mailboxes and let Asterisk create the paths itself. Voicemail() works fine, but VoicemailMain does not. The relevant entry from voicemail.conf is also

Leo

****** ADDITIONAL INFORMATION ******

Console output:

[Nov  2 07:22:43] DEBUG[19478]: app_voicemail.c:9355 vm_execmain: After vm_authenticate
[Nov  2 07:22:43] DEBUG[19478]: app_voicemail.c:9398 vm_execmain: Before open_mailbox
[Nov  2 07:22:43] DEBUG[19478]: app.c:1249 ast_lock_path_lockfile: Locked path '/var/spool/asterisk/voicemail/default/10350/Old'
[Nov  2 07:22:43] DEBUG[19478]: app.c:1266 ast_unlock_path_lockfile: Unlocked path '/var/spool/asterisk/voicemail/default/10350/Old'
[Nov  2 07:22:43] DEBUG[19478]: app.c:1249 ast_lock_path_lockfile: Locked path '/var/spool/asterisk/voicemail/default/10350/Old'
[Nov  2 07:22:43] DEBUG[19478]: app.c:1266 ast_unlock_path_lockfile: Unlocked path '/var/spool/asterisk/voicemail/default/10350/Old'
[Nov  2 07:22:43] DEBUG[19478]: pbx.c:4333 __ast_pbx_run: Spawn extension (from-internal,*98,11) exited non-zero on 'Local/*98@from-internal-c32e;2'


voicemaill.conf:

[default]
10350 => 1350,VM - GBG,test@example.com,,attach=yes|saycid=no|envelope=no|delete=yes


CLI output:

pbx*CLI> voicemail show users
Context    Mbox  User                      Zone       NewMsg
default    10350 VM - GBG                                  0
Comments:By: Leo Brown (netfuse) 2010-11-02 09:23:47

Hi

I have found that it's this code that is causing the return:

       /* Retrieve urgent, old and new message counts */
       ast_debug(1, "Before open_mailbox\n");
       res = open_mailbox(&vms, vmu, OLD_FOLDER); /* Count all messages, even Urgent */
       if (res < 0){
               goto out;
       }

I.e. res < 0. That's returned from open_mailbox, here:

       /* for local storage, checks directory for messages up to maxmsg limit */
       last_msg = last_message_index(vmu, vms->curdir);
       ast_unlock_path(vms->curdir);

       if (last_msg < 0) {
               return last_msg;
       }

Which of course, is from last_message_index, which is returning -1 because there are no files in the Old/ directory - which seems to be the correct operation.

So either I'm reading the source wrong, or this is intentionally dying because there are no old messages due to some sort of strange bug uploaded into 1.6.2 branch? :|

By: Leif Madsen (lmadsen) 2010-11-02 09:37:10

You can produce a backtrace still by attaching to the existing process (per Google: http://gcc.gnu.org/onlinedocs/gnat_ugn_unw/Attaching-to-a-Running-Process.html)


You should also provide 'core show locks' as well, which will require you to have Compiler Flags -> DEBUG_THREADS enabled.


~~~~~~~~~~~~~~~~~~~~~~~

Thank you for your bug report. In order to move your issue forward, we require a backtrace from the core file produced after the crash. Please see the doc/backtrace.txt file in your Asterisk source directory.

Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then:

make install

after enabling, reproduce the crash, and then execute the instructions in doc/backtrace.txt.

When complete, attach that file to this issue report. Thanks!


~~~~~~~~~~~~~~~~~~~~~~~~~~
Debugging deadlocks:

Please select DEBUG_THREADS and DONT_OPTIMIZE in the Compiler Flags section of menuselect. Recompile and install Asterisk (i.e. make install)

This will then give you the console command:

core show locks

When the symptoms of the deadlock present themselves again, please provide output of the deadlock via:

# asterisk -rx "core show locks" | tee /tmp/core-show-locks.txt

# gdb -se "asterisk" <pid of asterisk> | tee /tmp/backtrace.txt

gdb> bt
gdb> bt full
gdb> thread apply all bt

Then attach the core-show-locks.txt and backtrace.txt files to this issue. Thanks!

By: Leo Brown (netfuse) 2010-11-02 09:41:43

Hi

Thanks for the info - I have reverted branches/1.6.2 to revision 272642 (where I have a stable system) and this functions fine.

Please see my logic breakdown above - I determined this by placing debug info in the relevant areas (there was not much debug code in there). It seems that the issue is that the app_voicemail is detecting -1 as a problem with locking the path, rather than there being no messages to read. The (working) code in revision 272642 shows instead:

       /* Retrieve urgent, old and new message counts */
       ast_debug(1, "Before open_mailbox\n");
       res = open_mailbox(&vms, vmu, OLD_FOLDER); /* Count all messages, even Urgent */
       if (res == ERROR_LOCK_PATH)
               goto out;

Which does not falsely exit.

Leo

By: Michael L. Young (elguero) 2010-11-02 15:32:30

This looks like a regression to me.  It appears there was a change in logic resulting in the exiting from app_voicemail when a folder has no messages.

1.4    r293004
1.6.2  r293118
1.8    r293119
trunk  r293120

I am not sure if the attached patch is the right solution but it seemed like the most logical fix, unless there is some other part of the code that I am not seeing at this time.  last_msg will always be 0 or greater than, or -1 when there is no index (no messages).

By: Freddi Hansen (freddi_fonet) 2010-11-07 18:16:34.000-0600

Hi,
I had same problem, used your patch on 1.6.2. rel 293969, it works fine now.

thanks.

By: Alec Davis (alecdavis) 2010-11-09 15:56:29.000-0600

When using FILE based storage, VoicemailMain asks for boxnumber and password, then abrutly terminates call. Confirm with trunk and 1.6.2svn

uploaded bug18240.1-6-2.diff.txt

findings so far, not necessarily correct.

As I understand it, last_message_index returns.
> -1 for last message (0 if only 1 message exists)
 -1 for 0 messages
< -1 for errors

Also in the patch, is a possible missed unlock, IMAP code path.

By: Alec Davis (alecdavis) 2010-11-10 15:29:41.000-0600

With a simple debug print in the last_msg < 0 test it's failing on checking the urgent mailbox.

Nov 11 10:06:16] NOTICE[26106]: app_voicemail.c:7420 open_mailbox: ALEC Mailbox: /var/spool/asterisk/voicemail/default/8612/Urgent, last_msg -1



By: Alec Davis (alecdavis) 2010-11-10 16:44:54.000-0600

in 1.6.2 branch r43829, this is where the logic started going wrong.

@@ -7408,7 +7409,7 @@

       if (vm_lock_path(vms->curdir)) {
               ast_log(AST_LOG_ERROR, "Could not open mailbox %s:  mailbox is locked\n", vms->curdir);
-               return -1;
+               return ERROR_LOCK_PATH;
       }

Then from r293118 everywhere else;
       res = open_mailbox(&vms, vmu, 11); /*11 is the Urgent folder */
-       if (res < 0)
+       if (res == ERROR_LOCK_PATH)



By: Alec Davis (alecdavis) 2010-11-10 18:42:45.000-0600

as elguero in ~128751 points out, regression since trunk r293120

uploaded 1.6.2 patch bug18240-1.6.2-diff2.txt
in open_mailbox, if vm_lock_path fails return ERROR_LOCK_PATH instead of -1
reverts the ERROR_LOCK_PATH changes
unrelated missed unlock

By: Jeff Peeler (jpeeler) 2010-11-11 18:17:04.000-0600

Sorry about the mistake! The second patch, bug18240.1-6-2.diff.txt, will be the one committed. But just in case I missed something has anybody found any problems with that one?

By: Michael L. Young (elguero) 2010-11-11 18:51:22.000-0600

Ah... I see what I missed.. I failed to see that count_messages could return ERROR_LOCK_PATH inside open_mailbox.

I would say that bug18240.1-6-2.diff.txt would be the correct patch.

I am going to test this out right now and let you know the results.

By: Michael L. Young (elguero) 2010-11-11 19:16:35.000-0600

I applied alecdavis' changes to trunk (bug18240-trunk-2.diff.txt) and it was working properly.

Thanks alecdavis

By: Alec Davis (alecdavis) 2010-11-11 20:06:21.000-0600

jpeeler:
I hope you meant this one 1.6.2 patch bug18240-1.6.2-diff2.txt
sorry for the very similar names.

The only negative value at the moment that open_mailbox returns is -1 for 'no messages', so checking for < -1, which I initially suggested was wrong.

open_mailbox is supposed to return ERROR_LOCK_PATH (-100) if the vm_lock_path fails, and -1 for no messages.

By: Alec Davis (alecdavis) 2010-11-11 20:13:47.000-0600

elguero:
Looks like you concur, bug18240-trunk-2.diff.txt is from bug18240-1.6.2-diff2.txt, not as ~128807 suggests bug18240.1-6-2.diff.txt

jpeeler:
maybe actually both are right - confuse everyone.

if < -1 then an error (currently only -100).
if -1 then 0 messages
if >= 0 then 1 or more messages

By: Michael L. Young (elguero) 2010-11-11 21:14:03.000-0600

alecdavis:
You are correct, I copied the filename in jpeeler's note but I actually applied the second one as you already noted.

What part of the code are you talking about in regards to the if < -1?  The different code paths depending on the backend does seem to get confusing.

Inside open_mailbox function I understand the following:

count_messages will return:
   -100 if there is a locking error (filesystem path)
   0 or greater (filesystem path)

   0 or greater (odbc path)

last_message_index will only return:
   0 or greater than for index (1 or more messages)
   -1 to indicate no index (no message)
   filesystem path - appears that -1 returned when there is an error with opendir() but it will just be treated the same as not having an index instead of determining locking error because that function should only be called once a lock has already been set.  So, I am not sure if that should be changed or not since the locking should have already been handled.

At least this is how I am understanding it.  So, I think your second patch was the correct fix if what we are checking for is locking errors for the filesystem storage path and exiting.

By: Alec Davis (alecdavis) 2010-11-11 22:40:03.000-0600

sorry I was too vague, lets forget about the < -1.

Where I was coming from:

The existing code is
  res = open_mailbox(....
  if (res < 0)

Where the reverted code should be
  res = open_mailbox(....
  if (res == ERROR_LOCK_PATH)

But could be, to allow for future errors.
  res = open_mailbox(....
  if (res < -1 )

By: Jeff Peeler (jpeeler) 2010-11-12 10:49:44.000-0600

Yes, the old code with ERROR_LOCK_PATH did not account for things that it should have. And I added more code that is critical, so just going back to that is not the proper fix. I really did mean the patch I listed. Examining the different behaviors of the backends is indeed confusing and I don't even want to type it out. Basically there is confusion between -1 meaning error and the mail index returning -1 for 0 messages. However, given that -1 is a widely used convention of meaning error (and is used here in voicemail) the checking of < -1 should be limited as the second patch has done.

By: Digium Subversion (svnbot) 2010-11-12 14:49:11.000-0600

Repository: asterisk
Revision: 294903

U   branches/1.4/apps/app_voicemail.c

------------------------------------------------------------------------
r294903 | jpeeler | 2010-11-12 14:49:10 -0600 (Fri, 12 Nov 2010) | 16 lines

Fix regression causing abort in voicemail after opening a mailbox with no mesgs.

In order to be more safe, some error handling code was changed to respect more
error conditions including the potential memory allocation failure for deleted
and heard message tracking introduced in 293004. However, last_message_index
returns -1 for zero messages (perhaps as expected) and was triggering the
stricter error checking. Because last_message_index is only called directly
in one place, just return 0 from open_mailbox (for file based storage) when no
messages are detected unless a real error has occurred.

(closes issue ASTERISK-16900)
Reported by: leobrown
Patches:
     bug18240.1-6-2.diff.txt uploaded by alecdavis (license 585)
Tested by: pabelanger

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294903

By: Digium Subversion (svnbot) 2010-11-12 14:51:16.000-0600

Repository: asterisk
Revision: 294904

_U  branches/1.6.2/
U   branches/1.6.2/apps/app_voicemail.c

------------------------------------------------------------------------
r294904 | jpeeler | 2010-11-12 14:51:16 -0600 (Fri, 12 Nov 2010) | 23 lines

Merged revisions 294903 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.4

........
 r294903 | jpeeler | 2010-11-12 14:49:09 -0600 (Fri, 12 Nov 2010) | 16 lines
 
 Fix regression causing abort in voicemail after opening a mailbox with no mesgs.
 
 In order to be more safe, some error handling code was changed to respect more
 error conditions including the potential memory allocation failure for deleted
 and heard message tracking introduced in 293004. However, last_message_index
 returns -1 for zero messages (perhaps as expected) and was triggering the
 stricter error checking. Because last_message_index is only called directly
 in one place, just return 0 from open_mailbox (for file based storage) when no
 messages are detected unless a real error has occurred.
 
 (closes issue ASTERISK-16900)
 Reported by: leobrown
 Patches:
       bug18240.1-6-2.diff.txt uploaded by alecdavis (license 585)
 Tested by: pabelanger
........

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294904

By: Digium Subversion (svnbot) 2010-11-12 14:52:07.000-0600

Repository: asterisk
Revision: 294905

_U  branches/1.8/
U   branches/1.8/apps/app_voicemail.c

------------------------------------------------------------------------
r294905 | jpeeler | 2010-11-12 14:52:06 -0600 (Fri, 12 Nov 2010) | 30 lines

Merged revisions 294904 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.6.2

................
 r294904 | jpeeler | 2010-11-12 14:51:15 -0600 (Fri, 12 Nov 2010) | 23 lines
 
 Merged revisions 294903 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.4
 
 ........
   r294903 | jpeeler | 2010-11-12 14:49:09 -0600 (Fri, 12 Nov 2010) | 16 lines
   
   Fix regression causing abort in voicemail after opening a mailbox with no mesgs.
   
   In order to be more safe, some error handling code was changed to respect more
   error conditions including the potential memory allocation failure for deleted
   and heard message tracking introduced in 293004. However, last_message_index
   returns -1 for zero messages (perhaps as expected) and was triggering the
   stricter error checking. Because last_message_index is only called directly
   in one place, just return 0 from open_mailbox (for file based storage) when no
   messages are detected unless a real error has occurred.
   
   (closes issue ASTERISK-16900)
   Reported by: leobrown
   Patches:
         bug18240.1-6-2.diff.txt uploaded by alecdavis (license 585)
   Tested by: pabelanger
 ........
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294905

By: Digium Subversion (svnbot) 2010-11-12 14:53:09.000-0600

Repository: asterisk
Revision: 294907

_U  trunk/
U   trunk/apps/app_voicemail.c

------------------------------------------------------------------------
r294907 | jpeeler | 2010-11-12 14:53:09 -0600 (Fri, 12 Nov 2010) | 37 lines

Merged revisions 294905 via svnmerge from
https://origsvn.digium.com/svn/asterisk/branches/1.8

................
 r294905 | jpeeler | 2010-11-12 14:52:06 -0600 (Fri, 12 Nov 2010) | 30 lines
 
 Merged revisions 294904 via svnmerge from
 https://origsvn.digium.com/svn/asterisk/branches/1.6.2
 
 ................
   r294904 | jpeeler | 2010-11-12 14:51:15 -0600 (Fri, 12 Nov 2010) | 23 lines
   
   Merged revisions 294903 via svnmerge from
   https://origsvn.digium.com/svn/asterisk/branches/1.4
   
   ........
     r294903 | jpeeler | 2010-11-12 14:49:09 -0600 (Fri, 12 Nov 2010) | 16 lines
     
     Fix regression causing abort in voicemail after opening a mailbox with no mesgs.
     
     In order to be more safe, some error handling code was changed to respect more
     error conditions including the potential memory allocation failure for deleted
     and heard message tracking introduced in 293004. However, last_message_index
     returns -1 for zero messages (perhaps as expected) and was triggering the
     stricter error checking. Because last_message_index is only called directly
     in one place, just return 0 from open_mailbox (for file based storage) when no
     messages are detected unless a real error has occurred.
     
     (closes issue ASTERISK-16900)
     Reported by: leobrown
     Patches:
           bug18240.1-6-2.diff.txt uploaded by alecdavis (license 585)
     Tested by: pabelanger
   ........
 ................
................

------------------------------------------------------------------------

http://svn.digium.com/view/asterisk?view=rev&revision=294907