[Home]

Summary:ASTERISK-20945: "Unable to connect to remote asterisk" message on service asterisk start, even though service is running
Reporter:Warren Selby (wcselby)Labels:
Date Opened:2013-01-16 13:39:33.000-0600Date Closed:2013-01-21 14:30:20.000-0600
Priority:MajorRegression?
Status:Closed/CompleteComponents:Core/General
Versions:1.8.20.0 Frequency of
Occurrence
Related
Issues:
is caused byASTERISK-19463 Asterisk deadlocks during startup with mutex errors
is the original version of this clone:ASTERISK-20962 CLONE - "Unable to connect to remote asterisk" message on service asterisk start, even though service is running
is related toASTERISK-20947 astcanary exits immediately because of wrong pid argument
Environment:CentOS 5.8 Kernel - 2.6.18-308.24.1.el5PAE Attachments:( 0) Asterisk-20945-1.8.20.diff
( 1) ASTERISK-20945-1.8-mjordan.diff
( 2) asterisk-20945-remote-intro-msg.diff
Description:Installed asterisk 1.8.19.1, worked fine. Installed 1.8.20.0, and started seeing the following issues, with the same config files for both:

When I start the asterisk service using "service asterisk start" from the command line, this is the output:

{noformat}
[root@pbx ~]# service asterisk start
Unable to connect to remote asterisk (does /var/run/asterisk/asterisk.ctl exist?)
Starting asterisk:
{noformat}

However, the /var/run/asterisk/asterisk.ctl file is being created and the process is starting:

{noformat}
[root@pbx ~]# ls -lh /var/run/asterisk/
total 4.0K
srwxr-xr-x 1 root root 0 Jan 16 12:07 asterisk.ctl
-rw-r--r-- 1 root root 6 Jan 16 12:07 asterisk.pid
{noformat}

However, I'm no longer getting the usual splash message when I connect to the asterisk console...this is what I get:

{noformat}
[root@pbx ~]# asterisk -r
Verbosity is at least 3
pbx*CLI>
{noformat}

I don't have any peers setup yet, or even any dialplan configured to test, but when I go through the logs, I don't find any errors or warnings that I'm not expecting.

I've gone back to the asterisk 1.8.19.1 install and everything works as expected (no error messages, full splash about license / version on connection to console, etc).  I performed make clean in my 1.8.20 source directory, then ./configure, make menuselect, make, make install, and even make config, and I'm still seeing this message pop up when restarting / starting the service.

I went through the CHANGELOG.TXT for 1.8.20.0 and it appears there are some items talking about changing the way the process starts up (commit r376428), but I'm not enough of a coder to understand if those would cause what I'm seeing.  
Comments:By: ewieling (ewieling) 2013-01-16 14:36:17.801-0600

I am also experiencing this issue.

By: Michael L. Young (elguero) 2013-01-16 14:57:43.186-0600

This patch [^asterisk-20945-remote-intro-msg.diff] should get the intro message to display again.

In regards to the other message, it is harmless.  But, I agree that it is annoying.  That message is being displayed due to the fact that the init script is calling "asterisk -rx" in order to determine if asterisk is running or not.

That message started to be displayed because that error message was not being printed directly to stderr before the changes.  So, it has always been there, just never displayed.

Let me see if I can think of something for that init script.

By: Matt Jordan (mjordan) 2013-01-16 15:22:41.643-0600

So, first, sorry for whatever behavior changes this introduced. I did quite a lot of testing trying to cover all my bases, and it's clear some were missed.  Ugh.

The startup sequence in Asterisk was changed to try to avoid some rather pernicious lock corruption issues that occur when a process is forked after/during a mutex access. A lot of that entailed doing two things:
# Not using our logging facilities until after the process is forked into the background (to avoid having to initialize or grab the logger facility's locks)
# Pushing as many subsystem start up sequences after the forking as possible.

This explains what Michael just pointed out - some error messages that would normally 'not appear' - since the logger subsystem couldn't actually log any of them - might start showing up since they now get dumped to stderr.

So, all that being said, I didn't see this when running on Fedora 15:

{noformat}
[root@mjordan-laptop 1.8.20.0]# service asterisk start
Starting asterisk (via systemctl):                         [  OK  ]
[root@mjordan-laptop 1.8.20.0]# asterisk -r
Asterisk 1.8.20.0, Copyright (C) 1999 - 2012 Digium, Inc. and others.
Created by Mark Spencer <markster@digium.com>
Asterisk comes with ABSOLUTELY NO WARRANTY; type 'core show warranty' for details.
This is free software, with components licensed under the GNU General Public
License version 2 and other licenses; you are welcome to redistribute it under
certain conditions. Type 'core show license' for details.
=========================================================================
Connected to Asterisk 1.8.20.0 currently running on mjordan-laptop (pid = 16744)
Verbosity is at least 5
Core debug is at least 5
mjordan-laptop*CLI>
{noformat}

That's odd, since the startup script should be the same.

By: Michael L. Young (elguero) 2013-01-16 15:32:32.772-0600

Matt, it might be that the message was not displayed since Fedora 15 is using systemd and on CentOS, it still uses SysV?

By: Matt Jordan (mjordan) 2013-01-16 15:33:39.568-0600

Hm... possibly. I'll go jump on a CentOS box and see.

By: Michael L. Young (elguero) 2013-01-16 15:38:54.121-0600

I was doing some quick searching and it looks like CentOS 6 is using Upstart while CentOS 5 is using sysvinit.  Just an FYI while you are looking into it.

By: Matt Jordan (mjordan) 2013-01-16 15:58:28.791-0600

Interesting - on a CentOS 6 box that is the behavior I get:

{noformat}
[digium@asterisk-testsuite-01 1.8.20.0]$ sudo service asterisk start

Unable to connect to remote asterisk (does /var/run/asterisk/asterisk.ctl exist?)
Starting asterisk: Unable to connect to remote asterisk (does /var/run/asterisk/asterisk.ctl exist?)
                                                          [  OK  ]
[digium@asterisk-testsuite-01 1.8.20.0]$ ps -A | grep ast
1919 ?        00:01:22 master
31430 pts/0    00:00:00 safe_asterisk
31435 pts/0    00:00:00 asterisk
[digium@asterisk-testsuite-01 1.8.20.0]$ sudo asterisk -r
Verbosity is at least 3
asterisk-testsuite-01*CLI>
Disconnected from Asterisk server
[digium@asterisk-testsuite-01 1.8.20.0]$
{noformat}

By: Matt Jordan (mjordan) 2013-01-16 16:04:20.763-0600

Your patch definitely fixes the more egregious of the annoyances, namely, no splash screen on a remote console.

For the other issue, I'm thinking the 'best' fix would probably be to modify the init.d script to pipe stderr to /dev/null. What do you think?

By: Michael L. Young (elguero) 2013-01-16 16:09:11.253-0600

That works... I was thinking of putting in a check for the presence of the pid file but that should work too.

By: Danny Nicholas (sethsdad) 2013-01-17 09:04:19.855-0600

This issue also exists in 11.2.0.  The patch works for the splash screen, but still does not solve the /etc/init.d/asterisk start error.  I worked around this on my installation by replacing start() with this
{noformat}
start() {
       ###adjust file/directory permissions on each startup

#       echo "Initiating Start of Asterisk: "

#       echo -n "Adjusting file/directory permissions:                         "
#       . /etc/rc.d/init.d/asterisk-permissions && echo "OK" || echo "FAILED"

#       echo -n "Setting ulimits:                                              "
       ulimit -n 32768 -c unlimited && echo  "" || echo "FAILED"

       # Start daemons.
       echo -n $"Starting asterisk: "
       ASTARGS="-g "
       if [ -f $SAFE_ASTERISK ] ; then
               DAEMON=$SAFE_ASTERISK
               ASTARGS="-g "
       fi
#       if [ $AST_USER ] ; then
#               ASTARGS="-U $AST_USER"
#       fi
#       if [ $AST_GROUP ] ; then
#               ASTARGS="$ASTARGS -G $AST_GROUP"
#       fi
       if [ $AST_CONFIG ]; then
               ASTARGS="$ASTARGS -C $AST_CONFIG"
       fi
### here we go!
       $DAEMON $ASTARGS
       RETVAL=$?
       [ $RETVAL -eq 0 ] && touch /var/lock/subsys/asterisk
       echo

#       echo -n "decrease the verbosity level to zero:                         "
       $DAEMON -rx 'set verbose 0' > /dev/null &2>1 && echo  "" || echo "FAILED"
       return $RETVAL
}
{noformat}


By: Matt Jordan (mjordan) 2013-01-17 09:21:57.978-0600

All that should have to be done is to pipe stderr to /dev/null when testing to see if Asterisk is already started. The error message indicating that the remote console couldn't connect to Asterisk should then be safely piped away.

I've attached a patch that updates all of the various init.d scripts as well as safe_asterisk. I've tested the redhat one; as it is, they're all fairly identical except for the debian one which also uses /bin/true in its evaluation. If someone could test the debian script as well I'd appreciate it.

By: Jamuel Starkey (jamuel) 2013-01-17 12:44:34.395-0600

Don't have access to Debian but can confirm that it works with CentOS 5.8 (Linux 2.6.18-308.24.1.el5xen #1 SMP i686 i686 i386 GNU/Linux).  Thanks for the quick turn around!

By: kaldemar (kaldemar) 2013-01-18 07:42:36.430-0600

Patched Debian script works as expected on 11.2.0.

By: Danny Nicholas (sethsdad) 2013-01-18 15:36:32.135-0600

no dice on suse 11.2.0
   VERSION=`/usr/sbin/asterisk -rx 'core show version' 2>/dev/null`
       if [ "`echo $VERSION | cut -c 1-8`" = "Asterisk" ]; then
               echo "Asterisk is already running.  $0 will exit now."
               exit 1
       fi
# /usr/local/src/asterisk-11.2.0/contrib/init.d/rc.suse.asterisk stop
Shutting down asterisk: Asterisk ended with exit status 0
Asterisk shutdown normally.
# /usr/local/src/asterisk-11.2.0/contrib/init.d/rc.suse.asterisk start
Starting asterisk: Unable to connect to remote asterisk (does /var/run/asterisk/asterisk.ctl exist?)



By: Matt Jordan (mjordan) 2013-01-21 09:51:42.142-0600

Danny: did you update the safe_asterisk script as well? Both will cause the message dumped to stderr to appear.

By: Danny Nicholas (sethsdad) 2013-01-21 11:26:55.031-0600

The safe_asterisk file doesn't change between 11.0.0 and 11.2.0.  I went back to my 11.0.0 source tree and did make install to get 11.0.0 back up.  That starts and stops fine.  Then shut it down and went to 11.2.0 source tree, did make install and back to same problem with the exception that the contributed safe_asterisk now dies with this result
Asterisk died with code 127.
cat: __ASTERISK_VARRUN_DIR__/asterisk.pid: No such file or directory
Automatically restarting Asterisk.
Asterisk ended with exit status 127
Asterisk died with code 127.
cat: __ASTERISK_VARRUN_DIR__/asterisk.pid: No such file or directory
Automatically restarting Asterisk.
Asterisk ended with exit status 127
Asterisk died with code 127.
cat: __ASTERISK_VARRUN_DIR__/asterisk.pid: No such file or directory
Automatically restarting Asterisk.
Asterisk ended with exit status 127
Asterisk died with code 127.
cat: __ASTERISK_VARRUN_DIR__/asterisk.pid: No such file or directory
Automatically restarting Asterisk.
Asterisk ended with exit status 127
Asterisk died with code 127.

here is the "diff" between the two files.
diff -s /usr/sbin/safe_asterisk /usr/local/src/asterisk-11.2.0/contrib/scripts/safe_asterisk
4c4,8
< CLIARGS="$*"                          # Grab any args passed to safe_asterisk
---
> ASTSBINDIR=__ASTERISK_SBIN_DIR__
> ASTVARRUNDIR=__ASTERISK_VARRUN_DIR__
> ASTVARLOGDIR=__ASTERISK_LOG_DIR__
>
> CLIARGS="$*"                                  # Grab any args passed to safe_asterisk
7,12c11,18
< #NOTIFY=ben@alkaloid.net      # Who to notify about crashes
< #EXEC=/path/to/somescript     # Run this command if Asterisk crashes
< #LOGFILE=/path/to/logfile     # Where to place the normal logfile (disabled if blank)
< #SYSLOG=local0                                # Which syslog facility to use (disabled if blank)
< MACHINE=`hostname`                    # To specify which machine has crashed when getting the mail
< DUMPDROP=/tmp
---
> #NOTIFY=root@localhost.localdomain            # Who to notify about crashes
> NOTIFY=${NOTIFY:-}                            # Who to notify about crashes
> #EXEC=/path/to/somescript                     # Run this command if Asterisk crashes
> #LOGFILE=${ASTVARLOGDIR}/safe_asterisk.log    # Where to place the normal logfile (disabled if blank)
> SYSLOG=${SYSLOG:-}                            # Which syslog facility to use (disabled if blank)
> MACHINE=`hostname`                            # To specify which machine has crashed when getting the mail
> DUMPDROP=${DUMPDROP:-/tmp}
> RUNDIR=${RUNDIR:-/tmp}
14,15d19
< ASTSBINDIR=/usr/sbin
< ASTVARRUNDIR=/var/run/asterisk
48c52
< VERSION=`${ASTSBINDIR}/asterisk -rx 'core show version'`
---
> VERSION=`${ASTSBINDIR}/asterisk -nrx 'core show version'`
121a126,135
>
> if test ! -d "${RUNDIR}" ; then
>       message "${RUNDIR} does not exist, creating"
>       mkdir -p "${RUNDIR}"
>       if test ! -d "${RUNDIR}" ; then
>               message "Unable to create ${RUNDIR}"
>               exit 1
>       fi
> fi
>
149c163
<                       cd /tmp
---
>                       cd "${RUNDIR}"
153c167
<                       cd /tmp
---
>                       cd "${RUNDIR}"
176,179c190,193
<                       if test -f /tmp/core.${PID} ; then
<                               mv /tmp/core.${PID} ${DUMPDROP}/core.`hostname`-$DATE &
<                       elif test -f /tmp/core ; then
<                               mv /tmp/core ${DUMPDROP}/core.`hostname`-$DATE &
---
>                       if test -f ${RUNDIR}/core.${PID} ; then
>                               mv ${RUNDIR}/core.${PID} ${DUMPDROP}/core.`hostname`-$DATE &
>                       elif test -f ${RUNDIR}/core ; then
>                               mv ${RUNDIR}/core ${DUMPDROP}/core.`hostname`-$DATE &
186,189c200,203
<                       if test -f /tmp/core.${PID} ; then
<                               mv /tmp/core.${PID} ${DUMPDROP}/core.`hostname`-$DATE &
<                       elif test -f /tmp/core ; then
<                               mv /tmp/core ${DUMPDROP}/core.`hostname`-$DATE &
---
>                       if test -f ${RUNDIR}/core.${PID} ; then
>                               mv ${RUNDIR}/core.${PID} ${DUMPDROP}/core.`hostname`-$DATE &
>                       elif test -f ${RUNDIR}/core ; then
>                               mv ${RUNDIR}/core ${DUMPDROP}/core.`hostname`-$DATE &


By: Matt Jordan (mjordan) 2013-01-21 13:57:01.732-0600

To be frank, I'm not sure what's happening on your system at this point. No, {{safe_asterisk}} was not modified for version 11.2.0. If I recall correctly, it wasn't modified for version 11.0.0. Regardless, that's all beside the point. None of the initialization scripts were modified for 11.2.0 - what affected them was a change in the startup sequence and error message handling in {{asterisk.c}}.

Both {{safe_asterisk}} and the various {{init.d}} scripts run a remote console command execution to determine if Asterisk is running, specifically, they check what the version of Asterisk is. In 11.2.0, an error message that previously would never be displayed is now written to stderr - this is the error message you're seeing.

My question was whether or not you applied the entirety of the patch on this issue (ASTERISK-20945-1.8-mjordan.diff). Based on your comment, it *looked* like you only patched the {{init.d}} script, which would mean you would still see the error message printed out when {{safe_asterisk}} attempted to determine if Asterisk was still running.

The fact that the {{_ASTERISK_VARRUN_DIR_}} variable is not populated correctly the the run directory of Asterisk is due (I suspect) to not re-installing Asterisk properly. Regardless, the {{safe_asterisk}} script - and all of the {{init.d}} scripts - use that variable.

As it is, I've gone ahead and set up an OpenSuse 12.2 VM, installed Asterisk 11.2.0, updated the {{init.d}} scripts and {{safe_asterisk}} with the patch on this issue, set up Asterisk to run as a service using {{make config}}, and did not see the error message on starting/stopping Asterisk.

By: Danny Nicholas (sethsdad) 2013-01-21 14:27:59.985-0600

Apparently my problem was that the script is now expecting certain environment variables to be set in advance.  Once I applied these changes (top new, bottom original), life is good.
4,9c4,6
< #ASTSBINDIR=__ASTERISK_SBIN_DIR__
< #ASTVARRUNDIR=__ASTERISK_VARRUN_DIR__
< #ASTVARLOGDIR=__ASTERISK_LOG_DIR__
< ASTSBINDIR=/usr/sbin
< ASTVARRUNDIR=/var/run/asterisk
< ASTVARLOGDIR=/var/log/asterisk
---
> ASTSBINDIR=__ASTERISK_SBIN_DIR__
> ASTVARRUNDIR=__ASTERISK_VARRUN_DIR__
> ASTVARLOGDIR=__ASTERISK_LOG_DIR__
21c18
< RUNDIR=/tmp #${RUNDIR:-/tmp}
---
> RUNDIR=${RUNDIR:-/tmp}
55c52
< VERSION=`${ASTSBINDIR}/asterisk -nrx 'core show version' 2>/dev/null`
---
> VERSION=`${ASTSBINDIR}/asterisk -nrx 'core show version'`



By: Paul Selly (jung) 2013-01-22 11:27:28.033-0600

Applied both patches: the splash screen issue and startup was fixed but astcanary still crashes. This happens on Fedora 13.

This patch (Asterisk-20945-1.8.20.diff) resolves the issue on this platform, it incorporates "asterisk-20945-remote-intro-msg.diff".



By: Matt Jordan (mjordan) 2013-01-22 11:46:36.025-0600

That should already have been fixed by the patch for ASTERISK-20947.