[Home]

Summary:ASTERISK-15361: [regression] app_sms not working in 1.6.1.12 (same as 0012779)
Reporter:Steve Hanselman (shanselman)Labels:
Date Opened:2009-12-23 05:40:52.000-0600Date Closed:2011-06-07 14:08:03
Priority:MinorRegression?Yes
Status:Closed/CompleteComponents:Applications/app_sms
Versions:Frequency of
Occurrence
Related
Issues:
Environment:Attachments:
Description:app_sms has reverted to the same issue as in entry 12779:-

chan_dahdi.c:     -- Requested transfer capability: 0x00 - SPEECH
app_dial.c:     -- Called g1/17094009
app_dial.c:     -- DAHDI/2-1 is proceeding passing it to Local/17094009@default-549e;2
app_dial.c:     -- DAHDI/2-1 is ringing
app_dial.c:     -- DAHDI/2-1 answered Local/17094009@default-549e;2
pbx.c:        > Channel Local/17094009@default-549e;1 was answered.
pbx.c:        > Launching SMS(1268466100) on Local/17094009@default-549e;1
app_sms.c:  sms argc 1 queue <1268466100> opts <> addr <> body <>
app_sms.c:  initial delay 300ms
pbx.c:   == Spawn extension (default, 17094009, 1) exited non-zero on 'Local/17094009@default-549e;2'
app_sms.c:     -- SMS RX 93 00 6D
app_sms.c:     -- SMS TX 17 00
chan_dahdi.c:     -- Channel 0/2, span 1 got hangup request, cause 16
app_sms.c: ast_read failed
chan_dahdi.c:     -- Hungup 'DAHDI/2-1'

The messages never get sent (this is an upgrade from a 1.4 version of asterisk which had no issues with the sms application).


****** ADDITIONAL INFORMATION ******

My configuration is a 30 channel EuroISDN PRI.

Asterisk 1.6.1.12
Dahdi 2.2.0.2 and tools 2.2.0

Happy to run whatever debug is required,
Comments:By: Steve Hanselman (shanselman) 2009-12-30 14:46:11.000-0600

Anybody else seeing this issue?
If so what is your environment (version, hardware etc)

By: Leif Madsen (lmadsen) 2010-01-04 10:34:28.000-0600

As stated in the other issue:

"Due to the limited number of people who are able to test and report back on this issue, I'm going to close this issue as at least one reporter has mentioned the SMS delivery is now working for him in Asterisk 1.6.1.0.

If you are still having issues with SMS delivery / receipt, then please open a new issue so we can track it there along with any information that you have, and please have the ability to reproduce the issue, and allow access to the box by a developer. We currently do not have the ability to reproduce these SMS issues locally.

Thanks to everyone who helped!"

There is a patch on the other issue -- was it useful? If not, perhaps you could provide the access for a developer to look at this? (Don't post it here, as it'll be public :))

I'm going to set this as Acknowledged, and hopefully you can provide the scenario to reproduce this issue and provide any information necessary to track down the issue. Thanks!

By: Steve Hanselman (shanselman) 2010-01-04 10:43:00.000-0600

Providing access will be difficult as it's on a live system, but I'm happy to carry out whatever is required and provide files.

I'll check out the other patch, I'd assumed it was now in SVN but I didn't check, will now.

By: Steve Hanselman (shanselman) 2010-01-04 10:47:39.000-0600

Just checked the patch, one is merely a documentation change and the other replaces a pipe with a comma for current syntax and that is in 1.6.1.12

By: Leif Madsen (lmadsen) 2010-01-04 14:24:42.000-0600

Oh ok, sorry, I hadn't looked at the patch, and didn't notice any commit messages in the bug, so presumed it was a patch that had not been tested.

By: Steve Hanselman (shanselman) 2010-01-05 09:00:37.000-0600

I'm guessing that the pipe to comma change was just collected as part of the overall dialplan changes.

What's the best way to resolve this, I'm happy to run whatever tests are required and return the results?

By: Leif Madsen (lmadsen) 2010-01-05 09:02:27.000-0600

Until a developer is available to review this, I'm not sure what to ask for to be honest.

By: Tilghman Lesher (tilghman) 2010-02-01 13:20:22.000-0600

What telco provider are you using?  Have you tried setting the initial timeout in the options?  There is this comment in the source:

       /* Change the initial message delay. BT requires 300ms,
        * but for others this might be way too much and the phone
        * could time out. XXX make it configurable.
       */

and given that we're seeing a hangup right at this point, this seems germane.

By: Alec Davis (alecdavis) 2010-02-01 16:39:07.000-0600

shanselman:
I agree it doesn't work from a while ago, but have now forgotten details of how I had set it up.
To save time researching setup, are you able to post sample dialplan and methods you use to trigger the call.
This will allow me and others to lab it up, and hopefully progress this.

By: Steve Hanselman (shanselman) 2010-02-02 03:05:15.000-0600

I'll do both of those things:-
1) I'll check the working version vs the current version to see whether that timeout has changed (BTW we are with Telewest for our PRI service)
2) I'll post the extract from the dialplan and the command to send an SMS

By: Alec Davis (alecdavis) 2010-02-02 03:33:24.000-0600

Tilghman has made me aware of the following line near the top of app_sms.c for 'ALAW' sites.

/* #define OUTALAW */ /* enable this to output Alaw rather than linear */

remove the comments around the #define and recompile.

It may help, and that maybe all there was to it.

By: Steve Hanselman (shanselman) 2010-02-02 04:14:32.000-0600

Right, checked the code and in the previous working version the timeout was also 300 so no change there (main difference is that the previous was hardcoded at 300 - or rather 2400 and the new one will accept it as a parameter and multiply by 8 for opause).

To send the sms simply use

smsq <mobile number> some message text

I'll extract the relevant part from the dialplan later when I have access to our server.

By: Steve Hanselman (shanselman) 2010-02-02 04:27:06.000-0600

I'll check on the SVN when that changed re alaw vs linear and see if that could be the issue.

By: Steve Hanselman (shanselman) 2010-02-02 04:30:46.000-0600

Just checked back to the version from 3 years ago and this was commented out then, so I doubt that will be it?

WHat I'll do out of hours this evening is to build a number of versions with debug with exact timing in the messages so I can see exactly where it's dying and see whether that provides a clue, as it's no effort I'll also try Alaw and see if that is any better.

By: Tilghman Lesher (tilghman) 2010-02-02 13:15:06.000-0600

I would additionally advise you to try varying that initial pause.  It may be that 300ms was on the outside of what was previously tolerated by your telco and a slight timing change has caused this to fail.  So try 150 or 100ms and see if you get better results by starting to communicate sooner.

By: Steve Hanselman (shanselman) 2010-02-02 13:45:30.000-0600

'_1709400X' =>    1. Dial(${TELEWEST}/${EXTEN})                 [pbx_config]
                 2. congestion()                               [pbx_config]

So not much basically.

By: Steve Hanselman (shanselman) 2010-02-02 14:05:18.000-0600

Have tried 100 and 150 with no difference, also 400 and 500 likewise.

Have tried compiling with alaw and the code doesn't compile, haven't investigated why.

Looking at the trace it does see two way traffic so the number is dialled and picked up, the response is fairly standard so not just noise.

The duration from last received data to hangup is 3 seconds (just looked at the 10 attempts I've made today and it's the same on all of them).

At a guess (and a pure guess), the remote end doesn't understand our FSK transmission?

By: Tilghman Lesher (tilghman) 2010-02-03 12:09:18.000-0600

What's puzzling to me is that in your initial report, it appears that nothing was actually received or written until after the hangup was received, which is why I speculated that the initial pause could be too long.  I suppose what might need to happen is to review what the telco sees on the other end (or perhaps to run a loopback, which runs SMS with option s on the other end).

By: Filip Beijer (fbeijer) 2010-02-17 01:42:36.000-0600

I'm using smsq and app_sms in the Netherlands.
Asterisk version 1.6.1.4
Using it on 2 servers one with an ISDN-15 and the other with an ISDN-30 both from KPN, if I can help please let me know.

By: Tilghman Lesher (tilghman) 2010-02-17 10:07:36.000-0600

fbeijer: could you share the relevant parts of your dialplan/usage?  I wonder if this might just be confusion over syntax.

By: Filip Beijer (fbeijer) 2010-02-17 13:47:45.000-0600

main part of the dialplan
;
; 067364444400 is the number for KPN fixed line sms
;
exten => s,1,NoOp(${CALLERID(num)})
exten => s,n,GotoIf($[ "${CALLERID(num)}" = "067364444400"]?dosms:normal)
exten => s,n(dosms),Goto(smsmtrx,s,1)
exten => s,n(normal),Goto(did,${EXTEN},1)
exten => s,n,Hangup

[smsmtrx]
exten => s,1,Answer()
exten => s,n,Wait(1)
exten => s,n,SMS(sms-in,a)
exten => s,n,System(/usr/bin/smsq --process /usr/bin/sms-in)
exten => s,n,Hangup

/usr/bin/sms-in is a script getting the bits and pieces of the incoming message out of the environment and processes it

Outgoing sms is just done with smsq <number> <message_text>

Important to get the numbers right in de source of smsq

By: Joost Ruijsch (ploink) 2010-02-23 13:01:46.000-0600

I seem to have the same problem, using asterisk-1.6.0.5-2.fc9.i386.rpm on Fedora 9.
I am trying to send an SMS from the commandline to a SMS capable Siemens Gigaset C470IP Voip phone:

# sudo -u asterisk smsq --mt --tx  --mttx-retries=0 --mttx-channel=SIP/29 --mttx-callerid=7670 --oa=20   29 "Test SMS"

I have configured 767 as the service center on the phone. When the call comes in, it shows "Receiving SMS" on the display for a few seconds, but nothing is actually received and it shows no error.

Sniffing the communication using tcpdump/wireshark shows no abnormalities in the SIP call handling, but the RTP audio is not as it should be.
The first 2 seconds of the audio is silent in both directions, then the gigaset sends a signal of 100ms duration. Within about 30ms, asterisk replies with a similar 100ms signal. Then about 235ms later, the gigaset hangs up.

The signals seem to have some FSK modulation, but with no more than a couple of bytes data, to little to contain the message.

Note that both are not on the same local network and roundtrip time is about 40ms.

It seems that asterisk fails to send anything and then the gigaset times out..

By: Joost Ruijsch (ploink) 2010-02-23 13:16:16.000-0600

Forgot to mention, I got the same "ast_read failed" message as reported in the bug description.
These to lines:
app_sms.c: -- SMS RX 93 00 6D
app_sms.c: -- SMS TX 17 00
..suggest the same problem as I got, just a couple bytes received/sent, then hangup. However I did not get these lines in my log, perhaps I can increase verbosity somehow?

By: Joost Ruijsch (ploink) 2010-02-24 05:31:08.000-0600

Ok, I think I identified the problem.
I increased log verbosity and downloaded the ETSI specification document
http://pda.etsi.org/pda/home.asp?wki_id=3Xm.CablLH019813K,Dms

Sadly, protocol 1 and 2 are not compatible and message numbers have different meanings for both protocols. (ETSI blundered here imho)
Tho make the matter even more complicated, if the MSB is set in protocol 1, it means the message is NOT fragmented, but in protocol 2 this means the message IS fragmented.

Shanselman got this communication:
SMS RX 93 00 6D
SMS TX 17 00

The received byte 93h is message type 13h with the msb set. The second byte (00) indicates zero data length. Since a message with zero data length cannot be fragmented (the spec is very clear about this) it must be protocol 1 message 13h=DLL_SMS_EST "Indicates that the Data Link Layer connection has been established."

The third byte (6D), however, is bogus. It is a data byte in a message with zero data length and has no meaning for the specified message type.

Then asterisk transmits a 17h messages. According to the MSB logic, this must be a protocol 2 message (it does not even exist in protocol 1).
Protocol 2 type 17h: "DLL_SMS_REL Command: Release request" ("The DLL_SMS_REL is used to initiate the disconnection procedure.")
Asterisk may be confused about the protocol mixup

My case is a little different. I got:
SMS RX 93 00 6D
SMS TX 94 00

The RX is the same as with shanselman. This happens with messages received from two very different systems (Siemens Gigaset and 17094009 message center) so this must be a bug in the asterisk RX decoding logic.

The RX/TX are both protocol 1 messages, so that is ok.
In protocol 1, 14h means DLL_SMS_REL, so asterisk disconnects again without sending SMS.

My best guess is that asterisk is confused over the extra bogus byte in the RX.

By: Joost Ruijsch (ploink) 2010-02-24 08:43:32.000-0600

Oops, I did not notice that patch was not applied to my asterisk 1.6.0.5...
https://issues.asterisk.org/file_download.php?file_id=19169&type=bug

I just recompiled and SMS works!
Every RX still has a bogus byte appended, but this is ignored and not a problem.
It should also work for shanselman if he uses the right protocol and his has the patch.

By: Tilghman Lesher (tilghman) 2010-02-25 15:32:16.000-0600

Looks like this should be fixed for current SVN.  If you're using the latest SVN and still having the issue, please feel free to reopen, or ask a bug marshal on IRC, if you are unable to do so.

By: Steve Hanselman (shanselman) 2010-04-28 15:47:01

This is still an issue on 1.6.1.18.

It can't be the protocol version as having checked documentation BT only support version 1.

The call file has da and ud defined, however, the debug shows that app_sms is  called with no parameters bar the queue number, is that correct?

[Apr 28 21:43:58] VERBOSE[12159] app_dial.c:     -- DAHDI/1-1 is ringing
[Apr 28 21:43:58] VERBOSE[12159] app_dial.c:     -- DAHDI/1-1 answered Local/17094009@default-5b4f;2
[Apr 28 21:43:58] VERBOSE[12158] app_sms.c:  sms argc 1 queue <0> opts <> addr <> body <>
[Apr 28 21:43:58] VERBOSE[12158] app_sms.c:  initial delay 300ms
[Apr 28 21:43:58] VERBOSE[12159] pbx.c:   == Spawn extension (default, 17094009, 1) exited non-zero on 'Local/17094009@default-5b4f;2'
[Apr 28 21:43:59] VERBOSE[12158] app_sms.c:     -- SMS RX 93 00 6D
[Apr 28 21:43:59] VERBOSE[12158] app_sms.c:     -- SMS TX 17 00
[


By: Steve Hanselman (shanselman) 2010-05-05 08:07:10

Have checked that all the patches detailed have been applied (they already were).

By: Paul Belanger (pabelanger) 2010-05-20 20:07:31

@shanselman: Can you retest with the 1.6.2 branch
---
Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions maintenance (bug) support for the 1.6.0 and 1.6.1 branches has ended. For continued maintenance support please move to the 1.6.2 branch.

More information on this change can be found in the release announcement: http://www.asterisk.org/node/49924

By: Steve Hanselman (shanselman) 2010-05-25 11:07:07

I'll test on 1.6.2 this evening.

By: Steve Hanselman (shanselman) 2010-05-25 11:07:58

I lie, it'll be Thursday evening.

By: Leif Madsen (lmadsen) 2010-06-17 13:34:58

Ping -- test results?

By: Paul Belanger (pabelanger) 2010-06-25 08:54:06

Suspended due to lack of activity. Please request a bug marshal in #asterisk-bugs on the IRC network irc.freenode.net to reopen the issue should you have the additional information requested.

Further information can be found at http://www.asterisk.org/developers/bug-guidelines

By: Steve Hanselman (shanselman) 2010-06-25 09:17:33

Still haven't had an opening to test with the latest release, will probably be another 2 weeks before that can be done.

By: Paul Belanger (pabelanger) 2010-06-25 09:25:41

When time permits, post your results.

By: Steve Hanselman (shanselman) 2010-06-26 03:43:09

Ok, I've got 1.6.2.9, with latest dahdi and libpri available until 5pm UK local today.

Just re-run the test with exactly the same results as before.

By: Steve Hanselman (shanselman) 2010-06-26 10:15:23

Just tried an inbound SMS from my mobile to asterisk and that works fine.

By: Paul Belanger (pabelanger) 2010-06-26 12:28:15

Can you capture a debug log (see below)?  Be sure to enable debugs for DAHDI.
--
We require a complete debug log to help triage the issue.

This document will provide instructions on how to collect debugging logs from an Asterisk machine for the purpose of helping bug marshals troubleshoot an issue:

http://svn.digium.com/svn/asterisk/trunk/doc/HOWTO_collect_debug_information.txt

By: Steve Hanselman (shanselman) 2010-06-29 15:19:08

Looks to be a number of ways of obtaining dahdi debug, what do you specifically want, an echo to /sys/modules/dahdi/parameters/debug or something else?

By: Joost Ruijsch (ploink) 2010-07-01 05:40:31

@shanselman "It can't be the protocol version as having checked documentation BT only support version 1."

I have showed in my earlier post that asterisk receives a protocol 1 message and sends a protocol 2 message.
If BT speaks protocol 1 and your asterisk speaks protocol 2, why is that not a protocol mixup?

The proof is in your log, just read the ETSI spec and decode these messages:
[Apr 28 21:43:59] VERBOSE[12158] app_sms.c: -- SMS RX 93 00 6D
[Apr 28 21:43:59] VERBOSE[12158] app_sms.c: -- SMS TX 17 00

http://pda.etsi.org/exchangefolder/es_201912v010201p.pdf
See page 15 and 23 and read BOTH the parts titled "Message Type (1 octet):" about the MSB or "extension bit".

Try adding --pid=1 to smsq.

Why your version of asterisk defaults to protocol 2, I do not know. Perhaps it guesses the protocol from the "country=" setting in "indications.conf" or perhaps some developer decided to change the default.

By: Steve Hanselman (shanselman) 2010-07-01 06:05:57

Tried the --pid=1 but with no joy (and no difference to the message either), I'll check the specs, I recall when I last checked the source that Asterisk was defaulted to version 1, not 2, but that may have been changed.

I'll also recheck that the patches are in, but we're now on 1.6.2.9.

By: Joost Ruijsch (ploink) 2010-07-01 06:35:41

From app_sms.c, message 0x17 is listed for protocol 2, "end the connection":

enum message_types {
       DLL_SMS_MASK        = 0x7f,             /* mask for the valid bits */

       /* Protocol 1 values */
       DLL1_SMS_DATA       = 0x11,             /* data packet */
       DLL1_SMS_ERROR      = 0x12,
       DLL1_SMS_EST        = 0x13,             /* start the connection */
       DLL1_SMS_REL        = 0x14,             /* end the connection */
       DLL1_SMS_ACK        = 0x15,
       DLL1_SMS_NACK       = 0x16,

       DLL1_SMS_COMPLETE   = 0x80,             /* packet is complete */
       DLL1_SMS_MORE       = 0x00,             /* more data to follow */

       /* Protocol 2 values */
       DLL2_SMS_EST        = 0x7f,             /* magic number. No message body */
       DLL2_SMS_INFO_MO    = 0x10,
       DLL2_SMS_INFO_MT    = 0x11,
       DLL2_SMS_INFO_STA   = 0x12,
       DLL2_SMS_NACK       = 0x13,
       DLL2_SMS_ACK0       = 0x14,             /* ack even-numbered frame */
       DLL2_SMS_ACK1       = 0x15,             /* ack odd-numbered frame */
       DLL2_SMS_ENQ        = 0x16,
       DLL2_SMS_REL        = 0x17,             /* end the connection */

       DLL2_SMS_COMPLETE   = 0x00,             /* packet is complete */
       DLL2_SMS_MORE       = 0x80,             /* more data to follow */
};

By: Steve Hanselman (shanselman) 2010-07-01 06:44:32

Just checked the call file and that has pid=1, looking at the source to see why it's not being honoured.

By: Joost Ruijsch (ploink) 2010-07-01 06:58:22

Perhaps you use SMS() with the 't' option in your dialplan? Otherwise, I have no idea...
" t  - use protocol 2 (default used is protocol 1).\n"

By: Steve Hanselman (shanselman) 2010-07-01 07:07:40

Not using that, but I just added a debug print of h.protocol and it is 2, I'll add more debug and find out why as this (as you correctly pointed out) is the root cause.

By: Steve Hanselman (shanselman) 2010-07-01 07:21:57

Both before and after the parse of the options in app_sms.c, OPTION_TWO is set in flags.

By: Steve Hanselman (shanselman) 2010-07-01 07:30:42

No initialiser on the definition of flags.....
All working now.

Fix is flags = { 0 }; on the definition.

ploink, thanks for the pointer, I'd just assumed that the code was honouring the flags passed.

By: Joost Ruijsch (ploink) 2010-07-01 07:41:50

Uninitialized data... that explained why it works fine for the lucky ones (or should that be zero's ;).
I assume this will be fixed in next version.

By: Steve Hanselman (shanselman) 2010-07-01 07:47:27

Quick grep shows a few other potentials for this as well.

By: Leif Madsen (lmadsen) 2010-07-06 10:42:23

Does this mean that minor change mentioned in note 0124138 fixes the issue? Can someone post a patch with the changes required?

By: Steve Hanselman (shanselman) 2010-07-11 13:31:55

Yes, it fixes it, just downloading a pure copy and then I'll diff it and attach it.

By: Steve Hanselman (shanselman) 2010-07-11 13:40:39

also, this is a potential problem elsewhere, grep and you'll see a few other uninitialised definitions, not sure how best to report this?

By: Leif Madsen (lmadsen) 2010-07-12 10:56:50

shanselman: you could create another issue and attach a patch that resolves the other issues (ideally) and then we can just relate it here.

By: Paul Belanger (pabelanger) 2010-09-04 14:11:00

Ping! What is the status of this issue?

By: Leif Madsen (lmadsen) 2010-09-07 15:43:32

Issue closed due to lack of feedback.

By: David Woodhouse (dwmw2) 2011-08-24 08:27:47.932-0500

The fix for the uninitialised flags field, along with two other fixes, is in a patch in https://issues.asterisk.org/jira/browse/ASTERISK-18331

Please test and confirm there that it works for you.