Summary: ASTERISK-18322: ooh323, alternate gatekeeper
Reporter: Dmitry Melekhov (slesru)
Labels:
Date Opened: 2011-08-22 22:59:40
Date Closed:
Priority: Minor
Regression?
Status: In Progress/In Progress
Components: Addons/chan_ooh323
Versions: 10
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) ASTERISK-18322-2.patch
( 1) change_gk_on_reload-1.patch
( 2) change_gk_on_reload-1-ast10.patch
( 3) change_gk_on_reload-2.patch
( 4) get_ast_gk.sh
( 5) gk_ha.pl
( 6) set_ast_gk.sh
Description:
Hello!

It would be good for me if ooh323 had alternate gatekeeper support, so that if one gatekeeper is down it can register with another one, as many H.323 devices do.

Thank you!
Comments:
By: Alexander Anikin (may213) 2011-08-25 03:15:26.585-0500

Ok, I'll implement this feature, but I have one question:
must the endpoint register with both gatekeepers simultaneously, or register with the second only when the primary fails?

By: Dmitry Melekhov (slesru) 2011-09-09 01:16:43.983-0500

Hello!

Sorry for the late reply, I was on vacation.
In my configuration the secondary (backup) gatekeeper starts only if the primary gatekeeper fails.
When the primary gatekeeper becomes available again, the secondary gatekeeper is shut down.
So in my configuration Asterisk has to register with the secondary gatekeeper if the primary fails, and if the secondary fails it has to try to re-register with the primary, and so on.
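
The failover order described above (stay on the current gatekeeper while it answers, otherwise move to the other one and wrap around) could be sketched in C roughly as follows. This is only an illustration: next_gatekeeper and its parameters are hypothetical names, not part of chan_ooh323, and how "alive" is determined is left open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of chan_ooh323.
 * Returns the index of the gatekeeper to use next:
 *  - the current one while it is still answering,
 *  - otherwise the next one in the list, wrapping back to index 0. */
size_t next_gatekeeper(size_t current, size_t count, int current_alive)
{
    assert(count > 0 && current < count);

    if (current_alive)
        return current;

    return (current + 1) % count;
}
```

With two gatekeepers, a failure of index 0 (primary) selects index 1 (secondary), and a failure of index 1 selects index 0 again.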

Thank you!


By: Dmitry Melekhov (slesru) 2012-02-16 03:37:25.321-0600

Hello!

Is there any progress on this?

Alternatively, I could at least change the gatekeeper manually (in practice, with an external script).
But if I change it in the config file and then run ooh323 reload, ooh323 does not try to change the gatekeeper.
The only way is module unload chan_ooh323 followed by module load chan_ooh323.
This is acceptable, but...
Could you please change the ooh323 reload behaviour? Or do I have to open another issue?

Thank you!



By: Alexander Anikin (may213) 2012-02-18 08:10:57.256-0600

Dmitry,

Ok, I will implement changing the GK IP as the first step,
then full support for multiple GKs as the next step.


By: Alexander Anikin (may213) 2012-02-18 18:28:06.863-0600

Dmitry,
The attached patches (for trunk and Asterisk 10) allow changing the gatekeeper mode or IP on 'ooh323 reload'.
Please test them.


By: Dmitry Melekhov (slesru) 2012-02-19 21:47:05.528-0600

Hello!

The patch works (tested with 10.2rc), thank you very much!


By: Dmitry Melekhov (slesru) 2012-02-19 23:31:21.396-0600

script1

By: Dmitry Melekhov (slesru) 2012-02-19 23:31:33.680-0600

script2

By: Dmitry Melekhov (slesru) 2012-02-19 23:31:45.788-0600

main script

By: Dmitry Melekhov (slesru) 2012-02-19 23:32:44.053-0600

Hello!

I uploaded the quick and dirty scripts which I'm going to use to switch the gatekeeper.
I hope they'll help to explain what I want :-)

Thank you!


By: Dmitry Melekhov (slesru) 2012-03-07 04:07:17.770-0600

Hello!

I just found that ooh323 does not re-register with the gatekeeper after the gatekeeper has been restarted; even ooh323 reload with this patch doesn't help (maybe because there were active calls).
I have no debug log, sorry, and I need time to reproduce this on a production system :-)

Thank you!


By: Dmitry Melekhov (slesru) 2012-03-13 11:25:04.070-0500

Hello!

I still can't reproduce it :-) It looks like I need a data link failure again ;-)
Is it possible to include this patch in the next Asterisk release?
Thank you!

By: Alexander Anikin (may213) 2012-03-13 17:14:13.270-0500

Dmitry,

I will do more testing on this tomorrow.
The patch is already in trunk, rev 356848 (the commit message is wrong:
it refers to another issue, 19298).


By: Dmitry Melekhov (slesru) 2012-03-16 04:31:09.287-0500

Hello!

I reproduced the problem once.
Unfortunately, ooh323 reload turns debug off, so I have no debug output.

What I did to reproduce:

1. started the backup gnugk gatekeeper
2. killed the main gatekeeper with kill -9
3. changed the gatekeeper to the backup one in ooh323.conf
4. ran ooh323 reload, all was OK
5. stopped gnugk with killall gnugk
6. changed the gatekeeper back to the main one in ooh323.conf
7. ran ooh323 reload; on the gnugk console I see only
GCF|10.1.1.17|ast-nsk|gateway;
(10.1.1.17 is the address of this Asterisk)
and there is no registration attempt.
I tried several times with the same result.
So I restarted Asterisk and got a normal registration:
GCF|10.1.1.17|ast-nsk|gateway;
RCF|10.1.1.17:1720|ast-nsk:h323_ID|gateway|3686_p98;

I hope you will find the difference between reload and ooh323 load...
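
The symptom in the steps above is a GCF line without a following RCF line for the same address. As a rough aid for checking captured gnugk console output, the check might look like the C sketch below. It assumes only the line layout visible in the two samples quoted here (event name, then the address, separated by '|'); saw_rcf_for is a hypothetical name and not part of gnugk or Asterisk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if any captured line is an RCF event
 * whose address field is exactly `ip` (optionally followed by ":port"),
 * and 0 otherwise. The field layout is assumed from the sample lines
 * "GCF|10.1.1.17|..." and "RCF|10.1.1.17:1720|...". */
int saw_rcf_for(const char *lines[], size_t n, const char *ip)
{
    assert(lines != NULL && ip != NULL);

    size_t iplen = strlen(ip);

    for (size_t i = 0; i < n; i++) {
        const char *line = lines[i];

        if (strncmp(line, "RCF|", 4) != 0)
            continue;               /* not a registration confirm */

        const char *addr = line + 4;
        if (strncmp(addr, ip, iplen) == 0 &&
            (addr[iplen] == ':' || addr[iplen] == '|'))
            return 1;               /* address matches in full */
    }
    return 0;
}
```

For the output seen after reload (GCF only) this returns 0; for the output seen after a restart (GCF followed by RCF) it returns 1.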

Thank you!


By: Dmitry Melekhov (slesru) 2012-03-16 04:31:44.191-0500

btw, this is not 100% reproducible :-(


By: Alexander Anikin (may213) 2012-03-21 12:47:15.594-0500

Dmitry,

please test with change_gk_on_reload-2.patch.

I'm not sure, but it may help if new calls arrive during the reload.


By: Alexander Anikin (may213) 2012-03-21 12:57:35.173-0500

Dmitry,

I can't reproduce your problem; all reloads went fine in every case (changing from one GK to the other and back, or reloading with the same GK).

Btw, gnugk reports the RCF with some delay after the GCF (above 1.5 seconds).

For further analysis we need the ooh323_log with tracelevel=6; then we will see what happens inside ooh323.

By: Dmitry Melekhov (slesru) 2012-03-21 22:42:07.463-0500

Hello!

Thank you!
I just installed the patch, but I guess I won't be able to test it before next week. :-(
I'll report the results or provide debug logs :-)


By: Dmitry Melekhov (slesru) 2012-03-28 04:19:33.234-0500

Hello!

I found this problem very hard to reproduce with the latest patch.
I only got it once, and that was without tracelevel=6 (I forgot to set it on the first attempt).
All I got is

[Mar 28 13:00:40] ERROR[20102]: utils.c:571 lock_info_destroy: Thread 'ooh323c_call_thread  started at [  169] ooh323cDriver.c ooh323c_start_call_thread()' still has a lock! - '&callListLock' (0x129b5c0) from 'ooRemoveCallFromList' in ooh323c/src/ooCalls.c:271!

on console.

Anyway, next week I'm going to reboot gnugk server, so I'll have another chance ;-)

Thank you!

By: Alexander Anikin (may213) 2012-03-29 08:24:58.135-0500

Dmitry,

It's an interesting bug, but I can't understand when it can happen ;)
I suspect it is caused by this code:

  ast_mutex_lock(&callListLock);

  OOTRACEINFO3("Removing call %lx: %s\n", call, call->callToken);

  if (!gH323ep.callList) return OO_OK;


When callList is empty, ooCleanCall returns without unlocking (but when callList is empty
there should not be any calls).

I will commit a workaround and try to understand this problem.
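
To illustrate the pattern (this is not the uploaded patch), here is a minimal standalone C sketch in which the early return for an empty list releases the mutex before leaving. It uses plain pthreads instead of ast_mutex_lock, and struct call, call_list, call_list_lock and remove_call are simplified, hypothetical stand-ins for the ooh323c types and names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Simplified stand-ins for the ooh323c call list and callListLock. */
struct call { struct call *next; };

pthread_mutex_t call_list_lock = PTHREAD_MUTEX_INITIALIZER;
struct call *call_list = NULL;

/* Unlink `c` from the list if it is there. Every return path releases
 * the lock, including the early exit taken when the list is empty.
 * Always returns 0 (standing in for OO_OK). */
int remove_call(struct call *c)
{
    assert(c != NULL);

    pthread_mutex_lock(&call_list_lock);

    if (call_list == NULL) {
        /* The unlock that the quoted excerpt skips before returning. */
        pthread_mutex_unlock(&call_list_lock);
        return 0;
    }

    /* Walk the links so that the head and inner nodes are handled alike. */
    for (struct call **p = &call_list; *p != NULL; p = &(*p)->next) {
        if (*p == c) {
            *p = c->next;
            c->next = NULL;
            break;
        }
    }

    pthread_mutex_unlock(&call_list_lock);
    return 0;
}
```

After remove_call returns on an empty list, another thread can take call_list_lock again, which is what the "still has a lock" error above indicates was not the case.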



By: Alexander Anikin (may213) 2012-03-29 08:52:07.593-0500

A patch correcting the lock issue in ooCleanCall has been uploaded.

Dmitry, you can try a reload with this patch.


By: Dmitry Melekhov (slesru) 2012-04-02 04:15:09.362-0500

Hello!

I just tried with the latest patch.
I can't reproduce the problem :-)

Thank you!

By: Dmitry Melekhov (slesru) 2012-05-29 04:20:15.214-0500

Hello!

I'd like to add that I just rebooted my gnugk server and everything works OK.
So, if the patches are not included yet, please include them in the main branch.

Thank you!


By: Alexander Anikin (may213) 2012-06-01 15:08:21.930-0500

Hi Dmitry,

I will commit the lock issue patch, but first I need to understand when this happens ;)
Unfortunately I have been very busy with other projects for the last two months and haven't had much time for ooh323 ;(
But I think we will close this issue next week.


By: Dmitry Melekhov (slesru) 2012-08-06 10:28:16.315-0500

Hello, Alexander!

Today I configured a new server and found that some patches from here are already applied in 10.7.0, but patch-1 can still be applied.
Could you tell me whether I should apply it, or is changing the gatekeeper on reload implemented in another way?

Thank you!

By: Alexander Anikin (may213) 2012-08-06 17:00:25.809-0500

Hi Dmitry,

change_gk_on_reload-2.patch is already in the Asterisk 10 code because it is a bug fix;
change_gk_on_reload-1.patch is in trunk only, but it applies to 10.7.0 without any problem.

So you can apply -1.patch to Asterisk 10 and use it as before.

By: Dmitry Melekhov (slesru) 2012-08-06 22:55:06.680-0500

It works :-)

Thank you!

By: Dmitry Melekhov (slesru) 2012-09-13 00:44:47.372-0500

Hello!

Btw, I just found that if I deregister Asterisk in gnugk, it doesn't try to re-register.
All other gateways I know do re-register...

Thank you!