Review Board 1.7.16


DIALOG_INFO_XML stale (and timeout) notifiy subscriptions stop BLF's working on Grandstream GXP20xx phones.

Review Request #1813 - Created March 12, 2012 and updated

Alec Davis
1.8 and above
Reviewers
asterisk-dev
Asterisk
This issue is caused by review "handle_request_subscribe is too slow cause of iterating through all sip dialogs" https://reviewboard.asterisk.org/r/901/
commit http://svnview.digium.com/svn/asterisk?view=revision&revision=289623



Some phones (Grandstream GXP20XX series 1.2.5.3 firmware) if rebooted, loose the BLF lights due to the new subscription's "version" being less than the previous subscription's "version".
This happens the instant that asterisk timeouts the previous subscription - which has the much larger "version" sequence number.

The only way for the lights to start working again, is for the current subscription's "version" number to increment past the previous old "version" number.
That period of time could be huge, if the phone isn't rebooted for days.

The workaround has been to reboot the phone twice, within a few minutes of each reboot.
  1st time wait for the BLF's to fail (then reboot again) - which is the timeout period - a few minutes.
  2nd time the lights will work, then fail, then after a few minutes start working again.

This fix prevents the old subscription timeout from updating the phones "version" number. 

This issue doesn't affect the GXP21XX series.

Attempt to show issue byway of example below with some debug output showing call-id and version number.

astrid-test*CLI> core show hints

    -= Registered Asterisk Dial Plan Hints =-
            8601GXP0001@trusted             : Custom:q8601_a8612    State:InUse           Watchers  1
Debug output:

===Set the BLF state to InUse
- Executing [s@macro-custom-agent-inout:13] Set("SIP/GXP0001-00000005", "DEVICE_STATE(Custom:q8601_a8612)=INUSE") in new stack
...
[Mar 14 11:01:39] NOTICE[10819]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="3c65aed02ae1d5df@192.168.5.141"> version=5

=============
Reboot the phone, it gets a new BLF subscription.
=============
[Mar 14 11:02:30] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=0
===The BLF has correct state of InUse.

...
[Mar 14 11:05:00] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=1
===The BLF still has correct state of InUse.

...
The timeout 'last notification'
[Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:13131 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="3c65aed02ae1d5df@192.168.5.141" direction="recipient"> version=6
[Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:3868 __sip_autodestruct: ALEC TIMEOUT of SIP subscription 3c65aed02ae1d5df@192.168.5.141
===The BLF has incorrectly gone Idle.

...
[Mar 14 11:15:01] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=5
===The BLF is still incorrect at Idle.

    -- Registered SIP 'GXP0001' at 192.168.5.141:5066
[Mar 14 11:17:31] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=6
===The BLF now has correct state of InUse.
In place on a 1.8 deployment for months, our production box with ~80 GXP phones, half of which are the older GXP20XX series.
Review request changed
Updated (March 7, 2013, 6:18 p.m.)
  • This issue is caused by review "handle_request_subscribe is too slow cause of iterating through all sip dialogs" https://reviewboard.asterisk.org/r/901/
    commit http://svnview.digium.com/svn/asterisk?view=revision&revision=289623
    
    
    
    Some phones (Grandstream GXP20XX series 1.2.5.3 firmware) if rebooted, loose the BLF lights due to the new subscription's "version" being less than the previous subscription's "version".
    This happens the instant that asterisk timeouts the previous subscription - which has the much larger "version" sequence number.
    
    The only way for the lights to start working again, is for the current subscription's "version" number to increment past the previous old "version" number.
    That period of time could be huge, if the phone isn't rebooted for days.
    
    The workaround has been to reboot the phone twice, within a few minutes of each reboot.
      1st time wait for the BLF's to fail (then reboot again) - which is the timeout period - a few minutes.
      2nd time the lights will work, then fail, then after a few minutes start working again.
    
    This fix prevents the old subscription timeout from updating the phones "version" number. 
    
    This issue doesn't affter the GXP21XX series.
    
    Attempt to show issue byway of example below with some debug output showing call-id and version number.
    
    astrid-test*CLI> core show hints
    
        -= Registered Asterisk Dial Plan Hints =-
                8601GXP0001@trusted             : Custom:q8601_a8612    State:InUse           Watchers  1
    Debug output:
    
    ===Set the BLF state to InUse
    - Executing [s@macro-custom-agent-inout:13] Set("SIP/GXP0001-00000005", "DEVICE_STATE(Custom:q8601_a8612)=INUSE") in new stack
    ...
    [Mar 14 11:01:39] NOTICE[10819]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="3c65aed02ae1d5df@192.168.5.141"> version=5
    
    =============
    Reboot the phone, it gets a new BLF subscription.
    =============
    [Mar 14 11:02:30] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=0
    ===The BLF has correct state of InUse.
    
    ...
    [Mar 14 11:05:00] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=1
    ===The BLF still has correct state of InUse.
    
    ...
    The timeout 'last notification'
    [Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:13131 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="3c65aed02ae1d5df@192.168.5.141" direction="recipient"> version=6
    [Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:3868 __sip_autodestruct: ALEC TIMEOUT of SIP subscription 3c65aed02ae1d5df@192.168.5.141
    ===The BLF has incorrectly gone Idle.
    
    ...
    [Mar 14 11:15:01] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=5
    ===The BLF is still incorrect at Idle.
    
        -- Registered SIP 'GXP0001' at 192.168.5.141:5066
    [Mar 14 11:17:31] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=6
    ===The BLF now has correct state of InUse.
    

    This issue is caused by review "handle_request_subscribe is too slow cause of iterating through all sip dialogs" https://reviewboard.asterisk.org/r/901/
    commit http://svnview.digium.com/svn/asterisk?view=revision&revision=289623
    
    
    
    Some phones (Grandstream GXP20XX series 1.2.5.3 firmware) if rebooted, loose the BLF lights due to the new subscription's "version" being less than the previous subscription's "version".
    This happens the instant that asterisk timeouts the previous subscription - which has the much larger "version" sequence number.
    
    The only way for the lights to start working again, is for the current subscription's "version" number to increment past the previous old "version" number.
    That period of time could be huge, if the phone isn't rebooted for days.
    
    The workaround has been to reboot the phone twice, within a few minutes of each reboot.
      1st time wait for the BLF's to fail (then reboot again) - which is the timeout period - a few minutes.
      2nd time the lights will work, then fail, then after a few minutes start working again.
    
    This fix prevents the old subscription timeout from updating the phones "version" number. 
    
    This issue doesn't affect the GXP21XX series.
    
    Attempt to show issue byway of example below with some debug output showing call-id and version number.
    
    astrid-test*CLI> core show hints
    
        -= Registered Asterisk Dial Plan Hints =-
                8601GXP0001@trusted             : Custom:q8601_a8612    State:InUse           Watchers  1
    Debug output:
    
    ===Set the BLF state to InUse
    - Executing [s@macro-custom-agent-inout:13] Set("SIP/GXP0001-00000005", "DEVICE_STATE(Custom:q8601_a8612)=INUSE") in new stack
    ...
    [Mar 14 11:01:39] NOTICE[10819]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="3c65aed02ae1d5df@192.168.5.141"> version=5
    
    =============
    Reboot the phone, it gets a new BLF subscription.
    =============
    [Mar 14 11:02:30] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=0
    ===The BLF has correct state of InUse.
    
    ...
    [Mar 14 11:05:00] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=1
    ===The BLF still has correct state of InUse.
    
    ...
    The timeout 'last notification'
    [Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:13131 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="3c65aed02ae1d5df@192.168.5.141" direction="recipient"> version=6
    [Mar 14 11:05:09] NOTICE[10826]: chan_sip.c:3868 __sip_autodestruct: ALEC TIMEOUT of SIP subscription 3c65aed02ae1d5df@192.168.5.141
    ===The BLF has incorrectly gone Idle.
    
    ...
    [Mar 14 11:15:01] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=5
    ===The BLF is still incorrect at Idle.
    
        -- Registered SIP 'GXP0001' at 192.168.5.141:5066
    [Mar 14 11:17:31] NOTICE[10826]: chan_sip.c:13136 state_notify_build_xml: ALEC <dialog id="8601GXP0001" call-id="79956c0168125400@192.168.5.141"> version=6
    ===The BLF now has correct state of InUse.
    
  • changed from DIALOG_INFO_XML timeout notification stops BLF's from working. to DIALOG_INFO_XML stale (and timeout) notifiy subscriptions stop BLF's working on Grandstream GXP20xx phones.
This could be a new config option like buggymwi as used for Cisco phones.
Something like buggy_gxp20xx.


Not only does the stale subscription timeout mess up the new BLF version number, but so does a stale subscription updating a BLF.

Forcing the version number to "0" works around the stale subscription updates that happen until the subscription expires.
During this timeout period, the phone will get 2 updates for the same BLF, with both indicating the correct state 'confirmed' or 'terminated'.

Disabling the timeout notification, prevents the timeout notification setting state 'terminated;reason=timeout' which could incorrectly turn off a lit BLF.

Tested with newer GXP21xx series that don't have the problem (they reject the stale subscription as call-id doesn't match), and their BLF still operate correctly. 

https://reviewboard.asterisk.org/ runs on a server provided by Digium, Inc. and uses bandwidth donated to the open source Asterisk community by API Digital Communications in Huntsville, AL USA.
Please report problems with this site to asteriskteam@digium.com.