Review Board 1.7.16


IAX2: fix race condition when transferrring.

Review Request #2594 - Created June 4, 2013 and submitted

Alec Davis
1.8 to trunk
ASTERISK-21409
Reviewers
asterisk-dev
Asterisk
1). When touching the bridgecallno, we need to lock it.

2). stop_stuff() which calls iax2_destroy_helper()
    Assumes the lock on the pvt is already held, when iax2_destroy_helper() is called.
    Thus we need to lock the bridgecallno pvt before we call stop_stuff(iaxs[fr->callno]->bridgecallno);

3). When evaluating the state of 'callno->transferring' of the current leg, we can't change it to READY unless the bridgecallno is locked.
    Why, if we are interrupted by the other call leg before 'transferring = TRANSFER_RELEASED', the interrupt will find that it is READY and that the bridgecallno is also READY so Releases the legs.
    Then the first call leg in this example, finishes execution, and Releases the legs AGAIN!!!!!!!
    Interleaving thread execution gets interesting as well - see timeline below from June 5. 

Debug captures when it went wrong.
[May 31 14:44:01] VERBOSE[30820] chan_iax2.c:     -- Channel 'IAX2/auckland-13262' ready to transfer
[May 31 14:44:01] VERBOSE[30824][C-00000536] chan_iax2.c:     -- Channel 'IAX2/auckland-20457' ready to transfer
[May 31 14:44:01] VERBOSE[30824][C-00000536] chan_iax2.c:     -- Releasing IAX2/auckland-20457 and IAX2/auckland-13262
[May 31 14:44:01] VERBOSE[30820] chan_iax2.c:     -- Releasing IAX2/auckland-13262 and IAX2/auckland-20457
[May 31 14:44:01] DEBUG[30824][C-00000536] sched.c: Attempted to delete nonexistent schedule entry 209951!
[May 31 14:44:01] ERROR[30824][C-00000536] lock.c: chan_iax2.c line 1918 (iax2_destroy_helper): mutex '&iaxsl[pvt->callno]' freed more times than we've locked!
[May 31 14:44:01] ERROR[30824][C-00000536] lock.c: chan_iax2.c line 1918 (iax2_destroy_helper): Error releasing mutex: Operation not permitted

[Jun  5 19:53:43] VERBOSE[25606][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-19065' ready to transfer
[Jun  5 19:53:43] VERBOSE[25604][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-20047' ready to transfer
[Jun  5 19:53:43] VERBOSE[25606][C-00000000] chan_iax2.c:     -- Releasing IAX2/auckland-19065 and IAX2/auckland-20047
[Jun  5 19:53:43] VERBOSE[25604][C-00000000] chan_iax2.c:     -- Releasing IAX2/auckland-20047 and IAX2/auckland-19065
[Jun  5 19:53:43] VERBOSE[25606][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-19065' finished transfer
[Jun  5 19:53:43] DEBUG[25604][C-00000000] sched.c: Attempted to delete nonexistent schedule entry 17!
[Jun  5 19:53:43] VERBOSE[25604][C-00000000] chan_iax2.c:     -- Channel 'IAX2/auckland-20047' finished transfer


A time line of 2 threads interleaving that shows how the Jun 5 capture may have come about.
The execution path seems to switch threads as we print/log data.
thread 25606                thread 25604
======================      =======================
if US == BEGIN
   US = READY
   "ready to transfer"
                            if US == BEGIN
                               US = READY
                               "ready to transfer"
    if THEY == READY
       "Releasing .."
                               if THEY == READY
                                  "Releasing .."
       THEY = RELEASED
       US = RELEASED
       stopstuff(US)
       stopstuff(THEM)
       "finished transfer"

                                   THEY = RELEASED
                                   US = RELEASED
                                   stopstuff(US)
                                   stopstuff(THEM) (sched.c: Attempted to delete nonexistent schedule entry 17!)
                                   "finished transfer"

Yes. Between sites. many test calls and real calls being transferred back down the same trunk

Diff revision 4 (Latest)

1 2 3 4
1 2 3 4

  1. branches/11/channels/chan_iax2.c: Loading...
branches/11/channels/chan_iax2.c
Revision 385594 New Change
[20] 11517 lines
[+20] [+] immediatedial:
11518
						send_command(iaxs[iaxs[fr->callno]->bridgecallno], AST_FRAME_IAX, IAX_COMMAND_TXREJ, 0, NULL, 0, -1);
11518
						send_command(iaxs[iaxs[fr->callno]->bridgecallno], AST_FRAME_IAX, IAX_COMMAND_TXREJ, 0, NULL, 0, -1);
11519
					}
11519
					}
11520
				}
11520
				}
11521
				break;
11521
				break;
11522
			case IAX_COMMAND_TXREADY:
11522
			case IAX_COMMAND_TXREADY:

    
   
11523
				if (iaxs[fr->callno]->bridgecallno) {

    
   
11524
					while (ast_mutex_trylock(&iaxsl[iaxs[fr->callno]->bridgecallno])) {

    
   
11525
						DEADLOCK_AVOIDANCE(&iaxsl[fr->callno]);

    
   
11526
					}

    
   
11527
					if (!iaxs[fr->callno]) {

    
   
11528
						break;

    
   
11529
					}

    
   
11530
				}

    
   
11531

   
11523
				if ((iaxs[fr->callno]->transferring == TRANSFER_BEGIN) ||
11532
				if ((iaxs[fr->callno]->transferring == TRANSFER_BEGIN) ||
11524
				    (iaxs[fr->callno]->transferring == TRANSFER_MBEGIN)) {
11533
				    (iaxs[fr->callno]->transferring == TRANSFER_MBEGIN)) {
11525
					if (iaxs[fr->callno]->transferring == TRANSFER_MBEGIN)
11534
					if (iaxs[fr->callno]->transferring == TRANSFER_MBEGIN)
11526
						iaxs[fr->callno]->transferring = TRANSFER_MREADY;
11535
						iaxs[fr->callno]->transferring = TRANSFER_MREADY;
11527
					else
11536
					else
[+20] [20] 38 lines
[+20] immediatedial:
11566
							}
11575
							}
11567

    
   
11576

   
11568
						}
11577
						}
11569
					}
11578
					}
11570
				}
11579
				}

    
   
11580
				if (iaxs[fr->callno]->bridgecallno) {

    
   
11581
					ast_mutex_unlock(&iaxsl[iaxs[fr->callno]->bridgecallno]);

    
   
11582
				} 
11571
				break;
11583
				break;
11572
			case IAX_COMMAND_TXREQ:
11584
			case IAX_COMMAND_TXREQ:
11573
				try_transfer(iaxs[fr->callno], &ies);
11585
				try_transfer(iaxs[fr->callno], &ies);
11574
				break;
11586
				break;
11575
			case IAX_COMMAND_TXCNT:
11587
			case IAX_COMMAND_TXCNT:
[+20] [20] 3489 lines
  1. branches/11/channels/chan_iax2.c: Loading...

https://reviewboard.asterisk.org/ runs on a server provided by Digium, Inc. and uses bandwidth donated to the open source Asterisk community by API Digital Communications in Huntsville, AL USA.
Please report problems with this site to asteriskteam@digium.com.