Review Board 1.7.16

Fix crash due to new end_bridge_callback code

Review Request #54 - Created Nov. 12, 2008 and submitted

Mark Michelson
murf, russell, twilson
There was a bug introduced with the new end_bridge_callback code that caused a crash to occur when an attended transfer was used. Explaining this is a bit tricky, but I'll do my best here.

In app_dial, we set the end_bridge_callback_data to be the calling channel. Thus when the callback is called, we can set a few channel variables. The problem is that when an attended transfer takes place, we copy the bridge_config structure from the original call to the new call which is made in the builting_atxfer function. This is problematic, because when the end_bridge_callback is called in this case, we then end up acting on a freed channel.

To fix this, my immediate thought was to NULL out the end_bridge_callback in the copied bridge_config, but then I realized that this would lead to bad CDRs (since the time of the final leg of the call will not be logged) and wrong channel variable values. It's important to call the end_bridge_callback at the end of the call. I also can't just change the end_bridge_callback_data to point to the new calling channel because the end_bridge_callback_data is not always going to be a pointer to a channel. In app_queue, for instance, this is a pointer to a queue_ent structure.

So, the solution I came up with was to make an end_bridge_callback_data_fixup function. If a module registers this callback, then builtin_atxfer will call this function with its copy of the bridge_config structure, as well as both the calling and called channels involved in the new bridge. This way, if a module has either of the original channels in its end_bridge_callback_data, it can update properly. It's not exactly a pretty solution, but it works to clear up the problem.

This review is posted both for approval before merging the changes, and potentially for suggestions of another way of working around this issue if my way is frowned upon.
I tested attended transfers originated by both the caller and callee while running Asterisk under Valgrind. By doing this, I was able to completely remove any invalid memory accesses.
Review request changed
Updated (Nov. 13, 2008, 5:47 a.m.)
Fixed the arguments to the end_bridge_callback_data_fixup function in builtin_atxfer. It turns out the proper order for the arguments is the opposite of what I thought it was in the first note I had made on this review.

Also added a fixup to app_followme since it is logically equivalent to app_dial in this regard.
Posted (Nov. 13, 2008, 7:07 a.m.)
Mark, this was really good detective work.

I can see nothing wrong with what you did.
But I do see that I took the wrong turn a while back, and with fortitude, tried to make it work, but this
and other problems are all saying to me: "You took
the wrong path. Back out and try it a different way."

Trying to catch CDR data in the bridge looked good
when I decided to try that way back when, but it's
technically bankrupt, and your patch (and a few others
I've made myself) pretty well prove it. But, this is
a discussion for a different forum... runs on a server provided by Digium, Inc. and uses bandwidth donated to the open source Asterisk community by API Digital Communications in Huntsville, AL USA.
Please report problems with this site to