Review Board 1.7.16


res_pjsip_history: A debugging module for busy systems

Review Request #4053 - Created Oct. 8, 2014 and updated

Matt Jordan
trunk
Reviewers
asterisk-dev
Asterisk
While debugging things at SIPit, I found that the capabilities afforded by res_pjsip_logger to be inadequate for tracing SIP messages on the CLI. Often, so many requests/responses were received -- often with very large SDPs -- in a short period of time that after a single call scenario, the initial requests/responses had already been expunged from the CLI buffer. Furthermore, displaying every single SIP request/response was often counterproductive - OPTIONS, SUBSCRIBE, and even REGISTER requests were often interleaved in the call scenarios, making it difficult to find or trace portions of a call.

This isn't the fault of res_pjsip_logger: it was doing exactly what it was designed to do. And res_pjsip_logger is absolutely necessary for getting full logs when a problem occurs. However, it isn't designed for debugging things on the CLI. Hence, this module.

res_pjsip_history records every request/response received/transmitted through the PJSIP stack, but does not bother dumping them to the CLI. Instead, it provides a few CLI comamnds to access the recorded messages:

* pjsip set history {on|off} - enable/disable the history
* pjsip show history [min [max]] - display the entire history, or segments of the history.
* pjsip show entry {num} - display a particular history entry

Because we store all received/transmitted requests/responses, the module is only suitable for debugging purposes. Leaving it permanently odd is a bad idea, for obvious reasons. When the history is turned off, all messages are purged and the history reset.

As an example, we can record some history and display all of the messages:

*CLI> pjsip show history
No.  Timestamp      Rx/Tx Address        Call-ID              SIP Message
==== ============== ==================== ==================== ============
0000 1412775534.791 RX: 127.0.0.1:22428  d56324c8f042034aae29 OPTIONS sip:127.0.0.1 SIP/2.0
0001 1412775534.792 TX: 127.0.0.1:22428  d56324c8f042034aae29 SIP/2.0 200 OK
0002 1412775540.277 RX: 127.0.0.1:22428  86cd74e458e76b79e267 OPTIONS sip:127.0.0.1 SIP/2.0
0003 1412775540.277 TX: 127.0.0.1:22428  86cd74e458e76b79e267 SIP/2.0 200 OK
0004 1412775541.763 RX: 127.0.0.1:22428  f4c0050f5b604fc52ecc INVITE sip:200@127.0.0.1 SIP/2.0
0005 1412775541.765 TX: 127.0.0.1:22428  f4c0050f5b604fc52ecc SIP/2.0 200 OK
0006 1412775541.766 TX: 127.0.0.1:22428  f4c0050f5b604fc52ecc SIP/2.0 200 OK
0007 1412775541.780 RX: 127.0.0.1:22428  f4c0050f5b604fc52ecc ACK sip:127.0.0.1:5060 SIP/2.0
0008 1412775543.767 RX: 127.0.0.1:22428  f4c0050f5b604fc52ecc BYE sip:127.0.0.1:5060 SIP/2.0
0009 1412775543.768 TX: 127.0.0.1:22428  f4c0050f5b604fc52ecc SIP/2.0 200 OK
0010 1412775547.823 RX: 127.0.0.1:22428  ab6fc9f37aa1dc5ed038 SUBSCRIBE sip:1000@127.0.0.1 SIP/2.0
0011 1412775547.824 TX: 127.0.0.1:22428  ab6fc9f37aa1dc5ed038 SIP/2.0 481 Call/Transaction Does Not Exist
0012 1412775547.826 RX: 127.0.0.1:22428  ba4ed4625cbb3282b34c REGISTER sip:127.0.0.1 SIP/2.0
0013 1412775547.841 TX: 127.0.0.1:22428  ba4ed4625cbb3282b34c SIP/2.0 200 OK
0014 1412775549.854 RX: 127.0.0.1:34899  ca022bf6e5a31b306bfd REGISTER sip:127.0.0.1 SIP/2.0
0015 1412775549.870 TX: 127.0.0.1:34899  ca022bf6e5a31b306bfd SIP/2.0 200 OK
0016 1412775549.876 RX: 127.0.0.1:34899  d92c91d540a36d2cbca1 SUBSCRIBE sip:1000@127.0.0.1 SIP/2.0
0017 1412775549.876 TX: 127.0.0.1:34899  d92c91d540a36d2cbca1 SIP/2.0 200 OK
0018 1412775549.877 TX: 127.0.0.1:34899  d92c91d540a36d2cbca1 NOTIFY sip:1000@127.0.0.1:34899;transport=udp;registering_acc=127_0_0_1 SIP/2.0
0019 1412775549.877 RX: 127.0.0.1:34899  32cca4f61cbba6cbe47f OPTIONS sip:127.0.0.1 SIP/2.0
0020 1412775549.877 TX: 127.0.0.1:34899  32cca4f61cbba6cbe47f SIP/2.0 200 OK
0021 1412775549.889 RX: 127.0.0.1:34899  d92c91d540a36d2cbca1 SIP/2.0 200 OK
0022 1412775555.376 RX: 127.0.0.1:34899  f65c7ea06475be757d2c INVITE sip:1000@127.0.0.1 SIP/2.0
0023 1412775555.377 TX: 127.0.0.1:34899  f65c7ea06475be757d2c SIP/2.0 487 Request Terminated
0024 1412775555.382 TX: 127.0.0.1:34899  6ddb9f94-9c20-4f83-8 INVITE sip:1000@127.0.0.1:34899;transport=udp;registering_acc=127_0_0_1 SIP/2.0
0025 1412775555.383 TX: 127.0.0.1:34899  f65c7ea06475be757d2c SIP/2.0 487 Request Terminated
0026 1412775560.135 RX: 127.0.0.1:34899  f65c7ea06475be757d2c CANCEL sip:1000@127.0.0.1 SIP/2.0
0027 1412775560.135 TX: 127.0.0.1:34899  f65c7ea06475be757d2c SIP/2.0 200 OK
0028 1412775560.135 TX: 127.0.0.1:34899  f65c7ea06475be757d2c SIP/2.0 487 Request Terminated
0029 1412775560.146 RX: 127.0.0.1:34899  f65c7ea06475be757d2c ACK sip:1000@127.0.0.1 SIP/2.0

Alternatively, we can display on a subset of the messages:

*CLI> pjsip show history 4 9
No.  Timestamp      Rx/Tx Address        Call-ID              SIP Message
==== ============== ==================== ==================== ============
0004 1412775541.763 RX: 127.0.0.1:22428  f4c0050f5b604fc52ecc INVITE sip:200@127.0.0.1 SIP/2.0
0005 1412775541.765 TX: 127.0.0.1:22428  f4c0050f5b604fc52ecc SIP/2.0 200 OK
0006 1412775541.766 TX: 127.0.0.1:22428  f4c0050f5b604fc52ecc SIP/2.0 200 OK
0007 1412775541.780 RX: 127.0.0.1:22428  f4c0050f5b604fc52ecc ACK sip:127.0.0.1:5060 SIP/2.0
0008 1412775543.767 RX: 127.0.0.1:22428  f4c0050f5b604fc52ecc BYE sip:127.0.0.1:5060 SIP/2.0
*CLI> 

We can narrow down things further by displaying only a specific entry in the history:

*CLI> pjsip show entry 4
INVITE sip:200@127.0.0.1 SIP/2.0
Call-ID: f4c0050f5b604fc52ecca628a9393d3f@0:0:0:0:0:0:0:0
CSeq: 1 INVITE
From: "1000" <sip:1000@127.0.0.1>;tag=aece8db8
To: <sip:200@127.0.0.1>
Via: SIP/2.0/UDP 127.0.0.1:22428;branch=z9hG4bK-3833-5f28322b65a22d22475a18c1b3cb8260
Max-Forwards: 70
Contact: "1000" <sip:1000@127.0.0.1:22428;transport=udp;registering_acc=127_0_0_1>
User-Agent: Jitsi2.4.4997Linux
Content-Type: application/sdp
Content-Length: 783

v=0
o=1000 0 0 IN IP4 127.0.0.1
s=-
c=IN IP4 127.0.0.1
t=0 0
m=audio 5012 RTP/AVP 96 97 98 9 100 102 0 8 103 3 104 101
a=rtpmap:96 opus/48000/2
a=fmtp:96 usedtx=1
a=rtpmap:97 SILK/24000
a=rtpmap:98 SILK/16000
a=rtpmap:9 G722/8000
a=rtpmap:100 speex/32000
a=rtpmap:102 speex/16000
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:103 iLBC/8000
a=rtpmap:3 GSM/8000
a=rtpmap:104 speex/8000
a=rtpmap:101 telephone-event/8000
a=extmap:1 urn:ietf:params:rtp-hdrext:csrc-audio-level
m=video 5014 RTP/AVP 105 99
a=recvonly
a=rtpmap:105 H264/90000
a=fmtp:105 profile-level-id=4DE01f;packetization-mode=1
a=imageattr:105 send * recv [x=[0-1360],y=[0-768]]
a=rtpmap:99 H264/90000
a=fmtp:99 profile-level-id=4DE01f
a=imageattr:99 send * recv [x=[0-1360],y=[0-768]]

*CLI> 


Because we cache the actual PJSIP tdata/rdata structures, this module could be taken further in the future by adding filtering mechanisms. Ideally, someone could filter the history on particular call-ids, message types, etc. However, at the sake of having something (and not going too crazy with this on an initial patch), the initial functionality provided in this module should be fairly useful.

 
Total:
3
Open:
3
Resolved:
0
Dropped:
0
Status:
From:
Posted (Oct. 8, 2014, 9:53 a.m.)
If we were to follow the CLI convention, it would be:

pjsip show histories 4 9 

and 

pjsip show history 4

It might be helpful to also add an alias for 'pjsip show historys'
  1. Er... I'm not sure I like the plural form. There's really only one history for the PJSIP stack.
Posted (Oct. 14, 2014, 1:37 p.m.)

   

  
/trunk/res/res_pjsip_history.c (Diff revision 1)
 
 
Shouldn't ">" be ">=" instead?
Posted (Oct. 16, 2014, 6:32 a.m.)

   

  
/trunk/res/res_pjsip_history.c (Diff revision 1)
 
 
I'm not a fan of these scoped mutexes being here. Despite the enabled check being fast you've still got a contention point on every message. Is it just protecting the vector or enabled also?
  1. Really, both. Because this is CLI controlled, you have to protect access to the boolean variable - otherwise, you have a race condition where the CLI command sets the boolean to False, locks the vector, and starts destroying it while at the same time something else passed the boolean check but is blocked on the vector lock. When the vector is done being destroyed, you access the vector and start inserting history when you shouldn't be able to do so.
    
    I don't like it either, but really any container we use will have to be locked to prevent concurrent accesses. I originally didn't use scoped mutexes, but found that there was so little that wasn't getting locked and that there were enough return paths that the scoped mutex didn't change much of anything.
  2. What if it checked again with the lock held? I'm not going for perfection here - just for most cases except when debugging this won't be enabled so getting a lock, checking, and then unlocking is a waste. I freely admit it, a bit ugly.
Posted (Oct. 16, 2014, 4:44 p.m.)

   

  
/trunk/res/res_pjsip_history.c (Diff revision 1)
 
 
 
 
 
I wonder if the code could be simplified by instead of saving rdata and tdata, saving a pjsip_msg structure instead. Both rdata and tdata contain a pjsip_msg, so you wouldn't need all of the "if this history entry has a tdata, do this, otherwise do this on the rdata instead" logic. It would also result in a smaller memory footprint. You'd just need to extract the pjsip_msg from the tdata or rdata structures in your PJSIP module callbacks.

The only piece of information you use outside of the pjsip_msg is the remote address, and that could just be saved on the pjsip_history_entry structure as a separate field.
  1. From a memory perspective looking at both of the structs the extra info there is not a HUGE amount, but agreed - simply using pjsip_msg would simplify things when accessing. It just complicates things when creating it since there's a pool to manage and some extra logic for copying the message safely. That part can be pulled from pjsip_rx_data_clone.
Posted (Nov. 5, 2014, 7:06 a.m.)

   

  
/trunk/res/res_pjsip_history.c (Diff revision 1)
 
 
This can be allocated without a lock.

https://reviewboard.asterisk.org/ runs on a server provided by Digium, Inc. and uses bandwidth donated to the open source Asterisk community by API Digital Communications in Huntsville, AL USA.
Please report problems with this site to asteriskteam@digium.com.