If you’re like me, your organization has phone numbers that are not assigned to users…so how do you treat them? Well, you can use the Unassigned Number feature to play the caller a message such as “You have reached an unassigned number at [some company]”, or you can send that call to UM. My organization at first didn’t want to use this feature, and for good reason. Again, if you’re like me, you get spammed by bots that troll numbers hoping to sell some product. Our thought was that if nothing answers the call, maybe the bots would magically stop calling us. The thought was good, but we hit some other issues. Let’s discuss the major one.
We have two SIP trunks in two different parts of the country. One of those trunks is primary for both inbound and outbound PSTN calling, and the other is only used if the primary trunk is down. For this post we’re only going to talk about inbound PSTN calls from our carrier. Let’s first discuss what it takes for our carrier to trigger recursion (that is, what error code the carrier must receive to automatically re-route the call to our secondary path). Our carrier needs a 5XX code to trigger an alternate route. So here we go:
When a call comes into Lync for a number that is not assigned, the SIP flow is this:
1. Call hits SBC
2. Call hits Mediation
3. Call hits Front End to route the call to the end user…in this case there is none
4. As a last step, the call goes to Edge, on the thought that maybe Edge can get the call to someone…which obviously it can’t.
At this point, Lync essentially gives up and times out with a 504 Server Time-out error. Should this happen enough times, you will get an alert from SCOM (if you’re using it) saying there are issues with your Mediation servers. Below are some snips from the error you’ll see on your Mediation servers:
Alert: [LYNC] The Mediation Server service has encountered a major call completion problem with the Front End.
Alert description: The Mediation Server service has encountered a major call completion problem with the Front End.
Cause: Calls to this Front End failed 5 times with failure final responses. Check other MOM alerts for more details.
Since Lync responds with a 504 back to the carrier, the carrier tries an alternate path, and we get the same issue on the other side (down our secondary trunk). The call continues to ping-pong between our two paths until the carrier timeout is hit, and then the carrier kills the call.
All of this does 3 things:
1. Makes you think there is an issue
2. Creates spam coming from SCOM
3. Begins to majorly mess up your monitoring server and metrics as the monitoring server reflects that calls are actually failing (not good if your CIO has a dashboard looking at call failures).
Further, we have a product from AudioCodes called Session Experience Manager. This tool presents a nice page that shows you where calls are failing. Since we have two trunks and since the calls for unassigned numbers always first hit our primary then secondary path, if you look at this tool, it appears that EVERY call that hits the secondary path just fails…thus making your nice tool look like there’s a MAJOR problem with one of your trunks.
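To make the ping-pong and its effect on per-trunk failure counts concrete, here is a rough sketch of the behavior described above. It is only a toy model, not anything taken from Lync or our carrier: the phone numbers, the trunk names, and the use of a fixed attempt limit as a stand-in for the carrier timeout are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical numbers that are assigned to real users.
ASSIGNED = {"+15550100", "+15550101"}
# The carrier always tries the primary trunk first, then the secondary.
TRUNKS = ["primary", "secondary"]
# Assumption: the carrier timeout is modeled as a fixed number of attempts.
MAX_ATTEMPTS = 6


def lync_response(number):
    """Return 200 if a user owns the number, otherwise the 504 Lync times out with."""
    return 200 if number in ASSIGNED else 504


def carrier_deliver(number, max_attempts=MAX_ATTEMPTS):
    """Alternate between trunks on every 5XX until success or the attempt limit."""
    attempts = []
    for i in range(max_attempts):
        trunk = TRUNKS[i % len(TRUNKS)]
        status = lync_response(number)
        attempts.append((trunk, status))
        if not 500 <= status <= 599:
            break  # anything other than 5XX does not trigger recursion
    return attempts


def failures_per_trunk(numbers):
    """Count the 5XX responses each trunk sees across a batch of inbound calls."""
    failures = Counter()
    for number in numbers:
        for trunk, status in carrier_deliver(number):
            if 500 <= status <= 599:
                failures[trunk] += 1
    return failures


if __name__ == "__main__":
    print(carrier_deliver("+15550100"))   # assigned number: one attempt on primary
    print(carrier_deliver("+15550199"))   # unassigned number: bounces between trunks
    print(failures_per_trunk(["+15550100", "+15550199"]))
```

In this model the secondary trunk never carries a successful call, only the retries of calls to unassigned numbers, which is why a per-trunk failure view makes it look broken.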
The fix is to add all your DIDs into the unassigned number range, so the flow is now:
1. Call hits SBC
2. Call hits Mediation
3. Call hits Front End, which realizes the number is a part of the unassigned number range, accepts the call, and plays a message.
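The difference the fix makes can be sketched as a simple lookup. This is an illustrative model only, not how Lync implements routing internally: the `route` function, the number block, and the returned labels are hypothetical names chosen for the example.

```python
# Hypothetical numbers assigned to real users (digits only, for easy comparison).
ASSIGNED = {15550100, 15550101}
# Hypothetical DID block added to the unassigned number range, as (start, end).
UNASSIGNED_RANGES = [(15550100, 15550199)]


def route(number, ranges=UNASSIGNED_RANGES):
    """Return (status, target) for an inbound call in this toy model."""
    if number in ASSIGNED:
        return (200, "user")
    for start, end in ranges:
        if start <= number <= end:
            # The number falls in an unassigned range: answer and play a message.
            return (200, "announcement")
    # Nothing claims the number, so the call eventually times out.
    return (504, "timeout")


if __name__ == "__main__":
    print(route(15550100))             # assigned to a user
    print(route(15550150))             # covered by the unassigned range
    print(route(15550150, ranges=[]))  # same number before the fix
```

Because the call is answered instead of returning a 5XX, the carrier has no reason to retry down the secondary trunk.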
Moving forward, your logs will clear up, your CIO will get quality data, and he/she will leave you alone 🙂
Next post: how do you add numbers to the unassigned range if they are NOT contiguous? If you’re like me, you would need to MANUALLY add each number individually, because each range (or unique number) needs its own identity in the Unassigned Number feature.