Our CTO, Joe Fuccillo, often uses a crime investigation as an analogy for diagnosing communications quality issues within a UC deployment. A crime has occurred; the user is the victim. The initial list of potential suspects is long and varied: from the WAN, to the network components, to the premises Wi-Fi, to the servers, to the clients, to the device, to the user. The first step in a successful crime investigation is gathering evidence, followed closely by correlating facts and systematically eliminating suspects. A good detective continues to narrow the search until the perpetrator becomes obvious. A bad detective makes assumptions, takes shortcuts, and arrests the convenient suspect, who is not always the actual culprit.
Correlation Is Not Necessarily Causation
Imagine you walked around a corner and saw a body on the ground with someone standing nearby with a smoking gun (the gun is literally smoking). There is a lot of circumstantial evidence, but can you guarantee that the person holding the gun shot the other person? You can correlate the person on the ground, the gun, and the person holding the gun, but those 3 facts cannot definitively prove the narrative of what actually happened.
Transmitting While Wireless
Shifting the scenario to the realm of UC and diagnostics, I recently saw a demo where a Skype for Business monitoring solution arbitrarily accused an unmanaged wireless connection of creating poor quality. The solution demonstrated a record of a poor Skype for Business call and produced 3 facts extracted from the Microsoft SDN API:
• Network jitter was high (above 40 ms)
• Packet loss was high (above 35%)
• One caller was on an unmanaged wireless connection
From these 3 facts, the IT Pro using the tool concluded:
a caller was calling in over unmanaged WiFi from outside the network and that leg had very bad jitter and packet loss
There was a call leg over unmanaged wireless and there were issues with packet loss and jitter. How do we know for sure the wireless leg created those issues? How can we make that leap in our narrative? I feel bad for the wireless router. Guilty until proven innocent.
I Am Not Really A Conspiracy Theory Enthusiast, but . . .
If I were responsible for the corporate network, it would be easier to blame the unmanaged wireless network device than to cast a shadow on my own managed, internal infrastructure. I would not want to highlight anything potentially wrong with my network infrastructure. So, all things being equal, without actual data, I am most likely to blame something other than my network.
Who Cares If the Access Point is Conveniently Assigned Blame?
If IT Pro Tools falsely and conveniently place responsibility on outside factors, there is the chance that internal network issues are overlooked. Let’s assume that in the situation above, the caller on the corporate network was actually the cause of the poor communications quality. Let’s stretch our imagination and assume a router was upgraded over the weekend and QoS settings were negatively impacted or certain queues within the router were being overrun. If either of those scenarios is accurate, there is a fundamental issue in the enterprise’s network and there is the potential for many more poor Skype for Business user experiences. By taking the “convenient” path and accusing the unmanaged wireless network and overlooking the issues with the managed network, this enterprise is increasing the risk of widespread communications quality issues and broad Skype for Business user dissatisfaction.
Step Aside and Let the Real Investigators Take Over
Let’s revisit the hypothetical of the smoking gun. What if there were surveillance cameras, and investigators could replay the scene and see what actually happened at that moment in time? Then the actual culprit would be rightly identified and justice would be served. In the world of UC diagnostics, the equivalent of a surveillance camera is a probe. What if there were probes bracketing the routers on the corporate network? If these probes saw “impaired” Skype for Business traffic coming in from the outside on that particular call, it would help validate the assumption that the unmanaged wireless access point was the villain. But if these probes identified “impaired” Skype for Business traffic coming from the internal network, then the IT Pro could accurately accuse the internal network of creating the issues. Furthermore, if the diagnostics tool is capable of monitoring both the underlying routed network and the UC application, the IT Pro Tool can correlate information from the various routers and further refine the search for the offending network leg or interface. It’s like those police dramas that use traffic cameras to track events, establish a timeline for the crime, and place the suspect at the crime scene at the time the crime occurred.
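To make the probe idea concrete, here is a minimal sketch in Python of how readings from probes along a call's media path could be used to narrow down where a stream first becomes impaired. Everything in it is an assumption for illustration: the `ProbeReading` structure, the probe labels, and the `first_impaired_segment` helper are hypothetical and are not part of the Microsoft SDN API or any particular monitoring product. The default limits simply reuse the 40 ms jitter and 35% packet loss figures from the demo above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ProbeReading:
    """One probe's view of a single media stream (hypothetical structure)."""
    location: str      # label for where the probe sits on the path
    jitter_ms: float   # jitter observed at this probe
    loss_pct: float    # packet loss observed at this probe, in percent


def is_impaired(reading: ProbeReading,
                jitter_limit_ms: float = 40.0,
                loss_limit_pct: float = 35.0) -> bool:
    """Flag a reading that exceeds either illustrative limit."""
    return (reading.jitter_ms > jitter_limit_ms
            or reading.loss_pct > loss_limit_pct)


def first_impaired_segment(path: List[ProbeReading]) -> Optional[Tuple[str, str]]:
    """Return the (upstream, downstream) labels of the first segment where
    the stream shows up impaired, or None if every probe sees a clean stream.

    `path` must be ordered in the direction the media flows, starting with
    the probe closest to the sender.
    """
    upstream = "sender"
    for reading in path:
        if is_impaired(reading):
            return (upstream, reading.location)
        upstream = reading.location
    return None


# Stream from the external caller: already impaired at the first probe,
# which supports (but does not prove) the "unmanaged Wi-Fi" theory.
from_outside = [
    ProbeReading("internet-edge", jitter_ms=55.0, loss_pct=36.0),
    ProbeReading("core-router", jitter_ms=56.0, loss_pct=36.0),
]
print(first_impaired_segment(from_outside))

# Same call, different evidence: the stream is clean at the edge and only
# degrades inside the managed network.
degraded_inside = [
    ProbeReading("internet-edge", jitter_ms=4.0, loss_pct=0.1),
    ProbeReading("core-router", jitter_ms=52.0, loss_pct=36.0),
]
print(first_impaired_segment(degraded_inside))
```

The point of the sketch is the shape of the evidence, not the thresholds: an end-to-end report of high jitter and loss is identical in both scenarios, while per-probe readings point to different segments.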
We Are Better Than This
It’s 2016; we should not be component profiling anymore. The unmanaged wireless access point has suffered enough. Let’s use the technology we have available and ensure we correctly identify the perpetrator so they are not left to strike again.
BTW, Don’t Fret . . .
When you walked around that corner, you happened upon a photo shoot for the posters for the new CSI movie. They were all actors, they’re all alive, and the gun was extra smoky for effect.