The network is slow... again. What is the root cause this time?


During our Monday morning coffee break/meeting last week, I heard our engineers talking about a prospect in the Middle East. I’m an engineer (or I used to be!), so I usually start asking questions straight away and ‘drilling down’ to try to understand the real problem. It is always critical to understand the real pain, not only to help with our roadmap and messaging, but also because it is very important (and getting more difficult) for management to keep in touch with the latest ‘issues’.

The prospect has about 2000 users, a Cisco core and 3 or 4 sites. When I asked why they got in touch, how they found us, and what PAIN caused them to search, download and install the LANGuardian on a virtual appliance and request a quote, I got a pretty simple answer: ‘sometimes their network is very slow, he wants to know what is happening on his network’. A bit like the movies, I guess: sometimes the old ones are the best.

I know some companies have worked this to death and you are probably sick and tired of receiving emails on this, but why are organisations still having this problem? Now that we have 10 Gig at the core and even 10 Gig Internet pipes, surely even the students attending large universities cannot be using it all up?
Is it because users like the millennials are now more technical, sophisticated and demanding? As a network administrator mentioned to me recently, ‘they always blame the network but 90% of the time it is the users. I use the LANGuardian to generate evidence’.

Is the transition to the cloud a factor? Video, especially HD quality on YouTube for example, is certainly a major contributor. Is it because organisations have to do more with less and network administrators are under more pressure?

This particular issue was related to security cameras hogging bandwidth internally, so the traffic never reached the perimeter and was not easily visible to the administrator. I’m not so sure logs would have helped here, as they are not really useful for troubleshooting bandwidth-related issues on actual links to and from remote sites or the Internet. One usually needs the ability to focus on a specific link or area of the network using traffic- or flow-based technologies. These systems can capture a lot of detail on network usage: top clients, servers, amounts transferred, type of traffic and trends over long periods of time, all very useful for network forensics and troubleshooting.
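To make the ‘focus on a specific link, then rank the top clients’ idea concrete, here is a minimal Python sketch. It is not how LANGuardian or any particular flow tool works; the `FlowRecord` fields, the link names and the sample addresses are all assumptions made up for illustration. It simply filters flow summaries down to one link and totals the bytes per client.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    """One hypothetical flow summary; field names are illustrative only."""
    link: str       # where the flow was observed, e.g. "site-a-lan"
    client: str     # client IP address
    server: str     # server IP address
    num_bytes: int  # bytes transferred in this flow


def top_clients(flows: Iterable[FlowRecord], link: str,
                n: int = 10) -> List[Tuple[str, int]]:
    """Return the n clients that moved the most bytes on a single link."""
    totals: Counter = Counter()
    for flow in flows:
        if flow.link == link:  # ignore traffic seen on other links
            totals[flow.client] += flow.num_bytes
    return totals.most_common(n)


# Made-up sample data: two clients busy on an internal link,
# plus one unrelated flow on the Internet link.
SAMPLE_FLOWS = [
    FlowRecord("site-a-lan", "10.0.5.21", "10.0.1.10", 4_000_000),
    FlowRecord("site-a-lan", "10.0.5.22", "10.0.1.10", 2_500_000),
    FlowRecord("site-a-lan", "10.0.5.21", "10.0.1.10", 5_000_000),
    FlowRecord("internet", "10.0.7.3", "203.0.113.9", 9_999_999),
]

if __name__ == "__main__":
    for client, total in top_clients(SAMPLE_FLOWS, "site-a-lan", n=2):
        print(f"{client}: {total} bytes")
```

The point of filtering by link first is the one made above: a device can saturate an internal link without any of that traffic ever showing up at the perimeter.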

Even with all the products available today, some companies do not monitor network usage or activity until there is a problem, maybe due to cost, complexity (of both networks and network management systems), budgets, or having to do more with less. Then they resort to their favourite tool, Google. I was talking to a security officer at a large US-based multinational last week and even he mentioned to me that Google was his favourite security tool.

We are seeing a trend, though: organisations are looking for more visibility on bandwidth consumption. They sometimes have flow tools, for example, that ALMOST give the answer. As a guy said to me at an RSA conference, ‘you have to look into the packets these days’. They want to be able to go deeper, get that final drill down to understand exactly what is going on, and have all the information and evidence required to solve the problem. Even getting a simple report like a list of the Top 10 domains for a time period is sometimes not that easy due to proxies, CDNs, so much traffic now tunnelled over port 80, etc.
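For what it is worth, the Top 10 report itself is the easy part once you have clean records; the hard part is getting the hostname out of the traffic in the first place. The Python sketch below assumes you already have `(timestamp, hostname, bytes)` tuples from somewhere, which is exactly the assumption that proxies and CDNs tend to break. The function names are hypothetical, and `naive_domain` just keeps the last two labels of the hostname, so it will get suffixes like `co.uk` wrong.

```python
from collections import Counter
from datetime import datetime
from typing import Iterable, List, Tuple

Record = Tuple[datetime, str, int]  # (timestamp, hostname, bytes)


def naive_domain(host: str) -> str:
    """Keep the last two labels of a hostname (a deliberate simplification)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def top_domains(records: Iterable[Record], start: datetime, end: datetime,
                n: int = 10) -> List[Tuple[str, int]]:
    """Total bytes per domain for records with start <= timestamp < end."""
    totals: Counter = Counter()
    for timestamp, host, num_bytes in records:
        if start <= timestamp < end:
            totals[naive_domain(host)] += num_bytes
    return totals.most_common(n)


# Made-up sample records using reserved example domains.
SAMPLE_RECORDS = [
    (datetime(2024, 1, 8, 9, 0), "www.example.com", 500),
    (datetime(2024, 1, 8, 9, 5), "cdn.example.com", 1500),
    (datetime(2024, 1, 8, 9, 10), "video.example.org", 1200),
    (datetime(2024, 1, 8, 11, 0), "www.example.net", 9999),  # outside window
]

if __name__ == "__main__":
    window_start = datetime(2024, 1, 8, 9, 0)
    window_end = datetime(2024, 1, 8, 10, 0)
    for domain, total in top_domains(SAMPLE_RECORDS, window_start, window_end):
        print(f"{domain}: {total} bytes")
```

Grouping `www.example.com` and `cdn.example.com` under one domain is the kind of roll-up that makes the report readable; a real tool would need a proper public suffix list rather than the two-label shortcut used here.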

The information is in the traffic, and all networks have traffic. If you can sniff it via a SPAN port or tap and present it at the right level, it can help with so many pains, including ‘the network is slow’. That is, if you can keep it simple, which is easier said than done.


