Why are data reduction and metadata important for SMEs?
What is Metadata?
I attended a security conference in Washington DC recently with some engineers who worked for a large enterprise partner. These guys have considerable expertise in SIEM and log data so the subject of ‘big data’ came up a lot, particularly when we were exploring how the combination of wire and log data in a single system could benefit both IT security and network operations.
- By monitoring, recording and analysing traffic at critical points across a network, one can use the data to troubleshoot a variety of IT security and operational use cases: find out what users are actually doing and how critical resources (servers, applications and bandwidth) are used. The detail provided by examining both the packet headers AND contents, not just the headers, lets users drill down very quickly to the root cause and view the granular information needed to really understand the problem. One example is the information required to prove that the cause is NOT the network but the large ISO the user had been copying across a WAN link.
- Combining this information with log data collected from critical servers, security appliances and network systems in one system leads to the ‘holy grail’ – ‘network aware’ data grabbed off the wire AND log data in one searchable database, all the detail and insight required always available through a ‘single pane of glass’.
Both sources of data – wire and log – really complement each other, enabling users to pivot on an IP or user name and really see data in context.
Our discussion was enterprise focussed, and the single term that kept popping up was ‘data reduction’. This is relevant because capturing all this data, traffic and logs, and storing it all in ONE location will result in unbelievable levels of insight but also a LOT of data. The hardware required to store and index it for even a few days could be very expensive, not to mention the technical expertise needed to interpret and understand it. It is crucial to be able to see ‘the wood for the trees’ and quickly understand the data you are looking at.
We have also heard from many customers that tools based on NetFlow are very useful for troubleshooting, bandwidth issues for example, but in many use cases lack the drill-down and detail needed to ‘find that smoking gun’: the name of the user who deleted or moved a folder, or who downloaded the customer database before resigning. That level of detail is what user activity monitoring and troubleshooting depend on. But we also sometimes hear ‘DPI (Deep Packet Inspection) based solutions are complex and expensive and there is nothing better out there?’
One option is to NOT store every single packet, but only the most important and useful information: the actionable data, the metadata. This is easier said than done. How does one predict the future and decide which information or metadata to retain, which packet data will be useful later? Not to mention accurately detecting the application in order to reassemble the stream, then extract and store only the useful detail. For a user downloading something from the Internet, for example, this metadata would include the user name, domain name, the actual page or URI accessed, the video watched, the date and time, and the bandwidth consumed. So one could go back and say, do you know how much bandwidth that training video you were watching consumed on that link?
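To make the idea concrete, here is a minimal sketch of what one reduced web-activity record and a bandwidth question over it might look like. The `WebMetadata` schema, the field names and the sample values are all hypothetical, chosen only to mirror the fields listed above; they are not LANGuardian's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class WebMetadata:
    """One reduced record per web request (hypothetical schema for illustration)."""
    timestamp: datetime
    user: str
    domain: str
    uri: str
    bytes_transferred: int


def bandwidth_by_user_and_uri(records):
    """Total the bytes each user consumed per (user, domain, URI)."""
    totals = defaultdict(int)
    for r in records:
        totals[(r.user, r.domain, r.uri)] += r.bytes_transferred
    return dict(totals)


# Invented sample records: two requests for the same training video, one page view.
records = [
    WebMetadata(datetime(2015, 3, 2, 9, 15), "jsmith", "video.example.com",
                "/training/intro.mp4", 250_000_000),
    WebMetadata(datetime(2015, 3, 2, 9, 40), "jsmith", "video.example.com",
                "/training/intro.mp4", 150_000_000),
    WebMetadata(datetime(2015, 3, 2, 10, 5), "mjones", "www.example.org",
                "/index.html", 48_000),
]

totals = bandwidth_by_user_and_uri(records)
for (user, domain, uri), total in totals.items():
    print(f"{user} {domain}{uri}: {total} bytes")
```

A few small records like these stand in for what could be hundreds of megabytes of raw packets, yet still answer the "how much bandwidth did that video consume" question.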
NetFort solved this by listening to and working with our customers. For example, a few years ago a UK financial customer said:
‘Using the LANGuardian, I can see a user copying a large amount of data from an internal Windows file share. I can see the IP address, user name, source and destination ports, time and amount of data but I also really need to see the actual file names, is that possible?’
We started looking at the SMB protocol, broke it down and developed a follower or dissector. Now, for certain critical protocols, we can accurately identify, follow and reassemble them to extract the critical detail, the NetFort metadata, and store it, all in real time. Then we focussed on developing a ‘Google type search’ GUI to make it easy to enter a query and search through all this data. This was, and is, probably more difficult than writing the protocol decoder; usability is a huge challenge.
It is also important to note that, because this metadata contains rich granular information extracted from the packet contents and is stored in a built-in database for long periods, it can be very useful for many security and user related use cases and for network forensics. Granted, there are also use cases where having access to ALL the packet contents is critical to complete the picture or for evidence. Maybe the ideal combination is full packet capture data retained for short periods (hours, days) and metadata retained for weeks and months?
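That split could be expressed as a simple two-tier retention policy. The sketch below is only an illustration: the tier names, the 2-day and 90-day windows and the `prune` helper are assumptions made up for the example, not settings from any NetFort product.

```python
from datetime import datetime, timedelta

# Hypothetical retention windows: short for bulky full packet capture,
# long for compact metadata. The exact values are assumptions.
RETENTION = {
    "full_packets": timedelta(days=2),
    "metadata": timedelta(days=90),
}


def is_expired(kind, captured_at, now):
    """Return True when a stored item is older than its tier's retention window."""
    return now - captured_at > RETENTION[kind]


def prune(items, now):
    """Keep only the (kind, captured_at) items still inside their retention window."""
    return [(kind, ts) for kind, ts in items if not is_expired(kind, ts, now)]


now = datetime(2015, 6, 1, 12, 0)
items = [
    ("full_packets", now - timedelta(days=3)),   # too old for the packet tier
    ("metadata", now - timedelta(days=3)),       # same age, still kept as metadata
    ("metadata", now - timedelta(days=120)),     # too old even for metadata
    ("full_packets", now - timedelta(hours=5)),  # recent capture, kept
]
kept = prune(items, now)
```

The same three-day-old event is dropped from the packet store but remains searchable as metadata, which is the trade-off described above.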
This detail or metadata is protocol specific: for Windows file shares it includes the file name and action, for MS SQL the SQL query, and so on. The effort required to develop each dissector depended on the protocol (SMB V2 was not trivial, for example), but for both our customers and NetFort it was definitely worth it. All because of ‘data reduction’ and building the intelligence into the software. Less is more and everybody wins. Now our customers get metadata they can understand, retain for long periods and use for many issues including network operations, IT security and network/user forensics. NetFort get a new large sector to target, SMEs, because they now have the option of an affordable network activity monitoring and forensics solution, a single reference point containing very useful granular information they can understand and act on.