Extracting Metadata From PCAP Files
What our customers want to extract from PCAP files
In my previous blog post on the topic of PCAP file management, I looked at how you can import and export PCAP files from LANGuardian. Since then, a number of queries have come in from customers and website visitors. One in particular caught my eye. It came from a network engineer with very specific requirements. They have a large PCAP repository and need a tool to process these files and report on the following:
- Identify all the unique IP addresses involved in the PCAP, sources and destinations
- Identify the “big talkers”: which IPs account for sending and receiving the most traffic. Ideally a list of IPs, packets sent, and packets received, WITH the ability to sort by number of packets sent or received (that would show the big talkers)
- The types of traffic by protocol
- I need to have some ability to utilize an AV engine against the traffic. One way to do this with Wireshark is typically via tcpreplay: you set up a “clean system” with IDS enabled, then tcpreplay the suspect packets to it and watch its alarms/logs.
Why can’t you use Wireshark for analyzing PCAP files?
Wireshark is an excellent tool and I use it a lot myself. The features I use most are packet analysis and the Follow TCP Stream option. What it is not good at is giving you a top-level view: a summary of what went on, with drill-down capability to get to the detail. Another limitation is that it cannot cross-reference the data in the packets with threat databases or IDS signatures.
Wireshark also does not work well with large network capture files, although turning off all packet coloring rules can improve performance. Some of the most interesting network data can be sourced from a SPAN or mirror port, but these data sources produce large PCAP files.
Identify all the unique IP addresses involved in the PCAP, sources and destinations
Every IP packet in a PCAP file contains a source and a destination IP address. A modest-sized PCAP could contain thousands of addresses, so you need a quick and efficient way to capture these and store them in a database.
Wire data analytics often refers to the process in which metadata such as IP addresses is extracted from PCAP files, or directly from the network when you monitor traffic from a SPAN or mirror port. The image below shows a sample of the network inventory information that LANGuardian can extract from a PCAP file. Click on the image to access this report in our online demo.
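If you want to experiment with this kind of extraction yourself, the following is a minimal Python sketch. It is not how LANGuardian works internally. It assumes a classic PCAP file (not pcapng) with Ethernet framing, and it only looks at untagged IPv4 packets, so VLAN-tagged and IPv6 traffic is skipped. The function names `iter_ipv4_pairs` and `unique_ips` are my own, chosen for illustration.

```python
import ipaddress
import struct


def iter_ipv4_pairs(pcap_bytes):
    """Yield (src, dst) strings for each IPv4-over-Ethernet packet."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written in big-endian byte order
    else:
        raise ValueError("not a classic PCAP file (pcapng is not handled)")

    # The link type is the last field of the 24-byte global header.
    (linktype,) = struct.unpack(endian + "I", pcap_bytes[20:24])
    if linktype != 1:
        raise ValueError("only Ethernet captures (link type 1) are handled")

    offset = 24
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured length, original length.
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16]
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len

        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        # IPv4 source is at bytes 12-15 of the IP header, destination at 16-19.
        if len(frame) >= 34 and frame[12:14] == b"\x08\x00":
            src = ipaddress.IPv4Address(frame[26:30])
            dst = ipaddress.IPv4Address(frame[30:34])
            yield str(src), str(dst)


def unique_ips(pcap_bytes):
    """Return (set of source IPs, set of destination IPs)."""
    sources, destinations = set(), set()
    for src, dst in iter_ipv4_pairs(pcap_bytes):
        sources.add(src)
        destinations.add(dst)
    return sources, destinations
```

To try it, read a capture into memory with something like `unique_ips(open("capture.pcap", "rb").read())`, where `capture.pcap` is a placeholder file name. Reading the whole file at once is fine for small captures, but a large SPAN port capture would need to be read in chunks.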
Identify the “big talkers”: which IPs account for sending and receiving the most traffic
Some time ago I spoke to a LANGuardian customer who had just purchased the system for a client. They had found us while searching on the Internet for a tool which would “analyze PCAP files that I had collected from a customer’s network that was struggling with VOIP quality issues and massive bandwidth utilization”.
They also reported that “While I read and understand PCAP files fairly well, when it came time to analyze the data and determine who my top talkers are I was at a loss.” This is one of the big problems with tools like Wireshark: it can be hard to get summary information such as who the top talkers on the network are.
Our customer installed LANGuardian and within a short period reported that “The LANGuardian software quickly pointed out several computers that were flooding the network with data and a network switch that was faulty. Our customer mitigated those problems and has had great VOIP quality and lower total bandwidth utilization on their LAN and WAN.”
The image below shows the output of the LANGuardian Top Talkers by Traffic Volume report. Click on the image to access this report in our online demo.
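To make the idea of a sortable top talkers list concrete, here is a small Python sketch of the counting involved. It is an illustration only and not the LANGuardian report logic. It assumes you already have one `(source IP, destination IP)` pair per packet, and the function name `top_talkers` and its parameters are hypothetical. It counts packets, as in the engineer's request. Ranking by traffic volume would mean summing packet lengths instead.

```python
from collections import Counter


def top_talkers(packets, sort_by="sent", limit=10):
    """Rank IPs by packets sent or received.

    packets: iterable of (src_ip, dst_ip) pairs, one pair per packet.
    Returns a list of (ip, packets_sent, packets_received) rows.
    """
    if sort_by not in ("sent", "received"):
        raise ValueError("sort_by must be 'sent' or 'received'")

    sent, received = Counter(), Counter()
    for src, dst in packets:
        sent[src] += 1
        received[dst] += 1

    # One row per address seen in either direction; a Counter returns 0
    # for addresses that never sent (or never received) a packet.
    rows = [(ip, sent[ip], received[ip]) for ip in set(sent) | set(received)]

    # Sort descending on the chosen column, breaking ties by address
    # so the output order is stable.
    column = 1 if sort_by == "sent" else 2
    rows.sort(key=lambda row: (-row[column], row[0]))
    return rows[:limit]
```

Calling the function twice, once with `sort_by="sent"` and once with `sort_by="received"`, gives the two views the engineer asked for.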
The types of traffic by protocol
Protocol recognition is the art and science of identifying the applications that are in use on a network and understanding the impact of each application in terms of bandwidth usage, user behavior, security, and compliance.
The LANGuardian content-based application recognition (CBAR) approach combines a unique deep packet inspection algorithm with a detailed understanding of the underlying protocols. Unlike other traffic monitoring technologies such as NetFlow, which analyzes packet headers only, LANGuardian CBAR analyzes entire traffic packets and inspects their content.
By inspecting the packet content in addition to the header, LANGuardian CBAR can see past the port and address information to identify the application and/or protocol that generated the packet. The image below shows the output of the LANGuardian Applications in Use report, which lists the top protocols found in a PCAP file ordered by total bandwidth. Click on the image to access this report on our online demo.
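The difference between header-based and content-based recognition can be shown with a toy Python example. To be clear, this is not the CBAR algorithm, which is far more thorough. It is a rough sketch that checks the first bytes of a payload for three well-known patterns. The port table and the function names are assumptions made for illustration.

```python
# Header-only view: guess the protocol from the destination port.
PORT_GUESS = {22: "SSH", 80: "HTTP", 443: "TLS"}

HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")


def classify_by_port(dst_port):
    """Guess the protocol from the port number alone."""
    return PORT_GUESS.get(dst_port, "unknown")


def classify_by_content(payload):
    """Guess the protocol from the first bytes of the payload."""
    # HTTP requests start with a method; responses start with "HTTP/1.x".
    if payload.startswith(HTTP_METHODS) or payload.startswith(b"HTTP/1."):
        return "HTTP"
    # SSH peers open with an identification string such as "SSH-2.0-...".
    if payload.startswith(b"SSH-"):
        return "SSH"
    # A TLS handshake record starts with type 0x16 and major version 0x03.
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "TLS"
    return "unknown"
```

A web server listening on port 8080 shows why the content matters: `classify_by_port(8080)` has nothing to go on, while `classify_by_content(b"GET /index.html HTTP/1.1\r\n")` still recognizes the request as HTTP.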
Ability to utilize an AV engine against the traffic
When I first read this requirement I was confused. How could we replay traffic against an antivirus engine? Most antivirus systems run as a service and may check memory, disk, and other data sources for the presence of malware or viruses. I checked back, and what they meant was whether we could run the contents of the PCAP files past an IDS or threat database.
In the case of LANGuardian, we have both an IDS and a traffic analysis module running in parallel. When you import a PCAP file, the contents are sent to each analysis engine, where they are checked for signs of suspicious content. You can also write your own IDS signatures to search for specific text strings within the PCAP files. The video below goes through the process of creating a custom IDS signature to check for the presence of a text string.
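The idea behind a text string signature can be sketched in a few lines of Python. This is a simplified illustration rather than the LANGuardian IDS or its signature syntax. It assumes each packet has already been reduced to a `(source IP, destination IP, payload bytes)` record, and the function name `find_text` is hypothetical. It checks each payload on its own, so a string split across two packets would be missed. A real IDS reassembles streams to avoid that.

```python
def find_text(records, text, ignore_case=True):
    """Return (index, src_ip, dst_ip) for each payload containing text.

    records: iterable of (src_ip, dst_ip, payload_bytes) tuples.
    """
    needle = text.encode("utf-8")
    if ignore_case:
        needle = needle.lower()

    hits = []
    for index, (src, dst, payload) in enumerate(records):
        # bytes.lower() only affects ASCII letters, which is enough
        # for a simple case-insensitive match on plain-text strings.
        haystack = payload.lower() if ignore_case else payload
        if needle in haystack:
            hits.append((index, src, dst))
    return hits
```

Each hit gives you the position of the packet in the capture plus the two addresses involved, which is the same kind of drill-down detail you would want from an alert.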
If you have any questions about how to monitor traffic on your network using LANGuardian, or would like to know more about how our network traffic monitoring tool can meet your organization’s requirements, do not hesitate to contact us and speak with a member of our technical support team.