The Problem: Tracking down the source of a network traffic alert
Anyone who has worked with Security Information and Event Management (SIEM) or Intrusion Detection/Prevention System (IDS/IPS) alerts knows how difficult it can be to track down the actual source of the network traffic. It becomes even more difficult when the goal is to find the machine that attempted to resolve a known bad or suspicious domain.
The Cause: DNS architecture
The Domain Name System (DNS) architecture is the primary reason why it can be so hard to find the machine attempting to resolve a bad or suspicious domain. Most organizations are Microsoft-based and rely on their domain controllers to perform DNS resolution. These domain controllers are configured to act as recursive resolvers, meaning they perform resolution on behalf of the client. Because of this, when you get a SIEM or IDS/IPS alert, the source IP address will generally belong to the domain controller. This causes problems, because it is usually not the domain controller that is infected, but some client behind it. Even if you are not a Microsoft-based organization, there is usually some form of recursive resolver in place, and it sits deeper in the network than the edge IDS/IPS that detects the activity.
Why do most organizations use a recursive resolver? The answer is simple: so the internet does not see all the internal client addresses as they resolve domains. Relying on NAT at the firewall instead, with every client resolving directly, also limits what the world is able to see, but it makes the firewall rules for DNS more difficult to manage. It also generates more network traffic, because clients no longer benefit from the cache that recursive servers generally keep. In addition, a recursive resolver puts more control around the name resolution service, which helps spot anomalies and control behavior.
The Solution: Implement DNS logging or a network architecture approach
So how do you get the internal visibility to see what the true source IP address is? I will describe two main approaches in detail below. The first is relatively simple: implement DNS logging. This scales well and integrates nicely with a SIEM. The second is a network architecture approach, which can be much more difficult to implement. Let me explain the benefits and drawbacks of both approaches.
The First Approach
The first method is to enable DNS logging. Before I go into the details, I need to address a MAJOR resource utilization concern. Until the release of Windows Server 2012, DNS logging was only intended for debugging purposes. It was not made to be turned on for extended periods, and it had a heavy resource impact on the server. While some organizations were able to successfully log all DNS traffic prior to Windows Server 2012, Microsoft does not recommend it. According to Microsoft, “Debug logging can be resource intensive, affecting overall server performance and consuming disk space. Therefore, it should only be used temporarily when more detailed information about server performance is needed.” Microsoft appears to have substantially reworked the DNS logging code, because in Windows Server 2012 and later, DNS analytic logging consumes fewer resources and can be run on a permanent basis, as stated in the performance considerations for DNS Logging and Diagnostics on Microsoft TechNet.
With all that out of the way, let’s dive in. Following the directions in Microsoft TechNet’s "Using server debug logging options", DNS debug logging can be enabled on Windows Server 2003 through 2008. This will log all requests and/or responses that the server handles. A separate TechNet page details how to enable DNS analytic logging on Windows Server 2012, which provides the same logging information as DNS debug logging. All the logs are written to a text file in the configured location. An agent such as Snare Epilog can then read this flat text file and send the information to a SIEM.
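To give a feel for what the true source looks like in these logs, here is a minimal Python sketch that pulls the client IP address and queried name out of a DNS debug log line. The line layout it expects (a PACKET entry with protocol, direction, remote IP, transaction ID, flags, record type, and a length-prefixed name such as (3)www(7)example(3)com(0)) is an assumption based on commonly seen debug log output, and it can vary by server version and locale. The function names are hypothetical.

```python
import re

# Assumed layout of a debug log PACKET entry (varies by version and locale):
#   <date> <time> <thread> PACKET <addr> UDP Rcv <ip> <xid> [R] Q [<flags>] <type> <name>
PACKET_RE = re.compile(
    r"PACKET\s+\S+\s+(?P<proto>UDP|TCP)\s+(?P<direction>Rcv|Snd)\s+"
    r"(?P<ip>\S+)\s+(?P<xid>[0-9A-Fa-f]{4})\s+(?P<response>R)?\s*Q\s+"
    r"\[[^\]]*\]\s+(?P<qtype>\S+)\s+(?P<name>\(\d+\)\S*)"
)

# Names are logged as length-prefixed labels, e.g. (3)www(7)example(3)com(0).
LABEL_RE = re.compile(r"\(\d+\)([^()]*)")


def decode_name(raw):
    """Turn '(3)www(7)example(3)com(0)' into 'www.example.com'."""
    labels = [label for label in LABEL_RE.findall(raw) if label]
    return ".".join(labels).lower()


def parse_client_query(line):
    """Return client, qtype, and name for a query the server received.

    Returns None for responses, packets the server sent, and any line
    that does not match the assumed layout.
    """
    match = PACKET_RE.search(line)
    if match is None:
        return None
    if match.group("direction") != "Rcv" or match.group("response"):
        return None
    return {
        "client": match.group("ip"),
        "qtype": match.group("qtype"),
        "name": decode_name(match.group("name")),
    }
```

Keeping only received, non-response packets is a deliberate choice: those are the entries where the remote IP address is the client asking the question rather than an upstream server answering it.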
If your organization is using a BIND server for DNS resolution, the logging functionality has been there for years. BIND has the ability to log all requests and/or responses to syslog. The ZyTrax page "DNS BIND9 logging clause" is an example of how to configure BIND for syslog. Then, the syslog configuration can be modified to send these logs to a SIEM.
While it is not necessary to send the information to a SIEM, doing so allows for correlation of the IDS/IPS alert with the DNS request/response log. If the correlation is not done in real time, it at least gives an easy way to search through the logs to find the source that is attempting resolution of the domain. Storing all the DNS logs in a SIEM also allows for reporting and trending of activity. This helps identify trends in the environment such as beaconing activity, busy time frames, and top domains being requested.
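As an illustration of what can be done with BIND query logs once they reach a collector, the following Python sketch extracts the client address and queried name from a query log message. The message layout is an assumption based on typical BIND 9 query log output (newer releases add an @0x... token after "client", and a "view" prefix may appear); check it against your own logs. The function name is hypothetical.

```python
import re

# Assumed BIND 9 query log layout, for example:
#   client 192.168.1.10#53201 (www.example.com): query: www.example.com IN A + (192.168.1.1)
# Newer versions may log "client @0x7f... 192.168.1.10#53201 ..." and "view <name>:".
QUERY_RE = re.compile(
    r"client\s+(?:@0x[0-9A-Fa-f]+\s+)?(?P<ip>[0-9A-Fa-f.:]+)#(?P<port>\d+)\s+"
    r"\((?P<orig>[^)]*)\):\s+(?:view\s+\S+\s+)?query:\s+"
    r"(?P<name>\S+)\s+(?P<qclass>\S+)\s+(?P<qtype>\S+)"
)


def parse_bind_query(line):
    """Return client, qtype, and name from a BIND query log line, or None."""
    match = QUERY_RE.search(line)
    if match is None:
        return None
    return {
        "client": match.group("ip"),
        "qtype": match.group("qtype"),
        # Normalize case and any trailing dot so names compare cleanly.
        "name": match.group("name").lower().rstrip("."),
    }
```

Using a search rather than a full-line match means the same function works whether the line carries BIND's own timestamp prefix or a syslog header.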
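A SIEM would normally do this correlation for you, but the logic is simple enough to sketch in Python. The example below assumes the DNS logs have already been parsed into (timestamp, client IP, queried name) tuples; that record shape, the function names, and the default thresholds are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import timedelta
from statistics import pstdev


def normalize(name):
    """Lowercase a name and drop any trailing dot."""
    return name.lower().rstrip(".")


def find_sources(records, domain, alert_time, window=timedelta(minutes=5)):
    """Clients that queried the domain (or a subdomain) near the alert time."""
    domain = normalize(domain)
    hits = set()
    for timestamp, client, qname in records:
        qname = normalize(qname)
        in_domain = qname == domain or qname.endswith("." + domain)
        if in_domain and abs(timestamp - alert_time) <= window:
            hits.add(client)
    return sorted(hits)


def top_domains(records, n=10):
    """Most frequently requested names, for reporting and trending."""
    return Counter(normalize(qname) for _, _, qname in records).most_common(n)


def is_regular(timestamps, max_jitter_seconds=5.0, min_queries=4):
    """Rough beaconing hint: queries spaced at nearly constant intervals."""
    if len(timestamps) < min_queries:
        return False
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return pstdev(gaps) <= max_jitter_seconds
```

Given an IDS/IPS alert for a domain at a known time, find_sources returns the internal addresses to investigate, while top_domains and is_regular cover the trending and beaconing use cases. The interval check is only a crude heuristic, since legitimate software also polls on fixed schedules.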
The key with DNS logging is to ensure you are logging on the servers that clients are using for resolution.
Whether your organization uses Microsoft, BIND, or some other commercial capability for DNS resolution, you need to be prepared for the large number of logs that are generated. A number of factors dictate how many logs are generated each day, but in any case the volume will be extremely high. Be sure the system storing the DNS logs has plenty of disk space. It is common to have 50-100 queries per second, with larger organizations having hundreds per second. Organizations need to have a proper log collection and retention policy if they are going to capture these logs.
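To put those query rates in terms of storage, here is a back-of-the-envelope calculation in Python. The 150 bytes per log record is purely an assumed average for illustration; measure a sample of your own logs before sizing anything. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def daily_log_volume(queries_per_second, bytes_per_record=150, log_responses=False):
    """Estimate (records per day, bytes per day) for a sustained query rate.

    bytes_per_record is an assumed average line length, not a measured value.
    """
    records = queries_per_second * SECONDS_PER_DAY
    if log_responses:
        # Logging responses as well roughly doubles the record count.
        records *= 2
    return records, records * bytes_per_record


def retention_bytes(queries_per_second, retention_days, bytes_per_record=150):
    """Estimate uncompressed storage needed to keep logs for retention_days."""
    _, per_day = daily_log_volume(queries_per_second, bytes_per_record)
    return per_day * retention_days
```

At a sustained 100 queries per second, this works out to 8,640,000 records per day, or about 1.3 GB per day uncompressed under the 150-byte assumption, and roughly 117 GB for 90 days of retention.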
The Second Approach
The second method to find the true source is to move an IDS/IPS behind the recursive DNS server, so that it sees the traffic between the clients and the server. The IDS/IPS should be running a rule set that looks for suspicious or known bad domains. For small organizations, this can be fairly simple: create a SPAN/monitor session of the network port(s) the DNS server is connected to, and connect those to the IDS/IPS. When the IDS/IPS alerts on a suspicious or known bad domain, it is close to the host and can provide the true source IP address of that host. As with any monitoring strategy, there are several methods used to capture the appropriate traffic, aggregate it, and pass it to the IDS/IPS for inspection. This may include the use of taps, RSPANs, and SPANs.
This method becomes very difficult for larger organizations that have several DNS servers dispersed throughout the environment. This is due to the location of the DNS servers logically and physically in the environment, the number of DNS servers, and the number of IDS/IPS appliances and open inspection ports. While it is still possible, it requires advanced monitoring strategies along with a balance of moving traffic around for inspection and alerting, and bandwidth availability.
One advantage of this approach is that the IDS/IPS logs can integrate directly with the SIEM. Another advantage is writing the rules to detect entire top-level domains that may be suspicious. With an IDS/IPS, you are able to write a wide variety of rules for detection that continue to give you the true source of the traffic.
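Real detection rules would be written in your IDS/IPS vendor's rule language, but the matching logic behind a "known bad domain or entire suspicious TLD" rule can be sketched in Python. The watchlist entries used in the note below are placeholders built on reserved names (example, invalid), not real indicators, and the function name is hypothetical.

```python
def matches_watchlist(qname, bad_domains, bad_tlds):
    """Return True if qname is under a listed domain or a listed TLD.

    bad_domains and bad_tlds are sets of lowercase names without dots
    at either end, e.g. {"bad.example"} and {"invalid"}.
    """
    name = qname.lower().rstrip(".")
    if not name:
        return False
    labels = name.split(".")
    # Whole-TLD match: the last label alone decides.
    if labels[-1] in bad_tlds:
        return True
    # Domain match: test every suffix so subdomains are caught too,
    # while "notbad.example" does not match "bad.example".
    for start in range(len(labels)):
        if ".".join(labels[start:]) in bad_domains:
            return True
    return False
```

For example, with bad_domains set to {"bad.example"} and bad_tlds set to {"invalid"}, the names c2.bad.example and anything.invalid match, while notbad.example does not. Comparing whole labels rather than raw string suffixes is what avoids that last false positive.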
The final advantage of this approach is that the activity can be stopped if the IPS is in blocking mode. Blocking the DNS query helps stop command and control from being established and keeps attackers from gaining control of your systems.
While there are variations on these two approaches, they are the two main ways to gain greater internal visibility of DNS requests. If you keep getting alerts for attempted connection to or resolution of suspicious domains, and can’t identify the true source machine, you may want to look into implementing one of these two methods. Doing so will decrease your investigation time and allow you to identify the machines that need remediation.