Tuesday, May 9, 2017

Nuances in the Audit Logs

In a previous post I discussed the benefit of the Windows Filtering Platform audit logs and how the Windows Firewall logs were not as useful because they did not include the process information with the log entry. Things have been swamped at work, so I am just now getting around to enhancing some of the alerting that is generated from these WFP event logs. I was excited to dig into this information, my imagination going wild with the idea that nearly every workstation on my network could serve as a sensor.

Using SIEM rules based off of data collected from a large volume of endpoints is one of my favorite ways to test a theory or a set of rules with low risk of impact. My initial thought was to detect and alert on traffic anomalies based off of the username. IT person #1 might need to be using PowerShell between workstations, but Phone Operator 599 does not. If I could alert on this, I thought maybe I could get to the point of writing rules to further limit what applications were talking back and forth on the network. Easy enough, the data is in the logs. To the logs!

“You keep using that word. I do not think it means what you think it means.”
– Inigo Montoya, The Princess Bride
I quickly found out that even though there is a user field in the log entry, it was blank... in every log entry I looked at. Not only was this a huge problem for my proposal for some really useful SIEM alerts, I actually had to go back to my previous post and edit it with a correction.

Strike 1. But wait, there's more...

Since I wasn't getting username information, it was time to move on to my next use case. I thought it would be really helpful to detect anomalous traffic between hosts. This would give visibility into traffic flows that I don't get from traditional sources like network boundary firewalls. I relished the idea of being able to use these logs to trigger on large volumes of addresses or ports scanned from a workstation. I even thought I could be sneaky and detect traffic leaving a compromised host: collect the logs in unexpected ways, forward them off box before the attacker knew what I was doing, and then use that data in creative ways.

I went about programming SIEM rules to pick up on a handful of scanning scenarios. These rules would be fairly easy to test too... I started off with a quick PowerShell "Test-NetConnection" and saw the results in the SIEM. Success!
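The rules themselves live in the SIEM and are not shown here. As a rough illustration of one scanning scenario, the sketch below flags a source host that touches more than a threshold number of distinct destination address/port pairs inside a sliding time window. The record type, field names, function name, and thresholds are all hypothetical stand-ins for whatever the SIEM parses out of the WFP events; this is not the rule logic actually used.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnEvent:
    """Hypothetical, already-parsed connection event (field names are assumptions)."""
    timestamp: float  # seconds since epoch
    src_host: str
    dest_ip: str
    dest_port: int


def find_scanners(events, window_seconds=60.0, max_targets=100):
    """Return source hosts that hit more than max_targets distinct
    (dest_ip, dest_port) pairs within any window_seconds span."""
    recent = defaultdict(deque)  # per host: (timestamp, target) pairs in the window
    flagged = set()
    for ev in sorted(events, key=lambda e: e.timestamp):
        window = recent[ev.src_host]
        window.append((ev.timestamp, (ev.dest_ip, ev.dest_port)))
        # Drop entries that have aged out of the sliding window.
        while window and ev.timestamp - window[0][0] > window_seconds:
            window.popleft()
        distinct_targets = {target for _, target in window}
        if len(distinct_targets) > max_targets:
            flagged.add(ev.src_host)
    return flagged
```

A rule like this only sees what the log source records, which matters for the results described next.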

I wanted to prove this out on a larger scale, so I fired up some quick nmap scans that would meet my scenarios and then waited for the alerts to fly. And waited... and waited. They didn't happen. I was getting alerts for other things, but nothing for my scans. After reviewing my alert logic I went straight for the events. There were a few UDP packets, but that was it. I looked at my nmap results and there were thousands of packets being sent... why the disconnect?

I dug into the Windows security log on the test machine and saw the exact same thing as in the SIEM. A few UDP packets, but nothing else. Where were the thousands of TCP connections? I knew for sure that TCP packets were leaving my test machine and data was coming back to populate my nmap scan, but why weren't the WFP logs showing this? Were the logs not logging what I thought they were logging? Inconceivable!

I decided to dig into what exactly constituted a connection and find out. After a bit more testing and searching, I stumbled across this page and the following quotes:
"ALE is a set of Windows Filtering Platform (WFP) kernel-mode layers that are used for stateful filtering.
"Stateful filtering keeps track of the state of network connections and allows only packets that match a known connection state.
"Filters in the ALE layers authorize inbound and outbound connection creation, port assignments, socket operations such as listen(), raw socket creation, and promiscuous mode receiving.
"Traffic at the ALE layers is classified either per-connection or per-socket operation. At non-ALE layers, filters can only classify traffic on a per-packet basis.
"ALE layers are the only WFP layers where network traffic can be filtered based on the application identity—using a normalized file name—and based on the user identity—using a security descriptor.
"For this reason, policies that enforce who (for example, "administrator") and/or which application (for example, "Internet Explorer") are allowed to perform the network operations mentioned above are authored at the ALE layers."
After reading this, things started to make a little more sense. When I executed nmap, it was running as an administrator and was configured to perform a TCP SYN scan. Because it was running as an administrator, nmap could create raw sockets, send only a SYN packet, and move on without completing the handshake. Since the handshake was never completed, a stateful "connection" was never made. I believe WFP auditing is based on the ALE layer information. If the TCP handshake isn't completed, a connection isn't made; if a connection isn't made, a WFP audit log entry isn't created; and if a WFP audit log entry isn't created, my super cool SIEM alerts never fire.

To further prove this, I re-ran the nmap scans with the "-sT" option, forcing it to use the OS stack and complete the handshake. My SIEM blew up with the alerts that I had configured. Things worked as expected.
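For comparison, here is a minimal sketch of what a connect-style scan looks like when it goes through the operating system's normal socket API, which is roughly what the "-sT" option asks nmap to do. It uses only Python's standard socket module; the function name connect_scan and the timeout value are hypothetical. Each probe asks the OS to perform the full TCP handshake, so, if the reading of the ALE documentation above is right, these are the kind of connection attempts that should show up in the WFP audit log.

```python
import socket


def connect_scan(host, ports, timeout=0.5):
    """Return the ports on host that accept a full TCP connection.

    Every probe uses the OS TCP stack (connect), not hand-built SYN packets.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

Calling connect_scan("127.0.0.1", range(20, 30)) on a test machine and then checking the Security log is a quick way to repeat the comparison without nmap. Only scan hosts you are authorized to test.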

I haven't yet found proof of this behavior beyond the events I have described. I have two theories to explain it:

  1. Since a TCP connection isn't fully established, the ALE layer doesn't classify a SYN scan as a connection and doesn't log it, but UDP and ICMP show every packet (or at a minimum the first packet in every sourceIP/port and destinationIP/port combo) because they are not stateful.
  2. The raw sockets somehow bypass the filtering drivers.
I am currently leaning towards the first theory.

I had a brief glimmer of hope that maybe the firewall logging that I disabled would provide different data and be more helpful - maybe it captured the Transport or Network layer data. But that doesn't appear to be the case. It was very similar to the auditing logs.

Somewhat deflated, I have had to temper my excitement for my network of sensors, at least for detecting outbound connection scenarios. When I tested Sysmon with the "-n" option, it appeared to have the same problem with outbound detections of SYN scans. I haven't yet verified all of the scenarios around inbound SYN scans with WFP Audit logs.

Well, at least there isn't an easy way to use TCP SYN to exfil data to hide it from my logs, like programs up to no good like this, or this, or standards that would allow anything to do it like this. :-( Looks like I have some testing to do to see if my network devices are picking up data in SYN packets. From a workstation perspective, it may be that the best option for this kind of data is in the massive data source known as ETW, but probably for other reasons.

I know that using logs from a machine that I am assuming is compromised is a weak and error-prone option, but I was hoping the element of surprise would work in my favor. It looks like I am headed back to the drawing board, with this data set limited to applications that play nice with the OS, which is a lot, but now has an important caveat. One more reason to limit administrative accounts for end users and patch to prevent privilege escalation, just in case you needed one.

I hope to move on from Windows Firewall with Advanced Security on my next series of blog posts and focus on Windows Event Forwarding or a recent adventure... AppLocker!

Until then, work hard and spend time with your family.
