This post was originally published here by Ryan Nolette.
This post is a quick overview of how I use Bro IDS for threat hunting.
Specifically:
- Example queries I run when I start a hunt, by specific data set
- Examples of Risk Trigger templates customized for my organization’s environment
- An example of a threat hunt I performed
- An illustration of how that threat hunt maps to the Threat Hunting Framework
- A few ideas for other hunts you can do
What is Bro?
“Bro is an open source Unix based network monitoring framework. Often compared to a network intrusion detection system (NIDS), Bro can be used to build a NIDS but is much more. Bro can also be used for collecting network measurements, conducting forensic investigations, traffic baselining and more. Bro has been compared to tcpdump, Snort, netflow, and Perl (or any other scripting language) all in one. It is released under the BSD license.” – Wikipedia
Basically, Bro is a protocol analyzer. It accepts network events from a PCAP file or a live traffic feed, watches the traffic, and parses out individual protocols such as RDP, FTP, HTTP, and many more into individual log files. One of Bro’s largest strengths is its ability to turn network events into actionable, useful metadata. That metadata provides us with context, which is the key to finding potential threats quickly.
Bro log equivalents for normalized data sources:
Standard data source | Bro equivalent |
Firewall | Bro Conn |
NetFlow | Bro Conn |
Proxy | Bro HTTP |
MS IIS | Bro HTTP |
MS DNS Debug | Bro DNS |
MS DNS Analytics | Bro DNS |
MS DHCP | Bro DHCP |
SSHD | Bro SSH |
MySQL Server | Bro MySQL |
MS Message Tracking | Bro SMTP |
Logical Topology
Threat Hunting With Bro
This image shows a completed Threat Hunt for Lateral Movement using netflow and Windows authentication logs.
Creating relationship between disparate data sets
The above image displays a high level visualization of all the relationships I have created between different entities within my data. This high level view allows me to quickly and easily see how my different types of data are connected and allows me to setup a game plan for a linked analysis hunt.
The above visualization expands on the high level visualization and illustrates all of the low level field mappings between each data source being ingested and each entity defined. This is where the actual mapping happens, for example the mapping of the IP address field between authentication data and network data.
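To make the field-mapping idea concrete, here is a minimal sketch in Python of linking two data sets through a shared IP address entity. The record layouts and field names (`orig_h`, `src_ip`, `user`) are simplified assumptions for illustration, not the actual schema of any product or of my environment.

```python
from collections import defaultdict

# Hypothetical, simplified records. Real Bro conn logs and Windows
# authentication events carry many more fields than shown here.
conn_records = [
    {"orig_h": "192.168.1.100", "resp_h": "192.168.1.104", "resp_p": 445},
    {"orig_h": "192.168.1.50", "resp_h": "10.0.0.8", "resp_p": 80},
]
auth_events = [
    {"src_ip": "192.168.1.100", "user": "alice", "logon_type": 3},
    {"src_ip": "192.168.1.77", "user": "bob", "logon_type": 2},
]


def link_by_ip(conns, auths):
    """Group network and authentication records under a shared IP entity."""
    entities = defaultdict(lambda: {"conns": [], "auths": []})
    for conn in conns:
        entities[conn["orig_h"]]["conns"].append(conn)
    for auth in auths:
        entities[auth["src_ip"]]["auths"].append(auth)
    # Keep only IPs present in both data sets: these are the linked entities.
    return {ip: rec for ip, rec in entities.items()
            if rec["conns"] and rec["auths"]}


linked = link_by_ip(conn_records, auth_events)
for ip, rec in linked.items():
    users = sorted({a["user"] for a in rec["auths"]})
    print(ip, "users:", users, "connections:", len(rec["conns"]))
```

Once both data sets hang off the same entity, a question such as "which accounts authenticated from the host that opened this SMB connection?" becomes a simple lookup.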
Example hunt
The example below illustrates how to use the hypotheses laid out above with the data and techniques enumerated.
Lateral Movement (Windows Environment)
What are you looking for? (Hypothesis)
Hypothesis:
Attackers may be attempting to move laterally in my Windows environment by leveraging PsExec. Look for:
Investigation (Data)
Datasets:
For identifying use of PsExec, you will want to focus primarily on application protocol metadata, including:
Uncover Patterns and IOCs (Techniques)
Inform and Enrich Analytics (Takeaways)
The destination IP addresses, paths, and ports involved in the Lateral Movement activity you have discovered can be taken as IOCs and added to an indicator database in order to expand automated detection systems.
You can also create packet-level signatures to trigger alerts for cases where the admin share connections you have discovered may appear again.
Keep in mind that for any hunt, there are multiple paths a hunter can take to address a given hypothesis.
Alert or Analytics Driven hunt
One technique to detect and alert on PsExec activity with Bro is to use custom Bro scripts that look for PsExec’s use of the C$, ADMIN$, and/or IPC$ shares. When an untrusted host accesses one of these shares, the script adds a notice message of “Potentially Malicious Use of an Administrative Share” to the Bro Notice log. The use of PsExec also creates an executable named PSEXESVC.exe on the target system.
PsExec is a Windows administration tool used to connect to different systems on a network via SMB, using administrative credentials. SMB is legitimately used to provide file sharing functionality; however, misconfigurations can allow malware to propagate throughout a network. Combine PsExec with the password theft abilities of Mimikatz and you have a recipe for lateral movement.
Detecting PsExec Activity Using Bro
Modified code for my usage. Code is originally from here.
@load base/frameworks/files
@load base/frameworks/notice
@load policy/protocols/smb

export {
    # Notice type raised when an untrusted host touches an administrative share.
    redef enum Notice::Type += { Match };

    # Hosts allowed to use administrative shares; redefine for your environment.
    global trustedIPs: set[addr] = { 192.168.1.1, 192.168.1.10 } &redef;
}

# Returns T when the source address is a trusted administrative host.
function hostAdminCheck(sourceip: addr): bool
    {
    return sourceip in trustedIPs;
    }

# SMB2: raise a notice when an untrusted host requests an administrative share.
event smb2_tree_connect_request(c: connection, hdr: SMB2::Header, path: string)
    {
    if ( ! hostAdminCheck(c$id$orig_h) )
        {
        if ( "IPC$" in path || "ADMIN$" in path || "C$" in path )
            NOTICE([$note=Match, $msg="Potentially Malicious Use of an Administrative Share", $sub=path, $conn=c]);
        }
    }

# SMB1: same check for the older protocol version.
event smb1_tree_connect_andx_request(c: connection, hdr: SMB1::Header, path: string, service: string)
    {
    if ( ! hostAdminCheck(c$id$orig_h) )
        {
        if ( "IPC$" in path || "ADMIN$" in path || "C$" in path )
            NOTICE([$note=Match, $msg="Potentially Malicious Use of an Administrative Share", $sub=path, $conn=c]);
        }
    }
Detection of PsExec traffic via a Bro network sensor
In addition to remote control via SMB with PsExec, attackers will upload other binaries to the victim system or use additional Meterpreter modules. For example, Mimikatz, a tool used to dump passwords from memory, can be uploaded to a remote system via the C$, ADMIN$, and IPC$ shares. Bro can detect Mimikatz being transferred over SMB and check its hash against VirusTotal.
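If you don’t dump your Bro logs into a SIEM or other log aggregation platform, I suggest a simple grep command to search for PsExec usage traffic: grep -iE 'C\$|ADMIN\$|IPC\$'. The dollar signs are escaped because an unescaped $ is an end-of-line anchor in a regular expression, and the single quotes keep the shell from interpreting them.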
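If you would rather script the search than use grep, the same idea can be sketched in Python. This is a minimal example under assumptions: the sample lines are made up and do not reproduce the real layout of any Bro log, and the function name is hypothetical.

```python
import re

# Match the administrative share names as literal text. The dollar sign is
# escaped because an unescaped "$" is an end-of-line anchor in a regex.
# Note that "C\$" also matches the tail of "IPC$" and of any share name
# ending in "C$", so expect some noise to review by hand.
ADMIN_SHARE = re.compile(r"(?:ADMIN|IPC|C)\$", re.IGNORECASE)


def find_admin_share_lines(lines):
    """Return the lines that mention an administrative share."""
    return [line for line in lines if ADMIN_SHARE.search(line)]


# Made-up sample lines for illustration only.
sample = [
    r"192.168.1.100 192.168.1.104 \\192.168.1.104\ADMIN$",
    r"192.168.1.100 192.168.1.104 \\192.168.1.104\public",
    r"192.168.1.20 192.168.1.104 \\192.168.1.104\ipc$",
]

hits = find_admin_share_lines(sample)
for line in hits:
    print(line)
```

In practice you would pass an open log file instead of the sample list, since iterating over a file yields its lines.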
Finally, with all the hard work done by that Bro script, we are able to visualize PsExec being used to move from 192.168.1.100 to 192.168.1.104, and to incorporate a Bro alert for PsExec that further validates the relationship between the hosts. While this image shows only the exact activity I am describing for ease of reading, this is what I expect the end result of a hunt to look like: all additional data and possible connections have been investigated and excluded from the original large data set until you are left with only the anomalous, suspicious, or malicious event. I consider this a successful hunt. I would also have considered it a success if I had found nothing at all, because the point of a hunt isn’t to find a true positive malicious event every time; it is to validate a hypothesis, to answer a question with a definitive yes or no. Good luck to all and happy hunting.
Intelligence/TTP driven hunting
Intelligence driven hunts are created from threat intelligence reports, threat intelligence feeds, malware analysis, vulnerability scans, and other trusted sources.
For this example, we are going to hunt on HTTP User-Agents.
HTTP User-Agent Analysis
Background and Purpose
I want to identify malware by analyzing the User-Agent strings it leverages.
User-Agent (UA) strings are used to identify applications or services that perform HTTP requests. Similar to legitimate applications, HTTP-based malware may use distinct UA strings to identify itself to a command and control (C2) server; malware may also use common UA strings (e.g., UA strings used by legitimate web browsers) in order to blend in with normal web traffic.
The process described herein may also be used to analyze other HTTP headers and values, but User-Agents are among the most commonly used for hunting and detective measures.
Hypothesis
HTTP-based malware may use distinct UA strings during the C2 phase. If we analyze UA strings seen in our network and look for outliers, then we may find malware.
The assumption made in this type of analysis is that the activity in question will not be “normal” or overly prevalent on the active network. Ideally, this will lead to identification of malicious or otherwise prohibited activity that was missed by other detection mechanisms; the UAs involved can then be used to detect that activity in the future.
Data Required
- HTTP proxy data
- list of known-bad UAs (either external threat intelligence or internal threat intelligence)
- HTTP requests
  - This hunt requires metadata that contains HTTP requests.
  - This data should include
    - the UA string used in the HTTP requests
    - the source (endpoint, user, or IP address) of the HTTP request
    - the URI requested in the HTTP request.
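If your HTTP metadata comes straight from Bro's tab-separated log files rather than a database, the fields listed above can be pulled out with a few lines of Python. This is a minimal sketch: it assumes the default tab separator, the default "-" marker for unset fields, and a `#fields` header line. The sample rows are made up.

```python
def parse_bro_log(lines):
    """Yield one dict per record from a tab-separated Bro log."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # The header line names the columns for the rows that follow.
            fields = line.split("\t")[1:]
        elif line and not line.startswith("#") and fields:
            values = line.split("\t")
            # "-" is assumed to mark an unset field, so map it to None.
            yield {name: (value if value != "-" else None)
                   for name, value in zip(fields, values)}


# Made-up sample with only the columns this hunt needs.
sample_log = [
    "#separator \\x09",
    "#fields\tts\tid.orig_h\turi\tuser_agent",
    "1500000000.0\t192.168.1.100\t/index.html\tMozilla/5.0",
    "1500000001.0\t192.168.1.104\t/gate.php\t-",
    "#close\t2017-07-14-00-00-00",
]

rows = list(parse_bro_log(sample_log))
for row in rows:
    print(row["id.orig_h"], row["uri"], row["user_agent"])
```

Each resulting record carries the source, the requested URI, and the UA string, which is everything the analysis techniques below need.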
Analysis Techniques
- Stack counting
- String matching
- Tokenization
- Outlier detection
Define Your Data Set
For this hunt, the data set is a set of UA strings used in outbound HTTP requests. Identification of these UA strings may vary from network to network; however, it is recommended to start with a larger set of data (e.g., all UA strings or a specific type of UA string) and reduce the size of the set as required by the results of the hunt.
Any UA string seen in an HTTP request can be considered for inclusion in the data set. However, you may want to consider defining the activity group based upon known legitimate UA strings. By filtering out these strings, outliers will be more noticeable. (However, keep in mind that filtering out legitimate UA strings will not help you identify attackers who are maliciously using legitimate UA strings!) There are several resources online for identifying common UA strings; here is one.
A query like this can be used to identify all UA strings:
SELECT user_agent, count(*) AS count FROM BroHttp WHERE user_agent IS NOT NULL GROUP BY user_agent ORDER BY count DESC
A query like this, which filters legitimate UA strings out of the results, is one method for reducing your data set:
SELECT user_agent, count(*) AS count FROM BroHttp WHERE user_agent IS NOT NULL AND user_agent NOT IN ('<UA string 1>', '<UA string 2>', '<UA string 3>', …) GROUP BY user_agent ORDER BY count DESC
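For readers without a SQL interface over their Bro data, the two stacking queries above can be approximated in Python. This is a sketch only; the record layout and the known-good set are assumptions for illustration.

```python
from collections import Counter


def stack_user_agents(records, known_good=frozenset()):
    """Count occurrences of each UA string, most common first.

    UA strings in known_good are dropped, mirroring the NOT IN filter.
    """
    counts = Counter(
        r["user_agent"]
        for r in records
        if r.get("user_agent") and r["user_agent"] not in known_good
    )
    return counts.most_common()


# Made-up sample records.
records = [
    {"user_agent": "Mozilla/5.0"},
    {"user_agent": "Mozilla/5.0"},
    {"user_agent": "Mozilla/5.0"},
    {"user_agent": "curl/7.58.0"},
    {"user_agent": None},
    {"user_agent": "evil"},
]

stacked = stack_user_agents(records)
filtered = stack_user_agents(records, known_good={"Mozilla/5.0"})
print(stacked)
print(filtered)
```

Reading the stacked output from the bottom up surfaces the rarest strings first, which is usually where the hunt starts.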
If you’ve reached the point where you’re routinely running hunts in order to iteratively determine what might be anomalous on the network, you’ll likely find that adjusting timeframes for queries is of use. For instance, if the UA report is run weekly, you may want to pull only one week of data for your report. It can also be worthwhile to compare those results with the results of prior weeks over time.
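The week-over-week comparison can be as simple as a set difference. The sketch below assumes you have already collected the distinct UA strings for each weekly report; the function name and sample values are hypothetical.

```python
def new_user_agents(current_week, prior_weeks):
    """Return UA strings seen this week that never appeared in prior weeks."""
    seen_before = set().union(*prior_weeks)
    return sorted(set(current_week) - seen_before)


# Made-up weekly reports.
week_1 = ["Mozilla/5.0", "curl/7.58.0"]
week_2 = ["Mozilla/5.0", "Wget/1.19"]
week_3 = ["Mozilla/5.0", "curl/7.58.0", "updater"]

first_seen = new_user_agents(week_3, [week_1, week_2])
print(first_seen)
```

A UA that shows up for the first time is not automatically malicious, but it is a cheap candidate list to review each week.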
Identify Candidates
Depending on your data set, create a query that returns the UA strings you are interested in. This may be necessary in cases where there are too many results from a data-type wide search. One method of isolating UA strings is to look for uncommonly short or long strings, since most UA strings follow a relatively standard format. To do this, we first have to identify a common UA string length. This query can do that:
SELECT avg(char_length(user_agent)) FROM BroHttp
Next, we need to isolate the results using the average length data from the previous query. A simple way to do this is to look for any UA string that is, for example, shorter than the average length, like this:
SELECT user_agent, char_length(user_agent) AS ua_len FROM BroHttp WHERE char_length(user_agent) < 66 GROUP BY user_agent, ua_len ORDER BY ua_len ASC LIMIT 20
OR
select * from BroHttp where length(user_agent) < 10 limit 20
OR
MATCH UserAgent AS entity FROM CounterOps WHERE len(entity.instance_id()) < 10 limit 20
We limit the results to 20 so that we can quickly confirm that the query returns the kind of data we expect.
At this point, we need to expand the LIMIT in the queries above and review the results to identify investigation candidates.
SELECT user_agent, char_length(user_agent) AS ua_len FROM BroHttp WHERE char_length(user_agent) < <average UA length> GROUP BY user_agent, ua_len ORDER BY ua_len ASC LIMIT 1000
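The same length-based isolation can be sketched in Python: compute the average length over all observed requests, then list the distinct UA strings that fall below it, shortest first. The sample data and function name are assumptions for illustration.

```python
from statistics import mean


def short_user_agents(user_agents, limit=1000):
    """Return (average length, distinct UAs shorter than average)."""
    observed = [ua for ua in user_agents if ua]
    if not observed:
        return 0.0, []
    # Average over every request, as avg(char_length(...)) does in SQL.
    avg_len = mean(len(ua) for ua in observed)
    short = sorted(
        {ua for ua in observed if len(ua) < avg_len},
        key=lambda ua: (len(ua), ua),
    )
    return avg_len, [(ua, len(ua)) for ua in short[:limit]]


# Made-up sample: one common browser UA and two short outliers.
observed_uas = ["Mozilla/5.0 (Windows NT 10.0; Win64; x64)"] * 3 + [
    "a",
    "curl/7.58.0",
]

avg_len, candidates = short_user_agents(observed_uas)
for ua, ua_len in candidates:
    print(ua_len, ua)
```

Swapping the comparison to look above the average, or using a multiple of the standard deviation instead of the mean, would be reasonable variations for finding unusually long strings.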
With these results, start at the top of the list (which will be the shortest UA strings) and review the character length for each string. With the average UA string length and the content of the UA string in mind, identify outliers that you feel may be worth looking into further. An example of the results of this query is shown below:
There are some other data stacking slices that may be useful for us to examine in order to identify interesting candidates. In some cases, these may also help us learn about our network and the functions that take place across systems.
Next, we can try to identify which UAs were observed on only a few hosts. This could be of interest in the case that an attacker is present or malware was deployed to only a few systems.
Lastly, we may also be interested in which UAs are observed infrequently in volume across the network as a whole. In certain situations, there’s a possibility that the results from this query could differ from the prior one, which may introduce additional candidates.
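Both slices, rare by host count and rare by request volume, can be computed in one pass. The sketch below uses made-up records and hypothetical thresholds; tune `max_hosts` and `max_requests` to your own network.

```python
from collections import Counter, defaultdict


def rare_user_agents(records, max_hosts=2, max_requests=5):
    """Return (UAs seen on few hosts, UAs seen in few requests)."""
    hosts_by_ua = defaultdict(set)
    requests_by_ua = Counter()
    for r in records:
        ua = r["user_agent"]
        hosts_by_ua[ua].add(r["orig_h"])
        requests_by_ua[ua] += 1
    few_hosts = {ua for ua, hosts in hosts_by_ua.items()
                 if len(hosts) <= max_hosts}
    low_volume = {ua for ua, n in requests_by_ua.items()
                  if n <= max_requests}
    return few_hosts, low_volume


# Made-up sample: "beacon" is chatty but confined to one host, while
# "updater" is quiet but spread across three hosts.
http_records = [{"orig_h": "10.0.0.1", "user_agent": "beacon"}] * 10 + [
    {"orig_h": host, "user_agent": "updater"}
    for host in ("10.0.0.1", "10.0.0.2", "10.0.0.3")
]

few_hosts, low_volume = rare_user_agents(http_records)
print("few hosts:", few_hosts)
print("low volume:", low_volume)
```

The sample shows why the two slices can disagree: a UA confined to one host can still generate plenty of requests, and a low-volume UA can be spread across many hosts.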
Description
- Stack the entire UA string and look for rare occurrences.
  - There may be a LOT of these, though. Every web plugin changes the UA string a bit, but that doesn’t mean there’s anything evil.
- Consider more detailed analysis, including
  - tokenizing the string and focusing on strings with the lowest number of tokens, the most unique tokens, or some combination
  - looking for abnormally short or long strings
  - looking for UAs that appear on your list of known-bad UAs
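Tokenization is the least obvious technique in this list, so here is a minimal sketch. The delimiter set is an assumption about what separates tokens in a typical UA string, and ranking by fewest tokens, then most unique tokens, is just one way to combine the two signals.

```python
import re
from collections import Counter

# Assumed delimiters: whitespace, semicolons, parentheses, slashes, commas.
TOKEN_SPLIT = re.compile(r"[\s;()/,]+")


def tokenize(ua):
    """Split a UA string into its non-empty tokens."""
    return [tok for tok in TOKEN_SPLIT.split(ua) if tok]


def rank_by_tokens(user_agents):
    """Rank distinct UAs: fewest tokens first, then most unique tokens."""
    distinct = set(user_agents)
    # Count in how many distinct UA strings each token appears.
    token_freq = Counter(tok for ua in distinct for tok in set(tokenize(ua)))
    rows = []
    for ua in distinct:
        toks = tokenize(ua)
        unique = sum(1 for tok in set(toks) if token_freq[tok] == 1)
        rows.append((ua, len(toks), unique))
    return sorted(rows, key=lambda row: (row[1], -row[2], row[0]))


# Made-up sample UA strings.
sample_uas = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (X11; Linux x86_64)",
    "xyz",
]

ranked = rank_by_tokens(sample_uas)
for ua, token_count, unique_tokens in ranked:
    print(token_count, unique_tokens, ua)
```

Strings with very few tokens, or with tokens no other UA shares, float to the top of the ranking and make good review candidates.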
Additional Sqrrl Uses
The queries identified in the prior section can all be used as starting points for exploration in the behavior graph. However, one of the goals of hunting is to reduce the amount of repetition and to re-apply what was learned on a hunt. As such, there are two additional mechanisms within Sqrrl that we can use to monitor occurrences of User-Agents.
Hunt Reports
We can turn each of our example candidate investigation queries into a hunt report in order to get quick snapshots to aid in identification of additional candidates on a regular basis. Hunt Reports are particularly well suited for data stacking.
Risk Triggers
While Hunt Reports assist with making data from certain hunts readily available, we can make use of Risk Triggers to identify observations of interest with respect to UAs and automatically use those to help bubble up entities of interest.
In one of our above examples, we determined that short UAs may be suspicious relative to others. As a result, I may want to create a Risk Trigger for entities observed using UAs shorter than a determined threshold.
Triggers also provide a good place to make use of threat intelligence data that may be collected from any number of sources. Perhaps a relevant threat was recently observed using malware that made use of a slightly malformed UA that falls within normal length bounds. We can implement a trigger to help raise the scores of internal entities with observed pattern matches.
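The logic behind the two triggers described above can be illustrated in plain Python. To be clear, this is not Sqrrl's Risk Trigger syntax or API. The threshold, score weights, and the misspelled "Mozila/" pattern are hypothetical values chosen only to show the idea.

```python
import re
from collections import defaultdict

SHORT_UA_THRESHOLD = 10   # hypothetical length threshold
SHORT_UA_SCORE = 10       # hypothetical score weight
PATTERN_SCORE = 50        # hypothetical score weight

# Hypothetical threat-intelligence pattern: a UA that misspells "Mozilla".
KNOWN_BAD_PATTERNS = [re.compile(r"^Mozila/")]


def score_entities(records):
    """Accumulate a risk score per source host from its observed UAs."""
    scores = defaultdict(int)
    for r in records:
        ua = r["user_agent"]
        if len(ua) < SHORT_UA_THRESHOLD:
            scores[r["orig_h"]] += SHORT_UA_SCORE
        if any(p.search(ua) for p in KNOWN_BAD_PATTERNS):
            scores[r["orig_h"]] += PATTERN_SCORE
    return dict(scores)


# Made-up sample records.
observations = [
    {"orig_h": "192.168.1.100", "user_agent": "x"},
    {"orig_h": "192.168.1.100",
     "user_agent": "Mozila/4.0 (compatible; MSIE 8.0)"},
    {"orig_h": "192.168.1.50",
     "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
]

risk = score_entities(observations)
print(risk)
```

Hosts that trip several triggers accumulate a higher score, which is the "bubbling up" behavior described above.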
Conclusion
Bro is powerful. Bro is free. Bro is a kind and benevolent ruler.
Bro offers something that many threat hunting tools don’t: context. Using Bro as a protocol analyzer to identify traffic and extract its metadata is extremely valuable. Its ability to turn network events into actionable, useful metadata makes it a must-have in my security stack. This metadata provides me with context, which is the key to finding potential threats quickly.
Bro includes a scripting language that makes it possible to do indicator, packet-level, and “heuristic” network detection. Knowing how to convert a hunt conducted in Bro into an automated function in its framework adds tremendous value to a security operations team. This aspect of hunting is where Sqrrl Risk Triggers shine.