From Simulation to Hunting - Using LLMs to Hunt BrowserCore-Based PRT Cookie Theft
I began with the hypothesis that BrowserCore.exe could be abused to steal a PRT/session cookie without being detected. I first simulated the flow using the standard toolset and techniques that the EDR reliably flagged, then exercised a range of evasive methods to bypass detection and hunted the resulting telemetry with KQL.
To make hunting repeatable and resilient to edge cases, I automated triage in a Jupyter notebook and used GPT-4o to generate dynamic KQL, normalize noisy and repetitive data, and produce analyst-style explanations. Below I walk through the simulation, the KQL playbook, the notebook and delegated-auth setup for Defender Advanced Hunting, and the LLM integration and results.
Introduction
Threat hunting is never just about running queries. Rules and signatures give you a starting point, but the real work lies in separating meaningful patterns from noise and adapting as attackers change techniques. Processes like BrowserCore.exe, which usually run legitimately as part of native messaging, can illustrate this challenge well; the same telemetry can look routine or suspicious depending on context.
I began by simulating activity and applying evasive techniques to see where EDR detections held up and where they broke down. From there, I moved into manual hunting with KQL, identifying the stable signals that could survive noisy variations. Automating that process in a Jupyter notebook made it repeatable, but rigid code alone couldn’t handle all the edge cases.
That’s where the LLM came in. GPT-4o became the bridge between automation and human judgment: fine-tuning queries on the fly, normalizing noisy command lines, and turning raw logs into analyst-style explanations. It didn’t replace the hunt; it amplified it, adding reasoning where automation alone fell short.
In the sections that follow, I’ll walk through the simulation, the KQL playbook, the notebook setup (including delegated authentication to Defender Advanced Hunting), and how the LLM integration closed the gap between raw telemetry and actionable insights.
Our simulated attack scenario - abusing BrowserCore.exe to exfiltrate a PRT session cookie
A Primary Refresh Token (PRT) is a device-bound SSO artifact used by Microsoft Entra joined and hybrid joined devices. In this post, we will explore a pass-the-cookie style scenario where BrowserCore.exe’s native-messaging path is abused for token theft.
This token theft technique has been documented by researchers such as Dirk-jan Mollema, Lee Chagolla-Christensen (SpecterOps), Nestori Syynimaa (Dr. Azure AD), and others. Unlike PRT extraction from LSASS, which requires local administrative privileges, this approach can be executed by a non-privileged user, and the cookie can bypass MFA.
I use the term “simulating” rather than “emulating” because, while there is public research showing the technique, and I believe threat actors have used variants of it, I was not able to locate a confirmed public intel report attributing this exact method to a tracked group at the time of writing. For that reason, I reproduce the known tools and techniques and then experiment with evasive methods to identify detection gaps.
Our first simulation attempt - Using ROADtoken
We’ll use the well-known ROADtoken tool.
Building the tool
Download the project from Dirk-Jan’s repository: https://github.com/dirkjanm/ROADtoken
Open the .sln (solution) file in Visual Studio.
Switch to Release mode and build the solution.
Getting the nonce
Before running the tool, you need the tenant ID (to construct the nonce endpoint). Query the OpenID configuration for your target domain:
https://login.microsoftonline.com/<tenant_domain>/.well-known/openid-configuration
From that document, you can obtain the tenant ID; it appears in the token_endpoint and issuer URLs. With the tenant ID in hand, run the following PowerShell command to retrieve the nonce:
(Invoke-RestMethod -Uri 'https://login.microsoftonline.com/<tenant-id>/oauth2/token' -Method POST -Body @{ grant_type = 'srv_challenge' }).Nonce
This doesn’t have to be executed on the target machine.
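For readers who prefer to script this step from the notebook side, here is a minimal Python sketch of the same two requests. It assumes the tenant GUID can be read from the token_endpoint or issuer URL of the OpenID configuration document; the function names are my own and are not part of ROADtoken.

```python
import json
import re
import urllib.parse
import urllib.request

LOGIN_BASE = "https://login.microsoftonline.com"
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def extract_tenant_id(openid_config: dict) -> str:
    """Pull the tenant GUID out of the token_endpoint (or issuer) URL."""
    for key in ("token_endpoint", "issuer"):
        match = GUID_RE.search(openid_config.get(key, ""))
        if match:
            return match.group(0).lower()
    raise ValueError("no tenant GUID found in OpenID configuration")


def build_nonce_request(tenant_id: str) -> urllib.request.Request:
    """Build (but do not send) the srv_challenge POST used to obtain a nonce."""
    body = urllib.parse.urlencode({"grant_type": "srv_challenge"}).encode()
    url = f"{LOGIN_BASE}/{tenant_id}/oauth2/token"
    return urllib.request.Request(url, data=body, method="POST")


def fetch_nonce(tenant_domain: str) -> str:
    """Network call: resolve the tenant ID, then request a nonce."""
    config_url = f"{LOGIN_BASE}/{tenant_domain}/.well-known/openid-configuration"
    with urllib.request.urlopen(config_url) as resp:
        tenant_id = extract_tenant_id(json.load(resp))
    with urllib.request.urlopen(build_nonce_request(tenant_id)) as resp:
        return json.load(resp)["Nonce"]
```

Calling fetch_nonce("<tenant_domain>") performs both requests; like the PowerShell one-liner, it can run from any machine with internet access.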
Running the tool
We then pass the nonce as an argument to the tool on the victim machine.
Authenticating using the stolen credentials
We then authenticate using the stolen credentials to validate the effectiveness of the attack.
This attack was effectively detected by the EDR in near real time.
Using stealthier techniques
For this part of the post, I won’t share the tools, only the concepts. I tested a range of techniques (WSL, the Python interpreter, cmd.exe, powershell.exe, custom-made tools). Some of these approaches were detected effectively, while others successfully evaded detection. If you’d like to see the general idea behind bypass techniques for this attack, I recommend reading the following blog post for inspiration.
FalconFriday — Stealing and detecting Azure PRT cookies — 0xFF18
Hunting time
Manually hunting for BrowserCore.exe abuse using KQL
We will now hunt for BrowserCore.exe abuse using Microsoft Defender’s Advanced Hunting feature. First, we’ll identify the data available at a high level and pick the columns we care about so we can focus our queries on the most useful fields.
DeviceProcessEvents
| where FileName == "BrowserCore.exe"
| sample 10
Now that we know which columns we’re interested in, the next step is to identify the baseline behavior for BrowserCore.exe executions. For my use case, where I want to understand at a high level what a legitimate process tree should look like, I focused on:
FileName
FolderPath
ProcessCommandLine
InitiatingProcessFileName
InitiatingProcessParentFileName
I didn’t include things like file hashes or digital signatures in the initial analysis, since at this stage the goal is simply to get a sense of what “normal” looks like.
Based on this, we can use the summarize operator and count() function to view the majority of events in the environment.
DeviceProcessEvents
| where FileName == "BrowserCore.exe"
| summarize count() by FileName, FolderPath, ProcessCommandLine,
InitiatingProcessFileName, InitiatingProcessParentFileName
Based on the results, the majority of executions follow a single process tree: chrome.exe -> cmd.exe -> BrowserCore.exe.
This chain typically invokes the following command line:
"BrowserCore.exe" chrome-extension://ppnbnpeolgkicgegkbkbjmhlideopiji/ --parent-window=0
A quick search reveals that ppnbnpeolgkicgegkbkbjmhlideopiji is the Chrome extension for Microsoft Single Sign-On (SSO).
With this baseline established, we can exclude the benign process tree from our query to focus on potentially suspicious activity.
DeviceProcessEvents
| where FileName == "BrowserCore.exe"
| where not(FileName == "BrowserCore.exe" and InitiatingProcessFileName == "cmd.exe" and InitiatingProcessParentFileName == "chrome.exe" )
| summarize count() by FileName, FolderPath, ProcessCommandLine, InitiatingProcessFileName, InitiatingProcessParentFileName
And we can see the following results, which seem suspicious.
We can then hunt for these patterns at scale across the environment, identify compromised devices, and pivot to the related AccountUpn values by bracketing the time of the suspicious execution and reviewing the user’s Entra ID (AAD) sign-in logs for activity that follows the PRT cookie theft.
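The same baseline exclusion can be applied offline to exported rows. The sketch below mirrors the KQL filter and summarize step in Python; the sample rows are illustrative, and the case-insensitive comparison is my own choice (the KQL `==` operator is case-sensitive).

```python
from collections import Counter

# (initiator, parent) pair of the benign chain: chrome.exe -> cmd.exe -> BrowserCore.exe
BASELINE = ("cmd.exe", "chrome.exe")


def is_baseline(row: dict) -> bool:
    """True when the row matches the benign process chain."""
    initiator = (row.get("InitiatingProcessFileName") or "").lower()
    parent = (row.get("InitiatingProcessParentFileName") or "").lower()
    return (initiator, parent) == BASELINE


def summarize_outliers(rows: list[dict]) -> Counter:
    """Count non-baseline (initiator, parent) pairs, similar to `summarize count() by ...`."""
    return Counter(
        (r.get("InitiatingProcessFileName"), r.get("InitiatingProcessParentFileName"))
        for r in rows
        if not is_baseline(r)
    )


# Illustrative rows, not real telemetry.
rows = [
    {"InitiatingProcessFileName": "cmd.exe", "InitiatingProcessParentFileName": "chrome.exe"},
    {"InitiatingProcessFileName": "cmd.exe", "InitiatingProcessParentFileName": "chrome.exe"},
    {"InitiatingProcessFileName": "wsl.exe", "InitiatingProcessParentFileName": "explorer.exe"},
]
outliers = summarize_outliers(rows)
```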
Because BrowserCore.exe has a predictable execution tree, it was relatively easy to baseline and exclude legitimate activity. The next question was how to automate that process, which leads to the second (and more fun) part of this post.
Automating hunting with an LLM-integrated Jupyter notebook
The Jupyter notebook is available here:
https://github.com/Y4nush/Threat-Hunting-with-Jupyter-Notebooks/blob/main/browsercore-threat-hunting-defender.ipynb
Now we will automate the process by integrating Microsoft Defender XDR’s Advanced Hunting feature using delegated authentication, and applying an LLM to fine-tune our KQL queries and analyze the results.
At a high level, the workflow looks like this:
The final goal is to feed unique events to the LLM so it can score them. But to get there, simply using distinct is not enough, since many “similar” command lines include unique GUIDs, IDs, or hardcoded usernames, which leads to a large number of repeated values. To address this, I sampled the events, let the LLM identify similarities, and then used KQL to do the heavy lifting of reducing the number of unique events. In the final stage, the reduced set of events was fed into the LLM for analysis. Using this technique, the number of events dropped from 542 to 13, a reduction of about 97.6%, which results in much more efficient, faster, and cheaper analysis (in terms of tokens).
Prerequisites
OpenAI API key
Azure App Registration for delegated access
Setting up our app registration
In order to register an app, go to the following URL.
https://portal.azure.com/#view/Microsoft_AAD_RegisteredApps/ApplicationsListBlade
We will register a new application:
- Name the application as you’d like.
- Choose Accounts in this organizational directory only.
- For the redirect URI, choose Public client/native (mobile & desktop) and leave the value as is.
Once finished, click Register.
Now let’s configure the authentication. We will navigate to the following section.
We will then choose “Add a platform.”
And choose “Mobile and desktop application.”
Leave the defaults as they are, and add a custom redirect URI with a port that is not commonly used to avoid conflicts with other local applications that may be running on common ports:
Now let’s configure our API permissions. Navigate to “API permissions.”
Choose “Add a permission.”
Then we will choose “APIs my organization uses.”
We will search for “Microsoft Threat Protection.”
And we will then choose “Delegated permissions.”
Select the AdvancedHunting.Read checkbox, then add the permission.
Final step: click “Grant admin consent.”
And we are good to go.
Running our notebook
Now that we’re set up, let’s interact with our notebook.
We will start by installing the prerequisites and importing the required modules.
We will then add our tenant ID and the application ID of the registered app we created, and perform authentication. This will open a browser window for sign-in, and the retrieved token will be used to interact with the Advanced Hunting API.
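A minimal sketch of what this delegated sign-in could look like with the MSAL Python library follows. It is not necessarily how the notebook does it: the scope string, the port number, and the helper names are assumptions, and the port must match whatever redirect URI was registered for the app.

```python
TENANT_ID = "<tenant-id>"        # placeholder
CLIENT_ID = "<application-id>"   # placeholder
REDIRECT_PORT = 8400             # hypothetical; must match the registered redirect URI
# Assumed scope format: resource URI plus the delegated permission name.
SCOPES = ["https://api.security.microsoft.com/AdvancedHunting.Read"]


def authority_for(tenant_id: str) -> str:
    """Single-tenant authority URL for the app registration."""
    return f"https://login.microsoftonline.com/{tenant_id}"


def acquire_token(tenant_id: str, client_id: str, port: int = REDIRECT_PORT) -> str:
    """Open a browser sign-in and return a delegated access token."""
    import msal  # third-party dependency: pip install msal

    app = msal.PublicClientApplication(client_id, authority=authority_for(tenant_id))
    result = app.acquire_token_interactive(scopes=SCOPES, port=port)
    if "access_token" not in result:
        raise RuntimeError(result.get("error_description", "authentication failed"))
    return result["access_token"]
```

Because the app is a public client, no client secret is stored in the notebook; the token only carries the permissions of the signed-in analyst.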
Now that we are authenticated, we will initialize our KQL execution wrapper and run a test to ensure everything is working as expected.
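As a rough illustration of such a wrapper, the sketch below posts a query to the Microsoft Defender XDR Advanced Hunting endpoint using only the standard library. The function names and the test query are my own; the notebook's wrapper may be structured differently.

```python
import json
import urllib.request

HUNTING_URL = "https://api.security.microsoft.com/api/advancedhunting/run"


def build_hunting_request(token: str, query: str) -> urllib.request.Request:
    """Build the POST request that carries the KQL query as JSON."""
    body = json.dumps({"Query": query}).encode("utf-8")
    headers = {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}
    return urllib.request.Request(HUNTING_URL, data=body, headers=headers, method="POST")


def run_kql(token: str, query: str) -> list[dict]:
    """Send the query and return the rows from the "Results" field of the response."""
    with urllib.request.urlopen(build_hunting_request(token, query)) as resp:
        return json.load(resp).get("Results", [])


# A small smoke-test query; the request is built here but not sent.
TEST_QUERY = 'DeviceProcessEvents | where FileName == "BrowserCore.exe" | take 1'
request = build_hunting_request("<access-token>", TEST_QUERY)
```

Keeping request construction separate from the network call makes the wrapper easy to check without touching the tenant.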
We will then use the visualization options in Jupyter to present statistics on common BrowserCore.exe process chains, which immediately helps us spot anomalies, even before applying any fancy LLM techniques.
We will set our OpenAI API key using a Jupyter widget.
Now we will sample noisy command lines. The query takes the distinct values of the columns of interest, identifies the column with the most distinct values, and then samples a small number of rows from it.
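To make that logic concrete, here is a small Python sketch of the same idea applied to rows already pulled into the notebook. It reflects my reading of the sampling step rather than the notebook's actual query, and the rows are illustrative.

```python
import random


def most_variable_column(rows: list[dict], columns: list[str]) -> str:
    """Return the column with the largest number of distinct values."""
    return max(columns, key=lambda col: len({row.get(col) for row in rows}))


def sample_values(rows: list[dict], column: str, k: int = 5, seed: int = 0) -> list[str]:
    """Sample up to k distinct values of the column (seeded for repeatability)."""
    distinct = sorted({row[column] for row in rows if row.get(column)})
    return random.Random(seed).sample(distinct, min(k, len(distinct)))


# Illustrative rows: only the initiating command line varies.
columns = ["FileName", "InitiatingProcessFileName", "InitiatingProcessCommandLine"]
rows = [
    {"FileName": "BrowserCore.exe", "InitiatingProcessFileName": "cmd.exe",
     "InitiatingProcessCommandLine": f"cmd.exe /d /s /c ... pipe.{suffix}"}
    for suffix in ("aaa1", "bbb2", "ccc3")
]
noisy_column = most_variable_column(rows, columns)
examples = sample_values(rows, noisy_column, k=2)
```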
The command lines are almost identical; the only differences are in the named pipe identifiers at the end of each line:
cmd.exe /d /s /c ""C:\Windows\BrowserCore\BrowserCore.exe"
chrome-extension://ppnbnpeolgkicgegkbkbjmhlideopiji/ --parent-window=0"
< \\.\pipe\chrome.nativeMessaging.in.1b9bbfc8e027824a >
\\.\pipe\chrome.nativeMessaging.out.1b9bbfc8e027824a
vs
cmd.exe /d /s /c ""C:\Windows\BrowserCore\BrowserCore.exe"
chrome-extension://ppnbnpeolgkicgegkbkbjmhlideopiji/ --parent-window=0"
< \\.\pipe\chrome.nativeMessaging.in.632275c702172cf8 >
\\.\pipe\chrome.nativeMessaging.out.632275c702172cf8
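The two samples above collapse to a single value once the pipe identifier is masked. The sketch below shows that normalization with a Python regular expression; in the notebook this is done in KQL generated by the LLM, so treat the pattern and the `<id>` placeholder as my own illustration of the idea.

```python
import re

# Matches the hex suffix of Chrome native-messaging pipe names.
PIPE_ID = re.compile(r"(chrome\.nativeMessaging\.(?:in|out))\.[0-9a-f]+", re.IGNORECASE)


def normalize(cmdline: str) -> str:
    """Replace the per-session pipe identifier with a fixed placeholder."""
    return PIPE_ID.sub(r"\1.<id>", cmdline)


a = r"< \\.\pipe\chrome.nativeMessaging.in.1b9bbfc8e027824a > \\.\pipe\chrome.nativeMessaging.out.1b9bbfc8e027824a"
b = r"< \\.\pipe\chrome.nativeMessaging.in.632275c702172cf8 > \\.\pipe\chrome.nativeMessaging.out.632275c702172cf8"

# Both command-line tails normalize to the same string, so they count as one event.
unique_after = {normalize(a), normalize(b)}
```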
At this stage, we will leverage the LLM’s capabilities to identify the differences and generate a KQL query that avoids repeated values in our results. I chose to use gpt-4o, as it provided the best results among the lower-cost models offered by OpenAI.
And in just 2.9 seconds, we received our KQL query, ready to be executed.
As you can see, the KQL query ran without issues and reduced the number of events from more than 500 unique entries down to only 14. These remaining events can then be analyzed with the LLM to provide context and feedback.
Getting to this point required a lot of trial and error. I had to adjust how I framed the problem and refine the rules I provided to the LLM. Because it’s a non-deterministic system, the process was challenging and often iterative, but eventually, after enough iterations, it delivered the right results.
We will then pass our output to another agent with the following prompt. The prompt contains the hypothesis and guidance on what to look for.
RAW_PROMPT = r"""
You are a senior threat hunter.
Task
Score EACH row for likelihood of BrowserCore.exe abuse for PRT-cookie theft and explain why.
Columns present
FileName, ProcessCommandLine, InitiatingProcessFileName, NormalizedInitiatingProcessCommandLine,
InitiatingProcessParentFileName, Count
Data (TSV; first line is header; subsequent lines are rows in order 0..N-1):
<BEGIN_TABLE>
<<TABLE_TSV>>
<END_TABLE>
Row count
<<ROW_COUNT>>
Hunting hypothesis (normal vs abnormal)
- Normal chain: <browser/office/client app> -> cmd.exe -> BrowserCore.exe
- A row is normal if InitiatingProcessFileName == cmd.exe AND InitiatingProcessParentFileName is <browser/office/client app>
- Everything else is abnormal (e.g., python, wsl, powershell, custom tools; missing extension URI).
Scoring rubric (0–100 per row)
- Start 0; abnormal chain +30
- Parent empty/unknown or not in allow-list → +10
- Missing "chrome-extension://" in ProcessCommandLine OR NormalizedInitiatingProcessCommandLine → +10
- Prevalence Count: ≤2 → +20; 3–10 → +10; ≥50 → −10
- Clamp [0,100]; labels: 0–24 Benign/Background, 25–59 Review, 60–100 Suspicious
Output (STRICT JSON; no markdown):
{
"rows": [
{
"row_index": <int 0..N-1>,
"score": <int 0..100>,
"label": "Benign/Background" | "Review" | "Suspicious",
"reason": "<12–30 words citing initiator, parent, chrome-extension:// presence/absence, and Count impact>"
}
],
"summary": {
"top_suspicious_examples": [
{"row_index": <int>, "score": <int>, "quick_note": "<≤12 words>"},
{"row_index": <int>, "score": <int>, "quick_note": "<≤12 words>"}
]
}
}
Strictness
- Produce exactly <<ROW_COUNT>> items in "rows", in the same input order (row_index matches that order).
- Reasons must reference the named fields explicitly (not generic).
- Output must be valid JSON (single object).
"""
BrowserCore abuse via WSL
Field | Value |
---|---|
FileName | BrowserCore.exe |
ProcessCommandLine | browsercore.exe |
InitiatingProcessFileName | wsl.exe |
NormalizedInitiatingProcessCommandLine | "wsl.exe" --cd ~ |
InitiatingProcessParentFileName | explorer.exe |
Count | 1 |
score | 70 |
label | Suspicious |
reason | Abnormal initiator wsl.exe, parent explorer.exe, missing chrome-extension://, low Count increases score |
Legitimate BrowserCore.exe usage via Chrome
Field | Value |
---|---|
FileName | BrowserCore.exe |
ProcessCommandLine | "BrowserCore.exe" chrome-extension://ppnbnpeolgkicgegkbkbjmhlideopiji/ --parent-window=0 |
InitiatingProcessFileName | cmd.exe |
NormalizedInitiatingProcessCommandLine | cmd.exe /d /s /c ""C:\Windows\BrowserCore\BrowserCore.exe" chrome-extension://ppnbnpeolgkicgegkbkbjmhlideopiji/ --parent-window=0" < \\.\pipe\chrome.nativeMessaging.in.<id> > \\.\pipe\chrome.nativeMessaging.out.<id> |
InitiatingProcessParentFileName | chrome.exe |
Count | 305 |
score | 0 |
label | Benign/Background |
reason | Initiator cmd.exe, parent chrome.exe, contains chrome-extension://, high Count reduces score |
BrowserCore.exe invoked with ROADtoken
Field | Value |
---|---|
FileName | BrowserCore.exe |
ProcessCommandLine | "browsercore.exe" |
InitiatingProcessFileName | roadtoken.exe |
NormalizedInitiatingProcessCommandLine | ROADToken.exe "AwABEgEAAAADAOz_BQD0_xGKfgi7ZIFIGGdDCEpraoMEL_P4xSOz6zRYD0dAfGLfCCoCL0aQu9Hl5haYwFmsGO7f_Oqtvx6WmW0fk8LzyXsgAA" |
InitiatingProcessParentFileName | cmd.exe |
Count | 1 |
score | 60 |
label | Suspicious |
reason | Initiator cmd.exe, parent explorer.exe, missing chrome-extension://, low Count increases score |
This is a sample of the results. As can be seen, ROADtoken received a lower score than WSL in the risk assessment. At first, this may seem odd, as ROADtoken usage is a clear example of BrowserCore abuse. However, to avoid bias, I didn’t give the LLM any explicit “known bad” hints. Instead, I only provided a high-level hypothesis of what suspicious activity might look like.
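As a sanity check on the model's arithmetic, the rubric from the prompt can also be written down deterministically. The sketch below is one possible reading of it, with two assumptions of mine: the parent allow-list is a hypothetical two-browser set, and the "missing chrome-extension://" rule fires only when the string is absent from both command-line fields.

```python
ALLOWED_PARENTS = {"chrome.exe", "msedge.exe"}  # hypothetical allow-list


def score_row(row: dict) -> tuple[int, str]:
    """Apply the prompt's scoring rubric to one row and return (score, label)."""
    initiator = (row.get("InitiatingProcessFileName") or "").lower()
    parent = (row.get("InitiatingProcessParentFileName") or "").lower()
    parent_ok = parent in ALLOWED_PARENTS

    score = 0
    if not (initiator == "cmd.exe" and parent_ok):
        score += 30  # abnormal chain
    if not parent_ok:
        score += 10  # parent empty/unknown or not in the allow-list
    cmdlines = (row.get("ProcessCommandLine") or "") + (
        row.get("NormalizedInitiatingProcessCommandLine") or "")
    if "chrome-extension://" not in cmdlines:
        score += 10  # extension URI missing from both fields (assumed reading)

    count = int(row.get("Count", 0))
    if count <= 2:
        score += 20
    elif count <= 10:
        score += 10
    elif count >= 50:
        score -= 10

    score = max(0, min(100, score))  # clamp to [0, 100]
    if score >= 60:
        return score, "Suspicious"
    if score >= 25:
        return score, "Review"
    return score, "Benign/Background"


wsl_row = {
    "ProcessCommandLine": "browsercore.exe",
    "InitiatingProcessFileName": "wsl.exe",
    "NormalizedInitiatingProcessCommandLine": '"wsl.exe" --cd ~',
    "InitiatingProcessParentFileName": "explorer.exe",
    "Count": 1,
}
wsl_result = score_row(wsl_row)
```

For the WSL row this reading gives the same score and label as the model. Scores for other rows need not match the model's output, since the model interprets the rubric itself.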
Conclusion
This exercise started with a simple question: could BrowserCore.exe activity be abused in ways that evade standard detection, and if so, how do we build a repeatable hunt for it? Along the way, I simulated known and evasive techniques, validated what our EDR could and could not see, and built a KQL playbook to surface the relevant signals.
The real challenge was scale. Noisy, repeated values made manual hunts slow and automation brittle. By integrating a Jupyter notebook with Microsoft Defender Advanced Hunting and adding GPT-4o to the workflow, I was able to normalize telemetry, reduce false uniqueness, and generate analyst-style context automatically. That combination of KQL + automation + LLM effectively bridged the gap between raw logs and actionable insight.
That said, I want to be clear: this work is a proof of concept, not something mature enough for production use. Integrating LLMs into a detection pipeline raises important considerations, especially around data protection. In this case, the LLM only processed command-line strings, but what if those strings contained PII or secrets? For real-world deployments, defenders should carefully weigh whether to run a local LLM, or at minimum apply tokenizers and secret scanners to scrub sensitive content before sending it to a model.
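As a starting point for that kind of scrubbing, here is a minimal regex-based redaction sketch. It is not a substitute for a real secret scanner: the three patterns (user profile paths, email addresses, long token-like strings) and the placeholder names are assumptions chosen for illustration.

```python
import re

# (pattern, replacement) pairs applied in order.
PATTERNS = [
    (re.compile(r'(\\Users\\)[^\\\s"]+', re.IGNORECASE), r"\1<user>"),
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "<email>"),
    (re.compile(r"[A-Za-z0-9_\-+/=]{40,}"), "<token>"),  # long base64-like runs
]


def scrub(text: str) -> str:
    """Mask usernames in profile paths, email addresses, and long token-like strings."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


line = r"C:\Users\alice\tool.exe --user alice@contoso.com " + "A" * 48
clean = scrub(line)
```

The 40-character threshold leaves the 32-character Chrome extension ID intact while masking values such as a nonce passed on the command line.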
This is just one example of how LLMs can assist. Going forward, the next step is refining these workflows further, adding more data sources, and pressure-testing the methodology in broader scenarios.
References
Abusing Azure AD SSO with the Primary Refresh Token - Dirk-jan Mollema
Bypassing Entra ID Conditional Access Like APT - Yuya Chudo, Takayuki Hatakeyama
FalconFriday - Stealing and Detecting Azure PRT Cookies - Henri Hambartsumyan
Journey to Azure AD PRT: Getting Access with Pass-the-Token and Pass-the-Cert - Nestori Syynimaa
MITRE ATT&CK Driven Threat Hunting Automated by Local LLM - Fujitsu Defense & National Security (Jun Miura, Toshitaka Satomi, Eri Miura)
Microsoft Threat Protection ‘Jupyter notebook’ #AdvancedHunting Sample - Wortell
Requesting Azure AD Request Tokens on Azure-AD-Joined Machines for Browser SSO - Lee Chagolla-Christensen
ROADtoken - Dirk-jan Mollema
Every Additional Word Is Superfluous (כל מילה נוספת מיותרת) - Primary Refresh Token - Sapir Federovsky
Understanding Primary Refresh Token (PRT) - Microsoft