...

Beginners Recon Workflow for Bug Bounty & Web Pentesting


In bug bounty hunting and web application penetration testing, having a strong and well-defined reconnaissance workflow is more important than just running random tools. Most security researchers fail to uncover high-impact vulnerabilities not because they lack skill, but because their recon process is unstructured. A proper recon workflow helps you systematically map the entire attack surface of a target, identify hidden assets, uncover undocumented endpoints, and detect misconfigurations before anyone else. Instead of relying on a single tool or performing shallow enumeration, modern reconnaissance in 2025 requires a layered and automated approach that connects subdomain discovery, asset validation, crawling, JavaScript analysis, secret hunting, and vulnerability scanning into one continuous pipeline.

This guide walks you through a complete, real-world recon workflow used by professional bug bounty hunters and penetration testers to maximise coverage, reduce blind spots, and significantly improve the chances of finding critical security issues.

Phase 0: Pre-Recon Setup (Most Beginners Skip This)

Before running Subfinder, Assetfinder, or any other recon tool, most beginners make the same mistakes:

❌ They don’t configure API keys
❌ They run tools with default rate limits
❌ They burn their API quota
❌ They get blocked or receive incomplete results

Let’s fix that first.

1. Why API Keys Matter in Recon

Most subdomain tools rely heavily on external data sources.

Without API keys → Very limited results

Subfinder falls back to a handful of public sources, so you may see only a fraction of the subdomains that actually exist.

With API keys → 5x to 10x more results

Depending on the target, you might suddenly find 50, 100, or even hundreds of subdomains.

Why?

API keys unlock premium passive sources such as: SecurityTrails, VirusTotal, Shodan, Censys, Quad9, BinaryEdge

These services maintain massive datasets containing DNS history, SSL certificates, WHOIS data, past IP allocations, and more. Subfinder pulls from all of these when API keys are configured.

2. How to Set Up API Keys

Example for Subfinder

Config file location: ~/.config/subfinder/provider-config.yaml

Example configuration:
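
A minimal sketch with placeholder keys (only add entries for sources you have actually registered with; check Subfinder’s documentation for the full list of supported source names):

securitytrails:
  - YOUR_SECURITYTRAILS_API_KEY
virustotal:
  - YOUR_VIRUSTOTAL_API_KEY
shodan:
  - YOUR_SHODAN_API_KEY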

If you want maximum subdomain coverage, configuring API keys is essential.

Apply the same API-key setup practice to all enumeration tools (Subfinder, Amass, etc.) by reviewing their official documentation.
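
To see which sources Subfinder supports (and which of them accept API keys), it can print its source list:

subfinder -ls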

3. Rate Limiting: Avoid Getting Banned or Throttled

Many sources enforce strict limits: 

  • VirusTotal → 4 requests/minute
  • SecurityTrails → 300 requests/day
  • Shodan → limited by plan
  • Censys → very strict

Running tools too aggressively leads to:

  • Temporary API bans
  • Quota exhaustion
  • Missing subdomains
  • Inaccurate results

4. Proper Rate-Limit Configuration

Instead of blindly running tools, use controlled settings:

Safe Mode Example:

subfinder -d target.com -rl 5 -t 30 -timeout 10

Options:

  • -rl → request rate (requests per second)
  • -t → number of threads
  • -timeout → timeout for sources

This ensures:
✅ No API bans
✅ Stable performance
✅ Better data accuracy
✅ Long-term usability

Now: Actual Recon Begins

After this preparation, we move into:

Subdomain Discovery → Validation → Crawling → JS Extraction → Secret Hunting → Vulnerability Scanning

Phase 1: Subdomain Enumeration

The first and most critical step in external reconnaissance is identifying all possible subdomains associated with the target. Hidden or forgotten subdomains often host outdated services, exposed dashboards, staging environments, or misconfigured APIs, making them prime candidates for high-impact vulnerabilities.

A strong subdomain enumeration process must combine multiple passive sources to achieve maximum coverage without sending direct traffic to the target.

1.1 Subfinder – Passive Enumeration from Large Data Sources

Subfinder is one of the fastest passive subdomain enumerators. It pulls results from dozens of sources (certificates, web archives, APIs, threat intel feeds, etc.).

Reference: GitHub – projectdiscovery/subfinder: Fast passive subdomain enumeration tool.

subfinder -d target.com -all -recursive -rl 5 -t 30 -o subfinder.txt

1.2 Assetfinder – Quick Passive Enumeration

Assetfinder (by Tomnomnom) retrieves subdomains from major passive data sources. It’s lightweight and designed for quick recon pipelines.

assetfinder target.com > assetfinder.txt

Reference: GitHub – tomnomnom/assetfinder: Find domains and subdomains related to a given domain
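
By default Assetfinder also returns related apex domains. If you only want subdomains of the target itself, it supports a --subs-only flag:

assetfinder --subs-only target.com > assetfinder.txt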

1.3 Amass – Deep Passive Enumeration

OWASP Amass performs passive, semi-passive, and active recon, but here it’s used in passive mode for safe, broad asset discovery.

amass enum -passive -d target.com > amass.txt

Reference: GitHub – owasp-amass/amass: In-depth attack surface mapping and asset discovery

1.4 Combine & De-duplicate All Results

After gathering subdomains from all tools, merge and remove duplicates to form the most accurate master list. This can be done with a simple Linux one-liner:

cat subfinder.txt assetfinder.txt amass.txt | sort -u > all_subdomains.txt

This pipeline merges the subdomains collected passively from multiple data sources (broadened by the configured API keys), removes duplicates, and saves the results into all_subdomains.txt, giving wide subdomain coverage without directly interacting with the target.

Phase 2: Validate Live Subdomains and Technologies

After generating a complete subdomain list, the next step is confirming which hosts are actually alive, reachable, and serving content. Many discovered domains are dormant, expired, or misconfigured — testing these wastes considerable time and produces noise.

Live-host validation ensures every subsequent phase (vulnerability scanning, crawling, fuzzing, JS extraction) focuses only on active assets.

httpx (by ProjectDiscovery) is a fast, flexible probing tool that sends lightweight HTTP/S requests to identify live web services. It also retrieves metadata, including:

  • HTTP status codes
  • Page titles
  • Detected technologies (powered by Wappalyzer)
  • Redirection paths
  • TLS information

cat all_subdomains.txt | httpx -silent -status-code -title -tech-detect > alive.txt

This command identifies which subdomains are currently live by checking HTTP/S responses and filtering out inactive hosts, while also capturing useful context such as status codes, page titles, and technologies.

Reference: GitHub – projectdiscovery/httpx: httpx is a fast and multi-purpose HTTP toolkit that allows running multiple probes using the retryablehttp library.
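
One practical note, based on an assumption about how the later steps are chained: alive.txt now carries extra columns (status code, title, technologies). If a downstream tool expects bare URLs, keep a plain list as well:

cat all_subdomains.txt | httpx -silent > alive_urls.txt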

2.1 DNS Resolution + DNS Records Discovery (dnsx, dig, nslookup)

This step performs DNS resolution to identify which subdomains actually resolve to an IP address. DNS records reveal hosting infrastructure, cloud services, and relationships between assets. Subdomains with valid A/AAAA records can then be targeted for port scanning and service-based vulnerability scanning in later steps.

cat alive.txt | dnsx -a -resp -silent > dns_results.txt

Reference: GitHub – projectdiscovery/dnsx: dnsx is a fast and multi-purpose DNS toolkit allow to run multiple DNS queries of your choice with a list of user-supplied resolvers.
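
Because -resp appends the resolved record in brackets, dns_results.txt is not a bare host list. If you want clean hostnames for the port-scanning steps below, a quick cleanup sketch (assuming that output layout) looks like this:

awk '{print $1}' dns_results.txt | sort -u > resolved_hosts.txt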

2.2 Naabu (ProjectDiscovery) – Port-Based Liveness Detection

Naabu is a high-speed port scanner designed to detect exposed services by checking for open TCP ports—not just HTTP/S. This step highlights additional attack surfaces such as SSH, FTP, SMTP, Redis, MySQL, RDP, and other non-web services.

naabu -list dns_results.txt -ports full -silent -o naabu_alive.txt

Reference: GitHub – projectdiscovery/naabu: A fast port scanner written in go with a focus on reliability and simplicity. Designed to be used in combination with other tools for attack surface discovery in bug bounties and pentests

2.3 Nmap Vulnerability Script Scanning (NSE – vuln category)

Nmap’s NSE (Nmap Scripting Engine) includes a vuln script category that performs quick vulnerability checks against detected services. This step helps identify known service misconfigurations and CVE-related issues early in the recon process.

Tip: Many WAFs block Nmap’s default User-Agent (Mozilla/5.0 (compatible; Nmap Scripting Engine; https://nmap.org/book/nse.html)). Supplying a custom User-Agent avoids WAF blocking and improves response accuracy.

nmap -sV -Pn --script-args http.useragent="Mozilla/5.0" --script=vuln -iL dns_results.txt -oA nmap_vuln_scan

Reference: https://nmap.org/
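
Running the vuln scripts against every port of every host is slow. If you already have Naabu results (one host:port per line), one way to narrow the scan is to collect the open ports first. A rough sketch, assuming that output format and the resolved_hosts.txt list produced after the dnsx step:

# gather the unique open ports Naabu found, then run the vuln scripts only against them
ports=$(awk -F: '{print $2}' naabu_alive.txt | sort -un | paste -sd, -)
nmap -sV -Pn -p "$ports" --script-args http.useragent="Mozilla/5.0" --script=vuln -iL resolved_hosts.txt -oA nmap_vuln_scan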

Phase 3: Historical URL & Sensitive File Discovery (Dirsearch + WebArchive/CDX API)


This phase uncovers legacy content, old endpoints, deleted files, forgotten admin panels, exposed backups, and sensitive documents that no longer appear on the live application but remain archived on the internet.

Web applications frequently remove endpoints, update systems, and delete files, but these assets often remain stored in public archives. The Wayback Machine, search engines, and archive services keep historical snapshots of URLs, which can expose:

  • Old API endpoints
  • Debug pages
  • Backup files (.zip, .rar, .tar)
  • Database dumps (.sql, .db)
  • Internal documents (.docx, .pdf, .xlsx)
  • Email and password leaks
  • Forgotten admin interfaces

These archived URLs often reveal high-value vulnerabilities that are completely invisible during live recon.

3.1 Directory Bruteforcing with Dirsearch

Dirsearch runs directory brute-forcing against the subdomains discovered by Subfinder, helping to find hidden directories, admin panels, APIs, and sensitive folders.

dirsearch -l subfinder.txt

Reference: GitHub – maurosoria/dirsearch: Web path scanner

3.2 Historical URL Extraction Using the WebArchive (Wayback Machine) CDX API

The CDX API is one of the most powerful passive reconnaissance sources. It provides a complete list of archived URLs for a domain and all of its subdomains. Browsing thousands of URLs by hand is slow; curl retrieves everything efficiently.

curl -G "https://web.archive.org/cdx/search/cdx" \
  --data-urlencode "url=*.example.com/*" \
  --data-urlencode "collapse=urlkey" \
  --data-urlencode "output=text" \
  --data-urlencode "fl=original" > output.txt

It downloads the complete archive of historical URLs into output.txt, making it easier to filter, search, and extract sensitive files.

Then, let’s filter the archived URLs for files with sensitive extensions such as database files, backups, spreadsheets, documents, and PDFs, all of which may contain leaked information.

cat output.txt | uro | grep -E '\.xls|\.xml|\.xlsx|\.json|\.pdf|\.sql|\.doc|\.docx|\.ppt|\.zip|\.rar|\.db'

Phase 4: Vulnerability Scanning on Live Hosts

Large attack surfaces cannot be manually tested all at once. Automated scanning helps quickly highlight critical or high-risk hosts that deserve deeper manual analysis. This phase identifies known and common vulnerabilities automatically, acting as a fast triage layer across all live assets.

4.1 Fast Vulnerability Scanning with Nuclei

Nuclei is a rapid, template-driven scanner from ProjectDiscovery. It covers thousands of vulnerability types, including CVEs, misconfigurations, SSRF, RCE, exposed services, cloud config flaws, and more. Ideal for wide, fast triage of large target lists.

cat alive.txt | nuclei -t /path/to/templates/ -o nuclei_results.txt

Reference: GitHub – projectdiscovery/nuclei: Nuclei is a fast, customizable vulnerability scanner powered by the global security community and built on a simple YAML-based DSL, enabling collaboration to tackle trending vulnerabilities on the internet. It helps you find vulnerabilities in your applications, APIs, networks, DNS, and cloud configurations.

4.2 Deep Vulnerability Scanning Using Nessus or Other Automated Scanners

While Nuclei excels at speed and coverage, Nessus, Qualys, and OpenVAS perform deeper, heuristic-based analysis using large vulnerability plugin databases.

Export alive.txt into the Nessus target list, and run the Web Application Scan profile. It provides detailed vulnerability results from a well-maintained vulnerability database, including network-level issues that Nuclei may not detect.
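
Since alive.txt carries the extra httpx columns, it helps to flatten it to bare hosts before importing (a small assumption about the output layout produced earlier):

cut -d' ' -f1 alive.txt | sed -E 's#https?://##' | sort -u > nessus_targets.txt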

Phase 5: JavaScript File Extraction & Secret Hunting

Modern web applications store critical logic, APIs, and sensitive data inside JavaScript files. This phase expands the attack surface by extracting and analysing those JavaScript assets. Hidden endpoints, internal APIs, secrets, and cloud configurations are frequently exposed through JavaScript files. Ignoring JavaScript analysis means missing high-value entry points.

5.1 Extract JavaScript Files using Katana

Katana is a powerful crawler from ProjectDiscovery. It crawls live targets and extracts JavaScript file URLs, including dynamically loaded scripts.

katana -list alive.txt -jc -jsl | grep -E '\.js($|\?)' > js1.txt

Reference: GitHub – projectdiscovery/katana: A next-generation crawling and spidering framework.

5.2 Extract JavaScript using GoSpider

GoSpider is popular for deep crawling and often pulls JS files that other crawlers miss; widely used implementations and forks add options for headless crawling, crawl depth, and integration with other tooling.

gospider -S alive.txt -c 10 -d 0 -t 20 --other-source | grep -oE 'https?://[^[:space:]]+' | grep -E '\.js($|\?)' > js2.txt

Reference: Gospider – Fast web spider written in Go
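
Before scanning, merge and de-duplicate both crawl outputs into the final_js.txt list used in the remaining steps:

cat js1.txt js2.txt | sort -u > final_js.txt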

5.3 Detect Sensitive Information using Nuclei

Nuclei is a template-driven vulnerability scanner. Use community/official JS templates to scan downloaded JS files for common secret patterns, cloud-config leaks, tokens, etc.

Reference: GitHub – projectdiscovery/nuclei (see Phase 4)

cat final_js.txt | nuclei -t js-templates/ -o js_leaks_report.txt

It scans JavaScript files for exposed secrets like API tokens, cloud credentials, OAuth details, and Firebase configurations. These leaks regularly lead to critical vulnerabilities.

5.4 Active & Passive Secret Monitoring using JSMon

After completing passive JavaScript reconnaissance, the extracted final_js.txt file can be uploaded to JSMon, a JavaScript security platform that provides monitoring and scanning for JS assets. The jsmon-cli tool lets you upload lists or integrate scanning into pipelines and workspaces, and it can identify hardcoded secrets, exposed API keys and tokens, sensitive configurations, misconfigured S3 buckets, npm takeovers, and GraphQL/API endpoints.

jsmon-cli -d Redacted.com -wksp <WORKSPACE_ID>

Reference: Jsmon

5.5 Deep Secret Extraction with SecretFinder

SecretFinder (Python) is based on LinkFinder; it beautifies JS and uses large regex sets to find potential API keys, tokens, JWTs, and similar secrets. It can output plain text or HTML and is commonly used in JS recon pipelines.

cat final_js.txt | while read url; do python3 SecretFinder.py -i "$url" -o cli >> secrets.txt; done

Reference: SecretFinder – A python script for find sensitive data (apikeys, accesstoken,jwt,..) and search anything on javascript files

5.6 JS parsing & extraction with LazyEgg

LazyEgg is a JS-focused extraction tool that can pull links, JS URLs, localStorage content, and leaked credentials; it also includes a Chrome extension for real-time JS logging. It’s useful for richer client-side data collection and live-browser captures.

cat final_js.txt | xargs -I{} bash -c 'echo -e "\nTarget: {}\n" && python lazyegg.py "{}" --js_urls --domains --leaked_creds'

Reference:  GitHub – schooldropout1337/lazyegg

Recon is not just the first step in bug bounty hunting; it’s the foundation of everything you’ll find later.

When you stop relying on luck and start relying on a structured workflow, your results change completely.

This layered approach is what separates guesswork from real exploitation. Over time, this methodology will help you move from finding random bugs to uncovering high-impact, report-worthy vulnerabilities with confidence.
