How to Use Dirsearch

Black Hills Information Security, Inc.


BHIS · 2025-07-02 · via Black Hills Information Security, Inc.

Chris has been working in security for 30 years, mainly doing penetration testing in both consulting and corporate environments. Chris is the author of the Nikto web scanner, founder of the RVAsec conference, and has been involved in many OSS projects and community efforts.

Dirsearch is an open-source, multi-threaded “web path discovery” tool first released in 2014. Written in Python, it is similar to tools such as Dirbuster and Gobuster and aims to quickly find hidden content on web sites. Unlike Dirbuster (and possibly Gobuster), Dirsearch is still under active development, and unlike Gobuster, it is focused on path discovery.

It has several features to aid in discovery and can be easily customized to handle web servers which respond in unusual ways or require additional headers.

It operates by reading in a list of files and paths (a “wordlist”), optionally performing transformations on the list, making an HTTP request for each file, and reporting the results based on internal or user-defined rules.
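The workflow described above can be sketched in a few lines of Python. This is a simplified illustration, not Dirsearch's actual implementation: the `scan` function, the `fetch` callable, and the sample paths are all hypothetical, and the HTTP layer is stubbed out so the sketch runs without touching a network.

```python
from typing import Callable, Iterable


def scan(base_url: str,
         wordlist: Iterable[str],
         fetch: Callable[[str], int],
         exclude_status: frozenset = frozenset({404})) -> list:
    """Request every wordlist entry and keep results that pass the rules."""
    results = []
    for line in wordlist:
        path = line.strip()
        if not path:
            continue  # skip blank lines
        url = base_url.rstrip("/") + "/" + path.lstrip("/")
        status = fetch(url)  # one HTTP request per entry
        if status not in exclude_status:
            results.append((status, url))
    return results


# Stub standing in for a real HTTP client (hypothetical responses).
FAKE_SITE = {
    "https://example.com/admin": 200,
    "https://example.com/old": 301,
}


def fake_fetch(url: str) -> int:
    return FAKE_SITE.get(url, 404)


found = scan("https://example.com/", ["admin", "old", "missing", ""], fake_fetch)
print(found)
```

Passing the HTTP client in as a parameter keeps the reporting rules separate from the networking, which is also what makes the sketch easy to try offline.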

Why Use Dirsearch?

Hidden (or unlinked, if you prefer) content on web sites can lead to security issues in multiple ways. This content could include administration panels, installation files, full applications, documentation, test programs, source code repositories, and neglected or forgotten content, among other things. Sometimes this leads to simple information disclosure, and other times can lead to full compromise.

No matter the type of test, knowledge of the full attack surface is critical to properly assessing it.

Installation

Dirsearch requires Python and runs on any platform that supports Python 3.9 or higher. It can be installed with pip, manually from GitHub, through many operating system package managers, or in Docker. This post details the GitHub installation method.

Git installation:

git clone https://github.com/maurosoria/dirsearch.git --depth 1

On some operating systems, such as macOS, a Python virtual environment must be used to properly install the dependencies. This is easily done once Python’s venv module is available.

cd dirsearch
python3 -m venv venv_dirsearch
source venv_dirsearch/bin/activate
python3 -m pip install -r requirements.txt

Verify the installation is successful by checking the installed version.

[Screenshot: Dirsearch version check]

See the project’s README document on GitHub for other installation options.

Wordlists

Dirsearch is only as good as the wordlist used. A wordlist is a simple text file with paths and/or filenames (with or without file extensions). Dirsearch reads this file, transforms each line if requested by the user, and then makes the HTTP request to look for the file.

Dirsearch includes a wordlist at db/dicc.txt which contains nearly 10,000 entries. Other wordlists can be found around the internet, for example in the SecLists repository. Some lists are product specific, such as Java Servlet names, and some are generic. Your selection may vary from website to website. A large and generic list, such as big.txt, is often a good place to start.

The default wordlist uses custom variables such as %EXT% to denote where the file extension should be placed. Only these variables are replaced by default; other lines will not have file extensions added. To force the use of extensions on every entry, use the -f or --force-extensions flag.
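The difference between the default and forced behavior can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not Dirsearch's own code: `expand_wordlist` and the sample entries are hypothetical, and the real tool may treat edge cases (such as directory entries) differently.

```python
def expand_wordlist(lines, extensions, force=False):
    """Expand %EXT% placeholders; optionally append extensions to plain entries."""
    paths = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        if "%EXT%" in entry:
            # One path per requested extension.
            paths.extend(entry.replace("%EXT%", ext) for ext in extensions)
        else:
            paths.append(entry)
            if force and not entry.endswith("/"):
                # Forced mode: also try the entry with each extension appended.
                paths.extend(f"{entry}.{ext}" for ext in extensions)
    return paths


words = ["index.%EXT%", "admin", "backup/"]
default_paths = expand_wordlist(words, ["php", "html"])
forced_paths = expand_wordlist(words, ["php", "html"], force=True)
print(default_paths)
print(forced_paths)
```

In this sketch the forced run adds admin.php and admin.html, which shows why forcing extensions increases the number of requests.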

Note that if you use a non-default wordlist with Dirsearch, you can replace the extensions already present in the wordlist by adding the --overwrite-extensions flag, which substitutes the extensions selected with -e. For example, a wordlist with the file admin.php combined with the options -e html,jsp --overwrite-extensions will test for:

admin.html
admin.jsp
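The substitution shown above amounts to swapping the final extension of each entry. A minimal Python sketch of that idea follows; the helper name `overwrite_extensions` is hypothetical and the logic is a simplification rather than Dirsearch's implementation.

```python
import posixpath


def overwrite_extensions(entry, extensions):
    """Swap an entry's existing extension for each requested one."""
    root, old_ext = posixpath.splitext(entry)
    if not old_ext:
        return [entry]  # nothing to overwrite
    return [f"{root}.{ext}" for ext in extensions]


rewritten = overwrite_extensions("admin.php", ["html", "jsp"])
print(rewritten)
```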

Basic Usage

The simplest way to use Dirsearch is to provide a URL with the -u flag. This will run the program with the default options using the built-in wordlist.

python3 dirsearch.py -u https://example.com/

The program will report what is being tested in the initial output.

As seen in the screenshot above, the default output can be quite verbose and includes things you probably don’t care about. To change this, use the -x (--exclude-status) flag to stop reporting files that were reported as “not found” with HTTP response code 404.

python3 dirsearch.py -u https://example.com/ -x 404

This output is more helpful, as we can see paths which exist (200 “OK”) and paths which redirect somewhere else (301 “Moved Permanently”). However, we don’t want to load each of those paths in a web browser to see where they end up because we are ~~lazy~~ busy. To have Dirsearch do this for us, we can add the -F (--follow-redirects) option.

python3 dirsearch.py -u https://example.com/ -x 404 -F

Now Dirsearch has followed redirects and is only reporting the ones which are not 404 “Not Found” afterwards.
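To see why following redirects changes what gets reported, consider this small Python sketch. It uses a stubbed response table instead of real HTTP requests; the `resolve` helper, the paths, and the hop limit are hypothetical and only illustrate the idea of judging a path by where it finally lands.

```python
REDIRECTS = {301, 302, 303, 307, 308}


def resolve(path, responses, max_hops=5):
    """Follow Location targets in a stubbed table; return (final status, final path)."""
    status, location = responses.get(path, (404, None))
    hops = 0
    while status in REDIRECTS and location and hops < max_hops:
        path = location
        status, location = responses.get(path, (404, None))
        hops += 1
    return status, path


# Hypothetical server behavior: path -> (status, Location header).
FAKE_RESPONSES = {
    "/admin": (301, "/admin/"),
    "/admin/": (200, None),
    "/old": (302, "/gone"),  # redirects to a page that does not exist
}

admin_result = resolve("/admin", FAKE_RESPONSES)
old_result = resolve("/old", FAKE_RESPONSES)
print(admin_result, old_result)
```

Here /admin ends at a 200 response and would be reported, while /old ends at a 404 and would be dropped by -x 404.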

Finally, we know the web application is built in PHP, so we will focus the scan and improve performance by only looking for files we think might be there. We can accomplish this with the -e (--extensions) option along with a list of file extensions.

python3 dirsearch.py -u https://example.com/ -x 404,403 -F -e php,htm,html

While the output does not appear different, fewer requests were made due to the limited file extensions.

Advanced Usage

The previous output example actually has a false positive result with the /passwords path. The target server answers 200 for this even though it’s not there, a common problem when scanning the web. The response body for the false positive page has the string F5 in it that we can use to filter out the incorrect results because it only appears in pages which aren’t found.

For this, we’ll use the --exclude-text option (other similar options include --exclude-regex and --exclude-size).

python3 dirsearch.py -u https://example.com/ -x 404,403 -F -e php,htm,html --exclude-text "F5"

The program examined the response body, matched F5, and ignored the results.
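The filtering logic can be pictured as a second check applied after the status code check. The Python sketch below is a hedged illustration only: `filter_results`, the sample paths, and the response bodies are invented for the example and do not come from Dirsearch or the target server.

```python
def filter_results(results, exclude_status=(), exclude_text=None):
    """Drop results by status code, then by a marker string in the body."""
    kept = []
    for path, status, body in results:
        if status in exclude_status:
            continue
        if exclude_text is not None and exclude_text in body:
            continue  # "soft 404": answers 200 but the body marks it as missing
        kept.append((path, status))
    return kept


# Hypothetical responses: (path, status, body).
responses = [
    ("/index.php", 200, "<h1>Welcome</h1>"),
    ("/passwords", 200, "The requested URL was rejected by F5."),
    ("/secret", 403, "Forbidden"),
]

kept = filter_results(responses, exclude_status={403, 404}, exclude_text="F5")
print(kept)
```

A short marker like F5 is only safe when it never appears on legitimate pages, so it is worth checking a few real responses before relying on it.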

Some scenarios may require additional headers, such as an access token or authorization header. This can be done with the -H (--header) option, for example:

python3 dirsearch.py -u https://example.com/ -x 404,403 -F -e php,htm,html --exclude-text "F5" -H "Authorization: Basic RnVubnk6WW91VGhvdWdodFRoaXNXYXNSZWFsCg=="

This will send the Authorization header with each request.
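For HTTP Basic authentication, the value after "Basic" is the base64 encoding of username:password. If you need to build such a header for your own credentials, a short Python sketch like the following works; the `basic_auth_header` helper and the user/pass credentials are placeholders for illustration.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header line."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Authorization: Basic {token}"


header = basic_auth_header("user", "pass")
print(header)  # Authorization: Basic dXNlcjpwYXNz
```

The resulting string can be passed as the argument to -H. Keep in mind that base64 is an encoding, not encryption, so the header should be treated like the password itself.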

If we want to dig into a web site even further and use the response body to find additional paths, we can sometimes get better results. To assist with this, Dirsearch has the --crawl option.

Compare the output from the following two commands.

python3 dirsearch.py -u https://www.google.com/ -x 301,302,400

python3 dirsearch.py -u https://www.google.com/ -x 301,302,400 --crawl

In the output with --crawl, more paths were reported on the target server because Dirsearch extracted them from the HTML of each response. If this is combined with recursive scanning (-r or --recursive), the scanner will continue to run against each path it identifies. While this can lead to better results, be aware that it can also generate a lot of requests and run for an extended time.
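Conceptually, crawling means parsing each response for links and adding any same-site paths to the list of things to test. The Python sketch below shows one way to do that with the standard library; it is an illustration of the idea rather than Dirsearch's crawler, and the `LinkExtractor` class, `crawl_paths` function, and sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect URL-bearing attribute values from an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src", "action") and value:
                self.links.append(value)


def crawl_paths(base_url, html):
    """Return the sorted, de-duplicated same-host paths referenced by a page."""
    parser = LinkExtractor()
    parser.feed(html)
    host = urlparse(base_url).netloc
    paths = set()
    for link in parser.links:
        absolute = urlparse(urljoin(base_url, link))
        if absolute.netloc == host:  # stay on the target host
            paths.add(absolute.path)
    return sorted(paths)


page = (
    '<a href="/about">About</a>'
    '<script src="js/app.js"></script>'
    '<a href="https://other.example/x">Elsewhere</a>'
)
discovered = crawl_paths("https://example.com/", page)
print(discovered)
```

Restricting results to the original host matters on real engagements, since links to third-party sites are usually out of scope.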

Finally, if we want to save the results, Dirsearch supports a wide range of file and database output formats: simple, plain, JSON, XML, Markdown, CSV, HTML, SQLite, MySQL, and PostgreSQL. To use them, combine the -O (--output-format) and -o (--output-path) options, for example:

python3 dirsearch.py -u https://example.com/ -O xml -o results.xml

It can also connect directly to Postgres and MySQL databases with the --postgres-url and --mysql-url options, respectively.

Summary

Finding a hidden file or admin panel can sometimes lead to a full compromise. Knowing your target’s attack surface as fully as possible will help lead you to the best results during a penetration test.

Dirsearch has many additional options that can influence both scanning and reporting to help you get there. As you become more familiar with the tool, explore these additional flags to find your most reliable scanning methods, keeping in mind that the best combination will likely vary from target to target.
