
Directories/URLs gathering

getallurls/gau, hakrawler, gobuster, wfuzz, dirb, dirbuster, ffuf


Last updated 12 days ago

Refer to the general guide for more information on a few selected tools.

Common web directories to recon:

  1. robots.txt

  2. sitemap.xml
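As a rough sketch of how robots.txt can feed the recon, the snippet below pulls the paths out of the Allow/Disallow rules. The function name robots_paths and the sample content are hypothetical and only there for illustration; they are not taken from a real target.

```shell
#!/bin/sh
# Print the unique paths found in Allow/Disallow rules of a robots.txt read from stdin.
robots_paths() {
  # Keep only Allow:/Disallow: lines (case-insensitive), strip the directive,
  # trailing whitespace and inline comments, drop empty rules, de-duplicate.
  grep -iE '^[[:space:]]*(dis)?allow:' \
    | sed -E 's/^[^:]*:[[:space:]]*//; s/[[:space:]]*(#.*)?$//' \
    | grep -v '^$' \
    | sort -u
}

# Hypothetical sample content, used only to show the behaviour.
robots_paths <<'EOF'
User-agent: *
Disallow: /admin/
Disallow: /backup/
Allow: /public/
Disallow:
EOF
```

Against a live target, the same function could be fed with something like curl -s <http_url>/robots.txt | robots_paths.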

Burp suite target -> site map

The Site map feature, found under the Target tab in Burp Suite, gives an overview of the directories Burp has discovered. The map is built up as the web app is explored through Burp, and can be extended with additional crawling.

getallurls/gau

The getallurls package installed directly in Kali Linux with apt install seems to have some issues.

Instead, the gau binary can be installed by following the installation steps in the GitHub link below:

  1. Determine the current system architecture (Linux)

$ uname -a
$ arch
$ cat /proc/version
... aarch64
... x86_64

Since I am running Kali on a Raspberry Pi in this case, the commands above display aarch64, which is 64-bit ARM (ARM64).

  2. Download the appropriate gau binary .tar.gz file. In my case, it is labelled gau_2.2.4_linux_arm64.tar.gz.

  3. Extract the files, and move the binary to a directory on your PATH. Depending on your system, the default binary path may differ. In this example, I'll assume it's /usr/bin.

$ mktemp -d # create temp dir
$ cd ... # change dir to the temp dir printed above

$ tar xvf gau_2.2.4_linux_arm64.tar.gz
LICENSE
README
gau

$ file gau
gau: ELF 64-bit LSB executable, ARM aarch64, ...
$ sudo mv gau /usr/bin/gau # /usr/bin is normally writable by root only
$ which gau
/usr/bin/gau
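The architecture check and the archive name above can be tied together with a small helper. This is only a sketch: the function name gau_arch is hypothetical, and the amd64 suffix for x86_64 is an assumption based on the gau_2.2.4_linux_arm64.tar.gz naming pattern, so check the release page for the exact file name.

```shell
#!/bin/sh
# Map the machine name reported by `uname -m` to the architecture suffix
# used in the archive name (assumed pattern: gau_<version>_linux_<arch>.tar.gz).
gau_arch() {
  case "$1" in
    aarch64|arm64) echo "arm64" ;;
    x86_64|amd64)  echo "amd64" ;;   # assumption, not confirmed by this guide
    *)             echo "unknown"; return 1 ;;
  esac
}

version="2.2.4"   # version used in this guide
arch="$(gau_arch "$(uname -m)")" || echo "unsupported architecture: $(uname -m)" >&2
echo "gau_${version}_linux_${arch}.tar.gz"
```

On the Raspberry Pi setup described here, this prints gau_2.2.4_linux_arm64.tar.gz.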

Basic commands:

# Basic 
$ echo <domain> | gau
$ gau <domain> 

$ gau <domain> --verbose # show verbose output

# eg.
$ gau vulnweb.com --verbose
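gau prints one URL per line, and the list can get long. A minimal sketch of reducing such a list to unique paths is shown below; the function name unique_paths and the sample URLs are hypothetical.

```shell
#!/bin/sh
# Reduce a list of URLs (one per line on stdin) to a sorted list of unique paths.
unique_paths() {
  # 1st expression: drop the scheme and host; 2nd: drop query string and fragment.
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]*##; s/[?#].*$//' \
    | grep -v '^$' \
    | sort -u
}

# Hypothetical sample input, standing in for real gau output.
unique_paths <<'EOF'
http://example.com/admin/login.php?next=/
https://example.com/admin/login.php
http://example.com/assets/app.js?v=2
http://example.com
EOF
```

With gau installed, the equivalent would be gau <domain> | unique_paths.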

hakrawler

$ echo <HTTP_URL> | hakrawler
$ echo <HTTP_URL> | hakrawler -subs # include subdomains

# eg. 
$ echo https://domain.com | hakrawler 
$ echo https://domain.com | hakrawler -subs

ffuf

ffuf is a fast web fuzzer that can be used for directory discovery. Fuzzing refers to sending a large number of generated inputs to an application (here, URL paths taken from a wordlist) to discover content that would not have been found otherwise.

Basic command:

$ ffuf -w <path_to_wordlist> -u <http_url>/FUZZ

Flags

-w: Path to wordlist

-u: HTTP/HTTPS endpoint URL to fuzz

The FUZZ keyword in the URL supplied to the -u flag will be replaced by each word given in the wordlist.
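To make the substitution concrete, the sketch below expands a URL template the same way, without sending any requests. It is not how ffuf is implemented; the function name expand_fuzz is hypothetical, and it only handles a single FUZZ occurrence.

```shell
#!/bin/sh
# Print the URLs that result from replacing FUZZ in a template with each
# word read from stdin (one word per line). Only the first FUZZ is replaced.
expand_fuzz() {
  prefix=${1%%FUZZ*}   # text before the first FUZZ
  suffix=${1#*FUZZ}    # text after the first FUZZ
  while IFS= read -r word; do
    [ -n "$word" ] || continue   # skip blank lines
    printf '%s%s%s\n' "$prefix" "$word" "$suffix"
  done
}

# Three sample words standing in for a wordlist file.
printf 'assets\nrobots.txt\nadmin\n' | expand_fuzz "http://88.88.88.88/FUZZ"
```

Each printed line is one URL that the fuzzer would request.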

Example

Suppose the target at http://88.88.88.88 is to be fuzzed with the wordlist ~/wordlists/common.txt, which contains common directory names.

$ ffuf -w ~/wordlists/common.txt -u http://88.88.88.88/FUZZ

        /'___\  /'___\           /'___\       
       /\ \__/ /\ \__/  __  __  /\ \__/       
       \ \ ,__\\ \ ,__\/\ \/\ \ \ \ ,__\      
        \ \ \_/ \ \ \_/\ \ \_\ \ \ \ \_/      
         \ \_\   \ \_\  \ \____/  \ \_\       
          \/_/    \/_/   \/___/    \/_/       

  ...
________________________________________________

 :: Method           : GET
 :: URL              : http://88.88.88.88/FUZZ
 :: Wordlist         : FUZZ: ~/wordlists/common.txt
 ...
 ...
________________________________________________

assets                  [Status: 301, ...]
robots.txt              [Status: 200, ...]

:: Progress: [xxx/xxx] :...

The output shows that two paths, assets and robots.txt, returned status codes (301 and 200 respectively) indicating that content is present.

dirb

DIRB is a Web Content Scanner. It looks for existing (and/or hidden) Web Objects. It basically works by launching a dictionary based attack against a web server and analyzing the responses.

Basic command:

$ dirb <http_url> <path_to_wordlist>

Example

Scan the URL http://88.88.88.88/ with the wordlist provided.

$ dirb http://88.88.88.88/ ~/wordlists/common.txt

-----------------
DIRB ...    
By The Dark Raver
-----------------

...
URL_BASE: http://88.88.88.88/
WORDLIST_FILES: ~/wordlists/common.txt

-----------------

GENERATED WORDS: ...                                                          

---- Scanning URL: http://88.88.88.88/ ----
...  
...

gobuster

Gobuster is a tool used to brute-force: URIs (directories and files) in web sites, DNS subdomains (with wildcard support), Virtual Host names on target web servers, ...

Basic command:

$ gobuster dir --url <http_url> -w <path_to_wordlist>

Flags

dir: Uses directory/file enumeration mode

--url: HTTP/HTTPS endpoint URL to brute-force

-w: Path to wordlist

Example

To brute-force the HTTP URL http://88.88.88.88/ with the wordlist ~/wordlists/common.txt:

$ gobuster dir --url http://88.88.88.88/ -w ~/wordlists/common.txt

===============================================================
Gobuster v3.6
by OJ Reeves (@TheColonial) & Christian Mehlmauer (@firefart)
===============================================================
[+] Url:                     http://88.88.88.88/
[+] Method:                  GET
..
[+] Wordlist:                ~/wordlists/common.txt
...
===============================================================
Starting gobuster in directory enumeration mode
===============================================================
/assets               (Status: 301) ...
/robots.txt           (Status: 200) ...

Progress: xxx/xxx (99.98%)
===============================================================
Finished
===============================================================

The output shows that two paths, /assets and /robots.txt, returned status codes (301 and 200 respectively) indicating that content is present.
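As a rule of thumb for reading the status codes in the ffuf and gobuster output, the sketch below maps a code to a short hint. The grouping is my own assumption for triage, not something either tool prints, and the function name status_hint is hypothetical.

```shell
#!/bin/sh
# Map an HTTP status code to a rough triage hint (assumed grouping).
status_hint() {
  case "$1" in
    2??)             echo "found" ;;        # content served directly
    301|302|307|308) echo "redirect" ;;     # often a directory or a moved page
    401|403)         echo "restricted" ;;   # exists, but access is denied
    404)             echo "missing" ;;
    *)               echo "other" ;;
  esac
}

for code in 200 301 403 404; do
  printf '%s -> %s\n' "$code" "$(status_hint "$code")"
done
```

Under this grouping, the 301 for assets and the 200 for robots.txt both count as hits worth a closer look.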

Web Fuzzing
GitHub - lc/gau: Fetch known URLs from AlienVault's Open Threat Exchange, the Wayback Machine, and Common Crawl.
GitHub - hakluke/hakrawler: Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application
ffuf | Kali Linux Tools
dirb | Kali Linux Tools (dirb tool)
gobuster | Kali Linux Tools
Pentesting steps: https://docs.google.com/document/d/1r7l_Idd-C13G6aWO6grdbyx5NkqBiANi3cOOZqf06sg/edit
List of tools: https://docs.google.com/document/d/1LLC8hHAKBBRtnVGkW1Bcaf1k_1lSQ1T5h_g8xUL8Gw4/edit