Directories/URLs gathering

getallurls/gau, hakrawler, gobuster, wfuzz, dirb, dirbuster, ffuf

Common files to check during recon:

  1. robots.txt

  2. sitemap.xml
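
robots.txt is worth checking because its Allow and Disallow rules often name paths that are not linked from the site itself. As a minimal sketch, the helper below (the name robots_paths is hypothetical, not part of any tool listed here) reads a robots.txt body on standard input and prints the unique paths it mentions.

```shell
# robots_paths: read a robots.txt body on stdin and print the unique paths
# named in Allow/Disallow rules. The function name is illustrative only.
robots_paths() {
  tr -d '\r' | awk '{
    line = $0
    # Match "Allow:" or "Disallow:" (case-insensitive) at the start of the line
    if (match(tolower(line), /^[ \t]*(dis)?allow[ \t]*:[ \t]*/)) {
      path = substr(line, RLENGTH + 1)
      sub(/[ \t]*(#.*)?$/, "", path)   # drop trailing spaces and comments
      if (path != "") print path        # skip empty rules such as "Disallow:"
    }
  }' | sort -u
}

# Usage against a live target (needs curl and network access):
#   curl -s http://88.88.88.88/robots.txt | robots_paths
```

The paths printed this way can be visited manually or appended to a wordlist for the fuzzing tools below.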

Burp Suite Target -> Site map

The Site map feature, found under the Target tab in Burp Suite, provides an overview of the directories Burp has found. The map is built up as the web app is explored, and can be extended with additional crawling.

getallurls/gau

The getallurls package installed directly in Kali Linux with apt install seems to have some issues. Use the gau binary instead, which can be installed by following the installation steps in the project's GitHub repository.

Basic commands:

# Basic 
$ echo <domain> | gau
$ gau <domain> 

$ gau <domain> --verbose # show verbose output

# eg.
$ gau vulnweb.com --verbose
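
gau typically returns many URLs that differ only in file name or query string. As a rough sketch, assuming the tool prints one URL per line, the helper below (url_dirs is a hypothetical name) reduces such a list to its unique directory prefixes.

```shell
# url_dirs: read URLs on stdin (one per line) and print the unique
# directory prefixes. The function name is illustrative only.
url_dirs() {
  sed -E \
    -e 's/[?#].*$//' \
    -e 's|^([A-Za-z][A-Za-z0-9+.-]*://[^/]+)$|\1/|' \
    -e 's|[^/]*$||' \
  | sort -u
}
# The three sed steps: strip the query string and fragment, add a trailing
# slash to bare hosts, then drop the final path segment (the file name).

# Usage (needs gau and network access):
#   gau vulnweb.com | url_dirs
```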

hakrawler

$ echo <HTTP_URL> | hakrawler
$ echo <HTTP_URL> | hakrawler -subs # include subdomains

# eg. 
$ echo https://domain.com | hakrawler 
$ echo https://domain.com | hakrawler -subs

ffuf

ffuf is a fast web fuzzer that can be used for directory discovery. Fuzzing generally refers to sending random or unexpected data to an application. Here, ffuf requests URLs built from a wordlist to discover content that would not have been found otherwise.

Basic command:

$ ffuf -w <path_to_wordlist> -u <http_url>/FUZZ

Flags

-w: Path to wordlist

-u: HTTP/HTTPS endpoint URL to fuzz

The FUZZ keyword in the URL supplied to the -u flag is replaced by each word in the wordlist.
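
To make the substitution concrete, the sketch below prints the URLs that would result from a given template and wordlist. It only illustrates the idea and is not how ffuf is implemented; the function name expand_fuzz is hypothetical, and skipping empty lines is a choice of this sketch.

```shell
# expand_fuzz: print one URL per wordlist entry, with FUZZ replaced by the entry.
# Usage: expand_fuzz <url_template> <path_to_wordlist>
# Illustration only; it sends no requests.
expand_fuzz() {
  awk -v template="$1" '
    BEGIN { n = split(template, parts, "FUZZ") }   # pieces around each FUZZ
    $0 != "" {                                      # skip empty lines
      url = parts[1]
      for (i = 2; i <= n; i++) url = url $0 parts[i]
      print url
    }
  ' "$2"
}

# Usage:
#   expand_fuzz http://88.88.88.88/FUZZ ~/wordlists/common.txt
```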

Example

Suppose there is a target at the HTTP address http://88.88.88.88 to be fuzzed, with the wordlist at ~/wordlists/common.txt containing common directory names.

$ ffuf -w ~/wordlists/common.txt -u http://88.88.88.88/FUZZ

        /'___\  /'___\           /'___\       
       /\ \__/ /\ \__/  __  __  /\ \__/       
       \ \ ,__\\ \ ,__\/\ \/\ \ \ \ ,__\      
        \ \ \_/ \ \ \_/\ \ \_\ \ \ \ \_/      
         \ \_\   \ \_\  \ \____/  \ \_\       
          \/_/    \/_/   \/___/    \/_/       

  ...
________________________________________________

 :: Method           : GET
 :: URL              : http://88.88.88.88/FUZZ
 :: Wordlist         : FUZZ: ~/wordlists/common.txt
 ...
 ...
________________________________________________

assets                  [Status: 301, ...]
robots.txt                 [Status: 200, ...]

:: Progress: [xxx/xxx] :...

The output shows that the entries assets and robots.txt returned the status codes 301 and 200 respectively, indicating that content is present at those paths.
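
When reading results like these, the status code hints at what sits behind each path. The helper below is a rule-of-thumb sketch for triaging codes by hand; the name describe_status and the wording of each category are illustrative and are not produced by ffuf.

```shell
# describe_status: map an HTTP status code to a rough meaning for
# content discovery. A rule of thumb only; servers can behave differently.
describe_status() {
  case "$1" in
    2??)     echo "found: content returned" ;;
    3??)     echo "redirect: often a directory or a moved resource" ;;
    401|403) echo "exists but access is restricted" ;;
    404)     echo "not found" ;;
    *)       echo "other: inspect manually" ;;
  esac
}

# Usage:
#   describe_status 301
```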

dirb

"DIRB is a Web Content Scanner. It looks for existing (and/or hidden) Web Objects. It basically works by launching a dictionary based attack against a web server and analyzing the responses."

Basic command:

$ dirb <http_url> <path_to_wordlist>

Example

Scan the URL http://88.88.88.88/ with the wordlist provided.

$ dirb http://88.88.88.88/ ~/wordlists/common.txt

-----------------
DIRB ...    
By The Dark Raver
-----------------

...
URL_BASE: http://88.88.88.88/
WORDLIST_FILES: ~/wordlists/common.txt

-----------------

GENERATED WORDS: ...                                                          

---- Scanning URL: http://88.88.88.88/ ----
...  
...

gobuster

"Gobuster is a tool used to brute-force: URIs (directories and files) in web sites, DNS subdomains (with wildcard support), Virtual Host names on target web servers, ..."

Basic command:

$ gobuster dir --url <http_url> -w <path_to_wordlist>

Flags

dir: Uses directory/file enumeration mode

--url: HTTP/HTTPS endpoint URL to brute-force

-w: Path to wordlist

Example

To brute-force the HTTP URL http://88.88.88.88/ with the wordlist ~/wordlists/common.txt:

$ gobuster dir --url http://88.88.88.88/ -w ~/wordlists/common.txt

===============================================================
Gobuster v3.6
by OJ Reeves (@TheColonial) & Christian Mehlmauer (@firefart)
===============================================================
[+] Url:                     http://88.88.88.88/
[+] Method:                  GET
..
[+] Wordlist:                ~/wordlists/common.txt
...
===============================================================
Starting gobuster in directory enumeration mode
===============================================================
/assets               (Status: 301) ...
/robots.txt              (Status: 200) ...

Progress: xxx/xxx (99.98%)
===============================================================
Finished
===============================================================

The output shows that the entries /assets and /robots.txt returned the status codes 301 and 200 respectively, indicating that content is present at those paths.
