Introduction to the Reconnaissance Phase

 
 

The first phase for any hacker is information gathering and reconnaissance. This post explains some of the key concepts behind it and some tools that can be utilized.


Reconnaissance

As discussed previously, the first phase of the methodology is reconnaissance.

Reconnaissance involves identifying as much information as possible regarding the target system or network.

In general, it is a non-intrusive systematic method used to accumulate data about a specific target network passively, with the goal of finding ways to intrude into the environment.

The effectiveness of the information gathering process is directly related to the success of an attack. It is crucial to systematically and methodically ensure that all pieces of information related to the target are identified.

The tester must harvest information to execute a focused attack.

The more information collected, the more opportunities become available.

Information is often collected from public sources - social networks, Google, company websites - but also from technical sources like DNS.


Information Gathering Types

Information gathering can be both passive and active.

Passive information gathering is done by finding details that are freely available, using techniques that never bring you into direct contact with the organization's servers.

Reviewing the target's own website and other informative sites is arguably an exception (you are technically contacting the company), but these activities do not raise suspicion - who cares if you visit the website?

Calling the help desk and attempting to social engineer them out of privileged information is an example of active information gathering.

Additionally, scanning is also a type of active information gathering.


Footprinting

Reconnaissance involves footprinting the target company (establishing a blueprint of its security profile). The aim is to determine the security posture of the target and, for example, identify key staff, with the end goal of finding a way to intrude into the network.

This is the most important phase of a penetration test. It is thought that an attacker spends 90% of their time profiling an organization and the remaining 10% launching the attack.

This could include information from open sources, running utilities such as DNS tools, or manually searching through articles, financial databases, and social networks for employees or ex-employees, etc.


Google Hacking

No, sadly this is NOT a tutorial on how to hack Google and make billions of dollars. Google Hacking simply refers to the use of advanced operators in Google to find potentially vulnerable or interesting information.

Google searches can be a rich source of information to perform passive reconnaissance. Using Google advanced searches, specific data can be filtered to extract sensitive information.

Google advanced search strings can be quite sophisticated but they can provide a range of interesting information.

Google advanced operators are used to make searches less ambiguous. Some of the most useful operators are:

  • allintext

  • allintitle

  • allinurl

  • filetype

  • info

  • intext

  • intitle

  • inurl

  • link

  • site

As an example, the filetype operator could be used to find PDFs belonging to a specific organization.

Another example is the site operator which can be used to gather information about a specific site/domain. For example the query "site:complexsec.com" returns pages only from complexsec.com.

Combining the two, you could search for a specific domain and the PDF files on that site using "site:domain.com filetype:pdf".

A quick summary of some operators:

  • intitle - searches for pages with the string in their HTML title

  • allintitle - similar to intitle, but looks for all the strings in the title

  • inurl - searches for string in the URL

  • allinurl - same as inurl, but searches for all terms

  • filetype - searches for specific files

  • ext - similar to filetype

For some fun, here are a few examples:

  • intitle:"admin login" + site:"bbc.co.uk"

  • intitle:"Network Camera" inurl:"/ViewerFrame?Mode=Refresh"

  • intitle:"Toshiba Network Camera" user login

  • inurl:"section.php?id="

  • inurl:"item_id.php?id="

Please be aware that searching for these is not illegal; however, what you do with some of the information found could be.

A fantastic website is the Google Hacking Database (GHDB), hosted on Exploit-DB.


Incremental Substitution

Incremental substitution is an old technique which involves replacing numbers in a URL in an attempt to find directories or files that are hidden or unlinked from other pages.

By changing the numbers in file names or paths, other files and directories that exist but are not linked from any page can be located.

For example, some possible modifications could be:

  • gallery/DSCF001.jpg -> gallery/DSCF002.jpg

  • lecture/01/handout.pdf -> lecture/02/handout.pdf
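
This process can be scripted. The following is a minimal sketch (the host, path and number range are purely illustrative) that probes a run of incrementally numbered files with curl and prints the HTTP status code for each - a 200 suggests the file exists even if nothing links to it:

#!/bin/bash
# Probe incrementally numbered image files and print the HTTP status for each.
# Host, path and range are illustrative - adjust them to your own target/lab.
for i in $(seq -w 1 20); do
    url="http://intranet.caley.uni/gallery/DSCF0${i}.jpg"
    # -o /dev/null discards the body; -w prints only the HTTP status code
    code=$(curl -s -o /dev/null -w "%{http_code}" "$url")
    echo "$code $url"
done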


Robots.txt

Robots are cool right? Well, they’re also useful for us as hackers.

The robots.txt file tells the search engine robots that crawl the web which parts of a site they should not access. Well-behaved bots check the robots.txt file to see if they are prevented from accessing certain pages.

Because the file lists paths, it can reveal to malicious users where sensitive information or interesting locations are stored.

The file contains "rules" about what is private in a website's directory structure. Analysis of the file could help an intruder narrow down the scope of an attack.

A well-known example is Google's own file, which you can view at google.com/robots.txt.
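
You can also fetch the file for any site directly; a typical (purely illustrative) robots.txt looks something like the below, where each Disallow line names a path the site owner would rather keep out of search engines:

curl http://www.example.com/robots.txt

User-agent: *
Disallow: /admin/
Disallow: /backup/
Disallow: /internal-reports/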


Archive of Websites

The Internet Archive Wayback Machine allows you to explore archived versions of websites. This exploration allows an attacker to gather information on an organization's web pages since its creation.

As the website keeps track of web pages from the time of their creation, an attacker can retrieve even information that has been removed from the target website, such as pages, audio files, video files, images, text, and programs.

Attackers can use this to perform phishing and other types of web app attacks on the target organization.

Another useful feature is that you can use the Wayback Machine to scout a website of a potential target. As an example, instead of visiting google.com, you could visit the Wayback Machine and visit the most recently cached version of that site so you do not actively interact with the target site.
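
For example, browsing to https://web.archive.org/web/*/complexsec.com (the domain is purely illustrative) lists every snapshot the archive holds for that site, and any of them can be opened without ever touching the live server.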


Job Sites

Don’t worry, you don’t have to apply for any jobs or have any interviews. This is purely for research!

Job sites are an excellent source of information, which in many cases would immediately provide details on the target's infrastructure.

You can gather infrastructure information from job postings. For example, a Network Admin role might specify experience with Cisco equipment or Windows Server 2016 indicating that they use that technology.

It's a good idea to look for:

  • Job Requirements

  • Employee profile

  • Hardware information

  • Software information


Social Networks

On social networks, people may post personal information like date of birth, educational information, employment background, spouse names, etc...

Organizations often post information such as potential partners, websites and upcoming news about the company.

For an attacker, social networking sites are valuable sources of information about the target person or organization. Remember that the attacker can only gather the information posted by individuals - if it's not posted, then the information doesn't exist!

To obtain more information, an attacker may create a fake account and use social engineering techniques to lure the victim into revealing more information.

As an example, the attacker could send a friend request to a target person. If they accept the request, the attacker could then access the restricted pages of the target person on that website.

Social Media Resources

There are several resources available to gather valuable information about a target from one or more social media sites.

These allow attackers to discover most shared content across social media sites by using hashtags or keywords, track accounts and URLs on various social media sites, obtain a target email address and more.

The information helps attackers to perform phishing, social engineering, and other types of attacks.

A variety of tools can be used to footprint social media sites like Twitter, Instagram, Facebook and Pinterest to gather sensitive information like DOB, qualifications, employment status, relatives, and info about the organization the target works for, including business strategy, potential clients, and upcoming projects.


People Search

If you want to obtain more information about a specific individual, you could use several online resources. However, Pipl is the one that stands out - sadly, it requires a subscription.

Others include AnyWho and Spokeo, but these are mostly US-only.


Whois Database

The WHOIS database contains information about the assignment of Internet addresses, domain names, registrars, and individual contacts.

It is a query and response protocol used for querying databases that store the registered users or assignees of an Internet resource, such as a domain name, IP address block or an autonomous system (AS).

The protocol listens on port 43 (TCP).

The InterNIC WHOIS database system lists the registrar of a website based on the organization's name or its domain name.

Once you have the registrar's name, you can go to the registrar's web site and get more information, contact details of the admins, registration dates, and the addresses of its DNS servers.

WHOIS is usually the first step in reconnaissance: it supplies the target's domain registrant, its administrative and technical contacts, and a list of their name servers, all of which can then feed into DNS enumeration.

WHOIS searches can also locate details such as a network's autonomous system numbers, network-related handles, and other related points of contact.

WHOIS is the primary tool used to navigate these registration databases. As domain allocation is deregulated, it is advisable to use different WHOIS tools to obtain a complete picture.

Tools include the command-line whois client found on most Linux distributions, as well as the many online WHOIS lookup services.
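
As a quick illustration, the command-line client can be pointed straight at a domain (the domain below is just an example):

  • whois complexsec.com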


DNS Enumeration

DNS is a hierarchical database that stores data about domain names and IP addresses. DNS enumeration is the process of locating all the DNS servers and their records for an organization.

A company may have internal and external DNS servers that can yield target information such as computer names and IP addresses.

After collecting the WHOis records, the next phase is DNS footprinting. Attackers perform DNS footprinting to gather information about DNS servers, DNS records, and types of servers used by the organization.

This helps attackers to identify the hosts connected in the target network and perform further exploitation on the target organization.

The information gathered from WHOIS records provides a starting point for DNS enumeration.

Extracting DNS Information

DNS footprinting reveals information about DNS zone data. Things like DNS domain names, computer names, IP addresses and more information.

An attacker uses DNS information to determine key hosts in the network and then performs social engineering attacks to gather even more information.

DNS - Target Information

DNS footprinting helps in determining the following records about the target DNS:

  • A - the IP address of a host

  • MX - the domain's mail servers

  • NS - the domain's name servers

  • CNAME - canonical name (alias) records

  • SOA - the authority record for the domain

  • SRV - service records

  • PTR - maps an IP address back to a hostname

  • TXT - unstructured text records

NSLookup

Once an attacker knows one of the DNS servers, the attacker can begin interrogating the name servers. NSlookup is a tool for querying DNS servers for specific records, and the information it displays can be used to diagnose and map the DNS infrastructure.

Output can give useful information like system names and IP addresses.

In a zone transfer (very bad!), the nslookup program asks the DNS server to transmit all information it has about a given domain.
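
In the Windows version of nslookup, this can be attempted interactively with the ls subcommand - here pointed at the lab DNS server used later in this post - although the server will only answer if transfers are permitted:

nslookup
> server 192.168.95.100
> ls -d caley.uni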

Locate Network Range

Having been able to locate names, addresses, some server names, and IP addresses, the next step is to:

  • Identify the range of IP addresses

  • Discern the subnet mask

  • Locate the network range to know which addresses can be targeted for future scanning and enumeration analysis

If an attacker takes the IP address of a web server and enters it into the WHOis lookup, the network range can be identified.

For example, running a WHOIS lookup against a web server's public IP address typically returns NetRange and CIDR (or inetnum) fields describing the block the address belongs to.
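
A minimal sketch of such a lookup (the IP address is a placeholder from the documentation range, and the field names vary between regional registries):

  • whois 203.0.113.10 | grep -iE "netrange|cidr|inetnum"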

Network Path

Traceroute can be employed to determine the path taken by packets across an IP network from source to destination; it also identifies the routers along the way.

Traceroute operates by sending probes towards each hop along the path until the destination address is reached (the Windows tracert implementation uses ICMP echo requests, while Linux traceroute uses UDP datagrams by default). It uses the TTL field of the IP header to illustrate the path packets travel between two hosts by sending out consecutive packets with increasing TTLs.

Traceroute can reveal routers, their geographical location & the target's DNS entries.

TTL functions as a counter to track each router hop as the packet travels to the target. Each hop that a packet passes through reduces the TTL field by 1.

If it reaches 0, the packet is discarded and a time exceeded in transit ICMP message is created to inform the source of the failure.

By using this, an attacker determines the layout of a network and the location of each device.
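
For example (the target domain is illustrative; use traceroute on Linux and tracert on Windows):

  • traceroute complexsec.com

  • tracert complexsec.com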

Visual Route

VisualRoute is a graphical tool that determines where and how traffic is flowing on the route between the source and destination, by providing a geographical map of the route.

It also has the ability to identify the geographical location of routers, servers and other IP devices.


Email Tracking

Email provides another tool in the information gathering toolbox. Email spiders can be used to collect email addresses by searching the internet.

An example website is hunter.io


Defending Against Reconnaissance

Defending can be difficult as much of the information is open source. A company should make sure that its own staff do not advertise anything related to work activity (no selfies in the office with badge IDs or PCs in the background, no work talk on Twitter/Facebook, etc.).

The first line of technical defence that any target system can adopt is proper configuration and implementation of their DNS.

Inappropriate queries must be refused by the system, thereby limiting the leakage of crucial information.

If the organization is a high security organization, it can opt to register a domain in the name of a third party, as long as they agree to accept responsibility.


Practical Time

I hear you asking: can we see some of this in action? I'm glad you asked! I will walk through a very simple example in a home lab network - I won't go into detail about how I built it, but it's time to get hands-on.

The topology for my home network is as follows:

Some additional information:

  • There is a web server running on Windows Server 2016 (192.168.95.100)

  • There is a web server running on Linux Server (192.168.207.101)

Gathering Data from DNS Servers

So, as discussed, DNS can provide valuable data. In basic terms, DNS is used to translate domain names into IP addresses. There are a ton of tools capable of extracting data from name servers.

First, we’ll look at NSLookup which is a tool that can be used to query domain name servers for specific records. To use it, the syntax is nslookup [domain]:

The server 192.168.95.100 is the authoritative server for this network.

DNS servers can be queried using a number of different parameters. The first is the "-type=ns" parameter, which looks for the authoritative name servers:

  • nslookup -type=ns caley.uni

To find out the Mail Exchanger for the same domain, specify the "mx" type:

  • nslookup -type=mx caley.uni

An alternative to NSlookup is the "dig" command. The syntax is

  • dig caley.uni

DIG provides more information. First, the version is displayed. Then, the global options and the status are displayed.

The A record for the queried domain appears at the end. The command also returns the server that provided the answer to the query - in this case, 192.168.95.100.

By adding the option "any" to the command above, all the records available for the domain can be displayed:

  • dig caley.uni any

There are a large number of options available. One example is printing the short results (i.e. only the IP):

  • dig +short XPClient.caley.uni

A reverse lookup (IP to domain) can be obtained by adding the option "-x" to the command:

  • dig -x 192.168.95.100

The IP address is written in reverse and the special in-addr.arpa domain is appended to it - for example, 192.168.95.100 becomes 100.95.168.192.in-addr.arpa. This is a special-purpose domain used exclusively for technical purposes.

In a DNS server, it is used as the domain for reverse lookups.

DNS ENUMERATION WITH DNSENUM AND FIERCE

DNS enumeration allows the discovery of DNS servers and their records for an organization.

Name servers often contain valuable information including usernames, hostnames and IP addresses of possible target networks.

DNSEnum is capable of returning a considerable amount of information at once, including information on the Mail Server. The syntax for DNSEnum is:

  • dnsenum caley.uni

DNS zone transfers are used to replicate a name server's data onto other servers. When a user or another server performs a zone transfer request, all the records hosted by the name server are returned in human-readable format.

However, this only works if the DNS server allows transfers to occur - which is not the case for most.
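
A zone transfer can also be requested manually using dig's axfr query type against the lab name server used below; if transfers are disabled, the server simply refuses the request:

  • dig axfr caley.uni @192.168.95.100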

Additionally, DNSEnum attempts to brute force the server with the dns.txt wordlist. This is used to perform an automated search for possible hosts registered with the name server.

In a real situation, a dictionary file could be used to perform a brute force scan against a DNS server. This could help discover subdomains or hostnames that are registered.


Fierce is a Perl script that quickly scans domains for records. First, it queries the DNS server of the local PC. Then, it switches to querying the target's DNS server.

It also attempts to perform a dictionary query by default using a file called hosts.txt.

The syntax is:

  • fierce -dns caley.uni

It finds the DNS server instantly, but the Zone Transfer fails. It then moves onto other tests.

To understand why the Zone Transfer failed, we can log onto the Windows Server 2016 machine that acts as my DNS server. Once logged in, go to Start -> Administrative Tools -> DNS:

Expand the Forward Lookup Zones, right-click on the zone for the domain (caley.uni in this lab) and hit Properties:

Select the Zone Transfers tab - as you can see it is unchecked:

If we check it and execute Fierce again, a lot more information comes back:

To increase the security of your DNS server, zone transfers should only be allowed for name servers in the name server resource records for a zone or for specified DNS servers.

If you allow any DNS server to perform a zone transfer, you are allowing your network information to be passed to any host.

WEB CRAWLING

Wget is an automatic tool that allows you to download an entire web site. The command supports HTTP, HTTPS and FTP.

Wget could be used to spider a web site and create an offline mirror. Other web crawlers are available including HTTrack and Open Web Spider.

Entering the "wget -r" command performs a recursive download:

  • wget -r intranet.caley.uni

Listing the contents afterwards shows that a folder was created containing the full target website.

Inside the publicrel folder is a document file - the web page contained a hidden link to this file.

Crawling websites may leave traces in the server logs so be careful!

Finally, the wget command can be run with an IP address instead of a domain name:

  • wget 192.168.95.100

  • wget 192.168.207.101

DNS INFORMATION GATHERING WITH BASH SCRIPTING

We can create a file called "dns.txt" and enter some of the hostnames:
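
For example, the file might contain one bare hostname per line - the first two appear elsewhere in this lab, while the rest are simply common guesses:

XPClient
intranet
www
mail
dc01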

We can then create the following script:
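
A minimal version might look like the sketch below, assuming the bare hostnames live in dns.txt and using the caley.uni domain and DNS server from this lab:

#!/bin/bash
# Try an nslookup against the lab DNS server for every hostname in dns.txt
for name in $(cat dns.txt); do
    nslookup "${name}.caley.uni" 192.168.95.100 | grep -A1 "Name:"
done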

This script tries DNS lookups for every name available in the dns.txt file. If a match is found on the queried DNS server, an IP address is returned.

A similar one can be used for dig:
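
Again, this is only a sketch built on the same assumptions; dig's +short option prints just the resolved IP:

#!/bin/bash
# Same idea using dig - only hostnames that actually resolve are printed
for name in $(cat dns.txt); do
    ip=$(dig +short "${name}.caley.uni" @192.168.95.100)
    [ -n "$ip" ] && echo "${name}.caley.uni -> ${ip}"
done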


Conclusion

So, there you have it. You now have a solid idea of how to gather initial information via open sources like Google, job sites and social media. Not only that, you also have an idea of how to perform some DNS enumeration using tools like nslookup, dig and Fierce.

You deserve a break after reading all of this. Take a nice sip of whatever beverage (alcoholic or not) you feel like.

Thank you for reading and have a good day/night wherever you are!
