In this post, I cover all of the Information Disclosure labs at PortSwigger Academy and provide some context on what information disclosure is and how we can find it.

Information Disclosure

Information disclosure (information leakage) is when a site unintentionally reveals sensitive information to its users. Depending on the context, websites may leak all kinds of information to a potential attacker including:

Data about other users, such as usernames or financial information
Sensitive commercial or business data
Technical details about the website and its infrastructure

The dangers of leaking sensitive user or business data are fairly obvious, but disclosing technical information can sometimes be just as serious. Although some of this information will be of limited use, it can potentially be a starting point for exposing an additional attack surface, which may contain other interesting vulnerabilities. The knowledge that you are able to gather could even provide the missing piece of the puzzle when trying to construct complex, high-severity attacks.

Occasionally, sensitive information might be carelessly leaked to users who are simply browsing the website in a normal fashion. More commonly, however, an attacker needs to elicit the information disclosure by interacting with the website in unexpected or malicious ways. They will then carefully study the website's responses to try and identify interesting behavior.

Examples of Information Disclosure

Some basic examples include:

Revealing the names of hidden directories, their structure, and their contents via a robots.txt file or directory listing
Providing access to source code files via temp backups
Explicitly mentioning database table or column names in error messages
Unnecessarily exposing highly sensitive information, such as credit card details
Hard-coding API keys, IP addresses, database credentials, and so on in the source code
Hinting at the existence or absence of resources, usernames, and so on via subtle differences in app behaviour

How Do They Arise?

Information disclosure vulnerabilities can arise in countless different ways, but these can be broadly categorized as:

Failure to remove internal content from public content - for example, dev comments in markup are sometimes visible to users in the production environment
Insecure config of the website and related technologies - for example, failing to disable debugging and diagnostic features can sometimes provide attackers with useful tools to help them obtain sensitive information. Default configs can also leave websites vulnerable, for example, by displaying overly verbose error messages.
Flawed design and behaviour of the app - for example, if a website returns distinct responses when different error states occur, this can also allow attackers to enumerate sensitive data, such as valid user credentials.

Impact of Information Disclosure

Information disclosure vulnerabilities can have both a direct and indirect impact depending on the purpose of the website and, therefore, what information an attacker is able to obtain. In some cases, the act of disclosing sensitive information alone can have a high impact on the affected parties. For example, an online shop leaking its customers' credit card details is likely to have severe consequences.

On the other hand, leaking technical information, such as the directory structure or which third-party frameworks are being used, may have little to no direct impact. However, in the wrong hands, this could be the key information required to construct any number of other exploits. The severity in this case depends on what the attacker is able to do with this information.

How to Assess the Severity

Although the ultimate impact can potentially be very severe, it is only in specific circumstances that information disclosure is a high-severity issue on its own. During testing, the disclosure of technical information in particular is often only of interest if you are able to demonstrate how an attacker could do something harmful with it.

For example, the knowledge that a website is using a particular framework version is of limited use if that version is fully patched. However, this information becomes significant when the website is using an old version that contains a known vulnerability. In this case, performing a devastating attack could be as simple as applying a publicly documented exploit.

It is important to exercise common sense when you find that potentially sensitive information is being leaked. It is likely that minor technical details can be discovered in numerous ways on many of the websites you test. Therefore, your main focus should be on the impact and exploitability of the leaked information, not just the presence of information disclosure as a standalone issue. The obvious exception to this is when the leaked information is so sensitive that it warrants attention in its own right.

How to Prevent Information Disclosure

Preventing information disclosure completely is tricky due to the huge variety of ways in which it can occur. However, there are some general best practices that you can follow to minimize the risk of these kinds of vulnerability creeping into your own websites.

Make sure that everyone involved in producing the website is fully aware of what information is considered sensitive. Sometimes seemingly harmless information can be much more useful to an attacker than people realize. Highlighting these dangers can help make sure that sensitive information is handled more securely in general by your organization.
Audit any code for potential information disclosure as part of your QA or build processes. It should be relatively easy to automate some of the associated tasks, such as stripping developer comments.
Use generic error messages as much as possible. Don't provide attackers with clues about application behavior unnecessarily.
Double-check that any debugging or diagnostic features are disabled in the production environment.
Make sure you fully understand the configuration settings, and security implications, of any third-party technology that you implement. Take the time to investigate and disable any features and settings that you don't actually need.
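
As a rough illustration of the "generic error messages" advice above, here is a minimal sketch in Python. The logger name and the `safe_error_response` helper are hypothetical, not part of any framework; the idea is only that full details go to the server-side log while the client gets a generic message plus a reference ID.

```python
import logging
import uuid

logger = logging.getLogger("shop")  # hypothetical application logger


def safe_error_response(exc: Exception) -> dict:
    """Log full details server-side; return only a generic message to the client."""
    incident_id = uuid.uuid4().hex  # lets support correlate a report with the log entry
    logger.error("incident %s: %r", incident_id, exc)
    return {
        "status": 500,
        "body": f"Something went wrong. Reference: {incident_id}",
    }


try:
    int("random")  # stand-in for application code that fails on bad input
except ValueError as exc:
    response = safe_error_response(exc)

print(response["body"])
```

The exception type and message never reach the response body, so an attacker learns nothing about the expected data type or the technology behind it.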

Finding Information Disclosure Vulnerabilities

Generally speaking, it is important not to develop "tunnel vision" during testing. In other words, you should avoid focusing too narrowly on a particular vulnerability. Sensitive data can be leaked in all kinds of places, so it is important not to miss anything that could be useful later.

You will often find sensitive data while testing for something else. A key skill is being able to recognize interesting information whenever and wherever you do come across it.

The following are some examples of high-level techniques and tools that you can use to help identify information disclosure vulnerabilities during testing.

Fuzzing

If you identify interesting parameters, you can try submitting unexpected data types and specially crafted fuzz strings to see what effect this has. Pay close attention; although responses sometimes explicitly disclose interesting information, they can also hint at the application's behavior more subtly.

For example, this could be a slight difference in the time taken to process the request. Even if the content of an error message doesn't disclose anything, sometimes the fact that one error case was encountered instead of another one is useful information in itself.

You can automate much of this process using tools such as Burp Intruder. This provides several benefits. Most notably, you can:

Add payload positions to parameters and use pre-built wordlists of fuzz strings to test a high volume of different inputs in quick succession.
Easily identify differences in responses by comparing HTTP status codes, response times, lengths, and so on.
Use grep matching rules to quickly identify occurrences of keywords, such as error, invalid, SELECT, SQL, and so on.
Apply grep extraction rules to extract and compare the content of interesting items within responses.
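
To make the comparison and grep-matching ideas concrete outside of Burp, here is a small Python sketch. It is not how Intruder works internally; the `triage` and `grep_match` helpers and the sample responses are made up for illustration. It groups responses by status code and length so outliers stand out, then checks a body for the keywords mentioned above.

```python
from collections import defaultdict

KEYWORDS = ("error", "invalid", "SELECT", "SQL")  # keywords from the list above


def fingerprint(status: int, body: str) -> tuple:
    """Reduce a response to the attributes we want to compare."""
    return (status, len(body))


def triage(responses: dict) -> dict:
    """Group payloads by fingerprint so outliers are easy to spot."""
    groups = defaultdict(list)
    for payload, (status, body) in responses.items():
        groups[fingerprint(status, body)].append(payload)
    return dict(groups)


def grep_match(body: str, keywords=KEYWORDS) -> list:
    """Return the keywords found in the body (case-insensitive)."""
    lowered = body.lower()
    return [k for k in keywords if k.lower() in lowered]


# Hypothetical responses, keyed by the payload that produced them.
responses = {
    "1": (200, "<h1>Product</h1>"),
    "2": (200, "<h1>Product</h1>"),
    "'": (500, "SQL syntax error near SELECT"),
}
groups = triage(responses)
for fp, payloads in groups.items():
    print(fp, payloads)
```

Any fingerprint that only one or two payloads produce is worth a closer look.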

You can also use the Logger++ extension, available from the BApp store. In addition to logging requests and responses from all of Burp's tools, it allows you to define advanced filters for highlighting interesting entries.

Engineering Informative Responses

Verbose error messages can sometimes disclose interesting information while you go about your normal testing workflow. However, by studying the way error messages change according to your input, you can take this one step further. In some cases, you will be able to manipulate the website in order to extract arbitrary data via an error message.

There are numerous methods for doing this depending on the particular scenario you encounter. One common example is to make the application logic attempt an invalid action on a specific item of data.

For example, submitting an invalid parameter value might lead to a stack trace or debug response that contains interesting details. You can sometimes cause error messages to disclose the value of your desired data in the response.

Common Sources of Information Disclosure

Information disclosure can occur in a wide variety of contexts within a website. The following are some common examples of places where you can look to see if sensitive information is exposed.

Files for web crawlers
Directory listings
Developer comments
Error messages
Debugging data
User account pages
Backup files
Insecure configuration
Version control history

Files for Web Crawlers

Many websites provide files at /robots.txt and /sitemap.xml to help crawlers navigate their site. Among other things, these files often list specific directories that the crawlers should skip, for example, because they may contain sensitive information.

As these files are not usually linked from within the website, they may not immediately appear in Burp's site map. However, it is worth trying to navigate to /robots.txt or /sitemap.xml manually to see if you find anything of use.
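
Once you have the contents of robots.txt, pulling out the interesting paths is easy to script. The following is a minimal sketch that only handles `Disallow` lines and ignores the rest of the format; the sample file is invented for illustration.

```python
def disallowed_paths(robots_txt: str) -> list:
    """Return the paths named in Disallow rules, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical robots.txt content.
sample = """User-agent: *
Disallow: /backup
Disallow:            # an empty rule disallows nothing
Allow: /
"""
print(disallowed_paths(sample))
```

Each returned path is a candidate to request manually and see what comes back.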

Directory Listings

Web servers can be configured to automatically list the contents of directories that do not have an index page present. This can aid an attacker by enabling them to quickly identify the resources at a given path, and proceed directly to analyzing and attacking those resources.

It particularly increases the exposure of sensitive files within the directory that are not intended to be accessible to users, such as temporary files and crash dumps.

Directory listings themselves are not necessarily a security vulnerability. However, if the website also fails to implement proper access control, leaking the existence and location of sensitive resources in this way is clearly an issue.

Developer Comments

During development, in-line HTML comments are sometimes added to the markup. These comments are typically stripped before changes are deployed to the production environment. However, comments can sometimes be forgotten, missed, or even left in deliberately because someone wasn't fully aware of the security implications. Although these comments are not visible on the rendered page, they can easily be accessed using Burp, or even the browser's built-in developer tools.

Occasionally, these comments contain information that is useful to an attacker. For example, they might hint at the existence of hidden directories or provide clues about the application logic.
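
If you would rather not eyeball the markup, comments can be extracted with Python's standard `html.parser` module. This is a minimal sketch; the `CommentCollector` class and the sample page are my own illustration.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment it is fed."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        self.comments.append(data.strip())


def html_comments(markup: str) -> list:
    collector = CommentCollector()
    collector.feed(markup)
    collector.close()
    return collector.comments


# Hypothetical page source.
page = "<html><body><!-- TODO: remove /old-admin before launch --><p>Shop</p></body></html>"
print(html_comments(page))
```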

Error Messages

One of the most common causes of information disclosure is verbose error messages. As a general rule, you should pay close attention to all error messages you encounter during auditing.

The content of error messages can reveal information about what input or data type is expected from a given parameter. This can help you to narrow down your attack by identifying exploitable parameters. It may even just prevent you from wasting time trying to inject payloads that simply won't work.

Verbose error messages can also provide information about different technologies being used by the website. For example, they might explicitly name a template engine, database type, or server that the website is using, along with its version number. This information can be useful because you can easily search for any documented exploits that may exist for this version.
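
Spotting name-and-version pairs in a long error response can be scripted with a regular expression. The sketch below assumes a short, hypothetical list of product names and a made-up error message; real banners vary a lot, so treat the pattern as a starting point only.

```python
import re

# Hypothetical shortlist; extend it with whatever technologies you expect.
PRODUCTS = ("Apache", "nginx", "PHP", "PostgreSQL", "Tomcat")

VERSION_RE = re.compile(
    r"(?P<name>" + "|".join(map(re.escape, PRODUCTS)) + r")[/ ]v?(?P<version>\d+(?:\.\d+)+)",
    re.IGNORECASE,
)


def find_versions(text: str) -> list:
    """Return (product, version) pairs found in the text."""
    return [(m.group("name"), m.group("version")) for m in VERSION_RE.finditer(text)]


# Made-up error output for illustration.
sample_error = "Internal Server Error\nServer: nginx/1.18.0\nPowered by PHP 7.4.3"
print(find_versions(sample_error))
```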

Similarly, you can check whether there are any common configuration errors or dangerous default settings that you may be able to exploit. Some of these may be highlighted in the official documentation.

You might also discover that the website is using some kind of open-source framework. In this case, you can study the publicly available source code, which is an invaluable resource for constructing your own exploits.

Differences between error messages can also reveal different application behavior that is occurring behind the scenes. Observing differences in error messages is a crucial aspect of many techniques, such as SQL injection, username enumeration, and so on.

Debugging Data

For debugging purposes, many websites generate custom error messages and logs that contain large amounts of information about the application's behavior. While this information is useful during development, it is also extremely useful to an attacker if it is leaked in the production environment.

Debug messages can sometimes contain vital information for developing an attack, including:

Values for key session variables that can be manipulated via user input
Hostnames and credentials for back-end components
File and directory names on the server
Keys used to encrypt data transmitted via the client

Debugging information may sometimes be logged in a separate file. If an attacker is able to gain access to this file, it can serve as a useful reference for understanding the application's runtime state. It can also provide several clues as to how they can supply crafted input to manipulate the application state and control the information received.

User Account Pages

By their very nature, a user's profile or account page usually contains sensitive information, such as the user's email address, phone number, API key, and so on. As users normally only have access to their own account page, this does not represent a vulnerability in itself. However, some websites contain logic flaws that potentially allow an attacker to leverage these pages in order to view other users' data.

For example, consider a website that determines which user's account page to load based on a user parameter.

```
GET /user/personal-info?user=carlos
```

Most websites will take steps to prevent an attacker from simply changing this parameter to access arbitrary users' account pages. However, sometimes the logic for loading individual items of data is not as robust.

An attacker may not be able to load another user's account page entirely, but the logic for fetching and rendering the user's registered email address, for example, might not check that the user parameter matches the user that is currently logged in.

In this case, simply changing the user parameter would allow an attacker to display arbitrary users' email addresses on their own account page.
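
The flaw described above boils down to a missing comparison. Here is a minimal sketch with a hypothetical in-memory user store and made-up function names, showing the flawed lookup next to one that checks the parameter against the session.

```python
USERS = {  # hypothetical data store
    "wiener": {"email": "wiener@example.com"},
    "carlos": {"email": "carlos@example.com"},
}


def email_flawed(session_user: str, requested_user: str) -> str:
    """Trusts the user parameter: any logged-in user can read any email."""
    return USERS[requested_user]["email"]


def email_checked(session_user: str, requested_user: str) -> str:
    """Refuses to serve data unless the parameter matches the session."""
    if requested_user != session_user:
        raise PermissionError("user parameter does not match the logged-in user")
    return USERS[requested_user]["email"]


print(email_flawed("wiener", "carlos"))  # leaks another user's email
```

A real application would more likely ignore the parameter altogether and derive the user from the session, but the comparison makes the missing check explicit.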

Source Code Disclosure via Backups

Obtaining source code access makes it much easier for an attacker to understand the application's behavior and construct high-severity attacks. Sensitive data is sometimes even hard-coded within the source code. Typical examples of this include API keys and credentials for accessing back-end components.

If you can identify that a particular open-source technology is being used, this provides easy access to a limited amount of source code.

Occasionally, it is even possible to cause the website to expose its own source code. When mapping out a website, you might find that some source code files are referenced explicitly. Unfortunately, requesting them does not usually reveal the code itself. When a server handles files with a particular extension, such as .php, it will typically execute the code, rather than simply sending it to the client as text.

However, in some situations, you can trick a website into returning the contents of the file instead. For example, text editors often generate temporary backup files while the original file is being edited. These temporary files are usually indicated in some way, such as by appending a tilde (~) to the filename or adding a different file extension. Requesting a code file using a backup file extension can sometimes allow you to read the contents of the file in the response.
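
To try this systematically, you can generate a handful of candidate backup names for each source file you come across. The suffixes below are common conventions that I am assuming for illustration (a trailing tilde, `.bak`, `.old`, and a Vim-style swap file); the list is far from exhaustive.

```python
import posixpath


def backup_candidates(path: str) -> list:
    """Return likely backup filenames for a given URL path."""
    directory, name = posixpath.split(path)
    stem, _ext = posixpath.splitext(name)
    names = [
        name + "~",            # tilde appended by some editors
        name + ".bak",
        name + ".old",
        stem + ".bak",         # extension replaced rather than appended
        "." + name + ".swp",   # Vim-style swap file
    ]
    return [posixpath.join(directory, n) for n in names]


for candidate in backup_candidates("/app/index.php"):
    print(candidate)
```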

Once an attacker has access to the source code, this can be a huge step towards being able to identify and exploit additional vulnerabilities that would otherwise be almost impossible. One such example is insecure deserialization.

Information Disclosure Due to Insecure Config

Websites are sometimes vulnerable as a result of improper configuration. This is especially common due to the widespread use of third-party technologies, whose vast array of configuration options are not necessarily well-understood by those implementing them.

In other cases, developers might forget to disable various debugging options in the production environment. For example, the HTTP TRACE method is designed for diagnostic purposes. If enabled, the web server will respond to requests that use the TRACE method by echoing in the response the exact request that was received. This behavior is often harmless, but occasionally leads to information disclosure, such as the name of internal authentication headers that may be appended to requests by reverse proxies.

Version Control History

Virtually all websites are developed using some form of version control system, such as Git. By default, a Git project stores all of its version control data in a folder called .git. Occasionally, websites expose this directory in the production environment. In this case, you might be able to access it by simply browsing to /.git.

While it is often impractical to manually browse the raw file structure and contents, there are various methods for downloading the entire .git directory. You can then open it using your local installation of Git to gain access to the website's version control history. This may include logs containing committed changes and other interesting information.

This might not give you access to the full source code, but comparing the diff will allow you to read small snippets of code. As with any source code, you might also find sensitive data hard-coded within some of the changed lines.
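
Once you have diff output from your local Git client, a quick filter can flag changed lines that mention secret-looking names. This is a rough sketch: the keyword list is my own assumption, the sample diff is invented, and a changed line that itself begins with `---` or `+++` would be skipped as a file header.

```python
import re

SECRET_HINT = re.compile(r"(password|passwd|secret|api[_-]?key|token)", re.IGNORECASE)


def suspicious_diff_lines(diff_text: str) -> list:
    """Return added or removed lines that mention a secret-like name."""
    hits = []
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not content
        if line.startswith(("+", "-")) and SECRET_HINT.search(line):
            hits.append(line)
    return hits


# Invented diff for illustration.
sample_diff = """--- a/admin.conf
+++ b/admin.conf
@@ -1,2 +1,2 @@
-ADMIN_PASSWORD=example-value
+ADMIN_PASSWORD=env('ADMIN_PASSWORD')
 DEBUG=false
"""
print(suspicious_diff_lines(sample_diff))
```

Removed lines are often the more interesting ones, since they show what a value was before someone tried to clean it up.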

Lab 1 - Disclosure in Error Messages

The description of this lab states that the verbose error messages reveal a vulnerable version of a third-party framework. From this, we can immediately gather that we should try to break the application in a way that causes an error message to appear.

As with any challenge, we should gather some information about the website itself and how it operates under normal circumstances. Navigating to the lab, we see what appears to be a simple store:

It’s always a good idea to click around, follow the various links, look for anywhere that accepts input, and map out what you can see in the application. If you have Burp Suite open while clicking around, it will do this for you and, after a while, you will see a site map/directory structure forming:

Nothing seems to stand out instantly. A good idea is to look for anywhere a user could possibly interfere and input something that might break the web application. In Burp Suite, inside the Target > Site map tab, you can see all the requests made on the right (check image above). This section has a field titled “Params” that shows a check mark if a request has parameters.

Parameters are a fantastic testing point to try and break web sites or submit different input and data that the web app is not expecting. In this instance, we can see that every product has its own request with the productId parameter:

From here, we can send any of these requests with parameters to Repeater and start playing around. As a normal user, you will probably be thinking “ok, each product has its own ID, pretty standard, nothing wrong with that”.

However, as a hacker or security expert, when we see parameters and the values normally sent, we should work out what the web app is expecting and then ask whether we can break that expectation. In this case, every request using the productId parameter is sending a number ranging from 1-20, which is likely the number of products being sold.

What happens if we replace the number with a very large number - e.g. 16746596?

We get a “Not Found” error message. This is not necessarily breaking the application, but from this, we know that giving it a number that does not correspond to a number in a likely back-end database results in a Not Found message.

If we assume that the database in the back end has a productId table containing the values 1-20, and the web app always sends one of these numbers when we click on a product, we can start thinking about how to break it.

For example, if it is expecting a number/integer, what happens if we send a special character or even just a string/word instead? Let’s try it. For this example, I will replace the productId value with the value “random”:

We get a much better looking error - 500 Internal Server Error. Even better, scrolling down to the bottom of the response, you will see a very interesting string, one that may even reveal what the back end is running and even the version number, but I’ll leave that to you to find out - practice is key in this industry.

The takeaway from this lab should be that if you observe the normal functionality of a web app and see some parameters, try playing around with them. Does the website back-end expect a string/word? Try injecting special characters or numbers. Does the website expect a number? Try injecting a string or special characters.
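
The same idea can be scripted: take a URL with a parameter and produce one variant per "unexpected" value. The sketch below only builds the URLs and sends nothing; the host name is a placeholder and the probe list is my own small selection.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Values of the "wrong" type or range for a parameter that expects a small integer.
PROBES = ["16746596", "-1", "random", "'", "1.5", ""]


def probe_urls(url: str, param: str, probes=PROBES) -> list:
    """Return one URL per probe, with `param` replaced by the probe value."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    urls = []
    for probe in probes:
        mutated = [(k, probe if k == param else v) for k, v in query]
        urls.append(urlunsplit(parts._replace(query=urlencode(mutated))))
    return urls


# "lab.example" is a placeholder host, not the real lab URL.
urls = probe_urls("https://lab.example/product?productId=1", "productId")
for u in urls:
    print(u)
```

Each URL can then be requested in Repeater (or with a tool of your choice) while you watch for changes in status code and response body.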

Sometimes, developers do not expect end users to play around like this and leave verbose debug messages switched on, revealing vulnerable software versions running on the server - bad times.

Lab 2 - Disclosure in Debug Page

The description of this lab states that a debug page is present that discloses sensitive information and our end goal is to submit the SECRET_KEY environment variable.

As before, we should gather some information about the website itself and how it operates under normal circumstances. Once again, we see a simple shop:

After navigating through the website, we can populate the site map as usual in Burp Suite:

Immediately, I see something that stands out - however, that comes with experience and with knowing the key file and directory names that scream “I’M VALUABLE”.

If I were brand new to this, I wouldn’t know what to look for. In that case, there are a couple of things you could do. Since we’ve already done a lab built on the same web page, a good idea is to compare the two site maps and see if anything different stands out in this one.

Doing this comparison, we can see both sites have the root (/), academyLabHeader, product folder and everything inside it, the resources folder and the submitSolution page. One thing that is different about this lab is a folder titled “cgi-bin”:

Another way to do this, if we are looking for information disclosure, is to check the source code of the page and Ctrl+F (or search) for the string “<!--”, which starts a comment in HTML:

The same thing appears - this is how Burp Suite found it, but we could not see it on the main page.

For a little bit of context, the cgi-bin folder is “a folder used to house scripts that will interact with a Web browser to provide functionality for a Web page or website”.

In the most basic of terms, CGI or Common Gateway Interface is the process for scripts to communicate with your hosting server. The folder for CGI scripts is what we call the cgi-bin. It is created in the directory root of your website and where your scripts are permitted to run or execute. The cgi-bin folder will store the scripts such as Perl (.pl) used by your website.

However, in security terms, it can also host important debug pages that the developer forgot to remove, or additional scripts that have hardcoded credentials inside. In general, if you see a cgi-bin folder, investigate it and its contents, as they may contain juicy information.

In this example, we see a file called phpinfo.php.

What is phpinfo.php? Well, phpinfo() is actually a PHP function that outputs information about PHP's configuration. However, this output can be turned into a page viewable in a browser, displaying all that juicy PHP information that would be too much to talk about here.

If we simply browse to this cgi-bin/phpinfo.php page, we can see this information:

This type of information about the configuration is a gold mine for a hacker or security expert, as it contains so much information about the back end we could potentially use to our advantage. In the case of this lab, we are looking for an environment variable called SECRET_KEY.

Scrolling down long enough, we will see a section called “ENVIRONMENT” that contains all the environment variables and we will see the SECRET_KEY value we can submit to complete the lab.
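
Instead of scrolling, you could save the page and pull the value out with a short script. The sketch below assumes the two-column table rows that phpinfo() pages usually render (cells with classes `e` and `v`); the exact markup may differ between PHP versions, and the sample HTML and key value are made up.

```python
import html
import re

# Assumed row layout: <tr><td class="e">NAME </td><td class="v">VALUE </td></tr>
ROW_RE = re.compile(
    r'<tr><td class="e">(?P<name>.*?)</td><td class="v">(?P<value>.*?)</td></tr>',
    re.DOTALL,
)


def phpinfo_values(page: str) -> dict:
    """Map each two-column phpinfo row name to its value."""
    return {
        html.unescape(m.group("name")).strip(): html.unescape(m.group("value")).strip()
        for m in ROW_RE.finditer(page)
    }


# Made-up excerpt; the real page is much longer.
sample = (
    "<h2>Environment</h2><table>"
    '<tr><td class="e">PATH </td><td class="v">/usr/local/bin:/usr/bin </td></tr>'
    '<tr><td class="e">SECRET_KEY </td><td class="v">not-the-real-key </td></tr>'
    "</table>"
)
print(phpinfo_values(sample).get("SECRET_KEY"))
```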

Lab 3 - Disclosure via Backup Files

The description of this lab states that this lab leaks its source code via backup files in a hidden directory.

As before, we should gather some information about the website itself and how it operates. Once again, we see a simple shop:

Right off the bat, nothing looks out of the ordinary. There’s no login page or any sort of account sign in/sign up that we could test. The best thing to do starting out is simply click around, check the source code and generate a site map while Burp Suite runs in the background:

Again, nothing really stands out. There’s a couple of hidden directories such as the css and images folders, but they are not really useful and are likely just there to make the website look prettier via CSS and images for the products.

Checking the source code also reveals no HTML comments, indicating no hidden directory:

At this point, you might assume this is a dead end - there are no injection points or account sign-ups, no hidden directories, nothing. However, it is always a good idea to check every possible source that can contain hidden directories.

In this case, we are forgetting about the robots.txt file. The robots.txt contains instructions for bots/search engine crawlers that tell them which webpages they can and cannot access, which could include hidden directories.

Trying to navigate to /robots.txt reveals the contents of the file:

Here, there is a disallow entry for the /backup directory. A disallow entry tells bots not to access the webpage or set of webpages that come after the command. Disallowed pages aren't necessarily "hidden" – they just aren't useful for the average search engine end user, so they aren't shown to them.

Most of the time, a user on the website can still navigate to these pages if they know where to find them.

If we navigate to this directory, we can see the contents:

This directory includes a .bak file. A bak file is a filename extension commonly used to signify a backup copy of a file. It’s a good idea to download this file, remove the .bak extension and take a look at what appears to be a Java file, likely being the source code for something.

To download this, you can use any tool of your choosing, but I prefer using wget:

Once downloaded, I simply rename it to ProductTemplate.java and open it in a text editor to examine its contents:

If you would like, you can go through this line by line on your own time, but for this walkthrough, I will simply highlight the important parts.

On line 29, we can see something called ConnectionBuilder. Doing some quick research on it, it is used to establish a connection to a database:

I’ll leave you to do more research about it if you want. We can also gather that the database system in use is PostgreSQL, judging by the strings passed to it. Going down the list, we can likely guess what each argument is:

org.postgresql.Driver is likely the JDBC driver class used to make the connection
postgresql is likely the database type
localhost is the host where the database is running
5432 is the default port for PostgreSQL
postgres is maybe the database name
postgres is maybe the user
The long string at the end “zml64anij87wnj05agfeyyq1bs23lqq3” is likely a password or a hash of some kind.

Note that these are just assumptions and may not be correct, but from a high-level overview, we can guess what some of them might be.

With this long string that we guess is the password, we can try and submit it as the answer for the database password:

And we complete the lab.

Remember, if nothing seems to stick out immediately, try checking the robots.txt file for hidden directories or hidden files that were not indexed.

Lab 4 - Authentication Bypass via Information Disclosure

In the description of this lab, we are told that the lab's administration interface has an authentication bypass vulnerability, but that it is impractical to exploit without knowledge of a custom HTTP header (important) used by the front-end. Our end goal is to obtain the header name, use it to bypass authentication, and delete Carlos’s account.

As always, we navigate to the site, explore around and get the lay of the land before attempting any exploitation:

Once again, nothing stands out. However, this time, we do have a login page that we can play around with, as well as the my-account page once a successful login is made. Logging in with the credentials provided to us (wiener:peter), we see a simple my account page:

Again, nothing too exciting here, but we are building an idea of how the web app functions normally - enumeration is key.

In the lab description, we are told about the administration interface. However, we don’t know the URL of this page. One way to find it is to perform directory busting with a tool of your choice. There are many tools available including Gobuster, Feroxbuster, Dirb, Dirbuster and a lot more.

However, before trying directory busting, it’s important to realize that developers are lazy and typically want the admin panel to be easy to remember so trying simple directory/page names first is a good first step. One of the most common names for this page is simply /admin. What happens if we try that?

We get an error message stating that this admin interface is only available to local users - interesting. Just to show you that directory busting works, I also ran a Feroxbuster scan against the page using the dirsearch.txt wordlist and it successfully found the same page:

With this new information, it seems we need to somehow trick the web application into thinking we are a local user. There are many methods to do this and many HTTP headers that also accomplish this. One of the most common is the X-Forwarded-For HTTP header.

This header is a standard header for identifying the originating IP address of a client connecting to a web server through a proxy server. Sounds like it would help us, right? We can try intercepting a request to the /admin panel and adding this header manually before sending it on:

It’s important to note the IP address used here. The error message stated that only local users are allowed access. Since we don’t necessarily know what the IP address is of the internal network or what subnet it is whitelisting, it’s a good idea to use the most local IP address possible - 127.0.0.1 (localhost) - which simply refers to itself.

Unfortunately, we get the same error message - 401 Unauthorized - indicating that it failed. As stated above, there are many other HTTP headers that could potentially be in use by the web server that can be found here. However, going through all of these manually is tiresome and not effective. Additionally, a web server can also have its own custom HTTP headers.

Fortunately, other HTTP methods can help us here. The TRACE method is particularly useful for enumerating these HTTP headers. TRACE is used for diagnostic purposes and, if enabled, causes the server to respond by echoing the exact request it received.

What happens if we remove the X-Forwarded-For header, and change the GET /admin request to TRACE /admin instead?

On the right-hand side, we see a 200 OK response, indicating that the server accepted the TRACE request. Additionally, in the body of the message, we see what the server actually received. Going through it, you can see every header that was sent.

Looking at the very bottom, there is one header that was not sent in the request but appeared anyway - X-Custom-IP-Authorization, which is set to your public IP address (hence the blurring). Doing some Googling, it appears this header is custom made and simply grabs your public IP address, before forwarding it on to the server to check if you are local or not.
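
Spotting the extra header by eye works for a short request, but the comparison is easy to script. The sketch below diffs the header names in the request we sent against those echoed back by TRACE; the host, cookie value, and IP address (from a documentation range) are placeholders.

```python
def header_names(raw_request: str) -> set:
    """Lower-cased header names from a raw HTTP request (request line skipped)."""
    names = set()
    for line in raw_request.splitlines()[1:]:
        if not line.strip():
            break  # blank line ends the header section
        name, sep, _ = line.partition(":")
        if sep:
            names.add(name.strip().lower())
    return names


def added_in_transit(sent: str, echoed: str) -> set:
    """Headers present in the echoed request that we did not send ourselves."""
    return header_names(echoed) - header_names(sent)


# Placeholder values; the real request carries more headers.
sent = "TRACE /admin HTTP/1.1\r\nHost: lab.example\r\nCookie: session=abc\r\n\r\n"
echoed = (
    "TRACE /admin HTTP/1.1\r\n"
    "Host: lab.example\r\n"
    "Cookie: session=abc\r\n"
    "X-Custom-IP-Authorization: 203.0.113.7\r\n"
    "\r\n"
)
print(added_in_transit(sent, echoed))
```

Anything in the result was added between the client and the application, typically by a front-end server or reverse proxy.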

With all this information, what happens if we send another GET /admin request, but before sending it, we attach this custom header ourselves and set it to a local machine (i.e. 127.0.0.1)?
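
For reference, this is roughly what the modified request looks like on the wire. The `build_request` helper is a hypothetical convenience that only assembles the text (it sends nothing), and the host and cookie values are placeholders.

```python
def build_request(method: str, path: str, host: str, headers: dict = None) -> str:
    """Assemble a raw HTTP/1.1 request with optional extra headers."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


# Placeholder host and session cookie.
request = build_request(
    "GET",
    "/admin",
    "lab.example",
    {"Cookie": "session=abc", "X-Custom-IP-Authorization": "127.0.0.1"},
)
print(request)
```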

Recall that the earlier request to this page resulted in a 401 Unauthorized error. This time, we get a 200 OK response, indicating that something changed. It’s likely that this header injection/authorization bypass worked.

To check this, we can simply intercept a request, add this custom header, forward it on and see if we gain access to the admin panel:

And we do! The authorization bypass worked. From here, we could delete the Carlos user and finish this lab.

When testing a web application, it’s important to look at the headers, as there may be custom ones that we can exploit to perform things such as an authorization bypass. A good website for this can be found here; it lists more possible 401/403 bypasses along with the various headers involved.

PortSwigger: All Information Disclosure Labs