Alpine Linux: From vulnerability discovery to code execution (Pt 1 of 2)


I recently uncovered two critical vulnerabilities in Alpine Linux’s package manager, assigned CVE-2017-9669 and CVE-2017-9671. These vulnerabilities could potentially lead to an attacker executing malicious code on your machines, if you are using Alpine Linux knowingly or implicitly.

Alpine Linux is a lightweight Linux distribution that has become increasingly popular in the last several years. This was mainly possible thanks to its use within containers, notably in Docker. The majority of the official Docker repositories have an alpine build variant, and the alpine repository itself currently has more than 10 million pulls.

Alpine is advertised as a security-oriented distribution and its developers are putting a great deal of effort into living up to that claim, for instance, by including kernel side defense mechanisms and by compiling user space packages with all modern binary protections. For these reasons we use Alpine here at Twistlock, and as a security researcher I decided to dig into alpine’s internals and hunt for significant flaws that could compromise all alpine users.

Targeting apk and finding bugs

Several tools make up the Alpine distribution. The one I most wanted to examine was apk, Alpine’s package manager. If I could somehow alter packages before they were installed, or convince the package manager to downgrade packages, I could execute code on a target system.

Following some preliminary research, I eventually decided to attempt fuzzing specific parts of apk. I patched apk with some modifications and compiled it for afl (to be precise, I wrote my own applet that uses a file as an input). In the past I had success in finding zero-day vulnerabilities by fuzzing with afl, so I had a good feeling I could find bugs in apk by fuzzing specific functions that I assumed potentially vulnerable.

The applet I wrote focused on the tar parsing code. Apk accepts update files as gzipped tarballs (tar.gz), so I isolated the code that takes a tar stream and fed it a file. It was the perfect fuzzing point – there is a lot of code that deals with parsing user input, and any crashes I would find should happen prior to any signing process. For comparison, a crash in the parsing of a specific package would be less significant because apk checks for file signatures.
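To make the idea of a file-driven fuzzing applet concrete, here is a minimal sketch of such a harness. It is not the applet I wrote: fuzz_one_file, parse_fn and count_tar_blocks are hypothetical names, and the stand-in parser only counts 512-byte blocks where a real harness would call into the tar parsing code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical parser signature; a real harness would call the code under test. */
typedef int (*parse_fn)(const unsigned char *data, size_t len);

/* Stand-in parser for illustration: counts whole 512-byte blocks. */
int count_tar_blocks(const unsigned char *data, size_t len)
{
    (void)data;
    return (int)(len / 512);
}

/* Read the whole file at `path` into memory and hand it to `parse`.
 * Returns the parser's result, or -1 on I/O or allocation failure. */
int fuzz_one_file(const char *path, parse_fn parse)
{
    FILE *f = fopen(path, "rb");
    unsigned char *buf = NULL;
    size_t len = 0, cap = 0;
    int rc = -1;

    if (!f) return -1;
    for (;;) {
        if (len == cap) {
            /* Grow the buffer geometrically until the file fits. */
            size_t newcap = cap ? cap * 2 : 4096;
            unsigned char *tmp = realloc(buf, newcap);
            if (!tmp) goto out;
            buf = tmp;
            cap = newcap;
        }
        size_t n = fread(buf + len, 1, cap - len, f);
        len += n;
        if (n == 0) break;  /* end of file or read error */
    }
    if (!ferror(f)) rc = parse(buf, len);
out:
    free(buf);
    fclose(f);
    return rc;
}
```

A program built around a function like this would take the input path from its command line, so that afl can substitute each generated test case for the @@ placeholder, as in afl-fuzz -i in -o out ./harness @@ (the directory and binary names here are placeholders).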

My call was correct. After less than a day, afl had found numerous crashes in my apk applet. I began triaging the crashes and then debugging to locate the bugs. After some painful trial and error, I found two pieces of code in the tar parsing function that could lead to a heap overflow.

Explaining the bugs

The overflow may occur within the following lines (from archive.c):

case 'L': /* GNU long name extension */
    if (blob_realloc(&longname, entry.size+1))
        goto err_nomem;
    entry.name = longname.ptr;
    is->read(is, entry.name, entry.size);
    entry.name[entry.size] = 0;
    offset += entry.size;
    entry.size = 0;

First, let’s understand the context. The code snippet is from the function apk_parse_tar in archive.c. This function takes a tar stream of apk_istream, parses it and runs a callback function on each parsed chunk.

In general, a tar stream consists of blocks of 512 bytes, starting with a tar header block followed by blocks of the file data. One of the header fields is a typeflag, which indicates the type of the following file. It is also used to indicate special blocks, such as a longname (or “GNU long name extension”) flag, meaning the next bytes contain the name of the following file. This is used when the file name is longer than 100 bytes.

So when the parser encounters a longname block, it should allocate a buffer of given size and copy the name from the stream to it. This buffer is longname, and blob_realloc is the function used to expand the buffer size if required.
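For readers unfamiliar with the format, the header layout described above can be sketched as follows. The struct follows the POSIX ustar field layout. The helper names (tar_octal, tar_is_longname, tar_data_blocks) are my own for illustration and are not taken from apk.

```c
#include <stddef.h>

#define TAR_BLOCK_SIZE 512

/* POSIX ustar header layout; numeric fields are stored as ASCII octal. */
struct tar_header {
    char name[100];
    char mode[8];
    char uid[8];
    char gid[8];
    char size[12];
    char mtime[12];
    char chksum[8];
    char typeflag;
    char linkname[100];
    char magic[6];
    char version[2];
    char uname[32];
    char gname[32];
    char devmajor[8];
    char devminor[8];
    char prefix[155];
    char padding[12];   /* pads the header to a full 512-byte block */
};

/* Parse an ASCII-octal field, stopping at the first non-octal byte. */
unsigned long long tar_octal(const char *field, size_t len)
{
    unsigned long long value = 0;
    size_t i = 0;

    while (i < len && field[i] == ' ') i++;   /* skip leading spaces */
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        value = value * 8 + (unsigned long long)(field[i] - '0');
    return value;
}

/* GNU tar marks a long-name record with typeflag 'L'. */
int tar_is_longname(const struct tar_header *h)
{
    return h->typeflag == 'L';
}

/* Number of 512-byte blocks needed to hold `size` bytes of data. */
unsigned long long tar_data_blocks(unsigned long long size)
{
    return size / TAR_BLOCK_SIZE + (size % TAR_BLOCK_SIZE != 0);
}
```

Note that the size field comes straight from the archive, so a parser has to treat it as untrusted input.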

Let’s have a look at blob_realloc(1):

static int blob_realloc(apk_blob_t *b, int newsize)
{
    char *tmp;

    if (b->len >= newsize) return 0;

    tmp = realloc(b->ptr, newsize);
    if (!tmp) return -ENOMEM;

    b->ptr = tmp;
    b->len = newsize;
    return 0;
}

The troubling thing about this function is that it accepts a size of type int, which is naturally signed. b->len is of type long, which means it’s also signed. Consequently, the comparison of these variables is signed.

You may be wondering why this is of any significance. Try simulating what happens with any size of 0x80000000 or more. 0x80000000 is 2147483648 as an unsigned int, and -2147483648 as a signed int(2). In other words, when the size is bigger than the maximum signed int value, it is treated as negative. The check b->len >= newsize then succeeds, so blob_realloc returns 0 and leaves the buffer unmodified.

In the following call to is->read, a huge amount of bytes will be copied to the buffer, overflowing its allocated size and overwriting any subsequent data on the heap. That is as long as the read function treats the size as unsigned, and in the case of a tar.gz that function is gzi_read (from gunzip.c), which expects a size_t (unsigned).
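The chain of events can be modeled without touching any real memory. The sketch below is a simplified model, not apk code: blob_model and the three functions are hypothetical names. It also assumes that converting an out-of-range value to int wraps around, which is implementation-defined in C but is what common compilers do on two's-complement platforms.

```c
/* Hypothetical stand-in for the blob; only its length matters here. */
struct blob_model {
    long len;
};

/* Models passing a size through an `int newsize` parameter.
 * Assumption: out-of-range values wrap (implementation-defined in C). */
int as_int_param(unsigned long long size)
{
    return (int)size;
}

/* Mirrors the `b->len >= newsize` early return: 1 means no realloc happens. */
int realloc_skipped(const struct blob_model *b, int newsize)
{
    return b->len >= newsize;
}

/* 1 if reading `size` bytes after the check would run past the buffer. */
int read_would_overflow(const struct blob_model *b, unsigned long long size)
{
    int newsize = as_int_param(size + 1);   /* corresponds to entry.size+1 */

    return realloc_skipped(b, newsize) && size > (unsigned long long)b->len;
}
```

With a 16-byte buffer, a requested size of 100 triggers a reallocation and a size of 8 already fits. A size of 0x80000000 does neither: the reallocation is skipped, yet the read length stays a very large unsigned value.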

It is also worth mentioning that changing the blob_realloc definition to accept a size_t instead of an int would not be enough to solve this issue, because an integer overflow may occur when adding a 1 to a maximum entry.size, which would allow the buffer overflow to happen just like before.
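To illustrate the point about the +1, here is a minimal sketch of a size check that handles both the unsigned type and the wraparound. This is not the patch that went into apk. The name_buf type and name_buf_reserve function are hypothetical, and a real parser would probably also cap names at some sane maximum length.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical buffer type for this sketch; not apk's apk_blob_t. */
struct name_buf {
    size_t len;
    char *ptr;
};

/* Ensure the buffer can hold name_len bytes plus a terminating NUL.
 * Returns 0 on success, -EOVERFLOW if name_len + 1 would wrap around,
 * or -ENOMEM if the allocation fails. */
int name_buf_reserve(struct name_buf *b, size_t name_len)
{
    size_t needed;
    char *tmp;

    /* SIZE_MAX + 1 wraps to 0, which would pass any size comparison. */
    if (name_len == SIZE_MAX) return -EOVERFLOW;
    needed = name_len + 1;

    if (b->len >= needed) return 0;   /* unsigned comparison, no sign surprises */

    tmp = realloc(b->ptr, needed);
    if (!tmp) return -ENOMEM;

    b->ptr = tmp;
    b->len = needed;
    return 0;
}
```

The key design choice is to reject the wrapping case before doing any arithmetic, so the comparison and the allocation always see the true required size.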

This buffer overflow may be exploited to achieve code execution by an attacker who can predict the memory layout at execution time, by overwriting a function pointer on the heap so that it points to a function of the attacker’s choosing, called with attacker-controlled parameters. I am not releasing my proof of concept at this time, but I plan to do so and write about the exploitation process at a future date, once the bug has been fixed and is unlikely to be exploited in the wild.

MITRE has assigned CVE-2017-9669 to this vulnerability. Another call to blob_realloc can be found in the code that handles a tar pax header, in which a buffer overflow may occur in the same fashion. It was assigned CVE-2017-9671.


There are numerous ways in which these particular buffer overflows could be exploited. The obvious one would be to try to achieve code execution on the system. The only prerequisite would be to figure out the memory layout of the program. Protections like ASLR or other hardening measures may block the attacker from succeeding, but he may be able to get around them and still achieve execution. More on this in my following post.

An actual attack scenario would be a man-in-the-middle attacker impersonating Alpine’s update server. The attacker would craft a malicious APKINDEX.tar.gz file (Alpine’s update file) and host it on his HTTP server. Whenever a user on the network either runs apk update on any machine (container or not) or builds a container image based on alpine (which calls that same command), the attacker’s file would be served and the attacker’s malicious code would execute on the victim’s machine. The attacker could execute code that hides the attack, and the victim might never know his machine was compromised.


Finally, regarding versions. All versions of apk since 2.5.0_rc1 are vulnerable to both CVEs. I took a look into older code, and it seems it may be vulnerable to a similar issue. So I would like to say that all versions are vulnerable, but it would be out of scope to really examine very old versions.


I privately informed Alpine’s developers of this issue. Apk’s maintainer, Timo Teräs (fabled), promptly responded to my emails and worked with me on understanding the implications and issuing a quick patch.

Teräs also mentioned implementing some additional hardening measures in apk, such as control-flow integrity, that may further restrict an attacker from exploiting this vulnerability and any similar ones.

The fixes are available from apk-tools 2.7.2 and 2.6.9, and all alpine repositories back to 3.2-stable have been updated with them (3.3-stable is actually the oldest supported version, but Timo updated 3.2 too).