Python Cache Poisoning as a Linux Privilege Escalation Technique

Securify

How misconfigured bytecode caching turns a Python performance feature into a local privilege escalation path — and why it keeps showing up in environments that otherwise look well-hardened.

Introduction

There’s a particular kind of finding that’s uncomfortable to present — not because it’s catastrophic, but because it’s embarrassing. When you show a team that one of their privileged automation scripts has been running with a world-writable __pycache__ directory for the past two years, the reaction isn’t usually alarm. It’s a long pause, followed by something like: “wait, that actually works?”

It does. Python’s bytecode caching mechanism, which every Python developer uses implicitly every single day, has a property that makes it interesting from an attacker’s perspective: Python trusts cached bytecode with essentially zero verification. Pair that with a privileged script and a writable cache directory, and you have a reliable local privilege escalation path that doesn’t require touching the source file, doesn’t require a running service, and in some configurations doesn’t leave obvious forensic artifacts.

This post is not a how-to. It’s a breakdown of the mechanism, the conditions that create vulnerability, and what actually matters when you’re trying to find and fix this in a real environment. If you’re a security engineer preparing for a Linux hardening review, a DevOps lead who owns automation infrastructure, or a compliance stakeholder trying to map real attack surface to framework controls, this is worth understanding in depth.

What Python Bytecode Caching Actually Does

When Python imports a module — or runs a script that imports one — it compiles the source code down to bytecode. Bytecode is a lower-level, platform-independent instruction set for the Python virtual machine. Loading ready-made bytecode is much cheaper than parsing and compiling the source again on every run.

To avoid repeating that compilation step on every execution, Python caches the bytecode of imported modules in .pyc files (the top-level script passed to the interpreter is not cached). In Python 3, these live in a __pycache__ directory co-located with the source, with filenames that embed the interpreter version:

Example: __pycache__ directory layout
# Typical __pycache__ layout after importing utils.py
myapp/
  run.py
  utils.py
  __pycache__/
    utils.cpython-311.pyc   # bytecode cache for Python 3.11
    utils.cpython-310.pyc   # separate cache per interpreter version
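
The mapping from a source file to its cache file can be confirmed from the standard library with importlib.util.cache_from_source. This is a minimal sketch; the helper name cache_path_for is hypothetical, and /opt/scripts/utils.py is an illustrative path that does not need to exist.

```python
import importlib.util
import sys


def cache_path_for(source_path):
    """Return the .pyc path the running interpreter would use for source_path."""
    return importlib.util.cache_from_source(source_path)


if __name__ == "__main__":
    # The tag (for example "cpython-311") depends on the interpreter running this.
    print("cache tag:", sys.implementation.cache_tag)
    print("cache file:", cache_path_for("/opt/scripts/utils.py"))
```

Running it under each interpreter installed on a host shows which cache filename that interpreter will actually look for.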

Python checks a small header in the .pyc file to decide whether the cache is still valid. The default validation mode relies on a timestamp and the source file’s size. If both match what’s recorded in the header, Python skips recompilation and loads the bytecode directly.

You can inspect that header yourself with the struct module:

Reading a .pyc header (Python 3)

import struct, sys

with open('__pycache__/utils.cpython-311.pyc', 'rb') as f:
    magic      = f.read(4)   # Python version magic number
    flags      = struct.unpack('<I', f.read(4))[0]
    timestamp  = struct.unpack('<I', f.read(4))[0]  # mtime of source
    sourcesize = struct.unpack('<I', f.read(4))[0]  # size of source in bytes

print(f'Flags:      {flags}')       # 0 = timestamp-based (default)
print(f'Timestamp:  {timestamp}')   # must match source mtime
print(f'Source size:{sourcesize}')  # must match source size

There is no cryptographic signature. No hash of the bytecode itself. The validation is purely based on filesystem metadata: timestamp and size.

KEY MECHANISM: Python 3.7 introduced hash-based .pyc files as an opt-in alternative (PEP 552), where the header stores a hash of the source rather than a timestamp. This isn’t the default, so most production Python installations still use timestamp validation.

Import Resolution Order and the Attack Surface

When a script executes import utils, Python searches a list of directories in order. That list — stored in sys.path — typically puts the script’s own directory first:

sys.path resolution order
# sys.path at runtime for /opt/scripts/run.py
[
  '/opt/scripts',          # script directory, checked FIRST
  '/usr/lib/python311.zip',
  '/usr/lib/python3.11',
  '/usr/lib/python3.11/lib-dynload',
  '/usr/local/lib/python3.11/dist-packages',
]

# Python resolves 'import utils' by locating the source, then its cache:
#   /opt/scripts/utils.py                           <-- found first on sys.path
#   /opt/scripts/__pycache__/utils.cpython-311.pyc  <-- loaded instead of recompiling
#                                                       if its header matches the source

The cache directory Python checks for bytecode is co-located with the script. So the chain looks like this: a privileged process runs a Python script, that script imports a module, Python checks the local __pycache__ directory for a valid .pyc file, and if the header metadata matches the source, Python loads it — without further verification.
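
To check which source file and which cache file an import would use, without executing the module, importlib.util.find_spec reports both. The sketch below is illustrative only: the module name cache_demo_utils and the temporary directory are hypothetical stand-ins for a helper module sitting next to a privileged script.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile


def resolve(module_name):
    """Return (source path, cache path) an import of module_name would use, or None."""
    importlib.invalidate_caches()          # make sure freshly created files are seen
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin, spec.cached


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as script_dir:
        # Hypothetical helper module standing in for utils.py.
        pathlib.Path(script_dir, "cache_demo_utils.py").write_text("VALUE = 1\n")
        sys.path.insert(0, script_dir)     # mimics the script directory being first
        try:
            origin, cached = resolve("cache_demo_utils")
            print("source:", origin)
            print("cache: ", cached)
        finally:
            sys.path.remove(script_dir)
```

The cache path is derived from the source path, so whoever controls the directory that holds the source also determines where Python looks for bytecode.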

If an attacker can write to that __pycache__ directory, they can place a crafted .pyc file with arbitrary bytecode. When the privileged process next imports that module, the attacker’s code executes in that elevated context.

The source file itself is never touched. If you’re auditing file modification times on .py scripts, you won’t see anything. The attack surface is entirely in the cache layer.

The Specific Conditions That Create Vulnerability

This requires a specific combination of conditions — all must be present simultaneously. That’s actually why it’s worth understanding carefully: it tells you exactly what to look for.

Condition 1: A Privileged Python Script

The most common examples in practice are cron jobs running as root, systemd service units executing Python with elevated permissions, and sudoers rules that allow a lower-privileged user to run a script as root. Let’s look at each:

Common privileged Python execution contexts
# /etc/cron.d/backup — runs as root every night
0 2 * * * root /usr/bin/python3 /opt/scripts/backup.py

# /etc/sudoers entry — lets ‘deploy’ user run script as root, no password
deploy ALL=(root) NOPASSWD: /usr/bin/python3 /opt/scripts/deploy_helper.py

# /etc/systemd/system/healthcheck.service
[Service]
User=root
ExecStart=/usr/bin/python3 /opt/monitoring/healthcheck.py

Condition 2: A Writable __pycache__ Directory

This is the permission misconfiguration that enables the attack. Let’s look at what a vulnerable directory looks like versus a properly hardened one:

Permission comparison: vulnerable vs. safe __pycache__
# — VULNERABLE: world-writable __pycache__ —
$ ls -la /opt/scripts/
drwxr-xr-x  root root   .                  # script dir: OK
-rwxr-xr-x  root root   backup.py          # script: OK
drwxrwxrwx  root root   __pycache__        # !! world-writable

# — ALSO VULNERABLE: group-writable, group includes attackers —
drwxrwxr-x  root devops __pycache__        # writable by ‘devops’ group

# — SAFE: only root can write —
drwxr-xr-x  root root   __pycache__        # 755, others can only read

How does this happen? When a developer runs a script as root for the first time, Python creates __pycache__ using the process umask. If the system umask is 022, the directory is created 755 — fine. But if umask is 000 or someone later runs chmod 777 to fix a PermissionError during testing and forgets to revert it, the door is open.
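
The effect of the umask on a newly created directory is easy to reproduce. The sketch below assumes a POSIX system and uses a hypothetical helper, dir_mode_under_umask, that creates a directory the way a default os.mkdir call would (requested mode 0o777, reduced by the umask) and reports the resulting permission bits.

```python
import os
import stat
import tempfile


def dir_mode_under_umask(mask):
    """Create a directory while `mask` is the process umask; return its permission bits."""
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "__pycache__")
        previous = os.umask(mask)          # os.umask returns the previous mask
        try:
            os.mkdir(target)               # default requested mode is 0o777
        finally:
            os.umask(previous)             # always restore the original umask
        return stat.S_IMODE(os.stat(target).st_mode)


if __name__ == "__main__":
    for mask in (0o022, 0o002, 0o000):
        mode = dir_mode_under_umask(mask)
        print(f"umask {mask:03o} -> directory mode {mode:03o}")
```

A umask of 002, common for accounts that share a primary group, is enough to produce a group-writable cache directory without anyone ever running chmod.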

Audit commands: finding the vulnerable combination
# Quick audit: find world-writable __pycache__ dirs under /opt and /usr/local
find /opt /usr/local -type d -name '__pycache__' -perm -o+w 2>/dev/null

# Cross-reference with scripts that run as root via cron
grep -r 'python' /etc/cron.d/ /etc/cron.daily/ /var/spool/cron/

# Check sudoers for python script entries
grep -i python /etc/sudoers /etc/sudoers.d/* 2>/dev/null
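
The same permission sweep can be done in Python when the result needs to feed a report or another check. This is a sketch under the assumption that group-writable and world-writable caches are both of interest; the function name find_writable_caches is hypothetical.

```python
import os
import stat


def find_writable_caches(root):
    """Yield (path, mode) for __pycache__ dirs under root writable by group or others."""
    for dirpath, dirnames, _filenames in os.walk(root):
        if "__pycache__" in dirnames:
            cache_dir = os.path.join(dirpath, "__pycache__")
            mode = stat.S_IMODE(os.stat(cache_dir).st_mode)
            if mode & (stat.S_IWGRP | stat.S_IWOTH):
                yield cache_dir, mode


if __name__ == "__main__":
    for root in ("/opt", "/usr/local"):
        for path, mode in find_writable_caches(root):
            print(f"{mode:04o} {path}")
```

Whether a group-writable directory is actually a problem still depends on who is in the owning group, so treat the output as a list of candidates to review.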

Condition 3: Predictable Module Imports

The attacker needs to know which module to target. With read access to the script, this is trivial. Even without it, the .pyc filenames in __pycache__ reveal exactly which modules are imported — because the filename IS the module name:

Inferring import targets from __pycache__ filenames
# Even without reading backup.py, __pycache__ reveals its imports:
$ ls /opt/scripts/__pycache__/
backup.cpython-311.pyc      # only present if backup.py was itself imported at some point
utils.cpython-311.pyc       # backup.py imports 'utils'
db_helper.cpython-311.pyc   # backup.py imports 'db_helper'
config.cpython-311.pyc      # backup.py imports 'config'

# Target: replace db_helper.cpython-311.pyc with malicious bytecode
# that executes when the privileged cron job next imports 'db_helper'

SCOPE NOTE: This is a local privilege escalation technique. It requires an attacker who already has a foothold on the system: a shell session, a compromised service account, or code execution as an unprivileged user. It is not a remote entry point. But in incident response, most post-exploitation chains rely on exactly this kind of local escalation to move from www-data to root.
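
Defenders can run the same inference from the source side to learn which cache files matter for a given privileged script. The sketch below statically parses a script with the ast module and lists the imported top-level modules that have a .py file next to it. The function name local_imports is hypothetical, and the approach deliberately ignores packages (directories) and imports performed dynamically at runtime.

```python
import ast
import pathlib


def local_imports(script_path):
    """Return sorted top-level module names the script imports that have a .py file beside it."""
    script = pathlib.Path(script_path)
    tree = ast.parse(script.read_text(), filename=str(script))
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import a.b" binds and loads the top-level package "a"
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # absolute "from a.b import c"; relative imports are skipped
            names.add(node.module.split(".")[0])
    return sorted(n for n in names if (script.parent / f"{n}.py").exists())


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print(path, "->", local_imports(path))
```

Each name it returns corresponds to one .pyc in the sibling __pycache__ directory that the privileged process will load.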

Why This Gets Missed in Practice

Here’s the honest answer: it doesn’t look like a vulnerability until you understand it. A world-writable __pycache__ directory doesn’t trigger alerts. It’s not a CVE. It’s a permission misconfiguration that only becomes dangerous in a specific context, and identifying that context requires connecting filesystem permissions, process privilege, Python import behavior, and scheduler or service configuration simultaneously.

Most security tools and manual reviews focus on the script itself. Is the script writable? Is its parent directory writable? The cache directory — one level deeper, named with an underscore convention that signals “internal plumbing” — escapes scrutiny.

There’s also a cognitive gap in how developers think about .pyc files. These are widely understood as ephemeral build artifacts — something you .gitignore and forget. The idea that they represent a persistent, exploitable attack surface doesn’t fit the mental model.

The Role of Metadata in .pyc Headers

For a crafted .pyc to be loaded by a privileged process, the attacker must forge the header metadata correctly — specifically, the source file’s mtime and size. If either value doesn’t match the real source, Python discards the cache and recompiles from source, overwriting the malicious file.

The header structure for a standard timestamp-based .pyc looks like this:

.pyc header structure (conceptual, no exploit code)
# .pyc file binary layout (Python 3, timestamp-based validation)
#
# Offset  Size  Field
# ------  ----  -----
#      0     4  Magic number (Python version identifier)
#      4     4  Flags (0x00 = timestamp-based; 0x01 = hash-based unchecked;
#                      0x03 = hash-based checked)
#      8     4  Source last-modified timestamp (32-bit little-endian unix epoch)
#     12     4  Source file size in bytes (32-bit little-endian)
#     16     *  Marshalled code object (the actual bytecode)

# To successfully poison the cache, an attacker must:
#   1. Read the legitimate source file stat (mtime + size)
#   2. Embed those exact values at offsets 8 and 12 in the fake .pyc
#   3. Write valid marshalled bytecode after the 16-byte header

This is not cryptographically difficult; it is metadata transcription. It does require the ability to stat the source file, which needs only search (execute) permission on the containing directory, not read permission on the file itself. If the attacker cannot stat the source, they must guess the mtime and size, which is generally impractical for mtime (a Unix timestamp with one-second precision).

The critical defensive implication: file integrity monitoring that only watches .py source files will not detect this attack. The malicious .pyc will have a creation time consistent with normal cache generation. You need to monitor .pyc files too, and ideally validate header consistency against source stats.

Concept: detecting header mismatches in .pyc files
# Integrity check concept: compare .pyc header metadata against source stat
import pathlib
import struct

def check_pyc_consistency(pyc_path):
    """Return True if a timestamp-based .pyc header matches its source file,
    False on a mismatch, and None if the check does not apply."""
    pyc = pathlib.Path(pyc_path)
    # Reconstruct source path from __pycache__/name.cpython-XY.pyc
    src = pyc.parent.parent / (pyc.stem.split('.')[0] + '.py')

    if not src.exists():
        return None  # no source to compare against

    with open(pyc, 'rb') as f:
        f.read(4)  # skip magic number
        flags, cached_mtime, cached_size = struct.unpack('<III', f.read(12))
    if flags != 0:
        return None  # hash-based .pyc: these bytes hold a source hash instead

    src_stat = src.stat()
    # Python stores both values truncated to 32 bits
    actual_mtime = int(src_stat.st_mtime) & 0xFFFFFFFF
    actual_size  = src_stat.st_size & 0xFFFFFFFF

    if cached_mtime != actual_mtime or cached_size != actual_size:
        print(f'[ALERT] Header mismatch: {pyc}')
        print(f'  Cached  mtime={cached_mtime}, size={cached_size}')
        print(f'  Actual  mtime={actual_mtime}, size={actual_size}')
        return False
    return True
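
A header check only catches careless tampering, because a correctly forged header matches the source by construction. A stricter idea is to compare the cached code object with a fresh compile of the source. The sketch below is one possible approach, not an established tool: the function name is hypothetical, it must run under the same interpreter version that produced the cache, and it unmarshals the cached file, which the marshal documentation warns is not safe against maliciously constructed data. Run something like this as an unprivileged user.

```python
import importlib.util
import marshal
import pathlib


def cached_code_matches_source(pyc_path, src_path):
    """Return True/False for a code match, or None if the .pyc is from another interpreter."""
    raw = pathlib.Path(pyc_path).read_bytes()
    if raw[:4] != importlib.util.MAGIC_NUMBER:
        return None                        # different bytecode format: cannot compare
    cached_code = marshal.loads(raw[16:])  # skip the 16-byte header; does not execute the code
    source = pathlib.Path(src_path).read_bytes()
    fresh_code = compile(source, str(src_path), "exec", dont_inherit=True)
    return cached_code == fresh_code


if __name__ == "__main__":
    import sys
    pyc_file, source_file = sys.argv[1:3]
    print(cached_code_matches_source(pyc_file, source_file))
```

A False result means the cache does not correspond to the source on disk, regardless of what the header claims. It should be run with the same optimization level as the process that wrote the cache, otherwise legitimate files can differ.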

Real-World Contexts Where This Shows Up

The risk profile varies considerably by environment. These are the contexts where this finding surfaces most frequently.

Internal Automation and Shared DevOps Tooling

A team builds Python utilities for deployment or log rotation. These accumulate over months, running as root in a shared /opt directory. Nobody has revisited the permissions since initial deployment. The __pycache__ directories were generated automatically at first run and haven’t been thought about since.

Vulnerable shared DevOps directory structure
# Typical vulnerable shared tooling layout
/opt/devops/
  deploy.py        # owned root, runs via sudo
  backup.py        # scheduled cron, runs as root
  utils.py         # shared helper module
  config.py        # reads secrets from /etc/
  __pycache__/     # drwxrwxr-x devops group — VULNERABLE
    utils.cpython-311.pyc
    config.cpython-311.pyc

# Any member of the ‘devops’ group can overwrite config.cpython-311.pyc
# The next time backup.py (running as root) imports config, it loads
# the attacker’s bytecode in a root context.

Cron Infrastructure

Scheduled jobs often run as root and are rarely revisited after initial setup. Scripts executed by cron don’t go through the same code review cycle as application code. Permissions drift over time.

Cron job with world-writable cache directory
# /etc/cron.d/maintenance
30 3 * * * root /usr/bin/python3 /opt/maintenance/rotate_logs.py

# Check: what does the __pycache__ look like?
$ stat /opt/maintenance/__pycache__
  File: /opt/maintenance/__pycache__
  Mode: 0777/drwxrwxrwx    # world-writable: VULNERABLE
  Uid: 0 (root)  Gid: 0 (root)

# Any local user can replace a .pyc that rotate_logs.py imports,
# wait until 03:30, and have their code execute as root.

Containerized Workloads Running as Root

Containers that run as root internally and use bind mounts extend this attack surface from container to host. If the bind-mounted path is writable by an unprivileged process on the host, the host’s filesystem becomes the attack surface.

Container bind mount risk and safer alternative
# docker-compose.yml (simplified)
services:
  worker:
    image: myapp:latest
    user: root                       # runs as root inside container
    volumes:
      - /srv/scripts:/app/scripts    # bind mount from host

# If /srv/scripts/__pycache__ is writable by an unprivileged host user,
# they can poison the cache consumed by the root-running container process.

# Safer alternative: use a read-only bind mount for script directories
    volumes:
      - /srv/scripts:/app/scripts:ro  # :ro blocks writes from inside the container;
                                      # host-side permissions must still be locked down

Legacy Python 2 Applications

Python 2’s .pyc files sit directly in the source directory alongside the .py files, making the permission surface more obvious but the problem equally real. Python 2 has been end-of-life since January 2020 but still runs in many production environments.

Python 2: .pyc beside source, simpler header
# Python 2: no __pycache__ subdirectory — .pyc lives beside the source
/opt/legacy/
  app.py
  utils.py
  utils.pyc      # Python 2 cache — same directory, same permissions

# If the directory is group-writable, utils.pyc is directly overwritable.
# No subdirectory to check separately — it’s all in one place.

# Python 2 .pyc header is slightly different (8 bytes total before bytecode):
# Offset 0: 4-byte magic | Offset 4: 4-byte mtime (no size field)

Strategic Remediation

Fixing this is not complicated. The hard part is finding it. Here’s how to approach remediation across the layers that matter.

1. Filesystem Permissions — The Immediate Fix

Privileged script directories and their __pycache__ subdirectories should not be writable by any user other than the owning account. Treat them like system files.

Hardening script directory permissions
# Harden a privileged script directory
chown -R root:root /opt/scripts/
chmod -R 755 /opt/scripts/

# Explicitly lock down any existing __pycache__ directories
find /opt/scripts -type d -name '__pycache__' -exec chmod 755 {} \;
find /opt/scripts -name '*.pyc' -exec chmod 644 {} \;

# Verify: no world-writable or group-writable cache dirs remain
find /opt/scripts -type d -name '__pycache__' \( \
    -perm -o+w -o -perm -g+w \) -print
# Should produce no output.
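
Where the fix has to be applied from configuration management or a Python-based deployment tool, the permission tightening can be expressed directly. This is a sketch with a hypothetical function name; it only clears group and other write bits and does not change ownership, so the chown step above is still needed.

```python
import os
import stat

WRITE_BY_OTHERS = stat.S_IWGRP | stat.S_IWOTH


def strip_cache_write_bits(root):
    """Clear group/other write bits on __pycache__ dirs and .pyc files under root."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        targets = []
        if os.path.basename(dirpath) == "__pycache__":
            targets.append(dirpath)
        targets += [os.path.join(dirpath, name)
                    for name in filenames if name.endswith(".pyc")]
        for path in targets:
            mode = stat.S_IMODE(os.stat(path).st_mode)
            if mode & WRITE_BY_OTHERS:
                os.chmod(path, mode & ~WRITE_BY_OTHERS)
                changed.append(path)       # record what was actually modified
    return changed


if __name__ == "__main__":
    for path in strip_cache_write_bits("/opt/scripts"):
        print("tightened:", path)
```

Returning the list of modified paths makes the run auditable: an empty list on the second run confirms nothing has drifted back.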

2. PYTHONDONTWRITEBYTECODE — Eliminate the Cache

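For privileged scripts that don’t need caching performance, the simplest fix is telling Python not to write cache files at all. Note that this only stops Python from writing new .pyc files; it still loads any that already exist. Delete leftover __pycache__ directories and keep the script directory itself non-writable so nobody else can recreate them.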

Ways to disable .pyc generation for privileged scripts
# Option A: environment variable in cron or systemd
# /etc/cron.d/backup
0 2 * * * root PYTHONDONTWRITEBYTECODE=1 /usr/bin/python3 /opt/scripts/backup.py

# Option B: -B flag passed directly to the interpreter
ExecStart=/usr/bin/python3 -B /opt/monitoring/healthcheck.py

# Option C: set it at the top of the script itself
import sys
sys.dont_write_bytecode = True  # must be set before any imports

# Option D: in systemd service unit
[Service]
Environment=PYTHONDONTWRITEBYTECODE=1
ExecStart=/usr/bin/python3 /opt/scripts/deploy_helper.py
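
The effect of the -B flag can be verified in a throwaway directory before rolling it out. The sketch below starts a child interpreter on a tiny script that imports a sibling module and reports whether a __pycache__ directory appeared. The file names helper_demo.py and run_demo.py and the function name are hypothetical.

```python
import os
import pathlib
import subprocess
import sys
import tempfile


def cache_created(extra_args):
    """Run a script that imports a sibling module; report whether __pycache__ appeared."""
    # Drop variables that would change caching or path behaviour in the child process.
    ignored = ("PYTHONDONTWRITEBYTECODE", "PYTHONPYCACHEPREFIX", "PYTHONSAFEPATH")
    env = {k: v for k, v in os.environ.items() if k not in ignored}
    with tempfile.TemporaryDirectory() as workdir:
        pathlib.Path(workdir, "helper_demo.py").write_text("VALUE = 1\n")
        pathlib.Path(workdir, "run_demo.py").write_text("import helper_demo\n")
        subprocess.run([sys.executable, *extra_args, "run_demo.py"],
                       cwd=workdir, env=env, check=True)
        return pathlib.Path(workdir, "__pycache__").is_dir()


if __name__ == "__main__":
    print("default run created cache:", cache_created([]))
    print("run with -B created cache:", cache_created(["-B"]))
```

The check covers writing only. A cache file that already exists is still read with -B in effect, which is why the cleanup and permission steps remain necessary.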

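3. Hash-Based Validation (Python 3.7+)

For environments where you want to keep caching but harden the validation, compile .pyc files in checked-hash mode. Python will recompute the source hash at import time and reject any cache that doesn’t match. Keep in mind that the hash covers the source, not the bytecode: it raises the bar from knowing the source’s stat data to knowing its contents, but it does not replace correct permissions on the cache directory.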

Compiling .pyc files with hash-based validation
# Compile a single module with checked-hash validation
# (-f forces a rebuild even if an up-to-date timestamp-based cache exists)
python3 -m compileall -f --invalidation-mode checked-hash utils.py

# Compile an entire directory tree
python3 -m compileall -f --invalidation-mode checked-hash /opt/scripts/

# Verify: the flags field in the header should be 0x03 (not 0x00)
import struct
with open('__pycache__/utils.cpython-311.pyc', 'rb') as f:
    f.read(4)  # magic
    flags = struct.unpack('<I', f.read(4))[0]
    print('Hash-based checked' if flags == 3 else
          'Hash-based unchecked' if flags == 1 else
          'Timestamp-based (DEFAULT, not hardened)')
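
The same compilation can be driven from Python with py_compile.compile and its invalidation_mode parameter, which makes it easy to confirm the flag values on the interpreter you actually run. The sketch below compiles a throwaway module in each mode and reads the flags field back; the function name flags_for and the file name utils_demo.py are hypothetical.

```python
import os
import py_compile
import struct
import tempfile


def flags_for(mode):
    """Compile a throwaway module with the given invalidation mode; return its header flags."""
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "utils_demo.py")
        with open(src, "w") as handle:
            handle.write("VALUE = 1\n")
        pyc = py_compile.compile(src, cfile=src + "c", doraise=True,
                                 invalidation_mode=mode)
        with open(pyc, "rb") as handle:
            handle.read(4)                                  # magic number
            return struct.unpack("<I", handle.read(4))[0]   # flags field


if __name__ == "__main__":
    for mode in py_compile.PycInvalidationMode:
        print(f"{mode.name:15} flags={flags_for(mode)}")
```

Bit 0 of the flags marks a hash-based file and bit 1 marks it as checked at import time, so checked-hash files carry both bits.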

4. File Integrity Monitoring

AIDE and Tripwire can be configured to watch __pycache__ directories alongside source files. Any unexpected modification to a .pyc file in a monitored directory should trigger an alert.

AIDE configuration for monitoring __pycache__ directories
# AIDE configuration snippet: monitor privileged script directories
# /etc/aide/aide.conf (or aide.conf.d/)

# Monitor source and cache together with strict rules
/opt/scripts/         CONTENT_EX
/opt/scripts/__pycache__/  CONTENT_EX

# CONTENT_EX includes: sha256, size, mtime, inode, uid, gid, permissions
# Any change to a .pyc file will produce an alert on the next aide --check

# Initialize the database after hardening:
# aide --init && mv /var/lib/aide/aide.db.new /var/lib/aide/aide.db

# Run check (e.g., in a daily cron):
# aide --check
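
Where a full integrity monitoring product is not available, the underlying idea is simple enough to sketch: record a hash of every .pyc file after hardening, then compare later snapshots against that baseline. The two function names below are hypothetical, and this is an illustration of the concept rather than a replacement for AIDE or Tripwire. In particular, the baseline itself has to be stored somewhere the monitored accounts cannot modify.

```python
import hashlib
import pathlib


def snapshot(root):
    """Map each .pyc path under root to the SHA-256 of its contents."""
    return {
        str(path): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(pathlib.Path(root).rglob("*.pyc"))
    }


def changed_since(baseline, current):
    """Return paths that were added, removed, or modified relative to the baseline."""
    paths = set(baseline) | set(current)
    return sorted(p for p in paths if baseline.get(p) != current.get(p))


if __name__ == "__main__":
    before = snapshot("/opt/scripts")
    # ... later, for example from a scheduled job ...
    after = snapshot("/opt/scripts")
    for path in changed_since(before, after):
        print("changed:", path)
```

A legitimate source update also regenerates the cache, so a changed .pyc is only suspicious when the corresponding .py file did not change in the same deployment.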

5. Auditing Sudo and Cron for Python Scripts

The audit should enumerate all privileged Python execution contexts and check each one’s directory permissions systematically.

Audit script: privileged Python contexts + writable caches
#!/bin/bash
# Audit script: find privileged Python + writable cache combinations

echo '=== Cron jobs executing Python ==='
grep -rh 'python' /etc/cron.d/ /etc/cron.daily/ /etc/cron.weekly/ \
        /var/spool/cron/crontabs/ 2>/dev/null | grep -v '^#'

echo ''
echo '=== Sudoers entries for Python ==='
grep -i 'python' /etc/sudoers /etc/sudoers.d/* 2>/dev/null

echo ''
echo '=== Systemd units running Python as root ==='
# Note: units with no User= line also run as root by default and are missed here
grep -rl 'python' /etc/systemd/system/ /lib/systemd/system/ 2>/dev/null |
  xargs grep -l 'User=root\|User=0' 2>/dev/null

echo ''
echo '=== World/group-writable __pycache__ dirs under /opt /usr/local ==='
find /opt /usr/local -type d -name '__pycache__' -perm -g+w 2>/dev/null
find /opt /usr/local -type d -name '__pycache__' -perm -o+w 2>/dev/null
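
The script above prints the two halves of the problem separately. Joining them, so that only scripts whose own cache directory is writable are reported, takes a few lines of Python. This is a sketch under simplifying assumptions: the function name and regular expression are hypothetical, script paths are assumed to be absolute and free of spaces, and the text to scan (a crontab, a sudoers file, a unit file) is passed in as a string.

```python
import os
import re
import stat

# Absolute paths ending in ".py" (but not ".pyc"), delimited by whitespace or quotes.
SCRIPT_RE = re.compile(r"(/[^\s'\"]+\.py)\b")


def scripts_with_writable_cache(config_text):
    """Return .py paths named in config_text whose sibling __pycache__ is group/world-writable."""
    risky = []
    for script in sorted(set(SCRIPT_RE.findall(config_text))):
        cache_dir = os.path.join(os.path.dirname(script), "__pycache__")
        try:
            mode = os.stat(cache_dir).st_mode
        except OSError:
            continue                       # no cache directory, or not accessible
        if mode & (stat.S_IWGRP | stat.S_IWOTH):
            risky.append(script)
    return risky


if __name__ == "__main__":
    import sys
    for config_path in sys.argv[1:]:
        with open(config_path) as handle:
            for script in scripts_with_writable_cache(handle.read()):
                print(f"{config_path}: {script}")
```

Every line it prints is a privileged entry point paired with a cache directory someone else can write to, which is the combination this post is about.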

COMPLIANCE MAPPING

SOC 2 CC6.1 / CC6.3: Logical access controls and access restriction. Writable privileged script directories are a direct violation of least-privilege principles that SOC 2 CC6.x expects to be enforced.

ISO 27001 A.9.4: System and application access control. A.9.4.1 requires information access restriction; A.9.4.5 requires access control to program source code. The cache directory, as part of the Python execution chain, falls within scope.

CIS Benchmark for Linux (v3.x): Section 6 (File Permissions) includes guidance on world-writable directories and system script permissions. Privileged Python script directories should be treated with the same rigor as /etc/cron.d.

Prioritization and What Actually Matters

Not every writable __pycache__ directory represents the same risk. Prioritize by the privilege level of the process consuming the cache and the number of users who can reach the writable path.

Risk triage for prioritizing remediation
# Risk triage: HIGH -> MEDIUM -> LOW

# HIGH: root cron jobs + world-writable __pycache__
#   Any local user can exploit this without triggering the privileged command
#   themselves — they just wait for the scheduler.

# HIGH: sudoers NOPASSWD Python scripts + group-writable __pycache__
#   Attacker can trigger execution on demand after poisoning the cache.

# MEDIUM: privileged systemd service + group-writable __pycache__
#   Requires service restart or natural execution cycle.

# LOW: non-privileged scripts in writable directories
#   Lateral movement risk, not vertical escalation.
#   Still worth fixing — just not P1.

One more thing for engineering leaders: automated scanning helps here, but generic vulnerability scanners won’t surface this. You need tooling that understands Python’s import mechanics and correlates execution context with filesystem permissions. The audit script above is a starting point. A full review by someone who knows what they’re looking for will surface things automation misses.

Closing Thoughts

The finding that’s hardest to operationalize isn’t the one with the highest CVSS score. It’s the one that requires connecting three separate data points that no single tool is watching. Python cache poisoning as a local privilege escalation path is exactly that — technically straightforward once you see it, consistently overlooked because it lives at the intersection of performance optimization, filesystem permissions, and execution context.

The remediation isn’t exotic. Tighten directory permissions. Disable unnecessary caching on privileged scripts. Add __pycache__ to your integrity monitoring scope. Audit your cron and sudoers configurations with this threat model in mind.

But finding it in the first place requires knowing where to look.