Major Python Infrastructure Breach – Over 170K Users Compromised. How Safe Is Your Code?

The Checkmarx Research team has uncovered a sophisticated attack campaign that used fake Python infrastructure to target the software supply chain, affecting over 170,000 users, including a GitHub organization and several individual developers. The attack combined several techniques: account takeover via stolen browser cookies, malicious code contributions made with verified commits, a custom Python mirror, and malicious packages published to the PyPI registry.

Key Insights

  • Silent Software Supply Chain Assault: The attackers orchestrated a silent assault on the software supply chain, employing multiple tactics to steal sensitive information from unsuspecting victims. This included the creation of malicious open-source tools with enticing descriptions to lure victims, most of whom were likely redirected from search engines.
  • The Use of a Fake Python Mirror: A cornerstone of this campaign was the distribution of a malicious dependency through a counterfeit Python infrastructure, which was linked to popular projects on GitHub and legitimate Python packages. The attackers not only hijacked GitHub accounts to spread malicious Python packages but also engaged in social engineering to amplify their reach.
  • A Multi-Stage, Evasive Payload: The attack featured a complex, multi-stage payload designed to harvest valuable data such as passwords and credentials from infected systems and exfiltrate it to the attackers’ infrastructure. Notably, a fake Python package mirror was deployed to distribute a poisoned version of the widely used “colorama” package.

One notable victim shared their experience of encountering suspicious activity related to the “colorama” package, which ultimately led to the realization that they had been hacked. This account underscores the stealth and deceit employed in the campaign, with the attackers leveraging fake Python mirrors and typosquatting to deceive users and spread malware through malicious GitHub repositories.

The Technical Backbone of the Attack

The fake Python mirror, appearing under the domain “files[.]pypihosted[.]org”, mimicked the official Python package mirror, playing a crucial role in the attack’s success. By hosting a tampered version of “colorama” laden with malicious code and utilizing stolen GitHub identities to commit changes to reputable repositories, the attackers showcased a sophisticated understanding of the software supply chain’s vulnerabilities.

Attack Techniques Used

The campaign combined several techniques to compromise over 170,000 users. Here’s a breakdown of the key ones:

  1. Account Takeover via Stolen Browser Cookies: The attackers gained unauthorized access to GitHub accounts by stealing session cookies. This allowed them to bypass authentication measures and perform malicious activities without the need to know the accounts’ passwords.
  2. Malicious Code Contributions with Verified Commits: Utilizing the hijacked accounts, the attackers contributed malicious code to reputable projects. These contributions often appeared as legitimate due to the use of verified commits, making them harder to detect.
  3. Setting Up a Custom Python Mirror: A central element of the campaign was the establishment of a counterfeit Python package mirror. This mirror hosted poisoned versions of popular Python packages, including a tampered version of “colorama” that contained malicious code.
  4. Publishing Malicious Packages to the PyPI Registry: The attackers published harmful packages to the Python Package Index (PyPI), exploiting the trust the Python community places in this repository. These packages often had clickbait descriptions to attract victims, many of whom were redirected from search engines.
  5. Typosquatting and Fake Python Mirror for Package Distribution: The domain “files[.]pypihosted[.]org” was registered as part of the attack, cleverly typosquatting the official Python mirror’s domain to deceive users into downloading malicious packages.
  6. Social Engineering to Increase Credibility and Visibility: By taking over reputable GitHub accounts, the attackers were able to star multiple malicious repositories, increasing their visibility and the likelihood of other users trusting and downloading from these sources.
  7. Multi-Stage, Evasive Malicious Payload: The attack deployed a multi-stage payload that initially appeared benign but was designed to harvest and exfiltrate valuable data, such as passwords and credentials, from infected systems. This payload was sophisticated, employing obfuscation and evasion techniques to avoid detection.

Each of these techniques demonstrates the attackers’ deep understanding of both social engineering and technical vulnerabilities within the software supply chain. The combination of these methods allowed for a highly effective and damaging attack.
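
The typosquatting technique lends itself to a simple defensive check. The following is a minimal sketch, not a vetted detector: it assumes a project keeps its own set of trusted package hosts (`TRUSTED_HOSTS`, the function name, and the 0.8 similarity threshold are illustrative choices, not part of the original research) and flags any host that is suspiciously similar to, but not the same as, a trusted one.

```python
from difflib import SequenceMatcher

# Assumption: the hosts this project legitimately downloads packages from.
TRUSTED_HOSTS = {"pypi.org", "files.pythonhosted.org"}


def lookalike_of(host, trusted=TRUSTED_HOSTS, threshold=0.8):
    """Return the trusted host that `host` appears to imitate, or None.

    A host that is exactly in the trusted set is never flagged. The
    threshold is a heuristic and will need tuning for real use.
    """
    host = host.lower().rstrip(".")
    if host in trusted:
        return None
    for good in sorted(trusted):
        if SequenceMatcher(None, host, good).ratio() >= threshold:
            return good
    return None
```

For example, re-fanging the indicator from this article with `"files[.]pypihosted[.]org".replace("[.]", ".")` and passing it to `lookalike_of` reports it as an imitation of the official file host, while an unrelated domain is ignored.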

Hosting a Poisoned ‘colorama’

The attackers hosted a poisoned version of “colorama”, a widely used package in the Python community with over 150 million monthly downloads. Here’s how they executed this part of their sophisticated attack:

  1. Copying and Modifying “Colorama”: The threat actors started by copying the legitimate “colorama” package and inserting malicious code into it. This code was designed to be part of the package’s functionality, making it difficult to detect without thorough inspection.
  2. Concealing the Malicious Code: The harmful payload was concealed within the modified “colorama” package using space-padding. This method pushed the malicious code off-screen in text editors, requiring users to scroll horizontally to discover it. This technique significantly decreased the likelihood of the malicious content being spotted during casual review.
  3. Using a Typosquatted Domain for Hosting: The modified, malicious version of “colorama” was hosted on a fake Python mirror. This mirror was reachable at “files[.]pypihosted[.]org”, a typosquatted domain chosen to closely resemble the domain of the official Python package hosting service and deceive users.
  4. Distributing the Poisoned Package: To spread the poisoned “colorama”, the attackers manipulated project dependencies. They committed changes to reputable projects on GitHub, modifying the requirements.txt files to include the malicious package version hosted on their fake mirror. This ensured that when the project was installed or updated, the poisoned “colorama” would be downloaded and executed.
  5. Evading Detection: The strategic use of a typosquatted domain, along with the method of concealing malicious code within a legitimate package, made this attack particularly evasive. The attackers’ efforts to blend the malicious package into normal dependencies made it challenging for users and automated tools to identify the threat.

By hosting this poisoned “colorama” package on their fake Python infrastructure and linking it to popular projects, the attackers were able to execute a silent supply chain attack, compromising the systems of unsuspecting developers and users. This attack underscores the importance of verifying the sources of software dependencies and the need for vigilance in the face of increasingly sophisticated cyber threats.
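
Because the poisoned dependency was introduced through requirements.txt entries pointing at the fake mirror, one way to verify dependency sources is to list every URL in a requirements file whose host is not explicitly allowed. The sketch below assumes an allow-list of hosts (`ALLOWED_HOSTS` is an illustrative name and default) and uses a deliberately simple URL pattern; it is a review aid, not a full requirements parser.

```python
import re
from urllib.parse import urlparse

# Assumption: hosts that this project accepts as package sources.
ALLOWED_HOSTS = {"pypi.org", "files.pythonhosted.org"}

URL_RE = re.compile(r"https?://[^\s'\"]+")
COMMENT_RE = re.compile(r"(^|\s)#.*$")  # '#' only starts a comment after whitespace


def untrusted_urls(requirements_text, allowed=ALLOWED_HOSTS):
    """Return (line_number, host, url) for each URL whose host is not allowed."""
    findings = []
    for lineno, line in enumerate(requirements_text.splitlines(), start=1):
        line = COMMENT_RE.sub("", line)
        for url in URL_RE.findall(line):
            host = (urlparse(url).hostname or "").lower()
            if host not in allowed:
                findings.append((lineno, host, url))
    return findings
```

This catches direct URL requirements as well as `--index-url` and `--extra-index-url` lines, since both carry a URL. It does not replace hash pinning, which also protects against tampered files served from an allowed host.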

The malicious package was deployed through a multi-stage process. Here’s a breakdown of the stages through which the poisoned “colorama” was delivered and executed on victims’ systems:

Stage 1: Initial Download and Execution

  • Malicious Repository or Package Download: The unsuspecting user clones a repository or downloads a package that contains a malicious dependency. This dependency points to the poisoned “colorama” package hosted on the attackers’ fake Python mirror (typosquatted domain “files[.]pypihosted[.]org”).
  • Execution of Initial Malicious Code: Upon installation or update, the malicious “colorama” package executes its payload, which includes additional malicious code. This stage sets the foundation for further exploitation.

Stage 2: Malicious Code Activation

  • Identical Code with Malicious Snippet: The “colorama” package contains code identical to the legitimate version, with the exception of a short malicious snippet. This snippet was initially located within a seemingly innocuous file but was strategically placed to ensure execution.
  • Obfuscation and Execution of Further Malicious Code: The attacker used significant whitespace to push the malicious code off-screen in text editors, requiring horizontal scrolling for discovery. This code, once executed, fetches another piece of Python code from a remote server, which installs necessary libraries and decrypts hard-coded data.
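
The whitespace trick described above can be surfaced mechanically. As a rough sketch, the function below reports lines in which visible text is followed by a long run of spaces or tabs and then more text. The function name and the 40-character gap are assumptions chosen for illustration, and a legitimate file with aligned columns could trigger false positives.

```python
import re


def find_padded_lines(source, min_gap=40):
    """Return (line_number, column) pairs where text resumes after a long gap.

    `column` is the 1-based position of the first character after the
    whitespace run, i.e. where the off-screen content starts.
    """
    gap = re.compile(r"(?<=\S)[ \t]{%d,}(?=\S)" % min_gap)
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = gap.search(line)
        if match:
            hits.append((lineno, match.end() + 1))
    return hits
```

Running this over every `.py` file of a downloaded package before installing it is cheap, and any hit is worth opening in an editor with line wrapping enabled.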

Stage 3: Payload Delivery

  • Fetching Additional Obfuscated Python Code: The malware progresses to fetch more obfuscated Python code from another external link. This code is then executed using Python’s “exec” function, initiating the next phase of the attack.
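
Since this stage relies on Python’s “exec” function to run fetched code, a static scan for dynamic-execution calls is a reasonable first filter when reviewing a package. The sketch below parses source with the standard `ast` module, so nothing is executed. The set of names it looks for is an assumption, and it only catches direct calls by name; obfuscated code that reaches these functions indirectly will not be detected.

```python
import ast

# Assumption: built-ins treated as "dynamic execution" for review purposes.
DYNAMIC_CALLS = {"exec", "eval", "compile", "__import__"}


def dynamic_calls(source):
    """Return sorted (line_number, function_name) pairs for direct calls
    to the built-ins listed in DYNAMIC_CALLS. The source is parsed only."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DYNAMIC_CALLS
        ):
            hits.append((node.lineno, node.func.id))
    return sorted(hits)
```

A hit is not proof of malice, since many legitimate libraries use these functions, but an `exec` applied to data that was just downloaded deserves close inspection.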

Stage 4: System Compromise and Data Harvesting

  • Advanced Obfuscation Techniques: Techniques such as the use of non-English character strings, compression, and misleading variable names complicate the analysis and understanding of the code.
  • Deployment of Final Malicious Payload: The code checks the compromised host’s operating system, selects a random folder and file name for the final malicious Python code, and retrieves it from a remote server.
  • Persistence Mechanism: The malware modifies the Windows registry to create a new run key, ensuring that the malicious code is executed every time the system restarts. This allows the malware to maintain its presence on the compromised system.
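
To check a Windows machine for this kind of persistence, the startup entries can be listed and filtered. The sketch below makes two assumptions that the report summarized here does not confirm: that the entry lives under the current user’s `Software\Microsoft\Windows\CurrentVersion\Run` key, and that a command mentioning Python or a `.py` file is worth a second look. On non-Windows systems the listing function simply returns an empty list.

```python
import sys

# Assumption: the per-user Run key; the source does not say which hive was used.
RUN_KEY = r"Software\Microsoft\Windows\CurrentVersion\Run"


def list_run_entries():
    """Return (name, command) pairs from the current user's Run key."""
    if sys.platform != "win32":
        return []
    import winreg  # only available on Windows

    entries = []
    try:
        with winreg.OpenKey(winreg.HKEY_CURRENT_USER, RUN_KEY) as key:
            value_count = winreg.QueryInfoKey(key)[1]
            for index in range(value_count):
                name, command, _value_type = winreg.EnumValue(key, index)
                entries.append((name, str(command)))
    except OSError:
        return []
    return entries


def python_entries(entries):
    """Keep entries whose command mentions Python or a .py file (heuristic)."""
    flagged = []
    for name, command in entries:
        lowered = command.lower()
        if "python" in lowered or ".py" in lowered:
            flagged.append((name, command))
    return flagged
```

Calling `python_entries(list_run_entries())` gives a short list to review by hand; an entry pointing at a randomly named folder or file is a stronger signal than the filter itself.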

Stage 5: Data Exfiltration

  • Broad Data-Stealing Capabilities: The final payload reveals the malware’s ability to target a wide range of applications and steal sensitive information. This includes data from web browsers, Discord, cryptocurrency wallets, Telegram sessions, and more.
  • Keylogging and File Stealing: A keylogging component captures the victim’s keystrokes, and a file stealer searches for files with specific keywords, targeting directories like Desktop and Downloads.
  • Exfiltration to Attacker’s Server: The stolen data, along with files compressed into ZIP files, are uploaded to the attacker’s server. Various techniques, including anonymous file-sharing services and direct HTTP requests, are used for data exfiltration.

These stages illustrate the meticulous planning and execution of the attack, showcasing the attackers’ technical sophistication and understanding of both software dependencies and human behavior. The multi-stage approach not only facilitated the deployment of the malicious payload but also helped in evading detection, making the attack particularly damaging.

Alongside the fake Python infrastructure and the poisoned “colorama” package, the attackers published several other malicious packages to the Python Package Index (PyPI) as part of their strategy to distribute malware through the Python package ecosystem. Below is a list of some of the packages involved in this campaign, along with their version numbers, the usernames of the publishers, and the publication dates:

  • jzyrljroxlca Version 0.3.2, published by user pypi/xotifol394 on 21-Jul-23
  • wkqubsxekbxn Version 0.3.2, published by user pypi/xotifol394 on 21-Jul-23
  • eoerbisjxqyv Version 0.3.2, published by user pypi/xotifol394 on 21-Jul-23
  • lyfamdorksgb Version 0.3.2, published by user pypi/xotifol394 on 21-Jul-23
  • hnuhfyzumkmo Version 0.3.2, published by user pypi/xotifol394 on 21-Jul-23
  • hbcxuypphrnk Version 0.3.2, published by user pypi/xotifol394 on 20-Jul-23
  • dcrywkqddo Version 0.4.3, published by user pypi/xotifol394 on 20-Jul-23
  • mjpoytwngddh Version 0.3.2, published by user pypi/poyon95014 on 21-Jul-23
  • eeajhjmclakf Version 0.3.2, published by user pypi/tiles77583 on 21-Jul-23
  • yocolor Version 0.4.6, published by user pypi/felpes on 05-Mar-24
  • coloriv Version 3.2, published by user pypi/felpes on 22-Nov-22
  • colors-it Version 2.1.3, published by user pypi/felpes on 17-Nov-22
  • pylo-color Version 1.0.3, published by user pypi/felpes on 15-Nov-22
  • type-color Version 0.4, published by user felipefelpes on 01-Nov-22

These packages, including variations of the “colorama” package and others with obscure or clickbait names, were part of a broader strategy to distribute malware. The attackers employed these packages as vectors for delivering malicious code to unsuspecting victims’ systems, exploiting the trust placed in the PyPI ecosystem and the routine use of these packages in Python projects.

This list provides a snapshot of the malicious packages published by the attackers, illustrating the scale and diversity of their efforts to infiltrate the software supply chain. Users and developers are urged to exercise caution and perform thorough vetting before incorporating third-party packages into their projects.
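
The package names listed above can be checked against a local environment. The following sketch compares installed distribution names, read through the standard `importlib.metadata` module, with that list. The helper names are illustrative, the name normalization follows the usual PyPI convention of treating runs of `-`, `_`, and `.` as equivalent, and the list itself is only the partial snapshot given in this article.

```python
import re
from importlib import metadata

# Package names taken from the list in this article (a partial snapshot).
MALICIOUS_NAMES = {
    "jzyrljroxlca", "wkqubsxekbxn", "eoerbisjxqyv", "lyfamdorksgb",
    "hnuhfyzumkmo", "hbcxuypphrnk", "dcrywkqddo", "mjpoytwngddh",
    "eeajhjmclakf", "yocolor", "coloriv", "colors-it", "pylo-color",
    "type-color",
}


def normalize(name):
    """Lowercase a project name and collapse '-', '_' and '.' runs to '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def flagged(installed, iocs=MALICIOUS_NAMES):
    """Return the sorted, normalized installed names that appear in `iocs`."""
    bad = {normalize(name) for name in iocs}
    return sorted({normalize(name) for name in installed} & bad)


def installed_names():
    """Names of all distributions visible to the running interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            names.append(name)
    return names
```

Calling `flagged(installed_names())` inside each virtual environment returns an empty list when none of the listed packages is present. An empty result says nothing about the poisoned “colorama”, which was installed under the legitimate package’s name from the fake mirror.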

This campaign exemplifies the advanced strategies malicious actors adopt to infiltrate and compromise trusted platforms like PyPI and GitHub. It serves as a stark reminder of the necessity for diligence when installing packages and repositories, even from seemingly reliable sources. Vigilance, thorough vetting of dependencies, and the maintenance of robust security measures are paramount in mitigating the risks posed by such sophisticated attacks.