It's a shame some ecosystems move waaay too fast, or don't have a good story for distro-specific packages. For example, I don't think there are Node.js libraries packaged for Debian that allow you to install them from apt and use them in projects. I might be wrong.
It's comparing the likelihood of an update introducing a new vulnerability to the likelihood of it fixing a vulnerability.
While the article frames this problem in terms of deliberate, intentional supply chain attacks, I'm sure the majority of bugs and vulnerabilities were never supply chain attacks: they were just ordinary bugs introduced unintentionally in the normal course of software development.
On the unintentional bug/vulnerability side, I think there's a similar argument to be made. Maybe even SemVer can help as a heuristic: a patch version increment is likely safer (less likely to introduce new bugs/regressions/vulnerabilities) than a minor version increment, so a patch version increment could have a shorter cooldown.
If I'm currently running version 2.3.4, and there's a new release 2.4.0, then (unless there's a feature or bugfix I need ASAP), I'm probably better off waiting N days, or until 2.4.1 comes out and fixes the new bugs introduced by 2.4.0!
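As a rough sketch of that heuristic (using the `packaging` library to parse versions; the day counts are just illustrative):

```python
# Sketch of a SemVer-aware cooldown heuristic (illustrative numbers only).
from packaging.version import Version

def cooldown_days(current: str, candidate: str) -> int:
    cur, new = Version(current), Version(candidate)
    if new.major > cur.major:
        return 30   # major bump: most likely to introduce regressions
    if new.minor > cur.minor:
        return 14   # minor bump: new features, moderate risk
    return 3        # patch bump: bugfix-only, shortest cooldown

print(cooldown_days("2.3.4", "2.4.0"))  # -> 14
print(cooldown_days("2.3.4", "2.3.5"))  # -> 3
```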
Bottom line: those security bugs are not all from version 1.0, and when you update you may well just be swapping known bugs for unknown bugs.
As has been said elsewhere: sure, monitor published issues and patch if needed, but don't just blindly update.
Libraries themselves should perhaps also take a page from the book of Linux distributions and offer LTS (long term support) releases that are feature frozen and include only security patches, which are much easier to reason about and periodically audit.
Ok, if this is such amazing advice and the entire ecosystem does it: just wait... then what? We wait even more, to be sure someone else is affected first?
Every time I see people saying you need to wait to upgrade, it feels like accumulating tech debt: the longer you wait, the more painful the upgrade will be. Just upgrade incrementally, and be sure you have mitigations like zero trust or monitoring to cut off any weird behavior early.
One of the classic scammer techniques is to introduce artificial urgency to prevent the victim from thinking clearly about a proposal.
I think this would be a weakness here as well: if enough projects adopt a "cooldown" policy, the focus of attackers would shift to manipulating projects into making an exception for "their" dependency and installing it before the regular cooldown period has elapsed.
How to do that? By playing the security angle once again: an attacker could make a lot of noise about how a new critical vulnerability was discovered in their project and how every dependent should upgrade to the emergency release as quickly as possible, or else; the "emergency release" would then be the actually compromised version.
I think a lot of projects could come under pressure to upgrade if the perceived vulnerability seems imminent and the only argument for not upgrading is some generic cooldown policy.
The attacker will try to figure out when they are the least available: during national holidays, when they sleep, during conferences they attend, when they are on sick leave, personal time off, ...
Many projects have only a few people, or even only a single person, who is going to notice. They are often from the same country (time zone) or even work at the same company (they might all attend the same conference or company retreat weekend).
Except that if everyone does it, the chance of malicious things being spotted in the source also drops, by virtue of fewer eyeballs.
It still helps in cases where the maintainer spots it, etc.
If the code just sits there for a week without anyone looking at it, and is then considered cooled down just due to the passage of time, then the cool down hasn't done anything beneficial.
A form of cooldown that would work in terms of mitigating problems is a gradual rollout. The idea is that the published change is somehow not visible to all downstreams at the same time.
Every downstream consumer declares a delay factor. If your delay factor is 15 days, then you see all new change publications 15 days later. If your delay factor is 0 days, you see everything as it is published, immediately. Risk-averse organizations configure longer delay factors.
This works because the risk-takers get hit with the problem, which then becomes known, protecting the risk-averse from being affected. Bad updates are scrubbed from the queue, so those who have not yet received them due to their delay factor will not see those updates.
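A rough sketch of how that could look on the registry side, assuming the registry records a publish timestamp per release and a set of scrubbed versions (all names here are hypothetical):

```python
# Hypothetical sketch: filter the releases a consumer is allowed to see
# based on their configured delay factor.
from datetime import datetime, timedelta, timezone

def visible_releases(releases, scrubbed, delay_days):
    """releases: {version: publish_datetime}; scrubbed: set of versions
    pulled after a bad update was discovered."""
    cutoff = datetime.now(timezone.utc) - timedelta(days=delay_days)
    return {
        version: published
        for version, published in releases.items()
        if published <= cutoff and version not in scrubbed
    }

releases = {
    "1.2.0": datetime(2024, 5, 1, tzinfo=timezone.utc),
    "1.3.0": datetime(2024, 5, 20, tzinfo=timezone.utc),  # freshly published
}
print(visible_releases(releases, scrubbed={"1.1.9"}, delay_days=15))
```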
I delay the use of updated software by a week, and anyone who doesn't takes the risk. Therefore I, the user of the cooldown, enjoy reduced risk at the expense of everyone not implementing a cooldown.
If everyone simply delays their updates, then there is nobody left to suffer an attack and thereby alert the users of the cooldown (in this case, everybody).
The blog post makes the argument that the vendors are incentivized to discover these attacks in this time, but that's an entirely different argument and if that were true, they would already be doing that.
In fact, auditing updates for vulnerabilities is the general solution. The whole appeal of the cooldowns is that you don't have to do that - the cost is that it's a zero-sum game reliant on the suffering of those less wise.
It’s one of the main reasons I used popular open source software: so I could ride the coattails of the rest of the community. Basically everyone else could be my beta tester.
A regular update was an input to the community security practice, so I would let it settle for a while. A security patch was an output of the community security practice, so I would install ASAP, even if it meant breaking a feature temporarily.
I also manually managed dependencies as commits to the main codebase, meaning my entire site was one deployable object from a single Git repo. The “modern” practice today seems to instead favor a minimal repo and resolving and pulling dependencies at deploy time. Personally I think this is a bad idea that has amplified the risk of supply chain attacks.
And now the idea is apparently back to: give it a little while. Tell the automatic dependency puller to chill out and wait.
My understanding of a cooldown, from video games, is a period of time after using an ability where you can't use it again. When you're firing a gun and it gets too hot, you have to wait while it cools down.
I was trying to apply this to the concept in the article and it wasn't quite making sense. I don't think "cooldown" is really used when taking a pie out of the oven, for example. I would call this more a trial period or verification window or something?
I can't, in good conscience, say "Don't use dependencies," which solves a lot of problems, but I can say "Let's be careful out there," to quote Michael Conrad.
I strongly suspect that a lot of dependencies get picked because they have a sexy Website, lots of GH stars and buzz, and cool swag.
I tend to research the few dependencies that I use. I don't depend lightly.
I'm also fortunate to be in the position where I don't need too many. I am quite aware that, for many stacks, there's no choice.
If you tell people that cooldowns are a type of test and that until the package exits the testing period, it's not "certified" [*] for production use, that might help with some organizations. Or rather, would give developers an excuse for why they didn't apply the tip of a dependency's dev tree to their PROD.
So... not complaining about cooldowns, just suggesting some verbiage around them to help contextualize the suitability of packages in the cooldown state for use in production. There are, unfortunately, several mid-level managers who are under pressure to close Jira tickets IN THIS SPRINT and will lean on the devs to cut whichever corners need to be cut to make it happen.
[*] for some suitable definition of the word "CERTIFIED."
Austral[0] gets this right. I'm not a user, just memeing a good idea when I see it.
Most languages could be changed to be similarly secure. No global mutable state, no system calls without capabilities, no manual crafting of pointers. All the capabilities come as tokens or objects passed into main, and they can be given out down the call tree as needed. It is such an easy thing to do at the language level, and it doesn't require any new syntax, just a new parameter on main, and the removal of a few bad ideas.
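As a rough illustration of the idea in a language that doesn't enforce it (so this is convention, not a guarantee): all ambient authority is wrapped in capability objects handed to main, and a dependency only gets what you explicitly pass down.

```python
# Sketch of capability-style design: code only touches the filesystem
# if it was handed a capability object for it.
class FileSystemCap:
    def read_text(self, path: str) -> str:
        with open(path) as f:      # only the capability object does real I/O
            return f.read()

def load_config(fs: FileSystemCap, path: str) -> str:
    # This "dependency" was given a filesystem capability and nothing else,
    # so it has no way to (say) open network connections.
    return fs.read_text(path)

def main(fs: FileSystemCap) -> None:
    print(load_config(fs, "settings.ini"))

if __name__ == "__main__":
    main(FileSystemCap())  # capabilities are created once, at the entry point
```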
P.S. When I was working at Amazon, I remember that a good number of on-call tickets were about fixing dependencies (most of them were about updating the outdated Scala Spark framework--I believe it was 2.1.x or older) and patching/updating OSes in our clusters. What the team should have done (I mentioned this to my manager) is create clusters dynamically (do not allow long-lived clusters, even if the end users prefer it that way) and upgrade the Spark library. Of course, we had a bunch of other annual and quarterly OKRs (and KPIs) to meet, so updating Spark got the lowest of priorities...
- you are vulnerable for 7 days because of a now public update
- you are vulnerable for x (hours/days) because of a supply chain attack
I think the answer is rather simple: subscribe to a vulnerability feed, evaluate & update. The number of times automatic updates are necessary is near zero. I say this as someone who has run libraries that were at times 5 to 6 years out of date, exposed to the internet, without a single event of compromise, and it's not like these were random services: they were viewed by hundreds of thousands of unique addresses. There were only three times in the last 4 years where I had to perform updates because a vulnerability in a publicly exposed service actually affected me.
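As a minimal sketch of the subscribe-and-evaluate part, you can query the public OSV API for each pinned dependency (the package name and version below are placeholders):

```python
# Check one pinned dependency against the OSV vulnerability database.
import json
import urllib.request

def osv_vulns(name: str, version: str, ecosystem: str = "PyPI"):
    query = {"version": version, "package": {"name": name, "ecosystem": ecosystem}}
    req = urllib.request.Request(
        "https://api.osv.dev/v1/query",
        data=json.dumps(query).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp).get("vulns", [])

for vuln in osv_vulns("requests", "2.19.0"):  # placeholder package/version
    print(vuln["id"], vuln.get("summary", ""))
```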
Okay, the never-being-compromised part is a lie because of PHP, it's always PHP (the Monero miner I am sure everyone is familiar with). The solution for that was to stop using PHP and associated software.
Another one I had problems with was CveLab (GitLab, if you couldn't tell); there have been so many critical updates pointing to highly exploitable CVEs that I decided to simply migrate off it.
In conclusion, in my experience avoiding bad software is just as important as updating, and it lowers the need for quick and automated action.
Stacking up more sub-par tooling is not going to solve anything.
Fortunately this is a problem that doesn't even have to exist, and isn't one that anyone falls into naturally. It's a problem that you have to actively opt into by taking steps like adding things to .gitignore to exclude them from source control, downloading and using third-party tools in a way that introduces this and other problems, et cetera—which means you can avoid all of it by simply not taking those extra steps.
(Fun fact: on a touch-based QWERTY keyboard, the gesture to input "vendoring" by swiping overlaps with the gesture for "benefitting".)
* If everybody does it, it won't work so well
* I've seen cases where folks pinned their dependencies, and then used "npm install" instead of "npm ci", so the pinning was worthless. Guess they are the accidental, free beta testers for the rest of us.
* In some ecosystems, distributions (such as Debian) do both additional QA and also apply a cooldown. Now we are trying to retrofit some of that into our package managers.
Something like: upgrade once there are N independent positive reviews AND fewer than M negative reviews (where you can configure which people or organisations you trust to audit). And of course you would be able to audit dependencies yourself (and make your review available for others).
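A sketch of what such a gate could look like, assuming each review carries the reviewer's identity and a verdict, and that you keep a local set of trusted auditors (all names here are hypothetical):

```python
# Hypothetical gate: upgrade only after N trusted positive reviews
# and fewer than M trusted negative ones.
def upgrade_allowed(reviews, trusted, n_required=3, m_negative=1):
    """reviews: list of (reviewer, verdict) where verdict is 'ok' or 'bad'."""
    positives = sum(1 for who, v in reviews if who in trusted and v == "ok")
    negatives = sum(1 for who, v in reviews if who in trusted and v == "bad")
    return positives >= n_required and negatives < m_negative

trusted = {"org:distro-security", "alice", "bob", "carol"}
reviews = [("alice", "ok"), ("bob", "ok"), ("carol", "ok"), ("mallory", "ok")]
print(upgrade_allowed(reviews, trusted))  # True: three trusted positive reviews
```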
Also, there should be a way to distinguish between security updates and normal updates for this. If there is, a cooldown is a useful idea in general for normal updates, since (presumably) the current version works and the new version may introduce bugs.
How?
It helps a lot with these dev cycles.
Recently, I saw even EmberJS moved away from it. I still think they could've done it without leaving semver. Maybe I'm wrong.
Also whoever keeps changing the entire API of node-redis, please stop.
The article does not discuss this tradeoff.
Copilot seems well placed with its GitHub integration here, it could review dependency suggestions, CVEs, etc and make pull requests.
Though I guess it'd be hard to prove intent in this case.
I know Ubuntu and others do the same but I don't know what they call their STS equivalent.
Half of my job when reviewing work is telling junior devs to try to do something without the dependency. Usually they just learn a very basic skill.
"Ok so you downloaded a thing called apscheduler, what does that do?" "So it's like cron?" "Can you use threads, or even better just use a separate process?" "Cool"
My (least) favourite example is the number of juniors who download a dependency from a random Indian dev to store and read credentials from .env. Just open() and read() the security-critical file, my dude.
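For the record, the whole of what that kind of package does is a few lines of stdlib, something like:

```python
# Minimal .env reader: no third-party dependency needed.
def load_env(path=".env"):
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values

secrets = load_env()
print(secrets.get("DATABASE_URL"))  # placeholder key
```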
I've been working on automatic updates for some of my [very overengineered] homelab infra, and one thing that I've found particularly helpful is to generate PRs with reasonable summaries of the updates with an LLM. It basically works by having a script that spews out diffs of any locks that were updated in my repository, while also computing things like `nix store diff-closures` for the before/after derivations. Once I have those diffs, I feed them into Claude Code in my CI job, which generates a pull request with nicely formatted output.
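A stripped-down sketch of that kind of pipeline (the paths and the `claude` CLI invocation here are my assumptions, not a drop-in script):

```python
# Sketch: collect update diffs and ask an LLM to draft a PR description.
import subprocess

def closure_diff(old_path: str, new_path: str) -> str:
    # Summarize what changed between two Nix closures.
    return subprocess.run(
        ["nix", "store", "diff-closures", old_path, new_path],
        capture_output=True, text=True, check=True,
    ).stdout

def draft_pr_body(diff_text: str) -> str:
    prompt = (
        "Summarize these dependency updates for a pull request description, "
        "flagging anything that looks risky:\n\n" + diff_text
    )
    return subprocess.run(
        ["claude", "-p", prompt],  # assumed non-interactive Claude Code call
        capture_output=True, text=True, check=True,
    ).stdout

print(draft_pr_body(closure_diff("./result-old", "./result-new")))
```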
One thing I've been thinking about is looking up all of the dependencies that were upgraded and having the LLM review the commits. Often Claude already seems to look up some of the commits itself and is able to give a high-level summary of the changes, but only for small dependencies where the commit hash and repository were in the lock file.
It would likely not help at all with the xz-utils backdoor, as IIRC the backdoor wasn't even in the git repo, but in the release tarballs. But I wonder if anyone is exploring this yet?
For projects with hundreds or thousands of active dependencies, the feed of security issues would be a real fire hose. You’d want to use an LLM to filter the security lists for relevance before bringing them to the attention of a developer.
It would be more efficient to centralize this capability as a service so that 5000 companies aren’t all paying for an LLM to analyze the same security reports. Perhaps it would be enough for someone to run a service like cooldown.pypi.org that served only the most vetted packages to everyone.
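As a back-of-the-envelope version of that idea, such a service could consult PyPI's JSON API and refuse to serve any release newer than the cooldown window (the endpoint is real; the policy and numbers are made up):

```python
# Sketch: pick the newest release of a package that is older than the cooldown.
import json
import urllib.request
from datetime import datetime, timedelta, timezone

def latest_cooled_release(package: str, cooldown_days: int = 14):
    with urllib.request.urlopen(f"https://pypi.org/pypi/{package}/json") as resp:
        releases = json.load(resp)["releases"]
    cutoff = datetime.now(timezone.utc) - timedelta(days=cooldown_days)
    cooled = []
    for version, files in releases.items():
        if not files:
            continue
        uploaded = min(
            datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
            for f in files
        )
        if uploaded <= cutoff:
            cooled.append((uploaded, version))
    return max(cooled)[1] if cooled else None

print(latest_cooled_release("requests"))  # newest release at least 14 days old
```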