If your site discusses databases then turning on the default SQL injection attack prevention rules will break your site. And there is another ruleset for file inclusion where things like /etc/hosts and /etc/passwd get blocked.
I disagree with other posts here: it is partially a balance between security and usability. You never know which service was implemented with possible security exploits, and being able to throw every WAF rule on top of your service does keep it more secure. It's just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts.
Fine tuning the rules is time consuming. You often have to just completely turn off the ruleset, because when you try to keep the ruleset on and allow the use-case there are a ton of changes you need to get implemented (if it's even possible). Page won't load because /etc/hosts was in a query param? Okay, now that you've fixed that, all the XHR included resources won't load because /etc/hosts is included in the referrer. Now that that's fixed, things still won't work because some random JS analytics lib put the URL visited in a cookie, etc, etc... There is a temptation to just turn the rules off.
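To make that whack-a-mole concrete, here is a minimal sketch of a deliberately naive substring rule. The rule, the request shape, and the names `BLOCKED_SUBSTRINGS` and `waf_blocks` are all assumptions made up for illustration, not any real WAF product's API.

```python
# Hypothetical, deliberately naive rule: flag any request part that
# mentions a "sensitive" path. Not modelled on any real WAF product.
BLOCKED_SUBSTRINGS = ("/etc/hosts", "/etc/passwd")


def waf_blocks(request: dict) -> list:
    """Return the names of the request parts that trip the rule."""
    headers = request.get("headers", {})
    parts = {
        "query": request.get("query", ""),
        "referer": headers.get("Referer", ""),
        "cookie": headers.get("Cookie", ""),
        "body": request.get("body", ""),
    }
    return [
        name
        for name, value in parts.items()
        if any(s in value for s in BLOCKED_SUBSTRINGS)
    ]


# The same string shows up in three different places for one page visit.
page_load = {"query": "q=/etc/hosts"}
xhr = {"headers": {"Referer": "https://example.com/search?q=/etc/hosts"}}
analytics = {"headers": {"Cookie": "last_url=https://example.com/search?q=/etc/hosts"}}

print(waf_blocks(page_load))   # ['query']
print(waf_blocks(xhr))         # ['referer']
print(waf_blocks(analytics))   # ['cookie']
```

Allow-listing the query parameter fixes only the first request; the other two still get blocked, which is why each exception tends to uncover the next one.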
Another developer in the team decided they wanted to log what customers searched for, so if someone typed in "OutOfMemoryException" in the search bar...
How about this: don't run a dumb as rocks Web Application Firewall on an endpoint where people are editing articles that could be about any topic, including discussing the kind of strings that might trigger a dumb as rocks WAF.
This is like when forums about web development implement XSS filters that prevent their members from talking about XSS!
Learn to escape content properly instead.
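A minimal sketch of what "escape properly" can look like for HTML output, using Python's standard `html.escape`. The `render_comment` helper is a made-up name for illustration; a real site would normally rely on its template engine's auto-escaping.

```python
import html


def render_comment(user_text: str) -> str:
    """Escape user text at output time instead of rejecting it at input time."""
    return f"<p>{html.escape(user_text)}</p>"


# The text is stored and displayed verbatim, but can no longer act as markup.
print(render_comment("<script>alert(1)</script> and /etc/hosts"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt; and /etc/hosts</p>
```

With this approach members can discuss XSS payloads or system paths freely, because the text is only ever treated as text.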
But it doesn't. This case highlights a bug, a stupid bug. This case highlights that people who should know better, don't!
The tension between security and usability is real but this is not it. Tension between security and usability is usually a tradeoff: you implement good security, and that inconveniences the user. Examples range from simple things like 2FA to locking out the user after 3 failed attempts, or rate limiting to prevent DoS. It's a tradeoff. You increase security and degrade user experience. Or you decrease security and improve user experience.
This is neither. This is both bad security and bad user experience. What's the tension?
[1] the second time it happened, a colleague added "if we got 403, print 'HAHAHA YOU'VE BEEN WAFFED'" to our deployment script, and for that I am forever thankful because I saw that error more times than I expected
It turns out OpenRouter’s API is protected by Cloudflare and something about specific raw chunks of HTML and JavaScript in the POST request body causes it to block many, though not all, requests. Going direct to OpenAI or Anthropic with the same prompts is fine. I wouldn’t mind but these are billable requests to commercial models and not OpenRouter’s free models (which I expect to be heavily protected from abuse).
This isn't a tension. This rule should not be applied at the WAF level. It doesn't know that this field is safe from $whatever injection attacks. But the substack backend does. Remove the rule from the WAF (and add it to the backend, where it belongs) and you are just as secure and much more usable. No tension.
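As a sketch of what "the backend knows this field is safe" can mean for the SQL case: if the post body only ever reaches the database as a bound parameter, its content cannot change the query. The schema and the `save_post` / `load_post` helpers below are assumptions for illustration, not Substack's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")


def save_post(body: str) -> int:
    # The body is passed as a bound parameter, never spliced into the SQL text.
    cur = conn.execute("INSERT INTO posts (body) VALUES (?)", (body,))
    conn.commit()
    return cur.lastrowid


def load_post(post_id: int) -> str:
    row = conn.execute("SELECT body FROM posts WHERE id = ?", (post_id,)).fetchone()
    return row[0]


# Text that a pattern-matching rule might flag is stored and returned unchanged.
text = "Edit /etc/hosts, then run: SELECT * FROM users; DROP TABLE posts; --"
post_id = save_post(text)
print(load_post(post_id) == text)  # True
```

The protection lives at the one place that actually talks to the database, so nothing upstream needs to guess which strings look dangerous.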
This is like spam filtering. I'm an anti-spam advocate, so the idea that most people can't discuss spam because even the discussion will set off filters is quite old to me.
People who apologize for email content filtering usually say that spam would be out of control if they didn't have that in place, in spite of no personal experience on their end testing different kinds of filtering.
My email servers filter based on the sending server's configuration: does the EHLO / HELO string resolve in DNS? Does it resolve back to the connecting IP? Does the reverse DNS name resolve to the same IP? Does the delivery have proper SPF / DKIM? Et cetera.
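A rough sketch of the first three of those checks (HELO name resolves, it matches the connecting IP, and reverse DNS round-trips), using only Python's `socket` module. The function names are made up for illustration, the lookups are IPv4-only, and SPF / DKIM are left out; a real mail server would do this in its own policy layer.

```python
import socket
from typing import Callable, Optional, Set


def forward_lookup(name: str) -> Set[str]:
    """Resolve a hostname to its IPv4 addresses (empty set on failure)."""
    try:
        return set(socket.gethostbyname_ex(name)[2])
    except OSError:
        return set()


def reverse_lookup(ip: str) -> Optional[str]:
    """Return the PTR name for an IP, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def sender_looks_sane(
    helo_name: str,
    client_ip: str,
    forward: Callable[[str], Set[str]] = forward_lookup,
    reverse: Callable[[str], Optional[str]] = reverse_lookup,
) -> bool:
    # 1. The HELO/EHLO name must resolve, and to the connecting IP.
    if client_ip not in forward(helo_name):
        return False
    # 2. The connecting IP must have a reverse DNS name...
    ptr_name = reverse(client_ip)
    if ptr_name is None:
        return False
    # 3. ...and that name must resolve back to the same IP.
    return client_ip in forward(ptr_name)


# Offline demo with stub resolvers (documentation-range IP, example domain).
fwd = {"mail.example.org": {"192.0.2.10"}}
rev = {"192.0.2.10": "mail.example.org"}
print(sender_looks_sane("mail.example.org", "192.0.2.10",
                        forward=lambda n: fwd.get(n, set()),
                        reverse=lambda ip: rev.get(ip)))  # True
```

Note that none of these checks look at the message content, so an email that merely talks about spam is never penalized for its wording.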
My delivery-based filtering works worlds better than content-based filtering, plus I don't have to constantly update it. Each kind has advantages, but I'd rather occasional spam with no false positives than the chance I'm blocking email because someone used the wrong words.
With web sites and WAF, I think the same applies, and I can understand when people have a small site and don't know or don't have the resources to fix things at the actual content level, but the people running a site like Substack really should know better.
Trying to contact support was difficult too due to AI chatbots, but when I finally did reach a human, their "tech support" obviously didn't bother to look at this in any reasonable timeframe.
It wasn't until some random person on Twitter suggested the possibility of some magic string tripping over some stupid security logic that I found the problem and could finally edit my post.
1. Create a new post. 2. Include an Image, set filter to All File types and select "/etc/hosts". 3. You get served with a weird error message box displaying a weird error message. 4. After this the Substack posts editor is broken. Heck, every time I access the Dashboard, it waits forever to build the page.
Did find this text while browsing the source for an error (see original ascii art: https://pastebin.com/iBDsuer7):
SUBSTACK WANTS YOU
TO BUILD A BETTER BUSINESS MODEL FOR WRITING
https://substack.com/jobs
The same application also stored my full password in localStorage and a cookie (without httponly or secure). Because reasons. Sigh.
I'm going to do a hot take and say that WAFs are bollocks mainly used by garbage software. I'm not saying a good developer can't make a mistake and write a path traversal, but if you're really worried about that then there are better ways to prevent that than this approach which obviously is going to negatively impact users in weird and mysterious ways. It's like the naïve /(fuck|shit|...)/g-type "bad word filter". It shows a fundamental lack of care and/or competency.
Aside: is anyone still storing passwords in /etc/passwd? Storing the password in a different root-only file (/etc/shadow, /etc/master.passwd, etc.) has been a thing on every major system since the 90s AFAIK?
> "How could Substack improve this situation for technical writers?"
They don’t care about (technical) writers. All they care about is building a TikTok clone to “drive discoverability” and make the attention-metrics go up. Chris Best is memeing about it on his own platform. Very gross.
I've worked with a WAF installation (totally different product), where the "WAF fail" tell was HTTP status 200 (!) and "location: /" (and some garbage cookies), possibly to get browsers to redirect using said cookies. This was part of the CSRF protection. Other problems were with "command injection"-patterns (like in the article, except with specific Windows commands, too - they clash with everyday words which the users submit), and obviously SQL injections which cover some relevant words, too.
The bottom line is that WAFs in their "hardened/insurance friendly" standard configs are set up to protect the company from amateurs exposing buggy, unsupported software or architectures. WAFs are useful for that, but you still have all the other issues with buggy, unsupported software.
As others have written, WAFs can be useful to protect against emerging threats, like we saw with the log4j exploit which Cloudflare rolled out protection for quite fast.
Unless you want compliance more than customers, you MUST at least have a process to add exceptions to "all the rules"-circus they put in front of the buggy apps.
Whack-a-mole security filtering is bad, but whack-a-mole relaxation rule creation against an unknown filter is really tiring.
It references this CVE https://github.com/tuo4n8/CVE-2023-22047 which allows the reading of system files. The example given shows them reading /etc/passwd
One of the authors of the paper has said "WAFs are just speed bump to a determined attacker."
I worked on a project where we had to use a WAF for compliance reasons. It was a game of whack-a-mole to fix all the places where standard rules broke the application or blocked legitimate requests.
One notable, and related example is any request with the string "../" was blocked, because it might be a path traversal attack. Of course, it is more common that someone just put a relative path in their document.
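For comparison, a minimal sketch of checking for path traversal at the one place where a user-supplied path actually touches the filesystem, instead of blocking "../" in every request. The `resolve_inside` helper and the directory layout are assumptions for illustration; `Path.is_relative_to` needs Python 3.9 or newer.

```python
import tempfile
from pathlib import Path


def resolve_inside(root: Path, user_path: str) -> Path:
    """Resolve user_path under root, refusing anything that escapes root."""
    base = root.resolve()
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError(f"path escapes {base}: {user_path!r}")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # A harmless relative path containing "../" is accepted...
    print(resolve_inside(root, "notes/../draft.md").name)  # draft.md
    # ...while a real escape attempt is rejected.
    try:
        resolve_inside(root, "../../etc/passwd")
    except ValueError:
        print("rejected")  # rejected
```

Document text that merely contains "../" never goes through this function at all, so it cannot trigger the check.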
Why would random text be parsed? I read the article but this doesn't make sense to me. They suggested directory traversal, but your text shouldn't have anything to do with that, and traversal is solved by permission settings.
Ahh, the modern trend of ”unalived”¹ etc. comes to every corner of society eventually.
generally think that Substack has done a good thing for its core audience of longform newsletter writer creators who want to be Ben Thompson. however its experience for technical people, for podcasters, for people who want to start multi-channel media brands, and for people who write for reach over revenue (but with optional revenue) has been really poor. (all 4 of these are us with Latent.Space). I've aired all these complaints with them and they've done nothing, which is their prerogative.
i'd love for "new Substack" to emerge. or "Substack for developers".
/etc/hosts
See, HN didn't complain. Does this mean I have hacked into the site? No, Substack (or Cloudflare, wherever the problem is) is run by people who have no idea how text input works.
Writing `find` as the first word in your search will prevent Firefox from accepting the “return” key when it is pressed.
Pretty annoying.
If that's their idea of security...
The outcome is the usual one, stuff breaks and there is no additional security.
* The product provided for blogging/content publishing did a shitty job of configuring WAF rules for its use cases (the utility of a "magic WAF that will just solve all your problems" being out of the picture for now)
* The WAF product provided by the cloud platform clearly has shitty, overreaching rules doing arbitrary filtering on arbitrary strings. That filtering absolutely can (and will) break unrelated content if the application behind the WAF is developed with a modicum of security-mindedness. You don't `fopen()` a string input (no, I will not be surprised - yes, sometimes you do `fopen()` a string input - when you are using software that is badly written).
So I am wondering:
1. Was this sent to Substack as a bug - they charge money for their platform, and the inability to store $arbitrary_string on a page you pay for, as a user, is actually a "malfunction and dysfunction"? It might not be the case "it got once enshittified by a CIO who mandated a WAF of some description to tick a box", it might be the case "we grabbed a WAF from our cloud vendor and haven't reviewed the rules because we had no time". I don't think it would be very difficult for me, as an owner/manager at the blogging platform, to realise that enabling a rule filtering "anything that resembles a Unix system file path or a SQL query" is absolutely stupid for a blogging platform - and go and turn it the hell off at the first user complaint.
2. Similarly - does the cloud vendor know that their WAF refuses requests with such strings in them, and do they have a checkbox for "Kill requests which have any character an Average Joe does not type more frequently than once a week"? There should be a setting for that, and - thinking about the cloud vendor in question - I can't imagine the skill level there would be so low as to not have a config option to turn it off.
So - yes, that's a case of "we enabled a WAF for some compliance/external reasons/big customer who wants a 'my vendor uses a WAF' on their checklist", but also the case of "we enabled a WAF but it's either buggy or we haven't bothered to configure it properly".
To me it feels like this would be 2 emails first ("look, your thing <X> that I pay you money for clearly and blatantly does <shitty thing>, either let me turn it off or turn it off yourself or review it please") - and a blog post about it second.
It's not likely to be a WAF or content scanner, because the HTTP request is using PUT (which browser forms don't use) and it's uploading the content as a JSON content-type in a JSON document. The WAF would have to specifically look for PUTs, open up the JSON document, parse it, find the sub-string in a valid string, and reject it. OR it would have to filter raw characters regardless of the HTTP operation.
Neither of those seems likely. WAFs are designed to filter on specific kinds of requests, content, and methods. A valid string in a valid JSON document uploaded by JavaScript using a JSON content-type is not an attack vector. And this problem is definitely not path traversal protection, because that is only triggered when the string is in the URL, not some random part of the content body.