Writing "/etc/hosts" breaks the Substack editor

544 points by scalewithlee

by matt_heimer

12 subcomments

The people configuring WAF rules at CDNs tend to do a poor job understanding sites and services that discuss technical content. It's not just Cloudflare, Akamai has the same problem.
If your site discusses databases then turning on the default SQL injection attack prevention rules will break your site. And there is another ruleset for file inclusion where things like /etc/hosts and /etc/passwd get blocked.
I disagree with other posts here, it is partially a balance between security and usability. You never know what service was implemented with possible security exploits and being able to throw every WAF rule on top of your service does keep it more secure. It's just that those same rulesets are super annoying when you have a securely implemented service which needs to discuss technical concepts.
Fine tuning the rules is time consuming. You often have to just completely turn off the ruleset because when you try to keep the ruleset on and allow the use-case there are a ton of changes you need to get implemented (if it's even possible). Page won't load because /etc/hosts was in a query param? Okay, now that you've fixed that, all the XHR included resources won't load because /etc/hosts is included in the referrer. Now that that's fixed things still won't work because some random JS analytics lib put the URL visited in a cookie, etc, etc... There is a temptation to just turn the rules off.
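To make the cascade above concrete, here is a minimal sketch of why one exception is rarely enough. The substring list and the request layout are made up for illustration and are not taken from any real WAF ruleset; the point is only that the same string shows up in several parts of one request.

```python
# Hypothetical, simplified stand-in for a "file inclusion" rule.
# Real rulesets are much larger and are not reproduced here.
BLOCKED_SUBSTRINGS = ("/etc/hosts", "/etc/passwd")


def blocked_parts(request: dict[str, str]) -> list[str]:
    """Return the names of the request parts that contain a blocked substring."""
    return [
        part
        for part, value in request.items()
        if any(s in value for s in BLOCKED_SUBSTRINGS)
    ]


# One page view: the same path leaks into the query string, the Referer
# header of follow-up requests, and a cookie set by an analytics script.
request = {
    "query": "q=how+to+edit+/etc/hosts",
    "referer": "https://example.com/search?q=/etc/hosts",
    "cookie": "last_page=/search?q=/etc/hosts",
    "body": "",
}

print(blocked_parts(request))
```

Allowing the string in the query parameter alone would still leave the other two parts tripping the rule, which is the whack-a-mole described above.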

by netsharc

2 subcomments

Reminds me of an anecdote about an e-commerce platform: someone coded a leaky webshop, so their workaround was to watch if the string "OutOfMemoryException" shows up in the logs, and then restart the app.
Another developer in the team decided they wanted to log what customers searched for, so if someone typed in "OutOfMemoryException" in the search bar...
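A tiny sketch of how that anecdote plays out, assuming the watchdog did a plain substring match on log lines (the function names and log format here are invented for illustration):

```python
def should_restart(log_line: str) -> bool:
    """Watchdog rule: restart the app whenever this string appears in the log."""
    return "OutOfMemoryException" in log_line


def log_search(query: str) -> str:
    """Later feature: write the customer's search terms to the same log."""
    return f"INFO customer searched for: {query}"


# An ordinary search is harmless, but the magic string now restarts the shop.
print(should_restart(log_search("red shoes")))
print(should_restart(log_search("OutOfMemoryException")))
```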

by Y_Y

5 subcomments

Does it block `/etc//hosts` or `/etc/./hosts`? This is a ridiculous kind of whack-a-mole that's doomed to failure. The people who wrote these should realize that hackers are smarter and more determined than they are and you should only rely on proven security, like not executing untrusted input.
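A quick sketch of that point, assuming the filter is a literal substring match (whether the real rule behaves this way is not known from the thread). Both variants slip past the literal match, yet they normalize to the same path:

```python
import posixpath


def naive_filter_blocks(text: str) -> bool:
    """Assumed filter: block only the literal string /etc/hosts."""
    return "/etc/hosts" in text


for variant in ["/etc/hosts", "/etc//hosts", "/etc/./hosts"]:
    print(
        variant,
        "blocked:", naive_filter_blocks(variant),
        "normalizes to:", posixpath.normpath(variant),
    )
```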

by simonw

2 subcomments

"How could Substack improve this situation for technical writers?"
How about this: don't run a dumb as rocks Web Application Firewall on an endpoint where people are editing articles that could be about any topic, including discussing the kind of strings that might trigger a dumb as rocks WAF.
This is like when forums about web development implement XSS filters that prevent their members from talking about XSS!
Learn to escape content properly instead.
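For illustration, a minimal sketch of output escaping using Python's standard `html` module. The `render_comment` helper and the surrounding markup are hypothetical; the idea is that the text is stored untouched and escaped only when it is placed into HTML.

```python
import html


def render_comment(raw_text: str) -> str:
    """Escape user text at output time so it displays as text, not markup."""
    return f"<p>{html.escape(raw_text)}</p>"


post = 'Try <script>alert("xss")</script> or edit /etc/hosts'
print(render_comment(post))
```

The article can then discuss XSS payloads or system paths freely, because nothing in the stored text is ever interpreted as markup.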

by blenderob

1 subcomments

> This case highlights an interesting tension in web security: the balance between protection and usability.
But it doesn't. This case highlights a bug, a stupid bug. This case highlights that people who should know better, don't!
The tension between security and usability is real but this is not it. Tension between security and usability is usually a tradeoff: you implement good security, and that inconveniences the user, from simple things like 2FA to locking out the user after 3 failed attempts, or rate limiting to prevent DoS. It's a tradeoff. You increase security and degrade user experience. Or you decrease security to improve user experience.
This is neither. This is both bad security and bad user experience. What's the tension?

by SonOfLilit

2 subcomments

After having been bitten once (was teaching a competitive programming team, half the class got a blank page when submitting solutions, after an hour of debugging I narrowed it down to a few C++ types and keywords that cause 403 if they appear in the code, all of which happen to have meaning in Javascript), and again (working for a bank, we had an API that you're supposed to submit a python file to, and most python files would result in 403 but short ones wouldn't... a few hours of debugging and I narrowed it down to a keyword that sometimes appears in the code) and then again a few months later (same thing, new cloud environment, few hours burned on debugging[1]), I had the solution to his problem in mind _immediately_ when I saw the words "network error".
[1] the second time it happened, a colleague added `if we got 403, print "HAHAHA YOU'VE BEEN WAFFED"` to our deployment script, and for that I am forever thankful because I saw that error more times than I expected

by pimanrules

1 subcomments

We faced a similar issue in our application. Our internal Red Team was publishing data with XSS and other injection attack attempts. The attacks themselves didn't work, but the presence of these entries caused our internal admin page to stop loading because our corporate firewall was blocking the network requests with those payloads in them. So an unsuccessful XSS attack became an effective DoS attack instead.

by mrgoldenbrown

3 subcomments

Everything old is new again :) We used to call this the Scunthorpe problem.
https://en.m.wikipedia.org/wiki/Scunthorpe_problem

by mike-cardwell

0 subcomment

Just rot13 any request data using javascript before posting, and rot13 it again on the server side. Problem solved. (jk)

by petercooper

1 subcomments

I ran into a similar issue with OpenRouter last night. OpenRouter is a “switchboard” style service that provides a single endpoint from which you can use many different LLMs. It’s great, but last night I started to try using it to see what models are good at processing raw HTML in various ways.
It turns out OpenRouter’s API is protected by Cloudflare and something about specific raw chunks of HTML and JavaScript in the POST request body causes it to block many, though not all, requests. Going direct to OpenAI or Anthropic with the same prompts is fine. I wouldn’t mind but these are billable requests to commercial models and not OpenRouter’s free models (which I expect to be heavily protected from abuse).

by stefs

0 subcomment

this feels like blocking terms like "null" or "select" just because you failed to properly parameterize your SQL queries.
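For comparison, a small sketch of parameterized queries with Python's built-in `sqlite3` module. The table and helper names are made up; with placeholders the driver treats the text purely as data, so words like "null" or "select" need no blocking at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")


def save_post(conn: sqlite3.Connection, body: str) -> int:
    """Insert the body via a placeholder; it is never spliced into the SQL text."""
    cur = conn.execute("INSERT INTO posts (body) VALUES (?)", (body,))
    return cur.lastrowid


def load_post(conn: sqlite3.Connection, post_id: int) -> str:
    row = conn.execute("SELECT body FROM posts WHERE id = ?", (post_id,)).fetchone()
    return row[0]


tricky = "null'; DROP TABLE posts; -- select * from users"
post_id = save_post(conn, tricky)
print(load_post(conn, post_id) == tricky)
```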

by robertlagrant

2 subcomments

> This case highlights an interesting tension in web security: the balance between protection and usability.
This isn't a tension. This rule should not be applied at the WAF level. It doesn't know that this field is safe from $whatever injection attacks. But the substack backend does. Remove the rule from the WAF (and add it to the backend, where it belongs) and you are just as secure and much more usable. No tension.

by johnklos

1 subcomments

Content filtering should be highly context dependent. If the WAF is detached from what it's supposed to filter, this happens. If the WAF doesn't have the ability to discern between command and content contexts, then the filtering shouldn't be done via WAF.
This is like spam filtering. I'm an anti-spam advocate, so the idea that most people can't discuss spam because even the discussion will set off filters is quite old to me.
People who apologize for email content filtering usually say that spam would be out of control if they didn't have that in place, in spite of no personal experience on their end testing different kinds of filtering.
My email servers filter based on the sending server's configuration: does the EHLO / HELO string resolve in DNS? Does it resolve back to the connecting IP? Does the reverse DNS name resolve to the same IP? Does the delivery have proper SPF / DKIM? Et cetera.
My delivery-based filtering works worlds better than content-based filtering, plus I don't have to constantly update it. Each kind has advantages, but I'd rather occasional spam with no false positives than the chance I'm blocking email because someone used the wrong words.
With web sites and WAF, I think the same applies, and I can understand when people have a small site and don't know or don't have the resources to fix things at the actual content level, but the people running a site like Substack really should know better.
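The DNS part of the delivery-based checks listed above can be sketched roughly as follows. This is only an illustration of the idea, not the commenter's actual configuration: the resolver callables are injected so the example runs without network access, the hostnames and addresses are documentation placeholders, and SPF / DKIM are left out.

```python
from typing import Callable, Iterable, Optional


def passes_delivery_checks(
    connecting_ip: str,
    helo_name: str,
    resolve_name: Callable[[str], Iterable[str]],    # hostname -> IP addresses
    reverse_lookup: Callable[[str], Optional[str]],  # IP -> PTR hostname, if any
) -> bool:
    """Check the sending server's setup; the message content is never inspected."""
    # 1. The HELO/EHLO name must resolve, and resolve to the connecting IP.
    if connecting_ip not in set(resolve_name(helo_name)):
        return False
    # 2. The connecting IP must have a reverse DNS name...
    ptr_name = reverse_lookup(connecting_ip)
    if ptr_name is None:
        return False
    # 3. ...and that name must resolve back to the same IP.
    return connecting_ip in set(resolve_name(ptr_name))


# Fake DNS data standing in for real lookups.
forward = {"mail.example.org": ["192.0.2.10"], "spoof.example.net": ["198.51.100.7"]}
reverse = {"192.0.2.10": "mail.example.org"}


def resolve(name: str) -> list[str]:
    return forward.get(name, [])


print(passes_delivery_checks("192.0.2.10", "mail.example.org", resolve, reverse.get))
print(passes_delivery_checks("192.0.2.10", "spoof.example.net", resolve, reverse.get))
```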

by josephcsible

0 subcomment

WAFs were created by people who read https://thedailywtf.com/articles/Injection_Rejection and didn't realize that TDWTF isn't a collection of best practices.

by jmmv

0 subcomment

I encountered this a while ago and it was incredibly frustrating. The "Network error" prevented me for months from updating a post I had written, because I couldn't figure out why my edits (which extended the length, which I assumed was the problem) couldn't get through.
Trying to contact support was difficult too due to AI chatbots, but when I finally did reach a human, their "tech support" obviously didn't bother to look at this in any reasonable timeframe.
It wasn't until some random person on Twitter suggested the possibility of some magic string tripping over some stupid security logic that I found the problem and could finally edit my post.

by vintermann

0 subcomment

That reminds me of issues I once had with Microsoft's boneheaded WAF. We had base64 encoded data in a cookie, and whenever certain particular characters were produced next to each other in the data - I think the most common was "--" - the WAF would tilt and stop the "attempted SQL injection attack". So every so often someone would get an illegal login cookie and just get locked out of the system until they deleted it or it expired. Took a while to find out what went wrong, and even longer to figure out how to remove the more boneheaded rules from the WAF.
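A rough sketch of how that can happen, assuming the cookie used the URL-safe base64 alphabet (standard base64 has no "-" character, so this is a guess about the setup) and assuming the rule simply looked for the SQL comment marker:

```python
import base64


def looks_like_sql_comment(value: str) -> bool:
    """Assumed WAF rule: flag anything containing the SQL comment marker."""
    return "--" in value


# Three bytes whose 6-bit groups are all 62, the value that the URL-safe
# alphabet writes as "-" and the standard alphabet writes as "+".
payload = bytes([0xFB, 0xEF, 0xBE])

standard = base64.b64encode(payload).decode("ascii")
url_safe = base64.urlsafe_b64encode(payload).decode("ascii")

print(standard, looks_like_sql_comment(standard))
print(url_safe, looks_like_sql_comment(url_safe))
```

With effectively random data such as session tokens, a pair of adjacent "-" characters turns up every so often, which would match the occasional lockouts described above.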

by dvorack101

1 subcomments

Indeed a severe case of paranoia?
1. Create a new post.
2. Include an Image, set filter to All File types and select "/etc/hosts".
3. You get served with a weird error message box displaying a weird error message.
4. After this the Substack posts editor is broken. Heck, every time I access the Dashboard, it waits forever to build the page.
Did find this text while browsing the source for an error (see original ascii art: https://pastebin.com/iBDsuer7):
SUBSTACK WANTS YOU
TO BUILD A BETTER BUSINESS MODEL FOR WRITING
```
                   https://substack.com/jobs
```

by arp242

2 subcomments

Few years ago I had an application that allowed me to set any password, but then gave mysterious errors when I tried to use that password to login. Took me a bit to figure out what was going on, but their WAF blocked my "hacking attempt" of using a ' in the password.
The same application also stored my full password in localStorage and a cookie (without httponly or secure). Because reasons. Sigh.
I'm going to do a hot take and say that WAFs are bollocks mainly used by garbage software. I'm not saying a good developer can't make a mistake and write a path traversal, but if you're really worried about that then there are better ways to prevent that than this approach which obviously is going to negatively impact users in weird and mysterious ways. It's like the naïve /(fuck|shit|...)/g-type "bad word filter". It shows a fundamental lack of care and/or competency.
Aside: is anyone still storing passwords in /etc/passwd? Storing the password in a different root-only file (/etc/shadow, /etc/master.passwd, etc.) has been a thing on every major system since the 90s AFAIK?

by donatj

0 subcomment

We briefly had a WAF forced upon us and it caused so many problems like this we were able to turn it off, for now. I'm sure it'll be back.

by Osiris

0 subcomment

I understand applying path filters in URLs and search strings, but I find it odd that they would apply the same rules to request body content, especially content encoded as valid JSON, and especially for a BLOG platform where the content would be anything.

by nickagliano

1 subcomments

As a card carrying Substack hater, I’m not surprised.
> "How could Substack improve this situation for technical writers?"
They don’t care about (technical) writers. All they care about is building a TikTok clone to “drive discoverability” and make the attention-metrics go up. Chris Best is memeing about it on his own platform. Very gross.

by halffullbrain

1 subcomments

At least, in this case, the WAF in question had the decency to return 403.
I've worked with a WAF installation (totally different product), where the "WAF fail" tell was HTTP status 200 (!) and "location: /" (and some garbage cookies), possibly to get browsers to redirect using said cookies. This was part of the CSRF protection. Other problems were with "command injection"-patterns (like in the article, except with specific Windows commands, too - they clash with everyday words which the users submit), and obviously SQL injections which cover some relevant words, too.
The bottom line is that WAFs in their "hardened/insurance friendly" standard configs are set up to protect the company from amateurs exposing buggy, unsupported software or architectures. WAFs are useful for that, but you still have all the other issues with buggy, unsupported software.
As others have written, WAFs can be useful to protect against emerging threats, like we saw with the log4j exploit which Cloudflare rolled out protection for quite fast.
Unless you want compliance more than customers, you MUST at least have a process to add exceptions to "all the rules"-circus they put in front of the buggy apps.
Whack-a-mole security filtering is bad, but whack-a-mole relaxation rule creation against an unknown filter is really tiring.

by eniac111

0 subcomment

https://en.wikipedia.org/wiki/Bush_hid_the_facts

by Null-Set

1 subcomments

This looks like it was caused by this update https://developers.cloudflare.com/waf/change-log/2025-04-22/ rule 100741.
It references this CVE https://github.com/tuo4n8/CVE-2023-22047 which allows the reading of system files. The example given shows them reading /etc/passwd

by jkrems

0 subcomment

Could this be trivially solved client-side by the editor if it just encoded the slashes, assuming it's HTML or markdown that's stored? Replacing `/etc/hosts` with an entity-encoded form such as `&#x2F;etc&#x2F;hosts` for storage seems like an okay workaround. Potentially even doing so for anything that's added to the WAF rules automatically by syncing the rules to the editor code.
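A minimal sketch of that workaround, written in Python for illustration even though the real editor would do this in the browser. The helper names are hypothetical, and the numeric entity `&#x2F;` is one possible encoding of the slash; other characters are escaped first so the transformation can be reversed.

```python
import html


def encode_for_transport(text: str) -> str:
    """Escape &, < and > first, then hide slashes behind a numeric entity."""
    return html.escape(text, quote=False).replace("/", "&#x2F;")


def decode_after_transport(encoded: str) -> str:
    return html.unescape(encoded)


original = "Edit /etc/hosts & restart"
wire = encode_for_transport(original)

print(wire)
print(decode_after_transport(wire) == original)
```

The encoded form no longer contains the literal path, so a substring rule would not match it. That also shows how little such a rule protects.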

by wglb

1 subcomments

The problem with WAF is discussed in https://users.ece.cmu.edu/~adrian/731-sp04/readings/Ptacek-N....
One of the authors of the paper has said "WAFs are just speed bump to a determined attacker."

by 0xDEAFBEAD

0 subcomment

Weird idea: What if user content was stored and transmitted encrypted by default? Then an attacker would have to either (a) identify a plaintext which encrypts to an attack ciphertext (annoying, and also you could keep your WAF rules operational for the ciphertext, with minimal inconvenience to users) or (b) attack the system when plaintext is being handled (could still dramatically reduce attack surface).

by aidog

1 subcomments

It's something I ran into quite a few times in my career. It's a weird call to get if the client can't save their cms site, due to typing something harmless. I think worst was when there was a dropdown that I defined which had a value in the mod rules that was not allowed.

by thayne

0 subcomment

As soon as I saw the headline, I knew this was due to a WAF.
I worked on a project where we had to use a WAF for compliance reasons. It was a game of whack-a-mole to fix all the places where standard rules broke the application or blocked legitimate requests.
One notable, and related example is any request with the string "../" was blocked, because it might be a path traversal attack. Of course, it is more common that someone just put a relative path in their document.
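For contrast, a sketch of checking for traversal at the one place where a string is actually used as a path, instead of blocking "../" everywhere. The base directory is hypothetical, and this lexical check does not account for symlinks.

```python
import posixpath

UPLOAD_ROOT = "/srv/uploads"  # hypothetical base directory


def is_inside_root(user_path: str, root: str = UPLOAD_ROOT) -> bool:
    """Resolve the path lexically and accept it only if it stays under root."""
    resolved = posixpath.normpath(posixpath.join(root, user_path))
    return resolved == root or resolved.startswith(root + "/")


print(is_inside_root("docs/../img/logo.png"))  # relative path that stays inside
print(is_inside_root("../../etc/passwd"))      # escapes the root
print(is_inside_root("/etc/passwd"))           # absolute path replaces the root
```

Text in a document body never reaches a check like this, so a relative path mentioned in prose is simply content.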

by nicoledevillers

2 subcomments

it was a cf managed waf rule for a vulnerability that doesn't apply to us. we've disabled it.

by godelski

1 subcomments

I don't get it. Why aren't those files just protected so they have no read or write permissions? Isn't this like the standard way to do things? Put the blog in a private user space with minimal permissions.
Why would random text be parsed? I read the article but this doesn't make sense to me. They suggested directory traversal but your text shouldn't have anything to do with that and traversal is solved by permission settings

by teddyh

0 subcomment

> For now, I'll continue using workarounds like "/etc/h*sts" (with quotes) or alternative spellings when discussing system paths in my Substack posts.
Ahh, the modern trend of ”unalived”¹ etc. comes to every corner of society eventually.
1. <https://knowyourmeme.com/memes/unalive>

by swyx

1 subcomments

substack also does wonderful things like preserve weird bullet points, lack code block displays, and make it impossible to customize the landing page of your site beyond the 2 formats they give you.
generally think that Substack has done a good thing for its core audience of longform newsletter writer creators who want to be Ben Thompson. however its experience for technical people, for podcasters, for people who want to start multi-channel media brands, and for people who write for reach over revenue (but with optional revenue) has been really poor. (all 4 of these are us with Latent.Space). I've aired all these complaints with them and they've done nothing, which is their prerogative.
i'd love for "new Substack" to emerge. or "Substack for developers".

by driverdan

0 subcomment

This is a common problem with WAFs and, more specifically, Cloudflare's default rulesets. If your platform has content that is remotely technical you'll end up triggering some rules. You end up needing a test suite to confirm your real content doesn't trigger the rules and if it does you need to disable them.
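A sketch of what such a test suite could look like. The rule patterns below are invented stand-ins, since the actual managed rules are not public in this form; in practice you would replay real posts against a staging endpoint behind the WAF and look for 403s.

```python
import re

# Hypothetical approximations of managed rules, for illustration only.
HYPOTHETICAL_RULES = {
    "file-inclusion": re.compile(r"/etc/(hosts|passwd)"),
    "path-traversal": re.compile(r"\.\./"),
    "sql-comment": re.compile(r"--"),
}


def triggered_rules(content: str) -> list[str]:
    """Return the sorted names of the rules that this content would trip."""
    return sorted(
        name for name, pattern in HYPOTHETICAL_RULES.items() if pattern.search(content)
    )


# Samples of legitimate content the platform must be able to accept.
SAMPLE_POSTS = [
    "Add the entry to /etc/hosts and reload.",
    "Use a relative link like ../images/diagram.png",
    "A recipe for sourdough bread.",
]

for post in SAMPLE_POSTS:
    print(triggered_rules(post), "<-", post)
```

Any rule that fires on a sample from the legitimate corpus is a candidate for disabling or scoping.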

by paxys

5 subcomments

This isn't a "security vs usability" trade-off as the author implies. This has nothing to do with security at all.
/etc/hosts
See, HN didn't complain. Does this mean I have hacked into the site? No, Substack (or Cloudflare, wherever the problem is) is run by people who have no idea how text input works.

by iefbr14

0 subcomment

So "/etc/h*sts" is not stopped by the filters? Nice to know for the hackers :)

by mifydev

0 subcomment

It's /con/con all over again

by skybrian

0 subcomment

Did anyone try reporting this to Substack?

by righthand

2 subcomments

Similar:
Writing `find` as the first word in your search will prevent Firefox from accepting the “return” key press.
Pretty annoying.

by nottorp

0 subcomment

So everyone should start looking for vulnerabilities in the substack site?
If that's their idea of security...

by badgersnake

0 subcomment

Seems like a case of somebody installing something they couldn’t be bothered to understand to tick a box marked security.
The outcome is the usual one, stuff breaks and there is no additional security.

by lofaszvanitt

0 subcomment

Using a WAF is the strongest indicator that someone doesn't know what's happening and where or something underneath is smelly and leaking profusely.

by t1234s

0 subcomment

writing "bcc: someone@email.com" sometimes triggers WAF rules

by ChrisArchitect

0 subcomment

Just tried to post a tweet with this article title and link and got a similar error (on desktop twitter.com). Lovely.

by HenryBemis

0 subcomment

Aaaahh they are trying to prevent a Little Bobby Tables story..

by julik

0 subcomment

Ok so: there is a blogging/content publishing engine, which is somewhat of a darling of the startup scene. There is a cloud hosting company with a variety of products, which is an even dearer darling of the startup scene. Something is posted on the blogging/content publishing engine that clearly reveals that
* The product provided for blogging/content publishing did a shitty job of configuring WAF rules for its use cases (the utility of a "magic WAF that will just solve all your problems" being out of the picture for now)
* The WAF product provided by the cloud platform clearly has shitty, overreaching rules doing arbitrary filtering on arbitrary strings. That filtering absolutely can (and will) break unrelated content if the application behind the WAF is developed with a modicum of security-mindedness. You don't `fopen()` a string input (no, I will not be surprised - yes, sometimes you do `fopen()` a string input - when you are using software that is badly written).
So I am wondering:
1. Was this sent to Substack as a bug? They charge money for their platform, and the inability to store $arbitrary_string on a page you pay for, as a user, is actually a malfunction and dysfunction. It might not be the case of "it got once enshittified by a CIO who mandated a WAF of some description to tick a box", it might be the case of "we grabbed a WAF from our cloud vendor and haven't reviewed the rules because we had no time". I don't think it would be very difficult for me, as an owner/manager at the blogging platform, to realise that enabling a rule filtering "anything that resembles a Unix system file path or a SQL query" is absolutely stupid for a blogging platform - and go and turn it the hell off at the first user complaint.
2. Similarly - does the cloud vendor know that their WAF refuses requests with such strings in them, and do they have a checkbox for "Kill requests which have any character an Average Joe does not type more frequently than once a week"? There should be a setting for that, and - thinking about the cloud vendor in question - I can't imagine the skill level there would be so low as to not have a config option to turn it off.
So - yes, that's a case of "we enabled a WAF for some compliance/external reasons/big customer who wants a 'my vendor uses a WAF' on their checklist", but also the case of "we enabled a WAF but it's either buggy or we haven't bothered to configure it properly".
To me it feels like this would be 2 emails first ("look, your thing <X> that I pay you money for clearly and blatantly does <shitty thing>, either let me turn it off or turn it off yourself or review it please") - and a blog post about it second.

by 0xbadcafebee

4 subcomments

Worth noting that people here are assuming that the author's assumption is correct: that his writing /etc/hosts is causing the 403, that this is a consequence of security filtering, and that it's this combination of characters at all that's causing the failure. The only evidence he has is that he gets back a 403 Forbidden to an API request when he writes certain content. There's a thousand different things that could be triggering that 403.
It's not likely to be a WAF or content scanner, because the HTTP request is using PUT (which browser forms don't use) and it's uploading the content as a JSON content-type in a JSON document. The WAF would have to specifically look for PUTs, open up the JSON document, parse it, find the sub-string in a valid string, and reject it. OR it would have to filter raw characters regardless of the HTTP operation.
Neither of those seem likely. WAFs are designed to filter on specific kinds of requests, content, and methods. A valid string in a valid JSON document uploaded by JavaScript using a JSON content-type is not an attack vector. And this problem is definitely not path traversal protection, because that is only triggered when the string is in the URL, not some random part of the content body.
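For what the "filter raw characters regardless of the HTTP operation" option would mean in practice, here is a small sketch. It assumes a filter that scans the raw body bytes without parsing them; it says nothing about how any particular WAF is actually implemented.

```python
import json


def raw_body_matches(body: bytes, needle: bytes = b"/etc/hosts") -> bool:
    """Assumed filter: scan the raw request body, ignoring method and content type."""
    return needle in body


document = {"title": "DNS tips", "body": "Add a line to /etc/hosts"}

# Default serialization leaves slashes as they are, so the substring is
# visible in the raw JSON without any parsing.
plain = json.dumps(document).encode("utf-8")

# JSON also permits "\/" as an escape for "/"; this variant decodes to the
# same document but no longer contains the literal substring.
escaped = plain.replace(b"/", b"\\/")

print(raw_body_matches(plain), raw_body_matches(escaped))
print(json.loads(escaped) == document)
```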