Hacker News

Google de-indexed Bear Blog and I don't know why

432 points by nafnlj

by firefoxd

7 subcomments

Traffic to my blog plummeted this year, and you can never be entirely sure why. But here are two culprits I identified.
1. AI Overview: my page impressions were high, my ranking was high, but click-through took a dive. People read the generated text and move along without ever clicking.
2. You are now a spammer. Around August, traffic took a second plunge. In my logs, I noticed weird queries on my search page. Basically, people were searching for crypto and scammy websites on my blog. Odd, but it's not like they were finding anything. It turns out their search query was displayed as an h1 on the page and crawled by Google. I was basically displaying spam.
I don't have much control over AI Overview, because disabling it means I don't appear in search at all. But for the spam, I could do something. I added a robots noindex to the search page. A week later, both impressions and clicks recovered.
Edit: Adding a write-up I did a couple of weeks ago: https://idiallo.com/blog/how-i-became-a-spammer
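The fix described above can be sketched roughly as follows. This is an illustration only, not the code behind the blog: `render_search_page` is a hypothetical name, and it assumes the search page is rendered server-side in Python. It adds a robots `noindex` meta tag and escapes the query before echoing it.

```python
from html import escape


def render_search_page(query: str, results: list[str]) -> str:
    """Render a search-results page that asks crawlers not to index it."""
    safe_query = escape(query)  # neutralise any markup in the user's query
    items = "\n".join(f"<li>{escape(r)}</li>" for r in results)
    return (
        "<!doctype html>\n"
        "<html><head>\n"
        # Ask search engines not to index this page.
        '<meta name="robots" content="noindex">\n'
        f"<title>Search: {safe_query}</title>\n"
        "</head><body>\n"
        f"<h1>Results for {safe_query}</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>"
    )
```

The query still appears in the h1, so the page works the same for visitors; the meta tag only tells crawlers to keep it out of the index.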

by PrairieFire

2 subcomments

Whether or not this specific author’s blog was de-indexed or de-prioritized, the issue this surfaces is real and genuine.
The real issue at hand here is that it’s difficult to impossible to discover why, or raise an effective appeal, when one runs afoul of Google, or suspects they have.
I shudder to use this word, as I do think it’s overused in some contexts, but I think it’s the best word to use here: the issue is really that Google is a Gatekeeper.
As the search engine with the largest global market share, whether or not Google has a commercial relationship with a site is irrelevant. Google has decided to let their product become a Utility. As a Utility, Google has a responsibility to provide effective tools and effective support for situations like this. Yes it will absolutely add cost for Google. It’s a cost of doing business as a Gatekeeper, as a Utility.
My second shudder in this comment: regulation is not always the answer. Maybe it’s even rarely the answer. But I do think that when it comes to enterprises whose products intentionally or unintentionally become Gatekeepers and/or Utilities, there should be a regulated mandate that they provide an acceptable level of support and access to the marketplaces they serve. The absence of that is what allows this to persist, and it will continue until an entity with leverage over them can put them in check.

by donatj

4 subcomments

About six months ago Ahrefs recommended I remove some Unicode from the pathing on a personal project. Easy enough. Change the routing, set up permanent redirects for the old paths to the new paths.
I used to work for an SEO firm, so I have a decent idea of best practices for this sort of thing.
BAM, I went from thousands of indexed pages to about 100.
See screenshot:
https://x.com/donatj/status/1937600287826460852
It's been six months and it has never recovered. If I were a business I would be absolutely furious. As it stands, this is a tool I largely built for myself, so I'm not too bothered, but I don't know what's going on with Google being so fickle.
Updated screenshots:
https://x.com/donatj/status/1999451442739019895
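The redirect setup described above could look roughly like this. It is a sketch only: the project's actual routing is not shown in the comment, `ascii_path` and `redirect_for` are hypothetical names, and the mapping (decompose accents, drop whatever is still non-ASCII) is an assumed scheme.

```python
import unicodedata
from urllib.parse import unquote


def ascii_path(path: str) -> str:
    """Map a path containing Unicode to an ASCII-only equivalent (assumed scheme)."""
    decoded = unquote(path)
    # Decompose accented characters, then drop anything that is not ASCII.
    normalized = unicodedata.normalize("NFKD", decoded)
    return normalized.encode("ascii", "ignore").decode("ascii")


def redirect_for(path: str):
    """Return (status, location) for old Unicode paths, or None if no redirect is needed."""
    new_path = ascii_path(path)
    if new_path != unquote(path):
        return 301, new_path  # 301 = moved permanently
    return None
```

For example, `redirect_for("/caf%C3%A9/menu")` yields a 301 to `/cafe/menu`, while a path that is already ASCII yields `None`.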

by bjt12345

4 subcomments

What I find strange about Google is that there's a lot of illegal advertising on Google Maps: things like accommodation and liquor sellers that don't have permits.
However, if they do it for the statutory term, they can then successfully apply for existing-use rights.
Yet I've seen expert witnesses bring up Google pins on Maps during tribunal hearings over planning permits, and the tribunal sort of acts as if it's all legit.
I've even seen the tribunal's report publish screenshots from Google Maps as part of its judgement.

by FuturisticLover

5 subcomments

Google search results have gone to shit. I am facing some deindexing issues where Google is citing duplicate content and picking a canonical URL itself, despite there being no similar content.
Just the opening is similar, but the intent is totally different, and so is the focus keyword.
I'm not facing this issue with Bing and other search engines.

by hyruo

0 subcomment

I encountered the same problem. I also use the Bear theme, specifically Hugo Bear. Recently, my blog was de-indexed by Bing. Using `site:`, there are no links at all. My blog had been running normally for 17 years without any issues.

by graeme

1 subcomments

Entirely possible the RSS failing validation triggered some spam flag that isn't documented, because documenting anti-spam rules lets spammers work around them.
The amount of spam has increased enormously and I have no doubt there are a number of such anti-spam flags and a number of false positive casualties along the way.

by watwut

2 subcomments

I noticed Google not being able to find smaller blogs a few years ago. The sort of blogs I used to like and read a lot (the small, irregular blog of an expert in something like cryptography, sociology, etc.) kind of disappeared from search. Then they disappeared for real.
Even when I knew the exact name of the article I was looking for, Google was unable to find it. And yes, it still existed.

by quietfox

3 subcomments

I'll be honest, I read "Google de-indexed my Bear Blog" and was looking forward to discovering an interesting blog about bears.

by p0w3n3d

2 subcomments

Sounds similar to https://news.ycombinator.com/item?id=46203343 in that Google decides who survives in business and who does not

by inglor_cz

1 subcomments

At the risk of sounding crazy, I de-indexed my blog myself and rely on the mailing list (which is now approaching 5000 subscribers) + reprints in several other online media to get traffic to me. On a good day, I get 5000 hits, which is quite a lot by Czech language community standards.
Together with deleting my Facebook and Twitter accounts, this removed a lot of pressure to conform to their unclear policies. Especially around 2019-21, it was completely unclear how to escape their digital guillotine which seemed to hit various people randomly.
The deliverability problem still stands, though. You cannot be completely independent nowadays. Fortunately my domain is 9 years old.

by Aldipower

0 subcomment

Google search also favors large, well-known sites over newcomers. For sites that have a lot of competition, this is a real problem and leads to asymmetry and a chicken-and-egg problem. You are small/new, but you can't really be found, which means you can't grow enough to be found. At the same time, you are also disadvantaged because Google displays your already large competitors without any problems!

by nottorp

2 subcomments

I bet Google doesn't know why either...

by dazc

3 subcomments

Breaking News: Google de-indexes random sites all of the time and there is often no obvious reason why. They also penalize sites in a way where pages are indexed but so deep-down that no one will ever find them. Again, there is often no obvious reason.

by huksley

0 subcomment

I have the same issue with DollarDeploy and Bing (and consequently with DuckDuckGo, which uses Bing).
The primary domain cannot be found via search. Bing knows about the brand, LinkedIn, and YouTube channel, but refuses to show search results for the primary domain.
Bing's search console does not give any clue, and forcing reindexing does not help. Google search works fine.

by scosman

1 subcomments

How does one debug issues like this?
I have a page that ranks well worldwide but is completely missing in Canada. Not just poorly ranked: gone. It shows up #1 for its keyword in the US, but won't show up in Canada even when searching for precise, unique quotes.

by cabirum

1 subcomments

In 2025, is it still prohibitively expensive to run a community-supported crawler and search engine? One without Google censorship, ads, and AI.

by runjake

0 subcomment

This submission title is wrong. The title should match the post title:
"Google De-Indexed My Entire Bear Blog and I Don’t Know Why"
Bearblog.dev is not de-indexed from Google. I can pull up results fine.

by xnx

4 subcomments

The author doesn't know the cause but states "The whole affair is Google’s fault"?

by motbus3

1 subcomments

Without going into details: the company I work for has potentially millions of pages indexed. Despite new content being published every day, since around the same October dates we have been seeing a decrease in the number of indexed pages.
We have a consultant for the topic, but I am not sure how much of that conversation I could share publicly, so I will refrain from doing so.
But I think I can say that it is not only about data structure or quality. The changes in methodology applied by Google in September might be playing a stronger role than people initially thought.

by Popeyes

1 subcomments

Had the same issue. We have a massive register of regulated services, and Google was a help for people finding those names easily.
But in August, "Page is not indexed: Crawled – currently not indexed" suddenly shot up massively. We've tried all sorts of things to get them back into the index, but with no luck. It would be helpful if Google explained why they aren't indexed or have been removed. As with the blog post, every other search engine is fine.

by eikowagenknecht

0 subcomment

I’m running a very small personal website / blog as well (https://eikowagenknecht.com) where I’ve been writing about mostly home automation related things that I couldn’t see properly documented before, but also other topics sometimes. A typical page would be a migration guide for Home Assistant from a Pi 4 to Pi 5.
These niche posts had their steady stream of visitors for years, coming almost exclusively from Google. But on August 25, 2025, from one day to the next, traffic dropped by about 95 percent, and it has been that way ever since: from 1000 visitors per month to maybe 50. Nothing I tried SEO-wise could fix it.
Now I don’t need visitors for anything, but I’m kind of sad that people who were clearly finding my content useful (I got lots of thank you mails and things like that) can’t find it any more.
Nowadays, I get more traffic from Bing, Yandex.ru, DDG and Brave than from Google.

by terrycody

0 subcomment

This is a known Google soft penalty intended to contain spam content and domains. Unfortunately, it has penalized a lot of good, clean, legit blogs, and once you catch this penalty, in 99% of cases you will never recover.
It will only show 1,2,3,4,5,6,7,8,9,11, or similar numbers of results when you search site:yourdomain on Google.
I found this around 2019, and it still exists today.
A way to identify it: first post more than 20-30 articles on your blog. If, after several months, it still shows only a handful of pages when you search site:yourdomain, then your site has most likely caught this penalty.
More info: https://www.blackhatworld.com/seo/anyone-site-only-4-results...

by pentagrama

0 subcomment

> Every time I published a new post, I would go to GSC and request indexing for the post URL, and my post would be on Google search results shortly after, as expected.
I doubt this is the actual cause, but I can’t think of any other plausible explanation. One possibility is that repeatedly requesting manual indexing in GSC (Google Search Console), while the same URLs were also being discovered automatically through the sitemap, may have unintentionally triggered a spam or quality signal in Google’s indexing system.
This kind of duplicated or aggressive indexing behavior could be misinterpreted by the algorithm, even if the content itself was legitimate.

by subpixel

0 subcomment

Bearblog.dev keeps subdomains out of search indexes until it approves them, as a measure against hosting the sort of things that would get the whole system de-indexed.
My guess is that they are more successful at suppressing subdomains than at getting them indexed. After all, they are not in control of what search engines do, they can only send signals.
For reference, I have a simple community event site on bearblog.dev which has been up for months and is not in any search index.

by arjie

0 subcomment

Google is blackboxy about this and I understand why. SEO is an arms race and there's no advantage to them advertising what they use as signals of "this is a good guy". My blog (on MediaWiki) was deranked to oblivion. Exactly zero of my pages would index on Google. Some of it is that my most read content is about pregnancy and IVF, and those are sensitive subjects that Google requires some authorship credibility on. That's fair.
But there were other posts that I thought were just normal blog posts of the form that you'd expect to be all right. But none of the search engines wanted anything to do with me. I talked to a friend[0] who told me it was probably something to do with the way MediaWiki was generating certain pages and so on, and I did all the things he recommended:
* edit the sitemap
* switch from the default site.tld/index.php/Page to site.tld/fixed-slug/Page
* put in JSON-LD info on the page
* put in meta tags
The symptoms were exactly as described here. All pages crawled, zero indexed. The wiki is open to anonymous users, but there's no spam on it (I once had some for an hour before I installed RequestAccount). Finally, my buddy told me that maybe I just need to dump this CMS and use something else. I wondered if perhaps they need you to spend on their ads platform to get it to work so I ran some ads too as an experiment. Some $300 or so. Didn't change a thing.
I really wanted things to be wiki-like so I figured I'd just write and no one would find anything and that's life. But one day I was bored enough that I wrote a wiki bot that reads each recently published page and adds a meta description tag to it.
Now, to be clear, Google does delay reinstatement so that it's not obvious what 'solved' the problem (part of the arms race), but a couple of days later I was back in Google and now I get a small but steady stream of visits from there (I use self-hosted Plausible in cookie-free mode so it's just the Referer [sic] header).
Overall, I get why they're what they are. And maybe there's a bunch of spammy MediaWiki sites out there or something. But I was surprised that a completely legitimate blog would be deranked so aggressively unless a bunch of SEO measures were taken. Fascinating stuff the modern world is.
I suspect it has to do with MediaWiki because the top level of the domain was a static site and indexed right away!
0: https://news.ycombinator.com/user?id=jrhizor
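The meta description step that the wiki bot performs could be sketched like this. The bot itself is not shown in the comment, so `meta_description` is a hypothetical helper, and the 155-character cutoff is an assumed default rather than a documented limit.

```python
from html import escape


def meta_description(page_text: str, max_len: int = 155) -> str:
    """Build a <meta name="description"> tag from the start of a page's text."""
    summary = " ".join(page_text.split())  # collapse whitespace and newlines
    if len(summary) > max_len:
        # Cut at the last word boundary that fits, then mark the truncation.
        summary = summary[:max_len].rsplit(" ", 1)[0] + "..."
    return f'<meta name="description" content="{escape(summary, quote=True)}">'
```

Escaping with `quote=True` keeps double quotes in the page text from breaking the attribute value.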

by nmeofthestate

0 subcomment

A weird thing: on the Hacker News page, in Firefox mobile, all the visited links are grey, but the link to this blog post won't turn grey even when visited.

by Havoc

0 subcomment

Parts of Google are all Blackbox-y. Never know when computer says no. And if they had usable ways to contact a human they’d just tell you they don’t know either

by devil1432

0 subcomment

I remember when I was 12 years old, I used to create a lot of small fan pages, blogs and forums (with a total of 10 users if I was lucky). They were always indexed by Google with no problem. I miss those simpler times. Nowadays, the Internet is no longer democratic.

by econ

1 subcomments

I never really use it, but there is a lot in the Yahoo index that Google refuses to index.
https://search.yahoo.com/search?p=blog.james-zhan.com&fr=yfp...

by p410n3

0 subcomment

I ran into the same thing! My site still isn't indexed and I would REALLY like to not change the URL (it's a shop and the URL is printed on stuff), so redirects are my last resort.
But basically what happened: in August 2025 we finished the first working version of our shop. After some weeks I wanted to accelerate indexing, because only ~50 of our pages were indexed, so I submitted the sitemap, and everything got de-indexed within days. For the longest time I thought it was content quality, because we sell niche trading cards and the descriptions are all one-liners I made in Excel ("This is $cardname from $set for your collection or deck!"). And because it's single trading cards, we have 7000+ products that are very similar. (We did take all the product images ourselves; I thought Google would like this, but alas.)
But later we added binders and whole sets, and took a lot of care with their product data. The front page also got a massive overhaul. No shot: not one page in the index. We still get traffic from marketplaces and our older non-shop site. The shop itself lives on a subdomain (shop.myoldsite.com). The normal site also has a sitemap, but that one was submitted in 2022. I later rewrote how my sitemaps were generated and deleted the old ones in Search Console, hoping this would help. It did not. (The old sitemap was generated by the shop system and was very large. Some forums mentioned that it's better to create a chunked sitemap, so I made a script that creates lists with 1000 products at a time as well as an index for them.)
Later observations are:
- Both sitemaps I deleted in GSC are still getting crawled and are STILL THERE. You can't see them in the overview, but if you have the old links they still appear as normal.
- We eventually started submitting product data to Google Merchant Center as well. It works 100% fine and our products are getting found and bought. The clicks even still show up in Search Console!!!! So I have a shop with 0 indexed pages in GSC that gets clicks every day. WTHeck?
So like... I don't even know anymore. Maybe we also have to restart like the person in the blog did, move the shop to a new domain, and NEVER give Google a sitemap. If I really go that route I will probably delete the cronjob that creates the sitemap, in case Google finds it by itself. But also, like, what the heck? I have worked in a web agency for 5 years and created a new webpage about every 2-8 weeks, so I launched roughly 50-70 webpages and shops, and I NEVER saw that happen. Is it an AI hallucinating? Is it anti-spam gone too far? Is it a straight-up bug that they don't see? Who knows. I don't.
(Good article though and I hope maybe some other people chime in and googlers browsing HN see this stuff).
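The chunked sitemap mentioned above (lists of 1000 products plus an index) could be generated along these lines. This is a sketch, not the script from the comment: `build_sitemaps`, the file names, and the base URL are assumptions; only the 1000-per-file chunk size comes from the description.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemaps(urls, base_url, chunk_size=1000):
    """Return {filename: xml_string} with chunked sitemaps plus an index file."""
    files = {}
    index = ET.Element("sitemapindex", xmlns=NS)
    for n, start in enumerate(range(0, len(urls), chunk_size), start=1):
        urlset = ET.Element("urlset", xmlns=NS)
        for url in urls[start:start + chunk_size]:
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        name = f"sitemap-{n}.xml"
        files[name] = ET.tostring(urlset, encoding="unicode")
        # Each chunk is listed in the index by its absolute URL.
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = f"{base_url}/{name}"
    files["sitemap-index.xml"] = ET.tostring(index, encoding="unicode")
    return files
```

With 2500 product URLs this produces three chunk files and one index; only the index would then need to be submitted.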

by storus

0 subcomment

They probably don't know why it was de-indexed either, likely a bunch of unexplainable ML models flagged it.

by hackerbeat

0 subcomment

Don't take it personally. Google lost control of its algo a long time ago.

by luxuryballs

0 subcomment

It’s weird that the number one search engine in modern times is so finicky; perhaps it has just become way over-engineered and over-rigged. Just index the web. At a certain point they went from search engine to arbiter of what people can find.

by DeathArrow

0 subcomment

We depend too much on Google.

by marbu

0 subcomment

Ran into a similar problem with my blog this year. After spending some time trying to resolve it, I just gave up.
I can understand that every now and then Google changes its rules and validation procedures, so that what used to work suddenly gets removed from the index, given their fight with spam and slop. But what I'm struggling to understand is how Google's crawler and Google Search Console can be so bad that:
* Google's crawler stops fetching the sitemap all of a sudden, even though Google claims it's an important signal for the search engine
* requesting a sitemap refresh via GSC fails with an "unknown" error, which is puzzling considering that, according to my web logs, nobody tried to load it between my request and the error
* after fixing an error, the validation job gets stuck for weeks, only to fail with an unclear error
* random deindexing events happen, as explained in the post
And I don't buy the argument that this is necessary for Google to deal with spam, because Bing Webmaster Tools just works flawlessly, and they have to deal with it as well.
I don't understand how a small business deals with this kind of issue.
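Checking the web logs for sitemap fetches, as mentioned above, can be done with something like the following. It is a sketch that assumes the common "combined" access log format and a sitemap at `/sitemap.xml`; `sitemap_fetches` is a hypothetical helper, and the user-agent string it reports can be spoofed, so it is only a hint about who made the request.

```python
import re

# Matches the request, status, and user-agent parts of a "combined" format log line.
LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def sitemap_fetches(log_lines, sitemap_path="/sitemap.xml"):
    """Yield (status, user_agent) for every logged request to the sitemap."""
    for line in log_lines:
        m = LINE.search(line)
        if m and m["path"] == sitemap_path:
            yield int(m["status"]), m["agent"]
```

An empty result for the period between the refresh request and the error would match the observation that nobody tried to load the sitemap.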

by dmix

0 subcomment

I find this thing sometimes works itself out. Just submit sitemaps and the usual stuff and be careful with your HTML.

by ciferkey

0 subcomment

Adding another data point from my own experience. I had a very similar case with my own blog which I detailed here: https://blog.matthewbrunelle.com/i-dont-want-to-play-the-seo...
I've gone through everything I can find, but nothing has made a difference for months now. Would love to hear any thoughts people have that aren't the usual checklist items.
Good news is I still get some visitors through Kagi, DDG, and Bing.

by foobarkey

0 subcomment

Probably an intern (oh, it's 2025, maybe an LLM?) messed up some spaghetti part, and the async job for reindexing your site has been failing ever since, while the on-call is busy having a mojito / the alert is silenced :)

by ZiiS

0 subcomment

Bear witness that Google have bear away from bearing Bear blog.

by ErroneousBosh

0 subcomment

Why do you care if Google indexes your site or not?
I'm annoyed that mine even shows up on Google.

by FragrantRiver

0 subcomment

This is not about bears at all. Very disappointed.

by echelon

6 subcomments

We need a P2P internet.
No more Google. No more websites. A distributed swarm of ephemeral signed posts. Shared, rebroadcasted.
When you find someone like James and you like them, you follow them. Your local algorithm then prioritizes finding new content from them. You bookmark their author signature.
Like RSS but better. Fully distributed.
Your own local interest graph, but also the power of your peers' interest graphs.
Content is ephemeral but can also live forever if any nodes keep rebroadcasting it. Every post has a unique ID, so you can search for it later in the swarm or some persistent index utility.
The Internet should have become fully p2p. That would have been magical. But platforms stole the limelight just as the majority of the rest of the world got online.
If we nerds had but a few more years...

by neilv

0 subcomment

> Second, the issue couldn’t be the quality or the quantity of the content. I came across some other pretty barebones Bear blogs that don’t have much content, and looked them up on Google, and they showed up in the results just fine. An example:
Suggestion: Remember that many large companies are emergently shitty, with shitty processes, and individuals motivated to act in shitty ways.
When a company is so powerful, this might be a time to think about solidarity.
When you're feeling an unexplained injustice from them, sometimes saying "but you let X do it" could just throw X under the bus.
Whether because a fickle process or petty actor simply missed X before, or because now they have new reason to double down and also punish the others (to CYA consistency, or, if petty, to assert their power now that you've questioned it).

by qwertox

2 subcomments

When I reload the page "https://journal.james-zhan.com/google-de-indexed-my-entire-b...", I get
Request URL: https://journal.james-zhan.com/google-de-indexed-my-entire-b...
Request Method: GET
Status Code: 304 Not Modified
So maybe it's the status code? Shouldn't that page return a 200 OK?
When I go to blog.james..., I first get a 301 Moved Permanently, and then journal.james... loads, but it returns a 304 Not Modified, even if I then reload the page.
Only when I fully submit the URL again in the URL bar does it respond with a 200.
Maybe crawling also returns a 304, and Google won't index that?
Maybe prompt: "why would a 301 redirect lead to a 304 not modified instead of a 200 ok?", "would this 'break' Google's crawler?"
> When Google's crawler follows the 301 to the new URL and receives a 304, it gets no content body. The 304 response basically says "use what you cached"—but the crawler's cache might be empty or stale for that specific URL location, leaving Google with nothing to index.
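For context on the 304 seen in the browser: under standard HTTP semantics, a 304 Not Modified is only sent in answer to a conditional request, meaning one that carries a validator such as `If-None-Match` or `If-Modified-Since`. A reload from a browser that holds a cached copy sends one; a client with nothing cached does not, and gets a 200 with the full body. A minimal sketch of that server-side decision, where `respond` is a hypothetical function and the ETag comparison is simplified to an exact match:

```python
def respond(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, body) for a GET, honouring If-None-Match."""
    headers = {"ETag": current_etag}
    if request_headers.get("If-None-Match") == current_etag:
        # The client already holds this version: send no body.
        return 304, headers, b""
    # No validator (or a stale one): send the full page.
    return 200, headers, body
```

Whether Google's crawler sent a validator for this URL is not something the browser test can show; the server's own logs would be the place to check.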