Why are anime catgirls blocking my access to the Linux kernel?

tofu@lemmy.nocturnal.garden · 14 days ago

Why are anime catgirls blocking my access to the Linux kernel?

rtxn@lemmy.world · edit-2 14 days ago

The current version of Anubis was made as a quick “good enough” solution to an emergency. The article is very enthusiastic about explaining why it shouldn’t work, but completely glosses over the fact that it has worked, at least to an extent where deploying it and maybe inconveniencing some users is preferable to having the entire web server choked out by a flood of indiscriminate scraper requests.

The purpose is to reduce the flood to a manageable level, not to block every single scraper request.

poVoq@slrpnk.net · edit-2 14 days ago

And it was/is for sure the lesser evil compared to what most others did: put the site behind Cloudflare.

I feel people that complain about Anubis have never had their server overheat and shut down on an almost daily basis because of AI scrapers 🤦

mobotsar@sh.itjust.works · 14 days ago

Is there a reason other than avoiding infrastructure centralization not to put a web server behind cloudflare?

poVoq@slrpnk.net · 14 days ago

Yes, because Cloudflare routinely blocks entire IP ranges and puts people into endless captcha loops. And it snoops on all traffic and collects a lot of metadata about all your site visitors. And if you let them terminate TLS they will even analyse the passwords that people use to log into the services you run. It’s basically a huge surveillance dragnet and probably a front for the NSA.

tofu@lemmy.nocturnal.garden · edit-2 14 days ago

Yeah, I’m just wondering what’s going to follow. I just hope everything isn’t going to need to go behind an authwall.

rtxn@lemmy.world · 14 days ago

The developer is working on upgrades and better tools. https://xeiaso.net/blog/2025/avoiding-becoming-peg-dependency/

daniskarma@lemmy.dbzer0.com · 13 days ago

I still think captchas are a better solution.

In order to get past them, scrapers have to run AI inference, which also comes with compute costs. But for legitimate users, you don’t run unauthorized intensive tasks on their hardware.

poVoq@slrpnk.net · 13 days ago

They are much worse for accessibility, and they also take longer to solve and are more disruptive for the majority of users.

daniskarma@lemmy.dbzer0.com · edit-2 13 days ago

Anubis is worse for privacy, as you have to have JavaScript enabled. And it is worse for the environment, as the PoW cryptographic challenges are just a waste.

Also, reCaptcha-type captchas are not really that disruptive most of the time.

As I said, the polite thing would just be giving users the option. Anubis PoW running directly just for entering a website is one of the rudest pieces of software I’ve seen lately. They should be more polite and just give an option to the user: maybe the user could choose to solve a captcha or run the Anubis PoW, or even just have Anubis, but behind a button the user could click.

I don’t think it is good practice to run that type of software just for entering a website. If that tendency were to grow, browsers would need to adapt and straight up block that behavior, like only allowing access to some client resources after a user action.

mfed1122@discuss.tchncs.de · edit-2 13 days ago

Yeah, well-written stuff. I think Anubis will come and go. This beautifully demonstrates and, best of all, quantifies the ~~negligence~~ negligible cost to scrapers of Anubis.

It’s very interesting to try to think of what would work, even conceptually. Some sort of purely client-side captcha type of thing perhaps. I keep thinking about it in half-assed ways for minutes at a time.

Maybe something that scrambles the characters of the site according to some random “offset” of some sort, e.g. maybe randomly selecting a modulus size and an offset to cycle them, or even just a good ol’ cipher. And the “captcha” consists of a slider that adjusts the offset. You as the viewer know it’s solved when the text becomes something sensical - so there’s no need for the client code to store a readable key that could be used to auto-undo the scrambling. You could maybe even have some values of the slider randomly chosen to produce English text if the scrapers got smart enough to check for legibility (not sure how to hide which slider positions would be these red herring ones though) - which could maybe be enough to trick the scraper into picking up junk text sometimes.
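To make the slider idea concrete, here is a minimal sketch that assumes the simplest possible scrambling, a Caesar-style shift over lowercase letters. The `shift` function, the sample text and the offset of 7 are all made up for illustration; this is not a tested anti-scraping scheme.

```python
import string

ALPHABET = string.ascii_lowercase

def shift(text: str, offset: int) -> str:
    """Rotate each lowercase letter by `offset`; leave everything else alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + offset) % len(ALPHABET)])
        else:
            out.append(ch)
    return "".join(out)

# The server would pick a secret offset and send only the scrambled text.
scrambled = shift("the kernel mailing list", 7)

# Each slider position undoes a different offset; only one of them reads as English.
candidates = [shift(scrambled, -position) for position in range(len(ALPHABET))]
```

With a plain shift there are only 26 slider positions, so the red-herring positions mentioned above would have to carry most of the weight.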

Guillaume Rossolini@infosec.exchange · 14 days ago

@mfed1122 @tofu any client-side tech to avoid (some of the) bots is bound to, as its popularity grows, be either circumvented by the bot’s developers or the model behind the bot will have picked up enough to solve it

I don’t see how any of these are going to do better than a short term patch

rtxn@lemmy.world · edit-2 14 days ago

That’s the great thing about Anubis: it’s not client-side. Not entirely anyways. Similar to public key encryption schemes, it exploits the computational complexity of certain functions to solve the challenge. It can’t just say “solved, let me through” because the client has to calculate a number, based on the parameters of the challenge, that fits certain mathematical criteria, and then present it to the server. That’s the “proof of work” component.
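As a rough illustration of the "calculate a number that fits certain criteria, then present it to the server" flow, here is a minimal hash-based sketch. It assumes a SHA-256 rule where the digest must start with a given number of zero hex digits; the function names, the challenge string and the difficulty are made up, and this is not Anubis's actual code.

```python
import hashlib

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: one hash, then check for `difficulty` leading zero hex digits."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force nonces until one satisfies the rule."""
    nonce = 0
    while not verify(challenge, nonce, difficulty):
        nonce += 1
    return nonce

# Hypothetical challenge issued by the server; difficulty 3 needs about 16**3 tries on average.
nonce = solve("example-challenge", 3)
```

The asymmetry is the point: `solve` has to grind through many hashes, while `verify` costs the server a single one.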

A challenge could be something like “find the two prime factors of the semiprime 1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139”. This number is known as RSA-100. It was first factorized in 1991, which took several days of CPU time, but checking the result is trivial since it’s just integer multiplication. A similar semiprime of 260 decimal digits still hasn’t been factorized to this day. You can’t get around mathematics, no matter how advanced your AI model is.
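The "checking is trivial" half of that can be shown directly. The sketch below uses the two factors published for RSA-100; the function name is made up, and it only checks the product, not that the factors are prime.

```python
# RSA-100 as quoted above, and its two published prime factors.
N = 1522605027922533360535618378132637429718068114961380688657908494580122963258952897654000350692006139
P = 37975227936943673922808872755445627854565536638199
Q = 40094690950920881030683735292761468389214899724061

def check_factorization(n: int, p: int, q: int) -> bool:
    """Accept only a non-trivial pair of factors whose product is exactly n."""
    return 1 < p < n and 1 < q < n and p * q == n

is_valid = check_factorization(N, P, Q)
```

Finding `P` and `Q` is the expensive part; the check is one big-integer multiplication.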

Guillaume Rossolini@infosec.exchange · 14 days ago

@rtxn I don’t understand how that isn’t client side?

Anything that is client side can be, if not spoofed, then at least delegated to a sub process, and my argument stands

Passerby6497@lemmy.world · 14 days ago

Please, explain to us how you expect to spoof a math problem whose answer you have to provide to the server before proceeding.

You can math all you want on the client, but the server isn’t going to give you shit until you provide the right answer.

Guillaume Rossolini@infosec.exchange · 14 days ago

@Passerby6497 I really don’t understand the issue here

If there is a challenge to solve, then the server has provided that to the client

There is no way around this, is there?

Passerby6497@lemmy.world · 14 days ago

You’re given the challenge to solve by the server, yes. But just because the challenge is provided to you, that doesn’t mean you can fake your way through it.

You still have to calculate the answer before you can get any farther. You can’t bullshit/spoof your way through the math problem to bypass it, because your correct answer is required to proceed.

> There is no way around this, is there?

Unless the server gives you a well-known problem that you already have the answer to or that is easily calculated, or you find a vulnerability in something like Anubis to make it accept a wrong answer, not really. You’re stuck at the interstitial page with a math prompt until you solve it.

Unless I’m misunderstanding your position, I’m not sure what the disconnect is. The original question was about spoofing the challenge client side, but you can’t really spoof the answer to a complicated math problem unless there’s an issue with the server side validation.

Guillaume Rossolini@infosec.exchange · 14 days ago

@Passerby6497 my stance is that the LLM might recognize that the best way to solve the problem is to run chromium and get the answer from there, then pass it on?

dabe@lemmy.zip · 14 days ago

I’m sure you meant to sound more analytical than anything… but this really comes off as arrogant.

You make the claim that Anubis is negligent and will come and go, and then admit to only spending minutes at a time thinking of solutions yourself, which you then just sorta spout. It’s fun to think about solutions to this problem collectively, but can you honestly believe that Anubis is negligent when it’s so clearly working and when the author has been so extremely clear about their own perception of its pitfalls and hasty development? (Go read their blog, it’s a fun time.)

daniskarma@lemmy.dbzer0.com · 14 days ago

Sometimes I think: imagine if a company like Google or Facebook implemented something like Anubis, and suddenly most people’s browsers started constantly solving CPU-intensive cryptographic challenges. People would be outraged by the wasted energy. But somehow a “cool small company” does it and it’s fine.

I do not think the Anubis system is sustainable for everyone to use; it’s just too wasteful energy-wise.

Tangent5280@lemmy.world · 13 days ago

What alternatives do you propose?

daniskarma@lemmy.dbzer0.com · edit-2 13 days ago

Captcha.

It does all Anubis does. If a scraper wants to solve it automatically, it’s compute-intensive, since they have to run AI inference, but for the user it’s just a little time-consuming.

With captchas you don’t run aggressive software unauthorized on anyone’s computer.

A solution already existed. But Anubis is “trendy” and they are masters in PR within some specific circles of people who always want the latest, trendiest thing.

But good old captcha would achieve the same result as Anubis, in a more sustainable way.

Or at least give the user the option of running the challenge or not running it and leaving the page. And make it clear to the user that their hardware is going to run an intensive task. It really feels very aggressive to have a webpage run what is basically a cryptominer, unauthorized, on your computer. And for me, having a catgirl as a mascot does not excuse the rudeness of it.

katy ✨@piefed.blahaj.zone · 13 days ago

but captcha is trash whose only purpose is to train ai for google

daniskarma@lemmy.dbzer0.com · edit-2 13 days ago

What?

You don’t need to use Google’s or Cloudflare’s captcha to have a captcha.

There are open source reCaptcha-style implementations. And you can always run a classical captcha based on image recognition.

katy ✨@piefed.blahaj.zone · 13 days ago

google is like 95% of the captchas on the internet.