Does lemmy have any communities dedicated to archiving/hoarding data?
Be smart and keep it all on thumb drives.
Welcome to datahoarders.
We’ve been here for decades.
Also, follow 3-2-1, people: 3 copies of your data, on 2 different storage media, with 1 kept offsite.
“Backups”? Pray tell, fine sir and/or madam, what is that?
You know there are only two kinds of people: those who do backups and those who haven’t lost a hard drive/data yet. Also: RAID is not a backup.
Still remember the PSU blast taking out my main drive plus my backup drive in like 2001. I thought I was so good because I at least had a backup 😑. Those were the days 🤷🏻‍♀️
That sounds like an adventure!
Yeah, that was me learning that a dinky PSU is your worst enemy. I upgraded my SO’s old Duron to an Athlon for work, which drew more power…
My condolences! That said, Athlons (late ’90s?) were cool.
I downloaded wikipedia a month or two ago, I recommend it.
How big is Wikipedia?
If you don’t care about edit history and only want English, there are ZIM files with images at under 150 GiB.
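If you want to poke at one of those dumps programmatically rather than through Kiwix, here’s a minimal sketch, assuming the python-libzim bindings; the file name and article path are examples, and the path layout varies between dumps:

```python
# Minimal sketch of reading a Wikipedia ZIM dump offline, assuming the
# python-libzim bindings (pip install libzim). File name is an example.
from libzim.reader import Archive

zim = Archive("wikipedia_en_all_maxi.zim")
print(f"{zim.entry_count} entries, main page at {zim.main_entry.get_item().path}")

# Articles are addressed by path; older dumps prefix an "A/" namespace.
entry = zim.get_entry_by_path("A/Linux")
html = bytes(entry.get_item().content).decode("utf-8")
print(html[:200])
```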
Okay so where do I find some cheap hard drives? Europe if possible :-)
Look for DVRs; they have huge HDDs in them, and you can find them at thrift stores for cheap.
I kind of want that hackerman’s DIY PC that runs on 18650 cells.
I still have a copy of wikipedia from 2021 somewhere on my NAS.
This is just minor datahoarding. I do it on an extreme level.
Yeah, not gonna lie, I heard someone in a YouTube video a while back talk about how the entirety of Wikipedia takes up like 200 gigs or so, and it got me seriously considering actually making that offline backup. Shit is scary when countries like the UK are basically blocking you from having easy access to knowledge.
UKGOV haven’t started on things like Wikipedia yet. They know kids use it for school, and, blinded by ideology though they are, even they can see there’d be an enormous backlash if they blocked it any time soon.
If that’s going to happen at all, I doubt it would be before the next election. That’s whether Labour get re-elected or the Tories make an unexpected comeback. You can tell how far Labour have fallen in the eyes of their party faithful when they’ve taken a Tory-drafted policy and made it their own.
Ironically, the up-and-coming third-option fascist party have said they’re going to repeal the Online Safety Act. They have other fish to fry if they get in, and they’ll want to keep their preferred demographic(s) happy while they do it.
I assume that eventually something like the OSA would come back to “protect the children”. They love the current US President.
None of this is hopeful. Take this as more of a rant.
I’m certain that when the UK forces digital ID upon the nation, it will be a requirement for access to every website.
Every day it seems the entire West is gonna be a fascist hellhole in a decade.
Yeah, it’s surprisingly small when it’s compressed if you exclude things like images and media. It’s just text, after all. But the high level of compression requires special software to actually read without uncompressing the entire archive. There are dedicated devices you can get, which pretty much only do that. Like there are literal Wikipedia readers, where you just give it an archive file and it’ll allow you to search for and read articles.
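Kiwix is the usual reader for these, but the same lookup works programmatically too. A sketch of full-text search inside the compressed archive, again assuming the python-libzim bindings (file name and query are examples):

```python
# Sketch of searching inside a compressed ZIM archive without unpacking it,
# assuming python-libzim (pip install libzim). File name is an example.
from libzim.reader import Archive
from libzim.search import Query, Searcher

zim = Archive("wikipedia_en_all_maxi.zim")
search = Searcher(zim).search(Query().set_query("data hoarding"))
print(search.getEstimatedMatches(), "matches")
for path in search.getResults(0, 10):  # paths of the first 10 hits
    print(path)
```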
If you remove topics you’re not interested in, it can shrink even more.
Sure, but removing knowledge kind of goes against what creating a Wikipedia backup is about…
Well, I doubt I will ever need to know anything about a football player or a car.
“Fellow survivors, oh my God! What are your names?”
“I’m OJ Simpson. This is my friend Aaron Hernandez. And this is his car, Christine.”
If my experience with mashing the random article button is any indicator, you could reduce the size by 30% just by removing articles on sports players. I doubt I’ll need those
I also recommend downloading Flashpoint Archive to have Flash games and animations to stay entertained.
There is a 4 GB version and a 2.3 TB version.
Is that Flash exclusive or do they accept other games from that era?
I’m not sure, but I do think it’s just Flash.
That’s quite the range
When I downloaded it years ago it was 1.8TB. It’s crazy how big the archive is. The smaller one is just so it’s accessible to most people.
When the Arch wiki was getting DDoSed a few weeks ago, I got a local copy from the AUR that was pretty handy.
We need all repos stored offline, and documentation to troubleshoot with.
For the first, I have no idea how much space we’d need. Most Linux packages are pretty light, no? But there are A LOT of them… (a rough way to estimate is sketched below).
The second is easy. Heard someone say the entirety of Wikipedia is 200 GB, so that should be doable. Don’t forget the technical wikis too: Debian, Gentoo, Arch.
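For the package-repo half, one rough way to size it up is to sum the Size fields in a mirror’s package index. A sketch for Debian’s layout; the mirror URL and suite here are just examples:

```python
# Rough size estimate for one Debian suite: sum the Size: fields in its
# package index. Mirror URL/suite are examples; any standard mirror works.
import gzip
import urllib.request

URL = "https://deb.debian.org/debian/dists/trixie/main/binary-amd64/Packages.gz"

with urllib.request.urlopen(URL) as resp:
    index = gzip.decompress(resp.read()).decode("utf-8", errors="replace")

sizes = [int(line.split()[1]) for line in index.splitlines() if line.startswith("Size:")]
print(f"{len(sizes)} packages, {sum(sizes) / 1e9:.0f} GB of .debs (main/amd64 only)")
```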
Can’t remember who it was (b3ta? popbitch? penny-arcade?), but I recently saw a comment by someone who’s been running a website since the turn of the millennium, and they said that fully 99% of the links they posted two decades ago were no longer valid.
To really put that into perspective, remember that for a site to get linked from a popular site like that, it usually had to be something of value: something with a lot of work put into it that people found interesting or useful.
It’s truly devastating how much of the old internet has died to corporations taking over.
The official Trixie USBs fit all 28 AMD64 DVDs on a 256 GiB USB stick.
https://www.linuxcollections.com/products/debian/debianusb.htm?id=51007
You’d probably want the 512 GiB one with all the sources for a real backup in this scenario.
What’s a way to create a local repo mirror?
Or, in this post-fact era, just generate a wiki with a hallucinating AI instead.
https://github.com/XanderStrike/endless-wiki
Honestly this project looks like a lot of fun.
I saw that Wikipedia was having funding problems, what happened to Debian?
They lie. Wikipedia has plenty of money. Do not give those parasites any more.
https://en.wikipedia.org/wiki/Wikimedia_Foundation#Spending_and_fundraising_practices
I have been archiving Linux builds for the last 20 years, so I can effectively install Linux on almost any hardware made since 1998-ish.
I have been archiving Docker images to my locally hosted GitLab server for the past 3-5 years (not sure when I started, tbh). I’ve got around 100 GB of images, ranging from core OS images to full app images like Plex, ffmpeg, etc.
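For anyone wanting to do similar, the mirroring step looks roughly like this. A sketch assuming the Docker SDK for Python; the registry host and image list are hypothetical, and a GitLab registry also needs a docker login first:

```python
# Sketch: pull public images and push them into a local registry,
# assuming the Docker SDK (pip install docker). Names are hypothetical.
import docker

REGISTRY = "gitlab.local:5050/archive"  # hypothetical local registry path
IMAGES = ["alpine:3.20", "linuxserver/ffmpeg:latest"]

client = docker.from_env()
for ref in IMAGES:
    name, _, tag = ref.partition(":")
    tag = tag or "latest"
    image = client.images.pull(name, tag=tag)
    target = f"{REGISTRY}/{name}"
    image.tag(target, tag=tag)           # retag under the local registry
    client.images.push(target, tag=tag)
    print(f"mirrored {ref} -> {target}:{tag}")
```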
I have also been archiving FOSS projects into my GitLab and using pipelines to ensure they remain up to date.
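The keep-up-to-date part is the kind of thing a scheduled pipeline job can run. A sketch using plain git mirror clones; the mirror path and repo list are examples:

```python
# Sketch: maintain bare mirrors of upstream repos, suitable for a cron or
# CI schedule. Mirror root and upstream list are examples.
import subprocess
from pathlib import Path

MIRROR_ROOT = Path("/srv/git-mirrors")
UPSTREAMS = ["https://github.com/curl/curl.git"]

MIRROR_ROOT.mkdir(parents=True, exist_ok=True)
for url in UPSTREAMS:
    dest = MIRROR_ROOT / url.rstrip("/").rsplit("/", 1)[-1]
    if dest.exists():
        # Refresh all refs and prune ones deleted upstream.
        subprocess.run(["git", "-C", str(dest), "remote", "update", "--prune"], check=True)
    else:
        subprocess.run(["git", "clone", "--mirror", url, str(dest)], check=True)
```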
The only thing I lack is packages from package managers like pip, bundler, npm, yum/dnf, and apt. There’s just so much to cache that it’s nigh impossible to get everything archived.
I have even set up my own local CDN for JS imports in HTML. I use rewrite rules in nginx to redirect them to my local sources.
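Roughly the shape of that setup, not an actual working config: the CDN hostname gets pointed at the local box via a DNS override, and nginx rewrites the CDN’s paths onto the archived copies. All names and paths here are made up:

```nginx
# Sketch only: hostname and paths are examples, not a real deployment.
server {
    listen 80;
    server_name cdn.example.com;  # hypothetical CDN host, overridden in local DNS

    root /srv/local-cdn;          # archived JS lives here

    # Map the CDN's versioned paths onto the local archive layout.
    rewrite ^/npm/(.+)$ /js/$1 last;

    try_files $uri =404;          # serve the local copy or fail loudly
}
```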
my goal is to be as self-sustaining on local hosting as possible.
Everyone should have this mindset regarding their data. I always say to my friends and family: “If you like it, download it.” The internet is always changing, and that piece of media you like can be moved, deleted, or blocked at any time.
The pornhub collapse should have taught the average person that.
You’re awesome. Keep up the good work.
respectable level of hoarding 🏅