Does lemmy have any communities dedicated to archiving/hoarding data?

  • pyrflie@lemmy.dbzer0.com · 83 points · 4 days ago · edited

    Welcome to datahoarders.

    We’ve been here for decades.

    Also follow 3-2-1, people: 3 copies of your data, on 2 different storage mediums, with 1 of them offsite.
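
    Something like this is what I mean in practice. A minimal sketch, assuming rsync is installed; every path and the offsite host are placeholders:

    ```python
    #!/usr/bin/env python3
    """Toy 3-2-1 pass: the live data counts as copy 1 on medium 1 (internal
    drive); copy 2 goes to a second medium, copy 3 goes offsite over SSH.
    Every path and host below is a placeholder."""
    import subprocess

    SOURCE = "/home/me/data/"
    TARGETS = [
        "/mnt/usb-hdd/backup/",       # copy 2, on a different storage medium
        "offsite-box:/srv/backup/",   # copy 3, offsite via SSH
    ]

    for target in TARGETS:
        # -a preserves metadata, --delete mirrors removals; add --dry-run to test first
        subprocess.run(["rsync", "-a", "--delete", SOURCE, target], check=True)
    ```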

  • juipeltje@lemmy.world · 31 points · 4 days ago

    Yeah, not gonna lie, I heard someone in a YouTube video a while back say that the entirety of Wikipedia takes up something like 200 gigs, and it got me seriously considering actually making that offline backup. Shit is scary when countries like the UK are basically blocking you from having easy access to knowledge.

    • palordrolap@fedia.io · 7 points · 4 days ago

      UKGOV haven’t started on things like Wikipedia yet. They know kids use it for school and blinded by ideology though they are, even they can see there’d be an enormous backlash if they blocked it any time soon.

      If that’s going to happen at all, I doubt it would be before the next election. That’s whether Labour get re-elected or the Tories make an unexpected comeback. You can tell how far Labour have fallen in the eyes of their party faithful when they’ve taken a Tory-drafted policy and made it their own.

      Ironically, the up-and-coming third-option fascist party has said it’s going to repeal the Online Safety Act. They have other fish to fry if they get in, and they’ll want to keep their preferred demographic(s) happy while they do it.

      I assume that eventually something like the OSA would come back to “protect the children”. They love the current US President.

      None of this is hopeful. Take this as more of a rant.

    • mic_check_one_two@lemmy.dbzer0.com · 10 points · 4 days ago

      Yeah, it’s surprisingly small when it’s compressed, if you exclude things like images and media. It’s just text, after all. But the high level of compression means you need special software to actually read it without decompressing the entire archive. There are dedicated devices you can get which pretty much only do that: literal Wikipedia readers, where you just give them an archive file and they let you search for and read articles.
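
      For the curious: the dumps are usually ZIM files (the format Kiwix uses), and the libzim bindings can pull single articles out of the compressed archive without unpacking it. A rough sketch, assuming python-libzim is installed; the dump filename is a placeholder:

      ```python
      from libzim.reader import Archive
      from libzim.search import Query, Searcher

      zim = Archive("wikipedia_en_all_nopic.zim")   # placeholder dump name

      # Full-text search runs straight against the compressed archive
      searcher = Searcher(zim)
      search = searcher.search(Query().set_query("data hoarding"))
      for path in search.getResults(0, 5):          # first five matching entry paths
          entry = zim.get_entry_by_path(path)
          html = bytes(entry.get_item().content).decode("utf-8")
          print(entry.title, len(html), "bytes of article HTML")
      ```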

            • potoooooooo ☑️@lemmy.world · 1 point · 3 days ago · edited

              “Fellow survivors, oh my God! What are your names?”

              “I’m OJ Simpson. This is my friend Aaron Hernandez. And this is his car, Christine.”

          • mfed1122@discuss.tchncs.de · 3 points · 4 days ago

            If my experience with mashing the random article button is any indicator, you could reduce the size by 30% just by removing the articles on sports players. I doubt I’ll need those.

  • Retro_unlimited@lemmy.world · 34 points · 3 days ago

    I also recommend downloading Flashpoint Archive so you have Flash games and animations to stay entertained.

    There’s a 4 GB version and a 2.3 TB version.

  • bulwark@lemmy.world · 3 points · 4 days ago

    When the Arch Wiki was getting DDoSed a few weeks ago, I got a local copy from the AUR that was pretty handy.

  • mazzilius_marsti@lemmy.world · 18 points · 4 days ago

    We need all the repos stored offline, plus documentation for troubleshooting.

    For the first, I have no idea how much space we’d need. Most Linux packages are pretty light, no? But there are A LOT of them…

    The second is easy. I heard someone say the entirety of Wikipedia is about 200GB, so it should be doable. Don’t forget the technical wikis too: Debian, Gentoo, Arch.
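
    For the repos, many distro mirrors expose rsync, so a first pass could just be mirroring the one distro you actually run. A rough sketch; the mirror URL and destination are placeholders, and a full mirror of a big distro easily runs to hundreds of GB:

    ```python
    #!/usr/bin/env python3
    """Sketch: pull a local mirror of one distro's package repo over rsync.
    The mirror URL and destination path are placeholders; pick a mirror
    that actually permits rsync mirroring."""
    import subprocess

    subprocess.run([
        "rsync", "-rtlvH", "--delete-after", "--safe-links",
        "rsync://mirror.example.org/archlinux/",   # placeholder rsync mirror
        "/srv/mirror/archlinux/",
    ], check=True)
    ```

    The wikis are the easier part: Kiwix publishes ZIM dumps of Wikipedia and, I think, at least some of the technical wikis too.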

    • skisnow@lemmy.ca · 11 points · 4 days ago

      Can’t remember who it was (b3ta? popbitch? penny-arcade?), but I recently saw a comment by someone who’s been running a website since the turn of the millennium, and they said that fully 99% of the links they posted two decades ago were no longer valid.

      To really put that into perspective, remember that getting linked from a popular site like that usually meant the page had real value: something people had put a lot of work into and found interesting or useful.

  • GreenKnight23@lemmy.world · 43 up / 1 down · 3 days ago

    I have been archiving Linux builds for the last 20 years, so I can effectively install Linux on almost any hardware going back to 1998-ish.

    I have been archiving Docker images to my locally hosted GitLab server for the past 3-5 years (not sure when I started, tbh). I’ve got around 100GB of images, ranging from core OS images to full app images like Plex, ffmpeg, etc.
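
    For anyone wanting to do something similar, the gist of it looks roughly like this (image names and the registry host are just examples, and I’m glossing over registry auth):

    ```python
    #!/usr/bin/env python3
    """Rough sketch: mirror a couple of images into a self-hosted registry and
    keep a plain tarball as cold storage. Names and hosts are examples only."""
    import subprocess

    REGISTRY = "gitlab.example.lan:5050/archive"   # placeholder local registry
    IMAGES = ["alpine:3.20", "plexinc/pms-docker:latest"]

    for image in IMAGES:
        local_tag = f"{REGISTRY}/{image}"
        subprocess.run(["docker", "pull", image], check=True)
        subprocess.run(["docker", "tag", image, local_tag], check=True)
        subprocess.run(["docker", "push", local_tag], check=True)
        # Tarball fallback, restorable later with `docker load -i <file>`
        tar_name = image.replace("/", "_").replace(":", "_") + ".tar"
        subprocess.run(["docker", "save", "-o", tar_name, image], check=True)
    ```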

    I also archive FOSS projects into my GitLab and use pipelines to keep them up to date.

    The only thing I lack is packages from package managers like pip, bundler, npm, yum/dnf, and apt. There’s just so much to cache that it’s nigh impossible to get everything archived.
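
    For pip at least, partial snapshots of what you actually use are doable. Something like this (paths are examples), then install later with --no-index:

    ```python
    #!/usr/bin/env python3
    """Sketch: download wheels/sdists for a pinned requirements file into a
    local directory, for fully offline installs later. Paths are examples."""
    import subprocess

    subprocess.run([
        "pip", "download",
        "-r", "requirements.txt",    # pinned list of what I actually depend on
        "-d", "/srv/mirror/pip/",    # local package cache
    ], check=True)

    # Later, fully offline:
    #   pip install --no-index --find-links /srv/mirror/pip/ -r requirements.txt
    ```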

    I have even set up my own local CDN for JS imports in HTML, using rewrite rules in nginx to redirect them to my local sources.

    my goal is to be as self-sustaining on local hosting as possible.

    • Foster Hangdaan@lemmy.hangdaan.com · 5 points · 2 days ago · edited

      Everyone should have this mindset regarding their data. I always say to my friends and family, “If you like it, download it.” The internet is always changing, and that piece of media you like can be moved, deleted, or blocked at any time.