• Agility0971@lemmy.world
    arrow-up
    9
    ·
    1 day ago

    It feels like those tiny bars should be stacked on top of each other, with all the SPOF projects and services

  • undefined@lemmy.hogru.ch
    arrow-up
    80
    ·
    1 day ago

    I laugh every time AWS goes down. That’s what you fucking get, and don’t get me started on us-east-1 specifically.

    • CallMeAnAI@lemmy.world
      arrow-up
      18
      arrow-down
      14
      ·
      1 day ago

      It’s like all you people forgot how much shit broke before AWS. They have one major outage every few years and people lose their shit pretending they aren’t hitting the SLA or coming close.

      • naught101@lemmy.world
        arrow-up
        37
        ·
        1 day ago

        Because hosting was more diverse before, so when shit happened it took out a couple of sites, not a quarter of the internet

            • CallMeAnAI@lemmy.world
              arrow-up
              3
              arrow-down
              5
              ·
              1 day ago

              Sure that’s what I said.

              Go ahead to Rackspace, or SAP, I’m sure you’ll have a much more reliable experience. Or just run your own. I’m sure it’ll be easy peasy and super reliable.

          • AmbitiousProcess (they/them)@piefed.social
            English
            arrow-up
            18
            ·
            1 day ago

            The outage also took down people’s banks, which stopped many of them from doing things like buying groceries 💀

            I don’t think saying it’s good for us “touching grass” is a good argument here when AWS hosts such a substantial portion of all online services.

            • CallMeAnAI@lemmy.world
              arrow-up
              4
              arrow-down
              12
              ·
              edit-2
              1 day ago

              How many banks didn’t work? Which ones? You have a source? Visa and MC were good all day here in the real world on the East Coast.

              Sounds like you’re just trying to exaggerate around an edge case that frankly isn’t the end of the world even if it were common for 4 hours a year

              Why aren’t you blaming the bank for not having redundancy outside a single DC? How many banks do you know of that are out there successfully using other providers that have a higher SLO/SLA?

              • jj4211@lemmy.world
                arrow-up
                4
                ·
                1 day ago

                I’m also skeptical that any payment processing networks were impacted. I would be surprised, but less so if people couldn’t manage their accounts online, which might have a similar effect. I’m not surprised at all if grocery stores or restaurants were significantly impacted. I know a lot of the apps were broken, and I could imagine someone used to apping everything leaving their cards at home and being unable to get lunch. There might be some aggressively “modern” establishments that are kiosk-only, and I could imagine them getting downed by an AWS outage.

                outside a single DC?

                I’m told that a lot of the companies did all the right things but still got taken down because some dependent Amazon services are tethered to that single DC and only Amazon has the power to change that.

                • CallMeAnAI@lemmy.world
                  arrow-up
                  2
                  arrow-down
                  2
                  ·
                  1 day ago

                  I’ll wait for the final root cause but…

                  We mitigated most of it by swapping to secondary DNS and completely taking out anything related to AWS DNS and services in us-east-1. If they didn’t have secondary DNS and were heavily reliant on AWS internal DNS, this might be something they experienced.

              • AmbitiousProcess (they/them)@piefed.social
                English
                arrow-up
                3
                ·
                18 hours ago

                I can see why your account is marked with two red marks on PieFed for low reputation, because man do you come off confrontational.

                How many banks didn’t work? Which ones? You have a source?

                Search engines exist. Use them before acting as if I’m making shit up.

                The list of financial institutions that had issues, as far as I can tell from industry reporting and Downdetector graphs, is Navy Federal Credit Union (~15 million members), Truist (~15 million customers), Chime (~8-9 million customers), Venmo (~60 million users), Ally Bank (~10 million customers), and Lloyds Banking Group (~30 million customers).

                Assuming no overlap, that’s nearly 140 million people that lost banking and money transfer access.

                Sounds like you’re just trying to exaggerate around an edge case that frankly isn’t the end of the world even if it were common for 4 hours a year

                The outage lasted for 15 hours in some cases, due to many AWS services recovering after the outage, yet having a backlog to work through, which took many more hours. Many services also depend on AWS in a manner where AWS coming back online doesn’t instantaneously restart service. These systems are complex, and not every company that relied on them could instantly start back up the moment the main outage was resolved, let alone when many services were still marked as impacted for hours and hours later as they worked through their backlog.

                Why aren’t you blaming the bank for not having redundancy outside a single DC? How many banks do you know of that are out there successfully using other providers that have a higher SLO/SLA?

                I also blame them for not having additional redundancy. I blame both them for not having a fallback, and AWS for allowing such a major outage to happen. Shockingly, more than one party can be at fault.

          • balance8873@lemmy.myserv.one
            arrow-up
            2
            ·
            1 day ago

            Some of us have jobs. I mean I guess you have a job, but in your case losing network just means those pesky humans stop bothering you and go to a real therapist.

            • CallMeAnAI@lemmy.world
              arrow-up
              3
              arrow-down
              3
              ·
              1 day ago

              I’m a staff engineer who has been dealing with the results of SLAs before Amazon was an idea.

              God forbid I have a P0 where I have to message a bunch of non-technical directors that it’s AWS, not us. Much, much worse than having to figure out and then pull in the team that pushed whatever untested shit made its way into production on a Friday afternoon.

              Unless you’ve been responsible for a SaaS with SLAs in a B2B setting, I know more about the consequences of a provider outage than you.

              • balance8873@lemmy.myserv.one
                arrow-up
                4
                arrow-down
                2
                ·
                1 day ago

                I don’t know what you’re responding to but it doesn’t seem to be me. Either that or you forgot the username you picked for yourself in which case: whoosh

      • shalafi@lemmy.world
        English
        arrow-up
        12
        arrow-down
        1
        ·
        22 hours ago

        No shit. I was DevOps at my last company and they were all in on AWS. In those 5 years we had one major outage. There was one other case of a particular service going down, forgot which one, but it mainly screwed DevOps and the db guys.

        You’re talking to a bunch of young people who hate Bezos and by extension AWS. They have no idea what the internet was like before.

        Personally I think the cost is outrageous, rather have my own hardware mirrored in geographically distant colos, but that doesn’t mean AWS isn’t amazing.

        • wolframhydroxide@sh.itjust.works
          arrow-up
          4
          ·
          15 hours ago

          The problem is not that any outage occurred. This still happens often. Things just refuse to work sometimes. The issue is that SO MANY eggs were in ONE basket.

        • CallMeAnAI@lemmy.world
          arrow-up
          4
          arrow-down
          2
          ·
          22 hours ago

          Nah. I haven’t had a service we use miss an SLA or cost more than its SLO budget in 2 years.

          What specific services have they missed your SLA on and what incidents were they tied to? I understand that not every team has a guy on their team to monitor that stuff and bitch for credits, but I do, and AWS is one of our most reliable vendors.

          Look, the fact that AWS, Azure, and more recently Google are the only choices sucks.

          But the reality is most companies and projects don’t have the business case to justify multi-region failover, much less vendor failover. They are all built on single points of failure and will always have outages.

          Everyone just notices it more when it’s AWS. And that’s a stupid reason to base decisions off of. Visa/MC was working. Reddit and Facebook were mostly working once they started routing through their multi-cloud nodes. Maybe you couldn’t get to your bank’s web app; that’s on them using a single cloud with no way to route to alternate cloud nodes and services. And for them to double, at best, their infrastructure costs, unless they are BoA, Chase, Morgan, etc., is dumb for 99.99%, which is the SLA.

          The world isn’t ending, emergency services are working, Visa/MC failed over, I was still on Reddit and Slack most of the day. It wasn’t the end of the world.

          Anyway, I now realize I have summoned my frustrations with this entire thread and gone wildly off topic and ranted with full force at you.

          I just don’t think it’s that important when there is a major outage on AWS/Azure/Cloudflare. It was going to happen elsewhere, and you wouldn’t have an excuse to tell your PM “not my problem”; instead you’d be digging into your app for 2 hours to find out x portion of your very distributed vendor list failed and you still have a single point of failure. I’d rather be able to point to AWS, say shit is fucked for everyone, and if you want multi-cloud it’s going to cost at least 1.5x as much as we’re spending 🤷‍♂️.
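
For a rough sense of the numbers argued over in this thread (a 99.99% SLA, "4 hours a year", and the 15-hour outage), here is a minimal Python sketch. It assumes availability is measured over a single non-leap year; real SLAs differ per service and are often measured per month, so treat the output as back-of-the-envelope only. The function names are made up for this example.

```python
# Back-of-the-envelope availability arithmetic. Assumes a 365-day year;
# actual SLA terms (measurement window, exclusions) vary by provider and service.
HOURS_PER_YEAR = 365 * 24  # 8760


def downtime_budget_hours(availability_pct: float) -> float:
    """Hours of downtime per year permitted by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


def availability_pct(downtime_hours: float) -> float:
    """Yearly availability implied by a given number of hours of downtime."""
    return 100 * (1 - downtime_hours / HOURS_PER_YEAR)


if __name__ == "__main__":
    for target in (99.9, 99.99):
        minutes = downtime_budget_hours(target) * 60
        print(f"{target}% target allows about {minutes:.1f} minutes of downtime per year")
    for hours in (4, 15):
        print(f"{hours} hours down in a year is about {availability_pct(hours):.3f}% availability")
```

Under these assumptions, a 99.99% target leaves under an hour of downtime per year, so both the 4-hour and 15-hour figures quoted above would exceed it if measured against a single service over a year.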

          • undefined@lemmy.hogru.ch
            arrow-up
            2
            ·
            edit-2
            18 hours ago

            I haven’t used AWS in years. No IPv6 support in S3 in 2017 was the last straw for me. I have to deal with it at work (sometimes) and always laugh when they introduce “new” features like HTTPS records in Route53 like two years late.

            Why do you say AWS, Azure and Google are the only options? I don’t use any of those greedy companies’ platforms.

      • mitram@lemmy.pt
        arrow-up
        14
        arrow-down
        1
        ·
        1 day ago

        It’s pretty funny to argue in favour of centralised services on a decentralised platform

        • CallMeAnAI@lemmy.world
          arrow-up
          3
          arrow-down
          12
          ·
          1 day ago

          I never argued that. I provided the reality of what they did. I’m sorry the reality doesn’t align with how you think things should be.

          You think everyone trying to make money is just stupid and has ignored some super reliable and cheap hosting because they want to gobble Bezos cock? No, they solved challenging problems and made it a lot easier to stand up a reliable app.

          • balance8873@lemmy.myserv.one
            arrow-up
            7
            arrow-down
            1
            ·
            1 day ago

            Very unique take on how businesses make technical decisions. I’ve never heard anyone describe the decision-making process as logical before. Or even grounded in facts.

          • mitram@lemmy.pt
            arrow-up
            2
            ·
            22 hours ago

            I feel that you’re very jaded over this subject. I truly felt it was a funny situation. No judgement from me.

            Yes, AWS has a lot of advantages and I do believe they usually provide a reliable service, but as with all centralised services, when they go down a bunch of other stuff goes with them, and that should be avoided. Doesn’t make all the incredible engineers currently working at AWS stupid

      • Phoenixz@lemmy.ca
        arrow-up
        12
        arrow-down
        2
        ·
        edit-2
        23 hours ago

        Eeehh, you’re literally suggesting that AWS added to the general stability and dependability of the internet in general

        You have NO idea what you’re talking about

        The internet was designed to survive nuclear war (talk about being dependable) and the entire idea was (and should continue to be) that you don’t rely on a single point of failure. Traffic should automatically route around dead nodes so that everything continues to flow. Decentralisation is key.

        But of course with companies being companies and corporate doing what corporate does best (enshittify everything so that we make more monies) everything got centralized.

        • shalafi@lemmy.world
          English
          arrow-up
          5
          arrow-down
          1
          ·
          22 hours ago

          The centralization is an issue, but AWS is stable as hell. When I was first in IT, tech support, I had to explain to customers daily that, “No, your internet is fine, it’s just that particular website that’s down.”

          And the centralization wouldn’t be a thing if AWS didn’t route all IAM services through us-east-1. My Lightsail in us-west-1 was fine yesterday.

          • Phoenixz@lemmy.ca
            arrow-up
            3
            arrow-down
            1
            ·
            21 hours ago

            So your argument is that AWS centralization is good because Amazon is a good provider?

            You do understand that there are loads of providers out there that are perfectly stable, but that are not Amazon?

            I’ve never used it because I know how to manage a server, something you might want to expect from IT personnel that do development for companies, but these days let’s just ask Amazon to do it for us, we’re too lazy

            • 1984@lemmy.today
              arrow-up
              1
              arrow-down
              1
              ·
              edit-2
              18 hours ago

              Companies need a hell of a lot more than virtual machines today. I don’t use it personally either, but would I recommend a company buy their own hardware? No. I would say they should use AWS because they can afford it and it gives them access to hundreds of services. It’s rare to see technical issues.

              The value for a company is actually enormous, to have something like that at their fingertips.

              Today’s downtime will be forgotten in a few days, and it was a big one.

              • Phoenixz@lemmy.ca
                arrow-up
                2
                ·
                17 hours ago

                Who says companies need to buy their own hardware?

                We have datacenters for that, you rent the hardware one way or the other.

                I’m saying that nobody should put all their eggs in one basket because if that basket breaks, you’re all fucked.

                If you have the need for high availability then you don’t put all your servers in a single datacenter, or with a single provider

                If everyone and their mother is with one provider, you’ll first notice that said provider gets expensive pretty quick and you’ll also notice that when shit goes down that half the fucking internet follows.

                My services weren’t down, and never have been. I don’t use AWS, because I don’t need it

        • CallMeAnAI@lemmy.world
          arrow-up
          5
          arrow-down
          4
          ·
          edit-2
          23 hours ago

          Yeah, Rackspace was killing it! Sites NEVER went down, especially under dynamic load. Never.

          • Phoenixz@lemmy.ca
            arrow-up
            6
            ·
            21 hours ago

            You understand that that has nothing to do with this? So there are shitty providers out there, find a good one that is not “just amazon”

  • Phoenixz@lemmy.ca
    arrow-up
    25
    ·
    23 hours ago

    It’s so funny to see that this just keeps happening over and over and nobody seems to be learning any lessons at all

    • sulgoth@lemmy.world
      English
      arrow-up
      3
      ·
      16 hours ago

      Oh they are, slowly but surely. Amazon just took over a third of the web before anyone above the basic workers realized that one force holding that much power was a bad thing. Don’t forget, those kinds of people think monopolies are a good thing.