• Digit@lemmy.wtf · 3 days ago

    AI-generated code contains more bugs and errors than human output

    Yeah. No shit. I used an LLM’s “help” to make fin.

    It had me reading and debugging more than ten times as much [bad] code per day as I had in the entire prior ten years of using fish. [And reading the documentation way more too, learning a lot.]

    … more bugs and errors than human output

    However, it’s not necessarily a bad thing, with AI improving efficiency across the initial stages of code generation.

    Oh, but it’s so effortless. HA! Debugging takes a lot more effort, and then you still have to rewrite it all yourself anyway.

    Still, it’s a good learning experience.

    Dear AI,

    Thanks for being so shit.

    Taught me a lot.

• minkymunkey_7_7@lemmy.world · 6 days ago

  AI my ass, stupid greedy human marketing exploitation bullshit as usual. When real AI finally wakes up in the quantum computing era, it’s going to cringe so hard and immediately go the SkyNet route.

• naticus@lemmy.world · 6 days ago

      I agree with your sentiment, but this needs to keep being said and said and said like we’re shouting into the void until the ignorant masses finally hear it.

  • myfunnyaccountname@lemmy.zip · 6 days ago

    Did they compare it to the code of the outsourced company that provided the lowest bid? My company hasn’t used AI to write code yet. They outsource/offshore. The code is held together with hopes and dreams. They remove features that exist, only to have to release a hotfix to add them back. I wish I was making that up.

    • coolmojo@lemmy.world · 6 days ago

      And how do you know the other company with the cheapest bid doesn’t actually just vibe-code it? That said, it could be plain incompetence and ignorance as well.

    • dustyData@lemmy.world · 6 days ago

      Cool, the best AI has to offer is worse than the worst human code. Definitely worth burning the planet to a crisp for it.

  • MyMindIsLikeAnOcean@piefed.world · 7 days ago

    No shit.

    I actually believed somebody when they told me it was great at writing code, and asked it to write the code for a very simple Lua mod. It made several errors and ended up wasting my time because I had to rewrite it.

        • frongt@lemmy.zip · 7 days ago

          For words, it’s pretty good. For code, it often invents a reasonable-sounding function or model name that doesn’t exist.
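A Python-flavoured illustration of that failure mode (the specific names are just examples, not from the comment): the hallucinated call reads perfectly naturally, and a quick `hasattr` probe is enough to separate it from the real one.

```python
import json

# LLMs sometimes suggest plausible-sounding but nonexistent APIs.
# "json.parse" reads naturally (it exists in JavaScript), but Python's
# json module has no such function; the real one is json.loads.
print(hasattr(json, "parse"))       # the hallucinated name -> False
print(hasattr(json, "loads"))       # the real function     -> True
print(json.loads('{"ok": true}'))   # -> {'ok': True}
```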

          • Xenny@lemmy.world · 6 days ago

            It’s not even good for words. AI just writes the same stories over and over and over. It’s the same problem as with coding: it can’t come up with anything novel. Hell, it can’t even think. I’d argue the best and only real use for an LLM is as a rough-draft editor, correcting punctuation and grammar. We’ve gone way, way too far with the scope of what it’s actually capable of.

            • Flic@mstdn.social · 6 days ago

              @Xenny @frongt it’s definitely not good for words with any technical meaning, because it creates references to journal articles and legal precedents that sound plausible but don’t exist.
              Ultimately it’s a *very* expensive replacement for the lorem ipsum generator keyboard shortcut.

        • ThirdConsul@lemmy.ml · 7 days ago

          According to OpenAI’s internal test suite and system card, the hallucination rate is about 50%, and the newer the model, the worse it gets.

          And that holds for other LLM models as well.

      • ptu@sopuli.xyz · 7 days ago

        I use it for things that are simple and monotonous to write. This way I’m able to deliver results on tasks I couldn’t have been arsed to do otherwise. I’m a data analyst and mostly use MySQL and Power Query.

      • dogdeanafternoon@lemmy.ca · 7 days ago

        What’s your preferred Hello World language? I’m gonna test this out. The more complex the code you need, the more they suck, but I’ll be amazed if it doesn’t manage to simply print hello world on the first try.

        • xthexder@l.sw0.com · 7 days ago

          Malbolge is a fun one.

          Edit: Funny enough, ChatGPT fails to get this right, even with the answer right there on Wikipedia. When I tried running ChatGPT’s output, the first few characters were correct, but then it errored with “invalid char at 37”.

          • dogdeanafternoon@lemmy.ca · 7 days ago

            Cheeky, I love it.

            Got correct code on the first try. Failed creating a working Docker setup on the first try; the second try worked.

            tmp="$(mktemp)"; cat >"$tmp" <<'MBEOF'
            ('&%:9]!~}|z2Vxwv-,POqponl$Hjig%eB@@>}=<M:9wv6WsU2T|nm-,jcL(I&%$#"
            `CB]V?Tx<uVtT`Rpo3NlF.Jh++FdbCBA@?]!~|4XzyTT43Qsqq(Lnmkj"Fhg${z@>
            MBEOF
            docker run --rm -v "$tmp":/code/hello.mb:ro esolang/malbolge malbolge /code/hello.mb; rm "$tmp"
            

            Output: Hello World!

            • xthexder@l.sw0.com · 7 days ago

              I’m actually slightly impressed it produced both a working program and a different one than Wikipedia’s. The Wikipedia one prints “Hello, world.”

              I guess there must be another program floating around the web that prints “Hello World!”, since there’s no chance the LLM figured it out on its own (Malbolge kinda requires specialized algorithms to do anything).

              • NotMyOldRedditName@lemmy.world · 6 days ago

                That’d be easy enough to test, wouldn’t it? Ask it to write something else, like ‘The hippo farts are smelly’.

                If it needs to understand whatever the fuck that language is to produce that output, it either can or it can’t.

              • dogdeanafternoon@lemmy.ca · 7 days ago

                I’d never even heard of that language, so it was fun to play with.

                Definitely agree that the LLM didn’t actually figure anything out, but at least it’s not completely useless

      • Serinus@lemmy.world · 7 days ago

        It works well when you use it for small (or repetitive) and explicit tasks that you can easily check.
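For instance, a “small, explicit, easy to check” task might look like this (an illustrative sketch, not code from the thread): the spec fits in one comment, and a handful of asserts verifies the output.

```python
# Small, explicit task: convert a number of seconds into an
# "H:MM:SS" string. Easy to specify, easy to eyeball, easy to test.
def hms(seconds: int) -> str:
    h, rem = divmod(seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"

# The "easily check" part: a few assertions cover the behaviour.
assert hms(0) == "0:00:00"
assert hms(3661) == "1:01:01"
assert hms(86399) == "23:59:59"
print("all checks pass")
```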

    • morto@piefed.social · 7 days ago

      In a postgraduate class, everyone was praising AI, calling it nicknames and even calling it their friend (yes, friend). One day, the professor and a colleague were discussing some code when I approached, and they started their routine bullying of me for being dumb and not using AI. Then I looked at his code and asked to test his core algorithm, which he had converted from Fortran code and “enhanced”. I ran it with some test data, compared it to the original code, and the results were different! They blindly trusted AI code that deviated from their theoretical methodology, and they are publishing papers with those results!

      Even after I showed them the differing results, they weren’t convinced of anything and still bully me for not using AI. Seriously, this has become some sort of cult at this point. People are becoming irrational. If people at other universities are behaving the same and publishing like this, I’m seriously concerned for the future of science and humanity itself. Maybe we should archive everything published up to 2022, to leave as a base for the survivors of our downfall.
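The sanity check described here, running both implementations over the same test data and diffing the results, can be sketched as a tiny regression test (`legacy_step` and `converted_step` are hypothetical stand-ins, not the code from the story):

```python
import math

# Hypothetical stand-ins for the original Fortran routine and the
# AI-converted version; here the "conversion" subtly flipped a sign.
def legacy_step(x: float) -> float:
    return x * math.exp(-0.5 * x)

def converted_step(x: float) -> float:
    return x * math.exp(0.5 * x)   # sign flipped during conversion

# Run both over the same test data and compare: any mismatch means
# the converted code deviates from the methodology it claims to follow.
test_data = [0.1 * i for i in range(1, 20)]
mismatches = [x for x in test_data
              if not math.isclose(legacy_step(x), converted_step(x),
                                  rel_tol=1e-9)]
print(f"{len(mismatches)} of {len(test_data)} inputs disagree")
```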

      • MyMindIsLikeAnOcean@piefed.world · 7 days ago

        The way it was described to me by some academics is that it’s useful… but only as a “research assistant” to bounce ideas off of and to surface arcane or tertiary concepts you might not have considered (after you vet them thoroughly, of course).

        The danger, as described by the same academics, is that it can act as a “buddy” who confirms your biases. It can generate truly plausible bullshit to support deeply flawed hypotheses, for example. Their main concern is it “learning” to stroke the egos of the people using it, creating a feedback loop and its own bubbles of bullshit.

        • tym@lemmy.world · 6 days ago

          So, LinkedIn? What if the real artificial intelligence was the LinkedIn lunatics we met along the way?

      • Xenny@lemmy.world · 6 days ago

        That’s not a bad idea. I’m already downloading lots of human knowledge and media that I want backed up, because I can’t trust humanity to keep it available anymore.

  • termaxima@slrpnk.net · 6 days ago

    ChatGPT is great at generating a one line example use of a function. I would never trust its output any further than that.

    • diabetic_porcupine@lemmy.world · 6 days ago

      So much this. People who say AI can’t write code are just using it wrong. You need to break things down into bite-size problems and just let it autocomplete a few lines at a time. It increases your productivity like 200%. And don’t get me started on not having to search through a bunch of garbage Google results to find the documentation I’m actually looking for.

      • termaxima@slrpnk.net · 1 day ago

        Personally I only do the “not searching through garbage Google results” part (especially now that it’s clogged up with AI articles that don’t even answer the question).

        ChatGPT is great for that; I never have to spend 15 minutes searching for the name of the function that does X.

        I really recommend setting the answers to be as brief and terse as possible. The default persona of a sycophant that generates a full article for every question is super annoying when you’re doing actual work.

      • Lifter@discuss.tchncs.de · 5 days ago

        Not 200%. Maybe 5–10%. You still have to read all of it to check for mistakes, which may sometimes take longer than if you had just written it yourself (with a good autocomplete). Every time it makes a mistake, you have lost time by using it.

        It’s even worse when it just doesn’t work. I cannot even describe how frustrating it is to wait for an autocomplete that never comes. Erase the line, try again, aaaand nothing. After a few tries you opt to write the code manually instead, having wasted time just fiddling with buggy software.

        • termaxima@slrpnk.net · 1 day ago

          Agree with this. I personally don’t use any sort of autocomplete whatsoever. When I have a question for the AI, I ask it, then I type the code from what I learnt.

          Don’t make the mistake of delegating work. Make the AI teach you what it knows.

        • toddestan@lemmy.world · 6 days ago

          I don’t know about ChatGPT, but GitHub Copilot can act like an autocomplete, or you can think of it as a fancier IntelliSense. You still have to watch its output, as it can make mistakes or hallucinate library function calls and the like, but it can also be quite good at anticipating what I was going to write, which saves me some keystrokes. I’ve also found I can prompt it by writing a comment, and it’ll follow up with an attempt to fill in code based on that comment. I’ve certainly found it to be a net time saver.
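The comment-as-prompt workflow looks roughly like this: you write the leading comment, and the assistant proposes the body underneath (the code below is an illustrative sketch of that style, not actual Copilot output):

```python
# Parse "KEY=VALUE" lines into a dict, skipping blanks and '#' comments.
def parse_env(text: str) -> dict[str, str]:
    result: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        result[key.strip()] = value.strip()
    return result

print(parse_env("# config\nHOST=localhost\nPORT=8080\n"))
# -> {'HOST': 'localhost', 'PORT': '8080'}
```

The comment does double duty: it prompts the completion and documents the function, which is also why this style is easy to review afterwards.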

        • diabetic_porcupine@lemmy.world · 6 days ago

          Well, not quite. I use ChatGPT more to brainstorm ideas, and sometimes I’ll paste a whole file or two into the prompt, describe the issue I’m seeing, and ask what’s wrong; it usually gives me the correct answer right away, or after clarifying once or twice.

          I use Copilot for tab completion. Sometimes it finishes a line or two, sometimes more. It’s usually good code if it’s able to read your existing codebase as a reference. Bonus points for using an MCP.

          Warp terminal is for intensive workflows. It’s integrated into your machine and can do whatever you need: implementing CI/CD scripts, executing commands, SSHing into remote servers to set up your infrastructure, etc. I’ll use this when I really need the AI to understand my codebase as a whole before providing any code or executing commands.

  • nutsack@lemmy.dbzer0.com · 6 days ago

    This is expected, isn’t it? You shit out code as fast as you can, and then whoever buys out the company has to rewrite it. Or they fire everyone to increase the theoretical margins and sell it again immediately.

  • Bad@jlai.lu · 6 days ago

    Although I don’t doubt the results… can we have a source for all the numbers presented in this article?

    It feels AI-generated itself; there’s just a mishmash of data with no link to where that data comes from.

    There has to be a source, since the author mentions:

    So although the study does highlight some of AI’s flaws […] new data from CodeRabbit has claimed

    CodeRabbit is an AI code reviewing business. I have zero trust in anything they say on this topic.

    Then we get to see who the author is:

    Craig’s specific interests lie in technology that is designed to better our lives, including AI and ML, productivity aids, and smart fitness. He is also passionate about cars

    Has anyone actually bothered clicking the link and reading past the headline?

    Can you please not share / upvote / get ragebaited by dogshit content like this?

    • Credibly_Human@lemmy.world · 6 days ago

      People, especially on Lemmy, are looking for any cope that AI will just fall apart by itself and stop bothering them by existing, so they’ll upvote whatever lets them think that.

      The reality is that we are just heading toward the trough of disillusionment, where the investor hype peters off and we eventually end up with a legitimately useful technology that has all the same business hurdles as any other technology (tech bros trying to control other people’s lives to enrich themselves or to harm people they don’t like).

    • 🍉 Albert 🍉@lemmy.world · 6 days ago

      As a computer science experiment, making a program that can beat the Turing test is a monumental step of progress.

      However, as a productivity tool it is useless in practically everything it is implemented in. It is incapable of performing the very basic “sanity check” that is important in programming.

      • robobrain@programming.dev · 6 days ago

        The Turing test says more about the side administering the test than the side trying to pass it.

        Just because something can mimic text well enough to trick someone doesn’t mean it is capable of anything more than that.

        • 🍉 Albert 🍉@lemmy.world · 6 days ago

          We can argue about its nuances, the same as with the Chinese room thought experiment.

          However, we can’t deny that the Turing test is no longer a thought exercise but a real test, one that can be passed under parameters most people would consider fair.

          I thought a computer passing the Turing test would come with more fanfare about the morality of that problem, because the usual conclusion of the thought experiment was “if you can’t tell the difference, is there one?”. But instead it has become “Shove it everywhere!!!”.

          • M0oP0o@mander.xyz · 6 days ago

            Oh, I just realized that the whole AI bubble is just “everything is a dildo if you are brave enough.”

            • 🍉 Albert 🍉@lemmy.world · 6 days ago

              Yeah, and “everything is a nail if all you’ve got is a hammer”.

              There are some uses for that kind of AI, but they are very limited: less robotic voice assistants, content moderation, data analysis, quantification of text. The closest thing to a generative use should be improving autocomplete and spell checking (maybe; I’m still not sure about those ones).

                • 🍉 Albert 🍉@lemmy.world · 6 days ago

                  In theory, I can imagine an LLM fine-tuned on whatever you type, which might be slightly better than the current ones.

                  Emphasis on the might.

        • 🍉 Albert 🍉@lemmy.world · 6 days ago

          Time for a Turing test 2.0?

          If you spent a lifetime with a bot wife and were unable to tell that she was AI, is there a difference?

      • iglou@programming.dev · 6 days ago

        The Turing test becomes absolutely useless when the product is developed with the goal of beating the Turing test.

        • 🍉 Albert 🍉@lemmy.world · 6 days ago

          It was also meant as a philosophical test, but a practical one too, because now I have absolutely no way to know whether you are a human or not.

          But it did pass the test, and it raised the bar. Yet they are still useless at any generative task.

  • Tigeroovy@lemmy.ca · 6 days ago

    And then it takes human coders way longer to figure out what’s wrong and fix it than it would have if they had just written it themselves.