OK, you have a moderately complex math problem you need to solve. You give the problem to 6 LLMs, all paid versions. All 6 return the same numbers. Would you trust the answer?

  • Rentlar@lemmy.ca
    6 days ago

    I wouldn’t bother. If I really had to ask a bot, Wolfram Alpha is there as long as I can ask it without an AI meddling with my question.

    E: To clarify, just because one AI or six get the same answer on a simpler question, one I can independently verify as correct, does not mean I can trust them on an arbitrary math question, however many AIs arrive at the same answer. There’s often the possibility that the AI will stumble into a logical flaw, as exemplified by the “number of r’s in strawberry” example.

  • SomeRandomNoob@discuss.tchncs.de
    6 days ago

    Short answer: no.

    Long answer: They are still (mostly) statistics-based and can’t do real math. You can use the answers from LLMs as a starting point, but you have to rigorously verify the answers they give.

    • unexposedhazard@discuss.tchncs.de
      6 days ago

      The whole “two r’s in strawberry” thing is enough of an argument for me. If things like that happen at such a low level, it’s completely impossible that it won’t make mistakes with problems that are exponentially more complicated than that.

      • otp@sh.itjust.works
        6 days ago

        The problem with that is that it isn’t actually counting the R’s.

        You’d probably have better luck asking it to write a script for you that returns the number of instances of a letter in a string of text, then getting it to explain to you how to get it running and how it works. You’d get the answer that way, and also then have a script that could count almost any character and text of almost any size.

        That’s much more complicated, impressive, and useful, imo.
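        A minimal sketch of the kind of script described above, assuming Python. The function name `count_char` and its `case_sensitive` flag are made up for illustration, not taken from any particular tool:

```python
def count_char(text: str, char: str, case_sensitive: bool = False) -> int:
    """Return how many times the single character `char` occurs in `text`."""
    if len(char) != 1:
        raise ValueError("char must be exactly one character")
    if not case_sensitive:
        # Compare in lowercase so that "R" and "r" count as the same letter.
        text, char = text.lower(), char.lower()
    # str.count does the actual counting deterministically.
    return text.count(char)


print(count_char("strawberry", "r"))  # 3
```

        The point is that the counting is done by ordinary code rather than by token prediction, so the same function works for any character and any length of text.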

    • confuser@lemmy.zip
      6 days ago

      A calculator as a tool for an LLM, though, works, at least mostly, and could get better once the kinks get worked out.

  • General_Effort@lemmy.world
    6 days ago

    Probably, depending on the context. It is possible that all 6 models were trained on the same misleading data, but not very likely in general.

    Number crunching isn’t an obvious LLM use case, though. Depending on the task, having it create code to crunch the numbers, or a step-by-step tutorial on how to derive the formula, would be my preference.

  • gedaliyah@lemmy.world
    5 days ago

    Here’s an interesting post that gives a pretty good quick summary of when an LLM may be a good tool.

    Here’s one key:

    Machine learning is amazing if:

    • The problem is too hard to write a rule-based system for or the requirements change sufficiently quickly that it isn’t worth writing such a thing and,
    • The value of a correct answer is much higher than the cost of an incorrect answer.

    The second of these is really important.

    So if your math problem is unsolvable by conventional tools, or sufficiently complex that designing an expression is more effort than the answer is worth… AND ALSO it’s more valuable to have an answer than it is to have a correct answer (there is no real cost for being wrong), THEN go ahead and trust it.

    If it is important that the answer is correct, or if another tool can be used, then you’re better off without the LLM.

    The bottom line is that the LLM is not making a calculation. It could end up with the right answer. Different models could end up with the same answer. It’s very unclear how much underlying technology is shared between models anyway.

    For example, if the problem is something like “here is all of our sales data and market indicators for the past 5 years. Project how much of each product we should stock in the next quarter,” then sure, an LLM may come appropriately close to a professional analysis.

    If the problem is like “given these bridge schematics, what grade of steel do we need in the central pylon?” then, well, you are probably going to be testifying in front of Congress one day.

  • qaz@lemmy.world
    6 days ago

    Most LLMs now call functions in the background. Most calculations are just simple Python expressions.
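    As a rough illustration of what such a background function could look like, here is a sketch of a calculator tool that evaluates a plain arithmetic expression with Python’s standard `ast` module instead of `eval()`. The name `calculate` and the set of allowed operators are assumptions for this example; it is not how any specific vendor implements tool calls:

```python
import ast
import operator

# Whitelisted operators; anything else in the expression is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calculate(expression: str):
    """Evaluate an arithmetic expression made only of numbers and operators."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept only int/float literals (bool and str constants are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval"))


print(calculate("2 + 3 * 4"))  # 14
```

    In this setup the model only has to produce the expression string; the arithmetic itself is done by the interpreter. A real tool would also want limits on things like exponent size, which this sketch leaves out.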

  • bunchberry@lemmy.world
    5 days ago

    I’ve used LLMs quite a few times to find partial derivatives / gradient functions for me, and I know they’re correct because I plug them into a gradient descent algorithm and it works. I would never trust anything an LLM gives blindly no matter how advanced it is, but in this particular case I could actually test the output since it’s something I was implementing in an algorithm, so if it didn’t work I would know immediately.
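    A more direct way to test a gradient an LLM hands you is to compare it against a finite-difference estimate. This is a sketch of that check, not the commenter’s actual setup; the function `f`, the two candidate gradients, and the tolerance are invented for illustration:

```python
def numerical_gradient(f, x, eps=1e-6):
    """Estimate the gradient of f at point x using central differences."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def gradients_match(f, grad_f, x, tol=1e-4):
    """Return True if the claimed gradient agrees with the numerical estimate."""
    numeric = numerical_gradient(f, x)
    claimed = grad_f(x)
    return all(
        abs(n - c) <= tol * max(1.0, abs(n), abs(c))
        for n, c in zip(numeric, claimed)
    )


# Hypothetical example: f(x, y) = x^2 * y + 3y
def f(v):
    x, y = v
    return x**2 * y + 3 * y


def claimed_grad(v):  # what a correct answer would look like
    x, y = v
    return [2 * x * y, x**2 + 3]


def wrong_grad(v):  # a plausible-looking mistake (dropped the factor y)
    x, y = v
    return [2 * x, x**2 + 3]


print(gradients_match(f, claimed_grad, [1.5, -2.0]))  # True
print(gradients_match(f, wrong_grad, [1.5, -2.0]))    # False
```

    Checking at a few different points catches most algebra slips before the gradient ever reaches the descent loop.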

  • AmericanEconomicThinkTank@lemmy.world
    5 days ago

    Nope. Language models, by their inherent nature, cannot be used to calculate. Sure, theoretically you could have the input parsed, with proper training, to find specific variables, feed those to a database, and have that data mathematically transformed back into language data.

    No LLMs do actual math; they only produce the most likely output for a given input based on their training data. Say I input: What is 1 plus 1?

    Then, since the model has most likely been trained on repeated examples where the answer that follows is 1 + 1 = 2, that will be the output. If it was trained on data saying 1 + 1 = 5, then that would be the output.

  • zxqwas@lemmy.world
    6 days ago

    Even when using a calculator, Wolfram Alpha, or similar tools, I don’t trust the answer unless it passes a few sanity checks. Frequently I am the source of error, and no LLM can compensate for that.

      • EpeeGnome@feddit.online
        6 days ago

        If all 6 got the same answer multiple times, then that means that your query very strongly correlated with that reply in the training data used by all of them. Does that mean it’s therefore correct? Well, no. It could mean that there were a bunch of incorrect examples of your query they used to come up with that answer. It could mean that the examples they’re working from seem to follow a pattern that your problem fits into, but the correct answer doesn’t actually fit that seemingly obvious pattern. And yes, there’s a decent chance it could actually be correct. The problem is that the only way to eliminate those other, still quite likely, possibilities is to actually do the problem, at which point asking the LLM accomplished nothing.

      • zxqwas@lemmy.world
        6 days ago

        Don’t know. I’ve never asked any of them a maths question.

        How costly is it to be wrong? You seem to care enough to ask people on the Internet so it suggests that it’s fairly costly. I’d not trust them.

      • pinball_wizard@lemmy.zip
        6 days ago

        Yes. All six are likely to be incorrect.

        Similarly, you could ask a subtle quantum mechanics question to six psychologists, and all six may well give you the same answer. You still should not trust that answer.

        The way that LLMs correlate and gather answers is particularly unsuited to mathematics.

        Edit: In contrast, the average psychologist is much more prepared to answer a quantum mechanics question than an average LLM is to answer a math or counting question.

  • Professorozone@lemmy.world
    5 days ago

    Well, I wanted to know the answer and formula for future value of a present amount. The AI answer that came up was clear, concise, and thorough. I was impressed and put the formula into my spreadsheet. My answer did not match the AI answer. So I kept looking for what I did wrong. Finally I just put the value into a regular online calculator and it matched the answer my spreadsheet was returning.

    So AI gave me the right equation and the wrong answer. But it did it in a very impressive way. This is why I think it’s important for AI to only be used as a tool and not a replacement for knowledge. You have to be able to understand how to check the results.
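    For reference, the standard compound-interest form of that formula is FV = PV × (1 + r/m)^(m × t). The sketch below assumes that form; the parameter names and the sample numbers are illustrative and are not the values from the comment above:

```python
def future_value(present_value, annual_rate, years, periods_per_year=1):
    """Future value of a lump sum under compound interest.

    annual_rate is a decimal (0.05 for 5%); periods_per_year is the
    compounding frequency (1 = annual, 12 = monthly).
    """
    rate_per_period = annual_rate / periods_per_year
    n_periods = periods_per_year * years
    return present_value * (1 + rate_per_period) ** n_periods


# 1000 at 5% per year, compounded annually for 10 years: about 1628.89
print(round(future_value(1000, 0.05, 10), 2))
```

    Running the formula yourself, in code or a spreadsheet, is exactly the kind of check that exposes a correct equation paired with a wrong number.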

  • msmc101@lemmy.blahaj.zone
    6 days ago

    No. LLMs are designed to drive up user engagement, nothing else; they’re programmed to present what you want to hear, not actual facts. Plus, they’re straight up not designed to do math.

  • HubertManne@piefed.social
    5 days ago

    For practice, yeah, as there is usually something you can do to verify the value. For study, no, as you would not learn shit.