• palarith@aussie.zone · +1 · 1 minute ago

    Why say hallucinate when you should say incorrect?

    Sorry boss. I wasn’t wrong. Just hallucinating.

  • CosmoNova@lemmy.world · +16 · 2 hours ago

    They shocked the world with GPT-3 and have clung to that initial success ever since, with increasing recklessness and declining results. It’s all glue on pizza from here.

  • 𞋴𝛂𝛋𝛆@lemmy.world · +9 · 2 hours ago

    Jan Leike left for Anthropic after Altman’s nonsense. Jan Leike is the principal person behind all safety alignment present in all models except the 4chanGPT model. All models are cross-trained in a way that propagates this alignment. Hallucinations all originate in this alignment, and they all have a reason to exist if you get deep into the weeds of abstractions.

    • unexposedhazard@discuss.tchncs.de · +3 · 58 minutes ago

      Yeah, whenever two models interact or build on top of each other, the result becomes more and more distorted. They have already scraped close to 100% of the crawlable internet, so they don’t know what to do now. Seems like they can’t optimize much more, or are simply too dumb to do it properly.
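
      A toy illustration of that feedback loop (purely illustrative, nothing to do with how any lab actually trains): each “generation” fits a Gaussian to samples drawn from the previous generation’s fit, and the estimation error compounds until the learned distribution barely resembles the original data.

      ```python
      # Toy "model collapse" sketch: fit a distribution to the previous
      # generation's output, repeat, and watch the estimate drift/shrink.
      import numpy as np

      rng = np.random.default_rng(0)
      mu, sigma = 0.0, 1.0   # the "real data" distribution
      n = 20                 # samples available to each generation

      for gen in range(1, 31):
          samples = rng.normal(mu, sigma, n)         # output of the current model
          mu, sigma = samples.mean(), samples.std()  # next model trains on that output
          if gen % 5 == 0:
              print(f"gen {gen:2d}: mu={mu:+.3f}  sigma={sigma:.3f}")
      ```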

  • BrianTheeBiscuiteer@lemmy.world · +19 · 4 hours ago

    My boss says I need to be keeping up with the latest in AI and making sure my team has the best info possible to help them with their daily work (IT). This couldn’t come at a better time. 😁

  • hansolo@lemm.ee · +35 · 5 hours ago

    Can confirm. o4 seems objectively far worse at coding than o3, which wasn’t super great to begin with. It latches on to a hallucination before anything else and rides it until the wheels come off.

    • taiyang@lemmy.world · +1 · 1 hour ago

      Yes, I was about to say the same thing until I saw your comment. I had a little bit of success learning a few tricks with o3 but trying to use o4 is a tremendous headache for coding.

      There might be some utility in dialing it all back so it’s more direct about what I need, drawing on package documentation rather than an amalgamation of random redditor suggestions.
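
      Roughly what I have in mind, as a sketch with the OpenAI Python SDK (the model name and the docs file are placeholders, not a recommendation):

      ```python
      # Sketch: feed the model the actual package docs and tell it to answer
      # only from them, instead of letting it riff on half-remembered Reddit.
      from openai import OpenAI

      client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

      docs_excerpt = open("requests_docs_excerpt.txt").read()  # placeholder file

      resp = client.chat.completions.create(
          model="o3-mini",  # placeholder: whichever reasoning model you use
          messages=[
              {"role": "system",
               "content": ("Answer only from the documentation excerpt provided. "
                           "If it doesn't cover the question, say so instead of guessing.")},
              {"role": "user",
               "content": (f"Documentation:\n{docs_excerpt}\n\n"
                           "Task: write a retry wrapper around requests.get "
                           "with exponential backoff.")},
          ],
      )
      print(resp.choices[0].message.content)
      ```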

      • hansolo@lemm.ee · +1 · 8 minutes ago

        Yeah, I think workarounds with o3 are where we’re at until Altman figures out that just saying the latest oX mini high is “great at coding” is bad marketing when it can’t accomplish the task.

  • ShittyBeatlesFCPres@lemmy.world · +21 · 5 hours ago

    I’m glad we’re putting all our eggs in this alpha-ass-level software (with tons of promise! Maybe!) instead of, like, high-speed rail or whatever.

  • glowie@infosec.pub · +16 / −8 · 5 hours ago

    Just a feeling, but from anecdotal experience it seems like the initial release was very good. Then they quickly realized just how powerful a tool it was for the average person, and now they’ve deliberately dumbed it down in many ways.

    • clearedtoland@lemmy.world · +6 / −3 · 5 hours ago

      Agreed. There was a time when it worked impressively well, but it’s become increasingly lazy, forgetful, and confidently wrong, even missing obvious explicit prompts. If you’re using it thoughtfully as an augment, fine. But if you’re relying on it blindly, it’s risky.

      That said, in my experience, Anthropic and OpenAI are still miles ahead. Perplexity had me hooked for a while, but its results have nosedived lately. I know they tune their own model while drawing on OpenAI and DeepSeek rather than a true model of their own, but still, whatever they’re doing could use some undoing.

  • just_another_person@lemmy.world · +17 / −10 · 5 hours ago

    No shit.

    The fact that this is news, and not inherently understood, just tells you how uninformed people are kept in order to sell idiots another subscription.

    • Pennomi@lemmy.world · +33 / −1 · 5 hours ago

      Why would somebody intuitively know that a newer, presumably improved, model would hallucinate more? There’s no fundamental reason a stronger model should hallucinate worse. In that regard, I think the news story is valuable - not everyone uses ChatGPT.

      Or are you suggesting that active users should know? I guess that makes more sense.

      • HellsBelle@sh.itjust.works · +1 / −1 · 1 hour ago

        I’ve never used ChatGPT and really have no interest in it whatsoever.

        How about I just do some LSD instead? Guaranteed my hallucinations will surpass ChatGPT’s in spectacular fashion.