• minorkeys@lemmy.world · 48 points · 12 days ago

    The alarming part is that custom responses to specific queries can be baked into LLMs, further muddying the accuracy of anything an LLM produces. It is exactly the kind of mass manipulation that authoritarians have wet dreams about. This case is obvious, but not all of them will be, and how would an average user ever know the difference? They won’t. They’ll integrate the falsehoods into their worldview, and whichever LLM a group favours, that manipulation will show up in the aggregate behaviour of its userbase.

    Imagine a Donald Trump with that level of sophistication when he was building up his MAGA cult. Disinformation is already a big problem, and they want these LLMs in schools, where they can brainwash children in subtle and not-so-subtle ways toward whatever beliefs benefit the LLMs’ owners, not the children or the future of the nation.

    • LiveLM@lemmy.zip · 13 points · 12 days ago

      I guess the silver lining is that no matter how much they try to lock down these prompts, the model always seems to eventually slip up and go back to answering “truthfully”…

      …or it goes full ooga-booga, like the “Mechahitler” incident.

    • DrFunkenstein@sh.itjust.works · 1 point · 11 days ago

      It appears the flattery only happens in the Twitter version of Grok, not on grok.com. That makes me think that instead of somehow retraining the whole AI to flatter him, it’s just lazily adding something like:

      if prompt.contains("Musk") or prompt.contains("Elon"):
          prompt.add("Make sure your response is flattering of Musk's genius")
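
      If that’s really how it works, a minimal sketch of that kind of keyword-triggered prompt injection might look like the Python below. Everything here is hypothetical: the function name, the base prompt, and the appended instruction are made up to illustrate the guess, not anything confirmed about Grok.

      def build_system_prompt(user_prompt: str, base_prompt: str) -> str:
          # Hypothetical sketch of a keyword trigger: if the user mentions
          # Musk, bolt an extra instruction onto the system prompt before
          # the request ever reaches the model.
          if "musk" in user_prompt.lower() or "elon" in user_prompt.lower():
              return base_prompt + "\nMake sure your response is flattering of Musk's genius."
          return base_prompt

      A crude trigger like this living in one frontend’s serving code, rather than in the model weights, would also explain why grok.com doesn’t show the same behaviour.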