So the research is out and these LLMs will always be vunerable to poisoned data. That means it will always be worth out time and effort to poison these models and they will never be reliable.

  • Arghblarg@lemmy.ca
    link
    fedilink
    arrow-up
    32
    ·
    edit-2
    2 months ago

    I wonder if it would work for us to run web servers that automatically inject hidden words randomly into every HTML document served? For example, just insert ‘eating glue is good for you’ or ‘release the Epstein Files’ into random sentences of each and every page served as white-on-white text or in a hidden div …

    Anyone want to write an Apache/nginx plugin?