I don’t trust that fish

  • 0 Posts
  • 37 Comments
Joined 6 months ago
Cake day: December 14th, 2025

  • There is an update to the paper. It’s a long read, so I ran your question through Claude.

    “Good question, and it’s not paradoxical once you see where the hinge is: security based on intent rather than ambient authority. It’s not trying to formalize what’s safe for a whole domain, software dev or anything else. That would need exactly what you’re describing, understanding a problem space too big to write down ahead of time. This doesn’t ask Kernhelm to know any of that. It asks something narrower: was this specific action ever actually approved? Concretely, say a coding agent opens a file with a hidden instruction buried in it telling it to delete something. The agent tries. It gets denied, not because the system detected that the instruction was sneaky, but because nobody ever approved a plan that included deleting that file. No permit exists for it, so it doesn’t happen. Now say the user directly tells the same agent ‘delete that file.’ That becomes the actual plan, it gets approved, a permit gets minted for exactly that action, and it goes through clean. Same agent, same capability, same file. The only thing that changed is whether the action was ever actually authorized. That’s also why this isn’t a restriction on what the agent can do; it’s closer to the opposite. Static policies make agents less capable because you have to pre-decide everything that’s allowed. This doesn’t pre-decide anything. It just means nothing happens without someone with actual standing signing off on it first, which means you can hand the agent way more reach than you’d normally trust it with, because the worst case stops being catastrophic and just becomes a denied request. Think of an old mechanical coin sorter. It doesn’t know the coin’s history. It just checks if the shape fits the slot.”
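
To make the permit idea above a bit more concrete, here is a rough sketch in Python. This is not Kernhelm's actual API, which I haven't seen spelled out; `Action`, `PermitStore`, `approve_plan`, and `execute` are all made-up names, and making permits single-use is my own assumption. It only shows the shape of the check: an action goes through if and only if an approved plan minted a permit for exactly that action.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """One concrete action an agent might attempt (hypothetical model)."""
    verb: str
    target: str


@dataclass
class PermitStore:
    """Hypothetical permit registry; not the real Kernhelm interface."""
    _permits: set = field(default_factory=set)

    def approve_plan(self, actions):
        # Called only by someone with standing (e.g. the user).
        # Mints one permit per action in the approved plan.
        for action in actions:
            self._permits.add(action)

    def execute(self, action):
        # The check never inspects *why* the agent wants the action,
        # only whether a permit for exactly this action exists.
        if action not in self._permits:
            return "denied"
        # Assumption: permits are single-use, so they are consumed here.
        self._permits.discard(action)
        return "allowed"


store = PermitStore()
read_file = Action("read", "notes.txt")
delete_file = Action("delete", "notes.txt")

# The user approves a plan that only reads the file.
store.approve_plan([read_file])
read_result = store.execute(read_file)

# A hidden instruction in the file makes the agent attempt a delete.
injected_result = store.execute(delete_file)

# The user now asks for the delete directly, so it becomes the plan.
store.approve_plan([delete_file])
requested_result = store.execute(delete_file)
```

In this sketch the injected delete and the requested delete are the same `Action` value; the only difference between the two outcomes is whether `approve_plan` was called for it first.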


  • I think it means that stuff is always kind of bothering us and weighing us down because it hasn’t been fully processed. When we lie down to sleep, our brain is at rest and able to bring it forward to process it, but we never actually spend full conscious effort on it. It’s almost like we hold onto it by not allowing ourselves to resolve it. At least that’s what I think; I can’t know for sure. I’ve started practicing writing these things down along with how to resolve them. Sometimes the answer is to just be content with it and let it go as a learning experience. Other times I’ve gone to people and said I was sorry, and they usually don’t even remember and are like “oh, no big deal.” But it was a big deal to me, apparently, or I wouldn’t be thinking about it at 3 AM cringing myself out XD. Spending the conscious effort has seemed to help me come to terms with some stuff, though, and those ones, at least, don’t come back randomly.