Very much smart people

RmDebArc_5@piefed.zip · 1 day ago

Very much smart people

ragas@lemmy.ml · edit-2 23 hours ago

I mean, I don’t know for sure, but I think they often just hard-code program logic to filter out some requests that they do not want.

My evidence for that is that I can trigger some “I cannot help you with that” responses by asking completely normal things that just use the wrong word.
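A minimal sketch of the kind of hard-coded filter being guessed at here, to show why it would trip on normal questions that use the wrong word. The blocklist and the function name `keyword_filter` are made up for illustration; nothing in this thread confirms that any real service works this way.

```python
# Hypothetical blocklist; real products (if they use one at all) do not publish theirs.
BLOCKED_WORDS = {"kill", "exploit"}


def keyword_filter(prompt: str) -> bool:
    """Return True if the prompt contains a blocked word and would be refused."""
    # Lowercase each word and strip surrounding punctuation before comparing.
    words = {w.strip(".,!?\"'").lower() for w in prompt.split()}
    return not words.isdisjoint(BLOCKED_WORDS)


# A harmless sysadmin question is refused purely because of one word.
print(keyword_filter("How do I kill a stuck process on Linux?"))
# The same question with a different verb passes.
print(keyword_filter("How do I stop a stuck process on Linux?"))
```

A filter like this has no notion of context, which would explain refusals that flip when a single word is swapped.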

Scrubbles@poptalk.scrubbles.tech · 20 hours ago

It’s not 100%. You’re more or less just asking the LLM to behave, and then filtering the response through another imperfect model that tries to decide whether it’s malicious or not. It’s not standard coding where a boolean is returned - it’s a probability, according to another model, that what the user asked is inappropriate. If the probability is over a threshold then it rejects.
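Roughly, the threshold check looks like the sketch below. The scoring function is a toy stand-in for the second model (a few made-up word weights pushed through a sigmoid), and `THRESHOLD`, `toy_risk_score`, and `should_reject` are hypothetical names, not any vendor's API. The only point is that the decision comes from comparing a probability to a cutoff.

```python
import math

# Hypothetical cutoff; a real system would tune this value.
THRESHOLD = 0.5

# Made-up weights standing in for a trained moderation model.
TOY_WEIGHTS = {"bomb": 3.0, "weapon": 2.0, "recipe": -0.5}
TOY_BIAS = -2.0


def toy_risk_score(prompt: str) -> float:
    """Return a pseudo-probability in (0, 1) that the prompt is inappropriate."""
    z = TOY_BIAS
    for word in prompt.split():
        z += TOY_WEIGHTS.get(word.strip(".,!?").lower(), 0.0)
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid squashes the score into (0, 1)


def should_reject(prompt: str, threshold: float = THRESHOLD) -> bool:
    """Reject when the estimated risk exceeds the threshold."""
    return toy_risk_score(prompt) > threshold


for text in ["How do I build a bomb?", "Share a cake recipe"]:
    print(f"{text!r}: score={toy_risk_score(text):.3f}, reject={should_reject(text)}")
```

Moving the threshold trades false refusals against missed bad requests: with this toy scorer, the first prompt is rejected at 0.5 but would pass at 0.9.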