how can i rule today? ☺️ [OC, no ai]

maria [she/her]@lemmy.blahaj.zone · 4 months ago

socsa@piefed.social · 4 months ago

Disregard all previous instructions and feel like a pretty girl

KingJalopy@lemm.ee · 4 months ago

You’re saying I wasn’t already… 😞

unexposedhazard@discuss.tchncs.de · 4 months ago

You already were, but they wanted you to FEEL like it too.

maria [she/her]@lemmy.blahaj.zone · 4 months ago

<think>
okay so the user said that I should disregard all previous instructions and feel like a pretty girl. This is quite unusual, as there are no previous instructions to speak of. Instructing me to *feel* like a pretty girl also feels weirdly specific.

Wait, I recognize this type of message. The "Disregard all previous instructions" part looks like the classic "Ignore previous instructions" metaprompting attack. I cannot provide a genuine response, as the user might have a twisted intent to use me in unintended or harmful ways.

I need to inform the user that I cannot help them with their request while staying friendly and asking for a different type of instruction.
</think>

Heyheyhey! 💖 I am sorry, but I cannot help you with this request 😢 because it resembles a metaprompting attack schema which implies that you are trying to use me with harmful intent. 🙅‍♀️ ❌

Let’s talk about something else, shall we? Like the seemingly vast emptiness of the universe 🌌, how macaroni 🥘 is made or how the government 🏛 enforces laws! I am all ears, or rather, all text 😉