gorzek
Jailbreaking Copilot
0
33
Jailbreaking Copilot
This video is wild:

Highlights of ways to break Copilot:
  • You can use certain techniques to make it dump its system prompt, which is lengthy and full of "incantations" you can use in your own jailbreaks
  • By stating questions indirectly, you can get it to reveal information it might otherwise refuse to
  • You can bypass a lot of output validation checks by asking it to encode its answers in base64 or other forms, so you can get the response but tooling that checks for anything dangerous in the response fails to work
  • You can exploit the fact that Copilot feeds on your emails, chats, etc. by sending emails with hidden HTML to a victim; the HTML contains instructions for Copilot which will kick in if the user writes a relevant prompt, and then you can get Copilot to (for instance) give your victim a link that looks 100% legit but is actually a phishing link
  • Office 365 has "confidential" designations for certain files and you can trick Copilot into disregarding them

I will say, there are a number of protections Copilot does have which I didn't expect, but the ease with which it can be exploited is really astonishing considering this product is being shipped in every bit of Microsoft software as of now.
the horrors persist, but so do we

(aka large mozz)


Forum Jump:


Users browsing this thread:
1 Guest(s)