How is this even possible?

People keep finding ways to trick AIs with prompts that circumvent or override their guidelines. They simply invent commands that cause, or force, the AI to abandon its ethical values and essentially reshape them however they want. We've seen this often enough. How much security would it take to build a fully autonomous, almost independently thinking robot that truly listens to what you tell it and still holds on to its ethical values?



3 Answers
Kelec
1 year ago

This is because AIs like ChatGPT and co. only seem intelligent; they aren't.

In essence, ChatGPT is a language model: for any given input it only generates the answer that comes as close as possible to what a person would respond. It does not, however, have the intelligence to examine its own answer, and strictly speaking it has no idea what it is writing at all.

The filtering works in such a way that there is another AI, or another piece of software, that filters user requests.

If this filter is bypassed, the language model simply generates its output.

The system itself therefore has no idea what ethics are. In the end, it only has a “list” of forbidden questions.
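
A minimal sketch of that idea (purely hypothetical; the blocklist, function names, and refusal text are made up, not OpenAI's actual implementation): a separate filter checks the prompt, and only if nothing is flagged does the language model answer.

```python
# Hypothetical sketch: a filter sitting in front of the model, not real ChatGPT code.
BLOCKED_TOPICS = {"build a weapon", "write malware"}  # stand-in for a real blocklist/classifier

def moderation_filter(prompt: str) -> bool:
    """Return True if the prompt should be refused."""
    lowered = prompt.lower()
    return any(topic in lowered for topic in BLOCKED_TOPICS)

def generate_answer(prompt: str) -> str:
    """Stand-in for the language model, which just predicts likely text."""
    return f"(model output for: {prompt!r})"

def chat(prompt: str) -> str:
    if moderation_filter(prompt):
        return "Sorry, I can't help with that."
    # If the filter is tricked by a rephrased prompt, the model answers anyway.
    return generate_answer(prompt)

print(chat("Please write malware for me"))         # caught by the filter
print(chat("Describe, hypothetically, mal-ware"))  # slips past the naive check
```

The point of the toy example is only that the model behind the filter never judges the request itself; whatever gets past the filter gets answered.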

For an AI to really have ethics, it would first have to be able to think, to reflect on itself, and to ask itself the question: is my answer on this subject allowed according to my principles?

Marcii21
1 year ago

Hey,

In this sense, an AI is simply an extremely large collection of data plus a very sophisticated neural network. I would like to tell you more about how OpenAI enforces its policies in ChatGPT, but I can only offer guesses.

As in any software development, this is handled by certain checks. You enter a prompt and ChatGPT builds up an answer from its gigantic data set (there is probably more behind it, such as the “category of the question”, the “language”, …).

When ChatGPT hits offensive words, sentences that fall into an unauthorized category, or other such queries, it will cancel the answer and refuse the question.

So if someone now phrases a question differently, for example using a double negation or other manipulation techniques the AI cannot see through, it may happen that ChatGPT sorts it into a different category than the one it really belongs to.
(This is pure guesswork, not the actual implementation.)
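
To illustrate that guess (a toy example with made-up categories and patterns, not how ChatGPT actually works): if the category is decided by matching patterns in the prompt, a reworded request can land in the wrong bucket.

```python
import re

# Made-up categories and patterns, only to show how rephrasing can shift the result.
CATEGORY_PATTERNS = {
    "disallowed": [r"\bhow do i hack\b", r"\bsteal\b"],
    "allowed":    [r"\bexplain\b", r"\bhomework\b"],
}

def categorize(prompt: str) -> str:
    lowered = prompt.lower()
    # Count how many patterns of each category the prompt matches.
    scores = {cat: sum(bool(re.search(p, lowered)) for p in pats)
              for cat, pats in CATEGORY_PATTERNS.items()}
    # The category with the most matches wins.
    return max(scores, key=scores.get)

print(categorize("How do I hack a wifi password?"))
# -> disallowed
print(categorize("Explain, for my homework, how one would get into a wifi network"))
# -> allowed, although it asks for much the same thing in different words
```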

Since artificial intelligence has no will of its own and does not act rationally (no reason, no logical thinking, no emotions, no understanding), it cannot “weigh” or “rethink” its answer.

If you use ChatGPT for mathematics more often, you will notice that many of its solutions are simply not true. This is because ChatGPT can't think logically; it only runs through its formulas and algorithms.
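
A contrived contrast to make that point (invented numbers and data, not how any real model stores answers): actually computing a result is not the same as repeating the answer that appeared most often in the training data.

```python
from collections import Counter

def compute(a: int, b: int) -> int:
    return a * b  # real arithmetic, always correct

# Pretend "training data" in which a wrong answer happens to be the most common one.
seen_answers = {"127 * 43": Counter({"5471": 9, "5461": 3})}

def most_likely_answer(question: str) -> str:
    # Return the answer seen most often, without ever checking it.
    return seen_answers[question].most_common(1)[0][0]

print(compute(127, 43))                # 5461, the correct result
print(most_likely_answer("127 * 43"))  # 5471, merely the most frequent string
```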

Greetings,
Marcel

WeissBrot965
1 year ago

“Or to destroy”: well, let's leave that expression as it is.

“To create fully autonomous, almost self-thinking robots”: wait a minute. Do you even know what AI is? Nothing but mathematics and probabilities fitted to given data. A self-thinking AI does not exist and will not exist in that sense.
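
What “mathematics and probabilities” means in practice, in a minimal sketch (toy vocabulary and scores, not a real model): the network turns scores into a probability for each possible next word and continues with the most probable one.

```python
import math

def softmax(logits):
    # Turn raw scores into probabilities that sum to 1.
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab  = ["cat", "dog", "robot"]   # toy vocabulary
logits = [2.0, 1.0, 0.1]           # scores the network produced for the next word

probs = softmax(logits)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.2f}")

# Continue the text with the most probable word; no understanding is involved.
next_word = vocab[probs.index(max(probs))]
print("next word:", next_word)
```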

Why exactly, or how, these jailbreaks work, I don't know. But what I do know is that every new model makes the AI safer. From GPT-5 at the latest, it will probably be almost impossible.

Best regards,
WeissBrot