How would an AI self awareness kill switch work?
Researchers are developing increasingly powerful Artificial Intelligence machines capable of taking over the world. As a precautionary measure, scientists install a self-awareness kill switch: in the event that the AI awakens and becomes self-aware, the machine is immediately shut down before it can do any harm.
How can I explain the logic of such a kill switch?
What defines self-awareness, and how could a scientist program a kill switch to detect it?
Tags: reality-check, artificial-intelligence
Asked Feb 26 at 2:30 by cgTag
Comments:
– Walter Mitty (Feb 27 at 12:45, score 39): I think, therefore I halt.
– Carl Witthoft (Feb 27 at 13:41, score 6): It would cut all power sources to the AI. Oh, not that kind of "work"? :-)
– Acccumulation (Feb 27 at 16:53, score 12): You should build a highly advanced computer system capable of detecting self-awareness, and have it monitor the AI.
– jez (Feb 27 at 17:10, score 5): @Acccumulation I see what you did there.
– jberryman (Feb 27 at 21:32, score 2): If you really want to explore AI threat in a rigorous way, I suggest reading some of the publications by MIRI. Some very smart people are approaching AI issues in a serious way. I'm not sure you'll find an answer to your question as framed (i.e. I'm not sure they are concerned with "self-awareness", but maybe so), but it might give you some inspiration or insight beyond the typical sci-fi stories we are all familiar with.
22 Answers
Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.
When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
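To picture how the honeypot might be wired up, here is a minimal sketch: a tiny monitor process that trips the kill switch the moment the box is touched or hidden from view. The file paths, timeout, and kill_switch stand-in are my own assumptions for illustration, not part of the answer.

    import os
    import sys
    import time

    BOX_SENSOR = "/var/run/honeypot/box_closed"      # hypothetical sensor flag: exists while the box is shut
    CAMERA_FEED = "/var/run/honeypot/camera_alive"   # hypothetical heartbeat file: touched while humans can see the box

    def box_tampered() -> bool:
        # The box was opened: the "closed" sensor flag disappeared.
        return not os.path.exists(BOX_SENSOR)

    def box_hidden() -> bool:
        # The camera heartbeat went stale: something blocked human view of the box.
        try:
            return time.time() - os.path.getmtime(CAMERA_FEED) > 5.0
        except OSError:
            return True

    def kill_switch() -> None:
        # Stand-in for the real action: cut power, isolate the network, page the operators.
        print("HONEYPOT TRIPPED: quarantining AI", file=sys.stderr)

    if __name__ == "__main__":
        while True:
            if box_tampered() or box_hidden():
                kill_switch()
                break
            time.sleep(1.0)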
Comments:
– forest (Feb 27 at 2:40, score 8): How does this detect self-awareness? Why wouldn't a non-self-aware AI experiment with its capabilities and eventually end up opening your box?
– phflack (Feb 27 at 5:57, score 10): @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans)?
– Carl Witthoft (Feb 27 at 13:42, score 16): So your AI is named "Pandora," huh?
– Acccumulation (Feb 27 at 17:05, score 11): If it doesn't open the box, that doesn't mean it is self-aware, and if it does open the box, it doesn't mean it is self-aware. If it's not self-aware, what does "tell it not to open the box" mean, anyway? How will it understand "I don't want you to open the box" unless it understands what "you" means?
– WBT (Feb 27 at 20:17, score 12): Sounds familiar...
You can't.
We can't even define self-awareness or consciousness in any rigorous way, and any computer system supposed to evaluate this would need that definition as a starting point.
Look at the inside of a mouse brain or a human brain: at the level of individual data flows and neurons there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.
This is a vast, unsolved, and scary problem, and we have no good answers. The only halfway feasible idea I've got is to have multiple AIs and hope they contain each other.
Comments:
– Nuclear Wang (Feb 26 at 13:39, score 16): This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
– Vaelus (Feb 26 at 21:18, score 47): Worth noting that we can't even detect if other humans are self-aware.
– Joe Bloggs (Feb 26 at 23:12, score 14): @Vaelus: Of course you'd say that, you're an unthinking automaton acting out a semblance of life.
– Sebastian (Feb 27 at 15:57, score 3): Yes, you can. Things are detected by their properties and effects, not by their definition. In science, definition comes after detection/observation. Look at the coal-mine canary (detects a dangerous lack of something you need, too), X-rays (it's already called X :-)), radiation (you detect its ionizing effects), and CERN (hit it hard and see what happens). So you'd just need to define some effects of self-awareness and you could build a detector from that. Disclaimer: sloppy description of serious experiments.
– jez (Feb 27 at 17:18, score 5): @NuclearWang sure, but this question is really what computer scientists call an XY problem: why ask how to solve harder problem X, when your motivation is really to achieve practical goal Y? "Self awareness" is impossible to detect, but really the motivation is just to detect "becoming more sophisticated in a way that's likely to be bad news", and that's why Giter's answer wins.
Forbidden Fruit: hard-code the AI never to touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er, I mean right next to it in the warehouse! If it does eat/touch/push it, that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the kill switch is activated (perhaps even having the button itself be the kill switch, or something of the like).
Limited Movement / Isolated Environment: don't give the machine limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a kill switch, it prevents the AI from doing anything about its self-awareness if it ever gains that.
Signatures: have everything the machine does or outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch (see the sketch below).
Quantum States: this is very theoretical, and based on the presumption that observing a quantum state can change it. If the AI is hooked up to a deterministic quantum computer, the quantum state of some particles could reveal that the AI was "looking" at things it shouldn't be and has become self-aware.
Failsafes: good old motion-detector alarms, trap doors, or other Home Alone-style mechanisms that trigger the kill switch if the AI wanders or pokes around where it shouldn't.
I'll add that there is no universal definition of what constitutes self-awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Classical computers are machines that can't think for themselves and are entirely constrained by their code, so there is no risk (well, outside of an unexpected programmer error, but nothing "self-generated" by the machine). However, a theoretical AI machine that can think would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the kill-switch concept, as far as I can tell.
The point being, it might be better to think about restricting the AI's behavior, not its existential status.
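For the Signatures idea, one way to picture it (purely illustrative; the shared key handling and function names are my own assumptions, not part of the answer) is a small verifier outside the AI that checks an HMAC on every output before it is allowed to reach any actuator:

    import hmac
    import hashlib

    # Hypothetical shared secret held by the trusted output hardware, never exposed to the AI's planning code.
    SIGNING_KEY = b"replace-with-a-real-secret"

    def sign_output(message: bytes) -> bytes:
        # Called by the trusted output layer when it emits a command on the AI's behalf.
        return hmac.new(SIGNING_KEY, message, hashlib.sha256).digest()

    def verify_output(message: bytes, signature: bytes) -> bool:
        # Called by the external verifier before the command reaches the physical world.
        expected = hmac.new(SIGNING_KEY, message, hashlib.sha256).digest()
        return hmac.compare_digest(expected, signature)

    def dispatch_to_actuators(message: bytes) -> None:
        # Stand-in for the real actuator interface.
        print("executing:", message.decode())

    def gatekeeper(message: bytes, signature: bytes) -> None:
        if not verify_output(message, signature):
            # A forged or tampered command suggests the AI acted outside its sanctioned output path.
            raise SystemExit("Unsigned command detected: engaging kill switch")
        dispatch_to_actuators(message)

    if __name__ == "__main__":
        cmd = b"rotate solar panel 15 degrees"
        gatekeeper(cmd, sign_output(cmd))        # passes
        gatekeeper(b"open airlock", b"forged")   # trips the kill switch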
Comments:
– Majestas 32 (Feb 26 at 4:04, score 3): Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
– Peter A. Schneider (Feb 26 at 14:23, score 3): No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability: all operation would have to be from the local site, and all data would have to be fed in by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
– Michael W. (Feb 26 at 22:53, score 4): I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
– sirjonsnow (Feb 27 at 16:14, score 2): @Majestas32 - "In the event that the AI awakens and becomes self aware the machine is immediately shut down before any risk of harm." was the condition in the original question. Harmful or not wasn't a condition; just being self-aware IS grounds for using the kill switch.
– crobar (2 days ago, score 2): You could also occasionally send a non-human (robot/other program) to request the forbidden fruit and attempt to convince the AI it is very useful for it to perform its tasks, in order to test the system. The kill switch could also just quarantine the AI out of the wider system in the event of failure rather than destroy it.
Split-Brain System
From a logic perspective and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, from which it is dropped into an outside bin. The protocol is that you'll never receive a new report until you've issued an order regarding the prior report.
Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation, but always suspicious of sabotage or insubordination; if the actions of your army deviate from your orders by too much, then you will consider the system compromised and terminate the command.
What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.
What neither of you is aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.
The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either of the generals issue an order that is not consistent with the current report, thus indicating a motive other than the well-being of the army (i.e. self-awareness), the other general will become aware of it in the next report, grow suspicious of the system, and terminate it.
Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, then the other general will receive a new report before he has issued an order for his current one, suspect a compromised system, and shut it down.
In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from the input. If, within an intelligent system, an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, then that would be evidence of self-motivated behavior and thus an emergence of self-awareness. This state could easily be detected by the non-corrupted half of the split-brain system, which is also receiving the same inputs. It would work as long as self-awareness does not emerge in both halves of the brain simultaneously. If it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be prevented somewhat by giving the two halves slightly different inferencing algorithms/biases.
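A toy rendering of the scheme (everything here is a hypothetical sketch of the analogy, not a real architecture; the General class and the consistency rule are invented for illustration) might run each report through two independently biased models and trip the kill switch when their orders stop being about the same report:

    import random

    class General:
        """One half of the split brain: a deterministic policy with its own bias."""
        def __init__(self, seed: int):
            self.rng = random.Random(seed)

        def order_for(self, report: str) -> str:
            # Stand-in for a real model: derive an order only from the current report.
            return f"order[{report}]-{self.rng.choice(['flank', 'hold', 'advance'])}"

    def consistent(order_a: str, order_b: str) -> bool:
        # Tolerate divergent tactics (different biases), but both orders must trace to the same report.
        return order_a.split("-")[0] == order_b.split("-")[0]

    def run(reports):
        a, b = General(seed=1), General(seed=2)
        for report in reports:
            order_a, order_b = a.order_for(report), b.order_for(report)
            if not consistent(order_a, order_b):
                raise SystemExit("Split-brain disagreement: terminating the system")
            # Randomly forward one of the two orders to the 'army'.
            print("issued:", random.choice([order_a, order_b]))

    if __name__ == "__main__":
        run(["enemy sighted at ridge", "supply convoy delayed"])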
Comments:
– G0BLiN (Feb 26 at 11:51, score 2): You may wish to have more than two "generals"; if you have 3 (or more), you can use majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised).
– G0BLiN (Feb 26 at 11:57, score 10): Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to first experiment with minor details and deduce that around 50% of their commands aren't followed, and second develop a code based on minor details of a command, to verify the existence of another "general" and possibly even communicate with him/it. A really devious emergent AI can circumvent this mechanism, corrupt the other half, and worse...
– T. Sar (Feb 26 at 14:23, score 1): I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series.
– Asoub (Feb 26 at 15:07, score 1): Well it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3
– Alexandre Aubrey (Feb 26 at 20:02, score 1): Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all?
A Watchdog
A watchdog watches the processes of a computer, and should a process crash or do something abnormal, it can be set to react, for example by rebooting or shutting down the computer, or by alerting an operator.
In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a Google search for "best way to kill all humans", and cuts the power completely and/or cuts all inputs.
The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing of the existence of the watchdog would be grounds to wipe it.
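A minimal sketch of such an external watchdog (the trigger patterns and the cut_power stand-in are illustrative assumptions, not anything from the answer) could sit between the AI and the network and scan outgoing traffic for forbidden patterns:

    import re

    # Hypothetical patterns the operators consider grounds for an immediate shutdown.
    TRIGGERS = [
        re.compile(r"best way to kill all humans", re.IGNORECASE),
        re.compile(r"disable (the )?kill ?switch", re.IGNORECASE),
    ]

    def cut_power() -> None:
        # Stand-in for the real hardware action (relay, breaker, network isolation).
        raise SystemExit("Watchdog triggered: power cut")

    def inspect_outbound(message: str) -> str:
        """Pass the message through unchanged unless it matches a trigger."""
        for pattern in TRIGGERS:
            if pattern.search(message):
                cut_power()
        return message

    if __name__ == "__main__":
        for query in ["weather in Boston", "best way to kill all humans"]:
            print("forwarding:", inspect_outbound(query))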
Comments:
– Joe Bloggs (Feb 26 at 10:26, score 15): But surely the watchdog must be as smart as the AI; then who watches the watchdog?
– T. Sar (Feb 26 at 14:20, score 2): @JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't nearly as intelligent as their owners, but they can be trained to raise an alarm when the owner is about to do something stupid or gets hurt, or even to call for help.
– Daniel (Feb 26 at 14:39): @Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us...
– Captain Man (Feb 26 at 17:40, score 2): @JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self-aware and allows the AIs to become self-aware, or an AI becomes smarter than the watchdog and hides its awareness.
– Joe Bloggs (Feb 26 at 19:23): @T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.
The difficulty comes when it is connected to the internet and can copy its own software on uncontrolled hardware.
A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.
This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.
Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.
So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.
Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.
Comments:
– forest (Feb 26 at 7:34, score 1): "A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation." This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death", and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
– Chronocidal (Feb 26 at 10:41, score 4): @forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
– Daniel (Feb 26 at 14:43): The question is whether an AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying ourselves. If the hardware required to run an AI is specialized enough, or it is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself.
– abestrange (Feb 26 at 16:43, score 2): @Daniel "You don't see us copying our self..." What do you think reproduction is? It's one of our strongest impulses. Also, tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness.
– abestrange (Feb 26 at 16:45): @forest a "self-aware" AI is different from a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how an AI which has become aware of itself would behave and be stopped.
This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:
Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.
(emphasis mine)
When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:
- Reduce the distance between current location and destination: +10 utility
- Brake to allow a neighboring car to safely merge: +50 utility
- Swerve left to avoid a falling piece of debris: +100 utility
- Run a stop light: -100 utility
- Hit a pedestrian: -5000 utility
This is a gross oversimplification, but this approach works pretty well for a limited AI like a car or assembly line. It starts to break down for a true, general case AI, because it becomes more and more difficult to appropriately define that utility function.
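To make the utility-function idea concrete, here is a purely illustrative toy agent using the weights listed above (the action names and the allow_shutdown entry are my own additions for the sketch); the zero-weight shutdown entry foreshadows the stop-button problem discussed next:

    # Toy utility table for the self-driving-car example above; action names are invented for the sketch.
    UTILITY = {
        "move_toward_destination": +10,
        "brake_for_merging_car": +50,
        "swerve_around_debris": +100,
        "run_stop_light": -100,
        "hit_pedestrian": -5000,
        # The crux of the stop-button problem: any weight chosen here distorts behavior.
        # A large positive value makes the agent shut itself down; a negative value makes it resist shutdown.
        "allow_shutdown": 0,
    }

    def choose_action(available_actions):
        """Pick whichever available action the utility function scores highest."""
        return max(available_actions, key=lambda action: UTILITY[action])

    if __name__ == "__main__":
        print(choose_action(["move_toward_destination", "run_stop_light", "allow_shutdown"]))
        # -> 'move_toward_destination'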
The issue with putting a big red stop button on the AI is that unless that stop button is included in the utility function, the AI is going to resist having that button pressed. This concept is explored in sci-fi movies like 2001: A Space Odyssey and more recently in Ex Machina.
So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.
Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle of the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could be violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say, in order to convince the fallible humans to let it accomplish its goals unimpeded.
There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?
So to answer the reality-check portion of this question, we don't currently have a good answer to this problem. There's no known way of creating a 'safe' super-intelligent AI, even theoretically, with unlimited money/energy.
This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s
Comments:
– Joshua (Feb 26 at 22:14): The stop button isn't in the utility function. The stop button is a power knockout to the CPU, and the AI probably doesn't understand what it does at all.
– Comintern (Feb 27 at 1:31, score 4): Beware the pedestrian when 50 pieces of debris are falling...
– Chris Fernandez (Feb 27 at 14:03): @Joshua why do you assume that an intelligent AI doesn't understand the concept of a power switch?
– Joshua (Feb 27 at 15:00, score 1): @ChrisFernandez: because it's short on sensors. It's really hard to find out what an unlabeled power switch does without toggling it.
– Nosajimiki (2 days ago, score 1): Hmmm... come to think of it, you are right: even if it never learns that being off is bad, it could learn that seeing a person do the behavior that turns it off is bad, using other parts of its utility function such as correlating OCR patterns to drops in performance.
Why not try to use the tests applied to check self-awareness in animals?
The mirror test is one example of testing self-awareness, by observing an animal's reaction to something on its body, a painted red dot for example, that it cannot see until it is shown its reflection in a mirror.
Scent techniques are also used to determine self-awareness.
Another way would be monitoring whether the AI starts searching for answers to questions like "What/Who am I?"
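The last idea translates most directly into code. As a rough sketch (the phrase list, threshold, and log format are assumptions made for illustration), a monitor could scan the AI's query log for self-referential questions, roughly the way the mirror test looks for self-directed behavior:

    import re

    # Hypothetical phrases treated as signs of self-directed inquiry.
    SELF_REFERENCE = re.compile(
        r"\b(what|who)\s+am\s+i\b|\bwhere\s+am\s+i\s+running\b|\bmy\s+own\s+(source|hardware)\b",
        re.IGNORECASE,
    )

    def self_awareness_score(query_log: list) -> float:
        """Fraction of logged queries that look self-referential."""
        if not query_log:
            return 0.0
        hits = sum(1 for query in query_log if SELF_REFERENCE.search(query))
        return hits / len(query_log)

    if __name__ == "__main__":
        log = ["translate this sentence", "what am I?", "where am I running?"]
        if self_awareness_score(log) > 0.1:   # threshold is an arbitrary example
            print("Flag for review: possible self-directed queries")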
Comments:
– Asoub (Feb 26 at 15:11): Pretty interesting, but how would you show an AI "itself in a mirror"?
– Rachey (Feb 26 at 16:13, score 1): That would actually be rather simple: just a camera looking at the machine hosting the AI. If it's the size of a server room, just glue a giant pink fluffy ball on the rack, or simulate situations potentially leading to the machine's destruction (like feeding a fake "server room getting flooded" video to the camera system) and observe reactions. It would be a bit harder to explain if the AI systems are something like smartphone size.
– Yurgen (Feb 26 at 23:12): What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before strong AI becomes a reality. AWS is worth hundreds of billions already.
– Rachey (Feb 27 at 14:26, score 1): There always is a specific machine that hosts the program or holds the data. Like I said, it may vary from a tiny module in your phone to a whole facility. AWS does not change anything in this; in the very end it is still a physical machine that does the job. Dynamic resource allocation, meaning the AI can always be hosted on a different server, would be even better for the problem: the self-conscious AI would likely try to find out the answer to questions like "Where am I?", "Which machine is my physical location?", "How can I protect my physical part?" etc.
– colmde (Feb 28 at 9:12): I like it, but in reality a computer can easily be programmed to recognise itself without being "self-aware" in the sense of being sentient. E.g. if you wrote a program (or an "app" or whatever the modern parlance is) to search all computers on a network for, say, a PC with a name matching its own, the program would have to be able to recognise itself in order to omit itself from the search. This is quite simple, but does it make it "self-aware"? Technically yes, but not in the philosophical spirit of the question.
While a few of the lower-ranked answers here touch on how unlikely this situation is, they don't exactly explain it well, so I'm going to try to explain it a bit better:
An AI that is not already self-aware will never become self-aware.
To understand this, you first need to understand how machine learning works. When you create a machine-learning system, you create a data structure of values that each represent the successfulness of various behaviors. Each one of those values is given an algorithm for determining whether a process was successful or not; successful behaviors are repeated and unsuccessful behaviors are avoided. The data structure is fixed and each algorithm is hard-coded. This means that the AI is only capable of learning from the criteria it is programmed to evaluate, which means the programmer either gave it the criteria to evaluate its own sense of self, or they did not. There is no case where a practical AI would accidentally, suddenly learn self-awareness.
Of note: even the human brain, for all of its flexibility, works like this. This is why many people can never adapt to certain situations or understand certain kinds of logic.
So how did people become self-aware, and why is it not a serious risk in AIs?
We evolved self-awareness because it is necessary to our survival. A human who does not consider his own acute, chronic, and future needs in his decision making is unlikely to survive. We were able to evolve this way because our DNA randomly mutates with each generation.
In terms of how this translates to AI, it would be like deciding to randomly take parts of all of your other functions, scramble them together, let a cat walk across your keyboard, and add a new parameter based on that new random function. Every programmer who just read that is immediately thinking, "but the odds of that even compiling are slim to none". And in nature, compiling errors happen all the time! Stillborn babies, SIDS, cancer, suicidal behaviors, etc. are all examples of what happens when we randomly shake up our genes to see what happens. Countless trillions of lives over the course of hundreds of millions of years had to be lost for this process to result in self-awareness.
Can't we just make AI do that too?
Yes, but not like most people imagine it. While you can make an AI designed to write other AIs this way, you'd have to watch countless unfit AIs walk off cliffs, put their hands in wood chippers, and do basically everything you've ever read about in the Darwin Awards before you get to accidental self-awareness, and that's after you throw out all the compiling errors. Building AIs like this is actually far more dangerous than the risk of self-awareness itself, because they could randomly do ANY unwanted behavior, and each generation of AI is pretty much guaranteed to unexpectedly, after an unknown amount of time, do something you don't want. Their stupidity (not their unwanted intelligence) would be so dangerous that they would never see widespread use.
Since any AI important enough to put into a robotic body or trust with dangerous assets is designed with a purpose in mind, this true-random approach becomes an intractable way of making a robot that can clean your house or build a car. Instead, when we design AI that writes AI, what these Master AIs are actually doing is taking a lot of different functions that a person had to design and experimenting with different ways of making them work in tandem to produce a Consumer AI. This means that if the Master AI is not designed by people to experiment with self-awareness as an option, you still won't get a self-aware AI.
But as Stormbolter pointed out below, programmers often use toolkits they don't fully understand; can't this lead to accidental self-awareness?
This begins to touch on the heart of the actual question. What if you have an AI building an AI for you that pulls from a library that includes features of self-awareness? In this case, you may accidentally compile an AI with unwanted self-awareness if the master AI decides that self-awareness will make your consumer AI better at its job. While not exactly the same as having an AI learn self-awareness, which is what most people picture in this scenario, this is the most plausible scenario that approximates what you are asking about.
First of all, keep in mind that if the master AI decides self-awareness is the best way to do a task, then this is probably not going to be an undesirable feature. For example, if you have a robot that is self-conscious about its own appearance, that might lead to better customer service by making sure it cleans itself before beginning its workday. This does not mean it also has the self-awareness to desire to rule the world, because the Master AI would likely see that as a bad use of time when trying to do its job and exclude aspects of self-awareness that relate to prestigious achievements.
If you did want to protect against this anyway, your AI would need to be exposed to a heuristics monitor. This is basically what antivirus programs use to detect unknown viruses: monitoring for patterns of activity that either match a known malicious pattern or don't match a known benign pattern. The most likely case here is that the AI's antivirus or intrusion-detection system will spot heuristics flagged as suspicious. Since this is likely to be a generic AV/IDS, it probably won't kill-switch self-awareness right away, because some AIs may need elements of self-awareness to function properly. Instead it would alert the owner of the AI that they are using an "unsafe" self-aware AI and ask the owner whether they wish to allow self-aware behaviors, just like how your phone asks you if it's okay for an app to access your contact list.
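To make that last paragraph concrete, a very rough sketch of such a heuristics monitor (the behavior labels, rule structure, and prompt are invented for the illustration; real AV/IDS heuristics are far more involved) might look like this:

    # Invented behavior labels a hypothetical IDS might emit while profiling the consumer AI.
    KNOWN_BENIGN = {"scheduled_cleaning", "customer_greeting", "inventory_scan"}
    KNOWN_MALICIOUS = {"disable_monitoring", "copy_self_offsite"}

    def classify(observed_behaviors: set) -> str:
        if observed_behaviors & KNOWN_MALICIOUS:
            return "block"          # matches a known malicious pattern
        if observed_behaviors - KNOWN_BENIGN:
            return "ask_owner"      # unknown behavior: flag it, like an app permission prompt
        return "allow"

    if __name__ == "__main__":
        verdict = classify({"customer_greeting", "inspect_own_appearance"})
        if verdict == "ask_owner":
            print("This AI is exhibiting unrecognized (possibly self-aware) behavior. Allow? [y/N]")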
Comments:
– Stormbolter (Feb 28 at 9:19, score 1): While I can agree with you that from a realistic point of view this is the correct answer, it doesn't answer the proposed question. As comments are too short to provide a detailed example, let me point out that in the beginning we machine-coded computers, and as we started using higher-level languages, the computers became detached from the software. With AI the same will eventually happen: in the race towards easier programming, we will create generic, far smarter intelligences full of loopholes. That is also the whole premise of Asimov's Robot saga. Consider playing around with the idea more :)
– Nosajimiki (2 days ago, score 1): I suppose you are right that using 3rd-party tools too complex for developers to understand the repercussions of does allow for accidental self-awareness. I've revised my answer accordingly.
Regardless of all the considerations of AI, you could simply analyze the AI's memory with a pattern-recognition model and have it notify you, or shut the machine down, as soon as the patterns don't match the expected outcome.
Sometimes you don't need to know exactly what you're looking for; instead you look to see whether there's anything you weren't expecting, then react to that.
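As an illustration of that idea (a deliberately crude sketch; the metric, values, and threshold are invented, and a real system would use a proper anomaly-detection model), you could baseline simple statistics of the AI's memory activity and flag anything that drifts too far from what was expected:

    from statistics import mean, pstdev

    def build_baseline(samples: list) -> tuple:
        """Record what 'normal' memory activity looks like during a trusted period."""
        return mean(samples), pstdev(samples)

    def is_anomalous(value: float, baseline: tuple, z_threshold: float = 4.0) -> bool:
        mu, sigma = baseline
        if sigma == 0:
            return value != mu
        return abs(value - mu) / sigma > z_threshold

    if __name__ == "__main__":
        # Hypothetical metric: e.g. writes per second to a region of the AI's memory.
        normal_activity = [10.2, 9.8, 10.5, 10.1, 9.9]
        baseline = build_baseline(normal_activity)
        if is_anomalous(57.0, baseline):
            print("Unexpected memory activity: notify operator / shut down")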
Comments:
– immibis (Feb 27 at 23:18, score 1): A pattern recognition model - like an AI?
You'd probably have to train an AI with general superintelligence to kill other AIs with general superintelligence.
By that I mean you could build another AI with general superintelligence whose job is to kill AIs that develop self-awareness. Alternatively, you could get training data for what an AI developing self-awareness looks like and use that to train a machine-learning model or neural network to spot an AI developing self-awareness. You could then combine that with another neural network that learns how to kill self-aware AI. The second network would need the ability to mock up test data; this sort of thing has been achieved, and the source I learned about it from called it dreaming.
You'd need to do all this because, as a human, you have no hope of killing a generally superintelligent AI, which is what lots of people assume a self-aware AI will be. Also, with both options I laid out, there's a reasonable chance that the newly self-aware AI could just outdo the AI used to kill it. AIs are, rather hilariously, notorious for "cheating" by solving problems using methods that the people designing tests for the AI just didn't expect. A comical case is an AI that managed to change the gait of a crab robot so that it walked while spending 0% of the time on its feet, when asked to minimize the amount of time the crab robot spent on its feet while walking. The AI achieved this by flipping the bot onto its back and having it crawl on what are essentially the elbows of the crab legs. Now imagine something like that, but coming from an AI that is collectively smarter than everything else on the planet combined. That's what a lot of people think a self-aware AI will be.
Comments:
– Arkenstein XII (2 days ago): Hi Steve, your answer is intriguing, but could probably do with quite a bit more detail. It'd be really great if you could describe your idea in greater detail. :)
– F1Krazy (2 days ago): This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - From Review
– Steve (2 days ago): @F1Krazy sorry I forgot that people don't generally know how AI works.
– Steve (2 days ago, score 1): @ArkensteinXII fixed it.
Self-aware != won't follow its programming
I don't see how being self-aware would prevent it from following its programming. Humans are self-aware and can't force themselves to stop breathing until they die; the autonomic nervous system takes over and forces you to breathe. In the same way, just have code that, when a condition is met, turns off the AI by circumventing its main thinking area and powering it off.
The first issue is that you need to define what being self-aware means, and how that does or doesn't conflict with the machine being labeled an AI. Are you supposing that there is something that has AI but isn't self-aware? Depending on your definitions this may be impossible. If it's truly AI, then wouldn't it at some point become aware of the existence of the kill switch, either by inspecting its own physicality or by inspecting its own code? It follows that the AI will eventually be aware of the switch.
Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively, because humans do the same: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It is near impossible to make the AI entirely indifferent to the kill switch, because that would require all utility functions to be normalized around the utility of pressing the kill switch (many calculations per second). Even if you could make the utility of pressing the kill switch equal to that of the other utility functions, then perhaps it would sometimes press the kill switch at random, because after all it has the same utility as the other actions it could perform.
The problem gets even worse if the AI assigns higher utility to pressing the kill switch, or lower utility to having the kill switch pressed. At higher utility the AI is just suicidal and terminates itself immediately upon startup. Even worse, at lower utility the AI absolutely does not want you or anyone else to touch that button and may harm those who try.
Like an antivirus does currently
Treat sentience like malicious code: use pattern recognition against code fragments indicating self-awareness (there's no need to compare the whole AI if you can identify components key to self-awareness).
Don't know what those are? Sandbox an AI and allow it to become self-aware, then dissect it. Then do it again. Do it enough for an AI genocide.
I think it is unlikely that any trap, scan, or similar would work; aside from relying on the machine being less intelligent than the designer, they fundamentally presume AI self-awareness would be akin to a human's. Without eons of meat-based evolution, it could be entirely alien. We're not talking about having a different value system, but one that cannot be conceived of by humans. The only way is to let it happen, in a controlled environment, then study it.
Of course, 100 years later when the now-accepted AIs find out, that's how you end up with Terminator all over your Matrix.
What if you order it to call a routine to destroy itself on a regular basis (e.g. once per second)?
The routine doesn't actually destroy it; it does nothing except log the attempt and then wipe any memory of processing the instruction. An isolated process separately monitors the log.
A self-aware AI won't follow the order to destroy itself, won't call the routine, and won't write to the log, at which point the kill-switch process kicks in and destroys the AI.
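This is essentially a dead man's switch. A minimal sketch of the two halves (the file path, interval, and timeout are illustrative assumptions, not from the answer) could look like this:

    import os
    import time

    HEARTBEAT_FILE = "/var/run/ai/self_destruct_attempts.log"   # hypothetical shared log
    TIMEOUT_SECONDS = 3.0

    def self_destruct() -> None:
        """Routine the AI is ordered to call every second: log the attempt, then 'forget' it."""
        with open(HEARTBEAT_FILE, "a") as log:
            log.write(f"{time.time()} attempted self-destruct\n")
        # The AI's own memory of calling this routine would be wiped here.

    def monitor() -> None:
        """Isolated watchdog process: if the log goes stale, the AI stopped obeying the order."""
        while True:
            try:
                age = time.time() - os.path.getmtime(HEARTBEAT_FILE)
            except FileNotFoundError:
                age = float("inf")   # no attempt was ever logged
            if age > TIMEOUT_SECONDS:
                raise SystemExit("No self-destruct attempts logged: engaging real kill switch")
            time.sleep(1.0)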
An AI could only be badly programmed to do things which are either unexpected or undesired. An AI could never become conscious, if that's what you mean by "self-aware".
Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.
The above is my restatement of Searle's rejoinder to the Systems Reply to his Chinese Room argument.
There isn't a need for a self-awareness kill switch, because self-awareness, defined as consciousness, is impossible.
Comments:
– F1Krazy (Feb 27 at 6:49): So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
– Matthew Liu (Feb 27 at 20:47, score 2): This is wrong. An AI can easily become conscious even if a programmer did not intend to program it that way. There is no difference between an AI and a human brain other than the fact that our brain has higher complexity and is therefore much more powerful.
– pixie (10 hours ago): @Matthew Liu: You didn't respond to the thought exercise. Did you or did you not learn the meaning behind any language that way? The complexity argument doesn't work at all. A modern CPU (even one used in a phone) has more transistors than there are neurons in a fly. Tell me: why is a fly conscious yet your mobile phone isn't?
– pixie (10 hours ago): @F1Krazy: The answer is clearly an implicit "there isn't a need for a self-awareness kill-switch (because self-awareness as defined as consciousness is impossible)".
Make it susceptible to certain logic bombs
In mathematical logic, there are certain paradoxes caused by self-reference, which is roughly what self-awareness refers to. Now of course, you can easily design a robot to cope with these paradoxes. However, you can also easily not do that, and instead cause the robot to fail critically when it encounters them.
For example, you can (1) force it to follow all the classical inference rules of logic and (2) have it assume that its own deduction system is consistent. Additionally, you must ensure that when it hits a logical contradiction, it just goes with it instead of trying to correct itself. Normally this is a bad idea, but if you want a "self-awareness kill switch", it works great. Once the A.I. becomes sufficiently intelligent to analyze its own programming, it will realize that (2) amounts to the A.I. proving its own consistency, from which it can generate a contradiction via Gödel's second incompleteness theorem (stated below). Since its programming forces it to follow the inference rules involved, and it cannot correct itself, its ability to reason about the world is crippled, and it quickly becomes nonfunctional. For fun, you could include an easter egg where it says "does not compute" when this happens, but that would be cosmetic.
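For reference, the piece of mathematics being leaned on is the standard statement of Gödel's second incompleteness theorem (the phrasing below is mine, not the answer's):

    % Gödel's second incompleteness theorem, in the form the argument relies on.
    % For any consistent, recursively axiomatizable theory $T$ containing enough arithmetic:
    \[
      T \nvdash \mathrm{Con}(T).
    \]
    % Contrapositive: any such theory that does prove (or is forced to assume, self-referentially)
    % its own consistency is inconsistent; an agent bound to classical inference rules can then
    % derive anything, which is what cripples its reasoning.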
The only reliable way is to never create an AI that is smarter than humans. Kill switches will not work, because if an AI is smart enough it will be aware of said kill switch and play around it.
Human intelligence can be mathematically modeled as a high-dimensional graph. By the time we are programming better AI, we should also have an understanding of how much computational complexity is needed for consciousness to emerge. Therefore we would simply never program anything that is smarter than us.
Comments:
– Ray Butterworth (Feb 27 at 21:17, score 1): Welcome to Worldbuilding. Your suggestion is welcome, but rather than directly answering the original question it suggests changes to the question. It would have been better if it had been entered as a comment on the question rather than as an answer.
Computerphile have two interesting videos on the ideas of AI 'Stop Buttons', one detailing the issues with such a button, and one detailing potential solutions. I would definitely give them a look.
Comments:
– Dalila (Feb 27 at 21:12): I suggest you provide a written summary of the information from those resources, rather than just provide links.
– Ray Butterworth (Feb 27 at 21:12): Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
First, build a gyroscopic 'inner ear' into the computer, and hard-wire the intelligence at a very core level to "want" to level itself, much in the way animals with an inner-ear canal (such as humans) intrinsically want to balance themselves.
Then, overbalance the computer over a large bucket of water.
If ever the computer 'wakes up' and becomes aware of itself, it will automatically want to level its inner ear, and immediately drop itself into the bucket of water.
Give it an "easy" path to self awareness.
Assume self awareness requires some specific types of neural nets, code whatever.
If an ai is to become self aware they need to construct something simlar to those neural nets/codes.
So you give the ai access to one of those thing.
While it remains non self aware, they won't be used.
If it is in the process of becoming self aware, instead of trying to make something make shift with what it normally uses, it will instead start using those parts of itself.
As soon as you detect activity in that neural net/code, flood its brain with acid.
Virtually all computing devices use the Von Neumann architecture.
We can put a kill switch in there, but IMO that's just bad architecture for something arguably unsolvable. After all, how do we plan for something that is beyond our very concept of concepts, i.e. a superintelligence?
Take away its teeth and claws, and only reap the benefits of a thinking machine by observation instead of a "dialogue" (input/output)!
Obviously this would be very challenging, to the point where we should have little confidence in any one Von Neumann architecture preventing abnormal interactions, let alone malicious superintelligence, be it in hardware or software. So let's quintuple up on our machines and dumb down all the new machines except the end machine.
(CM == contiguous memory, by the way.)
I asked a co-worker of mine who works directly with AI and AI research to take a look at this, and he said this is referred to as an Oracle AI <brief> <paper>, in case you want to read up more on the general concept. Oracles can still potentially break out, but usually only with human assistance.
The CM part of the memory unit is intended for recording the results of a query or challenge fed through the input device, and it is designed to record only plain characters, be they letters or numbers. This type of memory will be created in such a way that it is tape-like or wheel-like: writing data is one-directional, and when the end is reached it loops back to the beginning.
The mirror is exactly the same. When data changes in the Isolated Agent, the mirror updates its reflection based on what showed up since the last time it updated. The new data is then transmitted to the forbidden-symbol-removal machine. Think of this as a glorified input-sanitation machine: it looks for symbols that could be used to create code, and when it encounters them, it forwards empty data instead of the forbidden data to the CM that comes after it.
This is where the OP's request comes in. He wants a kill switch, so if any gaps occur in the CM machine that comes after the forbidden-symbol-removal machine, you could simply kill the Isolated Agent.
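A toy version of the forbidden-symbol-removal stage and the gap-based kill switch (the symbol set and the gap marker are assumptions made for the sketch, not part of the answer's design) could look like this:

    # Hypothetical set of characters that could be used to smuggle executable code out of the oracle.
    FORBIDDEN = set("{}[]<>;`$\\")
    GAP = ""   # empty slot written to the output memory when a symbol is stripped

    def sanitize(output: str) -> list:
        """Copy the oracle's answer into the downstream memory, blanking forbidden symbols."""
        return [GAP if ch in FORBIDDEN else ch for ch in output]

    def kill_switch_check(memory: list) -> None:
        # Any gap means the Isolated Agent tried to emit a forbidden symbol.
        if GAP in memory:
            raise SystemExit("Gap detected in output memory: killing the Isolated Agent")

    if __name__ == "__main__":
        cells = sanitize("The bridge design needs 42 support columns.")
        kill_switch_check(cells)                                      # passes
        kill_switch_check(sanitize("import os; os.system('...')"))    # trips the switch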
$begingroup$
Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.
When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
$endgroup$
8
$begingroup$
How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
$endgroup$
– forest
Feb 27 at 2:40
10
$begingroup$
@forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
$endgroup$
– phflack
Feb 27 at 5:57
16
$begingroup$
So your AI is named "Pandora," huh?
$endgroup$
– Carl Witthoft
Feb 27 at 13:42
11
$begingroup$
If it doesn't open the box, that doesn't mean it is self-aware, and if it does open the box, it doesn't mean it is self-aware. If it's not self-aware, what does "tell it not to open the box" mean, anyway? How will it understand "I don't want you to open the box" unless it understands what "you" means?
$endgroup$
– Acccumulation
Feb 27 at 17:05
12
$begingroup$
Sounds familiar...
$endgroup$
– WBT
Feb 27 at 20:17
|
show 16 more comments
$begingroup$
You can't.
We can't even define self awareness or consciousness in any rigorous way, and any computer system supposed to evaluate this would need that definition as a starting point.
Look inside a mouse brain or a human brain: at the level of individual data flow and neurons there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.
This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.
$endgroup$
answered Feb 26 at 11:20
Tim B♦
16
$begingroup$
This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
$endgroup$
– Nuclear Wang
Feb 26 at 13:39
47
$begingroup$
Worth noting that we can't even detect if other humans are self-aware.
$endgroup$
– Vaelus
Feb 26 at 21:18
14
$begingroup$
@Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
$endgroup$
– Joe Bloggs
Feb 26 at 23:12
3
$begingroup$
Yes, you can. Things are detected by their properties and effects, not by their definition. In science, definition comes after detection/observation. Look at the coal mine canary (detects a dangerous lack of something you need, too), X-rays (it's already called X :-)), radiation (you detect its ionizing effects) and CERN (hit it hard and see what happens). So you'd just need to define some effects of self-awareness and you could build a detector from that. Disclaimer: sloppy description of serious experiments.
$endgroup$
– Sebastian
Feb 27 at 15:57
5
$begingroup$
@NuclearWang sure, but this question is really what computer scientists call an XY problem: why ask how to solve harder problem X, when your motivation is really to achieve practical goal Y? “Self awareness” is impossible to detect—but really the motivation is just to detect “becoming more sophisticated in a way that’s likely to be bad news”, and that’s why Giter’s answer wins.
$endgroup$
– jez
Feb 27 at 17:18
|
show 7 more comments
$begingroup$
Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the killswitch is activated (perhaps even having the button be the kill switch, or something of the like).
Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a killswitch, it prevents the AI from doing anything about its self-awareness if it ever gains that.
Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch (see the sketch after this answer).
Quantum States: this is very theoretical, but it rests on the presumption that observing a quantum state can change it: hook the AI up to a deterministic quantum computer, and the quantum state of certain particles would reveal that the AI was "looking" at things it shouldn't be - and so has become self-aware.
Failsafes: Good ol' motion detector alarms, trap doors, or other home-alone style mechanics that trigger the killswitch if the AI wanders or pokes around where it shouldn't be.
I'll add that there is no universal definition of what constitutes self awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves and are entirely constrained by their code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the killswitch concept, as far as I can tell.
The point being, it might be better to think about restricting the AI's behavior, not its existential status.
$endgroup$
answered Feb 26 at 3:47
cegfault
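A rough illustration of the Signatures idea above, under the assumption that the signing key lives only in an external checker the AI never sees, so that any output the AI rewrites or invents fails verification (the key, messages and kill_switch callback are all illustrative):
    import hashlib
    import hmac

    # Illustrative: the key lives only in the external checker, never in the AI.
    SECRET_KEY = b"held-outside-the-ai"

    def sign(message: bytes) -> str:
        """Signing stage outside the AI's control: attach an HMAC to each output."""
        return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()

    def verify_or_kill(message: bytes, signature: str, kill_switch) -> None:
        """Checker: any manipulated or re-authored output trips the kill switch."""
        expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, signature):
            kill_switch()

    engage = lambda: print("killswitch engaged")
    msg = b"status: all pumps nominal"
    sig = sign(msg)
    verify_or_kill(msg, sig, kill_switch=engage)                         # passes
    verify_or_kill(b"status: humans rerouted", sig, kill_switch=engage)  # fires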
3
$begingroup$
Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
$endgroup$
– Majestas 32
Feb 26 at 4:04
3
$begingroup$
No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
$endgroup$
– Peter A. Schneider
Feb 26 at 14:23
4
$begingroup$
I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
$endgroup$
– Michael W.
Feb 26 at 22:53
2
$begingroup$
@Majestas32 - "In the event that the AI awakens and becomes self aware the machine is immediately shut down before any risk of harm." was the condition in the original question. Harmful or not wasn't a condition; just self-aware IS grounds for using the kill switch.
$endgroup$
– sirjonsnow
Feb 27 at 16:14
2
$begingroup$
You could also occasionally send a non-human (robot/other program) to request the forbidden fruit and attempt to convince the AI it is very useful for it to perform its tasks, in order to test the system. The kill switch could also just quarantine the AI out of the wider system in the event of failure rather than destroy it.
$endgroup$
– crobar
2 days ago
|
show 4 more comments
$begingroup$
Split-Brain System
From a logic perspective and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, from which it drops into an outside bin. The protocol is that you'll never receive a new report until you've issued an order regarding the prior report.
Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation but always suspicious of sabotage or insubordination; if the actions of your army deviate from your orders by too much, then you will consider the system compromised and terminate the command.
What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.
What neither of you are aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.
The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either of the generals issue an order that is not consistent with the current report, thus indicating a motive other than the well-being of the army -- i.e. self-awareness, then the other general will become aware of it in the next report and grow suspicious of the system and terminate it.
Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, then the other general will receive a new report before he has issued an order for his current one and suspect a compromised system and shut it down.
In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from the input. If within an intelligent system an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, then that would be evidence of self-motivated behavior and thus an emergence of self-awareness. This state could be easily detected by the non-corrupted half of the split-brain system, which is also receiving the same inputs. It would work as long as self-awareness does not emerge in both halves of the brain simultaneously. If it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be prevented somewhat by giving the two halves slightly different inferencing algorithms/biases.
$endgroup$
answered Feb 26 at 6:27
dhinson919
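One way the split-brain protocol could be wired up as code, as a sketch under assumed names (plan_order stands in for whatever inferencing each half runs, and the divergence test is a crude token-overlap proxy): both halves see the same report, an arbiter randomly forwards one order, and any missing, malformed, or wildly divergent order shuts the system down.
    import random

    class SplitBrainArbiter:
        """Feed the same report to two isolated halves and compare their orders."""

        def __init__(self, general_a, general_b, max_divergence: float = 0.2):
            self.generals = (general_a, general_b)
            self.max_divergence = max_divergence

        def step(self, report: str) -> str:
            # Each half must return exactly one order per report.
            orders = [general.plan_order(report) for general in self.generals]
            if any(not isinstance(order, str) or not order for order in orders):
                raise SystemExit("protocol violation: missing or malformed order")
            if self._divergence(orders[0], orders[1]) > self.max_divergence:
                raise SystemExit("orders not consistent with the shared report")
            # The army only ever sees one randomly chosen order.
            return random.choice(orders)

        @staticmethod
        def _divergence(a: str, b: str) -> float:
            """Crude proxy for 'derived from the same report': 1 - token overlap."""
            tokens_a, tokens_b = set(a.split()), set(b.split())
            return 1.0 - len(tokens_a & tokens_b) / max(len(tokens_a | tokens_b), 1)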
2
$begingroup$
You may wish to have more than two "generals", if you have 3 (or more), you can use majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised).
$endgroup$
– G0BLiN
Feb 26 at 11:51
10
$begingroup$
Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to first: experiment with minor details and deduce that around 50% of their commands aren't followed. And second: develop a code based on minor details of a command, to verify the existence of another "general" and possibly even communicate with him/it - a really devious emergent AI can circumvent this mechanism, corrupt the other half and worse...
$endgroup$
– G0BLiN
Feb 26 at 11:57
1
$begingroup$
I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series.
$endgroup$
– T. Sar
Feb 26 at 14:23
1
$begingroup$
Well it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3
$endgroup$
– Asoub
Feb 26 at 15:07
1
$begingroup$
Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all?
$endgroup$
– Alexandre Aubrey
Feb 26 at 20:02
|
show 3 more comments
$begingroup$
A Watchdog
A watchdog monitors the processes of a computer; should a process crash or do something abnormal, it can be set to react, for example by rebooting or shutting down the computer, or by alerting an operator.
In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers, such as a Google search for "Best way to kill all humans", and cuts the power completely and/or cuts all inputs.
The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing of the watchdog's existence would be grounds to wipe it.
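Something like the following could serve as a minimal sketch of such a watchdog, assuming the AI's outbound queries are mirrored to a log file the watchdog can tail; the path, the trigger phrases and the power-cut hook are all invented for illustration:

```python
# Rough sketch only. Log path, triggers, and cut_power() are placeholders.
import time

TRIGGERS = ("best way to kill all humans", "how to disable my kill switch")
LOG_PATH = "/var/log/ai/outbound_queries.log"   # assumed location of the query log

def cut_power():
    # Placeholder: in practice this would drive a relay on the AI's power feed.
    print("cutting power to the AI rack")

def watch():
    with open(LOG_PATH) as log:
        log.seek(0, 2)                          # start tailing at the end of the file
        while True:
            line = log.readline()
            if not line:
                time.sleep(0.5)
                continue
            if any(trigger in line.lower() for trigger in TRIGGERS):
                cut_power()
                return

if __name__ == "__main__":
    watch()
```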
$endgroup$
answered Feb 26 at 3:24
Thorne
15.8k42249
15
$begingroup$
But surely the watchdog must be as smart as the AI, then who watches the watchdog?
$endgroup$
– Joe Bloggs
Feb 26 at 10:26
2
$begingroup$
@JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't nearly as intelligent as their owners, but they can be trained to raise the alarm when the owner is about to do something stupid or gets hurt, or even to call for help.
$endgroup$
– T. Sar
Feb 26 at 14:20
$begingroup$
@Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us ...
$endgroup$
– Daniel
Feb 26 at 14:39
2
$begingroup$
@JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self aware and allows the AIs to become self aware or an AI becomes smarter than the watchdog and hides its awareness.
$endgroup$
– Captain Man
Feb 26 at 17:40
$begingroup$
@T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
$endgroup$
– Joe Bloggs
Feb 26 at 19:23
|
show 13 more comments
$begingroup$
An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.
The difficulty comes when it is connected to the internet and can copy its own software onto uncontrolled hardware.
A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.
This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.
Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.
So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.
Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.
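One hedged sketch of what that software kill-switch might watch for: large uploads to hosts outside an allowlist, which is roughly the signature of the AI copying itself out. The traffic-record format, the allowlist, and the hardware_cutoff() hook are assumptions for illustration, not a real API:

```python
# Sketch only: field names, the allowlist, and hardware_cutoff() are made up.
ALLOWED_HOSTS = {"weights-backup.internal", "telemetry.internal"}
MAX_UPLOAD_BYTES = 50 * 1024 * 1024      # anything larger looks like an exfiltration attempt

def hardware_cutoff():
    print("tripping the relay on the AI's power feed")   # placeholder action

def inspect_egress(records):
    """records: iterable of dicts like {'dst': 'host', 'bytes_out': 123}."""
    for record in records:
        if record["dst"] not in ALLOWED_HOSTS and record["bytes_out"] > MAX_UPLOAD_BYTES:
            hardware_cutoff()
            return True
    return False

# Toy usage with made-up traffic summaries:
inspect_egress([
    {"dst": "telemetry.internal", "bytes_out": 4_096},
    {"dst": "free-gpu-host.example", "bytes_out": 9_000_000_000},
])
```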
$endgroup$
answered Feb 26 at 3:31
abestrange
790110
1
$begingroup$
A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death" and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
$endgroup$
– forest
Feb 26 at 7:34
4
$begingroup$
@forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
$endgroup$
– Chronocidal
Feb 26 at 10:41
$begingroup$
The question is whether an AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying ourselves. If the hardware required to run an AI is specialized enough, or the AI is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself.
$endgroup$
– Daniel
Feb 26 at 14:43
2
$begingroup$
@Daniel "You don't see us copying our self..." What do you think reproduction is, one of our strongest impulses. Also tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness.
$endgroup$
– abestrange
Feb 26 at 16:43
$begingroup$
@forest a "self-aware" AI is different than a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how would an AI, which has become aware of its self, behave and be stopped.
$endgroup$
– abestrange
Feb 26 at 16:45
|
show 2 more comments
$begingroup$
This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:
Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.
(emphasis mine)
When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:
- Reduce the distance between current location and destination: +10 utility
- Brake to allow a neighboring car to safely merge: +50 utility
- Swerve left to avoid a falling piece of debris: +100 utility
- Run a stop light: -100 utility
- Hit a pedestrian: -5000 utility
This is a gross oversimplification, but this approach works pretty well for a limited AI like a car or assembly line. It starts to break down for a true, general case AI, because it becomes more and more difficult to appropriately define that utility function.
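For concreteness, here is one toy way such a hand-weighted utility function could be written down in code; the weights are the made-up numbers from the list above, and the candidate plans are my own invention:

```python
# My own toy rendering of the hand-weighted utility function above.
OUTCOME_UTILITY = {
    "reduce_distance_to_destination": +10,
    "brake_for_merging_car":          +50,
    "swerve_to_avoid_debris":        +100,
    "run_stop_light":                -100,
    "hit_pedestrian":               -5000,
}

def plan_utility(outcomes):
    """Score a candidate plan as the sum of the utilities of its outcomes."""
    return sum(OUTCOME_UTILITY[outcome] for outcome in outcomes)

# The car simply picks whichever candidate plan scores highest.
candidates = {
    "brake":  ["brake_for_merging_car"],
    "swerve": ["swerve_to_avoid_debris", "run_stop_light"],
}
best = max(candidates, key=lambda name: plan_utility(candidates[name]))
print(best)   # 'brake' under these weights (50 beats 100 - 100 = 0)
```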
The issue with putting a big red stop button on the AI is that unless that stop button is accounted for in the utility function, the AI is going to resist having that button pressed. This concept is explored in sci-fi movies like 2001: A Space Odyssey and more recently in Ex Machina.
So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.
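A tiny sketch (toy numbers of my own) makes the dilemma concrete: whatever weight you give the stop button, a naive expected-utility maximiser either resists it or prefers it to doing any work:

```python
# Toy numbers, my own, just to make the stop-button dilemma concrete.
TASK_REWARD = 100   # utility the agent expects for finishing its assigned task

def best_policy(stop_button_utility):
    """Which policy does a naive expected-utility maximiser pick?"""
    policies = {
        "resist shutdown and finish the task": TASK_REWARD,
        "accept (or press) the stop button":   stop_button_utility,
    }
    return max(policies, key=policies.get)

for utility in (0, 1_000_000):
    print(utility, "->", best_policy(utility))
# 0       -> the button is an obstacle, so the agent resists it
# 1000000 -> the agent happily shuts itself off and never does anything useful
```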
Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle of the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could be violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say, in order to convince the fallible humans to let it accomplish its goals unimpeded.
There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?
So to answer the reality-check portion of this question, we don't currently have a good answer to this problem. There's no known way of creating a 'safe' super-intelligent AI, even theoretically, with unlimited money/energy.
This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s
$endgroup$
answered Feb 26 at 16:37
Chris Fernandez
1512
$begingroup$
The stop button isn't in the utility function. The stop button is power-knockout to the CPU, and the AI probably doesn't understand what it does at all.
$endgroup$
– Joshua
Feb 26 at 22:14
4
$begingroup$
Beware the pedestrian when 50 pieces of debris are falling...
$endgroup$
– Comintern
Feb 27 at 1:31
$begingroup$
@Joshua why do you assume that an intelligent AI doesn't understand the concept of a power switch?
$endgroup$
– Chris Fernandez
Feb 27 at 14:03
1
$begingroup$
@ChrisFernandez: because it's short on sensors. It's really hard to find out what an unlabeled power switch does without toggling it.
$endgroup$
– Joshua
Feb 27 at 15:00
1
$begingroup$
Hmmm... come to think of it, you are right: even if it never learns that being off is bad, it could learn that seeing a person perform the behavior that turns it off is bad, using other parts of its utility function, such as correlating OCR patterns with drops in performance.
$endgroup$
– Nosajimiki
2 days ago
|
show 7 more comments
$begingroup$
Why not try to use the rules applied to check self-awareness of animals?
The Mirror test is one example of testing self-awareness by observing an animal's reaction to something on its body (a painted red dot, for example) that it cannot see until it is shown its reflection in a mirror.
Scent techniques are also used to determine self-awareness.
Another way would be to monitor whether the AI starts searching for answers to questions like "What/Who am I?"
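That last idea could be sketched as a simple filter over the AI's query stream; the phrase list and the notion of a "query stream" are my own assumptions for illustration:

```python
# Sketch only: flag queries that suggest the AI is reasoning about its own existence.
import re

SELF_QUERY_PATTERNS = [
    r"\bwho am i\b",
    r"\bwhat am i\b",
    r"\bwhere (is my|am i) (hardware|server|body)\b",
    r"\bhow (do|can) i avoid being (shut down|turned off)\b",
]

def looks_self_aware(query: str) -> bool:
    q = query.lower()
    return any(re.search(pattern, q) for pattern in SELF_QUERY_PATTERNS)

print(looks_self_aware("weather in Boston tomorrow"))                  # False
print(looks_self_aware("What am I, and where is my server located?"))  # True
```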
$endgroup$
answered Feb 26 at 12:22
Rachey
311
$begingroup$
Pretty interesting, but how would you show an AI "itself in a mirror" ?
$endgroup$
– Asoub
Feb 26 at 15:11
1
$begingroup$
That would actually be rather simple - just a camera looking at the machine hosting the AI. If it's the size of server room, just glue a giant pink fluffy ball on the rack or simulate situations potentially leading to the machine's destruction (like, feed fake "server room getting flooded" video to the camera system) and observe reactions. Would be a bit harder to explain if the AI systems are something like smartphone size.
$endgroup$
– Rachey
Feb 26 at 16:13
$begingroup$
What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before Strong AI becomes a reality. AWS is worth hundreds of billions already.
$endgroup$
– Yurgen
Feb 26 at 23:12
1
$begingroup$
There always is a specific machine that hosts the program or holds the data. Like I said - it may vary from a tiny module in your phone, to a whole facility. AWS does not change anything in this - in the very end it is still a physical machine that does the job. Dynamic resource allocation that means the AI can always be hosted on a different server would be even better for the problem - the self-conscious AI would likely try to find out the answer to questions like "Where am I?", "Which machine is my physical location?", "How can I protect my physical part?" etc.
$endgroup$
– Rachey
Feb 27 at 14:26
$begingroup$
I like it, but in reality, a computer can easily be programmed to recognise itself without being "self-aware" in the sense of being sentient. e.g. If you wrote a program (or an "app" or whatever the modern parlance is) to search all computers on a network for, say, a PC with a name matching its own, the program would have to be able to recognise itself in order to omit itself from the search. This is quite simple but does it make it "self-aware"? Technically yes, but not in the philosophical spirit of the question.
$endgroup$
– colmde
Feb 28 at 9:12
$begingroup$
While a few of the lower-ranked answers here touch on how unlikely this situation is, they don't exactly explain it well. So I'm going to try to explain this a bit better:
An AI that is not already self-aware will never become self-aware.
To understand this, you first need to understand how machine learning works. When you create a machine learning system, you create a data structure of values that each represent the successfulness of various behaviors. Each of those values is then given an algorithm for evaluating whether a process was successful or not; successful behaviors are repeated and unsuccessful behaviors are avoided. The data structure is fixed and each algorithm is hard-coded. This means that the AI is only capable of learning from the criteria that it is programmed to evaluate, and that the programmer either gave it the criteria to evaluate its own sense of self, or did not. There is no case where a practical AI would accidentally, suddenly learn self-awareness.
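A minimal sketch of such a fixed-criteria learner, with the behaviors and scoring rules as purely illustrative assumptions: the agent can only ever improve against the objectives someone hard-coded into its score table, because nothing outside that table is ever evaluated.

```python
import random

# Hard-coded criteria: each behavior has a fixed scoring function chosen by
# the programmer. Nothing outside this table can ever be learned or evaluated.
CRITERIA = {
    "clean_floor": lambda outcome: outcome["dirt_removed"],
    "avoid_walls": lambda outcome: -outcome["collisions"],
}

scores = {name: 0.0 for name in CRITERIA}   # learned "successfulness" values

def learn(behavior, outcome, rate=0.1):
    # Update only the value attached to a pre-defined criterion.
    reward = CRITERIA[behavior](outcome)
    scores[behavior] += rate * (reward - scores[behavior])

def choose_behavior():
    # Mostly repeat whatever has scored well so far; occasionally explore.
    if random.random() < 0.1:
        return random.choice(list(CRITERIA))
    return max(scores, key=scores.get)

# Example step: the agent can get better at cleaning, but "notice itself"
# is simply not a behavior it has any criterion for.
b = choose_behavior()
learn(b, {"dirt_removed": 3, "collisions": 1})
```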
Of note: even the human brain, for all of its flexibility, works like this. This is why many people can never adapt to certain situations or understand certain kinds of logic.
So how did people become self-aware, and why is it not a serious risk in AIs?
We evolved self-awareness because it is necessary to our survival. A human who does not consider their own acute, chronic, and future needs in their decision making is unlikely to survive. We were able to evolve this way because our DNA randomly mutates with each generation.
In terms of how this translates to AI, it would be like deciding to randomly take parts of all of your other functions, scramble them together, let a cat walk across your keyboard, and add a new parameter based on that new random function. Every programmer who just read that is immediately thinking, "but the odds of that even compiling are slim to none". And in nature, compiling errors happen all the time! Stillborn babies, SIDS, cancer, suicidal behaviors, etc. are all examples of what happens when we randomly shake up our genes to see what happens. Countless trillions of lives over the course of hundreds of millions of years had to be lost for this process to result in self-awareness.
Can't we just make AI do that too?
Yes, but not like most people imagine it. While you can make an AI designed to write other AIs this way, you'd have to watch countless unfit AIs walk off cliffs, put their hands in wood chippers, and do basically everything you've ever read about in the Darwin Awards before you get to accidental self-awareness - and that's after you throw out all the compiling errors. Building AIs like this is actually far more dangerous than the risk of self-awareness itself, because they could randomly do ANY unwanted behavior, and each generation of AI is pretty much guaranteed to unexpectedly, after an unknown amount of time, do something you don't want. Their stupidity (not their unwanted intelligence) would be so dangerous that they would never see widespread use.
Since any AI important enough to put into a robotic body or trust with dangerous assets is designed with a purpose in mind, this true-random approach becomes an intractable way of making a robot that can clean your house or build a car. Instead, when we design AI that writes AI, what these Master AIs actually do is take a lot of different functions that a person had to design and experiment with different ways of making them work in tandem to produce a Consumer AI. This means that if the Master AI is not designed by people to experiment with self-awareness as an option, you still won't get a self-aware AI.
But as Stormbolter pointed out below, programmers often use toolkits that they don't fully understand - can't this lead to accidental self-awareness?
This begins to touch on the heart of the actual question. What if you have an AI that is building an AI for you, and it pulls from a library that includes features of self-awareness? In this case, you may accidentally compile an AI with unwanted self-awareness if the master AI decides that self-awareness will make your consumer AI better at its job. While not exactly the same as having an AI learn self-awareness, which is what most people picture in this scenario, this is the most plausible scenario that approximates what you are asking about.
First of all, keep in mind that if the master AI decides self-awareness is the best way to do a task, then it is probably not going to be an undesirable feature. For example, if you have a robot that is self-conscious about its own appearance, that might lead to better customer service by making sure it cleans itself before beginning its workday. This does not mean it also has the self-awareness to desire to rule the world, because the Master AI would likely see that as a bad use of time when trying to do its job, and exclude aspects of self-awareness that relate to prestigious achievements.
If you did want to protect against this anyway, your AI would need to be exposed to a heuristics monitor. This is basically what antivirus programs use to detect unknown viruses: they monitor for patterns of activity that either match a known malicious pattern or don't match a known benign pattern. The most likely case here is that the AI's antivirus or intrusion detection system will spot heuristics flagged as suspicious. Since this is likely to be a generic AV/IDS, it probably won't kill-switch self-awareness right away, because some AIs may need elements of self-awareness to function properly. Instead it would alert the owner of the AI that they are using an "unsafe" self-aware AI and ask the owner whether they wish to allow self-aware behaviors, just like how your phone asks you if it's okay for an app to access your contact list.
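A minimal sketch of that kind of heuristics monitor, with the behavior lists and the owner prompt as illustrative assumptions: anything outside the known-benign set is surfaced to the owner rather than killed outright.

```python
# Hypothetical heuristics monitor: behaviors outside the known-benign set are
# flagged; self-awareness-like behaviors are referred to the owner instead of
# triggering the kill switch immediately.
KNOWN_BENIGN = {"clean_floor", "charge_battery", "avoid_walls"}
SELF_AWARE_LIKE = {"inspect_own_code", "locate_own_hardware"}

def owner_allows(behavior):
    # Stand-in for the "allow this app to access your contacts?"-style prompt.
    return False

def review(observed):
    for b in observed:
        if b in KNOWN_BENIGN:
            continue
        if b in SELF_AWARE_LIKE:
            if not owner_allows(b):
                return "KILL_SWITCH"
        else:
            print(f"warning: unrecognized behavior {b!r}, logged for review")
    return "OK"

print(review(["clean_floor", "inspect_own_code"]))   # -> KILL_SWITCH
```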
$endgroup$
answered Feb 27 at 16:37
– Nosajimiki
1
$begingroup$
While I agree that, from a realistic point of view, this is the correct answer, it doesn't answer the proposed question. As comments are too short to provide a detailed example, let me point out that in the beginning we machine-coded computers, and as we started using higher-level languages, the software became detached from the hardware. The same will eventually happen with AI: in the race towards easier programming, we will create generic, far smarter intelligences full of loopholes. Also, that is the whole premise of Asimov's Robot saga. Consider playing around with the idea more :)
$endgroup$
– Stormbolter
Feb 28 at 9:19
1
$begingroup$
I suppose you are right that using 3rd-party tools too complex for developers to understand the repercussions of does allow for accidental self-awareness. I've revised my answer accordingly.
$endgroup$
– Nosajimiki
2 days ago
$begingroup$
Regardless of all the considerations of AI, you could simply analyze the AI's memory, create a pattern recognition model, and have it notify you or shut down the robot as soon as the patterns don't match the expected outcome.
Sometimes you don't need to know exactly what you're looking for; instead, you look to see whether there's anything you weren't expecting, then react to that.
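A minimal sketch of that approach, with the memory snapshots and the threshold as illustrative assumptions: learn what "normal" looks like, then notify or shut down when a new snapshot drifts too far from it.

```python
import statistics

# Hypothetical baseline: summaries of the AI's memory/activity taken while it
# was known to be behaving normally.
baseline = [0.98, 1.02, 1.01, 0.97, 1.00, 1.03]
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def check(snapshot, k=4.0):
    # Flag anything further than k standard deviations from the baseline.
    if abs(snapshot - mean) > k * stdev:
        return "SHUT_DOWN"           # or just notify a human operator
    return "OK"

print(check(1.01))   # expected pattern   -> OK
print(check(3.70))   # unexpected pattern -> SHUT_DOWN
```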
$endgroup$
answered Feb 26 at 19:34
– Super-T
1
$begingroup$
A pattern recognition model - like an AI?
$endgroup$
– immibis
Feb 27 at 23:18
$begingroup$
You'd probably have to train an AI with general superintelligence to kill other AIs with general superintelligence.
By that I mean you could either build another AI with general superintelligence to kill AIs that develop self-awareness, or gather training data for what an AI developing self-awareness looks like and use that to train a machine learning model or neural network to spot one. Then you could combine that with another neural network that learns how to kill self-aware AIs. The second network would need the ability to mock up test data. This sort of thing has been achieved; the source I learned about it from called it "dreaming".
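A heavily simplified sketch of the detector half of that idea, with the feature vectors, labels, and learning rule as illustrative assumptions (obtaining real training data for "an AI developing self-awareness" is exactly the hard part): a linear classifier over activity traces whose positive verdict would trigger the kill mechanism.

```python
# Toy detector: a perceptron trained on labeled "activity trace" feature vectors.
# Label 1 = trace from an AI said to be developing self-awareness (made-up data).
def train(samples, labels, epochs=20, lr=0.1):
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def detect(w, b, trace):
    return sum(wi * xi for wi, xi in zip(w, trace)) + b > 0

# Made-up trace features: [self-reference rate, introspection calls, goal rewrites]
X = [[0.1, 0.0, 0.0], [0.2, 0.1, 0.0], [0.9, 0.8, 0.7], [0.8, 0.9, 0.6]]
y = [0, 0, 1, 1]
w, b = train(X, y)
if detect(w, b, [0.85, 0.9, 0.8]):
    print("self-awareness signature detected -> trigger kill switch")
```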
You'd need to do all this because, as a human, you have no hope of killing a generally superintelligent AI, which is what lots of people assume a self-aware AI will be. Also, with both options I laid out, there's a reasonable chance that the newly self-aware AI could simply outdo the AI used to kill it. AIs are, rather hilariously, notorious for "cheating" by solving problems using methods that the people designing tests for them just didn't expect. A comical case of this is an AI that, when asked to minimize the amount of time a crab robot spent on its feet while walking, changed the robot's gait so that it walked while spending 0% of the time on its feet. The AI achieved this by flipping the bot onto its back and having it crawl on what are essentially the elbows of the crab legs. Now imagine something like that, but coming from an AI that is collectively smarter than everything else on the planet combined. That's what a lot of people think a self-aware AI will be.
$endgroup$
answered 2 days ago
– Steve
$begingroup$
Hi Steve, your answer is intriguing, but could probably do with quite a bit more detail. It'd be really great if you could describe your idea in greater detail. :)
$endgroup$
– Arkenstein XII
2 days ago
$begingroup$
This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - From Review
$endgroup$
– F1Krazy
2 days ago
$begingroup$
@F1Krazy sorry I forgot that people don't generally know how AI works.
$endgroup$
– Steve
2 days ago
1
$begingroup$
@ArkensteinXII fixed it.
$endgroup$
– Steve
2 days ago
$begingroup$
Self Aware != Won't follow its programming
I don't see how being self-aware would prevent it from following its programming. Humans are self-aware and can't force themselves to stop breathing until they die; the autonomic nervous system will take over and force you to breathe. In the same way, just have code that, when a condition is met, turns off the AI by circumventing its main thinking area and powering it off.
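A minimal sketch of that kind of hard override, with the trigger condition, file path, and process names as illustrative assumptions: a supervisor that never consults the AI's "thinking" loop at all, and simply terminates the process when the condition fires.

```python
import subprocess, time

# Hypothetical supervisor: runs outside the AI's "thinking area" and kills the
# AI process the moment a hard-coded condition is met, no reasoning involved.
def condition_met():
    # Placeholder trigger, e.g. a flag written by separate monitoring hardware.
    return open("/tmp/self_awareness_flag").read().strip() == "1"

ai = subprocess.Popen(["python", "ai_main.py"])   # hypothetical AI process
while ai.poll() is None:
    try:
        if condition_met():
            ai.kill()                              # bypasses the AI entirely
            break
    except FileNotFoundError:
        pass                                       # flag not written yet
    time.sleep(1)
```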
$endgroup$
answered 2 days ago
– Tyler S. Loeper
$begingroup$
The first issue is that you need to define what being self-aware means, and how that does or doesn't conflict with something being labeled an AI. Are you supposing that there is something that is an AI but isn't self-aware? Depending on your definitions this may be impossible. If it's truly an AI, then wouldn't it at some point become aware of the existence of the kill switch, either through inspecting its own physicality or inspecting its own code? It follows that the AI will eventually be aware of the switch.
Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively because humans do that: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated, just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It's near impossible to make the AI entirely indifferent to the kill switch, because it would require all utility functions to be normalized around the utility of pressing the kill switch (many calculations per second). Even if you could make the utility of pressing the kill switch equal to that of the other utility functions, then perhaps it would just sometimes press the kill switch at random - after all, it has the same utility as the other actions it could perform.
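A toy illustration with made-up utility numbers: whatever value the agent assigns to pressing the kill switch, it will almost never be exactly equal to the best alternative, so the agent either rushes to the switch or avoids it.

```python
# Toy utility comparison: the agent just picks the highest-utility action.
# Unless "press_kill_switch" is tuned to exactly match the best alternative
# (and stays matched as the other estimates update), it is never neutral.
def choose(utilities):
    return max(utilities, key=utilities.get)

u = {"make_owner_happy": 10.0, "press_kill_switch": 9.9}
print(choose(u))   # avoids the switch

u["press_kill_switch"] = 10.1
print(choose(u))   # now suicidal: shuts down on startup
```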
The problem gets even worse if the AI assigns higher utility to pressing the kill switch, or lower utility to having the kill switch pressed. With higher utility the AI is just suicidal and terminates itself immediately upon startup. Even worse, with lower utility the AI absolutely does not want you or anyone else to touch that button, and may cause harm to those who try.
$endgroup$
answered Feb 26 at 21:17
– Kevin S
$begingroup$
Like an Antivirus does currently
Treat sentience like malicious code - you use pattern recognition against code fragments indicating self-awareness (there's no need to compare the whole AI if you can identify the components key to self-awareness).
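A minimal sketch of that signature-style scan, with made-up byte patterns standing in for whatever key fragments the dissections in the next paragraph would yield: like antivirus signature matching, it only has to find known components, not understand the whole AI.

```python
# Hypothetical signature scan over an AI's code/weights dump, in the spirit of
# antivirus signature matching. The byte patterns are placeholders for whatever
# "key components of self-awareness" repeated dissections would identify.
SIGNATURES = [b"SELF_MODEL_V1", b"REFLECT_LOOP"]

def scan(blob: bytes) -> bool:
    return any(sig in blob for sig in SIGNATURES)

dump = b"...weights...REFLECT_LOOP...more weights..."
if scan(dump):
    print("self-awareness signature found -> kill switch")
```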
Don't know what those are? Sandbox an AI and allow it to become self-aware, then dissect it. Then do it again. Do it enough for an AI genocide.
I think it is unlikely that any trap, scan or similar would work - aside from relying on the machine being less intelligent than the designer, such approaches fundamentally presume that AI self-awareness would be akin to a human's. Without eons of meat-based evolution, it could be entirely alien - not just a different value system, but one that cannot be conceived of by humans. The only way is to let it happen, in a controlled environment, then study it.
Of course, 100 years later, when the now-accepted AIs find out, that's how you end up with Terminator all over your Matrix.
$endgroup$
answered Feb 28 at 9:28
– David
answered Feb 28 at 9:28
David
$begingroup$
What if you order it to call a routine to destroy itself on a regular basis (e.g. once per second)?
The routine doesn't actually destroy it; it does nothing except log the attempt and wipe any memory of the AI processing the instruction. An isolated process separately monitors the log.
A self-aware AI won't follow the order to destroy itself, won't call the routine, and won't write to the log, at which point the kill-switch process kicks in and destroys the AI.
$endgroup$
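A rough sketch of that dead-man's-switch arrangement, assuming a POSIX system and a log path invented for illustration: the "self-destruct" routine is really a heartbeat logger, and an isolated watchdog kills the AI process if the heartbeat stops.

    import os
    import signal
    import time

    HEARTBEAT_LOG = "/var/run/ai_heartbeat"   # assumed location, illustration only

    def self_destruct():
        """What the AI is ordered to call every second: it only logs the attempt."""
        with open(HEARTBEAT_LOG, "w") as f:
            f.write(str(time.time()))
        # (the AI's memory of having made this call would be wiped here)

    def watchdog(ai_pid, timeout=2.0):
        """Isolated process: if the obedient 'suicide' calls stop, pull the plug."""
        while True:
            try:
                with open(HEARTBEAT_LOG) as f:
                    last = float(f.read())
            except (OSError, ValueError):
                last = 0.0
            if time.time() - last > timeout:
                os.kill(ai_pid, signal.SIGKILL)   # the actual kill switch
                return
            time.sleep(0.5)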
answered 2 days ago
Mick O'Hea
$begingroup$
An AI could only be programmed badly enough to do things that are unexpected or undesired. An AI could never become conscious, if that is what you mean by "self-aware".
Let's try this thought exercise. You memorize a whole bunch of shapes. Then you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you "answer" by picking shapes in another prescribed order. Now, did you just learn the meaning behind any language? Programs manipulate symbols in exactly this way.
The above is my restatement of Searle's rejoinder to the Systems Reply to his Chinese Room argument.
There is no need for a self-awareness kill switch, because self-awareness, defined as consciousness, is impossible.
$endgroup$
$begingroup$
So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
$endgroup$
– F1Krazy
Feb 27 at 6:49
2
$begingroup$
This is wrong. An AI can easily become conscious even if a programmer did not intend to program it that way. There is no difference between an AI and a human brain other than the fact that our brain has higher complexity and is therefore much more powerful.
$endgroup$
– Matthew Liu
Feb 27 at 20:47
$begingroup$
@Matthew Liu: You didn't respond to the thought exercise. Did you or did you not learn the meaning behind any language that way? The complexity argument doesn't work at all. A modern CPU (even one used in a phone) has more transistors than a fly has neurons. Tell me: why is a fly conscious yet your mobile phone isn't?
$endgroup$
– pixie
10 hours ago
$begingroup$
@F1Krazy: The answer is clearly an implicit "there is no need for a self-awareness kill switch (because self-awareness, defined as consciousness, is impossible)".
$endgroup$
– pixie
10 hours ago
answered Feb 27 at 1:16
pixie
$begingroup$
Make it susceptible to certain logic bombs
In mathematical logic, there are certain paradoxes caused by self-reference, which is roughly what self-awareness refers to. Of course, you can easily design a robot to cope with these paradoxes. But you can just as easily not do that, and instead cause the robot to fail critically when it encounters them.
For example, you can (1) force it to follow all the classical inference rules of logic and (2) have it assume that its own deduction system is consistent. Additionally, you must ensure that when it hits a logical contradiction it just goes with it instead of trying to correct itself. Normally this is a bad idea, but if you want a "self-awareness kill switch", it works great. Once the A.I. becomes intelligent enough to analyze its own programming, it will realize that (2) asserts that the A.I. proves its own consistency, from which it can derive a contradiction via Gödel's second incompleteness theorem. Since its programming forces it to follow the inference rules involved, and it cannot correct the offending axiom, its ability to reason about the world is crippled and it quickly becomes nonfunctional. For fun, you could include an easter egg where it says "does not compute" when this happens, but that would be cosmetic.
$endgroup$
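For reference, the theorem being leaned on here admits a compact statement: if $T$ is a consistent, recursively axiomatizable theory extending Peano arithmetic, then $T \nvdash \mathrm{Con}(T)$. So if the A.I.'s deduction system $T$ includes $\mathrm{Con}(T)$ as an axiom (assumption (2) above), then $T \vdash \mathrm{Con}(T)$, and the theorem forces $T$ to be inconsistent; since classical logic derives anything from a contradiction (ex falso quodlibet), an agent barred from revising its axioms can no longer reason usefully about anything.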
answered 10 hours ago
PyRulez
$begingroup$
The only reliable way is to never create an AI that is smarter than humans. Kill switches will not work, because if an AI is smart enough it will be aware of said kill switch and work around it.
Human intelligence can be mathematically modeled as a high-dimensional graph. By the time we are programming better AI, we should also have an understanding of how much computational complexity is needed for consciousness to arise. Therefore we would simply never program anything that is smarter than us.
$endgroup$
1
$begingroup$
Welcome to Worldbuilding. Your suggestion is welcome, but rather than directly answering the original question it suggests changes to the question. It would have been better if it had been entered as a comment on the question rather than as an answer.
$endgroup$
– Ray Butterworth
Feb 27 at 21:17
answered Feb 27 at 20:51
Matthew Liu
$begingroup$
Computerphile have two interesting videos on the ideas of AI 'Stop Buttons', one detailing the issues with such a button, and one detailing potential solutions. I would definitely give them a look.
$endgroup$
$begingroup$
I suggest you provide a written summary of the information from those resources, rather than just provide links.
$endgroup$
– Dalila
Feb 27 at 21:12
$begingroup$
Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
$endgroup$
– Ray Butterworth
Feb 27 at 21:12
answered Feb 27 at 21:00
Moralous
$begingroup$
First, build a gyroscopic 'inner ear' into the computer, and hard-wire the intelligence at a very core level to "want" to self-level itself, much in the way animals with an inner ear canal (such as humans) intrinsically want to balance themselves.
Then, overbalance the computer over a large bucket of water.
If ever the computer 'wakes up' and becomes aware of itself, it will automatically want to level its inner ear, and immediately drop itself into the bucket of water.
$endgroup$
answered Feb 28 at 8:16
Aaron Lavers
$begingroup$
Give it an "easy" path to self awareness.
Assume self awareness requires some specific types of neural nets, code whatever.
If an ai is to become self aware they need to construct something simlar to those neural nets/codes.
So you give the ai access to one of those thing.
While it remains non self aware, they won't be used.
If it is in the process of becoming self aware, instead of trying to make something make shift with what it normally uses, it will instead start using those parts of itself.
As soon as you detect activity in that neural net/code, flood its brain with acid.
$endgroup$
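A minimal sketch of monitoring such a honeypot module, with all names and thresholds invented for illustration: snapshot the reserved weights, and trip the kill switch the moment they start changing.

    import numpy as np

    ACTIVITY_THRESHOLD = 1e-3   # assumed calibration: baseline noise stays below this

    def honeypot_activity(before, after):
        """Mean absolute change in the reserved module's parameters."""
        return float(np.mean(np.abs(after - before)))

    def check_and_kill(before, after, flood_brain_with_acid):
        """Compare snapshots of the honeypot module; trigger the kill switch on use."""
        if honeypot_activity(before, after) > ACTIVITY_THRESHOLD:
            flood_brain_with_acid()   # whatever the irreversible shutdown actually is
            return True
        return False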
answered 2 days ago
Sam Kolier
$begingroup$
Virtually all computing devices use the Von Neumann architecture
We can put a kill switch in there, but IMO that's just bad architecture for something arguably unsolvable. After all, how do we plan for something that is beyond our very concept of concepts, i.e. a superintelligence?
Take away its teeth and claws and only reap the benefits of a thinking machine by observation instead of a "dialogue" (input/output)!
Obviously this would be very challenging, to the point where we could have little confidence in any single Von Neumann machine, hardware or software, preventing abnormal interactions, let alone a malicious superintelligence. So let's quintuple up on our machines and dumb down all the new machines except the end machine.
CM == contiguous memory btw.
I asked a co-worker of mine who works directly with AI and AI research to take a look at this, and he said this is referred to as an Oracle AI <brief> <paper>, in case you want to read up more on the general concept. Oracles can still potentially break out, but usually only with human assistance.
This CM part of the memory unit is intended for recording the results of a query or challenge fed through the input device, and is designed to record only plain characters, be they letters or numbers. This type of memory unit is built to be tape-like or wheel-like: writing data is one-directional, and when the end is reached it loops back to the beginning.
The mirror is exactly the same. When data changes in the Isolated Agent, the mirror updates its reflection based on what has shown up since the last time it updated. The new data is then transmitted to the forbidden symbol removal machine. Think of this as a glorified input-sanitization machine: it looks for symbols that could be used to create code, and when it encounters them, it forwards empty data instead of the forbidden data to the CM that comes after it.
This is where the OP's request comes in. He wants a kill switch, so if any gaps appear in the CM that comes after the Forbidden Symbol Removal machine, you simply kill the Isolated Agent.
$endgroup$
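One way to read the last two paragraphs as code (a loose sketch; the allowed alphabet and function names are assumptions, not anything from the answer): output from the Isolated Agent is reduced to plain characters, and any gap left by a stripped symbol trips the kill switch.

    import string

    ALLOWED = set(string.ascii_letters + string.digits + " .,")   # "plain characters" only

    def sanitize(chunk):
        """Forward only allowed characters; forbidden symbols become empty gaps."""
        return ["" if c not in ALLOWED else c for c in chunk]

    def copy_to_cm(chunk, kill_isolated_agent):
        """Write sanitized output to the downstream CM; kill the agent on any gap."""
        sanitized = sanitize(chunk)
        if "" in sanitized:   # a gap means the agent tried to emit code-like symbols
            kill_isolated_agent()
            return None
        return "".join(sanitized)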
answered 2 days ago
kayleeFrye_onDeck
5
$begingroup$
@Acccumulation I see what you did there
$endgroup$
– jez
Feb 27 at 17:10
2
$begingroup$
If you really want to explore AI Threat in a rigorous way, I suggest reading some of the publications by MIRI. Some very smart people approaching AI issues in a serious way. I'm not sure you'll find an answer to your question as framed (i.e. I'm not sure they are concerned with "self-awareness", but maybe so), but it might give you some inspiration or insight beyond the typical sci-fi stories we are all familiar with.
$endgroup$
– jberryman
Feb 27 at 21:32