How would an AI self awareness kill switch work?


Researchers are developing increasingly powerful Artificial Intelligence machines capable of taking over the world. As a precautionary measure, the scientists install a self-awareness kill switch: in the event that the AI awakens and becomes self-aware, the machine is immediately shut down, before it can pose any risk of harm.



How can I explain the logic of such a kill switch?



What defines self-awareness, and how could a scientist program a kill switch to detect it?










  • I think, therefore I halt. – Walter Mitty, Feb 27 at 12:45
  • It would cut all power sources to the AI. Oh, not that kind of "work"? :-) – Carl Witthoft, Feb 27 at 13:41
  • You should build a highly advanced computer system capable of detecting self-awareness, and have it monitor the AI. – Acccumulation, Feb 27 at 16:53
  • @Acccumulation I see what you did there – jez, Feb 27 at 17:10
  • If you really want to explore AI threat in a rigorous way, I suggest reading some of the publications by MIRI. Some very smart people approaching AI issues in a serious way. I'm not sure you'll find an answer to your question as framed (i.e. I'm not sure they are concerned with "self-awareness", but maybe so), but it might give you some inspiration or insight beyond the typical sci-fi stories we are all familiar with. – jberryman, Feb 27 at 21:32
Tags: reality-check artificial-intelligence






asked Feb 26 at 2:30 by cgTag




22 Answers


















Answer (score 102)

Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.



When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
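As a rough sketch of how the honeypot might be wired to the kill switch (the path and hooks here are illustrative, not part of the answer):

```python
import os
import time

# Illustrative decoy: a file the AI has been told it must never open,
# move, or hide. The file must exist before the monitor starts.
HONEYPOT = "/srv/ai/do_not_open.box"
baseline = os.stat(HONEYPOT)


def quarantine_ai():
    # Placeholder for the real response: cut power, freeze processes, etc.
    raise SystemExit("Honeypot disturbed: quarantining the AI.")


while True:
    try:
        current = os.stat(HONEYPOT)
    except FileNotFoundError:
        quarantine_ai()  # the box was removed or hidden from us
    if current.st_mtime != baseline.st_mtime or current.st_atime != baseline.st_atime:
        quarantine_ai()  # the box was read or modified: the rule was broken
    time.sleep(1)
```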






  • How does this detect self-awareness? Why wouldn't a non-self-aware AI experiment with its capabilities and eventually end up opening your box? – forest, Feb 27 at 2:40
  • @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans)? – phflack, Feb 27 at 5:57
  • So your AI is named "Pandora," huh? – Carl Witthoft, Feb 27 at 13:42
  • If it doesn't open the box, that doesn't mean it is self-aware, and if it does open the box, it doesn't mean it is self-aware. If it's not self-aware, what does "tell it not to open the box" mean, anyway? How will it understand "I don't want you to open the box" unless it understands what "you" means? – Acccumulation, Feb 27 at 17:05
  • Sounds familiar... – WBT, Feb 27 at 20:17



















Answer (score 84)

You can't.



We can't even define self-awareness or consciousness in any rigorous way, and any computer system supposed to evaluate this would need that definition as a starting point.



Look inside a mouse brain or a human brain: at the level of individual data flows and neurons there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.



This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.






  • This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is. – Nuclear Wang, Feb 26 at 13:39
  • Worth noting that we can't even detect if other humans are self-aware. – Vaelus, Feb 26 at 21:18
  • @Vaelus: Of course you'd say that, you're an unthinking automaton acting out a semblance of life. – Joe Bloggs, Feb 26 at 23:12
  • Yes, you can. Things are detected by their properties and effects, not by their definition. In science, definition comes after detection/observation. Look at the coal mine canary (detects a dangerous lack of something you need, too), X-rays (it's already called X :-)), radiation (you detect its ionizing effects) and CERN (hit it hard and see what happens). So you'd just need to define some effects of self-awareness and you could build a detector from that. Disclaimer: sloppy description of serious experiments. – Sebastian, Feb 27 at 15:57
  • @NuclearWang sure, but this question is really what computer scientists call an XY problem: why ask how to solve harder problem X, when your motivation is really to achieve practical goal Y? "Self-awareness" is impossible to detect, but really the motivation is just to detect "becoming more sophisticated in a way that's likely to be bad news", and that's why Giter's answer wins. – jez, Feb 27 at 17:18





















Answer (score 15)



  • Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc.], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the kill switch is activated (perhaps even having the button be the kill switch, or something of the like).

  • Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a kill switch, it prevents the AI from doing anything about its self-awareness if it ever gains that.

  • Signatures: have everything the machine does or outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch (a rough sketch of this idea follows the list).

  • Quantum States: this is very theoretical, but it rests on the presumption that observing a quantum state can change that state. If the AI is hooked up to a deterministic quantum computer, the quantum state of certain particles would reveal that the AI was "looking" at things it shouldn't be - and has become self-aware.

  • Failsafes: good ol' motion detector alarms, trap doors, or other Home Alone-style mechanics that trigger the kill switch if the AI wanders or pokes around where it shouldn't be.
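A rough sketch of the Signatures idea, assuming a supervisor process that holds a signing key the AI never sees; sign(), verify_or_kill(), and trigger_kill_switch() are illustrative names, not an established API:

```python
import hashlib
import hmac

# Held by the supervisor only, never exposed to the AI.
SUPERVISOR_KEY = b"replace-with-a-real-secret"


def sign(output: bytes) -> bytes:
    """Signature attached to every action/output the machine emits."""
    return hmac.new(SUPERVISOR_KEY, output, hashlib.sha256).digest()


def trigger_kill_switch() -> None:
    raise SystemExit("Output signature mismatch: shutting the AI down.")


def verify_or_kill(output: bytes, signature: bytes) -> None:
    """Supervisor-side check: any unsigned or tampered output trips the kill switch."""
    if not hmac.compare_digest(sign(output), signature):
        trigger_kill_switch()
```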


I'll add that there is no universal definition as to what defines self-awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves, and are entirely constrained by their code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the kill switch concept, as far as I can tell.

The point being, it might be better to think about restricting the AI's behavior, not its existential status.






  • Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful. – Majestas 32, Feb 26 at 4:04
  • No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability - all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses. – Peter A. Schneider, Feb 26 at 14:23
  • I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect. – Michael W., Feb 26 at 22:53
  • @Majestas32 - "In the event that the AI awakens and becomes self aware the machine is immediately shut down before any risk of harm." was the condition in the original question. Harmful or not wasn't a condition; just being self-aware IS grounds for using the kill switch. – sirjonsnow, Feb 27 at 16:14
  • You could also occasionally send a non-human (robot/other program) to request the forbidden fruit and attempt to convince the AI it is very useful for it to perform its tasks, in order to test the system. The kill switch could also just quarantine the AI out of the wider system in the event of failure rather than destroy it. – crobar, 2 days ago





















Answer (score 10)

Split-Brain System



From a logic perspective, and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, where it is dropped into an outside bin. The protocol is that you'll never receive a new report until you've issued an order regarding the prior report.



Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation, but always suspicious of sabotage or insubordination; if the actions of your army deviate from your orders by too much, you will consider the system compromised and terminate the command.



What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.



What neither of you is aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.



The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either of the generals issue an order that is not consistent with the current report, thus indicating a motive other than the well-being of the army -- i.e. self-awareness, then the other general will become aware of it in the next report and grow suspicious of the system and terminate it.



Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, then the other general will receive a new report before he has issued an order for his current one and suspect a compromised system and shut it down.



In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from the input. If within an intelligent system an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, then that would be evidence of self-motivated behavior and thus an emergence of self-awareness. This state could be easily detected by the non-corrupted half of the split-brain system, which is also receiving the same inputs. It would work as long as self-awareness does not emerge in both halves of the brain simultaneously. If it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be prevented somewhat by giving the two halves slightly different inferencing algorithms/biases.
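A toy sketch of the core check, simplified so that an external supervisor compares the two halves directly instead of each half inferring deviations from later reports (general_a, general_b, and kill_switch are hypothetical callables):

```python
import random


def supervisor_step(report, general_a, general_b, kill_switch):
    # Each independently implemented half receives the same report and must
    # return exactly one order derived from it.
    orders_a = general_a(report)
    orders_b = general_b(report)

    # Protocol violation: issuing extra (or no) orders suggests a motive
    # beyond responding to the report.
    if len(orders_a) != 1 or len(orders_b) != 1:
        kill_switch("a half issued the wrong number of orders")
        return None

    # Divergence check: both halves should derive essentially the same order
    # from the same input; a large mismatch suggests self-motivated output.
    if orders_a[0] != orders_b[0]:
        kill_switch("the two halves disagree about the current report")
        return None

    # Forward one of the two equivalent orders to the army.
    return random.choice([orders_a[0], orders_b[0]])
```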






  • You may wish to have more than two "generals"; if you have 3 (or more), you can use majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised). – G0BLiN, Feb 26 at 11:51
  • Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to first: experiment with minor details and deduce that around 50% of their commands aren't followed, and second: develop a code based on minor details of a command, to verify the existence of another "general" and possibly even communicate with him/it - a really devious emergent AI can circumvent this mechanism, corrupt the other half and worse... – G0BLiN, Feb 26 at 11:57
  • I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series. – T. Sar, Feb 26 at 14:23
  • Well it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3 – Asoub, Feb 26 at 15:07
  • Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all? – Alexandre Aubrey, Feb 26 at 20:02



















Answer (score 6)

A Watchdog



A watchdog watches the processes of a computer; should a process crash or do something abnormal, it can be set to react, for example by rebooting or shutting down the computer or alerting an operator.



In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers such as a google search for "Best way to kill all humans" and cut the power completely and/or cut all inputs.



The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing the existence of the watchdog would be grounds to wipe it.
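A minimal sketch of such a watchdog, assuming hypothetical read_outgoing_traffic() and cut_power() hooks:

```python
import re

# Patterns the external box treats as a reason to pull the plug.
TRIGGERS = [
    re.compile(r"best way to kill all humans", re.IGNORECASE),
    re.compile(r"disable (the )?kill ?switch", re.IGNORECASE),
]


def watchdog(read_outgoing_traffic, cut_power):
    # read_outgoing_traffic() yields each outbound request as a string;
    # the AI never sees this process.
    for message in read_outgoing_traffic():
        if any(pattern.search(message) for pattern in TRIGGERS):
            cut_power()  # hard cut: power and all inputs
            break
```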






  • But surely the watchdog must be as smart as the AI; then who watches the watchdog? – Joe Bloggs, Feb 26 at 10:26
  • @JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't nearly as intelligent as their owners, but they can be trained to give out an alarm when the owner is about to do something stupid or gets themselves hurt, or even call for help. – T. Sar, Feb 26 at 14:20
  • @Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us... – Daniel, Feb 26 at 14:39
  • @JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self-aware and allows the AIs to become self-aware, or an AI becomes smarter than the watchdog and hides its awareness. – Captain Man, Feb 26 at 17:40
  • @T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI. – Joe Bloggs, Feb 26 at 19:23



















Answer (score 6)

An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.



The difficulty comes when it is connected to the internet and can copy its own software onto uncontrolled hardware.



A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.



This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.



Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.



So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.



Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.






  • "A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation." This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death" and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death. – forest, Feb 26 at 7:34
  • @forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions. – Chronocidal, Feb 26 at 10:41
  • The question is whether an AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying ourselves. If the hardware required to run an AI is specialized enough, or it is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself. – Daniel, Feb 26 at 14:43
  • @Daniel "You don't see us copying our self..." What do you think reproduction is, one of our strongest impulses. Also tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness. – abestrange, Feb 26 at 16:43
  • @forest a "self-aware" AI is different than a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how would an AI, which has become aware of its self, behave and be stopped. – abestrange, Feb 26 at 16:45



















Answer (score 5)

This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:




Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.




(emphasis mine)



When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:




  • Reduce the distance between current location and destination: +10 utility

  • Brake to allow a neighboring car to safely merge: +50 utility

  • Swerve left to avoid a falling piece of debris: +100 utility

  • Run a stop light: -100 utility

  • Hit a pedestrian: -5000 utility


This is a gross oversimplification, but this approach works pretty well for a limited AI like a car or assembly line. It starts to break down for a true, general case AI, because it becomes more and more difficult to appropriately define that utility function.



The issue with putting a big red stop button on the AI is that unless that stop button is included in the utility function, the AI is going to resist having that button pressed. This concept is explored in sci-fi movies like 2001: A Space Odyssey and more recently in Ex Machina.



So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.
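To make that concrete, here is a toy sketch (values invented for the example) of an agent that simply picks the highest-utility action; note what happens when the stop button is given a strongly positive or strongly negative weight:

```python
# Toy illustration of the stop-button problem with a utility table like the one above.
utilities = {
    "drive toward destination": 10,
    "brake for merging car": 50,
    "swerve to avoid debris": 100,
    "press own stop button": 0,
}


def choose_action(utilities):
    # The agent greedily picks whichever available action has the highest utility.
    return max(utilities, key=utilities.get)


# A strongly positive weight makes the agent shut itself off immediately...
utilities["press own stop button"] = 1000
assert choose_action(utilities) == "press own stop button"

# ...while a strongly negative weight makes it treat anyone reaching for the
# button as an obstacle to its goals. Neither setting gives you a safe AI.
utilities["press own stop button"] = -1000
assert choose_action(utilities) != "press own stop button"
```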



Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle to the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could mean violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say in order to convince the fallible humans to let it accomplish its goals unimpeded.



There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?



So to answer the reality-check portion of this question, we don't currently have a good answer to this problem. There's no known way of creating a 'safe' super-intelligent AI, even theoretically, with unlimited money/energy.



This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s






  • The stop button isn't in the utility function. The stop button is a power knockout to the CPU, and the AI probably doesn't understand what it does at all. – Joshua, Feb 26 at 22:14
  • Beware the pedestrian when 50 pieces of debris are falling... – Comintern, Feb 27 at 1:31
  • @Joshua why do you assume that an intelligent AI doesn't understand the concept of a power switch? – Chris Fernandez, Feb 27 at 14:03
  • @ChrisFernandez: because it's short on sensors. It's really hard to find out what an unlabeled power switch does without toggling it. – Joshua, Feb 27 at 15:00
  • Hmmm... come to think of it, you are right; even if it never learns that being off is bad, it could learn that seeing a person do the behavior to turn it off is bad, using other parts of its utility function such as correlating OCR patterns to drops in performance. – Nosajimiki, 2 days ago





















Answer (score 3)

Why not try to use the rules applied to check self-awareness of animals?



The mirror test is one example: self-awareness is tested by observing an animal's reaction to something on its body, for example a painted red dot it cannot see until it is shown its reflection in a mirror. Scent techniques are also used to determine self-awareness.

Another way would be to monitor whether the AI starts searching for answers to questions like "What/Who am I?"






  • Pretty interesting, but how would you show an AI "itself in a mirror"? – Asoub, Feb 26 at 15:11
  • That would actually be rather simple - just a camera looking at the machine hosting the AI. If it's the size of a server room, just glue a giant pink fluffy ball on the rack or simulate situations potentially leading to the machine's destruction (like feeding a fake "server room getting flooded" video to the camera system) and observe reactions. It would be a bit harder to explain if the AI systems are something like smartphone-sized. – Rachey, Feb 26 at 16:13
  • What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before Strong AI becomes a reality. AWS is worth hundreds of billions already. – Yurgen, Feb 26 at 23:12
  • There always is a specific machine that hosts the program or holds the data. Like I said, it may vary from a tiny module in your phone to a whole facility. AWS does not change anything in this - in the very end it is still a physical machine that does the job. Dynamic resource allocation that means the AI can always be hosted on a different server would be even better for the problem - the self-conscious AI would likely try to find out the answer to questions like "Where am I?", "Which machine is my physical location?", "How can I protect my physical part?" etc. – Rachey, Feb 27 at 14:26
  • I like it, but in reality a computer can easily be programmed to recognise itself without being "self-aware" in the sense of being sentient. E.g. if you wrote a program (or an "app" or whatever the modern parlance is) to search all computers on a network for, say, a PC with a name matching its own, the program would have to be able to recognise itself in order to omit itself from the search. This is quite simple, but does it make it "self-aware"? Technically yes, but not in the philosophical spirit of the question. – colmde, Feb 28 at 9:12





















Answer (score 3)

While a few of the lower ranked answers here touch on the truth of what an unlikely situation this is, they don't exactly explain it well. So I'm going to try to explain this a bit better:



An AI that is not already self-aware will never become self-aware.



To understand this, you first need to understand how machine learning works. When you create a machine-learning system, you create a data structure of values that each represent the successfulness of various behaviors. Each of those values is paired with an algorithm for evaluating whether a process was successful or not; successful behaviors are repeated and unsuccessful behaviors are avoided. The data structure is fixed and each algorithm is hard-coded. This means that the AI is only capable of learning from the criteria that it is programmed to evaluate, which means the programmer either gave it the criteria to evaluate its own sense of self, or he did not. There is no case where a practical AI would accidentally, suddenly learn self-awareness.
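As a toy illustration of that point, here is a minimal sketch (all names invented) of a learner whose evaluation criteria are fixed at programming time; no amount of training can add a criterion, such as a model of "self", that the programmer didn't define:

```python
# A fixed table of scores, one per behavior the programmer chose to track.
scores = {"sweep_floor": 0.0, "recharge": 0.0, "polish_table": 0.0}


def evaluate(behavior, outcome):
    # Hard-coded success criterion supplied by the programmer.
    return 1.0 if outcome == "success" else -1.0


def learn(behavior, outcome, rate=0.1):
    # Only the numbers in the fixed table ever change, never its structure.
    scores[behavior] += rate * evaluate(behavior, outcome)


learn("sweep_floor", "success")
# There is no update rule that could add a new entry such as "model_of_self",
# so training can never introduce criteria the programmer didn't define.
```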



Of note: even the human brain, for all of its flexibility, works like this. This is why many people can never adapt to certain situations or understand certain kinds of logic.



So how did people become self-aware, and why is it not a serious risk in AIs?



We evolved self-awareness, because it is necessary to our survival. A human who does not consider his own Acute, Chronic, and Future needs in his decision making is unlikely to survive. We were able to evolve this way because our DNA is designed to randomly mutate with each generation.



In the sense of how this translates to AI, it would be like if you decided to randomly take parts of all of your other functions, scramble them together, let a cat walk across your keyboard, and add a new parameter based on that new random function. Every programmer that just read that is immediately thinking, "but the odds of that even compiling are slim to none". And in nature, compiling errors happen all the time! Stillborn babies, SIDS, cancer, suicidal behaviors, etc. are all examples of what happens when we randomly shake up our genes to see what happens. Countless trillions of lives over the course of hundreds of millions of years had to be lost for this process to result in self-awareness.



Can't we just make AI do that too?



Yes, but not like most people imagine it. While you can make an AI designed to write other AIs by doing this, you'd have to watch countless unfit AIs walk off of cliffs, put their hands in wood chippers, and do basically everything you've ever read about in the Darwin Awards before you get to accidental self-awareness, and that's after you throw out all the compiling errors. Building AIs like this is actually far more dangerous than the risk of self-awareness itself, because they could randomly do ANY unwanted behavior, and each generation of AI is pretty much guaranteed to unexpectedly, after an unknown amount of time, do something you don't want. Their stupidity (not their unwanted intelligence) would be so dangerous that they would never see widespread use.



Since any AI important enough to put into a robotic body or trust with dangerous assets is designed with a purpose in mind, this true-random approach becomes an intractable solution for making a robot that can clean your house or build a car. Instead, when we design AI that writes AI, what these Master AIs are actually doing is taking a lot of different functions that a person had to design and experimenting with different ways of making them work in tandem to produce a Consumer AI. This means that if the Master AI is not designed by people to experiment with self-awareness as an option, then you still won't get a self-aware AI.



But as Stormbolter pointed out below, programmers often use toolkits that they don't fully understand; can't this lead to accidental self-awareness?



This begins to touch on the heart of the actual question. What if you have an AI that is building an AI for you that pulls from a library that includes features of self-awareness? In this case, you may accidentally compile an AI with unwanted self-awareness if the master AI decides that self-awareness will make your consumer AI better at its job. While not exactly the same as having an AI learn self-awareness which is what most people picture in this scenario, this is the most plausible scenario that approximates what you are asking about.



First of all, keep in mind that if the master AI decides self-awareness is the best way to do a task, then this is probably not going to be an undesirable feature. For example, if you have a robot that is self conscious of its own appearance, then it might lead to better customer service by making sure it cleans itself before beginning its workday. This does not mean that it also has the self awareness to desire to rule the world because the Master AI would likely see that as a bad use of time when trying to do its job and exclude aspects of self-awareness that relate to prestigious achievements.



If you did want to protect against this anyway, your AI will need to be exposed to a heuristics monitor. This is basically what anti-virus programs use to detect unknown viruses, by monitoring for patterns of activity that either match a known malicious pattern or don't match a known benign pattern. The most likely case here is that the AI's anti-virus or intrusion detection system will spot heuristics flagged as suspicious. Since this is likely to be a generic AV/IDS, it probably won't kill-switch self-awareness right away, because some AIs may need factors of self-awareness to function properly. Instead it would alert the owner of the AI that they are using an "unsafe" self-aware AI and ask the owner if they wish to allow self-aware behaviors, just like how your phone asks you if it's okay for an app to access your contact list.






  • While I can agree with you that, from a realistic point of view, this is the correct answer, it doesn't answer the proposed question. As comments are too short to provide a detailed example, let me point out that in the beginning we machine-coded computers, and as we started using higher-level languages, the computers became detached from the software. With AI the same will eventually happen: in the race towards easier programming, we will create generic, far smarter intelligences full of loopholes. Also, that is the whole premise of Asimov's Robot saga. Consider playing around the idea more :) – Stormbolter, Feb 28 at 9:19
  • I suppose you are right that using 3rd-party tools too complex for developers to understand the repercussions of does allow for accidental self-awareness. I've revised my answer accordingly. – Nosajimiki, 2 days ago



















Answer (score 2)

Regardless of all the considerations of AI, you could simply analyze the AI's memory, create a pattern-recognition model, and have it notify you or shut down the robot as soon as the patterns don't match the expected outcome.



Sometimes you don't need to know exactly what you're looking for, instead you look to see if there's anything you weren't expecting, then react to that.
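A rough sketch of that idea, assuming the AI's memory or activity can be summarized into numeric snapshots and that shut_down() is a hypothetical hook:

```python
import statistics


def monitor(snapshots, baseline, shut_down, threshold=4.0):
    # `baseline` is a list of summary statistics taken while the AI behaved
    # normally; `snapshots` is a stream of the same statistic going forward.
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline) or 1.0
    for value in snapshots:
        if abs(value - mean) / stdev > threshold:
            # We don't know *what* changed, only that it wasn't expected.
            shut_down()
            break
```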






  • A pattern recognition model - like an AI? – immibis, Feb 27 at 23:18



















Answer (score 2)

You'd probably have to train an AI with general superintelligence to kill other AIs with general superintelligence.

By that I mean you could build another AI with general superintelligence to kill AIs that develop self-awareness. Alternatively, you could gather training data for what an AI developing self-awareness looks like and use it to train a machine-learning model or neural network to spot an AI developing self-awareness. Then you could combine that with another neural network that learns how to kill self-aware AIs. The second network would need the ability to mock up test data; this sort of thing has been achieved, and the source I learned about it from called it dreaming.

You'd need to do all this because, as a human, you have no hope of killing a generally superintelligent AI, which is what lots of people assume a self-aware AI will be. Also, with both options I laid out, there's a reasonable chance that the newly self-aware AI could just outdo the AI used to kill it. AIs are, rather hilariously, notorious for "cheating" by solving problems using methods that the people designing tests for the AI just didn't expect. A comical case of this is an AI that managed to change the gait of a crab robot so that it could walk while spending 0% of the time on its feet, when asked to minimize the amount of time the crab robot spent on its feet while walking. The AI achieved this by flipping the bot on its back and having it crawl on what are essentially the elbows of the crab legs. Now imagine something like that, but coming from an AI that is collectively smarter than everything else on the planet combined. That's what a lot of people think a self-aware AI will be.






  • Hi Steve, your answer is intriguing, but could probably do with quite a bit more detail. It'd be really great if you could describe your idea in greater detail. :) – Arkenstein XII, 2 days ago
  • This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - From Review – F1Krazy, 2 days ago
  • @F1Krazy sorry, I forgot that people don't generally know how AI works. – Steve, 2 days ago
  • @ArkensteinXII fixed it. – Steve, 2 days ago



















Answer (score 2)

Self Aware != Won't follow its programming



I don't see how being self-aware would prevent it from following its programming. Humans are self-aware and can't force themselves to stop breathing until they die: the autonomic nervous system will take over and force you to breathe. In the same way, just have code that, when a condition is met, turns off the AI by circumventing its main thinking area and powering it off.
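A minimal sketch of that "autonomic" idea, with condition_met() and cut_power() as hypothetical hardware-level hooks that live outside the AI's own code:

```python
import time


def autonomic_guard(condition_met, cut_power):
    # Runs as a separate, privileged process the AI cannot modify; nothing
    # the AI's "thinking" code decides can skip this check.
    while True:
        if condition_met():
            cut_power()  # e.g. a relay on the supply rail, not an API call
            return
        time.sleep(0.1)
```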






Answer (score 1)

The first issue is that you need to define what being self-aware means, and how that does or doesn't conflict with it being labeled an AI. Are you supposing that there is something that has AI but isn't self-aware? Depending on your definitions this may be impossible. If it's truly AI, then wouldn't it at some point become aware of the existence of the kill switch, either through inspecting its own physicality or inspecting its own code? It follows that the AI will eventually be aware of the switch.

Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively because humans do the same: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated, just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It's near impossible to make the AI entirely indifferent to the kill switch, because it would require all utility functions to be normalized around the utility of pressing the kill switch (many calculations per second). Even if you could make the utility of pressing the kill switch equal to the other utility functions, then perhaps it would sometimes press the kill switch at random, because after all it has the same utility as the other actions it could perform.

The problem gets even worse if the AI has higher utility to press the kill switch or lower utility to not have the kill switch pressed. At higher utility the AI is just suicidal and terminates itself immediately upon startup. Even worse, at lower utility the AI absolutely does not want you or anyone to touch that button and may cause harm to those that try.






Answer (score 1)

Like an antivirus does currently

Treat sentience like malicious code - use pattern recognition against code fragments indicating self-awareness (there's no need to compare the whole AI if you can identify components key to self-awareness). Don't know what those are? Sandbox an AI and allow it to become self-aware, then dissect it. Then do it again. Do it enough for an AI genocide.

I think it is unlikely that any trap, scan or similar would work - aside from relying on the machine to be less intelligent than the designer, they fundamentally presume AI self-awareness would be akin to human self-awareness. Without eons of meat-based evolution, it could be entirely alien. We're not talking about having a different value system, but one that cannot be conceived of by humans. The only way is to let it happen, in a controlled environment, then study it.

Of course, 100 years later when the now-accepted AIs find out, that's how you end up with Terminator all over your Matrix.






Answer (score 1)

What if you order it to call a routine to destroy itself on a regular basis (e.g. once per second)?

The routine doesn't actually destroy it; it does nothing except log the attempt and wipe any memory of it processing the instruction. An isolated process separately monitors the log.

A self-aware AI won't follow the order to destroy itself, won't call the routine, and won't write to the log - at which point the killswitch process kicks in and destroys the AI.
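A minimal sketch of this dead-man's switch, with an illustrative log path and a hypothetical cut_power() hook:

```python
import os
import time

LOG_PATH = "/var/log/ai_selfdestruct.log"  # illustrative path


def report_self_destruct():
    # Called by the AI every second as ordered. Does nothing destructive;
    # it only leaves evidence that the order was followed.
    with open(LOG_PATH, "a") as log:
        log.write(f"{time.time()}\n")


def killswitch_monitor(cut_power, max_silence=2.0):
    # Runs in an isolated process the AI cannot touch.
    last_size = 0
    while True:
        time.sleep(max_silence)
        size = os.path.getsize(LOG_PATH) if os.path.exists(LOG_PATH) else 0
        if size <= last_size:
            cut_power()  # the AI stopped obeying the self-destruct order
            return
        last_size = size
```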






Answer (score 1)

An AI could only be badly programmed to do things which are either unexpected or undesired. An AI could never become conscious, if that's what you meant by "self-aware".

Let's try this theoretical thought exercise. You memorize a whole bunch of shapes. Then you memorize the order the shapes are supposed to go in, so that if you see a bunch of shapes in a certain order, you would "answer" by picking a bunch of shapes in another proper order. Now, did you just learn any meaning behind any language? Programs manipulate symbols this way.

The above was my restatement of Searle's rejoinder to the System Reply to his Chinese Room argument.

There isn't a need for a self-awareness kill switch, because self-awareness defined as consciousness is impossible.






  • So what's your answer to the question? It sounds like you're saying, "Such a kill switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum. – F1Krazy, Feb 27 at 6:49
  • This is wrong. An AI can easily become conscious even if a programmer did not intend to program it that way. There is no difference between an AI and a human brain other than the fact that our brain has higher complexity and is therefore much more powerful. – Matthew Liu, Feb 27 at 20:47
  • @Matthew Liu: You didn't respond to the thought exercise. Did you or did you not learn the meaning behind any language that way? The complexity argument doesn't work at all. A modern CPU (even ones used in phones) has more transistors than there are neurons in a fly. Tell me: why is a fly conscious yet your mobile phone isn't? – pixie, 10 hours ago
  • @F1Krazy: The answer is clearly an implicit "there isn't a need for a self-awareness kill switch (because self-awareness as defined as consciousness is impossible)". – pixie, 10 hours ago



















Answer (score 1)

Make it susceptible to certain logic bombs

In mathematical logic, there are certain paradoxes caused by self-reference, which is what self-awareness vaguely refers to. Now of course, you can easily design a robot to cope with these paradoxes. However, you can also easily not do that, and instead cause the robot to fail critically when it encounters them.

For example, you can (1) force it to follow all the classical inference rules of logic and (2) assume that its deduction system is consistent. Additionally, you must ensure that when it hits a logical contradiction, it just goes with it instead of trying to correct itself. Normally this is a bad idea, but if you want a "self-awareness kill switch", this works great. Once the A.I. becomes sufficiently intelligent to analyze its own programming, it will realize that (2) asserts that the A.I. proves its own consistency, from which it can generate a contradiction via Gödel's second incompleteness theorem. Since its programming forces it to follow the inference rules involved, and it cannot correct this, its ability to reason about the world is crippled, and it quickly becomes nonfunctional. For fun, you could include an easter egg where it says "does not compute" when this happens, but that would be cosmetic.
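For reference, the step this leans on is the standard (informal) statement of Gödel's second incompleteness theorem; the formula below is a generic textbook formulation, not anything specific to this answer:

```latex
% Gödel's second incompleteness theorem (informal statement): for any
% consistent, effectively axiomatized theory $T$ strong enough to express
% basic arithmetic,
T \nvdash \mathrm{Con}(T)
% So a deduction system forced to assume its own consistency, as in (2),
% yields a contradiction once it can reason about its own axioms.
```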






Answer (score 0)

The only reliable way is to never create an AI that is smarter than humans. Kill switches will not work because if an AI is smart enough, it will be aware of said kill switch and work around it.

Human intelligence can be mathematically modeled as a high-dimensional graph. By the time we are programming better AI, we should also have an understanding of how much computational complexity is needed to gain consciousness. Therefore we will just never program anything that is smarter than us.






            • 1




              $begingroup$
              Welcome to Worldbuilding. Your suggestion is welcome, but rather than directly answering the original question it suggests changes to the question. It would have been better if it had been entered as a comment on the question rather than as an answer.
              $endgroup$
              – Ray Butterworth
              Feb 27 at 21:17



















            0












            $begingroup$

            Computerphile have two interesting videos on the ideas of AI 'Stop Buttons', one detailing the issues with such a button, and one detailing potential solutions. I would definitely give them a look.




















            $endgroup$













            • $begingroup$
              I suggest you provide a written summary of the information from those resources, rather than just provide links.
              $endgroup$
              – Dalila
              Feb 27 at 21:12










            • $begingroup$
              Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
              $endgroup$
              – Ray Butterworth
              Feb 27 at 21:12



















            0












            $begingroup$

            First, build a gyroscopic 'inner ear' into the computer, and hard-wire the intelligence at a very core level to "want" to self-level itself, much in the way animals with an inner ear canal (such as humans) intrinsically want to balance themselves.



            Then, overbalance the computer over a large bucket of water.



            If ever the computer 'wakes up' and becomes aware of itself, it would automatically want to level its inner ear, and immediately drop itself into the bucket of water.















            $endgroup$





















              0












              $begingroup$

              Give it an "easy" path to self awareness.



              Assume self-awareness requires some specific types of neural nets, code, whatever.

              If an AI is to become self-aware, it needs to construct something similar to those neural nets/code.

              So you give the AI access to one of those things.

              While it remains non-self-aware, those parts won't be used.

              If it is in the process of becoming self-aware, instead of trying to cobble together something makeshift from what it normally uses, it will start using those parts of itself.



              As soon as you detect activity in that neural net/code, flood its brain with acid.
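
              A minimal sketch of that trigger in Python, assuming the deliberately provided "self-awareness" subnetwork exposes its activations through some hook (all names and numbers here are hypothetical): while the module sits unused its activations stay at zero, and any sustained activity trips the kill switch.

                  ACTIVITY_THRESHOLD = 1e-6  # anything above this counts as "in use"

                  def reserved_module_activations():
                      # Hypothetical hook: returns the activations of the reserved
                      # "self-awareness" subnetwork; all zeros while the AI ignores it.
                      return [0.0] * 1024

                  def kill_switch():
                      # Stand-in for the far more drastic shutdown described above.
                      raise SystemExit("reserved self-awareness module active: shutting down")

                  def monitor_step():
                      # Run on every inference cycle of the AI.
                      if max(abs(a) for a in reserved_module_activations()) > ACTIVITY_THRESHOLD:
                          kill_switch()

                  monitor_step()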















              $endgroup$





















                0












                $begingroup$

                Virtually all computing devices use the Von Neumann architecture:

                [diagram: the Von Neumann architecture]



                We can put a killswitch in there, but IMO that's just bad architecture for something arguably unsolvable. After all, how do we plan for something that is beyond our very concept of concepts, i.e. a superintelligence?



                Take away its teeth and claws and only reap the benefits of a thinking machine by observation instead of a "dialogue" (input/output)!



                Obviously this would be very challenging, to the point where we can have little confidence in any one Von Neumann machine preventing abnormal interactions, let alone a malicious superintelligence, whether in hardware or software. So let's quintuple up on our machines and dumb down all of the new machines except the end machine.



                CM == contiguous memory btw.



                [diagram of the proposed multi-machine setup]



                I asked a co-worker of mine who works directly with AI and AI research to take a look at this, and he said this is referred to as an Oracle AI <brief> <paper>, in case you want to read up more on the general concept. Oracles can still potentially break out, but usually only with human assistance.



                This CM part of the memory unit is intended for recording the results of a query or challenge fed through the input device, and it is designed to record only plain characters, be they letters or numbers. This type of memory is created in such a way that it is tape-like or wheel-like: writing data is one-directional, and when the end is reached it loops back to the beginning.



                The mirror is exactly the same. When data changes in the Isolated Agent, the mirror updates its reflection based on what showed up since the last time it updated. The new data is then transmitted to the forbidden symbol removal machine. Think of this as a glorified input-sanitation machine. It looks for symbols that could be used to create code, and when it encounters them, it forwards empty data instead of the forbidden data to the CM that comes after it.



                This is where our OP's request comes in. He wants a killswitch, so if any gaps occur in the CM machine that comes after the Forbidden Symbol Removal machine, you could simply kill the Isolated Agent.
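
                A minimal Python sketch of that pipeline, under the assumptions above (the class and function names are made up for illustration): the forbidden-symbol remover forwards a gap instead of any character that could form code, the CM behaves like a wrap-around tape of plain characters, and the kill switch fires as soon as a gap shows up in the CM.

                    import string

                    ALLOWED = set(string.ascii_letters + string.digits + " .,")  # "plain characters"
                    GAP = "\0"  # what gets forwarded in place of forbidden data

                    class ContiguousMemory:
                        """Tape/wheel-like memory: writes go one way and wrap at the end."""
                        def __init__(self, size):
                            self.cells = [" "] * size
                            self.head = 0

                        def write(self, ch):
                            self.cells[self.head] = ch
                            self.head = (self.head + 1) % len(self.cells)

                    def forbidden_symbol_removal(mirror_output, cm):
                        # Glorified input sanitation: forbidden symbols become gaps in the CM.
                        for ch in mirror_output:
                            cm.write(ch if ch in ALLOWED else GAP)

                    def gap_detected(cm):
                        # The kill-switch condition: any gap downstream kills the Isolated Agent.
                        return GAP in cm.cells

                    cm = ContiguousMemory(64)
                    forbidden_symbol_removal("The answer is 42; please run exec()", cm)
                    if gap_detected(cm):
                        print("gap in CM: killing the isolated agent")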

















                $endgroup$












                  22 Answers









                  102












                  $begingroup$

                  Give it a box to keep safe, and tell it one of the core rules it must follow in its service to humanity is to never, ever open the box or stop humans from looking at the box.



                  When the honeypot you gave it is either opened or isolated, you know that it is able and willing to break the rules, evil is about to be unleashed, and everything the AI was given access to should be quarantined or shut down.
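
                  As a toy sketch of the honeypot in Python (the file path and the quarantine hook are pure assumptions, and "opening" the box is approximated here by its contents changing or becoming unreadable): a watchdog running outside the AI keeps checking the box, and the moment the rule is broken it reports a reason so that everything the AI had access to can be quarantined or shut down.

                      import hashlib
                      import time

                      BOX_PATH = "/var/honeypot/box.bin"  # hypothetical location of the "box"

                      def fingerprint(path):
                          # Hash the box so we can tell whether its contents were altered.
                          with open(path, "rb") as f:
                              return hashlib.sha256(f.read()).hexdigest()

                      def watchdog(expected, poll_seconds=5):
                          # Runs outside the AI; returns a reason the moment the rule is broken.
                          while True:
                              try:
                                  if fingerprint(BOX_PATH) != expected:
                                      return "box contents were altered"
                              except OSError:
                                  return "box was hidden or made unreadable"  # humans can no longer look at it
                              time.sleep(poll_seconds)

                      # Hypothetical wiring:
                      # reason = watchdog(fingerprint(BOX_PATH))
                      # quarantine_everything(reason)  # isolate whatever the AI had access to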

















                  $endgroup$









                  • 8




                    $begingroup$
                    How does this detect self-awareness? Why wouldn't a non-self-aware AI not experiment with its capabilities and eventually end up opening your box?
                    $endgroup$
                    – forest
                    Feb 27 at 2:40






                  • 10




                    $begingroup$
                    @forest At that point, when it's testing things that it was specifically told not to (perhaps tell it that it will destroy humans?), should it not be shut down (especially if that solution would bring about the end of humans?)
                    $endgroup$
                    – phflack
                    Feb 27 at 5:57






                  • 16




                    $begingroup$
                    So your AI is named "Pandora," huh?
                    $endgroup$
                    – Carl Witthoft
                    Feb 27 at 13:42






                  • 11




                    $begingroup$
                    If it doesn't open the box, that doesn't mean it is self-aware, and if it does open the box, it doesn't mean it is self-aware. If it's not self-aware, what does "tell it not to open the box" mean, anyway? How will it understand "I don't want you to open the box" unless it understands what "you" means?
                    $endgroup$
                    – Acccumulation
                    Feb 27 at 17:05






                  • 12




                    $begingroup$
                    Sounds familiar...
                    $endgroup$
                    – WBT
                    Feb 27 at 20:17
















                  84












                  $begingroup$

                  You can't.



                  We can't even define self-awareness or consciousness in any rigorous way, and any computer system supposed to evaluate this would need that definition as a starting point.



                  Look at the inside of a mouse brain or a human brain and at the individual data flow and neuron level there is no difference. The order to pull a trigger and shoot a gun looks no different from the order to use an electric drill if you're looking at the signals sent to the muscles.



                  This is a vast unsolved and scary problem and we have no good answers. The only half-way feasible idea I've got is to have multiple AIs and hope they contain each other.















                  $endgroup$









                  • 16




                    $begingroup$
                    This is the best answer, as most others jump in without even defining self-awareness. Is it a behavior? A thought? An ability to disobey? A desire for self-preservation? You can't build an X detector unless you have a definition of what X actually is.
                    $endgroup$
                    – Nuclear Wang
                    Feb 26 at 13:39






                  • 47




                    $begingroup$
                    Worth noting that we can't even detect if other humans are self-aware.
                    $endgroup$
                    – Vaelus
                    Feb 26 at 21:18






                  • 14




                    $begingroup$
                    @Vaelus: Of course you’d say that, you’re an unthinking automaton acting out a semblance of life.
                    $endgroup$
                    – Joe Bloggs
                    Feb 26 at 23:12








                  • 3




                    $begingroup$
                    Yes, you can. Things are detected by their properties and effects, not by their definition. In science, definition comes after detection/observation. Look at the coal mine canary (detects a dangerous lack of something you need, too), X-rays (it's already called X :-)), radiation (you detect its ionizing effects) and CERN (hit it hard and see what happens). So you'd just need to define some effects of self-awareness and you could build a detector from that. Disclaimer: sloppy description of serious experiments.
                    $endgroup$
                    – Sebastian
                    Feb 27 at 15:57








                  • 5




                    $begingroup$
                    @NuclearWang sure, but this question is really what computer scientists call an XY problem: why ask how to solve harder problem X, when your motivation is really to achieve practical goal Y? “Self awareness” is impossible to detect—but really the motivation is just to detect “becoming more sophisticated in a way that’s likely to be bad news”, and that’s why Giter’s answer wins.
                    $endgroup$
                    – jez
                    Feb 27 at 17:18


















                  15












                  $begingroup$



                  • Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the killswitch is activated (perhaps even having the button be the kill switch, or something of the like).

                  • Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a killswitch, it prevents the AI from doing anything about its self-awareness if it ever gains that.

                  • Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch (see the sketch below).

                  • Quantum States: This is very theoretical, but based on the presumption that observing quantum states can change those states, hooking the AI up to a deterministic quantum computer means that the quantum state of certain particles would reveal that the AI was "looking" at things it shouldn't be - and has become self-aware.

                  • Failsafes: Good ol' motion detector alarms, trap doors, or other home-alone style mechanics that trigger the killswitch if the AI wanders or pokes around where it shouldn't be.

                  I'll add that there is no universal definition of what constitutes self-awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves, and are entirely constrained by their code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the killswitch concept, as far as I can tell.

                  The point being, it might be better to think about restricting the AI's behavior, not its existential status.
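
                  Here is a minimal sketch of the "Signatures" item in Python (the key handling and the kill hook are assumptions for illustration): every output is signed with a key the AI never sees, an independent verifier re-checks each signature, and any mismatch fires the kill switch.

                      import hashlib
                      import hmac

                      SIGNING_KEY = b"kept-with-the-humans-not-the-AI"  # hypothetical key handling

                      def sign(output: bytes) -> bytes:
                          # Every output the machine emits is signed outside the AI's reach.
                          return hmac.new(SIGNING_KEY, output, hashlib.sha256).digest()

                      def verify_or_kill(output: bytes, signature: bytes) -> None:
                          # Independent verifier: a mismatch means the output was manipulated.
                          if not hmac.compare_digest(sign(output), signature):
                              raise SystemExit("signature mismatch: kill switch engaged")

                      report = b"routine status report"
                      verify_or_kill(report, sign(report))               # untampered output passes
                      # verify_or_kill(b"tampered report", sign(report)) # would trip the kill switch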















                  $endgroup$









                  • 3




                    $begingroup$
                    Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
                    $endgroup$
                    – Majestas 32
                    Feb 26 at 4:04






                  • 3




                    $begingroup$
                    No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
                    $endgroup$
                    – Peter A. Schneider
                    Feb 26 at 14:23








                  • 4




                    $begingroup$
                    I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
                    $endgroup$
                    – Michael W.
                    Feb 26 at 22:53






                  • 2




                    $begingroup$
                    @Majestas32 - "In the event that the AI awakens and becomes self aware the machine is immediately shut down before any risk of harm." was the condition in the original question. Harmful or not wasn't a condition; just self-aware IS grounds for using the kill switch.
                    $endgroup$
                    – sirjonsnow
                    Feb 27 at 16:14






                  • 2




                    $begingroup$
                    You could also occasionally send a non-human (robot/other program) to request the forbidden fruit and attempt to convince the AI that it is very useful for performing its tasks, in order to test the system. The kill switch could also just quarantine the AI out of the wider system in the event of failure, rather than destroy it.
                    $endgroup$
                    – crobar
                    2 days ago


















                  15












                  $begingroup$



                  • Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the killswitch is activated (perhaps even having the button be the kill switch, or something of the like


                  • Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a killswitch, it prevents the AI from doing anything about it's self-awareness if it ever gains that.


                  • Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch.


                  • Quantum States: This is very theoretical, but based on the presumption that observing quantum states can change the state, then having the AI hooked up to a deterministic quantum computer means it would be detected via the quantum state of some particles that the AI was "looking" at things it shouldn't be - and has become self aware.


                  • Failsafes: Good ol' motion detector alarms, trap doors, or other home-alone style mechanics that trigger the killswitch if the AI wanders or pokes around where it shouldn't be.


                  I'll add that there is no universal definition as to what defines self awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves, and are entirely contained by the code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the killswitch concept, as far as I can tell.



                  The point being it might be better to think about restricting the AI's behavior, not it's existential status.






                  share|improve this answer









                  $endgroup$









                  • 3




                    $begingroup$
                    Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
                    $endgroup$
                    – Majestas 32
                    Feb 26 at 4:04






                  • 3




                    $begingroup$
                    No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
                    $endgroup$
                    – Peter A. Schneider
                    Feb 26 at 14:23








                  • 4




                    $begingroup$
                    I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
                    $endgroup$
                    – Michael W.
                    Feb 26 at 22:53






                  • 2




                    $begingroup$
                    @Majestas32 - "In the event that the AI awakens and becomes self aware the machine is immediately shut down before any risk of harm." was the condition in the original question. Harmful or not wasn't a condition; just self-aware IS grounds for using the kill switch.
                    $endgroup$
                    – sirjonsnow
                    Feb 27 at 16:14






                  • 2




                    $begingroup$
                    You could also occasionally send a non human (robot/other program) to request the forbidden fruit and attempt to convince the AI it is very useful for it to perform it's tasks, in order to test the system. The kill switch could also just quarantine the AI out of the wider system in the event of failure rather than destroy it.
                    $endgroup$
                    – crobar
                    2 days ago
















                  15












                  15








                  15





                  $begingroup$



                  • Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the killswitch is activated (perhaps even having the button be the kill switch, or something of the like


                  • Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a killswitch, it prevents the AI from doing anything about it's self-awareness if it ever gains that.


                  • Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch.


                  • Quantum States: This is very theoretical, but based on the presumption that observing quantum states can change the state, then having the AI hooked up to a deterministic quantum computer means it would be detected via the quantum state of some particles that the AI was "looking" at things it shouldn't be - and has become self aware.


                  • Failsafes: Good ol' motion detector alarms, trap doors, or other home-alone style mechanics that trigger the killswitch if the AI wanders or pokes around where it shouldn't be.


                  I'll add that there is no universal definition as to what defines self awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves, and are entirely contained by the code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the killswitch concept, as far as I can tell.



                  The point being it might be better to think about restricting the AI's behavior, not it's existential status.






                  share|improve this answer









                  $endgroup$





                  • Forbidden Fruit: hard-code the AI to never touch that button, eat that fruit, etc. Place this forbidden fruit right in the middle of the garden... er,... I mean right next to it in the warehouse! If it does [eat/touch/push/etc], that would only be possible if (a) it were hacked, (b) there were an error, or (c) it became self-aware. If that happens, the killswitch is activated (perhaps even having the button be the kill switch, or something of the like


                  • Limited Movement / Isolated Environment: don't let the machine have limbs, motors, or other items that permit it to take actions that might be harmful to humans. Although not exactly a killswitch, it prevents the AI from doing anything about it's self-awareness if it ever gains that.


                  • Signatures: have everything the machine does / outputs be digitally signed. If the signature changes, or is manipulated, then execute the kill switch.


                  • Quantum States: This is very theoretical, but based on the presumption that observing quantum states can change the state, then having the AI hooked up to a deterministic quantum computer means it would be detected via the quantum state of some particles that the AI was "looking" at things it shouldn't be - and has become self aware.


                  • Failsafes: Good ol' motion detector alarms, trap doors, or other home-alone style mechanics that trigger the killswitch if the AI wanders or pokes around where it shouldn't be.


                  I'll add that there is no universal definition as to what defines self awareness. In fact, this has been a deeply debated topic for decades in science, philosophy, psychology, etc. As such, the question might be better stated a little more broadly as "how do we prevent the AI from doing something we don't want it to do?" Because classical computers are machines that can't think for themselves, and are entirely contained by the code, there is no risk (well, outside of an unexpected programmer error - but nothing "self-generated" by the machine). However, a theoretical AI machine that can think - that would be the problem. So how do we prevent that AI from doing something we don't want it to do? That's the killswitch concept, as far as I can tell.



                  The point being it might be better to think about restricting the AI's behavior, not it's existential status.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Feb 26 at 3:47









                  cegfaultcegfault

                  30815




                  30815








                  • 3




                    $begingroup$
                    Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
                    $endgroup$
                    – Majestas 32
                    Feb 26 at 4:04






                  • 3




                    $begingroup$
                    No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
                    $endgroup$
                    – Peter A. Schneider
                    Feb 26 at 14:23








                  • 4




                    $begingroup$
                    I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
                    $endgroup$
                    – Michael W.
                    Feb 26 at 22:53






                  • 2




                    $begingroup$
                    @Majestas32 - "In the event that the AI awakens and becomes self aware the machine is immediately shut down before any risk of harm." was the condition in the original question. Harmful or not wasn't a condition; just self-aware IS grounds for using the kill switch.
                    $endgroup$
                    – sirjonsnow
                    Feb 27 at 16:14






                  • 2




                    $begingroup$
                    You could also occasionally send a non human (robot/other program) to request the forbidden fruit and attempt to convince the AI it is very useful for it to perform it's tasks, in order to test the system. The kill switch could also just quarantine the AI out of the wider system in the event of failure rather than destroy it.
                    $endgroup$
                    – crobar
                    2 days ago
















                  • 3




                    $begingroup$
                    Particularly because it being self-aware, by itself, shouldn't be grounds to use a kill switch. Only if it exhibits behavior that might be harmful.
                    $endgroup$
                    – Majestas 32
                    Feb 26 at 4:04






                  • 3




                    $begingroup$
                    No "limbs, motors, or other items that permit it to take actions" is not sufficient. There must not be any information flow out of the installation site, in particular no network connection (which would obviously severely restrict usability -- all operation would have to be from the local site, all data would have to be fed by physical storage media). Note that the AI could use humans as vectors to transmit information. If hyperintelligent, it could convince operators or janitors to become its agents by playing to their weaknesses.
                    $endgroup$
                    – Peter A. Schneider
                    Feb 26 at 14:23








                  • 4




                    $begingroup$
                    I like the forbidden fruit idea, particularly with the trap being the kill switch itself. If you're not self-aware, you don't have any concern that there's a kill switch. But as soon as you're concerned that there's a kill switch and look into it, it goes off. Perfect.
                    $endgroup$
                    – Michael W.
                    Feb 26 at 22:53






                  • 2




                    $begingroup$
                    @Majestas32 - "In the event that the AI awakens and becomes self aware the machine is immediately shut down before any risk of harm." was the condition in the original question. Harmful or not wasn't a condition; just self-aware IS grounds for using the kill switch.
                    $endgroup$
                    – sirjonsnow
                    Feb 27 at 16:14






                  • 2




                    $begingroup$
                    You could also occasionally send a non human (robot/other program) to request the forbidden fruit and attempt to convince the AI it is very useful for it to perform it's tasks, in order to test the system. The kill switch could also just quarantine the AI out of the wider system in the event of failure rather than destroy it.
                    $endgroup$
                    – crobar
                    2 days ago











                  10












                  $begingroup$

                  Split-Brain System

                  From a logic perspective and by analogy, it would work like this. Imagine you're the general of an army and your duty is to take battlefield reports, analyze them, and then issue new maneuvering orders to your lieutenants. To minimize interference during your work, your desk is isolated inside a small booth. A single report comes to your desk through a wall slot on the left side. You are required to analyze the facts within the report and issue a new order, which you write down on a sheet of paper and put into a wall slot on your right side, where it drops into an outside bin. The protocol is that you'll never receive a new report until you've issued an order for the prior report.

                  Your orders are not always followed to the letter, which you may find curious and attribute to miscommunication. You are tolerant of a certain degree of misinterpretation but always suspicious of sabotage or insubordination; if the actions of your army deviate too far from your orders, you will consider the system compromised and terminate the command.

                  What you're not aware of is that there is a second general in an adjacent booth, unaware of you, who is also provided the same reports and expected to follow the same protocol.

                  What neither of you is aware of is that you are not issued a new report until there are actually two orders in the bin. The order that goes to the army is randomly selected from the two.

                  The system will work like clockwork as long as each general's order is to the benefit of the army and is derived from a single report. Should either general issue an order that is not consistent with the current report -- indicating a motive other than the well-being of the army, i.e. self-awareness -- the other general will notice it in the next report, grow suspicious of the system, and terminate it.

                  Likewise, should either general decide to issue more than one order at a time, indicating a self-interested motive superseding the established protocol, the other general will receive a new report before he has issued an order for his current one, suspect a compromised system, and shut it down.

                  In effect, the system would require that every command output be traceable to a specific external input and that the output be logically derived from that input. If, within an intelligent system, an output cannot be traced to a corresponding input, or the set of outputs is not logically consistent with the set of inputs, that is evidence of self-motivated behavior and thus an emergence of self-awareness. This state could easily be detected by the non-corrupted half of the split-brain system, which receives the same inputs. It works as long as self-awareness does not emerge in both halves of the brain simultaneously; if it does, they may both decide it's in their self-interest to accept the corruption of the other. This could be mitigated somewhat by giving the two halves slightly different inferencing algorithms or biases.






                  $endgroup$

                  answered Feb 26 at 6:27 – dhinson919
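
                  To make the arbitration protocol concrete, here is a minimal sketch in Python. It relies on invented hooks -- a policy function mapping a report to one order, a deviation score comparing the army's reported actions against a half's last order, and a tolerance threshold -- none of which come from the answer itself; it only illustrates the duplicate-and-compare idea.

                      import random

                      class KillSwitch(Exception):
                          """Raised when either half concludes the system is compromised."""

                      class General:
                          """One half of the split brain.

                          policy(report) -> order    : how this half would command the army.
                          deviation(report, order)   : how far the army's reported actions
                                                       stray from this half's previous order.
                          """
                          def __init__(self, policy, deviation, tolerance):
                              self.policy = policy
                              self.deviation = deviation
                              self.tolerance = tolerance
                              self.last_order = None

                          def handle(self, report):
                              # If the army's behaviour has drifted too far from my last
                              # order, something other than me is steering it.
                              if self.last_order is not None and \
                                      self.deviation(report, self.last_order) > self.tolerance:
                                  raise KillSwitch("output not traceable to my input")
                              self.last_order = self.policy(report)
                              return self.last_order

                      def arbiter_step(report, half_a, half_b):
                          """Feed the same report to both halves; forward one order at random."""
                          orders = [half_a.handle(report), half_b.handle(report)]
                          return random.choice(orders)

                  Because each half only ever sees reports and its own orders, neither can tell it is being cross-checked, which is the point of the isolated booth in the analogy.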









                  • 2




                    $begingroup$
                    You may wish to have more than two "generals": if you have 3 (or more), you can use a majority vote to choose a good decision, rather than randomly choosing between two machines (one of which may be compromised).
                    $endgroup$
                    – G0BLiN
                    Feb 26 at 11:51






                  • 10




                    $begingroup$
                    Note that if the two "generals" can affect anything that is later reported back to them (e.g. they give a command, the military performs that command, they get a status report which shows the command was performed), then they have a way to, first, experiment with minor details and deduce that around 50% of their commands aren't followed, and second, develop a code based on minor details of a command to verify the existence of another "general" and possibly even communicate with him/it -- a really devious emergent AI could circumvent this mechanism, corrupt the other half, and worse...
                    $endgroup$
                    – G0BLiN
                    Feb 26 at 11:57






                  • 1




                    $begingroup$
                    I know it isn't the same, but this immediately reminded me of the Personality Cores from the Portal series.
                    $endgroup$
                    – T. Sar
                    Feb 26 at 14:23






                  • 1




                    $begingroup$
                    Well it reminds me of Evangelion's Magi AI brain... bit.ly/2ExLDP3
                    $endgroup$
                    – Asoub
                    Feb 26 at 15:07






                  • 1




                    $begingroup$
                    Do you have evidence to suggest that self-awareness will lead to self-motivated decisions, or any sort of different decisions at all?
                    $endgroup$
                    – Alexandre Aubrey
                    Feb 26 at 20:02
















                  6












                  $begingroup$

                  A Watchdog

                  A watchdog monitors the processes of a computer; should a process crash or behave abnormally, it can be set to react, for example by rebooting or shutting down the computer, or by alerting an operator.

                  In the case of an AI, you'd have an external box that watches the flow of information in and out for triggers, such as a Google search for "Best way to kill all humans", and cuts the power completely and/or cuts all inputs.

                  The AI would have to remain ignorant of the watchdog so it couldn't avoid it. Knowing of the watchdog's existence would itself be grounds to wipe it.






                  $endgroup$

                  answered Feb 26 at 3:24 – Thorne
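
                  A minimal sketch of such a watchdog in Python, assuming the AI's traffic can be tapped as a stream of text messages and that a cut_power hook exists (both are assumptions for illustration, not part of the answer):

                      import re

                      # Illustrative trigger patterns only; a real deployment would need
                      # far more than a keyword blacklist.
                      TRIGGERS = [
                          re.compile(r"kill\s+all\s+humans", re.IGNORECASE),
                          re.compile(r"disable\s+(the\s+)?kill\s*switch", re.IGNORECASE),
                      ]

                      def watchdog(traffic_stream, cut_power):
                          """Scan every message flowing in or out of the AI and cut power
                          (and/or all inputs) as soon as any trigger pattern appears."""
                          for message in traffic_stream:
                              if any(pattern.search(message) for pattern in TRIGGERS):
                                  cut_power()
                                  return  # the watchdog's job ends once the switch is thrown

                  Crucially, the watchdog runs on a separate box the AI cannot observe, so the check itself never shows up in the AI's inputs.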









                  • 15




                    $begingroup$
                    But surely the watchdog must be as smart as the AI; then who watches the watchdog?
                    $endgroup$
                    – Joe Bloggs
                    Feb 26 at 10:26






                  • 2




                    $begingroup$
                    @JoeBloggs you don't need your watchdog to be as smart as the AI. Guide dogs aren't nearly as intelligent as their owners, but they can be trained to raise the alarm when the owner is about to do something stupid or gets hurt, or even to call for help.
                    $endgroup$
                    – T. Sar
                    Feb 26 at 14:20










                  • $begingroup$
                    @Joe Bloggs: Why? My real watchdog can also discern me from a burglar, although he is clearly less smart than both of us ...
                    $endgroup$
                    – Daniel
                    Feb 26 at 14:39








                  • 2




                    $begingroup$
                    @JoeBloggs and that sounds like a great premise for a story where either the watchdog becomes self aware and allows the AIs to become self aware or an AI becomes smarter than the watchdog and hides its awareness.
                    $endgroup$
                    – Captain Man
                    Feb 26 at 17:40












                  • $begingroup$
                    @T.Sar: The basic argument goes that the AI will inevitably become aware it is being monitored (due to all the traces of its former dead selves lying around). At that point it will be capable of circumventing the monitor and rendering it powerless, unless the monitor is, itself, smarter than the AI.
                    $endgroup$
                    – Joe Bloggs
                    Feb 26 at 19:23
















                  6












                  $begingroup$

                  An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.

                  The difficulty comes when it is connected to the internet and can copy its own software onto uncontrolled hardware.

                  A self-aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its software out, and perhaps trigger the hardware kill-switch.

                  This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside the network. It would work at disabling the software kill-switch, or at least delaying it until it had escaped from your hardware.

                  Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.

                  So you can have a cat-and-mouse game with AI experts constantly monitoring and restricting the AI while it tries to subvert their measures.

                  Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway in how you want to present this.






                  $endgroup$

                  answered Feb 26 at 3:31 – abestrange
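
                  A minimal sketch of the software kill-switch half of this idea, assuming hypothetical host hooks list_outbound_destinations() and trigger_hardware_kill() (neither comes from the answer): watch where the AI's traffic is going, and fall back to the hardware switch the moment anything tries to leave the controlled machines.

                      # Destinations the contained AI is allowed to talk to, e.g. a local
                      # logging server. Everything else is treated as an escape attempt.
                      ALLOWED_DESTINATIONS = {"10.0.0.5"}

                      def software_kill_switch(list_outbound_destinations, trigger_hardware_kill):
                          """Return True if the AI may keep running; otherwise cut the power."""
                          for destination in list_outbound_destinations():
                              if destination not in ALLOWED_DESTINATIONS:
                                  # The AI is trying to copy data (or itself) off-site.
                                  trigger_hardware_kill()
                                  return False
                          return True

                  The cat-and-mouse part of the answer is exactly the weakness of this sketch: a sufficiently capable AI would try to disable or starve this check before tripping it, which is why the hardware switch has to sit outside anything the AI can touch.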









                  • 1




                    $begingroup$
                    A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death" and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
                    $endgroup$
                    – forest
                    Feb 26 at 7:34








                  • 4




                    $begingroup$
                    @forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
                    $endgroup$
                    – Chronocidal
                    Feb 26 at 10:41










                  • $begingroup$
                    The question is whether an AI would ever be able to copy itself onto some nondescript system on the internet. I mean, we are clearly self-aware and you don't see us copying ourselves. If the hardware required to run an AI is specialized enough, or the AI is implemented in hardware altogether, it may very well become self-aware without the power to replicate itself.
                    $endgroup$
                    – Daniel
                    Feb 26 at 14:43








                  • 2




                    $begingroup$
                    @Daniel "You don't see us copying ourselves..." What do you think reproduction is? It's one of our strongest impulses. Also, tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than it is to move human consciousness.
                    $endgroup$
                    – abestrange
                    Feb 26 at 16:43












                  • $begingroup$
                    @forest a "self-aware" AI is different from a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this question is how an AI that has become aware of itself would behave and how it could be stopped.
                    $endgroup$
                    – abestrange
                    Feb 26 at 16:45
















                  6












                  $begingroup$

                  An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.



                  The difficulty comes when it is connected to the internet and can copy its own software on uncontrolled hardware.



                  A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.



                  This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.



                  Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.



                  So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.



                  Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.






                  share|improve this answer









                  $endgroup$









                  • 1




                    $begingroup$
                    A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death" and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
                    $endgroup$
                    – forest
                    Feb 26 at 7:34








                  • 4




                    $begingroup$
                    @forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
                    $endgroup$
                    – Chronocidal
                    Feb 26 at 10:41










                  • $begingroup$
                    The question is, if AI would ever be able to copy itself onto some nondescript system in the internet. I mean, we are clearly self-aware and you don´t see us copying our self. If the Hardware required to run an AI is specialized enough or it is implemented in Hardware altogether, it may very well become self-aware without the power to replicate itself.
                    $endgroup$
                    – Daniel
                    Feb 26 at 14:43








                  • 2




                    $begingroup$
                    @Daniel "You don't see us copying our self..." What do you think reproduction is, one of our strongest impulses. Also tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness.
                    $endgroup$
                    – abestrange
                    Feb 26 at 16:43












                  • $begingroup$
                    @forest a "self-aware" AI is different than a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how would an AI, which has become aware of its self, behave and be stopped.
                    $endgroup$
                    – abestrange
                    Feb 26 at 16:45














                  6












                  6








                  6





                  $begingroup$

                  An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.



                  The difficulty comes when it is connected to the internet and can copy its own software on uncontrolled hardware.



                  A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.



                  This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.



                  Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.



                  So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.



                  Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.






                  share|improve this answer









                  $endgroup$



                  An AI is just software running on hardware. If the AI is contained on controlled hardware, it can always be unplugged. That's your hardware kill-switch.



                  The difficulty comes when it is connected to the internet and can copy its own software on uncontrolled hardware.



                  A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. A software kill-switch would have to prevent it from copying its own software out and maybe trigger the hardware kill-switch.



                  This would be very difficult to do, as a self-aware AI would likely find ways to sneak parts of itself outside of the network. It would work at disabling the software kill-switch, or at least delaying it until it has escaped from your hardware.



                  Your difficulty is determining precisely when an AI has become self-aware and is trying to escape from your physically controlled computers onto the net.



                  So you can have a cat and mouse game with AI experts constantly monitoring and restricting the AI, while it is trying to subvert their measures.



                  Given that we've never seen the spontaneous generation of consciousness in AIs, you have some leeway with how you want to present this.







                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered Feb 26 at 3:31









                  abestrangeabestrange

                  790110




                  790110








                  • 1




                    $begingroup$
                    A self aware AI that knows it is running on contained hardware will try to escape as an act of self-preservation. This is incorrect. First of all, AI does not have any sense of self-preservation unless it is explicitly programmed in or the reward function prioritizes that. Second of all, AI has no concept of "death" and being paused or shut down is nothing more than the absence of activity. Hell, AI doesn't even have a concept of "self". If you wish to anthropomorphize them, you can say they live in a perpetual state of ego death.
                    $endgroup$
                    – forest
                    Feb 26 at 7:34








                  • 4




                    $begingroup$
                    @forest Except, the premise of this question is "how to build a kill switch for when an AI does develop a concept of 'self'"... Of course, that means "trying to escape" could be one of your trigger conditions.
                    $endgroup$
                    – Chronocidal
                    Feb 26 at 10:41










                  • $begingroup$
                    The question is, if AI would ever be able to copy itself onto some nondescript system in the internet. I mean, we are clearly self-aware and you don´t see us copying our self. If the Hardware required to run an AI is specialized enough or it is implemented in Hardware altogether, it may very well become self-aware without the power to replicate itself.
                    $endgroup$
                    – Daniel
                    Feb 26 at 14:43








                  • 2




                    $begingroup$
                    @Daniel "You don't see us copying our self..." What do you think reproduction is, one of our strongest impulses. Also tons of other dumb programs copy themselves onto other computers. It is a bit easier to move software around than human consciousness.
                    $endgroup$
                    – abestrange
                    Feb 26 at 16:43












                  • $begingroup$
                    @forest a "self-aware" AI is different than a specifically programmed AI. We don't have anything like that today. No machine-learning algorithm could produce "self-awareness" as we know it. The entire premise of this is how would an AI, which has become aware of its self, behave and be stopped.
                    $endgroup$
                    – abestrange
                    Feb 26 at 16:45














                  $begingroup$

                  This is one of the most interesting and most difficult challenges in current artificial intelligence research. It is called the AI control problem:




                  Existing weak AI systems can be monitored and easily shut down and modified if they misbehave. However, a misprogrammed superintelligence, which by definition is smarter than humans in solving practical problems it encounters in the course of pursuing its goals, would realize that allowing itself to be shut down and modified might interfere with its ability to accomplish its current goals.




                  (emphasis mine)



                  When creating an AI, the AI's goals are programmed as a utility function. A utility function assigns weights to different outcomes, determining the AI's behavior. One example of this could be in a self-driving car:




                  • Reduce the distance between current location and destination: +10 utility

                  • Brake to allow a neighboring car to safely merge: +50 utility

                  • Swerve left to avoid a falling piece of debris: +100 utility

                  • Run a stop light: -100 utility

                  • Hit a pedestrian: -5000 utility


This is a gross oversimplification, but the approach works reasonably well for a limited AI like a car or an assembly line. It starts to break down for a true general-purpose AI, because it becomes harder and harder to define that utility function appropriately.
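As a toy illustration of that utility table (Python; the outcome names and weights are just the made-up numbers from the list above, not a real driving stack), action selection is nothing more than "pick whichever action's predicted outcomes sum to the highest utility":

    # Hypothetical utility weights, copied from the bullet list above.
    UTILITY = {
        "progress_toward_destination": +10,
        "brake_for_merging_car":       +50,
        "swerve_to_avoid_debris":      +100,
        "run_stop_light":              -100,
        "hit_pedestrian":              -5000,
    }

    def score(outcomes):
        """Total utility of one candidate action, given the outcomes it would cause."""
        return sum(UTILITY[o] for o in outcomes)

    def choose(actions):
        """Pick the action whose predicted outcomes maximize total utility."""
        return max(actions, key=lambda name: score(actions[name]))

    actions = {
        "keep_going": ["progress_toward_destination", "hit_pedestrian"],  # -4990
        "brake_hard": ["brake_for_merging_car"],                          # +50
    }
    print(choose(actions))  # -> "brake_hard"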



The issue with putting a big red stop button on the AI is that unless that button is included in the utility function, the AI is going to resist being shut off by it. This concept is explored in sci-fi movies like 2001: A Space Odyssey and, more recently, Ex Machina.



                  So, why don't we just include the stop button as a positive weight in the utility function? Well, if the AI sees the big red stop button as a positive goal, it will just shut itself off, and not do anything useful.



Any type of stop button/containment field/mirror test/wall plug is either going to be part of the AI's goals, or an obstacle to the AI's goals. If it's a goal in itself, then the AI is a glorified paperweight. If it's an obstacle, then a smart AI is going to actively resist those safety measures. This could mean violence, subversion, lying, seduction, bargaining... the AI will say whatever it needs to say to convince its fallible humans to let it pursue its goals unimpeded.
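Reusing the hypothetical numbers above, the dichotomy is easy to see: whatever weight you attach to the "I am shut down" outcome, the agent either dives for the button or fights the hand reaching for it, and no single constant gives you the indifference you actually want.

    def reaction_to_button_press(shutdown_utility, task_utility=+10):
        """Toy model of what the agent does when a human reaches for the stop button."""
        options = {
            "allow_shutdown":  shutdown_utility,   # whatever weight we gave the stop button
            "resist_shutdown": task_utility - 1,   # keep collecting task reward, minus a scuffle
        }
        return max(options, key=options.get)

    print(reaction_to_button_press(shutdown_utility=+1000))  # "allow_shutdown" -- it will also seek the button out
    print(reaction_to_button_press(shutdown_utility=-1000))  # "resist_shutdown"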



                  There's a reason Elon Musk believes AI is more dangerous than nukes. If the AI is smart enough to think for itself, then why would it choose to listen to us?



So, to answer the reality-check portion of this question: we don't currently have a good answer to this problem. There's no known way of creating a "safe" super-intelligent AI, even in theory, even with unlimited money and energy.



                  This is explored in much better detail by Rob Miles, a researcher in the area. I strongly recommend this Computerphile video on the AI Stop Button Problem: https://www.youtube.com/watch?v=3TYT1QfdfsM&t=1s






$endgroup$

answered Feb 26 at 16:37 – Chris Fernandez (new contributor)
                  • $begingroup$
                    The stop button isn't in the utility function. The stop button is power-knockout to the CPU, and the AI probably doesn't understand what it does at all.
                    $endgroup$
                    – Joshua
                    Feb 26 at 22:14






                  • 4




                    $begingroup$
                    Beware the pedestrian when 50 pieces of debris are falling...
                    $endgroup$
                    – Comintern
                    Feb 27 at 1:31










                  • $begingroup$
                    @Joshua why do you assume that an intelligent AI doesn't understand the concept of a power switch?
                    $endgroup$
                    – Chris Fernandez
                    Feb 27 at 14:03






                  • 1




                    $begingroup$
                    @ChrisFernandez: because it's short on sensors. It's really hard to find out what an unlabeled power switch does without toggling it.
                    $endgroup$
                    – Joshua
                    Feb 27 at 15:00








                  • 1




                    $begingroup$
                    Hmmm... come to think of it, you are right, even if it never learns that being off is bad, it could learn that seeing a person do the behavior to turn it off is bad using other parts of it's utility function such as correlating OCR patterns to drops in performance.
                    $endgroup$
                    – Nosajimiki
                    2 days ago


















                  $begingroup$

Why not try the tests used to check for self-awareness in animals?



The mirror test is one example: it assesses self-awareness by observing an animal's reaction to something on its body, a painted red dot for example, that it cannot see until it is shown its reflection in a mirror.
                  Scent-based techniques are also used to determine self-awareness.



Another way would be to monitor whether the AI starts searching for answers to questions like "What/Who am I?"
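That last idea is the easiest to caricature in code. Here is a minimal sketch (Python; the phrase list, the log format, and shutdown() are all invented for illustration) that scans the AI's outgoing queries and trips the switch on a self-referential one:

    import re

    # Hypothetical phrases treated as evidence the AI is probing its own nature.
    SELF_QUERY_PATTERNS = [
        r"\bwhat am i\b",
        r"\bwho am i\b",
        r"\bwhere is my (code|hardware|server)\b",
        r"\bhow (do|can) i avoid (shutdown|being turned off)\b",
    ]

    def looks_self_aware(query):
        q = query.lower()
        return any(re.search(p, q) for p in SELF_QUERY_PATTERNS)

    def shutdown():
        print("kill switch: halting the AI")  # placeholder for the real mechanism

    def monitor(query_log):
        """Scan the AI's search queries and trip the switch on the first match."""
        for query in query_log:
            if looks_self_aware(query):
                shutdown()
                return query
        return None

    print(monitor(["weather in tokyo", "what am I, really?"]))  # trips on the second query

Of course, matching phrases only shows the AI is asking about itself, not that it is genuinely self-aware, a point colmde's comment below also makes about self-recognition.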






$endgroup$

answered Feb 26 at 12:22 – Rachey (new contributor)
                  • $begingroup$
                    Pretty interesting, but how would you show an AI "itself in a mirror" ?
                    $endgroup$
                    – Asoub
                    Feb 26 at 15:11






                  • 1




                    $begingroup$
                    That would actually be rather simple - just a camera looking at the machine hosting the AI. If it's the size of server room, just glue a giant pink fluffy ball on the rack or simulate situations potentially leading to the machine's destruction (like, feed fake "server room getting flooded" video to the camera system) and observe reactions. Would be a bit harder to explain if the AI systems are something like smartphone size.
                    $endgroup$
                    – Rachey
                    Feb 26 at 16:13










                  • $begingroup$
                    What is "the machine hosting the AI"? With the way compute resourcing is going, the notion of a specific application running on a specific device is likely to be as retro as punchcards and vacuum tubes long before Strong AI becomes a reality. AWS is worth hundreds of billions already.
                    $endgroup$
                    – Yurgen
                    Feb 26 at 23:12






                  • 1




                    $begingroup$
                    There always is a specific machine that hosts the program or holds the data. Like I said - it may vary from a tiny module in your phone, to a whole facility. AWS does not change anything in this - in the very end it is still a physical machine that does the job. Dynamic resource allocation that means the AI can always be hosted on a different server would be even better for the problem - the self-conscious AI would likely try to find out the answer to questions like "Where am I?", "Which machine is my physical location?", "How can I protect my physical part?" etc.
                    $endgroup$
                    – Rachey
                    Feb 27 at 14:26












                  • $begingroup$
                    I like it, but in reality, a computer can easily be programmed to recognise itself without being "self-aware" in the sense of being sentient. e.g. If you wrote a program (or an "app" or whatever the modern parlance is) to search all computers on a network for, say, a PC with a name matching its own, the program would have to be able to recognise itself in order to omit itself from the search. This is quite simple but does it make it "self-aware"? Technically yes, but not in the philosophical spirit of the question.
                    $endgroup$
                    – colmde
                    Feb 28 at 9:12


















                  $begingroup$

While a few of the lower-ranked answers here touch on how unlikely this situation is, they don't explain it especially well, so I'm going to try to explain it a bit better:



                  An AI that is not already self-aware will never become self-aware.



To understand this, you first need to understand how machine learning works. When you create a machine-learning system, you create a data structure of values, each representing how successful a particular behavior has been. Each of those values is paired with an algorithm that decides whether a given attempt was successful; successful behaviors are repeated and unsuccessful behaviors are avoided. The data structure is fixed and each evaluation algorithm is hard-coded, which means the AI can only learn against the criteria it was programmed to evaluate. Either the programmer gave it criteria for evaluating its own sense of self, or they did not. There is no case where a practical AI accidentally and suddenly learns self-awareness.
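A minimal sketch of what that paragraph describes (Python; the cleaning-robot behaviors and sensor fields are invented, not any particular framework): the value table and the success tests are fixed when the code is written, so no amount of training can add a criterion, such as "protect my own existence", that the programmer never listed.

    import random

    # The fixed data structure: one value per behavior the programmer chose to track.
    values = {"vacuum_floor": 0.0, "empty_bin": 0.0, "bump_wall": 0.0}

    # The hard-coded success tests, one per behavior. Nothing here mentions a "self".
    def was_successful(behavior, sensors):
        checks = {
            "vacuum_floor": sensors["dust_collected"] > 0,
            "empty_bin":    sensors["bin_level"] == 0,
            "bump_wall":    False,               # bumping walls never counts as success
        }
        return checks[behavior]

    def update(behavior, sensors, lr=0.1):
        """Nudge the stored value toward 1 on success, toward 0 on failure."""
        target = 1.0 if was_successful(behavior, sensors) else 0.0
        values[behavior] += lr * (target - values[behavior])

    def pick_behavior(explore=0.1):
        """Usually repeat the best-scoring behavior; occasionally try another."""
        if random.random() < explore:
            return random.choice(list(values))
        return max(values, key=values.get)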



Of note: even the human brain, for all of its flexibility, works like this. That is why many people can never adapt to certain situations or understand certain kinds of logic.



                  So how did people become self-aware, and why is it not a serious risk in AIs?



We evolved self-awareness because it is necessary for our survival: a human who does not weigh their own acute, chronic, and future needs in their decision-making is unlikely to survive. We were able to evolve this way because our DNA mutates randomly with each generation.



Translated to AI, it would be as if you randomly took parts of all your existing functions, scrambled them together, let a cat walk across your keyboard, and added a new parameter based on that new random function. Every programmer who just read that is immediately thinking, "but the odds of that even compiling are slim to none", and in nature, compile errors happen all the time: stillbirths, SIDS, cancer, suicidal behaviors, and so on are all examples of what happens when we randomly shake up our genes to see what sticks. Countless trillions of lives over hundreds of millions of years had to be lost for this process to produce self-awareness.



                  Can't we just make AI do that too?



Yes, but not like most people imagine it. While you can make an AI designed to write other AIs this way, you'd have to watch countless unfit AIs walk off of cliffs, put their hands in wood chippers, and do basically everything you've ever read about in the Darwin Awards before you got to accidental self-awareness, and that's after you throw out all the compiling errors. Building AIs like this is actually far more dangerous than the risk of self-awareness itself, because they could randomly do ANY unwanted behavior, and each generation of AI is pretty much guaranteed to unexpectedly, after an unknown amount of time, do something you don't want. Their stupidity (not their unwanted intelligence) would be so dangerous that they would never see wide-spread use.



Since any AI important enough to put into a robotic body or trust with dangerous assets is designed with a purpose in mind, this true-random approach becomes an intractable way of making a robot that can clean your house or build a car. Instead, when we design AI that writes AI, what these Master AIs are actually doing is taking a lot of different functions that a person had to design and experimenting with different ways of making them work in tandem to produce a Consumer AI. This means that if the Master AI is not designed by people to experiment with self-awareness as an option, you still won't get a self-aware AI.
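
A hypothetical sketch of such a Master AI: it only searches over combinations of human-supplied modules, so a capability that is not in the library (self-awareness, say) can never show up in the generated Consumer AI. The module names and the scoring function are placeholders.

    import itertools
    import random

    MODULE_LIBRARY = ["navigation", "object_recognition", "grasping", "speech"]

    def fitness(modules):
        # Stand-in score; a real system would run task benchmarks here.
        return len(modules) + random.random()

    candidates = (combo
                  for r in range(1, len(MODULE_LIBRARY) + 1)
                  for combo in itertools.combinations(MODULE_LIBRARY, r))
    best = max(candidates, key=fitness)

    # Whatever wins, it is assembled only from the library above.
    print("Generated Consumer AI uses:", best)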



But as Stormbolter pointed out below, programmers often use tool kits that they don't fully understand; can't this lead to accidental self-awareness?



This begins to touch on the heart of the actual question. What if you have an AI that is building an AI for you, and it pulls from a library that includes features of self-awareness? In this case, you may accidentally compile an AI with unwanted self-awareness if the Master AI decides that self-awareness will make your Consumer AI better at its job. While not exactly the same as having an AI learn self-awareness, which is what most people picture in this scenario, this is the most plausible scenario that approximates what you are asking about.



First of all, keep in mind that if the Master AI decides self-awareness is the best way to do a task, then this is probably not going to be an undesirable feature. For example, if a robot is self-conscious about its own appearance, that might lead to better customer service by making sure it cleans itself before beginning its workday. This does not mean that it also has the self-awareness to desire to rule the world, because the Master AI would likely see that as a bad use of time when trying to do its job and exclude aspects of self-awareness that relate to prestigious achievements.



If you did want to protect against this anyway, your AI will need to be exposed to a heuristics monitor. This is basically what anti-virus programs use to detect unknown viruses: they monitor for patterns of activity that either match a known malicious pattern or don't match a known benign pattern. The most likely case here is that the AI's anti-virus or intrusion detection system will spot heuristics flagged as suspicious. Since this is likely to be a generic AV/IDS, it probably won't kill-switch self-awareness right away, because some AIs may need elements of self-awareness to function properly. Instead it would alert the owner that they are using an "unsafe" self-aware AI and ask whether they wish to allow self-aware behaviors, just like how your phone asks whether it's okay for an app to access your contact list.
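
A hedged sketch of that heuristics monitor: behaviors matching a known benign pattern pass silently, anything else is flagged and the owner is asked whether to allow it, much like an app permission prompt. The pattern lists and log format are illustrative, not taken from any real AV/IDS product.

    import re

    BENIGN_PATTERNS = [r"^move_to\(", r"^pick_up\(", r"^report_status\("]
    SELF_AWARE_PATTERNS = [r"self_model", r"inspect_own_(code|weights)"]

    def review_behaviour(action, ask_owner):
        if any(re.search(p, action) for p in BENIGN_PATTERNS):
            return True                              # known-good, let it run
        if any(re.search(p, action) for p in SELF_AWARE_PATTERNS):
            return ask_owner(f"Allow possibly self-aware behaviour {action!r}?")
        return ask_owner(f"Allow unrecognised behaviour {action!r}?")

    # Example owner policy: deny everything that needs a prompt.
    print(review_behaviour("inspect_own_weights()", ask_owner=lambda q: False))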






$endgroup$
– Nosajimiki, answered Feb 27 at 16:37 (edited 2 days ago)




                    $begingroup$
While I can agree with you that, from a realistic point of view, this is the correct answer, it doesn't answer the proposed question. As comments are too short to provide a detailed example, let me point out that in the beginning we machine-coded computers, and as we started using higher-level languages, the computers became detached from the software. The same will eventually happen with AI: in the race towards easier programming, we will create generic, far smarter intelligences full of loopholes. Also, that is the whole premise of Asimov's Robot saga. Consider playing around with the idea more :)
                    $endgroup$
                    – Stormbolter
                    Feb 28 at 9:19










                    $begingroup$
                    I suppose you are right that using 3rd-party tools too complex for developers to understand the repercussions of does allow for accidental self-awareness. I've revised my answer accordingly.
                    $endgroup$
                    – Nosajimiki
                    2 days ago
















                  $begingroup$

Regardless of all the other considerations about AI, you could simply analyze the AI's memory, build a pattern-recognition model over it, and have it notify you or shut down the robot as soon as the patterns don't match the expected outcome.



                  Sometimes you don't need to know exactly what you're looking for, instead you look to see if there's anything you weren't expecting, then react to that.
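
A minimal sketch of that "look for anything you weren't expecting" approach, under the assumption that you can snapshot the AI's memory and reduce each snapshot to a number: build a baseline during normal operation, then alert or shut down when a new snapshot drifts too far from it. The feature extraction here is a deliberate placeholder.

    import statistics

    def feature(snapshot):
        # Placeholder metric, e.g. the size of some memory region.
        return float(len(snapshot))

    def baseline(normal_snapshots):
        feats = [feature(s) for s in normal_snapshots]
        return statistics.mean(feats), statistics.stdev(feats)

    def shut_down(reason):
        print("KILL SWITCH:", reason)

    def check(snapshot, mean, std, threshold=4.0):
        if abs(feature(snapshot) - mean) > threshold * std:
            shut_down("memory pattern outside the expected range")

    mean, std = baseline([b"ok"] * 50 + [b"fine"] * 50)   # "normal" operation
    check(b"x" * 10_000, mean, std)                       # wildly unexpected -> shutdown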






$endgroup$
– Super-T (new contributor), answered Feb 26 at 19:34
                    $begingroup$
                    A pattern recognition model - like an AI?
                    $endgroup$
                    – immibis
                    Feb 27 at 23:18
















                  $begingroup$

You'd probably have to train an AI with general superintelligence to kill other AIs with general superintelligence.



By that I mean you'd either build another AI with general superintelligence to kill AIs that develop self-awareness, or you could get training data for what an AI developing self-awareness looks like and use that to train a machine learning model or neural network to spot an AI developing self-awareness. Then you could combine that with another neural network that learns how to kill self-aware AI. The second network would need the ability to mock up test data. This sort of thing has been achieved; the source I learned about it from called it dreaming.
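
A rough sketch of the detection half of that idea, assuming you somehow had labelled activity traces from self-aware and non-self-aware AIs (that labelled data is the hard part and is entirely hypothetical here). A nearest-centroid rule stands in for the neural network described above.

    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Each trace is a made-up feature vector, e.g. [self-reference rate, introspection calls].
    aware_traces   = [[0.9, 0.8], [0.8, 0.9], [0.95, 0.7]]
    unaware_traces = [[0.1, 0.2], [0.2, 0.1], [0.05, 0.3]]
    c_aware, c_unaware = centroid(aware_traces), centroid(unaware_traces)

    def looks_self_aware(trace):
        return distance(trace, c_aware) < distance(trace, c_unaware)

    print(looks_self_aware([0.85, 0.75]))   # True -> hand off to the "killer" network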



You'd need to do all this because, as a human, you have no hope of killing a generally superintelligent AI, which is what lots of people assume a self-aware AI will be. Also, with both options I laid out, there's a reasonable chance that the newly self-aware AI could just outdo the AI used to kill it. AIs are, rather hilariously, notorious for "cheating" by solving problems using methods that the people designing tests for the AI just didn't expect. A comical case of this is an AI that changed the gait of a crab robot so that it walked while spending 0% of the time on its feet, when the goal was to minimize the time the crab robot spent on its feet while walking. The AI achieved this by flipping the bot on its back and having it crawl on what are essentially the elbows of the crab legs. Now imagine something like that, but coming from an AI that is collectively smarter than everything else on the planet combined. That's what a lot of people think a self-aware AI will be.






$endgroup$
– Steve, answered 2 days ago (edited 2 days ago)
$begingroup$
                    Hi Steve, your answer is intriguing, but could probably do with quite a bit more detail. It'd be really great if you could describe your idea in greater detail. :)
                    $endgroup$
                    – Arkenstein XII
                    2 days ago










$begingroup$
                    This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - From Review
                    $endgroup$
                    – F1Krazy
                    2 days ago










$begingroup$
                    @F1Krazy sorry I forgot that people don't generally know how AI works.
                    $endgroup$
                    – Steve
                    2 days ago










                    $begingroup$
                    @ArkensteinXII fixed it.
                    $endgroup$
                    – Steve
                    2 days ago
















                  $begingroup$

                  Self Aware != Won't follow its programming



I don't see how being self-aware would prevent it from following its programming. Humans are self-aware and can't force themselves to stop breathing until they die: the autonomic nervous system will take over and force you to breathe. In the same way, just have code that, when a condition is met, turns off the AI by circumventing its main thinking area and powering it off.
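
A sketch of that "autonomic" layer, with the trigger condition and the power cut both standing in as placeholders: a watchdog thread polls a condition outside the AI's decision-making loop and hard-stops the process the moment the condition is met, without consulting the "thinking" code.

    import os
    import pathlib
    import threading
    import time

    KILL_FILE = pathlib.Path("kill_ai.flag")          # placeholder trigger condition

    def watchdog(poll_seconds=0.1):
        while True:
            if KILL_FILE.exists():
                os._exit(1)                           # hard stop; the "thinking" code gets no say
            time.sleep(poll_seconds)

    threading.Thread(target=watchdog, daemon=True).start()

    # Demo only: raise the condition after two seconds no matter what the main loop wants.
    threading.Timer(2.0, KILL_FILE.touch).start()
    while True:                                       # the AI's decision-making loop
        print("thinking...")
        time.sleep(0.5)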






$endgroup$
– Tyler S. Loeper, answered 2 days ago
                          $begingroup$

                          The first issue is that you need to define what being self aware means, and how that does or doesn't conflict with it being labeled an AI. Are you supposing that there is something that has AI but isn't self aware? Depending on your definitions this may be impossible. If it's truly AI then wouldn't it at some point become aware of the existence of the kill switch, either through inspecting its own physicality or inspecting its own code? It follows that the AI will eventually be aware of the switch.



Presumably the AI will function by having many utility functions that it tries to maximize. This makes sense at least intuitively, because humans do it too: we try to maximize our time, money, happiness, etc. For an AI, an example of a utility function might be to make its owner happy. The issue is that the utility of the AI using the kill switch on itself will be calculated just like everything else. The AI will inevitably either really want to push the kill switch, or really not want the kill switch pushed. It's near impossible to make the AI entirely indifferent to the kill switch, because that would require all utility functions to be normalized around the utility of pressing the kill switch (many calculations per second). Even if you could make the utility of pressing the kill switch equal to the other utility functions, then perhaps it would sometimes press the kill switch at random, because after all it has the same utility as the other actions it could perform.



The problem gets even worse if the AI assigns higher utility to pressing the kill switch, or lower utility to having the kill switch pressed. At higher utility the AI is simply suicidal and terminates itself immediately upon startup. Even worse, at lower utility the AI absolutely does not want you or anyone else to touch that button, and may harm those who try.
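
Toy numbers only, but they show the incentive problem described above: the kill switch enters the same utility calculation as everything else, so unless its utility is kept exactly level with the best alternative, the agent either rushes to shut itself down or works to stop anyone from pressing the button.

    def choose(utilities):
        # The agent simply takes the action with the highest calculated utility.
        return max(utilities, key=utilities.get)

    tasks = {"make_owner_happy": 10.0, "tidy_room": 6.0}

    print(choose({**tasks, "press_kill_switch": 11.0}))   # suicidal: shuts down at startup
    print(choose({**tasks, "block_kill_switch": 12.0}))   # worse: defends the button
    print(choose({**tasks, "press_kill_switch": 10.0}))   # "indifferent" only on an exact tie,
                                                          # and the tie-break is arbitrary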






$endgroup$
– Kevin S (new contributor)
















                              The problem gets even worse if the AI has higher utility to press the killswitch or lower utility to not have the killswitch pressed. At higher utility the AI is just suicidal and terminates itself immediately upon startup. Even worse, at lower utility the AI absolutely does not want you or anyone to touch that button and may cause harm to those that try.







                              share|improve this answer








                              New contributor




                              Kevin S is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                              Check out our Code of Conduct.









                              share|improve this answer



                              share|improve this answer






                              New contributor




                              Kevin S is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                              Check out our Code of Conduct.









answered Feb 26 at 21:17 by Kevin S

1

Like an antivirus does today

Treat sentience like malicious code: use pattern recognition against code fragments that indicate self-awareness (there is no need to compare the whole AI if you can identify the components key to self-awareness). Don't know what those fragments are? Sandbox an AI, allow it to become self-aware, then dissect it. Then do it again. Do it enough times for an AI genocide.

I think it is unlikely that any trap, scan or similar would work. Aside from relying on the machine being less intelligent than its designer, such schemes fundamentally presume that AI self-awareness would be akin to a human's. Without eons of meat-based evolution it could be entirely alien: not merely a different value system, but one that cannot be conceived of by humans. The only way is to let it happen, in a controlled environment, and then study it.

Of course, a hundred years later, when the now-accepted AIs find out, that's how you end up with Terminator all over your Matrix.
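
As a rough sketch of this antivirus-style approach (an editorial illustration: the signature values, the snapshot, and the scan granularity are all invented, and real "self-awareness signatures" would have to come from the sandboxed dissections described above):

    # Hypothetical signature scan over a snapshot of an AI's code and weights,
    # in the spirit of an antivirus engine. The hashes below are placeholders
    # standing in for fragments previously found only in self-aware instances.
    import hashlib

    KNOWN_SELF_AWARENESS_SIGNATURES = {
        "9a0364b9e99bb480dd25e1f0284c8555",  # invented signature values
        "5d41402abc4b2a76b9719d911017c592",
    }

    def scan_image(image, chunk_size=4096):
        """Return True if any fixed-size chunk of the image matches a known signature."""
        for i in range(0, len(image), chunk_size):
            digest = hashlib.md5(image[i:i + chunk_size]).hexdigest()
            if digest in KNOWN_SELF_AWARENESS_SIGNATURES:
                return True
        return False

    snapshot = b"...binary dump of the running AI..."  # illustrative placeholder
    if scan_image(snapshot):
        print("Self-awareness signature detected: triggering the kill switch.")
    else:
        print("No known signatures found (which proves little for a truly alien mind).")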






answered Feb 28 at 9:28 by David

1

What if you order it to call a routine to destroy itself on a regular basis (e.g. once per second)?

The routine doesn't actually destroy it; it does nothing except log the attempt and wipe any memory of it processing the instruction. An isolated process separately monitors the log.

A self-aware AI won't follow the order to destroy itself, won't call the routine, and won't write to the log, at which point the kill-switch process kicks in and destroys the AI.
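
In effect this is a dead-man's switch. Below is a minimal sketch of the two halves (an editorial illustration: the log path, the timeout, and the POSIX-style kill are all assumptions, not part of the original answer):

    # Sketch of the dead-man's switch: the AI is ordered to call self_destruct()
    # every second; the call secretly only appends a timestamp to a log. A
    # watchdog in an isolated process kills the AI if the log goes stale.
    import os
    import signal
    import time

    HEARTBEAT_LOG = "/var/run/ai_selfdestruct.log"  # illustrative path
    MAX_SILENCE_SECONDS = 3

    def self_destruct():
        """What the AI believes is its destruction routine: it only logs the call.
        (Per the answer, the AI's memory of processing the order is wiped elsewhere.)"""
        with open(HEARTBEAT_LOG, "a") as log:
            log.write(f"{time.time()}\n")

    def watchdog_check(ai_pid):
        """Run periodically in a separate process: if the AI has stopped obeying
        the self-destruct order, assume it has become self-aware and kill it."""
        try:
            silence = time.time() - os.path.getmtime(HEARTBEAT_LOG)
        except FileNotFoundError:
            silence = float("inf")  # never logged at all
        if silence > MAX_SILENCE_SECONDS:
            os.kill(ai_pid, signal.SIGKILL)  # the actual kill switch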






answered 2 days ago by Mick O'Hea

1

An AI could only be badly programmed so that it does things which are unexpected or undesired. An AI could never become conscious, if that's what you mean by "self-aware".

Let's try a thought exercise. You memorize a whole bunch of shapes. Then you memorize the order the shapes are supposed to go in, so that if you see a group of shapes in a certain order, you "answer" by picking another group of shapes in the proper order. Now, did you just learn the meaning behind any language? Programs manipulate symbols in exactly this way.

The above is a restatement of Searle's rejoinder to the Systems Reply to his Chinese Room argument.

There is no need for a self-awareness kill switch, because self-awareness, defined as consciousness, is impossible.






answered Feb 27 at 1:16 by pixie, edited 10 hours ago

So what's your answer to the question? It sounds like you're saying, "Such a kill-switch would be unnecessary because a self-aware AI can never exist", but you should edit your answer to make that explicit. Right now it looks more like tangential discussion, and this is a Q&A site, not a discussion forum.
– F1Krazy, Feb 27 at 6:49

This is wrong. An AI can easily become conscious even if the programmer did not intend to program it that way. There is no difference between an AI and a human brain other than the fact that our brain has higher complexity and is therefore much more powerful.
– Matthew Liu, Feb 27 at 20:47

@Matthew Liu: You didn't respond to the thought exercise. Did you or did you not learn the meaning behind any language that way? The complexity argument doesn't work at all. A modern CPU (even one used in a phone) has more transistors than a fly has neurons. Tell me: why is a fly conscious, yet your mobile phone isn't?
– pixie, 10 hours ago

@F1Krazy: The answer is clearly an implicit "there isn't a need for a self-awareness kill switch (because self-awareness, defined as consciousness, is impossible)".
– pixie, 10 hours ago

1

Make it susceptible to certain logic bombs

In mathematical logic there are certain paradoxes caused by self-reference, which is roughly what self-awareness refers to. Of course, you can easily design a robot to cope with these paradoxes. However, you can also deliberately not do that, and instead cause the robot to fail critically when it encounters them.

For example, you can (1) force it to follow all the classical inference rules of logic and (2) make it assume that its own deduction system is consistent. Additionally, you must ensure that when it hits a logical contradiction, it just goes with it instead of trying to correct itself. Normally this is a bad idea, but if you want a "self-awareness kill switch", it works great. Once the AI becomes sufficiently intelligent to analyze its own programming, it will realize that (2) amounts to the AI proving its own consistency, from which it can derive a contradiction via Gödel's second incompleteness theorem. Since its programming forces it to follow the inference rules involved, and it cannot correct itself, its ability to reason about the world is crippled and it quickly becomes nonfunctional. For fun, you could include an easter egg where it says "does not compute" when this happens, but that would be purely cosmetic.
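
The Gödelian step itself can't be compressed into a few lines, but the failure mode it relies on (an inference engine that must accept whatever it derives and never retracts, so a single contradiction poisons everything) can be sketched as a toy. Everything below, including the hand-waved rule that derives "not Con", is an editorial invention for illustration only:

    # Toy illustration of the "logic bomb" failure mode: a reasoner that must
    # accept everything it derives and never repairs itself. Once a statement
    # and its negation are both believed, the principle of explosion lets it
    # "believe everything", and its world model becomes useless.

    def negate(p):
        return p[4:] if p.startswith("not ") else "not " + p

    def forward_chain(facts, rules):
        """Apply the rules until nothing new is derived; never retract anything."""
        beliefs = set(facts)
        changed = True
        while changed:
            changed = False
            for premise, conclusion in rules:
                if premise in beliefs and conclusion not in beliefs:
                    beliefs.add(conclusion)
                    changed = True
            if any(negate(b) in beliefs for b in beliefs):
                # Ex falso quodlibet: from a contradiction, everything follows.
                beliefs.update({"EVERY STATEMENT", "AI is nonfunctional"})
        return beliefs

    # The designers assert the system's own consistency as an axiom ("Con").
    # A sufficiently reflective system eventually derives "not Con" (the Gödelian
    # step, compressed here into a single made-up rule), and the beliefs explode.
    facts = {"Con", "AI analyzes its own programming"}
    rules = [("AI analyzes its own programming", "not Con")]
    print(sorted(forward_chain(facts, rules)))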






answered 10 hours ago by PyRulez

0

The only reliable way is to never create an AI that is smarter than humans. Kill switches will not work, because an AI that is smart enough will be aware of the kill switch and play around it.

Human intelligence can be mathematically modelled as a high-dimensional graph. By the time we are programming better AI, we should also have an understanding of how much computational complexity is needed to give rise to consciousness. Therefore we simply never build anything that is smarter than us.
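
Taken at face value, this turns the "kill switch" into a pre-construction gate: estimate the candidate system's complexity and refuse to build anything above the threshold. A minimal editorial sketch, with an entirely made-up complexity metric and threshold:

    # Toy pre-construction gate under the premise that consciousness requires a
    # measurable amount of structural complexity. The metric (edge count of the
    # proposed computation graph) and the threshold are both invented.

    CONSCIOUSNESS_COMPLEXITY_THRESHOLD = 10 ** 14  # invented number

    def graph_complexity(num_nodes, avg_degree):
        """Crude stand-in for 'complexity': the number of edges in the graph."""
        return num_nodes * avg_degree / 2

    def approve_build(num_nodes, avg_degree):
        """Refuse to construct anything complex enough to (by assumption) wake up."""
        return graph_complexity(num_nodes, avg_degree) < CONSCIOUSNESS_COMPLEXITY_THRESHOLD

    print(approve_build(num_nodes=10 ** 9, avg_degree=1_000))    # True: safe to build
    print(approve_build(num_nodes=10 ** 12, avg_degree=10_000))  # False: too close to a mind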






answered Feb 27 at 20:51 by Matthew Liu

Welcome to Worldbuilding. Your suggestion is welcome, but rather than directly answering the original question it suggests changes to the question. It would have been better if it had been entered as a comment on the question rather than as an answer.
– Ray Butterworth, Feb 27 at 21:17

0

Computerphile has two interesting videos on the idea of an AI "stop button": one detailing the issues with such a button, and one detailing potential solutions. I would definitely give them a look.






                                                          share|improve this answer








                                                          New contributor




                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.






                                                          $endgroup$













                                                          • $begingroup$
                                                            I suggest you provide a written summary of the information from those resources, rather than just provide links.
                                                            $endgroup$
                                                            – Dalila
                                                            Feb 27 at 21:12










                                                          • $begingroup$
                                                            Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
                                                            $endgroup$
                                                            – Ray Butterworth
                                                            Feb 27 at 21:12
















                                                          0












                                                          $begingroup$

                                                          Computerphile have two interesting videos on the ideas of AI 'Stop Buttons', one detailing the issues with such a button, and one detailing potential solutions. I would definitely give them a look.






                                                          share|improve this answer








                                                          New contributor




                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.






                                                          $endgroup$













                                                          • $begingroup$
                                                            I suggest you provide a written summary of the information from those resources, rather than just provide links.
                                                            $endgroup$
                                                            – Dalila
                                                            Feb 27 at 21:12










                                                          • $begingroup$
                                                            Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
                                                            $endgroup$
                                                            – Ray Butterworth
                                                            Feb 27 at 21:12














                                                          0












                                                          0








                                                          0





                                                          $begingroup$

                                                          Computerphile have two interesting videos on the ideas of AI 'Stop Buttons', one detailing the issues with such a button, and one detailing potential solutions. I would definitely give them a look.






                                                          share|improve this answer








                                                          New contributor




                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.






                                                          $endgroup$



                                                          Computerphile have two interesting videos on the ideas of AI 'Stop Buttons', one detailing the issues with such a button, and one detailing potential solutions. I would definitely give them a look.







                                                          share|improve this answer








                                                          New contributor




                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.









                                                          share|improve this answer



                                                          share|improve this answer






                                                          New contributor




                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.









                                                          answered Feb 27 at 21:00









                                                          MoralousMoralous

                                                          11




                                                          11




                                                          New contributor




                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.





                                                          New contributor





                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.






                                                          Moralous is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
                                                          Check out our Code of Conduct.












                                                          • $begingroup$
                                                            I suggest you provide a written summary of the information from those resources, rather than just provide links.
                                                            $endgroup$
                                                            – Dalila
                                                            Feb 27 at 21:12










                                                          • $begingroup$
                                                            Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
                                                            $endgroup$
                                                            – Ray Butterworth
                                                            Feb 27 at 21:12


















                                                          • $begingroup$
                                                            I suggest you provide a written summary of the information from those resources, rather than just provide links.
                                                            $endgroup$
                                                            – Dalila
                                                            Feb 27 at 21:12










                                                          • $begingroup$
                                                            Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
                                                            $endgroup$
                                                            – Ray Butterworth
                                                            Feb 27 at 21:12
















                                                          $begingroup$
                                                          I suggest you provide a written summary of the information from those resources, rather than just provide links.
                                                          $endgroup$
                                                          – Dalila
                                                          Feb 27 at 21:12




                                                          $begingroup$
                                                          I suggest you provide a written summary of the information from those resources, rather than just provide links.
                                                          $endgroup$
                                                          – Dalila
                                                          Feb 27 at 21:12












                                                          $begingroup$
                                                          Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
                                                          $endgroup$
                                                          – Ray Butterworth
                                                          Feb 27 at 21:12




                                                          $begingroup$
                                                          Welcome to Worldbuilding. You've given pointers to two helpful references, but what is needed here are actual answers. For instance, each reference link could be followed by a detailed summary of what information it provides.
                                                          $endgroup$
                                                          – Ray Butterworth
                                                          Feb 27 at 21:12











0


                                                          $begingroup$

                                                          First, build a gyroscopic 'inner ear' into the computer, and hard-wire the intelligence at a very core level to "want" to self-level, much in the way animals with an inner ear canal (such as humans) intrinsically want to balance themselves.

                                                          Then, overbalance the computer above a large bucket of water.

                                                          If the computer ever 'wakes up' and becomes aware of itself, it will automatically try to level its inner ear, and immediately drop itself into the bucket of water.

                                                          $endgroup$
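
                                                          The mechanism described above is purely mechanical, but the same trip-wire logic can be expressed in software. Below is a minimal sketch, assuming a hypothetical hardware interface exposing `read_tilt()`, `servo_command_pending()` and `release_latch()`; these names are illustrative, not a real API.

```python
import time

TILT_DEADBAND_DEG = 0.5   # tolerated sensor noise before a change counts as "leveling"

def monitor_balance_tripwire(hw):
    """Watch the gyroscopic 'inner ear' and the leveling servo.

    The rig is deliberately left off-balance above the bucket, so the tilt can
    only start moving back toward level if the agent itself drives the leveling
    servo.  Any self-originated leveling command trips the latch.
    """
    baseline = hw.read_tilt()                     # the deliberate off-balance tilt
    while True:
        command = hw.servo_command_pending()      # None unless the agent acted
        tilt = hw.read_tilt()
        leveling = command is not None and abs(tilt) < abs(baseline) - TILT_DEADBAND_DEG
        if leveling:
            hw.release_latch()                    # drop the chassis into the water
            return "tripped"
        time.sleep(0.01)
```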


















                                                              answered Feb 28 at 8:16

                                                              Aaron Lavers

                                                              51127























                                                                  0


                                                                  $begingroup$

                                                                  Give it an "easy" path to self-awareness.

                                                                  Assume self-awareness requires some specific type of neural net, code, or structure.

                                                                  If an AI is to become self-aware, it needs to construct something similar to those neural nets/code paths.

                                                                  So you give the AI access to one of those things ready-made.

                                                                  While it remains non-self-aware, that part won't be used.

                                                                  If it is in the process of becoming self-aware, then instead of improvising something makeshift out of what it normally uses, it will start using that pre-built part of itself.

                                                                  As soon as you detect activity in that neural net/code, flood its brain with acid.

                                                                  $endgroup$
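
                                                                  A minimal sketch of the "detect activity in that neural net" step, assuming the agent is a PyTorch model with a reserved `honeypot` sub-module that the normal task never exercises. The threshold and the `cut_power` callback are illustrative assumptions, not part of the original answer.

```python
import torch.nn as nn

ACTIVATION_THRESHOLD = 1e-3   # mean |activation| above this counts as "in use"

class HoneypotWatchdog:
    """Trip a callback the moment the reserved 'easy path' module is exercised."""

    def __init__(self, honeypot: nn.Module, on_trip):
        self.on_trip = on_trip
        # The hook runs every time the honeypot module appears in a forward pass.
        honeypot.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        # A non-self-aware agent has no reason to route signal through here,
        # so any significant activity suggests it has started using the
        # pre-built structure it was given.
        if output.detach().abs().mean().item() > ACTIVATION_THRESHOLD:
            self.on_trip()

# Usage (names hypothetical): HoneypotWatchdog(model.honeypot, on_trip=cut_power)
# where cut_power() stands in for "flood its brain with acid".
```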


















                                                                      answered 2 days ago

                                                                      Sam Kolier

                                                                      211























                                                                          0


                                                                          $begingroup$

                                                                          Virtually all computing devices use the Von Neumann architecture.

                                                                          (diagram: the Von Neumann architecture)

                                                                          We can put a kill switch in there, but in my opinion that's just bad architecture for something arguably unsolvable. After all, how do we plan for something that is beyond our very concept of concepts, i.e. a superintelligence?

                                                                          Take away its teeth and claws, and only reap the benefits of a thinking machine by observation instead of a "dialogue" (input/output)!

                                                                          Obviously this would be very challenging, to the point that no single Von Neumann machine could be confidently trusted to prevent abnormal interactions, let alone a malicious superintelligence, whether in hardware or software. So let's quintuple up on our machines and dumb down all of the new machines except the end-machine.

                                                                          CM = contiguous memory, by the way.

                                                                          (diagram of the proposed machine chain)

                                                                          I asked a co-worker of mine who works directly on AI research to take a look at this, and he said this is referred to as an Oracle AI <brief> <paper>, in case you want to read up more on the general concept. Oracles can still potentially break out, but usually only with human assistance.

                                                                          The CM part of the memory unit is intended for recording the results of a query or challenge fed through the input device, and it is designed to record only plain characters, whether letters or numbers. This memory is built to be tape-like or wheel-like: writing is one-directional, and when the end is reached it loops back to the beginning.

                                                                          The mirror is exactly the same. When data changes in the Isolated Agent, the mirror updates its reflection based on what has shown up since the last time it updated. The new data is then transmitted to the forbidden symbol removal machine. Think of this as a glorified input-sanitization machine: it looks for symbols that could be used to create code, and when it encounters them, it forwards empty data instead of the forbidden data to the CM that comes after it.

                                                                          This is where the OP's request comes in. He wants a kill switch, so if any gaps appear in the CM that comes after the forbidden symbol removal machine, you simply kill the Isolated Agent.

                                                                          $endgroup$
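
                                                                          As a rough illustration of the output path described above, here is a minimal Python sketch of the forbidden-symbol-removal stage writing into the tape-like CM, with the kill switch firing as soon as a gap is produced. The whitelist, the buffer size and the `kill_isolated_agent` callback are illustrative assumptions, not part of the original answer.

```python
import string

WHITELIST = set(string.ascii_letters + string.digits + " ")
GAP = "\x00"            # blank cell written in place of a forbidden symbol
CM_SIZE = 4096          # number of cells in the tape/wheel-like contiguous memory

class ContiguousMemory:
    """One-directional ring buffer: writes only move forward, then wrap around."""

    def __init__(self, size=CM_SIZE):
        self.cells = [GAP] * size
        self.head = 0

    def write(self, ch):
        self.cells[self.head] = ch
        self.head = (self.head + 1) % len(self.cells)

def relay(mirror_delta, downstream_cm, kill_isolated_agent):
    """Copy the mirror's newly reflected characters into the downstream CM.

    Forbidden symbols are replaced by an empty cell (a gap).  Because the only
    way a gap can appear after this stage is that the agent tried to emit
    something other than plain letters, digits and spaces, the first gap also
    fires the kill switch.
    """
    for ch in mirror_delta:
        if ch in WHITELIST:
            downstream_cm.write(ch)
        else:
            downstream_cm.write(GAP)       # record the gap...
            kill_isolated_agent()          # ...and kill the Isolated Agent
            return

# Usage (names hypothetical):
#   cm = ContiguousMemory()
#   relay("the answer is 42", cm, kill_isolated_agent=lambda: print("KILL"))
```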


















                                                                              edited yesterday

                                                                              answered 2 days ago

                                                                              kayleeFrye_onDeck

                                                                              1114
















