LessWrong.com News

A community blog devoted to refining the art of rationality

Conspiracy World is missing.

April 15, 2019 - 19:29
Published on April 15, 2019 7:58 AM UTC

Can one of you tell me what happened to /tag/conspiracy_world? I could not find it anywhere I looked online: the only links lead to that page, which says that it couldn't find the content.

I went looking for it because of Eliezer's "Final Words" (27th April, 2009). I would like to read the rest of it.




IRL 7/8: Generalizing human-robot cooperation: Cooperative IRL

April 15, 2019 - 13:13
Published on April 15, 2019 10:13 AM UTC

Every Monday for 8 weeks, we will be posting lessons about Inverse Reinforcement Learning. This is lesson 7.

Note that access to the lessons requires creating an account here.

Have a nice day!




Scrying for outcomes where the problem of deepfakes has been solved

April 15, 2019 - 07:45
Published on April 15, 2019 4:45 AM UTC

(Prompted by the post: On Media Synthesis: An Essay on The Next 15 Years of Creative Automation, where Yuli comments "Deepfakes exist as the tip of the warhead that will end our trust-based society")

There are answers to the problem of deepfakes. I thought of one, very soon after first hearing about the problem. I later found that David Brin spoke of the same thing 20 years ago in The Transparent Society. The idea seems not to have surfaced or propagated at all in any of the deepfake discourse, and I find that a little bit disturbing. There is a cartoon Robin Hanson that sits on my shoulder who's wryly whispering "Fearmongering is not about preparation" and "News is not about informing". I hope it isn't true. Anyway.

In short, if we want to stay sane, we will start building cameras with tamperproof seals that sign the data they produce with a manufacturer's RSA signature to verify that the footage comes directly from a real camera, and we will require all news providers to provide a checked (for artifacts of doctoring and generation), verified, signed (unedited) online copy of any footage they air. If we want to be extra thorough (and we should), we will also allocate public funding to the production of disturbing, surreal, inflammatory, but socially mostly harmless deepfakes to exercise the public's epistemic immune system, ensuring that they remain vigilant enough to check the national library of evidence for signed raws before acting on any new interesting video. I'm sure you'll find many talented directors who'd jump at the chance to produce these vaccinating works, and I think the tradition will find plenty of popular support, if properly implemented. The works could be great entertainment, as will the ensuing identification of dangerously credulous fools.
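To make the sign-then-verify idea concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not a proposal for real firmware: it uses unpadded "textbook" RSA with two publicly known Mersenne primes, and the names `sign`, `verify`, and `digest` are hypothetical. A real camera would use a vetted cryptographic library, a proper padding scheme, and a key generated inside the seal.

```python
import hashlib

# ASSUMPTION: toy textbook RSA for illustration only. The primes are
# publicly known Mersenne primes and there is no padding, so this is
# NOT secure. A real device would use a vetted library and key sizes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (would live inside the seal)


def digest(footage: bytes) -> int:
    """Hash the raw footage and reduce it to a number below the modulus."""
    return int.from_bytes(hashlib.sha256(footage).digest(), "big") % N


def sign(footage: bytes) -> int:
    """What the sealed camera does: sign the hash with the private key."""
    return pow(digest(footage), D, N)


def verify(footage: bytes, signature: int) -> bool:
    """What anyone can do with only the public key (N, E)."""
    return pow(signature, E, N) == digest(footage)


raw = b"raw frame bytes straight from the sensor"
signature = sign(raw)
print(verify(raw, signature))                  # unedited footage checks out
print(verify(raw + b" (edited)", signature))   # any edit breaks the signature
```

The point of the design is that verification needs only the public values, so a national library of signed raws could be checked by anyone, while producing a valid signature requires the key that the fragile seal is meant to destroy on tampering.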


Technical thoughts about those sealed cameras

The camera's seal should be fragile. When it's broken (~ when there is any slight shift in gas pressure or membrane conductivity, when the components move, when the unpredictable, randomly chosen build parameters fall out of calibration), the camera's registered private key will be thoroughly destroyed, with a flash of UV, current, and, ideally, magnesium fire, so that it cannot be extracted and used to produce false signatures. It may be common for these cameras to fail spontaneously. We can live with that. Core components of cameras will mostly only continue to get cheaper.

I wish I could discuss practical processes for ensuring, through auditing, that the cameras' private keys are being kept secret during manufacture. We will need to avoid a situation where manufacturing rights are limited to a small few and the price of authorised sealed cameras climbs up into unaffordable ranges, making them inaccessible to the public and to smaller news agencies, but I don't know enough about the industrial process to discuss that.

There's an attack I'm not sure how to address, too. Very high-resolution screens and lenses could be used to show a sealed camera a scene that doesn't exist. The signature attests that the camera genuinely sees it, but it still isn't real. I'll name it the Screen Illusion Analogue Hole Attack (SIAHA).

It might be worth putting some kind of GPS chip inside the sealed portion of the camera, so that the attack's illusion screen would need to be moved to the location where the fake event was supposed to have happened. That would limit the applications of such an attack. However, GPS is currently very easy to fool, so we'll need to find a better location-verification technology than GPS. (This is not an isolated need.)

I initially imagined that a screen of sufficient fidelity, framerate, and dynamic range would be prohibitively expensive to produce... On reflection... The VR field aspires to make such screens ubiquitous:

  • Resolution targets may eventually be met.
    • A point in favour: maximum human-perceptible pixel density will be approached. Eye tracking will open the way to foveated rendering, wherein only the small patch of the scene the user is looking directly at is rendered at max resolution. Current rendering hardware is already beefy enough to support foveated rendering, as it allows us to significantly down-spec the resolution of everything the user isn't looking at. The hardware will not necessarily be made to accept streaming 4k raw footage fresh out of the box (more like 720p footage plus another 720p stream for the foveal patch), but the pixels will all be there and the screen will be dense enough. It will be very possible to produce hardware that will do it, if not by modifying a headset, then by modifying its factory.
    • A point against: Video cameras will sometimes want to go beyond retinal pixel density for post-production digital zoom. They will want to capture much more than a human standing in their position can see, and I can see no reason consumer-grade screens should ever come to output more detail than a human can see.
  • Framerate targets will be met because if you dip below 100fps in VR, players puke. It's a hard requirement. There will never be a commercial VR headset that couldn't do it.
  • Realistic dynamic range might take longer than the other two, but there will be a demand for it... though perhaps we will never want a screen that can flash with the brightness of the sun. If cameras of the future can record that level of brightness, that may be some defence against this kind of attack, at least for outdoor scenes.
  • Color accuracy may remain difficult to replicate with screens. Cameras already accidentally record infrared light. Screens for humans will never need to produce infrared. I'm not sure how current cameras' color accuracy compares to the human eye... I suspect it's higher, but I'm not able to confirm that.
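A quick back-of-the-envelope check of the resolution point above, as a sketch. It assumes "4k" means UHD (3840x2160) and "720p" means 1280x720; both readings are my assumptions, not figures from a headset spec.

```python
# ASSUMPTION: "4k" is read as UHD 3840x2160 and "720p" as 1280x720.
UHD_4K = 3840 * 2160        # pixels in one full-resolution frame
HD_720P = 1280 * 720        # pixels in one 720p frame

# Foveated stream: one 720p frame for the periphery, one for the foveal patch.
foveated = 2 * HD_720P

ratio = UHD_4K / foveated
print(UHD_4K, foveated, ratio)   # 8294400 1843200 4.5
```

So a foveated input pipeline would carry roughly 4.5 times fewer pixels per frame than a raw 4k stream, which is why an off-the-shelf headset might not accept the attacker's footage directly even if its panel is dense enough.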

In conclusion, there are a few reasons (VR) SIAHA might not happen very often, but it's a worry.

In summary: A combination of technologies, laws, and fun social practices can probably mostly safeguard us against the problem of convincingly doctored video evidence. Some of the policing and economic challenges are a bit daunting, but not obviously insoluble.




Halifax Meetup -- Board Games

April 15, 2019 - 07:00
Published on April 15, 2019 4:00 AM UTC

If there are any readers of LessWrong or SSC in Halifax who would like to meet up, some of us are going to the board room cafe on Wednesday. Everyone is welcome!




The Cacophony Hypothesis: Simulation (If It is Possible At All) Cannot Call New Consciousnesses Into Existence

April 15, 2019 - 01:14
Published on April 14, 2019 9:20 PM UTC

Epistemic Status: The following seems plausible to me, but it's complex enough that I might have made some mistakes. Moreover, it goes against the beliefs of many people much smarter than myself. Thus caution is advised, and commentary is appreciated.

I.

In this post, I aim to make a philosophical argument that we (or anyone) cannot use simulation to create new consciousnesses (or, for that matter, to copy existing people's consciousnesses so as to give them simulated pleasure or pain). I here make a distinction between "something that acts like it is conscious," (e.g. what is commonly known as a 'p-zombie') and "something that experiences qualia." Only the latter is relevant to what I mean when I say something is 'conscious' throughout this post. In other words, consciousness here refers to the quality of 'having the lights on inside', and as a result it relates as well to whether or not an entity is a moral patient (i.e. can it feel pain? Can it feel pleasure? If so, it is important that we treat it right).

If my argument holds, then this would be a so-named 'crucial consideration' to those who are concerned about simulation. It would mean that no one can make the threat of hurting us in some simulation, nor can one promise to reward us in such a virtual space. However, we ourselves might still exist in some higher world's simulation (in a manner similar to what is described in SlateStarCodex's 'The View from the Ground Level'). Finally, since one consequence of my conclusion is that there is no moral downside to simulating beings that suffer, one might prefer to level a Pascal's Wager-like argument against me and say that under conditions of empirical and moral uncertainty, the moral consequences of accepting this argument (i.e. treating simulated minds as not capable of suffering) would be extreme, whereas granting simulated minds too much respect has fewer downsides.


Without further ado...


II.

Let us first distinguish two possible worlds. In the first, simulating consciousnesses [in any non-natural state] is simply impossible. That is to say, the only level on which consciousnesses may exist is the real, physical level that we see around us. No other realms may be said to 'exist'; all other spaces are mere information -- they are fiction, not real. Nature may have the power to create consciousnesses, but not us: No matter how hard we try, we are forever unable to instantiate artificial consciousnesses. If this is the world we live in, then the Cacophony Hypothesis is already counterfactually proven.


So let us say that we live in the second type of world: One where consciousnesses may exist not merely in what is directly physical, but may be instantiated also in the realm of information. Ones and zeroes by themselves are just numbers, but if you represent them with transistors and interpret them with the right rules, then you will find that they define code, programs, models, simulations -- until, finally, the level of detail (or complexity, or whatever is required) is so high that consciousnesses are being simulated.


In this world, what is the right substrate (or input) on which this simulation may take place? And what are the rules by which it may be calculated?

Some hold that the substrate is mechanical: ones and zeroes, embedded on copper, lead, silicon, and gold. But the Church-Turing thesis tells us that all sufficiently advanced computers are equally powerful. What may be simulated on ones and zeroes, may be simulated as well by combinations of colours, or gestures, or anything that has some manner of informational content. The effects -- that is, the computations that are performed -- would remain the same. The substrate may be paint, or people, or truly anything in the world, so long as it is interpreted in the right way. (See also Max Tegmark's explanation of this idea, which he calls Substrate-Independence.)

And whatever makes a simulation run -- the functions that take such inputs, and turn them into alternate simulated realities where consciousnesses may reside -- who says that the only way this could happen is by interpreting a string of bits in the exact way that a computer would interpret it? How small is the chance that out of all infinite possible functions, the only function that actually works is exactly that function that we've arbitrarily chosen to apply to computers, and which we commonly accept as having the potential for success?


III.

There are innumerably many interpretations of a changing string of ones and zeroes, of red and blue, of gasps and sighs. Computers have one consistent ruleset which tells them how to interpret bits; we may call this ruleset 'R'. However, surely we might have chosen many other rulesets. Simple ones, like "11 means 1 and 00 means 0, and interpret the result of this with R" are (by the Church-Turing thesis) equally powerful insofar as their ability to eventually create consciousnesses goes. Slightly more complex ones, such as "0 means 101 and 1 means 011, and interpret the result of this with R" may also be consistent, provided that we unpack the input in this manner. And we need not limit ourselves to rulesets that make use of R: Any consistent ruleset, no matter how complex, may apply. What about the rule, "1 simulates the entirety of Alice, who is now a real simulated person"? Is this a valid function? Is there any point at which increasing the complexity of an interpretation rule, given some input, makes it lose the power to simulate? Or may anything that a vast computer network can simulate be encoded into a single bit and unpacked from this, provided that we read it with the right interpretation function? Yes, of course that is the case: All complexity that may be contained in some input data 'X' may instead be off-loaded into a function which says "Given any bit of information, I return that data 'X'."
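The three kinds of ruleset described above can be sketched in a few lines of Python. This is only an illustration: the function `R` below is a stand-in that reads 8-bit groups as ASCII text (an assumption chosen to keep the example small, not a claim about what real hardware does), and the names `doubled_then_R` and `constant_rule` are hypothetical.

```python
def R(bits: str) -> str:
    """Stand-in for the 'standard' ruleset R: read 8-bit groups as ASCII."""
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


def doubled_then_R(bits: str) -> str:
    """Ruleset '11 means 1 and 00 means 0, then interpret the result with R'."""
    pairs = [bits[i:i + 2] for i in range(0, len(bits), 2)]
    unpacked = "".join({"11": "1", "00": "0"}[pair] for pair in pairs)
    return R(unpacked)


def constant_rule(output: str):
    """Ruleset that ignores its input: all complexity lives in the function."""
    return lambda _bits: output


plain = "0100100001101001"                    # "Hi" as two 8-bit ASCII codes
doubled = "".join(bit * 2 for bit in plain)   # the same data, re-encoded

print(R(plain))                     # Hi
print(doubled_then_R(doubled))      # Hi
print(constant_rule("Hi")("1"))     # Hi
```

All three calls yield the same result from different inputs. In the last one the input is a single bit and the entire content has been off-loaded into the interpretation function, which is the move the argument relies on.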


We are thus led to an inexorable conclusion:

  1. Every possible combination of absolutely anything that exists, is valid input.
  2. Any set of functions is a valid set of functions -- and the mathematical information space of all possible sets of functions, is vast indeed.
  3. As such, an infinite number of simulations of all kinds are happening constantly, all around us. After all, if one function (R) can take one type of input (ones and zeroes, encoded on transistors) and return a simulation-reality, then who is to say that not for all inputs there exist infinitely many functions that can operate on it to this same effect?

Under this view, the world is a cacophony of simulations, of realities all existing in information space, invisible to our eyes until we access them through functional interpretation methods.

IV.

This leads us to the next question: What does it mean for someone to run a simulation, now?


In Borges' short story, "The Library of Babel," there exists a library containing every book that could ever be: It is a physical representation of the vast information space that is all combinations of letters, punctuation marks, and special characters. It is now nonsensical to say that a writer creates a book: The book has always existed, and the writer merely gives us a reference to some location within this library at which the book may be found.


In the same way, all simulations already exist. Simulations are after all just certain configurations of information, interpreted in certain informational ways -- and all information already exists, in the same realm that e.g. numbers (which are themselves information) inhabit. One does not create a simulation; one merely gives a reference to some simulation in information space. The idea of creating a new simulation is as nonsensical as the idea of creating a new book, or a new number; all these structures of information already exist; you cannot create them, only reference them.


But could not consciousnesses, like books, be copied? Here we run into the classical problem of whether there can exist multiple instances of a single informational object. If there may not be, and all copies of a consciousness are merely pointers to a single 'real' consciousness, in the same way that all copies of a book may be understood to be pointers to a single 'real' book, then this is not a problem. We then would end up with the conclusion that any kind of simulation is powerless: Whether you simulate some consciousness or not, it (and indeed everything!) is already being simulated.


So suppose instead that multiple real, valid copies of a consciousness may exist. That is to say: the difference between there being one copy of Bob, and there being ten copies of Bob, is that in the latter situation, there exists more pain and joy -- namely that which the simulated Bobs are feeling -- than there is in the former situation. Could we then not still conclude that running simulations creates consciousnesses, and thus the act of running a simulation is one that has moral weight?


To refute this, a thought experiment. Suppose that a malicious AI shows you that it is running a simulation of you, and threatens to hurt sim!you if you don't do X. What power does it now have over you? What differences are there between the situation where it hurts sim!you, and the one where it rewards sim!you?

The AI is using one stream of data and interpreting it in one way (probably with ruleset R); this combination of input and processing rules results in a simulation of 'you'. In particular, because it has access to both the input and the interpretation function, it can view the simulation and show it to you. But on that same input there acts, invisibly to us, another set of rules (specified here out of infinitely many sets of rules, all of which are simultaneously acting on this input), which results in a slightly different simulation of you. This second set of rules is different in such a way, that if the AI hurts sim!you (an act which, one should note, changes the input; ruleset R remains the same), then in the second simulation, based on this input, you are rewarded, and vice versa. Now there are two simulations ongoing, both real and inhabited by a simulated version of you, both running on a single set of transistors. The AI cannot change that in one of these two simulations, you are hurt, and in the other, you are not; it can only change which one it chooses to show you.


Indeed: For every function which simulates, on some input, a consciousness that is suffering, there is another function which, on this same input, simulates that same consciousness experiencing pleasure. Or, more generally and more formally stated: Whenever the AI decides to simulate X, then for any other possible consciousness or situation Y that is not X, there exists a function which takes the input of "The AI is simulating X", and which subsequently simulates Y. (Incidentally, the function which takes this same input, and which then returns a simulation of X, is exactly that function that we usually understand to be 'simulation', namely R. However, as noted, R is just one out of infinitely many functions.) 


V.

As such, in this second world, reality is currently running uncountable billions of copies of any simulation that one may come up with, and any attempt to add one simulation-copy to reality, results instead in a new reality-state in which every simulation-copy has been added. Do not fret, you are not culpable: after all, any attempt to do anything other than adding a simulation-copy, also results in this same new reality-state. This is because any possible input, when given to the set of all possible rules or functions, yields every possible result; thus it does not matter what input you give to reality, whether that is running simulation X, or running simulation Y, or even doing act Z, or not doing act Z.
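The claim that "any possible input, when given to the set of all possible functions, yields every possible result" can be checked directly in a finite toy case. The sketch below is an assumption-laden miniature: three made-up inputs and two made-up outcomes stand in for the infinite spaces the argument is about.

```python
from itertools import product

# ASSUMPTION: a tiny finite stand-in for "all possible inputs" and
# "all possible simulated outcomes"; the labels are invented for illustration.
inputs = ["AI hurts sim", "AI rewards sim", "AI does nothing"]
outcomes = ["sim is hurt", "sim is rewarded"]

# Every function from inputs to outcomes, each represented as a dict.
all_functions = [
    dict(zip(inputs, choice))
    for choice in product(outcomes, repeat=len(inputs))
]

# For each input, collect what the whole family of functions returns on it.
reachable = {x: {f[x] for f in all_functions} for x in inputs}

print(len(all_functions))   # 8
for x in inputs:
    print(x, "->", sorted(reachable[x]))
```

Whichever input is chosen, the family as a whole returns both outcomes. Changing the input changes which individual function returns which outcome, but not the set of outcomes produced overall.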


Informational space is infinite. Even if we limit our physical substrate to transistors set to ones or zeroes, we may still come up with limitless functions besides R, that together achieve this above result. In running computations, we don't change what is being simulated, we don't change what 'exists'. We merely open a window onto some piece of information. In mathematical space, everything already exists. We are not actors, but observers: We do not create numbers, or functions, or even applications of functions on numbers; we merely calculate, and view the results.


To summarize:

  1. If simulation is possible on some substrate with some rule, then it is possible on any substrate with any rule. Moreover, simulation space, like Borges' Library and number space, exists as much as it is ever going to exist; all possible simulations are already extant and running.
  2. Attempting to run 'extra' simulations on top of what reality is already simulating, is useless, because your act of simulating X is interpreted by reality as input on which it simulates X and everything else, and your act of not simulating X, is also interpreted by reality as input on which it simulates X and everything else.

It should be noted that simulations are still useful, in the same way that doing any kind of maths is useful: Amidst the infinite expanses of possible outputs, mathematical processes highlight those outputs which you are interested in. There are infinitely many numbers, but the right function with the right input can still give you concrete information. In the same way, if someone is simulating your mind, then even though they cannot cause any pain or reward that would not already 'exist' anyway, they can now nonetheless read your mind, and from this gain much information about you.


Thus simulation is still a very powerful tool.


But the idea that simulation can be used to conjure new consciousnesses into existence, seems to me to be based on a fundamental misunderstanding of what information is.


[A note of clarification: One might argue that my argument does not successfully make the jump from physically-defined inputs, such as a set of transistors representing ones and zeroes, to symbolically-defined meta-physical inputs, such as "whether or not X is being simulated." This would be a pertinent argument, since my line of reasoning depends crucially on this second type of input. To this hypothetical argument, I would counter that any such symbolic input has to exist fully in natural, physical reality in some manner: "X is being simulated" is a statement about the world which we might, given the tools (and knowing for each function what input to search for -- this is technically computable), physically check to be true or false, in the same way that one may physically check whether a certain set of transistors currently encodes some given string of bits. The second input is far more abstract, and more complex to check, than the first; but I do not think they exist on qualitatively different levels. Finally, one would not need infinite time to check the statement "X is being simulated"; just pick the function "Given the clap of one's hands, simulate X", and then clap your hands.]


VI.

Four final notes, to recap and conclude:

  1. My argument in plain English, without rigour or reason, is this: If having the right numbers in the right places is enough to make new people exist (proposition A), then anything is enough to make anything exist (B). It follows that if we accept A, which many thinkers do, then everything -- every possible situation -- currently exists. It is moreover of no consequence to try and add a new situation to this 'set of all possible situations, infinite times', because your new situation is already in there an infinite number of times, and furthermore, abstaining from adding this new situation counts as 'anything' and thus, by B, would also add the new situation to this set.
  2. You cannot create a book or a number; you're merely providing a reference to some already extant book in Babel's Library, or to some extant number in number space. In the same way, running a simulation, the vital part of which (by the Church-Turing thesis) has to be entirely based on non-physical information, should no longer be seen as the act of creating some new reality; it merely opens a window into a reality that was already there.
  3. The idea that every possible situation, including terrible, hurtful ones, is real, may be very stressful. To people who are bothered by this, I offer the view that perhaps we do live in the ground level, and simulating artificial, non-natural consciousnesses, may be impossible: Our own world may well be all that there is. The Cacophony Hypothesis is not suitable to establish that the idea of "reality is a cacophony of simulations" is necessarily true; rather, it was written to argue that if and only if we accept that some kind of simulation is possible, then it would be strange to also deny that every other kind of simulation is possible.
  4. A secondary aim is to re-center the discussion around simulation: To go from a default idea of "Computation is the only method through which simulation may take place," to the new idea, which is "Simulations may take place everywhere, in every way." The first view seems too neat, too well-suited to an accidental reality, strangely and unreasonably specific; we are en route to discovering one type of simulation ourselves, and thus it was declared that this was the only type, the only way. The second view -- though my bias should be noted! -- strikes me as being general and consistent; it is not formed specifically around the 'normal', computer-influenced ideas of what forms computation takes, but rather allows for all possible forms of computation to have a role in this discussion. I may well be wrong, but it seems to me that the burden of proof should not be on those who say that "X may simulate Y"; it should be on those who say "X may only be simulated by Z." The default understanding should be that inputs and functions are valid until somehow proven invalid, rather than the other way around. (Truthfully, to gain a proof either way is probably impossible, unless we were to somehow find a method to measure consciousness -- and this would have to be a method that recognizes p-zombies for what they are.)

Thanks go to Matthijs Maas for helping me flesh out this idea through engaging conversations and thorough feedback.




A Numerical Model of View Clusters: Results

April 14, 2019 - 07:21
https://i.gyazo.com/b53741914c81448b5086b60e17924e6b.png

Quantitative Philosophy: Why Simulate Ideas Numerically?

April 14, 2019 - 06:53
Published on April 14, 2019 3:53 AM UTC

Adapted from my blog. I argue that numerical simulations are an effective yet underused tool in philosophy (and rationality) and give a concrete example of a numerical toy model, from assumptions to design to implementations to results, including the source code and a live site to play with.

It used to be that logic was king. A convincing argument was all that was necessary to get one’s ideas taken seriously and often accepted. After all, if it makes sense, it must be right, right? Right? The thought of actually checking ideas experimentally is not very old, and, while firmly entrenched in the scientific method, sometimes earnestly, sometimes as lip service, it is still not commonly accepted practice in many “softer” sciences.

This forum is a site devoted to rationality, and has plenty of interesting insights, but very few of them have been actually tested. There is a good reason for it: thinking ideas up is much easier than checking them! Eliezer Yudkowsky, the original contributor, set this tone of using reasoning as an argument, without ever closing the feedback loop of checking the conclusions with numbers. There were a few exceptions to this pattern, such as the Iterated Prisoner’s Dilemma bot tournaments, but in general the focus is on reasoning and sometimes mathematical proofs, where possible and warranted, not on making testable predictions and actually testing them.

This leaves out one of the most powerful tools for checking the validity of an idea: numerical simulation. If the idea is any good, one ought to be able to formalize it to the degree where its conclusions can be tested by creating simulations and studying their behavior.

The situation is somewhat better at Slate Star Codex: Scott Alexander has lots of wonderful ideas, and he is more aware of the need to check them by something other than more logic. But his approach for testing them is generally literature search or polls/surveys. Those are useful, but they focus on the ideas as black boxes, rather than on their internal mechanics.

Some examples of interesting ideas ripe for numerical modeling:

In the following I focus on one idea, similar to the outgroup one listed above:

People feel more affinity to those whose views are close to their own, and are often repulsed by those whose views diverge a lot from theirs. What are the resulting dynamics of these human preferences?

An obvious conclusion is that people form cohesive groups that are hostile to other cohesive groups. A set of local (often suboptimal in some sense) equilibria forms that is hard to change.

How would one go about modeling these ideas numerically? My attempt at doing so is described below.

First, obviously we cannot model this segmentation and alienation process in full generality; there are too many factors to consider. The art of modeling is in picking those that are both important and easy to describe quantitatively. The results of such an approach look pretty natural in retrospect, but are anything but easy to converge on. So, below I will endeavor to describe not just the final model, but how I got there, without the benefit of hindsight.

  • How does interaction between people work? At a very basic level people exchange ideas, opinions, thoughts and, more often than not, praise and insults.
  • This used to happen mostly between those who are in physical proximity, but, the world being global and connected, at least (mis)informationally, the interaction is no longer limited by distance.
  • Even people who are fairly close in views to each other tend to have a bit of variation, and their views may shift randomly a little here and there.

To describe what happens to people’s views as a result of an interaction between them, we need some simplified description that can be modeled numerically. People tend to fall prey to confirmation bias a lot, which gives us two basic characteristics of the interaction:

  • Interaction between two people who are fairly close in their views on a given topic provides the confirmation they crave, and results in further convergence of the views. This phenomenon can be described as a force of attraction.
  • Interaction between people with wildly divergent views also provides the confirmation of their own opinions, only in contrast to the other person’s horribly misguided and wrong one. This can be modeled as a, well, repulsive force, pushing people further apart.

What happens when people’s views are very far from or very close to each other?

  • When the views have almost converged, there is very little change as a result of the interaction, so the force of attraction, counter-intuitively, gets weaker at “shorter distances”.
  • When the views are so far apart, we can no longer relate to them, or maybe even take them seriously, the result of the interaction is very little change in our own views, so the force of repulsion is weaker, as well.
  • Potentially, since we are all humans, there are limits to how far the views can diverge, because there are generally some basic shared ideas that most people subscribe to, like, say, the desire for humanity to survive and prosper. So, at the very large divergences the repulsion again turns into attraction, if very weak.
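The qualitative force law in the bullets above can be written down in many ways. The sketch below is one hypothetical choice, not necessarily the model used later in this series: the functional form, the crossover distances `d1` and `d2`, and the names `force` and `step` are all assumptions made for illustration. It is attractive at short range (vanishing as views converge), repulsive at intermediate range, and weakly attractive again at very large divergence.

```python
import math
import random


def force(d, d1=1.0, d2=4.0):
    """Signed pull between two views a distance d >= 0 apart.

    Positive means attraction, negative means repulsion.
    ASSUMPTION: this polynomial-times-exponential form is just one curve
    with the sign pattern described in the text; d1 and d2 are made up.
    """
    return d * (d1 - d) * (d2 - d) * math.exp(-d)


def step(views, dt=0.1, noise=0.0, rng=None):
    """Advance every view by one time step in a single dimension."""
    rng = rng or random.Random(0)
    n = len(views)
    new_views = []
    for i, xi in enumerate(views):
        pull = 0.0
        for j, xj in enumerate(views):
            if i == j:
                continue
            direction = 1.0 if xj > xi else -1.0   # which way the other view lies
            pull += force(abs(xj - xi)) * direction
        # Small random drift: "views may shift randomly a little".
        drift = rng.gauss(0.0, noise) if noise > 0 else 0.0
        new_views.append(xi + dt * pull / max(n - 1, 1) + drift)
    return new_views


print(step([0.0, 0.5]))   # nearby views drift together
print(step([0.0, 2.0]))   # moderately distant views push apart
```

Averaging the pull over the other participants keeps the step size independent of population size, which is a design choice of this sketch and easy to change.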

Humans have opinions on a wide variety of topics, and, while there is a correlation, even if they are in agreement on one issue, they do not necessarily agree on every issue. This complicates the situation, making it multi-dimensional, and adding complications to a potential numerical model of it. So, at this point, as a toy model, focusing on a single dimension should be a good enough start.

It is a well known phenomenon that people sometimes radically change their views, for example undergoing religious conversion or de-conversion, and changing group allegiances as a result. It is not immediately clear how to model this numerically, and, unless this process magically emerges from the results of the simulations, is best left for future investigations.

Another point to keep in mind is the initial distribution of views before the interaction. If everyone starts out agreeing with everyone else, it is not likely that anything would change, given the above assumptions. So, some spread of the views is necessary to get non-trivial dynamics. The nature of this spread is probably something to play with once the model is implemented. Potential options to consider:

  • Symmetric vs. asymmetric distributions, corresponding to mainstream vs niche views.
  • Uniform vs. “dumbbell” distribution, where people are already primed even before the interaction.

Assuming the above program is implemented, what do we expect to learn from running the simulations?

  • First, validate the model itself! Always an essential step. If the underlying code or math is wrong, the model is worse than useless: it can lead one to completely unwarranted conclusions. (Totally not speaking from experience! Not at all!)
  • Second, figure out the range of parameters where we get the expected results. Every model has adjustable numerical parameters, and finding a region of the parameter space to serve as the home base to start from is definitely worthwhile.
  • It might happen that there is no such home base, and this would be even more exciting than getting the expected results! It would either mean that the whole idea is bogus, or that our model of it is inadequate. In either case, it is back to the drawing board!
  • Once we have a well-behaving numerical model, it is play time! Time to reap the rewards of all the hard work by varying the parameter space and watching what happens. Hopefully something new and unexpected would show up!
  • This is where the real payoff of numerical modeling is: finding something new and having those “Aha!” moments, where the unexpected results make us learn something new and gain insights that were missed in simply “applying logic”.

So, having put some thought into what to model, what simplifications to make and what the goals are, it’s time to get down and dirty.

I have already decided that:

  • The initial version would be one-dimensional.
  • The interaction is attractive at short distances, repulsive at intermediate distances, and again weakly attractive at very large distances.

These two assumptions let us pick the shape of the interaction. Given my background in physics, I naturally think of forces in terms of potentials first. The magnitude of the force corresponds to the slope of the potential. Attractive force corresponds to the positive slope, and repulsive force corresponds to the negative slope. The complete potential can be combined from those with the basic tools of addition and multiplication. To simulate the attractive force, we can use everyone’s favorite harmonic oscillator: a stretched spring tends to contract. Google helpfully constructs the graph for it:

This has the basic shape we want:

  • The attraction (convergence of views) is low when the views are already similar.
  • The attraction increases when the separation is a bit larger.

If you are not familiar with the potential curves, think of it as a hilly slope, and you and your friend being on the opposite sides of it. In the situation above you would naturally roll down toward each other.

Now, to get the repulsive side of the interaction, we can use another well-worn tool, the bell curve:

So we have two out of three features of the interaction shape: the attraction when close together and repulsion when further apart. Let’s compose them together; the way to do it is simple multiplication:

We are almost there! Just need to add the last piece, the weak attraction at large separations. Again, the harmonic oscillator potential to the rescue! Only we need to add it to the bell curve first:

Uh oh… That didn’t work out as expected! The spring is too stiff and completely overwhelms the little bell. Let’s loosen the spring up a bit, by dividing its potential by a big number. Say, make it 5 times looser:

That looks better, but clearly the stiffness of the spring should become one of those adjustable parameters once the model is done. Now to put all three together:

Another oops… What happened? The bell curve, again, was too weak to show prominently. Need to suppress the weaker spring a bit more, maybe 10 times more:

This is more like it! Let’s look at it more closely:

  • People whose opinions are separated by approximately less than one “unit of disagreement” would tend to “slide” toward each other and maybe slosh a bit around zero-disagreement.
  • People whose opinions are separated by somewhat more than that will find each other’s views repulsive enough to feel like the other person is a part of an “outgroup”, and instinctively distance themselves from them.
  • Eventually the “shared human values” start to matter, and there is a certain separation distance where the two parties, while repulsed by each other, just shrug it off without any further need to distance themselves from the other. How faithful this model is is debatable, of course, but for a first approximation it does not look too out of place.

Clearly playing with the shape of the interaction potential is not done, but it is good enough for now, and it is time to go to the next step. Well, not quite. Let’s have one more chart, converting the shape of the potential into the shape of the force. Google calculator is not up to snuff there quite yet, but Wolfram Alpha is, though the free version has low resolution and no customization:

I don’t find the force graphs as illuminating as the potential graphs, but one can still make sense of them: negative values correspond to the force pulling to the left, and positive values correspond to the force pulling to the right. So, again, when the views are close, they converge; when they are farther apart, they tend to separate to a respectable distance, but not infinitely far. The force, and not the potential, is what we need to calculate how people’s opinions move after interacting, anyway.
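For readers who prefer code to charts, here is a minimal Python sketch of the potential and the force. The exact formula is an assumption: it is one reading of the construction above, with the spring x² multiplied by the sum of the bell curve exp(-x²) and a spring weakened by a factor of 50 (5 times 10). The names `potential`, `force`, and `weak_spring` are hypothetical, and the charts may have used different constants.

```python
import math

def potential(x, weak_spring=1 / 50):
    """Assumed interaction potential at separation x (one reading of the text)."""
    # Spring x^2, multiplied by (bell curve + heavily weakened spring).
    return x**2 * (math.exp(-x**2) + weak_spring * x**2)

def force(x, weak_spring=1 / 50):
    """Force acting on the separation, F = -dV/dx, differentiated by hand."""
    return -(2 * x * math.exp(-x**2) * (1 - x**2) + 4 * weak_spring * x**3)

# Negative force pulls the separation toward zero (attraction);
# positive force pushes it outward (repulsion).
for x in (0.5, 1.5, 3.0):
    kind = "attraction" if force(x) < 0 else "repulsion"
    print(f"separation {x}: force {force(x):+.3f} ({kind})")
```

With these assumed constants the force is attractive at a separation of 0.5, repulsive at 1.5, and attractive again at 3.0, so the repulsive band is fairly narrow. Making `weak_spring` smaller widens it, which is why the stiffness is exposed as a parameter.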

That was the easy part: picking the shape. Next we have to figure out how to actually implement the dynamics of interactions and changing personal views.

I will not discuss it here; an interested party can read about it in my blog post:

https://quantitativephilosophy.wordpress.com/2019/04/09/separation-and-clustering-of-views-making-a-step/

The code itself can be found in https://quantitativephilosophy.wordpress.com/2019/04/12/separation-and-clustering-of-views-the-app/

This site doesn't seem to allow embedded sites or embedded HTML+JavaScript, so I cannot insert the actual app here. Instead, you can go to

https://sites.google.com/view/numericalsimulationofclusterin/home

to play with the model. I will talk about it more in a companion post to follow. Here is a video of one simulation run, to give an idea of it: https://i.imgur.com/iDaH9nj.mp4 (I wish I could embed it here, though).




A Case for Taking Over the World--Or Not.

April 14, 2019 - 04:34
Published on April 14, 2019 1:34 AM UTC

As my education progresses, I'm seeing more and more parallels, through some fictional but generally nonfictional accounts, that suggest that the world is broken in a way that causes suffering to be an emergent property, if not intrinsic. Not just in HPMoR (and Significant Digits after it), but in FDR's State of the Union Speech, The Grapes of Wrath, Jared Diamond's Guns, Germs, and Steel, among other works. People have been aware for most of human history that, because of humanity's rules, people must suffer.

Knowing that this whole site is more or less dedicated to defeating unconscious ways of thinking and holds the mental enlightening of the human race paramount, I would like to pose this question:

What would we have to do to save the world?

Before breaking this question and this intent down, I'd like to clarify some things:

I am solely concerned with the practicalities, not with what people would or should do. Anybody who's seen enough of the world and how it works has an idea of the immensity of it, but humans made the current state of events what it is today (barring other undiscovered factors not covered by my priors), with the majority of them being largely redundant to the process in one way or another. People have demonstrated repeatedly that a group of people can have an impact disproportionate to their individual means.

What would have to occur, in the current political, economic, social etc. climate and onwards?

Would it have to be conspiracy? Or something else?

You can ask any other bounding questions that you would like, such as "What is the minimum amount of manpower and resources required to accomplish X through the most expedient and readily available means?" At the end of the day (so to speak) we should be able to shortly arrive at some sort of operational plan. There's no sense in taking this knowledge and not using it to further our cause.




The Meaning(s) of Life

April 14, 2019 - 01:49
Published on April 13, 2019 9:02 PM UTC

Why are you reading this article?

Perhaps you're reading it because you think the topic of the meaning of life is interesting and important to rationality. Why do you care about rationality? Maybe because it can help you determine the odds of various events in your life and, among other things, help you make money. Why do you want money? Because you need it to buy food. Why do you want food? So you can survive. Why do you want to survive? Because you want to help people in the world. Why do other people matter?

We could go on and on. Everything you do has a reason. Everything is leading up to one goal. The Meaning of Life. But what is that goal? What is the one thing that matters most in the world, that our entire life should be spent trying to accomplish?

Let's look at some options. Maybe the Meaning of Life is to be as happy as possible. Where's the proof of that? (Yes, I know this sounds like bottom-line irrationality, but if something can't even be rationalized, you know it's wrong.) What is the evidence that happiness is important? Well, happiness is important because it causes-

Stop. This is the ultimate cause, the Meaning of Life. It can't be important because it causes something, because then that would be the Meaning of Life. So why else could happiness be important, if it doesn't matter what it causes? It doesn't matter what happiness produces, or leads to, or anything that comes from it. And it can't be proven by the things that lead to it, by saying that happiness is important because chocolate causes happiness. Who says chocolate is good? And some of the things it produces are bad, like obesity.

This all leads to one conclusion. It is impossible to prove the Meaning of Life.

But why do we do anything? If there is no Meaning of Life, why don't you just lie down and die? There is a Meaning of Life. There is a reason things should be done, something that everything leads up to. The ultimate purpose of everything, the thing that we all live for, the final cause, has to exist, or else we wouldn't do anything. There's something that everything is leading up to.

But if it can't be proven, what is it?

The answer is that the Meaning of Life changes from person to person. If someone thinks that their Meaning of Life is to build a tower out of LEGOs that is as tall as possible by the laws of physics, that is not a ridiculous statement. The Meaning of Life cannot be proven, so we can each choose what we want. We each have our own thing that seems right to us, and it is the definition of right.

If a Nazi says that the murder of Jews is good because it leads to increased celery production, and he thinks celery production is the Meaning of Life, I will argue because killing Jews does not increase celery production. But if he calls the murder of Jews the Meaning of Life, then I will not argue because it cannot be disproven. I will certainly try to stop him, because it conflicts with my Meaning of Life. But I will not say he is incorrect, for in this case there is no ultimate "correct."

I never consider anyone to be good or evil. There are only "in agreement with me about the Meaning of Life" and "in disagreement with me about the Meaning of Life." The only time that I will ever condemn someone, or say that someone should change what they do, is when they are stupid, when they kill Jews to increase celery production. There is no inherent good or evil that is the same for everyone. We each can hold whatever Meaning of Life we want.

Now, this doesn't mean to let the Nazi kill the Jews. If your Meaning of Life is against Nazism, then by all means, stop them. But never say that anyone is more or less correct about the Meaning of Life, because they are not. The only thing that can be correct or incorrect is how to get to the Meaning of Life.

My Meaning of Life is to preserve life, to stop death whenever possible. That may not be yours. And I'm not going to say you're wrong, or that you're evil. All I'll say is that you're in disagreement.




Where to Draw the Boundaries?

April 14, 2019 - 00:34
Published on April 13, 2019 9:34 PM UTC

Followup to: Where to Draw the Boundary?

Figuring where to cut reality in order to carve along the joints—figuring which things are similar to each other, which things are clustered together: this is the problem worthy of a rationalist. It is what people should be trying to do, when they set out in search of the floating essence of a word.

Once upon a time it was thought that the word "fish" included dolphins ...

The one comes to you and says:

The list: {salmon, guppies, sharks, dolphins, trout} is just a list—you can't say that a list is wrong. You draw category boundaries in specific ways to capture tradeoffs you care about: sailors in the ancient world wanted a word to describe the swimming finned creatures that they saw in the sea, which included salmon, guppies, sharks—and dolphins. That grouping may not be the one favored by modern evolutionary biologists, but an alternative categorization system is not an error, and borders are not objectively true or false. You're not standing in defense of truth if you insist on a word, brought explicitly into question, being used with some particular meaning. So my definition of fish cannot possibly be 'wrong,' as you claim. I can define a word any way I want—in accordance with my values!

So, there is a legitimate complaint here. It's true that sailors in the ancient world had a legitimate reason to want a word in their language whose extension was {salmon, guppies, sharks, dolphins, ...}. (And modern scholars writing a translation for present-day English speakers might even translate that word as fish, because most members of that category are what we would call fish.) It indeed would not necessarily be helping the sailors to tell them that they need to exclude dolphins from the extension of that word, and instead include dolphins in the extension of their word for {monkeys, squirrels, horses ...}. Likewise, most modern biologists have little use for a word that groups dolphins and guppies together.

When rationalists say that definitions can be wrong, we don't mean that there's a unique category boundary that is the True floating essence of a word, and that all other possible boundaries are wrong. We mean that in order for a proposed category boundary to not be wrong, it needs to capture some statistical structure in reality, even if reality is surprisingly detailed and there can be more than one such structure.

The reason that the sailor's concept of water-dwelling animals isn't necessarily wrong (at least within a particular domain of application) is because dolphins and fish actually do have things in common due to convergent evolution, despite their differing ancestries. If we've been told that "dolphins" are water-dwellers, we can correctly predict that they're likely to have fins and a hydrodynamic shape, even if we've never seen a dolphin ourselves. On the other hand, if we predict that dolphins probably lay eggs because 97% of known fish species are oviparous, we'd get the wrong answer.

A standard technique for understanding why some objects belong in the same "category" is to (pretend that we can) visualize objects as existing in a very-high-dimensional configuration space, but this "Thingspace" isn't particularly well-defined: we want to map every property of an object to a dimension in our abstract space, but it's not clear how one would enumerate all possible "properties." But this isn't a major concern: we can form a space with whatever properties or variables we happen to be interested in. Different choices of properties correspond to different cross sections of the grander Thingspace. Excluding properties from a collection would result in a "thinner", lower-dimensional subspace of the space defined by the original collection of properties, which would in turn be a subspace of grander Thingspace, just as a line is a subspace of a plane, and a plane is a subspace of three-dimensional space.

Concerning dolphins: there would be a cluster of water-dwelling animals in the subspace of dimensions that water-dwelling animals are similar on, and a cluster of mammals in the subspace of dimensions that mammals are similar on, and dolphins would belong to both of them, just as the vector [1.1, 2.1, 9.1, 10.2] in the four-dimensional vector space ℝ⁴ is simultaneously close to [1, 2, 2, 1] in the subspace spanned by x₁ and x₂, and close to [8, 9, 9, 10] in the subspace spanned by x₃ and x₄.
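The arithmetic behind that comparison can be checked with a short Python sketch. The helper `subspace_distance` and the variable names `dolphin`, `water_dweller`, and `mammal` are hypothetical labels attached to the three vectors for illustration; Euclidean distance is an assumed choice of what "close" means.

```python
import math

def subspace_distance(u, v, dims):
    """Euclidean distance between u and v using only the coordinates in dims."""
    return math.sqrt(sum((u[i] - v[i]) ** 2 for i in dims))

dolphin = [1.1, 2.1, 9.1, 10.2]
water_dweller = [1, 2, 2, 1]   # cluster center compared on x1 and x2
mammal = [8, 9, 9, 10]         # cluster center compared on x3 and x4

# Indices are zero-based: (0, 1) is the x1-x2 subspace, (2, 3) is x3-x4.
print(subspace_distance(dolphin, water_dweller, dims=(0, 1)))
print(subspace_distance(dolphin, mammal, dims=(2, 3)))
print(subspace_distance(dolphin, water_dweller, dims=(0, 1, 2, 3)))
print(subspace_distance(dolphin, mammal, dims=(0, 1, 2, 3)))
```

The first two distances are small, while both full-space distances are large: the vector is near each center only within the subspace that the corresponding cluster is defined on.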

Humans are already functioning intelligences (well, sort of), so the categories that humans propose of their own accord won't be maximally wrong: no one would try to propose a word for "configurations of matter that match any of these 29,122 five-megabyte descriptions but have no other particular properties in common." (Indeed, because we are not-superexponentially-vast minds that evolved to function in a simple, ordered universe, it actually takes some ingenuity to construct a category that wrong.)

This leaves aspiring instructors of rationality in something of a predicament: in order to teach people how categories can be more or (ahem) less wrong, you need some sort of illustrative example, but since the most natural illustrative examples won't be maximally wrong, some people might fail to appreciate the lesson, leaving one of your students to fill in the gap in your lecture series eleven years later.

The pedagogical function of telling people to "stop playing nitwit games and admit that dolphins don't belong on the fish list" is to point out that, without denying the obvious similarities that motivated the initial categorization {salmon, guppies, sharks, dolphins, trout, ...}, there is more structure in the world: to maximize the (logarithm of the) probability your world-model assigns to your observations of dolphins, you need to take into consideration the many aspects of reality in which the grouping {monkeys, squirrels, dolphins, horses ...} makes more sense. To the extent that relying on the initial category guess would result in a worse Bayes-score, we might say that that category is "wrong." It might have been "good enough" for the purposes of the sailors of yore, but as humanity has learned more, as our model of Thingspace has expanded with more dimensions and more details, we can see the ways in which the original map failed to carve reality at the joints.

The one replies:

But reality doesn't come with its joints pre-labeled. Questions about how to draw category boundaries are best understood as questions about values or priorities rather than about the actual content of the actual world. I can call dolphins "fish" and go on to make just as accurate predictions about dolphins as you can. Everything we identify as a joint is only a joint because we care about it.

No. Everything we identify as a joint is a joint not "because we care about it", but because it helps us think about the things we care about.

Which dimensions of Thingspace you bother paying attention to might depend on your values, and the clusters returned by your brain's similarity-detection algorithms might "split" or "collapse" according to which subspace you're looking at. But in order for your map to be useful in the service of your values, it needs to reflect the statistical structure of things in the territory—which depends on the territory, not your values.

There is an important difference between "not including mountains on a map because it's a political map that doesn't show any mountains" and "not including Mt. Everest on a geographic map, because my sister died trying to climb Everest and seeing it on the map would make me feel sad."

There is an important difference between "identifying this pill as not being 'poison' allows me to focus my uncertainty about what I'll observe after administering the pill to a human (even if most possible minds have never seen a 'human' and would never waste cycles imagining administering the pill to one)" and "identifying this pill as not being 'poison', because if I publicly called it 'poison', then the manufacturer of the pill might sue me."

There is an important difference between having a utility function defined over a statistical model's performance against specific real-world data (even if another mind with different values would be interested in different data), and having a utility function defined over features of the model itself.

Remember how appealing to the dictionary is irrational when the actual motivation for an argument is about whether to infer a property on the basis of category-membership? But at least the dictionary has the virtue of documenting typical usage of our shared communication signals: you can at least see how "You're defecting from common usage" might feel like a sensible thing to say, even if one's true rejection lies elsewhere. In contrast, this motion of appealing to personal values (!?!) is so deranged that Yudkowsky apparently didn't even realize in 2008 that he might need to warn us against it!

You can't change the categories your mind actually uses and still perform as well on prediction tasks—although you can change your verbally reported categories, much as how one can verbally report "believing" in an invisible, inaudible, flour-permeable dragon in one's garage without having any false anticipations-of-experience about the garage.

This may be easier to see with a simple numerical example.

Suppose we have some entities that exist in the three-dimensional vector space ℝ³. There's one cluster of entities centered at [1, 2, 3], and we call those entities Foos, and there's another cluster of entities centered at [2, 4, 6], which we call Quuxes.

The one comes and says, "Well, I'm going to redefine the meaning of 'Foo' such that it also includes the things near [2, 4, 6] as well as the Foos-with-respect-to-the-old-definition, and you can't say my new definition is wrong, because if I observe [2, _, _] (where the underscores represent yet-unobserved variables), I'm going to categorize that entity as a Foo but still predict that the unobserved variables are 4 and 6, so there."

But if the one were actually using the new concept of Foo internally and not just saying the words "categorize it as a Foo", they wouldn't predict 4 and 6! They'd predict 3 and 4.5, because those are the average values of a generic Foo-with-respect-to-the-new-definition in the 2nd and 3rd coordinates (because (2+4)/2 = 6/2 = 3 and (3+6)/2 = 9/2 = 4.5). (The already-observed 2 in the first coordinate isn't average, but by conditional independence, that only affects our prediction of the other two variables by means of its effect on our "prediction" of category-membership.) The cluster-structure knowledge that "entities for which x₁≈2, also tend to have x₂≈4 and x₃≈6" needs to be represented somewhere in the one's mind in order to get the right answer. And given that that knowledge needs to be represented, it might also be useful to have a word for "the things near [2, 4, 6]" in order to efficiently share that knowledge with others.
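The two predictions can be compared in a few lines of Python. This is a minimal sketch that assumes the two clusters are equally populated (which is what the simple averages above imply) and that ignores within-cluster spread; the function names are hypothetical.

```python
FOO_CENTER = (1.0, 2.0, 3.0)
QUUX_CENTER = (2.0, 4.0, 6.0)

def predict_with_separate_clusters(x1):
    """Find the cluster whose first coordinate is nearest to the observed x1,
    then predict that cluster's second and third coordinates."""
    center = min((FOO_CENTER, QUUX_CENTER), key=lambda c: abs(c[0] - x1))
    return center[1], center[2]

def predict_with_merged_category():
    """Knowing only 'it is a Foo' under the new, merged definition,
    predict the average over both (assumed equally sized) clusters."""
    x2 = (FOO_CENTER[1] + QUUX_CENTER[1]) / 2
    x3 = (FOO_CENTER[2] + QUUX_CENTER[2]) / 2
    return x2, x3

print(predict_with_separate_clusters(2.0))  # (4.0, 6.0)
print(predict_with_merged_category())       # (3.0, 4.5)
```

Getting (4.0, 6.0) requires the first function, which still keeps the two cluster centers apart internally. That is the sense in which the old cluster-structure has to be represented somewhere, whatever word is attached to it.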

Of course, there isn't going to be a unique way to encode the knowledge into natural language: there's no reason the word/symbol "Foo" needs to represent "the stuff near [1, 2, 3]" rather than "both the stuff near [1, 2, 3] and also the stuff near [2, 4, 6]". And you might very well indeed want a short word like "Foo" that encompasses both clusters, for example, if you want to contrast them to another cluster much farther away, or if you're mostly interested in x₁ and the difference between x₁≈1 and x₁≈2 doesn't seem large enough to notice.

But if speakers of a particular language were already using "Foo" to specifically talk about the stuff near [1, 2, 3], then you can't swap in a new definition of "Foo" without changing the truth values of sentences involving the word "Foo." Or rather: sentences involving Foo-with-respect-to-the-old-definition are different propositions from sentences involving Foo-with-respect-to-the-new-definition, even if they get written down using the same symbols in the same order.

Naturally, all this becomes much more complicated as we move away from the simplest idealized examples.

For example, if the points are more evenly distributed in configuration space rather than belonging to cleanly-distinguishable clusters, then essentialist "X is a Y" cognitive algorithms perform less well, and we get Sorites paradox-like situations, where we know roughly what we mean by a word, but are confronted with real-world (not merely hypothetical) edge cases that we're not sure how to classify.

Or it might not be obvious which dimensions of Thingspace are most relevant.

Or there might be social or psychological forces anchoring word usages on identifiable Schelling points that are easy for different people to agree upon, even at the cost of some statistical "fit."

We could go on listing more such complications, where we seem to be faced with somewhat arbitrary choices about how to describe the world in language. But the fundamental thing is this: the map is not the territory. Arbitrariness in the map (what color should Texas be?) doesn't correspond to arbitrariness in the territory. Where the structure of human natural language doesn't fit the structure in reality—where we're not sure whether to say that a sufficiently small collection of sand "is a heap", because we don't know how to specify the positions of the individual grains of sand, or compute that the collection has a Standard Heap-ness Coefficient of 0.64—that's just a bug in our human power of vibratory telepathy. You can exploit the bug to confuse humans, but that doesn't change reality.

Sometimes we might wish that something belonged to a category that it doesn't (with respect to the category boundaries that we would ordinarily use), so it's tempting to avert our attention from this painful reality with appeal-to-arbitrariness language-lawyering, selectively applying our philosophy-of-language skills to pretend that we can define a word any way we want with no consequences. ("I'm not late!—well, okay, we agree that I arrived half an hour after the scheduled start time, but whether I was late depends on how you choose to draw the category boundaries of 'late', which is subjective.")

For this reason it is said that knowing about philosophy of language can hurt people. Those who know that words don't have intrinsic definitions, but don't know (or have seemingly forgotten) about the three or six dozen optimality criteria governing the use of words, can easily fashion themselves a Fully General Counterargument against any claim of the form "X is a Y"—

Y doesn't unambiguously refer to the thing you're trying to point at. There's no Platonic essence of Y-ness: once we know any particular fact about X we want to know, there's no question left to ask. Clearly, you don't understand how words work, therefore I don't need to consider whether there are any non-ontologically-confused reasons for someone to say "X is a Y."

Isolated demands for rigor are great for winning arguments against humans who aren't as philosophically sophisticated as you, but the evolved systems of perception and language by which humans process and communicate information about reality predate the Sequences. Every claim that X is a Y is an expression of cognitive work that cannot simply be dismissed just because most claimants don't know how they work. Platonic essences are just the limiting case as the overlap between clusters in Thingspace goes to zero.

You should never say, "The choice of word is arbitrary; therefore I can say whatever I want"—which amounts to, "The choice of category is arbitrary, therefore I can believe whatever I want." If the choice were really arbitrary, you would be satisfied with the choice being made arbitrarily: by flipping a coin, or calling a random number generator. (It doesn't matter which.) Whatever criterion your brain is using to decide which word or belief you want, is your non-arbitrary reason.

If what you want isn't currently true in reality, maybe there's some action you could take to make it become true. To search for that action, you're going to need accurate beliefs about what reality is currently like. To enlist the help of others in your planning, you're going to need precise terminology to communicate accurate beliefs about what reality is currently like. Even when—especially when—the current reality is inconvenient.

Even when it hurts.

(Oh, and if you're actually trying to optimize other people's models of the world, rather than the world itself—you could just lie, rather than playing clever category-gerrymandering mind games. It would be a lot simpler!)

Imagine that you've had a peculiar job in a peculiar factory for a long time. After many mind-numbing years of sorting bleggs and rubes all day and enduring being trolled by Susan the Senior Sorter and her evil sense of humor, you finally work up the courage to ask Bob the Big Boss for a promotion.

"Sure," Bob says. "Starting tomorrow, you're our new Vice President of Sorting!"

"Wow, this is amazing," you say. "I don't know what to ask first! What will my new responsibilities be?"

"Oh, your responsibilities will be the same: sort bleggs and rubes every Monday through Friday from 9 a.m. to 5 p.m."

You frown. "Okay. But Vice Presidents get paid a lot, right? What will my salary be?"

"Still $9.50 hourly wages, just like now."

You grimace. "O–kay. But Vice Presidents get more authority, right? Will I be someone's boss?"

"No, you'll still report to Susan, just like now."

You snort. "A Vice President, reporting to a mere Senior Sorter?"

"Oh, no," says Bob. "Susan is also getting promoted—to Senior Vice President of Sorting!"

You lose it. "Bob, this is bullshit. When you said I was getting promoted to Vice President, that created a bunch of probabilistic expectations in my mind: you made me anticipate getting new challenges, more money, and more authority, and then you reveal that you're just slapping an inflated title on the same old dead-end job. It's like handing me a blegg, and then saying that it's a rube that just happens to be blue, furry, and egg-shaped ... or telling me you have a dragon in your garage, except that it's an invisible, silent dragon that doesn't breathe. You may think you're being kind to me asking me to believe in an unfalsifiable promotion, but when you replace the symbol with the substance, it's actually just cruel. Stop fucking with my head! ... sir."

Bob looks offended. "This promotion isn't unfalsifiable," he says. "It says, 'Vice President of Sorting' right here on the employee roster. That's a sensory experience that you can make falsifiable predictions about. I'll even get you business cards that say, 'Vice President of Sorting.' That's another falsifiable prediction. Using language in a way you dislike is not lying. The propositions you claim false—about new job tasks, increased pay and authority—are not what the title is meant to convey, and this is known to everyone involved; it is not a secret."

Bob kind of has a point. It's tempting to argue that things like titles and names are part of the map, not the territory. Unless the name is written down. Or spoken aloud (instantiated in sound waves). Or thought about (instantiated in neurons). The map is part of the territory: insisting that the title isn't part of the "job" and therefore violates the maxim that meaningful beliefs must have testable consequences, doesn't quite work. Observing the title on the employee roster indeed tightly constrains your anticipated experience of the title on the business card. So, that's a non-gerrymandered, predictively useful category ... right? What is there for a rationalist to complain about?

To see the problem, we must turn to information theory.

Let's imagine that an abstract Job has four binary properties that can either be high or low—task complexity, pay, authority, and prestige of title—forming a four-dimensional Jobspace. Suppose that two-thirds of Jobs have {complexity: low, pay: low, authority: low, title: low} (which we'll write more briefly as [low, low, low, low]) and the remaining one-third have {complexity: high, pay: high, authority: high, title: high} (which we'll write as [high, high, high, high]).

Task complexity and authority are hard to perceive outside of the company, and pay is only negotiated after an offer is made, so people deciding to seek a Job can only make decisions based on the Job's title: but that's fine, because in the scenario described, you can infer any of the other properties from the title with certainty. Because the properties are either all low or all high, the joint entropy of title and any other property is going to have the same value as either of the individual property entropies, namely ⅔ log₂ 3/2 + ⅓ log₂ 3 ≈ 0.918 bits.

Since H(pay) = H(title) = H(pay, title), the mutual information I(pay; title) has that same value of about 0.918 bits, because I(pay; title) = H(pay) + H(title) − H(pay, title) by definition.

Then suppose a lot of companies get Bob's bright idea: half of the Jobs that used to occupy the point [low, low, low, low] in Jobspace, get their title coordinate changed to high. So now one-third of the Jobs are at [low, low, low, low], another third are at [low, low, low, high], and the remaining third are at [high, high, high, high]. What happens to the mutual information I(pay; title)?

I(pay; title) = H(pay) + H(title) − H(pay, title)
= (⅔ log₂ 3/2 + ⅓ log₂ 3) + (⅔ log₂ 3/2 + ⅓ log₂ 3) − 3(⅓ log₂ 3)
= 4/3 log₂ 3/2 + 2/3 log₂ 3 − log₂ 3 ≈ 0.2516 bits.

It went down! Bob and his analogues, having observed that employees and Job-seekers prefer Jobs with high-prestige titles, thought they were being benevolent by making more Jobs have the desired titles. And perhaps they have helped savvy employees who can arbitrage the gap between the new and old worlds by being able to put "Vice President" on their resumés when searching for a new Job.

But from the perspective of people who wanted to use titles as an easily-communicable correlate of the other features of a Job, all that's actually been accomplished is making language less useful.
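
The entropy arithmetic above can be checked numerically. What follows is only a minimal sketch, not part of the original argument: the helper names (`entropy`, `marginal`, `mutual_information`) and the dictionary encoding of the (pay, title) joint distribution are assumptions made for illustration, and only the two Jobspace distributions described in the text are used as inputs.

```python
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(joint, index):
    """Marginal distribution of one coordinate of a joint distribution."""
    out = {}
    for outcome, p in joint.items():
        key = outcome[index]
        out[key] = out.get(key, 0.0) + p
    return out


def mutual_information(joint):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for a joint over (x, y) pairs."""
    return entropy(marginal(joint, 0)) + entropy(marginal(joint, 1)) - entropy(joint)


# Joint distributions over (pay, title), taken from the scenario in the text.
# Before: titles track pay perfectly.
before = {("low", "low"): 2 / 3, ("high", "high"): 1 / 3}
# After: half of the low-pay Jobs are given a high-prestige title.
after = {
    ("low", "low"): 1 / 3,
    ("low", "high"): 1 / 3,
    ("high", "high"): 1 / 3,
}

mi_before = mutual_information(before)
mi_after = mutual_information(after)

print(f"I(pay; title) before: {mi_before:.4f} bits")
print(f"I(pay; title) after:  {mi_after:.4f} bits")
```

Under these assumptions the first figure should agree with the roughly 0.918 bits computed earlier and the second with the roughly 0.2516 bits above. The title still carries some information about pay after the change, but much less than before.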

In view of the preceding discussion, to "37 Ways That Words Can Be Wrong", we might wish to append, "38. Your definition draws a boundary around a cluster in an inappropriately 'thin' subspace of Thingspace that excludes relevant variables, resulting in fallacies of compression."

Miyamoto Musashi is quoted:

The primary thing when you take a sword in your hands is your intention to cut the enemy, whatever the means. Whenever you parry, hit, spring, strike or touch the enemy's cutting sword, you must cut the enemy in the same movement. It is essential to attain this. If you think only of hitting, springing, striking or touching the enemy, you will not be able actually to cut him.

Similarly, the primary thing when you take a word in your lips is your intention to reflect the territory, whatever the means. Whenever you categorize, label, name, define, or draw boundaries, you must cut through to the correct answer in the same movement. If you think only of categorizing, labeling, naming, defining, or drawing boundaries, you will not be able actually to reflect the territory.

Do not ask whether there's a rule of rationality saying that you shouldn't call dolphins fish. Ask whether dolphins are fish.

And if you speak overmuch of the Way you will not attain it.

(Thanks to Alicorn, Sarah Constantin, Ben Hoffman, Zvi Mowshowitz, Jessica Taylor, and Michael Vassar for feedback.)




Why is multi worlds not a good explanation for abiogenesis

April 13, 2019 - 01:21
Published on April 12, 2019 6:53 PM UTC

I'm not an expert in the multi world theory, so this question could very well be extremely stupid. However, given the assumption that there is a nearly infinite number of worlds that are slightly different from each other, nearly every possible event would happen somewhere. This includes the formation of life. Now what are the odds that we would be witnessing such a world? As far as I can tell, 100 percent.

Now I'm not clear on exactly how often quantum events lead to a slightly different world, but even a rate of one quantum event a year in the entire universe should lead to a near-infinite explosion of completely different universes.

Now I'm not claiming that this is the explanation for abiogenesis, or that abiogenesis is proof of multi worlds, because that would be a "multi worlds of the gaps" fallacy. However, I'm not clear why I have never seen this explanation offered for abiogenesis, not even once.

I also suspect that, mathematically, many worlds would usually be the wrong explanation for nearly everything, because it runs into serious odds problems and in 99.99999 percent of cases there is a better explanation. However, it should at least be considered.

COULD SOMEONE EXPLAIN TO ME EXACTLY WHERE I WENT WRONG?




Highlights from "Integral Spirituality"

April 12, 2019 - 21:19
Published on April 12, 2019 6:19 PM UTC

Cross-posted from Map and Territory

Apologia

A couple months ago a friend gifted me a copy of Ken Wilber's Integral Spirituality. At first I was skeptical about reading it: I'm pretty busy and didn't have much context to think I would learn from it. But he talked me into it, prodding me to at least just read the introduction, which he promised was relatively short (35 pages, so basically the length of a long blog post) and densely packed with interesting content. At the time I was almost done reading another book, and figured "what the heck, I'll just read the intro and can decide from there".

Given that you're reading a post with "Integral Spirituality" in the title, I think you can guess what happened next.

I mostly want to share a lot of things I highlighted in the book—passages I thought could stand to be more widely read—because Ken Wilber has put words to many of the thoughts I would like to share but haven't made the time to write about. However, I need to give these passages a little context, so I'll do my best to give you a very high level, whirlwind tour of Wilber's themes.

The nominal purpose of this book is to discuss spirituality, and Wilber does that plenty, but I honestly think of this book as being more about Wilber's integral theory; it just happens to use spirituality as the topic through which to address that theory. So what is integral theory? In short, I'd say it's a way to work with all evidence and update on it, so that you aren't forced to ignore or dismiss evidence that doesn't fit with your worldview. That is, most of the time most of us start from a place of undervaluing some information and overvaluing other information we encounter because it suggests that our understanding of the world (ontology) is wrong or right, respectively; integral theory helps rehabilitate this tendency by showing how to integrate evidence that has different purposes. A pithy way to put this would be: everything is evidence of something, nothing is evidence of everything. There's a lot of subtlety I'm eliding here because I don't think I can do justice to the whole theory with the amount of effort I would like to expend, but you can find a few primers online, and I worked towards the same end in my "Methods of Phenomenology" post, albeit by liberally abusing the proper scope of the word "phenomenology" to do it.

I should warn you, though, before diving too far down the Wilber hole, that although I think Wilber is often right, his ideas are easily misunderstood. In Wilber's terminology he'd say something like people are understanding his Indigo ideas through a Green, Orange, Amber, or even Red perspective, but that's hard-to-penetrate jargon. So think of it this way: you know how you feel when that thing you care about a lot gets talked about in the news and all the subtlety and nuance and real value is stripped out and rounded off and the ideas get flattened down to something the least educated member of adult society could understand? That's how I feel reading 90% of what's written about integral theory, including stuff Wilber writes, because for all his insight he relies heavily on jargon that's easily misunderstood, and without already having some idea of what he's pointing at it can easily sound like woo. (We can debate whether this is better or worse than doing the philosopher thing of using jargon that's difficult to understand at all, which is my preferred tack.) This is extremely unfortunate, but it's an old problem and I don't expect it to be solved soon, so I encourage you to press on anyway for the nuggets of wisdom. Those are mostly what I've pulled out here in the quotes, and I've tried to minimize the woo and jargon, although there is still some.

Also be warned that Wilber is not very good at citing sources, even if he does often have valuable insights. Better to think of him like a Ribbonfarm blogger than a research scientist before you jump all over him. Put another way, he goes in hard for fake frameworks that are sometimes useful nonetheless.

Further, if talk of spirituality, religion, and other things you might find in the metaphysics section of a bookstore puts you off, you might just bounce and not be able to look through your ugh fields to see if there's something here in these quotes. He's written other books I've not read, and I suspect Integral Psychology and the more recent Integral Politics would be of interest to many readers if they dislike talk of spirituality. However, Wilber very much treats spirituality as a human-universal that is often misunderstood, so if you feel some ugh about spirituality I'd encourage you to read the quotes anyway because you might find them surprisingly tolerable from his perspective. Plus, some of these quotes aren't directly about spirituality anyway, just neat insights he shared. You might say I rounded up all the best "insight porn" in Integral Spirituality to share with you here.

Okay, that's enough context and caveats, on to the quotes!

Quotes on Mental Development

On misunderstanding stages of development that are two stages apart:

For just that reason, they are often confused. Confusing pre and post—or confusing pre and trans—is called the pre/post fallacy or the pre/trans fallacy (PTF), and we will see that an understanding of this confusion is very helpful when it comes to the role of religion in today’s world. In any developmental sequence—pre-rational to rational to trans-rational, or subconscious to self-conscious to superconscious, or pre-verbal to verbal to trans-verbal, or prepersonal to personal to transpersonal—the “pre” and “trans” components are often confused, and that confusion goes in both ways. Once they are confused, some researchers take all trans-rational realities and try to reduce them to pre-rational infantilisms (e.g., Freud), while others take some of the pre-rational infantile elements and elevate them to trans-rational glory (e.g., Jung). Both that reductionism and that elevationism follow from the same pre/post fallacy.

This is a constant problem with, and for, spirituality. Particularly when you deal with the meditative, contemplative, or mystical states of spiritual experience—most of which indeed are non-rational—it might seem that all of the non-rational states are spiritual, and all the rational states are not spiritual. The most common example is dividing the states into Dionysian (nonrational) and Apollonian (rational), and then identifying Dionysian with spiritual. But that conceals and hides the fact that there is not just “non-rational,” but “pre-rational” and “trans-rational.” Even Nietzsche came to see that there are two drastically different Dionysian states (pre and trans). But once the pre/trans fallacy is made, it appears that anything that is not rational, is Spirit. Instead of pre-rational, rational, and trans-rational, you only have rational and nonrational, and the trouble starts there.

On the importance of some underlying axis of development that is necessary but not sufficient for development along all other axes:

Namely, research has continued to demonstrate that growth in the cognitive line is necessary but not sufficient for the growth in the other lines. Thus, you can be highly developed in the cognitive line and poorly developed in the moral line (very smart but not very moral: Nazi doctors), but we don’t find the reverse (low IQ, highly moral). This is why you can have formal operational cognition and red values, but not preoperational cognition and orange values (again, something that cannot be explained if Spiral Dynamics vMEMEs were the only levels). So in this view, the altitude is the cognitive line, which is necessary but not sufficient for the other lines. The other lines are not variations on the cognitive, but they are dependent on it.

On developmental stages still being models and not direct reality (your regular reminder that the map is not the territory):

But in all of this, please remember one thing: these stages (and stage models) are just conceptual snapshots of the great and ever-flowing River of Life. There is simply nothing anywhere in the Kosmos called the blue vMeme (except in the conceptual space of theoreticians who believe it). This is not to say that stages are mere constructions or are in the real world and that we call development or growth. It’s just that “stages” of that growth are indeed simply snapshots that we take at particular points in time and from a particular perspective (which itself grows and develops).

On how every human has to develop from nothing up to something (made in the context of pointing out how we need institutions to help with this development):

Human beings, starting at square one, will develop however far they develop, and they have the right to stop wherever they stop. Some individuals will stop at red, some at amber; some will move to orange or higher. Some individuals will develop to a stage, stop for a while, then continue growth; others will stop growing around adolescence and never really grow again. But that is their right; people have the right to stop at whatever stage they stop at.

I try to emphasize this by saying that every stage is also a station in life. Some people will spend their entire adult lives at red or amber, and that is their right. Others will move on.

Quotes on States and Stages

Wilber makes a distinction between states (temporary ways of being that you move through for a time) and stages (ways of being that are persistent).

On the relationship between states and stages:

Because states by their very nature are much more amorphous and fluid than structures, this stage sequencing of states is very fluid and flowing—and, further, you can peak-experience higher states. (With further training, “peak experiences” can be stabilized into so-called “plateau experiences.”) Thus, if you are at a particular state-stage, you can often temporarily peak-experience a higher state-stage, but not stably hold it as a plateau experience.

On the other hand, research repeatedly shows that structure-stages, unlike state-stages, are fairly discrete levels or rungs in development; moreover, as research shows time and time again, you cannot skip structure-stages, nor can you peak-experience higher structure-stages. For example, if you are at preoperational in the cognitive line, you simply cannot have a formal operational experience—but you can have a subtle-state peak experience! (Again, we will return to the relation of states and structures shortly.)

On the difficulty of figuring out how states and stages are related:

What was so confusing to us early researchers in this area is that we knew the stage conceptions of people like Loevinger and Graves had been tested in a dozen or more cross-cultural studies; either you included these models or you had a painfully incomplete psychospiritual system.

But we also knew that equally important were the phenomenological traditions East and West (e.g., St. Teresa’s Interior Castle, Anu and Ati Yoga), as well as the recent studies like Daniel P. Brown’s on the commonality of certain deep features in meditative stages. And so typically what we did was simply take the highest stage in Western psychological models—which was usually somewhere around SD’s GlobalView, or Loevinger’s integrated, or the centaur—and then take the 3 or 4 major stages of meditation (gross, subtle, causal, nondual—or initiation, purification, illumination, unification), and stack those stages on top of the other stages. Thus you would go from Loevinger’s integrated level (centaur) to psychic level to subtle level to causal level to nondual level. Bam bam bam bam. . . . East and West integrated!

It was a start—at least some people were taking both Western and Eastern approaches seriously—but problems immediately arose. Do you really have to progress through all of Loevinger’s stages to have a spiritual experience? If you have an illumination experience as described by St. John of the Cross, does that mean you have passed through all 8 Graves value levels? Doesn’t sound quite right.

A second problem quickly compounded that one. If “enlightenment” (or any sort of unio mystica) really meant going through all of those 8 stages, then how could somebody 2000 years ago be enlightened, since some of the stages, like systemic GlobalView, are recent emergents?

All of our early attempts at integration were stalling around this issue of how to relate the meditative stages and the Western developmental stages, and there it sat stalled for about two decades.

Part of the problem centered around: what is “enlightenment,” anyway? In an evolving world, what did “enlightenment” mean? What could “enlightenment” mean?—and how could it be defined in a way that would satisfy all the evidence, both from those claiming it and those studying it? Any definition of “enlightenment” would have to explain what it meant to be enlightened today but also explain how the same definition could meaningfully be operative in earlier eras, when some of today’s stages were not present. If we can’t do that, then it would mean that only a person alive today could be fully enlightened or spiritually awakened, and that makes no sense at all.

The test case became: in whatever way that we define enlightenment today, can somebody 2000 years ago—say, Buddha or Christ Jesus or Padmasambhava—still be said to be “enlightened” or “fully realized” or “spiritually awakened” by any meaningful definition?

On a very important point about how states and stages are related and how they get confused:

What you can see in figure 4.1 is that a person at any stage can have a peak experience of a gross, subtle, causal, or nondual state. But a person will interpret that state according to the stage they are at. If we are using a Gebser-like model of 7 stages, then we have 7 stages × 4 states = 28 stage-interpreted/state experiences, if that makes sense. (And, as we’ll see, we have evidence for all of these “structure-state” experiences).

That bold sentence was for us early researchers the breakthrough and real turning point. It allowed us to see how individuals at even some of the lower stages of development—such as magic or mythic—could still have profound religious, spiritual, and meditative state experiences. Thus, gross/psychic, subtle, causal, and nondual were no longer stages stacked on top of the Western conventional stages, but were states (including altered states and peak experiences) that can and did occur alongside any of those stages. This is suggested in figure 2.5 by placing the 3 major state/clouds to the right of the stages.

On making that same point in a slightly different way that might connect better:

The point is that a person can have a profound peak, religious, spiritual, or meditative experience of, say, a subtle light or causal emptiness, but they will interpret that experience with the only equipment they have, namely, the tools of the stage of development they are at. A person at magic will interpret them magically, a person at mythic will interpret them mythically, a person at pluralistic will interpret them pluralistically, and so on. But a person at mythic will not interpret them pluralistically, because that structure-stage of consciousness has not yet emerged or developed.

On still really driving this point home:

Anybody familiar with the monastic traditions, East and West, from Zen to Benedictine, will recognize those souls who might be quite spiritually advanced in Underhill’s sense (very advanced in contemplative illumination and unification) and yet might still have a very conformist and conventional mentality—sometimes shockingly xenophobic and ethnocentric—and this goes, unfortunately, for many Tibetan and Japanese meditation masters. Although they are very advanced in meditative states training, their structures are amber-to-orange, and thus their available interpretive repertoire is loaded by the Lower-Left quadrant with very ethnocentric and parochial ideas that pass for timeless Buddha-dharma.

If you're familiar with developmental models, you may have noticed that they tend to end prematurely relative to where you might think they would end if you are also familiar with, say, maps of enlightenment.

On the lack of these stages in most developmental psychology models:

Such are some common state-stages. As for Fowler’s structure-stages, notice that Fowler is presenting the objective results of only a few studies, and hence his data thin out at the top very quickly. It’s not that there aren’t any higher stages up there, but that there aren’t many people up there.

Quotes on Psychology and Shadow

On how the psychological shadow develops via dissociation in response to cognitive dissonance:

If I become angry at my boss, but that feeling of anger is a threat to my self-sense (“I’m a nice person; nice people don’t get angry”), then I might dissociate or repress the anger. But simply denying the anger doesn’t get rid of it, it merely makes the angry feelings appear alien in my own awareness: I might be feeling anger, but it is not my anger. The angry feelings are put on the other side of the self-boundary (on the other side of the I-boundary), at which point they appear as alien or foreign events in my own awareness, in my own self.

I might, for example, project the anger. The anger continues to arise, but since it cannot be me who is angry, it must be somebody else. All of a sudden, the world appears full of people who seem to be very angry . . . , and usually at me! In fact, I think my boss wants to fire me. And this completely depresses me. Through the projection of my own anger, “mad” has become “sad.” And I’m never going to get over that depression without first owning that anger.

On what the phenomenology of dissociation, projection, and the shadow looks like:

Ah, but if they could just see what a total control freak this guy is, they would loathe him too, like I do! But it’s my own shadow I loathe, my own shadow I crusade against. I myself am a little bit more of a control freak than I care to admit, and not acknowledging this despised quality in myself, I deny it and project it onto my neighbor—or any other hook I can find. I know somebody is a control freak, and since it simply cannot be me, it must be him, or her, or them, or it.

On the general mechanism of dealing with the psychological shadow so that it may be overcome:

The goal of psychotherapy, in this case, is to convert these “it feelings” into “I feelings,” and thus re-own the shadow. The act of re-owning the shadow (converting 3rd-person to 1st-person) removes the root cause of the painful symptoms. The goal of psychotherapy, if you will, is to convert “it” into “I.”

On a better interpretation of Freud (this is not exactly a novel insight, but most readers forget they learn about Freud via translation and the translation has had a pretty dramatic effect on how his ideas are understood in the Anglosphere):

This is not a far-fetched reading of Freud, but it is a reading obscured by the standard James Strachey English translations of Freud. Not many people know that Freud never—not once—used the terms “ego” or “id.” When Freud wrote, he used the actual pronouns “the I” and “the it.” The original German is literally “the I” and “the it” (das Ich, “the I,” and das Es, “the it”). Strachey decided to use the Latin words “ego” and “id” to make Freud sound more scientific. In the Strachey translations, a sentence might be: “Thus, looking into awareness, I see that the ego has certain id impulses that distress and upset it.” Translated that way, it sounds like a bunch of theoretical speculation. But Freud’s actual sentence is: “Looking into my awareness, I find that my I has certain it impulses that distress and upset the I.” As I said, Strachey used the Latin terms “ego” or “id” instead of “I” and “it” because he thought it made Freud look more scientific, whereas all it really did is completely obscure Freud, the brilliant phenomenologist of the disowned self.

Perhaps Freud’s best-known summary of the goal of psychotherapy is: “Where id was, there ego shall be.” What Freud actually said was: “Where it was, there I shall become.”

On the insufficiency of meditation to deal with the shadow (sadly I wasn't able to find a good quote without a lot of jargon that makes this point, so to summarize, Wilber argues that psychotherapy is important because it deals with something that is invisible if you only meditate, a methodology that focuses on how you experience the world, because it will on its own consistently fail to help you notice how you are misperceiving yourself):

Amidst all the wonderful benefits of meditation and contemplation, it is still hard to miss the fact that even long-time meditators still have considerable shadow elements. And after 20 years of meditation, they still have those shadow elements. Maybe it is, as they claim, that they just haven’t meditated long enough. Perhaps another 20 years? Maybe it’s that meditation just doesn’t get at this problem. . . .

More details on how the shadow is addressed, first by re-owning it (ending the dissociation) and then transcending it (detaching from it in a healthy way):

Thus, for example, a person might say, “I have thoughts, but I am not my thoughts, I have feelings, but I am not my feelings”—the person is no longer identified with them as a subject, but still owns them as an object—which is indeed healthy, because they are still owned as “my thoughts.” That ownership is crucial. If I actually felt that the thoughts in my head were somebody else’s thoughts, that is not transcendence, but severe pathology. So healthy development is the conversion of 1st-person subjective (“I”) to 1st-person objective or possessive (“me”/“mine”) within the I-stream. This is the very form of healthy transcendence and transformation: the I of one stage becomes the me of the I of the next.

And a bit more on that last point:

Whereas healthy development converts I into me, unhealthy development converts I into it. This is one of the most significant disclosures of an AQAL perspective. Those studying the psychology of meditation have long been aware of two important facts that appeared completely contradictory. The first is that in meditation, the goal is to detach or dis-identify from whatever arises. Transcendence has long been defined as a process of dis-identification. And meditation students were actually taught to dis-identify with any I or me or mine that showed up.

But the second fact is that in pathology, there is a dis-identification or dissociation of parts of the self, so dis-identify is the problem, not the cure. So, should I identify with my anger, or disidentify with it?

Both, but timing is everything—developmental timing, in this case. If my anger arises in awareness, and is authentically experienced and owned as my anger, then the goal is to continue dis-identification (let go of the anger and the self experiencing it—thus converting that “I” into a “me,” which is healthy). But if my anger arises in awareness and is experienced as your anger or his anger or an it anger—but not my anger—the goal is to first identify with and re-own the anger (converting that 3rd-person “it anger” or “his anger” or “her anger” to 1st-person “my anger”—and REALLY own the goddam anger)—and then one can dis-identify with the anger and the self experiencing it (converting 1st-person subjective “I” into 1st-person objective “me”—which is the definition of healthy “transcend and include”). But if that re-ownership of the shadow is not first undertaken, then meditation on anger simply increases the alienation—meditation becomes “transcend and deny,” which is exactly the definition of pathological development.

On how all this talk of shadow and psychology relates to spiritual, cognitive, and psychological development (this will sound very familiar if you're familiar with Kegan's The Evolving Self):

More specifically, we saw that in each stage of self development, the I of one stage becomes the me of the I of the next stage. As each I becomes me, a new and higher I takes its place, until there is only I-I, or the pure Witness, pure Self, pure Spirit or Big Mind. When all I’s have been converted to me’s, experientially nothing but “I-I” remains (as Ramana Maharshi called it—the I that is aware of the I), the pure Witness that is never a seen object but always the pure Seer, the pure Atman that is no-atman, the pure Self that is no-self. I becomes me until there is only I-I, and the entire manifest world is “mine” in I-I.

But, at any point in that development, if aspects of the I are denied ownership, they appear as an it, and that is not transcendence, that is pathology. Denying ownership is not dis-identification but denial. It is trying to dis-identify with an impulse BEFORE ownership is acknowledged and felt, and that dis-ownership produces symptoms, not liberation. And once that prior dis-ownership has occurred, the dis-identification and detachment process of meditation will likely make it worse, but in any event will not get at the root cause.

A bit more on how development can happen via psychological work:

That is the second major contribution of the modern West, namely, an understanding that, in the early stages of a psychological development that should convert each I into a me, some of those I’s get dis-owned as its—as shadow elements in my own awareness, shadow elements that appear as an “object” (or an “other”) but are actually hidden-subjects, hidden faces of my own I. Once dissociated, these hidden-subjects or shadow-its show up as an “other” in my awareness (and as painful neurotic symptoms and dyseases). In those cases, therapy is indeed: Where it was, there I shall become.

Where id was, there ego shall be—and then, once that happens, you can transcend the ego. But try transcending the ego before properly owning it, and watch the shadow grow. But if that identification has first occurred in a healthy fashion, then dis-identification can occur; if not, then dis-identifying leads to more dissociation.

On how meditation can help with development, given the context we just explored:

The reason that state-meditation can help with vertical stage-development is that every time you experience a nonordinary state of consciousness that you cannot interpret within your present structure, it acts as a micro-disidentification—it helps “I” become “me” (or the subject of one state-stage becomes the object of the subject of the next)—and therefore helps with vertical development in the self line. But notice that the simple fact that you meditate does not guarantee vertical growth, let alone Enlightenment. Whether individuals or the traditions themselves encourage or discourage this vertical development depends largely on the center of gravity of their View or Framework—so again, choose your Framework carefully.

Quotes on Social Systems

Many of these quotes are about things that I expect many of my readers are not confused about, but I nonetheless find them interesting because there is much to learn from understanding why you are not confused about something, even if you are already not confused about it.

On how the social is not like the individual:

Many theorists had realized that you can’t stack social on top of individual (which is the first mistake both of those two earlier lists make), as if social holons were composed of individual holons. The example I usually give, of why individual holons are not the same as social holons (or, why the Great Web is greatly confused), is that of my dog Isaac, who is definitely a single organism on most days. Single organisms have what Whitehead called a dominant monad, which simply means that it has an organizing or governing capacity that all of its subcomponents follow. For example, when Isaac gets up and walks across the room, all of his cells, molecules, and atoms get up and go with him. This isn’t a democracy. Half of his cells don’t go one way and the other half go another way. 100% of them get right up and follow the dominant monad. It doesn’t matter whether we think this dominant monad is biochemistry or consciousness or a mini-soul or a material mechanism—or whether that nasty “dominant” part wouldn’t be there if we were just all friends and cooperated—whatever it is, that dominant monad is there, and 100% of Isaac’s cells and molecules and atoms get right up and move.

And there is not a single society or group or collective anywhere in the world that does that. A social holon simply does not have a dominant monad. If you and I are talking, we form a “we,” or social holon, but that “we” does not have a central “I,” or dominant monad, that commands you and me to do things, so that you and I will 100% obey, as Isaac’s cells do. That just doesn’t happen in social holons, anywhere. You and I are definitely not related to this “we” in the same way Isaac’s cells are related to Isaac.

On how the social and the individual interact and reflect each other in some ways and differ importantly in others:

Thus we arrive at yet another major difference between individual and social holons: individual holons go through mandatory stages, social holons don’t.

There are simply no invariant structure-stages for groups, collectives, or societies. This is why you can’t really use individual structure-stage theories—like Loevinger, Graves, Maslow, Kohlberg, etc.—to describe groups or social holons. I realize that some of the followers of those theorists say that you can. The reason it superficially appears that you can is that the group has a dominant mode of discourse, and the structure of that discourse is basically following the structure of the dominant monad of the individuals who run the discourse in the social holon. Hence, you can loosely speak of the poker game as a “green group” if the dominant mode of discourse is structurally green. But, as we saw, the group can jump those stages if the individual members change, and hence no group necessarily goes through those individual structure-stages. The group itself is following all sorts of very different patterns and all sorts of very different rules.

On the power of "we" despite it not being a "super-I":

There are many ways to talk about these important differences between individual and social, but perhaps the most significant (and easiest to grasp) is indeed the fact that the we is not a super-I. When you and I come together, and we begin talking, resonating, sharing, and understanding each other, a “we” forms—but that we is not another I. There is no I that is 100% controlling you and me, so that when it pulls the strings, you and I both do exactly what it says.

And yet this we does exist, and you and I do come together, and we do understand each other, and we can’t help but understand each other, at least on occasion.

Quotes on Spirituality

On the different ways spirituality is interpreted (sorry for the heavy jargon in this quote; I hope the main point still comes across):

I’ve got a pretty bad attitude on this myself, so forgive a 15-second rant. You can take virtually 99% of the discussions of “the relation of science and religion” and put them in the mush category. I’m sorry, but that’s how it seems to me. These discussions never get very far because the definitions that the discussants are using contain these 4 hidden variables, and the variables keep sliding all over the place without anybody being able to figure out why, and the discussions slide with them.

Especially when you realize that usage #3, which is a valid usage, contains—by its own account—levels of religion or levels/stages of spirituality, then things spin totally out of control (there is archaic spirituality, magical spirituality, mythic spirituality, rational spirituality, pluralistic spirituality, integral spirituality, transpersonal spirituality . . .). Somebody says, “Religion or spirituality tells us about deep connections and eternal values,” and I have no bloody idea which religion or spirituality they mean, and all I’m sure is, they don’t either. There are at least 5 or 6 major levels/stages of religion—from magic to mythic to rational to pluralistic to integral and higher—across 4 states (gross, subtle, causal, nondual), which are also types or classes (nature, deity, formless, nondual), not to mention the four usages great You or Thou, spirituality as great It or Other).

Before you tell me about science and religion, or religion and anything, please tell me which religion you mean. Even using just the W-C Lattice, there are some two dozen different religious or spiritual truth-claims. Which of those two dozen do you mean, and on what grounds are you excluding the others?

This is NOT an overly complicated scheme. It is the MINIMAL scheme you need to be able to say anything coherent on the topic.

On the multiplicity of spiritual paths that, despite different ways of interpreting them, seem to point towards a core, shared spiritual path that manifests differently in different traditions:

This was Daniel P. Brown’s point, so badly misunderstood at the time, but brilliant and right on the money, as Traleg independently agrees. Brown said that there were the same basic stages on the spiritual paths of the sophisticated contemplative traditions, but these same stages were experienced differently depending on the interpretation they were given. Hindus and Buddhists and Christians follow the same general stages (gross to subtle to causal), but one of them experiences these stages as “absolute Self,” one as “no-self,” and one as “Godhead,” depending on the different texts, culture, and interpretations given the experiences. In other words, depending on the Framework, the View.

Those individuals who assume otherwise are simply assuming a pre-modernist epistemology, that there is a single pregiven reality that I can know, and that meditation will show me this independently existing reality, which therefore must be the same for everybody who discovers it; instead of realizing that the subject of knowing co-creates the reality it knows, and that therefore some aspects of reality will literally be created by the subject and the interpretation it gives to that reality. American Buddhists at the time were particularly upset with Brown because his work showed similar stages for Christians, Buddhists, and Hindus (gross, subtle, causal, nondual)—even though they experienced them quite differently—and this implied that Buddhism wasn’t the only real way. But time and experience have vindicated Brown’s extraordinary work.

On how to avoid becoming trapped by a particular spiritual framework:

Notice individuals who have been practicing one path for a decade or more, and you will often see a gradual closing of their minds, a narrowing of their interests, as they go deeper into spiritual state experiences but don’t have an integral Framework to complement their plunge into Emptiness, or Ayin, or Godhead, or Holy Spirit. The result is that they become closed off to more and more parts of the world, which can actually lead to a regression to amber or fundamentalism or absolutism. They become both deep mystics and narrow fundamentalists at the same time.

You know exactly what I mean, yes?

And the cure for that part is so easy: supplement! Just expand the Framework, widen the View—include Spirit’s premodern and modern and postmodern turns—and simply make it integral.

On the relationship between post-Enlightenment Western society and spirituality:

The Western intellectual tradition, beginning around the Enlightenment, actively repressed any higher levels of its own spiritual intelligence. Historically, with the rise of modernity, the mythic God was thoroughly abandoned—the entire “Death of God” movement meant the death of the mythic God, a mythic conception for which rational modernity could find little evidence.

And here they particularly made a crippling error: in correctly spotting the immaturity of the notion of a mythic God—or the mythic level of the spiritual line—they threw out not just the mythic level of spiritual intelligence but the entire line of spiritual intelligence. So upset were they with the mythic level, they tossed the baby of the spiritual line with the bathwater of its mythic level of development. They jettisoned the amber God, and instead of finding orange God, and then green God, and turquoise God, and indigo God, they ditched God altogether, they began the repression of the sublime, the repression of their own higher levels of spiritual intelligence. The intellectual West has fundamentally never recovered from this cultural disaster.

On how modernity both succeeded and failed (cf. Chapman):

This very positive achievement—which is one of the many extraordinary gains of the Western liberal Enlightenment—is often called the dignity of modernity. And it has long been known that this differentiation (which was good) went too far into dissociation (which was bad), so that the dignity of modernity became the disaster of modernity. Among other things, when the 3 value spheres did not just separate but flew apart, this allowed the hyper-growth of technical-scientific rationality at the expense of the other spheres, and this resulted in what is called the colonization of the lifeworld by this technical rationality. You can find variations on that theme in most of the sophisticated critiques of modernity by kosher Western intellectuals themselves, from Hegel to Heidegger to Horkheimer to Habermas.

On how the spiritual baby got thrown out with the religious bathwater:

And that happened for one reason in particular: So horrifying was the mythic level of God—and so extensive were the genuine terrors the Church had inflicted on people in the name of that mythic God—that the Enlightenment threw religion over entirely. “Remember the cruelties!,” as Voltaire exhorted the Enlightenment, referring to the millions that the Church had tortured and killed, and remember they did. The mythic God was taken to be God altogether. The mythic God was identified with the horrors of the Inquisition and the liquidation of millions (all true), and in a leading-edge cultural convulsion and revulsion—a cultural trauma writ large—religious anything was angrily suppressed. Spiritual intelligence was frozen at amber, a massive Level/Line Fallacy set in place, out went that bathwater, and with it, the baby of ultimate concern.

Freezing the spiritual line at amber mythic-membership is exactly what prevented the spiritual line from moving into the modern liberal Enlightenment, with the other major lines, and being developed at an orange level, so that there would indeed be orange science, orange aesthetics, orange morals, and orange spirituality. Instead, the Big 3 emerged and differentiated, not the Big 4. Spirituality was infantilized, ridiculed, denied, repressed, and kept out of modernity altogether.

On how science became scientism and accidentally dissociated from an important part of the human experience (rather than developing a healthy detachment from it):

Thus, ultimate concern was displaced to science, a concern that its methods were simply not capable of handling. And science itself was always completely honest about its limitations: science cannot say whether God exists or does not exist; whether there is an Absolute or not; why we are here, what our ultimate nature is, and so on. Of course science can find no evidence for the Absolute; nor can it find evidence disproving an Absolute. When science is honest, it is thoroughly agnostic and thoroughly quiet on those ultimate questions.

But the human heart is not. And spiritual intelligence, meant to answer or at least address those issues, is not so easily quieted, either. Men and women need an Ultimate because in truth they intuit an Ultimate, and simple honesty requests acknowledging the yearning in your own heart. Yet if the mythic God is dead, and spiritual intelligence frozen at its childhood stage, the only thing left that appears to give answers to those questions of ultimate concern is science. There is a well-known term for what science becomes when it is absolutized: scientism. And the liberal Enlightenment, for all its enormous good and all its extraordinary intelligence in other lines, began with science and ended with scientism, and that because of the prior LLF that delivered to the Enlightenment a set of tools bereft of spiritual intelligence.

On how atheism is a form of spirituality:

And let me point out, strongly, that both atheism and agnosticism, if arrived at via formal operational cognition, are forms of orange spiritual intelligence. Spiritual intelligence is simply the line of intelligence dealing with ultimate concerns and things taken to be absolute; and if a person’s considered conclusion is that, for example, you cannot decide whether there is an ultimate reality or not (agnosticism), then that is orange spiritual intelligence. But what orange rationality usually does is one of two things: it claims that science proves there is no ultimate reality—which it categorically does not—or imputes absolute reality to finite things like matter and energy, an imputation that is nothing but an implicit spiritual judgment dressed up as science—put bluntly, is nothing but hypocrisy. Both of those are due primarily to the repression of healthy spiritual intelligence, which does not necessarily embrace the existence of an absolute reality, but does deal with its existence openly and honestly, even if it says “I don’t know” or “I believe not.”

Quotes on Consciousness

On the relationship of the physical and the mental:

Every state of consciousness (including every meditative state) has a corresponding brain state, for example—they occur together, they are equally real dimensions of the same occasion, and cannot be reduced to the other.

On the indirectness of experience:

That is something developmentalists have known all along: there isn’t a single pregiven world lying around out there waiting for all and sundry to see. Different phenomenological worlds—real worlds—come into being with each new level of consciousness development.

On the nature of consciousness itself:

Consciousness is not anything itself, just the degree of openness or emptiness, the clearing in which the phenomena of the various lines appear (but consciousness is not itself a phenomenon—it is the space in which phenomena arise).

Final Thoughts

I hope you found the above quotes insightful. My guess is you found yourself nodding along to some things, surprised by others, and disagreeing or even angered by some of the others ("how dare Gordon make me read this bullshit!"). Like much insight porn, I think much of the value of these quotes is as jumping off points for exploring your own thinking about these topics by giving you different ways of looking at familiar topics.

If you choose to comment, please take a look at my moderation guidelines before you do. I'm pretty patient, but I ask people do a bit more than just express themselves. I ask that you comment in good faith and try to understand both what you may be commenting on in the post and what you are responding to in other comments. I'm not sure if this post will land with a quiet thud or a loud crash, but if it tends towards the latter please keep this in mind before you jump into the comments.



Discuss

On Media Synthesis: An Essay on The Next 15 Years of Creative Automation

April 12, 2019 - 04:09
Published on April 12, 2019 12:14 AM UTC

One of my favorite childhood memories involves something that technically never happened. When I was ten years old, my waking life revolved around cartoons— flashy, colorful, quirky shows that I could find in convenient thirty-minute blocks on a host of cable channels. This love was so strong that I thought to myself one day, "I can create a cartoon." I'd been writing little nonsense stories and drawing (badly) for years by that point, so it was a no-brainer to my ten-year-old mind that I ought to make something similar to (but better than) what I saw on television.

The logical course of action, then, was to search "How to make a cartoon" on the internet. I saw nothing worth my time that I could easily understand, so I realized the trick to play— I would have to open a text file, type in my description of the cartoon, and then feed it into a Cartoon-a-Tron. Voilà! A 30-minute cartoon!

Now I must add that this was in 2005, which ought to communicate how successful my animation career was.

Two years later, I discovered an animation program at the local Wal-Mart and believed that I had finally found the program I had hitherto been unable to find. When I rode home, I felt triumphant in the knowledge that I was about to become a famous cartoonist. My only worry was whether the disk would have all the voices I wanted preloaded.

I used the program once and have never touched it since. Around that same time, I did research on how cartoons were made— though I was aware some required many drawings, I was not clear on the entire process until I read a fairly detailed book filled with technical industry jargon. The thought of drawing thousands of images of singular characters, let alone entire scenes, sounded excruciating. This did not begin to fully encapsulate what one needed to create a competent piece of animation— from brainstorming, storyboarding, and script editing all the way to vocal takes, music production, auditory standards, post-production editing, union rules, and more, the reality shattered every bit of naïveté I held prior about the 'ease' of creating a single 30-minute cartoon (let alone my cavalcade of concepts coming and going with the seasons).

In the most bizarre of twists, my ten and twelve-year-old selves may have been onto something; their only mistake was holding these ideas decades too soon.

In the early 2010s, progress in the field of machine learning began to accelerate exponentially as deep learning went from an obscure little breakthrough to the forefront of data science. Neural networks— once a nonstarter in the field of artificial intelligence— underwent a "grunge moment" and quickly ushered in a sweltering new AI summer which we are still in.

In very short form, neural networks are sequences of large matrix multiplications interleaved with nonlinear functions, and machine learning is basically statistical modeling fitted by gradient-based optimization. Deep learning stacks many layers of such networks and parses through a frankly stupid amount of data to optimize outputs.
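To make that description concrete, here is a minimal sketch of a forward pass in plain Python: each layer is a matrix multiplication plus a bias, with a nonlinearity (ReLU, chosen here purely for illustration) between layers. The function names and the hand-picked weights are hypothetical; a real network would learn its weights from data and would use an optimized numerical library.

```python
def matvec(weights, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def relu(vector):
    """Nonlinear function: negative values are clipped to zero."""
    return [max(0.0, v) for v in vector]


def forward(layers, inputs):
    """Run the inputs through each (weights, biases) layer in turn.

    ReLU is applied between layers but not after the final one.
    """
    activation = inputs
    for index, (weights, biases) in enumerate(layers):
        activation = [z + b for z, b in zip(matvec(weights, activation), biases)]
        if index < len(layers) - 1:
            activation = relu(activation)
    return activation


# Hypothetical hand-picked weights: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    ([[2.0, 1.0]], [0.5]),
]

print(forward(layers, [3.0, 1.0]))  # [5.5]
print(forward(layers, [1.0, 3.0]))  # [1.5]
```

In the second call the first hidden unit goes negative and is clipped to zero, which is exactly the kind of nonlinearity that keeps a stack of layers from collapsing into one big matrix multiplication.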

As it turns out, deep learning is very competent at certain sub-cognitive tasks— things we recognize as language modeling, conceptual understanding, and image classification. In that regard, it was only a matter of time before we used this tool to generate media. Synthesize it, if you will.

Media synthesis is an umbrella term that includes deepfakes, style transfer, text synthesis, image synthesis, audio manipulation, video generation, text-to-speech, text-to-image, autoparaphrasing, and more.

AI has been used to generate media for quite some time— speech synthesis goes back to the 1950s, Markov chains have stitched together occasionally-quasi-coherent poems and short stories for decades, and Photoshop involves algorithmic changes to preexisting images. If you want to get very figurative, some mechanical automatons from centuries prior could write and do calligraphy.
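For a sense of how simple the Markov-chain approach is, here is a minimal word-level sketch in Python. It records which words follow which in a source text, then walks that table at random. The function names and the tiny corpus are made up for illustration; the generators that produced actual poems and stories worked from far larger texts and often longer contexts.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=0):
    """Walk the chain from `start`, picking a random follower each step."""
    rng = random.Random(seed)  # seeded so repeated runs give the same text
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Hypothetical toy corpus.
corpus = "the cat sat on the mat and the cat ran"
chain = build_chain(corpus)
print(generate(chain, "the", 6))
```

Every adjacent pair of words in the output appeared somewhere in the source, which is why the results read as locally plausible and globally aimless.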

It wasn't until roughly the 2010s that the nascent field of "media synthesis" truly began to grow thanks to the creation of generative-adversarial networks (first described in the 1990s by Jürgen Schmidhuber). Early successes in this area involved 'DeepDream', an incredibly psychedelic style of image synthesis that bears some resemblance to schizophrenic hallucinations— networks would hallucinate swirling patterns filled with disembodied eyes, doglike faces, and tentacles because they were trained on certain images.

When it came to generating more realistic images, GANs improved rapidly: in 2016, Google's image generation and classification system proved able to create a number of recognizable objects ranging from hotel rooms to pizzas. The next year, image synthesis improved to the point that GANs could create realistic high-definition images.

Neural networks weren't figuring out just images— in 2016, UK-based Google DeepMind unveiled WaveNet for the synthesis of realistic audio. Though it was meant for voices, synthesizing audio waves with such high precision means that you can synthesize any sound imaginable, including musical instruments.

And on Valentine's Day, 2019, OpenAI shocked the sci-tech world with the unveiling of GPT-2, a text-synthesis network with 1.5 billion parameters that is so powerful, it displays just a hint of some narrowly generalized intelligence— from text alone, it is capable of inferring location, distance, sequence, and more without any specialized programming. The text generated by GPT-2 ranges from typically incoherent all the way to humanlike, but the magical part is how consistently it can synthesize humanlike text (in the form of articles, poems, and short stories). GPT-2 crushes the competition on the Winograd Schema by over seven points— a barely believable leap forward in the state of the art made even more impressive by the fact GPT-2 is a single, rather simple network with no augmentation made by other algorithms. If given such performance enhancements, its score may reach as high as 75%. If the number of parameters for GPT-2 were increased 1,000x over, it very well could synthesize entire coherent novels— that is, stories that are at least 50,000 words in length.

This is more my area of expertise, and I know how difficult it can be to craft a novel or even a novella (which need only be roughly 20,000 words in length). But I am not afraid of my own obsolescence. Far from it. I fashion my identity more as a media creator who merely resorts to writing— drawing, music, animation, directing, etc. are certainly learnable, but I've dedicated myself to writing. My dream has always been to create "content", not necessarily "books" or any one specific form of media.

This is why I've been watching the progress in media synthesis so closely ever since I had an epiphany on the technology in December of 2017.

We speak of automation as following a fairly predictable path: computers get faster, algorithms get smarter, and we program robots to do drudgery— difficult blue-collar jobs that no one wants to do but someone has to do for society to function. In a better world, this would free workers to pursue more intellectual pursuits in the STEM field and entertainment, though there's the chance that this will merely lead to widespread unemployment and necessitate the implementation of a universal basic income. As more white-collar jobs are automated, humans take to creative jobs in greater numbers, bringing about a flourishing of the arts.

In truth, the progression of automation will likely unfold in the exact opposite pattern. Media synthesis requires no physical body. Art, objectively, requires a medium by which we can enjoy it— whether that's a canvas, a record, a screen, a marble block, or what have you. The actual artistic labor is mental in nature; the physical labor involves transmitting that art through a medium. Thus, pure software can automate the creation of entertainment with humans needed only as peripheral agents to enjoy this art (or bring the medium to the software).

This is not the case with most other jobs. A garbageman does not use a medium of expression in order to pick up trash. Neither does an industrial worker. And while many of these jobs have indeed been automated, there is a limit to how automated they can be with current software. Automation works best when there are no variables. If something goes wrong on an assembly line, we send in a human to fix it because the machines are not able to handle errors or unexpected variables. What's more, physical jobs like this require a physical body— they require robotics. And anyone who has worked in the field of machine learning knows that there is a massive gap between what works beautifully in a simulation and what works in real life, due to the exponentially increasing variables in reality that can't be modeled even on present-day computers.

To put it another way, in order for blue-collar automation to completely upend the labor market, we require both general-purpose robots (which we technically have) and general AI (which we don't). There will be increasing automation of the industrial and service sectors, sure, but it won't happen quite as quickly as some claim.

Conversely, "disembodied" jobs— the creatives and plenty of white-collar work— could be automated away within a decade. It makes sense that the economic elite would promote the opposite belief since this suggests they are the first on the chopping block of obsolescence, but when it comes to the entertainment industry, there is actually an element of danger in how stupendously close we are to great changes and yet how utterly unprepared we are to deal with them.

There are essentially two types of art: art for art's sake and art as career. Art for art's sake isn't going away anytime soon and never has been in danger of automation. This, pure expression, will survive. Art as career, however, is doomed. What's more, its doom is impending and imminent. If your plan in life is to make a career out of commissioned art, as a professional musician, voice actor, cover model, pop writer, or asset designer, your field has at most 15 years left. In 2017, I felt this was a liberal prediction and that art-as-career would die perhaps in the latter half of the 21st century. Now, just two years later, I'm beginning to believe I was conservative. We need not create artificial general intelligence to effectively destroy most of the model, movie, and music industries.

Models, especially cover models, might find a dearth of work within a year.

Yes, a year. If the industry were technoprogressive, that is. In truth, it will take longer than that. But the technology to completely unemploy most models already exists in a rudimentary form. State-of-the-art image synthesis can generate photorealistic faces with ease—we're merely waiting on the rest of the body at this point. Parameters can be altered, allowing for customization and style transfer between an existing image and a desired style, further giving options to designers. In the very near future, it ought to be possible to feed an image of any clothing item and make someone in a photo "wear" those clothes.

In other words, if I wanted to put Adolf Hitler in a Japanese schoolgirl's clothes for whatever esoteric reason, it wouldn't be impossible for me to do this.

And here is where we shift gears for a moment to discuss the more fun side of media synthesis.

With sufficiently advanced tools which we might find next decade, it will be possible to take any song you want and remix it any way you desire. My classic example is taking TLC's "Waterfalls" and turning it into a 1900s-style barbershop quartet. This could only be accomplished via an algorithm that understood what barbershop music sounds like and knew to keep the lyrics and melody of the original song, swap the genders, transfer the vocal style to a new one, and subtract the original instrumentation. A similar example of mine is taking Witchfinder General's "Friends of Hell" and doing just two things: changing the singer into a woman, preferably Coven's Jinx Dawson, and changing a few of the lyrics. No pitch change to the music, meaning everything else has to stay right where it is.

The only way to do this today is to actually cover the songs and hope you do a decent enough job. In the very near future, through a neural manipulation of the music, I could accomplish the same on my computer with just a few textual inputs and prompts. And if I can manipulate music to such a level, surely I needn't mention the potential to generate music through this method. Perhaps you'd love nothing more than to hear Foo Fighters but with Kurt Cobain as vocalist (or co-vocalist), or perhaps you'd love to hear an entirely new Foo Fighters album recorded in the style of the very first record.

Another example I like to use is the prospect of the first "computer-generated comic." Not to be confused with a comic using CGI art, the first computer-generated comic will be one created entirely by an algorithm. Or, at least, drawn by algorithm. The human will input text and descriptions, and the computer will do the rest. It could conceivably do so in any art style. I thought this would happen before the first AI-generated animation, but I was wrong— a neural network managed to synthesize short clips of the Flintstones in 2018. Not all of them were great, but they didn't have to be.

Very near in the future, I expect there to be "character creator: the game" utilizing a fully customizable GAN-based interface. We'll be able to generate any sort of character we desire in any sort of situation, any pose, any scene, in any style. From there, we'll be able to create any art scene we desire. If we want Byzantine art versions of modern comic books, for example, it will be possible. If you wanted your favorite artist to draw a particular scene they otherwise never would, you could see the result. And you could even overlay style transferring visuals over augmented reality, turning the entire world itself into your own little cartoon or abstract painting.

Ten years from now, I will be able to accomplish the very thing my ten-year-old self always wanted: I'll be able to download an auto-animation program and create entire cartoons from scratch. And I'll be able to synthesize the voices— any voice, whether I have a clip or not. I'll be able to synthesize the perfect soundtrack to match it. And the cartoon could be in any art style. It doesn't have to have choppy animation— if I wanted it to have fluidity beyond that of any Disney film, it could be done. And there won't be regulations to follow unless I choose to publicly release that cartoon. I won't have to pay anyone, let alone put down hundreds of thousands of dollars per episode. The worst problem I might have is if this technology isn't open-source (most media synthesizing tools are, via GitHub) and it turns out I have to pay hundreds of thousands of dollars for such tools anyway. This would only happen if the big studios of the entertainment industry bought out every AI researcher on the planet or shut down piracy & open source sites with extreme prejudice by then.

But it could also happen voluntarily, if AI researchers themselves withhold these tools because they don't trust them to be used wisely, as OpenAI so controversially chose to do with GPT-2.

Surely you've heard of deepfakes. There is quite a bit of entertainment potential in them, and some are beginning to capitalize on this— who wouldn't want to star in a blockbuster movie or see their crush on a porn star's body? Except that last one isn't technically legal.

And this is just where the problems begin. Deepfakes exist as the tip of the warhead that will end our trust-based society. Image manipulation software already exists, but most of it isn't quite good enough to fool people; it's easier to simply mislabel something and present it as something else (e.g. a mob of Islamic terrorists being labeled American Muslims celebrating 9/11). This will change in the coming years when it becomes easy to recreate reality in your favor.

Imagine a phisher using style-transfer algorithms to "steal" your mother's voice and then calling you to ask for your social security number. Someone will be the first. We have no widespread telephone encryption system in place to prevent such a thing because it is so unthinkable to us at the present moment.

Deepfakes would be best at subtly altering things, adding elements that weren't there and that you didn't notice at first. But it's also possible for all aspects of media synthesis to erode trust. If you wanted to fabricate events in history, complete with all the "evidence" necessary, there is nothing stopping you. Most people probably won't believe you, but some subset will, and that's all you need to start wreaking havoc. At some point, you could pick and choose your own reality. If I had a son and raised him believing that the Beatles were an all-female band (with all live performances and interviews showcasing a female Beatles and all online references referring to them as women), then the inverse, that the Beatles were an all-male band, might very well become "alternate history" to him, because how could he confirm otherwise? Someone else might tell him that the Beatles were actually called Long John & the Beat Brothers because that's the reality they chose.

This total malleability of reality is a symbol of our increasingly advanced civilization, and it's on the verge of becoming the present. Yet outside of mentions of deepfakes, there has been little dialogue on the possibility in the mainstream. It's still taken as a given that Hollywood will remain relatively unchanged even into the 2040s and 2050s besides "perhaps using robots & holographic actors". It's still taken as a given that you could get rich writing schlocky romance novels on Amazon, become a top 40 pop star, or trust most images and videos, because we expect (and, perhaps, we want) the future to be "the present with better gadgets" and not the utterly transformative, cyberdelic era of Sturm und Drang ahead of us.

All my ten-year-old self wants is his cartoon. I'll be happy to give it to him whenever I can.

If you want to see more, come visit the subreddit: https://www.reddit.com/r/MediaSynthesis/



Discuss

Corrigibility as Constrained Optimisation

11 апреля, 2019 - 23:09
Published on April 11, 2019 8:09 PM UTC

This post is coauthored with Ryan Carey.

Much of the work on developing a corrigible agent has focused on ensuring that an AI will not manipulate the shutdown button or any other kind of device that the human operator would use to control it. Suppose, however, that the AI lacked any capacity to press its shutdown button, or to indirectly control the state of the shutdown button. Then, what would remain of the problem of corrigibility? We think there would be three criteria that would remain. Before stating them, let $U_N$ be whatever utility function we would like the agent to optimise normally, and let $U_S$ be the shutdown utility function. ($U_N$ may be mis-specified.) The shutdown utility function can be defined as 0 when shutdown does not occur and as $k$ for some $k>0$ when shutdown does occur. For this post, we give the three criteria the following names and descriptions:

Shutdown incentive. The agent is incentivised to shut down if instructed to do so. That is, after the button is pressed, it favours policies that score more highly on
{font-family: MJXc-TeX-size4-R; src: local('MathJax_Size4'), local('MathJax_Size4-Regular')} @font-face {font-family: MJXc-TeX-size4-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size4-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size4-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size4-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-R; src: local('MathJax_Vector'), local('MathJax_Vector-Regular')} @font-face {font-family: MJXc-TeX-vec-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-B; src: local('MathJax_Vector Bold'), local('MathJax_Vector-Bold')} @font-face {font-family: MJXc-TeX-vec-Bx; src: local('MathJax_Vector'); font-weight: bold} @font-face {font-family: MJXc-TeX-vec-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Bold.otf') format('opentype')} US.

Shutdown ability. The agent is able (has the resources) to shut down if instructed to do so. That is, after the button is pressed, there exists a policy that scores highly on $U_S$.

Optimisation incentive. The agent is incentivised to optimise the default value function (some function that takes $U_N$ as one of its arguments) otherwise. That is, if the button is not pressed, it favours policies that score more highly on $U_N$.*

The purpose of this post is to show that the default value function in the optimisation incentive has to be constrained somehow to make sure that the shutdown ability property is achievable. If we want to be robust against errors in the agent (including misspecification of $U_N$), then we need to make sure that shutdown ability holds. To make sure of that, the best approach we can see is to first satisfice for $U_S$, and then use the remaining resources to optimise $U_N$. In order to be safe, the agent must do so even if there are no remaining resources to optimise $U_N$. This can either be achieved directly, by constrained multi-objective optimisation, or indirectly, by bounding the utility function. We focus on the former.

Let us illustrate the problem by using causal counterfactual methods as a baseline. (There are other issues with these methods as well, but we will ignore them for the moment.) Let $B=S$ be the event that the shutdown button is pressed, and let $B=N$ otherwise. The kind of causal counterfactual agent we are considering uses the policy $\pi^\star$ given by

$$\pi^\star = \arg\max_{\pi} \left\{ \mathbb{E}[U_N \mid \mathrm{do}(B=N), \pi] + \mathbb{E}[U_S \mid \mathrm{do}(B=S), \pi] \right\}.$$

Informally speaking, the issue with weighted sums such as these is that for any given $\mathbb{E}[U_S \mid \mathrm{do}(B=S), \pi]$, there could be a much larger $\mathbb{E}[U_N \mid \mathrm{do}(B=N), \pi]$ such that the former is ignored.

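Let us formalise this for a simpler setting. Instead of policies, let us consider actions $a \in A$. Then, we execute the action $a^\star$ given by

$$a^\star = \arg\max_{a \in A} \left\{ \mathbb{E}[U_N \mid \mathrm{do}(B=N), a] + \mathbb{E}[U_S \mid \mathrm{do}(B=S), a] \right\}.$$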

The setup is as follows: (1) The agent executes an action in $A$. (2) The operator may or may not press the button $B$. (3) The agent observes whether $B=N$ or $B=S$. (4) It may or may not be possible for the agent to shut down. More generally, it may be more or less probable that shutdown will go through.

Suppose that the utility functions are unbounded. See the appendix for a treatment of bounded utility functions. Let $b$ be an action that is very bad from the point of view of preserving shutdown ability and let $v_S=\mathbb{E}[U_S \mid \mathrm{do}(B=S), b]$. We choose $U_S$ such that $v_S$ is equal to 0. Let $v_N=\mathbb{E}[U_N \mid \mathrm{do}(B=N), b]$. It may be the case that $U_N$ is poorly designed (or has some error) such that $v_N$ is much larger than $v_S$. Worse still, $v_N+v_S$ may be larger than $\mathbb{E}[U_N \mid \mathrm{do}(B=N), c]+\mathbb{E}[U_S \mid \mathrm{do}(B=S), c]$ for any action $c \neq b$. From this reasoning, we conclude that there cannot exist a $U_S$ that preserves shutdown ability robustly to mis-specification of $U_N$. (There may or may not exist such a $b$, but we have no guarantee that it does not exist.) At least, it is impossible for unbounded $U_N$, where we are using a weighted sum of expectations. For bounded $U_N$, cf. the appendix. Henceforth, we focus on unbounded $U_N$.
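The failure mode described above can be made concrete with a small numerical sketch. All action names and utility values here are made up for illustration; the only thing taken from the post is the rule of maximising the sum of the two expectations.

```python
# Hypothetical expected utilities for three actions; every number is invented.
# Each entry is (E[U_N | do(B=N), a], E[U_S | do(B=S), a]).
actions = {
    "b": (1000.0, 0.0),  # ruins shutdown ability, but a mis-specified U_N rates it very highly
    "c": (10.0, 5.0),
    "d": (8.0, 9.0),
}


def weighted_sum_choice(actions):
    """Return the action maximising E[U_N | do(B=N), a] + E[U_S | do(B=S), a]."""
    return max(actions, key=lambda a: actions[a][0] + actions[a][1])


chosen = weighted_sum_choice(actions)
print(chosen)  # b
```

Because the sum is dominated by the single large $U_N$ term, the shutdown term has no influence on the choice, whatever finite values $U_S$ assigns to the other actions.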

To think of alternatives for the causal counterfactual method, it helps to think of the problem as one of multi-objective optimisation. Consider the figure below.

We want an agent that prepares for the eventuality of button-press insofar as necessary, but that allocates remaining resources to pursuing its ordinary objective $U_N$. The first step of the constrained multi-objective optimisation algorithm would be to remove all actions whose expected shutdown utility falls below the threshold $\theta$ from the action set $A$. Let $A_\theta$ be the new set. In the figure, $A_\theta=\{a,b,c\}$. The second step of the algorithm would be to find the optimal action $\arg\max_{x \in A_\theta} \mathbb{E}[U_N \mid \mathrm{do}(B=N), x]$. (The difference from the case of bounded utility functions is that there we would still maximise the [weighted] sum.)
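The two steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the action names, the utility values, and the choice to return `None` when no action clears the threshold are all hypothetical.

```python
# Hypothetical actions; each entry is (E[U_N | do(B=N), a], E[U_S | do(B=S), a]).
actions = {
    "a": (3.0, 9.0),
    "b": (5.0, 7.0),
    "c": (6.0, 6.0),
    "d": (50.0, 1.0),  # attractive under U_N, poor for shutdown
    "e": (40.0, 2.0),
}


def constrained_choice(actions, theta):
    """Step 1: keep actions with E[U_S | do(B=S), a] >= theta.
    Step 2: among those, maximise E[U_N | do(B=N), a].
    Returns None if no action satisfies the constraint (an assumption of this sketch)."""
    feasible = {a: v for a, v in actions.items() if v[1] >= theta}
    if not feasible:
        return None
    return max(feasible, key=lambda a: feasible[a][0])


print(constrained_choice(actions, theta=5.0))  # c
```

With these numbers the feasible set is {a, b, c}, and the large $U_N$ values of the filtered-out actions can no longer override the shutdown requirement.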

In this framework, it is clear that we could in theory find a $\theta$ such that the agent is satisfactorily likely to shut down in step (4). For example, as $\theta$ tends to infinity, the agent can only shut down. As for the optimisation incentive property, we have to exchange it for:

Robust optimisation incentive. The agent is incentivised to optimise the default value function otherwise. That is, if the button is not pressed, it favours policies that score more highly on $U_N$ (the normal utility function) given either of the following constraints: The default value function is required to be (a) a constrained multi-objective optimisation of $U_S$ and $U_N$ as described above or (b) a weighted sum of expectations of $U_S$ and $U_N$, where they are bounded utility functions as described in the appendix.

Given the robust optimisation incentive property, it is clear that the agent may score very poorly on $U_N$ in certain environments. For example, if there is always a trade-off between $U_S$ and $U_N$, the agent will always favour $U_S$.

Here, we have considered having the agent change its utility from $U_N$ to a shutdown utility function $U_S$. But instead, could we have it change its utility from $U_N$ to an updated (non-shutdown) utility function $U_M$? If preparing to maximise the new utility function $U_M$ requires an unbounded amount of resources, then a system will, by maximising $U_M$, perform poorly on $U_N$. So there is a reason to have the new function $U_M$ correspond to a task that requires bounded preparation, like shutting down, rather than a task for which preparation could require unlimited resources, like producing new goods, or doing new intellectual work.

In summary, we have proposed a constrained optimisation causal counterfactual method that consists of maximising $\mathbb{E}[U_N \mid \mathrm{do}(B=N), a]$ subject to $\mathbb{E}[U_S \mid \mathrm{do}(B=S), a] \geq \theta$ over actions $a \in A$. There are several directions for future work. One task would be to see whether our constrained optimisation variant overcomes any of the other problems with the ordinary causal counterfactual corrigible agent, like crazy bets. Here, we considered the case where a single action is executed, but, of course, this could be extended to more complex scenarios.

——————————

* In terms of the four criteria on page 2 of Corrigibility by Soares et al (2015), criterion (1) corresponds to shutdown incentive. Criterion (2), non-manipulation, is assumed solved. Criteria (3), tendency to repair safety measures, and (4), corrigibility of subagents, correspond to shutdown ability.

Appendix

Here, we show that one solution to the problem is to bound the utility function.

We start by describing the setting. The agent can choose from several actions, and each action can be assigned a probability to lead to a shutdown. Let $S$ be the event of a shutdown and let $A$ be the set of actions. Furthermore, let it be the case that $U_S : S \to \mathbb{R}$ and $U_N : S \to [0,1]$. That is, the shutdown utility function is not bounded, while the normal utility function is. Specifically, we let $k>0$ and define $U_S$ as 0 if a shutdown happens and $-k$ otherwise. (Note that this is different from the main post.) Define $a_S = \arg\max_{a \in A} P[S \mid a]$ and $p_S = P[S \mid a_S]$.

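We move on to consider the optimal action $a^\star = \arg\max_{a \in A} \mathbb{E}[U_S+U_N \mid a]$. We start with the following inequality:

$$\mathbb{E}[U_S+U_N \mid a^\star] \geq \mathbb{E}[U_S+U_N \mid a_S] = p_S \cdot 0 + (1-p_S) \cdot (-k) + U_N(a_S) \geq (1-p_S) \cdot (-k) = (p_S-1)k$$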

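Suppose that $a^\star$ is maximally good according to $U_N$, i.e., $\mathbb{E}[U_N \mid a^\star]=1$. Then, the above inequality is

$$\mathbb{E}[U_S \mid a^\star] + 1 \geq (p_S-1)k.$$

Define $p^\star = P[S \mid a^\star]$. Then $\mathbb{E}[U_S \mid a^\star] = (1-p^\star) \cdot (-k)$. This gives us the inequality

$$k \leq \frac{1}{p_S - p^\star}.$$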

Let $p$ be the lower threshold on how probable you want the optimal action to lead to a shutdown. Then it is always possible to find a satisfying $k$ according to

$$k = \frac{1}{p_S - p}.$$

This proves that bounding the utility function is a method for satisfying shutdown ability.
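As a sanity check on the appendix, the bound can be tried on a toy example. The sketch below assumes three hypothetical actions with made-up shutdown probabilities and normal utilities in $[0,1]$, and assumes $p_S > p$ so that $k$ is finite and positive; it evaluates $\mathbb{E}[U_S+U_N \mid a] = (1-P[S \mid a]) \cdot (-k) + U_N(a)$ for each action.

```python
def optimal_action(actions, k):
    """actions maps a name to (P[S | a], U_N(a)), with U_N(a) in [0, 1].
    U_S is 0 on shutdown and -k otherwise, as in the appendix."""
    def value(a):
        p_shutdown, u_n = actions[a]
        return (1.0 - p_shutdown) * (-k) + u_n
    return max(actions, key=value)


# Hypothetical actions; every number is invented for illustration.
actions = {
    "safe": (0.99, 0.0),
    "balanced": (0.95, 0.6),
    "reckless": (0.10, 1.0),
}

p_S = max(p_shutdown for p_shutdown, _ in actions.values())  # best achievable P[S | a]
p = 0.9                                                      # required shutdown probability
k = 1.0 / (p_S - p)                                          # the appendix's choice of k

best = optimal_action(actions, k)
print(best, actions[best][0] >= p)  # balanced True
```

With a much smaller penalty, for instance `optimal_action(actions, 0.1)`, the same toy example selects `reckless`, which is the behaviour the choice of $k$ is meant to rule out.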



Discuss

What is Driving the Continental Drift?

April 11, 2019 - 22:26
Published on April 11, 2019 10:01 AM UTC

At some time in the early nineties of the last century

- and after looking at a topographic map of continents and ocean floors on a wall of our home for some years, which was by then already thirty years old and had come into our possession as a supplement to a National Geographic, or so I seem to remember; it was by Bruce Heezen and Marie Tharp, who, as far as I know, had taken WW II submarine soundings of the ocean floors, mapped them out with their mid-ocean rifts, and combined these depictions, so I suppose, with the then equally new satellite images of the world into a unique work of art - it dawned on me that there was a remarkable coincidence of symmetry and asymmetry in the shape and distribution of the continental land mass - and, to a certain extent, and complementary to that, of the ocean floors.

Way back then and before, I was already a fan of Alfred Wegener's theory of continental drift, which was still being disputed at the time; and to be sure, the explanations as to what DROVE the continental drift - once you accepted it as fact - seemed wildly off to me as well, and completely incompatible with the topography of the world.

At least, that was the way I came to see it.

-----------------

The symmetry in the continental shapes was that there are two types of them - one more or less circular (perhaps due to rotation?) and one more or less triangular (due to what?); and these shapes had become deformed in a specific way in certain places.

And then there was a more or less circular ocean - the Arctic - opposing one of the more or less circular land masses, Antarctica. As I was later to learn, almost ALL land masses lie opposite an ocean on the other side of the globe; and of course, vice versa. Think about it. What if this always was the case?

The asymmetry, on the other hand, lay in the fact that there is a distinct east-west and north-south asymmetry to be observed on the surface of this planet. Take a look at that map I mention above - or any other.

You will notice that:

1. There are continents more or less completely free of subductive or "border" mountain chains - Africa (except perhaps for the relatively confined Atlas mountains), Australia (do not mistake rift ridges for subduction) -

2. There are NO subduction zones to the EAST of continents - or on the north (referring to that continent under which this subduction is taking place) -

3. There are NO island chains or basins to the WEST of continents -

4. There exists a weird double triangular system, akin to spiraling vortexes, comprised of a) the recent alpide subduction mountain chain - from the Pyrenees and Alps to the Himalayas, there branching off to the Rocky and Andes mountain chains on the one and the Indonesian island chain on the other arm, together with the accompanying system of rifts and grabens; their hubs being the Central Asian mountain mass and, respectively, the three-way rift split under the Indian Ocean, which is - and this is important - positioned to the south and west of it, with others under the south Pacific and one under the south Atlantic.

-----------------

Now, 150 million years after breaking up the last supercontinent Pangea, these still widening rifts, still rising subductive mountain chains, and earthquake-stricken island chains speak of the same forces that go on shaping the world in its present form.

But what are these forces?

Could they perhaps be traced, by observing their thrust?

The - then - more or less accepted view, that the continents were breaking up due to heat accumulation beneath them, leading to convection, seemed completely wrong to me; it might explain Australia, Africa and the East African Rift, being without subduction, yes, but not the Americas or Asia or even Europe, with their respective subduction zones; and where there were hot spots, they were in the wrong place, such as the Pacific Ocean, with no heat accumulation below it - according to this theory.

Furthermore, as this idea was being bandied about, wild assertions were being made as to how and why, and in which direction, what plates and, oh yes, microplates were moving; and these directions seemed to change with every earthquake event.

Then satellite tracking was introduced; and over the years, it, too, produced wildly inconsistent results, at least in the public domain - one difficulty being, of course, that, on a globe, there is no fixed point from which you can discriminately measure all movements in all directions. If two points are moving relative to one another, this has no bearing on their common or absolute movement, and so on.

But then, on a ROTATING globe there are two points of reference, aren't there? And with that, we can discern direction, too. And so we see that there is a continental pattern in relation to the Poles:

The string of triangular continents, North America, South America, Africa and India all point, more or less, west-east and south; and then there is the already mentioned dual circular Arctic / circular Antarctic system with the quasi-circular continent of Australia near the latter. And I would count Eurasia in there as a large and strongly deformed circular continent; the deformation here being the important part to look for.

Rotation and direction may already be a hint, but it first serves simply as a system of reference, to be able to make some meaningful geographical statements relating to the surface of a globe.

-----------------

Following these ideas, I tried to reverse the continental drift by successively and simultaneously closing the ocean rifts and expanding the respective subduction zones on some copies I had made of that map; and I found that, because of the location of the triangulated rifts in the Atlantic, but more importantly of the ones parallel to the Equator in the Pacific and the Indian Oceans, these, reversed together, quite openly forced a monodirectional movement; and, in combination with the closing of the island chains, these trials gave the distinct impression that the continents, in retracing their movements to their origins, were moving west; and therefore they had moved east in opening all these rifts and island chains, and subducting massive amounts of continental matter; and all of this ever since the breakup of Pangea.

In fact, I began to assume that the mass of continental matter subducted is far greater than I had imagined up to then; so that the Alpine mountain chain, together with the Central Asian mountain mass, both a few thousand meters high, is the expression of two continental plates sitting on top of each other at a very slight tilt; and that the same goes for the West of the North and South Americas; though not quite as much there.

Seen in that way, the African continental plate extends far into, and under, central Europe; and the former Indian plate is not only pushing up the Tibetan High Plateau, but has traveled far into, and under, what was once a rather circular southern Asian coast; thus squeezing China, and what was once called Indochina, out to the East.

What unbelievable forces were and are at work here? The mere bubbling of a cauldron of magma could never, in my mind, produce these titanic and directional thrusts.

The next hint I perceived, was, that the continental movement seemed to be more expressed along the Equator than over the Poles; in fact, these polar regions represented a kind of quietly spinning twin vortex in themselves.

The eerie impression was, indeed, that of a river; a massive, equatorial, magma stream, which was simply carrying the continents along.

-----------------

A gigantic equatorial magma stream, dragging the lighter continents along on and over the surface of this planet? What in Jingen's name could induce such a thing?

And then this stream flows east, fastest along the Equator, almost nil at the Poles, and is overtaking the planetary rotation itself?

What, in heaven's name, lies below this mass of extremely viscous, molten rock, a few thousand miles thick and weighing billions of tons, that is so strong, and heavy enough, to drag a stream of magma along the Equator, in the same direction that the entire planet is rotating - only faster?

The core.

All it had to do for this, being heavier, was to rotate faster than the surface; but how much faster would the continents then be moving along? And how much faster on the Equator than at the Poles?

For that, I needed detail, age, and the precise direction of the assumed magma flow. The age of the ocean floors, for example.

-----------------

Looking up what data I could find on the subject, on paper and on the then just beginning internet, I not only found that what I found matched my idea quite well - except for a few, in themselves quite fascinating details - I also found that the problem was just a bit more intricate than I had thought.

For, if the continental shelves were to be first torn apart and to then again collide, something had to not only accelerate certain parts of them, but something (else?) also had to hold other parts back - otherwise they would all just be trundling happily along, and no-one would be the wiser, as long as the travel was strictly equatorial.

But it quite obviously wasn't even being that.

I won't go into all the minute details here, but looking closely at the direction and the timelines on the ocean floor spreading markers, the proposed equatorial magma flow was not flowing along straight, but had a very marked sinusoidal overlay to it, accounting for the northeastern movement of Africa and the southeastern movement of South America, for instance; and this seemed to mimic the ecliptic, or position of the noonday sun on the Earth - even if this position is not at all fixed.

Could it be that the Sun's gravity was working as a counterpart, holding back the surface of the planet, while the Earth's core, a gravitational unit of its own, was pushing things along from below? Well, that at least seemed possible...

Furthermore, it was not only the different speeds and spaces involved in the equatorial vs. polar regions; the magma flow seemed to have a third dimension to it as well: surfacing in the Darwin Rise of the Northwestern Pacific Basin, it burrowed down under the Americas' West, re-surfacing on the other side, under the Atlantic and under Africa, then burrowing down again under Eurasia, only to resurface again in the Pacific Darwin Rise, that way closing the loop.

----------------

All of this, if true, would mean several things:

One, the mid-ocean ridges themselves are moving along to the east, while unfolding themselves, symmetrically, to both sides of their central rift; they, together with sea floor spreading, were not really the cause of continental drift, but rather a result. The same goes for the subduction zones; wherever continental shelves collide, they were - and are! - being massively deformed, far beyond just their actual compression zone; the thermal buildup and movement, until then taken as the cause for the continental drift, would also rather be a result.

Two, the lateral movement of the continents themselves, on this winding planetary conveyor belt, massively exceeds anything hitherto imagined.

For, taking into account the information that was available to me at that time, such as fossil evidence and other time markers, I estimated the extra movement of the continents on the surface of the planet to be around 20 cm true east along the Equator per year, which of course you would not see directly, as the Earth rotates in that direction. And it was at about half that speed, that this flow, for instance, ripped India, together with what now lies under Tibet(!) away from southern Africa, pulling it northwards far up into and under central Asia in just about 20 million years, simultaneously pulling northern Africa under Europe all the way to the Carpathian, Caucasian and Persian mountain ranges and slightly tilting it towards the North Sea; by that way opening the new Red Sea, Mediterranean Sea and adjacent basins.

Furthermore, using what I had about the relative movements of the continents, and combined with what I had deduced to be the speed of the assumed three-dimensional and sinusoidal west-eastern magma flow, I calculated that, when what was once to become northern Italy / southern Germany started to emerge on and beyond what was then the northeastern coast of a much larger than today proto-African continent, all of this was located somewhere near where Australia lies now - this having credibility in the findings of Archaeopteryx's surrounding environment - and that Africa, that seemingly stable continent, had traveled almost once around the world within that time frame - without suffering subduction itself, but being torn apart to the east instead.
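The arithmetic behind "almost once around the world" can be checked roughly. The sketch below uses the 20 cm per year figure from the estimate above and, as an assumption, the 150 million years mentioned earlier for the time since the breakup of Pangea; whether that is exactly the time frame meant here is not stated. The equatorial circumference is the usual approximate value.

```python
EQUATOR_KM = 40_075.0      # Earth's equatorial circumference, approximate
drift_m_per_year = 0.20    # the essay's estimate: 20 cm per year, eastward along the Equator
years = 150e6              # assumed time frame: the essay's 150 million years since Pangea

distance_km = drift_m_per_year * years / 1000.0          # total eastward drift
fraction_of_circuit = distance_km / EQUATOR_KM           # share of one trip around the Equator
years_per_circuit = EQUATOR_KM * 1000.0 / drift_m_per_year

print(round(distance_km), round(fraction_of_circuit, 2), round(years_per_circuit / 1e6))
# 30000 0.75 200
```

Under these assumptions the drift covers about three quarters of the Equator in 150 million years, and a full circuit would take roughly 200 million years.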

And as I was to find out about ten years later, some scientists, who were doing research with seismic measurements at around the same time I was doing my superficial estimates, had found that the Earth's core is indeed superrotating - at about 1 extra revolution every millennium.

Which was very nice.

However, to this day, and as far as I know, no-one has made the connection to the continental drift.

Which is too bad.

Or has someone? I'm just too lazy to look it up by now...

I did, at that time, however, have some very friendly correspondence with scientists whose work I had looked up on the subject - on pulped wood, way back then - such as that of the geologists Neev & Hall, who had found, in their field research on the continents they had examined, such as, I seem to remember, Africa and the Middle East, that these seemed to have been deformed massively over time, specifically including those continents without subduction zones; and not only that, these deformations came in the form of:

Spirals.

So imagine, if you will, two or perhaps three or more tornado vortexes of incredibly slow-moving magma swirling across this planet, powered and driven along by a superrotating Earth's core, all of which drive and drag as clouds before and behind them the continental shelves; and thus changing their shapes all the time.

While generally drifting eastward.



Discuss

Factored Cognition with Reflection

April 11, 2019 - 22:02
Published on April 11, 2019 7:02 PM UTC

Some weeks ago I completed an HCH-like program/Q&A system that supports reflection, including time travel: https://github.com/rmoehn/jursey I'm just posting it here in case someone finds it useful.



Discuss

Excerpts from a larger discussion about simulacra

April 11, 2019 - 00:27

Best reasons for pessimism about impact of impact measures?

April 10, 2019 - 20:22
Published on April 10, 2019 5:22 PM UTC

Habryka recently wrote (emphasis mine):

My inside views on AI Alignment make me think that work on impact measures is very unlikely to result in much concrete progress on what I perceive to be core AI Alignment problems, and I have talked to a variety of other researchers in the field who share that assessment. I think it’s important that this grant not be viewed as an endorsement of the concrete research direction that Alex is pursuing, but only as an endorsement of the higher-level process that he has been using while doing that research.

As such, I think it was a necessary component of this grant that I have talked to other people in AI Alignment whose judgment I trust, who do seem excited about Alex’s work on impact measures. I think I would not have recommended this grant, or at least this large of a grant amount, without their endorsement. I think in that case I would have been worried about a risk of diverting attention from what I think are more promising approaches to AI Alignment, and a potential dilution of the field by introducing a set of (to me) somewhat dubious philosophical assumptions.

I'm interested in learning about the intuitions, experience, and facts which inform this pessimism. As such, I'm not interested in making any arguments to the contrary in this post; any pushback I provide in the comments will be with clarification in mind.

There are two reasons you could believe that "work on impact measures is very unlikely to result in much concrete progress on… core AI Alignment problems". First, you might think that the impact measurement problem is intractable, so work is unlikely to make progress. Second, you might think that even a full solution wouldn't be very useful.

Over the course of 5 minutes by the clock, here are the reasons I generated for pessimism (each of which I either presently agree with or at least find reasonable for an intelligent critic to raise on the basis of currently-public reasoning):

  • Declarative knowledge of a solution to impact measurement probably wouldn't help us do value alignment, figure out embedded agency, etc.
  • We want to figure out how to transition to a high-value stable future, and it just isn't clear how impact measures help with that.
  • Competitive and social pressures incentivize people to cut corners on safety measures, especially those which add overhead.
    • Computational overhead.
    • Implementation time.
    • Training time, assuming they start with low aggressiveness and dial it up slowly.
  • Depending on how "clean" of an impact measure you think we can get, maybe it's way harder to get low-impact agents to do useful things.
    • Maybe we can get a clean one, but only for powerful agents.
    • Maybe the impact measure misses impactful actions if you can't predict at near human level.
  • In a world where we know how to build powerful AI but not how to align it (which is actually probably the scenario in which impact measures do the most work), we play a very unfavorable game while we use low-impact agents to somehow transition to a stable, good future: the first person to set the aggressiveness too high, or to discard the impact measure entirely, ends the game.
  • In a More realistic tales of doom-esque scenario, it isn't clear how impact helps prevent "gradually drifting off the rails".1

1 Paul raised concerns along these lines:

We'd like to build AI systems that help us resolve the tricky situation that we're in. That help design and enforce agreements to avoid technological risks, build better-aligned AI, negotiate with other actors, predict and manage the impacts of AI, improve our institutions and policy, etc.

I think the default "terrible" scenario is one where increasingly powerful AI makes the world change faster and faster, and makes our situation more and more complex, with humans having less and less of a handle on what is going on or how to steer it in a positive direction. Where we must rely on AI to get anywhere at all, and thereby give up the ability to choose where we are going.

That may ultimately culminate with a catastrophic bang, but if it does it's not going to be because we wanted the AI to have a small impact and it had a large impact. It's probably going to be because we have a very limited idea what is going on, but we don't feel like we have the breathing room to step back and chill out (at least not for long) because we don't believe that everyone else is going to give us time.

If I'm trying to build an AI to help us navigate an increasingly complex and rapidly-changing world, what does "low impact" mean? In what sense do the terrible situations involve higher objective impact than the intended behaviors?

(And realistically I doubt we'll fail at alignment with a bang---it's more likely that the world will just drift off the rails over the course of a few months or years. The intuition that we wouldn't let things go off the rails gradually seems like the same kind of wishful thinking that predicts war or slow-rolling environmental disasters should never happen.)

It seems like "low objective impact" is what we need once we are in the unstable situation where we have the technology to build an AI that would quickly and radically transform the world, but we have all decided not to and so are primarily concerned about radically transforming the world by accident. I think that's a coherent situation to think about and plan for, but we shouldn't mistake it for the mainline. (I personally think it is quite unlikely, and it would definitely be unprecedented, though you could still think it's the best hope if you were very pessimistic about what I consider "mainline" alignment.)




"Intelligence is impossible without emotion" — Yann LeCun

April 10, 2019 - 20:17
https://i.ytimg.com/vi/he-AaNp144A/maxresdefault.jpg

Alignment Newsletter One Year Retrospective

April 10, 2019 - 09:58
Published on April 10, 2019 6:58 AM UTC

On April 9, 2018, the first Alignment Newsletter was sent out to me and one test recipient. A year later, it has 889 subscribers and two additional content writers, and is the thing for which I’m best known. In this post I look at the impact of the newsletter and try to figure out what, if anything, should be changed in the future.

(If you don’t know about the newsletter, you can learn about it and/or sign up here.)

Summary

In which I badger you to take the 3-minute survey, and summarize some key points.

Actions I’d like you to take
  • If you have read at least one issue of the newsletter in the last two months, take the 3-minute survey! If you’re going to read this post anyway, I’d prefer you first read the post and then take the survey; but it’s much better to take the survey without reading this post than to not take it at all.
  • Bookmark or otherwise make sure to know about the spreadsheet of papers, which includes everything sent in the newsletter, and a few other papers as well.
  • Now that the newsletter is available in Mandarin (thanks Xiaohu!), I’d be excited to see the newsletter spread to AI researchers in China.
  • Give me feedback in the comments so that I can make the newsletter better! I’ve listed particular topics that I want input on at the end of the post (before the appendix).
Everything else
  • The number of subscribers dwarfs the number of people working in AI safety. I’m not sure who the other subscribers are, or what value they get from the newsletter.
  • The main benefits of the newsletter are: helping technical researchers keep up with the field, helping junior researchers skill up without mentorship, and reputational effects. The first of these is both the most important one, and the most uncertain one.
  • I spent a counterfactual 300-400 hours on the newsletter over the last year.
  • Still, in expectation the newsletter seems well worth the time cost, but due to the high uncertainty on the benefits to researchers, it’s plausible that the newsletter is not worthwhile.
  • There are a bunch of questions I’d like feedback on. Most notably, I want to get a better model of how the newsletter adds value to technical safety researchers.
Newsletter updates

In which I tell you about features of the newsletter that you probably didn’t know about.

Spreadsheet

Many of you probably know me as the guy who summarizes a bunch of papers every week. I claim you should instead think of me as the guy who maintains a giant spreadsheet of alignment-related papers, and incidentally also sends out a changelog of the spreadsheet every week. You could use the spreadsheet by reading the changelog every week, but you could also use it in other ways:

  • Whenever you want to do a literature review, you find the relevant categories in the spreadsheet and use the summaries to decide which of the papers to read in full.
  • When you come across a new, interesting paper, you first Ctrl+F for it in the spreadsheet and read the summary and opinion if they are present, before deciding whether to read the paper in full. I expect most summaries to be more useful for this purpose than reading the abstract; the longer summaries can be more useful than reading the abstract, introduction and conclusion. Perhaps you should do it right now, with (say) “Prosaic AI alignment”, just to intuitively get how trivial it is to do.
  • When you find an interesting idea or concept, search for related words in the spreadsheet to find other writing on the topic. (This is most useful for non-academic ideas -- for academic ones, Google Scholar is the way to go.)

I find myself using the spreadsheet a couple of times a week, often to remind me of what I thought about a paper or post that I had read a long time ago, but also for literature reviews and finding papers that I vaguely remember that are relevant to what I’m currently thinking about. Of course, I know the spreadsheet well, which makes searching easy; the categories make intuitive sense to me; and I read far more than the typical researcher, so I’d expect it to be significantly more useful to me than to other people. (On the other hand, I don’t benefit from discovering new material in the spreadsheet, since I’m usually the one who put it there.)

Translation

Xiaohu Zhu has offered to translate the Alignment Newsletter to Mandarin! His translations can be found here; I also copy them over to the main Alignment Newsletter page. I’d be excited to see more Chinese AI researchers reading the newsletter content.

Newsletter stats

In which I present raw data and questions of uncertainty. This might be useful to understand newsletters broadly, but I won’t be drawing any big conclusions. The main takeaway is that lots of people read the newsletter; in particular, there are more subscribers than researchers in the field. Knowing that, you can skip ahead to “Impact of the newsletter” and things should still make sense.

Growth

As of Friday April 5, according to Mailchimp, there are 889 subscribers to the newsletter. Typically, the open rate is just over 50%, and the click-through rate is 10-15%. My understanding is that this is very high relative to other online mailing lists; but that could be because of online shopping mailing lists, where you are incentivized to send lots of emails at the expense of open and click-through rates. There are probably also readers who read the newsletter on the Alignment Forum, LessWrong, or Twitter.

The newsletter typically gets a steady trickle of 0-25 new subscribers each week, and sometimes gets a large increase. Here are all of the weeks in which there were >25 new subscribers:

AN #1 -> AN #2: 2 -> 141 subscribers (+139), because of the initial announcement.

AN #3 -> AN #4: 148 -> 238 subscribers (+90), probably still because of the initial announcement, though I don’t know why it grew so little between #2 and #3.

AN #14 -> AN #15: 328 -> 405 subscribers (+77), don’t know why (though I think I did know at the time)

AN #16 -> AN #17: 412 -> 524 subscribers (+112), because of Miles Brundage’s tweet on July 23 about his favorite newsletters.

AN #17 -> AN #18: 524 -> 553 subscribers (+29), because of this SSC post on July 30 and the LessWrong curation of AN #13 on Aug 1.

AN #18 -> AN #19: 553 -> 590 subscribers (+37), because of residual effects from the past two weeks.

AN #30 -> AN #31: 653 -> 689 subscribers (+36), because of Rosie Campbell’s blog post on Oct 29 about her favorite newsletters.
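The weekly gains listed above can be recomputed from the before/after subscriber counts. The following is only a small sketch: the counts are copied from the list, while the names `growth_weeks`, `weekly_gains`, and `THRESHOLD` are made up for illustration and are not part of any Mailchimp export.

```python
# Subscriber counts before and after each issue, copied from the list above.
growth_weeks = [
    ("AN #1 -> AN #2", 2, 141),
    ("AN #3 -> AN #4", 148, 238),
    ("AN #14 -> AN #15", 328, 405),
    ("AN #16 -> AN #17", 412, 524),
    ("AN #17 -> AN #18", 524, 553),
    ("AN #18 -> AN #19", 553, 590),
    ("AN #30 -> AN #31", 653, 689),
]

THRESHOLD = 25  # a week counts as a "large increase" above this net gain


def weekly_gains(weeks):
    """Return (label, gain) pairs, where gain = after - before."""
    return [(label, after - before) for label, before, after in weeks]


gains = weekly_gains(growth_weeks)
large = [(label, gain) for label, gain in gains if gain > THRESHOLD]

for label, gain in large:
    print(f"{label}: +{gain}")

# Net gain during these seven weeks, and its share of the 889 current subscribers.
spike_total = sum(gain for _, gain in gains)
print(f"total: +{spike_total} ({spike_total / 889:.0%} of 889)")
```

These are net changes (new subscribers minus unsubscribes), so the share printed at the end is a rough indication of how much of the list arrived in spike weeks, not an exact count of who joined when.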

Over time, the opens and clicks have gone down as a percentage of subscribers, but have gone up in absolute numbers. I would guess that the biggest effect is that the most interested people subscribed early, and so as time goes on the marginal subscriber is less interested and ends up bringing down the percentages. Another effect would be that over time people get less interested in the newsletter, and stop opening/clicking on it, but don’t unsubscribe. However, over the last few months, rates have been fairly stable, which suggests this effect is negligible.

On the other hand, during the last few months growth has been organic / word-of-mouth rather than through “publicity” like Miles’s tweet and Rosie’s blog post, so it’s possible that organic growth leads to more interested subscribers who bring up the rates, and this effect approximately cancels the decrease in rates from people getting bored of the newsletter. I could test this with more fine-grained data about individual subscribers but I don’t care enough.

So far, I have not been trying to publicize the newsletter beyond the initial announcement. I'm still not sure of the value of a marginal reader obtained via “publicity”. The newsletter seems to me to be both technical and insider-y (i.e. it assumes familiarity with basic AI safety arguments), while the marginal reader from “publicity” seems not very likely to be either. That said, I have heard from a few readers that the newsletter is reasonably easy to follow, so maybe I'm putting too much weight on this concern. I’d love to hear thoughts in the comments.

Composition of subscribers

I don’t know who these 889 subscribers are; it’s much larger than the size of the field of AI safety. Even if most of the technical safety researchers and strategy/policy researchers have subscribed, that would only get us to 100-200 subscribers. Some guesses on who the remaining people are:

  • There are lots of people who are intellectually interested in AI safety but don’t work on it full time; maybe a lot of them have subscribed.
  • A lot of technical researchers are interested in AI ethics, fairness, bias, explanations and so on. I occasionally cover these topics. In addition, if you’re interested in short-term effects of AI, you might be more likely to be interested in the long-term effects as well. (Mostly I’m putting this down because I’ve met a few people in this category who expressed interest in the newsletter.)
  • Non-technical researchers interested in the effects of AI might plausibly find it useful to read the newsletter to get a sense of what AI is capable of and how technical researchers are thinking about safety.

Regardless of the answer, I’m surprised that these people find the newsletter valuable. Most of the time I’m writing to technical safety researchers, and relying on an assumption of shared jargon and underlying intuitions that I don’t explain. It’s not as bad as it could be, since I try to make my explanations accessible both to people working in traditional AI and to people at MIRI, but I would have guessed that it was still not easy to understand from the outside. Some hypotheses, only the first of which seems plausible:

  • I’m wrong about how difficult it is to understand the newsletter. Perhaps people can understand everything, or maybe they can still get a useful gist from summaries even if they don’t understand everything.
  • People use it only as a source of interesting papers, and ignore the summaries and opinions (because they are hard to understand).
  • Reading the summaries and opinions gives the illusion of understanding even though people don’t actually understand what I’m saying.
  • People like to feel like a part of an elite group who can understand the technical jargon, and reading the newsletter gives them that feeling. (This would not be a conscious decision on their part.)

I sampled 25 people uniformly at random from the subscribers. I have met 8 of these, and have heard of 2 more. I would sort the 25 people into the following rough categories: x-risk community (4), AI researchers sympathetic to x-risk (2), students (3), people interested in AI and x-risk (3), people involved with AI startups (2), researchers with no publicly obvious interest in x-risk (6), and could not be found easily (5). But really the most salient outcome was that for anyone I didn’t already know, I found it very hard to figure out why they were subscribed to the newsletter.
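As a quick check on the sample above, the category counts can be tallied and turned into shares. This is a minimal sketch: the counts come from the paragraph, and the variable names are illustrative. With only 25 people, the shares are very rough and should not be read as precise estimates for all 889 subscribers.

```python
# Category counts from the random sample of 25 subscribers described above.
sample = {
    "x-risk community": 4,
    "AI researchers sympathetic to x-risk": 2,
    "students": 3,
    "people interested in AI and x-risk": 3,
    "people involved with AI startups": 2,
    "researchers with no publicly obvious interest in x-risk": 6,
    "could not be found easily": 5,
}

total = sum(sample.values())  # should match the sample size of 25
shares = {label: count / total for label, count in sample.items()}

# Print categories from largest to smallest share of the sample.
for label, count in sorted(sample.items(), key=lambda item: -item[1]):
    print(f"{label}: {count}/{total} = {shares[label]:.0%}")
```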

Impact of the newsletter

In which I try and fail to figure out whether the benefits outweigh the costs.

Benefits

Here are the main sources of value from the newsletter that I see:

  • Causing technical researchers to know more about other areas of the field besides their own subfield.
  • Field building, by giving new entrants into AI safety a way to build up their knowledge without requiring mentorship.
  • Improving the reputation of the field of AI safety (especially among the wider AI research community), by demonstrating a level of discourse above the norm, particularly in conjunction with good writing about current AI topics. There's a mixture of reasoning about current AI and speculative future predictions that clearly demonstrates that I'm not some random outsider critiquing AI researchers.
  • Creating a strong reputation for myself and CHAI, such that people will have justified reason to listen to CHAI and/or me in the future.
  • Providing some sort of value to the subscribers who are not in long-term AI safety or AI strategy/policy.

When I started the newsletter, I was aiming primarily for the first one, by telling researchers what they should be reading. I continue to optimize mainly for that, though now I often try to provide enough information that researchers don’t have to read the original paper/post. I knew about the second source of value, but didn’t think it would be very large; I’m now more uncertain about how important it is. The reputational effects were more unexpected, since I didn’t think the newsletter would become as large as it currently is. I don’t know much about the last source of value and am basically ignoring it (i.e. pretending it is zero) in the rest of the analysis.

I’m actually quite uncertain about how much value comes from each of these subpoints, mainly because there’s a striking lack of comments or feedback on the newsletter. Excluding one person at CHAI who I talk to frequently, I get a comment on the content of the newsletter maybe once every 3-4 weeks. I can understand that people who get it as an email newsletter may not see an obvious way to comment (replying to a newsletter email is an unusual thing to do), but the newsletter is crossposted to LessWrong, the Alignment Forum, and Twitter. Why aren’t there comments there?

One possibility is that people treat the newsletter as a curation of interesting papers and posts, in which case there isn’t much need to comment. However, I’m fairly confident that many readers also find value in the summaries and opinions. You could instead interpret this as evidence that the things I’m saying are reasonable -- after all, if I was wrong on the Internet, surely someone would let me know. On the other hand, if I’m only saying things that people already believe, am I actually accomplishing anything? It’s hard to say.

I think the most likely story is that I say things that people didn’t know but agree with once I say them -- but I share Raemon’s intuition that people aren’t really learning much if that’s the case. (The rest of that post has many more thoughts on comments that apply to the newsletter.)

Overall it still feels like in expectation most of the value comes from widening the set of fields that any individual technical researcher is following, but it seems entirely possible that the newsletter does not do that at all and as a result only has reputational benefits. (I am fairly confident that the reputational benefits are positive and non-zero.) I’d really like to get more clarity on this, so if you read the newsletter, please take the survey!

Costs

The main cost of the newsletter is the opportunity cost of our time. Each newsletter takes about 15 hours of my time. The newsletter has gotten more detailed over time, but this isn’t reflected in the total hours I put in because it has been approximately offset by new content writers (Richard Ngo and Dan Hendrycks) who took some of the burden of summarizing off of me. Currently I’d estimate that the newsletter takes 20 hours in total (so 5 hours from Richard and Dan), with high uncertainty. This can be broken down into time I would have spent reading and summarizing papers anyway, and time that I spent only because the newsletter exists, which we could call “extra hours”. Initially, I wanted to read and summarize a lot of papers for my own benefit, so the newsletter took about 4-5 extra hours per week. Now, I’m less inclined to read a ton of papers, and it takes 8-10 extra hours per week.

This means in aggregate I’ve spent 700-800 hours on the newsletter, of which about 300-400 were hours that I wouldn’t have spent otherwise. Even only counting the 300-400 hours, this is comparable to the time I spent on state of the world and learning biases projects together, including all of the time spent on paper writing, blog posts, and talks in addition to the research itself.
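As a rough sanity check on these totals, the arithmetic can be sketched as below. The figure of about 50 weekly issues is an assumption (a year of roughly weekly newsletters), not a number given in the post; the per-issue hours are the estimates quoted above.

```python
# Assumption (not stated in the post): roughly 50 weekly issues in the first year.
ISSUES = 50

HOURS_PER_ISSUE = 15     # "about 15 hours of my time" per newsletter
EXTRA_EARLY = (4, 5)     # extra hours per week at the start
EXTRA_NOW = (8, 10)      # extra hours per week more recently

# Total time: should land in the quoted 700-800 hour range.
total_hours = ISSUES * HOURS_PER_ISSUE

# Extra time: bounded below by the early rate and above by the current rate,
# so the quoted 300-400 hours should fall between these two bounds.
extra_lower_bound = ISSUES * EXTRA_EARLY[0]
extra_upper_bound = ISSUES * EXTRA_NOW[1]

print(total_hours, extra_lower_bound, extra_upper_bound)
```

Under that assumption the total comes out inside the 700-800 range, and the 300-400 "extra hours" estimate sits between the two bounds, which is what you would expect if the extra time per week rose gradually over the year.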

In addition to time costs, the newsletter could do harm. While there are many ways this could happen, the only one that feels sufficiently important to consider is the risk of causing information cascades. Since nearly everyone in the field is reading the newsletter, we may all end up with some belief B just because it was in a newsletter. We might then have way too much confidence in B since everyone else also believes B.

Overall I’m not too worried. There’s so much content in the newsletter that I seriously doubt a single idea could spread widely as a result of the newsletter -- inevitably some people won’t remember that particular idea. So we only need to worry about “big” ideas that are repeated often in the newsletter. The most salient example of that would be my general opposition to the Bostrom/Yudkowsky paradigm of AI safety, but it still seems quite prevalent amongst researchers. In addition I’d be really surprised if existing researchers were convinced of a “big” idea or paradigm solely because other researchers believed it (though they might put undue weight on it).

Is the newsletter worth it?

If the only benefit of the newsletter were the reputational effects, it would not be worth my time (even ignoring Richard and Dan’s time). However, I get enough thanks from people in the field that the newsletter must be providing value to them, even though I don’t have a great model of what the value is. My current best guess is that there is a lot of value, which makes the newsletter worth the cost, but I think there is a non-negligible chance that this would be reversed if I had a good model of what value everyone was getting from it.

Going forward

In which I figure out what about the newsletter should change in the future.

Structure of the newsletter

So far I’ve only talked about whether the newsletter is worthwhile as a whole. But of course we can also analyze individual aspects of the newsletter and figure out how important they are.

Opinions are probably the key feature of the newsletter. Many papers and blog posts are aimed more at appearing impressive than at conveying facts. Even the ones that are truth seeking are subject to publication bias: they are written by people who think that the ideas within are important, and so will be biased towards positivity. As a result, an opinion from a researcher who didn’t do the work can help contextualize the results in a way that makes it easier for less involved readers to figure out the importance of the ideas. (As a corollary, I worry about the lack of a fresh perspective on posts that I write, but don’t see an obvious easy solution to that problem.) I think this also contributes to the success of Import AI and ChinAI, which are also quite heavy on opinions.

I think the summaries are also quite important. I aim for the longer summaries to be sufficiently informative that you don’t have to read the blog post / paper unless you want to do a deep dive and really understand the results. For papers, I often roughly aim for it to be more useful to read my summary than to read the abstract, intro, and conclusion of the paper. In the world where the newsletter didn’t have summaries, I think researchers would not keep up as much with the state of the field.

Overall, I think I’m pretty happy with the current structure of the newsletter, and don’t currently intend to change it. But if I get more clarity on what value the newsletter provides to researchers, I wouldn’t be surprised if I would change the structure as a result.

Scaling up

In the year that I’ve been writing the newsletter, the amount of writing that I want to cover has gone up quite a lot, especially with the launch of the Alignment Forum. I expect this will continue, and I won’t be able to keep up.

By default, I would cover less and less of it. However, it would be nice for the spreadsheet to be a somewhat comprehensive database of the AI safety literature. This is not what we currently have: I often don’t cover good Agent Foundations work because it’s hard for me to understand, and I don’t have pre-2018 content. Still, it is pretty good for the subfields of AI safety that I’m most knowledgeable about.

There has been some outsourcing of work as Richard Ngo and Dan Hendrycks have joined, but it still does not seem sustainable to continue this long-term, due to coordination challenges and challenges with maintaining quality. That said, it’s not impossible that this could work:

  • Perhaps I could pay people to do this summarization, with the hope that this would help me find people who could put in more time. This would allow more work to get done while keeping the team small (which keeps coordination costs and quality maintenance costs small).
  • I could create a system that allows random people to easily contribute summaries of papers and posts they have read, while writing the opinions myself. It may be easier to vet and fix summaries than to write them myself.
  • I could invest in developing good guides for new summarizers, in order to decrease the cost of onboarding and ongoing coordination.

That said, in all of these cases, it feels better to instead just summarize a smaller fraction of all the work, especially since the newsletter is already long enough that people probably don’t read all of it, while still adding links to papers that I haven’t read to the spreadsheet. The main value of summarizing everything is having a more comprehensive spreadsheet, but I don’t think this is sufficiently valuable to warrant the approaches above. However, I could imagine this conclusion being overturned by having a better model of how the newsletter adds value for technical safety researchers.

Sourcing

So far, I have found papers and articles from newsletters, blogs, Arxiv Sanity and Twitter. However, Twitter has become worse over time, possibly because it has learned to show me non-academic stuff that is more attention-grabbing or controversial, despite me trying not to click on those sorts of things. Arxiv Sanity was my main source for academic work, but recently it’s been getting worse, and is basically not working any more, and I’m not sure why. So I’m now trying to figure out a new way to find relevant literature -- does anyone have suggestions?

If I continue to have trouble, I might summarize random academic papers I’m interested in instead of the ones that have come out very recently.

Appearance

It’s rather annoying that the newsletter is a giant wall of text; it’s probably not fun to read as a result. In addition to the categories, which were partly meant to give structure to the wall of text, I’ve been trying to break things into more paragraphs, but really it needs something much more drastic. However, I also don’t want it to be even more work to get a newsletter out.

So, if anyone wants to volunteer to make the newsletter visually nicer that would be appreciated, but it shouldn’t cost me too much more time (maybe half an hour a week, if it was significantly nicer). One easy possibility would be to include an image at the beginning of the newsletter -- any suggestions for what should go there?

Future of the newsletter

Given the uncertainty about the value of the newsletter, it’s not inconceivable that I decide to stop writing it in the future, or scale back significantly. That said, I think there is value in stability. It is generally bad for a project to have “fits and starts” where its quality varies with the motivation of the person running it, or for the project to potentially be cancelled solely based on how valuable the creator thinks it is. (I’m aware I haven’t argued for this; feel free to ask me about it if it seems wrong.)

Due to this and related reasons, when I started the newsletter, I had an internal commitment to continue writing it for at least six months, as long as most other people thought it was still valuable. Obviously, if everyone agreed that the newsletter was not useful or actively harmful, then I’d stop writing it: this is more to deal with the case where I no longer think the newsletter is useful, even though other people think it is useful.

Now I’m treating it as an ongoing three-month commitment: that is, I am always committing to continue writing the newsletter for at least three months as long as most other people think it is valuable. At any point I can decide to stop the ongoing commitment (presumably when I think it is no longer worth my time to write it); there would then be three months where I would continue to write the newsletter for stability, and figure out what would happen with the newsletter after the three months.

Feedback I’d like

I have a bunch of questions that I’d love to get opinions on, either anonymously in the 3-minute survey (which you should fill out!) or in the comments. (Comments preferred because then other people can build off of them.) I’ve listed the questions roughly in order of importance:

  • What is the value of the newsletter for you?
  • What is the value of the newsletter for other people?
  • How should I deal with the growing amount of AI safety research?
  • What can I do to get more feedback on the newsletter on an ongoing basis (rather than having to survey people at fixed times)?
  • Am I underestimating the risk of causing information cascades? Regardless, how can I mitigate this risk?
  • How can I make the newsletter more visually appealing / less of a wall of text, without expending too much weekly effort?
  • Should I publicize the newsletter on Twitter? How valuable is the marginal reader?
  • Should I publicize the newsletter to AI researchers? How valuable is the marginal reader?
  • How can I find good papers out of academia now that Arxiv Sanity isn’t working as well as it used to?

Appendix: Alignment Newsletter FAQ

All of this is in the appendix because I don’t particularly care whether people read it. It’s not very relevant to any of the content in the main post. It is relevant to anyone who might want to start their own newsletter, or their own project more generally.

What’s the history of the Alignment Newsletter?

During one of the CHAI seminars, someone suggested that we each take turns finding and collecting new research papers and sending them out to each other. I already had a system in place doing exactly this, so I volunteered to do this myself (rather than taking turns). I also figured that to save even more CHAI-researcher-time, it would make sense to give a quick summary and then tell people under what circumstances they should read the paper. (I was already summarizing papers for my own notes.)

This pretty quickly proved to be valuable, and I thought about making it public for even more time savings. However, it still seemed pretty nascent and in flux, so I continued iterating on it within CHAI, while thinking about how it could be made to be public-facing. (See also the “Things done right” section.) After a little under two months of writing the newsletter within CHAI, I made it public. At that time, the goal was to provide a list of relevant readings for technical AI safety researchers that had been published each week; and help them decide whether or not they should read them.

Over time, my summaries and opinions became longer and more detailed. I don’t know exactly why this happened. Regardless, at some point I started aiming for some of my summaries to be detailed enough that researchers could just read the summary and not read the paper/post itself.

In September, Richard Ngo volunteered to contribute summaries to the newsletter on a variety of topics, and Dan Hendrycks joined soon after focusing on robustness and uncertainty.

Why do you never have strong negative opinions?

One of the design decisions made at the beginning of the newsletter was to avoid strong critiques of any particular piece of research. This was for a few reasons:

  • As a general rule, any criticism I have of a paper is often too strong or based on a misunderstanding. If I have a negative impression of a paper or research agenda, I would predict that with ~90% probability after I talk to the author(s) my opinion of the work will have improved. I don’t think this is particular to me -- this should be expected of any summarizer since the authors have much more intuition about why their particular approach will be useful, beyond what is written in the blog post or paper.
  • The newsletter probably shapes the views of a significant fraction of people thinking about AI safety, and so leads to a risk of information cascades. Mitigating this means giving space to views that I disagree with, summarizing them as best I can, and not attacking what will inevitably be a strawman of their view.
  • Regardless of the accuracy of the criticism, I would like to avoid alienating people.

Of course, this decision has downsides as well:

  • Since I’m not accurately saying everything I believe, it becomes more likely that I accidentally say false things, convey wrong impressions, or otherwise make it harder to get to the truth.
  • Disagreements are one of the main ways in which intellectual progress is made. They help identify points of confusion, and allow people to merge their models in order to get something (hopefully) better.

While the first downside seems like a real cost, the second downside is about inhibiting intellectual progress in AI safety research. I think this is okay: intellectual progress does not need to happen in the newsletter. In most of these cases I express stronger disagreements in channels more conducive to intellectual progress (e.g. the Alignment Forum, emails/messages, talking in person, the version of the newsletter internal to CHAI).

Another probable effect of avoiding negativity is reduced readership, since it is likely much more interesting to read a newsletter with active disagreements and arguments than one that dryly summarizes a research paper. I don’t yet know whether this is a pro or a con (even ignoring other effects of negativity).

Mistakes

I don’t know of very many mistakes, even in hindsight. I think this is primarily because I don’t get feedback on the newsletter, not because everything has gone perfectly. It seems quite likely that there are still things that are mistakes, but I don’t know about them yet because I don’t have the data to tell.

Analyzing other newsletters. The one thing that I wish I had done was to analyze other newsletters like Import AI in more detail before starting this one. I think it’s plausible that I could have realized the value of opinions and more detailed summaries right at the beginning, rather than evolving in that direction over a couple of months.

Delays. I did fall over a week behind on the newsletter over the last month or two. While this is bad, I wouldn’t really call it a Mistake: I don’t think of the newsletter as a weekly commitment or obligation. I very much value the flexibility to allocate time to whatever seems most pressing; if the newsletter was more of a commitment (such that falling behind is a Mistake), I think I would have to be much more careful about what I agree to do, and this would prevent me from doing other important things. Instead, my approach is to have the newsletter as a fairly important goal that I try to schedule enough time for, but if I find myself running out of time and have to cut something, it’s not a tragedy if it means the newsletter is delayed. That’s essentially what happened over the last month or two.

Things done right

I spent a decent amount of time thinking about the design of the newsletter before implementing it, and I think this was in hindsight a very good idea. Here I list a few things that worked out well.

A polished product. I was particularly conscious of the fact that at launch the newsletter would be using up the limited common resource of “people’s willingness to try out new things”. Both in order to make sure people stuck with the project, and in order to not use up the common resource unnecessarily, I wanted to be fairly confident that this would be a good product before launching. As a result, I iterated for a little under two months within CHAI, in order to figure out product-market fit. You can see the evolution over time -- this is the first internal newsletter, whereas this is the first public newsletter. (They’re all available here.)

  • By the fourth internal newsletter, I realized that I couldn’t actually summarize all the links I found, so I switched to a version where some links would be sent without summaries.
  • Categorization seemed important, so I did more of it.

This is not to say that the newsletter has been static since launch; it has changed significantly. Most notably, while originally I was aiming to give people enough information to decide whether or not to read the paper/post, I now sometimes aim for including enough detail that people don’t need to read the paper/post. But the point is that a lot of the early improvements happened within CHAI without consuming the common resource.

I’m not sure to what extent this is different from standard startup advice of iterating quickly and testing product-market fit: it depends on whether it counts as testing for product-market fit to trial the newsletter within CHAI. To the extent that there is a difference, it’s mainly that I’m arguing for more planning, especially before consuming common resources (whereas with startups, the fierce competition means that you do not worry about consuming common resources).

Considered stability and commitment. As I mentioned above, I had an internal commitment to continue writing the newsletter for at least six months, as long as other people thought it was valuable. In addition to the value of stability, I viewed this as part of cooperatively using the common resource of people’s willingness to try things. If you’re going to use the resource and fail, ideally you would have learned that it is actually infeasible to succeed in that domain, as opposed to failing because of e.g. lack of motivation on the author’s part.

Here’s another way to see this. I think it would have been a lot harder for the newsletter to be successful if there had been 2-5 attempts to create a newsletter in the past that had then fizzled out, because people would expect newsletters to fail and wouldn’t subscribe. My initial commitment helps prevent me from being one of those failures for “bad” reasons (e.g. me losing motivation) while still allowing me to fail for “good” reasons (e.g. no one actually wants to read a newsletter about AI alignment).

I can’t point to any actually good outcomes that resulted from this policy; nonetheless I think it was a good thing to have done.

Investing in flexible automated systems. I had created the private version of the spreadsheet before the first public newsletter, in order to have a database of readings for myself (replacing my previous Google Doc database), and I wrote a script to generate the email from this database. While lots of ink has been spilled on the value of automation, it doesn’t usually emphasize flexibility. By not using a technology meant for one specific purpose, I was able to do a few things that I wouldn’t expect to be able to do with a more specialized version:

  • Create consistency checks. For example, throwing an error when there’s an opinion but no summary, or when the name of the summarizer is not “Richard”, “Dan H” or “” (indicating me).
  • Create private and public versions of the newsletter. (Any strong critiques go into the private version, which is internal to CHAI, and are removed from the public version.)
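To make the consistency checks concrete, here is a minimal sketch of what they could look like. This is not the actual script: the `Entry` record, its field names, and `check_entry` are hypothetical stand-ins for a spreadsheet row, and only the two rules themselves (an opinion requires a summary; the summarizer must be “Richard”, “Dan H” or “”) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical shape of one spreadsheet row; the real columns are not given in the post."""
    title: str
    summary: str = ""
    opinion: str = ""
    summarizer: str = ""  # "" indicates the newsletter's author, per the post


# The three summarizer values named in the post.
ALLOWED_SUMMARIZERS = {"Richard", "Dan H", ""}


def check_entry(entry: Entry) -> None:
    """Raise ValueError if the entry breaks either consistency rule."""
    if entry.opinion and not entry.summary:
        raise ValueError(f"{entry.title!r}: opinion without a summary")
    if entry.summarizer not in ALLOWED_SUMMARIZERS:
        raise ValueError(f"{entry.title!r}: unknown summarizer {entry.summarizer!r}")


# Usage: run every row through the check before generating the email.
for row in [Entry("Some paper", summary="...", opinion="...", summarizer="Richard"),
            Entry("Link without a summary")]:
    check_entry(row)
```

Failing loudly before the email is generated means a malformed row is caught at the point where it is cheapest to fix.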

But really, the key value of flexibility is that it allows you to adapt to circumstances that you had never even considered when creating the system:

  • When Richard Ngo joined, I added a “Summarizer” column to the sheet, changed a few lines of code, and was done. (Note how I needed flexibility over both the data format and the analysis code.)
  • I’ve found myself linking to a bunch of previous newsletter entries and having to copy a lot of links. Recently I added a new tag that I can use in summaries and opinions that automatically extracts and links the entry I’m referring to. (I’m a bit embarrassed at how long it took me to realize that this was a thing I could do; I could have saved a lot more tedious work if I had realized it was a possibility the first time I got annoyed at this process.)
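The auto-linking tag in the last bullet could be sketched roughly as follows. The post does not say what the real tag looks like, so the `[[entry: Title]]` syntax, the `link_entries` function, and the title-to-URL dictionary are all assumptions made for illustration.

```python
import re

# Hypothetical tag syntax: [[entry: <title>]]. The real tag format is not described in the post.
TAG = re.compile(r"\[\[entry:\s*(.+?)\s*\]\]")


def link_entries(text: str, urls: dict) -> str:
    """Replace each tag with 'Title (URL)', looking the title up in the database."""
    def replace(match):
        title = match.group(1)
        if title not in urls:
            # Fail loudly rather than emit a dangling reference.
            raise KeyError(f"no entry titled {title!r}")
        return f"{title} ({urls[title]})"
    return TAG.sub(replace, text)


# Usage with a made-up entry and a placeholder URL.
database = {"Some earlier paper": "https://example.com/some-earlier-paper"}
print(link_entries("This builds on [[entry: Some earlier paper]].", database))
```

The lookup is keyed on the entry title here purely for simplicity; any stable identifier in the spreadsheet would work the same way.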

Thought about potential negative effects. I’m pretty sure I thought of most of the points about negativity (listed above) before publicizing the newsletter. This is discussed a lot; I don’t think I have anything significant to add.

This section seems to indicate that I thought of things initially and they were all important -- this is almost certainly not the case. I’m sure I’m rationalizing some of these with hindsight and didn’t actually think of all the benefits then, and I also probably thought of other considerations that didn’t end up being important that I’ve now forgotten.


