
Old-world Politics Fallacy

LessWrong.com News - June 23, 2020 - 15:32
Published on June 23, 2020 12:32 PM GMT

When a nasty political problem (like the current SSC situation) hits my consciousness, I'm habituated to Do Something About It. I feel an urge to investigate the political climate, find allies, and fight back against the threat. In the current Internet age, and with my nonexistent political clout and social influence (especially considering I live in Iran, and people here generally can't be counted on to know or care about global politics), the substitution bias kicks in. I substitute doing The Real Thing (to which my contribution would be very meager, if any) with reading online forums (Reddit, Twitter, LessWrong, SSC, Hacker News, ...) (substituted for "investigate the political climate"), upvoting the Correct posts in said forums and possibly writing some low-effort answers to particularly egregious pieces ("find allies, and fight back"), and generally feeling bad that I have failed the Mission (and that Society is broken).

I speculate that this fallacy probably has some evolutionary roots. In a hunter-gatherer tribe, a person such as myself (I estimate myself to be of upper-mid status) would have had a fair chance of effecting political change against causes that mostly hurt everyone except a minority of politicians who don't produce much value anyway. Especially since I (and my family) have been quite morally upstanding and honest, in a small community we would have had a reputation to draw on. Even if not, engaging with the community would have been the essential first step in joining a coalition, which was necessary for survival.

Obviously, in the 21st century, all this is moot for political stuff that actually matters. Most people are quite powerless to affect those matters, one reason being that the important issues now affect orders of magnitude more people. I don't know what the optimal strategy currently is. My gut feeling is that a lot of the good people are un-politicizing themselves and simply giving up. (Hasn't Scott's default defensive strategy been more (self)censorship?) Individual contributory power being what it is, this might actually be the best heuristic. If political "activism" consists mostly of low-skill, low-reputation noise-making, a game of quantity over quality, then "good" people immediately lose their comparative advantage. In fact, in Iran, the situation seems to be that the only marginally effective activism is violence. Which, in an authoritarian regime, obviously leaves the hungry and the criminal to fend off the evil. The second derivative of their numbers being positive because of abject government failure does not exactly lend me hope.

To summarize: the old-world politics fallacy is the mistaken alief that you have meaningful political power in the 21st century. It often manifests, via the substitution bias, as fervent but ultimately zero-impact digital activity. Naturally, the fallacy leaves a bitter taste when you do notice that your efforts bore no fruit.

A related phenomenon might be our largely unwarranted interest/motivation (the two are subtly different) in participating in social media that engage mostly with strangers. We are wired to find coalitions of like-minded people and join them, because like-minded people's values align better with our own. The modern era, by allowing our freedom and individuality to flourish, has allowed us to be ever more different from one another. Social media empower us to find and communicate with people much more precisely than is possible in the physical world. These two trends cause us to seek coalition-building in cyberspace, when cyberspace usually does not facilitate that function (yet). This will, necessarily, cause us to overvalue our social media interactions. (Because we have the alief that we are actually accumulating social capital when no such thing is happening.)

I don't know what to call this cognitive error. "Pseudo-socializing?"


Personal experience of coffee as nootropic

Published on June 23, 2020 12:00 PM GMT

My life took a turn in 2015. That was when I first started drinking coffee on a regular basis. Not Americanos or Lattes, but 'Kopi' brewed with higher-caffeine Robusta beans using a 'sock' immersed in water close to boiling temperature, extracting massive amounts of caffeine along with strong bitter notes that are neutralized by sweet, sugary condensed milk.

My rate of learning in sports shot up seemingly overnight. My field of vision and attention increased tremendously. I gained more stamina and muscle control. It wasn't just a subjective feeling. The results showed in the speed and accuracy of the pitches I threw. The first training session I had with coffee was so productive, I almost swore that I would never train without it again.

Seeing its benefits, I began to use it for my studies as well. The hit of dopamine I got from starting problem sets with a cup of kopi was nothing like I'd experienced before. While in the past I had to start slow and enter 'flow' gradually, this was a shortcut to the flow state. Sure, learning didn't happen instantly, and I still had to struggle, but I enjoyed the process. I felt unnaturally productive; my absorption of information shot up, and I could go at it for progressively longer hours. This was my NZT pill.

Gradually, the more I was able to accomplish, the greater the aims I set for myself. I couldn't believe it. As a result of drinking coffee, my 'ambition' increased because my self-judgement on learning ability and productivity increased. I had other strong reasons for working hard, but the biochemical changes that occurred in my brain and body after ingesting coffee brought an emotional multiplier to those reasons.

However, this is where complications came in. I slowly began to build up a heavy tolerance to it. Where before it raised my energy and loosened my social inhibitions, I started "zoning out" with friends, craving that focused mode where I could work on my studies. Initially, I attributed the gradual shelling-up to the long hours of study and the mental stress caused by the major exam at the end of the year. Only after abstaining from coffee for a period of time did I think of linking that never-before-experienced social discomfort to coffee.

In hindsight, I realised that I might have been experiencing a mild but non-negligible form of anxiety that came from work-related stress, compounded by the spikes in cortisol and adrenaline my body was producing in reaction to coffee. The initial spike of motivation in getting things done was offset by the dreaded caffeine crash. The sharp contrast of focus and pleasure between pre-coffee and post-coffee states was what got me into daily consumption, but in the long run, consuming coffee at that rate was a net detriment.

This leads to a question I have been trying to solve and would like to pose to the community: if a common beverage like coffee can cause such a marked increase in life satisfaction and work/study output, what other foods or practices can lead to the same changes in subjective state while remaining sustainable on a time scale of decades?


New York Times, Please Do Not Threaten The Safety of Scott Alexander By Revealing His True Name

Published on June 23, 2020 12:20 PM GMT

In reaction to (Now the entirety of SlateStarCodex): NYT Is Threatening My Safety By Revealing My Real Name, So I Am Deleting The Blog

I have sent the following to New York Times technology editor Pui-Wing Tam, whose email is pui-wing.tam@nytimes.com:

My name is Zvi Mowshowitz. I am a friend of Scott Alexander. I grew up with The New York Times as my central source of news and greatly value that tradition.

Your paper has declared that you intend to publish, in The New York Times, the true name of Scott Alexander. Please reconsider this deeply harmful and unnecessary action. If Scott’s name were well-known, it would likely make it more difficult or even impossible for him to make a living as a psychiatrist, which he has devoted many years of his life to being able to do. He has received death threats, and would likely not feel safe enough to continue living with other people he cares about. This may well ruin his life.

At a minimum, and most importantly for the world, it has already taken down his blog. In addition to this massive direct loss, those who know what happened will know that this happened as a direct result of the irresponsible actions of The New York Times. The bulk of the best bloggers and content creators on the internet read Scott’s blog, and this will create large-scale permanent hostility to reporters in general and the Times in particular across the board.

I do not understand what purpose this revelation is intended to serve. What benefit does the public get from this information?

This is not news that is fit to print.

If, as your reporter who has this intention claims, you believe that Scott provides a valuable resource that enhances the quality of our discourse, scientific understanding and lives, please reverse this decision before it is too late.

If you don’t believe this, I still urge you to reconsider your decision in light of its other likely consequences.

We should hope it is not too late to fix this.

I will be publishing this email as an open letter.

Zvi Mowshowitz

PS for internet: If you wish to help, here is Scott’s word on how to help:

There is no comments section for this post. The appropriate comments section is the feedback page of the New York Times. You may also want to email the New York Times technology editor Pui-Wing Tam at pui-wing.tam@nytimes.com, contact her on Twitter at @puiwingtam, or phone the New York Times at 844-NYTNEWS.

(please be polite – I don’t know if Ms. Tam was personally involved in this decision, and whoever is stuck answering feedback forms definitely wasn’t. Remember that you are representing me and the SSC community, and I will be very sad if you are a jerk to anybody. Please just explain the situation and ask them to stop doxxing random bloggers for clicks. If you are some sort of important tech person who the New York Times technology section might want to maintain good relations with, mention that.)

If you are a journalist who is willing to respect my desire for pseudonymity, I’m interested in talking to you about this situation (though I prefer communicating through text, not phone). My email is scott@slatestarcodex.com.


SlateStarCodex deleted because NYT wants to dox Scott

Published on June 23, 2020 7:51 AM GMT

NYT Is Threatening My Safety By Revealing My Real Name, So I Am Deleting The Blog

PS: One suggestion I have is to allow anonymous posts on LessWrong that show the author’s anonymized karma. This is far from a good or complete solution, but I imagine it would at least come in handy in situations like this.


Prediction = Compression [Transcript]

Published on June 22, 2020 11:54 PM GMT

(Talk given on Sunday 21st June, over a zoom call with 40 attendees. Alkjash is responsible for the talk, Ben Pace is responsible for the transcription.)

Ben Pace: Our next speaker is someone you'll all know as Alkjash on LessWrong, who has written an awesome number of posts. Babble and Prune, Hammertime Final Exam – which is one of my favorite names of a curated post on LessWrong. Alkjash, go for it.

Prediction = Compression Talk

Alkjash: I will be talking about a bit of mathematics today. It's funny that this audience is bigger than any I've gotten in an actual maths talk. It's a bit depressing. Kind of makes me question my life choices...

Alkjash: Hopefully this mathematics is new to some of you. I'm sure that the machine learning people know this already. The principle is that prediction is the same thing as compression. And what that means is that whenever you have a prediction algorithm, you can also get a correspondingly good compression algorithm for data you already have, and vice versa.

Alkjash: Let me dive straight into some examples. Take this sentence and try to fill in the vowels.

Ppl bcm wh thy r mnt t b, Mss Grngr, by dng wht s rght.


People become who they are meant to be, Miss Granger, by doing what is right.

Alkjash: You'll notice that it's actually quite easy, hopefully, to read the sentence, even though I've taken out all the vowels. The human brain is very good at reconstructing certain kinds of missing data, and in some sense, that's a prediction algorithm, but immediately it also gives us a compression algorithm. If I take this sentence and I remove all the vowels, it's transmitting the same amount of data to a human brain because they can predict what's missing. 
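[Editor's sketch: the vowel example can be reproduced in a few lines of Python. The "decompressor" here is the reader's own language model, not code; the function name `devowel` is mine, not from the talk.]

```python
VOWELS = set("aeiouAEIOU")

def devowel(text: str) -> str:
    """Drop the vowels; a human reader's predictive model supplies them back."""
    return "".join(ch for ch in text if ch not in VOWELS)

original = "People become who they are meant to be, Miss Granger, by doing what is right."
compressed = devowel(original)
print(compressed)  # Ppl bcm wh thy r mnt t b, Mss Grngr, by dng wht s rght.
print(len(original), "->", len(compressed), "characters")
```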

Alkjash: This is a very simple example where you can predict with certainty what's missing. In general, we don't always have prediction algorithms that can predict the future with certainty. But anytime we can do better than chance, we can also get some compression improvements.

Alkjash: Here's a slightly more sophisticated example. Suppose I have a four-character alphabet and I encode some language with it in the obvious default way: each character is encoded in two bits. Then the following string will be encoded as this bit string.

Alkjash: I've highlighted every other code word just so you can see the boundaries. And you can see that this string is encoded as 32 bits in memory.

Alkjash: However, I might have a little more information or better priors about what this language looks like. I might know for example that As appear more frequently and Cs and Ds appear less frequently. Then I can choose a more suitable encoding for the task. I can choose a different encoding where A is given a shorter bit string because it appears more frequently, and C and D are given longer ones to balance that out. And, using this encoding instead, we see that the same string is encoded in only 28 bits.

Alkjash: This time I didn't have any certainty about my predictions, but because I was able to do better than chance at predicting what the next character would be, I achieved a better compression ratio. I can store the same amount of data in memory with fewer bits.
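[Editor's sketch of the two encodings in Python. The exact 16-character string from the slides is not in the transcript, so the string below is an assumption, chosen only to reproduce the same skew (A frequent, C and D rare) and the same bit counts: 32 bits fixed-width versus 28 bits with the prefix code.]

```python
# Fixed-width code: every character costs 2 bits.
FIXED = {"A": "00", "B": "01", "C": "10", "D": "11"}
# Frequency-aware prefix code: no codeword is a prefix of another,
# so decoding is unambiguous.
PREFIX = {"A": "0", "B": "10", "C": "110", "D": "111"}

def encode(s, code):
    return "".join(code[ch] for ch in s)

def decode(bits, code):
    inverse = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:       # a complete codeword has been read
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

# Illustrative 16-character string: 8 As, 4 Bs, 2 Cs, 2 Ds.
s = "AABAACADABABABCD"
print(len(encode(s, FIXED)))   # 32 bits
print(len(encode(s, PREFIX)))  # 28 bits
```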

Alkjash: The third example is very similar, but it's a different sort of prediction. Previously we just used letter frequencies, but now suppose I have a three character alphabet and I encode A as 0, B as 10, and C as 11, then by default I get 25 bits in this encoding. 

Alkjash: But now suppose I notice a different sort of pattern about this language, which here, the pattern is that no letter appears consecutively twice in a row. So, that's the different sort of predictive information from letter frequencies that we've had before.

Alkjash: Perhaps you could think for a moment about how you could use that information to store the same memory in fewer bits. No two As appear in a row; no two Bs appear in a row; no two Cs appear in a row. How would you use that information to compress the data?

[Pause for thinking]

Alkjash: Here's one way you might do it.

Alkjash: I'm going to encode the first letter as whatever it would have been before, and then I'll draw this little triangle and each next letter, because it has to be different, will either be clockwise or counter-clockwise from the previous letter in this triangle that I've drawn. 

Alkjash: So if the next letter is clockwise from my previous letter, I put down a 0. If the next letter is counter-clockwise from the previous letter, I put down a 1. And this is a very specific example of a very general class of entropy-encoding algorithms in information theory.
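[Editor's sketch of the triangle trick. The slide's 25-bit example string is not in the transcript, so the test string below is illustrative; the A=0, B=10, C=11 default code is from the talk.]

```python
DEFAULT = {"A": "0", "B": "10", "C": "11"}
LETTERS = ["A", "B", "C"]

def encode_norepeat(s):
    """First letter via the default code, then one bit per letter:
    0 = next letter is clockwise (A->B->C->A) from the previous one,
    1 = counter-clockwise."""
    bits = DEFAULT[s[0]]
    for prev, cur in zip(s, s[1:]):
        i = LETTERS.index(prev)
        bits += "0" if cur == LETTERS[(i + 1) % 3] else "1"
    return bits

def decode_norepeat(bits):
    if bits.startswith("0"):
        out, rest = ["A"], bits[1:]
    else:
        out, rest = (["B"] if bits[1] == "0" else ["C"]), bits[2:]
    for b in rest:
        i = LETTERS.index(out[-1])
        out.append(LETTERS[(i + 1) % 3] if b == "0" else LETTERS[(i + 2) % 3])
    return "".join(out)

s = "ABCACBABCA"                            # no letter repeats consecutively
naive = "".join(DEFAULT[ch] for ch in s)    # 16 bits with the default code
smart = encode_norepeat(s)                  # 10 bits with the triangle code
assert decode_norepeat(smart) == s
print(len(naive), "->", len(smart), "bits")
```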

Alkjash: So whenever you have any sort of way of better predicting the future than chance, it corresponds to a better data compression algorithm for that type of data, letting you store the same amount of information in fewer bits.

Alkjash: That's really all I have to say. So, I guess the question I want to open up for discussion is what applications this principle has in rationality. For example, we care a lot about super-forecasters predicting the future. Is it true that people who are good at predicting the future are good at storing the same amount of knowledge in their minds very efficiently? Are these very closely related skills? I’d be curious if that's true. Obviously these skills should be correlated with IQ or whatnot, but perhaps there's an additional layer to that.

Alkjash: And second, I think we spend a lot of time emphasizing prediction and calibration games as tools to improve your rationality, but perhaps we should also spend just as much time thinking about how to efficiently compress the body of knowledge that you already know. And what sorts of things could be designed as, for example, rationality exercises for that, I'd be interested to hear ideas about that.


Ben Pace: Thank you very much, Alkjash. I really like the idea of super-forecasters and super-compressors. I want to start a similar tournament for super-compressors.

Alkjash: If any of you are interested, I have been personally interested in questions of which languages have the highest information rate. Perhaps with the newer neural networks, we might actually be close to answering this sort of question.

Ben Pace: By languages, you don't mean computer science languages?

Alkjash: No, I mean human languages. Which languages seem to use more letters per idea?

Ben Pace: Interesting. Do you have a basic sense of which languages do compress things more?

Alkjash: Very naively, I expect, for example, languages with genders and other unnecessary dongles to be slightly worse, but I really don't know.

Ben Pace: All right, interesting. I think Jacob has a question.

Jacob Lagerros: So, when we spoke about this yesterday, Alkjash, you gave some interesting examples of the third bullet point. I was curious if you wanted to share some of those now as well?

Alkjash: Sure, here's one example I'm thinking about. So I have a general model about learning certain sorts of things. The model is that when you start out, there are so few data points that you just stick them into an array, but as time goes on and you learn more stuff, your brain builds a more sophisticated data structure to store the same amount of data more efficiently.

Alkjash: For example, my wife and I have occasionally competed in Sporcle geography quizzes, and one of the things we noticed is that I remember all the states as a visual map (I'm a more spatial person) and where each state is. I tried to list out the states by vertically scanning my internal map of the United States; on the other hand, she remembers them by an alphabet song that she learned in elementary school.

Alkjash: And so these are equally useful for the application of ‘remembering the states’. But when I turn to the addition of state capitals, my model seemed more useful, because it's harder to fit the state capitals into the alphabet song than into the geographical map, especially when I already roughly know where 10 or 20 of those cities would be geographically.

Alkjash: Thinking about how these data representations work… what compression actually means depends on the application, I think.

Ben Pace: Nice. Dennis, do you want to ask your question?

Dennis: Looks like compression is related a little to the Zettelkasten system. Have you tried it, and what do you think about it?

Alkjash: I don't know anything about that. Could you explain what that is?

Dennis: It is a system of compact little cards used to store information, not in categories but with links relating one card to another; it was used by sociologists.

Alkjash: I see. That's very interesting. Yeah, that seems like a very natural thing, closely related. Naively, when I learn stuff, I start out just learning a list of things, and then slowly, over time, the connections build up; this system makes that explicit. It seems like a great thing to skip the whole 'store everything as an array' stage altogether.

Ben Pace: Cool. Thanks, Dennis.

Ben Pace: Gosh, I'm excited by these questions. Abram, if you want to go for it, you can go for it. Just for Alkjash's reference, Abram wrote that he had "not a question, but a dumb argumentative comment relating this to my talk."

Abram Demski: Yeah, so my dumb argumentative comment is, prediction does not equal compression. Sequential prediction equals compression. But non-sequential prediction is also important and does not equal compression. And so, compression doesn't capture everything about what it means to be a rational agent in the world, predicting things.

Alkjash: Interesting.

Abram Demski: And by non-sequential prediction, I mean you have a sequence of bits in the information theory model, but if instead you have this broad array of things that you could think about, and you're not sure when any one of them will become observed, [then] you want all of your beliefs to be good, but you don't have any next bit... You can't express the goodness just in terms of this sequential goodness.

Alkjash: I feel like there is some corresponding compression rate I could write down based on whatever you're able to predict in this general non-sequential picture, but I haven't thought about it.

Abram Demski: Yeah. I think that you could get it into compression, but my claim is that compression won't capture all the notion of goodness.

Alkjash: I see. Yeah...

Ben Pace: Abram will go into more depth I expect in his talk.

Ben Pace: I should also mention that if you wanted to look at more of that Zettelkasten thing, Abram has written a solid introduction to the idea on LessWrong.

Abram Demski: I am also a big fan of Zettelkasten, and recommend people check it out.

Ben Pace: All right, let me see. There was another question from Vaniver. You want to go for it?

Vaniver: Yeah, so it seems to me like the idea of trying to compress your concepts, or like your understanding of the world that you already have, is a thing that we already do a lot of. 

Alkjash: Absolutely.

Vaniver: It feels to me that abstraction and categorization, this is the big thing. I guess the interesting comment to add onto that is something like, there's lumping and splitting as distinct things that are worth thinking about, where lumping is sort of saying, okay, I've got a bunch of objects, how do I figure out the right categories that divide them into a small number of groups? And splitting is like, okay, I've got ‘all dogs’ or something, and can I further subgroup within ‘dogs’ and keep going down until I have a thing that is a concrete object?

Vaniver: I think this is interesting to do because with real categories, it's much harder to find the right character encoding, or something. If we're looking at a bit-string like, we’re like oh yeah, letters are letters. And sometimes you're like, oh actually maybe we want words instead of letters.

Alkjash: That's right, yeah.

Vaniver: Your conceptual understanding of the world, it feels like there's much more variation in what you care about for different purposes.

Alkjash: Yeah, so I definitely agree that categorization is an important part of this compression. But I think, actually, probably the more important thing for compression is, as orthonormal says, having a model.

Alkjash: Here's a good example that I've thought about for a while. When I was playing Go, I spent my first semester in college just studying Go 24/7 instead of going to class. At the end of those six months, I noticed that after playing a game, I could reproduce the entire game from memory. I don't know how much you guys know about Go, but it's a 19x19 board, and a game consists of placing down black and white stones; there are probably 200 moves in the average game.

Alkjash: And this skill that I have is not at all exceptional, I think almost anyone who becomes a strong amateur will be able to reproduce their games from memory after playing for about a year. The reason they achieve such a good memory, I claim, is because they're getting good compression and it's not because they're good at categorizing. It's because for the vast majority of moves, they can predict exactly what the next move is because they have a model of how players at their level play. Even though there's a 19x19 board, 90 percent of the time the next move is obvious.

Alkjash: So, that's why there's so little data in that actual game.

Ben Pace: Cool, thanks. I wanted to just add one thought of mine which is: I think the super-forecasters and super-compressors idea is very fun, and I think if anyone wanted an idea for a LessWrong post to write and wanted to have a format in which to Babble a bit, something that’s got different constraints, you could write one taking something like Tetlock’s 10 Heuristics that Superforecasters Use and trying to turn them into 10 Heuristics that Supercompressors Use, and seeing if you come up with anything interesting there, I would be excited to read that.

Ben Pace: All right, thanks a lot, Alkjash. I think we'll move next onto Abram.


Locality of goals

Published on June 22, 2020 9:56 PM GMT


Studying goal-directedness produces two kinds of questions: questions about goals, and questions about being directed towards a goal. Most of my previous posts focused on the second kind; this one shifts to the first kind.

Assume some goal-directed system with a known goal. The nature of this goal will influence which issues of safety the system might have. If the goal focuses on the input, the system might wirehead itself and/or game its specification. On the other hand, if the goal lies firmly in the environment, the system might have convergent instrumental subgoals and/or destroy any unspecified value.

Locality aims at capturing this distinction.

Intuitively, the locality of the system's goal captures how far away from the system one must look to check the accomplishment of the goal.

Let's give some examples:

  • The goal of "My sensor reaches the number 23" is very local, probably maximally local.
  • The goal of "Maintain the temperature of the room at 23 °C" is less local, but still focused on a close neighborhood of the system.
  • The goal of "No death from cancer in the whole world" is even less local.

Locality isn't about how the system extracts a model of the world from its input, but about whether, and how much, it cares about the world beyond itself.

Starting points

This intuition about locality came from the collision of two different classifications of goals: the first from Daniel Dennett and the second from Evan Hubinger.

Thermostats and Goals

In "The Intentional Stance", Dennett explains, extends and defends... the intentional stance. One point he discusses is his liberalism: he is completely comfortable with admitting ridiculously simple systems like thermostats into the club of intentional systems -- with ascribing to them meaningful mental states: beliefs, desires and goals.

Lest we readers feel insulted at the comparison, Dennett nonetheless admits that the goals of a thermostat differ from ours.

Going along with the gag, we might agree to grant [the thermostat] the capacity for about half a dozen different beliefs and fewer desires—it can believe the room is too cold or too hot, that the boiler is on or off, and that if it wants the room warmer it should turn on the boiler, and so forth. But surely this is imputing too much to the thermostat; it has no concept of heat or of a boiler, for instance. So suppose we de-interpret its beliefs and desires: it can believe the A is too F or G, and if it wants the A to be more F it should do K, and so forth. After all, by attaching the thermostatic control mechanism to different input and output devices, it could be made to regulate the amount of water in a tank, or the speed of a train, for instance.

The goals and beliefs of a thermostat are thus not about heat and the room it is in, as our anthropomorphic bias might suggest, but about the binary state of its sensor.

Now, if the thermostat had more information about the world -- a camera, GPS position, general reasoning ability to infer information about the actual temperature from all its inputs --, then Dennett argues its beliefs and goals would be much more related to heat in the room.

The more of this we add, the less amenable our device becomes to serving as the control structure of anything other than a room-temperature maintenance system. A more formal way of saying this is that the class of indistinguishably satisfactory models of the formal system embodied in its internal states gets smaller and smaller as we add such complexities; the more we add, the richer or more demanding or specific the semantics of the system, until eventually we reach systems for which a unique semantic interpretation is practically (but never in principle) dictated (cf. Hayes 1979). At that point we say this device (or animal or person) has beliefs about heat and about this very room, and so forth, not only because of the system's actual location in, and operations on, the world, but because we cannot imagine another niche in which it could be placed where it would work.

Humans, Dennett argues, are more like this enhanced thermostat, in that our beliefs and goals intertwine with the state of the world. Or put differently, when the world around us changes, it will almost always influence our mental states; whereas a basic thermostat might react the exact same way in vastly different environments.

But as systems become perceptually richer and behaviorally more versatile, it becomes harder and harder to make substitutions in the actual links of the system to the world without changing the organization of the system itself. If you change its environment, it will notice, in effect, and make a change in its internal state in response. There comes to be a two-way constraint of growing specificity between the device and the environment. Fix the device in any one state and it demands a very specific environment in which to operate properly (you can no longer switch it easily from regulating temperature to regulating speed or anything else); but at the same time, if you do not fix the state it is in, but just plonk it down in a changed environment, its sensory attachments will be sensitive and discriminative enough to respond appropriately to the change, driving the system into a new state, in which it will operate effectively in the new environment.

Part of this distinction between goals comes from generalization, a property considered necessary for goal-directedness since Rohin's initial post on the subject. But the two goals also differ in their "groundedness": the thermostat's goal lies completely in its sensors' inputs, whereas the goals of humans depend on things farther away, on the environment itself.

That is, these two goals have different locality.

Goals Across Cartesian Boundaries

The other classification of goals comes from Evan Hubinger, in a personal discussion. Assuming a Cartesian Boundary outlining the system and its inputs and outputs, goals can be functions of:

  • The environment. This includes most human goals, since we tend to refuse wireheading. Hence the goal depends on something other than our brain state.
  • The input. A typical goal as a function of the input is the one ascribed to the simple thermostat: maintaining the number given by its sensor above some threshold. If we look at the thermostat without assuming that its goal is a proxy for something else, then this system would happily wirehead itself, as the goal IS the input.
  • The output. This one is a bit weirder, but captures goals about actions: for example, the goal of twitching. If there is a robot that only twitches, not even trying to keep twitching, just twitching, its goal seems about its output only.
  • The internals. Lastly, goals can depend on what happens inside the system. For example, a very depressed person might have the goal of "Feeling good". If that is the only thing that matters, then it is a goal about their internal state, and nothing else.
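To make the four cases concrete, here is a toy sketch (my own naming and thresholds, not from the post) of goals written as functions of the different parts of the Cartesian split:

```python
# Toy sketch: each goal is a predicate over a different part of the
# Cartesian split (input, output, internals, environment).
def thermostat_goal(inputs):
    # Function of the input: keep the sensor reading above a threshold.
    return inputs["sensor_reading"] >= 20

def twitch_goal(outputs):
    # Function of the output: the action itself is the goal.
    return outputs["motor"] == "twitch"

def feel_good_goal(internals):
    # Function of the internals: only the system's own state matters.
    return internals["mood"] == "good"

def world_peace_goal(environment):
    # Function of the environment: depends on the world, not the brain state.
    return environment["world_peace"]

print(thermostat_goal({"sensor_reading": 22}))  # True
```

The thermostat goal is satisfiable by manipulating the input alone (wireheading), while the environment goal is not; that difference is what the rest of the post unpacks as locality.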

Of course, many goals are functions of multiple parts of this quartet. Yet separating them allows characterizing a given goal through the proportion of each.

Going back to Dennett's example, the basic thermostat's goal is a function of its input, while human goals tend to be functions of the environment. And once again, an important aspect of the difference appears to be how far from the system the goal-relevant information lies -- locality.

What Is Locality Anyway?

Assuming some model of the world (possibly a causal DAG) containing the system, the locality of the goal is inversely proportional to the minimum radius of a ball, centered at the system, which suffices to evaluate the goal. Basically, one needs to look a certain distance away to check whether one's goal is accomplished; locality is a measure of this distance. The more local a goal, the less grounded it is in the environment, and the more susceptible it is to wireheading or to a change of environment without a change of internal state.
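As a rough illustration of this definition (a sketch of my own; the node names and the use of undirected graph distance are assumptions, not part of the post), the ball radius can be computed as the distance from the system to the farthest node needed to evaluate the goal:

```python
from collections import deque

def ball_radius(edges, system, goal_nodes):
    """Minimum radius of a ball centered at `system` that contains
    every node needed to evaluate the goal (undirected graph distance)."""
    # Build an undirected adjacency list from the edge set.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Breadth-first search from the system node.
    dist = {system: 0}
    queue = deque([system])
    while queue:
        node = queue.popleft()
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return max(dist[g] for g in goal_nodes)

# A goal that reads only the sensor is more local (smaller radius)
# than a goal about the room's actual temperature.
edges = [("thermostat", "sensor"), ("sensor", "room_temp"), ("room_temp", "sun")]
print(ball_radius(edges, "thermostat", ["sensor"]))     # 1: input-based goal
print(ball_radius(edges, "thermostat", ["room_temp"]))  # 2: environment-based goal
```

Locality would then be inversely proportional to this radius: the input-based goal, needing a smaller ball, is the more local one.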

Running with this attempt at formalization, a couple of interesting points follow:

  • If the model of the world includes time, then locality also captures how far in the future and in the past one must go to evaluate the goal. This is basically the short-sightedness of a goal, as exemplified by variants of twitching robots: the robot that simply twitches; the one that wants to maximize its twitching in the next second; the one that wants to maximize its twitching in the next 2 seconds, ... up to the robot that wants to maximize the time it twitches in the future.
  • Despite the previous point, locality differs from the short term/long term split. An example of a short-term goal (or one-shot goal) is wanting an ice cream: after its accomplishment, the goal simply dissolves. An example of a long-term goal (or continuous goal) is to bring about and maintain world peace -- something that is never over, but instead constrains the shape of the whole future. Short-sightedness differs from short-term, as a short-sighted goal can be long-term: "for all times t (in hours to simplify), I need to eat an ice cream in the interval [t-4,t+4]".
  • Where we put the center of the ball inside the system is probably irrelevant, as the classes of locality should matter more than the exact distance.
  • An alternative definition would allow the center of the ball to be anywhere in the world, and make locality inversely proportional to the distance from the center to the system plus the radius. This captures goals that do not depend on the state of the system, but would give numbers similar to those of the initial definition.

In summary, locality is a measure of the distance at which information about the world matters for a system's goal. It appears in various guises in different classifications of goals, and underlies multiple safety issues. What I give is far from a formalization; it is instead a first exploration of the concept, with open directions to boot. Yet I believe that the concept can be put into more formal terms, and that such a measure of locality captures a fundamental aspect of goal-directedness.

Thanks to Victoria Krakovna, Evan Hubinger and Michele Campolo for discussions on this idea.


AI Benefits Post 1: Introducing “AI Benefits”

LessWrong.com News - June 23, 2020 - 00:27
Published on June 22, 2020 4:59 PM GMT

This is a post in a series on "AI Benefits." It is cross-posted from my personal blog. For other entries in this series, navigate to the AI Benefits Blog Series Index page.

This post is also discussed on the Effective Altruism Forum. Links to those cross-posts are available on the Index page.

For comments on this series, I am thankful to Katya Klinova, Max Ghenis, Avital Balwit, Joel Becker, Anton Korinek, and others. Errors are my own.

If you are an expert in a relevant area and would like to help me further explore this topic, please contact me.

Introducing “AI Benefits”

Since I started working on the Windfall Clause report in summer 2018, I have become increasingly focused on how advanced AI systems might benefit humanity. My present role at OpenAI has increased my attention to this question, given OpenAI’s mission of “ensur[ing] artificial general intelligence (AGI) . . . benefits all of humanity.”

This post is the first in a series on “AI Benefits.” The series will be divided into two parts. In the first part, I will explain the current state of my thinking on AI Benefits. In the second part, I will highlight some of the questions on which I have substantial uncertainty.

My hope is that this series will generate useful discussion—and particularly criticism—that can improve my thinking. I am particularly interested in the assistance of subject-matter experts in the fields listed at the end of this series who can offer feedback on how to approach these questions. Although I do raise some questions explicitly, I also have uncertainty surrounding many of the framing assumptions made in this series. Thus, commentary on any part of this series is welcome and encouraged.

In the first few posts, starting with this one, I want to briefly explain what I mean by “AI Benefits” for the purposes of this series. The essential idea of AI Benefits is simple: AI Benefits means AI applications that are good for humanity. However, the concept has some wrinkles that need further explaining.

AI Benefits is Distinct from Default Market Outcomes

One important clarification is the distinction between AI Benefits as I will be using the term and other benefits from AI that arise through market mechanisms.

I expect AI to create and distribute many benefits through market systems, and in no way wish to reject markets as important mechanisms for generating and distributing social benefits. However, I also expect profit-seeking businesses to vigorously search for and develop profitable forms of benefits, which are thus likely to be generated absent further intervention.

Since I care about being beneficial relative to a counterfactual scenario in which I do nothing, I am more interested in discussing benefits that businesses are unlikely to generate from profit-maximizing activities alone. Thus, for the rest of this series I will be focusing on the subset of AI Benefits that individuals could receive beyond what markets (that is, actors not motivated by social benefit) would likely provide by default.

There are several reasons why an AI Benefit might not be well provided-for by profit-maximizing businesses.[1] The first is that some positive externalities from a Benefit might not be easily capturable by market actors.[2] A classic example of this might be using AI to combat climate change—a global public good. It is common for innovations to have such positive externalities. The Internet is a great example of this; its “creators”—whoever they are—likely captured very little of the benefits that flow from it.

Profit-maximizers focus on consumers’ willingness to pay (“WTP”). However, a product for which rich consumers have high WTP can yield far lower improvements to human welfare and happiness than a product aimed at poor consumers, given the latter’s low WTP. Accordingly, investments in products benefitting poor consumers might be underproduced by profit-seekers relative to the social value—happiness and flourishing—they could create.[3] Actors less concerned with profits should fill this gap. Consumers’ bounded rationality could also lead them to undervalue certain products.

This line of work also focuses on what benefactors can do unilaterally. Therefore, I largely take market incentives as stable, even though the maximally beneficial thing to do might be to improve market incentives. (As an example, carbon pricing could fix the negative externalities associated with carbon emissions.) Such market-shaping work is very important to good outcomes, but is less tractable for individual Benefactors other than entire governments. However, a complete portfolio of Benefits might include funding advocacy for better policies.

A point of clarification: although by definition markets will generally not provide the subset of AI Benefits on which I am focusing, market actors can play a key role in providing AI Benefits. Some profit-driven firms already engage in philanthropy and other corporate social responsibility (“CSR”) efforts. Such behaviors may or may not be ultimately aimed at profit. Regardless, the resources available to large for-profit AI developers may make them adept at generating AI Benefits. The motivation, not the legal status of the actor, determines whether a beneficial activity counts as the type of AI Benefits I care about.

Of course, nonprofits, educational institutions, governments, and individuals also provide such nonmarket benefits. The intended audience for this work is anyone aiming to provide nonmarket AI Benefits, including for-profit firms acting in a philanthropic or beneficent capacity.

  1. Cf. Hauke Hillebrandt & John Halstead, Impact Investing Report 26–27 (2018), https://founderspledge.com/research/fp-impact-investing [https://perma.cc/7S5Y-7S6F]; Holden Karnofsky, Broad Market Efficiency, The GiveWell Blog (July 25, 2016), https://blog.givewell.org/2013/05/02/broad-market-efficiency/ [https://perma.cc/TA8D-YT69]. ↩︎

  2. See Hillebrandt & Halstead, supra, at 14–15. ↩︎

  3. See id. at 12–14. ↩︎


News ⊂ Advertising

LessWrong.com News - June 22, 2020 - 22:19
Published on June 22, 2020 7:19 PM GMT

For-profit news outlets are financially incentivized to write about things that are easy to write about. The easiest articles to write are the subsidized ones. Public relations firms subsidize news by writing press releases. Then news outlets republish the press releases as news. That's why so much news is corporate and political advertising.

Here are the top stories on Ars Technica at the time of writing[1].

  1. "NordVPN users’ passwords exposed in mass credential-stuffing attacks"
  2. "AT&T’s priciest “unlimited” plan now allows 100GB+ of un-throttled data"
  3. "Researchers unearth malware that siphoned SMS texts out of telco’s network"
  4. "The count of managed service providers getting hit with ransomware mounts"
  5. "Facebook deletes the accounts of NSO Group workers"

Having only skimmed the articles, I suspect they were put there by the following companies.

  1. Have I Been Pwned (breach notification service)
  2. AT&T
  3. FireEye (security firm)
  4. Armor (global cloud security provider)
  5. Facebook

The first article lets slip who wrote it in the following line.

Readers who are NordVPN users should visit Have I Been Pwned[2] and check to see if their email address is contained in any of the lists.

Can you spot how this sentence attempts to influence reader behavior?

Different organizations write articles for different news outlets. Ars Technica is unusual in its disproportionate publishing of articles written by cybersecurity firms and its relatively low density of political propaganda compared to more traditional news outlets like The Economist. The Art of Manliness Podcast interviewees usually discuss the books they're selling.[3]

News is advertising. Ad-supported news is ad-supported advertising. Subscription-supported news is subscription-supported advertising. Advertising can't directly control what you believe. Advertisers can control what you think about. The more advertising I expose myself to, the more I think about the things advertisers want me to.

Here is what advertisers want me to think about.

  • Products I haven't bought
  • National politics[4]
  • More ad-supported media, such as celebrities and free-to-play videogames

Here is what I want to think about.

  • Things I can make and do myself
  • My local community
  • My friends, my family and me

My personal happiness is inversely related to how much news I expose myself to. It's not just a subjective feeling. I behave more healthily. I'm even more interesting to talk to.

Amateur blogs make me think about what the author thinks is important. That's a step in the right direction because amateur bloggers' interests align better with mine than do the corporate and political machines behind news outlet press releases. But they're still not me. And some of them are motivated by vanity.

I solve all of these issues by writing a blog myself. That way the author's interests align perfectly with my own.

If you liked this post, click here.

Edit: jballoch points out that Have I Been Pwned is a noncommercial donation-supported service.

  1. November 3, 2019 at 1:43 am ↩︎

  2. The hyperlink is in the original article. It's the article's second link to Have I been Pwned. ↩︎

  3. I pick these specific news outlets because I visit them the most. Aggregators like Facebook and Reddit are different beasts deserving of a separate post. ↩︎

  4. I don't deny that national politics is important. I mean that the proportion of attention it gets on the news is greater than the proportion of my attention I wish to passively devote to it. ↩︎


The Indexing Problem

LessWrong.com News - June 22, 2020 - 22:11
Published on June 22, 2020 7:11 PM GMT

Meta: this project is wrapping up for now. This is the first of probably several posts dumping my thought-state as of this week.

Suppose we have a simple causal model of a system:

We hire someone to study the system, and eventually they come back with some data: “X=2”, they tell us. Apparently they are using a different naming convention than we are; what is this variable X? Does it correspond to a? To b? To c? To something else entirely?

I’ll call this the indexing problem: “indexing into” a model to specify where some data “came from”. (If someone knows of a better existing name, let me know.)

In practice, we usually “solve” the indexing problem by coordinating on names of things. I use descriptive variable names like “counterweight mass” or “customer age” in my model, and then when someone gives me data, it comes with object names like “counterweight” or “customer” and field names like “mass” or “age”. Of course, this has a major failure mode: names can be deceiving. A poorly documented API may have a field called “date”, and I might mistake it for the date on which some document was signed when really it was the date on which the document was delivered. Or a social scientist may have data on “empathy” or “status” constructed from a weighted sum of questionnaire responses, which may or may not correspond to anybody else’s notion/measurement of “empathy” or “status”.

The expensive way around this is to explicitly model the data-generation process. If I look at the actual procedure followed by the social scientist, or the actual code behind the poorly-documented API, then I can ignore the names and figure out for myself what the variables tell me about the world.

Usually this is where we’d start talking about social norms or something like that, but the indexing problem still needs to be solved even in the absence of any other agents. Even if I were the last person on a postapocalyptic Earth, I might find a photograph and think “where was this photograph taken?” - that would be an indexing problem. I might get lost and end up in front of a distinctive building, and wonder “where is this? Where am I?” - that’s an indexing problem. A premodern natural philosopher might stare through a telescope and wonder where that image is really coming from, and if it can really be trusted - that’s an indexing problem. I might smell oranges but not see any fruit, or I might hear water but see only desert, and wonder where the anomalous sensory inputs are coming from - that’s an indexing problem. Any time I’m wondering “where something came from” - i.e. which components of my world-model I should attach the data to - that’s an indexing problem.

When thinking about embedded agents, even just representing indices is nontrivial.

Representing Indices

If our world-model were just a plain graphical causal model in the usual format, and our data were just variables in that model, then representation would be simple: we just have a hash table mapping each variable to data on that variable (or null, for variables without any known data). 

For agents modelling environments which contain themselves, this doesn’t work: the environment is necessarily larger than the model. If the world-model is a graphical causal model, we need to compress it somehow; we can’t even explicitly list all the variables. The main way I imagine doing this is to write causal models like we write programs: use recursive “calls” to submodels to build small representations of our world-model, the same way source code represents large computations.

… which means that, when data comes in, we need to index into a data structure representing the trace of a program’s execution graph, without actually calculating/storing that trace.

Here’s an example. We have a causal model representing this python program:

def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n-1)

Expanded out, the causal model for this program executing on n=3 looks like this:

… where there are implicit arrows going downwards within each box (i.e. the variables in each box depend on the variables above them within the box). Notice that the causal model consists of repeated blocks; we can compress them with a representation like this:

That’s the compressed world-model.
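A minimal sketch of this compression idea (my own illustration, not the project's actual data structure): the compressed model is a single block whose recursive call points back to itself, and expansion unrolls it on demand.

```python
# Hypothetical representation: one block of variables plus a lazy
# recursive call back to the same submodel, never expanded up front.
FACTORIAL_MODEL = {
    "vars": ["n", "n == 0", "factorial(n-1)", "return"],
    "recursive_call": "FACTORIAL_MODEL",  # points back to itself, unexpanded
}

def expand(model, depth):
    """Unroll `depth` levels of the compressed model into explicit blocks,
    the way the diagram expands the model for n=3."""
    if depth == 0:
        return []
    block = {"vars": list(model["vars"])}
    return [block] + expand(model, depth - 1)  # follow the recursive call

# The compressed representation stays constant-size; only the expansion grows.
print(len(expand(FACTORIAL_MODEL, 4)))  # 4 blocks for n=3, base case included
```

The point is exactly the one in the text: the model itself stays small, so any data that arrives must be indexed into an expansion we never actually materialize.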

Now, forget about n=3. I just have the compressed world-model for the general case. I’m playing around with a debugger, setting breakpoints and stepping through the code. Some data comes in; it tells me that the value of “factorial(n-1)” in the “second block from the top” in the expanded model is 7.

Now, there’s a whole hairy problem here about how to update our world-model to reflect this new data, but we’re not even going to worry about that. First we need to ask: how do we even represent this? What data structure do we use to point to the “second block from the top” in an expansion of a computation - a computation which we don’t want to fully expand out because it may be too large? Obviously we could hack something together, but my inner software engineer says that will get very ugly very quickly. We want a clean way to index into the model in order to attach data.

Oh, and just to make it a bit more interesting, remember that sometimes we might not even know which block we’re looking for. Maybe we’ve been stepping through with the debugger for a while, and we’re looking at a recursive call somewhere between ten and twenty deep but we’re not sure exactly which. We want to be able to represent that too. So there can be “indexical uncertainty” - i.e. we’re unsure of the index itself.
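As a toy illustration (my own invention, not the post's proposal), such an index might be a count of "follow the recursive call" steps plus a variable name, with indexical uncertainty represented as a set of candidate indices:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Index:
    """Pointer into the unexpanded trace: follow the recursive call
    `depth` times, then read the variable named `var`."""
    depth: int
    var: str

# "The value of factorial(n-1) in the second block from the top is 7":
data = {Index(depth=1, var="factorial(n-1)"): 7}

# Indexical uncertainty: the same datum, known only to sit somewhere
# between ten and twenty recursive calls deep.
candidates = {Index(depth=d, var="factorial(n-1)") for d in range(10, 21)}
print(len(candidates))  # 11 candidate indices
```

This only handles a single linear recursion; anything with branching calls or unknown structure would need something richer, which is where the algebraic-equations approach below comes in.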

I don’t have a complete answer to this yet, but I expect it will look something like algebraic equations on data structures (algex). For instance, we might have a plain causal model for name-generation like 

{
    ethnicity = random_ethnicity()
    first_name = random_first(ethnicity)
    last_name = random_last(ethnicity)
}

… and then we could represent some data via an equation like

The left-hand-side would be the same data structure we use to represent world models, and the right-hand-side would be a “similarly-shaped” structure containing our data. Like algex, the “equation” would be interpreted asymmetrically - missing data on the right means that we just don’t know that value yet, whereas the left has the whole world model, and in general we “solve” for things on the left in terms of things on the right.

The algex library already does this sort of thing with deeply-nested data structures, and extending it to the sort of lazy recursively-nested data structures relevant to world models should be straightforward. The main task is coming up with a clean set of Transformations which allow walking around the graph, e.g. something which says “follow this recursive call two times”.


Sequences Reading Club

Events at Kocherga - June 22, 2020 - 19:30
On Monday, June 22, at 19:30 Moscow time, we finish reading the "Mysterious Answers" sequence from Eliezer Yudkowsky's book Rationality: From AI to Zombies.

Deriving General Relativity

LessWrong.com News - June 22, 2020 - 12:52
Published on June 22, 2020 9:52 AM GMT

MJXc-TeX-size4-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size4-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size4-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size4-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-R; src: local('MathJax_Vector'), local('MathJax_Vector-Regular')} @font-face {font-family: MJXc-TeX-vec-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-B; src: local('MathJax_Vector Bold'), local('MathJax_Vector-Bold')} @font-face {font-family: MJXc-TeX-vec-Bx; src: local('MathJax_Vector'); font-weight: bold} @font-face {font-family: MJXc-TeX-vec-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Bold.otf') format('opentype')}

Challenge: Derive general relativity from special relativity.

Consider a spaceship with its windows blacked out. Accelerating at 1 g feels the same as sitting on the launchpad. Relativistically, an object resting on the surface of the Earth is accelerating upward at 1 g and an object in freefall is at rest along its geodesic. Therefore the natural (inertial) path of an object follows its geodesic along the spacetime manifold. Lagrangianly, an object's trajectory locally maximizes proper time.

If objects in freefall follow their geodesics, then gravity must modify the geodesic itself, i.e. gravity warps spacetime.

For gravity to be attractive instead of repulsive, time must be slowed in a gravity well. Under special relativity, slowing time and contracting space are the same thing. Therefore gravity slows time and contracts space. Gravity is produced by mass, so matter emits a field that slows time and contracts space.

Dyson Sphere over Tatooine

Now consider the gravity potential of a binary star system observed from a distance that is large compared to the orbital radius between the stars.

If it is 1915 and you have not invented general relativity yet then you might believe M represents total enclosed rest mass. And it would…if we lived in a Newtonian universe. But Gauss' law applies to any closed system.
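For reference, this is the standard integral form of Gauss' law for Newtonian gravity, the equation in which the M above appears (not reproduced from the original post; sign conventions vary):

```latex
\oint_{\partial V} \vec{g} \cdot d\vec{A} = -4\pi G M_{\text{enclosed}}
```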


If you built a Dyson sphere around the binary star then the external gravity of your closed system should never change[1] even as the stars spiral into each other. But as the stars spiral into each other they increase their spatial speed v, thereby decreasing their temporal speed α = √(1 − v²/c²). The slower temporal speed (increased Lorentz factor) decreases the stars' gravitational output per unit mass via time dilation. But the total gravity escaping from the closed system (Dyson sphere) must stay constant. The mass of the stars increases to counteract time dilation. Therefore the relativistic mass of a particle equals the rest mass m₀ divided by the temporal speed α.
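A minimal numerical sketch of the relationship above, in units where c = 1 (the function names are my own, not from the post):

```python
import math

def temporal_speed(v, c=1.0):
    """alpha = sqrt(1 - v^2/c^2): the 'temporal speed' (reciprocal Lorentz factor)."""
    return math.sqrt(1.0 - (v / c) ** 2)

def relativistic_mass(m0, v, c=1.0):
    """Relativistic mass: rest mass divided by temporal speed."""
    return m0 / temporal_speed(v, c)

# As the stars speed up, temporal speed drops and mass rises to compensate,
# keeping the gravity seen from outside the Dyson sphere constant.
for v in (0.0, 0.5, 0.9):
    print(f"v = {v}: alpha = {temporal_speed(v):.3f}, m/m0 = {relativistic_mass(1.0, v):.3f}")
```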


The M term in Gauss' law is the total relativistic mass plus the energy of the field.

Mass-Energy Equivalence

You can show the energy density of a field is proportional to the square of the field strength with pure mathematics. No science necessary. By plugging this math into the binary star system thought experiment, we find that the energy from the gravitational field gets turned into mass. Thus the total mass-energy is conserved.
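For the curious, the standard result for the Newtonian gravitational field (not from the post; the sign is negative because assembling masses releases energy, and it mirrors the electrostatic case):

```latex
u_g = -\frac{|\vec{g}|^2}{8\pi G}, \qquad \text{cf. electrostatics: } u_E = \frac{\epsilon_0 |\vec{E}|^2}{2}
```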

What if we did this in reverse? What happens when a Uranium-235 nucleus absorbs a neutron?


Just as an observer from outside the Dyson sphere cannot distinguish the rest mass of the stars from the energy within the enclosed gravitational field, an observer outside a Uranium-235 nucleus cannot distinguish the mass from the energy enclosed within it. An external observer observes the nucleus's internal energy as mass. So if the rest mass of the products is less than the rest mass of the reactants, then the difference is released as energy.
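Here's the mass-defect arithmetic sketched numerically for one common U-235 fission channel (the isotope masses are approximate textbook values I'm supplying, not figures from the original post):

```python
# Back-of-the-envelope mass-defect arithmetic for one common fission channel:
#   U-235 + n -> Ba-141 + Kr-92 + 3n
# Masses in atomic mass units (u); approximate textbook values.
U235  = 235.043930
N     = 1.008665    # neutron
BA141 = 140.914411
KR92  = 91.926156

U_TO_MEV = 931.494  # energy equivalent of 1 u, in MeV

reactants = U235 + N
products  = BA141 + KR92 + 3 * N
mass_defect = reactants - products      # mass that "goes missing"
energy_mev  = mass_defect * U_TO_MEV    # released as kinetic energy and radiation

print(f"mass defect: {mass_defect:.4f} u  ->  about {energy_mev:.0f} MeV released")
```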


  1. Please humor me by ignoring the energy lost to thermal radiation and gravitational waves. ↩︎


The Queen of the Damned

Published on June 22, 2020 4:50 AM GMT

A short story set in the EVE universe. Read other short stories by me and by other members of the Alexylva Paradox at Alexylvaparadox.wordpress.com/chronicles

“They played you like a fiddle, Metz.” The woman was everything Metz wasn’t. He was tall and lanky, rough, ragged on the edges, with a normally relaxed and carefree demeanor, dust-worn and weather-beaten. He was scruffy, with eyes that made him look older than he actually was; those dark eyes had seen a thousand little tragedies. Murders, executions, torture, and worse.

In contrast, Endorsei Edlrif looked like a paragon of the world of professional business. She was polished and sharpened, from her perfectly pressed suit to her immaculately brushed hair, the tasteful but understated jewelry she wore, and the smooth and confident manner she carried herself with. She radiated an almost innocent poise, and with a face that hid her years, she would have fit in anywhere from a boardroom to a college campus.

She absolutely terrified Metz. The crimes, the inhumanities, the deaths: they weighed on Metz like a heavy stone around his neck; he felt the weight of his deeds with every step he took. He saw the faces of the dead when he closed his eyes. He wasn’t a good person, but he felt the cost of his sins, the barbs embedded in his soul. In contrast, Endorsei had an almost insane lightness to her. She was happy, cheerful, even chipper. He’d seen her squeal with glee and clap like a little girl when a group of traitors was strung up in front of her; none of the horror seemed to faze her in the slightest. On the contrary, she seemed to revel in it.

Metz Jerindold was a fixer; he made problems go away. But he was also a leader who took responsibility for his actions. He might be a murderous bastard, but Skarkon was his system and he was still very protective of it. He was an Angel first, but he was a Skarkon native second. He was Matari, even if many in the republic would be loath to admit he was one of them.

Endorsei Edlrif on the other hand, was a monster. Metz had trouble believing she was actually human at times, much less Matari. The slaver’s fangs voluval appearing beneath her lips marked her as an outcast even among outcasts. She was the Cartel’s razor, gleaming in the darkness, the Queen of the Damned.

“I had to do something,” he said, nervously taking a drag of his cigarette. “We couldn’t just let the Krullefor continue muscling into our turf.”

Endorsei looked out the panoramic windows of their meeting room aboard her custom Machariel. The deserts of Skarkon were like a painted canvas far below them. She watched a dust storm moving across the world with only the faintest hint of disgust.

“And so you set off a nuke in the middle of the city they were basing out of, killing a bunch of random people and giving the RSS an excuse to escalate the conflict further.” She shook her head, gesturing with the sucker she’d been slurping on. “They played you, Metz. They wanted you to react so they could say they were bringing justice and restoring order, and you reacted exactly the way they wanted. Now they get to bring the hammer down and play the heroes.”

“They won’t be seen as liberators,” Metz told her, following her eyes out the windows towards the planet, “Not in Skarkon. You don’t wipe away decades of bad blood and abandonment with a few soldiers and rations. They’d have to kill half the people down there to even begin to contest our grip.”

“You think the republic would give a fedo’s arse if half the people on that dustball died tomorrow?” she said, raising an eyebrow.

“They couldn’t…” he said, his voice trailing off, “That’s a bad look, even for them. I doubt they would want that much negative PR.”

“They’re going to put thousands of troops on the ground down there and turn that planet over for months,” Endorsei told him. “Long term occupations of hostile worlds are never pretty, just look at Caldari Prime.”

“What makes you so sure they’re going to stick around?” Metz asked her.

“Because they’re trying to find something,” she said, grinning darkly, “and they won’t leave a stone unturned if they think what they want is under it.”

“I know they’ve got people looking for Archeotech,” Metz told her, “What are they after?”

She giggled, “Sorry love, but that’s above your paygrade. All you need to know is that they won’t find it. We found it and took everything out over a decade ago.”

“So why are you here, then, Endorsei?” Metz said apprehensively, wondering if he was about to eat a bullet.

“Oh, I’m just tying up loose ends,” she said, skipping up to the window and peering out it with a wide-eyed, childlike fascination. “Ooh, look at that!” She jumped up and down, pointing at something out in the desert.

On the planet below, as if on cue, a trio of atomic explosions twinkled silently on the Ngelgneig, followed a few seconds later by five more. As far as Metz knew, there was nothing out there in the desert worth nuking. Just some mobile bases operated by various cartel-backed corporate outfits.

“There!” she said happily. “No more loose ends, no one left alive who could say anything, and no easily accessed evidence. It will take them months to find out that what they’re looking for isn’t on Skarkon II anymore. And you, Metz, I have big plans for you!”

She put an arm over Metz’s shoulder, and the big man tried not to wince. She ran her fingers across his back and he had the sickening feeling of being sized up as a meal by some sort of giant predatory insect.

“You’re going to make sure the RSS’s stay on Skarkon II is an extra special one. I want you to pull out all the stops. Hold protests, throw rocks, arrange strikes, send gift baskets with grenades in the bottom, plant roadside bombs, hit squad leaders with snipers, everything you can do to turn that planet into as much of a slaver trap as possible, I want you to do it. Feel free to tap into the local discretionary fund. Fight smart, make the republic afraid of absolutely everyone on that damned planet. I want you to make it abundantly clear to them that Skarkon is not and will never be their planet, and that the people of Skarkon will never pledge loyalty to them. If they want Skarkon II, they’ll have to plant their flag in a pile of children’s corpses. Do I make myself clear?”

“Abundantly,” Metz said, carefully removing Endorsei’s hand from his back like one might remove a venomous snake.

“Just one last thing,” he said, “That necklace you’re wearing,” he pointed to the faintly luminescent sky blue pearl hanging by a simple silver chain from her neck, “There was a girl, a capsuleer, from one of the groups operating warclones on the planet. She had a jewel like that, said it was spiritually significant to her people and wanted to know what the Angels knew about them.”

Endorsei frowned faintly, looking at Metz and then down at her necklace. She shook her head. “That’s also above your paygrade, Metz. But since I’m feeling… generous, here, give her my card.”

She held out a small piece of paper containing Endorsei’s neocom address. He knew the card also contained some nasty malware and a tiny sliver of antimatter that could be remotely detonated. He took the card carefully, handling it like the bomb it was.

Metz looked like he wanted to say something else, then thought better of it; he wanted to be out of the room and away from Endorsei Edlrif as fast as possible.

“Now take care, love,” she said, giggling and shooing him out of the meeting room, already bored of the sebiestor. “Make sure to give the RSS our best welcome, and give your girlfriend that card; we’ll see about getting her a nice trinket.”

Metz let himself be pushed out the door and practically ran back to his shuttle.


Neural Basis for Global Workspace Theory

Published on June 22, 2020 4:19 AM GMT

Epistemic Status: Intense amateur neuroscience here. Hoping to leverage Cunningham's Law to reach enlightenment.

Kaj Sotala has a great sequence on Multiagent Models of the Mind, a sequence that's led to a lot of fun developments in how I think about minds in general. It also introduced me to Global Workspace Theory, one of the current mainstream theories of consciousness.

When studying the mind, you can attack from different levels of abstraction, from the lowest level of studying the anatomy of neurons all the way up to postulating abstract cognitive algorithms that people might use in their thought. Kaj's sequence lives mostly on a mid-tier level: "imagine a system that works like this", and only dips into the neuroscience enough to give you a sense that someone has in fact looked into the neural plausibility of the idea. I think this was a good decision, as most of the interesting ideas come at the mid-tier functional level ("If you think of your mind as composed of subagents communicating through a global workspace, you'd expect ABC behavior").

This post details some of the lower level brain anatomy I've been investigating in an attempt to clear up some confusions I've had from thinking a lot about Global Workspace Theory and how it relates to consciousness and Predictive Processing. Specifically, it looks at the neural basis for attention mechanisms, and the neural basis for the Global Workspace. Most of the details come from this paper, and this paper.

The Tangential Intracortical Network (TIN) (i.e. The Global Neuronal Workspace?)

People like Jeff Hawkins assert that every part of the neocortex is running the same algorithm. Even people that don't go as far as him note that there's a lot of uniformity in the neocortex (it looks like it's mostly composed of tons of cortical columns). 

Most of the connections in the cortex go from one part of a column to another part of the same column. There's also plenty of cortex to [non cortex part of the brain] connections. Most of the intracortical connections are part of a big web of mid-range fibers that span the whole neocortex, called the Tangential Intracortical Network (TIN) by Baars and Newman. They point to it as a plausible physical basis for the GNW. We'll run with that.

Functionally, the global workspace is an area that disparate parts of the cortex can all compete to put a value on. This competition is winner-takes-all, and only one value can be on the network at a time. Once a value is on the network, the rest of the cortex is able to read the value, thus serving as a temporary "global state", hence the name.

I'm not exactly sure how the TIN implements this functionality, but I'm imagining it as resulting from fairly mechanistic network dynamics. Something like:

  • If multiple areas of cortex are active, they'll all automatically be sending signals on the TIN.
  • All the signals will briefly propagate, but because of [network dynamics magic] even slight differences in signal strength or random chance will lead to one overwhelming the others and taking over the entire network. 
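To make the "[network dynamics magic]" slightly more concrete, here is a toy winner-take-all sketch (purely illustrative, not a claim about actual TIN dynamics): each unit excites itself and inhibits the others, so small initial differences snowball until one signal owns the network.

```python
def winner_take_all(activations, self_excite=1.1, inhibit=0.2, steps=50):
    """Toy competition: each unit boosts itself and suppresses the others.

    Small initial advantages compound until one unit dominates the network.
    """
    a = list(activations)
    for _ in range(steps):
        total = sum(a)
        # Self-excitation minus inhibition from everyone else, clamped at zero.
        a = [max(0.0, self_excite * x - inhibit * (total - x)) for x in a]
        norm = sum(a) or 1.0
        a = [x / norm for x in a]  # keep overall activity bounded
    return a

# Three cortical regions compete; the slightly stronger signal takes over.
print(winner_take_all([0.32, 0.35, 0.33]))
```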

The important aspect of this to me is the mechanistic nature of the competition. A winner is not "chosen"; one signal simply beats out the others. Though to be fair, when we get to the thalamus, we'll find that the thalamus has a lot of connections to the cortex that seem capable of signal boosting a chunk of the cortex, allowing it to dominate and become the contents of the GNW. This creates an indirect route through which the contents of the GNW can be manipulated.

Basal Ganglia (BG) and "Action" Selection

The BG is pretty solidly understood to be central to "action" selection. The air quotes are because the BG also seems to do a selection operation on inputs from places that aren't motor regions (like the prefrontal cortex). So it's capable of doing a winner-takes-all selection of various abstract and concrete "cognitive actions".

There are a few things that make competition on the BG different from competition on the GNW. First, the BG has several different selection channels that can act in parallel. There are at least five different loops that all follow the same pattern: the cortex projects onto the BG, which uses the thalamus to give a go-ahead to the cortex.


Second difference is that action selection in the BG makes use of previously learned rewards. It basically seems to be doing the evidence accumulation that Kaj outlines in Subagents, neural Turing Machines, thought selection, and blindspots. Multiple subsystems (chunks of cortex) put their plans on the BG. The first option whose accumulated expected reward exceeds some threshold is chosen. Compare this to the mechanistic network-dynamical magic of the GNW.
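A toy sketch of that accumulate-to-threshold race (illustrative only; the drift rates, noise model, and threshold are made-up stand-ins for learned reward signals):

```python
import random

def accumulator_race(drift_rates, threshold=10.0, seed=0):
    """Race of noisy evidence accumulators: first to cross threshold wins.

    drift_rates: expected reward signal per step for each candidate action.
    Returns (winning index, number of steps taken).
    """
    rng = random.Random(seed)
    totals = [0.0] * len(drift_rates)
    steps = 0
    while True:
        steps += 1
        for i, drift in enumerate(drift_rates):
            totals[i] += drift + rng.gauss(0, 1)  # evidence plus noise
        best = max(range(len(totals)), key=lambda i: totals[i])
        if totals[best] >= threshold:
            return best, steps

# An action with a clear expected-reward edge wins quickly; close races take longer.
winner, steps = accumulator_race([0.2, 1.0, 0.3])
print(winner, steps)
```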

Third difference is what happens after the "selection" occurs. In the GNW, being "selected" just means you are the signal currently dominating the GNW. This results in the rest of the cortex being able to use your value as an input. With the BG, there seem to be two possible results: either the thalamus is used to boost you into taking over the GNW (like what happens when a production rule fires in the neural Turing machine model), or the BG can use the thalamus to route your plan to another part of the cortex. This seems to be what happens with motor actions. High-level action plans made in the frontal cortex are approved by the BG and routed to the motor cortex, which creates the implementation details with the help of the cerebellum.

The difference between competition on the BG and competition on the GNW also accounts for one discrepancy Kaj mentions:

There seems to me to be a conceptual difference between the kinds of actions that change the contents of consciousness, and the kinds of actions which accumulate evidence over many items in consciousness (such as iterative memories of snacks). Zylberberg et al. talk about a “winner-take-all race” to trigger a production rule, which to me implies that the evidence accumulated in favor of each production rule is cleared each time that the contents of consciousness is changed. This is seemingly incompatible with accumulating evidence over many consciousness-moments, so postulating a two-level distinction between accumulators seems like a straightforward way of resolving the issue.

Something being on the GNW can boost evidence accumulation at the basal ganglia, which is maintained across changes in the contents of GNW. 

Understanding the role of the BG was big for me, because it helped make a lot more sense of where you do and don't expect to find bottlenecks. The BG can be choosing and routing actions that are being proposed by the cortex without having to wait to use the GNW. If there's a clear and obvious winner, the BG just chooses the right action and sends it along. It's only going to be novel situations when one action doesn't have a clear expected reward edge, and that's exactly when you'd expect someone's conscious attention to be on high, searching for any sliver of information that could push you towards a decisive action!

Also, don't forget that the GNW can only do one thing at a time, whereas the BG has multiple selection channels. 

The Thalamus: router/central hub

Though the TIN is supposed to be the network that is the GNW, the thalamus is what allows for the guiding and management of attention. There are three attention-esque things that the thalamus seems to do:

  1. Allow for sensory gating, controlling what sense data even makes it to the cortex for higher level processing.
  2. Allow for affecting the contents of the global neuronal workspace.
  3. Allow for routing information between disparate chunks of cortex.

The following is a brief overview of the thalamus that will help us build up to understanding its role in these three functions.

The thalamus is a central hub that all cortex-bound sense data has to pass through, making it a prime suspect for some sort of attentional control system. An important note: though the cortex is where most of the "intelligent" processing of data happens, it's not the only place that processes sense data. All sensory channels also connect directly to the brain stem. For vision, the optic nerve is connected to the thalamus (specifically the lateral geniculate nucleus) and also connected to the midbrain (specifically the superior colliculus), whereas with audition all data goes hindbrain -> midbrain -> thalamus.

This is important for thinking about voluntary and involuntary control of attention and brain processing; we're going to be talking about attention mechanisms that act on the thalamus, which means these mechanisms hold no sway over what data the brainstem receives. One prediction I'd make from this: we see experiments where inattention leads people to miss huge changes in their environment. But the things that change in those experiments are visual features that are processed in the cortex. If you had a small black dot scuttling across the screen (what your brainstem uses to trigger the "AAAH SPIDERS" reflex), I'd bet people would still have a startle reflex even if they weren't paying attention.

Here's a picture of the thalamus, split into its various parts (the names aren't super important for the post; I just like visuals).

It's common to split the thalamus into "first-order" and "high-order" sections. The first-order parts (also called relay sections) don't interconnect that much, and mostly just shuttle their sensory data off to the cortex. The lateral geniculate nucleus routes most of the visual data, the medial geniculate nucleus routes most of the auditory data, and other parts do other things. I think of them as mostly inert pipes that just transmit whatever is coming from their fixed input location. The way that these relays connect to the cortex is highly organized; neurons that are close together in lateral geniculate nucleus space are close together in retina space and get mapped close together in visual cortex space.

In contrast, the higher-order nuclei are much more interconnected, projecting out of the thalamus in more diffuse patterns, and receiving their inputs from various chunks of cortex as opposed to from sensory channels. Some of these are called association nuclei and seem to allow for routing info from one chunk of cortex to another, and others are called nonspecific nuclei and project out to the cortex in a very diffuse manner. The former seem to be important for cortex to cortex routing, and the latter seems to be important for influencing the global neuronal workspace.

Cortex to cortex routing is used in the execution of motor plans that we talked about with the BG. The diffuse connections to the cortex are used to signal boost things onto the GNW.

For every connection taking data from the thalamus to the cortex, there are several reverse connections from the cortex back to the thalamus. This sort of re-entrant feedback is what makes advanced control loops possible.

The Thalamic Reticular Nucleus (TRN): the gates of the thalamus

The thalamus has a "shell" around it called the thalamic reticular nucleus. It's composed of a web of inhibitory neurons, and all of the thalamus's outgoing axons pass through it. If a chunk of the TRN gets activated, it will block outgoing signals. You can think of the TRN as being composed of a bunch of little gates that can all be triggered to block outgoing data. 

Since neurons spike over time (as opposed to logic gates, which maintain ON or OFF), these gates are controlled by maintaining standing waves across the TRN. Baars and Newman describe it as being capable of acting as a Fourier filter, selectively blocking outgoing signals that don't match the frequency of the filtering standing wave. This is really cool because it allows a lot more fine-grained control than "don't let visual data in". The parts of the brain that control the TRN can learn what sorts of oscillations block what sorts of sense data. You won't get filtering at the highest conceptual levels ("block incoming visuals of bats"), but you can get more sophisticated than simple topographic blocking like "ignore everything in my peripherals".
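As a purely illustrative toy (nothing like real TRN physiology), frequency-selective gating can be sketched as blocking any signal that resonates with an inhibitory standing wave:

```python
import math

def correlation(a, b):
    """Normalized dot product of two equal-length signals."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def gate(signal, standing_wave, block_threshold=0.5):
    """Block the signal if it resonates with the inhibitory standing wave."""
    return None if abs(correlation(signal, standing_wave)) > block_threshold else signal

# Two seconds of signal sampled at 100 Hz.
ts = [t / 100 for t in range(200)]
wave_10hz = [math.sin(2 * math.pi * 10 * t) for t in ts]
wave_3hz = [math.sin(2 * math.pi * 3 * t) for t in ts]

# A 10 Hz standing wave across the gate blocks matching traffic and passes the rest.
print(gate(wave_10hz, wave_10hz) is None)  # blocked
print(gate(wave_3hz, wave_10hz) is None)   # passes through
```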

There are at least three areas of the brain that plug into the TRN that control how it filters thalamic output. 

  • Prefrontal Cortex (PFC): responsible for the "executive" attention; attending to something because you want to, noticing stuff that's relevant to your goals, etc.
  • Midbrain reticular formation (MRF): responsible for attending to novel and dangerous stuff.
  • Posterior Cortex (PC): responsible for attending to stimuli similar to what you were just attending to, creating a sort of recency bias. 

All of these are capable of exerting control over the TRN and the thalamus. Sensory gating is what happens when the TRN is filtering out certain sense data, preventing it from ever reaching the cortex. This is very different from GNW attention, where various data is all being processed in the cortex, but only one value is being held in attention at any given moment. 

Once again, just like we split apart "selection" into the BG and the GNW, we can split the phenomena of "attending to something" into having it on the GNW, and sensory gating.

I'd expect GNW attention to allow for things like the cocktail party effect, or being completely zoned out while driving yet, upon being prompted, being able to recall the last few seconds of detail. Basically, any situation where you weren't paying attention to something, but when you do, you find details that imply you must have already been processing the situation.

Sensory gating seems to correspond to having your attention broadened or narrowed. Being intensely focused on a math problem and then jumping out of your skin when your roommate taps you on the shoulder; putting down your phone and feeling like you've suddenly been flooded with all this space. A wilderness first aid instructor I know who spent a lot of time coon hunting as a kid told me a story about a similar phenomenon in dogs. When they get really into a chase, their senses "shut off" in order from least to most important as they get more and more aroused; first the hearing goes, then vision, leaving them navigating only by smell.


I'm confident in the broad strokes of the neuroscience here, but I am certainly wrong about a large number of the details. When writing this I was really faced with the magnitude of everything I just don't understand about the brain. I often questioned why I was investigating the neuroscience if I was only going to do an amateur job, when what I was really interested in was the more abstract implications.

Despite all that, this has been really helpful for getting a sense of better questions to ask. 

Understanding the difference between competition on the GNW and competition on the BG was really useful. Understanding the difference between attention via sensory gating and attention via the GNW was also really useful. If you've been following along with Kaj's sequence, I think these are the two main takeaways.


Newman, James, Bernard J. Baars, and Sung-Bae Cho. "A neural global workspace model for conscious attention." Neural Networks 10.7 (1997): 1195-1206.

Newman, James, and Bernard J. Baars. "A neural attentional model for access to consciousness: A global workspace perspective." Concepts in Neuroscience 4.2 (1993): 255-290.

"Models of Thalamocortical System." Scholarpedia.


Fight the Power

LessWrong.com news - June 22, 2020 - 05:19
Published on June 22, 2020 2:19 AM GMT

On power grabs, slogans, and how the collapse of authority leads to anxiety which leads to anger which leads to submission which leads to whatever is happening right now.

Cross-posted from Putanumonit.

Gurri’s World

…the elites that ran our institutions had the authority to provide information, frame it and explain the world. That’s completely gone, and with it there’s been a bleeding away of expert authority, and a public has been created that’s essentially very angry…

Martin Gurri, author of “The Revolt of the Public and the Crisis of Authority”

Martin Gurri wrote his seminal book in 2014, to explain the public revolts against authority in the early part of the 21st century, such as the Arab Spring. But as the century wears on, the book becomes even more timely than when it was written, the events proceeding exactly as he described them. Trump, Brexit, the fall of old media, culture war in academia — all are revolts by a newly informed public against established authorities and gatekeepers of information.

In Martin Gurri’s world, it is no surprise that we are seeing protests and revolts on a global scale right on the heels of the first wave of a global pandemic. COVID exposed what few claims to credibility the authorities still had. The government, the media, acronym bureaucracies like the CDC and FDA — they all abdicated their responsibilities, lied at every turn, and failed to do much of anything about a disease that has been killing 1,000 Americans a day for months.

The collapse of authority, even for those who still cling to one of the institutions mentioned above, provokes anxiety and anger. People intuitively seek epistemic authority, to tell them what is true and false. They seek moral authority, to tell them what is good and evil. And they want someone to blame, someone to take their anxiety and anger out on. It can’t be the virus. It has to be a person, and they have to be within reach of a fist.

The Grab

Humans did not evolve to tell very well what is true or false, or to think clearly about good and evil. But we evolved to be good at tribalism, to divide into “us” and “them”, to be on the winning side when “us” gain an edge in power and jump on “them” to take their stuff, their status, or their lives. We do this subconsciously, even as our brains confirm our moral righteousness to ourselves.

Coordinated grabs for status, wealth, and power need a coordination mechanism. This mechanism consists of a shared language and ideology, anything from common jokes to a holy text. Then an opportune moment arrives, often when the public is galvanized by a mass movement. The grabbers strike under the cover of the movement’s worthy cause.

A clear demonstration of this happened during the Arab Spring itself, in Egypt from 2011 to 2014. The revolution started with a protest organized mostly by student activists and liberal youth movements. They protested curtailed freedoms, unchecked presidential powers, and police brutality. Several months and many dead protesters later, the president of Egypt was deposed, only to be replaced by the head of the Muslim Brotherhood, Morsi. Morsi immediately proceeded to curtail freedoms further, remove checks on his power, and send the police to brutalize protesters.

While the students faced the police in Tahrir Square, the Brotherhood coordinated a power grab. Now Egypt has a different dictator still, and the young activists are wondering what happened to their cause that seemed to enjoy such popular support.

A lot of the Egyptians who brought Morsi and then Al-Sisi to power did not do so out of a desire to see their country sink further into despotism. They just got swept up in the angry revolt, and then intuitively supported the side that seemed poised to gain power. Anger leads to chaos, chaos leads to fear, fear leads to submission to whomever is strongest, whoever grabs for power most forcefully.

Protests in Tahrir Square in Cairo

Everything so far was just setting the stage, outlining a general model of how revolts erupt in anger and how this anger is repurposed for grabbing power and settling scores with the outgroup. This model informs my argument below, an argument about free thought and expression in an adversarial environment. This is not an argument about protests, black lives, or police reform. If you’re looking for hot takes on these topics, I don’t have any.

The Fear

George Floyd was killed, protests erupted, “Black lives matter!” rings from Minneapolis to New Zealand. You want to express something too, write something, say something. What should you say?

First, ask yourself: What can’t I say, even if it were true? What am I afraid to say, and in front of whom am I afraid to say it? What do I feel I have to say? What could be said yesterday that is scary to say today?

This fear is not a sign of truth. You cannot reason back to reality from human emotions. Galileo was right because he carefully watched the stars, not because he said things that made the church angry.

This fear is a sign that someone is gaining power over you, aiming a weapon your way. The things you can’t or must say are the battle lines being drawn. Those you’re afraid to speak in front of are those preparing to grab and strike. The accelerating speed of change in what is forbidden or mandatory is a sign that war is looming.

Consider not saying things that will fuel the war and give power to those you’re afraid of.

One way to give them power is to make them your enemy directly, to focus your energy on fighting them on their own terms. Direct enemies are very useful to those people, and you serve them by becoming one.

Some people I know have been possessed by a strong urge recently to quote black crime statistics. There are two groups of people who talk about black crime statistics: criminal justice activists, and racists. If you only start talking about black crime statistics when a white cop kills a black civilian you are probably not a criminal justice activist. Consider not repeating the talking points of racists if you’re worried about being called a racist.

The other way to give power to those grabbing for it is to mindlessly repeat their slogans. Slogans are rarely innocuous. They often start reasonable and then escalate to absurdity. Once someone gets you to chant absurdities you start believing them, and once you believe in absurdities you’re ready to commit atrocities.

You probably remember this strange thing that happened at Trump rallies throughout 2016. In the early days, Trumpers were chanting “Make America great again!” Even if you quibble with the “again” part, this isn’t terribly objectionable. But in short order they started chanting “Lock her up!” instead, a direct threat to Trump’s opponent and to democratic norms. A person who chanted “Lock her up!” in public, even if they didn’t literally mean it, cannot go back and vote for Hillary no matter what information comes out about the candidates. They have enlisted in an army, and declared Democrats to be their enemy.

This happens a lot to movement slogans — they start off positive and uniting, then suddenly find an enemy to turn on. Solidarity turns into hate. “Black lives matter” turns into “All cops are bastards”. “We stand against hate” turns into “Abolish the police”.

Now, perhaps it is not absurd to abolish the police in some way. I’m not expressing an opinion on the matter here. But a lot of people are repeating that slogan not because they thought independently about a world without armed law enforcement but because they started chanting with a crowd and that’s what the crowd ended up chanting. This is made clear by the fact that many of those saying “abolish the police” are hastening to clarify that they do not literally mean that the police should be abolished. And yet they’re chanting it.

Again, if you have formed an independent opinion that the police should be abolished then you’re not feeding those who are building the war machine. You can discuss your plans for police dissolution, and realize those plans in a democratic way if you convince enough people through argument. Hopefully, it means that if the slogan switches tomorrow to “kill all cops” you would not switch with them, and will retain your independence. But those who repeat a slogan they do not really endorse and cannot defend with argument have given up the power of their independent thought to those who write the slogans.

“It’s just a slogan” is the same as “I’m just following orders”. It makes you complicit. It will not protect you.

The War

When the enemy is marked and the soldiers are enlisted under banners of absurd slogans, the war will come. Those who are in the war to grab what they can for themselves will grab a lot. Those who are in it for a higher cause will find that a lot is lost and little is gained.

Primarily this will happen because the ostensible enemy is far away and hard to strike at. Those who grab for power and status will grab it in their institutions, their workplaces, their social networks. But in our world of bubbles people don’t often share those with their ideological enemies. Instead they will strike at those near them, at whoever fails to learn the latest slogan and repeat it quickly and loudly. Or at those who fail at nimbly shifting the blame to their former friend or colleague, to let her take the hit in their stead.

This is already happening. People are getting mobbed, fired, and cancelled, often by their ideological allies. Relationships and friendships built over years are destroyed in seconds. It will be terrible and painful for all involved, even the people inflicting most of the harm. As I mentioned, a lot of this violence especially in the social sphere is not motivated by conscious malevolence. People will experience anger, anxiety, fear, and then submit to those with power unless they protect their independence.

The Cause

While this war rages in workplaces and on Facebook, the poor remain poor, the oppressed remain oppressed, and the corporate PR departments are going brrr to feed the flames and their own profits. This is not an unfortunate collateral of fighting for the cause. It directly undermines the cause.

What’s wrong with people being afraid to say something racist, you may ask? When people are threatened, whether they’re racist or not, their energy becomes entirely dedicated to self preservation, not to fighting for racial equity or police reform. If people call you racist, that accusation will not disappear the moment minorities achieve equal treatment by the law and law enforcement. Those who feel threatened will shout all the right slogans publicly but spend all their time privately sabotaging others around them. Cancel first, lest ye be cancelled.

People who were scared to speak voted for Trump, and will do so again.

If you believe in a cause, your main tools must be truth and reason. What do you know and how do you know it? How can you help and how will you check that it’s helping? Did the people giving confident orders ask themselves those questions or are they merely asserting dominance over you?

Those who try to come up with their own plans to reform the police are interested in feedback and discussion. They want dialogue and inclusion. Those who are using anger at the police to grab power are interested in obedience and submission. They want outrage and exclusion.

So what should you say? Whatever you actually believe in and thought through.

If you care about racial equity and police reform and say what you think will help, you are making the conversation smarter and more effective. If you say what you think will cover your ass, the conversation is making you stupid and vulnerable. If you protest because you believe in the protest, you are making the protest principled. If you protest so as not to be seen not protesting, the protests are making you corrupt. It takes some courage to think for yourself, but what do you think happens when everyone betrays that courage?

And if you’re not sure what you believe, it’s OK not to say anything. Even the police allow you to remain silent under arrest, informing you that whatever you say can be used against you. Think for yourself, or shut up.


Do Women Like Assholes?

LessWrong.com news - June 22, 2020 - 05:14
Published on June 22, 2020 2:14 AM GMT

Cross-posted from Putanumonit where a lively discussion is already going.

Thank you to Ben Pace for helping migrate this post to LessWrong.

Insofar as Putanumonit promotes a normative stance, it boils down to the following:

  1. Be savvy about statistics and research
  2. Be nice and cooperative
  3. Saying “sex is cool, but have you considered…” is cool, but have you considered having sex?

These are not unrelated. Being savvy with math can help your romantic life. Being nice and cooperative can really help your romantic life. At least, that’s what I believe based on my experience.

Some readers enjoy my posts and profitably employ my philosophy and then write me lovely messages about it. They are the reason why I get up in the morning and put so much time and effort into this blog.

But some readers don’t enjoy my posts. They tell me that I’m a fool or a liar, that women date jerks and disdain nice guys, that the gender wars are real and must be fought ruthlessly, that all this talk about win-win romance and compatible goals is a blue pill conspiracy to oppress men.

I’ve been mostly ignoring and mocking these latter readers. But recently, they started posting links to research papers supposedly proving their point. And so, in the name of stance #1, I got up in the morning and put way too much time and effort into my own research project to investigate: do assholes really do better romantically, or is there hope for men and women to get along after all?

Literature Review

The literature sucks. That’s it, that’s the review.

Granted, this question is hard to measure empirically. It’s hard to define who is an asshole, let alone to identify them, let alone to measure how well they do with women in the long run — I struggled with those issues in my own research. But that’s far from the only problem.

The study I was sent most often is The Dark Triad personality: Attractiveness to women by Carter et al. (2013). The dark triad is the combination of narcissism (entitlement, grandiose self-image), psychopathy (callousness, lack of empathy), and Machiavellianism (insincere manipulation). Attractiveness was measured by asking women in an online questionnaire to read descriptions of men and say how attractive they found them. The women were 128 college undergrads in psychology.

The result of the study was a positive but statistically insignificant boost to the attractiveness of the dark triad descriptions. The DT guys were rated significantly higher on extraversion (which is attractive) and significantly lower on neuroticism (which isn’t). This would seem to imply that the dark triad isn’t attractive in itself, but only in what it signals about extraversion and neuroticism. Yet, somehow, the authors threw those three into a structural equation model (while conveniently ignoring other confounders like agreeableness) and squeezed out the requisite p-value to get published.

SEMs are a legitimate tool of social science research, but they’re impossible to replicate without access to the data and are rife with opportunities for multiplicity and p-hacking. I don’t know if this study shows anything at all about the dark triad and attractiveness. Even if it does, I’m not sure for whom it shows this effect.

Here are a couple of other studies I looked at:

Basically, all the studies in this field use 19-year-old girls on college campuses. Not only are they WEIRD, which is a problem for a lot of psychology research, but 19-year-old women in college are in an extreme and unusual mating situation.

With this, I stopped searching for more papers based on hungover college students with one exception I’ll get to later. I instead listened to six hours of dating podcasts with Geoffrey Miller, David Buss, and Tucker Max. Miller wrote books about how sexual selection shaped our evolution. Buss wrote books about how evolution shapes our sexual selections. Max wrote books about being an asshole and getting laid. If anyone would know whether women prefer jerks and why, it’s those guys.

Everyone who believes that women like jerks is convinced that they know why, but of course they each have their own story. I’ve compiled a laundry list of hypotheses on the subject, based on the literature, the experts, and people I know.

Why might women prefer assholes?

Hypothesis 1 — Signal of extraversion and assertiveness

Women strongly prefer men who are extraverted and assertive to those who are socially passive. It could be that social dominance leads to social and professional success for the man making them a desirable partner, or that outgoing and decisive men simply make better lovers.

1a Being an asshole is, in fact, positively correlated with assertiveness and extraversion and is thus a signal of those traits.

1b Being an asshole isn’t correlated but is mistaken for assertiveness and extraversion. For example, someone may strive to be the center of attention because they are socially skilled and popular or because they are narcissistic, and it’s hard to immediately tell which is which.

1c Being an asshole is a signal of high status or skills, because a loser could not get away with being a jerk. A weak and unpopular man would get laughed at for narcissistic delusions or beaten up for acting like a psychopath. Thus, exhibiting dark triad traits is a signal that one is not a loser.

1d A corollary to the “asshole as signal” theories is that women will fall for assholes less as they grow in experience and wisdom. This is the main reason why studying this on 19-year-olds may be useless: women at 19 don’t have the experience to read men’s status and personality well. Moreover, men sharing a college campus at 19 are very undifferentiated, unlike later in life when women can look at stronger signals like career success.

When I was 19 I was nice and considerate and didn’t get laid a lot with 19-year-olds. Now that I’m 33 I’m trying to be nice and considerate and I’m happily married and having threesomes with smart and lovely women my age. A few of the women I asked admitted to falling for jerks who mistreated them while in college, and how they learned from that experience to recognize assholes and avoid them.

Hypothesis 2 — Short-term mating strategy

Assholes aggressively seek out short-term mating: more casual sex, less lasting relationships. They are more successful at it mostly because of their single-minded pursuit of it. The downsides of dating assholes only emerge in the long term, when the Machiavellian’s lies can’t be sustained or the narcissist’s volatile self-esteem swings from peak to nadir.

2a Women don’t like assholes but sleep with them because of selection effects — only psychopaths approach women aggressively at bars and clubs and it works because of the law of large numbers. This would be kinda funny if true because the sort of guy who posts links to research papers on blogs is almost certainly the sort of guy who will never do well in a bar or nightclub no matter what personality he adopts.

I’m the sort of guy who writes research posts, and none of the women I ever dated or slept with were met in a bar or nightclub. I mostly get dates through friends or my online presence, two areas that I built up through years of long-term-oriented effort. In the club nothing matters beyond the next 30 seconds.

2b Some women just want a guy for short-term mating and are choosing the assholes consciously because they know these men will not want to hang around. What kind of women?

One trope that comes up often is women who have bad relationships with their fathers date jerks. My evo psych take on it is that in the absence of a role model for good fatherhood, women take the good genes in the good genes-good father tradeoff. Tucker Max’s take is that “some girls need to work through the trauma of their daddy issues on some asshole’s dick, and there’s nothing wrong with that”. Either way, I regret not asking about it on the survey.

Hypothesis 3 — Being an asshole is just better

The final option is that being an asshole is not a signal or a correlate of anything, it just works better for romantic and sexual success.

3a Assholes successfully manipulate women into sleeping with them and staying with them with their dark skills.

3b No manipulation needed — women just consciously prefer to date jerks and be mistreated.

Hypothesis 4 — Women actually don’t prefer assholes

But some people think that they do because:

4a — They’re misogynists and want an excuse to be mean to women.

4b — Instead of simply being nice they’re being Nice Guys (TM) who objectify women and treat relationships as transactional.

4c — They confuse being high status among men (which is obviously attractive) with being high status relative to one’s partner. The latter would imply that belittling (negging) and undermining your partner to lower their status would be a successful strategy.

4d — They’re neophyte PUAs who measure success by getting numbers at bars, and scared women readily give a fake number to pushy psychopaths.

4e — They assume that guys with different norms around flirting (e.g., working-class people, or the French) are assholes, when in fact they’re just more direct (which women like).

4f — They derive the causation backwards, judging men who talk about their own romantic success to be assholes because they talk about it (to less successful men).

The hypotheses in this group are outside the scope of this research, but they’re worth mentioning. Even if women don’t prefer assholes at all, there are many reasons why this trope could flourish.

Study Setup


My survey on personality and relationships received 1,220 responses. Thank you to everyone who filled it out, and huge thanks to everyone who shared, retweeted, reddited, and told their mom. Thanks for nothing to the 8 people whose responses I threw out for being nonsensical and fucking up the attention check questions. This is a huge sample, larger than in any academic paper I looked at, and quite varied. I’m really grateful.

The median age is 29, with 90% of respondents between the ages of 21-45. We’re talking about adults who are looking to date, not college freshmen looking for course credit.

801 of the respondents are male and straight. 256 are female and either straight or bi, i.e. the mating target for straight men. Given the core question, the bulk of my study focuses on these two groups and I will mostly use men and women as shorthand to refer to them. I’ll discuss some findings that relate to everyone else separately.

Personality Variables

The survey estimated 6 personality traits using 4 questions each (you can review all the questions on the survey itself). Narcissism, psychopathy, and Machiavellianism with questions taken from this paper, agreeableness and extraversion with questions from here, assertiveness from here.

The first three are collectively referred to as the Dark Triad. By subtracting Dark Triad from agreeableness I get a measure of niceness. Henceforth, nice guys are those high in agreeableness and low on the DT traits, while assholes are the opposite.

Assertiveness is often considered a sub-trait of extraversion, and the two showed up very similarly on the survey. They correlate highly with each other and have the same correlations with other traits. Given this, I sometimes combined both into a single measure I called social dominance, for lack of a better term. Dominant people are decisive, talkative, and like attention. Passive people are the opposite.

I also asked people to rate themselves on physical attractiveness (hotness), attractive talents, and popularity. My responders are slightly hotter than average according to themselves, and hotter the more cisgendered they are without any difference between men and women.

Relationships Variables

I asked people for their lifetime number of sexual partners, current relationship status, and percent of their adult life that they’ve been in a relationship. I also asked what they’re looking for, which I operationalized as a numeric scale for short-term orientation: 3 for those looking for sex, 2 for casual dating, 1 for serious relationships, and 0 for those not looking for any more partners (14% of this latter group are single).

Here is the correlation matrix for all the raw variables measured; it does not look particularly different when broken down by gender.

We see some interesting things right away. Narcissism is correlated with attractive traits, but so is agreeableness. As people get older they become less narcissistic and more assertive. Extraversion is great for both friends and lovers. Of course, many of these traits confound each other so we’ll use regressions and controls to tease out the effects.

The direct measure of short-term mating success is the number of lifetime partners, but we’d expect that to correlate with age in a non-linear fashion. To control for age I pulled data from the giant National Survey of Family Growth to derive the average number of partners for each age bracket. This is shown in the black line in the chart below (with the dots being my actual respondents), going up from 1 to 9 partners over people’s teens and twenties and topping out at 12 partners. Note the log scale of the Y-axis, modified to include those reporting 0 partners.

On the NSFG men report a lot more partners than women (15 vs. 8 by age 40), as common wisdom would suggest. In my survey women actually reported more partners (12 vs. 10), especially bi women. Gay men reported slightly fewer partners (but they are 6 years younger on average than straight men in my sample), lesbians the least, queers the most (despite lower self-rated hotness). By and large, my data seems at least as trustworthy as the NSFG.

My ultimate metric for short-term mating success is log₂((N partners + 1) / expected N partners for age). The log scale makes intuitive sense, since finding your first partner is about as hard as finding the next two, or going from N to 2N. It also prevents the metric from being overly skewed by outliers reporting hundreds of partners. So a virgin at 32 (expected N is 10) scores -3.3, while someone with 99 partners at that age scores +3.3 on short-term mating.
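As a sanity check, here is a minimal sketch of the short-term score (my reconstruction, not the survey's actual code; the base-2 log is inferred from the worked examples in the text):

```python
import math

def short_term_score(n_partners: int, expected_for_age: float) -> float:
    """log2((N + 1) / expected N for age): 0 means exactly age-typical."""
    return math.log2((n_partners + 1) / expected_for_age)

# The post's examples: a virgin at 32 (expected N = 10) and
# someone with 99 partners at the same age.
print(round(short_term_score(0, 10), 1))   # -3.3
print(round(short_term_score(99, 10), 1))  # 3.3
```

The +1 inside the log keeps the score defined for people reporting zero partners, which is why the virgin scores a finite -3.3 rather than negative infinity.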

For long-term success, I wanted to combine the questions on current relationship status and overall percent of time in relationships. Looking at a bunch of data like this, it seems that married people should expect 20 more years of marriage and single people should expect to stay unmarried for another 12. I decided to err on the conservative side and just add 15 years of “being in a serious relationship” to those currently in one for the purpose of calculating % of time romantically engaged. So a 33-year-old who spent half the time since age 18 in a relationship (7.5 years) but is now in a serious one will have that metric upgraded to 75%, since I assume their next 15 years will be in a relationship as well.
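The long-term metric can be sketched like this (again my reconstruction of the calculation described above, not the survey's actual code):

```python
def long_term_score(age: int, years_in_relationship: float,
                    currently_partnered: bool) -> float:
    """Fraction of adult life (since 18) in a serious relationship,
    crediting anyone currently partnered with 15 projected future years
    in both the numerator and the denominator."""
    adult_years = age - 18
    bonus = 15 if currently_partnered else 0
    return (years_in_relationship + bonus) / (adult_years + bonus)

# The post's example: 33 years old, 7.5 of 15 adult years partnered,
# currently in a serious relationship -> (7.5 + 15) / (15 + 15) = 0.75.
print(long_term_score(33, 7.5, True))   # 0.75
print(long_term_score(33, 7.5, False))  # 0.5
```

Note that the 15-year bonus is added to the denominator as well, so the same person scores 75% if currently partnered versus 50% if single.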

Regressions and reporting

We’re 2,400 words in and I haven’t told you what the mainline finding is or mentioned p-values. That’s because p-values are a perversion of science, and reporting headline results out of context is a perversion of science reporting.

Instead, I’ll post a lot of regression tables (which you can derive the p-value from if you’re kinky), a lot of colorful charts (all clickable for a larger version), and precise results like the 20% nicest men are slightly likelier to be virgins (13.5% of them) than the 20% least nice ones (11%). My goal is to showcase the data first, not to argue a particular narrative.

Results for Straight Men

Here’s the regression of short-term mating success on all the personality and attractiveness variables. All the variables except for age have been normalized to have the same sample standard deviation so that their coefficients can be compared directly.
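The normalization step can be sketched like this (a toy example with made-up data, not the survey itself): dividing each predictor by its sample standard deviation puts all coefficients in comparable "effect per one SD" units.

```python
import numpy as np

# Toy data: three predictors on very different scales (assumed, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([1.0, 5.0, 0.2])
y = X @ np.array([0.5, 0.1, 2.0]) + rng.normal(size=200)

# Divide each column by its sample SD so every predictor has SD = 1;
# regression coefficients then measure the effect of a one-SD change.
X_std = X / X.std(axis=0, ddof=1)
design = np.column_stack([np.ones(len(y)), X_std])  # intercept + predictors
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

print(np.allclose(X_std.std(axis=0, ddof=1), 1.0))  # True
```

Without this step, a predictor measured on a wide scale (like age in years) would mechanically get a smaller coefficient than one on a narrow scale, making the table below impossible to read at a glance.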

Age-Adjusted Number of Sexual Partners (on a log scale), R² = 0.24

Variable                               Coefficient (SD)
Narcissism                             -.09 (.05)
Psychopathy                             .11 (.06)
Machiavellianism                        .21 (.06)
Agreeableness                          -.03 (.07)
Assertiveness                           .15 (.06)
Extraversion                            .23 (.07)
Physical attractiveness (self-rated)    .23 (.05)
Attractive talents                      .14 (.06)
Popularity                              .27 (.06)

No big surprises here: men who are popular, good looking, and extraverted have more sexual partners. On the nice-asshole axis, assholes do have more partners mostly due to Machiavellianism. Let’s dig into this.

Hypothesis 1 – Asshole as signal

Narcissism and agreeableness are the strongest predictors of social dominance (sum of assertiveness and extraversion), accounting for 25% of the variance in it. You can see on the chart that the bright red dots (for socially dominant people) are concentrated towards the top right corner of those high in both agreeableness and narcissism.

These two traits are also correlated with popularity, but once we control for social dominance the effect of narcissism is cut in half while the effect of agreeableness remains.

Popularity, R² = 0.30

Variable             Coefficient (SD)
Narcissism            .11 (.04)
Psychopathy          -.08 (.04)
Machiavellianism       0
Agreeableness         .22 (.04)
Social Dominance      .36 (.04)

So narcissism tracks assertiveness and extraversion and is some signal of popularity (i.e. social status). Narcissism is also the only personality trait that positively predicts short-term orientation, i.e. reporting that you’re looking for sex or something casual and not a serious relationship (.13 coefficient with .04 SD). And yet, narcissists are not getting laid.

This matches the story I told in Go Fuck Someone: narcissists want to be fuckable more than they want to fuck. They put all their effort into preserving their image and status, while getting intimate involves vulnerability and making room for the other person’s story. Narcissism is also the only personality trait that strongly predicts caring about one’s partner being good looking — a trait that’s more important for making an impression on observers than for building relationships.

Agreeableness (measured as empathy, willingness to help, putting others at ease) is an even stronger predictor of social dominance and popularity, while Machiavellianism has no correlation with them and psychopaths are unpopular introverts. Yet Machiavellianism and psychopathy are the two asshole traits that actually contribute to getting laid. So insofar as being an asshole helps, it is not through signaling status or extraversion.

Hypothesis 2 – Assholes (and some women) just want to bang and ghost

As mentioned, narcissism is the only trait that predicts short-term orientation for single men. For women, short-term orientation is basically predicted only by age — older women want more serious relationships.

However, women are less short-term oriented in general. Despite being slightly younger in my sample, 65% of single women report looking for a serious relationship (55% of men) and only 7% are looking for mere sex (12% of men). As I mentioned when discussing gender ratios, this is not a huge difference but it’s important on the margin. For every two men looking for a one night stand (and those are likely the guys driving the number-of-partners metric), there is just one woman seeking the same.

62% of women who look for serious relationships answered that it’s very (5/5) important that their partners share their relationship goals. 45% of men don’t actually share those relationship goals, but would still like to bang those women.

Here are the four questions I used to assess Machiavellianism, which is a predictor of short-term mating success:

I have used deceit or lied to get my way

I tend to exploit others towards my own end

I have used flattery to get my way

I tend to manipulate others to get my way

Women don’t report seeking out assholes in any way — “nice and considerate” was a close second to “shares my goals” among the traits that are important to women in a partner, ahead of “happy and confident”, “physically attractive”, and “assertive and dominant”. This rules out hypothesis 3b (if women liked jerks, why would they lie about it?) and leads us to:

Hypothesis 3a — Fuckbois

My data, as well as the entirety of this horrible subreddit, seem to point to some number of Machiavellian dudes successfully manipulating women to get laid — for example, by lying about their relationship intentions. Machiavellianism (along with psychopathy) is in fact negatively correlated with caring about your partner sharing your relationship goals — Machiavellians only care about getting what they want themselves.

However, successful manipulation is not the only possibility. Machiavellianism and sexuality: on the moderating role of biological sex by McHoskey (2001) looks at the relationship between, well, Machiavellianism and sexuality. Machiavellianism correlates with psychopathy and extraversion (replicated in my data), sexual success, and also with promiscuity, curiosity, and excitement about sex. Machiavellians are also more likely to feign love, get someone drunk, and coerce someone into sex.

So there are three reasons why Machiavellians could be having more sexual partners:

  1. Coercion and manipulation.
  2. Correlation with extraversion, which gets you laid.
  3. Promiscuity and seeking out sex — if you seek you shall find.

If the first reason were the main one, we would expect Machiavellianism to correlate in particular with the number of partners but not with the longevity of relationships. Once the lies come to light, the Machiavellian fuckboi would have to move on to their next victim. We should see this as a negative relationship with long-term relationship success.

Serious Relationship Success (R² = 0.31)

  Variable                              Coefficient (SD)
  Narcissism                            0
  Psychopathy                           .03 (.06)
  Machiavellianism                      .10 (.06)
  Agreeableness                         .06 (.06)
  Assertiveness                         .11 (.06)
  Extraversion                          .02 (.07)
  Physical attractiveness (self-rated)  .18 (.05)
  Attractive talents                    .14 (.06)
  Popularity                            .09 (.06)
  Age                                   .05 (.005)
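For readers curious about the mechanics, here is a minimal sketch of the kind of regression behind a table like this, run on simulated data with a single predictor (the variable names and the simulated coefficient of .10 are placeholders mirroring the table; the real survey data is not public):

```python
import random
import math

random.seed(0)

# Simulate standardized survey scores where
# success = 0.10 * machiavellianism + noise,
# mirroring the weak positive coefficient in the table above.
n = 2000
mach = [random.gauss(0, 1) for _ in range(n)]
success = [0.10 * m + random.gauss(0, 1) for m in mach]

# Ordinary least squares for one predictor:
# beta = cov(x, y) / var(x), with standard error sqrt(sigma^2 / (n * var(x))).
mean_x = sum(mach) / n
mean_y = sum(success) / n
var_x = sum((x - mean_x) ** 2 for x in mach) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mach, success)) / n
beta = cov_xy / var_x

resid = [y - mean_y - beta * (x - mean_x) for x, y in zip(mach, success)]
sigma2 = sum(r ** 2 for r in resid) / (n - 2)
se = math.sqrt(sigma2 / (var_x * n))

print(f"beta = {beta:.2f} (SE {se:.3f})")
```

With a couple thousand respondents the standard error is around .02, which is why even a "weak but positive" coefficient like .10 can be comfortably distinguishable from zero in a sample this size.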

In fact, Machiavellianism has a weak but positive impact on serious relationship success. This still holds if we look at both components of long-term success separately, the percent of adult life spent in serious relationships and being in one right now. This could be an artifact of noise, but it’s likely that there’s at least some weak effect there which provides some evidence against the idea that the success of Machiavellians is entirely due to nefarious tactics.

The data also goes against the “signaling extraversion” hypothesis, since neither including nor removing extraversion from the regression has any effect on the coefficient of Machiavellianism. We are left with the story that Machiavellians are simply more promiscuous.

Machiavellians in my sample don’t show any unusual preferences for casual sex over serious relationships, although that’s not quite the same as promiscuity and excitement. They could just be more relationship-seeking overall, or they get a woman drunk for a one-night stand and then catch feelings by accident and end up a decade later married with three kids and a golden retriever while also cheating on the side. Many such cases, as they say.

Other than that, what’s the secret to finding a serious relationship? Be hot, be funny, be assertive, be patient.

30% of men below age 30 report never having been in a serious relationship, but only 2 out of the 128 men over the age of 40 report the same. A lot of my readers are right at the cusp of that age transition — I hope you don’t stop reading Putanumonit once you find girlfriends and wives!

Summary of Results for Straight Men

  1. Looks, popularity, and social dominance (assertiveness + extraversion) get you laid, with no single factor dominating the others.
  2. Machiavellianism predicts sexual and romantic success. It’s unclear if this is due to successful manipulation or simply seeking out sex and romance more.
  3. Narcissists want casual sex with hot partners and predictably fail to obtain it.
  4. Agreeableness beats psychopathy for both friends and romantic relationships.
  5. Women don’t seem to consciously seek out assholes.
  6. Insofar as assholes are successful, it has little to do with status and their success doesn’t diminish with age.
  7. There’s a huge variance in the number of sexual partners for men of all ages, but almost all men end up in romantic relationships in their thirties.
Other Results

Below is a grab bag of other results that showed up in the data. Some of them fit what I would have predicted and some were surprising, but take them all with a pinch of salt since they were not the original object of the study.

We seek partners like us

Attractive people care more about their partner’s attractiveness, nice people care about their partner being nice, assertive people care more about their partner being happy and confident (although they don’t care about their partner being assertive). All of those relationships are significant and hold for both men and women. This should serve as a word of caution for those looking to be assholes as a romantic strategy — you may end up dating assholes yourself.

You can imagine virtuous and vicious cycles as a result of this. I was always nice and considerate, and it didn’t work until I figured out how to filter for women who are themselves lovely and kind. Now my partners and I can all be nice to each other and enjoy life. If you start off being a jerk you attract jerks, and this further justifies being mean and perpetuating the cycle.

Attractiveness matters for women only in the short term

A woman’s self-rated attractiveness predicts her number of sexual partners, but not her success at being in serious romantic relationships. The latter is correlated with assertiveness and agreeableness, and of course with age. This matches the preferences reported by men: guys who look for casual sex care more about a partner’s looks than those who look for serious relationships.

Narcissism also correlates with women’s short-term mating success but not serious relationships. I talked about it when discussing women’s mating markets. Hot young women have their choice of short-term partners, and they don’t pay much of a penalty for narcissism or disagreeable political stances like #KillAllMen. But they can remain in the mindset that a relationship is something they deserve for who they are instead of something they have to build and compromise for. If that’s you:

Perhaps you are going on dates with lovely people but the dates aren’t going exactly according to the script you envisioned. Or the people who flirt and match with you are not quite what someone with your degrees and BMI and yoga skill deserves. In this case you should go back to self-development: fix your narcissism and figure out what value you actually provide to a romantic partner besides imagining that you raise their status through mere association.

How to tell if you’re in the latter category? If you get a lot of “I can’t believe a great guy/gal like you can’t find a girlfriend/boyfriend” from your friends, that’s a sign. Your friends saying that is not a compliment, it’s a mockery of your misguided self-focus.

The opposite is true for gay men

The only trait that contributes to short-term success for the 122 gay and bisexual men in my sample is agreeableness (.59 coefficient with .19 SD, p=.003 without correcting for multiplicity). The only trait that correlates with long-term success aside from age is hotness (.09 coefficient with .03 SD, p=.006). I have no theories about this result or much confidence in it despite the statistical significance.
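The caveat about multiplicity matters: when many subgroup tests are run, a few small p-values are expected by chance. A minimal sketch of a Holm-Bonferroni correction applied to the two p-values above, padded with some hypothetical p-values standing in for other tests run on the same subgroup (the extra values are illustrative, not from the survey):

```python
def holm_bonferroni(pvals, alpha=0.05):
    """Return which hypotheses survive a Holm-Bonferroni correction.

    Sort p-values ascending; the i-th smallest (0-indexed) is compared
    against alpha / (m - i). Reject until the first failure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvals[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # every larger p-value fails too
    return reject

# The two reported p-values, plus hypothetical ones for other
# tests on the same subgroup.
pvals = [0.003, 0.006, 0.04, 0.20, 0.55]
print(holm_bonferroni(pvals))  # [True, True, False, False, False]
```

Under this toy setup the two reported results would survive a correction across five tests; with many more comparisons in play they might not, which is one reason to hold the result loosely.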

Personality and gender

Cis men are more psychopathic, disagreeable, and assertive. Women (queer and cis) are more narcissistic. Queer men (N=16) are meek sweethearts. This seems mostly in line with prevailing stereotypes.

True self-confidence comes with age

Personality predictors of age (R² = 0.05)

  Variable          Coefficient (SD)
  Narcissism        -1.96 (.35)
  Psychopathy       -.38 (.37)
  Machiavellianism  .13 (.37)
  Agreeableness     -.35 (.36)
  Assertiveness     1.44 (.35)
  Extraversion      .46 (.40)

People become less narcissistic and more assertive with age. This result is statistically significant although the effect is quite weak — people who are 1 SD more assertive are only 1.44 years older on average. Older people and people who date younger partners are also significantly less likely to report wanting a partner who is dominant and assertive, with no other major changes in partner preference.

Good looks don’t affect popularity

Popularity with friends is driven by the same traits for both men and women: extraversion, agreeableness, and attractive talents. Quite remarkably, physical attractiveness has almost no impact for either gender, and neither does the dark triad.

I find the lack of relationship between looks and popularity surprising. Looks are strongly correlated with attractive talents (humor, art, athletics) and if we don’t control for those talents then the relationship between looks and popularity shows up, although still much weaker than either extraversion or agreeableness. Perhaps people prefer to hang out with friends of similar physical attractiveness, rather than those who overshadow them in the beauty department.
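The "controlling for talents" point is a standard confounding story, and it's easy to reproduce in a toy simulation (all numbers here are made up for illustration): if talents drive popularity and looks merely correlate with talents, a raw looks-popularity correlation appears, and it vanishes once talents are partialled out.

```python
import random
import math

random.seed(1)

n = 2000
talents = [random.gauss(0, 1) for _ in range(n)]
# Looks correlate with talents but do not themselves cause popularity.
looks = [0.7 * t + random.gauss(0, 0.7) for t in talents]
popularity = [t + random.gauss(0, 1) for t in talents]

def corr(xs, ys):
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def residualize(ys, xs):
    """Remove the least-squares fit of xs from ys."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    return [y - my - beta * (x - mx) for x, y in zip(xs, ys)]

raw = corr(looks, popularity)
partial = corr(residualize(looks, talents), residualize(popularity, talents))
print(f"raw correlation: {raw:.2f}, controlling for talents: {partial:.2f}")
```

In this simulation the raw correlation comes out around .5 and the partial correlation near zero, which is the same qualitative pattern as in the survey data.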

Sex is Other People

As I said before, my goal with this post was to showcase a lot of information and let the readers draw their own conclusions. Before you do, remember that statistical significance doesn’t imply a huge effect size, that my measures are messy and limited, and that some of the positive results are likely artifacts of noise. My sample also has various selection biases, although if you’re the sort of person who reads Putanumonit you’re probably dating the sort of people who fill out Putanumonit surveys and so these results are very relevant to your own life.

But with all those caveats, I think there's a major theme that emerges: mating success is about focusing on other people, not yourself. Assertiveness, extraversion, humor — engaging with others leads to romantic success for both genders. Caring about others also helps men make friends and helps women find partners. Of the dark triad traits, the one focused on engaging with others, even if in a nefarious way (Machiavellianism), is helpful, while the traits focused on oneself instead of others (narcissism, psychopathy) are neutral or negative. Physical attractiveness is important, but it's far from being an overwhelming factor.

This is good news. Assertiveness and extraversion don’t show up on your forehead, they are demonstrated in your behavior which you have control over. It’s hard to change fixed characteristics about yourself such as beauty or status. It’s easier to practice engaging with people.

People who have met me since I came to New York in my mid-twenties find it hard to believe that for long stretches of my youth I didn't have social confidence or many friends, but it's true. I had to change social scenes several times and learn to walk the line between assertiveness and disagreeableness. I became less self-absorbed, more curious about others. This all massively helped my dating life. I also got older, of course, which helps.

Mating success isn’t guaranteed, and some people have a much stronger starting point than others. But it always starts with going out and talking to people.


I’m not sure if I’m going to publish the raw data, but I can prepare a sanitized version to share upon request if you write to let me know what you want it for. If you’re a researcher and think that this data or analysis could be used for a published paper I may be interested in collaborating.


Are Humans Fundamentally Good?

LessWrong.com News - June 22, 2020 - 04:54
Published on June 21, 2020 4:29 PM GMT

I've been doing some reading on political philosophy recently, with a focus on the works that influenced the founding fathers of the United States, and one of the ideas that comes up a lot is the idea of the social contract. One of the assumptions that drives this theory is that humans are not fundamentally good.

Are humans not fundamentally good?

In real live examples of anarchy, does society devolve because humans are not fundamentally good or because of some other reason?

I am an absolute newcomer to political philosophy so forgive me if I am misrepresenting social contract theory.


Plausible cases for HRAD work, and locating the crux in the "realism about rationality" debate

LessWrong.com News - June 22, 2020 - 04:10
Published on June 22, 2020 1:10 AM GMT

This post is my attempt to summarize and distill the major public debates about MIRI's highly reliable agent designs (HRAD) work (which includes work on decision theory), including the discussions in Realism about rationality and Daniel Dewey's My current thoughts on MIRI's "highly reliable agent design" work. Part of the difficulty with discussing the value of HRAD work is that it's not even clear what the disagreement is about, so my summary takes the form of multiple possible "worlds" we might be in; each world consists of a positive case for doing HRAD work, along with the potential objections to that case, which results in one or more cruxes.

I will talk about "being in a world" throughout this post. What I mean by this is the following: If we are "in world X", that means that the case for HRAD work outlined in world X is the one that most resonates with MIRI people as their motivation for doing HRAD work; and that when people disagree about the value of HRAD work, this is what the disagreement is about. When I say that "I think we are in this world", I don't mean that I agree with this case for HRAD work; it just means that this is what I think MIRI people think.

In this post, the pro-HRAD stance is something like "HRAD work is the most important kind of technical research in AI alignment; it is the overwhelming priority and we're pretty much screwed if we under-invest in this kind of research" and the anti-HRAD stance is something like "HRAD work seems significantly less promising than other technical AI alignment agendas, such as the approaches to directly align machine learning systems (e.g. iterated amplification)". There is a much weaker pro-HRAD stance, which is something like "HRAD work is interesting and doing more of it adds value, but it's not necessarily the most important kind of technical AI alignment research to be working on"; this post is not about this weaker stance.

Clarifying some terms

Before describing the various worlds, I want to present some distinctions that have come up in discussions about HRAD, which will be relevant when distinguishing between the worlds.

Levels of abstraction vs levels of indirection

The idea of levels of abstraction was introduced in the context of debate about HRAD work by Rohin Shah, and is described in this comment (start from "When groups of humans try to build complicated stuff"). For more background, see these articles on Wikipedia.

Later on, in this comment Rohin gave a somewhat different "levels" idea, which I've decided to call "levels of indirection". The idea is that there might not be a hierarchy of abstraction, but there's still multiple intermediate layers between the theory you have and the end-result you want. The relevant "levels of indirection" is the sequence HRAD → machine learning → AGI. Even though levels of indirection are different from levels of abstraction, the idea is that the same principle applies, where the more levels there are, the harder it becomes for a theory to apply to the final level.

Precise vs imprecise theory

A precise theory is one which can scale to 2+ levels of abstraction/indirection.

An imprecise theory is one which can scale to at most 1 level of abstraction/indirection.

More intuitively, a precise theory is more mathy, rigorous, and exact like pure math and physics, and an imprecise theory is less mathy, like economics and psychology.

Building agents from the ground up vs understanding the behavior of rational agents and predicting roughly what they will do

This distinction comes from Abram Demski's comment. However, I'm not confident I've understood this distinction in the way that Abram intended it, so what I describe below may be a slightly different distinction.

Building agents from the ground up means having a precise theory of rationality that allows us to build an AGI in a satisfying way, e.g. where someone with security mindset can be confident that it is aligned. Importantly, we allow the AGI to be built using whatever way is safest or most theoretically satisfying, rather than requiring that the AGI be built using whatever methods are mainstream (e.g. current machine learning methods).

Understanding the behavior of rational agents and predicting roughly what they will do means being handed an arbitrary agent implemented in some way (e.g. via blackbox ML) and then being able to predict roughly how it will act.

I think of the difference between these two as the difference between existential and universal quantification: "there exists x such that P(x)" and "for all x we have P(x)", where P(x) is something like "we can understand and predict how x will act in a satisfying way". The former only says that we can build some AGI using the precise theory that we understand well, whereas the latter says we have to deal with whatever kind of AGI that ends up being developed using methods we might not understand well.

World 1 Case for HRAD

The goal of HRAD research is to generally become less confused about things like counterfactual reasoning and logical uncertainty. Becoming less confused about these things will: help AGI builders avoid, detect, and fix safety issues; help AGI builders predict or explain safety issues; help to conceptually clarify the AI alignment problem; and help us be satisfied that the AGI is doing what we want. Moreover, unless we become less confused about these things, we are likely to screw up alignment because we won't deeply understand how our AI systems are reasoning. There are other ways to gain clarity on alignment, such as by working on iterated amplification, but these approaches don't decompose cognitive work enough.

For this case, it is not important for the final product of HRAD to be a precise theory. Even if the final theory of embedded agency is imprecise, or even if there is no "final say" on the topic, if we are merely much less confused than we are now, that is still good enough to help us ensure AI systems are aligned.

Why I think we might be in this world

The main reason I think we might be in this world (i.e. that the above case is the motivating reason for MIRI prioritizing HRAD work) is that people at MIRI frequently seem to be saying something like the case above. However, they also seem to be saying different things in other places, so I'm not confident this is actually their case. Here are some examples:

  • Eliezer Yudkowsky: "Techniques you can actually adapt in a safe AI, come the day, will probably have very simple cores — the sort of core concept that takes up three paragraphs, where any reviewer who didn’t spend five years struggling on the problem themselves will think, “Oh I could have thought of that.” Someday there may be a book full of clever and difficult things to say about the simple core — contrast the simplicity of the core concept of causal models, versus the complexity of proving all the clever things Judea Pearl had to say about causal models. But the planetary benefit is mainly from posing understandable problems crisply enough so that people can see they are open, and then from the simpler abstract properties of a found solution — complicated aspects will not carry over to real AIs later."
  • Rob Bensinger: "We’re working on decision theory because there’s a cluster of confusing issues here (e.g., counterfactuals, updatelessness, coordination) that represent a lot of holes or anomalies in our current best understanding of what high-quality reasoning is and how it works." and phrases like "developing an understanding of roughly what counterfactuals are and how they work" and "very roughly how/why it works" -- This post then doesn't really specify whether or not the final output is expected to be precise. (The analogy with probability theory and rockets gestures at precise theories, but the post doesn't come out and say it.)
  • Abram Demski: "I don't think there's a true rationality out there in the world, or a true decision theory out there in the world, or even a true notion of intelligence out there in the world. I work on agent foundations because there's still something I'm confused about even after that, and furthermore, AI safety work seems fairly hopeless while still so radically confused about the-phenomena-which-we-use-intelligence-and-rationality-and-agency-and-decision-theory-to-describe."
  • Nate Soares: "The main case for HRAD problems is that we expect them to help in a gestalt way with many different known failure modes (and, plausibly, unknown ones). E.g., 'developing a basic understanding of counterfactual reasoning improves our ability to understand the first AGI systems in a general way, and if we understand AGI better it's likelier we can build systems to address deception, edge instantiation, goal instability, and a number of other problems'."
  • In the deconfusion section of MIRI's 2018 update, some of the examples of deconfusion are not precise/mathematical in nature (e.g. see the paragraph starting with "In 1998, conversations about AI risk and technological singularity scenarios often went in circles in a funny sort of way" and the list after "Among the bits of conceptual progress that MIRI contributed to are"). There are more mathematical examples in the post, but the fact that there are also non-mathematical examples suggests that having a precise theory of rationality is not important to the case for HRAD work. There's also the quote "As AI researchers explore the space of optimizers, what will it take to ensure that the first highly capable optimizers that researchers find are optimizers they know how to aim at chosen tasks? I’m not sure, because I’m still in some sense confused about the question."
The crux

One way to reject this case for HRAD work is by saying that imprecise theories of rationality are insufficient for helping to align AI systems. This is what Rohin does in this comment where he says imprecise theories cannot build things "2+ levels above".

There is a separate potential rejection, which is to say that either HRAD work will never result in precise theories or that even a precise theory is insufficient for helping to align AI systems. However, these move the crux to a place where they apply to more restricted worlds where the goal of HRAD work is specifically to come up with a precise theory, so these will be covered in the other worlds below.

There is a third rejection, which is to argue that other approaches (such as iterated amplification) are more promising for gaining clarity on alignment. In this case, the main disagreement may instead be about other agendas rather than about HRAD.

World 2 Case for HRAD

The goal of HRAD research is to come up with a theory of rationality that is so precise that it allows one to build an agent from the ground up. Deconfusion is still important, as with world 1, but in this case we don't merely want any kind of deconfusion, but specifically deconfusion which is accompanied by a precise theory of rationality.

For this case, HRAD research isn't intended to produce a precise theory about how to predict ML systems, or to be able to make precise predictions about what ML systems will do. Instead, the idea is that the precise theory of rationality will help AGI builders avoid, detect, and fix safety issues; predict or explain safety issues; help to conceptually clarify the AI alignment problem; and help us be satisfied that the AGI is doing what we want. In other words, instead of directly using a precise theory about understanding/predicting rational agents in general, we use the precise theory about rationality to help us roughly predict what rational agents will do in general (including ML systems).

As with world 1, unless we become less confused, we are likely to screw up alignment because we won't deeply understand how our AI systems are reasoning. There are other ways to gain clarity on alignment, such as by working on iterated amplification, but these approaches don't decompose cognitive work enough.

Why I think we might be in this world

This seems to be what Abram is saying in this comment (see especially the part after "I guess there's a tricky interpretational issue here").

It also seems to match what Rohin is saying in these two comments.

The examples MIRI people sometimes give for precedents of HRAD-ish work, like the work done by Turing, Shannon, and Maxwell are precise mathematical theories.

The crux

There seem to be two possible rejections of this case:

  • We can reject the existence of the precise theory of rationality. This is what Rohin does in this comment and this comment where he says "MIRI's theories will always be the relatively-imprecise theories that can't scale to '2+ levels above'." Paul Christiano seems to also do this, as summarized by Jessica Taylor in this post: intuition 18 is "There are reasons to expect the details of reasoning well to be 'messy'."
  • We can argue that even a precise theory of rationality is insufficient for helping to align AI systems. This seems to be what Daniel Dewey is doing in this post when he says things like "AIXI and Solomonoff induction are particularly strong examples of work that is very close to HRAD, but don't seem to have been applicable to real AI systems" and "It seems plausible that the kinds of axiomatic descriptions that HRAD work could produce would be too taxing to be usefully applied to any practical AI system".
World 3 Case for HRAD

The goal of HRAD research is to directly come up with a precise theory for understanding the behavior of rational agents and predicting what they will do. Deconfusion is still important, as with worlds 1 and 2, but in this case we don't merely want any kind of deconfusion, but specifically deconfusion which is accompanied by a precise theory that allows us to predict agents' behavior in general. And a precise theory is important, but we don't merely want a precise theory that lets us build an agent; we want our theory to act like a box that takes in an arbitrary agent (such as one built using ML and other black boxes) and allows us to analyze its behavior.

This theory can then be used to help AGI builders avoid, detect, and fix safety issues; predict or explain safety issues; help to conceptually clarify the AI alignment problem; and help us be satisfied that the AGI is doing what we want.

As with world 1 and 2, unless we become less confused, we are likely to screw up alignment because we won't deeply understand how our AI systems are reasoning. There are other ways to gain clarity on alignment, such as by working on iterated amplification, but these approaches don't decompose cognitive work enough.

Why I think we might be in this world

I mostly don't think we're in this world, but some critics might think we are.

For example Abram says in this comment: "I can see how Ricraz would read statements of the first type [i.e. having precise understanding of rationality] as suggesting very strong claims of the second type [i.e. being able to understand the behavior of agents in general]."

Daniel Dewey might also expect to be in this world; it's hard for me to tell based on his post about HRAD.

The crux

The crux in this world is basically the same as the first rejection for world 2: we can reject the existence of a precise theory for understanding the behavior of arbitrary rational agents.

Conclusion, and moving forward

To summarize the above, combining all of possible worlds, the pro-HRAD stance becomes:

(ML safety agenda not promising) and ( (even an imprecise theory of rationality helps to align AGI) or ((a precise theory of rationality can be found) and (a precise theory of rationality can be used to help align AGI)) or (a precise theory to predict behavior of arbitrary agent can be found) )

and the anti-HRAD stance is the negation of the above:

(ML safety agenda promising) or ( (an imprecise theory of rationality cannot be used to help align AGI) and ((a precise theory of rationality cannot be found) or (even a precise theory of rationality cannot be used to help align AGI)) and (a precise theory to predict behavior of arbitrary agent cannot be found) )

How does this fit under the Double Crux framework? The current "overall crux" is a messy proposition consisting of multiple conjunctions and disjunctions, and fully resolving the disagreement can in the worst case require assigning truth values to all five parts: the statement "A and (B or (C and D) or E)", with disagreements resolved in the order A=True, B=False, C=True, D=False can still be true or false depending on the value of E. From an efficiency perspective, if some of the conjunctions/disjunctions don't matter, we want to get rid of them in order to simplify the structure of the overall crux (this corresponds to identifying which "world" we are in, using the terminology of this post), and we also might want to pick an ordering of which parts to resolve first (for example, with A=True and B=True, we already know the overall proposition is true).
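The structure of the overall crux can be made concrete in a few lines of code. This sketch encodes the pro-HRAD proposition from above, checks the example resolution order (A=True, B=False, C=True, D=False leaves the outcome hanging on E), and verifies that the anti-HRAD stance is the exact negation:

```python
from itertools import product

def pro_hrad(A, B, C, D, E):
    # A: ML safety agenda not promising
    # B: even an imprecise theory of rationality helps align AGI
    # C: a precise theory of rationality can be found
    # D: a precise theory of rationality helps align AGI
    # E: a precise theory predicting arbitrary agents can be found
    return A and (B or (C and D) or E)

def anti_hrad(A, B, C, D, E):
    return (not A) or ((not B) and ((not C) or (not D)) and (not E))

# With A=True, B=False, C=True, D=False, everything hinges on E:
print(pro_hrad(True, False, True, False, True))   # True
print(pro_hrad(True, False, True, False, False))  # False

# The anti-HRAD stance negates the pro-HRAD stance for every
# assignment of the five cruxes.
assert all(pro_hrad(*v) != anti_hrad(*v) for v in product([True, False], repeat=5))
```

This is just the post's propositional logic made executable; the hard part, of course, is assigning truth values to the five cruxes, not evaluating the formula.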

So some steps for moving the discussion forward:

  • I think it would be great to get HRAD proponents/opponents to be like "we're definitely in world X, and not any of the other worlds" or even be like "actually, the case for HRAD really is disjunctive, so both of the cases in worlds X and Y apply".
  • If I missed any additional possible worlds, or if I described one of the worlds incorrectly, I am interested in hearing about it.
  • If it becomes clear which world we are in, then the next step is to drill down on the crux(es) in that world.

Thanks to Ben Cottier, Rohin Shah, and Joe Bernstein for feedback on this post.


Training our humans on the wrong dataset

LessWrong.com News - June 21, 2020 - 20:17
Published on June 21, 2020 5:17 PM GMT

I really don't want to say that I've figured out the majority of what's wrong with modern education and how to fix it, BUT

1. We train ML models on the tasks they are meant to solve

When we train (fit) any given ML model for a specific problem, on which we have a training dataset, there are several ways we go about it, but all of them involve using that dataset.

Say we're training a model that takes a 2d image of some glassware and turns it into a 3d rendering. We have images of 2000 glasses from different angles and in different lighting conditions, each with an associated 3d model.

How do we go about training the model? Well, arguably, we could start small and then feed in the whole dataset; we could use different sizes for the test/train/validation splits; we could use cross-validation to determine the overall accuracy of our method, or decide that it would take too long... etc.

But I'm fairly sure that nobody will ever say:

I know, let's take a dataset of 2d images of cars and their 3d rendering and train the model on that first.

If you already have a trained model that does some other 2d image processing or predicts 3d structure from 2d images, you might try doing some weight transfer or using part of the model as a backbone. But that's just because the hard part, the training, is already done.

To take a very charitable example, maybe our 3d renderings are not accurate enough and we've tried everything, but getting more data is too expensive. At that point, we could bring in other 2d-to-3d datasets, train the model on those as well, and hope there's enough similarity between the datasets that the model gets better at the glassware task.

One way or another, we'd try to use the most relevant dataset first.

2. We don't do this with humans

I'm certain some % of the people studying how to implement basic building blocks (e.g. allocators, decision trees, and vectors) in C during a 4-year CS degree end up becoming language designers or kernel developers and are glad they took the time to learn those things.

But the vast majority of CS students go on to become frontend developers or full-stack developers, where all the "backend" knowledge they require is how to write SQL, how to read/write files, and high-level abstractions over TCP or UDP sockets.

At which point I ask something like:

Well, why not just teach those people how to make UIs and how to write a basic backend in flask?

And I get a mumbled answer about something-something having to learn the fundamentals. To which I reply:

I mean, I don't think you're getting it, the things you are teaching haven't been fundamental for 20+ years, they are hard as fuck. I know how to write a half-decent allocator or an almost std-level vector implementation and I know how to make a basic CRMS and I can't imagine what sort of insane mind would find the former two easier to learn. I also can't imagine the former two being useful in any way to building a CRMS, other than by virtue of the fact that learning them will transfer some of the skills needed to build a CRMS.

At which point I get into arguments about how education seems only marginally related to salary and job performance in most programming jobs. The whole thing boils down to half-arsed statistics because evaluating things like salary, job performance, and education levels is, who would have guessed, really hard.

So for the moment, I will just assume that your run-of-the-mill Angular developer doesn't need a 4-year CS degree to do their job, and that a 6-month online program teaching the direct skills required is sufficient.

3. Based on purely empirical evidence, I think we should

Going even further, let's get into topics like memory ordering. I'm fairly sure this would be considered a fairly advanced subject as far as programming is concerned; knowing how to properly use memory ordering puts you close to writing ASM code.

I learned about the subject by just deciding one day that I would write a fixed-size, lock-free, wait-free, thread-safe queue that allows multiple readers, multiple writers, or both... then, to make it a bit harder, I went ahead and also developed a Rust version in parallel.

Note: I'm fairly proud of the above implementations, since I was basically a kid 4 years ago when I wrote them. I don't think they are well tested enough to use in production, and they likely have flaws that basic testing on an x86 and a Raspberry Pi ARM processor didn't catch. Nor are they anywhere close to the most efficient implementations possible.

I'm certain that I don't have anywhere near a perfect grasp of memory ordering and "low level" parallelism in general. However, I do think the above learning experience was a good way to get an understanding equivalent to an advanced university course in ~2 weeks of work.

Now, this is an n=1 example, but I don't think that I'm alone in liking to learn this way.

The few people I know that can write a better-than-half-arsed compiler didn't finish the Dragon book and then start making their first one; they did the two in parallel, or even the other way around.

I know a lot of people who swear by Elm, Clojure, and Haskell; to my knowledge, none of them bothered to learn category theory in-depth or ever read Hilbert, Gödel, or Russell. This didn't seem to stop them from learning Haskell or becoming good at it.

Most ML practitioners I know don't have a perfect grasp of linear algebra, they can compute the gradients for a simple neural network by hand or even speculate on the gradients resulting from using a specific loss function, but it's not a skill they are very confident in. On the other hand, most ML practitioners I know are fairly confident in writing their own loss functions when it suits the problem.
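That "compute the gradients by hand" skill is easy to illustrate. Below is a minimal sketch with toy data and a single weight: a hand-derived gradient of a mean-squared-error loss, sanity-checked against a finite-difference estimate, which is exactly how one would verify a custom loss function.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-weight linear model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def analytic_grad(w, xs, ys):
    """Hand-derived d(mse)/dw, the kind of thing practitioners do on paper."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def numeric_grad(w, xs, ys, eps=1e-6):
    """Central finite difference, used to sanity-check the hand derivation."""
    return (mse(w + eps, xs, ys) - mse(w - eps, xs, ys)) / (2 * eps)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
g_hand = analytic_grad(1.5, xs, ys)
g_check = numeric_grad(1.5, xs, ys)  # the two should agree closely
```

None of this requires a deep linear algebra background, which is rather the point.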

That's because most people learning ML don't start with a Ph.D. in applied mathematics, they start playing around with models and learn just enough LA to understand(?) what they are doing.

Conversely, most people that want to get a Ph.D. in applied mathematics working on automatic differentiation might not know how to use TensorFlow very well, but I don't think that's at all harmful to their progress, even though the final results will find practical applications in libraries that TensorFlow might serve as a wrapper over.

Indeed, in any area of computer-based research or engineering, people seem comfortable tackling hard problems even if they don't have all the relevant context for understanding those problems. They have to be; one lifetime isn't enough to learn all the relevant context.

That's not to say you never have to learn anything other than the thing you are working on, I'd be the last person to make that claim. But usually, if you have enough context to understand a problem, the relevant context will inevitably come up as you are trying to solve it.

4. Why learning the contextual skills independently is useful

But say that you are a medieval knight and are trying to learn how to be the most efficient killing machine in a mounted charge with lances.

There are many ways to do it, but hopping on a horse, strapping on your spurs, and charging with a heavy lance atop your warhorse (a 1,000kg beast charging at 25km/h into a sea of sharp metal) is 100% not the way to do it.

You'd probably learn how to ride first, then learn how to ride fast, then learn how to ride in formation, then learn how to do it in heavy armor, then how to do it with a lance... etc

In parallel, you'd probably be practicing with a lance on foot, or by stabbing hay sacks with a long blunt stick while riding a pony.

You might also do quite a lot of adjacent physical training, learn to fight with and against a dozen or so weapons and types of shield and armor.

Then you'd spend years helping on the battlefield as a squire, a role that didn't involve being in the vanguard of a mounted charge.

Maybe you'd participate in a tourney or two; maybe you'd participate in a dozen mock tourneys where the goal is lightly patting your opponent with a blunt stick.


Why all this roundabout preparation? Because the cost of using the "real training data" is high. If I were teleported inside a knight's armor, strapped to a horse galloping toward an enemy formation with a lance in my hand, I would almost certainly die.

If someone with only 20% of a knight's training did so, the chance of death or debilitating injury might go down to half.

But at the end of the day, even for the best knight out there, the cost of learning in battle is still a, say, 1/20th chance of death or debilitating injury.

Even partially realistic training examples like tourneys would still be very dangerous, with a small chance of death, a small but significant chance of a debilitating injury, and an almost certainty of suffering a minor trauma and damage to your expensive equipment.

I've never tried fighting with blunt weapons in a mock battle, but I have friends who did, and they inform me that you get injured all the time and that it's tiresome and painful and not something one could do for long. I tend to believe them; even if I were wearing a steel helmet, the thought of being hit over the head full-strength with a 1kg piece of steel is not a pleasant one.

On the other hand, practicing combat stances or riding around on a horse involves almost zero risk of injury, death, or damage to expensive equipment.

The knight example might be a bit extreme, but even for something as simple as baking bread, the cost of failure might be the difference between being able to feed yourself or starving/begging for food for the next few weeks.

The same idea applied, and to some extent still applies, to any trade. When the stakes involve the physical world and your own body, "training on real data" is prohibitively expensive if you're not already 99% fit for the task.

5. Why the idea stuck

We are very irrational, and for good reason: most of the time we try to be rational, we fail.

If for 10,000 years of written history "learning" something was 1% doing and 99% learning contextual skills we might assume this is a good pattern and stick to it without questioning it much.

Maybe we slowly observe that in e.g. CS, people that code more and learn theory less do better, so our CS courses go from 99% theory and 1% coding to 25% theory and 75% coding. Maybe we observe that frontend devs who learn their craft in 10 weeks are just as good interns as people with a CS degree, so we start replacing CS degrees for frontend developers with 6-month "boot camps".

But what we should instead be thinking is whether or not we should throw away the explicit learning of contextual skills altogether in these kinds of fields.

I will, however, go even further and say that we sometimes learn contextual skills when we'd be better off using technology to play with fake but realistic "training data".

Most kids learn science as if it's a collection of theories and thus end up thinking of it as a sort of math-based religion, rather than as an incomplete and always shifting body of knowledge that's made easier to understand with layers of mathematical abstraction and gets constantly changed and refactored.

This is really bad, since many people end up rejecting it the same way they'd reject a religion, refusing to bow down before the tenured priesthood. It's even worse because the people that do "learn" science usually learn it as if it were a religion, or some kind of logic-based exercise where all conclusions are absolute and all theories perfectly correct.

But why do we teach kids about theory instead of letting them experiment and analyze data, thus teaching them the core idea of scientific understanding? Because experiments are expensive and potentially dangerous.

Sure, there are plenty of token chemistry, physics, and even biology experiments that you can do in a lab. However, telling an 8th-grade class:

Alright kids, today we are going to take nicotine and try to design an offshoot that is particularly harmful to creatures with this particular quirk in their dopaminergic pathways.

It sounds like a joke or something done by genius kids inside the walls of an ivory tower. Not something that can be done in a random school serving a pop-2,000 village in Bavaria.

But why?

After all, if you make enough tradeoffs, it doesn't seem that hard to make a simulation for this experiment. Not one that can be used to design industrial insecticides, mind you. But you can take a 1,000 play-neuron model of an insect brain, then create 500 different populations, some of which have a wiring quirk that causes arrhythmia when certain dopamine pathways are overly stimulated.

You can equally well make a toy simulation mimicking the dopaminergic effects of a nicotine derivative based on 100 simple-to-model parameters (e.g. expression of certain enzymes that can destroy nicotine, affinity to certain sites, ease of passing the blood-brain barrier).

You're not going to come up with anything useful, but you might even stumble close to an already known design for a neonicotinoid insecticide.

The kids or teachers needn't understand the whole simulation, after all that would defeat the point. They need only be allowed to look at parts of it and use their existing chemistry and biology knowledge to speculate on what change might work.

Maybe that's too high of a bar for a kid? Fine, then they need only use their chemistry knowledge to figure out if a specific molecule could even exist in theory, then fire away as many molecules as possible and see how the simulation reacts. After all, that sounds like 99% of applied biochemistry in the past and potentially even now.
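That fire-and-see approach can itself be sketched in a few lines. Everything below is invented: a black-box "toxicity" score stands in for the insect-brain simulation, and plain random search over a 3-parameter candidate space stands in for the kids firing away molecules.

```python
import random

def toy_toxicity(params):
    """Hypothetical black-box score standing in for the simulation:
    candidates closer to a hidden 'vulnerable' parameter combination
    score higher (less negative)."""
    target = [0.3, 0.7, 0.1]
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def random_search(n_candidates=5000, seed=0):
    """Fire away candidate molecules (random parameter vectors)
    and keep the best one the black box has seen."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_candidates):
        candidate = [rng.random() for _ in range(3)]
        score = toy_toxicity(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

best_params, best_score = random_search()
```

The searcher never needs to understand the scoring function, which mirrors the point: the kids need only propose candidates and watch how the simulation reacts.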

This analogy breaks down at some point, but what we have currently is so skewed towards the "training on real data is a bad approach" extreme that I think any shift towards the "training on real data is the correct approach" would be good.

The shift away from "training on real data" being expensive and dangerous is a very recent one; to be charitable, it might have happened in the last 30 years or so, with the advent of reasonably priced computers.

Thus I think it's perfectly reasonable to assume most people haven't seen this potential paradigm shift. However, it seems like the kind of sink-or-swim distinction that will make or break many education systems in the 21st century. In many ways, I think this process has already started, but we are just not comprehending it fully as of yet.


The affect heuristic and studying autocracies

LessWrong.com news - June 21, 2020 - 20:05
Published on June 21, 2020 4:07 AM GMT

General Juan Velasco Alvarado was the military dictator of Peru from 1968 to 1975. In 1964-5 he put down revolutionary peasant guerilla movements, defending an unequal and brutally exploitative pattern of land ownership. Afterward he became frustrated with the bickering and gridlock of Peru’s parliament. With a small cadre of military coconspirators, he planned a coup d’état. Forestalling an uprising by pro-peasant parties, he sent tanks to kidnap the democratically elected president. The parliament was closed indefinitely. On the one year anniversary of his coup, Velasco stated “Some people expected very different things and were confident, as had been the custom, that we came to power for the sole purpose of calling elections and returning to them all their privileges. The people who thought that way were and are mistaken”.[1]

What would you expect Velasco’s policy toward land ownership and peasants to be? You would probably expect him to continue their exploitation by the oligarchic land-owning families. But you would be mistaken. Velasco and his successor redistributed 45% of all arable land in Peru to peasant-led communes, which were later broken up. Land redistribution is a rare spot of consensus in development economics, both improving the lives of the poor and increasing growth. [2]

I told you this story to highlight how your attitudes toward the actor affect your predictions. It is justifiable to dislike Velasco for his violence, for ending Peruvian democracy, for his state-controlled economy. But our brains predict off of those value judgements. The affect heuristic (aka the halo/horn effect) is when one positive/negative attribute of an actor causes people to assume positive/negative attributes in another area. The affect heuristic causes attractive candidates to be hired more often, or honest people to be rated as more intelligent. Subjects told about the benefit of nuclear power are likely to rate it as having fewer risks, et cetera. Our moral attitudes toward the coup are not evidence for Velasco’s policy preference, but our brains treat them as evidence.[3] That is bad if you think predicting what policies autocrats will adopt is important. Which I do.

The same problem applies to agents you like. One of my research projects involved interviewing Jordanian policymakers. I studied policies for rural-urban water reallocation, which I broadly endorse. Because I agreed with some interviewees about this issue, I over-trusted them when selecting evidence. During fieldwork, I heard rumors about high radon activity in the water for the Disi-Amman conveyance. I even found a PowerPoint made by MWI staff advocating that Jordan’s drinking water standards be revised to protect the project. But I never looked deeply into the issue. I could have: I have a physics degree, I have operated spectroscopes, and I understand radiation physics and can parse radiology articles at a high level. But I never did, because I assumed that this evidence would support my positive attitude toward my interviewees. I never noticed the bias, because evidence selection is mostly subconscious. There is no time for probability-theory calculations when you have an hour to speak with a bureaucrat.

Eventually, I looked into the issue at the request of peer reviewers (when I had to). Even then, I looked for evidence only until it supported my conclusions. Radon evaporates out of water, so maybe it evaporates in the mixing reservoirs. I checked the long-time-horizon equilibrium, then assumed the evaporation was fast enough. Once the analysis agreed with my assumptions, I stopped looking for more information. I would have submitted this falsehood, but by chance found a ministry paper revealing that transport reduces activity by just one eighth. At last, I had to reinvestigate and conclude that the radon activity is a public health hazard worth considering. The effects of low doses of radiation are hotly debated, but the linear no-threshold model predicts an excess mortality risk greater than 10^-4 from a lifetime of consumption. Because I had a positive attitude toward the policy-makers, I subconsciously avoided evidence that cast them in a negative light.
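As a rough sanity check on the physics (not the actual Disi-Amman figures): radon-222 has a half-life of about 3.82 days, so radioactive decay alone removes only a modest fraction of the activity over a short transit. The 0.7-day transit time below is a made-up illustration.

```python
import math

RN222_HALF_LIFE_DAYS = 3.82  # half-life of radon-222

def remaining_fraction(transit_days):
    """Fraction of radon activity left after exponential decay in transit:
    A(t) = A0 * exp(-ln(2) * t / t_half)."""
    return math.exp(-math.log(2) * transit_days / RN222_HALF_LIFE_DAYS)

# With a hypothetical transit time under a day, most of the activity survives.
frac = remaining_fraction(0.7)
```

A calculation this short is exactly the kind of check I could have run during fieldwork, had I not assumed the answer.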

What to do about the affect heuristic in comparative politics research? At first, I wrote an elegant ending paragraph suggesting not caring about, or assigning moral value to, people or institutions. But that answer relies on rejecting your own feelings, which is morally dubious and impossible anyway. If the goal of modern comparative politics is to connect political institutions to outcomes, we still have normative judgments attached to those outcomes. Brian Min wants poor people to have access to electricity. Michael Albertus wants poor people to own the land they work. Removing those concerns would not improve their work. Assigning all agency to institutions would just attach the halo and horns to the institutions. In any case, waking up every day saying "I have no moral attachments" and talking like Spock does not "cure" you of preference.

I am unsure what researchers should do. My first two guesses are

1. Using prediction exercises to calibrate yourself. I was overconfident in my assumptions, so prediction exercises should reduce my overconfidence.

2. Only write up the obvious conclusions from your case studies. I had plenty of subtle theories to explain Jordan’s policies, but just a few with strong evidence. I asked myself, "If a young man from Mafraq reproduced my methodology, would he arrive at the same conclusions?", and left out ideas which failed the test.


[1] Albertus, M. (2015). Autocracy and redistribution. Cambridge University Press. pp. 201

[2] https://www.worldpoliticsreview.com/articles/13688/cultivating-equality-land-reforms-potential-and-challenges

[3] For the curious, Velasco claimed that while putting down the revolts he saw the conditions peasants lived in and resolved to change them. Albertus’s case study suggests he did it to destroy the power base of his rivals (rural elites) and prevent a second peasant uprising, thus securing his rule (Albertus, 2015).

I want to improve my writing and reasoning so comments and critique welcome! Write to the reader!


My weekly review habit

LessWrong.com news - June 21, 2020 - 17:40
Published on June 21, 2020 2:40 PM GMT

Every Saturday morning, I take 3-4 hours to think about how my week went and how I’ll make the next one better.

The format has changed over time, but for example, here’s some of what I reflected on last week:

  • I noticed I’d fallen far short of my goal for written output. I decided to allocate more time to reading this week, hoping that it would generate more ideas. And I reorganized my morning routine to make it easier to start writing in the morning.

  • I looked at some stats from RescueTime and Complice about what I’d spent time on and accomplished. I noticed that my time spent on Slack was nearing dangerous levels, so I decided to make a couple experimental tweaks to get it down:

    • I tried out Contexts, a replacement for the macOS window switcher, which I configured to only show windows from my current workspace—hoping that this would prevent me from cmd+tabbing over to Slack and getting distracted.

    • I decided to run an experiment of not answering immediately when coworkers called me in the middle of a focused block of time, and keeping a paper “todo when done focusing” list to remind myself to call them back, check Slack, etc.

  • I noticed that it felt hard for me to get useful info from the time-tracking data in RescueTime and Complice, so I revisited what questions I actually wanted to answer and how I could make them easy to answer.

    • I realized that I should be using Google Calendar, not RescueTime or Complice, to track my time spent in meetings, so I added that to my time-tracking data sources.

    • I also made several tweaks to the way I used Complice to make it easier to see various stats I was interested in.

And so on. By the end of the review I had surfaced lots of other improvements for the coming week.

While each individual tweak is small, over the weeks and years they’ve compounded to make me a lot more effective. Because of that, this weekly review is the most useful habit (or habit-generating meta-habit) I’ve built. Here are some of the improvements I’ve made that have come out of weekly reviews:

  • I decided to experiment with time tracking, realized that it needed to be zero-effort to succeed, and identified RescueTime as the best option, giving me much better data on how I was actually spending my time.

  • Once I started using RescueTime, I eliminated a bunch of distractions that it flagged. As a result I improved my focused time by 50%.

  • Later on, reviewing RescueTime stats also helped me realize I was spending much more time distracted by the Internet than I’d realized. I tried various things to break my Internet news habit and eventually found Focus, a zero-effort website blocker which has probably saved me hundreds of hours.

  • I identified “feeling low-energy on winter evenings” as a blocker and tried several experiments to improve my evening energy levels. One of them was an ultra-bright lightbulb which worked amazingly well, giving me back about an hour a day.

  • By thinking about how to help my partner with her PhD, I came up with the idea of doing one-on-ones, which she thinks helped her finish her dissertation a year faster.

  • By thinking about ways to improve our relationship and points of friction we’ve had over the last week, we’ve both started lots of useful discussions in those one-on-ones that have helped us understand each other and communicate better.

  • I’ve made hundreds of tweaks to my daily routine and habits to make sure I reliably exercise, sleep enough, and maintain high energy levels.

Of course, you don’t need to have a weekly review habit to come up with this type of improvement. But by systematically thinking it through, you’ll generate more of them. And by doing it consistently, you’ll be able to build these small improvements on top of each other.

I’ve had to iterate a lot on the format and timing of the weekly review to get to one where I can consistently maintain the habit and output useful weekly reviews. The format I currently have is:

  • Review happens first thing every Saturday morning. This time is sacred and (largely) immovable. Morning is really important for having the right energy and mindset; weekend is important so I’m not distracted by work; consistency is important so that I don’t lose the habit.

  • I start the review by re-reading some parts of my favorite essays of life advice. (Different parts/essays every week. This also sometimes gets me to notice new parts of the essays that resonate or spark interesting thoughts.)

  • Next, I load the week back into working memory by reviewing what happened during the week.

  • Based on the above I’ll write down a list of topics to think about, taking written notes on each topic as I think about them.

  • I also have a set of recurring prompts that I think about every week. I tweak them over time as they get stale, but some examples would be:

    • Was I consistent at my core habits this week (exercise, morning routine, todo system, etc.)? How can I tweak them to be more consistent or more useful?

    • What did I do this week that was a mistake and how can I avoid repeating it?

    • How much of this week did I spend on stuff that was truly my comparative advantage? For everything else, how can I get out of the loop?

As an appendix, some random tactical tips for weekly reviews:

  • Changing my physical environment helps me context-switch into a less focused, more reflective mindset. Back when cafes were open I’d often go to the cafe near my house. At home, I’ll work from a different room, play different background music, etc.

  • I still find it’s easy to get distracted during weekly reviews, so I make sure to close everything else on my computer when I start.

  • When I have granular, objective data on “what happened this week” (e.g. RescueTime, calendar, todo lists) I’ve found it helpful to review that because it occasionally surprises me. (See the points about RescueTime above.)

  • I find that taking notes while I think about things is really important—otherwise I lose track of what I’m thinking about or get distracted.

  • For note-taking, I’d recommend using hierarchical bulleted lists, not free-written paragraphs. Lists are more efficient because you can write in incomplete sentences and leave out transitions (relying on the bullet hierarchy to make the structure clear).

    Bulleted lists are also easier to reorder (especially if you use an app that gives you keyboard shortcuts for it), so if you’re like me, they’ll let you more efficiently exercise your nervous tic of stack-ranking all lists.


