LessWrong.com News

A community blog devoted to refining the art of rationality

DC Meetup

September 30, 2019 - 01:37
Published on September 29, 2019 10:37 PM UTC

Slate Star Codex meetup for Washington, DC.


The Thinking Ladder - Wait But Why

September 29, 2019 - 21:51
Published on September 29, 2019 6:51 PM UTC

The Thinking Ladder is the latest post in a huge ongoing Wait But Why series called The Story of Us.

Our beliefs make up our perception of reality, they drive our behavior, and they shape our life stories. History happened the way it did because of what people believed in the past, and what we believe today will write the story of our future. So it seems like an important question to ask: Why do we actually come to believe the things we end up believing?

It's a great introduction to many core rationality and LessWrong concepts, and it is fun and accessible to a general audience.

I recommend the whole series and really all of Wait But Why.


The first step of rationality

September 29, 2019 - 15:01
Published on September 29, 2019 12:01 PM UTC

(Crosspost from LW Netherlands FB group)

The first step of rationality is integrating the self. You're not actually one agent, in the sense of something that has one coherent goal and set of beliefs. You're an ensemble of agents that tend to disagree with each other. And if they do, lots of bad things happen. If it gets really bad, we put these things under the umbrella of mental illness. If it's just slightly bad, we call it things like indecision, brain fog, lack of motivation, confusion, weakness of will, akrasia, etc.

An integrated self is what Maslow pointed at when he listed self-actualized individuals. It's part of what Buddhists call enlightenment. It's the thing you edge towards if you meditate, or when you heal a trauma. It's what Jung gestures towards with his integrating the shadow. It's what you attempt to get, in a very explicit and clunky way, with an Internal Double Crux. Complete integration of self is the end-all be-all of spiritual practice.

Have you ever noticed that your "amount of awareness" goes up and down? That's yourself being more or less integrated over time. Have you ever noticed how, after a decent amount of meditation, your intuitions suddenly work for you instead of against you? That's integration. It's a sudden clarity and control. As if there are fewer beliefs and goals in your brain to compete with.

Meditation has been shown to increase the amount of white matter in the brain. White matter is the wiring between different areas. More wiring, I imagine, means more communication. More communication, I imagine, means more integration. What's more, the ACC, part of your prefrontal cortex, has a dual function of inhibition and awareness. These functions tend to correlate inversely. As if you have a lot more resources at your disposal if you're not spending them on pushing down parts of you that you disagree with. I speculate that this is the physical substrate of your shadow being inhibited. Integrating your shadow means opening up these gateways to your repressed subagents, making amends with them, going from inhibition to awareness.

So why not just stop our efforts and sign up with the local Buddhist sangha instead? Because in rationality there is a second step: systematized winning.

While integrating the self, we build the focus and trust needed to have a say over our system 1. When practicing systematized winning, we use our power over our intuitions to program them according to our best understanding of decision and probability theory, cognitive biases, consequentialist ethics, and anything else that cutting-edge analytic thought can give us.

While a Buddhist might eventually let go of the need for consistency to further liberate their mind, we hold on to it. Our paths diverge where the tails of happiness and productivity come apart.

But that's many years down the line. Until then, I suggest we embrace spirituality.

Some problems are new, but the problem of mental flourishing has been with us for hundreds of thousands of years. Many generations have dedicated their life's work to solving it. Among these generations were some people who were many orders of magnitude smarter than you. 92% of the human race is in the past.

Imagine that some of them did figure it out. Imagine that they even managed to hand down the solution to their descendants. What might they have called it? I think they called it spirituality.


Candy for Nets

September 29, 2019 - 14:10
Published on September 29, 2019 11:10 AM UTC

Yesterday morning my five-year-old daughter was asking me about mosquitoes, and we got on to talking about malaria, nets, and how Julia and I donate to the AMF to keep other kids from getting sick and potentially dying. Lily took it very seriously, and proposed that when I retire she take my programming job and donate in my place.

I told her that she didn't need to wait until after I retired to start helping, and she decided she wanted to sell candy on the bike path as a fundraiser. I told her we could do this after naps if the weather was still nice, and the first thing she said when I got her up from her nap was that she wanted to go make a sign.

She dictated to me, "Lily is selling candy to raise money for malaria nets, $1" and I wrote the letters. She colored them in:

(It looks like she's posing with the sign here, but this is just how she happened to position herself for coloring. She has short arms.)

Once Anna was up from her (longer) nap I got out the wagon and brought them over to the bike path. Lily did all the selling; I just hung out to the side, leaning against a tree.

She's always been good at talking to adults, and did a good job selling the candy. She would explain that the candy was $1 each, that the money was going to buy malaria nets, and that malaria was a very bad disease that you got from mosquitoes. People were generous, and several people gave without taking candy, or put in an extra dollar. One person didn't have cash but wanted to give enough that they went home and came back with a dollar. As someone who grew up in a part of town with very little foot traffic, the idea that you can just walk a short distance from your house to somewhere where several people will pass per minute continually amazes me.

After about twenty minutes all the candy was sold and Lily had collected $20.75. She played in the park for a while, and then when we came home she asked how we would use the money to buy nets. I showed her pictures of distributions on the AMF website but she wanted to see pictures of the nets in use so we spent a while on image search:

I explained that we weren't going to distribute the nets ourselves, but that we would provide the money so other people could.

Initially she didn't want to donate the whole amount, but wanted to set aside half to buy more candy so she could do this again. I told her that I would be happy to buy the candy. Possibly I should have let her manage this herself, but I was worried that the money wouldn't end up donated which wouldn't have been fair to the people who'd bought the candy, and explained this to her. She gave me the $20.75 and I used my credit card to pay for the nets. [1]

Here's the message she dictated for the donation:

I want people to be safe in the world from biting mosquitoes. I don't want them getting hurt, and especially I don't want the kids like me to die.

I don't know how her relationship with altruism will change as she gets older, and I do think there are ways it will be hard for her to have parents who have strong unusual views. As we go I'm going to continue to try very hard not to pressure or manipulate her, while still giving advice and helping her explore her motivations here. I am, however, very proud of her today.

[1] I haven't listed this on our donations page and it doesn't count towards our 50% goal because the donation was Lily's and not ours.


Christiano decision theory excerpt

September 29, 2019 - 05:55
Published on September 29, 2019 2:55 AM UTC

A transcribed excerpt I found interesting from the decision theory outtake from 80,000 Hours' second interview of Paul Christiano (starting at 14:00):

Robert Wiblin: It seems like philosophers have not been terribly interested in these heterodox decision theories. They just seem to not have been persuaded, or not even to really have engaged with them a great deal. What do you think is going on there?

Paul Christiano: I think there's a few things going on. I don't think it's right to see a spectrum with CDT and then EDT and then UDT. I think it's more right to see a box, where there's the updatelessness axis and then there's the causal vs. evidential axis. And the causal vs. evidential thing I don't think is as much of a divide between geography, or between philosophers and rationalists.

CDT is the majority position in philosophy, but not an overwhelming majority; there are reasonable philosophers who defend EDT. EDT is a somewhat more common view, I think, in the world -- certainly in the prisoner's dilemma with a twin case, I think most people would choose "cooperate" if you just poll people on the street. Again, not by a large majority.

And amongst the rationalists, there's also not agreement on that. Where I would say Eliezer has a view which I would describe as the causal view, not the evidential view. But he accepts updatelessness.

So on the causal vs. evidential split, I think it's not so much geographical. It's more just like everyone is fairly uncertain. Although I think it is the case maybe people in the Bay are more on evidential decision theory and philosophers are more on causal decision theory.

Then on the updatelessness thing, I think a lot of that is this semantic disagreement / this understanding of "what is the project of decision theory?" Where if you're building AI systems, the point of decision theory is to understand how we make decisions such that we can encode it into a machine.

And if you're doing that, I think there's not actually a disagreement about that question. Like, once you're really specific about what is meant. No one thinks you should make -- well, I don't... Hopefully no one thinks you should make an AI that uses causal decision theory. Causal decision theorists would not recommend making such an AI, because it has bad consequences to make such an AI.

So I think once we're talking about "How do you make an AI?", if you're saying, "I want to understand which decision theory my AI ought to use -- like, how my AI ought to use the word 'right'" -- everyone kind of agrees about that, that you should be more in the updateless regime.

It's more like a difference in, "What are the questions that are interesting, and how should we use language?" Like, "What do concepts like 'right' mean?"

And I think there is a big geographical and community distinction there, but I think it's a little bit less alarming than it would be if it were like, "What are the facts of the case?" It's more like, "What are the questions that are interesting?" -- which, like, people are in fact in different situations. "How should we use language?" -- which is reasonable. The different communities, if they don't have to talk that much, it's okay, they can just evolve to different uses of the language.

Robert Wiblin: Interesting. So you think the issue is to a large degree semantic.

Paul Christiano: For updatelessness, yeah.

Robert Wiblin: So you think it's the case that when it comes to programming an AI, there's actually a lot of agreement on what kind of decision theory it should be using in practice. Or at least, people agree that it needn't be causal decision theory, even though philosophers think in some more fundamental sense causal decision theory is the right decision theory.

Paul Christiano: I think that's right. I don't know exactly what... I think philosophers don't think that much about that question. But I think it's not a tenable position, and if you really got into an argument with philosophers it wouldn't be a tenable position that you should program your AI to use causal decision theory.

Robert Wiblin: So is it the case that people started thinking about this in part because they were thinking, "Well, what decision theory should we put in an AI?" And then they were thinking, "Well, what decision theory should I commit to doing myself? Like, maybe this has implications in my own life." Is that right?

Paul Christiano: Yeah, I think that's how the rationalists got to thinking about this topic. They started thinking about AI, then maybe they started thinking about how humans should reason insofar as humans are, like, an interesting model. I'm not certain of that, but I think that's basically the history for most of them.

Robert Wiblin: And it seems like if you accept one of the updateless decision theories, then potentially it has pretty big implications for how you ought to live your life. Is that right?

Paul Christiano: Yeah, I think there are some implications. Again, they're mostly going to be of this form, "You find yourself in a situation. You can do a thing that is bad for yourself in that situation but good for yourself in other situations. Do you do the thing that's bad right now?" And you would have committed to do that. And maybe if you've been thinking a lot about these questions, then you've had more opportunity to reason yourself into the place where you would actually take that action which is locally bad for yourself.

And again, I think philosophers could agree. Most causal decision theorists would agree that if they had the power to stop doing the right thing, they should stop taking actions which are right. They should instead be the kind of person that you want to be.

And so there, again, I agree it has implications, but I don't think it's a question of disagreement about truth. It's more a question of, like: you're actually making some cognitive decisions. How do you reason? How do you conceptualize what you're doing?

There are maybe also some actual questions about truth, like: If I'm in the Parfit's hitchhiker case, and I'm imagining I'm at the ATM, I'm deciding whether to take the money out. I'm kind of more open to a perspective that's like, "Maybe I'm at the ATM, deciding whether to take the money out. But maybe I'm inside the head of someone reasoning about me."

Maybe I'm willing to ascribe more consciousness to things being imagined by some other person. And that's not necessarily a decision theory disagreement, but it's more of a disagreement-about-facts case. And it's a kind of perspective that makes it easier to take that kind of action.

Robert Wiblin: Yeah, so... Before we imagine that we're potentially in people's imaginations, it seems like in your writing, you seem very interested in integrity and honesty. And it seems in part like this is informed by your views on decision theory.

For example, it seems like you're the kind of person to be like, "No, absolutely, I would always pay the money at the ATM," in the hitchhiker case. Do you want to explain, how has decision theory affected your life? What concrete implications do you think it should potentially have for listeners?

Paul Christiano: Yeah, so I think there are a lot of complicated empirical questions that bear on almost every one of these practical decisions. And we could talk about what fraction of the variation is explained by considerations about decision theory.

And maybe I'm going to go for, like, a third of variation is explained by decision theory stuff. And I think there, a lot of it is also not the core philosophical issues, but just chugging through a lot of complicated analysis. I think it's surprising that it gets as complicated as it does.

My basic take-away from that kind of reasoning is that a reasonable first-pass approximation, in many situations, is to behave as if, when making a decision, consider not only the direct consequences of that decision, but also the consequences of other people having believed that you would make that decision.

So, for example, if you're deciding whether to divulge something told to you in confidence, you're like, "Well, on the one hand, divulging this fact told to me in confidence is good. On the other hand, if I were known to have this policy, I wouldn't have learned such information in the first place." This gives you a framework in which to weigh up the costs and benefits there.

So that's my high-level take-away, first-order approximation. Which I think a priori looks kind of silly -- maybe there's a little bit to it, but it probably is not going to hold up. And then the more I've dug in, the more I feel like when all the complicated analysis shakes out, you end up much closer to that first-order theory than you would have guessed.

And I think that summarizes most of my takeaway from weird decision theory stuff.

Maybe the other question is: There's some "How nice to be to other value systems?" where I come down on a more nice perspective, which maybe we'll touch a little bit on later. This gets weirder.

But those are maybe the two things. How much should you sympathize with value systems that you don't think are intrinsically valuable, just based on some argument of the form, "Well, I could have been in their position and they could have been in mine"? And then: how much should you behave as if, by deciding, you were causing other people to know what you would have decided?
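
As a rough illustration of the causal vs. evidential split that comes up in the excerpt (this is not from the interview), here is a minimal Python sketch of the prisoner's dilemma with a twin. The payoff numbers, the function names, and the 0.95 "twin matches me" probability are all assumptions chosen for the example. The evidential calculation treats my action as evidence about what my twin does; the causal calculation holds the twin's action fixed regardless of what I choose.

```python
# Assumed, illustrative payoffs to me for (my_action, twin_action); higher is better.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def edt_value(my_action, p_twin_matches):
    """Evidential expected payoff: the twin copies my action with probability p_twin_matches."""
    other = "D" if my_action == "C" else "C"
    return (p_twin_matches * PAYOFF[(my_action, my_action)]
            + (1 - p_twin_matches) * PAYOFF[(my_action, other)])


def cdt_value(my_action, p_twin_cooperates):
    """Causal expected payoff: the twin cooperates with a fixed probability, whatever I do."""
    return (p_twin_cooperates * PAYOFF[(my_action, "C")]
            + (1 - p_twin_cooperates) * PAYOFF[(my_action, "D")])


def best_action(value_fn, p):
    """Return the action ("C" or "D") with the higher expected payoff under value_fn."""
    return max(["C", "D"], key=lambda action: value_fn(action, p))


if __name__ == "__main__":
    print("EDT, twin matches me 95% of the time:", best_action(edt_value, 0.95))
    print("CDT, twin cooperates 95% of the time:", best_action(cdt_value, 0.95))
```

With these assumed payoffs, the evidential calculation recommends cooperating whenever the twin matches me with probability above 5/7, while the causal calculation recommends defecting for every fixed probability of the twin cooperating. The sketch does not try to model updatelessness, which the excerpt treats as a separate axis.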


Kohli episode discussion in 80K's Christiano interview

September 29, 2019 - 04:40
Published on September 29, 2019 1:40 AM UTC

From the 80,000 Hours Podcast (1:10:38):

Robert Wiblin: A few weeks ago, we published our conversation with Pushmeet Kohli, who’s an AI robustness and reliability researcher at DeepMind over in London. To heavily summarize Pushmeet’s views, I think he might have made a couple of key claims.

One was that alignment and robustness issues, in his view, appear everywhere throughout the development of machine learning systems, so they require some degree of attention from everyone who’s working in the field. According to Pushmeet, this makes the distinction between safety research and non-safety research somewhat vague and blurry, and he thinks people who are working on capabilities are kind of also helping with safety, and improving reliability also improves capabilities, because you can then actually design algorithms that do what you want.

Secondly, I think he thought that an important part of reliability and robustness is going to be trying to faithfully communicate our desires to machine learning algorithms, and that this is kind of analogous to – although a harder instance of – the challenge of just communicating with other people, getting them to really understand what we mean. Although of course it’s easier to do that with other humans than with animals or machine learning algorithms.

A third point was, I guess, just a general sense of optimism: that DeepMind is working on this issue quite a lot and are keen to hire more people to work on these problems, and I guess a sense that probably we’re going to be able to gradually fix these problems with AI alignment as we go along and machine learning algorithms get more influential.

I know you haven’t had a chance to listen to the whole interview, but you skimmed over the transcript. Firstly, where do you think Pushmeet is getting things right? Where do you agree?

Paul Christiano: So I certainly agree that there’s this tight linkage between getting AI systems to do what you want and making them more capable. I agree with the basic optimism that people will need to address the “getting AI systems to do what we want” problem. I think it is more likely than not that people will have a good solution to that problem.

Maybe there’s this interesting intervention of, “Should longtermists be thinking about that problem in order to increase the probability?” I think even absent the actions of the longtermists, there’s a reasonably good chance that everything would just be totally fine. So in that sense, I’m on board with those claims, definitely.

I think that I would disagree a little bit in thinking that there is a meaningful distinction between activities whose main effect is to change the date by which various things become possible, and activities whose main effect is to change the trajectory of development.

I think that’s the main distinguishing feature of “working on alignment” per se. You care about this differential progress towards being able to build systems to do what we want. I think in that perspective, it is the case that the average contribution of AI work is almost by definition zero on that front, because if you just increased all the AI work by a unit, you’re just bringing everything forward by one unit.

And so I think that does mean there’s this well-defined thing which is, “Can we change the trajectory in a way?” And that’s an important problem to think about.

I think there’s also a really important distinction between the kind of failure which is most likely to disrupt the long-term trajectory of civilization, and the kind of failure which is most likely to be an immediate deal-breaker for systems actually being useful or producing money. And maybe one way to get at that distinction is related to the second point you mentioned.

Communicating your goals to an ML system is very similar to communicating with a human. I think there is a hard problem of communicating your goals to an ML system, which we could view as a capabilities problem. Are they able to understand things people say? Are they able to form the kind of internal model that would let them understand what I want? In some sense, it’s very similar to the problem of predicting what Paul would do, or it’s a little slice of that problem – predicting under what conditions Paul would be happy with what you’ve done.

That’s most of what we’re dealing with when we’re communicating with someone. If I’m talking with you, I would be completely happy if I just managed to give you a perfect model of me – then the problem is solved. I think that’s a really important AI difficulty for making AI systems actually useful.

I think that’s less the kind of thing that could end up pushing us in a bad long-run direction, mostly because we’re concerned about behavior as AI systems become very capable and have a very good understanding of the world around them, of the people they’re interacting with; and the really concerning cases are ones where AI systems actually understand quite well what people would do under various conditions, understand quite well what they want – what we think about as normal communication problems between people – that understand what Paul wants, but aren’t trying to help Paul get what he wants. And I think that a lot of the interesting difficulty, especially from a very long-run perspective, is really making sure that no gap opens up there.

Again, there's a gap between the problems that are most important on the very long-run perspective and the problems that people will most be confronting in order to make AI systems economically valuable.

I do think that there’s a lot of overlap – problems that people are working on, that make AI systems more valuable, are also helping very directly with the long-run outcome. But I think if you’re interested in differentially changing the trajectory or improving the probability that things go well over the long term, you’re more inclined to focus precisely on those problems which won’t be essential for making AI systems economically useful in the short term. And I think that’s really distinctive to what your motivation is or how you’re picking problems or prioritizing problems.

Robert Wiblin: One of the bottom lines for Pushmeet, I guess, was that people who want to make sure that AI goes well, they needn’t be especially fussy about whether they’re working on something that’s safety-specific or on something that’s just about building a new product that works well using machine learning.

Sounds like you’re a little bit more skeptical of that, or you think ideally people should in the medium term be aiming to work on things that seem like they disproportionately push on robustness and reliability?

Paul Christiano: Yeah, I think people who are mostly concerned about the long-term trajectory face this dilemma in every domain. If you live in the world where you think that almost all of the most serious challenges to humanity are caused by things humans are doing – by things not only that humans are doing, but by things humans are doing that we would often think of as part of productive progress, part of the goal – like we’re building new technologies, but those technologies are also the things that pose the main risks – then you have to be picky if you’re a person who wants to change the long-term trajectory.

You're just sort of like, "I probably am helping address those problems if I just go do a random thing – I work on a random project, make a random product better. I am helping address the kinds of problems we're concerned about. But I’m also at the same time contributing to bringing those problems closer to us in time."

And it’s sort of roughly a wash, if you’re on the average product, making the average product work. And there are subtle distinctions we could make, of like – I think if you are motivated to make products work well, if you’re like, “Not only do I want to do the thing that’s most economically valuable, I want to have more of an emphasis on making this product robust,” I think you’re just generally going to make a bunch of low-level decisions that will be helpful.

I definitely think you can have a pretty big impact by being fussy about which problems you work on.

Robert Wiblin: I guess there’s this open question of whether we should be happy if AI progress across the board just goes faster. What if, yeah, we can just speed up the whole thing by 20%? Both all of the safety and capabilities. As far as I understand, there’s kind of no consensus on this. People vary quite a bit on how pleased they’d be to see everything speed up in proportion.

Paul Christiano: Yeah, I think that’s right. I think my take, which is a reasonably common take, is it doesn’t matter that much from an alignment perspective. Mostly, it will just accelerate the time at which everything happens.

And there are some second-order terms that are really hard to reason about, like, “How good is it to have more or less computing hardware available?” Or “How good is it for there to be more or less kinds of other political change happening in the world prior to the development of powerful AI systems?”

There’s these kind of higher-order questions where people are very uncertain of whether it’s good or bad. But I guess my take would be that the net effect there is kind of small, and the main thing is I think accelerating AI matters much more on the next-100-years perspective.

If you care about welfare of people and animals over the next 100 years, then acceleration of AI looks reasonably good. So I think the main upside of faster AI progress is that people are going to be happy over the short term. I think if you care about the long term, it is roughly a wash, and people could debate whether it’s slightly positive or slightly negative but mostly it’s just accelerating where we’re going.

Robert Wiblin: Yeah, this has been one of the trickier questions that we’ve tried to answer in terms of giving people concrete career advice.

It seems to me if you’re someone who has done a PhD in ML or is very good at ML, but you currently can’t get a position that seems especially safety-focused or that is going to disproportionately affect safety more than capabilities, it is probably still good to take a job that just advances AI in general, mostly because you’ll be reaching the cutting edge potentially of what’s going on and improving your career capital a lot and having relevant understanding.

And the work, I guess you kind of think, is kind of close to a wash. It speeds things up a little bit – like, everything goes in proportion. It’s not clear whether that’s good or bad. But then you can potentially later on go and work on something that’s more alignment-specific, and that is the dominant term in the equation. Does that seem reasonable?

Paul Christiano: Yeah, I think that seems basically right to me. I think there’s some intuitive hesitation with the family of advice that’s like, “You should do this thing, which we think is roughly a wash on your values now, but there will be some opportunity in the future where you can sort of make a call.” I think there’s some intuitive hesitation about that, but I think that is roughly right.

Imagine if you offered to Paul two possible worlds. In one, there’s twice as many people working on machine learning and AI, but half of them really care about the long term and ensuring that AI is developed in a way that’s good for humanity’s long term. I'm like, "That sounds like a good trade."

We maybe then have less opportunity to do work right now. I think that’s the main negative thing. There will be less time to think about the alignment problem per se. But on the other hand, it just seems really good if a large fraction of the field really cares about making things go well. I just expect a field that has that character to be much more likely to handle issues in a way that’s good for the long term.

And I think you can sort of scale that down. It’s easiest for me to imagine the case where a significant fraction of the field is like that, but I think that if anything, the marginal people at the beginning are having probably a better cost-benefit analysis for them.

Robert Wiblin: Yeah, I guess I was suggesting that this would be the thing to do if you couldn’t get a job that was alignment-specific already. Say that they want to join your team but they’re just not quite good enough yet, they need to learn more. Or potentially, there’s just only so fast that the team can grow, so even though they’re good, you just can’t hire as quickly as people are coming on board.

But I suppose you have to make sure that, yeah, if people are going into these roles that we currently think are kind of just neutral, but good for improving their skills, that they don’t forget about that. That the original plan was at some point to switch to something different.

I guess there is a bit of a trap. It seems like people just in general tend to get stuck in doing what they’re doing now and convince themselves that whatever they’re doing is actually really useful. So you might think, “Yeah, it would be good to go in and then switch out," but you might have some doubts about whether in fact you will follow through on that.

Paul Christiano: Yeah, I think that’s right. I would be even happier, certainly, in the world where you took those half of people who might have gone into ML, and you instead moved them all into really thinking deeply about the long term and how to make things go well. That sounds like an even better world still.

If someone really cared about the long term and were like, “What should I do,” it’s a reasonably good option to just be like, “Go do this thing which is good on the short term and adjacent to an area we think is going to be really important over the long term.”

Robert Wiblin: There’s been this argument over the years that it would just be good in some way that we can’t yet anticipate to have people at the cutting edge of machine learning research who are concerned about the long term and alert to safety issues and alert to alignment issues that could have effects on the very long term. And people have gone back and forth on how useful that actually would be, to just be in the room where decisions are getting made.

It's kind of occurred to me that it seems like the machine learning community is really moving in the direction of sharing the views that you and I hold. A lot of people are just becoming concerned about “Will AI be aligned in the long term?” And it might be that if you’re particularly concerned about that now, then maybe that makes you different from your peers right now, but in 10 years’ time or 20 years’ time everyone will have converged on a similar vision as we have a better idea of what machine learning actually looks like and what the risks are when it’s deployed.

Paul Christiano: Yeah, I think that’s an interesting question, or an interesting possible concern with that kind of approach. I guess my take would be that there are some differences – I don’t know if you’d call them values differences or deep empirical or worldview differences – that are relevant here. Where I think to the extent that we’re currently thinking about problems that are going to become real problems, it’s going to be much, much more obvious that there are real problems.

And I think that to the extent that some of the problems we think about over the very long term are already obviously problems, people in the ML community are very interested in problems that are obviously problems – or, problems that are affecting the behavior of systems today.

Like, again, if these problems are real, that’s going to become more and more the case over time, and so people will become more and more interested in those problems.

I still think there is this question of, "How much are you interested in making the long term go well, versus how much are you doing your job or pursuing something which has a positive impact over the short term, or that you’re passionate about or interested in this other non-long-term impact of?" I do think there are just continuously going to be some calls to be made or some different decisions.

The field embodies some set of values. I think that people’s empirical views are changing more than the set of implicit values that they have. I think if you just said everyone who really cares about the long term isn’t going into this area, then the overall orientation of the field will persistently be different.

Robert Wiblin: Do you have any views on the particular technical approaches that Pushmeet mentioned in the episode or that the DeepMind folks have written up on their safety blog?

Paul Christiano: The stuff I’m most familiar with that Pushmeet’s group is working on is verification for robustness to perturbations, some work on verification more broadly, and some work on adversarial training and testing. Maybe those are the three things, I don’t know if there’s something else. I’m happy to go through those in order.

Robert Wiblin: Yeah, go through those.

Paul Christiano: So, I guess I’m generally pretty psyched about adversarial testing and training and verification. That is, I think there is this really important problem over both – this is one of those things at the intersection of "it matters over the short term, I think it matters (maybe even more) over the very long term" – of, you have some AI system, you want to delegate a bunch of work to maybe not just one but a whole bunch of AI systems. If they failed catastrophically, it would be really unrecoverably bad.

You can’t really rule out that case with traditional ML training, because you’re just going to try a thing on a bunch of cases that you’ve generated so far, experienced so far. So your training process isn’t at all constraining on this potential catastrophic failure in a new situation that comes up.

So we just want to have something, we want to change the ML training process to have some information about what would constitute a catastrophic failure and then not do that. I think that’s a problem that is in common between the short and long term. I think it matters a lot on the long term. It’s a little bit hard to say whether it’s more on the long term or short term, but I care about it a lot on the long term.

I think that the main approaches we have to that are – the three I really think about are adversarial training and testing, verification, and interpretability or transparency.

I just think people getting familiar with those techniques, becoming good at them, thinking about how you would apply them to richer kinds of specifications, how you grapple with the fundamental limitations in adversarial training where you have to rely on the adversary to think of a kind of case...

The way the technique works in general is that you're like, “I’m concerned about my system failing in the future. I’m going to have an adversary who’s going to generate some possible situations under which the system might fail. And then we’re going to run on those and see if it fails catastrophically.” You have this fundamental limitation where your adversary isn’t going to think of everything.

People are getting experience with, "How do we grapple with that limitation?" In some sense, verification is a response to that limitation. I think it’s productive to have people thinking about both verification and the limits of verification, and testing and the limits of testing. So overall I’m pretty excited about all of that.
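
As an illustration of the adversarial-testing loop described above, here is a minimal sketch. Everything in it is a toy assumption rather than anything from the conversation: `system` stands in for a trained model with a narrow failure region, `grid_adversary` is a hypothetical adversary that proposes evenly spaced situations, and `is_catastrophic` is an invented failure check. It shows the limitation mentioned above: an adversary that proposes too few cases can miss the failure entirely.

```python
from typing import List


def system(x: float) -> float:
    """Toy stand-in for a trained system: fine almost everywhere,
    but with a narrow region of inputs where it fails badly."""
    if 0.70 < x < 0.71:
        return -1000.0  # catastrophic outcome (hypothetical scale)
    return 1.0


def is_catastrophic(outcome: float) -> bool:
    """Hypothetical specification of what counts as a catastrophe."""
    return outcome < -100.0


def grid_adversary(n_cases: int) -> List[float]:
    """Hypothetical adversary: proposes n_cases + 1 evenly spaced
    situations in [0, 1]. A real adversary would search far more cleverly."""
    return [i / n_cases for i in range(n_cases + 1)]


def adversarial_test(cases: List[float]) -> List[float]:
    """Run the system on every proposed case and collect the failures."""
    return [x for x in cases if is_catastrophic(system(x))]


# A coarse adversary never proposes a case inside the failure region.
coarse_failures = adversarial_test(grid_adversary(10))
# A finer adversary does, so the test surfaces the problem.
fine_failures = adversarial_test(grid_adversary(1000))

print("coarse adversary found:", len(coarse_failures), "failures")
print("fine adversary found:", len(fine_failures), "failures")
```

Passing the coarse test says nothing about the region the adversary never visited, which is the gap that verification is meant to address.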

Robert Wiblin: Do you share Pushmeet’s general optimism?

Paul Christiano: I don’t know quantitatively exactly how optimistic he is. My guess would be that I’m less optimistic, in the sense that I’m like, “Well, there’s like tens of percent chance that we'll mess this up and lose the majority of the value of the future.” Whereas listening to him, that's not the overall sense I get of where he’s at.

Robert Wiblin: It's not the vibe.

Paul Christiano: Yeah. But it’s a little bit hard to know how to translate between a vibe and an actual level of optimism.

Robert Wiblin: Yeah, it is interesting. Someone could think there's a 20% chance that we’ll totally destroy everything, but still just have kind of a cheerful disposition, so they come across as, "Well, y'know, things could go well as well!"

Among people working on existential risks and global catastrophic risks, and I guess AI in particular, there’s this trade-off between not wanting to do things that other people disagree with or are unenthusiastic about, and at the same time not wanting to have a field that’s so conservative that there are no experiments done unless there is a consensus behind them. Do you think people are too inclined to make unilateralist-curse type mistakes, or not trying things enough?

Paul Christiano: I think my answer to this probably varies depending on the area. For reference, I think the policy you want to be following is: update on the fact that no one else wants to do this thing and then take that really seriously, engage with it a lot before deciding whether you want to do it. And ideally that’s going to involve engaging with the people who have made that decision to understand where they’re coming from.

I think I don’t have a very strong general sense of whether we’re more likely to make one mistake or the other. I think I’d expect the world systematically to make too much of the "a thing can be done unilaterally, so it gets done" one. In the context of this field, I don’t know if there are as many – yeah, I guess I don’t feel super concerned about either failure mode. Maybe I don’t feel that bad about where people are at.

[Continued at 1:27:00]


Is Specificity a Mental Model?

29 September, 2019 - 01:53
Published on September 28, 2019 10:53 PM UTC

This is Part IX of the Specificity Sequence

I've recently noticed a lot of smart people publishing lists of mental models. They're apparently having a moment:

There are hundreds of useful mental models to learn, such as “leverage”, “social proof”, “seizing the middle”, and of course, “mental model”. I want to help you be the very best, searching far and wide, teaching your brain to understand the power that’s inside.

Gotta Catch ’Em All

We’ve been focusing nonstop on one (super)powerful mental model called the “ladder of abstraction”, and seen it prove useful in a surprising variety of unrelated domains. The best mental models are the ones that have the largest number of applicability domains while also being the simplest and most compact. 😎

But despite all its usefulness, the ladder of abstraction doesn’t appear in any list of mental models I’ve seen to date. The closest it’s gotten is probably this entry from Farnam Street’s list in the "Military & War" section:

Seeing the Front
One of the most valuable military tactics is the habit of “personally seeing the front” before making decisions — not always relying on advisors, maps, and reports, all of which can be either faulty or biased.

Yes, advisors and maps and reports that tell you the reality on the ground may be “faulty or biased”, but there’s an even more fundamental problem: Their whole job is to slide the ground truth up the ladder of abstraction.

A report tells you that your enemy’s troops on the battlefield outnumber yours 2-to-1. Sounds like you should retreat, right? Not so fast. Let’s slide that down the ladder of abstraction by filling in more detail.

Rain clouds are gathering in the sky? The report might have neglected to mention that detail. When you realize it’s going to rain, you might think of a clever strategy that uses the rain to your advantage.

“Seeing the front” is an instance of “sliding down the ladder of abstraction”: you replace an abstract summarization of observations of the front with a lower-level data dump about the front (which happens to come from your own senses).

“Seeing the front” is absolutely a useful mental model to know; so are others like “Value Prop Story” and "mind-anchor". But “ladder of abstraction” is even more useful because it lets you derive these and a bunch of other mental models for yourself.

Next post: The Power to Draw Better (coming next weekend)


Axis-49 to Jammer MIDI Mapper

28 September, 2019 - 14:20
Published on September 28, 2019 11:20 AM UTC

One of the risks with new instruments is that you might put a lot of work into learning how to play something that then is discontinued. In the case of the jammer, things are worse because it was discontinued five years before I even got into the instrument. As I see used Axis-49s come up on resale sites I've been making lowball offers, as politely as I can, trying to collect a few spares, and at this point I have three and a half. [1] While my main goal with these is insurance against a future where mine breaks and I can't fix it, they don't need to stay on my shelf: I've been lending them out to people who want to play with the layout.

You can't use it without a MIDI mapper, though, because while I've rearranged the physical keys, that doesn't make it send different MIDI signals. I was trying to help someone get set up with one, and it turns out that the state of MIDI mappers for non-programmers isn't that great. Plus, with 98 keys, that's a lot of data entry. So I've made a stand-alone version for Mac: source code, executable program.

It's a quick cut-down version of the code behind my rhythm stage setup that looks for an Axis-49 and presents a virtual MIDI device (jammer) that produces the mapped notes. There are two binaries, one for holding the device with (non-functional) transpose keys up, the other with transpose keys down.

If you sometimes play with sharp-key instruments and other times play with flat-key ones, you can turn the device over and use the other binary, with a transposition MIDI mapper. This lets you have a range from F to B (F, C, G, D, A, E, B) in one orientation, centered on the key of D, and Db to G (Db, Ab, Eb, Bb, F, C, G) in the other, centered on the key of Bb.
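
The two key ranges above are runs of seven keys along the circle of fifths, three fifths either side of the center. A small sketch of that arithmetic follows; it is not taken from the mapper's source code, and the function name `keys_around` and the flat-only note spelling are assumptions made for the example.

```python
from typing import List

# Pitch classes spelled with flats (an assumption; it matches the
# spellings used in the ranges above).
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def keys_around(center: str, span: int = 3) -> List[str]:
    """Keys from `span` fifths below `center` to `span` fifths above it."""
    root = NOTE_NAMES.index(center)
    # A perfect fifth is 7 semitones; pitch classes wrap around modulo 12.
    return [NOTE_NAMES[(root + 7 * k) % 12] for k in range(-span, span + 1)]


print(keys_around("D"))   # ['F', 'C', 'G', 'D', 'A', 'E', 'B']
print(keys_around("Bb"))  # ['Db', 'Ab', 'Eb', 'Bb', 'F', 'C', 'G']
```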

[1] Three good ones, and one that's too old to go into "selfless" mode and so is effectively only half a keyboard.


Follow-Up to Petrov Day, 2019

28 September, 2019 - 02:47
Published on September 27, 2019 11:47 PM UTC

Hurrah! Success! I didn't know what to expect, and am pleasantly surprised to find the Frontpage is still intact. My thanks to everyone who took part, to everyone who commented on yesterday's post, and to everyone who didn't unilaterally blow up the site.

Launch Attempts Results

I said I would share usernames and codes of all attempts to launch the codes. Others on the team told me this seemed like a bad idea in many ways, and on reflection I agree - I think many people were not aware they were signing up for being publicly named and shamed, and I think it's good that people aren't surprised by their actions becoming public. Though if someone had successfully nuked the site I would have named them.

Nonetheless, I’ll share a bunch of info. First of all, the button was in a pretty central place, and it turns out you can hit it accidentally. Ray built the button so that you could only hit it once - it was forever after pressed.

  • The number of logged-in users who pressed the button was 102.
    • (Ruby made a sheet of times when people pressed the button, redacting most of the info.)
  • I have no number for logged-out users, since for them pressing it brought up a window asking them to log in. (Er, I'm not certain that's the best selection process for new users.)
  • The number of users who actually submitted launch codes is 18.
    • 11 of those accounts had zero karma, 7 accounts had positive karma. None of the users were people who had been given real codes.
  • Several users submitted launch codes before clicking through to find out what the button even did - I hope this initiative serves them well in life.
  • A few accounts were made on the day, presumably for this purpose; I'm happy to name these. They include users like "bomb_presser", "The Last Harbinger", and "halosaga", whose codes were "00000000", "NL73njLH58et1Ec0" and "diediedie" respectively.

LW user ciphergoth (Paul Crowley) shared his launch codes on Facebook (indeed I had sent him real launch codes), and two users copied and entered them. However, he had actually shared fake codes. "The Last Harbinger" entered them.

A second user entered them, who had positive karma, and was not someone to whom I had sent real codes. However, they failed to properly copy them, missing the final character. To them, I can only say what I had prepared to say to anyone who mis-entered what they believed were correct launch codes. "First, you thought you were a failure to the community. But then, you learned, you were a failure to yourself."

Oli and Ray decided that anyone submitting launch codes deserved a janky user-experience. I hope all of the users enjoyed finding out that when you try to nuke the site, regardless of whether you enter correct or incorrect launch codes, the launch pad just disappears and nothing else happens.

Last night during my house's Petrov Day ceremony, which ran from about 8:10-9:10, I nervously glanced over at the LW frontpage on the open laptop as it refreshed every 60 seconds. Some small part of me was worried about Quirinus_Quirrell following through on his threat to nuke the site at 9pm. I honestly did not expect that someone could create a character hard enough that it would leap out of the book and hold us all hostage in a blackmail attempt. Damn you Eliezer Yudkowsky!

Looking Ahead

I thought the discussion was excellent. I mostly avoided participating to let others decide for themselves, but I might go back and add more comments now it's done. As Said Achmiz pointed out in the comments of the last post, it'll be better next year to have more time in advance for people to discuss the ethics of the situation and think, and that will be even more informative and valuable. Though I still learned a lot this year, and I think overall it turned out as well as I could've hoped.

I'll think more about how to do it next year. One thing I will say is that I'd ideally like to be able to reach an equilibrium where 100s of users every year don't fire the launch codes, to build up a real tradition of not taking unilateralist action - sitting around and not pressing buttons. Several users have suggested to me fun, gamified ways of changing the event (e.g. versions where users are encouraged to trick other users into thinking you can trust them but then nuke the site), but overall in ways that I think decreased the stakes and common knowledge effects, which is why I don't feel too excited about them.


Partial Agency

28 September, 2019 - 01:04
Published on September 27, 2019 10:04 PM UTC

Epistemic status: very rough intuitions here.

I think there's something interesting going on with Evan's notion of myopia.

Evan has been calling this thing "myopia". Scott has been calling it "stop-gradients". In my own mind, I've been calling the phenomenon "directionality". Each of these words gives a different set of intuitions about how the cluster could eventually be formalized.


Nash equilibria are, abstractly, modeling agents via an equation like $a^* = \operatorname{argmax}_a f(a, a^*)$. In words: $a^*$ is the agent's mixed strategy. The payoff $f(\cdot, \cdot)$ is a function of the mixed strategy in two ways: the first argument is the causal channel, where actions directly have effects; the second argument represents the "acausal" channel, IE, the fact that the other players know the agent's mixed strategy and this influences their actions.
The agent is maximizing across the first channel, but "ignoring" the second channel; that is why we have to solve for a fixed point to find Nash equilibria. This motivates the notion of "stop gradient": if we think in terms of neural-network type learning, we're sending the gradient through the first argument but not the second. (It's a kind of mathematically weird thing to do!)
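A minimal sketch of the stop-gradient idea, in plain Python. The payoff function, step size, and function names are all illustrative assumptions, not taken from any particular game. Gradient ascent through the first argument only settles at a fixed point of the argmax equation, while ascent through both arguments ends up somewhere else with a higher payoff.

```python
def f(a, a_known):
    """Hypothetical payoff: `a` is the causal channel, `a_known` is what others react to."""
    return a * (1.0 - 2.0 * a_known) + 3.0 * a_known


def df_da(a, a_known):
    """Partial derivative of f with respect to its first argument only."""
    return 1.0 - 2.0 * a_known


def df_dknown(a, a_known):
    """Partial derivative of f with respect to its second argument only."""
    return -2.0 * a + 3.0


def ascend(stop_gradient, a=0.2, lr=0.1, steps=200):
    """Gradient ascent on f(a, a), keeping a inside [0, 1]."""
    for _ in range(steps):
        grad = df_da(a, a)               # gradient through the first channel
        if not stop_gradient:
            grad += df_dknown(a, a)      # also send gradient through the second channel
        a = min(1.0, max(0.0, a + lr * grad))
    return a


myopic = ascend(stop_gradient=True)    # approaches 0.5, where f(., 0.5) is flat in a
full = ascend(stop_gradient=False)     # approaches 1.0, the maximum of f(a, a) on [0, 1]
print(myopic, f(myopic, myopic))
print(full, f(full, full))
```

With this particular payoff, the stop-gradient learner ends up indifferent between its actions, much like a player in a mixed equilibrium, and it leaves payoff on the table compared with the learner that optimizes through both channels.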


Thinking in terms of iterated games, we can also justify the label "myopia". Thinking in terms of "gradients" suggests that we're doing some kind of training involving repeatedly playing the game. But we're training an agent to play as if it's a single-shot game: the gradient is rewarding behavior which gets more reward within the single round even if it compromises long-run reward. This is a weird thing to do: why implement a training regime to produce strategies like that, if we believe the Nash-equilibrium model, IE we think the other players will know our mixed strategy and react to it? We can, for example, win chicken by going straight more often than is myopically rational. Generally speaking, we expect to get better rewards in the rounds after training if we optimized for non-myopic strategies during training.
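The chicken claim can be checked with a small worked example. The payoff numbers below are hypothetical, chosen only to have the usual chicken structure, and the opponent is assumed to know our mixed strategy and best-respond to it, as in the Nash-equilibrium model described above.

```python
from fractions import Fraction

# Hypothetical payoff table for chicken: PAYOFF[(mine, theirs)] is my payoff.
PAYOFF = {
    ("straight", "straight"): Fraction(-10),
    ("straight", "swerve"): Fraction(1),
    ("swerve", "straight"): Fraction(-1),
    ("swerve", "swerve"): Fraction(0),
}


def expected_payoff(my_move, p_other_straight):
    """Expected payoff of a pure move against someone who goes straight with probability p."""
    return (p_other_straight * PAYOFF[(my_move, "straight")]
            + (1 - p_other_straight) * PAYOFF[(my_move, "swerve")])


def opponent_best_response(p_straight):
    """Opponent knows our mixed strategy and best-responds (ties broken toward swerving)."""
    straight = expected_payoff("straight", p_straight)
    swerve = expected_payoff("swerve", p_straight)
    return "straight" if straight > swerve else "swerve"


def committed_payoff(p_straight):
    """Our expected payoff when the opponent best-responds to our known mixed strategy."""
    reply = opponent_best_response(p_straight)
    return (p_straight * PAYOFF[("straight", reply)]
            + (1 - p_straight) * PAYOFF[("swerve", reply)])


# Symmetric mixed equilibrium: each player is indifferent between the two moves.
# With these payoffs: 1 - 11p = -p, so p = 1/10.
p_nash = Fraction(1, 10)
nash_payoff = expected_payoff("swerve", p_nash)

# Going straight far more often than the equilibrium prescribes.
bold_payoff = committed_payoff(Fraction(9, 10))
print(nash_payoff, bold_payoff)
```

Under these assumptions the equilibrium strategy earns -1/10 per round, while a player known to go straight 90% of the time earns 9/10, because the opponent's best response is to swerve.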


To justify my term "directionality" for these phenomena, we have to look at a different example: the idea that "when beliefs and reality don't match, we change our beliefs". IE: when optimizing for truth, we optimize "only in one direction". How is this possible? We can write down a loss function, such as Bayes' loss, to define accuracy of belief. But how can we optimize it only "in one direction"?

We can see that this is the same thing as myopia. When training predictors, we only consider the efficacy of hypotheses one instance at a time. Consider supervised learning: we have "questions" x1,x2,... etc and are trying to learn "answers" y1,y2,... etc. If a neural network were somehow able to mess with the training data, it would not have much pressure to do so. If it could give an answer on instance x1 which improved its ability to answer on x2 by manipulating y2, the gradient would not specially favor this. Suppose it is possible to take some small hit (in log-loss terms) on y1 for a large gain on y2. The large gain for x2 would not reinforce the specific neural patterns responsible for making y2 easy (only the patterns responsible for successfully taking advantage of the easiness). The small hit on x1 means there's an incentive not to manipulate y2.

It is possible that the neural network learns to manipulate the data, if by chance the neural patterns which shift x1 are the same as those which successfully exploit the manipulation at x2. However, this is a fragile situation: if there are other neural sub-patterns which are equally capable of giving the easy answer on x2, the reward gets spread around. (Think of these as parasites taking advantage of the manipulative strategy without doing the work necessary to sustain it.) Because of this, the manipulative sub-pattern may not "make rent": the amount of positive gradient it gets may not make up for the hit it takes on x1. And all the while, neural sub-patterns which do better on x1 (by refusing to take the hit) will be growing stronger. Eventually they can take over. This is exactly like myopia: strategies which do better in a specific case are favored for that case, despite global loss. The neural network fails to successfully coordinate with itself to globally minimize loss.

To see why this is also like stop-gradients, think about the loss function as $l(w, w^*)$: the neural weights $w$ determine loss through a "legitimate" channel (the prediction quality on a single instance), plus an "illegitimate" channel (the cross-instance influence which allows manipulation of y2 through the answer given for x1). We're optimizing through the first channel, but not the second.

The difference between supervised learning and reinforcement learning is just: reinforcement learning explicitly tracks helpfulness of strategies across time, rather than assuming a high score at x2 has to do with only behaviors at x2! As a result, RL can coordinate with itself across time, whereas supervised learning cannot.
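A rough sketch of the two kinds of credit assignment, using made-up scores for a two-instance "episode". The function names and numbers are assumptions for illustration; real supervised and reinforcement learners are of course more involved.

```python
def per_instance_credit(scores, t):
    """Supervised-style credit: behavior at step t is judged only by the score at step t."""
    return scores[t]


def across_time_credit(scores, t):
    """RL-style credit: behavior at step t is judged by everything from step t onward."""
    return sum(scores[t:])


# Hypothetical scores (higher is better) for the two instances x1, x2.
honest = [0.0, 0.0]        # answer x1 as well as possible, leave y2 alone
manipulate = [-0.1, 1.0]   # small hit on x1 in exchange for a large gain on x2

for name, credit in (("per-instance", per_instance_credit),
                     ("across-time", across_time_credit)):
    better = "manipulate" if credit(manipulate, 0) > credit(honest, 0) else "honest"
    print(name, "credit favors the", better, "answer on x1")
```

Judged one instance at a time, the manipulative answer on x1 simply looks worse; only when credit flows across time does it look better.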

Keep in mind that this is a good thing: the algorithm may be "leaving money on the table" in terms of prediction accuracy, but this is exactly what we want. We're trying to make the map match the territory, not the other way around.

Important side-note: this argument obviously has some relation to the question of how we should think about inner optimizers and how likely we should expect them to be. However, I think it is not a direct argument against inner optimizers. (1) The emergence of an inner optimizer is exactly the sort of situation where the gradients end up all feeding through one coherent structure. Other potential neural structures cannot compete with the sub-agent, because it has started to intelligently optimize; few interlopers can take advantage of the benefits of the inner optimizer's strategy, because they don't know enough to do so. So, all gradients point to continuing the improvement of the inner optimizer rather than alternate more-myopic strategies. (2) Being an inner optimizer is not synonymous with non-myopic behavior. An inner optimizer could give myopic responses on the training set while internally having less-myopic values. Or, an inner optimizer could have myopic but very divergent values. Importantly, an inner optimizer need not take advantage of any data-manipulation of the training set like that I've described; it need not even have access to any such opportunities.

The Partial Agency Paradox

I've given a couple of examples. I want to quickly give some more to flesh out the clusters as I see them:

  • As I said, myopia is "partial agency" whereas foresight is "full agency". Think of how an agent with high time-preference (i.e., steep temporal discounting) can be money-pumped by an agent with low time-preference. But the limit of no-temporal-discounting-at-all is not always well-defined.
  • An updateful agent is "partial agency" whereas updatelessness is "full agency": the updateful agent is failing to use some channels of influence to get what it wants, because it already knows those things and can't imagine them going differently. Again, though, full agency seems to be an idealization we can't quite reach: we don't know how to think about updatelessness in the context of logical uncertainty, only more- or less-updateful strategies.
  • I gave the beliefs←territory example. We can also think about the values→territory case: when the world differs from our preferences, we change the world, not our preferences. This has to do with avoiding wireheading.
  • Similarly, we can think of examples of corrigibility -- such as respecting an off button, or avoiding manipulating the humans -- as partial agency.
  • Causal decision theory is more "partial" and evidential decision theory is less so: EDT wants to recognize more things as legitimate channels of influence, while CDT claims they're not. Keep in mind that the math of causal intervention is closely related to the math which tells us about whether an agent wants to manipulate a certain variable -- so there's a close relationship between CDT-vs-EDT and wireheading/corrigibility.

I think people often take a pro- or anti- partial agency position: if you are trying to one-box in Newcomblike problems, trying to cooperate in prisoner's dilemma, trying to define logical updatelessness, trying for superrationality in arbitrary games, etc... you are generally trying to remove barriers to full agency. On the other hand, if you're trying to avert instrumental incentives, make sure an agent allows you to change its values, or doesn't prevent you from pressing an off button, or doesn't manipulate human values, etc... you're generally trying to add barriers to full agency.

I've historically been more interested in dropping barriers to full agency. I think this is partially because I tend to assume that full agency is what to expect in the long run, IE, "all agents want to be full agents" -- evolutionarily, philosophically, etc. Full agency should result from instrumental convergence. Attempts to engineer partial agency for specific purposes feel like fighting against this immense pressure toward full agency; I tend to assume they'll fail. As a result, I tend to think about AI alignment research as (1) needing to understand full agency much better, (2) needing to mainly think in terms of aligning full agency, rather than averting risks through partial agency.

However, in contrast to this historical view of mine, I want to make a few observations:

  • Partial agency sometimes seems like exactly what we want, as in the case of map←territory optimization, rather than a crude hack which artificially limits things.
  • Indeed, partial agency of this kind seems fundamental to full agency.
  • Partial agency seems ubiquitous in nature. Why should I treat full agency as the default?

So, let's set aside pro/con positions for a while. What I'm interested in at the moment is the descriptive study of partial agency as a phenomenon. I think this is an organizing phenomenon behind a lot of stuff I think about.

The partial agency paradox is: why do we see partial agency naturally arising in certain contexts? Why are agents (so often) myopic? Why have a notion of "truth" which is about map←territory fit but not the other way around? Partial agency is a weird thing. I understand what it means to optimize something. I understand how a selection process can arise in the world (evolution, markets, machine learning, etc), which drives things toward maximization of some function. Partial optimization is a comparatively weird thing. Even if we can set up a "partial selection process" which incentivises maximization through only some channels, wouldn't it be blind to the side-channels, and so unable to enforce partiality in the long-term? Can't someone always come along and do better via full agency, no matter how our incentives are set up?

Of course, I've already said enough to suggest a resolution to this puzzle.

My tentative resolution to the paradox is: you don't build "partial optimizers" by taking a full optimizer and trying to add carefully balanced incentives to create indifference about optimizing through a specific channel, or anything like that. (Indifference at the level of the selection process does not lead to indifference at the level of the agents evolved by that selection process.) Rather, partial agency is what selection processes incentivize by default. If there's a learning-theoretic setup which incentivizes the development of "full agency" (whatever that even means, really!) I don't know what it is yet.


Learning is basically episodic. In order to learn, you (sort of) need to do the same thing over and over, and get feedback. Reinforcement learning tends to assume ergodic environments so that, no matter how badly the agent messes up, it eventually re-enters the same state so it can try again -- this is a "soft" episode boundary. Similarly, RL tends to require temporal discounting -- this also creates a soft episode boundary, because things far enough in the future matter so little that they can be thought of as "a different episode".
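The "soft episode boundary" created by temporal discounting can be made concrete with a little arithmetic. This is a sketch under the usual exponential-discounting assumption; the threshold epsilon and the function names are arbitrary illustrative choices.

```python
import math


def discount_weight(gamma, t):
    """Weight placed on a reward t steps in the future under discount factor gamma."""
    return gamma ** t


def soft_horizon(gamma, epsilon=0.01):
    """Smallest number of steps after which the discount weight is at most epsilon."""
    return math.ceil(math.log(epsilon) / math.log(gamma))


horizons = {gamma: soft_horizon(gamma) for gamma in (0.5, 0.9, 0.99)}
for gamma, steps in horizons.items():
    print(f"gamma={gamma}: rewards more than {steps} steps away carry under 1% weight")
```

Steeper discounting (smaller gamma) pulls the boundary closer, which is one way to see myopia as a matter of degree.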

So, like map←territory learning (that is, epistemic learning), we can kind of expect any type of learning to be myopic to some extent.

This fits the picture where full agency is an idealization which doesn't really make sense on close examination, and partial agency is the more real phenomenon. However, this is absolutely not a conjecture on my part that all learning algorithms produce partial agents of some kind rather than full agents. There may still be frameworks which allow us to approach full agency in the limit, such as taking the limit of diminishing discount factors, or considering asymptotic behavior of agents who are able to make precommitments. We may be able to achieve some aspects of full agency, such as superrationality in games, without others.

Again, though, my interest here is more to understand what's going on. The point is that it's actually really easy to set up incentives for partial agency, and not so easy to set up incentives for full agency. So it makes sense that the world is full of partial agency.

Some questions:

  • To what extent is it really true that settings such as supervised learning disincentivize strategic manipulation of the data? Can my argument be formalized?
  • If thinking about "optimizing a function" is too coarse-grained (a supervised learner doesn't exactly minimize prediction error, for example), what's the best way to revise our concepts so that partial agency becomes obvious rather than counterintuitive?
  • Are there better ways of characterizing the partiality of partial agents? Does myopia cover all cases (so that we can understand things in terms of time-preference), or do we need the more structured stop-gradient formulation in general? Or perhaps a more causal-diagram-ish notion, as my "directionality" intuition suggests? Do the different ways of viewing things have nice relationships to each other?
  • Should we view partial agents as multiagent systems? I've characterized it in terms of something resembling game-theoretic equilibrium. The 'partial' optimization of a function arises from the price of anarchy, or as it's known around LessWrong, Moloch. Are partial agents really bags of full agents keeping each other down? This seems a little true, to me, but also doesn't strike me as the most useful way of thinking about partial agents. For one thing, it takes full agents as a necessary concept to build up partial agents, which seems wrong to me.
  • What's the relationship between the selection process (learning process, market, ...) and the type of partial agents incentivised by it? If we think in terms of myopia: given a type of myopia, can we design a training procedure which tracks or doesn't track the relevant strategic influences? If we think in terms of stop-gradients: we can take "stop-gradient" literally and stop there, but I suspect there is more to be said about designing training procedures which disincentivize the strategic use of specified paths of influence. If we think in terms of directionality: how do we get from the abstract "change the map to match the territory" to the concrete details of supervised learning?
  • What does partial agency say about inner optimizers, if anything?
  • What does partial agency say about corrigibility? My hope is that there's a version of corrigibility which is a perfect fit in the same way that map←territory optimization seems like a perfect fit.

Ultimately, the concept of "partial agency" is probably confused. The partial/full clustering is very crude. For example, it doesn't make sense to think of a non-wireheading agent as "partial" because of its refusal to wirehead. And it might be odd to consider a myopic agent as "partial" -- it's just a time-preference, nothing special. However, I do think I'm pointing at a phenomenon here, which I'd like to understand better.


[Talk] Paul Christiano on his alignment taxonomy

September 27, 2019 - 21:37
Published on September 27, 2019 6:37 PM UTC

Paul Christiano makes an amazing tree of subproblems in alignment. I saw this talk live and enjoyed it.


Attainable Utility Theory: Why Things Matter

September 27, 2019 - 19:48
Published on September 27, 2019 4:48 PM UTC

If you haven't read the prior posts, please do so now. This sequence can be spoiled.



Long-term Donation Bunching?

September 27, 2019 - 15:40
Published on September 27, 2019 12:40 PM UTC

Let's say you're a relatively well-off American with an income of $100k and have taken the Giving What We Can pledge to donate 10% of your money to the most effective charities you can find. The standard deduction in the US is currently $12k, which means if you were to donate $10k/year it effectively wouldn't be tax deductible. [1]

The standard EA suggestion here is Donation Bunching: instead of donating $10k every year, donate $20k in half the years and $0 in the other half. In the years when you donate, you itemize your deductions; in the others, you take the standard deduction. Considered over two years, $8k of the $20k (40%) is deductible.

But we could do better than that! You can deduct up to 60% of your income if you're donating cash, so you could have five years of donating $0 and one year of donating $60k. Considered over six years, $48k of the $60k (80%) is deductible. [2]
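The arithmetic behind the 40% and 80% figures can be reproduced with a short sketch. It uses the numbers from the text ($100k income, $10k/year giving, $12k standard deduction, 60% cash limit) and the same simplification the text makes: with no other itemizable expenses, only the part of the bunched donation above the standard deduction counts as effectively deductible. The function name is a hypothetical one for illustration.

```python
INCOME = 100_000
ANNUAL_GIVING = 10_000            # 10% of income
STANDARD_DEDUCTION = 12_000
CASH_DEDUCTION_CAP_PERCENT = 60   # share of income deductible for cash gifts


def bunching_summary(years):
    """Return (donation, effectively deductible amount, deductible fraction) for one cycle."""
    donation = ANNUAL_GIVING * years
    cap = INCOME * CASH_DEDUCTION_CAP_PERCENT // 100
    if donation > cap:
        raise ValueError("bunched donation exceeds the deduction cap for one year")
    # Itemizing only helps by the amount it exceeds the standard deduction.
    deductible = max(0, donation - STANDARD_DEDUCTION)
    return donation, deductible, deductible / donation


for years in (1, 2, 6):
    donation, deductible, fraction = bunching_summary(years)
    print(f"{years} year(s): ${deductible:,} of ${donation:,} ({fraction:.0%}) deductible")
```

Six years is the longest cycle this sketch allows, since a seventh year would push the bunched donation past 60% of income.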

This sounds great, right? Money donated to effective charities does much more good than money collected by the government, and with a bit of planning you can effectively move 80% of your donations to being pre-tax. But I don't think it's a good idea!

Over time people change, and a very common way people change as they get older is to turn inward. You start out as a bright-eyed idealist, enthusiastic about making the world a better place and willing to make sacrifices for what you believe in. Then you burn out working too many hours doing something that doesn't feel as effective as you thought it would be, or you start feeling a pull to have kids and focus your efforts there, or you just stop feeling so motivated by altruism, or dozens of other things, and after a few years the idea of giving your money to people who need it more still sounds nice but isn't a priority anymore.

While we don't have good data on the rate at which this happens, in a small sample about half of people in the effective altruism movement in 2013 were no longer involved five years later. If you think your current self is correct to be altruistic and don't want to leave donations up to a likely less-generous future person then bunching donations over several years is harmful: the substantial possibility that you don't actually donate outweighs the tax savings. [3]
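A back-of-the-envelope version of this comparison, with loudly labeled assumptions: the "about half after five years" figure is stretched into a smooth exponential decay, the bunched gift is made in the last year of a six-year cycle, the marginal tax rate of 24% is a hypothetical placeholder, and the whole tax saving is assumed to be donated as well (which is generous to bunching). None of these modeling choices come from the text.

```python
def retention(years, half_life_years=5.0):
    """Chance of still donating after `years`, assuming about half drop out every five years."""
    return 0.5 ** (years / half_life_years)


ANNUAL_GIVING = 10_000
STANDARD_DEDUCTION = 12_000
MARGINAL_TAX_RATE = 0.24   # hypothetical value, not from the text
CYCLE_YEARS = 6

# Plan A: give $10k in every year in which you are still committed (years 0..5).
expected_annual = sum(ANNUAL_GIVING * retention(t) for t in range(CYCLE_YEARS))

# Plan B: give everything in the last year of the cycle, plus the tax saved by itemizing.
bunched_gift = ANNUAL_GIVING * CYCLE_YEARS
tax_saved = MARGINAL_TAX_RATE * (bunched_gift - STANDARD_DEDUCTION)
expected_bunched = retention(CYCLE_YEARS - 1) * (bunched_gift + tax_saved)

print(f"expected giving, annual plan:  ${expected_annual:,.0f}")
print(f"expected giving, bunched plan: ${expected_bunched:,.0f}")
```

Under these assumptions the annual plan comes out ahead in expectation, which matches the direction of the argument; different retention or tax numbers would change the size of the gap.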

(Thanks to someone I talked to at the SSC Boston Meetup for asking me the question that got me thinking about this.)

[1] Specifically, unless you had other reasons to itemize your tax deductions, you would do better to take the $12k standard deduction.

[2] This ignores inflation and investment income, but they aren't large enough to change the picture much over a ~6 year window.

[3] Depending on your views on discount rates ("how much better is it to give this year than next?") it might also not be so good.



Happy Petrov Day!

September 27, 2019 - 05:10
Published on September 27, 2019 2:10 AM UTC

I don't care much about feast days or memorial days. Assumption of Mary? Meh. Slovak Constitution Day? Shrug. Labour Day? Real socialism had sucked all the joy out of that one.

The only one that I do observe is Petrov Day. Not that I do anything special. I don't even take a day off from work. I just take a few minutes to contemplate.

But now I am thinking that maybe, on this festive occasion, I should do a little bit more, maybe something that's public, something that has at least a slight feeling of a ceremony.

And given that while browsing the newspapers today I haven't found a single mention of the event, no op-eds, no commemorative articles, no historical analysis, I decided to write a witness report of my own, no matter how petty and unimportant it may be.

On September 26th, 1983 I was ten years old. I was attending 4th grade of the Czech school at the Czechoslovak embassy in London. The authorities didn't want socialist kids to mingle with capitalist kids, so we weren't allowed to attend local schools. (There were sporadic, unsuccessful attempts to mix the kids from different socialist countries, though.) The school resided in a family house in Hampstead and there were approximately forty kids attending. I don't remember that particular day, but given that it was in autumn and that it was in London, I assume it was raining. It was Monday. We used to live close to the school, several Czechoslovak families close to each other, and so, same as every morning, one of the mothers walked us to the school. Kids that lived further away were driven in by a minibus. There were four grades in the school, two teachers and two classrooms. Kindergarten was in a separate room near the entrance. Me and my friend, we once did a puppet show for the kindergarteners. We had only one puppet (a chicken) and so I have no idea what the other one of us was doing. The memories have faded in the meantime. There was a narrow room with no windows in the basement that served as a library. My favourite book was about cars: a hand-illustrated compendium of models from the very beginning of automobilism up to the seventies. One of the cars was called Zaporozhets, which was a bit puzzling and mysterious because it sounded like it should have horns. The room was also used for more nefarious purposes, when the boys were all too interested in what's under girls' skirts and the girls were not unwilling to give them a peek. In the afternoon, back at home, we used to play with Lego and my new chemistry set. Mostly mixing all chemicals together and obtaining a boring greyish liquid. We even had an experimental vegetable garden where we successfully grew carrots 2 cm in length. Me and my sister, we used to sleep in a bunk bed. When she got angry she used to hit me with a book called "Fish and Fishing". I was older and had longer arms though, so I was able to keep her at a safe distance.

And yes, we did our homework because we thought the next day was coming. We had no idea that we were almost burned alive that day.

September 26th, 2019

by martin_sustrik


Idols of the Mind Pt. 2 (Novum Organum Book 1: 53-68)

September 27, 2019 - 01:24
Published on September 26, 2019 10:24 PM UTC

This is the fifth post in the Novum Organum sequence. For context, see the sequence introduction.

We have used Francis Bacon's Novum Organum in the version presented at www.earlymoderntexts.com. Translated by and copyright to Jonathan Bennett. Prepared for LessWrong by Ruby.

Ruby's Reading Guide

Novum Organum is organized as two books each containing numbered "aphorisms." These vary in length from three lines to sixteen pages. Titles of posts in this sequence, e.g. Idols of the Mind Pt. 1, are my own and do not appear in the original. While the translator, Bennett, encloses his editorial remarks in a single pair of [brackets], I have enclosed mine in a [[double pair of brackets]].

Bennett's Reading Guide

[Brackets] enclose editorial explanations. Small ·dots· enclose material that has been added, but can be read as though it were part of the original text. Occasional •bullets, and also indenting of passages that are not quotations, are meant as aids to grasping the structure of a sentence or a thought. Every four-point ellipsis . . . . indicates the omission of a brief passage that seems to present more difficulty than it is worth. Longer omissions are reported between brackets in normal-sized type.

Aphorism Concerning the Interpretation of Nature: Book 1: 53–68

by Francis Bacon

53. The idols of the cave—·my topic until the end of 58·— arise from the particular mental and physical make-up of the individual person, and also from upbringing, habits, and chance events. There are very many of these, of many different kinds; but I shall discuss only the ones we most need to be warned against—the ones that do most to disturb the clearness of the intellect.

54. A man will become attached to one particular science and field of investigation either because •he thinks he was its author and inventor or because •he has worked hard on it and become habituated to it. But when someone of this kind turns to general topics in philosophy ·and science· he wrecks them by bringing in distortions from his former fancies. This is especially visible in Aristotle, who made his natural science a mere bond-servant to his logic, rendering it contentious and nearly useless. The chemists have taken a few experiments with a furnace and made a fantastic science out of it, one that applies to hardly anything. . . .

[In this work ‘chemists’ are alchemists. Nothing that we would recognize as chemistry existed.]

[[We might see Bacon here as claiming that "seeing everything as a nail" can be very harmful.]]

55. When it comes to philosophy and the sciences, minds differ from one another in one principal and fairly radical way: some minds have more liking for and skill in •noting differences amongst things, others are adapted rather to •noting things’ resemblances. The •steady and acute mind can concentrate its thought, fixing on and sticking to the subtlest distinctions; the •lofty and discursive mind recognizes and puts together the thinnest and most general resemblances. But each kind easily goes too far: one by •grasping for ·unimportant· differences between things, the other by •snatching at shadows.

56. Some minds are given to an extreme admiration of antiquity, others to an extreme love and appetite for novelty. Not many have the temperament to steer a middle course, not pulling down sound work by the ancients and not despising good contributions by the moderns. The sciences and philosophy have suffered greatly from this, because these attitudes to antiquity and modernity are not judgments but mere enthusiasms. Truth is to be sought not in •what people like or enjoy in this or that age, but in •the light of nature and experience. The •former is variable, the •latter is eternal. So we should reject these enthusiasms, and take care that our intellect isn’t dragged into them.

57. When you think ·hard and long and uninterruptedly· about nature and about bodies in their simplicity—·i.e. think of topics like matter as such·—your intellect will be broken up and will fall to pieces. When on the other hand you think ·in the same way· about nature and bodies in all their complexity of structure, your intellect will be stunned and scattered. The difference between the two is best seen by comparing the school of Leucippus and Democritus with other philosophies. For the members of that school were so busy with the ·general theory of· particles that they hardly attended to the structure, while the others were so lost in admiration of the structure that they didn’t get through to the simplicity of nature. What we should do, therefore, is alternate between these two kinds of thinking, so that the intellect can become both penetrating and comprehensive, avoiding the disadvantages that I have mentioned, and the idols they lead to.

58. Let that kind of procedure be our prudent way of keeping off and dislodging the idols of the cave, which mostly come from

  • ·intellectual· favouritism (54),
  • an excessive tendency to compare or to distinguish (55),
  • partiality for particular historical periods (56), or
  • the largeness or smallness of the objects contemplated (57).

Let every student of nature take this as a general rule for helping him to keep his intellect balanced and clear: when your mind seizes on and lingers on something with special satisfaction, treat it with suspicion!

59. The idols of the market place are the most troublesome of all—idols that have crept into the intellect out of the contract concerning words and names [Latin verborum et nominum, which could mean ‘verbs and nouns’; on the contract, see 43]. Men think that their reason governs words; but it is also true that words have a power of their own that reacts back onto the intellect; and this has rendered philosophy and the sciences sophistical and idle. Because words are usually adapted to the abilities of the vulgar, they follow the lines of division that are most obvious to the vulgar intellect. When a language-drawn line is one that a sharper thinker or more careful observer would want to relocate so that it suited the true divisions of nature, words stand in the way of the change. That’s why it happens that when learned men engage in high and formal discussions they often end up arguing about words and names, using definitions to sort them out—thus •ending where, according to mathematical wisdom and mathematical practice, it would have been better to •start! But when it comes to dealing with natural and material things, definitions can’t cure this trouble, because the definitions themselves consist of words, and those words beget others. So one has to have recourse to individual instances. . . .

[[Bacon grokked that misuses of words were a great cause of confusion. He probably would have liked the A Human's Guide to Words Sequence. See Where to Draw the Boundary? and 37 Ways That Words Can Be Wrong.]]

60. The idols that words impose on the intellect are of two kinds. (1) There are names of things that don’t exist. Just as there are things with no names (because they haven’t been observed), so also there are names with no things to which they refer—these being upshots of fantastic ·theoretical· suppositions. Examples of names that owe their origin to false and idle theories are ‘fortune’, ‘prime mover’, ‘planetary orbits’, and ‘element of fire’. This class of idols is fairly easily expelled, because you can wipe them out by steadily rejecting and dismissing as obsolete all the theories ·that beget them·.

[[See Empty Labels.]]

(2) Then there are names which, though they refer to things that do exist, are confused and ill-defined, having been rashly and incompetently derived from realities. Troubles of this kind, coming from defective and clumsy abstraction, are intricate and deeply rooted. Take the word ‘wet’, for example. If we look to see how far the various things that are called ‘wet’ resemble one another, we’ll find that ‘wet’ is nothing but a mark loosely and confusedly used to label a variety of states of affairs that can’t be unified through any constant meaning. For something may be called ‘wet’ because it

  • easily spreads itself around any other body,
  • has no boundaries and can’t be made to stand still,
  • readily yields in every direction,
  • easily divides and scatters itself,
  • easily unites and collects itself,
  • readily flows and is put in motion,
  • readily clings to another body and soaks it,
  • is easily reduced to a liquid, or (if it is solid) easily melts.

Accordingly, when you come to apply the word, if you take it in one sense, flame is wet; if in another, air is not wet; if in another, fine dust is wet; if in another, glass is wet. So that it is easy to see that the notion has been taken by abstraction only from water and common and ordinary liquids, without proper precautions.

Words may differ in how distorted and wrong they are. One of the •least faulty kinds is that of names of substances, especially names that

  • are names of lowest species, ·i.e. species that don’t divide into sub-species·, and
  • have been well drawn ·from the substances that they are names of·.

·The drawing of substance-names and -notions from the substances themselves can be done well or badly. For example·, our notions of chalk and of mud are good, our notion of earth bad. •More faulty are names of events: ‘generate’, ‘corrupt’, ‘alter’. •The most faulty are names of qualities: ‘heavy’, ‘light’, ‘rare’, ‘dense’, and the like. (I exclude from this condemnation names of qualities that are immediate objects of the senses.) Yet in each of these categories, inevitably some notions are a little better than others because more examples of them come within range of the human senses.

61. The idols of the theatre ·which will be my topic until the end of 68· are not innate, and they don’t steal surreptitiously into the intellect. Coming from the fanciful stories told by philosophical theories and from upside-down perverted rules of demonstration, they are openly proclaimed and openly accepted. Things I have already said imply that there can be no question of refuting these idols: where there is no agreement on premises or on rules of demonstration, there is no place for argument.


This at least has the advantage that it leaves the honour of the ancients untouched ·because I shall not be arguing against them. I shall be opposing them, but· there will be no disparagement of them in this, because the question at issue between them and me concerns only the way. As the saying goes: a lame man on the right road outstrips the runner who takes a wrong one. Indeed, it is obvious that a man on the wrong road goes further astray the faster he runs. ·You might think that in claiming to be able to do better in the sciences than they did, I must in some way be setting myself up as brighter than they are; but it is not so·. The course I propose for discovery in the sciences leaves little to the acuteness and strength of intelligence, but puts all intelligences nearly on a level. My plan is exactly like the drawing of a straight line or a perfect circle: to do it free-hand you need a hand that is steady and practised, but if you use a ruler or a compass you will need little if anything else; and my method is just like that.


But though particular counter-arguments would be useless, I should say something about •the classification of the sects whose theories produce these idols, about •the external signs that there is something wrong with them, and lastly •about the causes of this unhappy situation, this lasting and general agreement in error. My hope is that this will make the truth more accessible, and make the human intellect more willing to be cleansed and to dismiss its idols.

62. There are many idols of the theatre, or idols of theories, and there can be and perhaps will be many more. For a long time now two factors have militated against the formation of new theories ·in philosophy and science·.

  • Men’s minds have been busied with religion and theology.
  • Civil governments, especially monarchies, have been hostile to anything new, even in theoretical matters; so that men have done that sort of work at their own peril and at great financial cost to themselves—not only unrewarded but exposed to contempt and envy.

If it weren’t for those two factors, there would no doubt have arisen many other philosophical sects like those that once flourished in such variety among the Greeks. Just as many hypotheses can be constructed regarding the phenomena of the heavens, so also—and even more!—a variety of dogmas about the phenomena of philosophy may be set up and dug in. And something we already know about plays that poets put on the stage is also true of stories presented on the philosophical stage—namely that fictions invented for the stage are more compact and elegant and generally liked than true stories out of history!

What has gone wrong in philosophy is that it has attended in great detail to a few things, or skimpily to a great many things; either way, it is based on too narrow a foundation of experiment and natural history, and decides on the authority of too few cases. (1) Philosophers of the reasoning school snatch up from experience a variety of common kinds of event, without making sure they are getting them right and without carefully examining and weighing them; and then they let meditation and brain-work do all the rest. (2) Another class of philosophers have carefully and accurately studied a few experiments, and have then boldly drawn whole philosophies from them, making all other facts fit in by wildly contorting them. (3) Yet a third class consists of those who are led by their faith and veneration to mix their philosophy with theology and stuff handed down across the centuries. Some of these have been so foolish and empty-headed as to have wandered off looking for knowledge among spirits and ghosts. So there are the triplets born of error and false philosophy: philosophies that are (1) sophistical, (2) empirical, and (3) superstitious.

[To explain Bacon’s second accusation against Aristotle in 63: A word ‘of the second intention’ is a word that applies to items of thought or of language (whereas things that are out there in the world independently of us are referred to by words ‘of the first intention’). Now Aristotle in his prime held that the soul is not a substance but rather a form: rather than being an independently existing thing that is somehow combined with the rest of what makes up the man, the soul is a set of facts about how the man acts, moves, responds, and so on. Bacon has little respect for the term ‘form’: in 15 he includes it among terms that are ‘fantastical and ill-defined’, and in 51 he says that ‘forms are fabrications of the human mind’. This disrespect seems to underlie the second accusation; the class of forms is not a class of independently existing things but rather a class of muddy and unfounded ways of thinking and talking, so that ‘form’ is a word of the second intention.]

63. The most conspicuous example of (1) the first class was Aristotle, whose argumentative methods spoiled natural philosophy. He

  • made the world out of categories;
  • put the human soul, the noblest of substances, into a class based on words of the second intention;
  • handled the issues about density and rarity (which have to do with how much space a body takes up) in terms of the feeble distinction between what does happen and what could happen;
  • said that each individual body has one proper motion, and that if it moves in any other way this must be the result of an external cause,

and imposed countless other arbitrary restrictions on the nature of things. He was always less concerned about the inner truth of things than he was about providing answers to questions—saying something definite. This shows up best when his philosophy is compared with other systems that were famous among the Greeks. For

  • the homogeneous substances of Anaxagoras,
  • the atoms of Leucippus and Democritus,
  • the heaven and earth of Parmenides,
  • the strife and friendship of Empedocles, and
  • Heraclitus’s doctrine of bodies’ being reduced to the perfectly homogeneous condition of fire and then remolded into solids,

all have a touch of natural philosophy about them, a tang of the nature of things and experience and bodies. Whereas in Aristotle’s physics you hear hardly anything but the sounds of logical argument, involving logical ideas that he reworked, in a realist rather than a nominalist manner, under the imposing name of ‘metaphysics’. Don’t be swayed by his frequent mentions of experiments in his On Animals, his Problems, and others of his treatises. For he didn’t consult experience, as he should have done, on the way to his decisions and first principles; rather, he first decided what his position would be, and then brought in experience, twisting it to fit his views and making it captive. So on this count Aristotle is even more to blame than his modern followers, the scholastics, who have abandoned experience altogether.

64. The (2) empirical school of philosophy gives birth to dogmas that are more deformed and monstrous than those of the sophistical or reasoning school. The latter has as its basis the •light of vulgar notions; it’s a faint and superficial light, but it is in a way •universal, and applies to many things. In contrast with that, the empirical school has its foundation in the •narrowness and •darkness of a few experiments. Those who busy themselves with these experiments, and have infected their imagination with them, find such a philosophy to be probable and all but certain; everyone else finds them flimsy and incredible. A notable example of this ·foolishness· is provided by the alchemists and their dogmas; these days there isn’t much of it anywhere else, except perhaps in the philosophy of Gilbert. Still, I should offer a warning relating to philosophies of this kind. If my advice ever rouses men to take experiments seriously and to bid farewell to sophistical doctrines, then I’m afraid that they may—I foresee that they will—be in too much of a hurry, will leap or fly ·from experiments straight· to generalizations and principles of things, risking falling into just the kind of philosophy I have been talking about. We ought to prepare ourselves against this evil now, ·well in advance·.

65. The corruption of philosophy by (3) superstition and input from theology is far more widespread, and does the greatest harm, whether to entire systems or to parts of them. ·Systems thus afflicted are just nonsense judged by ordinary vulgar standards, but that doesn’t protect men from accepting them, because· the human intellect is open to influence from the imagination as much as from vulgar notions, ·and in these philosophies it is the imagination that wields the power·. Whereas the contentious and sophistical kind of philosophy combatively traps the intellect, this ·superstitious· kind, being imaginative and high-flown and half-poetic, coaxes it along. For men—especially intelligent and high-minded ones—have intellectual ambitions as well as ambition of the will.

A striking example of this sort of thing among the Greeks is provided by Pythagoras, though ·his form of it wasn’t so dangerous, because· the superstition that he brought into it was coarser and more cumbrous ·than many·. Another example is provided by Plato and his school, whose superstition is subtler and more dangerous. Superstition turns up also in parts of other philosophies, when they

  • introduce abstract forms—·i.e. forms that aren’t the forms of anything·,

and when they do things like

  • speaking of ‘first causes’ and ‘final causes’ and usually omitting middle causes.

[Bacon’s point is: They discuss the first cause of the whole universe, and the end or purpose for which something happens (its ‘final cause’), but they mostly ignore ordinary causes such as a spark’s causing a fire. Putting this in terms of first-middle-final seems to be a quiet joke.]

We should be extremely cautious about this. There’s nothing worse than the deification of error, and it is a downright plague of the intellect when empty nonsense is treated with veneration. Yet some of the moderns have been so tolerant of this emptiness that they have—what a shallow performance!—tried to base a system of natural philosophy on the first chapter of Genesis, on the book of Job, and other parts of the sacred writings, ‘seeking the living among the dead’ [Luke 24:5]. This makes it more important than ever to keep down this ·kind of philosophy·, because this unhealthy mixture of human and divine gives rise not only to •fantastic philosophy but also to •heretical religion. It is very proper that we soberly give our faith only to things that are the faith.

66. So much for the mischievous authority of systems founded on •vulgar notions, on •a few experiments, or on •superstition. I should say something about bad choices of what to think about, especially in natural philosophy. In the mechanical arts the main way in which bodies are altered is by composition or separation; the human intellect sees this and is infected by it, thinking that something like it produces all alteration in the universe. This gave rise to •the fiction of elements and of their coming together to form natural bodies. Another example: When a man surveys nature working freely, he encounters different species of things—of animals, of plants, of minerals—and that leads him smoothly on to the opinion that nature contains certain primary forms which nature intends to work with, and that all other variety comes from •nature’s being blocked and side-tracked in her work, or from •conflicts between different species—conflicts in which one species turns into another. To the first of these theories we owe ·such intellectual rubbish as· first qualities of the elements; to the second we owe occult properties and specific virtues. Both of them are empty short-cuts, ways for the mind to come to rest and not be bothered with more solid pursuits. The medical researchers have achieved more through their work on the second qualities of matter, and the operations of attracting, repelling, thinning, thickening, expanding, contracting, scattering, ripening and the like; and they would have made much greater progress still if *it weren’t for a disaster that occurred. The two short-cuts that I have mentioned (elementary qualities and specific virtues) snared the medical researchers, and spoiled what they did with their correct observations in their own field.

[The passage flagged by asterisks expands what Bacon wrote, in ways that the small-dots system can’t easily indicate.]

It led them either •to treating second qualities as coming from highly complex and subtle mixture of first or elementary qualities, or •to breaking off their empirical work prematurely, not following up their observations of second qualities with greater and more diligent observations of third and fourth qualities.* ·This is a bigger disaster than you might think, because· something like—I don’t say exactly like—the powers involved in the self-healing of the human body should be looked for also in the changes of all other bodies.

But something much worse than that went wrong in their work: they focused on

  • the principles governing things at rest, not on •the principles of change; i.e. on
  • what things are produced from, not •how they are produced; i.e. on
  • topics that they could talk about, not •ones that would lead to results.

The vulgar classification of ·kinds of· motion that we find in the accepted system of natural philosophy is no good—I mean the classification into

  • generation,
  • corruption,
  • growth,
  • diminution,
  • alteration, and
  • motion.

Here is what they mean. If a body is moved from one place to another without changing in any other way, this is •motion; if a body changes qualitatively while continuing to belong to the same species and not changing its place, this is •alteration; if a change occurs through which the mass and quantity of the body don’t remain the same, this is •growth or •diminution; if a body is changed so much that it changes substantially and comes to belong to a different species, this is •generation or •corruption. But all this is merely layman’s stuff, which doesn’t go at all deeply into nature; for these are only measures of motion. . . .and not kinds of motion. They [= the notions involved in the classification into generation, corruption etc.] signify that the motion went this way or that, but not how it happened or what caused it. They tell us nothing about the appetites of bodies [= ‘what bodies are naturally disposed to do’] or about what their parts are up to. They come into play only when the motion in question makes the thing grossly and obviously different from how it was. Even when ·scientists who rely on the above classificatory system· do want to indicate something concerning the causes of motion, and to classify motions on that basis, they very lazily bring in the ·Aristotelian· distinction between ‘natural’ motion and ‘violent’ motion, a distinction that comes entirely from vulgar ways of thinking. In fact, ‘violent’ motion is natural motion that is called ‘violent’ because it involves an external cause working (naturally!) in a different way from how it was working previously.

[Bacon himself sometimes describes a movement as violens, but this is meant quite casually and not as a concept belonging to basic physics. These innocent occurrences of violens will be translated as ‘forceful’.]

Let us set all this aside, and consider such observations as that bodies have an appetite for

mutual contact, so that separations can’t occur that would break up the unity of nature and allow a vacuum to be made;

or for

resuming their natural dimensions. . . ., so that if they are compressed within or extended beyond those limits they immediately try to recover themselves and regain their previous size;

or for

gathering together with masses of their own kind—e.g. dense bodies ·moving· towards the earth, and light and rare bodies towards the dome of the sky.

These and their like are truly physical kinds of motion; and comparison of them with the others that I mentioned makes clear that the others are entirely logical and scholastic.

An equally bad feature of their philosophies and their ways of thinking is that all their work goes into investigating and theorizing about the

  • ·fundamental· principles of things. . . . (so they keep moving through higher and higher levels of abstraction until they come to formless potential matter) and
  • the ultimate parts of nature (so they keep cutting up nature more and more finely until they come to atoms, which are too small to contribute anything to human welfare)

whereas everything that is useful, everything that can be worked with, lies between ·those two extremes·.

67. The intellect should be warned against the intemperate way in which systems of philosophy deal with the giving or withholding of assent, because intemperance of this kind seems to establish idols and somehow prolong their life, leaving no way open to reach and dislodge them.

There are two kinds of excess: •the excess of those who are quick to come to conclusions, and make sciences dogmatic and lordly; and •the excess of those who deny that we can know anything, and so lead us into an endlessly wandering kind of research. The •former of these subdues the intellect, the •latter deprives it of energy. The philosophy of Aristotle ·is of the former kind·. Having destroyed all the other philosophies in argumentative battle. . . . Aristotle laid down the law about everything, and then proceeded to raise new questions of his own and to dispose of them likewise, so that everything would be certain and settled—a way of going about things that his followers still respect and practice.

The ·Old Academy·, the school of Plato, introduced acatalepsy—·the doctrine that nothing is capable of being understood·. At first it was meant as an ironical joke at the expense of the older sophists—Protagoras, Hippias, and the rest—whose greatest fear was to seem not to doubt something! But the New Academy made a dogma of acatalepsy, holding it as official doctrine. They did allow of some things to be followed as probable, though not to be accepted as true; and they said they didn’t ·mean to· destroy all investigation; so their attitude was better than. . . .that of Pyrrho and his sceptics. (It was also better than undue freedom in making pronouncements.) Still, once the human mind has despaired of finding truth, it becomes less interested in everything; with the result that men are side-tracked into pleasant disputations and discourses, into roaming, rather than severely sticking to a single course of inquiry. But, as I said at the start and continue to urge, the human senses and intellect, weak as they are, should not be •deprived of their authority but •given help.

68. So much for the separate classes of idols and their trappings. We should solemnly and firmly resolve to deny and reject them all, cleansing our intellect by freeing it from them. Entering the kingdom of man, which is based on the sciences, is like entering the kingdom of heaven, which one can enter only as a little child.

The next post in the sequence, Book 1: 69-92 (13 Causes of Bad Science), will be posted by 6:00pm PDT on Saturday, September 28 at the latest.


Free Money at PredictIt?

26 сентября, 2019 - 19:10
Published on September 26, 2019 4:10 PM UTC

Previously on Prediction Markets (among others): Prediction Markets: When Do They Work?, Subsidizing Prediction Markets

Epistemic Status: No huge new insights, but a little fun, a little free money, also Happy Petrov Day?

Yesterday, with everything happening regarding impeachment, I decided to check PredictIt to find out how impactful things were. When I checked, I noticed some obvious inconsistencies. They’re slightly less bad today, but still not gone.

I figured it would be fun and potentially enlightening to break this down. Before I begin, I will state that, unless I messed up, this post expresses zero political opinions whatsoever on what election or other outcomes would be good or bad, and does its best to only make what I consider very safe observations on probabilities of events. All comments advocating political positions or candidates will be deleted in reign-of-terror style. No exceptions.

Market Analysis

Odds are represented as cost in cents for each contract that pays $1, so they double as probabilities out of 100.

Let us look at the Democratic nomination odds, using the last traded price. All have a bid-ask spread of 1 cent:

Elizabeth Warren   50
Joe Biden          21
Andrew Yang        10
Bernie Sanders      8
Pete Buttigieg      6
Hillary Clinton     5
Kamala Harris       4
Tulsi Gabbard       3
Amy Klobuchar       2

Can be sold for 1: Cory Booker, Tom Steyer, Beto O’Rourke

All other candidates can be bought for 1 and cannot be sold.

Adding that up we get 112. We could buy all the no sides for a total of 111 – you can get these prices on no except for Warren, where you’d sell at 49.

That’s certainly some free money. If you sell all of them, you don’t tie up any money, although you do have to deposit, so it’s a pretty great trade, albeit with an $850 limit.
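The arithmetic behind that trade can be sketched in a few lines of Python. This is only an illustration built from the prices listed above: the names `NAMED_LAST`, `field_total`, and `locked_in_profit` are made up for the example, and the sketch ignores fees, taxes, and the $850 limit.

```python
# Last traded prices in cents for the nine named contracts listed above.
NAMED_LAST = {
    "Warren": 50, "Biden": 21, "Yang": 10, "Sanders": 8, "Buttigieg": 6,
    "Clinton": 5, "Harris": 4, "Gabbard": 3, "Klobuchar": 2,
}
# Three further contracts can each be sold for 1 cent.
ONE_CENT_CONTRACTS = 3


def field_total(prices, one_cent_contracts):
    """Sum of Yes prices across the whole field, in cents."""
    return sum(prices.values()) + one_cent_contracts


def locked_in_profit(sell_total):
    """Minimum profit in cents from selling every contract once.

    At most one candidate wins, so at most one sold contract pays out 100.
    Fees and taxes are ignored (an assumption of this sketch).
    """
    return sell_total - 100


last_total = field_total(NAMED_LAST, ONE_CENT_CONTRACTS)

# Per the post, every listed price is sellable except Warren, who sells at 49.
sell_prices = {**NAMED_LAST, "Warren": 49}
sell_total = field_total(sell_prices, ONE_CENT_CONTRACTS)

print(last_total, sell_total, locked_in_profit(sell_total))
```

If nobody on the list wins, none of the sold contracts pays out, so the result is even better than the locked-in minimum.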

I’ve already done many of the legs of that trade. Some are better than others. Hillary Clinton at 5 is complete insanity. Andrew Yang is trading at 10 because internet. That likely covers most of the reason you can sell the field for 111. If you lower them to sane numbers (let’s be super generous and say Hillary Clinton 1, and say Andrew Yang 5), the field would add to 102. Completing the trade is mostly about freeing up your capital. You also get some value for it being someone not on the above list, as the ‘brokered convention causes weirdness’ scenario is definitely not impossible. The weird thing is expecting that to somehow nominate Hillary Clinton.

The big not-automatically-insane opinion is making Warren 50% to win the nomination. That is rather bold at this stage of things, but we’re thinking about arbitrage and outright mistakes.

Let’s now look at the Presidential odds. For any Democrat, this is almost identical to a two-part bet, where that person wins the nomination and then wins the general election.

Donald Trump       41
Elizabeth Warren   35
Joe Biden          13
Andrew Yang         6
Bernie Sanders      6
Pete Buttigieg      3
Nikki Haley         2
Kamala Harris       2
Mike Pence          2
Tulsi Gabbard       2
Cory Booker         1
Amy Klobuchar       1

That adds up to 114. If you look at actually available prices, you could sell the field for 110. Again, pretty good idea. I’d get on that, and I mostly did.

One could also point out the implied general election win percentages of Democrats where rounding isn’t a big deal.

Warren    70%
Biden     62%
Yang      60%
Sanders   75%

The sum of all Republican odds is 45% (Trump, Haley and Pence) out of 114%, for odds of 39.5%. Thus, Democratic victory should be about 60%. Warren is 50% to win the nomination, so that 70% number is really weird. This does not add up, and makes me reluctant to sell Warren at 50% odds in the primary.
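For readers who want to check these figures, here is a small Python sketch. It treats each Democrat's presidential price as the product of the nomination price and a conditional general-election win rate, which is the two-part-bet reading described above. The function name `implied_general_win_rate` and the variable names are illustrative only.

```python
# Prices in cents, taken from the nomination and presidential markets above.
nomination = {"Warren": 50, "Biden": 21, "Yang": 10, "Sanders": 8}
presidency = {"Warren": 35, "Biden": 13, "Yang": 6, "Sanders": 6}


def implied_general_win_rate(name):
    """P(wins general | wins nomination), assuming
    presidency price = P(nomination) * P(general | nomination)."""
    return presidency[name] / nomination[name]


republican_total = 41 + 2 + 2   # Trump, Haley, Pence
market_total = 114              # sum of all listed presidential prices

# Normalize by the inflated total so the two parties' shares sum to 1.
republican_share = republican_total / market_total
democratic_share = 1 - republican_share

for name in nomination:
    print(f"{name}: {implied_general_win_rate(name):.0%}")
print(f"Republican: {republican_share:.1%}, Democratic: {democratic_share:.1%}")
```

The point of the comparison is that the party-level share is around 60%, while the front-runner's implied conditional rate is well above it.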

In both these cases, the free money seems real enough. You get to use your capital in both markets if you sell the whole field and then have it free for a third market as well, and you can’t really lose. Doesn’t mean it’s worth the effort, but it’s a nice thing to notice.

Let’s look at Republican nomination odds:

Trump     78
Haley      7
Pence      7
Kasich     2
Romney     2
Weld       2
Sanford    2

That only adds up to 98%, which makes sense, since if Trump is actually gone then anything could happen. This market seems sane on that level, perhaps even rich. What’s most interesting is that Trump is highly unlikely to win the presidency without winning the Republican nomination, so if he’s 41% to be reelected but 78% to be nominated, then Trump has a general election win rate of about 53% (47% if we knock off 10% for the market being inflated by adding up to 114%). But perhaps this is reasonable? If Trump is gone, it’s because something brought him down, so it’s going to be super hard for anyone else to win? It’s not like much of that probability is that Trump’s health fails, given the time frame.
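The conditional win rate here is just a ratio of the two market prices. A minimal Python sketch, assuming (as the paragraph does) that Trump cannot be reelected without being nominated, and reading the 10% haircut as a scaling of the presidential price; the variable names are illustrative:

```python
p_reelected = 0.41   # Trump's price in the presidential market
p_nominated = 0.78   # Trump's price in the Republican nomination market
haircut = 0.10       # rough discount for the presidential market summing to 114

# P(win general | nominated) = P(reelected) / P(nominated),
# valid only if reelection implies nomination.
raw_rate = p_reelected / p_nominated
deflated_rate = p_reelected * (1 - haircut) / p_nominated

print(f"raw: {raw_rate:.1%}, after haircut: {deflated_rate:.1%}")
```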

Also noteworthy is Trump is only 20% to win the popular vote, although the available volume here is very low. That implies a stunning 21% chance that Trump loses the popular vote but wins the election. Put another way, given Trump is reelected, he’s still an underdog to have won the popular vote. The electoral college seems to favor Trump, but that’s a huge probability to put in such a narrow space, even if you assume the states all look identical to 2016. I believe that pre-election, 538 had Trump at 10% to do this, with the polls only a few percent away from that result. How do you get to 20%?
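The 21% figure follows from subtracting the two prices. A short Python sketch of the reasoning, with illustrative variable names: winning the election splits into "wins both" and "wins the election but loses the popular vote", and "wins both" can be no larger than the popular-vote price.

```python
p_win_election = 0.41   # presidential market price
p_win_popular = 0.20    # popular vote market price

# P(win election) = P(win both) + P(win election, lose popular vote)
# and P(win both) <= P(win popular vote), so the split outcome is at least:
min_split = p_win_election - p_win_popular

# Upper bound on P(won popular vote | reelected).
max_popular_given_reelected = p_win_popular / p_win_election

print(f"split at least {min_split:.0%}")
print(f"popular vote given reelection at most {max_popular_given_reelected:.1%}")
```

Because the upper bound on the conditional probability is below one half, the prices really do make him an underdog in the popular vote even in the worlds where he is reelected.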

You can sell “Hillary Clinton runs for president in 2020” at 12% odds. Is that a worthwhile return on capital? You could also sell Michelle Obama at 8%, Cuomo at 6%, and Oprah or Mark Cuban at 5%.

They have Trump at 88% to be President at the end of 2019 and 73% to complete his first term. They think he is 41% to be impeached this year and 63% to be impeached at all. Congress is expected to work fast. Have they met Congress?

There are a number of other similar good bets available. One gets the idea. The catch is that those all tie up capital. Also, if you take risk and win, you have to pay 10% of your net winnings and potentially taxes. Again, three cheers for arbitrage.
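To make the "return on capital" question concrete, here is a hedged Python sketch for a single sale such as the 12% contract mentioned earlier. It assumes that selling Yes at p cents is equivalent to buying No at 100 - p cents, applies the 10% fee on net winnings mentioned above, and ignores taxes, spreads, and the time the capital is tied up. The function names are invented for the example.

```python
FEE_ON_PROFIT = 0.10  # fee on net winnings, as stated above


def return_if_no(yes_price_cents, fee=FEE_ON_PROFIT):
    """After-fee return on capital if the event does not happen."""
    cost = 100 - yes_price_cents               # capital tied up per contract
    profit = yes_price_cents * (1 - fee)       # winnings net of the fee
    return profit / cost


def break_even_probability(yes_price_cents, fee=FEE_ON_PROFIT):
    """True probability of the event at which selling has zero expected value.

    Solves (1 - q) * profit - q * cost = 0 for q.
    """
    cost = 100 - yes_price_cents
    profit = yes_price_cents * (1 - fee)
    return profit / (profit + cost)


print(f"return if No: {return_if_no(12):.1%}")
print(f"break-even probability: {break_even_probability(12):.1%}")
```

Under these assumptions the fee pushes the break-even probability below the quoted price, so the sale is only worthwhile if you think the event is noticeably less likely than the market says.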

Looking at such systems of prices, and looking for opportunity, is often good training not only for trading or gambling but also for calibration and probability estimation in general, which are excellent skills for anyone to develop.

What Does This Say About Prediction Markets in General?

Not much we didn’t already know. PredictIt has an $850 limit on any one market, for any one candidate or other potential outcome. This does not increase if you do arbitrage. This is why pure arbitrage that frees up capital can continue. I am literally at risk for $44 in the general election market, but that does not allow me to continue to trade.

Other markets in the past such as InTrade have not had this restriction. This results in less egregious versions of the same problems, as you can use bigger size to trade against the mistakes. However, there is no point in fully correcting a mistake, as doing so would offer minimal or no profits. If you have a market that is inefficient, and a chance to trade to make it more efficient, that’s a good trade, but at some point it isn’t worth the time and trouble and capital investment, so you stop. That point is necessarily before full efficiency, but in places like the stock market you can potentially get (in expectation) very close.

In prediction markets, cost of capital to do trades is a major distorting factor, as are fees and taxes and other physical costs, and participants are much less certain of correct prices and much more worried about impact and how many others are in the same trade. Most everyone who is looking to correct inefficiencies will only fade very large and very obvious inefficiencies, given all the costs.

Thus, we see the same inefficiencies pop up over and over again and not be corrected. The most well-known and universal one is that if the probability is under about 40%, the odds will likely be too high. The lower the odds below that, the more (as a percentage of the chance listed) the price will be too high. For low percentages, the people selling the contract are treating it as if it is a bond that pays interest over time, with a tiny default risk, rather than saying that the 7% chance is too high and should have been 5%. One also has to be wary of model error.

In politics, it is also inevitable that anything that sounds superficially good to people on the internet but is unlikely to actually happen is almost always going to trade rich.

If anything, it is remarkable how little difference it made to limit accounts to $850 in trading, beyond there being free cash lying around.

Anyway, thought that would be fun to write up formally given I had been tricked into actually trading the markets, and maybe some of you would get to do some good trades, so I figured why not. Have fun, everyone.



Running the Stack

26 сентября, 2019 - 19:03
Published on September 26, 2019 4:03 PM UTC

Look at the image here:


After looking at that image, you understand the concept well enough to use it as a mental model.

Hard-won lessons —

(1) I joke that "meditation is having exactly one thing on the stack." One thing at a time on the stack might seem oppressive, but it's actually joyful. I think you more-or-less can only do one thing at a time.

(2) But okay, the stack is more full. You just popped the top item off. Now what? IME, life goes better if you go down the stack unless new information compellingly obsoletes it (unless you're just messing around, in which case "messing around" is on the stack and you're good). When an irrelevant tangent hits in a conversation, once it concludes, go back to where you were (if it was useful). When you realize you got distracted putting the groceries away, typically you want to finish putting them away.

(3) It's entirely true that oftentimes, going down the stack is short-term worse than whatever newly catches your attention. But it trains you to both recognize tangents and navigate conversations intelligently (again, in a non-pure-social-hangout conversation - like at work or when exploring an important topic).

(4) Even more true: often super sucks to go back down the stack on physical task stuff after you got distracted. But! I believe — I don't have any research, but my observation bears it out, it's a hypothesis — I believe that consistently running down the stack after you got distracted makes you less distractible going forwards, because there's less payoff to doing so.

(5) Some people can literally "run the stack" in their minds. Not a metaphor. Literally.

(6) I couldn't do this before. Now I can.

(7) What changed is that I used to be able to comfortably juggle 5-7 items at a time without running a stack, but I recently calculated out the work I'm committed to in the near future (like, "almost all of this work will get done") and it's 300+ hours. Employees, administration, ops, software development, sales, finance. There are dozens of projects going on that stretch off into infinity. Suddenly, I was just running the stack all the time. I don't recommend it, but that's what happened to me.

(8) You can get better about refusing to add things to the stack.

(9) You can get better about "closing the thread" (popping things off the stack) before changing gears. "Yeah but wait, let's talk about that, but can we calendar that thing before we move on?" (can say it shorter, exaggerating for clarity)

(10) You don't need to do a task or complete a conversation to remove it from the stack. You can just delete it. But the act of explicitly doing so — and communicating it to anyone else relevant that needs to know — is what keeps your stack from overflowing.

And the most important lesson —

(11) When you have multiple items on the stack and "start feeling ambitious and motivated", COMPLETE THE ITEMS ON THE STACK RATHER THAN ADD NEW ITEMS TO THE STACK.

The all caps there isn't shouting at you — it's regret for lost years of my life. Alas. Bigger stack isn't better. Faster throughput is better. That's typically less stuff on the stack at any one time.

Anyway, the concept doesn't work for everyone, but a surprising number of the effective people I know actually literally "run a stack" in their minds. It's... more common than I'd thought. Probably the mix of being in software development and having an amount of work that'd be insanely overwhelming if I didn't take things one-thing-at-a-time is what generated it.
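For readers who have not met the data structure behind the metaphor, here is a minimal sketch of the operations the lessons above describe: adding an item, working only on the top item, finishing it and returning to what lies beneath, and explicitly deleting an item without doing it. The class name `TaskStack` and its method names are made up for illustration; they are assumptions, not something from the original post.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskStack:
    """Hypothetical last-in, first-out stack of things you are in the middle of."""

    items: List[str] = field(default_factory=list)

    def push(self, task: str) -> None:
        # A tangent or new commitment lands on top of whatever you were doing.
        self.items.append(task)

    def current(self) -> Optional[str]:
        # Only the top item is worked on: one thing at a time.
        return self.items[-1] if self.items else None

    def finish(self) -> str:
        # Completing the top item pops it; the item below becomes current again.
        return self.items.pop()

    def drop(self, task: str) -> None:
        # Lesson (10): an item can be deleted explicitly without being done.
        self.items.remove(task)


stack = TaskStack()
stack.push("put groceries away")
stack.push("answer the phone")        # a distraction lands on top
stack.finish()                        # the call ends...
resumed = stack.current()             # ...so go back down the stack
stack.push("reorganize the pantry")   # a tempting new item
stack.drop("reorganize the pantry")   # decide not to do it, and say so
remaining = list(stack.items)
```

In this sketch, "going down the stack" is just `finish()` followed by `current()`, and keeping the stack small is a matter of calling `push` less often and `finish` or `drop` more often.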


Resetting Somebody Will v2

September 26, 2019 - 19:00
Published on September 26, 2019 4:00 PM UTC

Yesterday I wrote about the song Somebody Will and why it's tricky in a group singing context, and shared a new melody I was playing with. After talking to some people about what they like about the original melody, and listening to some group singing recordings, here's another go.

This version keeps the melody for sections where it's reasonably predictable, simplifies the melody in a few places, and substitutes a new melody only where I think one is needed. Specifically:

  • The first melody ("Our new world ... will not be mine") is unchanged from the original.
  • The second melody ("A hundred years ... each new home") no longer moves the tune from D to Bbm but goes to the more natural Bm instead. The melody here is new but tries to have a similar feel as the original.
  • The third melody ("But I'll teach ... rockets belong") is unchanged from the original for the first part, but swaps a repeat of the second line's melody for the range-busting rise on "where our rockets belong".
  • The fourth melody ("It will never ... so far away") is new, and another attempt to make something fitting in Bm to replace something tricky in Bbm.
  • The fifth melody ("But I am ... somebody will someday") is unchanged from the original. (I had thought it was too hard and recorded a draft version with it swapped to the version from yesterday, but really the only hard part is the first two notes (4, 3 over D), and I think we can deal with that.)



(in D)

A D
Our new world is so close.
A D A D
Mars has treasures we're only just starting to find.
A D A D
Frozen mountains and crimson dust waiting for footprints
Bm A
That will not be mine.

Bm A Bm A
A hundred years to run the first tests
G G A
another to raise the first dome.
Bm A Bm A
The moon, then Mars, then Titan next,
G G A
A life time to touch each new home.

A D
And I want it so much.
A D A D
Close my eyes, I can taste the Mars dust in the air.
A D A D
In the darkness the space stations shimmer in orbits that
Bm A
I will not share.

D D D D
But I'll teach the student who'll manage the fact'ry
D D G A
That tempers the steel that makes colonies strong.
D D D D
And I'll write the program that runs the computer
D D G A
That charts out the stars where our rockets belong.

Bm A G A
It will never get easy to wake from my dream
Bm A G A
When the future I dream of is so far away.

D D D D
But I am willing to sacrifice
D D D D
Something I don't have for something I won't have
Bm A D
But somebody will someday.

A D
And it feels like a waste.
A D A D
All this working and waiting and battling time,
A D A D
And all for a kingdom that all of my efforts
Bm A
Will never make mine,

Bm A Bm A
But brick by brick the Pyramids rose,
G G A
With most hidden under the sand,
Bm A Bm A
So life by life the project grows
G G A
In ways I might not understand.

A D
I am voyaging too,
A D A D
We will need the foundation as much as the dome
Bm A
For those worlds to come true,

D D D D
And I'll clerk the office that handles the funding
D D G A
That raises the tower that watches the sky.
D D D D
And I'll staff the bookstore that carries the journal
D D G A
That sparks the idea that makes solar sails fly.

Bm A G A
It takes so many sailors to conquer an ocean
Bm A G A
And so many more when it's light-years away,

D D D D
But I am willing to sacrifice
D D D D
Something I don't have for something I won't have
Bm A D
But somebody will someday.

A D
It's so easy to run.
A D A D
Hide away in my books, games and fantasy plans,
A D A D
Let them call me a coward who can't face reality's
Bm A
Grownup demands,

Bm A Bm A
But if I love my fantasy worlds
G G A
It's not fantasy love that I feel.
Bm A Bm A
And so much more I feel for this
G G
The world that created them,
G G
World we create with them,
G G A
One chance to make them all real.

A D
And I know we won't stop.
A D A D
We've planned too many wonders for one little star.
A D A D
Though often the present may seem too complacent
Bm A
To take us that far.

D D D D
But I'll tell the story and I'll draw the picture
D D G A
And I'll sing the anthem that banishes doubt,
D D D D
And host the convention that summons the family
D D G A
That carries the fire that never burns out

Bm A G A
There are so many chances to give up the journey,
Bm A G A
Especially when it's so easy to stay,

D D D D
But I am willing to sacrifice
D D D D
Something I don't have for something I won't have
Bm A
And not only me,
D D D D
But we are willing to sacrifice
D D D D
Something we don't have for something we won't have
Bm A D
So somebody will, someday.


What are your recommendations on books to listen to when doing, e.g., chores?

September 26, 2019 - 17:50
Published on September 26, 2019 2:50 PM UTC

Some books lend themselves well to being listened to (e.g., Life 3.0), while some (GEB) don’t. What are your recommendations? PS: Note that TTS software can turn any book into audio.


What is operations?

September 26, 2019 - 17:16
Published on September 26, 2019 2:16 PM UTC

This is the first in a sequence of posts about “operations”.

Acknowledgements to Malo Bourgon, Ray Arnold, Michelle Hutchinson, and Ruby for their feedback on this post.

My ops background

Several years ago, I decided to focus on operations work for my career. From 2017 to 2019 I was one of the operations staff at the Center for Effective Altruism, initially as the operations manager and later as the Finance Lead. Prior to that, I was a volunteer logistics lead at approximately 10 CFAR workshops; I also ran ops for SPARC twice, and for a five-day AI-safety retreat. I also attribute some of my ops skill to my previous work as an ICU nurse.

I have spent a lot of time thinking about hiring and training for operations roles. In the course of hiring I have had numerous conversations about what exactly “operations work” refers to, and found it surprisingly hard to explain. This post, and the rest of my operations sequence, will be an attempt to lay out my current thinking on what these roles are, what they have in common, and what skills they lean on most heavily.

Operations: not a single thing, still a useful shorthand

Operations work, or “ops”, is a term used by organizations like 80,000 Hours to refer to a particular category of roles within organizations.

I don’t think that “operations” as used in this sense is a single coherent thing; my sense is that 80,000 Hours is gesturing at a vague cluster that doesn’t completely carve reality at the joints. There isn’t a set of defining characteristics shared between all operations-type roles, and many of the attributes described are also found in other roles. However, I do think this is a useful shorthand that points both at a set of functions that need to be filled within organizations, and the skills that are necessary to carry out these duties.

It’s worth noting that this use of the word “operations” does not seem to be standard outside the EA community. In large companies, it can sometimes refer to e.g. the production side of manufacturing, or to supply chain logistics, whereas the internal admin roles are called by their individual job titles (Finance Manager, HR Manager, etc). I do think it makes more sense to carve out the “operations” cluster for small organizations, where the internal support work is more likely to be done by a single person or team rather than a multi-part bureaucracy. In smaller orgs, operations/admin staff often wear multiple hats, since there often isn't enough work for a full-time role in just finance, HR, etc.

Operations Roles: Support, Infrastructure, and Force Multiplication

80,000 Hours: “Operations staff enable everyone else in the organisation to focus on their core tasks and maximise their productivity.”

My sense is that roles in the ops cluster usually fill the following functions:

  • Maintaining the day-to-day infrastructure of an organization: there is a near-endless list of tasks and hoops to jump through to keep an org functioning, e.g., paying the bills, staying in compliance with local tax and HR laws, maintaining accurate bookkeeping, etc.
  • Supporting, implementing or executing externally-facing projects. This can involve setting up a spreadsheet or other workflow to track various steps and deadlines, researching the legal constraints on a project, communicating with external vendors, etc.
  • Acting as force multipliers for other staff: a good ops person will make it as easy as possible for the rest of an organization to interface with processes like payroll and expense reimbursement, and will set up internal systems with an eye to improving the productivity of staff. Office managers, for example, are responsible for maintaining the physical office space and helping staff with the setup they need for their work. Some roles in the ops cluster, such as personal or executive assistants, very directly involve supporting a specific person, taking on the attention cost of various logistical details (emails, scheduling, deadlines, booking travel, etc) and allowing them to spend more time in deep work.
  • Operations roles are usually on the generalist side; the tasks involved are extremely varied, requiring shallow knowledge across a huge range of domains, and therefore the ability to quickly pick up new knowledge and skills. They usually do not depend on a specific technical skillset or background (though technical skills are often very helpful when trying to automate things).

The prototypical operations role in a small organization

Ops roles can vary widely in both seniority (responsibility and autonomy, skill and experience required) and specialization. In my thinking, the most central ops role is the “operations generalist” or “operations manager” at a small organization (10-20 people), which typically looks like this:

  • High autonomy: they are the main admin staff for the entire organization, and the buck stops with them. They are likely to know more than any other staff member about the various details of their role, and thus will need to mostly set their own deadlines and priorities, as well as plan ahead and anticipate problems.
  • They are involved with many, and sometimes all, of the following duties:
    • Finances and accounting
    • Payroll
    • Paying bills
    • Filing
    • Legal compliance and writing internal policies
    • HR and onboarding
    • Responding to questions and concerns from inside and outside the org about any of the above
    • Admin on software systems used internally (e.g. email provider, Slack, Asana)
    • External communications with e.g. donors or customers
    • Supporting specific projects as they set up systems
    • Coordinating projects and people
    • Fundraising
  • They are often working at an organization that is growing, and so need to set up new systems to meet changing needs. This is particularly true for someone hired as the first dedicated operations staff member at a very new organization (e.g., an early startup), which may have few to no existing systems, with processes happening ad hoc.

Not operations

My knowledge about this area is limited, but my sense from talking to others is that some software jobs (devops, sysadmin, internal tech support, etc) perform similar functions, in terms of helping to support and enable other staff and maximise their productivity. However, in this sequence, I’m not including these roles in the “ops” cluster I’m trying to point at, since they require specific technical skills.

Skills required: systematization, planning, prioritization, attention to detail, patience

80,000 Hours: “operations staff are especially good at optimising systems, anticipating problems, making plans, having attention to detail, staying calm in the face of urgent tasks, prioritising among a large number of tasks, and communicating with the rest of the team.”

Of course, the skills described here are useful in almost any job, not just operations roles, and can tend to sound like “just being generally competent.” I do think that jobs vary widely in terms of what skills are most load-bearing, though, and “ops” is a cluster that relies especially heavily on these skills as opposed to others such as technical ability, writing talent, aptitude for deep work, etc.

My sense is that many of the skills being gestured at when describing someone as “good at ops” fall out of a certain type of attention pattern, which I describe below. I am particularly trying to contrast this style of thinking with the “deep work” attention pattern that is most useful for, e.g., research.

  • Concrete and detail-oriented: ops tends to be messy, dealing with a lot of exceptions and one-off tasks, frequently interfacing with opaque outside systems that have precise and not-especially-elegant requirements.
    • Doing the work to a high level does require some level of zooming-out, to be able to look at a given task in the context of the “bigger picture” of the organization’s priorities, and to see where systemic improvements can be made, but for the overall breakdown of work hours spent, these roles involve more “in the weeds” work than big-picture work.
    • The way that various external systems, such as banks and the IRS, behave in practice is a lot more relevant than the way they would work in an ideal world.
    • Creating and optimizing systems and processes to make future work easier is important, but it needs to stay grounded in the details and what other staff will actually use.
    • Relevant concept: the virtue of narrowness.
  • Broad and shallow focus: more often than not, ops work involves juggling a large number of small tasks, individually straightforward, rather than deep dives on complex projects. Operations staff need to be extremely organized and able to track all of these, and prioritize them against each other, without becoming tunnel-visioned on any one task.
  • Thinking in tradeoffs: in these types of roles, perfect is the enemy of the good.
    • It’s almost never possible to catch up with all the tasks or systemic improvements that would ideally be done, and ops staff need to ruthlessly prioritize and 80/20 tasks where possible.
    • In particular, there is often a tradeoff between the urgency and long-term importance of tasks. It can be very high-value to spend some time building long-term infrastructure in advance of when it is needed, or to change over to a new system that will be better in the long run (e.g., switching to better accounting software), but usually not at the cost of missing short-term deadlines.
    • Many time-sensitive decisions will involve taking on some amount of potential risk, whether legal, financial, reputational, etc, and often the time and resources available to investigate all possible risks are limited, especially for small organizations. Ops staff need to be able to consider and compare different low-likelihood risks, prioritize the time spent digging into potential issues, and pick the best option available even if it’s not ideal.
    • Professional services such as lawyers, auditors, accountants, etc, often push for the most conservative, “perfect” version of a process without quantifying the risk of choosing the less-than-perfectly-safe option. Having an eye on the actual risk involved is key.
    • Overall, it’s important to focus on maximizing progress towards the goals, not checking off your to-do list.

There are some other skills that might or might not fall into the same cluster, but that I think are also particularly key.

  • Noticing confusion: since these roles involve frequent reprioritizing and troubleshooting, it’s very important that ops staff develop a sense of how things should look, allowing them to flag unexpected issues, e.g., human error in the accounting.
    • It is especially valuable to train intuitive, gut-level judgement on this, so that the noticing can happen even when distracted by hectic deadlines.
  • Comfort with the unknown and with making mistakes: ops involves a large number of weird one-off problems and tasks, and there isn’t a standardized degree or training program, so most of these roles will involve learning how to do particular tasks on the job. Added to the time pressure, this means that sometimes tasks will get dropped and the wrong judgement calls will get made. In addition, when under a heavy workload, ops staff may have to deal with criticism and complaints from staff and from others outside the organization, even when they make what they think is the best tradeoff. The organization can recover from almost all mistakes, but even more than in most roles, operations staff need to be able to take this criticism and learn from their dropped balls without becoming demoralized. People don’t tend to notice ops work when things are going well, which can cause feedback to be disproportionately negative.
  • Multitasking and interruptions: this is more true of some roles, like event logistics, than others, but most ops roles require frequently switching tasks and shifting priorities in response to new information, as well as being interruptible for time-sensitive requests.
    • A subskill here is the ability to stay calm when confronted with emergencies or urgent decisions.
  • Murphyjitsu: excellent ops staff will automatically run mental models of tasks, situations, and new systems, try to anticipate what will go wrong, and troubleshoot or make contingency plans beforehand. This requires having a solid understanding of both the big picture (the overall systems and priorities) and the specific details in each case.
  • Communication skills and people-modeling: ops work involves a lot of coordinating with other people, internal and external to the organization, and predicting how people will interact with systems that are being built.

Outline of the operations sequence

The rest of the sequence will go into various aspects of this summary in more detail. Future posts will cover the following topics:

  • Describing the various operations roles and job titles, with my attempt to categorize and compare them.
  • Exploring the factors that can make someone a good fit for operations roles, in terms of skills, aptitude, and personality traits.
  • A more detailed breakdown of various skills that I think are especially relevant.
    • Developing judgement and “taste”
    • Dealing with interruptions and time management
    • Principles of building good organizational systems
    • Delegating tasks and accepting delegation