LessWrong.com News

A community blog devoted to refining the art of rationality

Noticing the Value of Noticing Confusion

October 21, 2021 - 21:04
Published on October 21, 2021 6:04 PM GMT

Crossposted from spacelutt.com

Your strength as a rationalist is your ability to be more confused by fiction than by reality. Either your model is wrong or this story is false.

Your model of the world is how you understand the world to work. If I think ice is frozen water, and ice is frozen water, then my model of the world is right. If I’m six and I think Santa Claus is the one who brought me presents, when really it was my parents, then my model of the world is wrong, whether I know it or not (and the whole point is that you don’t, because if you knew your model was wrong it wouldn’t be your true model anymore).

Confusion means not understanding. If you are never confused, it means you understand everything all the time. It means your model of the world is exactly, perfectly right.


Looking at it this way, it’s extremely obvious why, over the past year as I’ve been practicing the skill of noticing my confusion, there have been more and more opportunities, until it’s just a constant, never-ending flow. So many micromysteries.

And it’s not only lying that it protects you against. I’ve caught out lots of people in semi-plausible lies, with the thought pattern being “If x, then almost definitely y, and I see no y, so more likely you’re lying”, and then I’m right and I feel really good.

One of the key aspects is knowing that your model is supposed to forbid things. People lying to you is a very obvious time when the story you’re being fed is wrong. But also you telling yourself an inaccurate story is the same as a lie for these purposes, as it’s another opportunity to say “hey wait, this doesn’t quite fit” and then say “I’m wrong!”.

Like when you’re eating dinner with someone, and they get up and you don’t know why. 99% of the time people get up, it’s to go and get water. So you assume they went to get water, and think you have understood what is going on.

But their cup is still sitting on their tray.


And after you’re good at noticing when you don’t understand, you get to play one of the funnest games available to man: trying to figure out what the hell is actually going on. And this is the bit where your brain is forever warped; this is why Eliezer says this is your strength as a rationalist. Did you notice your confusion when Eliezer said that this measly tidbit of a brain pattern was your strength as a rationalist?

It’s because once you start noticing all the times you don’t understand something, you can start actually trying to understand. You can start throwing out hypotheses, and weighing them based on their probabilities!

That’s right!

Like Harry James Potter-Evans-Verres!!!

And you know when you’ve done it right because every puzzle piece clicks together. It all makes sense now.

Or, and this is almost equally fun, you find that your model actually, legitimately, has no explanation for wtf just occurred.

Why would someone stand up at dinner in the school cafeteria, leaving their tray, phone, and cup?

  • To fill more water
    • Unlikely, they would’ve taken their cup
  • To get more food
    • Unlikely, they wouldn’t have gone to get more if they hadn’t finished what was already on their plate
  • To talk to one of their friends
    • Unlikely, they don’t have that many friends
  • To get dessert
    • Unlikely, why wouldn’t they have taken their bowl?

What the FUCK is going on?

It’s important not to settle for one hypothesis just because it’s the best you have even if it doesn’t make sense. Even though I had no better explanation, I insisted that the hypothesis that the person I was with went with – that they went to get more water – didn’t make sense because they would’ve taken their cup.

So I sat in excitement, waiting to find out what really happened. Waiting to actually learn something new about the world.

You shouldn’t be able to explain things that your model of the world doesn’t explain. If you don’t bring these moments of small confusion to the forefront of your brain, you can never truly learn because you can never really feel the gaping hole in your understanding!

Noticing confusion is really fun, and pretty important. Being able to throw out hypotheses and weigh them based on probability is really fun.
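
As a rough sketch of what weighing hypotheses by probability could look like for the cafeteria example, here is Bayes’ rule applied to the list above. Every number is made up for illustration (none comes from the original post), and the `posterior` helper and the "something else" catch-all hypothesis are assumptions added here to stand for the possibility that the model has no explanation yet.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule over a discrete set of hypotheses.

    priors:      P(hypothesis) before looking at the evidence
    likelihoods: P(evidence | hypothesis)
    Returns P(hypothesis | evidence), normalised to sum to 1.
    """
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: weight / total for h, weight in unnormalised.items()}


# Hypothetical priors: how often each reason explains someone leaving the table.
priors = {
    "water": 0.60,
    "food": 0.15,
    "friend": 0.05,          # low prior: they don't have that many friends
    "dessert": 0.10,
    "something else": 0.10,  # catch-all for what the model doesn't cover
}

# Hypothetical likelihoods of the evidence actually seen
# (cup and bowl left behind, plate unfinished) under each hypothesis.
likelihoods = {
    "water": 0.02,           # they would have taken their cup
    "food": 0.05,            # they hadn't finished their plate
    "friend": 0.50,
    "dessert": 0.05,         # they would have taken their bowl
    "something else": 0.50,
}

post = posterior(priors, likelihoods)

# Print hypotheses from most to least probable after the update.
for hypothesis, p in sorted(post.items(), key=lambda item: -item[1]):
    print(f"{hypothesis:15s} {p:.2f}")
```

With these invented numbers, the default guess (water) loses most of its weight once the cup on the tray is taken into account, and the catch-all ends up on top. That is the numerical version of refusing to settle for the best bad hypothesis.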

Can you guess why they actually got up from dinner?

It was to speak to their teacher.


What's Stopping You?

October 21, 2021 - 19:20
Published on October 21, 2021 4:20 PM GMT


This post is about the concept of agency, which I define as ‘doing what is needed to achieve your goals’. As stated, this sounds pretty trivial - who wouldn’t do things to achieve their goals? But agency is surprisingly hard and rare. Our lives are full of constraints and defaults that we blindly follow, and going past these to find a better way of achieving our goals is hard.

And this is a massive tragedy, because I think that agency is incredibly important. The world is full of wasted motion. Most things in both our lives and the world are inefficient and sub-optimal, and it often takes agency to find better approaches. Just following defaults can massively hold you back from achieving what you could achieve with better strategies.

Yet agency is very rare! Thinking past defaults and your conception of ‘normal’ is hard, and takes meaningful effort. But this leaves a lot of value on the table. Personally, I’ve found cultivating agency to be one of the most useful ways I’ve grown over the last few years. And, crucially, this was deliberate - agency is a hard but trainable skill.

Why Care?

I’ve been pretty abstract so far about the idea of agency. A reasonable reaction would be to be pretty skeptical that there’s any value here: you already try to achieve your goals, that’s what it means to have goals! But I would argue that it’s easy to be missing out on opportunities and better strategies without realising it. An unofficial theme of this blog is that the world is full of low-hanging fruit; you just need to look for it. Further, many of the people I most admire, who’ve successfully changed the world for the better, had a lot of agency - changing the world is not the default path!

To make this more concrete, I want to focus on examples of how agency has been personally valuable to me. The times I managed to step off the default path, look past my constraints, and be more ambitious and creative about how I achieved my goals.

  • By far the most significant was realising that I could take agency with my career and life path. That I could step away from the default of continuing my undergrad to a fourth year and masters and then doing a maths PhD or working in finance. And instead, I’ve spent the past year doing 3 back-to-back AI Alignment research internships, trying to figure out if this might be a path for me to have a significant positive impact on the world.
    • This was an incredibly good decision that led to much more personal growth. I now feel much less risk-averse, am a better engineer and researcher, have a much clearer idea of what the AI space is like, and have a much more concrete view of why AI Alignment matters and what progress on it might look like.
    • Further, I now have a job I’m very excited about, and a much clearer picture of what I want to do over the next few years.
    • Agency is relative to your context, and your defaults. I expect some people would have found this decision easy, but I found this surprisingly hard. I’d intended to do a fourth year for ages, and most of my friends were doing it - this strongly felt like the default, and doing something else felt risky and scary.
  • Being ambitious about taking on personal projects, rather than the default of being risk-averse and shying away out of fear of failure and of putting in effort
  • Realising that I can improve my social life by taking initiative - breaking free of the default path of forming surface level friendships with the people I run into naturally, and never putting myself out there.
    • Intentionally making close friends - practicing and learning how to form emotional bonds with friends has made my life so much better, and hopefully helped improve theirs too!
    • Taking social initiative - By getting better at reaching out, I’ve gotten really valuable mentorship and advice, job opportunities, and friends I now cherish
    • Having awkward conversations about ways I thought I’d hurt someone or been hurt, apologising, and growing closer rather than letting things fester
  • Improving myself and my life. Breaking out of the default mode of helplessness and realising that problems are for fixing, that I have the capacity to make things better.
    • Understanding my feelings of guilt, obligation and not meeting my standards, and learning how to manage this better
    • Understanding my own motivation and learning how to become excited about what I’m doing, or how to find things I am excited about.
    • Smaller everyday things - noticing small things which annoy me and fixing them, or items that could improve my life and buying them.
    • More generally, the spirit of self-experimentation, seeking novelty and doing new things, and being willing to go outside my comfort zone to see what that’s like. This has led to things from trying out pole-dancing to going round a room and asking people for sincere criticism.

As those examples hopefully illustrate, agency has been extremely valuable for me, and my goals. But it is not my place to tell you whether agency is right for you. Agency can be hard, stressful and exhausting! Sometimes the defaults are good enough. Instead, my goal in this post is to present the mindset of agency, and make a case that it can be valuable to you, for achieving what you want.

Exercise: What do you want? What are your goals? What are your dreams, your ambitions? How do you want to change the world? What is missing in your life? Take a minute to reflect on your favourite prompt before moving on.

Exercise 2: What’s stopping you?

What is Agency

When stated as ‘doing what is needed to achieve your goals’, agency feels like a pretty simple concept. But implicitly, I’m gesturing at a messy bundle of different skills. In this section, I want to break down what agency is, and the most important mindsets and sub-skills. These are neither comprehensive nor compulsory for achieving agency, but I’ve found all of them valuable to cultivate:

  • Noticing and avoiding defaults: A core lens I view the world through is that our lives are full of defaults. Expectations that society places upon us, social norms that we follow, our own conception of our role and expected duties and path through life, our own conception of what is ‘normal’ vs ‘weird’. A key part of agency is noticing when these constrain you, and being willing to break them.
    • Part of this is avoiding groupthink - being able to think non-default thoughts, think for yourself, think things through from first principles, and deeply caring about having true beliefs.
      • Eg the kind of person who was concerned about COVID at the start of February, or someone who grew up in a deeply religious household and decided to de-convert.
    • Note that agency is not about knee-jerk nonconformity. You need to be willing to not conform - that’s what it means to avoid a default. But non-conformity fails to be agency when it’s about not following defaults because they’re defaults, rather than to achieve your goals - that’s still letting defaults control you. Instead, strive to notice your defaults, and check whether you want to avoid them.
      • For example, trying hard in exams is often the conformist choice. But if I care about getting a great job and my grades matter for that, then choosing to try hard is the agentic choice, and slacking off is not.
  • Finding opportunities: Noticing when there are opportunities to achieve my goals, chances to take agency. And being good at making my own opportunities, and actively seeking them out, rather than waiting for them to fall into my lap
    • Part of this is creativity - being able to see what other people don’t see, and generate a lot of ideas
    • Part of this is not thinking in defaults - being open to weird and unusual ideas, and taking them seriously
    • To me, this feels similar to the mindset of learning the rules to a game, and reflexively looking for ways to exploit, munchkin and break them.
  • Intentionality: Understanding why you’re taking actions, and keeping your goals clearly in mind. Being mindful of wasted motion, and checking whether you’re actually achieving what you want. Being deliberate.
    • Note that this is separate from identifying your goals in the first place. Spending time thinking about and eliciting your deep goals and values, the skill of prioritisation, is incredibly important. But it is different enough to be worth separating from agency, which is about building on those goals.
      • Though this is a fuzzy boundary - we often absorb default goals from our social context, eg making money or seeking status. And it takes agency to look past this and check for what you actually value.
  • Taking action: Ultimately agency is about doing things to achieve my goals. It is important to learn how to convert thoughts and vague intentions into actions and change.
    • But agency is not just about having willpower and putting in effort, being able to act rather than procrastinating or always following the easy path. Agency is also about being strategic and intentional. For example, a hard-working student who exerts a lot of effort to read and re-read their notes exhibits less agency than a student who learns the material with half the effort by spaced repetition, or who realises they don’t care about this course and drops it.
  • Ambition: Thinking big, and not being constrained by small-mindedness. Often you can achieve things far more important than it feels at first. We are bad judges of what we are capable of, and it is a tragedy to let a lack of ambition limit what you can achieve.
    • This can look like many things - being ambitious in changing your life, believing that you can make progress on your biggest problems rather than being helpless; being ambitious as a researcher by identifying the most important problems in your field and working on them; being ambitious about having a major positive impact and making progress on the world’s biggest problems, eg believing you might be able to save hundreds of millions of lives.
      • An underlying insight here is the importance of fat tails and upside risk
    • Though note that agency is not just about ambition, it is also about being intentional. Agency looks like actually trying, not just doing actions that seem defensible under the pretext of some grand vision. Trying to understand the problem, finding the points of leverage, and forming a theory of change for how to achieve your ambition. Noticing when your strategies aren’t working, learning from this, and doing something differently.
  • Don’t be a bystander: A particularly poisonous mindset that can hold you back from agency is the bystander effect. Saying that something is “not my problem”, implicitly relying on someone else to fix it. Asking whether you have to do something about it, rather than whether you want to, or whether doing it could bring you closer to your goals or help others. Framing things in terms of blame and obligation, rather than asking if you could do something about it.
    • This can be local, personal problems or larger problems in the world. From realising you can take agency and fix the things in your life that make you unhappy, like poor sleep, to contributing to large global challenges like climate change or pandemic prevention
    • Part of this mindset is taking responsibility - realising that you can do things and influence the world, and that by taking it upon yourself to fix or improve something the world will be better than if you did nothing. That if you just rely on others, the world will be worse.
    • Part of this mindset is to pick your battles - taking responsibility can itself be poisonous if you make everything your problem, every single thing that is wrong in the world and your life, and feel guilty if you fail to solve them. Focus on the problems you most care about and have leverage on.
      • Further, sometimes you will never be able to “solve” a problem. Making progress on a big problem can still be really valuable. And flinching away from problems so large they feel unsolvable can also hold you back.

A final note after all this discussion of what agency is and isn’t. In practice, it is rarely sensible to ask “is this actually agenty enough?” or to imagine having to justify your agency - that pushes me towards doing things that are clearly and legibly weird and original. Instead, agency is relative to your goals, and your defaults. A socially anxious introvert who decides to throw a party demonstrates far more agency than a confident extrovert who does it every weekend. Agency is what you make of it. The important question is whether it helps you achieve your goals, not whether it looks appropriately brave or non-conformist.

The agency to improve the world

An application of agency that is particularly important to me is taking agency in improving the world, in finding the most effective ways to have a positive impact. Agency is important here because you can do far more good than you do by following default ways to do good - I see this as one of the key insights I’ve gotten from the Effective Altruism movement. You can achieve far more if you look for missed opportunities, ways to leverage your limited resources, ways to be far more ambitious and aim for a chance of having a major impact.

The mindset of taking responsibility for contributing to solving the world’s problems, rather than treating them as ‘not my problem’, can be particularly important here, and can represent the difference between doing something and nothing. Eg, realising that you can actually put meaningful effort into fighting climate change, rather than just recycling and being environmentally conscious. But I think this mindset can harm people, so I wanted to give my take on how I view this.

Often, this can be framed in terms of obligations. That the world is full of problems, it is my job to fix them, and I have to do this. I reject that mindset. You don’t have to do anything. If you don’t care about problems, that is your prerogative.

Instead, I think in terms of my values. Over the course of my life, I have the capacity to influence the world towards my values, and it is my responsibility to do something about this. But this is not some weighty obligation to resent and feel guilty towards - these are my values and I actually care about doing something about it. Personally, I care deeply about human flourishing. I have the capacity to influence the world to be a better and safer place, and it is important to me to do something about this, to take agency and be ambitious about it. I’m a big fan of Nate Soares’ thoughts here.

Cultivating Agency

What’s Holding You Back?

Agency can be pretty rare, and part of why it’s rare is that it’s hard! In particular, lots of things make it harder to be an agent. Before diving into how to develop agency, it’s worth examining what’s holding you back, and seeing which things you can relax. Even if you can’t solve the things holding you back, often just identifying them can help!

There are two important categories here, defaults and constraints.

  • Defaults
    • Default roles and expectations
      • Eg, that the role of a good student is to do well in assignments, so you pour all your effort into your degree, without checking how much you care
      • This is particularly bad with expectations around careers and life paths. This is one of the most important decisions you’ll ever make!
        • Eg, the idea that a maths student will go into software, finance or academia, and you just need to find the best option
        • Eg, having Asian parents who insist that you need to become a doctor or a lawyer
    • Social norms - a deep sense of what is “normal” vs “weird”
      • Eg, feeling unable to skip small talk and talk about something interesting, even if you think both of you would enjoy it
      • Of course, often following social norms is the right call! But it’s valuable to see it as a trade-off, with benefits and consequences, rather than an iron-clad rule that induces anxiety to violate
    • Cultural narratives and groupthink - the ideas you’re taught when you’re young, and the things everyone around you believes
      • Politics is particularly bad for this - it’s very hard to hold right wing ideas if all your friends are left wing, and vice versa
      • It’s worth noting this even when you think you agree with the idea! Eg, I’d find it pretty hard to conclude that COVID vaccines aren’t worth taking given my social circles. I also genuinely think the vaccines are great, but it’s much harder to be truth-tracking here!
    • Default strategies - Instrumentally achieving your goals the way all your friends do. This is much worse with peer pressure, but even just a sense of what is “normal” can hold you back
      • Eg, learning by going to lectures, making notes, and revising by re-reading and doing past papers
      • Eg, letting keeping in touch with friends happen naturally and spontaneously, rather than being deliberate and intentional about it
    • The illusion of doing nothing - The feeling that doing nothing is safe, that you can’t be blamed, and that taking action is scary. And thus procrastinating on actually doing anything.
      • This is much worse when I’m considering doing something with risk!
      • And when dealing with option paralysis
  • Constraints - When you don’t have enough of some important resource, and feel constrained. Fundamentally, lacking Slack in your life.
    • Money
    • Energy levels - it’s much harder to be agentic when you’re tired all the time!
      • Note, if you commonly feel fatigued, I highly recommend getting tested for the most common causes of fatigue! Some, like iron deficiency or hypothyroidism, are easy to treat and easy to test for but often go undiagnosed.
      • There can be more mundane causes - poor sleep, poor diet, lack of exercise, etc
    • Time
      • Note - it’s worth distinguishing between not having enough time, vs not being able to use time well. I find that when I’m stressed, I always feel like I don’t have enough time
    • Commitments - overcommitting your time, having a packed schedule and a long list of obligations
      • On a deeper level, sometimes the problem is feeling unable to quit things, and not knowing how to say no when someone asks you to take on a new commitment.
      • I’ve recently gotten value from making a form I need to fill out when taking on any new commitment, estimating how much time and effort it’ll take, and checking whether it’s actually worth it
    • Mental health
    • Attention/focus - are you constantly distracted? Do you ever have long blocks of at least 2 hours where you’re confident you won’t be interrupted, and can think and reflect on things?
    • Physical health
    • It’s worth checking how much of the resource you actually need, and trying to quantify things. Or whether you can systematise taking care of it. Often the problem is less the constraint itself, and more the mental space tracking it and stressing over it takes up. Eg, it’s much better to make and keep to a budget than to constantly stress over how much money you spend.

Of course, just noticing a default or constraint is much easier than solving it. So what can you do? This is hard to give general advice on, but often noticing is the first step to doing something about it. Some personal examples:

  1. I noticed the constraint of not having enough mental space to try new things or take agency, due to a lack of Slack and too many commitments. In the short term, I carved out my Sunday afternoons to relax and work towards non-urgent stuff, or whatever I was excited about, and carved out time for weekly reviews, to reflect on my longer-term goals and on what opportunities I was missing when too in the moment. In the longer-term, I’ve set myself a much higher bar for future commitments, quit the lowest priority ones, and am slowly reducing my load.
  2. I noticed that I cared a lot about what felt normal and safe, vs weird and unusual, and found it took a lot of willpower to deviate from this. In the short term, I found particularly useful kinds of weird things to do, and practiced doing them. In the longer-term, I’ve tried to surround myself with friends who are ambitious, altruistic and agentic, and I am sufficiently socially influenced that this has helped me get better at overcoming defaults in general.

Exercise: What’s stopping you? If you suddenly became significantly more ambitious, what would you want to do? And what’s holding you back from doing that now?

Feeling Agency

The main path to cultivating agency, as I see it, is to practice! To initially do agentic things occasionally, and with effort. To (hopefully) have them go well, and get positive reinforcement. And slowly practice and develop the mindset of agency, and have the mental moves feel smoother and more reflexive, until this is something you do naturally.

To make this more concrete, I find it helpful to reflect on what agency feels like, and how to make each part of the process smoother and more natural.

  1. First, I have the spark of an idea, something to do. Noticing some opportunity, inefficiency, or a desire to do something interesting.
  2. Noticing and nurturing that spark of an idea. Resisting the urge to instinctively flinch away from it, and actually thinking about it. Checking whether I actually want to do it, but actually checking, not just flinching away from something weird and risky. Exploring the idea, understanding it, and figuring out what I could actually do
  3. Taking action - finally, actually doing something about it! Being concrete, avoiding overthinking, procrastination and option paralysis, and actually doing something.

So, how to make each of these smoother? Some immediate thoughts:

  • Ideas
    • The main thing is to be creative, and to open myself up to new ideas
      • Making time to reflect, think and brainstorm. I really like 5 minute timers for this
      • Read a wide range of interesting things, and try to step outside your bubbles
      • Talking to other people
        • Note - it’s important to distinguish between “I am doing an idea because someone told me to”, without checking whether you want it, and “I genuinely want to do this idea someone else suggested”
    • Be open to weird ideas - notice the ideas you immediately flinch away from. Notice if there are things you see someone else do that you’d never have thought of. Notice your default patterns of thought, and what these close you off from
    • Ambition - Some of my best ideas come from being unafraid to think big.
      • Ask yourself, “What would I do if I was a way more ambitious person?”
      • If trying to solve a difficult and intractable problem, ask “If I managed to completely solve my problem, what happened?”
  • Noticing and nurturing
    • Filters: Notice the filters in your head that cause you to flinch away from an idea. And check whether you actually want to follow these constraints, and whether they are connected to your goals, or just reflexes stored in your head.
      • A big one is risk-aversion - I often flinch away from ideas because it could go wrong, and this feels scary
        • But, please do actually check for risk! Many people are not agenty enough, but some are too agenty, and if you have a lot of agency and don’t check for risk, you can really hurt yourself
      • Fear of judgement, and a desire for social conformity
      • A personal sense of identity - it’s hard to do something that doesn’t feel “me”. For example, on some level I identify as nerdy and averse to exercise, and this makes it way harder to entertain ideas involving exercise
      • Often this is particularly hard because the flinch happens on auto-pilot, but noticing it takes self-awareness. For this problem, I find the technique of noticing emotions particularly helpful
    • Redefine failure: Often I flinch away from an idea because I’m afraid of failure. In this case, I find it useful to redefine what I’m aiming for, what success/failure actually mean. If you can pull this off, and orient towards something meaningful, failure is literally no longer an option!
      • A big one for me is making taking agency my goal. Deciding that this is a skill I want, and that any time I take agency I’m making progress.
        • Building it into my identity, and trying to become a person who actually does things
        • In the long-term, you don’t just want agency without good judgement. But I find it is much easier to first become able to do things, and then filter for the things most worth doing, rather than trying to do both at once.
      • Seeking novelty. It’s easy to systematically under-explore and not try new things enough, because the costs of not doing the standard option are concrete and visceral, while the benefits of discovering a new and better option are abstract. To counteract this, I try to build novelty seeking into my identity
      • When I’ve redefined failure, I also get past the endless agonising over whether this is the “best” thing to be doing. I’m no longer analysing the object level action, I’m making progress towards the kind of person I most want to be.
    • Be concrete: It’s easy to flinch away from an unknown idea without properly exploring it. Assume you do explore the idea, and try to make concrete what you would actually do. Suppose it goes wrong, and flesh out exactly how bad this would be, and what you could do about it.
      • It’s much easier to entertain an idea than to actually take action, and this helps reduce the flinch
  • Taking action
    • Be spontaneous and fast!
      • Create tight loops between taking actions and getting results.
      • Put a big premium on doing something now rather than later. Don’t leave enough time for motivation to fade
      • Have easy ways to rapidly commit to something. Eg, messaging the intention to a friend, putting it in your to-do list, scheduling a time to do it properly in your calendar, etc. Put in effort beforehand, to enable yourself to be spontaneous in the moment
    • Avoid overthinking and overplanning. Identify a rough plan, and a concrete first step, and then take it - you don’t need a perfect plan to start doing something.
    • Set a 5 minute timer, and try to do as much as you can before the timer goes off - often this is enough to get enough activation energy
    • Try to have a clear goal/intention when acting, don’t just go through the motions
    • I collect some related strategies in Taking the First Step
  • Reward - Finally, you want to feel good about taking agency after the fact! The main value of practicing agency is getting better at the skill, and for that you need positive reinforcement.
    • Ideally this happens automatically, if you take agency towards good things!
      • And if you take agency towards actively bad things, it’s worth checking whether agency is a skill you actually want to cultivate
    • Redefining failure really helps here - if you can get excited about seeking novelty or taking agency, it’s much easier to get strong and rapid positive reinforcement
    • Seek tight feedback loops - practice agency on small things where you’ll quickly find out whether it was a good idea

Concrete Advice

The following is a grab bag of more concrete ways to cultivate agency and put this into practice. Some of these are contradictory, and aimed at different people - look for the ones that resonate, and might be of value to you!

  • As outlined above, practice! Look for opportunities for agency in everyday life, and take them for the sake of practicing the skill.
    • This can be incredibly minor things, eg being the one who gets up from the table to refill an empty water jug.
  • Make time to regularly reflect. I am a big advocate for weekly reviews
    • Prompts like “what would I do if I was being really ambitious?”, “what opportunities came up recently?”, “what am I missing?” or even “how could I be more agentic?” can be really helpful
  • Try to take ideas seriously. Notice when you flinch away from something because it feels weird or effortful, and actually think through the pros and cons. Give yourself permission to be weird and ambitious and to actually try
  • Notice the defaults in your life, and make efforts to step past them. Try new things! Expose yourself to new ideas, and new ways of thinking. Make friends very different from yourself.
  • Take care of yourself! Notice if you’re spreading yourself too thin. Make sure you have energy, good health, and take care of anything causing stress or taking up mental space. Build good systems. If these things are going wrong, aggressively prioritise dealing with it.
  • Take an action orientation - try to default to saying why not, rather than no. Be willing to experiment and see what happens
  • Seek mentors and role models who have agency, and who you can learn it from
    • Note - there are different kinds of mentorship, and most will not give you this
    • Personally, I find that by far the best way to teach agency to other people is via 1-1 conversations. Understanding what the other person wants and their problems and constraints and defaults. And making suggestions for ways to do something differently and take agency. And helping them check whether this is actually what they want, and then making the intention concrete and putting it into practice

Exercise: Did any of these resonate? What is something concrete you could implement in your life? Set a 5 minute timer, and do something about it right now.


Most of this post has been cheerleading for agency, and treating it as a virtue. But it’s worth reflecting on the drawbacks, and ways too much agency can hurt you. Some particularly notable drawbacks:

  • The attractor of non-conformity. Feeling uncomfortable doing anything normal, and defaulting to ignoring standard wisdom unless it’s obviously true. Sometimes things are done for a reason!
  • There are social consequences to weirdness, especially in certain cultures and social circles. This makes me sad, and I strongly advocate looking for friends who will help nurture your agency, but it is a real cost and consequence
  • Agency adds a lot of variance to things. The default path is normally fairly safe, while agency opens up a lot of new avenues. The prototypical example is someone who decides to try a lot of drugs, doesn’t know how to do it safely, and ends up badly hurting themselves.
  • Agency is hard. Making your own path can be exhausting and stressful, while following the default path can be pleasant and fine. Optimising small things may not be worth the effort
    • Again, I recommend picking your battles! It’s easy to fall into the attractor of thinking you must be agentic in everything, and feeling guilty when you’re not.

Overall, agency is one of the most useful skills I’ve ever gained (and I still have a lot of room to grow!). And hopefully in this post I’ve helped to flesh out what, exactly, I mean by agency, reasons to value it, and concrete ways to cultivate this skill.

So, if this post resonated and you want to gain agency, my final challenge to you is this. What are you going to do about it?

Thanks to Duncan Sabien and the LessWrong Feedback Service for valuable feedback!


Covid 10/21: Rogan vs. Gupta

October 21, 2021 - 18:10
Published on October 21, 2021 3:10 PM GMT

I finally got my booster shot yesterday. I intended to get it three weeks ago, but there was so much going on continuously that I ended up waiting until I could afford to be knocked out for a day in case that happened, and because it’s always easy to give excuses for not interacting with such systems. When I finally decided to do it I got an appointment literally on my block an hour later for that and my flu shot, and I’d like to be able to report there were no ill effects beyond slightly sore arms, but I’m still kind of out of it, so I’ll be fine but if I made some mistakes this week that’s likely the reason. I also had to wait the fifteen minutes. I would have simply walked out the moment they weren’t looking, but they held my vaccine card hostage until time was up. 

We now have full approval for every iteration of booster shots, including mix and match, for those sufficiently vulnerable. If you’re insufficiently vulnerable but would still rather be less vulnerable, there’s a box you’ll need to check. 

I got a chance to listen to the Rogan podcast with Gupta, and have an extensive write-up on that. It was still a case of ‘I listen so you hopefully do not have to’ but it was overall a pleasant surprise, and better than most of what passes for discourse these days.

Executive Summary
  1. Conditions continue to improve.
  2. Booster sequences including mix-and-match have been approved.
  3. Rogan did a podcast and I listened to it so you don’t have to.

Let’s run the numbers.

The Numbers

Predictions

Prediction from last week: 481k cases (-12%) and 9,835 deaths (-11%).

Results (from data source unadjusted): 472k cases (-15%) and 11,605 deaths (+1%).

Results (adjusted for Oklahoma which will be baseline for next week): 472k cases (-15%) and 10,705 deaths (-3%). 

Prediction for next week: 410k cases (-13%) and 9,600 deaths (-10%).

Wikipedia reported over 1,100 deaths in Oklahoma this week. That’s not plausible, so I presume it was a dump of older deaths or an error of some kind, and removed 900 of them from the total. 
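For readers who want to check the arithmetic, here is a minimal sketch of the adjustment and of how the quoted percentage changes map to next week's predicted counts. The figures are the ones quoted in this post; the function and variable names are my own illustrative choices, not part of any data pipeline the author uses.

```python
# Figures quoted in the post (weekly US totals).
reported_deaths = 11_605    # unadjusted deaths from the data source
oklahoma_removed = 900      # deaths removed as a presumed backlog dump or error
reported_cases = 472_000    # weekly cases, unaffected by the adjustment

# The adjusted total becomes the baseline for next week's comparison.
adjusted_deaths = reported_deaths - oklahoma_removed


def project(value, pct_change):
    """Apply a percentage change (e.g. -13 for a 13% drop) to a baseline value."""
    return value * (1 + pct_change / 100)


if __name__ == "__main__":
    print("Adjusted deaths:", adjusted_deaths)
    print("Cases at -13%:", round(project(reported_cases, -13)))
    print("Deaths at -10%:", round(project(adjusted_deaths, -10)))
```

The projected values land close to the rounded predictions of 410k cases and 9,600 deaths, which suggests the predictions are stated against the adjusted baseline rather than the raw one.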

There’s no hard and fast rule for when I look for such errors or how I do the fixes, so you can decide if what I’m doing is appropriate. Basically if an entire region gives a surprising answer I’ll look at the individual states for a large discrepancy, which is at least slightly unfair since sometimes it makes the number look ‘normal’ when it shouldn’t, but time is limited. 

This is still more deaths than I expected, but given cases continue to drop I expect deaths to keep dropping. It’s possible there was another past death backlog I didn’t spot because it wasn’t big enough to be obvious.


Chart and graph are adjusted (permanently) by -900 deaths this week in Oklahoma.

Death counts seemed higher than plausible in general even after the fix, but it’s a small mistake. Next week will tell us whether or not it is a blip.


The South’s situation continues to improve rapidly, and it now has fewer cases than multiple other regions, but we see improvement everywhere. Solid improvement in the more northern states is especially promising in terms of worries about a possible winter wave. Can’t rule it out, but it seems somewhat less likely. 

We are now down more than 50% in cases from the recent peak, and over the last five weeks, although regionally that is only true in the South. But we’ve clearly peaked everywhere. 


Nothing ever changes. Which at this point is good. Steady progress is more meaningful each week as more of the population is already vaccinated.

Vaccine Effectiveness and Approvals

In the words of Weekend Editor: Today the FDA formally authorized Moderna boosters, J&J boosters, and all the mix-and-match combination boosters. This is very aggressive for them! 

Indeed, and congratulations to the FDA for doing the right thing, at least on this particular question. When someone does the right thing is the time to thank them, no matter how long overdue it might be. 

As per procedure now the CDC gets to have all the same discussions, because if there’s one thing we need enough of it’s veto points. We’ll know the outcome on that next week.

Vaccine Mandates

Support for older vaccine mandates is declining. This could end up being quite bad.

There continue to be claims that there will be massive waves of people quitting over vaccine mandates, this time in New York City. No, we are not going to lose half our cops, no matter how excited that prospect makes some people these days. We’ll find out soon enough:

On the one hand $500 is in the ‘let’s actually get it done’ range and worth it if it works, and should smooth over any general grumbling, on the other hand it’s enough that I’m pissed off that they’re getting that many extra tax dollars for what they should have done anyway. 

Here’s a Zeynep thread on the psychology involved. Lot of good food for thought.

A vaccine mandate carries with it the requirement to verify that it is being followed. This in turn means verifying people’s vaccine status and ID at various points. When does this end? Some people who, based on their previous writings, really should know better are seemingly fine with ‘never’, and I notice I am most definitely not okay with that for everyday activities. There will be a point where these mandates for everyday actions turn negative, and it’s not that far off, and then we’ll have to figure out how to unwind them. Would be increasingly happy to start now. 

This post looks at vaccine persuasion in Kentucky, notes that $25 Walmart gift cards were a big draw, doesn’t seem to offer much hope that persuasion via argument would work. But have we tried bigger gift cards?

California state workers are somehow vaccinated at a rate much lower than the state average. Ignoring for the moment that they don’t seem to be doing much to fix this problem, one can draw various conclusions about how the state government operates and hires based on this. And one can wonder why, if the state is willing to impose so many other restrictions, they can’t or won’t take care of business in this way. However, the article also notes that this is comparing the number who provided proof of their vaccination status as employees, versus the number who actually got vaccinated as adults. It seems that some employees may have simply decided not to provide proof, either as a f*** you to the demand for proof of vaccination, or because it seems that if you don’t vaccinate they ‘make’ you get tested a lot but that’s free and some people like the idea of getting tested frequently, so whoops. 

Washington State’s football coach is out: 

So is an NHL player, and even if you oppose mandates I hope most of us think that faking a vaccine card is not a permissible response.

This is a remarkably high tolerance of importantly fraudulent behavior. Very much does not seem like a sufficient response.

Not endorsed, but noting the perspective that the unvaccinated are holding us hostage, because the threat of potentially running out of health care capacity is the reason we still take major preventative measures, and if everyone got vaccinated we would go back to normal. I find the hostage situation metaphor apt because hostage situations are mostly because we choose to care about them. Every so often, someone on a show will grab a hostage, and the response will quite correctly be ‘I’m not going to reward threats to destroy value by giving you what you want’ and I wrote a contest essay back in grade school arguing this should be standard procedure. Instead, we’re more like a DC hero who thinks that if you point a weapon at any random citizen they are forced to hand over the world-destroying superweapon codes. I will leave it to you to draw the appropriate metaphor to our current situation on other fronts.

NPIs Including Mask and Testing Mandates 

From New Zealand and Offsetting Behavior comes the story of Rako. Rako offered to scale up their Covid-19 tests, the government said they weren’t interested, then when it turned out the tests were good they reversed course and decided to take what tests and capacity did exist without paying much for them, among other disasters going on there, and hope that somehow anyone will be interested in helping with such matters next time around. It doesn’t look good. Neither does the Australian decision not to securely keep the police away from the contact tracing records.

How much should you update on a Covid-19 test? We’ve got a new concrete reasonable attempt to answer that, although it still uses the PCR test results as their ‘gold standard’ and thus is underestimating the practical usefulness of other testing methods. 

The Bayes factors for positive tests are pretty high. The ones for negative results are less exciting, but if you’re focusing on infectiousness, I believe you end up doing a lot better than this. Those who do math on this stuff a lot are encouraged to look into the details.
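For those who want to play with the math, here is a minimal sketch of how a Bayes factor turns a prior probability into a posterior, using the standard odds form of Bayes' rule. The sensitivity, specificity and prior below are hypothetical placeholders chosen for illustration; they are not the figures from the linked analysis.

```python
def bayes_factors(sensitivity, specificity):
    """Return (positive, negative) Bayes factors for a binary test.

    positive = P(test+ | infected) / P(test+ | not infected)
    negative = P(test- | infected) / P(test- | not infected)
    """
    positive = sensitivity / (1 - specificity)
    negative = (1 - sensitivity) / specificity
    return positive, negative


def update_probability(prior, bayes_factor):
    """Posterior probability via: posterior odds = prior odds * Bayes factor."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)


if __name__ == "__main__":
    # Hypothetical test characteristics and prior, for illustration only.
    pos_bf, neg_bf = bayes_factors(sensitivity=0.80, specificity=0.99)
    prior = 0.05
    print("After a positive result:", update_probability(prior, pos_bf))
    print("After a negative result:", update_probability(prior, neg_bf))
```

With numbers shaped like these, a positive result moves you a long way while a negative result moves you much less, which is the asymmetry described above. If what you care about is infectiousness rather than any trace of Covid, the relevant sensitivity is a different quantity, and the negative-result factor would change accordingly.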

It turns out Rapid Antigen Testing works rather well at telling who is infectious. Here’s a thread explaining why they’re much more accurate than people thought, which is that when the PCR tests came back with what were effectively (but arguably not technically) false positives, not matching those was a failure. The fallback general official response has for a long time been something like this.

Where ‘have Covid’ is defined as ‘have or recently have had any trace of Covid’ although that’s rarely the thing that has high value of information. It is very reasonable for someone to want to know if they have (or others have) Covid, and it is also very reasonable for someone to want to know if they are (or others are) infectious. Different purposes, different tools, and it turns out both tools are highly useful. The mistake we made for over a year was saying that because people might also want to know if they have Covid, the test that is very good at detecting infectiousness and less good at detecting Covid was illegal, so we should instead not test at all or use a test that was more expensive, slower and less useful in context. It is good that things seem to be coming around a bit.

Restaurant that isn’t as good as Shake Shack lets people in, is now told they are out. 

This was only a temporary shutdown of one store, since they only have one store in San Francisco. Depending on exactly where their stores are, this could be a very smart move, as it wins them massive points with the outgroup.  

Permanent Midnight

Some of us think the ultimate goal is to become complacent about Covid-19 once it’s no longer a major threat, and return to our lives. Our official authorities say, madness.

This is an explicit call for vaccinated children to be forced to mask permanently. This is utterly insane. If not them, then who? If not now, then when?

I sincerely hope the kids neither forgive nor forget that this happened to them.

This also brings up another of my humble proposals of ‘maybe we should teach children that skipping a meal every so often is fine, so they have that valuable skill in life that’s done me worlds of good’ but mostly it’s that they are literally forcing children to go outside in the rain to eat.


Here’s Sam Bankman-Fried going over why calls for large permanent interventions are nowhere near ever passing cost-benefit tests, and giving an attempt at a calculation and thus an opportunity to nitpick and refine.

Semi-constant mask wearing costs a lot more than 0.4% of GDP. I don’t know exactly what you’d have to pay people to get them to wear masks indefinitely (with no other benefits) but I’d be stunned if it’s under 1% of consumption. Even if we knew it would work on its own with no help I have no idea why you’d even consider this.

Staying home if you have a sick housemate, to me, seems mostly like a good idea even before Covid-19. You can call this a cost of one day per year, but you have to make a bunch of assumptions to get there. Days of work can’t be fungible, so taking a random day off means your productivity is lost and can’t be made up later, and there aren’t substantial benefits from taking that extra day off on the margin. But that’s kind of weird, since if there was such a big net loss from losing a random day of work it strongly implies you’re not working enough at the baseline. And it seems likely to me that you save the office collectively a full day of productive work (since being low-level sick makes work less effective on top of less fun) by avoiding additional infections. 

The exception here would be if work on that day can’t be done from home, and isn’t fungible with either other times or other people, so you lose something close to a full day’s productive value. I think that is rarely the case, and that Sam’s history of being stupidly productive at all times makes this a blind spot. For most people, my model says that either (A) you can mostly get others to cover for you without too much loss and (B) most of the work where this isn’t true can be done remotely for a day or two. 

Zoom meetings are a mixed bag, but this week I had my first in-person work meeting in over a year and it was incredibly more productive than a similar Zoom meeting would have been. There are big advantages the other way, so this won’t always be the case, but I strongly agree that giving up on seeing people in person is a productivity (and living life) nightmare that costs way more than we could plausibly give up. But on the margin, more Zoom meetings than in 2019 is good, actually.

That leaves the vaccines, which he estimates at a day of cost, and I don’t understand this number at all. Sometimes the vaccine will knock one out for a day, but this does not need to be the case and I wrote most of this the day after getting my booster shot. Over time, we’ll figure out the right dosing and regimens and the side effect impact will decline, and you can plan ahead so you choose a day when it’s not that expensive to be somewhat out of it. 

If we end up passing a ‘everyone must miss a work day after the shot so everyone feels permission to get the shot’ law then it could end up costing a day, I guess, but also giving people some paid time off at the time of their choice that they plan for seems like it isn’t even obviously a bad idea?

Rogan Versus Gupta

I got the chance to listen to Joe Rogan’s podcast with Dr. Gupta. It’s a fascinating combination of things, some of which are great and some of which are frustrating and infuriating, from both of them.

The opening is a discussion of why the two of them were willing to sit down together. Gupta sat down with Rogan to try and understand Rogan’s thinking process and because Rogan can reach a huge audience that is otherwise exceedingly difficult to reach, and to convince Rogan on vaccines. Rogan sat down with Gupta because Gupta’s public changing of his mind on marijuana (which they talk then about a bit) revealed to Rogan that Gupta is willing to look at the data, change his mind and admit when he’s wrong. 

In this part, both of them acquitted themselves well. The central point here was well taken. On its surface it was about the potential of marijuana and why we should not only legalize but embrace it and research what it can do for us, and while I don’t have any desire to use it myself, I am totally here for that. 

The real point was that one needs to think for oneself, look at the data with your own eyes and an open mind, be curious and come to conclusions based on where that takes you, and that doing this is how you earn many people’s respect. That Gupta was here with the ability to engage in (admittedly imperfect, but by today’s standards pretty darn good) discourse because he’d shown himself in the past to be an honest broker and truth seeker acting in good faith. 

They then started getting down to it and discussing the situation in earnest. Compared to my expectations, I was impressed. Joe Rogan came in curious and seeking truth. He had many good points, including some where he’s more right than and where he was wrong, he was at least wrong, making substantive claims for reasons and open to error correction and additional data and argument. He was continuously checking to see if Gupta’s story added up and whether it lined up with Rogan’s model of the world in general, but was quite open to learning more.

Like any discourse or debate, there were many ways all participants could have done better.

Several people have noted that Joe Rogan is drawing a distinction between vaccines, where the burden of proof of safety is being put on the vaccines, and various other things like Ivermectin, where he largely puts the burden on others to show they are not safe, and holds them to a very different standard. In general, it seems like Rogan is hunting for an angle whereby the vaccines will look risky. Not full soldier mindset, but definitely some of that going on. 

It’s worth noting that Rogan explicitly states in minute 59 that the risks from the vaccines are very, very small. This is despite Rogan listing off people he claims to know who had what he thinks are deadly serious adverse reactions, so it’s not clear to me that he in his position should even believe these risks are all that small. 

Rogan’s point that Gupta is at far greater risk as a vaccinated healthy older adult, than a child would be unvaccinated, is completely correct and a kill shot when not tackled head on. None of our actions around children and this pandemic make any sense because we refuse to reckon with this. Gupta has no answer. The response ‘I think you have to draw a distinction between those that have immunity and those that don’t’ is not a meaningful answer here – saying the word ‘immunity’ and treating that as overwriting age-based effects is Obvious Nonsense and Gupta is smart enough to know that. As are his attempts to move back and forth between risk to self and risk to others when dealing with kids. If he wants to make the case that vaccinating kids is mostly about protecting others, that’s a very reasonable case, but you then have to say that part out loud.

Which is why Rogan keeps coming back to this until Gupta admits it. Gupta was trying to have it both ways, saying he’s unconcerned with a breakthrough infection at 51 years old, and that young children need to be concerned about getting infected, and you really can’t have this one both ways. Eventually Gupta does bite the bullet that child vaccinations are about protecting others, not protecting the child (although he doesn’t then point out the absurdity of the precautions we force them to take), and frames the question in terms of the overall pandemic. 

The question of protecting others was a frustrating place, and the one where I’m most disappointed in Rogan. Rogan pointed out that vaccinated people could still spread Covid-19 (which they can) and then said he didn’t see the point of doing it to protect others, whereas he’s usually smarter than that. Gupta pointed out that the chances of that happening were far lower, although he could have made a stronger and better case.

Gupta was very strong in terms of acknowledging there was a lot we didn’t know, and that he had a lot of uncertainty, and that data was constantly coming in, and in engaging the data presented with curiosity and not flinching, if anything taking Rogan’s anecdata a little too seriously but in context that was likely wise. 

The key moment where Rogan turns into the Man of One Study seems to start in minute 62. In response to Gupta referring to the study, Rogan has it brought up. The study’s surface claim is that for some group of young men, the risk of the vaccine causing myocarditis is 4.5x the chance of being hospitalized for Covid-19. Gupta had previously pointed out that the risk of myocarditis from Covid-19 is higher than that risk from the vaccine, and tries to point out that the study here is not an apples-to-apples comparison, as it’s comparing hospitalization risk to diagnosis risk. Rogan grabs onto this and won’t let go. It takes a few minutes and Gupta stumbles in places, but around the end of minute 65 Gupta gets through to Rogan that he’s claiming myocarditis risk from the disease is higher than from the vaccine. Rogan responds that this is inconsistent with the data from the study, which seems right. Then Gupta gives the details of his finding, but his finding is based on all Covid-19 patients in general, which is consistent with this particular risk being higher for young boys from the vaccine than from Covid-19, and potentially with the results of the study. 

At another point, Gupta threw the Biden administration under the bus on the issue of boosters, blaming them for daring to attempt to have an opinion or make something happen without waiting for word to first come from the Proper Regulatory Authorities, and claiming this was terrible and caused two people to resign and treating their decision to resign as reasonable (Rogan was asking about the resignations repeatedly). He equated ‘data driven’ with following formal procedure and only accepting Officially Recognized Formats of Data. I wasn’t happy about this, but the alternative would be to start speaking truth about the FDA.

My model is that Rogan’s take on vaccines differing from the standard line comes mainly from Rogan placing an emphasis on overall health and the strength of a person’s immune system, and from taking these questions seriously and spotting others not taking the questions seriously. 

Rogan’s entire model of health and medicine, not only his model of Covid-19, consistently gives a central role to maintaining overall good health. People should devote a lot of time and effort to staying in good health. They should eat right, exercise and stay active, maintain a healthy weight, take various supplements and so on. This is especially important for Covid-19, whose severity seems highly responsive to how healthy someone is, with large risk factors for many comorbidities, although not as large as age. 

From Rogan’s perspective, one option against Covid-19 is vaccination, but another option is to get or stay healthy. As Gupta points out multiple times, this is a clear ‘why not both’ situation, except that there’s complete silence around helping people get healthy, even though it’s a free action. It’s worth getting and staying healthy anyway, why not use Covid-19 as an additional reason to get people started on good habits? And if you’re unwilling to help people get healthy, why should we listen to you about this vaccine? Which is a fair point, you mostly shouldn’t listen to these people in the sense that their claims are not in general especially strong evidence. It’s that in this case, it’s very clear for multiple distinct reasons that they are right.

Minute 88 is when they get into Ivermectin. Joe Rogan is not happy that he was described as taking ‘horse dewormer.’ As he points out, this is very much a human medicine, regardless of how some people are choosing to acquire it, and those people are not him: “Why would they lie and call it horse dewormer? I can afford people medicine, motherf***er, this is ridiculous. It’s just a lie. Isn’t a lie like that dangerous? When you know that they know they’re lying?” 

So then he played the clip, and the CNN statement wasn’t lying, exactly. It was technically correct, which as we all know is the best kind of correct – it said that he said he had taken several drugs including Ivermectin. Then it said that it was used to treat livestock, and that the FDA had warned against using it to treat Covid. Now all of those statements are technically correct – the FDA definitely warned about it and doesn’t want you doing that, and among other things Ivermectin is used to treat livestock, although it is also often used for humans and Rogan had a doctor’s prescription. 

Now, in context, does that give a distinctly false impression to viewers? Yes. Are they doing that totally on purpose in order to cause that false impression? Absolutely. Is it lying? Well, it’s a corner case, and technically I guess I’m going with no? Gupta’s response is that they shouldn’t have done it, but he’s not willing to call it a ‘lie’ and is denying that there was glee involved. (Morgan Freeman narrator’s voice: Oh, there was glee involved.)

Rogan asks, if they’re lying about this, what do we think about what they’re saying about Russia, or any other news story? And my answer would be that this is the right question, and that it’s the same thing. They’re (at least mostly) going to strive to be technically correct or at least not technically wrong, and they’re going to frame a narrative based on what they want the viewer to think, and as a viewer you should know that and act accordingly.

Later on comes the part that should be getting more attention. In minute 125, Rogan explains that he almost got vaccinated, but didn’t, and what happened.

  1. The UFC got some doses and offered one to Rogan. He accepted.
  2. Logistical issues. Rogan had to go to a secondary location to get it, his schedule didn’t allow it, had to be somewhere else, planned to take care of it in two weeks.
  3. During the two week period, Johnson & Johnson got pulled.
  4. Also, his friend had a stroke and Rogan connected this to the vaccination, whether or not this actually happened.
  5. Rogan goes “holy ****” and gets concerned.
  6. Another of Rogan’s friends has what looks like a reaction to the vaccine, gets bedridden for 11 days. And another guy from ju-jitsu that he knows had what looked like another issue, having a heart attack and two strokes.
  7. A bunch of these reactions don’t get submitted to the official side effects register.
  8. Rogan concludes that side effects are likely to be underreported.
  9. Rogan goes down a rabbit hole of research, finds opinions on shape of Earth differ.
  10. Rogan doesn’t get vaccinated, thinking he’s healthy and he’ll be fine.
  11. Rogan gets Covid-19, his family presumably gets it from him (Minute 135), it isn’t fun, but he gets over it and he’s fine, and they get over it and they’re fine.
  12. Rogan tells these stories to millions of people, teaching the controversy, but still advocating vaccination for the vulnerable and for most adults, but is highly skeptical about vaccinating kids and thinks people should be free to choose.
  13. Gupta tries to get Rogan to get vaccinated despite having been infected, while admitting Rogan has strong immunity already, which goes nowhere.
  14. Rogan says repeatedly that he’s not a professional, that you shouldn’t take his advice, to listen to professionals, that he is just some guy with no filter. But this includes naming The Man We Don’t Talk About as an expert.
  15. But of course, he knows that saying ‘my advice is not to take my advice’ mostly never works.

The first thing he mentions in his story, the start of this reversal, is when they pulled Johnson & Johnson to ‘maintain credibility.’ This is a concrete example of the harm done by that action. It contributed directly to Rogan not being vaccinated. That speaks to how many other people had similar reactions, and also Rogan then shared his thoughts with millions of people, some of whom doubtless therefore did not get vaccinated. 

The bulk of his points were about side effects in particular people that Rogan knew. From his perspective, the side effects looked very much like they were being severely underreported, especially since these particular side effect cases weren’t reported. How could he not think this? From his epistemic position, he’d be crazy not to think this. He has quite a lot of friends and people who would count as part of the reference class that he’d observe here, and the timing of some of what looked like side effects could easily have been a coincidence rather than causal, but still, he saw what looked like three of these serious cases in rapid succession, in people who seemed otherwise healthy. Meanwhile, similar risks are being used as a reason to pull one of the vaccines. 

He responded to all this quite strong (from his position) Bayesian evidence, combined with his good health and his model that Covid-19 was unlikely to be that bad for him, by doing a bunch of research that under these circumstances put him in contact with a bunch of Covid-19 vaccine skeptics, and then declining the vaccine. 

I strongly feel he made the wrong decision, took an unnecessary risk and would have been much better off getting vaccinated. But mostly the heuristics and logic used here seem better than blindly trusting a bunch of experts. Sometimes that gets you the wrong answer, but so does trusting the experts. 

Given he continues to mostly advocate for vaccination of adults, and seems to have come around to believing the generally accepted vaccine safety profile, that both speaks highly of the epistemic process he has used since he was exposed to a bunch of his good friends who were peddling other conclusions rather forcefully, and also makes me think I know where he did make his mistake.

My guess (and I could be wrong, he didn’t make this explicit) is that the decision ultimately came down in large part to blameworthiness in Rogan’s mind. In the frame most of us have, vaccines are safe and effective, so if you get Covid-19 without being vaccinated that’s on you, and if you have one of the exceedingly rare serious side effects (many or more likely most of which are a coincidence anyway) then that’s not on you. The incidents with his friends reversed this for him, combined with thinking that outcomes from Covid-19 are tied to health. In his mind, if Covid-19 got him, that was his fault for being unhealthy. If the vaccine got him, that would be on him for seeing these things happening to his friends, and taking it anyway. So he did what most people do most of the time, especially when he saw the decision as otherwise only a small mistake, and avoided what he thought of as blame, and did what he could feel good about doing. And of course, the decision was in many ways on brand. But the undercurrent I sense is that yeah, he knew it was objectively a mistake in pure health terms, but not a huge one, so he just did it anyway. 

One thing that reinforces this is that Rogan comes back repeatedly to individual examples of people, especially young healthy people, who had problems that happened after getting vaccinated, and says that it was overwhelmingly likely that that particular person would have been fine had they gotten Covid-19. Which is true, but it was also, before the fact, far more overwhelmingly likely that they would get vaccinated and not have the problem they ended up having. If you trade one risk for another smaller risk, sometimes the smaller risk happens to you. That’s what a risk is. But if you instinctively use forms of Asymmetric Justice, what matters is that this particular person is now worse off, even if on net people who took such actions are better off, therefore blame.

That of course is an aspect of vaccines being held to a different burden of proof. In his mind and many others, they’re unsafe until proven safe, and that includes long term data, and the prior on ‘artificial thing we made to do this’ in some sense is stronger than any of our ‘this is how this mechanically works or when we’d see the effects show up’ style arguments could hope to be. Whereas he puts his assortment of other stuff into a different bucket, with a different burden and a radically different prior. Which isn’t a crazy thing to do, from his perspective, although I don’t see it as mapping to the territory.

They finish up with a discussion about the lab leak hypothesis, and they certainly don’t make me less suspicious about what happened on that front. 

That’s a giant amount written about a three hour podcast I listened to (mostly at 1.5x speed) so you didn’t have to. It was less infuriating than I expected, and contained better thinking, and is to be hailed for its overall good faith. We need to be in a place where such actions and exploration are a positive thing, even when they make mistakes and even when they end up causing people to be pushed towards worse decisions in many cases. 

In Other News

Bioethicists have profoundly inverted ethics.

No, seriously, imagine speaking this sentence out loud. Say, to whoever is listening, “We don’t ask people to sacrifice themselves for the good of society.” 

Then realize that bioethicists are far more insane than that, because what they’re actually saying is, “We don’t allow people to sacrifice of themselves, or take risks, for the good of society.” 

Over half of respondents to this survey report being lonely, with only a small effect from identifying as autistic. We had a crisis of loneliness before and Covid-19 had to have made it much worse, and at this point I worry about such effects far more than Covid-19.

Not Covid, but a good politician never wastes a crisis, so here’s a look into the child care portion of the Build Back Better bill. I solved for the equilibrium, and I doubt anyone’s going to like it.

As the weeks continue to blend into one another, it seems like it’s getting to be time to formally write up my lessons from the pandemic. I don’t know when I’ll have the bandwidth, but I’m authorizing people to periodically ask why I haven’t finished that yet.


Rationality Vienna goes hiking

October 21, 2021 - 14:34
Published on October 21, 2021 11:34 AM GMT

We’ll meet at 2pm at the terminus of tram D (Nußdorf, Beethovengang) to hike along Stadtwanderweg 1.

I expect around a dozen participants to show up (90%: 6 to 18).

If the weather forecast is not appealing, we might switch to a café on short notice, in which case the change will be posted here. Assume the default plan if there’s no announcement.


NATO: Cognitive Warfare Project

October 21, 2021 - 12:57
Published on October 21, 2021 9:57 AM GMT

NATO seems to have a project on cognitive warfare and a few public reports online:

Interim Report

Based on the Understanding Phase findings, NATO has identified the following priorities:
- Develop a Critical Thinking online course
- Develop improvements to the decision making processes
- Leverage technologies, including VR and AI to develop tools in support of better cognition and better decision making

1 Jun 21 Cognition Workshop Report

Cognition includes three interrelated aspects that are reflected in the structure of the workshop: information, decision-making and neuroscience.

Cognitive Warfare

As global conflicts take on increasingly asymmetric and "grey" forms, the ability to manipulate the human mind employing neurocognitive science techniques and tools is constantly and quickly increasing. This complements the more traditional techniques of manipulation through information technology and information warfare, making the human increasingly targeted in the cognitive warfare. 


Successful Mentoring on Parenting, Arranged Through LessWrong

October 21, 2021 - 11:27
Published on October 21, 2021 8:27 AM GMT


In June 2021, Zvi posted The Apprentice Thread, soliciting people to offer, or request, mentoring or apprenticeship in virtually any area. Gunnar_Zarncke offered advice on parenting, as the parent of four boys (incidentally, true of my grandmother as well) between the ages of 9 and 17, with the usual suite of observational skills and knowledge that comes with being a regular on this site. I responded with interest as my first child is due in November.

Gunnar and I are sharing our experience as an example of what a successful mentoring process looks like, and because his key points on parenting may be interesting to current and future parents in this community. I had several breakthrough-feeling insights which helped me to connect my LessWrong/rationalist schema to my parenting schema.

Gunnar and I began by exchanging messages about the parameters of what we were getting into. I was interested in his insight based on these messages and other comments and posts he had made on this site about parenting. We arranged a Google Meet video call, which confirmed that our personalities and philosophies were compatible for what we were undertaking.

We did not have a structured reading list, although I investigated resources as Gunnar suggested.  As we went along, Gunnar translated into English samples of notes taken by his children’s mother throughout their childhood and shared them with me. She had also systematically described the daily and weekly tasks a parent could expect in various development phases of the child’s life. I was an only child and have not parented before, so I found this extremely educational.

We had several video calls over the next few months and discussed a wide range of parenting-related topics. Gunnar also suggested this post, to report on our experience. I drafted the post, and Gunnar provided comments, which I merged, and after he reviewed the final version, we published it as a joint post.

By call number two, I was realizing that parenting was never going to be the sort of thing where I could read the “correct” book for the upcoming developmental stage, buy the “correct” tools, and thereby maximize outcomes. Instead, it would be a constant process of modeling the child’s mind, providing new inputs, observing behaviors, updating the model as needed, researching helpful tools, and iterating more or less until the kid is in its 20s. At first, this was intimidating, but I’ve come around to understanding that this just is the parenting process. This synthesis eventually gave me additional motivation and optimism. 

These calls eased my anxiety about parenting and gave me confidence and a sense of human connection, all beyond what I expected. 

First call

Our first call was within a week of Zvi’s post. We described our backgrounds as people who were parented. Gunnar came from a large family; I came from a small one. We discussed how our parents nurtured positive traits in us and also touched on what our parents did that didn’t work. 

For example, my parents would frequently observe when other people were acting in ways consistent with the values they were trying to teach me, in addition to praising or otherwise rewarding me for acting that way myself. 

Gunnar's mother was mostly trusting of her children and "went with the flow," following her intuitions. His father was very fostering and offered a lot of practical education. He consciously created a safe environment. He said he learned this approach from his parents, who came from different backgrounds. Gunnar's grandmother came from a liberal Scandinavian family, and his grandfather came from a disciplined Prussian family. His grandfather embraced his grandmother's liberal norms, which seems to have created a reliable high-trust environment for his father--despite difficult times during and after World War II. 

Gunnar segued into discussing general strategies for supporting children’s development. Highlights:

  • “Salami tactics”: Allow them to learn new behaviors and situations incrementally rather than all at once.
  • Developmental diary: Once a week, or more often, write down notes on what happened with each child during that period, what was effective parenting-wise, and what wasn’t. This was something that Gunnar came back to consistently. However, he is not confident that it is right for everyone, just that it was for him. I plan to do this as well.
    • Consistent reflection
    • Lessons to carry from one child to the next
    • Incorporate photographs
  • The saying goes, "Small kids, small problems; big kids, big problems." But the pattern goes like this:
    • With small kids you have a lot of very small tasks and problems: How to diaper. Why is the baby crying right now? Let's try this 5-minute game. Let's go to this 30-minute baby swimming class. We have to rock the crying baby for an hour until it finally sleeps. Oh, the baby is interested in this thing--oh, it's already gone. Why does X no longer work? Oh, Y works now.
    • As they grow older this switches to: Will they find friends at the new school? Taking the kid to soccer games every weekend--and staying there for cheering, photos, and small talk. Practicing math for hours before the exam. Working again and again on some fight between siblings. Helping to renovate the room.  Talking for hours about some conflict or problem.
  • Be alert to opportunities for teaching based on the child’s interests.
    • Model the behavior of conceiving and running experiments
    • Organize activities around projects (for example, in the garden)

Gunnar recommended several texts during this call and in a follow-up email, including:

We covered many topics in later calls, organized below by subject rather than chronology.

Child Cognition in General

We discussed more cognitive elements of parenting--the extent to which developing brains “need” new inputs and partially “know” what inputs they need but, if overloaded, will retreat to the familiar, and especially to you, and then consolidate.  Gunnar mentioned the Big Five as a good shorthand for observing kids’ personalities.  He shared the first of the translated parenting documents I mentioned above.

This discussion reminded me of Clark, Surfing Uncertainty, which I cannot recommend strongly enough.  After reading that book, I understood intellectually that brains seemed to be prediction/testing machines that thrive on stimulation, but I didn’t see that model as a frame to place over my parenting thoughts until Gunnar spoke about similar concepts in his own perception of parenting. This was a eureka moment for me.

Your kids spend even more cognitive energy on you than you do on them, because their survival depends on it (see also here). 

  • They will notice if you are stressed or worried.
  • They understand words you’re using before they can use those words themselves.

We discussed various ways to teach children before they are in school, and to augment what they learn in school.

  • Use homeschooling materials to assist them with their homework. (In Gunnar’s country, homeschooling is very rare; in mine, the USA, it’s a constitutionally protected right, consistent with Gunnar’s claim that the best homeschooling materials are in English.) I might never have considered these otherwise, because homeschooling in the USA is correlated with weird beliefs, and I was subconsciously assuming that homeschooling materials generated by weird-belief-holders would be somehow infected by the weird beliefs.  (Gunnar adds: They likely are infected by weird beliefs, but you can just keep the good parts.)
  • Avoid rote memorization, except where necessary--multiplication tables, for example.
  • Parents’ and teachers’ incentives are often misaligned (ideal methods for an entire room versus ideal methods for your own child).
  • Encourage kids to make testable predictions and bets.
  • At all verbal ages, you can talk to them in a more complex way than they are able to communicate, yet they will still understand some parts of it and absorb context and parts of meaning.

Conditioning works, but only on things you are consistent about. Corollary: if you’re not willing to be consistent on something, leave it out. (My parents used this on me when I learned how to whine. They agreed not to acknowledge anything I said in a whiny tone, and told me this would be their policy. According to them, it worked quickly.)

When the desired behavior is rare on its own, you can “cheat” by simulating the behavior (for example, in pretend play).

Rather than “No,” use “Yes, but,” “Yes, and,” or “Yes, as soon as.” These are opportunities to show the child that you are also a person with needs, and to emphasize mutual responsibility.

  • Trust your instincts, yourself, your spouse, and offer the child a lot of trust.
  • The parent should behave such that the child unconditionally trusts that its needs will be met reliably.
  • Don’t lie to your kids.
  • Challenge them, but not so far that they feel physically unsafe.

Parenting is intense and challenging, not least when you are sleep deprived because of a baby’s sleep schedule. Observations:

  • Cultivate a support network of friends and other parents.
    • There is probably no substitute for in-person connection and support.
    • Talking helps.
  • Have a safe place to temporarily retreat to.
  • Consistently (perhaps a certain interval each day) set aside time for unstructured entire-family time.
  • The change in the marriage relationship requires focus and time to navigate.
  • Communicate feelings and stress with your spouse and provide physical support as needed.

Avenue for Neuroscience Research

Gunnar has an interesting, and possibly testable, hypothesis: One effect of puberty is to partially reset the values a child assigns to normative judgments, but not to procedural knowledge about reality. (Corollary: Whatever values you’ve taught your child will be more likely to survive if you’ve given them the information necessary to conclude that the value is correct.)  The cascade of puberty hormones could conceivably affect the chemicals in the brain responsible for adjusting weights of priors.  I don’t have enough neuroscience to develop this any further, but it’s “common knowledge” that many teenagers think their parents are idiots.  A biological explanation would explain how widespread the behaviors leading to this folk belief are.


I am grateful to Gunnar for his time, attention, and “gameness.” I am glad that this entire process happened, starting with Zvi’s initial post and ending with this post. I feel far more prepared than I did at the beginning, and I doubt that a person outside this community would have been able to get me there. I plan to implement his weekly developmental diary as a way to track trends, organize my own thoughts about parenting, and force myself to really think about what's going on.  Maybe most importantly, I have a role model for thinking hard about what's going on even with very young children. My only model for that before was cognitive scientists and their informative but ultimately clinical experiments.

I’ll give Gunnar the last word:

I enjoyed the mentoring tremendously. It is very rare to find someone so interested in parenting and taking the preparation so seriously. I felt myself and my advice highly valued. A good feeling that I hope many mentors share. Talking about my parenting experiences and insights also sharpened them and gave me more clarity about some of my thoughts on parenting. I highly appreciate all the note-taking that was done by Supposedlyfun.

One thing that I realized is how crowded the parent education market is and how difficult it is to find unbiased evidence-based material. I have been thinking quite a lot about this and hope to post about it sometime.

We have paused the mentoring for the time being and I am looking forward to how the advice works out in practice. We agreed on a call sometime after the family has adjusted to the new human being.


Experimenting with Android Digital Wellbeing

October 21, 2021 - 08:43
Published on October 21, 2021 5:43 AM GMT

inspired by this post

Introduction: Small Deaths

I'm a morning person.

I usually wake up at about 6 AM. I read on my phone in bed until the toddler wakes up at 6:30, at which point I look after him till I take him to daycare at about 7:15. I then have till 9 AM free, during which time I get a lot of stuff done - both chores and personal projects.

I finish work at 6 PM. Either we have dinner with the kid, or I feed him dinner, and then we have dinner once he's in bed at about 7:30.

So by 8:30 PM I've eaten dinner, jobs are all done, and kid's in bed. I might put on the dishwasher, but other than that my evening's free till I start getting ready for bed at about 10:30 PM.

So what do I do in those 2 hours?

Sometimes I read a book. Sometimes I go for a walk.

But most of the time I stare at my phone. I catch up on lesswrong, gitter, emails, twitter, discord, and then once I've done all that I go back again and refresh them all in case something new has turned up.

Once I'm convinced that nothing exciting is going to pop up on my regular haunts I start to think about all the sites that might have fresh content. Maybe Scott Aaronson has posted something on Shtetl Optimized? And didn't I once read a blog by X which I vaguely enjoyed? He's probably posted something in the last year...

Eventually I probably find a sequence or short story I haven't read yet, or some random Wikipedia article - I wonder where the third lowest lake on earth is? What about the highest roads and villages, or northernmost islands?

I couldn't say I enjoy this experience. I'm vaguely bored and unhappy throughout, and I'd probably be happier just going straight to bed. The moment I stop and do something I immediately feel better and invigorated. But stopping is just so difficult!

I view this time as a small death. Time I'm just trying to kill. It doesn't relax me. I don't enjoy it. It has no benefits - it might as well not exist. I might as well be dead for those 2 hours.

How do I break out of it?

Experiment: Digital Wellbeing

I don't want to not be able to use my phone at all after 7 - I'm not brave enough for that.

Ideally I would be able to just turn off Chrome after 7, since that removes anything open ended - I can only check apps I have installed on my phone.

Google provides Digital Wellbeing controls on modern Android phones. Unfortunately it doesn't have the ability to turn off specific apps at specific times. Neither does Parental Controls.

It does allow you to set a limit on the total amount of time you can spend on an app each day. I decided that might be good enough, so for now I've set a 1 hour limit on Chrome every day.


I hope that this will make me more likely to do any of these things in the evening:

  1. Read
  2. Walk
  3. Work on my OSS projects
  4. Write LessWrong posts
  5. Other productive/social activities

This experiment can fail in a number of ways:

  1. I increase the time limit till it is ineffective.
  2. I find ways to work around the time limit. I've already had to set a half hour timer on the "Google" app (the one that allows you to search for things directly from your home screen) because you can access arbitrary websites from there.
  3. I end up wasting my time in different ways - e.g. watching netflix all night, going on different apps on my phone (youtube, gitter, etc.).
  4. I don't enjoy the alternative activities I do in the evening as much as I thought.

I would say at the moment I spend more than an hour and a half on my phone about 4 evenings a week. I estimate that till now I spend a total of about 24 hours a week on my phone. I hope to reduce both these measures significantly.

I'm going to assess this in 1 month, and report how things are going. My predictions are:

  1. This works as well as I hope (only spend more than 1.5 hours on my phone in an evening when organizing something, less than 12 hours of weekly screen time): 20%
  2. This does something, but not as much as I would like (3 or fewer evenings a week spending more than an hour and a half on my phone, less than 20 hours a week total screen time): 50%
  3. This has no significant effect after a month: 30%
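One way to score these predictions once the month is up is the multi-category Brier score. The sketch below is only an illustration: the outcome labels are hypothetical shorthand for the three items above, and the post itself does not commit to any scoring rule.

```python
# Probabilities taken from the three predictions above.
# The dictionary keys are hypothetical shorthand, not the author's wording.
forecast = {
    "works_as_hoped": 0.20,
    "partial_effect": 0.50,
    "no_effect": 0.30,
}


def brier_score(forecast, actual):
    """Sum of squared errors between the forecast and the one-hot outcome.

    0 is a perfect score; 2 is the worst possible score.
    """
    return sum(
        (p - (1.0 if name == actual else 0.0)) ** 2
        for name, p in forecast.items()
    )


# The outcomes are treated as exhaustive and mutually exclusive,
# so the probabilities should sum to 1.
assert abs(sum(forecast.values()) - 1.0) < 1e-9

for outcome in forecast:
    print(f"{outcome}: {brier_score(forecast, outcome):.2f}")
```

Lower is better, so under this rule the forecast is rewarded most if the middle outcome happens and penalized most if the experiment works exactly as hoped.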

I'm also going to report subjectively how I feel about this process.


Emergent modularity and safety

October 21, 2021 - 04:54
Published on October 21, 2021 1:54 AM GMT

Our default expectation about large neural networks should be that we will understand them in roughly the same ways that we understand biological brains, except where we have specific reasons to think otherwise. How do we understand human brains? One crucial way we currently do so is via their modularity: the fact that different parts of the brain carry out different functions. Neuroscientists have mapped many different skills (such as language use, memory consolidation, and emotional responses) to specific brain regions. Note that this doesn’t always give us much direct insight into how the skills themselves work - but it does make follow-up research into those skills much easier. I’ll argue that, for the purposes of AGI safety, this type of understanding may also directly enable important safety techniques.

What might it look like to identify modules in a machine learning system? Some machine learning systems are composed of multiple networks trained on different objective functions - which I’ll call architectural modularity. But what I’m more interested in is emergent modularity, where a single network develops modularity after training. Emergent modularity requires that the weights of a network give rise to a modular structure, and that those modules correspond to particular functions. We can think about this both in terms of high-level structure (e.g. a large section of a neural network carrying out a broad role, analogous to the visual system in humans) or lower-level structure, involving a smaller module carrying out more specific functions. (Note that this is a weaker definition than the one defended in philosophy by Fodor and others - for instance, the sets of neurons don’t need to contain encapsulated information.)

In theory, the neurons which make up a module might be distributed in a complex way across the entire network with only tenuous links between them. But in practice, we should probably expect that if these modules exist, we will be able to identify them by looking at the structure of connections between artificial neurons, similar to how it’s done for biological neurons. The first criterion is captured in a definition proposed by Filan et al. (2021): a network is modular to the extent that it can be partitioned into sets of neurons where each set is strongly internally connected, but only weakly connected to other sets. They measure this by pruning the networks, then using graph-clustering algorithms, and provide empirical evidence that multi-layer perceptrons are surprisingly modular.
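To make "strongly internally connected, but only weakly connected to other sets" concrete, here is a minimal sketch of one standard graph measure, the normalized cut, applied to a small multi-layer perceptron. It is not Filan et al.'s code: the helper names are hypothetical, using absolute weights as edge strengths is an assumption, and the clustering step that would find the partition is left out (the partition is supplied by hand).

```python
import numpy as np


def mlp_adjacency(weights):
    """Build a symmetric neuron-to-neuron adjacency matrix for an MLP.

    weights[i] has shape (n_in, n_out) and connects layer i to layer i + 1.
    Edge strength is the absolute value of the weight (an assumption).
    """
    sizes = [weights[0].shape[0]] + [w.shape[1] for w in weights]
    offsets = np.concatenate([[0], np.cumsum(sizes)])
    adj = np.zeros((offsets[-1], offsets[-1]))
    for i, w in enumerate(weights):
        rows = slice(offsets[i], offsets[i + 1])
        cols = slice(offsets[i + 1], offsets[i + 2])
        adj[rows, cols] = np.abs(w)
    return adj + adj.T


def normalized_cut(adj, labels):
    """Sum over clusters of (weight leaving the cluster) / (weight touching it)."""
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        inside = labels == c
        volume = adj[inside].sum()
        cut = adj[np.ix_(inside, ~inside)].sum()
        if volume > 0:
            total += cut / volume
    return total


# Toy network: 4 inputs -> 4 hidden -> 2 outputs, built from two separate halves.
w1 = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]], dtype=float)
w2 = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
adj = mlp_adjacency([w1, w2])

modular_labels = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1]  # follows the two halves
mixed_labels = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]    # cuts across both halves

print(normalized_cut(adj, modular_labels))
print(normalized_cut(adj, mixed_labels))
```

A lower value means the partition separates the network more cleanly. The hand-picked partition that follows the two halves cuts no edges at all, while the mixed one cuts many.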

The next question is whether those modules correspond to internal functions. Although it’s an appealing and intuitive hypothesis, the evidence for this is currently mixed. On one hand, Olah et al.’s (2020) investigations find circuits which implement human-comprehensible functions. And insofar as we expect artificial neural networks to be similar to biological neural networks, the evidence from brain lesions in humans and other animals is compelling. On the other hand, they also find evidence for polysemanticity in artificial neural networks: some neurons fire for multiple reasons, rather than having a single well-defined role.

If it does turn out to be the case that structural modules implement functional modules, though, that has important implications for safety research: if we know what types of cognition we’d like our agents to avoid, then we might be able to identify and remove the regions responsible for them. In particular, we could try to find modules responsible for goal-directed agency, or perhaps even ones which are used for deception. This seems like a much more achievable goal for interpretability research than the goal of “reading off” specific thoughts that the network is having. Indeed, as in humans, very crude techniques for monitoring neural activations may be sufficient to identify many modules. But doing so may be just as useful for safety as precise interpretability, or more so, because it allows us to remove underlying traits that we’re worried about merely by setting the weights in the relevant modules to zero - a technique which I’ll call module pruning.
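As a rough illustration of what module pruning could look like for a plain multi-layer perceptron, the sketch below zeroes every parameter attached to a chosen set of hidden neurons. The function name and the choice to zero incoming weights, biases, and outgoing weights together are assumptions; the text only specifies setting the relevant weights to zero, and identifying which neurons form the module is taken as given.

```python
import numpy as np


def prune_module(weights, biases, layer, neurons):
    """Return copies of the parameters with one set of hidden neurons silenced.

    weights[i] has shape (n_in, n_out). `layer` is the index of the weight
    matrix that writes into the hidden layer being pruned, so
    weights[layer + 1] is the matrix that reads from it.
    """
    weights = [w.copy() for w in weights]
    biases = [b.copy() for b in biases]
    weights[layer][:, neurons] = 0.0      # incoming weights of the module
    biases[layer][neurons] = 0.0          # biases of the module
    weights[layer + 1][neurons, :] = 0.0  # outgoing weights of the module
    return weights, biases


# Toy parameters: 3 inputs -> 4 hidden -> 2 outputs.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(4, 2))]
biases = [rng.normal(size=4), rng.normal(size=2)]

# Suppose neurons 1 and 3 of the hidden layer were identified as one module.
pruned_weights, pruned_biases = prune_module(weights, biases, layer=0, neurons=[1, 3])
```

Because the function returns copies, the original parameters stay available, which fits the idea of comparing behaviour before and after pruning or of retraining from the pruned starting point.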

Of course, removing significant chunks of a neural network will affect its performance on the tasks we do want it to achieve. But it’s possible that retraining it from that point will allow it to regain the functionality we’re interested in without fully recreating the modules we’re worried about. This would be particularly valuable in cases where extensive pre-training is doing a lot of work in developing our agents’ capabilities, because that pre-training tends to be hard to control. For instance, it’s difficult to remove all offensive content from a large corpus of internet data, and so language models trained on such a corpus usually learn to reiterate that offensive content. Hypothetically, though, if we were able to observe small clusters of neurons which were most responsible for encoding this content, and zeroed out the corresponding parameters, then we could subsequently continue training on smaller corpora with more trustworthy content. While this particular example is quite speculative, I expect the general principle to be more straightforwardly applicable for agents that are pre-trained in multi-agent environments, in which they may acquire a range of dangerous traits like aggression or deception.

Module pruning also raises a counterintuitive possibility: that it may be beneficial to train agents to misbehave in limited ways, so that they develop specific modules responsible for those types of misbehaviour, which we can then remove. Of course, this suggestion is highly speculative. And, more generally, we should be very uncertain about whether advanced AIs will have modules that correspond directly to the types of skills we care about. But thinking about the safety of big neural networks in terms of emergent modules does seem like an interesting direction - both because the example of humans suggests that it’ll be useful, and also because it will push us towards lower-level and more precise descriptions of the types of cognition which our AIs carry out, and the types which we’d like to prevent.


Work on Robin Hanson compilation

October 21, 2021 - 01:03
Published on October 20, 2021 10:03 PM GMT

I think Robin Hanson's ideas are not read nearly as widely as they should be, in part because it's difficult to navigate his many, many blog posts (I estimate he's written 2000 of them). So I'd like to pay someone to read through all his writings and compile the best ones into a more accessible format. The default option would be an ebook like Rationality: from AI to Zombies, containing several thematically-linked sequences of posts; possible extensions of this include adding summaries or publishing physical copies (although let me know if you have any other suggestions).

I expect this to take 1-2 months of work, and will pay 5-10k USD (depending on how extensive the project ends up being). The Lightcone team has kindly offered to help with the logistics of bookmaking. My gmail address is richardcngo; email me with the subject line "Hanson compilation", plus any relevant information about yourself, if you're interested in doing this.


AGI Safety Fundamentals curriculum and application

October 21, 2021 - 00:44
Published on October 20, 2021 9:44 PM GMT

Over the last year EA Cambridge has been designing and running an online program aimed at effectively introducing the field of AGI safety; the most recent cohort included around 150 participants and 25 facilitators from around the world. Dewi Erwan runs the program; I designed the curriculum, the latest version of which appears in the linked document. We expect the program to be most useful to people with technical backgrounds (e.g. maths, CS, or ML), although the curriculum is intended to be accessible for those who aren't familiar with machine learning, and participants will be put in groups with others from similar backgrounds. If you're interested in joining the next version of the course (taking place January - March 2022) apply here to be a participant or here to be a facilitator. Applications are open to anyone and close 15 December. (We expect to be able to pay facilitators, but are still waiting to confirm the details.)

This post contains an overview of the course and an abbreviated version of the curriculum; the full version (which also contains optional readings, exercises, notes, discussion prompts, and project ideas) can be found here. Comments and feedback are very welcome, either on this post or in the full curriculum document; suggestions of new exercises, prompts or readings would be particularly helpful. I'll continue to make updates until shortly before the next cohort starts.

Course overview

The course consists of 8 weeks of readings, plus a final project. Participants are divided into groups of 4-6 people, matched based on their prior knowledge about ML and safety. Each week (apart from week 0) each group and their discussion facilitator will meet for 1.5 hours to discuss the readings and exercises. Broadly speaking, the first half of the course explores the motivations and arguments underpinning the field of AGI safety, while the second half focuses on proposals for technical solutions. After week 7, participants will have several weeks to work on projects of their choice, to present at the final session.

Each week's curriculum contains:

  • Key ideas for that week
  • Core readings
  • Optional readings
  • Two exercises (participants should pick one to do each week)
  • Further notes on the readings
  • Discussion prompts for the weekly session

Week 0 replaces the small group discussions with a lecture plus live group exercises, since it's aimed at getting people with little ML knowledge up to speed quickly.

The topics for each week are:

  • Week 0 (optional): introduction to machine learning
  • Week 1: Artificial general intelligence
  • Week 2: Goals and misalignment
  • Week 3: Threat models and types of solutions
  • Week 4: Learning from humans
  • Week 5: Decomposing tasks for outer alignment
  • Week 6: Other paradigms for safety work
  • Week 7: AI governance
  • Week 8 (several weeks later): Projects
Abbreviated curriculum (only key ideas and core readings)

Week 0 (optional): introduction to machine learning

This week mainly involves learning about foundational concepts in machine learning, for those who are less familiar with them, or want to revise the basics. If you’re not already familiar with basic concepts in statistics (like regressions), it will take a bit longer than most weeks; and instead of the group discussions from most weeks, there will be a lecture and group exercises. If you’d like to learn ML in more detail, see the further resources section at the end of this curriculum.

Otherwise, start with Ngo (2021), which provides a framework for thinking about machine learning, and in particular the two key components of deep learning: neural networks and optimisation. For more details and intuitions about neural networks, watch 3Blue1Brown (2017a); for more details and intuitions about optimisation, watch 3Blue1Brown (2017b). Lastly, see Simonini (2020) for an introduction to how deep learning can be used to solve reinforcement learning tasks.

Core readings:

  1. If you’re not familiar with the basics of statistics, like linear regression and classification:
    1. Introduction: linear regression (10 mins)
    2. Ordinary least squares regression (10 mins)
  2. A short introduction to machine learning (Ngo, 2021) (20 mins)
  3. But what is a neural network? (3Blue1Brown, 2017a) (20 mins)
  4. Gradient descent, how neural networks learn (3Blue1Brown, 2017b) (20 mins)
  5. An introduction to deep reinforcement learning (Simonini, 2020) (30 mins)
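To make the optimisation idea from this week's readings concrete, here is a minimal sketch (not taken from any of the readings) that fits a straight line by gradient descent on mean squared error. The function name `fit_line`, the learning rate, the step count, and the toy data are all assumptions chosen for illustration.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Ordinary least squares solves this particular problem in closed form; gradient descent is shown here because the same loop structure carries over to neural networks, where no closed form exists.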
Week 1: Artificial general intelligence

The first two readings this week offer several different perspectives on how we should think about artificial general intelligence. This is the key concept underpinning the course, so it’s important to deeply explore what we mean by it, and the limitations of our current understanding.

The third reading is about how we should expect advances in AI to occur. AI pioneer Rich Sutton explains the main lesson he draws from the history of the field: that “general methods that leverage computation are ultimately the most effective”. Compared with earlier approaches, these methods rely much less on human design, and therefore raise the possibility that we build AGIs whose cognition we know very little about.

Focusing on compute also provides a way to forecast when we should expect AGI to occur. The most comprehensive report on the topic (summarised by Karnofsky (2021)) estimates the amount of compute required to train neural networks as large as human brains to do highly impactful tasks, and concludes that this will probably be feasible within the next four decades - although the estimate is highly uncertain.

Core readings:

  1. Four background claims (Soares, 2015) (15 mins)
  2. AGI safety from first principles (Ngo, 2020) (only sections 1, 2 and 2.1) (20 mins)
  3. The Bitter Lesson (Sutton, 2019) (15 mins)
  4. Forecasting transformative AI: the “biological anchors” method in a nutshell (Karnofsky, 2021) (30 mins)
Week 2: Goals and misalignment

This week we’ll focus on how and why AGIs might develop goals that are misaligned with those of humans, in particular when they’ve been trained using machine learning. We cover three core ideas. Firstly, it’s difficult to create reward functions which specify the desired outcomes for complex tasks (known as the problem of outer alignment). Krakovna et al. (2020) helps build intuitions about the difficulty of outer alignment, by showcasing examples of misbehaviour on toy problems.

Secondly, however, it’s important to distinguish between the reward function which is used to train a reinforcement learning agent, versus the goals which that agent learns to pursue. Hubinger et al. (2019a) argue that even an agent trained on the “right” reward function might acquire undesirable goals - the problem of inner alignment.

Thirdly, Bostrom (2014) argues that almost all goals which an AGI might have would incentivise it to misbehave in highly undesirable ways (e.g. pursuing survival and resource acquisition), due to the phenomenon of instrumental convergence.

While we can describe fairly easily what badly misaligned AIs might look like, it’s a little more difficult to pin down what qualifies as an aligned AI. Christiano’s (2018) definition allows us to mostly gloss over the difficult ethical questions.

Core readings:

  1. Specification gaming: the flip side of AI ingenuity (Krakovna et al., 2020) (15 mins)
  2. Introduction to Risks from Learned Optimisation (Hubinger et al., 2019a) (30 mins)
  3. Superintelligence, Chapter 7: The superintelligent will (Bostrom, 2014) (45 mins)
  4. Clarifying “AI alignment” (Christiano, 2018) (10 mins)
Week 3: Threat models and types of solutions

How might misaligned AGIs cause catastrophes, and how might we stop them? Two threat models are outlined in Christiano (2019) - the first focusing on outer misalignment, the second on inner misalignment. Muehlhauser and Salamon (2012) outline a core intuition for why we might be unable to prevent these risks: that progress in AI will at some point speed up dramatically. A third key intuition - that misaligned agents will try to deceive humans - is explored by Hubinger et al. (2019).

How might we prevent these scenarios? Christiano (2020) gives a broad overview of the landscape of different contributions to making AIs aligned, with a particular focus on some of the techniques we’ll be covering in later weeks.

Core readings:

  1. What failure looks like (Christiano, 2019) (20 mins)
  2. Intelligence explosion: evidence and import (Muehlhauser and Salamon, 2012) (only pages 10-15) (15 mins)
  3. AI alignment landscape (Christiano, 2020) (30 mins)
  4. Risks from Learned Optimisation: Deceptive alignment (Hubinger et al., 2019) (45 mins)
Week 4: Learning from humans

This week, we look at four techniques for training AIs on human data (all falling under “learn from teacher” in Christiano’s AI alignment landscape from last week). From a safety perspective, each of them improves on standard reinforcement learning techniques in some ways, but also has weaknesses which prevent it from solving the whole alignment problem. Next week, we’ll look at some ways to make these techniques more powerful and scalable; this week focuses on understanding each of them.

The first technique, behavioural cloning, is essentially an extension of supervised learning to settings where an AI must take actions over time - as discussed by Levine (2021a). The second, reward modelling, allows humans to give feedback on the behaviour of reinforcement learning agents, which is then used to determine the rewards they receive; this is used by Christiano et al. (2017) and Stiennon et al. (2020). The third, inverse reinforcement learning (IRL for short), attempts to identify what goals a human is pursuing based on their behaviour.

A notable variant of IRL is cooperative IRL (CIRL for short), introduced by Hadfield-Menell et al. (2016). CIRL focuses on cases where the human and AI interact in a shared environment, and therefore the best strategy for the human is often to help the AI learn what goal the human is pursuing.
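As a rough illustration of the reward modelling idea described above, the sketch below fits a linear reward function to pairwise preferences using a logistic (Bradley-Terry style) likelihood. This is a simplified stand-in, not the method of any cited paper: the trajectory features, the hidden weights that play the role of the human judge, and all function names are assumptions made for the example.

```python
import math


def reward(w, features):
    """Linear reward model: weighted sum of trajectory features."""
    return sum(wi * fi for wi, fi in zip(w, features))


def train_reward_model(pairs, dim, lr=0.1, steps=500):
    """pairs is a list of (preferred_features, rejected_features)."""
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for good, bad in pairs:
            diff = [g - b for g, b in zip(good, bad)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient of log sigmoid(margin) is (1 - sigmoid(margin)) * diff.
            coeff = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            for i in range(dim):
                grad[i] += coeff * diff[i]
        # Gradient ascent on the average log-likelihood of the preferences.
        w = [wi + lr * gi / len(pairs) for wi, gi in zip(w, grad)]
    return w


# Hypothetical trajectories described by (progress, collisions).
trajectories = [(1.0, 0.0), (2.0, 1.0), (0.5, 0.5), (3.0, 0.2), (1.5, 2.0)]
hidden_w = (1.0, -2.0)  # stands in for the human's judgement (assumption)

# Simulated human feedback: for each pair, record which trajectory is preferred.
pairs = []
for i, a in enumerate(trajectories):
    for b in trajectories[i + 1:]:
        if reward(hidden_w, a) > reward(hidden_w, b):
            pairs.append((a, b))
        else:
            pairs.append((b, a))

learned_w = train_reward_model(pairs, dim=2)
print(learned_w)
```

The learned weights need not match the hidden ones in scale; what matters for training an agent is that the learned reward ranks behaviours the way the human comparisons do.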

Core readings:

  1. Imitation learning lecture: part 1 (Levine, 2021a) (20 mins)
  2. Deep RL from human preferences blog post (Christiano et al., 2017) (15 mins)
  3. Learning to summarise with human feedback blog post (Stiennon et al., 2020) (25 mins)
  4. Inverse reinforcement learning
    1. For those who don’t already understand IRL:
    2. For those who already understand IRL:
Week 5: Decomposing tasks for outer alignment

The most prominent research directions in technical AGI safety involve training AIs to do complex tasks by decomposing those tasks into simpler ones where humans can more easily evaluate AI behaviour. This week we’ll cover three closely-related algorithms (all falling under “build a better teacher” in Christiano’s AI alignment landscape).

Wu et al. (2021) applies reward modelling recursively in order to solve more difficult tasks. Recursive reward modelling can be considered one example of a more general type of technique called iterated amplification (also known as iterated distillation and amplification), which is described in Ought (2019). A more technical description of iterated amplification is given by Christiano et al. (2018), along with some small-scale experiments.

The third technique we’ll discuss this week is Debate, as proposed by Irving and Amodei (2018). Unlike the other two techniques, Debate focuses on evaluating claims made by language models, rather than supervising AI behaviour over time.

Core readings:

  1. Recursively summarising books with human feedback (Wu et al., 2021) (ending after section 4.1.2: Findings) (45 mins)
  2. Factored cognition (Ought, 2019) (introduction and scalability section) (20 mins)
  3. AI safety via debate blog post (Irving and Amodei, 2018) (15 mins)
  4. Supervising strong learners by amplifying weak experts (Christiano et al., 2018) (40 mins)
Week 6: Other paradigms for safety work

A lot of safety work focuses on “shifting the paradigm” of AI research. This week we’ll cover two ways in which safety researchers have attempted to do so. The first is via research on interpretability, which attempts to understand in detail how neural networks work. Olah et al. (2020) showcases some prominent research in the area; and Chris Olah’s perspective is summarised by Hubinger (2019).

The second is the research agenda of the Machine Intelligence Research Institute (MIRI) which aims to create rigorous mathematical frameworks to describe the relationships between AIs and their real-world environments. Soares (2015) gives a high-level explanation of their approach; while Demski and Garrabrant (2018) identify a range of open problems and links between them. 

Core readings:

  1. Zoom In: an introduction to circuits (Olah et al., 2020) (35 mins)
  2. Chris Olah’s views on AGI safety (Hubinger, 2019) (25 mins)
  3. MIRI’s approach (Soares, 2015) (30 mins)
  4. Embedded agents (Demski and Garrabrant, 2018) (25 mins)
Week 7: AI governance

In the last week of curriculum content, we’ll look at the field of AI governance. Start with Dafoe (2020), which gives a thorough overview of AI governance and ways in which it might be important, particularly focusing on the framing of AI governance as field-building. An alternative framing - of AI governance as an attempt to prevent cooperation failures - is explored by Clifton (2019). Although the field of AI governance is still young, Muehlhauser (2020) identifies some useful work so far. Finally, Bostrom (2019) provides a background framing for thinking about technological risks: the process of randomly sampling new technologies, some of which might prove catastrophic.

Core readings:

  1. AI Governance: Opportunity and Theory of Impact (Dafoe, 2020) (25 mins)
  2. Cooperation, conflict and transformative AI: sections 1 & 2 (Clifton, 2019) (25 mins)
  3. Our AI governance grantmaking so far (Muehlhauser, 2020) (15 mins)
  4. The vulnerable world hypothesis (Bostrom, 2019) (ending at the start of the section on ‘Preventive policing’) (60 mins)
Week 8 (four weeks later): Projects

The final part of the AGI safety fundamentals course will be projects where you get to dig into something related to the course. The project is a chance for you to explore your interests, so try to find something you’re excited about! The goal of this project is to help you practice taking an intellectually productive stance towards AGI safety - to go beyond just reading and discussing existing ideas, and take a tangible step towards contributing to the field yourself. This is particularly valuable because it’s such a new field, with lots of room to explore.

Click here for the full version of the curriculum, which contains additional readings, exercises, notes, discussion prompts, and project ideas.


[AN #167]: Concrete ML safety problems and their relevance to x-risk

October 20, 2021 - 20:10
Published on October 20, 2021 5:10 PM GMT

Alignment Newsletter is a weekly publication with recent content relevant to AI alignment around the world. Find all Alignment Newsletter resources here. In particular, you can look through this spreadsheet of all summaries that have ever been in the newsletter.
Audio version here (may not be up yet).
Please note that, while I work at DeepMind, this newsletter represents my personal views and not those of my employer.

HIGHLIGHTS

Unsolved Problems in ML Safety (Dan Hendrycks, Nicholas Carlini, John Schulman, and Jacob Steinhardt) (summarized by Dan Hendrycks): To make the case for safety to the broader machine learning research community, this paper provides a revised and expanded collection of concrete technical safety research problems, namely:

1. Robustness: Create models that are resilient to adversaries, unusual situations, and Black Swan events.

2. Monitoring: Detect malicious use, monitor predictions, and discover unexpected model functionality.

3. Alignment: Build models that represent and safely optimize hard-to-specify human values.

4. External Safety: Use ML to address risks to how ML systems are handled, including cyberwarfare and global turbulence.

Throughout, the paper attempts to clarify the problems’ motivation and provide concrete project ideas.

Dan Hendrycks' opinion: My coauthors and I wrote this paper with the ML research community as our target audience. Here are some thoughts on this topic:

1. The document includes numerous problems that, if left unsolved, would imply that ML systems are unsafe. We need the effort of thousands of researchers to address all of them. This means that the main safety discussions cannot stay within the confines of the relatively small EA community. I think we should aim to have over one third of the ML research community work on safety problems. We need the broader community to treat AI safety at least as seriously as safety for nuclear power plants.

2. To grow the ML safety research community, we need to suggest problems that can progressively build the community and organically grow support for elevating safety standards within the existing research ecosystem. Research agendas that pertain to AGI exclusively will not scale sufficiently, and such research will simply not get enough market share in time. If we do not get the machine learning community on board with proactively mitigating risks that already exist, we will have a harder time getting them to mitigate less familiar and unprecedented risks. Rather than try to win over the community with alignment philosophy arguments, I'll try winning them over with interesting problems and try to make work towards safer systems rewarded with prestige.

3. The benefits of a larger ML safety community are numerous. They can decrease the cost of safety methods and increase the propensity to adopt them. Moreover, to ensure that ML systems have desirable properties, it is necessary to rapidly accumulate incremental improvements, but this requires substantial growth since such gains cannot be produced by just a few card-carrying x-risk researchers with the purest intentions.

4. The community will fail to grow if we ignore near-term concerns or actively exclude or sneer at people who work on problems that are useful for both near- and long-term safety (such as adversaries). The alignment community will need to stop engaging in textbook territorialism and welcome serious hypercompetent researchers who do not post on internet forums or who happen not to subscribe to effective altruism. (We include a community strategy in the Appendix.)

5. We focus not only on reinforcement learning but also on deep learning more broadly. Most of the machine learning research community studies deep learning (e.g., text processing, vision) and does not use, say, Bellman equations or PPO. While existentially catastrophic failures will likely require competent sequential decision-making agents, the relevant problems and solutions can often be better studied outside of gridworlds and MuJoCo. There is much useful safety research to be done that does not need to be cast as a reinforcement learning problem.

6. To prevent alienating readers, we did not use phrases such as "AGI." AGI-exclusive research will not scale; for most academics and many industry researchers, it's a nonstarter. Likewise, to prevent needless dismissiveness, we kept x-risks implicit, only hinted at them, or used the phrase "permanent catastrophe."

I would have personally enjoyed discussing at length how anomaly detection is an indispensable tool for reducing x-risks from Black Balls, engineered microorganisms, and deceptive ML systems.

Here is how the problems relate to x-risk:

Adversarial Robustness: This is needed for proxy gaming. ML systems encoding proxies must become more robust to optimizers, which is to say they must become more adversarially robust. We make this connection explicit at the bottom of page 9.

Black Swans and Tail Risks: It's hard to be safe without high reliability. It's not obvious we'll achieve high reliability even by the time we have systems that are superhuman in important respects. Even though MNIST is solved for typical inputs, we still do not even have an MNIST classifier for atypical inputs that is reliable! Moreover, if optimizing agents become unreliable in the face of novel or extreme events, they could start heavily optimizing the wrong thing. Models accidentally going off the rails poses an x-risk if they are sufficiently powerful (this is related to "competent errors" and "treacherous turns"). If this problem is not solved, optimizers can use these weaknesses; this is a simpler problem on the way to adversarial robustness.

Anomaly and Malicious Use Detection: This is an indispensable tool for detecting proxy gaming, Black Balls, engineered microorganisms that present bio x-risks, malicious users who may misalign a model, deceptive ML systems, and rogue ML systems.

Representative Outputs: Making models honest is a way to avoid many treacherous turns.

Hidden Model Functionality: This also helps avoid treacherous turns. Backdoor detection is a potentially useful related problem, as it is about detecting latent but potentially sharp changes in behavior.

Value Learning: Understanding utilities is difficult even for humans. Powerful optimizers will need to achieve a certain, as-of-yet unclear level of superhuman performance at learning our values.

Translating Values to Action: Successfully prodding models to optimize our values is necessary for safe outcomes.

Proxy Gaming: Obvious.

Value Clarification: This is the philosophy bot section. We will need to decide what values to pursue. If we decide poorly, we may lock in or destroy what is of value. It is also possible that there is an ongoing moral catastrophe, which we would not want to replicate across the cosmos.

Unintended Consequences: This should help models not accidentally work against our values.

ML for Cybersecurity: If you believe that AI governance is valuable and that global turbulence risks can increase risks of terrible outcomes, this section is also relevant. Even if some of the components of ML systems are safe, they can become unsafe when traditional software vulnerabilities enable others to control their behavior. Moreover, traditional software vulnerabilities may lead to the proliferation of powerful advanced models, and this may be worse than proliferating nuclear weapons.

Informed Decision Making: We want to avoid decision making based on unreliable gut reactions during a time of crisis. This reduces risks of poor governance of advanced systems.

Here are some other notes:

1. We use systems theory to motivate inner optimization as we expect this motivation will be more convincing to others.

2. Rather than having a broad call for "interpretability," we focus on specific transparency-related problems that are more tractable and neglected. (See the Appendix for a table assessing importance, tractability, and neglectedness.) For example, we include sections on making models honest and detecting emergent functionality.

3. The "External Safety" section can also be thought of as technical research for reducing "Governance" risks. For readers mostly concerned about AI risks from global turbulence, there still is technical research that can be done.

Here are some observations while writing the document:

1. Some approaches that were previously very popular are currently neglected, such as inverse reinforcement learning. This may be due to currently low tractability.

2. Five years ago, I started explicitly brainstorming the content for this document. I think it took the whole time for this document to take shape. Moreover, if this were written last fall, the document would be far more confused, since it took around a year after GPT-3 to become reoriented; writing these types of documents shortly after a paradigm shift may be too hasty.

3. When collecting feedback, it was not uncommon for "in-the-know" researchers to make opposite suggestions. Some people thought some of the problems in the Alignment section were unimportant, while others thought they were the most critical. We attempted to include most research directions.

[MLSN #1]: ICLR Safety Paper Roundup (Dan Hendrycks) (summarized by Rohin): This is the first issue of the ML Safety Newsletter, which is "a monthly safety newsletter which is designed to cover empirical safety research and be palatable to the broader machine learning research community".

Rohin's opinion: I'm very excited to see this newsletter: this is a category of papers that I want to know about and that are relevant to safety, but I don't have the time to read all of these papers given all the other alignment work I read, especially since I don't personally work in these areas and so often find it hard to summarize them or place them in the appropriate context. Dan on the other hand has written many such papers himself and generally knows the area, and so will likely do a much better job than I would. I recommend you subscribe, especially since I'm not going to send a link to each MLSN in this newsletter.


Selection Theorems: A Program For Understanding Agents (John Wentworth) (summarized by Rohin): This post proposes a research area for understanding agents: selection theorems. A selection theorem is a theorem that tells us something about agents that will be selected for in a broad class of environments. Selection theorems are helpful because (1) they can provide additional assumptions that can help with learning human values, and (2) they can tell us likely properties of the agents we build by accident (think inner alignment concerns).

As an example, coherence arguments demonstrate that when an environment presents an agent with “bets” or “lotteries”, where the agent cares only about the outcomes of the bets, then any “good” agent can be represented as maximizing expected utility. (What does it mean to be “good”? This can vary, but one example would be that the agent is not subject to Dutch books, i.e. situations in which it is guaranteed to lose resources.) This can then be turned into a selection argument by combining it with something that selects for “good” agents. For example, evolution will select for agents that don’t lose resources for no gain, so humans are likely to be represented as maximizing expected utility. Unfortunately, many coherence arguments implicitly assume that the agent has no internal state, which is not true for humans, so this argument does not clearly work. As another example, our ML training procedures will likely also select for agents that don’t waste resources, which could allow us to conclude that the resulting agents can be represented as maximizing expected utility, if the agents don't have internal states.
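The resource-loss idea behind these coherence arguments can be illustrated with a "money pump", a close relative of the Dutch book mentioned above. The sketch below is an assumption-laden toy rather than anything from the post: the item names, the fee, and the sequence of offers are invented. An agent with cyclic preferences pays a fee for every trade it strictly prefers and can be led around the cycle indefinitely, while an agent with transitive preferences stops trading once it holds its favourite item.

```python
def run_money_pump(prefers, start_item, offers, fee, wealth):
    """Offer items in sequence; the agent pays `fee` whenever it prefers the offer."""
    item = start_item
    for offered in offers:
        # (x, y) in prefers means the agent strictly prefers x over y.
        if (offered, item) in prefers:
            item = offered
            wealth -= fee
    return item, wealth


# Cyclic (intransitive) preferences: A over B, B over C, C over A.
cyclic = {("A", "B"), ("B", "C"), ("C", "A")}
# Transitive preferences: A over B, B over C, A over C.
transitive = {("A", "B"), ("B", "C"), ("A", "C")}

offers = ["B", "A", "C"] * 3  # hypothetical sequence of trade offers

cyclic_result = run_money_pump(cyclic, "C", offers, fee=1, wealth=10)
transitive_result = run_money_pump(transitive, "C", offers, fee=1, wealth=10)
print(cyclic_result, transitive_result)
```

The cyclic agent ends up holding the item it started with but with less wealth, which is the sense in which a selection process that penalises wasted resources would select against it.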

Coherence arguments aren’t the only kind of selection theorem. The good(er) regulator theorem (AN #138) provides a set of scenarios under which agents learn an internal “world model”. The Kelly criterion tells us about scenarios in which the best (most selected) agents will make bets as though they are maximizing expected log money. These and other examples are described in this followup post.
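The Kelly criterion mentioned above can be checked numerically. The sketch below assumes the simplest textbook setting, which the post itself does not spell out: a repeated binary bet won with probability `p` at net odds `b`, where the bettor stakes a fixed fraction `f` of current wealth. It compares a grid search over the expected log growth with the standard closed form f* = p - (1 - p) / b. The specific values of `p` and `b` are illustrative.

```python
import math


def expected_log_growth(f, p, b):
    """Expected change in log wealth when staking fraction f of wealth."""
    return p * math.log(1 + f * b) + (1 - p) * math.log(1 - f)


def kelly_fraction(p, b):
    """Closed-form Kelly stake for win probability p and net odds b."""
    return p - (1 - p) / b


p, b = 0.6, 1.0  # illustrative: 60% chance to win an even-money bet
grid = [i / 1000 for i in range(0, 1000)]  # candidate fractions in [0, 0.999]
best_f = max(grid, key=lambda f: expected_log_growth(f, p, b))
print(best_f, kelly_fraction(p, b))
```

Staking more than the Kelly fraction raises the expected payoff of a single bet but lowers the long-run growth rate, which is why agents that maximise expected log wealth are the ones that come to dominate over many repeated bets.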

The rest of this post elaborates on the various parts of a selection theorem and provides advice on how to make original research contributions in the area of selection theorems. Another followup post describes some useful properties for which the author expects there are useful selection theorems to prove.

Rohin's opinion: People sometimes expect me to be against this sort of work, because I wrote Coherence arguments do not imply goal-directed behavior (AN #35). This is not true. My point in that post is that coherence arguments alone are not enough, you need to combine them with some other assumption (for example, that there exists some “resource” over which the agent has no terminal preferences). I do think it is plausible that this research agenda gives us a better picture of agency that tells us something about how AI systems will behave, or something about how to better infer human values. While I am personally more excited about studying particular development paths to AGI rather than more abstract agent models, I do think this research would be more useful than other types of alignment research I have seen proposed.


State of AI Report 2021 (Nathan Benaich and Ian Hogarth) (summarized by Rohin): As with past (AN #15) reports (AN #120), I’m not going to summarize the entire thing; instead you get the high-level themes that the authors identified:

1. AI is stepping up in more concrete ways, including in mission critical infrastructure.

2. AI-first approaches have taken biology by storm (and we aren’t just talking about AlphaFold).

3. Transformers have emerged as a general purpose architecture for machine learning in many domains, not just NLP.

4. Investors have taken notice, with record funding this year into AI startups, and two first ever IPOs for AI-first drug discovery companies, as well as blockbuster IPOs for data infrastructure and cybersecurity companies that help enterprises retool for the AI-first era.

5. The under-resourced AI-alignment efforts from key organisations who are advancing the overall field of AI, as well as concerns about datasets used to train AI models and bias in model evaluation benchmarks, raise important questions about how best to chart the progress of AI systems with rapidly advancing capabilities.

6. AI is now an actual arms race rather than a figurative one, with reports of recent use of autonomous weapons by various militaries.

7. Within the US-China rivalry, China's ascension in research quality and talent training is notable, with Chinese institutions now beating the most prominent Western ones.

8. There is an emergence and nationalisation of large language models.

Rohin's opinion: In last year’s report (AN #120), I said that their 8 predictions seemed to be going out on a limb, and that even 67% accuracy would be pretty impressive. This year, they scored their predictions as 5 “Yes”, 1 “Sort of”, and 2 “No”. That being said, they graded “The first 10 trillion parameter dense model” as “Yes”, I believe on the basis that Microsoft had run a couple of steps of training on a 32 trillion parameter dense model. I definitely interpreted the prediction as saying that a 10 trillion parameter model would be trained to completion, which I do not think happened publicly, so I’m inclined to give it a “No”. Still, this does seem like a decent track record for what seemed to me to be non-trivial predictions. This year's predictions seem similarly "out on a limb" as last year's.

This year’s report included one-slide summaries of many papers I’ve summarized before. I only found one major issue -- the slide on TruthfulQA (AN #165) implies that larger language models are less honest in general, rather than being more likely to imitate human falsehoods. This is actually a pretty good track record, given the number of things they summarized where I would have noticed if there were major issues.
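For reference, the prediction scores discussed above can be turned into accuracy figures and compared with the 67% benchmark. The sketch below is only arithmetic on the counts given in the text; treating a “Sort of” as half credit is an assumption of this example, not something the report or the newsletter specifies.

```python
def accuracy(yes, sort_of, no, partial_credit=0.5):
    """Fraction of predictions counted as correct, with partial credit for 'Sort of'."""
    total = yes + sort_of + no
    return (yes + partial_credit * sort_of) / total


# As graded by the report's authors: 5 "Yes", 1 "Sort of", 2 "No".
authors_strict = accuracy(5, 1, 2, partial_credit=0.0)
authors_half = accuracy(5, 1, 2)

# With the 10 trillion parameter prediction regraded from "Yes" to "No".
regraded_strict = accuracy(4, 1, 3, partial_credit=0.0)
regraded_half = accuracy(4, 1, 3)

print(authors_strict, authors_half, regraded_strict, regraded_half)
```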


CHAI Internships 2022 (summarized by Rohin): CHAI internships are open once again! Typically, an intern will execute on an AI safety research project proposed by their mentor, resulting in a first-author publication at a workshop. The early deadline is November 23rd and the regular deadline is December 13th.

FEEDBACK

I'm always happy to hear feedback; you can send it to me, Rohin Shah, by replying to this email.

PODCAST

An audio podcast version of the Alignment Newsletter is available. This podcast is an audio version of the newsletter, recorded by Robert Miles.

Copyright © 2021 Alignment Newsletter, All rights reserved.



Sentience, Sapience, Consciousness & Self-Awareness: Defining Complex Terms

October 20, 2021 - 16:48
Published on October 20, 2021 1:48 PM GMT

The terms in the title are commonly used in crucial debates surrounding morality & AI. Yet I feel there is no clear consensus about what those terms mean. The words are often used interchangeably, causing people to think they are all the same or very closely related. I believe they're not. Clearly separating these terms makes it a lot easier to conceptualize a larger "spectrum of consciousness".

Disclaimer: I expect some people to be upset for 'taking' terms and changing their definition. Feel free to propose different terms for the concepts below!


"Consciousness" is often taken to mean "what we are". "Our" voice in our heads, the "soul". I propose a more limited definition. A conscious entity is a system with an "internal observer". At this very moment, these words are being read. Hello, 'observer'! You probably have eyes. Focus on something. There is an image in your mind. Take the very core of that: not the intellectual observations connected to it, not the feelings associated with it, just the fact that a mental image exists. I think that is the unique ability of a conscious individual or system. 


Wikipedia claims that consciousness is sentience. Wiktionary has a definition for sentient that includes human-like awareness and intelligence. Once again, I propose a more limited definition. A sentient entity is a system that can experience feelings, like pleasure and pain. Consciousness is a prerequisite: without an internal observer, there is nothing to experience these feelings.

I believe sentience is the bedrock of morality. Standing on a rock probably doesn't generate any observations - certainly not pleasant or unpleasant ones. Standing on a cat seems to produce deeply unpleasant feelings for the cat. Defining morality as a system that tries to generate long-term positive experiences and to reduce negative experiences seems to work pretty well. In that case, standing on cats is not recommended. 

In these terms, consciousness is not the threshold of morality. Perhaps we discover that rocks are conscious. When we stand on them, we slightly compress their structure and rocks somehow hold an internal observer that is aware of that. But it doesn't have any feelings associated with that. It doesn't experience pain or joy or fear or love. It literally doesn't care what you do to it. It would be strange, it would overhaul our understanding of consciousness, it would redefine pet rocks - but it doesn't make it immoral for us to stand on rocks. 


Self-awareness is often seen as a big and crucial thing. Google "When computers become", and "self aware" is the second suggestion. I believe self-awareness is vague and relatively unimportant. Does it mean "knowing you're an entity separate from the rest of the world"? I think self-driving cars can check that box. Do you check that box when you recognize yourself in the mirror? Or do you need deep existential thought and thorough knowledge of your subconsciousness and your relationship to the world and its history? In that case, many humans would fail that test.

I believe self-awareness is a spectrum with many, many degrees. It's significantly correlated with intelligence, but not strongly or necessarily. Squirrels perform highly impressive calculations to navigate their bodies through the air, but these don't seem to be "aware calculations", and squirrels don't seem exceptionally self-aware.


According to Wikipedia:

Wisdom, sapience, or sagacity is the ability to contemplate and act using knowledge, experience, understanding, common sense and insight.

Sapience is closely related to the term "sophia" often defined as "transcendent wisdom", "ultimate reality", or the ultimate truth of things. Sapiential perspective of wisdom is said to lie in the heart of every religion, where it is often acquired through intuitive knowing. This type of wisdom is described as going beyond mere practical wisdom and includes self-knowledge, interconnectedness, conditioned origination of mind-states and other deeper understandings of subjective experience. This type of wisdom can also lead to the ability of an individual to act with appropriate judgement, a broad understanding of situations and greater appreciation/compassion towards other living beings.

I find sapience to be much more interesting than self-awareness. Wikipedia has rather high ambitions for the term, and once again I propose a more limited definition. Biologically, we are all classified as Homo sapiens. So it makes sense to me that "sapience" is the ability to understand and act with roughly human-level intelligence.

Here is a fascinating article about tricking GPT-3. It includes some very simple instructions like "do not use a list format" and "use three periods rather than single periods after each sentence", which are completely ignored in GPT-3's answer. GPT-3 copies the format of the text and elaborates in a similar style, which is exactly the thing the instructions told it not to do.

Human children can easily follow such instructions. AIs can't, nor can animals like cats and dogs. Some animals seem to be able to transmit some forms of information, but this seems to be quite limited to things like "food / danger over there". As far as I know, no animal builds up a complex, abstract model of the world and intimately shares that with others (exactly the thing we're doing right now). 

On the other hand: lots of animals do clearly build up complex models of the world in their minds. The ability to communicate them mainly seems to rely on language comprehension. Language is tremendously powerful, but should it be the threshold of 'sapience'? 

Language acquisition seems to be nearly binary. Children learn to speak rather quickly. Among healthy eight-year-old children, there is no separate category for children that speak "Part A" of English but not "Part B".

It seems like raw human brain power + basic language skills = easy access to near infinite complex abstract models. And indeed, many concepts seem to be easily taught to a speaking child. "Store shoes there", "don't touch fire", "wear a jacket when it's cold outside". 

Simultaneously, relatively simple things can quickly become too complex for regular humans to communicate and handle. Take COVID for example. The virus itself is relatively simple, and exponential growth is not a new concept. Yet, our societies had and have a hard time properly communicating about it. We started mostly with fear and panic, and then transitioned to politicizing things like facemasks and vaccines, turning things into a predictable Greens vs Blues mess.

I can imagine beings with a communication method that comes as naturally to them as talking about the weather comes to us, who can easily talk about subjects like those above, whose civilizations quickly and painlessly coordinate around COVID, and whose Twelfth Virtue is not "nameless" / "the void". To those beings, humans might barely seem sapient, like bees and ants barely seem sapient to us.

So my definition of sapience would be something like the ability to function with broad, diverse and complex models of the world, and to appropriately share these models with others. It's technically a spectrum, but basic language ability seems to be a massive jump upwards here. 

The philosophical zombie is sapient, but not conscious. Self-awareness does seem to be fundamentally connected to sapience. Any functional model of the world requires the ability to "model yourself". Proper sapience also seems to require the ability to be "receptive" towards the concept of self-awareness.

I think this results in the following map of possibilities:

I believe these definitions make discussing related subjects a lot more fruitful. I'd love to hear your opinion about it!


Boring machine learning is where it's at

October 20, 2021 - 14:23
Published on October 20, 2021 11:23 AM GMT

It surprises me that when people think of "software that brings about the singularity" they think of text models, or of RL agents. But they sneer at decision tree boosting and the like as boring algorithms for boring problems.

To me, this seems counter-intuitive, and the fact that most people researching ML are interested in subjects like vision and language is flabbergasting. For one, because getting anywhere productive in these fields is really hard; for another, because their usefulness seems relatively minimal.

I've said it before and I'll say it again: human brains are very good at the stuff they've been doing for a long time. This ranges from things like controlling a human-like body to things like writing prose and poetry. Seneca was as good a philosophy writer as any modern, Shakespeare as good a playwright as any contemporary. That is not to say that new works and diversity in literature aren't useful, both from the perspective of diversity and of updating to language and zeitgeist, but they're not game-changing.

Human brains are shit at certain tasks, things like finding the strongest correlation with some variables in an n-million times n-million valued matrix. Or heck, even finding the most productive categories to quantify a spreadsheet with a few dozen categorical columns and a few thousand rows. That's not to mention things like optimizing 3d structures under complex constraints or figuring out probabilistic periodicity in a multi-dimensional timeseries.
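To make the first of these tasks concrete, here is a minimal sketch of "find the column most strongly correlated with a target", written in plain Python. The data is synthetic and the names (`strongest_correlate`, `noise_i`, `signal`) are hypothetical, chosen only for illustration; a real n-million-column matrix would need a vectorized library rather than this loop.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_correlate(columns, target):
    """Return (name, r) for the column with the largest |r| against target."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores[best]


# Hypothetical data: 50 pure-noise columns plus one column that drives the target.
rng = random.Random(0)
n_rows = 500
columns = {
    f"noise_{i}": [rng.gauss(0, 1) for _ in range(n_rows)] for i in range(50)
}
columns["signal"] = [rng.gauss(0, 1) for _ in range(n_rows)]
target = [3 * s + rng.gauss(0, 1) for s in columns["signal"]]

best_name, best_r = strongest_correlate(columns, target)
print(best_name, round(best_r, 2))
```

A person scanning 51 columns of 500 numbers by eye would struggle to pick out the right one; the loop does it in milliseconds, which is the asymmetry the paragraph above is pointing at.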

The latter sort of problem is where machine learning has found the most practical usage: problems that look "silly" to a researcher but intractable to a human mind. In contrast, 10 years in, computer vision is still struggling to find any meaningfully large market fits outside of self-driving. There are a few interesting applications, but they have limited impact and a low market cap. The most interesting applications, related to bioimaging, happen to be things people are quite bad at; they are very divergent from the objective of creating human-like vision capabilities, since the results you want are anything but human-like.

Even worse, there's the problem that human-like "AI" will be redundant the moment it's implemented. Self-driving cars are a real challenge precisely until the point when they become viable enough that everybody uses them. Afterwards, every car is running on software, and we can replace all the fancy computer-vision-based decision making with simple control structures that rely on very constrained and "sane" behaviour from all other cars. Google Assistant being able to call a restaurant or hospital and make a booking for you, or act as the receptionist taking that call, is relevant right until everyone starts using it. Afterwards, everything will already be digitized and we can switch to better and much simpler booking APIs.

That's not to say all human-like "AI" will be made redundant, but we can say that its applications are mainly well-known and will diminish over time, giving way to simpler automation as people start being replaced with algorithms. I say its applications are "well known" because they boil down to "the stuff that humans can do right now which is boring or hard enough that we'd like to stop doing it". There's a huge market for this, but it's huge in the same way as the whale oil market was in the 18th century. It's a market without that much growth potential.

On the other hand, the applications of "inhuman" algorithms are boundless, or at least only bounded by imagination. I've argued before that science hasn't yet caught up to the last 40 years of machine learning. People prefer designing equations by hand and validating them with arcane (and easy to fake, misinterpret and misuse) statistics, rather than using algorithmically generated solutions and validating them with simple, rock-solid methods such as cross-validation (CV). People like Horvath are hailed as genius-level polymaths in molecular biology for calling 4 scikit-learn functions on a tiny dataset.

Note: Horvath's work is great and I in no way want to pick on him specifically, the world would be much worse without him, I hope epigenetic clocks predict he'll live and work well into old age. I don't think he personally ever claimed the ML side of his work is in any way special or impressive, this is just what I've heard other biologists say.
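For readers who haven't used it, the validation method mentioned above (cross-validation) can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not anyone's actual pipeline: the data is synthetic, the two "models" (a constant baseline and a fitted line) are hypothetical stand-ins for whatever algorithm you'd really use, and the folds are taken by striding through the data rather than shuffling.

```python
import random


def k_fold_mse(xs, ys, fit, k=5):
    """Average held-out mean squared error over k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th point is held out
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        predict = fit(train_x, train_y)  # the model never sees the held-out points
        errors = [(predict(xs[i]) - ys[i]) ** 2 for i in test_idx]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


def fit_mean(train_x, train_y):
    """Baseline: always predict the training mean."""
    mean_y = sum(train_y) / len(train_y)
    return lambda x: mean_y


def fit_line(train_x, train_y):
    """Ordinary least squares for y = a * x + b."""
    n = len(train_x)
    mean_x = sum(train_x) / n
    mean_y = sum(train_y) / n
    sxx = sum((x - mean_x) ** 2 for x in train_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b


# Synthetic data (an assumption for illustration): y = 2x + 1 plus unit noise.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

mse_mean = k_fold_mse(xs, ys, fit_mean)
mse_line = k_fold_mse(xs, ys, fit_line)
print(f"baseline MSE: {mse_mean:.2f}, line MSE: {mse_line:.2f}")
```

The appeal is that the score comes only from points the model was not fitted on, so there is very little room to fool yourself, and the same `k_fold_mse` call works unchanged for any `fit` function you swap in.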

This is not to say that the scientific establishment is doomed or anything; it's just slow at using new technologies, especially those that shift the onus of what a researcher ought to be doing. The same goes for industry; a lot of high-paying, high-status positions involve doing work algorithms are better at, precisely because it's extremely difficult for people, and thus you need the smartest people for it.

However, market forces and common sense are at work, and there's a constant uptick in usage. While I don't believe this can bring about a singularity so to speak, it will accelerate research and will open up new paradigms (mainly around data gathering and storage) and new problems that will allow ML to take centre stage.

So in that sense, it seems obvious to postulate a limited and decreasing market for human-like intelligence and a boundless and increasing market for "inhuman" intelligence.

This is mainly why I like to focus my work on the latter, even if it's often less flashy and more boring. One entirely avoidable issue with this is that the bar of doing better than a person is low, and the state of benchmarking is so poor as to make head-to-head competition between techniques difficult. Though this in itself is the problem I'm aiming to help solve.

That's about it, so I say go grab a spreadsheet and figure out how to get the best result on a boring economics problem with a boring algorithm; don't worry so much about making a painting or movie with GANs, we're already really good at doing that and enjoy doing it.


Moravec's Paradox Comes From The Availability Heuristic

October 20, 2021 - 09:23
Published on October 20, 2021 6:23 AM GMT

Epistemic Status: very quick one-thought post, may very well be arguing against a position nobody actually holds, but I haven’t seen this said explicitly anywhere so I figured I would say it.

Setting Up The Paradox

According to Wikipedia:

Moravec’s paradox is the observation by artificial intelligence and robotics researchers that, contrary to traditional assumptions, reasoning requires very little computation, but sensorimotor and perception skills require enormous computational resources.


I think this is probably close to what Hans Moravec originally meant to say in the 1980s, but not very close to how the term is used today. Here is my best attempt to specify the statement I think people generally point at when they use the term nowadays:

Moravec’s paradox is the observation that in general, tasks that are hard for humans are easy for computers, and tasks that are easy for humans are hard for computers.


If you found yourself nodding along to that second one, that’s some evidence I’ve roughly captured the modern colloquial meaning. Even when it’s not attached to the name “Moravec’s Paradox”, I think this general sentiment is a very widespread meme nowadays. Some example uses of this version of the idea that led me to write up this post are here and here.

To be clear, from here on out I will be talking about the modern, popular-meme version of Moravec's Paradox.

And Dissolving It

I think Moravec’s Paradox is an illusion that comes from the availability heuristic, or something like it. The mechanism is very simple – it’s just not that memorable when we get results that match up with our expectations for what will be easy/hard.

If you try, you can pretty easily come up with exceptions to Moravec’s Paradox. Lots of them. Things like single digit arithmetic, repetitive labor, and drawing simple shapes are easy for both humans and computers. Things like protein folding, the traveling salesman problem, and geopolitical forecasting are difficult for both humans and computers. But these examples aren’t particularly interesting, because they feel obvious. We focus our attention on the non-obvious cases instead – the examples where human/computer strengths are opposite, counter to our expectations.

Now, this isn’t to say that the idea behind Moravec’s Paradox is wrong and bad and we should throw it out. It’s definitely a useful observation, I just think it needs some clearing up and re-phrasing. This is the key difference: it’s not that human and computer difficulty ratings for various tasks are opposites – they’re just not particularly aligned at all.

We expected “hard for humans” and “hard for computers” to be strongly correlated. In reality they aren’t. But when we focus just on the memorable cases, this makes it appear as if all tasks are either easy for humans and hard for computers, or vice versa.
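The selection effect described above is easy to reproduce in a toy simulation. The sketch below assumes, purely for illustration, that human and computer difficulty are independent uniform scores in [0, 1], and that a task counts as "memorable" when the two scores differ by more than 0.5; both the distribution and the threshold are made-up modeling choices, not measurements.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)

# Each task gets an independent (human difficulty, computer difficulty) pair.
tasks = [(rng.random(), rng.random()) for _ in range(10_000)]

# "Memorable" tasks: the surprising ones, where the two difficulties diverge.
memorable = [(h, c) for h, c in tasks if abs(h - c) > 0.5]

corr_all = pearson(*zip(*tasks))
corr_memorable = pearson(*zip(*memorable))

print(f"all tasks:       r = {corr_all:+.2f}  (n = {len(tasks)})")
print(f"memorable tasks: r = {corr_memorable:+.2f}  (n = {len(memorable)})")
```

Over all tasks the correlation sits near zero by construction, while the memorable subset shows a strong negative correlation, even though nothing about the underlying tasks is "opposite". The negative relationship is created entirely by which cases we choose to look at.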

The observations cited to support Moravec’s Paradox do still give us an update. They tell us that something easy for humans won’t necessarily be easy for computers. But we should remember that something easy for humans won’t necessarily be hard for computers either. The lesson to take from Moravec’s observations in the 1980s is that how hard a task is for humans tells us very little about how hard it will be for computers.

That’s definitely valuable to know. But the widespread meme that humans and computers have opposite strengths is mostly just misleading. It updates too hard, due to a focus on the memorable cases. This produces a conclusion that’s the opposite of the original (negative correlation vs positive correlation), but that’s equally incorrect (should be little to no correlation at all).


Book review: Lost Connections by Johann Hari

October 20, 2021 - 05:40
Published on October 20, 2021 2:40 AM GMT

Why this book is interesting

Well, it's about depression, which is generally interesting to LW readers. For instance, 34% of SSC readers said they were diagnosed or thought they had it in 2020 (source).

This book asserts that most of us are thinking about depression in a fundamentally wrong way, which would be very important if true. It also presents some interesting possible solutions for solving depression in one’s own life and solving depression as a social, collective-action problem. 

It’s not really fully fleshed out and supported, nor is it a good self-help guide. The book provides a vague model pointing towards how we should think about depression differently, and even some specific causes to look at, but there are clearly some big gaps in the supporting evidence, and there’s not much concrete advice that a depressed person won’t have considered before (“maybe you would be less depressed if you made friends, or had more money”). Many of the claims made in the book seem to rely on anecdote and individual studies. The author seems to be taking a journalistic approach -- find one or two experts and one or two random individuals whose stories support his narrative, describe a couple of studies -- rather than a more rigorous “but why should we actually believe this over alternative models?”

Another major issue is that it conflates “biopsychosocial model of depression” with a more generalized position of “antidepressants bad, all other treatments good,” which really doesn’t help the author’s case. More on this in “Critiques”.

Nonetheless, I found the book useful because it provides a more social and practical lens on the problem of depression, which we really don’t talk about much because it doesn’t fit into the standard “treatment of illness” narrative. And some of the early results in e.g. social prescribing sound very promising.

Summary of the book

High-level claims:

  • The biological model of depression is mostly wrong; instead, we should think of it as biopsychosocial, meaning that there are some biological and some social and some individual components. We should focus more on the psycho- and social- components, because right now we mostly treat it biologically.
  • Depression is a signal in your brain that tells you something is wrong, like pain or nausea; you should listen to it and try to fix the underlying causes, rather than trying to just treat the symptom. If you suppress the signal, you might be missing something important.
  • People are more depressed now than in the past because of society-wide trends towards people having more depressing lives. This requires societal solutions, not just individual solutions.

The author blames depression on “disconnection from” seven factors:

  1. Meaningful work: Many people today feel that their work is “bullshit” or not important.
  2. Social connection (loneliness): There are well-documented trends towards people having fewer friends, saying that they feel lonely more often in surveys, participating in fewer social events and clubs, etc., over time.
  3. Meaningful values: Consumerism makes people focus on things that aren’t really meaningful to them, which makes them less happy.
  4. Childhood trauma: People don’t talk about their trauma and it makes them sad. (I find this questionable as part of his overall argument, see “Critiques” for more detail.)
  5. Status and respect: People don’t have control or status in the workplace, which makes them unhappy.
  6. Natural world: People don’t see nature much and this makes them sad.
  7. Hopeful/secure future: People have precarious lives, and can’t trust in the future of their jobs and lives. (I’m skeptical of this claim without evidence about how optimism about the future has changed over time vs. depression. The main supporting evidence here is some stuff about First Nations people becoming more depressed and suicidal as they lost control of their lands... which could be explained in several different ways, not just a lack of a secure future... and a study of depressed teenagers who were less able to predict the future of a character in a book than their non-depressed counterparts.)

And the author also devotes a chapter to the contributions of biology to depression:

  • He thinks genes mainly affect things via genetic susceptibility to depression, not a genetic cause. In other words, if you are genetically susceptible but have a fun and non-depressing life, you won’t be depressed. If you aren’t susceptible, you won’t be depressed even if your life is depressing.
  • Brain changes: you can get stuck in a depression feedback loop that’s counterproductive. Even if your depression was caused by something “real” in your life, your brain can undergo a feedback loop that gets out of control and out of proportion to the original cause.

He describes seven "reconnections" that he thinks could help people find their way to better, less depressing lives:

  1. Other people
    1. The Amish are happy and they live around other people all the time
    2. Also if you’re depressed, try doing something nice for other people instead of yourself
  2. Social prescribing
    1. Make people do group work for therapy. For example, one therapist prescribed a group of depressed city dwellers to make a community garden together; it helped them a lot because of the friendships they made.
  3. Meaningful work
    1. Worker co-ops give you more control and meaning in your work.
  4. Meaningful values
    1. Ban advertising?
    2. Teach people to reconnect with their values using workshops where they talk about money, what they spend money on vs what they find important, etc.
  5. Sympathetic Joy, and Overcoming Addiction to the Self
    1. Loving-kindness meditation
    2. Maybe do CBT or something? But loving-kindness meditation is better because it connects us to others.
    3. Guided psychedelics therapy (psychedelics can cause brain changes like meditation)
      1. Caveat 1: Bad trips suck though.
      2. Caveat 2: Also for some people the effects don’t last long.
      3. (No mention of the possibility that the effects could be bad and long-lasting.)
  6. Acknowledging and Overcoming Childhood Trauma
    1. Maybe your doctor should ask you about your trauma?
    2. Talking about childhood trauma is good.
  7. Restoring the Future
    1. UBI is good, because it gives people a sense of stability which lets them do more long-term planning (going back to school, etc.).

Critiques

Broken signals

Hari says depression is a “signal” that we need to improve things in our life-- just like pain is a signal that something is wrong, or nausea, or whatever. I wonder how often he takes Tylenol for a headache.

These biological signals can go wrong. And it’s often necessary to give people medication to remove their pain or nausea, because A) everyone involved is already aware there is a problem and working on it, and B) pain and nausea make it harder to cope with your problems mentally. Depression has the same problem. A little can motivate you to fix what’s going wrong in your life, but too much can be disabling. That’s a great motivating case for antidepressants, even in cases where the “real problem” is something in your life-- giving you breathing room to actually solve those problems, instead of not being able to get out of bed every morning.

Hari gives some lip service to the idea of “feedback loops” in your brain that make things worse, but that doesn’t address the idea that the signal can simply be unhelpful even without that. As far as I know, no one talks about “pain feedback loops” for an ordinary headache, or “nausea feedback loops” in pregnancy; it’s just known that sometimes you have had enough pain or nausea, actually, and your body doesn’t always get that right.

Childhood trauma

Part of the whole thesis of this book is that people are more depressed now than in the past, and we need to look at why. But childhood trauma doesn’t fit this story at all.

In general, people had way more childhood trauma in the past. For example, attitudes on corporal punishment have only turned around in the past couple decades; people have only started talking about the later impacts of childhood sexual abuse in about the same interval; and if you go back further, violence was simply a more normal part of life for people, including children, in the past. And historically, people didn’t cope with childhood trauma openly the way we do now-- treating it as a disorder, bringing it up in therapy, or even having therapy at all. It was just a thing that happened to you. How can this explain an increase in depression, when people treat their children dramatically more nicely now than they used to and talk about their trauma more?

I’d agree that talking about childhood trauma in therapy can be helpful-- I’ve found it helpful myself!-- but I think you’d have to do a lot more work to prove that this is a societal problem, or that it’s worse now compared to before.

Down on pills for no good reason

I originally heard about this book on Econtalk (episode link). Hari comes across very reasonably on the episode. He says people tar him as wanting to “break into their house and take away their Prozac,” and claims that this is not at all his position; he says antidepressants can be useful for many people, it’s just that we should encourage people to investigate other solutions. And not just therapy, either-- structural and social solutions to people’s problems. I found this position intriguing and very reasonable; surely more possible solutions is better than fewer.

Unfortunately, the book is not nearly as reasonable as he sounded in the podcast; in the book, he spends a lot of time basically talking about how antidepressants don’t work in the vast majority of cases, and have terrible side effects, and never worked for him at all even when he thought they did and told everyone they did, and the cases where they’re useful are (his words) “very rare”; can’t imagine why that would make people think you want to take away their antidepressants!

And there’s something else that makes me skeptical about his position on medication. He spends a chapter or so talking about CBT, meditation, and psychedelics. But here’s the thing: If antidepressants don’t work because depression is an honest signal of things that are going wrong in people’s lives, then meditation, CBT, and psychedelics shouldn’t work either. And for that matter, psychedelics are biological, too.

Meditating doesn’t fix your problems. CBT doesn’t fix your problems. David Burns, author of one of the most famous DIY CBT books, Feeling Good, often says on his podcast that people who are depressed don’t need to change anything in their life to be happy; all they need to do is change their thoughts.[1] That’s the exact opposite of Hari’s primary thesis, which is that depression is a signal that you need to change something concrete in your life, and that it would be bad to just “turn it off” because you’d be missing signals about something important. And yet, he argues for meditation and CBT as solutions to depression! I think this betrays an unfounded bias against medication on his part. It basically comes off as a naturalistic fallacy or, in the case of psychedelics, a “hippie” bias.

A final note: He spends a while in the book going through his own experience with antidepressants, which was not good, and uses that as support for his claim that they mostly don’t work. But frankly, the treatment algorithm his doctors used was wildly bad. They kept giving him higher and higher doses of the same antidepressant, despite the fact that it was causing side effects that harmed him more and more, without trying a single alternative. (Nothing at all like the treatment algorithm described here.) I’m not saying antidepressants would necessarily have worked for him if they had done a better job, but it was pretty predictable that they weren’t going to work given what his doctors were actually doing.

What I found most valuable

Even non-"situational" depression usually involves having a depressing life

There are some very interesting studies cited in the book showing that there wasn’t much difference between “situational” (aka “reactive”) depression and the other kind. They asked a bunch of people who were diagnosed with “situational” vs normal depression a lot of questions about things in their life -- their financial situation, romantic situation, other relationships and support system -- and found that even the people with “non-situational” depression had many of the same life problems. This is definitely suggestive. I don’t think it quite supports the author’s belief that antidepressants don’t help-- that’s a different proposition-- but it does strongly suggest that biology isn’t the sole underlying cause of depression.

Framing depression as a social problem

I appreciated the point that, if people’s lives are depressing, we should probably work on that problem, and not just on making them less depressed via other methods like pills and therapy. And in particular, that this might require help, rather than being something someone can always do by themselves.

The book makes the very good point that the biological-only method of treating depression is often oriented towards “just get this person to survive in their current life without making changes,” which, in many cases, benefits other people more than the patient. They take drugs to help them get through their shitty job and pretend their shitty personal life isn’t so bad. It’s rather nightmarish. In theory, the best use of drugs is to get people enough breathing room that they can make their lives better, but in practice it doesn’t always work that way.

Another study he mentioned was one where they took people from the United States and from Russia, Japan, and Taiwan, and tracked A) whether they were trying to become happier deliberately, and B) whether they did actually become happier. Of those who actively tried to become happier, those from the US did not, but the others did become happier. The study authors attribute this to the fact that most of the Asian people chose to focus on improving their family or group, rather than focusing on their happiness in an individualistic way (e.g. by treating themselves). I found this interesting because I hadn’t considered that taking that kind of totally different frame on the problem could help. (Though, I believe it only weakly; it’s one study, and I’m not all that convinced they’ve ruled out alternative explanations like “people in different countries interpret the survey questions differently.”)

… a social problem that is getting worse

I think it’s also useful to look at societal trends in depression over time and try to track why they’re happening. It does seem true that depression has been increasing recently, and it’s also quite clearly true that people are becoming lonelier and less connected to other people in their lives. The most striking statistic in the book to me was the modal number of friends that Americans now report having: zero. And to be clear, this didn’t use to be the case.

I suspect it’s also true that more people now feel they are working bullshit jobs, and that this probably makes people more depressed (from my own experience, if nothing else).

Social prescribing

Social prescribing is particularly interesting now because, at least in theory, it’s something that one person or a small group of people can implement without the need for huge society-wide coordination (unlike, say, UBI), and the initial results are promising.

The basic idea is that a psychiatrist or mental health professional tells a group of patients to work together in some social way, which will hopefully result in social bonds forming that help them become happier and/or help them in other areas of their life. I almost wonder whether there’s some catch or problem that explains why I haven’t heard of this before, since it seems like such an obviously good idea. If it consistently works, why aren’t more people doing it?

Economic reforms

When it comes to economic reforms to help people be less depressed, I mostly agree with Hari’s proposals, but he doesn’t have that much evidence that they would actually work.

In the case of UBI, his argument is mostly “Look at all these studies showing that UBI improved people’s lives,” which is fair. I’m not entirely convinced of the impact on depression specifically (as opposed to generic life-improvement), or that this impact is mediated the way he says it should be (via having a sense of a more secure future). But the theory that “if people are less likely to become homeless or starve, they will be less depressed” seems basically sound.

Then there are worker co-ops. He admits there is no research on whether co-ops improve the mental health of workers. I agree that such research would be very helpful! I don’t know how realistic it is to remold a significant fraction of our economy on a co-op model, but maybe if the evidence piled up that employee ownership was important to wellbeing, more people would start buying stock in the companies they work for (or more companies would start offering ownership shares to employees).

Overall takeaways

This book is about depression, which might make you think that the target audience is depressed people or perhaps psychiatrists. But it’s really neither. If depression is a social problem that needs to be solved by society as a whole, then you can’t just talk to those people; you need to talk to everyone. So it’s really aimed at more of a general audience, and many of the solutions proposed can’t be implemented by just one depressed person trying to have a better life. It makes sense in that context to think of it as more of a pop-science book with a political agenda.

I spent more of my time on this book hatereading than regular reading. I still think it was worth going through, for the valuable frame and points that it makes, but I wouldn’t necessarily recommend going through it in its entirety, unless you find random anecdotes about depressed people and whatnot interesting. There’s a lot of filler.

I didn’t have time to do epistemic spot checks of the claims made in the book; I’ve done my best here to represent my guess at how reliable they are. If anyone has stronger data on claims I’ve made here or that the book made, I’d be happy to hear about it.


[1] Personally, I think this is partly true; thoughts are the ultimate mediator of feelings, but in some circumstances it’s just not reasonable to have certain thoughts! Also, there’s “distorted” vs “nondistorted” sadness/pain in his ontology, which still leaves a lot of room for human suffering.


[Update] Without a phone for 10 days

October 20, 2021 - 04:17
Published on October 20, 2021 1:17 AM GMT

I wrote a post about going without a phone for 10 days. Ten days have now passed, and I'm evaluating my options. This post is about my experience being phoneless and my thoughts about having a phone moving forward.

The last ten days have been extraordinarily peaceful! After a break-in phase of frequently checking my pant pocket for a phantom phone, I began to feel more at ease. After about three days, I felt a calmness that I hadn't enjoyed since middle school. After a week, I became more aware of the passage of time -- my days felt closer to a single drawn-out experience, as opposed to a cluttered collection of moments. During errands, I was forced to spend time waiting for as long as 30 minutes. Being without a phone, I spent these periods thinking to myself. There was immense value in maintaining my attention during these moments; I would compare them to a weak form of mindfulness meditation, something once part of my daily routine.

During these past ten days I've also seen greater productivity, which I attribute to an overall decrease in desire for stimulation. I finally got a simple academic personal website up and running. On the whole, I feel more capable of directing my attention.

Granted, there were some inconveniences to not having a phone. Most inconvenient was being unable to authenticate my university login. I still cannot authenticate, and in order to generate backup codes I need to get in touch with my school's IT department. I was also inaccessible to close friends whom I wanted to speak to, could not order food for delivery (though now I realize I can order it from my computer), and could not easily use a car service. I was, however, able to chat with friends using iMessage on my laptop.

Having spent the past ten days without a phone -- and being disappointed by the recent launch of the Google Pixel 6 (a contributing factor to this phoneless experiment) -- I would like to continue life without a phone. I'm willing to give up Uber, Uber Eats, Google Maps, Google Pay, Slack, Gmail, and Snapchat. Happy to, actually. However, I need to be able to make calls and send texts. Also, having used iMessage for the first time, I was very impressed by how conversational it feels and how convenient it is to chat from my laptop. If I want to get texts on a feature phone, I will have to choose SMS over iMessage. I briefly considered getting an iPhone 12 mini for this reason (I also absolutely love the form factor and design), but I'm worried about getting robocalls. This was not as much of a problem on my Pixel 4, which offered excellent spam-call blocking and on-hold waiting (this is perhaps the most practical feature of Google's Pixel Launcher; I almost never needed to listen to god-awful muzak while waiting for a representative).

I'm torn, guys. Should I get a feature phone? An iPhone 12 mini? Should I flash the Pixel Launcher onto a different Android phone (or embark on a painstaking search for the beautiful-but-discontinued Pixel 5) to spare myself from robocalls? What are your thoughts?


Book Review: 'Predicting the Next President: The Keys to the White House'

October 20, 2021 - 03:42
Published on October 20, 2021 12:42 AM GMT

Part one: what is this book and should you read it?

Predicting the Next President: The Keys to the White House (henceforth rendered “The Keys”) is an ambitious book. Penned by historian Allan Lichtman, The Keys explains his system for forecasting the outcome of US presidential elections. This system is also called “The Keys”. So, to help readers figure out whether I’m talking about the book or the system, I’m only going to italicize The Keys when I’m explicitly referring to the book. The Keys is a 13-strong checklist of “true-or-false” statements that measure how well the incumbent party has performed during its term in office. If six or more statements are judged to be “false” then the election goes to the challenger. Five or fewer, and the incumbent takes it.

According to Lichtman, The Keys reflect a number of axiomatic truths about the democratic system in the US. And although he’s keen to caution against superposing these truths non-reflectively on democratic systems in other countries, he also believes The Keys speak to political processes in liberal democracies more generally. 

In his introductory chapter, Lichtman describes a “pragmatic” American electorate that chooses a president “according to the performance of the party holding the White House” and theorises: “If candidates and the media could come to understand that governing, not campaigning, counts in presidential elections, we could have a new kind of presidential politics.” (Lichtman, 2016, pp. 8). This new kind of politics would dump the attack ads and go deep on policy promises, outlining substantial agendas of what candidates actually planned to do with their desired four years in office.

I should be clear from the outset that I think Lichtman’s vision is an appealing one. I also think The Keys (the book) is well written, and The Keys (the system) is well conceived. Assuming you’re in the target audience, you will probably enjoy reading this book, and might come away from it with something useful. What is the target audience? Probably people who enjoy reading long copy about predictive methodology and political history (given where you’re reading this, I don’t think it’s a stretch to assume that includes you, dear reader).

With all that out of the way, let me tell you what’s coming next. In part two, I’ll give a brief overview of what’s in this book and how I’m going to address it. In part three, I’m going to break down each of the 13 keys and explain how you can use them to create your own electoral forecast. Then, in part four, I’ll look at some of the criticisms levelled against The Keys and give my own brief critique. Finally, in part five, I’ll offer my opinion on both the book and the system.

If you don’t want to wade through all of that, here’s the “TLDR” version: I like The Keys as a system, but I think it’s ultimately a flawed predictive methodology that doesn’t live up to its own hype. And I like The Keys as a book, but I take issue with some of the claims Lichtman makes and (what I think is) a sometimes deceptively rhetorical style of writing. However, these issues don’t quite manage to make the book unenjoyable or the system it describes unworkable. So, if it sounds like your cup of tea, I suggest you take the plunge and start reading.

Part two: mapping the structure of this review to the structure of The Keys

One of the best things about The Keys (the system) is how straightforward it is: it’s just 13 true or false statements. Five or fewer false answers, and the incumbent keeps the White House; six or more, and they lose it. The Keys (the book) is also straightforward, giving you a full rundown of all 13 keys in chapter two, which I’ll summarise in the next section. But before I get to that, let me briefly sketch out the rest of the book’s structure, and explain how much attention I give to each bit.

Chapter one deals with most of the theory (we’ll discuss this theory in parts three, four, and five of this review). 

Chapters three to 11 are Professor Lichtman’s reading of American political history in relation to The Keys, which I’ll only mention in passing. I understand that these chapters are going to be a draw for some readers, but I found them a slog. To me – who came here for the theory and the method – these sections read as filler. Let me be clear, this is not fair to Professor Lichtman, who writes well and knows his subject, hence the “to me” qualifier in the previous sentence. But, from an operational perspective, there is nothing you need to read in them – all they offer is historic example after historic example of The Keys aligning with presidential election outcomes (and, of course, some insight into Lichtman’s own view of American political history). 

Chapter 12 is Professor Lichtman’s prediction for the 2016 US presidential election, which I’ll talk about in part four of this review (remember, this is the 2016 edition, so it was published before Mr. Trump took the White House). 

Finally, chapter 13 outlines what Professor Lichtman thinks the “lessons” of the 13 Keys are for political candidates. Despite its relative brevity, this is one of the most important chapters in the book, and it will feature pretty heavily in part five of this review.

Part three: the 13 Keys to the White House

Without further ado, here are the 13 Keys. This is a condensed and only-slightly paraphrased list (I will never in good conscience use the word “listicle”) taken from chapter two of The Keys. Each key is titled as-per Professor Lichtman’s original, and contains a statement which you have to decide is true or false. I’ve addressed scoring The Keys above, but let me reiterate it here: once you’ve assessed the veracity of each statement, all you need to do is count up how many “false” results you’ve got. Six or more, and the challenger wins the White House. Five or fewer, and the incumbent does. 

  1. The Incumbent-Party Mandate: The incumbent party holds more seats in the House of Representatives after the mid-term elections than it did after the previous mid-term elections.
  2. Nomination Contest: The sitting president doesn’t face any serious competition for his party’s nomination.
  3. Incumbency: The sitting president is the incumbent party’s candidate.
  4. Third Party: The race is between “the big two” with no serious third-party campaign.
  5. Short-term Economy: There’s no recession during the election campaign.
  6. Long-term Economy: Real annual per capita GDP growth during the current term equals or exceeds mean growth in the previous two terms (for the economically challenged, this basically means the US economy has been growing as fast, or faster, during the term in which the election takes place as the two terms that directly precede it).
  7. Policy Change: The incumbent makes major changes in national policy (Lichtman defines what a “major” change would look like: “redirecting the course of government or […] innovating new policies and programs that have broad effects on the nation’s commerce, welfare, or outlook.”) (Lichtman, 2016, pp. 53).
  8. Social Unrest: There is no “sustained social unrest” during the incumbent’s term in office (Lichtman, 2016, pp. 55).
  9. Scandal: The incumbent administration hasn’t had to contend with a major scandal (what counts as “major”? “The wrongdoing has to bring discredit upon the president himself, calling into question his personal integrity, or at least his faithfulness in upholding the law.”) (Lichtman, 2016, pp. 57).
  10. Foreign or Military Failures: The incumbent administration hasn’t suffered a major foreign policy or military failure (there’s that word “major” again. This time it means: “a major disaster that appears to undermine America’s national interests or seriously diminish its standing in the world.”) (Lichtman, 2016, pp. 60).
  11. Foreign or Military Successes: The incumbent administration has achieved a major foreign policy or military success (major: “perceived as dramatically improving the nation’s interests or prestige.”) (Lichtman, 2016, pp. 62 – 63).
  12. Incumbent Charisma: The incumbent-party candidate is either immensely charismatic or a national hero (“an extraordinarily persuasive or dynamic candidate, or one who has attained heroic status through achievements prior to his nomination as a presidential candidate.”) (Lichtman, 2016, pp. 65).
  13. Challenger Charisma: The challenging-party candidate is neither immensely charismatic nor a national hero (see Key 12’s description in parentheses, but remember that Key 13 is claiming the challenger is not charismatic or a hero, whereas Key 12 is claiming the incumbent is one or both of these things).

It’s worth going back and re-reading these carefully, because some of the phrasing is a little awkward. This is an artifact of Lichtman’s decision to render The Keys as a series of true or false statements that are sometimes opposed to one another, and you can’t really write around it. 

In chapter two of The Keys, Lichtman gives us pages of explanation for each key. It’s worth reading, and it’s interesting reading… but it’s not essential reading. You can use his predictive methodology with nothing more than the list above, provided you are able to figure out whether each statement given is true or false.
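The scoring rule described above is simple enough to sketch in a few lines of Python. This is only an illustration of the counting step, not anything from the book: the key names follow the list above, while the function name `forecast`, the constant names, and the example scorecard are hypothetical. Deciding whether each statement is true or false remains the forecaster's own (often subjective) judgement.

```python
from typing import Mapping

# Key names follow the list above. The True/False answers passed in are the
# forecaster's own judgement calls, not data taken from the book.
KEYS = (
    "Incumbent-Party Mandate",
    "Nomination Contest",
    "Incumbency",
    "Third Party",
    "Short-term Economy",
    "Long-term Economy",
    "Policy Change",
    "Social Unrest",
    "Scandal",
    "Foreign or Military Failures",
    "Foreign or Military Successes",
    "Incumbent Charisma",
    "Challenger Charisma",
)

FALSE_KEYS_TO_LOSE = 6  # six or more "false" keys: the challenger wins


def forecast(answers: Mapping[str, bool]) -> str:
    """Return "incumbent" or "challenger" from 13 true/false answers."""
    missing = [k for k in KEYS if k not in answers]
    if missing:
        raise ValueError(f"unanswered keys: {missing}")
    false_count = sum(1 for k in KEYS if not answers[k])
    return "challenger" if false_count >= FALSE_KEYS_TO_LOSE else "incumbent"


# Hypothetical scorecard: every key true except the five named here.
example = {k: True for k in KEYS}
for k in ("Incumbency", "Scandal", "Policy Change",
          "Incumbent Charisma", "Third Party"):
    example[k] = False

print(forecast(example))  # five false keys, so this prints "incumbent"
```

Flipping any one further key to false in the example scorecard tips the forecast to the challenger, which shows how much can ride on a single judgement call.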

Part four: critiquing The Keys

On August 31, 2011, renowned political forecaster and number cruncher Nate Silver penned a post on FiveThirtyEight with the headline: Despite Keys, Obama Is No Lock. Although titularly concerned with then-incumbent president Barack Obama’s chances in the 2012 presidential election, the post is more noteworthy today for its blistering critique of The Keys: “Superficially quite impressive. But ‘superficial’ is the crucial term here.” (Silver, 2011, online). 

What is Silver talking about specifically? Professor Lichtman’s claim that The Keys “called the winner” of every election since 1860 (Silver again: “That’s 38 elections in a row!”). Nate runs through “several problems with this model” noting first “the nature of the keys […] several of which are quite subjective”. Silver admits, for example, that “candidate quality” is a thing, but notes that it can be a pretty amorphous one. Why, he asks, was Mr Obama scored as charismatic in 2008 but not in 2012? Shouldn’t John McCain have qualified as a war hero in 2008? Why doesn’t Afghanistan count as a foreign policy failure, given that it’s unpopular with Americans? 

Professor Lichtman responded to Silver a week later, claiming “his critique of the keys system cannot withstand scrutiny” (Lichtman, 2011, online). According to Lichtman, his “published definition” of the challenger charisma/hero key “rules out counting McCain as a national hero since he did not lead the nation through war, however admirable.” He goes on to claim “in [the] first edition of the Keys to the White House published many years before the 2008 election, I wrote that candidates attain heroic stature only ‘through vital leadership in war like Ulysses Grant and Dwight Eisenhower.’” (Lichtman, 2011, online).

This is probably true – I don’t have the first edition to hand. However, I do have the 2016 edition, and I’m going to quote the passages from both key 12 and key 13 that apply here. To be clear, these are the only passages in the description of these keys that deal with the subject (heroism) under discussion. I apologise for the length of these quotes – I’m including them because I think it’s really important to illustrate a wider criticism I want to make: 

“There have been only two clearly heroic nominees, both of whom attained this stature through vital leadership in war: Ulysses Grant and Dwight Eisenhower. Many other candidates, including William McKinley, George McGovern, and George H. W. Bush have had impressive military records but have fallen far short of the heroic status attained by Grant and Eisenhower. Of all presidential candidates since 1860, only Theodore Roosevelt, the leader of the “Rough Riders” during the Spanish-American War, combined personal charisma with near-heroic accomplishment.” (Lichtman, 2016, pp. 65).

“Heroic stature is easier to identify. To meet the threshold, a candidate’s achievement must be deemed critical to the nation’s success in an important endeavour, and probably should be of relatively recent vintage. New Jersey senator Bill Bradley’s career as a highly acclaimed professional basketball player, while appealing to a large segment of the population, does not make him a national hero. Ohio senator John Glenn’s achievement as the first American astronaut to orbit the earth might have conferred that status early on, but probably would no longer have secured the key had he been nominated in 1984.” (Lichtman, 2016, pp. 67).

Now, dear reader, I am going to make a prediction of my own that relies on nothing but my faith in your reading comprehension skills: you’re wondering “where, in the above quotes, does Professor Lichtman explicitly state that vital leadership in war is required to turn the ‘hero’ key?” Well, let me answer you: nowhere. He explicitly states it nowhere. 

Now, you might also be thinking it’s unfair of me to quote the 2016 version of The Keys, given that this criticism and response dates back to 2011. But that leads us on to the following question: why is Professor Lichtman quoting himself from 1996, when the most recent edition of his work (at the time) was the 2008 edition? Thanks to the magic of Google Books, I can take a pretty good guess: because the text in the 2008 version is exactly the same as the text in the 2016 version. 

In other words, Silver’s criticism is correct – it is not clear whether McCain (who won the Silver Star and Purple Heart, among other wartime baubles) should have turned the ‘hero’ key or not. And Lichtman’s response (in my opinion) is disingenuous. He quoted an out-of-date version of his own work to create the impression that turning key 12 was less subjective than it actually was at the time of writing (2011). Or, if the text was the same back in 1996, he is simply being obfuscatory and quoting himself out of context to make it appear as though leading the nation through war was always a clearly defined prerequisite for turning key 12, when in fact it was not.

It’s hard to be charitable here. If you re-read the second lengthy quote from Lichtman, you’ll see that he is explicit about there being loads of ways to turn the hero key other than providing “vital leadership in war”. And let’s be clear, it’s already a stretch to equate the provision of “vital leadership in war” with “lead[ing] the nation through war”. The latter seems like a specific (and much more selective) example of the former. Unfortunately, as we’ll see in a moment, this kind of equivocation is not a one-off.

Turning back to Silver’s critique, there is some slightly misguided discussion about The Keys’ inability to determine the margin of victory in any given election (in his retort, Lichtman quite reasonably responds that “the keys were not designed to predict vote percentages. They were designed to forecast whether the incumbent or challenging party will prevail in the popular vote, regardless of the margin of victory”). Then Nate hits upon something important, which I will quote in full: 

“If there are, say, 25 keys that could defensibly be included in the model, and you can pick any set of 13 of them, that is a total of 5,200,300 possible combinations. It’s not hard to get a perfect score when you have that large a menu to pick from! […] It’s less that he has discovered the right set of keys than that he’s a locksmith and can keep minting new keys until he happens to open all 38 doors.” (Silver, 2011, online).

Why is this important? It is important because it describes the way in which Professor Lichtman misrepresents the input he used to create The Keys (results from historic US presidential elections) as the output of his system. Here’s an example: he writes “retrospectively, the keys account for the results of every presidential election from 1860 through 1980, much longer than any other prediction system” (Lichtman, 2016, pp. 9). This is not a lie, but it is deceptively worded. What does it even mean for a system to “retrospectively” account for the very data it was built upon? It would be a bloody poor system if it didn’t! To put it another way, we all know that The Keys is not going to call the 1932 election for Herbert Hoover, and we all intuitively know why: The Keys does not predict the result of this election (or even “account” for it) but is in fact predicated on it. As Silver notes above, there is nothing fantastically impressive in designing a model that can regurgitate the data it is modelled upon – hindsight, as they say, is 20-20.
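Silver’s figure of 5,200,300 is just the number of ways to choose 13 items from 25, and it can be checked directly. A minimal sketch, assuming Python 3.8 or later for `math.comb`; the variable names are mine, and the pool of 25 candidate keys is Silver’s hypothetical rather than a real count:

```python
import math

candidate_keys = 25  # Silver's hypothetical pool of defensible keys
keys_in_model = 13   # size of Lichtman's checklist

# Number of distinct 13-key checklists that could be drawn from the pool.
n_models = math.comb(candidate_keys, keys_in_model)
print(f"{n_models:,}")  # 5,200,300
```

With over five million candidate checklists and only 38 past elections to fit, it is unsurprising that at least one checklist matches every historical result.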

Let’s move on from hindsight to what we’re really interested in here: foresight. Since The Keys’ creation, Professor Lichtman has called all ten US presidential elections (the first being 1984). What’s his actual predictive track record? Frankly, it’s great. In all that time, he has only called the race incorrectly once – the 2000 election, when George Bush Jr. beat then-Vice President Al Gore in the Electoral College despite losing the popular vote. Ah, but hang on – The Keys is designed to predict popular vote victory, not Electoral College victory. So, over nearly 40 years, The Keys has managed an incredible success rate of 100%! Let’s turn now to the penultimate chapter of The Keys, where Professor Lichtman forecasts the result of the 2016 presidential election… his prediction? Uncertain. Luckily, Professor Lichtman didn’t leave it there.

In a September 23, 2016 interview with The Washington Post (sorry about the paywall), Professor Lichtman responds to the question of who, with seven weeks left before voting day, will be president by the end of the year: “Based on the 13 keys, it would predict a Donald Trump victory. Remember, six keys and you’re out, and right now the Democrats are out – for sure – five keys.” (Stevenson, 2016, online). 

After the election, when Professor Lichtman had been proved right, unlike the vast majority of pundits and pollsters, the media was rightly impressed (and here, and here, for example). Or was it right to be impressed? Just a paragraph ago, I wrote that The Keys predict the winner of the popular vote, not the electoral college. A few paragraphs before that, I quoted Professor Lichtman’s retort to Nate Silver from 2011 that makes this fact explicit. And don’t forget, in my 2016 edition of The Keys, it’s stated in the very first chapter (Lichtman, 2016, pp. 9). President Trump, of course, did not win the popular vote – Hillary Clinton won 65,853,514 votes to Trump’s 62,984,828. Now, if you’re like me, you like your academics honest, and probably hope to read that Professor Lichtman quickly corrected this error and owned up to his mistake. Sadly… well, you can probably guess how it really went.

When I was adding the finishing touches to this review (October 18, 2021), I googled “Lichtman calls 2016 for Trump” (without the quotation marks). One of the articles that came up was an American University (where Lichtman is Distinguished Professor of History) article titled Historian’s Prediction: Donald J. Trump to Win 2016 Election. This article was published on September 26, 2016. The very first paragraph on that page is not part of the article itself. It reads: “Editor’s Note: This story has been updated with a correction. It has been corrected to read that Prof. Lichtman’s 13 Keys system predicts the winner of the presidential race, not the outcome of the popular vote.” Confusingly, it later claims that The Keys also predicted “Al Gore’s popular vote victory in 2000.” (Basu, 2016, online). 

The pertinent question here is which election Lichtman admits to having called wrong: 2000, when he got the Electoral College vote wrong, or 2016 when he got the popular vote wrong? The answer is: neither. 

My copy of The Keys was published in 2016, and it claims The Keys predict the outcome of the popular vote. But by the end of the same year, Lichtman had decided it predicts the winner of the presidential race instead. When did he change his mind? Not being Professor X, I can’t tell you for certain. However, what I lack in telepathic powers, I make up for with access to the Wayback Machine. So I can tell you without a shadow of a doubt that on November 12, 2016, just four days after the relevant election results came in, the editor’s note quoted above didn’t exist.

Again, it’s hard to be charitable here… I’m not saying that Lichtman waited until the results came in, saw he was wrong, and decided to retroactively change what he claimed to be predicting rather than admit to his mistake. I’m just saying it looks like he did.
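The arithmetic behind this complaint can be made explicit. The sketch below encodes only the two disputed calls discussed above (2000 and 2016) and takes the other eight calls since 1984 as correct under either criterion, as the review does. The names `DISPUTED`, `UNDISPUTED_CORRECT`, and `correct_calls` are hypothetical, chosen for illustration.

```python
# Only the two disputed calls are encoded; the other eight calls since 1984
# are taken from the review as correct under either criterion.
UNDISPUTED_CORRECT = 8

# year -> (Lichtman's call, popular-vote winner, Electoral College winner)
DISPUTED = {
    2000: ("Gore", "Gore", "Bush"),
    2016: ("Trump", "Clinton", "Trump"),
}


def correct_calls(criterion: str) -> int:
    """Count correct calls out of ten under one fixed criterion."""
    column = {"popular": 1, "electoral": 2}[criterion]
    hits = sum(1 for row in DISPUTED.values() if row[0] == row[column])
    return UNDISPUTED_CORRECT + hits


for criterion in ("popular", "electoral"):
    print(criterion, correct_calls(criterion), "of 10")
```

Under either fixed criterion the tally is nine out of ten. A perfect record only appears if 2000 is scored on the popular vote and 2016 on the Electoral College.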

Part five: a final judgement 

I think I’ve made my problems with The Keys pretty clear: it’s an engaging, interesting, and informative read, but a lot of it feels dishonest to me. Maybe that’s too harsh – Professor Lichtman might just be overly enthusiastic. I would probably be less sceptical if I hadn’t had to pay for the book, and if Professor Lichtman didn’t publish new editions so regularly. 

Criticisms aside, though, there are plenty of reasons I talked the book up in part one of this review that go beyond “it’s an engaging read”. For one thing, Lichtman does have a genuinely positive political vision, and as someone with an interest in political theory (my master’s degree was in philosophy), it’s hard not to be attracted to it. Here are the “lessons” Lichtman thinks politicians should take away from The Keys, as presented in its final chapter: 

  1. Fire the hucksters (media consultants, pollsters, advertising and media strategists, etc.).
  2. Concentrate on substance (focus on governing rather than rhetoric).
  3. Don’t play it safe (propose bold and innovative policy initiatives).
  4. Don’t hide from ideology (which can be the driving force behind an administration).
  5. Take the high road (don’t repudiate your opponent unnecessarily – build yourself up rather than tear them down).
  6. Pick the best candidate for the number-two slot (forget courting fringe votes, pick the best leader in the party to be your running mate).
  7. Get off the merry-go-round (tone down all that campaigning, slash the media appearances, and ditch the photo opportunities).

To me, this sounds like a much better type of politics. 

Particularly, I would like to say a few words about jettisoning the performative aspects of politics and focussing on the substantive, which is a major theme in Lichtman’s list of takeaways. There is a not-insignificant body of research that suggests advertising is far less effective than the people who create it would have you believe. I have worked in the advertising industry for over a decade, and this rings true to me. So, I think there’s something to Lichtman’s assertion that ads, rallies, and related tools are less impactful than presidential candidates hope. 

(It’s important to remember, though, that there’s a big difference between the suggestion that ads are less persuasive than we’re told they are, and the suggestion that people don’t frequently operate under the sway of persuasion. The former is pretty well documented, and vitally important to remember in a world where the media stirs up fears about online advertising re-routing democracy. The latter is, frankly, ludicrous, as study after study after study demonstrates).

Another thing that really stands out to me as a positive about The Keys is that it doesn’t pretend to be more scientific than it is. Lichtman is clear that The Keys include some subjective judgements. Ironically, I think this actually makes it better able to engage with objective reality. People are confusing, messy, subjective things, and when you put millions of them together in a complex system defined by inscrutable non-linear dynamics, predictions become decidedly fuzzy. This fuzziness is built into The Keys not because it’s complex, but because it doesn’t try to hide people’s inherent complexity from themselves.

I suspect most of us think that it matters whether or not a presidential candidate is extremely charismatic, but how do we quantify something that’s fundamentally qualitative? Step one has to be factoring it into our methodology and taking it seriously. As anyone familiar with Bayes’ Theorem already knows, evaluating our priors and adjusting expectations to incorporate new information is a fundamental aspect of making accurate predictions. In The Keys, we’re given a framework within which to think about the impact of charisma on an electorate, and the quantification is built into the system automatically: it’s one of thirteen equally weighted factors. 

My final assessment of The Keys as a book is that it’s a fun and interesting read – although you should expect to be bored by the middle chapters, unless you’re a political history enthusiast. And my assessment of The Keys as a system is that it’s a valuable addition to a political forecaster’s arsenal. If you’re already adept at forecasting, I wouldn’t suggest swapping Bayes for Lichtman – but The Keys could help you think about political issues in a wider, more useful, way (I would probably say it had this effect on me, despite already having a solid grounding in political theory).

Neither the book nor the system is, however, perfect.

The book suffers from a lack of objectivity when assessing the system’s ability to produce accurate forecasts, and in places this bleeds into (what I have to conclude is) outright deception. And the system, despite its incredible track record and intuitive nature, has nothing like the success rate Lichtman claims.

Lichtman has made ten predictions with this system and gotten nine of them right. The subjective nature of The Keys itself makes this success rate hard to evaluate. Is Lichtman particularly good at using The Keys because he’s a quantitative historian who specialises in American political history? Would he make good predictions no matter what framework he used? A fairer way to assess The Keys’ performance would be for Lichtman to teach a team of undergrads his system and track their results over time. Given Professor Lichtman’s apparent dedication to preserving the illusion of omniscience, I doubt we’ll see this kind of rigour employed any time soon.

All in all, The Keys is deserving of a space on the bookshelf if you’re into political forecasting. But, until Professor Lichtman addresses some of the issues raised in this review, I don’t think it’s deserving of a great deal more than that.


Some of the best rationality essays

October 20, 2021 - 01:57
Published on October 19, 2021 10:57 PM GMT

Meta: Send this to anyone who is interested in learning more about "rationality"

A refresher: what is “rationality?” [1]

Rationality is the art of thinking in ways that result in accurate beliefs and good decisions [as understood by the LessWrong community; this understanding of rationality differs from others, some of which are more common in colloquial usage than “LessWrong rationality”]. It is the primary topic of LessWrong.

Rationality is not only about avoiding the vices of self-deception and obfuscation (the failure to communicate clearly), but also about the virtue of curiosity, seeing the world more clearly than before, and achieving things previously unreachable to you. The study of rationality on LessWrong includes a theoretical understanding of ideal cognitive algorithms, as well as building a practice that uses these idealized algorithms to inform heuristics, habits, and techniques, to successfully reason and make decisions in the real world.

Resources and materials

To learn more about rationality, I recommend (in order of usefulness per unit of effort): 

To learn more about systematically doing good (i.e. effective altruism) I recommend:

Some of the best rationality essays [2]

Below you can find some of the best writings on rationality, in no particular order:

  1. Affective Death Spirals
  2. Uncritical Supercriticality
  3. Cached Thoughts
  4. We Change Our Minds Less Often Than We Think
  5. Hold Off On Proposing Solutions
  6. Knowing About Biases Can Hurt People
  7. Update Yourself Incrementally
  8. The Bottom Line
  9. Avoiding Your Belief's Real Weak Points
  10. Motivated Stopping and Motivated Continuation
  11. Anti-Epistemology
  12. The Proper Use of Humility
  13. You Can Face Reality
  14. The Meditation on Curiosity
  15. Something to Protect
  16. Crisis of Faith
  17. Mind Projection Fallacy
  18. Reductionism
  19. Explaining vs. Explaining Away
  20. Angry Atoms
  21. Making Beliefs Pay Rent (in Anticipated Experiences)
  22. Humans are not automatically strategic
  23. Local Validity as a Key to Sanity and Civilization
  24. Twelve Virtues of Rationality
  25. The noncentral fallacy - the worst argument in the world?
  26. Policy Debates Should Not Appear One-Sided
  27. "Other people are wrong" vs "I am right"
  28. On Caring
  29. Outline of Galef's "Scout Mindset"
  30. Generalizing From One Example
  31. Diseased thinking: dissolving questions about disease
  32. How An Algorithm Feels From Inside
  33. Scope Insensitivity
  34. Conservation of Expected Evidence
  35. Your Strength as a Rationalist
  36. Your intuitions are not magic
  37. How To Convince Me that 2 + 2 = 3
  38. The simple truth
  39. Circular altruism


[1] Taken from the Rationality Tag, though the bracketed words are mine.

[2] Half of these are the bolded essays from here, the other half were recommended by a well-read LessWronger. By “some of the best” I really mean “some of the most useful to read, probably.” This is not authoritative at all; please comment if you think we missed any important essays.


Ideal Format/Frequency of Information Consumption

October 19, 2021 - 18:39
Published on October 19, 2021 3:39 PM GMT

It seems that for every type of information input, there is an ideal:

1. Source/Format

2. Frequency

3. Time of Day

4. Energy State

1. Source/Format

Examples: Twitter, blog, books, podcasts etc.

Don’t waste your time following your favorite author’s Twitter account if 99% of their value is derived from their books.

2. Frequency

The most effective frequency to consume a particular information source might be:

- Live/on-demand/push notifications

- Daily

- Weekly

- Monthly

- Annually

- On a Random Schedule

Match the frequency to the information source, and aim to maximize value gained for time input. Apply the 80/20 rule, then apply it again.

In general, if you feel like a source of information input is taking up too much of your time relative to the value gained, try reducing the frequency.

3. Time of Day

Some information is best consumed in the morning, some during your lunch break, some while exercising, some late at night etc.

4. Energy State

Example: Deep work is only possible when you’re in a state where your mind is able to focus and concentrate, not when you’re exhausted.

Example: If you have a favorite TV show, you can probably watch it when you’re exhausted in the evening and still enjoy it, whereas you can’t complete serious ‘productive’ work in that energy state.

Note: All of these factors are unique to each person and change at different stages of life.

Note: Many information sources don’t pass any of these filters, and their ideal frequency is zero: never consume them.


Listen to top LessWrong posts with The Nonlinear Library

October 19, 2021 - 15:29
Published on October 19, 2021 12:24 PM GMT

Crossposted from the EA Forum.

We are excited to announce the launch of The Nonlinear Library, which allows you to easily listen to top EA and rationalist content on your podcast player. We use text-to-speech software to create an automatically updating repository of audio content from the EA Forum, Alignment Forum, LessWrong, and other EA blogs.

In the rest of this post, we’ll explain our reasoning for the audio library, why it’s useful, why it’s potentially high impact, its limitations, and our plans. You can read it here or listen to the post in podcast form here.

Goal: increase the number of people who read EA research

A koan: if your research is high quality, but nobody reads it, does it have an impact?

Generally speaking, the theory of change of research is that you investigate an area, come to better conclusions, people read those conclusions, they make better decisions, all ultimately leading to a better world. So the answer is no. Barring some edge cases (1), if nobody reads your research, you usually won’t have any impact.

Research → Better conclusion → People learn about conclusion → People make better decisions → The world is better

Nonlinear is working on the third step of this pipeline: increasing the number of people engaging with the research. By increasing the total number of EA and rationalist articles read, we’re increasing the impact of all of that content.

This is often relatively neglected because researchers typically prefer doing more research instead of promoting their existing output. Some EAs seem to think that if their article was promoted one time, in one location, such as the EA Forum, then surely most of the community saw it and read it. In reality, it is rare that more than a small percentage of the community will read even the top posts. This is an expected-value tragedy, when a researcher puts hundreds of hours of work into an important report which only a handful of people read, dramatically reducing its potential impact.

Here are some purely hypothetical numbers just to illustrate this way of thinking:

Imagine that you, a researcher, have spent 100 hours producing outstanding research that is relevant to 1,000 out of a total of 10,000 EAs.

Each relevant EA who reads your research will generate $1,000 of positive impact. So, if all 1,000 relevant EAs read your research, you will generate $1 million of impact.

You post it to the EA Forum, where posts receive 500 views on average. Let’s say, because your report is long, only 20% read the whole thing - that’s 100 readers. So you’ve created 100*1,000 = $100,000 of impact. Since you spent 100 hours and created $100,000 of impact, that’s $1,000 per hour - pretty good!

But if you were to spend, say, 1 hour promoting your report (for example, by posting links on EA-related Facebook groups) to generate another 100 readers, that would produce another $100,000 of impact. That’s $100,000 per marginal hour, or ~$2,000 per hour taking into account the fixed cost of doing the original research.

Likewise, if another 100 EAs were to listen to your report while commuting, that would generate an incremental $100,000 of impact - at virtually no cost, since it’s fully automated.

In this illustrative example, you’ve nearly tripled your cost-effectiveness and impact with one extra hour spent sharing your findings and a public system that turns them into audio for you.
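The arithmetic in this example can be checked with a short sketch. It uses only the hypothetical figures from the paragraphs above ($1,000 of impact per reader, 100 research hours, 1 hour of promotion, 100 readers per channel); the function and variable names are made up for illustration.

```python
# Hypothetical figures from the illustrative example above.
IMPACT_PER_READER = 1_000   # dollars of impact per relevant reader
RESEARCH_HOURS = 100        # hours spent on the original research
PROMOTION_HOURS = 1         # extra hour spent sharing the report


def impact_per_hour(readers: int, hours: float) -> float:
    """Total dollar impact divided by total hours invested."""
    return readers * IMPACT_PER_READER / hours


# 100 readers from the forum post alone.
baseline = impact_per_hour(readers=100, hours=RESEARCH_HOURS)
# Another 100 readers from one hour of promotion.
promoted = impact_per_hour(readers=200, hours=RESEARCH_HOURS + PROMOTION_HOURS)
# Another 100 listeners from the automated audio version (no extra hours).
with_audio = impact_per_hour(readers=300, hours=RESEARCH_HOURS + PROMOTION_HOURS)

print(f"forum only:       ${baseline:,.0f}/hour")
print(f"plus promotion:   ${promoted:,.0f}/hour")
print(f"plus audio:       ${with_audio:,.0f}/hour")
print(f"overall multiple: {with_audio / baseline:.2f}x")
```

Under these assumptions the overall multiple comes out just under 3, which is where "nearly tripled" comes from.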

Another way the audio library is high expected value is that instead of acting as a multiplier on just one researcher or one organization, it acts as a multiplier on nearly the entire output of the EA research community. This allows for two benefits: long-tail capture and the power of large numbers and multipliers.

Long-tail capture. The value of research is extremely long tailed, with a small fraction of the research having far more impact than others. Unfortunately, it’s not easy to do highly impactful research or predict in advance which topics will lead to the most traction. If you as a researcher want to do research that dramatically changes the landscape, your odds are low. However, if you increase the impact of most of the EA community’s research output, you also “capture” the impact of the long tails when they occur. Your probability of applying a multiplier to very impactful research is actually quite high.

Power of large numbers and multipliers. If you apply a multiplier to a bigger number, you have a proportionately larger impact. This means that even a small increase in the multiplier leads to outsized improvements in output. For example, if a single researcher toiled away to increase their readership by 50%, that would likely have a smaller impact than the Nonlinear Library increasing the readership of the EA Forum by even 1%. This is because 50% times a small number is still very small, whereas 1% times a large number is actually quite large. And there’s reason to believe that the library could have much larger effects on readership, which brings us to our next section. 
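To make the comparison concrete, here is a minimal sketch. The readership figures are invented purely for illustration and do not come from the post; only the 50% and 1% increases are taken from the paragraph above.

```python
# Hypothetical readership figures, chosen only to illustrate the argument.
SINGLE_RESEARCHER_READS = 100   # assumed reads of one researcher's output
WHOLE_FORUM_READS = 100_000     # assumed reads across the whole forum


def extra_reads(base_reads: int, percent_increase: int) -> int:
    """Additional reads gained from a percentage increase in readership."""
    return base_reads * percent_increase // 100


researcher_gain = extra_reads(SINGLE_RESEARCHER_READS, 50)  # 50% of a small base
forum_gain = extra_reads(WHOLE_FORUM_READS, 1)              # 1% of a large base

print(f"researcher gains {researcher_gain} reads, forum gains {forum_gain} reads")
```

With these assumed bases, the 1% increase on the large base yields far more additional reads than the 50% increase on the small one. The conclusion depends entirely on how large the gap between the two bases really is.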

Why it’s useful
EA needs more audio content

EA has a vibrant online community, and there is an amazing amount of well researched, insightful, and high impact content. Unfortunately, it’s almost entirely in writing and very little is in audio format.

There are a handful of great podcasts, such as the 80,000 Hours and FLI podcasts, and some books are available on Audible. However, these episodes come out relatively infrequently and the books even less so. There are a few other EA-related podcasts, including one for the EA Forum, but a substantial percentage have become dormant, as is far too common for channels given the considerable effort required to put out episodes.

There are a lot of listeners

The limited availability of audio is a shame because many people love to listen to content. For example, ever since the 80,000 Hours podcast came out, a common way for people to become more fully engaged in EA is to mainline all of their episodes. Many others got involved through binging the HPMOR audiobook, as Nick Lowry puts it in this meme. We are definitely a community of podcast listeners.

Why audio? Often, you can’t read with your eyes but you can with your ears, such as when you’re working out, commuting, or doing chores. Sometimes it’s just for a change of pace. In addition, some people find listening to be easier than reading. Because it feels easier, they choose to spend time learning that might otherwise be spent on lower value things.

Regardless, if you like to listen to EA content, you’ll quickly run out of relevant podcasts - especially if you’re listening at 2-3x speed - and have to either use your own text-to-speech software or listen to topics that are less relevant to your interests.

Existing text-to-speech solutions are sub-optimal

We’ve experimented extensively with text-to-speech software over the years, and all of the dozens of programs we’ve tried have fairly substantial flaws. In fact, a huge inspiration for this project was our frustration with the existing solutions and thinking that there must be a better way. Here are some of the problems that often occur with these apps:

  • They are glitchy, frequently crashing, losing your spot, failing at handling formatting edge cases, etc.
  • Their playlists don’t work or don’t exist, so you’ll pause every 2-7 minutes to pick a new article to read, making them awkward to use during commutes, workouts, or chores. Or maybe you can’t change the order, like with Pocket, which makes it unusable for many.
  • They’re platform specific, forcing you to download yet another app, instead of, say, the podcast app you already use.
  • Pause buttons on headphones don’t work, making it exasperating to use when you’re being interrupted frequently.
  • Their UI is bad, requiring you to constantly fiddle around with the settings.
  • They don’t automatically add new posts. You have to do it manually, thus often missing important updates.
  • They use old, low-quality voices, instead of the newer, way better ones. Voices have improved a lot in the last year.
  • They cost money, creating yet another barrier to the content.
  • They limit you to 2x speed (at most), and their original voices are slower than most human speech, so it’s more like 1.75x. This is irritating if you’re used to faster speeds.

In the end, this leads to only the most motivated people using the services, leaving out a huge percentage of the potential audience. (2)

How The Nonlinear Library fixes these problems

To make it as seamless as possible for EAs to use, we decided to release it as a podcast so you can use the podcast app you’re already familiar with. Additionally, podcast players tend to be reasonably well designed and offer great customizability of playlists and speeds.

We’re paying for some of the best AI voices because old voices suck. And we spent a bunch of time fixing weird formatting errors and mispronunciations and have a system to fix other recurring ones. If you spot any frequent mispronunciations or bugs, please report them in this form so we can continue improving the service.

Initially, as an MVP, we’re just posting each day’s top upvoted articles from the EA Forum, Alignment Forum, and LessWrong. (3) We are planning on increasing the size and quality of the library over time to make it a more thorough and helpful resource.

Why not have a human read the content?

The Astral Codex Ten podcast and other rationalist podcasts do this. We seriously considered this, but it’s just too time consuming, and there is a lot of written content. Given the value of EA time, both financially and counterfactually, this wasn’t a very appealing solution. We looked into hiring remote workers, but that would still have ended up costing at least $30 an episode. This compares to approximately $1 an episode via text-to-speech software.

Beyond the lower time and monetary costs, text-to-speech also lets us build a far more complete library. If we did this with humans and we invested a ton of time and management, we might be able to convert seven articles a week. At that rate, we’d never be able to keep up with new posts, let alone include the historical posts that are so valuable. With text-to-speech software, we could keep up with all new posts and convert the old ones, creating a much more complete repository of EA content. Just imagine being able to listen to over 80% of EA writing you’re interested in compared to less than 1%.

Additionally, the automaticity of text-to-speech fits with Nonlinear’s general strategy of looking for interventions that have “passive impact”. Passive impact is the altruistic equivalent of passive income, where you make an upfront investment and then generate income with little to no ongoing maintenance costs. If we used human readers, we’d have a constant ongoing cost of managing them and hiring replacements. With TTS, after setting it up, we can mostly let it run on its own, freeing up our time to do other high impact activities.

Finally, and least importantly, there is something delightfully ironic about having an AI talk to you about how to align future AI.

If somebody doesn’t want their content turned into audio

If for whatever reason you would not like your content in The Nonlinear Library, just fill out this form. We can remove that particular article or add you to a list to never add your content to the library, whichever you prefer. 

Future Playlists (“Bookshelves”)

There are a lot of sub-projects that we are considering doing or are currently working on. Here are some examples:

  • Top of all time playlists: a playlist of the top 300 upvoted posts of all time on the EA Forum, one for LessWrong, etc. This allows people to binge all of the best content EA has put out over the years. Depending on their popularity, we will also consider setting up top playlists by year or by topic. As the library grows we’ll have the potential to have even larger lists as well.
  • Playlists by topic (or tag): a playlist for biosecurity, one for animal welfare, one for community building, etc.
  • Playlists by forum: one for the EA Forum, one for LessWrong, etc.
  • Archives. Our current model focuses on turning new content into audio. However, there is a substantial backlog of posts that would be great to convert.
  • Org specific podcasts. We'd be happy to help EA organizations set up their own podcast version of their content. Just reach out to us.
  • Other? Let us know in the comments if there are other sources or topics you’d like covered.

(1)  Sometimes the researcher is the same person as the person who puts the results into action, such as Charity Entrepreneurship’s model. Sometimes it’s a longer causal chain, where the research improves the conclusions of another researcher, which improves the conclusions of another researcher, and so forth, but eventually it ends in real world actions. Finally, there is often the intrinsic happiness of doing good research felt by the researcher themselves.

(2)  For those of you who want to use TTS for a wider variety of articles than what the Nonlinear Library will cover, the ones I use are listed below. Do bear in mind they each have at least one of the cons listed above. There are probably also better ones out there as the landscape is constantly changing. 

(3) The current upvote thresholds for which articles are converted are:

  • 25 for the EA Forum
  • 30 for LessWrong
  • No threshold for the Alignment Forum, due to low volume

This is based on the frequency of posts, relevance to EA, and quality at certain upvote levels.
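As a rough sketch of how such a filter might look (this is not the library’s actual code): the threshold values are the ones listed in this footnote, while the function name, the dictionary layout, the inclusive comparison, and the handling of unknown sources are all assumptions made for illustration.

```python
from typing import Optional

# Upvote thresholds from footnote (3); None means no threshold applies.
THRESHOLDS: dict[str, Optional[int]] = {
    "EA Forum": 25,
    "LessWrong": 30,
    "Alignment Forum": None,
}


def should_convert(forum: str, upvotes: int) -> bool:
    """Return True if a post qualifies for conversion to audio."""
    if forum not in THRESHOLDS:
        return False  # assumption: posts from unlisted sources are skipped
    threshold = THRESHOLDS[forum]
    # Assumption: a post exactly at the threshold qualifies.
    return threshold is None or upvotes >= threshold


print(should_convert("EA Forum", 25))
print(should_convert("LessWrong", 29))
print(should_convert("Alignment Forum", 0))
```

Keeping the thresholds in a single mapping means they can be tuned per forum as posting volume changes, without touching the filtering logic.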