LessWrong.com News

A community blog devoted to refining the art of rationality
Updated: 1 hour 6 minutes ago

test post

August 27, 2021 - 18:30
Published on August 27, 2021 3:30 PM GMT

why don't you leave a test comment?


Research productivity tip: "Solve The Whole Problem Day"

August 27, 2021 - 16:05
Published on August 27, 2021 1:05 PM GMT

(This is about a research productivity strategy that’s been working very well for me personally. But YMMV, consider reversing any advice, etc. etc.)

As a researcher, I work with kind of a stack of "what I'm trying to do", from the biggest picture down to the most microscopic task. Here's a typical "stack trace" of what I might be doing on a random morning:

So as researchers, we face a practical question: How do we allocate our time between the different levels of the stack? If we’re 100% at the bottom level, we run a distinct risk of "losing the plot", and working on things that won't actually help advance the higher levels. If we’re 100% at the top level, with our head way up in the clouds, never drilling down into details, then we’re probably not learning anything or making any progress.

Obviously, you want a balance.

And I've found that striking that balance properly isn't something that takes care of itself by default. Instead, my default is to spend too much time at the bottom of the stack and not enough time higher up.

So to counteract that tendency, I have for many months now had a practice of "Solve The Whole Problem Day". That's one day a week (typically Friday) where I force myself to take a break from whatever detailed thing I would otherwise be working on, and instead I fly up towards the top of the stack, and try to see what I'm missing, question my assumptions, etc.

In my case, "The Whole Problem" = "The Whole Safe & Beneficial AGI Problem". For you, it might be The Whole Climate Change Problem, or The Whole Animal Suffering Problem, or The Whole Becoming A Billionaire Problem, or whatever. (If it's not obvious how to fill in the blank, well then you especially need a Solve The Whole Problem Day! And maybe start here & here & here.)

Implementation details
  • The most concrete and obvious way that my Solve The Whole Problem Days are different from my other workdays is that I have a rule that I impose on myself: No neuroscience. ("Awww c'mon, not even a little? Pretty please?" "No!!!!!"). So that automatically forces me up to like Levels 3 & 4 on the bulleted list above, instead of my usual perch at Levels 1 & 2. Of course, there's more to it than that one rule—the point is Solving The Whole Problem, not following self-imposed rules. But still, that rule is especially helpful.
    • For example, when I'm answering emails and commenting on blog posts, that's often not about neuroscience, nor about Solving The Whole Problem. So I wouldn't count those towards fulfilling the spirit of Solve The Whole Problem Day.
  • The point is not to stay at a high level of the stack; the point is to visit a high level, and then drill back down. Drilling down into lower-level details is fine … as long as it's along a new and different branch of the tree.
  • I also have a weekly cleanup and reorganization of my to-do list, but I think of it as a totally different thing from Solve The Whole Problem Day, and indeed I do it on a different day. In fact, a separate sub-list on my Trello board to-do list is a list of tasks that I want to try tackling on an upcoming Solve The Whole Problem Day.
  • I have no qualms about Solving The Whole Problem on other days of the week too—I'm trying to correct a particular bias in my own workflow, and am not at risk of overcorrecting.
Why do I need to force myself to do this, psychologically?

It's crazy: practically every Solve The Whole Problem Day, I start the morning with a feeling of dread and annoyance and strong temptation to skip it this week. And I end the day feeling really delighted about all the great things I got done. Why the annoyance and dread? Introspectively, I think there are a few things going on in my mind:

  • First, I’m very often immersed in some interesting problem, and reluctant to pause. “Aww,” I say to myself, “I really wanted to know what the nucleus incertus does! What on earth could it be? And now I have to wait all the way until Monday to figure it out? C'mon!!” Not just that, but all my normal heuristics for to-do-list prioritization would say that I should figure out the nucleus incertus right now: I need to do it eventually one way or the other, and I'm motivated to do it now, and I'm in an especially good position to do it right now (given that all the relevant context is fresh in my mind), and finally, the "Solve The Whole Problem" activities are not time-sensitive.
  • Second, I prefer working on problems that definitely have solutions, even if nobody knows them. The nucleus incertus does something. Its secrets are just waiting to be revealed, if only we knew where to look! Other low-level tasks are of the form "Try doing X with method Y", which might or might not succeed, but at least I can figure out whether it succeeds or fails, cross it off my to-do list, and move on. By contrast, higher-level things are sometimes in that awful place where there’s neither a solution, nor a proof that no solution exists. (Think of things like "solve the whole AGI control problem", or "find an interpretability technique that scales to AGI".) If I'm stumped, well maybe it's not just me, maybe there's just no progress to be made. I find that somewhat demotivating and aversive. Not terribly so, but just enough to push me away, if I’m not being self-aware about it.
  • Third, I have certain ways of thinking about the bigger-picture context of what I'm working on, and I'm used to thinking that way, and it's comfortable and I like it. But a frequent task of Solve The Whole Problem Day is to read someone coming from a very different perspective, sharing none of my assumptions or proximate goals or terminology, and try to understand that perspective and get something out of it. Sometimes this is fun and awesome, but also sometimes it's just a really long hard dense slog with no reward at the end. So it feels aversive, and comparatively unproductive.

But again that's just me. YMMV.


Altruism Under Extreme Uncertainty

August 27, 2021 - 09:58
Published on August 27, 2021 6:58 AM GMT

I attended an Effective Altruism club today where someone had this to say about longtermism.

I have an intuitive feeling that ethical arguments about small probabilities of helping out extremely large numbers (like 10^58) of people are flawed, but I can't construct a good argument for why this is.

The flaw is uncertainty.

In the early 20th century, many intellectuals were worried about population control. The math was simple. People reproduce at an exponential rate. The amount of food we can create is finite. Population growth will eventually outstrip production. Humanity will starve unless population control is implemented by governments.
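The Malthusian arithmetic can be made concrete with a toy calculation (the numbers below are my own illustrative choices, not from any historical source): a quantity that compounds exponentially must eventually overtake one that grows linearly, no matter how large the initial surplus.

```python
# Toy Malthusian arithmetic (illustrative numbers only):
# population compounds exponentially while food production grows linearly,
# so population eventually outstrips food regardless of the starting surplus.
pop, food = 1.0, 10.0               # arbitrary units; start with a 10x food surplus
pop_growth, food_gain = 1.02, 0.05  # 2%/yr compounding vs. a fixed linear gain
year = 0
while pop <= food:
    pop *= pop_growth
    food += food_gain
    year += 1
print(year)  # the crossover year arrives despite the large initial surplus
```

The crossover takes a long time with these gentle numbers, but it always arrives; that is the whole force of the early-20th-century argument.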

What actually happened was as surprising as it was counterintuitive. People in rich, industrial countries with access to birth control voluntarily restricted the number of kids they have. Birthrates fell below replacement-level fertility. This process is called the demographic transition.

We now know that if you want to reduce population growth, the best way to do so is to make everyone rich and then provide free birth control. The side effects of this are mostly beneficial too.

China didn't know about the demographic transition when it implemented the one-child policy (一孩政策). The one-child policy wasn't just a human rights disaster involving tens of thousands of forced abortions for the greater good. It was totally unnecessary. The one-child policy was implemented at a time when China was rapidly industrializing. The Chinese birthrate would have naturally dropped below replacement level without government intervention. Chinese birthrates are still below replacement-level fertility even now that the one-child policy has been lifted. China didn't just pay a huge cost to get zero benefit. It paid a huge cost to gain negative benefit. Its age pyramid and sex ratios are extra messed up now. This is the opposite of what effective population control should have accomplished.

China utterly failed to predict its own demographic transition even though demographic changes on time horizons of a few decades are an unusually easy trend to predict. The UN makes extremely precise predictions on population growth. Most trends are much harder to predict than population growth. If you're making ethical decisions involving the distant future then you need to make predictions about the distant future. Predictions about the distant future necessarily involve high uncertainty.

In theory, a 10% chance of helping 10 people equals a 0.001% chance of helping out 100,000 people. In practice, they are very different because of uncertainty. In the 10% situation, a 0.1% uncertainty is ignorably small. In the 0.001% situation, a 0.1% uncertainty dominates the equation. You have a 0.051% chance of doing good and a 0.049% chance of doing harm once uncertainty is factored in. It's statistical malpractice to even write the probabilities as "0.051%" and "0.049%". They both round to 0.05%.
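Here is that paragraph's arithmetic spelled out (this is my reading of the model: a symmetric 0.1% "my model is wrong, in an unknown direction" term splits evenly between good and harm, and the tiny success probability merely nudges that split):

```python
signal = 0.001 / 100     # 0.001% chance the intervention helps as modeled
uncertainty = 0.1 / 100  # 0.1% chance the model itself is wrong, direction unknown

# The unknown direction splits the uncertainty term evenly between good and
# harm outcomes; the tiny signal only nudges that even split.
p_good = uncertainty / 2 + signal
p_harm = uncertainty / 2 - signal
print(f"{p_good:.3%} good vs {p_harm:.3%} harm")  # 0.051% vs 0.049%

# In the 10%-chance case, the same uncertainty term is negligible by comparison:
print(f"{10 / 100 + uncertainty / 2:.3%}")  # still ~10%
```

The point of the sketch is visible in the two prints: in the first case the uncertainty term is fifty times larger than the signal, while in the second it is a rounding error.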

Is it worth acting when you're comparing a 0.051% chance of doing good to a 0.049% chance of doing harm? Maybe, but it's far from a clean argument. Primum non nocere (first, do no harm) matters too. When the success probability of an altruistic action is lower than my baseline uncertainty about reality itself, I let epistemic humility take over by prioritizing more proximate objectives.


Could you have stopped Chernobyl?

August 27, 2021 - 07:05
Published on August 27, 2021 1:48 AM GMT

...or would you have needed a PhD for that?

It would appear the inaugural post caused some (off-LW) consternation! It would, after all, be a tragedy if the guard in our Chernobyl thought experiment overreacted and just unloaded his Kalashnikov on everyone in the room and the control panels as well.

And yet, we must contend with the issue that if the guard had simply deposed the leading expert in the room, perhaps the Chernobyl disaster would have been averted.

So the question must be asked: can laymen do anything about expert failures? We shall look at some man-made disasters, starting of course, with Chernobyl itself.

Chernobyl

(Image caption: One way for problems to surface)

To restate the thought experiment: the night of the Chernobyl disaster, you are a guard standing outside the control room. You hear increasingly heated bickering and decide to enter and see what's going on, perhaps right as Dyatlov proclaims there is no rule. You, as the guard, would immediately have to choose: listen to the technicians, at least the ones who speak up and tell you something is wrong with the reactor and the test must be stopped, or listen to Dyatlov, who tells you nothing is wrong, the test must continue, and the recalcitrant technicians should be tossed into the infirmary.

If you listen to Dyatlov, the Chernobyl disaster unfolds just the same as it did in history.

If you listen to the technicians and wind up tossing Dyatlov in the infirmary, what happens? Well, perhaps the technicians manage to fix the reactor. Perhaps they don't. But if they do, they won't get a medal. Powerful interests were invested in that test being completed on that night, and some unintelligible techno-gibberish from the technicians will not necessarily convince them that a disaster was narrowly averted. Heads will roll, and not the guilty ones.

This has broader implications that will be addressed later on, but while tossing Dyatlov in the infirmary would not have been enough to really prevent disaster, it seems like it would have worked on that night. To argue that the solution is not actually as simple as evicting Dyatlov is not the same as saying that Dyatlov should not have been evicted: to think something is seriously wrong and yet obey is hopelessly akratic.

But for now we move to a scenario more salvageable by individuals.

The Challenger

(Image caption: Roger Boisjoly, Challenger warner)

The Challenger disaster, like Chernobyl, was not unforeseen. Morton-Thiokol engineer Roger Boisjoly had raised red flags about the faulty O-rings that led to the loss of the shuttle and the deaths of seven people as early as six months before the disaster. For most of those six months, that warning, as well as those of other engineers, went unheeded. Eventually, a task force was convened to find a solution, but it quickly became apparent the task force was a toothless, do-nothing committee.

The situation was such that Eliezer Yudkowsky, leading figure in AI safety, held up the Challenger as a failure that showcases hindsight bias, the mistaken belief that a past event was more predictable than it actually was:

Viewing history through the lens of hindsight, we vastly underestimate the cost of preventing catastrophe. In 1986, the space shuttle Challenger exploded for reasons eventually traced to an O-ring losing flexibility at low temperature (Rogers et al. 1986). There were warning signs of a problem with the O-rings. But preventing the Challenger disaster would have required, not attending to the problem with the O-rings, but attending to every warning sign which seemed as severe as the O-ring problem, without benefit of hindsight.

This is wrong. There were no other warning signs as severe as the O-rings. Nothing else resulted in an engineer growing this heated the day before launch (from the obituary already linked above):

But it was one night and one moment that stood out. On the night of Jan. 27, 1986, Mr. Boisjoly and four other Thiokol engineers used a teleconference with NASA to press the case for delaying the next day’s launching because of the cold. At one point, Mr. Boisjoly said, he slapped down photos showing the damage cold temperatures had caused to an earlier shuttle. It had lifted off on a cold day, but not this cold.

“How the hell can you ignore this?” he demanded.

How the hell indeed. In an unprecedented turn, in that meeting NASA management was blithe enough to reject an explicit no-go recommendation from Morton-Thiokol management:

During the go/no-go telephone conference with NASA management the night before the launch, Morton Thiokol notified NASA of their recommendation to postpone. NASA officials strongly questioned the recommendations, and asked (some say pressured) Morton Thiokol to reverse their decision.

The Morton Thiokol managers asked for a few minutes off the phone to discuss their final position again. The management team held a meeting from which the engineering team, including Boisjoly and others, were deliberately excluded. The Morton Thiokol managers advised NASA that their data was inconclusive. NASA asked if there were objections. Hearing none, NASA decided to launch the STS-51-L Challenger mission.

Historians have noted that this was the first time NASA had ever launched a mission after having received an explicit no-go recommendation from a major contractor, and that questioning the recommendation and asking for a reconsideration was highly unusual. Many have also noted that the sharp questioning of the no-go recommendation stands out in contrast to the immediate and unquestioning acceptance when the recommendation was changed to a go.

Contra Yudkowsky, it is clear that the Challenger disaster is not a good example of how expensive it can be to prevent catastrophe, since all prevention would have taken was NASA management doing their jobs. It is important to note, though, that Yudkowsky's overarching point in that paper, that all sorts of cognitive biases cloud our thinking on existential risks, still stands.

But returning to Boisjoly. In his obituary, he was remembered as "Warned of Shuttle Danger". A fairly terrible epitaph. He and the engineers who had reported the O-ring problem had to bear the guilt of failing to stop the launch. At least one of them carried that weight for 30 years. It seems like they could have done more. They could have refused to be shut out of the final meeting where Morton-Thiokol management bent the knee to NASA, even if that took bloodied manager noses. And if that failed, why, they were engineers. They knew the actual physical process necessary for a launch to occur. They could also have talked to the astronauts. Bottom line, with some ingenuity, they could have disrupted it.

As with Chernobyl, yet again we come to the problem that even though eyebrow-raising (at the time) actions could have prevented the disaster, they could not have fixed the disaster-generating system in place at NASA. And as with Chernobyl: even so, they should have tried.

We now move on to a disaster where there wasn't a clear, but out-of-the-ordinary solution.

Beirut: Yet another way for problems to surface

It has been a year since the 2020 Beirut explosion, and still there isn't a clear answer on why the explosion happened. We have the mechanical explanation, but why were there thousands of tons of Nitropril (ammonium nitrate) in some rundown warehouse in a port to begin with?

In a story straight out of The Outlaw Sea, the MV Rhosus, a vessel with a convoluted 27-year history, was chartered by the Fábrica de Explosivos Moçambique to carry the ammonium nitrate from Batumi, Georgia to Beira, Mozambique. Due to either mechanical issues or a failure to pay tolls for the Suez Canal, the Rhosus was forced to dock in Beirut, where the port authorities declared it unseaworthy and forbade it to leave. The mysterious owner of the ship, Igor Grechushkin, declared himself bankrupt and left the crew and the ship to their fate. The Mozambican charterers gave up on the cargo, and the Beirut port authorities seized the ship some months later. When the crew were finally freed from the ship about a year after its detainment (yes, the crews of ships abandoned by their owners must remain aboard), the explosives were brought into Hangar 12 at the port, where they would remain until the blast six years later. The Rhosus itself remained derelict in the port of Beirut until it sank due to a hole in the hull.

During those years it appears that practically all the authorities in Lebanon played hot potato with the nitrate. Lots of correspondence occurred. The harbor master to the director of Land and Maritime Transport. The Case Authority to the Ministry of Public Works and Transport. State Security to the president and prime minister. Whenever the matter was not ignored, it ended with someone deciding it was not their problem or that they did not have the authority to act on it. Quite a lot of the people who knew actually did have the authority to act unilaterally on the matter, but the logic of the immoral maze (seriously, read that) precludes such acts.

At no point in this very slow explosion could disaster have been avoided by manhandling some negligent or reckless authority (erm, pretend that said "avoided via some lateral thinking"). Much like with Chernobyl, the entire government was guilty here.

What does this have to do with AI?

The overall project of AI research exhibits many of the signs of the discussed disasters. We're not currently in the night of Chernobyl: we're instead designing the RBMK reactor. Even at that early stage, there were Dyatlovs: they were the ones who, deciding that their careers and keeping their bosses pleased were most important, implemented, and signed off on, the design flaws of the RBMK. And of course there were, because in the mire of dysfunction that was the Soviet Union, Dyatlovism was a highly effective strategy. As in the Soviet Union, plenty of people in AI, even prominent people, are ultimately more concerned with their careers than with any long-term disasters their work, and in particular their attitude, may lead to. The attitude is especially relevant here: while there may not be a clear path from their work to disaster (is that so?), the attitude that the work of AI is, like nearly all the rest of computer science, not life-critical makes it much harder to implement regulations, whether external or internal, on precisely how AI research is to be conducted.

While better breeds of scientist, such as biologists, have had the "What the fuck am I summoning?" moment and collectively decided how to proceed safely, a similar attempt in AI seems to have accomplished nothing.

Like with Roger Boisjoly and the Challenger, some of the experts involved are aware of the danger. Just like with Boisjoly and his fellow engineers, it seems like they are not ready to do whatever it takes to prevent catastrophe.

Instead, as in Beirut, memos and letters are sent. Will they result in effective action? Who knows?

Perhaps the most illuminating thought experiment for AI safety advocates/researchers, and indeed, us laymen, is not that of roleplaying as a guard outside the control room at Chernobyl, but rather: you are in Beirut in 2019.

How do you prevent the explosion?

Precisely when should one punch the expert?

The title of this section was the original title of the piece. Though it was decided to dial that back a little, it remains as the title of this section, if only to serve as a reminder that the dial does go to 11. Fortunately, there is a precise answer to the question: when the expert's leadership or counsel poses an imminent threat. There are such moments in some disasters, but not all; Beirut is a clear example of a failure with no such critical moment. Should AI fail catastrophically, it will likely be the same as Beirut: lots of talk occurred in the lead-up, but some sort of action was what was actually needed. So why not do away entirely with such an inflammatory framing of the situation?

Why? Because we laymen need to develop the morale and the spine to actually make things happen. We need to learn from the Hutu:

Can you? Can I?

The pull of akrasia is very strong. Even I have a part of me saying "relax, it will all work itself out". That is akrasia, as there is no compelling reason to expect that to be the case here.

But what after we "hack through the opposition" as Peter Capaldi's The Thick of It character, Malcolm Tucker, put it? What does "hack through the opposition" mean in this context? At this early stage I can think of a few answers:

  1. There is such a thing as safety science, and there are leading experts in it. They should be made aware of the risk of AI, and of scientific existential risks in general, as it seems they could figure some things out. In particular, how to make certain research communities engage with the safety-critical nature of their work.
This sort of gibberish could be useful. From Engineering a Safer World.
  2. A second Asilomar conference on AI needs to be convened. One with teeth this time, involving many more AI researchers, and the public.
  3. Make it clear to those who deny or are on the fence about AI risk that the (not-so-great) debate is over, and it's time to get real about this.
  4. Develop and proliferate the antidyatlovist worldview to actually enforce the new line.

Points 3 and 4 can only sound excessive to those who are in denial about AI risk, or those to whom AI risk constitutes a mere intellectual pastime.

Though these are only sketches. We are indeed trying to prevent the Beirut explosion, and just like in that scenario, there is no clear formula or plan to follow.

This Guide is highly speculative. You could say we fly by the seat of our pants. But we will continue, we will roll with the punches, and we will win.

After all, we have to.



Dialogue on anti-induction.

27 августа, 2021 - 02:21
Published on August 26, 2021 11:21 PM GMT

My uncle, with whom I shared thoughts on anti-induction, remarked that humans are systematically anti-inductive in some situations: he gave the example of gambling, where people can think that losing a lot in a row means they are poised to win soon.
But this is not a fair example in my opinion, because gamblers are not consciously anti-inductive: when their behavior is exposed as such, they do not defend their decision.
Among my relatives, the gamblers are notoriously irrational. A Bayesian might say that a long streak of wins is very weak evidence that they will keep on winning, because they have a strong prior confidence in the mathematical analysis of the game, but that hardly tells us anything about how anti-induction arose in the first place.

The following dialogue is intended to showcase a (moderately) intelligent anti-inductor in action, to try to understand the anti-inductors by putting myself in their shoes.


Alice is an anti-inductor. She intuitively believes that things that have happened often typically don't happen again.
Aware of that fact, and of the existence of inductors, she has tried to look into anti-induction to know what it really means to her, and if it is an intuition she should abandon.

She has a friend, Iris, who is an inductor. They are both tentatively rational, as intelligent as each other (and coincidentally about as dumb as me).

Iris: See, when using induction, I have often been right, and you have often been wrong using anti-induction.
Alice: Then your induction probably tells you that this means induction is right and anti-induction is wrong... How interesting.
I: I am aware of that, and I can accurately model your view, including your model of me, and it's reciprocal, and I know you know I know... Let's accept that and move on.
A: We are very similar, in that our goals and cognitive patterns are the same, except when it comes to induction. We also have shared knowledge. I accept on these grounds that I might be wrong about anti-induction, with a strong negative emotional bias of course.
I: So do I. I care about you and want to make you recognize the good of induction for your own well-being. By symmetry, we will search for a non-(anti-)inductive argument to ground (anti-)induction, and we will try our best not to implicitly found our arguments on (anti-)induction.
A: How about... We accept there are inductors and anti-inductors. Let's approximate being right with fitness. This approximation is reasonable because anti-induction is used for very basic things, such as the sun not shining tomorrow. I expect inductors to die from starvation when they falsely believe that eating will feed them after it has done so a thousand times. Likewise, you expect anti-inductors to die from starvation after refusing to eat when they are hungry.
I: Exactly so! Now, look: we have diverging conclusions about the state of the world. Let us observe it and crown one of the two competing theories!

Were I a non-inductive spirit, I would have perhaps no reason to fill my fictional world with inductors rather than anti-inductors (thereby supporting one side a priori), but I am not.
This does mean however that the following reasoning is only valuable insofar as it modifies my behavior relative to the real world and not a mere possible one (in some possible worlds, I would not update on counterfactually fictional evidence).

A: We have observed without bias in samples or experiments, and it has been shown that everyone on fictional-Earth except for me is an inductor. Wow, I did NOT expect that!
I, triumphantly: You know the bias of being surprised at your own failures. Now, be fair, accept to change your mind, and join the inductive side of the force!
A, thoughtfully: You could have worded it differently... Is there a meaningful difference between what you said and "Now, because of the laws of rationality, you must change your mind etc."? In that case, I will argue that the laws of rationality have had this implication every time a similar reasoning followed a similar experiment... Which seems to prove that we cannot think in the same way this time!
I: There are no "laws of rationality", there are only the actual laws of rationality! How could correct methods of thinking change from one experiment to the other?
A: Well, it has definitely never happened before.
I: The fact that anti-induction supports a contradictory proposition does not mean that the original one relies on induction, in general or in this particular case.
A: Although I don't see why what you just said would be true, it sounds very reasonable, so let's accept it and continue our research.
I: What is there to continue thinking about? We have shown that induction improves fitness, and we have previously agreed that the fittest heuristic would be the truest.
A: Yes, but that is only a true fact about the past. Induction has managed to prevail until now (rather, until the moment we observed the people of the world), but how do we know it will remain the best heuristic tomorrow? Or in five minutes? Heuristics like this one are as fundamental as modus ponens, of the kind that does not change on small scales of time, and they tend to confer invariant general fitness (dis)advantages.
A : (As a side note, we both feel like it's more than a mere behavioral heuristic, but rather a logical truth, but we cannot find any supportive reasoning.)
I : So what ?
A: So I expect this particular one to be unlike all the other ones. This is anti-induction all right, but is it false?
I, after a moment of reflection: You used very vague wording, and perhaps you think there's no flaw because you are confused. Can you detail your proposition? Perhaps tabooing words like "other" and "particular".
A: So, I expect anti-induction to be true tomorrow, since induction was true until five minutes ago, because (anti-)induction belongs to the reference class of fundamental behavioral heuristics, like the will to survive or the will to reproduce. The kind of things we expect every single living being to have, because anything lacking them is simply unlikely to be alive.
I: Geez... I'm still not sure it's not just confusion, but I see why I can't spot a mistake: (anti-)induction is messy, does not yield logical certainty, and uses ill-defined reference classes. Using arbitrary classes can make a committed (anti-)inductor believe anything; there are just too many potential reference classes with interesting properties out there.
A: And yet our reference classes are not arbitrary. We would reject a reference class used for induction comprising gems that were blue before today and green after. My point stands very strongly if "fundamental behavioral heuristics" is a legitimate class.

None of my attempts to formalize induction in formal logic led to interesting results. Please tell me if you know a paper on the subject!
So far, my best definition is that there is an interest function over propositions (personal?) and interesting reference classes, defined as {x : P(x)} for an interesting property P. Induction states that if A is an interesting reference class,

(∀x∈A, O(x) ⇒ Q(x)) → (∀x∈A, Q(x))

where Q is interesting and O(x) means "x is observed", assuming a meaningful "observed" property.

I: This is starting to become too subjective, so I'll try a new approach. You said earlier that inductors should die of starvation. Please tell me, Alice, how is it that you managed to survive until now?
A: I had not thought about the absurdity of eating. From now on, I will not eat anymore when I am hungry.
I: Funny how the thought did not occur to you before... Perhaps your system 2 is anti-inductive, and your system 1 is inductive?
A: Perhaps, but we have little evidence in favor of that; it's just a wild conjecture, perhaps motivated by your desire to prove anti-induction wrong. I am fictional anyway, so this is not evidence that there is no rational anti-inductor.
I: Tell me, Alice, how is it you believe the word "induction" is still used with the same meaning it had at the beginning of our conversation?
A: By the double-barrelled jumping jiminetty and the flying spaghetti monster!

Alice refused to partake further in the debate, troubled. Thank the lord she did not think about the fact that she had used anti-induction until now! She would then have legitimately expected to stop believing her truths at any moment, without any explanation as to why! Fated to be wrong, like someone who receives overwhelming evidence that they are a Boltzmann brain is fated to stop thinking! In that sorry case, she would weep for her soon-to-be-lost love of truth and, digging further, her soon-to-be-lost consistent utility function (for she followed one until now).

I: Now, that is a devastating argument from my viewpoint: that anti-induction leads to systematic uncertainty. Unfortunately, it is conditional on the fact that observedly-unchanging truth is unchanging, which I suspect is an inductive system 1 reasoning.
I: I predict that anti-inductors will fail to think; anti-inductors incapable of self-reflection would take me up on that bet, while anti-inductors capable of self-reflection will fail to think, like Alice. Now, I just need to observe the world and see whether people actually think consistently, right? Or maybe bet a lot of fitness on that?
I : Another potential argument is the fact that Alice cannot explain how she managed to avoid starvation until now. Perhaps induction allows me to explain more facts of the world than anti-induction ? The big issue there is a posteriori explanation, since it may not be enough to deduce knowledge of the future without using (anti-)induction.

I : I sure hope my reasoning was not too motivated by the fact that I actually believe induction to be superior to anti-induction... As long as Alice was there, at least she could compensate by being equally motivated to defend anti-induction, but what would she answer to the above arguments ?


Amyloid Plaques: Chemical Streetlight, Medical Goodhart

27 августа, 2021 - 00:25
Published on August 26, 2021 9:25 PM GMT

Alzheimer's Disease (AD) is truly, unduly cruel, and truly, unduly common. A huge amount of effort goes into curing it, which I think is a true credit to our civilization. This is in the form of both money, and the efforts of many of the brightest researchers.

But it hasn't worked.

Since AD is characterised by amyloid plaques, the "amyloid hypothesis" that these were the causative agent has been popular for a while. Mutations to genes which encode the amyloid beta protein can cause AD. Putting lots of amyloid into the brain causes brain damage in mice. So for many years, drugs were screened by testing them in mutant mice which were predisposed to AD. If the plaques disappeared, they were considered good candidates.

So why didn't it work?

Lots of things can affect amyloid plaques, as it turns out, right up to the latest FDA-approved drug, which is just an antibody that targets amyloid protein. While it does reduce amyloid, it has no effect on cognitive decline.

Goodhart's law has reared its head: amyloid plaque buildup is a metric for AD progression, but selecting for drugs which reduce it causes the relationship between AD and plaques to fall apart.
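This selection effect can be seen in a toy simulation. The numbers and the two-mechanism setup below are purely hypothetical, not a model of real pharmacology: some candidate drugs hit an upstream disease process (helping both plaques and cognition), while most merely clear plaques downstream. Screening on plaque reduction then yields a pool whose plaque scores no longer predict cognitive benefit.

```python
import random

random.seed(1)

def make_drug():
    # Hypothetical: only ~10% of candidates act on the upstream cause,
    # which is what actually produces a cognitive benefit.
    upstream = random.random() < 0.1
    plaque_reduction = random.uniform(0.3, 1.0)
    cognitive_benefit = plaque_reduction if upstream else 0.0
    return plaque_reduction, cognitive_benefit

drugs = [make_drug() for _ in range(1000)]

# Screen on the proxy metric: keep drugs that strongly reduce plaques.
selected = [d for d in drugs if d[0] > 0.8]

# Among the survivors, strong plaque reduction is nearly universal, yet the
# average cognitive benefit stays low: most survivors were selected for the
# proxy alone, so the proxy-outcome link has been screened away.
mean_benefit = sum(c for _, c in selected) / len(selected)
print(len(selected), mean_benefit)
```

Selecting on the proxy guarantees good proxy scores but does almost nothing for the outcome we actually care about, which is the Goodhart failure described above.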

Equally, amyloid plaques are very easy to measure in mouse (and human) brains. It can be done by MRI scan, or by dissection. Memory loss and mood changes are harder to measure, and harder still in mice. The methods for measuring amyloid plaques also feel better in many ways: there's less variation between potential methods, they can be compared across species, they're quantitative, and they're more in line with what the average biologist/chemist is used to.

Understanding these, we can see how looking for drugs which decrease amyloid plaques in mice just really feels like productive research. We can also understand, now, why it wasn't.

Avoiding Wasted Effort

Pointing out biases is fairly useless. Pointing out specific examples is better. But the best way to help others is to point out how it feels from the inside to be making these mistakes.

So what does it feel like to be on the inside of these biases? Unfortunately as someone who has not been intimately involved in AD research I can't say exactly. But as someone involved with research in general I can make a guess:

  • Research will feel mostly productive. It may feel like you are becoming how you imagine a researcher to be. Papers will be published. This is because you're in the streetlight.
  • What you won't feel is a sense of building understanding. Learning to notice a lack of understanding is one of the most important skills, and it is sadly not an easy thing to explain.
  • Think about the possible results of your experiments. Do you expect something you've not seen before? Or do you expect a result with a clear path to success? Creative work usually passes the first. Well-established and effective protocols pass the second. Mouse AD models do not pass either (anymore).
  • A positive experimental result will be much easier than a "true" success. This has the benefit (for researchers) of allowing you to seem successful without actually doing good. The ratio of AD papers to AD cures is 1:0 ("Alzheimer's Disease Treatment" returns 714,000 results in Google Scholar)

Beyond this I do not know. Perhaps it is a nameless virtue. But it might be useful to try to identify more cases. I hereby precommit to posting a follow-up with at least five examples of this within the next seven days.


A brief review of The Scout Mindset

26 августа, 2021 - 23:47
Published on August 26, 2021 8:47 PM GMT

I've been reading blogs like Less Wrong for almost a decade. So then, a lot of what was said in this book wasn't new to me. However, I still really liked it.

I'm of the opinion that in order to deeply understand a topic, it's not just enough to understand it conceptually, you have to see lots and lots of examples of it, from many different angles. I felt like this book helped me with that. Despite the fact that I spend so much time reading rationality related blogs, The Scout Mindset still felt like it non-trivially deepened my understanding of the subject matter.

The stories that were told in this book were really good. They were a nice blend of appropriate, instructive, and engaging. I'm always impressed when books like these manage to do this. I spend too much of my time reading blogs and not enough of my time reading books. I often find myself similarly impressed by the quality of stories chosen in other books as well.

My main critique of The Scout Mindset is that, well, let me start with this. Early in the book, Julia pointed out that the difficulty isn't in knowing that you should do stuff like account for cognitive biases. It's in actually bringing yourself to do it!

My path to this book began in 2009, after I quit graduate school and threw myself into a passion project that became a new career: helping people reason out tough questions in their personal and professional lives. At first I imagined that this would involve teaching people about things like probability, logic, and cognitive biases, and showing them how those subjects applied to everyday life. But after several years of running workshops, reading studies, doing consulting, and interviewing people, I finally came to accept that knowing how to reason wasn't the cure-all I thought it was.

Knowing that you should test your assumptions doesn't automatically improve your judgement, any more than knowing you should exercise automatically improves your health. Being able to rattle off a list of biases and fallacies doesn't help you unless you're willing to acknowledge those biases and fallacies in your own thinking. The biggest lesson I learned is something that's since been corroborated by researchers, as we'll see in this book: our judgment isn't limited by knowledge nearly as much as it's limited by attitude.

So then, the question becomes, how do you actually get people to have a scout mindset? Julia's "approach has three prongs: 1) Realize that truth isn't in conflict with your other goals, 2) Learn tools that make it easier to see clearly, 3) Appreciate the emotional rewards of Scout Mindset". This is where my criticism comes in. I didn't really find that this approach was that effective. It didn't "tug on my heartstrings" enough.

To me, when I think about books that have moved me, they have told longer, deeper, more emotionally compelling stories. This often means that the book is fiction, but not always. It could be non-fiction. As an example of non-fiction, biographies come to mind. So does some long-form journalistic reporting. I think doing something like that in The Scout Mindset would have been more effective. Instead, Julia chose to use lots of smaller stories. By doing so, I think the book falls too close to "educational" on the spectrum from educational-to-inspirational.

That said, I do think that there is a place for a book that lives at that point on the spectrum. Not all books should be, say, at +8 points towards inspirational.

A related maybe-not-even-a-critique is that a book doesn't feel like the right medium for the goal of "inspire people to adopt a Scout Mindset". Something that is more social, interactive, and community-based seems like a better tool for that job. To her credit, Julia has in fact spent time tackling the problem from that angle in founding the Center for Applied Rationality, amongst other things. She also mentions various communities you can join online in the book such as the FeMRADebates or ChangeMyView subreddits, or the Effective Altruist community. And I'll mention Less Wrong again as another example.

Overall, I felt that The Scout Mindset was a pleasant read, and a book that did help me make progress as a scout. I don't feel like it was very much progress though, and I have thoughts about what could have been done instead to promote more progress. However, this is a super important topic, so any progress is valuable, and I'll happily take what I can get.


Signaling Virtuous Victimhood as Indicators of Dark Triad Personalities

26 августа, 2021 - 22:18
Published on August 26, 2021 7:18 PM GMT

A recent paper from the University of British Columbia describes five studies which, taken together, provide evidence that the tendency to engage in victimhood signalling is correlated across individuals with Dark Triad personality traits (Machiavellianism, narcissism, and psychopathy).

Results were robust in samples of the general US and Canadian population gathered on MTurk, as well as within a sample of Canadian undergraduates.

A link to the full study is provided in this post.


Introduction to Reducing Goodhart

26 августа, 2021 - 21:38
Published on August 26, 2021 6:38 PM GMT

I - Prologue

Two months ago, I wanted to write about AI designs that evade Goodhart's law. But as I wrote that post, I became progressively more convinced that framing things that way was leading me to talk complete nonsense. I want to explore why that is and try to find a different (though not entirely original, see Rohin et al., Stuart 1, 2, 3) framing of core issues, which avoids assuming that we can model humans as idealized agents.

This post is the first of a sequence of five posts - in this introduction I'll be making the case that we expect problems to arise in the straightforward application of Goodhart's law to value learning. I'm interested in hearing from you if you remain unconvinced or think of things I missed.

II - Introduction

Goodhart's law tells us that even when there are normally only small divergences between what we optimize for and our real standards, the outcome can be quite bad by our real standards. To use Scott Garrabrant's terminology from Goodhart Taxonomy, suppose that we have some true preference function V (for "True Values") over worlds, and U is some proxy that has been correlated with V in the past. Then there are several reasons given in Scott's post why maximizing U may score poorly according to V.
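One of the flavors in that taxonomy, regressional Goodhart, can be shown in a few lines. This is a minimal numeric sketch, not anything from the post being discussed: the proxy U is the true value V plus independent noise, so U and V track each other under mild selection, but the single U-maximizing candidate owes much of its score to noise.

```python
import random

random.seed(0)

# True values V, and a proxy U = V + noise that is correlated with V.
vs = [random.gauss(0, 1) for _ in range(10000)]
pairs = [(v + random.gauss(0, 1), v) for v in vs]  # (U, V)

# Mild selection on U: the top half by proxy still has decent true value.
top_half = sorted(pairs, reverse=True)[: len(pairs) // 2]
avg_v_top_half = sum(v for _, v in top_half) / len(top_half)

# Extreme optimization: the single best-by-proxy item. Its true value V
# falls well short of its proxy score U (regressional Goodhart).
best = max(pairs)
print(avg_v_top_half)
print(best[0], best[1])
```

The gap between the winner's U and its V is the divergence the law warns about; the harder you select on U, the larger the shortfall relative to what the proxy promises.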

But here's the problem: humans have no such V (see also Scott A., Stuart 1, 2). Inferring human preferences depends on:

  • what state the environment is in.
  • what physical system to infer the preferences of.
  • how to make inferences from that physical system.
  • how to resolve inconsistencies and conflicting dynamics.
  • how to extrapolate the inferred preferences into new and different contexts.

There is no single privileged way to do all these things, and different choices can give very different results. And yet the framing of Goodhart's law, as well as much of our intuitive thinking about value learning, rests on the assumption that the True Values are out there.


Goodhart's law is important - we use it all over the place (e.g. 1, 2, 3). In AI alignment we want to use Goodhart's law to crystallize a pattern of bad behavior in AI systems (e.g. 1, 2, 3, 4), and to design powerful AIs that don't have this bad behavior (e.g. 1, 2, 3, 4, 5, 6).  But if you try to use Goodhart's law to design solutions to these problems it has a single prescription for us: find V (or at least bound your error relative to it). Since there is no such V, not only is that advice useless, it actually denies the possibility of success.

The goal, then, is deconfusion. We still want to talk about the same stuff, the same patterns, but we want a framing of what-we-now-call-Goodhart's-law that helps us think about what successful AI could look like in the real world.

III - Preview of the sequence

We'll start the next post with the classic question: "Why do I think I know what I do about Goodhart's law?"

The obvious answers to this question involve talking about how humans model each other. But this raises yet more questions, like "why can't the AI just model humans that way?" This requires two responses: first, breaking down what we mean when we casually say that humans "model" things, and second, talking about the limitations of such models compared to the utility-maximization picture. The good news is that we can rescue some version of common sense, the bad news is that this doesn't solve our problems.

Next we'll take a deeper look at some typical places to use Goodhart's law that are related to value learning:

  • Curve fitting and overfitting.
  • Hard-coded utility functions.
  • Adversarial examples.
  • Hard-coded human models.

Goodhart's law reasoning is used both in the definition of these problems, and also in talking about proposed solutions such as quantilization. I plan to talk at excessive length about all of these details, with the object of building up pictures of our reasoning in these cases that never needs to mention the word "Goodhart" because it's at a finer level of magnification.

However, these pictures aren't all going to be consistent, because what humans think of as success or failure can depend on the context, and extrapolating beyond that context will bring our intuitions into conflict. Thus we'll have to revisit the abstract notion of human preferences and really hash out what happens (or what we think happens) at the boundaries and interchanges between human models of the world.

Finally, the hope is to conclude with some sage advice. Not a solution, because I haven't got one. But maybe some really obvious-seeming sage advice can tie together the concepts introduced in the sequence into something that feels like progress.

We'll see.


Narrative truth

26 августа, 2021 - 20:49
Published on August 26, 2021 5:49 PM GMT

One idea I encountered when investigating Jordan Peterson was the idea of narrative truth.

This is the kind of concept that most people nod along to, but which is almost always left implicit, so I thought it'd be worthwhile making it explicit here.

Let's quote what some other people have written on this subject:

Literal and Narrative Truth - Dave King

His story raises an interesting question, though – what does it mean for a memoir to be accurate? One of the largest issues we dealt with was the matter of dialogue.  He wanted to be absolutely scrupulous, telling stories precisely as they happened.  But in his original draft, his characters only spoke when he could remember what they said word-for-word.  Since the manuscript was written years after the fact, this meant he used very little dialogue – mostly bursts of highly memorable lines like, “I must live, I must tell!”  Nearly all the rest of his conversations were narrative summary, and many of his scenes felt flat and distant as a result.  He was telling the story to readers rather than letting them experience it.

I agreed with his absolute scruple about accuracy, but argued that he needed to focus on a different kind of accuracy – narrative accuracy rather than literal accuracy.  He needed to create dialogue that would make his readers feel the way he felt at the time.  This meant literally putting words in his characters’ mouths, even if those words conveyed the gist of a dialogue that actually happened.  But since the point of his narrative was to allow his readers to experience what he had experienced, the scenes with recreated dialogue were more accurate than the flat, emotionless scenes.

Many memoirists have taken this technique a step further and created composite characters.  For instance, in Dreams of my Father, President Obama’s “New York girlfriend” was actually an amalgam of several girlfriends he’d had in New York and Chicago. I’d argue that combining several minor characters into a single character who represents the type is another form of narrative accuracy.  If you had, for instance, several high school teachers who inspired you in similar ways, you could take the time to create each of them as a minor character.  But all these excess characters would do more than simply slow your narrative down.  By spending time on each teacher, you would give your readers the impression that your high school experiences meant more to you than they actually did.  The writing is strictly accurate, but the story as a whole is thrown off.

Storytelling and Narrative Truth

Society would remember the Holocaust differently if there were no survivors to tell the story, but only data, records and photographs. The stories of victims and survivors weave together the numbers to create a truth that is tangible to the human experience… The combination of the personal and narrative truth gives human context to the grainy black and white photos. As a result, the narrative truths combine with factual truth create a holistic picture of the Warsaw Ghetto and the Holocaust. This need for the narration of human experience seems innate...

Truth in Storytelling

It’s easier to understand important points when there’s a structure to follow. And it’s easier for us to remember—particularly if it is a lively and engaging piece.

If we over-simplify a story to fit it into a narrative arc, are we being truthful?

This gets us into an area where people start to see different shades of gray.

I think it helps to ask a few questions:

  • Am I leaving out key details because including them messes with the narrative flow?
  • Do I skip context because it makes the piece less compelling?
  • Am I framing anything in a way that makes the story look black and white when the reality is far more complex and nuanced?
  • Do I exclude facts and circumstances because they clutter the piece and may bore readers?

Why do we want to know the truth? Sometimes it's out of curiosity, sometimes for its own sake, but arguably the strongest reason is that it allows us to act effectively in the world.

However, acting effectively in the world isn't just about knowing true facts about it. The human brain is fundamentally a meaning-making machine. When we are exposed to new facts, these update our current narratives and frames, and it is usually via this indirect route that facts change how we live in the world.

Narratives built upon untrue facts can lead us down the wrong path, but the responsible use of artistic license often allows us to model the world more accurately than the plain, unvarnished truth does. Too much unnecessary detail confuses people, wears out their patience or interferes with the emotional impact. Simply throwing facts at someone is unlikely to be effective. Instead, you are much more likely to influence people if your communication style is comprehensible and engaging. And sometimes that requires some minor sacrifices in terms of literal accuracy.


For me the Christianity deal-breaker was meekness

26 августа, 2021 - 17:56
Published on August 26, 2021 2:25 PM GMT

I was raised Catholic, became agnostic around 13, stopped thinking an afterlife made sense at 15, and noticed I was no longer religious at 16.

But I still spent the next ~12 years in pretty close proximity to Christianity. I did religious studies at school, I studied philosophy and theology at university, and most importantly I sang in church choirs, which meant I was regularly attending services.

I also wasn't crazy about the label 'atheist', as I didn't think my beliefs had much in common with famous or 'new' atheists (Richard Dawkins, Christopher Hitchens, AC Grayling). I often found their objections uninteresting, as they dealt with claims concerning things like 'metaphysics' and 'theodicy' that most Christians I knew didn't really care about. For those Christians, religion was more of a spiritual commitment or a decision to live a certain way, and, for example, their conservative approach to romance seemed like it could be valid even if the Church couldn't explain why God lets evil exist.

But if what you're doing is following a set of practices that are at most loosely inferred from a set of community texts and a history of community tradition, why take the extra step of identifying as Christian? Why not just say 'hey there's some good stuff in here, and I'll join in with the good and leave the bad'? What makes you want to take that extra step?

Over the last 12 years I occasionally wondered whether I wanted to take that extra step again, until this year, when I realized any desire I had to do so was based in meekness.

Meek: Quiet, gentle, and easily imposed on; submissive.
- https://www.lexico.com/definition/meek

Meekness is praised in the Gospel:

Blessed are the meek,
  for they will inherit the earth.
- Matthew 5:5

Not to mention that in the most important story in the whole Bible, the Son of God gives himself up *for death* without putting up a fight. He knows he will be betrayed but he waits for his captors to come and arrest him.

Meekness is a big deal for Christianity. And it shouldn't be. Meekness is not a virtue.

In the dictionary definition of meekness above, 'quiet, gentle' sounds fine; 'easily imposed on' and 'submissive', not so much. Meekness is when you don't stand up and ask for what you need, or it is when you allow what you need to be taken from you. This is a disaster for your personal wellbeing, and perhaps even worse when it comes to looking out for your community.

One of my clearest memories of Catholic school is sitting at a breakfast table watching three kids bully a fourth. The fourth kid had recently returned from a suspension for writing something rude in a guest book; it was pretty clear he had been pressured if not coerced into doing it. The supervising adult at the table would have known all this, quite apart from the events happening right in front of him. But he was very deliberately staring straight ahead and into the distance, focussing his energy on ignoring what was happening.

And if you think meekness is a virtue, why wouldn't you do what that adult did? If you actually value not standing up for yourself, what hope is there for the people around you who might be counting on you to stand up for them?

Unfortunately, Christian communities that break this pattern are in my experience the exception, not the rule. We had Martin Luther King Jr., but even he wrote the following in his Letter from Birmingham Jail:

When I was suddenly catapulted into the leadership of the bus protest in Montgomery, Alabama, a few years ago, I felt we would be supported by the white church. I felt that the white ministers, priests and rabbis of the South would be among our strongest allies. Instead, some have been outright opponents, refusing to understand the freedom movement and misrepresenting its leaders; all too many others have been more cautious than courageous and have remained silent behind the anesthetizing security of stained glass windows. 
- https://letterfromjail.com/


Why did I wonder if I wanted to join the Church again? I realized I had been missing the sensation from childhood of being 'easily imposed upon, submissive', and letting a greater authority determine a larger part of what I believed.

The world is scary, and as a kid there's a lot of comfort in that kind of a relationship. I can't be the only one who was taken in by it.

But I'm not a kid any more, and as soon as I recognized the roots of what I was feeling, I also recognized it was neither healthy nor acceptable to me. Thanks, but no thanks, I'll be an impartial observer of religion from now on.

If you're reading this and you are a Christian, here are my challenges to you:

  • Don't let meekness be a recruiting mechanism to your Church. Deferring responsibility in the face of difficulty is sometimes necessary, but doing it as part of a fundamental, lifelong commitment is no way to live.
  • Respect yourself. Can you treat yourself with kindness and value your own wishes and desires? Or will you find some excuse for letting them go unfulfilled?
  • Look out for others. Prove to the world that you are not meek. You're committed to loving your neighbour, so whose basic human dignity are you going to stand up and defend, even in the face of opposition?
  • Decide what you believe on your own terms, not on your priest's. (If the conclusion that you reach is that you agree with him, then good for you.)

n.b. I've encountered the term 'protest atheism' which does a decent job of capturing where I am now. A new atheist like Dawkins would say 'God doesn't exist', whereas a protest atheist might say 'this isn't how we should live'.


Covid 8/26: Full Vaccine Approval

26 августа, 2021 - 16:00
Published on August 26, 2021 1:00 PM GMT

Great news, everyone. The Pfizer vaccine has been approved. Woo-hoo!

It will be marketed under the name Comirnaty. Doh! 

(Do we all come together to form one big comirnaty? Or should you be worried about the comirnaties of getting vaccinated, although you should really be orders of magnitude more worried about the comirnaties of not getting vaccinated? Did things comirnaty or was there a problem? Nobody knows. Particle man.)

My understanding is that if a doctor were to prescribe the vaccine ‘off label,’ say to give to an 11 year old or to get someone an early booster shot, then they could potentially be sued for anything that went wrong, so in practice your doctor isn’t going to do this. 

A reasonable request was made that my posts contain Executive Summaries given their length. Let’s do it!

Executive Summary of Top News You Can Use
  1. Pfizer vaccine approved under the name Comirnaty. 
  2. Vaccines still work. If you have a choice, Moderna > Pfizer but both are fine. 
  3. Boosters are still a good idea if you want even better protection. 
  4. Cases approaching peak.

Also, assuming you’re vaccinated, Krispy Kreme is offering two free donuts per day from August 30 until September 5. 

Now that that’s out of the way, let’s run the numbers.

The Numbers

Predictions

Prediction from last week: 1,000,000 cases (+14%) and 8,040 deaths (+45%).

Results: 935k cases (+7%) and 7,526 deaths (+35%).

Prediction for next week: 950k cases (+2%) and 9,400 deaths (+25%).

I was confused how there could be such sharp peaks in other countries. It looks like we won’t get one of those. The trend lines seem clear, and it looks like we are approaching the peak. It would be surprising if we were still seeing increases week over week by mid-September, with the obvious danger that things could pick up again once winter hits.
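The week-over-week percentages in the predictions above are simple to reproduce; here is a minimal sketch (the function name is mine, and the totals are this week's and last week's case and death counts; exact figures differ slightly from the post's depending on rounding):

```python
def wow_change(prev, curr):
    """Week-over-week change, as a percentage of the previous week."""
    return 100 * (curr - prev) / prev

# Totals from the last two weekly updates.
cases = wow_change(872_182, 935_926)   # roughly +7%
deaths = wow_change(5_545, 7_526)      # roughly +36% (the post reports +35%)

print(f"cases: {cases:.1f}%, deaths: {deaths:.1f}%")
```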

Deaths

Date           WEST  MIDWEST  SOUTH  NORTHEAST  TOTAL
Jul 1-Jul 7     459      329    612        128   1528
Jul 8-Jul 14    532      398    689        145   1764
Jul 15-Jul 21   434      341    732        170   1677
Jul 22-Jul 28   491      385   1009        157   2042
Jul 29-Aug 4    693      477   1415        304   2889
Aug 5-Aug 11    705      629   2181        234   3749
Aug 12-Aug 18   912      851   3394        388   5545
Aug 19-Aug 25  1281     1045   4692        508   7526
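As a quick sanity check, the regional counts in each row of the deaths data should sum to the reported total; a minimal sketch for the most recent week (values copied from the post's data):

```python
# Most recent week (Aug 19-25) of the regional deaths data.
row = {"West": 1_281, "Midwest": 1_045, "South": 4_692, "Northeast": 508}
total = 7_526

assert sum(row.values()) == total
print("regional deaths sum to the reported total")
```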

Deaths continue to lag cases. News was slightly good, so adjusting expectations slightly in response. Peak should still be a month out or so.

Cases

Date             WEST  MIDWEST    SOUTH  NORTHEAST    TOTAL
Jul 1-Jul 7    27,413   17,460   40,031      7,065   91,969
Jul 8-Jul 14   45,338   27,544   68,129     11,368  152,379
Jul 15-Jul 21  65,913   39,634  116,933     19,076  241,556
Jul 22-Jul 28  94,429   60,502  205,992     31,073  391,996
Jul 29-Aug 4  131,197   86,394  323,063     48,773  589,427
Aug 5-Aug 11  157,553  110,978  409,184     66,686  744,401
Aug 12-Aug 18 183,667  130,394  479,214     78,907  872,182
Aug 19-Aug 25 188,855  152,801  502,832     91,438  935,926

Vaccination Statistics

How much will full FDA approval matter? Survey says not much.

I am more hopeful than this, and expect more than a 10% increase. Some of this will be people for whom this really was the true rejection. Other parts will come as mandates are handed down and people anticipate further mandates. 

Vaccine Effectiveness

I continue to find this very telling in terms of vaccine effectiveness versus Delta:

The argument is simple. The Delta-specific vaccines have been designed and would be easy to get approved, yet there has been no move to manufacture them quickly. The only reasonable explanation for this is that there isn't actually much if any difference with the old vaccine. Or at least, that's what the pharma companies, which have every financial incentive acting against this, are revealing they believe.

A new paper on vaccine effectiveness concurs (preprint).

I will for now accept the principle that a single dose provides substantially less protection against Delta than Alpha, but this is another data point that Delta isn't different from Alpha once you get your second shot. I always find the 'confidence intervals overlapped, so nothing here' reaction to differences like 67% vs. 79% maddening. Yes, you can't be confident in that, but that's mostly saying your study was underpowered, since that's the kind of difference one would expect if there was a difference, and again, the word evidence does not mean what they think it means. 

The paper’s findings then get worse, if you believe them, claiming rapid reduction in effectiveness over time.

They then go on to say this, which given how vaccinations were timed seems likely to be confounding indeed:

There’s no one good money quote on it, but the findings robustly say that vaccinated people’s cases tend to be lower viral load, less dangerous and less severe.

Looking at their section on statistical analysis, they’re doing some of the necessary and reasonable things but I can’t tell if it’s enough of them. Such studies are better than nothing if treated with caution, and this seems like a relatively well-done one, but I’m still more focused on the population numbers and what makes the models work out. 

When I see things like this:

My core reaction was that the very idea of a 22% decline in vaccine effectiveness per month doesn't make any mathematical sense, until I figured out it meant a 22% increase in vaccine ineffectiveness. As in, if you are 99% effective in month one, and then you have a 22% 'decline in effectiveness', you would be… 98.8% effective. Or if you were 95% before, you're 94% now. Which doesn't sound to me like a 22% decline in effectiveness, even if true. 
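To make the ineffectiveness reading concrete, here's a minimal sketch of the arithmetic (the function name and interpretation are mine, following the paragraph above):

```python
def after_decline(effectiveness, monthly_growth=0.22):
    """Read a '22% decline in effectiveness' as the ineffectiveness
    (1 - effectiveness) growing by 22% over the month."""
    ineffectiveness = 1 - effectiveness
    return 1 - ineffectiveness * (1 + monthly_growth)

print(after_decline(0.99))  # roughly 0.988, i.e. still 98.8% effective
print(after_decline(0.95))  # roughly 0.939, i.e. about 94% effective
```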

Israeli data continues to suggest extreme fading of vaccine effectiveness if you look at it naively, along with yet another reason to, as the post puts it, proceed with caution. 

New data from Denmark:

One presumes that the improvement against hospitalization in Pfizer is a data artifact or failure to control for something or some such, which shows how easy it is to get misleading results, especially since infection went the other way. And this big Pfizer versus Moderna difference against Alpha isn’t found elsewhere, which makes me think that once again there’s confounding going on all over the place.

Here’s a thread analyzing some of the results, and takes the declining protections and other study data fully seriously, putting the burden of proof on finding something specific that is wrong with the studies, and otherwise taking their results and details seriously and forming the model around that. As usual, the broader context of what such results would mean for all the other data we see isn’t incorporated – but again, I don’t see anyone doing that.

Here’s another good long thread explaining what vaccine effectiveness means then listing lots of different findings and real world results. Putting them all together like that makes it striking how much the different numbers don’t agree if you take them all at face value. 

I continue to think that the decline in vaccine effectiveness over time is in large part a mirage, and for practical purposes the decline is relevant but small. 

This week’s representations of how those vaccines are doing, after having vaccinated about 70% of adults and most of the elderly.

Virginia offers a dashboard:

Doesn’t look like vaccines are losing effectiveness.

Houston, via PoliMath:

And another:

And another:

That’s disappointing at face value since it’s only a 90% reduction in deaths but after correcting for age it would look a lot better. Weird that so much of the vaccine advantage here seems to be coming after hospitalization. 

A worry is that the studies are selecting for ways to show vaccinated people are at risk, and another worry is that the real world statistics being reported are selecting for showing that the vaccines are super effective, because they are the same information but the Official Story is on two contradictory propaganda tracks and is pretending not to notice that this is a physical world question with a correct answer (whether or not we are confident we know what it is). 

Anecdotal in Tampa, Florida


Here in New York:

Meanwhile also this:

Note that yes, we are excluding the first wave infections here as per her follow-up note, but note the graph and adjust accordingly, and I think the point stands.

That does bring up that UK cases are clearly rising again, so we can no longer use that as an important signpost that things will turn around rapidly and that will be that. If anything, it’s now making the case that such a turnaround is unlikely. I don’t know of anyone who has offered an explanation other than a shrug for the decline followed by a reversal here.

As for the reinfections versus vaccine effectiveness, my hypothesis is that this is not a case of ‘immunity from infection holds up but vaccine immunity is losing ground.’ Remember when we were worried that natural immunity faded with time but vaccines solved that problem? The actual difference is in the methods of observation. When similar observational methods are used, we seem to get similar results. 

How infectious are breakthrough cases? We now have two studies for that. They found that vaccinated people who get infected are still infectious, but their viral loads are substantially lower, so this was what we previously expected. And also they clear the virus faster, which was also expected. 

Weirdly, they’re two different studies that find the two different results, although depending on how you measure, fading quickly implies lower average viral loads, so the results are compatible with the graph and it’s possible what we’re seeing is a shorter period of infectiousness rather than less at the peak. That seems unlikely to be the whole effect to me, but could easily be the majority of the benefit. 

How much comfort that brings depends on the situation and on what you previously believed. If you're as bad at this as the CDC, and were saying the vaccines 'prevent transmission' full stop and are now saying they 'don't prevent transmission' full stop, it gets confusing. 

Vaccine Hesitancy and Mandates

Formal approval is in, so here… we… go.

I saw this about one minute after I saw the FDA had approved the Covid vaccine, perhaps someone planned something in advance for once:

On her first day on the job now that The Worst Is Over, our new governor lays down the law:

She also raised New York’s total death count by 12k, which once again highlights that maybe Cuomo went down in a similar way to Al Capone (who was indeed guilty of tax evasion).

Although she’s also mandating ‘ethics seminars’ so you win some and you lose some.

LSU is going to mandate vaccination or a negative test for all fans at Tiger Stadium. 

Whereas the University of Georgia is going the other way.

Here’s the owner of the Dallas Cowboys:

Who else we got (WaPo)? They found CVS Health, Deloitte and Disney, but so far, not an impressive set of additional mandates. It seems not many were standing by ready to go. 

Delta Airlines is charging unvaccinated employees $200 a month extra for health insurance, on the very reasonable premise that every hospital stay for Covid costs them an average of $50,000 and they end up in the hospital for Covid more often. Insurance companies can’t do this, but it seems corporations employing you can do it. 

NYPD has threatened to sue if the city attempts to implement a mandate.

Texas Governor once again mandates against vaccine mandates, this time ensuring it applies despite FDA approval.

When you’re fully anti-vax, you’re anti-vax, and it’ll be hard to tell you different, as Donald Trump learned:

Others are less fully anti-vax, but still unvaccinated, thanks to various ways we botched things.

As Ranu notes there are two distinct things here. First, we botched the logistics, and could have done much better if we’d made sure to beware trivial inconveniences that aren’t always trivial. Second, our authorities are untrustworthy so people don’t trust them. This is framed here in the standard blue-tribe way as ‘the system fails such people and they remember the legacy of all that’ with it being ‘hard to make up’ during a pandemic, rather than the simple ‘these people lied about the pandemic over and over again’ model. Both are presumably relevant, but my guess is that handling the pandemic in a trustworthy fashion would have largely solved both problems. Yes, such people will absolutely ask why you weren’t helping them before, but that’s different from turning your help down if you’re here now.

One aspect of vaccination decisions is that patients in America do not pay for their health care. Almost everyone who can get it has health insurance, because if you don't the medical system bills you personally and attaches some number of extra zeros to the bill because they can, so you can't opt out. For a while, they even waived 'cost sharing' on Covid, so you didn't even pay the fraction you normally pay, but that's increasingly no longer true. Would be good if more people knew. Incentives matter, but only if people are aware of them. One could note that this policy could be taken farther, if the government permitted it, so we're doing mandatory mandates with one hand and mandatory massive subsidies to those who don't follow those mandates with the other. 

State employees, you will get vaccinated as many times as is legal, or else.

This is an explicit ‘everything that is not forbidden is mandatory, and everything that is not mandatory is forbidden’ rule. You can get exactly this many shots at exactly these times, and you either get them or you’re fired. There’s no concept of a booster that is optional, based on someone’s situation, and the full mandate applies to teleworkers.

This is where things are going to be tricky. Requiring ‘full vaccination’ so far has been simple. You get two shots and that’s it. Now there are signs that this in many places is going to morph into getting periodic boosters, with different places having different requirements (at a minimum nations; Austria and Croatia are already setting expiration dates), and those boosters will have a much less slam-dunk risk-benefit profile. 

I will happily take the third shot without any need for outside incentives, but it is a very reasonable position to not want the third one, and it seems likely that requiring boosters will have far less robust support than requiring two shots. 

A cheap shot, but I think a necessary one so putting it here anyway, without any need for further comment.

One can definitely say shots fired:

Masking, Testing and NPIs

This is nuts, actively counterproductive on every level, and what must be fought against:

To be fair, it is only required ‘when social distancing is not possible,’ most of the time this will definitely apply, and I assure you that it’s always possible. 

It’s always adorable when people think the constitution is a meaningful limiting factor, and all recursive mandate sentences are fun.

In practice, this is technically true, but there is a known way around it known as withholding federal funding. And another easier way around it, called ignoring the constitution, since presidents mostly do what they want without any actual legal authority under the constitution and mostly no one calls them out on it. Eviction moratorium, anyone? 

If you’re in NYC and either old or immunocompromised, make sure you know about this:

You can also buy one at the pharmacy, although not like in Europe where the tests are super cheap and abundant. FDA Delenda Est.

Also, a periodic reminder that the reason younger children can’t get vaccinated, which in practice is causing super massive freakouts although there’s almost zero risk there, is that the FDA moved the goalposts to require additional data. Thus we almost certainly won’t get this before the end of 2021, and I’d double check but this market sure looks a lot like free money.  

Here’s a graph of how afraid people have been over time:

The lack of an increase in fear over the winter surge is the most surprising thing here. Otherwise it all makes sense, with fear going down when things were improving, then fear starting to go back up as cases rise. Fear isn’t a perfect proxy for the private control system, but changes in fear likely predict marginal changes in private actions and we’re back at levels similar to April.

Here’s a survey on activity:

As one would expect by now, vaccinated people are taking more precautions than unvaccinated people. Almost half of vaccinated people are ‘avoiding people as much as possible’ and they’re claiming it’s because of the pandemic. However I share Nate’s skepticism here minus the word ‘little’ because math:

Perhaps ‘as much as possible’ means until one is hungry, or has somewhere to go. It’s on the margin. 

Study does some modeling and finds that according to its model masks work, ventilation works even better.

From the study:

Filters win out here over windows, if one has to choose, and of course if possible you’d do both. Also you can’t cheat on the windows, you gotta actually leave them open. When we’re considering actions like mask mandates or shutting down living life entirely I find it odd that people worry about energy costs this much, but there you go. Also fresh air remains a Nice Thing. As always, one must be highly skeptical when translating such results into predictions for actually preventing cases. 

A potential issue with price controls:

An argument against wearing masks on the margin, and a good question about presenting that argument.

I found the tweet more compelling than the full post. Getting into the details mostly highlighted places I disagreed with Bryan.

Booster Shots

Governor of Texas gets a third shot as a booster. I have no issue with people in high positions getting superior medical treatment when there’s a supply or resource shortage, but meanwhile we have vaccines expiring in some places. That’s from Scott’s post with further comments on the topic of FDA Delenda Est, which is interesting but inessential. 

The new argument against booster shots is that they… might cause us to produce too many antibodies against Covid, and then maybe Covid mutates and the antibodies become dangerous or unhelpful because they’re overtrained? When it’s not Officially Sanctioned even antibodies are labeled bad, it seems. Meanwhile this is doubtless supposed to make people worried about Delta, but this worry definitely does not apply to Delta, and an additional customized booster would be necessary in the cases being described either way. Don’t worry, such arguments will go away once the Official Sanction comes down, which is coming soon.

Meanwhile, an argument for booster shots is that the first two doses were so close together that they count as a primary immunization, claiming it looks like this:

Which is so insane it doesn’t even bother putting any impact from the second shot into the chart at all, and puts the peak of the ‘primary’ response more than halfway down the graph when it’s almost fully effective. There’s obvious nonsense available on all sides.

Think of the Children

We really do have a large class of Very Serious People, with a lot of influence on policy and narrative, who think that living life is not important, that the things you care about in life are not important, and that our future is not important, because saying the word ‘safety’ or ‘pandemic’ should justify anything. 

This week’s case in point, and like my source MR I want to emphasize that this is not about the particular person in question here.

If anything, I’d like to thank Dr. Murray for being so clear and explicit. If you think that safety trumps the need for love, for friends and for living a complete life in general, then it’s virtuous to say that outright, so no one is confused. 

In case you think she doesn’t mean that (or that others don’t mean that), no really, she does:

Ellie Murray does not believe that school is terrible, so she is simply saying that the claimed benefits of school are not important relative to the marginal impact of schools on Covid-19. 

That reply was one voice in a chorus, as the replies are what you’d expect and rather fun to read through. Nate Silver sums this up well:

There was also a side debate over whether school is the future of our children and our children are our future, or the alternate hypothesis that children are also people and school is a prison and dystopian nightmare. The thing to remember is that this view is not driving most of the anti-school rhetoric. Such folks mostly think school is vital to children, but don’t care.

Yes, I was aware, and I’d rank my concerns regarding school in this order:

  1. Kids going to school. School is a prison and a dystopian nightmare.
  2. Kids not going to school. Remote school as implemented is somehow so much worse.
  3. Getting Covid. I’d rather not get Covid.

But yeah, we can beat that take this week, because The Times Is On It:

Technically I’m sure it is true that masks represent an ‘educational opportunity’ in the sense that whenever anything happens you can use it as an opportunity to learn. The main such opportunity is to learn about those making the decisions.

In Other News

Have you tried using a market clearing price? No? Well, then.

I strongly agree that lawn care is a terrible use of water when there’s a limited supply, but the way we figure such things out in a sane world is we charge more money for water and, if desired or needed, give people a credit to avoid distributional concerns. Yes, I know, don’t make me tap the sign, go write on the blackboard, etc etc.

Biden still hasn’t appointed anyone to head the FDA, but at least he floated a name. The name is someone who said that living past 75 is a waste, but hey, pobody’s nerfect, right?

Obama literally hired a doctor to ensure everyone was vaccinated and safe and his party was still a huge issue, so now everyone in Washington is afraid to throw parties. Also for other reasons, I’d imagine, but those are beyond scope.

A calculation of whether the benefits of exercise in a gym exceed the risk of Covid finds that it very much does in her case. Often the choice really is between going to the gym or not exercising. Her calculation did depend on the lack of other people in the gym, however, so if the gym had sufficiently more people in a tight space the calculation could have gone the other way. She has a spreadsheet you can play around with if you’d like to explore this more.
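The structure of such a calculation can be sketched as a tiny microCOVID-style model. Everything below (the function, the default rates, the prevalence figure) is an invented illustration of how such a spreadsheet works, not her actual numbers:

```python
# Hypothetical microCOVID-style estimate of infection risk from one gym session.
# Every number here is an illustrative assumption, not data from her spreadsheet.

def session_risk(prevalence, n_people, hours,
                 hourly_transmission=0.14, ventilation_factor=0.25,
                 vaccine_factor=0.1):
    """Rough probability of catching Covid from one visit."""
    # Chance that at least one other person present is infectious.
    p_infectious_present = 1 - (1 - prevalence) ** n_people
    # Crude per-visit transmission probability given an infectious person.
    exposure = hourly_transmission * hours * ventilation_factor * vaccine_factor
    return p_infectious_present * min(exposure, 1.0)

# Near-empty gym vs. a crowded one, assuming 0.5% community prevalence.
empty = session_risk(prevalence=0.005, n_people=2, hours=1)
crowded = session_risk(prevalence=0.005, n_people=30, hours=1)
print(f"near-empty: {empty:.6f}, crowded: {crowded:.6f}")
```

The point of the structure is that the answer flips on occupancy: with two other people the risk is tiny, while a packed room multiplies the chance an infectious person is present by an order of magnitude.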

Denmark gives up on the mystical ‘herd immunity.’ Usual misunderstandings here but I suppose this is better than the practical alternative of not giving up.

Thread reminding us that the control system has many facets, and they work together at least additively and often multiplicatively. You don’t need any one factor to control the virus or get you mystical ‘herd immunity’ on its own, you care about the combined effects. 
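The multiplicative point can be made concrete with a toy calculation. The starting R and the reduction fractions below are made-up numbers chosen only to show the structure, not estimates from the thread:

```python
# Illustrative sketch: independent measures each cut transmission by a modest
# fraction, and their effects on R multiply. All numbers are invented.

def combined_r(r0, reductions):
    """Effective reproduction number after stacking independent reductions."""
    r = r0
    for fraction in reductions:
        r *= (1 - fraction)
    return r

# e.g. masks, ventilation, partial vaccination, test-and-isolate
measures = [0.30, 0.25, 0.20, 0.15]

best_single = combined_r(2.5, [max(measures)])  # ~1.75: not nearly enough alone
together = combined_r(2.5, measures)            # ~0.89: combined, R drops below 1
print(best_single, together)
```

No single measure comes close to controlling the virus on its own, but stacked together they push R under 1, which is the thread's point about combined effects.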

Zeynep reminds us that plastic barriers are likely to be net harmful because they interfere with airflow. I got this one wrong early on, same as everyone else. The key is to update.

Monoclonal antibodies are free and effective against Covid, but few people are getting them (WaPo).

From MR: You can get flown home if you get Covid while abroad, but you’ll need a special service.

Germany moves to using hospitalizations as the primary measure of whether Covid is under control. This makes sense for policy, since what matters is whether the hospitals are overwhelmed and whether people are sick and dying. 

Australian stockpile of AZ continues to grow, over 6 million doses (via MR).

Australians who are vaccinated overseas can register that vaccination, but only if the vaccine was approved in Australia at the time of vaccination. Which was not a rapid process.

I like how transparently the ‘at the time’ restriction is purely harmful. No fig leaf.  

Also via MR, due to continued Covid restrictions down under, they shot dogs due to be rescued by a shelter to prevent shelter workers from travelling to pick them up. Meanwhile, they’ve uncovered people getting fresh air. It’s becoming an epidemic of fresh air getting after 200 days in lockdown. 

But good news, if you’re fully vaccinated, you’re about to get new freedoms!

So, how do you think Australia did, all things considered?

Poison control is lonely work. Not many people call, and when they do, it’s usually something like ‘I took prophylactic Ivermectin that was intended for animals, thinking that was a good idea.’ We have some news.

General warning for anyone who needs it: Animal formulations of a given medicine are often different from the human version, and could be highly dangerous to humans. Do not perform this regulatory arbitrage assuming that the two things are the same. 

Also, this:

They didn’t know the two things were different, and it’s a perfectly reasonable hypothesis that a thing could be vastly cheaper and easier to get if you can do an end run around the FDA, or around pharmacists earning praise for refusing to fill prescriptions for Ivermectin. This simply was not one of those times.

Also note the numbers. One individual was told to ‘seek further evaluation,’ and 85% of the cases were mild. The definition of ‘mild’ can be whatever people want it to be, but if it’s ‘no need to seek further evaluation’ it seems like there were six poison control calls out of eight total calls? I’m guessing it’s higher than that, and please if you decide to take Ivermectin make sure you’re sourcing and dosing it safely and properly, but this isn’t an epidemic of cases, and this was going around enough it felt important to point that out, even if I’m highly skeptical that Ivermectin does anything useful. 

Rob Bensinger offers his advice on what to personally do about Covid. Not endorsed.

Inessential but fun case of an elected official saying very much the wrong thing.

Not Covid

Reminder, purely about actual cars:

Remember that if you own an Oculus, and your Facebook account gets suspended because of reasons (such as saying facts that contradict local health authorities) you will lose all your games and save data permanently, no refunds, no fixes. Might want to consider a secondary Facebook account for this purpose, unless you’re using your Oculus to recover your Facebook account, which is also a thing.

In Scott’s recent post, he reckons with his struggle to avoid mistakes despite the need to quickly produce a lot of content. I have this problem as well, and last week failed to check something I should have checked. My solution so far has essentially been to state my epistemic confidence in my statements, and to carefully put conditionals on statements that I haven’t verified. So last week I wrote “I am not aware of any X,” and it turns out there are a bunch of common Xs that I really should have known about already, and should have checked even though I didn’t know. But I did know I hadn’t checked, so I wrote that I wasn’t aware. I ended up editing the paragraph (on pregnancy) a few times. There wasn’t anything false when I wrote it, but once the issue was pointed out it obviously needed to be fixed quickly. This occasionally happens. There are also occasional typos, broken links and other stupid mistakes, and occasionally one of the sources turns out to be fake, as was the case with a British account a while back.


Pair Debugging and applied rationality

26 августа, 2021 - 14:37
Published on August 26, 2021 11:37 AM GMT

Pair debugging is a staple of applied rationality. Humans aren't really all that smart on their own. Our intelligence is distributed. You don't get very far by thinking by yourself.

Before we start, I'll teach a rationality technique. These techniques can be used explicitly as a structured conversation, or they can be applied freestyle whenever.

Minimum prep: bring a bugs list! This is a list of n things that you would like to improve about your life. Here are some examples:

- "I often don't fall asleep quickly enough on evenings before workdays"
- "I want to work for an early phase startup for my next job, but I don't know how to get in contact with startups in that phase"
- "Despite being poly I seem to have a mental block around flirting even when there is clear interest from the other side"
- etc

Optional prep: read some of the CFAR handbook at https://www.rationality.org/resources/handbook

Rough schedule:

14:00 - welcome
± 15:00 - meditation session
± 15:15 - lecture & practice
± 16:30 - done
± 19:00 - grab some dinner together
± 22:00 - closing

Come whenever, but try not to come during the meditation.


Bangalore, India – ACX Meetups Everywhere 2021

26 августа, 2021 - 13:20
Published on August 26, 2021 12:56 AM GMT

This year's ACX Meetup everywhere in Bangalore, India.

Location: Cubbon Park band stand – ///firework.corkscrew.shelter

Contact: w0074@outlook.com


Learning can be deciding

25 августа, 2021 - 19:22
Published on August 25, 2021 4:22 PM GMT

Consider the following questions:

A: "Will it rain in Paris at midnight tonight?"

B: "Will I blink in the next 5 seconds?"

If I wish to find the answer to question A, I might need to look at the conditions of the atmosphere over Paris and then use knowledge of how climate evolves. I might need readings of temperature, wind speed, humidity, and pressure, and I might also need complex mathematical models of the weather.

If I wish to find the right answer to question B, I can just decide to blink now.

There is an activity that I would roughly describe as "figuring out what the world is like." This activity is to be understood as a collective endeavour, not a personal one. Simple acts of information gathering, like opening a box to see what is inside, are examples of this activity. More impressive examples are Science and History, systematic disciplines that evolve across generations and produce large bodies of knowledge. The end goal of this activity might be a theory of everything.

I often hear people (particularly academics) speak as if figuring out what the world is like must always look like the process described above for answering Question A: science-y, focused on gathering data, distilling models or regularities, and applying them to particular cases.

However, Question B illustrates that in some cases, figuring out what the world is like simply consists in deciding, or at least it includes a decision. This occurs when we pose questions about ourselves or about changes in the environment that occur as a consequence of our actions. For example: what will the temperature of this room be in the next 10 minutes? That depends on whether I will open the windows, turn on the AC, or set the furniture on fire.


(apologies for Alignment Forum server outage last night)

25 августа, 2021 - 17:45
Published on August 25, 2021 2:45 PM GMT

The Alignment Forum was down between approx 1:30AM and 7:00AM PDT last night due to what seems to be a memory leak issue (postmortem ongoing). We're setting up some additional monitoring to ensure this doesn't happen again.

Apologies for any inconvenience experienced!


How to turn money into AI safety?

25 августа, 2021 - 13:49
Published on August 25, 2021 10:49 AM GMT

Related: Suppose $1 billion is given to AI Safety. How should it be spent?, EA is vetting-constrained, What to do with people?


I have heard through the grapevine that we seem to be constrained - there's money that donors and organizations might be happy to spend on AI safety work, but that isn't being spent because of certain bottlenecks - perhaps talent, training, vetting, research programs, or research groups are in short supply. What would the world look like if we'd widened some of those bottlenecks, and what are local actions that people can do to move in that direction? I'm not an expert on either the funding or the organizational side, but hopefully I can leverage Cunningham's law and get some people more in the know to reply in the comments.

Of the bottlenecks I listed above, I am going to mostly ignore talent. IMO, talented people aren't the bottleneck right now, and the other problems we have are more interesting. We need to be able to train people in the details of an area of cutting-edge research. We need a larger number of research groups that can employ those people to work on specific agendas. And perhaps trickiest, we need to do this within a network of reputation and vetting that makes it possible to selectively spend money on good research without warping or stifling the very research it's trying to select for.

In short, if we want to spend money, we can't just hope that highly-credentialed, high-status researchers with obviously-fundable research will arise by spontaneous generation. We need to scale up the infrastructure. I'll start by taking the perspective of individuals trying to work on AI safety - how can we make it easier for them to do good work and get paid?

There are a series of bottlenecks in the pipeline from interested amateur to salaried professional. From the individual entrant's perspective, they have to start with learning and credentialing. The "obvious path" of training to do AI safety research looks like getting a bachelor's or PhD in public policy, philosophy, computer science, or math (for which there are now fellowships, which is great), trying to focus your work towards AI safety, and doing a lot of self-study on the side. These programs are often an imprecise fit for the training we want - we'd like there to be graduate-level classes that students can take that cover important material in AI policy, technical alignment research, the philosophy of value learning, etc.

Opportunity 1: Develop course materials and possibly textbooks for teaching courses related to AI safety. This is already happening somewhat. Encourage other departments and professors to offer courses covering these topics.

Even if we influence some parts of academia, we may still have a bottleneck where there aren't enough departments and professors who can guide and support students focusing on AI safety topics. This is especially relevant if we want to start training people fast, as in six months from now. To bridge this gap, it would be nice to have training programs, admitting people with bachelor's- or master's-level skills, at organizations doing active AI safety research. Like a three-way cross between internship, grad school, and AI Safety Camp. The intent is not just to have people learn and do work, but also to help them produce credible signals of their knowledge and skills, over a timespan of 2-5 years. Not just being author number 9 out of 18, but having output that they are primarily responsible for. The necessity of producing credible signals of skill makes a lot of sense when we look at the problem from the funders' perspective later.

Opportunity 2: Expand programs located at existing research organizations that fulfill training and signalling roles. This would require staff for admissions, support, and administration.

This would also provide an opportunity for people who haven't taken the "obvious path" through academia, of which there are many in the AI safety community, who otherwise would have to create their own signalling mechanisms. Thus it would be a bad outcome if all these internships got filled up with people with ordinary academic credentials and no "weirdness points," as admissions incentives might push towards. Strong admissions risk-aversion may also indicate that we have lots of talent, and not enough spots (more dakka required).

Such internships would take nontrivial effort and administrative resources - they're a negative for the research output of the individuals who run them. To align the incentives to make them happen, we'd want top-down funding intended for this activity. This may be complicated by the fact that a lot of research happens within corporations, e.g. at DeepMind. But if people actually try, I suspect there's some way to use money to expand training+signalling internships at corporate centers of AI safety research.

Suppose that we blow open that bottleneck, and we have a bunch of people with some knowledge of cutting-edge research, and credible signals that they can do AI safety work. Where do they go?

Right now there are only a small number of organizations devoted to AI safety research, all with their own idiosyncrasies, and all accepting only a small number of new people. And yet we want most research to happen in organizations rather than alone: Communicating with peers is a good source of ideas. Many projects require the efforts or skillsets of multiple people working together. Organizations can supply hardware, administrative support, or other expertise to allow research to go smoother.

Opportunity 3: Expand the size and scope of existing organizations, perhaps in a hierarchical structure. Can't be done indefinitely (will come back to this), but I don't think we're near the limits.

In addition to increasing the size of existing organizations, we could also found new groups altogether. I won't write that one down yet, because it has some additional complications. Complications that are best explored from a different perspective.


If you're a grant-making organization, selectivity is everything. Even if you want to spend more money, if you offer money for AI safety research but have no selection process, a whole bushel of people are going to show up asking for completely pointless grants, and your money will be wasted. But it's hard to filter for people and groups who are going to do useful AI safety research.

So you develop a process. You look at the grantee's credentials and awards. You read their previous work and try to see if it's any good. You ask outside experts for a second opinion, both on the work and on the grantee themselves. Et cetera. This is all a totally normal response to the need to spend limited resources in an uncertain world. But it is a lot of work, and can often end up incentivizing picking "safe bets."

Now let's come back to the unanswered problem of increasing the number of research organizations. In this environment, how does that happen? The fledgling organization would need credentials, previous work, and reputation with high-status experts before ever receiving a grant. The solution is obvious: just have a central group of founders with credentials, work, and reputation ("cred" for short) already attached to them.

Opportunity 4: Entice people who have cred to found new organizations that can get grants and thus increase the amount of money being spent doing work.

This suggests that the number of organizations can only grow exponentially, through a life cycle where researchers join a growing organization, do work, gain cred, and then bud off to form a new group. Is that really necessary, though? What if a certain niche just obviously needs to be filled - can you (assuming you're Joe Schmo with no cred) found an organization to fill it? No, you probably cannot. You at least need some cred - though we can think about pushing the limits later. Grant-making organizations get a bunch of bad requests all the time, and they shouldn't just fund all of them that promise to fill some niche. There are certainly ways to signal that you will do a good job spending grant money even if you utterly lack cred, but those signals might take a lot of effort for grant-making organizations to interpret and compare to other grant opportunities, which brings us to the "vetting" bottleneck mentioned at the start of the post. Being vetting-constrained means that grant-making organizations don't have the institutional capability to comb through all the signals you might be trying to send, nor can they do detailed follow-up on each funded project sufficient to keep the principal-agent problem in check. So they don't fund Joe Schmo.

But if grant-making orgs are vetting-constrained, why can't they just grow? Or if they want to give more money and the number of research organizations with cred is limited, why can't those grantees just grow arbitrarily?

Both of these problems are actually pretty similar to the problem of growing the number of organizations. When you hire a new person, they need supervision and mentoring from a person with trust and know-how within your organization or else they're probably going to mess up, unless they already have cred. This limits how quickly organizations can scale. Thus we can't just wait until research organizations are most needed to grow them - if we want more growth in the future we need growth now.

Opportunity 5: Write a blog post urging established organizations to actually try to grow (in a reasonable manner), because their intrinsic growth rate is an important limiting factor in turning money into AI safety.

All of the above has been in the regime of weak vetting. What would change if we made grant-makers' vetting capabilities very strong? My mental image of strong vetting is grant-makers being able to have a long conversation with an applicant, every day for a week, rather than a 1-hour interview. Or being able to spend four days of work evaluating the feasibility of a project proposal, and coming back to the proposer with a list of suggestions to talk over. Or having the resources to follow up on how your money is being spent on a weekly basis, with a trusted person available to help the grantee or step in if things aren't going to plan. If this kind of power was used for good, it would open up the ability to fund good projects that previously would have been lost in the noise (though if used for ill it could be used to gatekeep for existing interests). This would decrease the reliance on cred and other signals, and increase the possible growth rate, closer to the limits from "talent" growth.

An organization capable of doing this level of vetting blurs the line between a grant-making organization and a centralized research hub. In fact, this fits into a picture where research organizations have stronger vetting capabilities for individuals than grant-making organizations do for research organizations. In a growing field, we might expect to see a lot of intriguing but hard-to-evaluate research take place as part of organizations but not get independently funded.

Strong vetting would be impressive, but it might not be as cost-effective as just lowering standards, particularly for smaller grants. It's like a stock portfolio - it's fine to invest in lots of things that individually have high variance so long as they're uncorrelated. But a major factor in how low your standards can be is how well weak vetting works at separating genuine applicants from frauds. I don't know much about this, so I'll leave this topic to others.
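The portfolio analogy can be made concrete with a toy simulation. The payoff distribution below is entirely invented; the point is only that the average across many uncorrelated high-variance grants is far more predictable than any single grant:

```python
# Toy model of the grant-portfolio argument. Assumed payoff distribution:
# 10% of grants return 20x their cost, the rest return nothing (invented numbers).
import random

random.seed(0)

def grant_payoff():
    return 20.0 if random.random() < 0.10 else 0.0

def portfolio_return(n_grants):
    """Average payoff per dollar across n uncorrelated grants."""
    return sum(grant_payoff() for _ in range(n_grants)) / n_grants

# A handful of grants is a lottery ticket; thousands converge near the 2.0 mean.
print(portfolio_return(10))
print(portfolio_return(10_000))
```

Any one grant is almost always a total loss, but the large portfolio reliably lands near its expected value, which is why low standards on small grants can be fine if the failures are uncorrelated.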

The arbitrary growth of research organizations also raises some questions about research agendas (in the sense of a single, cohesive vision). A common pattern of thought is that if we have more organizations, and established organizations have different teams of people working under their umbrellas, then all these groups of people need different things to do, and that might be a bottleneck. And that what's best is when groups are working towards a single vision, articulated by the leader, and if we don't have enough visions we shouldn't found more organizations.

I think this picture makes a lot of sense for engineering problems, but not a lot of sense for blue-sky research. Look at the established research organizations - FHI, MIRI, etc. - they have a lot of people working on a lot of different things. What's important for a research group is trust and synergy; the "top-down vision" model is just a special case of synergy that arises when the problem is easily broken into hierarchical parts and we need high levels of interoperability, like an engineering problem. We're not at that stage yet with AI safety or even many of its subproblems, so we shouldn't limit ourselves to organizations with single cohesive visions.


Let's flip the script one last time - if you don't have enough cred to do whatever you want, but you think we need more organizations doing AI safety work, is there some special type you can found? I think the answer is yes.

The basic ingredient is something that's both easy to understand and easy to verify. I'm staying at the EA Hotel right now, so it's the example that comes to mind. The concept can be explained in about 10 seconds (it's a hotel that hosts people working on EA causes), and if you want me to send you some pictures I can just as quickly verify that (wonder of wonders) there is a hotel full of EAs here. But the day-to-day work of administrating the hotel is still nontrivial, and requires a small team funded by grant money.

This is the sort of organization that is potentially foundable even without much cred - you promise something very straightforward, and then you deliver that thing quickly, and the value comes from its maintenance or continuation. When I put it that way, now maybe it sounds more like Our World In Data's covid stats. Or like 80kh's advising services. Or like organizations promising various meta-level analyses, intended for easy consumption and evaluation by the grant-makers themselves.

Opportunity 6: If lacking cred, found new organizations with really, extremely legible objectives.

The organization-level corollary of this is that organizations can spend money faster if they spend it on extremely legible stuff (goods and services) rather than new hires. But as they say, sometimes things that are expensive are worse. Overall this post has been very crassly focusing on what can get funded, not what should get funded, but I can be pretty confident that researchers give a lot more bang per buck than a bigger facilities budget. Though perhaps this won't always be true; maybe in the future important problems will get solved, reducing researcher importance, while demand for compute balloons, increasing costs.

I think I can afford to be this crass because I trust the readers of this post to try to do good things. The current distribution of AI safety research is pretty satisfactory to me given what I perceive to be the constraints, we just need more. It turned out that when I wrote this post about the dynamics of more, I didn't need to say much about the content of the research. This isn't to say I don't have hot takes, but my takes will have to stay hot for another day.

Thanks for reading.

Thanks to Jason Green-Lowe, Guillaume Corlouer, and Heye Groß for feedback and discussion at CEEALAR.


Sam Altman at the AstralCodexTen Online Meetup

25 августа, 2021 - 11:07
Published on August 25, 2021 8:07 AM GMT

Sam Altman, CEO of OpenAI, is joining us again. Almost a year ago we had a great meeting with Sam, with record attendance!

Our meetup on Sunday, September 5, 2021, will start off with a Q&A. You can ask questions on AI, AGI, OpenAI, or any other topics.

After that, we will socialize in video-chat rooms.

The meetup starts at 10:30 AM Pacific Daylight Time, 17:30 UTC, 20:30 Israel Daylight Time.

Please register here, and we'll send you an invitation closer to the time.