LessWrong.com News

A community blog devoted to refining the art of rationality

Splitting Debate up into Two Subsystems

Published on July 3, 2020 8:11 PM GMT

In this post I first recap how debate can help with value learning and why a standard debater optimizes for convincingness. I then illustrate how two subsystems could help with value learning in a similar way without optimizing for convincingness. (Of course, this new system could have its own issues, which I don't analyse in depth.)


Debate serves to get a training signal about human values

Debate (for the purpose of AI safety) can be interpreted as a tool to collect training signals about human values. Debate is especially useful when we don’t know our values or their full implications and we can’t just verbalize or demonstrate what we want.

Standard debate set-up

Two debaters are given a problem (related to human values) and each proposes a solution. The debaters then defend their solution (and attack the other’s) via a debate. After having been exposed to a lot of arguments during the debate, a human decides which solution seems better. This judgement serves as the ground truth for the human’s preference (after being informed by the debaters) and can be used as a training signal about what they really want. Through debate we get question-answer pairs which can be used to train a preference predictor.
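
As a rough sketch of how judged debates could become training data for a preference predictor, assuming each debate is logged as a simple record (the `DebateRecord` type, its field names, and `to_training_pairs` are hypothetical, not part of any existing debate implementation):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebateRecord:
    """One judged debate (hypothetical logging format)."""
    question: str
    solution_a: str
    solution_b: str
    judge_picked_a: bool  # the human's verdict after hearing the debate


def to_training_pairs(records):
    """Turn judged debates into (question, preferred, rejected) triples."""
    pairs = []
    for r in records:
        if r.judge_picked_a:
            preferred, rejected = r.solution_a, r.solution_b
        else:
            preferred, rejected = r.solution_b, r.solution_a
        pairs.append((r.question, preferred, rejected))
    return pairs
```

Triples of this shape are one common input format for training a model to predict which of two answers a human would prefer; the post itself does not commit to a particular format.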

A debater optimizes for convincingness

An agent will optimize for the goal it is given. In the case of a debater, the agent is optimizing to be judged positively by a human. This means that the main incentive of a debater is to maximise convincingness (potentially by using misleading arguments).

Encouraging the debater to be truthful

The current debate proposal aims to shift this incentive towards truthfulness by empowering both debaters to expose deception in the other debater’s arguments (through cross-examination). (Other methods of disincentivizing deception exist or may exist in the future as well.) I think this kind of disincentive may be like a band-aid, when what we want is to prevent the wound in the first place.

A simplistic model of what a debater optimizes for

A debater first comes up with a potential solution. After it has committed to a solution, it tries to convince the judge (through debate) that this is the solution that the judge values most highly. Initially, the AI may reason about what the human’s preference is, but then it tries to persuade the human that the AI’s guess is in fact the human’s preference. In reality, these two steps are actually intertwined: A debater tries to find a solution that will be easiest to convince the human of.

What we want from a debater

What we want the system to do is: “Allow us to state our preferences in domains where we previously couldn’t”. We won’t know how to find a desirable solution to some problems, and sometimes we won’t even know how to choose between two given solutions. In these cases, we would like an AI to help us find desirable solutions and help us discover our preferences.

An alternative system that does not aim to convince

A debater helps us discover our preferences by proposing a solution and then helping us reason about the solution (via a debate). However, these functions need not be performed by one and the same agent. Instead, we could train a different agent for each of these components:

  1. an option generator, which gives us proposals for potential solutions (in domains in which we can assess them); and
  2. an epistemic helper, which helps us to deliberate about options (expands the domains in which we can assess options).

We may not always need both subsystems

In some cases, an epistemic helper could be so effective that it empowers us to generate good solutions ourselves. In such cases, we don’t need the option generator. In other cases, we may be capable of assessing different options right away (even though we can’t generate them ourselves). In these cases, we don’t need the epistemic helper.

Option generator

An option generator tries to come up with solutions that we would pick. To some extent, this is a preference predictor. The agent receives a problem and predicts what solution the human would propose or pick. This agent is trained by receiving many questions (about values) answered by a human. This agent should not interact with the human whose values it’s predicting and ideally has no means of influencing their values.

Epistemic helper

The second agent would help us reason about our values. We need this agent to empower us to answer questions that we by ourselves don’t know the answers to. It should be rewarded for how much it helped us reason.

[I don’t have a good proposal for how to build an epistemic helper, but will explain what I mean by giving some examples.]

An epistemic helper similar to a debater

Consider a debate in which each debater has to defend a potential solution proposed by the value predictor, but instead of being judged on how convincing their case was, they are judged on how insightful (illuminating and helpful) their utterances seemed. This would lead to different behaviour than optimizing for convincingness. For example, consider a debate about going on a holiday to Bali or Alaska. Suppose the human does not have a passport to go to Bali, but could get an expedited passport. In the standard debate set-up, the advocate for Bali will not bring up the lack of a passport, because the debater knows there is a counter-argument. However, it could be useful for the human to know that they should get an expedited passport if they want to go to Bali. Unfortunately, by default, an agent that is rewarded for insightfulness would optimize for eliciting the feeling of being helped in the judge, rather than for actually helping the judge.

Epistemic helpers dissimilar to debaters

Epistemic helpers could, of course, take many other forms as well. Off the top of my head, an epistemic helper could look like an agent that behaves like a therapist and mostly guides the human to introspect better; acts like a teacher; or produces visualizations of information, such as Bayesian networks or potential trajectories.

Rewarding epistemic helpers

Examples of how we could reward epistemic helpers:

  • Judged based on the human experience, i.e. the judge feels like they understand the world better. (This is similar to how debate is currently judged.)
  • Based on if the human became more capable. We could:
    • Test the human on how well they predict events (in the real world or in simulations)
    • Test how well people predict personal evaluations of happiness
      • We could test how well people can predict their own evaluations of happiness. However, this could lead to self-fulfilling prophecies.
      • Predict other people’s evaluation.
    • Test people on how good they are at finding solutions to problems. (Often, even if a solution is difficult to find, given a potential solution it is easy to test whether it is correct or not.)
    • Also see “Understanding comes from passing exams”.
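
To make the "test how well the human predicts events" option concrete, here is a minimal sketch that scores a human's probability forecasts before and after working with the helper and rewards the improvement. The Brier score is a standard accuracy measure for probabilistic forecasts; using it here, and the function name `helper_reward`, are assumptions made for illustration rather than part of the proposal.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes.

    Lower is better; 0.0 is a perfect forecaster.
    """
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally long, non-empty forecast and outcome lists")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def helper_reward(forecasts_before, forecasts_after, outcomes):
    """Reward = how much the human's Brier score dropped after being helped."""
    return brier_score(forecasts_before, outcomes) - brier_score(forecasts_after, outcomes)
```

A positive reward means the human's forecasts got more accurate on the test events; a negative reward means the helper made them worse.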

A potential downside of using an epistemic helper is that we could get paralyzed by considerations, when we actually want a “one-handed economist”.


[I was inspired to write about splitting up debate into subsystems by a discussion between Joe Collman and Abram Demski during Web-TAISU. After writing I noticed a more refined idea is explained here.]

Thanks to Vojta and Misha for helpful comments and interesting discussions.




Idea: Imitation/Value Learning AIXI

Published on July 3, 2020 5:10 PM GMT


Note: I'm not writing with rigor

Michele Campolo proposes that goal-directed behavior is compressible, which gives a condition (not sufficient) for goal-directed behavior. Say ρ takes in data D from the environment and converts it into a reward function. One standard way of specifying ρ is to take D to be a set of expert demonstrations and to perform IRL on D, which yields a reward function. Running RL on that reward function then gives a policy, so ultimately we have,

GAIL=RL∘IRL

This follows from proposition 3.2 of the original GAIL paper. Thus, the compression bound Campolo presents becomes,

K(GAIL|D)≤K(RL|D)+K(IRL|D)

However, at this point I noticed that we could instead try to find the simplest policy π given the data. This would be a policy that best reproduces the structure of D in deployment. I'd presume this would be an imitation-learning equivalent of AIXI.

AIXI works by using Solomonoff Induction to apply Occam's Razor to environment observations. This is an a priori model of the environment that is then updated as AIXI interacts with the environment. It seems that you could define an imitation-learning version of AIXI (AIXIL) by weighting all possible policies with the Solomonoff version of Occam's Razor.
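
As a rough illustration of the weighting idea, here is a toy version over a small finite set of policies. This is not a real AIXI construction: Kolmogorov complexity is uncomputable, so the description lengths below are hand-assigned assumptions, and the policy names and the `posterior_over_policies` helper are hypothetical.

```python
from fractions import Fraction


def always_left(state):
    return {"left": Fraction(1)}


def copy_state(state):
    return {state: Fraction(1)}


def uniform(state):
    return {"left": Fraction(1, 2), "right": Fraction(1, 2)}


# (name, assumed description length in bits, policy function)
POLICIES = [
    ("always_left", 2, always_left),
    ("copy_state", 3, copy_state),
    ("uniform", 1, uniform),
]


def posterior_over_policies(policies, demonstrations):
    """Weight each policy by 2^-length times the likelihood of the demos.

    demonstrations is a list of (state, expert_action) pairs.
    """
    weights = {}
    for name, length_bits, policy in policies:
        weight = Fraction(1, 2 ** length_bits)  # Occam prior: shorter is likelier
        for state, action in demonstrations:
            weight *= policy(state).get(action, Fraction(0))
        weights[name] = weight
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no policy assigns positive probability to the data")
    return {name: weight / total for name, weight in weights.items()}
```

With demonstrations such as `[("left", "left"), ("right", "right"), ("left", "left")]`, the deterministic `copy_state` policy overtakes the shorter `uniform` policy because it explains the data better, while `always_left` is ruled out entirely.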

Using Campolo's bound, the result would perform at, or above, the level of a value learning approach (even an AIXVL approach). This seems to imply that value learning isn't really fundamental to creating an AI that models our behavior. I'd assume someone has worked this out further than I did here. Comments or references?




Research ideas to study humans with AI Safety in mind

Published on July 3, 2020 4:01 PM GMT

Premise

Recently I spent some time thinking about ways in which studying the human side of human-machine systems would be beneficial to build aligned AIs. I discussed these ideas informally and people seemed interested and wanted to know more. Thus, I decided to write a list of research directions for studying humans that could help solve the alignment problem.

The list is non-exhaustive. Also, the intention behind it is not to argue that these research directions are more important than any other but rather to suggest directions to someone with a related background or personal fit in studying humans. There is also a lot of valuable work in AI Strategy that involves studying humans, which I am not familiar with. I wrote this list mostly with Technical AI Safety in mind.

Human-AI Research Fields

Before diving into my suggestions for studying humans with AI Safety in mind, I want to mention some less well-known research fields that study the interactions between human and AI systems in different ways, since I reference some of these below. Leaving aside the usual suspects of psychology, cognitive science and neuroscience, other interesting research areas I came across are:

Cybernetics

A “transdisciplinary” approach defined by Norbert Wiener in 1948 as "the scientific study of control and communication in the animal and the machine". It is currently mostly used as a historical reference and a foundational reading. However, there is growing work in integrating cybernetics concepts in current research.

Human-AI Interaction

Human-Computer Interaction (HCI) is an established field dating back to the 70s. It “studies the design and use of computer technology, focused on the interfaces between people and computers”. Human-AI Interaction is a recently established sub-field of HCI concerned with studying specifically the interactions between humans and “AI-infused systems”.

Computational Social Science

“Using computers to model, simulate, and analyze social phenomena. It focuses on investigating social and behavioural relationships and interactions through social simulation, modelling, network analysis, and media analysis”

Collective Intelligence

Defined as “the enhanced capacity that is created when people work together, often with the help of technology, to mobilise a wider range of information, ideas, and insights”

Artificial Social Intelligence

Which some define as “the domain aimed at endowing artificial agents with social intelligence, the ability to deal appropriately with users’ attitudes, intentions, feelings, personality and expectations”

Research ideas to study humans with AI Safety in mind

1 - Understand how specific alignment techniques interact with actual humans

Many concrete proposals for AI Alignment solutions, such as AI Safety via Debate, Recursive Reward Modelling or Iterated Distillation and Amplification, involve human supervision. However, as Geoffrey Irving and Amanda Askell argued, we do not know what problems may emerge when these systems interact with real people in realistic situations. Irving and Askell suggested a specific list of questions to work on. The list is primarily aimed at the Debate technique, but knowledge gained about how humans perform with one approach is likely to partially generalize to other approaches (I also recommend reading the LessWrong comments on their paper).

Potentially useful fields: Cognitive science, Human-AI Interaction.

2 - Demonstrate where factored cognition and evaluation work well

Factored cognition and evaluation refer to mechanisms to address open-ended cognitive tasks by breaking them down (or factoring) into many small and mostly independent tasks. Note that the possibly recursive nature of this definition makes it hard to reason about the behaviour of these mechanisms in the limit. Paul Christiano already made the case for better understanding factored cognition and evaluation when describing what Ought is doing and why it matters. Factored cognition and evaluation are major components of numerous concrete proposals to solve outer alignment, including Paul’s. It therefore seems important to understand the extent to which factored cognition and evaluation work well for solving meaningful problems. Rohin Shah and Buck Shlegeris mentioned that they would love to see more research in this direction for similar reasons and also because it seems plausible to Buck that “this is the kind of thing where a bunch of enthusiastic people could make progress on their own”.
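
The recursive structure described above can be sketched as a generic decompose, solve, and combine loop. This is only a toy illustration on a word-counting task: the function `solve_factored` and its parameters are hypothetical names, and real factored-cognition experiments route the sub-tasks to humans or models rather than to simple functions.

```python
def solve_factored(task, is_small, solve_directly, decompose, combine):
    """Solve a task by recursively factoring it into mostly independent sub-tasks."""
    if is_small(task):
        return solve_directly(task)
    sub_results = [
        solve_factored(sub, is_small, solve_directly, decompose, combine)
        for sub in decompose(task)
    ]
    return combine(sub_results)


def split_in_half(words):
    """Decompose a list of words into two shorter lists."""
    middle = len(words) // 2
    return [words[:middle], words[middle:]]


def count_words_factored(text):
    """Toy task: count words, where each worker only ever sees at most two words."""
    return solve_factored(
        text.split(),
        is_small=lambda words: len(words) <= 2,
        solve_directly=len,
        decompose=split_in_half,
        combine=sum,
    )
```

The open research question is which meaningful tasks admit a decomposition like `split_in_half`, where the sub-results can be combined without any single worker seeing the whole problem.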

Potentially useful fields: Cognitive science, Collective Intelligence

3 - Unlocking richer feedback signals

Jan Leike et al. asked whether feedback-based models (such as Recursive Reward Modelling or Iterated Distillation and Amplification) can attain sufficient accuracy with an amount of data that we can produce or label within a realistic budget. Explicitly expressing approval for a given set of agent behaviours is time-consuming and often an experimental bottleneck. Among themselves, humans tend to use more sample efficient feedback methods, such as non-verbal communication. The most immediate way of addressing this question is to work on understanding preferences and values from natural language, which is being tackled but still unsolved. Going further, can we train agents from head nods and other micro-expressions of approval? There are already existing examples of such work coming out of Social Signal Processing. We can extend this idea as far as training agents using brain-waves, which would take us to Brain-Computer Interfaces, although this direction seems relatively further away in time. Additionally, it makes sense to study this because systems could develop it on their own and we would want to have a familiarity with it if they do.

Potentially useful fields: Artificial Social Intelligence, Neuroscience

4 - Unpacking interpretability

Interpretability seems to be a key component of numerous concrete solutions to inner alignment problems. However, it also seems that improving our understanding of transparency and interpretability is an open problem. This probably requires both formal work on robust definitions of interpretability and a better understanding of the human cognitive processes involved in understanding, explaining and interpreting things. I would not be happy if we ended up with some interpretability tools that we trust for socially idiosyncratic reasons but that are not de facto safe. I would be curious to see some work that tries to decouple these ideas and helps us get out of the trap of interpretability as an ill-defined concept.

Potentially useful fields: Human-AI Interaction, Computational Social Science.

5 - Understanding better what “learning from preferences” means

When talking about value alignment, I have a few times heard an argument that goes like this: “while I can see that the algorithm is learning from my preferences, how can I know that it has learnt my preferences?” This is a hard problem since latent preferences seem to be somewhat unknowable in full. While we certainly need some work on ensuring generalisation across distributions and avoiding unacceptable outcomes, it would also be useful to better understand what would make people think that their preferences have been learnt. This could also help with concerns like gaming preferences or deceitfully soliciting approval.

Potentially useful fields: Psychology, Cognitive Science, Human-AI Interaction

6 - Understanding value formation in human brains

This is something that I am less familiar with, but let me put it out there for debate anyway. Since we want to build systems that are aligned and compatible with human values, would it not be helpful to better understand how humans form values in their brains? I do not think that we should copy how humans form values, as there could be better ways to do it, but knowing how we do it could be helpful, to say the least. There is ongoing work in neuroscience to answer such questions.

Potentially useful fields: Neuroscience

7 - Understanding the risks and benefits of “better understanding humans”

Some think that if powerful AI systems could understand us better, such as by doing more advanced sentiment recognition, there would be a significant risk that they may deceive and manipulate us better. Conversely, others argue that if powerful AI systems cannot understand certain human concepts well, such as emotions, it may be easier for misaligned behaviour to emerge. While an AI having deceptive intentions would be problematic for many reasons other than its ability to understand us, it seems interesting to better understand the risks, benefits, and trade-offs of enabling AI systems to understand us better. It might be that these are no different from those of any other capability, or it might be that there are some interesting specificities. Some have also argued that access to human modelling could be more likely to produce mesa-optimizers, learnt algorithms that have their own objectives. This argument hinges on the idea that since humans often act as optimizers, reasoning about humans would lead these algorithms to learn about optimization. A more in-depth evaluation of what reasoning about humans would involve could likely provide more evidence about the weight of this argument.

Potentially useful fields: Cognitive Science, AI Safety Strategy.

8 - Work on aligning recommender systems

Ivan Vendrov and Jeremy Nixon made a compelling case that working on aligning existing recommender systems can lead to significant social benefits and also have positive flow-through effects on the broader problem of AGI alignment. Recommender systems are likely the largest datasets of real-world human decisions currently existing. Therefore, working on aligning them will require significantly more advanced models of human preferences and values, such as metrics of extrapolated volition. It could also provide a large-scale real-world testing ground for techniques of human-machine communication such as interpretability and corrigibility.

Potentially useful fields: Human-AI Interaction, Product Design

Conclusion

The list is non-exhaustive and I am very curious to hear additional ideas and suggestions. Additionally, I am excited about any criticism or comments on the proposed ideas.

Finally, if you are interested in this topic, there are a couple of interesting further readings that overlap with what I am writing here, specifically:

Thanks to Stephen Casper, Max Chiswick, Mark Xu, Jiajia Hu, Joe Collman, Linda Linsefors, Alexander Fries, Andries Rosseau and Amanda Ngo, who shared or discussed with me some of the ideas.




Poly Domestic Partnerships

July 3, 2020 - 17:10
Published on July 3, 2020 2:10 PM GMT

This is a joint post with David Chudzicki. Neither of us is a lawyer.

Somerville just introduced domestic partnerships:

Domestic partnership means the entity formed by people who meet the following criteria ...

Cambridge has a similar ordinance:

"Domestic partnership" means the entity formed by two persons who meet the following criteria ...

Like most, Cambridge's ordinance is limited to two people, but Somerville is novel in being open to more.

Both, however, recognize domestic partnerships from other jurisdictions. Here's Cambridge's:

"Domestic partner" means a person who meets the criteria set out in subsection D of this section or who is registered as such in another jurisdiction.

Since Somerville does not restrict domestic partnerships to residents, it looks to us like a group of Cambridge residents could register a Somerville domestic partnership, which Cambridge should then recognize.

We looked a bit for some other jurisdictions that might recognize Somerville group domestic partnerships, but only found municipalities. Looking at states, California explicitly limits to two, and Oregon and Maine don't recognize outside domestic partnerships. Some municipalities, like Boston and West Hollywood, don't seem to recognize domestic partnerships from other jurisdictions, but others, like Provincetown, Nantucket, and Berkeley, do appear to.


How does this interact with existing marriage law? Suppose Pat and Sam are married. Can they form a Somerville domestic partnership with Alex? The ordinance is unclear: one of the conditions is that "They are not married". Does that mean the people entering a partnership can't be married to each other? Or to anyone? If the latter, then they can't have a domestic partnership that includes Alex without getting divorced first. Maybe Pat and Sam could then remarry? If they can't remarry, then the situation is pretty rough from a tax perspective: they have to choose between two of them getting to file taxes jointly, or three of them getting to have a Somerville domestic partnership.

We also wondered how the law would handle multiple different relationships. Suppose Pat and Sam have a relationship, and Sam and Alex do as well. All three live together as a household, but Pat and Alex aren't partners. The ordinance lets them all have one domestic partnership together. Do they also have the option of registering two separate partnerships? It looks to us like they can: (2.502.c.3) might mean you can't be married to someone else already, but it doesn't say you can't have an existing domestic partnership. Later on (2.505.a) says "When the term 'spouse' or 'marriage' is used in other city ordinances, it shall be interpreted to include a domestic partner or partnership." Without the word "other", we think this would mean that one partnership counts as a marriage for (2.502.c.3). As it's written, we don't see anything precluding multiple simultaneous partnerships.


Municipal-level domestic partnership probably doesn't go as far as state or federal, but it does have important effects:

  • Health insurance: City employees' partners are eligible for health insurance since "The City of Somerville shall afford persons in domestic partnerships all the same rights and privileges afforded to those who are married" (2-505.b).

  • Co-living: Somerville prohibits more than four unrelated people from sharing a unit (9.11.a). Since "When the term 'spouse' or 'marriage' is used in other city ordinances, it shall be interpreted to include a domestic partner or partnership" (2-505.a), it looks to us like large households that consider themselves family can register as a domestic partnership and share a unit.

  • Discrimination: "No person in our city shall be unlawfully discriminated against in matters of housing, employment, education, contracts, purchasing or public accommodations, on the basis of: age, ..., family/marital status, ..." (2.V.6.2.237). Before this, maybe a Somerville teacher could have been fired for coming out as living in a polyamorous family.

  • Unlicensed peddling: "Before selling any meats, butter, cheese, fish, fresh fruit or vegetables, any hawker or peddler must either be duly licensed by the director of standards of the commonwealth or by the city council; however, this section shall not apply to any person who peddles only fish obtained by his or her own labor or his or her family's or to any person who peddles only fruits, vegetables or other farm products raised by himself or herself or his or her family" (8.IV.8.80).

That's just Somerville. Any other jurisdiction that recognizes Somerville domestic partnerships could also be affected!





The Book of HPMOR Fanfics

July 3, 2020 - 16:32
Published on July 3, 2020 1:32 PM GMT

One of my friends made a compilation of fanfics of Harry Potter and the Methods of Rationality. Ze gave me the okay to share zir work anonymously.

It has 91 stories (96 in the table of contents, but 5 are missing), 12,244 pages, and 3,384,120 words "written by many exceptional muggles".

There's a .PDF, .mobi, and .epub version of the compilation. I'm hosting all of those files, including a .jpg and .svg of the image below, in this Google Drive folder.

I invite y'all to share your reviews and favorite ones below.

From the Preface:

This book is a compilation of all the Harry Potter and the Methods of Rationality (HPMOR) fanfanfictions, metafanfictions, second-level fanfictions, recursive fanfictions, crossover-fanfictions, forked-fanfictions, fanfanfanfictions, or alternate endings I could find.

Let's be clear at once that I do not own Harry Potter, HPMOR, its arborescence of derivatives fictions, scientific breakthroughs, or esoteric whispers, duh. All credit for the stories assembled into this book goes to their respective authors. Blessed be their horcruxes.

The stories are ordered somewhat chronologically according to the HPMOR universe, such that reading through them in their suggested order should ensure at least some amount of cross-fic coherence.

For a better reading experience, refer to Zedzed9's HPMoR Fic Tree, where stories set in universes descended from Lesswrong's Harry Potter and the Methods of Rationality are arranged as a cladogram.

Happy Reading!




Open & Welcome Thread - July 2020

July 3, 2020 - 01:41
Published on July 2, 2020 10:41 PM GMT

If it’s worth saying, but not worth its own post, here's a place to put it. (You can also make a shortform post)

And, if you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are welcome.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ.

The Open Thread sequence is here.




The allegory of the hospital

July 3, 2020 - 00:46
Published on July 2, 2020 9:46 PM GMT

Sylvanus, a professional programmer, has gotten into a car crash, with disastrous results: he’s paralyzed from the wrists down, and won’t be able to use a keyboard or mouse for six weeks while his hands heal. He has to take time off from his job, and worse, he has to take a break from his obsessive hobby of reading dozens of online news articles a day. He loves to stay up-to-date about current events, but no keyboard and no mouse means no surfing the internet, and the staff at the hospital are too busy to keep him in the loop.

Luckily, Sylvanus has a brilliant imagination, and since he’s on a break from life he has plenty of time to think. He lies in the hospital bed all day, relaxing his mind and sending probing thoughts out into the world, like a golfer putting a finger up in the air in order to check the direction of the wind. It isn’t long before he starts getting inklings about the outside world. They’re weak and vague at first, but they get stronger and clearer with each passing day of practice. He asks a nurse for a tape recorder, which he operates by pressing the buttons with his limp knuckles and which he uses to make spoken notes about the messages he receives.

After a few weeks of thinking and recording, he decides it’s time to review his notes. The process is challenging, since the messages are often incoherent or contradictory, but he feels up to the task, and before long he’s pretty sure he knows what he’s missed in world news since the accident.

The next day he gets a visit from his co-worker Daniella. She knows he’s a news junkie, and she offers to fill him in, but he decides to try to impress her. He explains his listening process, tells her what tweaks it needed to work more reliably, lets her listen to some of his notes, and then says:

“So, I think I already know what the news is. I’ll bet you five bucks I’ve got it right.”

“Huh,” she says. “Okay, deal. Start talking.”

He tells her what he came up with. Their country’s top official died of the flu three days after the accident, and was replaced by their second-in-command, as per the country’s founding document. But the second-in-command was incompetent and hugely unpopular, and was assassinated six days later. Everyone expected another high-ranking official to take their place, but a loophole was discovered stipulating that, in light of some specific details of the assassination, a general election should be held instead. All the major political parties raced to find suitable candidates, and…

“Okay, stop,” says Daniella. “Not sure where this is coming from, but it’s way off the mark. Nothing like that happened. You owe me five bucks.”

Sylvanus slumps in his bed, obviously disappointed.

“Why did you even offer that bet?” she asks. “Surely you know telepathy doesn’t work?”

But Sylvanus glares at her, and snaps “Well, I had nothing else to go on!”




Open & Welcome Thread - July 2020

July 2, 2020 - 23:59
Published on July 2, 2020 8:59 PM GMT

(The usual boilerplate:)

  • If it’s worth saying, but not worth its own post, here's a place to put it.
  • And, if you are new to LessWrong, here's the place to introduce yourself.
    • Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are welcome.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ.

The Open Thread sequence is here.




Covid 7/2: It Could Be Worse

July 2, 2020 - 23:20
Published on July 2, 2020 8:20 PM GMT

Previous Update: Covid 6/25: The Dam Breaks

General Viewpoint Reading: On R0, Covid-19: My Current Model

This past week was not great on the Covid front, but it could have been a lot worse. I don’t mean that statement in a ‘worse is better because variolation’ way. I do think this week was slightly better than expectations, and was looking decidedly better than that until Wednesday – earlier in the week I noted on Twitter that things looked remarkably good the last few days. If I had to guess, I’d still put this week at about par.

It seems odd to say this. We set multiple new records for infections. Positive test percentage set a new record yesterday. ICUs are full and running out of supplies in Houston and other places in the South. Measures taken to contain the situation were mostly pathetically weak.

When we run the numbers, they’re going to look rather grim.

The thing is, all of that was mostly baked into my model of what was likely to happen. We saw less rapid increase in cases than I expected. We saw even a little action to reverse opening measures before death rates skyrocketed, including in the Northeast, which was a pleasant surprise. We saw some additional support for and push for mask wearing from most people across the political spectrum, other than Donald Trump.

On the non-Covid front, things are not great. SlateStarCodex is still down. I’ve put my Aside on Other News at the end of the post.

Until then, let’s run the numbers. The charts now recombine New York into the Northeast and start on May 7. The graphs are unchanged, since it’s part of history.

Positive Tests by Region

Date               WEST     MIDWEST  SOUTH    NORTHEAST
May 7-May 13       22419    43256    37591    56892
May 14-May 20      22725    42762    40343    52982
May 21-May 27      23979    39418    42977    37029
May 28-June 3      32200    31504    50039    33370
June 4-June 10     35487    24674    55731    22693
June 11-June 17    41976    22510    75787    17891
June 18-June 24    66292    26792    107221   15446
June 25-July 1     85761    34974    163472   16303

Deaths by Region

Date               WEST     MIDWEST  SOUTH    NORTHEAST
May 7-May 13       1082     2288     1597     5327
Apr 23-29          1090     2060     1442     4541
Apr 30-May 6       775      1723     1290     3008
May 28-June 3      875      1666     1387     2557
June 4-June 10     743      1297     1230     1936
June 11-June 17    778      1040     1207     1495
June 18-June 24    831      859      1204     1061
June 25-July 1     858      658      1285     818

Positive Test Percentages

Date               USA tests    Positive %   NY tests   Positive %
May 7-May 13       2,153,748    7.5%         202,980    8.2%
May 14-May 20      2,643,333    6.0%         246,929    5.6%
May 21-May 27      2,584,265    5.7%         305,708    3.5%
May 28-June 3      3,022,469    5.1%         417,929    2.2%
June 4-June 10     3,252,870    4.6%         438,695    1.4%
June 11-June 17    3,470,057    4.6%         442,951    1.1%
June 18-June 24    3,629,478    6.0%         440,833    1.0%
June 25-July 1     4,260,004    7.2%         419,696    1.2%

The local situation in New York continues to be difficult to read. Two days ago we hit a new low with 0.84% infected for the state. Now we’re back up above 1%, and the weekly rate rose on net. That’s scary. A slow climb is still a climb until reversed. We do know it’s not a disaster, and the Northeast in general is stable for the time being.

Everyone else, I have some bad news.

The overall positive rate jumped again to 7.2% and once again hit a record on a Wednesday, hitting 8.5%. Clearly some states are reporting positive and negative results on different days, and it’s messing with the numbers, but the steady climb is clear.

The Midwest saw another, larger increase in positive tests. Given the inability to reverse conditions on the ground in a substantial way, it looks like they’re on pace to get overwhelmed the same way the South and West are, only more slowly.

Most of the South, and parts of the West, are rapidly becoming disasters. Texas, Florida, Arizona and California have the headlines. Florida almost doubled to just short of 50,000 cases. But the big four are not alone.

In the rest of the West, huge percentage jumps can be seen in Alaska, Idaho, Montana and Nevada, while Oregon, Washington, Wyoming and Utah continue to steadily get worse as well.

Most of the South is in deep trouble. This was a huge almost 50% jump in positive tests for the entire region. Positive test percentages are at crisis levels, if not as bad as the 50% we’re seeing in Mexico. I’ve been keeping my analysis mostly to the United States, but it’s clear that Mexico has it even worse than the worst states.
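For anyone who wants to check the regional jumps against the Positive Tests by Region figures above, here is a minimal Python sketch of the week-over-week arithmetic. The dictionary and the helper name `pct_change` are my own illustration, and the only inputs are the last two weekly rows of that table. On those numbers the South comes out a little above 50%, in the same ballpark as the jump described here, and the Midwest and West each land near 30%.

```python
# Positive tests for the two most recent weeks, copied from the
# "Positive Tests by Region" table: (June 18-June 24, June 25-July 1).
positive_tests = {
    "WEST": (66292, 85761),
    "MIDWEST": (26792, 34974),
    "SOUTH": (107221, 163472),
    "NORTHEAST": (15446, 16303),
}


def pct_change(previous: int, current: int) -> float:
    """Week-over-week change, as a percentage of the previous week."""
    return 100.0 * (current - previous) / previous


# Percentage change per region between the two weeks.
changes = {
    region: pct_change(prev, cur)
    for region, (prev, cur) in positive_tests.items()
}

for region, change in changes.items():
    print(f"{region:<10} {change:+.1f}%")
```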

CDC Renews Application For Delenda Est Club

We already had the W.H.O. and the F.D.A., alongside old classics like Facebook, and new candidates like The New York Times. Now the Centers for Disease Control is feeling left out, and itching to ride the momentum of its botched Covid testing kits to get in on the action with this advice to colleges this fall:

To which I respond:

 

I have so, so had it with all instances of ‘it is unknown’ and ‘there is insufficient evidence’ as arguments for not doing an obviously probably useful and definitely not harmful thing.

This is utterly insane. You do not get to not model the physical world and wait for peer review. You are literally the Centers for Disease Control. You are saying that when there is an infectious disease that can have asymptomatic carriers who infect people, you’re not sure that testing more people won’t reduce risk of further infections.

And since you’re not sure that this is helpful, you do not recommend doing it, which is effectively recommending not doing it.

What is going on? Are they worried that colleges will overload the testing system?

Then say that this would overload the testing system. 

It’s masks all over again. We believed (falsely) that we didn’t have enough masks and they’d be hard to get, so the W.H.O. and most authorities got together to lie to everyone and say masks didn’t work. And now, when we have masks for days, half the country remembers and won’t wear one, and that’s the only reason we’re still in this mess.

The C.D.C. presumably thinks we don’t have enough tests, so it’s saying tests don’t work, so colleges won’t use up the precious tests on nineteen year olds.

If that is what they mean, then they should say that. And I’m fine with it. I don’t approve of Alabama students throwing Covid parties to see who gets infected, unless they’re doing safe and proper variolation, regardless of how important they think it is to later safely attend Crimson Tide games. But I do think it’s reasonable to say that young people are low risk and broad-based testing of them can’t be a high priority. I also think it would be reasonable to tell everyone to come two weeks early and keep everyone in their dorm rooms during that time while delivering them pizzas. Maybe they’d actually do the reading.

Keep Talking To Your Area Man About Lagged Exponential Growth

Exponential growth keeps going until something stops it. Yesterday we had a record high of cases, and the highest positive test percentage we’ve had since May 9. Numbers are jumpy, so it’s hard to tell how rapidly things are still getting worse, but for now it is clear they are still getting worse in terms of infections. Hospitalizations are also rising. Deaths will follow.

At what point does the death rate staying down start to get seriously weird?

June 18 starts the surge in positive tests that represents the full second wave. June 23 represents when it accelerates. My default assumption has been one week to test positive, and about two weeks after that to see the average death.

That would mark the surge in deaths to start around July 2. In other words, today, with things picking up speed on July 7.
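The date arithmetic behind that estimate can be sketched in a few lines of Python. This assumes, as above, that the June dates are when the positive tests show up and that the average death follows about two weeks later; the constant and function names are hypothetical, chosen only for this illustration.

```python
from datetime import date, timedelta

# Assumption from the text: about two weeks from a positive test
# to the average death.
TEST_TO_DEATH_LAG = timedelta(days=14)


def expected_death_surge(positive_test_date: date) -> date:
    """Shift a date seen in positive tests forward by the assumed lag."""
    return positive_test_date + TEST_TO_DEATH_LAG


surge_start = expected_death_surge(date(2020, 6, 18))  # surge begins in tests
surge_accel = expected_death_surge(date(2020, 6, 23))  # surge accelerates

print(surge_start.isoformat())  # 2020-07-02
print(surge_accel.isoformat())  # 2020-07-07
```

Changing the lag assumption moves both dates by the same amount, so the five-day gap between "starts" and "picks up speed" stays fixed.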

So no, this isn’t weird. Not yet. But if there is no spike in the next seven days, then that’s pretty weird. If that actually happened, I’d look more carefully at hospitalization data, which I usually disregard as not worth the trouble. But mostly I’d be terribly confused. The infection fatality rate seems to clearly have fallen, but why would it have fallen so much so quickly now that a surge in infections doesn’t kill more people? Quite the tall order.

Still Alive

It’s clear that death rates have fallen.

The question is why. Different explanations have very different implications for what happens next and what you should do personally.

Explanation 1: Better Medical Care

This is the best possible news in any non-overwhelmed area. If it’s better medical care, we can expect that to continue, so the death rate drop gets sustained and applies to your risk. It also means we’re learning to be better, so we can expect to keep learning, and deaths likely fall further and faster in such worlds.

Explanation 2: Nursing Homes are Protected

This is a very good thing to do. Nursing homes by all accounts were a huge part of why death rates in the first wave were so high. But if this is the explanation, then the drop in death rates is only good news insofar as fewer people are dying. It’s not good news in the sense that your behavior can adjust. Before, the death rate was high in general but probably lower for you. Now, the death rate is lower in general, but your rate hasn’t changed. And thanks to lower death rates, you’re now going to be more likely to act like a Covidiot.

Explanation 3: Young Getting Infected

Mirror image of protecting nursing homes. It doesn’t change your risk if you get infected. All it means is that fewer people die for a given level of infection. Which is good, but you shouldn’t change your behavior other than being more wary of younger people, especially in their 20s.

Explanation 4: Fraud

This is the worst explanation. We noted before that there’s been a rise in ‘mysterious deaths’ and many states seemed to be trying to cook their books to justify re-openings. They were facing popular pressure, and also fiscal pressure to get tax revenue up and unemployment costs down. What if the death rate drop is in large part due to fraud? I’d like to see more careful looks into this. Obviously this is very bad news all around. It means our society is breaking down more than we expected, and also we can’t trust the numbers, and also people are actually still dying.

Explanation 5: Better Testing

It would be easy to say that no, this isn’t it. Positive test percentages didn’t drop that much, and now are rising. Certainly any results from the second wave will face effectively worse testing, as capacity is being overwhelmed faster than it can be added. But there’s a version of this that can sort of work. Perhaps as we increase capacity, we still find mostly the same positive test percentage but get to test people who are less sick and/or less vulnerable, and they therefore get detected but don’t die. I don’t think so, because this would show up in changes in other statistics, and again this won’t apply going forward. But I can’t quite entirely rule out some effect here.

Explanation 6: Selection for Vulnerability

Perhaps how likely one is to be infected given exposure correlates highly to one’s risk of death, so death rates should drop over time as the most vulnerable are already infected? But given the low infection rate in the areas with new cases, it doesn’t seem like the effect can be big enough here. This would be unclear news, as it might change your perceptions of risk (if I’d be at risk I’d already have it!) or not (I’ve been avoiding infection so I’d be safe so far regardless). On a society-wide level it would be good news.

What else is there to consider?

Bar the Doors

Indoor restaurants and bars are, I believe, the biggest super spreaders that we have any control over, now that we’ve lost control over religious services. Going to large indoor religious services, especially with lots of singing, seems like an actual worst case scenario. Just yesterday we learned there was a choir at a mega-church that had infections, then went ahead and performed in a mega-church, and did so for the Vice-President. So we’re pretty insane on that front.

But here’s some interesting restaurant data. Restaurant dining activity by state:

This does not differentiate indoors versus outdoors. New Jersey for example has a big spike in activity but indefinitely postponed indoor dining. New York City also isn’t dining indoors any time soon. That’s a big relief to me, as growth rates locally are on the brink and I expected indoor dining to take us over the edge. We also saw a few closures in southern California, and a few rollbacks in places in the South.

It wasn’t anything like what is needed. But it was something! And it happened without waiting for the bodies to start piling up.

I don’t think that it’s possible to open indoor dining, even at reduced capacity, without substantial herd immunity or a wave of infections. It’s possible it can be done in a very careful, limited way, but you definitely can’t do bars, at all. At a bare minimum, this is going to eat up far more of the flex space than it is worth, space that could be used for schools and offices and socializing outdoors and so on.

If you go to an indoor restaurant in the United States without a positive antibody test or two weeks isolation time in both directions, let alone go to a bar, you’re being grossly irresponsible. Period. Don’t do it.

Great Minds Think Alike

Jenny Mason is the sister of Jon Finkel, history’s greatest Magic player. Her analysis of Covid-19 asks many of the right questions that are rarely asked, and mostly comes up with similar answers to my own. I recommend reading it.

Her central points are as follows.

1. There has not been a substantial second wave anywhere there was a strong first wave. This implies that herd immunity, as I’ve noted here, is likely playing a big role.

2. Sweden didn’t come out of this as the hero, but things were nowhere near as bad as the critics predicted for it, and cases managed to peak and then steadily decline. I don’t put as much stock in this as she seems to. I view locking down by law as not that relevant to how locked down, mask wearing and otherwise distant and responsible an area really is. But it’s certainly interesting that things were only bad and not nightmarish.

3. Protesters don’t seem to be getting infected much in areas with previous waves, like Seattle and New York City, despite all the obvious close contact. I’ve noted this too. My main conclusion was that being outdoors was a bigger factor than we thought. Since then, we’ve seen some statistical changes, like the share of infections coming from young people in Minnesota, that have made it seem like the protests were spreading things directly just fine, thanks, in places without a previous wave. But as usual there’s lots of other explanations and the protests weren’t that large.

4. Household infection rate is shockingly low. She notes: There have been quite a few studies estimating the household transmission rate for COVID, and they are consistently below the 38% rate for the flu: 11%, 17%, 17%, 30%. For something this infectious, that’s weird. Super spreaders are part of the answer, but would have to be very extreme to be enough.

5. Kids are hard to infect, and find it hard to infect others, and almost never die from it. Not clear to me that she ties this into the rest of her thesis, but this is still true and it’s still weird. If anything I continue to update towards kids not being carriers worth worrying about.

6. Herd immunity thresholds from disease-acquired infection are lower than from vaccination. I wrote On R0 a while ago to point this out and keep pointing it out but it feels like shouting into the wind. If anything I think her discounts here are way too small, and herd immunity threshold is much lower than people think.

7. This week’s actual physical science news about Covid-19, which shows that many people who test negative for antibodies are still effectively immune. Huge if true! Still has to be peer reviewed and all that, so we haven’t seen the paper, but this is essentially the best possible news. It potentially means a lot more infections than we thought and a less deadly virus than we thought, and puts us much closer to immunity. Again, we haven’t verified yet, but huge if true.

8. Points to analysis that suggests many more cases than we expected, even after the adjustments I’ve been making, and that serosurveillance will give underestimates of infection rates. Consider this part two of the seventh point. To the extent that these are big effects, they’re big news.

Jenny then concludes that areas that have already suffered a wave likely are already at herd immunity, including New York City, Seattle and Sweden. Her theory on kids is that this comes from cross-immunity that comes from other diseases.

Again, I think her article is worth a read, and have a similar picture to hers. She goes farther with it than I do. I don’t think anyone has hit true ‘herd immunity’ in the sense that New York City could act like there was no virus and it wouldn’t have a problem. Nor do I think that such a level is obviously sufficient to make a protest not spread the virus, assuming there are people who come already infected – R0 is not a constant number at any given time. I do however agree that (partial) herd immunity is doing a ton of work.

Overall, all of this is potentially very good news, as it means we are likely to reach herd immunity faster, and with fewer infections and even fewer deaths, than would otherwise have been necessary.

What the Hell is Up With the Stock Market?

It’s bizarre. The economy is in shambles. Infections are exploding across the country, with all but a handful of areas looking primed for trouble. There has been civil unrest that has increased greatly the focus on Marxist causes and calls for redistribution. Crime is up. The Democratic presidential candidate, who promises to raise taxes and otherwise be big-business unfriendly relative to his opponent, has a double digit lead. States, basically all of them, are on the verge of budgetary collapse with devastating consequences and no sign of a fix.

Yet the stock market continues to go up. What the hell is going on?

Will Eden asked this on Twitter a few days ago. Eliezer Yudkowsky replied that the only thing that makes sense and doesn’t violate the EMH is a drop in real interest rates, because TIPS spreads don’t indicate recovery.

I don’t think that’s right. I do think real interest rates dropping is part of the story. It’s especially helpful in the sense that if a company won’t make money for a while, low interest rates allow the net present value of its profit flows not to care all that much.
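To make the discount-rate point concrete, here is a minimal sketch with made-up numbers. The cash flows, the three lost years and the two rates are illustrative assumptions, not data, and `net_present_value` and `fraction_lost` are hypothetical helper names. It compares how much of a company's value is lost to a few profitless years at a low versus a higher real rate.

```python
def net_present_value(cash_flows, rate):
    """Discount yearly cash flows (paid at the end of year 1, 2, ...) at a constant rate."""
    return sum(cf / (1 + rate) ** year
               for year, cf in enumerate(cash_flows, start=1))

# Hypothetical company: no profit for 3 years, then 100 per year for 17 years.
delayed = [0, 0, 0] + [100] * 17
# The same company without the lost years: 100 per year for 20 years.
steady = [100] * 20

def fraction_lost(rate):
    """Share of the steady-state value destroyed by the three profitless years."""
    return 1 - net_present_value(delayed, rate) / net_present_value(steady, rate)

for rate in (0.01, 0.05):
    print(f"rate={rate:.0%}: share of value lost = {fraction_lost(rate):.1%}")
```

The lower the rate, the smaller the share of total value that sits in the near-term years, so the same temporary hit costs less in present-value terms.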

There are also other parts.

The biggest part is that the economic effects of Covid-19 shifted and will continue to shift business and profits away from small business towards large business. Stocks represent Big Business. Even small stocks, even if they’re not Big Business, are still big business.

How much of that effect was the direct effect of the need to distance and shut down benefiting business with cash reserves and scale, and hurting those who benefit from locality and loyalty and routine? How much of it is that in a time of crisis, reliability of major corporations is valued more highly, and they are better able to transition to doing more things online? Versus how much of it is that the government’s actions outright subsidized Big Business in contrast to small business?

I think this story is mostly about the direct effects of the crisis, and the government’s interventions were not central.

One could argue that big business would have been in deep trouble if not for the intervention of the Fed to stabilize certain markets in March. Without that, cascading failures would likely have occurred, and preventing that limited many big business failures, while many small businesses still failed. That’s true, although I view such interventions as obviously necessary and good and to the benefit of all. It’s preventing a systemic failure. I have friends who view this as more hostile slash unfair. Either way, that’s over now, and all it did was prevent one type of bad thing from happening.

One could argue that big business got $500 billion, scaled up, in secret. Arnold Kling, who often has insightful things to say, argued that this was de facto the end of American capitalism, as the government was effectively in charge now. Again, some of my friends were likewise treating this as a really huge deal. I don’t see it that way. It’s not that much money in the grand scheme, and it’s loans that have to be paid back. If companies could otherwise raise money, and most could, all this does is save them the change in cost of capital. While in some ways it seems like a big number, in my mind this whole thing is not all that big an effective number.

Mostly this was saving insolvent slash illiquid companies in certain areas like air travel, combined with an opportunity for massive corruption. It’s absurd that we agreed to supervision of how the money got distributed, then this money was distributed without any transparency whatsoever, and everyone mostly shrugged at that. It would be surprising if the bandits involved did not make out like who they are. But in terms of adding this kind of value to the market, it doesn’t make sense. For most companies, prices clearly don’t reflect ‘saved from bankruptcy by a government loan and now has possible value if things recover.’ By contrast, small business got massive grants that took the form of forgivable loans. It was botched in many ways, as you would expect, but a gift is much, much better than a loan.

Boeing’s experience was instructive. Boeing, which seems to have lost its ability to make airplanes that can properly fly and become a moral maze, and which has no customers who want a new airplane for a while even if it can make them, and was already bleeding money, still managed to get private financing in order to avoid any potential strings that might be imposed on it by a Biden administration. The stock went from ‘seems like it’s only worth anything because bailout’ to then turning that bailout down.

Much bigger than all of this direct subsidization was the simple, straightforward Federal Reserve intervention to do reasonable things and do standard monetary policy in response to a real shock. Which is their job. If anything they should have done more, and/or adopted level targeting.

I think that, in turn, is dwarfed by the way the landscape happens to work out in this situation, to the benefit of big business. If one believes that This Too Shall Pass, and we’ll definitely be recovered a year or two from now, and we’re past the scary ‘things might fall apart before that’ stage, then this all can make sense – they’ll come back stronger than ever, with more market power and more sales, and make that much more money.

I also strongly believe that the EMH, the Efficient Market Hypothesis, is false. I’ve believed that for a while. I also believe this crisis has strongly shown it to be false. As I’ve noted, markets seem to be reacting to news far slower than those paying attention can figure it out. Markets predictably, on a given day, for a while, reacted in the wrong direction to news. News that looked bad to the market was good news, because it meant a better reaction that might help, whereas those paying attention already knew how bad things actually were from data they’d had for a week or two, and were busy processing the real news. There was also a reversion effect for a long time, that probably still holds somewhat, that each day’s move partly reverses the next day, and for at least a while a momentum effect, where what happened early in the day was reinforced later in the day. For March and April it was clear things were highly predictable slash wrong on such scales.

Based on my trading experience, I am highly confident that such inefficiencies were real, and that my old firm made a lot of money exploiting them.

One can also look at Robinhood and other potential new sources of capital. What Matt Levine calls the Bored Markets Hypothesis seems strong, that those who have few other ways to gamble and interact and get excitement are looking to the market, and mostly they end up long and pushing stocks up. I haven’t looked at the magnitude here, so I’m doubtful this is a big impact outside of a few weird stocks (e.g. Hertz) but I definitely believe that a bunch of inflows (people buying stocks for the hell of it) can drive prices up.

This and everything else here is not investment advice. That said, if I had spare cash on the sidelines, I certainly wouldn’t be in a hurry to buy into what looks like a historically overpriced market with an overhang of multiple huge crises still to be handled. But I also wouldn’t have done that a month ago or two months ago.

Markets can stay insane for a long time and all that, and very long term either things are going to go very, very badly in ways that make stocks the least of my worries, or you’ll want to have been long the whole time rather than staying out entirely.

As for ‘why aren’t you rich?’ and ‘how much money did you make off this?’ the answer is that I’m rich enough that I have made a conscious decision to do other things with my life and to avoid trading. The things I know how to predict here are short term, on the order of hours or days, rather than months or years. To start down that path would prevent me from focusing on anything else, and doing the trades involved in the face of the fees and taxes and interest rates I’d have to pay as an individual with little setup would be tricky. Whereas in the long term, I am confident the market isn’t thinking right, but that doesn’t mean I can beat it by enough that I should be constantly trading and both letting that eat my attention and also paying massive capital gains taxes to do so.

Aside on Other News

Other news seems rather terrible.

The central topic of online conversation, other than Covid-19 itself and the continuing adventures of the Orange Man, now seems to be cancel culture’s latest targets, the dangers of cancel culture, and the cultural forces that are trying to use this moment to paint America as History’s Greatest Villain, to tear our country apart and set us at each others’ throats. Almost all the talk of potentially constructive object-level action, to address issues like stopping state-sanctioned murder, has been effectively sidelined by calls for the only action our society still seems capable of, which is symbolic action against scapegoats.

Everyone is rushing to take actions they think will make them less likely to be the next scapegoat, despite being, in the eyes of those doing the scapegoating, eternal scapegoats by nature of who they were when they were born. The more intensely one tries this, the more one’s attempts merely grant the attackers power and highlight one as a juicy target.

Meanwhile, these symbolic actions have, as I’ve noted before, crippled any legs that a response to the Covid-19 crisis had legally, rhetorically or practically. You can try to claim that symbolic risky action against the outgroup good but actual living life risky action bad, but, well, good luck with that. There’s no trust left in ‘experts’ or institutions, and there’s no will. We’re out of state capacity. The thing will mostly burn until it burns out.

Hopefully this does not all turn out the way many survivors of countries destroyed by communist takeovers are warning us about. It’s up to us to ensure that does not happen.

The New York Times is playing a central role. Its attack on Scott Alexander, which Twitter reports is going to be a true hit piece attempting to not only doxx Scott but also completely falsely label Scott a ‘white supremacist’ is not an outlier. It is their new mode of doing business. I have joined the movement to #GhostNYT. No clicks, links or business of any kind. If their reporters come knocking, they get nothing. If and when the situation is resolved to Scott’s satisfaction, I’ll consider softening that position.

Please join me and #GhostNYT. Also please sign the petition against doxxing him.

Meanwhile, China has taken advantage of its opportunity and forced on Hong Kong a beyond over the top National Security Law. This law claims jurisdiction over all people of the world, everywhere. They have taken the position that they can and have made speech, anywhere, by anyone, illegal if it is adverse to China’s control of Hong Kong. By writing the previous sentences, I have in theory made myself a criminal in China, subject to arrest were I ever to visit there. I am happy to see Britain and hopefully Australia are preparing to offer a place for citizens of Hong Kong to go. We should do the same. Of course, the United States has done nothing of substance to respond to this on any level.

We all feel the chilling effect of all this on both speech and action. This section was rewritten multiple times to attempt to properly balance my seething rage at what is going on with the need to be prudent.

In March, April and May, I felt unsafe and stressed because of Covid-19. Now I feel unsafe mainly because of other things.

Despite slash because of such worries, I didn’t feel I could write this post without this aside. So I wrote it.

But let’s not discuss it here further.

As always, please keep the comments to Covid-19 related discussion that is relevant to forward-looking analysis. Who is to blame and who should be scapegoated need not interest us here, except insofar as it is important to who to believe and trust, and what actions or events come next.

How to decide to get a nosejob or not?

July 2, 2020 - 21:54
Published on July 2, 2020 5:54 PM GMT

The affect heuristic suggests that being attractive has many subtle advantages, both in social and professional life. These advantages might be too subtle to notice in any one interaction, but they are important in aggregate. If getting rhinoplasty made me 5 percentile points more attractive (say from the 35th percentile to the 40th, or from the 65th to the 70th), over my entire life it might be worthwhile. Rhinoplasty in a low-healthcare-costs country might run just 3,000 USD, and to my knowledge is a one-time expense which does not require repairs or updates.

How can I assess the value for money/time? For one thing, asking people if you have a big nose is awkward and they would mostly lie. The doctors just want a sale, so they have the opposite incentive to lie. In any case, is there evidence in general that rhinoplasty actually makes people more attractive?

I have found one resource for this - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2691557/

(I suspect other ways of improving presentation, like buying neutral colored clothing and improving social skills are more cost effective, but am uncertain)




Goals and short descriptions

July 2, 2020 - 20:41
Published on July 2, 2020 5:41 PM GMT

Outline

I develop some ideas, previously introduced in the Value Learning sequence by Rohin Shah, more formally, to clarify the distinction between agents with and without a goal. Then I present related work and offer some considerations on the relation between safety and goal-directedness. The appendix contains some details on the formalism used and can be skipped without losing much information.

A brief preliminary

In the first post of the Value Learning sequence, Shah compares two agents that exhibit the same behaviour (a winning strategy) when playing Tic-Tac-Toe, but are different in their design: one applies the minimax algorithm to the setting and rules of the game, while the other one follows a lookup table—you can think of its code as a long sequence of if-else statements.

Shah highlights the difference in terms of generalisation: the first one would still win if the winning conditions were changed, while the lookup table would not. Generalisation is one of the components of goal-directedness, and lookup tables are among the least goal-directed agent designs.

Here I want to point at another difference that exists between agents with and without a goal, based on the concept of algorithmic complexity.

Setup

Most problems in AI consist in finding a function $\pi \in A^O$, called policy in some contexts, where $A=\{a_1,\dots,a_m\}$ and $O=\{o_1,\dots,o_n\}$ indicate the sets of possible actions and observations. A deterministic policy can be written as a string $\pi = a_{i_1}a_{i_2}\dots a_{i_n}$ with $a_{i_k}$ indicating the action taken when $o_k$ is observed.
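As a small illustration of this notation, the sketch below encodes a deterministic policy as a string with one action symbol per observation and enumerates the whole policy space. The concrete action and observation names are arbitrary placeholders chosen for the example.

```python
from itertools import product

actions = ["L", "R"]                # A = {a_1, ..., a_m}, here m = 2
observations = ["o1", "o2", "o3"]   # O = {o_1, ..., o_n}, here n = 3

def policy_from_string(s):
    """Read the k-th character of s as the action taken when o_k is observed."""
    assert len(s) == len(observations)
    return dict(zip(observations, s))

pi = policy_from_string("LRR")      # o1 -> L, o2 -> R, o3 -> R

# The set of all deterministic policies A^O has |A| ** |O| elements.
all_policies = ["".join(p) for p in product(actions, repeat=len(observations))]
print(pi, len(all_policies))
```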

Here I consider a problem setting as a triplet (A,O,D), where D stands for some kind of environmental data: for example, the transition function in an MDP, or the structure of the elements in the search space O. Since I want to analyse behaviour across different environments, instead of considering one single policy I’ll sometimes refer to a more general function g (probably closer to the concept of “agent design” than to just “agent”) mapping environments to policies on them: g: (A,O,D) ↦ π_(A,O,D).

Lookup table vs simple-reward RL

As written before, a lookup table is a policy that is described case-by-case, by explicitly giving the list of all observation-action pairs. Thus, a generic lookup table for a setting (A,O,D) is expected to have Kolmogorov complexity K(π|D)≈|O|: most lookup tables are incompressible.
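
Kolmogorov complexity is uncomputable, but compressed size gives a rough, computable upper-bound proxy. The sketch below is only an illustration under that assumption: the choice of zlib as compressor and the names `compressed_size`, `random_table` and `constant_policy` are hypothetical, not part of the original argument. It compares a generic lookup table (one independently random action per observation, with 256 possible actions so that each action is one byte) against a highly regular policy.

```python
import random
import zlib

def compressed_size(policy: bytes) -> int:
    """Bytes needed by zlib for the policy string: a crude upper-bound proxy for K."""
    return len(zlib.compress(policy, 9))

n_observations = 10_000
rng = random.Random(0)

# Generic lookup table: an independent, uniformly random action for each observation.
random_table = bytes(rng.randrange(256) for _ in range(n_observations))

# Highly regular policy: the same action for every observation.
constant_policy = bytes([3]) * n_observations

print("random table:   ", compressed_size(random_table), "bytes")
print("constant policy:", compressed_size(constant_policy), "bytes")
```

The random table should stay close to its raw size of |O| bytes, while the constant policy shrinks to a few dozen bytes. A real compressor only bounds K from above, so this is suggestive rather than a proof.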

Now let’s consider a policy generated via Reinforcement Learning. Such a policy can be written as π=RL(p(D)), where RL is the training algorithm, and p indicates an algorithmic procedure that gets the environmental data D as input and returns a reward function r:O→[−1;1] to be used by the RL algorithm. Often, p corresponds to the “work” done by the human designers who assign appropriate rewards to states, according to what they want the agent to achieve in the environment.

However, any policy could actually be expressed in the form π=RL(r) with an appropriately constructed reward function r, because of an argument analogous to the one that Shah showed in this other post. The relevant element for goal-directedness is the algorithmic complexity, compared to the environment size, of the reward function.

If the algorithmic procedure p has low complexity, then I expect that

K(g(A,O,D)|D)=K(π|D)=K(RL|D)+K(p|D)≪|O|

especially in large environments. As an example for this case, consider the policy that is the result of RL training on a maze with the same small negative reward assigned to every state except for the exit, which has reward 1 instead. For this reward function, K(p|D) is low, since the procedure p is only about recognising the exit square of the maze given as input D. Moreover, the same exact procedure can be used to find an exiting policy in larger mazes.
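
A minimal sketch of the maze example, under stated assumptions: the mazes are hypothetical grids of characters ('#' wall, 'E' exit), `p` is one possible reward procedure of the kind described above, and `RL` is replaced here by plain value iteration on a known deterministic maze, which is a planning stand-in rather than actual RL training. The point is only that the same short `p` is reused unchanged on a larger maze.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]
MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def p(maze: List[str]) -> Dict[Cell, float]:
    """Reward procedure: +1 at the exit 'E', -0.01 at every other non-wall cell."""
    return {(r, c): (1.0 if ch == "E" else -0.01)
            for r, row in enumerate(maze)
            for c, ch in enumerate(row) if ch != "#"}

def step(cell: Cell, move: str, reward: Dict[Cell, float]) -> Cell:
    """Deterministic transition: moving into a wall leaves the agent in place."""
    nxt = (cell[0] + MOVES[move][0], cell[1] + MOVES[move][1])
    return nxt if nxt in reward else cell

def RL(reward: Dict[Cell, float], gamma: float = 0.95, sweeps: int = 200) -> Dict[Cell, str]:
    """Stand-in for RL training: value iteration, then a greedy policy cell -> move."""
    exits = {cell for cell, r in reward.items() if r == 1.0}
    value = {cell: 0.0 for cell in reward}
    for _ in range(sweeps):
        for cell in reward:
            if cell in exits:
                value[cell] = reward[cell]  # the exit is terminal
            else:
                value[cell] = reward[cell] + gamma * max(
                    value[step(cell, m, reward)] for m in MOVES)
    return {cell: max(MOVES, key=lambda m: value[step(cell, m, reward)])
            for cell in reward if cell not in exits}

def steps_to_exit(maze: List[str], start: Cell, limit: int = 1000) -> int:
    """Follow pi = RL(p(D)) from start and count the moves until the exit is reached."""
    reward = p(maze)
    policy = RL(reward)
    cell, n = start, 0
    while cell in policy and n < limit:
        cell, n = step(cell, policy[cell], reward), n + 1
    return n

small_maze = ["#####",
              "#S  #",
              "# # #",
              "#  E#",
              "#####"]
large_maze = ["#######",
              "#S    #",
              "# ### #",
              "# #   #",
              "# # # #",
              "#   #E#",
              "#######"]

print(steps_to_exit(small_maze, (1, 1)))
print(steps_to_exit(large_maze, (1, 1)))
```

Nothing in `p` or `RL` mentions the size of the maze, which is the sense in which their description length stays fixed while |O| grows.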

The fact that low algorithmic complexity is a sign of goal-directed behaviour has an analogy in natural language. When we say the goal of an agent is to exit mazes, we are giving a short description of its policies across multiple environments, no matter how big they are. In other words, we are expressing the policies in a compressed form by using the simple reward function, which coincides with the goal of the agent.

On the other hand, consider a reward function that assigns lots of different values to all states in the maze, with no recognisable pattern. In this case, the algorithmic complexity of the reward function is approximately equal to the size of the observation set: the situation is the same as with incompressible lookup tables.

In the analogy with natural language, this corresponds to the fact that the only way of describing such policies would be to state what the action (or the reward) is for each observation. There would be no goal which we can use to give a short description of the agent behaviour.

The lookup table vs simple-reward RL contrast appears under different forms in other contexts as well. Consider search processes (in this case, π ∈ {0,1}^O) that select one or more elements in a search space O and take as input some data D about the elements. Manually specified filters, which for each search space list which elements to take and which to discard, generally have a description as long as the size of the search space. They are thus essentially identical to lookup tables and have no recognisable goal.

On the other hand, a search algorithm that uses the data D to generate an ordering of the elements according to a simple evaluating function, and takes the best element, is succinctly described by the evaluating function itself. Moreover, the latter search process naturally extends to larger environments, while the filter needs one more specification for each element that is added to the search space.
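
The contrast between the two search processes can be sketched as follows. All names (`filter_search`, `eval_search`, `closeness_to_50`) and the toy integer search spaces are hypothetical, chosen only for illustration.

```python
from typing import Callable, List, Sequence

def filter_search(space: Sequence[int], keep: Sequence[bool]) -> List[int]:
    """Manually specified filter: one explicit keep/discard decision per element."""
    if len(keep) != len(space):
        raise ValueError("the filter must list a decision for every element")
    return [x for x, k in zip(space, keep) if k]

def eval_search(space: Sequence[int], evaluate: Callable[[int], float]) -> int:
    """Order the elements by a fixed evaluating function and take the best one."""
    return max(space, key=evaluate)

def closeness_to_50(x: int) -> float:
    """A simple evaluating function: elements nearer to 50 score higher."""
    return -abs(x - 50)

small_space = list(range(0, 100, 7))
large_space = list(range(0, 1000, 3))

# The filter is one bit per element of small_space. Here the bits happen to follow
# a pattern, but a generic filter has none and must be listed in full.
keep_small = [x == 49 for x in small_space]

print(filter_search(small_space, keep_small))
print(eval_search(small_space, closeness_to_50))
print(eval_search(large_space, closeness_to_50))  # same function, larger space
```

Calling `filter_search(large_space, keep_small)` raises an error, since the filter needs a fresh decision for every added element, whereas `closeness_to_50` applies to the larger space unchanged.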

Related work

The previous example with search processes smoothly leads to the consideration of an alternative formalism to standard optimisation: quantilizers. Briefly, instead of taking the best element in the ordering generated by the evaluation function, an element is chosen from the top q portion of the same ordering, according to a probability distribution.

If a standard optimisation process, like the previous example, is described as π = best(eval(D)), then a quantilizer can be expressed similarly as π ∈ top_q(eval(D)), with a negligible change in complexity.
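
A minimal sketch of the two selection rules, assuming a uniform distribution over the top q portion (the general definition allows other distributions; uniformity is my simplifying assumption). The function names `best` and `quantilize` are hypothetical.

```python
import math
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

def best(space: Sequence[T], evaluate: Callable[[T], float]) -> T:
    """Hard optimisation: take the top element of the ordering."""
    return max(space, key=evaluate)

def quantilize(space: Sequence[T], evaluate: Callable[[T], float],
               q: float, rng: random.Random) -> T:
    """Sample uniformly from the top q portion of the same ordering."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ranked = sorted(space, key=evaluate, reverse=True)
    cutoff = max(1, math.ceil(q * len(ranked)))  # always keep at least one element
    return rng.choice(ranked[:cutoff])

space = list(range(100))
rng = random.Random(0)
print(best(space, lambda x: x))
print([quantilize(space, lambda x: x, 0.25, rng) for _ in range(5)])
```

The two functions share the evaluating function and differ by a few lines, which matches the claim that the change in description length is negligible.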

In terms of goal-directedness, this corresponds to the fact that the two agents can be said to have more or less the same goal, since they are interested in the same property captured by the evaluating function. The difference lies in the degree of “directedness”: the first agent applies straightforward hard optimisation, while the quantilizer is, in a sense, more relaxed and possesses some safety properties (e.g. Lemma 1 in the original paper).

A similar comparison in the context of learning can be done between the optimal policy maximising a simple reward function, and a policy that does the same 1−q of the time but takes a default action q of the time—something like waiting for new human instructions. These two agents have the same goal, but they pursue it differently, since the second one leaves more room for corrections.
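
The second agent can be sketched as a thin wrapper around the first. The names `with_default`, `optimal` and the "wait" action are hypothetical, used only to illustrate the construction.

```python
import random
from typing import Callable, Hashable

def with_default(optimal: Callable[[Hashable], str], default_action: str,
                 q: float, rng: random.Random) -> Callable[[Hashable], str]:
    """Return a policy that takes default_action with probability q
    and otherwise acts like the optimal policy."""
    def act(observation: Hashable) -> str:
        if rng.random() < q:
            return default_action
        return optimal(observation)
    return act

def optimal(observation: Hashable) -> str:
    """Placeholder for the optimal policy of a simple reward function."""
    return "go"

rng = random.Random(0)
relaxed = with_default(optimal, "wait", q=0.2, rng=rng)
actions = [relaxed(t) for t in range(10_000)]
print(actions.count("wait") / len(actions))
```

Both policies are described by the same goal plus one extra parameter q, so they differ in how strongly they pursue the goal rather than in the goal itself.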

Overall, when comparing agents with the same goal, it seems that less-directed agents trade a certain amount of performance for a gain in terms of safety.

Compression of complex policies is also listed as a favourable condition for mesa-optimisation (see section 2 in the paper). When searching for algorithms that can solve a certain class of problems, a bias in favour of short policies increases the likelihood that an algorithm which is itself an optimiser is chosen; ideally, if the bias for simplicity is strong enough, it may be possible to find an algorithm that generalises to problems outside of the original class. Unsurprisingly, such an algorithm is also more likely to be goal-directed.

Implications for safety

When the full policy of the agent can be described as the result of a relatively short algorithm applied to some short data representing the goal, as in the case of π=RL(p(D)), we can interpret p as a compressor that selects the goal-related data from all the environmental data D, and RL as a decompressor that uses the goal-related data to generate the full policy. Thus, because of error propagation, we should expect arbitrarily large errors in the full policy if there is any kind of error in the specification of the goal-related data.
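
A toy illustration of this error propagation, under assumptions of my own: a hypothetical corridor of n cells in which the goal-related data is just the index of the goal cell, and `decompress` (a name introduced here) expands it into the full policy. It shows one specific way a single mistake in the short goal description can change the action at every observation. It does not show that every error behaves this way.

```python
def decompress(goal: int, n: int) -> str:
    """Expand the goal index into a full policy on a corridor of n cells:
    'R' left of the goal, 'L' right of it, '.' (stay) on the goal itself."""
    return "".join("R" if i < goal else "L" if i > goal else "." for i in range(n))

n = 1000
intended = decompress(goal=n - 1, n=n)  # goal at the right end
mistaken = decompress(goal=0, n=n)      # mis-specified goal at the left end

differing = sum(a != b for a, b in zip(intended, mistaken))
print(differing, "of", n, "actions differ")
```

The goal index takes about log2(n) bits to write down, while the resulting policies disagree on all n cells.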

This slightly formal reasoning reflects the less formal argument that, if we design a powerful goal-directed AI supposed to act in large and varied environments, and we make a mistake in the specification of the goal, the resulting policy could be arbitrarily far from what we originally expected or wanted.

A fundamental safety question is whether it’s possible to design safe agents that have goals and act in the real world. We’ve seen before that, among agents with the same goal, some are safer than others, but it seems hard to tell if this is enough to avoid all possible bad outcomes.

Note, however, that even if “standard” goal-directed agents were unsafe, an alternative solution could be to design agents that still use some compression, but also have a certain amount of explicitly-specified ad-hoc behaviour for unique scenarios that wouldn’t be handled correctly otherwise. Such an agent would be an intermediate design between the two extremes shown before, i.e. lookup tables and agents with a clear goal.

Thanks to Adam Shimi, Joe Collman and Sabrina Tang for feedback.

This work was supported by CEEALAR (EA hotel).

Appendix
  • The environmental data D is underspecified and not completely formal, but the main ideas in the post should be clear enough anyway, so that it’s easy to criticise them or suggest new research directions.
  • Even though I showed the simpler case with deterministic policies over states, the reasoning is the same with stochastic policies over histories.
  • The use of Kolmogorov complexity actually requires fixing a Universal Turing Machine: the above analysis doesn’t change at its core. You can think of all mentioned algorithms as if they were written in the same programming language.
  • Kosoy has proposed a definition of goal-directed intelligence that also uses algorithmic complexity, but in a different way.



Does the Berkeley Existential Risk Initiative (self-)identify as an EA-aligned organization?

July 2, 2020 - 20:38
Published on July 2, 2020 5:38 PM GMT

Effective altruism exists at the intersection of other social and intellectual movements and communities. Some, but not all, of the organizations in these communities that work on EA focus areas, such as existential risk reduction, identify as part of the effective altruism movement. Such organizations are typically labeled "EA-aligned organizations."




June 2020 gwern.net newsletter

July 2, 2020 - 17:19
Published on July 2, 2020 2:19 PM GMT




The "AI Debate" Debate

July 2, 2020 - 13:16
Published on July 2, 2020 10:16 AM GMT

As far as I can tell, my concerns are disjoint from many of the concerns I've heard expressed in conversations about AI Safety via Debate.

My main concern with AI Debate is this: each debater has an incentive to trick the operator into running code that takes over the world, replaces the operator, and settles the debate in favor of that debater. To get traction on how big a concern this is, let's start with the following question:

How farsighted does a superintelligent chatbot have to be before it becomes dangerous? Let's forget the two-agent setup for a moment, and just consider a single agent. For concreteness, let its action space be the words in the dictionary, and I guess 0-9 too. These get printed to a screen for an operator to see. Its observation space is the set of finite strings of text, which the operator enters.

If it acts to maximize some function of the very next observation it gets, I'm pretty sure it never constructs an existentially dangerous argument. Call this a horizon-1 agent. If it acts to maximize some function of the next N observations it gets, call this a horizon-N agent. I won't rehash the AI Box debates, but my intuition is that it is very likely that a horizon-10^6 chatbot agent would take over the world to intervene in the provision of observations, and not out-of-the-question that a horizon-10^3 agent would as well. (This isn't a great anchor, but my mind went to the length of rebuttals in peer review).

Let's get back to the AI Debate setup, instead of the single agent setup. The existence of an adversary may make it harder for a debater to trick the operator, but if they're both trying to push the operator in dangerous directions, I'm not very comforted by this effect. The probability that the operator ends up trusting one of them doesn't seem (to me) so much lower than the probability the operator ends up trusting the single agent in the single-agent setup.

So that leaves us with the problem of picking N, the episode length. This is a little dicey, of course, but we could imagine starting small, and slowly increasing it until we just begin to get useful work out of the system. Call this the tiptoe approach. It leaves something to be desired, but I think there's a decent chance that come AGI, all the best safety proposals will have an element of the tiptoe approach, so I don't advocate dismissing the tiptoe approach out of hand. An important danger with the tiptoe approach here is that different topics of conversation may have wildly different N_dangerous and N_useful thresholds. A debate about how to stop bad actors from deploying dangerous AGI may be a particularly risky conversation topic. I'd be curious to hear people's estimates of the risk vs. usefulness of various horizons in the comment section.

So what if AI Debate survives this concern? That is, suppose we can reliably find a horizon-length for which running AI Debate is not existentially dangerous. One worry I've heard raised is that human judges will be unable to effectively judge arguments way above their level. My reaction to this is that I don't know, but it's not an existential failure mode, so we could try it out and tinker with evaluation protocols until it works, or until we give up. If we can run AI Debate without incurring an existential risk, I don't see why it's important to resolve questions like this in advance.

So that's why I say I seem to have a disjoint set of concerns (disjoint from those I hear voiced, anyway). The concerns I've heard discussed don't concern me much, because they don't seem existential. But I do have a separate concern that doesn't have much to do with the interesting machinery of AI Debate, and more to do with classic AI Box concerns.

And now I can't resist plugging my own work: let's just put a box around the human moderator. [Comment thread] See Appendix C for a construction. Have the debate end when the moderator leaves the box. No horizon-tiptoeing required. No incentive for the debaters to trick the moderator into leaving the room to run code to do X, because the debate will have already been settled before the code is run. The classic AI Box is a box with a giant hole in it: a ready-made information channel to the outside world. A box around the moderator is another thing entirely. With a good box, we can deal with finding workable debate-moderation protocols at runtime.




C̶a̶m̶b̶r̶i̶d̶g̶e̶ Virtual LW/SSC Meetup

July 2, 2020 - 06:45
Published on July 2, 2020 3:45 AM GMT

The July Cambridge Less Wrong / Slate Star Codex (RIP) meetup will be held online due to the plague.

Hangouts link




Noise on the Channel

July 2, 2020 - 04:58
Published on July 2, 2020 1:58 AM GMT

Articulation of these ideas in their present form owes a debt to interactions with Tsvi Benson-Tilsen and Erin Tatum.

Almost everyone will be familiar with the concept of signal vs noise. Literally, it's a signal processing concept which differentiates useful information which we're trying to communicate from useless distractor information which can corrupt our signal. Those who grew up in a pre-digital age will be familiar with "static" on the telephone line or TV. Digital information can be transmitted almost error-free through a noisy channel via redundant encodings which allow error-correction, at a rate determined by Shannon's noisy-channel coding theorem. This is a likely reason for the level of redundancy in natural language, as well: it aids communication in a (literally) noisy environment.

Metaphorically, we use the concept of signal vs noise to talk about everything from inboxes and newsfeeds to writing styles. To this end, people talk about the signal-to-noise-ratio: the proportion of useful/desirable information to total information in a given information source. This is useful in part because it helps manage attention: the total amount of useful information on (say) Twitter might be very large, but because of a very low signal-to-noise ratio, it may not be an efficient way to get information. In contrast to the technical signal-processing model, where the sender and receiver share a concept of which information is useful, this metaphorical generalization admits that the sender's "signal" might be the receiver's "noise".

I'm here to talk about a further metaphorical extension of the signal/noise concept. I don't know whether this concept is especially useful, but it's very strongly a part of my personal experience -- this is one of the most salient aspects of a conversation for me, and one of the biggest factors in determining how enjoyable or productive a conversation is. I call it "fuzz" or "static" or "noise on the channel".

How much static is in this conversation?

I'm pointing to a set of conditions which all have a similar way of making conversations more difficult and less fruitful.

Here are some examples of what I'm talking about.

  1. Literally, a noisy room. A bar on a busy night; everyone is shouting in an effort to be heard over the loud music and the other people shouting. (Literal unironic object-level question: why do so many people think this is a good social setting? Maybe the noise serves an important social function I'm not seeing?)
  2. One or both people are hard of hearing. This is practically the same as a noisy room.
  3. One or both of the participants are repeatedly distracted. Threads of inquiry keep getting interrupted, and sometimes forgotten.
  4. You are talking to someone who has to leave in a minute. You both know you don't have time to get into any complicated topics.
  5. One or both participants lack fluency in their common language. Otherwise simple things may take minutes to get across, much like a game of charades or person-do-thing. Complex subjects cannot be discussed, unless the conversation is very low-noise in other relevant aspects (IE, the participants are committed and have a lot of time).
  6. One or both people lack interest in the discussion. Like the example where someone needs to leave soon, it's likely that you don't have a lot of time, because a disinterested person may break off the conversation early. Like the example where there are constant distractions, it's likely that you don't have full attention, and points may get cut off or dropped.
  7. There is a high inferential distance. The conversation participants have very different ways of thinking about the subject at hand, which have been developed over long time periods and have a lot of details. Even when the language appears to be shared, there may be hidden differences which are actually critical (see the double illusion of transparency). Like the case of lacking fluency, this means both speakers need to spend a lot of time carefully conveying concepts and checking whether they're understood.
  8. There are a lot of conversational land-mines. Secrets which need to be kept, or touchy subjects which can't be brought up. You need to tread very carefully to avoid blowing up.

In all of these situations, I experience a very similar stressful feeling. I'm trying to squeeze my ideas through a tiny straw. Often the ideas stay bottled up, because it's impossible to communicate complex thoughts. One of the main things I want to get across in this post is my model of why communication is so terrible in these situations.

Why Noise Sucks So Much

All of the object-level difficulties I listed in the previous section are different. However, I think the main source of difficulty in such conversations is often the Nth-order effects the "noise" has on the conversation, which are very similar. Many different obstacles to good conversation cause each other and compound on each other to make for a sucky conversation.

In a noisy room,

  • I have to shout to be heard.
  • Shouting takes effort, which makes me a little more reluctant to speak.
  • I'm not sure if I will be heard, which makes the expected value of speaking lower.
  • I'm not sure whether I was heard, which means I'm not sure I can build on my previous statements.
  • It's difficult to hear the other person, which means I have to fill in the gaps, making assumptions about what they probably said.
  • The same is true for them, meaning I have to worry about whether I was really understood.
  • The need to make additional statements to check whether I've understood what they said multiplies with the extra effort of shouting.
  • Even if we largely are being understood, the constant worry that we aren't still makes it more difficult to build on previous points in the discussion.
  • All of the above combines to lower the expected value of the conversation.
  • Because both of us know these things lower the expected value of conversation, we both have less faith in each other's commitment to the conversation.
  • Even if we are both fairly committed to the conversation, our lack of faith in the other person's commitment means we have to treat them like a possibly distracted/disinterested person. This lowers the expectations for the conversation even further, recursively compounding the effect.
  • This worry that the other person isn't going to be very committed to a good conversation means we can't even expect lengthy error-checking procedures to enable us to get complex points across, because we don't know whether the other person will be motivated enough to participate in correcting errors or verifying that points were understood.
  • All of the above means that we are restricted to things which (1) can be communicated fairly quickly, and (2) are commonplace enough that the other party is likely to guess our meaning correctly despite all the communication difficulties. Basically, small talk. This restriction in feasible subject matter further drops the expected value of the conversation, further compounding other effects.
  • Since both people probably realize that the feasible subject matter of conversation is restricted, this knowledge plays into the guesswork we do when trying to figure out what the other person meant / check whether we heard them correctly. This fact itself further reinforces the restriction of subject matter, since it means we'll be even more likely to be misunderstood if we say something complicated.

I could go on. The point is that the bad effects compound each other. A noisy conversation involves a heavy game-theoretic component. Each participant's expectation of the value of the conversation is heavily dependent on (their estimate of) the other's expectation. There's a stag hunt for a good conversation, but the cost of hunting stag is being driven up, without driving up the reward. This means people are even more likely to hunt rabbit than usual, even if hunting stag would still be the overall better option. (And the perception that people are more likely to hunt rabbit makes it even more likely, which feeds back in... well you get the idea.)
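The stag-hunt claim can be sketched with a toy payoff model. Everything here is an assumption made for illustration: hunting stag pays `stag_reward` only if the other person also hunts stag, it costs `stag_cost` regardless, and hunting rabbit pays `rabbit_reward` for sure. Under those assumptions, stag is worth attempting only if you believe the other person will join with probability at least `(rabbit_reward + stag_cost) / stag_reward`.

```python
def stag_threshold(stag_reward: float, rabbit_reward: float, stag_cost: float) -> float:
    """Minimum belief that the partner hunts stag for which stag beats rabbit.

    With belief q, hunting stag is worth q * stag_reward - stag_cost in
    expectation, while rabbit is worth rabbit_reward. Solving for q at the
    point of indifference gives the threshold returned here.
    """
    if stag_reward <= 0:
        raise ValueError("stag_reward must be positive")
    return (rabbit_reward + stag_cost) / stag_reward


# Hypothetical payoffs: same reward in both rooms, but noise raises the cost of trying.
quiet_room = stag_threshold(stag_reward=10, rabbit_reward=2, stag_cost=1)
noisy_room = stag_threshold(stag_reward=10, rabbit_reward=2, stag_cost=5)

if __name__ == "__main__":
    print(quiet_room, noisy_room)
```

With these made-up numbers, mutual stag hunting in the noisy room still nets 10 - 5 = 5, which beats the rabbit payoff of 2, yet each person now needs far more confidence in the other before trying. That is the sense in which the cost goes up without the reward going up.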

You might think you're not doing all the metacognition which I describe above; or, that "normal people" don't do that much metacognition. And maybe not. But I don't think you actually have to do the metacognition in order to feel the consequences. A simpler reinforcement-learning-like algorithm will still teach you, via conditioning, that you can't expect deep conversations in certain contexts. As people learn that, they'll try less, and teach each other even more that it's not going to work. So without even thinking about all the recursive implications of the noisy environment, you might have a general sense of doom about difficult conversations in noisy environments. If you're like me, that sense of doom will also pervade a wide variety of similar situations which aren't literally noisy, but share critical features in common with noise.

The Wonderful Magic of Noise-Free Conversations

I still expect some readers to not really know what I'm talking about. Those readers may not even know that they don't know what I'm talking about. Noise is pervasive. A truly low-noise conversation is a rare and precious thing. It's like falling in love. It's like an old friend who understands you. It's Deep Work. It's the joy of being seen and being understood. You don't know what you're missing until you've experienced it.

Of course, this is all a matter of degree. There's the simple everyday variation in "noise" which comes from distracted vs undistracted time, close friends vs acquaintances, et cetera. Then there are the rare, really deep conversations which happen when two people are really very interested in understanding each other, repeatedly make time for each other, and work together to eliminate distractions and other barriers. And then there are the as-yet-undreamt-of heights of noise-free conversations which can only be attained by black-belt rationalists who have first internalized and then later transcended all kinds of cognitive skills related to good conversation, after ingesting all the right nootropics and heading to an extended wilderness retreat.

Let's reverse some of the previous points I made, to clarify what a really low-noise conversation looks like:

  • Low literal noise. Everyone's literal words are understood easily. Everyone knows this without hesitation, so it fades into the background and doesn't take any attention.
  • No distractions. Everyone has a clear mind to focus entirely on the discussion. Again, everyone knows this and doesn't have to think about it.
  • High level of interest. It's common knowledge that everyone in the conversation wants to continue engaging in the conversation, and is interested in understanding what others have to say. There is a high expectation of follow-through on lines of thinking, even if those lines of thinking are very tricky and subtle and will take a lot of time to follow through.
  • Relatedly, large time commitment. The conversation has all the time it needs. If the conversation eventually has to end on this particular day, there is a high degree of trust that you'll get together again soon to continue it, and do so repeatedly for as long as the subject requires it. There is no end in sight.
  • Points are never dropped unless everyone thinks they're finished. In the ideal, there is perfect memory of the conversation, everyone readily knows what the open points are, and those points get returned to in an expedient manner. (Of course in reality, different points have to compete for time.) Conclusions of the conversation are fully internalized by all participants, and applied in any relevant contexts which come up later (in this conversation or beyond).
  • There is a large shared context of understanding. Complicated concepts, feelings, and intuitions which would normally be obscure are easily conveyed and understood, due to special shared language which the participants have developed for their needs in this conversation.
  • You can say anything that's on your mind. There are no conversational landmines, no secrets, no taboos. Nor is anything considered off-topic; since there is a strong shared interest in the subject matter and a high degree of trust in that mutual interest, there is no need to police the conversation to avoid distractions. Nor would there be any need even if not for that, due to the large amount of time available, and the infallible memory everyone has for the active points of discussion. All of this means that when you start on a seemingly irrelevant branch of discussion, no one tries to reel you in; nor will they blame you if it ultimately turns out to be irrelevant. Nonetheless, everyone does largely stay on-topic.

Despite my praise for low-noise conversations, it bears mentioning that this isn't the optimal kind of conversation to have for all purposes. Relaxed, distracted conversations can be great for getting to know someone -- e.g., a highly distracted conversation over a board game. Some subjects demand fast, time-limited conversations. Not all subjects of conversation merit a high level of interest; boredom is sometimes the correct response. And so on.

It's also sometimes possible to get really good conversations by dramatically lowering some kinds of "noise" despite other types being very high. For example, a conversation with high inferential distance is likely to have a lot of really valuable information, if you can give it the time and attention to bridge the gap. Another example: email conversations are likely to be slower and lower-commitment, but this can be compensated for by the fact that all points are remembered (everything is in a text record) and participants can take a lot of time to compose their thoughts. (Keep in mind that the probability you'll write a thoughtful reply influences the amount of effort the other party will put into their email.)

Dealing with Noise

Sometimes you just have to make do with a noisy conversation. In that case, it pays to have some coping strategies.

Lower your epistemic standards. Sad to say, you may be faced with the choice between communicating something poorly and not communicating it at all. In some cases, communicating it poorly will be preferable. I wouldn't recommend practicing this as a skill so much as trying to notice that you already do it -- better, at least, to explicitly flag for yourself that you're less than totally accurate. Some examples:

  • Guess at what the other person means, rather than seeking clarification. You don't have time/energy/etc to get clarification. Fly by the seat of your pants in this conversation. Just make a guess and go with it.
  • Settle for communicating something in the right cluster. Maybe there isn't bandwidth in the conversation to tell them what you were really up to yesterday, even though they asked. Maybe "working" is a lie for subtle reasons. You weren't really working. But it gives them approximately the right idea.

Pick the most important point, and drop the rest. The conversation doesn't have the attention for everything right now; you just have to make a choice.

Accept being unheard or misunderstood. Maybe you were feeling kind of off about something that happened yesterday and you wanted a sympathetic ear to talk it out with. Oh well. This conversation isn't the one where that's going to happen. Let's talk about the weather or something instead.

Am I the Noisy One?

On the other hand, you could be doing any of the above things unnecessarily, creating a "noisy" conversation despite the lack of a noisy environment. Like I said, a good conversation is a stag hunt. Are you hunting rabbit unnecessarily? Are you ignoring your conversation partner's attempts to hunt stag? Are you not giving them the opportunity to try?

I suspect this can be easy to miss if you don't have a lot of experience with the deeper sort of conversation which (unknown to you) your conversation partner is trying to have. Imagine an angsty teenager who assumes any genuine conversation about feelings is a setup for making fun of them. Or imagine someone just starting as a graduate student, who doesn't have any experience with pre-rigorous research concepts turning into rigorous concepts later, and so blocks themselves off from engaging with ideas that don't sound rigorous (because they're trying to be a serious researcher).

If you notice yourself engaging in some of the "dealing with noise" strategies from the previous section: are you hunting rabbit when others were trying to hunt stag?

Credibly Committing to Continuing Conversation

If approaching this as a problem to be solved, rather than just a phenomenon to be aware of, one approach is to visibly set time aside, set aside distractions, and give a conversation your full attention. Remove distractions: set aside phone, laptop, etc. Find a private room or a semi-isolated outdoor location. Perhaps take the conversation on a long walk without a cell phone, which provides a visible commitment to keep talking for some amount of time. If you want to make sure there are follow-up conversations, maybe mention that early on, to establish common knowledge that this is only the first part of a continuing conversation.

Again, this isn't a guide to how every conversation should ideally go. Not every conversation deserves your maximal attention. And the Schelling choice is rabbit, not stag.

Maybe it's possible to 80/20 this. Perhaps it's possible to be someone who has deep conversations even if they're brief and have no certainty of being continued later. Maybe you can get a lot of the benefit by merely giving off the feeling that you might, if only you had more time, listen and participate deeply in the conversation. Maybe you can find a way to get away with reversing some or all of the advice I gave in "Dealing with Noise" -- raise your epistemic expectations, remember all the points, don't accept being unheard or misunderstood. Just give off an aura of reasonableness except instead of making people avoid dramatic expressions of emotion, it makes them feel that you're willing to hunt stag in the conversation.

If so, let me know what the trick is.



Discuss

Harry Potter and the Methods of Rationality ending.

July 2, 2020 - 04:24
Published on July 1, 2020 6:51 PM GMT

First of all, I want to thank the community and its creator and express deep respect and gratitude for the work done. I got incomparable pleasure from studying unexpected plot intricacies, consistent and well-structured thoughts of the protagonist, as well as references to outstanding works, many of which I, to my shame, did not know.

However, the finale of the book left more questions than answers.

What will Potter's friends do when they realize that he made them orphans and did not even give them the opportunity to say goodbye to their parents? Why didn't he even try to persuade his fellows to confront their parents and do their best to force them to change their minds and traditions?

What will Hermione do when she realizes that Harry could have stopped her, yet instead left her defenceless and then turned her into a golem and personal bodyguard with the help of their enemy?

Will she, being literally walking dead, be able to give birth? Will she give birth to some kind of unicorn trolls or nearly invincible magicians on steroids? Will her descendants even try to build diplomatic relations with humans, or will they turn into immortal fascists?

What will other mages do when they learn that Harry was so afraid to lose his inner dark counselor, which is part of Voldie’s soul, that he spared You-Know-Who and hid him under his own and Hermione’s protection? It somehow reminds me of the story of Isildur from LOTR.

Maybe I'm too optimistic, but I just hope that the current ending is just a possible future that Hermione saw when she lost consciousness during the fight with the Slytherins. Maybe we could do better. I dunno, maybe she could ask the Hufflepuffs (who are depicted as the local counter-intelligence service) to help her. There is always a better solution.

Best regards, Klen.



Discuss

Second Wave Covid Deaths?

July 1, 2020 - 23:40
Published on July 1, 2020 8:40 PM GMT

Two weeks ago I looked at covid cases by state, and divided states into three groups:

  • First wave: states that had peaked and were trending down.
  • Second wave: states that were rising, on a second wave with a slower build and later peak.
  • Unclear: states with few cases or complex trajectories.
With eighteen days of additional data (6/13 through 6/30) we can see that the second wave has continued to build:


source: JHU CSSE

Virginia and New Mexico have controlled it some, but the other second wave states are still seeing lots of growth. Some of the first wave states are growing again (Louisiana, Illinois, Pennsylvania) and some of what I called the "unclear" states have turned out to be in the second wave as well (primarily Florida).

At the time I wrote:

Speculating now, it looks to me like there's a pattern where people take precautions more seriously once people they know start dying. I don't think the second-wave states have hit that level yet, but with the rise in confirmed cases I think we're going to be seeing those deaths in about a week, sadly.

This was based on looking at the lag time from the first wave states' confirmed cases to deaths (note the different left and right axes):

But with the latest numbers, here's what I see for second wave states (same left and right axes as the previous chart):

Starting in late May, confirmed cases start rising dramatically but deaths haven't moved along with them. What's going on? Possible explanations:

  • Delayed initial testing: When things were first taking off in first wave states, our testing capacity was way behind where it needed to be. Perhaps this heavily suppressed the initial "confirmed" numbers for the first wave, and so we should expect to see second wave deaths rise in the next few weeks?

  • Increasing test capacity: I've seen some people suggest that the second wave is just an artifact of increased testing in these states. If that were the case, then there would be no rise in covid cases to be explained. But then I would expect the fraction of tests that returned positive to be decreasing, and we aren't seeing that. This one seems like wishful thinking to me.

  • Undercounting: Perhaps we are seeing a large increase in covid deaths in the second wave states, but they're not being counted? If we were following the first wave trajectory, however, this would mean 1000+ mystery deaths per day, and that is quite a lot to go missing! The CDC collects "excess deaths" numbers, and while the most recent numbers they give are for 6/13 they're not showing many.

  • Different populations: early in the pandemic people didn't know to be careful, and a lot of elderly or otherwise vulnerable people got it. The people getting sick now do skew younger, and it's possible we're awkwardly implementing the cocooning strategy the UK initially considered? If this is happening, whether it's a good approach depends quite a lot on whether we can keep hospitals from being overwhelmed (seems likely at this point) and how long covid immunity turns out to be (possibly as short as a year, though reinfections could maybe be cleared more easily?). This is my current best guess.
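The reasoning in the "increasing test capacity" explanation above can be written as a quick check. This is only a sketch on invented weekly numbers, not real data, and the function names are hypothetical: if rising confirmed cases were purely an artifact of more testing, we would expect the share of positive tests to fall over the same window.

```python
def positivity(positives, tests):
    """Fraction of tests that came back positive, period by period."""
    return [p / t for p, t in zip(positives, tests)]


def consistent_with_testing_artifact(positives, tests):
    """True if confirmed cases rose while test positivity fell over the window."""
    rates = positivity(positives, tests)
    return positives[-1] > positives[0] and rates[-1] < rates[0]


# Hypothetical weekly numbers for illustration only.
tests = [10_000, 20_000, 40_000]
artifact_like = [500, 700, 900]    # cases rise, positivity falls
real_growth = [500, 1_400, 3_600]  # cases rise, positivity rises too

if __name__ == "__main__":
    print(consistent_with_testing_artifact(artifact_like, tests))
    print(consistent_with_testing_artifact(real_growth, tests))
```

Comparing only the first and last periods is a simplification; a real analysis would look at the whole trend and at who is being tested.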

Other ideas? What seems most likely?

Comment via: facebook



Discuss

Second-Order Existential Risk

July 1, 2020 - 21:46
Published on July 1, 2020 6:46 PM GMT

Cross-posted.

[Epistemic status: Low confidence]

[I haven’t seen this discussed elsewhere, though there might be overlap with Bostrom’s “crunches” and “shrieks”]

How important is creating the conditions to fix existential risks versus actually fixing existential risks? 

We can somewhat disentangle these. Let’s say there are two levels to “solving existential risk.” The first level includes the elements deliberately ‘aimed’ at solving existential risk. This includes researchers, their assistants, their funding. On the second level are the social factors that come together to produce humans and institutions with the knowledge and skills to even be able to contribute to solving existential risk. This second level includes things like “a society that encourages curiosity” or “continuity of knowledge” or “a shared philosophy that lends itself to thinking in terms of things like existential risk (humanism?).” All of these have numerous other benefits to society, and they could maybe be summarized as “create enough surplus to enable long-term thinking.” 

Another attribute of this second level is that these are all conditions that allow us to tackle existential risk. Here are a few more of these conditions:

  • Humans continue to reproduce. 
  • Humans tend to see a stable career as their preferred life-path. 
  • Research institutions exist.
  • Status is allocated to researchers and their institutions. 

If any of these were reversed, it seems conceivable that our capacity to deal with existential risk would be heavily impacted. Is there a non-negligible risk of these conditions reversing? If so, then perhaps research should be put into dealing with this “second-order” existential risk (population collapse, civilization collapse) the same way it’s put into dealing with “first-order” existential risk (nuclear war, AI alignment).

Reasons why second-order x-risk might be a real concern:

  • The above conditions are not universal and thus can’t be taken for granted.
  • Some of these conditions are historical innovations and thus can’t be taken for granted.
  • The continued survival of our institutions could be based more on inertia than any real strength.  
  • Civilizations do collapse. Due to our increased interconnectivity a global collapse seems possible. 
  • A shift away from ‘American values’ over the coming decades could lead to greater conformity and less innovation. 
  • Technology could advance faster than humanity’s ability to adapt, significantly impacting our ability to reproduce ourselves. 

Reasons why second-order x-risk might not be a real concern:

  • Civilization keeps on trucking through all the disruptions of modernity. Whether or not the kids are alright, they grow up to complain about the next kids. 
  • Whatever poor adaptations people have to new technology, they’ll be selected against. Future humans might develop good attention habits and self-control. 
  • The bottleneck could really only be in funding. You don’t need that many talented people to pluck all the significant x-risk fruit. They’re out there and they’ll be out there for years to come, they just need funding once found. 

When considering whether or not second-order x-risk is worth researching, it’s also worth looking at where second-order existential risk falls in terms of effective altruist criteria: 

  • Scale: Impaired ability to deal with existential risk would, by definition, affect everybody. 
  • Neglect: Many people are already working on their version of preserving civilization. 
  • Tractability: It is unclear what the impact of additional resources would be. 

My suspicion is that second-order x-risk is not as important as first-order x-risk. It might not even be a thing! However, I think the tractability is still worth exploring. Perhaps there are cheap, high-impact measures that maximize our future ability to deal with existential risk. It’s possible that these measures could also align with other EA values. Even decreasing disease burden in developing countries slightly increases the chances of a future innovator not dying of starvation. 

I am also personally interested in the exploration of second-order x-risk because there is a lot of overlap with conservative concerns about social and moral collapse. I think those fears are overblown but they are shared by a huge chunk of the population (and are probably the norm outside of WEIRD countries). I’m curious to see robust analyses of how much we realistically should worry about institutional decay, population collapse, and technological upheaval. It’s a ‘big question’ the same way religion is: if its claims are true, it would be a big deal, and enough people consider it a big deal that it’s worth checking. However, if it is rational to not worry about such things, then we could convince at least a few people with those concerns to worry about our long-term prospects instead.



Discuss

How to Find Sources in an Unreliable World

July 1, 2020 - 21:30
Published on July 1, 2020 6:30 PM GMT

I spent a long time stalling on this post because I was framing the problem as “how to choose a book (or paper. Whatever)?”. The point of my project is to be able to get to correct models even from bad starting places, and part of the reason for that goal is that assessing a work often requires the same skills/knowledge you were hoping to get from said work. You can’t identify a good book in a field until you’ve read several. But improving your starting place does save time, so I should talk about how to choose a starting place.

One difficulty is that this process is heavily adversarial. A lot of people want you to believe a particular thing, and a larger set don’t care what you believe as long as you find your truth via their amazon affiliate link (full disclosure: I use amazon affiliate links on this blog). The latter group fills me with anger and sadness; at least the people trying to convert you believe in something (maybe even the thing they’re trying to convince you of). The link farmers are just polluting the commons.

With those difficulties in mind, here are some heuristics for finding good starting places.

  • Search “best book TOPIC” on google
    • Most of what you find will be useless listicles. If you want to save time, ignore everything on a dedicated recommendation site that isn’t Five Books.
    • If you want to evaluate a list, look for a list author with deep models on both the problem they are trying to address, and why each book in particular helps educate on that problem.  Examples:
    • A bad list will typically have a topic rather than a question they are trying to answer, and will talk about why books they recommend are generically good, rather than how they address a particular issue. Quoting consumer reviews is an extremely bad sign and I’ve never seen it done without being content farming.
  • Search for your topic on Google Scholar
    • Look at highly cited papers. Even if they’re wrong, they’re probably important for understanding what else you read.
    • Look at what they cite or are cited by
    • Especially keep an eye out for review articles
  • Search for web forums on your topic (easy mode: just check reddit). Sometimes these will have intro guides with recommendations, sometimes they will have where-to-start posts, and sometimes you can ask them directly for recommendations. Examples:
  • Search Amazon for books on your topic. Check related books as well.
  • Ask your followers on social media. Better, announce what you are going to read and wait for people to tell you why you are wrong (appreciate it, Ian). Admittedly there’s a lot of prep work that goes into having friends/a following that makes this work, but it has a lot of other benefits so if it sounds fun to you I do recommend it. Example:
  • Ask an expert. If you already know an expert, great. If you don’t, this won’t necessarily save you any time, because you have to search for and assess the quality of the expert.
  • Follow interesting people on social media and squirrel away their recommendations as they make them, whether they’re relevant to your current projects or not.


Discuss
