Urgent & important: How (not) to do your to-do list

Published on February 1, 2019 5:44 PM UTC

The Eisenhower Box is a well-known, simple decision matrix for dealing with tasks such as a to-do list, based on whether they’re urgent or important.

I reckon it has multiple flaws. But by fixing each flaw in turn, we end up with a better, rather different decision matrix, which can also be simplified further.

What to do?

The great problem of life is what to do. Your life consists of millions of decisions large and small, from making coffee to running for President. Which should you do, and when, and how?

There’s all the things to be done at work and home, constant demands and distractions, unfulfilled ambitions at the back of your mind – and barely time to think, let alone get through all this stuff.

Happily, a box has been invented to help you out. A bit like an Amazon Echo – but made only of paper & ink – it not only tells you how to deal with everything on your plate, but magically makes some of it disappear.

Or so it is claimed.

The box

The Eisenhower Box (or Matrix) was invented by Stephen Covey in his bestseller The 7 Habits of Highly Effective People. It was later named after US President Dwight Eisenhower, who once said:

“I have two kinds of problems, the urgent and the important. The urgent are not important, and the important are never urgent.”

The point being, people spend too much time on urgent-seeming but unimportant distractions, instead of on important, non-urgent matters – such as planning, people, and future opportunities. Short-term trivia divert you from what really counts.

To solve this, the Eisenhower Box tells you what to do with each task that happens along, based on whether it’s important or urgent:[1]

The kinds of tasks that end up in each cell, starting top-left, are:

  • Important & Urgent (green): things that need action ASAP, such as important meetings/calls/emails, tight deadlines, and crises. They’ve got to be done, so – like the box says – you’d better Do them.
  • Important & Non-Urgent (blue): big-picture, longer-term, proactive things you don’t do enough of – planning, identifying opportunities, talking to people, recruiting, training, self-improvement, etc. Such things are rarely pressing, but can prevent the scrambles and crises of the stressful green cell. So Schedule these tasks in your calendar, to avoid procrastinating them for ever.
  • Unimportant & Urgent (orange): e.g. unnecessary meetings, phone calls, interruptions, and similar needless demands from others. These grab your attention, so are easily mistaken as important – but you have better things to do, so Delegate these tasks to others.
  • Unimportant & Non-Urgent (red): total time-wasters, such as pointless emails & business trips, aimless web browsing & social media, and other displacement activities that avoid real work. You know you shouldn’t be doing these, but go ahead anyway – because they’re easy or enjoyable, and feel a bit like work. Wise up and Delete these things entirely.

So, for anything on your to-do list – whether work or personal – just toss it into the box, and out falls how best to deal with it. You’ll end up spending quality time on what matters, while delegating and deleting less important stuff. And everything will go swimmingly.

So far, so plausible – yes?

Well, close examination shows that this is half-right, but also half-wrong. And being half-wrong, it only half-works. But I won’t rip up the Eisenhower Box into little shreds, as we can patch it up and relabel it to make a better one.

Let’s think outside the Box by considering each of its labels in turn. (And if you’re not interested in the reasoning, skip to Hopscotch near the end for the final result.)

Urgent / Non-urgent

There’s certainly something special about urgent tasks. The dictionary defines ‘urgent’ as ‘requiring immediate action’. How could this not be crucial for deciding what to do?

Alas, Covey (the box’s inventor) uses the word both for things which really are urgent – like an angry phone-call from a major customer, or a crisis meeting – and those that merely seem urgent, like a phone-call from an insurance salesman, or a toddler shrieking for ice cream.

The former do require immediate action, but the latter merely grab your attention; they may involve urging, but they are not urgent. Calling both ‘urgent’ blurs the crucial distinction; it fails to tell us which is which, and how to treat them. In fact, the real difference is that the former things are important, and the latter unimportant.[2] But the box already makes this distinction via its rows; classifying all these tasks as Urgent doesn’t help one jot.

However, there is another aspect of urgency, and the clue is in the dictionary definition: the word ‘immediate’. It’s useful to know whether something needs doing ASAP, or can be dealt with another time.

More specifically, if you make a to-do list each day, it’s useful to distinguish what needs doing today from what can be put off until later, e.g. tomorrow. For example, collecting the kids from school must be done today – it can’t be postponed; whereas collecting a jacket from the dry cleaners can be done tomorrow or next week, if you don’t need it yet.

So, we can improve the Eisenhower Box by replacing Urgent/Non-Urgent with this better distinction – Today versus Later:

Important / Unimportant

The box is absolutely right to identify importance as important. But we can do better.

Suppose you have a crucial project to finish today, and also a meeting you ought to attend but which isn’t essential. You’d better skip the meeting and get on with the project. Both are important, but the project is a ‘must’, and takes precedence.

Some things are more important than others; importance comes in shades of grey. But the Eisenhower Box only distinguishes black and white: Important and Unimportant. As far as it’s concerned, the project and the meeting are equally important; it can’t tell you which to do.

Far more useful is to distinguish three degrees of importance: things you must do, come what may (like the project); those you should do, which are preferable but inessential (like the meeting); and those you could do, but don’t matter much (like a lot of email). The must/should/could distinction is also easy to understand – we use these words all the time.[3]

We can adapt the box to use them by splitting Important into two rows – Must and Should – and renaming Unimportant as Could:

So now the top left half-cell is for things you Must do Today – like finish the project, or collect the kids from school – and top right, things you Must do Later, such as your tax return (not due for three months). Must means that if a task isn’t done, you risk disaster; and Must Today means you risk disaster if the task isn’t done today.[4] Hence, you will definitely do 100% of Must tasks if at all possible.

In the new row below the Musts are things you Should do Today – like attend the useful but inessential meeting, or eat lunch (though you might skip that too if necessary) – and things you Should do Later, e.g. buy a faster laptop, even though your current one is OK for now.

Finally, the Could row contains everything else on your to-do list.

Do / Delegate

The Eisenhower Box is quite right to recommend delegating. It’s inefficient, indeed impossible, to do everything yourself. So it tells you to do the Important Urgent tasks, and delegate Unimportant Urgent things to others. As the saying goes, “if you want a job done properly, do it yourself”.

But should you?

For many important tasks, you shouldn’t do them yourself – you should delegate them. If you’re on trial for bank robbery, don’t try to conduct your own defence – delegate that to a lawyer. Nor should you do your business’s accounts yourself; mistakes could lead to fines or a tax investigation. Delegate it to an accountant instead.

Conversely, many unimportant things aren’t worth delegating. To hang a picture on the wall, you could get a handyman in, but it’s probably not worth the hassle and cost; if you have a hammer and nail, just do it yourself. Prince Charles has a valet who squeezes toothpaste onto the royal toothbrush for him; but it’s not worth hiring your own valet for this – squeeze your own toothpaste.

And for many tasks, whether you delegate or not depends on how busy you are. If you’re a CEO, you usually meet major customers yourself; but if some major crisis hits, you send your sales manager in your place. When you have friends round, you usually cook your best pizza recipe; but if you have to leave work late, you order a pizza delivery instead.

So the reality is, you should Do some important tasks, and Delegate others; Do some unimportant tasks and Delegate others. The box can’t tell us which. Hence we should replace these words with a single phrase that covers both – ‘deal with’:

Dealing with a task means getting it done – either yourself, or by someone else. Which could be anyone who’s suitable and available: a junior, your boss, a contractor, partner, friend, etc.

The decision whether to delegate actually depends not on importance, but on what is the best use of your time. For an important task for which you’re best suited – e.g. a meeting with a big customer – your time is best spent doing it yourself. Unless, that is, there’s an even better use of your time, such as handling a major crisis.

Similarly, you like making pizza, but if working late is a better use of your time, do that instead, and delegate the cooking to your local pizzeria.

You may handle the meeting better than anyone else, or make better pizza than the pizzeria, but that doesn’t necessarily mean you should do it – if you have better things to do.

Schedule

(Sometimes labelled ‘Do later’, ‘Decide’, or ‘Plan’.)

What about tasks in the blue cell, that we Must or Should do Later? The Eisenhower Box suggests we schedule them for some future date, which is certainly better than ignoring them. But there are other ways to handle these tasks – like, actually do them today, or at least make a start.

For if the task is a long one, i.e. a project, it’s best to start on it sooner than you might optimistically schedule. And the longer and more important the project, and the firmer the deadline, the earlier you should start. Unknown unknowns and emergencies often cause delays – and even without them, you can’t be sure you’ll finish on time. I have even heard of occasions on which things were left to the very last minute, but then turned out to take longer than expected.

Making even a small start today, such as a few minutes thinking, researching or planning, may reveal important information – e.g. that the project is much harder than you thought, or needs someone or something you hadn’t anticipated.

Alternatively, it may make sense to delegate the task, or some part of it. In which case, the sooner you do so the better, so whoever you’re delegating to can fit it into their schedule. They may be busy too.

So for blue-cell tasks, as an alternative to scheduling them, also consider doing, delegating, or at least starting on them today. That is, ‘deal with’ them:

Delete

(Sometimes called ‘Eliminate’ or ‘Drop’.)

Some things really are a waste of time, and the Eisenhower Box is right to try to stop us doing them. But does it identify them correctly?

Just as the Important row combines two degrees of importance – essential (Must) and merely preferable (Should) – the Unimportant row hides two degrees of unimportance: things worth getting done if possible, and things never worth doing. Like aimless web surfing, social media (in work time), pointless emails, and other wastes of time.

However, the latter aren’t so much tasks you Could do, as things you shouldn’t do; they really belong on a separate ‘Shouldn’t’ row below, for stuff you actually should Delete:

That said, in practice we don’t need this extra row, because you already know you shouldn’t do these things. It’s not as if ‘watch cat videos’ is on your to-do list, and you need the box to decide between that and some crucial project; you already know the answer. So we can forget about total time-wasters – just don’t do them.

Apart from those, everything else you Could do might be worth doing someday – so there is no point deleting them.

For example, suppose your bathroom is looking a bit jaded, and you’re wondering whether to repaint it. This is something you Could do Later, e.g. next weekend; but on the other hand, it hardly matters. Faced with this dilemma, you may be tempted to be decisive: either actually do it next weekend, or forget the whole idea – Delete it.

But if you don’t repaint the bathroom next weekend, nothing is gained by deciding never to do it. The task can just loiter around the bottom of your list until you have nothing better to do, if ever. For the opportunity may arise, e.g. if you get painters in for some other job; or the situation may change, if in a year or two the bathroom looks much worse, or you want to sell your house, so decide to paint it after all. Why Delete tasks if you might end up doing them anyway?

This does mean your complete to-do list will include lots of minor things that may never get done.[5] Accept this. You won’t ever finish your list – and that’s OK. For if there were nothing left to do, what would there be to live for?

So, we shouldn’t have a Shouldn’t row; let’s delete Delete; and instead make anything you could do later just another task to Deal With, if you ever get around to it:

Note the phrase ‘get around to it’. Unimportant tasks loiter at the bottom of your to-do list, seldom getting done, because you almost always have other things to do first.

So what should you do first? And what then, and after that?

Prioritizing

The Eisenhower Box kind of assumes you’ll have enough time to do what it tells you. But what if you don’t? Scheduling should be quick enough, but there may not be enough hours in the day to Do all the green tasks, let alone Delegate the orange ones (as delegating can take time). Something must give; but what?

If we can work out an order of priority for tasks, then if you don’t complete them today, at least you’ll have done what matters most. We can figure out the order like this (bear with me):

Must Todays must by definition be dealt with today, come what may – so do them first of all. Doing anything else increases the risk of not completing them.

Should Todays are more important than Coulds, so should be done before them. Doing Coulds instead of what you Should be doing is a symptom of ‘busyness’ – merrily filling your day with minor tasks that aren’t a waste of time, but aren’t much use either.

A harder conundrum is: which is higher priority, Should Today or Must Later? The pressing meeting, or the crucial long-term strategy that can wait? Well, if you do Should Todays first, you may not get through the Must Laters on your list. And if this happens day after day, some Must Laters will be procrastinated indefinitely, and you’ll miss crucial deadlines and opportunities. Whereas Should Todays are merely desirable. Hence it’s best to deal with Must Laters before Should Todays; and if you’re too busy to work on a Must Later, at least you can schedule it in seconds.

For example, suppose your business relies on selling just one product. At some point you must come up with a new one – a task you Must do Later. But you’re so focussed on your current product that your to-do list is full of Should Todays – reply to enquiries, order stock, issue invoices, etc. If you keep dealing with these every day rather than even thinking about a new product, a sudden competitor to your current product could make you go bust before you can say ‘mañana’. But by prioritizing Must Laters before Should Todays, you’ll either start planning your new product today, or at least put a date in the diary to do so.

And by the same reasoning, Should Laters are higher priority than Could Todays.

Putting this all together, the best order for tasks turns out as:

1. Must Today

2. Must Later

3. Should Today

4. Should Later

5. Could Today/Later

(though there are exceptions, discussed below).

With Coulds, it’s hardly worth distinguishing Could Todays from Could Laters. The timing of Coulds usually doesn’t matter much, because the task itself doesn’t matter much, and you probably won’t get round to it anyway. Hence we can combine them into a single Could category.

Now let’s redraw our reworked box more neatly. Every cell is now just something to ‘deal with’ (or ‘schedule’) – so we can lose the colours, and replace ‘deal with’ with the above numbers for the order to do those tasks in:

Hopscotch

For whimsical reasons I’ll call this new box and method Hopscotch. Use it like this:

Compile a daily to-do list

At the start of each day, go through your full to-do list and appointments, and ask yourself which things you must deal with today, must deal with later, should deal with today, etc. Perhaps label them MT, ML, ST etc. as you go. If you have lots to do, selecting just these top three categories should suffice; but if you include Coulds, pick just the most important ones, as chances are you won’t get round to them today anyway.

Copy these tasks to create a separate list for the day. Then reorder them as specified by the Hopscotch cell numbers: Must Todays first, then Must Laters, then Should Todays, etc.

Within each of these categories, put tasks into roughly descending order of importance; if unsure, also consider which need doing sooner. It’s often unclear whether something counts as a Must or a Should, or a Should versus Could, but that hardly matters if tasks are ordered within each category too.
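To make this ordering concrete, here’s a minimal sketch in Python. It assumes each task carries a category tag (MT, ML, ST, SL, and C for the combined Coulds) and a rough importance score – both hypothetical conveniences for illustration, not part of the method itself:

HOPSCOTCH_RANK = {"MT": 1, "ML": 2, "ST": 3, "SL": 4, "C": 5}

def hopscotch_sort(tasks):
    # Order by Hopscotch category first, then by descending importance
    # within each category.
    return sorted(tasks, key=lambda t: (HOPSCOTCH_RANK[t["cat"]], -t["importance"]))

todo = [
    {"name": "reply to enquiries", "cat": "ST", "importance": 6},
    {"name": "plan new product",   "cat": "ML", "importance": 9},
    {"name": "collect the kids",   "cat": "MT", "importance": 10},
    {"name": "repaint bathroom",   "cat": "C",  "importance": 2},
]

for task in hopscotch_sort(todo):
    print(task["cat"], task["name"])
# MT collect the kids / ML plan new product / ST reply to enquiries / C repaint bathroom

Fixed-time tasks and related-task groupings (below) would then be moved by hand.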

You can then move some tasks around for the following reasons:

  • If a task is at a fixed time (e.g. a scheduled meeting/call), or must wait for something/someone else, move it to where in the list you expect to do it.
  • Put related tasks together to increase efficiency, e.g. ones from the same project or in the same location. If you Must go to the dentist, you may as well buy some milk nearby to save a separate trip, even if that’s just a Should. (But think twice about moving Coulds up the list.)
  • Put a quick Should Today task before/between Must Todays if it ought to be done early. For example, if you can delegate it just by forwarding an email, do so, rather than leaving it till late in the day or tomorrow. Or if you have a fairly important 9am call scheduled, which you’re sure won’t stop you finishing the Must Todays, go ahead – but duck out if it overruns.

It’s easiest to do all this if you have your full to-do list in software – even good ol’ Microsoft Word will do – and keep it in rough order of importance.

Using the to-do list

Now, down to work. Deal with tasks in the listed order; don’t cherry-pick ones you feel like doing. Remember that ‘dealing with’ a task can involve getting someone else to (delegating), depending on what’s the best use of your time, and who else could do it.

Whenever a new task arrives during the day, again ask yourself whether you really must deal with it today. If so, add it at/near the top of today’s list; if not, you can probably just leave it for consideration tomorrow.

You may well not get through everything by the end of the day. That’s fine – leftover tasks can go on tomorrow’s list. But when you compile it, reconsider whether leftovers are still Must/Should/Could or Today/Later, as their priorities may have changed; they may not even be worth including.

Simpler version

Hopscotch is pretty simple, with just one more cell than the Eisenhower Box. Try it for a few days, and see how it works for you.

But for something even simpler, we could combine cells 2 & 3, and 4 & 5, to produce a minimal system that doesn’t even need a box:

  • Start with tasks you must deal with today (unless they have to be done later in the day)
  • Then, tasks you should deal with today – the most important first, including working on or scheduling things you must deal with sometime
  • Finally, deal with anything else, in order of importance.
Postscript

Was I too hard on the poor old Eisenhower Box? It was only ever a rough-and-ready solution. But millions know about it, so if it can be improved on, it should be.

Hopscotch may be better, but is still quite approximate. For example, there are big issues with deciding how important things are, tasks without clear deadlines, different time horizons, and considerations beyond importance and timing. But I hope to, and perhaps Should, write about these Later.

[1] Stephen Covey did not propose the one-word actions for each cell, which seem to have been added by later writers, and are an oversimplification of what he says; for instance, he did not claim Unimportant Urgent tasks are the only ones you should Delegate.

[2] Hence there are no Unimportant Urgent tasks, and nothing belongs in the orange cell. Anything truly urgent – that requires action – must be important, and so goes in the green cell.

[3] Some people label three degrees of importance A, B, C, or 1, 2, 3 instead; but these are arbitrary, meaningless symbols, hence liable to be used inconsistently.

[4] This doesn’t mean this is a Must task which you have merely chosen to do Today; it’s an objective deadline, e.g. for completing the project, or collecting the kids, after which things will go pear-shaped.

[5] Could Laters are like ‘Someday/Maybe’ tasks in David Allen’s Getting Things Done system; Someday = (much) Later, Maybe = Could. He proposes keeping them in a separate list, though in reality they’re just the bottom of one big to-do list.




Who wants to be a Millionaire?

Published on February 1, 2019 2:02 PM UTC

tl;dr: Using toy models, the Kelly criterion, prospect theory, Bayes and amateur psychology in an unnecessarily detailed analysis of Who wants to be a Millionaire? I strongly suspect there will be errors, but hopefully not too many.

Motivating example

I was watching Who wants to be a Millionaire? a couple of nights ago for the first time in about 20 years after its recent recommissioning in the UK.

One contestant’s general knowledge got her to £250,000, 2 correct answers away from £1,000,000. She had £125,000 guaranteed and still had 2 lifelines (ask the audience and phone a friend).

Question: The Norwegian explorer Roald Amundsen reached the South Pole on 14th December of which year?

A: 1891

B: 1901

C: 1911

D: 1921

She thought about/discussed this at length (there was no time limit and the total time spent on the question was about 20 minutes!). She knew that Amundsen beat Scott to the South Pole and was confident that Scott was Victorian, which ruled out C & D. She pointed out that if it was 1911 then the 100 year anniversary would have been 2011, and she felt she would have remembered there being something about it in the news.

This didn’t help her choose between A & B so she asked the audience (she was confident none of her friends would know). The results were (from memory, maybe slightly off):

A: 28%

B: 48%

C: 24%

D: 0%

What would you do?

The result that stood out to me was the 24% who said C. After everything that she said about how confident she was that it isn’t 1911, who are these people voting C? It turns out they’re the people who knew the right answer.

Unfortunately, the contestant went with the majority, said B, and left with only £125k. Not too bad really, even if the name Roald Amundsen is haunting her and popping up everywhere she goes.

I’ll admit that even though I suspected that the answer was C based purely on the ask the audience result, I don’t think I would have been confident enough to go for it based only on that.

If I knew that B was the right answer, would I have been surprised that it got 48% of the vote? No, that would make sense.

If I knew that B was the wrong answer, would I have been surprised that it got 48% of the vote? Maybe a little, it is quite high, but not out of the question, especially this late in the game.

If I knew that C was the right answer, would I have been surprised that it got 24% of the vote? No, that would make sense; it’s a tough question, so if most of the people who pressed C actually knew the answer then 24% sounds about right.

If I knew that C was the wrong answer, would I have been surprised that it got 24% of the vote? Yes, very: you have to be really committed to an answer to stick with it after someone who is clearly good at general knowledge has ruled it out and given a couple of reasons why she thinks it’s wrong. Anything more than 10% would be a surprise; 24% would be really weird. This is especially true as D got 0% of the vote.
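To make this ‘how surprised would I be’ reasoning concrete, here is a rough Bayesian sketch in Python. The priors reflect the contestant’s reasoning (A or B, probably not C or D), and the likelihoods are my made-up translations of the surprise judgments above – illustrative numbers, not measurements:

# P(answer) after the contestant's reasoning, before seeing the votes.
priors = {"A": 0.45, "B": 0.45, "C": 0.08, "D": 0.02}

# P(seeing a 28:48:24:0 split | that answer is correct). A 24% vote for C
# is very surprising unless C is right, so the other likelihoods are tiny.
likelihoods = {"A": 0.01, "B": 0.03, "C": 0.30, "D": 0.001}

unnormalised = {k: priors[k] * likelihoods[k] for k in priors}
total = sum(unnormalised.values())
posterior = {k: round(v / total, 3) for k, v in unnormalised.items()}
print(posterior)  # {'A': 0.107, 'B': 0.321, 'C': 0.571, 'D': 0.0}

With these (debatable) numbers, C comes out on top despite its low prior.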

Let’s dig into some more detail.

Modelling Ask the Audience

Simple model

A simple model for Ask the Audience would be to expect that those who know the right answer press the correct button, and those who don’t guess, spread equally between the 4 answers.

If we estimate that 20% of the audience actually know the answer, this gives 20:20:20:20 from the guessers, with an additional 20 for the correct answer. We get 40:20:20:20 and the correct answer is obvious. Even with a bit of random noise in the results the correct answer should be clear, provided enough people actually know the answer.

On late-game questions, one would expect fewer people to know the answer, as few people win the jackpot. However, the game requires a long run of consecutive correct answers to win the jackpot, so each individual question doesn’t need to be too hard to prevent people from winning it (provided there are no indiscreet coughs).

Consider only people who get to the last 5 questions. Even if, on average, they know (or correctly guess) the answer 50% of the time, only 1 in 32 will actually get to the jackpot (provided the questions are on sufficiently different subjects). 33% knowledge gives 1 in 243 wins. In the original UK series there were 5 winners in 1200 contestants (1 in 240), but as some contestants weren’t good enough to reach the final 5 questions, 33% is a lower bound.
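As a back-of-the-envelope check (assuming the final 5 questions are independent and each is answered correctly with the same probability p, so that P(jackpot) = p^5):

# Invert the observed win rate to get the implied per-question probability.
p_win = 5 / 1200
p_question = p_win ** (1 / 5)
print(round(p_question, 3))  # ~0.334, i.e. the 33% quoted above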

The average audience member probably isn’t as good as the best contestants, but they have applied to be an audience member of WWTBAM, so probably have a keen interest in general knowledge (or are there with someone who does!). I think that 15-20% of the audience, on average, knowing the correct answer is probably not a bad starting point.

Salience

Imagine you’re an audience member faced with a question you don’t know the answer to. Possibly one of the answers stands out to you for some reason or other. Maybe it’s a name you recognise, or a date which seems like a good enough guess.

Unfortunately, the girl next to you is thinking of the same answer for a similar reason, as is the guy a couple of seats further down. 20% of the audience guess the same thing as you for roughly the same reason. Instead of 40:20:20:20, the results are now 35:35:15:15.

This is bad news for the contestant: the two answers are indistinguishable, because as many people arrived at one particular wrong answer for a consistent reason as actually knew the right one. In reality the two effects are rarely exactly equal; one is usually larger.
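For concreteness, here is a toy version of this model in Python. The 20% knowledge figure is the estimate above; having a quarter of the guessers (20% of the audience) pile onto one salient wrong answer is the illustrative assumption that reproduces 35:35:15:15:

def vote_shares(know=0.20, salience=0.25, n_answers=4):
    guessers = 1 - know
    salient_extra = guessers * salience        # guessers drawn to one wrong answer
    uniform = (guessers - salient_extra) / n_answers
    correct = know + uniform
    salient_wrong = salient_extra + uniform
    other = uniform
    return correct, salient_wrong, other, other

print([round(100 * s) for s in vote_shares()])  # [35, 35, 15, 15]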

In this circumstance, most people will either take the money and run or go with the highest scoring answer. 20 years ago, when I used to watch the programme, my heuristic was that in the final 5 questions it’s better to take the second highest scoring answer from Ask the Audience.

A better solution than consistently going with the highest or second highest answer would be to consider which effect sizes you would expect – people actually knowing the answer, and salience. Then, when you see the results, consider how surprising they would be under the hypothesis that A, B, C or D is the correct answer.

Estimating salience is tricky, but as an upper bound a brief look at the Millionaire wiki gives an example of the audience voting 81:19 in favour of the wrong answer! This page gives even more extreme examples.

Using 50:50 and ask the audience for the same question

On that first page, a number of other examples are given of people using 50:50 and ask the audience lifelines on the same question. It is notable that all 4 used their 50:50 lifeline first, followed by ask the audience.

Superficially this makes sense – you want to give the audience as much information as possible to help them make the best choice.

However, in reality there is probably a fairly binary split in the audience: those who know and those who are guessing. You don’t care what the guessers think, as it’s very hard for them to provide you with enough evidence to justify answering the question.

The only thing you actually care about is identifying the people who actually do know. If you refrain from using your 50:50 until after using Ask the Audience, the 50:50 serves the additional purpose of removing from the statistics a section of the audience who don’t know, increasing your signal-to-noise ratio.

In addition, if there is a highly salient wrong answer then you have a 2/3 chance of removing it. Say we have a 35:35:15:15 ratio due to knowledge and salience effects as described above. Using the 50:50 has a 2/3 chance of leaving 35:15 and a 1/3 chance of leaving 35:35.

35:15 gives an obvious best response under the model. Using the same model and removing the same answers, but using the 50:50 lifeline first, you would have 60:40 in favour of the correct answer. This is much weaker evidence, and you would only need a relatively small amount of salience in favour of the wrong answer to tip the scales in the wrong direction.

If 35:35 is left you’re still stuck but you wouldn’t be in any better situation if you’d selected 50:50 first.
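Under the same toy numbers, the two orderings compare like this:

know, guess = 0.20, 0.80

# Ask the Audience first, then 50:50: the recorded 35:35:15:15 split stays
# usable, and with probability 2/3 the salient wrong answer is removed,
# leaving the correct answer's 35 against a 15.
print("audience first:", 35, "vs", 15)

# 50:50 first, then Ask the Audience: the 20% who know vote correctly and
# the 80% who guess split evenly between the two remaining answers.
correct = 100 * (know + guess / 2)
wrong = 100 * (guess / 2)
print("fifty-fifty first:", round(correct), "vs", round(wrong))  # 60 vs 40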

Salience influence by the contestant

One of the classic things to shout at the T.V. during Millionaire is “Don’t discuss the answers before using your Ask the Audience lifeline!”

It is often said that if you talk about which answers you think are most likely then you will influence the audience towards those answers. This means that you don’t get the audience’s true opinion on the subject. This seems to have happened in the Amundsen question.

The obvious thought comes to mind that one might use that influence deliberately but I’ll come back to that later.

(I’m working on the assumption that the rules prevent you from telling the audience what to do if they’re not sure!)

How confident do I need to be?

3 alternative models

Millionaire includes 2 safety nets, such that once a contestant reaches one they are guaranteed to win at least that amount. Once a safety net has been passed, contestants no longer have to bet their entire bankroll on each answer.

I’m going to invoke the Kelly criterion here even though I know that the assumptions of the derivation are not met. Adjust up or down according to taste.

(Interestingly, one of the Kelly assumptions is that you will get unlimited future opportunities to wager with an edge. In Millionaire you get another opportunity to wager iff you take the wager offered and win. Additionally, Kelly relates to the optimal amount to bet, rather than whether to accept a fixed-size wager.)

If we rearrange Kelly and apply it to Millionaire (where the prize doubles, or very nearly, for each successful question answered) then we arrive at a formula for how confident one should be in order to guess:

p_k > 1 - \frac{1}{2^{q+1} - 1}

where q is the number of questions since your last safety net.

This implies that for each question past the last safety net you need about an extra bit of evidence before you are justified in answering the question.

Kelly might not be the best option for choosing a required probability. Let's say I value money logarithmically and calculate expected utility. In order to justify answering I require:

p_{eu} > \frac{q}{q+1}

This is considerably less stringent, particularly as q increases.
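One way to get this (assuming walking away keeps your current winnings of 2^q times the safety net S, a right answer doubles them, and a wrong answer drops you back to S): answering is worthwhile when

p \log\left(2^{q+1} S\right) + (1 - p) \log S > \log\left(2^{q} S\right)

Taking logs base 2 and setting S = 1, so that utility is measured in doublings above the safety net, this reduces to p(q+1) > q, i.e. the threshold above.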

We can compare this to prospect theory. I’ll consider a loss to be twice as painful as a similar gain is pleasurable, and anchor on how much money I’ll get if I don’t answer. I will let the decision weights equal the probabilities, as they are not extreme. I then require:

p_p > \frac{2^q - 1}{1.5 \times 2^q - 1}

This is even less stringent for high q, tending towards 2/3 as q increases.

As prospect theory is descriptive rather than prescriptive this isn't necessarily how you should behave, but may represent how people do behave.
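For comparison, here are the three thresholds side by side – a sketch of the formulas above, not a recommendation:

# Required confidence to answer, q questions past the last safety net.
for q in range(1, 6):
    kelly = 1 - 1 / (2 ** (q + 1) - 1)
    log_utility = q / (q + 1)
    prospect = (2 ** q - 1) / (1.5 * 2 ** q - 1)
    print(q, round(kelly, 3), round(log_utility, 3), round(prospect, 3))

# q=1: 0.667 0.5   0.5      q=3: 0.933 0.75  0.636
# q=5: 0.984 0.833 0.66     Kelly's required odds roughly double per question.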

In the original UK version of WWTBAM, the host used to present the contestant with a cheque for the amount that they had just reached (from question ~11 onwards), before taking it away and saying "but we don't want to give you that" and proceeding to the next question. I used to assume that it was just showmanship; now I wonder whether it was a cunning plan to encourage anchoring on the current amount, making the contestant more likely to guess.

Working through the odds

In the example given, the contestant was 1 question past a safety net, so under the Kelly model should require

p_k > 1 - \frac{1}{2^{2} - 1} = \frac{2}{3}
MJXc-TeX-type-R; src: local('MathJax_Typewriter'), local('MathJax_Typewriter-Regular')} @font-face {font-family: MJXc-TeX-type-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Typewriter-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Typewriter-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Typewriter-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-cal-R; src: local('MathJax_Caligraphic'), local('MathJax_Caligraphic-Regular')} @font-face {font-family: MJXc-TeX-cal-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Caligraphic-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Caligraphic-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Caligraphic-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-main-B; src: local('MathJax_Main Bold'), local('MathJax_Main-Bold')} @font-face {font-family: MJXc-TeX-main-Bx; src: local('MathJax_Main'); font-weight: bold} @font-face {font-family: MJXc-TeX-main-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Main-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Main-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Main-Bold.otf') format('opentype')} @font-face {font-family: MJXc-TeX-main-I; src: local('MathJax_Main Italic'), local('MathJax_Main-Italic')} @font-face {font-family: MJXc-TeX-main-Ix; src: local('MathJax_Main'); font-style: italic} @font-face {font-family: MJXc-TeX-main-Iw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Main-Italic.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Main-Italic.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Main-Italic.otf') format('opentype')} @font-face {font-family: MJXc-TeX-main-R; src: local('MathJax_Main'), local('MathJax_Main-Regular')} @font-face {font-family: MJXc-TeX-main-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Main-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Main-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Main-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-math-I; src: local('MathJax_Math Italic'), local('MathJax_Math-Italic')} @font-face {font-family: MJXc-TeX-math-Ix; src: local('MathJax_Math'); font-style: italic} @font-face {font-family: MJXc-TeX-math-Iw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Math-Italic.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Math-Italic.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Math-Italic.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size1-R; src: local('MathJax_Size1'), 
local('MathJax_Size1-Regular')} @font-face {font-family: MJXc-TeX-size1-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size1-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size1-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size1-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size2-R; src: local('MathJax_Size2'), local('MathJax_Size2-Regular')} @font-face {font-family: MJXc-TeX-size2-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size2-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size2-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size2-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size3-R; src: local('MathJax_Size3'), local('MathJax_Size3-Regular')} @font-face {font-family: MJXc-TeX-size3-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size3-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size3-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size3-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-size4-R; src: local('MathJax_Size4'), local('MathJax_Size4-Regular')} @font-face {font-family: MJXc-TeX-size4-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size4-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size4-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size4-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-R; src: local('MathJax_Vector'), local('MathJax_Vector-Regular')} @font-face {font-family: MJXc-TeX-vec-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-B; src: local('MathJax_Vector Bold'), local('MathJax_Vector-Bold')} @font-face {font-family: MJXc-TeX-vec-Bx; src: local('MathJax_Vector'); font-weight: bold} @font-face {font-family: MJXc-TeX-vec-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Bold.otf') format('opentype')} \hspace{1mm}^ 2/_3">p>2/3 minimum in order to answer (I'll stick with Kelly for the moment).

Let’s say she had assigned probabilities of roughly 50:50:0:0 before she asked the audience. So in order to answer she only needed a single bit of evidence for one answer over the other.

p(E|a) / p(E|b) = [p(a|E) / p(b|E)] × [p(b) / p(a)] = (2/3)/(1/3) × 0.5/0.5 = 2/1

The audience voting 28:48 in favour of option B may well have represented a bit of evidence for B over A, if you don't expect a high salience effect (she hadn't distinguished between the two in her deliberations).

Coming at the same question and evidence, I started with 25:25:25:25 odds on which answer was correct because I had no real clue. In order for one of the probabilities to rise to p > 2/3, I would need 6:1 evidence in favour of that answer over all of the other answers combined.
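As a quick sanity check, a few lines of Python (my own illustration, not part of the original analysis) reproduce both numbers: the 2:1 likelihood ratio needed from a 50:50 prior, and the 6:1 ratio needed from a uniform 25% prior:

    # Likelihood ratio needed to move a prior to a target posterior (odds form).
    def required_lr(prior: float, target: float) -> float:
        prior_odds = prior / (1 - prior)
        target_odds = target / (1 - target)
        return target_odds / prior_odds

    print(required_lr(0.5, 2/3))   # 2.0 -> 1 bit of evidence
    print(required_lr(0.25, 2/3))  # 6.0 -> ~2.6 bits of evidence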

I didn’t really assign any particular salience to any answer before the contestant started talking – none of the answers particularly drew me in and I couldn’t think why they would draw other people in. There might be a small effect with people being more likely to guess the middle 2 answers but not enough for me to adjust confidently.

However, after she ruled out C & D fairly confidently, I was looking at those straight away when the results came in, expecting that if one was more than 10 percentage points higher than the other then that would be fairly good evidence in favour of that answer.

The actual result was about as clear as you could get: C=24% & D=0%. Assuming it is correct and useful, I think it comfortably assigns better than 6:1 odds in favour of C over the other answers.

How confident am I in my model?

However, I didn't have any specific evidence that the model was correct, just my amateur psychology at work. If I let m stand for "the model is true (and I have interpreted the results correctly)", say that my model assigns 90% of its probability mass to option C, and assume that if my model is wrong I still have 25:25:25:25 odds, I can calculate how confident I need to be in my model to achieve p(c) > 2/3.

2/3 < p(c) = p(m)·p(c|m) + p(¬m)·p(c|¬m) = p(m)·(p(c|m) − p(c|¬m)) + p(c|¬m)
⇒ p(m) > (2/3 − p(c|¬m)) / (p(c|m) − p(c|¬m)) ≈ 0.64
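As a minimal numeric check of that threshold (my own sketch; the variable names are mine):

    p_c_given_m = 0.9       # the model puts 90% of its mass on option C
    p_c_given_not_m = 0.25  # if the model is wrong, fall back to uniform 25%

    threshold = (2/3 - p_c_given_not_m) / (p_c_given_m - p_c_given_not_m)
    print(round(threshold, 2))  # 0.64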

So should I have been willing to take the gamble? That depends on whether I thought that the model was more than 64% likely to be accurate.

I would have assigned a prior probability of p(m) ≈ 0.4 based purely on it seeming sensible vs my lack of expert knowledge and the fact that any effect might be smaller than I anticipated. The fact that answer D received 0% of the vote increased my confidence in the model and effect size, but only up to maybe p(m) = 0.5 at best. I think that this counts as my gut instinct roughly matching what my maths has come up with – I shouldn't take the bet but it's a close thing.

Having seen that the model worked on this occasion, I should update:

p(m|c) / p(¬m|c) = [p(c|m) / p(c|¬m)] × [p(m) / p(¬m)] = (0.9/0.25) × (0.4/0.6) = 2.4
⇒ p(m|c) = 2.4/3.4 ≈ 0.71
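The same update in code (again my own sketch, using the numbers above):

    p_m = 0.4  # prior confidence in the model
    posterior_odds = (0.9 / 0.25) * (p_m / (1 - p_m))   # = 2.4
    p_m_given_c = posterior_odds / (1 + posterior_odds)
    print(round(p_m_given_c, 2))  # 0.71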

So now that I've seen the model work on this occasion, I should be willing to bet if a similar situation arises in future. However, if it were 2 or more questions since my last lifeline (or the salience model provided less decisive evidence) I should wait for more evidence for the model before being willing to bet. Again, this roughly matches my intuition; looking back, my prior was possibly a bit high.

Deliberately influencing the audience

I mentioned beforehand that an interesting strategy would be for the contestant to influence the audience deliberately to encourage it to vote in a particular way.

Imagine you were able to get all of the audience members who didn't know the answer to vote for a single option which you knew (or were fairly confident) was wrong. The most straightforward way to influence people would be to pretend that you thought this was the correct answer.

If it works, this should produce a very high vote for that answer, plus a 15-20% vote for the correct answer and a small vote for the remaining 2 options. A low vote on the 2 options you influenced people away from would be good evidence that your influence attempt was successful.

If you get almost everyone in the audience voting for the influenced answer and no spike on any of the other 3, that either means that it's the correct answer after all or that very few people know the actual answer.

I’m not sure how easy it would be to apply this level of influence towards a single answer. I suspect that audience members like to feel as though they are making some form of choice so it would probably be wise to leave open the possibility of at least one other option in your deliberations so that people can feel like they’re deciding on your favoured choice.

This might be made easier if one of the answers is particularly salient anyway. You shouldn’t need to do as much pushing to get more people to choose this answer.

If you have your 50:50 lifeline left then this will help you if your results are inconclusive. I suspect that if one maximised the use of ask the audience and 50:50 combined then it would be a rare occasion that you wouldn't be able to get to at least p > 2/3 in favour of one answer.

If I plan to influence the audience when I use my lifeline, I need to maximise my effect. Looking at Cialdini's 6 principles, I think authority and consensus (a.k.a. social proof) are most likely to be helpful here.

If people are going to be influenced by my statements then they need to believe that I am an authority on the question at hand. This can be done in at least 2 ways prior to asking the audience:

1. Establishing that you have good overall general knowledge

2. Establishing your ability to work through tricky questions to get to the correct answer

In order to persuade people away from some answers and towards others I need to give them a reason for changing. The reason doesn't have to be true, just believable, and I have to be able to come up with it quickly.

As for consensus, I think that when I choose to use my ask the audience I should say “I’m pretty sure the answer is D and that when I see results I’m going to feel like I wasted a lifeline but I just want to be sure as there’s a small chance it might be C”. Even though the audience members don’t know what the consensus is, an expectation of the consensus is created.

My main worry is that people might realise that everyone is going to vote the same way and then try to be helpful by selecting their original thought. However, I would expect the number of people who did this to be relatively low so I hope I'd be safe.

Summary

Superficial readings of ask the audience results are dangerous.

If you're going to use ask the audience and 50:50 on the same question, ask the audience first.

For each additional question past your latest safety net, approximately 1 more bit of evidence is required to justify answering (Kelly). Alternatively, p > q/(q+1) for expected utility. Don't trust your gut.

Watch lots of episodes beforehand to test your ability to predict what people who don't know the answer will guess.

If possible, influence the audience so that you are better able to perform this prediction in your game.

Even if you do all this, it will, at best, get you 1 question further in the quiz - your performance is still dominated by your general knowledge and the luck of the draw as to whether you get the questions which match your areas of knowledge.

Bonus material

Watch out if you play in Russia (from tv tropes):

Audiences of the Russian version are infamous for deliberately giving the wrong answer out of spite, especially to certain aggravating celebrities.

Similarly, either this French audience were really stupid or complete bastards.



Discuss

Drexler on AI Risk

1 февраля, 2019 - 08:11

Boundaries - A map and territory experiment. [post-rationality]

1 февраля, 2019 - 05:08
Published on February 1, 2019 2:08 AM UTC

Original post: http://bearlamp.com.au/boundaries/

This is an experimental investigation of map and territory.

Map and territory (https://wiki.lesswrong.com/wiki/The_map_is_not_the_territory) is a relationship where the map represents the territory. The map is not the territory, that we know.

"Scribbling on the map does not change the territory"

[Image: http://bearlamp.com.au/wp-content/uploads/2019/02/earth.jpeg]

I am in my house, sitting at a table with a picture of planet earth. There's a relationship between the picture and myself, because technically I am in that picture. But also I am looking at that picture and I recognise it as a map of the territory that I live in. There's a boundary between me and the map.

[Image: http://bearlamp.com.au/wp-content/uploads/2019/02/map-img-australia.png]

Now I have a map of the land mass of Australia. I am both in a territory represented by the map, and this map describes me (weakly).

[Image: http://bearlamp.com.au/wp-content/uploads/2019/02/sydney.jpeg]

Now I have a map of my city. There's again the same relationship, two ways: I am in my city, but also my city map is separate from me because it sits on the table in front of me.

Now I have a map (floorplan) of my house.

I am looking at a piece of paper; the map is external to the territory of me walking around my house.

[Image: https://i2.wp.com/bearlamp.com.au/wp-content/uploads/2019/02/sims.jpg?fit=640%2C360]

Now I have a 3D model of my house. It includes the table I'm standing in front of, mini versions of all the maps on the table, and a 3D model of the house itself.

There's a boundary where I am looking at the map and not in the map.

But I've also got a little figurine of myself in my 3D model. My figurine appears to be looking at the mini 3D model of the house that's resting on his table. There's a boundary here, a relationship between me and the model, where I am looking at an external model of myself looking at an external model of myself.

But now I am here. In my head. With an internal map of myself, standing here, looking at myself in the wholeness of my being, and I ask,

"Where is the boundary between myself and the map?"

Now might be a good time to pause and reflect on the exercise before reading on. Obviously I can't make you do that, but I considered ending the whole article here for that effect.

Some Discussion

Friend: Would it be that you are what remains when you turn away from the map? If it's in your mind, then you remain when you stop thinking of the map?

Me: What is the "you" that remains when "you" stop thinking of the map?

Friend: If we define identity the way I think you're pointing at, then the you constantly changes. So, sure, that "you" is no longer there when you turn away from the map.

Me: Yes. From that place, repeating the exercise, the new map now includes that information: "the 'you' always changes". And I can ask the same question: "What is the you that remains separate from the map?"

Existing map-less is very hard. The human brain really likes to put maps around things. I will be thinking, "I am map-less", and then realise that thinking "I am map-less" is a map too. There is a realisation that there is only one real territory (the one we live in), and that it's very hard to exist in the territory and not the map. And a further realisation that everyone else who exists in their maps and not "in the territory" is also genuinely existing in the territory, because maps are in the territory too.

From that place can come an acceptance of anyone and anything as they are: being as their being is, bringing what they bring. Because that's (from my perspective, outside that person) the territory.

I feel like this exercise has the opportunity to generate weird feelings: sometimes confusion, sometimes fear or dizziness, or any number of other experiences. That's the point. The purpose is to enable the experimenter to explore the feelings that have come up. What does that mean for the nature of reality I live in? What's the dizziness trying to help explain to me? I wonder what is going on.

Special mention of the book No Boundary (https://booko.info/9781570627439/No-Boundary) by Ken Wilber of Integral Theory (https://en.wikipedia.org/wiki/Integral_theory_(Ken_Wilber)).



Discuss

Reliability amplification

1 февраля, 2019 - 00:12
Published on January 31, 2019 9:12 PM UTC

In a recent post I talked about capability amplification, a putative procedure that turns a large number of fast weak agents into a slower, stronger agent.

If we do this in a naive way, it will decrease reliability. For example, if…

  • Our weak policy fails with probability 1%.
  • In order to implement a strong policy we combine 10 decisions made by weak agents.
  • If any of these 10 decisions is bad, then so is the combination.

…then the combination will be bad with roughly 10% probability (1 − 0.99¹⁰ ≈ 9.6%).

Although the combination can be more powerful than any individual decision, in this case it is much less reliable. If we repeat policy amplification several times, our failure probability could quickly approach 1, even if it started out being exponentially small.
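To make the compounding concrete, here is a small Python sketch (mine, not from the original post) of how a per-decision failure rate grows under repeated naive amplification:

    def combined_failure(eps: float, decisions_per_step: int, steps: int) -> float:
        # Each step combines `decisions_per_step` decisions; any bad
        # decision spoils the whole step.
        for _ in range(steps):
            eps = 1 - (1 - eps) ** decisions_per_step
        return eps

    print(combined_failure(0.01, 10, 1))  # ~0.096
    print(combined_failure(0.01, 10, 2))  # ~0.63
    print(combined_failure(0.01, 10, 3))  # ~1.0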

Complementary to capability amplification is reliability amplification: given a policy A that usually works, can we implement a policy A⁺ that works significantly more reliably?

To be slightly less imprecise (but still quite crude):

  • Given a distribution A over policies that is ε-close to a benign policy for some ε ≪ 1, can we implement a distribution A⁺ over policies which is δ-close to a benign policy of similar capability, for some δ ≪ ε?

When I say that A is ε-close to a benign policy, I mean that there is a benign distribution over policies which has total variation distance at most ε from A. We can view this as a simple mixture between a single aligned policy that is chosen with probability (1−ε) and an adversarial policy that is chosen with probability ε.

I will move freely between talking about “policies” and “distributions over policies;” we can turn a distribution into a single policy by averaging.

The notion of “implement” here is exactly the same as for capability amplification: we can use any number of implementations of A and whatever other resources are at our disposal, and must compute A⁺’s responses to a sequence of observations, within some time limit such as a day.

Reliability amplification via deliberation

I think there is one very natural way to do reliability amplification — if some variant of this idea doesn’t work, I’m not sure what else to try.

As a warm-up, let’s consider the setting where the agents want to make a single decision rather than a sequence of decisions — we’ll return to the sequential case soon.

We sample three agents A, B, and C from the given distribution. We ask them to talk for a while and then each propose an action. Then we ask them to do approval voting amongst the three options, and randomly pick from the maximally-approved-of proposals.

If all three agents are aligned, then this proposal is clearly going to produce an aligned action. If two of the agents behave adversarially then we are clearly out of luck, but that happens only with probability O(ε²), so we can basically ignore it.

The interesting case is when there is one adversarial agent and two aligned agents. If we threw up our hands and picked randomly in this case, then we would have a failure probability of ε. So the question is: can we do better than that?

The obvious hope is that aligned agents can do better-than-chance at recognizing each other / at recognizing good actions. If they can get any non-negligible advantage, then we can get a non-negligible reduction in error probability (and by iterating the procedure we can potentially get large reductions).
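As a toy sketch of this single-decision scheme (my own Python; `sample_agent`, `propose`, and `approves` are hypothetical interfaces, not part of the post):

    import random

    def amplify_once(sample_agent):
        agents = [sample_agent() for _ in range(3)]
        proposals = [agent.propose() for agent in agents]
        # Each agent approves whichever proposals it considers acceptable.
        approvals = [sum(agent.approves(p) for agent in agents) for p in proposals]
        best = max(approvals)
        # Randomly pick among the maximally-approved-of proposals.
        return random.choice([p for p, a in zip(proposals, approvals) if a == best])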

The sequential case

So far we have talked about making a single decision. The problem is more challenging when we need to make a sequence of decisions.

We can try to simply generalize the voting approach, running a new vote for each action. To see the difficulty, suppose that the optimal policy looks as follows:

  1. Generate some hard-to-transfer internal state X. Think of X as something like an intuitive vision for how to approach the situation.
  2. Use X to generate a first action.
  3. After making an observation, use X to generate a second action. It is important that the same state X be used in steps [2] and [3].

Now suppose that in the first step the three agents all follow this protocol correctly. They arrive at three different proposed actions, which are backed up by three different visions. Now we have to pick one of these actions, effectively committing ourselves to a single agent. This destroys our ability to ensemble effectively in future steps: if the chosen agent fails then we are out of luck, since the other two agents don’t have the vision needed to follow-through on that action.

To get around this problem we need to count on a stronger property: whatever kind of private state X an agent could generate, the agents need to be able to jointly develop a “consensus” private state which serves the same function. Then if one agent fails, the other agents can also use the consensus private state in order to continue acting intelligently.

(It would also be OK if only pairs of agents could establish such a consensus state, as long as the third agent was able to arbitrate between the two in the event of a future failure.)

Implementation

As of commit 1cbaae, this approval-voting scheme is part of my ALBA implementation on GitHub.

It is implemented in amplify.reliability.amplify_reliability, and is applied as part of amplify.amplify directly prior to applying HCH.

As a result I’ve moved one issue from the “FIXME: will fail” column to the “TODO: needs improvement” column.

Three differences with redundant computation

There is an easy recipe for fault-tolerant computation: compute the result 3 times, and take the majority vote. This increases the computational cost by a factor of 3, but decreases the error rate from ε to O(ε²).
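For reference, the majority-vote arithmetic (my own illustration): the vote fails only if at least two of the three computations fail.

    def majority_failure(eps: float) -> float:
        # Exactly two computations fail, or all three do.
        return 3 * eps**2 * (1 - eps) + eps**3

    print(majority_failure(0.01))  # ~0.0003, i.e. O(eps^2)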

Our reliability amplification scheme is considerably more subtle, and if we were to formalize the reliability amplification problem as stated we would find it impossible. This section touches on the most important additional subtleties.

Building consensus

Redundant computation normally relies on the existence of a unique correct answer on which all uncorrupted computations will agree. In our setting, there may be a range of “correct” behaviors. This introduces the additional challenge of being able to agree on a particular correct answer (and to build a consensus version of the hidden state underlying that answer).

Without some assumption of this form, reliability amplification is impossible. For example, suppose that there are 99 possible good answers and 1 bad answer. Our distribution A picks at random amongst the 100 possible answers. If the copies of A with good answers are completely unable to agree with each other, then there may simply be no way to tell the good answers from the bad answer. In this case, we can never reduce the error probability below 1%.

Balanced abilities

In order for our scheme to work we need to assume that the abilities of all of the agents are roughly balanced. I think this is probably necessary for any scheme to work. If you have an ε chance of a malicious superintelligence and a (1−ε) chance of a friendly human, you are probably out of luck — in a deliberative process involving two humans and a superintelligence, the superintelligence is likely to have disproportionately much influence over the outcome even if it is wrong.

Our particular scheme fails outright in the context of powerful adversaries. In general there is a 3ε chance that at least one of the agents is unaligned. We may be able to prevent the unaligned agent from causing the group to adopt a really bad answer. But even in the best case, the unaligned agent will probably be able to tilt the balance between the two options proposed by the aligned agents, allowing it to increase the probability of failure. If the unaligned agent is no smarter than the aligned agents, then the aligned agents can counteract this effect by preferentially picking the better action. But if the unaligned agent is much smarter than the aligned agents, then it can increase the probability of failures which are too subtle for the aligned agents to anticipate, and the aligned agents have no way to decrease the probability of these failures.

Less time to think

Traditionally, redundant computation increases costs by a constant factor. But in the context of reliability amplification, we also make life harder for each agent: in addition to picking an action, they need to spend time arguing about actions and critiquing each others' choices. This gives each agent less time to think about their action. It is not clear whether we can offset this effect.

Relation to capability amplification and ALBA

Necessity

Reliability amplification seems completely necessary if we want to do something like ALBA using imitation or imitation+RL, since otherwise reliability will fall with each iteration of capability amplification. Of course we could have a “robust” capability amplification procedure which does not decrease reliability. But designing such a procedure is strictly harder than reliability amplification. So I think it makes a lot of sense to split up the problem into two pieces.

If working with approval-direction and pure RL, there is actually a form of reliability amplification already baked in: if the overseer fails with probability 1%, then this only changes the reward function by 0.01, and an RL agent should still avoid highly undesirable actions. That said, reliability amplification may still be necessary in a pure RL setup if we can't solve the RL problem to arbitrary precision. In that case we may always have some non-negligible probability of making a bad decision, and after capability amplification this probability could become too large.

Balancing capability/reliability

Reliability amplification decreases our agent’s capability but increases its reliability. Capability amplification increases capability and decreases reliability.

The hope is that we can somehow put these pieces together in a way that ends up increasing both reliability and capability.

If our reliability amplification step achieves a superlinear reduction in error probability from ε to o(ε), and our capability amplification causes a linear increase from ε to Θ(ε), then this seems almost guaranteed to work.

To see this, consider the capability decrease from reliability amplification. We know that for large enough N, N iterations of capability amplification will more than offset this capability decrease. This N is a constant which is independent of the initial error rate ε, and hence the total effect of N iterations is to increase the error rate to Θ(ε). For sufficiently small ε, this is more than offset by the ε → o(ε) reliability improvement from reliability amplification. So for sufficiently small ε we can increase both reliability and capability.

A reduction from ε to O(ε²) is basically the “best case” for reliability amplification, corresponding to the situation where two aligned agents can always reach correct consensus. In general, aligned agents will have some imperfect ability to reach consensus and to correctly detect bad proposals from a malicious agent. In this setting, we are more likely to have an ε → O(ε) reduction. Hopefully the constant can be very good.

There are also lower bounds on the achievable reliability ε derived from the reliability of the human and of our learning procedures.

So in fact reliability amplification will increase reliability by some factor R and decrease capability by some increment Δ, while capability amplification decreases reliability by some factor R′ and increases capability by some increment Δ′. Our hope is that there exists some capability amplification procedure with Δ′/log(R′) > Δ/log(R), and which is efficient enough to be used as a reward function for semi-supervised RL.
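A toy numeric illustration of this condition (my own, purely hypothetical numbers):

    import math

    R, delta = 100.0, 3.0    # reliability amp: error /100, capability cost 3
    Rp, deltap = 2.0, 1.0    # capability amp: error x2 per step, gain 1

    # Capability steps affordable per reliability step, keeping error flat:
    n = math.log(R) / math.log(Rp)  # ~6.6
    print(n * deltap > delta)       # True: net capability gain per cycle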

I think that this condition is quite plausible but definitely not a sure thing; I’ll say more about this question in future posts.

Conclusion

A large computation is almost guaranteed to experience some errors. This poses no challenge for the theory of computing because those errors can be corrected: by computing redundantly we can achieve arbitrarily low error rates, and so we can assume that even arbitrarily large computations are essentially perfectly reliable.

A long deliberative process is similarly guaranteed to experience periodic errors. Hopefully, it is possible to use a similar kind of redundancy in order to correct these errors. This question is substantially more subtle in this case: we can still use a majority vote, but here the space of options is very large and so we need the additional step of having the correct computations negotiate a consensus.

If this kind of reliability amplification can work, then I think that capability amplification is a plausible strategy for aligned learning. If reliability amplification doesn’t work well, then cascading failures could well be a fatal problem for attempts to define a powerful aligned agent as a composition of weak aligned agents.

This was originally posted here on 20th October, 2016.

The next post in this sequence will be 'Security Amplification' by Paul Christiano, on Saturday 2nd Feb.



Discuss

The role of epistemic vs. aleatory uncertainty in quantifying AI-Xrisk

31 января, 2019 - 09:13
Published on January 31, 2019 6:13 AM UTC

(Optional) Background: what are epistemic/aleatory uncertainty?

Epistemic uncertainty is uncertainty about which model of a phenomenon is correct. It can be reduced by learning more about how things work. An example is distinguishing between a fair coin and a coin that lands heads 75% of the time; these correspond to two different models of reality, and you may have uncertainty over which of these models is correct.

Aleatory uncertainty can be thought of as "intrinsic" randomness, and as such is irreducible. An example is the randomness in the outcome of a fair coin flip.

In the context of machine learning, aleatoric randomness can be thought of as irreducible under the modelling assumptions you've made. It may be that there is no such thing as intrinsic randomness, and everything is deterministic, if you have the right model and enough information about the state of the world. But if you're restricting yourself to a simple class of models, there will still be many things that appear random (i.e. unpredictable) to your model.

Here's the paper that introduced me to these terms: https://arxiv.org/abs/1703.04977

Relevance for modelling AI-Xrisk

I've previously claimed something like "If running a single copy of a given AI system (let's call it SketchyBot) for 1 month has a 5% chance of destroying the world, then running it for 5 years has a 1 − 0.95⁶⁰ ≈ 95% chance of destroying the world". A similar argument applied to running many copies of SketchyBot in parallel. But I'm suddenly surprised that nobody has called me out on this (that I recall), because this reasoning is valid only if this 5% risk is an expression of purely aleatory uncertainty.

In fact, this "5% chance" is best understood as combining epistemic and aleatory uncertainty (by integrating over all possible models, according to their subjective probability).

Significantly, epistemic uncertainty doesn't have this compounding effect! For example, you could have two models of how the world could work: one where we are lucky (L) and SketchyBot is completely safe, and another where we are unlucky (U) and running SketchyBot even for 1 day destroys the world (immediately). If you believe we have a 5% chance of being in world U and a 95% chance of being in world L, then you can roll the dice and run SketchyBot and not incur more than a 5% Xrisk.
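A small sketch of this toy two-model world (my own code; the numbers are from the post):

    p_unlucky = 0.05  # prior probability of world U

    # Treating the 5% as an aleatory, per-month risk compounds over 60 months:
    print(1 - 0.95 ** 60)  # ~0.95

    # Treating it as purely epistemic does not compound: running longer
    # never pushes the risk above the prior probability of world U.
    print(p_unlucky)       # 0.05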

Moreover, once you've actually run SketchyBot for 1 day, if you're still alive, you can conclude that you were lucky (i.e. we live in world L), and SketchyBot is in fact completely safe. To be clear, I don't think that absence of evidence of danger is strong evidence of safety in advanced AI systems (because of things like treacherous turns), but I'd say it's a nontrivial amount of evidence. And it seems clear that I was overestimating Xrisk by naively compounding my subjective Xrisk estimates.

Overall, I think the main takeaway is that there are plausible models in which we basically just get lucky, and fairly naive approaches at alignment just work. I don't think we should bank on that, but I think it's worth asking yourself where your uncertainty about Xrisk is coming from. Personally, I still put a lot of weight on models where the kind of advanced AI systems we're likely to build are not dangerous by default, but carry some ~constant risk of becoming dangerous for every second they are turned on (e.g. by breaking out of a box, having critical insights about the world, instantiating inner optimizers, etc.). But I also put some weight on more FOOM-y things and at least a little bit of weight on us getting lucky.



Discuss

Applied Rationality podcast - feedback?

31 января, 2019 - 04:46
Published on January 31, 2019 1:46 AM UTC

My goal is to help non-rationalists become rationalists, and provide exercises for existing rationalists to hone their skills, in the form of audio/video episodes.

'Rationally Speaking' and 'The Bayesian Conspiracy' are lovely, but they don't serve the function I have in mind.

Since I am an amateur rationalist, and possibly the least intelligent person in this community, it would be ideal for someone else to run this (preferably CFAR themselves, or at least someone who has attended). If there is an existing project like this, please point me to it so I can support it. If not, I at least have the drive to get the ball rolling.

I would probably start by using the 'Hammertime' sequence as material, breaking it down a bit so the average layperson can understand it more easily. This would require very little knowledge on my part, and give me time to maybe find a replacement host, cohost, or a queue of interviewees to assist in future episodes.

My question: is there any feedback you can give on this idea?



Discuss

Wireheading is in the eye of the beholder

30 января, 2019 - 21:23
Published on January 30, 2019 6:23 PM UTC

tl;dr: there is no natural category called "wireheading", only wireheading relative to some desired ideal goal.

Suppose that we have built an AI, and have invited a human H to help test it. The human H is supposed to press a button B if the AI seems to be behaving well. The AI's reward is entirely determined by whether H presses B or not.

So the AI manipulates or tricks H into pressing B. A clear case of the AI wireheading itself.

Or is it? Suppose H was a meddlesome government inspector that we wanted to keep away from our research. Then we want H to press B, so we can get them out of our hair. In this case, the AI is behaving entirely in accordance with our preferences. There is no wireheading involved.

Same software, doing the same behaviour, and yet the first is wireheading and the second isn't. What gives?

Well, initially, it seemed that pressing the button was a proxy goal for our true goal, so manipulating H to press it was wireheading, since that wasn't what we intended. But in the second case, the proxy goal is the true goal, so maximising that proxy is not wireheading, it's efficiency. So it seems that the definition of wireheading is only relative to what we actually want to accomplish.

In other domains

I similarly have the feeling that wireheading-style failures in value learning, low impact, and corrigibility also depend on a specification of our values and preferences - or at least a partial specification. The more I dig into these areas, the more I'm convinced they require partial value specification in order to work - they are not fully value-agnostic.



Discuss

Masculine Virtues

30 января, 2019 - 19:03
Published on January 30, 2019 4:03 PM UTC

Cross-posted from Putanumonit (where there's already a good discussion going).

Boys Will Be Boys

Have you seen the Gillette ad? Everyone’s seen the Gillette ad. And after my last post on masculinity, everyone’s been asking me what I think of the Gillette ad.

Well, I used to shave with Gillette and I’ve dumped them… back in 2014 when I realized that Dollar Shave Club sells basically the same razors for $1 each.

And the ad? Eh, it’s fine.

Cringeworthy at times, but fine.

Gillette is a division of a consumer products company selling bathroom items. No one is forced to watch their ads or use their razors. Clay Routledge put it brilliantly: we are living in an era of woke capitalism in which companies pretend to care about social justice to sell products to people who pretend to hate capitalism. Woke capitalism is silly but it gives Gillette customers what they want, which is all you can expect of a corporation.

In contrast, the APA is a professional organization of health care providers, writing guidelines for practicing therapists who deal with vulnerable men who come to them for help. The standards are quite different.

The content is quite different also.

Here is a list of things the APA considers "harmful", under the umbrella term of "traditional masculinity":

  • Stoicism.
  • Competitiveness.
  • Aggression.
  • Dominance.
  • Anti-femininity.
  • Achievement.
  • Adventure and risk.
  • Violence.
  • Providing for loved ones (if you’re a black man).

Here’s a list of things the Gillette ad is against:

  • A mob chasing a teenager.
  • Texting someone “FREAK!!!”
  • Old TV shows.
  • Catcalling and butt-grabbing.
  • Patronizing your employees.
  • Six-year-olds fighting.
  • Chanting “boys will be boys” in unison.
  • Sexual assault and sexual harassment.

What do the two lists have in common? Violence, which is never the answer, is the only answer. Find the traditional man closest to you and ask him how many things on Gillette's list he approves of; it's not going to be many. "Traditional" men tend to complain that it's no longer OK to hold doors open for women or take their kids hunting, not that in the good ol' days you could bully people over text or grope ladies on the street.

Here are the things Gillette is in favor of:

  • Terry Crews.
  • Accountability.
  • Demonstratively protecting women from other men.
  • Fatherhood.
  • Using your superior strength to break up fights between smaller males.
  • Teaching all of the above to your son.

Those are remarkably traditional male traits and behaviors, in the sense that they are present and praised among men in almost every modern and pre-modern society. With the exception of Mr. Crews, all of those predate the human species.

Gillette’s ad is in no way against traditional masculinity. The list of behaviors they come out against is referred to as toxic masculinity, including by Gillette themselves.

Those who hate men or who gain status from pretending to do so will continue to conflate masculinity with the terrible (and not particularly masculine) behaviors portrayed in the first half of the ad. Toxic/traditional is a perfect setup for motte-and-bailey: "I like extreme sports. – Ah, a traditional male. I bet you grope women on the subway." But it's equally toxic to conflate Gillette with the APA's attack on traditional manhood.

Gillette’s Best Man

If I had to pick a role model of masculinity I would name Roger Federer. Federer is the best tennis player ever among men, the best gentleman among tennis players, philanthropist, father of four and husband to one.

Federer is also the best exemplar of the not-so-subtle distinction between toxic masculinity and traditional masculinity. Roger has been a Gillette spokesperson for more than a decade, and he also makes an absolute mockery of the APA list.

Stoicism? Federer won tournaments playing through injury, on sweltering Melbourne days and chilly London nights. While the best female tennis player in history garnered a reputation for furious outbursts at umpires and fans, Federer is legendary for never losing his cool.

Violence? Ok, even Roger has broken a racket or two in his career (so have I).

Competitiveness? Among the multitude of tennis records held by Federer are the 10 times he came back from two sets down to win a match. I was in the stands for #9 in New York when Federer outlasted Gael Monfils playing one of the best matches of his life. Even after Roger lost the first two sets while hitting 26 unforced errors and being outworked by the athletic Frenchman, not a single person in the crowd doubted Federer’s ability to raise his game and ultimately triumph.

Providing for loved ones? Yes, even for black boys.

Aggression and dominance? When I was young and Federer always won, I used to root against him (because he always won). The same pattern would play out in dozens of Federer matches: the game would proceed evenly until something minor would happen that would shake the confidence of Federer’s opponent a tiny bit. Perhaps the opponent would lose a break point opportunity, or miss an easy shot. And then Roger would transform into Darth Federer: a ruthless predator who would pounce on an opponent’s single moment of weakness, breaking his serve and destroying his will to compete in the space of 5 or 10 minutes.

And yet, the other players on tour would revere Roger, much more than they did the equally talented Rafael Nadal or Novak Djokovic. The only tennis award voted on by the players themselves is the ATP sportsmanship award; Federer has won it 13 times.

What is it that Federer does so well and masculinity-haters resent? Climbing hierarchies. When Federer was #1, he wasn't just first per the arcane schema of ATP ranking points. He was the best tennis player in the eyes of fans, journalists, sponsors, and, importantly, his opponents. #1 takes tennis skill, but it also takes stoicism, competitiveness, aggression, and dominance.

And I suspect that it’s hierarchies that those who take issue with the above-listed traits are really against.

If you say "hierarchy" three times in front of a mirror you summon the spirit of Jordan Peterson.

Who Hates Hierarchies?

There’s a lot of bitching online about “the war on men”, most of it tedious. Group X thinks that men should have lower status, some guy says ‘no, fuck you!’, more at 10. Jordan Peterson and Jonathan Haidt often get lumped in with that, but they are saying something entirely distinct. Peterson and Haidt are saying that there is a war on certain traits which are commonly coded as masculine: self-reliance, resilience, self-improvement through facing adversity, competence. They describe how parents, schools, and society as a whole discourage those traits, particularly in young people, particularly in young boys.

When I first encountered their writings, I found them too alarmist. But after reading the APA guidelines I remembered that Peterson and Haidt are both psychologists, the former having practiced clinical psychology for twenty years. They saw this coming before everyone else.

What does a “war on competence” look like? Think of someone trying to get better at their work to get promoted, working on their writing to build an audience for their blog, or practicing a sport to rise in the rankings. Building competence doesn’t happen by itself. It requires focusing on a goal, taking on challenges, dealing with discomfort, risking failure, and overcoming problems on your own. Building competence (and getting recognized for it) is a crucial component of well being for all humans.

Of course, the APA doesn't mention this. All they have to say on the behaviors that build competence is:

Research suggests that socialization practices that teach boys from an early age to be self-reliant, strong, and to minimize and manage their problems on their own (Pollack, 1995) yield adult men who are less willing to seek mental health treatment.

If society values a particular skill or achievement (like work, blogging, or tennis) a competence hierarchy will form around it. That’s what it means for society to value a skill: those who display it get social rewards and status. But of course, not every hierarchy is a competence hierarchy. Those who got the rewards have a strong interest in removing the competence aspect, making sure that the goodies keep coming to them and not to more competent challengers.

This is why, according to Jordan Peterson, societies need both conservatives and progressives:

There’s space and necessity for a constant dialogue between the left and right. […]
You have to move forward towards valued things, so you have to have a value hierarchy. There has to be hierarchy because one thing has to be more important than another, or you can’t do anything. […]
No matter what you’re acting out, some people are way better at it than others. Doesn’t matter if it’s basketball or hockey or plumbing or law, as soon as there’s something valuable and you’re doing it collectively there’s a hierarchy.
So then what happens is the hierarchy can get corrupt and rigid and then it stops rewarding competence and it starts rewarding criminality and power. The right-wingers say that we really need to abide by the hierarchies and the left-wingers say: wait a second, your hierarchy can get corrupt and also puts a lot of dispossessed people at the bottom. And that’s not only bad for the dispossessed people, it actually threatens the whole hierarchy.

The progressive project is often about disrupting corrupt hierarchies, and it has done so successfully many times. But times change, and so do the requirements for identifying which hierarchies are broken.

In 1942, the New York Times staff was composed entirely of goofy white dudes. It’s clear that being a goofy white dude is not commensurate with journalistic merit, and the composition of the staff changed. Today, the New York Times staff is a multi-ethnic and gender-diverse group of graduates from a small handful of elite colleges who share a political ideology and worldview. Is this a corrupt hierarchy of journalism or a meritorious one? This is a much harder question to answer.

Instead of dealing with hard questions, it's easier to reuse the tricks that worked in the past, like saying that any majority-male hierarchy is nefarious and privileged. The APA was quick to point out that 95% of Fortune 500 CEOs are men. So are 80% of Google engineers and 80% of top-grossing actors. Also 99% of HVAC mechanics, but only 2% of dental hygienists. Are those examples of privilege or of competence?

The answers to all of the above are “almost certainly both, it’s complicated”. But this answer doesn’t help you climb the hierarchy of progressive politics. To maintain that those are all examples of pure male privilege, one has to completely deny the role of competence. As people on the left compete to demonstrate their commitment to dismantling privilege, the entire concept of competence gets wholly ignored and the pursuit of it is seen as pathological. I think that this impulse is at the root of the “war on competence”.

(The opposite happens to conservatives, who call every blatant example of privilege a meritocracy. Consider the belief that multimillion-heir Donald Trump is a self-made man.)

The traditionally masculine [1] traits are those required to climb hierarchies of competence: competitiveness, physical and emotional resilience, adventurous risk-taking, perseverance, the drive to achieve and overcome. Like all traits, they become vices when pushed too far. The most competitive basketball player of all time was a notorious jerk. People “kill themselves” in demanding careers or literally kill themselves running triathlons while ignoring signs of pain and danger. Entrepreneurs bet big on themselves and lose, or sacrifice what they can’t afford to in order to win.

But ascending hierarchies of competence is vital even for the 99% of us who will not become elite athletes, CEOs, or superstars. Improving at a valuable skill is meaningful, and rising through the ranks provides validation of that meaning. It brings self-confidence and fulfillment. It demonstrates your worth to others and to yourself. When developed well, the masculine traits are virtues independent of any competition. They enable people to simply live better in the world, enjoying success as a well-deserved reward rather than a fleeting stroke of luck, and seeing setbacks as challenges rather than tragedies.

How do young people learn to develop masculine traits into masculine virtues? Schools and media are two of the institutions that are tasked with teaching young people, but those two institutions are among the most deeply entrenched in the progressive ideology that rejects competence and sees masculine traits as negative. You can turn to parents or friends, but not everyone has good role models around them. You can listen to a Jordan Peterson lecture, but he’s liable to ramble about Jesus for hours on end.

Or, you can turn on the TV and watch some sports, and then sign up for a local rec league.

There’s nothing anti-feminine about masculine virtues.

What Sports Taught Me

I hold a lot of opinions that are hugely controversial outside the rationalist community but widely shared within it: that self-improving AI is an existential threat, that status seeking drives most of social behavior, that you should correct for multiple hypothesis testing. I hold one opinion that is hugely controversial among rationalists and is unremarkable everywhere else: that the three hours I spent watching soccer last Saturday were time well spent.

I want to write one day about the beauty of sports as a deep and complex art form and on the link between watching professional athletes and one’s own physical development. But sports are not just entertainment, they’re a human activity built on the values of sportsmanship, and those values are worth paying attention to.

1. Protecting the game is more important than winning

There’s a big difference between fans of competing political parties and of competing NBA teams. The former see only conflict in everything they care about. But the latter have something in common: their love of basketball. For this reason, almost all fans want their team to win fairly, and not by sabotaging opponents or bribing referees. Winning an NBA game is pointless if you destroy the NBA by cheating.

Sports fans recognize that the rules of the game are paramount. Not all the rules are written, of course, and there’s room to push the boundary. But ultimately the participants in the game establish collectively what is cheating and what is fair play, and they’re quick to punish cheaters.

Contrast this with journalists cannibalizing their own industry by replacing objective reporting with clickbait and scandal. Companies like Gawker Media took pride in destroying journalism norms for page views. And for a while, Gawker “won” the competition for eyeballs and attention. Now Gawker is gone, and the entire industry is in a death spiral.

2. Opponents are not enemies

A corollary to #1: the goal of sports is to outperform your opponent, not to destroy them. Even MMA fighters (for the most part) look to outfight their opponent in the cage, not to harm or humiliate them. At the end of the match, they are colleagues again.

The opposite is true in culture war and politics. People spend all their effort sticking it to the outgroup: getting someone silenced, banished, fired, ridiculed. Whether this actually helps your own cause or the groups you claim to fight for is an afterthought. The 35-day government shutdown harmed both Republican and Democratic voters, while both Trump and the House Democrats seemed to care more about making sure the other loses than about helping their constituents.

Sports fandom is a channel for tribal impulses, but a largely benign one. Few fans and even fewer athletes forget the humanity of the person they compete against and the respect they’re owed. Outside of sports, few seem to remember that.

3. It matters how good you are today, not what you did yesterday

Many people react to accolades and achievements by lowering their own standards. Think of an academic wasting their tenure on prestige squabbles instead of exploring bold ideas, or anyone on Twitter with a blue check next to their name.

In sports, the opposite is true. Winning a title grants you accolades, but it makes the road tougher in the future. Opponents will learn your strengths and weaknesses, fans will expect more of you. Roger Federer’s past success doesn’t earn him a pass, it just guarantees that every young opponent tries to play the game of their life against him.

An achievement can be a temptation to rest on your laurels or an opportunity to raise your game further. Our instincts push us toward the former; sports teach us the latter.

4. You will get hurt. That’s OK

In a lifetime of playing soccer, I suffered bruised shins, twisted ankles, balls to the face, balls to the balls, elbows to the ribs, and a torn calf muscle. I also learned that none of the above is a big deal, certainly nothing worth sacrificing something as enjoyable as playing soccer over. If you watch sports you see athletes get hurt and recover all the time, but you almost never hear them wish they hadn’t started in the sport in the first place.

There are many fun things we can do with our bodies. The most fun involve some risk of pain and harm: snowboarding, getting tattoos, climbing trees, having kids, lifting, BDSM, soccer, cliff jumping, punch bug. Sports provide exposure to physical risks, letting you decide which activities are worth the bruises.

It’s possible to live life bruise-free, but I’m not sure you can call that “living”.

5. You will lose a lot. That’s OK

I noticed a strange thing recently: almost all my rationalist friends who are into sports also play competitive card games like Magic: The Gathering, Hearthstone, and Artifact. After much cajoling, I decided to jump in. And then it took me a while to get used to all the losing.

Most single-player video games, which are what I played before, are balanced to let the player “win” 80-90% of the time. Dark Souls aside, when a single-player game presents you with a challenge, you can confidently expect to deal with it. Movies, adventure books, and single-player games often rely on the trope of “succeeding against all odds”, and yet the odds are very much stacked in the protagonist’s favor.

But in competitive games, you get pwned. A lot [2]. In fact, in games like Hearthstone, you will win roughly 45-50% of your games no matter how good you are. If you work hard at it, you will win 55% of your games for a short while before going back down to 45%, but with a higher rank number next to your name.
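To see why the ladder behaves this way, here is a minimal sketch of rating-based matchmaking. This is not Hearthstone’s actual system; the Elo formula, K-factor, and skill numbers are all illustrative assumptions.

```python
import random

# Sketch: an Elo-style matchmaker pairs you against opponents near
# your current rating, so as your rating rises, so does opponent
# strength, and your win rate gets pulled back toward 50%.

def win_prob(a, b):
    # Standard Elo expected-score formula.
    return 1 / (1 + 10 ** ((b - a) / 400))

random.seed(0)
my_skill, my_rating = 1600.0, 1000.0  # better than your starting rank
wins = 0
for game in range(1, 5001):
    opp = my_rating + random.gauss(0, 50)  # matched near your rating
    expected = win_prob(my_rating, opp)    # matchmaker's belief
    if random.random() < win_prob(my_skill, opp):  # true outcome
        wins += 1
        my_rating += 16 * (1 - expected)
    else:
        my_rating -= 16 * expected
    if game % 1000 == 0:
        print(f"game {game}: rating {my_rating:.0f}, "
              f"cumulative win rate {wins / game:.1%}")
# Early games are easy wins; once rating catches up to true skill,
# new games are coin flips. The improvement shows up as a higher
# rank number, not a higher win rate.
```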

In sports, the odds are even tougher. Each year 32 NFL teams compete for a single trophy, which means that fans of 97% of football teams will not celebrate at the end of the year. Sometimes, a team’s season ends through no fault of its own: a bounce of the ball, a coin flip, a blown call.

But that’s how life is. Achieving anything meaningful is hard and entails a lot of failure on the way. For NFL fans, as for everyone, it is important to take joy and pride in small achievements and marginal improvements along the way. And as for losses:

I wish you bad luck, again, from time to time so that you will be conscious of the role of chance in life and understand that your success is not completely deserved and that the failure of others is not completely deserved either. And when you lose, as you will from time to time, I hope every now and then, your opponent will gloat over your failure. It is a way for you to understand the importance of sportsmanship.
Justice John Roberts (h/t Slarphen for the quote attribution)

6. In the end, it’s all up to you

Chance, bad calls and all the rest play an important role in deciding the outcomes of sports events, but sports fans ultimately have little patience for those who shift blame and responsibility. No one wants to litigate old grievances once the name is engraved on the trophy and a new season starts.

While sports teach us that luck plays a role in outcomes, they also train us to behave as if it doesn’t. The team that benefited from a lucky bounce was good enough to be within a single bounce of victory; the team that lost wasn’t good enough to ensure a margin of victory. Winners rarely apologize for luck, and losers are mocked if they complain about it.

Many institutions send the opposite message. They say: if you failed, it’s not your fault. It was done to you, taken from you. The system will make it right and fix the injustice; all you must do is surrender your life to the system.

Assigning responsibility for outcomes to your own actions is called “internal locus of control” in psychology. It is associated with a need for achievement, and also with a lower incidence of depression. The latter result comes from a study published by the APA in 1988, before it was trying to cure men of manliness.

The lessons of sports are useful and important, but it’s not enough to read about them. Like all virtues, they take time to internalize, by observing them in role models and practicing them in your own life. Sports are full of role models, both men and women, who have honed those traits into virtues. They are also full of cautionary examples of athletes who took them too far.

When one side of the culture war spectrum rejects all masculine traits and the other side uncritically glorifies them, watching Federer play a tennis match is the balanced meal that your soul needs.

Footnotes

[1] I am basically using “masculine traits” to mean “traits for climbing competitive hierarchies”.

This is not an arbitrary definition. Males of many species have a much higher tendency than females to measure themselves against other males and arrange themselves in a hierarchy. The root cause of this is that the reproductive prospects of females are relatively equal, while those of males are highly varied – males need to prove their worth in a hierarchy to get a chance to mate.

If you don’t buy the evolutionary argument, it’s not important to the main point I’m making. Consider my use of “masculine traits” a simple shorthand for “hierarchy-climbing traits”.

[2] Artifact is particularly brutal for starting players. It’s hugely complex with barely a tutorial, the feedback loops are long which makes it harder to learn quickly, and the matchmaking will pit you against 14-year-olds from Slovakia who will drink your blood.

It does become very rewarding after you spend the time learning the game. There’s nothing quite like edging out the opponent in one lane with a brilliant combination of cards and being cursed at in Slovak. You can improve via phantom drafts, or by finding me on Steam for a casual match; my username is “Putanumonit”.




Book Trilogy Review: Remembrance of Earth’s Past (The Three Body Problem)

January 30, 2019 - 04:10
Published on January 30, 2019 1:10 AM UTC

Epistemic Status: Stuff that keeps not going away so I should write it up I suppose. Brief spoiler-free review, then main talk will be spoilerific.

Spoiler-Free Review

I read the trilogy a few months ago, on general strong reviews and lack of a better known science fiction option I hadn’t already read.

I was hoping to get a Chinese perspective, some realistic physics (as per Tyler Cowen’s notes) and a unique take on various things. To that extent I got what I came for. The book felt very Chinese, or at least very not American. The physics felt like it was trying far harder than most other science fiction, and the consequences are seriously explored. Its take on that and many other things felt unique compared to other books I’ve read, in ways I will discuss in the main section.

What I didn’t feel I got was a series of high enough quality to justify its awards and accolades, or to let me fully recommend reading it to others. It’s not bad, it has some great moments and ideas and I don’t mind having read it, but I was hoping for better. That’s fine. That is probably mostly my expectations getting too high, as I can’t point to (in the limited selection of things I’ve read) recent science fiction I think is even as good. As with other genres, “read mostly old books” is wise advice that I follow less than I should.

It is a reasonable decision to do any of the following: not read the book and not continue further, not read the book and allow it to be spoiled here, read some and see if you like it, or read the whole thing.

This long post is long. Very long. Also inessential. Apologies. I definitely didn’t have the time to make it shorter. Best to get it out there for those who want it, and to move on to other things.

All The Spoilers – The Plot Summary

(This unfolds everything in linear order, the books rightly keep some things mysterious at some points by telling events somewhat out of order. This is what I remember and consider important, rather than an attempt to include everything. There’s a lot that happens that’s interesting enough to be included!)

Book I – The Three Body Problem

Communist China during the cultural revolution was really bad. Reeling from a combination of the cultural revolution and its murder of (among others) her father and all reasonable discourse, her forced exile to the countryside, and environmental panic raised by a combination of Silent Spring and the actual terrible environmental choices and consequences she sees around her, Ye Wenjie despairs for humanity. When she sees that contact has been made with extra-terrestrial intelligence, with a message warning us not to reply as doing so would give away our location and we would be conquered, she replies asking for an alien invasion from Trisolaris, and goes on to found the ETO, the Earth-Trisolaris Organization, with the explicit aim of betraying humanity and giving Earth over to the aliens.

Because we are so awful that it couldn’t help but be an improvement, right? Using a game called Three Body that illustrates the origins of Trisolaris, the ETO recruits huge portions of the world’s intellectual classes, largely because of environmental concerns and humanity’s other moral failings, making them willing to betray humanity in favor of an alien invasion.

Trisolaris sends an invasion fleet that will arrive from Alpha Centauri in 400 years. Worried that our technological advancement is so rapid we will by then defeat them, they send protons to Earth that they have ‘unfolded’ to transform into sophons, which they can use to communicate faster than light in both directions, and which can be programmed to monitor everything on Earth, mess with particle accelerator experiments to prevent technological progress, and do a few other tricks. The crazy physics results and other ETO efforts to suppress science drive many physicists to suicide. The ETO convinces the world’s intellectual elite to run a cultural campaign against science, as science is the only thing they fear. In particular, they go after one person: the not-very-virtuous-or-brilliant-seeming astrophysicist Luo Ji. These symptoms are what let the authorities around the world investigate and figure out they are facing the ETO, which they manage to infiltrate and then mostly shut down. But scientific progress is still blocked by the sophons, so Earth seems doomed to fall to the invasion fleet.

Book II – The Dark Forest

With Earth under total surveillance and no prospect of scientific progress, humanity sets out to prepare for what it calls the doomsday battle. Earth’s resources are redirected to two efforts. An epic space fleet is constructed to attempt to battle the invasion, with so much invested in these efforts, despite the four-hundred-year time frame, that from the start the world expects rationing of basic goods and bad economic times. The second effort is the wallfacer program. Four special humans are chosen to develop secret strategies, since the one place sophons can’t intrude is the inside of a human brain. The job of the wallfacers is, given unlimited resources, to develop a strategy for stopping the invasion, while keeping it secret and using misdirection and deception to ensure that the enemy does not figure out their plans. Everyone is required to do as the wallfacers ask, without question or explanation, and everything they do is considered part of their plan one way or another.

The four chosen are: A scorched-Earth terrorist from Venezuela who is then hailed as a master of strategy and asymmetric warfare, a former US secretary of defense and veteran officer, an English neuroscientist, and then Luo Ji because they couldn’t help but notice that the ETO really, really wanted Luo Ji dead even if they had no idea why. Luo Ji wants no part of this, but his refusals are taken to be part of his plan, so instead he uses his position to set himself up for a nice quiet life.

The ETO then assigns each a ‘wallbreaker’ to uncover the wallfacer’s plan and reveal it to the world. Humanity gets poetic justice from its first three Wallfacers, all of whose plans are revealed as not only huge wastes of resources but active dangers. The scorched-Earth terrorist tries to use a Mercury base as a means to cause Mercury to fall into the Sun and cause a collapse of the entire solar system, thinking he can hold it hostage to force a compromise. The general creates a swarm of automated ships that he intends to use to betray Earth’s fleet in order to then try and somehow trick the invasion fleet. The neuroscientist creates a brainwashing machine, claims he’s using it to convince our officers we’ll win, but actually uses it to brainwash all volunteers with utter faith that we will lose. None of the plans have a practical chance of working even on their own merits, and all three cause humanity to collectively react in horror.

That leaves Luo Ji, whom they force to work by placing his wife and child into hibernation as hostages. Luo Ji thinks for a long time and then decides to ‘cast a catastrophic spell on the planets of a star’ using the same solar broadcast technique we used to communicate with Trisolaris in the first place, broadcasting the location of another star out to the broader universe, as a test that will take a hundred years to verify. Trisolarian allies almost kill him with an engineered virus, and he is sent into hibernation until we can find a cure.

Upon awakening, Luo Ji finds a transformed world. There has been an economic and ecological collapse called The Great Ravine, caused by the devotion of all resources to the war effort, but when humanity gave up on the war and went back to trying to live until the end, the economy recovered underground and the Earth recovered on its own once left alone. Our fundamental physics was still blocked, but our tech still got a lot better, and eventually we got around to building a fleet, which can even go faster than the Trisolarian fleet (0.15c versus 0.1c), and everyone is confident of victory once the Trisolarian fleet arrives.

He learns that his ‘spell’ worked, the sun in question shattered by a photoid. He also faces a computer system trying to kill him thanks to an old virus engineered to target him, which he narrowly escapes multiple times until the authorities realize what is happening. Then he goes about adjusting to living out his life.

People are so confident of victory that when a probe arrives in advance of the fleet, everyone gets into super close formation because they’re fighting over bragging rights regarding this historic moment, rather than thinking about what would happen if the probe might try something. The main battle they prepare for is which continent’s ships will get the place of honor. Only two ships are semi-accidentally held back and no one gives that much thought. Those two ships flee the solar system.

The probe tries something, which is to ram and destroy all Earth’s ships, because the probe is super fast and is made of a super-dense material organized around the strong nuclear force, and Earth’s ships aren’t. Nothing we have can do anything to the probe. The probe then proceeds to Earth. Luo Ji expects it to kill him, but instead the probe shuts down our ability to communicate with the outside via the Sun, as we had done previously to communicate with Trisolaris and for Luo Ji to cast his spell on the star.

It seems too late, and all is lost. Luo Ji had realized that the universe is a ‘dark forest’ in which any civilization that reveals its location is destroyed, because resources are limited but life grows exponentially. If everyone out there knows where you are, one of them will wipe you out, even if most leave you alone. That was how his ‘spell’ worked.

Thanks to his spell, Luo Ji is now identified as our last hope and treated like a messiah. The wallfacer project is revived, and he is given carte blanche and a blank check. But when he seems to be devoting his time and energy to the details of an early warning system that can’t possibly save us, people turn on him. He is treated as a pariah, not allowed on buses, the man who betrayed us and gives us false hope. Scientists develop theories that the destruction of the star he cast his spell on was a coincidence – it was about to go nova or something right at that time – never mind how improbable that is on such a short time frame.

All this time, the one thing more reviled than Luo Ji is ‘escapism’. It is treated as the ultimate crime against humanity to attempt to flee the solar system. Never mind that this is the only realistic way that humanity, or at least some of us, might survive. The one thing everyone agrees on is that if we can’t all survive, no one can be allowed to try to survive, so all such attempts must be stopped at all costs.

Thus, we have very little to attempt escape with. All we have left are the two ships left out of the final battle; one ship that had been hijacked when someone who wanted escape more than anything was given temporary full control over it – including the ability to take it out of the solar system – while the ship was being checked for ‘defeatist’ officers who had been given the brainwashing treatment; and the three ships pursuing that one.

Or rather we only have two ships. Both groups of ships realized that the other ships in their fleet comprised vital resources of matter necessary to completing their journey to another star, and realized that other ships would view them likewise. As a result, each group of ships fought a battle where only one ship survived, and then had the resources to continue its journey. This only reinforces humanity’s devotion to not allowing any escape, as space is viewed as a place that takes away our humanity and makes us do terrible things.

Finally, Luo Ji is ready. He proceeds to a remote location, digs his own grave, and talks to Trisolaris via the sophons. He explains that the early warning system he created is actually a message to the galaxy. By placing its components carefully around the solar system, he has engineered a message revealing the location of Trisolaris. If he is killed, a dead man switch will cause the message to be sent out, and others will then destroy both Trisolaris and Earth, since previous communications already revealed our distance from Trisolaris.

Under this threat, Trisolaris surrenders. They pull back their probes, stop their sophons from interfering with science experiments and divert their fleet, rather than face mutually assured destruction. Luo Ji, it appears, has saved us.

Book III – Death’s End

Death’s End is the story of humanity’s journey through various eras as told through the journey of a Chinese scientist named Cheng Xin.

In the common era, Cheng Xin’s team manages to send a brain into space to meet with the Trisolarian fleet, hopeful that they can then reconstitute the person and that the person can then act as an envoy and spy. They send Yun Tianming, who is in love with Cheng Xin and bought her a star, as the United Nations was selling them to raise money for the war effort.

Upon awakening in the future, this star proves to have planets and thus be very valuable, and Cheng Xin becomes super wealthy for the rest of the story.

In the deterrence era, Luo Ji has become the swordholder, standing ready at all times to broadcast the location of Trisolaris (and thus also Sol), thus ensuring peace. This works for a while, but humans forget how precarious their situation is and lose the appetite for hard choices. The men of this era, it is noted, are soft, and not true men. The number of warning stations and backup broadcasters is cut back to save money. One of the ships that survived the wiping out of Earth’s fleet, the Bronze Age, is recalled to Earth, and its crew is put on trial for crimes against humanity. The crew manages to warn the other ship that escaped Earth, the Blue Space, not to return home, so a Trisolarian probe and an Earth ship are sent in pursuit and end up on top of them, ready to strike but in no hurry.

Earth has decided the time has come to select a new swordholder to provide deterrence. Wade, Cheng Xin’s old ruthless spymaster boss from the project that sent up Yun’s brain, tries to kill her to prevent her from becoming swordholder and to claim the position himself. Later, the remaining candidates again try in vain to convince her not to run for the position, but she feels obligated and does anyway. In a true instance of You Had One Job, the public chooses her as someone they like better, despite her being obviously less of a deterrent – and despite Trisolaris still being able to watch literally everything everywhere, all the time, when making its choices. A mind-bogglingly thankless Earth then arrests Luo Ji for high crimes because there might have been life on the planets orbiting the star he cast a spell on to prove his hypothesis worked – never mind that there was no reason to believe that, and oh, that was part of him saving us from an alien invasion. But I get ahead of myself with the editorializing.

And of course, the second Cheng Xin takes over, the Trisolarians send probes in to dismantle what little broadcasting ability we have left and cripple us. Cheng Xin has the opportunity to push the button and reveal the location of Trisolaris, but (in the book’s words) her ‘motherly instincts’ take over and she refuses to doom us all, hoping things will somehow work out. Sophon (one of them has now taken on humanoid form) then explains gratefully that she only had about 10% deterrence value based on their models, whereas Wade was at 100% and certain to retaliate, which of course would have meant they wouldn’t have tried anything.

Instead, she proclaims, all humans must dismantle their technology and move to Australia, except for a few million who will coordinate the move and hunt down the resistance. Australia and an area on Mars are, she claims, our reward for what our culture has brought to Trisolaris, including the concept of deception, which allowed their technological progress to restart after stalling out for eons – but we must give up most technology. Then, once everyone is there, she has the power to the farms cut off, and announces that the plan is for us to fight each other for food to survive for the four years until the second Trisolarian fleet, which can now achieve light speed, arrives to take over Earth’s intact cities and provide support for the survivors.

Before that can happen, it is revealed that the ship Gravity, which was pursuing the renegade ship Blue Space, has been taken over by the crew of Blue Space, who also managed to somehow defeat the probe attacking both of them, and together the crews voted to reveal the location of Trisolaris and therefore Sol. They did this by finding a pocket of four-dimensional space, and explorations into four-dimensional space find strange and ominous results, including a ‘tomb’ of a dead race. The tomb explains that higher-dimensional civilizations have no fear of lower-dimensional ones, and lower-dimensional ones have no resources higher-dimensional civilizations might need, so the two have no reason to interact.

With Earth now a dead planet orbiting, the Trisolarians divert their fleet elsewhere and allow humanity to return to its planet and technology. Humans turn on those who managed to save their cities and civilization (while also, to be fair, laying the groundwork for xenocide), and greatly reward those in the resistance, including Luo Ji. Trisolaris is destroyed when a photoid hits one of their three stars.

The question now becomes, what to do, given that the universe will soon know where we are?

Before leaving Earth, Sophon gives Cheng Xin a chance to speak with Yun Tianming, but she is warned they can only speak on personal matters and must avoid revealing technical, scientific or otherwise valuable intelligence, or Cheng Xin will be blown up by a nuke before she can share what she has heard. Yun shares with Cheng a fairy tale in three parts that was part of Yun’s published work ‘Fairy Tales from Earth.’ The works are well known, and not only are they not blown up or even warned, they are allowed to go over the planned allotted time so he can finish the tales. They then promise to meet in the future at her star. Cheng memorizes the tale and reports the contents back, and she is congratulated on her success as all of our finest people work on figuring out the hidden meaning in the tale.

The tale is long and provides lots and lots of clues about how the universe works, what is technologically possible, what is in store for the solar system and how we might defend ourselves. Humanity figures out a little of it, but misses a lot more and knows it has done so, and comes up with three potential solutions to its crisis.

First, humanity notices that both known ‘dark forest strikes’ against known civilizations have taken the form of a photoid being used to shatter a star, and that if they can hide behind Jupiter or Saturn when that happens, they will survive the strike and have the technology to survive going forward. So we could build space cities to house our population.

Second, we could build faster-than-light ships and escape the solar system in time. Trisolaris developed light speed ships, and our older, slower ship Gravity was on its way to another star even without this, so we know such a thing is possible. But that’s escapism, and escapism is worse than Hitler, since not everyone could get away, so all such attempts are banned to make sure that if most of us die, everyone dies.

Third, we could take a concept gleaned from the fairy tale and from a question once asked to Sophon where she confirmed that there is a way to credibly communicate that we are not a threat and thus be left in peace. We could turn Earth into a dark region of space that can’t be escaped from. The problem is we don’t know how to do that and our attempts to figure it out came to nothing.

Humanity puts almost all its resources into plan one, and succeeds in moving everyone into space. It bans plan two, and mostly gives up on plan three.

Wade then meets Cheng Xin (our characters go into cryogenic sleep so they can move between eras as the plot requires) and demands she turn over her company to him so he can research light speed travel, since without it humanity will never be great even if we somehow survive, and also this plan of hiding behind Jupiter and hoping no other civilization thinks we might do that seems kind of unlikely to work when you think about it, ya know? Cheng Xin agrees and turns her company over with the caveat that if Wade would do anything to threaten human lives he has to awaken her and give her control back.

He agrees to the condition. Cheng Xin is awakened to news that Wade has made great progress in his research, but the authorities are trying to shut it down because escapism, and his response has been to seed soldiers with anti-matter bullets on the space cities so that he can threaten reprisals that would destroy them if his work is stopped. Cheng Xin is horrified, once again refusing to use such threats, and orders him to surrender. Amazingly, he does, fulfilling his promises to Cheng Xin. She goes back into hibernation again.

She is awakened to warning of a dark forest strike. Our hour has come, and a different type of weapon is attacking us, collapsing space around the solar system down to two dimensions. Laws against escapism are repealed, but escape velocity is light speed, so it is too late. Except for one ship that Wade managed to finish, with two seats, which Cheng Xin and her friend Ai Aa take, first to Pluto to help distribute priceless artifacts as a memorial to humanity, and then to escape the ongoing attack. Even as we are all doomed, humanity continues to hate ‘escapism’ and a number of ships try (with no success) to stop and destroy her ship because if they can’t escape, everyone should die. She does escape and directs the ship to her star.

When there, they attempt to meet Yun, but an accident causes them to miss each other. They do meet up with a descendant of the crew of the ships Gravity and Blue Space, which survived and became human civilization. He explains that dimension-collapsing weapons are being used throughout the universe, bringing it down from its original ten plus dimensions to now mostly two and three dimensional space with a few four dimensional pockets, and other laws of physics are under attack too. That’s why string theory and all these ‘folded-up’ dimensions and such. They are given instructions to enter a pocket dimension to await the big crunch, which will reset things and restore all the dimensions.

They then get a message from the universe. It notifies them that both Earth and Trisolaris made the list of impactful civilizations in the universe, and asks that all matter in pocket universes be returned because too much matter has been removed from the main universe into various pocket universes and this lack of sufficient mass will prevent the big crunch, dooming the universe to heat death instead of renewal. They decide to put most of the mass back into the main universe, leaving behind a message for the next cycle and taking a ship to explore what is left of the main universe.

We do not learn if enough others cooperated, or whether the big crunch did occur.

Glad that four-thousand-word plot summary is out of the way. On to the discussions.

Big Picture: An Alien Perspective

I don’t mean Trisolaris. I mean China.

Trisolarians have several characteristics on which to hang their planet of hats. They evolved around a trinary star system, which leads to unpredictable periods of extreme heat and cold, lasting unpredictable amounts of time, with only occasional ‘stable eras’ where the planet is mostly revolving around one sun and life can continue. Thus, they have the ability to ‘dehydrate’ and store themselves in this state during chaotic periods, then rehydrate when things stabilize, and they survived this way through many collapses until reaching an era of high technology.
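That environment is grounded in real math: the gravitational three-body problem has no general closed-form solution, and its trajectories are chaotic, so tiny uncertainties compound until prediction becomes impossible. A minimal sketch of that sensitivity, with made-up initial conditions and arbitrary units (G = 1):

```python
import math

# Two identical three-body simulations whose starting positions
# differ by one part in a billion in a single coordinate. A simple
# leapfrog (kick-drift-kick) integrator is enough to show the runs
# diverging completely - the essence of Trisolaran unpredictability.

def accelerations(pos, masses):
    acc = [[0.0, 0.0] for _ in pos]
    for i in range(3):
        for j in range(3):
            if i != j:
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                r3 = (dx * dx + dy * dy) ** 1.5
                acc[i][0] += masses[j] * dx / r3
                acc[i][1] += masses[j] * dy / r3
    return acc

def simulate(pos, vel, masses, dt=0.001, steps=20000):
    pos = [p[:] for p in pos]
    vel = [v[:] for v in vel]
    acc = accelerations(pos, masses)
    for _ in range(steps):
        for i in range(3):  # half-kick, then drift
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
            pos[i][0] += dt * vel[i][0]
            pos[i][1] += dt * vel[i][1]
        acc = accelerations(pos, masses)
        for i in range(3):  # second half-kick
            vel[i][0] += 0.5 * dt * acc[i][0]
            vel[i][1] += 0.5 * dt * acc[i][1]
    return pos

masses = [1.0, 1.0, 1.0]
start = [[-1.0, 0.0], [1.0, 0.0], [0.0, 0.5]]
vels = [[0.0, -0.5], [0.0, 0.5], [0.3, 0.0]]
a = simulate(start, vels, masses)
nudged = [p[:] for p in start]
nudged[2][0] += 1e-9  # one-part-in-a-billion nudge
b = simulate(nudged, vels, masses)
print("max divergence:", max(math.dist(p, q) for p, q in zip(a, b)))
```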

They communicate telepathically and automatically, so before they met humans they didn’t know that lying or deception could be a thing. Their entire culture should be alien to us.

Instead, the humans are written as a lot stranger, to my American eyes, than the Trisolarians. Most characters are Chinese, but even those who are not mostly think and act in this same alien style. I appreciated the opportunity to see the view of humanity from someone in a completely different culture. But it says something about how ordinary the Trisolarians are that they are, at least, no less like the humans I know than the humans in the book are. How much of this is how the Chinese or some Chinese group view humans and think about the world, versus how the author does? I cannot say.

The viewpoint expressed is deeply, deeply conservative and cynical.

People, all but a handful of them, are absurdly fickle, petty, short-sighted, self-interested and self-defeating. They are obsessed with status and how they are relative to others, and on the margin wish others harm rather than well. If anyone tries to escape disaster or otherwise rise on their own, universally mankind’s response is to band together to kill them.

When times are good, they sit back on universal benefits and forget the universe is a harsh place and that their very survival depends on hard work, sacrifice and hard choices – and they explicitly condemn and go after anyone who makes those hard choices or sacrifices. They choose leaders who can’t make hard choices, showing weakness and inviting attack. They all grow soft, such that an entire prosperous era can lack ‘real men,’ leaving only effeminate weaklings incapable of action, with only those frozen in the past capable of taking meaningful action.

They will believe and put their hope in any damn thing they are pointed at, for a while, no matter how absurd. Then they will despair of any action, or turn in desperation to any hope for change. They will alternately hail the same person as a hero and arrest them as a villain, multiple times, for the same actions, based on how they want to feel at the time and what hopes they are given.

The intellectuals are even worse, and at the start of the book the bulk of them are actively working to sell us out to the aliens for no reason other than environmentalism and humans sucking, so why not turn things over to a race determined to conquer us and hope for the best?

The idea that humans, or any beings anywhere, could successfully navigate basic game theoretic problems like the prisoner’s dilemma, rather than killing each other and literally destroying the dimensions and physical laws of the universe, is not in hypothesis space.

They are also, as we’ll go over in multiple places, profoundly and deeply Too Dumb To Live.

The few who are not all of that – or who are even a little bit not all of that, some of the time, because the plot needs them not to be – are good bets to outright save humanity.

Humanity ends up saved, repeatedly, by the actions of a few people from our era, in spite of almost everyone alive’s consistent efforts, through the ages, to the contrary.

When almost everyone alive dies in the third book, it’s hard to escape the feeling that they kinda deserved it.

The Great Betrayal

It is not hard to see why Ye Wenjie, victim of the cultural revolution, might despair for humanity or her country, and take any opportunity for intervention. Yes, it was an open invitation to an alien race of would-be conquering warmongers, about whom you know nothing else, so thinking this is worth a shot seems rather like a lack of imagination on how bad things can get. Then again, it’s not clear she doesn’t just want to see all the humans suffer and die for what they’ve done.

The part where her motivation, and those of the (majority of) intellectuals who join her, is largely environmentalism? That she’s motivated in large part by reading Silent Spring?

Given how unlikely an entirely alien race is to care about our world’s ecology at all, this does not seem like a reasonable position for someone who values Earth’s other life forms.

It seems more like Earth’s intellectual class collectively decided that this random other alien race has had a hard time, whereas humans are Just Awful and do not deserve to live, so they’re going to work to hand the planet over. That fits what the few recruits we see say – it seems the book thinks that most intellectual people are disgusted by and despise humanity, and want it to suffer and die.

The way they recruit in the book is, there’s a virtual reality game called Three Body. Those who play it unravel the nature of Trisolaris, after which they are asked to come to a meeting in real life, where they are asked their opinion on humanity and whether it would be a good idea to betray us all in favor of Trisolaris. It seems most people who get that far say yes, and the organization got very large this way without being infiltrated.

I hope that this perspective on what intellectuals think is not too common among Chinese or those with the deeply conservative views of the author. It seems such a horrible thing to think, and its implications on what one should do are even worse. I try to stretch and see how one might think this, and I can sort of kind of see it? But not really. My best sketch is a logic something like, they despise so many of the preferences of the common folk, seeing them as sexist and racist and wasteful and destructive and full of hate and lack of consideration for their fellow man, so they must hate them in return?

I can also understand how, seeing the things many people say nowadays, one might reach this conclusion. Scott Alexander recently wrote an article noting that many believe that there are zero, or approximately zero, non-terrible human beings on the planet. It also offers the hypothesis that there are approximately zero, or perhaps actual zero, non-terrible human beings in history. Direct quote from the top of the article:

There are some pretty morally unacceptable things going on in a pretty structural way in society. Sometimes I hear some activists take this to an extreme: no currently living person is morally acceptable.

The otherwise excellent show The Good Place reveals that 98%+ of all humans who ever lived were sent to The Bad Place rather than The Good Place. Good enough is hard.

A lot of people who oppose hate a lot sure seem to hate, and hate on, a huge portion of all humans who ever lived.

You can and would be wise to love the sinner and hate the sin. That’s mostly not what the sinners experience.

If you see lots of people loudly saying such things, it’s easy to see how one might view them as sufficiently hateful of humans in general that they are willing to sell humanity out, for actual literal nothing, to the aliens.

I can also see a religious perspective. Suppose you think that all men are sinners, and kind of terrible, but that they are redeemed by faith, or by the forgiveness of God, or some symbolic form of penance, or what have you. Now suppose you see a group of people who seem to agree about the awful nature of humanity, even if they don’t agree on why or on which parts of humanity are awful. But they don’t have this God concept. When people announce the error of their ways, the reception is usually observed to be worse than if they’d just said nothing. Once you’ve done wrong, these moral systems don’t seem, to such an outsider, to offer any path to redemption. Certainly not one that more than a nominal portion of people could meet.

In the book, the character we are following does what I would presume almost everyone would do when offered the chance to support an alien invasion and takeover of Earth backed by zero promises of anything for themselves or the rest of humanity. He reports the situation, cooperates with authorities, infiltrates the organization and with almost no effort sets up a raid that takes down a huge portion of their membership including their leader. I strongly believe I would do the same, not because I’m an unusually pro-human or morally great person, but because that kind of is the least you can do in that spot.

When I see people claiming to be negative utilitarians, or otherwise claiming beliefs that imply that there is only an ugly kludge, high transaction costs and/or lack of initiative standing between them and omnicidal maniachood, a part of me hopes I won’t end up having to kill them some day.

To see a book that expects not just a handful but the majority of intellectuals to, when given the choice, take the opposite path, is true alien perspective stuff.

The War Against Science

If you were in command of an organization whose goal was to ensure that humanity would fall to an alien invasion fleet scheduled to arrive in four hundred years, what would you do?

Trisolaris’ strategy is almost entirely to target science, in particular particle physics.

At the rate humanity is advancing, by the time their fleet arrives, our science and technology would by default advance to the point where we could crush the invasion fleet. On the other hand, if we could be prevented from advancing our knowledge of physics, no amount of technological tinkering on the edges will make a dent. This is what we later see, as the strong-force-powered probe proves immune to everything we can throw at it.

Trisolaris uses two distinct vectors to attack science.

At first, they utilize their control of the intellectual elite to turn the culture against science and technology. This effort is portrayed as wildly successful, and the book heavily implies that today’s intellectual elite are in fact doing something akin to this, only perhaps not quite as extreme. It is not hard to see that perspective after yet another young adult dystopia.

Once the threat from Trisolaris is discovered, the jig on such strategies is up, the same way that America embraced science after Sputnik. Instead, Trisolaris relies on the sophons, which randomize the results of experiments in particle accelerators, cutting off progress in theoretical physics.

This is a very Sid Meier’s Civilization style view of how science works. To advance Science, one must fill enough beakers to discover the secret of advanced particle physics, which then allows us to fill beakers to discover the secret of strong force interactions or light speed travel. Progress is based on the sum of the science done, so ask if any given effort produces an efficient amount of science. If you cut off a link in the chain, everything stops. There’s no way around it. One cannot tinker. One cannot perform other experiments.

This is a rather poor model, in my view, of how science works. Science is not a distinct magisterium. Neither is particle physics.

Thus, I have my doubts that this would be sufficient to cut off such progress. There must be other ways to advance knowledge.

The existence of sophons and their particular abilities likely offers a gigantic hint. Trisolaris in general had the really terrible habit of only trying to kill the few people, or stop the few experiments, that they saw as threatening. My first thought would be, try stuff, see if they let you do it, and isn’t that an interesting result in and of itself? 

Reverse engineering and catch-up growth are so much easier than going at it from first principles!

Another effort to stop us is made via the wallbreaker program. Each of the four wallfacers, humans assigned to carry out hidden agendas to defend us that would not be revealed to the prying eyes of the sophons (which can observe anything but the contents of a human brain), is assigned one person to figure out what their plan is, then reveal it to the world and show how hopeless it is. This is partly to spread hopelessness, and partly to ensure the schemes are not in fact a threat. In the book each task is left to a single individual, which is rather silly, but it seems a good use of resources, and three of the four succeed.

Only One Man

The last thing Trisolaris does is try to kill Luo Ji. This is both the smartest and dumbest thing they do.

It is the smartest because Luo Ji is the largest threat standing in the way of their victory. Only he divines the dark forest nature of the universe (that everyone who is discovered is destroyed as a defensive measure, since life expands exponentially but resources are finite, so everyone must hide well and cleanse well) and our ability to threaten Trisolaris via broadcast of the location of their homeworld.

Let us assume for now (premise!) that the dark forest hypothesis is known by Trisolaris to be true, and they know only Luo Ji has much chance of figuring it out. What they then do is to attempt to kill him in a way that looks like an accident. When that fails, they make him a wallfacer, and force him to work against his will, and he eventually shows that he is figuring out how the dark forest works. When he awakens, there is an old virus waiting to try and kill him, but it fails.

A probe, that definitely could kill him if it wanted to, then descends. Luo Ji is certain its first act will be to kill him.

Instead, the probe cuts off our ability to broadcast via our sun – again, showing us their hand, although this time it seems worthwhile.

But they do not kill Luo Ji.

Luo Ji is hailed as our only hope because of his casting a spell on a star, and made into a wallfacer with unlimited resources.

They do not kill Luo Ji.

Luo Ji starts to tinker with an early warning system and its details, for no obvious reason.

They do not kill Luo Ji.

Luo Ji constructs a dead man switch for himself, in full view of the sophons (since everything is in full view of the sophons).

They do not kill Luo Ji.

Finally, he reveals his plan to use the warning system as a method of broadcast to the universe, after it is already in place, and Trisolaris is forced to surrender. They explain that they no longer saw him as a threat, since nothing he seemed to be doing appeared meaningful, the deadman switch was a harmless part of the plan of another wallfacer (the terrorist, of course), and the sun had been cut off as a means of broadcast.

This is a monumentally important failure mode. One should not need the evil overlord list to avoid this. The mistake is not only made repeatedly by the humans and aliens alike, it is also made by the judgment of history (in the book), and hence likely also by the author. 

As evidence that the author does not appreciate how truly boneheaded a move this is, consider two things.

First, right after this incident Luo Ji hands control of the broadcast system to the government, which quickly realizes it would not reliably execute on the threat to expose both worlds and hands control back to Luo Ji within a day, making him the swordholder. The author notes that the failure of Trisolaris to attack during this period is considered one of the great strategic failures of history. This is despite the fact that the probability of retaliation at that time is highly uncertain, and it is most definitely not zero. Such a move is absurdly risky versus trying to reach a compromise, given the stakes, and in fact clearly would have been an error given future opportunities Trisolaris is offered.

Second, the explanation given by Trisolaris as to why they made the colossal blunder of not killing Luo Ji, which is frankly absurd. They admit failure to see what he was up to, why he was doing it or how any of it could matter.

They then cite this failure as a justification for NOT killing Luo Ji.

This is exactly backwards. This is all the more reason to kill him, and no one in the book seems to understand this.

Because space bunnies must die.

I take this saying from the title of the (quite bad) game Space Bunnies Must Die!. If your opponent is doing something that makes no sense, whose purpose you can’t understand, and for which you have no explanation of why they might do it, assume it is highly dangerous. They are the enemy. They have a reason they chose to take this action. That reason might be ‘they are an idiot who is flailing around and doesn’t know anything better to do.’ Maybe. Be skeptical.

Luo Ji is Earth’s space bunny. He must die. They do not kill him, and he beats them.

For a while.

Mutually Assured Destruction and the Swordholder

Luo Ji’s triumph gives Earth the ability to reveal the location of Trisolaris, dooming both worlds. Humanity uses this to free itself and begins rapidly making scientific, technological and economic progress to catch up with Trisolaris.

What Trisolaris still has are the sophons. They still know everything that happens on Earth. They can’t quite see fully into human minds, but they can build increasingly accurate psychological models. Thus, one can view them as a kind of Omega that sees all, knows all and predicts all – albeit with a substantial error rate – when it knows the parameters it is looking at.

Everything depends on the prediction they make – in the book’s words, we must ‘hold the enemy’s gaze.’ If Trisolaris thinks we won’t retaliate by revealing their location, they will attack. Whether we retaliate or not, we are doomed. If Trisolaris thinks we will retaliate, they won’t attack.

Mutually assured destruction, with a sufficiently strong predictor, is a variation of Newcomb’s Problem. The boxes are transparent, and what matters is whether you’ll take the thousand dollars if the million dollars is not there. If the predictor thinks you wouldn’t take the thousand dollars in that case, then the predictor gives you the million dollars and you never have a choice to make.
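To make that structure concrete, here’s a minimal sketch. The attack threshold and payoff wording are invented; the 10% and 100% retaliation estimates are the figures the book attributes to Trisolaris’s models of Cheng Xin and Wade.

```python
# Deterrence as a transparent Newcomb problem: the outcome is decided
# by the predictor's model of the swordholder, before any button is
# or is not pressed.

ATTACK_THRESHOLD = 0.5  # hypothetical risk tolerance of the attacker

def attacker_attacks(predicted_retaliation):
    # The predictor attacks only when retaliation looks unlikely.
    return predicted_retaliation < ATTACK_THRESHOLD

def outcome(predicted_retaliation):
    if attacker_attacks(predicted_retaliation):
        return "attacked: doomed whether or not we then retaliate"
    return "not attacked: deterrence holds, the button is never pressed"

# Model estimates stated in the book:
for name, p in [("Cheng Xin", 0.10), ("Wade", 1.00)]:
    print(f"{name} (predicted retaliation {p:.0%}): {outcome(p)}")
# A swordholder predicted to retaliate never has to choose; one
# predicted not to retaliate never gets a choice that matters.
```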

I have a lot to say about Newcomb problems in general, but won’t go into it here because humanity’s failures in this trilogy are so much more basic, and go so much deeper, than that. And because anything I bury in here risks being quite lost in something this long.

This variation is a very, very easy form of Newcomb, for two reasons.

The first is that you know the rules in advance, and get the resources of all of civilization to choose who plays. Must be nice.

The second is that the choice you precommit to gets you revenge on those who decided to destroy us, so it’s not asking for something super weird and unnatural, or even something that’s necessarily worse going forward than the alternative. Many a dedicated two-boxer would still make an effective swordholder – even if you get problems like this wrong, you have additional ways to get this particular version right.

Humanity in the book chooses someone who is, in Wade’s terminology, unwilling to sell their mother to a whorehouse. It chooses someone who has shown no understanding of the importance of commitment. It chooses someone whose temperament clearly does not value revenge.

Electing Cheng Xin to be Swordholder is not simply a failure of game theory.

It can only be described as a suicide pact.

When Cheng Xin chooses whether or not to retaliate, and chooses not to, she does so out of a ‘motherly instinct.’ Out of a slim hope that it will turn out all right, somehow.

It can come as no surprise, to anyone, that Trisolaris attacks the second she is put in charge.

To emphasize this, at the same time Trisolaris is launching their attack, the humans arrest Luo Ji, on suspicion of genocide, for exactly the actions that saved humanity. Because when he destroyed a random uninhabited star to test the Dark Forest theory that allowed us to survive, there might, in theory, have been intelligent life there.

Luo Ji’s reward for literally saving the world is arrest, and would then (but for the attack) have been trial.

Humanity decides, as it decides at several other points, that it no longer desires to survive. Humans repeatedly punish and strike down those who value humanity’s survival and make sacrifices to that end. Anyone who does what must be done, in order to let us survive, is a horrible criminal who has committed crimes against humanity, even when they fully succeed and save all of us. 

Humans repeatedly choose superficially ‘nice’ values incompatible with the survival of the species, or later with the survival of all life on and from Earth (take that, first-book environmentalists!), over any hope of actual survival. They do this after knowing that those out there in the dark forest are, as a whole, far, far worse. 

Even when they aren’t getting themselves killed, they’re sustaining a birth rate far below replacement level. By the time the end comes, humanity is under one billion people, entirely due to low birth rates.

Something, the most important thing, has gone horribly wrong. Values that improve the world have morphed into crazy lost purposes and signaling games. Everything worthwhile is lost.

It is hard not to read this as a critique of liberal values. The book explicitly endorses cyclical theories of history, where prosperity contains the seeds of its own destruction. The differences this time, with higher technology and stakes, are in the consequences when it happens. Good times have created weak men, who have forgotten what it takes to create and maintain good times, and those weak men tear down that which sustains civilization and gives its people the ability to reproduce themselves, their values and their culture into future generations, without stopping to ask whether they are doing so.

Then those who accept the dark forest, who know to hide well and cleanse well, inherit the universe.

As I noted earlier, this is fundamentally a deeply, deeply conservative and cynical set of books.

The Great Ravine

Or, You Fail Economics Forever, all of humanity edition, even more than usual. Oh, boy.

Humanity knows it faces a doomsday battle four hundred years in the future. So everyone assumes, correctly, that humanity will begin rationing goods so we can all go on a war footing and build ships to fight the doomsday battle.

I was thrilled when I later learned that this caused the world economy and environment to collapse. After which, we stopped trying to prepare for the war and focused on improving life. After which we rapidly recovered, developed better technology, started repairing the Earth, then incidentally decided to create a vastly superior fleet to the one we ruined ourselves trying to build.

Sounds about right.

A lot of the mistakes humanity makes in the book feel like mistakes we wouldn’t make.

This one feels like yes, we are exactly this stupid.

We totally, totally would forget that human flourishing and development requires play, requires caring about things other than The Cause. We totally would ignore the value of compound interest and economic growth and technological development, and of the unexpected. We’d abandon our freedoms and curiosity and everything of value, and embrace central planning and authoritarianism and technocrats, aimed at creating the symbolic representation of the symbolic representation of a space fleet. We totally would treat those who pointed such things out as traitors and shame them into silence.

Compound interest is not the most powerful force in the universe. But it is on the list. Tyler Cowen’s position, that we should almost entirely focus on maximizing economic growth subject to the continuity and stability of civilization, has some strong objections one might make. But all those strong objections come down to one of two things. Either they argue we should value the present more than the future (for various reasons), or that economic growth is the wrong thing to be measuring, and there is a different mechanism at work here that we should be thinking about in a different way.

Neither of those applies here, at all. We sacrificed the present for nothing.
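
To put rough numbers on what was sacrificed, here is a back-of-the-envelope sketch of compound growth over the four centuries available; the growth rates are illustrative assumptions, not figures from the book.

```python
# Compound growth over the four centuries before the doomsday battle.
# The rates are illustrative assumptions.
def growth_multiple(annual_rate, years=400):
    return (1 + annual_rate) ** years

for rate in (0.01, 0.02, 0.03):
    print(f"{rate:.0%}/year for 400 years -> {growth_multiple(rate):,.0f}x output")
# 1% -> ~54x, 2% -> ~2,755x, 3% -> ~136,000x
```

Even one sustained percentage point of growth, compounded over four hundred years, dwarfs anything a crash shipbuilding program could squeeze out of year-fifty technology.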

The obvious defense is that basic physics research was blocked, so there’s no point in waiting. That’s quite silly. Even with basic research stalled, technology is a whole different animal, as we see later on. And even if technology was fixed, there are plenty of other ways to improve over centuries.

Note that Trisolaris does not make this mistake. They pay zero mind to any war preparations. All their sabotage goes towards stopping basic scientific research. This points to the opposite mistake, of thinking that the practical stuff one might accomplish doesn’t matter. In this case, they have somewhat of a point, given what the probes can do, but I would not want to rely on that.

You can’t run a world-wide death march for four hundred years, as much as mood affiliation might demand it.

A good question is, if you were in charge of doomsday battle planning, when would it make sense to start preparing, how much, in what ways?

Year one, I would do my best to counter Trisolaris’ efforts, and devote as much effort as was productive to scientific research on all fronts at all levels, and to creating a culture where that had value. I’d work to create norms allowing us to run costly experiments of all kinds that we currently don’t consider remotely ethical. I would do the same for space exploration and colonization, but without any mind to combat. I’d invest in long-term economic growth in all its forms.

And of course, I’d do everything I could to build a true AGI. Trisolaris is evidence that the problem is hard, but the bottleneck is unlikely to be basic physics (unless it’s a pure shortage of compute, but that still doesn’t seem like a basic physics concern), and Trisolaris clearly isn’t that far ahead of us – it feels the need to sabotage our efforts. So there’s definitely a chance (although presumably, given what we learn later, this would not have worked).

I’d also double down on intelligence augmentation and genetic engineering. Neither gets even a mention in the book, at all. But this is the obvious way to go! You have four centuries to get your act together. It turned out we had somewhat less, but we still had several centuries. That’s plenty of time for a very very slow takeoff. Plenty of time for getting smarter about getting smarter about getting smarter.

As a bonus, brains are the one thing Trisolaris can’t see with the Sophons, so the more people can keep in their heads because their heads are better, the better we can avoid countermeasures.

Oh, and the moment I could, I’d get as many people as possible the hell out of the system, on seed ships, in case we lose. Even if it didn’t work, trying to do this seems likely to bear very good fruit. But there’s a whole section on that later.

Around one hundred years from the target, I’d consider starting to encourage martial traditions and space combat, and study how to train effective fighters from birth.

Within the last fifty years I’d actually build my ships or other weapons, and start getting ready in earnest. I’d go full war footing for the last five or ten, tops.

I’d also be fully on board with the Wallfacer program, but would like to think I would at least not choose an actual suicide-bomber-loving terrorist to be one of them, and ideally wait to invest mostly in enhanced people yet to be born. Plus some other obvious improvements. I’ll stop there to avoid going too far down the rabbit hole.

The Fairy Tale

The fairy tale, in which all we need to know about the universe and its inhabitants is conveyed through multiple metaphoric levels, has epic levels of cool. It is constructed meticulously. It is definitely not easy to unravel.

There are even parts of the story that the book doesn’t bother explaining, but which seem to still have important meaning. Consider the spinning umbrella, which protects you from being turned into a picture. What is it? I have a guess (continuous use of light speed engines to darken matter), now that I know how the universe works, but it doesn’t quite fit, and it seems really important. There are things we ‘should’ have known ‘had to mean something important’, and then there are others that seem the same way but (probably?) didn’t mean anything.

I got a little of what was going on when reading the tale, but missed the bulk of it. It felt like a fair puzzle in some places I missed, less fair in others. Which is, itself, quite fair.

Being Genre Savvy is hard, yo. The book’s mocking of people as unable to perceive or unravel multiple levels of metaphor seems like what one says when one knows the answer to one’s own riddle, and can’t figure out why literally no one else can figure it out.

If you start without knowledge of light speed travel, or how to darken matter, or that the number of dimensions can be reduced, making those leaps is hard.

It doesn’t seem hard on the level of an entire civilization’s experts working for dozens of years. On that level it seems easy. But if everyone got the same prompts and frames, and the same wrong ideas, it makes sense. We have long histories of everyone on Earth missing potential inventions or ideas for long periods of time, if they require multiple leaps to get there. Asking people to unravel multiple metaphorical layers is tough, especially if your poets aren’t physicists, and your physicists aren’t poets.

The book doesn’t mention a worldwide program to teach the physicists poetry, or the poets physics. People in this book stick inside the box, and take human capacity as fixed.

That is one of the big hidden assumptions and messages of the book. People change with the times, and technology can advance, but people, and intelligence, fundamentally don’t change and don’t improve, even among aliens give or take their hat. Quirks and mistakes aside, there is mostly a single way of thinking, a single way of viewing the universe, a single set of basic values, and all intelligent life is roughly the same. No one can take over or have good probe coverage, no one creates workable systems of safety and control, and no one solves game theory. It’s weird. To some extent, I get that not doing this is hard and forces focus on things the book doesn’t want to be about, the same way that you are mostly forced to pretend AIs don’t work the ways they obviously do if you want to write science fiction that says interesting things about people. Still, it’s quite jarring.

The Trisolarians are dense throughout the books. They start out not understanding the idea of deception, which I gotta say is really weird even if they can read each other’s thoughts. If nothing else, they have long distance communication, no? And abstract reason? These are problems they do get over through exposure to Earth, but they still repeatedly make the same fundamental mistake.

If Trisolarians can’t see the explicit direct path by which an action leads to something they don’t like, they treat the action as harmless. And they don’t ask why people do things. They have no concept of the space bunny. And here, in our last encounter, we see the ultimate version of that. So much so that they let the conversation go overtime, because they want to let the humans finish telling harmless fairy tales.

As opposed to: they’re burning their invaluable communication, this one-time opportunity, on frigging fairy tales. Something MUST be fishy here.

Once you ask whether these tales are fishy, if you know about dimensional warfare, I really, really don’t see how you miss it. It’s one thing to not solve the riddle when you don’t know the answer. But if you know the answer, the riddle is transparent. I have no idea how it wasn’t obvious to every reader on Trisolaris, or at least each one who knew physics, what was going on.

But once again, that’s probably the ‘knows-the-answer-already’ talking.

Consider: I think it was a month ago I learned what The Safety Dance was about. My parents sang me Puff the Magic Dragon.

I mean, come on.

The Great Escape

If there is one thing mankind can agree upon throughout the trilogy, it is that the worst possible crime is escapism.

Escapism is the most dangerous crime of noticing that the solar system is doomed and trying to get the hell out as quickly as possible.

You see, several characters point out that, as soon as we learn we are doomed, figuring out how to build spaceships that can leave Earth is the easy part. The impossible part is deciding who gets to go. And if we can’t all go, then none of us should go, and those who try are the enemies of humanity.

Later on this expands to any form of study of interstellar travel. You see, the Earth is doomed, so learning how to leave it would be just awful because it would be unequal. Or, in some arguments, because it would distract from solving our problems, or would cause unfair hope. 

We can all agree this is a horrible, no-good way of looking at things. The last thing those on the Titanic should do, upon learning that there are not enough lifeboats, would be to sink the Titanic with all hands. We can disagree about whether it should be ladies and children first or whether class of ticket should matter or what not, but at a minimum you run a lottery or a mad race for the boats or something.

If you actually disagree with that, if you think that it is a good thing to notice that some people, somewhere, might not be doomed and make sure that every last one of us is doomed unless we can un-doom all of us, then, seriously, stop reading this blog because I kind of want you to die in a fire. It would be for the best.

When we say that while there is one slave in the world none of us are free, that is not an instruction to enslave everyone else until the problem is solved. Injustice for one is injustice for all because it is bad and sets bad incentives and bad values, not because it means we should then visit injustice upon everyone.

I am belaboring this point because I see people making arguments of this type, all the time. People who actually, seriously think that the way to finish “you have two cows” is “but someone else, somewhere, doesn’t have any cows, so we kill your cows so we can all be equal.” If we all can’t live forever, no one can. The evil eye. The enemy of progress, and of humanity, and of life itself in all its forms.

No examples, since this blog has a strict no-someone-I-do-not-respect-is-wrong-on-the-internet policy. You know who you are.

Given this book claims that this is humanity’s basic mode of thinking, this seems like a good time to say, once and for all, this is a profoundly evil, not-even-wrong mode of thinking and when faced with it, I can’t even.

So, yeah.

I like to think the book is profoundly wrong about humanity, here. I like to think this is not mankind’s default mode of thinking. I like to think that this is only what a few fanatical and crazy people think, and they are loud and obnoxious and have platforms. I like to think that people talk a good game sometimes in order to turn things to their advantage, but they don’t actually want to burn everything down so no one will have an unfairly large house, whether or not they could be king of the ashes.

I like to think that while we would not resolve this problem well, exactly, that humanity would survive the death of the solar system if it had plenty of time and the technology, resources and capability to build generational starships and get some of us out of town. 

If we couldn’t, we wouldn’t deserve to survive, and I would not mourn for us.

I greatly appreciated that the key to defending ourselves, to turning our system into dark matter that cannot threaten others, is light-speed travel. Thus, by refusing to research light-speed travel, we cut ourselves off from discovering that the damage such travel causes to space is what would allow us to protect ourselves. This felt profoundly fair, given the way related technologies work in the book’s universe.

As Isaac Asimov said, there is but a single light of science. To brighten it anywhere is to brighten it everywhere. It is very hard to predict what one will learn by exploring. Refusing to look in a place where you don’t understand how something works, to not ask questions because you don’t like what might happen if you found the answers, is likely to end up cutting off the things that matter most.

The book does offer an interesting implicit question, near the end. Once we develop the technology to turn our solar system into dark matter, if we get it in time, should we do it? Should we shroud our world in permanent isolation, enjoy some good years and then go meekly into that good night with the death of the sun?

I would expect us to shroud, but would proudly vote against it.

The Dark Game Theoretic Forest

In The Dark Knight (minor spoiler), there is a scene where two boats, one with convicts and one with ordinary citizens, are each given a button the Joker says will blow up the other boat, after which the Joker promises the remaining boat will be free to leave. Thus, because the convicts might push the button, the citizens might want to do so first, and thus the convicts might press it even if they didn’t want to, and so on. Starting on either side. Neither side considers their decision obvious.

This book thinks the buttons get pressed. That this is how the universe works, in the end. You can’t fully trust anyone not to blow you up, so you don’t give them your location, and when possible you blow everyone else up first. There is no thought to negotiation, cooperation or peace. Only war. Only the dark forest. Hide well, cleanse well.

In this perspective, game theory offers us no hope. No way out. Cooperation is impossible in a wide range of situations that people like me consider mostly solved problems.
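
To see why I call these mostly solved problems, here is a minimal sketch of the dark forest as a one-shot two-player game, with invented payoffs encoding the book’s assumptions (being struck is fatal, striking first is cheap and safe):

```python
import itertools

# One-shot 'dark forest' game. Payoffs (row, col) are invented for illustration.
STRIKE, HIDE = 0, 1
payoff = {
    (STRIKE, STRIKE): (-1, -1),   # both launch; both pay a small cost
    (STRIKE, HIDE):   (1, -10),   # striker survives, hider dies
    (HIDE,   STRIKE): (-10, 1),
    (HIDE,   HIDE):   (0, 0),     # peace, if both resist temptation
}

def is_nash(a, b):
    """Neither player gains by unilaterally deviating."""
    row_ok = all(payoff[(a, b)][0] >= payoff[(alt, b)][0] for alt in (STRIKE, HIDE))
    col_ok = all(payoff[(a, b)][1] >= payoff[(a, alt)][1] for alt in (STRIKE, HIDE))
    return row_ok and col_ok

for a, b in itertools.product((STRIKE, HIDE), repeat=2):
    print((a, b), is_nash(a, b))  # only (STRIKE, STRIKE) comes out True
```

With these payoffs this is just a prisoner’s dilemma, and mutual striking is the unique equilibrium of the one-shot game. But we have decades of theory on escaping exactly this structure once games repeat or commitments become possible, which is why the book’s fatalism reads as a choice rather than a theorem.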

The price? Six dimensions of the universe. Minimum. Almost all of the universe’s value, then almost all of what is left. Six times over. The weapons used to contain threats (where ‘threat’ is simply ‘intelligent life, anywhere’) are so toxic that they wipe out the laws of physics, the dimensions themselves. The world has already almost entirely ended, long before humans arrive on the scene.

Life is the point of the universe. All, even in-book, agree. Despite this, it is so impossible to try and work something out, so impossible to get a system of incentives or deterrence or cooperation, that life’s primary purposes are to hide and to hunt all other life.

The universe’s inhabitants need better decision theory. Perhaps they should start here. Or maybe try the explanations here, instead. At some point I should give a shot to writing a better explainer.

What is most weird is that all of this applies back on Earth. Life still expands without limit. Resources remain finite. Offense is far cheaper than defense. Why don’t we inhabit the dark forest? Mostly we neither hide well nor cleanse well, and we are worse at both with each passing year. Yes, in some ways we have it easier. In other ways, we have it harder.

Late in the third book, it is revealed that it isn’t quite all hiding and hunting. There are at least some forms of communication between various aliens, and even some dreams of cooperation to prevent the heat death of the universe. Everyone is basically similar and understands each other; as I noted above, the aliens seem more alike in their thinking than the book’s model of humans is to mine, or its Chinese are to its Americans. We really, really should all be able to get along, starting with a ban on weapons of dimensional destruction.

Is an upstart civilization with primitive broadcasting technology really more dangerous than the weapon used against it – wiping out the third dimension at its home world, in a collapse that expands outward at light speed?

It doesn’t even work right. If they have light speed travel, their ships get away and try again. If they don’t, it seems like quite the overkill.

Another question that is never resolved, that I can recall, is why Trisolaris never sent probes to the stars closest to it. We know why they did not send sophons; sophons are each a huge economic cost. But why not send an ordinary probe that would give data on habitable worlds? Or at least one that would give data on where one could safely park one’s ships while having a ready supply of fuel, the way humanity later puts its city-ships around Jupiter?

Humans, with technology clearly still vastly inferior to that of Trisolaris, find it not that burdensome to pack up everyone, within a hundred years, and move them into space cities around Jupiter. If that’s a viable solution, how hard is it to find a target world with a gas giant and only one or two stars (so it won’t suffer from the three body problem and be unstable)? Given the attrition rate they faced from space travel – the majority of the original invasion fleet doesn’t make it to Sol – wouldn’t it make the most sense to have already sent colony ships to Sol simply because it is a single star with large gas giants, without even needing to know Earth has life?

Instead, Trisolaris seems to be waiting around for a communication that will reveal someone’s location, despite having no expectation of that being nearby, or it being someone they can defeat militarily, or of anyone being stupid enough to do that, since in this universe everyone is hiding all the time and Trisolaris knows why they do this.

It doesn’t seem worthwhile for them to return our broadcasts, either. Yes, they can attempt to get us to stall our development, but it reveals the location of Trisolaris to us, which means we can destroy them, and will have that capacity for over a century. We could figure that out at any time. And we might do that accidentally, since we’re too stupid to realize that broadcasts of locations lead to destroyed stars.

Along similar lines, if I were a civilization that felt that all other civilizations were a mortal threat to me, to the extent that most of the universe had become dark matter and we’d lost most of our dimensions to wars, you’d better believe I’d at least have a probe on every world checking for signs of developing life, so I could handle it with weapons that wouldn’t wipe out my supply of usable matter. And if resources are so precious, I wouldn’t be letting most of them lie around going to waste.

The book shows us that the aliens who wipe out Earth are actually getting wiped out themselves in a civil war with their own colony, about to be forced to retreat into lower dimensions (why they cannot go dark instead is not explained). So in this universe, it seems, aliens do not expand to use all the resources because they are so afraid of someone else trying to use all the resources that creating the means to do so would mean inevitable conflicts that kill you. So most of the universe, even the useful parts, sits idle or shrouded, because no one can trust even their own kind not to turn on them, simply because someone at some point might turn on someone, so everyone turns on everyone.

But (mumble mumble von Neumann probes mumble mumble) that’s enough before I get distracted by going into all the plot holes and missed opportunities. Let’s wrap this up.

Conclusion and Science Fiction’s Fall

This has been by far this blog’s longest post to date, and it still feels like it could have been far longer. The trilogy was full of big ideas, and theories and predictions about humanity. I disagree with many, and agree with many others, and find items from both groups interesting. It is good to see an author unafraid to say explicitly many things that others are unwilling to express – regardless of whether I ultimately agree with the hypothesis.

What I found strangest about these books is the degree to which it is possible to utterly miss the point. And I wonder what that says about what has happened to science fiction.

I can’t find the link, but I read a review of these books, from a clearly not-stupid person who had read the books, that praised them as offering a positive vision of humanity’s future.

On a very superficial level, yes, this is a positive vision of humanity’s future, in the sense that humanity gains higher technology and improves its standard of living, and then in that it manages to escape the solar system and create a lasting civilization among the stars – despite itself, and despite its repeated attempts to prevent itself from surviving. But yes, in the end this does happen.

On any other level, this is utterly insane. The books despair for humanity and its future. We survive because of repeated narrative intervention, and the actions of a brave few against the will of the many to die. The world betrays itself, repeatedly. Man does not cooperate with man, and the universe is revealed as a place where one being cannot cooperate with another, or even refrain from killing it when given the opportunity. 99.9999% of humanity is wiped out. The rest emerge into a dark forest war of all against all, where none dare reveal their location, and civilizations wage war by reducing the dimensions of the universe using weapons that have destroyed almost all value in the universe six times already and are rapidly working on a seventh time with what is left. With no way out.

That big crunch that hopes to reset the universe, if only its only safe and effectively immortal residents all sacrifice themselves? I have a hard time believing it has much chance of happening, even in my model of how things work. In this book’s universe? Not a chance.

If that’s an optimistic, positive science fiction with a positive view of the future, then what the hell is the pessimistic viewpoint?

Eliezer talks about how he was raised on old school science fiction, like Isaac Asimov and Arthur C. Clarke. His parents were careful to stock his bookshelves with only the older stuff, because newer stuff did not have the same optimistic view of humanity, and belief in science and technology as great things, and in the universe as capable of great things. Those writers ‘after the divide’ instead, in this view, use technology mainly as a way to think about what is wrong with it, and what is terrible about its consequences, and what is terrible about people. Such work does not teach us to reach for the stars.

Surely there are exceptions. But there’s no denying that the general vibe of technology being assumed to be great has somehow given way to the general vibe even in most science fiction that at best, wherever you go, there you are, and more likely that technology leads to young adult dystopia or worse. Inequality in the future is always through the roof. That roof is reliably underwater, due to climate change, which often renders Earth uninhabitable. Even if technology and our heroes ultimately save the day, the future is a mostly terrible place where we pay for our sins, and by our sins we mean our civilization and its science and technology.

Compared to that, these books present an optimistic paradise. But what kind of standard is that?


Discuss

Alignment Newsletter #43

January 30, 2019 - 00:10
Published on January 29, 2019 9:10 PM UTC

Alignment Newsletter #43: The techniques behind AlphaStar, and the many arguments for AI safety

Find all Alignment Newsletter resources here. In particular, you can sign up, or look through this spreadsheet of all summaries that have ever been in the newsletter.

Highlights

AlphaStar: Mastering the Real-Time Strategy Game StarCraft II (The AlphaStar team): The AlphaStar system from DeepMind has beaten top human pros at StarCraft. You can read about the particular details of the matches in many sources, such as the blog post itself, this Vox article, or Import AI. The quick summary is that while there are some reasons you might not think it is conclusively superhuman yet (notably, it only won when it didn't have to manipulate the camera, and even then it may have had short bursts of very high actions per minute that humans can't do), it is clearly extremely good at StarCraft, both at the technically precise micro level and at the strategic macro level.

I want to focus instead on the technical details of how AlphaStar works. The key ideas seem to be a) using imitation learning to get policies that do something reasonable to start with and b) training a population of agents in order to explore the full space of strategies and how to play against all of them, without any catastrophic forgetting. Specifically, they take a dataset of human games and train various agents to mimic humans. This allows them to avoid the particularly hard exploration problems that happen when you start with a random agent. Once they have these agents to start with, they begin to do population-based training, where they play agents against each other and update their weights using an RL algorithm. The population of agents evolves over time, with well-performing agents splitting into two new agents that diversify a bit more. Some agents also have auxiliary rewards that encourage them to explore different parts of the strategy space -- for example, an agent might get reward for building a specific type of unit. Once training is done, we have a final population of agents. Using their empirical win probabilities, we can construct a Nash equilibrium of these agents, which forms the final AlphaStar agent. (Note: I'm not sure if at the beginning of the game, one of the agents is chosen according to the Nash probabilities, or if at each timestep an action is chosen according to the Nash probabilities. I would expect the former, since the latter would result in one agent making a long-term plan that is then ruined by a different agent taking some other action, but the blog post seems to indicate the latter -- with the former, it's not clear why the compute ability of a GPU restricts the number of agents in the Nash equilibrium, which the blog post mentions.)

There are also a bunch of interesting technical details on how they get this to actually work, which you can get some information about in this Reddit AMA. For example, "we included a policy distillation cost to ensure that the agent continues to try human-like behaviours with some probability throughout training, and this makes it much easier to discover unlikely strategies than when starting from self-play", and "there are elements of our research (for example temporally abstract actions that choose how many ticks to delay, or the adaptive selection of incentives for agents) that might be considered “hierarchical”". But it's probably best to wait for the journal publication (which is currently in preparation) for the full details.

I'm particularly interested by this Balduzzi et al paper that gives some more theoretical justification for the population-based training. In particular, the paper introduces the concept of "gamescapes", which can be thought of as a geometric visualization of which strategies beat which other strategies. In some games, like "say a number between 1 and 10, you get reward equal to your number - opponent's number", the gamescape is a 1-D line -- there is a scalar value of "how good a strategy is", and a better strategy will beat a weaker strategy. On the other hand, rock-paper-scissors is a cyclic game, and the gamescape looks like a triangle -- there's no strategy that strictly dominates all other strategies. Even the Nash strategy of randomizing between all three actions is not the "best", in that it fails to exploit suboptimal strategies, eg. the strategy of always playing rock. With games that are even somewhat cyclic (such as StarCraft), rather than trying to find the Nash equilibrium, we should try to explore and map out the entire strategy space. The paper also has some theoretical results supporting this that I haven't read through in detail.
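
As a concrete illustration of the cyclic case, here is a minimal sketch that recovers the rock-paper-scissors Nash equilibrium by fictitious play over the payoff matrix; one could run the same procedure over an empirical win-probability matrix between trained agents. This is my own toy example, not DeepMind's code.

```python
import numpy as np

# Row player's payoffs for rock-paper-scissors (zero-sum, cyclic).
A = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)

def fictitious_play(A, iters=20000):
    """Each side best-responds to the other's empirical action frequencies.
    For two-player zero-sum games these frequencies converge to a Nash equilibrium."""
    n = A.shape[0]
    counts_row, counts_col = np.ones(n), np.ones(n)
    for _ in range(iters):
        counts_row[np.argmax(A @ (counts_col / counts_col.sum()))] += 1
        counts_col[np.argmin((counts_row / counts_row.sum()) @ A)] += 1
    return counts_row / counts_row.sum()

print(fictitious_play(A).round(3))  # -> approximately [0.333, 0.333, 0.333]
```

As the gamescape framing suggests, this uniform Nash strategy is unexploitable but also fails to exploit weak opponents, which is why mapping the whole strategy space can beat just computing the equilibrium.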

Rohin's opinion: I don't care very much about whether AlphaStar is superhuman or not -- it clearly is very good at StarCraft at both the micro and macro levels. Whether it hits the rather arbitrary level of "top human performance" is not as interesting as the fact that it is anywhere in the ballpark of "top human performance".

It's interesting to compare this to OpenAI Five (AN #13). While OpenAI solved the exploration problem using a combination of reward shaping and domain randomization, DeepMind solved it by using imitation learning on human games. While OpenAI relied primarily on self-play, DeepMind used population-based training in order to deal with catastrophic forgetting and in order to be robust to many different strategies. It's possible that this is because of the games they were playing -- it's plausible to me that StarCraft has more rock-paper-scissors-like cyclic mechanics than Dota, and so it's more important to be robust to many strategies in StarCraft. But I don't know either game very well, so this is pure speculation.

Exploring the full strategy space rather than finding the Nash equilibrium seems like the right thing to do, though I haven't kept up with the multiagent RL literature so take that with a grain of salt. That said, it doesn't seem like the full solution -- you also want some way of identifying what strategy your opponent is playing, so that you can choose the optimal strategy to play against them.

I often think about how you can build AI systems that cooperate with humans. This can be significantly harder: in competitive games, if your opponent is more suboptimal than you were expecting, you just crush them even harder. However, in a cooperative game, if you make a bad assumption about what your partner will do, you can get significantly worse performance. (If you've played Hanabi, you've probably experienced this.) Self-play does not seem like it would handle this situation, but this kind of population-based training could potentially handle it, if you also had a method to identify how your partner is playing. (Without such a method, you would play some generic strategy that would hopefully be quite robust to playstyles, but would still not be nearly as good as being able to predict what your partner does.)

Read more: Open-ended Learning in Symmetric Zero-sum Games, AMA with AlphaStar creators and pro players, and Vox: StarCraft is a deep, complicated war strategy game. Google’s AlphaStar AI crushed it.

Disentangling arguments for the importance of AI safety (Richard Ngo): This post lays out six distinct arguments for the importance of AI safety. First, the classic argument that expected utility maximizers (or, as I prefer to call them, goal-directed agents) are dangerous because of Goodhart's Law, fragility of value and convergent instrumental subgoals. Second, we don't know how to robustly "put a goal" inside an AI system, such that its behavior will then look like the pursuit of that goal. (As an analogy, evolution might seem like a good way to get agents that pursue reproductive fitness, but it ended up creating humans who decidedly do not pursue reproductive fitness single-mindedly.) Third, as we create many AI systems that gradually become the main actors in our economy, these AI systems will control most of the resources of the future. There will likely be some divergence between what the AI "values" and what we value, and for sufficiently powerful AI systems we will no longer be able to correct these divergences, simply because we won't be able to understand their decisions. Fourth, it seems that a good future requires us to solve hard philosophy problems that humans cannot yet solve (so that even if the future was controlled by a human it would probably not turn out well), and so we would need to either solve these problems or figure out an algorithm to solve them. Fifth, powerful AI capabilities could be misused by malicious actors, or they could inadvertently lead to doom through coordination failures, eg. by developing ever more destructive weapons. Finally, the broadest argument is simply that AI is going to have a large impact on the world, and so of course we want to ensure that the impact is positive.

Richard then speculates on what inferences to make from the fact that different people have different arguments for working on AI safety. His primary takeaway is that we are still confused about what problem we are solving, and so we should spend more time clarifying fundamental ideas and describing particular deployment scenarios and corresponding threat models.

Rohin's opinion: I think the overarching problem is the last one, that AI will have large impacts and we don't have a strong story for why they will necessarily be good. Since it is very hard to predict the future, especially with new technologies, I would expect that different people trying to concretize this very broad worry into a more concrete one would end up with different scenarios, and this mostly explains the proliferation of arguments. Richard does note a similar effect by considering the example of what arguments the original nuclear risk people could have made, and finding a similar proliferation of arguments.

Setting aside the overarching argument #6, I find all of the arguments fairly compelling, but I'm probably most worried about #1 (suitably reformulated in terms of goal-directedness) and #2. It's plausible that I would also find some of the multiagent worries more compelling once more research has been done on them; so far I don't have much clarity about them.

Technical AI alignment

Iterated amplification sequence

Learning with catastrophes (Paul Christiano): In iterated amplification, we need to train a fast agent from a slow one produced by amplification (AN #42). We need this training to be such that the resulting agent never does anything catastrophic at test time. In iterated amplification, we do have the benefit of having a strong overseer who can give good feedback. This suggests a formalization for catastrophes. Suppose there is some oracle that can take any sequence of observations and actions and label it as catastrophic or not. How do we use this oracle to train an agent that will never produce catastrophic behavior at test time?

Given unlimited compute and unlimited access to the oracle, this problem is easy: simply search over all possible environments and ask the oracle if the agent behaves catastrophically on them. If any such behavior is found, train the agent to not perform that behavior any more. Repeat until all catastrophic behavior is eliminated. This is basically a very strong form of adversarial training.
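
In pseudocode, that idealized loop looks something like the sketch below. Every helper here (enumerate_environments, oracle_is_catastrophic, train_away) is a hypothetical placeholder for a capability we do not actually have, which is exactly what the unlimited-compute framing assumes away.

```python
# Idealized adversarial training against catastrophes, assuming unlimited
# compute and oracle access. All helper functions are hypothetical placeholders.
def make_safe(agent, enumerate_environments, oracle_is_catastrophic, train_away):
    while True:
        found_catastrophe = False
        for env in enumerate_environments():          # search all possible inputs
            transcript = env.run(agent)               # sequence of observations and actions
            if oracle_is_catastrophic(transcript):    # the strong overseer's judgment
                agent = train_away(agent, transcript) # train against that behavior
                found_catastrophe = True
        if not found_catastrophe:                     # no catastrophic behavior remains
            return agent
```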

Rohin's opinion: I'm not sure how necessary it is to explicitly aim to avoid catastrophic behavior -- it seems that even a low capability corrigible agent would still know enough to avoid catastrophic behavior in practice. However, based on Techniques for optimizing worst-case performance, summarized below, it seems like the motivation is actually to avoid catastrophic failures of corrigibility, as opposed to all catastrophes.

In fact, we can see that we can't avoid all catastrophes without some assumption on either the environment or the oracle. Suppose the environment can do anything computable, and the oracle evaluates behavior only based on outcomes (observations). In this case, for any observation that the oracle would label as catastrophic, there is an environment that regardless of the agent's action outputs that observation, and there is no agent that can always avoid catastrophe. So for this problem to be solvable, we need to either have a limit on what the environment "could do", or an oracle that judges "catastrophe" based on the agent's action in addition to outcomes. That latter option can cash out to "are the actions in this transcript knowably going to cause something bad to happen", which sounds very much like corrigibility.

Thoughts on reward engineering (Paul Christiano): This post digs into some of the "easy" issues with reward engineering (where we must design a good reward function for an agent, given access to a stronger overseer).

First, in order to handle outcomes over long time horizons, we need to have the reward function capture the overseer's evaluation of the long-term consequences of an action, since it isn't feasible to wait until the outcomes actually happen.

Second, since human judgments are inconsistent and unreliable, we could have the agent choose an action such that there is no other action which the overseer would evaluate as better in a comparison between the two. (This is not exactly right -- the human's comparisons could be such that this is an impossible standard. The post uses a two-player game formulation that avoids the issue, and gives the guarantee that the agent won't choose something that is unambiguously worse than another option.)

Third, since the agent will be uncertain about the overseer's reward, it will have the equivalent of normative uncertainty -- how should it trade off between different possible reward functions the overseer could have? One option is to choose a particular yardstick, eg. how much the overseer values a minute of their time, some small amount of money, etc. and normalize all rewards to that yardstick.

Fourth, when there are decisions with very widely-varying scales of rewards, traditional algorithms don't work well. Normally we could focus on the high-stakes decisions and ignore the others, but if the high-stakes decisions occur infrequently then all decisions are about equally important. In this case, we could oversample high-stakes decisions and reduce their rewards (i.e. importance sampling) to use traditional algorithms to learn effectively without changing the overall "meaning" of the reward function. However, very rare+high-stakes decisions will probably require additional techniques.
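
A minimal sketch of the oversample-and-downweight idea: if a class of decision is sampled in training with higher probability than it occurs naturally, multiplying its reward by the ratio of natural to training probability leaves the expected reward unchanged. The pools and probabilities below are hypothetical stand-ins.

```python
import random

def sample_training_case(high_stakes_pool, routine_pool,
                         p_natural=0.001, p_train=0.5):
    """Oversample high-stakes decisions, importance-weighting their rewards so
    the expected reward still matches the natural distribution."""
    if random.random() < p_train:
        case = random.choice(high_stakes_pool)
        weight = p_natural / p_train              # downweight the oversampled cases
    else:
        case = random.choice(routine_pool)
        weight = (1 - p_natural) / (1 - p_train)  # slight upweight for the rest
    return case.observation, case.reward * weight
```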

Fifth, for sparse reward functions where most behavior is equally bad, we need to provide "hints" about what good behavior looks like. Reward shaping is the main current approach, but we do need to make sure that by the end of training we are using the true reward, not the shaped one. Lots of other information such as demonstrations can also be taken as hints that allow you to get higher reward.

Finally, the reward will likely be sufficiently complex that we cannot write it down, and so we'll need to rely on an expensive evaluation by the overseer. We will probably need semi-supervised RL in order to make this sufficiently computationally efficient.

Rohin's opinion: As the post notes, these problems are only "easy" in the conceptual sense -- the resulting RL problems could be quite hard. I feel most confused about the third and fourth problems. Choosing a yardstick could work to aggregate reward functions, but I still worry about the issue that this tends to overweight reward functions that assign a low value to the yardstick but high value to other outcomes. With widely-varying rewards, it seems hard to importance sample high-stakes decisions, without knowing what those decisions might be. Maybe if we notice a very large reward, we instead make it lower reward, but oversample it in the future? Something like this could potentially work, but I don't see how yet.

For complex, expensive-to-evaluate rewards, Paul suggests using semi-supervised learning; this would be fine if semi-supervised learning was sufficient, but I worry that there actually isn't enough information in just a few evaluations of the reward function to narrow down on the true reward sufficiently, which means that even conceptually we will need something else.

Techniques for optimizing worst-case performance (Paul Christiano): There are “benign” failures of worst-case performance, where the AI system encounters a novel situation and behaves weirdly, but not in a way that systematically disfavors human values. As I noted above, we can’t get rid of all of these, but that’s probably fine. We instead would like to focus on “malign” failures, where the AI system applies its intelligence in pursuit of the wrong goal. There are a few techniques that could be applied to this problem.

With adversarial training, we can have one system find inputs on which our agent fails catastrophically, and then train the agent to avoid those behaviors. The main issue here is that there will likely be some failures that aren't found.

On the other hand, verification has strong guarantees, but suffers from the problem that it is hard to know what to specify, and it is computationally expensive to actually perform verification. If we have a strong trusted overseer, eg. produced by amplification, we could use it as a very expensive specification. Alternatively, we could use a catastrophe-checker in lieu of a specification. (Note: While I can see some ways of using catastrophe-checkers, the post seems to have a specific method in mind that I don't understand.)

Any malignant failure must be using the intelligence of the agent somehow, and the agent is only intelligent on the training data, so if we can use strong transparency techniques on the training data, we could find such failures. However, if you use transparency for this, by default you weed out the comprehensible failures and leave in the incomprehensible ones.

Rohin's opinion: While I agree that it would be great to optimize worst-case performance, all of these techniques feel quite difficult to do scalably and with guarantees. With adversarial training, you need to find all of the ways that an agent could fail, while catastrophe could happen if the agent stumbles across any of these methods. It seems plausible to me that with sufficient additional information given to the adversary we can meet this standard, but it seems very hard to knowably meet this standard, i.e. to have a strong argument that we will find all of the potential issues.

With verification, the specification problem seems like a deal-breaker, unless combined with other methods: a major point with AI safety is that we can't write down a good specification for what we want. If we instead only use verification to propagate worst-case guarantees from one model to another (as the post suggests), then it seems possible in theory, but very expensive in practice: most verification techniques assume unlimited fast access to the specification, whereas our specification is very expensive to query. Of course, not much research has focused on this setting, so we can expect large gains; nonetheless you do need to somehow extrapolate the specification to all possible inputs, which seems hard to do with limited access to the specification.

Transparency also seems like it provides additional safety, rather than making any guarantees, since we probably can't get a guarantee that our transparency mechanisms can show us all possible failure modes in a way that we understand them. The argument that we can only focus on the training data makes the assumption that the AI system is not going to generalize well outside of the training dataset. While I'm sympathetic to this assumption (we really don't have good methods for generalization, and there are strong reasons to expect generalization to be near-impossible), it isn't one that I'm confident about, especially when we're talking about general intelligence.

Of course, I'm still excited for more research to be done on these topics, since they do seem to cut out some additional failure modes. But if we're looking to have a semi-formal strong argument that we will have good worst-case performance, I don't see the reasons for optimism about that.

Value learning sequence

The human side of interaction (Rohin Shah): The lens of human-AI interaction (AN #41) also suggests that we should focus on what the human should do in AI alignment.

Any feedback that the AI system gets must be interpreted using some assumption. For example, when a human provides an AI system a reward function, it shouldn't be interpreted as a description of optimal behavior in every possible situation (which is what we currently do implicitly). Inverse Reward Design (IRD) suggests an alternative, more realistic assumption: the reward function is likely to the extent that it leads to high true utility in the training environment. Similarly, in inverse reinforcement learning (IRL) human demonstrations are often interpreted under the assumption of Boltzmann rationality.
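
A toy version of the IRD assumption, in case it helps: the stated proxy reward is treated as evidence about the true reward in proportion to how well proxy-optimal behavior scores under each candidate true reward in the training environment. Everything below (the planner, the reward objects, the Boltzmann form with the normalization over alternative proxies dropped) is a simplified stand-in, not the full IRD model.

```python
import numpy as np

def ird_posterior(candidate_true_rewards, proxy_reward, training_env,
                  optimize, beta=1.0):
    """Simplified IRD-style inference: P(true | proxy) is proportional to
    exp(beta * true utility of the proxy-optimal policy in the training env)."""
    proxy_policy = optimize(proxy_reward, training_env)  # hypothetical planner
    scores = np.array([r.evaluate(proxy_policy, training_env)
                       for r in candidate_true_rewards])
    weights = np.exp(beta * scores)
    return weights / weights.sum()
```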

Analogously, we may also want to train humans to give feedback to AI systems in the manner that they are expecting. With IRD, the reward designer should make sure to test the reward function extensively in the training environment. If we want our AI system to help us with long-term goals, we may want the overseers to be much more cautious and uncertain in their feedback (depending on how such feedback is interpreted). Techniques that learn to reason like humans, such as iterated amplification and debate, would by default learn to interpret feedback the way humans do. Nevertheless it will probably be useful to train humans to provide useful feedback: for example, in debate, we want humans to judge which side provided more true and useful information.

Future directions for narrow value learning (Rohin Shah): This post summarizes some future directions for narrow value learning that I'm particularly interested in from a long-term perspective.

Problems

Disentangling arguments for the importance of AI safety (Richard Ngo): Summarized in the highlights!

Agent foundations

Clarifying Logical Counterfactuals (Chris Leong)

Learning human intent

ReNeg and Backseat Driver: Learning from Demonstration with Continuous Human Feedback (Jacob Beck et al)

Handling groups of agents

Theory of Minds: Understanding Behavior in Groups Through Inverse Planning (Michael Shum, Max Kleiman-Weiner et al) (summarized by Richard): This paper introduces Composable Team Hierarchies (CTH), a representation designed for reasoning about how agents reason about each other in collaborative and competitive environments. CTH uses two "planning operators": the Best Response operator returns the best policy in a single-agent game, and the Joint Planning operator returns the best team policy when all agents are cooperating. Competitive policies can then be derived via recursive application of those operations to subsets of agents (while holding the policies of other agents fixed). CTH draws from ideas in level-K planning (in which each agent assumes all other agents are at level K-1) and cooperative planning, but is more powerful than either approach.

The authors experiment with using CTH to probabilistically infer policies and future actions of agents participating in the stag-hunt task; they find that these judgements correlate well with human data.

Richard's opinion: This is a cool theoretical framework. Its relevance depends on how likely you think it is that social cognition will be a core component of AGI, as opposed to just another task to be solved using general-purpose reasoning. I imagine that most AI safety researchers lean towards the latter, but there are some reasons to give credence to the former.

Forecasting

Forecasting Transformative AI: An Expert Survey (Ross Gruetzemacher et al)

Near-term concerns

Fairness and bias

Identifying and Correcting Label Bias in Machine Learning (Heinrich Jiang and Ofir Nachum)

AI strategy and policy

FLI Podcast - Artificial Intelligence: American Attitudes and Trends (Ariel Conn and Baobao Zhang): This is a podcast about The American Public’s Attitudes Concerning Artificial Intelligence (AN #41), you can see my very brief summary of that.

Other progress in AI

Exploration

Amplifying the Imitation Effect for Reinforcement Learning of UCAV's Mission Execution (Gyeong Taek Lee et al)

Reinforcement learning

AlphaStar: Mastering the Real-Time Strategy Game StarCraft II (The AlphaStar team): Summarized in the highlights!

Deep learning

Attentive Neural Processes (Hyunjik Kim et al)

News

SafeML ICLR 2019 Call for Papers (Victoria Krakovna et al): The SafeML workshop has a paper submission deadline of Feb 22, and is looking for papers on specification, robustness and assurance (based on Building safe artificial intelligence: specification, robustness, and assurance (AN #26)).




Discuss

The Question Of Perception

January 29, 2019 - 23:59
Published on January 29, 2019 8:59 PM UTC

Quotes from the post:

Once during an Indian coconut harvest, a farmer, tired from his day’s work of chopping down fruit, slumped down in the shade of a tree to enjoy a coconut, and upon splitting it open, found inside a message from God (or in this case, from Vishnu, his Hindu deity). The Brahmic writing was plainly visible for anyone to see, spelt out in the two halves of oily white meat. The implications of such an experience could only have been one of the following: the first is that the Supreme Being has no qualms about revealing his Divine Will in the contents of mere palm fruit, any more than in a whirlwind or through an oracle. The second is that the farmer’s perceptual systems produced an inaccurate representation of what he saw engraved in the fruit lining. It’s anybody’s guess as to which it really was. But regardless of whatever information was contained in the coconut, the moral of the story is that whilst miracles are known to happen, it can also be said that people regularly see things which are not there.

That we end up being misled by our senses is a widely-accepted truism, as most humans mistake the limits of their perception for the limits of the world itself. It’s not uncommon for people to push forward in their endeavors with a premature understanding of things, using inadequate standards to measure what they see in the world around them. Such errors in judgement range from being mild, such as mistakenly purchasing a rotten apple because you didn’t inspect its underside, to being detrimental, such as presuming that the oasis in the middle of the desert is real, when it’s only a hallucination. Hence, there are times when people are not so much moved by external objects as by their perception of those objects; what they claim to “know” is in fact only known conditionally, since human knowledge is always limited, thanks in part to our limited perceptual systems. Rare are those whose conception of reality is not exclusively dependent on their perception of it, who understand that there are things which you know and things which you don’t know, and in between are several doorways, of which one is called “Perception”; this door must be entered with prudence.

It is a common presumption that people see the world primarily with their eyes, but in fact, that is only true when they look at the objects that they recognize; things they have seen before and have categorized in their mind. In a case like that, they would have already built up the necessary perceptual tools that allow them to properly identify what they’re looking at. But when they look at something that they don’t recognize, something alien or new to them, the imagination takes over the function of the eyes, and becomes their primary tool of orientation. Whilst our physical senses may reveal to us the physical world, it is our imagination that allows us to project ourselves beyond finite time and space, into the realm of possibilities, abstracts and narratives.

In terms of perceptual experience, the dominant line of thought is that the world is primarily made up of objects. This may seem fairly straightforward, since you presumably see these objects all around you: buildings, street lights, vehicles, telephones, animals, humans, etc. As a consequence of seeing these objects, you generate thoughts about how to interact with them, and after completing your thought process, you proceed to act. As self-evident as this might appear, it begs further reflection. Let’s imagine a scenario where you’re sitting in a cafe, and you happen to catch a glimpse of an attractive individual at a nearby table. On one level, you perceive them as an object of interest, i.e., an intriguing material thing to be observed. Captivated by their appearance, you immediately start to draft assumptions about what kind of person they are, and how you might approach them to initiate a relationship. But in your enamored state, what you failed to consider was that there are other levels of this person’s existence which are invisible, yet equally defining for them. In terms of the biological, that person exists as a collection of billions of cells that perform numerous functions uninterruptedly. At a higher level of organization, the cells form tissues that perform specific bodily functions, and those tissues collectively form organs, and so on until we finally get to what the person looks like in front of you in their total embodied form. None of these other levels of biological existence are less relevant than the person’s overall appearance to you as an object. If it turns out that he or she is a cancer patient, battling a tumor, then the import of their unseen cellular reality becomes quite relevant. And there are yet other levels of analysis: this person will have inextricable social ties that define them, such as friends and family, who may come from different backgrounds, cultures and other group categories based on their ethnicity, level of education, income bracket, etc. And even those groups are connected to yet other groupings, until who this person is can be expanded to encompass virtually anything. But when you blissfully observe them from your nearby table, you don’t see any of that reflected in them as a mere object. You can only see them at a certain level of resolution, as mediated by your ideas and impressions, yet all of the other details that escape your attention are equally relevant for defining who they are.

The idea of the world being more than just a collection of objects can be seen not only in social relationships or biological matter, but also in how we interact with inanimate devices. A computer is viewed as an object, but when you interact with it, you’re not really interacting with the computer itself, which is essentially a motherboard and other internal electric components. A keyboard, mouse and a graphical user interface have been provided for you to use, and those tools will interact with the computer for you. But if the computer were to unexpectedly crash, then you would be forced to interact with the machine itself, which most people find frustrating, as they realize that they know very little about how computers actually work.

Similarly, when you interact with the world, your desire is mainly to produce a favorable result for yourself, for which a technical understanding of how things work is often unnecessary. This is why, from public transport to consumer technology, we are often met by a friendly, uncomplicated user interface or control surface when interacting with the infrastructure around us, which hides the complex configurations that more accurately would define the object in front of us. What this means is that people’s perceptions are ultimately framed by the things they want or have been trained to see, and not an actual understanding of the world or its technical workings.

So in terms of perceptual experience, the world is not primarily made up of the innumerable objects that you see around you; it’s made up of information. More practically speaking, it’s made up of tools and obstacles; things that you can use for your purposes, and things that get in your way. An illustration of this can be seen in how babies relate to the world: it takes years for them to build up an object-based view of their environment, as they have little to no comprehension of what their surroundings actually are, yet they still manage to orient themselves, albeit somewhat clumsily.

So even though people may look at the world and think they only see objects, there are in fact multiple processes that influence them to make that judgement, such as their pre-conceptions, desires, and past experiences. It is only because people have such a limited perception of things that they fail to realize that an object always transcends the manner in which they frame it.

It would be interesting to hear whether people recognize the above ideas as something familiar, or view them as a set of open-ended concepts that have yet to reach a satisfactory conclusion.

The article was originally published on Greekspeek.com



Discuss

Which textbook would you recommend to learn decision theory?

January 29, 2019 - 23:48
Published on January 29, 2019 8:48 PM UTC

Eliezer talks a lot about decision theory in his sequences, e.g. the Aumann agreement theorem or the von Neumann-Morgenstern utility theorem. From what I've seen so far, decision theory looks extremely interesting.

Which textbook in decision theory would you recommend starting with? I'd appreciate it if the book contained not only theory, but also some exercise/problem sections - I have noticed that a lecture alone is usually not enough to fully grasp a topic. I want a book which will not shy away from the mathematical side of the theory.

I have a strong background in mathematics and computer science, but I only know a little about game theory.



Discuss

Towards equilibria-breaking methods

January 29, 2019 - 19:19
Published on January 29, 2019 4:19 PM UTC

All of this is vague intuition.

Much has been said about the problems of Nash equilibria. I've been wondering about how to quantify these sorts of problems. From Moloch's Toolbox:

3.  Systems that are broken in multiple places so that no one actor can make them better, even though, in principle, some magically coordinated action could move to a new stable state.

My animating thought is that information and incentives aren't evenly distributed, and further there is an important time dimension to the problem. I expect it should be possible to temporarily disrupt any given equilibrium. From disruption, a better equilibrium could be reached. Alternatively, maybe it is possible to go directly to the better equilibrium.

  • We should be able to get knowledge of the actors.
  • Information, which is to say beliefs-about-the-current-equilibrium, is neither perfect nor evenly distributed among them. How imperfect and how uneven?
  • With knowledge of the actors, we should be able to get a distribution of their incentives.
  • Now we know how many actors, and their incentives, and their information. With this knowledge, we should be able to determine how big an incentive we need to offer in order to shift them into a target equilibrium.
  • If we want it to be a new equilibrium, we will need to affect enough actors to make it fairly stable. How many actors will we need, a la the hundredth monkey effect?
  • It is probably important to be able to tailor incentives to actors.
  • My intuition for the threshold is that the weaker an actor's belief in the current equilibrium, the less incentive will be required to shift them. So the target is something like (current incentives from equilibrium + incentive to overcome belief); a minimal sketch of this calculation follows the list.
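
As a rough illustration of that threshold, here is a minimal sketch in Python. Everything in it is an illustrative assumption: the linear belief-cost term, the stability threshold, and all the names are made up rather than taken from any worked-out model.

```python
# Minimal sketch of the incentive-threshold idea above. The linear
# belief-cost term and the stability threshold are illustrative
# assumptions, not a worked-out model.
from dataclasses import dataclass

@dataclass
class Actor:
    payoff_current: float  # payoff from the current equilibrium
    payoff_target: float   # payoff in the target equilibrium
    belief: float          # belief in the current equilibrium, in [0, 1]

def incentive_needed(actor: Actor, belief_cost: float = 1.0) -> float:
    """Incentive to move one actor: the payoff gap they give up, plus a
    term for overcoming their belief in the status quo."""
    gap = max(0.0, actor.payoff_current - actor.payoff_target)
    return gap + belief_cost * actor.belief

def total_cost(actors: list[Actor], threshold_fraction: float = 0.6) -> float:
    """Cheapest way to flip enough actors to make the new equilibrium
    stable: pay the cheapest `threshold_fraction` of them."""
    costs = sorted(incentive_needed(a) for a in actors)
    k = int(len(costs) * threshold_fraction)
    return sum(costs[:k])

actors = [Actor(10, 8, 0.9), Actor(5, 7, 0.2), Actor(6, 6, 0.5)]
print(total_cost(actors))  # cheapest bundle of incentives for 60% of actors
```

The point of writing it down this way is that the cost of breaking an equilibrium becomes an optimization over which actors to target, which is exactly what makes tailoring incentives to actors valuable.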

I think of this like surveying a rock face for blasting. Once we identify the lowest-dynamite method, anyone else with enough dynamite can come along and detonate.

My default assumption is that this will take the traditional form of getting a lot of capital and using that to directly or indirectly provide these incentives. Under this assumption there needs to be a reason for that capital to be available, which is to say a profit; maybe this is realizable through market operations directly. Alternatively this could be the vehicle for a startup's pitch; it seems like this process would qualify as superlative market research.

However, it does occur to me that actors might be more like individuals who work in key positions, rather than entire firms. If that is the case I expect the belief-disparities to be even higher than they would be for firms, and the cost of shifting them probably smaller.

Near as I can tell, no one has really put a startup-pitch's worth of effort into describing an attempt at shifting an equilibrium, especially not in the sense that we talk about them.



Discuss

Can there be an indescribable hellworld?

January 29, 2019 - 18:00
Published on January 29, 2019 3:00 PM UTC


Can there be an indescribable hellworld? What about an un-summarisable one?

By hellworld, I mean a world of very low value according to our value scales - maybe one where a large number of simulations are being tortured (aka mind crimes).

A hellworld could look superficially positive, if we don't dig too deep. It could look irresistibly positive.

Could it be bad in a way that we would find indescribable? It seems that it must be possible. The set of things that can be described to us is finite; the set of things that can be described to us without fundamentally changing our values is much smaller still. If a powerful AI was motivated to build a hellworld such that the hellish parts of it were too complex to be described to us, it would seem that it could. There is no reason to suspect that the set of indescribable worlds contains only good worlds.

Can it always be summarised?

Let's change the setting a bit. We have a world W, and a powerful AI A that is giving us information about W. A is aligned/friendly/corrigible or whatever we need it to be. It's also trustworthy, in that it always speaks to us in a way that increases our understanding.

Then if W is an indescribable hellworld, can A summarise that fact for us?

It seems that it can. In the very trivial sense, it can, by just telling us "it's an indescribable hellworld". But it seems it can do more than that, in a way that's philosophically interesting.

A hellworld is ultimately a world that is against our values. However, our values are underdefined and changeable. So to have any chance of saying what these values are, we need to either extract key invariant values, synthesise our contradictory values into some complete whole, or use some extrapolation procedure (eg CEV). In any case, there is a procedure for establishing our values (or else the very concept of "hellworld" makes no sense).

Now, it is possible that our values themselves may be indescribable to us now (especially in the case of extrapolations). But A can at least tell us that W is against our values, and provide some description as to the value it is against, and what part of the procedure ended up giving us that value. This does give us some partial understanding of why the hellworld is bad - a useful summary, if you want.

On a more meta level, imagine the contrary - that W was a hellworld, but the superintelligent agent A could not indicate what human values it actually violated, even approximately. Since our values are not some ex nihilo thing floating in space, but derived from us, it is hard to see how something could be against our values in a way that could never be summarised to us. That seems almost definitionally impossible: if the violation of our values can never be summarised, even at the meta level, how can it be a violation of our values?

Trustworthy debate is FAI complete

The consequence seems to be that we can avoid hellworlds (and, presumably, aim for heaven) by having a corrigible and trustworthy AI that engages in debate or acts as a devil's advocate. Now, I'm very sceptical of getting corrigible or trustworthy AIs in general, but it seems that if we can, we've probably solved the FAI problem.

Note that even in the absence of a single given way of formalising our values, the AI could list the plausible formalisations for which W was or wasn't a hellworld.



Discuss

How much can value learning be disentangled?

January 29, 2019 - 17:17
Published on January 29, 2019 2:17 PM UTC

In the context of whether the definition of human values can be disentangled from the process of approximating/implementing that definition, David asks me:

  • But I think it's reasonable to assume (within the bounds of a discussion) that there is a non-terrible way (in principle) to specify things like "manipulation". So do you disagree?

I think it's a really good question, and its answer is related to a lot of relevant issues, so I put this here as a top-level post. My current feeling is, contrary to my previous intuitions, that things like "manipulation" might not be possible to specify in a way that leads to useful disentanglement.

Why manipulate?

First of all, we should ask why an AI would be tempted to manipulate us in the first place. It may be that it needs us to do something for it to accomplish its goal; in that case, it is trying to manipulate our actions. Or maybe its goal includes something that cashes out as our mental states; in that case, it is trying to manipulate our mental state directly.

The problem is that any reasonable friendly AI would have our mental states as part of its goal - it would at least want us to be happy rather than miserable. And (almost) any AI that wasn't perfectly indifferent to our actions would be trying to manipulate us just to get its goals accomplished.

So manipulation is to be expected from most AI designs, friendly or not.

Manipulation versus explanation

Well, since the urge to manipulate is expected to be present, could we just rule it out? The problem is that we need to define the difference between manipulation and explanation.

Suppose I am fully aligned/corrigible/nice or whatever other properties you might desire, and I want to inform you of something important and relevant. In doing so, especially if I am more intelligent than you, I will simplify, I will omit irrelevant details, I will omit arguably relevant details, I will emphasise things that help you get a better understanding of my position, and de-emphasise things that will just confuse you.

And these are exactly the same sorts of behaviours that a smart manipulator would engage in. Nor can we define the difference as whether the AI is truthful or not. We want human understanding of the problem, not truth. It's perfectly possible to manipulate people while telling them nothing but the truth. And if the AI structures the order in which it presents the true facts, it can manipulate people while presenting the whole truth as well as nothing but the truth.

It seems that the only difference between manipulation and explanation is whether we end up with a better understanding of the situation at the end. And measuring understanding is very subtle. And even if we do it right, note that we have now motivated the AI to... aim for a particular set of mental states. We are rewarding it for manipulating us. This is contrary to the standard understanding of manipulation, which focuses on the means, not the end result.

Bad behaviour and good values

Does this mean that the situation is completely hopeless? No. There are certain manipulative practices that we might choose to ban. Especially if the AI is limited in capability at some level, this would force it to follow behaviours that are less likely to be manipulative.

Essentially, there is no boundary between manipulation and explanation, but there is a difference between extreme manipulation and explanation, so ruling out the first can help (or maybe not).

The other thing that can be done is to ensure that the AI has values close to ours. The closer the values of the AI are to us, the less manipulation it will need to use, and the less egregious the manipulation will be. It might be that, between partial value convergence and ruling out specific practices (and maybe some physical constraints), we may be able to get an AI that is very unlikely to manipulate us much.

Incidentally, I feel the same about low-impact approaches. The fully general problem (an AI that is low impact but value-agnostic) is, I think, impossible. But if the values of the AI are better aligned with ours, and more physically constrained, then low impact becomes easier to define.



Discuss

[Link] Did AlphaStar just click faster?

January 29, 2019 - 02:45
Published on January 28, 2019 8:23 PM UTC

This is a linkpost for: https://medium.com/@aleksipietikinen/an-analysis-on-how-deepminds-starcraft-2-ai-s-superhuman-speed-could-be-a-band-aid-fix-for-the-1702fb8344d6.

tl;dr: AlphaStar clicked at a rate of 1000+ Actions Per Minute (APM) for five-second periods, and at a rate of 1500+ APM for fractions of a second. The fastest human players can't sustain anything above 500 APM for more than a second or two. Did AlphaStar just spam-click its way to victory?



Discuss

Techniques for optimizing worst-case performance

January 29, 2019 - 00:29
Published on January 28, 2019 9:29 PM UTC

If powerful ML systems fail catastrophically, they may be able to quickly cause irreversible damage. To be safe, it’s not enough to have an average-case performance guarantee on the training distribution — we need to ensure that even if our systems fail on new distributions or with small probability, they will never fail too badly.

The difficulty of optimizing worst-case performance is one of the most likely reasons that I think prosaic AI alignment might turn out to be impossible (if combined with an unlucky empirical situation).

In this post I want to explain my view of the problem and enumerate some possible angles of attack. My goal is to communicate why I have hope that worst-case guarantees are achievable.

None of these are novel proposals. The intention of this post is to explain my view, not to make a new contribution. I don’t currently work in any of these areas, and so this post should be understood as an outsider looking in, rather than coming from the trenches.

Malign vs. benign failures and corrigibility

I want to distinguish two kinds of failures:

  • “Benign” failures, where our system encounters a novel situation, doesn’t know how to handle it, and so performs poorly. The resulting behavior may simply be erratic, or may serve an external attacker. Their effect is similar to physical or cybersecurity vulnerabilities — they create an opportunity for destructive conflict but don’t systematically disfavor human values. They may pose an existential risk when combined with high-stakes situations, in the same way that human incompetence may pose an existential risk. Although these failures are important, I don’t think it is necessary or possible to eliminate them in the worst case.
  • “Malign” failures, where our system continues to behave competently but applies its intelligence in the service of an unintended goal. These failures systematically favor whatever goals AI systems tend to pursue in failure scenarios, at the expense of human values. They constitute an existential risk independent of any other destructive technology or dangerous situation. Fortunately, they seem both less likely and potentially possible to avoid even in the worst case.

I’m most interested in malign failures, and the narrower focus is important to my optimism.

The distinction between malign and benign failures is not always crisp. For example, suppose we try to predict a human’s preferences, then search over all strategies to find the one that best satisfies the predicted preferences. Guessing the preferences even a little bit wrong would create an adversarial optimizer incentivized to apply its intelligence to a purpose at odds with our real preferences. If we take this approach, incompetence does systematically disfavor human values.

By aiming for corrigible rather than optimal behavior (see here or here) I’m optimistic that it is possible to create a sharper distinction between benign and malign failures, which can be leveraged by the techniques below. But for now, this hope is highly speculative.

Amplification

I believe that these techniques are much more likely to work if we have access to an overseer who is significantly smarter than the model that we are trying to train. I hope that amplification makes this possible.

It seems realistic for a strong overseer to recognize an (input, output) pair as a malign failure mode (though it may require a solution to informed oversight). So now we have a concrete goal: find a model that never gives an output the overseer would diagnose as catastrophically bad.

Historically researchers in the AI safety community have been extremely pessimistic about reliability. I think part of that pessimism is because they have been imagining working with models much smarter than the overseer.

Techniques

I’ll describe three categories of techniques:

  • Adversarial training
  • Verification
  • Transparency

Previous versions of this list included implicit ensembles, e.g. Bayesian neural nets. I still think there might be useful techniques in that space, but I’ve become pessimistic enough that it no longer makes the list.

Adversarial training

(Examples: Explaining and Harnessing Adversarial Examples, Adversarial Machine Learning at Scale.)

If we want to know whether our model ever fails catastrophically, it’s natural to train an adversary to search for catastrophe-inducing inputs. If the adversary can’t find one, then we hope that none exist. In order to produce a catastrophe-free model, we can integrate the results of testing into the training process.
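
As a concrete illustration of this loop, here is a minimal sketch in PyTorch, in the FGSM style of the first paper cited above. The model, optimizer, data, and epsilon are placeholders; this is a sketch of the generic technique under those assumptions, not code from the cited work.

```python
# Minimal sketch of adversarial training: an FGSM adversary searches for
# failure-inducing inputs, and its findings are folded back into training.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon):
    """Adversary: perturb x in the direction that most increases the loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.1):
    """Train on both clean and adversarial examples, integrating the
    results of the adversary's testing into the training process."""
    x_adv = fgsm_attack(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that a gradient-based adversary like this only explores a small neighbourhood of the training data, which is exactly the weakness discussed next.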

The basic problem with this strategy is that no matter how hard you test, it’s always possible (realistically, it’s likely) that there are possible failure cases your adversary didn’t notice. For example, suppose that it takes 2ᴺ seconds to find the Nth failure. Because nature will be many times more complex than our training procedure, it’s inevitable that there are many types of failures that nature can find but we can’t.

We can attack this problem by trying to make the adversary’s job easier. Some possibilities:

  • Use interpretability techniques (see below) to help the adversary identify possible failures.
  • Use interpretability to expand what counts as catastrophic — if the adversary can get the model to think in a way that looks like it could lead to catastrophic behavior under different situations, that could count as a success for the adversary.
  • Combine with verification (see below) to help guide the adversary’s search and expand the space of allowed attacks.
  • Allow the adversary to directly control some of the internal computation of our model rather than merely controlling inputs. This requires a way to constrain the actions of the adversary — we want them to be able to produce any failure that could arise naturally, but if we give them too much power then they can trivially cause a catastrophic failure.
  • Change the system’s inputs so they are easier for the adversary to fake: see the “distorting observations” and “abstracting observations” sections of this post.
  • Make the adversary more powerful, without changing their task, by applying amplification.

Adversarial training faces other difficulties as well. For example, it might take a long time for adversarial training to converge, or training might never converge due to non-convexity. I think these are significantly less likely to be deal-breakers, but they are certainly important research problems if we actually want to apply adversarial training.

Verification

(Examples: Certified Defenses against Adversarial Examples, Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks)

In verification the goal is to find a certificate that there is no input on which a system behaves badly.

Verification has (at least) two problems:

  • Complexity. It’s really, really hard to verify properties of a computationally interesting model.
  • Specification. It’s unclear what we should be trying to verify.

Handling complexity is definitely challenging. But if we are free to train the model in order to facilitate verification, and if we are only interested in certifying some “easy” property that the model satisfies with slack, then it’s not clearly doomed.
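
To make "certifying an easy property with slack" concrete, here is a minimal sketch of one standard certification primitive, interval bound propagation, in plain numpy. The choice of this particular relaxation is mine for illustration; the cited papers use different (tighter) relaxations.

```python
# Minimal sketch of interval bound propagation for a ReLU network:
# propagate an input box [lo, hi] through the layers to get sound bounds
# on every output. If the bounds already rule out the bad behavior, that
# is a certificate covering every input in the box.
import numpy as np

def interval_forward(weights, biases, lo, hi):
    last = len(weights) - 1
    for i, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
        new_lo = W_pos @ lo + W_neg @ hi + b
        new_hi = W_pos @ hi + W_neg @ lo + b
        if i < last:  # ReLU on hidden layers only
            new_lo, new_hi = np.maximum(new_lo, 0), np.maximum(new_hi, 0)
        lo, hi = new_lo, new_hi
    return lo, hi

# Tiny example: output bounds for a 2-layer network over the unit box.
W1, b1 = np.array([[1.0, -1.0], [0.5, 2.0]]), np.zeros(2)
W2, b2 = np.array([[1.0, 1.0]]), np.zeros(1)
print(interval_forward([W1, W2], [b1, b2], np.zeros(2), np.ones(2)))
```

The bounds are loose, which is why training the model so that the certified property holds with slack matters so much in practice.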

Prima facie, specification looks more like an unfixable deal-breaker. In the rest of this section I'll give three possible approaches for finding specifications. I think none of these is satisfactory on its own, but they leave me optimistic that verification can be useful despite the difficulty of specification.

My current view is that specification and complexity are both potentially surmountable difficulties, but that it’s likely their resolutions will result in partial solutions that need to be combined with other approaches.

Use a slow model of the overseer as a specification. If our overseer is a physical process, then to actually get a specification we'd need to learn a model of the overseer. So we've just shifted the problem to the model of the overseer.

Fortunately, this move does potentially make the problem one step better. Because we don't need to use the overseer model at runtime, we can afford to use a much slower model. So verification could distill a very slow reliable model into a fast reliable model.

Amplification. If our overseer is produced by amplifying some process H, and we have a trusted model of H, then we can obtain a trusted model of the overseer. So if we were able to form a trusted model of the first step of amplification, then we could iteratively use verification to construct trusted models at each subsequent step.

We are still left with the problem at the first step of amplification. But at this step we might be able to restrict attention to a small range of questions for which worst-case guarantees are more feasible; we may be able to train weaker models for which worst-case guarantees are more feasible; or we might be able to eliminate the human altogether and begin amplification from some explicit simple core of reasoning. Any of these seems plausible.

“Verification” without specification. I find it helpful to think of verification from the primal/dual perspective, which also makes the connection to adversarial training clearer.

In adversarial training, the adversary wants to find an input on which the system behaves poorly. The dual to the adversary is an explicit list of all inputs, showing that each one of them performs well. The dual is exponentially large, and (relatedly) the adversary’s problem is exponentially hard.

We can view verification in a given proof system as making the adversary’s job easier, by expanding the set of allowable “attacks.” If we do this carefully, we can make the dual certificate much smaller and make the adversary’s task much more tractable.

For example, if our proofs (dual solutions) are in first order logic, the adversary’s task is finding any model of the axioms in which there exists an attack. (This is only different in the infinite case.) If our proofs are sum of squares proofs, the adversary’s task is finding a pseudodistribution over attacks. For certain linear programming proofs, the adversary’s goal is finding an attack on a particular extrapolation of the model. And so on.

On this perspective, we don’t necessarily need an actual specification. Instead, we can take some external catastrophe-checker, relax it to a catastrophe checker over the adversary’s relaxed action space, and then try to implement an adversary against that catastrophe checker. This would change the meaning of the dual, which would now have a subtle statistical interpretation rather than being a traditional certificate.

Transparency

(Examples: Feature Visualization, Deep Inside Convolutional Networks)

While training a model, we are free to look inside and “see what it’s thinking.” If we could actually understand those thoughts, we might be able to more easily predict under what conditions it would fail malignantly.
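
As one concrete handle on "seeing what it's thinking", here is a minimal sketch of activation maximization, the basic move behind the feature-visualization work cited above: optimize an input to strongly excite a chosen unit, then inspect the result. The model and unit index are placeholders, and real feature visualization adds regularizers that this sketch omits.

```python
# Minimal sketch of activation maximization: gradient ascent on the input
# to find what most excites one unit of the model.
import torch

def visualize_unit(model, input_shape, unit, steps=200, lr=0.1):
    x = torch.randn(1, *input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        activation = model(x)[0, unit]  # assumes (batch, units) output
        (-activation).backward()        # minimize negative = gradient ascent
        opt.step()
    return x.detach()
```

The resulting input is a noisy hint about the conditions under which that part of the model activates, which is the kind of evidence the example below turns on.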

A (weird) example: suppose that human evolution was part of an alien project to train a mathematician, and some humans are extracted from our simulation and trained to solve alien math problems. If aliens are only able to test the humans’ behavior they might conclude that they always faithfully solve whatever problem they are posed — especially after a few generations of selection removed all the humans who didn’t do so. But if the aliens were able to look inside the humans’ heads they might discover that this behavior is extremely contingent — perhaps humans are only solving problems because they believe they will die otherwise, and the humans would behave very differently as soon as they saw an opportunity to escape.

That example is exotic and somewhat implausible, but I think that the same basic dynamic is likely to apply in much messier practical situations. If a model would fail malignantly under some conditions, I think there is some sense in which the model must be testing those conditions during training.

Malign vs. benign failures: I think the prospects for using transparency to detect malign failures in models trained by gradient descent are much better than the prospects for detecting benign failures, or for detecting failures in arbitrary models.

The key point is that a malign failure requires leveraging the intelligence of the model to do something actively bad. If our model is trained by gradient descent, its behavior can only be intelligent when it is exercised on the training distribution — if part of the model never (or very rarely) does anything on the training distribution, then that part of the model can’t be intelligent. So in some sense a malign failure mode needs to use a code path that gets run on the training distribution, just under different conditions that cause it to behave badly.

That means that we don’t necessarily need to analyze our model in the abstract, we can look at its behavior on actual training data. This looks much more likely to be tractable. (Though this entire discussion is extremely subtle, and without being more precise it’s very easy for us to go wrong.)

Amplification: Transparency will be hard if we ultimately want to train models much smarter than we are, but deeply understanding what a model is thinking probably requires being much smarter than it. I think that problem can potentially be overcome by using amplification to stay ahead of the models we are training. This means that current work on transparency, which considers models doing tasks that are within the abilities of the humans who are trying to understand them, could potentially remain relevant even as AI improves significantly.

What do you do with transparency? Merely understanding that a model might behave catastrophically could be useful, but it would be much nicer to actually fix the problem. Adversarial training gives a natural mechanism: once we understand a failure we can synthesize appropriate data and then train on that data.

This approach puts significantly more stress on our transparency techniques. Even if we were initially able to use transparency to see how our model might fail, after we perform many generations of selection we might weed out exactly the comprehensible failures and leave the incomprehensible ones. You would only want to apply this technique if you had a great deal of faith in your methods; if you were feeling at all shaky about your ability to achieve worst-case guarantees, and transparency techniques let you see one potential catastrophic failure, it would be better to consider that a near-miss and seriously rework your project rather than plowing on.

Conclusion

Making ML systems work in the worst case is hard, even if we are only concerned with malign failures and have access to an overseer who can identify them. If we can’t solve this problem, I think it seriously calls into question the feasibility of aligned ML.

Fortunately there are at least a few plausible angles of attack on this problem. All of these approaches feel very difficult, but I don’t think we’ve run into convincing deal-breakers. I also think these approaches are complementary, which makes it feel even more plausible that they (or their descendants) will eventually be successful. I think that exploring these angles of attack, and identifying new approaches, should be a priority for researchers interested in alignment.

This was originally posted here on 1st February, 2018.

The next post in this sequence is "Reliability Amplification", and will come out on Tuesday.



Discuss

"Giftedness" and Genius, Crucial Differences

January 28, 2019 - 23:22
Published on January 28, 2019 8:22 PM UTC

This essay by Arthur Jensen is from an old book.

The genius has limits. A simple answer, and undoubtedly true. But my assignment here is to reflect on the much more complex difference between intellectual giftedness and genius, using the latter term in its original sense, as socially recognized, outstandingly creative achievement. In this think-piece (which is just that, rather than a comprehensive review of the literature), I will focus on factors, many intriguing in and of themselves, that are characteristic of genius. My primary thesis is that the emergence of genius is best described using a multiplicative model.


I will argue that exceptional achievement is a multiplicative function of a number of different traits, each of which is normally distributed, but which in combination are so synergistic as to skew the resulting distribution of achievement. An extremely extended upper tail is thus produced, and it is within this tail that genius can be found. An interesting two-part question then arises: how many different traits are involved in producing extraordinary achievement, and what are they? The musings that follow provide some conjectures that can be drawn on to answer this critical question.

As a subject for scientific study, the topic of genius, although immensely fascinating, is about as far from ideal as any phenomenon one can find. The literature on real genius can claim little besides biographical anecdotes and speculation, with this chapter contributing only more of the same. Whether the study of genius will ever evolve from a literary art form into a systematic science is itself highly speculative. The most promising efforts in this direction are those by Simonton (1988) and Eysenck (1995), with Eysenck's monograph leaving little of potential scientific value that can be added to the subject at present, pending new empirical evidence.


Intelligence


Earlier I stated that genius has limits. But its upper limit, at least in some fields, seems to be astronomically higher than its lower limit. Moreover, the upper limit of genius cannot be characterized by precocity, high intelligence, knowledge and problem-solving skills being learned with speed and ease, outstanding academic achievement, honors and awards, or even intellectual productivity. Although such attributes are commonly found at all levels of genius, they are not discriminating in the realm of genius.

My point is perhaps most clearly illustrated by the contrast between two famous mathematicians who became closely associated with one another as "teacher" and "student." The reason for the quotation marks here will soon be obvious, because the teacher later claimed that he learned more from the student than the student had learned from him. G. H. Hardy was England's leading mathematician, a professor at Cambridge University, a Fellow of the Royal Society, and the recipient of an honorary degree from Harvard. Remarkably precocious in early childhood, especially in mathematics, he became an exceptionally brilliant student, winning one scholarship after another, and was acknowledged as the star graduate in mathematics at Cambridge, where he remained to become a professor of mathematics. He also became a world-class mathematician. His longtime friend C. P. Snow relates that Hardy, at the peak of his career, ranked himself fifth among the most important mathematicians of his day, and it should be pointed out that Hardy's colleagues regarded him as an overly modest man (Snow, 1967). If the Study of Mathematically Precocious Youth (SMPY) had been in existence when Hardy was a schoolboy, he would have been a most prized and promising student in the program.
One day Hardy received a strange-looking letter from Madras, India. It was full of mathematical formulations written in a quite unconventional - one might even say bizarre - form. The writer seemed almost mathematically illiterate by Cambridge standards. It was signed "Srinivasa Ramanujan." At first glance, Hardy thought it might even be some kind of fraud. Puzzling over this letter with its abstruse formulations, he surmised it was written either by some trickster or by someone sincere but poorly educated in mathematics. Hardy sought the opinion of his most highly esteemed colleague, J. E. Littlewood, the other famous mathematician at Cambridge. After the two of them had spent several hours studying the strange letter, they finally realized, with excitement and absolute certainty, that they had "discovered" a major mathematical genius. The weird-looking formulas, it turned out, revealed profound mathematical insights of a kind that are never created by ordinarily gifted mathematicians.

Hardy regarded this "discovery" as the single most important event in his life. Here was the prospect of fulfilling what, until then, had been for him only an improbable dream: of ever knowing in person a mathematician possibly of Gauss's caliber.

A colleague in Hardy's department then traveled to India and persuaded Ramanujan to go to Cambridge, with all his expenses and a salary paid by the university. When the youth arrived from India, it was evident that, by ordinary standards, his educational background was meager and his almost entirely self-taught knowledge of math was full of gaps. He had not been at all successful in school, from which he had flunked out twice, and was never graduated. To say, however, that he was obsessed by mathematics is an understatement. As a boy in Madras, he was too poor to buy paper on which to work out his math problems. He did his prodigious mathematical work on a slate, copying his final results with red ink on old, discarded newspapers.

While in high school, he thought he had made a stunning mathematical discovery, but he later learned, to his great dismay, that his discovery had already been made 150 years earlier by the great mathematician Euler. Ramanujan felt extraordinary shame for having "discovered" something that was not original, never considering that only a real genius could have created or even re-created that discovery.
At Cambridge, Ramanujan was not required to take courses or exams. That would have been almost an insult and a sure waste of time. He learned some essential things from Hardy, but what excited Hardy the most had nothing to do with Ramanujan's great facility in learning the most advanced concepts and technical skills of mathematical analysis. Hardy himself had that kind of facility. What so impressed him was Ramanujan's uncanny mathematical intuition and capacity for inventing incredibly original and profound theorems. That, of course, is what real mathematical genius is all about. Facility in solving textbook problems and in passing difficult tests is utterly trivial when discussing genius. Although working out the proof of a theorem, unlike discovering a theorem, may take immense technical skill and assiduous effort, it is not itself a hallmark of genius. Indeed, Ramanujan seldom bothered to prove his own theorems; proof was a technical feat that could be left to lesser geniuses. Moreover, in some cases, because of his spotty mathematical education, he probably would have been unable to produce a formal proof even if he had wanted to. But a great many important theorems were generated in his obsessively active brain. Often he seemed to be in another world. One might say that the difference between Ramanujan creating a theorem and a professional mathematician solving a complex problem with standard techniques of analysis is like the difference between St. Francis in ecstasy and a sleepy vicar reciting the morning order of prayer.

After his experience with Ramanujan, Hardy told Snow that if the word genius meant anything, he (Hardy) was not really a genius at all (Snow, 1967, p. 27). Hardy had his own hundred-point rating scale for his estimates of the "natural ability" of eminent mathematicians. Though regarding himself at the time as one of the world's five best pure mathematicians, he gave himself a rating of only 25. The greatest mathematician of that period, David Hilbert, was rated 80. But Hardy rated Ramanujan 100, the same rating as he gave Carl Friedrich Gauss, who is generally considered the greatest mathematical genius the world has known. On the importance of their total contributions to mathematics, however, Hardy rated himself 35, Ramanujan 85, and Gauss 100. By this reckoning Hardy was seemingly an overachiever and Ramanujan an underachiever. Yet one must keep in mind that Ramanujan died at age thirty, Hardy at seventy, and Gauss at seventy-eight.
Of course, all geniuses are by definition extreme overachievers, in the statistical sense. Nothing else that we could have known about them besides the monumental contributions we ascribe to their genius would have predicted such extraordinary achievement. In discussing Ramanujan's work, the Polish mathematician Mark Kac was forced to make a distinction between the "ordinary genius" and the "magician." He wrote:

An ordinary genius is a fellow that you and I would be just as good as, if we were only many times better. There is no mystery as to how his mind works. Once we understand what he has done, we feel certain that we, too, could have done it. It is different with the magicians. They are, to use mathematical jargon, in the orthogonal complement of where we are and the working of their minds is for all intents and purposes incomprehensible. Even after we understand what they have done, the process by which they have done it is completely dark. (Quoted in Kanigel, 1991, p. 281; Kanigel's splendid biography of Ramanujan is highly recommended)

To come back to earth and the point of my meandering: genius requires giftedness (consisting essentially of g, often along with some special aptitude or talent, such as mathematical, spatial, musical, or artistic talent). But obviously there are other antecedents (to the magic of Ramanujan's "thinking processes") that are elusive to us. Nonetheless, we do know of at least two key attributes, beyond ability, that appear to function as catalysts for the creation of that special class of behavioral products specifically indicative of genius. They are productivity and creativity.


Creativity


Although we can recognize creative acts and even quantify them after a fashion (MacKinnon, 1962), our understanding of them in any explanatory sense is practically nil. Yet one prominent hypothesis concerning creativity (by which I mean the bringing into being of something that has not previously existed) seems to me not only unpromising, but extremely implausible and probably wrong. It is also inherently unfalsifiable and hence fails Popper's criterion for a useful scientific theory. I doubt that it will survive a truly critical examination. Because ruling out one explanation does further our understanding of creativity, I will focus on this theory.
I am referring here to what has been termed the ch an ce configuration
theory of creativity (well explicated by Simonton, 1988, ch. 1). Essentially, it
amounts to expecting that a computer that perpetually generates strictly ran­
dom sequences of all the letters of the alphabet, punctuation signs, and spaces
will eventually produce Hamlet or some other work of creative genius. The
theory insists that blind chance acting in the processes of m em ory searches for
elements with which to form random combinations and permutations, from
which finally there emerges some product or solution that the world considers
original or creative. It is also essential that, although this generating process
is operating entirely by blind chance, the random permutations produced
thereby are subjected to a critical rejection/selection screening, with selective
retention of the more promising products. This theory seems implausible,
partly because of the sheer numerical explosion of the possible combinations
and permutations when there are more than just a few elements. For example,
the letters in the word permutation have 11! = 39,916,800 possible permutations. To discover the “right” one by randomly permuting the letters at a
continuous rate of one permutation per second could take anywhere from one
second (if one were extremely lucky) up to one year, three thirty-day months,
and seven days (if one were equally unlucky). Even then, these calculations
assume that the random generating mechanism never repeated a particular
permutation; otherwise it would take much longer.
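
A quick check of that arithmetic, as a minimal Python sketch (mine, not the chapter's):

```python
from math import factorial

# 11! orderings, treating the eleven letters as distinct tokens
# (strictly, "permutation" has two t's, which the 11! figure ignores).
n_perms = factorial(11)
print(n_perms)  # 39916800

# At one permutation per second, never repeating one, the worst case is:
print(n_perms / (60 * 60 * 24))  # 462.0 days = 365 + 3*30 + 7
```
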
The combinatorial and permutation explosion resulting from an increase in the number of elements to be mentally manipulated and the exponentially increased processing time are not, however, the worst problems for this theory. The far greater problem is that, just as “nature abhors a vacuum,” the human mind abhors randomness. I recall a lecture by the statistician Helen M.
Walker in which she described a variety of experiments showing that intelligent

people, no matter how sophisticated they are about statistics or how well they
understand the meaning of randomness, and while putting forth their best
conscious efforts, are simply incapable of selecting, combining, or permuting
numbers, letters, words, or anything else in a truly random fashion. For exam-
ple, when subjects are asked to generate a series of random numbers, or repeat­
edly to make a random selection of N items from among a much larger number
of different objects spread out on a table, or take a random walk, it turns out no
one can do it. This has been verified by statistical tests of randomness applied to
their performance. People even have difficulty simply reading aloud from a
table of random numbers without involuntarily and nonrandomly inserting
other numbers. (Examples of this phenomenon are given in Kendall, 1948.)
Thus, randomness (or blind chance, to use the favored term in chance
configuration theory) seems an unlikely explanation of creative thinking. This
theory seems to have originated from what may be deemed an inappropriate
analogy, namely the theory of biological evolution creating new living forms.
According to the latter theory, a great variety of genetic effects is produced by
random mutations and the screening out of all variations except those best
adapted to the environment—that is, natural selection. But a genetic mutation,
produced perhaps by a radioactive particle hitting a single molecule in the DNA
at random and altering its genetic code, is an unfitting analogy for the neces­
sarily integrated action of the myriad neurons involved in the mental manipulation of ideas.

The Creative Process
The implausibility of randomness, however, in no way implies that creative
thinking does not involve a great deal of “trial-and-error” mental manipulation, though it is not at all random. The products that emerge are then critically
sifted in light of the creator’s aim. The individuals in whom this mental manipulation process turns out to be truly creative most often are those who
are relatively rich in each of three sources of variance in creativity: (1) ideational
fluency, or the capacity to tap a flow of relevant ideas, themes, or images, and to
play with them, also known as “brainstorming”; (2) what Eysenck (1995) has
termed the individual’s relevance horizon; that is, the range or variety of elements, ideas, and associations that seem relevant to the problem (creativity
involves a wide relevance horizon); and (3) suspension of critical judgment.
Creative persons are intellectually high risk takers. They are not afraid of
zany ideas and can hold the inhibitions of self-criticism temporarily in abey­
ance. Both Darwin and Freud mentioned their gullibility and receptiveness to
highly speculative ideas and believed that these traits were probably characteristic of creative thinkers in general. Darwin occasionally performed what
he called “fool’s experiments,” trying out improbable ideas that most people
would have instantly dismissed as foolish. Francis Crick once told me that Linus
Pauling’s scientific ideas turned out to be wrong about 80 percent of the time,
but the other 20 percent finally proved to be so important that it would be a
mistake to ignore any of his hunches.
I once asked another Nobel Prize winner, William Shockley, whose cre­
ativity resulted in about a hundred patented inventions in electronics, what he
considered the main factors involved in his success. He said there were two: (1)
he had an ability to generate, with respect to any given problem, a good many
hypotheses, with little initial constraint by previous knowledge as to their
plausibility or feasibility; and (2) he worked much harder than most people
would at trying to figure out how a zany idea might be shaped into something
technically feasible. Some of the ideas that eventually proved most fruitful, he
said, were even a physical impossibility in their initial conception. For that
very reason, most knowledgeable people would have dismissed such unrealistic
ideas immediately, before searching their imaginations for transformations
that might make them feasible.
Some creative geniuses, at least in the arts, seem to work in the opposite
direction from that described by Shockley. That is, they begin by producing
something fairly conventional, or even trite, and then set about to impose novel
distortions, reshaping it in ways deemed creative. I recall a demonstration of
this by Leonard Bernstein, in which he compared the early drafts of Beethoven’s
Fifth Symphony with the final version we know today. The first draft was a
remarkably routine-sounding piece, scarcely suggesting the familiar qualities of
Beethoven’s genius. It was more on a par with the works composed by his
mediocre contemporaries, now long forgotten. But then two processes took
hold: (1) a lot of “doctoring,” which introduced what for that time were sur­
prising twists and turns in the harmonies and rhythms, along with an ascetic
purification, and (2) a drastic pruning and simplification of the orchestral score
to rid it completely of all the “unessential” notes in the harmonic texture, all the
“elegant variations” of rhythm, and any suggestion of the kind of filigree orna-
mentation that was so common in the works of his contemporaries. This
resulted in a starkly powerful, taut, and uniquely inevitable-sounding masterpiece, which, people now say, only Beethoven could have written. But when
Beethoven’s symphonies were first performed, they sounded so shockingly
deviant from the prevailing aesthetic standards that leading critics declared him
ripe for a madhouse.
One can see a similar process of artistic distortion in a fascinating motion
picture using time-lapse photography of Picasso at work (The Picasso Mystery).
He usually began by sketching something quite ordinary—for example, a com-
pletely realistic horse. Then he would begin distorting the figure this way and
that, repeatedly painting over what he had just painted and imposing further,
often fantastic, distortions. In one instance, this process resulted in such an
utterly hopeless mess that Picasso finally tossed the canvas aside, with a remark
to the effect of “Now I see how it should go.” Then, taking a clean canvas, he
worked quickly, with bold, deft strokes of his paintbrush, and there suddenly
took shape the strangely distorted figure Picasso apparently had been striving
for. Thus he achieved the startling aesthetic impact typical of Picasso’s art.
It is exactly this kind of artistic distortion of perception that is never seen
in the productions of the most extremely gifted idiot savants, whose drawings
often are incredibly photographic, yet are never considered works of artistic
genius. The greatest artists probably have a comparable gift for realistic draw­
ing, but their genius leads them well beyond such photographic perception.
Other examples of distortion are found in the recorded performances of
the greatest conductors and instrumentalists, the re-creative geniuses, such as
Toscanini and Furtwängler, Paderewski and Kreisler. Such artists are not primarily distinguished from routine practitioners by their technical skill or virtuosity (though these are indeed impressive), but by the subtle distortions,
within fairly narrow limits, of rhythm, pitch, phrasing, and the like, that they
impose, consciously or unconsciously, on the works they perform. Differences
between the greatest performers are easily recognizable by these “signatures.”
But others’ attempts to imitate these idiosyncratic distortions are never subtle
enough or consistent enough to escape detection as inauthentic; in fact, they
usually amount to caricatures.

Psychosis

What is the wellspring of the basic elements of creativity listed above—idea­
tional fluency, a wide relevance horizon, the suspension of inhibiting self­
criticism, and the novel distortion of ordinary perception and thought? All of
these features, when taken to an extreme degree, are characteristic of psychosis.
The mental and emotional disorganization of clinical psychosis is, however,
generally too disabling to permit genuinely creative or productive work, espe-
cially in the uncompensated individual. Eysenck, however, has identified a trait,
or dimension of personality, termed psychoticism, which can be assessed by
means of the Eysenck Personality Questionnaire (Eysenck & Eysenck, 1991).
Trait psychoticism, it must be emphasized, does not imply the psychiatric
diagnosis of psychosis, but only the predisposition or potential for the develop-
ment of psychosis (Eysenck & Eysenck, 1976). In many creative geniuses, this
potential for actual psychosis is usually buffered and held in check by certain
other traits, such as a high degree of ego strength. Trait psychoticism is a
constellation of characteristics that persons may show to varying degrees; such
persons may be aggressive, cold, egocentric, impersonal, impulsive, antisocial,
unempathetic, tough-minded, and creative. This is not a charming picture of
genius, perhaps, but a reading of the biographies of some of the world’s most
famous geniuses attests to its veracity.
By and large, geniuses are quite an odd lot by ordinary standards. Their
spouses, children, and close friends are usually not generous in their personal
recollections, aside from marveling at the accomplishments for which the per­
son is acclaimed a genius. Often the personal eccentricities remain long hidden
from the public. Beethoven’s first biographer, for example, is known to have
destroyed some of Beethoven’s letters and conversation books, presumably
because they revealed a pettiness and meanness of character that seemed utterly
inconsistent with the sublime nobility of Beethoven’s music. Richard Wagner's
horrendous character is legendary. He displayed virtually all of the aforemen-
tioned features of trait psychoticism to a high degree and, to make matters
worse, was also neurotic.
Trait psychoticism is hypothesized as a key condition in Eysenck’s (1995)
theory of creativity. Various theorists have also mentioned other characteristics,
but some of these, such as self-confidence, independence, originality, and non-
conformity, to name a few, might well stem from trait psychoticism. (See
Jackson & Rushton, 1987, for reviews of the personality origins of productivity
and creativity.)


Productivity


A startling corollary of the multiplicative model of exceptional achievement is
best stated in the form of a general law. This is Price’s Law, which says that if K
persons have made a total of N countable contributions in a particular field,
then N/2 of the contributions will be attributable to √K persons (Price, 1963). Hence,
as the total number of workers (K) in a discipline increases, the ratio √K/K
shrinks, increasing the elitism of the major contributors. This law, like any
other, only holds true within certain limits. But within fairly homogeneous
disciplines, Price’s Law seems to hold up quite well for indices of productivity—
for example, in math, the empirical sciences, musical composition, and the
frequency of performance of musical works. Moreover, there is a high rank-order relationship between sheer productivity and various indices of the im-
portance of a contributor’s work, such as the frequency and half-life of scien­
tific citations, and the frequency of performance and staying power of musical
compositions in the concert repertoire. (Consider such contrasting famous
contemporaries as Mozart and Salieri; Beethoven and Hummel; and Wagner
and Meyerbeer.)
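
A minimal sketch of the elitism Price’s Law implies, with hypothetical field sizes:

```python
from math import sqrt

# Half of a field's output comes from roughly sqrt(K) of its K workers,
# so the productive elite shrinks, proportionally, as the field grows.
for K in (100, 10_000, 1_000_000):
    elite = sqrt(K)
    print(f"K = {K:>9,}: ~{elite:,.0f} workers "
          f"({100 * elite / K:.2f}%) account for half the contributions")
```
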
If productivity and importance could be suitably scaled, however, I would
imagine that the correlation between them would show a scatter-diagram of the
“twisted pear” variety (Fisher, 1959). That is, high productivity and triviality
are more frequently associated than low productivity and high importance. As
a rule, the greatest creative geniuses in every field are astoundingly prolific,
although, without exception, they have also produced their share of trivia.
(Consider Beethoven’s King Stephen Overture and Wagner's “United States
Centennial March,” to say nothing of his ten published volumes of largely triv-
ial prose writings—all incredible contrasts to these composers’ greatest works.)
But such seemingly unnecessary trivia from such geniuses is probably the
inevitable effluvia of the mental energy without which their greatest works
would not have come into being. On the other hand, high productivity is
probably much more common than great importance, and high productivity
per se is no guarantee of the importance of what is produced. The “twisted
pear” relationship suggests that high productivity is a necessary but not suffi­
cient condition for making contributions of importance in any field. The im ­
portance factor, however, depends on creativity—certainly an elusive attribute.
What might be the basis of individual differences in productivity? The
word motivation immediately comes to mind, but it explains little and also
seems too intentional and self-willed to fill the bill. When one reads about
famous creative geniuses one finds that, although they may occasionally have to
force themselves to work, they cannot will themselves to be obsessed by the
subject of their work. Their obsessive-compulsive mental activity in a particular
sphere is virtually beyond conscious control. I can recall three amusing exam ­
ples of this, and they all involve dinner parties. Isaac Newton went down to the
cellar to fetch some wine for his guests and, while filling a flagon, wrote a
mathematical equation with his finger on the dust of the wine keg. After quite a
long time had passed, his guests began to worry that he might have had an
accident, and they went down to the cellar. There was Newton, engrossed in his
mathematical formulas, having completely forgotten that he was hosting a
dinner party.
My second example involves Richard Wagner. Wagner, while his guests as­
sembled for dinner, suddenly took leave of them and dashed upstairs. Alarmed

that something was wrong, his wife rushed to his room. Wagner exclaimed,
“I'm doing it!”—their agreed signal that she was not to disturb him under any
circumstances because some new musical idea was flooding his brain and
would have to work itself out before he could be sociable again. He had a
phenomenal memory for musical ideas that spontaneously surfaced, and could
postpone writing them down until it was convenient, a tedious task he referred
to not as composing but as merely “copying” the music in his mind’s ear.
Then there is the story of Arturo Toscanini hosting a dinner party at which
he was inexplicably morose and taciturn, just as he had been all that day and the
day before. Suddenly he got up from the dinner table and hurried to his study;
he returned after several minutes beaming joyfully and holding up the score of
Brahms’s First Symphony (which he was rehearsing that week for the NBC
Symphony broadcast the following Sunday). Pointing to a passage in the first
movement that had never pleased him in past performances, he exclaimed that
it had suddenly dawned on him precisely what Brahms had intended at this
troublesome point. In this passage, which never sounds “clean” when played
exactly as written, Toscanini slightly altered the score to clarify the orchestral
texture. He always insisted that his alterations were only the composer's true
intention. But few would complain about his “delusions”; as Puccini once
remarked, “Toscanini doesn’t play my music as I wrote it, but as I dreamed it.”

Mental Energy
Productivity implies actual production or objective achievement. For the psychological basis of intellectual productivity in the broadest sense, we need a
construct that could be labeled mental energy. This term should not be con-
fused with Spearman's g (for general intelligence). Spearman's theory of psy­
chometric g as “mental energy” is a failed hypothesis and has been supplanted
by better explanations of g based on the concept of neural efficiency (Jensen,
1993). The energy construct I have in mind refers to something quite different
from cognitive ability. It is more akin to cortical arousal or activation, as if by a
stimulant drug, but in this case an endogenous stimulant. Precisely what it
consists of is unknown, but it might well involve brain and body chemistry.
One clue was suggested by Havelock Ellis (1904) in A Study of British
Genius. Ellis noted a much higher than average rate of gout in the eminent
subjects of his study; gout is associated with high levels of uric acid in the blood.
So later investigators began looking for behavioral correlates of serum urate
level (SUL), and there are now dozens of studies on this topic (reviewed in
Jensen & Sinha, 1993). They show that SUL is only slightly correlated with IQ,
but is more highly correlated with achievement and productivity. For instance,
among high school students there is a relation between scholastic achievement
and SUL, even controlling for IQ (Kasl, Brooks, & Rodgers, 1970). The “over­
achievers” had higher SUL ratings, on average. Another study found a correla­
tion of +.37 between SUL ratings and the publication rates of university pro-
fessors (Mueller & French, 1974).
Why should there be such a relationship? The most plausible explanation
seems to be that the molecular structure of uric acid is nearly the same as that of
caffeine, and therefore it acts as a brain stimulant. Its more or less constant
presence in the brain, although affecting measured ability only slightly, consid­
erably heightens cortical arousal and increases mental activity. There are proba­
bly a number of other endogenous stimulants and reinforcers of productive
behavior (such as the endorphins) whose synergistic effects are the basis of
what is here called mental energy. I suggest that this energy, combined with very
high g or an exceptional talent, results in high intellectual or artistic productiv-
ity. Include trait psychoticism with its creative component in this synergistic
mixture and you have the essential makings of genius.
To summarize:
Genius = High Ability × High Productivity × High Creativity.

The theoretical underpinnings of these three ingredients are:
—Ability = g = efficiency of information processing
—Productivity = endogenous cortical stimulation
—Creativity = trait psychoticism


Other Personality Correlates


There are undoubtedly other personality correlates of genius, although some of
them may only reflect the more fundamental variables in the formula given
above. The biographies of many geniuses indicate that, from an early age, they
are characterized by great sensitivity to their experiences (especially those of a
cognitive nature), the development of unusually strong and long-term interests
(often manifested as unusual or idiosyncratic hobbies or projects), curiosity
and exploratory behavior, a strong desire to excel in their own pursuits, theo-
retical and aesthetic values, and a high degree of self-discipline in acquiring
necessary skills (MacKinnon, 1962).
The development of expert-level knowledge and skill is essential for any
important achievement (Rabinowitz & Glaser, 1985). A high level of expertise
involves the automatization of a host of special skills and cognitive routines.
Automatization comes about only as a result of an immense amount of prac-
tice (Jensen, 1990; Walberg, 1988). Most people can scarcely imagine (and
are probably incapable of) the extraordinary amount of practice that is re­
quired for genius-quality performance, even for such a prodigious genius as
Mozart.
In their self-assigned tasks, geniuses are not only persistent but also re­
markably able learners. Ramanujan, for example, disliked school and played
truant to work on math problems beyond the level of anything he was offered at
school. Wagner frequently played truant so he could devote his whole day to
studying the orchestral scores of Beethoven. Francis Galton, with an estimated
childhood IQ of around 200 and an acknowledged genius in adulthood, abso­
lutely hated the frustrations of school and pleaded with his parents to let him
quit. Similar examples are legion in the accounts of geniuses.
In reading about geniuses, I consistently find one other important factor
that must be added to the composite I have described so far. It is a factor related
to the direction of personal ambition and the persistence of effort. This factor
channels and focuses the individual’s mental energy; it might be described best
as personal ideals or values. These may be artistic, aesthetic, scientific, theoret­
ical, philosophical, religious, political, social, economic, or moral values, or
something idiosyncratic. In persons of genius, especially, this “value factor”
seems absolutely to dominate their self-concept, and it is not mundane. People
are often puzzled by what they perceive as the genius’s self-sacrifice and often
egocentric indifference to the needs of others. But the genius’s value system, at
the core of his or her self-concept, is hardly ever sacrificed for the kind of
mundane pleasures and unimaginative goals commonly valued by ordinary
persons. Acting on their own values—perhaps one should say acting out their
self-images—is a notable feature of famous geniuses.


Characteristics of Genius: Some Conclusions


Although this chapter is not meant to provide an exhaustive review of the
literature on geniuses and highly creative individuals, it has raised some consis­
tent themes that might be worthy of scientific study. I propose that genius is
a multiplicative effect of high ability, productivity, and creativity. Moreover,
many of the personality traits associated with genius can be captured by the
label “psychoticism.” Although geniuses may have a predisposition toward such
a disorder, they are buffered by a high degree of ego strength and intelligence. A
number of the remaining personality correlates of genius may best be captured
by the idea that genius represents an acting-out of its very essence.

Giftedness and Genius: Important Differences
Although giftedness (exceptional mental ability or outstanding talent) is a
threshold trait for the emergence of genius, giftedness and genius do seem to be
crucially different phenomena, not simply different points on a continuum. It
has even been suggested that giftedness is in the orthogonal plane to genius.
Thomas Mann (1947), in his penetrating and insightful study of Richard Wagner’s genius, for instance, makes the startling point that Wagner was not a
musical prodigy and did not even seem particularly talented, in music or in
anything else for that matter, compared to many lesser composers and poets.
He was never skilled at playing any musical instrument, and his seriously
focused interest in music began much later than it does for most musicians. Yet
Mann is awed by Wagner’s achievements as one of the world’s stupendous
creative geniuses, whose extraordinarily innovative masterpieces and their ines­
capable influence on later composers place him among the surpassing elite in
the history of music, in the class with Bach, Mozart, and Beethoven.
It is interesting to note the words used by Mann in explaining what he calls
Wagner’s “vast genius”; they are not “giftedness” or “talent,” but “intelligence”
and “will.” It is the second word here that strikes me as most telling. After all, a
high level of intelligence is what we mean by “gifted,” and Wagner was indeed
most probably gifted in that sense. His childhood IQ was around 140, as
estimated by Catherine Cox (1926) in her classic, although somewhat flawed,
study of three hundred historic geniuses. Yet that level of IQ is fairly commonplace on university campuses.
We do not have to discuss such an awesome level of genius as Wagner's,
however, to recognize that garden-variety outstanding achievement, to which
giftedness is generally an accompaniment, is not so highly correlated with the
psychometric and scholastic indices of giftedness as many people, even psychol­
ogists, might expect. At another symposium related to this topic, conducted
more than twenty years ago, one of the speakers, who apparently had never
heard of statistical regression, expressed alarm at the observation that far
too many students who scored above the 99th percentile on IQ tests did not
turn out, as adults, among those at the top of the distribution of recognized
intellectual achievements. He was dismayed at many of the rather ordinary
occupations and respectable but hardly impressive accomplishments displayed
in midlife by the majority of the highly gifted students in his survey. A significant number of students who had tested considerably lower, only in the top
quartile, did about as well in life as many of the gifted. The speaker said the
educational system was to blame for not properly cultivating gifted students. If
they were so bright, should they not have been high achievers? After all, their
IQs were well within the range of the estimated childhood IQs of the three
hundred historically eminent geniuses in Cox’s (1926) study. Although educa-
tion is discussed in more detail below, the point here is that giftedness does not
assure exceptional achievement; it is only a necessary condition.
To reinforce this point, I offer an additional example that occurred on the
very day I sat down to write this chapter. On that day I received a letter from
someone I had never met, though I knew he was an eminent professor of
biophysics. He had read something I wrote concerning IQ as a predictor of
achievement, but he was totally unaware of the present work. The coincidence
is that my correspondent posed the very question that is central to my theme.
He wrote:


I have felt for a long time that IQ, however defined, is only loosely related to mental achievement. Over the years I have bumped into a fair number of MENSA
people. As a group, they seem to be dilettantes seeking titillation but seem unable
to think critically or deeply. They have a lot of motivation for intellectual play but
little for doing anything worthwhile. One gets the feeling that brains were wasted
on them. So, what is it that makes an intelligently productive person?



This is not an uncommon observation, and I have even heard it expressed by
members of MENSA. It is one of their self-perceived problems, one for which
some have offered theories or rationalizations. The most typical is that they are
so gifted that too many subjects attract their intellectual interest and they can
never commit themselves to any particular interest. It could also be that individuals drawn toward membership in MENSA are a selective subset of the
gifted population, individuals lacking in focus. After all, most highly gifted
individuals do not join MENSA.
We must, then, consider some of the ways in which achievement contrasts
with ability if we are to make any headway in understanding the distinction
between giftedness (i.e., mainly high g or special abilities) and genius. Genius
involves actual achievement and creativity. Each of these characteristics is a
quantitative variable. The concept of genius generally applies only when both of
these variables characterize accomplishments at some extraordinary socially
recognized level. Individual differences in countable units of achievement, un­
like measures of ability, are not normally distributed, but have a very positively
skewed distribution, resembling the so-called J-curve. For example, the num ­
ber of publications of members of the American Psychological Association, of
research scientists, and of academicians in general, the number of patents
of inventors, the number of compositions of composers, or the frequency of
composers’ works in the concert repertoire all show the same J-curve. Moreover, in every case, the J-curve can be normalized by a logarithmic transforma-
tion. This striking phenomenon is consistent with a multiplicative model of
achievement, as developed and discussed above. That is, exceptional achieve­
ment is a multiplicative function of a number of different traits, each of which
may be normally distributed, but which in combination are so synergistic as to
skew the resulting distribution of achievement. Thereby, an extremely extended
upper tail of exceptional achievement is produced. Most geniuses are found far
out in this tail.
The multiplication of several normally distributed variables yields, there­
fore, a highly skewed distribution. In such a distribution, the mean is close to
the bottom and the mode generally is the bottom. For any variable measured on
a ratio scale, therefore, the distance between the median and the 99th percentile
is much smaller for a normally distributed variable, such as ability, than for a
markedly skewed variable, such as productivity. Indeed, this accords well with
subjective impressions: the range of individual differences in ability (g or fluid
intelligence) above the median level does not seem nearly so astounding as the
above-median range of productivity or achievement.
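
The skewing effect described above is easy to simulate. The following sketch (mine, not the author’s; the trait means and standard deviations are arbitrary) multiplies four normally distributed traits and checks the skew before and after a log transformation:

```python
import numpy as np

rng = np.random.default_rng(0)
# Four normally distributed "traits" (mean 1.0, sd 0.2), clipped to stay
# positive, multiplied together for 100,000 simulated individuals.
traits = np.clip(rng.normal(1.0, 0.2, size=(100_000, 4)), 0.01, None)
achievement = traits.prod(axis=1)

def skew(x):
    """Standardized third moment."""
    return float(((x - x.mean()) ** 3).mean() / x.std() ** 3)

print(skew(achievement))          # clearly positive: a long upper tail
print(skew(np.log(achievement)))  # near zero: the log transform normalizes
```
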
In conclusion, giftedness, a normally distributed variable, is a prerequisite
for the development of genius. When it interacts with a number of other critical
characteristics, which also are normally distributed, exceptional achievement is
produced. Exceptional achievement, however, is a variable that is no longer
normal; it is highly skewed, with genius found at the tip of the tail.


Educational Implications


At this point in my highly speculative groping to understand the nature of
genius as differentiated from giftedness, I should like to make some practical
recommendations. First, I would not consider trying to select gifted youngsters
explicitly with the aim of discovering and cultivating future geniuses. Julian
Stanley’s decision (Stanley, 1977) to select explicitly for mathematical gifted­
ness—to choose youths who, in Stanley’s words, “reason exceptionally well
mathematically”—was an admirably sound and wise decision from a practical
and socially productive standpoint. The latent traits involved in exceptional
mathematical reasoning ability are mainly high g plus high math talent (inde­
pendent of g). These traits are no guarantee of high productivity, much less of
genius. But the threshold nature of g and math talent is so crucial to excelling in
math and the quantitative sciences that we can be fairly certain that most of the
productive mathematicians and scientists, as well as the inevitably few geniuses,
will come from that segment of the population of which the SMPY students are
a sample. Indeed, in Donald MacKinnon’s (1962) well-known study of large
numbers of creative writers, mathematicians, and architects (certainly none of
them a Shakespeare, Gauss, or Michelangelo), the very bottom of the range of
intelligence-test scores in the whole sample was at about the 75th percentile
of the general population, and the mean was at the 98th percentile (MacKinnon
& Hall, 1972).
However, it might eventually be profitable for researchers to consider
searching beyond high ability per se and identify personality indices that also
will aid in the prediction of exceptional achievement. The proportion of those
gifted youths selected for special opportunities who are most apt to be productive professionals in their later careers would thereby be increased. Assuming
that high achievement and productivity can be predicted at all, over and above
what our usual tests of ability can predict, it would take extensive research
indeed to discover sufficiently valid predictors to justify their use in this way.
Lubinski and Benbow (1992) have presented evidence that a “theoretical orien­
tation,” as measured by the Allport, Vernon, and Lindzey Study of Values,
might be just such a variable for scientific disciplines.


Conclusion


Certainly, the education and cultivation of intellectually gifted youths has never
been more important than it is today, and its importance will continue to grow
as we move into the next century. The preservation and advancement of civi­
lized society will require that an increasing proportion of the population have a
high level of educated intelligence in science, engineering, and technology.
Superior intellectual talent will be at a premium. Probably there will always be
only relatively few geniuses, even among all persons identified as gifted. Yet this
is not cause for concern. For any society to benefit from the fruits of genius
requires the efforts of a great many gifted persons who have acquired high levels
of knowledge and skill. For example, it takes about three hundred exceptionally
talented and highly accomplished musicians, singers, set designers, artists,
lighting directors, and stage directors, besides many stagehands, to put on a
production of The Ring of the Nibelung, an artistic creation of surpassing
genius. Were it not for the concerted efforts of these performers, the score of
Wagner's colossal work would lie idle. The same is true, but on a much larger
scale, in modern science and technology. The instigating creative ideas are
seldom actualized for the benefit of society without the backup and follow-through endeavors of a great many gifted and accomplished persons. Thus, a
nation’s most important resource is the level of educated intelligence in its
population; it determines the quality of life. It is imperative for society to
cultivate all the high ability that can possibly be found, wherever it can be
found.


References


Cohn, S. J., Carlson, J. S., & Jensen, A. R. (1985). Speed of information processing in academically
gifted youths. Personality and Individual Differences 6:621-629.
Cox, C. M. (1926). The early mental traits of three hundred geniuses. Stanford: Stanford University
Press.
Ellis, H. (1904). A study of British genius. London: Hurst & Blackett.
Eysenck, H. J. (1995). Genius: The natural history of creativity. Cambridge: Cambridge University
Press.
Eysenck, H. J., & Eysenck, S. B. G. (1976). Psychoticism as a dimension of personality. London:
Hodder & Stoughton.
Eysenck, H. J., & Eysenck, S. B. G. (1991). Manual of the Eysenck Personality Scales (EPS Adult).
London: Hodder & Stoughton.
Fisher, J. (1959). The twisted pear and the prediction of behavior. Journal of Consulting Psychology
23:400-405.
Jackson, D. N., & Rushton, J. P. (Eds.). (1987). Scientific excellence: Origins and assessment. Beverly
Hills: Sage Publications.
Jensen, A. R. (1990). Speed of information processing in a calculating prodigy. Intelligence 14:259-274.
Jensen, A. R. (1992a). The importance of intraindividual variability in reaction time. Personality
and Individual Differences 13:869-882.
Jensen, A. R. (1992b). Understanding g in terms of information processing. Educational Psychology
Review 4:271-308.
Jensen, A. R. (1993). Spearman’s g: From psychometrics to biology. In F. M. Crinella & J. Yu (Eds.),
Brain mechanisms and behavior. New York: New York Academy of Sciences.
Jensen, A. R., Cohn, S. J., & Cohn, C. M. G. (1989). Speed of information processing in academically gifted youths and their siblings. Personality and Individual Differences 10:29-34.
Jensen, A. R., & Sinha, S. N. (1993). Physical correlates of human intelligence. In P. A. Vernon (Ed.),
Biological approaches to the study of human intelligence. Norwood, N.J.: Ablex.
Kanigel, R. (1991). The man who knew infinity: A life of the genius Ramanujan. New York: Scribner’s.
Kasl, S. V., Brooks, G. W., & Rodgers, W. L. (1970). Serum uric acid and cholesterol in achievement
behaviour and motivation: 1. The relationship to ability, grades, test performance, and motivation. Journal of the American Medical Association 213:1158-1164.
Kendall, M. G. (1948). The advanced theory of statistics (Vol. 1). London: Charles Griffin.
Lubinski, D., & Benbow, C. P. (1992). Gender differences in abilities and preferences among the
gifted: Implications for the math-science pipeline. Current Directions in Psychological Science
1:61-66.
MacKinnon, D. W. (1962). The nature and nurture of creative talent. American Psychologist 17:
484-495.
MacKinnon, D. W., & Hall, W. B. (1972). Intelligence and creativity. In H. W. Peter, Colloquium 17:
The measurement of creativity. Proceedings, Seventeenth International Congress of Applied Psychology, Liège, Belgium, 25-30 July, 1971 (Vol. 2, pp. 1883-1888). Brussels: Editest.
Mann, T. (1947). Sufferings and greatness of Richard Wagner. In T. Mann, Essays of three decades
(H. T. Lowe-Porter, Trans., pp. 307-352). New York: Knopf.
Mueller, E. F., & French, J. R., Jr. (1974). Uric acid and achievement. Journal of Personality and
Social Psychology 30:336-340.
Price, D. J. (1963). Little science, big science. New York: Columbia University Press.
Rabinowitz, M., & Glaser, R. (1985). Cognitive structure and process in highly competent performance. In F. D. Horowitz & M. O’Brien (Eds.), The gifted and talented: Developmental perspectives (pp. 75-98). Washington, D.C.: American Psychological Association.
Simonton, D. K. (1988). Scientific genius: A psychology of science. New York: Cambridge University
Press.
Snow, C. P. (1967). Variety of men. London: Macmillan.
Stanley, J. C. (1977). Rationale of the Study of Mathematically Precocious Youth (SMPY) during its
first five years of promoting educational acceleration. In J. C. Stanley, W. C. George, & C. H.
Solano (Eds.), The gifted and the creative: A fifty-year perspective (pp. 75-112). Baltimore: Johns
Hopkins University Press.
Walberg, H. J. (1988). Creativity and talent as learning. In R. J. Sternberg (Ed.), The nature of
creativity: Contemporary psychological perspectives (pp. 340-361). Cambridge: Cambridge University Press.




A small example of one-step hypotheticals

January 28, 2019 - 19:12
Published on January 28, 2019 4:12 PM UTC


Just a small example of what one-step hypotheticals might mean in theory and in practice.

This involves a human H pricing some small object, a small violin:

In theory

The human H is (hypothetically) asked various questions that cause them to model how much they would pay for the small violin. These questions are asked at various times, and with various phrasings, and the results look like this:

Here the costings are all over the place, and one obvious way of reconciling them would be to take the mean (indicated by the large red square), which is around 5.5.

But it turns out there are extra patterns in the hypotheticals Ht and the answers f(Ht). For example, there is a clear difference between valuations that are done in the morning, around midday, or in the evening. And there is a difference if the violin is (accurately) described as "handmade".

There are now more options for finding a "true" valuation here. The obvious first step would be to over-weight the evening valuations, as there are fewer data points there (this would bring the average up a bit). Or one could figure out whether the "true" H was better represented by their morning, midday, or evening selves. Or whether their preference for "handmade" objects was strong and genuine, or a passing positive affect. H's various meta-preferences would all be highly relevant to these choices. (A toy sketch of this reconciliation step follows.)
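As a toy illustration, here is how the naive mean and one time-of-day-weighted alternative might differ (all numbers are hypothetical, standing in for the figures omitted above):

```python
import statistics

# (time_of_day, described_as_handmade, stated_price) -- all hypothetical.
valuations = [
    ("morning", False, 4.0), ("morning", True, 5.0),
    ("midday",  False, 5.0), ("midday",  True, 6.5),
    ("evening", True,  7.5),   # fewer evening data points
]

# Naive aggregate: the grand mean (the "large red square").
print(statistics.mean(p for _, _, p in valuations))  # 5.6

# Alternative: weight each time of day equally, so the sparse evening
# answers are not swamped -- this brings the average up a bit.
by_time = {}
for t, _, p in valuations:
    by_time.setdefault(t, []).append(p)
print(statistics.mean(statistics.mean(ps) for ps in by_time.values()))  # ~5.92
```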

In practice

Ok, that's what might happen if the agent had the power to ask unlimited hypothetical questions in arbitrarily many counterfactual scenarios. But that is not the case in the real world: the agent would be able to ask one, or maybe two, questions at most, before the human's attitude to the violin would change, and further data would become tainted.

Note that if the agent had a good brain model of H, it might be able to simulate all the relevant answers; but we'll assume for the moment that the agent doesn't have that capability.

So, in theory, there are huge amounts of data and many meta-preferentially relevant patterns. In practice, two values at most.

Now, if this were all that the agent had access to, then it could only use a crude guess. But if the agent were investigating the human more thoroughly, it could do a lot more. The pattern of valuing things differently at different times of the day might show up over longer observations, as would the pattern of reacting to key words in the description. If the agent assumed that "valuing objects" was not something that humans did ex nihilo with each object (with each object having its own independent quirky biases), then it could apply the template across all valuations, and, from even a single data point (along with knowledge of the time of day, the description, etc.), come up with an estimate that was closer to the theoretical one, as sketched below.
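
A sketch of that template idea, with made-up bias offsets: once the agent has learned, from observing H value other objects, how time of day and key words shift H's stated prices, it can invert those shifts on a single observation:

```python
# Offsets the agent might have learned from observing H value *other*
# objects (all numbers hypothetical).
time_offset = {"morning": -0.7, "midday": 0.0, "evening": 1.0}
handmade_offset = 0.8

def estimate_base_value(stated_price, time_of_day, handmade):
    """Invert the learned biases to estimate the underlying valuation."""
    return stated_price - time_offset[time_of_day] - (handmade_offset if handmade else 0.0)

# A single evening observation of the "handmade" violin:
print(estimate_base_value(7.5, "evening", handmade=True))  # 5.7
```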



