LessWrong.com News
What explanatory power does Kahneman's System 2 possess?
In the 70s and 80s, Kahneman and Tversky did a bunch of pioneering research on heuristics and biases in human thought. Then, in Thinking, Fast and Slow, Kahneman divided human cognition into System 1 and System 2: basically, System 1 applies quick heuristics and biases, and System 2 does the slow, effortful thinking.
But what does System 2 actually add to the theory in terms of explanatory power? Consider an alternative version of Thinking, Fast and Slow in which Kahneman wrote something like "Here are the conditions in which humans use this mode of reasoning I'm calling System 1, which is fast and approximate and effortless and uses heuristics and demonstrates biases which can be detected in certain ways. The rest of the time, I have no idea what's going on, except that it doesn't display the traits that would qualify it as System 1 reasoning." In what ways would this be less informative than his actual claims?
Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4)
In the previous posts, I first outlined Selection versus Control for Optimization, then talked about What Optimization Means, and how we quantify it, then applied these ideas a bit to ground the discussion.
After doing so, I reached a point where I think that there is something useful to say about mesa-optimizers. This isn't yet completely clear to me, and it seems likely that at some point in the (hopefully near) future, someone will build a much clearer conceptual understanding, and I'll replace this with a link to that discussion. Until then, I want to talk about how mesa-optimizers are control systems built within selection systems, and why that poses significant additional concern for alignment.
Mesa-Optimizers
I claimed in the previous post that mesa-optimizers are always control systems. The base optimization selects for a mesa-optimizer, usually via side-effect-free or low-cost sampling and/or simulation, then creates a system that does further optimization as an output. In some cases, the further optimization is direct optimization in the terms laid out in my earlier post, but in others it is control.
Direct Mesa-Optimizers
In many cases, the selection system finds optimal parameters or something similar for a direct optimization system. The earlier example of building a guidance system for a rocket is an obvious instance of this class, where selection leads to a direct optimizer. This isn't the only way that optimizers can interact, though.
I'd say that MCTS + Deep learning is an important example of this mix which has been compared to "Thinking Fast and Slow" (pdf, NIPS paper). In chess, for example, the thinking fast is the heuristic search to choose where to explore, which is based on a selection system, and the exploration is MCTS, which in this context I'm calling direct optimization. (This is despite the fact that it's obviously probabilistic, so in some respects looks like selection, because while the selection IS choosing points in the space, it isn't doing evaluation, but rather deterministic play-forward of alternative scenarios. The evaluation is being done by the heuristic evaluation system.) In that specific scenario, any misalignment is almost entirely a selection system issue, unless there was some actual mistake in the implementation of the rules of chess.
This opens up a significant concern for causal Goodhart; regime change will plausibly have particularly nasty effects. This is because the directly optimizing mesa-optimizer isn't at all able to consider whether the parameters selected by the base-optimizer should be reconsidered. And this is far worse for "true" control mesa-optimizers.
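As a toy illustration of this concern (not from the original post: the one-dimensional system, the candidate gains, and every name below are hypothetical assumptions), the following Python sketch has a base optimizer select a controller gain by simulating one regime, then runs the same fixed-gain controller after the regime changes.

```python
def final_error(gain, responsiveness, steps=10, target=1.0):
    """Run a fixed-gain proportional controller; return the final |target - x|."""
    x = 0.0
    for _ in range(steps):
        # The controller reacts only to the current error; it never revisits `gain`.
        x += responsiveness * gain * (target - x)
    return abs(target - x)


def select_gain(candidates, training_responsiveness):
    """Base optimizer: pick the gain with the lowest simulated error (selection)."""
    return min(candidates, key=lambda g: final_error(g, training_responsiveness))


CANDIDATE_GAINS = [0.2, 0.5, 1.0, 1.5]  # hypothetical search space
best_gain = select_gain(CANDIDATE_GAINS, training_responsiveness=1.0)

# Same controller, same gain; only the (assumed) responsiveness of the system differs.
error_in_training_regime = final_error(best_gain, responsiveness=1.0)
error_after_regime_change = final_error(best_gain, responsiveness=2.5)

print(best_gain, error_in_training_regime, error_after_regime_change)
```

In the simulated regime the selected gain removes the error, while under the assumed regime change the same gain overshoots further on every step, and nothing inside the controller is in a position to reconsider it.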
Control Mesa-Optimizers
Before we talk more about failure modes, it's important to clarify two levels of failure; the base-optimizer can fail to achieve its goals because it designs a misaligned mesa-optimizer, and the mesa-optimizer itself can fail to achieve its goals. The two failure modes are different, because we don't have any reason to assume a priori that our mesa-optimizer shares goals with our base optimizer.
Just like humans are adaptation-executers, mesa-optimizers are mesa-optimizers, not simply optimizers. If their adaptations are instead tested against the true goal, or at least the base-optimizer's goal, then evaluated on that basis, they aren't mesa-optimizers, they are trials for the selection optimizer. Note that these two cases aren't necessarily incompatible; in something like Google's federated learning model, the shared model is updated based on the control system's data. So self-driving cars may be mesa-optimizers using a neural net, and the data gathered is later used to update the base-optimizer model, which is then given back to the agents. The two parts of the system can therefore suffer different types of failures, but at least the post-hoc updating seems to plausibly reduce misalignment of the mesa-optimizer.
But in cases where side effects occur, so that the control mesa-optimizer imposes externalities on the system, the mesa-optimizer typically won't share goals with the base optimizer! This is because if it shares goals exactly, the base optimization doesn't need to build a mesa-optimizer; it can run tests without ceding direct control. (Our self-driving cars in the previous paragraph are similar to this, but their goal is the trained network's implicit goal, not the training objective used for the update. In such a federated learning model, it can fail and the trial can be used for better learning the model.) On the other hand, if no side effects occur and goals are shared, the difference is irrelevant; the mesa-optimizer can fail costlessly and start over.
Failure of Mesa-Optimizers as Principal-Agent-type "Agents"
There is a clear connection to principal-agent problems. (Unfortunately, the term agent is ambiguous, since we also want to discuss embedded agents. In this section, that's not what is being discussed.) Mesa-optimizers can succeed at their goals but fail to be reliable agents, or they can fail even at their own goals. I'm unsure about this, but it seems each of these cases should be considered separately. Earlier, I noted that some Goodhart failures are model failures. With mesa-optimizers involved, there are two sets of models: the base optimizer model, and the mesa-optimizer model.
Principal optimization failures occur either if the mesa-optimizer itself falls prey to a Goodhart failure due to shared failures in the model, or if the mesa-optimizer's model or goals differ from the principal's in ways that allow the metrics not to align with the principal's goals. (Abrams correctly noted in an earlier comment that this is misalignment. I'm not sure, but it seems this is principally a terminology issue.)
This second form allows a new set of failures. I'm even less sure about this next, but I'll suggest that we can usefully categorize the second class into 3 cases: mesa-superoptimizers, mesa-suboptimizers, and mesa-transoptimizers. The first, mesa-superoptimizers, is where the mesa-optimizer is able to find clever ways to get around the (less intelligent) base optimizer's model. This allows all of the classic Goodhart's-law-type failures, but they occur between the mesa-optimizer and the base-optimizer, rather than between the human controller and the optimizer. This case includes the classic runaway superintelligence problems. The second, mesa-suboptimizers, is where the mesa-optimizer uses a simpler model than the base optimizer, and hits a Goodhart failure that the base-optimizer could have avoided. (Let's say, for example, that it uses a correlational model holding constant certain factors known by the base optimizer to influence the system, and for whatever reason the mesa-optimizer enters a regime-change region, where those factors change in ways that the base model understands.) Lastly, there are mesa-transoptimizers, where typical human types of principal-agent failures can occur because the mesa-optimizer has different goals. The other way this occurs is if the mesa-optimizer has access to or builds a different model than the base-optimizer. This is a bit different than mesa-superoptimizers, and it seems likely that there are a variety of cases in this last category. I'd suggest that it may be more like multi-agent failures than it is like a traditional superintelligence alignment problem.
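The mesa-suboptimizer example in parentheses can be made concrete with a minimal sketch. Everything here is an illustrative assumption rather than something from the post: the linear form of the system, the coefficients, and all names are hypothetical. The base model tracks a background factor z, while the simpler mesa-model was fit with z held constant and so treats its effect as a fixed intercept.

```python
Z_TRAINING = 1.0  # value of the background factor during training (assumed)


def true_outcome(x, z):
    """The real system (hypothetical linear form): depends on action x and factor z."""
    return 2.0 * x + 3.0 * z


def base_choose_x(target, z):
    """Base optimizer's model includes z, so it can compensate when z changes."""
    return (target - 3.0 * z) / 2.0


def mesa_choose_x(target):
    """Simpler mesa-model: z's effect is baked in as a constant intercept."""
    intercept = 3.0 * Z_TRAINING
    return (target - intercept) / 2.0


def miss(chosen_x, z, target):
    """Distance between the outcome actually produced and the target."""
    return abs(true_outcome(chosen_x, z) - target)


TARGET = 10.0
for z in (Z_TRAINING, 4.0):  # the second value stands in for the regime change
    print(z, miss(mesa_choose_x(TARGET), z, TARGET), miss(base_choose_x(TARGET, z), z, TARGET))
```

While z stays at its training value both models hit the target; once z shifts, only the simpler mesa-model misses, which is the sense in which the base-optimizer could have avoided the failure.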
On to Embedded Agents
I need to think further about the above, and should probably return to it, but for now I plan to make the (eventual) next post about embedded agency in this context. Backing up from the discussion of mesa-optimizers, a key challenge for building safe optimizers in general is that control often involves embedded agent issues, where the model must be smaller than the system. In particular, in the case of mesa-optimizers, the base-optimizer needs to think of itself as an embedded agent whose model needs to include the mesa-optimizer's behavior, which is being chosen by the base-optimizer. This isn't quite embedded agency, but it requires the base optimizer to be "larger" than the mesa-optimizer, only allowing mesa-suboptimizers, which is unlikely to be guaranteed in general.
Three Stories for How AGI Comes Before FAI
To do effective differential technological development for AI safety, we'd like to know which combinations of AI insights are more likely to lead to FAI vs UFAI. This is an overarching strategic consideration which feeds into questions like how we should think about the value of AI capabilities research.
As far as I can tell, there are actually several different stories for how we may end up with a set of AI insights which makes UFAI more likely than FAI, and these stories aren't entirely compatible with one another.
Story #1: The Roadblock Story
Nate Soares describes the roadblock story in this comment:
...if a safety-conscious AGI team asked how we’d expect their project to fail, the two likeliest scenarios we’d point to are "your team runs into a capabilities roadblock and can't achieve AGI" or "your team runs into an alignment roadblock and can easily tell that the system is currently misaligned, but can’t figure out how to achieve alignment in any reasonable amount of time."
(emphasis mine)
The roadblock story happens if there are key safety insights that FAI needs but AGI doesn't need.
There is subtlety here. In order to make a strong argument for the existence of insights like this, it's not enough to point to failures of existing systems, or describe hypothetical failures of future systems. You also need to explain why the insights necessary to create AGI wouldn't be sufficient to fix the problems.
Some possible ways the roadblock story could come about:

The knowledge needed for FAI is a superset of the knowledge needed for AGI. If the safety insights are difficult to obtain, or no one is working to obtain them, we could find ourselves in a situation where we have all the AGI insights without having all the FAI insights.

Maybe safety insights are more or less agnostic to the chosen AGI technology and can be discovered in parallel. (Stuart Russell has argued against this, saying that in the same way making sure bridges don't fall down is part of civil engineering, safety should be part of AI.)

Maybe safety insights require AGI insights as a prerequisite, leaving us in a precarious position where we will have acquired the capability to build an AGI before we begin critical FAI research.
 This could be the case if the needed safety insights are mostly about how to safely assemble AGI insights into an FAI. It's possible we could do a bit of this work in advance by developing "contingency plans" for how we would construct FAIs in the event of particular capabilities advances that seem plausible. Such "contingency plans" could also be helpful for directing differential technological development, since we'd get a sense of the difficulty of FAI under various tech development scenarios. [tk BTW, factored cognition is a great example of this because it's speculative engineering that assumes much better imitation learning.] [tk Thread where you argue advanced GPT-2 could solve alignment for itself. "Contingency plan" for way better NLP tech (e.g. could it be used to implement factored cognition?) could also help figure out if way better NLP tech is something we want.]

Maybe there will be multiple subsets of the insights needed for FAI which are sufficient for AGI.
 In this case, we'd like to speed the discovery of whichever FAI insight will be discovered last.

Maybe there are some insights which are only helpful for AGI and don't help with FAI.
Story #2: The Security Story
[tk Eliezer Yudkowsky describes the security story in his security mindset series]
The main difference between the security story and the roadblock story is that in the security story, the team can't easily tell that the system is misaligned.
We can subdivide the security story based on the ease of fixing a flaw if we're somehow able to anticipate it.
[tk Most of the time, insecure systems are adjacent to secure systems in design space. Offer that list of security problems. I think Cryptography is unusual in that people think they can create codes that others can't break etc. and these aren't typically adjacent to good codes in design space. Find Guttman quote on this?]
[tk OWASP security checklist. In worlds where fixes are easy in retrospect, this kind of thing should be our focus? think laterally link to Brian Eno's Oblique Strategies deck?]
[A related scenario is one where there's an available quick fix for the problem in retrospect even if it's not a deep fix. Although EY says bolt-on is bad, this opinion is not a universal one in the computer security community. Link that post from Alex re: the value of bolt-on security.]
Differential technological development could be useful in the security story if we push for the development of AI tech that is easier to secure. However, it's not clear how confident we can be in our intuitions about what will or won't be easy to secure. In his book Thinking, Fast and Slow, Daniel Kahneman describes his adversarial collaboration with expertise researcher Gary Klein. Kahneman was an expertise skeptic, and Klein an expertise booster:
We eventually concluded that our disagreement was due in part to the fact that we had different experts in mind. Klein had spent much time with fireground commanders, clinical nurses, and other professionals who have real expertise. I had spent more time thinking about clinicians, stock pickers, and political scientists trying to make unsupportable long-term forecasts. Not surprisingly, his default attitude was trust and respect; mine was skepticism.
...
When do judgments reflect true expertise? ... The answer comes from the two basic conditions for acquiring a skill:
 an environment that is sufficiently regular to be predictable
 an opportunity to learn these regularities through prolonged practice
In a less regular, or low-validity, environment, the heuristics of judgment are invoked. System 1 is often able to produce quick answers to difficult questions by substitution, creating coherence where there is none. The question that is answered is not the one that was intended, but the answer is produced quickly and may be sufficiently plausible to pass the lax and lenient review of System 2. You may want to forecast the commercial future of a company, for example, and believe that this is what you are judging, while in fact your evaluation is dominated by your impressions of the energy and competence of its current executives. Because substitution occurs automatically, you often do not know the origin of a judgment that you (your System 2) endorse and adopt. If it is the only one that comes to mind, it may be subjectively undistinguishable from valid judgments that you make with expert confidence. This is why subjective confidence is not a good diagnostic of accuracy: judgments that answer the wrong question can also be made with high confidence.
Basically, our intuitions are only as good as the data we're able to gather. [tk OWASP security checklist holds promise here?]
Story #3: The Alchemy Story
[tk _ and _ describe the alchemy story in their test of time presentation]
The alchemy story has similarities to both the roadblock story and the security story.
From the perspective of the roadblock story, "alchemical" insights could be viewed as insights which could be useful if we only cared about creating AGI, but are too unreliable to use in an FAI. (It's possible there are other insights which fall into the category of "usable for AGI but not FAI" due to something other than their alchemical nature; if you can think of any, I'd be interested to hear.)
In some ways, alchemy could be worse than a clear roadblock. It might be that not everyone agrees whether the systems are reliable enough to form the basis of an FAI, and then we're looking at a [tk unilateralist's curse] scenario.
Just like chemistry only came after alchemy, it's possible that we'll first develop the capability to create AGI via alchemical means, and only acquire the deeper understanding necessary to create a reliable FAI later. This is a case of the possibility discussed above where FAI insights require AGI insights as a prerequisite. To prevent this, we could try & deepen our understanding of components we expect to fail in subtle ways, and retard the development of components we expect to "just work" once invented.
[tk Find & read recent Paul C discussion re: hardware and quote yourself re: why we should discourage AI hardware development]
From the perspective of the security story, "alchemical" insights could be viewed as components which are clearly prone to vulnerabilities. Alchemical components could produce failures which are hard to understand or summarize, let alone fix. From a differential technological development point of view, the best approach may be to differentially advance less alchemical, more interpretable [tk note: interpretability is not the same as explainability!] AI paradigms, developing the AI equivalent of reliable cryptographic primitives.
Trying to create an FAI from alchemical components is obviously not the best idea. But it's not totally clear how much of a risk they pose, because if the components don't work reliably, an AGI built from them may not work well enough to pose a threat. Such an AGI could work to upgrade its own components, but we might be able to program it so it periodically reevaluates the training data it's been given as its components get upgraded, so its understanding of human values improves as its components improve.
Discussion Questions
Did I miss anything? Or maybe ask readers for what scenarios DON'T fall into the categories you've described? But be sure to also mention the "something else" category in case your taxonomy is somehow flawed?
How plausible does each story seem?
This is a fake framework to facilitate brainstorming. What does the framework not capture?
Adjectives from the Future: The Dangers of Result-based Descriptions
Jumping to Conclusions
Suppose your friend tells you he's on a weight-loss program. What do you think will happen in three months if he keeps on the weight-loss program? Will he lose weight?
If you're like me, you're thinking, "Of course. He is on a weight-loss program, isn't he? So, ipso facto, he will lose weight."
Does there seem to be anything fishy about that chain of reasoning?
We usually describe the current features of a thing and predict something about the future. For example, we might say "I'm running for half an hour each day" and predict that we will lose a certain number of pounds by the end of the month. But your friend above skipped the description and talked about the prediction as if it were visible right now: "I'm on a weight-loss program".
You weren't told the features of the activity (running for half an hour) or even a name (CrossFit program). If you had been told either, you could have judged it based on your past knowledge of those features or names. Running regularly does help you lose weight and so does CrossFit. But, here, you were told just the prediction itself. This means you can't predict anything for sure. If his program involves running, he will lose weight; if it involves eating large cheese pizzas, he won't. You don't know which it is.
Yet, it sounded convincing! Even if you objected by saying that your friend probably won't stick to the exercise regimen, you probably bought into the premise, like me, that the program was a weight-loss program.
Hypothesis: If you are given an adjective that describes a future event and are not given any currently-visible features, then you're likely to jump to the conclusion that that future event will occur.
However, if you can see some features, you are more likely to be skeptical.
This means that if you hear a result-based description, like "weight-loss program", you will assume that the thing will produce the result when it may actually not. Not so much when you hear about a "steam room that helps you lose fat". You haven't really heard about the feature (steam) making people lose weight.
A more serious example is when someone mentions a drug-prevention program. We automatically think that it will prevent illegal drugs from being bought and sold. But that depends on what measures the program actually adopts. Running ads saying "Don't do drugs!" may not achieve much, whereas inspecting trucks at border checkpoints may. To judge whether the program will be successful, you have to inspect its actual features. But "drug-prevention program" sounded convincing, right? Notice how the adjective "drug-prevention" describes a future event: it says that drugs will be prevented in the future. Now, since you can't look into the future and tell whether drugs were in fact prevented, you shouldn't accept such an adjective. And since you're not told anything else about the program, you really can't say anything either way. And yet it sounds so convincing!
Similarly, take environment-protection laws. Don't you feel like they will protect the environment? After all, it's right there in the name. Contrast that to saying "a law that raises the tax rate on fossil fuels". Now this may or may not protect the environment in terms of air pollution, but at least you don't jump to that conclusion right away.
This means that the person who chooses the adjective can mislead you (and himself) in the direction he desires by describing the thing in terms of the result and by omitting any features.[1] Suppose someone tells you this is an earthquake-resistant building. Do you believe that it will withstand earthquakes better than ordinary buildings? I do. He may have described the thing solely in terms of the result, but it still sounded convincing, right? Contrast that to "this building is made out of steel-reinforced concrete". Now, you have one feature of the building. If you had to predict whether it would withstand earthquakes better than ordinary buildings, you would lean towards yes because reinforced concrete has worked in the past. But you wouldn't always jump to the conclusion that it was "earthquake-resistant". If I said "this building is made out of green-colored brick", you would be skeptical about its ability to withstand earthquakes better because you haven't heard anything about brick color being relevant.
The above illusion is compounded by the fact that you won't get feedback from others about your mistaken ideas if you use result-based descriptions[2]. Suppose your weight-loss friend assumed that using a telemarketed ab machine will help him get abs (it's right there in the name, I tell you!). Even then, he wouldn't have been lost if he had told you his concrete plan. You would have corrected his belief as soon as you stopped laughing at him. But since he told you that he's using a weight-loss program, you couldn't really correct him. He might go on behaving as if that silly "ab machine" is going to get him six-pack abs by summer.
Why do we even accept descriptions that have nothing except a description about the future?
For one, it matters that no features are described. If I said that I was drinking lemonade, you wouldn't really predict that I would lose weight. You would ask me what evidence I have for lemonade causing weight loss. But what if I said I was having a weight-loss drink? You might be less skeptical as long as you didn't look at my glass. Who knows; maybe there are drinks out there that cause weight loss.
Another relevant factor is the speaker's credibility: how often we think the speaker sees the underlying features along with the eventual result. We accept an expert's result-based description because we trust that he knows the features that lead to the result and is just omitting them when talking to a layman. When a doctor says that these are "sleeping pills", we are more likely to accept it than when a school boy does, because the doctor knows that the pills contain benzodiazepine, which usually works. When a politician calls something a "drug-prevention program", we are more likely to accept it than when a housewife says it, because the politician knows that border checks (or whatever) have worked in the past. However, this might be misleading when the expert is dealing with something novel, such as a brand-new pill formula or a brand-new approach to drug regulations, since he is unlikely to have seen the result of those features (or may not care very much about deceiving the voters).
Finally, such descriptions might be fine when talking about the past. Saying that "I went on a weight-loss program and lost 50 pounds" is a bit redundant, but harmless. You actually observe the result there, so you can decide based on the result how skeptical to be. You won't blindly jump to the conclusion that it will work as when someone says "I'm on a weight-loss program right now".
So, we should avoid describing something only in terms of the result and should describe it using features instead. And if anyone tries to bias our prediction by sneaking in an adjective from the future, we should stop and ask for the features.
Examples of Adjectives from the Future
Here are some result-based descriptions that I collected from news reports and books as I was testing the above hypothesis. All of them talk about future results, completely omit current features, and seem to make us jump to the desired conclusion. Did you fall for any of them?
Rehabilitation program: Don't you feel like the drug addict is going to get better after going to the rehab program? It's right there in the name! Notice that there are no features mentioned, just a description of the future as though it were the present. Contrast that to "not having access to drugs for 30 days, listening to lectures, and talking about your experiences". This doesn't make us jump to the conclusion that the addict will get better. We might even be skeptical about the power of lectures to fight off the temptation of drugs. For a real-world contrast, think of "the 12-step program". It too tries to overcome addiction, but it is described in terms of the features (12 steps), not the desired result (overcoming addiction). In fact, it sounds like work, which it probably is. A rehabilitation program doesn't quite sound like that.
Peace process: Feels like it is going to lead to peace. No features; only desired results. Contrast that to "shaking hands and signing agreements in front of the world press". We may be more skeptical that that will prevent future wars. But in the former case, we would be insulated from feedback because we keep talking about the "peace process" instead of the "hand-shaking and agreement-signing".
Wait. Aren't there people who distrust the peace process and talk about its possible failure? I suspect that they do so after mentioning features of the process. They might say that this dictator has reneged on his promises in the past and thus should not be trusted right now. It would sound ludicrous if they expressed skepticism without any features. People would ask, "What do you mean this peace process may not bring about peace? It's a peace process."
Dangerous driving: No features; no feedback; only the future result, danger. Contrast that to: one-handed driving, texting while driving, or overtaking cars by switching lanes. We are a bit more skeptical that it will cause danger.
Cost-cutting measures: Need I say anything? Of course the cost-cutting measure is going to cut costs. Contrast that to "switching to online advertising" or "encouraging working from home a few days a week", which we are more skeptical about, since they may or may not bring down ultimate costs.
Healthy morning drink: No features, but it sounds like it will lead to health. Even the "morning" part is not a description of a feature of the object. It just talks about the time when people will drink it. Contrast to: drink containing 15g of protein and other stuff, which may or may not lead to more "health".
Recidivism-reduction classes for ex-convicts, i.e., making sure they don't go back to jail after getting out: Again, we feel like these classes will make them less likely to go back in. The classes reduce recidivism, after all. No features mentioned; description in terms of the future result (recidivism-reduction); insulated from feedback. Contrast that to "lectures and reading books and stuff". We might be much more skeptical.
You can find any number of examples like these: national security bill vs a bill that increases the number of fighter jets; sufficiently well-funded program vs same budget as last year (which may not be enough this year); a Sudoku-solving program vs program that solved a set of easy and medium Sudoku puzzles.
How does this apply to LessWrong?
Now, let's look at some descriptions that may be important to us as LessWrong readers.
Effective altruism: Doesn't effective altruism feel like it will be effective? And altruistic? It talks about the future results and doesn't mention any current features. Contrast that to "cash transfers" or even "evidence-based donations" and "evidence-based job changes", which talk about currently-available evidence, not future results. We may be more skeptical that such cash transfers or donations will be effective or even altruistic. At least "cause prioritization" talks about a feature of the process right now. We can see a clear gap between the causes we prioritize and their eventual effectiveness. That gap doesn't even seem to exist when we talk about effective altruism.
When I hear "Against Malaria Foundation", I feel like it is sure to strike a blow against malaria. All it needs is the money. But if I were to hear "Mosquito Net Distributors", I would ask some questions about the effectiveness of mosquito nets. I may indeed get convinced that a dollar spent on nets will go farther than on other methods to fight malaria, but I won't jump to that conclusion. I may even think of how it might backfire or how mosquitoes might adapt. Not so with "Against Malaria Foundation".
Notice how future-based adjectives could make a cause immune to feedback. If you were to mention that you won't donate to, say, AMF, people could raise their eyebrows: "Are you seriously against fighting malaria?" But if you mention the means, you can safely say that you are in favour of fighting malaria, but against focusing on mosquito nets.
Finally, if "Mosquito Net Distributors" sounds a bit too sober because it doesn't mention its purpose, perhaps we could combine the two as "Mosquito Nets to Fight Malaria". [3]
Rationality techniques - When I see the term "rationality technique" or "rationality training" or "methods of rationality", I feel like the technique will lead to good, if not optimal, results. It doesn't describe any features after all; it just promises that good things will happen in the future. Contrast that to experimentation techniques or logical deduction. These talk about the features of the process and I don't assume that these will always get me the best results, since I know I might miss a confounding variable or apply rules incorrectly. I'm not quite as skeptical when I hear about the "methods of rationality".
Even when I look at concrete technique names, hearing about the CFAR technique of "Comfort Zone Expansion (CoZE)" makes me feel like it will actually expand my "comfort zone". But it doesn't mention any features; just the desired future result. Contrast that to "doing for an hour, in public, a few things you avoided doing in the past". Now, I don't jump to the conclusion that it will help me do what you or I may actually care about: ask a boss for a raise, tell an annoying colleague to shove it, or ask out a crush. I can tell that there is quite a gap between lying down on the pavement for 30 seconds and doing something that might jeopardize my work life. But when I hear "Comfort Zone Expansion", I really do feel like my "comfort zone" will be expanded, meaning that I will do those kinds of things more frequently. Why not call it "uncomfortable-action practice" or the original "exposure therapy"?
Brain emulation or brain-emulating software - "How sure are you that brain emulations would be conscious?" (source)
My immediate response is that, of course, brain emulations would be conscious. If human brains are conscious (whatever that means) and if human brain emulations emulate human brains, then those would also be conscious. The very term seems to dispose me to a particular answer. It doesn't describe any present features, just the desired future results - that the program will behave like a human brain in most respects.
Imagine if we used a term that talked only about whichever observable tests you want: "How sure are you that, say, a DARPA Grand Challenge-winning program would be conscious?" Suddenly, we are given two separate variables and asked to bridge the gap between them. That gives us a lot more room for skepticism. We can see that there could be many a slip between its present features and its future results.
I reason just as naively about claims like whole brain emulation can be "an easy way to create intelligent computers" or will acquire the "information contained within a brain", since human brains are already intelligent and already contain information. Given that this is a field where no one has succeeded, i.e., no one has emulated a human brain, we should take pains to avoid terms that make us less skeptical.
Optimization power - Lastly, take this description of a car design: "To hit such a tiny target in configuration space requires a powerful optimization process. The better the car you want, the more optimization pressure you have to exert - though you need a huge optimization pressure just to get a car at all."
I find myself agreeing with that. A car that travels fast is highly optimized, so of course it would need a powerful optimization process.
Unfortunately, "optimization process" does not describe any present features of the process itself. It simply says that the future result will be optimized. So, if you want something highly optimized, you'd better find a powerful optimizer. Seems to make sense even though it's a null statement! But if you describe any features, as in "the design of a car requires 1 teraflop of computing power for simulation", I immediately ask, is that too little computing power? Too much? I become a lot more skeptical.
Again, this suggests that, in such a novel domain, we should be more careful about avoiding result-based descriptions like "optimization power", "superintelligence", and "self-improving AI".
Should we always avoid Result-Based Descriptions?
No. I don't think it's possible and I don't think people would want it. Like I said above, when I go to the doctor, I may just want "sleeping pills", not "benzodiazepine". Speaking about the latter would be a waste of time for the doctor and for me, provided I trust him. But what if I don't trust the person or if he's deluded himself?
I would reserve this technique for occasions when you're accepting an important pitch - a pitch that asks for a big investment, either in business or politics or social circumstances. People may try to convince us to accept a "career-defining opportunity" (instead of a shift to another department, which may not define your career) or a "jobs-for-the-poor program" (instead of a law that reserves X% of infrastructure jobs, which may not be filled and may not employ all the poor) or a "life-changing experience" (instead of skydiving for six minutes, which may or may not change your life much).
When it comes to our own usage, as people who want to portray an accurate map of reality, we should avoid using such result-based descriptions that might mislead others and, most importantly, ourselves. Marketing may demand a title that sounds catchy, but you have to decide whether you want to risk deceiving others, especially when you're pitching an idea that will ask them to invest a lot.
What's in a name? Isn't it ok to have the name based on the result as long as the contents tell you the features? Well, that would be ok if people always mentioned the contents. But we usually omit the contents when referring to something and someone who is new or busy may not look at the contents. Thus they (and we) might get misled into predicting the result based on the title. A person donating to an organization or paying for a workshop may see only the title, perhaps a few testimonials from friends, and maybe some headings on the website. If all of these descriptions are result-based, he might think that the organization or workshop does, in fact, have a good chance of delivering those results. If he had been given the features, maybe he would have been much more skeptical.
Let me know your thoughts below. Does the basic hypothesis seem valid? What about some of its implications?
[1] There is a similar phenomenon in goal-setting where they distinguish between outcome goals (such as losing 10 pounds) and process goals (such as going to the gym four times a week). However, the focus there is on which goal-setting style is more effective in getting results. My focus here is on which type of description makes you more gullible. The two may be related.
[2] Isn't "result-based description" itself a result-based description, an adjective from the future? I don't think so. It's something you can observe right now. Specifically, if the description isn't fully determined by past features, then it's a result-based description. (Contrast that to "misleading description".)
[3] And, yes, "mosquito net" is itself a result-based description, since you expect it to keep out mosquitoes, but at least it mentions one feature - the net.
Could we solve this email mess if we all moved to paid emails?
Have you ever…
 Sent an email to someone in rationality and not heard back for many weeks (or more)?
 Avoided sending an email to someone because you wanted to spare their attention, despite thinking there was a fair chance they’d be genuinely interested?
 Wanted some way to signal that you actually cared more than usual about this email, but without having to burn social capital (such as by saying “urgent” or “please read”)?
 Had to ignore an email because, even though it might have been interesting, figuring that out would simply have been too effortful?
I think that 1) problems like these are prevalent, 2) they have pretty bad consequences, and 3) they could be partly solved by using services where you can pay to send someone an email (payment is usually conditional on reply).
I’m considering running a coordination campaign to move the community to using paid emails (in addition to their ordinary inbox), but before launching that unilaterally I want more confidence it is a good idea.
It would be very helpful data if people who'd use this if >=50 other people also did would post just saying "I'd use this if >=50 particular other people did".
Background
Email seems broken. This is not that surprising: your email is basically a to-do list where other people (and companies) can add items for free, without asking; and where you’re the only one who can remove them. We should do something about this.
More broadly, the attention economy seems broken. Recognising this, many rationalists use various software tools to protect themselves from apps that are engineered to be addictive. This helps at an individual level, but it doesn’t help solve the collective action problem of how to allocate our attention as a community. We should do something about this.
Costly signalling and avoiding information asymmetries
An “information asymmetry” is a situation where someone has true information which they are unable to communicate. For example, suppose 10 economists are trying to influence government policy on issue X, and one of them actually, really knows what the most effective thing is. Yet, they might not be able to communicate this to the decision-makers, since the remaining 9 have degrees from equally prestigious institutions and arguments that sound equally rigorous to someone without formal training in economics. Information asymmetries are a key mechanism that generates bad equilibria.
When it comes to email, this might look as follows: Lots of people write to senior researchers asking for feedback on papers or ideas, yet they’re mostly crackpots or uninteresting, so most stuff is not worth reading. A promising young researcher without many connections would want their feedback (and the senior researcher would want to give it!), but it simply takes too much effort to figure out that the paper is promising, so it never gets read. In fact, expecting this, the junior researcher might not even send it in the first place.
This could be avoided if people who genuinely believed their stuff was important could pay some money as a costly signal of this fact. Actual crackpots could of course also pay up, but 1) they might be less likely to, and 2) the payment would offset some of the cost of the recipient figuring out whether the email is important or not.
How the signalling problem is currently solved, and why that’s bad
Currently, the signalling problem is solved by things like:
 Spending lots of effort crafting interesting-sounding intros which signal that the thing is worth reading, instead of just getting to the point
 Burning social capital - adding tags like “[Urgent]” or “[Important]” to the subject line
This is bad, because:
1) It’s a slippery slope to a really bad equilibrium. I’ve gotten emails with titles like “Jacob, is everything alright between us?” because I didn’t buy a water bottle from some company. This is what we should expect when companies fight for my attention without any way to just directly pay for it. Even within the rationality community, if our only way of allocating importance is by drawing upon very serious vocabulary, we’ll create an incentive for exaggeration, differentially favouring those less scrupulous about this practice, and chip away at our ability to use shared cues of importance when it really matters.
2) The main thing protecting us from this inside a smaller community is that people want to preserve their reputations. But if you’re unsure how important your thing is, and mislabeling it means potentially crying wolf and risking your reputation, this usually makes it more worth it to just avoid the tag. Which means that we lose out on all those times when your thing actually was important and using the tag would have communicated that.
3) It puts the recipient between a rock and a hard place, and they’re not being compensated for it. If you mark something as “[Urgent]” that actually is urgent, and the person responds and does what you want, you’ve still presented them with the choice between sacrificing some ability to freely prioritise their tasks, and sacrificing some part of the quality of your relationship. There should be some easy way for you to compensate them for that.
4) It’s way too coarse-grained. There’s not really any way of saying:
“This is kinda important, but not that urgent, though it would probably be good if you read it at some point, though that depends on what else is on your plate”
apart from writing exactly that - but then you’re making a complicated cognitive demand, which has already burnt lots of attention for the recipient.
Brief FAQ
What if replacing email with paid emails puts us in another equilibrium that’s bad for unexpected reasons?
At the moment, it doesn’t seem feasible for us to use this to replace email. There isn’t even software available for doing that completely. Rather, people would consent to receiving paid messages (for example via earn.com, see below) in addition to having their regular inbox.
What if people don’t have enough money?
As mentioned above, sending standard emails is still an option. Yet this becomes a problem in the world where we move to the equilibrium where a standard email is taken to signal “I didn’t pay for this, so it’s not that important”. Then I can imagine grants for “email costs” being a thing, or that the benefits of the new equilibrium outweigh this cost, or that they don’t. I’m uncertain.
Wouldn’t this waste a lot of money?
Not really, assuming that the people who you send money to are at least as effective at spending it as you are, which seems likely if this gets used within the rationality community.
If this is basically right: then what do we do?
Earn.com is a site which offers paid emails. For example, you can pay to message me at earn.com/jacobjacob/
If this seems like something that could solve the current email mess, we should coordinate to get a critical mass of the community to sign up, and make their profile URLs available. (Compare this to how we’ve previously started using things like reciprocity.io and Calendly.)
I’d be happy to coordinate such a campaign, but I don’t want to do it until I’m more confident it would be a good thing.
(For the record, I have no relation to earn.com and would not benefit personally by others joining, beyond the obvious positive effects on the community. They simply seem like the best available option for doing this. They have a pretty solid team, and are used by some very senior VCs like Marc Andreessen and Keith Rabois.)
AI Safety Reading Group
If you are interested in AI Safety, come visit the AI Safety Reading Group.
The AI Safety reading group meets on Skype Wednesdays at 18:45 UTC, discussing new and old articles on different aspects of AI Safety. We start with a presentation round, then a summary of the article is presented, followed by discussion both on the article and in general.
Sometimes we have guests. On Wednesday the 14th, Stuart Armstrong will be giving a presentation on his research agenda in the reading group:
https://www.alignmentforum.org/posts/CSEdLLEkap2pubjof/researchagendav09synthesisingahumanspreferencesinto
Join us by Skype, by adding ‘soeren.elverlin’.
Previous guests include Eric Drexler, Rohin Shah, Matthijs Maas, Scott Garrabrant, Robin Hanson, Roman Yampolskiy, Vadim Kosoy, Abram Demski and Paul Christiano. A full list of articles read can be found at https://aisafety.com/readinggroup/
Does human choice have to be transitive in order to be rational/consistent?
I was struck by that question reading one of the responses to the post polling the merits of several AI alignment research ideas.
I have not really thought this through but it seems the requirement for preference ordering satisfying a transitivity requirement must also assume the alternatives being ranked can be distilled to some common denominator (economics would probably suggest utility per unit or more accurately MU/$).
I'm not sure that really covers all, and perhaps not even the majority of cases.
If we're really comparing different sets of attributes we label A, B and C, transitive preferences might well be the exception rather than the rule.
The "A>B, B>C, therefore A>C" rule is often violated when considering group choices - in political science that produces a voting cycle.
I just wonder if it really is correct to claim such results within one person's head, given we're comparing different things - and so likely the use/consumption in a slightly different context as well.
Could that internal voting cycle be a source of indecision (which is a bit different than indifference), and the reason we will often avoid a pairwise decision process and opt for putting all the alternatives up against the others to pick the preferred alternative?
If so, would that be something that an AGI will also find naturally occurs, and is it then not an error to be corrected but rather a situation where applying a pairwise choice or some transitivity check would actually be the error?
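The voting cycle mentioned above can be reproduced with a few lines of code. The following is a minimal sketch, not anything from the political science literature verbatim: the three ballots and the function names `majority_prefers` and `pairwise_winners` are illustrative assumptions. Each ballot could equally be read as one attribute or criterion inside a single person's head. Every individual ranking is transitive, yet the pairwise majority result is a cycle.

```python
from itertools import permutations

# Hypothetical ballots: each voter (or each internal criterion) ranks the
# options from most to least preferred. Each single ranking is transitive.
BALLOTS = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]


def majority_prefers(x, y, ballots):
    """Return True if a strict majority of ballots rank x above y."""
    votes_for_x = sum(1 for b in ballots if b.index(x) < b.index(y))
    return votes_for_x > len(ballots) / 2


def pairwise_winners(ballots):
    """Collect every ordered pair (x, y) where the majority prefers x to y."""
    options = sorted(ballots[0])
    return {
        (x, y)
        for x, y in permutations(options, 2)
        if majority_prefers(x, y, ballots)
    }


wins = pairwise_winners(BALLOTS)
print(sorted(wins))  # [('A', 'B'), ('B', 'C'), ('C', 'A')]
```

The aggregate prefers A to B, B to C, and C to A, so a pairwise procedure never settles, while putting all three options up at once (for example, counting first-place votes) simply produces a three-way tie.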
Diana Fleischman and Geoffrey Miller - Audience Q&A
Crossposted from Putanumonit.
This is the audience Q&A with Diana Fleischman and Geoffrey Miller at the NYC Rationality meetup, following up on my own interview which you can find here.
Content note: the audience comprised rationalists of many ethnicities, orientations, and gender expressions and we asked questions that could offend many ethnicities, orientations, and gender expressions.
What are the main hypothesized causes of homosexuality?
Diana: There’s a difference between homosexual behavior and homosexual orientation. Homosexual orientation is very rare. There’s one species, domestic sheep, in which 8-10% of rams are not interested in ewes at all. You can tie a ewe in heat in front of them and they don’t react at all. Actually, one area where homosexuality research has flourished is among sheep breeders because if you buy one of these rams who’s gay, that’s really bad news for the business.
So homosexual orientation is exceedingly rare. Even though you see stats that it’s 10% in people, it’s about 3%. In a paper that I wrote I claim that bisexuality is the optimal sexual strategy because sex is not just used for reproduction, it’s also used for affiliation. There are a number of ways to affiliate: you can give somebody food or you can give somebody an orgasm. These are ways to get other people to like you. If you’re somewhat attracted to people of the same sex but not enough to forego reproductive opportunities with people of the opposite sex, then you can actually engage with both sexes.
Is the bisexual revolution coming?
Diana: There are places where people are much more open to it, but not many places. It’s a whole spectrum of behavior.
In places around the world, there are men who have anything from affectionate to sexual interactions with other men and they’re not considered gay. They have homosexual behavior along with heterosexual behavior, and that’s a common thing. If you look at the bell curve of orientation in a society that suppresses homosexual behavior among men, you’ll find that only men who are predominantly homosexual will exhibit this type of behavior. It’s not used for affiliation in our particular society, so men don’t engage in it. Homosexual men are basically just the edge of this bell curve. The reason there keeps being a small percentage of men who are exclusively homosexual is that bisexuality is adaptive.
The most controversial hypothesis about the origin of homosexuality is Gregory Cochran’s ‘gay germ’ hypothesis, though there’s not much evidence behind it. Now homosexuality is not as hereditary as you may think; if one identical twin is gay his twin would be gay only 50% of the time. But also, identical twins are usually in the same amniotic sac, while fraternal twins are in different amniotic sacs. Cochran shares some evidence that if twins share an amniotic sac, they are more likely to both be gay. So there might be some virus or bacteria that causes homosexuality later in life.
That’s very controversial because the idea is that there would then be some kind of gay vaccine which could prevent women from giving birth to boys who would grow up to be gay.
Geoffrey: And the reason why it would make sense from the point of view of the virus is that it’s a lot easier to spread since gay men have a lot more sexual partners on average than straight men.
I have a Ph.D. student studying generally whether sexually transmitted pathogens evolve to manipulate our sexual behavior to promote their spread. This could include lowering our mating standards, making us more promiscuous, more sexual, being sexually active earlier and later in life, all kinds of things. So it might not be us in the driver’s seat as much as we think.
Diana: There are also ideas about endocrine disruptors and things of that nature. This is all very controversial. I was at a conference with other people, like me, who study homosexuality. And there was a guy there who wasn’t willing to admit that gay men are more feminine than straight men on average, that they’re more likely on average to have feminine interests. And if there’s controversy about something as basic as that, it’s very difficult to talk about anything in this sphere.
Or, for example, if you give a female rat testosterone while she’s in the womb she’s going to mount other rats when she’s older. And in the 1980s some women took a miscarriage drug that masculinized their daughters and made them more likely to be lesbians. That kind of stuff happens as well and people really hate talking about it.
Is there quantified data on gay men being more promiscuous?
Geoffrey: There are some studies about the average number of sexual partners. The typical results are that sexually active gay men in their 30s or 40s have had several dozen partners, while the average for straight men is closer to 10. The average for women is closer to 10 but they may be lying. And it’s lower for lesbians than for straight women.
But there’s a bit of a taboo around mentioning these results because promiscuity is heavily moralized. Polyamorous people also tend to have more partners than monogamous people. And then conservatives say that poly people are all sluts. Now we are, but in a good way!
Why do men not have a preference for tall women? Women prefer tall men, and tall women have tall children, some of whom will be sons. So it’s weird that tall women aren’t universally preferred.
Diana: Dutch people are tall and people in Spain and Portugal are short, and there’s evidence that it’s not by accident. There is mixed evidence that there has actually been positive selection for shortness in Portugal and Spain (which is why I’m short). There’s evidence from across the world that men prefer smaller women. Some of that has to do with smallness being associated with youth and neoteny, some of that might have to do with less savory hypotheses about exploitability and controllability.
It’s probably also associated with fertility. I think that Randy Thornhill said that 5’2” is the optimal height for fertility. Women parlay some of the somatic effort of growing into reproductive effort.
Jacob: In ancient Greece, men preferred tall women. In The Odyssey, Calypso is bewildered that Odysseus wants to go back home since she’s taller than Penelope. And Odysseus admits that she’s taller and more beautiful, but says that he’s still loyal to his family even though they aren’t tall like the gods.
You said earlier that it’s important for people to be conscious of their motivations, and I’ve recently heard two criticisms of that idea. One is Robin Hanson’s idea that if you know your motivations, you’re worse at doing something and convincing people that you’re doing it for the right reasons. The more interesting one is Jonathan Haidt’s idea of divinity, the moral dimension that puts animals below you and the gods above you. The less like an animal you are the more you can make meaning out of your life, or act morally and communally.
Do you think that thinking about motivations like trying to get laid takes away from that?
Diana: To Robin Hanson’s view, it’s an interesting idea that you’re worse at things if you know the motivations. But I can also make myself happier if I’m aware of what’s driving me. I used to date an Effective Altruist who made a lot less money than I did. At the beginning of an evening, I would give him cash so that he could pay for everything through the course of the night. And I would try my best and often succeed in forgetting I had given him cash so that I would feel like he’s taking me out. And that was awesome, but it required being quite cynical about my motivations.
As for Jonathan Haidt, I don’t identify with that notion at all. I can feel very much that I have higher meaning and purposes in my life and still be a mammal, a shitting, gestating, weird body.
I’m curious why you consider altruism a positive signal. I’m asking because I have no morals and no principles and I’m living a very happy life. Potentially the happiest life I could imagine living. Why is altruism so appealing to other people? To me, it seems inefficient.
Diana: Ok, I’m going to talk to you like you’re an alien.
That’s right, I’m an alien.
Diana: Let’s say you’re in a small group and you want people to exchange with you. You need to signal certain qualities to them that mean that in the long term you’re a good exchange partner. One of those would be loyalty to the group; another is consistency over time. This is very interesting: people change a lot over time but they like to signal that they’re very consistent so they’ll be reliable partners in the future. Altruism signals that you’re willing to be generous to those who are powerless. That is a good and virtuous signal because it also means that you’re a better exchange partner. If someone’s down and out and no longer useful to you, for example, you would still be willing to help them out.
Geoffrey: I have a whole chapter on this in The Mating Mind, which Diana should read some time *grins*. What I argue there is: imagine that you have two tribes. In one, you signal traits in ways that don’t bring benefit to others, like showing off only by singing which doesn’t bring evolutionary benefits like food or protection to the group. In the second tribe you’re doing cooperative hunting, for example, and bringing back tens of thousands of calories in mammoth meat and distributing it conspicuously within the tribe. Which of these tribes is going to do better in competition?
Jacob: Depends. Is the competition American Idol?
Geoffrey: The concept is that we’re descended from the second tribe. We’re descended from many generations of ancestors that signaled altruism within the group and had these exchange partner benefits within the group but also had net benefits in competition with other groups.
This is not a strict group selection argument, it’s an equilibrium selection argument in the game-theoretic sense. That’s what I argued in The Mating Mind, that we’re an unusually altruistic species because at the tribe v. tribe level the cooperative altruistic signalers did better. And in fact, that was exactly what Darwin argued in 1871.
Given evolutionary accounts of relationship activities and strategies, the cynical approach tends to have a moralizing effect. Calling what women do ‘shit-testing’ makes it sound like a morally bad thing to do to your partner. But of course, a lot of what we’re doing is good: it strengthens bonds, creates a sense of security, etc.
I wonder, in your personal life or from an academic perspective, how do you choose when to indulge your evolutionary preferences and when not to?
Diana: I don’t have as much control as people may think. I’m very aware when I’m shit-testing, and Geoffrey will say as much to me, but I still carry on. I can also predict how long I’ll be feeling angry and how long I might feel like shit-testing.
Yes, calling it that makes it sound bad because it generally is bad, not just because it’s a red pill term. When I was shit-testing today I recognized that I was taking the thing that he had done most recently as an indicator of his overall commitment rather than aggregating. It’s much easier for a woman to aggregate all of her grievances than to aggregate all of her joys. It’s because of this error management: it’s much better to err on the side of being doubtful about somebody’s commitment. It can be good to play around with these emotions, but it’s hard to handle them.
Geoffrey: It might be better to use a more neutral term. Amotz Zahavi called it ‘testing the bond’ in his 1975 book The Handicap Principle.
You mentioned giving a talk about how kink spaces are outlets for evolutionary urges. How would kink be represented in days of yore?
Geoffrey: There are carved images that look like the Venus of Willendorf fertility figures that have indications of rope and bondage going back at least 30,000 years. It’s hard to interpret, but tying women up is what you would do when you raid another village to kill all the men and steal the women.
Domination/submission dynamics run really deep in primates, generally. Primates have been doing domination and submission signals for at least 60-70 million years. The idea of engineering situations that bring that out seems like a natural thing for social primates to do.
And as for role play, kids like to pretend to be someone they’re not. And sensation play like spanking and flogging… there’s a speculative theory.
The standard form of copulation for mammals is dorsoventral rear entry — ‘doggy style’. What that tends to do is put a lot of repeated impact on the female’s back end. So, what you’re getting with flogging is a superstimulus of prolonged extrahard copulation. It’s a fitness indicator because that would be hard for a male to sustain.
Diana: There’s another thing about kink, which is that when you’re aroused you have arousal-induced analgesia: you don’t experience pain as much. If you can take a lot of pain, you’re actually giving an honest signal of how aroused you are.
The top three sexual fantasies for women are: number one, having sex with their main partner *mimes yawning*, number two, sex with a strange man, and number three, being taken against their will, which is now called “rapture play”. So these are things that women have experienced over evolutionary history many, many times. It’s a controversial thing to talk about but if you look at any romance novel there are often interactions where women are taken against their will.
Geoffrey: I did give a talk in Amsterdam on the evolutionary psychology of BDSM and kink. What I pointed out there is that a lot of what happens in kink is playing around with and ritualizing a lot of evolutionary psychology stuff. Likewise with a lot of what happens in roleplay: letting those instincts out to play. And that can be wonderful and creative and really really good for a relationship.
I don’t know how many other ev psych people have this view, we don’t really talk about it at the HBES conference. I do think that in the future there will be more openness about it. There will be a view that we can keep these dark instinctual secrets when we go out to our corporate jobs and behave in society, but you can still make space for them to come out and play in private life in safe and creative ways.
While we’re on the subject of kink and strange desires, I found that I have a desire as a woman that makes no evolutionary sense at all. I wonder if you have a theory about it. I have a fantasy about finding a shitty loser that no one likes and have him dominate me.
Diana: So utilitarian! But would he even know how? It’s actually a very utilitarian activity to have sex with all the men that no one else will have sex with. It would make their year!
Geoffrey: Normally hypergamy works a little differently. Normally a woman would take an existing partner and do some mental trick or roleplay to build him up as a kind of superhero. So the status differential between them gets amplified until she feels like Leda and the swan (who’s actually Zeus) and that’s exciting. Taking a total loser and then pushing yourself way below him… I don’t get at all.
Diana: I’ll just make up a story here. You want to find someone that no one else wants, so you have no competition for him, and then elicit in him the highest status and accomplishment he can have. So you have an ideal situation where you can totally monopolize him because no one else has an idea that he’s good. Women often choose men based on how many other women are attracted to them, but if you choose somebody that you have special knowledge about you never have to share them with anybody.
Geoffrey: I have an idea. If you’re a teenage girl in an age-stratified school system where you’re only allowed to date and interact with guys your age, they’re all losers relative to actual mature adults. So in order to feel any attraction to these guys you have to be able to see the little bit of promise and potential in them and be able to feel the hypergamy by making yourself even lower than they are. That would be adaptive if that’s all the choice you’ve got available.
So I think that some women are in a position where this is the best they can do in terms of finding a partner that they think is a catch, given the constraint that you’re all in 9th grade.
Discuss
Which of these five AI alignment research project ideas are no good?
I'll post five AI alignment research project ideas as comments. It would be great if you could approval-vote on them by using upvotes. I.e., when you think the project idea isn't good, you leave the comment as is; otherwise you give it a single upvote.
The project ideas follow this format (cf. The Craft of Research):
I'm studying <topic>, because I want to <question that guides the search>, in order to help my reader understand <more significant question that would be informed by an answer to the previous question>.
The project ideas are fixed-width in order to preserve the indentation. If they get formatted strangely, you might be able to fix it by increasing the width of your browser window or zooming out.
Discuss
Intransitive Preferences You Can't Pump
In the usual money-pumping argument, we take an agent with preferences A>B, B>C and C>A. Then we offer it to exchange C+$1 for B, then B+$1 for A, and finally A+$1 for C. Now the agent has paid $3 and ended up where it started.
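The pump described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the `Agent` class, the `PREFERS` table, and the `fee_tolerance` parameter are hypothetical names introduced here, and the agent is assumed to accept any swap to a preferred item as long as the fee does not exceed its tolerance.

```python
from dataclasses import dataclass

# Cyclic strict preferences from the text: A > B, B > C, C > A.
# PREFERS[x] is the item the agent would rather hold than x.
PREFERS = {"C": "B", "B": "A", "A": "C"}


@dataclass
class Agent:
    holding: str
    paid: int = 0           # total dollars handed over so far
    fee_tolerance: int = 1  # largest per-swap fee the agent accepts (assumption)

    def offer(self, new_item: str, fee: int) -> bool:
        """Accept the swap if new_item is preferred and the fee is tolerable."""
        if PREFERS[self.holding] == new_item and fee <= self.fee_tolerance:
            self.holding = new_item
            self.paid += fee
            return True
        return False


def run_pump(agent: Agent, fee: int = 1, rounds: int = 3) -> Agent:
    """Repeatedly offer the item the agent prefers to its current holding."""
    for _ in range(rounds):
        if not agent.offer(PREFERS[agent.holding], fee):
            break  # the agent refused, so the pump stalls
    return agent


pumped = run_pump(Agent("C"))
print(pumped.holding, pumped.paid)

stubborn = run_pump(Agent("C", fee_tolerance=0))
print(stubborn.holding, stubborn.paid)
```

The first agent goes around the full cycle and pays at every step; the second refuses any positive fee, so the pump never starts.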
The assumption here is that not only does this agent prefer A to B, it prefers A to B+$1. Of course, the price of $1 could be too steep; in that case we could look for something less valuable to exchange.
However, payments of arbitrarily low value need not exist! Sure, you can propose a lottery, where the agent only has to pay $1 with probability p. But arbitrarily low probabilities need not exist. To offer a lottery, you need to have a physical method to generate events with that probability. If p equals 1 divided by Graham's number, how many coins would you have to flip to run this lottery? Even if you said "the agent will pay $1 when a lump of solid gold materializes out of thin air", there are numbers lower than that probability. I hate to be an ultrafinitist, but it's true: extremely small (or large) numbers are not physically meaningful.
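As a rough illustration of the coin-flip question, here is a sketch under one assumption the post does not state: the lottery must be decided by a fixed number n of fair coin flips. Any event E determined by those flips is a set of equally likely outcomes, so its probability is a multiple of 2^{-n}, which bounds how rare a nonzero-probability event can be:

```latex
\Pr[E] = \frac{k}{2^{n}} \ \text{for some integer } k, \qquad
0 < \Pr[E] \le p \;\Longrightarrow\; 2^{-n} \le p \;\Longrightarrow\; n \ge \log_2 \frac{1}{p}.
```

For p equal to 1 divided by Graham's number, n would have to be at least the base-2 logarithm of Graham's number, which is itself far too large to write down.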
Note, here I am assuming the axiom of continuity, i.e. that if B<A<B+$1, then for some p, we must have B+p·$1<A. However, we should be able to violate this axiom as well.
What does this imply? Probably nothing. This perverse agent is almost indistinguishable from an agent which sets U(A)=U(B)=U(C). The rule is that you can violate any axioms, but only when it doesn't matter.
Discuss
Categorial preferences and utility functions
This post is motivated by a recent post of Stuart Armstrong on going from preferences to a utility function. It was originally planned as a comment, but seems to have developed a bit of a life of its own. The ideas here came up in a discussion with Owen Biesel; all mistakes in this exposition are mine. I'm not very good with the typesetting engine here, so apologies for LaTeX and other problems.
The basic idea is as follows. Suppose we have a set S of objects, and we are given some information on which objects are preferred to which other objects. Then we are interested in whether and in how many ways this data can be captured by a utility function. Our key innovation is that we assume that not only the direction of preferences is given, but also some information on the strength of the preferences, in a manner which we will make precise below (weak preferences).
Basics on orders vs utility functions
We refer to the Order Theory page on Wikipedia for the definitions of reflexive, antisymmetric, transitive and connexive binary relations. If
S is a set and U:S→R is a function (`utility'), this induces a reflexive, transitive and connected binary relation (not antisymmetric in general, unless U is injective).
Conversely, any reflexive, transitive, antisymmetric and connected binary relation (a.k.a. total order) on a countable set S is induced by a utility function taking values in the rational numbers (link to proof); there is a more general discussion here.
Strength of preferences
In what follows, we fix a totally ordered abelian group G. To express a preference between two objects s1, s2 of our set S, one should give an element of G which expresses how strongly s1 is preferred to s2. The most natural example is to take G=Z, then:
Saying s1 is preferred to s2 with strength 1 means we slightly prefer s1 to s2;
Saying s1 is preferred to s2 with strength 0 means we have no preference between s1 and s2;
Saying s1 is preferred to s2 with strength 2 means we prefer s1 to s2 more strongly;
Saying s1 is preferred to s2 with strength −1 means we slightly prefer s2 to s1.
Expressions of preference
We consider three ways of describing preferences among objects in a set S:
1. Weak Preference
Definition. A weak preference among elements of S consists of a collection of triples (s1,s2,g) with si∈S and g∈G.
A triple (s1,s2,g) is interpreted as meaning that s1 is preferred to s2 with strength g. This is the kind of data one might be provided with in practice; for example, someone tells you that they slightly prefer cats to dogs, but strongly prefer dogs to snakes (assigning numbers/elements of G to the strengths of the preferences).
2. Categorical preference
Definition. To give a categorical preference for elements of S is to give a category with objects S, together with an enrichment over G.
This definition may seem a bit cryptic, but is close to a standard way of thinking of an order as an enrichment. For every two objects s1,s2∈S we assign an element of G (which we think of as telling us the strength of the preference for s1 over s2), subject to a bunch of compatibility conditions (for example it will imply that the preference for s over s is given by the unit of G). More details and expansion can be found in Lawvere's Taking Categories Seriously.
3. Utility function
This is just a function U:S→R, and is interpreted in the usual way.
Goal
In practice, information is likely to be supplied in the form of a weak preference. Ultimately one wants a utility function to feed to some optimisation procedure. We will describe how to pass naturally from a weak to a categorical preference, and then see (under some hypotheses) that a categorical preference induces an essentially unique utility function.
We suppose throughout that S and G are fixed, with S finite for simplicity.
Weak preferences to categorical preferences
First, given a categorical preference, choosing a subset of arrows yields a weak preference. On the other hand, given a weak preference, we are interested in
Q1. whether this weak preference can arise from a categorical preference, and
Q2. if so, then from how many distinct categorical preferences can this weak preference arise.
Suppose we are given a weak preference W. First, for every triple (s1,s2,g), we adjoin to W the triple (s2,s1,−g); call the resulting weak preference W′. A cycle in W′ is a finite ordered sequence of triples ((s1,s2,g1),(s2,s3,g2),⋯,(sn,s1,gn)), and such a cycle is closed if g1+g2+⋯+gn=0 in G.
Claim 1: The weak preference W′ can be extended to a categorical preference if and only if every cycle is closed.
Sketch of proof. Suppose we are given s1 and s2, and want to assign an element of G. If there is no path from s1 to s2 we can assign any element we want (we will use this observation later). If there is a path, then the group law in G determines a value of the preference of s1 over s2. The only way a problem might arise is if there are two (or more) paths from s1 to s2, but then the `cycles are closed' condition means that these paths will induce the same preference for s1 over s2. QED
We move on to the uniqueness question. Assume that W′ satisfies the `cycles are closed' condition, and write π0(W′) for the set of connected components of W′.
Claim 2. The set of categorical preferences restricting to W′ is naturally in bijection with G^(#π0(W′)−1).
In particular, if W′ is connected then there is exactly one way to associate a categorical preference.
Sketch of proof. In the proof of claim 1, the only choice we made was when there was no path from s1 to s2. In that case we chose an element of G. QED
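The check in Claim 1 can be sketched in code. What follows is only an illustration under stated assumptions: it takes G=Z, represents a weak preference as a list of (s1, s2, g) tuples, and the function name extend_weak_preference is made up for this example. It walks each connected component once, assigning every object a "level" so that the strength of s over t is level[s] − level[t]; a contradiction on revisiting an object means some cycle is not closed.

```python
from collections import defaultdict, deque


def extend_weak_preference(objects, triples):
    """Check the 'cycles are closed' condition for a weak preference over G = Z.

    Returns (level, n_components) when every cycle is closed, else None.
    Within one connected component, level[s] - level[t] is the forced
    strength of the preference for s over t.
    """
    # Build W': every triple (s1, s2, g) also contributes (s2, s1, -g).
    edges = defaultdict(list)
    for s1, s2, g in triples:
        edges[s1].append((s2, g))
        edges[s2].append((s1, -g))

    level = {}
    n_components = 0
    for root in objects:
        if root in level:
            continue
        # New connected component: its base level is a free choice.
        level[root] = 0
        queue = deque([root])
        while queue:
            s = queue.popleft()
            for t, g in edges[s]:
                # (s, t, g) says level[s] - level[t] must equal g.
                forced = level[s] - g
                if t not in level:
                    level[t] = forced
                    queue.append(t)
                elif level[t] != forced:
                    return None  # two paths disagree: a cycle is not closed
        n_components += 1
    return level, n_components


animals = ["cat", "dog", "snake", "fish"]
consistent = [("cat", "dog", 1), ("dog", "snake", 2)]
level, n_components = extend_weak_preference(animals, consistent)
print(level["cat"] - level["snake"], n_components)
```

Here "fish" is not related to anything, so there are two connected components, and by Claim 2 the extensions are parametrised by one free element of G (the strength of, say, cat over fish). Adding a triple such as ("snake", "cat", 1) creates a cycle whose strengths sum to 4 rather than 0, and the function returns None.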
From categorical preferences to utility functions
Suppose we are given a categorical preference on S, i.e. the structure of a category enriched over G with object set S. Analogously to the above, we are interested in whether and in how many ways this can be induced by a utility function.
From now on, we assume that G is Archimedean, equivalently that it is isomorphic (as an ordered group) to a subgroup of R. This isomorphism is then necessarily unique up to scaling (Hölder's theorem).
Suppose first that we have a utility function U:S→R. Let G⊂R be a subgroup containing U(s1)−U(s2) for every s1,s2∈S. Then by assigning to the pair (s1,s2) the element U(s1)−U(s2)∈G we give S a categorical preference structure, with enrichment over G.
Again, we are interested in two questions:
Q1. given a categorical preference on S, does it arise from some utility function in the above fashion?
Q2. If the answer to Q1 is `yes', then from how many utility functions can our categorical preference arise?
The answer to Q1 is always yes. First choose an embedding of G in R as a totally ordered group. Then choose some element s0∈S, and define U(s0)=0. The values of U on the other elements of S are then uniquely determined by the enriched category structure.
Q2 is almost as easy. The embedding of G in R is unique up to scaling, and the choice of U(s0) is just a translation. Hence the utility function U is unique up to translation and scaling.
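A small sketch of this construction, again assuming G=Z: the embedding of G in R is multiplication by a positive number (called scale here), and the categorical preference is passed in as a function strength(s, t). All of these names, and the example data, are hypothetical and chosen only for illustration.

```python
def utility_from_categorical(objects, strength, base, scale=1.0):
    """Build U with U(base) = 0 and U(s) - U(t) = scale * strength(s, t).

    `scale` stands in for the choice of order-preserving embedding of G = Z
    in R, and `base` for the choice of s0.
    """
    if scale <= 0:
        raise ValueError("an order-preserving embedding needs scale > 0")
    return {s: scale * strength(s, base) for s in objects}


# Hypothetical categorical preference on three objects.
levels = {"cat": 3, "dog": 2, "snake": 0}


def strength(s, t):
    return levels[s] - levels[t]


u_snake = utility_from_categorical(levels, strength, base="snake")
u_dog = utility_from_categorical(levels, strength, base="dog")

# Changing the base point only translates the utility function.
shifts = {s: u_snake[s] - u_dog[s] for s in levels}
print(u_snake, u_dog, shifts)
```

The two utility functions differ by the same constant on every object, and changing scale multiplies all differences by the same positive factor, which is the `unique up to translation and scaling' statement in this toy case.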
Conclusion
In practice, we might be given the data of a weak preference. We have seen an easy way to check whether it can be extended to a categorical preference, and a simple description of all the resulting possible categorical preferences. A categorical preference always comes from a utility function, in a way which is unique up to translation and scaling.
In actual practice, it is not unlikely that the weak preference data will not come from a categorical preference (and thus not from a utility function). Then we should probably look for the `most reasonable' associated categorical preference; how to do this is not so clear to me yet.
I find the fact that utility functions are only unique up to translation and scaling a bit awkward; maybe this notion of categorical preference captures the important data in a more canonical fashion?!
Discuss
What is the state of the ego depletion field?
It's been almost half a decade since the replication crisis in psychology broke, and one of the major casualties was the subfield of ego depletion. (I believe this was the initial large-scale meta-analysis that found no effect. And here is a popular article on the failure to replicate.)
Does anyone have a good summary of the state of that research program? Is there any part of it that held up? What can we conclusively say about self-control, willpower, depletion of mental resources, etc.?
Discuss
Is there a source/market for LW-related t-shirts?
I think some of the sayings that come from LW and the surrounding community would work well on t-shirts. I would be interested in both making and buying them.
 Is there already a source of such t-shirts?
 If not, is there interest?
 What is the best platform for these t-shirts?
Discuss
AI Forecasting Dictionary (Forecasting infrastructure, part 1)
This post introduces the AI Forecasting Dictionary, an open-source set of standards and conventions for precisely interpreting AI and auxiliary terms. It is the first part in a series of blog posts which motivate and introduce pieces of infrastructure intended to improve our ability to forecast novel and uncertain domains like AI.
The Dictionary is currently in beta, and we're launching early to get feedback from the community and quickly figure out how useful it is.
Background and motivation
1) Operationalisation is an unsolved problem in forecasting
A key challenge in (AI) forecasting is to write good questions. This is tricky because we want questions which both capture important uncertainties, and are sufficiently concrete that we can resolve them and award points to forecasters in hindsight. For example:
“Will there be a slow takeoff?” is a question that’s important yet too vague.
“Will there be a 4-year doubling of world output before the first 1-year doubling of world output?” is both important and concrete, yet sufficiently far-out that it’s unclear if standard forecasting practices will be helpful in resolving it.
“Will there be a Starcraft II agent by the end of 2020 which is at least as powerful as AlphaStar, yet uses <$10,000 of publicly available compute?” is more amenable to standard forecasting practices, but at the cost of being only tangentially related to the high-level uncertainty we initially cared about. And so on.
Currently, forecasting projects reinvent this wheel of operationalisation all the time. There are usually idiosyncratic and time-consuming processes of writing and testing questions (this might take many hours for a single question) [1], and best practices tend to evolve organically but without being systematically recorded and built upon [2].
2) The future is big, and forecasting it might require answering a lot of questions
This is an empirical claim which we’ve become more confident in by working in this space over the last year.
One way of seeing this is by attempting to break down an important high-level question into pieces. Suppose we want to get a handle on AI progress by investigating key inputs. We might branch those into progress on hardware, software, and data (including simulations). We might then branch hardware into economics and algorithmic parallelizability. To understand the economics, we must branch it into the supply and demand side, and we must then branch each of those to understand how they interface with regulation and innovation. This involves thousands of actors across academia, industry and government, and hundreds of different metrics for tracking progress of various kinds. And we’ve only done a brief depth-first search on one of the branches of the hardware-software-data tree, which in turn is just one way of approaching the AI forecasting problem.
Another way of guesstimating this: the AI Impacts archive contains roughly 140 articles. Suppose this is 10% of the number of articles they’d need to accomplish their mission. If each contains 1-30 uncertain claims that we’d ideally like to gather estimates on, that’s 1400 to 42000 uncertainties, each of which would admit many different ways of being sufficiently operationalised. For reference, over the 4 years of the Good Judgement Project, roughly 500 questions were answered.
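To make the arithmetic behind that guesstimate explicit, here is a short sketch. The inputs are exactly the figures quoted above; the variable names are made up for this example, and integer arithmetic is used to avoid floating-point rounding.

```python
# Figures quoted in the text above.
articles_now = 140
percent_of_needed = 10          # current articles are ~10% of what is needed
claims_per_article = (1, 30)    # low and high guess per article

# 140 articles at 10% coverage implies 1400 articles in total.
articles_needed = articles_now * 100 // percent_of_needed

low = articles_needed * claims_per_article[0]
high = articles_needed * claims_per_article[1]
print(articles_needed, low, high)
```

Even the low end of this range is almost three times the roughly 500 questions answered over the Good Judgement Project's 4 years.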
We’d of course be able to prune this space by focusing on the most important questions. Nonetheless, there seems to be a plausible case that scaling our ability to answer many questions is important if we want our forecasting efforts to succeed.
We see some evidence of this from the SciCast project, a prediction tournament on science and technology that ran from 2013-2015. The tournament organizers note the importance of scaling question generation through templates and the creation of a style guide. (See the 2015 Annual report, p. 86.)
3) So in order to forecast AI we must achieve economies of scale, making it cheap to write and answer the marginal question by efficiently reusing work across them.
AI Forecasting Dictionary
As a piece of the puzzle to solve the above problems, we made the AI Forecasting Dictionary. It is an open-source set of standards and conventions for precisely interpreting AI and auxiliary terms.
Here’s an example entry:
Automatable
See also: Job
A job is automatable at a time t if a machine can outperform the median-skilled employee, with 6 months of training or less, at 10,000x the cost of the median employee or less. Unless otherwise specified, the date of automation will be taken to be the first time this threshold is crossed.
Examples:
* As of 2019, Elevator Operator
Non-examples:
* As of 2019, Ambulance Driver
* As of 2019, Epidemiologist
(This definition is based on Luke Muehlhauser’s here.)
There are several mechanisms whereby building a dictionary helps solve the problems outlined above.
Less overhead for writing and forecasting questions
The dictionary reduces overhead in two ways: writers don’t have to reinvent the wheel whenever they operationalise a new thought, and forecasters can reduce the drag of constantly interpreting and understanding new resolutions. This makes it cheaper to both generate and answer the marginal question.
A platform for spreading high initial costs over many future use cases
There are a number of common pitfalls that can make a seemingly valid question ambiguous or misleading. For example, positively resolving the question:
“Will an AI lab have been nationalized by 2024?”
by the US government nationalising GM as a response to a financial crisis, yet GM nonetheless having a self-driving car research division. Or forecasting:
“When will there be a superhuman Angry Birds agent using no hard-coded knowledge?”
and realizing that there seems to be little active interest in the yearly benchmark competition (with performance even declining over years). This means that the probability entirely depends on whether anyone with enough money and competence decides to work on it, as opposed to what key components make Angry Birds difficult (e.g. physics-based simulation and planning) and how fast progress is in those domains.
Carefully avoiding such pitfalls comes with a high initial cost when writing the question. We can make that cost worth it by ensuring it is amortized across many future questions, and broadly used and built upon. A Dictionary is a piece of infrastructure that provides a standardised way of doing this. If someone spends a lot of time figuring out how to deal with a tricky edge case or a “spurious resolution”, there is now a Schelling point where they can store that work, and expect future users to read it (as well as where future users can expect them to have stored it).
Version management
When resolving and scoring quantitative forecasting questions, it’s important to know exactly what question the forecaster was answering. This need for precision often conflicts with the need to improve the resolution conditions of questions as we learn and stress-test them over time. For the Dictionary, we can use best practices for software version management to help solve this problem. As of this writing, the Dictionary is still in beta, with the latest release being v0.3.0.
Open-source serendipity
The Dictionary might be useful not just for forecasting, but also for other contexts where precisely defined AI terms are important. We open-sourced it in order to allow people to experiment with such use cases. If you do so in a substantial way, please let us know.
How to use the dictionary
If you use the Dictionary for forecasting purposes, please reference it to help establish it as a standard of interpretation. One way of doing this is by appending the tag [aidictvX.Y.Z] at the end of the relevant string. For example:
I predict that image classification will be made robust against unrestricted adversarial examples by 2023. [aidictv2]
or
Will there be a superhuman Starcraft agent trained using less than $10,000 of publicly available compute by 2025? [aidictv0.4]
In some cases you might want to tweak or change the definitions of a term to match a particular use case, thereby departing from the Dictionary convention. If so, then you SHOULD mark the terms receiving a non-standard interpretation with the “^” symbol. For example:
I expect unsupervised language models to be human-level^ by 2024. [aidictv1.3]
You might also want to add the following notice:
"For purposes of resolution, these terms are interpreted in accordance with the Technical AI Forecasting Resolution Dictionary vX.Y.Z, available at aidict.com. Any term whose interpretation deliberately departs from this standard has been marked with a ^."
How to contribute to the dictionary
The AI Forecasting Dictionary is open-source, and you can contribute by making pull requests to our GitHub or suggestions in the Google Doc version (more details here). We especially welcome:
 Attempts to introduce novel definitions that capture important terms in AI (current examples include: “module”, “transformative AI” and “compute (training)”)
 Examples of forecasting questions which you wrote and which ended up solving/making progress on some tricky piece of operationalisation, such that others can build on that progress
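Returning to the tagging convention from "How to use the dictionary": a tool that collects tagged forecasts might read the tag and the ^ markers roughly as below. This is only a sketch, not part of the Dictionary itself. The tag spelling is copied verbatim from the examples in this post, the function name parse_forecast is hypothetical, and padding a short version such as v2 to (2, 0, 0) is an assumption made for the example.

```python
import re

# Tag at the end of the string, e.g. "[aidictv0.4]" (spelling as in this post).
TAG_RE = re.compile(r"\[aidictv(\d+(?:\.\d+){0,2})\]\s*$")
# A term immediately followed by "^" is marked as non-standard.
NONSTANDARD_RE = re.compile(r"([\w-]+)\^")


def parse_forecast(text):
    """Return the Dictionary version tag and any ^-marked terms in `text`."""
    version = None
    match = TAG_RE.search(text)
    if match:
        parts = [int(p) for p in match.group(1).split(".")]
        # Assumption: missing minor/patch numbers are read as zero.
        parts += [0] * (3 - len(parts))
        version = tuple(parts)
    return {
        "version": version,
        "nonstandard_terms": NONSTANDARD_RE.findall(text),
    }


example = ("I expect unsupervised language models to be human-level^ "
           "by 2024. [aidictv1.3]")
print(parse_forecast(example))
```

Storing the version as a tuple of integers means two tags can be compared directly, which is one simple way to find out which forecasts were written against an older release.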
Footnotes
[1] Some people might be compelled by an analogy to mathematics here: most of the work often lies in setting up the right formalism and problem formulation rather than in the actual proof (for example, Nash’s original fixed point theorems in game theory aren’t that difficult once the setup is in place, but realising why and how this kind of setup was applicable to a large class of important problems was highly nontrivial).
[2] English Common Law is the clear example of how definitions and policies evolve over time to crystallize judgements and wisdom.
Discuss
Why Gradients Vanish and Explode
Epistemic status: Confused, but trying to explain a concept that I previously thought I understood.
Without taking proper care of a very deep neural network, gradients tend to suddenly become quite large or quite small. If the gradient is too large, then the network parameters will be thrown completely off, possibly causing them to become NaN. If they are too small, then the network will stop training entirely. This problem is called the vanishing and exploding gradients problem.
When I first learned about the vanishing gradients problem, I ended up getting a vague sense of why it occurs. In my head I visualized the sigmoid function.
I then imagined this being applied element-wise to an affine transformation. If we just look at one element, then we can imagine it being the result of a dot product of some parameters, and that number is being plugged in on the x-axis. On the far left and on the far right, the derivative of this function is very small. This means that if we take the partial derivative with respect to some parameter, it will end up being extremely (perhaps vanishingly) small.
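To make that picture concrete, here is a minimal sketch in plain Python. It assumes a single unit computing sigmoid(w·x + b), and the helper names are made up for this illustration. By the chain rule, the partial derivative with respect to weight w_i is sigmoid'(z) times x_i, where z = w·x + b, so it shrinks whenever z sits far out on either tail.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    # Derivative of the sigmoid: s(z) * (1 - s(z)), largest at z = 0.
    s = sigmoid(z)
    return s * (1.0 - s)


def grad_wrt_weight(w, x, b, i):
    """Partial derivative of sigmoid(w . x + b) with respect to w[i]."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid_grad(z) * x[i]


for z in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"z = {z:6.1f}   sigmoid'(z) = {sigmoid_grad(z):.6f}")

# A unit whose pre-activation is far to the right (z = 10) passes back
# only a tiny gradient to its weights.
print(grad_wrt_weight([5.0, 5.0], [1.0, 1.0], 0.0, 0))
```

The derivative peaks at 0.25 when z = 0 and falls off quickly on both sides, which is the "far left and far right" behaviour described above.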
url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/otf/MathJax_MathItalic.otf') format('opentype')} @fontface {fontfamily: MJXcTeXsize1R; src: local('MathJax_Size1'), local('MathJax_Size1Regular')} @fontface {fontfamily: MJXcTeXsize1Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/eot/MathJax_Size1Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/woff/MathJax_Size1Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/otf/MathJax_Size1Regular.otf') format('opentype')} @fontface {fontfamily: MJXcTeXsize2R; src: local('MathJax_Size2'), local('MathJax_Size2Regular')} @fontface {fontfamily: MJXcTeXsize2Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/eot/MathJax_Size2Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/woff/MathJax_Size2Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/otf/MathJax_Size2Regular.otf') format('opentype')} @fontface {fontfamily: MJXcTeXsize3R; src: local('MathJax_Size3'), local('MathJax_Size3Regular')} @fontface {fontfamily: MJXcTeXsize3Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/eot/MathJax_Size3Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/woff/MathJax_Size3Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/otf/MathJax_Size3Regular.otf') format('opentype')} @fontface {fontfamily: MJXcTeXsize4R; src: local('MathJax_Size4'), local('MathJax_Size4Regular')} @fontface {fontfamily: MJXcTeXsize4Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/eot/MathJax_Size4Regular.eot'); src /*2*/: 
url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/woff/MathJax_Size4Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/otf/MathJax_Size4Regular.otf') format('opentype')} @fontface {fontfamily: MJXcTeXvecR; src: local('MathJax_Vector'), local('MathJax_VectorRegular')} @fontface {fontfamily: MJXcTeXvecRw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/eot/MathJax_VectorRegular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/woff/MathJax_VectorRegular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/otf/MathJax_VectorRegular.otf') format('opentype')} @fontface {fontfamily: MJXcTeXvecB; src: local('MathJax_Vector Bold'), local('MathJax_VectorBold')} @fontface {fontfamily: MJXcTeXvecBx; src: local('MathJax_Vector'); fontweight: bold} @fontface {fontfamily: MJXcTeXvecBw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/eot/MathJax_VectorBold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/woff/MathJax_VectorBold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTMLCSS/TeX/otf/MathJax_VectorBold.otf') format('opentype')} 1
Now, I know the way that I was visualizing this was very wrong. There are a few mistakes I made:
1. This picture doesn't tell me anything about why the gradient "vanishes." It's just showing me a picture of where the gradients get small. Gradients also get small when they reach a local minimum. Does this mean that vanishing gradients are sometimes good?
2. I knew that gradient vanishing had something to do with the depth of a network, but I didn't see how the network being deep affected why the gradients got small. I had a rudimentary sense that each layer of sigmoid compounds the problem until there's no gradient left, but this was never presented to me in a precise way, so I just ignored it.
I now think I understand the problem a bit better, but maybe not a whole lot better.
First, the basics. Without describing the problem in a very general sense, I'll walk through a brief example. In particular, I'll show how we can imagine a forward pass in a simple recurrent neural network that enables a feedback effect to occur. We can then immediately see how gradient vanishing can become a problem within this framework (no sigmoids necessary).
Imagine a sequence of vectors defined recursively by
$$h^{(t)} = W h^{(t-1)}$$
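This sequence of vectors can be identified as the sequence of hidden states of the network. Let W admit an orthogonal eigendecomposition, $W = Q \Lambda Q^\top$ (as it does when W is real and symmetric). Because $Q^\top Q = I$, the inner factors cancel when W is applied repeatedly, so we can represent t applications of the weight matrix as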
$$h^{(t)} = Q \Lambda^t Q^\top h^{(0)}$$
where $\Lambda$ is a diagonal matrix containing the eigenvalues of W, and Q is an orthogonal matrix whose columns are the corresponding eigenvectors. Raising $\Lambda$ to the power t raises each eigenvalue on its diagonal to the power t, so eigenvalues with magnitude less than one decay exponentially towards zero, while eigenvalues with magnitude greater than one blow up exponentially as t grows.
Since Q is orthogonal, the transformation $Q^\top h^{(0)}$ can be thought of as a change of basis (a rotation, possibly combined with a reflection) in which each coordinate is the projection of $h^{(0)}$ onto an eigenvector of W. Therefore, when t is very large, as in the case of an unrolled recurrent network, the calculation ends up dominated by the components of $h^{(0)}$ that lie along eigenvectors whose eigenvalues have magnitude greater than one, while the components along eigenvectors whose eigenvalues have magnitude less than one vanish.
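To make the decay and blow-up concrete, here is a minimal sketch in plain Python. The 2x2 matrix, the 45-degree eigenbasis, and the eigenvalues 1.1 and 0.9 are illustrative assumptions of mine, not values from any real network; the function names are hypothetical as well.

```python
import math

def matvec(m, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

# Assumed example: an orthonormal eigenbasis Q (a 45-degree rotation)
# with one eigenvalue above one (1.1) and one below one (0.9).
c = math.cos(math.pi / 4)
s = math.sin(math.pi / 4)
Q = [[c, -s], [s, c]]
Q_T = [[c, s], [-s, c]]
eigenvalues = [1.1, 0.9]

def hidden_state(h0, t):
    """Compute h(t) = Q Lambda^t Q^T h(0) using the eigendecomposition."""
    coords = matvec(Q_T, h0)  # coordinates of h(0) in the eigenbasis
    scaled = [lam ** t * x for lam, x in zip(eigenvalues, coords)]
    return matvec(Q, scaled)  # back to the original basis

def eigen_coords(h):
    """Express a vector in the eigenbasis of W."""
    return matvec(Q_T, h)

h0 = [1.0, 0.0]
for t in (0, 10, 50, 100):
    along_big, along_small = eigen_coords(hidden_state(h0, t))
    print(f"t={t:3d}  along 1.1-eigenvector: {along_big:12.4f}  "
          f"along 0.9-eigenvector: {along_small:.8f}")
```

Running this shows the coordinate along the 1.1-eigenvector growing without bound while the coordinate along the 0.9-eigenvector shrinks towards zero, which is the feedback effect described above.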
This is a problem because if an input vector ends up pointing in the direction of one of these eigenvectors, the loss may be very high and may change very steeply. In these regions, stochastic gradient descent may massively overshoot. If SGD overshoots, we end up reversing all of the progress we had previously made towards a local minimum.
As Goodfellow et al. note,² this error is relatively easy to avoid in the case of non-recurrent neural networks, because in that case the weights aren't shared between layers. However, in the case of vanilla recurrent neural networks, this problem is almost unavoidable. Bengio et al. showed that even when a simple neural network has a depth of only 10, this problem will show up with near certainty.
One way to mitigate the problem is to simply clip the gradients so that a single step can't reverse all of the descent progress so far. This treats the symptom of exploding gradients, but doesn't fix the problem entirely, since the underlying issue of exploding or vanishing eigenvalue powers remains.
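As a rough illustration of clipping, here is a minimal sketch of one common variant, clipping by Euclidean norm. The post doesn't say which clipping rule is meant, so the choice of norm clipping, the function name `clip_by_norm`, and the threshold of 1.0 are all assumptions on my part.

```python
import math

def clip_by_norm(grad, max_norm):
    """Rescale grad so its Euclidean norm is at most max_norm.

    The direction of the gradient is preserved; only its length is capped.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

# A well-behaved gradient passes through unchanged...
small = clip_by_norm([0.3, -0.4], max_norm=1.0)
# ...while an exploded gradient is shrunk back to length max_norm.
huge = clip_by_norm([3000.0, -4000.0], max_norm=1.0)
print(small)
print(huge)
```

Note that this only limits the size of each update. It does nothing for the vanishing case, where the gradient is already too small to carry information.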
Therefore, in order to fix this problem, we need to fundamentally redesign the way that the gradients are backpropagated through time, motivating echo state networks, leaky units, skip connections, and LSTMs. I plan to one day go into all of these, but I first need to build up my skills in matrix calculus, which are currently quite poor.
Therefore, I intend to make the next post (and maybe a few more) about matrix calculus. Then perhaps I can revisit this topic and gain a deeper understanding.
1 This may be an idiosyncratic error of mine. See page 105 in these lecture notes to see where I first saw the problem of vanishing gradients described.
2 See section 10.7 in the Deep Learning Book for a fuller discussion of vanishing and exploding gradients.
Calibrating With Cards
In this post, I'll try to bring together two things I enjoy: rationality and magic. Like Hazard, I've also practiced close-up magic for a good amount of time now. After recently seeing Tyler Alterman make a Facebook post about estimations and System 1, it occurred to me that there are a few calibration exercises you can do with a deck of playing cards. The three exercises below are all variants of cutting/manipulating a deck of cards, and then trying to intuit something about the deck.
This serves three purposes:
 Get a feel for your System 1:
 The goal of the following three exercises is to see how good your gut is at estimating uncertainty (hint: probably better than you think!)
 Improve calibration:
 These exercises all allow for some room for error. You can set your confidence intervals and see how quickly you can get calibrated using first principles.
 Practice cool party tricks:
 While I don't intend for this to be a full-on magic tutorial, the exercises I outline are building blocks for magic tricks, and even demonstrating your super-calibration (after getting good) might be impressive.
Below are the three exercises. If you have a deck of cards handy, you can tag along!
Cut Estimation
The simplest exercise is as follows:
 Lift up a packet of cards.
 Estimate how many cards you've picked up.
 Count to check how many you actually picked up.
You have a few obvious reference points. The entire deck is 52 cards, and you can easily tell if you've lifted up more or less than half. With a little practice, I've found that my gut is pretty good at this sort of thing. I'll ask myself how many cards, and there will be a number that feels right. It's usually quite close.
Things to pay attention to:
 When your System 2 estimate conflicts with your System 1 gut answer of how many cards you cut off.
 Whether you are systematically over- or underestimating the amount.
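If you want to track this over many cuts, a small log is enough. The sketch below is my own illustration, not part of the exercise as described: the function name `summarize_cuts`, the tolerance of two cards, and the sample numbers are all made up.

```python
def summarize_cuts(trials, tolerance):
    """Summarize a list of (estimated, actual) card counts.

    Returns the mean signed error (negative means you tend to
    underestimate) and the fraction of estimates that landed
    within +/- tolerance cards of the true count.
    """
    errors = [estimated - actual for estimated, actual in trials]
    mean_signed_error = sum(errors) / len(errors)
    hit_rate = sum(abs(e) <= tolerance for e in errors) / len(errors)
    return mean_signed_error, hit_rate

# Made-up sample data: (gut estimate, counted cards)
trials = [(20, 22), (26, 25), (15, 18), (30, 31), (10, 10)]
bias, hit_rate = summarize_cuts(trials, tolerance=2)
print(f"mean signed error: {bias:+.1f} cards")
print(f"within 2 cards: {hit_rate:.0%}")
```

A mean signed error far from zero suggests a systematic over- or underestimate, while the hit rate tells you whether the interval you chose matches the confidence you claimed for it.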
This second exercise is similar to the first one:
 Cut to a card.
 Replace the pack on the deck.
 Cut to the same card again.
Things to pay attention to:
 When you try to cut to the card a second time, how quickly does your gut know that you got it right or wrong?
 What does it feel like to "know" that the pack of cards in your hand is not the same size as the first time?
 If you know you got it wrong, do you know if you cut too much or too little? And by how much?
This third exercise is something I've just started playing with recently, and it's a mildly superhuman feat to get down right, if I say so myself.
 1. Name a card. Any playing card.
 2. Riffle through the cards, watching the corners (where the number and suits are) flip towards you, and look for the card you named.
 3. Using the information from step 2, cut to where you saw the card.
This is difficult. It's partially an estimation task because you need to know approximately where in the deck you saw the card, e.g. halfway through, at the end, etc. To start, you can go slow, such that you can see each card as it slips off your thumb.
This gets harder the faster you riffle through the cards. To ramp up the difficulty, riffle faster, such that you can only get a fractional peek at each card's corner.
Things to pay attention to:
 How does your visual experience of watching the cards differ when you aren't looking for a particular card vs when you are? Does anything jump out at you? Are there false positives?
 How many riffles through the deck does it take for you to glimpse the card? (I don't always see it on the first pass through the deck myself.)
 What is the visual experience of trying to differentiate between two similar cards (e.g. Ace of Hearts vs Ace of Diamonds)?
I think the above three exercises are fun learning experiences and a way to check in with your gut feelings through a medium many people may not have tried before. With enough practice, you can hit very high levels of accuracy on these tasks, despite them seeming just a little impossible.
If you do decide to give these a go, let me know how it turns out!
Edit Nickname
How can I change my nickname?
I logged in with Google, but I don't want my full name to be displayed. I looked in the edit account section but couldn't find where to edit the nickname.
Thanks.
Discuss
Verification and Transparency
Epistemic status: I’ve thought about this topic in general for a while, and recently spent half an hour thinking about it in a somewhat focussed way.
Verification and transparency are two kinds of things you can do to or with a software system. Verification is where you use a program to prove whether or not a system of interest has a property of interest. Transparency is where you use tools to make it clear how the software system works. I claim that these are intimately related.
Examples of verification
 Proving that an alleged compiler actually implements the desired semantics of a system (for example, this verified implementation of ML).
 Proving that a neural network’s classifications of a set of possible inputs are invariant under small perturbations to those inputs (for example, the system described in this paper).
Examples of transparency
 Sharing the source code of a program, rather than just compiled machine code (as encouraged by the open-source software movement).
 Demonstrating the types of inputs that neurons in a neural network are sensitive to (techniques like this are discussed in the fantastic Building Blocks of Interpretability blog post).
Apart from aesthetic cases, the purpose of transparency is to make the system transparent to some audience, so that members of that audience can learn about the system, and have that knowledge be intimately and necessarily entangled with the actual facts about the system. In other words, the purpose is to allow the users to verify certain properties of the system. As such, you might wonder why typical transparency methods look different than typical verification methods, which also have as a purpose allowing users to verify certain properties of a system.
How verification and transparency are different
Verification systems typically work by having a user specify a proposition to be verified, and then attempting to prove or disprove it. Transparency systems, on the other hand, provide an artefact that makes it easier to prove or disprove many properties of interest. It’s also the case that engagement with the ‘transparency artefact’ need not take the form of coming up with a proposition and then attempting to prove or disprove it: one may well instead interleave proving steps and specification steps, by looking at the artefact, having interesting lemmas come to mind, verifying those, which then inspire more lemmas, and so on.
Intermediate things
Thinking about this made me realise that many sorts of things serve both verification and transparency purposes. Examples:
 Type signatures in a strongly typed language can be seen as a method of ensuring that the compiler proves that certain errors cannot occur, while also giving a human reading the program a better sense of what various functions do.
 A mathematics textbook containing a large number of theorems, lemmas, and proofs is made by proving a large number of propositions, and allows a reader to gain an understanding of the relevant mathematical objects by perusing the theorems and lemmas, as well as by looking at the structure of the proofs.
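To make the type-signature example concrete, here is a minimal sketch in Python. It assumes a static checker such as mypy is run over the file (Python itself does not enforce annotations at runtime), and the function `mean` is a hypothetical illustration, not something from the post. The signature does double duty: a checker can prove that callers never pass, say, a string, and a human reader learns what the function consumes and returns without reading the body.

```python
from typing import Sequence


def mean(values: Sequence[float]) -> float:
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # The annotation tells a static checker (verification) and the reader
    # (transparency) that `values` is a sequence of floats.
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


result = mean([1.0, 2.0, 3.0])
print(result)
```

A call like `mean("abc")` would still run into an error at runtime, but a static checker would be expected to flag it before the program is ever run.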
Discuss
Why do humans not have built-in neural i/o channels?
Communication between organisms of the same species is often beneficial, for a variety of reasons: sharing information, signalling, bonding, etc. Yet currently the most advanced form of communication to have evolved, human language, is still very low bandwidth compared with the amount of mental processing our brains do.
It seems conceivable that our nervous systems might have evolved ways to directly (temporarily) interface with each other and exchange a large amount of information. For example, retractable bundles of neurons that are specialised at quickly forming connections with their counterpart neurons in conspecifics.
1. What is the main reason that this has not happened in any large animals? If evolution had continued without humans "taking off", would we eventually have seen such neural links in some animal species?
2. What's the closest thing to this we see in any species?
Discuss
Toy model piece #2: Combining short and long range partial preferences
I'm working towards a toy model that will illustrate all the steps in the research agenda. It will start with some algorithmic stand-in for the "human", and proceed to create the $U_H$, following all the steps in that research agenda. So I'll be posting a series of "toy model pieces", which will ultimately be combined into a full toy model. Along the way, I hope to get a better understanding of how to do the research agenda in practice, and maybe even modify that agenda based on insights from making the toy model.
For this post, I'll look in more detail into how to combine different types of (partial) preferences.
Short-distance, long-distance, and other preferences
I normally use population ethics as my go-to example for a tension between different types of preferences. You can get a lot of mileage by contrasting the repugnance of the repugnant conclusion with the seeming intuitiveness of the mere addition argument.
However, many people who read this will have strong opinions about population ethics, or at least some opinions. Since I'm not trying to convince anyone of my particular population ethics here, I thought it best to shift to another setting where we could see similar tensions at work, without the baggage.
Living in a world of smiles
Suppose you have two somewhat contradictory ethical intuitions. Or rather, in the formulation of my research agenda, two somewhat contradictory partial preferences.
The first is that any world would be better if people smiled more ($P_1$). The second is that if almost everyone smiles all the time, it gets really creepy ($P_2$).
Now, the proper way of resolving those preferences is to appeal to meta-preferences, or to cut them up into their web of connotations: why do we value smiles? Is it because people are happy? Why do we find universal smiling creepy? Is it because we fear that something unnatural is making them smile that way? That's the proper way of resolving those preferences.
However, let's pretend there are no meta-preferences, and no connotations, and just try to combine the preferences as given.
Smiles and worlds
Fix the population to a hundred people, and let $W$ be the set of worlds. This set will contain one hundred and one different worlds, described by $w(n)$, where $0 \le n \le 100$ is an integer, denoting the number of people smiling in these worlds.
We can formalise the preferences as follows:
 $P_1 = \{\, w(n) \le_1 w(m) \mid n \le m \,\}$.
 $P_2 = \{\, w(n) \le_2 w(m) \mid n \ge 95 \text{ and } n \ge m \,\}$.
These give rise to the following utility functions (for simplicity of the formula, I've translated the definition of $U_2$; translations don't matter when combining utilities; I've also written $U_i(w(n))$ as $U_i(n)$):
 $U_1(n) = 2n - 100$.
 $U_2(n) = 2 \times \min(94 - n, 0)$.
But before being combined, these preferences have to be normalised. There are multiple ways we could do this, and I'll somewhat arbitrarily choose the "mean-max" method, which normalises the utility difference between the top world and the average world[1].
Given that normalisation, we have:
 $\lVert U_1 \rVert_{\text{mema}} = 100 - 0 = 100$.
 $\lVert U_2 \rVert_{\text{mema}} = 0 - (-42/101) = 42/101 \approx 0.42$.
Thus we send the $U_i$ to their normalised counterparts:
 $U_1(n) \to \hat{U}_1(n) = n/50 - 1$.
 $U_2(n) \to \hat{U}_2(n) = \frac{101}{21} \min(94 - n, 0)$.
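The norms above can be checked by direct enumeration of the 101 worlds. The following is a minimal sketch, not the author's code: the helper names `mean_max_norm` and `normalise` are hypothetical, and exact fractions are used so that the results can be compared with the closed forms in the text.

```python
from fractions import Fraction

N = 100  # population size; the worlds are w(0), ..., w(100)


def u1(n):
    # U1 from the text: more smiles is always better.
    return Fraction(2 * n - 100)


def u2(n):
    # U2 from the text: only worlds with n >= 95 are penalised.
    return Fraction(2 * min(94 - n, 0))


def mean_max_norm(u):
    # Mean-max norm: utility of the best world minus the average utility.
    values = [u(n) for n in range(N + 1)]
    return max(values) - sum(values) / len(values)


def normalise(u):
    # Rescale a utility so that its mean-max norm becomes 1.
    norm = mean_max_norm(u)
    return lambda n: u(n) / norm


norm1 = mean_max_norm(u1)
norm2 = mean_max_norm(u2)
u1_hat = normalise(u1)
u2_hat = normalise(u2)

print(norm1, norm2)
print(u1_hat(75), u2_hat(100))
```

Dividing by the norm reproduces $n/50 - 1$ and $\frac{101}{21}\min(94 - n, 0)$ at every world, which is what the assertions in a quick test would compare against.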
Now consider what happens when we do the weighted sum of these utilities, weighted by the intensity of the human feeling on the subject:
 $U = w_1 \hat{U}_1 + w_2 \hat{U}_2$.
If the weights $w_1$ and $w_2$ are equal, we get the following, where the utility of the world grows slowly with the number of smiles, until it reaches the maximum at $n = 94$ and then drops precipitously:
Thus $U_1$ is dominant most of the time when comparing worlds, but $U_2$ is very strong on the few worlds it really wants to avoid.
But what if $U_2$ (a seemingly odd choice) is weighted less than $U_1$ (a more "natural" choice)?
Well, setting $w_1 = 1$ for the moment, if $w_2 = 21/5050$, then the utility for all worlds with $n \ge 94$ is the same:
Thus if $w_2 > 21/5050$, $\hat{U}_2$ will force the optimal $n$ to be $n \le 94$ (and $\hat{U}_1$ will select $n = 94$ from these options). If $w_2 < 21/5050$, then $\hat{U}_1$ will dominate completely, setting $n = 100$.
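The threshold behaviour can be checked numerically. This is a small sketch under the definitions given in the text; the function names `combined` and `optimal_worlds` are hypothetical, and the weights tried above and below the threshold are arbitrary illustrative choices.

```python
from fractions import Fraction


def u1_hat(n):
    # Normalised U1 from the text: n/50 - 1.
    return Fraction(n, 50) - 1


def u2_hat(n):
    # Normalised U2 from the text: (101/21) * min(94 - n, 0).
    return Fraction(101, 21) * min(94 - n, 0)


def combined(n, w1, w2):
    # Weighted sum U = w1 * U1_hat + w2 * U2_hat.
    return w1 * u1_hat(n) + w2 * u2_hat(n)


def optimal_worlds(w1, w2):
    # All n in 0..100 that maximise the combined utility.
    values = {n: combined(n, w1, w2) for n in range(101)}
    best = max(values.values())
    return [n for n, v in values.items() if v == best]


threshold = Fraction(21, 5050)

equal_weights = optimal_worlds(1, 1)
at_threshold = optimal_worlds(1, threshold)
above = optimal_worlds(1, 2 * threshold)
below = optimal_worlds(1, threshold / 2)

print(equal_weights, at_threshold, above, below)
```

At the threshold the slope of the combined utility for $n \ge 94$ is $\frac{1}{50} - \frac{21}{5050} \cdot \frac{101}{21} = 0$, which is why every world from $n = 94$ to $n = 100$ ties.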
This seems like it could be extended to solve population ethics considerations in various ways (where $U_1$ might be total utilitarianism, with $U_2$ average utilitarianism or just a dislike of worlds with everyone at very low utility). To go back to my old post about differential versus integral ethics, $U_1$ is a differential constraint, $U_2$ is an integral one, and $n = 94$ is the compromise point between them.
Inverting the utilities
If we invert the utilities, things behave differently. Suppose we had $-U_1$ (smiles are bad) and $-U_2$ (only lots of smiles are good) instead[2]. In mean-max, the norms of these would be:
 $\lVert -U_1 \rVert_{\text{mema}} = 100 - 0 = 100$.
 $\lVert -U_2 \rVert_{\text{mema}} = 12 - (42/101) = 1170/101 \approx 11.58$.
So the normalised version of $-U_1$ is just $-\hat{U}_1$, but the normalised version of $-U_2$ is different from $-\hat{U}_2$.
Then, at equal weights, we get the following graph for $U$:
Thus $-U_2$ fails at having any influence, and $n = 0$ is optimum.
To get the break-even point, we need $w_2 = 585/303$, where $n = 0$ and $n = 100$ are equally valued:
For $w_2$ greater than that, $-U_2$ dominates completely, and forces $n = 100$.
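The inverted case can be checked the same way. This is a sketch assuming $w_1 = 1$ throughout, with hypothetical helper names; it recomputes the two norms by enumeration and then looks at which worlds are optimal at equal weights, at the break-even weight, and above it (the value $w_2 = 2$ is an arbitrary choice).

```python
from fractions import Fraction

WORLDS = range(101)  # n = 0, ..., 100 smiling people


def neg_u1(n):
    # -U1: smiles are bad.
    return Fraction(100 - 2 * n)


def neg_u2(n):
    # -U2: only worlds with lots of smiles (n >= 95) are good.
    return Fraction(2 * max(n - 94, 0))


def mean_max_norm(u):
    # Utility of the best world minus the average utility.
    values = [u(n) for n in WORLDS]
    return max(values) - sum(values) / len(values)


norm1 = mean_max_norm(neg_u1)
norm2 = mean_max_norm(neg_u2)


def combined(n, w2):
    # Normalised -U1 (weight 1) plus w2 times normalised -U2.
    return neg_u1(n) / norm1 + w2 * neg_u2(n) / norm2


def optimal_worlds(w2):
    values = {n: combined(n, w2) for n in WORLDS}
    best = max(values.values())
    return [n for n, v in values.items() if v == best]


break_even = Fraction(585, 303)
equal_weights = optimal_worlds(1)
tie = optimal_worlds(break_even)
above = optimal_worlds(2)

print(norm1, norm2, equal_weights, tie, above)
```

Unlike the first case, the optimum jumps straight from one end of the range to the other as $w_2$ crosses the break-even value, with no intermediate compromise world.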
It's clear that $U_1$ and $U_2$ are less "antagonistic" than $-U_1$ and $-U_2$ are (compare the single peak in the graph in the first case, with the two peaks in the second).
[1] Why choose the mean-max normalisation? Well, it seems to have a number of nice properties. It has some nice formal properties, as the intertheoretic utility comparison post demonstrates. But it also, to some extent, boosts utility functions to the extent that they do not interfere much with other functions.
What do I mean by this? Well, consider two utility functions over $n + 1$ different worlds. The first one, $V_1$, ranks one world ($W_1$) as above all others (the other ones being equal). The second one, $V_2$, ranks one world ($W_2$) as below all others (the other ones being equal).
Under the mean-max normalisation, $V_1(W_1) = 1$ and $V_1(W) = -1/n$ for other $W$. Under the same normalisation, $V_2(W_2) = -n$ while $V_2(W) = 1$ for other $W$.
Thus $V_2$ has a much wider "spread" than $V_1$, meaning that, in a normalised sum of utilities, $V_2$ affects the outcome much more strongly than $V_1$ ("outcome" meaning the outcome of maximising the summed utility). This is acceptable, even desirable: $V_2$ dominating the outcome just rules out one universe ($W_2$), while $V_1$ dominating the outcome rules out all-but-one universe ($W_1$). So, in a sense, their ability to focus the outcome is comparable: $V_1$ almost never focuses the outcome, but when it does, it narrows down to a single universe, while $V_2$ almost always focuses the outcome, but barely narrows it down. ↩︎
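The normalised values quoted in this footnote can be reproduced with a short sketch. It assumes the utilities are translated to have mean zero (translations do not change preferences) and then scaled so that the best world sits exactly 1 above the mean; the helper name and the choice $n = 10$ are illustrative only.

```python
from fractions import Fraction


def mean_max_normalise(values):
    # Translate to mean zero, then scale so that max - mean equals 1.
    mean = Fraction(sum(values), len(values))
    scale = max(values) - mean
    return [(v - mean) / scale for v in values]


n = 10  # n + 1 worlds in total

# V1: the first world (W1) is above all others, which are tied.
v1 = mean_max_normalise([1] + [0] * n)
# V2: the first world (W2) is below all others, which are tied.
v2 = mean_max_normalise([0] + [1] * n)

spread1 = max(v1) - min(v1)
spread2 = max(v2) - min(v2)

print(v1[0], v1[1], v2[0], v2[1])
print(spread1, spread2)
```

For $n = 10$ the spread of $V_1$ is $1 + 1/n = 11/10$ while the spread of $V_2$ is $1 + n = 11$, which is the asymmetry the footnote describes.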
[2] There is no point having the pairs being $(U_1, -U_2)$ or $(-U_1, U_2)$, since those pairs agree on the ordering of the worlds, up to ties. ↩︎
Discuss