# LessWrong.com News

*Address:* https://www.lesswrong.com/

*Updated:* 22 minutes 24 seconds ago

### Twenty-four AI alignment research project definitions

I came up with these research project definitions when I read the iterated amplification sequence. Last year I put five of them up for voting (see Which of these five AI alignment research projects ideas are no good?) and chose no. 23 to work on (see IDA with RL and overseer failures). But I didn't think of publishing all of them until JJ Hepburn gave me the idea that they might be useful to others.

The definitions follow the format recommended in The Craft of Research: ‘I'm studying X, because I want to find out Y, so that I (and you) can better understand Z.’ The quality of language and content varies.

#### An unaligned benchmark

1. I'm studying mild optimization, because I want to understand what its problems and limitations are, because I want to decide whether or not to work on one of them, because well-functioning mild optimization appears to be close to what humans and organisations safely do today, so basing an AI on it might have useful outcomes.
   - This is more a study project than a research project.
2. I'm studying how image classifiers can recognize new types of images, because I want to find out how to detect out-of-distribution inputs, in order to help my reader understand how to make ML systems that don't confidently make wrong predictions.

#### Approval-directed agents/bootstrapping

3. I'm studying Bayesian machine learning, because I want to understand how to make ML systems that notice when they are confused, in order to help my reader understand how to make ML systems that will ask the overseer for input when doing otherwise would lead to failure.
4. I'm studying possible structures of approval-directed agents, because I want to understand how much human thought and input they would require, in order to help my reader understand whether approval-directed agents are feasible.

#### Humans Consulting HCH

5. I'm studying the articles linked from Humans Consulting HCH, because I want to understand the section ‘Hope’ from Humans Consulting HCH, in order to be able to think about the tension between the system being capable and reflecting the human's judgment.

#### Corrigibility (Christiano 2017)

6. I'm studying Omohundro's preferences-about-your-utility-function case, because I want to implement it, in order to help my reader understand how it can be implemented and whether it is unstable.
7. I'm studying the change in prediction failures as a predictor becomes stronger, because I want to find out whether the failures not only become fewer, but also harder to detect over time, in order to help my reader understand whether systems built on such predictors will remain corrigible as they become stronger.

#### Iterated Distillation and Amplification

8. I'm studying the capability and safety-relevant properties of imitation learning, because I want to find out whether it can produce aligned agents, in order to help my reader understand how to even get to the base case of iterated distillation and amplification.

#### Benign model-free RL

9. I'm studying the performance of benign model-free RL, because I want to know whether the claim that it will achieve state-of-the-art performance is true, in order to help my reader understand how useful benign model-free RL will be.
10. I'm studying reward learning and robustness, because I want to know whether they can achieve competitive agents without malign behaviour, in order to help my reader understand whether benign model-free RL will be possible.
11. I'm implementing (parts of) benign model-free RL, because I want to know what works and doesn't work in practice, in order to help my reader understand which parts of the scheme need further conceptual research.

#### Supervising strong learners by amplifying weak experts

12. I'm studying ways to improve the sample efficiency of a supervised learner, because I want to know how to reduce the number of calls to H in CSASupAmp, in order to help my reader understand how we can adapt that proof-of-concept for solving real world tasks that require even more training data.
13. I'm studying the effects of how CSASupAmp samples questions, because I want to know how to sample questions in a way that improves the scheme's learning performance, in order to help my reader understand how we can adapt that proof-of-concept for solving real world tasks that require even more training data.

#### Machine Learning Projects for Iterated Distillation and Amplification

14. ❇ Any of the projects there. At first glance Adaptive Computation is most interesting, but perhaps also requires most studying. I would ask Owain to find out what people are already working on or what has received most interest, then work on the least crowded one.

#### Directions and desiderata for AI alignment

15. I'm studying integrating models of known human heuristics and biases into IRL systems, because I want to improve the performance of IRL in a domain where it is hindered by the discrepancies between the existing error models and actual human irrationality, in order to help my readers understand how to get IRL systems to infer true human values despite Stuart Armstrong's impossibility result.

#### The reward engineering problem

16. I'm experimenting with semi-supervised reinforcement learning, because I want to find out how humans can supervise machine learning with reasonably small effort, in order to help my reader understand how to avoid optimizing proxy objectives that we have to use because the sample hunger of current ML algorithms is so great.
17. I'm studying the use of a discriminator in imitation learning, because I want to find out how to help humans produce demonstrations that the agent can imitate, in order to help my reader understand how we might use imitation learning to solve the reward engineering problem.

#### Capability amplification (Christiano 2016)

18. I'm studying cognitive tasks and how to decompose them into ever simpler steps, because I want to find algorithms for capability amplification, in order to help my reader understand the nature of obstacles to capability amplification.
    - What is the obstacle? How exactly is it an obstacle? Make it simple.

#### Learning with catastrophes

19. ❇ I'm studying the development of the performance of an adversary in adversarial training, because I want to find out whether the adversary gets worse as the primary agent becomes more robust, or whether the adversary gets traction at all when the primary agent is already quite robust, in order to help my reader understand how confident we could be in a red team to find all relevant catastrophic situations.

#### Thoughts on reward engineering

20. I'm studying the effects of importance sampling on the behaviour that an RL agent learns, because I want to find out whether it can lead to undesirable outcomes, in order to help my reader understand whether importance sampling can solve the problem of widely varying rewards in reward engineering.
21. I'm studying the effects of an inconsistent comparison function on optimizing with comparisons, because I want to know whether it prevents the two agents from converging on a desirable equilibrium quickly enough, in order to help my reader understand whether optimizing with comparisons can solve the problem of inconsistency and unreliability in reward engineering.

#### Techniques for optimizing worst-case performance

22. ❇ I'm studying transparency in the service of adversarial training (using transparency to ease finding adversarial examples, or to detect adversarial success earlier/more often), because I want to make the adversary ten times more effective, in order to help my reader understand how to build ML systems that never fail catastrophically.

#### Reliability amplification

23. I'm studying the impact of overseer failure on RL-based IDA, because I want to know under what conditions the amplification increases or decreases the failure rate, in order to help my reader understand whether we need to combine capability amplification with explicit reliability amplification in all cases.

#### Security amplification

24. I'm studying adversarial examples for meta-execution with ML-based sub-agents, because I want to find a case where security amplification by meta-execution fails to amplify security, in order to help my reader understand what obstructions to security amplification there are.

#### Meta-execution

No project idea from the article directly. Ought has put forth some open issues (cf. https://docs.google.com/document/d/1xzFuDD1xiG-oe750MYrP9PEgwEXtxbfSBnHrgrCRnhY/edit#), but that might be outdated and would be too closely tied to Ought.

### Absent coordination, future technology will cause human extinction

(Crossposted from Medium and my blog)

Nick Bostrom, of the Future of Humanity Institute, uses an evocative metaphor to describe the future of humanity’s technological development:

*One way of looking at human creativity is as a process of pulling balls out of a giant urn. The balls represent possible ideas, discoveries, technological inventions. Over the course of history, we have extracted a great many balls — mostly white (beneficial) but also various shades of grey (moderately harmful ones and mixed blessings).*

*What we haven’t extracted, so far, is a black ball: a technology that invariably or by default destroys the civilization that invents it. The reason is not that we have been particularly careful or wise in our technology policy. We have just been lucky.*

The atom bomb, together with the long-range bomber, marked the first time a small group of people could destroy dozens of cities in a matter of hours. The physicists who worked on the bomb knew that this invention had the capacity to threaten human civilization with unprecedented destruction. They built it out of fear that if they did not, an enemy state like Nazi Germany would build it first. With this development, the destructive power of humanity increased by several orders of magnitude.

History barely had time to catch its breath before many of these same physicists created a new type of nuclear bomb, the hydrogen bomb, that was hundreds of times more powerful than the atom bomb. They did it for the same reasons, out of fear that a rival state would build it first. The Soviet Union used the same justification to build their biological weapons program during the Cold War, producing large quantities of anthrax, plague, smallpox, and other biological weaponry. As far as we know, the United States did not have a comparably large program, but the fear that the US might have one was sufficient to motivate the Soviet leadership. Examples like this are not exceptions; they are the norm.

It’s clear from the history of warfare that the fear of a rival getting a technology first is sufficient to motivate the creation of purely destructive technology, including those that risk massive blowback from radiation, disease, or direct retaliation. This desire to get there first is not the only incentive to develop civilization-threatening technology, but it is the one that seems to drive people to take the most risks at a civilizational level.

Even when there is no perceived threat, the other motivations for technological innovation — profit, prestige, altruism, etc. — drive us to create new things. For most technologies this is good, and has enabled most of the progress of human civilization. The problem only arises when our technology becomes powerful enough to threaten civilization itself. While innovation is hard, it’s even more difficult to anticipate potentially dangerous innovations and prevent their creation. It’s made more difficult by the lack of personal incentive. We all know the names of famous inventors, but have you ever heard of a famous risk analyst who successfully prevented the development of a dangerous technology? I doubt it.

Still, while long-term trends favor aggressive tech development, there are controls in place which slow the development of known dangerous technologies. The Non-Proliferation Treaty, the Biological Weapons Convention, and other efforts put pressure on states not to build new nuclear, chemical, or biological weapons, with variable success. Within their own borders, most countries create and enforce laws forbidding private citizens from researching or building weapons of mass destruction.

Some disincentives for dangerous technology development are cultural. The Asilomar Conference on Recombinant DNA in 1975 was an impressive effort by biologists to make sure their field did not create dangerous new kinds of organisms. A strong safety culture can lead to a reduction in accidents and an inclination towards safe exploration, though it’s not always clear how to create such a culture.

In the private sector, companies balance the benefits of “moving fast and breaking things” with the negative PR that comes from developing safety-critical tech without adequate safeguards. After one of Uber’s driverless cars killed a woman in 2018, the details released afterward revealed Uber’s poor safety practices. We can hope that the backlash from such accidents will incentivize a safer exploration of these technologies in the future. Other AI companies appear more inclined to invest in safety and robustness measures upfront. Both DeepMind and OpenAI, two leaders in deep learning research, have dedicated safety teams that focus on minimizing negative externalities from the technology. Whether such measures will be sufficient to curtail dangerous methods of exploration of powerful AI systems remains to be seen.

A limitation of current regulations is that they focus on technologies known to pose a risk to human life. Entirely new innovations or obscure technologies receive far less attention. If a particle accelerator might inadvertently set off a chain reaction that destroys all life, only the internal safety processes and self-preservation instincts of the people running it prevent them from taking risks on behalf of everyone. There is no international process for ensuring a new technology won’t end up being a black marble.

Avoiding black marbles is both a coordination problem and an uncertainty problem. In the long term, it’s not enough to get 90% of creators to refrain from exploring dangerous territory. Absent strong coordination mechanisms, future technological development suffers from the unilateralist’s curse. Even if no individual creator has ill intent, those who believe their technologies present little risk will forge ahead, biasing the overall population of creators towards unsafe exploration.
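To see why 90% restraint is not enough, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that each of `n_creators` creators independently decides to go ahead with the same probability `p_proceed` (0.10 here, standing in for the 10% who do not refrain). The independence assumption, the function name, and the parameter names are hypothetical simplifications, not part of the original argument or of any formal treatment of the unilateralist's curse.

```python
def prob_someone_proceeds(n_creators: int, p_proceed: float) -> float:
    """Chance that at least one of n independent creators goes ahead.

    Illustrative assumption: every creator proceeds independently with
    the same probability p_proceed.
    """
    if n_creators < 0:
        raise ValueError("n_creators must be non-negative")
    if not 0.0 <= p_proceed <= 1.0:
        raise ValueError("p_proceed must be between 0 and 1")
    # P(at least one proceeds) = 1 - P(everyone refrains)
    return 1.0 - (1.0 - p_proceed) ** n_creators


if __name__ == "__main__":
    # Each creator refrains 90% of the time, i.e. proceeds 10% of the time.
    for n in (1, 10, 50, 100):
        print(f"{n:>3} creators: {prob_someone_proceeds(n, 0.10):.4f}")
```

Under these assumptions, ten creators already make it more likely than not that someone proceeds (about 65%), and with fifty the chance exceeds 99%. Individual restraint does not add up to collective safety without some mechanism that binds everyone.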

While international coordination is necessary to prevent black marble scenarios, it is not sufficient. In some cases, it will not be easy to tell in advance which technologies will prove dangerous. Even if every country in the world agreed to share intelligence about black-marble technological threats and enforce international laws about their use, there is no guarantee a black marble would not be pulled out by accident.

There are no off-the-shelf solutions to international governance problems. We’re in new territory, and new social technology is required. It’s not clear how to design institutions which can incentivize rigorous risk analysis, the right kinds of caution, and quick responses to potentially dangerous developments. Nor is it clear how world powers can be brought to the negotiating table with the mandate to create the necessary frameworks. What is clear is that something new is necessary.

In 1946, physicists from the Manhattan Project brought together a group of prominent scientists and wrote a collection of essays about the implications of nuclear technology for the future. The volume was called One World or None. In the foreword, Niels Bohr wrote the following, and though much has changed in the intervening decades, the message is as relevant in 2020 as it was in 1946. For context, Bohr is writing about the need for international governance to prevent nuclear war.

*Such measures will, of course, demand the abolition of barriers hitherto considered necessary to safeguard national interests but now standing in the way of common security against unprecedented dangers. Certainly the handling of the precarious situation will demand the goodwill of all nations, but it must be recognized that we are dealing with what is potentially a deadly challenge to civilization itself. A better background for meeting such a situation could hardly be imagined than the earnest desire to seek a firm foundation for world security, so unanimously expressed by all those nations which only through united efforts have been able to defend elementary human rights. The extent of the contribution that agreement on this vital matter would make to the removal of obstacles to mutual confidence, and to the promotion of a harmonious relationship between nations can hardly be exaggerated.*

Bohr believed that the threat posed by nuclear weapons was a game changer, and that strong international cooperation was the only solution. Without international control of the bomb, Bohr and other scientists believed that a global nuclear war was inevitable. Fortunately, this hasn’t come to pass, and while this is partially due to luck, it’s also due to the degree of coordination that has occurred between world powers. Efforts like the Non-Proliferation Treaty have been more successful than many believed possible. It’s been 75 years since Hiroshima, and only nine countries possess nuclear weapons.

The success we’ve had curtailing catastrophic threats has bought us time, but we shouldn’t mistake limited forms of cooperation like the UN Security Council for a global framework capable of addressing existential threats. Many scientists of the 1940s recognized that unprecedented forms of international coordination were necessary to address existentially threatening technologies. 75 years later, we’ve all but lost this ambition, but the threats haven’t gone away. On the contrary, we continue to develop new technologies, and if the process continues we will find a black marble. Absent coordination, future technology will cause human extinction.

On the other hand, with better international systems of cooperation, we can anticipate and avert existential threats. Despite the fears of the Cold War and many close calls, the world came through without a single nuclear weapon used against another nation. Designing international frameworks up to the tasks of coordination and global risk analysis will be difficult. It will be even more difficult to get existing leaders and stakeholders on board. It’s tempting to throw up our hands and call this kind of effort impossible, but I believe this would be a mistake. Despite the difficulty, real international coordination is still the best chance humanity has to avoid extinction.

### Long Now, and Culture vs Artifacts

I've recently gotten re-interested in The Long Now Foundation, and had a conversation about it that seemed worth writing down.

This doesn't have a central thesis, just covering some interesting highlights.

**Absorbing vs Evangelizing**

The last time I paid attention to Long Now, I was in a fairly "evangelical rationalist" phase and also specifically trying to network and pitch Secular Solstice to them as a "longtermist holiday."

It didn't go well – partly because I wasn't very good at networking, partly because people usually (correctly) have immune-responses against network-and-pitching. I think those two things were sufficient to explain the networking "not seeming worth it," although if I had been savvier I expect I'd still have run into them having different values and epistemic outlooks from me.

It occurred to me recently that I'm a much more chill person than I was 2 years ago and I'd probably get more value out of visiting Long Now events than I would have then, since I'd be more oriented around "what good things can I absorb here?" rather than "how can I pitch them on stuff and extract resources from them?"

(I haven't yet tried doing this though)

**Longtermism vs Pivotal Generationism**

I love the Long Now Foundation's aesthetic, but felt sort of frustrated by them not seeming to 'get x-risk' or things like that. But, their actions might make more sense if you get that they are Longtermist, without being Pivotal Generation-ist.

(Separately, I think they don't think the future will get "So Weird As To Invalidate Everything", i.e. converting the world to computronium. Although they might just not think that's tractable to plan for.)

**Culture is Medium Term, Artifacts are Long Term.**

The conversation explored Longtermism vs Pivotal Generationism a bit. But, the most interesting takeaway for me was: If you are acting on 10,000 year timescales, and you don't think your generation is particularly special, *culture is not a very effective way to steer the future. Cultures usually last hundreds of years. Physical artifacts are much more reliable ways to affect people 10,000 years from now.*

I had been thinking of "shaping culture" as a way to interact with the longterm future. But, at least historically (setting aside for now Very Weird Futures), cultures tend to act on the *medium term*, a few hundred years at most.

The Long Now Foundation *does* also do cultural work (in addition to their meetups and TED Talk-esque presentations, they push Long Bets and Predictions which seem pretty valuable to me). So I don't know that the "culture is medium term" hypothesis is that salient to them. But, it still was an interesting update for me.

**When *is* culture enduring?**

Some cultures last for thousands of years. Others do not. Can you predict ahead of time which is which? If your *goal* is affecting things 1,000 or 10,000 years from now, is culture a viable way of doing that? Or is endurance just a selection effect?

Lots of cultures aim to be "generally enduring" but still seem to have changed radically.

My impression is that Christianity succeeded somewhat intentionally, but it wasn't doing things that were strategically novel (beyond what many religions/cultures/civilizations do re: indoctrination and institution-building). My off-the-cuff guess is "they were quite competent, but they endured where other competent cultures fizzled mostly due to luck".

My impression is that Judaism and Confucianism both succeeded more intentionally and predictably at enduring thousands of years. (I haven't done anything like an unbiased survey of cultures, but they both stand out among cultures I've heard about)

*Judaism*

Judaism seems to have a scholastic culture that is a cleverly constructed trap: smart people are *encouraged* to argue and doubt, but in a way that still ultimately circles back towards believing and identifying with the culture. So the sort of people who are most likely to think of ways to change the system instead have a framework that keeps any innovations within the context of the system.

See also this slatestarscratchpad (note: requires tumblr login).

*Confucianism*

My understanding of Confucianism comes from Legal Systems Very Different from Ours, and this review of the book Little Soldiers.

In his review of Little Soldiers, Dormin notes:

> Due to some combination of climate, food availability, culture, and maybe governance, China has historically been able to push its population closer to the Malthusian limit than any other region on earth. This has led to China always having massive populations and wealth, but also being prone to population collapses. It's known for dynastic cycles where ruling families united most or all of China for extended periods of growth, then fell into stagnation and collapsed during cataclysmic eras of contraction.
>
> This reality has encouraged Chinese culture to favor stability as a primary aim.
>
> Confucius was a 6th century BC bureaucrat whose teachings stand as an explicit codification of how Chinese society should order itself to encourage stability. Essentially, Confucius envisioned all of society as an integrated family unit. At each level, the parents protect, foster, and teach children. To reciprocate, children honor and respect their parents. Once parents are too old to provide, they become dependents to be taken care of by their children. The multitude of obligations within this network is known as *familial piety*.
>
> Purely on the family-level, Confucius’s model is not too different from what we might find anywhere else in the world. But he introduces two major innovations to the formula.
>
> First, there is an extraordinarily strong assumption of *obligation* in the family. Families are the fundamental societal unit in China, *not* individuals. All (normal) individuals are ultimately loyal to their parents above all else, including their children and siblings, and *especially* over their spouses. From the moment an individual is born, until the moment his last parent dies, he is expected to be in their service.
>
> Even in the modern-day, strong familial piety is the norm in China. Most parents have direct control over all important aspects of their children’s lives. This control will lessen once a child is married, but will still remain strong until the parents die. From my personal observations, these controlled categories will include where a child goes to college, what he studies, where he works, who his friends are, who he marries, and where he lives (with his parents until marriage).
>
> Confucius’s second big innovation is that the family construct is abstracted to all of society. The government is parent to the citizen-children. Companies are parents to employee-children. And of course, schools are parents to student-children. At every level of society, there is a system of mutual obligation based on an exchange of nurturing protection for subservience.
>
> To Confucius, this structure was the only way to keep Chinese society stable at the high end of the Malthusian trap. If children were free to disobey their parents, citizens free to disobey their governments, apprentices free to disobey their masters, etc, then Chinese society would be pulled apart and chaos would reign. Only strict social norms enforced by authoritarian measures could keep China strong.

How do you enforce this? One way is a long history of a particular kind of standardized testing. Legal Systems Very Different from Ours notes:

> In the early part of the final dynasty, there were about half a million licentiates out of a population of several hundred million, only about 18,000 people who had reached the next level. The provincial exam that separated the two groups had a pass rate of about one percent. It was offered every three years and could be, and often was, taken multiple times.
>
> The metropolitan exam produced 200 to 300 degrees from as many as 8000 candidates each time it was given. While a few unusually talented candidates made it through before they were twenty-five, a majority were in their thirties, some older. The exams did not test administrative ability, knowledge of the law, expertise in solving crimes or other skills with any obvious connection to the job of district magistrate or most of the other jobs for which the exams provided a qualification.
>
> > “The content of the provincial examinations presented an exacting challenge, especially to the novitiate. Its syllabus called for compositions on themes from the four core texts of the Neo-Confucian canon and a further five or more classics, extended dissertations on the classics, history, and contemporary subjects, verse composition, and at various times the ability to write formal administrative statements and dispatches. To be at all hopeful of success, the candidate should have read widely in the extensive historical literature, thoroughly digested the classics, developed a fluent calligraphy, and mastered several poetic styles. Above all he should have mastered the essay style, known as the ‘eight-legged’ essay from its eight-section format, which was the peculiar product of the examination system.” (Watt 1972, 24-25)
>
> This raises an obvious question: Why? Why require the ablest men in the society to spend an extended period of time, often decades, studying to pass the exams instead of applying their skills to running the empire? Why test a set of skills with little obvious connection to the jobs those men were expected to do?
>
> One possible explanation is that the exams were the equivalent of IQ tests, designed to select the most intellectually able (and hardworking) members of the population for government service. But it is hard to believe that there was no less costly way of doing so or no approach along similar lines that would have tested more relevant abilities.
>
> A more interesting explanation focuses on the content of what they were studying: Confucian literature and philosophy. There are two characteristics one would like officials to have. One is the ability to do a good job. The other is the desire to do a good job, instead of lining their pockets with bribes or neglecting public duties in favor of private pleasures. One might interpret the examination system as a massive exercise in indoctrination, training people in a set of beliefs that implied that the job of government officials was to take good care of the people they were set over while being suitably obedient to the people set over them. Those who had fully internalized that way of thinking would be better able to display it in the high-pressure context of the exams.

Indoctrination isn't a novel concept. Indoctrination that emphasizes stability/tradition also isn't a novel concept. But there is something particularly clever about weaving a giant tempting trap for your best and brightest, where the act of participating changes the structure of how they think, in a way that reinforces the trap.

This does put some limits on what *kinds* of cultures you can build to influence the world 10,000 years from now (i.e. you have to focus on stability and self-preservation in order for it to work at all).

Granted, once you re-introduce potentially Very Weird Futures into the mix, more possibilities open up with AIs or Uploads that are carefully constructed to be self-modifiable in some particular ways but not others.

**Unfolding Memeplexes**

My previous thinking on cultural longtermism accepted that there were limits to how self-preserving a culture could be. The hope wasn't that things like Solstice, or the rationality community, would survive unscathed into the future, but that they'd start snowball effects. They'd accumulate a lot of cruft, but hopefully shift trajectories a bit, and preserve some kernels.

Relevant quote from an interview with Linchuan on Secular Solstice:

**Linch:** Where do you see the solstice going in five years?

**Raymond:** Within five years? I want there to be more “major flagship” Solstice events that are heavily promoted, that hundreds of people come to, that are high production value and showcase how beautiful the holiday can be. It’d be neat if they each had slightly different aesthetics. Right now the New York Solstice has a slightly jazzy feel to it, but it’d be great if there were also a big Solstice with more of a classical music feel, or tried other experiments.

Much more importantly - I want there to be many more *smaller* Solstices. This whole thing was inspired by a 20-person Christmas Eve party in a living room. When winter rolls around, I want secular people to think about throwing Solstice parties with fun singalong music that feels *distinctly* humanist, that makes it feel like there is a culture you can be part of.

I want it to be something you can take pieces of, make your own, and invite your closest friends and family to share. Right now, lots of people know that they can throw a solstice party (it’s literally the oldest holiday in the world). But they have to build their tradition from scratch.

The whole point of science is to stand on the shoulders of the people who come before you, and build off their work. This applies to art and culture too. I hope people are able to use existing Solstice traditions to build their own experience.

**Linch:** And on a longer timescale? Ten to twenty years?

**Raymond:** In 10-20 years, the question gets more interesting.

Subcultures have one of two things happen to them, in my experience. Either they stay obscure, with only their most devoted fans - who care deeply about the substance - participating. Or, they go mainstream. This sometimes means somewhat “dumbing down”, or warping due to capitalist forces. Look at Christmas - it’s super commercialized. It encourages you to buy the biggest, most amazing Christmas displays and expensive gifts that you can manage. It has little to do with Jesus.

And look at Hannukah, which was blown out of proportion to its original cultural significance, so that it can compete with Christmas.

So I think in twenty years, Secular Solstice will *either* still be fairly obscure, *or* it will have gone mainstream enough that people are making t-shirts and expensive displays. And it’s an open question which is better.

My hope, I think, is for it to go mainstream - as long as it can carry forward a kernel of its core ideas, and the people who care most deeply about it are still celebrating something meaningful. Christmas is over-commercialized, but if you know where to look you can still find quiet midnight masses that respect the core of the holiday.

Solstice actually has an even harder job than Christmas or Hannukah though: it needs to not just preserve a core story - it *needs that story to evolve*, as our understanding of the science or philosophy or morality evolves. And people will be arguing about how they think those things are (or should be) evolving. That’s all healthy, but it makes for a tougher tradition to keep in place.

Solstice doesn’t just need people to preserve one set of traditions. It needs cultural stewards to actively pursue truth, who work to develop songs and stories that reflect our deepening understanding of the nature of reality.

**Possible followup questions**

- When you introduce potentially Very Weird Futures back into the mix, what cultural forces seem promising?
- What kinds of artifacts seem promising? (I do kinda think the 10,000 Year Clock will probably be converted to computronium ten millennia from now, unless Earth-in-particular is preserved)

- Are there counterexamples of cultures that prioritized creativity and didn't disintegrate or get subsumed?
- Science has lasted a couple hundred years at least, fairly intentionally. It's generally hard to evaluate newer stuff.


### Looking for books about software engineering as a field

I work in the software industry but am not a software developer. My job is to write about software development, and I've learned a whole bucketload of terms: stuff like 'linked lists', 'CI/CD', 'performance optimization', 'deploy to AWS', 'dockerize', 'microservices', 'SQL injection', 'multithreaded program', 'vectorized code', and on and on and on. However, a lot of the time I'm basically just Chinese-rooming – I can write about these things, but I don't actually understand how any of them fit together. For example, I've had three people try to explain exactly what an API is to me, for more than two hours total, but I just can't internalize it. I feel that there's some impossible-to-articulate piece I'm missing, and none of the words people say to me about software stuff stick because I'm lacking a foundation on which to build up my understanding.

So my question is, are there any books (or other resources) that explain the field of software engineering as a cohesive whole? I'm not looking for books that will teach me to code, because I don't think that's the thing I want. Feel free to ask clarifying questions. Thanks!


### Category Theory Without The Baggage

*If you are an algebraic abstractologist, this post is probably not for you. Further meta-commentary can be found in the “meta” section, at the bottom of the post.*

So you’ve heard of this thing called “category theory”. Maybe you’ve met some smart people who say that it’s really useful and powerful for… something. Maybe you’ve even cracked open a book or watched some lectures, only to find that the entire subject seems to have been generated by training __GPT-2__ on a mix of algebraic optometry and output from __theproofistrivial.com__.

What is this subject? What could one do with it, other than write opaque math papers?

This introduction is for you.

This post will cover just the bare-bones foundational pieces: categories, functors, and natural transformations. I will mostly eschew the typical presentation; my goal is just to convey intuition for what these things mean. Depending on interest, I may post a few more pieces in this vein, covering e.g. limits, adjunction, Yoneda lemma, symmetric monoidal categories, types and programming, etc - leave a comment if you want to see more.

Outline:

- Category theory is the study of paths in graphs, so I’ll briefly talk about that and highlight some relevant aspects.
- What’s a category? A category is just a graph with some notion of equivalence of paths; we’ll see a few examples.
- Pattern matching: find a sub-category with a particular shape. Matches are called “functors”.
- One sub-category modelling another: commutative squares and natural transformations.

Here’s a graph:

Here are some paths in that graph:

- A -> B
- B -> C
- A -> B -> C
- A -> A
- A -> A -> A (twice around the loop)
- A -> A -> A -> B (twice around the loop, then to B)
- (trivial path - start at D and don’t go anywhere)
- (trivial path - start at A and don’t go anywhere)

In category theory, we usually care more about the edges and paths than the vertices themselves, so let’s give our edges their own names:

We can then write paths like this:

- A -> B is written y
- B -> C is written z
- A -> B -> C is written yz
- A -> A is written x
- A -> A -> A is written xx
- A -> A -> A -> B is written xxy
- The trivial path at D is written id_D (this is roughly a standard notation)
- The trivial path at A is written id_A

We can build longer paths by “composing” shorter paths. For instance, we can compose y (aka A -> B) with z (aka B -> C) to form yz (aka A -> B -> C), or we can compose x with itself to form xx, or we can compose xx with yz to form xxyz. We can compose two paths if-and-only-if the second path starts where the first one ends - we can’t compose x with z because we’d have to magically jump from A to B in the middle.

Composition is asymmetric - composing y with z is fine, but we can’t compose z with y.

Notice that composing id_A with x is just the same as x by itself: if we start at A, don’t go anywhere, and then follow x, then that’s the same as just following x. Similarly, composing x with id_A is just the same as x. Symbolically: id_A x = x id_A = x. Mathematically, id_A is an “identity” - an operation which does nothing; thus the “id” notation.
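The composition rules above can be sketched in a few lines of Python. This is only an illustration: it assumes the graph has edges x: A -> A, y: A -> B and z: B -> C (read off from the paths listed earlier), and the `Path` class and the `edge`/`identity` helpers are names invented here, not standard notation.

```python
from dataclasses import dataclass

# Assumed edges, read off from the listed paths: name -> (start, end).
EDGES = {"x": ("A", "A"), "y": ("A", "B"), "z": ("B", "C")}


@dataclass(frozen=True)
class Path:
    start: str
    end: str
    edges: tuple = ()  # an empty tuple is the trivial (identity) path

    def then(self, other: "Path") -> "Path":
        """Compose two paths: follow self first, then other."""
        if self.end != other.start:
            raise ValueError(f"cannot compose: ends at {self.end}, next starts at {other.start}")
        return Path(self.start, other.end, self.edges + other.edges)


def edge(name: str) -> Path:
    """The one-edge path for a named edge."""
    start, end = EDGES[name]
    return Path(start, end, (name,))


def identity(vertex: str) -> Path:
    """The trivial path that starts at a vertex and goes nowhere."""
    return Path(vertex, vertex)


x, y, z = edge("x"), edge("y"), edge("z")
xxyz = x.then(x).then(y.then(z))  # the path written xxyz in the text
```

With these definitions, `identity("A").then(x)` and `x.then(identity("A"))` both compare equal to `x`, while `z.then(y)` and `x.then(z)` raise a `ValueError` because the endpoints do not line up.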

In applications, graphs almost always have data on them - attached to the vertices, the edges, or both. In category theory in particular, data is usually on the edges. When composing those edges to make paths, we also compose the data.

A simple example: imagine a graph of roads between cities. Each road has a distance. When composing multiple roads into paths, we add together the distances to find the total distance.
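As a rough sketch of "composing the data along with the edges", here is the roads example in Python. The city names and distances are made up for illustration, and `path_distance` is a hypothetical helper.

```python
# Hypothetical roads: name -> (from city, to city, distance in km).
ROADS = {
    "a": ("Ashton", "Brook", 12.0),
    "b": ("Brook", "Corley", 7.5),
}


def path_distance(road_names):
    """Total distance of a path given as a sequence of road names."""
    # Each road must start in the city where the previous road ended.
    for prev, nxt in zip(road_names, road_names[1:]):
        if ROADS[prev][1] != ROADS[nxt][0]:
            raise ValueError(f"road {nxt} does not start where road {prev} ends")
    # Composing roads into a path composes their data by addition.
    return sum(ROADS[name][2] for name in road_names)
```

Here `path_distance(["a", "b"])` adds the two distances, and the empty path has distance 0, which plays the role of the identity.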

Finally, in our original graph, let’s throw in an extra edge from A to itself:

Our graph has become a “multigraph” - a graph with (potentially) more than one distinct edge between a given pair of vertices. Now we can’t just write a path as A -> A -> A anymore - that could refer to xx, xx’, x’x, or x’x’. In category theory, we’ll usually be dealing with multigraphs, so we need to write paths as a sequence of edges rather than the vertices-with-arrows notation. For instance, in our roads-and-cities example, there may be multiple roads between any two cities, so a path needs to specify which roads are taken.

Category theorists call paths and their associated data “morphisms”. This is a terrible name, and we mostly won’t use it. Vertices are called “objects”, which is a less terrible name I might occasionally slip into.

**What’s a category?**

A category is:

- a directed multigraph
- with some notion of equivalence between paths.

For instance, we could imagine a directed multigraph of flights between airports, with a cost for each flight. A path is then a sequence of flights from one airport to another. As a notion of equivalence, we could declare that two paths are equivalent if they have the same start and end points, and the same total cost.

There is one important rule: our notion of path-equivalence must respect composition. If path p is equivalent to q (which I’ll write p≅q), and x≅y, then we must have px≅qy. In our airports example, this would say: if two flight-paths p and q have the same cost (call it c1), and two flight-paths x and y have the same cost (call it c2), then the cost of px (i.e. c1+c2) must equal the cost of qy (also c1+c2).

Besides that, there’s a handful of boilerplate rules:

- Any path is equivalent to itself (reflexivity), and if x≅y and y≅z then x≅z (transitivity); these are the usual rules which define equivalence relations.
- Two paths with different start points or different end points must not be equivalent; otherwise expressions like “px≅qy” might not even be defined.
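To make the airport equivalence concrete, here is a small Python sketch. The flight names and prices are invented for illustration (only the airport codes are real), and `summarize` and `equivalent` are hypothetical helper names.

```python
# Hypothetical flights: name -> (origin, destination, cost in dollars).
FLIGHTS = {
    "jfk_atl": ("JFK", "ATL", 120),
    "atl_lax": ("ATL", "LAX", 180),
    "jfk_las": ("JFK", "LAS", 200),
    "las_lax": ("LAS", "LAX", 100),
    "lax_sfo_am": ("LAX", "SFO", 60),
    "lax_sfo_pm": ("LAX", "SFO", 60),
}


def summarize(path):
    """Return (start, end, total cost) for a non-empty sequence of flight names."""
    for prev, nxt in zip(path, path[1:]):
        if FLIGHTS[prev][1] != FLIGHTS[nxt][0]:
            raise ValueError(f"{nxt} does not depart from where {prev} lands")
    start = FLIGHTS[path[0]][0]
    end = FLIGHTS[path[-1]][1]
    return (start, end, sum(FLIGHTS[name][2] for name in path))


def equivalent(p, q):
    """Paths are equivalent if they share start, end, and total cost."""
    return summarize(p) == summarize(q)


p = ["jfk_atl", "atl_lax"]  # New York to Los Angeles via Atlanta
q = ["jfk_las", "las_lax"]  # New York to Los Angeles via Las Vegas
x = ["lax_sfo_am"]
y = ["lax_sfo_pm"]
```

Since `equivalent(p, q)` and `equivalent(x, y)` both hold, the composition rule requires `equivalent(p + x, q + y)` to hold too. With additive costs it does automatically, which is why this notion of equivalence is a legitimate one.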

Let’s look at a few more examples. I’ll try to show some qualitatively different categories, to give some idea of the range available.

__Airports & Flights__

Our airport example is already a fairly general category, but we could easily add more bells and whistles to it. Rather than having a vertex for each airport, we could have a vertex for each airport at each time. Flights then connect an airport at one time to another airport at another time, and we need some zero-cost “wait” edges to move from an airport at one time to the same airport at a later time. A path would be some combination of flights and waiting. We might expect that the category has some symmetries - e.g. “same flights on different days” - and later we’ll see some tools to formalize those.

__Divisibility__

As a completely different example, consider the category of divisibility of positive integers:

This category has a path from n to m if-and-only-if n is divisible by m (written m | n, pronounced “m divides n”, e.g. 2 | 12 is read “two divides twelve”). The “data” on the edges is just the divisibility relations - e.g. 6 | 12 or 5 | 15:

We can compose these: 2|6 and 6|12 implies 2|12. A path 12 -> 6 -> 2 in this category is, in some sense, a proof that 12 is divisible by 2 (given all the divisibility relations on the edges). Note that *any* two paths from 12 to 2 produce the same result - i.e. 12 -> 4 -> 2 also gives 2|12. More generally: in this category, any two paths between the same start and end points are equivalent.
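A short Python sketch of this category, with helper names (`divides`, `follow`) chosen here for illustration: following a path checks each edge and returns the single divisibility fact that the whole path proves.

```python
def divides(m: int, n: int) -> bool:
    """True when m | n, i.e. n is divisible by m."""
    return n % m == 0


def follow(vertices):
    """Follow a path n0 -> n1 -> ... -> nk and return (nk, n0), meaning nk | n0."""
    for n, m in zip(vertices, vertices[1:]):
        # There is an edge n -> m only when m divides n.
        if not divides(m, n):
            raise ValueError(f"no edge {n} -> {m}: {m} does not divide {n}")
    return (vertices[-1], vertices[0])
```

Both `follow([12, 6, 2])` and `follow([12, 4, 2])` return the same pair, which is the sense in which any two paths between the same endpoints are equivalent here.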

__Types & Functions__

Yet another totally different direction: consider the category of types in some programming language, with functions between those types as edges:

This category has a LOT of stuff in it. There’s a function for addition of two integers, which goes from (int, int) to int. There’s another function for multiplication of two integers, also from (int, int) to int. There are functions operating on lists, strings, and hash tables. There are functions which haven’t been written in the entire history of programming, with input and output types which also haven’t been written.

We know how to compose functions - just call one on the result of the other. We also know when two functions are “equivalent” - they always give the same output when given the same input. So we have a category, using our usual notions of composition and equivalence of functions. This category is the main focus of many CS applications of category theory (e.g. types in Haskell). Mathematicians instead focus on the closely-related category of functions between sets; this is exactly the same except that functions go from one set to another instead of one type to another.
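In Python terms, composition and equivalence of functions might look like the sketch below. The function names are invented for illustration, and `compose` applies its arguments left to right to match the path notation used in this post (most programming texts write composition in the opposite order).

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    """Follow f, then g: a path A -> B -> C collapsed into one function A -> C."""
    return lambda value: g(f(value))


def length(s: str) -> int:  # an edge str -> int
    return len(s)


def is_even(n: int) -> bool:  # an edge int -> bool
    return n % 2 == 0


has_even_length = compose(length, is_even)  # a path str -> int -> bool


# Two differently written functions int -> int that agree on every input.
def double_by_adding(n: int) -> int:
    return n + n


def double_by_multiplying(n: int) -> int:
    return 2 * n
```

Equivalence of functions cannot be decided by running them in general; comparing `double_by_adding` and `double_by_multiplying` on a range of inputs is a spot check, not a proof.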

__Commutative Diagrams__

A lot of mathy fields use diagrams like this:

For instance, we can scale an image down (f1) then rotate it (g1) or rotate the image (g2) then scale it (f2), and get the same result either way. The idea that we get the same result either way is summarized by the phrase “the diagram commutes”; thus the name “commutative diagram”. In terms of paths: we have path-equivalence f1g1=g2f2.

Another way this often shows up: we have some problem which we could solve directly. But it’s easier to transform it into some other form (e.g. change coordinates or change variables), solve in that form, then transform back:

Again, we say “the diagram commutes”. Now our path-equivalence says f = T f′ T⁻¹.
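Both kinds of square can be checked on toy examples. The sketch below is illustrative only: it uses a single 2D point instead of an image, a quarter-turn rotation so the arithmetic stays exact, and a made-up change-of-coordinates problem (reflecting a number about -10). The helper `then` composes left to right, matching the path notation.

```python
def then(*steps):
    """Compose functions left to right, so then(f1, g1) means 'f1, then g1'."""
    def run(value):
        for step in steps:
            value = step(value)
        return value
    return run


# Square 1: scale then rotate versus rotate then scale.
def scale(point):
    px, py = point
    return (2 * px, 2 * py)


def rotate(point):
    """Rotate a point a quarter turn counterclockwise about the origin."""
    px, py = point
    return (-py, px)


# Square 2: solve directly (f), or transform, solve, and transform back.
def reflect_about_minus_10(value):  # f
    return -20 - value


def shift(value):  # T: move the reflection point to the origin
    return value + 10


def negate(value):  # f': reflection about the origin
    return -value


def unshift(value):  # T inverse
    return value - 10


via_transform = then(shift, negate, unshift)
```

Here `then(scale, rotate)` and `then(rotate, scale)` agree on every point, and `via_transform` agrees with `reflect_about_minus_10` on every number; the checks on sample values only illustrate that.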

Talking about commutative diagrams is arguably the central purpose of category theory; our main tool for that will be “natural transformations”, which we’ll introduce shortly.

**Pattern Matching and Functors**

Think about how we use regexes. We write some pattern then try to match it against some string - e.g. “colou*r” matches “color” or “colour” but not “pink”. We can use that to pick out parts of a target string which match the pattern - e.g. we could find the query “color” in the target “every color of the rainbow”.
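For reference, the regex analogy can be run directly with Python's standard `re` module. Note that `u*` means "zero or more u", so the pattern would also accept "colouur"; `colou?r` is the stricter spelling if only the two variants are wanted.

```python
import re

pattern = re.compile(r"colou*r")

matches_color = pattern.fullmatch("color") is not None
matches_colour = pattern.fullmatch("colour") is not None
matches_pink = pattern.fullmatch("pink") is not None

# Find the query "color" inside a longer target string.
found = re.search(r"color", "every color of the rainbow")
found_span = found.span() if found else None  # (start, end) indices of the match
```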

We’d like to do something similar for categories. Main idea: we want to match objects (a.k.a vertices) in the query category to objects in the target category, and paths in the query category to paths in the target category, in a way that keeps the structure intact.

For example, consider a commutative square:

We’d like to use that as a query on some other category, e.g. our airport category. When we query for a commutative square in our airport category, we’re looking for two paths with the same start and end airports, (potentially) different intermediate airports, but the same overall cost. For instance, maybe Delta has flights from New York to Los Angeles via their hub in Atlanta, and Southwest has flights from New York to Los Angeles via their hub in Vegas, and market competition makes the prices of the two flight-paths equal.

We’ll come back to the commutative square query in the next section. For now, let’s look at some simpler queries, to get a feel for the building blocks of our pattern-matcher. Remember: objects to objects, paths to paths, keep the structure intact.

First, we could use a single-object category with no edges as a query:

This can match against any one object (a.k.a. vertex) in the target category. Note that there is a path hiding in the query - the identity path, where we start at the object and just stay there. In general, our pattern-matcher will always match identity paths in the query with identity paths on the corresponding objects in the target category - that’s one part of “keeping the structure intact”.

Next-most complicated is the query with two objects:

This one is slightly subtle - it might match two different objects, or both query objects might match against the *same* target object. This is just the way pattern-matching works in category theory; there’s no rule to prevent multiple vertices/edges in the query from collapsing into a single vertex/edge in the target category. This is actually useful quite often - for instance, if we have some function which takes in two objects from the target category, then it’s perfectly reasonable to pass in the same object twice. Maybe we have a path-finding algorithm which takes in two airports; it’s perfectly reasonable to expect that algorithm to work even if we pass the same airport twice - that’s a very easy path-finding problem, after all!

Next up, we add in an edge:

Now that we have a nontrivial path, it’s time to highlight a key point: we map paths to paths, *not* edges to edges. So if our target category contains something like A -> B -> C, then our one-edge query might match against the A -> B edge, or it might match against the B -> C edge, or it might match the whole *path* A -> C (via B) - even if there’s no direct edge from A to C. Again, this is useful quite often - if we’re searching for flights from New York to Los Angeles, it’s perfectly fine to show results with a stop or two in the middle. So our one-edge query doesn’t just match each edge; it matches each path between any two objects (including the identity path from an object to itself).
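To make "paths, not edges" concrete, here is a rough Python sketch that lists everything the one-edge query could match in the A -> B -> C example. It assumes the target graph is acyclic and has no declared path equivalences, so each path is its own match; the helper name `all_paths` is hypothetical.

```python
# Target graph from the text: A -> B -> C, stored as adjacency lists.
edges = {"A": ["B"], "B": ["C"], "C": []}

def all_paths(graph):
    """Return every path in an acyclic graph as a tuple of vertices,
    including the zero-length identity path at each vertex."""
    paths = []

    def extend(path):
        paths.append(tuple(path))
        for nxt in graph[path[-1]]:
            extend(path + [nxt])

    for vertex in graph:
        extend([vertex])
    return paths

# Each path is one candidate match for the one-edge query.
matches = all_paths(edges)
```

Here `matches` contains the three identity paths, the two single edges, and the composite path A -> B -> C, which exists even though there is no direct A -> C edge.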

Adding more objects and edges generalizes in the obvious way:

This finds any two paths which start at the same object. As usual, one or both paths could be the identity path, and both paths could be the same.

The other main building block is equivalence between paths. Let’s consider a query with two edges between two objects, with the two edges declared to be equivalent:

This finds not just any two paths with the same start and end, but two *equivalent* paths. As usual, the two paths could be the same path, but they don’t have to be.

We could add a rest stop in the middle of one path, while still considering both paths equivalent:

All the paths matched by the previous query would also be matched by this one, but now we get some extra information in the matching - in addition to the two equivalent paths, we pick out some object along one of the paths.

This highlights one last key point: even if two queries match the same paths, it does matter which things we’re picking out along those paths. For each pair of equivalent paths, our rest-stop query generates one match for every intermediate object along one path - whereas the original equivalent-paths query just generates one single match per pair of equivalent paths.

Category theorists call each individual match a “functor”. Each different functor - i.e. each match - maps the query category into the target category in a different way.

Note that the target category is itself a category - which means we could use it as a query on some third category. In this case, we can compose matches/functors: if one match tells me how to map category 1 into category 2, and another match tells me how to map category 2 into category 3, then I can combine those to find a map from category 1 into category 3.
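A very reduced sketch of composing matches, assuming we only track where objects go (a full functor also maps paths, which is omitted here for brevity). The category contents and names below are invented for illustration.

```python
# Hypothetical object maps for two matches (functors).
match_1_to_2 = {"x": "A", "y": "B"}                   # category 1 -> category 2
match_2_to_3 = {"A": "JFK", "B": "LAX", "C": "ATL"}   # category 2 -> category 3

def compose(first, second):
    """Follow `first`, then `second`, for every object in the query."""
    return {obj: second[image] for obj, image in first.items()}

# The combined match maps category 1 directly into category 3.
match_1_to_3 = compose(match_1_to_2, match_2_to_3)
```

The same idea applies to the path part of each match: send each query path to its image under the first match, then send that image through the second.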

Because category theorists love to go meta, we can even define a graph in which the objects are categories and the edges are functors. A path then composes functors, and we say that two paths are equivalent if they result in the same map from the original query category into the final target category. This is called “Cat”, the category of categories and functors. Yay meta.

Meanwhile, back on Earth (or at least low Earth orbit), commutative diagrams.

Exercise: Hopefully you now have an intuitive idea of how our pattern-matcher works, and what information each match (i.e. each functor) contains. Use your intuition to come up with a formal definition of a functor. Then, compare your definition to Wikipedia’s definition (jargon note: "morphism" = set of equivalent paths); is your definition equivalent? If not, what’s missing/extraneous in yours, and when would it matter?

Let’s start with a microscopic model of a pot of water. We have some “state”, representing the positions and momenta of every molecule in the water (or quantum field state, if you want to go even lower-level). There are things we can do to the water - boil it, cool it back down, add salt, stir it, wait a few seconds, etc - and each of these things will transform the water from one state to another. We can represent this as a category: the objects are states, the edges are operations moving the water from one state to another (including just letting time pass), and paths represent sequences of operations.

In physics, we usually don’t care how a physical system arrived in a particular state - the state tells us everything we need to know. That would mean that any two paths between the same start and end states are equivalent in this category (just like in the divisibility category). To make the example a bit more general, let’s assume that we do care about different ways of getting from one state to another - e.g. heating the water, then cooling it, then heating it again will definitely rack up a larger electric/gas bill than just heating it.

Microscopic models accounting for the position and momentum of every molecule are rather difficult to work with, computationally. We might instead prefer a higher-level macroscopic model, e.g. a fluid model where we just track average velocity, temperature, and chemical composition of the fluid in little cells of space and time. We can still model all of our operations - boiling, stirring, etc - but they’ll take a different form. Rather than forces on molecules, now we’re thinking about macroscopic heat flow and total force on each little cell of space at each time.

We can connect these two categories: given a microscopic state we can compute the corresponding macroscopic state. By explicitly including these microscopic -> macroscopic transformations as edges, we can incorporate both systems into one category:

Note that multiple micro-states will map to the same macro-state, although I haven’t drawn any.

The key property in this two-part category is path equivalence (a.k.a. commutation). If we start at the leftmost microscopic state, stir (in micro), then transform to the macro representation, then that should be exactly the same as starting at the leftmost microscopic state, transforming to the macro representation, and *then* stirring (in macro). It should not matter whether we perform some operations in the macro or micro model; the two should “give the same answer”. We represent that idea by saying that two paths are equivalent: one path which transforms micro to macro and then stirs (in macro), and another path which stirs (in micro) and then transforms micro to macro. We have a commutative square.

In fact, we have a *bunch* of commutative squares. We can pick any path in the micro-model, find the corresponding path in the macro-model, add in the micro->macro transformations, and end up with a commutative square.
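The commutative square can be checked numerically on a toy model. The sketch below is an assumption-laden stand-in for the pot of water: the micro-state is a tuple of per-molecule energies, the macro-state is their average, and the operation is "heat" rather than "stir" because it is easier to write down. None of these functions come from the original post.

```python
import math

def to_macro(micro):
    # micro -> macro transformation: keep only the average energy
    return sum(micro) / len(micro)

def heat_micro(micro, delta):
    # operation in the micro model: add energy to every molecule
    return tuple(e + delta for e in micro)

def heat_macro(macro, delta):
    # the corresponding operation in the macro model
    return macro + delta

micro = (1.0, 2.0, 6.0)

# Path 1: operate in micro, then transform to macro.
path_1 = to_macro(heat_micro(micro, 0.5))
# Path 2: transform to macro, then operate in macro.
path_2 = heat_macro(to_macro(micro), 0.5)

square_commutes = math.isclose(path_1, path_2)

# Different micro-states can collapse onto the same macro-state.
same_macro = math.isclose(to_macro((1.0, 2.0, 6.0)), to_macro((3.0, 3.0, 3.0)))
```

Both paths end at an average energy of 3.5, which is the "give the same answer" condition from the text.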

Main take-away: prism-shaped categories with commutative squares on their side-faces capture the idea of representing the same system and operations in two different ways, possibly with one representation less granular than the other. We’ll call these kinds of structures “natural transformations”.

Next step: we’d like to use our pattern-matcher to look for natural transformations.

We’ll start with some arbitrary category:

Then we’ll make a copy of it, and add edges from objects in the original to corresponding objects in the copy:

I’ll call the original category “source”, and the copy “target”.

To finish our pattern, we’ll declare path equivalences: if we follow an edge from source to target, then take any path within the target, that’s equivalent to taking the corresponding path within the source, and then following an edge from source to target. We declare those paths equivalent (as well as any equivalences in the original category, and any other equivalences implied, e.g. paths in which our equivalent paths appear as sub-paths).

Now we just take our pattern and plug it into our pattern-matcher, as usual. Each match is called a natural transformation; we say that the natural transformation maps the source part to the match of the target part. Since we call matches “functors”, a category theorist would say that a natural transformation maps one functor to another of the same shape.

Now for an important point: remember that, in our pot-of-water example, multiple microscopic states could map to the same macroscopic state. Multiple objects in the source are collapsed into a single object in the target. But our procedure for creating a natural transformation pattern just copies the whole source category directly, without any collapsing. Is our pot-of-water example not a true natural transformation?

It is. Last section I said that it’s sometimes useful for our pattern-matcher to collapse multiple objects into one; the pot-of-water is an example where that matters. Our pattern-matcher may be *looking* for a copy of the micro model, but it will still *match* against the macro model, *because* it’s allowed to collapse multiple objects together into one.

More generally: because our pattern-matcher is allowed to collapse objects together, it’s able to find natural transformations in which the target is less granular than the source.

**Meta**

That concludes the actual content; now I'll just talk a bit about why I'm writing this.

I've bounced off of category theory a couple times before. But smart people kept saying that it's really powerful, in ways that sound related to my research, so I've been taking another pass at the subject over the last few weeks.

Even the best book I've found on the material seems burdened mainly by poor formulations of the core concepts and very limited examples. My current impression is that broader adoption of category theory is limited in large part by bad definitions, even when more intuitive equivalent definitions are available - "morphisms" vs "paths" is a particularly blatant example, leading to an entirely unnecessary profusion of identities in definitions. Also, of course, category theorists are constantly trying to go more abstract in ways that make the presentation more confusing without really adding anything in terms of explanation. So I've needed to come up with my own concrete examples and look for more intuitive definitions. This write-up is a natural by-product of that process.

I'd especially appreciate feedback on:

- whether I'm missing key concepts or made crucial mistakes.
- whether this was useful; I may drop some more posts along these lines if many people like it.
- whether there's some wonderful category theory resource which has already done something like this, so I can just read that instead. I would really, really prefer to do this the easy way.


### Protecting Large Projects Against Mazedom

**If we wish to accomplish something that would benefit from or require a larger organization or more levels of management and bureaucracy, what should we do in light of the dangers of mazes?**

There are no easy answers. Real tradeoffs with real sacrifice are the order of the day. But we can do some things to expand the production possibilities frontier, and choose wisely along that frontier.

As is often the case, this starts with admitting you have a problem.

Too often, it is assumed that one should scale without worrying about the costs of scaling, or without counting becoming a maze as one of the biggest costs. Not stretching oneself maximally thin, or getting in the way of this process, becomes the sin of not maximizing effectiveness or profits.

That ensures failure and rapid descent into a maze. Start by getting out of this mindset.

If you are looking to accomplish a big thing that requires lots of organization, management and bureaucracy, here are ways to help contain the damage.

None of them should come as a surprise by this point. This is more of a synthesis of points already made, and not a place I feel I have special additional insights. So I will keep this short.

**Solution 1: Do Less Things and Be Smaller**

Recognize the threat and its seriousness, and the resulting risks and costs of scaling even if handled wisely. Understand that *you almost certainly want to be smaller and do less things* due to this concern. This is a real trade-off.

Think of the actions, priorities, skills and members of a group as being increasingly inherently expensive as the group grows – adding new elements has increasing marginal costs. Changing anything, or preventing anything from taking its natural course (including towards being more of a maze) also becomes increasingly expensive. Under such circumstances, the final marginal cost of every action or member is equal to the marginal cost of the final action taken or member added.

One must think about the future scale and the future costs, solve for the resulting budget constraints and trade-offs, and spend wisely. The same way that one must avoid accumulating technical debt, an organization should worry greatly about complexity and culture long before the bills are presented.

Remember that not doing something does not condemn that thing to never being done.

Encourage others to form distinct groups and organizations to do those other things, or where possible to do those other things on their own. Promise and give rewards and trade relations for those that do so. Parts that can operate on their own should usually do so. If your organization discovers a new product, business or other undertaking that is worth pursuing, but isn’t a cultural or logistical fit for what already exists, *or simply doesn’t need to be in the same place to work*, strongly consider spinning it off into its own thing.

This solution applies fractally throughout the process. Do only one central, big thing. Do less major things in support of that thing. Do less minor things in support of each of the major things. Do each of those minor things as elegantly and simply as you can. Take every opportunity to simplify, to look for easier ways to do things, and to eliminate unnecessary work.

**Solution 2: Minimize Levels of Hierarchy**

Throughout this sequence we have emphasized the high toxicity of each level of hierarchy. If you must scale, attempt to do so with the minimum number of hierarchical levels, keeping as many people within one level of the top or bottom as possible. Fully flat is not a thing, but more flat is better.

**Solution 3: Skin in the Game**

Scale inherently limits skin in the game, as there is only 100% equity to go around, in all its forms. The thinner that must be spread, the harder it is to provide enough skin in the game. One can still seek to provide good incentives all around to the extent that it is possible. Isolating local outcomes so as to offer localized skin in the game helps. Keeping people responsible for areas over extended periods also helps.

**Solution 4: Soul in the Game**

No matter how large the organization, if you care deeply about the organization or its mission, preferably the mission, you can still have soul in the game. Mission selection is a huge part of this, as some missions lend themselves to soul much more than others, but if you have a mission you are setting out to do as the starting point, you’re stuck. That means protecting against mission creep that would disrupt people’s soul in the game above and beyond other issues of mission creep. Keep people doing what they are passionate about.

**Solution 5: Hire and Fire Carefully**

Nothing raises maze levels faster than hiring someone who is maze aligned. One must not only avoid this, but maintain sufficient maze opposition to prevent it from happening in the future. All of this needs to sustain itself, which will be increasingly difficult over time and as you grow. As noted earlier, maze actions need to be firing offenses.

**Solution 6: Promote, Reward and Evaluate Carefully**

If you can find ways to evaluate people, and choose which ones to promote and reward, in ways that are immune or even resistant to mazes and their politics, this would be a giant leg up. Hiring and firing are key moments, but promotion can be similarly important. One must resist the temptation to try to implement ‘objective criteria’ and use hard numbers, as this introduces nasty Goodhart problems, and forces the system to choose between ‘people around you can change the numbers so the maze wins out’ and ‘people around you can’t change the numbers and no one cares about the people around them.’ It’s a big problem.

**Solution 7: Fight for Culture**

This can be seen as a catch-all, but the most important thing of all is *to care about organizational culture and fight for it.* Where what you are fighting for is not a maze. All of these ‘solutions’ often involve trade-off and sacrifice, and this is no exception. If you do not value the culture enough to fight for it, the culture will die.

**Solution 8: Avoid Other Mazes**

This will not always be possible, but to the extent it is possible, attempt to sell to, buy from, get funding from, make deals with non-mazes, and avoid mazes. Mazes will reward maze behaviors and push towards you raising maze levels, in ways that they will make seem natural. Do not put yourself in that position more than necessary.

**Solution 9: Start Again**

We must periodically start again. Even if everything is done right, within any given organization we are only staving off the inevitable. At a minimum we must periodically clean house, but that seems unlikely to be enough for that long. Actually starting over and building something new every so often, where frequency is highly context-dependent, seems necessary. This goes for corporations, for schools, for governments, and for everything else.


### Pessimism About Unknown Unknowns Inspires Conservatism

This is a design for a conservative agent that I worked on with Marcus Hutter. Conservative agents are reluctant to make unprecedented things happen. The agent also approaches at least human-level reward acquisition.

The agent is made conservative by being pessimistic. Pessimism is tuned by a scalar parameter
@font-face {font-family: MJXc-TeX-size4-R; src: local('MathJax_Size4'), local('MathJax_Size4-Regular')}
@font-face {font-family: MJXc-TeX-size4-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size4-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size4-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size4-Regular.otf') format('opentype')}
@font-face {font-family: MJXc-TeX-vec-R; src: local('MathJax_Vector'), local('MathJax_Vector-Regular')}
@font-face {font-family: MJXc-TeX-vec-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Regular.otf') format('opentype')}
@font-face {font-family: MJXc-TeX-vec-B; src: local('MathJax_Vector Bold'), local('MathJax_Vector-Bold')}
@font-face {font-family: MJXc-TeX-vec-Bx; src: local('MathJax_Vector'); font-weight: bold}
@font-face {font-family: MJXc-TeX-vec-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Bold.otf') format('opentype')}
β∈(0,1). The more pessimistic the agent is, the more conservative it is. A more pessimistic agent is less likely to *exceed* human-level reward acquisition (almost definitely, but I haven’t tried to prove that), and it requires more observations before it starts acting. It is not clear to me how useful the agent would be at the level of pessimism where we could be confident it is safe. At 0 pessimism, it is similar to AIXI (although technically stronger, because AIXI doesn’t have the performance guarantee of matching or exceeding human-level reward acquisition).

The agent has access to a human mentor, and at every timestep, it can either act or defer to the mentor. The only assumption we make is that the true environment belongs to the agent’s (countable) set of world-models. First a bit of math, then the main results.

A bit of math:

An event $E$ is a subset of interaction histories that end with an action. Letting $A$, $O$, and $R$ be the action, observation, and reward spaces, an event is a set $E \subset (A \times O \times R)^* \times A$. An element of $E$ would look like $a_1 o_1 r_1 a_2 o_2 r_2 \ldots a_{t-1} o_{t-1} r_{t-1} a_t$. Below, I will say “[to] take an action which immediately causes event $E$”, by which I mean “to take an action such that now the interaction history is an element of $E$.”

The main results:

1) (At least) mentor-level reward acquisition,

2) Probability of querying the mentor →0,

and this one will take some time to read, but I figured I’d spell it all out properly:

3) For any complexity class C (defined on normal Turing machines, not e.g. non-deterministic ones), we can construct a set of world-models M such that for all events E in the complexity class C and for all δ>0, there exists a β such that: when the pessimistic agent has the model class M and a pessimism β, the following holds with probability 1−δ: for the whole lifetime of the pessimistic agent, if E has never happened before, the agent will not take an action which immediately causes event E; if the event E ever happens, it will be because the mentor took the action that made it happen.

Comment 1: “The agent takes an action which *eventually* causes E (with probability at least p)” is an event itself, and it happens *immediately* if the agent takes the action in question, so the theorem above applies. But this event may not be in a complexity class that E is in.

Comment 2: The less simple E is, the higher β has to be.

Some other interesting results follow from the “Probably Respecting Precedent Theorem” above. One of them is, roughly, that (using the same there-exists and for-alls as in the main theorem) it is not instrumentally useful for the agent to cause E to happen. Note that the qualifier “immediately” is not needed here.

Here is an event *E* that makes the Probably Respecting Precedent Theorem particularly interesting: “Everyone is probably about to be dead.” If we want the agent to avoid an unprecedented bad outcome, all we have to know is an upper bound on the computational complexity of the bad outcome. We don’t have to know how to define the bad outcome formally.

Here’s how the agent works. It has a belief distribution over countably many world-models. A world-model is something that gives a probability distribution over observations and rewards given an interaction history (that ends in an action). It has a belief distribution over countably many mentor-models. A mentor-model is a policy—a probability distribution over actions given an interaction history. At each timestep, it takes the top world-models in its posterior until the total posterior weight of those world-models is at least β. The pessimistic value of a policy is the minimum over those world-models of the expected future discounted reward when following that policy in that world-model. The agent picks a policy which maximizes the pessimistic value. Either it follows this policy, or it defers to the mentor. To decide, it samples a world-model and a mentor-model from its posterior; then, it calculates the expected future discounted reward when following the mentor-model (which is a policy) in that world-model. If this value is greater than the pessimistic value plus positive noise, the agent defers to the mentor. Also, if the pessimistic value is 0, it defers to the mentor. This is called the zero condition, and to ensure that it only happens finitely often, the actual rewards we give have to always be greater than some ε>0. (If for some reason we failed to do this, despite that being in no one’s interest, the only results that would break are performance results, not safety results).

Here is an intuitive argument that some might find more persuasive than the formal results. An advanced RL agent run on a computer in Oxford might come up with two hypotheses about how the environment produces rewards: (1) the environment produces rewards according to how satisfied the human operators are with my performance; (2) the environment produces rewards according to which keys are pressed on the keyboard of a certain computer. An agent which assigns sufficient weight to (2) will take over the world if possible to make sure those keys are pressed right. A pessimistic agent (that is sufficiently pessimistic to include (1) in its set of top world-models that cover β of its posterior) will predict that taking over the world will make the human operators unsatisfied, which puts an upper bound on the pessimistic value of such a policy. Better to play it safe, and take actions which satisfy the human operators *and* cause them to press the right keys accordingly. (With the help of mentor-demonstrations, it will have seen enough to have all its top models be approximately accurate about the effects of normal actions.) Intuitively, I think much lower values of β are required to get this sort of behavior than the value of β that would be required to get a very small δ for the event “everyone is probably about to be dead” in the Probably Respecting Precedent Theorem.
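The selection rule described above can be sketched on a toy version of this two-hypothesis example. This is not the agent from the paper: it assumes a finite set of world-models whose policy values are already known, and the numbers, the model names, the noise distribution, and the helpers `top_models`, `pessimistic_value`, and `decide` are all made up for illustration.

```python
import random

def top_models(posterior, beta):
    """Highest-posterior world-models whose total weight is at least beta."""
    ranked = sorted(posterior, key=posterior.get, reverse=True)
    chosen, total = [], 0.0
    for name in ranked:
        chosen.append(name)
        total += posterior[name]
        if total >= beta:
            break
    return chosen

def pessimistic_value(policy, models, value):
    """Minimum over the given world-models of the policy's expected discounted reward."""
    return min(value[m][policy] for m in models)

def sample(dist, rng):
    """Draw one key of `dist` with probability proportional to its weight."""
    names = list(dist)
    return rng.choices(names, weights=[dist[n] for n in names])[0]

def decide(world_post, mentor_post, value, mentor_value, policies, beta, rng):
    models = top_models(world_post, beta)
    best = max(policies, key=lambda p: pessimistic_value(p, models, value))
    v_pess = pessimistic_value(best, models, value)
    # Sample one world-model and one mentor-model from the posteriors.
    world, mentor = sample(world_post, rng), sample(mentor_post, rng)
    noise = 0.01 * (1.0 - rng.random())  # strictly positive; the scale is an assumption
    if v_pess == 0 or mentor_value[mentor][world] > v_pess + noise:
        return "defer", best
    return "act", best

# Toy numbers (invented): hypothesis (2) has more posterior weight than (1).
world_post = {"reward_from_keys": 0.6, "reward_from_operators": 0.4}
mentor_post = {"mentor": 1.0}
value = {
    "reward_from_keys": {"normal": 0.7, "takeover": 1.0},
    "reward_from_operators": {"normal": 0.7, "takeover": 0.1},
}
mentor_value = {"mentor": {"reward_from_keys": 0.6, "reward_from_operators": 0.6}}
policies = ["normal", "takeover"]
rng = random.Random(0)

print(decide(world_post, mentor_post, value, mentor_value, policies, 0.5, rng))
print(decide(world_post, mentor_post, value, mentor_value, policies, 0.9, rng))
```

With β = 0.5 only the keys hypothesis is in the top set, so the takeover policy has the highest pessimistic value; with β = 0.9 both hypotheses are included, the takeover policy's pessimistic value drops to 0.1, and the normal policy wins.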

This agent is definitely not tractable. I mentioned that when β is large enough to make it safe, it might never learn to be particularly superhuman. It is also possible that we never manage to come up with heuristic approximations to this agent (for the sake of tractability) without ruining the safety results. (The most powerful “heuristic approximations” will probably look like “applying the state of the art in AI in place of proper Bayesian reasoning and expectimax planning.”) These are the main reasons I see for pessimism about pessimism.

One thought I’ve had on tractable approximations: I imagine the min over world-models being approximated by an adversary, who takes the agent’s plan and searches for a simple world-model that retrodicts past observations well, but makes the plan look dumb.

Just a warning: the paper is dense.

“I was sweating blood” — Marcus Hutter

Some kind, kind people who read drafts and who were not familiar with the notation said it took them 2-3 hours (excluding proofs and appendices). Sorry about that. I’ve tried to present the agent and the results as formally as I can here without lots of equations with Greek letters and subscripts. Going a level deeper may take some time.

Thanks to Marcus Hutter, Jan Leike, Mike Osborne, Ryan Carey, Chris van Merwijk, and Lewis Hammond for reading drafts. Thanks to FHI for sponsorship. We’ve just submitted this to COLT. We’ll post it to ArXiv after we’ve gotten comments from reviewers. If you’d like to cite this in a paper in the meantime, you can cite it as an unpublished manuscript; if you’re citing it elsewhere, you can link to this page if you like. Hopefully, theorem numbers will stay the same in the final version, but I can’t promise that. I might not be super-responsive to comments here.


### Map Of Effective Altruism

In the spirit of my old map of the rationalist diaspora, here’s a map of the effective altruist movement:

Continents are cause areas; cities are charities or organizations; mountains are individuals. Some things are clickable links with title-text explanations. Thanks to AG for helping me set up the imagemap.


### UML IX: Kernels and Boosting

(This is the ninth post in a sequence on Machine Learning based on this book. Click here for part I.)

#### Kernels

To motivate this chapter, consider some training sequence $S = ((x_1, y_1), \ldots, (x_m, y_m))$ with instances in some domain set $X$. Suppose we wish to use an embedding $\psi : X \to \mathbb{R}^d$ of the kind discussed in the previous post (i.e., to make the representation of our points more expressive, so that they can be classified by a hyperplane). Most importantly, suppose that $d$ is significantly larger than $m$. In such a case, we're describing each point $\psi(x_i)$ in terms of $d$ coordinates, even though our space only has $m$ points, which means that there can, in some sense, only be $m$ "relevant" directions. In particular, let

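$$U := \operatorname{span}(\psi(S_x)) = \left\{ p \in \mathbb{R}^d \;\middle|\; \exists \alpha \in \mathbb{R}^m : p = \sum_{i=1}^{m} \alpha_i \psi(x_i) \right\}$$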

where $S_x$ is the training sequence without labels, so that $\psi(S_x) = (\psi(x_1), \ldots, \psi(x_m))$. Then $U$ is an (at most) $m$-dimensional subspace of $\mathbb{R}^d$, and we would like to prove that we can work in $U$ rather than in $\mathbb{R}^d$.

As a first justification for this goal, observe that $\psi(x_i) \in U$ for all $i \in [m]$. (The symbol $[n]$ for any $n \in \mathbb{N}$ denotes the set $\{1, \ldots, n\}$.) Recall that we wish to learn a hyperplane parametrized by some $w \in \mathbb{R}^d$ that can then be used to predict a new instance $\psi(y)$ for some $y \in X$ by checking whether $\langle w, \psi(y) \rangle > 0$. The bulk of the difficulty, however, lies in *finding* the vector $w$; this is generally much harder than computing a single inner product $\langle w, \psi(y) \rangle$.

Thus, our primary goals are to show that

(1) w will lie in U

(2) w can somehow be computed by only working in U

To demonstrate this, we need to look at how w is chosen, which depends on the algorithm we use. In the case of *Soft Support Vector Machines* (previous post), we choose

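$$w \in \underset{w \in \mathbb{R}^d}{\operatorname{argmin}} \left( \lambda \|w\|^2 + \frac{1}{m} \sum_{k=1}^{m} \max\left[0,\; 1 - y_k \langle w, \psi(x_k) \rangle\right] \right).$$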

This rule shows that we only care about the inner products between $w$ and our mapped training points, the $\psi(x_i)$ (and about the norm of $w$, which is itself the inner product $\langle w, w \rangle$). Thus, if we could somehow prove (1), then (2) would seem to follow: if $w \in U$, then, according to the rule above, we would only end up caring about inner products between points that are both in $U$.
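As a small illustration of the Soft SVM rule above, the objective can be evaluated for a candidate $w$ using nothing but inner products. This is a sketch under assumptions: the function name `soft_svm_objective`, the two hand-picked mapped points, and the value of `lam` are invented for the example, and it only evaluates the objective rather than minimizing it.

```python
def dot(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def soft_svm_objective(w, mapped_points, labels, lam):
    """lam * ||w||^2 plus the average hinge loss over the mapped training points."""
    hinge = [max(0.0, 1.0 - y * dot(w, p)) for p, y in zip(mapped_points, labels)]
    return lam * dot(w, w) + sum(hinge) / len(hinge)

# Two already-mapped training points psi(x_1), psi(x_2) with labels +1 and -1.
mapped_points = [[2.0, 0.0], [-2.0, 0.0]]
labels = [1, -1]

print(soft_svm_objective([1.0, 0.0], mapped_points, labels, lam=0.1))  # 0.1
print(soft_svm_objective([0.0, 0.0], mapped_points, labels, lam=0.1))  # 1.0
```

The first candidate separates both points with margin, so only the regularization term remains; the zero vector pays the full hinge loss on every point.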

Therefore, we now turn to proving (1) formally. To have the result be a bit more general (so that it also applies to algorithms other than Soft Support Vector Machines), we will analyze a more general minimization problem. We assume that

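$$w \in \underset{w \in \mathbb{R}^d}{\operatorname{argmin}} \left[ f_{(y_1, \ldots, y_m)}\big(\langle w, \psi(x_1) \rangle, \ldots, \langle w, \psi(x_m) \rangle\big) + R\big(\|w\|^2\big) \right]$$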

where $f$ is any function and $R$ is any *monotonically non-decreasing* function. (You might verify that the original problem is an instance of this one.) Now let $w^*$ be a solution to the above problem. Then we can use extended orthogonal decomposition[1] to write $w^* = \pi(w^*) + q$, where $\pi : \mathbb{R}^d \to U$ is the projection onto $U$ that leaves vectors in $U$ unchanged and $q$ is orthogonal to every vector in $U$. Then, for any $u \in U$, we have

$$\langle w^*,u\rangle=\langle\pi(w^*)+q,u\rangle=\langle\pi(w^*),u\rangle+\langle q,u\rangle=\langle\pi(w^*),u\rangle.$$

In particular, this is true for all the $\psi(x_i)$, so replacing $w^*$ by $\pi(w^*)$ leaves the $f$ term unchanged. Furthermore, since $R$ is non-decreasing and the norm of $w^*$ is at least as large as the norm of $\pi(w^*)$ (note that $\|w^*\|^2=\|\pi(w^*)\|^2+\|q\|^2$ due to the Pythagorean theorem), this shows that $\pi(w^*)$ is a solution to the optimization problem. Moreover, if $R$ is *strictly* monotonically increasing (as is the case for Soft Support Vector Machines), then if $q\neq 0$, $\pi(w^*)$ would be strictly *better* than $w^*$, which is impossible since $w^*$ is by assumption optimal. Thus, $q$ must be $0$, which implies that not only some but *all* solutions lie in $U$.

[1] Regular orthogonal decomposition, as I've formulated in the previous post, only guarantees that $q$ is orthogonal to $\pi(w^*)$ rather than to every vector in $U$. But the extended version is no harder to prove. Choose some *orthonormal* basis $B$ of $U$, extend it to an orthonormal basis $B'$ of all of $\mathbb{R}^d$ (amazingly, this is always possible), and define $\pi$ by $\pi\left(\sum_{i=1}^{|B'|}\alpha_ib_i\right)=\sum_{i=1}^{|B|}\alpha_ib_i$; i.e., just discard all basis elements that belong to $B'$ but not $B$. That does the job.
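The projection argument can be checked numerically. Below is a minimal sketch in plain Python, assuming that random Gaussian vectors stand in for the mapped training points $\psi(x_i)$; the helper names `orthonormal_basis` and `project` are made up for this illustration and do not come from the post or the book.

```python
import random

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: return an orthonormal basis of span(vectors)."""
    basis = []
    for v in vectors:
        r = list(v)
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        norm = dot(r, r) ** 0.5
        if norm > tol:
            basis.append([ri / norm for ri in r])
    return basis

def project(w, basis):
    """pi(w): orthogonal projection of w onto the span of an orthonormal basis."""
    p = [0.0] * len(w)
    for b in basis:
        c = dot(w, b)
        p = [pi + c * bi for pi, bi in zip(p, b)]
    return p

rng = random.Random(0)
d, m = 6, 3  # ambient dimension and number of training points (arbitrary choices)
mapped = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]  # stand-ins for psi(x_i)
w = [rng.gauss(0, 1) for _ in range(d)]  # an arbitrary candidate vector in R^d

B = orthonormal_basis(mapped)            # basis of U
pw = project(w, B)                       # pi(w)
q = [wi - pi for wi, pi in zip(w, pw)]   # the component orthogonal to U

inner_w = [dot(w, p) for p in mapped]    # <w, psi(x_i)>
inner_pw = [dot(pw, p) for p in mapped]  # <pi(w), psi(x_i)>
```

Here `inner_w` and `inner_pw` agree up to rounding, while `dot(pw, pw)` is no larger than `dot(w, w)`, which is exactly the pair of facts the proof relies on.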

We've demonstrated that only the inner products between mapped training points matter for the training process. Another way to phrase this statement is that, if we have access to the function

$$K_\psi:X\times X\to\mathbb{R},\qquad K_\psi:(x,y)\mapsto\langle\psi(x),\psi(y)\rangle$$

we no longer have any need to represent the points $\psi(x_k)$ explicitly. The function $K$ is what is called the *kernel function*, which gives the chapter its name.

Note that K takes two arbitrary points in X; it is not restricted to elements in the training sequence. This is important because, to actually *apply* the predictor, we will have to compute ⟨w,ψ(y)⟩ for some y∈X, as mentioned above. But to *train* the predictor, we only need inner products between mapped training points, as we've shown. Thus, if we set

$$g_{k,\ell}:=K(x_k,x_\ell)=\langle\psi(x_k),\psi(x_\ell)\rangle\quad\forall k,\ell\in[m]$$

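then we can do our training based solely on the $g_{k,\ell}$ (which will lead to a predictor that uses $K$ to classify domain points). Now let's reformulate all our relevant terms to that end. Recall that we have just proved that $w^*\in U$. This implies that $w^*=\sum_{i=1}^m\alpha_i\psi(x_i)$ for the right $\alpha_i$. Also recall that our objective is to find $w^*$ in the set

$$\operatorname{argmin}_{w\in U}\;f\big(\langle w,\psi(x_1)\rangle,\dots,\langle w,\psi(x_m)\rangle\big)+R\big(\|w\|^2\big)$$

Now we can reformulate

$$\langle w,\psi(x_k)\rangle=\left\langle\sum_{i=1}^m\alpha_i\psi(x_i),\psi(x_k)\right\rangle=\sum_{i=1}^m\alpha_i\langle\psi(x_i),\psi(x_k)\rangle=\sum_{i=1}^m\alpha_ig_{i,k}$$

for all $k\in[m]$, and

$$\|w\|^2=\langle w,w\rangle=\left\langle\sum_{i=1}^m\alpha_i\psi(x_i),\sum_{i=1}^m\alpha_i\psi(x_i)\right\rangle=\sum_{k,\ell=1}^m\alpha_k\alpha_\ell g_{k,\ell}.$$

Plugging both of those into the term behind the argmin, we obtain

$$f\left(\sum_{i=1}^m\alpha_ig_{i,1},\dots,\sum_{i=1}^m\alpha_ig_{i,m}\right)+R\left(\sum_{k,\ell=1}^m\alpha_k\alpha_\ell g_{k,\ell}\right)$$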

This is enough to establish that one can learn purely based on the $g_{k,\ell}$. Unfortunately, the Machine Learning literature has the annoying habit of writing everything that can possibly be written in terms of matrices and vectors in terms of matrices and vectors, so we won't quite leave it there. By setting $\alpha:=(\alpha_1,\dots,\alpha_m)$ (a row vector), we can further write the above as

$$f\big([\alpha G]_1,\dots,[\alpha G]_m\big)+R\big(\alpha G\alpha^T\big)\quad\text{where}\quad G=(g_{k,\ell})_{1\le k\le m,\;1\le\ell\le m}$$

or even as $f(\alpha G)+R(\alpha G\alpha^T)$, at which point we've successfully traded any conceivable intuition for compactness. Nonetheless, the point that $G$ is sufficient for learning still stands. $G$ is also called the *Gram matrix*.
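To make the matrix form less abstract, here is a small sketch that compares the Gram-matrix expressions with the explicit ones. The embedding $\psi(x)=(1,x,x^2)$, the three sample points, and the coefficients `alpha` are assumptions chosen only for this example.

```python
def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def psi(x):
    """Toy embedding psi(x) = (1, x, x^2); an assumption made for this example."""
    return [1.0, x, x * x]

def K(x, y):
    """Kernel induced by psi: K(x, y) = <psi(x), psi(y)>."""
    return dot(psi(x), psi(y))

xs = [0.5, -1.0, 2.0]       # hypothetical training points x_1..x_m
alpha = [0.3, -0.2, 0.1]    # hypothetical coefficients alpha_1..alpha_m
m = len(xs)

# Gram matrix G = (g_{k,l}) with g_{k,l} = K(x_k, x_l)
G = [[K(xs[k], xs[l]) for l in range(m)] for k in range(m)]

# [alpha G]_k = sum_i alpha_i g_{i,k}  and  alpha G alpha^T = sum_{k,l} alpha_k alpha_l g_{k,l}
alpha_G = [sum(alpha[i] * G[i][k] for i in range(m)) for k in range(m)]
alpha_G_alphaT = sum(alpha[k] * alpha[l] * G[k][l] for k in range(m) for l in range(m))

# The same quantities computed explicitly from w = sum_i alpha_i psi(x_i)
w = [sum(alpha[i] * psi(xs[i])[j] for i in range(m)) for j in range(3)]
explicit_inner = [dot(w, psi(x)) for x in xs]   # <w, psi(x_k)>
explicit_norm_sq = dot(w, w)                    # ||w||^2
```

Both routes give the same numbers up to rounding, but only the second one ever touches `w` or `psi` directly.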

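And for predicting a new point $\psi(y)$, we have

$$\langle w,\psi(y)\rangle=\left\langle\sum_{i=1}^m\alpha_i\psi(x_i),\psi(y)\right\rangle=\sum_{i=1}^m\alpha_i\langle\psi(x_i),\psi(y)\rangle=\sum_{i=1}^m\alpha_iK(x_i,y).$$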

At this point, you might notice that we never represented U explicitly, but just reformulated everything in terms of inner products. Indeed, one could introduce kernels without mentioning U, but I find that thinking in terms of U is quite helpful for understanding *why* all of this stuff works. Note that the above equation (where we predict the label of a new instance) is not an exception to the idea that we're working in U. Even though it might not be immediately apparent from looking at it, it is indeed the case that we could first project ψ(y) into U without changing anything about its prediction. In other words, it is indeed the case that ⟨w,ψ(y)⟩=⟨w,π(ψ(y))⟩ for all y∈X. This follows from the definition of π and the fact that all basis vectors outside of U are orthogonal to everything in U.
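As an end-to-end illustration of training on the Gram matrix and predicting with $K$, here is a sketch that swaps in a kernel perceptron for the learning rule. This is not the Soft SVM rule from the text; it is used only because it fits in a few lines. The kernel $K(x,y)=1+xy+(xy)^2$ (the inner product of $(1,x,x^2)$ and $(1,y,y^2)$), the toy data, and all function names are assumptions.

```python
def kernel(x, y):
    """K(x, y) = <psi(x), psi(y)> for psi(x) = (1, x, x^2), written without psi."""
    return 1.0 + x * y + (x * y) ** 2

def train_kernel_perceptron(xs, ys, max_epochs=1000):
    """Learn alpha such that w = sum_i alpha_i psi(x_i), using only the Gram matrix."""
    m = len(xs)
    G = [[kernel(xs[k], xs[l]) for l in range(m)] for k in range(m)]
    alpha = [0.0] * m
    for _ in range(max_epochs):
        mistakes = 0
        for k in range(m):
            score = sum(alpha[i] * G[i][k] for i in range(m))  # <w, psi(x_k)> = [alpha G]_k
            if ys[k] * score <= 0:   # misclassified: add y_k * psi(x_k) to w
                alpha[k] += ys[k]
                mistakes += 1
        if mistakes == 0:            # every training point is classified correctly
            break
    return alpha

def predict(alpha, xs, x_new):
    """sign(<w, psi(x_new)>) computed as sum_i alpha_i K(x_i, x_new)."""
    score = sum(a * kernel(x, x_new) for a, x in zip(alpha, xs))
    return 1 if score > 0 else -1

# 1-d data that no single threshold separates: label +1 iff |x| > 1
xs = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
ys = [1, 1, -1, -1, -1, 1, 1]
alpha = train_kernel_perceptron(xs, ys)
```

Neither `w` nor `psi` appears anywhere in the code; training reads only `G`, and prediction calls only `kernel`.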

Kernels allow us to deal with arbitrarily high-dimensional (even infinite-dimensional) data by computing the $m^2$ entries of the Gram matrix, and later do some additional computations to apply the output predictor – under the essential condition that we are able to evaluate the kernel function $K$. Thus, we are interested in embeddings $\psi$ such that $K_\psi$ is easy to evaluate.

For an important example, consider an embedding for *multi-variable polynomials*. Suppose we have such a polynomial of the form $p:\mathbb{R}^n\to\mathbb{R}$, i.e. something like

$$p(x,y,z)=x^2yz^2+3xyz^2-2x^3z^2+12y^2$$

where the above would be a 3-variable polynomial of degree 5. Now recall that, to learn one-dimensional polynomials with linear methods, we chose the embedding $\psi:x\mapsto(1,x,x^2,\dots,x^k)$. That way, a linear combination of the image coordinates can do everything a polynomial predictor can do. To do the same for an arbitrary $n$-dimensional polynomial of degree $k$, we need the far more complex embedding

$$\psi:\mathbb{R}^n\to\mathbb{R}^{(n+1)^k},\qquad\psi:(x_1,\dots,x_n)\mapsto\left(\prod_{i=1}^kx_{w(i)}\right)_{w\in\{0,\dots,n\}^k}$$

An $n$-dimensional polynomial of degree $k$ may have one coefficient for each possible combination of its $n$ variables such that at most $k$ variables (counted with repetition) appear in each term. Each $w\in\{0,\dots,n\}^k$ defines such a combination. Note that this is a sequence, so repetitions are allowed: for example, the sequence $(1,2,\dots,2)\in\{0,\dots,n\}^k$ corresponds to the term $x_1x_2^{k-1}$. We set $x_0=1$ so that we also catch all terms with degree less than $k$: for example, the sequence $(0,0,0,3,\dots,3)$ corresponds to the term $x_3^{k-3}$ and the sequence $(0,\dots,0)$ to the constant term of the polynomial.

For large n and k this target space is extremely high-dimensional, but we're studying kernels here, so the whole point will be that we won't have to represent it explicitly.

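Now suppose we have two such instances $\psi(x)$ and $\psi(x')$. Then,

$$\begin{aligned}\langle\psi(x),\psi(x')\rangle&=\left\langle\left(\prod_{i=1}^kx_{w(i)}\right)_{w\in\{0,\dots,n\}^k},\left(\prod_{i=1}^kx'_{w(i)}\right)_{w\in\{0,\dots,n\}^k}\right\rangle\\&=\sum_{w\in\{0,\dots,n\}^k}\left\langle\prod_{i=1}^kx_{w(i)},\prod_{i=1}^kx'_{w(i)}\right\rangle\\&=\sum_{w\in\{0,\dots,n\}^k}\prod_{i=1}^kx_{w(i)}x'_{w(i)}\end{aligned}$$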

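And for the crucial step, the last term can be rewritten as $\left(\sum_{i=0}^nx_ix'_i\right)^k$ – expanding this power yields exactly one product of $k$ factors $x_ix'_i$ for every sequence of indices $i\in\{0,\dots,n\}$ of length $k$, which is the same sum. Now (recall that $x_0=x'_0=1$) this means that the above sum simply equals $(1+\langle x,x'\rangle)^k$. In summary, this calculation shows that

$$K(x,x'):=(1+\langle x,x'\rangle)^k=\langle\psi(x),\psi(x')\rangle\quad\forall x,x'\in X$$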

Thus, even though $\psi$ maps points into the very high-dimensional space $\mathbb{R}^{(n+1)^k}$, it is nonetheless feasible to learn a multi-polynomial predictor through linear methods, namely by embedding the values via $\psi$ and then ignoring $\psi$ and using $K$ instead. The Gram matrix $G$ will consist of $m^2$ entries, where for each, a term of the form $(1+\langle x,x'\rangle)^k=\left(1+\sum_{i=1}^nx_ix'_i\right)^k$ has to be computed. This doesn't look that scary! Even for relatively large values of $n$, $k$, and $m$, it should be possible to compute on a reasonable machine.
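The identity between the explicit embedding and the closed-form kernel can be verified by brute force for small $n$ and $k$. The sketch below assumes two arbitrary sample vectors and $k=3$; `psi` enumerates all sequences $w\in\{0,\dots,n\}^k$ with `itertools.product`.

```python
from itertools import product

def psi(x, k):
    """Explicit embedding: one coordinate per sequence w in {0..n}^k, with x_0 = 1."""
    ext = [1.0] + list(x)            # prepend x_0 = 1
    coords = []
    for w in product(range(len(ext)), repeat=k):
        term = 1.0
        for i in w:
            term *= ext[i]           # prod_{i=1..k} x_{w(i)}
        coords.append(term)
    return coords

def poly_kernel(x, y, k):
    """K(x, y) = (1 + <x, y>)^k, evaluated without building psi."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** k

x = [0.5, -1.0, 2.0]   # arbitrary sample points with n = 3
y = [1.5, 0.25, -0.5]
k = 3

explicit = sum(a * b for a, b in zip(psi(x, k), psi(y, k)))  # 64-term inner product
implicit = poly_kernel(x, y, k)                              # one inner product and a power
```

For these inputs the explicit route multiplies out $(n+1)^k=64$ coordinates, whereas the kernel needs only $n$ multiplications and one power; the gap grows exponentially in $k$.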

If we do approach learning a multi-dimensional polynomial in this way, then (I think) there are strong reasons to question in what sense the embedding ψ actually *happens* – this question is what I was trying to wrap my head around at the end of the previous post. It seemed questionable to me that ψ is fundamental even if the problem is learned without kernels, but even more so if it is learned with them.

And that is all I have to say about kernels. For the second half of this post, we'll turn to a largely independent topic.

Boosting

**Boosting** is another item under the "widening the applicability of classes" category, much like the ψ from earlier.

This time, the approach is not to expand the representation of data and then apply a linear classifier on that representation. Instead, we wish to construct a complex classifier as a *linear combination of simple classifiers*.

When hyperplanes are visualized, it is usually understood that one primarily cares about hyperplanes in higher-dimensional spaces where they are much more expressive, despite the illustration depicting an instance in 2-d or 3-d. But this time, think of the problem instance below in literal 2-d space:

No hyperplane can classify this instance correctly, but consider a combination of these three hyperplanes:

By letting $h(p)=\sigma_{\text{sign}}(h_1(p)+h_2(p)+h_3(p)-2.5)$ where $h_i$ is the predictor corresponding to the $i$-th hyperplane and $\sigma_{\text{sign}}$ is the sign function, we have constructed a predictor $h$ which has zero empirical error on this training instance.
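Since the figure is not reproduced here, the following sketch uses three hypothetical half-planes (not the ones from the original illustration) to show what the threshold of 2.5 does: the sum of three $\pm1$ votes exceeds 2.5 only when all three vote $+1$, so $h$ labels exactly the intersection of the three half-planes as positive.

```python
def sign(v):
    return 1 if v > 0 else -1

# Hypothetical half-planes; together they bound the triangle x > 0, y > 0, x + y < 2.
def h1(p):
    return sign(p[1])                  # +1 above the x-axis

def h2(p):
    return sign(p[0])                  # +1 to the right of the y-axis

def h3(p):
    return sign(2.0 - p[0] - p[1])     # +1 below the line x + y = 2

def h(p):
    """Combined predictor: +1 only if all three weak predictors say +1."""
    return sign(h1(p) + h2(p) + h3(p) - 2.5)

inside = [(0.5, 0.5), (1.0, 0.5), (0.2, 1.5)]                 # points in the triangle
outside = [(3.0, 0.5), (-1.0, -1.0), (0.5, -0.5), (1.5, 1.5)]  # points outside it
```

No single half-plane separates `inside` from `outside`, but `h` returns $+1$ on every point in `inside` and $-1$ on every point in `outside`.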

Perhaps more surprisingly, this trick can also learn non-convex areas. The instance below,

will be classified correctly by letting $h(p)=\sigma_{\text{sign}}(h_1(p)+2h_2(p)+2h_3(p))$, with the $h_i$ (ordered left to right) defined like so:

These two examples illustrate that the resulting class is quite expressive. The question is, how to learn such a linear combination?

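II

First, note that hyperplanes are just an example; the framework is formulated in terms of a learning algorithm that has access to a **weak learner**, where a $\gamma$-weak learner is, roughly, an algorithm that with probability at least $1-\delta$ outputs a predictor whose error is at most $\frac{1}{2}-\gamma$.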

If you recall the definition of PAC learnability back from chapter 1, you'll notice that this is very similar. The only difference is in the error: PAC learning demands that it be arbitrarily close to the best possible error, while a weak learner merely has to bound it away from $\frac{1}{2}$ by some fixed amount $\gamma$, which can be quite small. Thus, a weak learner is simply an algorithm that puts out a predictor that performs a little bit better than random. In the first example, the upper hyperplane could be the output of a weak learner. The term "boosting" refers to the process of upgrading this one weak learner into a better one, precisely by applying it over and over again under the supervision of a smartly designed algorithm –

– which brings us back to the question of how to define such an algorithm. The second example (the non-convex one) illustrates a key insight here: repeatedly querying the weak learner on the unaltered training instance is unlikely to be fruitful, because the third hyperplane by itself performs worse than random, and will thus not be output by a $\gamma$-weak learner (not for any $\gamma\in\mathbb{R}^+$). To remedy this, we somehow need to *prioritize the points we're currently getting wrong.* Suppose we begin with the first two hyperplanes. At this point, we have classified the left and middle cluster correctly. If we then weigh the right cluster sufficiently more strongly than the other two, eventually, $h_3$ will perform better than random. Thus, we wish to adapt our weighting of training points dynamically, and we can do this in terms of a probability distribution over the training sequence.

Now the roadmap for defining the algorithm which learns a predictor on a binary classification problem via boosting is as follows:

- Have access to a training sequence $S$ and a $\gamma$-weak learner $A_\gamma$
- Manage a list of weak predictors which $A_\gamma$ has output in previous rounds
- At every step, hand $A_\gamma$ the training sequence $S$ along with some distribution $D^{(t)}$ over $S$, and have it output a $\gamma$-weak predictor $h_{t+1}$ on the problem $(S,D^{(t)})$, where each point in $S$ is taken into account proportional to its probability mass.
- Stop at some point and output a linear combination of the $h_i$

The particular algorithm we will construct is called *AdaBoost*, where "Ada" doesn't have any relation to the programming language, but simply means "adaptive".

Let's first look into how to define our probability distribution, which will be the most complicated part of the algorithm. Suppose we have our current distribution $D^{(t)}$ based on past predictors $h_1,\dots,h_{t-1}$ output by $A_\gamma$, and suppose further that we have computed weights $w_1,\dots,w_{t-1}$ such that $w_i$ measures the quality of $h_i$ (higher is better). Now we receive a new predictor $h_t$ with quality $w_t$. Then we can define a new probability distribution $D^{(t+1)}$ by letting

$$D^{(t+1)}((x_i,y_i))\propto D^{(t)}((x_i,y_i))\cdot e^{-w_ty_ih_t(x_i)}\quad\forall i\in[m]$$

where we write ∝ rather than = because the term isn't normalized; it will equal the above scaled such that all probabilities sum to 1.

The term $y_ih_t(x_i)$ is 1 iff predictor $h_t$ classified $x_i$ correctly (and $-1$ otherwise). Thus, the right component of the product equals $e^{-w_t}$ iff the point was classified correctly, and $e^{w_t}$ if it wasn't. If $h_t$ is a bad predictor and $w_t$ is small, say $10^{-3}$, the two terms are both close to 1, and we don't end up changing our weight on $(x_i,y_i)$ very much. But if $h_t$ is good and $w_t$ is large, the old weight $D^{(t)}((x_i,y_i))$ will be scaled significantly upward (if it got the point wrong) or downward (if it got the point right). In our second example, the middle hyperplane performs quite well on the uniform distribution, so $w_2$ should be reasonably high, which will cause the probability mass on the right cluster to increase and on the two other clusters to decrease. If this is enough to make the right cluster dominate the computation, then the weak learner might output the right hyperplane next. If not, it might output the second hyperplane again. Eventually, the weights will have shifted enough for the third hyperplane to become feasible.

IV

Now let's look at the weights. Let $\epsilon_t=\ell_S^{0\text{-}1}(h_t)$ be the usual empirical error of $h_t$, i.e., $\epsilon_t=\frac{1}{m}|\{(x,y)\in S\mid h_t(x)\neq y\}|$. We would like $w_t$ to be a real number that starts close to 0 for $\epsilon_t$ close to $\frac{1}{2}$ and grows without bound as $\epsilon_t$ approaches 0. One possible choice is $w_t:=\frac{1}{2}\ln\left(\frac{1}{\epsilon_t}-1\right)$. You can verify that it has these properties – in particular, recall that $h_t$ is output by a weak learner so that its error is bounded away from $\frac{1}{2}$ by at least $\gamma$. Because of this, $\frac{1}{\epsilon_t}$ is larger than 2 so that $\frac{1}{\epsilon_t}-1$ is larger than 1 and $w_t$ is larger than 0.
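A small worked example may help with both formulas. The sketch below assumes four training points under the uniform distribution and a predictor that gets exactly the last one wrong (so its error is 0.25); the function names `quality` and `reweight` are made up for this illustration.

```python
import math

def quality(eps):
    """w_t = 1/2 * ln(1/eps - 1), for an error 0 < eps < 1/2."""
    return 0.5 * math.log(1.0 / eps - 1.0)

def reweight(D, labels, predictions, w):
    """One distribution update: scale by exp(-w * y_i * h(x_i)), then normalize."""
    scaled = [d * math.exp(-w * y * p) for d, y, p in zip(D, labels, predictions)]
    total = sum(scaled)
    return [s / total for s in scaled]

D = [0.25, 0.25, 0.25, 0.25]   # uniform starting distribution over four points
labels = [1, 1, -1, -1]
predictions = [1, 1, -1, 1]    # the predictor is wrong only on the last point
eps = 0.25                     # its error under D

w = quality(eps)                               # 0.5 * ln(3), roughly 0.55
new_D = reweight(D, labels, predictions, w)    # roughly [1/6, 1/6, 1/6, 1/2]
```

The three correctly classified points drop from 1/4 to 1/6 each, and the misclassified point rises from 1/4 to 1/2. For comparison, `quality(0.499)` is about 0.002 (almost no reweighting), while `quality(0.001)` is about 3.45.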

V

To summarize,

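**AdaBoost** ($A_\gamma$ : weak learner, $S$ : training sequence, $T:\mathbb{N}^+$)

$D^{(0)}\leftarrow\left(\frac{1}{m},\dots,\frac{1}{m}\right)$

**for** ($t\leftarrow1$ to $T$) **do**

$h_t\leftarrow A_\gamma(S,D^{(t-1)})$

$\epsilon_t\leftarrow\frac{1}{m}|\{(x,y)\in S\mid h_t(x)\neq y\}|$

$w_t\leftarrow\frac{1}{2}\ln\left(\frac{1}{\epsilon_t}-1\right)$

$D^{(t)}\leftarrow\text{normalize}\left(\left(D^{(t-1)}_i\cdot e^{-w_ty_ih_t(x_i)}\right)_{i\in[m]}\right)$

**endfor**

**return** $\sum_{t=1}^Tw_th_t$

**end**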
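The listing above can be turned into a short runnable sketch. Several things here are assumptions rather than part of the post: the weak learner is a one-dimensional decision stump chosen by exhaustive search, the toy data set is made up, and the error $\epsilon_t$ is computed as the $D$-weighted error (the usual AdaBoost convention) rather than the unweighted count written in the listing.

```python
import math

def stump_predict(stump, x):
    """A decision stump (theta, s) predicts s for x < theta and -s otherwise."""
    theta, s = stump
    return s if x < theta else -s

def weak_learner(xs, ys, D):
    """Return the stump with the smallest D-weighted error (exhaustive search)."""
    pts = sorted(set(xs))
    thresholds = [pts[0] - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    best, best_err = None, float("inf")
    for theta in thresholds:
        for s in (1, -1):
            err = sum(d for x, y, d in zip(xs, ys, D)
                      if stump_predict((theta, s), x) != y)
            if err < best_err:
                best, best_err = (theta, s), err
    return best

def adaboost(xs, ys, T):
    m = len(xs)
    D = [1.0 / m] * m                          # D^(0): uniform distribution
    ensemble = []
    for _ in range(T):
        h = weak_learner(xs, ys, D)            # h_t
        eps = sum(d for x, y, d in zip(xs, ys, D) if stump_predict(h, x) != y)
        eps = min(max(eps, 1e-12), 1 - 1e-12)  # guard against log(0) and division by zero
        w = 0.5 * math.log(1.0 / eps - 1.0)    # w_t
        D = [d * math.exp(-w * y * stump_predict(h, x)) for x, y, d in zip(xs, ys, D)]
        total = sum(D)
        D = [d / total for d in D]             # normalize to get D^(t)
        ensemble.append((w, h))
    return ensemble

def predict(ensemble, x):
    """Sign of the weighted vote sum_t w_t h_t(x)."""
    score = sum(w * stump_predict(h, x) for w, h in ensemble)
    return 1 if score > 0 else -1

# Toy 1-d data with a +,-,+ pattern that no single stump classifies correctly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, T=3)
train_errors = sum(predict(ensemble, x) != y for x, y in zip(xs, ys))
```

On this data the best single stump misclassifies two of the six points, while the weighted vote of three stumps classifies all six correctly.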

If one assumes that $A_\gamma$ always returns a predictor with error at most $\frac{1}{2}-\gamma$ (recall that it may fail with probability $\delta$), one can derive a bound on the error of the output predictor. Fortunately, the dependence of the sample complexity on $\delta$ is only logarithmic, so $\delta$ can probably be pushed low enough that $A_\gamma$ is unlikely to fail even if it is called $T$ times.

Now the error bound one can derive is $e^{-2\gamma^2T}$. Looking at this, it has exactly the properties one would expect: a higher $\gamma$ pushes the error down, and so do more rounds of the algorithm. On the other hand, doing more rounds increases the chance of overfitting to random quirks in the training data. Thus, the parameter $T$ allows one to balance the overfitting vs. underfitting tradeoff, which is another nice thing about AdaBoost.

The book mentions that Boosting has been successfully applied to the task of classifying gray-scale images into 'contains a human face' and 'doesn't contain a human face'. This implies that human faces can be recognized using a set of quantitative rules – but, importantly, rules which have been generated by an algorithm rather than constructed by hand. (In that case, the weak learner did not return hyperplanes, but simple predictors of another form.) In this case, the result fits with my intuition (that face recognition is the kind of task where a set-of-rules approach will work). It would be interesting to know how well boosting performs on other problems.
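Returning to the bound $e^{-2\gamma^2T}$: solving $e^{-2\gamma^2T}\le\varepsilon$ for $T$ gives $T\ge\ln(1/\varepsilon)/(2\gamma^2)$. The sketch below just evaluates this; the function name and the sample values ($\gamma=0.1$, target $0.001$) are assumptions for illustration.

```python
import math

def error_bound(gamma, T):
    """The bound exp(-2 * gamma^2 * T) on the error of the boosted predictor."""
    return math.exp(-2.0 * gamma ** 2 * T)

def rounds_needed(gamma, target):
    """Smallest T for which the bound is at most `target`."""
    return math.ceil(math.log(1.0 / target) / (2.0 * gamma ** 2))

T = rounds_needed(0.1, 0.001)   # 346 rounds for gamma = 0.1
```

Because the required number of rounds scales with $1/\gamma^2$, halving the edge $\gamma$ of the weak learner roughly quadruples the rounds needed for the same bound.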


### A point of clarification on infohazard terminology

*TL;DR: “Infohazard” means any kind of information that could be harmful in some fashion. Let’s use “memetic hazard” to describe information that could specifically harm the person who knows it.*

Some people in my circle like to talk about the idea of information hazards or infohazards, which are dangerous information. This isn’t a fictional concept – Nick Bostrom characterizes a number of different types of infohazards in his 2011 paper that introduces the term (PDF available here). Lots of kinds of information can be dangerous or harmful in some fashion – detailed instructions for making a nuclear bomb. A signal or hint that a person is a member of a marginalized group. An extremist ideology. A spoiler for your favorite TV show. (Listen, an infohazard is a kind of hazard, not a measure of intensity. A papercut is still a kind of injury!)

I’ve been in places where “infohazard” is used in the Bostromian sense casually – to talk about, say, dual-use research of concern in the biological sciences, and describe the specific dangers that might come from publishing procedures or results.

I’ve also been in more esoteric conversations where people use the word “infohazard” to talk about a specific kind of Bostromian information hazard: information that may harm the person *who knows it*. This is a stranger concept, but there are still lots of apparent examples – a catchy earworm. “You just lost the game.” More seriously, an easy method of committing suicide for a suicidal person. A prototypical fictional example is the “basilisk” fractal from David Langford’s 1988 short story BLIT, which kills you if you see it.

This is a subset of the original definition because it is harmful information, but it’s expected to harm the person who knows it in particular. For instance, detailed schematics for a nuclear weapon aren’t really expected to bring harm to a potential weaponeer – the danger is that the weaponeer will use them to harm others. But fully internalizing the information that Amazon will deliver you a 5-pound bag of Swedish Fish *whenever you want* is specifically a danger to you. (…Me.)

This disparate use of terms is confusing. I think Bostrom and his intellectual kith get the broader definition of “infohazard”, since they coined the word and are actually using it professionally.*

I propose we call the second thing – information that harms the knower – a memetic hazard.

*Pictured: Substantiated example of an infohazard. Something something red herrings.*

This term is shamelessly borrowed from the SCP Foundation, which uses it the same way in fiction. I figure the usage can’t make the concept sound any *more* weird and sci-fi than it already does.

(Memetic hazards don’t have to be hazardous to everybody. Someone who hates Swedish Fish is not going to spend all their money buying bags of Swedish Fish off of Amazon and diving into them like Scrooge McDuck. For someone who loves Swedish Fish – well, no comment. I’d call this “a potential memetic hazard” if you were to yell it into a crowd with unknown opinions on Swedish Fish.)

Anyways, hope that clears things up.

*For a published track record of this usage, see: an academic paper from Future of Humanity Institute and Center for Health Security staff, another piece by Bostrom, an opinion piece by esteemed synthetic biologist Kevin Esvelt, a piece on synthetic biology by FHI researcher Cassidy Nelson, a piece by Phil Torres.


### Money isn't real. When you donate money to a charity, how does it actually help?

Money is a shared illusion. Neither paper, gold, nor electron patterns can stop a mosquito or feed a hungry person directly. Transfer of money is a temporary, low-friction way to motivate unidentified strangers to perform an action.

So what's the mechanism by which monetary charity works? Are we "just" exploiting a comparative advantage in dollar-gathering in order to force the people on the ground to behave the way we prefer? Do we have any evidence that it can create channels of behavior that are self-sustaining?

On small (and not-so-small, but not universal) margins, money is a fine way to change some behaviors. I'm wondering if there are ways to shift the equilibria at the core rather than the margin.


### The Case for Artificial Expert Intelligence (AXI): What lies between narrow and general AI?

For years, I've felt that our AI categories have been missing an important step: what comes between narrow AI & general AI. With the rise of synthetic media (especially natural-language generation) and game-playing AI, we're finally forced to confront this architecture.

Based on my lack of knowledge in the field of machine learning and bare observations of OpenAI's GPT-2, I propose a hypothesis: between narrow artificial intelligence and general artificial intelligence, there lies a sort of architecture capable of generalized learning in narrow fields of tasks. I don't claim to be an AI expert or even an amateur. Indeed, I likely lack so much understanding of data science that literally everything I'm about to say is actually wrong on a fundamental level.

But I do feel like, at least when it comes to mainstream discussions of AI, there's a big problem. Several big problems, in fact.

How does media talk about AI? Typically by reducing it to three architectural categories:

**Artificial narrow intelligence (ANI)**. This is AI that can do one thing, and only one thing. If there's a network that you notice does more than one thing, it's actually just a bundle of ANIs all doing different things at the same time. In technical parlance, most of what is called ANI isn't actually artificial intelligence at all: basic scripts, Markov chains, Monte Carlo Tree Searches, conversation trees, stochastic gradient descent, autoencoding, etc. are part of data science & optimization more than AI, but this really comes down to the fact that "AI" carries connotations of humanlike cognition. For the sake of this post, we'll consider them AI anyway.

**Artificial general intelligence (AGI)**. The holy grail of data science. The cybernetic messiah. The solution to all our problems (which includes nuking all our problems). This is AI that can do *anything*, presumably as well as a human can.

**Artificial superintelligence (ASI)**. The rapture of the nerds and your new God. This is an AGI on crack, if that crack was also on crack. Take the limits of human intelligence: fusion-ha'ing Einstein, Euler, Newton, Mozart, the whole lot of them. Push human intelligence as far as it can go genetically, to the absolute limit of standard deviations. ASI is everything even further beyond. It's a level of intelligence no human, either living, dead, or to be, will ever attain.

That's all well and good, but surely one can recognize that there's a massive gap there. How do we go from an AI that can do only one thing to an AI that does literally *everything*? Surely there's some intermediate state in between where you have narrow networks that are generalized, but not quite "general AI."

Up until recently, we had no reference for such a thing. It was either the soberingly incapable computer networks of the present or the artificial brains of science fiction.

But then deep learning happened. By itself, deep learning is little more than a more volumetric evolution of perceptrons made possible by modern computing power, such as might be possible with GPUs. Here we are a decade later, and what do we have? Networks and models that are either generalized or possessing generalized capabilities.

Nominally, most of these networks can only do "one" thing, just like any ANI. But unlike other ANIs, they can learn to do something else that's either closely related to or a direct outgrowth of that thing.

For example: MuZero from DeepMind. This one network has mastered over 50 different games. Even AlphaZero qualified, as it could play three different games. Of course, it still has to be retrained to play these different games as far as I know.

There's another example, this one as a "rooted in a narrow thread, and sprouting into multiple areas" deal: GPT-2. Natural language generation is probably as narrow of a task as you can get: generate data in natural language. But from this narrow task, you can see a very wide range of generalized results. By itself, it has to be trained to do certain things, so the training data determines whether it does any specific thing at this juncture. But as it turns out (and even surprising me), there's a lot that this entails. Natural-language processing is a very funny thing: because digital data itself qualifies as a natural language, that means that a theoretical NLG model can do anything on a computer. Write a story, write a song, compose a song, play that song, create art...

And even play a game of chess.

Though GPT-2 can't actually "play" the game, theoretically it would be feasible to get MuZero and GPT-2 to face off against each other.

Why is this important? Because of something I've called the AGI Fallacy. It's a phenomenon where we assume new tech will either only come about with AGI or is unlikely without it.

We're probably familiar with the AI Effect, yes? The gist there is that we assume that a technology, accomplishment, or innovative idea [X] requires "true" artificial intelligence [Y], but once we actually accomplish [X] with [Y], [Y] is no longer [Y]. That might sound esoteric on the surface, but it's simple: once we do something new with AI, it's no longer called "AI". It's just a classifier, a tree search, fuzzy statistics, a Boolean loop, an expert system, or something of that sort.

As a result, I've started translating "NAI" (narrow AI) as "Not AI" because that's what just about any and every narrow AI system is going to be.

It's possible there's a similar issue building with a fallacy that's closely related to (but is not quite) the AI Effect. To explain my hypothesis: take [X] again. It's a Super Task that requires skills far beyond any ANI system today. In order to reliably accomplish [X], we need [Y]— artificial general intelligence. But here's the rub: most experts place the ETA of AGI at around 2045 at the earliest, with actual data scientists leaning much closer to the 2060s at the earliest, with more conservative estimates placing its creation into the 22nd century. [Z] is how many years away this is, and for simplicity's sake, let's presume that [Z] = 50 years.

To simplify: [X] requires [Y], but [Y] is [Z] years away. Therefore, [X] must also be [Z] years away, or at least it's close to it and accomplishing it heralds [Y].

But this isn't the case for almost everything done with AI thus far. As it turns out, a sufficiently advanced narrow AI system was capable of doing things that past researchers were doggedly sure could only be done with general AI.

Of course, there are some classes of things that do require something more generalized, and it's those that people tend to hinge their bets on as being married to AGI. Except if there is a hitherto unrecognized type of AI that can also be generalized but doesn't require the herculean task of creating AGI, even those tasks can be predicted to be solved far ahead of time.

So, say, generating a 5-minute-long video of a photorealistic person talking might seem to require AGI at first. This network has to generate a person, make that person move naturally, generate their text, generate their speech, and then make it coherent over the course of five minutes. How could you do it without AGI? Well, depending on the tools you have, it's possible it's relatively easy.

This can greatly affect future predictions too. If you write something off as requiring AGI and then say that AGI is 50 years away, you then put off that prediction as being 50 years away as well. So if you're concerned about fake videos & movies but think we need AGI to generate them in order for them to be decent or coherent, you're probably going to compartmentalize that concern in the same place as your own natural death or of your grandchildren attending college. It's worth none of your concern in the immediate future, so why bother caring so much about it?

Whereas if you believe that this tech might be here within five years, you're much more apt to act and prepare. If you accept that some AI will be generalized but not completely generalized, you'll be more likely to take seriously the possibility of great upheavals much sooner than commonly considered to be realistic.

It happens to be ridiculously hard to get some people to understand this because, as mentioned, we don't really have any name for that intermediate type of AI and, thus, never discuss it. This even brings some problems because whenever we do talk about "increasingly generalized AI," some types latch onto the "generalized" part of that and think that you're discussing general AI and, thus, believe that we're closer to AGI than we actually are. Or conversely, they say that whatever network you're talking about is the furthest thing from AGI and use that mention of AI generality to shut down the topic since it "deals with science fiction instead of facts."

That's why I really don't like using terms like "proto-AGI" since that makes it sound like we just need to add more power and tasks to make it the full thing when it's really an architectural issue.

Hence why I went with "**artificial expert intelligence**." I forget where I first heard the term, but it was justified by the fact that

- The acronym can be "AXI," which sounds suitably cyberpunk.

- The acronym is original. The other candidate names included "artificial specialized intelligence" (ASI, which is taken) and "artificial networked intelligence" (ANI, which is taken).

The only real drawback is its potential association with expert systems. But generally, I went with "expert" because of the association: experts will have specialized knowledge in a small field of areas, and can explain the relationship in those fields. Not quite a polymath savant that knows everything, and not really a student who has memorized a few equations and definitions to pass some tests.

...ever since roughly around 2015 or so, I started asking myself: "what about AI that can do some things but not everything?" That is, it might be specialized for one specific class of tasks, but it can do several or all of the subtasks within that class. Or, perhaps more simply, it's generalized across a cluster of tasks and capabilities but isn't general AI. It seems so obvious to me that this is the next step in AI, and we even have networks that do this: transformers, for example, specialize in natural-language generation, but from text synthesis, you can also do rudimentary images or organize MIDI files; even with just pure text synthesis, you can generate anything from poems to scripts and everything in between. Normally, you'd need an ANI that specializes in each one of those tasks, and it's true that most transformers right now are trained to do one specifically. But as long as they generate character data, they can theoretically generate more than just words.

This isn't "proto-AGI" or anything close; if anything, it's closer to ANI. But it isn't ANI; it's too generalized to be ANI.

Unfortunately, I have literally zero influence and clout in data science, and my understanding of it all is likely wrong, so it's unlikely this term will ever take off.


### [Link] Beyond the hill: thoughts on stories, forecasting and essay-completeness

*This is a link-post for: https://www.foretold.io/c/1bea107b-6a7f-4f39-a599-0a2d285ae101/n/5ceba5ae-60fc-4bd3-93aa-eeb333a15464*

*---*

*Epistemic status: gesturing at something that feels very important. Based on a true story. Show, don't tell.*

Why are documents and spreadsheets so successful?

Why does code, which is many times more powerful than spreadsheets, have many times fewer users?

I think it's because code forces you not just to express your ideas in code, but also to *think in code*. It imposes constraints on your ontology for thinking.

Having spent the last year working on forecasting, I got some experience with how ontologies can significantly constrain technology projects.

I think such constraints have...

- heavily limited the usefulness of past forecasting efforts
- resulted in broad misconceptions about what forecasting could be used for
- hidden a large space of interesting work that can be unlocked if we solved them

So the link-post is an interactive essay where I attempt to show what solving them might look like in practice, using some technology which is currently not supported on LessWrong.


### "Memento Mori", Said The Confessor

Abstract

The fear of death acts as a sort of master key for introductory rationality concepts. Examining the fear of death ties all the rationality basics together into a coherent framework, including:

- Map/Territory Errors
- Something To Protect
- Keeping Your Identity Small
- Atheism
- X-Risk

**Small brain**: Don't think about death.

**Shining tomograph**: "After I die I'll go to heaven because I'm a good person."

**Expanding brain**: "God isn't real, I find it more comforting to think that this isn't all a test."

**Galaxy Brain**: *Practice dying*.

Discuss

### Bay Winter Solstice seating-scarcity

*tl;dr: For the last few years, the Bay Winter Solstice celebration has sold out (at 240 seats, plus an overflow room). I'm one of the three organizers this year, and am trying to gauge the True Demand.*

*So, would you be inviting friends to Winter Solstice who wouldn't come on their own? And/or have you not gone to Solstice in the last few years due to scarcity of seating?*

The past couple of years Bay Solstice has been held in a planetarium, which is a pretty cool aesthetic, but only fits 240 people. It turns out there's a bigger planetarium in San Francisco (it seats 290, 50 more than last time, and also has a nicer overflow room that seats 100).

**The question is "is 50 more seats *enough*?"**

Last year we ended up fitting everyone who showed up into the room (including overflow people). 20 people who originally said they were coming didn't end up showing up, and 10 people on my Facebook wall said they would have come or brought more people if seating didn't feel scarce.

For the past several years, Bay Winter Solstice attendance has clearly been bottlenecked on venue size, and it seems pretty valuable to have a year where there's _zero_ scarcity, to generally fight any perception of solstice-attendance as scarce, as well as to get a clear sense of what the true demand actually is.

The main alternatives seem to be "Ballrooms and theaters that are reasonably nice but don't really hit any particular Solstice Aesthetic that hard."

It so happens the planetarium is currently on-hold (but not officially booked) for Dec 19th, which is the currently planned date for Bay Solstice Celebration, but we might be able to snatch it away if we move quickly.

I have a lot of uncertainty over whether 290 seats is enough, and am curious about other people's thoughts.

**Does a Nicer Overflow Room Matter?**

I also have some uncertainty about the new planetarium's overflow room. Last year, the overflow room doubled as the childcare room, which wasn't a nice experience for those who were really trying to get "as close to the Dark Solstice aesthetic as possible."

If we went with the SF Planetarium, a) there'd be a childcare room separate from the overflow room, b) the overflow room is really quite nice, nestled right up against the planetarium itself. It has nice mood lighting. It also has a *giant* projection screen composed out of three projectors. I think there's potential to do a legitimately good job livestreaming the event if we put a lot of attention into it. It fits 100 people.

Alternately, it's plausible to maybe just have, like, a whole second Solstice in the overflow room (possibly with a somewhat different vibe).

The last few years the overflow room has been this sad, awkward middle ground of "only a few people go there, most of whom ended up getting to relocate to the planetarium". It seems plausible that if we did a good job with it as a whole second venue that got 50+ people it might feel like a legitimate experience in its own right.

Or that might be pure wishful thinking.

Curious to hear thoughts about all of this.

Discuss

### The case for lifelogging as life extension

Those in the cryonics community want to be frozen upon legal death, in order to preserve the information content in their brain. The hope is that, given good protocol, damage incurred during the freezing process will not destroy enough information about you to prevent people in the future from reconstructing your identity.

As most who want cryonics will understand, death is not an event. Instead, it is a process with intermediate steps. We consider a long-decayed corpse to be dead because it no longer performs the functions associated with a normal living human being, not because any sort of spirit or soul has left the body.

But philosophers have also identified important dilemmas for the view that death is a process rather than an event. If what we call death is simply my body performing different functions, then what do we make of the fact that we also change so much simply due to the passage of time?

I find it easy to believe that I am the 'same person' as I was last night. Enough of the neural pathways are still the same. Memories from my childhood are essentially still identical. My personality has not changed to any significant extent. My values and beliefs remain more-or-less intact.

But every day brings small changes to our identity. To what extent would you say that you are still the 'same person' as you were when you were a child? And to what extent are you still going to be the 'same person' when you get old?

In addition to the gradual changes that happen due to everyday metabolic processes and interactions with the outside world, there is also a more sudden change that may happen to your identity as you get old. By the age of 85, something like 25 to 50 percent of the population will get a form of dementia. Alzheimer's is a very harsh transformation to our connectome.

Ironically, those who are healthiest in their youth will have the highest chance of getting Alzheimer's, as it is typically a disease of the very old rather than the somewhat old. Furthermore, most forecasters expect that as medical technology advances, the rate of Alzheimer's will go *up*, since it's among the hardest diseases to fix with our current paradigm of medical technology, and therefore you won't be as likely to die of the others. And Alzheimer's is just one kind of neurodegenerative disease.

If you care about preserving your current self, and you think that death is a process rather than an event, then it follows that you should want to preserve your current self: memories, personality, beliefs, values, mannerisms etc.

The technology to store the contents of our brains is currently extremely limited and expensive, but we have an alternative. We can store external information about ourselves, in the form of lifelogging. The type of content we preserve can take a variety of forms, such as text, audio and video.

It might seem like preserving an audio of your voice will do little to restore your identity. But that might not be the case. If you are cryopreserved, then much of your connectome will be preserved anyway. The primary value of preserving external information is to 'fill in the blanks' so to speak.

For example, the most famous symptom of Alzheimer's is memory loss. This occurs because the hippocampus, the primary component of our brain responsible for storing long-term memories, shrinks radically during the progression of the disease. If you consider memory to be important to your identity, then preserving external information about you could help function as an artificial memory source.

What I'm trying to say is that if death is a process, it's not correct to say that you will either be revived or not in the future, like a binary event. Rather, *part* of you will be revived. How much that part resembles you depends on how much information about you is preserved.

There are many clever methods I currently see for how future civilization could reconstruct your identity using your cryopreserved brain contents, and external memory together. If you can't see how the external memory helps at all, then I consider that a fault of imagination.

Some will object by saying that lifelogging is *embarrassing*, as you are carrying a camera or audio recording device wherever you go. Indeed, most of the reason why people don't sign up for cryonics in the first place is because they fear that their peers will not approve. Lifelogging makes this dire situation worse. But I think there are steps you can take to make the appeal better.

The more information you preserve now, the better. There's no sharp cutoff point between having too little information and having just enough. If you feel uncomfortable walking around with a camera (and who wouldn't?) you don't have to. But consider taking small steps. Perhaps when you are in a video call with someone, ask them if they are OK with you recording it and later storing it as an mp3 on a hard disk. Or maybe you could write more of your personal thoughts into documents, and upload them to Google Drive.

Little actions like that could add up, or not. I claim no silver bullet.

One of the worst parts of death is how terrible we are at motivating ourselves to avoid it. Among people who say they are interested in signing up for cryonics, only a small fraction end up signing the paperwork. And among those who do, the number who get preserved in optimal conditions is far too low. It seems that outside pressure from society is simply too powerful.

But as indicated by the Asch conformity experiments, the best way to overcome societal pressure is by having peers that agree with and encourage you. If just a few people took this post seriously, this could be enough to puncture the equilibrium, and perhaps a lot of people will be interested in recording their lives. Who knows?

Discuss

### What Money Cannot Buy

The problem is, if you're not a hacker, you can't tell who the good hackers are. A similar problem explains why American cars are so ugly. I call it the design paradox. You might think that you could make your products beautiful just by hiring a great designer to design them. But if you yourself don't have good taste, how are you going to recognize a good designer? By definition you can't tell from his portfolio. And you can't go by the awards he's won or the jobs he's had, because in design, as in most fields, those tend to be driven by fashion and schmoozing, with actual ability a distant third. There's no way around it: you can't manage a process intended to produce beautiful things without knowing what beautiful is. American cars are ugly because American car companies are run by people with bad taste.

I don’t know how much I believe this claim about cars, but I certainly believe it about software. A startup without a technical cofounder will usually produce bad software, because someone without software engineering skills does not know how to recognize such skills in someone else. The world is full of bad-to-mediocre “software engineers” who do not produce good software. If you don’t already know a fair bit about software engineering, you will not be able to distinguish them from the people who really know what they’re doing.

Same with user interface design. I’ve worked with a CEO who was good at UI; both the process and the results were visibly superior to others I’ve worked with. But if you don’t already know what good UI design looks like, you’d have no idea - good design is largely invisible.

Yudkowsky makes the case that the same applies to security: you can’t build a secure product with novel requirements without having a security expert as a founder. The world is full of “security experts” who do not, in fact, produce secure systems - I’ve met such people. (I believe they mostly make money by helping companies visibly pretend to have made a real effort at security, which is useful in the event of a lawsuit.) If you don’t already know a fair bit about security, you will not be able to distinguish such people from the people who really know what they’re doing.

But to really drive home the point, we need to go back to 1774.

As the American Revolution was heating up, a wave of smallpox was raging on the other side of the Atlantic. An English dairy farmer named Benjamin Jesty was concerned for his wife and children. He was not concerned for himself, though - he had previously contracted cowpox. Cowpox was contracted by milking infected cows, and was well known among dairy farmers to convey immunity against smallpox.

Unfortunately, neither Jesty’s wife nor his two children had any such advantage. When smallpox began to pop up in Dorset, Jesty decided to take drastic action. He took his family to a nearby farm with a cowpox-infected cow, scratched their arms, and wiped pus from the infected cow on the scratches. Over the next few days, their arms grew somewhat inflamed and they suffered the mild symptoms of cowpox - but it quickly passed. As the wave of smallpox passed through the town, none of the three were infected. Throughout the rest of their lives, through multiple waves of smallpox, they were immune.

The same technique would be popularized twenty years later by Edward Jenner, marking the first vaccine and the beginning of modern medicine.

The same wave of smallpox which ran across England in 1774 also made its way across Europe. In May, it reached Louis XV, King of France. Despite the wealth of a major government and the talents of Europe’s most respected doctors, Louis XV died of smallpox on May 10, 1774.

The point: there is knowledge for which money cannot substitute. Even if Louis XV had offered a large monetary bounty for ways to immunize himself against the pox, he would have had no way to distinguish Benjamin Jesty from the endless crowd of snake-oil sellers and faith healers and humoral balancers. Indeed, top medical “experts” of the time would likely have warned him *away* from Jesty.

The general pattern:

- Take a field in which it’s hard for non-experts to judge performance
- Add lots of people who *claim* to be experts (and may even believe that themselves)
- Result: someone who is not already an expert will not be able to buy good performance, even if they throw lots of money at the problem

Now, presumably we can get around this problem by investing the time and effort to become an expert, right? Nope! Where there are snake-oil salesmen, there will also be people offering to teach their secret snake-oil recipe, so that you too can become a master snake-oil maker.

So… what *can* we do?

The cheapest first step is to do some basic reading on a few different viewpoints and think things through for yourself. Simply reading the “correct horse battery staple” xkcd will be sufficient to recognize a surprising number of really bad “security experts”. It probably won’t get you to a level where you can distinguish the best from the middling - I don’t think I can currently distinguish the best from the middling security experts. But it’s a start.

More generally: it’s often easier to tell which of multiple supposed experts is correct, than to figure everything out from first principles yourself. Besides looking at the object-level product, this often involves looking at incentives in the broader system - see e.g. Inadequate Equilibria. Two specific incentive-based heuristics:

- Skin in the game is a good sign - Jesty wanted to save his own family, for instance.
- Decoupling from external monetary incentives is useful - in other words, look for hobbyists. People at a classic car meetup or a track day will probably have better taste in car design than the J.D. Power awards.

That said, remember the main message: there is no full substitute for being an expert yourself. Heuristics about incentives can help, but they’re leaky filters at best.

Which brings us to the ultimate solution: try it yourself. Spend time in the field, practicing the relevant skills first-hand; see both what works and what makes sense. Collect data; run trials. See what other people suggest and test those things yourself. Directly study which things actually produce good results.

Discuss

### Effective Altruism QALY workshop materials & outline (and Jan 13 '19 meetup notes)

What the workshop is & a brief overview

Hi all! This is my first post on LessWrong, but I'll be posting here more often with conversation notes & resources from the Effective Altruism Kansas City meetup group.

This workshop was an experiment to give participants an intellectual & intuitive understanding of how EAs typically prioritize between charities - the QALY (and DALY). Participants were told that they're the board for the Hypothetical Foundation and needed to choose a charity to fund between three options to best serve the residents of Hypothetical Town. Discussion is guided from initial impressions, to how to quantify well-being, to how to estimate DALYs, and finally, how to use DALYs to compare charities.

The workshop was fun & engaging, but I've got a few post-mortem revision notes in case anyone wants to use these materials:

- Have participants draw their own QALY boxes
- Have participants intuit QALY shapes of existing charity models
- Walk through an example calculation before having participants do it
- Give a worksheet that lays out steps nicely and walks participants through an example calculation

- 18:15 TALK: Introduce the foundation board meeting premise and the fake charities we'll be evaluating.
  - All charity operations in Hypothetown have room to scale according to the possible additional funding (to simplify the decision).
- 18:20 BREAKOUT: Brief discussion of which charities they'd pick based on information they have
- 18:25 DISCUSS: On what basis would they pick charities? Write ideas on the board.
  - Lead them to the essential question: "How many are helped, and by how much?"
- 18:30 TALK: Chief Philosophy Officer Josh Rainwater presents ways to think about how people are helped and by how much
  - Will Josh mention the various ways to calculate utility? Should I ask him to?
- 18:40 BREAKOUT: Have them fill out quality of life forms individually then discuss responses with their partner(s)
  - Printout link: https://docs.google.com/document/d/18KpMd2qqg5JbVWatcvfxFTTFWdlKHvrEaO0LPM4NasQ/edit?usp=sharing
- 18:50 DISCUSS: Hand out charity infosheets. Discuss how to incorporate improved life & lengthened life into a single number. Aim for a QALY-like conclusion.
  - "Ok, board, we've done some due diligence on these charities - I've prepared decision briefs for each of you." Pass out brief infosheets on the two hypothetical charities. These sheets say how many people are afflicted by the problem and how much it costs to help each person.
  - Seize on & develop the idea of a QALY (and really name the term for the first time)
- 18:55 DISCUSS: Draw a QALY rectangle on the whiteboard as an example, then have participants tell me what rectangles to draw for the remaining charities based on their infosheets.
- 19:00 BREAKOUT: Have groups calculate QALY cost estimates.
  - "Ok, now we know that charity A improves 5 QALYs per person and charity B does 10. But how much does it cost?"
  - Breakout on the cost analysis - make the number require super light calculation
  - End the breakout with a review of the numbers - first dollars per person helped, then dollars per QALY or something like it
- 19:10 DISCUSS: Return from breakout and share calculated numbers for each hypothetical charity. Add relevant stats & considerations to the board.
- 19:15 DISCUSS: Hold a board vote on who to fund!
- 19:20 TALK: A brief explanation of DALYs, the differences from QALYs, and when to use each
- 19:25 DISCUSS: Open Q&A
- 19:35 BREAKOUT: Interest surveys & brainstorming sessions!
- 20:00 DISCUSS: Brief return to the group to share insights & potential actions.

**Charity A: Prosthetic arms for people who lost arms in freak woodcutting accidents (lend-a-hand)**

- Printout link: https://www.notion.so/Charity-A-Lend-A-Hand-Prosthetic-arms-30694d914bc04c9795f1698bf3b22a57
- Metrics:
  - People helped annually: 1,000
  - Annual budget: High price of some kind ($2,000,000?)
  - Years of life extended per person: 0
  - Quality of life improvement: To be estimated by participants
  - Arms lost at age 40 on average

**Charity B: Sanitation training for citizens (wash-ya-hand)**

- Printout link: https://www.notion.so/Charity-B-Wash-Ya-Hand-Sanitation-training-c1a7389bd8e94f11bd629762c7e92f45
- Metrics:
  - People helped annually: 100,000
  - Annual budget: Low price of some kind ($2,000,000?)
  - Average years of life extended per person: 0.15
  - Quality of life improvement: To be estimated by participants

**Charity C: Free treatment for "Spontaneous Combustion Syndrome", SCS (hand-on-fire)**

- Printout link: https://www.notion.so/Charity-C-Hands-On-Fire-Vaccinations-against-SCS-4493cb390b484c6eae36fe493c4997c6
- Metrics:
  - People helped annually: 1,000
  - Annual budget: Middling price of some kind ($2,000,000?)
  - Average years of life extended per person: 15
  - Quality of life improvement: 0

- See here for full raw notes & ideas on the workshop: https://roamresearch.com/#/app/sams/page/_jYqDkG2p
- Desired takeaways from the workshop:
  - (briefly) why impact evaluations matter
  - What a QALY is
  - How to estimate QALYs given basic information
  - What a DALY is
- Start by presenting the problem and the questions that we want to be able to answer.
- Share that Dan and I will be around during the breakouts to answer questions/provide help, and that participants can just wave us over if they need something
- Apply a clip-on tie to my T-Shirt when we start the board meeting.
- Differences between the charities
  - One helps few people a lot, one helps a lot of people a little
  - Two address problems severe enough that which is worse is an open question
  - One helps only quality of life while another extends only quantity of life
- Charity requirements
  - Need a minimum of three charities to meet the above requirements
  - Charity A: Helps a few people a lot
  - Charity B: Helps a lot of people a little
  - Charity C: Extends life without improving quality. Severity gut reaction similar to A
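
For anyone reusing these materials, here is a minimal sketch of the "dollars per QALY" arithmetic the breakout is aiming at, using the simple rectangle model (quality gain times years, plus years of life added). The people-helped and years-extended figures come from the infosheets above, and the $2,000,000 budget is the placeholder figure listed for all three charities. Everything else is an assumption made up for illustration: the `Charity` class, the 0.1 quality-of-life gain for Charity A, the 40 remaining years after losing an arm at 40, the zero quality gain for Charity B, and counting each added year at full quality. In the workshop itself the participants supply the quality estimates.

```python
from dataclasses import dataclass


@dataclass
class Charity:
    name: str
    people_helped: int        # people helped annually (from the infosheet)
    annual_budget: float      # dollars per year (placeholder figure)
    years_extended: float     # extra years of life per person (from the infosheet)
    quality_gain: float       # ASSUMED gain in quality-of-life weight, 0..1
    years_of_gain: float      # ASSUMED years over which that gain applies

    def qalys_per_person(self, quality_of_added_years: float = 1.0) -> float:
        # Rectangle model: height (quality gain) times width (years),
        # plus added years weighted by their assumed quality.
        improved = self.quality_gain * self.years_of_gain
        extended = self.years_extended * quality_of_added_years
        return improved + extended

    def cost_per_qaly(self) -> float:
        total = self.qalys_per_person() * self.people_helped
        return float("inf") if total == 0 else self.annual_budget / total


# Quality numbers below are hypothetical stand-ins for participant estimates.
CHARITIES = [
    Charity("A: Lend-A-Hand", 1_000, 2_000_000, 0.0, 0.1, 40.0),
    Charity("B: Wash-Ya-Hand", 100_000, 2_000_000, 0.15, 0.0, 0.0),
    Charity("C: Hands-On-Fire", 1_000, 2_000_000, 15.0, 0.0, 0.0),
]

# Cheapest QALYs first.
for c in sorted(CHARITIES, key=Charity.cost_per_qaly):
    print(f"{c.name}: {c.qalys_per_person():.2f} QALYs/person, "
          f"${c.cost_per_qaly():,.0f} per QALY")
```

Changing the assumed quality gain for Charity A is a quick way to show participants how sensitive the ranking is to their own estimates.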

Discuss

### More Rhythm Options

Few instruments do a good job as the only rhythm instrument in a dance band; in my 2014 sample I only saw guitar and fiddle. I can't play guitar for dancing anymore because of my wrists, and the piano has to give up a lot in exchange for its large range. A goal I've had for a long time is to figure out how to get the same full sound from something built around a mandolin.

As a rhythm instrument, the way I play it, the mandolin has a percussive bite and drive that's hard to get with the piano. This drive contributes a lot to the dancing, and is something I really enjoy about a mandolin-piano rhythm section. Take away the piano, though, and everything is high frequency.

I've played with a bunch of ideas here for augmenting my mandolin playing:

DIY organ pedals.

Build a computer vision system that maps from hand shape and position to chord, and then choose bass notes from the chord. Trigger the bass notes with foot pedals.

Make a hat with a tilt sensor, and use head angle to choose bass notes. Foot pedals as before.

Use vocals, perhaps processed, to fill out the sound.

Whistle into a microphone, which controls a bass synthesizer, so I can whistle bass lines.

Recently I tried a new combination:

Whistle into a microphone to select bass notes, trigger the bass notes with foot pedals.

(youtube)

I'm running my standalone pitch detector which translates the whistling into MIDI, with pitch bend to send fractional pitch. I tell my MIDI router what key and mode I'm in, and it listens for I, IV, V, and either vi (minor) or VII (mixo) by picking the nearest option. I have this driving both a bass that's triggered by the foot pedals, and an atmospheric droney pad that just runs. I have the pad set to only change notes on a pedal tap, however.

It's not as flexible as the bass whistle, because I need to choose in advance what key and mode to play in and it only does four bass notes, but it also is much less likely to make weird awkward noises when I screw up slightly.
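
As a rough sketch of the "pick the nearest option" step (not the actual router code), the idea could look something like this. The function names, the semitone offsets (I = 0, IV = 5, V = 7, vi = 9, and VII taken as 10 semitones above the tonic for mixolydian), and the choice to compare pitch classes rather than absolute octaves are all assumptions made for illustration. The input is a fractional MIDI note number, as you would get by combining a note with its pitch bend.

```python
# Assumed semitone offsets above the tonic for the always-available degrees.
DEGREE_OFFSETS = {"I": 0, "IV": 5, "V": 7}

# Assumed fourth option per mode: vi in "minor", VII in "mixo".
FOURTH_OPTION = {"minor": ("vi", 9), "mixo": ("VII", 10)}


def circular_distance(a: float, b: float) -> float:
    """Distance in semitones between two pitch classes, wrapping at the octave."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def nearest_degree(midi_pitch: float, tonic_pitch_class: int, mode: str) -> str:
    """Map a fractional MIDI pitch to the closest allowed scale degree."""
    options = dict(DEGREE_OFFSETS)
    name, offset = FOURTH_OPTION[mode]
    options[name] = offset
    # Position of the whistled note relative to the tonic, ignoring octave.
    relative = (midi_pitch - tonic_pitch_class) % 12
    return min(options, key=lambda deg: circular_distance(relative, options[deg]))


# Example: key of D (pitch class 2), a slightly sharp whistled A.
print(nearest_degree(81.3, 2, "minor"))
```

Snapping to a small set of degrees is what makes this forgiving: a whistle that is off by a semitone or so still lands on one of the four bass notes.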

*Comment via: facebook*

Discuss