I have been trying to understand all the iterative/recursive approaches to AI alignment. The approaches I am aware of are:
- ALBA (I get the vague impression that this has been superseded by iterated amplification, so I haven't looked into it)
- HCH (weak and strong)
- Iterated amplification (IDA)
- (Recursive) reward modeling
- Factored cognition
- Factored evaluation
(I think that some of these, like HCH, aren't strictly speaking an approach to AI alignment, but they are still iterative/recursive things discussed in the context of AI alignment, so I want to better understand them.)
One way of phrasing what I am trying to do is to come up with a "minimal set" of parameters/dimensions along which to compare these different approaches, so that I can take a basic template, then set the parameters to obtain each of the above approaches as an instance.
Here are the parameters/dimensions that I have come up with so far:
- capability of agents: I think in HCH, the agents are human-level. In the other approaches, my understanding is that the capability of the agents increases as more and more rounds of amplification/distillation take place.
- allowed communication: It seems like weak and strong HCH differ in the kind of communication that is allowed between the assistants (with strong HCH allowing more flexible communication). Within IDA, there is low bandwidth vs high bandwidth oversight, which seems like a similar parameter. I'm not sure what the other approaches allow.
- training method during distillation step: I think IDA leaves the training method flexible. According to this post, factored cognition seems to use imitation learning and factored evaluation seems to use reinforcement learning. I think recursive reward modeling also uses reinforcement learning. HCH seems to be just about the amplification step (?), so no training method is used. I'm not sure about the others.
- entity who "splits the questions", coordinates everything during amplification, or selects the branches: In factored cognition, factored evaluation, IDA, and HCH, it seems like the human splits the questions. In Debate, the branches are chosen by the two AIs in the debate (who are in an adversarial relationship).
- entity who does the evaluation/gives feedback ("the overseer"): It seems like in factored evaluation, the human gives feedback. In Debate, the final judgment is provided by the human. My understanding is that in IDA, the nature of the overseer is flexible ("For example, Arthur could advise Hugh on how to define a better overseer; Arthur could offer advice in real-time to help Hugh be a better overseer; or Arthur could directly act as an overseer for his more powerful successor").
- what the overseer does (i.e. what kind of feedback is provided): I think the overseer can be passive/active depending on the distillation method (see my comment here), so maybe this parameter isn't required in a "minimal set".
- number of human feedback events required per round: In Debate, there is one piece of feedback at the end of a debate round. In factored evaluation, it seems like the human must provide feedback at each node in the question tree (or a separate human at each node in the question tree).
- depth of recursion: It seems like IDA limits the depth of the recursion to one step, whereas the other approaches seem to allow arbitrary depth (see my comment here).
- separation of task performance vs evaluation/oversight: It seems like in factored evaluation, there is an entity who does the task itself (the experts at the bottom of this diagram), and a separate entity who evaluates the work of the experts (the "factored evaluation" box in the same diagram), but in factored cognition, there is just the entity doing the task.
I would appreciate hearing about more parameters/dimensions that I have missed, and also any help filling in some of the values for the parameters (including corrections to any of my speculations above).
Ideally, there would be a table with the parameters as columns and each of the approaches as rows (possibly transposed), like the table in this post. I would be willing to produce such a write-up assuming I am able to fill in enough of the values that it becomes useful.
If anyone thinks this kind of comparison is useless/framed in the wrong way, please let me know. (I would also want to know what the correct framing is!)
This is the third post in the Arguing Well sequence, but it can be understood on its own. This post is influenced by double crux, is that your true rejection, and this one really nice street epistemology guy.
Al the atheist tells you, “I don’t believe in the Bible, I mean, there’s no way they could fit all the animals on the boat”
So you sit Al down and give the most convincing argument on it indeed being perfectly plausible, answering every counterpoint he could throw at you, and providing mountains of evidence in favor. After a few days you actually managed to convince Al that it’s plausible. Triumphantly you say, “So you believe in the Bible now, right?”
Al replies, “Oh no, there’s no evidence that a great flood even happened on Earth”
Sometimes when you ask someone why they believe something, they'll give you a fake reason. They'll do this without even realizing they gave you a fake reason! Instead of wasting time arguing points that would never end up convincing them, you can discuss their cruxes.
Before going too deep, here’s a shortcut: ask “if this fact wasn’t true, would you still be just as sure about your belief?”
Ex. 1: Al the atheist tells you, “I don’t believe in the Bible, I mean, there’s no way they could fit all the animals on the boat”
Instead of the wasted argument from before, I’d ask, “If I somehow convinced you that there was a perfectly plausible way to fit all the animals on the boat, would you believe in the Bible?” (This is effective, but there’s something subtly wrong with it, discussed later. Can you guess it?)
Ex. 2: It’s a historical fact that Jesus existed and died on the cross. Josephus and other historical writers wrote about it and they weren’t Christians!
If you didn’t know about those sources, would you still be just as sure that Jesus existed?

General Frame
A crux is an important reason for believing a claim; everything else doesn’t really carry any weight. How would you generalize/frame the common problem in the above two examples? You have 3 minutes.
Note: There is still the question of how much confidence, the ??%, sits behind the claim. Alice could ask, "On a scale from 0 to 100, how confident are you about [Claim]?", which can be a very fun question to ask! If they said "99%", this would allow her to rephrase the crux question using that number.
Using the frame of probability theory, each crux would have a percent of the reason why you believe that claim. For example, say I’m very sure (95%) my friend Bob is the best friend I’ve ever had. 10% for all the good laughs we had, 30% for all the times Bob initiated calling me first/ inviting to hang out, and 60% for that time he let me stay in his guest room for 6 months while I got back on my feet.
What percentage of weight would a crux need to have to be considered a crux? What percentage would you consider a waste of time? Which cruxes would you tackle first?
This is arbitrary, and it may not matter for most people’s purposes. I can say for sure I’d like to avoid anything that has 0% of the belief! But regardless how you define "crux", it makes sense to start with the highest weighted cruxes first and go down from there.
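To make the weighted-crux picture concrete, here is a toy calculation in Python. The numbers and reason names are the made-up Bob example from above, not a real method of eliciting weights:

```python
# Toy model: a belief's confidence spread across weighted reasons (cruxes).
# All numbers and reason names are illustrative only.
confidence = 0.95  # "Bob is the best friend I've ever had"
crux_weights = {
    "good laughs": 0.10,
    "initiates hanging out": 0.30,
    "let me stay 6 months": 0.60,
}

def confidence_without(reason):
    """Rough confidence remaining if one reason turned out to be false."""
    return confidence * (1 - crux_weights[reason])

# Losing the 60% crux shakes the belief far more than losing the 10% one,
# which is why it makes sense to tackle the highest-weighted cruxes first.
print(confidence_without("let me stay 6 months"))
print(confidence_without("good laughs"))
```

A reason whose weight is near 0% is exactly the kind of "fake reason" the shortcut question is designed to expose.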
Ex. 3: Eating meat is fine, I mean, I’m not eating anything that smart anyways.
If I proved that pigs are just as intelligent as dogs, would you still eat pigs?
Ex. 4: The Bible is horrible nonsense. There’s no way a “good” God would have anybody eternally tormented.
“If I proved that the Bible had a believable interpretation in which people were just permanently dead instead of tortured, would it make better sense?”
“What if, after digging into the Greek and early manuscripts, the most believable interpretation is that some people would be punished temporarily, but eventually everyone would be saved?”

Algorithm:
What’s an ideal algorithm for finding cruxes?
1. “Why do you believe in X?”
2. “If that reason was no longer true, would you still be just as sure about X?”
a. If no, you can argue the specifics of that reason using the techniques discussed in this sequence
b. If yes, loop back to 1.
It would sort of go like this:
Bob: "I believe [claim]!"
Alice: "Okay, why do you believe it?"
Bob: "Because of [Reason]!"
Alice: "If [Reason] wasn't true, would you still be just as sure about [Claim]?"
And then we figure out if that ??% coming from [Reason] is a false reason (low/zero percent), or an actual crux (higher percent).
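The loop above can be sketched in Python. The two conversation steps are stand-in functions here, and the scripted answers are a hypothetical rerun of the Al example, not anything executable against a real person:

```python
def find_crux(claim, ask_reason, still_as_sure):
    """Loop until we find a reason whose falsity would actually lower
    confidence in the claim - an actual crux rather than a fake reason.
    ask_reason and still_as_sure stand in for turns of conversation."""
    while True:
        reason = ask_reason(claim)            # 1. "Why do you believe X?"
        if not still_as_sure(claim, reason):  # 2. "If that weren't true..."
            return reason                      # a. crux found: argue this point
        # b. fake reason (they'd be just as sure without it): loop back to 1.

# Hypothetical scripted run of the Al example:
answers = iter(["the animals fit on the boat", "a global flood happened"])
crux = find_crux(
    "the Bible",
    ask_reason=lambda claim: next(answers),
    still_as_sure=lambda claim, r: r == "the animals fit on the boat",
)
print(crux)
```

The boat-capacity point is discarded because Al would be just as sure without it; the flood claim is what the loop returns as the crux.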
If I woke up in a hospital and realized I had dreamed up those 6 months at Bob’s, I wouldn’t be as sure that he was the best friend I’ve ever had, since I would have just lost a major crux/a major reason for believing it.
"If [Reason] wasn't true, would you still be 99% sure about [Claim]?"

Least Convenient World:
What’s the relationship between finding cruxes and the least convenient world?
The least convenient world is meant to prevent finding loopholes in a hard question and any other “avoid directly answering the hard question” technique. It’s a way of finding the crux, to figure out what is actually being valued.
Oftentimes when I'm trying to find someone’s crux and I say, “Imagine a [least convenient world] where your reason is not true. Would you still hold your belief as strongly?”, there’s an objection that the imagined world isn’t true or likely! I can say, “Oh, I’m not trying to say it’s likely to happen, I’m just trying to figure out what you actually care about,” and then I find out whether that reason is an actual crux.
(This is covered in Scott’s original post)
The beauty of finding cruxes this way is that you don’t actually have to have concrete information. In Proving Too Much, I need to know a counterexample to prove the logic isn't perfect. In Category Qualifications, I need to know which qualifications for words my audience has in mind to choose which words I use. In False Dilemmas, I need to be able to know what object is being arbitrarily constrained, which qualifications correctly generalize that object, and have real-world information to brainstorm other objects that match those qualifications.
There is still an art to getting someone to understand for the first time the purpose of constructing a least convenient world (“Oh, it's not meant to be realistic, just a tool for introspection!”), but that can be figured out through practice!

Final Exercise Set
Ex. 5: I believe that pornography destroys love, and there’s a lot of scientific studies showing that it has negative effects. [Note: These are mostly all real life examples, and I’m not just weird]
“If I found a very well done study with a large sample size that determined that pornography consistently reduced crime rates without negative side effects, and the entire scientific field agreed that this was a well done study with robust results, would you still believe that pornography is bad?” (In one of the street epistemologist's videos linked above, the guy replied, “Well ya, because I’m a Christian”)
Note how the scientific study didn’t even have to exist to figure out that scientific evidence wasn’t a crux
Ex. 6: I don’t eat meat because of animal suffering
What if it was replicated meat, like on Star Trek? No actual animals involved, just reconfigured atoms forming meat. Would you eat it then?
If you were at someone’s house, you’re hungry, and they ask if you want the leftover meat that they’re about to throw away. Do you eat it?
Ex. 7: I'm actually a really good singer, and I don't know why you're discouraging me from it.
If you heard a recording of yourself and it sounded bad, would you still think you're a good singer?
Ex. 8: My recorded voice never actually sounds like me.
If I recorded my voice and it sounded like me, would you believe that your recorded voice sounds like you?
(These last two examples came from a conversation with my brother in high school. I was the one who thought I was a good singer, haha)

Conclusion
This is my favorite technique to use when talking to anyone about a seriously held belief. It makes it so easy to cut through the superficial/apologetic/"reasonable-sounding" beliefs, and start to truly understand the person I'm talking to, to know what they actually care about. The other techniques in this sequence are useful for sure, but finding the crux of the argument saves time and makes communication tractable (Read: Find cruxes first! Then argue the specific points!)
Final Note: due to other priorities, the Arguing well sequence will be on hiatus. I've learned a tremendous amount writing these last 4 posts and responding to comments (I will still respond to comments!). With these new gears in place, I'm even more excited to solve communication problems and find more accurate truths. After a few [month/year]s testing this out in the real world, I'll be back with an updated model on how to argue well.
Technical Appendix: First safeguard?
This sequence is written to be broadly accessible, although perhaps its focus on capable AI systems assumes familiarity with basic arguments for the importance of AI alignment. The technical appendices are an exception, targeting the technically inclined.
Why do I claim that an impact measure would be "the first proposed safeguard which maybe actually stops a powerful agent with an imperfect objective from ruining things – without assuming anything about the objective"?
The safeguard proposal shouldn't have to say "and here we solve this opaque, hard problem, and then it works". If we have the impact measure, we have the math, and then we have the code.
So what about:
- Quantilizers? This seems to be the most plausible alternative; mild optimization and impact measurement share many properties. But
- What happens if the agent is already powerful? A greater proportion of plans could be catastrophic, since the agent is in a better position to cause them.
- Where does the base distribution come from (opaque, hard problem?), and how do we know it's safe to sample from?
- In the linked paper, Jessica Taylor suggests the idea of learning a human distribution over actions – how robustly would we need to learn this distribution? How numerous are catastrophic plans, and what is a catastrophe, defined without reference to our values in particular? (That definition requires understanding impact!)
- Value learning? But
- We only want this if our (human) values are learned!
- Value learning is impossible without assumptions, and getting good enough assumptions could be really hard. If we don't know if we can get value learning / reward specification right, we'd like safeguards which don't fail because value learning goes wrong. The point of a safeguard is that it can catch you if the main thing falls through; if the safeguard fails because the main thing does, that's pointless.
- Corrigibility? At present, I'm excited about this property because I suspect it has a simple core principle. But
- Even if the system is responsive to correction (and non-manipulative, and whatever other properties we associate with corrigibility), what if we become unable to correct it as a result of early actions (if the agent "moves too quickly", so to speak)?
- Paul Christiano's take on corrigibility is much broader and an exception to this critique.
- What is the core principle?
- The three sections of this sequence will respectively answer three questions:
- Why do we think some things are big deals?
- Why are capable goal-directed AIs incentivized to catastrophically affect us by default?
- How can we build agents without these incentives?
- The first part of this sequence focuses on foundational concepts crucial for understanding the deeper nature of impact. We will not yet be discussing what to implement.
- I'm planning on releasing the first third of the posts over the next few weeks; the remainder will come after some delay.
- I strongly encourage completing the exercises. At times you shall be given a time limit; it’s important to learn not only to reason correctly, but with speed.
- Projects are shovel-ready ways to get your hands dirty doing novel and important work. Collaboration is encouraged – in particular, feel free to message me. The first to rise to the challenge will earn a permanent spot in the sequence.
- My paperclip-Balrog illustration is metaphorical: a good impact measure would hold steadfast against the daunting challenge of formally asking for the right thing from a powerful agent. The illustration does not represent an internal conflict within that agent. As water flows downhill, an impact-penalizing Frank prefers low-impact plans.
- Some of you may have a different conception of impact; I ask that you grasp the thing that I’m pointing to. In doing so, you might come to see your mental algorithm is the same. Ask not “is this what I initially had in mind?”, but rather “does this make sense as a thing-to-call-'impact'?”.
- H/T Rohin Shah for suggesting the three key properties. Alison Bowden contributed several small drawings and enormous help with earlier drafts.
Which topics is this exercise related to?
1. Mitigating cognitive distortions, such as the physical attractiveness stereotype
2. The article "Being the (Pareto) Best in the World"
Two sources are the basis for this exercise:
- the idea (and practices) of Western esotericism concerning the “neutralization of biners” (described quite widely and extensively in the book “The Great Arcana of the Tarot” by Vladimir Shmakov; I don’t know whether the book has been translated into English)
- the method for developing a theme in screenwriting (in particular, “Story: Substance, Structure, Style, and the Principles of Screenwriting” by Robert McKee)
Here, I describe the exercise as I practice it – a cross between the two initial variants, and intended for self-development.
A few words about what esotericism has to do with this:
1. The academic study of esotericism is one of my areas of interest.
2. If we consider the Western esoteric tradition from a psychological point of view, it contains many methods and techniques aimed at the “convergence” and “neutralization” of opposing personality traits, emotions, and ideas. It is enough to recall the alchemical procedures and visualizations in which the elements of fire and water “merge” together. It seems to me that this body of knowledge contains much that is useful and undeservedly forgotten.
By the way, is it rational to have prejudices against “esoterics”? :) And do you have any?

I use the exercise in two versions.
The purpose of the first:
- to reduce the severity of the confrontation between opposing ideas or subagents
- if possible, to achieve their synthesis.
It is like Double Crux, with the differences that:
- the exercise is purely intellectual, without any focusing technique; it does not require you to merge completely with the role and feelings of the investigated subagent
- unlike Double Crux, the opposing subagents do not ‘negotiate,’ but merge, becoming the expression of some new value that they ‘did not understand’ before
- it reduces the initial “weight” of any of that subagent’s arguments (both “for” and “against”).
I use the same version of the exercise to mitigate cognitive biases – such as the physical attractiveness stereotype, a good mood in sunny weather and a bad one in rainy weather, or distortions of self-esteem.
I don’t know how well it works for everyone. I tested it only on myself.
The purpose of the second:
- to “expand” the competence of “neighboring” subagents so that one task or project can cover several “needs” of different subagents (implementing the principle of "Being the (Pareto) Best in the World").
For this, however, it is necessary to identify the areas of maximum utility where the “needs” of the subagents intersect.
The different utility functions of different subagents are best illustrated by the picture in the article “Why Subagents?”, which demonstrates path dependency in preference selection on a 2D graph, with “utility” for subagent 1 on one axis and “utility” for subagent 2 on the other.
In this modification, the exercise helped me to create something like a 3D map of my subagents, as well as to “configure” them in a more optimal way.
Again, I do not know how much my experience is reproducible. I am curious about what result you will get.
In addition to the psychological benefits, the exercise helps you learn to think more broadly, and I find it just plain intellectually fun.

Biner: An Idea I take from esoterics
A “biner” is simply two opposing ideas, or two motives that pull in different directions: Beautiful/Ugly, Knowledge/Ignorance, “I want to do push-ups”/“I don’t want to do push-ups.”
In Western esotericism it is held that only one pole of a biner truly “exists”: the positive pole. Its opposite, the negative pole, emerges automatically when the biner is activated.
To avoid writing “positive pole of the biner” and “negative pole of the biner” over and over, I will use the terms Center (the positive statement) and Shadow (the opposite statement, motive, or idea).
For example, if the Centers are “Beautiful” and “I want ice cream,” their Shadows are “Ugly” and “I need to go to the store for ice cream... no, I guess I don’t want it.”
Between the Center and the Shadow lies a spectrum of “intermediate solutions” – their androgynes:
- Oh, amazing! - Beautiful - Cute - Normal - Unhandsome - Ugly - What a nasty thing!
- “I want to do push-ups!” - “I’ll do it ten times, and then we’ll see.” - “I should do some push-ups.” - “… but not today.” - “... To do push-ups?” - “No, I won’t do this” - “Push-ups are harmful!”.
The complete androgyne is the “central” androgyne; it contains the Center and Shadow statements 50/50.
If we are talking about ideas, they are most weakly manifested in the complete androgyne; if about subagents, then most likely an unstable (or stable) balance of akrasia will arise.
The opposed poles of a biner (ideas or subagents) can be merged through synthesis if you can find and construct what they express at a “higher level.”
Moreover, the highest “energy” – i.e., action potential – belongs to the complete androgyne (the pendulum “I want ice cream”/“I am too lazy to go get it” can at any moment swing into full-fledged action or final rejection), while the synthesis of the biner has the lowest “energy.”
Therefore, the exercise has two variants:
1. I am looking for “synthesis” – if I want to neutralize, or at least assign less “weight” to the preferences of individual subagents.
2. I am looking for a “complete androgyne” – if I know that this area of activity or role is essential to me, and I want to express it in an even more optimal way.
Following the idea that the Shadow does not have an independent existence, I assume that all subagents are always “internally” paired.
In other words, the subagent “I want to do push-ups” and its opposition “You can kill me, but I will not do push-ups!” represent the Center and Shadow of one subagent.
The Shadow is a “force of resistance”; it has no existence of its own. If sport is not in question, it does not affect decision-making at all.
It is the Shadow of a subagent that “engages” other subagents in the decision-making process to gain more weight. “I want to do sports (Sport).” “I want to... but (Shadow of sport) later; right now there is an interesting TV show (Entertainment)... [and later] I have just eaten, and after eating it is forbidden to exercise (Mom-said-so-20-years-ago)!”

The essence of the method:
1. Find the Center of the subagent – its value. The Center is the option to which the subagent has assigned the maximum (from its point of view) utility.
2. Describe the Shadow.
3. Study the Center and the Shadow using specific examples, each as extreme a manifestation as possible.
4. Consider the entire range of androgynes between the Center of the subagent and its Shadow, coming up with a few specific examples.
5A. (Synthesis search exercise) Integrate the subagent and its Shadow (this is the most difficult part, but the funniest). In this aspect, the technique is a bit like Double Crux. As a result, the “weight” of the subagent’s arguments is reduced.
5B. (Complete-androgyne search exercise) Integrate two different “positive” subagents so that they reinforce each other, giving each other a “weight of votes” when making a decision. In this aspect, the technique is similar to the ideas discussed in the article "Being the (Pareto) Best in the World", since it involves changing not the attitude toward the activity, but the action itself, or its emphasis.

The whole method is a constant movement across levels, from abstract to concrete and back from concrete to abstract.
Example 1. Just an intellectual warm-up
Let us choose the abstract idea of Light as our subject.
1. Idea – Light.
2. The opposite – the absence of Light – Darkness.
3. The choice of specific “extremes” determines the direction in which I will continue to think – physical or psychological. If physical, I will choose photon/no photon; if psychological, the ability to see/blindness.
Suppose the first one: photon/absence of photon.
4. The field of all intermediate states is all gradations of the electromagnetic spectrum. The complete androgyne (probably) is a photon with an energy of 50 GeV (50% of the most energetic known photon).
5. Synthesis: something that is both a photon and not a photon, and has a lot of “hidden” energy. My knowledge of physics suggests two options:
- a virtual “photon-antiphoton” pair, disappearing in a negligible time in a false vacuum
- a superposition of two similar light waves 180° out of phase, which, when added, give 0 at a particular point (for example, in a destructive interference experiment)
If we chose the “subjective” plane – the ability to see/blindness – then:
4. The spectrum of androgynes is all possible qualia of color and visual acuity associated with the quality of the ability to see. Complete androgyne: healthy vision.
5. Synthesis: someone who can see everything with absolute precision and clarity – and who is, at the same time, blind.
To resolve the contradiction between “sees” and “blind,” you need to rise a “level higher,” realizing that the ability to see is not a capacity of vision, but a capacity of the mind.
Therefore, the synthesis could be a scientist, a prophet, or a project implementer (all of whom can “see” the causes and consequences of events).
If I were using the exercise to develop a book’s theme, I would also do a “level-down synthesis” – looking for something “worse than the worst” – for example, a prophet who makes a fatal mistake, or a sharp-eyed archer who does not see his wife’s betrayal happening literally “under his nose.”
The development of a character would go through the stages “Center,” “Shadow,” “Synthesis up,” and “Synthesis down” - in any chosen order.
For example, in the first part of the story, the protagonist is a religious leader, confident that he receives prophecies from the deity indicating how to lead his community (value – “Ability to see”). Then he learns that all his predictions were fake, because in reality his closest ally was manipulating him (“Blind Prophet”). As a result of a struggle for power, the main character loses influence and finds himself in an underground dungeon (“Blindness”), where he comes to the values of rational and scientific thinking (“Synthesis”), with the help of which he is freed and regains power. (The End :) )
Example 2. Synthesis of opposing subagents – to reduce the “weight” of particular subagents when making decisions.
For example, suppose there is a choice: to take a job with more pay but less free time, or to keep the free time rather than all those cool things I could buy with the additional money.
And I want to REDUCE THE ACUTENESS OF THE MONEY QUESTION, so as to choose based on other criteria rather than rationalize a decision already made unconsciously.
1. It seems that the Central value for the “activated” subagent is money.
2. I will “stretch” it to extremes to reveal the Center and Shadow more precisely: a multibillionaire and a homeless beggar.
How do they differ? In the ability to acquire and in the quantity of property; i.e., the investigated value is “purchasing power,” or more precisely, the “ability to own” something.
The choice of “specific extremes” is the first psychological marker – what do I mean by “Money”? In this case, it turned out to be “the ability to own.” But for another person (or for me in a different situation), it could be “security,” or “proof of self-worth to society,” or something else.
3. What even more extreme examples can I come up with for “able to own”/“not able to own”?
The ancient Egyptian God-Pharaoh possessed (I think) more power than any modern multibillionaire: he owned everything in his land, even without having to “buy” it.
A slave has less “power to own” than a beggar: he does not own his own body or his own time. A drug addict does not own his desires; a madman does not “own” his consciousness. I choose to focus on the dichotomy “God-Emperor”/“Slave.”
4. Between these two extremes lie various “quantities” of the “ability to own” – from a wealthy person who does not bother to earn money but still knows his limit of allowable expenses, to a person who barely survives, earning pennies.
Somewhere in the middle is the complete androgyne – the “average person” who, in principle, is satisfied with his wealth, with what he owns, and with the amount of free time he has.
5. Synthesis: someone who possesses “everything” + someone who does not even own himself + someone satisfied with the state of things and not wanting more.
As a concrete expression of synthesis, the following come to my mind:
1) The Pauper Mystic – a person who possesses “everything,” but an “everything” beyond the concept of money, who needs nothing of what the world has to offer; he also does not “own himself,” in the sense that he has subordinated himself to an inner spiritual discipline.
2) The Lion King (from the film of the same name, particularly the scene where the old monkey lifts little Simba over the whole valley) – he “owns everything” not because he bought the valley, but because he feels responsible for it and feels himself its owner. And he “does not own” himself, since he will do anything (including self-sacrifice) to protect peace in the valley. (Which may legally “belong” to English colonists, or to any other four-legged individual who has said to herself, “I am responsible for this place.”)
In dramaturgy, “worse than the worst” could be: a god-emperor whom his priests keep on drugs; a beggar-madman who believes he is Napoleon; a god-emperor compelled to obey the most severe daily routine (as in ancient China); a king kidnapped and sold into slavery; a slave secretly seated on the throne, etc. Such extremes catch us emotionally.
Such a study of values is conveniently laid out as a table:
Value: Ability to own (money)
Center: The one who owns everything (the God-Emperor)
Shadow: The one who does not own even himself (the madman, the prisoner)
Androgynes:
- A man who inherits a million and spends it wisely... or unwisely
- A man on the verge of begging
- One who earns a lot, but gives most of his earnings to charity
- One who purposefully builds capital
Complete androgyne: Just an "ordinary" working man, who is happy with everything
Synthesis:
- The Pauper Mystic
- The Lion King
How does all this relate to the main question – should I agree to the better-paid but more time-consuming job or not? (Other considerations aside, of course.)
During the synthesis, it became clear to me: for the decision-voting subagent, “money” is a measure of how much I “own the world,” or at least the part of the world I can call “home” (my district, my city, my country), while maintaining a sense of “contentment” with myself.
But you can “own the world” without any money, simply by changing your status in your own eyes and becoming, like the Lion King, “the owner of the valley.”
As a result, the “money” factor gets less “weight” when deciding on a job, as well as in other decisions where “money” = “acquisition of things.”
One more example: I love learning. Very much. But at the same time, it saddens me that time erases information; I can't remember everything.
This synthesis is not meant to produce a decision, but to reduce internal discomfort. The same technique can be used to reduce cognitive biases when considering ideas such as Beautiful/Ugly, Smart/Stupid, Expensive/Cheap, etc.
- “Eternal student”
- An AI that has all the knowledge in the world. And which is continuously producing new data.
- In sum, the extreme expression of the idea: an algorithm that produces new knowledge
Shadow: Destruction of knowledge
- The knowledge that is not applied, but only accumulated
- The knowledge that prevents the creation of new knowledge
- Prohibition of knowledge
- A noise that absorbs information
- In sum, the extreme expression of the idea: a virus that destroys information
Androgynes:
- A scientist who knows a lot in his field, but is ignorant in other areas, including ordinary life
- A student who uncritically absorbs any information
- A “practical” person who refuses to know more than he needs to live
- Sherlock Holmes, eager to forget that the Earth goes round the Sun
- A priest who knows the religious texts – but does not see that they do not correspond to reality.
- “The map is not the territory.”
Complete androgyne: A person who is continuously learning, testing in practice and structuring his knowledge
Synthesis: something that produces knowledge + something that destroys knowledge
- The search for knowledge by, for example, a genetic algorithm, which discards the imperfect “past generations” of information.
- Search for erroneous or insufficiently accurate information and its clarification
- “Destruction” of masses of particular knowledge by deriving generalizing formulas and principles.
- Translation of (numerical) data into (analytical) knowledge
- Sleep: during sleep, the brain structures information, some (important links and data) goes into permanent memory, other (classified by the brain as unimportant) is erased.
--> The process of acquiring knowledge involves both gaining and losing information. --> Subjective reduction of the “pain” of forgetting.
Pitfalls of the method:
- sometimes it is simply challenging to find a synthesis; you need to give yourself time to think it through, or to look up the necessary background information.
- as a result of a properly executed synthesis, the “weight” of the corresponding subagent's arguments is reduced. But do you want to reduce it? This could be maladaptive.
- if the exercise is performed to remove internal conflicts, then to achieve a permanent effect, you should repeat it with some frequency.
Example 3. Searching for Complete androgyne – to include additional subagents in a specific activity.
The purpose of this exercise is to find activities and projects that are fun and beneficial not only to one subagent, but to as many subagents as possible, and ideally help integrate the individual into a holistic agent.
I assume that the subagent who acts as the Center of desire and his Shadow (the forces of resistance) are both, taken by themselves, destructive forces.
For example, “I want to run,” driven to fanaticism, leads to injury and exhaustion, while “I do not want to run or do any other sport” reduces the duration and quality of life.
At the same time, “I DO NOT WANT” “drags” other “players” to its side (I want to watch a movie; I want to play a game; I want to finish this work project today... hmm, it is already midnight).
As all “stories” with subagents are very subjective, I give here excerpts from my own internal work as an example.
I want to write books. I have wanted this all my life. But at the same time, something has prevented me from doing so and continues to interfere: circumstances, procrastination, work, family, etc.
1. Value – to write fiction novels
2. The opposite: not to write. But if I do not write, then what should I do instead? Watch movies? Work? ... These are not opposites. The actual opposite is to live.
3. What is the real difference between Writing and Living? The way experience is lived: in imagination or bodily; carefully reflected upon and structured, or spontaneous and chaotic.
Extremes: lock yourself in the attic and fantasize – or go on an expedition to the Amazon with only a compass.
- spend six months a year traveling, and six months writing fiction,
- go on safe tourist trips, while lying on your travel blog about incredible adventures along the way,
Complete androgyne: a situation where I experience incredible adventures on odd days, and on even days carefully record and analyze them (while the world stops and waits).
5. What should a Complete androgyne express?
- bright emotions and adventures
- reflection, connections, and structure
- both “external” and “internal” life are equally rich
What other variants of Complete androgyne are possible?:
- develop and train myself to keep a specific kind of diary and a precise vision of the world: notice all the vivid things usually hidden in routine, make sense of them, and organize my life into the “structure” it so lacks. And then, perhaps, release an exciting adventure memoir.
- deliberately invent Adventures, organize them into a story, and then live those adventures
- learn to “live” my own and others’ books with the maximum “effect of presence” (there are NLP techniques for this)
- use imagination to model my life situations, for example by carefully describing the worlds of alternative decisions
- master lucid dreaming, and live in that state everything that will be in my future book.
- maximally gamify my life, turning it into an analog of RPG-fiction
Are there any subagents in my life that ALREADY combine bright emotions and reflection?
- “From the side” of active life, I have martial arts classes and sports competitions.
- “From the side” of observation and the search for “meta-knowledge,” I have the study of esoteric teachings, philosophy, and science, as well as a hobby of diary and narrative techniques.
Thus, these three subagents are already linked to each other, and it is possible:
1) to create one project, which would include interests of all “participants” (albeit not equally)
2) to find benefits for each subagent in the activities of the others, strengthening them. For example, from my top three:
- write fiction
- do karate
- study the esoteric teachings of the world
- The “fiction writer” peers into the esoteric teachings to find something new for his created world or heroes, and into karate to reproduce on paper real martial-arts experience and the proper techniques for describing combat.
- The “esotericist,” with the help of science fiction, “models” what it would be like if this or that axiom about the world were indeed real; and the practice of meditation helps me be more focused and “grounded” in karate.
- The “karateka” models combat interactions using fiction and seeks the spiritual principles of martial arts in the esoteric teachings. I also physically perform the fights I invent for my books; this helps me reinforce new techniques and introduces an element of play into my karate classes.
Thus, by doing one thing – creating a fantasy book – I satisfy the needs of three different subagents, even if for two of them to a lesser degree than for the “main” one (after all, for mastering karate, one hour of karate practice is more useful than one hour of composing a battle scene).
Pitfalls of the method:
- instead of adhering to the Complete androgyne, deviating toward compromises that satisfy none of the subagents.
For example, I could combine “write” + “do karate” = “become a sports journalist.” But the “utility” for “write” and “do karate” in this case would be close to 0. Or “write fiction” + “esoteric interests” (+ “make money”) = “invent my own religion” (as L. Ron Hubbard did). But all my subagents “vote” that this is not what they want.
- instead of trying to fuse opposites into androgynes, “mixing” them by multitasking: reading books during push-ups; fantasizing about the plot of a book when I should be drilling martial-arts techniques, etc.
1. Define the complete androgyne and synthesis of the following ideas:
- Warm sunny weather
2. What are three essential things you did today, or are planning to do?
- What subagents are behind these tasks?
- Define for one of these subagents the field of androgynes and the Complete androgyne's properties
- Based on the Complete androgyne's properties, which other of your subagents could also be ‘involved’ in this task?
- How could this task be changed to fit them more?
1. The idea of the exercise is that opposing ideas/desires/subagents can be merged in two ways:
- “vertically” – looking for something they share at a “level above.” As a result, the “arguments” of the relevant idea or subagent receive less “weight” compared to other ideas and subagents.
- “horizontally” – looking for an activity or a more complex idea that expresses both opposing parts of an idea or subagent equally. In this case, such an androgyne can be “loaded” with meanings relevant to several other subagents.
As a result, you can invent an activity that is more important to the entire ensemble of subagents, and that requires complex competence.
2. Besides, the exercise is also an intellectual challenge. It’s fun to do, especially in a group, when it comes to integrating ideas and searching for the complete androgyne of ideas and deeds.
3. This is my first post here, and I hope the exercise will also be useful for the rationalist community.
Epistemic status: A mental piton, to hang your climbing rope on
The second half hour of cardio you do in a day matters less to your overall happiness and health than the first half hour; the second quarter-hour matters less than the first quarter-hour. Knitting this into our base mental framework is why the law of diminishing returns is such a "Wow!" moment for a lot of beginner econ students.
The problem with consistently applying the law of diminishing returns to everyday life, however, is that it's boring. People don't become socially-renowned Zen masters through 1 pomodoro of samadhi a day. All they get are improvements to their overall mood and the occasional boring state of pure pleasure and equanimity. People don't become [in]famous computer hackers by spending 2, not-especially-well-planned hours a week learning how to automate the boring stuff with Python; they just learn enough to be a little dangerous.
It doesn't feel like you're using a glitch in the Matrix disguised as a pseudoscientific law. But that's exactly what you're doing.
Understand: Most people do not have the discipline to do just 15 minutes of something. They have to do 0, or 45. This is your actual competition.
You can routinely beat the average, in almost every everyday domain, by applying the law of diminishing returns to your own life ruthlessly and consistently. Or perhaps I should say, ruthlessly and consistently enough?
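The concave shape behind this argument can be made concrete with a toy model. Here is a minimal sketch, assuming (purely for illustration) that benefit grows with the square root of time invested; `total_benefit` and `marginal_benefit` are names I've made up:

```python
import math

def total_benefit(minutes: float) -> float:
    """Toy model: benefit grows with the square root of time invested
    (an assumed concave function, purely for illustration)."""
    return math.sqrt(minutes)

def marginal_benefit(minutes: float, step: float = 15) -> float:
    """Extra benefit from the next `step` minutes, given `minutes` already done."""
    return total_benefit(minutes + step) - total_benefit(minutes)

# The first quarter-hour buys far more than the quarter-hour after minute 45:
first = marginal_benefit(0)    # sqrt(15) - sqrt(0)  ~ 3.87
later = marginal_benefit(45)   # sqrt(60) - sqrt(45) ~ 1.04
assert first > 3 * later
```

Any concave function would make the same point; the square root is just the simplest one to write down.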
Early this year, Conor White-Sullivan introduced me to the Zettelkasten method of note-taking. I would say that this significantly increased my research productivity. I’ve been saying “at least 2x”. Naturally, this sort of thing is difficult to quantify. The truth is, I think it may be more like 3x, especially along the dimension of “producing ideas” and also “early-stage development of ideas”. (What I mean by this will become clearer as I describe how I think about research productivity more generally.) However, it is also very possible that the method produces serious biases in the types of ideas produced/developed, which should be considered. (This would be difficult to quantify at the best of times, but also, it should be noted that other factors have dramatically decreased my overall research productivity. So, unfortunately, someone looking in from outside would not see an overall boost. Still, my impression is that it's been very useful.)
I think there are some specific reasons why Zettelkasten has worked so well for me. I’ll try to make those clear, to help readers decide whether it would work for them. However, I honestly didn’t think Zettelkasten sounded like a good idea before I tried it. It only took me about 30 minutes of working with the cards to decide that it was really good. So, if you’re like me, this is a cheap experiment. I think a lot of people should actually try it to see how they like it, even if it sounds terrible.
My plan for this document is to first give a short summary and then an overview of Zettelkasten, so that readers know roughly what I’m talking about, and can possibly experiment with it without reading any further. I’ll then launch into a longer discussion of why it worked well for me, explaining the specific habits which I think contributed, including some descriptions of my previous approaches to keeping research notes. I expect some of this may be useful even if you don’t use Zettelkasten -- if Zettelkasten isn’t for you, maybe these ideas will nonetheless help you to think about optimizing your notes. However, I put it here primarily because I think it will boost the chances of Zettelkasten working for you. It will give you a more concrete picture of how I use Zettelkasten as a thinking tool.
Very Short Summary
Materials
- Staples index-cards-on-a-ring or equivalent, possibly with:
- plastic rings rather than metal
- different 3x5 index cards (I recommend blank, but, other patterns may be good for you) as desired
- some kind of divider
- I use yellow index cards as dividers, but slightly larger cards, tabbed cards, plastic dividers, etc. might be better
- quality hole punch (if you’re using different cards than the pre-punched ones)
- I like this one.
- quality writing instrument -- must suit you, but,
- multi-color click pen recommended
- hi-tec-c coleto especially recommended
- Number pages with alphanumeric strings, so that pages can be sorted hierarchically rather than linearly -- 11a goes between 11 and 12, 11a1 goes between 11a and 11b, et cetera. This allows pages to be easily inserted between other pages without messing up the existing ordering, which makes it much easier to continue topics.
- Use the alphanumeric page identifiers to “hyperlink” pages. This allows sub-topics and tangents to be easily split off into new pages, and also allows for related ideas to be interlinked.
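The hierarchical sort order described above can be sketched in code. This is an illustrative Python snippet (not part of the original method; `sort_key` is a name I've made up) that sorts addresses so that 11a falls between 11 and 12, and 11a1 between 11a and 11b:

```python
import re

def sort_key(address: str):
    """Split an address like '11a1' into alternating numeric/letter parts,
    so that list comparison reproduces the hierarchical ordering:
    a card sorts directly after its prefix (11 < 11a < 11a1 < 11b < 12)."""
    key = []
    for part in re.findall(r"\d+|[a-z]+", address):
        if part.isdigit():
            key.append((0, int(part)))   # numeric level, compared as a number
        else:
            key.append((1, part))        # letter level, compared as text
    return key

cards = ["12", "11", "11a1", "11b", "11a"]
# sorted(cards, key=sort_key) -> ["11", "11a", "11a1", "11b", "12"]
```

Converting numeric parts to integers is what keeps 2 before 11; comparing raw strings would put "11" before "2".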
Before I launch into the proper description of Zettelkasten, here are some other resources on note-taking which I looked at before diving into using Zettelkasten myself. (Feel free to skip this part on a first reading.)
Related Literature
There are other descriptions of Zettelkasten out there. I mainly read How to Take Smart Notes, which is the best book on Zettelkasten as far as I know -- it claims to be the best write-up available in English, anyway. The book contains a thorough description of the technique, plus a lot of “philosophical” stuff which is intended to help you approach it with the right mindset to actually integrate it into your thinking in a useful way. I am sympathetic to this approach, but some of the content seems like bad science to me (such as the description of growth mindset, which didn’t strike me as at all accurate -- I’ve read some of the original research on growth mindset).
An issue with some other write-ups is that they focus on implementing Zettelkasten-like systems digitally. In fact, Conor White-Sullivan, who I’ve already mentioned, is working on a Workflowy/Dynalist-like digital tool for thinking, inspired partially by Zettelkasten (and also by the idea that a Workflowy/Dynalist style tool which is designed explicitly to nudge users into good thinking patterns with awareness of cognitive biases, good practices for argument mapping, etc. could be very valuable). You can take a look at his tool, Roam, here. He also wrote up some thoughts about Zettelkasten in Roam. However, I strongly recommend trying out Zettelkasten on actual note-cards, even if you end up implementing it on a computer. There’s something good about it that I don’t fully understand. As such, I would advise against trusting other people’s attempts to distill what makes Zettelkasten good into a digital format -- better to try it yourself, so that you can then judge whether alternate versions are improvements for you. The version I will describe here is fairly close to the original.
I don’t strongly recommend my own write-up over what’s said in How to Take Smart Notes, particularly the parts which describe the actual technique. I’m writing this up partly just so that there’s an easily linkable document for people to read, and partly because I have some ideas about how to make Zettelkasten work for you (based on my own previous note-taking systems) which are different from the book.
Another source on note-taking which I recommend highly is Lion Kimbro’s How to Make a Complete Map of Every Thought You Think (html, pdf). This is about a completely different system of note-taking, with different goals. However, it contains a wealth of inspiring ideas about note-taking systems, including valuable tips for the raw physical aspects of keeping paper notes. I recommend reading this interview with Lion Kimbro as a “teaser” for the book -- he mentions some things which he didn’t in the actual book, and it serves somewhat as “the missing introduction” to the book. (You can skip the part at the end about wikis if you don’t find it interesting; it is sort of outdated speculation about the future of the web, and it doesn’t get back to talking about the book.) Part of what I love about How to Make a Complete Map of Every Thought You Think is the manic brain-dump writing style -- it is a book which feels very “alive” to me. If you find its style grating rather than engaging, it’s probably not worth you reading through.
Zettelkasten, Part 1: The Basics
Zettelkasten is German for ‘slip-box’, i.e., a box with slips of paper in it. You keep everything on a bunch of note cards. Niklas Luhmann developed the system to take notes on his reading. He went on to be an incredibly prolific social scientist. It is hard to know whether his productivity was tied to Zettelkasten, but others have reported large productivity boosts from the technique as well.
Small Pieces of Paper Are Just Modular Large Pieces of Paper
You may be thinking: aren’t small pieces of paper bad? Aren’t large notebooks just better? Won’t small pages make for small ideas?
What I find is that the drive for larger paper is better-served by splitting things off into new note cards. Note-cards relevant to your current thinking can be spread on a table to get the same big-picture overview which you’d get from a large sheet of paper. Writing on an actual large sheet of paper locks things into place.
“When I was learning to write in my teens, it seemed to me that paper was a prison. Four walls, right? And the ideas were constantly trying to escape. What is a parenthesis but an idea trying to escape? What is a footnote but an idea that tried -- that jumped off the cliff? Because paper enforces single sequence -- and there’s no room for digression -- it imposes a particular kind of order in the very nature of the structure.”
-- Ted Nelson, demonstration of Xanadu space
I use 3x5 index cards. That’s quite small compared to most notebooks. It may be that this is the right size for me only because I already have very small handwriting. I believe Luhmann used larger cards. However, I expected it to be too small. Instead, I found the small cards to be freeing. I strongly recommend trying 3x5 cards before trying with a larger size. In fact, even smaller sizes than this are viable -- one early reader of this write-up decided to use half 3x5 cards, so that they’d fit in mtg deck boxes.
Writing on small cards forces certain habits which would be good even for larger paper, but which I didn’t consider until the small cards made them necessary. It forces ideas to be broken up into simple pieces, which helps to clarify them. Breaking up ideas forces you to link them together explicitly, rather than relying on the linear structure of a notebook to link together chains of thought.
Once you’re forced to adopt a linking system, it becomes natural to use it to “break out of the prison of the page” -- tangents, parentheticals, explanatory remarks, caveats, … everything becomes a new card. This gives your thoughts much more “surface area” to expand upon.
On a computer, this is essentially the wiki-style [[magic link]] which links to a page if the page exists, or creates the page if it doesn’t yet exist -- a critical but all-too-rare feature of note-taking software. Again, though, I strongly recommend trying the system on paper before jumping to a computer; putting yourself in a position where you need to link information like crazy will help you to see the value of it.
This brings us to one of the defining features of the Zettelkasten method: the addressing system, which is how links between cards are established.
Paper Hypertext
We want to use card addresses to organize and reference everything. So, when you start a new card, its address should be the first thing you write -- you never want to have a card go without an address. Choose a consistent location for the addresses, such as the upper right corner. If you’re using multi-color pens, like me, you might want to choose one color just for addresses.
Wiki-style links tend to use the title of a page to reference that page, which works very well on a computer. However, for a pen-and-paper hypertext system, we want to optimize several things:
- Easy lookup: we want to find referenced cards as easily as possible. This entails sorting the cards, so that you don’t have to go digging; finding what you want is as easy as finding a word in the dictionary, or finding a page given the page number.
- Easy to sort: I don’t know about you, but for me, putting things in alphabetical order isn’t the easiest thing. I find myself reciting the alphabet pretty often. So, I don’t really want to sort cards alphabetically by title.
- Easy to write: another reason not to sort alphabetically by title is that you want to reference cards really easily. You probably don’t want to write out full titles, unless you can keep the titles really short.
- Fixed addresses: Whatever we use to reference a card, it must remain fixed. Otherwise, references could break when things change. No one likes broken links!
- Related cards should be near each other. Alphabetical order might put closely related cards very far apart, which gets to be cumbersome as the collection of cards grows -- even if look-up is quite convenient, it is nicer if the related cards are already at hand without purposefully deciding to look them up.
- No preset categories. Creating a system of categories is a common way to place related content together, but, it is too hard to know how you will want to categorize everything ahead of time, and the needs of an addressing system make it too difficult to change your category system later.
One simple solution is to number the cards, and keep them in numerical order. Numbers are easy to sort and find, and are very compact, so that you don’t have the issue of writing out long names. However, although related content will be somewhat nearby (due to the fact that we’re likely to create several cards on a topic at the same time), we can do better.
The essence of the Zettelkasten approach is the use of repeated decimal points, as in “22.3.14” -- cards addressed 2.1, 2.2, 2.2.1 and so on are all thought of as “underneath” the card numbered 2, just as in the familiar subsection-numbering system found in many books and papers. This allows us to insert cards anywhere we want, rather than only at the end, which allows related ideas to be placed near each other much more easily. A card sitting “underneath” another can loosely be thought of as a comment, or a continuation, or an associated thought.
However, for the sake of compactness, Zettelkasten addresses are usually written in an alphanumeric format, so that rather than writing 1.1.1, we would write 1a1; rather than writing 1.2.3, we write 1b3; and so on. This notation allows us to avoid writing so many periods, which grows tiresome.
Alternating between numbers and letters in this way allows us to get to two-digit numbers (and even two-digit letters, if we exhaust the whole alphabet) without needing periods or dashes or any such separators to indicate where one number ends and the next begins.
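The decimal-to-alphanumeric conversion is mechanical enough to sketch in code. This illustrative Python snippet (my own; `to_alphanumeric` is a made-up name, and it assumes letter-level components stay within a-z, i.e. at most 26) maps dotted addresses to the compact form:

```python
from string import ascii_lowercase

def to_alphanumeric(decimal_address: str) -> str:
    """Convert a dotted address like '1.2.3' to Zettelkasten form '1b3'.
    Components alternate: even-indexed levels stay numeric, odd-indexed
    levels become letters (1 -> a, 2 -> b, ...). Assumes letter levels <= 26."""
    out = []
    for i, part in enumerate(decimal_address.split(".")):
        if i % 2 == 0:
            out.append(part)                           # numeric level
        else:
            out.append(ascii_lowercase[int(part) - 1]) # letter level
    return "".join(out)

# '1.1.1' -> '1a1', '1.2.3' -> '1b3', '22.3.14' -> '22c14'
```

Because the levels alternate between digits and letters, the boundaries between components are unambiguous without any separator, which is exactly the point made above.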
Let’s say I’m writing linearly -- something which could go in a notebook. I might start with card 11, say. Then I proceed to card 11a, 11b, 11c, 11d, etc. On each card, I make a note somewhere about the previous and next cards in sequence, so that later I know for sure how to follow the chain via addresses.
Later, I might have a different branch-off thought from 11c. This becomes 11c1. That’s the magic of the system, which you can’t accomplish so easily in a linear notebook: you can just come back and add things. These tangents can grow to be larger than the original.
Don’t get too caught up in what address to give a card to put it near relevant material. A card can be put anywhere in the address system. The point is to make things more convenient for you; nothing else matters. Ideally, the tree would perfectly reflect some kind of conceptual hierarchy; but in practice, card 11c might turn out to be the primary thing, with card 11 just serving as a historical record of what seeded the idea.
Similarly, a linear chain of writing doesn’t have to get a nice linear chain of addresses. I might have a train of thought which goes across cards 11, 11a, 11b, 11b1, 11b1a, 11b1a1, 18, 18a… (I write a lot of “1a1a1a1a”, and it is sometimes better to jump up to a new top-level number to keep the addresses from getting longer.)
Mostly, though, I’ve written less and less in linear chains, and more and more in branching trees. Sometimes a thought just naturally wants to come out linearly. But, this tends to make it more difficult to review later -- the cards aren’t split up into atomic ideas, instead flowing into each other.
If you don’t know where to put something, make it a new top-level card. You can link it to whatever you need via the addressing system, so the cost of putting it in a suboptimal location isn’t worth worrying about too much! You don’t want to be constrained by the ideas you’ve had so far. Or, to put it a different way: it’s like starting a new page in a notebook. Zettelkasten is supposed to be less restrictive than a notebook, not more. Don’t get locked into place by trying to make the addresses perfectly reflect the logical organization.
Physical Issues: Card Storage
Linear notes can be kept in any kind of paper notebook. Nonlinear/modular systems such as Zettelkasten, on the other hand, require some sort of binder-like system where you can insert pages at will. I’ve tried a lot of different things. Binders are typically just less comfortable to write in (because of the rings -- this is another point where the fact that I’m left-handed is very significant, and right-handed readers may have a different experience).
(One thing that’s improved my life is realizing that I can use a binder “backwards” to get essentially the right-hander’s experience -- I write on the “back” of pages, starting from the “end”.)
They’re also bulky; it seems somewhat absurd how much more bulky they are than a notebook of equivalently-sized paper. This is a serious concern if you want to carry them around. (As a general rule, I’ve found that a binder feels roughly equivalent to one-size-larger notebook -- a three-ring binder for 3x5 cards feels like carrying around a deck of 4x6 cards; a binder of A6 paper feels like a notebook of A5 paper; and so on.)
Index cards are often kept in special boxes, which you can get. However, I don’t like this so much? I want a more binder-like thing which I can easily hold in my hands and flip through. Also, boxes are often made to view cards in landscape orientation, but I prefer portrait orientation -- so it’s hard to flip through things and read while they’re still in the box.
Currently, I use the Staples index-cards-on-a-ring which put all the cards on a single ring, and protect them with plastic covers. However, I replace the metal rings (which I find harder to work with) with plastic rings. I also bought a variety of note cards to try -- you can try thicker/thinner paper, colors, line grid, dot grid, etc. If you do this, you’ll need a hole punch, too. I recommend getting a “low force” hole punch; if you just go and buy the cheapest hole punch you can find, it’ll probably be pretty terrible. You want to be fairly consistent with where you punch the holes, but, that wasn’t as important as I expected (it doesn’t matter as much with a one-ring binder in contrast to a three-ring, since you’re not trying to get holes to line up with each other).
I enjoy the ring storage method, because it makes cards really easy to flip through, and I can work on several cards at once by splaying them out (which means I don’t lose my place when I decide to make a new card or make a note on a different one, and don’t have to take things out of sort order to work with them).
I don’t keep the cards perfectly sorted all the time. Instead, I divide things up into sorted and not-yet-sorted:
(Blue in this image means “written on” -- the cards are all actually white except for the yellow divider, although of course you could use colored cards if you like.)
As I write on blank cards, I just leave them where they are, rather than immediately putting them into the sort ordering. I sort them in later.
There is an advantage to this approach beyond the efficiency of sorting things all at once. The unsorted cards are a physical record of what I’m actively working on. Since cards are so small, working on an idea almost always means creating new cards. So, I can easily jump back into whatever I was thinking about last time I handled the binder of cards.
Unless you have a specific new idea you want to think about (in which case you start a new card, or, go find the most closely related cards in your existing pile), there are basically two ways to enter into your card deck: from the front, and from the back. The front is “top-down” (both literally and figuratively), going from bigger ideas to smaller details. It’s more breadth-first. You’re likely to notice an idea which you’ve been neglecting, and start a new branch from it. Starting from the back, on the other hand, is depth-first. You’re continuing to go deeper into a branch which you’ve already developed some depth in.
Don’t sort too often. The unsorted cards are a valuable record of what you’ve been thinking about. I’ve regretted sorting too frequently -- it feels like I have to start over, find the interesting open questions buried in my stack of cards all over again.
In theory, one could also move cards from sorted to unsorted specifically to remind oneself to work on those cards, but I haven’t really used this tactic.
Splitting & Deck Management
When I have much more than 100 filled cards on a ring, I sort all of the cards, and split the deck into two. (Look for a sensible place to split the tree into two -- you want to avoid a deep branch being split up into two separate decks, as much as you can.) Load up the two new decks with 50ish blank cards each, and stick them on new rings.
Everything is still on one big addressing system, so, it is a good idea to label the two new binders with the address range within. I use blank stickers, which I put on the front of each ring binder. The labels serve both to keep lookup easy (I don’t want to be guessing about which binder certain addresses are in), and also, to remind me to limit the addresses within a given deck.
For example, suppose this is my first deck of cards (so before the split, it holds everything). Let’s say there are 30 cards underneath “1”, 20 cards underneath “2”, and then about 50 more cards total, under the numbers 3 through 14.
I would split this deck into a “1 through 2” deck, and a “3 through *” deck -- the * meaning “anything”. You might think it would be “3 through 14”, but, when I make card 15, it would go in that deck. So at any time, you have one deck of cards with no upper bound. On the other hand, when you are working with the “1 - 2” deck, you don’t want to mistakenly make a card 3; you’ve already got a card 3 somewhere. You don’t want duplicate addresses anywhere!
Currently, I have 6 decks: 0 - 1.4, 1.5 - 1.*, 2 - 2.4, 2.5 - 2.*, 3, and 4 - 4.*. (I was foolish when I started my Zettelkasten, and used the decimal system rather than the alphanumeric system. I switched quickly, but all my top-level addresses are still decimal. So, I have a lot of mixed-address cards, such as 1.3a1, 1.5.2a2, 2.6b4a, etc. As for why my numbers start at 0 rather than 1, I’ll discuss that in the “Index & Bibliography” section.)
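Sorting cards by these alternating number/letter addresses is easy to mechanize, which may clarify the ordering I use. Here is a minimal sketch (my own illustration, not code from any Zettelkasten tool; the name `address_key` is hypothetical) that treats an address like `1b2a`, or a mixed decimal one like `1.3a1`, as a sequence of numeric and alphabetic runs, so that `2` sorts before `10` and a parent like `1b` sorts before its children:

```python
import re

def address_key(addr):
    """Sort key for Zettelkasten addresses like '1', '1a', '1b2a',
    or mixed decimal ones like '1.3a1'. Numeric runs compare as
    numbers (so '2' < '10'), letter runs compare alphabetically,
    and a parent's key is a strict prefix of its children's keys,
    so parents sort first."""
    parts = re.findall(r"\d+|[a-z]+", addr.lower())
    # Tag each run so numbers are never compared to letters directly.
    return tuple((0, int(p), "") if p.isdigit() else (1, 0, p)
                 for p in parts)

cards = ["2", "1b2a", "1", "1a1", "1b", "1a", "10"]
print(sorted(cards, key=address_key))
# -> ['1', '1a', '1a1', '1b', '1b2a', '2', '10']
```

The dots in mixed addresses simply separate numeric runs here, so `1.3a1` and `1.5.2a2` fall into the same ordering without special handling.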
I like to have the unsorted/blank “short-term memory” section on every single deck, so that I can conveniently start thinking about stuff within that deck without grabbing anything else. However, it might also make sense to have only one “short-term memory” in order to keep yourself more focused (and so that there’s only one place to check when you want to remember what you were recently working on!).
Getting Started: Your First Card
Your first note doesn’t need to be anything important -- it isn’t as if every idea you put into your Zettelkasten has to be “underneath” it. Remember, you aren’t trying to invent a good category system. Not every card has to look like a core idea with bullet points which elaborate on that idea, like my example in the previous section. You can just start writing whatever. In fact, it might be good if you make your first cards messy and unimportant, just to make sure you don’t feel like everything has to be nicely organized and highly significant.
On the other hand, it might be important to have a good starting point, if you really want to give Zettelkasten a chance.
I mentioned that I knew I liked Zettelkasten within the first 30 minutes. I think it might be important that when I sat down to try it, I had an idea I was excited to work on. It wasn’t a nice solid mathematical idea -- it was a fuzzy idea, one which had been burning in the back of my brain for a week or so, waiting to be born. It filled the fractal branches of a zettelkasten nicely, expanding in every direction.
So, maybe start with one of those ideas. Something you’ve been struggling to articulate. Something which hasn’t found a place in your linear notebook.
Alright. That’s all I have to say about the basics of Zettelkasten. You can go try it now if you want, or keep reading. The rest of this document is about further ideas in note-taking which have shaped the way I use Zettelkasten. These may or may not be critical factors; I don’t know for sure why Zettelkasten is such a productive system for me personally.
Note-Taking Systems I Have Known and Loved
I’m organizing this section by my previous note-taking systems, but secretly, the main point is to convey a number of note-taking ideas which may have contributed to Zettelkasten working well for me. These ideas have seemed generally useful to me -- maybe they’ll be useful to you, even if you don’t end up using Zettelkasten in particular.
Notebooks
Firstly, and most importantly, I have been keeping idea books since middle school. I think there’s something very important in the simple idea of writing regularly -- I don’t have the reference, but, I remember reading someone who described the day they first started keeping a diary as the day they first woke up, started reflectively thinking about their relationship with the world. Here’s a somewhat similar quote from a Zettelkasten blog:

During the time spanning Nov. 2007–Jan. 2010, I filled 11 note books with ideas, to-do lists, ramblings, diary entries, drawings, and worries. Looking back, this is about the time I started to live consciously. I guess keeping a journal helped me “wake up” from some kind of teenage slumber.

-- Christian
I never got into autobiographical diary-style writing, personally, instead writing about ideas I was having. Still, things were in a very “narrative” format -- the ideas were a drama, a back-and-forth, a dance of rejoinders. There was some math -- pages filled with equations -- but only after a great deal of (very) informal development of an idea.
As a result, “elaborate on an idea” / “keep going” seems like a primitive operation to me -- and, specifically, a primitive operation which involves paper. (I can’t translate the same thinking style to conversation, not completely.) I’m sure that there is a lot to unpack, but for me, it just feels natural to keep developing ideas further.
So, when I say that the Zettelkasten card 1b2 “elaborates on” the card 1b, I’m calling on the long experience I’ve had with idea books. I don’t know if it’ll mean the same thing for you.
Here’s my incomplete attempt to convey some of what it means.
When I’m writing in an idea book, I spend a lot of time trying to clearly explain ideas under the (often false) assumption that I know what I’m talking about. There’s an imaginary audience who knows a lot of what I’m talking about, but I have to explain certain things. I can’t get away with leaving important terms undefined -- I have to establish anything I feel less than fully confident about. For example, the definition of a Bayesian network is something I can assume my “audience” can look up on wikipedia. However, if I’m less than totally confident in the concept of d-separation, I have to explain it; especially if it is important to the argument I hope to make.
Once I’ve established the terms, I try to explain the idea I was having. I spend a lot of time staring off into space, not really knowing what’s going on in my head exactly, but with a sense that there’s a simple point I’m trying to make, if only I could see it. I simultaneously feel like I know what I want to say (if only I could find the words), and like I don’t know what it is -- after all, I haven’t articulated it yet. Generally, I can pick up where I left off with a particular thought, even after several weeks -- I can glance at what I’ve written so far, and get right back to staring at the wall again, trying to articulate the same un-articulated idea.
If I start again in a different notebook (for example, switching to writing my thoughts on a computer), I have to explain everything again. This audience doesn’t know yet! I can’t just pick up on a computer where I left off on paper. It’s like trying to pick up a conversation in the middle, but with a different person. This is sort of annoying, but often good (because re-explaining things may hold surprises, as I notice new details.)
Similarly, if I do a lot of thinking without a notebook (maybe in a conversation), I generally have to “construct” my new position from my old one. This has an unfortunate “freezing” effect on thoughts: there’s a lot of gravity toward the chain of thought wherever it is on the page. I tend to work on whatever line of thought is most recent in my notebook, regardless of any more important or better ideas which have come along -- especially if the line of thought in the notebook isn’t yet at a conclusive place. Sometimes I put a scribble in the notebook after a line of thought, to indicate explicitly that it no longer reflects the state of my thinking, to give myself “permission” to do something else.
Once I’ve articulated some point, then criticisms of the point often become clear, and I’ll start writing about them. I often have a sense that I know how it’s going to go a few steps ahead in this back-and-forth; a few critiques and replies/revisions. Especially if the ideas are flowing faster than I can write them down. However, it is important to actually write things down, because they often don’t go quite as I expect.
If an idea seems to have reached a natural conclusion, including all the critiques/replies which felt important enough to write, I’ll often write a list of “future work”: any open questions I can think of, applications, details which are important but not so important that I want to write about them yet, etc. At this point, it is usually time to write the idea up for a real audience, which will require more detail and refine the idea yet further (possibly destroying it, or changing it significantly, as I often find a critical flaw when I try to write an idea up for consumption by others).
If I don’t have any particular idea I’m developing, I may start fresh with a mental motion like “OK, obviously I know how to solve everything” and write down the grand solution to everything, starting big-picture and continuing until I get stuck. Or, instead, I might make a bulleted list free-associating about what I think the interesting problems are -- the things I don’t know how to do.
Workflowy
The next advance in my idea notes was workflowy. I still love the simplicity of workflowy, even though I have moved on from it.
For those unfamiliar, Workflowy is an outlining tool. I was unfamiliar with the idea before Workflowy introduced it to me. Word processors generally support nested bulleted lists, but the page-like format of a word processor limits the depth such lists can go, and it didn’t really occur to me to use these as a primary mode of writing. Workflowy doesn’t let you do anything but this, and it provides enough features to make it extremely convenient and natural.
Nonlinear Ideas: Branching Development
Workflowy introduced me to the possibility of nonlinear formats for idea development. I’ve already discussed this to some extent, since it is also one of the main advantages of Zettelkasten over ordinary notebooks.
Suddenly, I could continue a thread anywhere, rather than always picking it up at the end. I could sketch out where I expected things to go, with an outline, rather than keeping all the points I wanted to hit in my head as I wrote. If I got stuck on something, I could write about how I was stuck nested underneath whatever paragraph I was currently writing, but then collapse the meta-thoughts to be invisible later -- so the overall narrative doesn’t feel interrupted.
In contrast, writing in paper notebooks forces you to choose consciously that you’re done for now with a topic if you want to start a new one. Every new paragraph is like choosing a single fork in a twisting maze. Workflowy allowed me to take them all.
What are Children?
I’ve seen people hit a block right away when they try to use workflowy, because they don’t know what a “child node” is.
- Here’s a node. It could be a paragraph, expressing some thought. It could also be a title.
- Here’s a child node. It could be a comment on the thought -- an aside, a critique, whatever. It could be something which goes under the heading.
- Here’s a sibling node. It could be the next paragraph in the “main thrust” of an argument. It could be an unrelated point under the same super-point everything is under.
As with Zettelkasten, my advice is to not get too hung up on this. A child is sort of like a comment; a parenthetical statement or a footnote. You can continue the main thrust of an argument in sibling nodes -- just like writing an ordinary sequence of paragraphs in a word processor.
You can also organize things under headings. This is especially true if you wrote a sketchy outline first and then filled it in, or, if you have a lot of material in Workflowy and had to organize it. The “upper ontology” of my workflowy is mostly title-like, single words or short noun phrases. As you get down in, bullets start to be sentences and paragraphs more often.
Obviously, all of this can be applied to Zettelkasten to some extent. The biggest difference is that “upper-level” cards are less likely to just be category titles; and, you can’t really organize things into nice categories after-the-fact because the addresses in Zettelkasten are fixed -- you can’t change them without breaking links. You can use redirect cards if you want to reorganize things, actually, but I haven’t done that very much in practice. Something which has worked for me to some extent is to reorganize things in the indexes. Once an index is too much of a big flat list, you can cluster entries into subjects. This new listing can be added as a child to the previous index, keeping the historical record; or, possibly, replace the old index outright. I discuss this more in the section on indexing.
Building Up Ideas over Long Time Periods
My idea books let me build up ideas over time to a greater extent than my peers who didn’t keep similar journals. However, because the linear format forces you to switch topics in a serial manner and “start over” when you want to resume a subject, you’re mostly restricted to what you can keep in your head. Your notebooks are a form of information storage, and you can go back and re-read things, but only if you remember the relevant item to go back and re-read.
Workflowy allowed me to build up ideas to a greater degree, incrementally adding thoughts until cascades of understanding changed my overall view.
Placing a New Idea
Because you’ve got all your ideas in one big outline, you can add in little ideas easily. Workflowy was easy enough to access via my smartphone (though they didn’t have a proper app at the time), so I could jot down an idea as I was walking to class, waiting for the bus, etc. I could easily navigate to the right location, at least, if I had organized the overall structure of the outline well. Writing one little idea would usually get more flowing, and I would add several points in the same location on the tree, or in nearby locations.
This idea of jotting down ideas while you’re out and about is very important. If you feel you don’t have enough ideas (be it for research, for writing fiction, for art -- whatever) my first question would be whether you have a good way to jot down little ideas as they occur to you.
The fact that you’re forced to somehow fit all ideas into one big tree is also important. It makes you organize things in ways that are likely to be useful to you later.
Organizing Over Time
The second really nice thing workflowy did was allow me to go back and reorganize all the little ideas I had jotted down. When I sat down at a computer, I could take a look at my tree overall and see how well the categorization fit. This mostly took the form of small improvements to the tree structure over time. Eventually, a cascade of small fixes turned into a major reorganization. At that point, I felt I had really learned something -- all the incremental progress built up into an overall shift in my understanding.
Again, this isn’t really possible in paper-based Zettelkasten -- the address system is fixed. However, as I mentioned before, I’ve had some success doing this kind of reorganization within the indexes. It doesn’t matter that the addresses of the cards are fixed if the way you actually find those addresses is mutable.
Limitations of Workflowy
Eventually, I noticed that I had a big pile of ideas which I hadn’t really developed. I was jotting down ideas, sure. I was fitting them into an increasingly cohesive overall picture, sure. But I wasn’t doing anything with them. I wasn’t writing pages and pages of details and critique.
It was around this time that I realized I had gone more than three years without using a paper notebook very significantly. I started writing on paper again. I realized that there were all these habits of thinking which were tied to paper for me, and which I didn’t really access if I didn’t have a nice notebook and a nice pen -- the force of the long-practiced associations. It was like waking up intellectually after having gone to sleep for a long time. I started to remember highschool. It was a weird time. Anyway...
Dynalist
The next thing I tried was Dynalist.
The main advantage of Dynalist over Workflowy is that it takes a feature-rich rather than minimalistic approach. I like the clean aesthetics of Workflowy, but… eventually, there’ll be some critical feature Workflowy just doesn’t provide, and you’ll want to make the jump to Dynalist. I use hardly any of the extra features of Dynalist, but the ones I do use, I need. For me, it’s mostly the LaTeX support.
Another thing about Dynalist which felt very different for me was the file system. Workflowy forces you to keep everything in one big outline. Dynalist lets you create many outlines, which it treats as different files; and, you can organize them into folders (recursively). Technically, that’s just another tree structure. In terms of UI, though, it made navigation much easier (because you can easily access a desired file through the file pane). Psychologically, it made me much more willing to start fresh outlines rather than add to one big one. This was both good and bad. It meant my ideas were less anchored in one big tree, but it eventually resulted in a big, disorganized pile of notes.
I did learn my lesson from Workflowy, though, and set things up in my Dynalist such that I actually developed ideas, rather than just collecting scraps forever.
Temporary Notes vs Organized Notes
I organized my Dynalist files as follows:
- A “log” file, in which I could write whatever I was thinking about. This was organized by date, although I would often go back and elaborate on things from previous dates.
- A “todo” file, where I put links to items inside “log” which I specifically wanted to go back and think more about. I would periodically sort the todo items to reflect my priorities. This gave me a list of important topics to draw from whenever I wasn’t sure what I wanted to think about.
- A bunch of other disorganized files.
This system wasn’t great, but it was a whole lot better at actually developing ideas than the way I kept things organized in Workflowy. I had realized that locking everything into a unified tree structure, while good for the purpose of slowly improving a large ontology which organized a lot of little thoughts, was keeping me from just writing whatever I was thinking about.
Dan Sheffler (whose essays I’ve already cited several times in this writeup) writes about realizing that his note-taking system was simultaneously trying to implement two different goals: an organized long-term memory store, and “engagement notes” which are written to clarify thinking and have a more stream-of-consciousness style. My “log” file was essentially engagement notes, and my “todo” file was the long-term memory store.
For some people, I think an essential part of Zettelkasten is the distinction between temporary and permanent notes. Temporary notes are the disorganized stream-of-consciousness notes which Sheffler calls engagement notes. Temporary notes can also include all sorts of other things, such as todo lists which you make at the start of the day (and which only apply to that day), shopping lists, etc. Temporary notes can be kept in a linear format, like a notebook. Periodically, you review the temporary notes, putting the important things into Zettelkasten.
In Taking Smart Notes, Luhmann is described as transferring the important thoughts from the day into Zettel every evening. Sheffler, on the other hand, keeps a gap of at least 24 hours between taking down engagement notes and deciding what belongs in the long-term store. A gap of time allows the initial excitement over an idea to pass, so that only the things which still seem important the next day get into long-term notes. He also points out that this system enforces a small amount of spaced repetition, making it more likely that content is recalled later.
As for myself, I mostly write directly into my Zettelkasten, and I think it’s pretty great. However, I do find this to be difficult/impossible when taking quick notes during a conversation or a talk -- when I try, the resulting content in my Zettelkasten seems pretty useless (i.e., I don’t come back to it and further develop those thoughts). So, I’ve started to carry a notebook again for those temporary notes.
I currently think of things like this:
Jots
These are the sort of small pointers to ideas which you can write down while walking, waiting for the bus, etc. The idea is stated very simply -- perhaps in a single word or a short phrase. A sentence at most. You might forget what it means after a week, especially if you don’t record the context well. The first thing to realize about jots is that you should capture them at all, as already discussed. The second is to capture them in a place where you will be able to develop them later. I used to carry around a small pocket notebook for jots, after I stopped using Workflowy regularly. My plan was to review the jots whenever I filled a notebook, putting them in more long-term storage. This never happened: when I filled up a notebook, unpacking all the jots into something meaningful just seemed like too huge a task. It works better for me to jot things into permanent storage directly, as I did with Workflowy. I procrastinate too much on turning temporary notes into long-term notes, and the temporary notes become meaningless.
Glosses
A gloss is a paragraph explaining the point of a jot. If a jot is the title of a Zettelkasten card, a gloss is the first paragraph (often written in a distinct color). This gives enough of an idea that the thought will not be lost if it is left for a few weeks (perhaps even years, depending). Writing a gloss is usually easy, and doing so is often enough to get the ideas flowing.
Development
This is the kind of writing I described in the ‘notebooks’ section. An idea is fleshed out. This kind of writing is often still comprehensible years later, although it isn’t guaranteed to be.
Refinement
This is the kind of writing which is publishable. It nails the idea down. There’s not really any end to this -- you can imagine expanding something from a blog post, to an academic paper, to a book, and further, with increasing levels of detail, gentle exposition, formal rigor -- but to a first approximation, anyway, you’ve eliminated all the contradictions, stated the motivating context accurately, etc.
I called the last item “refinement” rather than “communication” because, really, you can communicate your ideas at any of these stages. If someone shares a lot of context with you, they can understand your jots. That’s really difficult, though. More likely, a research partner will understand your glosses. Development will be understandable to someone a little more distant, and so on.
At Long Last, Zettelkasten
I’ve been hammering home the idea of “linear” vs “nonlinear” formats as one of the big advantages of Zettelkasten. But workflowy and dynalist both allow nonlinear writing. Why should you be interested in Zettelkasten? Is it anything more than a way to implement workflowy-like writing for a paper format?
I’ve said that (at least for me) there’s something extra-good about Zettelkasten which I don’t really understand. But, there are a couple of important elements which make Zettelkasten more than just paper workflowy.
- Hierarchy Plus Cross-Links: A repeated theme across knowledge formats, including wikipedia and textbooks, is that you want both a hierarchical organization which makes it easy to get an overview and find things, and also a “cross-reference” type capability which allows related content to be linked -- creating a heterarchical web. I mentioned at the beginning that Zettelkasten forced me to create cross-links much more than I otherwise would, due to the use of small note-cards. Workflowy has “hierarchy” down, but it has somewhat poor “cross-link” capability. It has tags, but a tag system is not as powerful as hypertext. Because you can link to individual nodes, it’s possible to use hypertext cross-links -- but the process is awkward, since you have to get the link to the node you want. Dynalist is significantly better in this respect -- it has an easy way to create a link to anything by searching for it (without leaving the spot you’re at). But it lacks the wiki-style “magic link” capability, creating a new page when you make a link which has no target. Roam, however, provides this feature.
- Atomicity: The idea of creating pages organized around a single idea (again, an idea related to wikis). This is possible in Dynalist, but Zettelkasten practically forces it upon you, which for me was really good. Again, Roam manages to encourage this style.
My cards often look something like this:
I’m left handed, so you may want to flip all of this around if you’re right handed. I use the ring binder “backwards” from the intended configuration (the punched hole would usually be on the left, rather than the right). Also, I prefer portrait rather than landscape. Most people prefer to use 3x5 cards in landscape, I suppose.
Anyway, not every card will look exactly like the above. A card might just contain a bunch of free-writing, with no bulleted list. Or it might only contain a bulleted list, with no blurb at the beginning. Whatever works. I think my layout is close to Luhmann’s and close to common advice -- but if you try to copy it religiously, you’ll probably feel like Zettelkasten is awkward and restrictive.
The only absolutely necessary thing is the address. The address is the first thing you write on a new card. You don’t ever want a card to go without an address. And it should be in a standard location, so that it is really easy to look through a bunch of cards for one with a specific address.
Don’t feel bad if you start a card and leave it mostly blank forever. Maybe you thought you were going to elaborate an idea, so you made a new card, but it’s got nothing but an address. That’s ok. Maybe you will fill it later. Maybe you won’t. Don’t worry about it.
Mostly, a thought is continued through elaboration on bullet points. I might write something like “cont. 1.1a1a” at the bottom of the card if there’s another card that’s really a direct continuation, though. (Actually, I don’t write “cont.”; I just write the down arrow, which means the same thing.) If so, I’d write “see 1.1a1” in the upper left hand corner, to indicate that 1.1a1a probably doesn’t make much sense on its own without consulting 1.1a1 -- moreso than usual for child cards. (Actually, I’d write another down arrow rather than “see”, mirroring the down arrow on the previous card -- this indicates the direct-continuation relationship.)
In the illustration, I wrote links [in square brackets]. The truth is, I often put them in full rectangular boxes (to make them stand out more), although not always. Sometimes I put them in parentheses when I’m using them more as a noun, as in: “I think pizza (12a) might be relevant to pasta. [14x5b]” In that example, “(12a)” is the card for pizza. “[14x5b]” is a card continuing the whole thought “pizza might be relevant to pasta”. So parentheses-vs-box is sort of like top-corner-vs-bottom, but for an individual line rather than a whole card.
Use of Color
The colors are true to my writing as well. For a long time, I wanted to try writing with multi-color click pens, because I knew some people found them very useful; but, I was unable to find any which satisfied my (exceptionally picky) taste. I don’t generally go for ball-point pens; they aren’t smooth enough. I prefer to write with felt-tip drawing pens or similar. I also prefer very fine tips (as a consequence of preferring my writing to be very small, as I mentioned previously) -- although I’ve also found that the appropriate line width varies with my mental state and with the subject matter. Fine lines are better for fine details, and for energetic mental states; broad lines are better for loose free-association and brainstorming, and for tired mental states.
In any case, a friend recommended the Hi-Tec C Coleto, a multi-color click pen which feels as smooth as felt-tip pens usually do (almost). You can buy whatever colors you want, and they’re available in a variety of line-widths, so you can customize it quite a bit.
At first I just used different colors haphazardly. I figured I would eventually settle on meanings for colors, if I just used whatever felt appropriate and experimented. Mostly, that meant that I switched colors to indicate a change of topic, or used a different color when I went back and annotated something (which really helps readability, by the way -- black writing with a bunch of black annotations scribbled next to it or between lines is hard to read, compared to purple writing with orange annotations, or whatever!). When I switched to Zettelkasten, though, I got more systematic with my use of color.
I roughly follow Lion Kimbro’s advice about colors, from How to Make a Complete Map of Every Thought you Think:

Now let’s talk about color. Your pen has four colors: Red, Green, Blue, and Black. You will want to connect meaning with each color. Here’s my associations:

RED: Error, Warning, Correction
BLUE: Structure, Diagram, Picture, Links, Keys (in key-value pairs)
GREEN: Meta, Definition, Naming, Brief Annotation, Glyphs
BLACK: Main Content

I also use green to clarify sloppy writing later on. Blue is for Keys, Black is for values. I hope that’s self-explanatory.

If you make a correction, put it in red. Page numbers are blue. If you draw a diagram, make it blue. Main content in black.

Suppose you make a diagram: Start with a big blue box. Put the diagram in the box. (Or the other way around -- make the diagram, then the box around it.) Put some highlighted content in black. Want to define a word? Use a green callout. Oops -- there’s a problem in the drawing -- X it out in red, followed by the correction, in red.

Sometimes, I use black and blue to alternate emphasis. Black and blue are the easiest to see. If I’m annotating some text in the future, and the text is black, I’ll switch to using blue for content. Or vice versa. Some annotations are red, if they are major corrections.

Always remember: Tolerate errors. If your black has run out, and you don’t want to get up right away to fetch your backup pen, then just switch to blue. When the thought’s out, go get your backup pen.
The only big differences are that I use brown instead of black in my pen, I tend to use red for titles so that they stand out very clearly, and I use green for links rather than blue.
Index & Bibliography
Taking Smart Notes describes two other kinds of cards: indexes, and bibliographical notes. I haven’t made those work for me very effectively, however. Luhmann, the inventor of Zettelkasten, is described as having invented it to organize the notes he originally made while reading. I don’t use it like that -- I mainly use it for organizing notes I make while thinking. So bibliography isn’t of primary importance for me.
(Apparently Umberto Eco similarly advises keeping idea notes and reading notes on separate sets of index cards.)
So I don’t miss the bibliography cards. (Maybe I will eventually.) On the other hand, I definitely need some sort of index, but I’m not sure about the best way to keep it up to date. I only notice that I need it when I go looking for a particular card and it is difficult to find! When that happens, and I eventually find the card I wanted, I can jot down its address in an index. But, it would be nice to somehow avoid this. So, I’ve experimented with some ideas. Here are someone else’s thoughts on indexing (for a digital zettelkasten).
Listing Assorted Cards
The first type of index which I tried lists “important” cards (cards which I refer to often). I just have one of these right now. The idea is that you write a card’s name and address on this index if you find that you’ve had difficulty locating a card and wished it had been listed in your index. This sounds like it should be better than a simple list of the top-level numbered cards, since (as I mentioned earlier) cards like 11a often turn out to be more important than cards like 11. Unfortunately, I’ve found this not to be the case. The problem is that this kind of index is too hard to maintain. If I’ve just been struggling to find a card, my working memory is probably already over-taxed with all the stuff I wanted to do after finding that card. So I forget to add it to the index.
Sometimes it also makes sense to just make a new top-level card on which you list everything which has to do with a particular category. I have only done this once so far. It seems like this is the main mode of indexing which other people use? But I don’t like that idea very much.
Listing Sibling Cards
When a card has enough children that they’re difficult to keep track of, I add a “zero” card before all the other children, and this works as an index. So, for example, card 2a might have children 2a1, 2a2, 2a3, … 2a15. That’s a lot to keep track of. So I add 2a0, which gets an entry for 2a1-2a15, and any new cards added under 2a. It can also get an entry for particularly important descendants; maybe 2a3a1 is extra important and gets an entry.
For cards like 2, whose children are alphabetical, you can’t really use “zero” to go before all the other children. I use “λ” as the “alphabetical zero” -- I sort it as if it comes before all the other letters in the alphabet. So, card “1λ” lists 1a, 1b, etc.
The most important index is the index at 0, i.e., the index of all top-level numbered cards. As I describe in the “card layout” section, a card already mostly lists its own children -- meaning that you don’t need to add a new card to serve this purpose until things get unwieldy. However, top-level cards have no parents to keep track of them! So, you probably want an “absolute zero” card right away.
These “zero” cards also make it easier to keep track of whether a card with a particular address has been created yet. Every time you make a card, you add it to the appropriate zero card; so, you can see right away what the next address available is. This isn’t the case otherwise, especially if your cards aren’t currently sorted.
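As a sanity check on this addressing scheme, here’s a minimal Python sketch (the function name and the example cards are my own, not from the text) showing how addresses built from alternating numbers and letters -- including “0” and “λ” index cards -- can be sorted into tree order:

```python
import re

def address_key(addr):
    """Split a Zettelkasten address like '2a15' or '1λ' into its
    alternating numeric/alphabetic parts so it sorts in tree order."""
    key = []
    for part in re.findall(r"\d+|[a-zλ]+", addr):
        if part.isdigit():
            key.append((0, int(part)))       # numeric part: '0' sorts first
        elif part == "λ":
            key.append((1, ""))              # "alphabetical zero": before 'a'
        else:
            key.append((1, part))
    return key

cards = ["2a15", "2a0", "2a3a1", "2a1", "1λ", "1a"]
print(sorted(cards, key=address_key))
# → ['1λ', '1a', '2a0', '2a1', '2a3a1', '2a15']
```

Note that a plain string sort would put “2a15” before “2a3a1”; splitting the address into typed parts is what keeps the zero cards and “λ” cards in front of their siblings.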
Kimbro’s Mind Mapping
I’ve experimented with adapting Lion Kimbro’s system from How to Make a Complete Map of Every Thought You Think. After all, a complete map of every thought you think sounds like the perfect index!
In my terminology, Lion Kimbro keeps only jots -- he was focusing on collecting and mapping, rather than developing, ideas. Jots were collected into topics and sub-topics. When an area accumulated enough jots, he would start a mind map for it. I won’t go into all his specific mapping tips (although they’re relevant), but basically, imagine putting the addresses of cards into clusters (on a new blank card) and then writing “anchor words” describing the clusters.
You’ve built your tree in an initially “top-down” fashion, expanding it by adding increasingly-nested cards. You’re going to build the map “bottom-up”: when a sub-tree you’re interested in feels too large to quickly grasp, start a map. Let’s say you’re mapping card 8b4. You might already have an index of children at 8b40; if that’s the case, you can start with that. Also look through all the descendants of 8b4 and pick out whichever seem most important. (If this is too hard, start by making maps for 8b4’s children, and return to mapping 8b4 later.) Draw a new mind map, and place it at 8b40a -- it is part of the index; you want to find it easily when looking at the index.
Now, the important thing is that when you make a map for 8b, you can take a look at the map for 8b4, as well as any maps possessed by other children of 8b. This means that you don’t have to go through all of the descendants of 8b (which is good, because there could be a lot). You just look at the maps, which already give you an overview. The map for 8b is going to take the most important-seeming elements from all of those sub-maps.
This allows important things to trickle up to the top. When you make a map at 0, you’ll be getting all the most important stuff from deep sub-trees just by looking at the maps for each top-level numbered card.
The categories which emerge from mapping like this can be completely different from the concepts which initially seeded your top-level cards. You can make new top-level cards which correspond to these categories if you want. (I haven’t done this.)
Now, when you’re looking for something, you start at your top-level map. You look at the clusters and likely have some expectation about where it is (if the address isn’t somewhere on your top-level map already). You follow the addresses to further maps, which give further clusters of addresses, until you land in a tree which is small enough to navigate without maps.
I’ve described all of this as if it’s a one-time operation, but of course you keep adding to these maps, and re-draw updated maps when things don’t fit well any more. If a map lives at 8b40a, then the updated maps can be 8b40b, 8b40c, and so on. You can keep the old maps around as a historical record of your shifting conceptual clusters.
Keeping Multiple Zettelkasten
A note system like Zettelkasten (or workflowy, dynalist, evernote, etc) is supposed to stick with you for years, growing with you and becoming a repository for your ideas. It’s a big commitment.
It’s difficult to optimize note-taking if you think of it that way, though. You can’t experiment if you have to look before you leap. I would have never tried Zettelkasten if I thought I was committing to try it as my “next system” -- I didn’t think it would work.
Similarly, I can’t optimize my Zettelkasten very well with that attitude. A Zettelkasten is supposed to be one repository for everything -- you’re not supposed to start a new one for a new project, for example. But, I have several Zettelkasten, to test out different formats: different sizes of card, different binders. It is still difficult to give alternatives a fair shake, because my two main Zettelkasten have built up momentum due to the content I keep in them.
I use a system of capital letters to cross reference between my Zettelkasten. For example, my main 3x5 Zettelkasten is “S” (for “small”). I have another Zettelkasten which is “M”, and also an “L”. When referencing card 1.1a within S, I just call it 1.1a. If I want to refer to it from a card in M, I call it S1.1a instead. And so on.
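This cross-reference convention is simple enough to sketch in code. Here’s a hypothetical helper (the name `parse_reference` is mine, not from the text) that resolves a reference like “S1.1a” into a (Zettelkasten, address) pair, treating a reference with no capital-letter prefix as local to the current collection:

```python
import re

def parse_reference(ref, current_box):
    """Resolve a cross-Zettelkasten reference.

    A leading run of capital letters names the target Zettelkasten
    (e.g. 'S', 'M', 'L'); if absent, the reference is local to
    current_box. The rest is the card address."""
    m = re.match(r"^([A-Z]*)(.+)$", ref)
    box, addr = m.group(1), m.group(2)
    return (box or current_box, addr)

print(parse_reference("S1.1a", current_box="M"))  # ('S', '1.1a')
print(parse_reference("1.1a", current_box="M"))   # ('M', '1.1a')
```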
Apparently Luhmann did something similar, starting a new Zettelkasten which occasionally referred to his first.
However, keeping multiple Zettelkasten for special topics is not necessarily a good idea. Beware fixed categories. The danger is that categories limit what you write, or, become less appropriate over time. I’ve tried special-topic notebooks in the past, and while it does sometimes work, I often end up conflicted about where to put something. (Granted, I have a similar conflict about where to put things in my several omni-topic Zettelkasten, but mostly the 3x5 system I’ve described here has won out -- for now.)
On the other hand, I suspect it’s fine to create special topic zettelkasten for “very different” things. Creating a new zettelkasten because you’re writing a new book is probably bad -- although it’ll work fine for the goal of organizing material for writing books, it means your next book idea isn’t coming from Zettelkasten. (Zettelkasten should contain/extend the thought process which generates book ideas in the first place, and it can’t do that very well if you have to have a specific book idea in order to start a zettelkasten about it.) On the other hand, I suspect it is OK to keep a separate Zettelkasten for fictional creative writing. Factual ideas can spark ideas for fiction, but, the two are sufficiently different “modes” that it may make sense to keep them in physically separate collections.
The idea of using an extended address system to make references between multiple Zettelkasten can also be applied to address other things, outside of your Zettelkasten. For example, you might want to come up with a way of adding addresses to your old notebooks so that you can refer to them easily. (For example, “notebook number: page number” could work.)
Should You Transfer Old Notes Into Zettelkasten?
Relatedly, since Zettelkasten ideally becomes a library of all the things you have been thinking about, it might be tempting to try and transfer everything from your existing notes into Zettelkasten.
(A lot of readers may not even be tempted to do this, given the amount of work it would take. Yet, those more serious about note systems might think this is a good idea -- or, might be too afraid to try Zettelkasten because they think they’d have to do this.)
I think transferring older stuff into Zettelkasten can be useful, but, trying to make it happen right away as one big project is most likely not worth it.
- It’s true that part of the usefulness of Zettelkasten is the interconnected web of ideas which builds up over time, and the “high-surface-area” format which makes it easy to branch off any part. However, not all the payoff is long-term: it should also be useful in the moment. You’re not only writing notes because they may help you develop ideas in the future; the act of writing the notes should be helping you develop ideas now.
- You should probably only spend time putting ideas into Zettelkasten if you’re excited about further developing those ideas right now. You should not just be copying over ideas into Zettelkasten. You should be improving ideas, thinking about where to place them in your address hierarchy, interlinking them with other ideas in your Zettelkasten via address links, and taking notes on any new ideas sparked by this process. Trying to put all your old notes into Zettelkasten at once will likely make you feel hurried and unwilling to develop things further as you go. This will result in a pile of mediocre notes which will ultimately be less useful.
- I mentioned the breadth-first vs depth-first distinction earlier. Putting all of your old notes into Zettelkasten is an extremely breadth-first strategy, which likely doesn’t give you enough time to go deep into further developing any one idea.
What about the dream of having all your notes in one beautiful format? Well, it is true that old notes in different formats may be harder to find, since you have to remember what format the note you want was written in, or check all your old note systems to find the note you want. I think it just isn’t worth the cost to fix this problem, though, especially since you should probably try many different systems to find a good one that works for you, and you can’t very well port all your notes to each new system.
Zettelkasten should be an overall improvement compared to a normal notebook -- if it isn't, you have no business using it. Adding a huge up-front cost of transferring notes undermines that. Just pick Zettelkasten up when you want to use it to develop ideas further.
Depth-first vs Breadth-first
Speaking of depth-first vs breadth-first, how should you balance those two modes?
Luckily, this problem has some relevant computer science theory behind it. I tend to think of it in terms of iterative-deepening A* heuristic search (IDA*).
The basic idea is this: the advantage of depth-first search is that you can minimize memory cost by only maintaining the information related to the path you are currently trying. However, depth-first search can easily get stuck down a fruitless path, while breadth-first search has better guarantees. IDA* balances the two approaches by going depth-first, but giving up when you get too deep, backing up, and trying a new path. (The A* aspect is that you define “too deep” in a way which also depends on how promising a path seems, based on an optimistic assessment.) This way, you simulate a breadth-first search by a series of depth-first sprints. This lets you focus your attention on a small set of ideas at one time.
Once you’ve explored all the paths to a certain level, your tolerance defining “too deep” increases, and you start again. You can think of this as becoming increasingly willing to spend a lot of time going down difficult technical paths as you confirm that easier options don’t exist.
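For the curious, here is a compact sketch of IDA* itself: repeated depth-first sprints, each giving up when estimated cost exceeds the current threshold, then retrying with the smallest threshold that was exceeded. The graph and heuristic at the bottom are my own toy illustration, not anything from the text:

```python
def ida_star(root, goal, neighbors, h):
    """IDA*: depth-first sprints bounded by an increasing cost threshold.

    neighbors(node) yields (cost, child) pairs; h is an optimistic
    (admissible) estimate of remaining cost."""
    def dfs(node, g, threshold, path):
        f = g + h(node)
        if f > threshold:
            return f, None           # too deep: report how far we overshot
        if node == goal:
            return f, path
        minimum = float("inf")
        for cost, child in neighbors(node):
            if child in path:        # avoid cycles
                continue
            t, found = dfs(child, g + cost, threshold, path + [child])
            if found is not None:
                return t, found
            minimum = min(minimum, t)
        return minimum, None

    threshold = h(root)
    while True:
        threshold, found = dfs(root, 0, threshold, [root])
        if found is not None:
            return found
        if threshold == float("inf"):
            return None              # search space exhausted; no path

# Toy graph with unit edge costs and a trivial heuristic:
graph = {"a": [(1, "b"), (1, "c")], "b": [(1, "d")], "c": [(1, "d")], "d": []}
print(ida_star("a", "d", lambda n: graph[n], lambda n: 0))  # → ['a', 'b', 'd']
```

The research-methodology analogy: `threshold` is your current tolerance for how deep a technical rabbit hole you’ll follow, and each failed sprint raises it only as far as the cheapest path you had to abandon.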
Of course, this isn’t a perfect model of what you should do. But, it seems to me that a note-taking system should aspire to support and encourage something resembling this. More generally, I want to get across the idea of thinking of your existing research methodology as an algorithm (possibly a bad one), and trying to think about how it could be improved. Don’t try to force yourself to use any particular algorithm just because you think you should; but, if you can find ways to nudge yourself toward more effective algorithms, that’s probably a good idea.
Inventing Shorthand/Symbology
I don’t think writing speed is a big bottleneck to thinking speed. Even though I “think by writing”, a lot of my time is spent... well... thinking. However, once I know what I want to write, writing does take time. When inspiration really strikes, I might know more or less what I want to say several paragraphs ahead of where I’ve actually written to. At times like that, it seems like every second counts -- the faster I write, the more ideas I get down, the less I forget before I get to it.
So, it seems worth putting some effort into writing faster. (Computer typing is obviously a thing to consider here, too.) Shorthand, and special symbols, are something to try.
There’s also the issue of space. I know I advocate for small cards, which are intentionally limiting space. But you don’t want to waste space if you don’t have to. The point is to comprehend as much as possible as easily as possible. Writing bullet points and using indentation to make outlines is an improvement over traditional paragraphs because it lets you see more at a glance. Similarly, using abbreviations and special symbols will improve this.
I’ve tried several times to learn “proper” shorthand. Maybe I just haven’t tried hard enough, but it seems like basically all shorthand systems work by leaving out information. Once you’re used to them, they’re easy enough to read shortly after you’ve written them -- when you still remember more or less what they said. However, they don’t actually convey enough information to fully recover what was written if you don’t have such a guess. Basically, they don’t improve readability. They compress things down to the point where they’re hard to decipher, for the sake of getting as much speed as possible.
On the other hand, I’ve spent time experimenting with changes to my own handwriting which improve speed without compromising readability. Pay attention to what takes you the most time to write, and think about ways to streamline that.
Lion Kimbro emphasizes coming up with ways to abbreviate things you commonly repeat. He describes using the Japanese symbols for days of the week and other common things in his system. The Bullet Journaling community has created its own symbology. Personally, I’ve experimented with a variety of different reference symbols which mean different sorts of things (beyond the () vs  distinction I’ve mentioned).
The Bullet Journaling community has thought a lot about short-and-fast writing for the purpose of getting things out quickly and leaving more space on the page. They also have their own symbology which may be worth taking a look at. (I don't yet use it, but I may switch to it or something similar eventually.)
Well, that’s all I want to say for now. I may add to this document in the future. For now, best of luck developing ideas!
Fantasies are interesting. You know, like when you're imagining how nice that vacation will be, or wishing for a certain kind of social contact, or basking in hypothetical accolades for hypothetical accomplishments, or picturing the ideal end-product of your labor, or escaping into another world, or clinging to the notion that the world works a certain way, or telling yourself that you really are the kind of person you want to be seen as, or expecting things to work themselves out. What is a fantasy? How does it operate? How did it get there? What's it for? Is it a mistake? Have a think. Some of my thoughts are below.
Some kinds of fantasy:
A fantasy might be an element of your ability to plan ahead and avoid disaster, by imagining what might happen and what might cause good and bad outcomes, perhaps detached from any actual planning that you're doing.
A fantasy might be a way you remind yourself what your goals are while you're pursuing a subgoal.
A fantasy might be a way you get yourself to do something by telling yourself you'll get the good of the fantasy if you do it, even if that's not actually true.
A fantasy might be a tool you use to cope with an unmet need you have, which, if you pursued it without knowing more, might lead you to do bad things.
A fantasy might be a tool you use to cope with an unmet need you have, which you believe you couldn't satisfy even if you tried.
A fantasy might be a tool that part of you has discovered for feeling good by tricking part of you into thinking that something good is happening, possibly something much better than anything that might actually happen.
A fantasy might be a way you discover what you want, by looking for situations such that, when you imagine being there, you feel good or believe something good has happened.
A fantasy might be a way you discover what's happening in your mind and soul, by letting non-conscious parts of you project a simulated situation or world into your integrated awareness, and by letting a projected world evoke and communicate with non-conscious parts of you.
A fantasy might be a belief you have because it was told to you by your past self or by someone else, and you aren't noticing that it's not right.
A fantasy might be a belief you have such that part of you doesn't want to notice that it's not true, because that part of you prefers worlds where the belief is true, and that part of you doesn't notice that it's manipulating the news instead of the world.
A fantasy might be a belief you have such that part of you doesn't want to notice that it's not true, because that part of you is worried that if you don't believe it then people will notice and be angry and do bad things to you.
A fantasy might be a way you notice what you think might happen, and notice contradictions in or questions about what you think.
A fantasy might be a tool you use to get quick pleasure or cognitive engagement to distract yourself from something.
A fantasy might be a way that part of you gets what it really wants, because that part of you has forgotten entirely about the real world.
A fantasy might be a way that part of you gets what it really wants, because that part of you always only wanted something from the fantasy.
A fantasy might be a way that you cause others to believe something you want them to believe, by behaving like you believe it, so that they will non-consciously infer your belief and then non-consciously take your unspoken word for it.
A fantasy might be a way that the mental impression of an unprocessed event or piece of information attempts to process itself, by projecting the memory or an imagined related scenario into integrated awareness.
What other kinds of fantasies are there? What explanations of and uses for fantasies are there? How do you dissolve a fantasy you don't want? What do you want to know about fantasy? What do you want other people to know about fantasy? What kinds of fantasies do you have? What kinds of fantasies do people in other cultures and other ideologies have? What kinds of fantasies do animals have? What kinds of fantasies do groups of people have?
I want to highlight a conceptual distinction that is key to my understanding of consciousness and epistemology. There are fundamentally two ways in which we can define entities, and much confusion is created by failing to separate the two.
The first way is a relabelling; to define something in terms of what we've already defined. If we've already defined what is meant by a hydrogen atom and an oxygen atom, then we can simply define water as a particular configuration of two hydrogen atoms and one oxygen atom. Defining water in this way does not necessitate us to postulate a new entity in our fundamental ontology.
The second way is an external reference; to say that a word refers to something outside our system of language or meaning. Some people assert that we can only define something in terms of things that have already been defined, but if that were the case, then we wouldn't even be able to get our first definition off the ground. We'd end up with our definitions either being circular or falling into some kind of infinite regress.
This isn't an issue that we can fully resolve, but there is a workaround in that we can have things in our system which point to external entities, without being fully able to say what they are. For example, we might only be able to define how quarks interact with each other but not say what their nature actually is apart from being things that interact in that particular way (some might say that behaviour is all that there is to the nature of something, but then what's the difference between an atom and a simulation of an atom?). If this were the case, then we'd have to define them as an external reference.
Here's another way of putting this: Relabeling defines a new entity in the map; external references posit a new entity in the territory.
Relabelings don't change a system
Creating a relabeling or adding a new entity in the map doesn't affect the territory. Here's an example: Let's suppose I introduce a new label "Gaia" and I say that by this I only mean the natural world and nothing more. I'm allowed to define terms however I want, however I shouldn't use this to trick you into accepting the connotations of the term, such as that nature has an order or nurtures us. Perhaps we wouldn't object if the person argued that nature had these properties separately and was using the term to emphasise those aspects, but we shouldn't accept an argument by definition.
We can also see this as a Motte and Bailey argument. The Motte is the claim that they are just using a slightly unusual term for the natural world. The Bailey is that the natural world has the properties we tend to associate with this term. Or in my terminology, Gaia is a relabelling and it can help us understand and talk about the system, but introducing it doesn't change anything. For any argument we can make using the term "Gaia", we should be able to make the same argument without it (see: Tabooing your words). Adding the relabelling "Gaia" into the system doesn't add order or nurturing into the system; if they are present, they should still be present even if we remove that definition.
Here's another example: Suppose we say that an object is beautiful if it meets certain aesthetic criteria. Again this is just a relabeling; just of a much more complex set of conditions. Someone might argue that it is intrinsically valuable for beauty to exist in the world, even if no-one ever experiences it. This may or may not be true, but we shouldn't believe it just because of the connotations that they snuck in. It makes sense to taboo the word and realise that what they are saying is valuable is just atoms in particular combinations. It then seems natural to ask, "What is so special about these combinations that makes them valuable when other combinations are not?"
And indeed the only thing that is immediately obviously special is that they create a certain experience when viewed by a human. But then, in order to be intrinsically valuable, and not merely instrumentally valuable, shouldn't there be something special about these combinations of atoms when not viewed by a human? Perhaps the proponents would have an answer to this argument, but the goal here isn't to argue against this perspective itself, so much as against a naive argument for it.
Ultimately, I'm trying to build up towards making a point about consciousness, which I know will be controversial since I tried making it here before and it met with significant opposition. I decided that it would be logical to take a step back and clarify my epistemological stance, in the hope that it might reduce the chance that we end up talking past each other.
This post contains Francis Bacon's Novum Organum in the version presented at www.earlymoderntexts.com. The text is copyrighted by Jonathan Bennett.
Text prepared for LessWrong by Ruby.
See also Novum Organum: Introduction
Bennett's Reading Guide
[Brackets] enclose editorial explanations. Small ·dots· enclose material that has been added, but can be read as though it were part of the original text. Occasional •bullets, and also indenting of passages that are not quotations, are meant as aids to grasping the structure of a sentence or a thought. Every four-point ellipsis . . . . indicates the omission of a brief passage that seems to present more difficulty than it is worth. Longer omissions are reported between brackets in normal-sized type.
‘Organon’ is the conventional title for the collection of logical works by Aristotle, a body of doctrine that Bacon aimed to replace. His title Novum Organum could mean ‘The New Organon’ or more modestly ‘A New Organon’; the tone of the writing in this work points to the definite article.
Ruby's Reading Guide
Novum Organum is organized as two books, each containing numbered "aphorisms." These vary in length from three lines to sixteen pages. The section titles are my own and do not appear in the original.
Whereas Bennett encloses his editorial remarks in a single pair of [brackets], I have enclosed mine in a [[double pair of brackets]].
APHORISMS CONCERNING THE INTERPRETATION OF NATURE: PREFACE
Those who have taken it on themselves to lay down the law of nature as something that has already been discovered and understood, whether they have spoken in simple confidence or in a spirit of professional posturing, have done great harm to philosophy and the sciences. As well as succeeding in •producing beliefs in people, they have been effective in •squashing and stopping inquiry; and the harm they have done by spoiling and putting an end to other men’s efforts outweighs any good their own efforts have brought. Some people on the other hand have gone the opposite way, asserting that absolutely nothing can be known—having reached this opinion through dislike of the ancient sophists, or through uncertainty and fluctuation of mind, or even through being crammed with some doctrine or other. They have certainly advanced respectable reasons for their view; but zeal and posturing have carried them much too far: they haven’t •started from true premises or •ended at the right conclusion. The earlier of the ancient Greeks (whose writings are lost) showed better judgment in taking a position between
- one extreme: presuming to pronounce on everything,
- the opposite extreme: despairing of coming to understand anything.
My method is hard to practice but easy to explain. I propose to •establish degrees of certainty, to •retain ·the evidence of· the senses subject to certain constraints, but mostly to •reject ways of thinking that track along after sensation. In place of that, I open up a new and certain path for the mind to follow, starting from sense-perception. The need for this was felt, no doubt, by those who gave such importance to dialectics; their emphasis on dialectics showed that they were looking for aids to the intellect, and had no confidence in the innate and spontaneous process of the mind.
[Bacon’s dialectica, sometimes translated as ‘logic’, refers more narrowly to the formalized and rule-governed use of logic, especially in debates.]
But this remedy did no good, coming as it did after the processes of everyday life had filled the mind with hearsay and debased doctrines and infested it with utterly empty idols. (·I shall explain ‘idols’ in 39–45·.) The upshot was that the art of dialectics, coming (I repeat) too late to the rescue and having no power to set matters right, was only good for fixing errors rather than for revealing truth.
[Throughout this work, ‘art’ will refer to any human activity that involves techniques and requires skills.]
We are left with only one way to health—namely to start the work of the mind all over again. In this, the mind shouldn’t be left to its own devices, but right from the outset should be guided at every step, as though a machine were in control.
Certainly if in mechanical projects men had set to work with their naked hands, without the help and power of tools, just as in intellectual matters they have set to work with little but the naked forces of the intellect, even with their best collaborating efforts they wouldn’t have achieved—or even attempted—much. . . . Suppose that some enormous stone column had to be moved from its place (wanted elsewhere for some ceremonial purpose), and that men started trying to move it with their naked hands, wouldn’t any sober spectator think them mad? If they then brought in more people, thinking that that might do it, wouldn’t he think them even madder? If they then weeded out the weaker labourers, and used only the strong and vigorous ones, wouldn’t he think them madder than ever? Finally, if they resolved to get help from the art of athletics, and required all their workers to come with hands, arms, and sinews properly oiled and medicated according to good athletic practice, wouldn’t the onlooker think ‘My God, they are trying to show method in their madness!’?
Yet that is exactly how men proceed in intellectual matters—with just the same kind of mad effort and useless combining of forces—when they hope to achieve great things either through their individual brilliance or through the sheer number of them who will co-operate in the work, and when they try through dialectics (which we can see as a kind of athletic art) to strengthen the sinews of the intellect. With all this study and effort, as anyone with sound judgment can see, they are merely applying the naked intellect; whereas in any great work to be done by the hand of man the only way to increase the force exerted by each and to co-ordinate the efforts of all is through instruments and machinery.
Arising from those prefatory remarks, there are two more things I have to say; I want them to be known, and not forgotten. ·One concerns ancient philosophers, the other concerns modern philosophy·.
(1) If I were to declare that I could set out on •the same road as the ancient philosophers and come back with something better than they did, there would be no disguising the fact that I was setting up a rivalry between them and me, inviting a comparison in respect of our levels of excellence or intelligence or competence. There would be nothing new in that, and nothing wrong with it either, for if the ancients got something wrong, why couldn’t I—why couldn’t anyone—point it out and criticise them for it? But that contest, however right or permissible it was, might have been an unequal one, casting an unfavourable light on my powers. So it is a good thing—good for avoiding conflicts and intellectual turmoil—that I can leave untouched the honour and reverence due to the ancients, and do what I plan to do while gathering the fruits of my modesty! There won’t be any conflict here: my aim is to open up •a new road for the intellect to follow, a road the ancients didn’t know and didn’t try. I shan’t be taking a side or pressing a case. My role is merely that of a guide who points out the road—a lowly enough task, depending more on a kind of luck than on any ability or excellence.
(2) That was a point about persons; the other thing I want to remind you of concerns the topic itself. Please bear this in mind: I’m not even slightly working to overthrow the philosophy [here = ‘philosophy and science’] that is flourishing these days, or any other more correct and complete philosophy that has been or will be propounded. I don’t put obstacles in the way of this accepted philosophy or others like it; ·let them go on doing what they have long done so well·—let them give philosophers something to argue about, provide decoration for speech, bring profit to teachers of rhetoric and civil servants! Let me be frank about it: the philosophy that I shall be advancing isn’t much use for any of those purposes. It isn’t ready to hand; you can’t just pick it up as you go; it doesn’t fit with preconceived ideas in a way that would enable it to slide smoothly into the mind; and the vulgar won’t ever get hold of it except through its practical applications and its effects.
[In this work, ‘vulgar’ means ‘common, ordinary, run-of-the-mill’ (as in ‘vulgar induction’ 17) or, as applied to people, ‘having little education and few intellectual interests’.]
So let there be two sources of doctrine, two disciplines, two groups of philosophers, and two ways of doing philosophy, with the groups not being hostile or alien to each other, but bound together by mutual services. In short, let there be one discipline for cultivating the knowledge we have, and another for discovering new knowledge. This may be pleasant and beneficial for both. Most men are in too much of a hurry, or too preoccupied with business affairs, to engage with my way of doing philosophy—or they don’t have the mental powers needed to understand it. If for any of those reasons you prefer the other way—·prefer cultivation to discovery·—I wish you all success in your choice, and I hope you’ll get what you are after. But if you aren’t content to stick with the knowledge we already have, and want
- to penetrate further,
- to conquer nature by works, not conquer an adversary by argument,
- to look not for nice probable opinions but for sure proven knowledge,
I invite you to join with me, if you see fit to do so. [In this context, ‘works’ are experiments.] Countless people have stamped around in nature’s outer courts; let us get across those and try to find a way into the inner rooms. For ease of communication and to make my approach more familiar by giving it a name, I have chosen to call one of these approaches ‘the mind’s anticipation ·of nature·’, the other ‘the interpretation of nature’.
[Throughout this work, ‘anticipation’ means something like ‘second-guessing, getting ahead of the data, jumping the gun’. Bacon means it to sound rash and risky; no one current English word does the job.]
I have one request to make, ·namely that my courtesies towards you, the reader, shall be matched by your courtesies to me·. I have put much thought and care into ensuring that the things I say will be not only true but smoothly and comfortably accepted by •your mind, however clogged •it is by previous opinions. It is only fair—especially in such a great restoration of learning and knowledge—for me to ask a favour in return, namely this: If you are led •by the evidence of your senses, or •by the jostling crowd of ‘authorities’, or •by arguments in strict logical form (which these days are respected as though they were the law of the land), to want to pass judgment on these speculations of mine, don’t think you can do this casually, while you are mainly busy with something else. Examine the matter thoroughly; go a little distance yourself along the road that I describe and lay out; make yourself familiar with the subtlety of things that our experience indicates; give your deeply-rooted bad mental habits a reasonable amount of time to correct themselves; and then, when you have started to be in control of yourself, use your own judgment—if you want to.
[Bacon doesn’t ever in this work address the reader at length. This version sometimes replaces ‘If anybody. . . ’ by ‘If you. . . ’, ‘Men should. . . ’ by ‘You should. . . ’ and so on, to make the thought easier to follow.]
In light of its value as a rationalist text, its historical influence on the progress of science, and its general expression of the philosophy and vision which guides LessWrong 2.0, the moderation team has seen fit to publish Novum Organum as a LessWrong sequence.
Quotes in this post are from Francis Bacon's Novum Organum in the version by Jonathan Bennett presented at www.earlymoderntexts.com
In 1620, Francis Bacon’s Novum Organum was published. Though the work might be succinctly described as Bacon’s views on empiricism and inductivism, it is far more than a list of experimental steps to be followed. It is an entire epistemology and philosophy—possibly the epistemology and philosophy which underlay the Scientific Revolution.
Bacon was damning of the science of his time and preceding centuries. He saw the pseudo-empirical syllogistic paradigm as deeply flawed and incapable of making progress.

If those doctrines ·of the ancient Greeks· hadn’t been so utterly like a plant torn up by its roots, and had remained attached to and nourished by the womb of nature, the state of affairs that we have seen to obtain for two thousand years—namely the sciences stayed in the place where they began, hardly changing, not getting any additions worth mentioning, thriving best in the hands of their first founders and declining from then on—would never have come about. (74)
He also believed that the unaided human mind was incapable of getting far on its own.

Nearly all the things that go wrong in the sciences have a single cause and root, namely: while wrongly admiring and praising the powers of the human mind, we don’t look for true helps for it. (9)
Not much can be achieved by the naked hand or by the unaided intellect. Tasks are carried through by tools and helps, and the intellect needs them as much as the hand does. (2)
When the intellect of a sober, patient, and grave mind is left to itself (especially in a mind that isn’t held back by accepted doctrines), it ventures a little way along the right path; but it doesn’t get far, because without guidance and help it isn’t up to the task, and is quite unfit to overcome the obscurity of things. (21)
Nonetheless, he was optimistic that if the old doctrines were abandoned, idols of the mind (i.e., biases, fallacies, and confusions) were cleared out, and his precise, careful empirical method was followed by a community of scholars, then no knowledge was out of reach and humanity would eventually achieve all of the most splendid discoveries.

Until now men haven’t lingered long with •experience; they have brushed past it on their way to the ingenious •theorizings on which they have wasted unthinkable amounts of time. But if we had someone at hand who could answer our questions of the form ‘What are the facts about this matter?’, it wouldn’t take many years for us to discover all causes and complete every science. (112)
The human mind is fallible and flawed—“like a distorting mirror,” Bacon says—yet its biases can be overcome. If “the road from the senses to the intellect [is] well defended with walls along each side,” then a scientific community can figure out the world and even reach Utopia.
This is a decidedly LessWrong worldview.
Indeed, by my reading, Bacon possessed in some form a large number of concepts employed on LessWrong, not limited to: confirmation bias, motivated cognition, the bottom line, mind-projection fallacy, positive bias, entangled evidence, carving reality at its joints, fake causality, worshipping ignorance, idea inoculation, the surprising detailedness of reality, inferential distance, incentives, and dissolving confused language. He even spoke of the appropriate degrees of certainty for each stage of an inquiry and deliberately used epistemic statuses!
Novum Organum was Bacon’s monumental attempt to explain all of the above: how and why the existing scientific methods were entirely broken, why nobody had noticed until then, what the alternative method was, and a vision for a community of scholars and institutions which could help discover all scientific truths.
Covering biases and empiricism as it does, Novum Organum is highly instructive as a rationalist text. Yet why read Bacon when we’ve got the Sequences, Codex, and the rest of modern LessWrong? I answer that it’s worthwhile because there’s a focus and immediacy to a text whose author wasn’t writing abstractly, but direly wanted to redirect all the scientific efforts of his time to be more productive.
There’s an impressiveness to someone grappling with how to do science at a point when so much less was known about the world. Compared to us, Bacon’s time was one of extreme mystery. Recall that he was writing before Boyle, Newton, Maxwell, or Darwin. He did not have access to theories of thermodynamics, electromagnetism, evolution, or atomic physics. They hadn’t even invented the mercury thermometer in his time. He earnestly tried to figure out simply “what is heat?” and by use of his meticulous empiricism correctly inferred it was just something to do with motion—150 years before phlogiston theory was laid to rest and with access to only primitive air-based thermometers!
We get to look back and point to all that modern science has done over the centuries to make us feel enthusiastic. Four hundred years ago, Bacon's enthusiasm came entirely from his ability to look forward.
There is also perhaps a validation of the LessWrong worldview to be found in Bacon. Bacon was a symbolic figure of the Scientific Revolution, inspirational to the Royal Society and many others. Historical credit allocation is hard, but it seems more likely than not that Bacon deserves a good deal of credit for bringing about the Scientific Revolution. Seemingly, many of the same ideas that we cherish now were read by the scholars who first read Bacon and kicked off the modern scientific era. If only people hadn’t stopped reading Bacon in the original after a few generations.
Beyond his instruction in biases and empiricism, Bacon is an inspiration to the LessWrong 2.0 project for his visions of how infrastructure and community are key to intellectual progress. Bacon saw intellectual progress as a technological and collaborative endeavor, exactly as LessWrong 2.0 does.
At the level of technologies for individual thinking, Bacon writes:

Not much can be achieved by the naked hand or by the unaided intellect. Tasks are carried through by tools and helps, and the intellect needs them as much as the hand does. And just as the hand’s tools either give motion or guide it, so ·in a comparable way· the mind’s tools either point the intellect in the direction it should go or offer warnings. (2)
Bacon is further adamant that the process of science requires people to write their work down and share it. Perhaps this is obvious now, but Bacon was writing before the first scientific journal; indeed, he is credited as a major inspiration for the Royal Society, whose Philosophical Transactions were the first scientific journal.

Even after we have acquired and have ready at hand a store of natural history and experimental results such as is required for the work of the intellect, or of philosophy, still that is not enough. The intellect is far from being able to retain all this material in memory and recall it at will, any more than a man could keep a diary all in his head. Yet until now there has been more thinking than writing about discovery procedures—experimentation hasn’t yet become literate! But a discovery isn’t worth much if it isn’t ·planned and reported· in writing; and when this becomes the standard practice, better things can be hoped for from experimental procedures that have at last been made literate. (101)
Yet another point, maybe obvious to us now: the work of science can be split up among people.

Unlike the work of sheerly thinking up hypotheses, proper scientific work can be done collaboratively; the best way is for men’s efforts (especially in collecting experimental results) to be exerted separately and then brought together. Men will begin to know their strength only when they go this way—with one taking charge of one thing and another of another, instead of all doing all the same things. (113)
Bacon’s greatest statement on collaboration and institutions for knowledge, though, perhaps comes from his utopian novel, New Atlantis. One character describes the fictional institution of Salomon’s House:

Ye shall understand (my dear friends) that amongst the excellent acts of that king, one above all hath the pre-eminence. It was the erection and institution of an Order or Society, which we call Salomon's House; the noblest foundation (as we think) that ever was upon the earth; and the lanthorn of this kingdom. It is dedicated to the study of the works and creatures of God. Some think it beareth the founder's name a little corrupted, as if it should be Solamona's House. But the records write it as it is spoken. So as I take it to be denominate of the king of the Hebrews, which is famous with you, and no stranger to us.
The novel goes into great depth about how the institution functions and all the roles different individuals play in the scientific process. According to Wikipedia, it is this vision which inspired Samuel Hartlib and Robert Boyle to found the Royal Society.
To conclude this introduction, I’ll mention that Novum Organum is actually part two of six of Bacon’s much larger, never-completed work, Instauratio Magna. The title is usually translated as The Great Instauration, yet Bennett (whose translation of Novum Organum we are posting) renders it as The Great Fresh Start. That seems fitting to Bacon’s intentions.

It is pointless to expect any great advances in science from grafting new things onto old. If we don’t want to go around in circles for ever, making ‘progress’ that is so small as to be almost negligible, we must make a fresh start with deep foundations. (31)
Given that the Scientific Revolution got going in earnest around his lifetime, I dare say he got what he asked for.
Novum Organum consists of two books, each containing “aphorisms” which range in length from three lines to sixteen pages. A bold number on its own refers to an aphorism from Book 1 by default, or Book 2 where the context makes it very clear. When unclear, aphorisms are referenced with a leading 1- or 2- to disambiguate, e.g., 2-13 is the 13th aphorism of Book 2.
 Usually, we now call ourselves simply “LessWrong” but it feels important to disambiguate here since I cannot make claims to the vision for original LessWrong as founded in 2009 by Eliezer. It does seem clear that Eliezer was not influenced by Bacon in the same way that Habryka (LessWrong 2.0’s team lead and core founder) has been.
By technological I refer broadly to the creation of knowledge and tools that can be used for a specific purpose, including things like methodologies and procedures, not just physical artifacts. I would call a set of techniques for debiasing one’s thinking, and likewise training in how to moderate an online forum, both examples of technologies.
Things that happen offline notice: if you’re reading this I strongly predict that you will enjoy attending a Slate Star Codex meetup. I will be at the New York meetup on Saturday 9/21, come say hi!
This is part 1 of the interview, talking about Aella’s background, Twitter, Jews, the homeless, LSD ranches, Inuit mythology, insecurity, and sex. Part 2 is about psychedelics and philosophizing and will post next week.
I’m here in a secret location in the financial district of Manhattan with the most interesting woman in the world.
And we are drinking whiskey, which is appropriate.
You’re doing all the homework beforehand. It’s really cool. I haven’t experienced this before.
In these interviews, I try to channel Tyler Cowen. Before he interviews people he reads everything they’ve produced so he’s not introducing listeners to a person, but diving really deep into how they think. Because we set this interview up on short notice, I spent 8 hours yesterday consuming your entire blog and podcast archive.
Have you heard of Nardwuar? He’s an interviewer who also does this. Last night when you sent me the questions I felt that this would be like a Nardwuar interview.
I want to start with an Askholey question. Do people know the real you? Would people who follow you online be surprised if they got to spend a lot of time with you?
I don’t know what that means, “the real me”. Most likely people would be surprised by things that aren’t that interesting, like small habits I have. Things that are interesting, I tend to talk about because stories are a commodity. Learning to identify the aspects of my life that can serve as a commodity is really useful because you can put it out there and then get attention and start a dialogue in ways that are new and creative. And so the things that aren’t novel are things that might surprise you, like how I take my coffee or something.
Do you identify with the story of you that’s out there?
I don’t know what that means. There’s a strong way in which I feel like the stuff I write about myself in the past is a completely different person. The sense that I have when writing about myself in the past, is like a story that’s not me. It’s just a dream that I woke up with in my head one day and I’m sharing that dream.
So we’re getting a snapshot of Aella’s mind on September 5th, 2019, and that’s it. Is there a set date at which you will automatically unendorse everything you’re going to say today?
It kind of happens as I say it. If I’m paying attention and being mindful when speaking, then it becomes a lot more salient for me. As soon as something leaves my mouth, it’s as if it’s no longer part of me in any way.
It’s a weird thing to have a feeling that you identify with something. Like what even is that, to identify with something?
I think that there’s no objective answer to what your identity is, it’s more of a stance that you can take. If I notice a feature of me that persists over time, and people start predicting my behavior based on it, I’m also going to predict myself based on it and it becomes my identity.
So your identity is prediction?
That’s one of the ways to think about it.
I don’t identify with a lot of aspects of my body. I don’t carry things like my digestive system or subconscious habits that I have as a sense of identity. I guess I’m saying that predictability is not all that you need.
I just saw a woman wearing a t-shirt that said: “Queens are born in December”. People will craft an identity out of any random thing, like the month they were born.
It’s kind of fun!
Something I’ve been working on recently is the sense of flexible identity, being able to rapidly shift the identity that I have. And this is helping me not be afraid of carrying authority and also not be afraid of not having authority. Going back and forth rapidly between knowing what I’m talking about and knowing nothing of what I’m talking about. It feels really good to be able to shift identity or the sense of it.
I asked for question suggestions from our Twitter followers, which gave me a glimpse into your world a little bit. I have 750 followers and I’m very proud of all of them. You have 35,000, and I’m not as proud of them.
Yeah, I often have my hopes in humanity dashed by Twitter. It keeps my relation to humanity humble.
I get the sense that you’re really curious about people at large, without necessarily liking them. A lot of your Twitter questions are driven by curiosity but they don’t show people in the best light.
That is true. I enjoy bonding in darkness, finding the worst aspects of someone or forcing a bad frame or a confession. Showing that we’re all in an evil and gritty sort of world and then displaying acceptance in that world. I really enjoy the sense of acceptance and peace inside the hairy, gross stuff. And so part of my drive is to pull people downwards because I want to show that it can be safe there. It’s safe to go down. I like that.
Tyler Cowen is fond of saying that every thinker is a regional thinker. That the culture and circumstances where you grew up affect how you think and make you different from your current intellectual peers. You grew up in really unusual circumstances; how do they affect the way you think today?
It’s hard to know what is the culture and what is genetics, but there are definitely things that came into play like cultural isolation. When I was growing up access to movies and music was really limited. We weren’t allowed to see media of children who were upset with their parents so even the concept of being whiny just wasn’t in my worldview. Information was controlled at such an extreme level. And of course, I didn’t go to public school except for a couple of brief months, which didn’t have much of an effect on me.
Also very important was the philosophy of us-against-the-world. ‘Us’ being the Christians, the minority, the one that was persecuted, and we had the truth and everybody else hated us. We were told constantly that when you grow up and you go into the world, you’re going to have to stand your ground and everybody is going to think differently than you. But it’s okay to think differently. And so I think that made me feel really comfortable from an early age with the idea of being really different from everybody else. That was just the norm.
Do you still feel like it’s us-against-the-world, just transferred to a different tribe?
I don’t know. I think that this made me feel safe being different in a fundamental way. It reduced my fears around thinking terrible things, fears that other people might have. Although there was strong conformity in my culture as it was, we were still nonconformist in contrast to everyone else. So I had access to that nonconformist identity in ways that maybe the mainstream doesn’t because there’s nothing to contrast themselves to.
How about the genetic side, and also your family upbringing. Do you see any parts of your parents in yourself?
I like to think of myself as sort of the blend between my parents. My dad’s quite severe. He has Asperger’s and is ambitious in a kind of way. He’s very intelligent and kind of immature and very masculine in a low empathy, low agreeableness manner. And I can see some of that in myself, too. When you’re talking about me being more interested in curiosity and less in liking people that might come from low sociality.
But my mom is extremely empathetic and pro-social, exactly the opposite. And she taught us a lot of social norms that feel ingrained in me on a deep level, like being extremely kind to people around you no matter what. This has led to me sometimes engaging in actions that are weird, like going out of my way to get a homeless person an Uber. This backfired, the homeless person did not get in the Uber, they were insane. But I tried to do that when other people already knew that that’s not something you’re supposed to do.
I just took a homeless person grocery shopping for the first time.
Did that work?
Yeah, it was a 10/10 great experience. I see a lot of homeless people in NYC with terrible signs that are clearly lies. But this guy’s sign said: “A $10 shopping spree at CVS would be life-changing”. I liked how direct it was. I asked, “Hey, buddy, what’s your name?” -”Justin.” -“OK, Justin, let’s go shopping at CVS. Anything you want is on me.”
All he got was a box of tuna. I asked if he wanted anything else but he said that someone had already bought him soup, and that soup plus tuna is a great dinner. He also got really excited about how many different kinds of tuna there were and I almost said, “Yeah, isn’t capitalism amazing?” but in the end, I didn’t say that.
I may not have been 100% sober when this happened.
Did that cause you to do this or do you tend to do stuff like this in general? I guess it’s already within you.
I try to do more things like this in general, but I find myself doing them a lot less when I’m sober. But this was such a great experience that I’m going to do more stuff like that.
I used to do this kind of thing when I lived in Seattle. I lived alone; I was kind of lonely. And so one night I got pretty drunk and then went out on the street and saw this group of homeless people and gave them all LSD and then they all started tripping. I just hung out with them all night.
I continued to hang out with this group and would bring them food all the time and gifts. They were very picky about the food and gifts that I brought them, actually, but it was a really great experience. Homeless people and LSD is a great combo.
To all the people out there: go buy some tuna, put LSD in it, and go make friends under the local bridge.
You’re a very unique person in almost every domain, what you believe and the things you do. I have a speculative theory about that. It sounds like none of the social scripts that you’ve absorbed as a child still ring true, and so you had to figure things out from scratch at 19.
I learned a lot of things from my parents and where I grew up about what’s appropriate and what’s good. Scripts for family life, work, education, how to treat people, etc. Some of these I discarded, but for the most part, they were pretty good scripts to follow.
But the only scripts you had came from your parents, since you didn’t have things like TV shows to offer an alternative. And the social rules you got from your parents seemed so bad that you completely rejected them. And so at 19, you had to figure out how to interact with the world with a completely open mind, doing sex work and psychedelics and all of that. It’s like you were born again as an adult with no prior programming.
Weirdly, I think some of this might come down to the religious training that I had. You know my dad’s a professional evangelical. A big deal growing up was proving other faiths wrong. And I was always terrified, how do we know they’re wrong and we’re not the ones that are wrong? I recognized that there is this weird parallel between my religion and the other religions. And so a huge part of my attention was devoted towards noticing that people just buy into what they’re raised in, and that can send them to hell to burn for eternity.
This was an extremely terrifying idea to me, that people could just not deeply question what they believed in. I knew I had to do that for my own faith too, which is eventually what knocked me out of it. This deep fear of not questioning the thing that I’m put into was a really big theme for me as a teenager and adult. I viewed everything with really strong skepticism because I’d always been judging people of other faiths, like Muslims and Catholics, for not looking on their lives with a critical eye.
That put the idea into my brain that nothing is sacred at all because my God, the most sacred thing of all could be taken down by reason or at least questioning.
So you still have an instinct to rigorously question things like systems of belief.
I think it’s a point of annoyance with even my closest friends—the ones I may want to build a commune with—that I’m too quick to be skeptical about belief systems.
Let’s talk about friends and communities. I often think about how the people I spend most of my time with are not really representative of broad humanity, they’re huge outliers in terms of intelligence, education (I don’t mean college degrees), resources, etc. I imagine that for you this is even more extreme. Do you find it strange, living on an island of people who are strongly selected and unusual?
Yes. When I was growing up I felt super introverted and everybody viewed me as an introvert. And then I found the Rationalists and I realized for the first time that I was an extrovert with them, the people that I clicked with. I felt like I’d been looking for them my whole life. There is a way in which my current community is satisfying something for me that normal communities can’t and so I deeply appreciate that about them even though you get normalized to it. I’m assuming this happens to you, too, although maybe not.
I don’t really talk to people outside of my bubble very much. And then when I do, it’s quite a shock. I forget that the world is so incredibly different from me and my friends.
I have a normie day job, I have a soccer team and things like that. One of the reasons why I like New York and don’t move to Berkeley is that I don’t really want to have everyone I talk to be a rationalist.
I assume that growing up in Idaho in a fundamentalist Calvinist family you didn’t know a lot of Jewish people. When was the first time you met one?
I didn’t know any Jewish people growing up. When I moved to the camgirl house in Washington a lot of the other girls had moved from the East Coast and would just talk about Jews or refer to Jews playfully. And I’m like, what the fuck are you talking about? How do you have all these stereotypes about Jews? Where do you meet them? I didn’t know that there was a whole bunch of them here in New York.
And now it seems like every passing year you’re surrounded by more and more Jews.
Oh, yeah. Now it’s just Jew city.
I kind of like it. I feel mild affection for the Jews on the basis of their race, which might be racist, I’m not sure. I tend to have positive experiences with them, especially Israelis. I really like the Israelis that I’ve met. When I go to burns, often there’s a community of Israelis I hang out with and they’re quite interesting, culturally different people.
I’m glad that my tribe is representing.
Yeah, they’re doing it well.
You mentioned potentially forming a commune or an intentional community. Is that your biggest goal in life right now?
I think so. I’m starting to shift from an “explore mindset” to a “settle mindset”, and people around me have been talking about building communities forever. I’ve been afraid of it because you hear constantly about how intentional communities fail and explode. I don’t know any intentional communities that have been together for decades with the same people. Maybe I’m not looking in the right places.
Anyway, this has made me a little bit afraid of pursuing it but now I’m starting to want it badly enough that I’m going to pursue it anyway. I’m interested in getting land or a large building.
With what sort of people? You wrote: “Belief systems aren’t ‘the things we’ve logically concluded about the world.’ They are structures that give us a way to interact with our environment.” And the environment is mostly the people around you. So what sort of people are you looking for and with what shared belief system?
Sufficient intelligence, emotional maturity, and creativity are the three qualities I’m most interested in if we’re going to be scoring people on tests or something. With regards to a belief system, I think I’m primarily interested in a specific relationship to that belief system. The flexibility to recognize that beliefs are something that you hold instead of having the belief hold you. Being able to manipulate or shift around your beliefs to figure out what is most useful in the moment, because I feel like people have a lot more options for interaction available to them if they’re not subject to their beliefs. I’m really drawn to people who seem to have that quality. I would love to live with those people.
So: the anti-cult. You want questioning people who won’t just settle into agreeing on a thing just because that gets them accepted by the group.
Right. I think this falls into the intelligence category a bit. You want people who have this curiosity, which is something that I often find lacking with very creative people, weirdly enough. You’d think that creative people are quite curious, but I’ve been disappointed in this for a while.
What are the main obstacles right now for getting that community set up?
So you’re looking for a millionaire who thinks your community idea sounds cool?
That would be dope, and I don't think it's that infeasible, really. But I don't like the idea of being at the whim of someone else. Independence is quite important to me. And if you accept money from someone who's then going to be part of this project, you're subject to their whims.
It shouldn’t be hard for people who are intelligent, emotionally mature, and creative to make money. I’m not optimizing for making money right now but I could still save up enough in a few years to then live for 20 years on a ranch in Montana. I just want someone to help babysit my kids when I have them.
This is kind of what’s going on. I and some of my friends have some funds that we’re interested in using. Not a ton, but communities can also be businesses. You could have a community that runs a retreat or workshop center. Get investment from someone and start a company.
The FDA could eventually approve psilocybin and LSD therapy for things like addiction and depression.
An LSD cult on a ranch, it’s not like that hasn’t been done before.
Tim Ferriss recently declared that he’s putting all his energy into psychedelics research. It’s a massive business opportunity.
Yeah, I think it’s going to be big and hopefully, it’s done right this time. Last time it was a little… people weren’t ready.
Some people were more than ready. Some people had a great time.
That is true.
That’s the girl who got raped by her brother and cut off her breasts?
I don’t think I was so much identifying with one of the characters as the incredible darkness of the tales of the Inuit life. It seems when reading about legends of different tribes that the farther north you go, the more horrific they are. Demons crawling out of things to eat children and such.
Igaluk and Malina, in Inuit mythology she ends up becoming the moon and he the sun and he’s chasing her forever. This is their fucking sun and moon tale. In other places, it’s the God-father giving birth to the earth and the sun rises in a chariot or something more innocent.
The Aztecs were a tropical civilization, and their myths are pretty dark. But it does seem like warm-weather cultures have more trickster gods who aren't really vicious; they just fuck around.
In other mythologies, there are classic tales of Hero’s Journey or some shit. But if you fall down the Wikipedia hole of Inuit mythology it’s just consistently gut-wrenching.
So what’s the morality that they teach? That the night is dark and full of terrors?
Probably. I haven’t thought about it too much, but my original theory was that the environment is a lot more threatening than it might be if you’re in a tropical or warm place. So the stories reflect that. It does stay dark up there for a really long time for half the year.
I wonder if they have a set of nicer myths to tell in July when it’s sunny all the time and then they bust out the book of really dark tales in December.
Let’s talk about sex.
There are many approaches and uses of sex, and you’ve explored quite a few of them. And some of them are somewhat in tension. Sex can be used for bonding and creating intimate relationships with people of both genders. But you also wrote about how at the heart of it is this vicious game of power and status. What does sex mean for you and how do you use it?
Maybe the question that I have the least insight into is myself and sex. I’ve been in and out of sex work for a while so for me, sex has primarily functioned as a money-making thing. It’s practical, a business, the thing that you do to support yourself and eat food. It’s quite distant and it’s very much a tool for me. And so in my personal life, it’s shifted a lot. I’ve gone from having a really high sex drive to a really low sex drive and being uncertain about what I want in my partners.
In general, status seems to be attractive. People who are high status seem to make me want to have sex with them by being high status. Not always, obviously, but it’s the highest correlate.
You said on a podcast that when you first left home you were really horny and had sex all the time.
That’s true. My upbringing was extremely sexually repressive and getting out of it made it feel like a fun game. And of course, I treated it very rationally, thinking I need to be rational in all my actions and so I need to be rational about sex. And rationally, if I enjoy pleasure and people enjoy pleasure, I should probably have sex with as many people who bring me pleasure as I can.
And so I did, but it turns out sex is very complicated. I wasn’t actually enjoying it very much.
It’s probably common for every single human to at some point think they have sex figured out and then to discover how much there is to learn.
I’m kind of excited by the confusion that surrounds sex. Usually, when I meditate or introspect, I can develop a framework of what I’m feeling. But with sexual arousal, it’s just totally opaque. I assume it’s this way for a lot of other people too, like we don’t really know what gives people kinks. It’s a very weird part of human psychology.
I don’t have sex “figured out” but my own relationship with it is more straightforward, at least in how it works for me. It’s not a business for me, it’s not a power move. It’s pleasure and bonding; I always liked women more after sleeping with them.
It makes sense.
I hear women give each other advice that sleeping with a guy will make him respect the woman less. And I understand the evolutionary psychology and the power aspect of it, but I don’t share this feeling at all. I respect anyone who made the smart choice to have sex with me!
Many people see you as a sex object and project their sexual fantasies on you. And also many people see you as a guru and a teacher and project their hopes for enlightenment and wisdom on you. Either one of these roles is rare enough, how weird is it to be seen as both at the same time?
I suspect that the people that view me as a sex object and the people that view me as a teacher are not the same people.
The sex object thing doesn’t really feel like much to me. It feels like a weird glitch. It’s like playing a video game and running a character through this virtual world, and other characters sometimes run up to your character and try and hump it. It’s just people reacting to this type of body in a specific way. It feels quite impersonal and practical. When I was doing sex work, camming and putting photos online, it just felt like a very practical way of existing. It’s like it wasn’t really me doing it at all, but my avatar was doing it. So I don’t really think about it that much.
The guru thing is weirder. There’s this phenomenon that I didn’t realize I was going to have once I hit some level of attention, which is that people treat me as a conduit for their own validation. So they recognize that I have some sort of quote-unquote authority or awareness or attention from others that gives me status. But the thing that they’re looking for is not for my opinions, but rather for me to witness them. So they explain their problems to me, usually over email, and it seems that what they want is for me to just look at it. And so that makes me feel slightly dehumanized. It’s like I am operating as a character that serves you a specific function. It feels a little weird, more so than the sex thing.
Do you hear any problems that you feel really unqualified to answer?
I don’t have any that I remember right now, but there are some where it’s obvious that mental illnesses are at stake. For those I think I have a slightly different approach – “crazy” feels like a frame you can apply or not apply. I tend to be a little bit more open to people when they are in a crazy state because it’s not crazy, it’s just different. It’s typically just a sub-optimal way of achieving your goals.
Let’s talk about insecurity. My favorite thing that you ever wrote is So Says Crazybrain. It sounds like you’ve been thinking about insecurity a lot, attacking it in many ways. What’s the state of that struggle right now?
It vacillates, but the overall trend is having less of it. Insecurity seems to be not knowing what you are and being afraid that what you are is bad in some way. And so all of my insecurities come from the blind spots that I hold within myself. Sometimes my attention is too small or I’m not practicing mindfulness enough such that the blind spots grow and I start to experience insecurity in daily life. I think that this is why insecurity tends to go away in familiar situations because you get to know yourself more in a familiar situation.
So insecurity is feeling like you don’t know yourself, and if you knew you would be disappointed.
I struggled with it a huge amount. When I was younger, I was almost cripplingly insecure. I couldn’t do things that other kids could do because I was so afraid of what people thought of me. I don’t know if this came from my dad who is also extremely insecure.
It’s interesting that you tie monogamy to insecurity. And it seems that even though the amount of insecurity you feel vacillates, you’re never ceding polyamory to it.
I have trouble even imagining not being poly. It’s not a question that I ever think about. I became poly almost as soon as I heard of it. I was 19, and at the time I was dating my boyfriend monogamously because that’s all I’d ever heard of and I was just doing the thing that everybody did. And then I met this guy and his wife who were living with a third person and they were all poly. And once they explained polyamory to me it made so much sense, it was obviously the thing that I want to do.
I tried explaining this to my boyfriend, why it should be okay for me to have sex with this new guy. But my boyfriend didn’t get it and we ended up breaking up. Ever since then, I was definitely poly. I’m not going to date anybody ever again who doesn’t want me to have sex with someone I want to have sex with.
Suppose we have a gearbox. On one side is a crank, on the other side is a wheel which spins when the crank is turned. We want to predict the rotation of the wheel given the rotation of the crank, so we run a Kaggle competition.
We collect hundreds of thousands of data points on crank rotation and wheel rotation. 70% are used as training data, the other 30% set aside as test data and kept under lock and key in an old nuclear bunker. Hundreds of teams submit algorithms to predict wheel rotation from crank rotation. Several top teams combine their models into one gradient-boosted deep random neural support vector forest. The model achieves stunning precision and accuracy in predicting wheel rotation.
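As a toy sketch of this black-box approach (pure Python standing in for the hypothetical prize-winning ensemble; the gear ratio of 0.5 and the noise level are invented for illustration), even a one-parameter least-squares fit predicts the wheel superbly:

```python
import random

random.seed(0)

# Invented toy gearbox: wheel rotation = 0.5 * crank rotation, plus a little
# measurement noise. (Made-up numbers, standing in for the Kaggle dataset.)
TRUE_RATIO = 0.5
data = [(c, TRUE_RATIO * c + random.gauss(0, 0.01)) for c in range(1, 1001)]
train, held_out = data[:700], data[700:]

# Black-box model: least-squares slope through the origin.
ratio_hat = sum(c * w for c, w in train) / sum(c * c for c, _ in train)

# Excellent accuracy on the held-out 30%...
max_err = max(abs(ratio_hat * c - w) for c, w in held_out)
assert max_err < 0.1
```

...and yet `ratio_hat` is a single number summarizing input-output behavior; nothing in the fitted model says anything about what is inside the box.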
On the other hand, in a very literal sense, the model contains no gears. Is that a problem? If so, when and why would it be a problem?

What is Missing?
When we say the model “contains no gears”, what does that mean, in a less literal and more generalizable sense?
Simplest answer: the deep random neural support vector forest model does not tell us what we expect to see if we open up the physical gearbox.
For instance, consider these two gearboxes:
Both produce the same input-output behavior. Our model above, which treats the gearbox as a literal black box, does not tell us anything at all which would distinguish between these two cases. It only talks about input-output behavior, without making any predictions about what’s inside the gearbox (other than that the gearbox must be consistent with the input/output behavior).
That’s the key feature of gears-level models: they make falsifiable predictions about the internals of a system, separate from the externally-visible behavior. If a model could correctly predict all of a system’s externally-visible behavior, but still be falsified by looking inside the box, then that’s a gears-level model. Conversely, we cannot fully learn gears-level models by looking only at externally-visible input-output behavior - external behavior cannot, for example, distinguish between the 3- and 5-gear models above. A model which can be fully learned from system behavior, without any side information, is not a full gears-level model.
Why would this be useful, if what we really care about is the externally-visible behavior? Several reasons:
- First and foremost, if we are able to actually look inside the box, then that provides a huge amount of information about the behavior. If we can see the physical gears, then we can immediately make highly confident predictions about system behavior.
- More generally, any information about the internals of the system provides a “side channel” for testing gears-level models. If data about externally-visible behavior is limited, then the ability to leverage data about system internals can be valuable.
- It may be that all of our input data is only from within a certain range - i.e. we never tried cranking the box faster than a human could crank. If someone comes along and attaches a motor to the crank, then that’s going to generate input way outside the range of what our input/output model has ever seen - but if we know what the gears look like, then that won’t be a problem. In other words, knowing what the system internals look like lets us deal with distribution shifts.
- Finally, if someone changes something about the system, then a model trained only on input/output data will fail completely. For instance, maybe there’s a switch on top of the gearbox which disconnects the gears, and nobody has ever thrown it before. If we know what the inside of the box looks like, then that’s not a problem - we can look at what the switch does.
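A minimal sketch of that last failure mode (the disconnect switch is invented for illustration): a model fit purely on past input/output data keeps predicting the old behavior after the system changes, while a model that represents the switch does not.

```python
def gearbox(crank, switch_engaged=True):
    """Hypothetical gearbox: fixed ratio 0.5 while the gears are engaged,
    zero output once the switch disconnects them."""
    return 0.5 * crank if switch_engaged else 0.0

# Fit a black-box ratio on data gathered while the switch was engaged.
train = [(c, gearbox(c)) for c in range(1, 101)]
ratio_hat = sum(c * w for c, w in train) / sum(c * c for c, _ in train)

# Someone throws the never-before-used switch: the black-box model is now
# badly wrong, but the gears-level model (which represents the switch) is fine.
crank = 10.0
black_box_pred = ratio_hat * crank                  # still predicts 5.0
gears_pred = gearbox(crank, switch_engaged=False)   # correctly 0.0
print(black_box_pred, gears_pred)
```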
All that said, if we have abundant data and aren’t worried about distribution shifts or system changes, non-gears models can still give us great predictive power. Solomonoff induction is the idealized theoretical example: it gives asymptotically optimal predictions based on input-output behavior, without any visibility into the system internals.

Application: Macroeconomic Models
One particularly well-known example of these ideas in action is the Lucas Critique, a famous 1976 paper by Bob Lucas critiquing the use of simple statistical models for evaluation of economic policy decisions. Lucas’ paper gives several broad examples, but arguably the most remembered example is policy decisions based on the Phillips curve.
The Phillips curve is an empirical relationship between unemployment and inflation. Phillips examined almost a century of economic data, and showed a consistent negative correlation: when inflation was high, unemployment was low, and vice-versa. In other words, prices and wages rise faster at the peak of the business cycle (when unemployment is low) than at the trough (when unemployment is high).
The obvious mistake one might make, based on the Phillips curve, is to think that perpetual low unemployment can be achieved simply by creating perpetual inflation (e.g. by printing money). Lucas opens his critique by eviscerating this very idea:

“The inference that permanent inflation will therefore induce a permanent economic high is no doubt [...] ancient, yet it is only recently that this notion has undergone the mysterious transformation from obvious fallacy to cornerstone of the theory of economic policy.”
Bear in mind that this was written in the mid-1970’s - the era of “stagflation”, when both inflation and unemployment were high for several years. Stagflation was an empirical violation of the Phillips curve - the historical behavior of the system broke down when central banks changed their policies to pursue more inflation, and people changed their behavior to account for faster expected inflation in the future.
In short: a statistical model with no gears in it completely fell apart when one part of the system (the central bank) changed its behavior.
On the other hand, before stagflation was under way, multiple theorists (notably Edmund Phelps and Milton Friedman, via very different approaches) published simple gears-level models of the Phillips curve which predicted that it would break down if currencies were predictably devalued - i.e. if people expected central banks to print more money. The key “gears” in these models were individual agents - the macroeconomic behavior (unemployment-inflation relationship) was explained in terms of the expectations and decisions of all the individual people.
This led to a paradigm shift in macroeconomics, beginning the era of “microfoundations”: macroeconomic models derivable from microeconomic models of the expectations and behavior of individual agents - in other words, gears-level models of the economy.

Gears from Behavior?
In general, we cannot fully learn gears-level models by looking only at externally-visible input-output behavior. Our hypothetical 3- or 5-gear boxes are a case in point.
However, some kinds of models can at least deduce something about gears-level structure by looking at externally-visible behavior.
For example: given a gearbox with a crank and wheel, it’s entirely possible that the rotation of the wheel has hysteresis, a.k.a. memory - it depends not only on the crank’s rotation now, but also the crank’s rotation earlier. This would be the case if, for instance, the box contains a flywheel. If we look at the data and see that the wheel’s rotation has no dependence on the crank’s rotation at earlier times (after accounting for the crank’s current rotation), then we can conclude that the box probably does not contain any flywheels or other hysteretic components (or if it does, they’re small or decoupled from the wheel).
More generally, these sort of conditional independence relationships fall under the umbrella of probabilistic causal models. By testing different causal models on externally-visible data, we can back out information about the internal cause-and-effect structure of the system. If we see that only the crank’s current rotation matters to the wheel, then that rules out internal components with memory.
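Here is a crude pure-Python sketch of that test on simulated data (a real analysis would use a proper conditional-independence test): regress the wheel on the current crank position, then check whether the residuals still correlate with the previous crank position.

```python
import random

random.seed(1)

def partial_corr(y, x, z):
    """Correlation between y and z after regressing out x (through-the-origin
    linear regression): a crude conditional-dependence check."""
    bx = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    resid = [b - bx * a for a, b in zip(x, y)]
    mz, mr = sum(z) / len(z), sum(resid) / len(resid)
    cov = sum((a - mz) * (b - mr) for a, b in zip(z, resid))
    var = (sum((a - mz) ** 2 for a in z) * sum((b - mr) ** 2 for b in resid)) ** 0.5
    return cov / var

crank = [random.gauss(0, 1) for _ in range(5000)]

# Memoryless box: the wheel depends only on the current crank position.
memless = [0.5 * c + random.gauss(0, 0.05) for c in crank]

# Box with a flywheel: the wheel also remembers its own previous position.
fly = [0.0]
for c in crank[1:]:
    fly.append(0.6 * c + 0.4 * fly[-1])

# Does yesterday's crank still matter once today's crank is accounted for?
memless_dep = abs(partial_corr(memless[1:], crank[1:], crank[:-1]))  # ~0
fly_dep = abs(partial_corr(fly[1:], crank[1:], crank[:-1]))          # clearly nonzero
print(memless_dep, fly_dep)
```

The memoryless box shows (approximately) zero conditional dependence on the lagged crank, while the flywheel box does not - exactly the signature that lets us rule hysteretic components in or out from behavior alone.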
Causal models are the largest class of statistical models I know of which yield information about internal gears. However, they’re not the only way to build gears-level models from behavior. If we have strong prior information, we can often use behavioral data to directly compare gears-level hypotheses.

Application: Wolf’s Dice
In the mid-19th century, the Swiss astronomer Rudolf Wolf rolled a pair of dice 20,000 times, recording each outcome. The main result was that the dice were definitely not perfectly fair: there were small but statistically significant biases.
Now, we could easily look at Wolf’s data and use it to estimate the frequency with which each face of each die is rolled. But that’s not a gears-level model; it doesn’t say anything about the physical die.
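For concreteness, here is that non-gears frequency model on hypothetical counts (invented numbers, not Wolf's actual data, which is tabulated in Jaynes's paper), together with a chi-square check that the die is biased at all:

```python
# Hypothetical roll counts for one die (NOT Wolf's actual data).
counts = {1: 3270, 2: 3300, 3: 3250, 4: 3310, 5: 3320, 6: 3550}
total = sum(counts.values())

# Behavioral (non-gears) model: just the observed frequency of each face.
freq = {face: n / total for face, n in counts.items()}

# Chi-square goodness-of-fit against a perfectly fair die.
expected = total / 6
chi2 = sum((n - expected) ** 2 / expected for n in counts.values())
print(chi2 > 11.07)  # exceeds the 5% critical value for 5 d.o.f.: biased die
```

This tells us the die is biased and by how much each face deviates, but nothing about the physical asymmetries producing the bias.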
In order to back out gears-level information from the data, we need to leverage our prior knowledge about dice and die-making. Jaynes did exactly this in a 1979 paper; the key pieces of prior information are:
- We know dice are roughly cube-shaped, and any difference in face frequencies should stem from asymmetry of the physical die. We know 3 is opposite 4, 2 is opposite 5, and 1 is opposite 6.
- We know dice have little pips showing the numbers on each face; different faces have different numbers of pips, which we’d expect to introduce a slight asymmetry.
- Imagining how the dice might have been manufactured, Jaynes guesses that the final cut would have been more difficult to make perfectly even than the earlier cuts - leaving one axis slightly shorter/longer than the other two.
Based on those asymmetries, we’d guess:
- One of the three face pairs (3, 4), (2, 5), (1, 6) has significantly different frequency from the others, corresponding to the last axis cut.
- The faces with fewer pips (3, 2, and especially 1) have slightly lower frequency than those with more pips (4, 5, and especially 6), since more pips means slightly less mass near that face.
- Other than that, the frequencies should be pretty even.
This is basically a guess-and-check process: we guess what asymmetry might be present based on our prior knowledge, consider how that would change the behavior, then we use the data to check the model.
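A sketch of that guess-and-check loop in pure Python, on the same style of hypothetical counts (a simple least-squares stand-in, not Jaynes's actual maximum-entropy calculation): parametrize the two suspected asymmetries and see whether fitting them beats the fair-die model.

```python
# Hypothetical roll counts (invented, not Wolf's data).
counts = {1: 3270, 2: 3300, 3: 3250, 4: 3310, 5: 3320, 6: 3550}
total = sum(counts.values())
freq = {f: n / total for f, n in counts.items()}

# Feature 1: pip asymmetry (more pips -> less mass near that face).
pip = {f: f - 3.5 for f in range(1, 7)}
# Feature 2: the last-cut axis; +2/3 on the 3-4 faces, -1/3 elsewhere,
# chosen to sum to zero so adjusting it can't change total probability.
axis = {f: (2 / 3 if f in (3, 4) else -1 / 3) for f in range(1, 7)}

def sse(a, b):
    """Sum of squared errors of the two-parameter model against the data."""
    return sum((1 / 6 + a * pip[f] + b * axis[f] - freq[f]) ** 2
               for f in range(1, 7))

# Crude grid search over the two effect sizes: guess, then check.
grid = [i / 10000 for i in range(-100, 101)]
a_hat, b_hat = min(((a, b) for a in grid for b in grid),
                   key=lambda ab: sse(*ab))
print(a_hat, b_hat, sse(a_hat, b_hat) < sse(0, 0))
```

The fitted effect sizes are (rough) statements about the physical die - the pip mass and the last-cut axis - which is what makes this a gears-level analysis rather than a frequency table.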
Jaynes tests out these models, and finds that (1) the white die’s 3-4 axis is slightly shorter than the other two, and (2) the pips indeed shift the center of mass slightly away from the center of the die. These two asymmetries together explain all of the bias seen in the data, so the die should be quite symmetric otherwise. I analyze the same problem in this post (using slightly different methods from Jaynes) and reproduce the same result.
Because this is a gears-level model, we could in principle check the result using a “side channel”: if we could track down the dice Wolf used, then we could take out our calipers and measure the lengths of the 3-4, 2-5, and 1-6 axes. Our prediction is that the 2-5 and 1-6 axes would be close, but the 3-4 axis would be significantly shorter. Note that we still don’t have a full gears-level model - we don’t predict how much shorter the 3-4 axis is. We don’t have a way to back out all the dimensions of the die. But we certainly expect the difference between the 3-4 length and the 2-5 length to be much larger than the difference between the 2-5 length and the 1-6 length. Our model yields some information about gears-level structure.

Takeaway
Statistics, machine learning, and adjacent fields tend to have a myopic focus on predicting future data.
Gears-level models cannot be fully learned by looking at externally-visible behavior data. That makes it hard to prove theorems about convergence of statistical methods, or write tests for machine learning algorithms, when the goal is to learn about a system’s internal gears. So, to a large extent, these fields have ignored gears-level learning and focused on predicting future data. Gears have snuck in only to the extent that they’re useful for predicting externally-visible behavior.
But sooner or later, any field dominated by a gears-less worldview will have its Lucas Critique.
It is possible to leverage probability to test gears-level models, and to back out at least some information about a system’s internal structure. It’s not easy. We need to restrict ourselves to certain classes of models (e.g. causal models) and/or leverage lots of prior knowledge (e.g. about dice). It looks less like black-box statistical/ML models, and more like science: think about what the physical system might look like, figure out how the data would differ between different possible physical systems, and then go test it. The main goal is not to predict future data, but to compare models.
That’s the kind of approach we need to build models which won’t fall apart every time central banks change their policies.
Mod note by habryka: This is a crosspost of an event from SSC meetups everywhere. If you are the owner of this meetup and want to claim it, please leave a comment or contact me via private message.
I'm often reluctant to ask for explanations on LW, and to typical-mind a bit I think this may be true of others as well. This suggests that when you're writing something for public consumption, it's better to err on the side of too much rather than too little explanation. If there's too much explanation, people can just skip over it (and you can make it easier by putting explanations that may be "too much" in parentheses or footnotes), but if there's too little explanation people may never ask for it. So in the future if you ever think something like, "I'll just write down what I think, and if people don't understand why, they can ask" I hope this post will cause you to have a second thought about that.
To make it clearer that this problem can't be solved by just asking or training people to be less reluctant to ask for explanations, I think there are often "good" reasons for such reluctance. Here's a list that I came up with during a previous discussion with Raymond Arnold (Raemon):
1. I already spent quite some time trying to puzzle out the explanation, and asking is like admitting defeat.
2. If there is a simple explanation that I reasonably could have figured out without asking, I look bad by asking.
3. It's forcing me to publicly signal interest, and maybe I don't want to do that.
4. Related to 3, it's forcing me to raise the status of the person I'm asking, by showing that I'm interested in what they're saying. (Relatedly, I worry this might cause people to withhold explanations more often than they should.)
5. If my request is ignored or denied, I would feel bad, perhaps in part because it seems to lower my status.
6. I feel annoyed that the commenter didn't value my time enough to preemptively include an explanation, and therefore don't want to interact further with them.
7. My comment requesting an explanation is going to be read by lots of people for whom it has no value, and I don't want to impose that cost on them, or make them subconsciously annoyed at me, etc.
8. By the time the answer comes, the topic may have left my short-term memory, or I may not be that interested anymore.
I've looked at a few of Stuart Armstrong's posts that he put up related to his research agenda (though only ones before he posted the full agenda), and felt like I was missing some prereqs. My background is in philosophy. What subjects or particular resources should I study to be able to read his work?
I live in Iran, and here people strongly believe in Avicenna’s humorism (or what is thought of it in popular culture, anyway). It is believed on the level of it being “common sense.” For example, if you eat fish, milk, broccoli, and tomato sauce, all of which are “cold,” you’re supposed to balance that out by eating walnuts and dates. My personal impression is that there is probably some truth to this simplistic model of nutrition, as I see a lot of anecdotal evidence for it, but, well, I’d like to see what the science says on the subject.
Note that the humorism believed in here (Iran) is not a strawman; people don’t believe that humor imbalance is the root cause of all diseases. It is mostly believed that if you eat very imbalanced foods, you have a significant chance of becoming “unwell” - e.g., you can get a stomachache, vomit, or get a sore sensation in the mouth. (I am not actually very knowledgeable about the traditional lore here; different imbalances are associated with different symptoms.)
I might add that both my parents are experienced specialized medical doctors, and they, too, believe that there is something to all this.
My personal “wishful thinking bias” in this matter is that I like the whole thing to be false. I generally dislike nutritional restrictions, and I dislike traditions and alternative medicine. :))
Cursory Internet searches did not lead me to good meta-analyses on this subject. I just found unempirical dismissals asserting that these beliefs are now considered pseudoscience. I suspect premature theoretical disbelief rather than careful study of subtle effects.
Artificial intelligence defeated a pair of professional Starcraft II players for the first time in December 2018. Although this was generally regarded as an impressive achievement, it quickly became clear that not everybody was satisfied with how the AI agent, called AlphaStar, interacted with the game, or how its creator, DeepMind, presented it. Many observers complained that, in spite of DeepMind’s claims that it performed at similar speeds to humans, AlphaStar was able to control the game with greater speed and accuracy than any human, and that this was the reason why it prevailed.
Although I think this story is mostly correct, I think it is harder than it looks to compare AlphaStar’s interaction with the game to that of humans, and to determine to what extent this mattered for the outcome of the matches. Merely comparing raw numbers for actions taken per minute (the usual metric for a player’s speed) does not tell the whole story, and appropriately taking into account mouse accuracy, the differences between combat actions and non-combat actions, and the control of the game’s “camera” turns out to be quite difficult.
Here, I begin with an overview of Starcraft II as a platform for AI research, a timeline of events leading up to AlphaStar’s success, and a brief description of how AlphaStar works. Next, I explain why measuring performance in Starcraft II is hard, show some analysis on the speed of both human and AI players, and offer some preliminary conclusions on how AlphaStar’s speed compares to humans’. After this, I discuss the differences in how humans and AlphaStar “see” the game and the impact this has on performance. Finally, I give an update on DeepMind’s current experiments with Starcraft II and explain why I expect we will encounter similar difficulties when comparing human and AI performance in the future.

Why Starcraft is a Target for AI Research
Starcraft II has been a target for AI for several years, and some readers will recall that Starcraft II appeared on our 2016 expert survey. But there are many games and many AIs that play them, so it may not be obvious why Starcraft II is a target for research or why it is of interest to those of us that are trying to understand what is happening with AI.
For the most part, Starcraft II was chosen because it is popular and because it is difficult for AI. Starcraft II is a real-time strategy game, and like similar games, it involves a variety of tasks: harvesting resources, constructing bases, researching technology, building armies, and attempting to destroy the opponent’s base. Playing it well requires balancing attention between many things at once: planning ahead, ensuring that one’s units1 are good counters for the enemy’s units, predicting opponents’ moves, and changing plans in response to new information. There are other aspects that make it difficult for AI in particular: it has imperfect information2, an extremely large action space, and takes place in real time. When humans play, they engage in long-term planning, make the best use of their limited capacity for attention, and craft ploys to deceive the other players.
The game’s popularity is important because it makes the game a good source of extremely talented human players and increases the number of people who will intuitively understand how difficult the task is for a computer. Additionally, as a game designed for high-level competition, it is carefully balanced so that competition is fair, does not favor just one strategy3, and does not rely too heavily on luck.

Timeline of Events
To put AlphaStar’s performance in context, it helps to understand the timeline of events over the past few years:
November 2016: Blizzard and DeepMind announce they are launching a new project in Starcraft II AI
August 2017: DeepMind releases the Starcraft II API, a set of tools for interfacing AI with the game
March 2018: Oriol Vinyals gives an update, saying they’re making progress, but he doesn’t know if their agent will be able to beat the best human players
December 12, 2018: AlphaStar wins five straight matches against TLO, a professional Starcraft II player, who was playing as Protoss4, which is off-race for him. DeepMind keeps the matches secret.
December 19, 2018: AlphaStar, given an additional week of training time5, wins five consecutive Protoss vs Protoss matches vs MaNa, a pro Starcraft II player who is higher ranked than TLO and specializes in Protoss. DeepMind continues to keep the victories a secret.
January 24, 2019: DeepMind announces the successful test matches vs TLO and MaNa in a live video feed. MaNa plays a live match against a version of AlphaStar which had more constraints on how it “saw” the map, forcing it to interact with the game in a way more similar to humans6. AlphaStar loses when MaNa finds a way to exploit a blatant failure of the AI to manage its units sensibly. The replays of all the matches are released, and people start arguing7 about how (un)fair the matches were, whether AlphaStar is any good at making decisions, and how honest DeepMind was in presenting the results of the matches.
July 10, 2019: DeepMind and Blizzard announce that they will allow an experimental version of AlphaStar to play on the European ladder8, for players who opt in. The agent will play anonymously, so that most players will not know that they are playing against a computer. Over the following weeks, players attempt to discern whether they played against the agent, and some post replays of matches in which they believe they were matched with the agent.

How AlphaStar works
The best place to learn about AlphaStar is from DeepMind’s page about it. There are a few particular aspects of the AI that are worth keeping in mind:
It does not interact with the game like a human does: Humans interact with the game by looking at a screen, listening through headphones or speakers, and giving commands through a mouse and keyboard. AlphaStar is given a list of units or buildings and their attributes, which includes things like their location, how much damage they’ve taken, and which actions they’re able to take, and gives commands directly, using coordinates and unit identifiers. For most of the matches, it had access to information about anything that wouldn’t normally be hidden from a human player, without needing to control a “camera” that focuses on only one part of the map at a time. For the final match, it had a camera restriction similar to humans, though it still was not given screen pixels as input. Because it gives commands directly through the game, it does not need to use a mouse accurately or worry about tapping the wrong key by accident.
It is trained first by watching human matches, and then through self-play: The neural network is trained first on a large database of matches between humans, and then by playing against versions of itself.
It is a set of agents selected from a tournament: Hundreds of versions of the AI play against each other, and the ones that perform best are selected to play against human players. Each one has its own set of units that it is incentivized to use via reinforcement learning, so that they each play with different strategies. TLO and MaNa played against a total of 11 agents, all of which were selected from the same tournament, except the last one, which had been substantially modified. The agents that defeated MaNa had each played for hundreds of years in the virtual tournament9.

January/February Impressions Survey Forecasts
The timing and nature of AlphaStar’s success seems to have been mostly in line with people’s expectations, at least at the time of the announcement. Some respondents did not expect to see it for a year or two, but on average, AlphaStar was less than a year earlier than expected. It is probable that some respondents had been expecting it to take longer, but updated their predictions in 2016 after finding out that DeepMind was working on it. For future expectations, a majority of respondents expect to see an agent (not necessarily AlphaStar) that can beat the best humans without any of the current caveats within two years. In general, I do not think that I worded the forecasting questions carefully enough to infer very much from the answers given by survey respondents.
Some readers may be wondering how these survey results compare to those of our more careful 2016 survey, or how we should view the earlier survey results in light of MaNa and TLO's defeat at the hands of AlphaStar. The 2016 survey specified an agent that only receives a video of the screen, so that prediction has not yet resolved. But the median respondent assigned 50% probability of seeing such an agent that can defeat the top human players at least 50% of the time by 202110. I don't personally know how hard it is to add in that capability, but my impression from speaking to people with greater machine learning expertise than mine is that this is not out of reach, so these predictions still seem reasonable, and are not generally in disagreement with the results from my informal survey.

Speed
Nearly everyone thought that AlphaStar was able to give commands faster and more accurately than humans, and that this advantage was an important factor in the outcome of the matches. I looked into this in more detail, and wrote about it in the next section.

Camera
As I mentioned in the description of AlphaStar, it does not see the game the same way that humans do. Its visual field covered the entire map, though its vision was still affected by the usual fog of war11. Survey respondents ranked this as an important factor in the outcome of the matches.
Given these results, I decided to look into the speed and camera issues in more detail.

The Speed Controversy
Starcraft is a game that rewards the ability to micromanage many things at once and give many commands in a short period of time. Players must simultaneously build their bases, manage resource collection, scout the map, research better technology, build individual units to create an army, and fight battles against other players. The combat is sufficiently fine grained that a player who is outnumbered or outgunned can often come out ahead by exerting better control over the units that make up their military forces, both on a group level and an individual level. For years, there have been simple Starcraft II bots that, although they cannot win a match against a highly-skilled human player, can do amazing things that humans can’t do, by controlling dozens of units individually during combat. In practice, human players are limited by how many actions they can take in a given amount of time, usually measured in actions per minute (APM). Although DeepMind imposed restrictions on how quickly AlphaStar could react to the game and how many actions it could take in a given amount of time, many people believe that the agent was sometimes able to act with superhuman speed and precision.
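To make the APM numbers concrete, here is a minimal sketch of how a 5-second-bin APM series can be computed from a replay's list of action timestamps. The function name and data are my own illustration, not part of any replay-analysis tool:

```python
from collections import Counter

def apm_bins(action_times, bin_seconds=5):
    """Count actions in fixed-width time bins (timestamps in seconds)
    and convert each bin's count to an actions-per-minute rate."""
    counts = Counter(int(t // bin_seconds) for t in action_times)
    n_bins = int(max(action_times) // bin_seconds) + 1
    return [counts[i] * (60 / bin_seconds) for i in range(n_bins)]

# 12 actions within the first 5 seconds -> 144 APM for that bin
times = [0.1 * k for k in range(1, 13)]
series = apm_bins(times)
```

Averaging over such a series gives a match-wide APM, and taking its maximum gives a peak 5-second APM of the kind compared below.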
Here is a graph12 of the APM for MaNa (red) and AlphaStar (blue), through the second match, with five-second bins:

Actions per minute for MaNa (red) and AlphaStar (blue) in their second game. The horizontal axis is time, and the vertical axis is 5 second average APM.
At first glance, this looks reasonably even. AlphaStar has both a lower average APM (180 vs MaNa's 270) for the whole match, and a lower peak 5 second APM (495 vs MaNa's 615). This seems consistent with DeepMind's claim that AlphaStar was restricted to human-level speed. But a more detailed look at which actions are actually taken during these peaks reveals some crucial differences. Here's a sample of actions taken by each player during their peaks:

Lists of commands for MaNa and AlphaStar during each player's peak APM for game 2
MaNa hit his APM peaks early in the game by using hot keys to twitchily switch back and forth between control groups13 for his workers and the main building in his base. I don't know why he's doing this: maybe to warm up his fingers (which apparently is a thing), as a way to watch two things at once, to keep himself occupied during the slow parts of the early game, or some other reason understood only by the kinds of people that can produce Starcraft commands faster than I can type. But it drives up his peak APM, and probably is not very important to how the game unfolds14. Here's what MaNa's peak APM looked like at the beginning of Game 2 (if you look at the bottom of the screen, you can see that the units he has selected switch back and forth between his workers and the building that he uses to make more workers):

MaNa's play during his peak APM for match 2. Most of his actions consist of switching between control groups without giving new commands to any units or buildings
AlphaStar hit peak APM in combat. The agent seems to reserve a substantial portion of its limited actions budget until the critical moment when it can cash them in to eliminate enemy forces and gain an advantage. Here's what that looked like near the end of game 2, when it won the engagement that probably won it the match (while still taking a few actions back at its base to keep its production going):

AlphaStar's play during its peak APM in match 2. Most of its actions are related to combat, and require precise timing.
It may be hard to see what exactly is happening here for people who have not played the game. AlphaStar (blue) is using extremely fine-grained control of its units to defeat MaNa’s army (red) in an efficient way. This involves several different actions: Commanding units to move to different locations so they can make their way into his base while keeping them bunched up and avoiding spots that make them vulnerable, focusing fire on MaNa’s units to eliminate the most vulnerable ones first, using special abilities to lift MaNa’s units off the ground and disable them, and redirecting units to attack MaNa’s workers once a majority of MaNa’s military units are taken care of.
Given these differences between how MaNa and AlphaStar play, it seems clear that we can't just use raw match-wide APM to compare the two, which most people paying attention seem to have noticed fairly quickly after the matches. The more difficult question is whether AlphaStar won primarily by playing with a level of speed and accuracy that humans are incapable of, or by playing better in other ways. Though, based on the analysis that I am about to present, I think the answer is probably that AlphaStar won through speed, I also think the question is harder to answer definitively than many critics of DeepMind are making it out to be.
A very fast human can average well over 300 APM for several minutes, with 5 second bursts at over 600 APM. Although these bursts are not always throwaway commands like those from the MaNa vs AlphaStar matches, they tend not to be commands that require highly accurate clicking, or rapid movement across the map. Take, for example, this 10 second, 600 APM peak from current top player Serral:

Serral's play during a 10 second, 600 APM peak
Here, Serral has just finished focusing on a pair of battles with the other player, and is taking care of business in his base, while still picking up some pieces on the battlefield. It might not be obvious why he is issuing so many commands during this time, so let’s look at the list of commands:
The lines that say “Morph to Hydralisk” and “Morph to Roach” represent a series of repeats of that command. For a human player, this is a matter of pressing the same hotkey many times, or even just holding down the key to give the command very rapidly15. You can see this in the gif by looking at the bottom center of the screen where he selects a bunch of worm-looking things and turns them all into a bunch of egg-looking things (it happens very quickly, so it can be easy to miss).
What Serral is doing here is difficult, and the ability to do it only comes with years of practice. But the raw numbers don’t tell the whole story. Taking 100 actions in 10 seconds is much easier when a third of those actions come from holding down a key for a few hundred milliseconds than when they each require a press of a different key or a precise mouse click. And this is without all the extraneous actions that humans often take (as we saw with MaNa).
Because it seems to be the case that peak human APM happens outside of combat, while AlphaStar's wins happened during combat APM peaks, we need to do a more detailed analysis to determine the highest APM a human player can achieve during combat. To try to answer this question, I looked at approximately ten APM peaks for each of the 5 games between AlphaStar and MaNa, as well as for each of another 15 replays between professional Starcraft II players. The peaks were chosen so that roughly half were the largest peak at any time during the match and the rest were strictly during combat. My methodology for this is given in the appendix. Here are the results for just the human vs human matches:

Histogram of 5-second APM peaks from analyzed matches between human professional players in a tournament setting. The blue bars are peaks achieved outside of combat, while the red bars are those achieved during combat.
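The bucketing behind a histogram like this is straightforward once each peak has been labeled by hand as combat or non-combat. Here is a minimal sketch; the function name and the sample peaks are my own illustration, not the actual measurements:

```python
def peak_histogram(peaks, bin_width=50):
    """Group (apm, in_combat) peak records into histogram buckets,
    split by whether each peak occurred during combat."""
    combat, noncombat = {}, {}
    for apm, in_combat in peaks:
        bucket = int(apm // bin_width) * bin_width  # e.g. 495 -> the 450 bucket
        target = combat if in_combat else noncombat
        target[bucket] = target.get(bucket, 0) + 1
    return combat, noncombat

# Made-up peaks for illustration only
sample = [(615, False), (495, True), (320, True), (580, False)]
combat, noncombat = peak_histogram(sample)
```

The hard part, as discussed below, is not this bookkeeping but deciding which label each peak deserves in the first place.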
Provisionally, it looks like pro players frequently hit approximately 550 to 600 APM outside of combat before the distribution starts to fall off, and they peak at around 200-350 during combat, with a long right tail. As I was doing this, however, I found that the highest APM peaks all shared something that the lower peaks did not: it was difficult to tell when a player's actions were primarily combat-oriented commands, and when they were mixed in with bursts of commands for things like training units. In particular, I found that the combat situations with high APM tended to be similar to the Serral gif above, in that they involved spam clicking and actions related to the player's economy and production, which was probably driving up the numbers. I give more details in the appendix, but I don't think I can say with confidence that any players were achieving greater than 400-450 APM in combat, in the absence of spurious actions or macromanagement commands.
The more pertinent question might be what the lowest APM is that a player can have while still succeeding at the highest level. Since we know that humans can succeed without exceeding this APM, it is not an unreasonable limitation to put on AlphaStar. The lowest peak combat APM I saw for a winning player in my analysis was 215, though it could be that I missed a higher peak during combat in that same match.
Here is a histogram of AlphaStar’s combat APM:
The smallest 5-second APM that AlphaStar needed to win a match against MaNa was just shy of 500. I found 14 cases in which the agent was able to average over 400 APM for 5 seconds in combat, and six times when the agent averaged over 500 APM for more than 5 seconds. This was done with perfect accuracy and no spam clicking or control group switching, so I think we can safely say that its play was faster than is required for a human to win a match in a professional tournament. Given that I found no cases where a human was clearly achieving this speed in combat, I think I can comfortably say that AlphaStar had a large enough speed advantage over MaNa to have substantially influenced the match.
It’s easy to get lost in numbers, so it’s good to take a step back and remind ourselves of the insane level of skill required to play Starcraft II professionally. The top professional players already play with what looks to me like superhuman speed, precision, and multitasking, so it is not surprising that the agent that can beat them is so fast. Some observers, especially those in the Starcraft community, have indicated that they will not be impressed until AI can beat humans at Starcraft II at sub-human APM. There is some extent to which speed can make up for poor strategy and good strategy can make up for a lack of speed, but it is not clear what the limits are on this trade-off. It may be very difficult to make an agent that can beat professional Starcraft II players while restricting its speed to an undisputedly human or sub-human level, or it may simply be a matter of a couple more weeks of training time.

The Camera
As I explained earlier, the agent interacts with the game differently than humans. As with other games, humans look at a screen to know what’s happening, use a mouse and keyboard to give commands, and need to move the game’s ‘camera’ to see different parts of the play area. With the exception of the final exhibition match against MaNa, AlphaStar was able to see the entire map at once (though much of it is concealed by the fog of war most of the time), and had no need to select units to get information about them. It’s unclear just how much of an advantage this was for the agent, but it seems likely that it was significant, if nothing else because it did not suffer from the APM overhead just to look around and get information from the game. Furthermore, seeing the entire map makes it easier to simultaneously control units across the map, which AlphaStar used to great effect in the first five matches against MaNa.
For the exhibition match in January, DeepMind trained a version of AlphaStar that had similar camera control to human players. Although the agent still saw the game in a way that was abstracted from the screen pixels that humans see, it only had access to about one screen’s worth of information at a time, and it needed to spend actions to look at different parts of the map. A further disadvantage was that this version of the agent only had half as much training time as the agents that beat MaNa.
Here are three factors that may have contributed to AlphaStar’s loss:
- The agent was unable to deal effectively with the added complication of controlling the camera
- The agent had insufficient training time
- The agent had easily exploitable flaws the whole time, and MaNa figured out how to use them in match 6
For the third factor, I mean that the agent had sufficiently many exploitable flaws that were obvious enough to human players that any skilled human player could find at least one during a small number of games. The best humans do not have a sufficient number of such flaws to influence the game with any regularity. Matches in professional tournaments are not won by causing the other player to make the same obvious-to-humans mistake over and over again.
I suspect that AlphaStar’s loss in January is mainly due to the first two factors. In support of 1, AlphaStar seemed less able to simultaneously deal with things happening on opposite sides of the map, and less willing to split its forces, which could plausibly be related to an inability to simultaneously look at distant parts of the map. It’s not just that the agent had to move the camera to give commands on other parts of the map. The agent had to remember what was going on globally, rather than being able to see it all the time. In support of 2, the agent that MaNa defeated had only as much training time as the agents that went up against TLO, and those agents lost to the agents that defeated MaNa 94% of the time during training16.
Still, it is hard to dismiss the third factor. One way in which an agent can improve through training is to encounter tactics that it has not seen before, so that it can react well if it encounters them in the future. But the tactics it encounters are only those that another agent employed, and without seeing the agents during training, it is hard to know whether any of them learned the harassment tactics that MaNa used in game 6, so it is hard to know whether the agents that defeated MaNa were susceptible to the exploit that he used to defeat the last agent. So far, the evidence from DeepMind's more recent experiment pitting AlphaStar against the broader Starcraft community (which I will go into in the next section) suggests that the agents do not tend to learn defenses to these types of exploits, though it is hard to say if this is a general problem or just one associated with low training time or particular kinds of training data.

AlphaStar on the Ladder
For the past couple months, as of this writing, skilled European players have had the opportunity to play against AlphaStar as part of the usual system for matching players with those of similar skill. For the version of AlphaStar that plays on the European ladder, DeepMind claims to have made changes that address the camera and action speed complaints from the January matches. The agent needs to control the camera, and they say they have placed restrictions on AlphaStar’s performance in consultation with pro players, particularly the maximum actions per minute and per second that the agent can make. I will be curious to see what numbers they arrive at for this. If this was done in an iterative way, such that pro players were allowed to see the agent play or to play against it, I expect they were able to arrive at a good constraint. Given the difficulty that I had with arriving at a good value for a combat APM restriction, I’m less confident that they would get a good value just by thinking about it, though if they were sufficiently conservative, they probably did alright.
Another reason to expect a realistic APM constraint is that DeepMind wanted to run the European ladder matches as a blind study, in which the human players did not know they were playing against an AI. If the agent were to play with the superhuman speed and accuracy that AlphaStar did in January, it would likely give it away and spoil the experiment.
Although it is unclear whether any players were able to tell they were playing against an AI during their match, it does seem that some were able to figure it out after the fact. One example comes from Lowko, a Dutch player who streams and does commentary for games. During a stream of a ladder match in Starcraft II, he noticed the player was doing some strange things near the end of the match, like lifting their buildings17 when the match had clearly been lost, and air-dropping workers into Lowko's base to kill units. Lowko did eventually win the match. Afterward, he was able to view the replay from the match and see that the player he had defeated did some very strange things throughout the entire match, the most notable of which was how the player controlled their units. The player used no control groups at all, which is, as far as I know, not something anybody does at high-level play18. There were many other quirks, which he describes in his entertaining video, which I highly recommend to anyone who is interested.
Other players have released replay files from matches against players they believed were AlphaStar, and they show the same lack of control groups. This is great, because it means we can get a sense of what the new APM restriction is on AlphaStar. There are now dozens of replay files from players who claim to have played against the AI. Although I have not done the level of analysis that I did with the matches in the APM section, it seems clear that they have drastically lowered the APM cap, with the matches I have looked at topping out at 380 APM peaks, which did not even occur in combat.
It seems to be the case that DeepMind has brought the agent’s interaction with the game more in line with human capability, but we will probably need to wait until they release the details of the experiment before we can say for sure.
Another notable aspect of the matches that people are sharing is that their opponent will do strange things that human players, especially skilled human players, almost never do, most of which are detrimental to their success. For example, they will construct buildings that block them into their own base, crowd their units into a dangerous bottleneck to get to a cleverly-placed enemy unit, and fail to change tactics when their current strategy is not working. These are all the types of flaws that are well-known to exist in game-playing AI going back to much older games, including the original Starcraft, and they are similar to the flaw that MaNa exploited to defeat AlphaStar in game 6.
All in all, the agents that humans are uncovering seem to be capable, but not superhuman. Early on, the accounts that were identified as likely candidates for being AlphaStar were winning about 90-95% of their matches on the ladder, achieving Grandmaster rank, which is reserved for only the top 200 players in each region. I have not been able to conduct a careful investigation to determine the win rate or Elo rating for the agents. However, based on the videos and replays that have been released, plausible claims from reddit users, and my own recollection of the records for the players that seemed likely to be AlphaStar19, a good estimate is that they were winning a majority of matches among Grandmaster players, but did not achieve an Elo rating that would suggest a favorable outcome in a rematch vs TLO20.
As with AlphaStar’s January loss, it is hard to say if this is the result of insufficient training time, additional restrictions on camera control and APM, or if the flaws are a deeper, harder-to-solve problem for AI. It may seem unreasonable to chalk this up to insufficient training time given that it has been several months since the matches in December and January, but it helps to keep in mind that we do not yet know what DeepMind’s research goals are. It is not hard to imagine that their goals are based around sample efficiency or some other aspect of AI research that requires such restrictions. As with the APM restrictions, we should learn more when we get results published by DeepMind.

Discussion
I have been focusing on what many onlookers have been calling a lack of “fairness” of the matches, which seems to come from a sentiment that the AI did not defeat the best humans on human terms. I think this is a reasonable concern; if we’re trying to understand how AI is progressing, one of our main interests is when it will catch up with us, so we want to compare its performance to ours. Since we already know that computers can do the things they’re able to do faster than we can do them, we should be less interested in artificial intelligence that can do things better than we can by being faster or by keeping track of more things at once. We are more interested in AI that can make better decisions than we can.
Going into this project, I thought that the disagreements surrounding the fairness of the matches were due to a lack of careful analysis, and I expected it to be very easy to evaluate AlphaStar’s performance in comparison to human-level performance. After all, the replay files are just lists of commands, and when we run them through the game engine, we can easily see the outcome of those commands. But it turned out to be harder than I had expected. Separating careful, necessary combat actions (like targeting a particular enemy unit) from important but less precise actions (like training new units) from extraneous, unnecessary actions (like spam clicks) turned out to be surprisingly difficult. I expect if I were to spend a few months learning a lot more about how the game is played and writing my own software tools to analyze replay files, I could get closer to a definitive answer, but I still expect there would be some uncertainty surrounding what actually constitutes human performance.
It is unclear to me where this leaves us. AlphaStar is an impressive achievement, even with the speed and camera advantages. I am excited to see the results of DeepMind’s latest experiment on the ladder, and I expect they will have satisfied most critics, at least in terms of the agent’s speed. But I do not expect it to become any easier to compare humans to AI in the future. If this sort of analysis is hard in the context of a game where we have access to all the inputs and outputs, we should expect it to be even harder once we’re looking at tasks for which success is less clear cut or for which the AI’s output is harder to objectively compare to humans. This includes some of the major targets for AI research in the near future. Driving a car does not have a simple win-loss condition, and novel writing does not have clear metrics for what good performance looks like.
The answer may be that, if we want to learn things from future successes or failures of AI, we need to worry less about making direct comparisons between human performance and AI performance, and keep watching the broad strokes of what’s going on. From AlphaStar, we’ve learned that one of two things is true: Either AI can do long-term planning, solve basic game theory problems, balance different priorities against each other, and develop tactics that work, or that there are tasks which seem at first to require all of these things but did not, at least not at a high level.

Acknowledgements
Thanks to Gillian Ring for lending her expertise in e-sports and for helping me understand some of the nuances of the game. Thanks to users of the Starcraft subreddit for helping me track down some of the fastest players in the world. And thanks to Blizzard and DeepMind for making the AlphaStar match replays available to the public.
All mistakes are my own, and should be pointed out to me via email at firstname.lastname@example.org.
I received a total of 22 submissions, which wasn’t bad given the survey’s length. Two respondents failed to correctly answer the question designed to filter out people that are goofing off or not paying attention, leaving 20 useful responses. Five people who filled out the survey were affiliated in some way with AI Impacts. Here are the responses for respondents’ self-reported level of expertise in Starcraft II and artificial intelligence:
Survey respondents’ mean expertise rating was 4.6/10 for Starcraft II and 4.9/10 for AI.
For this one, it seems easiest to show a screenshot from the survey:
The results from this indicated that people thought the match was unfair and favored AlphaStar:
I asked respondents to rate AlphaStar’s overall performance, as well as its “micro” and “macro”. The term “micro” is used to refer to a player’s ability to control units in combat, and is greatly improved by speed. There seems to have been some misunderstanding about how to use the word “macro”. Based on comments from respondents and looking around to see how people use the term on the Internet, it seems that there are at least three somewhat distinct ways that people use it, and I did not clarify which I meant, so I’ve discarded the results from that question.
For the next two questions, the scale ranges from 0 to 10, with 0 labeled “AlphaStar is much worse” and 10 labeled “AlphaStar is much better”
I found these results interesting: even though AlphaStar was able to consistently defeat professional players, some survey respondents felt the outcome alone was not enough to rate it as at least as good as the best humans.
How do you think AlphaStar’s micro compares to the best humans?
Survey respondents unanimously reported that they thought AlphaStar’s combat micromanagement was an important factor in the outcome of the matches.

Forecasting Questions
Respondents were split on whether they expected to see AlphaStar’s level of Starcraft II performance by this time:

Did you expect to see AlphaStar’s level of performance in a Starcraft II agent:
- Before now: 1
- Around this time: 8
- Later than now: 7
- I had no expectation either way: 4
Respondents who indicated that they expected it sooner or later than now were also asked by how many years their expectation differed from reality. If we assign negative numbers to “Before now”, positive numbers to “Later than now”, zero to “Around this time”, ignore those with no expectation, and weight responses by level of expertise, we find respondents’ mean expectation was just 9 months later than the announcement, and the median respondent expected to see it around this time. Here is a histogram of these results, without expertise weighting:
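The expertise weighting used here amounts to an ordinary weighted mean. A minimal sketch, with made-up responses rather than the actual survey data:

```python
def weighted_mean(responses):
    """Weighted mean of (value, weight) pairs. Here, values are
    'years later than expected' (negative = expected it sooner)
    and weights are self-reported expertise ratings."""
    total_weight = sum(w for _, w in responses)
    return sum(v * w for v, w in responses) / total_weight

# (offset_years, expertise) pairs -- illustrative only, not survey data
responses = [(-1, 5), (0, 4), (2, 6), (1, 3)]
mean_offset = weighted_mean(responses)  # in years
```

The same calculation, applied to the real responses, is what yields the 9-month figure above.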
These results do not generally indicate too much surprise about seeing a Starcraft II agent of AlphaStar’s ability now.
This question was intended to outline an AI that would satisfy almost anybody that Starcraft II is a solved game, such that AI is clearly better than humans, and not for “boring” reasons like superior speed. Most survey respondents expected to see such an agent in two-ish years, with a few a little longer, and two that expected it to take much longer. Respondents had a median prediction of two years and an expertise-weighted mean prediction of a little less than four years.

Questions About Relevant Considerations

How important do you think the following were in determining the outcome of the AlphaStar vs MaNa matches?
I listed 12 possible considerations to be rated in importance, from 1 to 5, with 1 being “not at all important” and 5 being “extremely important”. The expertise weighted mean for each question is given below:
Respondents rated AlphaStar’s peak APM and camera control as the two most important factors in determining the outcome of the matches, and the particular choice of map and professional player as the two least important considerations.
Again, respondents rated a series of considerations by importance, this time for thinking about AlphaStar in a broader context. This included all of the considerations from the previous question, plus several others. Here are the results, again with expertise weighted averaging.
For these two sets of questions, there was almost no difference between the mean scores if I used only Starcraft II expertise weighting, only AI expertise weighting, or ignored expertise weighting entirely.
The rest of the questions were free-form to give respondents a chance to tell me anything else that they thought was important. Although these answers were thoughtful and shaped my thinking about AlphaStar, especially early on in the project, I won’t summarize them here.
I created a list of professional players by asking users of the Starcraft subreddit which players they thought were exceptionally fast. Replays including these players were found by searching Spawning Tool for replays from tournament matches which included at least one player from the list of fast players. This resulted in 51 replay files.
Several of the replay files were so old that they could no longer be opened by the current version of Starcraft II, so I ignored them. Others were ignored because they included players, race matchups, or maps that were already represented in other matches. Some were ignored because we did not get to them before we had collected what seemed to be enough data. This left 15 replays that made it into the analysis.
I opened each file using Scelight and recorded the time and APM values for the top three peaks on the graph of that player’s APM, using 5-second bins. Next, I opened the replay file in Starcraft II and, for each peak recorded earlier, wrote down whether that player was primarily engaging in combat at the time or not. Additionally, I recorded the time and APM for each player for 2-4 5-second durations of the game in which the players were primarily engaged in combat.
All of the APM values which came from combat and from outside of combat were aggregated into the histogram shown in the ‘Speed Controversy’ section of this article.
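The 5-second-bin arithmetic behind these APM figures can be sketched as follows. This is only an illustration of how a bin’s action count converts to an actions-per-minute figure; Scelight’s actual event handling (including which actions it excludes, such as camera updates) is more involved:

```python
from collections import Counter

def apm_by_bin(action_times, bin_seconds=5):
    """Bucket action timestamps (in seconds) into fixed-width bins and
    scale each bin's count to actions per minute. For 5-second bins
    the scale factor is 60 / 5 = 12.
    """
    counts = Counter(int(t // bin_seconds) for t in action_times)
    scale = 60 / bin_seconds
    return {b: n * scale for b, n in counts.items()}

# 30 actions spread over the first 5 seconds of a game -> 360 APM in bin 0
times = [i * (5 / 30) for i in range(30)]
top_peaks = sorted(apm_by_bin(times).values(), reverse=True)[:3]
```

This also makes the bin-width caveat below concrete: a burst of actions straddling a bin boundary gets split across two bins, so short bursts can read as lower peak APM than they really were.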
There are several potential sources of bias or error in this:
- Our method for choosing players and matches may be biased. We were seeking examples of humans playing with speed and precision, but it’s possible that by relying on input from a relatively small number of Reddit users (as well as some personal friends), we missed something.
- This measurement relies entirely on my subjective evaluation of whether the players are mostly engaged in combat. I am not an expert on the game, and it seems likely that I missed some things, at least some of the time.
- The tool I used for this seems to mismatch events in the game by a few seconds. Since I was using 5-second bins, and sometimes a player’s APM will change greatly between 5-second bins, it’s possible that this introduced a significant error.
- The choice of 5-second bins (as opposed to something shorter or longer) is somewhat arbitrary, but it is what some people in the Starcraft community were using, so I’m using it here.
- Some actions are excluded from the analysis automatically. These include camera updates, and this is probably a good thing, but I did not look carefully at the source code for the tool, so it may be doing something I don’t know about.
To do effective differential technological development for AI safety, we'd like to know which combinations of AI insights are more likely to lead to FAI vs UFAI. This is an overarching strategic consideration which feeds into questions like how to think about the value of AI capabilities research.
As far as I can tell, there are actually several different stories for how we may end up with a set of AI insights which makes UFAI more likely than FAI, and these stories aren't entirely compatible with one another.
Note: In this document, when I say "FAI", I mean any superintelligent system which does a good job of helping humans (so an "aligned Task AGI" also counts).

Story #1: The Roadblock Story
Nate Soares describes the roadblock story in this comment:
...if a safety-conscious AGI team asked how we’d expect their project to fail, the two likeliest scenarios we’d point to are "your team runs into a capabilities roadblock and can't achieve AGI" or "your team runs into an alignment roadblock and can easily tell that the system is currently misaligned, but can’t figure out how to achieve alignment in any reasonable amount of time."
The roadblock story happens if there are key safety insights that FAI needs but AGI doesn't need. In this story, the knowledge needed for FAI is a superset of the knowledge needed for AGI. If the safety insights are difficult to obtain, or no one is working to obtain them, we could find ourselves in a situation where we have all the AGI insights without having all the FAI insights.
There is subtlety here. In order to make a strong argument for the existence of insights like this, it's not enough to point to failures of existing systems, or describe hypothetical failures of future systems. You also need to explain why the insights necessary to create AGI wouldn't be sufficient to fix the problems.
Some possible ways the roadblock story could come about:
- Maybe safety insights are more or less agnostic to the chosen AGI technology and can be discovered in parallel. (Stuart Russell has pushed against this, saying that in the same way making sure bridges don't fall down is part of civil engineering, safety should be part of mainstream AI research.)
- Maybe safety insights require AGI insights as a prerequisite, leaving us in a precarious position where we will have acquired the capability to build an AGI before we begin critical FAI research.
  - This could be the case if the needed safety insights are mostly about how to safely assemble AGI insights into an FAI. It's possible we could do a bit of this work in advance by developing "contingency plans" for how we would construct FAI in the event of combinations of capabilities advances that seem plausible.
  - Paul Christiano's IDA framework could be considered a contingency plan for the case where we develop much more powerful imitation learning.
  - Contingency plans could also be helpful for directing differential technological development, since we'd get a sense of the difficulty of FAI under various tech development scenarios.
- Maybe there will be multiple subsets of the insights needed for FAI which are sufficient for AGI.
  - In this case, we'd like to speed the discovery of whichever FAI insight will be discovered last.
Story #2: The Security Story

CORAL: You know, back in mainstream computer security, when you propose a new way of securing a system, it's considered traditional and wise for everyone to gather around and try to come up with reasons why your idea might not work. It's understood that no matter how smart you are, most seemingly bright ideas turn out to be flawed, and that you shouldn't be touchy about people trying to shoot them down.
The main difference between the security story and the roadblock story is that in the security story, it's not obvious that the system is misaligned.
We can subdivide the security story based on the ease of fixing a flaw if we're able to detect it in advance. For example, vulnerability #1 on the OWASP Top 10 is injection, which is typically easy to patch once it's discovered. Insecure systems are often right next to secure systems in program space.
If the security story is what we are worried about, it could be wise to try & develop the AI equivalent of OWASP's Cheat Sheet Series, to make it easier for people to find security problems with AI systems. Of course, many items on the cheat sheet would be speculative, since AGI doesn't actually exist yet. But it could still serve as a useful starting point for brainstorming flaws.
Differential technological development could be useful in the security story if we push for the development of AI tech that is easier to secure. However, it's not clear how confident we can be in our intuitions about what will or won't be easy to secure. In his book Thinking Fast and Slow, Daniel Kahneman describes his adversarial collaboration with expertise researcher Gary Klein. Kahneman was an expertise skeptic, and Klein an expertise booster:
We eventually concluded that our disagreement was due in part to the fact that we had different experts in mind. Klein had spent much time with fireground commanders, clinical nurses, and other professionals who have real expertise. I had spent more time thinking about clinicians, stock pickers, and political scientists trying to make unsupportable long-term forecasts. Not surprisingly, his default attitude was trust and respect; mine was skepticism.
When do judgments reflect true expertise? ... The answer comes from the two basic conditions for acquiring a skill:
- an environment that is sufficiently regular to be predictable
- an opportunity to learn these regularities through prolonged practice
In a less regular, or low-validity, environment, the heuristics of judgment are invoked. System 1 is often able to produce quick answers to difficult questions by substitution, creating coherence where there is none. The question that is answered is not the one that was intended, but the answer is produced quickly and may be sufficiently plausible to pass the lax and lenient review of System 2. You may want to forecast the commercial future of a company, for example, and believe that this is what you are judging, while in fact your evaluation is dominated by your impressions of the energy and competence of its current executives. Because substitution occurs automatically, you often do not know the origin of a judgment that you (your System 2) endorse and adopt. If it is the only one that comes to mind, it may be subjectively undistinguishable from valid judgments that you make with expert confidence. This is why subjective confidence is not a good diagnostic of accuracy: judgments that answer the wrong question can also be made with high confidence.
Our intuitions are only as good as the data we've seen. "Gathering data" for an AI security cheat sheet could be helpful for developing security intuition. But I think we should be skeptical of intuition anyway, given the speculative nature of the topic.

Story #3: The Alchemy Story
Batch Norm is a technique that speeds up gradient descent on deep nets. You sprinkle it between your layers and gradient descent goes faster. I think it’s ok to use techniques we don’t understand. I only vaguely understand how an airplane works, and I was fine taking one to this conference. But it’s always better if we can build systems on top of things we do understand deeply. This is what we know about why batch norm works well. But don’t you want to understand why reducing internal covariate shift speeds up gradient descent? Don’t you want to see evidence that Batch Norm reduces internal covariate shift? Don’t you want to know what internal covariate shift is? Batch Norm has become a foundational operation for machine learning. It works amazingly well. But we know almost nothing about it.
The alchemy story has similarities to both the roadblock story and the security story.
From the perspective of the roadblock story, "alchemical" insights could be viewed as insights which could be useful if we only cared about creating AGI, but are too unreliable to use in an FAI. (It's possible there are other insights which fall into the "usable for AGI but not FAI" category due to something other than their alchemical nature--if you can think of any, I'd be interested to hear.)
In some ways, alchemy could be worse than a clear roadblock. It might be that not everyone agrees whether the systems are reliable enough to form the basis of an FAI, and then we're looking at a unilateralist's curse scenario.
Just like chemistry only came after alchemy, it's possible that we'll first develop the capability to create AGI via alchemical means, and only acquire the deeper understanding necessary to create a reliable FAI later. (This is a scenario from the roadblock section, where FAI insights require AGI insights as a prerequisite.) To prevent this, we could try & deepen our understanding of components we expect to fail in subtle ways, and retard the development of components we expect to "just work" without any surprises once invented.
From the perspective of the security story, "alchemical" insights could be viewed as components which are clearly prone to vulnerabilities. Alchemical components could produce failures which are hard to understand or summarize, let alone fix. From a differential technological development point of view, the best approach may be to differentially advance less alchemical, more interpretable AI paradigms, developing the AI equivalent of reliable cryptographic primitives. (Note that explainability is inferior to interpretability.)
Trying to create an FAI from alchemical components is obviously not the best idea. But it's not totally clear how much of a risk these components pose, because if the components don't work reliably, an AGI built from them may not work well enough to pose a threat. Such an AGI could work better over time if it's able to improve its own components. In this case, we might be able to program it so it periodically re-evaluates its training data as its components get upgraded, so its understanding of human values improves as its components improve.

Discussion Questions
How plausible does each story seem?
What possibilities aren't covered by the taxonomy provided?
What distinctions does this framework fail to capture?
Which claims are incorrect?
Links in posts and comments now display a hover-preview when linking to LessWrong content.
For instance, this link to an old post by Scott Alexander will display the title, first couple paragraphs, and other meta-data.
For comments, it will display the post-name and the comment-in-question, such as on this shortform comment by Buck about effective tutoring.
We're considering ways to effectively do hover previews for particular common external links (such as Wikipedia and arXiv), but for the immediate future, all external links just provide a simple "url" hover-over.
A false dilemma is of the form “It’s either this, or that. Pick one!” It tries to make you choose from a limited set of options, when, in reality, more options are available. With that in mind, what’s wrong with the following examples?
Ex. 1: You either love the guy or hate him
Counterargument 1: “Only a Sith deals in absolutes!”
Counterargument 2: I can feel neutral towards the guy
Ex. 2: You can only add and subtract in mathematics.
Uh, division and multiplication, right?
Ex. 3: Either you get me that new car, or you don’t love me!
I can care about your well-being and happiness and not get you that specific car. I could buy a used car, so you wouldn’t freak out if it got damaged or [3 different examples based on what the (father?) thinks is best].
Ex. 4: You didn’t donate to the food drive, so you don’t care about starving children!
I do care about starving children, but my family and I are just barely scraping by and I value feeding them first.

Generalization
How would you generalize the above examples? Of course it presents a certain set of options as the only available options (I said this in the intro!), but what’s the relationship between those options? Are there different types of options that are similar/different between the examples?
One way to frame it is to separate options into two types: values and actions.
With that frame, false dilemmas can be categorized into 4 varieties:
1. Only these values are compatible with an agent
2. Only these actions are compatible with a system/environment
3. Only these actions are compatible with this value
4. Only these values are compatible with this action
These 4 varieties correspond to each of their respective examples above (this is not a coincidence). Note: you could have answered Ex.3 as (4) and Ex.4 as (3), it really just depends on the truth of the situation. I’ll leave the details as an exercise for the reader.
[Also, I’m sticking my neck out here and claiming that these are the only 4 categories a false dilemma can ever fit. Prove me wrong in the comments and I’ll update the post]
So how about trying this new frame out on the following:
Ex. 5: You can either get up and work out every day or stay on that couch and stay an unhealthy slob!
Ouch. Rephrasing: “Only working out every day is compatible with desiring health”. There’s also couch-to-5k, high-intensity training 1-3 days/week, or playing sports together, all of which could improve health and may fit better with your long-term goals or whatever else you value.
Ex. 6: Pencils are for writing and erasing
I could use it to make a beat, play miniature football, as a flagpole in a diorama, as a gift, or to poke holes in a page, or …
Ex. 7: You upset your friend Alice with an insensitive joke. Bob tells you, “I can’t believe you said such a terrible thing. You don’t care about her at all!”
Rephrasing: “Only never hurting someone’s feelings is compatible with caring about them”. I care about Alice very much, I just really suck at showing it. (I think it’s interesting that this covers failed goals/good intentions. This makes sense since agents, like us, aren’t logically omniscient)
Ex. 8: You either care for animal life or your own temporary comfort!
I can care about both to different degrees.

Algorithm
What’s a possibly ideal general algorithm to solve False Dilemmas?
1. What are the possible values and actions?
2. What’s arbitrarily constrained?
3. Yoda Timers: brainstorm more options
[Note: Some of these examples are limited because we don’t know the context or the purpose, so it’s hard to enumerate very useful actions/accurate values. That’s okay because in reality, you can introspect and ask questions]

Useful Constraints:
Constraints can be useful when used purposefully. False Dilemmas are generally bad because they are arbitrary constraints and no one is even aware that a constraint is being used! In the following prompts, what’s a useful constraint to use and why is it useful?
Ex. 9: “Write a 500 word essay on the United States”
"Narrow it down to the front of one building on the main street of Bozeman. The Opera House. Start with the upper left-hand brick.” [This is stolen from Zen and the Art of Motorcycle Maintenance]. Narrowing down the topic is useful for writer’s block.
Ex. 10: Prove that given two Natural numbers (0,1,2,3,...) n and m, n*m = 0 if and only if one of them is 0.
I could talk about all natural numbers n & m; however, it may be easier to divide this into 4 cases: n = 0 & m = 0, n > 0 & m = 0, n = 0 & m > 0, and n > 0 & m > 0. This is useful because it’s trivial to prove each case, and it’s hard (for me!) to prove this in a way that doesn’t rely on cases.
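The four-case argument can be written out explicitly; each case is immediate:

```latex
\begin{align*}
&\text{Case } n = 0,\ m = 0: & n \cdot m &= 0 \cdot 0 = 0.\\
&\text{Case } n > 0,\ m = 0: & n \cdot m &= n \cdot 0 = 0.\\
&\text{Case } n = 0,\ m > 0: & n \cdot m &= 0 \cdot m = 0.\\
&\text{Case } n > 0,\ m > 0: & n \ge 1,\ m \ge 1 &\implies n \cdot m \ge 1 > 0.
\end{align*}
```

The first three cases give the “if” direction; the last shows that a product of two nonzero naturals is nonzero, which is the contrapositive of the “only if” direction.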
Ex. 11: You listed them out, and it appears you have 10 problems in your life and you can’t even do anything about some of them!
Circle which ones you can do something about, and constrain your thoughts/actions to only those problems. Constraining in this way helps combat feeling overwhelmed/helpless.
Ex. 12: Your friend asks “Where do you want to go eat?”. You say “Oh wherever, I’m not picky”. They say, “Oh me too!”. (Normally it takes 5-10 minutes after this to figure out where to go)
I could say anything like “Place A and B are very close and I like both, which one is good for you?”, and that normally speeds up the process. Same applies for picking a time and place for hanging out. “How about this coffeehouse at this time? Is that good?”
Ex. 13: You always seem to have trouble deciding what food to order when you go to a new restaurant.
There are several constraints that may serve you well. One is “Pick what is popular in the restaurant”; another is “pick the first thing that sounds good” (I got this from someone here on LW).

Introspective Problem Set
Very often we arbitrarily constrain what we can do.
Ex. 14: How do identities (I am a Parent/Good Student/Good Friend/Smart Person/etc.) constrain possible actions? What’s a specific example of an identity you’ve held (or are holding) that has constrained what you thought you could do?
By identifying as X, we must act like someone who is X, whether that’s mimicking how we’ve seen another X act or acting out our own model of being X.
I’ve personally identified as a “Good Student”. What I meant is that as long as I really put in the time to study, I have done my duty. This constrained me away from actions such as asking what specifically confused me and googling other explanations.
Ex. 15: Are there any times when it’s good to constrain who you are/ what you can do?
Tons! The search space of a problem is sometimes very big. Permanently holding one identity/role is usually bad because it always constrains you to the same chunk of search space. But temporarily playing different roles helps search through different chunks more efficiently.
For example, playing devil’s advocate with yourself can help you find the very best reasons why your idea is good, and the very best reasons why it’s bad. When a friend tells me a huge problem they’re dealing with, it’s time to switch to “Very concerned friend who listens”. When I meet someone very shy, it’s time to switch to “Talkative friend who asks easy questions”.
This is also related to the Intelligent Social Web.
Ex. 16: What about emotions/mental states? What’s a time they’ve constrained you to do something bad? What about something good?
Sometimes when something frustrating happens, I feel like I’m only constrained to the action “Eat something sweet and distract yourself”. Realizing that, I have now produced several alternatives: take a walk, take care of any present needs (hunger, sleep, etc), play piano, call a loved one, etc. I would say this is category 2, “Only these actions are compatible in this specific environment”
Being in a meditative state helps me love those who are late (it’s true because it rhymes). I am consistently way more patient in a meditative state, and it’s my go-to move when someone is late.

Conclusion
Category Qualifications was about seeing words as a set of qualifications and how to wield qualifications to effectively communicate. This post is about seeing constraints in planning/agents/environments and how to wield those constraints effectively to achieve your goals.
When arguing well, it’s useful to know exactly which constraints are being used and why they are being used. People fall victim to False Dilemmas when they’re not aware of the implied/assumed constraints.
In the next post, we’ll be investigating how to find cruxes and how this is useful when arguing well.
As a final exercise (which I haven’t worked myself), how does this apply to conflict vs mistake culture?
[This is an iterative post/sequence. I don’t think any of us on LW claim we have the 100% truth/final word on a topic. If you strongly downvote a post, please also leave a comment saying why it’s wrong/not useful so the iterative/improving process can happen]