Paul Graham has a new essay out, The Lesson to Unlearn, on the desire to pass tests. It covers the basic points made in Hotel Concierge's The Stanford Marshmallow Prison Experiment. But something must be missing from the theory, because what Paul Graham did with his life was start Y-Combinator, the apex predator of the real-life Stanford Marshmallow Prison Experiment. Or it's just false advertising.
As a matter of basic epistemic self-defense, the conscientious reader will want to read the main source texts for this essay before seeing what I do to them:
- The Lesson to Unlearn
- The Stanford Marshmallow Prison Experiment
- Sam Altman's Manifest Destiny
- Black Swan Farming
- Sam Altman on Loving Community, Hating Coworking, and the Hunt for Talent
The first four are recommended on their own merits as well. The fifth is not. For the less conscientious reader, I've summarized below according to my own ends, biases, and blind spots. You get what you pay for.

## The Desire to Pass Tests Hypothesis: a Brief Recap
The common thesis of The Lesson to Unlearn and The Stanford Marshmallow Prison Experiment is:
Our society is organized around tests imposed by authorities; we're trained and conditioned to jump through arbitrary hoops and pretend that's what we wanted to do all along, and the upper-middle, administrative class is strongly selected for the desire to do this. This is why we're so unhappy and so incapable of authentic living.
Graham goes further and points out that this procedure doesn't know how to figure out new and good things, only how to perform for the system the things it already knows to ask for. But he also talks about trying to teach startup founders to focus on real problems rather than passing Venture Capitalists' tests.
The rhetoric of Graham's essay puts his young advisees in the role of the unenlightened who are having a puzzling amount of trouble understanding advice like "the way you get lots of users is to make the product really great." Graham casts himself in the role of someone who has unlearned the lesson that you should just try to pass the test of whoever you're interacting with, implying that the startup accelerator (i.e. combination Venture Capital firm and cult) he co-founded, Y-Combinator, is trying to do something outside the domain of Tests.
In fact, Graham's behavior as an investor has perpetuated and continues to perpetuate exactly the problem he describes in the essay. Graham is not behaving exceptionally poorly here - he probably does no worse than his peers - except by persuasively advertising himself as the Real Thing, confusing those looking for the actual real thing.
If you know anyone in the SF Bay Area startup scene, you know that Y-Combinator is the place to go if you're an ambitious startup founder who wants a hand up. Here's a relevant quote from Tad Friend's excellent New Yorker profile of Sam Altman, the current head of Y-Combinator:

> Paul Graham considered the founders of Instacart, DoorDash, Docker, and Stripe, in their hoodies and black jeans, and said, “This is Silicon Valley, right here.” All the founders were graduates of Y Combinator, the startup “accelerator” that Graham co-founded: a three-month boot camp, run twice a year, in how to become a “unicorn”—Valleyspeak for a billion-dollar company. Thirteen thousand fledgling software companies applied to Y Combinator this year, and two hundred and forty were accepted, making it more than twice as hard to get into as Stanford University.
>
> Perhaps the most dispositive theory about YC is that the power of its network obviates other theories. Alumni view themselves as a kind of keiretsu, a network of interlocking companies that help one another succeed. “YC is its own economy,” Harj Taggar, the co-founder of Triplebyte, which matches coders’ applications with YC companies, said. Each spring, founders gather at Camp YC, in a redwood forest north of San Francisco, just to network—tech’s version of the Bohemian Grove, only with more vigorous outdoor urination. When Altman first approached Kyle Vogt, the C.E.O. of Cruise, Vogt had been through YC with an earlier company, so he already knew its lessons. He told me, “I talked to five of my friends who had done YC more than once and said, ‘Was it worth it the second time? Are you likely to receive higher valuations because of the brand, and because you’re plugging into the network?’ Across the board, they said yes.”
>
> There really is no counter-theory. “The knock on YC,” Andy Weissman, a managing partner at Union Square Ventures, told me, “is that on Demo Day their users are just YC companies, which entirely explains why they’re all growing so fast. But how great to have more than a thousand companies willing to use your product!” It’s not just that YC startups can get Airbnb and Stripe to use their apps; it’s that the network’s alumni honeycomb the Valley’s largest companies. Many of the hundred and twenty-one YC startups that have been acquired over the years have been absorbed by Facebook, Apple, and Google.
This matches the impression I've gotten from most of the people I've talked to about their startups: Y-Combinator is singularly important as a certifier of potential, and therefore gatekeeper to the kinds of network connections that can help a fledgling business - especially one building business tools, the ostensible means of production - get off the ground.
When interviewed by Tyler Cowen, Altman expressed a desire to move from being a singularly important gatekeeper to being the exclusive gatekeeper:

> Someday we will fund all the companies in the world, all the good ones at least.
Given Graham's values, one might have expected a sort of proactive talent scouting, to find people who've deeply invested in interesting ventures that don't fit the mold, and offer them some help scaling up. But the actual process is very different.

## Selection Effects
Y-Combinator, like the prestigious colleges Graham criticizes in his essay, has a formal application process. It receives many more applications for admission than it can accept, so it has to apply some strong screening filters. It seems to filter for people who are anxiously obsessed with quickly obtaining the approval of the test evaluators, who have been acculturated into upper-middle-class test-passing norms, and who are so obsessed with numbers going the "right" way that they'll distort the shape of their own bodies to satisfy an arbitrary success metric.

## Anxious Preoccupied
In his interview with Sam Altman, Tyler Cowen asked about the profile of successful Y-Combinator founders:

> COWEN: Why is being quick and decisive such an important personality trait in a founder?
>
> ALTMAN: That is a great question. I have thought a lot about this because the correlation is clear, that one of the most fun things about YC is that, I think, we have more data points on what successful founders and bad founders look like than any other organization has had in the history of the world. We have that all in our heads, and that’s great. So I can say, with a high degree of confidence, that this correlation is true.
>
> Being a fast mover and being decisive — it is very hard to be successful and not have those traits as a founder. Why that is, I’m not perfectly clear on, but I think it is something . . . about the only advantage that startups have or the biggest advantage that startups have over large companies is agility, speed, willing to make nonconsensus, concentrated bets, incredible focus. That’s really how you get to beat a big company.
>
> COWEN: How quickly should someone answer your email to count as quick and decisive?
>
> ALTMAN: You know, years ago I wrote a little program to look at this, like how quickly our best founders — the founders that run billion-plus companies — answer my emails versus our bad founders. I don’t remember the exact data, but it was mind-blowingly different. It was a difference of minutes versus days on average response times.
The kind of "decisiveness" Altman is talking about doesn't involve making research or business decisions that matter on the scale of months or weeks, but responding to emails in a few minutes. In other words, the minds Altman is looking for are not just generically decisive, but quickly responsive - not spending long slow cycles doing a new thing and following their own interest, but anxiously attentive to new inputs, jumping through his hoops fast. Similarly telling is the ten-minute interview in which he mainly looks for how responsive the interviewee is to cues from the interviewer:

> This gets to the question . . . the most common question I get about Y Combinator is how can you make a decision in a 10-minute interview about who to fund? Where we might miss people is in the earlier filters of our application process.
>
> We have far more qualified people that want to do YC each year than we can fund, but by the time we get someone in the room, by the time we can sit across the table and spend 10 minutes with somebody, as far as I know, we have never made a big mistake at that stage of the process. We’ve looked at tens of thousands — well, in person we’ve maybe looked at 10,000 companies.
>
> These personality traits of determination and communication and the ability to articulate a vision for the world and explain how you’re going to get that done — I used to think that that was so hard to assess in 10 minutes, it was maybe impossible to try, and YC interviews used to be like an hour. I now think that most of the time, we could get it right in five minutes.
>
> When you have enough data points, when you meet enough people and get to watch what they go on to do — because the one thing that’s hard in a 10-minute interview and the most important thing about evaluating someone is their rate of improvement. It’s a little bit hard when you only get a single sample. But when you do this enough times, and you get to learn what to look for, it is incredible how good you can get at that.
While responsiveness is doubtless a valid test of some sort of intellectual aliveness and ability, it could easily take hours or days to integrate real, substantive new information; in ten minutes all one may be able to do is perform responsiveness.
In case there was any doubt as to whether this attitude is consistent with supporting technical innovation, later in the interview he says unconditionally that Y-Combinator would fund a startup founded by James Bond (a masculine wish-fulfillment fantasy, whose spy work is mostly interpersonal intrigue), but not by Q (the guy in the Bond movies who develops cool gadgets for Bond to use), unless he has a "good cofounder."

## Conventionally Successful
Then consider the kinds of backgrounds that make a good Y-Combinator founder:

> COWEN: I come from northern Virginia, the Washington, DC, area. We have very few geniuses. The few that we have tend to be crazy and sometimes destructive. We have what I would call —
>
> ALTMAN: They’re very stable, though.
>
> COWEN: —a lot of upper-middle-class intellectual talent. People who are pretty smart and good at something. When it comes to spotting good upper-middle-class intellectual talent, do you think you have the same competitive edge as with spotting geniuses who will make rapidly scalable tech companies?
>
> ALTMAN: I think that’s how I’d characterize myself: upper middle class, pretty smart, not a super genius by any means. It turns out I’ve met many people smarter than me, but I would say I’ve only ever met a handful of people that are obviously more curious than me.
>
> I don’t think raw IQ is my biggest strength — pretty good, to be clear, but the chances of me winning a Nobel Prize in physics are low. I think physics is a bad field at this point, unfortunately. What we spot — you do have to be pretty smart to be a successful founder, but that is not where I look for people to be true outliers.
>
> COWEN: Given that self-description — assuming that I accept it, and I’m not sure I do — do you think you’re as good at spotting upper-middle-class intellectual talent as superstar founders? Let’s say we put you in charge —
>
> ALTMAN: There’s a statement here that’s just bad about the world, but I think if you look at most successful founders, they are pretty smart, upper-middle-class people. They are very rarely the children of super successful people. They are very rarely born in real poverty. They are very rarely the absolute smartest people who otherwise would win a Fields Medal. They are never dumb, but upper-middle-class, pretty smart people that have grit and drive and creativity and vision and edge and a different way of thinking about the world. That is what I think I’m good at spotting, and that is what I think are good founders. There’s a whole bunch of reasons why that’s a sad statement about the world, but there it is.
>
> COWEN: So someone else has to find the geniuses.
>
> ALTMAN: Again, I don’t want to go for false modesty here. I think I’m a smart person. The founders we fund are smart people. I would have maybe said 10 years ago that raw IQ is the thing that matters most for founders. I’ve updated my view on that.
In other words, the people who best succeed at Y-Combinator's screening process are exactly the people you'd expect to score highest at Desire To Pass Tests.

## A High Health Score is Better Than Health
Then consider this little detail:

> COWEN: In the world of tech startups, venture capital, what is weight lifting correlated with?
>
> ALTMAN: Weight lifting?
>
> COWEN: Weight lifting. Taleb tells us, in New York City, weight lifting is correlated with supporting Trump, but I doubt if that’s true in —
>
> ALTMAN: It’s not true here.
>
> ALTMAN: I think it’s correlated with successful founders. It’s fun to have numbers that go up and to the right. The most fun thing for me about weight lifting is . . . I’m basically financially illiterate. I can’t build an Excel model for anything. I can’t read a balance sheet. But my Excel model for weight lifting is beautiful because they’re numbers that go up and to the right, and it’s really fun to play around with that.
(For context, Taleb's attitude towards weightlifting is that it builds a kind of "antifragile" robustness to small perturbations, which is consistent with the sort of risk-bearing behavior that builds a sustainable society. See Strength Training is Learning from Tail Events for Taleb's own account. By contrast, Altman's idea of a weightlifter is someone who just likes to see the numbers go in the correct direction - what Taleb would call an Intellectual Yet Idiot, the exact "academico-bureaucrat" class from which Y-Combinator draws its founders, who pretend that their measurements capture more about what they study than they do, and offload the risks their models can't account for onto others. For more on Taleb's outlook, read Skin in the Game.)
Hotel Concierge's story about weight in The Stanford Marshmallow Prison Experiment makes an interesting comparison:

> MTV CONFESSION CAM: I was an accidental anorexic.
>
> I was 17 years old when I moved into the college dorms and decided that I wanted a six-pack. I had never thought about my body too much before that—I didn’t play sports, I didn’t care about fashion, and I spent most of my time daydreaming about fantasy novels and videogames. In the dorms, however, I finally realized that girls existed. I wasn’t sure about how to get a girlfriend on purpose, but I was pretty sure it had something to do with “abs.” So I decided to work on that.
>
> I began running for 45 minutes three times a week, along with daily stretching, push-ups, and crunches. After hearing a fire-and-brimstone “Talk About Nutrition!” presentation in the dining commons, I decided that I would have to change my diet as well. I stopped eating sweets of any sort. Increased lean protein intake. All breaded objects became whole wheat. But this didn’t seem enough. Food, I decided, was an ephemeral pleasure, whereas a well-sculpted body was a constant joy to live in and behold. Why bother with anything but the healthiest of foods? After some trial and error, I decided upon the optimal meal plan.
>
> Breakfast: Oatmeal, one orange, one kiwi.
> Lunch: Salad (lettuce with kidney & garbanzo beans), PB&J, hardboiled egg.
> Dinner: Cheerios and milk, salad, orange.
>
> I was proud of my discipline, and the nights when I went to sleep hungry only intensified my pride, as I first ignored and then began to appreciate the sharp jabs of an empty stomach. I was no longer getting fit to attract girls—I was getting fit for me. After a month of running, my progress seemed to flatline. I added weights to my regimen. I wasn’t sure how many reps to do, so I decided to just lift until my arms went limp. After another month, I had developed only mild abdominal definition. I decided to step it up: 200 push-ups upon waking every day, stretches and crunches in the evening. I wasn’t perfect. Sometimes I would give in to temptation and eat something off-diet—if someone took me out to lunch, or if I had a cookie at a school event—and I would feel guilty and sad for a while, but before long I would regain my composure and vow to increase my exercise regimen in the next few days to make up for my setback.
>
> After three months, I went home for winter break. My mom said that I looked like an Auschwitz survivor. I said that was a huge overreaction. I asked my dad what he thought. My dad said that I looked a little skinny but that there was nothing wrong with getting into shape. I went back to school and kept up my routine. My strength declined. I wasn’t sure why. My ribs could be individually grasped. I visited home and weighed myself at 107 pounds.
>
> “That’s really low,” my mom said.
>
> “It’s not that bad,” I said.
>
> My mom convinced me to go to the campus nutritionist. The nutritionist, who was middle-aged, and blonde, and wore glasses, and smiled a lot, told me that I had lost 35 pounds in four and a half months, and that remaining at this weight could be dangerous.
>
> “I was just trying to be fit,” I explained to her.
>
> “Your current weight isn’t fit,” she said. “It’s not healthy.”
>
> “I really don’t want to be fat,” I said.
>
> “You’re not going to be fat,” she said. “That’s not your body type.”
>
> “I guess. I didn’t mean to. I was just trying to eat healthy.”
>
> “Right now, you can eat whatever you want,” she said. “You need to gain some weight. Back into the 140s, at least.”
>
> “I’m never sure what to eat. I don’t want to eat too much.”
>
> “I’ll help you come up with a meal plan,” she said. “We can figure out what you need to do.”
>
> I felt tremendous relief. This was a test I could pass.

## Treatment Effects
The admissions process is not the end of the acculturation. In Even artichokes have doubts, Marina Keegan wrote about how people admitted to Yale don't stop the sort of anxious approval-seeking that got them in. Instead, having been conditioned into that behavior pattern, they're easy targets for recruitment by further generic prestige-awarders like McKinsey or investment banks, even though they know it won't get them much they actually want, and report that they expect it will cause their future behavior to drift farther from what they see as the optimum.
At Y-Combinator, the situation is even worse. According to Friend's New Yorker article, the founders are deliberately forced into situations where short-run survival depends on getting approval now:

> A founder’s first goal, Graham wrote, is becoming “ramen profitable”: spending thriftily and making just enough to afford ramen noodles for dinner. “You don’t want to give the founders more than they need to survive,” Jessica Livingston said. “Being lean forces you to focus. If a fund offered us three hundred thousand dollars to give the founders, we wouldn’t take it.” (Many of YC’s seventeen partners, wealthy from their own startups, receive a salary of just twenty-four thousand dollars and get most of their compensation in stock.) This logic, followed to its extreme, would suggest that you shouldn’t even take YC’s money, and many successful startups don’t. Only twenty per cent of the Inc. 500, the five hundred fastest-growing private companies, raised outside funding. But the YC credential, and the promise that it will turn you into a juggernaut, can be hard to resist.
Y-Combinator forces a focus on short-run "growth" feedback even though this isn't a good proxy for long-run success. Friend continues:

> Nearly all YC startups enter the program with the same funding, and thus the same valuation: $1.7 million. After Demo Day, their mean valuation is ten million. There are several theories about why this estimation jumps nearly sixfold in three months. One is that the best founders apply to the best accelerator, and that YC excels at picking formidable founders who would become successful anyway. Paul Buchheit, who ran the past few batches, said, “It’s all about founders. Facebook had Mark Zuckerberg, and MySpace had a bunch of monkeys.”
>
> The corollary is that Y Combinator makes its companies more desirable by teaching them how to tell their story on Demo Day. The venture capitalist Chris Dixon, who admires YC, said, “The founders are so well coached that they know exactly how to reverse-engineer us, down to demonstrating domain expertise and telling anecdotes about their backgrounds that show perseverance and courage.” In the winter batch, the pitches followed an invariable narrative of imminent magnitude: link yourself to a name-brand unicorn (“We’re Uber for babysitting . . . Stripe for Africa . . . Slack for health care”), or, if there’s no apt analogue, say, “X is broken. In the future, Y will fix X. We’re already doing Y.” Then express your premise as a chewy buzzword sandwich: We “leverage technology to achieve personalization in a fully automated way” (translation: individuated shampoo). Paul Graham cheerfully acknowledged that, by instilling message discipline, “we help the bad founders look indistinguishable from the good ones.”
>
> The counter-theory is that YC actually does make its companies better, by teaching them to focus on growth above all, thereby eliminating distractions such as talking to the tech press or speaking at conferences or making cosmetic coding tweaks. YC’s gold standard for revenue growth is ten per cent a week, which compounds to 142x a year. Failing that, well, tell a story of some other kind of growth. On Demo Day, one company announced that it had enjoyed “fifty-per-cent word-of-mouth growth,” whatever that might be. Sebastian Wallin told me that his security company, Castle, raised $1.8 million because “we managed to find a good metric to show growth. We tried tracking installations of our product, but it didn’t look good. So we used accounts protected, a number that showed roughly thirty-per-cent growth through the course of YC—and about forty per cent of the accounts were YC companies. It was a perfect fairy-tale story.”
>
> The truth is that rapid growth over a long period is rare, that the repeated innovation required to sustain it is nearly impossible, and that certain kinds of uncontrollable growth turn out to be cancers. Last year, after a series of crises at Reddit, Altman, who is on its board, convinced Steve Huffman, the co-founder of the company, to return as C.E.O. Huffman said, “I immediately told Sam, ‘Don’t get on my ass about growth. I’m not in control of it.’ Every great startup—Facebook, Airbnb—has no idea why it’s growing at first, and has to figure that out before the growth stalls. Growth masks all problems.”
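The "ten per cent a week" compounding figure in the quote checks out arithmetically; here's a quick sketch (my own illustration, not from the article):

```python
# Compound 10% weekly growth over the 52 weeks of a year.
weekly_rate = 0.10
annual_multiple = (1 + weekly_rate) ** 52  # (1.1)^52
print(f"{annual_multiple:.0f}x")  # prints "142x", matching the article's figure
```

Weekly compounding is brutal: the same 10% applied monthly would yield only about 3x a year, which is why a weekly growth target selects so strongly for short-horizon metrics.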
This despite knowing that the intervention could well be doing more harm than good. From the interview with Cowen, a comment on growth:

> COWEN: You once said, “Growth masks all problems.” Are there exceptions to that?
>
> ALTMAN: Cancer?
>
> [laughter]
>
> ALTMAN: I mean, clearly, yes. I don’t mean that so flippantly. There is —
>
> COWEN: There’s an article in the Jerusalem Post today: someone credible claiming that cancer has been cured. I don’t know if you saw that.
>
> ALTMAN: I didn’t see that, but I do — having talked to many biologists working in the field, I will say there is a surprising amount of optimism that we are within a decade or two of that being true.
>
> It’s not an area where I feel anywhere near expert enough to comment on the validity of that statement, and I think it’s always dangerous to just trust what other smart people say, especially when they have an incentive to hawk their own book, but it does seem like a lot of people believe that.
>
> Growth is bad in plenty of times, but it does mask a lot of problems. A statement that I wouldn’t make is that growth is always an inherent good, although I do think — I think you’ve said something like this, too — that sustainable economic growth is almost always a moral good.
>
> Something that I think a lot of the current problems in the country can be traced to is the decline in that. And part of what motivates me to work on Y Combinator and OpenAI is getting back to that, getting back to sustainable economic growth, getting back to a world where most people’s lives get better every year and that we feel the shared spirit of success is really important.
>
> And growth feels good. It does mask a lot of problems, but there definitely are individual instances where you’d be better off with slower growth for whatever reason.
Obviously, in any enterprise trying to do a specific thing, some kinds of measurable activity will have to increase at some point. But this obsession with measured revenue growth (or analogues to it) early on leads to performative absurdities like "word-of-mouth growth" of fifty percent, which are not words that would come out of the mouth of someone trying to, as Paul Graham advises in his essay, "make the product really great."
Altman knows he's not doing the right thing, or the thing that would make him the most money. But he's doing the fastest-growing thing.

## Moloch Blindness
How is it that someone like Graham could hold so strongly and express so eloquently the opinion that the most important lesson to unlearn is the desire to pass tests, and then create an institution like Y-Combinator?
This isn't the only such contradiction in Graham's words and actions. Friend's New Yorker article describes Graham's attitude towards meanness, and contrasts this with Altman's actual character, revealing a similar pattern:

> YC prides itself on rejecting jerks and bullies. “We’re good at screening out assholes,” Graham told me. “In fact, we’re better at screening out assholes than losers. All of them start off as losers—and some evolve.” The accelerator also suggests that great wealth is a happy by-product of solving an urgent problem. This braiding of altruism and ambition is a signal feature of the Valley’s self-image. Graham wrote an essay, “Mean People Fail,” in which—ignoring such possible counterexamples as Jeff Bezos and Larry Ellison—he declared that “being mean makes you stupid” and discourages good people from working for you. Thus, in startups, “people with a desire to improve the world have a natural advantage.” Win-win.
>
> [Altman] attended Stanford University, where he spent two years studying computer science, until he and two classmates dropped out to work full time on Loopt, a mobile app that told your friends where you were. Loopt got into Y Combinator’s first batch because Altman in particular passed what would become known at YC as the young founders’ test: Can this stripling manage adults? He was a formidable operator: quick to smile, but also quick to anger. If you cross him, he’ll joke about slipping ice-nine into your food. (Ice-nine, in Kurt Vonnegut’s “Cat’s Cradle,” annihilates everything it touches that contains water.) Paul Graham, noting Altman’s early aura of consequence, told me, “Sam is extremely good at becoming powerful.”
So, Graham claimed that mean people fail, and then selected a very mean person to run Y-Combinator. Likewise, he wrote that test-passing isn't good for growing a business, and then promoted an obligate test-passer, who remade his institution to optimize for test-passing.
I think the root process generating this sort of bait-and-switch must be something like the following:
There's a way to succeed (i.e. become a larger share of what exists) through production, and a way to succeed through purely adversarial (i.e. zero-sum) competition. These are incompatible strategies, so that productive people will do poorly in zero-sum systems, and vice versa. The productive strategy really is good, and in production-oriented contexts, a zero-sum attitude really is a disadvantage.
Graham has a natural affinity for production-based strategies which allowed him to acquire various kinds of capital. He blinds himself to the existence of adversarial strategies, so he's able to authentically claim to think that e.g. mean people fail - he just forgets about Jeff Bezos, Larry Ellison, Steve Jobs, and Travis Kalanick because they are too anomalous in his model, and don't feel to him like central cases of success.
This is a case of fooling oneself to avoid confronting malevolent power. It's the path towards the true death, so if you want to stay aligned with truth and life, you'll have to look for alternatives: to keep track of anomalies, at least in your internal bookkeeping; to conceal and lie, if you have to, to protect yourself.
If, to participate in higher growth rates, you have to turn into something else, then in what sense is it you that's getting to grow faster? Moloch, as Scott Alexander points out, offers "you" power in exchange for giving up what you actually care about - but this means, offering you no substantive concessions. For what is a person profited, if they shall gain the whole world, and lose their own soul?
Thus blinded, Graham writes about the virtues of production-based strategies as though they were the only way to succeed. He then sets up an institution optimizing for "success" directly, rather than specifically for production-based strategies. But in the environment in which he's operating, adversarial strategies can scale faster. Of course, just because adversarial strategies scale faster doesn't mean they make you richer faster - and as we'll see below, selling out is not, according to Graham's perspective, the way to maximize returns. But faster growth feels more successful. So he ends up selling out his credibility to the growth machine. Or, as Hotel Concierge called it, the Stanford Marshmallow Prison Experiment. (It's perhaps not a coincidence that the Stanford brand is most prestigious in the startup / "tech" scene.)
Here's the thing, though. Graham knows he's doing the wrong thing. He confessed in Black Swan Farming that even though doing the right thing would work out better for him in the long run, he just isn't getting enough positive feedback, so it's psychologically intolerable:

> The one thing we can track precisely is how well the startups in each batch do at fundraising after Demo Day. But we know that's the wrong metric. There's no correlation between the percentage of startups that raise money and the metric that does matter financially, whether that batch of startups contains a big winner or not.
>
> Except an inverse one. That's the scary thing: fundraising is not merely a useless metric, but positively misleading. We're in a business where we need to pick unpromising-looking outliers, and the huge scale of the successes means we can afford to spread our net very widely. The big winners could generate 10,000x returns. That means for each big winner we could pick a thousand companies that returned nothing and still end up 10x ahead.
>
> We can afford to take at least 10x as much risk as Demo Day investors. And since risk is usually proportionate to reward, if you can afford to take more risk you should. What would it mean to take 10x more risk than Demo Day investors? We'd have to be willing to fund 10x more startups than they would. Which means that even if we're generous to ourselves and assume that YC can on average triple a startup's expected value, we'd be taking the right amount of risk if only 30% of the startups were able to raise significant funding after Demo Day.
>
> I don't know what fraction of them currently raise more after Demo Day. I deliberately avoid calculating that number, because if you start measuring something you start optimizing it, and I know it's the wrong thing to optimize. But the percentage is certainly way over 30%. And frankly the thought of a 30% success rate at fundraising makes my stomach clench. A Demo Day where only 30% of the startups were fundable would be a shambles...
>
> For better or worse that's never going to be more than a thought experiment. We could never stand it. How about that for counterintuitive? I can lay out what I know to be the right thing to do, and still not do it.
So instead, he does the wrong thing, knowingly, on purpose, but tries to pretend otherwise to himself. Sad!
A dominant framework in rationality is internal alignment: sort out conflicts between parts of yourself, stop working at cross-purposes to yourself, stop doing internal violence, aim to take coherent action based on coherent beliefs towards coherent goals, and so on. I think the alternate/complementary orientation of aiming for internal empowerment is often neglected or underemphasized. By internal empowerment I mean prioritizing giving each "part" (subsystem, motive, drive, goal, desire, subagent, whatever) the resources it needs to increase its capability to understand the world, know what it wants, and do stuff. This isn't a new idea, but my impression is that rationalists underemphasize it in their actual practice.
Examples of prioritizing empowerment over alignment: you have a conflict, say, should I eat The Sugar. An alignment stance might prioritize: doing IDC or other negotiations; trying to extract from the pro / anti Sugar parts what they think, and what they want, and what they'd be okay with; being satisfied with the situation if one part "goes silent". An empowerment stance would prioritize giving each part access to the cognitive resources needed to grow into its full power: pay attention more at other times in life to [the qualia that tipped you off that you were conflicted about eating The Sugar] and to [the felt sense of not wanting to eat the sugar], so that the part that generates those qualia is more able to become known to you; for some time, let that part seek information, e.g. by doing nutrition research; for some time, let that part act, by temporarily blending as much as you can with that part and actually acting according to its will.
There's a sense of good action: action and thought that grows the capability of a part, sliding around the blocky blocks that comprise the current rut you're in, through the cracks, while not actually doing anything bad. There's a sense of accessing underlying drives, making contexts where it's safe to do that (e.g. alone and truly intending not to act before further deliberation); practicing being each part; unsuppressing, notifying already-negotiated treaties that the terms might expire soon (and that they were signed under duress in back-room dealings); feeding / nurturing / encouraging / loving / raising parts. Practice being your subagents, don't just talk with them. Ask, what does this felt sense want me to think about? And then think about that, and about the next thing.
Of course, alignment and empowerment are complementary and amphicausal: if you're better at conducting negotiations and preventing internecine suppression, then there's more space for part-wise empowerment, and on the other hand if your parts are stronger then they'll in general be more able to competently negotiate / work together / fuse / align.
On the other hand, I often find that if I take a statement someone made about internal alignment and then think of it as referring to members of a population of humans, it seems creepy and totalitarian.

Sometimes internal alignment, at the margin, is internal suppression. Stability can be maintained by propaganda (I keep telling myself...), like the sixth iteration of the Matrix or a small world specialized to encompass many other small worlds. (Come home to the unique flavor of shattering the grand illusion. Come home to Simple Rick's.) If thread X in your life is working pretty well, and thread Y is something you'd want but don't have much traction on, then, locally speaking, starting up a pursuit of Y might create more internal conflict: the hope for Y might push against the status quo, and the cognition pointed toward X might try to push Y away from recruiting cognition. Here, internal alignment and internal empowerment come apart.
This consideration has me suspicious of certain kinds of meditation. Training yourself to let your desires wither on the vine is not a way to get what you want. Better to want two things, suffer the conflict and loss, and achieve one thing, than to want no things and achieve no things.
Parts may be greedy or jealous about cognitive resources, but their implicit claims usually aren't true; you have plenty of mental resources to go around. Leaving one of your children out in the wilderness to die is one way of returning peace to a troubled family, yes, but that is a high price. A limp ache in your soul and success in your feedback loops, or a real shot that might fail at your true goals; take your pick. You don't deal with conflicts between you and your child by teaching them to always submit to you; you find a temporary working compromise and spend most of your energy raising them to the point where they can fully apprehend the issues in the negotiation. Well-developed parts are not equal negotiating partners with under-developed parts. First practice being each thing, so it gets a chance to be an agent, develop hooks into your whole world model, see what's possible, and see what it really wants; then do the negotiation if necessary.
One of the things impeding the many worlds vs wavefunction-collapse dialogue is that nobody seems to be able to point to a situation in which the difference clearly matters, where we would make a different decision depending on which theory we believe. If there aren't any, pragmatism would instruct us to write the question off as meaningless.
Has anyone tried to pose a compelling thought experiment in which the difference matters?
In the post introducing mesa optimization, the authors defined an optimizer as

> a system [that is] internally searching through a search space (consisting of possible outputs, policies, plans, strategies, or similar) looking for those elements that score high according to some objective function that is explicitly represented within the system
The paper continues by defining a mesa optimizer as an optimizer that was selected by a base optimizer.
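For concreteness, here's a toy sketch of that two-level structure (my own construction, not from the post; all names are made up): a base optimizer explicitly searches over candidate policies, and one candidate policy is itself an optimizer, explicitly searching over actions at runtime against an objective represented inside it.

```python
import random

def reactive_policy(state):
    """A fixed heuristic: maps state to action with no internal search."""
    return -state

def make_searching_policy(mesa_objective):
    """A policy that is itself an optimizer: at runtime it explicitly
    searches its action space for the action scoring highest on an
    objective represented within it (a toy mesa-objective)."""
    def policy(state):
        actions = range(-5, 6)
        return max(actions, key=lambda a: mesa_objective(state + a))
    return policy

def base_objective(state):
    return -abs(state)  # the base optimizer rewards ending near zero

def base_optimizer(candidate_policies, n_trials=100, seed=0):
    """The base optimizer: explicitly searches the space of candidate
    policies for the one scoring highest on the base objective."""
    rng = random.Random(seed)
    states = [rng.randint(-10, 10) for _ in range(n_trials)]
    def score(policy):
        return sum(base_objective(s + policy(s)) for s in states)
    return max(candidate_policies, key=score)
```

By the quoted definition, both levels count as optimizers, since each runs an explicit search against an explicitly represented objective; a searching policy selected by `base_optimizer` would be a mesa optimizer, while `reactive_policy` would not be, even if it wins the selection.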
However, there are a number of issues with this definition, as some have already pointed out.
First, I think by this definition humans are clearly not mesa optimizers. Most optimization we do is implicit. Yet humans are supposed to be the prototypical example of a mesa optimizer, which appears to be a contradiction.
Second, the definition excludes perfectly legitimate examples of inner alignment failures. To see why, consider a simple feedforward neural network trained by deep reinforcement learning to navigate my Chests and Keys environment. Since "go to the nearest key" is a good proxy for getting the reward, the neural network simply returns the action that, given the board state, moves the agent closer to the nearest key.
Is the feedforward neural network optimizing anything here? Hardly; it's just applying a heuristic. Note that you don't need to do anything like an internal A* search to find keys in a maze, because in many environments, following a wall until the key is within sight, and then performing a very shallow search (which doesn't have to be explicit), could work fairly well.
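To illustrate how little machinery such a policy needs, here's a minimal hand-written sketch of the "go to the nearest key" part of that heuristic (the grid representation and names are my own assumptions, not the actual trained network, and it omits the wall-following step): it maps board state straight to an action, with no search at all.

```python
from typing import List, Tuple

Action = str  # one of "up", "down", "left", "right"

def nearest_key_heuristic(agent: Tuple[int, int],
                          keys: List[Tuple[int, int]]) -> Action:
    """Return the action that reduces Manhattan distance to the nearest key."""
    if not keys:
        return "up"  # arbitrary choice when no keys remain
    # Pick the nearest key by Manhattan distance.
    target = min(keys, key=lambda k: abs(k[0] - agent[0]) + abs(k[1] - agent[1]))
    dx, dy = target[0] - agent[0], target[1] - agent[1]
    # Greedily close the larger gap first; no lookahead, no planning.
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```

Nothing here searches over plans or represents an objective internally; it's a fixed state-to-action rule, which is the point: such a policy can still fail malignly off-distribution without being an optimizer under the definition above.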
As far as I can tell, Hjalmar Wijk introduced the term "malign generalization" to describe the failure mode that I think is most worth worrying about here. In particular, malign generalization happens when you train a system on objective function X, but at deployment the system ends up doing Y, where Y is so bad we'd prefer the system to fail completely. To me at least, this seems like a far more intuitive and less theory-laden way of framing inner alignment failures.
This way of reframing the issue lets us keep the old terminology, that we are concerned with capability robustness without alignment robustness, while dropping all unnecessary references to mesa optimization.
Mesa optimizers could still form a natural class of things that are prone to malign generalization. But if even humans are not mesa optimizers, why should we expect mesa optimizers to be the primary real world examples of such inner alignment failures?
My daughter is about three and a half years old, and we have been enjoying some interactive storytelling where she gets to shape the story somehow.
When I was ~18, I played tabletop RPGs with my friends for about a year and really enjoyed it, and I think I would really enjoy something like that with my daughter. So far I have been making things up as we go and sometimes I just get stuck, so I am trying to create some plot and ideas, print out/draw some graphics, and make maps of the area where we live...
...and I think it's a great playground and opportunity to try to slowly incorporate some of the rationality-related ideas (or simply "those things I learned about the world which make me happier or stronger"). These things include:
- some people have better ideas than others
- information is valuable and costly to get
- updating is valuable
- fundamental attribution error
- you can deal with anxiety (CBT/Unlocking the emotional brain stuff)
- some useful models or concepts like multi-agent minds, Chesterton fence...
- Julia Galef's + EY's "things to teach kids" here
- and so on
Does anyone know of some nice plots, materials, stories, tools, game settings, or anything else I could use for inspiration? It doesn't matter if it's for older kids; I can slowly get there or adjust it for the current age.
Congress is a pretty awesome annual event from the Chaos Computer Club. If you aren't familiar, search the web or find more information here.
We had a meetup last year. I had the idea of getting a couple of LW people together and seeing what happens. I was more than pleased with the result. A whole lot of people showed up, everything from people well-known in the community to intrigued newcomers who had only heard about it once. So we told the newcomers our impressions of what rationality is and incorporates. We then split into several smaller groups: one with a thorough introduction and several on specific topics (e.g. AI safety, the Dragon Army experiment, and others).
Apart from that, a Signal group and a pad with a long list of links and book recommendations emerged, from which I hope everyone was able to profit.
I'd like to improve it this year. The group for communication proved really useful during congress. The pad was essential for easily sharing recommendations - which was used frequently. So I set them up already:
- Telegram group: https://t.me/joinchat/AQyfUhHcTKJjmDm9h-s2EQ
- EtherPad: https://pad.riseup.net/p/lwccc
- Self-Organized-Session: https://events.ccc.de/congress/2019/wiki/index.php/Session:LessWrong_Congress_Meetup
Also, for this year's meetup, let's do a bit more. Last year we kind of 'spontaneously' split into groups; this year I would like to do this earlier and in a more organized way. So there is a topic collection in the pad. For each topic, the idea is to form a 'discussion group' with that topic as the 'reference point' of the discussion, though you are not restricted to talking only about that.
Also, if you have other ideas, feel free to reach out to me!
I'd love to see all of you this year again.
I wrote in a previous post how my life is better when I avoid certain kinds of media. It increases my happiness, decreases stress and makes me smarter.
It's time to follow my own advice. I hereby commit myself to abstaining from junk media for one year.
There's a long list of rules, definitions, and exceptions, but the basic idea is that I'll avoid videogames, news, web surfing, Hot Network Questions, and similar mindless media feeds. YouTube is a special case: music and dance videos are okay, so I created a YouTube account and trained the recommendation algorithm to recommend only those kinds of videos.
I've been abstaining from junk media for longer and longer periods of time. My record is around two months. Now's finally the time to pull the trigger and go a whole year.
Most New Year's resolutions fail, so instead I'm starting this year of media blackout on 47 Ronin Remembrance Day.
As noted in the document, several sections of this agenda drew on writings by Lukas Gloor, Daniel Kokotajlo, Anni Leskelä, Caspar Oesterheld, and Johannes Treutlein. Thank you very much to David Althaus, Tobias Baumann, Alexis Carlier, Alex Cloud, Max Daniel, Michael Dennis, Lukas Gloor, Adrian Hutter, Daniel Kokotajlo, János Kramár, David Krueger, Anni Leskelä, Matthijs Maas, Linh Chi Nguyen, Richard Ngo, Caspar Oesterheld, Mahendra Prasad, Rohin Shah, Carl Shulman, Stefan Torges, Johannes Treutlein, and Jonas Vollmer for comments on drafts of this document. Thank you also to the participants of the Effective Altruism Foundation research retreat and workshops, whose contributions also helped to shape this agenda.
David Lewis. Prisoners’ dilemma is a newcomb problem. Philosophy & Public Affairs, pages 235-240, 1979.
Xiaomin Lin, Stephen C Adams, and Peter A Beling. Multi-agent inverse reinforcement learning for certain general-sum stochastic games. Journal of Artificial Intelligence Research, 66:473-502, 2019.
Zachary C Lipton. The mythos of model interpretability. arXiv preprint arXiv:1606.03490, 2016.
William MacAskill. A critique of functional decision theory. https://www.lesswrong.com/posts/ySLYSsNeFL5CoAQzN/a-critique-of-functional-decision-theory, 2019. Accessed: September 15 2019.
William MacAskill, Aron Vallinder, Caspar Oesterheld, Carl Shulman, and Johannes Treutlein. The evidentialist’s wager. Manuscript, 2019.
Fabio Maccheroni, Massimo Marinacci, and Aldo Rustichini. Ambiguity aversion, robustness, and the variational representation of preferences. Econometrica, 74(6): 1447-1498, 2006.
Michael W Macy and Andreas Flache. Learning dynamics in social dilemmas. Proceedings of the National Academy of Sciences, 99(suppl 3):7229-7236, 2002. Christopher JG Meacham. Binding and its consequences. Philosophical studies, 149 (1):49-71, 2010.
Kathleen L Mosier, Linda J Skitka, Susan Heers, and Mark Burdick. Automation bias: Decision making and performance in high-tech cockpits. The International journal of aviation psychology, 8(1):47-63, 1998.
Abhinay Muthoo. A bargaining model based on the commitment tactic. Journal of Economic Theory, 69:134-152, 1996.
Rosemarie Nagel. Unraveling in guessing games: An experimental study. The American Economic Review, 85(5):1313-1326, 1995.
John Nash. Two-person cooperative games. Econometrica, 21:128-140, 1953.
John F Nash. The bargaining problem. Econometrica: Journal of the Econometric Society, pages 155-162, 1950.
Andrew Y Ng, Stuart J Russell, et al. Algorithms for inverse reinforcement learning. In Icml, volume 1, page 2, 2000.
Douglass C North. Institutions. Journal of economic perspectives, 5(1):97-112, 1991.
Robert Nozick. Newcomb’s problem and two principles of choice. In Essays in honor of Carl G. Hempel, pages 114-146. Springer, 1969.
Caspar Oesterheld. Deep reinforcement learning from human preferences. https://casparoesterheld.files.wordpress.com/2018/01/rldt.pdf, 2017a.
Caspar Oesterheld. Multiverse-wide cooperation via correlated decision making. 2017b.
Caspar Oesterheld. Robust program equilibrium. Theory and Decision, pages 1-17, 2019.
Caspar Oesterheld and Vincent Conitzer. Extracting money from causal decision theorists. 2019. Accessed: March 13 2019.
Stephen M Omohundro. The nature of self-improving artificial intelligence. Singularity Summit, 2008, 2007.
Stephen M Omohundro. The basic ai drives. In AGI, volume 171, pages 483-492, 2008.
OpenAI. Openai charter. https://openai.com/charter/, 2018. Accessed: July 7 2019.
Petro A Ortega and Vishal Maini. Building safe artificial intelligence: specification, robustness, and assurance. https://medium.com/@deepmindsafetyresearch/building-safe-artificial-intelligence-52f5f75058f1, 2018. Accessed: July 7 2019.
Raja Parasuraman and Dietrich H Manzey. Complacency and bias in human use of automation: An attentional integration. Human factors, 52(3):381-410, 2010. Judea Pearl. Causality. Cambridge university press, 2009.
Julien Perolat, Joel Z Leibo, Vinicius Zambaldi, Charles Beattie, Karl Tuyls, and Thore Graepel. A multi-agent reinforcement learning model of common-pool resource appropriation. In Advances in Neural Information Processing Systems, pages 3643-3652, 2017.
Alexander Peysakhovich and Adam Lerer. Consequentialist conditional cooperation in social dilemmas with imperfect information. arXiv preprint arXiv:1710.06975, 2017.
Robert Powell. Bargaining theory and international conflict. Annual Review of Political Science, 5(1):1-30, 2002.
Robert Powell. War as a commitment problem. International organization, 60(1): 169-203, 2006.
Kai Quek. Rationalist experiments on war. Political Science Research and Methods, 5 (1):123-142, 2017.
Matthew Rabin. Incorporating fairness into game theory and economics. The American economic review, pages 1281-1302, 1993.
Neil C Rabinowitz, Frank Perbet, H Francis Song, Chiyuan Zhang, SM Eslami, and Matthew Botvinick. Machine theory of mind. arXiv preprint arXiv:1802.07740, 2018.
Werner Raub. A general game-theoretic model of preference adaptations in problematic social situations. Rationality and Society, 2(1):67-93, 1990.
Robert W Rauchhaus. Asymmetric information, mediation, and conflict management. World Politics, 58(2):207-241, 2006.
Jonathan Renshon, Julia J Lee, and Dustin Tingley. Emotions and the microfoundations of commitment problems. International Organization, 71(S1):S189-S218, 2017.
Stephane Ross, Geoffrey Gordon, and Drew Bagnell. A reduction of imitation learning and structured prediction to no-regret online learning. In Proceedings of the fourteenth international conference on artificial intelligence and statistics, pages 627-635, 2011.
Ariel Rubinstein. Perfect equilibrium in a bargaining model. Econometrica: Journal of the Econometric Society, pages 97-109, 1982.
Stuart Russell, Daniel Dewey, and Max Tegmark. Research priorities for robust and beneficial artificial intelligence. Ai Magazine, 36(4):105-114, 2015.
Stuart J Russell and Devika Subramanian. Provably bounded-optimal agents. Journal of Artificial Intelligence Research, 2:575-609, 1994.
Santiago Sanchez-Pages. Bargaining and conflict with incomplete information. The Oxford Handbook of the Economics of Peace and Conflict. Oxford University Press, New York, 2012.
Wiliam Saunders. Hch is not just mechanical turk. https://www.alignmentforum.org/posts/4JuKoFguzuMrNn6Qr/hch-is-not-just-mechanical-turk?_ga=2.41060900. 708557547.1562118039-599692079.1556077623, 2019. Accessed: July 2 2019.
Stefan Schaal. Is imitation learning the route to humanoid robots? Trends in cognitive sciences, 3(6):233-242, 1999.
Jonathan Schaffer. The metaphysics of causation. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, fall 2016 edition, 2016.
James A Schellenberg. A comparative test of three models for solving “the bargaining problem”. Behavioral Science, 33(2):81-96, 1988.
Thomas Schelling. The Strategy of Conflict. Harvard University Press, 1960.
David Schmidt, Robert Shupp, James Walker, TK Ahn, and Elinor Ostrom. Dilemma games: game parameters and matching protocols. Journal of Economic Behavior & Organization, 46(4):357-377, 2001.
Wolfgang Schwarz. On functional decision theory. umsu.de/wo/2018/688, 2018. Accessed: September 15 2019.
Anja Shortland and Russ Roberts. Shortland on kidnap. http://www.econtalk.org/anja-shortland-on-kidnap/, 2019. Accessed: July 13 2019.
Carl Shulman. Omohundro’s “basic ai drives” and catastrophic risks. Manuscript, 2010.
Linda J Skitka, Kathleen L Mosier, and Mark Burdick. Does automation bias decision-making? International Journal of Human-Computer Studies, 51(5):991–1006, 1999.
Alastair Smith and Allan C Stam. Bargaining and the nature of war. Journal of Conflict Resolution, 48(6):783-813, 2004.
Glenn H Snyder. “prisoner’s dilema” and “chicken” models in international politics. International Studies Quarterly, 15(1):66-103, 1971.
Nate Soares and Benja Fallenstein. Toward idealized decision theory. arXiv preprint arXiv:1507.01986, 2015.
Nate Soares and Benya Fallenstein. Agent foundations for aligning machine intelligence with human interests: a technical research agenda. In The Technological Singularity, pages 103-125. Springer, 2017.
Joel Sobel. A theory of credibility. The Review of Economic Studies, 52(4):557-573, 1985.
Ray J Solomonoff. A formal theory of inductive inference. part i. Information and control, 7(1):1-22, 1964.
Kaj Sotala. Disjunctive scenarios of catastrophic ai risk. In Artificial Intelligence Safety and Security, pages 315-337. Chapman and Hall/CRC, 2018.
Tom Florian Sterkenburg. The foudations of solomonoff prediction. Master’s thesis, 2013.
Joerg Stoye. Statistical decisions under ambiguity. Theory and decision, 70(2):129-148, 2011.
Joseph Suarez, Yilun Du, Phillip Isola, and Igor Mordatch. Neural mmo: A massively multiagent game environment for training and evaluating intelligent agents. arXiv preprint arXiv:1903.00784, 2019.
Chiara Superti. Addiopizzo: Can a label defeat the mafia? Journal of International Policy Solutions, 11(4):3-11, 2009.
Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction. MIT press, 2018.
William Talbott. Bayesian epistemology. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, winter 2016 edition, 2016.
Jessica Taylor. My current take on the paul-miri disagreement on alignability of messy ai. https://agentfoundations.org/item?id=1129, 2016. Accessed: October 6 2019.
Max Tegmark. Parallel universes. Scientific American, 288(5):40-51, 2003.
Moshe Tennenholtz. Program equilibrium. Games and Economic Behavior, 49(2): 363-373, 2004.
Johannes Treutlein. Modeling multiverse-wide superrationality. Unpublished working draft., 2019.
Jonathan Uesato, Ananya Kumar, Csaba Szepesvari, Tom Erez, Avraham Ruderman, Keith Anderson, Nicolas Heess, Pushmeet Kohli, et al. Rigorous agent evaluation: An adversarial approach to uncover catastrophic failures. arXiv preprint arXiv:1812.01647, 2018.
Eric Van Damme. The nash bargaining solution is optimal. Journal of Economic Theory, 38(1):78-100, 1986.
Hal R Varian. Computer mediated transactions. American Economic Review, 100(2): 1-10, 2010.
Heinrich Von Stackelberg. Market structure and equilibrium. Springer Science & Business Media, 2010.
Kenneth N Waltz. The stability of a bipolar world. Daedalus, pages 881-909, 1964.
Weixun Wang, Jianye Hao, Yixi Wang, and Matthew Taylor. Towards cooperation in sequential prisoner’s dilemmas: a deep multiagent reinforcement learning approach. arXiv preprint arXiv:1803.00162, 2018.
E Roy Weintraub. Game theory and cold war rationality: A review essay. Journal of Economic Literature, 55(1):148-61, 2017.
Sylvia Wenmackers and Jan-Willem Romeijn. New theory about old evidence. Synthese, 193(4):1225-1250, 2016.
Lantao Yu, Jiaming Song, and Stefano Ermon. Multi-agent adversarial inverse reinforcement learning. arXiv preprint arXiv:1907.13220, 2019.
Eliezer Yudkowsky. Ingredients of timeless decision theory. https://www.lesswrong.com/posts/szfxvS8nsxTgJLBHs/ingredients-of-timeless-decision-theory, 2009. Accessed: March 14 2019.
Eliezer Yudkowsky. Intelligence explosion microeconomics. Machine Intelligence Research Institute, accessed online October, 23:2015, 2013.
Eliezer Yudkowsky. Modeling distant superintelligences. https://arbital.com/p/distant_SIs/, n.d. Accessed: Feb. 6 2019.
Eliezer Yudkowsky and Nate Soares. Functional decision theory: A new theory of instrumental rationality. arXiv preprint arXiv:1710.05060, 2017.
Claire Zabel and Luke Muehlhauser. Information security careers for gcr reduction. https://forum.effectivealtruism.org/posts/ZJiCfwTy5dC4CoxqA/information-security-careers-for-gcr-reduction, 2019. Accessed: July 17 2019.
Chongjie Zhang and Victor Lesser. Multi-agent learning with policy prediction. In Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010.
A few months ago I posted about making eggy crepes as a way to get the kids to eat more protein. Lately they've been interested in waffles, and I wanted to figure out how to make something similar. It turns out if you search for [eggy waffles] you tend to find people making waffles that have only slightly more eggs than usual. We can do better than that:
- 3 Eggs
- 1/4C full fat Greek yoghurt
- 2T butter
- 2T flour
It's possible that they could use a bit less flour, but if you leave the flour out entirely they come out rubbery and won't crisp.
How much more protein is this? Here's the first recipe I found searching for [waffle recipe], which looks pretty normal:
- 2 eggs
- 2C flour
- 1 3/4C milk
- 1/2C oil
- 1T sugar
- 4t baking powder
- 1/4t salt
- 1/2t vanilla
These aren't the same size, but after normalizing both to 1,000 calories (sheet) the regular waffles have 22g of protein vs 41g (1.9x) in the eggy version. If we look specifically at lysine, the amino acid someone who mostly wants to eat carbs is most at risk of being deficient in, it's 1.1g vs 3.0g (2.7x).
I was also curious how Eggo frozen waffles compare, since they have "egg" in the name. I count 23g protein per 1,000 calories, which is like regular waffles.
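If you want to run the same comparison on your own recipe, the normalization is just protein divided by calories, scaled to 1,000. A minimal sketch (the function name is mine; the per-1,000-calorie figures are the ones from this post):

```python
def protein_per_1000_cal(protein_g, calories):
    """Grams of protein per 1,000 calories of the recipe."""
    return protein_g / calories * 1000

# Per-1,000-calorie figures from the post:
regular_protein = 22  # g, regular waffles
eggy_protein = 41     # g, eggy waffles
print(round(eggy_protein / regular_protein, 1))  # → 1.9

regular_lysine = 1.1  # g
eggy_lysine = 3.0     # g
print(round(eggy_lysine / regular_lysine, 1))  # → 2.7
```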
I have come to a realisation a bit later than I should have. Although I am still quite young and definitely have time to act on this realisation now, I wish I had started sooner.
I am studying to become a teacher, and I hope to go into education policy later, with some quite large ambitions in mind. And yet, my social skills are quite poor, and I have hardly any charisma. I seek to change this. I know that much of the cause of my poor social skills is never having created or found opportunities to develop them along the natural developmental path of a child or teenager.
And so I take to reading books in order to learn, and then apply what I read in life. I suppose I could just sit and think and figure out what to improve on my own, but in the name of efficiency I want to at least start with the guidance of someone who actually knows what they're talking about.
So, any book recommendations that explicitly teach social skills and charisma? I've started working through Just Listen by Mark Goulston, which so far seems quite valuable.