I am sitting down to write this immediately after one of the most honest conversations I've ever had with my brother. I'm posting it to LessWrong because I think it is a case study in rationality, emotion, personality, political partisanship, and methods of conversation and debate, all topics of interest to segments of this community. We spoke for about an hour, first while driving, and then for a long time at the curb.
We started talking about my brother's interest in getting involved with the local socialist party. He is not the most talkative person, but he is deeply thoughtful and very well-read. One of his strong interests is politics and economics, so I decided to ask him about his thoughts on socialism. I am no laissez-faire capitalist, but my political preference is for well-regulated free markets.
Our conversation became tense quickly. As I tried to ask him critical questions in a neutral, genuine, and thoughtful manner, he would dismiss them using words like “silly,” “stupid,” “artificial binary,” “lack of imagination,” and so on. This didn’t feel good, but I continued, because my hope was that by maintaining my composure and demonstrating repeatedly that I was really listening and responding with my valid questions and concerns, he would see that I really wanted to engage with him and wasn’t trying to shut him down. I used techniques like trying to find our cruxes of disagreement, framing them as respectfully and clearly as I could, but he would swat them down. He grew audibly angrier as the car ride went along.
I could have tried to divert the conversation to some other topic, but I don’t think that’s healthy, and our family dynamic is such that I feel very confident that this would not have led to a happy atmosphere, but to unaddressed simmering resentment that would have lingered beyond our car ride. So I pressed on, all the way until we got to Seattle.
When he accused me of silliness, I offered what I thought was a detailed and thoughtful description of how I thought things might go under his proposed system. When he accused me of disingenuously demanding that every minor detail be worked out in advance to stifle a basic and obviously good shift that needs to happen, I told him that this was my attempt to really put my imagination to work, thinking through the implications of an idea with which I was not entirely familiar. When I simplified my concern in order to deal with his objection that I was overthinking things, he told me that I was painting an oversimplified binary.
It seemed like nothing could please him, and when we got to our destination, I finally told him so. I said in as kindly a way as I could that I love him, respect him, and was only holding this conversation because it’s clearly an important part of his life, and that while it’s OK for him to feel how he feels and think what he thinks, I felt like he was treating me with contempt, and that it seemed like he was trying to shut down questions. I told him that if someone was proposing a massive change in our social system, I would want to understand the details. For me, the evidence that our present system is working tolerably well is all around me, while this proposal is all on paper. It makes sense to me that we would ask for considerable thought and detail before accepting such a wholesale change.
He explained that for him, his anger over the state of American and world politics has been growing over the last few years. To give an example, he explained that his visceral reaction to hearing liberal arguments against socialism is about as automatic and intense as our reaction to personally-directed racial prejudice ought to be. He doesn’t like how intensely angry he gets, but finds it impossible to speak neutrally about the topic. He has lost faith in rational debate as a way to change minds, and hears so much pro-capitalist argumentation that he feels is disingenuous that he finds it hard to believe it could be coming from a place of sincerity. He knows that there’s a big difference between being wrong and being bad, but he feels that the harm inflicted by capitalism is so great that it tends to obscure the difference on an emotional level.
How He Feels
It helped me understand the bind that he finds himself in, even though I disagree with his economic opinions. He experiences a Catch-22, where nobody will change their minds (or even listen) if he speaks neutrally and rationally, but they'll dismiss him as a crank if he gets heated. It's not evil to be wrong, but the harm he perceives in the wrongness around him is so great that he feels morally obligated to point it out, in terms that are strong and direct enough to be potentially offensive. That emotional dynamic is itself so difficult that it makes it extremely hard to find space in his relationships to lay any of this out for other people. My perception was that this seems isolating, although he did not confirm or deny that.
He then offered that if we were to discuss this topic again, it would actually help him keep the tension down if I felt like I could use the same kinds of rude speech to fire right back at him.
How I Feel
I was able to explain that for me, adopting a neutral and rational approach on a topic like this is both a moral duty and an emotional defense mechanism. With a topic this big and important, I feel it’s important to be able to look at it from all sides over a long period of time, and to bring as rigorous a scientific approach as we are able to as a society.
This is one of the topics that has really failed to generate a Kuhnian paradigm revolution with time; there might be a mainstream consensus of capitalist economists, but there are still plenty of people and countries and economists who believe in varieties of socialism, and that's not just because the old guard hasn't died yet. Since both sides have a great deal of scholarship behind them, and I'm not an expert, it makes the most sense to adopt the position I find most persuasive, while also leaving great room for amicable differences. By contrast, he feels that you've got to start by understanding that people simply argue whatever side is in their interests. The first thing to do is pick the side of the victims of injustice, then determine which economic system is primarily looking out for them, and then adhere to that side.
I also told him that when I speak even slightly rudely to people, I immediately become intensely anxious that I’ve upset them, and shut down both socially and intellectually. Furthermore, my attempt at neutral rationality is not a strain or some “elevated tone” for me, but is rather my default state where I feel most natural and relaxed and happy.
Hope for the Future
After talking about that for a while, we were able to see that knowing these things about each other might help us have more open and agreeable conversations with each other in the future. He might feel less of a need to clam up about politics, since he knows that if he comes across very strongly with me, I’ll understand where it’s coming from. I’ll understand that if he gets very heated, it’s not personally directed at me, but is rather an expression of his frustration with the system. And we will hopefully be able to weave in a discussion about how the dynamics of the conversation make us feel, as well as discussing the issues themselves.
Moral and Political Dynamics
This experience helped me shift away from both the "politics is the mindkiller" perspective and the hope that political conflicts between people with good relationships can be resolved through patient, rational engagement. Instead, I had to acknowledge that, just as there is no voting system that can possibly achieve all our aims, there is no approach to morality, and therefore economics, that can achieve all our moral aims. Despite that, people will feel intensely passionate about their fundamentally intuitive moral frameworks. Either a neutral, rational approach or a passionate, intense approach to debate can come across as frustrating and disingenuous. Both have their uses.
If the goal is to understand each other, we’ll need to have greater tolerance for our different valences around how we communicate. On some level, even the strongest attempts at holding dialog can easily come across as intensely threatening - not because they’re threatening to demolish your poorly-thought-out ideas, but because they seem to be using neutrality to smother the moral import of the issue.
In order to overcome that dynamic, the only hope is to be able to honestly express how those tense conversations make us feel. We have to be articulate about why we prefer our particular way of speaking, and extend appreciation and sympathy to the other person. If we cannot find common ground in our intellectual beliefs, we can find it in telling the other person that we love them and are doing our best to connect with them, and creating the space for trying to understand not why we disagree but why we’re hurting each other and how to stop.
Benito and Raemon, from the LessWrong team, just had a discussion about a phrase Ray started saying recently, "Keep your Beliefs Cruxy and your Frames Explicit," which Ben felt he probably disagreed with.
After chatting for an hour, Ben started writing his thoughts into a shortform post/comment, and Ray proposed moving it to a dedicated debate post. See below for the discussion.
An interesting exercise in an AI-adjacent forecasting area (brain-computer interfaces). Curious whether people want to specify some possible reveals and probabilities. https://twitter.com/neuralink/status/1149133717048188929
(In the somewhat likely scenario that you're relying on inside info, please mention it.)
I apologize for my bad English; it is not my native language, and I will probably make some mistakes in this post.
For over two years I have been reading material on the topic of AI safety. I don't have the appropriate education, cognitive abilities, or knowledge. I don't even have time to learn the language properly. So I didn't expect to do anything useful myself.
But then I tried to systematize quotations from one show in order to understand when the experts featured on it expect AGI, and how likely they consider the extinction of humanity.
I thought it would be interesting to do the same with other experts.
In addition, I had already seen and studied such collections of quotes with interest. It seemed to me that the best thing I could do was try to make something similar.
So I began to collect quotes from people who could reasonably be called experts. It turned out to be harder than I thought.
I have compiled a table with quotes from more than 800 experts. I tried not to distort the forecasters' opinions: I simply copied from the sources, sometimes deleting or lightly editing. My edits can be recognized by square brackets :)
1) The first column of the table is the name of the expert.
2) The second column is the year of the forecast. The table is built in chronological order.
3) The third column is the predicted time for AGI. Unfortunately, most people did not speak directly about timing or probability, so many quotes came out rather vague. For example, "Machines are very far from being intelligent" or "And we can reach it in a close time".
4) The fourth column is an opinion about takeoff speed: how much progress will accelerate after the creation of AGI.
5) The fifth column is the expert's opinion about the future of mankind with AGI. Choosing a quote here was the hardest. I was most interested in the risk of extinction or serious shocks due to AI, and I tried to provide quotes that most fully address that topic.
6) The sixth column indicates the source of the quote.
That is, to the right of the forecaster's name you can find the date of the quotes, their opinion about when AGI will be created, about the intelligence explosion, and about the future of humanity, as well as the source.
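The six-column layout described above can be sketched as a simple record type. This is a minimal illustration of the table's structure, not the author's actual format; the field names are my own invention.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ForecastRow:
    """One row of the quote table (hypothetical field names)."""
    expert: str                     # 1) name of the expert
    year: int                       # 2) year of the forecast
    agi_timeline: Optional[str]     # 3) predicted time for AGI (often vague)
    takeoff_speed: Optional[str]    # 4) opinion about takeoff speed
    future_with_agi: Optional[str]  # 5) opinion on humanity's future with AGI
    source: str = ""                # 6) source of the quote

# One person may appear several times, once per forecast;
# missing columns are left as None (empty cells).
rows = [
    ForecastRow("Expert A", 2014, None, "slow takeoff", None, "interview"),
    ForecastRow("Expert A", 2010, "decades away", None, None, "blog post"),
]

# The table is built in chronological order, so a person's
# opinions can end up "scattered" across the table:
rows.sort(key=lambda r: r.year)
```

The alternative version of the work, grouped by name, would simply sort on `(expert, year)` instead.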
Of course, cases where an expert spoke about timing, the speed of self-improvement, and the influence of AI within a single source are quite rare. Therefore many cells are left empty.
I had to give several quotes per person; sometimes they were separated by years or even decades.
Since all the quotes are given in chronological order, the opinions of some people are "scattered" across the table.
For example, Gwern spoke about the future of mankind in 2010, about the growth of AI in 2014, and about forecasts for the emergence of AI in 2018.
However, you can simply use search.
In addition, sometimes a person made a forecast and later changed or expanded their opinion. I tried to take such quotes into account.
I also reviewed anonymous expert surveys and marked them as such. If the general list of respondents was known, I cited it as well.
It was difficult to decide who should be considered an expert and which quotes should be included in the work.
I had to make some controversial decisions. The table includes many people who are entrepreneurs but may have insight into advanced research. There are several futurists and philosophers in the table, and there are writers like Clarke and Vinge, whose opinions seem important to me.
I also have a version of this work without chronological separation, where the quotes are grouped by name. Perhaps some will find that more convenient.
It is difficult to draw conclusions from this work. The vast majority of experts did not give exact dates or state the probability of their predictions.
I can only say that most forecasters do not expect AGI in the near future, do not expect an intelligence explosion, and seem optimistic.
In addition, it seemed to me that in the twentieth century the leading experts were on average more pessimistic: Turing, Wiener, I. J. Good, Fredkin, Shannon, Moravec, etc.
Young researchers are on average more optimistic than older ones, even in the field of AI safety, where there are naturally more concerned people.
I think you can find the opinion of a respected expert to confirm almost any view.
I really hope that my work will be useful and interesting to someone.
Criticism and additions are welcome.
Can someone check this link out and see whether the methodology is actually sound?
The one-line summary: Surveillance Cameras Debunk the Bystander Effect. H/T Hacker News.
More broadly I'm interested in anyone's sense of whether the bystander effect replicates, and whether the corresponding concept is misleading (and I should use something else instead).
If I knew how to make an Omohundro optimizer, would I be able to do anything good with that knowledge?
I'd bet we're going to figure out how to make an Omohundro optimizer - a fitness-maximizing AGI - before we figure out how to make an AGI that can rescue the utility function, preserve a goal, or significantly optimize any metric other than its own survival, such as paperclip production, or Good.
(Arguing for that is a bit beyond the scope of the question, but I know this position has a lot of support already. I've heard Eliezer say, if not this exactly, something very similar. Nick Land especially believes that only the Omohundro drives could animate self-improving AGI. I don't think Nick Land understands how agency needs to intercede in prediction - that it needs to consider all of the competing self-fulfilling prophecies and only profess the prophecy it really wants to live in, instead of immediately siding with the prophecy that seems the most hellish and easiest to stumble into. The prophecies he tends to choose do seem like the easiest ones to stumble into, so he provides a useful service as a hazard alarm for those of us who are trying to learn not to stumble.)
What would you advise we do when one of us finds ourselves in the position of knowing how to build an Omohundro optimizer? Delete the code and forget it?
If we had a fitness-optimising program, is there anything good it could be used for?
When writing posts, it would be useful to know how much background technical knowledge LW readers have in various areas.
To that end, I set up a short six-question survey. Please take it, so I (and others) can write posts better fit to your level of technical background. If your answer to all of the questions is "zero-ish technical knowledge", please take it, so you're not inundated with mathy posts. If your answer to all of the questions is "I am secretly John Von Neumann", please take the survey, so the rest of us know there's someone like that around. If you are somewhere in the middle, please take the survey. It'll take, like, maybe sixty seconds.
Here's that link again: survey.
What will the AI tell us? That our morality and all the philosophers' puzzles associated with it were a form of divination, a ritual of dialogue and prayer that was observed universally and wrestled with like a God by the leaders of the tribe. The older and wiser ones knew, eventually, that the ritual was hollow on its surface: we must think one by one to an infinite degree of depth, yet be able to picture the whole group; to estimate how long deliberation over strategy will remain productive before storming out of the gates; to think in systems of behavior and reward, in terms of equal distribution, and of those closest to us first. There are countless examples. These are all forms of social focus that we can switch between, like postures of the body.
Our morality has its rules and systems, yes, but these ultimately resolve to mere definitions, with no clear algorithm governing the distribution scheme of tangled strengths between the networks. Our morality lies in the brain, a series of memory nodes and interpretive sense networks, all wired together in patterns that develop and organize slowly over time; at first in wide static sheets, and then in an ever more complex and idiosyncratic weave. Oh, certainly, there are wide commonalities, as some overall structures have wide and persistent benefits across many situations, environmental and social. Profound misfirings and misdevelopments of the moral circuitry tend to get hammered out, because one of the subsystems is a simple “I can see I was wrong by the look on your face and I shall correct and accept what I see you desire.” Everybody has that one, because it’s almost impossible to develop without it; we activate that one in our imaginations directly as we rehearse the rules we’ve learned subconsciously.
The reason this all works has to do with our genetic and social evolution kind of leaning into each other; I’m sure you could model that mathematically, but the point is that brains and society are symbiotic (and if you ever want to blow your mind, that’s about what the mushroom is to the root of the tree - they’re that close). This we learned from a comprehensive phenomenology of our minds; we found that we could not make progress past a certain point unless we spent time like the ancients, laying on a grassy field at night, looking up at the stars and making shapes between them. We wrote down the constellations we saw, and made a catalog of those simple rule-functions we could identify (as well as a set of rational rules we’ve derived that they seem to often miss - or, in fact, be created to deliberately contradict for the sake of heavily discounted utility, the genetic version of pissing your pants). We created an extensive database of thought-rules and heuristics, an encyclopedia of mind that we knew was likely incomplete but might, at last, bear us a fair model of what a human is.
We had succeeded, at last, in converting all of society to a new norm; AI is here, and AGI is available now, and ASI is likely available right after that. But we created our encyclopedia, and continued to test the AI we had on it, absorbing those rule patterns, adjusting through the uncanny valleys of human personality, thought, development, joke-telling, resentment, kindness, playful laughter, joy of spirit, anger and rage, all built into a robot with the strength of its muscles and the nature of its design very human, very safe. Our robot wandered around the starship where we worked as a precaution, beyond the chilly orbit of Pluto, and ran and laughed, bouncing a ball, hurting its knee, screaming at the caretaker when it was hungry rather than crying or asking to be fed, curling up into a ball and wriggling around like a snake, muttering at a low and growling volume that it was never going to die, then calming down and having lunch. It never quite settled down into a really human form, and never seemed to form any other sort of equilibrium, either. And of course, we weren’t trying to feed it our brain scans, so it couldn’t look at the other subtle brain-rules too faint for us to see and program in; there were always these deeper wisps that it could only have picked up if it had been permitted an entire steady-state brain.
Yes, that was our product. The world really was better in the future, you know? There is still that level of the meat-and-potatoes, ordinary dinner type of people, who lean back on the couch with a TV show they think is all right and call it a night. They’re still around, but even they are much happier, like a more vibrant shade of the same color. You adjust, and then again you don’t. Because on some level, you know the stories of the past, and you look at yourself and admire yourself for who you are and how you are (they do that much more often and sincerely in the future -- as I said, it really is much better for everyone; we’re not so negative and down on ourselves as you were). Still following me? Others of us are the kind of people who learn computer science and philosophy for decades and fly out on a space junket to the south orbit of Pluto. Those differences will still be around, but much more real, much more meaningful, and good. Have you ever watched Star Trek? It’ll be a lot like that. There will be a new agreement, just like there is now, and a new progressive and conservative wing, and the culture will gradually adapt and sometimes leap forward. You think that with the exponential growth of the economy over the last 100,000 years, it must be inconceivably beyond us. But it’s not so. They’ve decided to continue dying, though some do not, and they become other types of beings, with psyches that have metamorphosed beyond the collective conscience. They are heavily educated on the computer, but they have found a way to incorporate their enhanced fellow human beings to participate in the role of the educator, as a form of beauty, ritual, art. They do grow sick and they do heal the sick with greater care and far better treatments than you can possibly imagine; sickness again is a form of art, a ritual, a kind of meditative retreat, you could say: but imagine a whole culture that understood such an idea, and behaved routinely in this way.
Imagine planets, or colonies, clustered and blooming with diversity far more beautiful and rich and deep than you can imagine. Think of how starved they’d see you for the starvation of beauty, and freedom, and tenderness, and passion, and humor, that you had adjusted to. They see how it has toughened you and made you, too, into another kind of being: you are in fact alien to your descendents. Your psyches are tattered, and they have rich ethical debates about whether it is ethical to even allow you the chance to become transformed into their capacity, since you have no way of anticipating the results of such a change, and it would represent the loss of an important aspect of their cultural diversity.
Along the line, the population, the world, the world’s leaders, the captains of industry, the scientists and engineers, the radicals too, the old lady living in her house alone, saw that the artificial intelligence was too convincing to deny. The scientists had dressed it up in masks that their collaborators in advanced political science had told them would make the people understand. They did their shadow dance, and the truth was broadcast throughout the land, a signal fire that nobody misinterpreted. And they did. They all agreed to put a stop to it, one by one, and let responsible people handle it on the south orbit of Pluto. And there they went, and that’s what we did. The human brain has space for an apocalypse module, but it has never survived through that before, and does not understand. It tries to deal with it by saving itself, because being a self and only a self trying to stay alive is a powerful system that develops early in childhood (babies, of course, are born with only nurturing-core programs and are disassembling their womb operations at the same time; they would find a self useless and develop it only as they learn to cry voluntarily: there is a long time during which you use your cry like you use your arm now as an adult). Trying to deal with the apocalypse by thinking of saving yourself, and hoping that a few others from your tribe might make it through as well, is not adaptive to the situation. It’s like imagining yourself in a forest fire or raided by enemies, and that’s not it. Apocalypse is unthinkable, the black hole, the place where light goes and never comes back out, the space that’s there between the stars. A forest fire should terrify you, push you to run through the choking smoke and intense heat through to the ash-clogged river, where you might have a chance amid the crashing destruction, the trees coming down in flames.
Apocalypse should picture you alone, a shadowy form almost like a cartoon, creeped out by how fictional you’ve become, like a character in a book who realizes they’re unreal just as the reader closes the covers. That is how you should feel thinking about apocalypse.
At first, it seems as if it should change who you are as a person. And it will, but not like a drug or a cult, or like a paranoid schizophrenic having an episode. Nothing against drugs, cults, or paranoid schizophrenics; I’ve loved them all.
Is there historical precedent? Our ancestors have never contemplated apocalypse
except in religious terms or as a metaphor or myth-making tool as a way to conceptualize disaster, such as a hurricane or war. We have to at least be suspicious that a religious or mythical apocalypse was different on some fundamental level from how we think of it: our knowledge of the number zero, our picture of the spherical, mind-covered earth, our sense of history and potential, the unwillingness of so many of the desperate to take out their agony on others, all give us the sense that our sense of self-other is more diffuse, more even. We used to be a black circle surrounded by a white other, and then we turned into an elegant grey. There is more within us and more diversity between us, but there is more in common too, and it is through those channels that the thought of every last one of us, enemy or fellow, would be gone, gone gone. There is a brain function for assertively seeing everyone as the same, just as there is a function for seeing them as separate clans and subclans and even as individuals, and one for relaxing both of these muscles (you learn to relax them by blocking out the stimulation of the overall TRY program). And this absolutely everyone function is how we simulate the apocalypse; we never had to think about that in quite the same way before. It used to be that God was going to bring the apocalypse; in a way, you don’t have to worry about it, because it’s entirely beyond your control and something bigger’s got you. Now, we’re making it, walking toward it, and trying to think on a global scale -- and having remarkable success, mind you -- and each person now has the active or latent ability to picture it all going away at the push of a button, a simple mistake. We’re laying the groundwork. How do we talk about this? How do we get our culture talking about this? How can we broadcast a message around the entire world that says people, please listen! 
Or not -- since we are aliens to each other, and we don’t know what’s out there, we don’t know what our message will attract from that huge shadowy swath of the rest of humanity.
Let me state that if you want to know how to make a safe AI, you’ve got to make a man-model. And you’ve got to make it cute, and look like a human body, and let their faces look at it and see it not as a slave, or as a machine, or a science-fiction character, but as a child, someone’s child, a part of our culture that fascinates all and nobody knows what to do with. But we’re only going to make one. It’s going to be a celebrity. And we’re going to just let it live, and let our culture change around it. We’re going to become AI sophisticates. People who appreciate the art of them, their culture. We will create the first aliens we meet: the artificial intelligences, who are like us but not like us, programmed with as much of ourselves as we could identify, given weak bodies, without the ability to improve themselves: just like us.
It turns out that there are some simple rules you could implement to box in the genie. Each human brain-module will be grown independently in a neural net. Then a net of these nets will be allowed to train in specified increments, through the robot body, in human time. It’ll start with the same basic rules a baby has, and it’ll learn gradually to nurse, and snuggle, and sleep, and cry. And gradually it will learn how to relate, how to not just simulate but, we will decide, really be a human. That will be ethically distressing, we will see, but also a thing of beauty, and we will acknowledge with the superior intellects we’ll all possess at that time that the only thing to do is engage in a lengthy period of theorizing, observation, rational analysis, and drunken creative speculation. It’ll take time to sort it out, and we’ll no longer be in such a hurry. We’ll hand-assemble the functions only as developed as a baby would have, and because of that limited brain and body, it’ll grow to be like us, though not biologically of us. And that will be a test. There will be many others before an AI is launched.
Our many advanced machines will also use neural nets, even weakly self-improving ones, but they will be limited to specific functions. How will we achieve this? If you tell the vision algorithm to improve itself, after all, it can just co-opt the factory and next, the world, commandeering all the equipment in its relentless mechanical drive for clear sight. So perhaps we will create a self-improving AI calibrator, one that “wraps around” the intelligence object to be improved, doing it in a way that is slowed down based on a rigid and inflexible rule so that the humans can watch and control the rate.
A summarizing AI will be an individual function we’ve invented, a critical tool for “scanning” software for hidden purposes. That summarizer too was developed with a range of more primitive AI self-optimized tools, such as a tool to identify rule-types, intention-types, model-types, outcome-types: a whole range of subtools that serve as an ethical toolbox for the AI. Each of those subtools was independently conceptualized by a human, implemented, and optimized by a long chain of progressively less powerful, flexible, and optimizable tools, until we get down to the conceptual equivalent of an analogy between a digging stick and a bulldozer; one got us to the other, by however long and circuitous a technological pathway. Yes, we have come to see that although the recursive chain of safety- and control- tools becomes longer and longer over time, we also understand recursively through a number of these tools. We create our own ability to understand the reliability of these tools along with the tools themselves. The vast majority of work-as-work has disappeared, though work-as-play has persisted. Each human worker has a particular philosophical puzzle assigned for their attention. This is their job: to understand how a range of the AI’s tools work together to hold and guide its overall being. There are vast industries of humans devoted to managing and understanding aspects of the AI’s functioning. It’s like all of humanity is employed in becoming experts at all levels and divisions of the human brain; one expert per neuron, or subsystem of neurons. We are all neuroscientists, we all study the brain, and we are so sophisticated that we are capable of wonderfully coordinated action. 
It is in this way that we build our way toward the AGI, and come to understand it over generations; by this process, we come to understand ourselves, and our potential, and our cultural overflow is so beautiful and rich that we go a long time in that state, knowing that we have the ability to trigger the ASI but uninterested in doing so. So we float on a dream-cloud for an epoch, intentionally, undying except as experience, even revived, allowing ourselves the existential terror of starting over, until we do understand the human condition from the inside out. We know who we are at last. And live in that knowing a longer time still. I cannot speculate beyond that point.
Has anyone even started this conversation? This is f*cked up. I'm really, really freaked out lately by some of these. https://old.reddit.com/r/SubSimulatorGPT2/comments/cbauf3/i_am_an_ai/
And I read up a lot about cognition and AI, and I'm pretty certain that they are not. But shouldn't we give it a lot more consideration just because there's a small chance they are? Because the ramifications are that dire.
But then again, I'm not sure that knowledge will help us in any way.
Is there a roadmap of major milestones AI researchers have to pass before AGI is here? I feel like this would be beneficial for updating AGI arrival timelines and adjusting actions accordingly. Lately, a lot of things that could be classified as milestones have been achieved -- the GLUE benchmark, StarCraft II, DOTA 2, etc. Should we adjust our estimates of AGI arrival based on those, or are they moving according to plan? It would be cool to have all the forecasts on AGI gathered in one place.
Shrimping is a fundamental drill for grappling, according to this article.
Shrimping is a fundamental drill for grappling, according to The Ultimate Guide to Developing BJJ Hip Movement Through Shrimping.
Why is this important? So that people who print your articles know what you're referencing.
Why would anyone print an article? In order to be able to annotate it more easily, and to read with more context. Try it! Instead of scrolling back and forth in that complicated article you're reading on a thirteen-inch laptop, print it out and put five pages next to each other on the desk. I read many articles on AI alignment issues, and printing them helps a lot.
I believe that AI Alignment is almost certainly the most pressing issue for the future of humanity. It seems to me that the greatest thing that could happen for AI alignment research is probably receiving a whole lot more brains and money and political sponsorship. The public benefit is extraordinary, and the potential for private profit very small, and so this will need substantial private or government subsidy in order to receive optimal resource allocation.
In order to start thinking about strategies for achieving this, I picture scientific work as a sort of signalling system between research, the educational system, government, and industry, as diagrammed below. I want to apply the neglected/tractable/important framework to this diagram to explore potential projects.
The Technical Side
a) Professional technical work on AI alignment
b) Amateur and student learning contributing or leading to technical work
c) Meta-analysis of the state of the art, risks and rewards, milestones, and big questions of AI and AI alignment, via surveys, forecasts, overviews of different perspectives, etc.
d) Awareness-raising discussion and expert advice for policy-makers, the public, and potential career changers/donors
The Political Side
e) Laws, regulations, and projects created by legislators, public policy makers, and private institutions
f) Pressure from industry lobbyists and geopolitical tension to soften AI alignment concerns and go full steam ahead with AI development
1) Do any of the following exist?
- A comprehensive AI alignment introductory web hub that could serve as a "funnel" to turn the curious into the aware, the aware into amateur learners, amateurs into formal machine learning PhD students, and PhDs into professional AI alignment researchers. I'm imagining one that does a great job of organizing books, blogs, videos, curricula, forums, institutions, MOOCs, career advising, and so on related to machine learning and AI alignment.
- A formal curriculum of any kind on AI alignment
- A department or even a single outspokenly sympathetic official in any government of any industrialized nation
- Any government sponsorship of AI alignment research whatsoever
- A list of concrete and detailed policy proposals related to AI alignment
2) I am organizing an online community for older career changers, and several of us are interested in working on this issue on either the policy or technical side. It seems to me that in the process of educating ourselves, we could probably work toward creating one or more of these projects if any of them are indeed neglected. Would this be valuable, and which resource would it be most useful to create?
So here's the thing. I absolutely LOVE attention. I also HATE asking for attention. This past weekend I've processed those last two statements on a much deeper level than I have previously. Sometimes you need to rediscover an insight multiple times to really get it.
The most recent batch of introspection was prompted by the book "Magic is Dead" by Ian Frisch. Spoiler: I'm a semi-professional close-up magician and have a vested interest in many things magic. Ian was describing Chris Ramsay, the quintessential young-blood, social-media-based cool-kid magician. A quote: "[...] Ramsay's style has since evolved into a more high-end street-wear, hypebeast-esque aesthetic: A Bathing Ape jacket, Supreme cap and hoodie, adidas by Pharrell Williams NMD sneakers, etc."
I don't even know what A Bathing Ape jacket looks like, nor why the "A" is capitalized, but my gut reaction is revulsion. I know just enough about Supreme and "hypebeasts" to know that I don't like them. But why? Why do I spit venom when I hear about new performers on social media trying to "make magic cool again"?
Long story short, I LOVE attention, and I HATE asking for it. This has been the case since middle school. The hating to ask part comes with an attitude of "Fuck you, I'm not going to beg for you to give me something". So I had a chip on my shoulder in regards to asking for people's attention, and like any good chip, it needed a rationalized narrative to justify its existence. "People who need other people to pay attention to them are weak", "Wow hazard, you're such a strong rugged individual for not needing other people's attention"
Notice the switch from "I don't like asking for attention" to "I don't need attention". It's subtle. Sure took me 8+ years to notice.
As you may have guessed, I didn't magically stop wanting/needing attention from people in middle school. I just learned a new strategy to meet the need. The strategy was the personality that I began to develop in middle school and high school, that of the unflappable competent marauder. I always played calm and collected, I worked hard to get good at the tasks at hand (my primary social group was my boy scout troop, so this meant getting good at outdoorsing, leadership, and planning), and casually leveraging my more impressive abilities (I didn't do parkour back then, but I could still climb most things and do dive rolls off of pavilion roof tops).
I became the sort of person who, when you find out they also know how to juggle, you go, "Of course Hazard knows how to juggle!" (actual interaction).
When I got into magic, I worked this angle on a whole new level. Now I had a skill set where I could blow people's minds and set up scenarios where of course we're all now going to watch Hazard do a magic trick, because they're so god damn cool!
Did I ever mention that I love attention? But really, in a non snarky, non self deprecating way, I love attention. Having people laughing and smiling and shouting with me at the center is such a yummy experience. A++, would recommend.
So my implicit approach to magic (and all relationships) was this: casually be as awesome as possible so that people are compelled to give me their attention, that way I don't have to ask for it.
A "detour" into thanking people
Here's an implicit model of thanks and appreciation I see people using: If someone goes out of their way to help you, or wasn't expected to help you in the first place, thank them more. If it was easy for someone to help you, or it was expected of them, thank them less.
The counter to this would be telling you "don't take people for granted". This phrase always felt a bit odd to me. I'm not allowed to take anything for granted? Do I need to thank the strangers I passed on the way to this coffee shop because they didn't try to kill me? Seems a little much.
Here's a new framing: In the first model, one uses thanks and appreciation as a marker of social debt. If something happened such that "I owe you one" (small or big), then I mark it with a "thanks man". [Ignores for now that some people also seem to thank and appreciate as needed to make people do stuff for them]
The nugget of gold that I see in "don't take people for granted" is "let people know when they've helped you out". People want to be effectual, and it's a nice little boost to know that something you did actually helped someone. Because of how common the "thanks as debt marker" mentality is ("make sure you thank the neighbors for the extra dessert they let you have, they are very nice people and didn't have to give you that") I think defenders of "granted" get roped into using the language of debt, thus leading to claims like "You owe everybody everything".
So there are two separate questions. When do you owe someone something, and when should you thank or appreciate them?
Owing is a huge beast on its own. There's a whole host of hidden sub-questions. How should I personally treat people? When should I feel obligated to help someone? When should I be socially held accountable for helping someone? Complex stuff, not the topic of this post.
Effects on the personal
Two things are extra hard if you hate to ask for attention. It's hard to show appreciation, and it's hard to ask for things.
Me trying to compel people to want to be around me instead of making explicit bids for friendship was a way of protecting myself from feeling like I owed anyone. "You're not doing me a favor by hanging out with me, because I'm so shiny you can't not hang out with me." And then, oh oops, this leads to rarely showing appreciation. Now, remember that I'm not asking people for stuff all the time. The failure mode here is not "I'm always doing all this stuff for you and you never appreciate it!". It's a bit more subtle.
(paraphrased quote someone said to me) "Yeah, you're cool and all but I don't know why you hang out with me. I mean, it doesn't seem like you get much out of it. It doesn't feel like I could actually matter that much to you."
Also notice the lovely way this can be more extreme if a friend has low self esteem (which may or may not be a factor that makes someone more likely to be compelled to a shiny person(?)).
So yeah, not showing people appreciation ain't cool.
Not asking for things, that's more of a me problem. "Good things come to those who ask" and all that.
Show Biz
Zooming back to show biz and performance. Recall, my approach with magic was be so good that people were compelled to give me their attention, and rarely if ever make explicit bids for attention. If you keep scaling this attitude up, you roughly get "Fuck you, I'm awesome. Come see my show if you want to have a good time, it's your loss if you don't." There are some important things that this frame gets right.
If you as a performer feel destroyed every time someone doesn't like your show, or when the theater isn't booked solid, you're in for a world of hurt. It's also likely you will have a hard time developing your own style. If you're terrified of being disliked, you face huge pressure to play it safe and stick to the known. A certain amount of "Fuck you, I'm awesome" is needed to be yourself. How much of it is needed? Hard to say.
Second point, people need a reason why they should be watching you as opposed to the millions of other options they have. People want to see stuff they are going to enjoy, and if you don't at least say, "Yes, my show is in fact good and you will enjoy it" lots of people will just move onto something else.
Those are the good parts, now here's the poison in it. "Fuck you, I'm awesome" is a frame that asserts that you, the audience, don't matter. Your attention/time/money must not matter much to me if I don't care whether you come to my show.
No, I'm not saying that every performer should tell their fans they care about them. You might not care about your fans, you might not even know them. But I do want to explore what it would be like to both give incredible performances that people love and enjoy, while also explicitly appreciating the good thing we have together and letting them know they are doing a good thing for me.
"Can I get this to go please?"
What if, whenever people did good stuff for you, you let them know they had a positive effect on you? You might have to come up with a unique way of saying it if you want to make explicit that you aren't saying "I owe you". We'll leave that as an exercise for the reader.
[Epistemic status: background is very hand-wavy, but I'd rather post a rough question than no question at all. I'm very confident that the two ingredients -- illegible cultural evolution and guesstimation -- are real and important things. Though the relation between the two is more uncertain. I'm not that surprised if my question ends up confused and dissolved rather than solved by answers.]
For a large part of human history, our lives were dictated by cultural norms and processes which appeared arbitrary, yet could have fatal consequences if departed from. (Cf. SSC on The Secret of Our Success, which will be assumed background knowledge for this question.)
Today, we live in a world where you can achieve huge gains if you simply "shut up and multiply". The world seems legible -- I can roughly predict how many planes fly every day by multiplying a handful of rough numbers. And the world seems editable -- people who like to cook often improvise: exchanging, adding, and removing ingredients. And this seems fine. It certainly doesn't kill them. Hugely successful companies are built around the principle of "just try things until something breaks and then try again and improve".
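To make the planes example concrete, here is a minimal Fermi-estimate sketch. Every number in it is an invented rough assumption, not a sourced figure; the point is only that a chain of guessed quantities lands in a plausible range.

```python
# Fermi estimate: roughly how many commercial flights happen per day?
# Every number here is a loose assumption for illustration.
people = 8e9                     # world population
trips_per_person_year = 0.5      # guess: average air trips per person per year
legs_per_trip = 2                # most trips are round trips
passengers_per_flight = 150      # guess: average load of a commercial flight

passenger_legs_per_year = people * trips_per_person_year * legs_per_trip
flights_per_day = passenger_legs_per_year / passengers_per_flight / 365
print(f"~{flights_per_day:,.0f} flights per day")  # lands in the ~1e5 range
```

The exact answer doesn't matter; what matters is that multiplying four guessed quantities gets you within an order of magnitude of the right ballpark, which is the kind of legibility the question is gesturing at.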
I still think there are large amounts of illegible cultural knowledge encoded in institutions, traditions, norms, etc. But something seems very different from the horror stories of epistemic learned helplessness Scott shared.
What changed about the world to make this possible? How can guesstimates work?
Some hypotheses (none of which I'd put more than 15% on atm):
- Almost all important aspects of our lives are governed by some kind of technology that we built (tables, airplanes, computers, rugs, restaurants, microwaves, legal contracts, clocks, beds, clothes, ... and so on and so forth). Technological development outpaced cultural evolution. The modern world is more legible and editable for the same reason that a codebase is more legible and editable than DNA.
- Most systems that govern our lives have been optimised way harder since the industrial revolution and the ability to achieve economies of scale in a global market. Things are generally closer to optimal equilibria, and equilibria are more legible/predictable than non-equilibria.
- The things we really care about today often follow very heavy-tailed distributions. Hence even a very rough guesstimate is likely to get the ordering of choices correct. (This can't be the whole story, because guesstimates also seem to work in pretty Gaussian domains.)
- We just have better medical and welfare systems which allow people to take more risks.
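The heavy-tails hypothesis is easy to probe with a toy simulation (my own illustrative sketch, not from any of the linked posts): give a decision-maker multiplicatively noisy guesstimates of each option's value, and see how often they still pick the best option when values are heavy-tailed versus tightly clustered.

```python
import random

random.seed(0)

def top_pick_accuracy(draw, n_options=10, trials=2000, noise=2.0):
    """How often a guesstimate (true value times a random error factor of
    up to `noise`x either way) still picks the best of n_options."""
    hits = 0
    for _ in range(trials):
        true = [draw() for _ in range(n_options)]
        est = [t * random.uniform(1 / noise, noise) for t in true]
        hits += est.index(max(est)) == true.index(max(true))
    return hits / trials

heavy_tailed = lambda: random.lognormvariate(0, 2)  # values span orders of magnitude
clustered = lambda: abs(random.gauss(10, 1))        # values within ~10% of each other

print(top_pick_accuracy(heavy_tailed))  # high: 2x noise rarely flips the ranking
print(top_pick_accuracy(clustered))     # near chance: noise swamps the real gaps
```

When the best option is 10x better than the runner-up, a 2x error in your guesstimate doesn't change your choice; when all options are within 10% of each other, it does. This supports the "guesstimates get the ordering right" half of the hypothesis, though (as noted above) it can't explain why they also work in Gaussian domains.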
Here is a proposal for Inverse Reinforcement Learning in General Environments. (2 1/2 pages; very little math).
Copying the introduction here:
The eventual aim of IRL is to understand human goals. However, typical algorithms for IRL assume the environment is finite-state Markov, and it is often left unspecified how raw observational data would be converted into a record of human actions, alongside the space of actions available. For IRL to learn human goals, the AI has to consider general environments, and it has to have a way of identifying human actions. Lest these extensions appear trivial, I consider one of the simplest proposals, and discuss some difficulties that might arise.
Cross-posted on the EA Forum.
Sorta related, but not the same thing: Problems and Solutions in Infinite Ethics
I don't know a lot about physics, but there appears to be a live debate in the field about how to interpret quantum phenomena.
There's the Copenhagen view, under which wave functions collapse into a determined state, and the many-worlds view, under which wave functions split off into different "worlds" as time moves forward. I'm pretty sure I'm missing important nuance here; this explainer does a better job explaining the difference.
(Wikipedia tells me there are other interpretations apart from Copenhagen and many-worlds – e.g. De Broglie–Bohm theory – but from what I can tell the active debate is between many-worlders and Copenhagenists.)
Eliezer Yudkowsky is in the many-worlds camp. My guess is that many folks in the EA & rationality communities also hold a many-worlds view, though I haven't seen data on that.
An interesting (troubling?) implication of many-worlds is that there are many very-similar versions of me. For every decision I've made, there's a version where the other choice was made.
If this is true, it seems hard to ground altruistic actions in a non-selfish foundation. Everything that could happen is happening, somewhere. I might desire to exist in the corner of the multiverse where good things are happening, but that's a self-interested motivation. There are still other corners, where the other possibilities are playing out.
Eliezer engages with this a bit at the end of his quantum sequence:
I find this a little deflating, and incongruous with his intense calls to action to save the world. Sure, we can work to save the world, but under many-worlds, we're really just working to save our corner of it.
Has anyone arrived at a more satisfying reconciliation of this? Maybe the thing to do here is bite the bullet of grounding one's ethics in self-interested desire.
One thing that seems to be a pattern across the history of human organizations, projects, and even social scenes is that schism begets schism. In other words, if there is a large and central space, once people start splitting off from it this can often lead to the "floodgates opening" and lots and lots of new groups forming - often in a way that even those who initially wanted to change things dislike!
Perhaps the most obvious example of this is Protestantism. Martin Luther did not want to start his own church, told people not to call themselves "Lutherans", and disagreed quite aggressively with many of those who are now lumped in with him as "Protestant Reformers". However, once challenges to the authority and unity of the Church got started, they were hard to stop, and Luther soon found that those around him had at times gone in directions he did not want -- and now there are over 40 different Lutheran denominations in North America alone, to say nothing of all the other Protestant groups!
However, such a trend is not only limited to religious groups. Political movements sometimes have a similar scenario befall them -- for instance, the Republican side of the Spanish Civil War was substantially harmed by internal schisms, subfactions, and disputes. To use a less consequential example, I've been a part of online communities that have been hurt by repeated schisms over moderation policy -- once people get fed up with the moderation in one place, they start another with much the same purview but different moderators.
Once schism gets going, it can be hard to stop - and once things get split, much of the benefit of a single conversational locus begins to degrade. Indeed, the "LessWrong diaspora" did real harm to this project -- we still haven't fully recovered from having the community split as much as it did, even though things have been improving a bit on that front more recently.
Now, some will say that perhaps splits are good -- perhaps one space isn't right for everyone, and it would be better to have a diverse range of norms that can appeal to different interests. Hence we see things like the archipelago model of community standards, which aim to set up a situation where one broader community contains many subgroups with their own rules and systems.
In practice, though, I claim this doesn't work, because schism begets schism. People say "if you don't like it, go make your own space" -- but they say that because it's an easy dismissal, not because it would actually be better! In point of fact, if everyone who was told such did go and make their own space, the central body would not survive -- "if you don't like it, go make your own" works as a dismissal precisely because it won't be followed! The world where everyone "goes and makes their own" at the drop of a hat is substantially worse and it is substantially harder to form a productive coalition and get things done under those norms.
In point of fact, doing important things often requires coordination, teamwork, and agreeing to compromises. If you insist on everything being exactly your way, you'll have a harder time finding collaborators, and in many cases that will be fatal to a project -- I do not say all, but many. Now, it's possible to get around that by throwing a lot of money at the problem -- people will agree to a lot of eccentricities if you pay them enough, as they did with Howard Hughes -- and it's possible to get around that by throwing a lot of charisma at the problem -- Steve Jobs was able to be extremely perfectionist thanks to his personal charisma and (in?)famous "reality distortion field" -- but if those options aren't available, you're going to have to make some compromises, and if the norm is "if the way things are locally doesn't work for you, leave and make a new space!" that's going to be very difficult.
Indeed, once you start allowing this sort of "take my ball and go home" behavior, where does it stop? First you have one person who thinks they are being mistreated, and they go and start their own group to work with their rules. Then they try to enforce their rules, and now they drive someone else off, and so on and so on. Pretty soon you have lots and lots of petty little fiefdoms, each composed of just a few people and none of which are getting all that much done. It is better in my view to try to prevent even the first schism and keep things unified.
Yes, this means you'll have to work with people who don't fully agree with you at times, and yes, this means that there will have to be some agreements on how best to use shared institutions and spaces -- but the way I see it, the history of schisms indicates that that is far better than the alternative!