There's someone in my family we're trying to get into rehab in Bangalore, India ASAP. I'm trying to figure out what rehab center would be best to send him to but I have no priors on how to choose one place over another. Any advice on how to choose a good rehab center? Also interested in good research on efficacy of different types of rehab if anyone knows any.
Friston has famously invoked the idea of Markov Blankets for representing agent boundaries, in arguments related to the Free Energy Principle / Active Inference. The Emperor's New Markov Blankets by Jelle Bruineberg competently critiques the way Friston tries to use Markov blankets. But some other unrelated theories also try to apply Markov blankets to represent agent boundaries. There is a simple reason why such approaches are doomed.
This argument is due to Sam Eisenstat.
Consider the data-type of a Markov blanket. You start with a probabilistic graphical model (usually, a causal DAG), which represents the world.
A "Markov blanket" is a set of nodes in this graph, which probabilistically insulates one part of the graph (which we might call the part "inside" the blanket) from another part ("outside" the blanket):
("Probabilistically insulates" means that the inside and outside are conditionally independent, given the Markov blanket.)
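The conditional-independence condition above can be checked numerically on a toy model. The following is a minimal sketch, not anything from Friston or Eisenstat: it assumes a hypothetical three-node causal chain (outside -> blanket -> inside) with made-up binary probabilities, and the names `joint`, `prob`, `is_blanket`, and `independent_without_blanket` are invented for illustration.

```python
from itertools import product

# Hypothetical three-node causal chain: outside -> blanket -> inside.
# All probabilities are made-up illustration values, not from the post.
P_OUTSIDE = {0: 0.7, 1: 0.3}
P_BLANKET_GIVEN_OUTSIDE = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_INSIDE_GIVEN_BLANKET = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}


def joint(o, b, i):
    """Joint probability of one full assignment, factored along the chain."""
    return P_OUTSIDE[o] * P_BLANKET_GIVEN_OUTSIDE[o][b] * P_INSIDE_GIVEN_BLANKET[b][i]


def prob(**fixed):
    """Marginal probability of a partial assignment, e.g. prob(b=1, i=0)."""
    total = 0.0
    for o, b, i in product((0, 1), repeat=3):
        values = {"o": o, "b": b, "i": i}
        if all(values[name] == value for name, value in fixed.items()):
            total += joint(o, b, i)
    return total


def is_blanket(tol=1e-12):
    """True if P(o, i | b) == P(o | b) * P(i | b) for every combination of values."""
    for o, b, i in product((0, 1), repeat=3):
        p_b = prob(b=b)
        lhs = prob(o=o, b=b, i=i) / p_b
        rhs = (prob(o=o, b=b) / p_b) * (prob(b=b, i=i) / p_b)
        if abs(lhs - rhs) > tol:
            return False
    return True


def independent_without_blanket(tol=1e-12):
    """True if outside and inside are independent with nothing conditioned on."""
    for o, i in product((0, 1), repeat=2):
        if abs(prob(o=o, i=i) - prob(o=o) * prob(i=i)) > tol:
            return False
    return True


if __name__ == "__main__":
    print("blanket node screens off inside from outside:", is_blanket())
    print("inside and outside independent unconditionally:", independent_without_blanket())
```

Note that the check never looks at an observed value of any variable: whether a node set is a blanket is a property of the joint distribution itself, which is the sense in which the blanket is a fact about the model.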
So the obvious problem with this picture of an agent boundary is that it only works if the agent takes a deterministic path through space-time. We can easily draw a Markov blanket around an "agent" who just stays still, or who moves with a predictable direction and speed:
But if an agent's direction and speed are ever sensitive to external stimuli (which is a property common to almost everything we might want to call an 'agent'!) we cannot draw a Markov blanket such that (a) only the agent is inside, and (b) everything inside is the agent:
It would be a mathematical error to say "you don't know where to draw the Markov blanket, because you don't know which way the Agent chooses to go" -- a Markov blanket represents a probabilistic fact about the model without any knowledge you possess about values of specific variables, so it doesn't matter if you actually do know which way the agent chooses to go.
The only way to get around this (while still using Markov blankets) would be to construct your probabilistic graphical model so that one specific node represents each observer-moment of the agent, no matter where the agent physically goes. In other words, start with a high-level model of reality which already contains things like agents, rather than a low-level purely physical model of reality. But then you don't need Markov blankets to help you point out the agents. You've already got something which amounts to a node labeled "you".
I don't think it is impossible to specify a mathematical model of agent boundaries which does what you want here, but Markov blankets ain't it.
Although it's arbitrary which part we call inside vs outside.
Drawing Markov blankets wouldn't even make sense in a model that's been updated with complete info about the world's state; if you know the values of the variables, then everything is trivially probabilistically independent of everything else anyway, since known information won't change your mind about known information. So any subset would be a Markov blanket.
Or you could have a more detailed model, such as one node per neuron; that would also work fine. But the problem remains the same; you can only draw such a model if you already understand your agent as a coherent object, in which case you don't need Markov blankets to help you draw a boundary around it.
The last few days have been confusing, chaotic, and stressful. We're still trying to figure out what happened with Sam Altman and OpenAI and what the aftermath will look like.
I have personally noticed my emotions fluctuating more. I have various feelings about the community, about the current state of the world, about the increasingly strong pressures to view the world in terms of factions, about the current state of AIS discourse, and the current state of the AI safety community.
Between now and AGI, there will likely be other periods of high stress, confusion, or uncertainty. I figured it might be a good idea for me to write down some thoughts that I have found helpful or grounding.
If you have noticed feelings of your own, or any strategies that have helped you, I encourage you to share them in the comments.
Frames I find helpful & grounding
1. On whether my actions matter. In some worlds, my actions will not matter. Maybe I am too late to meaningfully affect things. Maybe this is true of my friends, allies, and community as well. In the extreme case, at some point we will pass a "point of no return"– the point where my actions and those of my community no longer have any meaningful effect on the world. I can accept this uncertainty, and I can choose to focus on the worlds where my actions still matter.
2. On not having clear end-to-end impact stories. There are not many things that make a meaningful difference, but there are a few. I know of at least one that I was meaningfully part of, and I know of a few others that my friends & allies were part of. Sometimes, these things will not be clear in advance. (Ex: I wrote the initial draft of a sentence that ended up becoming the CAIS statement, but at the time, I did not realize that was going to be a big deal. It felt like an interesting side project, and I certainly didn't have a clear end-to-end impact story for it. Of course, it is valuable to strive for projects that have ex-ante end-to-end impact stories, and it is dangerous to adopt a "well, IDK why this is good, but hopefully it will work out" mentality. But there is something emotionally reassuring about the fact that sometimes you can pursue things with an incomplete understanding of exactly how it is going to work out.)
3. On friendship. I am lucky to have found friends and allies who are trying to make the world a better place. In the set of all possible lives, I have found myself in one where I am regularly in contact with people who are fighting to make the world better and safer. I can strive to absorb some of Alice's relentless drive to solve problems, Bob's ability to speak with integrity and build coalitions, Carol's deep understanding of technical issues, etc.
4. Gratitude to the community. The AI safety community has provided me a lot: knowledge, motivation, thinking skills, friendships, and concrete opportunities to make the world better. I would not be here without the community. When I reflect on this, I feel viscerally grateful to the community.
5. Criticism of the community. The AI safety community has made mistakes and undoubtedly continues to make important mistakes. I can feel grateful for certain parts of the community while speaking out against others. There is no law that says that the "community" must be fully good or fully bad– and indeed, it is neither.
6. On identifying with the EA or AIS community. I do not have to identify with a community or all parts of it. I can find specific people and projects that I choose to contribute to. I can be aware of how the community impacts me, both positively and negatively. I can try to extract its lessons and best practices while being aware of its dangers. I can be grateful for the fact that I have become a more precise communicator, I have new ways of monitoring my uncertainty, and I speak & think more probabilistically. This can coincide with concerns I have about groupthink and ways in which the community may atrophy my ability to think clearly, say what I believe, or take actions in the world. I know I am not alone in many of these feelings.
7. On dying. I may live a short life due to AGI. If that's the case, I want to live a fulfilling and worthwhile life. I can choose to spend my time striving to develop my virtues, find new tools to see the world more clearly, understand and cultivate integrity, identify new ways of expressing kindness and love, and strive to find actions that make the world safer.
8. On living admirably. I find the "death with dignity" frame helpful, though I think it's too defeatist to properly characterize my epistemic state or my emotional state. I better understand it through the lens of "living admirably". Regardless of my P(doom), I can live a life according to my values and strive to improve my character & wisdom. If my efforts and humanity's efforts are to fail, I would like to get closer to a state where I can say, "I really did try very hard to rise to the occasion."
9. On drama and politics. If I pay attention to drama and politics, I should do so intentionally. Drama and politics are attention-grabbing, and attention is a resource I should deploy wisely. Sometimes, it will make sense to pay attention to drama and politics– other times, it will be best to refocus my attention on meaningful work and leisure activities. I can wait until Thursday before I have an opinion about OpenAI.
10. On my own fallibility. I am a fallible human who has been wrong many times before and will be wrong many times in the future. This does not mean that everything will work out, or that my worries about the world are all incorrect. Uncertainty cuts both ways– things could turn out better or worse than I expected. Nonetheless, I find something emotionally comforting about remembering that my judgments about the world are imperfect. It also reminds me that there is much for me to continue working toward.
Approximately four GPTs and seven years ago, OpenAI’s founders brought forth on this corporate landscape a new entity, conceived in liberty, and dedicated to the proposition that all men might live equally when AGI is created.
Now we are engaged in a great corporate war, testing whether that entity, or any entity so conceived and so dedicated, can long endure.
What matters is not theory but practice. What happens when the chips are down?
So what happened? What prompted it? What will happen now?
To a large extent, even more than usual, we do not know. We should not pretend that we know more than we do.
Rather than attempt to interpret here or barrage with an endless string of reactions and quotes, I will instead do my best to stick to a compilation of the key facts.
(Note: All times stated here are eastern by default.)
Just the Facts, Ma’am
What do we know for sure, or at least close to sure?
Here is OpenAI’s corporate structure, giving the board of the 501c3 the power to hire and fire the CEO. It is explicitly dedicated to its nonprofit mission, over and above any duties to shareholders of secondary entities. Investors were warned that there was zero obligation to ever turn a profit:
Here are the most noteworthy things we know happened, as best I can make out.
- On Friday afternoon at 3:28pm, the OpenAI board fired Sam Altman, appointing CTO Mira Murati as temporary CEO effective immediately. They did so over a Google Meet that did not include then-chairman Greg Brockman.
- Greg Brockman, Altman’s old friend and ally, was removed as chairman of the board but the board said he would stay on as President. In response, he quit.
- The board told almost no one. Microsoft got one minute of warning.
- Mira Murati is the only other person we know was told, which happened on Thursday night.
- From the announcement by the board: “Mr. Altman’s departure follows a deliberative review process by the board, which concluded that he was not consistently candid in his communications with the board, hindering its ability to exercise its responsibilities. The board no longer has confidence in his ability to continue leading OpenAI.”
- In a statement, the board of directors said: “OpenAI was deliberately structured to advance our mission: to ensure that artificial general intelligence benefits all humanity. The board remains fully committed to serving this mission. We are grateful for Sam’s many contributions to the founding and growth of OpenAI. At the same time, we believe new leadership is necessary as we move forward. As the leader of the company’s research, product, and safety functions, Mira is exceptionally qualified to step into the role of interim CEO. We have the utmost confidence in her ability to lead OpenAI during this transition period.”
- OpenAI’s board of directors at this point: OpenAI chief scientist Ilya Sutskever, independent directors Quora CEO Adam D’Angelo, technology entrepreneur Tasha McCauley, and Georgetown Center for Security and Emerging Technology’s Helen Toner.
- Usually a 501c3’s board must have a majority of people not employed by the company. Instead, OpenAI said that a majority did not have a stake in the company, due to Sam Altman having zero equity.
- In response to many calling this a ‘board coup’: “You can call it this way,” Sutskever said about the coup allegation. “And I can understand why you chose this word, but I disagree with this. This was the board doing its duty to the mission of the nonprofit, which is to make sure that OpenAI builds AGI that benefits all of humanity.” AGI stands for artificial general intelligence, a term that refers to software that can reason the way humans do. When Sutskever was asked whether “these backroom removals are a good way to govern the most important company in the world?” he answered: “I mean, fair, I agree that there is a not ideal element to it. 100%.”
- Other than that, the board said nothing in public. I am willing to outright say that, whatever the original justifications, the removal attempt was insufficiently considered and planned and massively botched. Either they had good reasons that justified these actions and needed to share them, or they didn’t.
- There had been various clashes between Altman and the board. We don’t know what all of them were. We do know the board felt Altman was moving too quickly, without sufficient concern for safety, with too much focus on building consumer products, while founding additional companies. ChatGPT was a great consumer product, but it supercharged AI development, counter to OpenAI’s stated non-profit mission.
- OpenAI was previously planning an oversubscribed share sale at a valuation of $86 billion that was to close a few weeks later.
- Board member Adam D’Angelo said in Forbes in January: “There’s no outcome where this organization is one of the big five technology companies. This is something that’s fundamentally different, and my hope is that we can do a lot more good for the world than just become another corporation that gets that big.”
- Sam Altman on October 16: “4 times in the history of OpenAI––the most recent time was in the last couple of weeks––I’ve gotten to be in the room when we push the veil of ignorance back and the frontier of discovery forward. Getting to do that is the professional honor of a lifetime.” There was speculation that events were driven in whole or in part by secret capabilities gains within OpenAI, possibly from a system called Gobi, perhaps even related to the joking claim ‘AI has been achieved internally’ but we have no concrete evidence of that.
- Ilya Sutskever co-leads the Superalignment Taskforce, has very short timelines for when we will get AGI, and is very concerned about AI existential risk.
- Sam Altman was involved in starting multiple new major tech companies. He was looking to raise tens of billions from Saudis to start a chip company. He was in other discussions for an AI hardware company.
- Sam Altman has stated time and again, including to Congress, that he takes existential risk from AI seriously. He was part of the creation of OpenAI’s corporate structure. He signed the CAIS letter. OpenAI spent six months on safety work before releasing GPT-4. He understands the stakes. One can question OpenAI’s track record on safety, and many did, including those who left to found Anthropic. But this was not a pure ‘doomer vs. accelerationist’ story.
- Sam Altman is very good at power games such as fights for corporate control. Over the years he earned the loyalty of his employees, many of whom moved in lockstep, using strong strategic ambiguity. Hand very well played.
- Essentially all of VC, tech, founder, financial Twitter united to condemn the board for firing Altman and for how they did it, as did many employees, calling upon Altman to either return to the company or start a new company and steal all the talent. The prevailing view online was that no matter its corporate structure, it was unacceptable to fire Altman, who had built the company, or to endanger OpenAI’s value by doing so. That it was good and right and necessary for employees, shareholders, partners and others to unite to take back control.
- Talk in those circles is that this will completely discredit EA or ‘doomerism’ or any concerns over the safety of AI, forever. Yes, they say this every week, but this time it was several orders of magnitude louder and more credible. New York Times somehow gets this backwards. Whatever else this is, it’s a disaster.
- By contrast, those concerned about existential risk, and some others, pointed out that the unique corporate structure of OpenAI was designed for exactly this situation. They also mostly noted that the board clearly handled decisions and communications terribly, but that there was much unknown, and tried to avoid jumping to conclusions.
- Thus we are now answering the question: What is the law? Do we have law? Where does the power ultimately lie? Is it the charismatic leader that ultimately matters? Who you hire and your culture? Can a corporate structure help us, or do commercial interests and profit motives dominate in the end?
- Great pressure was put upon the board to reinstate Altman. They were given two 5pm Pacific deadlines, on Saturday and Sunday, to resign. Microsoft’s aid, and that of its CEO Satya Nadella, was enlisted in this. We do not know what forms of leverage Microsoft did or did not bring to that table.
- Sam Altman tweets ‘I love the openai team so much.’ Many at OpenAI respond with hearts, including Mira Murati.
- Invited by employees including Mira Murati and other top executives, Sam Altman visited the OpenAI offices on Sunday. He tweeted ‘First and last time i ever wear one of these’ with a picture of his visitors pass.
- The board does not appear to have been at the building at the time.
- Press reported that the board had agreed to resign in principle, but that snags were hit over who the replacement board would be, and over whether or not they would need to issue a statement absolving Altman of wrongdoing, which could be legally perilous for them given their initial statement.
- Bloomberg reported on Sunday 11:16pm that temporary CEO Mira Murati aimed to rehire Altman and Brockman, while board sought alternative CEO.
- OpenAI board hires former Twitch CEO Emmett Shear to be the new CEO. He issues his initial statement here. I know a bit about him. If the board needs to hire a new CEO from outside that takes existential risk seriously, he seems to me like a truly excellent pick; I cannot think of a clearly better one. The job set for him may or may not be impossible. Shear’s PPS in his note: “Before I took the job, I checked on the reasoning behind the change. The board did *not* remove Sam over any specific disagreement on safety, their reasoning was completely different from that. I’m not crazy enough to take this job without board support for commercializing our awesome models.”
- New CEO Emmett Shear has made statements in favor of slowing down AI development, although not a stop. His p(doom) is between 5% and 50%. He has said ‘My AI safety discourse is 100% “you are building an alien god that will literally destroy the world when it reaches the critical threshold but be apparently harmless before that.”’ Here is a thread and video link with more, transcript here or a captioned clip. Here he is tweeting a 2×2 faction chart a few days ago.
- Microsoft CEO Satya Nadella posts 2:53am Monday morning: We remain committed to our partnership with OpenAI and have confidence in our product roadmap, our ability to continue to innovate with everything we announced at Microsoft Ignite, and in continuing to support our customers and partners. We look forward to getting to know Emmett Shear and OAI’s new leadership team and working with them. And we’re extremely excited to share the news that Sam Altman and Greg Brockman, together with colleagues, will be joining Microsoft to lead a new advanced AI research team. We look forward to moving quickly to provide them with the resources needed for their success.
- Sam Altman retweets the above with ‘the mission continues.’ Brockman confirms. Other leadership to include Jakub Pachocki the GPT-4 lead, Szymon Sidor and Aleksander Madry.
- Nadella continued in reply: I’m super excited to have you join as CEO of this new group, Sam, setting a new pace for innovation. We’ve learned a lot over the years about how to give founders and innovators space to build independent identities and cultures within Microsoft, including GitHub, Mojang Studios, and LinkedIn, and I’m looking forward to having you do the same.
- Ilya Sutskever posts 8:15am Monday morning: I deeply regret my participation in the board’s actions. I never intended to harm OpenAI. I love everything we’ve built together and I will do everything I can to reunite the company. Sam retweets with three heart emojis. Jan Leike, the other head of the superalignment team, tweeted that he worked through the weekend on the crisis, and that the board should resign.
- Microsoft stock was down 1% after hours on Friday, and was back to roughly its previous value on Monday morning and at the open. All priced in. Neither Google nor the S&P made major moves either.
- 505 of 700 employees of OpenAI, including Ilya Sutskever, sign a letter telling the board to resign and reinstate Altman and Brockman, threatening to otherwise move to Microsoft to work in the new subsidiary under Altman, which will have a job for every OpenAI employee. Full text of the letter that was posted:
  To the Board of Directors at OpenAI,
  OpenAI is the world’s leading AI company. We, the employees of OpenAI, have developed the best models and pushed the field to new frontiers. Our work on AI safety and governance shapes global norms. The products we built are used by millions of people around the world. Until now, the company we work for and cherish has never been in a stronger position.
  The process through which you terminated Sam Altman and removed Greg Brockman from the board has jeopardized all of this work and undermined our mission and company. Your conduct has made it clear you did not have the competence to oversee OpenAI.
  When we all unexpectedly learned of your decision, the leadership team of OpenAI acted swiftly to stabilize the company. They carefully listened to your concerns and tried to cooperate with you on all grounds. Despite many requests for specific facts for your allegations, you have never provided any written evidence. They also increasingly realized you were not capable of carrying out your duties, and were negotiating in bad faith.
  The leadership team suggested that the most stabilizing path forward – the one that would best serve our mission, company, stakeholders, employees and the public – would be for you to resign and put in place a qualified board that could lead the company forward in stability. Leadership worked with you around the clock to find a mutually agreeable outcome. Yet within two days of your initial decision, you again replaced interim CEO Mira Murati against the best interests of the company. You also informed the leadership team that allowing the company to be destroyed “would be consistent with the mission.”
  Your actions have made it obvious that you are incapable of overseeing OpenAI. We are unable to work for or with people that lack competence, judgement and care for our mission and employees. We, the undersigned, may choose to resign from OpenAI and join the newly announced Microsoft subsidiary run by Sam Altman and Greg Brockman. Microsoft has assured us that there are positions for all OpenAI employees at this new subsidiary should we choose to join. We will take this step imminently, unless all current board members resign, and the board appoints two new lead independent directors, such as Bret Taylor and Will Hurd, and reinstates Sam Altman and Greg Brockman.
  1. Mira Murati
  2. Brad Lightcap
  3. Jason Kwon
  4. Wojciech Zaremba
  5. Alec Radford
  6. Anna Makanju
  7. Bob McGrew
  8. Srinivas Narayanan
  9. Che Chang
  10. Lillian Weng
  11. Mark Chen
  12. Ilya Sutskever
- There is talk that OpenAI might completely disintegrate as a result, that ChatGPT might not work a few days from now, and so on.
- It is very much not over, and still developing.
- There is still a ton we do not know.
- This weekend was super stressful for everyone. Most of us, myself included, sincerely wish none of this had happened. Based on what we know, there are no villains in the actual story that matters here. Only people trying their best under highly stressful circumstances with huge stakes and wildly different information and different models of the world and what will lead to good outcomes. In short, to all who were in the arena for this on any side, or trying to process it, rather than spitting bile: .
Later, when we know more, I will have many other things to say, many reactions to quote and react to. For now, everyone please do the best you can to stay sane and help the world get through this as best you can.
More drama. Perhaps this will prevent spawning a new competent and funded AI org at MS?
Recently (2 Nov), The Guardian posted what I thought was an extremely well-made video with Ilya's thoughts. I didn't think to repost it at the time, but given the OpenAI developments over the last couple of days, and the complete Twitter and media meltdown surrounding them, I think this video gives a strong vibey insight into Ilya's thoughts on AGI and safety, and it's a useful reference point for how the general public may perceive Ilya (it has had 221k views thus far).
* * *
Transcript (bold highlights mine):
Now AI is a great thing, because AI will solve all the problems that we have today.
It will solve employment, it will solve disease, it will solve poverty, but it will also create new problems.
The problem of fake news is going to be a million times worse, cyber attacks will become much more extreme, we will have totally automated AI weapons. I think AI has the potential to create infinitely stable dictatorships. This morning a warning about the power of artificial intelligence, more than 1,300 tech industry leaders, researchers and others are now asking for a pause in the development of artificial intelligence to consider the risks.
Playing God, scientists have been accused of playing God for a while, but there is a real sense in which we are creating something very different from anything we've created so far. Yeah, I mean, we definitely will be able to create completely autonomous beings with their own goals.
And it will be very important, especially as these beings become much smarter than humans, it's going to be important to have these beings, the goals of these beings be aligned with our goals.
What inspires me? I like thinking about the very fundamentals, the basics. What can our systems not do, that humans definitely do? Almost approach it philosophically.
Questions like, what is learning?
What is experience?
What is thinking?
How does the brain work?
I feel that technology is a force of nature. I feel like there is a lot of similarity between technology and biological evolution. It is very easy to understand how biological evolution works, you have mutations, you have natural selections.
You keep the good ones, the ones that survive and just through this process you are going to have huge complexity in your organisms. We cannot understand how the human body works because we understand evolution, but we understand the process more or less. And I think machine learning is in a similar state right now, especially deep learning, we have a very simple rule that takes the information from the data and puts it into the model, and we just keep repeating this process. And as a result of this process the complexity from the data gets transferred into the complexity of the model. So the resulting model is really complex, and we don't really know exactly how it works you need to investigate, but the algorithm that did it is very simple.
ChatGPT, maybe you've heard of it, if you haven't then get ready. You describe it as the first spots of rain before a downpour. It's something we just need to be very conscious of, because I agree it is a watershed moment. Well ChatGPT is being heralded as a gamechanger and in many ways it is, its latest triumph outscoring people.
A recent study by Microsoft research concludes that GPT4 is an early, yet still incomplete artificial general intelligence system.
Artificial General Intelligence.
AGI, a computer system that can do any job or any task that a human does, but only better.
There is some probability the AGI is going to happen pretty soon, there's also some probability it's going to take much longer.
But my position is that the probability that AGI could happen soon is high enough that we should take it seriously.
And it's going to be very important to make these very smart capable systems be aligned and act in our best interests. The very first AGIs will basically be very, very large data centres. Packed with specialised neural network processors working in parallel. Compact, hot, power-hungry package, consuming like, 10m homes' worth of energy.
You're going to see dramatically more intelligent systems and I think it's highly likely that those systems will have completely astronomical impact on society.
Will humans actually benefit? And who will benefit and who will not?
The beliefs and desires of the first AGIs will be extremely important and so it's important to programme them correctly.
I think that if this is not done, then the nature of evolution, of natural selection, favour those systems that prioritise their own survival above all else.
It's not that it's going to actively hate humans and want to harm them, but it is going to be too powerful, and I think a good analogy would be the way humans treat animals. It's not that we hate animals, I think humans love animals and have a lot of affection for them, but when the time comes to build a highway between two cities, we are not asking the animals for permission, we just do it because it's important for us, and I think by default that's the kind of relationship that's going to be between us and AGIs which are truly autonomous and operating on their own behalf.
Many machine learning experts, people who are very knowledgeable and very experienced, have a lot of scepticism about AGI. About when it could happen and about whether it could happen at all. Right now this is something that just not that many people have realised yet. That the speed of computers for neural networks, for AI, are going to become maybe 100,000 times faster in a small number of years.
If you have an arms race dynamics between multiple teams trying to build the AGI first, they will have less time to make sure that the AGI that they will build will care deeply for humans.
Because the way I imagine it is that there is an avalanche, like there is an avalanche of AGI development.
Imagine it, this huge unstoppable force. And I think it's pretty likely the entire surface of the earth will be covered with solar panels and data centres.
Given these kinds of concerns, it will be important that AGI is somehow built as a cooperation between multiple countries.
The future is going to be good for the AI regardless. It would be nice if it were good for humans as well.
This is a linkpost for https://www.youtube.com/watch?t=6285&v=ICnFtfN-sUc and https://www.youtube.com/watch?v=cw_ckNH-tT8&t=2466s.
Overall, I found their views surprisingly nuanced, including e.g. compared to Sam Altman's.
I am quila, and have been studying alignment for the past year.
After first reading the sequences as others advised, I have been poring over alignment literature every day since late 2022. I've also been discussing subjects and ideas with other alignment researchers via discord, but so far have not shared theory to the broader alignment community.
I think I'm ready to start doing that, so here's a post contextualizing my agenda.
First, I think superintelligence will probably arrive soon. In that case, we may not have enough time to solve alignment from within the 'old framework' of highly optimized agents. Instead, my focus is towards a different (but still pivotal) goal: to enable the safe use of unaligned systems to steer reality.
I hope for this to bring Earth to a point where things are roughly okay, and where we have more time to solve the hard problems of aligning powerful agents.
Without this frame, my future posts may at first appear to some as ill-focused on problems outside that scope, such as myopia, performative prediction, and other concepts yet to be named. I hope that when read with the above focus in mind, there will be a clear connection to this longer-term plan.
Second, I expect superintelligent predictive models to be creatable in the future. Although current predictive models have promising properties, catastrophic failure modes are likely to arise at higher capability levels (e.g., as detailed in 'conditioning predictive models'). My hope here is to develop methods which bridge the safety gap between current and superintelligent models, leaving no free variables whose optimization would affect the world in unexpected ways.
Lastly, a note on why I care to begin with.
I suffered a lot as a human, and came to feel it is dire to minimize suffering in other beings (human, animal, artificial). Solving alignment seems to be the best way to do this in our lightcone and beyond.
There has been some discussion about how future value should be distributed. Although I do have some ideals for what a good universe would look like, they are minor in comparison to my opposition to suffering.
Therefore, I have few worries about issues of who the eventual ASI is aligned to, or whether they 'follow through on the LDT handshake'. As long as the resulting world minimizes the occurrence of devastating forms of suffering, I will be mostly satisfied.
If you're interested in working together, please reach out to me on discord (username: quilalove).
Cross-posted from New Savanna.
Tyler Cowen posted the following tweet over at Marginal Revolution under the title "Solving for the Equilibrium":
I posted this comment:
Earlier in the week Scott Alexander had posted a skeptical review of a Girard book and I commented that, though I'm a Girard skeptic, albeit a somewhat interested one, Tyler regarded him as one of the great 20th century thinkers. In the course of introducing today's Open Thread, Alexander notes: "I would love to know more about Tyler’s interpretation of Girard and the single-victim process. Maybe in the context of recent events?" While we've got lots of recent events to choose from – I'm thinking of the Israel/Palestine mess (ancient Israel, after all, is central to Girard's thinking on this matter) – I suspect the single-victim prompt points to the OpenAI upheaval.
Indeed, my interpretive Spidey sense suggests that a Girardian reading might be illuminating. I'd start with the idea that Sam Altman is the sacrificial victim. His position as leader of OpenAI is a natural focal point for mimetic dynamics. In this case those dynamics ripple far and wide. One might wish, for example, to include the fairly extensive commentary on Altman over at LessWrong, and not just recently. What about Sutskever's role? Just how this inquiry would play out, I do not know. No way to tell about these things until you actually do the work.
From the new interim CEO at OpenAI:
ChatGPT seems to have really awesome voice-to-text ability. However, it seems to only record within ChatGPT itself so can't be used to create notes or type in other programs and it's unclear to me how to best take advantage of the increased technological capabilities.
I'd love to hear about how people integrated the newest voice-to-text capabilities into their workflow.
I run a channel on YouTube. One of its focuses is short feature films about rationality. Previously, we shot a video based on Scott Alexander's essay "ARS LONGA, VITA BREVIS". This time we present to you an adaptation of "The Simple Truth" by Eliezer Yudkowsky. The video is in Russian, but we have added English subtitles using parts of the original text.
What essays would you be interested in seeing adapted in this way?
link to video:
link to original essay:
I was born an optimist.
So far in human history, optimists always win. It is so lucky being an optimist.
Optimists are extremely important in advancing technology and human society. Optimists have brought radical abundance that was unthinkable even 100 years ago.
On the other hand, there were always countless doomers and naysayers:
Technology takes our jobs, reduces our wages, increases inequality, threatens our health, ruins the environment, degrades our society, corrupts our children, impairs our humanity, threatens our future, and is ever on the verge of ruining everything.
None of them were right and they seem so laughable in the rearview mirror.
Every time people talked about the devastating impact of a new technology, I would recall the famous technology rules from Douglas Adams:
Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.
Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.
Anything invented after you’re thirty-five is against the natural order of things.
If we were to learn anything from history, it is that we should always embrace the new technology.
However, we must realize that history was written by survivors. If there were dinosaurian naysayers or Mesopotamian naysayers, they would have been absolutely right about their doomed fate. The only problem is that they didn’t survive to write their history.
The Fermi paradox has shown that humans are at least miraculously lucky to have reached the current stage of civilization. Since luck was in play, we cannot definitively say the past doomers and naysayers were all laughable, as history could have gone the other way and they just didn’t get to write the history.
I sincerely hope all the Great Filters are behind us, but it is unlikely that nothing stands between our civilization and a multiplanetary one. The stakes are so high that everyone should be cautious even if there is only a slight chance the technology would destroy humanity.
Let’s face it: “optimists always win!” has been reinforced again and again in human history, yet no one gets to reflect on it if it ever fails.
In the not-so-long-ago past, Nazi Germany and the US were both rushing to invent the first atomic bomb. Nothing fundamental stopped Nazi Germany from getting to the finish line first. If Nazi Germany had invented the first atomic bomb or, maybe worse, if Nazi Germany and the US had invented atomic bombs in quick succession, how many atomic bombs would have been dropped on the Earth? The results are unimaginable.
Later, in the Cold War, some accidents could have triggered a nuclear war between the two superpowers of the time. Humans are so lucky that it didn’t happen.
I still believe it is important to be an optimist, as technology advances regardless of how many pessimists there are. But being an optimist doesn’t mean blindly embracing and accelerating any technology we build. Being an optimist is preparing for the “lucky” outcome, where the technology safely lands, and driving society towards that outcome.
A quick (ill-informed) thought: it seems like publishing effective interpretability techniques has an effect similar to open sourcing, because it would allow the person using the technique to learn (some of) the model weights. If you think open sourcing is bad, should you also think that publishing effective interpretability techniques is bad?
That's very interesting.
I think it's very good that the board stood its ground, and maybe a good thing that OpenAI can keep focusing on its charter and safe AI and leave commercialization to Microsoft.
People who don't care about alignment can leave for the fat paycheck, while committed ones stay at OpenAI.
What are your thoughts on the implications of this for alignment?
Update: un-paywalled article from the Verge: https://www.theverge.com/2023/11/20/23967515/sam-altman-openai-board-fired-new-ceo
This is still breaking as of 9:30 PT on Sunday, but according to a Bloomberg journalist:
Emmett Shear, co-founder of Amazon-owned video streaming site Twitch, will take over as interim CEO, Sutskever said.
Manifold is not totally convinced:
But I think this is mainly due to the permanent vs. interim issue.
Emmett's twitter: https://twitter.com/eshear
Emmett's appearance on The Logan Bartlett show: https://twitter.com/liron/status/1672986864297578501
Congratulations to Emmett!
Computers are awesome. Also computers really suck… the time out of my day. The same goes for phones and tablets and anything that can access the internet. I've been addicted to way too many different internet activities: Reddit, TV, Instagram, Twitter, Hackernews, product reviews, travel blogs, tech blogs, the EA forum, etc, etc, etc. Even Wikipedia can be a huge time sink for me! Maybe I have less self control or less willpower than others? Regardless, my internet time wasting got pretty bad over the years. Working from home during the pandemic finally sent me off a cliff where I realized I really needed to solve the problem.
So, in late 2020, I started putting together a system of blocks that make my computer and phone helpful rather than destructive. I wasn’t successful immediately. I’ve made many little tweaks in that time and patched up gaps and cracks. But, the result is a system that solves most of my internet addiction problems without completely disabling my devices! Over time, the urge to do addictive things online has also faded a lot, but I have no plans to loosen the controls.
Blocking
The flowchart above describes the algorithm for my blocks. The basic principles are:
- My laptop is almost exclusively for work. Non-work stuff, even if it’s not addictive like a bank website, should be blocked. Having different devices for different purposes really helps to keep me doing what I want to be doing.
- I use a “deep work” whitelist of acceptable apps and websites for times when I really want to get some coding or writing done.
- My phone and tablet can be used for anything else. But very addictive stuff is either completely blocked or is tightly time limited.
- All my electronic devices should stop being useful or fun at night so that I am more likely to go to sleep. I don’t get sleepy at night until very very late. But, if I lie down for ten minutes, I will fall asleep. So if I want to go to bed on time, I need to reduce the attractiveness of staying awake.
- I can violate these rules with permission from a few different people who share a password that unlocks these blocks. For example, if I want to watch a TV show with a friend, we’ll either use their laptop or they will use the password to unblock my tablet for an hour. I do not know the password!
In painstaking detail:
- I use Cold Turkey Blocker for blocking websites and apps on my laptop. I have six block lists in Cold Turkey:
- “Permanent” - These are things that I should never use my computer for. Some obvious suspects are in there: TV shows, Instagram, Strava, Twitter, news websites. But there are also a lot of online shopping, blog, and product review sites. This block is password protected with the password.
- “Non-addictive” - websites that are not work but are also not addictive. Normally I shouldn’t be using my laptop for these tasks. But, if I need to, I can type 150 characters of random text to unlock this block. This is mostly financial and bill-paying related stuff like my bank, credit card websites, airline websites, Mint.
- “Whitelist” - This blocks every website and app except for those on a list that I’ve built over the years. I will typically enable this in an irreversible way for 1-3 hours but sometimes for a whole day. When I first started, I couldn’t enable the whitelist for more than 20 minutes without having trouble getting work done. But, I’ve just noted down sites whenever I find something that I need access to and then I add the site after the current whitelist session is over. Being able to completely block almost all distraction potential for an hour gives me a lot of momentum even after that hour is over.
- “Email” - This is just for email websites and apps. Email can be very addictive! Especially when I’m expecting an important message. Knowing that I absolutely can’t access my email until later in the day is helpful for focusing on other tasks. After a morning email session, I block access to my email until the evening. (“Start and lock for 7 hours”)
- “All apps at night” - This blocks every last app on my computer from 9:15pm until 2am. Just like “Non-addictive”, if I need to, I can type 200 characters of random text to unlock this block.
- “Whitelist at night” - even if I unlock “All apps at night”, the whitelist is also enabled from 9:15pm to 2am. This block is password protected. So, I can only use my laptop for work during that time. If there’s some work emergency, this means I can handle it. Also, I sometimes get very motivated in the night and want to work. I don’t object to that and don’t want to prevent myself from doing that. This double night-time block evolved out of an issue where I would want to work in the evening and would disable my night-time block. Then, I would work from say 10pm to midnight. But, after I was done working, I would take advantage of the unlocked laptop and stay up until 3am wasting time online by doing something like watching a TV show.
- The phone and tablet blocks (synced together) are much simpler. I use the built-in Screen Time tools and allow myself:
- 12 minutes a day on Twitter. I get a lot of value from Twitter but it has a lot of addictive potential.
- 10 minutes a day on Discord.
- 20 minutes a day summed across all apps or websites in a big list of time-sinks like YouTube, Reddit, etc. There are 114 websites on this list right now.
- One hour a day on Safari. This covers the “Do nothing**” in the flowchart. If a website isn’t blocked some other way, I’m still limited to at most an hour on that site. And normally much less time because I will have already used up some Safari time for other purposes.
- Complete blocks on Instagram, a few video games and all TV/Movie streaming services.
- Other web browsers are blocked so I can’t evade the Safari block.
- Most apps are completely blocked after 9:45pm. The exceptions are apps that have no potential to keep me up late and are useful while traveling or out late: Messages, WhatsApp, Weather, Calendar, Uber, etc.
- I should probably block the App Store but having access hasn’t caused me problems yet.
- I maintain and update the lists above by automatically tracking my time use. Lots of other folks have written about this. I use Timing for this on my laptop. On my phone, I use the built-in Screen Time app. Once a week, I look at the websites and apps I’ve used over the last week to see if there’s anything that should be added to a block. If I realize I got distracted by a site, I’ll update the block lists immediately.
- I also use Daily. It pops up a window in the corner of my screen every ten minutes asking what I’m doing. I can respond with one key press whether I’m working (“w”) or not working (“n”). Since my laptop is intended for work, it’s useful to track when I’m using it for non-work purposes and try to adjust the system to push that kind of thing off to a different device. It’s also nice to know how much I’ve worked in the past week or month. “Oh yeah, that’s why I’m tired! Maybe I’ll go for a long run in the mountains tomorrow.” or “Gosh, I didn’t get much done this week. I wonder why?”
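As a rough illustration of how the laptop blocks described above combine, here is a minimal Python sketch. It does not show how Cold Turkey works internally. The function names, the flags, and the example site sets are all hypothetical, and the real lists are much longer. It also simplifies by treating the shared password as a single `password_unlocked` flag.

```python
from datetime import time

# Hypothetical example entries; the real block lists are private and much longer.
PERMANENT = {"instagram.com", "twitter.com"}
NON_ADDICTIVE = {"mybank.example"}
EMAIL = {"mail.example"}
WHITELIST = {"github.com", "docs.python.org"}

NIGHT_START = time(21, 15)  # 9:15pm
NIGHT_END = time(2, 0)      # 2am


def is_night(now: time) -> bool:
    """True inside the 9:15pm-2am window, which wraps past midnight."""
    return now >= NIGHT_START or now < NIGHT_END


def laptop_allows(site, now, *, whitelist_session=False, email_locked=False,
                  night_unlocked=False, non_addictive_unlocked=False,
                  password_unlocked=False):
    """Return True if `site` would be reachable on the laptop at time `now`."""
    # "Permanent": only someone holding the password can lift it.
    if site in PERMANENT and not password_unlocked:
        return False
    # "Non-addictive": lifted by typing a block of random text.
    if site in NON_ADDICTIVE and not non_addictive_unlocked:
        return False
    # "Email": locked for a fixed period after the morning session.
    if site in EMAIL and email_locked:
        return False
    # "Whitelist": a deep-work session blocks everything not on the list.
    if whitelist_session and site not in WHITELIST:
        return False
    if is_night(now):
        # "All apps at night": lifted by typing random text.
        if not night_unlocked:
            return False
        # "Whitelist at night": still applies unless the password is used.
        if site not in WHITELIST and not password_unlocked:
            return False
    return True
```

The point of the two stacked night checks is that lifting the first one still leaves only whitelisted (work) sites reachable until 2am.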
Implementation details that I left out above:
- If you open “App Limits” and try to add a time limit for Safari, you will sadly be unable to find the app! But there’s a workaround. Go back to the main Screen Time page and click “See All Activity”. Then, scroll down to the “Most used” list and select “Safari”. Click “Add Limit” and you’ll be able to set a limit. Source
- Make sure you select “Block at End of Limit” when setting up an App Limit. The default is to just yell at you but not actually block anything.
- I haven’t found a good way to block an app completely in Screen Time in a way that also allows using a password to temporarily unlock the block. Instead I just set a 1 minute App Limit. For most things, one minute is a short enough time to make the app useless. You can’t watch a TV show in one minute.
- To set up my night time block in Screen Time, I set a schedule in “Downtime”, then turn off “Block at Downtime” so that my phone is still usable. Finally, I select which apps are acceptable in “Always Allowed”.
- In Cold Turkey, I give an allowance of five minutes for the night time block. Without this, all my apps would suddenly close at 9:20pm. That’s normally not a big deal, but it is nice to be able to leave some browser or IDE tabs open overnight. Occasionally, killing all my apps is a big deal like when I’m running some code overnight and I didn't remember to launch the job inside tmux.
Technical problems that I’d love a solution for:
- Cold Turkey doesn’t allow a scheduled block to also be temporarily activated. So, I need to keep my night-time whitelist manually in sync with my day-time whitelist.
- Cold Turkey doesn’t have a way to temporarily turn off a block. So if I disable a block, I need to make sure to turn the block back on. In contrast, Screen Time allows choosing to disable a block for either 15 minutes, 1 hour or all day. It's nice that I can't accidentally leave a Screen Time block disabled.
- It’d be nice to be able to request access to an app/website remotely with Screen Time. If I need a Cold Turkey block on my laptop disabled remotely, I can do a screen share and let someone remotely type in the password without telling me the password. But, iOS has no mechanism for remotely controlling a phone like that. So, at the moment, if I’m traveling and need to unblock something, I just use my laptop. I've managed to get through quite a few trips without needing to disable anything!
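For the first problem above, one possible stopgap is a small script that reports where the two whitelists have drifted apart. This is only a sketch. It assumes each whitelist can be copied into plain text with one entry per line, which is an assumption and not a documented Cold Turkey export format, and the function names are made up for illustration.

```python
def parse_entries(text: str) -> set[str]:
    """Turn one-entry-per-line text into a set of lowercase entries."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def whitelist_drift(day_text: str, night_text: str) -> dict[str, list[str]]:
    """Report entries that appear in only one of the two whitelists."""
    day = parse_entries(day_text)
    night = parse_entries(night_text)
    return {
        "missing_from_night": sorted(day - night),
        "missing_from_day": sorted(night - day),
    }


# Hypothetical example lists; real entries would be pasted in from each block.
day_list = "github.com\ndocs.python.org\narxiv.org\n"
night_list = "github.com\narxiv.org\n"
print(whitelist_drift(day_list, night_list))
```

Running something like this after each whitelist update would show which entries still need to be copied across by hand.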
Notes
Reading about the history of some country that I’ve never read about before is an easy three hours. ↩︎
I also don’t read fiction anymore because I just won’t sleep until the book is over. Should I be embarrassed to admit that I read the entire ASOIAF (Game of Thrones) series in less than a week in 2010? I do still listen to fiction audiobooks, but even that can be a bit risky and I mostly reserve it for vacations or to listen to while I do some big home repair project. ↩︎
At one point, Liz went even further than me and completely blocked the internet on her personal laptop unless I explicitly enabled it. Her laptop was not connected to our wifi and only I had the wifi password. So, when she needed to use the internet on her laptop, I went into our router settings and activated a guest wifi network for her. The guest wifi was then active for three hours. Note that she did have a separate work laptop and these wifi blocks only applied to her personal laptop. ↩︎
Actually working at the times that you plan to work is orthogonal to the question of how much you work. ↩︎
My “tablet” is effectively a second laptop because it has a fold-out keyboard and trackpad. This has been important for getting this system to work. Otherwise, I would drift back to using my laptop for non-work tasks that require a lot of typing. I think having a second laptop would be less effective because iPadOS is much easier to irreversibly lock down than a laptop. Also doing a lot of work tasks (coding!) is unpleasant on a tablet so this firms up the work vs not-work separation. ↩︎
I get sleepy at 3pm all the time… ↩︎
The Pro version currently costs $39 but it’s worth like 1000x more than that to me. None of the other blocker apps are at the level of Cold Turkey. If you use something else and like it more, let me know! ↩︎
I would’ve loved to share the contents of all my block lists here, but that would be akin to sharing my entire search history. It’s private! If you want the contents of the whitelist, I am more willing to share that because it’s only work-related websites. Feel free to email and ask. ↩︎
I chose 2am for the stop time so that I have some leeway to wake up early instead of unblocking. Suppose I need to get something done by 7am and it’s not done by 9:15 pm. I can either unblock a device or I can just wake up early. On the flip side, 2am is late enough that I won’t stay up in order to wait out the block. If I had chosen 11pm or 12am, that might’ve been an issue. ↩︎
In grad school, I got most of my best work done between 10pm and 5am. How I “fixed” this deserves its own post. These days, I’m most productive in the morning. But, I still sometimes stay up late working when I don’t have anything important in the morning and I’m motivated. ↩︎
There are work-related talks on YouTube that I might want to watch but normally I can just substitute by reading the corresponding research paper. I'm not a big fan of watching talks anyway. ↩︎
Reddit is incredibly useful for getting opinions. I’m one of those people that appends “reddit” to the end of lots of Google searches. ↩︎
I'm fortunate enough to go to a high-caliber American university. I study math and economics, so not fields that are typically subject to funding constraints or have some shortage of experts. The incentives to become a professor here seem pretty strong--the median professor at my school made over $150k last year, and more than two-thirds of the faculty have tenure. There is, as far as I can tell, little to no oversight as to what they research or what they teach. From the outside, it seems like a great gig.
And yet, most of my professors have been really bad at teaching. It's weird. And I don't just mean that they could be doing a little better. I mean they consistently present things in unclear or inconsistent ways, write exams that are extremely subject to test-taking skill shenanigans, and go off on rambling tangents that lead nowhere. My classes are often poorly designed with large gaps in the curricula or inconsistent pacing throughout the semester. I feel like I've actually learned something in maybe 1/3 of my courses.
I don't want to come across as someone who's just ranting--I'm legitimately confused by this phenomenon and want to figure it out.
Are my standards unreasonably high? I was fortunate enough to go to an excellent high school with some truly fantastic teachers. I also spent a little time in the US competitive math scene and encountered lots of wonderful and smart tutors along the way. And, most of all, I've been fortunate to be involved with a number of extremely good teachers in my time at rat camps, many of whom have pushed the boundaries of pedagogy and content quality. I realize I've been living in quite the intellectual bubble, and I don't want to overlook that.
My counterargument is that there are such strong incentives for professors that schools should be able to find good ones. My high school had a median teacher income of about $55k, and yet I would consider something like 20% of my high school teachers to be better than my university ones. University professors enjoy shorter working hours, the ability to do research, higher job security, and other benefits like housing. I don't understand why the bar for competence is so low.
It's also possible that my teachers just haven't been good for me. Again…maybe? But most of the areas in which my professors have underwhelmed me are really basic things, like not interacting with the audience, arriving late to lectures, or giving exams that are horrible proxies for understanding. I can't imagine any students preferring those to be the case.
Finally, I've heard the explanation that professors are there to do research, not to teach. While this might be true, I don't understand the decision from an institutional perspective. The school makes money and builds a brand name on students/alumni and loses utility on research (if you don't think that's true, name 5 MIT alumni then try to name 5 current professors there). If anything, it seems like universities should only be using research as an incentive to draw great professors, not vice versa.
I know that there are great teachers out there--I've had the pleasure of working with many of them at rat camps, Olympiad camps, and other cool places like that. Why aren't top universities filled with these people?
I spent the weekend doing demolition: I'm redoing the first floor bathroom. I previously did the ones on the second and third floors, so I now feel like I have a bit of practice. Before I did my first one I read a bit about how to do it and what tools people tend to use, and while I don't remember any of it being wrong exactly, I now have much stronger opinions about what equipment is useful and in which cases it's worth getting something nice. So: here's my prioritized list of what I think is most helpful for residential demolition.
Elastomeric p100 respirator. Demolition is incredibly dusty, especially if your house has plaster, and you do not want to be breathing dust. It is possible to get disposable p100s, but the elastomeric ones get a much better seal, and seal is really important.
Ear protection. A lot of this is very noisy, including sudden sharp noises from hammering that are especially bad for your hearing. You want something you can comfortably wear for hours.
Eye protection. I normally only put on (additional) eye protection when I'm doing something especially likely to send out fragments, like hitting things with hammers, but if I didn't normally wear glasses I would probably wear safety glasses the whole time.
Gloves. Otherwise you'll tear up your hands.
Pry bar. I probably spend more than half my time using the pry bar, separating things that have been nailed (or glued) together. I have the Stanley "Wonder Bar" and like that it's shaped so you can hammer it into place.
Thick plastic sheeting and masking tape. If you don't surround where you are working with plastic, you will end up with dust throughout the house. You can also put a layer of plastic on the floor before you start, which is helpful in cleanup though not critical.
Hammer. Often you can use the pry bar alone, but sometimes you need to hammer it into position before prying. Also, sometimes hitting things to loosen them is helpful, and drywall screws are brittle enough to often break off with a good hit.
Reciprocating saw (sawzall). When you're taking things down and tossing it all you want fast and not pretty, and this is the saw for that. I want at least a wood+nails blade (though I try to avoid cutting nails with it since it will go dull very quickly) and a metal cutting blade. On this project I used it for cutting partition wall studs, water damaged oriented strand board, and, with the metal cutting blade, an enameled steel bathtub.
Contractor cleanup bags. These are plastic, and impressively strong: filled with more masonry debris than I can lift, they don't tear. You can put wood with nails in them, and while the nails might poke holes in the bag, they don't rip to bits.
Drill. While it usually makes sense to pry, cut, or break screws, sometimes it's easier to back them out.
Shop vac. Especially good at getting up dust.
Locking pliers (vice grips). Sometimes it is useful to be able to really grab something and twist, and these are much stronger than you are.
Utility knife. Especially useful for separating something you want to keep from something you're going to tear up, since it lets you get a clean line and preserves the finish.
Sledgehammer. Some things just go faster when you hit them really hard.
Circular saw. With a demolition blade, it's very powerful and very fast, but I've almost never used mine for demo. Instead, the reciprocating saw is fine.
Standard pliers. Good for taking out carpet staples, or anything else where you don't need that much force and want quicker repetition than you'd get with locking pliers.
Chisel. Can be useful for getting up tile or cutting something clear, but the pry bar is fine for the former and a saw is usually better for the latter.
Cordless tools. You can get excellent used professional quality corded tools for less than what you'd pay for entry level battery-powered ones. If you're doing this all the time then cordless can be worth it, but if you do a bit every few years the limited lifetime of batteries is a major issue.
Oscillating tool. Some people say these are useful; I've never used one, and in the examples I see people give they seem like only a small improvement.
Other specialized tools. If you're doing the same thing over and over (extracting nails, pulling carpet, removing shingles, etc) then getting a tool specifically for that seems like it would be worth it. I haven't done this, though.
I recently released a podcast episode with Aaron Silverbook, a person within the LW-o-sphere, about his new start-up that produces a bacterium that might cure cavities, and also how cavities work and what's up with the bacteria we all have coating our teeth.