# LessWrong.com News

A community blog devoted to refining the art of rationality
Updated: 36 minutes 34 seconds ago

### What would a 10T Chinchilla cost?

May 3, 2022 - 17:48
Published on May 3, 2022 2:48 PM GMT

I've heard several people say that a 10 Trillion parameter GPT-3-like model, trained with DeepMind's new scaling laws in mind, would be pretty terrifying. I'm curious if anyone could give me a Fermi estimate of the cost of such a thing - if indeed it is feasible at all right now even with an enormous budget.
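
For concreteness, here is a hedged Fermi sketch of my own (every constant below is an assumption, not something from the post): Chinchilla-style training uses roughly 20 tokens per parameter, and training compute is commonly estimated as 6·N·D FLOPs.

```python
# Hedged Fermi sketch: cost of a 10T-parameter Chinchilla-style model.
# All constants are assumptions, not figures from the post.

params = 10e12                      # 10 trillion parameters
tokens = 20 * params                # Chinchilla: ~20 tokens per parameter
train_flops = 6 * params * tokens   # common 6*N*D training-FLOPs estimate

# Assume A100-class hardware: ~312 TFLOP/s peak BF16, ~40% utilization,
# rented at ~$1.50/GPU-hour (bulk cloud pricing; pure guesswork).
flops_per_gpu_hour = 312e12 * 0.40 * 3600
gpu_hours = train_flops / flops_per_gpu_hour
cost = gpu_hours * 1.50

print(f"Training FLOPs: {train_flops:.1e}")
print(f"GPU-hours:      {gpu_hours:.2e}")
print(f"Compute cost:   ${cost / 1e9:.1f}B")
```

Under these guesses the compute alone lands around $40B, and the ~2.7e10 A100-hours also hint at the feasibility problem: even a million GPUs would need about three years of wall-clock time.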

Discuss

### Does the “ugh field” phenomenon sometimes occur strongly enough to affect immediate sensory processing?

May 3, 2022 - 16:42
Published on May 3, 2022 1:42 PM GMT

Related to my comment on the parent question: is there documentation of specific attention minimization and/or blotting-out effects in immediate sensory processing related to past emotional aversion? I suspect the two of those, despite being listed as separate bullet points in the parent question, should be treated separately…

A more generalized form of this seems like it'd be the kind of dissociation that can occur in e.g. PTSD. Do some PTSD sufferers have sharper sensory issues surrounding the brain refusing to recognize certain stimuli?

Discuss

### Why don’t humans learn to not recognize danger?

May 3, 2022 - 15:53
Published on May 3, 2022 12:53 PM GMT

Very short version in the title. A bit longer version at the end. Most of the question is context.

Long version / context:

This is something I vaguely remember reading (I think on ACX). I want to check if I remember correctly/ where I could learn it in more technical detail.

Say you go camping in a desert. You wake up and notice something that might be a scary spider; you take a look and confirm it's a scary spider indeed. This is bad, you feel bad.

Since this is bad, you will be less likely to do things that led to you feeling bad; for example, you'll be less likely to go camping in a desert.

But you probably won't learn to:

• avoid looking at something that might be a scary spider or
• stop recognizing spiders

even though those were much closer to you feeling bad (about being close to a scary spider).

This is a bit weird if you think that humans just learn to get reward: usually you'd expect stuff that happened closer to the punishment to get punished more, not less.

What I recall is that there is a different reward signal for "epistemic" tasks, based on the accuracy or salience of what is recognized, not on whether it's positive or negative.
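
A toy sketch of the puzzle as I understand it (my framing, not the author's): if perception were trained on the same valence signal as behaviour, discounted credit assignment would punish the recognition step hardest, whereas an accuracy-based "epistemic" signal rewards it.

```python
# Toy credit-assignment sketch (my construction, not from the post).

gamma = 0.9          # discount: earlier steps get exponentially less blame
punishment = -1.0
steps = ["go camping", "look at the shape", "recognize spider"]

# Naive valence-based credit assignment: each step receives the
# discounted punishment, so the step nearest the bad outcome is
# suppressed hardest.
blame = {s: punishment * gamma ** (len(steps) - 1 - i)
         for i, s in enumerate(steps)}
print(blame)  # "recognize spider" is blamed most, "go camping" least

# What seems to actually happen: perception is trained on an "epistemic"
# signal (was the percept accurate/salient?), independent of valence.
def epistemic_reward(percept_correct: bool) -> float:
    return 1.0 if percept_correct else 0.0

print(epistemic_reward(True))  # correctly recognizing the spider is rewarded
```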

A bit longer version of the question:

Discuss

### What would be the impact of cheap energy and storage?

May 3, 2022 - 08:20
Published on May 3, 2022 5:20 AM GMT

Imagine fusion technology developed such that the marginal price of an additional unit of energy was ten thousand times cheaper than it is currently.

Further suppose that we invented cheap, safe, lightweight batteries with effectively unlimited storage.

What impact would that have on technology and society?

Discuss

### Monthly Shorts 4/2022

May 3, 2022 - 07:10
Published on May 3, 2022 4:10 AM GMT

Conflict

Elon Musk was once asked about the regulatory situation of providing satellite internet without the local country’s permission. His response was uniquely Muskian:

Elon Musk (@elonmusk), September 1st, 2021: “They can shake their fist at the sky”

Now, it turns out, there are also other options. Dictators can, for example, launch electronic warfare measures against SpaceX’s operations. Fortunately…it turns out that SpaceX is better than the Russians and so Ukrainian internet access continues.

Fun piece on military inter-service conflict (in favor), if that’s your jam.

One of the things I’ve had to grapple with, at my age, is understanding just how meaningful 9/11 is to people older than me. Two months of car crash deaths get shown on TV, and everybody goes completely mad. I go to a panel on national security work, and every single panelist and the moderator says that their inspiration to enter government service was 9/11. The Census Bureau handed over information on Arab neighborhoods to DHS (the story is more complicated than that: DHS seems to be both lying and incompetent and the Census Bureau did something both understandable and legally required, but this is the short version). We passed the Patriot Act, setting up massive denial of civil liberties by means both legal (new authorizations) and structural (empowering a type of agency that cares very little for such things at the expense of Justice and State, which do).

Discuss

### Home Antigen Tests Aren’t Useful For Covid Screening

May 3, 2022 - 04:30
Published on May 3, 2022 1:30 AM GMT

Epistemic status: I strongly believe this is the right conclusion given the available data. The best available data is not that good, and if better data comes out I reserve the right to change my opinion.

EDIT (4/27): In a development I consider deeply frustrating but probably ultimately good, the same office is now getting much more useful information from antigen tests. They aren’t tracking with the same rigor so I can’t compare results, but they are now beating the bar of “literally ever noticing covid”.

In an attempt to avoid covid without being miserable, many of my friends are hosting group events but requiring attendees to take a home covid test beforehand. Based on data from a medium-sized office, I believe testing with the antigen tests people are currently using to be security theater that provides no decrease in risk: antigen tests don’t work for covid screening. There is a more expensive home test available that provides some value, and rapid PCR may still be viable.

It’s important to distinguish between test types here: antigen tests look for viral proteins, and genetic amplification tests amplify viral RNA until it reaches detectable levels. The latter are much more sensitive. Most home tests are antigen tests, with the exception of Cue, which uses NAAT (a type of genetic amplification). An office in the bay area used aggressive testing with both Cue and antigen tests to control covid in the office and kept meticulous notes, which they were kind enough to share with me. Here are the aggregated numbers:

• The office requested daily Cue tests from workers. I don’t know how many people this ultimately included, probably low hundreds? I expect compliance was >95% but not perfect.
• The results are from January when the dominant strain was Omicron classic, but no one got strain tested.
• 39 people had at least one positive Cue test, all of whom were either asymptomatic or ambiguously symptomatic (e.g. symptoms could be explained by allergies) at the time, and 27 of whom had recent negative Cue tests (often but not always the day before, sometimes the same day)
• Of these, 10 definitely went on to develop symptoms, 7 definitely did not, and 18 were ambiguous (and a few were missing data).
• 33 people with positives were retested with Cue tests, of which 9 were positive.
• Of those 24 who tested positive and then negative, 4 tested positive on tests 3 or 4.
• Of the 20 people with a single positive test followed by multiple negative retests, 6 went on to develop symptoms.
• 0 people tested positive on antigen tests. There was not a single positive antigen test across this group. Not only did the antigen tests not catch covid as early as Cue did, they did not catch any cases at all, including at least 2 people who took them while experiencing definitive symptoms.
• Antigen tests were a mix of Binax and QuickVue.
• Early cases took multiple antigen tests over several days, later cases stopped bothering entirely.
• The “negative test while symptomatic” count is artificially low because I excluded people with ambiguous symptoms, and because later infectees didn’t bother with antigen tests.
• I suppose I can’t rule out the possibility that they had an unrelated disease with similar symptoms and a false positive on the Cue test. But it seems unlikely that that happened 10-28 times out of a few hundred people without leaving other evidence.
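
One way to quantify the headline result (my calculation, not the author's): with zero antigen positives among the Cue-positive cases, the "rule of three" gives a rough 95% upper confidence bound on antigen sensitivity of 3/n.

```python
# Rule-of-three sketch on the office data above. Caveats: not all 39
# Cue-positive people took antigen tests, and some Cue positives may have
# been false, so treat this as an order-of-magnitude bound only.

cue_positives = 39        # people with at least one positive Cue test
antigen_positives = 0     # positive antigen tests in the same group

upper_bound = 3 / cue_positives   # ~95% upper bound when 0/n events observed
print(f"Approximate 95% upper bound on antigen sensitivity: {upper_bound:.0%}")
```

Even this ~8% bound is generous, since the denominator should really be the (smaller) subset of people who actually took antigen tests.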

A common defense of antigen tests is that they detect whether you’re contagious at that moment, not whether you will eventually become contagious. Given the existence of people who tested antigen-negative while Cue-positive and symptomatic, I can’t take that seriously.

Unfortunately Cue tests are very expensive. You need a dedicated reader, which is $250, and tests are $65 each (some discount if you sign up for a subscription). A reader can only run 1 test at a time and each test takes 30 minutes, so you need a lot of readers for large gatherings even if people stagger their entrances.
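
To put the pricing in perspective, a back-of-envelope sketch for a hypothetical gathering (the guest count and arrival window are my assumptions; the prices and 30-minute runtime are from the post):

```python
import math

# Cue screening for a hypothetical event (assumed numbers):
guests = 50
arrival_window_hours = 2.0

tests_per_reader = arrival_window_hours / 0.5        # one 30-min test at a time
readers_needed = math.ceil(guests / tests_per_reader)

reader_cost = readers_needed * 250   # $250 per dedicated reader
test_cost = guests * 65              # $65 per test
total = reader_cost + test_cost

print(readers_needed, reader_cost, test_cost, total)
```

Thirteen readers ($3,250 of hardware) plus $3,250 of single-use tests for one 50-person event is why this only pencils out for offices or well-funded gatherings.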

My contact’s best guess is that the aggressive testing reduced but did not eliminate in-office spread, but it’s hard to quantify because any given case could have been caught outside the office, and because they were trying so many interventions at once. Multiple people tested positive, took a second test right away, and got a negative result, some of whom went on to develop symptoms; we should probably assume the same chance of someone testing negative when a second test would have come back positive, and some of those would have been true positives. So even extremely aggressive testing has gaps.

Meanwhile, have I mentioned lately how good open windows and air purifiers are for covid? And other illnesses, and pollution? And that taping a HEPA filter to a box fan is a reasonable substitute for an air purifier achievable for a very small number of dollars? Have you changed your filter recently?

PS. Before you throw your antigen tests out, note that they are more useful than Cue tests for determining if you’re over covid. Like PCR, NAAT can continue to pick up dead RNA for days, maybe weeks, after you have cleared the infection. A negative antigen test after symptoms have abated and there has been at least one positive test is still useful evidence to me.

PPS. I went through some notes and back in September I estimated that antigen testing would catch 25-70% of presymptomatic covid cases. Omicron moves faster, maybe faster enough that 25% was reasonable for delta, but 70% looks obviously too high now.

PPPS. Talked to another person at the office, their take is the Cue tests are oversensitive. I think this fits the data worse but feel obliged to pass it on since they were there and I wasn’t.

PPPPS (5/02): multiple people responded across platforms that they had gotten positive antigen tests. One or two of these were even presymptomatic. I acknowledge the existence proof but will not be updating until the data has a denominator. If you’re doing a large event like a conference, I encourage you to give everyone Cue, antigen, and rapid PCR tests and record their results, and who eventually gets sick. If you’d like help designing this experiment in more detail please reach out (elizabeth-at-acesounderglass.com)

Discuss

### Is evolutionary influence the mesa objective that we're interested in?

May 3, 2022 - 04:18
Published on May 3, 2022 1:18 AM GMT

Epistemic status: this is the result of me trying to better understand the idea of mesa optimizers. It's speculative and full of gaps, but maybe it's interesting and I'm not realistically going to have time to improve it much in the near future.

Humans are often presented as an example of "mesa optimisers" - organisms created to "maximise evolutionary fitness" that end up doing all sorts of other things including not maximising evolutionary fitness and transforming the world in the process. This analogy is usually accompanied by a disclaimer like this:

We do not expect our analogy to live up to intense scrutiny. We present it as nothing more than that: an evocative analogy (and, to some extent, an existence proof) that explains the key concepts.

I am proposing that if we focus on evolutionary "influence" instead of "fitness", we can flip both claims on their head:

• Humans are extremely evolutionarily influential
• We should take the evolution analogy seriously
Evolution is about how things change

I think evolutionary fitness is to some extent not the interesting thing about evolution. If the world was always full of indistinguishable rabbits and always will be full of indistinguishable rabbits, then rabbits are in some sense "fit", but evolution is also a boring theory because all it says is "rabbits". The interesting content of evolution is in what it says about how things change: if the world is full of regular rabbits plus one extra-fit rabbit, then evolution says that in a few years, the world will have few regular rabbits and lots of extra-fit rabbits.

I want to propose a rough definition of evolutionary influence that generalises this idea. There are a few gaps in the definition which I hope can be successfully resolved, but I haven't had the time to do this yet.

First, we need an environment. I currently think of an environment as "a universe at a particular point in time". The universe is:

• A set Ω of configurations that the universe can have at a particular time
• An update rule f:Ω→Δ(Ω) that probabilistically maps the configuration at one point in time to the configuration at the next (Δ(Ω) means "the set of probability distributions on Ω").

Given a time t∈T, the environment μt is a probability distribution on Ω.

An organism is a particular configuration of a "small piece" of a universe. We can specify a function  u:Ω→{0,1} that evaluates whether the universe contains the organism, and u is somehow restricted to evaluating "small parts" of the universe (what I mean by a "small part" is currently a gap in the definition). We can condition an environment μt on the presence of an organism u to get the environment μt|u and similarly μt|∼u is the environment without the organism u.

A feature is a "large piece" of an environment. Like organisms, I'm not sure what I mean by "large piece". In any case, there's a function v:Ω→{0,1} that tells us whether a feature is present, and it must in some sense be "big and obvious".

An organism u at t has a large evolutionary influence at t′>t if the probability of some large feature v is very different in the future environment with the organism, μt|uft′−t, than in the future environment without it, μt|∼uft′−t.

Intuition: If at time t the environment μt is  full of grass and there are also a pair of rabbits u there, then in the future t′ the environment μt|uft′−t will be full of rabbits and have a lot less grass. On the other hand, if there are no rabbits at time t then the future environment μt|∼uft′−t will still be mostly grass.

Intuition 2: Perhaps if humans had not appeared when they did, highly intelligent life would have taken much longer to appear on Earth/never done so.  Then the Earth without humans 300k years ago wouldn't have any cities etc. today.
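
To check that the definitions hang together, here is a minimal toy instance (entirely my construction): configurations are (grass, rabbits) counts, the update rule f is probabilistic, the organism u is "a pair of rabbits is present", and the large feature v is "rabbits outnumber grass".

```python
import random

def step(config):
    # Update rule f: rabbits eat grass and multiply; grass regrows slowly.
    grass, rabbits = config
    if rabbits > 0 and grass > 0 and random.random() < 0.9:
        grass -= 1
        rabbits += 1
    elif random.random() < 0.2:
        grass += 1
    return grass, rabbits

def feature_v(config):
    # "Big and obvious" feature: rabbits outnumber grass.
    grass, rabbits = config
    return rabbits > grass

def prob_v(initial, t, trials=2000):
    # Monte Carlo estimate of the probability of v after t steps.
    hits = 0
    for _ in range(trials):
        c = initial
        for _ in range(t):
            c = step(c)
        hits += feature_v(c)
    return hits / trials

random.seed(0)
with_rabbits = prob_v((10, 2), t=15)   # environment conditioned on u
without = prob_v((10, 0), t=15)        # environment conditioned on ~u
print(with_rabbits, without)
```

The Monte Carlo gap between the two probabilities is the toy analogue of the rabbit pair's "large evolutionary influence": with the organism the feature becomes near-certain, without it the feature never appears.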

Relevance to AI

The fundamental question I'm asking here is: are AI research efforts likely to produce highly influential "organisms"? A second important question is whether this influence is aligned with the creator's aims, but this seems to me to add a lot of complication.

My basic thinking here is that an AI system in training is embedded in two "universes". Think of a large neural network in training. One "universe" in which it lives is the space of network weights, and the update rule is given by the training algorithm and the loss incurred on the data at each step of training. It's not clear that it's meaningful to talk about "influentialness" in this universe. Maybe there is some "small" feature of the initialisation that determines whether it converges to something useful or not, but that is speculative (and I don't know what I mean by "small").

It also exists in the real universe - i.e. it's a configuration in the storage of some computer somewhere. Here there's a more intuitive sense that we can talk about "influentialness" - if it produces useful outputs somehow, people will be excited by their new AI and publish papers about it, create products using it and so forth, whereas if it doesn't then none of that will happen and it will be forgotten.

Given the way neural networks are trained, a trained neural network basically has to be something that performs reasonably well in the training universe. However, influentialness in the real universe trumps performance in the training universe - a real-universe-influential AI that isn't actually good on the training set is still real-universe-influential.

1. There is a training universe that, when run for long enough, produces a highly influential organism in our universe
• An example of this would be a very high-performance reinforcement learner whose reward is based on some "large feature" of our universe
2. AI training that we actually do has a reasonable chance of creating such a training universe
• For example, maybe it's not too hard to repurpose existing techniques (or future developments of them) to create the reinforcement learner mentioned in 1

Speculatively, there is some sense in which we can talk about the compatibility of the training environment and the real universe in the sense that "high performance" in the training environment is correlated with influence in the real universe. For a hypothetically "optimal" reinforcement learner rewarded based on large features of the real universe, this compatibility is maximal. However, even more pragmatic AI training regimes might exhibit high compatibility, and it also might not be so transparent about what kind of influence it is likely to exert. In particular, influence trumps performance on the training objective, so good behaviour assuming optimality with respect to the training objective cannot necessarily be relied on, and this seems to me to be one of the key insights of the idea of mesa-optimisation. On the other hand, it's completely plausible to me that good behaviour assuming optimality could imply good behaviour for near-optimality too. It is also quite mysterious to me how to actually characterise "good behaviour".

Discuss

### Open & Welcome Thread - May 2022

May 3, 2022 - 02:47
Published on May 2, 2022 11:47 PM GMT

If it’s worth saying, but not worth its own post, here's a place to put it.

If you are new to LessWrong, here's the place to introduce yourself. Personal stories, anecdotes, or just general comments on how you found us and what you hope to get from the site and community are invited. This is also the place to discuss feature requests and other ideas you have for the site, if you don't want to write a full top-level post.

If you want to explore the community more, I recommend reading the Library, checking recent Curated posts, seeing if there are any meetups in your area, and checking out the Getting Started section of the LessWrong FAQ. If you want to orient to the content on the site, you can also check out the new Concepts section.

The Open Thread tag is here. The Open Thread sequence is here.

Discuss

### Predicting for charity

May 3, 2022 - 01:59
Published on May 2, 2022 10:59 PM GMT

Excerpted from Above the Fold.

Prediction markets succeed when they require people to bet something they value, like money. But past attempts at real-money prediction markets like Intrade have been shut down by the CFTC. At Manifold Markets, we currently allow users to trade on Manifold Dollars (aka M$ or “mana”), an in-game currency specific to our platform. But one of the most common criticisms we hear is: “Why should I care about trading fake currency?” It would be nice to find a middle ground where users can bet something of real value which doesn’t run afoul of financial regulations.

Thinking about this, we were inspired by donor lotteries: if you can gamble with charitable donations, shouldn’t you be able to make bets with them? Thus, Manifold for Good was born. You can now donate your M$ winnings to charity! Through the month of May, every M$100 you contribute turns into USD$1 sent to your chosen charity - we’ll cover all processing fees!

Why?

Manifold for Good solves two problems with today’s prediction markets. First, it allows you to bet using something valuable to you (i.e. donations to your favorite charity), which increases the incentive to bet correctly, relative to just virtual points. Second, it respects existing financial regulations, which has proven difficult for prediction markets in the past.

By providing an entertaining and impactful way to allocate money to charity, Manifold for Good can also increase the total amount of money donated. Just as donors participating in a charity bingo night are willing to pay extra for the value of entertainment, so too can Manifold’s markets provide a fun, motivating reason to participate in charitable activities.

What’s Next?

Think of Manifold for Good as an experiment! We’re seeing what the level of demand is for this kind of redemption for Manifold Dollars; let us know if you have any thoughts or suggestions.

In the future, we’d like to grow the program to increase the number of available charities. We currently support 30+ charities; if you have a charity recommendation, let us know and we’ll pay a M$500+ bounty once we add it! We’d also love to offer donation matching to cause areas or charities — we think this would get users even more excited about forecasting and donating!

If you would like to partner with us to fund an experiment like this, or be featured as a charity, please get in touch at give@manifold.markets.

Finally: one HUGE shoutout to Sam Harsimony and Sinclair Chen for leading the effort to build out Manifold for Good!

Note: we are not affiliated with most of these charities, other than being fans of their work. As Manifold itself is a for-profit org, your M$ contributions will not be tax-deductible.

Bonus: our codebase is now open source!

At Manifold, we’ve always aimed to be transparent about the way we do things. Some examples:

I’m excited to say that we’ve taken the next step forward: open sourcing our entire codebase! Check out our Github repo here.

Don’t forget to ~~like and subscribe~~ leave a Github Star!

This effort was spearheaded by Marshall Polaris, to whom the whole Manifold community owes a huge debt of gratitude. Marshall has been behind the scenes, doing the necessary work to prepare us for this big step forward:

• Ensuring our user data is secure against attackers
• Expanding our documentation to help new contributors get up and running
• Improving our processes to scale up with many new potential contributors

I can’t wait to see what features you build, bugs you fix, or projects you start, now that our code is open to you!

— Austin

Kudos to Justis Millis for reviewing a draft of this post.

Discuss

### Information security considerations for AI and the long term future

May 2, 2022 - 23:54
Published on May 2, 2022 8:54 PM GMT

Summary

This post is authored by Jeffrey Ladish, who works on the security team at Anthropic, and Lennart Heim, who works on AI Governance with GovAI (more about us at the end). The views in the post are our own and do not speak for Anthropic or GovAI. This post follows up on Claire Zabel and Luke Muehlhauser’s 2019 post, Information security careers for GCR reduction.

We’d like to provide a brief overview on:

1. How information security might impact the long term future
2. Why we’d like the community to prioritize information security

In a following post, we will explore:

1. How you could orient your career toward working on security

Tl;dr:

• New technologies under development, most notably artificial general intelligence (AGI), could pose an existential threat to humanity. We expect significant competitive pressure around the development of AGI, including a significant amount of interest from state actors. As such, there is a large risk that advanced threat actors will hack organizations — that either develop AGI, provide critical supplies to AGI companies, or possess strategically relevant information — to gain a competitive edge in AGI development. Limiting the ability of advanced threat actors to compromise organizations working on AGI development and their suppliers could reduce existential risk by decreasing competitive pressures for AGI orgs and making it harder for incautious or uncooperative actors to develop AGI systems.
What is the relevance of information security to the long term future?

The bulk of existential risk likely stems from technologies humans can develop. Among candidate technologies, we think that AGI, and to a lesser extent biotechnology, are most likely to cause human extinction. Among technologies that pose an existential threat, AGI is unique in that it has the potential to permanently shift the risk landscape and enable a stable future without significant risks of extinction or other permanent disasters. While experts in the field have significant disagreements about how to navigate the path to powerful aligned AGI responsibly, they tend to agree that actors that seek to develop AGI should be extremely cautious in the development, testing, and deployment process, given that failures could result in catastrophes, including human extinction.

NIST defines information security as “The protection of information and information systems from unauthorized access, use, disclosure, disruption, modification, or destruction.”

We believe that safe paths to aligned AGI will require extraordinary information security effort for the following reasons:

• Insufficiently responsible or malicious actors are likely to target organizations developing AGI, software and hardware suppliers, and supporting organizations to gain a competitive advantage.
• Thus, protecting those systems will reduce the risk that powerful AGI systems are developed by incautious actors hacking other groups.

Okay, but why is an extraordinary effort required?

• Plausible paths to AGI, especially if they look like existing AI systems, expose a huge amount of attack surface because they’re built using complex computing systems with very expansive software and hardware supply chains.
• Securing systems as complex as AGI systems is extremely difficult, and most attempts to do this have failed in the past, even when the stakes have been large, for example, the Manhattan Project.
• The difficulty of defending a system depends on the threat model, namely the resources an attacker brings to bear to target a system. Organizations developing AGI are likely to be targeted by advanced state actors who are amongst the most capable hackers.

Even though this is a challenging problem requiring extraordinary effort, we think the investments are worth pursuing, and the current measures are insufficient.

What does it mean to protect an AI system?

It can be helpful to imagine concrete scenarios when thinking about AI security — the security of AI systems. Gwern recently wrote a compelling fictional take on one such scenario, including some plausible infosec failures, which we recommend checking out.

Infosec people often think about systems in terms of attack surface. How complex is the system, how many components does it have, how attackable is each component, etc? Developing a modern AI system like GPT-3 involves a lot of researchers and developers -- the GPT-3 paper had 31 authors! Each developer has their own laptop, and needs some amount of access to the models they work on. Usually, code runs on cloud infrastructure, and there are many components that make up that infrastructure. Data needs to be collected and cleaned. Researchers need systems for training and testing models. In the case of GPT-3, an API is created to grant limited access for people outside the company.

Figure 1: Overview of the active components in the development of an ML system. Each introduces more complexity, expands the threat model, and introduces more potential vulnerabilities.

Most of the components described here contain a staggering amount of complexity (see Figure 1). For example, the Linux kernel alone contains over 20 million lines of code. Each piece of hardware and software is a component that could be exploited. The developer could be using a malicious browser plugin that steals source code, or their antivirus software could be silently exfiltrating critical information. The underlying cloud infrastructure could be compromised, either because of underlying exploits in the cloud provider or because of misconfigurations or vulnerable software introduced by the organization.

And these are just the technical components. What’s not depicted are the humans creating and using the software and hardware systems. Human operators are generally the weakest part of a computing system. For example, developers or system administrators could be phished, phones used for multifactor authentication systems could be SIM-swapped, or passwords for key accounts could be reset by exploiting help desk employees. This is generally described as social engineering. In short, modern AI systems are very complex systems built by many people, and thus are fundamentally difficult to secure.

Threat Actors and AGI

Securing a modern AI project against attacks from general cybercriminals — such as ransomware operators, extortionists, etc. — is difficult but not extraordinarily difficult. Securing a system as complex as a modern AI system against a state actor or an Advanced Persistent Threat (APT) actor is extremely difficult, as the examples in the reference class failures section demonstrate. We believe it is quite likely that state actors will increasingly target organizations developing AGI for the reasons listed below:

• Near-term AI capabilities may give states significant military advantages
• State-of-the-art AI labs have valuable IP and are thus a rich target for state-sponsored industrial espionage
• Concepts of the strategic value of AGI are present and accessible in our current culture

The US, China, Russia, Israel, and many other states are currently acting as if modern and near-term AI capabilities have the potential to improve strategic weapons systems, including unmanned aerial vehicles (UAVs), unmanned underwater vehicles (UUVs), submarine detection, missile detection systems, and hacking tools.

Some of these areas have already been the subject of successful state hacking activities. Open Philanthropy Project researcher Luke Muehlhauser compiles several examples in this document, including: “In 2011-2013, Chinese hackers targeted more than 100 U.S. drone companies, including major defense contractors, and stole designs and other information related to drone technology…  Pasternack (2013); Ray (2017).”

It’s not evident that espionage aimed at strategic AI technologies will necessarily target labs working on AGI if AGI itself is not recognized as a strategic technology, perhaps because it is perceived as too far out to be beneficial. However, it seems likely that AGI labs will develop more useful applications as they get closer to capable AGI systems, and that some of these applications will touch on areas states care about. For example, China used automated propaganda to target Taiwan elections in 2018 and could plausibly be interested in stealing language models like GPT-3 for this purpose. Code generation technologies like Codex and AlphaCode might have military applications as well, perhaps as hacking tools.

In addition to the direct military utility of AI systems, state-sponsored espionage is also likely to occur for the purpose of economic competition. AI companies have raised $216.6B in investment, with at least a couple of billion raised by companies specifically trying to pursue AGI development, which makes them a valuable economic target. As already outlined, stealing IP for AGI development itself, or for the critical resources it requires, is of interest to malicious actors such as nations or companies. If IP violations are hard to detect or enforce, industrial espionage becomes especially attractive.

The most concerning reason that states may target AGI labs is that they may believe AGI is a strategically important technology. Militaries and intelligence agencies have access to the same arguments that convinced the EA community that AI risk is important. Nick Bostrom discusses the potential strategic implications of AGI in Superintelligence (2014, ch. 5). As with the early development of nuclear weapons, states may fear that their rivals will develop AGI before them. Note that it is the perception of state actors that matters in these cases: states may perceive new AI capabilities as having strategic benefits even if they do not. Likewise, if AI technologies have the potential for strategic impact but state actors do not recognize this, then state espionage is less of a concern.

Other targets

Besides organizations that develop AGI systems, we think that organizations that either (a) supply critical resources to AGI labs or (b) research and develop AGI governance strategies are also at risk:

• Suppliers of critical resources, such as compute infrastructure or software, are of significant relevance for AGI organizations. These supplies often come from external vendors, and targeting vendors is a common practice for state actors trying to gain access to high-value targets. In addition, stealing intellectual property (IP) from external vendors could boost a malicious actor's capabilities for developing relevant resources (which it might previously have been excluded from due to export bans).
• We think that AI governance researchers may also be targets if they are perceived to possess strategic information about AI systems in development. State actors are likely especially interested in obtaining intelligence on national organizations and those relevant to the military.

Reference Class Failures

In trying to understand the difficulty of securing AI systems, it is useful to look at notable failures in securing critical information and ways to mitigate them. There is no shortage of examples — for a more comprehensive list, see this shared list of examples of high-stakes information security breaches by Luke Muehlhauser. We want to present some recent examples, moving along the spectrum from highly technical state-sponsored attacks to relatively simple but devastating social engineering attacks.

One recent example of a highly technical attack is the Pegasus 0-click exploit. Developed by the NSO Group, a software firm that sells “technology to combat terror and crime to governments only”, this attack enabled actors to gain full control of a victim’s iPhone (reading messages and files, eavesdropping, etc.). It was used to spy on human rights activists and was also connected to the murder of Jamal Khashoggi. As outlined in this blogpost by Google’s Project Zero, the attack was highly sophisticated and required significant resources to develop — costing state actors millions of dollars to purchase.

On the other end of the spectrum, we have the Twitter account hijacking of 2020, where hackers gained access to internal Twitter tools by manipulating a small number of employees into giving up their credentials. This allowed them to take over the Twitter accounts of prominent figures such as Elon Musk and Bill Gates — and all of this was probably done by a few teenagers.

Another recent attack, which also likely relied heavily on social engineering tactics, is the hack of NVIDIA by the Lapsus$ group. This attack is especially interesting because NVIDIA is an important actor in developing AI chips. Its leaked intellectual property (IP) could accelerate competitors’ efforts to develop more powerful AI chips while actively violating others’ IP. Notably, while NVIDIA is a target of interest to state actors, there are early indications that this attack too may have been conducted by a group of teenagers.

The Twitter account hijacking and the recent NVIDIA hack are notable because the resources required for these attacks were relatively small. More significant effort and good security practices could have mitigated these attacks, or at least made them significantly more expensive. Companies of strategic importance, including (for the purposes of this article) organizations relevant to the development of AGI systems, should have security comparable to that of national governments, given their importance for (national) security.

The applicability of the security mindset to alignment work

Good engineering involves thinking about how things can be made to work; the security mindset involves thinking about how things can be made to fail.

In addition to securing information that could play an outsized role in the future of humanity, we think that the information security field also showcases ways of thinking that are essential for AI alignment efforts. These ideas are outlined in Eliezer Yudkowsky’s post Security Mindset and Ordinary Paranoia and its follow-up, Security Mindset and the Logistic Success Curve. The security mindset is the practice of looking at a system through the lens of adversarial optimization: not just looking for ways to exploit the system, but looking for systemic weaknesses that might be abused even if the path to exploitation is not obvious. In the second post, Eliezer describes the kind of scenario where the security mindset is crucial to building a safe system:

…this scenario might hold generally wherever we demand robustness of a complex system that is being subjected to strong external or internal optimization pressures… Pressures that strongly promote the probabilities of particular states of affairs via optimization that searches across a large and complex state space…Pressures which therefore in turn subject other subparts of the system to selection for weird states and previously unenvisioned execution paths… Especially if some of these pressures may be in some sense creative and find states of the system or environment that surprise us or violate our surface generalizations…

This scenario describes security-critical code like operating system kernels, and describes AGI systems even more so. AGI systems are incredibly dangerous because they are powerful, opaque optimizers, and human engineers are unlikely to have a good understanding of the scope, scale, or target of their optimization power as these systems are being created. We believe this task is likely to fail without a serious application of the security mindset.

Applying the security mindset means going beyond merely imagining ways your system might fail. For example, if you’re concerned your code-writing model-with-uncertain-capabilities might constitute a security threat, and you decide that isolating it in a Docker container is sufficient, then you have failed to apply the security mindset. If your interpretability tools keep showing that your model says things its internal representations indicate are false, and you decide to use the output of those interpretability tools to train the model to avoid falsehoods, then you have failed to apply the security mindset.

The security mindset is hard to learn and harder to teach. Still, we think that more interplay between the information security community and the AI alignment community could help more AI researchers build skills in this area and improve the security culture of AI organizations.

Building the relevant information security infrastructure now

We think that labs developing powerful AGI systems should prioritize building secure information systems to protect against the theft and abuse of these systems. In addition, we also think that other relevant actors, such as organizations working on strategic research or critical suppliers, are increasingly becoming a target and also need to invest in information security controls.

Our appeal to the AI alignment and governance community is to take information security seriously now, in order to build a firm foundation as threats intensify. Security is not a feature that’s easy to tack on later. There is ample low-hanging fruit: using up-to-date devices and software, end-to-end encryption, strong multi-factor authentication, etc. Setting up these controls is an excellent first step with a high return on investment. However, robust information security controls will require further investment and difficult tradeoffs. People are usually the weak point of information systems, so training and background checks are essential.
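As one concrete illustration of the low-hanging fruit above: the time-based one-time passwords (TOTP) used by most authenticator apps for multi-factor authentication are derived deterministically from a shared secret, as specified in RFC 6238. A minimal sketch using only Python's standard library (illustrative only; use a vetted library in practice rather than rolling your own):

```python
import base64, hashlib, hmac, struct, time

def totp(secret_b32, t=None, period=30, digits=6):
    """Derive a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    key = base64.b32decode(secret_b32, casefold=True)
    # Counter = number of whole time-steps since the Unix epoch
    counter = int((time.time() if t is None else t) // period)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation per RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 Appendix B test vector: ASCII seed "12345678901234567890", T = 59s
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", t=59, digits=8))  # 94287082
```

The point of the example is that this second factor depends only on the secret and the clock, which is why the Lapsus$-style attacks described below targeted personal accounts and devices holding such secrets rather than trying to break the cryptography.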

Information security needs are likely to become much more demanding as AGI labs, and those associated with AGI labs, are targeted by increasingly persistent and sophisticated attacks. In the recent Lapsus$ attacks, personal accounts were often targeted to gain access to 2FA systems and thereby compromise company accounts. Tools like Pegasus, already utilized by agencies in dozens of countries, could easily be leveraged against those working on AI policy and research.

While we could make many more specific security recommendations, we want to emphasize the importance of threat awareness and investment in secure infrastructure, rather than any specific control. That being said, if you do have questions about specific security controls, or want help making your org more secure, feel free to reach out! Part of our goal is to help organizations just starting to think about security get started, as well as to help existing organizations ramp up their security programs.

Conclusion

In this post, we’ve presented our argument for the importance of information security for the long-term future. In the next post, we’ll give some concrete suggestions for ways people could contribute to the problem, including:

• 1) how to know if an infosec career is a good idea for you
• 2) how to orient your career toward information security, and
• 3) how others working to reduce AI risk can acquire and incorporate infosec skills into their existing work

In the meantime, you can engage with others in related discussions in the Information Security in Effective Altruism Facebook group.

About us

• About Jeffrey: I work on the security team at Anthropic (we’re hiring!), and am also working on AI security field building and strategy. I’ve worn several hats in my infosec career: I worked as a security engineer at Concur Technologies, led security efforts for the cryptocurrency company Reserve, and then started my own security consultancy business. I also spent a couple of years exploring existential and catastrophic risks from nuclear weapons and biotechnology. You can find some of my work here: https://jeffreyladish.com
• About Lennart: I work at the intersection of AI hardware and AI governance. Before that, I studied computer engineering and have a long-standing interest in security (though I have never worked professionally full-time in this field). I used to work on security as a research assistant in wireless and networked systems, and in my leisure time, mostly on embedded systems and webservers.

Acknowledgements

Many thanks to Ben Mann, Luke Muehlhauser, Markus Anderljung, and Leah McCuan for feedback on this post.

Discuss

### My thoughts on nanotechnology strategy research as an EA cause area

2 мая, 2022 - 21:07
Published on May 2, 2022 5:57 PM GMT

This is a cross-post from the Effective Altruism Forum.

Two-sentence summary: Advanced nanotechnology might arrive in the next couple of decades (my wild guess: there’s a 1-2% chance in the absence of transformative AI) and could have very positive or very negative implications for existential risk. There has been relatively little high-quality thinking on how to make the arrival of advanced nanotechnology go well, and I think there should be more work in this area (very tentatively, I suggest we want 2-3 people spending at least 50% of their time on this by 3 years from now).

Context: This post reflects my current views as someone with a relevant PhD who has thought about this topic on and off for roughly the past 20 months (something like 9 months FTE). Note that some of the framings and definitions provided in this post are quite tentative, in the sense that I’m not at all sure that they will continue to seem like the most useful framings and definitions in the future. Some other parts of this post are also very tentative, and are hopefully appropriately flagged as such.
Key points

• I define advanced nanotechnology as any highly advanced future technology, including atomically precise manufacturing (APM), that uses nanoscale machinery to finely image and control processes at the nanoscale, and is capable of mechanically assembling small molecules into a wide range of cheap, high-performance products at a very high rate (note that my definition of advanced nanotechnology is only loosely related to what people tend to mean by the term “nanotechnology”). (more)
• If developed, advanced nanotechnology could increase existential risk, for example by making destructive capabilities widely accessible, by allowing the development of weapons that pose a higher existential risk, or by accelerating AI development; or it could decrease existential risk, for example by causing the world’s most destructive weapons to be replaced by weapons that pose a lower existential risk. (more)
• Timelines for advanced nanotechnology are extremely uncertain and poorly characterised, but the chance it arrives by 2040 seems non-negligible (I’d guess 1-2%), even in the absence of transformative AI. (more)
• It seems likely that there’d be a long period of development with clear warning signs before advanced nanotechnology is developed, pushing against prioritising work in this area and pushing towards a focus on monitoring and foundational work. (more)
• There has been relatively little high-quality nanotechnology strategy work, and by default this seems unlikely to change much in the near future. (more)
• It seems possible to make progress in this area, for example by clarifying timelines, tracking potential warning signs of accelerating progress, and doing strategic planning. (more)
• Overall, I think that nanotechnology strategy research could be very valuable from a longtermist EA perspective. Currently, my extremely rough, unstable guess is that we should have 2-3 people spending at least 50% of their time on this by 3 years from now (against a background of perhaps 0-0.5 FTE over the past 5 years or so). (more)
• Note that it seems that we don’t want to accelerate progress towards advanced nanotechnology because of (i) the dramatic but highly uncertain net effects of the technology, including the possibility of very bad outcomes, (ii) the plausible difficulty of reversing an increase in the rate of progress, and (iii) the option of waiting to gain more information. (Though note that I still feel a bit confused about how harmful various forms of accelerating progress might be, and I’d like to think more carefully about this topic.) (more)

Introduction

This post has two main goals:

1. To provide a resource that EA community members can use to improve their understanding of advanced nanotechnology and nanotechnology strategy.
2. To make a case for nanotechnology strategy research being valuable from a longtermist EA perspective, in order to get more people to consider working on it.

If you’re mostly interested in the second point, feel free to quickly skim through the first parts of the post, or maybe to skip directly to How to prioritise nanotechnology strategy research.

Definitions and intuitions

In this section, I introduce some concepts that seem useful for thinking about nanotechnology strategy. I’ll refer to these in various places throughout the rest of the post. Note that, although the focus of nanotechnology strategy research is ultimately advanced nanotechnology, I start by describing atomically precise manufacturing (APM), which I consider to be a particular version of advanced nanotechnology. I do this because APM is a far more concrete and well-explored technology than advanced nanotechnology, and because I like to refer to APM to help describe what I mean by advanced nanotechnology.

Defining atomically precise manufacturing (APM)

I think the term APM is often used in a fairly vague way, and doesn’t have a widely accepted precise definition, so for the purposes of this post I’ll introduce some more precisely defined concepts that pick out important aspects of what people refer to as APM. These are:

• Core APM: A system of nanoscale atomically precise machinery that can mechanically assemble small molecules into useful atomically precise products.
• Complex APM: Core APM that is highly complex, made up of very stiff components, operates in a vacuum, and performs many operations per second; that is joined to progressively larger high-performance assembly systems in order to allow a range of product sizes; where the assembly machines and products are perhaps only mostly atomically precise; and where the assembly method is perhaps only mostly mechanical assembly.
• Consequential APM: Complex APM that can create a wide range of complex, very high performance products at very low cost ($1000/kg or less) and with very high throughput (1kg of product per 1kg of assembly machinery in 3 days or less).

For more detailed definitions of these terms, see the Appendix section APM definitions in more detail.

The APM concept originated with Eric Drexler.[1] Complex APM roughly corresponds to the technical side of Drexler’s vision for APM, while consequential APM describes a watered-down version of the capabilities Drexler describes for APM.[2],[3] In what follows, I’ll use these terms when I want to be specific, and I’ll use the term “APM” when I want to point to the wider concept.

Intuitions for why APM might be possible and impactful

This section provides some quick intuitions for why you might think that consequential APM is possible and for why atomic precision might be desirable when building very high performance assembly machines and products.

We know that core APM is feasible, i.e. that atomically precise nanoscale machines can be used to do mechanical assembly of small molecules into useful atomically precise products, because examples of core APM exist in nature. For example, ribosomes are atomically precise nanoscale machines[4] that join amino acids to create proteins, which are themselves atomically precise.[5] (Note that we don’t have similar “existence proofs” for complex or consequential APM.)

Atomic precision might be desirable for nanoscale assembly systems and products because, on the nanoscale, the atomic building blocks might be 1/100th or 1/1000th the width of the structure, so that you need atomic precision unless you design for very wide tolerances in structural parts. Atomic precision also gives you perfectly faithful realisations of your design (provided there are no manufacturing errors).

A manufacturing capability that can exactly reproduce a target structure down to the last atom might intuitively seem able to produce products with performance dramatically exceeding current capabilities. Naturally evolved systems often outperform present-day artificial ones on important dimensions,[6] showing that we have some way to go before we reach performance ceilings. In addition, these evolved systems are composed of nanoscale machines and structures (albeit not always atomically precise ones), which are themselves a product of nanoscale manufacturing, suggesting that this is a powerful scheme for producing high performance products.

Nanoscale machines might be able to achieve high throughput because stiff nanoscale machines moving small distances can operate with very high frequency, and because 1cm³ of nanoscale assembly systems can together perform vastly more operations per second than a single machine of size 1cm³.[7],[8]
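To put rough numbers on the parallelism claim (the machine size here is an assumed illustrative figure, not taken from the text): if each assembly machine were on the order of 100 nm across, a 1 cm³ block would contain roughly 10¹⁵ of them working in parallel.

```python
# Illustrative Fermi estimate (assumed machine size, not from the source text):
# how many ~100 nm assembly machines tile a 1 cm^3 volume?
machine_side_m = 100e-9  # assumed: each machine is 100 nm on a side
block_side_m = 1e-2      # the 1 cm block from the text

per_side = round(block_side_m / machine_side_m)  # 100,000 machines per edge
total_machines = per_side ** 3
print(f"{total_machines:.1e}")  # 1.0e+15
```

Even if each nanoscale machine performed only a modest number of operations per second, some 10¹⁵ of them operating in parallel would vastly outperform a single 1 cm³ machine, which is the intuition the paragraph appeals to.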

This high throughput in turn suggests cheap products, for the following reasons:

• A complex APM system might be able to do efficient processing of input materials into whatever high-quality building blocks are required just as easily as it can do highly efficient assembly of those building blocks, and so be able to accept cheap and abundant input materials (only requiring the presence of the necessary chemical elements).
• Similarly, we might expect efficient processing to allow for harmless, easily manageable waste.
• A complex APM system might itself be cheap if it can manufacture copies of itself.

We can also appeal to naturally evolved systems, which seem to often be significantly cheaper to manufacture (at least based on a rough comparison of energy cost) than artificial ones, showing that nanoscale manufacturing systems can do cheap manufacturing.[9]
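The throughput bound in the consequential-APM definition (1 kg of product per 1 kg of assembly machinery in 3 days or less) also implies rapid scaling if the machinery is directed at building copies of itself, which is the self-replication argument for cheapness above. A toy calculation:

```python
# Toy doubling calculation using the consequential-APM throughput bound:
# 1 kg of machinery produces 1 kg of product (here: more machinery)
# every 3 days, so the machinery stock doubles every 3 days.
machinery_kg = 1.0
days = 0
while machinery_kg < 1e9:  # grow from 1 kg to a million tonnes
    machinery_kg *= 2
    days += 3
print(days)  # 90 days (30 doublings)
```

Under these assumptions the stock of assembly systems grows from 1 kg to a billion kg in about three months, which is why the marginal cost of the machinery itself could plausibly fall towards the cost of inputs and energy.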

For a more detailed discussion, see Appendix section More intuitions for why APM might be possible and impactful.

As far as I know, EA efforts in this area have been focused on APM. I consider APM to be a remarkably concrete vision for future nanotechnology, and an extremely useful thing to analyse, but I tentatively propose that we slightly broaden the scope of work in this area to consider what I call advanced nanotechnology.

I define advanced nanotechnology as any highly advanced technology involving nanoscale machinery that allows us to finely image and control processes at the nanoscale, with manufacturing capabilities roughly matching, or exceeding, those of consequential APM.

Advanced nanotechnology covers a wider range of possible future technologies than APM. For example, a future nanotechnology might rely less on mechanical positioning or very stiff machines than APM does.[10],[11] But these technologies would look similar to APM in many ways, and might have the same strategic implications. My current feeling is that it’s better for strategy work to cover this wider area than to focus purely on APM.[12]

Note that advanced nanotechnology is fairly loosely connected to the concept that’s usually pointed to by the term “nanotechnology”.

The term “nanotechnology field” seems to commonly refer to a loose collection of research areas united by the fact that they concern physical systems on a length scale of roughly 1-100nm (although many fields that aren’t usually considered nanotechnology also concern physical systems on that length scale). The term “nanotechnology” is sometimes used to refer to useful (potential) products or technologies that come from this field.

See the later section called Current R&D landscape for my take on the present-day fields of research that are most relevant for advanced nanotechnology.

Defining transformative AI (TAI)

At a few points in this post I’ll refer to “transformative AI”, usually through the abbreviation “TAI”. By this I mean “AI that precipitates a transition comparable to (or more significant than) the agricultural or industrial revolution”, which is the definition for TAI sometimes used by Open Philanthropy (for more detail on this definition, see What Open Philanthropy means by "transformative AI").

Advanced nanotechnology’s extremely powerful capabilities suggest that its development could have very dramatic effects.

The potential effects listed below seem among the most important from a longtermist EA perspective. I’d guess that if advanced nanotechnology is developed within the next 20 years and we haven’t (yet) developed TAI, there’d be something like a 70% chance that it would have very dramatic effects, i.e. effects of at least similar importance to the ones described below.[13] Note that the first two bullet points use the typology from Nick Bostrom’s paper The Vulnerable World Hypothesis.

• Vulnerable world type-1 vulnerabilities. Advanced nanotechnology might lead to widespread access to manufacturing devices able to make things like nuclear weapons, dangerous pathogens, or worse. This seems like a plausible outcome to me, assuming society were to fail to properly regulate and control the use of the technology.
• Vulnerable world type-2a vulnerabilities. Even if manufacture of highly destructive weapons doesn’t become widely available to ordinary citizens, advanced nanotechnology might allow states to manufacture weapons that pose an (even) greater probability of existential catastrophe than the highly risky weapons that are currently accessible (such as currently accessible nuclear and biological weapons). I’d place a “grey goo” scenario, where self-replicating nanoscale machines turn the entire world into copies of themselves, into this category. It seems very plausible to me that advanced nanotechnology would enable the development of weapons that pose a significantly larger existential risk than do our most dangerous present-day weapons. And while weapons that have a moderate-to-high chance of causing existential catastrophe might not seem very appealing on the face of it, it seems very plausible to me that at least some states would perceive a sufficiently high strategic benefit from developing them.[14]
• Reduced existential risk from state weapons. On the other hand, in a reversal of the above scenario, advanced nanotechnology might lower existential risk from nuclear and biological weapons (or from other future weapons that pose an existential risk) by allowing states to develop weapons that are more strategically useful and pose a lower existential risk. This also seems very plausible to me.
• More powerful computers, leading to earlier TAI. Advanced nanotechnology could allow us to build more powerful computers more cheaply. If advanced nanotechnology is developed before TAI, having cheaper and better computers could lead to earlier TAI.[15] This seems very plausible to me, and I’d also (very tentatively) guess that this would come earlier than the effects mentioned in the previous three bullet points.

Each of the above effects could constitute an existential risk or existential risk factor, or could correspond to a reduction in existential risk. Overall, I’m currently very unsure whether advanced nanotechnology would be good or bad for the world on net, and very unsure whether I’d rather learn that advanced nanotechnology was coming sooner or later than expected (though I still currently think that acting to speed up its arrival is probably bad; see the later section Potential harms from nanotechnology strategy research).

Note that the potential effects described above all follow from having an extremely powerful and flexible manufacturing capability, and novel nanoscale devices as products are necessary only in the grey goo scenario.[16]

For other potential effects of advanced nanotechnology, see the Appendix section Other potential effects of advanced nanotechnology.

When and how advanced nanotechnology might be developed

This section is quite long and complex, so here is a quick summary:

• My rough guess is that there’s something like a 20-70% chance that advanced nanotechnology is feasible (where by “feasible” I mean: it’s possible in principle for a sufficiently advanced civilisation to build the technology, ignoring issues of economic feasibility). (more)
• I’d guess there’s currently something like $10-100 million per year of funding that is fairly well directed towards developing advanced nanotechnology. These efforts are mostly focused on using scanning probe microscopy, which doesn’t seem like the most effective approach for developing advanced nanotechnology. (more)
• My current very rough, unstable guesses regarding advanced nanotechnology timelines are:
  • Assuming feasibility, a median estimate of 2110 for the arrival of advanced nanotechnology. (more)
  • Not assuming feasibility, a 4-5% probability that advanced nanotechnology arrives by 2040. (more)
  • Not assuming feasibility, and assuming that advanced nanotechnology comes before TAI, a 1-2% probability that advanced nanotechnology arrives by 2040. (more)
• It seems likely to me that there’d be a long period of development with clear warning signs before the arrival of advanced nanotechnology. (more)

In-principle feasibility

In this section, I briefly give my views on whether APM technologies, and advanced nanotechnology more broadly, are feasible in principle. By “feasible in principle”, I mean that it’s possible in principle for a sufficiently advanced civilisation to build the technology, ignoring issues of economic viability. The probabilities given in this section are based on only relatively shallow thinking and are very unstable, and shouldn’t be taken as more than a rough indication of my personal guesses at the time I wrote this.[17]

As noted in the earlier section Intuitions for why APM might be possible and impactful, we know that core APM is feasible because examples of it exist in nature.

Maybe I’d give a 50% probability that complex APM is feasible. While I think there’s a significant chance that complex APM is feasible, and I’m not aware of convincing arguments that it couldn’t possibly be built, I wouldn’t find it that surprising if it turns out to be impossible. For example, it might turn out that it’s not possible to find suitable arrangements of atomic building blocks, given the finite selection of atoms and atomic sizes.

Assuming that complex APM is feasible, it seems pretty plausible to me that consequential APM is also feasible. Maybe I’d give this a 10-50% probability (implying that I guess a 5-25% unconditional probability that consequential APM is feasible). The earlier section Intuitions for why APM might be possible and impactful and the Appendix section More intuitions for why APM might be possible and impactful cover some reasons for thinking that complex APM might be feasible. In addition to those points, I’d note that Drexler has published technical arguments for the in-principle feasibility of APM (including a book, Nanosystems, which explores designs and capabilities), and in my view no-one has shown that the technology is not feasible, despite some attention on the question.[18]

I don’t assign a higher probability to the feasibility of consequential APM because I feel there could be practical barriers that block its development even if complex APM is feasible. For example, maybe the system can’t be made reliable enough or is very expensive to maintain (so that the “very cheap products” condition can’t be met), or maybe there just aren’t any designs that allow you to take advantage of the favourable theoretical properties of nanoscale assembly systems.

Advanced nanotechnology is, by definition, at least as likely to be feasible as consequential APM, and we might consider it substantially more likely to be feasible because it covers a much wider array of potential technologies. Overall, I’d guess something like a 20-70% chance that advanced nanotechnology is feasible. This is around 3 times as large as my guess for the probability that consequential APM is feasible, which feels reasonable to me.
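The conditional guesses above chain together by simple multiplication (this just reproduces the arithmetic stated in the text):

```python
# Chaining the stated guesses:
# P(consequential APM feasible)
#   = P(complex APM feasible) * P(consequential feasible | complex feasible)
p_complex = 0.5
p_conseq_given_complex = (0.10, 0.50)  # low and high guesses from the text

p_conseq = tuple(p_complex * p for p in p_conseq_given_complex)
print(p_conseq)  # (0.05, 0.25) -> the 5-25% unconditional guess
```

The same structure explains the final figure: scaling the 5-25% range for consequential APM up by roughly a factor of 3, to account for the wider space of technologies counted as advanced nanotechnology, gives approximately the stated 20-70% range.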
Current R&D landscape

Anecdotally, my impression is that the majority of researchers in fields relevant for advanced nanotechnology haven’t heard of APM, and don’t think about visions for transformative nanotechnology more generally.[19] Correspondingly, there is not much R&D explicitly targeting APM or broader advanced nanotechnology, as far as I’m aware. I’d guess there’s something like $10-100 million per year of funding that is fairly well directed towards developing advanced nanotechnology. This is mostly being spent (according to my guess) at Zyvex and on a project at Canadian Bank Note. These projects use an approach involving scanning probe microscopy[20] to make progress towards APM (sometimes called the “hard path” to APM), which doesn’t seem like the most promising approach,[21] although of course surprises are always possible.

Particular examples of less well-targeted, but still relevant, work include:

More broadly, my impression is that the most relevant work for progress towards advanced nanotechnology is:

• Work in the protein engineering and DNA nanotechnology fields.
• Work on spiroligomers and foldamers, as well as supporting technologies such as computational approaches (AlphaFold is a notable example relevant for protein engineering) and nanoscale imaging.
• Work on advancing scanning probe microscopy.
• Perhaps to a lesser extent, a broad class of work on non-biological dynamic nanoscale systems.[22]
Timelines

Considerations relevant for thinking about APM timelines

In this section, I’ll discuss considerations relevant for thinking about when APM might be developed. I focus on APM here rather than broader advanced nanotechnology because I find it easiest to first think about APM and then consider how much earlier things might happen if we consider timelines to advanced nanotechnology of any form, rather than just timelines to APM.

The most obvious consideration pushing against short timelines for APM is the very slow rate of progress towards APM in the last 35 years. For example, atomic manipulation with scanning probe microscopes doesn’t seem to have improved very much since the original demonstration of the technique by IBM in 1989. In addition, despite significant progress in protein engineering, positional assembly with protein suprastructures (which is perhaps the first step on a hypothesised pathway to APM called the “soft path”, as discussed in footnote 5) has still not been demonstrated.[23] Overall, my very rough (and unstable) guess would be that we’ve come perhaps 10% of the way to APM in the past 35 years.

Aside from the empirical observation of slow progress, my inside-view impression is that it would be an enormous engineering challenge to make progress along the technical pathways sketched for APM. Further down the line, it also seems hugely challenging to engineer complex APM systems; although perhaps this is mitigated by the consideration that very stiff, later-stage systems might be much more predictable (and so easier to model and engineer) than early-stage ones.

An additional consideration pushing against short timelines is that it doesn’t seem like there would be many commercial applications from making the first few steps towards APM (even though later stages might have lots of commercial applications). This reduces the incentive for private research efforts and reduces researcher interest within academia.

Some considerations push in favour of shorter timelines for APM, however.

Firstly, in outlining the “soft path” to APM (see footnote 5), Drexler has sketched a technical pathway to APM that I find plausible.

In addition, it seems plausible that an APM field could emerge over the next decade, or perhaps that some highly targeted and well-resourced private effort will emerge. Because there has been relatively little highly targeted effort towards the development of APM so far, we have relatively little empirical evidence regarding what rate of progress would be possible under such an effort; perhaps progress would be much faster than expected.[24]

Further, advances in AI are leading to advances in our ability to model molecular systems, most notably in the field of protein folding, which is particularly relevant for near-term progress along the soft path to APM. It seems plausible to me that increasingly powerful AI, even if it falls short of TAI, will lead to surprisingly fast progress towards APM.

Finally, it seems very possible that the arrival of TAI would lead to the rapid development of APM. This might significantly shorten your timelines for APM, depending on your timelines for TAI.

Median timelines

As mentioned above, my guess for the probability that advanced nanotechnology is feasible in principle fluctuates between roughly 20% and 70%. When my probability for feasibility drops below 50%, it doesn’t make sense to talk about an unconditional median timeline: if the chance that advanced nanotechnology can ever be developed is under 50%, then there is no future date by which there’s a 50% chance that it has been developed.

Assuming advanced nanotechnology is feasible in principle, my median timeline is perhaps something like 2110. But this number is very made up and not stable. It also interacts pretty strongly with my timelines for TAI, since TAI could massively accelerate technological progress.
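To make the interaction between feasibility and median timelines concrete, here is a minimal sketch in which development time, conditional on feasibility, follows an exponential distribution with a 2110 median. The distribution and parameters are illustrative assumptions on my part, not the author's actual model:

```python
import math

def p_developed_by(year, p_feasible, median_if_feasible=2110, start=2022):
    """Unconditional P(advanced nanotechnology is developed by `year`).

    Conditional on feasibility, development time is modelled as an
    exponential distribution whose median is `median_if_feasible`.
    """
    if year <= start:
        return 0.0
    rate = math.log(2) / (median_if_feasible - start)  # exponential median = ln(2)/rate
    return p_feasible * (1 - math.exp(-rate * (year - start)))

# With feasibility certain, 2110 is the median year by construction:
print(p_developed_by(2110, p_feasible=1.0))    # ≈ 0.5
# With feasibility at 40%, the curve saturates below 50%: no median year exists.
print(p_developed_by(10_000, p_feasible=0.4))  # ≈ 0.4, never reaching 0.5
```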

Another data point comes from Robin Hanson in 2013, when he reported a guess of roughly 100-300 years until something seemingly roughly equivalent to advanced nanotechnology is developed.

Probability of advanced nanotechnology by 2040

We might be particularly interested in the probability that advanced nanotechnology arrives in the next couple of decades, since events further in the future seem generally harder to influence.

We might also be particularly interested in worlds where advanced nanotechnology is not preceded by TAI, because, for example, we might think that after TAI “everything goes crazy” and it’s hard to make useful plans for things that happen after that point. Personally, I think it’s reasonable to imagine that work on nanotechnology strategy will be much less useful if advanced nanotechnology is preceded by TAI, although I’m not at all confident about this and my view feels quite unstable.

I’d guess there’s something like a 1-2% chance that advanced nanotechnology arrives by 2040 and isn’t preceded by TAI.[25] As with my median timeline, this number is very made up and not stable.[26]

For the probability that advanced nanotechnology arrives by 2040 and is preceded by TAI, my current speculative guess is something like:

“16% chance of TAI by 2040”
× “~50% chance that advanced nanotechnology is feasible in principle”
× “40% chance that advanced nanotechnology is developed between TAI and 2040, given that TAI arrives before 2040 and that advanced nanotechnology is feasible in principle”
= 3% chance that advanced nanotechnology arrives by 2040 and is preceded by TAI.

You might want to substitute your own probabilities for the arrival of TAI and/or the chance that TAI leads to the development of advanced nanotechnology.

Overall, then, adding the above probabilities implies that my guess is that there’s a 4-5% chance that advanced nanotechnology arrives by 2040. Again, this number is very made up and not stable.
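This back-of-the-envelope combination can be sketched as follows. The ~50% feasibility probability is an assumption taken from the middle of the 20-70% feasibility range given earlier, and all the numbers are rough personal guesses rather than measured quantities:

```python
# Back-of-the-envelope combination of the guesses in this section.
# The ~50% feasibility figure is an assumption (mid-range of the
# 20-70% feasibility guess given earlier); everything is illustrative.

# Path 1: advanced nanotechnology by 2040, not preceded by TAI
p_no_tai_path = 0.015        # the "1-2%" guess

# Path 2: advanced nanotechnology by 2040, preceded by TAI
p_tai_by_2040 = 0.16         # 16% chance of TAI by 2040
p_feasible = 0.50            # assumed feasibility of advanced nanotechnology
p_nano_given_tai = 0.40      # 40% chance it is developed between TAI and 2040,
                             # given TAI by 2040 and feasibility
p_tai_path = p_tai_by_2040 * p_feasible * p_nano_given_tai  # ≈ 3.2%

p_total = p_no_tai_path + p_tai_path
print(f"P(advanced nanotechnology by 2040) ≈ {p_total:.1%}")  # ≈ 4.7%, i.e. "4-5%"
```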

Rate of progress and warning signs

I imagine that developing advanced nanotechnology would be a huge engineering challenge, and there would be a lengthy road to get there from where we are today. In the median case, I imagine that it would be extremely obvious for many years that this kind of work is being done: the field might look a bit like the present AI field, for example, with notable private-sector efforts alongside lots of work in academia. It seems very unlikely, though possible, that progress would be driven by some huge, secretive, Manhattan Project–style effort instead; but even in that case, it seems like informed people would probably know that something was up, for example by noticing that relevant academics had suddenly stopped publishing.

The main ways I can see the above picture being wrong are:

• Technical progress just turns out to be way easier than it seems now. Maybe surprisingly quick progress along the “hard path” to APM (i.e. making progress using scanning probe microscopy) is the most likely way this could happen.
• Pre-transformative AI dramatically speeds up the rate of progress in relevant fields.
• Transformative AI dramatically speeds up technological progress (but then maybe “all bets are off” anyway).

It’s worth noting that if we’re focused on worlds where advanced nanotechnology arrives in the next 20 years, we should presumably focus more than we would otherwise on worlds where progress happens surprisingly quickly. This is because worlds where advanced nanotechnology arrives in the next 20 years will tend to be worlds where progress happens surprisingly quickly (for example, if there is a gradual ramp up in progress over the next 30 years, we obviously won’t have the technology in 20 years’ time).
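A toy simulation of this selection effect, using an invented lognormal distribution over how many years of progress are needed (the parameters are arbitrary illustrations, not anyone's actual forecast):

```python
import random

random.seed(0)
N = 100_000
# Years of R&D needed to reach advanced nanotechnology in each simulated
# world; the lognormal's median is e**4.5 ≈ 90 years. Arbitrary choice.
years_needed = [random.lognormvariate(4.5, 1.0) for _ in range(N)]

within_20y = [y for y in years_needed if y <= 20]
print(f"P(arrives within 20 years) ≈ {len(within_20y) / N:.1%}")
print(f"Median years needed, all worlds: {sorted(years_needed)[N // 2]:.0f}")
# Every world in `within_20y` is, by construction, one where progress turned
# out several times faster than the unconditional median of ~90 years.
```

Conditioning on arrival within 20 years discards all the slow-progress worlds, which is why near-term-focused planning should weight fast-progress scenarios more heavily.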

Current landscape for nanotechnology strategy

I’m aware of little thoughtful nanotechnology strategy work currently being undertaken, and I don’t think there’s a very large amount of high-quality existing work.

The most notable public EA work in this area that I’m aware of is a 2015 Open Philanthropy report called Risks from Atomically Precise Manufacturing.

As far as I know, no-one in the EA community, other than me, is currently spending a significant fraction of their time thinking about this or is planning to do so. Due to other (potential) projects competing for my time, I’d guess I’ll spend something like 25% of my time on nanotechnology strategy-related things on average in the next 12 months, with a good chance (maybe 50%) that I don’t spend much time at all on nanotechnology strategy things in the next 12 months. Maybe I’d guess I’ll spend 40% of my time on average on nanotechnology strategy things over the next 5 years.

Outside of EA, the Foresight Institute has historically thought about nanotechnology strategy questions, and they have perhaps 0.2-0.6 FTE working in this broad area currently. Other organisations have been active in this area, such as the Center for Responsible Nanotechnology and the Institute for Molecular Manufacturing,[27] but I’m not aware of relevant recent work. I think that a lot of the public work in this area has been of low quality, and overall I think the existing work falls a long way short of providing complete and high-quality coverage of the important questions within this area.

Prospects for making progress

The high-level goals of EA work in this area could be to reduce the chance of an existential catastrophe due to advanced nanotechnology, to use its development to reduce existential risk from other sources, or more broadly to achieve any attainable value related to its development.

Potentially valuable interventions might include:

• Seeking to speed up particular technical research areas where that seems to promote safety.
• Making policy recommendations related to regulating, monitoring, and understanding emerging advanced nanotechnology.
• Positively steering the development of advanced nanotechnology by being a major early investor or founder.

We have some reasons to expect progress to be hard in this area: the technology appears likely to be many decades away, and the terrain for strategy work is poorly mapped out at present.

On the other hand, the fact that the world is not paying much attention to advanced nanotechnology gives altruistic actors a chance to make preparations now in order to, for example, be in a position to make impactful policy recommendations at critical moments later. In addition, the relative lack of exploration makes it hard to rule out the existence of valuable low-hanging fruit. And, regarding our understanding of the technology itself, advanced nanotechnology is more opaque than risky biotechnology, but perhaps compares favourably with artificial general intelligence (AGI), since APM is arguably a more plausible blueprint for advanced nanotechnology than present-day machine learning methods are for AGI.

Examples of concrete projects

One concrete project could be to consider what you would do if you knew that advanced nanotechnology was coming in 2032, seeking areas of consensus among EA-aligned individuals who are well informed about nanotechnology strategy. The results could inform what we should do now (given our actual timelines), perhaps pointing to a particular high-value intervention. The results could also help generate a plan to hold in reserve in case our advanced nanotechnology timelines were to dramatically shorten, increasing the chance that EAs successfully execute high-value interventions in the case of shortened timelines. In addition, the exercise would be practice for future strategic thinking, potentially increasing the chance that EAs successfully execute high-value interventions at some future time after a change of circumstances or further deliberation.

Another project could be to clarify which key effects of complex APM we want to forecast; work out the technical requirements for those effects; and create forecasts using trend extrapolation, other outside-view considerations and methods, and inside-view judgements. These forecasts could help prioritise nanotechnology strategy research against work in other cause areas. That way, if advanced nanotechnology timelines turn out to be shorter than initially believed, more effort can be exerted in this area, increasing the chance that EAs and others successfully identify and execute high-value interventions. Forecasts of various key effects could also be helpful for directing work within nanotechnology strategy towards the most high-value sub-areas, perhaps increasing the chance that high-value interventions are found and successfully executed.

A third potential project could be to identify and monitor potential warning signs of surprising progress towards advanced nanotechnology, for example by identifying the most relevant areas of current research, the most important research groups, and key bottlenecks and potential breakthroughs. Similarly to the previous project, this monitoring could help prioritise nanotechnology strategy research against work in other cause areas, so that more resources are expended in this area if progress appears to be accelerating, increasing the chance that EAs and others successfully identify and execute high-value interventions.

How to prioritise nanotechnology strategy research

This section gives some arguments and considerations relevant for prioritising nanotechnology strategy research against other cause areas from a longtermist EA perspective. To skip to my personal bottom line view on this, see the final subsection of this section, My view on how longtermist EAs should be prioritising this work against other areas right now.

A case for nanotechnology strategy research

Pulling together the information presented in the previous sections, a rough, high-level argument for nanotechnology strategy research being highly valuable from a longtermist EA perspective could be the following:

1. Advanced nanotechnology could have dramatic effects, with both positive and negative potential implications for existential risk.
2. There seems to be a non-negligible (although low) probability that it will arrive within the next 20 years, even if it’s not preceded by TAI (my wild guess is that there’s a 1-2% probability of this).
3. There’s been very little high-quality work in this area.
4. It seems possible to make progress in this area.[28],[29] I’d also note that I think there’s a synergy between nanotechnology strategy and other areas like AI governance and biosecurity, which makes work in nanotechnology strategy more tractable and more valuable than it would be otherwise. Specifically, because these areas all concern trying to make the development of very powerful emerging technologies go well for the world, methodologies and findings seem likely to be transferable between these areas to some extent.

As discussed in the earlier section Rate of progress and warning signs, my current guess is that there’s very likely to be a long, obvious-to-the-outside-world R&D process before advanced nanotechnology is developed. This pushes in favour of deprioritising this area until there are signs that progress towards advanced nanotechnology is speeding up.

However, this depends on the extent to which nanotechnology strategy work needs to be done sequentially versus in parallel. The more the work needs to happen sequentially, the more it’s worth doing work now. I feel very uncertain about this, although it seems like the work is probably at least somewhat sequential. For example, maybe it takes time to lay the conceptual foundations for a new topic, and presumably it takes time for people to get up to speed and start making contributions.

In addition, we might think that being one of the first to act in this space when progress starts to pick up is valuable. I’d guess there could be a fair amount of value here.

Overall, I think the likely slow and easy-to-detect ramp up towards advanced nanotechnology pushes in favour of spending relatively few resources now to do foundational work, while carefully monitoring progress and being ready to pivot to spending lots of EA effort in the area if that seems important.

Potential harms from nanotechnology strategy research

Nanotechnology strategy is a very complex area, and there are many potential harms from well-intentioned nanotechnology strategy work.

It currently seems to me that we don’t want to speed up progress towards advanced nanotechnology. Roughly speaking, I think this because i) the potential effects of the technology seem dramatic, but very uncertain (they could be very good or very bad), ii) it seems plausible that accelerating progress will be hard to reverse, and iii) we have the option of waiting and potentially gaining more information about the effects of the technology. It seems plausible that accelerating progress will be hard to reverse because, for example, visibly faster progress could generate more interest and a stronger research effort, which in turn sustains the rate of progress. (Though note that I still feel a bit confused about how harmful various forms of accelerating progress might be, including whether some forms would be harmful at all, and I’d like to think more carefully about this topic.)

Because accelerating progress seems bad, public talk about advanced nanotechnology in a way that hypes the technology or otherwise generates interest in developing it could cause harm by speeding up progress towards the technology. And it seems relatively easy to notably increase attention on the technology given that there’s currently relatively little attention on it.[30] Emphasising military applications seems particularly undesirable, since it might tend to push the direction of the technology’s development towards dangerous applications.

In addition, findings from nanotechnology strategy research could, in some cases, represent information hazards that would inform efforts to speed up advanced nanotechnology development. Examples include findings from efforts to better understand advanced nanotechnology timelines by mapping out technical pathways, or from mapping out scenarios for the development of an advanced nanotechnology field. While it’s possible in principle to keep these findings private, there are advantages to public communication, and exchanging ideas with people interested in speeding up the development of advanced nanotechnology is often very helpful for doing nanotechnology strategy work. So the right policy here isn’t always obvious.

Because this is a relatively unexplored area, maybe there’s also a risk that poor initial attempts at nanotechnology strategy research will “poison the well”, doing long-term damage to the ability for progress to be made in this area. This could happen because researchers develop a poor set of concepts, or because EA-aligned people communicate with policymakers in a poorly thought-out way that discourages them from engaging with EA-aligned people in the future.

There may also be a risk that nanotechnology strategy research leads to making policy recommendations that turn out to be harmful and hard to reverse. For example, this could occur if a policy recommendation seems beneficial after a shallow investigation, but enough thought would show it to be harmful.

Relatedly, some individual or group of nanotechnology strategy researchers might mistakenly come to believe that accelerating progress towards advanced nanotechnology is a worthwhile goal, and then act to accelerate progress, causing irreversible damage. (To be clear, I also think it’s possible that at some point we’ll correctly determine that accelerating advanced nanotechnology progress is a worthwhile goal.)

Mitigating the risks of harm

To reduce the risk from these potential harms, prospective nanotechnology strategy researchers require good judgement and a strong support network they can turn to for high-quality advice and feedback. Keeping the unilateralist’s curse in mind also seems important.

Prospective funders in this area should keep these traits in mind when considering who to fund. In addition, funders could, for example, look for an assurance that nanotechnology strategy researchers won’t share their work outside a trusted circle without particular trusted individuals giving approval.

Why you might not think work in this area is high priority

Here are a couple of reasons you might not consider work in this area to be high priority from a longtermist EA perspective.

Firstly, you might think that it’s just too difficult to make progress towards concrete impact, because advanced nanotechnology is likely to be far off in time, and hard to analyse because its precise nature is very uncertain. Tangible interventions are notably absent right now.

I think this is a reasonable position, although I disagree with it. I think there are concrete projects that seem useful (see the earlier section Examples of concrete projects), I think that this area is too unexplored to be very sure that there isn’t valuable work here, and I think that in scenarios where advanced nanotechnology arrives in the next couple of decades we’ll be glad that work was done in the early 2020s.

Secondly, you might just be convinced that some other area is much more important from a longtermist EA perspective, so that you don’t think longtermist EA resources should be spent on nanotechnology strategy research. For example, maybe you have short AI timelines and think AI safety work is at least moderately tractable.

Again, I think this position is reasonable (either regarding the great importance of AI safety work, or some other area), although it’s not a position I hold — cause prioritisation is hard, and there’s lots of scope for reasonable people to disagree.

My view on how longtermist EAs should be prioritising this work against other areas right now

In this section I’ll briefly give some more concrete views on how people from the longtermist EA community should prioritise working on this compared to working in other areas. I haven’t thought about this a huge amount (and this kind of prioritisation is extremely difficult and also strongly depends on details that I’m brushing over), so these are extremely rough views and are liable to change in the near future. But reviewers of earlier drafts of this post were keen to see something like this, so I’m providing it here.

I’d guess that some small, non-zero fraction of EA resources should be going into nanotechnology strategy research right now. In particular, a rough guess might be that we want to move to having 2-3 people each spending at least 50% of their time on this in the next 3 years, and maybe get to 4 FTE in 5 years.

So I don’t think we should be piling a huge amount of resources into this, but the resource allocation I’m suggesting comes against a background of (it seems to me) something like 0-0.5 FTE of longtermist EAs thinking about this at any given time over the past 5 years. So this would represent a significant change from the status quo.

Another angle on this is my view on the following highly abstract case: if I had an aspiring researcher in front of me who seemed about an equally good fit for nanotechnology strategy research, biorisk research, AI safety research, and AI governance research, I’d rank nanotechnology strategy research a bit below the others. I’d rank the options in this way because I think we should be able to recruit enough people from the pool of people who are a better fit for nanotechnology strategy research than for research in other areas (but again, I feel very uncertain about all of this).

Some guesses at who might be a good fit for nanotechnology strategy research

Being broadly longtermist EA-aligned seems important for doing work in this area.

In addition, I’d guess that, very roughly speaking, the following sorts of people would be a particularly good fit for nanotechnology strategy research:

• People with chemistry/physics/biology/materials science backgrounds (among others), and especially people with PhDs in those areas.
• People who have done this kind of “strategy” thinking in other contexts, including in other EA areas like AI risk or ending factory farming.
• People who are thoughtful, have good judgement, and are not likely to act unilaterally.
• People who are willing to build the support network mentioned in Mitigating the risks of harm.
• People willing to tackle something hard and relatively unexplored, and willing to go for higher-risk things on the basis of high expected impact.
• People who are okay with doing work that might be less legible to the outside world because of infohazard concerns and the unpredictable nature of exploratory research.
Conclusion

Advanced nanotechnology could significantly increase or decrease existential risk, and might arrive in the next couple of decades, even without transformative AI. So far there’s been relatively little high-quality work aimed at making its development go well for the world. Doing nanotechnology strategy research now can help lay the foundations to increase the chance that the development of the technology goes well.

I would love to see more people considering working in this area. If you’re interested in learning more, and especially if you’re interested in trying out work in this area now or later in your career, please get in touch by sending me a private message on the Forum.

Acknowledgements

This post was written by me, Ben Snodin, in my capacity as a researcher at Rethink Priorities. I finished writing this post while at Rethink Priorities, and I did a lot of the work for it prior to joining Rethink Priorities, while employed as a Senior Research Scholar at the Future of Humanity Institute. Many thanks to James Wagstaff, Jennifer Lin, Ondrej Bajgar, Max Daniel, Daniel Eth, Ashwin Acharya, Lukas Finnveden, Aleš Flídr, Carl Shulman, Linch Zhang, Michael Aird, Jason Schukraft, and others for their helpful feedback, and to Katy Moore for copy editing. If you like Rethink Priorities’ work, you could consider subscribing to our newsletter. You can see more of our work here.

Some resources that are especially relevant for nanotechnology strategy research are:

These resources are also useful for context and technical understanding:

APM definitions in more detail

This section provides more detailed definitions for the core APM, complex APM, and consequential APM concepts introduced in the main text in Definitions and intuitions.

I define core APM as:

A system of nanoscale, atomically precise machines that can mechanically assemble small molecules into useful atomically precise products.

Where:

• By nanoscale I mean “extremely small”, roughly of length 1-100nm in each dimension. For comparison, atoms often have a diameter of around 0.1-0.2nm. Note that a core APM system, while composed of nanoscale machines, might itself be larger than 100nm.
• Atomically precise machines means machinery that has exactly the desired atomic structure. Similarly, atomically precise products means products that have exactly the desired atomic structure.
• Mechanically assemble small molecules means (in this context): use nanoscale machines to control the motion of small molecules, using short-range intermolecular forces, such that the molecules form stable bonds with some partially constructed product.

We can describe a particular form of core APM that uses complex, very stiff components and performs many operations per second. In addition, we can imagine that the products from the nanoscale assembly system are used as inputs to slightly bigger (but otherwise very similar) assembly systems, which in turn assemble them into bigger products; and so on until you get to whatever size product you want (including everyday-scale things like laptops). Finally, we can slightly relax the definition to include systems where the assembly machines and products are only mostly atomically precise, and where the assembly method is only mostly mechanical assembly.

I label such a technology complex APM (the modifications to the core APM concept are in bold):

An **extremely complex, intricate** system of **very stiff** nanoscale, **(mostly)** atomically precise machines **joined to progressively bigger assembly systems** that **operate in a vacuum** and **perform many operations per second** to **(mostly)** mechanically assemble small molecules into **(mostly)** atomically precise products **with high throughput**, and **with product sizes ranging from nanoscale to metre-scale and beyond**.

Finally, Drexler claims that these APM systems could create a wide range of complex, very high performance products very cheaply and with very high throughput. In this spirit, I define consequential APM to refer to:

A complex APM technology that can create a range of complex, very high performance products for $1000/kg or less and with a throughput of 1kg every 3 days or less per kg of machinery.

More intuitions for why APM might be possible and impactful

This section expands on the Intuitions for why APM might be possible and impactful described in the main text. As in the main text, these arguments are designed to give the reader some feeling for why you might believe these things rather than constituting cast-iron proofs.

See Drexler’s 1992 book, Nanosystems, for a treatment of the physics underlying nanoscale manufacturing systems. Chapter 2 covers the scaling of important physical properties with system size and is particularly relevant for intuitions about nanoscale manufacturing technology.

Intuitions for why atomic precision might be a natural goal

If you want to build high-performance systems and products on the nanoscale, it’s perhaps natural to build systems and products with atomic precision. As mentioned in the main text, if you have small molecules as building blocks and you’re making products on the order of 1-100nm, you’re building things out of discrete building blocks with a width roughly 1/10th to 1/1000th of the width of the product you’re building. You may well then need atomic precision in order to build the product to the correct specification.

In addition, molecular building blocks are perfect copies (e.g. all oxygen molecules are the same[31]), which bond together in discrete ways to form a finite set of possible structures. So by insisting on atomic precision, you get perfectly faithful physical realisations of your design.[32]

Finally, given high-performance systems that are able to manufacture things with an extremely high degree of control, and if atomically precise products will often give significantly higher performance, atomically precise products will perhaps be an attractive target.
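Stepping back to the consequential APM definition above, one way to get intuition for its throughput threshold: machinery that outputs 1kg per 3 days per kg of machinery could, if directed at making more machinery, double its own capacity every 3 days. A toy calculation (illustrative only; a real system would face input, energy, and design constraints):

```python
# Toy consequence of the "1kg every 3 days per kg of machinery" threshold:
# machinery building more machinery doubles in mass every 3 days.
doubling_days = 3
days = 30
doublings = days // doubling_days            # 10 doublings in a month
final_mass_kg = 1.0 * 2 ** doublings
print(final_mass_kg)  # 1024.0 kg of machinery from 1 kg, after 30 days
```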
Intuitions for high-performance products

One reason we might expect complex APM to produce a very wide range of complex products is that building things by successively adding small molecules would seem to allow for huge flexibility in products. We see something similar with 3D printers today, which successively add material to a growing structure and can create a wide range of complex structures.

To give some intuition for the claim that atomically precise products can achieve very high performance, the semiconductor industry spends tens of billions of dollars annually on R&D to manufacture chips with increasingly finer-grained features, a pathway that ends with atomic precision (or close to it). (To be clear, I’d imagine that far more impressive products would be possible with flexible atomically precise fabrication.)

As mentioned in the main text, another angle is to consider that, with today’s technology, naturally evolved systems and devices often outperform present-day artificial ones on important dimensions. These evolved systems are certainly not atomically precise (although they contain some atomically precise components), but they show that we are currently some way from the maximum achievable performance along many dimensions. In addition, some people, including Drexler, have argued that evolved systems necessarily look different to designed ones, and that this doesn’t imply that designed systems must be inferior.[33] A highly flexible manufacturing capability that can exactly produce a vast array of complex products defined down to the last atom might seem able to meet and perhaps dramatically exceed the performance of evolved systems.
Intuitions for high operating speeds and high throughput

As mentioned in the main text, one reason to think that nanoscale machines might achieve high throughput is that these machines can in principle each operate very rapidly, because they are made of very stiff materials and each operation only needs to involve movements over very short distances.

As was also noted in the main text, another reason is that the machines are not too much bigger than the small molecules they are manipulating during the assembly process. This makes a high rate of production per gram of manufacturing system more plausible than in the case of fabrication using present-day atomic force microscopes, for example, where a machine weighing 0.1 grams (or more) performs assembly operations on single atoms weighing around 10⁻²³ grams.

In addition to these considerations, the performance of biological machines suggests that complex APM could produce products fast. Some bacterial cells can double in both cell count and total mass of cells roughly every 10 minutes. Cells are not themselves atomically precise, but they include nanoscale atomically precise components, so we can consider this as an example of atomically precise components producing their own mass in product (copies of themselves) in 10 minutes, albeit as part of a larger system that is acting to produce a copy of all of its components.[34],[35]

Intuitions for low cost

As argued in the main text, high throughput could lead to cheap products from complex APM if the machinery uses cheap inputs; is itself cheap; and produces harmless, easily manageable waste.

We might expect the inputs to be cheap if they consist of readily available raw materials, with minimal processing required before they are passed to the APM system. This might make sense if complex APM machinery is able to deal with minimally processed inputs, mostly requiring only that the right chemical elements are present in the inputs (not necessarily in the right ratios).
This could be the case if the machinery was able to cheaply and efficiently do most of the required processing itself, which might seem plausible for complex APM machinery that is able to efficiently produce a wide range of atomically precise products through mechanical assembly.

Inputs containing the right chemical elements might be cheap if only commonly occurring elements are needed. One reason to think that only common elements are needed is that carbon, a relatively abundant element, seems to be an excellent building material: carbon forms diverse atomic-scale structures and has excellent physical properties (for example, it can exist as diamond, graphite, or carbon nanotubes, and diamond is extremely hard and stiff).

The complex APM machinery could be cheap because, once you know how to make it, perhaps you could use it to make lots more machinery, just as you could use it to make any other product.

We might expect the complex APM machinery to produce harmless, easily manageable waste because, as with processing inputs, we might expect that if we can build machinery that is able to efficiently produce a wide range of atomically precise products through mechanical assembly, we can also build similar machinery that can efficiently process waste into a harmless and easily manageable form.

Finally, as mentioned in the main text, we can also consider that naturally evolved systems seem to often be significantly cheaper to manufacture than artificial ones, at least by a rough comparison of energy cost according to the quick investigation by Paul Christiano described in Simple evolution analogy.

**Other potential effects of advanced nanotechnology**

Other than the potential effects mentioned in the main text in the section Potential effects of advanced nanotechnology, some (perhaps) less important effects of advanced nanotechnology being developed might be:

• An “extremely powerful surveillance and non-lethal control” scenario.
Advanced nanotechnology might allow states to cheaply manufacture ubiquitous, centrally controlled drones the size of insects or small mammals to monitor and control the world’s population using non-lethal force.[36] This seems like a plausible outcome to me. This capability might lead to or facilitate value lock-in (probably a very bad outcome), but it also seems possible that it would reduce existential risk by mitigating “vulnerable world type-1” vulnerabilities.
• Advanced nanotechnology as a TAI capability. TAI could lead to the rapid development of advanced nanotechnology. This could allow an agent-like AI to quickly create more powerful hardware and thereby gain the ability to rapidly transform the physical world.
• Impacts on biorisk. Advanced nanotechnology, or technologies along the path to it, seem likely to make it easier to develop things that could increase or decrease biorisk, including deadly pathogens, DNA sequencing devices, and biosensing devices. It might also enable a highly robust, rapid, local capability to manufacture medical countermeasures in a scenario where a global biological catastrophe has destroyed supply chains and important infrastructure.[37]
• Advanced neurotechnology. Advanced nanotechnology might allow us to monitor the brain at very high resolution, potentially enabling whole brain emulation or neuromorphic AI. It seems unclear whether this would be good or bad overall.
• Cheap energy. Cheap, powerful manufacturing might enable the fabrication of cheap solar cells and cheap batteries that help to overcome intermittency in solar power, leading to very cheap solar power (although, naively, I’m unsure how large an effect this would be given that advanced nanotechnology wouldn’t on the face of it reduce land and labour costs).
• Reduced risks from catastrophic climate change.
With advanced nanotechnology, we might be able to return atmospheric CO2 concentrations to pre-industrial levels for a relatively small cost using cheap, high-performance direct air capture devices powered by cheap solar. This could reduce the risk of catastrophic climate change, and also reduce existential risk if we believe that climate change poses such a risk.
• Improved resilience to global catastrophes. As mentioned above, advanced nanotechnology might provide a highly robust and powerful local capability to manufacture medical countermeasures during a global biological catastrophe. More generally, the technology could enable greater resilience to global catastrophes by broadly enabling local, rapid production of vital goods, equipment, and infrastructure without requiring pre-existing infrastructure or supply chains, and regardless of environmental damage. One example might be food production without relying on sunlight or global supply chains (see also resilient foods). This could reduce the chance that a global catastrophe precipitates an existential catastrophe.
• Dramatic improvements in medicine. Aside from an expectation that cheap, powerful manufacturing would be useful for fabricating medical products (just as for most other products), advanced nanotechnology might be particularly useful for developing improved medical interventions. One high-level reason for thinking this is that humans are mostly made of cells, and cellular processes happen at the nanoscale, suggesting that artificial nanoscale devices might be particularly useful.
• General wealth / economic growth. On the face of it, if we can make high-performance materials and devices very cheaply, people will on average be very wealthy compared to today (whether this is a deviation from trend GDP growth will, of course, depend on when and how the technology is developed).
• Social effects. Advanced nanotechnology could have important social effects, especially if it arrives abruptly.
For example, strong economic growth might generally promote cooperation, while rapid technological changes might cause social upheaval.
• Altering the global balance of power. Advanced nanotechnology might change the global balance of power. For example, it might dramatically shorten supply chains by enabling highly efficient production with unrefined local materials, leading to a shift in the balance of geopolitical power.

**Footnotes**

1. In my opinion, the public work with the most authoritative and detailed description of the APM concept is Drexler’s 2013 book, Radical Abundance (especially chapter 10, and parts of chapters 1 and 2). ↩︎

2. Drexler claims that (something similar to) complex APM would be able to manufacture products for $1/kg or less and with a throughput of 1 kg every 3 hours or less per kg of machinery (for “$1/kg or less”, see, for example, Radical Abundance, 2013, p. 172; for “a throughput of 1 kg every 3 hours or less per kg of machinery”, see Nanosystems, 1992, p. 1, 3rd bullet). I set a lower bar for my definition of consequential APM here because I think this lower bar is more than sufficient to imply an extremely impactful technology, while perhaps capturing a larger fraction of potential scenarios over the next few decades. ↩︎

3. The more certain we are that the development of complex APM naturally implies the development of consequential APM soon afterwards, the less useful it is to distinguish these concepts (I think this partly explains why they are often bundled together). But I’m uncertain enough to find the distinction useful. ↩︎

4. Someone who read a draft of this post mentioned they felt initially sceptical that it makes sense to think of ribosomes (or other nanoscale objects) as “machines”, and that they found the Wikipedia page on molecular machines helpful for reducing this scepticism. ↩︎

5. In addition to showing that such machines are feasible, this suggests that one approach to engineering nanoscale machines doing mechanical assembly might be to use biological machines as the starting point.

One approach that uses biological machines as the starting point is the approach I associate with synthetic biology, which involves making incremental changes to existing biological machines to achieve some goal (I’ll refer to this as the “synthetic biology approach” from now on, although I don’t know whether synthetic biologists would agree with this characterisation). This approach seems to be very challenging. My impression is that this is because the effect of small modifications is hard to predict, and because biological machines often stop working outside of their usual conditions. Still, I think some people see this as the best approach for developing something like advanced nanotechnology.

Another approach that uses biological machines as the starting point is the “soft path” approach to APM described by Drexler (see, for example, Appendix II of Radical Abundance). This approach also starts with engineering biological molecules such as proteins (or synthetic but somewhat similar molecules), but is quite different to the synthetic biology approach, because the biological molecules are used more as a building material than as active machines in their own right, and are generally completely removed from their biological context. This soft path approach seems like a more plausible path to me than the synthetic biology approach. So far the soft path approach has had far less effort directed to it than the synthetic biology approach, as far as I’m aware. ↩︎

6. I’m mostly relying on the analysis described in Paul Christiano’s Simple evolution analogy for this claim. From that document:

I think the typical pattern is for human artifacts to be 2-3 OOM [order of magnitude, i.e. a factor of 10] less efficient than their biological analogs, measured by “How much energy/mass is needed to achieve a given level of performance?”

For example, a top GPU is said to perform roughly 1-2 OOM fewer FLOPS per unit power than a human brain, and artificial photodetectors are said to require roughly 3-4 OOM as much power as a human eye to attain the same level of performance. Note that the analysis in that document is fairly rough, although I’d be very surprised if it turned out not to be the case that naturally evolved systems often outperform present-day artificial ones on important dimensions. ↩︎

7. For example, we could imagine a system occupying a 1000nm × 1000nm × 1000nm volume, and composed of nanoscale machines, that can perform one assembly operation at a time. We could fit 1,000,000,000,000 such systems into 1cm³. (Atomic force microscopes are used today to manipulate atoms one at a time, and my impression, from briefly talking to someone who uses atomic force microscopes for their research, is that they are generally around 1cm³ in size or larger (potentially much larger, depending on what you count as the atomic force microscope and what you count as its supporting infrastructure).) ↩︎
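The packing arithmetic in this footnote is straightforward to verify; here is a trivial sketch:

```python
# How many (1000 nm)^3 assembly systems fit into 1 cm^3?

side_nm = 1000                 # side of one assembly system, in nm
cm_in_nm = 10_000_000          # 1 cm = 10^7 nm

systems_per_edge = cm_in_nm // side_nm   # 10^4 systems along each 1 cm edge
systems_per_cm3 = systems_per_edge ** 3  # cube for a full 3D packing

print(systems_per_cm3)  # 1000000000000, i.e. 10^12
```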

8. Though note that these high frequency, parallel operations involving single molecules each make a tiny amount of progress towards creating a structure of appreciable size, so overall we might expect these machines to make their own mass in product within say minutes, hours, or days, rather than say once per second or 1 billion times per second. ↩︎

9. According to Paul Christiano’s Simple evolution analogy: “Manufacturing costs differ by a further 2-4 OOM” (“OOM” stands for “order of magnitude”, i.e. a factor of 10). ↩︎

10. I haven’t given a large amount of thought to how advanced nanotechnology might deviate from consequential APM, and I can easily see my views changing significantly in the future. Naively, though, here are some ways I imagine the technology might deviate from consequential APM: it might not be the case that most of the assembly machinery is atomically precise; maybe assembly methods or physical phenomena other than mechanical positioning are very important; maybe important components involved in the assembly of nanoscale products are not themselves nanoscale; maybe the assembly machines would not be very stiff; or maybe “control the trajectories of small molecules” wouldn’t be a natural description of the machinery’s operation.

These alternatives to APM might often be things that are technically not consequential APM but are so similar that they can be treated as equivalent to consequential APM for most practical purposes. But perhaps many have substantial enough differences that the distinction turns out to matter for thinking about nanotechnology strategy. ↩︎

11. To give a rough quantification of my (very unstable) beliefs here: if we have advanced nanotechnology in 2040, I’d guess that in roughly 80% of cases the advanced nanotechnology doesn’t look like complex APM. This would imply that, in some relevant sense, advanced nanotechnology covers 5x as much of the space of technological possibilities as does APM. ↩︎

12. For example, we might sometimes be more interested in the question “how likely is it that advanced nanotechnology will be developed by 2040?” than the question “how likely is it that APM will be developed by 2040?”, because advanced nanotechnology has consequences as important as those of APM (or perhaps even more important), and answering the question for advanced nanotechnology might involve similar considerations but have a significantly different answer. ↩︎

13. This guess of a 70% probability is basically driven by: “It’s hard for me to imagine this not being a really big deal, but I haven’t thought about this that much, and maybe my imagination just isn’t very good.” ↩︎

14. For example, this transcript of an 80,000 Hours podcast with Carl Shulman quotes Shulman discussing the Soviets’ apparent willingness to develop bioweapons that they couldn’t protect their own people against (though presumably a weapon that could kill everyone would look less appealing than a weapon that kills some random fraction of the global population; and Soviet decision-makers might have felt that they could protect themselves through physical isolation, for example):

It’s hard to know exactly how much work they would put into pandemic things, because… With pandemic pathogens, they’re going to destroy your own population unless you have already made a vaccine for it.

And so the US eschewed weapons of that sort towards the end of its bioweapons program before it abandoned it entirely, on the theory that they only wanted weapons they could aim. But the Soviets did work on making smallpox more deadly, and defeat vaccines. So, there was interest in doing at least some of this ruin-the-world kind of bioweapons research. ↩︎

15. We might also expect that the TAI we get might be different if it’s developed using a huge amount of computational resources made available by advanced nanotechnology (other than the differences we’d expect to directly follow from earlier TAI). For example, maybe the TAI we get would be more likely to follow present-day paradigms, like stochastic gradient descent, which (very speculatively) could have safety implications. ↩︎

16. Although perhaps some new weapons other than grey goo would also involve nanoscale devices, and you might consider the components of new kinds of computers to be “novel nanoscale devices”. ↩︎

17. Here’s a quick attempt to brainstorm considerations that seem to be feeding into my views here: "Drexler has sketched a reasonable-looking pathway and endpoint", "no-one has shown X isn't feasible even though presumably some people tried", "things are complicated and usually don't turn out how you expect", "no new physics is needed", "X seems intuitively doable given my intuitions from molecular simulations", "it's hard to be sure of anything", "trend-breaking tech rarely happens but does sometimes". ↩︎

18. Richard Jones is a notable example of a highly credentialed person who seems to have engaged seriously with the question of the feasibility of APM. He seems to think that something resembling complex APM is feasible, although he seems sceptical about the feasibility of something with the kind of capabilities described by consequential APM. See, for example, Open Philanthropy’s A conversation with Richard Jones on September 30, 2014. ↩︎

19. I’d speculate that there might be a decent minority of now-senior researchers who entered the field around 1995-2005, excited by the vision for nanotechnology laid out in Drexler’s works and the creation of the National Nanotechnology Initiative in the US. Perhaps these researchers nowadays have a vague sense for what Drexler’s ideas are (and probably consider APM-like visions for nanotechnology to be too far in the future to be worth thinking about). ↩︎

20. Michael Nielsen has written a very nice, relatively quick introduction to scanning tunnelling microscopy (scanning tunnelling microscopy is a type of scanning probe microscopy). ↩︎

21. My perception is that the hard path to APM is less promising than an alternative path called the “soft path” (see footnote 5). This perception mostly comes from what people around me seem to think, and those views in turn perhaps mostly come from Drexler’s view on this. I don’t have much of an inside view here myself; my impression is that deliberate hard path work has been occurring for many years (most notably at Zyvex) without much to show for it, but this seems like only weak evidence, partly because the level of investment has apparently been quite low. ↩︎

22. For a more thorough overview of recent (and less recent) R&D relevant for advanced nanotechnology, see Adam Marblestone’s Bottleneck analysis: positional chemistry. The most relevant sections are “Building blocks that emerged in the meantime”, and “Explicit work on positionally directed covalent chemistry”. (Note that the focus of the report is on a technology the author calls “positional chemistry”, which is different to advanced nanotechnology, but is closely related.) ↩︎

23. Although positional assembly with DNA suprastructures might soon be demonstrated, as a result of this grant, which seems to me to represent relevant progress. ↩︎

24. Of course, this lack of empirical evidence also pushes against high confidence that progress will be very fast if a large research effort emerges. But I expect people will already be inclined not to be confident that progress will be very fast if a large research effort emerges. ↩︎

25. To be clear, this number does not assume feasibility, unlike my median timeline estimate. ↩︎

26. As a somewhat independent data point, Daniel Eth, a former colleague of mine at the Future of Humanity Institute who has spent time thinking about APM, told me that he guesses a 1% probability for roughly the event “advanced nanotechnology arrives by 2040 and isn't preceded by advanced AI”, where "advanced AI" is, roughly "AGI or transformative AI or CAIS or some future AI that is a similarly huge deal" (note that AI that dramatically speeds up technological progress doesn't necessarily qualify as advanced AI). Daniel said, “I'd expect [the estimate would] move around a bit, but probably not more than one order of magnitude.” ↩︎

27. Here are some examples of work from these organisations:

↩︎
28. One potential objection might be that points 1 or 2 just stem from general uncertainty about advanced nanotechnology due to a relative lack of attention on the relevant questions; for example, you might think that if we tried a bit harder to reduce our uncertainty, we would likely end up finding that there’s a sufficiently low probability that advanced nanotechnology arrives in the next 20 years without TAI that this area doesn’t seem worth investigating. A counterargument would be that the above is really an argument about which projects to prioritise in this area, rather than an argument for not thinking about this area at all. Similar comments apply to the uncertainty that is driving point 1. Of course, this counterargument only works if you think points 1 and 2 are reasonable positions given our current state of knowledge. ↩︎

29. Note that an assumption underlying this argument is that there’s not much value in doing nanotechnology strategy work to understand or prepare for scenarios where TAI precedes the development of advanced nanotechnology, because TAI makes everything go crazy such that it’s hard to plan for after that point. However, if you do think such work might be valuable, the case for nanotechnology strategy research looks stronger, because there’s a broader range of future scenarios where this work is relevant. (As mentioned in the main text, it seems reasonable to me to think that work on nanotechnology strategy will be much less useful if advanced nanotechnology is preceded by TAI, but I’m not confident in this and my view feels quite unstable.) ↩︎

30. Although note that, for example, Drexler has written several books on topics within advanced nanotechnology and the Foresight Institute runs a programme promoting the development of APM, so discussion of advanced nanotechnology would not happen against a background of complete silence on the topic. ↩︎

31. This isn’t strictly true because some molecules might contain different isotopes; but it doesn’t seem likely that the presence of different isotopes would usually matter much, and rare isotopes could in principle be filtered out if necessary. ↩︎

32. Provided that there are no errors in the assembly process that lead to incorrect building block placement or bonding. Products could be checked for errors, and any errors could be corrected or the malformed products discarded, although it’s not clear to me how effective this would be in practice at eliminating errors. ↩︎

33. In Biological and Nanomechanical Systems: Contrasts in Evolutionary Capacity (1989), Drexler discusses how and why designed systems differ from evolved ones. These LessWrong posts also make relevant arguments, in the context of TAI: Building brain-inspired AGI is infinitely easier than understanding the brain and Birds, Brains, Planes, and AI: Against Appeals to the Complexity/Mysteriousness/Efficiency of the Brain. ↩︎

34. This 2016 Nature Methods article reports bacteria growing with a cell count doubling time of less than 10 minutes. I haven’t found numbers for the rate of total mass doubling, but I understand that cell mass remains roughly constant over successive generations, implying that total mass doubles at roughly the same rate as cell count. ↩︎

35. Relatedly, Robert Freitas and Ralph Merkle report in Kinematic Self-Replicating Machines that ribosomes can produce their own mass in proteins in ~5-12 minutes (Ctrl+F for “If the bacterial ribosome” on the linked page). ↩︎

36. Drexler describes something along these lines in The Stealth Threat: An Interview with K. Eric Drexler. ↩︎

37. Though it’s not completely clear to me how useful the “advanced nanotechnology” framing is here, as opposed to, say, something like “advanced biotechnology”. ↩︎

Discuss

### What Would It Cost to Build a World-Class Dredging Vessel in America?

May 2, 2022 - 20:47
Published on May 2, 2022 5:47 PM GMT

I'm doing some research into questions surrounding the Foreign Dredge Act of 1906, and thought I'd experiment by throwing this out there.

For context, the 31 biggest dredging vessels were not built in America, and thus cannot be used in America by law. We only have a small number of less capable vessels, and they often get redirected to short-term emergency tasks. This is preventing us from doing a bunch of very valuable things, like repairing or expanding ports, which end up taking much more time and money or not happening at all.

This podcast is recommended. You can find a transcript here. They claim that there's no way America will be able to have such capacity for at least decades. I want to verify that (and also check whether any other claims there don't ring true).

As an alternative to repealing the Dredge Act (which I'm exploring and planning to write about), another option would be to build world-class dredging vessels here in America, such that they could legally be used. Before assuming that this is impossible, and to have a straight answer: what would happen if someone with deep pockets tried to commission a world-class dredging ship that would qualify? Could it be done? Are there other insurmountable barriers? How much would it cost, and how much more would that be than building it elsewhere? How long would it take?

Discuss

### So has AI conquered Bridge?

May 2, 2022 - 19:59
Published on May 2, 2022 3:01 PM GMT

Bridge, like most card games, is a game of incomplete information. It is a game of many facets, most of which will have to remain unstated here. However, the calculation of probabilities, and the constant revision of probabilities during play, is a key feature. Much the same can be said of poker and of certain other card games. But bridge is the game I know and am still learning from, 45 years after I first got infected with the bridge bug.

We have AlphaGo and AlphaZero, but the wizards at DeepMind have not yet seen fit to release an AlphaBridge. Poker is played to a high expert standard by specialist AIs such as Pluribus, but bridge-playing computers to date remain well short of that. Those commonly available play parts of the game extremely well, usually making mechanistic decisions correctly, yet they also make quite surprising and often comical errors. It has been noted that bridge would seem to be an excellent training ground for AGI, in ways chess and go are not.

The French company NUKKAI has just created something of an earthquake by revealing a bridge-playing AI called Nook that, it is claimed, can beat the world’s best players. So has bridge finally been conquered? This post will show this is not yet the case. I hope also to give a flavour of the type of probabilistic problems that routinely need to be solved at the bridge table, by AI or human, where Nook may score over a human and why, and where its performance may be somewhat inflated due to some features specific to the training and test set-up. I’ve tried to write with the non-player in mind, and I proffer apologies in advance if I have failed to do that adequately.

NUKKAI have published on YouTube a 5-hour commentary on the entire event. To my shame my poor French renders me unable to follow it closely, and I’d welcome comments from French speakers who have followed it, especially as they may be able to back up or possibly even refute some of the comments I make. Commentaries in English are also starting to become available. Kit Woolsey, a well-known expert, is publishing in-depth analyses of the most interesting hands on BridgeWinners.

A word about the mechanics of the game. Please feel free to skip this section if you are familiar with it. The game is a trick-taking game played with a full deck of 52 cards, dealt equally between four players, who make up two partnerships sitting in opposition. There are two parts to each hand. The first part is the Auction, in which the pairs vie to decide the Contract: the number of tricks greater than six that one side or the other contracts to make with one suit chosen as trumps (playing with no trump suit is also an option). The side that wins the Auction are the Declaring side. The second part is the Play of the Hand. The player to the left of the Declarer (which of the pair on the Declaring side is the Declarer is decided by a slightly arcane method I won’t go into) leads a card, and the partner of the Declarer lays his hand down face upward and takes no further part in the hand. His cards are played, in correct rotation, by the Declarer. The other two players at the table (the Defenders) have to cooperate (within the rules of the game!) to take enough tricks to defeat the Contract. All three players are able to have sight of exactly half the cards: their own hand, and the one on the table (the Dummy). The usual rules for trick-taking games apply. Players must follow suit if they can. Aces are high. The highest card played in the suit led to the trick wins the trick, unless there is a trump suit and a player who cannot follow suit plays a trump to win the trick. The hand that wins the trick leads to the next one.

At the end of play of all 52 cards (13 tricks) the contract will either be satisfied or it is defeated. Scores are assigned and we move on to the next hand.

Firstly, let us note that the only part of the game Nook has been trained to do is play Declarer. No bidding, no defence. No collaboration with a partner required. So it is, at best, very good at only one third of the game. The set-up is the human or AI playing Declarer against two of the top current bridge-playing computers. It is an important point that there is always this constancy of silicon opposition, who might be expected to (and apparently do) always defend in a consistent and predictable manner. Eight experts each played 100 hands, so the full test comprised 800 hands. However, the auction and the contract were always the same (the contract was 3NT: nine tricks to be made with no trump suit), and there was essentially no information available in the auction to influence declarer play. So it would be very wrong to say the test evaluated AI skill over the full gamut of declarer play.

After the initial lead, it is normal for a Declarer to make a strategic plan. A good plan will optimise the likelihood of enough tricks being made to achieve the contract. Sometimes a good plan will be ultra cautious, guarding against unlikely distributions to ensure the contract is made. On other occasions, if the contract is ambitious, a risky plan may be required to have any chance of achieving the contract.

This part of the operation would seem to be relatively straightforward for a good AI to master. Finding a plan that works best against the largest number of distributions does not seem too dissimilar to the planning required in chess and go. Commentators have indeed identified cases where Nook’s apparent plan was superior to that of the human opponent.

After the initial lead by opponent/partner, 25 cards remain unseen, and their positions between the two other players often need to be inferred to achieve optimum play. There is often information available from the Auction that provides an immediate Bayesian update (although not in the test of Nook). Thereafter the play of any unseen card may allow another Bayesian update. If we don’t count the triviality of the last trick, this provides possibly 23 instances where a Bayesian update may be applied (and this doesn’t include any inferences to be drawn from the cards declarer chooses to play from dummy as well).

Each Bayesian update may require a change in the plan. This process of making inferences from opponents’ actions and gathering additional information, to adjust prior probabilities assigned to several different card distributions (we could also call them models), is called “Card Reading”, a term with somewhat mystical associations. When carried out successfully, and the actual distribution is found to match the predicted one, it does feel a bit like magic.

The most informative case is where a defender can’t follow suit. Immediately the distribution of that suit is known and the likely distributions of the other suits becomes altered. But more subtle inferences can also be derived from the nature of cards played to follow suit, or chosen as discards.

It is unlikely that all possible eventualities are taken account of in the original plan, the combinatorial explosion of possible distributions is too vast. So Nook most likely reassesses the plan after each card, similar to a human player.

At this stage let’s look at some simple situations where probabilistic reasoning can be applied and how an AI might need to reason. Here is quite a common type of position in a single suit.

Dummy:   AJ2

Declarer:  K103

Declarer wants to make 3 tricks. She could first play the K, and then lead the 3 to the J, winning if the Q is on the left (should the left-hand player play the Q, it will be topped by the A). Or she could lead the 3 to the A and, on the next trick, lead the 2 towards the 10. This wins if the Q is on the right. A priori it is a guess, a 50:50 proposition which play to choose. However, an observant player will generally have additional information that adjusts the probabilities one way or the other. This information can be gained from the Auction, or during play. For instance, declarer may know from the Auction that the left-hand player is highly likely to have 6 cards in another suit and, knowing the declaring partnership has 5 cards in that same suit, the right-hand player therefore has 2 of that suit. Left Hand, starting with thirteen cards, therefore has at the outset 7 “empty spaces” and Right Hand 11 “empty spaces” in which to put the Q. All other things being equal, the chance of the Q of this suit being on the Right is now favoured to the extent of 11/18.
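The "empty spaces" (vacant places) arithmetic above is simple enough to check directly; a quick sketch, where the 7 and 11 are the empty-space counts from the example:

```python
from math import comb

# Vacant places: Left Hand has 7 unknown slots, Right Hand has 11, and the Q
# is equally likely to occupy any of the 18.
left_spaces, right_spaces = 7, 11
p_q_right = right_spaces / (left_spaces + right_spaces)
print(p_q_right)  # 0.6111..., i.e. 11/18

# Cross-check by counting deals: of all ways to split the 18 unknown cards
# 7/11 between the hands, what fraction put the Q on the right?
total = comb(18, 7)        # choices of Left Hand's 7 cards
q_left = comb(17, 6)       # deals in which the Q is among Left's 7
print(1 - q_left / total)  # same answer, 11/18
```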

A good player will not be content with that. She will play other suits to test their distribution and the disposition of the high cards, with a view to learning more about the hand. Sometimes this will not be productive, but sometimes it will harden the odds, making it still more likely that the Q is on the Right. More rarely the picture will change completely, and the odds will start to favour the Q being on the Left. Sometimes certainty can be achieved, and the position of the key card identified with probability 1.

It would seem clear that an AI should be able to recalculate these probabilities faster and more accurately than a human can. Superiority in this part of the game would not be surprising. Whether Nook can plan the play to elicit as much information from the opposition as possible is an interesting question.

As far as I can ascertain, the computer defenders do play in a deterministic fashion: their manner of play of the low cards in any given situation will be consistent. There does appear to be some indication that Nook may indeed seek out additional information to a certain extent, relying on the deterministic play of the defenders to build up a true picture (human defenders would rarely be as accommodating). More on this later.

Here is an example that is sometimes quoted as classically Bayesian. It is somewhat infamous, in that the principle, the so-called "Principle of Restricted Choice", is in my experience not accepted (sometimes vehemently so) by some bridge players. Although unintuitive at first glance, the logic behind it is sound.

Dummy: A10432

Declarer:  K765

There are 4 cards missing in the suit. Suppose, on the first trick, you play the K and see Left Hand play the 8 (you play small from Dummy) and Right Hand play the Q. What play gives the best chance of not losing a trick in the suit?

There are two holdings consistent with Right Hand's play of the Q: the Q on its own (a singleton), or QJ doubleton. If the first holding is correct, you need to play a low card to the 10, finessing Left Hand's J. If the second is correct, you need to play to the A. The two holdings are roughly equal in likelihood, given no further information. So it must be the same as a toss of a coin, surely?

No, it isn't. With no other information available, the play to the 10 figures to win two-thirds of the time. The reason is that, with QJ, Right Hand has a choice of two equivalent plays on the first trick, there being no material effect on the outcome of playing the Q rather than the J. Consequently, the chance of the Q having come from the holding QJ is one half that of the Q being a singleton, where there is no choice of play.

This situation is exactly equivalent to the famous Monty Hall problem. The case where the prize is behind your door corresponds to Right Hand holding QJ: the host can choose which of the other two doors to open. The case where the prize is behind one of the other two doors corresponds to the singleton Q: the host has no choice of which door to open without revealing the prize. You should switch doors.
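The restricted-choice reasoning above can be written out as a two-line Bayes update. This is a sketch under the text's own assumptions: the two holdings start roughly equally likely, and a defender with QJ picks which honour to play at random.

```python
# Bayes update for the Principle of Restricted Choice.
# Prior: the two candidate holdings are taken as (roughly) equally likely.
prior = {"singleton_Q": 0.5, "QJ_doubleton": 0.5}
# Likelihood of actually seeing the Q played from each holding:
# forced from a singleton, a 50:50 choice from QJ.
likelihood = {"singleton_Q": 1.0, "QJ_doubleton": 0.5}

unnormalised = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnormalised.values())
posterior = {h: p / total for h, p in unnormalised.items()}

print(posterior)  # singleton_Q: 2/3, QJ_doubleton: 1/3 -> finesse the 10
```

The 2/3 posterior on the singleton is exactly the Monty Hall "switch" probability, with the defender's free choice from QJ playing the role of the host's free choice of door.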

The principle of restricted choice is best known through examples involving just the high cards, like the one above. In principle, however, it can be applied in some form in many situations. Asking as often as possible whether a play was forced, or was one of several equal choices, even regarding the smaller cards, should in theory allow a more accurate assessment of which distributions are likely. This requires great attention to detail, accurate sifting of signal from noise, and strong, quick mental calculation, and few players are capable of sustaining the effort to do this consistently for any great length of time. Having said that, very good players, through long experience, do almost unconsciously pick up on clues that would be insignificant to weaker players, and draw inferences from them.

There is a wrinkle worth mentioning in regard to this example. Suppose we have some extra information, namely that the agent on the right will always play the Q from QJ. There is no possible benefit to doing so in this situation; however, there is a common convention that, with two cards only, a defender will play the higher, so long as it doesn't cost, thereby giving information to their partner about their length in the suit. Players can be creatures of habit, and apply that convention even when there is no need. This changes the odds: the play of the 10 is now back to being close to a 50-50 choice. On the flip side, had Right Hand played the J rather than the Q, it would now be a near certainty that no other card accompanied it. Note that our consistent robot defenders are creatures of habit and are likely to always play the same card from QJ doubleton. They are also likely to (say) always play low from three small cards, play high from a doubleton, and so on, and thus signal the distribution of their cards accurately.
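The effect of this habit can be seen by rerunning the same Bayes calculation with a changed likelihood. Again a sketch under the assumptions stated in the text: equal priors, and a defender who now plays the Q from QJ every time.

```python
# Restricted-choice update against a defender who habitually plays the Q
# from QJ doubleton. The choice is no longer "restricted", and the
# inference collapses back to a coin toss.
prior = {"singleton_Q": 0.5, "QJ_doubleton": 0.5}
# The Q is now played with certainty from either holding.
likelihood = {"singleton_Q": 1.0, "QJ_doubleton": 1.0}

unnormalised = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnormalised.values())
posterior = {h: p / total for h, p in unnormalised.items()}

print(posterior["singleton_Q"])  # 0.5: the play of the 10 is back to 50-50
```

The flip side follows from the same update: against such a defender, the J is never played from QJ (likelihood 0), so a played J is near-certainly unaccompanied.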

This introduces the point that each opponent has different experience, different strengths and weaknesses, different quirks and habits. The best players (including AIs) need to take these into account where they can, to accurately infer the probabilities of various holdings from the opponents' play.

Nook may be able to gain a significant advantage in these types of situations by constantly eking out Bayesian inferences from the small-card plays, something even the best human players don't do consistently. Also, crucially, Nook will eke out significant information from the consistent play of the defenders. Human defenders, by contrast, do not in general play consistently, thereby generating noise, and human declarers do not generally learn the playing style of individual opponents in great detail, since in most competitions they will play against many opponents.

Comparative analysis of differential human/Nook strategies is still ongoing, but evidence is emerging that Nook's mastery may to a considerable extent be due to its exploitation of the behaviour of the defenders. Not only do they signal their distribution in a rigorous fashion, but they make errors of play in a consistent fashion, which may very well be exploitable. It looks like Nook has learnt very well how to exploit the foibles of its opposition.

Let’s look at another card layout.

Dummy: Q103

Declarer: A42

Declarer wants two tricks in the suit. A normal play here is to lead low from declarer's hand and play either the Q or the 10 when a low card is played on the Left. If the K and J are both on the Left, either will do. If the K and J are both on the Right, neither will work. If the K is on the Left and the J on the Right, it is necessary to put in the Q. If the J is on the Left and the K on the Right, it is necessary to put in the 10. A 50% proposition? No, not completely. A good player will usually put in the 10. The consideration is what the Left Hand player didn't do on this trick, which is to put in the K. With the K and only one other card, most average players would play the K, to make sure they make a trick with it instead of it being lost under the A. So this holding is now unlikely. With the K and 2 or more other cards it is more normal to play low, though some players might still find a reason to play the K. With the J and one or more other cards it would always be normal to play low in front of dummy. The consideration that the holding of Kx on the Left (where x denotes a small card) is unlikely is enough to tweak the probabilities and make the 10 the right play. (Note: before it is pointed out to me, yes, playing the A first before playing to the 10 would reveal immediately if Left Hand had Kx. In some cases this would be the right play. Here, let us assume declarer has other options for an extra trick and cannot afford to put all his eggs in one basket by risking the loss of two immediate tricks.)

This is, firstly, an example of drawing an inference from what hasn't happened (cf. Sherlock Holmes' dog that didn't bark in the night). This is, in my view, one of the more difficult rationality techniques. The best bridge players are adept at it; the rest of us find it hard to do.

It will be interesting to find out how Nook handles this type of situation. Does such inferential thinking get encoded naturally during the training regime?

Secondly, it is an example where the declarer has to take into account how another agent at the table is likely to play, given a certain holding. It is becoming clear that Nook has become adept at this, albeit only because it has trained against a single deterministic opponent.

A last point: this is also an example of how the probabilities can change depending on the agents at the table. This situation offers a very good player (or an exceptional AI) in the Left Hand seat an opportunity. With Kx she might very well quickly play low (taking time to make that decision is not an option; it signals where the K is). She knows that declarer doesn't have AJx, as otherwise he would likely be playing the Q or 10 towards the AJx to try to make three tricks if the K is on the other side. So playing low is not in fact likely to cost, and will very likely induce a missed guess, the 10 losing to the J. The payback occurs if declarer was dealt A9x, as then declarer is later likely to play from Dummy towards the 9 and lose a second trick to the lone K.

We have discussed how a human or AI can maximise their own performance. However, good players must also try to minimise the performance of the opposition, by restricting the information they themselves give out.

For example, there is a bridge aphorism: "play the card you are known to hold".

Dummy:    432

1098                    KJ7

Declarer:    AQ65

Declarer plays the 4 from dummy and, Right Hand playing the 7, puts in the Q, winning the trick. Now he cashes the A, and Right Hand drops the K, the card he is known to hold. Now declarer has no more information. He can play a third round of the suit to establish a winner, but he might suspect the suit is breaking badly. It is easier for declarer to place the cards if Right Hand plays the J rather than the K.

This we can call obfuscation. Taken to an extreme, obfuscation becomes deception, with the distinction, should we want one, that an obfuscation is generally a zero-cost play, whereas a deceptive play carries some jeopardy if the deception is challenged.

Here is an example of deception by a defender.

Dummy:   AQ109

Declarer:   32

It is normal here for declarer to play low to the 10 or 9 and, if it loses to the J, later play towards the Q or 9. There are two bites of the cherry at getting three tricks. On a good day, four tricks are possible. On this occasion, Right Hand wins the first trick with the K. Later, declarer plays confidently towards the 9 and is dismayed to see it lose to the J that was 'marked' as being in the opposite hand. Perhaps now the AQ are stranded without access and cannot make tricks. Or perhaps declarer has in the meantime burned his boats elsewhere and staked all on the expectation of three tricks in this suit.

This is a lovely coup to bring off, but it is a gamble. If declarer sees through the stratagem, you are likely to end up considerably worse off.

Opportunities for true deception are more commonly found in defence, but can also be found in declarer play. Poker-playing robots have been demonstrated to bluff effectively, so in theory an exceptional bridge-playing AI might also be able to play deceptively. However, the AI has to recognise the relatively rare situations where deception is the optimum tactic, and it is not clear how easy that is to train.

If there are poker- and bridge-playing robots that can deceive their human opponents, the question arises as to whether they are actually unaligned, and, if so, whether they are unaligned in a trivial sense or in a fundamental sense that may be important and useful to AI alignment researchers. This may be a well-trodden topic, for all I know, but if not, it is worthy of discussion.

It is time to summarise:

A) Nook has beaten eight out of eight human experts in one relatively specialised aspect of bridge play. Despite the small domain space and some additional caveats, this does look an impressive achievement. Nook's play was by no stretch of the imagination error-free, but it did on multiple occasions clearly locate the superior line of play.

B) Nook needs to plan a general line of play at the beginning that is flexible enough to take account of as many opposing distributions as possible. It is not surprising, given the state of the art, that Nook should be at least as good as a human at this.

C) Nook needs to make Bayesian inferences at each play of a card, reassess the plan, and change the plan if necessary. It seems likely Nook may have the advantage of going deeper than a human would in this regard, through always taking note of every card played and making Bayesian inferences not just from the play of the high cards but also of the low cards.

D) Consideration of the expertise and style of play of the defenders may change a chosen line of play. Here, it seems that Nook has been trained against the same bridge robots that it is tested against, and that these robots play in a deterministic fashion. If so, Nook will have learnt the robots' style of card play extremely well, along with the typical errors these computers make. It will therefore be able to read the defenders' cards very well, and may well have learnt to play in a way that promotes the errors peculiar to these opponents. This might give Nook a very significant yet rather artificial advantage over a human, who is trained against many different opponents. This advantage would seem to be enough to explain its success rate. It is this possibility that casts the most significant question mark over the world-beating claims made by NUKKAI. More analysis is needed here, and it will be interesting to hear how NUKKAI themselves reply to this point.

E) Ideally Nook should play in a way that disguises its own cards as much as possible from the opponents. It will be interesting to find out if it does play in this way. Can it also exploit situations where it is possible to deceive the defenders, albeit at risk, and gain from it? That would be an exceptionally impressive achievement in my view.

F) Bridge has not been conquered by AI ….. yet!

A final note to you non-players. Bridge is an entrancing and life-enhancing game, but it is rapidly becoming an old person's game, because the young now have many other gaming distractions to tempt them, and new young blood is now a trickle where it used to be a flood. Yet for the aspiring rationalist I can think of no better training regime for Bayesian thinking than the game of bridge. The examples I have presented represent a very small glimpse of the type of thinking required at the bridge table. If they have piqued your interest, perhaps think about finding some like-minded friends and starting a game, or looking up a local bridge teacher and signing on for a course.

Discuss

### Squires

2 мая, 2022 - 06:36
Published on May 2, 2022 3:36 AM GMT

Last year, johnswentworth posted The Apprentice Experiment. I tried it out. It was a disaster for me and most of my "apprentices". Since then I have been figuring out how to make it work. I currently have two minions. Thanks to them, I finally feel like a proper supervillain.

"How can I live without a human being to pour me drinks and fetch my dirty sandals?" I said.

"You got used to having a servant scarily fast," [Redacted] said.

―[April was weird]

For starters, I don't use the word "apprentice". The word "apprentice" is pretentious and inaccurate. A teenager's parents pay the master to accept their son[1] as an apprentice. Apprenticeship is like sending a kid to a vocational school. Would I accept apprentices? Possibly. But if you (or your parents) are not paying me for the privilege of my tutelage then you are not an apprentice.

Squires are different. Squires are not students. I don't try to teach my squires anything at all. I just give them the boring tedious work I don't want to do. Surprisingly (to me) they love it.

Teenagers want to be useful. But teenagers have limited skills. I put my squires to work doing the most complicated tasks they are capable of. Today, that meant sweeping the floor. Why is sweeping the floor fun? Because doing the most complicated work you are capable of is fun. Less complicated work is boring. More complicated work is frustrating.

"Fun" is evolution rewarding you for learning optimally.

My squires learn quickly. Last month a squire offered to spellcheck my blog posts. This month, his assignment is to negotiate a business deal with <wearables company> instead. Will he succeed? I don't know. I have better things to do than micromanage my minions. This brings me to the most important trait a squire can have.

The most important trait a squire can have is being "eager to work". A squire who is eager to work can usually be made useful.

Squiring is for energetic young people who have few skills and no money. If you have money then just hire me as a personal coach instead.

1. ^

Middle Ages Europe was a sexist place.

Discuss

### [Linkpost] A conceptual framework for consciousness

2 мая, 2022 - 04:05
Published on May 2, 2022 1:05 AM GMT

PNAS Paper from April 29th that makes strides to solve the hard problem of consciousness by dissolving it:

A conceptual framework for consciousness

Abstract:

This article argues that consciousness has a logically sound, explanatory framework, different from typical accounts that suffer from hidden mysticism. The article has three main parts. The first describes background principles concerning information processing in the brain, from which one can deduce a general, rational framework for explaining consciousness. The second part describes a specific theory that embodies those background principles, the Attention Schema Theory. In the past several years, a growing body of experimental evidence—behavioral evidence, brain imaging evidence, and computational modeling—has addressed aspects of the theory. The final part discusses the evolution of consciousness. By emphasizing the specific role of consciousness in cognition and behavior, the present approach leads to a proposed account of how consciousness may have evolved over millions of years, from fish to humans. The goal of this article is to present a comprehensive, overarching framework in which we can understand scientifically what consciousness is and what key adaptive roles it plays in brain function.

Key quotes:

Principle 1.

Information that comes out of a brain must have been in that brain. [...]

For example, if I believe, think, and claim that an apple is in front of me, then it is necessarily true that my brain contains information about that apple. Note, however, that an actual, physical apple is not necessary for me to think one is present. If no apple is present, I can still believe and insist that one is, although in that case I am evidently delusional or hallucinatory. In contrast, the information in my brain is necessary. Without that information, the belief, thought, and claim are impossible, no matter how many apples are actually present.

[...]

You believe you have consciousness because of information in your brain that depicts you as having it. [...] The existence of an actual feeling of consciousness inside you, associated with the color, is not necessary to explain your belief, certainty, and insistence that you have it. Instead, your belief and claim derive from information about conscious experience. If your brain did not have the information, then the belief and claim would be impossible, and you would not know what experience is, no matter how much conscious experience might or might not “really” be inside you.

[...]

Note that principle 1 does not deny the existence of conscious experience. It says that you believe, think, claim, insist, jump up and down, and swear that you have a conscious feeling inside you, because of specific information in your brain that builds up a picture of what the conscious feeling is. [...]

Principle 2.

The brain’s models are never accurate.

...and I think you can anticipate what follows. I recommend reading the article though.

Discuss

### What DALL-E 2 can and cannot do

2 мая, 2022 - 02:51
Published on May 1, 2022 11:51 PM GMT

I got access to DALL-E 2 earlier this week, and have spent the last few days (probably adding up to dozens of hours) playing with it, with the goal of mapping out its performance in various areas – and, of course, ending up with some epic art.

Below, I've compiled a list of observations made about DALL-E, along with examples. If you want to request art of a particular scene, or to see what a particular prompt does, feel free to comment with your requests.

DALL-E's strengths

Stock photography content

It's stunning at creating photorealistic content for anything that (this is my guess, at least) has a broad repertoire of online stock images – which is perhaps less interesting because if I wanted a stock photo of (rolls dice) a polar bear, Google Images already has me covered. DALL-E performs somewhat better at discrete objects and close-up photographs than at larger scenes, but it can do photographs of city skylines, or National Geographic-style nature scenes, tolerably well (just don't look too closely at the textures or detailing.) Some highlights:

• Clothing design: DALL-E has a reasonable if not perfect understanding of clothing styles, and especially for women's clothes and with the stylistic guidance of "displayed on a store mannequin" or "modeling photoshoot" etc, it can produce some gorgeous and creative outfits. It does especially plausible-looking wedding dresses – maybe because wedding dresses are especially consistent in aesthetic, and online photos of them are likely to be high quality?
a "toga style wedding dress, displayed on a store mannequin"
• Close-ups of cute animals. DALL-E can pull off scenes with several elements, and often produce something that I would buy was a real photo if I scrolled past it on Tumblr.
"kittens playing with yarn in a sunbeam"
• Close-ups of food. These can be a little more uncanny valley – and I don't know what's up with the apparent boiled eggs in there – but DALL-E absolutely has the plating style for high-end restaurants down.
"dessert special, award-winning chef five star restaurant, close-up photograph"
• Jewelry. DALL-E doesn't always follow the instructions of the prompt exactly (it seems to be randomizing whether the big pendant is amber or amethyst) but the details are generally convincing and the results are almost always really pretty.
"silver statement necklace with amethysts and an amber pendant, close-up photograph"

Pop culture and media

DALL-E "recognizes" a wide range of pop culture references, particularly for visual media (it's very solid on Disney princesses) or for literary works with film adaptations like Tolkien's LOTR. For almost all media that it recognizes at all, it can render them in almost arbitrary art styles.

"art nouveau stained glass window depicting Marvel's Captain America"
"Elsa from Frozen, cross-stitched sampler"
"Sesame Street, screenshots from the miyazaki anime movie"

[Tip: I find I get more reliably high-quality images from the prompt "X, screenshots from the Miyazaki anime movie" than just "in the style of anime",  I suspect because Miyazaki has a consistent style, whereas anime more broadly is probably pulling in a lot of poorer-quality anime art.]

Art style transfer

Some of the most impressively high-quality output involves specific artistic styles. DALL-E can do charcoal or pencil sketches, paintings in the style of various famous artists, and some weirder things like "medieval illuminated manuscripts".

"a monk riding a snail, medieval illuminated manuscript"

IMO it performs especially well with art styles like "impressionist watercolor painting" or "pencil sketch", that are a little more forgiving around imperfections in the details.

"A woman at a coffeeshop working on her laptop and wearing headphones, painting by Alphonse Mucha"
"a little girl and a puppy playing in a pile of autumn leaves, photorealistic charcoal sketch"

Creative digital art

DALL-E can (with the right prompts and some cherrypicking) pull off some absolutely gorgeous fantasy-esque art pieces. Some examples:

"a mermaid swimming underwater, photorealistic digital art"
"a woman knitting the Milky Way galaxy into a scarf, photorealistic digital art"

The output when putting in more abstract prompts (I've run a lot of "[song lyric or poetry line], digital art" requests) is hit-or-miss, but with patience and some trial and error, it can pull out some absolutely stunning – or deeply hilarious – artistic depictions of poetry or abstract concepts. I kind of like using it in this way because of the sheer variety; I never know where it's going to go with a prompt.

"an activist destroyed by facts and logic, digital art"
"if the lord won't send us water, well we'll get it from the devil, digital art"
"For you are made of nebulas and novas and night sky / You're made of memories you bury or live by, digital art" (lyric from Never Look Away by Vienna Teng)

The future of commercials

This might be just a me thing, but I love almost everything DALL-E does with the prompt "in the style of surrealism" – in particular, its surreal attempt at commercials or advertisements. If my online ads were 100% replaced by DALL-E art, I would probably click on at least 50% more of them.

I had been really excited about using DALL-E to make fan art of fiction that I or other people have written, and so I was somewhat disappointed at how much it struggles to do complex scenes according to spec. In particular, it still has a long way to go with:

Scenes with two characters

I'm not kidding. DALL-E does fine at giving one character a list of specific traits (though if you want pink hair, watch out, DALL-E might start spamming the entire image with pink objects). It can sometimes handle multiple generic people in a crowd scene, though it quickly forgets how faces work. However, it finds it very challenging to keep track of which traits ought to belong to a specific Character A versus a different specific Character B, beyond a very basic minimum like "a man and a woman."

The above is one iteration of a scene I was very motivated to figure out how to depict, as a fan art of my Valdemar rationalfic. DALL-E can handle two people, check, and a room with a window and at least one of a bed or chair, but it's lost when it comes to remembering which combination of age/gender/hair color is in what location.

"a young dark-haired boy resting in bed, and a grey-haired older woman sitting in a chair beside the bed underneath a window with sun streaming through, Pixar style digital art"

Even in cases where the two characters are pop culture references that I've already been able to confirm the model "knows" separately – for example, Captain America and Iron Man – it can't seem to help blending them together. It's as though the model has "two characters" and then separately "a list of traits" (user-specified or just implicit in the training data), and reassigns the traits mostly at random.

"Captain America and Iron Man standing side by side" which is which????

Foreground and background

A good example of this: someone on Twitter had commented that they couldn't get DALL-E to provide them with "Two dogs dressed like roman soldiers on a pirate ship looking at New York City through a spyglass". I took this as a CHALLENGE and spent half an hour trying; I, too, could not get DALL-E to output this, and ended up needing to choose between "NYC and a pirate ship" or "dogs in Roman soldier uniforms with spyglasses".

DALL-E can do scenes with generic backgrounds (a city, bookshelves in a library, a landscape) but even then, if that's not the main focus of the image then the fine details tend to get pretty scrambled.

Novel objects, or nonstandard usages

Objects that are not something it already "recognizes." DALL-E knows what a chair is. It can give you something that is recognizably a chair in several dozen different art mediums. It could not with any amount of coaxing produce an "Otto bicycle", which my friend specifically wanted for her book cover. Its failed attempts were both hilarious and concerning.

prompt was something like "a little girl with dark curly hair riding down a barren hill on a magical rickshaw with enormous bicycle wheels, in the style of Bill Watterson"
An actual Otto bicycle, per Google Images

Objects used in nonstandard ways. It seems to slide back toward some kind of ~prior; when I asked it for a dress made of Kermit plushies displayed on a store mannequin, it repeatedly gave me a Kermit plushie wearing a dress.

"Dress made out of Kermit plushies, displayed on a store mannequin"

DALL-E generally seems to have extremely strong priors in a few areas, which end up being almost impossible to shift. I spent at least half an hour trying to convince it to give me digital art of a woman whose eyes were full of stars (no, not the rest of her, not the background scenery either, just her eyes...) and the closest DALL-E ever got was this.

I wanted: the Star-Eyed Goddess
I got: the goddess-eyed goddess of recursion

Spelling

DALL-E can't spell. It really really cannot spell. It will occasionally spell a word correctly by utter coincidence. (Okay, fine, it can consistently spell "STOP" as long as it's written on a stop sign.)

It does mostly produce recognizable English letters (and recognizable attempts at Chinese calligraphy in other instances), and letter order that is closer to English spelling than to a random draw from a bag of Scrabble letters, so I would guess that even given the new model structure that makes DALL-E 2 worse than the first DALL-E, just scaling it up some would eventually let it crack spelling.

At least sometimes its inability to spell results in unintentionally hilarious memes?

EmeRAGEencey!

Realistic human faces

My understanding is that the face model limitation may have been deliberate to avoid deepfakes of celebrities, etc. Interestingly, DALL-E can nonetheless at least sometimes do perfectly reasonable faces, either as photographs or in various art styles, if they're the central element of a scene. (And it keeps giving me photorealistic faces as a component of images where I wasn't even asking for that, meaning that per the terms and conditions I can't share those images publicly.)

Even more interestingly, it seems to specifically alter the appearance of actors even when it clearly "knows" a particular movie or TV show. I asked it for "screenshots from the second season of Firefly", and they were very recognizably screenshots from Firefly in terms of lighting, ambiance, scenery etc, with an actor who looked almost like Nathan Fillion – as though cast in a remake that was trying to get it fairly similar – and who looked consistently the same across all 10 images, but was definitely a different person.

There are a couple of specific cases where DALL-E seems to "remember" how human hands work. The ones I've found so far mostly involve a character doing some standard activity using their hands, like "playing a musical instrument." Below, I was trying to depict a character from A Song For Two Voices who's a Bard; this round came out shockingly good in a number of ways, but the hands particularly surprised me.

Limitations of the "edit" functionality

DALL-E 2 offers an edit functionality – if you mostly like an image except for one detail, you can highlight an area of it with a cursor, and change the full description as applicable in order to tell it how to modify the selected region.

It sometimes works - this gorgeous dress (didn't save the prompt, sorry) originally had no top, and the edit function successfully added one without changing the rest too much.

This is how people will dress in the glorious transhumanist future.

It often appears to do nothing. It occasionally full-on panics and does… whatever this is.

I was just trying to give the figure short hair!

There's also a "variations" functionality that lets you select the best image given by a prompt and generate near neighbors of it, but in my experience so far the variations are almost invariably a worse fit for the original prompt, and very rarely better on specific details (like faces) that I might want to fix.

Some art style observations

DALL-E doesn't seem to hold a sharp delineation between style and content; in other words, adding stylistic prompts actively changes some of what I would consider to be content.

For example, asking for a coffeeshop scene as painted by Alphonse Mucha puts the woman in a long flowing period-style dress, like in this reference painting, and gives us a "coffeeshop" that looks a lot to me like a lady's parlor; in comparison, the Miyazaki anime version mostly has the character in a casual sweatshirt. This makes sense given the way the model was trained; background details are going to be systematically different between Art Nouveau paintings and anime movies.

"A woman at a coffeeshop working on her laptop and wearing headphones, painting by Alphonse Mucha""A woman at a coffeeshop working on her laptop and wearing headphones, screenshots from the miyazaki anime movie"

DALL-E is often sensitive to exact wording, and in particular it's fascinating how "in the style of x" often gets very different results from "screenshot from an x movie". I'm guessing that in the Pixar case, generic "Pixar style" might capture training data from Pixar shorts or illustrations that aren't in their standard recognizable movie style. (Also, sometimes if asked for "anime" it gives me content that either looks like 3D rendered video game cutscenes, or occasionally what I assume is meant to be people at an anime con in cosplay.)

"A woman at a coffeeshop working on her laptop and wearing headphones, screenshots from the Pixar movie""A woman at a coffeeshop working on her laptop and wearing headphones, in the style of Pixar"Conclusions

How smart is DALL-E?

I would give it an excellent grade in recognizing objects, and most of the time it has a pretty good sense of their purpose and expected context. If I give it just the prompt "a box, a chair, a computer, a ceiling fan, a lamp, a rug, a window, a desk" with no other specification, it consistently includes at least 7 of the 8 requested objects, and places them in reasonable relation to each other – and in a room with walls and a floor, which I did not explicitly ask for. This "understanding" of objects is a lot of what makes DALL-E so easy to work with, and in some sense seems more impressive than a perfect art style.

The biggest thing I've noticed that looks like a ~conceptual limitation in the model is its inability to consistently track two different characters, unless they differ on exactly one trait (male and female, adult and child, red hair and blue hair, etc) – in which case the model could be getting this right if all it's doing is randomizing the traits in its bucket between the characters. It seems to have a similar issue with two non-person objects of the same type, like chairs, though I've explored this less.

It often applies color and texture styling to parts of the image other than the ones specified in the prompt; if you ask for a girl with pink hair, it's likely to make the walls or her clothes pink, and it's given me several Rapunzels wearing a gown apparently made of hair. (Not to mention the time it was confused about whether, in "Goldilocks and the three bears", Goldilocks was also supposed to be a bear.)

The deficits with the "edit" mode and "variations" mode also seem to me like they reflect the model failing to neatly track a set of objects-with-assigned-traits. It reliably holds the non-highlighted areas of the image constant and only modifies the selected part, but the modifications often seem like they're pulling in context from the entire prompt – for example, when I took one of my room-with-objects images and tried to select the computer and change it to "a computer levitating in midair", DALL-E gave me a levitating fan and a levitating box instead.

Working with DALL-E definitely still feels like attempting to communicate with some kind of alien entity that doesn't quite reason in the same ontology as humans, even if it theoretically understands the English language. There are concepts it appears to "understand" in natural language without difficulty – including prompts like "advertising poster for the new Marvel's Avengers movie, as a Miyazaki anime, in the style of an Instagram inspirational moodboard", which would take so long to explain to aliens, or even just to a human from 1900. And yet, you try to explain what an Otto bicycle is – something which I'm pretty sure a human six-year-old could draw if given a verbal description – and the conceptual gulf is impossible to cross.

"advertising poster for the new Marvel's Avengers movie, as a Miyazaki anime, in the style of an Instagram inspirational moodboard"

Discuss

### ELK shaving

May 2, 2022 - 01:51
Published on May 1, 2022 9:05 PM GMT

> Paul Christiano's incredibly complicated schemes have no chance of working in real life before DeepMind destroys the world.
>
> Eliezer in Death With Dignity

Eliciting Latent Knowledge reads to me as an incredibly narrow slice in reasoning space, a hyperbolically branching philosophical rabbit hole of caveats.

For example, this paragraph on page 7 translates as:

If you can ask the AI whether it is telling the truth, and it answers "no", then you know it is lying.

But how is trusting this answer different from just trusting the AI not to deceive you as a whole?

A hundred pages of an elaborate system with competing actors playing games of causal diagrams trying to solve for the worst case is exciting ✨ precisely because it allows one to "make progress" and have incredibly nuanced discussions (ELK shaving [1]) while failing to address the core AI safety concern:

if AI is sufficiently smart, it can do absolutely whatever it wants

– fundamentally ignoring whatever clever constraints one might come up with.

I am confused about why people might be "very optimistic" about ELK; I hope I am wrong.

1. ^ “Yak shaving” means performing a seemingly endless series of small tasks that must be completed before the next step in the project can move forward. Elks are kinda like yaks.

Discuss

### Is it desirable for the first AGI to be conscious?

May 2, 2022 - 00:29
Published on May 1, 2022 9:29 PM GMT

• It seems that consciousness is one of the essential things necessary to characterize the moral status of an agent.
• It seems that we have very little chance of solving the AI safety problem.
• We don't know whether the first AGI will be conscious.

Since we are in a certain sense doomed, the future consists of the following two cases:

• Either the first AGI is not sentient and never becomes sentient,
• Or the first AGI acquires sentience at a given moment. In that case:
• Either the valence of this AGI is positive,
• Or the valence of this AGI is negative,
• Or the valence is mixed,
• Or the AGI reprograms itself to experience a positive valence (instrumental convergence if its well-being is included in the utility function?).

I have no idea which scenario is the most desirable. The uncertainty is multiplied by considering the AGI's superior capabilities, and its ability to multiply. So perhaps we could see it as a utility monster. Therefore, the following questions seem very important:

(1) What do you think is the most likely scenario and (2) the most desirable scenario?

(3|2) Conditional on the scenario that seems most desirable to you, is there a way to steer the future in that direction?

Discuss