LessWrong.com News

A community blog devoted to refining the art of rationality

More Dakka for Coronavirus: We need immediate human trials of many vaccine-candidates and simultaneous manufacturing of all of them

Published on March 13, 2020 1:35 PM GMT

Our best chance to fight CV is to create a vaccine as soon as possible. The irony is that we probably already have a vaccine, but we can’t prove that it works until animal safety tests and stages 1 and 2 of clinical trials are complete, which will take 12-18 months at least. How could we accelerate the creation of the vaccine?

There are several ideas, mostly inspired by research on anti-aging drugs, which suffers from the same problem: the need for very long clinical trials. See the More Dakka post by Sarah Constantin.

  1. Immediate human trials. If we have the vaccine in 3 months instead of 18, we will save millions of lives. So, based on trolley-problem logic, we may risk the health of a few thousand people to achieve this goal, especially if they are volunteers. Thus we need to start human tests of the vaccine candidates immediately, even before the end of animal trials, which are still needed. Here we gain acceleration by performing in parallel actions which are typically done sequentially.
  2. Test safety and efficacy simultaneously. We should also combine stage 1 and stage 2 of testing, that is, safety and efficacy, by giving the vaccine to people who are already under potential exposure to CV, like nurses or elderly people in nursing homes.
  3. Test on large groups. We should test a vaccine on a large group of people, like 10,000, so any finding will quickly reach statistical significance. If the number of infections declines relative to the control group, we could see it within one month.
  4. Test all vaccine candidates. Everything said in 1-3 should be done with each of the dozen vaccine candidates currently under development. As a result, in one month we could know which vaccine candidate is the strongest and safest.
  5. Manufacture in advance. Simultaneously, we should start large-scale production of all vaccine candidates, if possible. After the best (maybe 3 or 5) vaccine candidates are validated, the stockpile of failed candidates is destroyed, and the best vaccines are delivered to the population. This will help mitigate production delays.
  6. Give people different best vaccines. There could still be long-term detrimental effects of some of the vaccines, so it may be better not to give everybody just one best vaccine - what if it makes everybody sterile in 1 year?
  7. Combine best vaccines. To increase protection, we could give each person a combination of several (but not all, to preserve point 6) of the best vaccines. Here I assume that detrimental interactions between vaccines are typically unlikely, but more technical analysis is needed.
  8. Establish biomarkers of a good vaccine (e.g. antibodies). Biomarkers are important to check the efficacy of a clinical trial before the final outcome is known.
  9. Try other approaches, like DRACO, distancing, coconut oil, large doses of vitamin C, etc., and test them in the same accelerated way.
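As a rough sanity check on point 3, a standard two-proportion sample-size calculation suggests the 10,000-person scale is about right. This is a minimal sketch: the 1% monthly infection rate in the control group and the 50% vaccine efficacy are illustrative assumptions, not figures from the post.

```python
import math

def required_n_per_arm(p_control, p_vaccine, z_alpha=1.96, z_beta=0.84):
    """Normal-approximation sample size per arm for a two-proportion test
    (two-sided alpha = 0.05, power = 0.8)."""
    p_bar = (p_control + p_vaccine) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p_control * (1 - p_control)
                                      + p_vaccine * (1 - p_vaccine))) ** 2
    return math.ceil(numerator / (p_control - p_vaccine) ** 2)

# Illustrative: 1% of controls infected in a month, vaccine halves that.
# Roughly 4,700 per arm, i.e. close to 10,000 people across both arms.
print(required_n_per_arm(0.01, 0.005))
```

With these (assumed) numbers, a difference would indeed be detectable in about a month at the 10,000-person scale the post proposes.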

Several human trials have already started: in China, military volunteers are testing an experimental vaccine, and there are clinical trials of the Moderna vaccine in the US.


Adaptive Immune System Aging

Published on March 13, 2020 3:47 AM GMT

The human adaptive immune system is the “smart” part of the human immune system, the part which learns to recognize specific pathogens, allowing for immunity to e.g. chicken pox. For our current purposes, the key players are T-cells. T-cells start out “naive” and eventually learn to recognize specific antigens, becoming “memory” T-cells. The aged immune system is characterized by a larger fraction of memory relative to naive T-cells (without dramatic change in overall counts). This makes the elderly immune system slower to adapt to new pathogens.

This post is mainly about why the naive:memory T-cell ratio shifts with age, how to undo that shift, and some speculation about implications and applications.

A natural hypothesis (frequently asserted in the literature): the shift toward memory T-cells is driven by slower production of new (naive) T-cells. The T-cells themselves maintain overall cell count by living longer, resulting in a larger proportion of older (memory) cells.
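The hypothesis can be put in toy-model form: let thymic output of naive cells decline with age while conversion to memory cells continues, and the naive fraction falls. This is a minimal sketch; all rates and initial counts are illustrative assumptions, not fitted to data.

```python
def simulate(years, p0=100.0, decline=0.03, c=0.05, d=0.05, dt=0.01):
    """Euler-step toy model of T-cell pools (units: cells, years).
    Naive cells are produced by the thymus at a rate that decays with age,
    convert to memory cells at rate c; memory cells die at rate d."""
    naive, memory = 2000.0, 0.0
    t = 0.0
    while t < years:
        production = p0 * (1 - decline) ** t  # declining thymic output
        naive += (production - c * naive) * dt
        memory += (c * naive - d * memory) * dt
        t += dt
    return naive, memory

young = simulate(20)
old = simulate(70)
# Under declining thymic output, the naive fraction shrinks with age.
print(young[0] / sum(young), old[0] / sum(old))
```

The point of the sketch is only directional: with production falling and conversion continuing, the naive:memory ratio shifts toward memory, matching the pattern described above.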

The interesting part: why would the production of new T-cells fall with age?

Turns out there’s an obvious culprit: the thymus. The thymus is the last stop in the production line for new T-cells. It provides a sort of boot camp, training the T-cells to distinguish “self” (your own cells) from “other” (pathogens) using a whole battery of tricks. T-cells which make it through become full-time members of the naive T-cell reserve, and go on to police the body.

With age, the thymus shrinks away almost entirely (figure omitted; source: PhysAging). This is called “involution” of the thymus.

Many organs shrink with age, but the thymus is among the most dramatic. Unlike most age-related loss, it starts even before development is complete - the thymus shrinks measurably between day zero and a child’s first birthday. And it keeps on shrinking, at a steady rate, throughout childhood and adult life. The extremely early start of thymic involution suggests it’s more a developmental phenomenon than an age-related phenomenon - perhaps an appropriate hormonal mix could undo thymic involution?

Turns out, castration of aged mice (18-24 mo) leads to complete restoration of the thymus in about 2 weeks. The entire organ completely regrows, and the balance of naive to memory T-cells returns to the level seen in young mice. (Replicated here.) This is pretty dramatic evidence that:

  • The thymus only generates/regenerates in the absence of sex hormones
  • The age-related shift in naive:memory T-cell ratio is driven primarily by thymic involution

Particularly intriguing: we already have chemical castration methods, which are generally considered reversible. And it only took two weeks for the mice to regrow the whole thymus. At this point we’re speculating, but assuming chemical castration also works and it translates to humans and the thymus doesn’t rapidly re-involute after ceasing the chemical castration… that sounds pretty promising as an avenue to fixing age-related adaptive immune system decline in humans.

As long as we’re speculating, let’s speculate hard. Immunotherapy is the hot new thing in cancer these days - apparently T-cells in young people remove precancerous cells and attack tumors, but in old people that doesn’t happen as reliably. So… have there been any studies on how castration affects cancer? For starters, chemical castration is already a widely-used treatment for both prostate and breast cancer. It works. But those are prostate and breast; they’re sex organs, which we’d expect to atrophy in the absence of sex hormones. I don’t know of any studies on the effects of chemical castration on other types of cancer in humans.

In rats, however, at least one century-old study finds that castration prevents age-related cancer - and quite dramatically so. Castrated rats’ rate of resistance to an implanted tumor was ~50%, vs ~5% for controls. (This study finds a similar result in rabbits.) That old rat study cites a few others with mutually-conflicting results, and proposes that the age of the rats used explains it all: investigators who use young rats find that castration has little-to-no effect on resistance to an implanted tumor. Exactly what we’d expect if it’s all mediated by thymic involution & regrowth.

There are a lot of questions here. Does chemical castration have similar effects to surgical castration on thymic regrowth in mice? Does chemical castration result in regrowth of the thymus in humans? (Several states/countries require chemical castration as a condition of parole for certain sex offenders, and it’s also used for prostate and breast cancer, so it should be possible to find a few old people using it and see whether their thymus has regrown.) Does the thymus rapidly re-involute after administration of chemical castration ceases? Does chemical castration work as a treatment for cancers besides prostate and breast? Is the effectiveness of chemical castration against cancer age-dependent? Can temporary administration of chemical castration prevent cancer for a long period of time?

Turning away from applications and back to gears, there are also some key questions around thymic involution itself. The individual cells of the thymus don't have unusually slow turnover; if the thymic cell count is decreasing over time, then either the rate of production is decreasing or the breakdown rate is increasing. Either way, there has to be some upstream cause. Whatever that cause is, it probably isn't the same upstream cause as most age-related problems - thymic involution doesn't follow the usual pattern of no noticeable problems during development, slow loss of performance in middle age, then accelerating failure in old age.

I’d be excited to see more work along these lines and/or references to relevant studies.


Crisis and opportunity during coronavirus

Published on March 12, 2020 8:20 PM GMT

Note: please put on your own oxygen mask first. Don’t engage with this post if you haven’t taken appropriate measures to prepare yourself and your family; and plausibly don’t engage if you haven’t taken measures to ensure you can do so while staying stable and grounded.

We are facing a time of global crisis. But in spite of the unfolding tragedy – or perhaps because of it – this will also be a time of great opportunity. If you have the skills, slack, and willingness to act, it might make sense to start looking for ways to contribute (regardless of whether you’re seeking personal gain or altruistic benefit).

Why does this seem like a good opportunity?

There’s a question of whether the current situation should change your overall cause-prioritisation, in determining what’s most useful to work on over a >1 year time-scale.

This depends on where you started in your beliefs. For many readers of this post, it likely does not provide any high-level updates, as we already believed that pandemics were a major risk for which the world was underprepared, and that the major institutions in charge were dysfunctional. (Nonetheless, I am learning a massive amount by living through a time of global crisis when I have the epistemic ability and agency to understand what’s happening and take action.)

Beyond that, there’s the question of whether this is a window of opportunity. Even if your long-term goals remain after this pandemic, are there actions which will have an extraordinarily high leverage now, compared to other times?

I think there are a few reasons for thinking so.

  • Underpreparation. The world wasn’t prepared. Everyone is scrambling to figure things out and there’s too much for anyone to do. Hundreds of millions of people are suddenly changing their lives. The same goes for hundreds of thousands of companies and hundreds of governments. Most of them have no routines or experience in handling situations like these, which means they’ll be facing problems they have never faced before.
  • Exponential growth. Each infected person can be responsible for thousands of downstream infections, so the impact of behaviour change has a large multiplier. (Though this is modulo some uncertainty about counterfactual infections which I’m unsure how to think about).
  • Scale. The pandemic might grow to directly affect hundreds of millions of people, and it will indirectly affect billions. It is also a memetic pandemic (it probably consumes >90% of my FB and Twitter feeds, and >70% of my conversations). People are actively trying to find information, products, and similar.
  • Direct exposure and quick feedback loops. Most startups die because no one wants what they’re building. A common warning sign is that the founders aren’t themselves users of the product. But in the coming months, you’ll have to solve lots of problems for yourself, and chances are high that others might benefit from your solution (e.g. many spreadsheets and documents that went viral were initially just a single person trying to figure out how they should prepare, when their company should work remotely, etc.). Even if you’re solving problems for others, you’ll quickly learn whether there’s any demand.
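The multiplier in the exponential-growth point can be sketched numerically. This is a minimal sketch: the reproduction numbers and the generation count are illustrative assumptions, not estimates from the post.

```python
def downstream_infections(r, generations):
    """Total infections seeded by one case over n transmission generations,
    assuming each infection produces r new infections per generation."""
    return sum(r ** g for g in range(1, generations + 1))

# Illustrative: r = 2 vs. r = 1.5 over 10 generations.
print(downstream_infections(2.0, 10))  # 2 + 4 + ... + 1024 = 2046
print(downstream_infections(1.5, 10))  # ~170
```

Even a modest reduction in the per-generation reproduction number cuts the downstream total by more than an order of magnitude, which is why individual behaviour change carries such a large multiplier.
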
How can you contribute?

I want to distinguish two kinds of windows of opportunities. For lack of a better term, I’ll call them “social” and “causal”.

Social windows of opportunity. Suddenly people are willing to listen to new advice, and to consider different actions than they previously would. Hence there are attempts to get people to sign social pledges to self-quarantine, and epidemiologists are signing open letters to tech giants. More nefariously, lawmakers are smuggling their pet policy proposals into things that look like corona response measures. When all is said and done, it seems plausible the Overton window for biorisk policy will have shifted massively, and that might bring with it other surprising opportunities as well.

Causal windows of opportunity. There are also many new problems to be solved, where you can build a tool or other solution that actually changes the world in a mechanistic way, and which isn't primarily about convincing other people of things.

For example:

The Coronavirus Tech Handbook is an excellent resource summarising what people are building to fight the outbreak. Many projects are urgently looking for collaborators.

Even if you don't have tech skills, there are other ways of finding great opportunity in times of crisis.

Some of history’s most successful trades (1, 2) occurred during crises. There will likely be many financial opportunities during this crisis as well. (LessWrong user Wei Dai posted about how he successfully shorted the S&P 500 a few weeks back, and saw 700% returns already before the crashes of the recent week.) (This is not financial advice, and if you have no trading experience, now might be a particularly bad time to start.)

We have also seen a massive failure of responsible institutions to respond appropriately and provide reliable information. This means there's a shortage and need for reliable research and advice. This situation requires thinking for ourselves.

Due to the existence of niche online communities doing this, I started seriously thinking and preparing when there were 2 cases in my home country. A week and a half later I went home to my family and helped them prepare, as the only mask-wearing person, in an empty row at the back of an otherwise full plane. There were 20 confirmed cases. My mom initially yelled at me and felt embarrassed, since none of her friends were taking action, and asked why the authorities didn't say much. I left 5 days later. The case count had grown exponentially to >400. A friend in med school told me to "wash my hands and don't panic". I left from an airport where staff wore neither gloves nor masks, and shorted the local stock market. As I'm writing this two days later, the case count is almost 700.

The jury is still out, but sadly this seems to be a time when it's critically important to be able to take your beliefs seriously even when they go much further than official advice and mainstream behaviour. There will likely be many more opportunities over the coming months where good judgement and independent research can make an important difference. The LessWrong page of posts tagged coronavirus is one place to find and contribute to open questions.

Addendum on profiting from outbreaks

I strongly believe that traders and entrepreneurs who try to profit during this crisis are not immoral. Rather, they are incredibly important. All this “flatten the curve” business is about smoothing out demand peaks over time. And financial markets (futures markets in particular) are one of the key technologies our society has for coordinating to efficiently allocate resources across time. For example, it might have been massively beneficial if someone with foresight had stockpiled massive amounts of medical ventilators months back and thereby caused suppliers to increase production (it seems plausible this might have been worth it even if they wouldn’t have sold those stockpiled ventilators at a massive markup). The actual stockpiling of masks that happened might also have been beneficial for this reason (but I am highly uncertain about this claim and wouldn’t bet heavily on it).

More generally, the coming months will bring a massive wealth transfer, in various ways, to the prepared from the unprepared (or those unable to prepare, due to lack of money, knowledge, or some other key prerequisite). I don’t know what the implications of this will be. But once again, it might be worth a few hours of your time thinking about what opportunities it might generate.



The Critical COVID-19 Infections Are About To Occur: It's Time To Stay Home [crosspost]

Published on March 12, 2020 6:50 PM GMT

Well, I don't know how many of you reading this have been following the spread of Coronavirus. In the spirit of being another person who is noticing the smoke in the room, I wrote this last night and started a blog to do it.

Tl;dr: It is reasonable to expect that sometime between March 16th and March 18th, our hospitals will be slammed with cases of COVID-19 needing urgent attention, and war-time triage will be applied - if not immediately, then at some point soon after. Your value will be your youth. If you have family members who are at risk (like parents), they should consider the fact that if they leave the house and get sick, they will be going into a healthcare system that cannot give them adequate care, and they will have an increased chance of dying. You should feel free to insist that they stay home for the next 5 days, preferably a week, if you have the sort of relationship where you can express that sentiment. Once it begins, there will be no need to motivate them to stay.

It could happen sooner. It could happen later. It will happen. Hospitals in your community will be infected and the entire hospital will, essentially, have it.

Everyone, for some value of "everyone", is going to get this. 30% of the population is what they told legislators today.

I feel a little silly writing this out because I'm typical minding: I've been paying attention to this virus for a long time, and I assume everyone is where I'm at.

This is where I'm at. At home. Where I will stay. For as long as possible.

Do your part, motherfuckers, it's a pandemic on and the math to track the impact point says there is still time to make a difference: the infections that comprise the big surge are about to happen. Well. Started happening today. Tomorrow there's just gonna be more.

Original post to follow:

It's Time To Stay Home

If you have the disease and don't know it yet, now is a very very bad time to spread it inadvertently. The people you infect will, if they must go to the hospital, be a part of the epidemic curve that overwhelms the hospital.

If you don't have the disease, now is a very very bad time to catch it, because you will be part of the big wave.

TL;DR: The transmissions that will swamp our hospital systems are about to begin. STAY HOME. For as long as you can.

Many have seen this time-adjusted graph of the numbers, with the time shifted so it's clear just how similar the curves look.

I'm posting this on March 11th. The incubation period is around 5 days.

The reports from Italy about their overwhelmed hospitals began March 8th; some of their hospitals were already swamped at that point.

If we were 11.5 days behind Italy on March 9th, but Italian hospitals hit saturation, oh, call it March 6th, then our hospitals will start suffering on March 17th/18th. 5 days to develop symptoms, 24 hours for them to get bad enough to go to the hospital.
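The date arithmetic above can be checked directly, using only the numbers already given in the post:

```python
from datetime import datetime, timedelta

# "Call it March 6th" for Italian hospital saturation; we're 11.5 days behind.
italy_saturation = datetime(2020, 3, 6)
surge_start = italy_saturation + timedelta(days=11.5)
print(surge_start)  # 2020-03-17 12:00 -> "March 17th/18th"

# Cross-check: infection on March 11th, ~5 days to develop symptoms,
# ~24 hours for them to get bad enough to go to the hospital.
infection = datetime(2020, 3, 11)
hospital_date = infection + timedelta(days=5) + timedelta(days=1)
print(hospital_date)  # 2020-03-17
```

Both routes land on March 17th, consistent with the surge window claimed above.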

The Critical Window Is Here

Starting today, March 11th, if you leave the house, you are taking a significantly higher risk of being in a hospital during the surge.

Let's be realistic about how often you touch your face and doorknobs: if you leave your house now and develop symptoms tomorrow, anyone you infect today who needs medical care will be in the hospital during the surge.

I'm not leaving my house.


Why isn't increasing ventilation of public spaces part of the best-practice response to the Coronavirus?

Published on March 12, 2020 10:40 AM GMT

It's my impression that there's some spread via aerosol in public spaces like buses and trains. By increasing ventilation in those spaces by opening more windows I find it plausible that we could reduce that transmission.

Why aren't health orgs pushing for increasing ventilation of public spaces?


I'm leaving AI alignment – you better stay

Published on March 12, 2020 5:58 AM GMT

This diagram summarizes the requirements for independent AI alignment research and how they are connected.

In this post I'll outline my four-year-long attempt at becoming an AI alignment researcher. It's an ‘I did X [including what I did wrong], and here's how it went’ post (see also jefftk's More writeups!). I'm not complaining about how people treated me – they treated me well. And I'm not trying to convince you to abandon AI alignment research – you shouldn't. I'm not saying that anyone should have done anything differently – except myself.

Requirements

Funding

Funding is the main requirement, because it enables everything else. Thanks to Paul Christiano I had funding for nine months between January 2019 and January 2020. Thereafter I applied to the EA Foundation Fund (now Center on Long-Term Risk Fund) and Long-Term Future Fund for a grant and they rejected my applications. Now I don't know of any other promising sources of funding. I also don't know of any AI alignment research organisation that would hire me as a remote worker.

How much funding you need varies. I settled on 5 kUSD per month, which sounds like a lot when you're a student, and which sounds like not a lot when you look at market rates for software developers/ML engineers/ML researchers. On top of that, I'm essentially a freelancer who has to pay social insurance by himself, take time off to do accounting and taxes, and build runway for dry periods.

Results and relationships

In any job you must get results and build relationships. If you don't, you don't earn your pay. (Manager Tools talks about results and relationships all the time. See for example What You've Been Taught About Management is Wrong or First Job Fundamentals.)

The results I generated weren't obviously good enough to compel Paul to continue to fund me. And I didn't build good enough relationships with people who could have convinced the LTFF and EAFF fund managers that I have the potential they're looking for.


Time

Funding buys time, which I used for study and research.

Another aspect of time is how effectively and efficiently you use it. I'm good at effective, not so good at efficient: I spend much time on non-research, mostly studying Japanese and doing sports. And dawdling. I noticed the dawdling problem at the end of last year and got it under control at the beginning of this year (see my time tracking). Too late.

Travel and location

I live in Kagoshima City in southern Japan, which is far away from the AI alignment research hubs. This means that I don't naturally meet AI alignment researchers and build relationships with them. I could have compensated for this by travelling to summer schools, conferences etc. But I missed the best opportunities and I felt that I didn't have the time and money to take the second-best opportunities. Of course, I could also relocate to one of the research hubs. But I don't want to do that for family reasons.

I did start maintaining the Predicted AI alignment event/meeting calendar in order to avoid missing opportunities again. And I did apply and get accepted to the AI Safety Camp Toronto 2020. They even chose my research proposal for one of the teams. But I failed to procure the funding that would have supported me from March through May when the camp takes place.


Knowledge

I know more than most young AI alignment researchers about how to make good software, how to write well and how to work professionally. I know less than most young AI alignment researchers about maths, ML and how to do research. The latter appear to be more important for getting results in this field.


Why do I know less about maths, ML and how to do research? Because my formal education goes only as far as a BSc in computer science, which I finished in 2014 (albeit with very good grades). There's a big gap between what I remember from that and what an MSc or PhD graduate knows. I tried to make up for it with months (time bought with Paul's funding) of self-study, but it wasn't enough.

Another angle on this, in terms of Jim Collins (see Jim Collins — A Rare Interview with a Reclusive Polymath (#361)): I'm not ‘encoded’ for reading research articles and working on theory. I am probably ‘encoded’ for software development and management. I'm sceptical, however, about this concept of being ‘encoded’ for something.

All for nothing?

No. I built relationships and learned much that will help me be more useful in the future. The only thing I'm worried about is that I will forget what I've learned about ML for the third time.


I could go back to working for money part-time, patch the gaps in my knowledge, results and relationships, and get back on the path of AI alignment research. But I don't feel like it. I spent four years doing ‘what I should do’ and was ultimately unsuccessful. Now I'll try and do what is fun, and see if it goes better.

What is fun for me? Software/ML development, operations and, probably, management. I'm going to find a job or contracting work in that direction. Ideally I would work directly on mitigating x-risk, but this is difficult, given that I want to work remotely. So it's either earning to give, or building an income stream that can support me while doing direct work. The latter can be through saving money and retiring early, or through building a ‘lifestyle business’ the Tim Ferriss way.

Another thought on fun: When I develop software, I know when it works and when it doesn't work. This is satisfying. Doing research always leaves me full of doubt whether what I'm doing is useful. I could fix this by gathering more feedback. For this again I would need to buy time and build relationships.


For reference I'll list what I've done in the area of AI alignment. Feel free to stop reading here if you're not interested.


…to everyone who helped me and treated me kindly over the past four years. This encompasses just about everyone I've interacted with. Those who helped me most I've already thanked in other places. If you feel I haven't given you the appreciation you deserve, please let me know and I'll make up for it.


Puzzles for Physicalists

Published on March 12, 2020 1:37 AM GMT

The following is a list of puzzles that are hard to answer within a broadly-physicalist, objective paradigm. I believe critical agentialism can answer these better than competing frameworks; indeed, I developed it through contemplation of these puzzles, among others. This post will focus on the questions, though, rather than the answers. (Some of the answers can be found in the linked post.)

In a sense what I have done is located "anomalies" relative to standard accounts, and concentrated more attention on these anomalies, attempting to produce a theory that explains them, without ruling out its ability to explain those things the standard account already explains well.


(This section would be philosophical plagiarism if I didn't cite On the Origin of Objects.)

Indexicals are phrases whose interpretation depends on the speaker's standpoint, such as "my phone" or "the dog over there". It is often normal to treat indexicals as a kind of shorthand: "my phone" is shorthand for "the phone belonging to Jessica Taylor", and "the dog over there" is shorthand for "the dog existing at coordinates 37.856570, -122.284176". This expansion allows indexicals to be accounted for within an objective, standpoint-independent frame.

However, even these expanded references aren't universally unique. In a very large universe, there may be a twin Earth which also has a dog at coordinates 37.856570, -122.284176. As computer scientists will find obvious, specifying spatial coordinates requires a number of bits logarithmic in the amount of space addressed. These globally unique identifiers get more and more unwieldy the more space is addressed.
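The logarithmic scaling can be made concrete. This is a minimal sketch; the location counts are illustrative.

```python
import math

def address_bits(num_locations):
    """Bits needed to uniquely address one of num_locations positions."""
    return math.ceil(math.log2(num_locations))

# Doubling the addressed volume costs only one extra bit, yet the
# identifiers still grow without bound as more space is addressed.
print(address_bits(2**32))   # 32
print(address_bits(2**33))   # 33
print(address_bits(10**80))  # 266 -- an observable-universe's worth of
                             # distinct locations
```

So globally unique identifiers grow slowly but without limit, which is why we never actually expand our references far enough to guarantee global uniqueness.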

Since we don't expand out references enough to be sure they're globally unique, our use of them couldn't depend on such global uniqueness. An accounting of how we refer to things, therefore, cannot posit any causally-effective standpoint-independent frame that assigns semantics.

Indeed, the trouble with globally unique references can also be seen by studying physics itself. Physical causality is spatially local; a particle affects nearby particles, and there's a speed-of-light limitation. For spatial references to be effective (e.g. to connect to observation and action), they have to themselves "move through" local space-and-time.

This is a bit like the problem of having a computer refer to itself. A computer may address computers by IP address. The IP address "" always refers to this computer. These references can be resolved even without an Internet connection. It would be totally unnecessary and unwieldy for a computer to refer to itself (e.g. for the purpose of accessing files) through a globally-unique IP address, resolved through Internet routing.
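The self-reference analogy can be sketched with Python's `ipaddress` module. (The post's elided IP literal is left as-is above; the conventional IPv4 loopback address is 127.0.0.1, and the second address below is an arbitrary illustrative global address.)

```python
import ipaddress

# The loopback address refers to "this computer" and is resolved
# locally, with no global routing involved.
loopback = ipaddress.ip_address("127.0.0.1")
print(loopback.is_loopback)  # True

# A globally routable address, by contrast, names a location in the
# shared address space and must be resolved through the network.
print(ipaddress.ip_address("93.184.216.34").is_loopback)  # False
```

The loopback case is the deictic one: the very same identifier resolves to a different machine depending on which machine uses it.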

Studying enough examples like these (real and hypothetical) leads to the conclusion that indexicality (and more specifically, deixis) is fundamental, and that even spatial references that appear to be globally unique are resolved deictically.

How does this relate to physics? It means references to "the objective world" or "the physical world" must also be resolved indexically, from some standpoint. Paying attention to how these references are resolved is critical.

The experimental results you see are the ones in front of you. You can't see experimental results that don't, through spatio-temporal information flows, make it to you. Thus, references to the physical which go through discussing "the thing causing experimental predictions" or "the things experiments failed to falsify" are resolved in a standpoint-dependent way.

It could be argued that physical law is standpoint-independent, because it is, symmetrically, true at each point in space-time. However, this excludes virtual standpoints (e.g. existing in a computer simulation), and additionally, this only means the laws are standpoint-independent, not the contents of the world, the things described by the laws.

Pre-reduction references

(For previous work, see "Reductive Reference".)

Indexicality by itself undermines view-from-nowhere mythology, but perhaps not physicalism itself. What presents a greater challenge for physicalism is the problem of pre-reduced references (which are themselves deictic).

Let's go back to the twin Earth thought experiment. Suppose we are in pre-chemistry times. We still know about water. We know water through our interactions with it. Later, chemistry will find that water has a particular chemical formula.

In pre-chemistry times, it cannot be known whether the formula is H2O, XYZ, etc, and these formulae are barely symbolically meaningful. If we discover that water is H2O, we will, after-the-fact, define "water" to mean H2O; if we discover that water is XYZ, we will, after-the-fact, define "water" to mean XYZ.

Looking back, it's clear that "water" has to be H2O, but this couldn't have been clear at the time. Pre-chemistry, "water" doesn't yet have a physical definition; a physical definition is assigned later, which rationalizes previous use of the word "water" into a physicalist paradigm.

A philosophical account of reductionism needs to be able to discuss how this happens. To do this, it needs to be able to discuss the ontological status of entities such as "water" (pre-chemistry) that do not yet have a physical definition. In this intermediate state, the philosophy is talking about two kinds of entities, pre-reduced entities and physics, and considering various bridgings between them. So the intermediate state needs to contain entities that are not yet conceptualized physically.

A possible physicalist objection is that, while it may be a provisional truth that water is definitionally the common drinkable liquid found in rivers and so on, it is ultimately true that water is H2O, and so physicalism is ultimately true. (This is very similar to the two truths doctrine in Buddhism.)

Now, expanding out, this account needs to provide an account of the relation between provisional and ultimate truth. Even if such an account could be provided, it would appear that, in our current state, we must accept it as provisionally true that some mental entities (e.g. imagination) do not have physical definitions, since a good-enough account has not yet been provided. And we must have a philosophy that can grapple with this provisional state of affairs, and judge possible bridgings as fitting/unfitting.

Moreover, there has never been a time without provisional definition. So this idea of ultimate truth functions as a sort of utopia, which is either never achieved, or is only achieved after very great advances in philosophy, science, and so on. The journey is, then, more important than the destination, and to even approach the destination, we need an ontology that can describe and usably function within the journeying process; this ontology will contain provisional definitions.

The broader point here is that, even if we have the idea of "ultimate truth", that idea isn't meaningful (in terms of observations, actions, imaginations, etc) to a provisional perspective, unless somehow the provisional perspective can conceptualize the relation between itself and the ultimate truth. And, if the ultimate truth contains all provisional truths (as is true if forgetting is not epistemically normative), the ultimate truth needs to conceptualize this as well.

Epistemic status of physics

Consider the question: "Why should I believe in physics?". The conventional answer is: "Because it predicts experimental results." Someone who can observe these experimental results can, thus, have epistemic justification for belief in physics.

This justificatory chain implies that there are cognitive actors (such as persons or social processes) that can do experiments and see observations. These actors are therefore, in a sense, agents.

A physicalist philosophical paradigm should be able to account for epistemic justifications of physics, or else it fails to self-ratify. So the paradigm needs to account for observers (and perhaps specifically active observers), who are the ones having epistemic justification for belief in physics.

Believing in observers leads to the typical mind-body problems. Disbelieving in observers fails to self-ratify. (Whenever a physicalist says "an observation is X physical entity", it can be asked why X counts as an observation of the sort that is epistemically compelling; the answer to this question must bridge the mental and the physical, e.g. by saying the brain is where epistemic cognition happens. And saying "you know your observations are the things processed in this brain region because of physics" is circular.)

What mind-body problems? There are plenty.


The anthropic principle states, roughly, that epistemic agents must believe that the universe contains epistemic agents. Else, they would believe themselves not to exist.

The language of physics, on its own, doesn't have the machinery to say what an observer is. Hence, anthropics is a philosophical problem.

The standard way of thinking about anthropics (e.g. SSA/SIA) is to consider the universe from a view-from-nowhere, and then assume that "my" body is in some way sampled "randomly" from this viewed-from-nowhere universe, such that I proceed to get observations (e.g. visual) from this body.

This is already pretty wonky. Indexicality makes the view-from-nowhere problematic. And the idea that "I" am "randomly" placed into a body is a rather strange metaphysics (when and where does this event happen?).

But perhaps the most critical issue is that the physicalist anthropic paradigm assumes it's possible to take a physical description of the universe (e.g. as an equation) and locate observers in it.

There are multiple ways of considering doing so, and perhaps the best is functionalism, which will be discussed later. However, I'll note that a subjectivist paradigm can easily find at least one observer: I'm right here right now.

This requires some explaining. Say you're lost in an amusement park. There are about two ways of thinking about this:

  1. You don't know where you are, but you know where the entrance is.
  2. You don't know where the entrance is, but you know where you are.

Relatively speaking, 1 is an "objective" (relatively standpoint-independent) answer, and 2 is a "subjective" (relatively standpoint-dependent) answer.

2 has the intuitive advantage that you can point to yourself, but not to the entrance. This is because pointing is deictic.

Even while being lost, you can still find your way around locally. You might know where the Ferris wheel is, or the food stand, or your backpack. And so you can make a local map, which has not been placed relative to the entrance. This map is usable despite its disconnection from a global reference frame.

Anthropics seems to be saying something similar to (1). The idea is that I, initially, don't know "where I am" in the universe. But, the deictic critique applies to anthropics as it applies to the amusement park case. I know where I am, I'm right here. I know where the Earth is, it's under me. And so on.

This way of locating (at least one) observer works independent of ability to pick out observers given a physical description of the universe. Rather than finding myself relative to physics, I find physics relative to me.

Of course, the subjectivist framework has its own problems, such as difficulty finding other observers. So there is a puzzle here.

Tool use and functionalism

Functionalism is perhaps the current best answer as to how to locate observers in physics. Before discussing functionalism, though, I'll discuss tools.

What's a hammer? It's a thing you can swing to apply lots of force to something at once. Hammers can be made of many physical materials, such as stone, iron, or wood. It's about the function, not the substance.

The definition I gave refers to a "you" who can swing the hammer. Who is the "you"? Well, that's standpoint-dependent. Someone without arms can't use a conventional hammer to apply lots of force. The definition relativizes to the potential user. (Yes, a person without arms may say conventional hammers are hammers due to social convention, but this social convention is there because conventional hammers work for most people, so it still relativizes to a population.)

Let's talk about functionalism now. Functionalism is based on the idea of multiple realizability: that a mind can be implemented on many different substrates. A mind is defined by its functions rather than its substrate. This idea is very familiar to computer programmers, who can hide implementation details behind an interface, and don't need to care about hardware architecture for the most part.

This brings us back to tools. The definition I gave of "hammer" is an interface: it says how it can be used (and what effects it should create upon being used).

What sort of functions does a mind have? Observation, prediction, planning, modeling, acting, and so on. Now, the million-dollar question: Who is (actually or potentially) using it for these functions?

There are about three different answers to this:

  1. The mind itself. I use my mind for functions including planning and observation. It functions as a mind as long as I can use it this way.
  2. Someone or something else. A corporation, a boss, a customer, the government. Someone or something who wants to use another mind for some purpose.
  3. It's objective. Things have functions or not independent of the standpoint.

I'll note that 1 and 2 are both standpoint-dependent, thus subjectivist. They can't be used to locate minds in physics; there would have to be some starting point, of having someone/something intending to use a mind for something.

3 is interesting. However, we now have a disanalogy from the hammer case, where we could identify some potential user. It's also rather theological, in saying the world has an observer-independent telos. I find the theological implications of functionalism to be quite interesting and even inspiring, but that still doesn't help physicalism, because physicalist ontology doesn't contain standpoint-independent telos. We could, perhaps, say that physicalism plus theism yields objective functionalism. And this requires adding a component beyond the physical equation of the universe, if we wish to find observers in it.

Causality versus logic

Causality contains the idea that things "could" go one way or another. Else, causal claims reduce to claims about state; there wouldn't be a difference between "if X, then Y" and "X causes Y".

Pearlian causality makes this explicit; causal relations are defined in terms of interventions, which come from outside the causal network itself.
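To make the conditioning/intervention distinction concrete, here is a minimal sketch (with made-up probabilities) of a three-variable model in which a confounder C causes both X and Y. Conditioning on X=1 and intervening to set X=1 give different answers about Y, even though X has no causal effect on Y at all:

```python
# Toy causal model with a confounder: C -> X and C -> Y.
# All numbers are illustrative assumptions, not data.
P_C = {0: 0.5, 1: 0.5}

def p_x_given_c(x, c):
    # X mostly copies C
    return 0.9 if x == c else 0.1

def p_y_given_c(y, c):
    # Y also mostly copies C; X has no causal effect on Y at all
    return 0.9 if y == c else 0.1

def p_joint(c, x, y):
    return P_C[c] * p_x_given_c(x, c) * p_y_given_c(y, c)

# Observational: P(Y=1 | X=1) -- seeing X=1 is evidence about C, hence about Y
num = sum(p_joint(c, 1, 1) for c in (0, 1))
den = sum(p_joint(c, 1, y) for c in (0, 1) for y in (0, 1))
p_obs = num / den

# Interventional: P(Y=1 | do(X=1)) -- cut the C -> X edge and force X=1;
# since X doesn't affect Y, this is just P(Y=1)
p_do = sum(P_C[c] * p_y_given_c(1, c) for c in (0, 1))

print(round(p_obs, 3))  # -> 0.82
print(round(p_do, 3))   # -> 0.5
```

The gap between 0.82 and 0.5 is exactly the "could" that interventions introduce from outside the network: claims about state ("if X, then Y") and causal claims ("X causes Y") come apart.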

The ontology of physics itself is causal. It is asserted, not just that some state will definitely follow some previous state, but that there are dynamics that push previous states to new states, in a necessary way. (This is clear in the case of dynamical systems.)

Indeed, since experiments may be thought of as interventions, it is entirely sensible that a physical theory that predicts the results of these interventions must be causal.

These "coulds" have a difficult status in relation to logic. Someone who already knows the initial state of a system can logically deduce its eventual state. To them, there is inevitability, and no logically possible alternative.

It appears that, while "could"s exist from the standpoint of an experimenter, they do not exist from the standpoint of someone capable of predicting the experimenter, such as Laplace's demon.

This is not much of a problem if we've already accepted fundamental deixis and rejected the view-from-nowhere. But it is a problem for those who haven't.

Trying to derive decision-theoretic causality from physical causality results in causal decision theory, which is known to have a number of bugs, due to its reliance on hypothetical extra-physical interventions.

An alternative is to try to develop a theory of "logical causality", by which some logical facts (such as "the output of my decision process", assuming you know your source code) can cause others. However, this is oxymoronic, because logic does not contain the affordance for intervention. Logic contains the affordance for constructing and checking proofs. It does not contain the affordance for causing 3+4 to equal 8. A sufficiently good reasoner can immediately see that "3+4=8" runs into contradiction; there is no way to construct a possible world in which 3+4=8.

Hence, it is hard to say that "coulds" exist in a standpoint-independent way. We may, then, accept standpoint-dependence of causation (as I do), or reject causation entirely.


My claim isn't that physicalism is false, or that there don't exist physicalist answers to these puzzles. My claim, rather, is that these puzzles are at least somewhat difficult, and that sufficient contemplation on them will destabilize many forms of physicalism. The current way I answer these puzzles is through a critical agential framework, but other ways of answering them are possible as well.


Please Press "Record"

12 марта, 2020 - 02:56
Published on March 11, 2020 11:56 PM GMT

Over the years, I've made very heavy use of open courseware/MOOCs. I'd estimate that I've covered about as much material in online lectures as I did in-person during a four year degree.

However, since college I've been frustrated by the general lack of online material for non-101 courses. Sites like coursera, edx, etc seem to have realized that the vast majority of users have basically no relevant background, so they optimize for users with no relevant background - they produce a very repetitive stream of 101-level courses. More advanced material is harder to come by online, even though it's abundant in university course catalogs.

But I hear that many colleges are moving courses online in response to coronavirus. So, a request for any professors out there: press "record" during lectures. You're on video already, you can sort out the details and decide whether to actually put up the recordings later, but for now just push the record button.


Coronavirus tests and probability

12 марта, 2020 - 02:09
Published on March 11, 2020 11:09 PM GMT

I recently had a "duh" moment while reading an Atlantic article. Coronavirus tests are not screening tests! Like, didn’t we all learn about Bayesian probability, sensitivity, and the dangers of false positives and false negatives from a very similar question? And then, when I started reading about coronavirus test distribution in the news, I forgot all about that.

But I don't know what the probabilities are. A brief search didn't find them. Anyone know?

Expectations are tempered; a similar promise from Vice President Mike Pence of 1.5 million tests by the end of last week did not come to pass. But even when these tests eventually are available, some limitations will have to be realized. Among them, these are diagnostic tests, not screening tests—a distinction that should shape expectations about the role doctors will play in helping manage this viral disease.

The difference comes down to a metric known as sensitivity of the test: how many people who have the virus will indeed test positive. No medical test is perfect. Some are too sensitive, meaning that the result may say you’re infected when you’re actually not. Others aren’t sensitive enough, meaning they don’t detect something that is actually there.

The latter is the model for a diagnostic test. These tests can help to confirm that a sick person has the virus; but they can’t always tell you that a person does not. When people come into a clinic or hospital with severe flu-like symptoms, a positive test for the new coronavirus can seal the diagnosis. Screening mildly ill people for the presence of the virus is, however, a different challenge.

“The problem in a scenario like this is false negatives,” says Albert Ko, the chair of epidemiology of microbial diseases at the Yale School of Public Health. If you wanted to use a test to, for example, help you decide whether an elementary-school teacher can go back to work without infecting his whole class, you really need a test that will almost never miss the virus.

“The sensitivity can be less than 100 percent and still be very useful,” Ko says, in many cases. But as that number falls, so does the usefulness of any given result. In China, the sensitivity of tests has been reported to be as low as 30 to 60 percent—meaning roughly half of the people who actually had the virus had negative test results. Using repeated testing was found to increase the sensitivity to 71 percent.

But that means a negative test still couldn’t fully reassure someone like the teacher that he definitely doesn’t have the virus. At that level of sensitivity, Ko says, “if you’re especially risk-averse, do you just say: ‘If you have a cold, stay home’?”
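As a rough illustration of why sensitivity matters, here is a sketch of the standard Bayes calculation, using the sensitivities quoted above (30%, 60%, 71%). The 20% pre-test probability and 99% specificity are my own assumptions for illustration, not figures from the article:

```python
def p_infected_given_negative(prior, sensitivity, specificity=0.99):
    # Bayes: P(infected | negative) = P(neg | inf) * P(inf) / P(neg)
    p_neg_given_inf = 1 - sensitivity  # false-negative rate
    p_neg = p_neg_given_inf * prior + specificity * (1 - prior)
    return p_neg_given_inf * prior / p_neg

# Assumed 20% pre-test probability (e.g. a symptomatic contact).
for sens in (0.30, 0.60, 0.71):
    print(sens, round(p_infected_given_negative(0.20, sens), 3))
```

Under these assumed numbers, even at 71% sensitivity a negative result only drops a 20% pre-test probability to about 7%: far from a guarantee that someone like the teacher is virus-free.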



What are some articles that updated your beliefs a lot on an important topic?

12 марта, 2020 - 01:34
Published on March 11, 2020 10:34 PM GMT

The content doesn't need to be hosted on LessWrong.

It does need to have changed your beliefs personally. If it changed other people's beliefs, please put this as a comment instead.

Also, the article should be more than simply updating you away from your priors. For example, I'm not interested in things like "learning about a new cause you weren't aware of" but rather in things like "changing your mind about the importance of a cause".

I'm also interested in knowing whether the article was the first time you had come across this point, or it just argued that point better than the previous ones.

Information in formats other than articles also works: audio, video, pictures, graphs, data tables, etc.

This is related to a project idea of tracking belief updates, with the goals of 1) tracking which information is the most valuable so that more people consume it, and 2) seeing how beliefs evolve (which might be evidence in itself about which beliefs are true; although I think most people, including myself, wouldn't consider this the strongest form of evidence).


A practical out-of-the-box solution to slow down COVID-19: Turn up the heat

12 марта, 2020 - 01:30
Published on March 11, 2020 10:30 PM GMT

3 days ago a study was published that swayed me significantly towards the opinion that COVID-19 spreads better in cold (but not too cold) climates. I believe the odds of this are probably over 50%, and if it's true there is a practical and scalable (although not extremely cheap) way to significantly slow the spread of the coronavirus.

Research Claims

It seems that all current pandemic epicenters share a very similar temperature and humidity (5-11°C and 47-79% humidity). Consider the following map:

As you can see, all the central outbreak locations lie along a narrow east-west band, roughly the 30-50°N corridor. This suggests a correlation between rapid virus spread and specific climate conditions.

Why do I believe it's probably true?

1. The odds that, by pure luck, all 6 different epicenters would share similar climates and latitudes seem like too large a coincidence, even taking into account that the 30-50°N corridor is more populous than average. There are many very populated places outside this corridor, and none of them got hit as hard.

2. Iran, Italy, China, and South Korea are very different places in terms of political systems and government competence. Intuitively, it's hard to think of a better root cause explaining why the virus spread so widely in precisely these locations.

3. It could also explain why some places that you would expect to be hit hard, like Thailand (the top global destination for Chinese tourism) or Taiwan, don't seem to have it that bad so far. Both of these countries have a tropical climate. Granted, there could be higher spread in these countries that is underreported, but if they had it as bad as Italy or Iran, it wouldn't go unnoticed.


What are the implications if we believe the research?

On the downside: as the weather warms in the northern hemisphere, more places will reach the range that could provide optimal conditions for COVID-19. The researchers list the following cities as potentially dangerous areas for coronavirus spread:

On the upside, if this is true, there is something we might be able to do to slow the virus's spread, and the solution is simple: increase temperatures in enclosed public spaces, or more simply, turn up the heat. If the coronavirus truly doesn't tolerate heat, increasing the temperature will kill it faster on surfaces. If we could raise temperatures to hot but still bearable levels (27°C/80°F is a good target) in places like supermarkets, public transportation, or clinics, we might slow the spread by reducing connectivity: killing the virus faster while it's on surfaces or in the air.

The main downside is that heating costs money, and we don't know for sure whether it will help. A way to find out would be to conduct research that tests this assumption directly, or even to run A/B tests around the country to see whether, on average, areas with heated public spaces see less pandemic growth.

Another option is to heat these places only at night, consuming electricity outside peak working hours when it's pretty cheap anyway. The recent drops in oil and gas prices mean this could be a sustainable and worthwhile act of public policy, if it truly works.



12 марта, 2020 - 00:08
Published on March 11, 2020 9:08 PM GMT

Trace is a tool for writing programs which read, write and reason about programs. You can find it here. I wrote it as a tool for my own research, and I expect that others in this space may find the ideas interesting/useful as well. I'd be especially interested in new use-cases and other feedback!

Some kinds of things you might find Trace useful for:

  • Algorithms which operate on a computation graph, e.g. backpropagation, belief propagation, or other graphical inference algorithms
  • An intermediate data structure for static analysis, interpreters or compilers
  • A general-purpose non-black-box representation of objectives/constraints for optimization
  • A general-purpose non-black-box representation of world models for AI more broadly

Disclaimer for all of these: Trace is brand-new, and it was built with a focus on the core ideas rather than the engineering. Syntax is liable to change as we figure out what does and does not work well. Do not expect it to be easy/pleasant to use at this point, but do expect it to provide novel ways to think about programs.

One more warning: this doc is intended to be read start-to-finish. Trace does not really resemble any other tool I know of, and you will likely be confused if you just dive in.

What is Trace?

Trace is

  • A programming/modelling language embedded in a python library. For use as a human-facing programming language, Trace is pretty terrible, but it’s sometimes a necessary step for other use-cases.
  • A notation/data structure representing programs. For these use-cases, Trace is pretty good: compared to alternatives (e.g. abstract syntax trees), Trace offers a much more convenient representation of program structure.
  • A data structure representing the computation performed by an arbitrary program - i.e. the trace (aka execution graph aka computation graph) of a program. For this use-case, I do not know of any other tool which is anywhere near as powerful as Trace.

A prototypical use-case: suppose you want to test out a new inference algorithm. You can prototype the algorithm to operate on Trace data structures, which allows it to handle arbitrary programs (unlike e.g. pytorch graphs), with relatively little complexity (unlike e.g. python syntax trees). Then, you can write test-case world-models as programs in Trace notation. Those “programs” will themselves be fairly transparent Trace data structures, which your prototype algorithm can operate on directly.


Here’s a simple python program:

def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n-1)

Let’s suppose I want to trace the execution of factorial(3), starting from the result and working backwards (e.g. for something analogous to backpropagation). Conceptually, I picture something like the call stack, with a box for each function call. Within each box, variable instances are in dependency order; arrows show cross-box dependencies:

This is roughly the core data structure which Trace exposes. For every instance of every variable, it tells us:

  • The value of the variable-instance
  • The expression which produced that value
  • The variable-instances which went into that expression

(Side note: every variable instance is assumed to be write-once; no in-place updating of values is allowed.)

In Trace syntax, every variable-instance is a Symbol (S). The Symbol object contains both the symbol's name (aka its literal) and a pointer to the "context" in which the symbol lives (i.e. the dotted boxes in the diagram). The context then assigns the literal to another symbol, a hardcoded value, or an Expression - a special type of Symbol which wraps a python function and some input Symbols. More on that in the next section.

However, Trace’ core data structures differ in two important ways from the diagram above:

  • They handle dynamic structure - i.e. programs which write programs
  • Everything in Trace is evaluated lazily whenever possible

Lazy evaluation allows us to write data structures which look a lot like normal programs (albeit with some unusual syntax), and which can fit in about as much memory as normal code, but allow access to the whole trace - every instance of every variable in the program’s execution.
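As an illustration of the general idea (a hypothetical sketch, not Trace's actual implementation), a memoizing thunk computes its value only when first asked, so a large graph of them costs little memory or compute until queried:

```python
class Lazy:
    """A minimal memoizing thunk: computes its value on first access only."""
    def __init__(self, fn):
        self.fn = fn
        self.computed = False
        self.value = None

    def get_value(self):
        if not self.computed:
            self.value = self.fn()
            self.computed = True
        return self.value

calls = []
n = Lazy(lambda: calls.append('n') or 3)
n_squared = Lazy(lambda: n.get_value() ** 2)

# Nothing has been computed yet -- the "graph" is just thunks.
assert calls == []
assert n_squared.get_value() == 9   # forces n, then squares it
assert calls == ['n']
n_squared.get_value()               # memoized: n is not recomputed
assert calls == ['n']
```

The trace of a long-running program can be represented this way: every variable-instance exists as a node, but only the ones you actually query get evaluated.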

The main trick to a compressed, lazy representation is an operator which says “make a copy of this whole block, but with these changes: …”. In the factorial diagram above, each of the dotted boxes (except the last) is a copy of the first box, but with a different value of n. Ignoring the last box, we could represent it like this:

Here the “?”s represent lazily-evaluated values which haven’t been evaluated yet. Note that the “copy” is nested within the outermost box - indicating that it, too, will be copied, leading to a whole nested ladder of blocks.

In Trace syntax, the dotted boxes are Context objects, and the copy-with-changes operator is represented by function-call notation: cont({'n': 2}) makes a copy of the Context cont, in which 'n' is assigned the value 2. Values of variable-instances downstream of n will update in response to the new value of n, within the copy.
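Ignoring laziness, the copy-with-changes operator behaves roughly like copying a dict and applying overrides. This is a deliberately simplified sketch (with_changes is a hypothetical stand-in; real Contexts also recompute downstream values lazily in the copy):

```python
def with_changes(context, changes):
    """Sketch of copy-with-changes: the original is untouched,
    the copy gets the overrides."""
    new = dict(context)
    new.update(changes)
    return new

cont = {'n': 3, 'rule': 'result depends on n'}
copy = with_changes(cont, {'n': 2})
assert cont['n'] == 3   # original unchanged
assert copy['n'] == 2   # downstream values recompute from the new n (lazily, in Trace)
```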

Core Data Structure

Here’s a full program in Trace; we’re going to walk through all the pieces.

from tracelang import S, E, Context

factorial = Context({
    'fact': Context({
        'result': S(S('n') == 0, {
            True: 1,
            False: S('n')*S('result', S('fact')({'n': S('n') - 1}))
        })
    })({'fact': S('fact')}),
    'result': S('result', S('fact')({'n': S('n')}))
})

>>> S('result', factorial({'n': 3})).get_value()
6

Let’s start with the three main pieces: Symbols (S), Expressions (E), and Context. Very briefly:

  • A Symbol is a variable-instance. It’s defined by a literal (e.g. 'n') and a context in which to resolve that literal (e.g. {'n': 2}). Calling get_value() on a symbol resolves the literal within its context.
  • Expressions are Symbols whose “context” is a python function, so we resolve them by calling the function. They are implicitly created by using operators like +, *, ==, or function call on Symbols.
  • Contexts are basically dicts with a couple extra features: they provide a default context for any symbols within them, and we can “create a copy but with changes” via function-call notation.

More details follow...

Symbols are the starting point. A symbol is just a literal (e.g. 'foo' or 2) and a context mapping the literal to some value (e.g. {'foo': 'bar'}; it doesn’t have to be a capital-C Context). By calling .get_value() on a symbol, we get the value of the literal from the context:

>>> S('foo', {'foo': 'bar', 'baz': 2}).get_value()
'bar'

Both the literal and the context can themselves be symbols, in which case we resolve values recursively. For instance:

>>> S(S('is_case', {'is_case': True}), {True: 'it is', False: 'it is not'}).get_value()
'it is'
>>> S('foo', S('bar', {'bar': {'foo': 2}})).get_value()
2

Conceptually, S('x', context) works like the square-bracket accessor context['x'] - except that we recursively resolve symbols along the way.
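To make "recursively resolve along the way" concrete, here is a minimal hypothetical sketch (Sym and resolve are stand-ins for illustration, not the real tracelang classes):

```python
class Sym:
    """Sketch of a symbol: a literal plus a context to resolve it in."""
    def __init__(self, literal, context):
        self.literal = literal
        self.context = context

    def get_value(self):
        # Both the literal and the context may themselves be symbols,
        # so resolve each before (and after) the dict lookup.
        literal = resolve(self.literal)
        context = resolve(self.context)
        return resolve(context[literal])

def resolve(x):
    # Keep resolving until we hit a plain value.
    while isinstance(x, Sym):
        x = x.get_value()
    return x

print(Sym('foo', {'foo': 'bar'}).get_value())  # 'bar'
# A symbol as the literal: first resolve it to True, then look True up.
print(Sym(Sym('k', {'k': True}), {True: 'yes', False: 'no'}).get_value())  # 'yes'
```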

In our factorial program, notice that many of the symbols don’t have any explicit context - e.g. S('n') or S('fact'). When a symbol’s context is not explicitly passed, the context is set to the (lexically) enclosing Context - this is one of the two main uses of capital-C Contexts. For instance, the S('n')'s in our example all have their context set to one of the two Contexts, depending on which one they appear inside.

Expressions are a special type of Symbol which resolve by calling a python function. If we have a function

def square(x):
    return x*x

then we could call it via

>>> E(square, S('x', {'x': 2})).get_value()
4

This resolves all the input Symbols, then calls the python function, as you’d expect. In practice, we don’t usually need to write E() explicitly - an E will be created automatically via operator overloading on Symbols:

>>> total = S('x', {'x': 2}) + S('y', {'y': 3})
>>> type(total)
E
>>> total.get_value()
5

In our factorial program, E's are implicitly created where we multiply symbols (i.e. S('n')*S('res', ...)), subtract symbols (i.e. S('n') - 1), compare symbols (i.e. S('n') == 0), and where we call symbols (i.e. S('fact')({'n': S('n')})).

So if they're implicit, why do we need to know all this? Remember, the point of Trace is not merely to "run the code" (i.e. call .get_value()), but to query the structure of the computation - and E's are one of the main things which comprise that data structure. We'll see a bit of that in the next section.

Contexts are, conceptually, mostly just dicts. They map things to other things. The two main differences between a context and an ordinary python dict are:

  • If a Symbol doesn’t have an explicit context, its context will be set to the lexically enclosing Context.
  • By calling a Context with a dict, we create a modified copy of the context.

In the example program, we create a modified copy in three places:

  • S('fact')({'n': S('n') - 1}) creates a copy of the context called 'fact' for the recursive call, just like the diagram from the previous section.
  • Context({...})({'fact': S('fact')}) is used to pass a pointer to the fact-context inside of the fact-context itself, so copies can be made.
  • S('fact')({'n': S('n')}) is just a pass-through function call.

When actually using the factorial function, we create one more modified copy: factorial({'n': 3}). This is the first copy with a value actually assigned to 'n'.

Before we jump back in to our factorial example, let’s see how these pieces play together in a simpler example:

import operator as op

half_adder = Context({
    'a': 0,
    'b': 1,
    'sum': E(op.xor, [S('a'), S('b')]),
    'carry': E(op.and_, [S('a'), S('b')])
})

This example contains two Symbols (other than the E's). Neither Symbol has an explicit context passed, so both have their context set to the enclosing Context - i.e. the object half_adder. To get the value of 'sum' within half_adder, we’d call S('sum', half_adder).get_value(). This would look up the values of S('a', half_adder) and S('b', half_adder), then pass those values to the python function op.xor. We could also evaluate at other inputs by making a modified copy - e.g. half_adder({'a': 1, 'b': 0}).
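For comparison, here is what the half_adder Context computes, written as ordinary python (a sketch to check intuitions against; it does not use tracelang):

```python
import operator as op

# Plain-python equivalent of the half_adder Context's two Expressions:
# sum = a XOR b, carry = a AND b.
def half_adder_eval(a, b):
    return {'sum': op.xor(a, b), 'carry': op.and_(a, b)}

print(half_adder_eval(0, 1))  # the Context's defaults: sum 1, carry 0
print(half_adder_eval(1, 0))  # like the modified copy half_adder({'a': 1, 'b': 0})
```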

That’s all the core pieces. Let’s take another look at our example program:

from tracelang import S, E, Context

factorial = Context({
    'fact': Context({
        'result': S(S('n') == 0, {
            True: 1,
            False: S('n')*S('result', S('fact')({'n': S('n') - 1}))
        })
    })({'fact': S('fact')}),
    'result': S('result', S('fact')({'n': S('n')}))
})

>>> S('result', factorial({'n': 3})).get_value()
6

We have two Contexts. The inner Context is our main function, but we need to use the outer Context in order to get a pointer to the inner context, so that we can make modified copies of it. There’s some code patterns which are probably unfamiliar at this point - e.g. S(S('n') == 0, ...) is used to emulate an if-statement, and we write things like S('result', fact) rather than fact['result']. But overall, hopefully the underlying structure of this code looks familiar.

But if all we wanted to do was write and run code, we wouldn’t be using Trace in the first place. Let’s probe our program a bit.

Stepping Through the Code

Human programmers sometimes “step through the code”, following the execution step-by-step to better understand what’s going on. IDEs often provide tools to help with this (e.g. breakpoints), but most programming languages don’t offer a nice way to step through the code programmatically. For Trace, this is a simple - and fundamental - use-case.

Here’s how we step through some Trace code.

We start with our final output, e.g. answer = S('result', factorial({'n': 3})). Before, we called answer.get_value() on this object, but now we won't. Instead, we'll access the pieces which went into that Symbol: answer._literal, and answer._context. In general, we can "work backwards" in three possible "directions":

  • If answer._literal is a Symbol/Expression, then we can step back through it, and/or we can get its value
  • If answer._context is a Symbol/Expression, then we can step back through it, and/or we can get its value
  • Once we have both values, we can look up answer._context[answer._literal] to find the Symbol/Expression/Value defining answer in its context.
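These three directions suggest a simple recursive walker. Here is a hedged sketch, using a hypothetical stand-in for the Symbol class rather than the real tracelang internals; a full walker would also handle Expression objects and call get_value where a context needs resolving:

```python
class S:
    # Hypothetical stand-in for Trace's Symbol, with just the two
    # attributes the stepping procedure inspects.
    def __init__(self, literal, context=None):
        self._literal = literal
        self._context = context

def step_back(node, visited=None):
    """Collect every piece reachable by stepping back through a Symbol tree."""
    if visited is None:
        visited = []
    visited.append(node)
    if isinstance(node, S):
        # Directions 1 and 2: the literal and the context may themselves
        # be Symbols, so recurse into each.
        step_back(node._literal, visited)
        step_back(node._context, visited)
    return visited

answer = S('result', S('fact'))
pieces = step_back(answer)
# pieces holds: outer Symbol, 'result', inner Symbol, 'fact', None
```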

In this case, the literal is not a Symbol, but the context is - it’s an Expression object, which performs the modified-copy operation on our factorial context. By calling answer._context.get_value(), we get a new Context, which is a copy of factorial with the modification {n: 3} applied. By looking at the Expression object itself, we can see the original factorial context and the {n: 3}: answer._context._literal is a list containing factorial and {n: 3}.

Let's go one step further in: we'll set last_step = answer._context.get_value()[answer._literal], and look at last_step.

Now we get an object which looks like S('result', S('fact', <modified copy>)({'n': S('n', <modified copy>)})), where the modified copy is the copy of factorial with {n: 3} applied. The outermost symbol once again has a string as literal, and its context is an Expression object performing the modified-copy operation on a Context. Calling .get_value() on the Expression last_step._context would lead us even further in.

Now, obviously this is not a very convenient way for a human to trace through a program’s execution. But if we want to write programs which trace through other programs’ execution, then this looks more reasonable - there’s a relatively small number of possibilities to check at every step, a relatively small number of object types to handle, and we have a data structure which lets us walk through the entire program trace.

Definitely Real User Testimony

To wrap it up, here are some endorsements from enthusiastic Trace users.

“Trace is an AI-oriented programming language for people who like Lisp, but think it doesn't go far enough.” - Ada Lovelace

“Isn’t this just math?” - Charles Babbage

“Trace combines the syntax of JSON with the semantics of a spreadsheet, but instead of just ending up horrendously hackish, it ends up horrendously abstract and horrendously hackish.” - John Von Neumann

“In Trace, source code is truly just data, always.” - Alan Turing


[AN #90]: How search landscapes can contain self-reinforcing feedback loops

March 11, 2020 - 20:30
Published on March 11, 2020 5:30 PM GMT

Find all Alignment Newsletter resources here. In particular, you can sign up, or look through this spreadsheet of all summaries that have ever been in the newsletter. I'm always happy to hear feedback; you can send it to me by replying to this email.

Audio version here (may not be up yet).


Demons in Imperfect Search (John S Wentworth) (summarized by Asya): This post gives an analogy to explain optimization demons: a type of undesirable behavior that arises in imperfect search processes. In the analogy, a ball rolls down a hill trying to go as far down as possible, mimicking a gradient descent algorithm. The ball is benefited by random noise, but still basically only experiences local changes in slope-- it cannot see steep drop-offs that are a little off to the side. Small bumps in the hill can temporarily alter the ball's trajectory, and the bumps that are selected for are the ones that most effectively control its trajectory. In this way, over time the ball's trajectory selects for demons: twisty paths with high walls that keep the ball contained and avoid competing walls. Demons cause the ball to go down the hill as slowly as possible so that potential energy is conserved for avoiding competitor walls.

The general pattern this analogy is meant to elucidate is the following: In any imperfect search mechanism with a rich enough search space, a feedback loop can appear that creates a more-and-more perfect exploitation of the imperfect search mechanism, resulting in a whole new optimization process. The post gives several real world examples as proofs that this is a failure mode that happens in real systems. One example is metabolic reactions-- a chemical system searches by making random small changes to the system state while trying to minimize free energy. Biological systems exploit the search by manipulating the height of the barriers between low-free-energy states, raising or lowering the activation energies required to cross them. After enough time, some chemicals changed the barriers enough such that more copies of the chemicals were made, kicking off an unstable feedback loop that led to life on earth.

The post ends by posing an open question: what is it about a system that makes this kind of failure mode likely to happen?

Asya's opinion: I think it's worth spelling out how this is different from the failure modes described in Risks from Learned Optimization (AN #58). In Risks from Learned Optimization, we are concerned that the outer optimizer will produce an unaligned inner optimizer because we're training it in diverse environments, and an inner optimizer may be the best solution for performing well in diverse environments. In this post, we are concerned that the outer optimizer will produce an unaligned demon (which may or may not be an optimizer) because the search process may have some self-reinforcing imperfections that allow it to be pushed strongly in a direction orthogonal to its objective. This direction could be bad unless the original outer objective is a perfect specification of what we want. This means that even if the conditions for mesa-optimization don't hold-- even if we're training on a fairly narrow task where search doesn't give an advantage-- there may be demon-related failure modes that are worth thinking about.

I really like this post, I think it crystallizes an important failure mode that I haven't seen described before. I'm excited to see more work on this class of problems.

Tessellating Hills: a toy model for demons in imperfect search (DaemonicSigil) (summarized by Asya): This post is trying to generate an example of the problem outlined in 'Demons in Imperfect Search' (summarized above): the problem where certain imperfect search processes allow for self-reinforcing behavior, 'demons', that push in a direction orthogonal to the original objective.

The post runs a simple gradient descent algorithm in an artificially constructed search space. The loss function that defines the search space has two major parts. One part straightforwardly tries to get the algorithm to move as far as it can in a particular direction x0 -- this represents our original objective function. The other part can be thought of as a series of periodic 'valleys' along every other axis (x1 ... xn), which get steeper the farther you go along that axis.

When running the gradient descent, at first x0 increases steadily, and the other coordinates wander around more or less randomly. In the second phase, a self-reinforcing combination of valleys (a "demon") takes hold and amplifies itself drastically, feeding off the large x0 gradient. Finally, this demon becomes so strong that the search gets stuck in a local valley and further progress stops.
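The post's exact loss function isn't reproduced in this summary, but a minimal numpy sketch in the same spirit might look like the following. The loss here is an assumed illustration only: a linear reward along x0, plus valleys along the other axes whose walls steepen with distance; the coupling to the x0 gradient that actually feeds the demon is omitted for brevity, so this shows the landscape's shape rather than the feedback loop itself.

```python
import numpy as np

def loss(x):
    # Illustrative only (not the original post's exact loss): reward
    # progress along x0, plus periodic valleys along the remaining axes
    # whose walls steepen as those coordinates grow.
    return -x[0] + np.sum((1 + np.abs(x[1:])) * (1 - np.cos(x[1:])))

def grad(x, eps=1e-5):
    # Finite-difference gradient, to keep the sketch short.
    g = np.zeros_like(x)
    for i in range(len(x)):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (loss(x + d) - loss(x - d)) / (2 * eps)
    return g

rng = np.random.default_rng(0)
x = rng.normal(scale=0.1, size=8)
for _ in range(2000):
    x = x - 0.01 * grad(x)

# x0 climbs steadily (its gradient is a constant -1), while the other
# coordinates settle into the nearest valley.
print(x[0])
```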

Asya's opinion: I think this is a good illustration of the problem specified in Demons in Imperfect Search. Clearly the space has to have a fairly specific shape, so the natural follow-up question, as is posed in the original post, is to think about what cases cause these kinds of self-reinforcing search spaces to arise.

Technical AI alignment

Agent foundations

A critical agential account of free will, causation, and physics (Jessica Taylor)

Subjective implication decision theory in critical agentialism (Jessica Taylor)


Historic trends in technological progress (AI Impacts) (summarized by Nicholas): One key question in thinking about AGI deployment and which safety problems to focus on is whether technological progress will be continuous or discontinuous. AI Impacts has researched the frequency of discontinuities in a number of case studies, which were selected for their potential to contain discontinuities. An example of a discontinuity in flight speed records would be the Fairey Delta 2 flight in 1956, which represented 19 years of progress at the previous trend. On the other hand, penicillin did not create a discontinuity of more than ten years in the number of deaths from syphilis in the US. This post summarizes a number of those case studies. As it is already a summary, I will just refer you to the post for more information.

Nicholas's opinion: I’m looking forward to reading AI Impacts’ conclusions after completing these case studies. My impression from reading through these is that discontinuities happen, but rarely, and small discontinuities are more common than larger ones. However, I remain uncertain of a) how relevant each of these examples is to AI progress, and b) if I missed any key ways in which the examples differ from each other.

Read more: Incomplete case studies of discontinuous progress

Miscellaneous (Alignment)

Cortés, Pizarro, and Afonso as Precedents for Takeover (Daniel Kokotajlo) (summarized by Matthew): This post lists three historical examples of how small human groups conquered large parts of the world, and shows how they are arguably precedents for AI takeover scenarios. The first two historical examples are the conquests of American civilizations by Hernán Cortés and Francisco Pizarro in the early 16th century. The third example is the Portuguese capture of key Indian Ocean trading ports, which happened at roughly the same time as the other conquests. Daniel argues that technological and strategic advantages were the likely causes of these European victories. However, since the European technological advantage was small in this period, we might expect that an AI coalition could similarly take over a large portion of the world, even without a large technological advantage.

Matthew's opinion: In a comment, I dispute the claimed reasons for why Europeans conquered American civilizations. I think that a large body of historical literature supports the conclusion that American civilizations fell primarily because of their exposure to diseases which they lacked immunity to, rather than because of European military power. I also think that this helps explain why Portugal was "only" able to capture Indian Ocean trading ports during this time period, rather than whole civilizations. I think the primary insight here should instead be that pandemics can kill large groups of humans, and therefore it would be worth exploring the possibility that AI systems use pandemics as a mechanism to kill large numbers of biological humans.

AI strategy and policy

Activism by the AI Community: Analysing Recent Achievements and Future Prospects (Haydn Belfield) (summarized by Rohin): The AI community has been surprisingly effective at activism: it has led to discussions of a ban on lethal autonomous weapons systems (LAWS), created several initiatives on safety and ethics, and has won several victories through organizing (e.g. Project Maven). What explains this success, and should we expect it to continue in the future? This paper looks at this through two lenses.

First, the AI community can be considered an epistemic community: a network of knowledge-based experts with coherent beliefs and values on a relevant topic. This seems particularly relevant for LAWS: the AI community clearly has relevant expertise to contribute, and policymakers are looking for good technical input. From this perspective, the main threats to future success are that the issues (such as LAWS) become less novel, that the area may become politicized, and that the community beliefs may become less cohesive.

Second, the AI community can be modeled as organized labor (akin to unions): since there is high demand for AI researchers, their output is particularly important for company products, and the companies are more vulnerable to public pressure, AI researchers wield a lot of soft power when they are united. The main threat to this success is the growing pool of talent that will soon be available (given the emphasis on training experts in AI today), which will reduce the supply-demand imbalance, and may reduce how committed the AI community as a whole is to collective action.

Overall, it seems that the AI community has had good success at activism so far, but it is unclear whether it will continue in the future.

Rohin's opinion: I think the ability of the AI community to cause things to happen via activism is quite important: it seems much more likely that if AI x-risk concerns are serious, we will be able to convince the AI community of them, rather than say the government, or company executives. This mechanism of action seems much more like the "epistemic community" model used in this paper: we would be using our position as experts on AI to convince decision makers to take appropriate precautions with sufficiently powerful AI systems. Applying the discussion from the paper to this case, we get the perhaps unsurprising conclusion that it is primarily important that we build consensus amongst AI researchers about how risky any particular system is.

Beyond Near- and Long-Term: Towards a Clearer Account of Research Priorities in AI Ethics and Society (Carina Prunkl and Jess Whittlestone) (summarized by Rohin): This paper argues that the existing near-term / long-term distinction conflates four different axes on which research could differ: the capability level of AI systems (current pattern-matching systems vs. future intelligent systems), the impacts of AI systems (impacts that are being felt now like fairness vs. ones that will be felt in the future like x-risks), certainty (things that will definitely be problems vs. risks that are more speculative) and extremity (whether to prioritize particularly extreme risks). While there are certainly correlations across these axes, they are not the same thing, and discourse would be significantly improved by disambiguating the axes. For example, both authors of the paper see their work as considering the medium-to-long-term impacts of near-to-medium-term AI capabilities.

Rohin's opinion: I definitely agree that near-term and long-term often seem to mean many different things, and I certainly support efforts to be more precise in our language.

While we're talking about near-term and long-term, I'll add in my own gripe: "long-term" implies that the effects will be felt only in the far future, even though many people focused on such effects are doing so because there's a significant probability of such effects being felt in only a few decades.

Exploring AI Futures Through Role Play (Shahar Avin et al) (summarized by Rohin): This paper argues that role playing (akin to the "wargames" used in the military) is a good way to explore possible AI futures, especially to discover unusual edge cases, in a 10-30 year time horizon. Each player is assigned a role (e.g. director of AI at Tencent, or president of the US) and asked to play out their role faithfully. Each game turn covers 2 simulated years, in which players can negotiate and take public and private actions. The game facilitator determines what happens in the simulated world based on these actions. While early games were unstructured, recent games have had an AI "tech tree", that determines what AI applications can be developed.

From the games played so far, the authors have found a few patterns:

- Cooperation between actors on AI safety and (some) restriction on destabilizing uses of AI seem to both be robustly beneficial.

- Even when earlier advances are risky, or when current advances are of unclear value, players tend to pursue AI R&D quite strongly.

- Many kinds of coalitions are possible, e.g. between governments, between corporations, between governments and corporations, and between sub-roles within a corporation.

Rohin's opinion: It makes sense that role playing can help find extreme, edge case scenarios. I'm not sure how likely I should find such scenarios -- are they plausible but unlikely (because forecasting is hard but not impossible), or are they implausible (because it would be very hard to model an entire government, and no one person is going to do it justice)? Note that according to the paper, the prior literature on role playing is quite positive (though of course it's talking about role playing in other contexts, e.g. business and military contexts). Still, this seems like quite an important question that strongly impacts how seriously I take the results of these role playing scenarios.

Other progress in AI

Deep learning

Speeding Up Transformer Training and Inference By Increasing Model Size (Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin et al) (summarized by Rohin): This blog post and associated paper confirm the findings from Scaling Laws for Neural Language Models (AN #87) that the most efficient way to train Transformer-based language models is to train very large models and stop before convergence, rather than training smaller models to convergence.

Read more: Paper: Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers


Why don't singularitarians bet on the creation of AGI by buying stocks?

March 11, 2020 - 19:27
Published on March 11, 2020 4:27 PM GMT

With the recent stock market sale, I've been looking over stocks to see which seem to be worth buying. (As background information, I'm buying stocks to have fun and bet my beliefs. If you believe in the efficient market hypothesis, buying stocks at random should perform roughly as well as the market as a whole, and the opportunity cost doesn't seem super high. Making a 40% return buying $ZM right before investors started paying attention to COVID-19 hasn't discouraged me.)

Standard investing advice is to stay diversified. I take the diversification suggestion further than most: A chunk of my net worth is in disaster preparation measures, and I'm also buying stock in companies like Facebook and Uber that I think have a nontrivial shot at creating AGI.

The stock market's "ostrich view" of AGI

AGI could be tremendously economically valuable. So one question about companies with AI research divisions is whether the possibility that they'll create AGI is already "priced in" to their stock. If the company's share price already reflects the possibility that they'll create this transformative technology, we shouldn't expect to make money from the singularity in expectation by buying their stock.

To investigate this question, let's examine Alphabet Inc's share price around the time AlphaGo defeated world Go champion Lee Sedol in March 2016. This has been referred to as a "Sputnik moment" that inspired the Chinese government to start investing billions of dollars in AI research. As a reminder of the kind of conversations that were being had at the time, here are some quotes from a Facebook post Eliezer Yudkowsky made:


– Rapid capability gain and upward-breaking curves.

“Oh, look,” I tweeted, “it only took 5 months to go from landing one person on Mars to Mars being overpopulated.”


We’re not even in the recursive regime yet, and we’re still starting to enter the jumpy unpredictable phase where people are like “What just happened?”


One company with a big insight jumped way ahead of everyone else.


AI is either overwhelmingly stupider or overwhelmingly smarter than you. The more other AI progress and the greater the hardware overhang, the less time you spend in the narrow space between these regions. There was a time when AIs were roughly as good as the best human Go-players, and it was a week in late January.

Here is Alphabet's historical stock price. Can you spot the point at which AlphaGo defeated the world Go champion?

That's right, there it is in March 2016:

$GOOG was worth about $727 on March 11, a Friday. That weekend, a Go commentator wrote:

AlphaGo made history once again on Saturday, as the first computer program to defeat a top professional Go player in an even match.

In the third of five games with Lee Sedol 9p, AlphaGo won so convincingly as to remove all doubt about its strength from the minds of experienced players.

In fact, it played so well that it was almost scary.


In forcing AlphaGo to withstand a very severe, one-sided attack, Lee revealed its hitherto undetected power.

Maybe if Game 3 had happened on a weekday, $GOOG would've moved appreciably. As it is, Game 4 happened on Sunday, and AlphaGo lost. So we don't really know how the market would react to the creation of an "almost scary" AI whose strength none could doubt.

Even so, the market's non-response to AlphaGo's world championship, and its non-response to AlphaZero beating the world's best chess program after 4 hours of self-play in December 2017, seem broadly compatible with a modified version of Alex_Shleizer's claim regarding COVID-19:

...consider [the singularity]: the ability to predict the market suddenly shifted from the day to day necessary data analysis for stock price prediction that revolves more around business KPIs and geopolitical processes to understanding [artificial intelligence]. How many of the Wall Street analysts are [artificial intelligence] experts would you think? Probably very few. The rules have changed and prior data analysis resources (currently hired analysts) became suddenly very inefficient.

Either that, or all the true AI experts (you know, the ones who spend their time trading stocks all day instead of doing AI research) knew that AlphaWhatever was a big nothingburger all along. 0% chance of transformation, no reason to buy $GOOG.

Even if the singularity is transformative, you want to be a shareholder

One objection goes something like: Yes indeed, I do have insider information, due to reading LessWrong dot com, that there will be this transformative change due to AI. And yes indeed, it appears the market isn't pricing this info in. But, I can't stand to profit from this info because as soon as the singularity happens, money becomes worthless! Either we'll have a positive singularity, and material abundance ensues, or we'll have a negative singularity, and paperclips ensue. That's why my retirement portfolio is geared towards business-as-usual scenarios.

Here are some objections to this line of reasoning:

  • First, singularity stocks could serve as a hedge. If your singularity stocks rapidly increase in value, you can sell them off, quit your job, and work furiously on AI safety full-time. In fact, maybe someone should create an EA organization that invests in singularity stocks and, in the event of an impending singularity, sells those stocks and starts implementing predefined emergency measures. (For example, hire a world class mediator and invite all the major players in AI to stay at a luxury hotel for a week to discuss the prevention of arms races.)

  • Second, being a shareholder in the singularity could help you affect it. Owning stock in the company will give you moral authority to comment on its direction at the company's annual shareholder meeting.

  • Finally, maybe your clickbait journalist goggles are the pair you want to wear, and the singularity will result in a hypercapitalist tech dystopia with cartoonish levels of inequality. In which case you'll be glad you have lots of money -- if only so you can buy boatloads of insecticide-treated bednets for the global poor.

More 2020 stock sales

I suspect Monday will not be the only time stocks go on sale in 2020. Given the possibility of future flash sales, I suggest you get set up with a brokerage now. I recommend Charles Schwab. (They also have a great checking account.)

If you believe in the singularity, why aren't you betting on it?