Last week I announced the LessWrong Coronavirus Agenda, an attempt to increase knowledge by coordinating research between LW participants. This post is an update on that. If you want to skip to action items, check out the last section.

Last Week's Spotlight Questions
Last week I spotlighted three questions I hoped to answer. Here's how they went.

What should we do once infected?
Tragedyofthecomments provided a great overall answer and Wei Dai a speculative one (much more likely to be wrong, but very valuable if correct). I did some additional research and incorporated these into an overall answer, which was improved by additional suggestions from Julia Wise and steve2152. This answer should still change as we learn more, but is in a more finished state than the other two questions.

How can we estimate how many people are infected in an area?
We failed to find a dashboard that made either me or habryka go “yes, this is the one, I would be happy with just this.”
Honorable mentions go to two dashboards that at least try to estimate true caseload rather than repeating official testing numbers: Plague Plus, which uses reported COVID deaths to estimate prevalence, and the Kinsa Smart Thermometer Dataset (suggested by Unnamed), which uses smart thermometers' phoned-home data to estimate the number of "excess" fevers.

Where can we donate time and money to avert coronavirus deaths?
Neither of us found what we were hoping for here either. We'll continue to add opportunities to the link DB as appropriate, and welcome additional suggestions. Until then, I'd suggest being on the lookout for high-context opportunities that can't be captured in answers aimed at a broad audience.

Other Highlights Of The Week
This 22-page document on the biology, economics, and logistics of testing for COVID-19, by Jeffrey Ladish, Edward Perello, Sean Ward, and Tessa Alexanian.
Oxygen Supplementation 101 from Sarah Constantin
This video from virology professor Michael Emerman explains the basic biology of coronavirus and epidemics.

Changes To The Agenda
All three spotlight questions are being retired from the spotlight, although they remain open to new answers and updates.
“What is the basic science of coronavirus?” is skipping its moment in the sun due to being provisionally answered by the video linked above. That one requires a fair amount of bio background knowledge, so the floor is still very open for content aimed at people starting from square one. And of course if you find a better advanced introductory video, please include that as well.
After thinking about it, several members of the LessWrong team (plus me) have become more concerned about the economic effects of coronavirus, and I've added several economic questions to the list. Suggestions from Romeo Stevens and Eli Tyre have also been incorporated.

This Week's Spotlight Questions

What is my prognosis (short term or long term) if I am infected with coronavirus?
We've attempted to answer these before, but that was weeks ago, and there's a lot more information accessible now, so I'd like to give it another shot.

What will the economic effects of a 3-week quarantine be? 3 months?
This is a deliberately broad question, in part because it’s not obvious what the right specific questions are yet, and in part because the right specific questions will vary a lot by locale and we want people to be free to answer for any locale they have information on.
Obviously there are a lot of potential answers to this, and it's hard to be comprehensive. That's okay. The goal is to make incremental progress on the broad topic and identify specific points that would benefit from further research.
A bit of background about me before I go into the question: I am a sophomore studying Mechanical Engineering in India.
I have noticed that I have forgotten about 80-90% of the coursework I did during the first year. Don't get me wrong, I studied the courses properly and not just for the tests. Still, if you were to ask me how much of the coursework I remember now, at best I would remember the general idea of the material I read.
This is very startling from a long-term perspective. College work in India is generally more overloaded than in other countries (from what I have observed), so people consume a lot of knowledge in a very short amount of time and forget it before they can make any use of it at all (leaving aside the question of whether the knowledge is useful in the first place). This occurs despite the best intentions to learn, and especially so with complicated material. I am not just talking about facts here: whole concepts and ideas of a subject tend to be forgotten sooner than we can find any use for them. I am pretty confident this applies in most colleges (in India or not).
This throws up a host of questions for me. The major premise for attending college is to gain knowledge that I can then apply to my job and life. The other touted premise is "learning to learn", or learning to solve problems; if that were the objective, I feel the college apparatus is a very ineffective way of achieving it (I will elaborate on this if required). Assuming the former premise is the actual one, I do not think the college system accounts for my forgetting curve. Even if you take proactive steps and learn the material properly, you are still likely to forget it before you use it. It is impractical to practice spaced repetition for multiple semesters' worth of coursework. And even if you were to do it, the question (which I will go into in detail further down) is: is it worth putting this much effort into pre-learning material, plus the effort of retaining it, in order to finally use some small portion of it later in your life?
All of this, I feel, is part of a bigger question:
What utility do I gain from pre-learning any knowledge that I am not going to use in the near future?
This question presupposes that you are not learning out of pure curiosity, but rather with the hope of using the knowledge later on. Basically, you expect that this knowledge will help you meaningfully constrain your anticipations and help you make decisions.
I feel this question has multiple sub-questions, to which I have only partially found answers (hence the post/question).
1. How exactly do I quantify my forgetting curve?
2. How much of what I am pre-learning is going to be useful? (This differs on an individual basis.)
3. Assuming that what I have learnt is useful, how much extra effort do I have to put in to make sure my knowledge stays intact at the time of use? This may be effort through spaced repetition or any other retention method you use. I feel a useful parameter to define, in addition to the forgetting curve, is the time/effort required at any point to get back to 50% or 80% of the starting knowledge. I would highly appreciate any papers on this (hopefully ones where the subjects learn undergrad-math-level concepts).
4. What should the (negative) exponential scaling factor of the utility be when you finally do end up using the knowledge? (This also might differ on an individual basis.)
5. Finally, given answers to the previous four: is it worth investing your time in pre-learning something? We can debate the degree of pre-learning here; I feel the amount that we generally do in college is waaaay off.
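To make question 1 concrete, here is a minimal sketch of the classic Ebbinghaus-style exponential decay model. The "stability" constant is a made-up number for illustration, not something measured; quantifying it for yourself is exactly the open part of the question.

```python
import math

def retention(t_days, stability):
    """Ebbinghaus-style retention: fraction of material still recalled
    after t_days, for a memory 'stability' constant (larger = slower
    forgetting)."""
    return math.exp(-t_days / stability)

def days_until(target, stability):
    """Invert the curve: days until retention falls to `target`."""
    return -stability * math.log(target)

# With a hypothetical stability of 30 days, retention after one
# semester (~120 days) with no review:
print(round(retention(120, 30), 3))   # 0.018, i.e. ~2% recalled
# Days until only half the material remains:
print(round(days_until(0.5, 30), 1))  # 20.8
```

Spaced repetition works by resetting (and growing) the stability constant with each review, which is why the retention question and the extra-effort question in point 3 are really two views of the same curve.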
PS: This is the first time I am posting on LessWrong, so I am not exactly sure whether this qualifies as a question or a post. Forgive me if it is a bit rough.
We, on behalf of Open Longevity, together with the International Longevity Alliance, wrote a letter to WHO about the need for open anonymized medical data on patients with COVID-19 and the risk factors associated with aging. If WHO listens to us, this will accelerate the development of therapies against coronaviruses and against risk factors, and help fight future epidemics. The letter was signed by scientists from the USA, Europe, Israel and Russia, as well as longevity activists.
We are confident that WHO is now receiving a lot of requests, and our letter will be lost in the noise if we do not make additional efforts to promote it. Therefore, we have prepared a petition that can be signed by anyone who agrees with us: http://chng.it/cLwkxSsP
If the arguments presented in the petition seem reasonable to you, please sign it. Repost, send it to your friends directly. This will help fight coronavirus.
What kind of data are we requesting from WHO? Medical data: medical history, blood tests, x-rays, etc. (for example). And the thing is that WHO does not want to share! Here's what they state:
“In accordance with Article 11(4) of the IHR (2005), WHO will not make the Anonymized-COVID-19 Data generally available to other State Parties until such time as any of the conditions set forth in paragraph 2 of such Article 11 are first met and following consultation with affected countries.
Pursuant to that same Article 11, WHO will not make Anonymized-COVID-19 Data available to the public, unless and until Anonymized-COVID-19 Data has already been made available to State Parties, and provided that other information about the COVID-19 epidemic has already become publicly available and there is a need for the dissemination of authoritative and independent information.”

Why do we need open data?
However, open medical data, simply speaking medical history (anonymized of course, without names and surnames), is needed to:
- Predict the severity of the disease course. A medical history, together with blood tests, age, questionnaire responses, etc., will help with predictions;
- Develop therapies taking into account risk factors or directly aimed at eliminating risk factors;
- Train better machine-learning models. The predictive power of models depends very much on the number of samples they are trained on. This is especially true for omics data, where the requirements for the minimum number of samples are much higher due to the large number of parameters in the models.
These are reasons why medical data would be useful today. But there are a number of other reasons, associated with future research and preventive measures. And not only in the future: all of this may come in handy now, since the solution to the problem of high mortality at older ages may lie in the biology of aging. Medical data is thus also needed for:
- Dealing with aging risk factors during future epidemics;
- Creating open medical datasets with annotation of patients age parameters;
- Fixing the fact that existing national health systems cannot cope with the current situation. The cornerstone is the issue of collecting, storing, and analyzing the medical data necessary for successful research.
a) Local storage. Each national system (and sometimes even each medical facility) stores patient data in its own format with its own access rules. Data transfer from hospital to hospital or from country to country is difficult. Testing protocols are also local.
b) Only part of the data is made available to other scientists, at the discretion of a particular researcher and in accordance with the recommendations of regulatory authorities.
The prerequisites for solving these problems have long been known: cloud storage, anonymization and de-identification technologies, and blockchain for secure and controlled access. And now is exactly the moment when the difficulties of standardizing formats can be effectively solved, when many people are ready to get involved in activities that contribute to a quick exit from a critical situation.
Many countries are currently attracting volunteers to help doctors treat patients with COVID-19. However, a huge number of bioinformatics and IT specialists can be no less useful in this situation. Creation of a prototype of a global patient database and the local involvement of one or two IT specialists in a hospital can help quickly, efficiently and relatively inexpensively (with the help of volunteers) collect data in a standardized format for subsequent analysis by the best scientists and AI algorithms around the world.
By allowing access to all types of anonymized or unidentifiable data now, using patient data from COVID-19 as an example, WHO can significantly accelerate the development of vaccines and treatment protocols. In addition, the current situation can serve as a tremendous impetus for optimizing the entire system of working with medical data, allowing us to develop an algorithm for the exchange and standardization of data on an international scale.

What other types of data are important for dealing with coronavirus?
- Genomic data, primarily genomes and phylogenetic trees of the virus (examples are in the list below). Here, by the way, things are much better with openness. This data is needed to track differences in strains of the virus in different populations/countries, and to understand how universal therapies and tests will be. You can also select the most evolutionarily conserved regions of viral RNA to target; potentially, these may yield the most effective therapies.
- Transcriptome data (primarily sc-RNA-Seq of immune cells). Here's the thing: hypermutation and VDJ recombination of the genes coding for antibodies and T-cell receptors occur in immune cells. That is, the genomes of immune cells differ from one another. The set of known sequences (clonotypes) of antibodies and T-cell receptors is called a repertoire. Since these are coding regions, the repertoire is most often recognized from single-cell RNA sequencing of immune cells. This data is needed to compare people who have recovered from the illness with non-infected ones. You can also compare the repertoires of immune cells of different infected people (with different severities and courses of the disease). In the end, all this will help to diagnose the disease and develop a vaccine.
- And, by the way, there is a clear deficit of this data; the Antibodies Society even called for action: “...the AIRR-C hereby calls upon its members, and the wider research community, to share experiences, resources, samples, and data as openly and freely as possible, and to work within their respective systems to break down barriers to achieve this goal, subject to the overarching directives of respect, privacy, and protection for patients and all people. We are in this together.”
- Information about test kits and diagnostics. Many tests haven't had time to be certified, and clinics and some countries are afraid to use them; this applies not only to test kits but also to PCR machines and other equipment.
- Data on the spread of the virus, and prognoses. The situation is getting better every day, but diagnostics, still far from perfect, distort the epidemiological data.
- General educational information. WHO is doing well in this department.
- Data on publications and clinical trials.
- Newsfeeds with new articles on the topic.
Existing initiatives, including Kaggle Challenges, do not solve the problem of collecting and forming medical COVID-19 datasets and are focused on other tasks (training NLP systems on texts about coronavirus, analysis of genomes, predicting the spread of the virus, etc.).
The idea is to find ways to significantly reduce the mortality rate from COVID-19 by influencing risk factors. IL-6 is an example of a promising risk factor target.
Sign the petition! The World Health Organization is obliged both to share its existing medical data and to organize the work of obtaining new, high-quality data. Contribute to our common cause: the fight against death.
The virus is thought to spread mainly from person-to-person.
- Between people who are in close contact with one another (within about 6 feet).
- Through respiratory droplets produced when an infected person coughs or sneezes.
These droplets can land in the mouths or noses of people who are nearby or possibly be inhaled into the lungs.
It may be possible that a person can get COVID-19 by touching a surface or object that has the virus on it and then touching their own mouth, nose, or possibly their eyes, but this is not thought to be the main way the virus spreads.
Wait, what?? This is wildly at odds with the anti-transmission messaging I've heard. Where I live (USA), I hear endless appeals to wash hands, not shake hands, and not touch your face, etc. I hear barely a whisper about the grave risks of being indoors in a public, poorly-ventilated space. I mean, that CDC page seems to imply that standing in a poorly-ventilated grocery store, even >6 feet from others, may well be riskier than touching a point-of-sale touchscreen and then immediately touching your face. (Remember, the virus stays in the air 30 minutes.) That CDC page implies that just breathing inside an empty (but recently occupied) elevator is potentially even higher risk than licking the elevator buttons. Really? Really???
(And don't get me started on masks... Masks and goggles are not only not recommended in the USA, they're actively stigmatized, despite makeshift homemade masks being, I think, at least somewhat effective and not contributing to the ongoing supply shortage, and goggles not being in short supply at all....)
I'm posing this as a question because I don't have enough confidence, without doing more research, to declare that our public health messaging has been so wildly misdirected (at least, the messaging I've received). Does anyone have better evidence? Or what's your take?
I live in a relatively small and out-of-the-way city in Canada. My geographic region only has a few dozen confirmed cases of Covid-19. Nonetheless, we are bracing for impact.
Suppose I or someone close to me becomes a confirmed case. If epidemiologist types come asking, how much would it help for me to have a list of everyone I've associated with on each day starting now?
I am currently tracking the local data for where I live to get an idea of risk level for me personally.
The model is simple: total population, reported infections (not sure whether I want to adjust for underreporting or not, but that should be mechanically simple even if putting a number on it isn't), and some estimate of how many people I might "interact" with on a daily basis.
The formula is the one posted on Marginal Revolution for calculating the probability that someone at a conference of size X is infected. Not quite the same setting, but I don't mind having an overstated risk.
One thing I'm wondering about is how to estimate the number of people in "the conference". This has to be the number of people I might randomly cross paths with that could transfer the virus to me. Since some of the risk then comes from being in public places, such as a grocery store, I'm wondering how best to think about that setting.
One way would be to think about the average number of daily shoppers and workers at the store. Another might be the number of people I am waiting in line with and passing in the aisles. Clearly the two will be significantly different.
I'm also not thinking about any cumulative impact here: not the probability of infection in the next 3 months (or even of interacting with someone infected over that period, which is actually the calculation I am doing), but what today looks like.
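For reference, the conference-style calculation is just this. The prevalence multiplier and daily contact count below are made-up numbers for illustration, not estimates for any real city:

```python
def p_any_infected(prevalence, n_contacts):
    """Probability that at least one of n_contacts people is infected,
    assuming contacts are independent draws at the given prevalence."""
    return 1 - (1 - prevalence) ** n_contacts

# Hypothetical numbers: 50 reported cases in a city of 100,000,
# a 10x underreporting multiplier, and ~30 people crossed paths
# with in a day:
prevalence = 50 * 10 / 100_000                   # 0.005
print(round(p_any_infected(prevalence, 30), 3))  # 0.14
```

Note this is the probability of encountering someone infected, not of getting infected; multiplying by a per-encounter transmission probability would shrink it further, which is part of why I'm comfortable with the overstated version.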
I'd be interested in whatever thoughts those here have.
On Friday I attended the 2020 Foresight AGI Strategy Meeting. Eventually a report will come out summarizing some of what was talked about, but for now I want to focus on what I talked about in my session on deconfusing human values. For that session I wrote up some notes summarizing what I've been working on and thinking about. None of it is new, but it is newly condensed in one place and in convenient list form, and it provides a decent summary of the current state of my research agenda for building beneficial superintelligent AI; a version 1 of my agenda, if you will. Thus, I hope this will make it a bit clearer what I'm working on, why I'm working on it, and what direction my thinking is moving in. As always, if you're interested in collaborating on things, whether that be discussing ideas or something more, please reach out.

Problem overview
- I think we're confused about what we really mean when we talk about human values.
- This is a problem because:
- building aligned AI likely requires a mathematically precise understanding of the structure of human values, though not necessarily the content of human values;
- we can't trust AI to discover that structure for us because we would need to understand it enough to verify the result, and I think we're so confused about what human values are we couldn't do that without high risk of error.
- What are values?
- We don't have an agreed upon precise definition, but loosely it's "stuff people care about".
- When I talk about "values" I mean the cluster we sometimes also point at with words like value, preference, affinity, taste, aesthetic, intention, and axiology.
- Importantly, what people care about is used to make decisions, and this has had implications for existing approaches to understanding values.
- Much research on values tries to understand the content of human values or why humans value what they value, but not what the structure of human values is such that we could use it to model arbitrary values. This research unfortunately does not appear very useful to this project.
- The best attempts we have right now are based on the theory of preferences.
- In this model a preference is a statement located within a (weak, partial, total, etc.)-order. Often written like A > B > C to mean A is preferred to B is preferred to C.
- Stated vs. revealed preferences: we generally favor revealed preferences, but this approach has some problems:
- we can only infer preferences from observed behaviors; latent preferences remain hidden
- inferring preferences from observation requires making normative assumptions, and if we don't make normative assumptions there are too many free variables
- General vs. specific preferences: do we look for context-independent preferences ("essential" values) or context-dependent preferences?
- generalized preferences, e.g. "I like cake better than cookies", can lead to irrational preferences (e.g. non-transitive preferences)
- contextualized preferences, e.g. "I like cake better than cookies at this precise moment", limit our ability to reason about what someone would prefer in new situations
- See Stuart Armstrong's work for an attempt to address these issues so we can turn preferences into utility functions.
- Preference-based models look to me like they're trying to specify human values at the wrong level of abstraction. But what would the right level of abstraction be?
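As an aside, the irrationality problem with generalized preferences is easy to state in code: if the strict-preference relation contains a cycle, no utility function can represent it. A minimal sketch (the food examples are hypothetical):

```python
def has_cycle(prefs):
    """prefs: pairs (a, b) meaning 'a is strictly preferred to b'.
    If the relation contains a cycle, no utility function u with
    u(a) > u(b) whenever a is preferred to b can exist."""
    graph = {}
    for a, b in prefs:
        graph.setdefault(a, set()).add(b)
    visiting, done = set(), set()

    def dfs(node):
        # depth-first search; a node revisited while still 'visiting'
        # means we walked around a preference cycle
        visiting.add(node)
        for nxt in graph.get(node, ()):
            if nxt in visiting or (nxt not in done and dfs(nxt)):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(n not in done and dfs(n) for n in list(graph))

# cake > cookies > fruit > cake is intransitive: no utility function
print(has_cycle({("cake", "cookies"), ("cookies", "fruit"), ("fruit", "cake")}))  # True
# a transitive chain is fine
print(has_cycle({("cake", "cookies"), ("cookies", "fruit")}))  # False
```

An agent with the cyclic preferences above can be money-pumped: it will pay to trade around the cycle indefinitely, which is the standard argument for why such preferences are called irrational.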
- What follows is a summary of what I currently think moves us closer to less confusion about human values. I hope to come away thinking some of this is wrong or insufficient by the end of the discussion!
- Humans are embedded agents.
- Agents have fuzzy but definable boundaries.
- Everything in every moment causes everything in every next moment up to the limit of the speed of light, but we can find clusters of stuff that interact with themselves in ways that are "aligned" such that the stuff in a cluster makes sense to model as an agent separate from the stuff not in an agent.
- Basic model:
- Humans (and other agents) cause events. We call this acting.
- The process that leads to taking one action rather than another possible action is deciding.
- Decisions are made by some decision generation process.
- Values are the inputs to the decision generation process that determine its decisions and hence actions.
- Preferences and meta-preferences are statistical regularities we can observe over the actions of an agent.
- Important differences from preference models:
- Preferences are causally after, not causally before, decisions, contrary to the standard preference model.
- This is not 100% true. Preferences can be observed by self-aware agents, like humans, and influence the decision generation process.
- So then what are values? The inputs to the decision generation process?
- My best guess: valence
- My best best guess: valence as modeled by minimization of prediction error
- This leaves us with new problems. Now rather than trying to infer preferences from observations of behavior, we need to understand the decision generation process and valence in humans, i.e. this is now a neuroscience problem.
- underdetermination due to noise; many models are consistent with the same data
- this makes it easy for us to get confused, even when we're trying to deconfuse ourselves
- this makes it hard to know if our model is right since we're often in the situation of explaining rather than predicting
- is this a descriptive or causal model?
- both. descriptive of what we see, but trying to find the causal mechanism of what we reify as "values" at the human level in terms of "gears" at the neuron level
- what is valence?
- complexities of going from neurons to human level notions of values
- there's a lot of layers of different systems interacting on the way from neurons to values and we don't understand enough about almost any of them or even for sure what systems there are in the causal chain
- Valence in human computer interaction research
Thanks to Dan Elton, De Kai, Sai Joseph, and several other anonymous participants of the session for their attention, comments, questions, and insights.
This essay about dog ownership helped me empathise with dogs, and also caused me to update against getting a dog; it'd either be much more selfish and cruel than I previously thought, or else would require a lot more focused effort in order to give the dog a meaningful life.
Most of the essay is doing valuable and interesting work staring into the abyss and trying to help you see whether there is a dystopian horror occurring around us. But I'll mostly take away from it a clearer sense of what it looks like for a non-human animal to have purpose and meaning; it feels like a conceptual update that I won't easily be able to forget or ignore. (It has helped me think more clearly about animals and their values much more than most of the philosophical discussion I've seen on the topic, and I found it more useful than much extended discussion about consciousness and pleasure/pain.)
Here are three quotes to help you understand the post and entice you into reading the whole thing.
After about three days, the dog started following me everywhere. If I sat on the couch to watch tv, the dog would curl-up under my outstretched legs resting on the coffee table. If I sat at the dinner table, it would sit beside me, and watch me throughout the entire meal. If I went to the bathroom, it would follow me to the door and wait outside. At night, the dog curled up in my bed and slept beside me. The dog started walking more, and she would almost always perfectly follow my lead; she walked at just the right pace so she stayed beside me, neither lagging behind my fast stride, nor pulling ahead. On the rare occasions she got distracted by a smell or other dog, I gently tugged on her leash and called her name, and she scurried over to me.
I found it kind of creepy.
Yes, I know, it’s a dog. But still… I felt like I had been granted a level of submissiveness from a sovereign being which I hadn’t earned. All I had done was feed and walk the dog – and I apparently did this so badly that the dog was massively depressed – and yet she worshipped me.
The following section is something the author is themself quoting from a reddit thread:
The most accurate thing I can say about dogs is I feel sorry for them. My immediate family didn’t own dogs growing up, but my extended family had farms or large acreage plots with 3-5 dogs running around all day. They eat, sleep, shit, and run around exploring with their pack hours a day whenever they want.
Compare to city dogs. Mostly live in matchbox apartments. A typical weekday is likely 9-12 hours home alone. You can’t run. You can’t shit. You are bored out of your fucking mind. Your human comes home and walks you for 15 minutes on a leash. It’s the highlight of your day. Human is tired and eats dinner in front of the TV while you get scratches. Maybe you sleep in the same bed as your human. You’re probably pretty tired after an entire day of mostly not moving.
Weekends if you’re lucky, you go to a dog-friendly park. Maybe you get off leash. Maybe you never get off leash because you’re too spazzy around other dogs/humans. It’s completely understandable to be spazzy. You are chronically understimulated. One of your only opportunities to get energy and action in life is by “misbehaving” or harassing strangers.
When I walk past someone with a dog and the dog is just pulling as hard as s/he can at the leash to pounce on me, you can’t think that’s instinct. No animal in the wild thinks it’s a good idea to go fuck with something 3-30x it’s bodyweight. It’s pure boredom. The dog is just trying to stimulate itself before it’s forced back in front of the TV to watch The Office again.
There’s a laundry list of other topics like neutering, diet, training, etc that I won’t elaborate on. There’s enough grey area for people to get away with justifying whatever happens to be easiest for them, obviously, but I hope it’s also obvious that there are many many ways in which the life of a dog is diminished compared to…. other normal living organisms…
And from the sections on needs, pleasures, and meaning.
With the exceptions of abusive or negligent owners, owned dogs get their needs met. In fact, dogs get their needs met better than pretty much any non-pet animals in the world. Unlike wild animals, dogs aren’t faced with the daily life-and-death struggle for survival. They don’t need to hunt or scrounge for food, they don’t need to worry about a tainted water source, they don’t need to evade predators, etc. And unlike farm animals, their deaths almost certainly won’t come at the hands of their owners, especially not in the first 25% of their max lifespans.
I’d say most dog owners have a mixed record of fulfilling their dog’s pleasure (ignore the innuendo). On the positive side, owned dogs will usually get lots of treats, toys, and petting... Diligent owners will devote significant time to taking their dog out of the house to run around, fetch, and hopefully interact with other dogs, but plenty of owners won’t and will leave their dogs perpetually under-stimulated at home... Where dog owners fall the shortest in providing for their dog’s pleasure is in – again, ignore the innuendo – sex. By the 2010s, 83% of American dogs were neutered, and presumably most other owners do everything they can to discourage their non-neutered dogs from having sex (which is arguably a worse fate for the dog). I don’t think it’s a stretch to say that depriving an animal of an act for which it is biologically programmed to derive the most extreme of pleasures is likely detrimental to the animal’s wellbeing. Ask yourself: for what other gains would you be willing to give up sex for the rest of your life?
Third, meaning is activity and goals which provide long-term value to the being. Admittedly, it’s hard enough to identify meaning in humans, so it’s even harder to do so in dogs, but I’m going to take a shot anyway.
This post continues our exploration of truth. Here we begin to dig into the important parts of rationality, starting with the concept of Necessity and Necessary Beliefs.
I was having an argument with a friend the other day. It went vaguely like this,
Friend: "I'm not very disciplined. At some point I'm going to buckle down and train myself to be much more disciplined."
Me: "From experience and from what I know about humans, that's not going to work."
Friend: "Why? Motivation can come from within. If you can just train yourself like you're in the army, then you can become just as self disciplined as a soldier."
Me: "Yes, but the reason why people in the military are disciplined is because they have social incentives to be. In order to become disciplined, you need to create an environment for yourself that shapes your motivation. You can't just wake up one day and become a soldier."
Friend: "Sure, you might have to set up some environment like that. But once you've trained yourself, the discipline will stick, and you will be able to motivate yourself from then on."
Me: "This theory would predict that people who were trained in the military would be much more productive three years after their service, compared to people who were never trained in the military. Do you agree?"
Friend: "Yes, I think that is likely."
Me: "I disagree. They might be slightly more productive but I'd predict it would be pretty similar."
So who is right?
I haven't been able to find direct research, but this seems like a classic instance where a debate could be settled by simply referencing a high-quality experiment.
Before the pandemic, Scott Alexander's short post reviewing a few correlational studies on gun ownership and violence left me feeling uncertain about the moral status of owning a home defense weapon. Times have changed though, and I suspect that there will be a larger risk of home invasion during the course of COVID-19's spread. Many people are buying guns and ammunition in what is likely preparation for this increased risk.
Assuming that I continue to own the gun after the inflated risk of home invasion due to COVID-19 decreases to a negligible level, should I buy a gun for home defense now?
Kinsa Smart Thermometer Dataset
Kinsa is a company that makes smart thermometers. A few years ago, they found that they could use the data that they got from their smart thermometers (most importantly the temperature reading and location of the user) to track flu trends across the United States. (FitBit has done something similar.)
Kinsa's data science team has now turned their attention to COVID-19 trends and started a tracking website using their thermometer data, using methods which they explain in more detail on their technical approach page. It looks like the most impressive thing they've been able to do with this dataset so far is to identify new hotspots before other people do, like the increase in cases in southern Florida. But potentially there are a lot of other things that can be done with these sorts of data.
Estimating the Number of Coronavirus Infections in the US
One of those other things which might be doable with these sorts of data: coming up with more accurate estimates of the number of people with coronavirus. Testing in the US (and many other places) is spotty and delayed a great deal, estimating the number of infections based on the number of deaths involves a very long delay and a bunch of assumptions, etc. But if you can count the number of people in America with a fever (or extrapolate from a sample), and subtract off the baseline estimate of how many fevers you'd expect from influenza or other causes, then you can get an estimate of the number of people in the US with a fever due to coronavirus. And that gets you close to an estimate of the total number of coronavirus cases.
The coronavirus tracking website that Kinsa set up is already doing much of this - their graph (also shown below) shows something like the number of people with a fever and the baseline expected number of fevers.
So I decided to give it a try and use their graph to estimate the total number of coronavirus cases in the US.
It's a fairly rough first-pass analysis, which may contain errors, and could definitely be improved with some more work. The number I got at the end is that about 1% of Americans have gotten coronavirus through March 20.
My Estimation Method, in Brief
The graph above shows something like "number of new fevers" (on an unclear scale labeled "% ill") and Kinsa's estimate of the expected number of fevers if there were no coronavirus. So the gap between the two lines represents something like the number of new fevers each day due to coronavirus. That trend has an odd shape for a pandemic: it increases and then levels off. I suspect this is because, once people start taking precautions against coronavirus, the number of flu cases drops dramatically, so the estimated baseline gets farther and farther from the actual number of flu cases, and coronavirus accounts for a larger and larger share of the new fevers. You can view regional trends by clicking on particular counties; regions like the SF Bay Area and Seattle show a similar shape on earlier days. The SF Bay Area is actually now anomalously below baseline in new fevers as of March 21.
I decided to deal with this by focusing on the trend up until March 14, and extrapolating from there. (It would be even better to do this separately for each county and then aggregate them.)
Next step: making sense of the y-axis. A little digging showed that it comes from their flu work, where they fit their data to a particular measure of flu prevalence that the CDC uses, ILINet data (explained partly down the page here). A little more digging into the relationship between this number and the number of flu cases reported by the CDC (as seen in headlines like this) suggests that 1 point on the scale corresponds to roughly 75,000 new flu cases that day (which probably means about 75,000 new fevers). There's a more detailed explanation of where that number comes from in my longer writeup.
So the gap of 0.79 scale points between observed and expected on March 14 corresponds to about 60,000 excess new fevers that day, which we're guessing are entirely due to coronavirus. Using either their data for previous days, or assumptions about the growth rate in cases, we can turn that into an estimate of the cumulative total number of feverish cases as of that day. I tried both and got numbers of 470,000 and 370,000, so let's call it about 420,000 total cumulative cases through March 14.
But this only counts the coronavirus cases that do get a fever, and (more importantly) it only counts them once they get the fever. My guess is that a bit more than one doubling time passes between infection and fever; adjusting for that delay, and for the cases that never develop a fever, the total number of coronavirus infections on March 14 was about 3x the number of feverish cases, or about 1.3 million.
Extrapolating forward assuming a 4-day doubling time gives an estimate of 3.6 million cases in the US through March 20, or 1.1% of the population.
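The arithmetic above can be condensed into a short back-of-the-envelope script (a rough sketch using the post's numbers; the 75,000-cases-per-scale-point conversion, the 3x adjustment, the 4-day doubling time, and the 330 million US population figure are all assumptions):

```python
# Numbers from the post (assumptions, not independently verified):
gap_points = 0.79            # observed minus expected "% ill" on March 14
fevers_per_point = 75_000    # rough conversion: scale points -> daily new fevers
excess_new_fevers = gap_points * fevers_per_point        # ~59,000 that day

# The post estimates the cumulative total of feverish cases two ways
# (summing earlier days vs. assuming exponential growth), getting 470k and 370k:
cumulative_feverish = (470_000 + 370_000) / 2            # ~420,000 through March 14

# Adjust (x3) for the delay between infection and fever, and for cases
# that never develop a fever:
infections_mar14 = 3 * cumulative_feverish               # ~1.3 million

# Extrapolate 6 days forward (March 14 -> March 20) with a 4-day doubling time:
doubling_time_days = 4
infections_mar20 = infections_mar14 * 2 ** (6 / doubling_time_days)

us_population = 330e6
print(f"Excess new fevers on March 14: {excess_new_fevers:,.0f}")
print(f"Estimated infections through March 20: {infections_mar20 / 1e6:.1f} million "
      f"({infections_mar20 / us_population:.1%} of the US population)")
```

Changing any of the assumed inputs (the doubling time, the 3x adjustment, the scale-point conversion) shifts the final number substantially, which is part of why this should be treated as a rough first pass.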
So that's the basic method and estimate. The longer writeup goes into more detail about each step, and includes various things I'm still confused or uncertain about and ways in which this analysis might be wrong. For instance, maybe concerns about coronavirus are causing people to take their temperature more often, which is enough to increase the number of measured fevers, and a large part of the upward trend is due to that rather than to actual coronavirus cases.
I'm interested in improving this estimate, or having other people go off and do their own estimate. And I'm especially interested in people finding more good things to do with this sort of dataset.
As part of the LessWrong Coronavirus Link Database, Ben, Elizabeth and I are publishing daily update posts with all the new links we are adding each day that we ranked a 3 or above in our importance rankings. Here are all the top links that we added yesterday (March 21st), by topic.
Dashboards
Uses current data to show seasonal illnesses in the US, and indicates whether this is abnormal or expected. Oregon State University.
DIY
Clear explanations of why you need oxygen, how the various devices work, links to open source projects for building them, etc.
Guides/FAQs/Intros
List of 40 activities to do indoors that are cheap/free.
(BP) ClearerThinking always does things competently and well, and I think the list is genuinely good.
Spread & Prevention
Lots of up-to-date info and good graphics. Centre for Mathematical Modelling of Infectious Diseases.
Work & Donate
He claims they have a 114x return on donations. Donation page is mrelief.com/donate.
Other
Short concrete outline of how it might happen. Briefly explains that the lockdown itself is not a likely path to authoritarianism; rather, healthcare tracking could become necessary and, combined with location tracking, become overpowered.
Full Database Link
Petri Hollmén traveled to Tyrol on the 5th of March. He had a bottle of hand sanitizer with him, used it a lot and washed his hands like never before.
On Sunday the 8th he returned home, and heard a day later that Tyrol had been declared a COVID-19 epidemic area. He decided to work from home, given the higher risk from having been in an epidemic area. On Thursday the 12th he woke up feeling normal, but his Oura ring showed his readiness down to 54 from its normal 80-90, mostly because his finger temperature at night was elevated by 1°C.
Even though he felt normal, he went to the doctor, and given that he had come from an epidemic area, they decided to test him. He tested positive and went into self-quarantine for 14 days. He measured his temperature several times during the following day and it always came back at 36.5°C. The Oura ring provided evidence for his diagnosis that wouldn't have been available otherwise.
While he didn’t have a true fever as defined by the official gold standard, he did have a kind of clinically relevant fever. It’s my impression that our medical community is too focused on gold standards based on old and outdated technology like mercury thermometers.
Even when new measurements like nightly finger temperature don’t match the gold standard, there are still cases where the information allows for better clinical decision-making.
Today, we have cheap sensors and machine learning that provide a different context for making medical decisions than going to the doctor’s office.
Testing by doctors is very important in the fight against COVID-19, but people need to know when it’s time to go to the doctor. Hollmén needed his Oura ring to know that it was time to get tested professionally.
If we want to avoid millions of deaths without choking our economy with long-term quarantines, we need to get good at catching cases of COVID-19 as fast as possible when they happen in the wild.
An analysis of Fitbit users found that resting heart rate and total amount of sleep can be used to predict the official state numbers for influenza-like illness.
It’s very likely that lower heart rate variability and a higher nightly minimum heart rate occur in at least some COVID-19 cases. Unfortunately, the WHO is stuck in the last century, and the official symptom charts tell us nothing about how common either of those metrics is in COVID-19 patients. The absence of those metrics from official statistics makes it harder for people who have an Oura ring, an Apple Watch, or another device that can measure nightly heart rate to make good decisions about when to go to the doctor or self-quarantine.
Given that Apple sold around 50 million Apple Watches between 2018 and 2019, a sizable portion of people could make better decisions if we had more information about how COVID-19 affects heart rate.
Even more people have access to a smartphone with a decent camera. A sore throat is a typical symptom of many viral infections, including COVID-19, and a good machine learning algorithm could extract valuable data from throat photos.
A priori, it’s unclear how much we can learn from such pictures. If a patient’s throat is red due to inflammation, a doctor looking at it can’t distinguish whether the cause is snoring or a viral infection.
If a machine learning algorithm had access to a steady stream of daily images of a person’s throat, it could learn that person’s baseline and use it to factor out the effects of snoring.
When the gold standard for diagnosing the throat is to look at a single image at one point in time in the doctor’s office, there’s potentially a big improvement to be gained by looking at a series over multiple days. We won’t know how useful such a diagnostic tool is before building it.
Ideally, users of a new app would take an image of their throat every morning after getting up and every evening before going to sleep. They would also measure their temperature with a normal thermometer at both points and enter information about subjective symptoms. If a person gets a proper COVID-19 test, they should also be able to enter the data.
At first we would train the machine learning algorithm to use the images to predict temperature. With enough users, the algorithm could learn how the throat of a person with the flu differs from their baseline, whether or not they are snoring.
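The personal-baseline idea can be illustrated with a toy simulation (everything here is hypothetical: "redness" stands in for whatever feature a real vision model would extract from throat photos, and all the numbers are made up). The point is that the deviation from a user's own baseline tracks temperature better than the raw value, because it factors out chronic redness from things like snoring:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 50 users x 30 days. Each user has their own chronic baseline
# redness (e.g. from snoring), plus an illness effect that also raises
# temperature, plus measurement noise.
n_users, n_days = 50, 30
baseline = rng.normal(0.5, 0.15, size=(n_users, 1))       # per-user chronic redness
illness = rng.binomial(1, 0.1, size=(n_users, n_days))    # ~10% of user-days feverish
redness = baseline + 0.3 * illness + rng.normal(0, 0.05, size=(n_users, n_days))
temperature = 36.6 + 1.5 * illness + rng.normal(0, 0.1, size=(n_users, n_days))

# Raw redness mixes chronic redness with illness; subtracting each user's
# own median removes the per-user baseline.
deviation = redness - np.median(redness, axis=1, keepdims=True)

def corr(x, y):
    """Pearson correlation over all user-days."""
    return np.corrcoef(x.ravel(), y.ravel())[0, 1]

print(f"corr(raw redness, temperature):       {corr(redness, temperature):.2f}")
print(f"corr(baseline-adjusted, temperature): {corr(deviation, temperature):.2f}")
```

A real system would replace the scalar feature with a learned image embedding, but the baseline-subtraction logic would be the same.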
As we get more users, and some of them receive COVID-19 lab tests, the algorithm could learn to predict the test results directly. It’s the nature of advanced technology that we don’t know how powerful a tool is before it’s developed; most clinical trials for new drugs find that they don’t live up to their promise.
We need more dakka for COVID-19. Creating an app that does the above function doesn’t cost much and the cost of the project should be worth the potential benefits of catching COVID-19 cases faster and thus preventing people from unknowingly infecting their friends.
Biological global catastrophic risks were neglected for years while AGI risks were at the top of the agenda. The main reason is that AGI was presented as a powerful superintelligent optimizer, while germs were seen as simple mindless replicators. However, germs are capable of evolving, and they search the space of possible adaptations very extensively via rapid replication. In other words, they perform an enormous amount of computation, far beyond what our computers can do.
This optimisation power creates several dangerous effects: antibiotic resistance (for bacteria), obsolescence of vaccines (for flu), and zoonosis: the transfer of viruses from animals to humans. Sometimes it can also be beneficial, as when a virus evolves toward a lower CFR.
In other words, we should think of coronavirus not as an instance of a virus on a doorknob, but as a large optimisation process evolving in time and space.
Thus, the main medium-term (3-6 months) question is how it will evolve and how we could make it evolve in better ways. In other words: what will the next wave look like?
There is a claim that the second wave of Spanish flu was more dangerous because of the large hospitals: the virus was “interested” in replicating in hospitals, so it produced more serious illness. Infected people had to go to hospitals, which were overcrowded after the war, and they infected other people there, including the medical personnel, who carried it to the next hospital.
Another point is that the virus’s optimisation power depends on the number of infected people, the number of virus generations, and the selective pressure. The idea of “flattening the curve” is the worst in this respect, as it combines a large number of infections AND a large number of virus generations AND high selective pressure. A cruel but short global quarantine may be better.
Suppose I were to say that the American legal system is a criminal organization. The usual response would be that this is a crazy accusation.
Now, suppose I were to point out that it is standard practice for American lawyers to advise their clients to lie under oath in certain circumstances. I expect that this would still generally be perceived as a heterodox, emotionally overwrought, and perhaps hysterical conspiracy theory.
Then, suppose I were to further clarify that people accepting a plea bargain are expected to affirm under oath that no one made threats or promises to induce them to plead guilty, and that the American criminal justice system is heavily reliant on plea bargains. This might be conceded as literally true, but with the proviso that since everyone does it, I shouldn't use extreme language like "lie" and "fraud."
This isn't just about lawyers - here are some cases in other fields:
In American medicine it is routine to officially certify that a standard of care was provided that cannot possibly have been provided (e.g. some policies on the timing of medication and tests can't be carried out given how many patients each nurse has to care for, but it's less trouble to fudge it as long as something vaguely resembling the officially desired outcome happened). The system relies on widespread willingness to falsify records, and would (temporarily) grind to a halt if people simply refused to lie. But if I were to straightforwardly summarize this - that the American hospital system is built on lies - I mostly expect it to be evaluated as an attack, rather than a description. And of course if any one person refuses to lie, the proximate consequences may be bad.
Likewise for the psychiatric system.
In Simulacra and Subjectivity, the part that reads "while you cannot acquire a physician’s privileges and social role simply by providing clear evidence of your ability to heal others" was, in an early draft, "physicians are actually nothing but a social class with specific privileges, social roles, and barriers to entry." These are expressions of the same thought, but the draft version is a direct, simple theoretical assertion, while the published version merely provides evidence for the assertion. I had to be coy on purpose in order to distract the reader from a potential fight.
The End User License Agreements that almost all of us falsely certify we've read in order to use the updated version of any software are of course familiar. And when I worked in the corporate world, I routinely had to affirm in writing that I understood and was following policies that were nowhere in evidence. But of course if I'd personally refused to lie, the proximate consequences would have been counterproductive.
The Silicon Valley startup scene - as attested in Zvi's post, the show Silicon Valley, the New Yorker profile on Y Combinator (my analysis), and plenty of anecdotal evidence - uses business metrics as a theatrical prop to appeal to investors, not an accounting device to make profitable decisions on the object level.
The general argumentative pattern is:
A: X is a fraudulent enterprise.
B: How can you say that?!
A: X relies on asserting Y when we know Y to be false.
B: But X produces benefit Z, and besides, everyone says Y and the system wouldn't work without it, so it's not reasonable to call it fraud.
This wouldn't be as much of a problem if terms like "fraud", "lie," "deception" were unambiguously attack words, with a literal meaning of "ought to be scapegoated as deviant." The problem is that there's simultaneously the definition that the dictionaries claim the word has, with a literal meaning independent of any call to action.
There is a clear conflict between the use of language to punish offenders, and the use of language to describe problems, and there is great need for a language that can describe problems.
For instance, if I wanted to understand how to interpret statistics generated by the medical system, I would need a short, simple way to refer to any significant tendency to generate false reports. If the available simple terms were also attack words, the process would become much more complicated.
In January 2020 I lived a zero-content life. This was partly justified by the book “the info diet”, but mostly based on a philosophy that proposes: in a lifetime with limited hours, there is a choice to either consume or create. And I’d rather create than consume.
To be honest, the idea for me was born in response to a meme complaining that if you think art should be free, try going without art for a month. This sounded like a fun and interesting idea.
My main sources of content were:
- Facebook feed
- Books (paper and TTS audiobook)
- Some music
- Youtube videos
I decided not to get fussy about content that I was actively reciprocal in creating. For example, a dance form and a conversation are two types of content that I am engaged in creating. The distinction is like playing sport versus watching sport (I’m allowed to play sport but not watch it). I wanted to make an exception for live music, but I wouldn’t usually see live music anyway.
So how did I go?
On the 1st of January I rearranged my phone screen to make content less accessible than my creation pathways. I don’t think I recorded anything, but in the first week I wrote 5000 words with all the time I had freed up. In the silence I noticed my mind go quieter. In the time I wasn’t spending reading, I did thinking. I drove places in silence. I started having phone calls with my friends. I just let myself go with whatever I wanted to go towards.
Without music coming in, earworms started appearing in my mind. Without content ideas coming from books, I had to start generating my own, applying my existing methods, or even inventing new ones.
I hadn’t realised how much an entrenched habit of reading (134 books in 2019) could limit my growth. Adding more information to the parts of me that needed it was a good thing, and now that I’ve slowed down, I’m more balanced.
I probably only needed to go book-free for a day to get the benefit, but I committed, since I wasn’t sure whether the effect deepened with more time. (It didn’t.) One of the major benefits I retained was that choosing to think, read a book, or call a friend became a free choice, where previously it was a compulsive urge to urgently read books.
I spend a lot more time having phone calls and exploring relationally now. With the handful of people I’m talking to, we’re growing therapeutically and healing each other as we reflect on our stuff together.
I read less, I talk more, I write more. I consider this experiment a success.
I am told that unsupervised machine translation is a thing. This is amazing. I ask: Could we use it to understand dolphin language? (Or whales, perhaps?)
I don't currently see a convincing reason why not. Maybe dolphins aren't actually that smart or communicative, and their clicks are mostly just very simple commands or requests - but that should just make this really easy. Maybe the blocker is that dolphins have such a different set of concepts from English speakers that it would be too hard?