# LessWrong.com News

A community blog devoted to refining the art of rationality
Updated: 24 minutes 50 seconds ago

### Has the effectiveness of fever screening declined?

March 28, 2020 - 01:07
Published on March 27, 2020 10:07 PM GMT

On March 6th, jimrandomh predicted that the effectiveness of fever-screening for coronavirus would decline, because the virus would evolve to produce fever later or not at all. Has this happened in countries that have been liberally applying fever-screening?

Discuss

### COVID-19 growth rates vs interventions

March 28, 2020 - 00:33
Published on March 27, 2020 9:33 PM GMT

It’s been a couple of weeks since I posted regarding the growth rates of COVID-19 cases in various countries. Now that these countries have implemented various control measures, it is clearer how each measure affects the growth rate. This post looks at the 10 countries with the highest number of confirmed cases.

Death growth rates roughly match confirmed case growth rates

Firstly, to check whether the confirmed case growth rates roughly reflect the actual growth rates, they can be compared to death growth rates (suggested in a comment by Unnamed on my previous post). This doesn’t show what the ratio is between the number of cases and the number of detected cases; it only shows whether the fractional growth per day in detected cases roughly represents the actual growth rate of the virus.

Obviously the death growth rate doesn’t perfectly indicate the spread of the virus but it has different biases so if the two are similar then they probably at least somewhat reflect the actual growth rate.
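The check above can be sketched in a few lines. This is a toy illustration with made-up numbers (the real series are in the linked graph), not the post's actual calculation: compute the fractional growth per day of each cumulative series and compare the averages.

```python
def daily_growth(series):
    """Fractional growth per day of a cumulative series: (today - yesterday) / yesterday."""
    return [(b - a) / a for a, b in zip(series, series[1:]) if a > 0]

def avg(xs):
    return sum(xs) / len(xs)

# Hypothetical cumulative counts; deaths lag cases but grow at a similar fractional rate
cases = [100, 130, 169, 220, 286, 372]   # roughly +30%/day
deaths = [10, 13, 17, 22, 29, 38]        # roughly +30%/day

# Similar averages suggest confirmed cases track the true growth rate,
# even though this says nothing about how many cases go undetected
print(round(avg(daily_growth(cases)), 2), round(avg(daily_growth(deaths)), 2))
```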

A graph to compare growth rates of confirmed cases & deaths of the 10 worst affected countries is here. (Select country at the top)

(Note: For the y-axis I have used the doubling time of infectious cases, where I’ve (fairly arbitrarily) defined the infectious period as 10 days. The actual length of time doesn’t matter too much to the results, but it’s important to have some limit, otherwise the recent China and South Korea results in particular make less sense. This makes the definition of doubling time for deaths a bit odd, but it's good enough for the purposes of comparing the rates.)
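To make the y-axis definition concrete, here is a minimal sketch (my own illustration, not the post's actual code) of how infectious cases and their doubling time could be computed from a cumulative series, using the arbitrary 10-day infectious period described above:

```python
import math

def infectious_cases(cumulative, window=10):
    """Approximate infectious cases as confirmed cases from the last `window` days."""
    return [c - (cumulative[i - window] if i >= window else 0)
            for i, c in enumerate(cumulative)]

def doubling_time(series, i, lag=1):
    """Days for the series to double at its current fractional growth rate.
    Negative values are halving times; flat or non-positive values give infinity."""
    prev, cur = series[i - lag], series[i]
    if prev <= 0 or cur <= 0 or cur == prev:
        return float('inf')
    return lag * math.log(2) / math.log(cur / prev)

# Toy cumulative series doubling every 3 days
cumulative = [round(100 * 2 ** (d / 3)) for d in range(14)]
active = infectious_cases(cumulative)
print(round(doubling_time(active, 13), 1))  # recovers ~3 days while growth is steady
```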

Generally the rates match well. Iran and USA are the main ones where there is significant divergence (and both look more reasonable recently), the others all seem sensible.

In theory there should be a lag between cases and deaths which we seem to see in Italy and Spain but the data is too noisy to say for sure.

Growth rates vs interventions

I’ve created 4 different displays of the cases data:

1. Cumulative confirmed cases

2. New cases per day

3. Doubling time for infectious cases

4. Fractional change in infectious cases per day

I’ve annotated the graphs with what anti-COVID actions each country has taken and when. Apologies for anywhere I’ve got these wrong; if you see any massive errors for your country, I’ll try to update.

Apologies also for the overlapping writing - mouse over the relevant points if it gets confusing and click the legend to toggle countries. Double-click to toggle between only that country or all countries.

There has been some form of lockdown for most countries but the exact extent differs between countries. I haven't attempted to distinguish between them.

There is expected to be a ~5 day delay between actions taken and effects being seen in the confirmed cases statistics as people are usually tested when they are symptomatic.

Uninhibited growth has a doubling time of 2-3 days

Refer to my previous post. I don’t really have much to add here, only that my initial calculations of doubling time had a small error, so the doubling times are actually slightly lower (i.e. growth faster) than I initially reported.

Growth with improved hygiene and social distancing has a doubling time of 3-5 days

I also mentioned in that post that it seemed as though the doubling time for each country was increasing over time. This seems to me to represent additional simple precautions starting to be taken - such as improved hygiene and social distancing (short of a lockdown).

4-5 days is probably the best that can be achieved by these methods. Many countries have put these in place but none have been able to slow the spread of the virus sufficiently without taking additional actions.

Growth of virus with partial lockdown has doubling time >4 days

Different countries have enacted different strictness levels in their lockdowns. These haven't been in place long enough to know exactly what's happening but they have had an effect and in Italy's case especially this has started to strongly increase doubling times.

Growth with virus under control has halving time as low as 2-5 days

We have 2 examples of countries which have had significant outbreaks and brought them under control – China and South Korea. In both cases the doubling time starts climbing and keeps going until the active cases start to decrease. Under full control the halving time of active cases was 2-5 days.

We don’t currently have any countries with a large number of cases where the doubling time is >6 days and holds steady for a prolonged period. The possible exception is Iran, but I have less confidence in the data there due to the mismatch between the growth rates of confirmed cases and deaths, and in the last few days it looks like the growth rate is increasing again.

I suspect that having a sustained high doubling time is possible if R is just above 1, but so far either a country is not doing enough (doubling time of 2-5 days) or they are doing enough and the cases are about to start decreasing. If R0 is large to start with, it’s hard to find that perfect amount of intervention which takes R to 1 so that the number of new cases stays manageable. Possibly as China and South Korea loosen their restrictions they are starting to find that point.
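For intuition on why the gap between "not doing enough" and "cases about to decrease" is so narrow, a standard textbook approximation (my sketch, not the post's; it assumes one generation of spread per serial interval, which I take to be ~5 days) relates R to doubling time as T ≈ g·ln 2 / ln R:

```python
import math

def doubling_time_from_R(R, serial_interval=5.0):
    """Approximate doubling time in days for reproduction number R,
    assuming one generation of spread per serial interval.
    Negative results are halving times; R = 1 means no change."""
    if R == 1:
        return float('inf')
    return serial_interval * math.log(2) / math.log(R)

# R barely above 1 gives very long doubling times, so a sustained high
# doubling time means holding R in a narrow band just above 1
for R in (3.0, 1.5, 1.1, 0.7):
    print(R, round(doubling_time_from_R(R), 1))
```

With these assumptions, R = 3 doubles in ~3 days, R = 1.1 in ~36 days, and R = 0.7 halves in ~10 days, which matches the narrow-band intuition above.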

Country summaries

The above is based on looking at the performances of various countries as described below.

China

China successfully applied a quarantine in Wuhan which reduced a rapidly growing epidemic to a handful of new cases per day. This quarantine was very strict compared to other countries on this list, and the halving time was 2-5 days. Other, less strict quarantines are likely to shrink case numbers more slowly.

More recently (11th March), the restrictions in Wuhan were eased to allow citizens to go back to work. Since 18th March the virus has started growing again, so far averaging a doubling time of 7 days or so. So far they are performing the dance successfully.

This seems to me the most likely next stage for Western countries. Exactly what growth rate the virus will be allowed to reach before it needs to be suppressed again is unclear. China at the moment should have plenty of tests and protective equipment, so whatever they achieve is likely to be fairly close to the best possible scenario. Successful contact tracing could keep the virus paused indefinitely without a full lockdown.

Italy

Doubling times have been increasing as the government has implemented additional control measures.

Lombardy (main outbreak in Italy) was locked down fairly early on. This increased the doubling time to 3-4 days. Later lockdowns which eventually covered the entire country increased this further such that the number of new cases per day appears to be levelling off. Most recently the Lombardy lockdown was tightened to decrease spread rate further. I don’t think it will be long before the number of live cases starts to decrease.

USA

The growth rate in the USA shows the least evidence of slowing down. The growth rate in deaths is lower, so there may be something confounding the data on confirmed cases, such as increasing coverage of testing.

Many states, including the main centres, appear to have implemented lockdowns in recent days, so these should start having an effect shortly. Some counties in the Bay Area implemented a lockdown earlier, but Johns Hopkins has started aggregating by state and any effects haven’t shown up in the California figures yet.

Spain, Germany, France, Switzerland, UK

These have all followed fairly similar paths. Schools closed at between 2k and 5k cases (Switzerland at ~1k). Lockdowns came at between 5k and 10k cases.

Some countries seem very keen to say there is no lockdown (e.g. Germany, Switzerland) and their actions are correspondingly less strict. However they do entail a large curtailment of freedoms even if they are less strictly enforced.

France seems to have been most strict with their measures in enacting fines for violators although I don’t know how effective these are.

If I borrow VipulNaik's taxonomy, all of them are somewhere between level 2 and level 3 lockdown.

The UK and Switzerland are a bit behind the other countries in terms of cases, but their actions have similarly lagged, so they haven't capitalised on their head start.

Iran

Iran is a strange one in that their total number of new cases per day has been fairly flat for a couple of weeks. I’m not sure whether they have achieved a perfect R=1 or whether their data is a bit funny – their deaths data don’t really reflect their confirmed cases, although more recently the two have started to match more closely.

South Korea

South Korea is my favourite country for dealing with COVID-19.

They essentially had it under control until the infamous patient 31 infected a large number of people in her church, who then went on to infect more until there were >5k cases associated with the church (more than half of the total number of cases in the country).

Despite the government never imposing any particularly strict orders, the entire city of Daegu was deserted within a couple of days of the patient 31 outbreak being confirmed. Between that and the intensive contact tracing and testing program the outbreak was quickly brought under control so that the hospitals weren’t overrun and the fatality rate was kept down to 1.3%.

In 2015 South Korea experienced the second worst outbreak of MERS. There were 168 confirmed infections and 38 people died. This article has an interesting summary of how the lessons from that outbreak fed into the COVID-19 response.

The halving time during the reduction phase was 3-5 days.

Arguably they have now entered their dance phase, as R≈1.

Japan

I haven’t included Japan on the graphs as nothing much has happened there which is pretty amazing. Japan is probably what South Korea would look like if there had been no patient 31.

They have taken precautions similar to those Western countries took before the latter implemented stricter lockdowns. However, they have managed to contain every cluster of cases before any got out of control.

There has been a lot of talk about Japan not doing enough testing and that their numbers are artificially suppressed. My prior for this is pretty low – this seems like an unlikely thing for a government to do, especially as it wouldn’t take long before the truth came out as the death toll rose.

As for evidence against that hypothesis, Japan have done a lot of testing compared to the number of cases - 19 out of 20 tests come back negative (even if the absolute numbers are low). If they are deliberately suppressing their numbers then they’re doing a really good job at testing the wrong people.

Japan’s cases started to get serious in mid-Feb. I think it’s clear that they managed to avoid any out-of-control outbreaks until at least the beginning of March, otherwise there would be so many cases by now that it would be obvious. If they could keep the virus under control for those 3 weeks, then they can probably keep it under control for the couple of weeks since. If a cluster does get out of control in Japan, then I expect it to go the same way as South Korea.

Of course the Japanese government could be lying about everything but again if they are I would expect better evidence from citizens/journalists by now.

Summary

Doubling times

Unmitigated spread: 2-3 days

Improved hygiene and basic social distancing: 3-5 days

Lockdown with work allowed: 5+ days (possibly cases decreasing)

Halving times (single sample each)

Full lockdown: 2-5 days

Flexible lockdown + Epic contact tracing: 3-5 days

Discuss

### Some examples of technology timelines

March 27, 2020 - 21:13
Published on March 27, 2020 6:13 PM GMT

"Weather forecasts were comedy material. Now they're just the way things are. I can't figure out when the change happened."

Introduction

I briefly look into the timelines of several technologies, with the hope of becoming marginally less confused about potential A(G)I developments.

Having such examples makes some scenarios crisper:

• (AGI will be like perpetual motion machines: proven to be impossible)
• AGI will be like flying cars: possible in principle but never in practice.
• AI will overall be like contact lenses, weather forecasts or OCR; developed in public, and constantly getting better, until one day they have already become extremely good.
• AI will overall be like speech recognition or machine translation: Constant improvement for a long time (like contact lenses, weather forecasts or OCR), except that the difference between 55% and 75% is just different varieties of comedy material, and the difference between 75% and 95% is between "not being usable" and "being everywhere", and the change feels extremely sudden.
• AGI will be like the iPhone: Developed in secret, and so much better than previous capabilities that it will blow people away. Or like nuclear bombs: Developed in secret, and so much better than previous capabilities that it will blow cities away.
• (AGI development will be like some of the above, but faster)
• (AGI development will take an altogether different trajectory)

I did not use any particular method to come up with the technologies to look at, but I did classify them afterwards as:

• After the event horizon: Already in mass production or distribution.
• In the event horizon: Technologies which are seeing some progress right now, but which aren't mainstream; they may only exist as toys for the very rich.
• Before the event horizon: Mentioned in stories by Jules Verne, Heinlein, Asimov, etc., but not yet existing. Small demonstrations might exist in laboratory settings or in the DIY community.

I then give a summary table, and some highlights on weather forecasting and nuclear proliferation which I found particularly interesting.

After the event horizon

Weather forecasting

A brief summary: Antiquity's weather forecasting methods lacked predictive power in the short term, though people could understand e.g., seasonal rains. Advances were made in understanding meteorological phenomena, like rainbows, and in estimating the size of the stratosphere, but not much in the way of prediction. With the advent of the telegraph, information about storms could be relayed faster than the storm itself travelled, and this was the beginning of actual prediction, which was at first very spotty. Weather records had existed before, but from then on they seem to have been taken more seriously, and weather satellites were a great boon to weather forecasting. With time, mathematical advances and Moore's law meant that weather forecasting services just became better and better.

Overall, I get the impression of a primitive scholarship existing before the advent of the telegraph, followed by continuous improvement in weather forecasting afterwards, such that it's really difficult to know when weather forecasting "became good".

A brief timeline:

• 1835: Invention of the electric telegraph: information could travel faster than the wind.
• 1854: The French astronomer Leverrier showed that a storm in the Black Sea could be followed across Europe and would have been predictable if the telegraph had been used. A service of storm forecasts was established a year later by the Paris Observatory.
• 1861: The first daily weather forecasts were published in The Times; weather forecasting was spurred by the British Navy after losing ships and men to the Great storm of 1859.
• 1900s: Advances in numerical methods. Better models of cyclones (1919) and hurricanes (1921); global warming from carbon emissions is first postulated as a hypothesis (1938); hurricanes are caught on radar (1944); first correct tornado prediction (1948). Weather observing stations now abound.
• 1944: D-Day (Allied invasion of Normandy during WW2) postponed because of a weather forecast.
• 1950s and onwards: The US starts a weather satellite program. Chaos theory is discovered (1961). Numerical methods are implemented in computers, but these are not yet fast enough. From there on, further theoretical advances and Moore's law make automated weather forecasting, slowly, more possible, as well as advance warning of hurricanes and other great storms.


Nuclear proliferation

What happens when governments ban or restrict certain kinds of technological development? What happens when a certain kind of technological development is banned or restricted in one country but not in other countries where technological development sees heavy investment? Source

A brief summary: Of the 34 countries which have attempted to obtain a nuclear bomb, 9 have succeeded, whereas 25 have failed, for a base rate of ~25%. Of those 25, there is uncertainty as to the history of 6 (Iran, Algeria, Burma/Myanmar, Saudi Arabia, Canada and Spain). Excluding those 6, those who succeeded did so, on average, in 14 years, with a standard deviation of 13 years. Those who failed took 20 years to do so, with a standard deviation of 11 years. Three summary tables are available here.

Caveats apply. South Africa willingly gave up its nuclear weapons, and many other countries have judged a nuclear program not to be in their interests after all. Further, many other countries have access to nuclear bombs or might be able to take possession of them in the event of war, per NATO's nuclear sharing agreement. Additionally, other countries, such as Japan, Germany or South Korea, would have the industrial and technological capabilities to produce nuclear weapons if they were willing to do so.

Overall, although the discoveries of, e.g., the Curies were of course foundational, I get the impression that the sequence from the discovery of the possibility of nuclear fission, through "let's reveal this huge infohazard to our politicians", to the beginning of a nuclear programme in the US was relatively rapid.

I also get the impression of a very large standard deviation in wanting nuclear weapons badly enough. For example, Israel or North Korea actually got nuclear weapons, whereas Switzerland or Yugoslavia were vaguely gesturing in the direction of attempting it; the Wikipedia page on Switzerland and weapons of mass destruction is almost comical in the amount of bureaucratic steps and committees and reports, recommendations and statements, which never get anywhere.

A brief timeline:

• 1898: Pierre and Marie Curie commence the study of radioactivity
• ...
• 1934: Leó Szilárd patents the idea of a nuclear chain reaction via neutrons.
• 1938: First fission reaction.
• 1939: The idea of using fission as a weapon is floating around.
• 1942: The Manhattan Project starts.
• 1945: First nuclear bomb.
• 1952: First hydrogen bomb.

Cryptocurrencies

A brief summary: Wikipedia's history of cryptocurrencies doesn't mention any cypherpunk influences, and mentions a 1983 ecash antecedent. I have the recollection of PayPal trying to solve the double spending problem but failing, but couldn't find a source. In any case, by 2009 the double spending problem, which had previously been considered pretty much unsolvable, was solved by Bitcoin. Ethereum (2013) and Ethereum 2.0 (2021?) were improvements, but haven't seen widespread adoption yet. Other alt-coins seem basically irrelevant to me.

A brief timeline:

• 1983: The idea exists within the cypherpunk community, but the double spending problem can't be solved, and the world wide web doesn't exist yet.
• 2009: Bitcoin is released; the double-spending problem is solved.
• 2015: Ethereum is released
• 2020-2021: Ethereum 2.0 is scheduled to be released.

Mobile phones

A brief summary: "Mobile telephony" began as telephones installed on trains, and then in cars. Because the idea of mobile phones was interesting, people kept working on it, and we went from a 1 kg beast to the first iPhone in a little over three decades. Before that, there was a brief period where Nokia phones all looked the same.

A brief timeline:

• 1918: Telephones in trains
• 1946: Telephones in cars.
• 1950s-1960s: Interesting advances are made in the Soviet empire, but these don't get anywhere. Bell Labs is working on the topic.
• 1973: First handheld phone. Weight: 1kg
• 1980s: The lithium-ion battery, invented by John Goodenough.
• 1983: "the DynaTAC 8000X mobile phone launched on the first US 1G network by Ameritech. It cost $100M to develop, and took over a decade to reach the market. The phone had a talk time of just thirty minutes and took ten hours to charge. Consumer demand was strong despite the battery life, weight, and low talk time, and waiting lists were in the thousands"
• 1989: Motorola MicroTAC. A phone that doesn't weigh a ton.
• 1992/1996/1998: Nokia 1011, the first brick recognizable as a Nokia phone, mass produced. / Nokia 8110, the mobile phone used in The Matrix. / Nokia 7110, a mobile with a browser. In the following years, mobile phones become lighter, and features are added one by one: GPS, MP3 music, more storage, calendars, radio, bluetooth, colour screens, cameras, really unaesthetic touchscreens, better batteries, minigames.
• 2007: The iPhone is released. Nokia will die, but doesn't know it yet. Motorola/Sony had some sleek designs, but the iPhone seems to have been better than any other competitor along many dimensions.
• Onwards: Moore's law continues and phones look sleeker after the iPhone. Cameras and internet get better, and so on.

Optical Character Recognition

A brief summary: I had thought that OCR had only gotten good enough in the 2010s, but apparently it was already pretty good by the mid-1970s, and initially used as an aid for blind people rather than for convenience. Recognizing different fonts was a problem, until it wasn't anymore.

A brief timeline:

• 1870: First steps. The first OCR inventions were generally conceived as aids for the blind.
• 1931: "Israeli physicist and inventor Emanuel Goldberg is granted a patent for his "Statistical machine" (US Patent 1838389), which was later acquired by IBM. It was described as capable of reading characters and converting them into standard telegraph code". Like many inventions of the time, it is unclear to me how good it was.
• 1962: "Using the Optacon, Candy graduated from Stanford and received a PhD". "It opens up a whole new world to blind people. They aren't restricted anymore to reading material set in braille."
• 1974: American inventor Ray Kurzweil creates Kurzweil Computer Products Inc., which develops the first omni-font OCR software, able to recognize text printed in virtually any font. Kurzweil goes on the Today Show and sells a machine to Stevie Wonder.
• 1980s: Passport scanner, price tag scanner.
• 2008: Adobe Acrobat starts including OCR for any PDF document.
• 2011: Google Ngram. Charts historic word frequency.

Speech recognition

A brief summary: Initial progress was slow; a system could understand 16 spoken words (1962), then a thousand words (1976). Hidden Markov models (1980s) proved to be an important theoretical advance, and commercial systems soon existed (1990s), but they were different degrees of clunky. Cortana, Siri, Echo and Google Voice Search seem to be the first systems in which voice recognition was actually good.

A brief timeline:

• 1877: Thomas Edison's phonograph.
• 1952: A team at Bell Labs designs the Audrey, a machine capable of understanding spoken digits.
• 1962: Progress is slow. IBM demonstrates the Shoebox, a machine that can understand up to 16 spoken words in English, at the 1962 Seattle World's Fair.
• 1976: DARPA funds five years of speech recognition research with the goal of ending up with a machine capable of understanding a minimum of 1,000 words. The program led to the creation of the Harpy by Carnegie Mellon, a machine capable of understanding 1,011 words.
• 1980s: Hidden Markov models!
• 1990: Dragon Dictate. Used discrete speech, where the user must pause between speaking each word.
• 1996: IBM launches the MedSpeak, the first commercial product capable of recognizing continuous speech.
• 2006: The National Security Agency begins using speech recognition to isolate keywords when analyzing recorded conversations.
• 2008: Google announces Voice Search.
• 2011: Apple announces Siri.
• 2014: Microsoft announces Cortana / Amazon announces the Echo.

Machine translation

"The Georgetown experiment in 1954 involved fully automatic translation of more than sixty Russian sentences into English. The authors claimed that within three or five years, machine translation would be a solved problem".

A brief summary: Since the 50s, various statistical, almost hand-programmed methods were tried, and they didn't get qualitatively better, though they could eventually translate e.g., weather forecasts. Though in my youth Google Translate was sometimes a laughing stock, with the move towards neural network methods it has become exceedingly good in the last few years. Socially, although translators are in denial and still maintain that their four years of education are necessary, and maintain a monopoly through bureaucratic certification, in my experience it's easy to automate the localization of a commercial product and then just edit the output.

A brief timeline:

• 1924: Concept is proposed.
• 1954: "The Georgetown-IBM experiment, held at the IBM head office in New York City in the United States, offers the first public demonstration of machine translation. The system itself, however, is no more than what today would be called a "toy" system, having just 250 words and translating just 49 carefully selected Russian sentences into English — mainly in the field of chemistry. Nevertheless, it encouraged the view that machine translation was imminent — and in particular stimulates the financing of the research, not just in the US but worldwide". Hype ends in 1966.
• 1977: The METEO System, developed at the Université de Montréal, is installed in Canada to translate weather forecasts from English to French, and is translating close to 80,000 words per day or 30 million words per year.
• 1997: babelfish.altavista.com is launched.
• 2006: Google Translate is launched; it uses statistical methods.
• Onwards: Google Translate and other tools get better. They are helped by the gigantic corpora produced by the European Union and the United Nations, which release all of their documents in their official languages. This is a boon for modern machine learning methods, which require large datasets for training; in 2016, Google Translate moves to an engine based on neural networks.

Link to Wikipedia

Contact lenses

A brief summary: Leonardo da Vinci speculated on the idea in 1508, and some heroic glassmakers experimented on their own eyes. I'll take 1888 to be the starting date, when German ophthalmologist Adolf Fick made the first successful glass lenses: he experimented with fitting the lenses initially on rabbits, then on himself, and lastly on a small group of volunteers, publishing his work, "Contactbrille", in the March 1888 edition of Archiv für Augenheilkunde. Large and unwieldy, Fick's lens could be worn only for a couple of hours at a time. August Müller of Kiel, Germany, corrected his own severe myopia with a more convenient blown-glass scleral contact lens of his own manufacture in 1888. Many distinct improvements followed during the next century, each making contact lenses less terrible. Today, they're pretty good. Again, there wasn't any point at which contact lenses had "become good".

A brief timeline:

• 1508: Codex of the Eye. Leonardo da Vinci introduces a related idea.
• 1801: Thomas Young heroically experiments with wax lenses.
• 1888: First glass lenses. A German ophthalmologist experiments on rabbits, then on himself, then on volunteers.
• 1939: First plastic contact lens. Polymethyl methacrylate (PMMA) and other plastics will be used from now onwards.
• 1949: Corneal lenses; much smaller than the original scleral lenses, as they sat only on the cornea rather than across all of the visible ocular surface, and could be worn up to 16 hours a day.
• 1964: Lyndon Johnson becomes the first President in the history of the United States to appear in public wearing contact lenses.
• 1970s-1980s: Oxygen-permeable lens materials are developed, chiefly by chemist Norman Gaylord. Soft lenses are also developed.
• Onwards: More developments are made, so that lenses become more comfortable and more disposable. From this point onwards, it's difficult for me to differentiate between lens companies wanting to hype up their marginal improvements and new significant discoveries being made.

Link to Wikipedia

In the event horizon

Technologies which are seeing some progress right now, but which aren't mainstream:

• Virtual Reality. First (nondigital) prototype in 1962, first digital prototype in 1968. Moderate though limited success as of yet, though VR images made a deep impression on me and may have contributed to me becoming vegetarian.
• Text generation. ELIZA in 1968. 2019's GPT-2 is impressive but not perfect.
• Solar sails. First mention in 1864-1899. Current experiments, like the IKAROS in 2010, provide proof of concept. Anders Sandberg has a bunch of cool ideas about space colonization, some of which include solar sails.
• Autonomous vehicles. "In 1925, Houdina Radio Control demonstrated the radio-controlled "American Wonder" on New York City streets, traveling up Broadway and down Fifth Avenue through the thick of a traffic jam. The American Wonder was a 1926 Chandler that was equipped with a transmitting antenna on the tonneau and was operated by a person in another car that followed it and sent out radio impulses which were caught by the transmitting antenna." It is my impression that self-driving cars are very decent right now, if perhaps not mainstream.
• Space tourism. Precursors were the Russians' INTERKOSMOS, starting as early as 1978. Dennis Tito was the first space tourist in 2001, for $20 million; SpaceX hopes to make it cheaper in the coming decades.
• Prediction systems. Systems exist such as PredictionBook, Betfair, Metaculus, PredictIt, Augur, Foretold, The Good Judgement Project, etc., but they haven't found widespread adoption yet.

Before the event horizon

For the sake of completeness, I came up with some technologies which are "beyond the event horizon". These are technologies mentioned in stories by Jules Verne, Heinlein, Asimov, etc., but which haven't been implemented yet. Small demonstrations might exist in laboratory settings or in the DIY community.

• Flying cars. Though prototypes can be found since at least 1926, fuel efficiency, bureaucratic regulations, noise concerns, etc., have prevented flying cars from being successful. Companies exist which will sell you a flying car if you have $100,000 and a pilot's license, though.
• Teleportation. "Perhaps the earliest science fiction story to depict human beings achieving the ability of teleportation in science fiction is Edward Page Mitchell's 1877 story The Man Without a Body, which details the efforts of a scientist who discovers a method to disassemble a cat's atoms, transmit them over a telegraph wire, and then reassemble them. When he tries this on himself...". "An actual teleportation of matter has never been realized by modern science (which is based entirely on mechanistic methods). It is questionable if it can ever be achieved, because any transfer of matter from one point to another without traversing the physical space between them violates Newton's laws, a cornerstone of physics." No success as of yet.
• Space colonisation. "The first known work on space colonization was The Brick Moon, a work of fiction published in 1869 by Edward Everett Hale, about an inhabited artificial satellite". No rendezvous at L5; no success as of yet.
• Automated data analysis. Unclear origins. As of 2020, IBM's Watson is clunky.
Other technological processes I only briefly looked into
• Alternating current.
• Mass tourism.
• Video conferences.
• Algorithms which play checkers/chess/go.
• Sound quality / Image quality / Video quality / Film quality.
• Improvements in vehicles.
• Air conditioning.

It also seems to me that you could get a prior on the time it takes to develop a technology using the following Wikipedia pages: Category: History of technology; Category: List of timelines; Category: Technology timelines.

Previous work:
Conclusions

There are no sweeping conclusions to be had. What follows is a summary table; below it are some quotes on the history of weather forecasting and nuclear proliferation which I found particularly interesting.

Highlights from the history of weather forecasting.

There is some evidence that Democritus predicted changes in the weather, and that he used this ability to convince people that he could predict other future events.

Several years after Aristotle's book (350 BC), his pupil Theophrastus puts together a book on weather forecasting called The Book of Signs. Various indicators such as solar and lunar halos formed by high clouds are presented as ways to forecast the weather. The combined works of Aristotle and Theophrastus have such authority they become the main influence in the study of clouds, weather and weather forecasting for nearly 2000 years.

During his second voyage Christopher Columbus experiences a tropical cyclone in the Atlantic Ocean, which leads to the first written European account of a hurricane.

It was not until the invention of the electric telegraph in 1835 that the modern age of weather forecasting began. Before that, the fastest that distant weather reports could travel was around 160 kilometres per day (100 mi/d), but was more typically 60–120 kilometres per day (40–75 mi/day) (whether by land or by sea). By the late 1840s, the telegraph allowed reports of weather conditions from a wide area to be received almost instantaneously, allowing forecasts to be made from knowledge of weather conditions further upwind.

But calculated by hand on threadbare data, the forecasts were often awry. In April 1862 the newspapers reported: "Admiral FitzRoy's weather prophecies in the Times have been creating considerable amusement during these recent April days, as a set off to the drenchings we've had to endure. April has been playing with him roughly, to show that she at least can flout the calculations of science, whatever the other months might do."

It was not until the 20th century that advances in the understanding of atmospheric physics led to the foundation of modern numerical weather prediction. In 1922, English scientist Lewis Fry Richardson published "Weather Prediction By Numerical Process", after finding notes and derivations he worked on as an ambulance driver in World War I. He described therein how small terms in the prognostic fluid dynamics equations governing atmospheric flow could be neglected, and a finite differencing scheme in time and space could be devised, to allow numerical prediction solutions to be found.

Richardson envisioned a large auditorium of thousands of people performing the calculations and passing them to others. However, the sheer number of calculations required was too large to be completed without the use of computers, and the size of the grid and time steps led to unrealistic results in deepening systems. It was later found, through numerical analysis, that this was due to numerical instability. The first computerised weather forecast was performed by a team composed of American meteorologists Jule Charney, Philip Thompson, Larry Gates, and Norwegian meteorologist Ragnar Fjørtoft, applied mathematician John von Neumann, and ENIAC programmer Klara Dan von Neumann. Practical use of numerical weather prediction began in 1955, spurred by the development of programmable electronic computers.

Sadly, Richardson’s forecast factory never came to pass. But twenty-eight years later, in 1950, the first modern electrical computer, eniac, made use of his methods and generated a weather forecast. The Richardsonian method proved to be remarkably accurate. The only downside: the twenty-four-hour forecast took about twenty-four hours to produce. The math, even when aided by an electronic brain, could only just keep pace with the weather.

Highlights from the history of nuclear proliferation.

The webpage of the Institute for Science and International Security has this compendium on the history of nuclear capabilities. Although the organization as such remains influential, the resource above is annoying to navigate, and may contain factual inaccuracies (e.g., regarding Spain's nuclear ambitions both during the dictatorship and during the democratic period). Improving that online resource might be a small project for an aspiring EA.

On the origins:

In 1934, Tohoku University professor Hikosaka Tadayoshi's "atomic physics theory" was released. Hikosaka pointed out the huge energy contained by nuclei and the possibility that both nuclear power generation and weapons could be created.[3] In December 1938, the German chemists Otto Hahn and Fritz Strassmann sent a manuscript to Naturwissenschaften reporting that they had detected the element barium after bombarding uranium with neutrons;[4] simultaneously, they communicated these results to Lise Meitner. Meitner, and her nephew Otto Robert Frisch, correctly interpreted these results as being nuclear fission[5] and Frisch confirmed this experimentally on 13 January 1939.[6] Physicists around the world immediately realized that chain reactions could be produced and notified their governments of the possibility of developing nuclear weapons.

Modern country capabilities:

Like other countries of its size and wealth, Germany has the skills and resources to create its own nuclear weapons quite quickly if desired.

South Korea has the raw materials and equipment to produce a nuclear weapon but has not opted to make one.

Today, Japan's nuclear energy infrastructure makes it capable of constructing nuclear weapons at will. The de-militarization of Japan and the protection of the United States' nuclear umbrella have led to a strong policy of non-weaponization of nuclear technology, but in the face of nuclear weapons testing by North Korea, some politicians and former military officials in Japan are calling for a reversal of this policy.

In general, the projects of Switzerland and Yugoslavia were characterized by bureaucratic fuckaroundism, and the Wikipedia page on their nuclear efforts is actually amusing because of the abundance of committees which never got anywhere:

The secret Study Commission for the Possible Acquisition of Own Nuclear Arms was instituted by Chief of General Staff Louis de Montmollin with a meeting on 29 March 1957.[8][12][13] The aim of the commission was to give the Swiss Federal Council an orientation towards "the possibility of the acquisition of nuclear arms in Switzerland."[12] The recommendations of the commission were ultimately favorable.[8]

The authors complained that the weapons effort had been hindered by Yugoslav bureaucracy and the concealment from the scientific leadership of key information regarding the organization of research efforts. It offers a number of specific illustrations about how this policy of concealment, “immeasurably sharper than that of any country, except in the Soviet bloc,” might hamper the timely purchase of 10 tons of heavy water from Norway.

On the other hand, the heterogeneity is worrying. Some countries (Israel) displayed ruthlessness and competency, whereas others got dragged down by bureaucracy. I suspect that this heterogeneity would also be the case for new technologies.

Cooperation between small countries:

Some scholars believe the Romanian military nuclear program to have started in 1984, however, others have found evidence that the Romanian leadership may have been pursuing nuclear hedger status earlier than this, in 1967 (see, for example, the statements made toward Israel, paired with the minutes of the conversation between the Romanian and North Korean dictators, where Ceauşescu said, "if we wish to build an atomic bomb, we should collaborate in this area as well")

The same report revealed that Brazil's military regime secretly exported eight tons of uranium to Iraq in 1981

In accord with three presidential decrees of 1960, 1962 and 1963, Argentina supplied about 90 tons of unsafeguarded yellowcake (uranium oxide) to Israel to fuel the Dimona reactor, reportedly creating the fissile material for Israel's first nuclear weapons.[11]

Syria was accused of pursuing a military nuclear program with a reported nuclear facility in a desert Syrian region of Deir ez-Zor. The reactor's components were believed to have been designed and manufactured in North Korea, with the reactor's striking similarity in shape and size to the North Korean Yongbyon Nuclear Scientific Research Center. The nuclear reactor was still under construction.

Veto powers:

These five states are known to have detonated a nuclear explosive before 1 January 1967 and are thus nuclear weapons states under the Treaty on the Non-Proliferation of Nuclear Weapons. They also happen to be the UN Security Council's permanent members with veto power on UNSC resolutions.

China:

China[6] became the first nation to propose and pledge NFU policy when it first gained nuclear capabilities in 1964, stating "not to be the first to use nuclear weapons at any time or under any circumstances." During the Cold War, China decided to keep the size of its nuclear arsenal small, rather than compete in an international arms race with the United States and the Soviet Union.[7][8] China has repeatedly reaffirmed its no-first-use policy in recent years, doing so in 2005, 2008, 2009 and again in 2011. China has also consistently called on the United States to adopt a no-first-use policy, to reach an NFU agreement bilaterally with China, and to conclude an NFU agreement among the five nuclear weapon states. The United States has repeatedly refused these calls.

Discuss

### Russian x-risks newsletter March 2020 – coronavirus update

27 марта, 2020 - 21:06
Published on March 27, 2020 6:06 PM GMT

Russia recently attracted attention as it reported fewer CV cases than other countries. Probably the best explanation is under-reporting and under-testing. Other explanations are shorter life expectancy and higher home temperature.

The Russian government is taking some measures against coronavirus, but they seem neither radical nor effective. Testing is not widespread, though some private companies are allowed to test now, as of March 26. Before that, three consecutive positive tests, including one at Novosibirsk's Vector lab (the military bio lab where there was an explosion in fall 2019, and which preserves a collection of deadly viruses), were needed to officially establish Covid-19. This created a testing backlog. Also, mostly people returning from abroad were tested, not locals, which created an illusion of no local transmission.

It might have been expected that Putin would act harshly to stop the pandemic, maybe even putting lions in the streets as in one meme, but he has acted very mildly, not in Modi style.

From 28 March a national holiday has been declared, which is not a lockdown (a lockdown would require waiving rents, and the state might have to pay for that, which it doesn't want to). Many people are departing for warmer areas, and the spring is surprisingly early and warm. Cafes and parks will be closed from tomorrow. From my observations, only from today can I see a significant number of people in masks on the streets (around 10-20 per cent). People above 65 are not allowed to go out, and some were fined, but many are still out. Schools are online.

Several facts about CV in Russia:

• As of 27 March: Russia has 1036 cases, 703 of them in Moscow.
• Ages of recent CV deaths in Russia: 45m (abroad, in Cuba, diabetic), 70, 73, 79f (died from a clot), 88, 56f (+cancer).
• The reaction to the pandemic was delayed because they didn't want to spoil the important vote on the new constitution scheduled for April 22. It has now been postponed.
• The official number is growing, but not as a quick exponential like in NY. It may grow more after testing is expanded.
• Male median life expectancy in Russia (not median age) is 67, vs 80 in Italy. This may result in fewer deaths.
• Russia typically has warmer temperatures inside houses: 25 C is the norm. This may kill the virus indoors. Italy often doesn't have heating in winter.

Russia decided to enter the oil war with Saudi Arabia and US shale, which was an obvious mistake. Global demand fell by 15 Mb/day because of CV, and Russia's export of 5 Mb/day could be completely replaced by exports from Arab countries. Oil prices could even be negative short term. It also means no investment in oil production worldwide, which could result in oil shortages 3-5 years from now, as existing wells decline 5-10 per cent a year. It is not clear how CV will affect renewables.

Russia has limited the export of grain and some oats to keep internal prices lower. Russia exports 10-20 per cent of the world's wheat. There is a long tradition of hoarding in Russia, and many people have bought a 1-3 month supply of food, including flour.

Discuss

### When to assume neural networks can solve a problem

27 марта, 2020 - 20:52
Published on March 27, 2020 5:52 PM GMT

A pragmatic guide

Let’s begin with a gentle introduction into the field of AI risk - possibly unrelated to the broader topic, but it’s what motivated me to write about the matter; it’s also a worthwhile perspective to start the discussion from. I hope for this article to be part musing on what we should assume machine learning can do and why we’d make those assumptions, part reference guide for “when not to be amazed that a neural network can do something”.

The Various hues of AI risk

I’ve often had a bone to pick against “AI risk” or, as I’ve referred to it, “AI alarmism”. When evaluating AI risk, there are multiple views on the location of the threat and the perceived warning signs.

1. The Bostromian position

I would call one of these viewpoints the “Bostromian position”, which seems to be mainly promoted by MIRI, philosophers like Nick Bostrom, and forums such as the AI Alignment Forum.

It’s hard to summarize without apparently straw-man arguments, e.g. “AIXI + Moore’s law means that all-powerful superhuman intelligence is dangerous, inevitable and close.” That’s partly because I’ve never seen a consistent top-to-bottom reasoning for it. Its proponents always seem to start by assuming things which I wouldn’t hold as given about the ease of data collection, the cost of computing power, and the usefulness of intelligence.

I’ve tried to argue against this position, the summary of my view can probably be found in “Artificial general intelligence is here, and it's useless”. Whilst - for the reasons mentioned there - I don’t see it as particularly stable, I think it’s not fundamentally flawed; I could see myself arguing pro or con.

2. The Standard Position

Advocated by people ranging from my friends, to politicians, to respectable academics, to CEOs of large tech companies. It is perhaps best summarized in Stuart Russell’s book Human Compatible: Artificial Intelligence and the Problem of Control.

This viewpoint is mainly based around real-world use cases for AI (where AI can be understood as “machine learning”). People adopting this perspective are not wrong in being worried, but rather in being worried about the wrong thing.

It’s wrong to be upset by Facebook or Youtube using an algorithm to control and understand user preferences, and to blame it on “AI”, rather than on people not being educated enough to use Tor, install a tracking blocker, use uBlock Origin, and not center their entire lives around conspiracy videos in their YouTube feed or anti-vaccination Facebook groups.

It’s wrong to be alarmed by Amazon making people impulse-buy via a better understanding of their preferences, and thus getting them into inescapable debt, rather than by the legality of providing unethically leveraged debt so easily.

It’s wrong to fuss about automated trading being able to cause sudden large dips in the market, rather than about having markets so unstable and so focused on short-term trading as to make this the starting point of the whole argument.

It’s wrong to worry about NLP technology being used to implement preventive policing measures, rather than about governments being allowed to steal their citizens’ data, to request backdoors into devices and to use preventive policing to begin with.

It’s wrong to worry about the Chinese Communist Party using facial recognition and tracking technology to limit civil rights; worry instead about the CCP ruling via a brutal dictatorship that implements such measures without anybody doing anything against it.

But I digress, though I ought to give a full rebuttal of this position at some point.

3. The misinformed position

A viewpoint distinct from the previous two. It stems from misunderstanding what machine learning systems can already do, and basically consists of panicking over “new developments” which have actually existed for decades.

This view is especially worth fighting against, since it’s based on misinformation. Whereas with categories 1 and 2 I can see valid arguments arising for regulating or better understanding machine learning systems (or AI systems in general), people in the third category just don’t understand what’s going on, so they are prone to adopt any view out of sheer fear or need of belonging, without truly understanding the matter.

Until recently I thought this kind of task was better left to PBS. In hindsight, though, I’ve seen otherwise smart individuals amazed that “AI” can solve a problem which anyone who has actually worked with machine learning could have told you is obviously solvable, and has been for ages.

Furthermore, I think addressing this viewpoint is relevant, as it’s actually challenging and interesting. The question of “What are the problems we should assume can be solved with machine learning?”, or even narrower and more focused on current developments “What are the problems we should assume a neural network should be able to solve?”, is one I haven’t seen addressed much.

There are theories like PAC learning and AIXI which at a glance seem to revolve around this question, as it pertains to machine learning in general, but which, if actually tried in practice, won’t yield any meaningful answer.

How people misunderstand what neural networks can do

Let’s look at the general pattern of fear generated by misunderstanding the machine learning capabilities we’ve had for decades.

• Show a smart but relatively uninformed person - a philosophy PhD or an older street-smart businessman - a deep learning party trick.
• Give them the most convoluted and scary explanation of why it works. E.g. explain DeepDream using incomplete neurological data about the visual cortex and human image processing, rather than just saying it’s the output of a complex edge detector overfit to recognize dog faces.
• Wait for them to write an article about it in VOX & co

An example that originally motivated this article is Scott Alexander’s post about being amazed that GPT-2 is able to learn how to play chess, poorly.

It seems to imply that GPT-2 playing chess well enough not to lose very badly against a mediocre opponent (the author) is impressive and surprising.

Actually, the fact that a 1,500,000,000-parameter model designed for sequential inputs can be trained to kind of play chess is rather unimpressive, to say the least. I would have been astonished if GPT-2 were unable to play chess. Fully connected models a hundred times smaller (https://github.com/pbaer/neural-chess) could do that more than 2 years ago.

The successful training of GPT-2 is not a feat: if a problem like chess has already been solved using various machine learning models, we can assume it can be done with a generic neural network architecture (e.g. any given FC net, or an FC net with a few attention layers) hundreds or thousands of times larger in terms of parameters.

When to assume a neural network can solve a problem

In the GPT-2 example, transformers (i.e. the BERT-like models inspired by the “Attention is all you need” paper’s proposed design) are pretty generic as far as NN architectures go. Not as generic as a fully connected net, arguably; they seem to perform more efficiently (in terms of training time and model size) on many tasks, and they are much better on most sequential input tasks.

So when should we assume that such generic NN architectures can solve a problem?

The answer might ease uninformed awe, and might be relevant to actual problems – the kind for which “machine learning” might have been considered, but with doubt as to whether it’s worth bothering.

Playing chess decently is also a problem already solved. It can be done using small (compared to GPT-2) decision trees and a few very simple heuristics (see for example: https://github.com/AdnanZahid/Chess-AI-TDD). If a much smaller model can learn how to play “decently”, we should assume that a fairly generic, exponentially larger neural network can do the same.

The rule of thumb is:

1. A neural network can almost certainly solve a problem if another ML algorithm has already succeeded.

Given a problem that can be solved by an existing ML technique, we can assume that a somewhat generic neural network, if allowed to be significantly larger, can also solve it.

This assumption doesn’t always hold because:

• a) Depending on the architecture, a neural network could easily be unable to optimize a given problem. Playing chess might be impossible for a conv network with large windows and step size, even if it’s very big.
• b) Certain ML techniques have a lot of built-in heuristics that might be hard to learn for a neural network. The existing ML technique mustn’t have any critical heuristics built into it, or at least you have to be able to include the same heuristics into your neural network model.

As we are focusing mainly on generalizable neural network architectures (e.g. a fully connected net, which is what most people think of initially when they hear “neural network”), point a) is pretty irrelevant.

Given that most heuristics are applied equally well to any model, even for something like chess, and that size can sometimes be enough for the network to be able to just learn the heuristic, this rule basically holds almost every time.

I can’t really think of a counterexample here… Maybe some specific types of numeric projections?

This is a rather boring first rule, yet worth stating as a starting point to build up from.
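As a toy illustration of this first rule (a sketch of mine, not from the original post): XOR is trivially solved by a tiny decision tree, and a small generic fully connected net - written here from scratch in numpy - learns it too.

```python
import numpy as np

# XOR: a problem a two-level decision tree solves exactly,
# learned here by a generic 2-8-1 fully connected net.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X):
    h = np.tanh(X @ W1 + b1)        # hidden layer
    return h, sigmoid(h @ W2 + b2)  # output in (0, 1)

_, out = forward(X)
initial_loss = float(np.mean((out - y) ** 2))

lr = 0.5
for _ in range(5000):
    h, out = forward(X)
    d_out = (out - y) * out * (1 - out)   # MSE grad through sigmoid
    d_h = (d_out @ W2.T) * (1 - h ** 2)   # grad through tanh
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

_, out = forward(X)
final_loss = float(np.mean((out - y) ** 2))
```

The hidden width, learning rate, and iteration count are arbitrary choices; the point is only that the generic net needs no chess- or XOR-specific structure to fit a problem another, simpler model already solves.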

2. A neural network can almost certainly solve a problem very similar to ones already solved

Let’s say you have a model for predicting the risk of a given creditor based on a few parameters, e.g. current balance, previous credit record, age, driver license status, criminal record, yearly income, length of employment, {various information about current economic climate}, marital status, number of children, porn websites visited in the last 60 days.

Let’s say this model “solves” your problem, i.e. it predicts risk better than 80% of your human analysts.

But GDPR rolls along and you can no longer legally spy on some of your customers’ internet history by buying that data. You need to build a new model for those customers.

Your inputs are now truncated: the customer’s online porn history is no longer available (or at least no longer legally usable).

Is it safe to assume you can still build a reasonable model to solve this problem?

The answer is almost certainly “yes”: given our knowledge of the world, we can safely assume someone’s porn browsing history is not as relevant to their credit rating as some of those other parameters.

Another example: assume you know someone else is using a model, but their data is slightly different from yours.

You know of a US-based snake-focused pet shop that uses previous purchases to recommend products, and they’ve told you it’s done quite well for their bottom line. You are a UK-based parrot-focused pet shop. Can you trust their model, or a similar one trained on your data, to solve your problem?

Again, the right answer is probably “yes”, because the data is similar enough. That’s why building a product recommendation algorithm was a hot topic 20 years ago, but nowadays everyone and their mom can just get a WordPress plugin for it and get close to Amazon’s level.

Or, to get more serious, let’s say you have a given algorithm for detecting breast cancer that - if trained on 100,000 images with follow-up checks to confirm the true diagnoses - performs better than an average radiologist.

Can you assume that, given the ability to make it larger, you can build a model to detect cancer in other types of soft tissue, also better than a radiologist?

Once again, the answer is yes. The argument here is longer, because we aren’t so certain, mainly because of the lack of data. I’ve spent more or less a whole article arguing that the answer would still be yes.

In NLP the exact same neural network architectures seem to be decently good at doing translation or text generation in any language, as long as it belongs to the Indo European family and there is a significant corpus of data for it (i.e. equivalent to that used for training the extant models for English).

Modern NLP techniques seem to be able to tackle all language families, and they are doing so with less and less data. To some extent, however, the similarity of the data and the amount of training examples are tightly linked to the ability of a model in quickly generalizing for many languages.

Or looking at image recognition and object detection/boxing models, the main bottleneck consists in large amounts of well-labeled data, not the contents of the image. Edge cases exist, but generally all types of objects and images can be recognized and classified if enough examples are fed into an architecture originally designed for a different image task (e.g. a conv residual network designed for imagenet).

Moreover, given a network trained on imagenet, we can keep the initial weights and biases (essentially what the network “has learned”) instead of starting from scratch, and it will be able to “learn” on different datasets much faster from that starting point.
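The weight-reuse idea can be sketched in a few lines of numpy. This is a toy stand-in of mine, not an actual ImageNet model: the “pretrained” feature extractor is a random frozen layer, and only a new head is fit on the small target dataset.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for a feature extractor learned on a large source task;
# a real case would load, e.g., ImageNet-trained convolutional weights.
W_pretrained = rng.normal(0, 0.5, (400, 64))

def features(X):
    # frozen "pretrained" layer: W_pretrained is never updated
    return np.tanh(X @ W_pretrained)

# small target dataset: 100 flattened 20x20 "images", binary labels
X_new = rng.normal(0, 1, (100, 400))
y_new = (X_new[:, 0] > 0).astype(float)

# fit only the new head, here by least squares on the frozen features
F = features(X_new)
head, *_ = np.linalg.lstsq(F, y_new, rcond=None)
train_acc = float(np.mean((F @ head > 0.5) == y_new))
```

Only the 64 head weights are learned from the 100 target examples; everything upstream is reused, which is why fine-tuning from pretrained weights needs far less data than training from scratch.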

3. A neural network can solve problems that a human can solve with small-sized datapoints and little to no context

Let’s say we have 20x20px black and white images of two objects never seen before; they are “obviously different”, but not known to us. It’s reasonable to assume that, given a bunch of training examples, humans would be reasonably good at distinguishing the two.

It is also reasonable to assume, given a bunch of examples (let’s say 100), that almost any neural network of millions of parameters would ace this problem like a human.

You can visualize this in terms of amounts of information to learn. In this case, we have 400 pixels of 256 possible values each, so it’s reasonable to assume every possible pattern could be accounted for with a few million parameters in our equation.

But what “small datapoints” means here is the crux of this definition.

In short, “small” is a function of:

• The size of your model. The bigger a model, the more complex the patterns it can learn, the bigger your possible inputs/outputs.
• The granularity of the answer (output). E.g. 1,000 classes vs 10 classes, or an integer range from 0 to 1,000 vs one from 0 to 100,000. In this case, 2.
• The size of the input. In this case 400, since we have a 20x20 image.
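To make the model-size factor concrete, here is a rough parameter count for a fully connected net on the 20x20 example above (the hidden-layer widths are arbitrary choices of mine, not from the post):

```python
def fc_param_count(layer_sizes):
    # weights (a*b) plus biases (b) for each pair of consecutive layers
    return sum(a * b + b for a, b in zip(layer_sizes, layer_sizes[1:]))

# 400 inputs (20x20 image), two hidden layers, 2 output classes
n_params = fc_param_count([400, 2048, 2048, 2])
print(n_params)  # 5021698, i.e. a few million parameters
```

So even quite generous hidden layers for a 400-pixel, 2-class problem stay in the “few million parameters” range mentioned above.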

Take a classic image classification task like MNIST. Although a few minor improvements have been made, the state-of-the-art for MNIST hasn’t progressed much. The last 8 years have yielded an improvement from ~98.5% to ~99.4%, both of which are well within the usual “human error range”.

Compare that to something much bigger in terms of input and output size, like ImageNet, where the last 8 years have seen a jump from 50% to almost 90%.

Indeed, even with pre-CNN techniques, MNIST is basically solvable.

But even having defined “small” as a function of the above, we don’t have the formula for the actual function. I think that is much harder, but we can come up with a “cheap” answer that works for most cases - indeed, it’s all we need:

• A given task can be considered small when other tasks of equal or larger input and output size have already been solved via machine learning with more than one architecture on a single GPU

This might sound like a silly heuristic, but it holds surprisingly well for most “easy” machine learning problems. For instance, the reason many NLP tasks are now more advanced than most “video” tasks is size, despite the tremendous progress on images in terms of network architecture (which are much closer to the realm of video). The input & output size for meaningful tasks on videos is much larger; on the other hand, even though NLP is in a completely different domain, it’s much closer size-wise to image processing.

Then, what does “little to no context” mean?

This is a harder one, but we can rely on examples with “large” and “small” amounts of context.

• Predicting the stock market likely requires a large amount of context. One has to be able to dig deeper into the companies to invest in; check on market fundamentals, recent earnings calls, the C-suite’s history; understand the company’s product; maybe get some information from its employees and customers; if possible, get insider info about upcoming sales and mergers, etc.

You can try to predict the stock market based purely on indicators about the stock market, but this is not the way most humans are solving the problem.

• On the other hand, predicting the yield of a given printing machine based on temperature and humidity in the environment could be solved via context, at least to some extent. An engineer working on the machine might know that certain components behave differently in certain conditions. In practice, however, an engineer would basically let the printer run, change the conditions, look at the yield, then come up with an equation. So given that data, a machine learning algorithm can also probably come up with an equally good solution, or even a better one.

In that sense, an ML algorithm would likely produce results similar to a mathematician in solving the equation, since the context would be basically non-existent for the human.

There are certainly some limits. Unless we test our machine at 4,000 °C, the algorithm has no way of knowing that the yield will be 0 because the machine will melt; an engineer might suspect that.
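The printer example can be sketched as a plain regression. The data and the “true” yield formula below are invented by me for illustration; the point is that, given the same measurements the engineer would collect, a simple fitted model recovers the same kind of relationship.

```python
import numpy as np

rng = np.random.default_rng(42)

# made-up measurements: temperature (C), humidity (%), observed yield (%)
temp = rng.uniform(15, 35, 200)
humidity = rng.uniform(20, 80, 200)
yield_pct = (95 - 0.03 * (temp - 25) ** 2 - 0.05 * humidity
             + rng.normal(0, 0.5, 200))

# ordinary least squares with intercept, linear in temp and humidity
X = np.column_stack([np.ones_like(temp), temp, humidity])
coef, *_ = np.linalg.lstsq(X, yield_pct, rcond=None)

pred = X @ coef
r2 = float(1 - np.var(yield_pct - pred) / np.var(yield_pct))
# Note: nothing here tells the model the yield is 0 at 4,000 C;
# it only knows the 15-35 C range it was shown.
```

The fit recovers the negative effect of humidity from the data alone, with no knowledge of the machine’s components, which is exactly the “little context needed” situation described above.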

So, I can formulate this 3rd principle as:

A generic neural network can probably solve a problem if:

• A human can solve it
• Tasks with similarly sized outputs and inputs have already been solved by an equally sized network
• Most of the relevant contextual data a human would have are included in the input data of our algorithm.

Feel free to change my mind (with examples).

However, this still requires evaluating against human performance. But a lot of applications of machine learning are interesting precisely because they can solve problems humans can’t. Thus, I think we can go even deeper.

4. A neural network might solve a problem when we are reasonably sure it’s deterministic, we provide any relevant context as part of the input data, and the data is reasonably small

Here I’ll come back to one of my favorite examples - protein folding. One of the few problems in science where data is readily available, where interpretation and meaning are not confounded by large amounts of theoretical baggage, and where the size of a datapoint is small enough based on our previous definition. You can boil down the problem to:

• Around 2,000 input features (amino acids in the primary structure, i.e. the sequence), though this means our domain will only cover 99.x% of proteins rather than literally all of them.
• Circa 18,000 corresponding output features (the atom positions in the tertiary structure, i.e. the shape, that need to be predicted).

This is just one framing. As with most NLP problems, “size” is very subjective here: we could easily argue that one-hot encoding is required for this type of input, in which case the input size suddenly becomes 40,000 (there are 20 proteinogenic amino acids that can be encoded by DNA), 42,000 (if you care about selenoproteins), or 44,000 (if you also care about niche amino acids that don’t appear in eukaryotes).

It could also be argued that the input & output sizes are much smaller, since most proteins are much shorter than this, and we can mask & discard most of the inputs & outputs in most cases.
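
To make the encoding arithmetic concrete, here is a minimal sketch of the one-hot scheme discussed above; the 2,000-residue cap and all names are my own, purely illustrative, not taken from any real protein-folding codebase.

```python
# Illustrative sketch: one-hot encoding a fixed-length amino acid sequence,
# showing how 2,000 residues x 20 amino acids = 40,000 input features.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 DNA-encoded amino acids
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
MAX_LEN = 2000  # covers "99.x%" of proteins, per the discussion above

def one_hot_encode(sequence, max_len=MAX_LEN):
    """Encode a protein sequence as a flat, zero-padded one-hot vector."""
    vec = [0.0] * (max_len * len(AMINO_ACIDS))
    for pos, aa in enumerate(sequence[:max_len]):
        vec[pos * len(AMINO_ACIDS) + AA_INDEX[aa]] = 1.0
    return vec

features = one_hot_encode("MKTAYIAKQR")  # a made-up 10-residue peptide
```

Swapping in a 21- or 22-letter alphabet (selenocysteine and friends) is exactly what bumps the feature count to 42,000 or 44,000.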

Still, there are plenty of tasks that go from, e.g., a 255x255-pixel image to another 255x255-pixel image (style transfer, resolution enhancement, contour mapping, etc.). So based on this I’d posit that the protein folding data is reasonably small, and has been for the last few years.

Indeed, resolution enhancement via neural networks and protein folding via neural networks came about at around the same time (with very similar architectures, mind you). But I digress; I’m mistaking a correlation for the causal process that supposedly generated it. Then again, that’s the basis of most self-styled “science” nowadays, so what is one sin against the scientific method added to the pile?

Based on my own fooling around with the problem, it seems that even a very simple model, simpler than something like VGG, can learn something ”meaningful” about protein folding. It can make guesses better than random, and often enough come within 1% of the actual position of the atoms, if given enough (135 million) parameters and half a day of training on an RTX 2080. I can’t be sure about the exact accuracy, since the exact evaluation criterion here is apparently pretty hard to find and/or understand and/or implement for people who aren’t domain experts… or I am just daft, which is also a strong possibility.

To my knowledge the first widely successful protein folding network, AlphaFold, whilst using some domain-specific heuristics, did most of the heavy lifting with a residual CNN, an architecture designed for categorizing images, a task about as unrelated to protein folding as one can imagine.

That is not to say any architecture could have tackled this problem as well. It rather means we needn’t build a whole new technique to approach this type of problem. It’s the kind of problem a neural network can solve, even though it might require a bit of looking around for the exact network that can do it.

The other important thing here is that the problem seems to be deterministic. Namely:

• a) We know peptides can be folded into proteins in the kind of inert environment that most of our models assume, since that’s what we’ve always observed them to do.
• b) We know that amino acids are sufficient to fully describe a peptide.
• c) Since we assume the environment is always the same, and that the folding process itself doesn’t much alter it, the problem is not a function of the environment (note: this obviously only holds for in-vitro folding; in-vivo the problem becomes much harder).

The issue arises when thinking about b). That is to say: we know the universe can deterministically fold peptides, and we know amino acids are enough to accurately describe a peptide. However, the universe doesn’t work with “amino acids”; it works with trillions of interactions between much smaller particles.

So while the problem is deterministic and self-contained, there’s no guarantee that learning to fold proteins doesn’t entail learning a complete model of particle physics that is able to break down each amino acid into smaller functional components. A few million parameters wouldn’t be enough for that task.

This is what makes this 4th and most generic definition the hardest to apply.

Some other examples here are things like predictive maintenance, where machine learning models are being actively used to tackle problems humans can’t, at any rate not without mathematical models. For these types of problems, there are strong reasons to assume, based on the existing data, that the problems are partially (mostly?) deterministic.

There are simpler examples here, but I can’t think of any that, at the time of their inception, didn’t already fall into the previous 3 categories. At least, none that aren’t considered reinforcement learning.

The vast majority of examples fall within reinforcement learning, where an impressive range of problems can be solved once one is able to simulate them.

People can find optimal aerodynamic shapes, design weird antennas to provide more efficient reception/coverage, and beat video games like Dota 2 and StarCraft, which are exponentially more complex (in terms of degrees of freedom) than chess or Go.

The problem with RL is that designing the actual simulation is often much more complicated than using it to find a meaningful answer. RL is fun to do but doesn’t often yield useful results. However, edge cases do exist where designing the simulation does seem to be easier than extracting inferences out of it. Besides that, the more simulations advance based on our understanding of efficiently simulating physics (in itself helped by ML), the more such problems will become ripe for the picking.

In conclusion

I’ve attempted to provide a few simple heuristics for answering the question “When should we expect that a neural network can solve a problem?” That is to say: to which problems should you apply neural networks, in practice, right now? What problems should leave you “unimpressed” when solved by a neural network? For which problems should our default hypothesis include their solvability, given enough architecture searching and current GPU capabilities?

I think this is fairly useful, not only for staying unimpressed when someone shows us a party trick and tells us it’s AGI, but also for helping us quickly classify a problem as “likely solvable via ML” or “unlikely to be solved by ML”.

To recap, neural networks can probably solve your problem:

1. [Almost certain] If other ML models have already solved the problem.
2. [Very high probability] If a similar problem has already been solved by an ML algorithm, and the differences between that and your problem don’t seem significant.
3. [High probability] If the inputs & outputs are small enough to be comparable in size to those of other working ML models AND if we know a human can solve the problem with little context besides the inputs and outputs.
4. [Reasonable probability] If the inputs & outputs are small enough to be comparable in size to those of other working ML models AND we have a high certainty about the deterministic nature of the problem (that is to say, about the inputs being sufficient to infer the outputs).
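
As a toy summary, the four heuristics above could be expressed as a crude triage function. The function, its argument names, and its labels are purely illustrative, not a real feasibility predictor.

```python
# Illustrative only: the recap above written out as a decision procedure.
def solvability_estimate(already_solved_by_ml,
                         similar_problem_solved,
                         io_size_comparable,
                         human_needs_little_context,
                         likely_deterministic):
    """Return a rough confidence label per the four heuristics."""
    if already_solved_by_ml:
        return "almost certain"            # heuristic 1
    if similar_problem_solved:
        return "very high probability"     # heuristic 2
    if io_size_comparable and human_needs_little_context:
        return "high probability"          # heuristic 3
    if io_size_comparable and likely_deterministic:
        return "reasonable probability"    # heuristic 4
    return "no strong prior either way"
```

Note the deliberate ordering: each rule only fires if the stronger rules above it failed, mirroring the descending confidence of the list.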

I am not certain about any of these rules, but this comes back to the problem of being able to say something meaningful. PAC learning can give us almost perfect certainty and is mathematically valid, but it breaks down beyond simple classification problems.

Rules of this kind don’t provide an exact degree of certainty, and they are derived from empirical observation. However, I think they can actually be applied to real-world problems.

Indeed, these are to some extent the rules I do apply to real world problems, when a customer or friend asks me if a given problem is “doable”. These seem to be pretty close to the rules I’ve noticed other people using when thinking about what problems can be tackled.

So I’m hoping that this could serve as an actual practical guide for newcomers to the field, or for people that don’t want to get too involved in ML itself, but have some datasets they want to work on.

Discuss

### Explanatory power of C elegans neural models

March 27, 2020 - 20:48
Published on March 27, 2020 5:30 PM GMT

Background

@jkaufman wrote a nice post on the progress of C elegans modeling effort in 2011, with a followup post in 2014. Here we are, almost a decade later, and I still have the same question that I would have asked even in 2011 if I had dived into the project.

Question

Considering that we do not know the signs of the connectome weights, what do you think about the strength of explanations that try to explain biological phenomena (e.g., locomotion) in terms of neural dynamics (e.g., "from this model, we propose that there's push-pull circuitry because we see that our model shows these fluctuating membrane potentials")?

My preliminary take: Weak and highly uncertain.

Cook et al., 2019 said "modelling the functions of the nervous system at the abstracted level of the connectivity network cannot be seriously undertaken if a considerable number of nodes or edges (for example, edges that represent electrical couplings) are missing." I would think that misdirected edges might just be as harmful as missing edges.
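
As a toy illustration of why a sign error could matter (the model and all numbers here are invented for this example, not an actual C elegans circuit): in a two-neuron linear rate model, flipping the sign of a single edge turns mutual excitation into a damped push-pull oscillation, a qualitatively different dynamic regime.

```python
# Toy 2-neuron linear rate model, Euler integration. All weights are
# invented for illustration; this is not a C elegans simulation.
def simulate(w12, w21, steps=20, dt=0.1):
    """Integrate dx1 = -x1 + w12*x2, dx2 = -x2 + w21*x1 from (1, 0)."""
    x1, x2 = 1.0, 0.0
    for _ in range(steps):
        dx1 = -x1 + w12 * x2
        dx2 = -x2 + w21 * x1
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
    return x1, x2

mutual_excitation = simulate(+1.0, +1.0)  # settles near (0.5, 0.5)
push_pull = simulate(-1.0, +1.0)          # one sign flipped: decaying oscillation
```

The same connectivity graph, with one sign changed, supports an entirely different explanation of the observed dynamics, which is the worry about inferring circuitry from a sign-ambiguous connectome.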

Why doubt the preliminary take?

I have seen commentaries in papers like Kunert et al., 2014 and Kim et al., 2019 that analyze neuronal dynamics from the model (e.g., oscillatory dynamics were observed in the membrane potentials of neuron set A when neuron set B was stimulated). I don't doubt the methodological contributions of these papers, but I wonder if it's worthwhile to produce a study that is purely an analysis of the dynamics of such a model, given the uncertainties in the connectome.

Discuss

### Overhead of MessageChannel

March 27, 2020 - 20:10
Published on March 27, 2020 5:10 PM GMT

The Channel Messaging API gives you a way to pass information asynchronously between different origins on the web, such as cross-origin iframes. Traditionally you would use window.postMessage(), but a MessageChannel has the advantages of being clearer, only requiring origin validation at setup, and handling delegation better. Reading this 2016 post, however, I was worried that it might have enough overhead that postMessage made more sense in performance-sensitive contexts. Benchmark time!

I made a test page which alternates between loading trycontra.com/test/messageChannelResponse.html and trycontra.com/test/postMessageResponse.html. I'm using two different domains so that I can test cross-origin performance. First it loads messageChannelResponse in an iframe, waits for it to load, and then times how long it takes to pass in a MessagePort and receive a response on it. Then it performs the same basic operation with postMessageResponse using plain postMessage. This is a worst case for MessageChannel, since I stand up the whole channel only to use it a single time.

I ran 1,000 iterations, interleaved, on Chrome (80.0), Firefox (74.0), and Safari (13.1). All of these were on my 2017 MacBook Pro. Here are all the runs, sorted from fastest to slowest [1]:

In Chrome, MessageChannel is a bit faster, while in Firefox and especially Safari it's slower. Additionally, Firefox runs it faster than Safari, which runs it faster than Chrome. Safari also has more consistent performance than Chrome, with a flatter distribution. Firefox is in between, with a flat distribution for postMessage but a few slow calls in the tail for MessageChannel. If you're writing something where ~7ms/call in Safari is an issue then it might be worth sticking to postMessage; otherwise MessageChannel seems fine.

[1] I find this kind of "sideways CDF" a really useful visualization tool, and possibly the chart I make most often.
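
For what it's worth, the data prep behind such a "sideways CDF" is just sorting; the latency samples below are made up purely to show the shape.

```python
# Made-up latency samples. Sorting them and plotting sorted values (y)
# against rank fraction (x) gives the "sideways CDF" described above.
latencies_ms = [1.2, 0.9, 7.1, 1.1, 1.0, 6.8, 1.3, 0.95]
sorted_lat = sorted(latencies_ms)
n = len(sorted_lat)
ranks = [i / (n - 1) for i in range(n)]  # 0.0 = fastest call, 1.0 = slowest
```

Feeding ranks as x and sorted_lat as y to any plotting library reproduces the chart; slow tail calls show up as a spike at the right edge.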

Discuss

### March 25: Daily Coronavirus Updates

March 27, 2020 - 07:32
Published on March 27, 2020 4:32 AM GMT

Aggregators

Clusters C19 literature for easier filtering and digestion

Given a target article, identifies similar articles

Economics

Balancing the cost in lives of C19 and shutdown

Plea to consider both the cost in lives of doing nothing and of gutting the economy, and spend more money to get things fixed faster

Disinfection and reuse of N95 masks (guidelines and data)

A report on disinfecting N95 disposable respirators. The second half of the document is a paper with fresh results: 70C air for 30 min (good), UV 254nm@8W for 30 min (good, but light must reach all surfaces), hot steam for 10 min (significant degradation after 5-10 cycles), bleach or alcohol (destroys filtration, do not use).

Work & Donate

Empathetic listening for frontline medics

Connects front line medics and carers with volunteer listeners

Discuss

### Charity to help people get US stimulus payments?

March 27, 2020 - 06:16
Published on March 27, 2020 3:16 AM GMT

It seems like once the big coronavirus bill passes, the stimulus payments might result in an interesting one-time opportunity for effective altruism. Apparently, anyone who didn't file a tax return last year would need to file a return to get the money. A charity that helps people file might be a highly leveraged opportunity for financial help? For example, if you can spend $60 to help someone get a $1200 check then that's pretty good leverage.

This seems similar to what mRelief does, but probably a better fit for tax preparers. Does anyone know anything?

(On the other hand, I suppose any tax preparer could do this for profit? Hmm.)

Discuss

### The four levels of social distancing, and when and why we might transition between them

March 27, 2020 - 06:07
Published on March 27, 2020 3:07 AM GMT

In this post, I describe four increasingly strict levels of social distancing that have been considered in response to the coronavirus situation. I then talk about how likely each is to succeed, how sustainable it is, and the relative economic and social cost. I then talk about the likely time periods that we'll transition between these levels. The purpose here is to simply help think systematically about trade-offs and transitions and identify points of agreement. I've tried to indicate confidence levels for various claims, but am probably overconfident about many specific things.

The four levels of social distancing

The levels

What exactly falls in a given level is subject to debate, but is not the focus here.

1. Level 1 (avoid large gatherings): Avoid sporting events, large lectures, conferences, political rallies, demonstrations, rave parties, cinema halls, cruise ships, prisons, and other environments that involve close contact with large numbers of people.

2. Level 2 (do all work and recreation remotely where feasible, and avoid crowding when meeting in person): Run schools, universities, and workplaces remotely, except those that need in-person interaction (for instance, retail storefronts and restaurants). Postpone recreational travel and substitute virtual entertainment for physical entertainment where feasible. Go to restaurants and cinema halls sparingly and when they are less crowded.

3. Level 3 (flexible lockdown, "stay home except for essential needs"): Whatever workplaces can be run remotely, run remotely. For others, shut them down unless they serve "essential needs". For instance, shut down nail salons, spas, gyms, etc. Only things like grocery stores, convenience stores, hardware stores, and takeout food are allowed. Restrict travel, both regional and across cities, to essential needs. Go out only for exercise and buying essentials.

4. Level 4 (strict curfew-style lockdown, with police enforcement): Similar to Level 3, but with stricter restrictions. Mass transit, both intra-city and inter-city, may be stopped for everybody except essential workers. Even takeout food may not be allowed. If you leave the house, you may be stopped by lockdown enforcers and may have to submit documentation or show an identity card to justify yourself.

Chart 15 in the hammer and the dance is somewhat similar in that it lays out various levels of social distancing. However, it's much more complicated. If you are familiar with the jargon of that post, levels 3 and 4 are probably the only levels that deserve the "hammer" moniker, whereas levels 1 and 2 can be the relatively relaxed portions of the "dance".

TL;DR: timelines

Note that the optimistic case for one transition need not coincide with the optimistic case for another (e.g., perhaps going down from level 3 to level 2 earlier increases the expected delay in going down from level 2 to level 1). The use of "summer" in the table below refers to the northern hemisphere summer.

Translations:

• Super-optimistic case ~ 84th percentile of distribution
• Reasonably optimistic case ~ 68th percentile of distribution
• Median case ~ 50th percentile of distribution
• Pessimistic but non-catastrophic case ~ 16th percentile of distribution
Transition Super-optimistic case Reasonably optimistic case Median case Pessimistic but non-catastrophic case Most of the world goes down from level 3 to level 2 mid-April mid-May mid-June mid-August Most of the world goes down from level 2 to level 1 end of June mid-August summer 2021 summer 2022 Most of the world goes down from level 1 to business-as-usual end of September summer 2021 summer 2022 summer 2023 More about each level Level 1 (avoid large gatherings)

Avoid sporting events, large lectures, conferences, political rallies, demonstrations, rave parties, cinema halls, cruise ships, prisons, and other environments that involve close contact with large numbers of people.

Despite the wide range of uncertainty around coronavirus, my guess is that Level 1 is the bare minimum necessary till coronavirus is fully under control (for instance, through a widely available vaccine that has been administered to a large proportion of the population, including the individuals who want to be part of large gatherings).

Expected duration that we need to maintain at least level 1, and that there will be a general social consensus that we need to do so: Somewhere between 6 months (super-optimistic) and 3 years (pessimistic).

My confidence in this range: Reasonably high (around 70%)

I expect this to not be too controversial with the LessWrong readership.

Predictions about businesses: Sports stadiums, cruise ships, and cinema halls are in for a very difficult time.

Level 2 (do all work and recreation remotely where feasible, and avoid crowding when meeting in person)

Run schools, universities, and workplaces remotely, except those that need in-person interaction (for instance, retail storefronts and restaurants). Postpone recreational travel and substitute virtual entertainment for physical entertainment where feasible. Go to restaurants and cinema halls sparingly and when they are less crowded.

In the last few weeks, large parts of the world, including many places in the United States, Europe, and South Asia, transitioned from level 1 to level 2, and some quickly sped to level 3 after staying in level 2 for less than two weeks. Places that have had strong systems of diagnosis, testing, and contact tracing have been able to stick to level 2 and not escalate to level 3 (South Korea, Hong Kong, Singapore, and Taiwan arguably fit this description). Other areas, where diagnosis, testing, and contact tracing were not in place early enough, did not have confidence that they could contain the epidemic at level 2 and escalated to level 3 or level 4.

I believe there's a good chance that the next few months of data will show that we need to maintain at least level 2 until coronavirus is mostly conquered (through a proven treatment or vaccine that has started being administered). My reason for believing that at least level 2 is needed: as far as I can make out, a lot of the transmission of coronavirus happened outside of contexts that can be avoided through level 1 of social distancing. If the world had followed level 1 consistently, the outbreak may not have reached pandemic proportions, but it would probably still be growing in numbers.

It is possible that the data, once collected and studied, will show that level 1 of social distancing, combined with some specific precautions gleaned from the data, is enough. However, even if ours is a world where the ground truth is that level 1 of social distancing is enough, I expect that rigorously demonstrating this will take at least a few months. And I expect that nothing short of a rigorous demonstration will lead to a general social consensus to go down to level 1.

Expected duration that we need to maintain at least level 2, and that there will be a general social consensus that we need to do so: Somewhere between 3 months (super-optimistic) and 2 years (pessimistic).

My confidence in this range: Reasonably high (around 70%)

I expect this to be a little more controversial with the LessWrong readership, but not too much.

Predictions about businesses: Distance education and remote work will thrive relative to their brick-and-mortar counterparts, and the tools that enable these have tremendous growth potential. Similarly, home entertainment will make big strides relative to meatspace entertainment.

Comparing level 1 and level 2 for the economy: Neither level 1 nor level 2 completely cripples the economy. With that said, level 1 affects only very specific industries, whereas level 2 has wide-ranging effects, creating both winner and losers.

Level 3 (flexible lockdown, "stay home except for essential needs")

Whatever workplaces can be run remotely, run remotely. For others, shut them down unless they serve "essential needs". For instance, shut down nail salons, spas, gyms, etc. Only things like grocery stores, convenience stores, hardware stores, and takeout food are allowed. Restrict travel, both regional and across cities, to essential needs. Go out only for exercise and buying essentials.

In the last few weeks, large parts of the world transitioned from level 1 to level 2 to level 3, or even directly from level 1 to level 3. Some East Asian countries were able to avoid going all the way to level 3, or needed to go there only for brief periods of time. Examples of regions that didn't need to go to level 3 (outside of very specific geographies or short time periods): South Korea, Singapore, Hong Kong, and Taiwan. A region that went to level 3 or 4 and is on the path to returning to level 2 is China's Hubei province.

Data in the coming weeks, particularly data on the mostly East Asian countries that have managed to either avoid escalating to, or de-escalated from level 3, will be crucial in figuring out how long we need to sustain level 3.

One question might be: why would the answer to "does it make sense to de-escalate from level 3 to level 2?" change over time? There are two mechanisms by which the answer might change:

• Preparation time: Staying in level 3 for a little while slows down the rate of infection a lot, which gives people time to ramp up production of medical equipment, and catch up with the backlog in testing and contact tracing work. Once that catch-up is complete, quarantining infected or at-risk individuals is good enough, and the rest of us can go down to level 2.

• Information value: Maybe it actually makes sense to stay in level 2 all along, and an omniscient being would know it. But, we don't know it yet. From a risk perspective, it makes sense for any part of the world that has crossed some threshold of infection to escalate to level 3. Later, when the data on new infection rates is in, it may turn out that level 2 was "good enough", and that even with level 2, the basic reproduction number (R0) was already much less than 1. But right now, we can't risk it. Note in particular that this argument benefits a lot from the fact that some parts of the world are only at level 2; the data from these regions over the next few weeks and months will be crucial to making the case that level 2 is sustainable.

Expected duration that we need to maintain at least level 3, and that there will be a general social consensus to do so: This is the part I am most uncertain about, and also the one that I feel has the most cross-region variation. Loosely, here is what I expect:

• I expect that it will take at least till the end of April for there to be enough data to make a confident case that level 2 is enough in equilibrium (i.e., the "information value" side of the argument).

• I expect that regions where coronavirus can be contained (i.e., there are either no cases, or all cases are caught through contact tracing and there are no unexpected cases) will not need to go to level 3 at all. But that regions that have needed to go to level 3 will take at least a month to return to level 2, and possibly two to three months, or even longer.

Concretely:

• I expect that already-overwhelmed regions such as New York City and Italy will stay in level 3 till at least the end of May, and possibly till July or August.

• However, I expect that other regions that are actually not overwhelmed, will be able to get out of level 3 around mid-June (super-optimistic: mid-April, pessimistic: August). I emphasize the "actually" there because there could be many regions that already have a lot of undiagnosed cases, and are already overwhelmed; it just isn't showing up in their statistics yet.

The time range of late April to mid-May is under the optimistic assumption that level 2 is actually sustainable (and that the experiences of China, South Korea, Hong Kong, Singapore, and Taiwan continue to show this) and that the preparation time is sufficient.

Predictions about businesses: Restaurants, nail salons, spas, gyms, etc. will see a huge hit for the next few months. Even after flexible lockdown ends, traffic to them will probably rebound slowly. Those that lack the cash to get through the few months may go bankrupt, or get sold.

Comparing level 2 and level 3 for the economy: Level 3 has a much higher impact on the economy than level 2. Not only does it hurt a bunch of retail storefront businesses, it also massively shifts consumer demand patterns, which require a lot of supply chain reconfiguration (for instance, moving food demand away from restaurants and toward groceries). The uncertainty of duration of level 3 further complicates matters; normally, supply chain reconfiguration takes more time. Thus, even after we de-escalate from level 3, the reconfigured supply chain may make it harder to go back to normal.

Level 4: strict curfew-style lockdown, with police enforcement

Level 4 is similar to Level 3, but with stricter restrictions. Mass transit, both intra-city and inter-city, may be stopped for everybody except essential workers. Even takeout food may not be allowed. If you leave the house, you may be stopped by lockdown enforcers and may have to submit documentation or show an identity card to justify yourself.

The lockdowns in Wuhan (China), India, and many European countries are at level 4; some European countries straddle the line between level 3 and level 4.

I think the case for level 4 as opposed to level 3 is unclear; however, this might be a hard matter to study. A justification offered for level 4 is low compliance with level 3. This may be more of an issue in some places than others, which makes cross-region comparisons harder. For instance, reading articles like this might lead one to argue that strict enforcement is necessary for social distancing to be successful in India, making the case for level 4 instead of level 3. However, articles like this suggest that the threat or fear of a strict lockdown may itself cause people to crowd more as they rush to stock up on food or return to their home town, whereas a commitment to a more flexible lockdown would lead to less panic and less crowding.

Other than compliance levels, I think there is very little to say in favor of level 4 instead of level 3. I believe that if level 3 is not sufficient to bring the basic reproduction number (R0) to less than 1, and to generally keep the pandemic under control, level 4 will not be enough either.

Discuss

### Alternative mask materials

March 27, 2020 - 04:30
Published on March 27, 2020 1:22 AM GMT

Posting for the first time -- please LMK if this is the wrong format for a question.

I like Elizabeth's idea that we are still in the "throw things against the wall and see what sticks" phase of coronavirus response, so even very messy and non-expert questions can be helpful.

So here's a question. Tl;dr version: what are possible mask alternatives that are more easily scalable than standard N95 masks and offer better coronavirus protection than surgical masks? If promising materials do exist, what are good ways to test them without endangering doctors? Long version below.

I have been reading about the shortage of PPE (personal protective equipment) for doctors: hospitals are planning to re-use masks and to substitute much less effective surgical masks for the more effective N95 masks (SSC compares effectiveness here: https://slatestarcodex.com/2020/03/23/face-masks-much-more-than-you-wanted-to-know/). I have friends who are doctors or work with doctors and are absolutely miserable about this, and this strikes me as one of the biggest short-term problems we'll be facing.

Many companies and individuals have donated their stock of N95 masks, but there simply doesn't seem to be enough to go around. To be safe a medical professional needs to put on a fresh N95 mask for each shift that might involve coronavirus exposure. For most doctors and nurses in affected wards, this corresponds to (bare minimum) two masks per day, which will most likely mean 2-4 million masks per day for the US alone in the coming months.
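
As a back-of-envelope check on the 2-4 million figure (the frontline-staff range here is my own assumption; only the two-masks-per-day minimum comes from the reasoning above):

```python
# Assumed range of US frontline staff needing fresh N95s daily.
# The 1-2 million staff range is my guess for illustration.
frontline_staff_low, frontline_staff_high = 1_000_000, 2_000_000
masks_per_person_per_day = 2  # bare minimum: one fresh mask per shift

masks_per_day_low = frontline_staff_low * masks_per_person_per_day
masks_per_day_high = frontline_staff_high * masks_per_person_per_day
# -> a demand of 2,000,000 to 4,000,000 masks per day
```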

From what I understand, this kind of production is nigh-impossible in the short term, because of the shortage of melt-blown fabric. From this NPR source

making a single machine line takes at least five to six months.

(Though the same source discusses Chinese factories that have gotten around this time limit.)

So a reasonable next question is to ask whether there are alternative materials/designs for masks that do not use melt-blown fabric or other difficult-to-scale materials and offer better protection against n-CoV exposure than flimsy fabric or surgical masks. A cursory literature search shows a bit of promise:

• This Nature article claims that adding a layer coated in dried salt to a mask significantly improves H1N1 flu virus filtration. The article has problems, and its *in vivo* mouse section strikes me as mostly garbage (the sample size is too small, and they test by incubating virus on the mask rather than filtering it through), but the approach seems promising.
• There is a plethora of different materials with filtration properties that are effective at removing virus aerosols. This handbook mentions that certain glass fiber materials in (non-wearable) air filters reduce virus levels by 9 orders of magnitude, and has this tantalizing, though unfortunately reference-less quote:
It is normally intended that very fine ('absolute') air cleaning filters should remove bacteria and viruses by direct filtration, so that air can be sterilized by such action. However, there is now a growing range of combination media where the fibres have been treated in some way with a range of anti-bacterial coatings, to provide an alternative (or supplementary) means of pathogen removal. These treatments may work by physical action (damaging the impinging cells) or chemical destruction on the pathogen particles, and may be 'permanent' or have a definite active life, after which the filter is discarded or retreated.

BTW, a curious fact from the same handbook, probably not very applicable: old filters from WW2 and Russian early cold-war filters achieved particle filtration approaching the virus level using asbestos filters. This is obviously not a good idea in modern times (old Soviet asbestos-based gas masks are weirdly available to buy on the internet, but keep in mind if an outer filter layer is damaged they will result in you breathing asbestos, which is probably worse than getting coronavirus). Still, this can be taken as a datapoint that other cheap materials (which are not asbestos) with decent virus filtration properties might exist.

The big problem I see with trying new materials is that there do not seem to be many good studies measuring the viral filtration qualities of different materials in general, and, worse, studies of n-CoV filtration specifically are non-existent and probably not realistic to run on the time scales involved.

So here are some specific questions that seem to me like they have a chance of being useful:

• What are the most promising and quickly scalable materials one could test as alternatives for mask filtration?
• What are fast-track ways of determining efficiency of different materials in filtering n-CoV particles? Are there any useful natural experiments that have happened or will happen, and ways to carefully gather data about them? Are there other quick preliminary ways to find promising candidates which do not involve endangering doctors?

Discuss

### TikTok Recommendation Algorithm Optimizations

27 марта, 2020 - 04:03
Published on March 27, 2020 1:03 AM GMT

Summary

The communities I am part of rely heavily on longform text for communication. The world seems to be moving away from text (and especially longform text).

One clear example is TikTok, which is the most downloaded social media application. It centers around sharing short videos. I’m interested in whether important concepts can be communicated via this medium.

As part of that, I researched more about the recommendation algorithm. This has led me to some success (e.g. a coronavirus video with 400,000+ views). Because I found it very hard to get useful information about TikTok when I was doing this research, and because I want the sort of person who would read this post to get wider visibility for their ideas, I am writing this summary.

Background

Most TikTok videos are viewed through the “for you page”, roughly analogous to the Twitter feed. The TikTok algorithm recommends videos for you to view on this page. Note that, unlike with Twitter or Instagram, a large fraction of the content comes from creators that the user does not follow.

The TikTok recommendation algorithm is proprietary and mostly confidential. We know only a few things through information that employees have leaked to the press.

The TikTok recommendation algorithm consists of two components: an automated component and a manual component.

Automated

When a user creates a video, TikTok analyzes the video to identify the target audience for the video. They claim to use a combination of discrete fields (namely the hashtags used by the author and the sound chosen), natural language processing (presumably analyzing any text which is overlaid in the video), and computer vision. For example, they might analyze your video to find that it contains images of a cat, has text like “meow” and the hashtag “#catsoftiktok”. They will use this information to identify an audience of people who like cat videos.

They then create a small sample audience of cat-video-lovers who get shown this video. If this audience likes the video, it will be shown to a larger audience. If that audience likes it, it will be shown to a still larger audience, etc.

Whether the audience likes the video is some function of how they engage with it: do they watch it the whole way through, like it, comment, share, etc. A common heuristic is that a video needs at least 10% of the viewers to like it in order to advance to the next stage.
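The staged rollout described above can be sketched as a short simulation. The 10% threshold comes from the heuristic just mentioned; the audience sizes (10 → 100 → 1,000 → 10,000) are invented for illustration, not confirmed TikTok numbers:

```python
import random

def simulate_rollout(like_prob, audiences=(10, 100, 1000, 10000),
                     threshold=0.10, rng=None):
    """Show the video to successively larger sample audiences, advancing
    to the next stage only if at least `threshold` of viewers like it.
    Returns the total views accumulated before the video stalls."""
    rng = rng or random.Random(0)
    views = 0
    for size in audiences:
        views += size
        likes = sum(rng.random() < like_prob for _ in range(size))
        if likes / size < threshold:
            break  # video stops spreading at this stage
    return views
```

Under this sketch, a video that ~5% of viewers like usually stalls at the first or second stage, while one that ~20% like usually reaches the largest audience.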

Manual

Reportedly, moderators manually review videos, once after the video has ~100 views, again after it has ~8,000 views, and a third time after it has ~20,000 views. At each stage, the moderator can set different levels of visibility, which determine how widely the video will be shown. These levels of visibility are not visible to the author.

Early leaks of the moderation guidelines showed that they included limiting the visibility of videos containing disabled and possibly also LGBT+ actors, as well as politically sensitive content like protests.

The Intercept recently obtained a more full leak of TikTok’s moderation guidelines. The guidelines mostly attempt to limit sexually “explicit” content (“explicit” in scare quotes because the guidelines include things like banning people for wearing bikinis when not swimming) and politically “controversial” content (notably content that makes China look bad).

Moderation Goals

Interestingly, moderators are also instructed to prohibit “ugly” people and environments (slums etc.). The motivation here seems to be that new users will bounce if they are presented with videos of ugly people/places.

On the other side of things, TikTok apparently hired contractors to steal content of “nice-looking” videos from Instagram and repost it on TikTok. The #BeachGirl hashtag was specifically mentioned as a source contractors should use.

My guess is that there are two separate moderation goals at play: one is politically motivated (e.g. limiting videos about Tiananmen Square), and the second is targeted towards increasing engagement (e.g. displaying videos from more attractive users). TikTok’s official party line seems to be that these guidelines are “locally” produced (implying that the political censorship only happens in China).

Implications

Statistical Modeling

It seems likely that the number of views any video can expect to receive should be modeled with 4 separate models: one which is applicable when the video has fewer than 100 views, one for the 100–8,000 regime, one for 8,000–20,000, and one for 20,000+. (Corresponding to the thresholds where manual review takes place.)

For simplicity, I will just assume there is a single model. The process of iteratively showing videos to larger audiences implies a distribution like: 1/2 chance of being shown to 10 people, 1/4 chance of 100, 1/8 chance of 1000, etc. More generally, this implies some distribution like

P(Views > n^k) = p^k

for some parameters n, p.

Noting that we can re-parameterize this as

P(Views > v) = v^(log_n p)

We can see that this implies a power law distribution. Indeed, Chen et al. 2019 found that views on the most popular videos were Zipf-distributed.
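The model above can be sanity-checked with a quick simulation (n = 10 and p = 1/2 are hypothetical parameters, matching the 1/2-chance-of-10-people example):

```python
import math
import random

def sample_views(n=10, p=0.5, rng=random):
    """Sample a final view count: the first audience has n viewers, and each
    round the video advances to an n-times-larger audience with probability p."""
    views = n
    while rng.random() < p:
        views *= n
    return views

rng = random.Random(42)
samples = [sample_views(rng=rng) for _ in range(100_000)]

# The survival function should follow the power law P(Views > v) = v^(log_n p),
# which for n=10, p=0.5 gives 0.5 at v=10, 0.25 at v=100, 0.125 at v=1000.
for v in (10, 100, 1000):
    empirical = sum(s > v for s in samples) / len(samples)
    predicted = v ** math.log(0.5, 10)
```

On a log-log plot the empirical survival function comes out as a straight line, which is the power-law signature the Chen et al. result points to.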

Empirical Evidence

Looking at my own videos, a simple power law distribution fits reasonably well. (These numbers are using the model P(Views ≥ 10^k) = 0.3k^(−2).)

Optimizations

Manual versus automated algorithm optimizations

My intuition is that most content is not remarkably changed by manual moderation. (As one intuition pump: TikTok wants to promote videos which are popular among their target audience, and the empirical engagement data from sample audiences is probably a better signal of that than the opinions of some random moderator.) This implies that it’s more important to focus on automated algorithm optimizations.

Brainstorm

This section is a brainstorm of ways one can optimize their video for wider spread distribution. They all seem like reasonable hypotheses to me based on the above information, but I have no real evidence to support them.

Also, it’s worth pointing out that TikTok has fascist moderation policies and optimizing for fascist moderation is maybe a bad idea.

1. Optimizing things the recommendation algorithm can easily calculate. Machine learning engineers routinely overuse data fields which are easy to measure, even if those data fields are not the most important. Therefore, it’s reasonable to assume that TikTok’s algorithm also prioritizes these.
1. Camera resolution. Probably better to use the backward facing camera on a phone, and use a phone with a better camera. Possibly the algorithm might penalize you for having too high of a resolution (e.g. by using a professional camera), since TikTok wants to maintain an “amateur” aesthetic.
2. Use trending sounds. TikTok can tell which music is more or less popular, and recommend videos based on that. Using sponsored sounds is even better than using ones which are trending organically.
3. Use trending hashtags. Same idea, although note that the sample audience will also be chosen based on your hashtag. (And it’s crucial to get engagement from the sample audiences.) So if you make a video about video games, using hashtags related to makeup is probably a bad idea, even if those makeup hashtags are trending, because your sample audience will be people who want to watch makeup videos (and therefore won’t engage with your videogame content).
4. Use trending keywords in text. Even though TikTok claims they do “natural language processing”, I’m skeptical it’s very advanced. Similar to search engine optimization, you probably want to explicitly list out key terms in either your caption or in the video itself.
5. Use good lighting. TikTok doesn’t want videos that are too dark or oversaturated, and I suspect they can at least partially identify this algorithmically.
2. Making your content more easily understandable for the algorithm.
1. Use contrasting backgrounds. Computer vision is still in its infancy, and algorithms would probably struggle to recognize an object in front of a complex background. Using single-color backdrops with bright lighting lets the algorithm more easily identify the content of your video. (Of course, the content in your video has to also be the sort of thing TikTok wants to recommend, otherwise them being able to identify the content of your video doesn’t help.)
3. (Pretend to) Be the person TikTok wants you to be.
1. Be physically attractive. See above information about how moderators filter out “ugly” people.
2. Don’t be bully-able. TikTok suppresses videos by creators who moderators think could be bullied. Be cautious sharing problems in your life or controversial opinions. Political opinions seem especially dangerous.
3. Purchase, use, and mention sponsored products. TikTok makes money through sponsors paying to promote videos containing their products. It seems quite likely that e.g. mentioning a sponsored hairspray will increase your views relative to mentioning a non-sponsored hairspray.
4. Live the lifestyle Gen Z wants to live. TikTok wants to highlight videos that showcase the lifestyle its target audience (primarily 13-16-year-olds) wants to live in order to attract and retain these users. This includes being successful (or at least participating) in desired careers like being a social media influencer or Twitch streamer. General wealth, beauty, and success are probably also good.
1. Lu and Lu 2019: “We found that 21 [of 28] interviewees use Douyin for following the perceived stylish and up-to-date lifestyle, because they consider using Douyin as “a fashionable lifestyle”. Further, 16 interviewees reported that they used Douyin to be able to talk with people around them with interesting and trending topics, because peer students or workers often talk about content on Douyin.”
5. Have upper-class and clean backdrops. Moderation guidelines include hiding videos that have broken pipes or cracks in the wall.
4. Timing. As with all social media platforms that prioritize recent content, it’s important to publish content when your audience is most active. (TikTok Pro accounts display a histogram about when your followers are active.)
5. Focus on quantity. Power law distributions have the counterintuitive property of having expected values that are significantly higher than the median value. The expected number of views on your next video is probably higher than you intuitively expect.
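The mean-versus-median gap can be made concrete with an exact calculation for a hypothetical staged-rollout distribution (10x audience growth, 1/2 advance probability, views capped at 10 million; all parameters invented for illustration):

```python
# P(views = n**k) = (1 - p) * p**(k - 1) for each stage below the cap;
# whatever probability mass remains sits at the capped view count.
p, n, cap_k = 0.5, 10, 7
probs = {n ** k: (1 - p) * p ** (k - 1) for k in range(1, cap_k)}
probs[n ** cap_k] = 1 - sum(probs.values())

mean = sum(v * q for v, q in probs.items())  # ~175,780 views
median = 10  # half of all videos never get past the first 10-person audience
```

Here the expected view count is over 17,000 times the median: your typical video gets 10 views, but your *average* video gets ~175,780, almost entirely because of the rare viral hits. Hence: quantity.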
Broader idea: steal ideas from China/Douyin

Douyin is a sister product to TikTok. Several trends on Douyin do not seem popular on TikTok yet. For example:

Interestingly, several young users reported that they like watching and sharing videos in which the creator is sharing good-looking profile images for them to adopt for their social accounts, fancy nicknames for online social accounts, and creative animated mobile phone wallpapers for individuals or couples (Figure 1). This shows a trend that the content of some short videos can be easily “materialized” and adopted in other social channels, and the videos penetrate into users’ real life. As noted by P28, a male student who was in a relationship: “My girlfriend really likes to adopt the profile images recommended by a content provider as her WeChat account. She says that those profile images are beautiful and can fit with her personality well. She also likes to adopt animated mobile phone wallpapers shared on Douyin. Sometimes those wallpapers are for a couple, which are very creative in that two phones can create a holistic story when put together, and she always urges me to adopt those wallpapers together with her. It is a way to show to her friends that we are in a sweet relationship. ”Lu and Lu 2019

Academic literature presumably lags trends by a significant amount, so if you have direct access to Chinese or other markets where you can copy trends more rapidly, that’s probably better.

TikTok’s recommendation process makes coordinated voting hard. Because the decision about promoting a video is based on how a randomly selected audience interacts with it, having your friends all like your video doesn’t do much.

One possibility is that you could ensure that your friends are disproportionately represented in the randomly selected audience. For example, you could have some hashtag that only you and your friends use, and TikTok might therefore automatically choose your friends to be in the sample audience.

This is just a hypothesis, and I have low confidence it will work.

Discuss

### LW Team Updates: Pandemic Edition (March 2020)

27 марта, 2020 - 02:55
Published on March 26, 2020 11:55 PM GMT

TL;DR

The LessWrong team has been throwing our combined might behind coronavirus efforts. As I see it, coronavirus is both a problem of immense proportion that we might be able to help with and a domain well suited to the development and application of rationality and rationality tools.

Our efforts include:

Full Update

2020 is turning out to be quite the year. I apologize that the LessWrong team hasn’t yet put out any updates so far: we’d lined up some great Q1 plans and then things got...messy.

The team’s attention is now fully focused on COVID-19. Habryka has some elaboration on why this is a top priority here.

Before I go any further, serious respect to the LessWrong community which has done a fantastic job addressing the coronavirus situation ahead of the curve. Really, the response has been very proportionate to the situation.

A Note on Additional Help

For the last several weeks, the LessWrong team has been contracting with Elizabeth to help lead coronavirus efforts, hence her name being attached to several of the below projects. Elizabeth has been a member of the extended LessWrong moderator team since possibly the rise of LW2.0 and may be recognized for her work on epistemic spotchecks and other research.

Current Projects

[This Sunday!! March 29] Online Events & Meetups

The LessWrong team is beginning to organise various online meetups and events. The plan is to experiment and explore which exact kinds of meetups can be made to work and with which tech.

The first event will be this Sunday, March 29: a live-streamed debate between Robin Hanson and Zvi Mowshowitz followed by an online meetup.

The LessWrong Coronavirus Research Agenda

Elizabeth has laid out concrete open questions related to coronavirus here. LessWrong/Rationality is about believing true things and taking actions that cause the best outcomes. In many ways, the coronavirus pandemic is an excellent test of our rationality. It’s a murky, high-uncertainty domain with high stakes.

I’ve been excited by the leading work the LessWrong community has done to date, and am further excited by the prospect of us pulling together communally to further answer the many questions that remain.

Right now, the top 3 concrete questions are:

You do not need a background in biology to help out. Basic quantitative skills are sufficient to provide useful information. Ben Pace has further guidance in his How To Contribute Guide.

On March 13th, the team launched a daily-updating database of links to coronavirus resources. Links are collected, sorted by topic, rated, and summarized. Top level categories include “Progression & Outcome”, “Spread & Prevention”, “Science”, “DIY”, and so on.

The goal is to make it easy to find important information, for example, answers to questions like:

• "What's the best dashboard to follow global case counts?"
• "What's the best link to send to my parents?"
• "How long does COVID-19 last on different material surfaces?"
• "Where can I find that really good thread by Rob Wiblin I read last week?"

The link database was originally implemented as a Google Sheet but has just been migrated to live in the LessWrong site proper.

See the database here

Appearance of the New LessWrong Coronavirus Links Database: Intro Page
LessWrong Coronavirus Links Database: Science Page

In conjunction with the links database, the team has been posting a daily-updates post containing notable links which were added in the last day. For example:

Tagging/Filtering

The LessWrong team has been talking about “tagging” as a feature for a long time now and had a prototype built as of last December. The coronavirus situation gives us extra reason to roll it out now: some people might wish to filter out the deluge of coronavirus related content and view posts on other topics. To this end, we’ve rolled out a limited version of tagging with just the “Coronavirus” tag.

Raemon describes the feature fully in this post here. The key thing to know is you click the gear next to Latest Posts to open up filtering options.

Appearance of tag filtering options upon clicking Filtering Gear Icon.
• Hidden: no posts with this tag will be displayed
• Required: only posts with this tag will be displayed
• Default: posts with this tag will be displayed normally based on karma and time since posting.

Naturally, if you're after everything we've got on coronavirus, visit the coronavirus tag page at www.lesswrong.com/tag/coronavirus.

Coronavirus Justified Practical Advice Thread (& Summary)

The team noticed early on that a lot of coronavirus advice was being shared, often without much explanation for why it was good advice. Thus was the Justified Practical Advice (JPA) Thread born. The thread is long, contains many interesting and hopefully useful ideas, but can also take a while to read. So we’ve also got the Justified Practical Advice Thread Summary which contains the best advice from the thread.

If you have something to say on coronavirus that’s not worth a top-level post, please share in the Coronavirus Open Thread.

House Coordination Spreadsheet & Isolation Levels

Raemon with the assistance of others put a great deal of effort into developing a spreadsheet template for houses within a community to coordinate on their health and quarantine status.

Coronavirus Household Isolation Coordination 1.2

The spreadsheet can be an excuse to think about isolation plans, discuss them with roommates, and create common knowledge of them. Subgoals include:

• establishing guidelines for level-of-isolation (see second tab)
• help with contact tracing
• facilitate roommate swapping between higher-caution and lower-caution households
• facilitate creation of multi-house cells that share an isolation level (where permitted by local quarantine laws, e.g. not California)

I find the second tab with “isolation levels” useful for thinking about exposure, risk, and various precautions.

Direct Research

Jim, Elizabeth, and others have been putting out some great direct research. Notable contributions include:

Upcoming Projects

I expect the team will want to remain agile in coming weeks and move our efforts to wherever seems most useful. Given that, I think it’s hard to commit specifically to what we’ll work on.

A project I’d like to work on if possible is moving us towards some kind of Wiki tech. I could see that being useful for establishing a Schelling location for the most up-to-date knowledge on given questions of interest, e.g. treatments for globally spreading infectious diseases.

How To Contribute

Ben Pace has written a guide on how to contribute to LessWrong’s coronavirus efforts. It so far details how to contribute to the Links Database and Research Agenda.

Feedback, Support, Questions, Etc.

We love to hear from people. If you want to talk to us about anything, please reach out:

Discuss

### Hanson vs Mowshowitz LiveStream Debate: "Should we expose the youth to coronavirus?" (Mar 29th)

27 марта, 2020 - 02:46
Published on March 26, 2020 11:46 PM GMT

I'm announcing the first LessWrong LiveStream Debate and Meetup, an experiment in online events.

This Sunday (3/29) at noon, Robin Hanson and Zvi Mowshowitz will debate Robin's policy proposal of deliberate infection of certain parts of the population. This will be on a YouTube LiveStream (link TBA). After the debate, there will be several audio/video rooms for a general LW meetup, where you can talk to other LessWrong users and talk with the two debaters.

Summary

• Time and Date: March 29 at 12:00 PM PDT (US West Coast Time)
• On Feb 17th, Robin wrote the post Consider Controlled Infection, and has since followed up with six more posts on the policy proposal of deliberate exposure of certain parts of the population. Robin and Zvi will be debating that proposal and its alternatives.
• They will have a 90 min debate followed by a 30 min Q&A that I'll moderate.
• For the Q&A, if you are selected you will be given an invite link and will join the Zoom call to ask your question.
• The online meetup will start just after 2pm.

I'll update this page with a link to the live stream and further info on how the Q&A and meetup will happen. I look forward to chatting with people! Check back for updates.

Discuss

### What are the most plausible "AI Safety warning shot" scenarios?

26 марта, 2020 - 23:59
Published on March 26, 2020 8:59 PM GMT

An "AI safety warning shot" is some event that causes a substantial fraction of the relevant human actors (governments, AI researchers, etc.) to become substantially more supportive of AI safety research and worried about existential risks posed by AI.

For example, suppose we build an unaligned AI system which is "only" about as smart as a very smart human politician, and it escapes and tries to take over the world, but only succeeds in taking over North Korea before it is stopped. This would presumably have the "warning shot" effect.

I currently think that scenarios like this are not very plausible, because there is a very narrow range of AI capability between "too stupid to do significant damage of the sort that would scare people" and "too smart to fail at takeover if it tried." Moreover, within that narrow range, systems would probably realize that they are in that range, and thus bide their time rather than attempt something risky.

Discuss

### How important are MDPs for AGI (Safety)?

26 марта, 2020 - 23:32
Published on March 26, 2020 8:32 PM GMT

I don't think finite-state MDPs are a particularly powerful conceptual tool for designing strong RL algorithms. I'll consider the case of no function approximation first.

It is certainly easier to do RL in a finite-state MDP. The benefit of modeling an environment as a finite-state MDP, and then using an MDP-inspired RL algorithm, is that when the agent searches for plans to follow, it doesn't evaluate the same plans twice.

Instead, it caches the (approximate) "value" for each possible "state", and then if a plan would take it to a state that it has already evaluated, it doesn't have to re-evaluate what the plan would be from that point on. It already knows, more or less, how much utility it could get thereafter. Compare that to the naïve approach of using a world-model to do full expectimax search at each timestep.
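A minimal sketch of the caching idea, on an invented toy MDP (states 0..20, actions move +1 or +2, reward equal to the state reached): each (state, horizon) pair is evaluated once, where naive expectimax would re-evaluate it once per plan that reaches it.

```python
from functools import lru_cache

N = 20  # states are 0..N; moving past N just pins you at N

@lru_cache(maxsize=None)
def value(state, horizon):
    """Cached dynamic-programming value: the subtree below each
    (state, horizon) pair is searched at most once."""
    if horizon == 0:
        return 0
    best = 0
    for action in (1, 2):
        nxt = min(state + action, N)            # deterministic transition
        best = max(best, nxt + value(nxt, horizon - 1))
    return best
```

With the cache, the number of evaluations is bounded by states × horizon; without it, the work grows with the number of distinct plans (2^horizon here).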

The model-the-environment-as-finite-state-MDP-then-do-dynamic-programming approach, or just "the MDP approach" for short, is, I think, all about not searching the same region of the planning search space twice. This is clearly a good thing, but I don't think the MDP approach in RL contains much more conceptual progress toward AGI than that. If I were to try to do a pre-natum of a fairly advanced RL agent, that is, if I tried to anticipate a response to "things went well; why did that happen?", my guess would be that a big part of the answer would be:

It avoids searching much of the planning search space even once, certainly not twice.

The MDP approach with function approximation is more powerful, depending on how good the function approximation is. There's no upper bound on how good the MDP approach with function approximation could be, because buried inside the function approximation (whether that's approximation of the value, or the optimal policy, or both) could be some clever RL algorithm that does most of the work on its own. A good function approximator that is able to generate accurate predictions of the value and/or the optimal policy might appear to us to "generalize" well across "similar states". But it's not clear to me to what extent it is a useful abstraction to say that the function approximator thinks in terms of the agent bouncing around a set of states that it classifies as more or less similar to each other.

I don't mean to say that the MDP approach is useless. I'm certainly not against using a TD-style update instead of a full Monte Carlo rollout for training a function approximator; it's better than not using one and effectively searching parts of the planning search space many times over. I just don't think it's a hugely big deal conceptually.

I think this is one small, very disputable argument against defaulting to a finite-state MDP formalism in AGI safety work. A natural alternative is to consider the agent's entire interaction history as the state, and suppose that the agent is still somehow using clever, efficient heuristics for approximating expectimax planning, with or without built-in methods for caching plans that have already been evaluated. None of this says that there's any cost to using a finite-state MDP formalism for AGI safety work, only that the benefits don't seem so great as to make it a "natural choice".

Discuss

### What is the safe in-person distance for COVID-19?

26 марта, 2020 - 23:29
Published on March 26, 2020 8:29 PM GMT

I'm not sure where the 6' number comes from, and I'm skeptical it really holds up as something I'd be comfortable maintaining for an extended period of time (If someone with c19 coughed at me from 6' away I would not feel very safe). I'm guessing the 6' is more like a quick rule for people who are only interacting briefly.

How much does it matter whether you're up/downwind? I've heard conflicting things about how airborne it might be.

I'm interested in this largely for "Okay, assuming we need to be careful about this for months at a time, what sort of practices could we use to maintain in-person social ties, indefinitely, without risk?" (i.e. going on long walks, visiting each other's house where 1-2 people hang out in the street or sidewalk and house denizens hang out on the porch, etc)

Discuss

### March 24th: Daily Coronavirus Link Updates

26 марта, 2020 - 05:22
Published on March 26, 2020 2:22 AM GMT

As part of the LessWrong Coronavirus Link Database, Ben, Elizabeth and I are publishing update posts with all the new links we are adding each day that we ranked a 3 or above in our importance rankings. Here are all the top links that we added yesterday (March 24th), by topic.

You can find the full database here: https://www.lesswrong.com/coronavirus-link-database

Dashboards

Dashboard with estimates and predictions of true prevalence

A dashboard that gives you estimates for current prevalence by country, as well as predictions for the future based on varying amounts of mitigation.

Economics

What will the economic effects of quarantine be?

LW attempts to predict what the effects of a short or long quarantine will be

Medical System

Flexport CEO explains why scaling PPE is hard

Outlines the difficulties in scaling, including QA and legal issues

Other

Stories of C19 layoffs

Reddit thread of people who lost their jobs due to coronavirus or quarantine

Review of mask efficacy

They're useful but the gains may be overwhelmed by any risk compensation, and they need to be saved for medics

(EV) He left out a swath of studies on mask use in mass gatherings

A summary of the best suggestions from the justified practical advice thread

Work & Donate

LessWrong C19 Agenda

A list of questions we want answered to inform future decisions, and assembly of answers as they're created

Discuss

### Price Gouging and Speculative Costs

26 марта, 2020 - 04:50
Published on March 26, 2020 1:50 AM GMT

Let's say you see a potential pandemic coming, and you produce a product that could be critical. Maybe you make respirator masks, maybe you make ventilators, maybe you make PCR test reagents. You can see that if you and your competitors don't ramp up production and the pandemic happens, there will be a shortage. What do you do?

One option is to do nothing: keep producing at your regular rate. If the pandemic fears were overblown then you're fine. If the pandemic happens you quickly sell out, and start scrambling to ramp up production.

Another option is to ramp up production now, speculatively. Start paying workers extra to work longer shifts and run your assembly lines around the clock. Train extra workers. Find what you're bottlenecked on and figure out how to get that ramped up too. If the pandemic fears were overblown you lose a lot of money, but if the pandemic happens people need what you have so much that you can charge high prices. How much to ramp up production in advance depends on how likely you think the pandemic is, and how much you'd be able to increase prices if it does happen.

Except we have laws and customs against price gouging: if the pandemic does happen, you are going to have a lot of trouble raising your prices. The laws generally do allow passing along increased costs, but the problem here is that your costs were speculative. Let's work an example.

Imagine your ventilators normally cost $35k to make, and the market price is $40k each. If you push really hard to ramp up production you can make a lot more, but your cost goes up to $90k/each. In New Jersey, the state I looked at last time, you're allowed to pass along costs but can't increase your profit on "merchandise which is consumed or used as a direct result of an emergency or which is consumed or used to preserve, protect, or sustain the life, health, safety or comfort of persons or their property" by more than 10% (56:8-107, 108, 109). Your costs have gone up by $55k and your normal profit is $5k, so you would be allowed to sell them for $95.5k. Your possibilities are:

• No pandemic: you spent $90k each, but the market price is $40k. You lose $50k each.
• Yes pandemic: you spent $90k each, and the market price is way above that, but you can legally only charge $95.5k. You make $5.5k each.

If you think the pandemic is >90% likely to happen then you'll expect to make money by ramping up production, otherwise you'll lose money. So even if a pandemic looks, say, 75% likely, you don't ramp up.
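The break-even calculation behind that 90% figure, using the ventilator numbers from the example above:

```python
# Per-ventilator outcomes from ramping up production speculatively:
loss_if_no_pandemic = 50_000  # spent $90k to make, market price only $40k
gain_if_pandemic = 5_500      # price-gouging cap of $95.5k minus $90k cost

def expected_profit(p_pandemic):
    """Expected profit per extra ventilator, given pandemic probability."""
    return p_pandemic * gain_if_pandemic - (1 - p_pandemic) * loss_if_no_pandemic

# Ramping up only pays when expected_profit > 0, i.e. when the
# pandemic probability exceeds 50 / 55.5 ≈ 0.90.
breakeven = loss_if_no_pandemic / (loss_if_no_pandemic + gain_if_pandemic)
```

At a 75% pandemic probability, expected_profit(0.75) is a loss of over $8k per unit, so the manufacturer rationally sits tight.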

Similarly, imagine you make respirator masks and you know that every so often there's an emergency where demand spikes. Could be a pandemic, but could also be widespread fires, or many other things. Since melt-blown fabric is a major bottleneck, you could decide to keep a large stockpile of it so you can easily ramp up production in an emergency. The same unfavorable math applies here: you're heavily limited in how much you can increase prices in an emergency, so keeping a large stockpile to be prepared for even a reasonably likely event is a money-losing proposition.

In the current crisis people are likely to die because we don't have enough ventilators, and the marginal person probably needs one for about a week, and the peak lasts maybe two months, so the marginal ventilator saves about eight lives. At the US statistical value of life of ~$9M, that's $72M per ventilator. We're heading into a disaster where we don't have enough machines that we would value at ~$72M/each and normally cost ~$35k/each to make. This is really bad.
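The arithmetic in that estimate, spelled out (all figures are the post's own rough numbers):

```python
weeks_per_patient = 1        # marginal patient needs a ventilator ~1 week
peak_weeks = 8               # "the peak lasts maybe two months"
value_of_life = 9_000_000    # US statistical value of life, ~$9M
production_cost = 35_000     # normal cost to make one ventilator, ~$35k

lives_saved = peak_weeks // weeks_per_patient       # ≈ 8 lives per machine
value_per_ventilator = lives_saved * value_of_life  # ≈ $72M
```

The gap between a ~$72M social value and a ~$35k production cost, a factor of roughly 2,000, is what makes the underproduction so striking.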

It's too late to fix this for the current situation, but I see three main ways out of this for the future:

• Allow price gouging: don't restrict what prices people can sell things at.

• Allow speculative production: require companies to disclose and document production plans, require them to share their probability estimates of how likely they think things are to be needed, keep them honest by allowing third parties to bet against them at their published probabilities.

• Have a government that will put in emergency ventilator orders at an early stage of the crisis even when they may not be needed, stockpile masks for potential pandemics, and generally stay on top of things.
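
The second option, speculative production kept honest by betting, can be sketched with a toy settlement rule. This is entirely my own illustration of the mechanism (the `settle_bet` function and its fair-odds payout are hypothetical, not something the post specifies):

```python
def settle_bet(published_p: float, skeptic_stake: float,
               emergency_happened: bool) -> float:
    """Settle a bet taken against a company's published probability.

    The skeptic bets that the emergency will NOT happen, at odds implied by
    the company's own estimate `published_p`. Returns the amount the company
    pays the skeptic (negative means the skeptic pays the company).
    """
    if emergency_happened:
        return -skeptic_stake  # skeptic loses their stake
    # Fair odds: (1 - p) * payout = p * stake  =>  payout = stake * p / (1 - p)
    return skeptic_stake * published_p / (1 - published_p)

# A company that inflates p to justify stockpiling pays out heavily when
# nothing happens; understating p invites bets it expects to lose.
print(settle_bet(0.75, 100.0, False))  # 300.0
print(settle_bet(0.75, 100.0, True))   # -100.0
```

At the published probability the skeptic's expected value is exactly zero, so a company can only profit from its published number if that number is honest.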

These three approaches require increasing levels of government competence and foresight, and given recent performance I'm pretty skeptical. But the current approach where the government doesn't handle the problem and also does not allow industry to make a profit handling the problem is a disaster.

Discuss

### AGI in a vulnerable world

March 26, 2020 - 03:10
Published on March 26, 2020 12:10 AM GMT

I’ve been thinking about a class of AI-takeoff scenarios where a very large number of people can build dangerous, unsafe AGI before anyone can build safe AGI. This seems particularly likely if:

• It is considerably more difficult to build safe AGI than it is to build unsafe AGI.
• AI progress is software-constrained rather than compute-constrained.
• Compute available to individuals grows quickly and unsafe AGI turns out to be more of a straightforward extension of existing techniques than safe AGI is.
• Organizations are bad at keeping software secret for a long time, i.e. it’s hard to get a considerable lead in developing anything.
• This may be because information security is bad, or because actors are willing to go to extreme measures (e.g. extortion) to get information out of researchers.

Another related scenario is one where safe AGI is built first, but isn’t defensively advantaged enough to protect against harms by unsafe AGI created soon afterward.

The intuition behind this class of scenarios comes from an extrapolation of what machine learning progress looks like now. It seems like large organizations make the majority of progress on the frontier, but smaller teams are close behind and able to reproduce impressive results with dramatically fewer resources. I don’t think the large organizations making AI progress are (currently) well-equipped to keep software secret if motivated and well-resourced actors put effort into acquiring it. There are strong openness norms in the ML community as a whole, which means knowledge spreads quickly. I worry that there are strong incentives for progress to continue to be very open, since decreased openness can hamper an organization’s ability to recruit talent. If compute available to individuals increases a lot, and building unsafe AGI is much easier than building safe AGI, we could suddenly find ourselves in a vulnerable world.

I’m not sure if this is a meaningfully distinct or underemphasized class of scenarios within the AI risk space. My intuition is that there is more attention on incentive failures within a small number of actors, e.g. via arms races. I’m curious for feedback about whether many-people-can-build-AGI is a class of scenarios we should take seriously and, if so, what things society could do to make them less likely, e.g. invest in high-effort info-security and secrecy work. AGI development seems much more likely to go existentially badly if more than a small number of well-resourced actors are able to create AGI.

By Asya Bergal

Discuss