
### Rational Dojo: Installing Habits and Triggers

Events at Kocherga - April 19, 2019 - 19:00
Monday, April 22, 16:30

### Username change and event page editing

LessWrong.com news - April 19, 2019 - 07:43
Published on April 19, 2019 4:43 AM UTC

I want to change my username, as making a new account would delete all my posting history, which I value.

I am also the new organizer for the Kansas City SSC meetup, and have things to edit on the page (starting with the fact that there appears to be a duplicate posting of the group).


### Reflections on Duo Standard

LessWrong.com news - April 19, 2019 - 02:20
Published on April 18, 2019 11:20 PM UTC

Companion Piece: Reflections on the Mythic Invitational

Frank Karsten (ChannelFireball): The Mythic Invitational Wasn’t Perfect And It Was Still a Smashing Success

The first question is, is Duo Standard a good format? Should we continue to play it? Or should we abandon it?

Wizards has already answered that this is definitely not the final form. That still leaves the question of why, and how to improve.

The second question is, What is the right thing to do in Duo Standard, right now or in general? What game theory questions, metagame shifts and strategic issues are most important? How does this differ not only from best-of-three, but from best-of-one?

My answer to the first question is: No. Duo Standard is not a good format. We should not continue to play it. The Mythic Invitational was played and the verdict is in.

The best way to give a more complete answer to the first question is to answer the second question, then double back. So that’s how this is structured.

Duo Standard: Level Zero

The biggest errors in my Speculations on Duo Standard were errors about level zero: Understanding best-of-one Standard.

I stopped playing games of best-of-one Standard the moment I realized there was a best-of-three ladder on Magic Arena. Why would I want to play an impoverished game of Magic, unless I had severe time constraints or a deck that wasn’t ready to field a sideboard yet?

Thus, I missed some key elements and dynamics. So did the players and commentary team, although different ones, and fewer of them.

Duo Standard is radically different from simple best-of-one. But if you start from an incorrect base, you’ll reach wrong conclusions.

The developments I missed were: Mastermind’s Acquisition in Esper, Esper Acuity, the dismissal of Sultai and the dismissal of blue. I also underestimated people’s love of mono-red and mono-white on this level, in addition to my underestimation of it on levels one and two.

The level-zero development most players definitely missed was this: Blue is still great.

The ambiguous issue is that Gruul was both (from where I sit) underplayed and badly built, and did poorly.

I also suspect Golgari/Sultai was wrongly dismissed, but I dislike and don’t play the deck, so it’s hard for me to say.

Let’s talk quickly about all six.

1. Mastermind’s Acquisition in Esper

I forgot this card existed. Once one remembers the card exists, adding it to the deck is obvious. It provides a way to win and close out games and a lot of flexibility, and your sideboard wasn’t doing anything otherwise. If you’re in a mirror, this creates a counter or a must-counter threat. If you’re not in the mirror, this provides runaway advantage or an answer to your problems.

The mana cost is high, so it’s not clear how many copies one can afford. The first copy is clearly a win given how much flexibility it earns, especially by starting with The Mirari Conjecture. After that it’s hard to say and opinions varied, as the four slot is already quite dense. My guess is that you want one in normal best-of-one, but two in Duo Standard due to the popularity of Esper.

2. Esper Acuity

It in no way occurred to me that Esper Acuity was a card. It turns out there is a whole deck involved, once you can play Mastermind’s Acquisition and not worry about sideboarding. You have a lot of ways to gain life and keep things running smoothly to buy time for Mastermind’s Acquisition, and against Teferi, Hero of Dominaria decks you can go get a Sorcerous Spyglass they can’t remove. Not bad.

Is the deck actually good, though? I still have no idea. Given who played it, it’s at least reasonable. The existence of this option, as a way to have a lot of life gain for tiebreakers and against red, while having a plan in the Esper mirror, changes things a lot.

3. The Dismissal of Sultai

People who play Jund historically talk about how great their sideboards are. The same is said of Sultai in current Standard.

As someone who never got into Jund, I never understood that.

Yes, you get to make sure that all your cards are good, and shore up on numbers where you are weak. But that’s everyone’s plan. Maybe not quite everyone, but close.

The cards you do bring in seem generic. After sideboarding, you’re still fundamentally doing the same thing you were doing before sideboarding.

You don’t get more than a few of the thing that you want most. Decks that fear removal will face more, but not feel swamped. Decks that fear discard will face some discard, but not that much. Somehow your cards in are both spread too thin, and don’t feel special.

The cards that come out are a big win if you can take out useless creature removal. But Esper and Nexus both often bring in creatures that you need to worry about. You’re still winning to this, because the alternative is that they always ‘guess right’ and have no men, but you’re not winning that much to it either.

Sultai is also considered a solid way to handle aggressive creature decks. If it can’t even win game ones against red and white, honestly what is the deck even doing? Where is it good? Why did so many Pros play it in Cleveland, despite it seeming bad everywhere?

I don’t know. I’ll likely never know. Perhaps it was a ‘I want to use my superior playing skill’ thing, or a ‘I want my fate in my own hands’ thing. That always plays too big in many players’ minds. But if so, I expected that to continue!

Thus, when everyone figured out that Sultai was bad, which I agree with, it took me by surprise. Golgari gets smoother mana and most definitely should be able to handle the aggressive decks, a la what Reid Duke brought, but I don’t see how it stands up to Esper, which is central.

This resulted in me exploring and expecting a bunch of potential pairings involving Sultai that turned out to be off the table.

4. The Dismissal of Blue

This is distinct from blue being good or bad. If everyone thinks blue is unplayable, then that’s identical to it being unplayable for all practical purposes.

That seems to have been what most people think. When interviewed early on, the one blue player was almost apologetic about his choice. He said (this is a heavy paraphrase) that yes, it seems like it’s a terrible choice for the metagame, but it’s a deck and he likes it so he’s playing it anyway.

The argument was that blue was bad against red and bad against white in a best-of-one world, and couldn’t afford to play Essence Capture or Entrancing Melody because they would be dead against Esper.

Without blue to worry about, two things happen.

First, players can’t play blue, which shuts down a lot of the pairings and options I was considering.

Second, players don’t need to worry about beating blue. All the hate floating around Standard can safely disappear, and there is no reason to play anti-blue Rakdos/Jund or a hateful Gruul build. So no one played such decks and cards. There were a small number of anti-weenie Gruul builds, but no anti-blue ones.

This is why the format shrunk so dramatically, even more than the lack of sideboards or the banning of Nexus of Fate.

5. Blue Is Actually Great

We could cite anecdotal evidence of the one blue player finishing second overall. That definitely counts. It sounded like it mostly convinced the coverage team, despite them constantly saying blue couldn’t be chosen in game three even at the end.

The core problem of blue is the same core problem everyone else faces: You have to worry about creatureless Esper, and also lots of heavy creature decks. The cards that beat the creature decks (in this case, Essence Capture and Entrancing Melody) are blanks against Esper. Blanks kill you.

Blue has the opposite problem in best of three. I liked Quench in the main when I tried it. After sideboarding, it’s never terrible. But you always have 60 cards you want more, so it always makes your elephant worse.

Quench in turn weakens Chart a Course even further. I’ve already cut it entirely from my best-of-three build, because after board I always have a better 60 cards to play, and I’d rather improve that 60 when the card was always marginal in the main deck. In many matchups you have zero time to tap mana on your turn for no board impact, especially red and white.

I do think that if you run zero Chart a Course that implies a 20th Island, but the mana smoothing rule in best of one, combined with not running Entrancing Melody, knocks us back to 19 again.

The other move Piotr made was to run two copies of Surge Mare. The theory is that Surge Mare plays against Esper while being a strong card against red and white. It defends you against the many copies of Goblin Chainwhirler and Cry of the Carnarium that are out there in best of one, and makes the deck more ‘solid.’ In exchange you give up mana efficiency, which is the main thing the deck is doing, so that’s a big cost. I’ve never been a Surge Mare fan.

Playing the deck this way leaves you very strong against Esper. You have zero dead cards, and enough counters to give the opponent fits.

The question is the aggressive matchups. When playing with three copies of Essence Capture, and more recently also a copy of Entrancing Melody, I actively like my game one against red. This configuration seems much worse. Quench is scary against red because of Runaway Steam-Kin. You’re running five cards that go dead if they have surplus mana, and that gets in the way of the trading strategy and runs into whammies. The +1/+1 counters will be missed quite a bit, as this opens you up to Goblin Chainwhirler more and gives them much longer to draw burn. Surge Mare helps a bit, and it is nice to have more counters for Experimental Frenzy. My guess is that things are now about even here.

I always expected the white matchup to be bad, but then I started trading off creatures on turn two. After that I kept winning games and was forced to update to it being a good matchup, as scary as it feels. They have less removal than you think, and fewer sources of power than you think. If you use your counters well, their deck is often doing very little for quite a while. Trading often puts you in a dominant position.

The problem is that this was leaning heavily on Essence Capture and Entrancing Melody, both of which are amazing cards here. The extra +1/+1 counter from Essence Capture often creates a blocker that shuts off many of their creatures, or allows you to handle History of Benalia, and it means your counters mostly still work late in the game. Entrancing Melody is backbreaking when it takes Benalish Marshal, and a cheap two-for-one otherwise, although that two might be countering History of Benalia. But that’s fine, because they don’t have that many powerful threats.

Surge Mare blocks, but it takes a lot of mana to stop the opponent from attacking. I’m not convinced it’s better than Mist-Cloaked Herald here. Mist-Cloaked Herald trading off for one mana is a bigger game than one might think, and I’m happy here we still have two copies.

My guess is that this matchup is now slightly bad for you, perhaps moderately bad, but it is definitely not disastrous. If I’m facing half Esper and half white, I’d rather put up blue against that than either white or Esper. If it’s Esper and red, I’m quite happy.

Can we give back some of the Esper matchup and accept a dead card or two? Not at this density of Esper. It’s too big a game to be great there. If the metagame reaches a more stable equilibrium, which will have less Esper in it, that might change.

Why do players think blue is bad? Traditionally the answer has often been ‘they are bad at playing blue’ but this field is too good for that.

6. Gruul Failed

Looking at the lists, I do see a decent amount of Gruul played. None of it went deep into the tournament, so the question is why.

Some of it was weird or built badly. I don’t think that playing Ghalta, Primal Hunger is a good idea. Nor do I think a dinosaur theme is something one does if looking to maximize win percentage.

There are two basic approaches to Gruul. One can build the deck around Warriors, or around base green. There’s also a base-red version with a tiny splash, but to me that’s just a red deck and doesn’t count.

The Warriors lists are clearly strong against red and white if they run a bunch of removal spells. Your creatures fight so much better, and you don’t fall far behind given where your curve lies. The worry is how well it plays against Esper. You have a lot of haste, and your creatures hit hard, but your removal spells are not that useful so a balance needs to be struck. Despite all worries, I was told during my scramble to prepare that this ‘traditional’ Gruul deck is considered advantaged against Esper. I’m not sure that’s true, but it seems not highly disadvantaged and that should be good enough.

The green-based lists splash red and are based around Steel-Leaf Champion to varying degrees. These decks tried going up to Ghalta, Primal Hunger off the back of Nullhide Ferox. I don’t think one needs to go that big, as I’ve had no problems overpowering people without resorting to 12/12s. By going large, you force over-commitment to the board and open yourself up to sweepers, while Esper still has Mortify and Teferi, Hero of Dominaria as direct responses.

My guess is that this was a large part of why the green decks did poorly. They also seem to have been chosen by players who expected a different metagame to emerge, and who have less history with and are less comfortable with Magic Arena. These players knew some things others didn’t, but also didn’t know things others did, and made detail choices that didn’t fit the situation.

My choice remains my build of Gruul, as played by Brian David-Marshall. See the guide here [TODO]. The only question is Lava Coil. Lava Coil is purely dead against Esper, but is important to stopping Rekindling Phoenix out of Warrior Gruul. I was worried that if we got rid of Lava Coil, we’d be afraid of the semi-mirror, although it isn’t obvious we’d need to be given that otherwise things line up quite well. Without those two copies of Lava Coil, the Esper decks that actually showed up are decks you’re not unhappy to play against at all.

One must always remember that they don’t always have Kaya’s Wrath and it takes a while to find two. Many people played as few as two copies of that card. In that case, you’re feasting. Half-measures like Cry will usually not cut it.

Level One

I cheated a bit discussing level zero, as I assumed the level one result that there is a ton of Esper. Now it is time to explore why there was a lot of Esper.

Esper is arguably the best deck, but it’s more than that.

Esper is the central level one move.

Every deck, including Esper itself, faces the same central trade-off. Do I play defenses against creatures, or avoid dead/bad cards against Esper?

Red has this the easiest. Its burn goes meaningfully to the face. Red already wants to avoid cards like Lava Coil, and the Esper consideration seals the deal.

White can use Conclave Tribunal on Teferi, Hero of Dominaria, so its removal does not go entirely dead if it avoids Baffling End. This still keeps decks off of the full four copies of Conclave Tribunal, which feels from the other side like a big win in many matchups.

Blue decks have to abandon Essence Capture and Entrancing Melody.

Green decks need to worry about Collision/Colossus, and their general rate of durdling and defending versus attacking.

Black decks have pure creature removal like Moment of Craving and Cast Down.

Esper itself has to worry about Mortify, Cry of the Carnarium, Kaya’s Wrath, Cast Down and Moment of Craving. Mortify hits Search for Azcanta, but the others constitute about ten dead cards. Four of them can hopefully get discarded to Chemister’s Insight, but that still leaves six more.

Players can choose to ‘sell out’ one of their decks against one side of the dilemma, by pre-sideboarding into an anti-creature or anti-spell configuration. A blue, Temur Reclamation or Esper deck that wants to defeat Esper will have a dominant matchup. A blue, Esper Acuity, Sultai/Golgari or Gruul deck that wants to defeat creature decks can dominate those matchups. The exact distribution of the dominance is flexible.

In exchange, if you run into the wrong half of the field, things get quite ugly.

To defend against someone who sells out, one must maintain a creature deck and a non-creature deck.

‘Non-creature’ means a deck not vulnerable to an anti-creature sideboard configuration. In theory a creature deck could be good against the ‘normal’ anti-creature setups, but right now that does not seem to be the case. So the three viable creatureless builds are Esper Acuity, Esper Control and Temur Reclamation. Temur Reclamation (in my understanding) folds the creature matchups to win the creatureless matchups. Esper Acuity is an attempt to win both via different parts of the build, but focused more on anti-creature. Esper Control can be tuned either way.

In theory there’s also Dimir/Grixis, but it’s a bad deck, and the number of Nullhide Ferox running around does not make it better.

Thus, the natural level one thing to do is to always play at least one of those three decks. Two of them are odd ducks players aren’t as used to, and that aren’t as naturally strong as strategies (in my opinion), so Esper got the nod a very large percentage of the time, even though everyone knew that would happen.

On the creature-deck side, to guard against things like an Esper deck without much creature removal, things are far more wide open. You can choose white, red, blue, Drakes, Sultai/Golgari, Gruul and more. There’s something for (almost) everyone.

Most players seem to have stopped on this level, perhaps because advancing past it proved hard. There are two ways one can diverge from it. One can refuse to go to level one because something else matters more, or one can go beyond into various forms of level two.

Some players, as I predicted, embraced that they were at a severe skill disadvantage. Thus, they sought to minimize the impact of that disadvantage, and either maximize variance or focus on getting good with a single deck. Alternatively, some players may have decided one deck was just better than the others.

Either way, the result was a number of players submitting two identical decklists.

I mentioned the possibility of using two copies of the same strategy, but pre-sideboarded against different halves of the field. That was not tried. What was tried was saying that this was my deck and I dare you to come and beat it twice.

This did not go well. All players who did this finished poorly.

These players were self-selected to already be at a disadvantage, so failure should not be too surprising. It is still strong confirmation that this approach gives up quite a lot. You need to at least be at level one.

Level Two: Finding the Trump

Paulo writes that he expected everyone to find the level one strategy of Esper paired with red or white, and spent a lot of time looking for a deck that was very strong against all three, even if it was horrible elsewhere.

Unfortunately, Standard doesn’t work that way. Every deck does powerful things and is capable of winning games against everything else. Unless a deck intentionally forfeits a matchup.

Even if you do forfeit some areas to win others, the whole point of the Esper/Aggro split is to prevent one deck from crushing both. If you play answers to Aggro cards, those cards are bad against Esper. Either play such cards, or do not.

You do get to ignore the third leg of the triangle trade-off: Sultai. It is fine to not be the ‘midrange trump’ given what others are up to.

If Paulo had been less ambitious, there were solutions.

Gruul with zero dead cards, and creatures selected with Esper in mind, is thought to be advantaged against Esper even with several burn spells. This matches my experience. Your draws usually force them to have exactly Kaya’s Wrath plus good support. Most versions don’t run the full four Kaya’s Wrath and don’t have much quick card draw. You can run either a Steel-Leaf Champion version or a Warrior build. Both work. What you can’t do is run Ghalta, Primal Hunger. That is exactly what Esper wants you to do.

Blue also beats Esper soundly if you pack zero dead cards by using Quench instead of Essence Capture. If you know how to play blue properly, it wins the matchup even with three or four dead cards; it just doesn’t crush the matchup then. Against white and red, you’re advantaged game one with a balanced configuration. So you can definitely get an edge on the pairing without having any clear weaknesses, and you can choose how to balance your needs. It’s not quite what was ordered, but it’s a good idea.

Golgari/Sultai also potentially offers promise if you are willing to give up the mirror and other midrange matchups, allowing polarization of spells and doing things like running Assassin’s Trophy as the cheap removal because it hits Teferi, Hero of Dominaria. I strongly suspect Reid Duke was on the ball here, but I can’t confirm from my own experience.

Level Two: Accept Polarization

Another approach is to force their choice in game three.

The commentators (wrongly) kept implying that many players ‘couldn’t’ choose whatever decks were not Esper or White/Red because they viewed one of the potential matchups as too bad. Part of this was a poor view of the matchups.

Part of that was poor game theory.

Part of it was putting the blame on the non-standard or polarizing deck.

Nassif brought Temur Reclamation. This deck (as he configured it, at least) is very good against Esper, very bad against aggression. Thus, he ‘can’t’ play Temur because his opponent ‘could’ play aggro. So he plays his other deck, Esper Acuity, which has matchups that are more balanced. Then his opponent, Mengucci, knows that Nassif ‘can’t’ play Temur, even though this is why Temur was a good idea in the first place, and can and does safely choose Esper.

Which is exactly what I was expecting. Time and time again, players who brought unique decks ‘lost their nerve’ when choosing for game three, chose the non-polarizing deck over the polarizing one, and walked into a bad matchup.

Players who had the generic decks also chose the non-polarizing deck, or the one with the most ‘play skill’ generically attached to it, over the riskier deck, whether or not that had expectancy. In some sense they also lost their nerves, but were right to do so in non-standard matchups, because their opposition followed suit and rewarded them.

Thus the general conclusion that players will vastly overvalue ‘controlling my own destiny’ in both the matchup and play decision sense, and sacrifice expectancy to get it, even when against a player of equal or greater skill level, so long as they consider themselves good enough to be there.

To capitalize on this dynamic, one can invest in making one matchup very bad for the opponent. Make sure that they ‘can’t’ play Esper because you have a mirror-breaker list, or they ‘can’t’ play their creature deck because you can sweep the board too much, or what have you. This can be worthwhile even if you give up percentage in the first two games by doing it.

If your opponent has two similar decks, you get the benefit of choice. If your opponent has two different decks, you know they will almost always dodge you. Thus, you can play the other half of your ‘good pairing’ and get an edge in game three that way. Even if the flip would be very bad for you, don’t worry. It won’t happen.
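The dodge dynamic above can be sketched as a toy calculation. All deck names and win probabilities below are hypothetical, chosen purely to illustrate the logic, not taken from the event:

```python
# Toy model of the game-three choice dynamic. All win probabilities
# are hypothetical, for illustration only.

# Our pairing: a polarized "trump" deck that crushes Esper but folds to
# aggro, plus a balanced deck with no terrible matchups.
p_win = {
    ("trump", "esper"): 0.70,    # the matchup we invested in
    ("trump", "aggro"): 0.30,    # the matchup we gave up
    ("balanced", "esper"): 0.45,
    ("balanced", "aggro"): 0.55,
}

# The observation: opponents who 'can't' risk the polarized matchup
# dodge it outright, so the opponent avoids Esper entirely.
opponent_plays = "aggro"

# Knowing the dodge is coming, we pick whichever deck is best against
# the deck they are forced toward, taking the edge in game three.
our_deck = max(["trump", "balanced"],
               key=lambda d: p_win[(d, opponent_plays)])

print(our_deck, p_win[(our_deck, opponent_plays)])  # balanced 0.55
```

Under these toy numbers, the investment in the polarized deck pays off even though it is never chosen in game three: its mere presence forces the opponent into the balanced deck’s better half of the pairing.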

Doing this for several days every time might clue people in. It also might not.

There was then another group of players who didn’t consider themselves good enough. They knew they had a skill disadvantage, so they did things like submit two creature decks or even two identical decks, which as we discussed earlier was a disaster.

The key is to do the right thing. Maximize your chance of winning, then when that involves rolling the dice, roll the dice.

Conclusion and Level Three: The Beyond

What is clear from this discussion, and logical extension to future levels of play, is just how weird this format really is. The important considerations revolve around the game three choices players get, and the guessing games that result, and anticipating opposing choices based on their anticipation of those choices.

You know what this has nothing at all in common with?

Kitchen table Magic!

Sideboards make sense. Even if the kitchen table doesn’t use them formally, everyone understands that hate cards can help against the deck someone else is beating you with, or is expected to bring. Anticipating an opponent’s response to these logical forks and building levels upon levels? That doesn’t have any parallel at all. It’s alien. No one – not even those playing – understands it or what makes it interesting.

It also means a lot of big gambles and big risks tied to Boolean guessing games. Things are inherently random. We were fortunate that most Standard matchups are by default close, so things weren’t too bad. At some times in the past, this format would have been much, much more random than this.

Sideboards allow adjustment. They let formats self-balance. They also are where the most interesting decisions, and most skill-based actions, lie. They are the biggest reward for knowing how your deck ticks. Building your deck between games is where it is at, and I especially love ‘tuning’ sideboards where what to do in each match is far from obvious.

Sideboards allow decks to differentiate themselves far more. Without sideboards, a format contains far fewer effectively different strategies, no matter how finely you draw the boundaries. Post-sideboard games of the same matchup often play very differently with different builds and players. Without those sideboards, things are much more generic.

Sideboards are easy to follow once Arena gives you the screen showing what is going on (as does Magic Online), and once you have that, watching these decisions is fascinating. It’s also great for lower-level players. They learn how to do it, and they see what players think is good and bad in different places. It’s great.

More to the point, the expressed goal of ‘being like kitchen table Magic’ is deeply misguided. I’ll quote from Reflections on the Mythic Invitational:

To be blunt, it’s a stupid goal. Professional and advanced play of games often involves additional twists that don’t make sense at home. People understand.

Does your Little League game use a bullpen? Should MLB stop using one because you don’t? Does your touch football game not use distinct offensive and defensive players? Should the NFL fix this?

Of course not. That would be insane.

Let us end on a positive note.

What was good about the format? What can we carry forward to our next experiment?

I think we can take two good things away.

The first is that different is great.

Doing almost anything new and different will create new and different dynamics to explore. People will try cool new things, find innovative solutions. Even in failure, the first day with a given new format is still going to be good. Almost every time.

This implies that the more we do crazy different things, the better off we are.

This too goes against the goal of ‘Kitchen Table Magic’ but that’s the point. We want to set up a unique challenge and see how players react.

The second is that teams are great.

That doesn’t sound like it follows. Duo Standard didn’t have teams! But it did, of course. Each player brought a team of two decks. Effectively they were a team of two with one player taking both roles. The most interesting consequences came when the teammates interacted with each other, complementing each other’s strengths and covering and exploiting weaknesses.

Let that be two different players on each side, and everything improves.

Of course, add a third player, restore the sideboards, and you get Team Standard, which is pretty great.

Many E-sports have figured this out. A team of three or five can put on a better show than a team of one, create better moments and better stories.

Next in Magic land, we’re on to War of the Spark, Pro Tour London and the London Mulligan. I’m sure I will have thoughts. And I know I’ll have fun watching it play out, whether it is triumph or disaster.


### Criticizing Critics of Structural-Functionalism

LessWrong.com news - April 18, 2019 - 21:52
Published on April 18, 2019 3:10 AM UTC

Structural-Functionalism is usually criticized for being circular, in the following two ways:

1) The function of the whole follows from that of its parts, and the function of the parts follows from that of the whole.

2) Schematic representations of society are formulated on the basis of preexisting societal institutions which are, in turn, used to substantiate their existence.

The argument from tautological circularity suggests that this state of internal consistency prevents a formal theory from accurately explaining the actual structures and behaviors of the actual world; and that fitting the actual world into this scheme yields nothing but an artificial self-contained system divorced from all that it is attempting to describe and explain.

More generally, the argument can be stated thus: Describing all the statements which constitute a social systems theory as tautological is inevitable, as they are all true in virtue of their form but not in virtue of fact. They need not bear any resemblance to the world; they only need satisfy the structure of the system and the logical minds who created it. The structure is coherent, but coherence doesn't necessarily imply correspondence to reality, and hence to the target of formalization. This can be called the general problem of formalization, as it applies to any science.

The counterpoint to be made here is that the schematic representation is warranted precisely because it is based upon what substantiates its existence. Otherwise it wouldn't exist, and in its stead another possible version of it would exist, one which conforms to the actual structure of the actual world. Faulty premises derived from observations about preexisting institutions do lead to faulty formal theories, and those that don't rely on faulty premises succeed in their intentions. What happens to be the case always remains; whether or not the theory succeeds in its intentions remains to be seen. The failure of a theory must be granted, but so too must the success of any given theory.

What qualifies as the criterion for success is where uncertainty arises. But this uncertainty need not be of consequence, as the answer is simple: As long as a real-world interaction, action, or organization satisfies the criteria for its counterpart in the formal theory, and that formal counterpart implies its physical-social realization in the real world, the overall structure of one matches the other. Just because the abstract seeks generality doesn't mean that particular instances of human experience and sociological phenomena don't fit into some general scheme. This scheme should mirror the particular, and the particular should mirror this scheme.

Formal theories seek a disquotational scheme in which reality, apart from its representation, can also fit. Otherwise no one-to-one correspondence between theory and reality has been achieved. Formal theories seek to be redundant—they seek to be superfluous in so far as genuine redundancy can be achieved. Such a scheme would allow one to superimpose their personal experience onto the logical structure of the theory and yield an affirmative result (if the theory proves workable for that specific case). One can reflect their own experience onto the general structure of a scheme so as to see for themself how true to life a correspondence there is. The reader of these very words can themself act as an instantiation of a theoretical claim and so substantiate it. Only if this collection of instantiations is statistically significant will the claim be substantiated. If not, the theory is a failure and the claims false. If it is the case then it is the case, if not then not.

The set of schemes should correspond to the set of real-world particulars. I wouldn't say that the set of schemes constitutes the totality of a formal theory, as that set accounts only for its structure and not its meta-theoretical parts, like why the theory is the way it is and how, which is to say the methodology of theory construction.

The criteria for success and failure that I've argued for thus far have concerned whether instantiations of phenomena in and of themselves, without their data representations, correspond to their formal counterparts in some formal theory. The greatest consequence of a disquotational scheme, in which 'X' iff X, is that the object of study has been formalized into a sentential object, further divorcing it from the base reality of which it is intended to be a part. An alternative to my explicitly tautological approach would be to derive empirical evidence from observable real-world events and functions, consequently make structural claims about society based on the data collected, and view that as the only way to validate a theory.

What may be evident to some is the degree to which this resembles the problem of coordination in philosophy of science, which concerns how the empirical relates to the theoretical. “Correspondence” would be an appropriate addition to the vocabulary of coordination problems, especially given its present application to how theorizing relates to data. Coordination comprises both correspondence and validation: validation for the theory that the measurement conforms to, and reciprocal validation for the measurement procedures that produced an outcome the theory predicted. Because they correspond to each other, they validate each other. Coordination is reached if measurement M and theory T satisfy each other, and it is not reached if they do not.
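This satisfaction relation can be written compactly (my notation, not the author's):

$$\mathrm{Coord}(M, T) \iff \mathrm{Sat}(M, T) \land \mathrm{Sat}(T, M)$$

Coordination holds exactly when the measurement satisfies the theory and the theory, reciprocally, satisfies the measurement; failure of either conjunct is failure of coordination.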

So coordination is a function of satisfaction as related to measurement and theory, and satisfaction is a relation between both measurement and theory. Success and failure are the two possible outcomes.

This reciprocal correspondence between theory and measurement is really one between two higher level systems derived from, but not identical to, base reality. Both are subordinate to base reality. The empirical is not to be confused with the “real” as empirical data and its accompanying methodologies can be, and routinely are, faulty. Moreover, appearance is not to be confused with reality, nor are observations of those appearances.

Satisfaction relies not upon the premise that concrete observable structures (the phenomena in and of themselves) can be isomorphic to abstract theoretical ones (substructures of models or parts of wholes), but upon the less committal and more plausible premise that data collected from instantiations of those phenomena can be isomorphic to formalizations of those phenomena. Whether the empirical and the theoretical each correspond to the "real" is another question.

The argument from tautological circularity would also apply to conventional sociological analysis, and to any data-driven discipline more generally. The data derived from empirical research is intended to correspond to what it represents, just as a formal theory intends to do. This supposed circularity is not unique to formal theory.

Let me know if my criticism is as nonsensical as theirs.

Discuss

### Alignment Newsletter #53

LessWrong.com News - April 18, 2019 - 20:20
Published on April 18, 2019 5:20 PM UTC

Alignment Newsletter #53: the newsletter turns one year old, and why overfitting isn't a huge problem for neural nets

Cody Wild is now contributing summaries to the newsletter!

Highlights

Are Deep Neural Networks Dramatically Overfitted? (Lilian Weng): The concepts of underfitting and overfitting, and their relation to the bias-variance tradeoff, are fundamental to standard machine learning theory. Roughly, for a fixed amount of data, there is an optimal model complexity for learning from that data: any less complex and the model won't be able to fit the data, and any more complex and it will overfit to noise in the data. This means that as you increase model complexity, training error will go down to zero, but validation error will go down and then start turning back up once the model is overfitting.

We know that neural networks are much more expressive than the theory would predict is optimal, both from theorems showing that neural networks can learn any function (including one that provides a rather tight bound on the number of parameters), as well as a paper showing that neural nets can learn random noise. Yet they work well in practice, achieving good within-distribution generalization.

The post starts with a brief summary of topics that readers of this newsletter are probably familiar with: Occam's razor, the Minimum Description Length principle, Kolmogorov Complexity, and Solomonoff Induction. If you don't know these, I strongly recommend learning them if you care about understanding within-distribution generalization. The post then looks at a few recent informative papers, and tries to reproduce them.

The first one is the most surprising: they find that as you increase the model complexity, your validation error goes down and then back up, as expected, but then at some point it enters a new regime and goes down again. However, the author notes that you have to set up the experiments just right to get the smooth curves the paper got, and her own attempts at reproducing the result are not nearly as dramatic.

Another paper measures the difficulty of a task based on its "intrinsic dimension", which Cody has summarized separately in this newsletter.

The last paper looks at what happens if you (a) reset some layer's parameters to the initial parameters and (b) randomize some layer's parameters. They find that randomizing always destroys performance, but resetting to initial parameters doesn't make much of a difference for later layers, while being bad for earlier layers. This was easy to reproduce, and the findings reemerge very clearly.

Rohin's opinion: I'm very interested in this problem, and this post does a great job of introducing it and summarizing some of the recent work. I especially appreciated the attempts at reproducing the results.

On the papers themselves, a regime where you already have ~zero training error but validation error goes down as you increase model expressivity is exceedingly strange. Skimming the paper, it seems that the idea is that in the normal ML regime, you are only minimizing training error -- but once you can get the training error to zero, you can then optimize for the "simplest" model with zero training error, which by Occam's Razor-style arguments should be the best one and lead to better validation performance. This makes sense in the theoretical model that they use, but it's not clear to me how this applies to neural nets, where you aren't explicitly optimizing for simplicity after getting zero training error. (Techniques like regularization don't result in one-after-the-other optimization -- you're optimizing for both simplicity and low training error simultaneously, so you wouldn't expect this critical point at which you enter a new regime.) So I still don't understand these results. That said, given the difficulty with reproducing them, I'm not going to put too much weight on these results now.

I tried to predict the results of the last paper and correctly predicted that randomizing would always destroy performance, but predicted that resetting to initialization would be okay for early layers instead of later layers. I had a couple of reasons for the wrong prediction. First, there had been a few papers that showed good results even with random features, suggesting the initial layers aren't too important, and so maybe don't get updated too much. Second, the gradient of the loss w.r.t later layers requires only a few backpropagation steps, and so probably provides a clear, consistent direction moving it far away from the initial configuration, while the gradient w.r.t earlier layers factors through the later layers which may have weird or wrong values and so might push in an unusual direction that might get cancelled out across multiple gradient updates. I skimmed the paper and it doesn't really speculate on why this happens, and my thoughts still seem reasonable to me, so this is another fact that I have yet to explain.

Technical AI alignment   Technical agendas and prioritization

Summary of the Technical Safety Workshop (David Krueger) (summarized by Richard): David identifies two broad types of AI safety work: human-in-the-loop approaches, and theory approaches. A notable subset of the former category is methods which improve our ability to give advanced systems meaningful feedback - this includes debate, IDA, and recursive reward modeling. CIRL and CAIS are also human-in-the-loop. Meanwhile the theory category includes MIRI's work on agent foundations; side effect metrics; and verified boxing.

Iterated amplification

A Concrete Proposal for Adversarial IDA (Evan Hubinger): This post presents a method to use an adversary to improve the sample efficiency (with respect to human feedback) of iterated amplification. The key idea is that when a question is decomposed into subquestions, the adversary is used to predict which subquestion the agent will do poorly on, and the human is only asked to resolve that subquestion. In addition to improving sample efficiency by only asking relevant questions, the resulting adversary can also be used for interpretability: for any question-answer pair, the adversary can pick out specific subquestions in the tree that are particularly likely to contain errors, which can then be reviewed.

Rohin's opinion: I like the idea, but the math in the post is quite hard to read (mainly due to the lack of exposition). The post also has separate procedures for amplification, distillation and iteration; I think they can be collapsed into a single more efficient procedure, which I wrote about in this comment.

Learning human intent

Conditional revealed preference (Jessica Taylor): When backing out preferences by looking at people's actions, you may find that even though they say they are optimizing for X, their actions are better explained as optimizing for Y. This is better than relying on what they say, at least if you want to predict what they will do in the future. However, all such inferences are specific to the current context. For example, you may infer that schools are "about" dealing with authoritarian work environments, as opposed to learning -- but maybe this is because everyone who designs schools doesn't realize what the most effective methods of teaching-for-learning are, and if they were convinced that some other method was better for learning they would switch to that. So, in order to figure out what people "really want", we need to see not only what they do in the current context, but also what they would do in a range of alternative scenarios.

Rohin's opinion: The general point here, which comes up pretty often, is that any information you get about "what humans want" is going to be specific to the context in which you elicit that information. This post makes that point when the information you get is the actions that people take. Some other instances of this point:

Inverse Reward Design notes that a human-provided reward function should be treated as specific to the training environment, instead of as a description of good behavior in all possible environments.

CP-Nets are based on the point that when a human says "I want X" it is not a statement that is meant to hold in all possible contexts. They propose very weak semantics, where "I want X" means "holding every other aspect of the world constant, it would be better for X to be present than for it not to be present".

Wei Dai's point (AN #37) that humans likely have adversarial examples, and we should not expect preferences to generalize under distribution shift.

Stuart Armstrong and Paul Christiano have made or addressed this point in many of their posts.

Defeating Goodhart and the closest unblocked strategy problem (Stuart Armstrong): One issue with the idea of reward uncertainty (AN #42) based on a model of uncertainty that we specify is that we tend to severely underestimate how uncertain we should be. This post makes the point that we could try to build an AI system that starts with this estimate of our uncertainty, but then corrects the estimate based on its understanding of humans. For example, if it notices that humans tend to become much more uncertain when presented with some crucial consideration, it could realize that its estimate probably needs to be widened significantly.

Rohin's opinion: So far, this is an idea that hasn't been turned into a proposal yet, so it's hard to evaluate. The most obvious implementation (to me) would involve an explicit estimate of reward uncertainty, and then an explicit model for how to update that uncertainty (which would not be Bayes Rule, since that would narrow the uncertainty over time). At this point it's not clear to me why we're even using the expected utility formalism; it feels like adding epicycles in order to get a single particular behavior that breaks other things. You could also make the argument that there will be misspecification of the model of how to update the uncertainty. But again, this is just the most obvious completion of the idea; it's plausible that there's a different way of doing this that's better.

Parenting: Safe Reinforcement Learning from Human Input (Christopher Frye et al)

Interpretability

Attention is not Explanation (Sarthak Jain et al) (summarized by Richard): This paper explores the usefulness of attention weights in interpreting neural networks' performance on NLP tasks. The authors present two findings: firstly, that attention weights are only weakly correlated with other metrics of word importance; and secondly, that there often exist adversarially-generated attention weights which are totally different from the learned weights, but which still lead to the same outputs. They conclude that these results undermine the explanatory relevance of attention weights.

Richard's opinion: I like this type of investigation, but don't find their actual conclusions compelling. In particular, it doesn't matter whether "meaningless" adversarial attention weights can lead to the same classifications, as long as the ones actually learned by the system are interpretable. Also, the lack of correlation between attention weights and other methods could be explained either by attention weights being much worse than the other methods, or much better, or merely useful for different purposes.

The LogBarrier adversarial attack: making effective use of decision boundary information (Chris Finlay et al) (summarized by Dan H): Rather than maximizing the loss of a model given a perturbation budget, this paper minimizes the perturbation size subject to the constraint that the model misclassify the example. This misclassification constraint is enforced by adding a logarithmic barrier to the objective, which they prevent from causing a loss explosion through a few clever tricks. Their attack appears to be faster than the Carlini-Wagner attack.

Read more: The code is here.

Robustness

Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks (Mingchen Li et al.) (summarized by Dan H): Previous empirical papers have shown that finding ways to decrease training time greatly improves robustness to label corruptions, but to my knowledge this is the first theoretical treatment.

Other progress in AI   Deep learning

Measuring the Intrinsic Dimension of Objective Landscapes (Chunyuan Li et al) (summarized by Cody): This paper proposes and defines a quantity called "intrinsic dimension", a geometrically-informed metric of how many degrees of freedom are actually needed to train a given model on a given dataset. They calculate this by picking a set of random directions that span a subspace of dimension d, and taking gradient steps only along that lower-dimensional subspace. They consider the intrinsic dimension of a model and dataset to be the smallest value d at which performance reaches 90% of that of a normally trained baseline model on the dataset. The geometric intuition is that the dimensionality of parameter space can, by definition, be split into the intrinsic dimension and its codimension, the dimension of the solution set. In this framing, a higher solution-set dimension (and lower intrinsic dimension) corresponds to proportionally more of the search space containing reasonable solution points, and therefore a situation where a learning agent is more likely to find such a solution point. There are some interesting observations here that correspond with our intuitions about model trainability: on MNIST, the intrinsic dimension of a CNN is lower than that of a fully connected network, but if you randomize pixel locations, the CNN's intrinsic dimension shoots up above the fully connected network's, matching the intuition that CNNs are appropriate when their assumption of local structure holds.
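The subspace-training procedure can be sketched in a few lines. This is a toy illustration with a quadratic loss standing in for a network, not the paper's code: all trainable parameters live in a random d-dimensional subspace of the full D-dimensional parameter space.

```python
import numpy as np

rng = np.random.default_rng(0)
D, d = 100, 10

theta0 = rng.normal(size=D)      # fixed random initialization in full space
P = rng.normal(size=(D, d))      # fixed random projection onto the subspace
P /= np.linalg.norm(P, axis=0)   # normalize columns

target = rng.normal(size=D)      # "solution" the loss pulls toward

def loss(theta):
    # Simple quadratic loss standing in for the network's objective.
    return 0.5 * np.sum((theta - target) ** 2)

z = np.zeros(d)                  # the only trainable parameters (d of them)
lr = 0.1
for _ in range(500):
    theta = theta0 + P @ z
    grad_theta = theta - target  # gradient of the quadratic loss
    z -= lr * (P.T @ grad_theta) # chain rule: step only within the subspace

final = loss(theta0 + P @ z)
```

Because d < D, the loss typically cannot reach zero: only the component of the solution lying in the random subspace is recoverable, which is exactly why performance as a function of d reveals an "intrinsic dimension" of the problem.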

Cody's opinion: Overall, I find this an interesting and well-articulated paper, and am curious to see future work that addresses some of the extrapolations and claims implied by it, particularly the claim, surprising relative to my intuitions, that increasing n_parameters will, perhaps monotonically, reduce the difficulty of training, because it simply increases the dimensionality of the solution set. I'm also not sure how to feel about simply asserting that a solution exists once a network reaches 90% of baseline performance, since we may care about that "last mile" of performance, and it might also be the hardest to reach.

Discuss

### Street Epistemology. Practice session

Events at Kocherga - April 18, 2019 - 18:30
Tuesday, April 23, 16:30

### Episode 1 of "Tsuyoku Naritai!" (the 'becoming stronger' podcast/YT series).

LessWrong.com News - April 18, 2019 - 00:58
Published on April 17, 2019 9:58 PM UTC

The Kansas City Rationalists just had our first dojo meetup yesterday. It was a success as far as first meetings go; we had great attendance, including a couple non-rationalists. We are using the 'Hammertime' sequence as our content.

I am making a series of vlog posts/podcast episodes detailing my personal journey through the sequence, also for the benefit of others. I plan to expand it as I learn and experience more.

Discuss

LessWrong.com News - April 17, 2019 - 23:52
Published on April 17, 2019 8:52 PM UTC

I’m going to write a few posts on quant trading. Specifically trading crypto, since that’s what I know best. Here are a few reasons why I’m doing this:

• I think I can benefit a lot from writing about my approach and methodology. Hopefully this will make the ideas and assumptions more clear.
• I’d love to get input from other people in the community on their approaches to model building, data analysis, time series analysis, and trading.
• There’s been a lot of great content on this website, and I’d love to contribute. This is the topic I currently know best, so I might as well write about it.
• My company (Temple Capital) is also looking to hire quants and we believe the rationalist way of thinking is very conducive to successful quant trading.

My goal here isn’t to make you think that “Oh gosh, I can become a millionaire by trading crypto!” or “Here’s the strategy that nobody else has found!” Instead, I want to give you a taste of what quant trading looks like, and what thinking like a quant feels like. EAs have been talking about earning to give for a while, and it’s well known that quant trading is a very lucrative career. I’ve known about it for a while, and several of my friends have done quant (e.g. at Jane Street) or worked at a hedge fund. But, I never thought that it was something I could do or would find enjoyable. Turns out that I can! And it is!

I’m going to be sharing the code and sometimes the step by step thinking process. If you’re interested in learning this on a deeper level, definitely download the code and play with the data yourself. I’ve been doing this for just over a year, so in many ways I’m a novice myself. But the general approach I’ll be sharing has yielded good results, and it’s consistent with what other traders / hedge funds are doing.

Setup

Note: I actually haven’t gone through these install steps on a clean machine; I think they’re mostly sufficient. If you run into any issues, please post in the comments.

1. Make sure you have Python 3.6+ and pip
2. pip install pandas numpy scipy matplotlib ipython jupyter
3. git clone https://github.com/STOpandthink/temple-capital.git
4. cd temple-capital
5. jupyter notebook
6. Open blog1_simple_prediction_daily.ipynb

If you’re not familiar with the tools we’re using here, then the next section is for you.

Python, Pandas, Matplotlib, and Jupyter

We’re going to be writing Python code. Python has a lot of really good libraries for doing numerical computation and statistics. If you don’t know Python, but you know other programming languages, you can still probably follow along.

Pandas is an amazing, wonderful library for manipulating tabular data and time series. (It can do a lot more, but that’s primarily what we’re using it for.) We’re going to be using this library a lot, so if you’re interested in following along, I’d recommend spending at least 10 minutes learning the basics.

Matplotlib is a Python library for plotting and graphing. Sometimes it’s much easier to understand what’s going on with a strategy when you can see it visually.

Jupyter notebooks are useful for organizing and running snippets of code. They’re well integrated with Matplotlib, allowing us to show graphs right next to the code, and they’re good at displaying Pandas dataframes too. Overall, they’re perfect for quick prototyping.

There are a few things you should be aware of with Jupyter notebooks:

1. Just like running Python in an interactive shell mode, the state persists across all cells. So if you set the variable x in one cell, after you run it, it’ll be accessible in all other cells.
2. If you change any of the code outside of the notebook (like in notebook_utils.py), you have to restart the kernel and recompute all the cells. A neat trick to avoid doing this is:
import importlib
import notebook_utils
importlib.reload(notebook_utils)

Our first notebook

We’re not going to do anything fancy in the first notebook. I simply want to go over the data, how we’re simulating a trading strategy, and how we analyze its performance. This is a simplified version of the framework you might use to quickly backtest a strategy.

Cell 1

The first cell loads daily Bitcoin data from Bitmex. Each row is a “daily bar.” Each bar has the open_date (beginning of the day) and close_date (end of the day). The dataframe index is the same as the open_date. We have the high, low, and close prices. These are, respectively, the highest price traded in that bar, the lowest, and the last. In stock market data you usually have the open price as well, but since the crypto market is active 24/7, the open price is basically just the close price of the previous bar. volume_usd shows how much USD has been transacted. num_trades_in_bar is how many trades happened. This is the raw data we have to work with.

From that raw data we compute a few useful variables that we’ll need for basically any strategy: pct_change and price_change. pct_change is the percent change in price between the previous bar and this bar (e.g. 0.05 for +5%). price_change is the multiplicative factor, such that: new_price = old_price * price_change; additionally, if we had a long position, our portfolio would change as: new_portfolio_usd = old_portfolio_usd * price_change.
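These two columns fall out of the close price directly in Pandas (a minimal sketch with made-up prices, not the notebook’s actual loading code):

```python
import pandas as pd

# Hypothetical close prices for three daily bars.
df = pd.DataFrame({'close': [100.0, 105.0, 99.75]})

# pct_change: percent change from the previous bar's close (0.05 == +5%).
df['pct_change'] = df['close'].pct_change()

# price_change: the multiplicative factor, new_price = old_price * price_change.
df['price_change'] = df['pct_change'] + 1.0

# Holding a long position through the second bar compounds the portfolio
# by the same factor.
portfolio = 1000.0 * df['price_change'].iloc[1]
```

The first bar has no predecessor, so its pct_change is NaN; any real backtest has to handle that first row.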

A few terms you might not be familiar with:

• We take a long position when we want to profit from the price of an asset going up. So, generally, if the asset price goes up 5%, we make 5% on the money we invested.
• We take a short position when we want to profit from the price of an asset going down. So, generally, if the asset price goes down 5%, we make 5% on the money we invested.
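The two cases collapse into one formula if we encode a long position as +1 and a short as -1 (my toy framing; the notebook’s signal column works the same way):

```python
# The strategy's return on a bar is position * pct_change:
# +1 (long) profits from up moves, -1 (short) profits from down moves.
def strat_return(position, pct_change):
    return position * pct_change

long_gain = strat_return(+1, 0.05)    # long into a +5% day
short_gain = strat_return(-1, -0.05)  # short into a -5% day
short_loss = strat_return(-1, 0.05)   # short into a +5% day
```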
Cells 2 and 3

Here we see that indeed BTC recently crossed its 200 day SMA (Simple Moving Average). One neat thing I hadn’t realized myself is that the SMA looks like it has done a decent job of acting as support/resistance historically.
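For reference, computing the 200-day SMA and detecting a cross above it is short in Pandas (a sketch on a synthetic rising price series, not the actual Bitmex data):

```python
import pandas as pd

# Synthetic, steadily rising close prices: 300 daily bars.
df = pd.DataFrame({'close': [float(i) for i in range(1, 301)]})

# 200-day Simple Moving Average; NaN until 200 bars are available.
df['sma_200'] = df['close'].rolling(window=200).mean()

# A bar "crosses above" when the close moves from at-or-below the SMA
# (or from the NaN warm-up region) to above it.
above = df['close'] > df['sma_200']
crossed_up = above & ~above.shift(1, fill_value=False)
```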

Cells 4 and 5

Here we simulate a perfect strategy: it knows the future!

One thing to note is that the returns are not as smooth / linear as one might expect. This makes sense, since each daily bar has a different pct_change. Some days the price doesn’t move very much, so even if we guess it perfectly, we won’t make much money. It’s also interesting to note that there are whole periods where the bars are smaller / bigger than average. For example, even with perfect guessing, we don’t make much money in October of 2018.
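The perfect strategy is easy to mimic in miniature: let the signal be the sign of the same bar’s own move, so every bar is "guessed" correctly (a sketch of the idea, not the notebook’s exact code):

```python
import numpy as np
import pandas as pd

# Four daily bars; the third barely moves.
df = pd.DataFrame({'pct_change': [0.03, -0.02, 0.0005, -0.04]})

# Perfect foresight: signal equals the sign of that bar's own move.
df['strat_signal'] = np.sign(df['pct_change'])
df['strat_pct_change'] = df['strat_signal'] * df['pct_change']

# Returns compound multiplicatively; the tiny third bar adds almost
# nothing even though it was guessed perfectly.
total_return = (1.0 + df['strat_pct_change']).prod()
```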

Cell 6

Here we simulate what would have happened if we bought and held at the beginning of 2017 (first graph) versus if we had shorted.

Quick explanation of the computed statistics:

• Returns: multiplicative factor on our returns (e.g. 5.2 means a 420% gain, or turning $1 into $5.20)
• Returns after fees: multiplicative factor on our returns after accounting for the fees we would have paid on each transaction. (On Bitmex, each time you enter/leave a position you pay 0.075% in fees, assuming you’re placing a market order.)
• SR: Sharpe Ratio, a very common metric used to measure the performance of a strategy. “Usually, any Sharpe ratio greater than 1 is considered acceptable to good by investors. A ratio higher than 2 is rated as very good, and a ratio of 3 or higher is considered excellent.” (Source)
• % bars right: what percent of days we guessed correctly.
• % bars in the market: what percent of days we were trading rather than being out of the market. (It’s a bit misleading here because it’s shown as a fraction: 1.0 = 100%.)
• Bars count: number of days simulated
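As a reference point, the Sharpe ratio of a daily-return series is usually computed as mean over standard deviation, annualized; since crypto trades every day of the year, sqrt(365) is a common annualization factor (my sketch, assuming a risk-free rate of roughly zero):

```python
import numpy as np

def sharpe_ratio(daily_returns, periods_per_year=365):
    """Annualized Sharpe ratio, assuming a risk-free rate of ~0."""
    r = np.asarray(daily_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std()

# Five hypothetical daily returns.
sr = sharpe_ratio([0.01, -0.005, 0.007, 0.002, -0.001])
```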
Cell 7

There are more graphs in the notebook, but you get the idea.

I’m not going to discuss this particular strategy here. I just wanted to show something more interesting than constantly holding the same position.

Future information

One of the insidious bugs you can run into while working with time series is using future information. This happens when you make a trading decision using information you wouldn’t have access to if you were trading live. One of the easiest ways to avoid it is to do all the computation in a loop, where each iteration you’re given the data you have up until that point in time, and you have to compute the trading signal from that data. That way you simply don’t have access to future data. Unfortunately this method is pretty slow when you start working with more data or if there’s a lot of computation that needs to be done for each bar.

For this reason, we’ve structured our code so that to compute the signal for row N, you can use any information up to and including row N. The computed strat_signal will be used to trade the next day’s bar (N+1). (You can see the logic for this in add_performance_columns(): df['strat_pct_change'] = df['strat_signal'].shift(1) * df['pct_change'].) This way, as long as you’re using standard Pandas functions and not using shift(-number), you’ll likely be fine.
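Here is a tiny example of the shift(1) convention (hypothetical signal values, same column names as the notebook):

```python
import pandas as pd

df = pd.DataFrame({'pct_change': [0.02, -0.01, 0.03]})
# Signal computed on row N from data up to and including row N.
df['strat_signal'] = [1, -1, 1]

# The trade on bar N+1 uses the signal computed on bar N: shifting the
# signal down one row guarantees no future information leaks in.
df['strat_pct_change'] = df['strat_signal'].shift(1) * df['pct_change']

# The first bar has no prior signal, so its return is NaN (not traded);
# the second bar's return is yesterday's long signal times today's -1% move.
first_bar = df['strat_pct_change'].iloc[0]
second_bar = df['strat_pct_change'].iloc[1]
```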

That’s it for now!

Potential future topics:

• What overfitting is and how it impacts strategy research
• Filters (market regimes, entry/exit conditions)
• Common strategies (e.g. moving average crossover)
• Common indicators
• Using simple ML (e.g. Naive Bayes)
• Support / resistance
• Autocorrelation
• Multi-coin analysis

Questions for the community:

• Do you feel like you understand what’s going on so far, or should I move slower / zoom in on one of the prerequisites?
• What topics would you like me to explore?
• What strategies are you interested in trying?

Discuss

### Role playing game based on HPMOR in Moscow

LessWrong.com News - April 17, 2019 - 22:19
Published on April 17, 2019 7:19 PM UTC

We created an intellectual game in which a story-based magical detective mission is combined with rationality-based minigames.
In our game you can:
* discover the mysteries of past events and check the rumors about the rebirth of the Dark Lord,
* exchange magical items, learn new spells, and even use them on somebody,
* try to solve rationality-based puzzles and discover new information on success.

What does this game look like?

Each player has a description of their character and its goals. Players can move around the rooms, talk with other players, use items and spells, and find and solve puzzles to gather information and achieve their goals.
Several rooms are available, so players can talk both in public and in private.
Each player has a pouch that contains the game rules, their character description, hints and other knowledge, and also various cards: items, spells, energy, and Hogwarts points.

A more detailed announcement and description is here: https://goo.gl/WLU4p3

Discuss

### StrongerByScience: a rational strength training website

LessWrong.com News - April 17, 2019 - 21:12
Published on April 17, 2019 6:12 PM UTC

StrongerByScience is a website I have been using to inform my recently-renewed weightlifting habit. I think it will be of interest here because the writers are experts in a practice (strength training, particularly powerlifting), and their modus operandi is to do careful literature reviews of the latest strength research and use that to inform their practice.

I recommend all of the Guides on the Big Three page (now technically four), which cover the big three lifts from the biomechanics on up, as well as one on strength training. They have monthly reviews of the latest papers in the field. They occasionally do deep reviews of particular subjects (I am reading the metabolism article on and off now). They're a good source of references for other literature in the field.

Stuff you would normally expect to be unpleasant, but isn't: the mailing list is just the articles, not spam; the articles are written in a way similar to how posts are often written here, and are not particularly marketing-y; the articles and guides can mostly be downloaded as PDFs in addition to being read on the website.

The specific benefits I got came from the Strength Training Guide and from the article on high-volume training, which defined my current set structure. I have a new lifetime personal best in dumbbell press, and I am close to recovering my previous deadlift. I have not purchased any of the books or trainings, however.

Discuss

### No Safe AI and Creating Optionality

LessWrong.com News - April 17, 2019 - 17:08
Published on April 17, 2019 2:08 PM UTC

[I am working on some formal arguments about the possibility of safe AI, and what realistic alternatives might be. I wrote the following to get imaginative feedback before I continue my arguments. I strongly believe that before developing formal argumentation, it is very helpful to play around with imaginative stories so that possibilities that otherwise would not have been considered can surface. Stories are like epistemic lubricant.]

Within a span of 3 months two formal proofs were published showing:

1) There is no learning algorithm of complexity X or greater such that humans can prove there are no unknown nonlinear effects. And then…

2) Any learning algorithm which interacts with agents which are not itself cannot have rigorously bounded effects.

Most people understood these proofs, which were far more technical and caveated than these summaries, to mean, in shortest form, that no safe AI is even possible. Some pointed out that Google Maps qualified as “unsafe” by this definition. The reactions to the Google Maps example spawned two camps: one said that intuition and experience tell us Google Maps is safe, so there is no need to worry; the other said Google Maps is not safe, but the problems it causes are minimal and so we put up with them. This argument was just window dressing on much darker and more dire arguments happening all over the world.

Deep inside the hallways of American policy, Pentagon generals postulated that although these systems are not safe, we must build some and test them in foreign countries. “The only way through this labyrinth of technology was trial and error, and so the U.S. should be on the forefront of the trials.”

In China, Xi Jinping’s government reasserted control over all technology companies and began pulling all electronic systems unnecessary for party rule out of the hands of consumers and promoted a more natural world-oriented China.

Brussels took the most drastic steps. Three times they tried to pass legislation: one bill banning microprocessors, another banning certain classes of algorithms, a third banning private or public AI research and funding. None of these laws passed, but Poland and Finland left the EU over the controversy.

Far from the centers of power other stews were brewing. Large bands of people on the internet were arguing for the disestablishment of the internet. The NYT wondered whether more education could have prevented this discovery from being true. Cable television hosts pointed out that we didn’t have these problems before video streaming. An American nationalist movement advocated that the U.S. close herself off from the rest of the world as quickly as possible. Some religious groups pleaded with the Amish to teach them their ways, others prayed for dissolution of nation-states into city-states. Renewed interest in outdoorsmanship swept the developed world. However, no solution presented itself, just panic and mounting pressure on democratic governments to do something about the robot menace.

What are more options for No Safe AI?


### Reflections on the Mythic Invitational

LessWrong.com news - April 17, 2019 - 14:50
Published on April 17, 2019 11:50 AM UTC

Previously / Compare and Contrast To: Reflections on the 2017 Magic Online Championship

Previously: Speculations on Duo Standard

Compare To (Frank Karsten at Channel Fireball): The Mythic Invitational Wasn’t Perfect—And It Was Still a Smashing Success

And Remember, Guys, You Asked For It: The “And” of MTG Arena

Two years ago we were treated to a virtuoso performance on all fronts at the 2017 Magic Online Championship. All the players on the Sunday stage played the best Magic we’ve ever seen. The commentary drew us into exactly that which made the games, and the game of Magic in general, great. I called upon the game to bottle that lightning, and build upon it to create our future.

At the Invitational we experienced a very different digital tournament.

Some stuff was great. We made giant leaps forward in some areas.

Including viewers. We’re playing in a different league now.

We turned ourselves into a real e-sport! Woo-hoo!

Other areas, not so much. We mustn’t let the good distract from fixing the bad.

I won’t speak of the minor technical difficulties, as this is already far too long and they are doubtless being addressed. Frank Karsten mentions them in his article, along with good suggestions for the Arena client.

This is a case of ‘I should get this out there one way or another’ so I’m doing that. I hope it helps.

The Great

You can’t argue with a million dollars in prizes. Competitive Magic’s biggest flaw has always been the size of its prize pools. There’s still room to improve, but I think it is safe to say: Problem solved. This is no longer the lowest hanging fruit.

Physical production values were off the charts. Game play on Arena is much easier to follow even for invested veterans like me. For new and casual players, it’s a transformation. The game looks and feels exciting and fast paced.

No doubt the stage looked like we wanted it to look, and we had the commentary teams we wanted to have, no matter the expense.

Things felt instinctively like they were supposed to feel. High stakes, high tension, high drama. The money and stage did their jobs quite well.

Becca Scott, Brian Kibler and the other commentators were (someone please create the supercut so I can link to it here, and also watch it multiple times) very excited. That latest draw (again, supercut please) was huge. If Richard Hagon had been on site, I would have worried whether he could have survived.

Viewership followed. Before the final day we were already breaking 100,000 viewers. Arena stream viewer numbers are a different order of magnitude.

Magic is still super awesome. Magic reliably creates great moments, giant swings, complex key decisions, heroes and stories. There were some truly epic games on camera. We have great potential as a true spectator sport.

If we can keep improving the product, the sky is the limit.

We’re going to need to do that. A lot of things were less than optimal this weekend. We can’t let top line success distract us from that.

While Magic content is on screen, Esper torture sessions we’ll talk about later notwithstanding, your floor is high. The floor is even higher when both hands are visible. You’re watching Magic. That’s why we all tuned in.

The ideal broadcast contains other things. Deck features are great. Previews of match-ups give crucial context. Interviews can add a lot when done well. The story of the tournament and its players is worth telling. A few preview cards and promo cards add spice.

Magic is still where it is at. When in doubt, show more Magic.

More importantly, don’t give us dead air.

An interview is almost always good the first time it is shown. Each time it is repeated, it gets worse.

A cheesy player introduction is good fun the first time it is shown. Each time it is repeated, it gets worse.

Other features are similar. I want to see that deck analysis once. I definitely don’t want to see it five times. I don’t want to constantly be going over the same brackets and same match results.

Streams are about getting viewers who stick around all day. That means new content.

I lost track of how many times I saw the same short clips of pure fluff. I lost track of how long Kibler and the others sat around speculating about War of the Spark cards they’d never seen before, trying to get constructed hype up for well-designed limited cards. Presumably because they were on the screen with a high amount of zoom, and because of the awesome preview video with all the feels.

We had quick recaps of a few matches, where we could have had time-shifted matches.

The first two days were not as bad on these fronts. Saturday made it very easy for the mind to wander. On Sunday, most of the time there was nothing on the stream to see. Eventually we switched over to basketball and kept an eye in case a match started.

Then there were the constant ads, all in heavy rotation. It was a bit much.

At our would-be watch party, at best we were sort of watching.

More and Better Magic Games

There was zero need for things to be that way!

We could have easily filled all that time with quality Magic content.

Playing all games on Magic Arena means having full recordings, with hands, of every game of every match of the tournament.

Why not show them to us?

I realize things are marginally better when the players are in the feature match area in the big chairs. I don’t care. At all. This is a crazy, very not important, not-worth-worrying-about concern. It’s fine to start with the match of your choice. It’s not fine to be tied only to two matches that are selected in advance.

Once those two matches are done, if not sooner, we should choose the games and matches that are right for viewers. Then show them.

Before the first round, during breaks, during down time late in the day without alternate rounds, show matches from other rounds.

There are lots of good ways to choose matches. Here are some ideas.

Pick decks, match-ups and players we haven’t seen.

Pick games and matches that were exciting, or offer interesting decisions or whatever else you prefer, as suggested by the players or by a volunteer group watching the secondary matches. Or post all the matches on different streams, and then judge via feedback which ones were the best, and show those.

Pick games and matches of the right length. Next round starts in 23 minutes, so find a game or match that lasts 15-23 minutes and show that one.

Pick games and matches by players who did well. On Sunday morning we might show select games by the top 4 competitors from previous days.

Pick the match-up we’re about to see, and show previous games won by both sides, to illustrate how it might work.

Pick via the Twitch chat or a Twitter poll, if you’d like.

The important thing is, pick.
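The length-matching idea above reduces to a simple selection rule. Here is a minimal sketch; the `MatchRecording` fields and the excitement ratings are illustrative assumptions (e.g. supplied by the volunteer watch group suggested earlier), not anything from an actual coverage pipeline:

```python
from dataclasses import dataclass

@dataclass
class MatchRecording:
    players: tuple[str, str]
    minutes: int
    excitement: float  # e.g. a rating from a volunteer group watching secondary matches

def pick_filler(recordings, minutes_left, buffer=2):
    """Pick the most exciting recorded match that fits in the remaining
    round time, leaving a small buffer before the next round starts."""
    fits = [r for r in recordings if r.minutes <= minutes_left - buffer]
    if not fits:
        return None  # nothing fits; fall back to a deck feature or interview
    return max(fits, key=lambda r: r.excitement)
```

So with 23 minutes until the next round and a 2-minute buffer, any recording of 21 minutes or less is eligible, and the most exciting eligible one gets shown.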

In other tournaments we’ve evolved the technology of the time-shifted match. A match is recorded, then played back, often at accelerated speed, and commentary reacts in real time. This is awesome. With Arena, we can supercharge this.

We also could have chosen better feature matches.

There were a number of cool decks (and players) that were never on camera. Many of the early round matches we saw were echoed many times in later rounds. We should have spent the early rounds actively avoiding Esper players, and seeking out players with some spice.

More Other Content

That is not to say that we should uniquely rely upon Magic matches. Other content is also both welcome and abundantly available. Advantage should be taken.

As usual, before we innovate, we should take advantage of existing excellent technology. Look at what we’ve done in the past that worked. Then do that.

With a field full of streamers, both MPL players and other professionals, we have access to a wide variety of content creators who love Magic and want to increase their profile. We can use that.

Deck Analysis and Matchup Previews

I have loved it when players preview their own matchups and sideboarding, especially on Sunday stage. Ask for volunteers from the field. Put their deck up on screen, give them a microphone and a prompt, and let them talk.

It should be easy to get deck explanations from enough players to have one from each major archetype.

It should also be easy to get a top perspective on both sides of every matchup between popular decks, and every potential Sunday matchup.

Not every player will give a great preview. Some will, some won’t. Tape in advance. Take the ones that are good. Lose the ones that are bad.

One could argue, as players have in the past, that such features put the player at strategic disadvantage. This is a concern, but we’ve already told MPL players they must prepare for Mythic Championships in the open on their streams. That’s the same concern, but writ large, and everyone has accepted it.

Strategic Analysis

One of the best ways to get better at Magic is to go over games and do a postmortem. Why did the game play out that way? What could either player have done differently? Can we do a deep dive into decision points, and ask about all the factors going into what the right call is? With the benefit of not only hindsight but focus and time, one can go much deeper after the game than even the best player can go during the game.

I find such analysis fascinating.

Getting one or both players of a match into the booth, and having them watch a replay with the ability to pause and accelerate, and discuss and debate their decisions, seems super high value to me. So does simply watching a match on replay with one of the players as a commentator. Consider what is probably the best Grand Prix coverage of all time, which followed Reid Duke each round. Copy a lot of what was good about that, both in the previews discussed above and in analysis after the matches.

Going deep is admittedly difficult to square with the average experience level of the audience. We need to aim at viewers who are trying to learn what the rules are and what the cards do, or simply admire the pretty animations, in addition to those who would love going deep.

But you know what? That’s what makes Magic interesting to watch. I’ll quote directly from my older reflections:

I didn’t think of it at the time, but what Sunday reminded me of most was sitting back for a Mets game and listening to our world-class broadcast booth for a well-played, close game. Ron, Gary, and Keith aren’t afraid to share their opinions about anything, or to geek out or rant about little details, or to relax and tell you stories. Like Magic, baseball can be slow at times, pausing quite a bit between actions, and it suffers when it gets too slow. Also like Magic, if you are not interested in the details, strategy, and atmosphere of the game, it is boring. Those who go out to the ballpark and do not watch the game are skipping the game because it’s boring, but it’s boring because they are skipping it by not giving it the attention it deserves.

Thus, we need to strike a balance the same way professional sports broadcasts do.

If you’re looking for how to do that, watch a New York Mets baseball broadcast. It is chock full of esoteric knowledge and opinion, the tiny details of games that new fans will have zero idea about. Yet it is also accessible out of the box even if you know nothing about the game. It can be done.

Another example is to look at what ESPN does in the biggest college football games with its Megacast. On six different channels, the same game is presented six different ways. The coaches film room breaks down the game as its most central characters break it down when planning for their next match. The fan casts are highly partisan. The regular broadcast is there for those who want it. On different nights, I choose different options. It’s awe inspiring.

In Magic it will be harder. Some of the expert content will need to be gated and made distinct from the main broadcast, a la the film room of the megacast. We also likely could benefit from a ‘beginner’ broadcast. On the main broadcast, we’ll need to work hard to explain deep strategic thinking in ways new players can also follow.

What strategic analysis we did get seemed to be one line explanations for why matchups were lopsided in ways that they were not. But the players also made similar mistakes in many places, making it hard to find too much fault here.

Human Interest

It was cool to see interviews asking about players’ favorite experiences of the weekend, or how they were feeling going into a day’s action. Once.

Often it seemed like the process was to instruct the on-air talent to ask a one-line question, get a ten second answer, give a reaction that indicated how great that answer was, then move on. There was no tying to the broader picture, no following up, no relation to the game of Magic. Players were reduced to a single anecdote repeated over and over, clearly not rehearsed or selected.

If we’re going to intentionally sum up players with fifteen second clips of them saying what a great day it is for Magic, and show them lots of times, at least then they should know that this is their job, write up what they want to say, be coached on delivery, and deliver the goods. Do multiple takes if needed. This stuff does not come naturally. Then players can decide what persona to present, and will present better ones. This can still be combined with raw post-match reactions.

Better would be to do longer and more of them, so they could each be used more sparingly, and/or edited for the best parts, and they can go into more depth. If a Magic player wants to tell a long story – and they often do – there’s a good chance it’s a good story and I want to listen to it.

Mark Rosewater’s podcast on my good friend Brian David-Marshall highlighted how the stories of the players and the Pro Tour drive player engagement, and how important his pioneering of this angle was to coverage’s success. I agree completely. But what matters are deep stories. We want to know players over the course of many events, hear about the little things and the big arcs. A ten second set of stereotypes and tropes, or a catchphrase that wasn’t even well chosen, is not good characterization.

Result Reporting and Highlights

The weakest part of current traditional tournament coverage is when we are being updated on match results and how players are doing. We are read a list of how a bunch of players are doing, who won and who lost to whom. Sometimes I want to know, but it’s pure scoreboard watching.

With no matches left to cover, it makes sense to use this to fill remaining round time, but if one wants to know how things are going for more than a small number of matches, a broadcast is a terrible method of transmitting that information.

If anything, reporting lots of results directly onto the stream is to me a detriment, because it is a spoiler. This prevents watching rounds out of order. Which would matter even more if we had better rebroadcasts.

Website Ho!

You know what’s great at reporting results? A web page that one can click on. Wizards used to be good at this. Can we bring this back? Please? We should have all the public information available in easy to access form on the web. All decklists, all results, should be easy to find. Somehow we have fallen so far away from this ideal that the stream becomes the entirety of the coverage, but that makes no sense.

It especially makes no sense for results and things like decklists and deck analysis, but it also makes no sense for the games themselves. Why can’t we watch any Invitational game we want, right now, on demand? Seriously. Why not?

Want to watch your favorite player’s rounds in order? We got you.

Want to view your own matches and maybe create a commentary track or companion article for them, or just analyze them in detail? All of which I would totally do all the time? We got you.

Want to watch all the copies of your favorite deck or match up? We got you.

Want to go around clipping highlights for an awesome YouTube video? We got you, too.

Want to create a full matrix of how any element impacted win percentage, from number of lands in the opening hand to dead cards in the matchup to which spells are worth countering? A true deep dive? We’re all over that.

And so on. The possibilities are endless.

The other thing one might want to do is experience the tournament without spoilers.

SFSN: The Spoiler-Free Sports Network

This is one of those startup ideas I am way too busy for and I know ideas are worth nothing, but I do hope someone creates it someday. I plan a full proposal write-up at some point.

In the meantime, we can start with Magic. We should have a place where one can view matches with a bar that determines when ‘now’ is and what parallel things of which we want or don’t want to be made aware. Then we can journey through what happened at our own pace, without worrying that we will be spoiled.

Another concrete suggestion is that there needs to be dead time at the end of all match/round/day videos, enough so that we can’t infer from the length of the video what happened in the match. Thus, if a round goes less than an hour, the video is the same length as if it went to time. If showing a non-timed round, go up to the reasonable maximum one could have expected. When I say ‘dead’ time it can literally be static or a fixed screen saying ‘thanks for watching!’ if we’d like. Alternatively, we can use that time for post-match analysis or information, or to show another match of the appropriate length, as we prefer.
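The padding rule is simple arithmetic. A minimal sketch, with the 90-minute cap for untimed rounds chosen purely for illustration:

```python
def published_length(actual, round_limit=None, untimed_cap=90):
    """Length in minutes at which to publish a match video, so that the
    duration never leaks the result. Timed rounds are all padded to the
    full round length; untimed rounds get a fixed cap (illustrative
    assumption) unless they ran even longer than that."""
    if round_limit is not None:
        return round_limit
    return max(actual, untimed_cap)

def padding_needed(actual, round_limit=None, untimed_cap=90):
    """Minutes of 'thanks for watching!' screen, or of post-match
    analysis or another match, to append to the recording."""
    return published_length(actual, round_limit, untimed_cap) - actual
```

A 42-minute match in a 60-minute round gets 18 minutes of padding; every video in that round comes out at exactly 60 minutes, so length tells the viewer nothing.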

This one will come with time. There were a lot of first time jitters. That makes sense. We had new announcers and announcing teams, high-stakes Arena games, a giant stage and an epic prize pool all for the first time. Both players and the coverage team were in awe of the moment.

Next time, that will be a lot better. A few years from now, challengers will have the issue, but most competitors and the whole coverage team will be old veterans of this new level.

There’s no nice way to say this. I won’t name any names, but we need to not pretend that what happened here didn’t happen.

A huge portion of the coverage team, including some of those doing commentary, had no idea, or not much of an idea, what was going on in the games.

They were all super excited. Which is great. It’s not enough. The team needs to be on the ball.

If you’re going to do interviews or man the reporting desk, you don’t need to know as much as the color commentator. The color commentator doesn’t need to know as much as the play by play. The play by play commentators need not study enough to compete. There still remains a minimum level that one has to meet to do a good job. More than that is great, and it will show, but you need to know the game, know the players, know the format and its major cards, decks and match-ups. And know what your role is, and how to execute on it. Be a professional.

If you do find yourself behind the camera, and you have no idea what is happening in the game, or what the right play is, just admit that. Ask the play by play announcer with more experience. If you’re the play by play, point out that the situation is complicated and hard and how cool that is. You can’t wait to see what the players do with this tough decision. Nothing wrong with that.

I hate it when definitive strategic statements are made, over and over again, criticizing the players, that I know to be wrong, as the claimant doubles and triples down. When they could be thinking more about what was actually going on.

We also saw tons of excitement-based commentary during games, talking about how huge swings and moments were (that often were quite the opposite), but not explaining the interesting things about the game at all. Hopefully this was not on purpose, but only a side effect of not being fully prepared with knowledge or for the faster pace of play that comes with Arena.

This is totally, totally not about going after any particular person. This is not, repeat not, anyone on the team’s fault (although if they come back still not ready, that would be different, and if it was bad enough I’d start naming names).

It is the fault of the people who put together the teams for not checking, not making the right preparations.

If that was growing pains and trying people out, totally fine. But it can’t happen again. Not like this.

Life Total Tiebreak and Double Elimination

Everyone knows this is a horrible, no good, very bad solution to matches going to time. It always has been. When Gerry Thompson was forced to concede to Wyatt Darby in a won position in game three, it stung. Later, when I Googled for ‘Mythic Invitational’ the two highlighted results were something non-flattering that we won’t discuss here, and an article criticizing deciding games on the basis of life totals.

This was far from the worst case scenario. Fast Arena games kept most matches from going to time even with Esper mirrors. There was no visible stalling or foul play anywhere. The match we saw that featured a life total tiebreak gave both players the chance to play the game with the tiebreak in mind, and Gerry could have conceded a previous game much faster to save the time needed to win.

So all in all, we got off very light. Next time we might not be so lucky.

The obvious offender is double elimination. In addition to its hyper randomness, double elimination forces the elimination of draws. Without draws, we need some sort of tiebreaker.

In exchange, we get excitement and easy to understand brackets. I don’t love it, but I understand and accept the need for it. If we could do longer and more skill intensive matches it would be far less painful.

What else can we do if we’re committed to elimination brackets?

Our choices are to eliminate time limits, have both players lose, or to choose a better tiebreaker.

Both players losing isn’t viable in context.

Eliminating time limits interferes with the tournament schedule. So do other things, and there were lots of fast-finishing rounds that resulted in lots of dead air, so one could plan to make up the time elsewhere. It would almost certainly be fine, and you could reserve the right to call the match if needed. But alas, that is almost certainly not good enough.

So we need a better tiebreaker. It isn’t obvious one is available, or even possible. Life totals at least have the advantage of being law. There’s a number, make yours higher, everyone knows the rule and can choose plays and decks accordingly. Ugly sometimes, but gets the job done.

What I totally don’t want is a subjective judgment call. Even if it is used sparingly, judges and staff need to not be put into that position. It’s not fair to them, it’s not fair to the players or to the tournament. Eventually they’re going to get one wrong, or there’s going to be one so muddled it’s a giant train wreck, and once you have the option to intervene, it’s your call and no amount of punting will change that.

Chess clocks are also an option. Magic Online matches won’t go to life totals, because one player will lose to time. We could in theory add those clocks into Magic Arena for matches without a turn time limit, and choose a limit such that the round must end on time.

In theory, we could have the clocks count up, and say that whoever’s total time used was lower wins the match if it goes to time. But that runs into the problem of ‘player ahead on time now knows they should stall because they have a big enough time lead’ and I don’t see a solution to that. All the solutions I can think of reintroduce all the disadvantages of counting down, so you might as well count down.

Unless we go the chess clock route, I think we’re mostly stuck. We should use 60 minute rounds whenever possible and be ruthless about slow play and especially stalling.

One partial solution is to present the sudden death rule up front, and treat it as a good, exciting thing. Sudden death! Fans hate sudden death overtime in sports for its randomness, but they also love sudden death overtime. It’s exciting and action packed.

Another solution of course is to not have so many damn Esper decks, which combined with Arena’s speed should solve the problem almost all the time anyway.

Player Skill

Player skill on display at the Mythic Invitational was far lower than at the Mythic Championship or similar past events, let alone the above-referenced Magic Online Championship. What happened?

Several things.

First, the format forced players to bring a diversity of decks, forcing them to play styles they were uncomfortable with. There are certainly advantages to forcing players to be well-rounded, but this is a price you pay.

Second, we invited competitors in ways that didn’t test for the skills they would use to compete.

The grind into the Mythic Top 8 was a killer on the community’s stamina and is thankfully not being repeated. Another aspect was that it rewarded players who knew one deck inside and out and could grind out wins consistently and quickly versus non-top opposition. That’s a very different skill.

Then we invited a lot of streamers. Streamers specialize in a different skill, which wasn’t on display. I wish it had been on display. If you’re going to invite streamers, use them also for what they do best! Tactics with multiple decks was not their forte, and often it showed.

Third, Arena seems like it instinctively rushes players. Even with nothing mechanically forcing quick moves, everything is geared towards goading players into playing faster. It worked, but this caused a decrease in quality of play. All the players who did well played a ton on Arena and were deeply comfortable with the program, but also likely didn’t properly adjust to a regular form of time control.

Fourth, there was no good testing ground for Duo Standard. Thus, when players tried to do things that didn’t make sense in regular best of one, they were out on limbs. When they wanted to test the matchups they would actually face, instead they faced best of one fields that had very different deck distributions.

Fifth, the double elimination format and the Duo Standard format did not give enough room for those playing better to triumph over those playing worse. A lot of Magic’s best were out early.

Sixth, players lost their nerve due to the stakes and setting, as mentioned above. This showed especially in the choice of decks for game three. No one dared be bold. It felt like a lot of players planned to make bold choices for game three, then couldn’t pull the trigger.

We Tried Duo Standard, Now Try Something Else

A lot of the problems were due to Duo Standard.

Thinking about Duo Standard is a fascinating exercise in game theory. My speculations were a lot of hit and also a bunch of miss; in another article I’ll go over what happened, and what I think explains the differences, and what we’ve now learned.

Also what I think the players did wrong.

Without stepping too much on that much deeper article’s toes, I think it is safe to say that we tried Duo Standard and found it wanting.

Duo Standard resulted in less deck diversity in terms of general archetypes. Where we did see a new deck, it was because of Mastermind’s Acquisition.

Duo Standard resulted in less diversity within each deck, even within the main deck build. Without the ability to sideboard out poor cards, or fix problems, players universally opted for ‘safe’ configurations.

Duo Standard took away sideboarding, which has the most strategic depth of any portion of the game, and also creates a lot of the diversity of experience since different players pursue different strategies even with identical decklists and sideboards to work with.

Stop trying to kill sideboarding. Seriously. Stop trying to kill sideboarding. The game is a shadow of itself without sideboarding. It turns into an endless grind if one isn’t careful. Sideboards make us think about every detail of our opponents’ deck, how they think, what they anticipated, how they think about us, what they expect, how they might plan for a later game. It makes the game rich.

That doesn’t mean no best of one queue. Players need to start somewhere. Sometimes we want to try out a new thing or get in a quick game.

But seriously. Stop It.

Duo Standard also often results in situations where if the flip on which deck plays which goes one way, one player gets two great matchups, and if the flip goes the other way, the other player gets two great matchups.

Then for game three, you have a pure guessing game. Not what we had in mind.

Coin flipping was the order of the day for many distinct reasons.

Combine that with double elimination and an established format full of aggressive decks, and it’s no surprise that skill testing was at an all-time low.

Good news! Wizards has already announced that they have recognized that Duo Standard did not do what they needed it to, and will continue experimenting.

The problem is that the phrase ‘closely resemble your at-home play experience’ seems to be code for ‘no sideboards.’

This misguided goal may doom us all.

To be blunt, it’s a stupid goal. Professional and advanced play of games often involves additional twists that don’t make sense at home. People understand.

Does your little league game use a bullpen? Should MLB stop using one because you don’t? Does your touch football game not use distinct offensive and defensive players? Should the NFL fix this?

Of course not. That would be insane.

Rule of Law

This is another point on my list of things to write about extensively and carefully, and ties into my recent posts on Privacy and Blackmail. It is almost impossible to overestimate the value of true rule of law.

Rule of law opposes rule of man. It says that we choose rules, then we follow those rules. It says that the record reflects what happened, that rewards and punishments are not chosen based on politics and alliances, or who placated or served the powerful.

Without rule of law, power, wealth and survival come from politics. One’s prime directive is to make alliances, sell one’s self and serve powerful interests in hopes of reward. Such systems make communication impossible, invoking the Snafu principle.

What is unique and great about games? Games are the avatars of rule of law. You have a closed system with fixed rules. The rules cannot be broken. Those who navigate those rules best, win. Even as Magic’s rules and cards change, the winners are those who play the best. We give the power to the players.

I was playing games to get away from power and politics long before I knew that was what I was doing. I believe the same is true of many others.

When Wizards fails to communicate their plans and set clear rules, this causes two huge problems.

The first is that players cannot plan their lives. They don’t know what is being asked of them, or what is being offered to them, or what they must accomplish.

The second is that we weaken rule of law. When decisions are made without prior formulas, it is impossible to not worry that the fingers on the scale are (at least in part) choosing in order to get the results that they want. To invite the players they want, and not the ones they don’t. To give advantage where it would help them.

Thus, instead of focusing on winning at Magic, we are forced to focus on doing what we think will please the Powers That Be. Feedback becomes unreliable. Players praising Wizards and the game, whether or not #Sponsored, are hard to trust. Players must constantly think about what would look good, what would be popular, what would get their streaming numbers up or convince key decision makers.

This happens even if the decisions are in fact being made without considering these things.

The key thing about power is avoiding it. That is hard. Power creates more power by default. Poor is the man whose pleasures depend on the permission of another.

It is great when Wizards says well in advance, we will take players from Arena according to this formula (the new formula of an event among the top 1000 is far superior to an exhausting ladder grind, so kudos for fixing that right away), or the winners of these qualifiers, or those who score highly on these point systems. Some systems are better than others, but having a system at all is the most important thing!

I would implore Wizards to enshrine as much rule of law as they possibly can, while still getting the things they need. I’m totally fine with setting aside slots (in Invitationals, or even in the MPL) for big streamers or other game ambassadors. Things other than play skill matter. It’s true.

But (in addition to being inherently political themselves) success in such tasks is very hard to quantify, and the temptation is not to quantify it at all. We should be as quantified and objective as possible in choosing which streamers, in a way that streamers can know in advance. Even more importantly, we should draw a distinct line between where being a good player versus being a good ambassador (versus a combination of both) is what we are judging.

Same with other choices.

I’m even totally fine with reserving a few slots (again, even in the MPL) for ‘Wizards’ choice’ and making it explicit that those slots are based on politics. There are big benefits, and this is big business. But let us isolate that, lest we lose that which is most precious.

Conclusion

Let's boil down and summarize what needs to happen to make events like this great.

1. Minimize dead air. Minimize repetition. Let us watch as much Magic as possible.
2. Make all games available both during the stream and on the website.
3. The website needs to provide the information we want, and also allow watching matches without exposing us to spoilers.
4. Ensure commentators know the game, players, cards and format.
5. Keep focus on deep stories and deep strategy whenever possible.
6. Do longer and deeper interviews and round analysis with players.
7. Keep focus off of coin flipping and repeating how huge and exciting things are.
8. Choose matches with more deck diversity.
9. Keep experimenting with formats, but accept that Duo Standard is a failure. Stop trying to kill off sideboards.
10. Maintain rule of law.

Discuss

LessWrong.com News - April 17, 2019 - 13:14
Published on April 17, 2019 10:14 AM UTC

A well-known brainteaser asks about the truth of the statement "this statement is false". My previous article on this topic outlined common approaches to this problem and then argued that we should conceive of two distinct kinds of truth:

• Statements about the world, where, as per Tarski, there is a natural interpretation: "Snow is white" is true if and only if snow is white
• Logical/Mathematical statements, where the notion of truth is constructed to give us a convenient way of talking about our rules of inference within a particular system that normally excludes self-referential statements

I should add that some statements can only be defined using a combined notion of truth, i.e. "The car is red and 1+1=2".

My point was that if we choose to extend logical/mathematical statements outside of their usual bounds, we shouldn't be surprised that the notion breaks down, and that if we choose to patch it, there will be multiple possible ways of doing so.

Patching with INFINITE-LOOP

So let's consider how we might attempt to patch it. Suppose we follow the Formalists (ht CousinIt) and insist that "true" and "false" are only applied to sentences that can be evaluated by running a finite computation process. Let's add a third possible "truth" value: INFINITE-LOOP.

Consider the following sentence:

The truth value of this sentence is not INFINITE-LOOP

This seems to be a contradiction because the sentence is infinitely recursive, but at the same time denies this.

In order to understand what is happening, we need to make our algorithm for assigning truth values more explicit:
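The algorithm can be sketched as follows: keep expanding references, and if the expansion never bottoms out, assign INFINITE-LOOP regardless of what the sentence asserts. Here is a minimal runnable sketch of that idea; the sentence representation and the depth cutoff are my own assumptions, not the post's.

```python
# A minimal sketch of the expansion-based truth-assignment algorithm.
# The nested-tuple encoding and depth cutoff are illustrative assumptions.

INFINITE_LOOP = "INFINITE-LOOP"

def evaluate(sentence, depth=0, max_depth=100):
    """Evaluate a nested-tuple sentence:
    ("atom", bool)  -- a directly checkable claim
    ("not", s)      -- negation
    ("ref", thunk)  -- a reference that expands to another sentence
    """
    if depth > max_depth:
        # Expansion never bottomed out: assign INFINITE-LOOP,
        # with no regard to what the sentence asserts.
        return INFINITE_LOOP
    kind = sentence[0]
    if kind == "atom":
        return sentence[1]
    if kind == "not":
        inner = evaluate(sentence[1], depth + 1, max_depth)
        return INFINITE_LOOP if inner == INFINITE_LOOP else not inner
    if kind == "ref":
        return evaluate(sentence[1](), depth + 1, max_depth)
    raise ValueError(kind)

# "This sentence is false" expands to the negation of itself, forever:
liar = ("not", ("ref", lambda: liar))
print(evaluate(liar))  # INFINITE-LOOP
```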

What we see here is that if the sentence cannot be expanded without ending up in an infinite loop, it is assigned the truth value INFINITE-LOOP without any regard to what the sentence asserts. So there isn't actually an inconsistency; at most, this system for assigning truth values just isn't behaving how we'd want.

In fact consider the following:

A: This sentence is false
B: Sentence A has a truth value of INFINITE-LOOP

According to the above algorithm, assigning INFINITE-LOOP to B is correct, when it seems like it should be TRUE. Further, this system assigns INFINITE-LOOP to:

1+1=2 or this sentence is false

when perhaps it'd be better to assign it a value of TRUE.

Patching with an oracle

Being able to talk about whether or not sentences end up in an infinite loop seems useful. So we can imagine that we have a proof oracle that can determine whether a sentence will end up in a loop or not.

for reference in sentence:
    if oracle returns INFINITE-LOOP:
        evaluate the clause given the value INFINITE-LOOP as the truth value of the reference
    else:
        expand normally
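Translating that pseudocode into a runnable sketch: the sentence encoding and the toy budget-based "oracle" below are my own assumptions, not the post's, but they reproduce the intended behavior, where sentence A gets INFINITE-LOOP while a sentence asserting that fact evaluates to TRUE.

```python
# A runnable sketch of the oracle patch. The encoding and the budget-based
# toy "oracle" are illustrative assumptions.

INFINITE_LOOP = "INFINITE-LOOP"

def loops_forever(sentence, budget=100):
    """Toy oracle: treat any expansion deeper than the budget as looping."""
    if budget == 0:
        return True
    kind = sentence[0]
    if kind == "atom":
        return False
    if kind in ("not", "is_loop"):
        return loops_forever(sentence[1], budget - 1)
    if kind == "ref":
        return loops_forever(sentence[1](), budget - 1)
    raise ValueError(kind)

def evaluate(sentence):
    kind = sentence[0]
    if kind == "atom":
        return sentence[1]
    if kind == "is_loop":  # "X has a truth value of INFINITE-LOOP"
        return loops_forever(sentence[1])
    if kind == "not":
        inner = evaluate(sentence[1])
        return INFINITE_LOOP if inner == INFINITE_LOOP else not inner
    if kind == "ref":
        inner = sentence[1]()
        if loops_forever(inner):  # consult the oracle before expanding
            return INFINITE_LOOP
        return evaluate(inner)
    raise ValueError(kind)

# A: "This sentence is false"; B: "Sentence A has a truth value of INFINITE-LOOP"
A = ("not", ("ref", lambda: A))
B = ("is_loop", A)
print(evaluate(A))  # INFINITE-LOOP
print(evaluate(B))  # True
```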

However, our oracle still doesn't demystify:

The truth value of this sentence is not INFINITE-LOOP

As our algorithm would replace the first clause with INFINITE-LOOP and hence evaluate

INFINITE-LOOP is not INFINITE-LOOP

to FALSE. But then:

FALSE is not INFINITE-LOOP

so we would expect it to also be TRUE.

So perhaps we should define our oracle to only work with sentences that don't contain references to INFINITE-LOOP. Consider the following situation:

A: This sentence is false
B: Sentence A has a truth-value of INFINITE-LOOP
C: Sentence B is true
D: Sentence C is true

B would be TRUE (even though it refers to INFINITE-LOOP, the oracle only has to work with the reference "Sentence A"). However, C would be undefined.

We could fix this by allowing the oracle to return TERMINATES for sentences that can be evaluated after one level of expansion under our initial definition of the oracle. We can then allow sentence D to be true by letting the oracle return TERMINATES for any sentence that can be evaluated after two levels of expansion, and we can extend this definition recursively to any finite number of levels.

This also resolves cases like:

1+1=2 or this sentence is false

The second clause evaluates to INFINITE-LOOP, and since this is a truth value rather than an actual infinite loop, (TRUE OR INFINITE-LOOP) should give TRUE.
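This behaves like a Kleene-style three-valued OR. A tiny sketch of that rule (my own construction, not notation from the post): TRUE wins if either side is TRUE, FALSE requires both sides to be FALSE, and everything else is INFINITE-LOOP.

```python
# Three-valued OR over True, False, and INFINITE-LOOP (illustrative sketch).

INFINITE_LOOP = "INFINITE-LOOP"

def or3(a, b):
    """Kleene-style disjunction with INFINITE-LOOP as the third value."""
    if a is True or b is True:
        return True
    if a is False and b is False:
        return False
    return INFINITE_LOOP

# "1+1=2 or this sentence is false": the left clause is TRUE, the right
# clause carries the value INFINITE-LOOP, so the whole disjunction is TRUE.
print(or3(1 + 1 == 2, INFINITE_LOOP))  # True
```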

Patching with ORACLE-LOOP

We still haven't figured out how to handle cases like:

The truth value of this sentence is not INFINITE-LOOP

I would suggest that we repeat our first move and say that the truth value is ORACLE-LOOP whenever the oracle fails to resolve it (even if we expand the sentence an infinite number of times, we still end up with a sentence containing INFINITE-LOOP). We can then stack meta-levels and further meta-levels on top of this.

Final Note

I'll finish by noting that we could also define another notion of truth under which a statement is true when it has a single fixed point. This would result in statements like:

This sentence is true

being set to TRUE instead of INFINITE-LOOP.

In any case, the way that we extend the concept of truth to apply to these degenerate cases is purely a matter of what we find convenient.

Discuss

### "Hogwarts and Rational Thinking" in Moscow on May 25

News - April 17, 2019 - 03:00
On May 25 at 15:00, the quest game "Hogwarts and Rational Thinking" will take place at the Kocherga anti-café (Moscow, B. Dorogomilovskaya St., 5k2). It is a role-playing game in the "live quest" style, without costumes, designed for 6-30 people. Event time: 15:00-19:00 (the game itself runs 3 hours). Cost: 900 RUB. Cookies, tea, and time in the anti-café are included. Payment at the Kocherga reception desk; no advance registration required. Especially for fans of intellectual games, we have combined a story-driven "magical detective" mission with participation in mini-games based on rational thinking.

### Open Problems in Archipelago

LessWrong.com News - April 17, 2019 - 01:57
Published on April 16, 2019 10:57 PM UTC

Over a year ago, I wrote about Public Archipelago and why it seemed important for LessWrong. Since then, nothing much has come of that. It seemed important to acknowledge that. There are more things I think are worth trying, but I've updated a bit that maybe the problem is intractable or my frame on it might be wrong.

The core problems Public Archipelago was aiming to solve are:

• By default, public spaces and discussions force conversation and progress to happen at the lowest common denominator.
• This results in a default of high-effort projects happening in private, where it is harder for others to learn from.
• The people doing high-effort projects have lots of internal context, which is hard to communicate and get people up to speed on in a public setting. But internally, they can talk easily about it. So that ends up being what they do by default.
• Longterm, this kills the engine by which intellectual growth happens. It's what killed old LessWrong – all the interesting projects were happening in private, (usually in-person) spaces, and that meant that:
• newcomers couldn't latch onto them and learn about them incidentally
• at least some important concepts didn't enter the intellectual commons, where they could actually be critiqued or built upon

The solution was a world of spaces that were public, but with barriers to entry, and/or the ability to kick people out. So people could easily have the high-context conversations that they wanted, but newcomers could slowly orient around those conversations, and others could either critique those ideas in their own posts, or build off them.

Since last year, very few of my hopes have materialized.

(I think LessWrong in general has done okay, but not great; Public-Archipelago-esque things in particular have not happened, and there has been interesting discussion happening in private areas that not everyone is privy to.)

I think the only thing that came close is some discussion on AI Alignment topics, which benefited from being technical enough to automatically have a barrier to entry, and created a discussion shaped in such a way that it was harder to drag it into Overton Window Fights.

The core problem is that maintaining a high-context space requires a collection of skills that few people have, and even if they do, it requires effort to maintain.

The moderation tools we built last year still require a lot of active effort on the part of individual users; that effort is kinda intrinsically aversive (telling people to go away is a hard skill and comes with social risks); and it also requires people to have ideas interesting enough in the first place to build a high-context conversation around.

The current implementation requires all three of those skills in a single person.

There are a few alternate implementations that could work, but they require a fair amount of dev work, and meanwhile we have other projects that seem higher priority. Some examples:

• People have asked for subreddits for a while. Before we build that, we want to make sure that they're designed such that good ideas are expected to "bubble up" to the top of LessWrong, rather than stay in nested filters forever.
• Opt-in rather than opt-out moderation (i.e. people might have a list of collaborators, and only collaborators can comment on their posts, rather than a banned list). This is basically what FB and Google Docs do.
• I had some vague ideas for "freelance moderators". We give authors with 2000 karma the ability to delete comments and ban users, but this is rarely used, because it requires someone who is both willing to moderate and able to write well. Splitting those into two separate roles could be useful.

I'm most optimistic about the second option.

I think subreddits are going to be a useful tool that I expect to build sooner or later, but they won't accomplish most-of-the-thing. Most of what I'm excited about are not subreddits by topic, but highly-context-driven conversations with some nuanced flavor that doesn't neatly map to the sort of topics that subreddits tend to have. Plus, subreddits still mean someone has to do the work of policing the border, which is the biggest pain point of the entire process.

If I were to try the second option and it still didn't result in the kinds of outcomes I'm looking for, I'd update away from Public Archipelago being a viable frame for intellectual discourse.

(I do think the second option still requires a bit of effort to get right – it's important that the process be seamless and easy and a salient option to people. And thus, it'll probably still be a while before I'd have the bandwidth to push for it)

Discuss

### Robin Hanson on Simple, Evidence Backed Models

LessWrong.com News - April 17, 2019 - 01:22
Published on April 16, 2019 10:22 PM UTC

This is a link to a Hanson post that is primarily about tax returns and whether they should be public. It was an interesting foray into the considerations that bear on that topic.

But I was particularly interested in the opening paragraph, which had a useful lens:

Our simplest model of an economy is: supply and demand. This model has many simple implications for policy. Now we know of many much more complicated economic models, which often have quite different policy implications. But often we are not sure which more complex models actually apply well to any given situation. So we have to worry that people favor more complex models mainly to justify their preferred policies. Knowing this pushes me toward recommending the policies implied by supply and demand, unless I see unusually clear evidence to support a different economic model. (FYI, the evidence that fixed costs exist seems plenty clear, so I really mean supply & demand with fixed costs.)

In our simplest models of information, people are better off when they have more information, and also when information is distributed more symmetrically. There is a vast world of much more complicated models, but it is often hard to tell which more complex models apply well to given situations, and many probably favor particular models to justify preferred policies. So as with supply and demand, this uncertainty pushes me to favor the simplest info models, and their policy recommendations favoring more info and more symmetric info.

Discuss

### Conditional revealed preference

LessWrong.com News - April 16, 2019 - 22:16

### Complex value & situational awareness

LessWrong.com News - April 16, 2019 - 21:46
Published on April 16, 2019 6:46 PM UTC

Epistemic status: theorizing.

Here are two types of activity that (a) I genuinely enjoy and (b) seem quite useful:

1. Adding complex value
2. Maintaining situational awareness

Complex value

What does "adding complex value" mean?

It means all the efforts (often small, often done at the margin) that are difficult to automate / formalize, and are (in aggregate) crucial for pulling a project together.

Complex value is the grease that helps all the machine's cogs run together.

Examples:

• Establishing new linkages in the social graph by making introductions
• Reviewing & giving feedback on drafts of writing, pre-publication
• Reading & commenting on writing, post-publication
• Having new ideas about things that would be good to do (especially things that would be good to do on the margin; big new ideas can be turned into standalone projects or companies)
• Helping refine the pitch for a new idea; understanding and articulating the bear & bull cases for the idea
• Pitching good new ideas to relevant people that are plausibly interested
Situational awareness

What does "maintaining situational awareness" mean?

It's all the reading & conversations that are undertaken to learn what's happening in the world, to keep your world-model up to date with both social reality & objective, physical reality.

Maintaining situational awareness dovetails nicely with adding complex value – the better your situational awareness, the more opportunities for adding complex value you'll see.

Examples:

• Lurking on twitter (especially with a well-curated feed)
• Using various other social media (though the signal:noise ratio of other social media tends to be far worse than that of well-curated twitter)
• Reading company & project slacks
• Semi-formal "update" conversations with other actors in project domains you care about
• Informal conversations with friends who happen to work in project domains you care about
• Attending conferences
• Gossip

Note that very different information sets flow through formal & informal networks. These sets tend to be complementary, so it seems important to be tapped into both.

Note also that situational awareness seems distinct from "learning about a subject." Probably the distinction cleaves on where most of the learning occurs – situational awareness focuses its learning on social reality ("who thinks what about who/what?"), whereas the locus of learning about subjects tends to be in physical reality ("how does this part of physical reality work?").

Stereotypical city for situational awareness: DC
Stereotypical city for learning about subjects: SF

Unfortunately, though both adding complex value & maintaining situational awareness are high-value, it's hard to earn a living by making them your main focus.

It is possible to do this, e.g. one way of understanding the original pitch for GiveWell is "create an institution in philanthropy that will aggregate explicit & implicit information sets, remain at the frontier of situational awareness, and identify leveraged opportunities for adding complex value in the philanthropic sector."

80,000 Hours is another example of this, aimed at the domain of "policy & research careers" rather than at philanthropy.

I'm still learning about how to successfully establish something like this. My current take is that (a) it's generally hard to do, (b) the base rate of success is very low, and (c) successful attempts leaned heavily on leveraging pre-existing reputation & social relationships.

Cross-posted to the EA Forum & my blog.

Discuss

### Notes from Literature Review: Distributed Teams

LessWrong.com News - April 16, 2019 - 20:58
Published on April 16, 2019 5:58 PM UTC

My disorganized, unformatted notes for Literature Review: Distributed Teams.

https://www.researchgate.net/publication/256005419_Understanding_Conflict_in_Geographically_Distributed_Teams_The_Moderating_Effects_of_Shared_Identity_Shared_Context_and_Spontaneous_Communication

Understanding Conflict in Geographically Distributed Teams: The Moderating Effects of Shared Identity, Shared Context, and Spontaneous Communication

Pamela J. Hinds, Mark Mortensen 2005

• “shared identity moderated the effect of distribution on interpersonal conflict and that shared context moderated the effect of distribution on task conflict”
• “spontaneous communication was associated with a stronger shared identity and more shared context, our moderating variables. Second, spontaneous communication had a direct moderating effect on the distribution-conflict relationship, mitigating the effect of distribution on both types of conflict.”

Diversity in team composition, relationship conflict and team leader support on globally distributed virtual software development team performance

Wickramasinghe, V. and Nandula, S., University of Moratuwa, Sri Lanka.

• “diversity in team composition leads to relationship conflict, relationship conflict leads to team performance, and team leader support moderates the latter relationship”
• Never meeting in person is bad
• Diversity -> relationship conflict, but that’s probably not the problem here
• This is offshoring, which has class and quality implications
• “a mean value of 3.67 suggests a considerably high level of conflict between team members”

(PDF available on request)

Insights for Culture and Psychology from the Study of Distributed Work Teams

Catherine Cramton

• Screw you google books and your disabling of copy/paste and hiding of pages
• The mutual knowledge problem: people act on disparate knowledge without knowing the knowledge is disparate
• Note: I skipped descriptions of problems for which I had already described the paper they were based on.
• “Language asymmetries” are a big deal.
• “Strategies included avoiding conversations and meetings at which the lingua franca would be required, deleting correspondence unread, leaving meetings early, trying to control attendance at meetings to exclude speakers of the less familiar language, and switching languages in the middle of meetings.”

http://sci-hub.tw/10.1145/2675133.2675199

In the Flow, Being Heard, and Having Opportunities: Sources of Power and Power Dynamics in Global Teams

Pamela Hinds

Daniela Retelny

Catherine Cramton

2015

• “Over the last 10-15 years, research on global teams has grown dramatically, including investigations of team dynamics [6, 8, 23, 33], communication structure [10, 19], conflict [20, 21], coordination [16, 18] and leadership [11, 40]. Scholars have also studied the use of technology to support [13, 17, 33, 38] and alleviate many of the challenges encountered by globally distributed teams [3, 27, 28]”
• “ Few studies, however, have examined the power dynamics in global teams, particularly the sources of power and how workers respond to power imbalances. This is despite the fact that the distribution of power can have profound and far-reaching effects on individual behavior [9] and team outcomes [2, 41]”
• “Social power has been defined as “asymmetric control over valued resources in social relations””
• “power is more objective whereas status is in the eye of the beholder.”
• “work by French and Raven [15] reported six sources (bases) of power – legitimate (positional), reward, coercive, informational, expert, and referent”
• “ O’Leary and Mortensen [32] who examined the configurational imbalance (relative numbers of team members) across two sites. They found that locations with (numerical) minority subgroups were at a disadvantage compared to locations with a relatively larger number of team members.”
• “They report that separation in time and space accentuated differences in sources of status (social capital), but that the onshore leaders of these teams sometimes shared their resources as a way to renegotiate status differences”
• 9 teams interviewed over 18 months
• “Our analysis suggests that team members at some locations felt that they had less influence and power than those at other locations. Interestingly, it was not only those at headquarters or those collocated with the largest number of team members who felt powerful, although both of these were factors. We found that the sources of power among these globally distributed professionals resided at different locations and fell into three categories; access to information (being in the flow), access to decision makers (being heard), and opportunities for growth. When team members perceived that they had access to these resources, they felt that they could work effectively, influence decisions of importance, and have career opportunities, but if few were present, they expressed frustration and dissatisfaction in their jobs.”
• “, having the expertise located elsewhere created a dependency on team members at another location, which reduced local developers’ sense of power”
• “Having direct access to customers was also a critical resource for developers and varied significantly by location”
• Factors that matter in likelihood of power struggle:
• Being near executives
• Opportunities for growth (terminal goal)
• “Power Contested In 4 of the 9 teams in our study, power was contested, sometimes fiercely. Surprisingly, in all of these teams, the sources of power were relatively evenly split across locations.”
• “in the absence of a powerful organizational boundary (vendor-client), power contests may be less easily and quickly decided”
• “In our analysis, power contests seemed to be dominated by concerns over opportunities for capturing work and for growth.”
• “ perceiving opportunities for growth diluted power contests.”
• “individuals in power are less likely to share knowledge [26, 30]”
• Suggested fixes:
• “ two-way channel between leaders and distant workers that would provide more access to strategic information, more opportunity to contribute to decisions, and more visibility between leaders and distant workers. Visibility, in particular, would need to be bidirectional, “
• Move people to new sites

SUBGROUP DYNAMICS IN INTERNATIONALLY DISTRIBUTED TEAMS: ETHNOCENTRISM OR CROSS-NATIONAL LEARNING?

Catherine Durnell Cramton and Pamela J. Hinds

http://www.jimelwood.net/students/grips/man_group_comm/cramton_2001.pdf

The Mutual Knowledge Problem and Its Consequences for Dispersed Collaboration

Catherine Durnell Cramton

• Group projects consisting of two students each from three different schools, sometimes international. Note that there was no control and students are just bad at things, and didn’t have as much time to gel.
• “Five types of problems constituting failures of mutual knowledge are identified: failure to communicate and retain contextual information, unevenly distributed information, difficulty communicating and understanding the salience of information, differences in speed of access to information, and difficulty interpreting the meaning of silence”
• “unrecognized differences in the situations, contexts, and constraints of dispersed collaborators constitute "hidden profiles" that can increase the likelihood of dispositional rather than situational attribution, with consequences for cohesion and learning”
• “ Mutual knowledge is knowledge that the communicating parties share in common and know they share”
• “Establishing mutual knowledge is important because it increases the likelihood that communication will be understood”
• “ Proceeding without mutual knowledge, people may speak and understand what is said on the basis of their own information and interpretation of the situation, falsely assuming that the other speaks and understands on the basis of that same information and interpretation”
• “ Krauss and Fussell (1990) describe three mechanisms by which mutual knowledge is established: direct knowledge, interactional dynamics, and category membership”
• “It is well established that groups that meet face-to-face tend to dwell on commonly held information in their discussions and overlook uniquely held information...When a group's discussion is mediated by technology, the problem seems to be worse”
• “The computer-mediated groups exchanged less information overall and took more time doing it. One of the most robust findings concerning the effect of computer mediation on communication is that it proceeds at a slower rate than does face-to-face”
• “"A delay of 1.6 seconds is sufficient to disrupt the ability of the sender to refer efficiently to the . . . stimuli, despite the fact that the back-channel response is eventually transmitted”
• Computer mediated interactions are less rich, leading people to read too much in to what non-verbals they do get.
• Computer mediated interaction with people you don’t know well creates feelings of “isolation, anonymity, and deindividuation”
• It is important that when a problem arises, people attribute it to the correct cause. In particular, don’t blame the person when the situation is the cause
• “ people using computer-mediated communication with remote others they do not know well rely heavily on social categorizations to guide their relationships. The social categorizations provide a basis for affiliation if participants share a significant social identity. However, they also can provide fodder for in-group/out-group dynamics if remote others are seen as belonging to social categories different and less attractive than oneself”
• “The data were contained in an archival dataset that was created in the course of a collaborative project involving graduate business faculty and students located at nine universities on three continents”
• No co-located controls.
• “Despite efforts to make project requirements consistent across all nine universities, differences were discovered as the project unfolded”
• “In seven of the 13 teams, conflict escalated to the point that hostile coalitions formed. In five of these teams, members at two sites began to complain about partners at the third site, refusing in some cases to send them pieces of the team's work or put their names on finished work. Two teams evidenced shifting coalitions among subgroups at the three sites”
• “Eventually, I characterized five types of problems: (1) failure to communicate and retain contextual information, (2) unevenly distributed information, (3) differences in the salience of information to individuals, (4) relative differences in speed of access to information, and (5) interpretation of the meaning of silence.”
• (1) failure to communicate and retain contextual information
• E.g. Team members refused to schedule group chat and insisted on phone call, but chat was a required part of the project for other team members
• Not informing each other of spring break (which occurred at different times for different schools).
• Not informing team members they were disappearing for other projects or tests
• Team members on e-mail missed that other people weren’t included or that their emails were misspelled. “ Sometimes people knew they were exchanging mail with only part of the team, but failed to understand how this affected the perspectives of team members who did not receive the mail, or how it affected the dynamics of the team as a whole”
• People retained resentments that others were “unresponsive”, even when they saw proof the person had attempted to communicate and it was not their fault (e.g. misspelled an e-mail address).
• “In relationships conducted face-to-face, it is a challenging cognitive exercise to interpret a set of facts from the perspective of another person. It is far more difficult to determine how the information before the other party differs from one's own, and then see things from the other's perspective. Geographic dispersion makes these two activities more difficult because of undetected "leaks in the bucket," because partnerseem to have difficulty retaining information about remote locations, and because feedback processes are laborious”
• Private conversations give an inaccurate perception of the pace of work
• Differences in the salience of information
• Confusion due to indirect wording
• “there was a tendency to request feedback from the team indirectly, yet to expect quick responses from everyone”
• Physical distance meant that US <-> US communication was faster than US <-> AUS communication (I expect this to be less of an issue now). This meant everyone’s messages were followed by unrelated messages rather than responses. This was “... attributed to remote partners' lack of conscientiousness”
• People got the time difference wrong
• People failed to convey which topics in an email were most important. In person you can check comprehension as you go and look for reactions.
• Interpreting Silence
• “silence had meant all of the following at one time or another: I agree. I strongly disagree. I am indifferent. I am out of town. I am having technical problems. I don't know how to address this sensitive issue. I am busy with other things. I did not notice your question. I did not realize that you wanted a response”
• This turned out to be a very big deal.
• “The data suggest four constellations: (1) good performance, task focus, moderate relationship demands, relatively low volume of communication, and low coalition activity; (2) good performance, high task and relationship demands, relatively high volume of communication, and high coalition activity; (3) weaker performance, relatively high volume of communication, many and diverse information problems, and high coalition activity; and (4) weaker performance, relationship focus, task secondary, relatively high volume of communication, and low coalition activity”
• “Information problems seemed to be more damaging to relationships than to task performance”
• The highest-harmony team did not grade that well, perhaps because their unwillingness to disagree meant they didn’t hone each other’s ideas.
• “there may be a tendency to generalize such social perceptions, particularly negative ones, to the locational subgroup of which a person is a member, which sets in motion in-group/out-group dynamics (Point [7]) that are destructive to group cohesion”
• “Members of the teams I studied often failed to guess which of the many features of their context and situation differed from the contexts and situations of remote partners”
• “The authors propose that trusting action and demonstrated reliability increase trust in dispersed teams. However, my work suggests that human and technical errors in information distribution may be common in dispersed collaboration, particularly during the early phases of activity. If these are interpreted as failures of personal reliability, they are likely to inhibit the development and maintenance of trust. “
• “when people work under heavy cognitive load, they become more likely to make personal rather than situational attributions”

https://sci-hub.tw/10.1287/orsc.2013.0869

Situated Coworker Familiarity: How Site Visits Transform Relationships Among Distributed Workers

Pamela J. Hinds (Stanford University, phinds@stanford.edu), Catherine Durnell Cramton

• Site visits engender closer co-worker relationships that change behavior upon return to distant sites, continuing the closeness.
• “qualitative study of 164 workers on globally distributed teams”, over 1.5 years
• “Walther (1992, 1996), for example, leverages social information processing theory to argue that, although the process may be slower, rapport among members of distributed dyads that never meet can eventually exceed that of collocated dyads. Similarly, Wilson et al. (2006) demonstrate that trust starts lower in mediated groups but develops to levels comparable to face-to-face groups (although pure electronic groups never achieved the same levels of cooperation as those who met face-to-face). Field research on virtual teams has also claimed that knowledge repositories are an adequate substitute for face-to-face interaction (Malhotra et al. 2001)”
• “Grinter et al. (1999), for example, report that coworkers who did not meet face-to-face had more difficulty creating rapport and developing a long-term working relationship. Alge et al. (2003) also find that coworkers who met face-to-face and got to know one another before embarking on a mediated collaborative task reported as much openness and trust as those who worked face-to-face the entire time. In an early review, Olson and Olson (2000) conclude that face-to-face interaction is important to establishing the conditions necessary for distributed work. In their longitudinal study of three globally distributed teams, Maznevski and Chudoba (2000) further describe how regular face-to-face meetings create the rhythms that enable higher-level coordination. Touching (e.g., handshakes, pats on the back) and “breaking bread” together also have been found to contribute to a state of communicative readiness among distant workers (e.g., Nardi 2005). With few exceptions, research on distributed work consistently points to the importance of face-to-face contact as a means of building trust and rapport (e.g., Orlikowski 2002), translating locally situated knowledge (Sole and Edmondson 2002), interacting more rapidly on tasks (Crowston et al. 2007), and building social networks (Orlikowski 2002). This research generally suggests that periodic face-to-face interaction plays an important role for distributed workers, although what happens during these encounters remains largely a mystery.”
• “Mortensen and Neeley (2012) that shows that workers who travel to the location of their coworkers have more knowledge, both direct (about their distant coworkers) and reflected (about their own location), and that this knowledge is associated with feelings of closeness and trust”
• “more time together leads to more familiarity, although a plateau seems to be reached fairly quickly in laboratory settings “
• Workers felt more comfortable with distant co-workers after meeting in person.
• Socializing outside work hours was important
• Seeing how co-workers interact with other people is useful
• “…in seeing workers acting on their knowledge and working directly with them as they did so that coworkers truly began to understand the capabilities of their coworkers and how they needed to work together to leverage those skills most effectively.”
• “Informants told us that after site visits, they and their distant coworkers responded in a much more timely manner to emails and requests”
• Not integrating a visitor has a terrible effect on their morale and doesn’t generate any of the benefits of a good visit.
• “We speculate that travel needs to occur on a regular basis with intervals of about six months, plus or minus depending on the ambiguity of the project”
• Both managers and workers need to travel

https://digitalcommons.usu.edu/cgi/viewcontent.cgi?article=1008&context=mis_facpubs

Bridging Space over Time: Global Virtual Team Dynamics and Effectiveness

Katherine M. Chudoba (Utah State University), Martha L. Maznevski (University of Virginia)

• “We studied three global virtual teams, collecting data over a period of 21 months: 9 months of intensive observation and collection, preceded by 3 months of informal discussions with the teams and their managers, and followed by 9 months of more discussions.”
• “Effective outcomes were associated with a fit among an interaction incident’s form, decision process, and complexity”
• “effective global virtual teams sequence these incidents to generate a deep rhythm of regular face-toface incidents interspersed with less intensive, shorter incidents using various media.”
• “ In some studies face-to-face groups performed better than technology-mediated groups (e.g., Hightower and Sayeed 1996, Smith and Vanecek 1990); in others they performed worse (e.g., Ocker et al. 1995–1996, Straus 1996); in others there was no difference on quality-related outcomes (e.g., Farmer and Hyatt 1994, Valacich et al. 1993). Furthermore, these relationships changed and evolved over time (e.g., Hollingshead et al. 1993). Although task type was often proposed to moderate the relationship between a medium and its effect on performance (e.g., O’Connor et al. 1993), there did not seem to be a consistent pattern of task types for which communications technology was better or worse. Some studies concluded that a combination of media including face-to-face outperformed one without face-to-face (e.g., Ocker et al. 1998).”
• “trust, which was critical to the team’s ability to manage decision processes, could be built swiftly; however, this trust was very fragile.”
• “This basic pattern is defined by regular face-to-face meetings in which the intensity of interaction is extremely high, followed by a period of some weeks in which interaction incidents are less intense. Moreover, the decision process is organized to match this temporal pattern, rather than the other way around”
• MakeTech and SellTech chose the kind of interaction (phone, fax, etc) to match the task, NewTech did not.

https://pdfs.semanticscholar.org/3388/10f4e28b8954d8c63c5bcd8b0295e101ff95.pdf

Reflected Knowledge and Trust in Global Collaboration

Mark Mortensen, Tsedal B. Neeley (Harvard Business School, tneeley@hbs.edu)

• “an equally important trust mechanism is “reflected knowledge,” knowledge that workers gain about the personal characteristics, relationships, and behavioral norms of their own site through the lens of their distant collaborators. Based on surveys gathered from 140 employees in a division of a global chemical company, we found that direct knowledge and reflected knowledge enhanced trust in distinct ways”
• “some researchers have argued that trust is more easily generated in collocated settings, because collocated workers are better placed than their distributed counterparts to use behavioral cues to read intentions and foster a collective identity (Frank 1993, Wilson et al. 2006)”
• “that firsthand experience would be positively related to individuals’ reflected knowledge”
• “neither the path linking reflected knowledge to understanding distant collaborators nor the path linking direct knowledge to feeling understood by distant collaborators were significant in our original model”
• “We found significant positive paths linking direct knowledge and understanding collaborators (β = 0.53, p < 0.001) and linking reflected knowledge to feeling understood by distant others (β = 0.28, p < 0.05)”

On Cooperative Behavior in Distributed Teams: The Influence of Organizational Design, Media Richness, Social Interaction, and Interaction Adaptation

Dorthe D. Håkonsson, Børge Obel, Jacob K. Eskildsen and Richard M. Burton

• oxytocin increases awareness of others’ emotional states
• “decision makers with a cooperative mindset were more prone to interpret others’ actions as efforts to coordinate. This, in turn was found to increase the quality of their interaction outcomes ”
• “highly motivated liars interacting in a text-based, computer-mediated environment were more successful in deceiving their partners compared to motivated liars interacting face-to-face”
• “media richness reinforces non-cooperative incentives”

https://nooffice.org/silicon-valley-is-disrupting-everything-but-the-way-they-work-getting-remote-work-done-681c4a0be6fe

• “Peter Thiel writes that “Even working remotely should be avoided, because misalignment can creep in whenever colleagues aren’t together full-time, in the same place, every day”.”

https://www.scirp.org/journal/PaperInformation.aspx?paperID=84181

An Empirical Analysis of Communication on Trust Building in Virtual Teams

Makoto Shinnishi, Kunihiko Higa

• Compared five teams limited to text-only communication with five teams that “could use a non-text communication tool through which one can see other member’s situation with a web camera image and a short text message.”
• “use of non-text communication tool did not affect trust building; however, amount of awareness communication affected trust building. Log-in to the communication system at the same time also affected trust building. The findings of this study showed the tendency of awareness communication helping team building trust in the remote environment.”
• Awareness = “knowledge about the work and worker of current and predicted future status and situation.”

https://pdfs.semanticscholar.org/e3b9/57e8c1f6a286ce3b8876c12701a073a8f0c5.pdf

Multinational and Multicultural Distributed Teams: A Review and Future Agenda

Stacey L. Connaughton and Marissa Shuffler

• [we should keep studying this]

http://users.ece.utexas.edu/~perry/education/382v-s08/papers/hinds03.pdf

Out of sight, Out of sync: Understanding conflict in distributed teams

Pamela J. Hinds, Diane E. Bailey

• “empirical studies suggest that distributed teams experience high levels of conflict”
• “In this paper, we develop a theory-based explanation of how geographical distribution provokes team-level conflict.”
• “Geographically distributed teams face a number of unique challenges, including being coached from a distance, coping with the cost and stress of frequent travel, and dealing with repeated delays (Armstrong and Cole 2002).”
• “Field studies further indicate that geographically distributed teams may experience conflict as a result of two factors: The distance that separates team members and their reliance on technology to communicate and work with one another.”
• “Armstrong and Cole (2002) reported that conflicts in geographically distributed teams went unidentified and unaddressed longer than conflicts in collocated teams.”
• “that geographical distribution will have a significant impact on each type of group conflict proposed in recent organizational studies: task, affective, and process. Task conflict refers to disagreements focused on work content. Affective conflict (sometimes referred to as relationship or emotional conflict) refers to team disagreements that are characterized by anger or hostility among group members. Process conflict refers to disagreements over the team’s approach to the task, its methods, and its group processes.”
• “task conflict has been found to be beneficial for performance on many traditional teams, but we contend that it will not be so for their distributed counterparts.”
• “Distributed teams enable firms to take advantage of expertise around the globe, to continue work around the clock, and to create closer relationships with far-flung customers. “
• “We contend that all other traits that may be associated with geographical distribution derive from distance or technology mediation, and we consider them in our analysis of these two factors”
• “members of distributed teams may have difficulty establishing a shared context”
• “in a study of the use of new machines in a factory, Tyre and von Hippel (1997) observed that engineers and operators had trouble resolving equipment problems over the phone because the engineers needed to “see for themselves” the technology in context.”
• “When team members have different understandings of the task, task conflict is likely to result (Jehn et al. 1997). Moreover, when team members’ understanding of the issues differs, conflict is difficult to resolve (Brehmer 1976)”
• “Team members who lack a sense of a shared context as a result of distance also are likely to adhere to different norms”
• “site-specific cultures and expectations acted as significant sources of misunderstandings and conflict between distant sites. “
• “Grinter et al. (1999) found that members of distributed software development teams, regardless of the way they structured their work, were “constantly surprised” and confused about the activities of their distant colleagues.”
• “research on friendship suggests that distributed teams will experience less friendship and, thus, less affective conflict.”
• “studies that suggest that time can remedy the relational problems that ensue from technology mediation”
• “Feelings of not “being there” with one’s communication partners stand to prevent distributed team members from sharing relational information that help teams to develop trust”
• “technology mediation engenders negative relational effects that we contend will precipitate affective conflict.”
• “Several problems related to information sharing and seeking emerge from the literature, including uneven distribution of information, unevenly weighted information, and information that resists transmission.”
• “Uneven distribution can occur in at least two ways: Team members may be purposely or accidentally excluded from communications, or members may not reveal information that they uniquely hold.”
• “Purdy et al. (2000) reported that student groups working face to face collaborated more than distributed groups working over video, telephone, or chat”
• “Longitudinal studies report that groups adapt communication technologies to good effect”
• “Over time, effective teams generate a shared team identity.”
• “Distributed teams appear to gain more if they meet early in the development of the group (Kraut et al. 1992), enabling members to form relationships that can be supported over technologies (Armstrong and Cole 2002)”
• “If distributed teams are able to meet face to face at the points with the most potential for conflict, then conflict may be reduced or diminished.”
• “Establishing collaborative norms, however, may be significantly more difficult in distributed teams.”

Manager control and employee isolation in telecommuting environments

Nancy B. Kurland, Cecily D. Cooper

2002

• “The primary challenges facing supervisors who manage in telecommuting environments involve clan strategies: fostering synergy, replicating informal learning, creating opportunities for interpersonal networking, and professionally developing out-of-sight employees.”
• Managers dislike teleworking because they can’t see what people are doing
• Telecommuters are worried that they’ll be ignored politically
• “They evaluated employees’ work based not so much on what they did, but on ‘‘how well they did it.”
• “a shift to managing telecommuters only by results may enhance telecommuters’ professional isolation concerns”
• These are individual telecommuters, not distributed teams

https://www.emeraldinsight.com/doi/pdf/10.1108/13527590410556854

Differences between on-site and off-site teams: manager perceptions

Walt Stevenson and Erika Weis McGrath

• Useless

https://smile.amazon.com/Distributed-Work-Press-Pamela-Hinds-ebook/dp/B004FTPEPS/ref=sr_1_3?keywords=pamela+hinds&qid=1551665878&s=gateway&sr=8-3

Distributed Work (The MIT Press) 1st Edition, Kindle Edition

by Pamela J Hinds (Author, Editor), Sara Kiesler (Editor)

• Chapter 4 The Place of Face-to-Face Communication in Distributed Work

Bonnie A. Nardi and Steve Whittaker

• Chapter 5: The (Currently) Unique Advantages of Collocated Work

Judith S. Olson, Stephanie Teasley, Lisa Covi, and Gary Olson

• “…communication frequency among individuals drops considerably with distance and that after about thirty meters, it reaches asymptote. This means that if two people reside more than 30 meters apart, they may as well be across the continent”
• This paper is talking about radical co-location: 6-8 people in the same room. Anything further apart than that doesn’t count.
• “Teams who experienced radical collocation—pioneer teams—were much more productive than standard teams at both this company and in the industry as a whole.”
• “These teams produced twice as much as other teams did in their multitasked work, in standard office cubicles, in projects with more variable scoping. The collocated teams got the job done in about one-third the amount of time compared to the company baseline—and even faster than the industry standard. Both of these differences are significant using a z-score against company baseline (p < .001).”
• I think they had other advantages over the control group, such as having only one task. And of course there’s the Hawthorne effect.
• “Follow-on teams were even more productive than the pioneer teams; function points per staff month doubled again while cycle time stayed about the same. We believe this second increase has to do in part with the fact that some of the team members now had experience (some pilot team members served on follow-on teams), and there was some organizational learning about how to run and manage such groups.”
• I’m extremely curious how often team members used the available solo offices.
• Productivity measured in function points, so it’s possible they were just better at estimating.
• Morale effects (positive and negative) were more contagious. Even the team with shit morale was more productive than standard teams.
• White boards and sticky notes were very popular
• Tivoli vs. IBM: quite close
• Chapter 6: Understanding Effects of Proximity on Collaboration: Implications for Technologies to Support Remote Collaborative Work

Robert E. Kraut, Susan R. Fussell, Susan E. Brennan, and Jane Siegel

• “For example, in collaboration at a distance, communication is typically less frequent, characterized by longer lags between messages, and more effortful.”
• Is less frequent a downside?
• “Results showed that even in this environment, pairs of researchers were unlikely to complete a technical report together unless their offices were physically near each other, even if they had previously published on similar topics or worked in the same department in the company. “
• But international collaborations happen
• What tools can replace hallways?
• Elizabeth’s observation: this works better for extroverts and multitaskers
• “the most important problem is that when conversation is initiated in person, the people must be simultaneously present”
• “Ancona and Caldwell (1992) demonstrated that problems can arise when people concentrate communication within a supervisory group and fail to exchange enough information with others outside the group”
• Chatrooms are a partial substitute for hallways
• “Because spoken utterances are ephemeral, unlike messages on an answering machine or in a written document, the listener cannot pause or reread the message when some portion is difficult. Again, however, the ability to ask for clarification partially compensates for the ephemeral nature of speech. When there are many listeners, however, it is far more costly for a single one whose attention has wandered to stop the speaker for clarification”
• “ease of local communication and information acquisition may bias the information tracked by a work group, causing them to overattend to local information at the expense of more remote, contextual information.”

https://www.tandfonline.com/doi/abs/10.1207/S15327744JOCE1002_2

Telework: Existing Research and Future Directions

Bongsik Shin , Omar A. El Sawy , Olivia R. Liu Sheng & Kunihiko Higa

• Future research should be more rigorous

https://www.sciencedirect.com/science/article/pii/S1048984307001518

The impact of superior–subordinate relationships on the commitment, job satisfaction, and performance of virtual workers

Timothy D. Golden, John F. Veiga

• High quality relationships are good

The Good, the Bad, and the Unknown About Telecommuting: Meta-Analysis of Psychological Mediators and Individual Consequences

Ravi S. Gajendran and David A. Harrison

• “Managers who are unwilling to or who lack the training to change their management and control styles would likely see deterioration in the depth and vitality of their connection with telecommuting subordinates”
• Telecommuting led to: increased feelings of autonomy, lower work-life conflict, better supervisor relationships (not statistically significant), no change on co-worker relationships (low intensity only), worse co-worker relationships (high intensity), increased job satisfaction, increase in objective or external (but not self) reported performance, lower turnover intent, less stress, no impact on career prospect feelings.
• This is true even for very high intensity remote work (> ½ time)
• Could be supervisor relationship improved because people were happy they got to co-work.

https://journals.sagepub.com/doi/abs/10.1177/0170840607083105?journalCode=ossa (PDF without permalink available on AWS)

Perceived Proximity in Virtual Work: Explaining the Paradox of Far-but-Close

Jeanne M. Wilson, Michael Boyer O'Leary, Anca Metiu, Quintus R. Jett

2008

• “By understanding what leads to perceived proximity, we also believe that managers can achieve many of the benefits of co-location without actually having employees work in one place.”
• “As team members discover that they share a certain social category, they establish a common ground from which they can work “
• “the more two people identify with some social category, entity or experience (e.g. profession, gender, ethnicity, common political views, shared trauma, etc.), the more common ground they will have between them and, thus, the more proximal they are likely to feel.”
• They use the example of two mothers of young children. I suspect this effect is much stronger if the far-apart people share a trait they don’t share with people nearby.
• “Absent a shared identity, people have a strong tendency toward faulty attributions about others’ motives”
• This is consistent with other research showing people are more likely to commit the fundamental attribution error against distant people.
• “As one subgroup sought to differentiate itself from the other, less communication between the two groups ensued, resulting in even fewer opportunities to discover and develop a shared identity. “
• “an organization with strong structural assurance might have very rigorous hiring standards. As a result, employees of the organization would feel comfortable communicating with distant co-workers — safe in the knowledge that they were dealing with competent and reliable professionals”
• This was definitely present at Google
• “…the technological infrastructure for open-source projects (version control systems, specialized topic-based discussion forums, and so on) provides rich and stable support for complex work and interpersonal interactions between developers who rarely if ever see each other”
• High openness to experience -> less assumption of bad faith on a colleague’s part.
• “With experience, members of distributed groups learn to communicate frequently (Jarvenpaa and Leidner 1998), start tasks promptly because of time delays (Iacono and Weisband 1997), disclose personal information (Moore et al. 1999) and explicitly acknowledge receiving messages (Cramton 2001). In essence, experienced tele-workers learn effective norms and routines to be productive in this specific context.”
• “feeling close reduces the uncertainty and ambiguity of working at a distance and improves at least two of the three dimensions of group effectiveness: (1) capability to work together in the future; and (2) growth and well-being of team members”
• Having a co-located co-worker visit another site helps your relationship with people at that site, even if you don’t visit yourself.
• Too much closeness leads to negative feelings, such as worries about surveillance.

https://pdfs.semanticscholar.org/c2c0/1cbfa5796a273aa5396472eb3582e3496b4c.pdf

When does the medium matter? Knowledge-building experiences

and opportunities in decision-making teams

Bradley J. Alge,a,* Carolyn Wiethoff,b and Howard J. Kleinc

• “that media differences existed for teams lacking a history, with face-to-face teams exhibiting higher openness/trust and information sharing than computer-mediated teams. However, computer-mediated teams with a history were able to eliminate these differences. These findings did not extend to team-member exchange (TMX). Although face-to-face teams exhibited higher TMX compared to computer-mediated teams, the interaction of temporal scope and communication media was not significant. In addition, openness/trust and TMX were positively associated with decision-making effectiveness when task interdependence was high, but were unrelated to decision-making effectiveness when task interdependence was low”

• note y axis

https://www.emeraldinsight.com/doi/abs/10.1108/eb022856?journalCode=ijcma

SUBGROUP DYNAMICS IN INTERNATIONALLY DISTRIBUTED TEAMS: ETHNOCENTRISM OR CROSS-NATIONAL LEARNING?

Catherine Durnell Cramton and Pamela J. Hinds

2005

• “There is, however, increasing evidence that internationally distributed teams are prone to subgroup dynamics characterized by an us-versus-them attitude across sites (Armstrong & Cole, 1995; Cramton, 2001; Hinds & Bailey, 2003; see Gibson & Cohen, 2003; Hinds & Kiesler, 2002)”
• Diversity isn’t bad, it’s when multiple traits are correlated that you get fault lines
• “Merely becoming aware of the presence of subgroups is adequate to trigger ingroup-outgroup dynamics”
• “that ethnocentrism is reduced under conditions of contact between groups of equal status that are pursuing common goals with institutional or social support”
• Trying to erase team differences costs you the ability to learn from teams
• “Under certain conditions, people can recognize the positive qualities of their own group as well as other groups, constituting what we call an attitude of mutual positive distinctiveness”
• Shared goal == good
• “We conclude that motivation to engage across differences is reduced when groups have unequal status”

https://journals.sagepub.com/doi/abs/10.1177/875697281704800303

Barriers to Tacit Knowledge Sharing in Geographically Dispersed Project Teams in Oil and Gas Projects

Olugbenga Jide Olaniran

• Oh, this is a pitch for the Delphi method

https://www.entrepreneur.com/article/243579

Why This Startup Won't Let the Team Work From Home

Randy Frisch, CEO of Uberflip

• “Our dependence on instant messaging and shorthand updates have decreased our likelihood to brainstorm, opting for a quick WTF or LOL, rather than an attempt to build off a crazy idea.”
• Marissa Mayer: “People are more productive when they’re alone, but they’re more collaborative and innovative when they’re together. Some of the best ideas come from pulling two different ideas together.”

https://open.buffer.com/remote-team-meetups/

Remote Team Meetups: Here’s What Works For Us

Stephanie Lee

https://news.ycombinator.com/item?id=17021655

• Need to be remote-first

• Vague missing interactions

• Missing collaborative learning
• Nuances get missed
• People don’t believe you’re working because they can’t see you
• “Can’t have remote work and flexible schedule”
• Need ways to replicate social experience

https://www.researchgate.net/publication/279508419_COMMUNICATION_TEAM_PERFORMANCE_AND_THE_INDIVIDUAL_BRIDGING_TECHNICAL_DEPENDENCIES

COMMUNICATION, TEAM PERFORMANCE, AND THE INDIVIDUAL: BRIDGING TECHNICAL DEPENDENCIES.

Patrick Wagstrom

James D. Herbsleb

Kathleen M Carley

• Can’t replicate “watercooler experience”

• Need childcare when WFH
• Good to schedule regular meetings

Quora

• Proper onboarding is essential
• Have regular syncs
• So you can definitely work remotely; the main challenges to overcome are:
• A feeling of separation
• Communication
• Productivity and consistency of work

• Easy to lose focus
• Feel disconnected from others/lonely
• People don’t believe you’re working
• Easy to get sucked into working too much

https://www.quora.com/What-are-the-pros-and-cons-of-remote-working-Working-a-job-from-your-home

• Pros:
• no commute
• Control over work environment
• Flexible hours
• Cons:
• Lonely

How do virtual teams process information? A literature review and implications for management

Petru L. Curşeu, René Schalk and Inge Wessel

• “Virtual teams are better in exchanging information and in overcoming information biases. However, they encounter problems in using and integrating information. A greater pool of knowledge leads to higher memory interference in virtual teams. Nevertheless, in virtual teams, processes such as planning and coordination, are less effective and the emergence of trust and cohesion is more difficult to achieve. These opposing effects limit the potential of better knowledge integration in virtual teams.”
• “the development of trust, cohesion and a strong team identity is one of the most difficult challenges for managers of virtual teams.”
• “Especially in the initial phases of the team project, meeting face-to-face will let the team members get acquainted with each other. Direct contact is essential for the development of trust and cohesion”

Virtual Teams: What Do We Know and Where Do We Go From Here?

Luis L. Martins, Georgia Institute of Technology, College of Management, Atlanta, GA, USA

Lucy L. Gilson, Department of Management, School of Business, University of Connecticut, Storrs, CT, USA

M. Travis Maynard, Department of Management, School of Business, University of Connecticut, Storrs, CT, USA

2004

• “researchers have noted the tendency of VTs to possess a shorter lifecycle as compared to face-to-face teams”
• “the number of ideas generated in VTs has been found to increase with group size, which contrasts with results found in face-to-face groups”
• “the addition of video resources results in significant improvements to the quality of a team’s decisions (Baker, 2002)”
• “compared to men, women in VTs perceived their teams as more inclusive and supportive, and were more satisfied. Also, in a study of e-mail communication among knowledge workers from North America, Asia, and Europe, Gefen and Straub (1997) found that women viewed e-mail as having greater usefulness, but found no gender differences in levels of usage. Bhappu, Griffith and Northcraft (1997) examined the effects of communication dynamics and media in diverse groups, and found that individuals in face-to-face groups paid more attention to in-group/out-group differences in terms of gender than those in VTs.”
• “found that collocated teams reported a significantly lower number of difficulties with various aspects of project management (such as keeping on schedule and staying on budget) than did virtual or global teams”
• “Several studies have demonstrated that participation levels become more equalized in VTs than in face-to-face teams (Bikson & Eveland, 1990; Kiesler, Siegel & McGuire, 1984; Straus, 1996; Zigurs, Poole & DeSanctis, 1988). The most commonly cited reason for this is the reduction in status differences resulting from diminished social cues”
• “or by team members compared to interactions within face-to-face contexts. In particular, Siegel et al. (1986) found that uninhibited behavior such as swearing, insults, and name-calling was significantly more likely in CMC groups than in face-to-face groups.” This effect was stronger among men.
• “In general, lower levels of satisfaction are reported in VTs than in face-to-face teams (Jessup & Tansik, 1991; Straus, 1996; Thompson & Coovert, 2002; Warkentin et al., 1997)”
• “Finally, satisfaction in VTs appears to be affected by a team’s gender composition. In particular, all-female VTs tend to report higher levels of satisfaction than all-male VTs”
• “For negotiation and intellective tasks, face-to-face teams have been found to perform significantly better than CMC teams, whereas there were no differences found on decision-making tasks (Hollingshead et al., 1993).”
• “A type of task in which CMC groups seem to outperform face-to-face groups is brainstorming and idea-generation because there is no interruption from other group members, in effect allowing all members to ‘talk’ at the same time.”

https://sci-hub.tw/10.1287/orsc.10.6.791

Communication and Trust in Global Virtual Teams

Sirkka L. Jarvenpaa, Dorothy E. Leidner

• “global virtual teams may experience a form of “swift” trust, but such trust appears to be very fragile and temporal”

https://pubsonline.informs.org/doi/10.1287/orsc.1090.0434

Go (Con)figure: Subgroups, Imbalance, and Isolates in Geographically Dispersed Teams

Michael Boyer O'Leary, Mark Mortensen

2009

• “we find that the social categorization in teams with geographically based subgroups (defined as two or more members per site) triggers significantly weaker identification with the team, less effective transactive memory, more conflict, and more coordination problems.”
• “imbalance in the size of subgroups (i.e., the uneven distribution of members across sites) invokes a competitive, coalitional mentality that exacerbates these effects; subgroups with a numerical minority of team members report significantly poorer scores on identification, transactive memory, conflict, and coordination problems. In contrast, teams with geographically isolated members (i.e., members who have no teammates at their site) have better scores on these same four outcomes than both balanced and imbalanced configurations.”

Virtual Teams That Work

Cristina B. Gibson and Susan G. Cohen (eds.)

• For virtual teams to perform well, three enabling conditions need to be established:
• “Shared understanding is the degree of cognitive overlap and commonality in beliefs, expectations, and perceptions about a given target.”
• “Integration is the process of establishing ways in which the parts of an organization can work together to create value, develop products, or deliver services.”
• “integration refers to organizational structures and systems, while shared understanding refers to people’s thoughts.”
• Mutual trust (or collective trust) is a shared psychological state characterized by an acceptance of vulnerability based on expectations of intentions or behaviors of others within the team
• Chapter 2: Knowledge Sharing and Shared Understanding in Virtual Teams, Pamela J. Hinds, Suzanne P. Weisband
• “In teams without a common understanding, team members are more likely to hedge their bets against the errors anticipated from others on the team, thus duplicating efforts and increasing the likelihood of rework.”
• “A common occurrence on teams is that members think they have come to agreement, but the agreement they believe they have reached is viewed differently by different team members”
• “shared understanding among team members has these benefits:
• Enables people to predict the behaviors of team members
• Facilitates efficient use of resources and effort
• Reduces implementation problems and errors
• Increases satisfaction and motivation of team members
• Reduces frustration and conflict among team members”
• Need an understanding of both the task and the process
• “telephone may be better than text-based systems for detecting and resolving misunderstandings.”
• “Weisband (forthcoming) found that teams that shared information about where they were and what they were doing performed better than teams that did not share this information”
• Site visits better than off-site retreats
• Chapter 3: Managing the Global New Product Development Network: A Sense-Making Perspective

Susan Albers Mohrman, Janice A. Klein, David Finegold

• “Sense making in the new product development system is by necessity virtual. It occurs simultaneously within and across different levels and elements of the organizational system. “
• This chapter feels buzzwordy as hell and I’m going to skip it
• Chapter 4: Building Trust: Effective Multicultural Communication Processes in Virtual Teams

Cristina B. Gibson, Jennifer A. Manuel

• “Collective trust is a crucial element of virtual team functioning.”
• “collective trust can be defined as a shared psychological state in a team that is characterized by an acceptance of vulnerability based on expectations of intentions or behaviors of others within the team”
• “This larger study began in July 1999 and had two phases: in-depth qualitative case analysis with two different virtual teams from each of eight organizations and a comprehensive quantitative survey that will be administered in each firm and analyzed as to statistical predictors of virtual team effectiveness.”
• On virtual teams, “trust is harder to identify and develop, yet may be even more critical, because the virtual context often renders other forms of social control and psychological safety less effective or feasible.”
• “Some controls actually appear to signal the absence of trust and therefore can hamper its emergence. Institutional controls can also undermine trust when legal mechanisms give rise to rigidity in response to conflict and substitute high levels of formalization for more flexible conflict management (Sitkin and Bies, 1994).”
• “interdependence is also critical in establishing trust in virtual teams.”
• Compliments build trust
• Active listening
• Chapter 8: Exploring Emerging Leadership in Virtual Teams, Kristi Lewis Tyran, Craig K. Tyran, Morgan Shepherd
• “A leader is said to emerge in a team when the team as a whole reaches a consensus that they perceive the emergent leader to be their leader”
• “We found agreement among team members that leaders emerged in nine of the thirteen virtual teams participating in our study.”
• “For traditional teams, trust in a leader’s ability to facilitate team task and relationship interaction effectively has been found to be a critical factor in achieving the consensus necessary for a leader to emerge”
• Types of trust:
• Can do the tasks to achieve the goal
• Altruism
• Friendship
• May be harder to emerge as a leader in virtual teams, in part due to difficulty building trust
• Leaders sent more messages than average but not necessarily the most messages
• “virtual team performance was not clearly related to emergent leadership within the team”
• This seems useless
• Chapter 10: Overcoming Barriers to Information Sharing in Virtual Teams, Catherine Durnell Cramton, Kara L. Orvis
• Skipping the description of problems, assuming it’s similar to Cramton’s other work
• Recommendations:
• “Establish procedures for information sharing within the virtual team. It may be helpful for leaders to distinguish among task, social, and contextual information and to design procedures appropriate to each type of information. For example, task-related information may be best shared at a weekly dial-in conference call in which representatives at each location are guaranteed airtime. It probably is a good idea to designate a facilitator for such conference calls so that time is managed, focus is maintained, and new information is made salient to all.”
• Weekly calls would include a few moments where everyone says how they are. I think this is incorrect, and it makes me doubt their other suggestions: it will chew up a lot of time, people won’t listen, and it isn’t enough time to actually share anything relevant.
• Site Visits
• Longer time frames
• Awww, they’re worrying about the cost of phone calls. So cute.
• “Build the virtual team’s social identity”
• Chapter 14: Influence and Political Processes in Virtual Teams by Efrat Elron, Eran Vigoda
• “Our study relied on ten semistructured interviews with members of virtual teams.” Two companies based in the US and Canada, with lots of subsidiaries.
• “We have witnessed a rapid growth in studies that developed well-grounded models and theories of influence and politics in organizations (Bacharach and Lawler, 1980; Gandz and Murray, 1980; Ferris and Kacmar, 1992; Kipnis, Schmidt, and Wilkinson, 1980; Mintzberg, 1983; Pfeffer, 1992; Yukl and Falbe, 1990)”
• “ People in teams that rely heavily on e-mail are therefore more careful with writing than speaking because of the permanency effect of the written word. Thus, political behavior takes on a more careful and covert form when documented.”
• Requests to take off-topic discussions offline curtail political behavior
• One participant noted: “Virtual teams are task oriented. You do not have enough chances to read and understand this politics, if it’s there at all. In fact, I don’t feel that I actually have enough opportunities to be exposed to such activities in my virtual team. We don’t have that much time left for politics; we need to work.”
• “Many multicultural team participants try to be “on their best behavior” when in contact with members from other cultures because they feel that they are not only private individuals but also representatives of their country and culture. As a result, they tend to use tactics that are more acceptable socially”
• “In general, these findings indicate that VOP may be significantly lower than operational politics in conventional teams. Members of virtual teams report lower intensities in the use of influence and political behaviors compared with similar activities of conventional face-to-face groups. In addition, the most dominant influence attempts used are rationality, consultation, and assertiveness, which are considered among the most socially acceptable tactics and also the most effective ones (Yukl and Tracey, 1992). The use of less acceptable tactics such as sanctions, exerting pressures and threats, and blocking information was denied by all our interviewees.”
• Okay, I can’t take this seriously after that; skipping the rest
• Chapter 15: Conflict and Virtual Teams, Terri L. Griffith, Elizabeth A. Mannix, Margaret A. Neale
• “Media effects is a phrase used to describe all the outcomes that result from the use of a particular communication medium.”
• “practice also suggests that even teams located in the same building may communicate largely by e-mail. In fact, in the organization described in this chapter, all teams spent an equal amount of time working face-to-face on the team task (approximately 13 percent). E-mail use did vary by whether all the members were colocated, but not to a great extent (34 percent of all task communication in colocated teams versus 45.76 percent in distributed teams).”
• “What is technically a conflict about the task, for example, may be taken personally and thus be experienced as relationship conflict.”
• They looked at twenty-eight teams on a spectrum of virtualness
• “We found that virtual teams had greater levels of process conflict than traditional teams, but only when also controlling for the effects of trust. We did not find differences in the levels of task or relationship conflict.”
• The difference was only 0.3 on a 5-point scale
• “Neither trust nor team identification is significantly related to the distribution of the team members.”
• “regardless of virtualness, relationship and process conflict have significantly negative effects on performance as rated by the team’s manager. Task conflict, however, does not seem to produce these effects.”
• “trust was higher the more the team communicated using e-mail and the more they worked together face-to-face.”

Situation Invisibility and Attribution in Distributed Collaborations

Catherine Durnell Cramton, Kara L. Orvis, Jeanne M. Wilson

2007

https://psycnet.apa.org/record/2004-00215-011

Chapter 11: Leadership in Virtual Teams

Zaccaro, S. J., Ardison, S. D., & Orvis, K. L. (2004)

• “Zaccaro, Rittman, and Marks (2001) argued that leaders contribute to team effectiveness by influencing five team processes: motivational, affective, cognitive, coordination, and boundary spanning.”
• Trust is the expectation that others will behave as you expect = ability to predict others’ behavior
• “Team effectiveness is grounded in members being motivated to work hard on behalf of the team.” That seems incomplete at best.
• “team members are more likely to contribute more of their individual efforts to collective action when they trust one another”
• “trust, particularly knowledge-based and identification-based trust, evolves from a long period of face-to-face interactions”
• “teams with high levels of initial trust devoted almost half of their early communications during the first 2 weeks of the teams’ existence to discussions of their families, hobbies, and weekend social activities...such communications were not sufficient to maintain trust.”
• “trust in temporary teams develops when team members have clearly defined roles with specified and accurate behavioral expectations.”
• “When virtual teams form with members having strong professional identities, then swift trust can emerge from stable role expectations and subsequent actions are likely to confirm these expectations, strengthening team trust.”
• “effective team coordination depends upon the emergence of a shared mental model”
• Skipping the stuff on Cramton (2001) since I’ve read the source
• “team leaders need to help their teams set performance objectives that are aligned with the strategic requirements operating in the team environment”


### Slack Club

LessWrong.com News - April 16, 2019 - 09:43
Published on April 16, 2019 6:43 AM UTC

This is a post from The Last Rationalist, which asks, generally, "Why do rationalists have such a hard time doing things, as a community?" Their answer is that rationality selects for a particular smart-but-lazy archetype, who values solving problems with silver bullets and abstraction, rather than hard work and perseverance. This archetype is easily distractible and does not cooperate with other instances of itself, so an entire community of people conforming to this archetype devolves into valuing abstraction and specialized jargon over solving problems.
