Minimizing Loss ≠ Maximizing Intelligence
(Cross-posted from my Substack; written as part of the Halfhaven virtual blogging camp.)
Many speculate about the possibility of an AI bubble by talking about past progress, the economy, OpenAI, Nvidia, and so on. But I don’t see many people looking under the hood to examine whether the actual technology itself looks like it’s going to continue to grow or flatline. Many now realize LLMs may be a dead end, but optimism persists that one clever tweak of the formula might get us to superintelligence. But I’ve been looking into the details of this AI stuff more lately, and it seems to me that there’s a deeper problem: self-supervised learning itself.
Here’s how supervised learning with gradient descent works, by my understanding:
- Give the neural network some input, and it returns some output.
- We score how “bad” the output is.
- We update the model's weights in directions that would have produced less bad output, making it less bad next time.
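Those three steps can be sketched with a toy one-weight model and a squared-error "badness" score (everything here is illustrative, not any particular framework):

```python
# Toy supervised learning with gradient descent (illustrative).
# Model: output = w * x. The data follows the rule y = 3x.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # (input, desired output)
w = 0.0    # initial weight
lr = 0.05  # learning rate

for _ in range(200):
    for x, target in data:
        output = w * x                    # 1. run the network on an input
        grad = 2 * (output - target) * x  # 2. gradient of the squared-error "badness"
        w -= lr * grad                    # 3. nudge the weight toward less-bad output

print(round(w, 3))  # 3.0: the weight converges to the rule in the data
```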
This works great when you can judge badness reliably. AlphaGo Zero used a cleverly-designed oracle to evaluate its outputs, essentially comparing the move the model thought was the best with the real best move. But modern LLMs work differently. We have them complete a snippet of training data, and compare their output with the real completion. This is called self-supervised learning. By training the model this way, we minimize loss with respect to the training data, thereby creating an AI model that’s really good at predicting the next token of any snippet of training data, and hopefully other similar data.
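A toy version of this objective, assuming a model that assigns a probability to each candidate next token (the vocabulary and probabilities here are made up):

```python
import math

# Self-supervised loss at one position (illustrative):
# the model is penalized by how surprised it is at the true next token.
def loss(predicted_probs, true_next_token):
    # Cross-entropy for a single position: -log P(true next token).
    return -math.log(predicted_probs[true_next_token])

# Completing "the cat ___": a model confident in the real completion...
confident = {"the": 0.05, "cat": 0.05, "sat": 0.85, "mat": 0.05}
# ...gets lower loss than one that hasn't absorbed the pattern.
unsure = {"the": 0.25, "cat": 0.25, "sat": 0.25, "mat": 0.25}

print(loss(confident, "sat") < loss(unsure, "sat"))  # True
```

Gradient descent then nudges the weights toward whatever assignment of probabilities drives this number down, across every pattern in the data at once.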
By doing this, we create a model which tries to remember all patterns present in the data, however arbitrary. Common patterns get prioritized because they help minimize loss more, but the only way to minimize loss is to learn as many patterns as you can. That will include some patterns humans care about, and many more we do not.
Self-supervised learning is not a blind memorizer. It does abstract and generalize. But it abstracts indiscriminately.
Here’s the problem. Let’s say I want to train an AI model that can beat any human at chess. I train it on the history of all recorded chess games, including amateur games, master games, and grandmaster games. Feed it some number of opening moves and have it predict the next move. We update the model using self-supervised learning based on accuracy.
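The dataset construction step above can be sketched like this (the games are toy stand-ins for real records):

```python
# Illustrative: turn recorded games into self-supervised training examples.
games = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],  # toy "grandmaster" game
    ["f3", "e5", "g4", "Qh4"],          # toy amateur blunder (fool's mate)
]

def make_examples(game):
    # Every prefix of the game is an input; the move that followed is the target.
    return [(game[:i], game[i]) for i in range(1, len(game))]

examples = [ex for g in games for ex in make_examples(g)]
print(len(examples))  # 7
```

Note that the amateur game's moves are prediction targets just like the grandmaster's: this objective rewards reproducing bad play exactly as much as good play.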
Training my AI model this way, it would learn to play well. It would also learn to play poorly. It would learn the playstyle of every player in the data. It would learn to use the King’s Indian Defense if the game was played in the ’60s, but probably not if the game was in the ’90s. It would learn what I wanted, and orders of magnitude more that I didn’t care about.
The history of all recorded chess games is several gigabytes, but Stockfish, including the heuristics it uses to evaluate moves, can fit in 3–4 MB. This is at least a 1000x difference between the information we care about (some winning strategy) and the total information in the training data.
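The back-of-the-envelope arithmetic behind that ratio, using the approximate sizes above:

```python
# Approximate sizes from above (both are rough figures).
all_games_bytes = 4 * 10**9   # "several gigabytes" of recorded chess games
stockfish_bytes = 4 * 10**6   # Stockfish plus its heuristics, ~3-4 MB

ratio = all_games_bytes / stockfish_bytes
print(ratio)  # 1000.0
```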
Keep in mind that when chess officials wrote down the moves for a chess game, they were implicitly throwing away most of the data for us, like whether the pieces were made of wood or plastic, or whether so-and-so happened to cough before making a move. Not all datasets are this refined to exactly what we want the AI to learn. If you were unlucky enough to have to learn chess from videos of chess matches, the ratio of noise to important data would be like 1,000,000x or 1,000,000,000x. Yet even in the case of chess notation data, most of the information is not worth holding on to.
Now expand this from chess to every domain. Most patterns in most data will be worthless. Most patterns in reality itself are worthless. Humans discard almost all the data we perceive. Our intelligence involves discrimination. Models trained by self-supervised learning like LLMs, on the other hand, try to stuff as much of reality into their weights as possible. An LLM might know a lot about chess, since there’s a lot of chess-specific training data, but only a small amount of what it knows will be about winning chess. That’s why it’s sometimes hard to get peak performance out of an LLM. It won’t necessarily give you the best moves it can unless you tell it to pretend it’s Magnus Carlsen. It knows how to play chess kinda well, but also kinda poorly, and it doesn’t know which one you want unless you specify.
A 7-year-old child given an addition problem learns from it, but given a calculus problem, they simply ignore it. They won’t try desperately to memorize shapes of symbols they don’t understand. We remember what matters and discard the rest.
What matters depends on context and values. The wood grain pattern on my hardwood living room floor is irrelevant if I’m having a conversation about politics, but critical if I’m painting a picture of the room. It takes judgement to know what to focus on. The ability to focus is how we make sense of a very complex world. If remembering everything relevant were easy, then evolution would have let us do so. Instead, we’re forced to remember based on what we think is important.
Human intelligence is neither specialized to a single domain, nor fully general, like reality-stuffing LLMs. Human intelligence is something else. Call it specializable intelligence. We’re specialized in our ability to tactically learn new information based on our existing knowledge and values.
Some imagine superintelligence as a magical system that could play chess for the first time at a grandmaster level, having only seen the rules, deducing winning strategies through pure, brilliant logic. This is impossible. Chess is computationally irreducible. Many games must be played, whether in reality or in some mental simulation of games (or sub-game patterns). Existing knowledge of Go or checkers or “general strategy” will not really help. You can’t have an AI model that’s just good at everything. Not without a computer the size of the universe. What you want is an AI that can get good at things as needed. A specializable intelligence.
There is a tradeoff between a fully general intelligence and a specialized intelligence. The “no free lunch” theorem states that for any AI model, improvements on one class of problems come with worse performance on other classes of problems. You either stay general, or specialize in some areas at the cost of others.
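For reference, the usual Wolpert-Macready statement of the theorem for search, sketched from memory (here \(d_m^y\) is the sequence of \(m\) objective values that an algorithm \(a\) observes on objective function \(f\)); summed over all possible objectives, any two algorithms come out identical:

```latex
% No-free-lunch theorem (Wolpert & Macready), informal sketch:
% averaged over all objective functions f, any two search algorithms
% a_1 and a_2 are indistinguishable.
\sum_{f} P\left(d_m^y \mid f, m, a_1\right) = \sum_{f} P\left(d_m^y \mid f, m, a_2\right)
```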
This implies that, for fixed compute, a general intelligence will perform worse at the things we care about than a specialized intelligence could. Much worse, given just how much we don’t care about. Our goal should be specializable intelligence which can learn new things as needed, as well as some fundamentals humans care about often, like language, vision, logic, “common knowledge”, and so on. Creating general superintelligence would require literally astronomical compute, but specializable superintelligence would be far cheaper.[1]
Reality-stuffed general models that don't discriminate what they learn will never lead to superintelligent AI. Whatever superintelligence we achieve will not be general with respect to its training data. The chess example before was a contrived one. Keep in mind that we have a lot of good data for chess, and that chess is much less computationally complex than many tasks we care about.[2] An LLM might conceivably play chess well by overfitting to chess, but it won't have similar performance on novel games similar to chess, and it will be helpless at more complex tasks.
Here are some approaches to AI that I’d guess can’t get us to superintelligent AI:
- Just increasing compute. Diminishing returns (in useful capabilities) will set in. Loss may decrease predictably, but scaling laws measure the wrong objective.
- Higher quality data. This will help, practically speaking, but most of the information in even really high quality data is going to be worthless/discardable. Imagine you cleaned up a chess dataset. You only included grandmaster games, for example. That’s still way more data than the Stockfish heuristics. Preparing “good” data is equivalent to extracting patterns you care about from that data, which in the limit requires the intelligence you’re trying to create.
- Synthetic data. This boils off some noise from the original dataset, essentially creating a higher quality dataset with hopefully less information you don’t care about. Hopefully. But that’s all you’re doing.
- Curriculum learning. When you heard about that 7-year-old who learned from the addition problem but ignored the calculus problem, you might have thought the solution to this whole problem was to order the data so that harder information comes after its easier prerequisites. This won't work, because the model is still being evaluated on completing the training data, so it still has to memorize whatever patterns are in the data, even ones we don't care about. Maybe it'll learn more quickly, but it's what it's learning that's the problem. It may also lead to more unified internal world models, which is good, but not great if those world models are of things we don't even care about.
- Using another smaller LLM as an evaluator. Using a small model to judge how good or bad the output of a larger model-in-training is based on some metric humans care about won’t work, because it’s limited by the intelligence of the smaller model.
- RLHF (reinforcement learning from human feedback): The model is already stupid by the time you apply RLHF. It’s constrained by the abstractions already learned.
- Transformers and “attention”: Paying attention to different parts of a sentence when processing a token, and only paying attention to certain patterns humans care about in the data, both use the word “attention”, but they have nothing to do with each other. The model will still be penalized if it fails to predict the next token in the training data, which is a task that inherently requires memorizing a bunch of information humans don’t care about. Any architecture trained with respect to this goal will fail to scale to superintelligent AI. You might think that LLMs are already kind of specializable, because they can do “in-context learning” without any weight updates. But models think with their weights. The depth of thinking you can do in a domain without any learned patterns in the weights is limited. The whole point of the weights is to store abstractions so you can reason with them later. Depriving the model of the ability to do this makes it much stupider.
- Neuro-inspired models with Hebbian learning. (Hebbian = “neurons that fire together wire together”, basically if neuron A firing leads to neuron B firing, the connection between the two is strengthened, as in the human brain). Even with more sophisticated stuff like spike-timing-dependent plasticity, the problem is that Hebbian learning reinforces whichever thought patterns already occur, but doesn’t teach the model to care about certain things.
- Growing neural networks, making them larger as they train. If you’re using self-supervised learning, you’re still growing an idiot. I think this will make internal world models more unified as in the case of better training data ordering, but will not make the models care about only the patterns we want them to care about.
- Meta-learning. Using an outer loop based on gradient descent or evolution or something, and an inner loop based on gradient descent. I read one paper where the model did expensive evolution in the outer loop to set up the initial conditions for learning. They then had the evolved models learn using gradient descent on some task. The models that learned better were then selected for the next generation of evolution. The hope was that you could evolve a model that’s predisposed to be good at learning arbitrary tasks. But it seems wasteful to me to do expensive evolution to set up the initial state of a network only to bowl over that network with backpropagation. Gradient descent minimizing loss with respect to training data will create a reality-stuffed model, regardless of the initial conditions. So you’re essentially evolving good initial conditions for an idiot.
- Predictive coding: I haven’t looked into this much, but it seems like minimizing surprise is pretty similar to minimizing loss with respect to training data. Same problem: learning a bunch of patterns humans don’t care about.
- Anything that improves “grokking”. The transition from memorization to understanding the underlying patterns in data is important, but this is true whether you’re trying to learn important things, like “how English works” or “how to win at chess”, or you’re trying to learn unimportant things, like “how terrible chess players tended to make mistakes in the ’70s”. Grokking is a sign that abstraction is happening, but it’s not sufficient for discriminatory intelligence.
- Manually encoding human knowledge. E.g. putting human knowledge of words and phonemes into the model. The bitter lesson is still bitter.
- Online learning. This is necessary, but not sufficient for superintelligence. A general, reality-stuffing model with online learning will be trying to cram way too much information to be as smart as we want it to be.
I don't know what approaches could be more promising. Evolution of neuro-inspired models could work. We have at least one working example: us. Evolution gave humans basic architecture and values that tell us what information we "should" pay attention to and care about. Then, during our lifetimes, Hebbian learning lets us learn specific knowledge in accordance with those values. Unfortunately, evolution is just very expensive. Is there a cheaper way forward? Probably, but I have no idea what it is.
One thing to keep in mind is that any more promising approach will necessarily lose the loss minimization game. Yet currently, “conventional approaches” are a gold standard to which other more experimental approaches are compared. If a new method can’t predict the next token of training data better than the conventional approach, it’s reported as a failure — or perhaps as “only slightly better than” the conventional approach, to satisfy the publication demands of academia.
This heuristic cannot stand. We don't want general loss minimization with respect to training data. We want capability. Performance on novel games could be a valid benchmark. It could also be used during training. You'd first create specializable intelligence that can learn arbitrary games, then teach it specific games like "speaking English".
Novel games could also be used to operationalize the claim that useful capabilities will plateau even as loss continues to decrease. Specifically, I’d predict that performance on computationally complex novel games (at least as complex as chess) will barely improve as newer self-supervised models are released and continue to improve at traditional benchmarks. Novel games are a good benchmark because they prevent cheating if the training data happened to contain similar problems. A sufficiently novel game is unlike anything in the training data.
Self-supervised learning can only create general models, which are limited in their capability in any domain by trying to succeed in every possible domain. The trillion dollar bet on self-supervised models will not pay off, because these general models will continue to fail exactly where we need them the most — on novel, difficult problems.
[1] François Chollet also pointed out the weakness of general intelligence, citing the "no free lunch" theorem, but he went too far, missing the specializability of human intelligence. It's true that humans are specialized for a certain environment. Infants are born with certain reflexes and certain knowledge; for example, the fusiform face area of the brain is specialized for recognizing human faces. But even though we are partly specialized, we are also specializable. Give us any task and enough time, and we'll outperform a random actor. For example, psychologists created objects called greebles that share similar constraints with human faces but look totally alien. They then trained some humans to become experts at recognizing greebles, and found they could reliably tell them apart, and that they used a holistic approach when viewing them rather than looking at individual parts. In short, as long as we can extract patterns from data, and use those patterns to further refine our search for more patterns, we can do anything.
[2]
Solstice Season Megameetups
tl;dr: Solstice Season is coming. It's a good time to visit old friends, and reflect on the big questions together.
- Berkeley winter solstice will be held Dec 6th. Lighthaven will be open the month of December for visitors who want to stay longer or get day-passes to hang out.
- New York winter solstice will be held Dec 20th, with a megameetup that weekend at the HI NYC Hostel.
- Other solstice celebrations around the world are encouraged to post their events on LessWrong, which will be highlighted on a community map soon. If you think you're running something "megameetup" level, let me know and I'll add it here.
If you want to run a small solstice for your friends or a bigger one for your local community, you can find resources here to help you get started.
For the folk in the northern hemisphere, the nights are getting long. The sky is getting dark. Bold Orion is in the night sky[1]. For folk in the southern hemisphere, the opposite of all that is happening – soon the world's light will be at its zenith.
Most rationalist winter solstice rituals are relatively serious: basically a church service, if church were about not all believing the same things but sharing a commitment to truthseeking and excitement about human progress. LessWrong-ish winter solstice ceremonies are about confronting dark, immense truths that are difficult to face alone. They also usually involve singing together.
Summer solstices tend to be much more lighthearted and fun, celebrating the here-and-now. There are, uh, rather a lot more LessWrong folk in the northern hemisphere than the southern, so this post is mostly oriented around the winter frame. But either way, you are encouraged to post your solstice celebrations/rituals/events to LessWrong.
In New York and Berkeley in particular, there'll be large megameetups where hundreds+ people will show up, with unconferences and afterparties surrounding the winter solstice ceremony. (If there are any other large scale megameetup-y things happening elsewhere in the world, let me know)
If you're only going to come to one rationalist event this year, a Solstice megameetup is a pretty good choice.
New York
New York was the founding home of the rationalist winter solstice, and is still one of the largest. Each year they host a ceremony with some of the best solstice music, and a megameetup where around a hundred people gather at a hotel for a weekend festival/unconference/sleepover.
The megameetup starts on Friday, Dec 19th and runs till Monday the 22nd.
The solstice ceremony itself is Saturday evening (exact start time tbd).
You can get tickets to the solstice, the megameetup, or a hostel room with 8 bunkbeds on the megameetup site.
Berkeley
Berkeley is probably the largest concentration of rationalists living in one city (I haven't checked which is denser, Berkeley or SF).
Berkeley's Solstice ceremony will be December 6th. Doors open at 7pm, with the event starting (hopefully!) at 7:30. You can get tickets here. It'll take place at the Freight and Salvage theater.
The megameetup and afterparty will be hosted at Lighthaven. If you want to attend the megameetup on Friday or Saturday afternoon, tickets are $50. (This includes access to the December Lighthaven schedule, for any other spontaneous events people might schedule).
Lighthaven will also be open all month for people visiting Berkeley, or who just want to cowork or hang out. You can rent rooms. (Later you'll be able to rent day-passes, but it's not set up yet). If you book longer stays you can get a bulk discount. We'll be open through the end of December.
Everything's listed on the Lighthaven Solstice Season page.
Smolstice
Remember, even if you're attending a big megameetup, you can still hold a small solstice ritual for your friends and family the night of the 21st. The thing that originally inspired me to create the solstice ritual wasn't Midnight Mass; it was my family's small Christmas Eve ceremony, where 20+ people could cram around a dinner table and then settle into the living room around a fireplace for hours of singing.
Since then, some of us have also experimented with "outdoor firepit Solstice", where the setting sun itself guides you from light into darkness. (Sometimes, during the Moment of Darkness, people take off their coats for a couple minutes to feel the winter chill more fully.)
"Living room solstice" and "outdoor campfire solstice" both have really nice vibes, and by hosting a small thing of your own, you can tailor it to exactly your own preferences, with individual people maybe bringing specific things that are meaningful to them.
Have a happy Solstice Season!
[1] And while he isn't actually older than continents, he is pretty old, and he has long heralded the human experience of winter.
My new nonprofit Evitable is hiring.
Our mission is to inform and organize the public to confront societal-scale risks of AI, and put an end to the reckless race to develop superintelligence.
We're hiring for 3 roles:
1) Operations Associate or Head of Operations
2) Communications Associate or Head of Communications
3) Chief of Staff (CoS)
We're also fielding expressions of interest from people who might lead movement building efforts (on a somewhat longer timeline).
I'd appreciate applications, leads for candidates, and/or signal boosting the Tweet!
Willpower is exhausting, use content blockers
Okay, you say, well now I’ve thought about how I’m addicted to my screens, and what I’d be excited about doing if I weren’t. And you’ve told me a lot of stories. But what do you actually want me to do?
I’m so glad you asked! The answer to that question is very long, but that’s why I’m writing about it for thirty straight days.
Let’s start with something simple: Content blockers.
Changing your relationship to screens requires a certain amount of willpower, but using willpower alone is fragile and exhausting. Content blockers reduce the amount of willpower that you need to use at each given moment.
So today, I’ll go through a few different content blockers I use. (I was going to list all the strategies I use — physical life design, social strategies, as well as other ways I set up my devices to make them less addictive — but it was getting way too long and taking forever.)
Every website and app is designed to maximize the amount of time you spend on it, and that is not to your benefit, so I recommend blocking everything you can stand to block.
However! The easiest way to fail with content blockers is to accidentally block something you really need. Then you get in the habit of overriding them, or just stop turning them on, and you’re left raw and exposed to the full gale force of the internet like a naked baby in a hurricane.
So, to build a sustainable content blocking strategy, start slow, and allow room for trial and error. Like, you might try first with an override close to hand, and then move to a more locked down system once you’ve dialed in your block lists.
With that in mind, here are some content blockers that I think are good:
Freedom
My fav, the backbone of my life. You can create customized block lists or whitelists, and run both recurring and one-off block sessions.
What I use it for: All my potential vices are blocked during my waking hours every day except Saturday. For me that includes all social media, online shopping (even for food), videos/streaming, games, news sites, fanfiction, blogs, and Wikipedia. If there’s something I really want to do on one of those sites, I know I’ll be able to do it on Saturday if I still care then.
Logistics: Browser extension connected to a website. There’s a phone version but I never got it to work. Costs $40/year, which is totally worth it for me.
Important note: Turn on Locked Mode, and go into settings to disable the ability to end sessions. Otherwise it’s too easy to override.
uBlock Origin
Allows you to block specific website elements. Also automatically blocks all ads and trackers, more effectively than other ad blockers (e.g. you can watch a YouTube video without ads; I hear most other ad blockers can't do this anymore).
What I use it for: I block all feeds, all recommendations, and any other distracting elements on every site I use regularly — e.g. YouTube shows just a video and its description; Facebook shows just a search bar. Without uBlock Origin the experience of using the internet is so terrible I don’t know how anyone does it.
Logistics: Browser extension. Stopped working on Chrome, and it’s so important to me that I switched to Firefox. Note that there’s something else just called uBlock which is totally different.
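For a sense of what blocking elements looks like: uBlock Origin's cosmetic filters use a `domain##CSS-selector` syntax in the "My filters" pane. The entries below are illustrative only; the selectors are guesses and may not match the sites' current markup.

```
! Hide YouTube's home-page recommendation grid (selector may be outdated)
www.youtube.com##ytd-rich-grid-renderer
! Hide a site's front-page feed by its CSS class (hypothetical class name)
example.com##.front-page-feed
```

In practice it's easier to use the extension's element-picker tool, which writes rules like these for you when you click on the element you want gone.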
Screen Time
Apple's native content blocker, which lets you set time limits or fully lock apps and websites. Unfortunately, all you need to override the blocks is a passcode. People usually have a partner, roommate, or friend set their code, so they can't easily unlock things any time.
What I use it for: My browser was my biggest time sink on my phone, but also sometimes you need to look something up, so I’m allowed five minutes a day. The couple phone games I’ve been addicted to are fully blocked at all times. I haven’t explored Screen Time on my laptop because I already have other solutions.
Note: I know non-Apple systems have native content blockers too, I’m just not familiar, sorry.
OneSec
Delays your ability to open chosen apps/sites, allowing you to pause and decide whether you actually want to do it. It also shows you how many times that day you've tried to open that app/site. Highly recommended by a friend of mine.
What to use it for: Any app or site you don’t want to use compulsively / without conscious thought, but don’t want to block entirely for some reason. For example, you need to check your email sometimes, but don’t want to check it all the time for no reason.
Logistics: Has Android and iPhone apps, as well as browser extensions for Chrome, Firefox, Safari, and Edge.
Brick
A little physical object; you select what to block (or allow) using the app, and start a session by tapping your phone to it. So e.g. you can tap it before going outside to block everything except maps, calls/messages, and ride sharing. Then the only way to end the session is to tap it again – hence the value of the physical object. If you leave it at home, you can't unblock until you get back.
What my friend uses it for: To stay off his phone in bed (can’t get messages, but your alarm still works!) and work without distraction in the morning. He called it ‘night and day’ for his ability to focus deeply.
Logistics: The devices are ~$60 each, and work with iPhone and Android. They’re magnetic so you can put them on a fridge. (If the name is too generic, look up Brick LLC.)
SelfControl
Nuclear option. Basically impossible to override, and can block the internet entirely.
What to use it for: I stopped using SelfControl after I discovered Freedom, but it’s popular and a good option for people who really need something they can’t override.
Logistics: Mac only. Free and open source.
There are many, many other content blockers out there, including well-known ones. These are just the ones I’m most familiar with. As you may have noticed, they’re all slightly (or very) different from one another. If none of them are exactly what you want, go exploring!
Oh my god this post is over 1000 words.
A review of MSUM's AI Innovation Summit: Day Two
This is a continuation of my previous post, and will discuss the second day, targeted towards educators: "Learning in the age of AI." There were around 70 attendees, nearly double the previous day; it seemed roughly 40% were K-12 teachers, 30% administrators/curriculum creators, and 20% college professors, with a few "others," a couple of students, and me.
If the first day of the AI Innovation Summit had one main question (how can attendees use AI to enhance their business), the second day had three.
1. How can teachers use AI to enhance their teaching?
2. How do teachers deal with student use of AI?
3. How does the existence of AI change what students need to be taught?
Of these, only the first has relatively easy answers. This, combined with the more general tendency among teacher-types to equivocate, to say "It's more about asking the right questions than having the right answers," and to seek agreement more than to get into specific metrics, made the whole day feel significantly more open-ended and less "alright, here was the takeaway."[1] It's not my preferred style of engagement, but I was glad to be there and engage. One benefit of the day was that I got to hear more people voice their opinions and perspectives on the changes AI is bringing.
Keynote Address: Another Innovative Initiative: AI in Education
This talk was given by two leaders of the Minnesota Generative AI Alliance for Education (MNGAIA), a community organization that is set to become a nonprofit at the beginning of 2026. The group describes itself as "a coalition dedicated to the ethical and effective integration of AI into education, ensuring that humans remain at the forefront of decision-making and learning."
Their talk is a little difficult to summarize; it jumped around and even included some audience participation, but I'd split it into three main points. 1) AI is a scary new technology, but many other now-normal technologies were scary when they were new. 2) What's actually important to us? Does AI change that *that* much? 3) Discussing their MNGAIA organization, how it started, and what it's done so far.
For their first point, they compared AI to many other technologies, putting it in a line of progress from scribes to the printing press to personal computers, and then AI. Plato was worried that the rise of books would lead to forgetfulness, but we're all happy that books exist now. There wasn't really any discussion of how AI could or would differ qualitatively from those technologies; they generally kept to the "AI as tool" framing. It's natural for people to have anxiety about change, but it's also sort of not up to the teachers whether they can stop or significantly affect the change. AI is happening to us and with us, and people generally should try to work within that change rather than against it.
For the second, it got more into the "hard to take a concrete position" mode often seen at this conference. The word "human-centered" was thrown around many times: we need to build AIs and systems that serve human values, and we should promote good things like lifelong learning, adaptability, and human relationships. I agree, but I think there was far too much saying we need to make it human-centered and not enough talking about how that actually happens.
For the third point: they started as just a group of teachers and administrators holding a monthly call to discuss AI and related issues, and have grown into a full-fledged organization that hosted its own AI-related summit back in June. They host a discussion forum, have testified in front of the MN Senate, do some research, and offer resources and tools to MN educators. These resources include various policy documents and guiding principles for AI use in the classroom, which I may look into later.
Breakout Session One: The Human Advantage: Equipping Learners for an AI-Enabled World.
This session focused on the work being done in one high school, mainly focusing on developing "irreplaceable human skills." The presenter's focus was very much on figuring out how to prepare students to find jobs in the future, and she cited the World Economic Forum's 2025 Future of Jobs Report, which states that while 92 million jobs are likely to be displaced by current trends, 170 million new jobs will be created this decade. The question is how to prepare students for jobs that we don't yet know about, in response to which she described her school's six human-centered competencies of character, communication, citizenship, creativity, collaboration, and critical thinking. These all seem like good things, but weren't really defined well in the time we had, and of course I'm not convinced that humans will continue to be better than AI at these (or even are currently better in some cases). She then had us do an activity where we would use AI to help us make a short presentation on how to develop one of these skills and go around the room, which I didn't get much out of. Thus ended the session.
Breakout Session Two: Launching the Institute of Applied AI.
This session was led by MSUM's Institute of Applied AI Executive Director, and was essentially to explain its reason for existence and its future plans/goals. It's very much "we think AI will shift what skills are valuable and we need to be a college that can prepare our graduates for the workforce." To do this, they plan to host AI workshops, have faculty fellows with expertise in AI, give AI tool demonstrations, do other readiness work, partner with regional companies to see what they're seeking from graduates, and more, including hopefully some sort of micro-credentialing in the future. These all seem like interesting ideas that will be good in a very slow takeoff world. In a world with faster takeoff, I'm skeptical of how much will be relevant, but I would also think that they would be able to react more quickly to faster takeoff than a university without such infrastructure. The Institute is still in its infancy, beginning its existence this past spring, and it's only projected to be fully implemented in 2027-2028, but I plan to keep an eye on what it's doing and where it's going.
Breakout Session Three: Supporting and Preparing Students in the Emerging Age of AI.
This session had the most 'meat', including three leaders at MSUM answering real questions about current AI teaching issues. Their first discussion was about how college students are different now than in the past. Whether it was the pandemic, an effect of more high school students going to college, or other factors, students are less prepared for a college curriculum than they would have been in the past. One said "students were college-ready, now we have to be student-ready."
Moving on to the job market, they acknowledged that right now the job market is difficult, and they don't know yet what the many jobs created by AI will look like. Through this they see two main paths forward: either students (and others) can find ways to upskill and be prepared for these new jobs, or they will be un/underemployed. Colleges, which need to prepare students for jobs to justify their tuition, need to keep up with industry trends and stay closely connected to industry.
Speaking of which, how are employers hiring right now? They say there's somewhat of a shift going on from experience/credential-based hiring to more skill-based hiring. This is one of the reasons behind the Institute's desire to create micro-credential programs, to provide proof of useful skills in job-seeking. Colleges also need to consider possible industry training programs as potential competitors.
However, multiple people on the panel also consider it important that colleges remain more than just job skills training programs, that they should focus on developing the soft skills of students, harkening back to sort of a traditional view of higher education, but for the purpose of increasing ability to find jobs in an uncertain future. I really wonder how you measure/track the development of these soft skills, but that would be a question for another day.
Breakout Session Four: Critical AI Literacy Roundtable.
The point of this session was to define "Critical AI Literacy," mostly contrasting it with "AI literacy" and "technological literacy." My notes for this session weren't the greatest, mostly thanks to it including more active participation than the others, but also due to me not thinking it was very helpful. The most interesting part to me was some discussion of the conflict that occurs in educators with regard to AI use. According to one attendee, the students she sees tend to be fairly polarized with regard to using AI, in that some will use it all the time even when teachers would rather they not, and some students don't want to use AI at all, generally for moral/environmental reasons. The question raised by this teacher was how to deal with these students. The educators want their students to be able to hold moral positions and stick to them under pressure, but they also want those students to be able to get a good job post-graduation, and that may require AI skills. We didn't have a good answer to this. I overall think we should talk to people about their concerns (I think the AI environmental issue is hugely overblown, mostly thanks to the work done by Andy Masley) and do our best to present the situation to them honestly.
Ending Critique:
The main way I thought it was lacking: Nobody seems to feel the AGI. People aren't projecting out plans for how to live with AI that surpasses us in most every way; they're figuring out how to make stuff work with AI in its current state. They see it as a tool and aren't imagining that it will be something other than that. I know it's hard to plan for future eventualities with really high weirdness, but I do think it's important to project the future advancement of AI when you're discussing incoming high school freshmen starting to make career plans.
- ^
If you've read Spiral Dynamics (which I wouldn't exactly endorse, but it's an interesting map of the territory), it was aggressively Green vMeme.
Discuss
Brutalist Prose
THis may be the laziest lesswrong post you will ever read. Typically, LW house style is heavily edited, tagged, linked, backlinked, and cited. Asides are moved to the endnotes/sidenotes. Sources are checked and double checked. You proofread the draft yourself, then send it to the LW team and ask them pweease is this good is this okie dokie and they are like error on line 17. (uhh team, is this even true? not like i ever actually post anythin here)
I am here today to suggest a radical new approach to writing: Never delete. Never edit. Publish the first draft. Get it write on the first try. Like a language model spitting out token after token, you cannot go back, you can only shoot forward.
I traditionally follow the Paul Graham school of essay writing. He says an essai is an attempt to get to the truth. When you essay, you might not even know what your conclusion is until you get there. Never write on the bottom line. If you only write forward, you will never go back.
Concrete is flowing stone. It is poured as a slurry of water, stone, and sand into a wooden mold and then it cures into a solid chunk of rock. Before it fully cures there is a finishing phase where you smooth the surface. But if you never finish then the concrete retains the edge and ridge patterns of the mold. This technique is called béton brut (french for "raw concrete") and the architecture school of this style is called brutalism.
Brutalism has nothing to do with savage brutes. It is not about clubbing your users over the head with grey artificial blocks of alienation. The opposite! to be brut is to be real, to show without any fear or embarrassment the true organic, to purely reflect how bio-capital made this. Realness is often cheaper than simulacra, but more importantly it gives the direct observation of Nature, which every rationalist should appreciate.
Aside — one of my gripes with rationalist interior decorating is that it is not real enough. I would sooner lay on a rough concrete slab than to lay on astroturf. Reality is supposed to have a surprising amount of detail I thought? If you want soft outdoor pavement, may I suggest the rubbery material they use for running tracks?
Writing Rough
The advantage to publishing rough drafts is that the nerd reading this is very sure every word of this was written by an original human, not some sloppily emulation. Besides, I would like to think my errors delightfully show my personality. I mean, isn't that kinda the point of humanity?
The counter is that language models have latent abilities in detecting nuances in word choice. The super-beings of the future will appreciate the lack of effort you put into each word and each char. I invite you to consider the opposite: actually the AI will enjoy writing which is heavily golfed like poetry, rap lyrics, and tweets.
Okay okay, never editing AT ALL would be too insane.
Instead, here are the suggested guidelines for Brutalism:
- as I say, on draft #0 do not go back at all, only go forward. write it in pen in a notebook if you must
- tally up how many typos. fix the errors that suck. leave the good ones
- for every error removed, re-introduce another error where it makes the writing clearer, funnier, or better in some other way.
- harness a style distinct from school english. I take inspiration from internet english. whitespace as punctuation. quote marks like code parens - ending punctuation goes "outside". breezy lowercase sentences. tasteful use of slang to play low status where appropriate (yudkowsky does this).
- go back and link stuff up.
Oh and do cut out words in accordance with George Orwell's rules of writing:
Never use a metaphor, simile or other figure of speech which you are used to seeing in print.
Never use a long word where a short one will do.
If it is possible to cut a word out, always cut it out.
Never use the passive where you can use the active.
Never use a foreign phrase, a scientific word or a jargon word if you can think of an everyday English equivalent.
Break any of these rules sooner than say anything outright barbarous.
I would add one last rule: never lie.
Discuss
Can we do useful meta-analysis? Unjournal evaluations of "Meaningfully reducing consumption of meat... is an unsolved problem..."
Cross-posted from the EA forum here
The Unjournal commissioned two evaluations of "Meaningfully reducing consumption of meat and animal products is an unsolved problem: A meta-analysis" by Seth Ariel Green, Benny Smith, and Maya B Mathur. See our evaluation package here.
My take: the research was ambitious and useful, but it seems to have important limitations, as noted in the critical evaluations; Matthew Jané's evaluation provided constructive and actionable insights and suggestions.
I'd like to encourage follow-up research on this same question, starting with this paper's example and its shared database (demonstrating commendable transparency), taking these suggestions on board, and building something even more comprehensive and rigorous.
Do you agree? I come back to some 'cruxes' below:
- Is meta-analysis even useful in these contexts, with heterogeneous interventions, outcomes, and analytical approaches?
- Would a more rigorous and systematic approach really add value? Should it follow academic meta-analysis standards, or "a distinct vision of what meta-analysis is for, and how to conduct it" (as Seth suggests)?
- Will anyone actually do/fund/reward rigorous continued work?
The authors discussed this paper in a previous post.
We conclude that no theoretical approach, delivery mechanism, or persuasive message should be considered a well-validated means of reducing MAP [meat and animal products] consumption
The authors characterize this as evidence of "consistently small effects ... upper confidence bounds are quite small" for most categories of intervention.[1]
Unjournal's evaluators: ~this meta-analysis is not rigorous enough
From the Evaluation Manager's summary (Tabare Capitan):
... The evaluators identified a range of concerns regarding the transparency, design logic, and robustness of the paper’s methods—particularly in relation to its search strategy, outcome selection, and handling of missing data. Their critiques reflect a broader tension within the field: while meta-analysis is often treated as a gold standard for evidence aggregation, it remains highly sensitive to subjective decisions at multiple stages.
Evaluators' substantive critiques
Paraphrasing these -- mostly from E2, Matthew Jané, but many of the critiques were mentioned by both evaluators:
Improper missing data handling: Assigning SMD = 0.01 to non-significant unreported effects introduces systematic bias by ignoring imputation variance
Single outcome selection wastes data: Extracting only one effect per study discards valuable information despite authors having multilevel modeling capacity
Risk-of-bias assessment is inadequate: The informal approach omits critical bias sources like selective reporting and attrition
Missing "a fully reproducible search strategy, clearly articulated inclusion and exclusion criteria ..., and justification for screening decisions are not comprehensively documented in the manuscript or supplement."
No discussion of attrition bias in RCTs... "concerning given the known non-randomness of attrition in dietary interventions"
... And a critique that we hear often in evaluations of meta-analyses: "The authors have not followed standard methods for systematic reviews..."
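The missing-data critique at the top of the list is easy to see in a toy simulation (entirely hypothetical numbers, not the paper's data): imputing a fixed SMD of 0.01 for unreported non-significant effects drags the pooled estimate toward zero, on top of understating uncertainty by treating the imputed values as if they carried no imputation variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not the paper's data): k studies, true SMD = 0.2,
# all with the same per-arm sample size so every study has the same SE.
k, true_smd, n = 200, 0.2, 50
se = np.sqrt(2 / n)                      # approximate SE of an SMD, n per arm
est = rng.normal(true_smd, se, size=k)   # each study's observed SMD

# Studies whose estimate is not "significant" (|z| < 1.96) go unreported;
# the meta-analyst imputes a fixed SMD of 0.01 for them, with no extra variance.
reported = np.abs(est / se) >= 1.96
imputed = np.where(reported, est, 0.01)

# Equal-weight pooling (all SEs are identical here, so this matches
# inverse-variance weighting too).
pooled_full = est.mean()         # what complete data would give
pooled_imputed = imputed.mean()  # fixed-value imputation

print(f"pooled, complete data:   {pooled_full:.3f}")
print(f"pooled, 0.01 imputation: {pooled_imputed:.3f}")
```

With these made-up numbers most studies are individually underpowered, so most effects get replaced by 0.01 and the pooled estimate lands well below the truth.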
Epistemic audit: Here is RoastMyPoast's epistemic and factual audit of Jané's evaluation. It gets a B- grade (which seems to be the modal grade with this tool). RMP is largely positive, but offers some constructive criticism (asking for "more explicit discussion of how each identified flaw affects the magnitude and direction of potential bias in the meta-analysis results.")
One author's response
Seth Ariel Green responded here.
Epistemic/factual audit: Here is RoastMyPoast's epistemic and factual audit of Seth's response. It gets a C- grade, and it raises some (IMO) useful critiques of the response, and a few factual disagreements about the cited methodological examples (these should be double-checked). It flags "defensive attribution bias" and emphasizes that "the response treats innovation as self-justifying rather than requiring additional evidence of validity."
Highlighting some of Seth's responses to the substantive critiques:
"Why no systematic search?"
...We were looking at an extremely heterogeneous, gigantic literature — think tens of thousands of papers — where sifting through it by terms was probably going to be both extremely laborious and also to yield a pretty low hit rate on average.
we employed what could be called a ‘prior-reviews-first’ search strategy. Of the 985 papers we screened, a full 73% came from prior reviews. ... we employed a multitude of other search strategies to fill in our dataset, one of which was systematic search.
David Reinstein:
Seth's response to these issues might be characterized as ~"the ivory tower protocol is not practical, you need to make difficult choices if you want to learn anything in these messy but important contexts and avoid 'only looking under the streetlamp' -- so we did what seemed reasonable."
I'm sympathetic to this. The description intuitively seems like a reasonable approach to me. I'm genuinely uncertain as to whether 'following the meta-analysis rules' is the most useful approach for researchers aiming at making practical recommendations. I'm not sure if the rules were built for the contexts and purposes we're dealing with.
On the other hand, I think a lack of a systematic protocol limits our potential to build and improve on this work, and to make transparent fair comparisons.
And I would have liked the response to take on the methodological issues raised directly -- yes, there are always tradeoffs, but you can justify your choices explicitly, especially when you are departing from convention.
"Why no formal risk of bias assessment?"
The main way we try to address bias is with strict inclusion criteria, which is a non-standard way to approach this, but in my opinion, a very good one (Simonsohn, Simmons & Nelson (2023) articulates this nicely).
After that baseline level of focusing our analysis on the estimates we thought most credible, we thought it made more sense to focus on the risks of bias that seemed most specific to this literature.
... I hope that our transparent reporting would let someone else replicate our paper and do this kind of analysis if that was of interest to them.
David: Again, this seems reasonable, but also a bit of a false dichotomy: you can have both strict inclusion criteria and risk of bias assessment.
"About all that uncertainty"
Matthew Jané raises many issues about ways in which he thinks our analyses could (or in his opinion, should) have been done differently. Now I happen to think our judgment calls on each of the raised questions were reasonable and defensible. Readers are welcome to disagree.
Matthew raises an interesting point about the sheer difficulty in calculating effect sizes and how much guesswork went into it for some papers. In my experience, this is fundamental to doing meta-analysis. I’ve never done one where there wasn’t a lot of uncertainty, for at least some papers, in calculating an SMD.
More broadly, if computing effect sizes or variance differently is of interest, by all means, please conduct the analysis, we’d love to read it!
David: This characterizes Seth's response to a number of the issues: 1. This is challenging, 2. You need to make judgment calls, 3. We are being transparent, and allowing others to follow up.
I agree with this, to a point. But again, I'd like to see them explicitly engage with the issues, and with the careful and formal treatments and specific practical solutions that Matthew provided. And as I get to below -- there are some systemic barriers to anyone actually following up on this.
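For concreteness, the effect-size calculations under discussion are standardized mean differences; here is a generic sketch of Hedges' g (the standard textbook formula, not the paper's exact procedure, applied to made-up inputs):

```python
import math

def hedges_g(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference (Hedges' g) with small-sample correction."""
    s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / s_pooled                 # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)          # small-sample correction factor
    g = j * d
    var_g = j**2 * ((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return g, var_g

# Made-up study: 50 per arm, a half-SD difference in reported consumption.
g, var_g = hedges_g(m1=1.0, s1=1.0, n1=50, m2=0.5, s2=1.0, n2=50)
print(f"g = {g:.3f}, SE = {math.sqrt(var_g):.3f}")
```

The formula itself is mechanical; the judgment calls Seth describes arise when papers report medians, change scores, or cluster-level summaries instead of the clean means, SDs, and ns this function assumes.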
Where does this leave us – can meta-analysis be practically useful in heterogeneous domains like this? What are the appropriate standards?
Again from the evaluation manager's synthesis (mostly Tabare Capitan)
... the authors themselves acknowledge many of these concerns, including the resource constraints that shaped the final design. Across the evaluations and the author response, there is broad agreement on a central point: that a high degree of researcher judgment was involved throughout the study. Again, this may reflect an important feature of synthesis work beyond the evaluated paper—namely, that even quantitative syntheses often rest on assumptions and decisions that are not easily separable from the analysts' own interpretive frameworks. These shared acknowledgements may suggest that the field currently faces limits in its ability to produce findings with the kind of objectivity and replicability expected in other domains of empirical science.
David Reinstein:
... I’m more optimistic than Tabaré about the potential for meta-analysis. I’m deeply convinced that there are large gains from trying to systematically combine evidence across papers, and even (carefully) across approaches and outcomes. Yes, there are deep methodological differences over the best approaches. But I believe that appropriate meta-analysis will yield more reliable understanding than ad-hoc approaches like ‘picking a single best study’ or ‘giving one’s intuitive impressions based on reading’. Meta-analysis could be made more reliable through robustness-checking, estimating a range of bounded estimates under a wide set of reasonable choices, and enabling data and dashboards for multiverse analysis, replication, and extensions.
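A multiverse-style robustness check of the kind described above can be sketched in a few lines (entirely hypothetical data and analytic choices, purely illustrative):

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)

# Entirely hypothetical study-level data: estimates, SEs, and a flag for
# studies whose effect went unreported.
k = 60
se = rng.uniform(0.1, 0.4, size=k)
est = rng.normal(0.1, se)
unreported = rng.random(k) < 0.3

def pool(impute, weighting):
    """Fixed-effect pooled estimate under two analytic choices."""
    if impute == "drop":                     # exclude unreported studies
        e, s = est[~unreported], se[~unreported]
    else:                                    # impute a fixed SMD for them
        e, s = np.where(unreported, impute, est), se
    w = 1 / s**2 if weighting == "inverse-variance" else np.ones_like(s)
    return float((w * e).sum() / w.sum())

# A tiny "multiverse": every combination of the two choices.
for impute, weighting in itertools.product(["drop", 0.0, 0.01],
                                           ["inverse-variance", "equal"]):
    print(f"impute={impute!s:>5}  weighting={weighting:>16}  "
          f"pooled={pool(impute, weighting):+.3f}")
```

Reporting the full range of pooled estimates across such a grid of reasonable choices makes the sensitivity to researcher judgment explicit, rather than leaving it buried in one set of defaults.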
I believe a key obstacle to this careful, patient, open work is the current system of incentives and tools offered by academia, and the current system of traditional journal publications as a career outcome and ‘end state’. The author’s response “But at some point, you declare a paper ‘done’ and submit it” exemplifies this challenge. The Unjournal aims to build and facilitate a better system.
Will anyone actually follow up on this? Once the "first paper" is published in an academic journal, can anyone be given a career incentive, or direct compensation, to improve upon it? Naturally, this gets at one of my usual gripes with the traditional academic journal model, a problem that The Unjournal's continuous evaluation tries to solve.
This also depends on... whether the animal welfare and EA community believes that rigorous/academic-style research is useful in this area. And wants to fund and support a program to gradually and continually improve our understanding and evidence on perhaps a small number of crucial questions like this.
(And, preaching to the choir here, I also think it depends on good epistemic norms.)
- ^
However, they say "the largest effect size, ... choice architecture, comes from too few studies to say anything meaningful about the approach in general. So for that case we're dealing with an absence of evidence, i.e., wide posteriors."
Discuss
Toward Statistical Mechanics Of Interfaces Under Selection Pressure
Imagine using an ML-like training process to design two simple electronic components, in series. The parameters θ1 ...
{font-family: MJXc-TeX-size4-R; src: local('MathJax_Size4'), local('MathJax_Size4-Regular')} @font-face {font-family: MJXc-TeX-size4-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Size4-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Size4-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Size4-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-R; src: local('MathJax_Vector'), local('MathJax_Vector-Regular')} @font-face {font-family: MJXc-TeX-vec-Rw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Regular.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Regular.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Regular.otf') format('opentype')} @font-face {font-family: MJXc-TeX-vec-B; src: local('MathJax_Vector Bold'), local('MathJax_Vector-Bold')} @font-face {font-family: MJXc-TeX-vec-Bx; src: local('MathJax_Vector'); font-weight: bold} @font-face {font-family: MJXc-TeX-vec-Bw; src /*1*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/eot/MathJax_Vector-Bold.eot'); src /*2*/: url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/woff/MathJax_Vector-Bold.woff') format('woff'), url('https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.2/fonts/HTML-CSS/TeX/otf/MathJax_Vector-Bold.otf') format('opentype')} control the function performed by the first component, and the parameters θ2 control the function performed by the second component. 
The whole thing is trained so that the end-to-end behavior is that of a digital identity function: voltages close to logical 1 are sent close to logical 1, voltages close to logical 0 are sent close to logical 0.
Background: Signal Buffering
We’re imagining electronic components here because, for those with some electronics background, I want to summon to mind something like this:
This electronic component is called a signal buffer. Logically, it’s an identity function: it maps 0 to 0 and 1 to 1. But crucially, it maps a wider range of logical-0 voltages to a narrower (and lower) range of logical-0 voltages, and correspondingly for logical-1. So if noise in the circuit upstream might make a logical-1 voltage a little too low or a logical-0 voltage a little too high, the buffer cleans that up, pushing the voltages closer to their ideal values.
This is a generalizable point about interfaces in scalable systems: for robustness and scalability, components need to accept less-precise inputs and give more-precise outputs.
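The voltage-restoring behavior described above can be sketched as a toy transfer function. A steep sigmoid maps a wide range of noisy input voltages onto a narrow range of clean output voltages; all thresholds here are illustrative, not taken from a real datasheet.

```python
# Toy model of a signal buffer: a steep sigmoid transfer function that
# maps a wide range of input voltages to a narrow range of outputs.
# All constants here are illustrative, not from a real component.
import math

V_DD = 5.0   # supply voltage
V_MID = 2.5  # switching threshold
GAIN = 4.0   # steepness of the transfer curve

def buffer_out(v_in: float) -> float:
    """Map an input voltage to a restored output voltage."""
    return V_DD / (1.0 + math.exp(-GAIN * (v_in - V_MID)))

# A noisy logical 1 (nominally 5 V, sagged to 3.8 V) is pushed back up,
# and a noisy logical 0 (risen to 1.0 V) is pushed back down.
print(round(buffer_out(3.8), 2))  # close to 5 V
print(round(buffer_out(1.0), 2))  # close to 0 V
```

The key property is that the output ranges are narrower than the input ranges: imprecise in, precise out.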
That’s the background mental picture I want to invoke. But now, I want to combine it with an ML-like mental picture of training a system to match particular input/output behavior.
Back To The Original Picture: Introducing Interfaces
θ1 chooses the function performed by the first component, θ2 chooses the function performed by the second component; the colored curves show some possible functions for the two components. The whole system is trained to have a particular end-to-end behavior.
Here’s a conceptual story.
There are three interfaces - “APIs”, we’ll call them. The first (API1) is at the input of the whole system, the second (API2) between the two components, and the last (API3) is at the output of the whole system. At each of those APIs, there’s a set of “acceptable” voltages for each logical input to the full system (i.e. 0 or 1).
The APIs constrain the behavior of each component - e.g. component 1 is constrained by API1 (which specifies its inputs) and API2 (which specifies its outputs).
Let's put some math on that, with some examples.
A set of APIs might look like:
- API1:(0↦[0V,1.2V],1↦[3.5V,5.0V]) - i.e. the full system accepts either a voltage between 0 and 1.2 volts (representing logical 0), or a voltage between 3.5 and 5.0 volts (representing logical 1). No particular behavior is guaranteed for other voltages.
- API2:(0↦[0V,0.5V]∪[4.6V,5.0V],1↦[2.6V,3.8V]) - i.e. in the middle the system uses extreme voltages (either above 4.6 or below 0.5 volts) to represent logical 0, and middling voltages to represent logical 1. Weird, but allowed.
- API3:(0↦[0V,0.5V],1↦[2.8V,5.0V]) - i.e. a narrower range of low voltages but wider range of high voltages, compared to the input. This might not be the most useful circuit behavior, but it’s an allowed circuit behavior.
(For simplicity, we’ll assume all voltages are between 0V and 5V). In order for the system to satisfy those particular APIs:
- Component 1 must map every value in API1(0) to a value in API2(0), and every value in API1(1) to a value in API2(1) - i.e. any value less than 1.2V must be mapped either below 0.5V or above 4.6V, while any value above 3.5V must be mapped between 2.6 and 3.8V.
- Component 2 must likewise map every value in API2(0) to a value in API3(0), and every value in API2(1) to a value in API3(1).
Using fi for component i and writing it out mathematically: the components satisfy a set of APIs if and only if
∀b∈{0,1}, ∀x∈APIi(b): fi(x, θi) ∈ APIi+1(b)
That’s a set of constraints on θi, for each component i.
The Stat Mech Part
So the APIs put constraints on the components. Furthermore, subject to those constraints, the different components decouple: component 1 can use any parameters θ1 internally so long as it satisfies the API set (specifically API1 and API2), and component 2 can use any parameters θ2 internally so long as it satisfies the API set (specifically API2 and API3).
Last big piece: putting on our stat mech/singular learning theory hats, we make the educated guess that the training process will probably end up with an API set which can be realized by many different parameter values. A near-maximal number of parameter values, probably.
The decoupling now becomes very handy. Let’s use the notation H(Θ|&lt;constraints&gt;) - you can think of it as the log number of parameter values compatible with the constraints, or as the entropy or relative entropy of parameters given the constraints (if we want to weight parameter values by some prior distribution, rather than uniformly). Because of the decoupling, we can write H as
H(Θ|API) =
H(Θ1 | ∀b∈{0,1}, ∀x∈API1(b): f1(x, θ1) ∈ API2(b))
+ H(Θ2 | ∀b∈{0,1}, ∀x∈API2(b): f2(x, θ2) ∈ API3(b))
So there’s one term which depends only on component 1 and the two APIs adjacent to component 1, and another term which depends only on component 2 and the two APIs adjacent to component 2.
Our stat-mech-ish prediction is then that the training process will end up with a set of APIs for which H(Θ|API) is (approximately) maximal.
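The additivity of H under decoupling can be verified in a discrete miniature: with the APIs fixed, the number of valid joint parameter settings factors into a product of per-component counts, so its log splits into a sum. Everything below is a made-up toy, not the post's actual setup.

```python
# Toy discrete check of H(Θ|API) = H(Θ1|...) + H(Θ2|...): the count of
# valid joint parameter settings factors, so its log splits into a sum.
import math

# Each "parameter value" selects one of a handful of toy component functions.
thetas1 = [lambda x: 0.1 * x, lambda x: 0.3 * x, lambda x: x]
thetas2 = [lambda x: x / 2, lambda x: 5.0 - x]

def ok1(f):  # stand-in constraint: f must keep [0, 1] inside [0, 0.5]
    return all(0.0 <= f(x) <= 0.5 for x in (0.0, 0.5, 1.0))

def ok2(f):  # stand-in constraint: f must keep [0, 0.5] inside [0, 5]
    return all(0.0 <= f(x) <= 5.0 for x in (0.0, 0.25, 0.5))

n1 = sum(ok1(f) for f in thetas1)
n2 = sum(ok2(f) for f in thetas2)
n_joint = sum(ok1(f) and ok2(g) for f in thetas1 for g in thetas2)

H1, H2, H = math.log(n1), math.log(n2), math.log(n_joint)
print(abs(H - (H1 + H2)) < 1e-12)  # entropies add because components decouple
```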
Why Is This Interesting?
What we like about this mental model is that it bridges the gap between stat mech/singular learning theory flavored intuitions (i.e. training finds structure compatible with the most parameters, subject to constraints) and internal structures in the net (i.e. internal interfaces). This feels to us like exactly the gap which needs to be crossed in order for stat mech flavored tools to start saying big things about interpretability.
Discuss
Sex, Drugs, and the Future of American Politics
The New York Times recently published an article entitled “Can Anyone Rescue the Trafficked Girls of L.A.’s Figueroa Street?” The article, worth reading in its entirety, details the struggle to “rescue” the girls:
After interviewing each girl, Armendariz kept her company for hours until someone from the Department of Children and Family Services came to pick her up. In theory, D.C.F.S. staff would take the girl to a hospital for a health screening and then to a temporary housing placement, before ultimately finding her a new foster household or group home. But time after time, the agency reported back to Armendariz that the girl had jumped from the car as soon as it pulled out of the station. The agency estimates that three out of four rescued preteens and teenagers go back to their traffickers.
Why do they return to their traffickers? The article attributes some of it to fear, but also claims that most of the prostitutes feel “a deep psychological tie to their traffickers, the most consistent authority figure they had ever known.” Given how many of these girls are runaways, I rather doubt that the issue is excessive deference to authority. Do they have a sense of “love” for their traffickers? Perhaps, the article describes one such girl, in what might be an excerpt from poorly-written BDSM erotica:
Earlier that night, officers pulled Ajena, 15, from the Blade for the third time. Armendariz had never met a child with such street loyalty. Ajena’s trafficker had just hours earlier split her lip, because she smirked at him, and threatened to shoot her in the leg if she walked away. She still spit on the officer who asked her to share his identity, and when her cellphone rang, his contact came up as “Daddy.”
The central case study, “Ana,” who the article explicitly states returned willingly to her trafficker, hints at what I suspect is the more common reason: (emphasis added)
When he left, Ana plodded over to her own trafficker to ask for a break. The answer was no. Ana took some Percocets and chased them with Hennessy to numb herself. The temperature dropped, and soon she felt herself longing for another customer’s car, just to get out of the cold.
{snip}
Ana didn’t know it, but Forsythe [Ana’s former foster parent] had been thinking about her all this time. She had visited Ana in the hospital four times and was riddled with guilt that she couldn’t invite Ana back, but she had already taken in a 12-year-old in her place. She tried to stay in touch with Ana, but when she confronted Ana about stockpiling opioids, Ana cut her off.
The article never explicitly states she, or any of the other girls, returned to prostitution due to drugs or alcohol. Probably because had they said these women work as prostitutes because it’s the only job they can do while drunk or high, they’d have been accused of victim-blaming. I’ve seen this pattern in other New York Times articles. They use all the politically correct terminology (“trafficking”, “rescue”) while giving the reader enough information to see that the narrative isn’t quite right. Crumbs that the intelligent, open-minded reader can follow, while the person who just wants the narrative confirmed can see that too. Some people learn something and no subscriptions are cancelled.
As frustrating as this is, it’s preferable to the approach taken by most of our right-wing media. Had Fox News or Breitbart commissioned such an article, it would have been something something Democratic cities bad, something something sex trafficking, something something illegals. Pure comfort food for its audience. The NYT at least introduces its audience to novel information, information that might even challenge their preconceptions and make them feel uncomfortable.
Recall again the title, “Can Anyone Rescue the Trafficked Girls of L.A.’s Figueroa Street?” Per Betteridge’s law of headlines, the answer is no. The police and social workers seem to be doing the best jobs they can in an unfixable situation. The only people who can rescue the girls are the girls themselves. Of course, the NYT doesn’t say that, for that would be “victim blaming.” It feels mean to “victim-blame.” Much kinder to tell the girls they’re victims of the traffickers, blame the traffickers, not the victims! Of course, the traffickers are not going to stop trafficking because the New York Times called them misogynists. It does nothing to help the victims, but hey, it makes the rest of us feel good.
One of the things the New York Times is willing to say is that the Democratic state legislature is making the police’s job harder:
Their jobs grew even more challenging when California repealed the law allowing the police to arrest women who loitered with the intent to engage in prostitution. The repeal, known as SB 357, was intended to prevent profiling of Black, brown and trans women based on how they dressed. But when it was implemented in January 2023, the effect was that uniformed officers could no longer apprehend groups of girls in lingerie on Figueroa, hoping to recover minors among them. Now officers needed to be willing to swear they had reason to suspect each girl was underage — but with fake eyelashes and wigs, it was nearly impossible to tell. One girl told vice officers that her trafficker had explained things succinctly: “We run Figueroa now,” he said.
Most counties swung toward Trump in the 2024 election, but some of the greatest swings were seen in big, blue metropolitan areas like Los Angeles, Chicago, and New York, probably because of anger at Democratic tolerance for urban disorder. This didn’t have to be such a big disaster for the Democrats. Had they embraced a libertarian attitude that consenting adults can dress how they want and sleep with whoever they want, even if money changes hands, they might have received hearty support from California’s not-very-religious population. Instead, they shot themselves in the foot by embracing the moral panic around sex trafficking. When you tell people there’s a horrific problem in the city you govern, some might expect you to fix it. While the GOP won’t be winning LA, Chicago, or NYC anytime soon, the internal migrants from these cities will swell the populations of the booming sunbelt, bringing with them a lasting distaste for Democratic rule.
Political self-interest is a powerful motivator. The flow of congressional districts from blue states to red states motivated some blue-state politicians to finally start repealing their NIMBY housing laws. Perhaps this is what led the NYT to publish that article, the hope that Democratic politicos will read it and won’t make the same mistakes they made in California. And the red tribe shouldn’t rest on its laurels. Their areas have fewer of these problems because their police forces haven’t adopted the don’t-blame-the-victim mantra; they put prostitutes in jail, not in social workers’ cars they can escape from. While this may or may not help the prostitutes, it certainly removes them from public view. Yet an increasing fraction of the coalition buys into Sound of Freedom sex trafficking hysteria. If they get the police to behave according to blue-state ideology, they’ll wind up with blue-state results.
The same theme, don’t blame the victim, never demand responsibility, can be seen in other parts of American life. Just look at the Vice President, who was propelled to fame by writing a book about how his people, “hillbillies,” are harmed by a culture of blaming others and refusing to take responsibility for oneself. And when he entered politics, rather than challenge this culture, he embraced it, always blaming foreigners and the government for his voters’ problems. While running for the Ohio Senate, Vance said, “instead of trying to trigger WW3 with a disastrous no fly zone in Ukraine, we should secure our own border to stop the fentanyl killing our kids.” This taboo on personal responsibility was given a sleazy twist when Trump returned to the Presidency, as his administration started putting out press releases about how he saved “over 119 million lives” by seizing enough fentanyl to cause that many overdoses. You see, without Daddy Trump, we would have been powerless to not inject ourselves.
What started with laughter soon turned to murder. As of November 2, 67 people have been murdered on the high seas by the Trump administration. “Killing cartel members who poison our fellow citizens is the highest and best use of our military,” Sleaze Jr. a.k.a J.D. Vance posted on social media. Expecting our citizens to not poison themselves, hell, expecting anything from them, that isn’t what America’s about! When you hear people say “we’re a nation, not an economy,” know that this is the kind of nation they want, a nation of dysgenic losers incapable of taking the most basic steps toward self-preservation.
The Democrats share some blame here, for they feed into the same culture of helplessness, treating drug addicts like they’re victims of a medical ailment. If you put people in the mindset that 80,000 Americans a year are dying through no fault of their own, some will respond positively to the Trump solution of stopping it by murdering scary foreign people. It won’t work, but nor will the Democrats’ solution of offering “treatment” that most addicts will refuse. Just look at San Francisco.
One could imagine a different, more Nietzschean Democratic Party. Imagine a newly elected President Gavin Newsom getting up on that lectern and announcing that America is a country for winners, and that henceforth, Americans are expected to participate in their own survival. The Federal Bureau of Prisons gets the DOGE treatment, being ordered to release all drug offenders. States would still be able to prosecute drug crimes, but they’d need to pay for it themselves. There’d be a massive backlash, but good ole’ Gavin could just smile with aristocratic contempt and say “don’t like drugs, don’t do drugs,” applying the same mantra to other “problems” in American society like unhealthy food and pornography.
Of course, real-world social change doesn’t happen with one man going up on a lectern; it’s a gradual process, as old topics fall out of fashion. Hopefully, our current infantilizing culture will go the way of prior fads, as it gradually dawns on people how ridiculous they sound.
Discuss
What is the (LW) consensus on jump from qualia to self-awareness in AI?
What is the consensus here on the jump from qualia (inner experience) to full self-awareness in LLM-based AI? Meaning: if an AI running on something like an LLM-based architecture gained qualia, inner experience of any kind, is the gap to self-awareness small?
Is it perhaps 15 % for qualia, 10 % for full self-awareness?
The alternative would be a bigger gap between qualia and self-awareness. Perhaps as big as, or bigger than the gap from non-sentience to qualia.
This question is only about how big the sentience jump would be, relatively speaking. I do not explicitly care about agency here. (The consensus there is ofc that agency is more likely than qualia. Those Ps are another discussion.)
I would guess that frontier labs and most researchers (alignment and capability alike) would agree that, unlike in evolved, organic life, the jump from qualia to self-awareness would be smaller, since the LLM is already wired and trained for reasoning. The crux is that qualia itself is unlikely. But the probabilities of both are debatable. I am curious about the sentiment on the relative gap between them.
I have no idea where LW stands on this, or where the broader public (those who think about this, I presume mostly academia) is at.
The premise here is that the labs would make all necessary changes and scaffolding to allow this to at least in theory be possible, say on purpose.
Discuss
OpenAI Does Not Appear to be Applying Watermarks Honestly
When OpenAI launched Sora 2, they accompanied the release with a statement on "Launching Sora responsibly". The first bullet point of this statement reads as follows:
"Distinguishing AI content: Every video generated with Sora includes both visible and invisible provenance signals. At launch, all outputs carry a visible watermark. All Sora videos also embed C2PA metadata—an industry-standard signature"
I have been testing the C2PA metadata accompanied with Sora 2 videos, and to my understanding, this claim is false.
Sora 2 videos with visible watermarks
All users of Sora 2, except those with the $200/month "Pro" plan, are restricted to downloading videos with visible watermarks. An example of this can be seen below:
(Credit to OpenAI and @hellcat6969 on Sora)
As can be seen above, the video is prominently watermarked with a visible Sora watermark. However, I can't find any invisible C2PA data attached, as is claimed to exist by OpenAI.
Following OpenAI's own guidance, I tested for the C2PA metadata using the official Content Credentials "Verify" tool. The tool was not able to identify any metadata.
Above: The Verify tool is unable to identify any C2PA metadata associated with this video.
I also installed the official C2PA command-line tool, and tried to verify authenticity using this.
Above: The C2PA command-line tool does not identify any metadata either.
Sora 2 videos without visible watermarks
It appears that, if a Pro user downloads a video without the visible watermark, then the invisible C2PA metadata is included. I tested this myself and got the following result:
Above: The tool does identify C2PA metadata, but only for videos that were downloaded from Sora without a visible watermark.
Is this dangerous at all?
It doesn’t seem entirely ridiculous for OpenAI to omit invisible C2PA metadata on videos that already have a visible watermark, however it does raise the question "why not apply both?".
Feasibly, somebody could download a visibly watermarked Sora video, and crop it down to keep the watermarked parts out of frame. They would then have a zero-watermark and zero-metadata Sora video.
This would work, but would require cropping out a large proportion of the original video. It would also be pointless because, to my knowledge, it is quite easy to remove C2PA metadata anyway. If you Google "Erase C2PA Metadata", there are many websites offering the service for free.
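As an additional rough sanity check of the kind described above, one can scan an MP4 file's top-level boxes directly. To my understanding, C2PA manifests in MP4 (ISO BMFF) files are carried in a top-level 'uuid' box, so the absence of any 'uuid' box is a quick (though not conclusive) sign that no manifest is embedded. This is a sketch, not a substitute for the official tools.

```python
# Minimal ISO BMFF (MP4) top-level box scanner. C2PA manifests in MP4 files
# are, to my understanding, carried in a top-level 'uuid' box, so no 'uuid'
# box suggests (but does not prove) that no manifest is embedded.
import struct

def top_level_boxes(data: bytes):
    """Yield (box_type, size) for each top-level box in an MP4 byte string."""
    pos = 0
    while pos + 8 <= len(data):
        size, = struct.unpack(">I", data[pos:pos + 4])
        box_type = data[pos + 4:pos + 8].decode("latin-1")
        if size == 1:  # 64-bit "largesize" follows the type field
            size, = struct.unpack(">Q", data[pos + 8:pos + 16])
        if size < 8:
            break  # malformed; stop rather than loop forever
        yield box_type, size
        pos += size

# Usage: with open("video.mp4", "rb") as f: data = f.read()
# Here we scan a tiny synthetic file: an 'ftyp' box followed by a 'uuid' box.
fake = struct.pack(">I4s", 16, b"ftyp") + b"isom" + b"\x00" * 4
fake += struct.pack(">I4s", 24, b"uuid") + b"\x00" * 16
types = [t for t, _ in top_level_boxes(fake)]
print(types)  # ['ftyp', 'uuid']
```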
Conclusion
In summary:
- OpenAI claims that "All Sora videos also embed C2PA metadata"
- In fact, OpenAI only embeds C2PA metadata if a video is downloaded without visible watermarking.
- This is probably not a great safety or misinformation concern, as C2PA metadata can be erased easily anyway.
Despite this not being a great concern, I still wanted to make this post to bring it to people's attention.
Disclaimer: The claims in this post are "to my knowledge", and I am not a cyber-security or cryptography expert. All claims are made according to the results of my testing using the Content Authenticity Verify and C2PA-rs tools. These tests were performed on videos downloaded using the Sora 2 web interface on Windows Desktop.
Discuss
Genetic Enhancements of Color Qualia
Existing genetic engineering techniques can be used to significantly enhance human color vision, both in terms of private visual experience as well as perceptually.
Color Gamut Expansion
LMS color space represents the response of the three cone cell types of the human eye, named after the wavelengths of their relative peak responsivity (long, medium, and short). Proteins called opsins are mostly responsible for this response.
Overlap between these peaks makes resolving colors difficult; this is particularly problematic for M cones, since their peak wavelength lies between the other two.
M and L spectra share a large area, making the experience of some intermediate and pure colors virtually impossible.
These poorly designed absorption curves make about 30% of our theoretical color gamut completely inaccessible under normal conditions.
A large area of the theoretical LMS human color triangle is "polluted" by sympathetic activation of different cone cells.
To address this issue, new opsin proteins must be generated and used to replace the current ones. These new proteins must possess curves that are well separated, with means equally spaced between them. To design these new proteins, techniques like directed evolution can be employed; alternatively, a suitable replacement can be sought among their biological homologs.
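The overlap claim can be illustrated with a toy model: treat cone responsivity curves as Gaussians and compare the shared area of human-like M and L peaks (around 530 and 560 nm) against hypothetical well-separated engineered peaks. The Gaussian shapes, widths, and the engineered peak positions are crude illustrations, not real photopigment absorption spectra.

```python
# Toy model: cone responsivity curves as Gaussians, comparing the spectral
# overlap of human-like M and L peaks (~530 and ~560 nm) against hypothetical
# well-separated engineered peaks. All shapes and widths are illustrative.
import math

def gaussian(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def overlap(mu1, mu2, sigma=40.0, lo=380, hi=700, step=1):
    """Shared area under min(curve1, curve2), normalized by one curve's area."""
    shared = sum(min(gaussian(x, mu1, sigma), gaussian(x, mu2, sigma))
                 for x in range(lo, hi, step))
    total = sum(gaussian(x, mu1, sigma) for x in range(lo, hi, step))
    return shared / total

natural = overlap(530, 560)     # human-like M and L peaks, ~30 nm apart
engineered = overlap(460, 580)  # hypothetical well-separated replacements
print(natural > 2 * engineered)  # the engineered pair overlaps far less
```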
In a recent study scientists were able to make a machine (a sort of advanced laser display) that can, to a certain degree, selectively stimulate individual cone cells within the retina; test subjects confirmed they could experience a novel color quale. Refining the responsivity curves of our opsin proteins would have the same effect, effectively adding up to a million new colors to our everyday experience.
Unfortunately germline engineering is necessary to implement this fix, as virtually all the cone cells in both our eyes need to be modified for this to work. Tissue engineering and retina transplant may make this genemod available to baseline humans in the medium-term future.
Dimensional Expansion
A completely new color can be added to our vision by adding a new opsin gene to our current set, making us tetrachromats. The common ancestor of vertebrates was a tetrachromat, as are birds and other modern animals; with four distinct channels for conveying chromatic information, their color space is 4D.
Tetrachromats have an expanded LMSV color space in the shape of a tetrahedron.
The fourth color opsin for these animals lies in the ultraviolet. For humans, such a protein must be carefully picked among existing ones (or else designed anew), since the human eye naturally blocks UV light. UV-A (320-400 nm) is likely the best candidate, as a large part of it reaches the retina and there are animal UV opsins with mean responsivity at around 370 nm.
Gene therapy protocols have already been explored, with successful results, for adding a new color channel to animal vision; to my knowledge, germline engineering has not been explored for this type of editing.
Hypothetical "optimal" spectra of engineered human opsins, including an additional UV opsin.
References
- Novel color via stimulation of individual photoreceptors at population scale
- Gene therapy for red–green colour blindness in adult primates
- A Locus Control Region Adjacent to the Human Red and Green Visual Pigment Genes
Discuss
Anticheat: a non-technical look without psychoanalysis
Any competitive online game is going to attract cheaters. They ruin the game for everyone else, and even the suspicion that your opponent is cheating suffices for that. Naturally, some countermeasures have to be taken.
You might have heard the phrase "To describe a system accurately is to attack it", but sadly I think most of this is already common knowledge for anyone actually selling cheating as a service, and that's a huge industry. The customers don't need any technical skills. I used to know a guy who sold both cheat and anticheat programs for a game, ensuring that the illusion of an arms race kept both sides paying a monthly subscription.
The oldest solution, from the LAN gaming era, is just to refuse to play with the cheater. You don't even have to know they're cheating; you'd also refuse to play with anyone who's too much better than you. This is the same solution we use for board games, and non-competitive games.
Many online games, especially those with smaller player counts, do manual matchmaking. Typically it's done using the lobby system: anyone can open a game lobby, and it's visible to other people who can decide to join. Before the game starts, the lobby leader attempts to balance the game, and kicks out anyone they don't want in. People learn to recognize cheaters by the nicknames used. Again, anyone too good for your lobby gets kicked out.
Almost all popular games have an automatic matchmaking queue. Players have a skill rating, used by the system to find them an equally-skilled opponent. In team games, the system is slightly more complicated as you have to balance entire teams against each other. The skill rating is typically shown publicly, by displaying the raw number or some kind of rank. In most games, obtaining a higher rank is a major drive for cheaters, but also for everyone else.
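The skill-rating idea above can be sketched with the standard Elo update. Real games typically use variants (Glicko-2, TrueSkill, or proprietary MMR formulas), but the core mechanism is the same: ratings move in proportion to how surprising the result was.

```python
# The standard Elo update, as one simple skill-rating scheme.
# Real matchmaking systems use richer variants, but the idea is the same.

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return new ratings after a game; score_a is 1 (win), 0.5, or 0."""
    e_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - e_a)
    return rating_a + delta, rating_b - delta

# An upset (lower-rated player wins) moves ratings more than an expected win.
new_low, new_high = elo_update(1400, 1600, score_a=1)
print(round(new_low))  # the winner gains most of the K-factor
```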
Cheating
The easiest form of cheating to detect is performing actions the game rules don't allow, like walking through walls, flying, or creating items or money from nothing. These can typically be either prevented or at least detected automatically, depending on whether the server is authoritative or not. This kind of cheat typically depends on bugs in the game code, and can be fixed once it becomes known. It's rarely an issue nowadays.
A more problematic class of cheat works entirely within the limits of the game code: the player's input is replaced or augmented automatically. For instance, in a shooting game, a cheat could slightly adjust your aim so you hit the enemy every time, or automatically pull the trigger the moment your crosshair passes over an opponent. The most egregious form of this is spinbotting: spinning the camera around multiple times per second, instantly eliminating every enemy in view. The subtler forms make it really hard to distinguish between a good player and an automated one. Statistical methods, including machine learning, can be used to detect unnatural mouse movements or inhuman reaction times. But it's a cat-and-mouse game.
This kind of cheat is the primary problem in games of perfect information like chess. Any decent chess engine beats every human alive, and engine use cannot be detected in any way other than comparing player moves against engine moves. Even then, if done carefully and not too often, it's undetectable. Even professional players in live tournaments are sometimes suspected of cheating this way, sometimes by really creative means.
The third primary type of cheating is obtaining information you shouldn't have, for instance seeing through walls or knowing what actions your opponent has taken in secret. In perfect-information games this is by definition impossible. In others, it can be mitigated by not sending clients any state they don't need to see. But in fast-paced games, it's often necessary to share information that might be required soon, like what's around a corner just before a player crosses it. Many developers don't bother to implement this at all, since it's often quite hard to predict what events might happen soon.
Again, this is often impossible to detect. It just looks like good game sense. But if you make decisions based on information you shouldn't have, someone watching you would notice. This can also be automated to a degree, but that will not catch all cases.
There are other ways of cheating too, like joining a game with multiple accounts to get more information, abusing lag compensation, or replacing hard-to-see textures with simpler ones, but the examples above are plenty of background. There's one type of non-technical cheating I must mention, though: letting someone else play for you, or playing as someone else. These are completely undetectable by measures that attempt to separate humans from machines, and many games don't bother to protect against them at all.
A common practice, not even thought of as cheating by many people, is smurfing, using a separate low-ranked account to play against lower-ranked people. Since any decent matchmaking system would quickly adjust your skill to match the actual level of play, you have to either sandbag, losing games on purpose, or simply create new accounts periodically. Often the developers are not willing to do anything about this, only partially because it's too hard. There are even paid services that lose games for you, avoiding the AFK detection that games commonly have.
Letting a better player play for you is the opposite form of this. There are paid boosting services that get you to your desired rank without your having to play the game at all, letting you enjoy the status of a good player until you actually have to play yourself. This makes little sense to me, but I've heard of these services, which means they are more popular than they should be.
Countermeasures

The design phase of game development must already take cheating into account. Can the asymmetric information somehow be kept from the clients? Can we make the game easier to play for humans than for computers? Design alone is never sufficient, and most popular games don't seem to be shaped much by it, but I suspect some ideas get filtered out because securing them would be too hard. Building a good anticheat solution is often more work than the game itself.
Almost all games use the same technical prevention measures: not providing source code for the games and using authoritative servers. On consoles, hardware is often designed to resist tampering, and on PC remote attestation is used, sometimes controversially implemented as a kernel-level anticheat. All detection measures double as prevention, as the fear of getting caught is the most important deterrent.
The detection is done with a collection of evidence-gathering instruments that estimate the likelihood of cheating. The one every single game has is the other people in the same match. Typically there's some kind of report button you can press. Sometimes there's also a way to kick the cheating person out by popular vote, although such means are prone to abuse. In other games, the opposing team might simply leave the game, losing their rating but not giving the cheater the satisfaction.
Some technical detection measures I've already discussed, but more exist. A common one is having other players or paid moderators manually review gameplay footage, especially when cheating is already suspected based on other detection measures. This is really important for preventing false positives.
Response

When we have determined that someone is cheating, something has to be done. The simplest measure is banning them permanently, possibly refunding lost rating points to their past opponents. For lesser antisocial behavior a timeout is typical, but that's insufficient here. To make it harder for cheat developers to work out why their customers got caught, bans are typically delayed and batched, even though this means more people will have to play against the cheater in the meantime. False positives are problematic, though, so overwhelming evidence is required. Bans are almost always irrevocable, since in any system worth having, the ratio of true to false positives would make reviewing appeals incredibly expensive for customer support.
However, if the banned person can just make another account and keep playing, what's the point of banning them in the first place? For paid games, buying a new copy is often a sufficient barrier. In the age of free-to-play games and microtransactions, this won't help much. You could, in theory, tie the game to the real-world identity of the person somehow, but currently nobody wants to follow a KYC process just to play a game. Using a phone or credit card number might be viable, but you can have multiple of them. Tying the ban to hardware ID is sometimes done, as replacing that is certainly not free, but this is problematic for shared computers in net cafes and such. IP address bans used to be common, but are nowadays infeasible due to operators using NAT to share IPs.
Another way is to limit account creation rate, for instance requiring new accounts to go through a lengthy tutorial or having to play some amount of practice games before being allowed into competitive matches. There's a balance here, as you still want fresh people to pick up the game quickly, and the type of people who cheat often don't value their time highly. Even if the approach is otherwise viable, it'll create a business of making new accounts that have passed the barrier.
Some kind of collateral can solve the issue quite cleanly. While no game literally requires a security deposit, anything bought with microtransactions, like skins, is lost along with the banned account. This also means that anyone owning expensive in-game items is unlikely to be a cheater. Some platforms apply the ban to the entire account, which may hold multiple games.
A sneakier approach is to not ban the person at all, and instead do a shadow ban, where they're stuck playing against bots only, so the ban isn't immediately apparent. A more refined approach is keeping a reputation score in addition to the skill score, and doing matchmaking across both axes, so that the cheaters end up playing with other cheaters. This is rather neat, and works against any other antisocial behavior too. New accounts need to have quite low reputation, but using the existing account as an explicit collateral can help with this.
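The two-axis matchmaking idea above can be sketched as minimizing a weighted distance over both scores. The player records, weights, and scales here are all illustrative assumptions; in particular, the reputation weight is set high so that low-reputation players mostly meet each other even when their skill ratings differ.

```python
def match_distance(a, b, w_skill=1.0, w_rep=1000.0):
    """Lower is a better pairing. Reputation (0..1) is weighted heavily
    relative to skill (Elo-like scale) so cheaters and other antisocial
    players end up paired together."""
    return w_skill * abs(a["skill"] - b["skill"]) + w_rep * abs(a["rep"] - b["rep"])

def best_match(player, queue):
    """Pick the queued player minimizing the combined distance."""
    return min(queue, key=lambda other: match_distance(player, other))
```

A suspected cheater with reputation 0.1 will be paired with a slightly weaker player of reputation 0.15 rather than an equally skilled player of reputation 0.9, which is the intended outcome.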
Lastly, the social cost of getting caught cheating works quite well. Cheaters are typically banned from live competitions. And nobody wants to play with a cheater, so simply publicizing who cheated can be an effective deterrent.
Discuss
It is our responsibility to develop a healthy relationship with our technology
Many technologies can be used in both healthy and unhealthy ways. You can indulge in food to the point of obesity, or even make it the subject of anxiety. Media can keep us informed, but it can also steal our focus and drain our energy, especially social media. AI can help students learn, or it can help them avoid learning. Technology itself has no agency to choose between these paths; we do.
This responsibility exists at all levels: from society as a whole, to institutions, to families, down to each individual. Companies should strive to design healthier products—snack foods that aren’t calorie-dense, smartphones with screen time controls built in to the operating system. There is a role for law and regulation as well, but that is a blunt instrument: there is no way to force people to eat a healthy diet, or to ensure that students don’t cheat on their homework, without instituting a draconian regime that prevents many legitimate uses as well. Ultimately part of the responsibility will always rest with individuals and families. The reality, although it makes some people uncomfortable, is that individual choices matter, and some choices are better than others.
I am reminded of a study on whether higher incomes make people happier. You might have heard that more money does not make people happier past an annual income of about $75k. Later research found that that was only true for the unhappiest people: among moderately happy people, the log-linear relationship of income to happiness continued well past $75k, and in the happiest people, it actually accelerated. So there was a divergence in happiness at higher income levels, a sort of inverse Anna Karenina pattern: poor people are all alike in unhappiness, but wealthy people are each happy or unhappy in their own way. This matches my intuitions: if you are deeply unhappy, you likely have a problem that money can’t solve, such as low self-esteem or bad relationships; if you are very happy, then you probably also know how to spend your money wisely and well on things you will truly enjoy. It would be interesting to test those intuitions with further research and to determine what exactly people are doing differently that causes the happiness divergence.
Similarly, instead of simply asking whether social media makes us anxious or depressed, we should also ask how much divergence there is in these outcomes, and what makes for the difference. Some people, I assume, turn off notifications, limit their screen time, put away their phones at dinner, mute annoying people and topics, and seek out voices and channels that teach them something or bring them cheer. Others, I imagine, passively submit to the algorithm, or worse, let media feed their addictions and anxieties. A comparative study could explore the differences and give guidance to media consumers.
In short, we should take an active or agentic perspective on the effects of technology and our relationship to it, rather than a passive or fatalistic one. Instead of viewing technology as an external force that acts on us, we should view it as opening up a new landscape of choices and possibilities, which we must navigate. Nir Eyal’s book Indistractable is an example, as is Brink Lindsey’s call for a media temperance movement.
We should also take a dynamic rather than static perspective on the question. New technology often demands adjustments in behavior and institutions: it changes our environment, and we must adapt. For thousands of years manual labor was routine, and the greatest risk of food was famine—so no one had to be counseled to diet or exercise, and mothers would always encourage their children to eat up. Times have changed.
These changes create problems, as we discover that old habits and patterns no longer serve us well. But they are better thought of as growing pains to be gotten through, rather than as an invasion to be repelled.
When we shift from a static, passive framing to a dynamic, agentic one, we can have a more productive conversation. Instead of debating whether any given technology is inherently good or bad—the answer is almost always neither—we can instead discuss how best to adapt to new environments and navigate new landscapes. And we can recognize the responsibility we all have, at every level, to do so.
Discuss
Debunking “When Prophecy Fails”
In 1954, Dorothy Martin predicted an apocalyptic flood and promised her followers rescue by flying saucers. When neither arrived, she recanted, her group dissolved, and efforts to proselytize ceased. But When Prophecy Fails (1956), the now-canonical account of the event, claimed the opposite: that the group doubled down on its beliefs and began recruiting—evidence, the authors argued, of a new psychological mechanism, cognitive dissonance. Drawing on newly unsealed archival material, this article demonstrates that the book's central claims are false, and that the authors knew they were false. The documents reveal that the group actively proselytized well before the prophecy failed and quickly abandoned their beliefs afterward. They also expose serious ethical violations by the researchers, including fabricated psychic messages, covert manipulation, and interference in a child welfare investigation. One coauthor, Henry Riecken, posed as a spiritual authority and later admitted he had “precipitated” the climactic events of the study.
Discuss
[Linkpost] How to Win Board Games
This is my Day 2 Inkhaven post. I'm crossposting it a) by request, and b) because it's my most popular (by views) Inkhaven post so far. Please let me know if you have opinions on whether I should crosspost more posts!
Want to win new board games? There’s exactly one principle you need to learn:
- Understand the win condition, and play to win.
That’s it. It’s that simple. Despite the simplicity, meditating on this principle alone can help you significantly elevate your gameplay and become much better at beating other novices.
The rest of the article will teach you the simplest ways you can apply this principle to a wide variety of games. Note that this is not an outline of a perfect strategy, but rather a “good enough” strategy for beating other novices[1].
Understanding the win condition

The first step is always: read the victory conditions carefully. Ask yourself: What triggers the game’s end? How do I score points? What’s the smallest number of turns I need to satisfy the win condition? Can I win more quickly than that?
Roughly speaking, there are 5 types of win conditions[2]:
- Race conditions: First to X victory points (VPs) wins (Splendor, Settlers of Catan)
- Accumulation conditions: Most points when the game ends wins (7 Wonders, Wingspan, Terraforming Mars)
- Duel elimination conditions: Win by killing your single opponent
- Multiplayer elimination conditions: Win by outlasting everyone else
- Idiosyncratic win conditions: A large gamut of unusual win conditions[3]
Next, you play towards the win condition(s).
Playing to win
“Playing to win” sounds obvious, but next time you play a new game, record and introspect on how you actually play! Chances are, you optimize for everything except the win condition. You likely build elaborate engines that never score, make enemies unnecessarily, or pursue “interesting” strategies and clever combinations that are almost certainly too slow to achieve anything before the game inevitably ends.
Beelining towards the win condition
The simplest application of my principle is to games where winning means achieving X victory points first. In those games, your “good enough” strategy usually comes from doing whatever greedily leads you to the most victory points in the smallest number of turns possible, then repeating this with the next short sequence of actions with the highest VP-per-turn ratio, again and again until you win[4].
For example, in Splendor, where the win condition is achieving 15 victory points, a good greedy heuristic is to reserve the card you see that has the highest VP:gem cost ratio, buy it as quickly as you can, and then try to reserve and/or buy the next card you see with the highest VP:gem cost ratio. Do this 3-5 times, and you win.
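That greedy heuristic can be written out directly. The card tuples below are made-up examples, and the sketch ignores that real Splendor cards also grant permanent gem discounts, which a stronger player would fold into the ratio.

```python
# Hypothetical cards as (name, victory_points, total_gem_cost).
cards = [
    ("cheap_mine", 0, 3),
    ("mid_card", 2, 5),
    ("big_card", 4, 7),
]

def vp_per_gem(card):
    """The greedy score: victory points per gem of cost."""
    _name, vp, cost = card
    return vp / cost

def best_reserve(visible_cards):
    """Reserve (or buy) the visible card with the highest VP:gem-cost ratio."""
    return max(visible_cards, key=vp_per_gem)
```

Applied to the sample cards, the heuristic picks "big_card" (4/7 beats 2/5 and 0/3); repeating that pick a handful of times is the whole strategy.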
As a broad heuristic, this “simple beelining” strategy works best for games where either a) the end-game condition is via reaching X victory points first, or b) the game ends deterministically by Y turns, so you just have to have the most VPs by the time the game ends.
“Rush” strategies/Dual mandates
What if the game doesn’t end via either having reached some number of VPs first, or deterministically? What if a specific set of conditions must be triggered to end the game?
In those cases, a good “playing the win condition” player must acknowledge that the end condition is different from the win condition. They should invest in two different goals:
- Getting as many VPs per turn as quickly as possible.
- Ending the game as quickly as possible.
This is a dual mandate, so it’s harder to achieve than either goal alone. In those games, you might be tempted to avoid rushing altogether and instead go for a more complex “engine” strategy, where your goal is to build a viable economy that eventually scores a lot of victory points by the end of the game, even if you initially score none.
This is usually a mistake. Not always, but usually, especially for beginning players. But what happens if you do go for an engine? For that, see the next section:
[Image omitted. Source: Gemini 2.5 Pro]
The endgame starts earlier than you think
In an earlier draft of this post, somebody objected to my framing, saying that an important strategic aspect of advanced play in a lot of board games is timing the transition between a middle-game (accumulating resources, improving your economy, building up production, etc) and endgame (where you’re just maximizing points).
I don’t necessarily disagree. However, I think beginner, intermediate, and even some advanced players are consistently biased in overestimating how long the middle-game should be:
Linch’s Law of Endgames: You should start the endgame earlier than you think you should, even after taking Linch’s Law of Endgames into account.
I claim this is a real bias rather than a random mistake/error or “skill issue” as the kids say. You can in fact imagine the opposite issue: people bias too much towards the endgame/victory conditions and don’t invest enough resources into building up their engine. You can imagine it, yet in practice I almost always see people systematically biased in favor of spending too much time on engines, and almost never too much time spent on trying to win.
More broadly, you should “aim more directly towards victory” more than you currently do, even after taking this advice into account.
However, ‘playing to win’ looks different depending on win conditions. In elimination games, for example, playing to win centrally means playing to not lose.
Playing to not lose

In multiplayer elimination games (Coup, multiplayer fighting games) the objective is to be the last person standing. Some people interpret this as a mandate to be maximally aggressive and try to kill off all their opponents as soon as possible. This is a mistake.
Instead, you should play to not lose: play in a way to attract as little aggression as possible. Among novices, this means playing quiet moves, not being spooky, avoiding making enemies, flying under the radar etc. Among intermediate to advanced players, you should assume everybody always (or almost always) makes the locally optimal decision, so you should play to position yourself in a manner that it is never rational for you to be anybody’s first target of choice.[5]
This means taking game theory seriously, being less wealthy than other players, seeming (and often being) locally less of a threat etc.
In politics games, take Statecraft seriously, and avoid being a Great Game loser by seeming like too much of a threat.
Choosing between beelining towards victory vs playing to not lose
This is complicated, but here are some broad heuristics:
Usually when there’s minimal player interaction and no player elimination you should beeline towards victory. Other players are novices and are far from optimal. Worrying about them is often going to hurt your own game. Instead, you should spend your limited time and attention improving your own game.
In the other extreme, in multi-player elimination games, you should focus the vast majority of your attention, especially in the beginning, on “not losing.” In those games, the best offense is a studious defense. Damaging an opponent increases your win probability some, but much less than being damaged will decrease your win probability.
In a two-player perfect information duel elimination game like chess, you should pay approximately equal attention to beelining towards victory (checkmating your opponent) and playing to not lose (avoid being checkmated). In those games specifically, my impression is that most people systematically neglect thinking from their opponent’s perspective, and play too much of what Sarah Paine calls “half-court tennis.”[6]
Against other strategies for winning

I contrast my “playing to win via aiming at the win condition” strategy against other common strategies that people either employ or advise. All of the other strategies have their place, but they are substantially worse for new players.
Against Central Narrative strategies

The most common strategy that people implicitly employ when learning a new game is what I call the “Central Narrative” strategy: understand the central narrative of a game, and try to execute it better than anybody else.
For Dominion, this means buying lots of cards and playing lots of actions that combo well with each other. For Splendor, this might mean buying lots of gem mines and being more locally efficient with your purchases than anybody else. For fighting games, this might mean executing lots of cool attacks and attacking your opponents head-on. For Coup and other social deception games, this might mean lying a lot and calling out your opponents on their BS. And so forth.
I think this is bad advice for people trying to win, since presumably other players are trying almost the exact same strategy. If you win despite everybody else doing the same thing, it means you’re some combination of:
- Smarter than everybody else
- Care more about winning than everybody else
- Have more experience with this game/games in general than everybody else
- Luckier than everybody else
None of these traits are repeatable strategies you can rely on. “Git gud” is not actionable advice. I prefer strategies that are both easier to apply and more nontrivial.
Against Psychology Players

Some people advise trying to learn what your opponents want to do, and then systematically foiling their plans (Sun Tzu: “what is of supreme importance in war is to attack the enemy’s strategy”). That was good advice for some generals in 5th-century BC China, but it's terrible advice for most modern board games. This is because a) psychology is hard, b) your opponents are novices, so foiling their plans might well backfire, and c) wasting your cognition worrying about your opponents’ strategies is going to hurt your own gameplay (especially in games with 3+ players).
Against “Optimal” Players

Some people advise learning what the best players in a game do (or even what’s theoretically proven or simulated to be mathematically or computationally optimal) and then copying the best moves possible.
I think this is terrible advice, since the best strategies for beginners may be very different from theoretically optimal play by the best strategists.
When not to use this advice

You should avoid this advice if you are currently winning too often and find that winning is not fun. You should also disregard it if you don’t mind winning but find playing to win morally or aesthetically distasteful.
This might sound dismissive, but I mean it most sincerely. We all want to have fun in our games, and if winning (or winning in specific ways) is not fun for you, please don’t change your gameplay just based on some rando’s Substack post!
Finally, you should downgrade my advice if you tried to implement it multiple times and you were less successful at achieving your desired goals (whether winning, having fun, or something else) than when you were following other advice, or using your own base intuitions.
Conclusion

To recap: find the win condition, think carefully about what it entails, aim every chip, card, or cube at it, and start the sprint earlier than feels safe.
Taking this advice seriously will mean that you almost always win against other novices – at least until they Get Good and follow similar advice. Good luck!
And please, feel free to subscribe! And if you do end up trying out my advice, please comment! I would love to get more data on whether my advice is directly applicable to real games.
- ^
Indeed, a well-designed game shouldn’t have an obvious strategic equilibrium as simple as the one I describe.
- ^
I try my best to explain strategies in a way that’s agnostic to specific games. But sometimes I find concrete examples helpful! Unfortunately, I expect most readers not to have heard of the vast majority of the specific games referenced, and I don’t think it’s worth readers’ time to have specific game rules explained. Such is life.
- ^
For example, in Blood on the Clock Tower, Good wins by killing the Demon. Evil wins by having only two players alive, one of which is the Demon. In Innovation (and to a lesser degree, CCGs like Magic the Gathering, Yu-Gi-Oh!, etc) there are many “special cards” that can trigger different unusual win conditions.
- ^
Does this sound way too simple to be the best possible strategy for such a broad class of games? Sure. Does that mean it’s not the best possible strategy for most games? Also true. But surprisingly I found it to be “good enough” to beat most novices in large numbers of games, such that beating this simple greedy strategy often requires much more strategic depth, understanding, deep thinking, etc than needed to execute this strategy, or simple variations thereof.
- ^
Coup players may enjoy my earlier Medium article here: https://medium.com/@draconlord/coup-card-game-strategy-the-3-player-endgame-b87cd90a17af
- ^
An earlier version of this post tried to more carefully delineate which of my advice was board game-specific, vs apply to the “Game” of life more generally. In the end, I decided that while the broader topic (“to what extent is good advice for board games good advice for Real Life”) can be very sophisticated and interesting, I couldn’t do the topic enough justice while still talking about my core points. So I cut most of the comparisons.
Discuss
SPAR Spring ‘26 mentor apps open—now accepting biosecurity, AI welfare, and more!
Mentor applications for the Spring 2026 round of SPAR are open! This time, we’re accepting any projects related to ensuring transformative AI goes well, from technical AI safety to policy to gradual disempowerment to AI welfare and more. We’re particularly excited to announce we’ll now be accepting biosecurity projects!
SPAR is a part-time, remote research program pairing aspiring AI risk researchers with professionals in the field to work together on impactful research projects. Mentees gain research experience and guidance, and mentors get assistance with their projects. You can learn more about the program here.
Previous mentors have come from organizations like Redwood Research, METR, Google DeepMind, and RAND; universities including Harvard and CMU; and programs like MATS and GovAI.
Mentor apps will close November 30, and mentee apps will run December 17-January 14.
Who should apply?

Mentors

We typically accept applications from researchers whose research experience matches or exceeds that of a late PhD student or a MATS scholar. Mentors apply with a proposal for a 3-month research project related to ensuring transformative AI goes well, and commit between 2 and 10 hours a week to supervising mentees working on it.
Mentees

Mentors often look for a technical background (e.g., ML, CS, biology, cybersecurity, math, physics) or knowledge relevant to policy and governance (e.g., law, international relations, public policy, political science, economics). The program is open to undergraduate and graduate/PhD students, as well as professionals of varying experience levels.
Some projects may require additional background knowledge, like knowledge of specific research areas, techniques, or tools (e.g., ML frameworks).
We don’t require previous research experience, and many mentees have none.
Even if you do not meet all of these criteria, we encourage you to express interest in the upcoming round and apply once applications open in December! Many past mentees have been accepted even though they didn’t completely match a project's criteria.
Why SPAR?

SPAR creates value for everyone involved. Mentors expand their research output while developing research management skills. Mentees get to explore safety research in a structured, supportive environment while building safety-relevant skills. Both produce concrete work that serves as a strong signal for future opportunities in AI safety.
Click here to apply as a mentor, and here to express interest as a mentee. Mentor apps will close November 30, and mentee apps will open on December 17.
Questions? Email us at spar@kairos-project.org or drop them in the comments below.
Discuss
Fake media seems to be a fact of life now
epistemic status: my thoughts, backed by some arguments
With the advent of deep fakes, it has become very hard to know which image / sound / video is authentic and which is generated by an AI. In this context, people have proposed using software to detect generated content, usually aided by some type of watermarking. I don't think this type of solution would work.
Watermarking AI-generated content

One idea is to add a watermark to all content produced by a generative model. The exact technique would depend on the type of media, e.g. image, sound, or text.
We could discuss various techniques with their advantages and shortcomings, but I think this is beside the point. The fact is that this is an adversarial setting - one side is trying to design reliable, robust watermarks and the other side is trying to find ways to break them. Relying on watermarks could start a watermarking arms race. There are strong incentives for creating fakes so hoping that those efforts would fail seems like wishful thinking.
Then there is the issue of non-complying actors. A company could simply decide not to add watermarks, or could release its model's weights. This is next to impossible to prevent on a worldwide scale. Whoever wants to create fakes can simply use any generative model that doesn't add watermarks.
I don't think watermarking AI-generated content is a reasonable strategy.
Mandatory digital watermark system for all digital cameras

Another idea is to make digital cameras add a watermark (or a digital signature) to pictures and videos. Maybe digital microphones could do something similar for sound, although this would likely significantly increase the price of the cheapest ones. We should note that this technique cannot be applied to text.
I see several objections to this proposal:
- How do we decide who has the authority to make digital cameras? There needs to be control to make sure watermarked content really comes from a digital camera. This could lead to an oligopoly where only some companies have the authority to make digital cameras and thus to decide what is real. We could try to make this watermarking system partly decentralized, kind of like HTTPS certificate authorities. The problem is that HTTPS is not as secure as we would like to think. It would be even worse for watermarks, because some actors (e.g. states) have stronger incentives to create fakes, and there seem to be fewer ways to detect them (e.g. there is no underlying network where one can track suspicious packets).
- What software goes on a digital camera? Cameras already do a ton of software processing before saving the content to a file so somebody must decide and control what software can be put there. The watermark would be useless if the camera software could be used to watermark an image where an object was digitally removed from a scene, for example.
- It's not clear how technically feasible this watermarking could be. We would like watermarks that persist after "legitimate" edits (crop, brightness change, format change, etc.) but break otherwise.
- Such a watermarking system is likely to end up like the Clipper Chip or the Content Scramble System - somebody would find a security hole rendering it useless or counterproductive.
- Suppose I print my deep-fake image using a high-quality printer and then use a high-quality camera with a special lens to take a picture of it. Now the deep fake is watermarked. Can one reliably distinguish a picture of something from a picture of a picture?
- It is dubious whether cameras that add watermarks would see wide adoption.
- What do we do with the analog devices?
I think we may need to accept that indistinguishable fakes are part of what's technologically possible now.
In that case, the best we can do is track the origin of content and let each person decide which origins to trust. I am thinking of some decentralized append-only system where people can publish digital signatures of content they have generated.
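A minimal sketch of the publishing step might look like the following. All names are hypothetical, and since Python's standard library has no public-key signing, an HMAC over a secret key stands in for a real signature scheme (a deployed system would use something like Ed25519 with a public/private key pair):

```python
# Sketch of an append-only signature log: each record binds an author to a
# hash of content they published. The HMAC here is only a stand-in for a
# real asymmetric signature; all names and keys are illustrative.
import hashlib
import hmac

def sign_content(content: bytes, secret_key: bytes, author: str) -> dict:
    """Build a log record binding the author to a hash of the content."""
    digest = hashlib.sha256(content).hexdigest()
    sig = hmac.new(secret_key, digest.encode(), hashlib.sha256).hexdigest()
    return {"author": author, "content_hash": digest, "signature": sig}

def verify_record(content: bytes, record: dict, secret_key: bytes) -> bool:
    """Check that the record matches the content and the signer's key."""
    digest = hashlib.sha256(content).hexdigest()
    expected = hmac.new(secret_key, digest.encode(), hashlib.sha256).hexdigest()
    return (record["content_hash"] == digest
            and hmac.compare_digest(record["signature"], expected))

log = []  # the append-only repository, here just an in-memory list
key = b"illustrative-secret-key"
photo = b"...raw bytes of the photo..."
log.append(sign_content(photo, key, "ron"))
print(verify_record(photo, log[0], key))        # True
print(verify_record(b"tampered", log[0], key))  # False
```

Because only hashes and signatures go into the log, the content itself never needs to be published there; anyone holding the content can check it against the record.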
If you trust your journalist friend Ron Burgundy, you could verify the digital signature of the photo in his news article against his public key. You could also assign some level of trust to the people Ron trusts. This creates a distributed network of trust.
With the right software, I can imagine this whole process being automated: I click on an article from a site and one of my browser plugins shows the content is 65% trustworthy (according to my trusted list). When I publish something, a hash of it signed with my private key is automatically appended to a distributed repository of signatures. Anybody can choose to run nodes of the repository software, similar to how people run blockchain or Tor nodes. Platforms with user-generated content could choose to only allow signed content, and the signer could potentially be held responsible. It's not a perfect idea, but it's the best I have been able to come up with.
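The "assign some level of trust to the people Ron trusts" idea is a small graph computation. Here is a toy sketch, with hypothetical names and an arbitrary per-hop decay factor, of how a plugin might score a signer through a web of trust:

```python
# Toy web-of-trust propagation: direct trust is used as-is, and trust in
# people vouched for by your contacts is attenuated at each extra hop.
# The decay factor and the hop limit are illustrative choices.

def trust_score(trust_graph, me, signer, decay=0.5, max_hops=3):
    """Return a score in [0, 1] for `signer` as seen from `me`.

    trust_graph maps each person to {contact: direct_trust} edges.
    A path's score is the product of its edge trusts, multiplied by
    `decay` for every hop after the first; the result is the best
    score over all paths within `max_hops`.
    """
    best = 0.0
    frontier = [(me, 1.0, 0)]  # breadth-first search carrying the score
    visited = {me: 1.0}
    while frontier:
        person, score, hops = frontier.pop(0)
        if hops >= max_hops:
            continue
        for contact, edge_trust in trust_graph.get(person, {}).items():
            s = score * edge_trust * (decay if hops > 0 else 1.0)
            if contact == signer:
                best = max(best, s)
            if s > visited.get(contact, 0.0):
                visited[contact] = s
                frontier.append((contact, s, hops + 1))
    return best

graph = {
    "me": {"ron": 0.9},
    "ron": {"veronica": 0.8},
}
print(round(trust_score(graph, "me", "ron"), 2))       # 0.9 (direct trust)
print(round(trust_score(graph, "me", "veronica"), 2))  # 0.36 (one hop, attenuated)
```

A real system would also need to handle revocation and deliberately inflated trust edges, but the core "friend-of-a-friend with attenuation" mechanic is this simple.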
I have seen attempts at something similar, but usually controlled by some company and requiring a paid subscription (both of which defeat the whole purpose of wide adoption).
Our ancestors didn't know their faces
When you are born, nobody knows your face. And in particular, you don’t know your face.
As you grow up, you become familiar with your face, observing how it slowly changes, with its pimples coming and going. This ability hinges on the peculiar fact that you have access to images of yourself, mainly from mirrors and photos.
But these are both relatively recent inventions. The first daguerreotype portrait was taken in 1839, and the earliest traces of manufactured mirrors were polished obsidian found in modern-day Turkey, dated to 6000 BCE. So, ten thousand years ago, your options for seeing yourself were:
- A still lake or rain puddle
- Looking into someone’s eye
- A naturally shiny stone
- A smooth sheet of ice
And… that’s pretty much it. Your face used to be how other people knew you. It must have been common to live your entire life without knowing what you looked like.
Instead, the primary way you would get to know your face was through the reactions of the people around you. It is like playing the game “Who am I?” (“Devine tête” in French), with the character you are trying to guess being your face, and the game lasting your whole life.
The experience of getting to know your face, ten thousand years ago
To get a sense of what it feels like to see your face in such a world, we can look at the effect of listening to a recording of yourself. People in your life are intimately familiar with the sound of your voice, but you are familiar with another voice. This is the sound of your voice when it reaches your ears through your head instead of through the air.
We have all had this cringe-inducing experience: is this what I sound like? Our sense of self gets challenged: this weirdly foreign voice is supposed to be me? This might have been the feeling of one of our ancestors ten thousand years ago, casually going for a walk to pick berries after the rain and stopping in shock, looking down at a puddle.
Today, our face feels like a universal way to say, “this is me.” It shows how flexible our sense of self is. It is like a bag whose boundary can expand to include objects through ownership, people from our tribe, or ideologies and beliefs.
Knowing our ancestors didn’t know their faces can help us keep our identity small. It can allow us to take a step back when we hold on strongly to a cause that feels so close to our hearts, so close to who we are. Does letting this go feel worse than forgetting your face?
Review: K-Pop Demon Hunters (2025)
(This review contains spoilers for the entire plot of the film.)
K-Pop Demon Hunters is a very popular movie. It is the first Netflix movie to hit #1 at the box office. It is "the first film soundtrack on the Billboard Hot 100 to have four of its songs in the top ten". When you Google it, a little scrolling marquee appears with a reference to a joke from the movie. My friends keep talking about it. So I figured I'd check it out.
The movie does some interesting things with animation, importing a lot of anime tropes and visual effects into the realm of 3D animation. For me this mostly fell flat; a bunch of them landed in the uncanny valley, or were otherwise jarring. That said, the choreographed fight scenes are very well-executed and fun; a friend described the feeling of watching them as "like watching someone play Beat Saber really well", and I agree. The movie's songs are also pretty good. (Honestly, I'm not a huge K-pop fan, but they're very catchy and I expect them to be stuck in my head for a while.) One song in particular was handled cleverly; more on that below.
Spoilers follow!
I expected cool visuals and catchy music from the beginning; the real surprise, as the movie approached its climax, was how engaging I found it on a thematic level. The film sets up a simple Manichean world of good and evil, then dives into an exploration of the troubling psychological implications of this setup, weaving together the protagonist's personal growth and the viewer's increasingly conflicted understanding of the movie's cosmology. Then it throws away all the metaphorical structure in the last few minutes, stabs the problem with a sword, and brings back the status quo. I found this very frustrating! I think it could've been on par with some of the best Disney and Pixar movies, thematically, if it only had the courage of its convictions. (I wasn't expecting The Godfather.)
Our protagonist is Rumi, a singer in the K-pop group Huntr/x. Together with her bandmates Mira and Zoey, she maintains the Honmoon, the (somewhat porous) magical veil protecting our world from evil demons. They also stab any demons who get through the Honmoon with swords. The Honmoon is sustained by good vibes from successful concerts; fortunately, they're extremely good musicians and have legions of dedicated fans. Occasionally Huntr/x fight demons on stage; fans assume this is part of the show, so they don't really bother with kayfabe. (Wikipedia says they "lead double lives"; I disagree. If you don't need to change outfits between singing and slaying, you're leading a very single life. I think most actual K-pop idols are leading doubler lives than that. But I digress.) But Rumi has a dark secret; while her late mother was part of the previous generation's trio of demon-hunting singers (It's a Buffy-style "into every generation" deal), her father was -- a demon! (The implied relationship, and her mother's fate, are never explored beyond this; I can't tell if this was a bold choice or laziness.) Most demons are fairly unconvincing humans even before they morph into their demonic forms; the first demon we meet is watering a plant with a pot of coffee. They also have purple webbed "patterns" on their skin, which Huntr/x tend to use as final confirmation before pulling out the swords. Rumi was born with a tiny bit of pattern, which has expanded over the years to cover much of her body. She is deeply ashamed of this and has kept the patterns scrupulously hidden, even from her bandmates, as she was encouraged to do by her adoptive mother Celine (one of her mom's bandmates).
Huntr/x's goal in the film is to be even better K-pop idols so that they can create a "Golden Honmoon", which will be completely demon-proof. Rumi secretly hopes that this will also rid her of her patterns. But just when they seem to be on the verge of success, the demons send a boy band to defeat them. One of the demon boys, Jinu, turns out to have a secret of his own: He used to be human! He has been turned into a demon by the demon king Gwi-Ma through a typically Faustian deal (earthly power, temptation to sin, loss of his soul, etc). Gwi-Ma, we learn, controls Jinu (and, apparently, all demons) via shame and regret. Jinu learns of Rumi's patterns and tells her about his past; both characters start to hope for a shared redemption. (In particular, they make plans for Jinu to sabotage his band's performance, hoping that he can stay on the human side of the Golden Honmoon.) But meanwhile, trouble is brewing; Rumi has newfound empathy for the demons she mows down by the dozen, and mixed feelings about the lyrics of her band's new anti-demon diss track, "Takedown". ("'Cause I see your real face, and it's ugly as sin / Time to put you in your place, 'cause you're rotten within / When your patterns start to show / It makes the hatred wanna grow outta my veins") She starts falling behind in battle and can't seem to get through a rehearsal without losing her voice. Her patterns keep growing. All of this strife is tearing holes in the Honmoon. This comes to a head at the big show-down concert for the Idol Awards; demons impersonating Mira and Zoey perform "Takedown" and reveal Rumi's patterns; she flees, and the news of Huntr/x's "breakup" tears the Honmoon to shreds.
Rumi confronts first Jinu, who has lost all hope and is thoroughly in the grip of Gwi-Ma, and then Celine, who encourages her (as usual) to hide her patterns and try to "fix" things. At one point, Celine says "our faults and fears must never be seen", which we've heard from Mira earlier in the film. Rumi, distraught (and looking increasingly demonic), accuses Celine of failing to love "all of" her. "If this is the Honmoon I'm supposed to protect," she says, "then I'll be glad to see it destroyed." The demon boy band begins a final performance where they sing about unhealthy parasocial relationships for a newly-aboveground Gwi-Ma and legions of sorta-depressed-but-enraptured fans ("I'm the only one who'll love your sins / Feel the way my voice gets underneath your skin").
Let's pause here. We've learned that demons (or at least some of them) are just humans who have given in to shame and fear and lost hope of redemption. Rumi, on the verge of despair, has glowing patterns just like those on Jinu, the most human-looking of the demons. Her maternal figure encouraged her and her bandmates to hide their flaws; this has now pushed Rumi to the point of questioning her cosmic role as one of the guardians of the increasingly impenetrable barrier between humans and demons. So obviously we're going to learn that the Honmoon was a mistake and that there's a better way to integrate the human and demon worlds, right? And the demons, or at least Jinu, will get a second chance? And maybe we're getting some critique of how people engage with K-pop idols?
Just kidding! As things move towards the obvious resolution on an emotional level (Rumi's bandmates sing about their respective personality "flaws", recontextualizing them as positive traits), they backslide on a cosmological level. Huntr/x has a dramatic battle against Gwi-Ma and his boy band, where they stab the demon king and his demon minions with swords; then they make a new Honmoon, better than ever, powered by the soul-energy of their even-more-devoted fans. (It's rainbow, not golden, but the effect seems to be the same). Hordes of demons get killed or banished back to the underworld; I think maybe Gwi-Ma gets killed but I wasn't really paying attention. A newly-ensouled Jinu sacrifices himself to save Rumi, neatly avoiding the question of which side of the barrier he'd have ended up on. Rumi, for her part, no longer seems remotely bothered by the task of slaughtering demons at an industrial scale.
At the end of the film, it's uncanny how little has changed, for both Rumi and her world. The Honmoon is stronger than before -- although, really, it seems to have been holding up alright from the beginning. The demons (traumatized failures?) are trapped in the underworld, where they (apparently?) belong, and the (charmingly flawed but ontologically immaculate) humans are safe up top. Rumi is basically the same, except that she's not ashamed of her patterns, which are also pretty and rainbow now. I think this is supposed to symbolize her (and her friends' and fans') acceptance of her flaws, but her major flaw has pretty much been fixed. Her bandmates are also more willing to talk openly with each other about their flaws and insecurities, although they really don't seem to have been shy about this before. They're also willing to sing about their insecurities, but this is a somewhat confusing kind of growth. (I don't really think it matters whether Taylor Swift's songs are about specific relationships she's had; a lot of good art isn't autobiographical in that way.) The largest sign of character growth is in their relationship with their fans -- instead of hiding from fans when out in public, they are happy to engage with them. This seems nice, but not really that big a deal.
(There's a possible interpretation here where Rumi's previous shame about showing skin is fundamentally about sexuality, or intimate relationships, but the movie doesn't really seem to be angling for this.)
There's something annoyingly self-referential about Rumi's "flaw". She's ashamed of her patterns, which are the physical manifestation of her shame about them. She hides them from her bandmates, who (we learn) are much more upset about the hiding than the patterns themselves. This is maybe a good metaphor for a lot of personal problems, but most such problems don't go away as soon as you acknowledge them. (Her bandmates -- abrasive Mira and people-pleasing Zoey -- get much less recursive personality traits to struggle with.) We are maybe supposed to think that she's hiding other problems, but if so, there isn't room in the movie for them. As for the fans, despite the movie's insistence that Huntr/x really loves their fans, while the demonic Saja Boys are merely exploiting them, the experience of the fans seems pretty similar in both cases. (It's also not clear to me that a healthy parasocial relationship is simply one where the idol says "for the fans!" a lot backstage.)
Above all, I wish the movie hadn't made its entire cosmology a metaphor for an unhealthy way of handling emotions, spelled out the metaphor in increasingly direct terms, and then left it untouched. It's hard not to walk away with the message that personal growth and spiritual redemption are only for people who are "essentially good" rather than "essentially bad", and that at any rate they're less important than keeping the essentially-bad people in their place.
My friend Tamera suggests an alternate ending: the demons, no longer held back by the Honmoon, swarm the concert -- only to find themselves given new spiritual strength by Huntr/x's music. (This is nearly foreshadowed; audience members' chests glow blue when they're particularly touched by the music, and Jinu leaves behind a glowing blue ball which we're told is the "soul" he regained thanks to Rumi's guidance.) The demons realize they have the power to fight back, overthrow Gwi-Ma, and either turn back into humans or go to their eternal rest. I think this would be much more consistent with the message the movie is going for, and doesn't undo all the work it does in building up the demons and Honmoon as symbols.