Sources
LessWrong.com News
- The Hot Mess Paper Conflates Three Distinct Failure Modes 8 hours 24 minutes
- The Future of Aligning Deep Learning systems will probably look like "training on interp" 12 hours 17 minutes
- An agent autonomously builds a 1.5 GHz Linux-capable RISC-V CPU 12 hours 20 minutes
- Untrusted monitoring: extra bits 13 hours 51 minutes
- Finding features in Transformers: Contrastive directions elicit stronger low-level perturbation responses than baselines 14 hours 14 minutes