AI Digest

#post-training

4 items

  • 10 Jun DRPO: Rethinking Divergence Regularization in LLM Reinforcement Learning Tencent Hunyuan research
  • 11 Jun Anatomy of Post-Training: Using Interpretability to Audit and Fix Preference Data research
  • 26 Jun OPRD: On-Policy Representation Distillation for Post-Training LLMs research
  • 28 Jun Tencent Hunyuan Open-Sources UniRL: Unified RL Post-Training for LLMs and Diffusion Models Tencent / Hunyuan research

ai-digest.kerby.pro

© 2026 Alexei Lukin · CC BY 4.0
