General · Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX · 2026-05-22 · 3 read
General · Conditional Equivalence of DPO and RLHF: Implicit Assumption, Failure Modes, and Provable Alignment · 2026-05-22 · 3 read
General · ScenePilot: Controllable Boundary-Driven Critical Scenario Generation for Autonomous Driving · 2026-05-22 · 3 read
General · Model Context Protocol (MCP): The Complete Developer Guide to Building Production-Grade AI Agents in 2026 · 2026-05-22 · 3 read