❌ Traditional reward models = Slow 🚶
🔹 They score entire responses post-generation 📜
🔹 LLMs must generate the full response before it can be evaluated ⏳
✅ GenARM = Fast 🏎️
🔹 Predicts next-token rewards on the fly ⚡
🔹 Guides LLMs token by token, with no need to finish a full response before scoring it 💡
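
To make the token-by-token idea concrete, here is a minimal toy sketch in Python. It is not GenARM's actual code: `base_logprobs`, `token_rewards`, the three-word vocabulary, and the `beta` weighting are all hypothetical stand-ins, and the scoring rule (base log-prob plus reward divided by `beta`) is one common way to combine the two signals, assumed here for illustration.

```python
import math

# Hypothetical toy vocabulary; a real LLM has tens of thousands of tokens.
VOCAB = ["good", "bad", "<eos>"]


def base_logprobs(prefix):
    """Toy stand-in for the base LLM: fixed next-token log-probs."""
    return {"good": math.log(0.3), "bad": math.log(0.5), "<eos>": math.log(0.2)}


def token_rewards(prefix):
    """Toy stand-in for a token-level reward model (assumed values)."""
    return {"good": 1.0, "bad": -1.0, "<eos>": 0.0}


def guided_decode(max_len=3, beta=1.0):
    """Greedy decoding where each step adds reward / beta to the base log-prob.

    Smaller beta means the reward signal steers generation more strongly.
    """
    prefix = []
    for _ in range(max_len):
        lp = base_logprobs(prefix)
        r = token_rewards(prefix)
        # Combine both signals for every candidate next token.
        scores = {tok: lp[tok] + r[tok] / beta for tok in VOCAB}
        best = max(scores, key=scores.get)
        if best == "<eos>":
            break
        prefix.append(best)
    return prefix


if __name__ == "__main__":
    print(guided_decode(beta=1.0))    # reward-guided choice at every step
    print(guided_decode(beta=100.0))  # reward nearly ignored
```

The point of the sketch: the reward is consulted at every step, before the next token is chosen, instead of once after the whole response is written. With a large `beta` the toy decoder falls back to the base model's preferred token.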