Workshop on Reinforcement Learning Beyond Rewards

Reinforcement Learning Conference (RLC) 2024

August 9, 2024

@RLBRew_2024 · #RLBRew_2024


The paper PDFs can be accessed by clicking on the title of each paper. To report broken links or suggest fixes, please create a pull request at https://github.com/rlbrew-workshop/rlbrew-workshop.github.io.

| Poster Presentations (PDF) | Authors |
| --- | --- |
| Language Reward Modulation for Pretraining Reinforcement Learning | Ademi Adeniji, Amber Xie, Carmelo Sferrazza, Younggyo Seo, Stephen James, Pieter Abbeel |
| Skill-Based Reinforcement Learning with Intrinsic Reward Matching | Ademi Adeniji, Amber Xie, Pieter Abbeel |
| Learn to Outperform Demonstrators via Reward and Policy Co-learning | Mingkang Wu, Feng Tao, Yongcan Cao |
| Adaptive Deep Q-Networks for Decision Making in Non-Stationary Environments: A Case Study with the Wisconsin Card Sorting Test | Dieu-Donne Fangnon, Eduardo H. Ramirez-Rangel |
| The Reward Problem: Where does the most important signal in reinforcement learning come from? | Kory Wallace Mathewson |
| Integrating Feedback and Noisy Preferences for Adaptable Robotic Control | Yuxuan Li, Srijita Das, Qinglin Liu, Matthew E. Taylor |
| Representation Learning for Cross-Embodiment Inverse Reinforcement Learning from Mixed-Quality Demonstrations | Anurag Sidharth Aribandi, Connor Mattson, Daniel S. Brown |
| Learning Action-based Representations Using Invariance | Max Rudolph, Caleb Chuck, Kevin Black, Misha Lvovsky, Scott Niekum, Amy Zhang |
| Pragmatic Feature Preferences: Learning Reward-Relevant Preferences from Human Input | Andi Peng, Yuying Sun, Tianmin Shu, David Abel |
| Multistep Inverse Is Not All You Need | Alexander Levine, Peter Stone, Amy Zhang |
| Adaptive Feedback Selection for Learning to Avoid Negative Side Effects in Autonomous Agents | Yashwanthi Anand, Sandhya Saisubramanian |
| REBEL: Reinforcement Learning via Regressing Relative Rewards | Zhaolin Gao, Jonathan Daniel Chang, Wenhao Zhan, Owen Oertell, Gokul Swamy, Kiante Brantley, Thorsten Joachims, J. Andrew Bagnell, Jason D. Lee, Wen Sun |
| Efficient Inverse Reinforcement Learning without Compounding Errors | Nicolas Espinosa Dice, Gokul Swamy, Sanjiban Choudhury, Wen Sun |
| Multi-Agent Imitation Learning: Value is Easy, Regret is Hard | Jingwu Tang, Gokul Swamy, Fei Fang, Steven Wu |
| PIPER: Primitive-Informed Preference-based Hierarchical Reinforcement Learning via Hindsight Relabeling | Utsav Singh, Wesley A. Suttle, Brian M. Sadler, Vinay P. Namboodiri, Amrit Bedi |
| RLeXplore: Accelerating Research in Intrinsically-Motivated Reinforcement Learning | Mingqi Yuan, Roger Creus Castanyer, Bo Li, Xin Jin, Glen Berseth, Wenjun Zeng |
| A Dual Approach to Imitation Learning from Observations with Suboptimal Offline Datasets | Harshit Sikchi, Caleb Chuck, Amy Zhang, Scott Niekum |
| SkiLD: Unsupervised Skill Discovery Guided by Local Dependencies | Zizhao Wang, Jiaheng Hu, Caleb Chuck, Stephen Chen, Roberto Martín-Martín, Amy Zhang, Scott Niekum, Peter Stone |
| Inverse-RLignment: Inverse Reinforcement Learning from Demonstrations for LLM Alignment | Hao Sun, Mihaela van der Schaar |
| Value Implicit Pretraining does not learn Representations suitable for Reinforcement Learning | Harshit Sikchi, Siddhant Agarwal, Peter Stone, Amy Zhang, Scott Niekum |
| Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms | Rafael Rafailov, Yaswanth Chittepu, Ryan Park, Harshit Sikchi, Joey Hejna, W. Bradley Knox, Chelsea Finn, Scott Niekum |
| Tell my why: Training preferences-based RL with human preferences and step-level explanations | Jakob Karalus |
| Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning | Jiaheng Hu, Zizhao Wang, Peter Stone, Roberto Martín-Martín |
| Aligning Agents like Large Language Models | Adam Jelley, Yuhan Cao, David Bignell, Sam Devlin, Tabish Rashid |
| Prioritizing safety via curriculum learning | Cevahir Koprulu, Thiago D. Simão, Nils Jansen, Ufuk Topcu |
| Agent Q: Combining Search, Self-Critique and Reinforcement Learning for Autonomous Web Agents | Pranav Putta, Edmund Mills, Naman Garg, Chelsea Finn, Divyansh Garg, Rafael Rafailov |
| External Model Motivated Agents: Reinforcement Learning for Enhanced Environment Sampling | Rishav Bhagat, Jonathan C. Balloch, Zhiyu Lin, Mark Riedl |
| Understanding Preference Fine-Tuning Through the Lens of Coverage | Yuda Song, Gokul Swamy, Aarti Singh, Drew Bagnell, Wen Sun |
| Generalization of Temporal Logic Tasks via Future Dependent Options | Duo Xu |
| Learning Abstract Skillsets with Empowerment Bandits | Andrew Levy, Alessandro G. Allievi, George Konidaris |
| Proto Successor Measure: Representing the space of all possible solutions of Reinforcement Learning | Siddhant Agarwal, Harshit Sikchi, Peter Stone, Amy Zhang |
| Offline Reinforcement Learning with Imputed Rewards | Carlo Romeo, Andrew D. Bagdanov |
| A Reward Analysis of Reinforcement Learning from Large Language Model Feedback | Muhan Lin, Shuyang Shi, Yue Guo, Behdad Chalaki, Vaishnav Tadiparthi, Simon Stepputtis, Joseph Campbell, Katia P. Sycara |
| OCALM: Object-Centric Assessment with Language Models | Timo Kaufmann, Jannis Blüml, Antonia Wüst, Quentin Delfosse, Kristian Kersting, Eyke Hüllermeier |
| Towards Principled Representation Learning from Videos for Reinforcement Learning | Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford |
| Dynamics Generalisation with Behaviour Foundation Models | Scott Jeen, Jonathan Cullen |
| Task-Oriented Slot-Based Cumulant Discovery in General Value Functions | Vincent Michalski, Somjit Nath, Derek Nowrouzezahrai, Doina Precup, Samira Ebrahimi Kahou |