Not All LLM Reasoners Are Created Equal - Arian Hosseini, Alessandro Sordoni, Daniel Toyama, Aaron Courville, Rishabh Agarwal
Generative Verifiers: Reward Modeling as Next-Token Prediction - Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling - Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi
V-STaR: Training Verifiers for Self-Taught Reasoners COLM 2024 - Arian Hosseini, Xingdi Yuan, Nikolay Malkin, Aaron Courville, Alessandro Sordoni, Rishabh Agarwal
Faster, More Efficient RLHF through Off-Policy Asynchronous Learning - Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron Courville
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization COLM 2024 - Shengyi Huang, Michael Noukhovitch, Arian Hosseini, Kashif Rasul, Weixun Wang, Lewis Tunstall
Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference NeurIPS 2023 - Alessandro Sordoni, Xingdi Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux
On the Compositional Generalization Gap of In-Context Learning EMNLP 2022 (workshop) - Arian Hosseini, Ankit Vani, Dzmitry Bahdanau, Alessandro Sordoni and Aaron Courville
Understanding by Understanding Not: Modeling Negation in Language Models NAACL 2021 - Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, Devon Hjelm, Alessandro Sordoni, and Aaron Courville
Ordered Memory NeurIPS 2019 - Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, and Aaron Courville
Learning to Understand Goal Specifications by Modelling Reward ICLR 2019 - Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Arian Hosseini, Pushmeet Kohli, Edward Grefenstette
Commonsense mining as knowledge base completion? A study on the impact of novelty NAACL-HLT 2018 (workshop) - Stanisław Jastrzębski, Dzmitry Bahdanau, Arian Hosseini, Michael Noukhovitch, Yoshua Bengio, Jackie Chi Kit Cheung