Not All LLM Reasoners Are Created Equal

Arian Hosseini, Alessandro Sordoni, Daniel Toyama, Aaron Courville, Rishabh Agarwal

Generative Verifiers: Reward Modeling as Next-Token Prediction

Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal

Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling

Hritik Bansal, Arian Hosseini, Rishabh Agarwal, Vinh Q. Tran, Mehran Kazemi

V-STaR: Training Verifiers for Self-Taught Reasoners

COLM 2024 - Arian Hosseini, Xingdi Yuan, Nikolay Malkin, Aaron Courville, Alessandro Sordoni, Rishabh Agarwal

Faster, More Efficient RLHF through Off-Policy Asynchronous Learning

Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux, Arian Hosseini, Rishabh Agarwal, Aaron Courville

The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization

COLM 2024 - Shengyi Huang, Michael Noukhovitch, Arian Hosseini, Kashif Rasul, Weixun Wang, Lewis Tunstall

Deep Language Networks: Joint Prompt Training of Stacked LLMs using Variational Inference

NeurIPS 2023 - Alessandro Sordoni, Xingdi Yuan, Marc-Alexandre Côté, Matheus Pereira, Adam Trischler, Ziang Xiao, Arian Hosseini, Friederike Niedtner, Nicolas Le Roux

On the Compositional Generalization Gap of In-Context Learning

EMNLP 2022 (workshop) - Arian Hosseini, Ankit Vani, Dzmitry Bahdanau, Alessandro Sordoni and Aaron Courville

Understanding by Understanding Not: Modeling Negation in Language Models

NAACL 2021 - Arian Hosseini, Siva Reddy, Dzmitry Bahdanau, Devon Hjelm, Alessandro Sordoni, and Aaron Courville

Ordered Memory

NeurIPS 2019 - Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, and Aaron Courville

Learning to Understand Goal Specifications by Modelling Reward

ICLR 2019 - Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Arian Hosseini, Pushmeet Kohli, Edward Grefenstette

Commonsense mining as knowledge base completion? A study on the impact of novelty

NAACL-HLT 2018 (workshop) - Stanisław Jastrzębski, Dzmitry Bahdanau, Arian Hosseini, Michael Noukhovitch, Yoshua Bengio, Jackie Chi Kit Cheung