Bibliography

A non-exhaustive list of RL+NLP papers. You can choose a paper from this list or suggest other papers. If you know of an important paper that is not here, let me know.

General

LITTMAN, Michael L. Reinforcement learning improves behaviour from evaluative feedback. Nature, 2015, vol. 521, no. 7553, pp. 445-451. link
LUKETINA, Jelena, et al. A survey of reinforcement learning informed by natural language. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, Survey Track. 2019. pp. 6309-6317. link

Coreference resolution

CLARK, Kevin; MANNING, Christopher D. Deep Reinforcement Learning for Mention-Ranking Coreference Models. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016. pp. 2256-2262. link

Dialog

ENGLISH, Michael S.; HEEMAN, Peter A. Learning mixed initiative dialog strategies by using reinforcement learning on both conversants. In: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2005. pp. 1011-1018. link
FANG, Meng; LI, Yuan; COHN, Trevor. Learning how to Active Learn: A Deep Reinforcement Learning Approach. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. pp. 595-605. link
HEEMAN, Peter A. et al. Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence. In: Thirteenth Annual Conference of the International Speech Communication Association. 2012. link
KHOUZAIMI, Hatim; LAROCHE, Romain; LEFEVRE, Fabrice. Optimising turn-taking strategies with reinforcement learning. In: Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue. 2015. pp. 315-324. link
LI, Jiwei, et al. Deep Reinforcement Learning for Dialogue Generation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016. pp. 1192-1202. link
MANUVINAKURIKE, Ramesh; DEVAULT, David; GEORGILA, Kallirroi. Using reinforcement learning to model incrementality in a fast-paced dialogue game. In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue. 2017. pp. 331-341. link
PAEK, Tim. Reinforcement learning for spoken dialogue systems: Comparing strengths and weaknesses for practical deployment. In: Proc. Dialog-on-Dialog Workshop, Interspeech. 2006. link
PAPANGELIS, Alexandros. A Comparative Study of Reinforcement Learning Techniques on Dialogue Management. In: Proceedings of the Student Research Workshop at the 13th Conference of the European Chapter of the Association for Computational Linguistics. 2012. pp. 22-31. link
SINGH, Satinder P., et al. Reinforcement learning for spoken dialogue systems. In: Advances in Neural Information Processing Systems. 2000. pp. 956-962. link

Grammatical error correction

SAKAGUCHI, Keisuke; POST, Matt; VAN DURME, Benjamin. Grammatical Error Correction with Neural Reinforcement Learning. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2017. pp. 366-372. link

Human-robot interaction

CRUZ, Francisco, et al. Interactive reinforcement learning through speech guidance in a domestic scenario. In: 2015 International Joint Conference on Neural Networks (IJCNN). IEEE, 2015. pp. 1-8. link
RITSCHEL, Hannes; ANDRÉ, Elisabeth. Shaping a social robot’s humor with Natural Language Generation and socially-aware reinforcement learning. In: Proceedings of the Workshop on NLG for Human–Robot Interaction. 2018. pp. 12-16. link

Image Captioning

RENNIE, Steven J., et al. Self-critical sequence training for image captioning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. pp. 7008-7024. link

Information Extraction

NARASIMHAN, Karthik; YALA, Adam; BARZILAY, Regina. Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016. pp. 2355-2365. link
TANIGUCHI, Motoki; MIURA, Yasuhide; OHKUMA, Tomoko. Joint Modeling for Query Expansion and Information Extraction with Reinforcement Learning. In: Proceedings of the First Workshop on Fact Extraction and VERification (FEVER). 2018. pp. 34-39. link

Instructions

BRANAVAN, Satchuthananthavale RK, et al. Reinforcement learning for mapping instructions to actions. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1. Association for Computational Linguistics, 2009. pp. 82-90. link
GOYAL, Prasoon; NIEKUM, Scott; MOONEY, Raymond J. Using natural language for reward shaping in reinforcement learning. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence. AAAI Press, 2019. pp. 2385-2391. link

Knowledge graph reasoning

XIONG, Wenhan; HOANG, Thien; WANG, William Yang. DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. pp. 564-573. link

Language model

RANZATO, Marc’Aurelio, et al. Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732, 2015. link

Language generation

DETHLEFS, Nina; CUAYÁHUITL, Heriberto. Hierarchical reinforcement learning and hidden Markov models for task-oriented natural language generation. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers-Volume 2. Association for Computational Linguistics, 2011. pp. 654-659. link
YASUI, Go; TSURUOKA, Yoshimasa; NAGATA, Masaaki. Using Semantic Similarity as Reward for Reinforcement Learning in Sentence Generation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. 2019. pp. 400-406. link

Machine Translation

GRISSOM II, Alvin, et al. Don’t until the final verb wait: Reinforcement learning for simultaneous machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014. pp. 1342-1352. link
WU, Lijun, et al. A Study of Reinforcement Learning for Neural Machine Translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3612-3621. link

Math word problem

HUANG, Danqing, et al. Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics. 2018. pp. 213-223. link

Mix

CHANG, Kai-Wei, et al. Learning to search better than your teacher. In: Proceedings of the 32nd International Conference on Machine Learning-Volume 37. 2015. pp. 2058-2066. link
NOROUZI, Mohammad, et al. Reward augmented maximum likelihood for neural structured prediction. In: Advances in Neural Information Processing Systems. 2016. pp. 1723-1731. link
SOKOLOV, Artem, et al. Learning structured predictors from bandit feedback for interactive NLP. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2016. pp. 1610-1620. link

Narratives

LING, Yuan, et al. Learning to diagnose: assimilating clinical narratives using deep reinforcement learning. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2017. pp. 895-905. link

Paraphrase generation

LI, Zichao, et al. Paraphrase Generation with Deep Reinforcement Learning. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3865-3878. link

Parsing

JIANG, Jiarong, et al. Learned prioritization for trading off accuracy and speed. In: Advances in Neural Information Processing Systems. 2012. pp. 1331-1339. link
LÊ, Minh; FOKKENS, Antske. Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. 2017. pp. 677-687. link
NASEEM, Tahira, et al. Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. pp. 4586-4592. link
ZHANG, Lidan; CHAN, Kwok Ping. Dependency parsing with energy-based reinforcement learning. In: Proceedings of the 11th International Conference on Parsing Technologies. Association for Computational Linguistics, 2009. pp. 234-237. link

Poetry generation

YI, Xiaoyuan, et al. Automatic poetry generation with mutual reinforcement learning. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3143-3153. link

Question Answering

GODIN, Fréderic; KUMAR, Anjishnu; MITTAL, Arpit. Learning when not to answer: a ternary reward structure for reinforcement learning based question answering. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers). 2019. pp. 122-129. link
XIONG, Caiming; ZHONG, Victor; SOCHER, Richard. DCN+: Mixed objective and deep residual coattention for question answering. arXiv preprint arXiv:1711.00106, 2017. link

Question generation

FAN, Zhihao, et al. A reinforcement learning framework for natural question generation using bi-discriminators. In: Proceedings of the 27th International Conference on Computational Linguistics. 2018. pp. 1763-1774. link
HU, Huang, et al. Playing 20 Question Game with Policy-Based Reinforcement Learning. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3233-3242. link

Relation Extraction

ZENG, Xiangrong, et al. Large scaled relation extraction with reinforcement learning. In: Thirty-Second AAAI Conference on Artificial Intelligence. 2018. link

Sentence Representation

YOGATAMA, D., et al. Learning to compose words into sentences with reinforcement learning. In: 5th International Conference on Learning Representations (ICLR 2017). International Conference on Learning Representations, 2017. link

Semantic Parsing

GUU, Kelvin, et al. From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017. pp. 1051-1062. link
LIANG, Chen, et al. Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017. pp. 23-33. link

Taxonomy induction

MAO, Yuning, et al. End-to-End Reinforcement Learning for Automatic Taxonomy Induction. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018. pp. 2462-2472. link

Text-based games

HE, Ji, et al. Deep Reinforcement Learning with a Natural Language Action Space. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2016. pp. 1621-1630. link
NARASIMHAN, Karthik; KULKARNI, Tejas; BARZILAY, Regina. Language Understanding for Text-based Games using Deep Reinforcement Learning. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015. pp. 1-11. link

Text anonymization

MOSALLANEZHAD, Ahmadreza; BEIGI, Ghazaleh; LIU, Huan. Deep Reinforcement Learning-based Text Anonymization against Private-Attribute Inference. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019. pp. 2360-2369. link

Text classification

ZHANG, Tianyang; HUANG, Minlie; ZHAO, Li. Learning structured representation for text classification via reinforcement learning. In: Thirty-Second AAAI Conference on Artificial Intelligence. 2018. link

Text summarization

LEE, Gyoung Ho; LEE, Kong Joo. Automatic text summarization using reinforcement learning with embedding features. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2017. pp. 193-197. link
PAULUS, Romain; XIONG, Caiming; SOCHER, Richard. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. link
RYANG, Seonggi; ABEKAWA, Takeshi. Framework of automatic text summarization using reinforcement learning. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 2012. pp. 256-265. link

Video captioning

WANG, Xin, et al. Video captioning via hierarchical reinforcement learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018. pp. 4213-4222. link