Bibliography 
A non-exhaustive list of RL+NLP papers. You can choose a paper from this list or suggest other papers. If you know of an important paper that is not here, let me know.
General
  - LITTMAN, Michael L. Reinforcement learning improves behaviour from evaluative feedback. Nature, 2015, vol. 521, no. 7553, pp. 445-451. link
 
  - LUKETINA, Jelena, et al. A survey of reinforcement learning informed by natural language. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, Survey Track. 2019. pp. 6309-6317. link
 
Coreference resolution
  - CLARK, Kevin; MANNING, Christopher D. Deep Reinforcement Learning for Mention-Ranking Coreference Models. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016. pp. 2256-2262. link
 
Dialog
  - ENGLISH, Michael S.; HEEMAN, Peter A. Learning mixed initiative dialog strategies by using reinforcement learning on both conversants. In: Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2005. pp. 1011-1018. link
 
  - FANG, Meng; LI, Yuan; COHN, Trevor. Learning how to Active Learn: A Deep Reinforcement Learning Approach. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. pp. 595-605. link
 
  - HEEMAN, Peter A. et al. Using Reinforcement Learning for Dialogue Management Policies: Towards Understanding MDP Violations and Convergence. In: Thirteenth Annual Conference of the International Speech Communication Association. 2012. link
 
  - KHOUZAIMI, Hatim; LAROCHE, Romain; LEFEVRE, Fabrice. Optimising turn-taking strategies with reinforcement learning. In: Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue. 2015. pp. 315-324. link
 
  - LI, Jiwei, et al. Deep Reinforcement Learning for Dialogue Generation. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016. pp. 1192-1202. link
 
  - MANUVINAKURIKE, Ramesh; DEVAULT, David; GEORGILA, Kallirroi. Using reinforcement learning to model incrementality in a fast-paced dialogue game. In: Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue. 2017. pp. 331-341. link
 
  - PAEK, Tim. Reinforcement learning for spoken dialogue systems: Comparing strengths and weaknesses for practical deployment. In: Proc. Dialog-on-Dialog Workshop, Interspeech. 2006. link
 
  - PAPANGELIS, Alexandros. A Comparative Study of Reinforcement Learning Techniques on Dialogue Management. In: Proceedings of the Student Research Workshop at the 13th Conference of the European Chapter of the Association for Computational Linguistics. 2012. pp. 22-31. link
 
  - SINGH, Satinder P., et al. Reinforcement learning for spoken dialogue systems. In: Advances in Neural Information Processing Systems. 2000. pp. 956-962. link
 
Grammatical error correction
  - SAKAGUCHI, Keisuke; POST, Matt; VAN DURME, Benjamin. Grammatical Error Correction with Neural Reinforcement Learning. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2017. pp. 366-372. link
 
Human-robot interaction
  - CRUZ, Francisco, et al. Interactive reinforcement learning through speech guidance in a domestic scenario. In: 2015 International Joint Conference on Neural Networks (IJCNN). IEEE, 2015. pp. 1-8. link
 
  - RITSCHEL, Hannes; ANDRÉ, Elisabeth. Shaping a social robot’s humor with Natural Language Generation and socially-aware reinforcement learning. In: Proceedings of the Workshop on NLG for Human–Robot Interaction. 2018. pp. 12-16. link
 
Image Captioning
  - RENNIE, Steven J., et al. Self-critical sequence training for image captioning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017. pp. 7008-7024. link
 
Information Extraction
  - NARASIMHAN, Karthik; YALA, Adam; BARZILAY, Regina. Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2016. pp. 2355-2365. link
 
  - TANIGUCHI, Motoki; MIURA, Yasuhide; OHKUMA, Tomoko. Joint Modeling for Query Expansion and Information Extraction with Reinforcement Learning. In: Proceedings of the First Workshop on Fact Extraction and VERification (FEVER). 2018. pp. 34-39. link
 
Instructions
  - BRANAVAN, Satchuthananthavale RK, et al. Reinforcement learning for mapping instructions to actions. In: Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1. Association for Computational Linguistics, 2009. pp. 82-90. link
 
  - GOYAL, Prasoon; NIEKUM, Scott; MOONEY, Raymond J. Using natural language for reward shaping in reinforcement learning. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence. AAAI Press, 2019. pp. 2385-2391. link
 
Knowledge graph reasoning
  - XIONG, Wenhan; HOANG, Thien; WANG, William Yang. DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning. In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. pp. 564-573. link
 
Language model
  - RANZATO, Marc’Aurelio, et al. Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732, 2015. link
 
Language generation
  - DETHLEFS, Nina; CUAYÁHUITL, Heriberto. Hierarchical reinforcement learning and hidden Markov models for task-oriented natural language generation. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers, Volume 2. Association for Computational Linguistics, 2011. pp. 654-659. link
 
  - YASUI, Go; TSURUOKA, Yoshimasa; NAGATA, Masaaki. Using Semantic Similarity as Reward for Reinforcement Learning in Sentence Generation. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. 2019. pp. 400-406. link
 
Machine Translation
  - GRISSOM II, Alvin, et al. Don’t until the final verb wait: Reinforcement learning for simultaneous machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2014. pp. 1342-1352. link
 
  - WU, Lijun, et al. A Study of Reinforcement Learning for Neural Machine Translation. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3612-3621. link
 
Math word problem
  - HUANG, Danqing, et al. Neural math word problem solver with reinforcement learning. In: Proceedings of the 27th International Conference on Computational Linguistics. 2018. pp. 213-223. link
 
Miscellaneous
  - CHANG, Kai-Wei, et al. Learning to search better than your teacher. In: Proceedings of the 32nd International Conference on Machine Learning, Volume 37. 2015. pp. 2058-2066. link
 
  - NOROUZI, Mohammad, et al. Reward augmented maximum likelihood for neural structured prediction. In: Advances in Neural Information Processing Systems. 2016. pp. 1723-1731. link
 
  - SOKOLOV, Artem, et al. Learning structured predictors from bandit feedback for interactive NLP. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2016. pp. 1610-1620. link
 
Narratives
  - LING, Yuan, et al. Learning to diagnose: assimilating clinical narratives using deep reinforcement learning. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2017. pp. 895-905. link
 
Paraphrase generation
  - LI, Zichao, et al. Paraphrase Generation with Deep Reinforcement Learning. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3865-3878. link
 
Parsing
  - JIANG, Jiarong, et al. Learned prioritization for trading off accuracy and speed. In: Advances in Neural Information Processing Systems. 2012. pp. 1331-1339. link
 
  - LÊ, Minh; FOKKENS, Antske. Tackling Error Propagation through Reinforcement Learning: A Case of Greedy Dependency Parsing. In: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. 2017. pp. 677-687. link
 
  - NASEEM, Tahira, et al. Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019. pp. 4586-4592. link
 
  - ZHANG, Lidan; CHAN, Kwok Ping. Dependency parsing with energy-based reinforcement learning. In: Proceedings of the 11th International Conference on Parsing Technologies. Association for Computational Linguistics, 2009. pp. 234-237. link
 
Poetry generation
  - YI, Xiaoyuan, et al. Automatic poetry generation with mutual reinforcement learning. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3143-3153. link
 
Question Answering
  - GODIN, Fréderic; KUMAR, Anjishnu; MITTAL, Arpit. Learning when not to answer: a ternary reward structure for reinforcement learning based question answering. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Industry Papers). 2019. pp. 122-129. link
 
  - XIONG, Caiming; ZHONG, Victor; SOCHER, Richard. DCN+: Mixed objective and deep residual coattention for question answering. arXiv preprint arXiv:1711.00106, 2017. link
 
Question generation
  - FAN, Zhihao, et al. A reinforcement learning framework for natural question generation using bi-discriminators. In: Proceedings of the 27th International Conference on Computational Linguistics. 2018. pp. 1763-1774. link
 
  - HU, Huang, et al. Playing 20 Question Game with Policy-Based Reinforcement Learning. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018. pp. 3233-3242. link
 
Relation Extraction
  - ZENG, Xiangrong, et al. Large scaled relation extraction with reinforcement learning. In: Thirty-Second AAAI Conference on Artificial Intelligence. 2018. link
 
Sentence Representation
  - YOGATAMA, Dani, et al. Learning to compose words into sentences with reinforcement learning. In: 5th International Conference on Learning Representations (ICLR 2017). 2017. link
 
Semantic Parsing
  - GUU, Kelvin, et al. From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017. pp. 1051-1062. link
 
  - LIANG, Chen, et al. Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2017. pp. 23-33. link
 
Taxonomy induction
  - MAO, Yuning, et al. End-to-End Reinforcement Learning for Automatic Taxonomy Induction. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2018. pp. 2462-2472. link
 
Text-based games
  - HE, Ji, et al. Deep Reinforcement Learning with a Natural Language Action Space. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2016. pp. 1621-1630. link
 
  - NARASIMHAN, Karthik; KULKARNI, Tejas; BARZILAY, Regina. Language Understanding for Text-based Games using Deep Reinforcement Learning. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015. pp. 1-11. link
 
Text anonymization
  - MOSALLANEZHAD, Ahmadreza; BEIGI, Ghazaleh; LIU, Huan. Deep Reinforcement Learning-based Text Anonymization against Private-Attribute Inference. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). 2019. pp. 2360-2369. link
 
Text classification
  - ZHANG, Tianyang; HUANG, Minlie; ZHAO, Li. Learning structured representation for text classification via reinforcement learning. In: Thirty-Second AAAI Conference on Artificial Intelligence. 2018. link
 
Text summarization
  - LEE, Gyoung Ho; LEE, Kong Joo. Automatic text summarization using reinforcement learning with embedding features. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 2017. pp. 193-197. link
 
  - PAULUS, Romain; XIONG, Caiming; SOCHER, Richard. A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304, 2017. link
 
  - RYANG, Seonggi; ABEKAWA, Takeshi. Framework of automatic text summarization using reinforcement learning. In: Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 2012. pp. 256-265. link
 
Video captioning
  - WANG, Xin, et al. Video captioning via hierarchical reinforcement learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018. pp. 4213-4222. link