I started learning NLP in mid-2018 and gradually moved into more serious research. My focus is mainly on Natural Language Generation (NLG).
Beginning: SemEval 2019
By the end of 2018, members of the AI learning group led by Prof. Weifeng Su decided to take part in SemEval 2019 to study and learn more about NLP.
Our group chose Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval), in which the input is users’ tweets that need to be classified.
There are three subtasks: Task A identifies whether a tweet contains offensive language, Task B categorizes the type of offense, and Task C identifies the target of the offense.
We used the BERT model, which was state of the art for text classification at the time. Our experiments were limited to changing the preprocessing methods and trying some simple (not even proper) ensemble methods, but it was still fun to learn such an advanced model and some NLP methods.
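As an illustration of what a simple ensemble can look like, here is a minimal sketch of majority voting over the labels predicted by several classifiers. It is not necessarily the exact method used in the paper; the function name `majority_vote` and the `OFF`/`NOT` labels are assumptions made for this example.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine label predictions from several models by majority vote.

    predictions_per_model: a list of lists, one inner list per model,
    each holding one predicted label per example (same order, same length).
    """
    combined = []
    # zip(*...) regroups the predictions so that each tuple holds
    # every model's label for a single example.
    for labels in zip(*predictions_per_model):
        # most_common(1) returns the most frequent label; on a tie,
        # the label encountered first (the earliest model) wins.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three models on three tweets.
model_a = ["OFF", "NOT", "NOT"]
model_b = ["OFF", "OFF", "NOT"]
model_c = ["NOT", "OFF", "NOT"]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)  # ['OFF', 'OFF', 'NOT']
```

With an odd number of models and two classes, ties cannot occur, which is one reason this kind of voting is a common first step before trying weighted or probability-averaging ensembles.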
Our work was published as a paper in the NAACL 2019 Workshop: ACL Anthology
Final Year Project: LenAtten
We proposed a novel attention-based module that can be applied to RNN-based sequence-to-sequence models for the fixed-length summarization task. The proposed method achieved good results on the CNN/Daily Mail and English Gigaword datasets.
This work has been published in Findings of ACL 2021: ACL Anthology; Code.
MSc Independent Project: Cross-modal Dialogue Pre-training
Emotion-aware Multimodal Pre-training for Image-grounded Emotional Response Generation
In this work, we consider the natural situation of a conversation between two people. Beyond the content expressed through spoken language, factors such as facial expression and posture also play a role, and these non-verbal factors usually convey richer and more abstract information, such as emotions. Based on this observation, we proposed methods for pre-training a language model to capture emotions across modalities and incorporate them into text generation for dialogue.
This work has been accepted at DASFAA 2022; the article can be accessed from Springer Link.