We propose QDD_Net, a network for duplicate question detection.
Our model performed well in the PPDAI Magic Mirror Data Application Contest.
The data should consist of question pairs, each labeled 0 or 1 to indicate whether the two questions are similar.
Word and character embeddings should each be provided to represent the question sequences.
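As a rough illustration of the input format described above, the sketch below represents one labeled question pair at both word and character level and converts the tokens to padded id sequences. The names `QuestionPair` and `encode`, the toy vocabularies, and the padding/unknown ids are hypothetical and not taken from this repository.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QuestionPair:
    # Token sequences for each question, at word and character level.
    q1_words: List[str]
    q1_chars: List[str]
    q2_words: List[str]
    q2_chars: List[str]
    label: int  # 0 or 1; which value means "similar" depends on the dataset


def encode(tokens: List[str], vocab: Dict[str, int], max_len: int,
           pad_id: int = 0, unk_id: int = 1) -> List[int]:
    """Map tokens to integer ids, then truncate or right-pad to max_len."""
    ids = [vocab.get(t, unk_id) for t in tokens][:max_len]
    return ids + [pad_id] * (max_len - len(ids))


# Toy vocabularies (assumed); ids 0 and 1 are reserved for padding and unknown.
word_vocab = {"W1": 2, "W2": 3, "W3": 4}
char_vocab = {"C1": 2, "C2": 3}

pair = QuestionPair(["W1", "W2"], ["C1", "C2", "C1"], ["W3", "W9"], ["C2"], label=1)

q1_word_ids = encode(pair.q1_words, word_vocab, max_len=4)  # right-padded
q2_word_ids = encode(pair.q2_words, word_vocab, max_len=4)  # "W9" is unknown
q1_char_ids = encode(pair.q1_chars, char_vocab, max_len=2)  # truncated
```

The resulting id sequences would then index into the separately provided word and character embedding matrices.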
We propose three models: an RNN-based model, a CNN-based model, and an RCNN-based model. These models have the following characteristics:
- Bi-directional GRU in the RNN-based models for semantic learning.
- 1-D convolution in the CNN- and RCNN-based models for local feature extraction.
- Co-attention to learn the semantic correlations between the two sequences.
- Self-attention to enhance the feature representation.
- Word embeddings and character embeddings used simultaneously.
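To make the two attention mechanisms in the list above more concrete, here is a minimal sketch using plain dot-product attention. This is an assumed, simplified formulation chosen for illustration; the exact scoring function used in QDD_Net is not stated here, and the helper names `attend`, `co_attention`, and `self_attention` are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def attend(queries: List[Vector], keys: List[Vector]) -> List[Vector]:
    """For each query vector, return a softmax-weighted sum of the key vectors."""
    dim = len(keys[0])
    out = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        out.append([sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)])
    return out


def co_attention(a: List[Vector], b: List[Vector]):
    """Each sequence attends over the other (assumed dot-product scoring)."""
    return attend(a, b), attend(b, a)


def self_attention(a: List[Vector]) -> List[Vector]:
    """A sequence attends over itself."""
    return attend(a, a)


# Toy encoded sequences: question A has 2 steps, question B has 3 steps.
seq_a = [[1.0, 0.0], [0.0, 1.0]]
seq_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

a_ctx, b_ctx = co_attention(seq_a, seq_b)  # a_ctx: 2 x 2, b_ctx: 3 x 2
a_self = self_attention(seq_a)             # 2 x 2
```

In a full model, `seq_a` and `seq_b` would be the outputs of the GRU or convolution encoders rather than hand-written vectors.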
The ensemble model achieved a similarity loss of 0.203930 in the PPDAI contest, placing in the top 15% of the ranking.
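For readers who want to reproduce a score of this kind, the sketch below rests on two assumptions that are not confirmed here: that the ensemble is a plain average of the per-model probabilities, and that the similarity loss is binary cross-entropy (log loss). The function names and the toy numbers are hypothetical.

```python
import math
from typing import List


def average_ensemble(predictions: List[List[float]]) -> List[float]:
    """Average per-example probabilities across models (assumed ensembling scheme)."""
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


def log_loss(labels: List[int], probs: List[float], eps: float = 1e-15) -> float:
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


# Toy predictions from three hypothetical models on four question pairs.
model_preds = [
    [0.9, 0.2, 0.7, 0.1],
    [0.8, 0.3, 0.6, 0.2],
    [0.7, 0.1, 0.8, 0.3],
]
labels = [1, 0, 1, 0]

ensemble_probs = average_ensemble(model_preds)
ensemble_loss = log_loss(labels, ensemble_probs)
```

Lower values are better under this metric; a constant prediction of 0.5 gives a loss of about 0.693.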
Adams Wei Yu et al. "QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension." ICLR 2018.
Wenhui Wang et al. "Gated Self-Matching Networks for Reading Comprehension and Question Answering." ACL 2017.