Name: DuQI
Owner: Cornell Data Science
Description: Duplicate Question Identification
Created: 2018-03-03 18:51:21.0
Updated: 2018-03-22 16:40:05.0
Pushed: 2018-03-22 16:40:04.0
Size: 6647
Language: Python
Members: Brandon Kates, Zhao Shen, Arnav Ghosh
Objective: To create a system capable of detecting duplicate questions on Q&A platforms.
We expect this approach to consolidate the knowledge available on a given question or issue in one place, and to direct users whose questions have already been answered to the existing resource.
We will test a variety of duplicate question identification methods on the Quora question pairs dataset, and hope to eventually apply our findings to the classroom Q&A platform Piazza to improve the Cornell student experience.
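To make the task concrete, here is a minimal sketch of one possible baseline for duplicate question identification: token-overlap (Jaccard) similarity with a fixed threshold. It is an illustration only and not necessarily one of the methods this project tests. The function names and the 0.5 threshold are assumptions chosen for the example.

```python
import re


def tokenize(question):
    """Lowercase a question and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", question.lower()))


def jaccard_similarity(q1, q2):
    """Return |A ∩ B| / |A ∪ B| over the token sets of two questions."""
    a, b = tokenize(q1), tokenize(q2)
    if not a and not b:
        # Two empty questions carry no evidence of being duplicates.
        return 0.0
    return len(a & b) / len(a | b)


def is_duplicate(q1, q2, threshold=0.5):
    """Flag a pair as duplicate when token overlap reaches the threshold.

    The threshold is a hypothetical default; in practice it would be
    tuned on labeled question pairs.
    """
    return jaccard_similarity(q1, q2) >= threshold
```

A baseline like this ignores word order and synonyms, so it mainly serves as a reference point against which learned models can be compared.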
Below is the data required to successfully train/run all of the models.
In the current directory (“DuQI”), create a folder named “data” and populate it with:
Final directory should look like: