CornellDataScience/NLP_Research-FA17

Name: NLP_Research-FA17

Owner: Cornell Data Science

Description: Cornell Data Science: Machine learning research project

Created: 2017-09-17 20:31:34.0

Updated: 2018-03-01 23:19:49.0

Pushed: 2018-03-01 23:19:45.0

Homepage:

Size: 637427

Language: Python

README

CDS: NLP Research Team

Team Lead: Kenta Takatsu (CS '19)
Advisor: Prof. Thorsten Joachims

About Us

We are a student-led research team from Cornell Data Science (CDS), working on Natural Language Processing projects under Prof. Thorsten Joachims. This semester, we are participating in the Yelp Dataset Challenge to provide analytic insights from raw review text. Our final products are research papers that make use of machine learning algorithms and statistical validation. You can visit the subteam sections to see our individual work.

Achievements

This past semester, our research topics ranged from recommendation systems to deep style transfer. All of them fall under Natural Language Processing, which combines machine learning with text analysis.

All projects produced notable results: a recommendation system that beats an industry-standard algorithm, an accurate analytic tool for assessing business trends, a classifier that identifies locally popular users, and a writing style transfer model built with deep learning.

Subteams
Final Submissions

You can read our final papers at the following links:

How to get the code

The code uses git submodules, so you need the --recurse-submodules option to initialize them properly. Additionally, --depth 1 skips the commit history, which makes the clone faster.

git clone --recurse-submodules --depth 1 https://github.com/CornellDataScience/Yelp-FA17.git
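
If you would rather script the checkout from Python, the following is a minimal sketch of the same steps. The helper names (build_clone_command, build_submodule_init_command, run) are hypothetical and not part of this repository. The sketch assumes git is installed and on your PATH. It also covers the case where the repository was already cloned without --recurse-submodules, in which case the submodules can be fetched afterwards.

```python
import subprocess
from typing import List, Optional

# URL taken from the clone command above.
REPO_URL = "https://github.com/CornellDataScience/Yelp-FA17.git"


def build_clone_command(url: str,
                        depth: Optional[int] = 1,
                        dest: Optional[str] = None) -> List[str]:
    """Build a `git clone` command that also initializes submodules.

    Hypothetical helper: depth=None requests a full (non-shallow) clone,
    dest optionally names the target directory.
    """
    cmd = ["git", "clone", "--recurse-submodules"]
    if depth is not None:
        cmd += ["--depth", str(depth)]
    cmd.append(url)
    if dest is not None:
        cmd.append(dest)
    return cmd


def build_submodule_init_command() -> List[str]:
    """Command to fetch submodules in a repository that was cloned without them."""
    return ["git", "submodule", "update", "--init", "--recursive"]


def run(cmd: List[str], cwd: Optional[str] = None) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, cwd=cwd, check=True)


# Example usage (performs network access, so it is left commented out):
# run(build_clone_command(REPO_URL))
# run(build_submodule_init_command(), cwd="Yelp-FA17")
```

Passing the command as a list rather than a single string avoids shell quoting issues, and check=True makes a failed clone raise an error instead of passing silently.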

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.