NLP Classification and Sentiment Analysis
The project requires building NLP classifiers and a sentiment analysis following the guidelines below. I have attached the R Markdown file and the input files required for the analysis. Please use Python for the steps below if possible. The 'cleaned_subtitles' file is the dataset for the analysis, and the 'movie reviews' file is what you should use for sentiment analysis.
Classification:
- Pick several nouns and/or verbs from your previous assignment. Create a column in the dataframe that indicates if that line from the movie/TV show includes that word or does not include that word. You can use 0 and 1 or any labels that make sense to you. Remember, we covered regular expression detection and deletion in the raw text assignments!
- Once you have created this column, use string replacement to delete that word from your subtitles. We will take the word out to see if we can predict when it is used - if you leave it in, it's a perfect predictor!
- Use *two* feature extraction methods and *two* machine learning algorithms to determine if you can predict when your noun or verb will be used. You should include four different classification reports below.
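The three classification steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample lines, the `text` column name, and the target word `love` are made up (the real 'cleaned_subtitles' file may be laid out differently), and bag-of-words/TF-IDF with naive Bayes/logistic regression are just one possible choice of two feature extraction methods and two algorithms.

```python
import re

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stand-in for 'cleaned_subtitles'; the real column names may differ.
df = pd.DataFrame({"text": [
    "i love this old house", "we love the sea", "they love to sing loudly",
    "you love your dog", "she will love the gift", "people love good music",
    "he said he would love to come", "kids love ice cream",
    "the car is parked outside", "close the door please",
    "where did you go last night", "it rained all day",
    "bring me the keys", "the train leaves at noon",
    "nobody answered the phone", "turn off the lights",
]})

TARGET = "love"  # assumed target word; substitute your own noun or verb
pattern = re.compile(rf"\b{re.escape(TARGET)}\b", flags=re.IGNORECASE)

# Step 1: label each line 1 if it contains the target word, else 0.
df["has_word"] = df["text"].apply(lambda s: int(bool(pattern.search(s))))

# Step 2: delete the target word so it cannot act as a perfect predictor,
# then collapse the leftover whitespace.
df["clean_text"] = df["text"].apply(lambda s: " ".join(pattern.sub(" ", s).split()))

# Step 3: two feature extraction methods x two algorithms = four reports.
X_train, X_test, y_train, y_test = train_test_split(
    df["clean_text"], df["has_word"],
    test_size=0.25, stratify=df["has_word"], random_state=42,
)

vectorizers = {"bag-of-words": CountVectorizer, "tf-idf": TfidfVectorizer}
models = {"naive Bayes": MultinomialNB, "logistic regression": LogisticRegression}

reports = {}
for vec_name, Vectorizer in vectorizers.items():
    vec = Vectorizer()
    train_features = vec.fit_transform(X_train)  # fit the vocabulary on training lines only
    test_features = vec.transform(X_test)
    for model_name, Model in models.items():
        clf = Model().fit(train_features, y_train)
        report = classification_report(
            y_test, clf.predict(test_features), zero_division=0
        )
        reports[(vec_name, model_name)] = report
        print(f"--- {vec_name} + {model_name} ---")
        print(report)
```

With real subtitles the classes will usually be imbalanced (most lines will not contain the word), so the per-class precision and recall in each report are likely to be more informative than overall accuracy.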
Sentiment:
- Use *one* of the unsupervised lexicon techniques to create sentiment scores for your movie/TV show.
- What is the overall sentiment of your movie/TV show? How would you interpret the scores provided?
- Using the movie reviews mini dataset provided online, create a sentiment tagging model (one feature extraction method + one algorithm).
- With this new model, create sentiment scores for your movie/TV show.
- What is the overall sentiment using the new model of sentiment tagging? How would you interpret the scores provided?
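A minimal sketch of the two sentiment approaches above, under loud assumptions: the subtitle lines, the tiny lexicon, and the handful of labelled reviews are all invented for illustration. In practice the lexicon would come from an established resource such as AFINN or VADER, and the training data from the 'movie reviews' file. TF-IDF with logistic regression is just one possible feature extraction method and algorithm pairing.

```python
import re
from statistics import mean

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical subtitle lines standing in for 'cleaned_subtitles'.
subtitles = [
    "what a wonderful happy day",
    "this is a terrible awful mess",
    "the train leaves at noon",
    "i love this great place",
]

# Part 1: unsupervised lexicon scoring.
# TOY_LEXICON is made up for this sketch; it is not a real published lexicon.
TOY_LEXICON = {
    "wonderful": 3, "happy": 2, "great": 2, "love": 3,
    "terrible": -3, "awful": -3, "mess": -1,
}

def lexicon_score(text):
    """Sum the valence of every token found in the lexicon (unknown words count as 0)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(TOY_LEXICON.get(tok, 0) for tok in tokens)

lexicon_scores = [lexicon_score(line) for line in subtitles]
overall_lexicon = mean(lexicon_scores)  # above 0 leans positive, below 0 leans negative
print("lexicon scores:", lexicon_scores, "overall:", overall_lexicon)

# Part 2: supervised sentiment tagging model.
# These labelled reviews are invented placeholders for the 'movie reviews' dataset.
reviews = [
    "a wonderful film with great acting", "i love this happy story",
    "great fun and a wonderful cast", "a happy ending that i love",
    "a terrible plot and awful acting", "this movie is an awful mess",
    "terrible pacing and a boring script", "an awful waste of time",
]
labels = ["positive"] * 4 + ["negative"] * 4

# One feature extraction method (TF-IDF) + one algorithm (logistic regression).
tagger = make_pipeline(TfidfVectorizer(), LogisticRegression())
tagger.fit(reviews, labels)

# Apply the trained tagger to the subtitle lines and summarise.
predicted = tagger.predict(subtitles)
share_positive = mean(int(p == "positive") for p in predicted)
print("predicted labels:", list(predicted))
print("share of lines tagged positive:", share_positive)
```

The two summaries are on different scales: the lexicon average is a signed valence per line, while the model output is the fraction of lines tagged positive, so they are best interpreted side by side rather than compared number to number. A model trained on reviews may also transfer imperfectly to dialogue, which is worth keeping in mind when interpreting the second set of scores.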