This project performs sentiment analysis on Intel products using various data scraping, preprocessing, and machine learning techniques.
The primary goal of this notebook is to analyze customer reviews of Intel products and the sentiment they express. It includes the following steps:
- Data Collection: Web scraping reviews using Selenium.
- Data Preprocessing: Cleaning and preparing the text data for analysis.
- Sentiment Analysis: Using TextBlob to analyze the sentiment of the reviews.
- Machine Learning Models: Applying models such as Random Forest for sentiment classification and KMeans for clustering the reviews.
- Visualization: Generating word clouds and other visual representations of the data.
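The preprocessing and sentiment steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper names `clean_review`, `score_review`, and `label_from_polarity`, the cleaning rules, and the 0.05 threshold are all assumptions made for the example. The only library call used is TextBlob's `sentiment.polarity`, which returns a float between -1.0 and 1.0.

```python
import re

# Hypothetical cleaning rules; the notebook's own rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")
WHITESPACE_RE = re.compile(r"\s+")


def clean_review(text: str) -> str:
    """Lowercase a review, drop URLs and punctuation, collapse whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    # Digits are kept so that model names such as "i7" survive cleaning.
    text = NON_ALNUM_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def score_review(text: str) -> float:
    """Return TextBlob's polarity score for a review, in [-1.0, 1.0]."""
    # Imported here so the pure helpers work even without textblob installed.
    from textblob import TextBlob

    return TextBlob(text).sentiment.polarity


def label_from_polarity(polarity: float, threshold: float = 0.05) -> str:
    """Map a polarity score to a label; the threshold is an assumed value."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"
```

A typical use would be `label_from_polarity(score_review(clean_review(raw_text)))`, applied to each scraped review.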
The notebook is organized into sections that follow the project's workflow:
- Introduction: Brief introduction to the project.
- Web Scraping: Using Selenium to scrape review data from websites.
- Data Preprocessing: Cleaning and preparing the data, including handling missing values, text cleaning, and language detection.
- Sentiment Analysis: Applying TextBlob for sentiment scoring.
- Feature Extraction: Using TF-IDF for feature extraction from text.
- Model Training: Training machine learning models to classify the sentiment of each review.
- Evaluation and Visualization: Evaluating model performance and visualizing results using word clouds and other plots.
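As a rough sketch of how the feature extraction and model training sections fit together, the example below chains scikit-learn's `TfidfVectorizer` and `RandomForestClassifier` in a pipeline. The reviews and labels are invented toy data for illustration only, and the hyperparameters (`n_estimators=50`, `random_state=0`) are assumptions rather than the notebook's settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Invented toy data; in the notebook the labels would come from the
# sentiment analysis step rather than being written by hand.
reviews = [
    "great processor fast and reliable",
    "excellent performance for the price",
    "love this cpu runs cool and quiet",
    "terrible chip overheats constantly",
    "awful experience died after a week",
    "very slow and disappointing product",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

# TF-IDF turns each review into a sparse vector of weighted term counts,
# which the random forest then uses as input features.
model = make_pipeline(
    TfidfVectorizer(),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
model.fit(reviews, labels)

predictions = model.predict(["fast and reliable cpu", "slow and awful chip"])
print(list(predictions))
```

With real data, the reviews would normally be split into training and test sets (for example with `sklearn.model_selection.train_test_split`) before evaluating the model.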
To run this notebook, you need:
- Python 3.12 or later.
- ChromeDriver installed for Selenium to work correctly.
You also need to install the following libraries:
pip install numpy
pip install pandas
pip install selenium
pip install webdriver_manager
pip install textblob
pip install wordcloud
pip install emoji
pip install langid
pip install matplotlib
pip install scikit-learn
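After installing, a quick check like the one below can confirm that every library is importable. This is an optional sketch and not part of the notebook; `missing_modules` is a hypothetical helper. Note that the import name for scikit-learn is `sklearn`.

```python
from importlib.util import find_spec

# Import names for the libraries listed above
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = [
    "numpy",
    "pandas",
    "selenium",
    "webdriver_manager",
    "textblob",
    "wordcloud",
    "emoji",
    "langid",
    "matplotlib",
    "sklearn",
]


def missing_modules(modules: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```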