This project detects spam SMS messages using a Logistic Regression model. Messages are preprocessed, converted to TF-IDF features, and classified as spam or ham (legitimate), reaching an overall accuracy of 96.59%.
- Objective: To classify SMS messages as spam or ham using Logistic Regression.
- Dataset: SMS Spam Collection Dataset with 5574 messages (87% ham and 13% spam).
- Model: Logistic Regression, trained using TF-IDF vectorizer for feature extraction.
- Outcome: Achieved an overall accuracy of 96.59% in classifying messages as spam or ham.
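
The TF-IDF plus Logistic Regression setup listed above can be sketched as a scikit-learn pipeline. This is a minimal illustration, not the notebook's actual code: the toy messages, the variable names, and the hyperparameters (`lowercase=True`, `max_iter=1000`) are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Hypothetical toy messages for illustration only, NOT from the project dataset.
train_texts = [
    "win free prize now",
    "free cash offer claim now",
    "urgent claim your prize",
    "are we meeting for lunch",
    "see you at home tonight",
    "call me when you arrive",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# TF-IDF turns each message into a weighted term vector;
# Logistic Regression then learns a linear decision boundary on those vectors.
spam_pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(lowercase=True)),
    ("clf", LogisticRegression(max_iter=1000)),
])
spam_pipeline.fit(train_texts, train_labels)

# Classify previously unseen messages.
new_messages = ["claim your free prize", "see you at lunch"]
predictions = spam_pipeline.predict(new_messages)
print(list(zip(new_messages, predictions)))
```

Wrapping both steps in one `Pipeline` keeps the vectorizer fitted only on training data, which helps avoid leaking test-set vocabulary into the features.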
- `data (1).csv`: Dataset used for training and testing the model.
- `miniprojectanalysis.ipynb`: Jupyter notebook with code for data preprocessing, model training, and evaluation.
- `miniproject report.docx`: Project report summarizing the objectives, preprocessing, and results.
- `SMS Spam Detection Using Logistic Regression.pptx`: Presentation covering key points and findings of the project.
- Accuracy: 96.59%
- Precision (spam): 0.99
- Recall (spam): 0.75
- F1-score (spam): 0.86
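
Because roughly 87% of the messages are ham, accuracy alone says little about how well spam is caught, so the per-class metrics above are worth reading together. The sketch below shows how they are defined in terms of confusion-matrix counts. The function names and the example counts are hypothetical, not values from the notebook.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 for the positive (spam) class.

    tp: spam correctly flagged, fp: ham wrongly flagged, fn: spam missed.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Illustrative counts only (hypothetical, not the project's confusion matrix).
example_metrics = precision_recall_f1(tp=3, fp=1, fn=1)

# Plugging in the rounded precision and recall reported above.
f1_from_rounded = f1_from(0.99, 0.75)
print(round(f1_from_rounded, 4))
```

Using the rounded figures gives an F1 of about 0.85. The reported 0.86 is still consistent, presumably because it was computed from the unrounded precision and recall. A recall of 0.75 means about one in four spam messages is classified as ham, while a precision of 0.99 means almost no ham is flagged as spam.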
- Open the `miniprojectanalysis.ipynb` file in Jupyter Notebook to explore the data preprocessing and model training.
- The dataset (`data (1).csv`) can be loaded into the notebook for model training and evaluation.
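
Loading the dataset might look like the sketch below. The column names `label` and `message` and the helper `load_messages` are assumptions, since the CSV layout is not described here; adjust them to match the actual file. An in-memory sample stands in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd


def load_messages(source, label_col="label", text_col="message"):
    """Read a CSV of SMS messages and normalize the label column.

    Column names are hypothetical defaults; pass the real ones if they differ.
    """
    df = pd.read_csv(source)
    # Keep only the two columns of interest and drop incomplete rows.
    df = df[[label_col, text_col]].dropna().copy()
    # Normalize labels such as " Spam" to "spam".
    df[label_col] = df[label_col].str.strip().str.lower()
    return df


# Small in-memory stand-in for the dataset (hypothetical rows).
sample_csv = io.StringIO(
    "label,message\n"
    "ham,See you at lunch\n"
    " Spam,WIN a FREE prize now\n"
)
sample = load_messages(sample_csv)
print(sample["label"].value_counts())
```

In the notebook, the same helper could be called as `load_messages("data (1).csv")`, provided the column names match.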
- Python 3.x
- Pandas
- Scikit-learn
- Jupyter Notebook