A Machine Learning API for real-time sentiment analysis using scikit-learn and FastAPI. This project demonstrates core NLP techniques, robust engineering practices, and automated CI/CD with GitHub Actions.
- Train a text classification model to predict sentiment (positive, negative, neutral).
- Automate linting, testing, and deployment with GitHub Actions.
Uses the IMDB Movie Reviews Dataset or any dataset of labeled text.
Example features:
- `review_text`
- `label` (positive, negative)
- Converts raw text to numeric features using `CountVectorizer` or `TfidfVectorizer`.
- Uses Logistic Regression or Support Vector Machine (SVM) for classification.
- Evaluates on a held-out test split, reporting accuracy and a classification report.
- Saves the trained model and vectorizer with `joblib`.
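
The steps above can be sketched end to end as follows. This is a minimal illustration, not the project's actual training script: the toy `texts` and `labels`, the choice of `TfidfVectorizer` with Logistic Regression, the split parameters, and the artifact file names are all assumptions made for the example. A real run would load the IMDB reviews instead of the inline data.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Toy stand-in data (hypothetical); replace with the real labeled reviews.
texts = [
    "a wonderful and moving film",
    "great acting and a great story",
    "I loved every minute of it",
    "brilliant, funny and heartfelt",
    "an excellent cast and script",
    "one of the best movies this year",
    "a boring and predictable plot",
    "terrible acting and a weak script",
    "I hated the ending",
    "dull, slow and far too long",
    "an awful waste of time",
    "one of the worst movies this year",
]
labels = ["positive"] * 6 + ["negative"] * 6

# Hold out part of the data for evaluation, keeping the class balance.
X_train_text, X_test_text, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

# Fit the vectorizer on the training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate on the held-out split.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.2f}")
print(classification_report(y_test, y_pred, zero_division=0))

# Persist both artifacts; the file names here are placeholders.
artifact_dir = tempfile.mkdtemp()
model_path = os.path.join(artifact_dir, "model.joblib")
vectorizer_path = os.path.join(artifact_dir, "vectorizer.joblib")
joblib.dump(model, model_path)
joblib.dump(vectorizer, vectorizer_path)
```

The vectorizer is saved alongside the model because the API must transform incoming text with the same vocabulary the model was trained on.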
Automates:
- Linting (`flake8`)
- Unit tests for model & API
- Coverage reporting
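
As an illustration of the kind of model unit test the workflow could run, here is a small sketch. The helper `train_tiny_model` and both test functions are hypothetical and not taken from the project's test suite; they assume a pipeline built from `CountVectorizer` and Logistic Regression. Functions named `test_*` are picked up by `pytest` when placed in a test module.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def train_tiny_model():
    """Fit a small pipeline on toy data so tests run quickly (hypothetical helper)."""
    texts = [
        "great movie",
        "loved it",
        "wonderful acting",
        "terrible movie",
        "hated it",
        "awful acting",
    ]
    labels = ["positive", "positive", "positive", "negative", "negative", "negative"]
    pipeline = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    pipeline.fit(texts, labels)
    return pipeline


def test_predict_returns_known_label():
    # The prediction must be one of the labels seen during training.
    model = train_tiny_model()
    prediction = model.predict(["what a great film"])[0]
    assert prediction in {"positive", "negative"}


def test_predict_proba_sums_to_one():
    # Class probabilities for a single input should form a valid distribution.
    model = train_tiny_model()
    probabilities = model.predict_proba(["terrible plot"])[0]
    assert abs(probabilities.sum() - 1.0) < 1e-6
```

Tests like these check the model's contract (valid labels, valid probabilities) rather than a specific accuracy, which keeps them stable when the training data changes.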
- ✅ Clean Python scripts for training and serving.
- ✅ Jupyter Notebook for exploration and training.
- ✅ CI/CD workflow with linting & tests.
- ✅ README with full instructions.
pip install -r requirements.txt