📊 Defects Prediction System

This project builds a machine learning pipeline that predicts the number of product defects in a manufacturing setup from features such as product ID, product type, shift, and operator experience level. The model is a Random Forest Regressor; the script reports R² and RMSE and plots feature importance, actual versus predicted defects, and defects over time.

📁 Dataset

Path: ./datasets/aggressive_defects_dataset.csv

📌 Columns:

- Date: Date of production
- Product_ID: Identifier for the product
- Product_Type: Category/type of the product
- Shift: Production shift (e.g., Morning, Evening)
- Operator_Experience_Level: Operator’s experience level
- Defects: Number of defects observed (target variable)

🛠 Features

✅ Label encoding for categorical features
✅ Model training using RandomForestRegressor
✅ Evaluation using R² Score and RMSE
✅ Visualizations for feature importance, prediction accuracy, and defect trends
✅ Model & encoders saved using joblib
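
The encoding, training, and evaluation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it uses a synthetic DataFrame in place of the real CSV, and the category values, split ratio, and hyperparameters are assumptions chosen only to make the example run.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Synthetic stand-in for the real CSV (all values are made up for illustration).
rng = np.random.default_rng(42)
n_rows = 200
df = pd.DataFrame({
    "Product_ID": rng.choice(["P101", "P102", "P103"], n_rows),
    "Product_Type": rng.choice(["A", "B"], n_rows),
    "Shift": rng.choice(["Morning", "Evening", "Night"], n_rows),
    "Operator_Experience_Level": rng.choice(["Beginner", "Intermediate", "Expert"], n_rows),
    "Defects": rng.poisson(5, n_rows),
})

categorical_cols = ["Product_ID", "Product_Type", "Shift", "Operator_Experience_Level"]

# Fit one LabelEncoder per categorical column and keep it for later inference.
encoders = {}
for col in categorical_cols:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col])

X = df[categorical_cols]
y = df["Defects"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate on the held-out rows; RMSE is the square root of the mean squared error.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(f"R2: {r2:.3f}  RMSE: {rmse:.3f}")
```

Because the synthetic target is random noise, the R² here will be close to zero or negative; the point is only the shape of the pipeline.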

📦 Installation

Install all required dependencies using:

pip install -r requirements.txt

🚀 How to Run

Place your dataset at:

./datasets/aggressive_defects_dataset.csv

Run the script:
```
python defects_predictor.py
```
The script will:
- Train and evaluate a model
- Save defect_regressor.pkl and label_encoders.pkl
- Show feature importance and prediction plots
- Plot the defect trend over time
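
As a rough sketch of the save step, the snippet below persists a model and an encoder dictionary with joblib and reloads them. The tiny training set is made up, and the files are written to a temporary directory rather than wherever the real script puts them; only the file names come from this README.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import LabelEncoder

# Tiny made-up training set, only to have something to persist.
shift_encoder = LabelEncoder().fit(["Morning", "Evening", "Night"])
X = shift_encoder.transform(["Morning", "Evening", "Night", "Morning"]).reshape(-1, 1)
y = np.array([2.0, 4.0, 7.0, 3.0])
model = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)
encoders = {"Shift": shift_encoder}

# Write both artifacts under the README's file names (in a temp directory here).
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "defect_regressor.pkl")
encoders_path = os.path.join(out_dir, "label_encoders.pkl")
joblib.dump(model, model_path)
joblib.dump(encoders, encoders_path)

# Reload and confirm the round trip gives identical predictions.
reloaded_model = joblib.load(model_path)
reloaded_encoders = joblib.load(encoders_path)
same_predictions = np.allclose(model.predict(X), reloaded_model.predict(X))
print("Round trip OK:", same_predictions)
```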

📈 Visualizations

🔹 Top 10 Feature Importances

Shows the most influential features for defect prediction.
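
One way such a ranking could be computed is from the fitted forest's `feature_importances_` attribute. The sketch below uses made-up, already-encoded features in which the target depends mostly on `Shift`; the data and hyperparameters are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Made-up encoded features; the target depends mostly on "Shift".
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "Product_ID": rng.integers(0, 5, 300),
    "Product_Type": rng.integers(0, 2, 300),
    "Shift": rng.integers(0, 3, 300),
    "Operator_Experience_Level": rng.integers(0, 3, 300),
})
y = 3 * X["Shift"] + rng.normal(0, 0.1, 300)

model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# feature_importances_ is aligned with the training columns.
importances = pd.Series(model.feature_importances_, index=X.columns)
top_features = importances.sort_values(ascending=False).head(10)
print(top_features)
```

With only four features, `head(10)` simply returns all of them; the resulting Series can be handed to a bar plot.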

🔹 Actual vs Predicted Defects

Compares model predictions with real defect counts.

🔹 Defects Over Time

Line plot to monitor defect trends chronologically.
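
The data behind a chronological trend line could be prepared along these lines. The rows are made up, and summing defects per day is an assumption; the real script may aggregate differently or plot raw rows.

```python
import pandas as pd

# Made-up rows in the shape of the dataset's Date and Defects columns.
df = pd.DataFrame({
    "Date": ["2024-01-02", "2024-01-01", "2024-01-01", "2024-01-03"],
    "Defects": [4, 2, 3, 1],
})

# Parse dates, then sum defects per day in chronological order.
df["Date"] = pd.to_datetime(df["Date"])
daily_defects = df.groupby("Date")["Defects"].sum().sort_index()
print(daily_defects)
```

With matplotlib installed, `daily_defects.plot()` would draw the series as a line chart.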

🧪 Model Files

- defect_regressor.pkl: Trained Random Forest model
- label_encoders.pkl: Saved encoders for categorical variables

🔍 Predict from CLI

You can later load the model and encoders to predict from new data:

import joblib
import pandas as pd

# Load the trained model and the fitted label encoders
model = joblib.load("defect_regressor.pkl")
encoders = joblib.load("label_encoders.pkl")

# Sample input (replace with your values); keys must match the training features
input_dict = {
    "Product_ID": "P123",
    "Product_Type": "A",
    "Shift": "Night",
    "Operator_Experience_Level": "Intermediate",
    "Machine_usage_hour": 15,
}

# Encode only the columns that have a saved encoder; numeric values pass through
for col, value in input_dict.items():
    if col in encoders:
        input_dict[col] = encoders[col].transform([value])[0]

# Predict
X_input = pd.DataFrame([input_dict])
predicted_defects = model.predict(X_input)[0]
print(f"Predicted Defects: {predicted_defects:.2f}")
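
A `LabelEncoder` raises a `ValueError` when asked to transform a value it did not see during fitting, so new inputs with an unknown shift or product ID will fail at the encoding step. The helper below is a hypothetical addition (`encode_value` is not part of this project) that checks the encoder's `classes_` first and reports the known values.

```python
from sklearn.preprocessing import LabelEncoder


def encode_value(encoder: LabelEncoder, column: str, value: str) -> int:
    """Encode one value, failing with a readable message if it was never seen in training."""
    if value not in encoder.classes_:
        known = ", ".join(map(str, encoder.classes_))
        raise ValueError(f"Unknown {column} value {value!r}; known values: {known}")
    return int(encoder.transform([value])[0])


# Made-up encoder standing in for one entry of label_encoders.pkl.
shift_encoder = LabelEncoder().fit(["Morning", "Evening", "Night"])

print(encode_value(shift_encoder, "Shift", "Night"))
try:
    encode_value(shift_encoder, "Shift", "Weekend")
except ValueError as err:
    print(err)
```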

📤 Future Improvements

- Add a web dashboard using Flask or Streamlit
- Hyperparameter tuning with GridSearchCV
- Integration with real-time factory data sources
- Support for more advanced models (XGBoost, CatBoost)
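
As a rough idea of what the GridSearchCV item might look like, the sketch below searches a small grid for a Random Forest on made-up data. The grid values, fold count, and scoring choice are assumptions for illustration, not tuned recommendations.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Made-up encoded features and target, only to exercise the search.
rng = np.random.default_rng(1)
X = rng.integers(0, 4, size=(120, 4))
y = 2 * X[:, 0] + rng.normal(0, 0.5, 120)

# Small illustrative grid (assumed values).
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [3, None],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=1),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)

# The scorer returns negated RMSE, so flip the sign for reporting.
print("Best params:", search.best_params_)
print("Best CV RMSE:", -search.best_score_)
```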

👨‍💻 Author

Raj Aryan 🎓 B.Tech | RNSIT 🔗 LinkedIn 🔗 GitHub

📝 License

This project is open-source and free to use under the MIT License.
