A machine learning (ML) project for predictive maintenance: (1) Bearing fault detection from CWRU vibration data — classify as normal, inner race, ball, or outer race fault; (2) RUL (Remaining Useful Life) from NASA C-MAPSS — predict cycles until turbofan engine failure. Both include training, demo, and a web dashboard.
- What is this project? (Simple overview)
- Key concepts — Glossary for beginners
- Project structure — What each file does
- Setup — Step by step
- How to run everything
- Understanding the two models
- How to interpret results
- Troubleshooting
- Learning resources
Machines with rotating parts (motors, pumps, fans) use bearings. When a bearing starts to fail, it often vibrates differently. If we can detect that change early, we can repair it before a costly breakdown.
We use machine learning to train a computer program on real vibration data from healthy and faulty bearings. Once trained, the program can:
- Classify bearing vibration as: normal, inner race fault, ball fault, or outer race fault
- Give a health score (0–100%)
- Recommend what to do: no action, monitor, or inspect immediately
- Predict RUL for turbofan engines (NASA C-MAPSS)
- Two datasets: CWRU (bearings) and NASA C-MAPSS (turbofan RUL)
- Two approaches for bearings: hand-crafted features vs. raw signal (neural networks)
- End-to-end pipeline: download → load → features → train → predict → demo
- Step-by-step plan in PLAN.md
| Term | Plain English |
|---|---|
| Machine learning (ML) | A program that learns patterns from data instead of being told every rule. We show it examples of “healthy” and “faulty” vibrations; it learns to tell them apart. |
| Predictive maintenance | Fixing things before they fail, based on signs (like unusual vibration). Contrast with “fix it when it breaks.” |
| Dataset | A collection of labeled examples. Here: vibration files + labels (normal, inner_race, ball, outer_race). |
| Training | The process of showing the model many examples so it learns. |
| Model | The learned program. We save it so we can use it later without re-training. |
| Inference / prediction | Using the trained model to classify new vibration data. |
| Feature | A number we compute from the raw signal (e.g. RMS, peak). Features summarize the signal in a way the model can use. |
| Window | A short segment of the signal (e.g. 1024 samples). We split long signals into windows and predict per window. |
| Validation / val set | Data we hold back during training to check how well the model generalizes (not used to update weights). |
| Accuracy | Fraction of predictions that are correct (e.g. 98% = 98 out of 100 right). |
| Health score | Our 0–100% measure: higher = healthier. 100% = confidently normal; 0% = confidently faulty. |
| RUL (Remaining Useful Life) | Number of cycles until failure. Regression task (NASA C-MAPSS). |
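The "Window" entry above can be illustrated with a minimal sketch. It assumes non-overlapping windows and drops the incomplete tail; the project's own windowing code may use overlap or a different size, and `split_into_windows` is a hypothetical name, not a function from `src/`.

```python
def split_into_windows(signal, window_size=1024):
    """Split a 1D signal into non-overlapping windows, dropping the incomplete tail."""
    n_windows = len(signal) // window_size
    return [signal[i * window_size:(i + 1) * window_size] for i in range(n_windows)]


# Placeholder signal of 5000 samples (not real CWRU data)
signal = [0.0] * 5000
windows = split_into_windows(signal)
print(len(windows), len(windows[0]))  # 4 1024
```

Each window would then be classified on its own, so a long recording yields many predictions.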
| Term | Meaning |
|---|---|
| Vibration signal | A sequence of numbers: amplitude at each time step. Recorded by an accelerometer. |
| Sample rate (Hz) | How many numbers per second (e.g. 12,000 Hz = 12,000 samples/sec). |
| RMS (Root Mean Square) | A measure of signal strength / energy. |
| FFT (Fast Fourier Transform) | Converts time signal → frequency spectrum. Faults often show up as extra frequencies. |
| Wavelet | A way to decompose a signal into different frequency bands (like a multi-scale FFT). |
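To make RMS and FFT concrete, here is a small sketch on a synthetic signal. The 100 Hz tone and the one-second duration are made up for illustration; only the 12,000 Hz sample rate comes from the table above.

```python
import numpy as np

sample_rate = 12_000                      # Hz, as in the example above
t = np.arange(sample_rate) / sample_rate  # one second of timestamps
signal = np.sin(2 * np.pi * 100 * t)      # synthetic 100 Hz tone, not real bearing data

# RMS: square, average, square root
rms = np.sqrt(np.mean(signal ** 2))

# FFT: magnitude spectrum and the frequency of each bin
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)
dominant_hz = freqs[np.argmax(spectrum)]

print(f"RMS: {rms:.3f}, dominant frequency: {dominant_hz:.0f} Hz")
```

A pure sine of amplitude 1 has an RMS of about 0.707, and the spectrum peaks at the tone's frequency. A faulty bearing would typically add further peaks to such a spectrum.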
| Term | Meaning |
|---|---|
| Feature-based model | We compute features (RMS, peak, etc.) by hand, then a simple neural net classifies from those numbers. |
| Raw model (1D-CNN / LSTM) | The neural net sees the raw vibration window directly and learns its own internal features. |
| Dense layer | A layer where every input connects to every output (fully connected). |
| 1D-CNN | Convolutional Neural Network for 1D (time-series) data. Good at finding local patterns. |
| LSTM | Recurrent network that can remember patterns over time. |
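The "local patterns" idea behind a 1D-CNN can be sketched without any deep learning library. The function below slides a small kernel over a signal and takes a dot product at each position, which is the core operation of a convolutional layer (without kernel flipping, as is usual in CNNs). The kernel here is hand-picked for illustration; in a trained network the kernel values are learned.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal and return the dot product at each position."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


signal = [0, 0, 0, 1, 1, 1, 0, 0]
edge_kernel = [-1, 1]  # hand-picked: +1 at a step up, -1 at a step down
response = conv1d_valid(signal, edge_kernel)
print(response)  # [0, 0, 1, 0, 0, -1, 0]
```

The response is non-zero only where the signal changes, regardless of where in the window that change occurs.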
ml/
├── data/ # Datasets
│ ├── *.mat # CWRU bearing vibration (Normal_0, IR007_0, etc.)
│ └── cmapss/ # NASA C-MAPSS FD001 (train_FD001.txt, test_FD001.txt, RUL_FD001.txt)
│
├── models/ # Saved trained models (created by training)
│ ├── fault_classifier.keras # Feature-based bearing model
│ ├── fault_classifier.npz # Metadata for feature model
│ ├── fault_classifier_raw.keras # Raw-signal (1D-CNN) bearing model
│ ├── fault_classifier_raw.npz # Metadata for raw model
│ ├── rul_predictor.keras # RUL (Remaining Useful Life) LSTM
│ └── rul_predictor.npz # Metadata for RUL model
│
├── notebooks/ # Jupyter notebooks
│ ├── exploration.ipynb # Data exploration, plots, FFT, windowing
│ ├── model_comparison.ipynb # Compare all models (Phase 8.2)
│ └── training_curves*.png # Loss/accuracy curves (generated by training)
│
├── tests/ # pytest unit tests (Phase 8.1)
│ ├── test_load_data.py
│ ├── test_feature_engineering.py
│ ├── test_load_cmapss.py
│ └── test_predict.py
│
├── src/ # Core Python code
│ ├── load_data.py # Load .mat files, return (signal, sample_rate, rpm)
│ ├── feature_engineering.py # Extract features (RMS, peak, FFT, wavelet), build datasets
│ ├── train_model.py # Feature-based Dense model training
│ ├── raw_model.py # 1D-CNN and LSTM for raw bearing windows
│ ├── load_cmapss.py # NASA C-MAPSS FD001 loading, RUL labels, sequences
│ ├── rul_model.py # LSTM for RUL regression
│ └── predict.py # Inference: bearing fault + RUL
│
├── scripts/ # Run these from the command line
│ ├── download_cwru.py # Download CWRU .mat files into data/
│ ├── verify_data.py # Plot a sample signal, sanity check
│ ├── train.py # Train feature-based model
│ ├── train_raw.py # Train 1D-CNN or LSTM on raw windows
│ ├── demo.py # Quick demo: predict from sample, print health score
│ ├── demo_raw.py # Demo with raw model (pass .mat path)
│ ├── run_predict.py # Run prediction from features or .mat file
│ ├── run_features.py # Run feature engineering, print dataset stats
│ ├── download_cmapss.py # Download NASA C-MAPSS FD001–004
│ ├── run_all.py # Full pipeline: download → train → demo (Phase 9.1)
│ ├── train_rul.py # Train RUL predictor (LSTM)
│ ├── demo_rul.py # Demo RUL prediction on test engines
│ └── dashboard.py # Streamlit web UI: bearing fault + RUL
│
├── requirements.txt # Python dependencies
├── PLAN.md # Step-by-step implementation roadmap
├── CHANGELOG.md # Version history
├── README.md # This file
└── LICENSE # MIT
CWRU (bearings):
- Source: CWRU Bearing Data Center
- Content: Vibration recordings. `Normal_0.mat` = healthy; `IR007_0.mat` = inner race fault; etc.
NASA C-MAPSS (RUL):
- Source: NASA Prognostics / GitHub mirror
- Content: Turbofan engine sensor data (unit, cycle, 3 op settings, 21 sensors). Train = run-to-failure; test = ends before failure with true RUL.
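As a rough sketch of how the column layout above maps to RUL labels: because each training engine runs to failure, a common labeling is RUL = (last cycle of that engine) minus (current cycle). This is an assumption about the approach, not a copy of `src/load_cmapss.py`; `parse_row` and `add_rul_labels` are hypothetical names and the sample values are placeholders.

```python
from collections import defaultdict


def parse_row(line):
    """Parse one whitespace-separated row: unit, cycle, 3 op settings, 21 sensors."""
    parts = line.split()
    unit, cycle = int(parts[0]), int(parts[1])
    op_settings = [float(x) for x in parts[2:5]]
    sensors = [float(x) for x in parts[5:26]]
    return unit, cycle, op_settings, sensors


def add_rul_labels(rows):
    """rows: (unit, cycle) pairs from run-to-failure data -> (unit, cycle, rul)."""
    last_cycle = defaultdict(int)
    for unit, cycle in rows:
        last_cycle[unit] = max(last_cycle[unit], cycle)
    return [(unit, cycle, last_cycle[unit] - cycle) for unit, cycle in rows]


# Placeholder row with 26 columns (values are not real sensor readings)
sample_line = "1 1 " + " ".join(["0.0"] * 24)
unit, cycle, op_settings, sensors = parse_row(sample_line)

# Engine 1 fails after cycle 3, engine 2 after cycle 2
labeled = add_rul_labels([(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)])
print(labeled)  # [(1, 1, 2), (1, 2, 1), (1, 3, 0), (2, 1, 1), (2, 2, 0)]
```

For the test set the recordings stop before failure, so the true RUL has to come from the separate RUL file instead.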
- Python 3.10+ (check: `python3 --version`)
- pip (usually comes with Python)
A virtual environment keeps this project’s packages separate from others.
cd /path/to/ml
python3 -m venv venv
- Linux/macOS: `source venv/bin/activate`
- Windows: `venv\Scripts\activate`
You should see (venv) in your terminal prompt.
pip install -r requirements.txt
This installs: numpy, pandas, scipy, matplotlib, TensorFlow, PyWavelets, scikit-learn, Streamlit, Jupyter.
python scripts/download_cwru.py
This fetches CWRU .mat files into data/. You need this before training.
python scripts/verify_data.py
Plots a short segment of a sample file. Confirms data loaded correctly.
# Activate venv first
source venv/bin/activate # or venv\Scripts\activate on Windows
# 1. Download data (if not done)
python scripts/download_cwru.py
# 2. Train the feature-based model
python scripts/train.py
# 3. Demo: predict from a sample in the dataset
python scripts/demo.py
# 4. (Optional) Train raw model
python scripts/train_raw.py --arch 1dcnn
# 5. (Optional) Demo with raw model
python scripts/demo_raw.py data/IR007_0.mat
# 6. RUL prediction (NASA C-MAPSS)
python scripts/download_cmapss.py
python scripts/train_rul.py
python scripts/demo_rul.py
# 7. Web dashboard
streamlit run scripts/dashboard.py

| Command | What it does | Output |
|---|---|---|
| `python scripts/download_cwru.py` | Downloads CWRU .mat files | Files in data/ |
| `python scripts/verify_data.py` | Loads one file, plots a segment | Plot (optional) |
| `python scripts/train.py` | Builds features, trains Dense model, saves it | models/fault_classifier.keras + .npz, confusion matrix in terminal |
| `python scripts/train_raw.py --arch 1dcnn` | Builds raw windows, trains 1D-CNN, saves it | models/fault_classifier_raw.keras + .npz |
| `python scripts/demo.py` | Uses feature model on first sample in dataset | Prints predicted class, health score, recommendation |
| `python scripts/run_predict.py` | Same, with optional .mat path | With path: `run_predict.py data/Normal_0.mat` |
| `python scripts/demo_raw.py data/IR007_0.mat` | Uses raw model on a specific .mat file | Same output format |
| `python scripts/download_cmapss.py` | Downloads NASA C-MAPSS FD001 | data/cmapss/*.txt |
| `python scripts/train_rul.py [--fd 1\|2]` | Trains RUL (FD001 or FD002) | rul_predictor_fd00N.keras |
| `python scripts/demo_rul.py [--fd 2]` | RUL demo on test engines | Predicted vs true RUL table |
| `streamlit run scripts/dashboard.py` | Starts web app | Open http://localhost:8501 in browser |
| `python -m pytest tests/ -v` | Runs unit tests (Phase 8.1) | 14+ tests for load, features, predict, cmapss |
| `python scripts/run_all.py` | Full pipeline (download, train, demo) | Use --quick for fast run, --skip-rul etc. |
# Feature model
python scripts/train.py --epochs 100 --batch-size 32 --no-class-weights
# Raw model
python scripts/train_raw.py --arch lstm --epochs 40 --batch-size 64
- Input: 9 numbers per window: RMS, peak, mean, std, kurtosis, spectral centroid, spectral bandwidth, wavelet_energy_d1, wavelet_energy_a1.
- Architecture: A few Dense (fully connected) layers.
- Training: ~50 epochs, early stopping, optional class weights.
- Output: Probabilities for normal, inner_race, ball, outer_race.
Pros: Fast, interpretable (you see which features matter).
Cons: We design features by hand; may miss patterns.
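The five time-domain features in the list above can be sketched in plain Python. This is an illustration under assumptions, not the code in `src/feature_engineering.py`: it uses population variance and Pearson kurtosis (a normal distribution gives 3), whereas the project may use a different convention such as excess kurtosis. The spectral and wavelet features are omitted.

```python
import math


def time_domain_features(window):
    """Compute RMS, peak, mean, std, and kurtosis for one window of samples."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n  # population variance
    std = math.sqrt(var)
    rms = math.sqrt(sum(x * x for x in window) / n)
    peak = max(abs(x) for x in window)
    # Pearson kurtosis: fourth central moment / variance^2; 0.0 for a flat window
    fourth_moment = sum((x - mean) ** 4 for x in window) / n
    kurtosis = fourth_moment / var ** 2 if var > 0 else 0.0
    return {"rms": rms, "peak": peak, "mean": mean, "std": std, "kurtosis": kurtosis}


# Tiny made-up window; a real window would hold 1024 samples
window = [1.0, -1.0, 1.0, -1.0]
features = time_domain_features(window)
print(features)
```

Kurtosis is often included for bearing data because impulsive faults tend to produce sharp spikes, which raise it well above the value for smooth vibration.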
- Input: Raw 1024-sample windows (no hand-crafted features).
- Architecture: Convolutions (1D-CNN) or LSTM layers.
- Training: Same data, different representation.
- Output: Same 4-class probabilities.
Pros: Can learn complex patterns from raw data.
Cons: Less interpretable, more compute.
Predicted: inner_race
Confidence: 99.5%
Health score: 0%
Recommendation: Maintenance required — inspect immediately
All class probabilities: {'normal': 0.001, 'inner_race': 0.995, 'ball': 0.002, 'outer_race': 0.002}
- Predicted: The class with highest probability.
- Confidence: That probability.
- Health score: 0–100%. 100% = confidently normal; 0% = confidently faulty.
- Recommendation: Based on health score thresholds (see `scripts/demo.py` or `scripts/dashboard.py`).
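One way to picture how these four fields relate is the sketch below. It assumes the health score is simply the probability of `normal` scaled to 0–100, which is consistent with the sample output above but not confirmed by it. The thresholds of 80 and 50 are hypothetical placeholders; the real values live in the scripts mentioned above.

```python
def interpret(probabilities, monitor_below=80.0, inspect_below=50.0):
    """Turn class probabilities into (predicted, confidence, health score, recommendation).

    Assumption: health score = 100 * P(normal). Thresholds are hypothetical.
    """
    predicted = max(probabilities, key=probabilities.get)
    confidence = probabilities[predicted]
    health_score = 100.0 * probabilities["normal"]
    if health_score < inspect_below:
        recommendation = "inspect immediately"
    elif health_score < monitor_below:
        recommendation = "monitor"
    else:
        recommendation = "no action"
    return predicted, confidence, health_score, recommendation


probs = {"normal": 0.001, "inner_race": 0.995, "ball": 0.002, "outer_race": 0.002}
predicted, confidence, health, recommendation = interpret(probs)
print(f"Predicted: {predicted}, health score: {health:.0f}%, action: {recommendation}")
```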
- Accuracy: Fraction of correct predictions.
- Confusion matrix: Rows = true class, columns = predicted. Diagonal = correct.
- Recall: For each class, the fraction of samples that truly belong to it that the model labeled correctly.
- Precision: For each class, the fraction of samples the model assigned to it that truly belong to it.
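The metrics above can be computed by hand from a confusion matrix. The sketch below uses two classes and made-up labels to keep the numbers easy to check; the training script prints its own confusion matrix, and this is not its code.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Rows = true class, columns = predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def precision_recall(matrix, i):
    """Precision and recall for the class at row/column i."""
    tp = matrix[i][i]
    predicted_as_i = sum(row[i] for row in matrix)  # column sum
    actually_i = sum(matrix[i])                     # row sum
    precision = tp / predicted_as_i if predicted_as_i else 0.0
    recall = tp / actually_i if actually_i else 0.0
    return precision, recall


classes = ["normal", "ball"]
y_true = ["normal", "normal", "ball", "ball", "ball"]  # made-up labels
y_pred = ["normal", "ball", "ball", "ball", "normal"]

cm = confusion_matrix(y_true, y_pred, classes)
accuracy = sum(cm[i][i] for i in range(len(classes))) / len(y_true)
ball_precision, ball_recall = precision_recall(cm, classes.index("ball"))
print(cm, accuracy)  # [[1, 1], [1, 2]] 0.6
```

Here the diagonal holds 3 of 5 samples, so accuracy is 0.6, and `ball` has precision and recall of 2/3 each.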
| Problem | What to do |
|---|---|
| `ModuleNotFoundError: No module named 'src'` | Run from project root: `cd /path/to/ml` before `python scripts/...` |
| `FileNotFoundError: Model not found` | Run training first: `python scripts/train.py` (or `train_raw.py` for raw model). |
| `No such file or directory: data/` | Run `python scripts/download_cwru.py` first. |
| CUDA / GPU messages | Safe to ignore if you don’t have a GPU; training runs on CPU. |
| Low accuracy | Try more epochs, check that data downloaded correctly, try both models. |
- TensorFlow / Keras: keras.io — Sequential API used here
- CWRU dataset: Bearing Data Center
- Predictive maintenance: AWS – What is predictive maintenance?
| Model | Metric | Input |
|---|---|---|
| Feature-based (Dense) | ~99–100% val accuracy | 9 features (RMS, peak, FFT, wavelet) |
| Raw-signal (1D-CNN) | ~99.9% val accuracy | 1024-sample windows |
| RUL predictor (LSTM) | Val RMSE ~15–30 cycles | NASA C-MAPSS FD001/FD002, 30-cycle windows |
Bearing classes: normal, inner_race, ball, outer_race
RUL: Remaining Useful Life in cycles (regression)
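For reference, the RMSE reported for the RUL predictor is the square root of the mean squared difference between true and predicted cycles. A minimal sketch with made-up numbers (not results from this project):

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


true_rul = [112, 98, 69]   # made-up true RUL values, in cycles
pred_rul = [100, 110, 69]  # made-up predictions
error = rmse(true_rul, pred_rul)
print(f"RMSE: {error:.2f} cycles")  # RMSE: 9.80 cycles
```

Because errors are squared before averaging, a few large misses raise RMSE more than many small ones.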
- Phases 1–6: Done (data, features, feature model, raw model, polish, dashboard)
- Phase 7.1: Extra features (FFT, wavelet) — Done
- Phase 7.3: Class weights for imbalanced data — Done
- Phase 7.4: Web dashboard — Done
- Phase 7.2: NASA C-MAPSS RUL — Done
- Phase 8.1: Unit tests (pytest) — Done
- Phase 8.2: Model comparison notebook — Done
- Phase 8.3: RUL FD002 (multi op condition) — Done
- Phase 9.1: run_all.py (full pipeline) — Done
- Phase 9.2: RUL FD003/FD004 — Done
Full roadmap: PLAN.md
cd /path/to/ml
source venv/bin/activate # or venv\Scripts\activate on Windows
# Option A: Full pipeline (one command)
python scripts/run_all.py
python scripts/run_all.py --quick # Fewer epochs
python scripts/run_all.py --skip-rul # Bearing only
# Option B: Step by step
python scripts/download_cwru.py
python scripts/train.py
python scripts/demo.py
python scripts/download_cmapss.py
python scripts/train_rul.py --fd 1
python scripts/demo_rul.py --fd 2
streamlit run scripts/dashboard.py
MIT license; see LICENSE. Dataset usage follows CWRU terms.