Predicting-Diabetic-patient-using-Ensemble-Learning

Problem

The objective of this project is to predict whether or not a patient has diabetes, based on the diagnostic measurements in the dataset. All patients are female and at least 21 years old.

Ensemble Learning

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
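One common way to combine classifiers is hard (majority) voting. The sketch below is only an illustration of that idea: the README does not state which combination rule the notebook uses, and the prediction lists and the `majority_vote` helper are made up for the example.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by hard voting."""
    combined = []
    # zip(*...) regroups the lists so each tuple holds every model's vote for one sample
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three models for four patients (1 = diabetic, 0 = not)
knn_pred = [1, 0, 0, 1]
rf_pred = [1, 1, 0, 0]
lr_pred = [0, 1, 0, 1]

ensemble_pred = majority_vote([knn_pred, rf_pred, lr_pred])
print(ensemble_pred)
```

With an odd number of binary classifiers there are no ties, so every sample gets a well-defined majority label.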

Constituent models/algorithms

KNN

K nearest neighbors is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (e.g., distance functions).
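As a minimal sketch of the idea, the function below stores the training cases, measures Euclidean distance to a new case, and returns the most common label among the k closest ones. The feature values (glucose, BMI), the labels, and the `knn_predict` helper are invented for illustration and are not taken from the project's dataset or notebook.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority label among its k nearest training points."""
    # Pair each stored case's Euclidean distance to the query with its label, closest first
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up (glucose, BMI) pairs; 1 = diabetic, 0 = not diabetic
train_X = [(85, 22.0), (90, 24.5), (160, 33.0), (170, 35.5), (95, 23.0)]
train_y = [0, 0, 1, 1, 0]

print(knn_predict(train_X, train_y, (165, 34.0)))
```

In practice the features are usually scaled first, since a feature with a large numeric range (such as glucose here) would otherwise dominate the distance.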

Random Forest

Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks. They operate by constructing a multitude of decision trees at training time and outputting either the class that is the mode of the individual trees' predictions (classification) or the mean of their predictions (regression).
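A minimal sketch with scikit-learn's `RandomForestClassifier` might look like the following. It uses synthetic data from `make_classification` as a stand-in for the diabetes dataset, and the settings (50 trees, a 25% test split) are arbitrary assumptions rather than the notebook's values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 samples with 8 features and a binary label
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Build a forest of 50 decision trees, each trained on a bootstrap sample
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

forest_pred = forest.predict(X_test)
print(forest.score(X_test, y_test))  # accuracy on the held-out split
```

Note that scikit-learn combines the trees by averaging their predicted class probabilities rather than by a strict vote; the two usually agree, but not always.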

Logistic Regression

Logistic Regression, also known as Logit Regression or the Logit Model, is a statistical model used to estimate the probability of an event occurring, given some previous data. Logistic Regression works with binary outcomes, where the event either happens (1) or does not happen (0).
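The probability estimate comes from passing a weighted sum of the features through the logistic (sigmoid) function. The sketch below shows only that prediction step; the weights, bias, and feature values are hypothetical numbers chosen for illustration, not coefficients fitted in this project.

```python
import math


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Estimated probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients for two features (glucose, BMI)
weights = [0.03, 0.08]
bias = -6.0

p = predict_proba([150, 32.0], weights, bias)
label = 1 if p >= 0.5 else 0  # threshold the probability to get a binary prediction
print(p, label)
```

Training a logistic regression model amounts to choosing the weights and bias that best fit the labelled data; that step is left to a library such as scikit-learn.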

GridSearch

Grid search is a hyperparameter tuning process that determines the best values for a given model from a predefined set of candidates. Manually changing the hyperparameters and refitting the model on the training data every time is tedious, so grid search loops through the predefined hyperparameter combinations and fits the estimator (model) on the training set for each one.

estimator - the model whose hyperparameters are being tuned.
params - the hyperparameters of the specified estimator and the candidate values for each one. All you need to do is create a dictionary (the variable params in my code) that has the hyperparameter names as keys and, for each key, an iterable holding the options to try out.
cross validation - each hyperparameter combination is evaluated with cross-validation, and the combination that gives the best accuracy is selected. Scoring over several folds also addresses variance problems, since the result does not depend on a single train/test split.
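The three ingredients above map directly onto scikit-learn's `GridSearchCV`. The following is a minimal sketch, assuming a KNN estimator, a small hypothetical `params` grid, 5-fold cross-validation, and synthetic data in place of the diabetes dataset; the actual grid used in the notebook is not shown in this README.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: 200 samples with 8 features and a binary label
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hyperparameter names as keys, candidate values as iterables (illustrative choices)
params = {
    "n_neighbors": [3, 5, 7, 9],
    "weights": ["uniform", "distance"],
}

grid = GridSearchCV(
    estimator=KNeighborsClassifier(),
    param_grid=params,
    cv=5,                # 5-fold cross-validation for every combination
    scoring="accuracy",
)
grid.fit(X, y)

print(grid.best_params_)  # combination with the highest mean CV accuracy
print(grid.best_score_)   # that mean accuracy
```

Here the grid has 4 x 2 = 8 combinations, each fitted 5 times, so the cost grows quickly as more hyperparameters or values are added.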

ScreenShots

Distributions By Pregnancy

Distributions by Glucose Level

Distributions by Diastolic Level

Distributions by Tricep

Distribution by Insulin Level

Distribution by BMI

Distribution by Diabetic Pedigree Function

Distribution by Age

Comparison of KNN, Random Forest and Logistic Regression

Conclusion

The ensemble model scores higher than any of the individual constituent models.
