Hallucination Detection using Logistic Regression

Overview

This project applies logistic regression to binary text classification, specifically hallucination detection in text summaries. The classifier is trained on the XSum Hallucination Dataset and labels each summary as either factual or hallucinated.

Dataset

Source: XSum Hallucination Dataset
Input: summary field (text summaries)
Label: is_factual (1 = factual, 0 = hallucinated)
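
As a rough illustration of the input and label fields listed above, the sketch below pulls (text, label) pairs out of a list of records. The function name `to_examples`, the dict-per-row layout, and the rule of skipping rows whose label is not 0 or 1 are assumptions made for this example, and the two sample rows are made-up placeholders, not dataset entries.

```python
def to_examples(records):
    """Return (texts, labels) from rows that carry a usable is_factual label."""
    texts, labels = [], []
    for rec in records:
        label = rec.get("is_factual")
        # Assumption: rows with a missing or non-binary label are skipped.
        if label not in (0, 1):
            continue
        texts.append(rec["summary"])
        labels.append(int(label))
    return texts, labels


# Placeholder rows for illustration only (not taken from the dataset).
sample_records = [
    {"summary": "placeholder summary one", "is_factual": 1},
    {"summary": "placeholder summary two", "is_factual": 0},
    {"summary": "placeholder summary three", "is_factual": None},
]
texts, labels = to_examples(sample_records)
```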

Tasks

🛠 1. Data Preprocessing

Clean and preprocess text data for model training.
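
One possible way to carry out this step is sketched below: lowercase the text, keep alphanumeric tokens, and build bag-of-words count vectors. The README does not say which cleaning steps or features the project uses, so the tokenization rule, the `min_count` parameter, and all function names here are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(texts, min_count=1):
    """Map each token seen at least min_count times to a column index."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    kept = sorted(tok for tok, c in counts.items() if c >= min_count)
    return {tok: i for i, tok in enumerate(kept)}


def vectorize(text, vocab):
    """Turn one text into a bag-of-words count vector over vocab."""
    vec = [0.0] * len(vocab)
    for tok in tokenize(text):
        idx = vocab.get(tok)
        if idx is not None:  # tokens outside the vocabulary are ignored
            vec[idx] += 1.0
    return vec


# Made-up example sentences, used only to show the shapes involved.
summaries = ["The Mayor opened a new bridge.", "A new bridge was opened, a first."]
vocab = build_vocab(summaries)
X = [vectorize(s, vocab) for s in summaries]
```

Building the vocabulary from the training split only keeps test-set words from leaking into the features.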

📊 2. Model Training

Implement logistic regression from scratch (no ML libraries).
Train the model on the dataset and tune hyperparameters.
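
A minimal from-scratch sketch using only the Python standard library is shown below. It trains with full-batch gradient descent on the mean log loss, with an optional L2 penalty. The optimizer, the hyperparameters `lr`, `epochs`, and `l2`, and their default values are assumptions for illustration and may differ from what the project uses.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, w, b):
    """Probability that feature vector x belongs to class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train(X, y, lr=0.1, epochs=200, l2=0.0):
    """Fit weights w and bias b by full-batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi  # gradient of log loss w.r.t. z
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            w[j] -= lr * (grad_w[j] / n + l2 * w[j])
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    """Hard 0/1 predictions at the given probability threshold."""
    return [1 if predict_proba(x, w, b) >= threshold else 0 for x in X]


# Tiny separable toy problem, for illustration only.
X_toy = [[0.0, 1.0], [1.0, 0.0], [0.0, 2.0], [2.0, 0.0]]
y_toy = [1, 0, 1, 0]
w, b = train(X_toy, y_toy)
preds = predict(X_toy, w, b)
```

Hyperparameter tuning in this setup would mean trying several values of `lr`, `epochs`, and `l2` and comparing them on held-out data.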

📈 3. Model Evaluation

Assess performance using accuracy, precision, recall, and F1-score.
Visualize performance using a confusion matrix.
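
The four metrics all follow from the confusion-matrix counts, as the sketch below shows. Which class counts as "positive" is not stated in the README, so it is exposed as a parameter defaulting to 1 (factual). The text-table rendering is a simple stand-in for a plotted confusion matrix, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) with respect to the chosen positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1; zero-division cases return 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def format_confusion(y_true, y_pred, positive=1):
    """Render the confusion matrix as a small plain-text table."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    return ("              pred pos  pred neg\n"
            f"actual pos  {tp:>10}{fn:>10}\n"
            f"actual neg  {fp:>10}{tn:>10}")


# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
result = scores(y_true, y_pred)
print(format_confusion(y_true, y_pred))
```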

🔄 4. Cross-Validation

Implement k-fold cross-validation to ensure model robustness.
Report average accuracy and standard deviation across folds.
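
The procedure can be sketched as follows: shuffle the indices once, split them into k folds, train on k-1 folds, test on the remaining one, and summarize the per-fold accuracies. The `fit` and `predict` callables are hypothetical hooks for any model. A majority-class baseline stands in here so the example runs on its own, and the choice of population standard deviation (`statistics.pstdev`) is an assumption.

```python
import random
import statistics
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=5, seed=0):
    """Return (mean accuracy, std of accuracy) over k folds. Requires n >= k."""
    folds = k_fold_indices(len(X), k, seed)
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        preds = predict(model, [X[j] for j in test_idx])
        correct = sum(p == y[j] for p, j in zip(preds, test_idx))
        accuracies.append(correct / len(test_idx))
    return statistics.mean(accuracies), statistics.pstdev(accuracies)


# Stand-in model (assumption): always predict the most common training label.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, X_test):
    return [model for _ in X_test]


# Made-up data: ten one-feature rows that all share label 1.
X_demo = [[float(i)] for i in range(10)]
y_demo = [1] * 10
mean_acc, std_acc = cross_validate(X_demo, y_demo, fit_majority, predict_majority, k=5)
```

Swapping in the logistic regression model only requires wrapping its training and prediction functions to match the `fit(X, y)` and `predict(model, X)` signatures.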

🧐 5. Error Analysis

Identify and analyze misclassified examples.
Suggest improvements based on findings.
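
One way to start the analysis is to list the misclassified summaries, tag each with its error type, and put the most confident mistakes first, since those often point to systematic problems. The helper name, the output fields, and the sorting rule below are assumptions for illustration. `probs` is taken to be the model's predicted probability of the factual class.

```python
def misclassified(texts, y_true, probs, threshold=0.5):
    """Return wrongly classified examples, most confident errors first."""
    errors = []
    for text, label, prob in zip(texts, y_true, probs):
        pred = 1 if prob >= threshold else 0
        if pred == label:
            continue
        kind = ("predicted factual, labeled hallucinated" if pred == 1
                else "predicted hallucinated, labeled factual")
        errors.append({"text": text, "label": label,
                       "prob_factual": prob, "kind": kind})
    # Distance from the threshold is used as a simple confidence measure.
    errors.sort(key=lambda e: abs(e["prob_factual"] - threshold), reverse=True)
    return errors


# Placeholder texts and made-up probabilities, for illustration only.
demo_texts = ["summary a", "summary b", "summary c", "summary d"]
demo_labels = [1, 0, 1, 0]
demo_probs = [0.9, 0.8, 0.3, 0.1]
errors = misclassified(demo_texts, demo_labels, demo_probs)
for e in errors:
    print(f'{e["prob_factual"]:.2f}  {e["kind"]}  {e["text"]}')
```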

🚀 This project shows how logistic regression can be applied to text classification and offers insight into hallucination detection in NLP.

Files: LICENSE, README.md, hallucination_detection.ipynb, hallucination_detection.py

Vaneeza-7/Text-Hallucination-Detection-Classifier

Folders and files

Latest commit

History

Repository files navigation

Hallucination Detection using Logistic Regression

Overview

Dataset

Tasks

🛠 1. Data Preprocessing

📊 2. Model Training

📈 3. Model Evaluation

🔄 4. Cross-Validation

🧐 5. Error Analysis

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages