vinh2155/E-commerce-Model

Content

This dataset contains customer data with the following columns:

  • Email: Customer email addresses, uniquely identifying users.
  • Address: Residential address of the customers.
  • Avatar: Favorite color or theme the customer selected for their online profile.
  • Avg. Session Length: Average duration (in minutes) customers spend per session on the app.
  • Time on App: Average daily time (in minutes) spent on the app.
  • Time on Website: Average daily time (in minutes) spent on the company’s website.
  • Length of Membership: How many years the customer has been a member.
  • Yearly Amount Spent: Total money spent by the customer in a year (dependent variable).
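
Before running the analysis, it can help to confirm that the CSV header matches the columns listed above. The following is a minimal sketch using only the standard library; it assumes the header spellings in Ecommerce_Customers.csv match this list exactly, and the helper name `missing_columns` and the in-memory sample are hypothetical, for illustration only.

```python
import csv
import io

# Column names as listed in this README; the exact CSV header spelling is an assumption.
EXPECTED_COLUMNS = [
    "Email", "Address", "Avatar", "Avg. Session Length", "Time on App",
    "Time on Website", "Length of Membership", "Yearly Amount Spent",
]


def missing_columns(csv_file, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


# Hypothetical in-memory header standing in for the real file (target column left out).
sample = io.StringIO(
    "Email,Address,Avatar,Avg. Session Length,Time on App,"
    "Time on Website,Length of Membership\n"
)
print(missing_columns(sample))  # ['Yearly Amount Spent']
```

To check the real file, pass an open file handle instead, for example `missing_columns(open('Ecommerce_Customers.csv', newline=''))`.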

Purpose

The goal of this analysis is to predict the Yearly Amount Spent based on other factors such as average session length, time on app, time on website, and length of membership. By understanding the key drivers of spending, the company can better target customers, enhance the user experience, and optimize resource allocation to boost revenue and customer satisfaction.
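
To make the prediction task concrete, here is a dependency-free sketch of ordinary least squares, the technique behind linear regression. It is not the notebook's code: the notebook presumably relies on scikit-learn (it appears in the install list), and the function names, the two-feature layout, and the data below are made up for illustration. The real dataset has four numeric predictors.

```python
def fit_linear_regression(rows, targets):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares."""
    # Prepend a constant 1.0 to each row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Build the augmented normal-equation system [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(n)]
        + [sum(r[i] * y for r, y in zip(X, targets))]
        for i in range(n)
    ]
    # Solve it with Gauss-Jordan elimination and partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; the system cannot be solved")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coefficients, row):
    """Apply the fitted intercept and slopes to one row of feature values."""
    return coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], row))


# Made-up data generated from y = 10 + 2*x1 + 3*x2 (not taken from the dataset).
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
targets = [18, 17, 28, 27, 35]
coefficients = fit_linear_regression(rows, targets)
print([round(c, 6) for c in coefficients])
```

Each fitted slope estimates how much the predicted target changes per unit increase in that feature with the others held fixed, which is how a model like this can point to the key drivers of spending.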

How to Run the Project

Pick the option that best fits your comfort level. Both non-technical and technical paths produce the same results.

Option A — No installation (Recommended for non-technical users, uses Google Colab)

  1. Download this project as a ZIP:
    • On this repository page, click the green “Code” button > “Download ZIP”.
    • Unzip the file on your computer.
  2. Open Google Colab in your browser: https://colab.research.google.com
  3. In Colab, go to File > Upload notebook and select Ecommerce_LinearRegression.ipynb.
  4. Upload the dataset:
    • In the left sidebar (Files tab), click “Upload” and select Ecommerce_Customers.csv.
  5. If needed, update the data path in the first data-loading cell to:
    df = pd.read_csv('Ecommerce_Customers.csv')
  6. Run all cells:
    • Runtime > Run all.
  7. View results:
    • The notebook will walk you through exploratory analysis, model training, and performance reporting (RMSE).

Tips:

  • Keep the notebook (.ipynb) and the CSV in the same working directory in Colab to avoid path issues.
  • If prompted to install packages, follow the notebook instructions.

Option B — Run locally (Recommended for technical users)

Prerequisites:

  • Python 3.9+ installed
  • pip installed

  1. Clone or download the repository:

    git clone https://github.com/vinh2155/E-commerce-Model.git
    cd E-commerce-Model

    Or download the ZIP, extract it, and open a terminal in the extracted folder.

  2. (Optional but recommended) Create and activate a virtual environment:

    python -m venv .venv
    # Windows
    .venv\Scripts\activate
    # macOS/Linux
    source .venv/bin/activate
  3. Install dependencies:

    pip install jupyter pandas numpy scikit-learn matplotlib seaborn
  4. Launch Jupyter and open the notebook:

    jupyter notebook

    Then open Ecommerce_LinearRegression.ipynb.

  5. Ensure the data file is present in the same directory:

    • Ecommerce_Customers.csv

    If your path differs, update the data-loading cell accordingly:

    df = pd.read_csv('path/to/Ecommerce_Customers.csv')
  6. Run all cells (Kernel > Restart & Run All) to reproduce the analysis and metrics.

Project Files

  • Ecommerce_LinearRegression.ipynb — Main notebook with EDA, feature engineering, model training, and evaluation.
  • Ecommerce_Customers.csv — Dataset used in the analysis.
  • README.md — Project description and run instructions.

Results

  • Model: Linear Regression
  • Metric: RMSE = 10.19, in the same units as Yearly Amount Spent (2.04% of the dataset mean), indicating strong predictive performance for yearly spending.
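
The relative figure is RMSE divided by a mean. A minimal sketch of that arithmetic follows, assuming "dataset mean" refers to the mean of Yearly Amount Spent; the helper names and the three sample values are made up for illustration and are not the notebook's numbers.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def rmse_percent_of_mean(actual, predicted):
    """Express RMSE as a percentage of the mean of the actual values."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * rmse(actual, predicted) / mean_actual


# Made-up values: every prediction is off by exactly 10.
actual = [500, 480, 520]
predicted = [510, 470, 530]
print(rmse(actual, predicted))                  # 10.0
print(rmse_percent_of_mean(actual, predicted))  # 2.0
```

Under that reading, an RMSE of 10.19 at 2.04% implies a mean of roughly 500.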

Troubleshooting

  • File not found errors: Confirm the CSV filename and that it’s in the same directory as the notebook (or update the path).
  • Package errors: Re-run pip install ... commands, or restart the notebook/kernel after installing.
  • Colab file resets: If the Colab runtime disconnects, re-upload the CSV and re-run cells.
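
For file-not-found errors, a small helper can report exactly which folders were searched. This is a sketch using only the standard library; `find_data_file` and its `search_dirs` parameter are hypothetical names, not part of the notebook.

```python
from pathlib import Path


def find_data_file(filename="Ecommerce_Customers.csv", search_dirs=(".",)):
    """Return the first existing path to filename, or raise a descriptive error."""
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(Path(d).resolve()) for d in search_dirs)
    raise FileNotFoundError(f"{filename} not found in: {searched}")


# Possible use in the data-loading cell (requires pandas and the CSV to be present):
# df = pd.read_csv(find_data_file(search_dirs=(".", "data")))
```

The error message lists absolute paths, which makes it easier to spot a notebook running from an unexpected working directory.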
