A machine learning study demonstrating the application of Reinforcement Learning (Q-Learning) algorithms to optimize stock trading strategies and maximize portfolio returns.
Source Code · Kaggle Notebook · Video Demo · Live Demo
Authors · Overview · Features · Structure · Results · Quick Start · Usage Guidelines · License · About · Acknowledgments
Important
Special thanks to Mega Satish for her meaningful contributions, guidance, and support that helped shape this work.
Optimizing Stock Trading Strategy With Reinforcement Learning is a Data Science study conducted as part of the Internship at Technocolabs Software. The project focuses on the development of an intelligent agent capable of making autonomous trading decisions (Buy, Sell, Hold) to maximize profitability.
By leveraging Q-Learning, the system models the market environment where an agent learns optimal strategies based on price movements and moving average crossovers. The model is visualized via a Streamlit web application for real-time strategy simulation.
The analysis is governed by strict exploratory and modeling principles ensuring algorithmic validity:
- State Representation: utilization of short-term and long-term Moving Average crossovers to define market states.
- Action Space: a discrete action set (Buy, Sell, Hold) optimized through reward feedback.
- Policy Optimization: an Epsilon-Greedy strategy balancing exploration and exploitation of trading rules.
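The Epsilon-Greedy selection described above can be sketched as follows. This is an illustrative sketch, not the repository's exact implementation: the `q_table` layout (a dict of state → per-action values) and the action names are assumptions.

```python
import random

ACTIONS = ["Buy", "Sell", "Hold"]

def choose_action(q_table, state, epsilon=0.1):
    """Epsilon-greedy policy: with probability epsilon take a random
    action (explore), otherwise take the best-known action (exploit)."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)                       # explore
    q_values = q_table.get(state, {a: 0.0 for a in ACTIONS})
    return max(q_values, key=q_values.get)                  # exploit
```

During early training a high epsilon encourages exploration of the state-action space; it is typically decayed over episodes so the agent converges toward exploiting its learned Q-values.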
Note
This project was published as a research paper and successfully demonstrated the viability of RL agents in simulated trading environments. The work received official recognition from Technocolabs Software including an Internship Completion Certificate and Letter of Recommendation.
| # | Resource | Description | Date |
|---|---|---|---|
| 1 | Source Code | Complete production repository and scripts | — |
| 2 | Kaggle Notebook | Interactive Jupyter notebook for model training | — |
| 3 | Dataset | Historical stock market data (5 Years) | — |
| 4 | Technical Specification | System architecture and specifications | — |
| 5 | Technical Report | Comprehensive archival project documentation | September 2021 |
| 6 | Blueprint | Initial project design and architecture blueprint | September 2021 |
Tip
The Q-Learning agent's performance relies heavily on the quality of historical data. Regular retraining with recent market data is recommended to adapt the Q-Table's values to shifting market trends and volatility patterns.
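The retraining referred to in the tip follows the standard tabular Q-Learning update rule, Q(s,a) ← Q(s,a) + α·(r + γ·max Q(s′,a′) − Q(s,a)). A minimal sketch, assuming a dict-based Q-Table and illustrative hyperparameter values:

```python
def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.95, actions=("Buy", "Sell", "Hold")):
    """One tabular Q-Learning step: move Q(s, a) toward the
    temporal-difference target r + gamma * max_a' Q(s', a')."""
    q_table.setdefault(state, {a: 0.0 for a in actions})
    q_table.setdefault(next_state, {a: 0.0 for a in actions})
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table
```

Rerunning this update loop over fresh market data is what "retraining" amounts to for a tabular agent: the Q-Table values drift toward the reward structure of the newer regime.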
| Component | Technical Description |
|---|---|
| Data Ingestion | Automated loading and processing of historical stock data (CSV). |
| Trend Analysis | Computation of 5-day and 1-day Moving Averages to identify trend signals. |
| RL Agent | Q-Learning implementation with state-action mapping for decision autonomy. |
| Portfolio Logic | Dynamic tracking of cash, stock holdings, and total net worth over time. |
| Visualization | Interactive Streamlit dashboard using Plotly for financial charting. |
Note
Stock markets are stochastic environments. This project simplifies the state space to Moving Average crossovers to demonstrate the foundational capabilities of Reinforcement Learning in financial contexts, prioritizing pedagogical clarity over high-frequency trading complexity.
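The simplified state space described in the note can be sketched with pandas. The window lengths follow the 5-day/1-day Moving Averages mentioned in the feature table, but the state labels and function name are illustrative, not the repository's exact code:

```python
import pandas as pd

def ma_crossover_states(close: pd.Series, short=1, long=5) -> pd.Series:
    """Discretize the market into two states from an MA crossover:
    'bullish' when the short MA sits above the long MA, else 'bearish'."""
    short_ma = close.rolling(short).mean()
    long_ma = close.rolling(long).mean()
    states = pd.Series("bearish", index=close.index)
    states[short_ma > long_ma] = "bullish"   # NaN comparisons stay bearish
    return states
```

Collapsing the market into a handful of discrete states like this is what makes a tabular Q-Table tractable; richer state encodings would require function approximation instead.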
- Runtime: Python 3.x
- Machine Learning: NumPy, Pandas
- Visualization: Streamlit, Plotly, Matplotlib, Seaborn
- Algorithm: Q-Learning (Reinforcement Learning)
OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING/
│
├── docs/ # Technical Documentation
│ └── SPECIFICATION.md # Architecture & Design Specification
│
├── Mega/ # Archival Attribution Assets
│ ├── Filly.jpg # Companion (Filly)
│ ├── Mega.png # Author Profile Image (Mega Satish)
│ └── ... # Additional Attribution Files
│
├── screenshots/ # Application Screenshots
│ ├── 01-landing-page.png # Home Interface
│ ├── 02-amzn-trend.png # Stock Trend Visualization
│ ├── 03-portfolio-growth.png # Portfolio Value Over Time
│ └── 04-alb-trend.png # Analysis Example
│
├── Source Code/ # Core Implementation
│ ├── Train_model/ # Training Notebooks
│ │ └── Model.ipynb # Q-Learning Implementation
│ │
│ ├── .streamlit/ # Streamlit Configuration
│ ├── all_stocks_5yr.csv # Historical Dataset
│ ├── model.pkl # Trained Q-Table (Pickle)
│ ├── Procfile # Heroku Deployment Config
│ ├── requirements.txt # Dependencies
│ ├── setup.sh # Environment Setup Script
│ └── Stock-RL.py # Main Application Script
│
├── Technocolabs/ # Internship Artifacts
│ ├── AMEY THAKUR - BLUEPRINT.pdf # Design Blueprint
│ ├── Optimizing Stock Trading...pdf # Research Paper
│ ├── PROJECT REPORT.pdf # Final Project Report
│ └── ... # Internship Completion Documents
│
├── .gitattributes # Git configuration
├── .gitignore # Repository Filters
├── CITATION.cff # Scholarly Citation Metadata
├── codemeta.json # Machine-Readable Project Metadata
├── LICENSE # MIT License Terms
├── README.md # Project Documentation
└── SECURITY.md # Security Policy

1. User Interface: Landing Page
The Streamlit-based dashboard allows users to select stocks and define investment parameters for real-time strategy optimization.

2. Market Analysis: Stock Trend
Historical price visualization identifying long-term upward trends suitable for momentum-based trading strategies.

3. Strategy Evaluation: Portfolio Growth
Simulation of portfolio value over time, demonstrating the cumulative return generated by the agent against the initial capital.
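The portfolio simulation described above amounts to a simple cash-and-holdings ledger. A sketch under assumed mechanics (one share per trade, no transaction costs; the function name and signature are illustrative):

```python
def simulate_portfolio(prices, actions, initial_cash=10_000.0):
    """Replay a sequence of agent actions over a price series and
    return the portfolio's net worth (cash + holdings) at each step."""
    cash, shares, net_worth = initial_cash, 0, []
    for price, action in zip(prices, actions):
        if action == "Buy" and cash >= price:
            cash -= price
            shares += 1
        elif action == "Sell" and shares > 0:
            cash += price
            shares -= 1
        net_worth.append(cash + shares * price)
    return net_worth
```

Plotting the returned series against a flat line at `initial_cash` gives the cumulative-return view shown in the screenshot.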

4. Risk Assessment: Volatility Analysis
Trend analysis highlighting periods of high volatility where the agent adjusts exposure to mitigate risk.

- Python 3.7+: Required for runtime execution. Download Python
- Streamlit: For running the web application locally.
Warning
Data Consistency
The Q-Learning agent depends on proper state definitions. Ensure that the input dataset contains the required date, close, and Name columns to correctly compute the Moving Average crossovers used for state discretization.
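A quick sanity check for the column requirement in the warning can be run before training; this is an assumed helper (the function name is illustrative), using only the column names stated above:

```python
import pandas as pd

REQUIRED_COLUMNS = {"date", "close", "Name"}

def validate_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Fail fast if the columns needed for MA-crossover state
    discretization are missing from the input dataset."""
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Dataset is missing required columns: {sorted(missing)}")
    return df
```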
Establish the local environment by cloning the repository and installing the computational stack:
# Clone the repository
git clone https://github.com/Amey-Thakur/OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING.git
cd OPTIMIZING-STOCK-TRADING-STRATEGY-WITH-REINFORCEMENT-LEARNING
# Navigate to Source Code directory
cd "Source Code"
# Install dependencies
pip install -r requirements.txt

Launch the web server to start the prediction application:
streamlit run Stock-RL.py

Access: http://localhost:8501/
Tip
Experience the interactive Stock Trading RL simulation directly in your browser through the working Hugging Face Space. This platform features a Q-Learning agent that executes autonomous trading decisions based on real-time Moving Average (MA) signal processing, providing a visual representation of algorithmic portfolio management and cumulative return optimization.
This repository is openly shared to support learning and knowledge exchange across the machine learning and algorithmic trading community.
For Students
Use this project as reference material for understanding Reinforcement Learning (tabular Q-Learning), state-action discretization, and financial reward shaping. The source code is available for study to facilitate self-paced learning and exploration of moving average strategies.
For Educators
This project may serve as a practical lab example or supplementary teaching resource for Computational Finance, Artificial Intelligence, and Quantitative Trading courses. Attribution is appreciated when utilizing content.
For Researchers
The documentation and architectural approach may provide insights into simplified market modeling, policy iteration in volatile environments, and industrial internship artifacts.
This academic submission, developed for the Data Science Internship at Technocolabs Software, is made available under the MIT License. See the LICENSE file for complete terms.
Note
Summary: You are free to share and adapt this content for any purpose, even commercially, as long as you provide appropriate attribution to the original authors.
Copyright © 2021 Amey Thakur & Mega Satish
Created & Maintained by: Amey Thakur & Mega Satish
Role: Data Science Interns
Program: Data Science Internship
Organization: Technocolabs Software
This project features Optimizing Stock Trading Strategy With Reinforcement Learning, a study conducted as part of an industrial internship. It explores the practical application of Q-Learning in financial economics.
Connect: GitHub · LinkedIn · ORCID
Grateful acknowledgment to Mega Satish for her exceptional collaboration and scholarly partnership during the execution of this data science internship task. Her analytical precision, deep understanding of statistical modeling, and constant support were instrumental in refining the learning algorithms used in this study. Working alongside her was a transformative experience; her thoughtful approach to problem-solving and steady encouragement turned complex challenges into meaningful learning moments. This work reflects the growth and insights gained from our side-by-side academic journey. Thank you, Mega, for everything you shared and taught along the way.
Special thanks to the mentors at Technocolabs Software for providing this platform for rapid skill development and industrial exposure.
Authors · Overview · Features · Structure · Results · Quick Start · Usage Guidelines · License · About · Acknowledgments