Stock Market Data Pipeline & Prediction Model
End-to-end machine learning pipeline for predicting S&P 500 stock movements using historical market data.
- Built a data ingestion pipeline that collects, cleans, and normalizes historical market data with the yfinance library
- Developed a Random Forest classifier and a backtesting engine, reaching 55.7% precision when backtested over 9,000+ historical trading days
- Implemented train-test splits and data leakage prevention for reliable model evaluation
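
The evaluation approach described in the bullets above can be illustrated with a minimal sketch. It is not the project's actual code: the function names (make_target, walk_forward_splits, backtest, majority_baseline), the next-day-up target definition, and the start/step parameters are all assumptions made for illustration. It shows two common leakage safeguards: labels are built only from the following day's close, and every training window ends strictly before its test window begins. The majority-class baseline is a stand-in; a scikit-learn RandomForestClassifier could be wrapped to match the same hypothetical fit_predict signature.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

# Hypothetical signature: (X_train, y_train, X_test) -> predictions for X_test
FitPredict = Callable[[List[List[float]], List[int], List[List[float]]], List[int]]


def make_target(closes: Sequence[float]) -> List[int]:
    """Label day t with 1 if the close on day t+1 is higher, else 0.

    The final day has no known next close, so it gets no label.
    """
    return [int(closes[t + 1] > closes[t]) for t in range(len(closes) - 1)]


def walk_forward_splits(n_rows: int, start: int, step: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges; every train index precedes every test index."""
    for test_start in range(start, n_rows, step):
        test_end = min(test_start + step, n_rows)
        yield range(0, test_start), range(test_start, test_end)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predicted 'up' days that were actually up (0.0 if none predicted)."""
    hits = [t for t, p in zip(y_true, y_pred) if p == 1]
    return sum(hits) / len(hits) if hits else 0.0


def backtest(features: List[List[float]], target: List[int],
             fit_predict: FitPredict, start: int, step: int) -> Tuple[List[int], List[int]]:
    """Refit on all past rows, predict the next block, and collect the results."""
    y_true: List[int] = []
    y_pred: List[int] = []
    for train_idx, test_idx in walk_forward_splits(len(target), start, step):
        X_train = [features[i] for i in train_idx]
        y_train = [target[i] for i in train_idx]
        X_test = [features[i] for i in test_idx]
        y_pred.extend(fit_predict(X_train, y_train, X_test))
        y_true.extend(target[i] for i in test_idx)
    return y_true, y_pred


def majority_baseline(X_train, y_train, X_test) -> List[int]:
    """Placeholder model: always predict the most common class seen in training."""
    majority = int(2 * sum(y_train) >= len(y_train))
    return [majority] * len(X_test)


if __name__ == "__main__":
    closes = [10, 11, 12, 11, 12, 13, 12, 13, 14, 15]  # made-up prices
    target = make_target(closes)
    # Feature for day t uses only closes up to day t (one-day price change).
    features = [[0.0]] + [[float(closes[t] - closes[t - 1])] for t in range(1, len(target))]
    y_true, y_pred = backtest(features, target, majority_baseline, start=4, step=2)
    print(f"precision: {precision(y_true, y_pred):.2f}")
```

Precision is the metric here because it measures how often a predicted "up" day was actually up, which is the quantity the 55.7% figure reports.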