Projects

Sentiment Analysis

NLP Sentiment Rating Prediction Project: Key Highlights
  • Project Goal: Built an NLP model to predict sentiment ratings (0–10) from textual feedback in a coaching institute context.

  • Data: 200,000+ unique reviews generated with realistic coaching-related expressions (e.g., "teachers good", "not up to mark", "average").

  • Preprocessing: Applied text cleaning, tokenization, stopword removal, and lemmatization using NLTK and Python's re module.

  • Feature Engineering: Used TF-IDF vectorization and Word2Vec to convert text into numerical features.

  • Modeling: Trained regression models, including Linear Regression, to predict sentiment scores.

  • Evaluation: Compared models on MAE and RMSE and selected the best-performing (lowest-error) model for deployment.

  • Deployment: Deployed the final model using Streamlit for real-time prediction with a user-friendly web interface.

  • Business Impact: Automated sentiment scoring reduced manual review time, minimized human bias, and improved feedback analysis efficiency by 74%.
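The cleaning and evaluation steps above can be sketched in plain Python. This is a minimal illustration rather than the project's code: the tiny STOPWORDS set stands in for NLTK's English stopword list, lemmatization is omitted, and the function names clean_text and mae_rmse are hypothetical.

```python
import math
import re

# Hypothetical, deliberately small stopword set (the project used NLTK's list).
# Negations such as "not" are kept here because they carry sentiment.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "were", "to", "of", "and"}


def clean_text(text):
    """Lowercase, strip non-letters, tokenize on whitespace, drop stopwords."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # digits and punctuation become spaces
    tokens = text.split()
    return [tok for tok in tokens if tok not in STOPWORDS]


def mae_rmse(y_true, y_pred):
    """Return (MAE, RMSE) for two equal-length sequences of ratings."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


# Example review and a toy set of true vs. predicted ratings (made-up values).
tokens = clean_text("The teachers are good, but doubt sessions were NOT up to mark!")
mae, rmse = mae_rmse([8, 3, 5], [7, 5, 5])
```

RMSE penalizes large errors more heavily than MAE, so reporting both shows whether a model's average error hides a few badly mispredicted reviews.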

Blinkit Revenue Optimization & Simulation Dashboard

Objective:

To analyze one year of realistic, multi-source e-commerce data and deliver data-backed strategies to achieve at least 15% revenue growth through optimization across marketing, logistics, and customer experience.

  • Defined and monitored KPIs: Net Revenue, Profit Margin, AOV, Return Rate, Cart Abandonment, CAC, ROAS

  • Built Excel and Power BI dashboards to explore trends across loyalty, product category, city tier, and conversion funnel

  • Used SQL (BigQuery) for deep-dive analysis using joins, aggregations, and window functions

  • Applied Cohort Analysis & RFM Segmentation in Python to identify customer lifecycle groups (e.g., Champions, At-Risk)

  • Conducted statistical testing (T-Test, ANOVA, Chi-square, Z-Test) to validate differences with statistical confidence

  • Simulated 9+ business strategies (e.g., loyalty conversion, AOV uplift, funnel improvement) to forecast revenue impact

  • Developed a clean, interactive Streamlit app for real-time KPI tracking and scenario simulation

Tools Used:

Python | SQL | Power BI | Excel | Streamlit | Plotly | SciPy

Outcome:

Achieved a 15.03% increase in projected annual revenue through strategic optimization, simulation, and data storytelling, all backed by domain knowledge and statistical evidence.
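The KPIs tracked in the Blinkit project reduce to simple ratios once period aggregates are available, and a strategy simulation can then be expressed as uplifts applied to those ratios. The sketch below uses common textbook definitions, which may differ from the ones in the actual dashboards. The PeriodTotals fields, the function names, and all sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeriodTotals:
    """Hypothetical aggregates for one reporting period (any currency)."""
    gross_revenue: float
    refunds: float
    discounts: float
    total_cost: float
    orders: int
    returned_orders: int
    carts_created: int
    marketing_spend: float
    new_customers: int
    ad_spend: float
    ad_attributed_revenue: float


def compute_kpis(t):
    """Compute the KPIs from aggregates, using common (assumed) definitions."""
    net_revenue = t.gross_revenue - t.refunds - t.discounts
    return {
        "net_revenue": net_revenue,
        "profit_margin": (net_revenue - t.total_cost) / net_revenue,
        "aov": net_revenue / t.orders,                       # average order value
        "return_rate": t.returned_orders / t.orders,
        "cart_abandonment": 1 - t.orders / t.carts_created,
        "cac": t.marketing_spend / t.new_customers,          # customer acquisition cost
        "roas": t.ad_attributed_revenue / t.ad_spend,        # return on ad spend
    }


def simulate_revenue(t, aov_uplift=0.0, order_uplift=0.0):
    """Project net revenue if AOV and order volume change by the given fractions."""
    aov = compute_kpis(t)["aov"]
    return aov * (1 + aov_uplift) * t.orders * (1 + order_uplift)


# Made-up sample period, for illustration only.
sample = PeriodTotals(
    gross_revenue=1_200_000, refunds=50_000, discounts=150_000,
    total_cost=800_000, orders=4_000, returned_orders=200,
    carts_created=10_000, marketing_spend=100_000, new_customers=500,
    ad_spend=50_000, ad_attributed_revenue=200_000,
)
kpis = compute_kpis(sample)
projected = simulate_revenue(sample, aov_uplift=0.05, order_uplift=0.10)
growth = projected / kpis["net_revenue"] - 1
```

Because revenue is order count times AOV, the two uplifts compound multiplicatively: a 5% AOV uplift combined with 10% more orders yields about 15.5% growth in this toy example, not 15%.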

Student Churn Prediction Engine

Objective

To proactively identify students at risk of dropping out from a coaching institute and enable timely interventions to improve retention and reduce revenue loss.

Tools & Technologies

Python, pandas, scikit-learn, Streamlit, pickle (for model deployment)

Project Overview

I developed a machine learning model using Random Forest to predict student churn based on academic performance, attendance, feedback sentiment, and behavioral scores. The final model achieved 97% accuracy and was deployed as an interactive Streamlit web app for real-time prediction and counselor use.

Business Impact

  • Correctly flagged 14,142 high-risk students

  • Enabled interventions that retained 5,656 students

  • Saved approximately ₹2.83 crore in potential revenue loss
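The impact figures above follow a simple funnel: flagged students, times the share retained through intervention, times revenue per retained student. The sketch below reproduces that arithmetic. The 40% retention share and the ₹5,000 revenue per student are assumptions chosen only because they are consistent with the stated numbers. They are not values confirmed by the project, and the function name estimate_impact is hypothetical.

```python
FLAGGED_STUDENTS = 14_142            # from the project summary
ASSUMED_RETENTION_PCT = 40           # assumption: % of flagged students retained
ASSUMED_REVENUE_PER_STUDENT = 5_000  # assumption: rupees per retained student
RUPEES_PER_CRORE = 10_000_000


def estimate_impact(flagged, retention_pct, revenue_per_student):
    """Return (students retained, revenue saved in crore rupees)."""
    retained = flagged * retention_pct // 100  # whole students only
    saved_rupees = retained * revenue_per_student
    return retained, saved_rupees / RUPEES_PER_CRORE


retained, saved_crore = estimate_impact(
    FLAGGED_STUDENTS, ASSUMED_RETENTION_PCT, ASSUMED_REVENUE_PER_STUDENT
)
```

Under these assumptions the function returns 5,656 retained students and roughly ₹2.83 crore saved, matching the figures listed above.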

Outcome

This project showcases my ability to solve real business problems using end-to-end data science, from feature engineering and model building to deployment and impact simulation.