🚩 The Breaking Point
FinVault had crossed 8,000 registered users, but its monolithic PHP portal — originally built in 2019 — was not designed for that scale. During peak market hours (9:15 AM–11:30 AM IST), the server would hit 100% CPU utilization, causing session timeouts and incomplete portfolio loads. Their biggest enterprise client threatened to terminate the contract after losing ₹3.2 lakh due to a delayed sell signal caused by stale data.
The core architectural flaw: all data — live prices, NAV, portfolio P&L — was fetched via a single cron job every 45 minutes. There was no event-driven mechanism, no caching strategy, and no horizontal scalability plan.
🔍 Discovery & Audit
| System Area | Issue Found | Impact |
|---|---|---|
| Database Layer | N+1 queries, no indexing | Page load 14s avg |
| Data Freshness | 45-min cron polling | Stale investment data |
| Frontend Architecture | jQuery + inline CSS spaghetti | Zero mobile support |
| Scalability | Single VPS, no load balancing | Crashes at 500+ users |
🛠️ The Architecture Rebuild
Phase 1 — Decouple the Monolith
We split the application into four independent microservices: Auth Service (JWT), Portfolio Engine, Market Data Streamer, and Notification Service. The services communicated over a Redis Pub/Sub message bus, eliminating direct database coupling and allowing each service to scale independently.
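The publish/subscribe pattern between the services can be sketched as below. This is a minimal, self-contained illustration: the channel name, the `TickEvent` shape, and the service roles are assumptions, and Node's built-in `EventEmitter` stands in for the Redis broker so the example runs without a Redis server (in production each service would hold its own Redis connection, e.g. via a client library such as ioredis).

```typescript
import { EventEmitter } from "node:events";

// Stand-in for the Redis Pub/Sub broker (assumption: channel "ticks").
const broker = new EventEmitter();

interface TickEvent {
  symbol: string;
  price: number;
  ts: number;
}

// The Market Data Streamer publishes ticks onto the channel...
function publishTick(tick: TickEvent): void {
  broker.emit("ticks", JSON.stringify(tick));
}

// ...and the Portfolio Engine subscribes without any direct reference
// to the publisher — this is what removes the database coupling.
const received: TickEvent[] = [];
broker.on("ticks", (payload: string) => {
  received.push(JSON.parse(payload) as TickEvent);
});

publishTick({ symbol: "RELIANCE", price: 2875.4, ts: Date.now() });
console.log(received[0].symbol); // RELIANCE
```

Because Redis Pub/Sub is fire-and-forget, messages sent while a subscriber is down are lost; a durable queue (e.g. Redis Streams) would be the alternative where delivery guarantees matter.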
Phase 2 — Real-Time Data Pipeline
We replaced the 45-minute cron polling with a WebSocket-based live stream from NSE/BSE market data APIs. A Node.js event emitter layer processed and broadcast price ticks to connected clients in under 800 ms. Redis cached the last 100 ticks per symbol, serving 95% of requests without touching PostgreSQL.
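The "last 100 ticks per symbol" hot cache boils down to a capped, newest-first list per symbol. A minimal sketch, with an in-memory `Map` standing in for Redis (the real thing maps naturally to `LPUSH` + `LTRIM` on a per-symbol list; the `Tick` shape and names here are assumptions):

```typescript
interface Tick {
  price: number;
  ts: number;
}

const MAX_TICKS = 100;
const tickCache = new Map<string, Tick[]>();

function cacheTick(symbol: string, tick: Tick): void {
  const ticks = tickCache.get(symbol) ?? [];
  ticks.unshift(tick); // newest first, analogous to Redis LPUSH
  ticks.length = Math.min(ticks.length, MAX_TICKS); // analogous to LTRIM 0 99
  tickCache.set(symbol, ticks);
}

function recentTicks(symbol: string): Tick[] {
  // This read path is what serves requests without touching PostgreSQL.
  return tickCache.get(symbol) ?? [];
}

// Push 150 ticks; only the newest 100 are retained.
for (let i = 0; i < 150; i++) {
  cacheTick("TCS", { price: 4000 + i, ts: i });
}
console.log(recentTicks("TCS").length);   // 100
console.log(recentTicks("TCS")[0].price); // 4149 (newest)
```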
Phase 3 — Next.js Frontend
The entire UI was rebuilt in Next.js 14 with App Router. Portfolio charts used Chart.js with real-time dataset updates via WebSocket listeners. Server Components handled initial data hydration; Client Components handled live tick updates — achieving an LCP of 1.1 seconds on mobile.
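The live-tick handling on the client reduces to a pure sliding-window update that a Client Component can call from its WebSocket listener before invoking `chart.update()`. A sketch of that update logic (the `Point` shape mirrors Chart.js `{x, y}` data points; the window size and names are assumptions, not FinVault's actual values):

```typescript
interface Point {
  x: number; // timestamp
  y: number; // price
}

const WINDOW = 60; // keep the most recent 60 points on screen (assumed)

// Pure function: append the new tick and drop the oldest point once the
// window is full, so the chart scrolls instead of growing unboundedly.
function applyTick(data: Point[], tick: Point): Point[] {
  const next = [...data, tick];
  return next.length > WINDOW ? next.slice(next.length - WINDOW) : next;
}

// Simulate 65 incoming ticks.
let series: Point[] = [];
for (let i = 0; i < 65; i++) {
  series = applyTick(series, { x: i, y: 100 + i });
}
console.log(series.length); // 60
console.log(series[0].x);   // 5 (oldest retained tick)
```

Keeping this transform pure makes it trivial to unit-test independently of both the WebSocket connection and Chart.js.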
Phase 4 — Infrastructure & DevOps
The stack was deployed as Docker containers behind an Nginx reverse proxy on a three-node cluster. A GitHub Actions CI/CD pipeline ran automated tests on every push, with zero-downtime blue-green deployments. Cloudflare CDN handled static asset serving with edge caching across 12 global PoPs.
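One way the blue-green switch can be wired at the proxy layer is to define both deployments as upstreams and let the deploy pipeline flip which one receives live traffic. This is a minimal sketch; the upstream names, IPs, and ports are assumptions, not FinVault's actual configuration:

```nginx
# Two identical app deployments; only one receives live traffic at a time.
upstream app_blue  { server 10.0.0.11:3000; }
upstream app_green { server 10.0.0.12:3000; }

server {
    listen 80;

    location / {
        # The deploy pipeline rewrites this line (blue <-> green) after
        # the idle colour passes health checks, then runs `nginx -s reload`,
        # which drains old connections gracefully — hence zero downtime.
        proxy_pass http://app_blue;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```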
📈 Outcome: 60 Days Post-Launch
| Metric | Before | After | Delta |
|---|---|---|---|
| Data Latency | 45 minutes | 800ms | -99.97% |
| Peak Concurrent Users | 500 (crashes) | 12,000 (stable) | +2300% |
| Page Load Time (avg) | 14.2s | 1.1s | -92% |
| Client Retention (30-day) | 38% | 82% | +116% |
| System Uptime | 94.2% | 99.97% | +5.77 pts |