Haziq Nazeer
Full-stack · AI Telehealth Platform · 2025–Present

VirtualMD

A production AI telehealth platform with 70+ AI specialist doctors across web and mobile — real-time voice and streaming chat consultations powered by Anthropic Claude and OpenAI's Realtime API, engineered for reliability at scale.

Full-Stack Engineer · Web reliability owner

01 — Overview

The project

VirtualMD lets patients consult 70+ AI specialist doctors by voice or chat across web and mobile. A Python FastAPI backend coordinates multiple AI providers — Anthropic Claude for clinical reasoning, OpenAI's Realtime API over WebRTC for live voice — behind a React app, a separate admin console and a Next.js marketing site. It runs in production at virtualmd.app.

Role

Full-Stack Engineer · Web reliability owner

Timeline

2025 — Present

Stack

8 technologies

02 — Context

Problem & approach

The problem

Real-time medical conversations are unforgiving: a dropped socket, a stalled token stream or a latency spike breaks the consultation instantly. The platform also had to coordinate multiple AI providers, serve web and mobile from one API, and stay reliable as usage grew toward a 1M+ user target.

My approach

I owned web reliability and the real-time streaming pipeline end to end. I hardened the WebSocket layer (keep-alive pings, idle cleanup, exponential-backoff reconnection, per-connection rate limiting) and built an adaptive client-side drain that scales the typewriter render rate to queue depth, so streamed responses render smoothly instead of stalling. On the backend I worked across a provider-abstracted AI connector with fallback handling, a Redis cache-aside layer with graceful degradation, Celery background jobs, and a Postgres schema tuned with connection pooling and targeted indexes. The architecture is designed to scale toward 1M+ users.
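To make the per-connection rate limiting concrete, here is a minimal token-bucket sketch. It is an illustration rather than the production code: the class name, the `rate` and `capacity` values, and the choice of a token bucket are all assumptions.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical per-connection limiter: one bucket per open WebSocket."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate            # tokens refilled per second (assumed value)
        self.capacity = capacity    # maximum burst size (assumed value)
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True and consume one token if the message may pass."""
        now = self.clock()
        elapsed = now - self.last
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A server would keep one bucket per connection and drop or delay any inbound message for which `allow()` returns False.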

03 — Showcase

A closer look

Live AI consultation — streamed answers with cited sources and suggested follow-ups

The panel — 70+ AI specialist doctors to choose from

Starting a consultation with a personal AI health advisor

04 — Capabilities

Key features

01

Real-time AI chat

Token-by-token streaming over a custom StreamStart / Chunk / End WebSocket protocol.
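A start / chunk / end protocol of this kind can be sketched as a small reassembler. The JSON frame shape and the field names (`type`, `id`, `text`) are assumptions for illustration; only the three message kinds come from the description above.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamAssembler:
    """Rebuilds one streamed reply from StreamStart / Chunk / End frames."""
    stream_id: Optional[str] = None
    parts: List[str] = field(default_factory=list)
    done: bool = False

    def feed(self, raw: str) -> None:
        msg = json.loads(raw)
        kind = msg["type"]
        if kind == "StreamStart":
            # A new stream resets any previous state.
            self.stream_id, self.parts, self.done = msg["id"], [], False
        elif kind == "Chunk":
            # Ignore chunks from a stale or already finished stream.
            if msg["id"] == self.stream_id and not self.done:
                self.parts.append(msg["text"])
        elif kind == "End":
            if msg["id"] == self.stream_id:
                self.done = True

    @property
    def text(self) -> str:
        return "".join(self.parts)
```

Tagging every frame with a stream id lets the client discard late chunks from a reply that was cancelled or superseded.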

02

Live voice consultations

OpenAI Realtime API over WebRTC with on-device voice-activity detection.

03

70+ AI specialist doctors

A coordinator routes each turn to the right specialist persona.

04

Vision & document analysis

Uploaded images and PDFs analysed by Claude with format detection.

05

Family & guest modes

Separate histories per family member, plus no-login guest sessions.

06

30+ languages with RTL

Full internationalisation including right-to-left layouts.

05 — Contribution

My role

As Full-Stack Engineer and web reliability owner, here is what I owned and delivered on this project.

  • Owned web app reliability and the real-time streaming pipeline.
  • Built a custom WebSocket manager — keep-alive, idle cleanup, reconnection with backoff, per-connection rate limiting.
  • Engineered an adaptive client-side stream drain (RAF batching scaled to queue depth) for smooth, stall-free token rendering.
  • Worked across the provider-abstracted AI connector with 429 / timeout / refusal fallbacks.
  • Tuned Postgres with connection pooling and indexes, and added a Redis cache-aside layer with graceful degradation.
  • Coordinated API contract changes across the web and mobile teams.
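The cache-aside layer with graceful degradation from the list above can be sketched as follows. This is a minimal illustration under assumptions: `cached_fetch`, the JSON serialisation and the 300-second TTL are hypothetical, and `cache` stands for any client exposing `get(key)` and `set(key, value, ex=ttl)`, as a redis-py client does.

```python
import json
from typing import Any, Callable, Tuple, Type


def cached_fetch(
    cache: Any,
    key: str,
    load: Callable[[], dict],
    ttl_seconds: int = 300,  # assumed TTL
    cache_errors: Tuple[Type[BaseException], ...] = (Exception,),
) -> dict:
    """Read through the cache; fall back to `load` if the cache misses or is down."""
    try:
        hit = cache.get(key)
        if hit is not None:
            return json.loads(hit)
    except cache_errors:
        pass  # cache unavailable or entry unreadable: degrade to the source of truth

    value = load()  # e.g. a Postgres query

    try:
        cache.set(key, json.dumps(value), ex=ttl_seconds)
    except cache_errors:
        pass  # failing to populate the cache must not fail the request
    return value
```

With redis-py, `cache_errors` would more sensibly be narrowed to `redis.exceptions.RedisError`, so that unrelated bugs are not swallowed.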

06 — Engineering

Challenges I solved

Challenge

Streamed AI responses stalled or overwhelmed the UI under load.

Solution

Built an adaptive requestAnimationFrame drain that scales batch size to queue depth (2→12 chars/frame) and flushes when the tab is hidden — smooth output with no runaway queues.
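The batch-size logic can be sketched independently of the browser. The production drain runs inside requestAnimationFrame on the client; the Python below only models the per-frame decision. The 2 and 12 chars/frame bounds come from the description above, while the linear ramp and the `SATURATION_DEPTH` of 200 characters are assumptions.

```python
from typing import Tuple

MIN_BATCH = 2            # chars per frame when the queue is shallow (from the text)
MAX_BATCH = 12           # chars per frame when the queue is deep (from the text)
SATURATION_DEPTH = 200   # assumed queue depth at which the batch size tops out


def batch_size(queue_depth: int) -> int:
    """Characters to render this frame, scaled linearly with queue depth."""
    if queue_depth <= 0:
        return 0
    ramp = min(queue_depth, SATURATION_DEPTH) / SATURATION_DEPTH
    size = MIN_BATCH + round((MAX_BATCH - MIN_BATCH) * ramp)
    return min(size, queue_depth)


def drain_frame(queue: str, tab_hidden: bool = False) -> Tuple[str, str]:
    """Return (text to render now, text left in the queue) for one frame."""
    # A hidden tab gets no animation frames, so flush everything at once.
    n = len(queue) if tab_hidden else batch_size(len(queue))
    return queue[:n], queue[n:]
```

Scaling the batch to the backlog keeps the typewriter effect slow when tokens trickle in and lets it catch up when they arrive in bursts.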

Challenge

Real-time voice and chat sockets dropped mid-consultation.

Solution

Added keep-alive pings, idle cleanup and 5-attempt exponential-backoff reconnection with a 15s health check so sessions stay live.
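The reconnection schedule can be sketched as below. The five-attempt limit comes from the description above; the 1-second base delay and 16-second cap are assumptions, and `connect` and `sleep` are hypothetical injected callables so the loop can be exercised without a real socket.

```python
from typing import Callable, List

MAX_ATTEMPTS = 5     # from the text
BASE_DELAY = 1.0     # assumed base delay in seconds
MAX_DELAY = 16.0     # assumed cap in seconds


def backoff_delays(attempts: int = MAX_ATTEMPTS, base: float = BASE_DELAY,
                   cap: float = MAX_DELAY) -> List[float]:
    """Delay before each attempt: base, 2*base, 4*base, ... up to the cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


def reconnect(connect: Callable[[], bool],
              sleep: Callable[[float], None]) -> bool:
    """Retry `connect` with exponential backoff; True once it succeeds."""
    for delay in backoff_delays():
        sleep(delay)
        if connect():
            return True
    return False  # give up after MAX_ATTEMPTS so the UI can surface the failure
```

A production version would usually add random jitter to each delay so that many clients do not reconnect in lockstep after an outage; the periodic health check is not shown here.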

Challenge

A single AI provider hitting a 429 or timeout would break a consultation.

Solution

Routed all AI calls through one connector with model selection and typed fallbacks for rate limits, timeouts and refusals.
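A single connector with typed fallbacks might look like the sketch below. The exception classes, the `complete` function and the policy of moving to the next provider on every typed failure are assumptions for illustration; the real connector's fallback policy is not described in detail here.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Base class for typed provider failures (hypothetical)."""


class RateLimited(ProviderError):
    """Provider answered with HTTP 429."""


class ProviderTimeout(ProviderError):
    """Provider did not answer in time."""


class Refused(ProviderError):
    """Provider declined to answer the prompt."""


Provider = Tuple[str, Callable[[str], str]]


def complete(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try providers in order; return (provider name, reply) from the first success."""
    failures: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except (RateLimited, ProviderTimeout, Refused) as exc:
            # Record the typed reason and fall through to the next provider.
            failures.append(f"{name}: {type(exc).__name__}")
    raise ProviderError("all providers failed: " + ", ".join(failures))
```

Keeping the failures typed means each one can later get its own policy, for example retrying after a rate limit but returning a safe canned reply after a refusal.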

07 — Toolbox

Built with

FastAPI · Python · React · PostgreSQL · Anthropic Claude · OpenAI Realtime · WebRTC · Redis

08 — Impact

Outcomes

70+

AI specialist doctors

Web + Mobile

Served from one API

Live

In production at virtualmd.app
