
MediSense AI
AI-powered multimodal clinical copilot for real-time decision support. MediSense AI assists clinicians by analyzing medical images, transcribed patient–doctor conversations, and EHR data, and generates structured differential diagnoses, risk factors, red‑flag alerts, and RAG-backed advisories designed for explainability.
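As a minimal sketch of what a "structured differential" might look like, the snippet below models one entry with a plain dataclass. The field names (`condition`, `confidence`, `risk_factors`, `red_flag`) are illustrative assumptions, not the project's actual schema (which uses Pydantic models in the backend):

```python
from dataclasses import dataclass, field

@dataclass
class DifferentialEntry:
    condition: str                 # candidate diagnosis
    confidence: float              # model probability in [0, 1]
    risk_factors: list[str] = field(default_factory=list)
    red_flag: bool = False         # urgent-attention marker

# Hypothetical differential, sorted by descending confidence
differential = sorted(
    [
        DifferentialEntry("community-acquired pneumonia", 0.62, ["smoking"]),
        DifferentialEntry("pulmonary embolism", 0.21, ["recent surgery"], red_flag=True),
    ],
    key=lambda e: e.confidence,
    reverse=True,
)
```

A real response would carry additional provenance (source modality, retrieved citations) alongside each entry.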
The system supports voice input via WhisperX with speaker diarization, fuses signals from text and imaging, and surfaces live confidence/margin metrics with suggested follow‑up questions. It ships with a mock EHR integration, an embeddings-powered knowledge base, and a React UI for interactive clinical workflows.
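The confidence/margin metrics mentioned above can be sketched as follows: confidence is the top-1 softmax probability over the model's raw scores, and margin is the gap between the top two probabilities (a small margin signals an ambiguous call worth a follow-up question). This is a generic sketch, not the project's exact computation:

```python
import math

def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over raw model scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_and_margin(logits: list[float]) -> tuple[float, float]:
    """Top-1 probability (confidence) and its gap to the runner-up (margin)."""
    probs = sorted(softmax(logits), reverse=True)
    return probs[0], probs[0] - probs[1]

# Hypothetical per-diagnosis scores from the fused multimodal model
conf, margin = confidence_and_margin([2.0, 1.0, 0.1])
```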
- Backend: FastAPI, Python, Pydantic, Uvicorn, WebSockets, python‑dotenv
- ML/Vision: PyTorch, torchvision, OpenCLIP, timm, scikit‑learn, pandas
- Speech: WhisperX, torchaudio, librosa, soundfile, PyAudio
- RAG/LLM: ChromaDB, Sentence‑Transformers, LangChain, Groq API
- Frontend: React 18, TypeScript, TailwindCSS, Axios, Framer Motion, react‑dropzone, react‑hot‑toast
- Data/Infra: JSON EHR mock, env‑based configs, health checks, multimodal inference endpoints
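The JSON EHR mock listed above can be approximated with a plain JSON payload and a lookup helper. The payload below and the `patient_id` key are illustrative assumptions, not the repo's actual mock schema:

```python
import json

# Hypothetical mock EHR payload; field names are illustrative only.
EHR_JSON = """
[
  {"patient_id": "P-0001",
   "allergies": ["penicillin"],
   "medications": [{"name": "metformin", "dose_mg": 500}]}
]
"""

def load_patient(raw: str, patient_id: str) -> dict:
    """Look up one patient record in the JSON EHR mock."""
    records = json.loads(raw)
    by_id = {r["patient_id"]: r for r in records}
    return by_id[patient_id]

record = load_patient(EHR_JSON, "P-0001")
```

In the real system this lookup would sit behind a FastAPI endpoint so the inference pipeline can pull patient context on demand.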