MVP

VOID: Oceanographic Query Engine

An AI-powered ETL pipeline enabling natural-language interrogation of complex ARGO float datasets.

Tech Stack

Next.js 14, FastAPI, Python, Supabase, ChromaDB, GeoJSON

Overview

VOID is a retrieval-augmented generation (RAG) system that bridges the gap between oceanographers and raw data. It ingests complex NetCDF files into a queryable database, translates natural-language questions into executable SQL, and visualizes the results.
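
As a rough illustration of the retrieve-then-prompt idea behind this kind of translation, the sketch below picks the most similar example question/SQL pairs and assembles them into a prompt. It is not VOID's actual code: the example bank, the table and column names, and the helpers `top_examples` and `build_prompt` are all hypothetical, and a simple word-overlap score stands in for embedding search.

```python
# Hypothetical example bank; table and column names are illustrative only.
EXAMPLES = [
    {"question": "average temperature per float in 2020",
     "sql": "SELECT float_id, AVG(temperature) FROM measurements "
            "WHERE year = 2020 GROUP BY float_id;"},
    {"question": "maximum salinity at depth below 1000 m",
     "sql": "SELECT MAX(salinity) FROM measurements WHERE depth_m > 1000;"},
    {"question": "count profiles recorded last month",
     "sql": "SELECT COUNT(*) FROM profiles "
            "WHERE recorded_at >= NOW() - INTERVAL '1 month';"},
]


def _tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def top_examples(question, k=2):
    """Return the k examples most similar to the question.

    Jaccard word overlap is a stand-in here for embedding similarity.
    """
    q = _tokens(question)

    def score(example):
        t = _tokens(example["question"])
        return len(q & t) / len(q | t)

    return sorted(EXAMPLES, key=score, reverse=True)[:k]


def build_prompt(question, k=2):
    """Assemble a few-shot prompt: instruction, k examples, then the question."""
    parts = ["Translate the question into one PostgreSQL query."]
    for example in top_examples(question, k):
        parts.append(f"Q: {example['question']}\nSQL: {example['sql']}")
    parts.append(f"Q: {question}\nSQL:")
    return "\n\n".join(parts)
```

Calling `build_prompt("average salinity per float in 2021")` yields a prompt whose first example is the average-temperature query, since it shares the most words with the question; the model is then expected to complete the final `SQL:` line.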

Core Architecture

  • ETL Ingestion: Automated pipeline transforming ARGO NetCDF files into a star-schema PostgreSQL database.
  • RAG Engine: Uses ChromaDB embeddings and few-shot prompting to reduce hallucination and improve query accuracy.
  • Geospatial Intelligence: Resolves place references (e.g., "Near Mumbai") to coordinates and applies a Haversine distance filter for precise geographic selection.
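
The geospatial step can be sketched roughly as follows, assuming a place name has already been resolved to a latitude and longitude. The `PLACES` lookup, the `floats_near` helper, the record layout, and the 500 km default radius are assumptions for illustration, not details taken from VOID; only the Haversine formula itself is standard.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Hypothetical gazetteer mapping place names to (lat, lon) in degrees.
PLACES = {"mumbai": (19.076, 72.8777)}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def floats_near(place, floats, radius_km=500.0):
    """Keep the float records that lie within radius_km of the named place."""
    lat0, lon0 = PLACES[place.lower()]
    return [
        f for f in floats
        if haversine_km(lat0, lon0, f["lat"], f["lon"]) <= radius_km
    ]
```

In a SQL-backed system the same expression would more likely be embedded in the generated `WHERE` clause so the database does the filtering; the Python version is only meant to make the arithmetic explicit.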

Impact

Removed the need for researchers to write SQL by hand, letting them visualize temperature trends and salinity gradients directly from a natural-language question.