  • University of Oxford
  • Oxford, UK
  • X @mrfshk

MarShaikh/README.md

hey, i'm marouf 👋

research software engineer. i make ML work in prod, not just in notebooks.

currently building research platform infrastructure for biological data visualisation @ Oxford's Centre for Human Genetics, on the MDV platform. previously built MLOps + geospatial pipelines for epidemiological forecasting at LSHTM, worked on protein language models in biotech, and did a stint at a YC startup.

i like the space between research and production. most of my work involves turning messy scientific workflows into something deployable.

what i'm working on

  • ChatMDV — an LLM-powered interface for exploring biological data, with a focus on provenance and sandboxed execution
  • a distributed jobs framework for MDV with a pluggable executor abstraction (Local + Slurm backends), so analysis jobs can run on HPC as easily as locally
  • vLLM / local LLM inference across HPC and OpenStack environments
  • mdvtools — packaging and shipping it on PyPI
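
to make the "pluggable executor" idea above concrete, here is a minimal sketch of how such an abstraction could look. it is not the MDV implementation: `Job`, `Executor`, `LocalExecutor`, `SlurmExecutor`, and the `short` partition name are all hypothetical, chosen only for illustration. the Slurm backend defaults to a dry run that returns the `sbatch` command line instead of submitting it, so the sketch runs on a machine without Slurm.

```python
from __future__ import annotations

import shlex
import subprocess
import sys
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Job:
    """A unit of work: a name plus the command to run (hypothetical type)."""
    name: str
    command: list[str]


class Executor(ABC):
    """Hypothetical interface: each backend takes a Job and returns a string."""

    @abstractmethod
    def submit(self, job: Job) -> str:
        ...


class LocalExecutor(Executor):
    """Runs the job as a subprocess on this machine and returns its stdout."""

    def submit(self, job: Job) -> str:
        done = subprocess.run(
            job.command, capture_output=True, text=True, check=True
        )
        return done.stdout.strip()


class SlurmExecutor(Executor):
    """Wraps the job in an sbatch call; only submits when dry_run is False."""

    def __init__(self, partition: str = "short", dry_run: bool = True):
        self.partition = partition  # assumed partition name, site-specific
        self.dry_run = dry_run

    def build_command(self, job: Job) -> list[str]:
        return [
            "sbatch",
            f"--job-name={job.name}",
            f"--partition={self.partition}",
            f"--wrap={shlex.join(job.command)}",
        ]

    def submit(self, job: Job) -> str:
        cmd = self.build_command(job)
        if self.dry_run:
            # Return the command line instead of contacting the scheduler.
            return shlex.join(cmd)
        done = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return done.stdout.strip()


if __name__ == "__main__":
    job = Job("hello", [sys.executable, "-c", "print('hi')"])
    for backend in (LocalExecutor(), SlurmExecutor()):
        # Calling code is identical for both backends.
        print(type(backend).__name__, "->", backend.submit(job))
```

the calling code never branches on the backend, which is the point of the abstraction: swapping `LocalExecutor()` for `SlurmExecutor(dry_run=False)` is the only change needed to move a job onto a cluster in this sketch.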

things i reach for

Python PyTorch Docker Kubernetes Slurm Azure Terraform PostgreSQL GitHub Actions


LinkedIn Email

Pinned

  1. planetary-computer-pipelines (Public)

    Scalable pipeline for geospatial data processing: Direct ingestion from Microsoft Planetary Computer or batch processing of CHIRPS climate data with COG conversion, STAC metadata generation, and Az…

    Python

  2. stac-fastapi-pgstac-azure (Public)

    Shell

  3. epi-geo-chat (Public)

    A multi-agent chat interface for querying geospatial climate data.

    Python 3