Analytics Data Engineer – Health AI

Location: London, United Kingdom
Job Number: 1970324837140701-en-1
City: London
Category: Copilot Health
Country: United Kingdom
At Microsoft AI, we are inventing an AI Companion for everyone – an AI designed with real personality and emotional intelligence that’s always in your corner. Defined by effortless communication, extraordinary capabilities, and a new level of connection and support, we want Copilot to define the next wave of technology. This is a rare opportunity to be a part of a team crafting something that challenges everything we know about software and consumer products.
Our health team is on a mission to help millions of users better understand and proactively manage their health and wellbeing. We’re responsible for ensuring that Microsoft AI’s models and services are useful, trusted and safe across diverse customer health journeys.
We’re looking for a deeply technical and mission-driven Analytics Data Engineer to build the data foundations powering our health AI companion. You’ll architect, scale, and optimize the pipelines, datasets, and metrics frameworks that help us understand user behavior, evaluate model performance, and measure health impact. This role sits at the intersection of engineering, analytics, and applied AI—translating raw signals into insights that shape product decisions and ensure our systems are safe, effective, and grounded in evidence.
You’ll partner closely with product, model, and clinical teams to define data models, build robust ETL workflows, and enable a high-quality analytics environment that supports experimentation, evaluation, and decision-making at scale.
Key Responsibilities
  • Design, build, and maintain high-quality data pipelines and models that power analytics, dashboards, and product experimentation across health AI experiences
  • Develop and optimize scalable ELT/ETL processes to extract data from multiple structured and unstructured sources (including telemetry, model outputs, and healthcare data integrations)
  • Partner with product and clinical counterparts to define source-of-truth datasets and standardized metrics for user engagement, safety, and health outcome evaluation
  • Implement monitoring, validation, and alerting systems to ensure data reliability, lineage, and reproducibility across the analytics stack
  • Collaborate with ML engineers and model evaluation teams to operationalize evaluation pipelines—supporting automated scoring, HealthBench metrics, and experiment tracking
  • Define and maintain data schemas, transformation logic, and documentation to promote transparency and reusability across teams
  • Drive continuous improvement in data quality, discoverability, and observability
  • Contribute to shaping data infrastructure strategy and tooling to support next-generation health AI systems
Required Qualifications
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Data Science, or a related field, or equivalent experience
  • Experience working on consumer products at scale
  • Experience building and maintaining production-grade data pipelines, warehouses, and analytics platforms
  • Strong proficiency with SQL and modern data-stack technologies (e.g., dbt, Airflow, Databricks, BigQuery, Snowflake, Spark, or similar)
  • Experience designing efficient data models and ETL processes supporting analytical workloads and experimentation
  • Proven ability to translate ambiguous data needs into scalable engineering solutions
  • Familiarity with data governance, schema design, and principles of data privacy and compliance (HIPAA, de-identification, PHI handling)
  • Experience working with Python for data processing, analytics, or pipeline orchestration
Preferred Qualifications
  • Experience working in healthcare, digital health, or regulated data environments
  • Exposure to large language model (LLM) or generative AI systems, particularly in analytics or evaluation contexts
  • Strong understanding of experiment design, metrics definition, and instrumentation in AI-driven products
  • Familiarity with tools for workflow orchestration, version control, and CI/CD (e.g., Airflow, Dagster, GitHub Actions)
  • Comfort collaborating cross-functionally with product, analytics, and clinical teams in a fast-paced environment
  • Curiosity about how AI systems can responsibly improve access to care and health outcomes