Senior QA Engineer

Steer Health
10 hours ago
Full-time
Remote
Worldwide

About the job

About Us: Steer Health helps healthcare organizations improve patient access, reduce operational burden, and recover revenue through AI-native workflow automation. Our lead product, Luna AI, acts as a voice-based digital workforce, handling patient access workflows such as scheduling, intake, and follow-up. We sit on top of existing EHR infrastructure and focus on measurable operational outcomes.

About The Role: We are looking for a Senior QA Engineer who thrives at the intersection of AI, voice automation, and cloud-native systems. You will own quality across our platform, from testing LLM-powered features and voice pipelines to ensuring robust end-to-end coverage on GCP infrastructure. You will work closely with product, engineering, and AI teams to embed quality from the ground up.

Responsibilities

  • Design, build, and maintain automated test suites using Playwright for web and API surfaces, including AI-generated content flows.
  • Lead QA strategy for voice automation pipelines built on ElevenLabs — developing test cases for synthesis quality, latency, and failure modes.
  • Validate Claude (Anthropic) integrations: prompt-response accuracy, edge case handling, safety behaviors, and output consistency across builds.
  • Build and maintain Node.js-based test tooling, harnesses, and custom reporters for CI/CD pipelines.
  • Deploy, monitor, and triage test infrastructure on Google Cloud Platform — leveraging Cloud Run, GCS, and Pub/Sub for scalable test execution.
  • Define and track quality metrics: test coverage, flakiness rates, mean-time-to-detect, and regression velocity.
  • Collaborate with engineers during design reviews to surface testability gaps and advocate for observable, fault-tolerant system design.
  • Mentor junior QA engineers and establish team-wide standards for test authoring, review, and maintenance.

Required Qualifications

  • 5+ years of QA engineering experience, with at least 2 years on systems that include LLMs, AI APIs, or speech/audio pipelines.
  • Expert-level Playwright skills — authoring resilient selectors, managing parallel workers, and debugging flaky tests at scale.
  • Proficient Node.js developer — comfortable writing custom test runners, CLI tooling, and service mocks in TypeScript/JavaScript.
  • Hands-on GCP experience: deploying workloads to Cloud Run or GKE, querying logs in Cloud Logging, configuring artifact storage in GCS.
  • Familiarity with ElevenLabs or comparable TTS/voice APIs — understanding synthesis parameters, webhook flows, and audio quality evaluation.
  • Practical experience testing Claude or other LLMs — designing determinism-aware test strategies, evaluating prompt regressions, and building evals.
  • Strong understanding of REST, WebSocket, and gRPC protocols for API-level testing.
  • Experience integrating test suites into CI/CD pipelines (GitHub Actions, Cloud Build, or similar).

Nice to Have

  • Experience writing custom LLM evals or using evaluation frameworks such as PromptFoo or Braintrust.
  • Background in audio signal quality assessment or speech intelligibility testing.
  • Familiarity with observability tooling: OpenTelemetry, Datadog, or GCP Cloud Monitoring.
  • Knowledge of accessibility testing standards (WCAG 2.1) and assistive technology compatibility.

Core Technology Stack:

  • Google Cloud Platform (GCP): Cloud Run, GCS, Pub/Sub, Cloud Logging, and GKE for scalable test infrastructure
  • ElevenLabs Voice Automation: TTS pipeline testing, synthesis quality evaluation, webhook and latency validation
  • Node.js / TypeScript: Custom test runners, service mocks, CLI tooling, and CI/CD integration
  • Playwright: End-to-end and API-level browser automation with parallel execution
  • Claude (Anthropic): LLM integration QA, prompt regression testing, and output evaluation

Benefits

  • Competitive base salary commensurate with experience
  • High-autonomy environment with direct access to executive leadership
  • Structured operating cadence with clear goals, metrics, and career growth targets
  • Work that touches 19M+ patients — the mission is real

  • Flexible PTO policy