Machine Learning Infrastructure Engineer

remote · mid · $150K–$350K

via Ashby

About this role

We’re looking for seasoned ML Infrastructure engineers with experience designing, building and maintaining training and serving infrastructure for ML research.

Responsibilities:
- Provide infrastructure support to our ML research and product
- Build tooling to diagnose cluster issues and hardware failures
- Monitor deployments, manage experiments, and generally support our research
- Maximize GPU allocation and utilization for both serving and training

Requirements:
- 4+ years of experience supporting the infrastructure within an ML environment
- Experience in developing tools used to diagnose ML infrastructure problems and failures
- Experience with cloud platforms (e.g., Compute Engine, Kubernetes, Cloud Storage)
- Experience working with GPUs…

Read the full description on Character's site →

What we'd score you on

reqspace match rubric

Five dimensions, recruiter-grade. Upload your resume and we'll generate a written explanation of where you fit and where the gaps are.

1. Skills match

For this role: kubernetes, jax

2. Level fit

This role is mid-level. We check your trajectory against it.

3. Domain experience

Your work in the role's domain matters more than your years total. We weight recent and direct experience.

4. Recency

A skill you used last quarter weighs more than one from five years ago. We grade on recency, not lifetime.

5. Location fit

This role is remote-eligible, so we factor in your stated location and time-zone overlap.

Skills in this role

Pulled from the job description. These are the keywords we'll weight when scoring your fit.

kubernetes, jax