About Fundamental

Fundamental is an AI company pioneering the future of enterprise decision-making. Founded by DeepMind alumni, Fundamental has developed NEXUS, the world's most powerful Large Tabular Model (LTM), purpose-built for the structured records that actually drive enterprise decisions. Backed by world-class investors and trusted by Fortune 100 companies, Fundamental unlocks trillions of dollars of value by giving businesses the Power to Predict.

At Fundamental, you'll work on unprecedented technical challenges in foundation model development and build technology that transforms how the world's largest companies make decisions. This is your opportunity to be part of a category-defining company from the ground up. Join the team defining the future of enterprise AI.

Key responsibilities

- Design and implement cloud infrastructure from the ground up
- Build and maintain Kubernetes clusters optimized for GPU workloads and ML applications, as well as production SaaS hosting
- Implement GitOps practices using ArgoCD for continuous deployment
- Develop infrastructure as code using Terraform
- Create and maintain CI/CD pipelines for infrastructure and application deployment
- Implement monitoring and observability solutions for distributed systems
- Automate infrastructure management with Python and Bash
- Collaborate with ML engineers to optimize infrastructure for model training and serving
- Implement and maintain cost optimization strategies (FinOps) for cloud resources
- Monitor and optimize cloud spending, especially for GPU-intensive workloads

Must have

- 5+ years of experience in cloud infrastructure and DevOps
- 3+ years of experience with Python
- Strong experience with AWS and GCP cloud platforms
- Deep expertise in Kubernetes, including multi-cluster management, GPU workload optimization, resource scheduling and autoscaling, and network policies and security
- Experience with GitOps tools (ArgoCD preferred)
- Extensive experience with cloud networking, including VPC design, load balancer configuration, network security and segmentation, and cross-cloud networking solutions
- Strong CI/CD expertise, preferably with GitHub Actions
- Proficiency in infrastructure as code (Terraform)
- Experience with monitoring and observability tools
- Experience with FinOps practices and cloud cost optimization

Nice to have

- Experience with ML workflow tooling (MLflow, Kubeflow, or similar)
- Experience with FastAPI and backend applications
- Familiarity with data platforms like Databricks or Snowflake
- Exposure to SRE practices or cloud security certifications
- Hands-on experience with Prometheus, Grafana, or Datadog

Benefits

- Competitive compensation with salary and equity
- Comprehensive health coverage (medical, dental, and vision) and a 401(k)
- Fertility support, as well as paid parental leave for all new parents, inclusive of adoptive and surrogate journeys
- Relocation support for employees moving to join the team in one of our office locations
- A mission-driven, low-ego culture that values diversity of thought, ownership, and bias toward action