Head of AI/GPU Optimization Engineering

Densify - Canada

Job details

Full-time

Desired profile

Node.js
Kubernetes
Distributed systems
Artificial intelligence
Communication skills

Full job description

Role Overview

Kubex is seeking a Head of AI Optimization Engineering to lead the technical direction and hands-on development of our AI infrastructure optimization capabilities. This is a senior, hands-on technical leadership role reporting directly to the CTO.

You will act as a principal-level architect and engineer, owning the design and evolution of Kubex’s optimization solutions for Kubernetes-based environments running AI workloads, with a strong emphasis on GPU-accelerated inference. This role carries broad technical ownership and organizational influence, and we are looking for candidates interested in a position that offers both hands-on and people-leadership opportunities.

This role is ideal for someone who combines deep, practical experience with GPU infrastructure and Kubernetes with the ability to reason about system-level trade-offs, optimization strategies, and real-world customer environments, and who remains excited to write and ship production code.

Key Responsibilities

Technical Leadership & Architecture

Own the technical vision and architecture for Kubex’s AI infrastructure optimization capabilities, with a focus on Kubernetes-based environments running GPU-accelerated workloads.
Lead the design of systems that automate the optimization of resource configurations and allocations across containers, nodes, GPUs, and autoscaling groups.
Serve as a senior technical authority within the organization, guiding architectural decisions and influencing broader engineering strategy.

Hands-On Engineering

Contribute directly to production code, remaining deeply hands-on in the design, implementation, and evolution of core platform components.
Collaborate closely with other senior engineers to coordinate and execute complex software development initiatives.
Prototype, validate, and productionize new technical approaches related to AI workload optimization.

GPU & AI Infrastructure Expertise

Apply deep expertise in NVIDIA GPU ecosystems, including:
CUDA and GPU programming models
Tensor vs. non-tensor core trade-offs
Multi-Instance GPU (MIG) configurations and advanced GPU sharing strategies
Device plugins, telemetry, and instrumentation required to support optimization algorithms
Understand how customers deploy and operate AI workloads in production, from container configuration through node-level and cluster-level design.
Work with Kubernetes autoscaling technologies (e.g., native autoscaling, Karpenter, …) and understand their interaction with GPU-backed nodes.

Optimization & Innovation

Work with Kubex’s existing optimization frameworks and patented technologies, quickly building fluency and contributing to their evolution.
Collaborate with internal experts on optimization algorithms while bringing strong systems intuition and real-world constraints into solution design.
Identify opportunities to extend Kubex’s value beyond inference workloads, including potential future optimizations for training or hybrid workloads.

External & Cross-Functional Impact

Partner with Product Management to translate customer needs and market opportunities into actionable technical solutions.
Engage directly with customers on architecture and design discussions.
Represent Kubex externally through technical discussions, thought leadership, and industry engagement as appropriate.

Engineering Culture

Champion high standards for engineering quality, correctness, observability, and operational excellence.
Embrace and promote the use of AI-assisted development tools and workflows to accelerate software delivery and improve developer effectiveness.

Required Qualifications

10+ years of professional software engineering experience, including significant experience building complex production systems.
Deep, hands-on experience with GPU-accelerated infrastructure, particularly NVIDIA-based environments.
Strong knowledge of Kubernetes, including how GPU-backed workloads are scheduled, scaled, and operated in real-world clusters.
Practical experience with CUDA, GPU telemetry, and performance considerations for AI workloads.
Proven ability to design and build systems that balance performance, cost efficiency, and operational reliability.
Strong coding skills and a demonstrated commitment to remaining hands-on with production code.
Excellent communication skills, with the ability to explain complex technical concepts to both internal and external audiences.

Preferred Qualifications

Experience optimizing or operating large-scale AI inference platforms.
Familiarity with advanced GPU sharing strategies, including MIG, and their implications for scheduling and performance.
Exposure to optimization-based systems, scheduling, bin-packing, or resource allocation problems.
Experience working with autoscaling frameworks such as Kubernetes HPA/VPA or Karpenter.
Background in high-performance computing, large-scale distributed systems, or AI platforms at scale.
Experience mentoring or leading senior engineers, with interest in future people leadership.

Why Join Kubex?

Play a key role in shaping the future of AI infrastructure optimization.
Work on technically challenging problems at the intersection of Kubernetes, GPUs, and AI workloads.
Collaborate with a highly experienced, deeply technical team.
Influence product direction, architecture, and external technical positioning.
Flexible, remote-first culture focused on impact and innovation.
Competitive compensation, equity, and benefits.