About Qwerky AI
Qwerky AI is a human-centered artificial intelligence company focused on building practical and approachable AI tools for real-world use. Headquartered in Columbia, South Carolina, with a distributed team across the U.S., Qwerky AI is led by a founding team of tech entrepreneurs with over a decade of experience. The company is dedicated to creating AI that enhances, rather than replaces, human intelligence. Qwerky AI is currently developing an AI platform to empower knowledge workers, creatives, and small businesses.
Job Description
Qwerky AI seeks a highly experienced, visionary, and technically exceptional Senior Machine Learning Engineer (MLE) to spearhead critical initiatives within our Research & Development team. In this leadership role, you will architect, design, and implement highly scalable, robust, and cutting-edge machine learning systems and infrastructure. You will drive the technical strategy for MLOps and ML system design, mentor a team of talented engineers, and tackle our most complex engineering challenges in operationalizing AI. The ideal candidate is a recognized expert in ML engineering with a proven track record of delivering complex, high-impact ML systems from concept to production at scale.
Responsibilities
- Design, develop, and deploy mission-critical machine learning systems, platforms, and infrastructure, ensuring best-in-class reliability, scalability, and performance.
- Execute the organization's technical vision and strategy for MLOps practices, tools, and frameworks.
- Own and oversee the end-to-end lifecycle of complex ML systems, from requirements gathering and system design to implementation, testing, deployment, and long-term operational excellence.
- Provide technical leadership, mentorship, and guidance to machine learning engineers, fostering a culture of innovation, collaboration, and engineering excellence.
- Champion and enforce software engineering and MLOps best practices, including advanced CI/CD for ML, automated testing, infrastructure-as-code, comprehensive monitoring, and proactive incident response.
- Collaborate with data scientists to understand model intricacies and translate research prototypes into production-grade systems.
- Spearhead the optimization of machine learning models and inference pipelines for ultra-low latency, high throughput, and optimal resource utilization on various hardware platforms.
- Help lead the evaluation, selection, and integration of new technologies, tools, and methodologies to enhance our ML engineering capabilities.
- Drive initiatives to improve our ML infrastructure's scalability, reliability, and cost-effectiveness.
- Troubleshoot and resolve challenging issues in production ML systems, often requiring deep dives into complex, distributed environments.
Required Skills
- Bachelor's, Master's, or PhD in Computer Science, Software Engineering, a closely related technical field, or equivalent experience (10+ years).
- Extensive, proven experience (typically 5+ years, or 3+ years with a PhD) in machine learning engineering, software engineering with a focus on ML systems, or a similar role.
- Expert-level proficiency in Python and proficiency in at least one language relevant to high-performance systems (e.g., C++, Java, Go, Rust).
- Hands-on expertise in building and deploying complex machine learning models and systems into production environments.
- In-depth understanding of MLOps principles, tools, and platforms (e.g., MLflow, Kubeflow, TFX, Seldon Core, Docker, Kubernetes, CI/CD for ML, model registries, feature stores).
- Experience with major cloud platforms (e.g., AWS, Azure, GCP), including their advanced ML services, compute options, and infrastructure components.
- Expert understanding of the machine learning lifecycle, distributed systems, microservices, and data engineering principles.
- Demonstrated ability to deliver complex technical projects, mentor engineers, and execute technical strategy.
- Exceptional problem-solving, debugging, and system design skills, with the ability to deliver solutions to ambiguous and challenging requirements.
- Outstanding communication and interpersonal skills, with the ability to articulate complex technical designs and strategies to technical and executive audiences.
Bonus Skills
- Significant contributions to open-source MLOps, machine learning, or distributed systems projects.
- Expertise in designing and implementing solutions for real-time, low-latency ML inference at scale.
- Knowledge of specific hardware acceleration for ML (e.g., GPUs, TPUs, FPGAs) and experience with CUDA programming or similar.
- Experience building and managing large-scale data processing pipelines using technologies like Spark, Flink, Kafka, or Beam.
- Expertise in network programming, distributed consensus, or high-availability system design.
- Advanced knowledge of C++ for building and optimizing high-performance ML inference pipelines or system components.
- Experience with security best practices for ML systems and data.
- A track record of publications in top-tier engineering or ML systems conferences/journals.
Pay / Benefits
- Salary or hourly rate
- Stock option plan (we are a private company, so equity is not liquid)
- If you are in the USA: healthcare, dental, vision, 401k
- Unlimited time off policy
- Flexible working hours
Hiring Process
- Submit a resume to us for review.
- We will follow up with a technical screening (this will take approximately one hour).
- Following the successful completion of the technical screening, we will schedule an Onsite Interview.
- The Onsite Interview is a chance to meet more of the team and will consist of the following (total time: three hours):
  - Technical Screenings (1-2)
  - System Design
  - Behavioral Interview
- We'll reach out with an offer if you're a great fit.
- Once you accept, you start working with us!