About you
You are an engineer passionate about building reliable, observable, and secure AI systems at scale. You enjoy working at the intersection of product metrics, evaluation frameworks, and production-grade infrastructure. You are comfortable translating abstract KPIs into measurable signals, designing deployment safeguards, and ensuring AI systems behave safely, efficiently, and consistently in real-world environments.
You thrive in fast-paced, cloud-native environments, value automation and rigor, and enjoy collaborating with product, engineering, and AI teams to continuously improve system quality.
You bring to Applaudo the following competencies:
- Bachelor’s Degree in Computer Science, Software Engineering, Computer Engineering, or a related field, or equivalent professional experience.
- Strong experience with AI/ML evaluation, including metric definition, evaluation pipelines, golden datasets, and automated judge systems.
- Proficiency in observability and monitoring, including structured logging, tracing, and OpenTelemetry.
- Solid background in CI/CD automation and modern deployment strategies (canary, blue-green, gated deployments).
- Knowledge of AI safety practices, including PII scrubbing, deterministic guardrails, and secure handling of model inputs and outputs.
- Experience working with multi-agent systems and translating product KPIs into measurable agent performance metrics.
- Hands-on experience with AWS, including CDK, ECS/ECR, WAF, SES, Bedrock, CloudWatch, and DevOps Guru.
- Strong experience with Docker, Kubernetes, and cloud-native tooling.
- Familiarity with Azure for identity management, plus basic exposure to GCP environments.
- Strong analytical thinking, attention to detail, and problem-solving skills.
- Excellent communication skills to collaborate across product, platform, and AI teams.
- English proficiency for collaboration with global stakeholders.
You will be accountable for the following responsibilities:
- Translate product KPIs into measurable agent and system-level metrics for effectiveness, efficiency, robustness, and safety.
- Design and implement end-to-end observability using structured logging, metrics, and tracing with OpenTelemetry.
- Curate and maintain golden datasets and manage judge systems for scalable, repeatable AI evaluation.
- Implement evaluation-gated deployments within CI/CD pipelines.
- Orchestrate pre-merge and post-merge validation workflows to ensure quality before release.
- Apply canary and blue-green deployment strategies, enabling fast and safe rollbacks.
- Enforce layered security controls, including PII scrubbing, deterministic guardrails, and AI-based filtering of inputs and outputs.
- Monitor and analyze latency, error rates, token usage, and cost metrics across AI systems.
- Track production quality indicators such as correctness, relevance, and helpfulness.
- Convert failures, incidents, and negative feedback into automated regression tests.
- Manage multi-agent interoperability and coordination across AI components.
- Continuously update and adapt guardrails and safety controls as new risks and threats emerge.