Job description:
Data Skill Source is hiring an AI Agent Quality Assurance (QA) Team Lead to stand up and run a quality assurance workstream for a fast-growing B2B SaaS client in the intake-to-procurement space. This is a North America–based role leading a small, distributed QA team that tests and evaluates the client's customer-facing AI agents and multi-step workflows.
You'll own the QA workstream end to end: defining how the team tests, building the evaluation standards, managing day-to-day execution, and reporting quality signals back to the client's AI Operations team. This is a player-coach role—you'll set the standard by doing the work, then scale it through your team.
If you have strong QA or AI evaluation instincts and want to build a function rather than just execute tickets, this is for you.
What You'll Do
Workstream Ownership
- Stand up and own the Agent QA workstream: testing approach, evaluation frameworks, coverage priorities, and quality standards.
- Define what "good" looks like for the client's AI agents and translate loosely defined quality goals into reproducible test suites and metrics.
- Report quality signals, risks, and trends clearly to the client's AI Operations team.
Team Leadership
- Lead a distributed, Manila-based team of QA Engineers, coordinating across time zones with required overlap with Eastern Time (ET) working hours.
- Assign and prioritize work, review output, unblock the team, and maintain consistent quality across contributors.
- Onboard, coach, and level up team members; surface performance issues early.
Hands-On QA & Evaluation
- Design and run evaluation pipelines, adversarial/security testing, and workflow audits alongside the team.
- Validate agent accuracy, schema/API compliance, grounding, and edge-case handling across multi-step workflows.
- Drive root-cause analysis on failures and translate them into actionable feedback for the client's engineers.
What We're Looking For
Core Traits (Non-Negotiable)
- Ownership: You take a loosely defined mandate and turn it into a functioning, well-run workstream.
- Player-Coach Balance: You can do the technical work yourself and lead others doing it.
- Strong Communication: Direct, clear written and spoken English; you translate technical detail into signal for non-QA stakeholders.
- Judgment Under Ambiguity: You make reasonable calls when standards aren't fully defined, and know when to escalate.
Technical Skills
- Professional QA, software engineering, or AI/system evaluation experience, with hands-on backend testing, API validation, or eval-pipeline work.
- Experience evaluating LLM-based systems or AI agents (prompt behavior, grounding, adversarial testing) strongly preferred.
- Comfort with automated test suites, structured data validation, and schema/API testing.
- Prior experience leading, mentoring, or coordinating a team—formal or informal.
Nice to Have
- Experience working with distributed/offshore teams.
- Background in B2B SaaS, enterprise software, or procurement/fintech domains.
Note: Formal titles matter less than demonstrated ability. We care far more about what you've built and led than where you've worked.
What We Offer
- Consistent, long-term contract work with a stable, high-growth B2B SaaS client.
- 30+ hours/week, fully remote (Canada or US).
- MacBook Pro provided.
- The opportunity to build a QA function from the ground up and grow with a scaling managed-service program.
About Data Skill Source
Data Skill Source partners with fast-growing SaaS companies to deliver managed services, systems support, and applied automation. We focus on practical, production-ready technology—not demos or experiments that never ship.
How to Apply
Please submit your resume and a short note describing:
- QA or AI evaluation workstreams you've owned or built from scratch.
- Your experience leading or coordinating a team.
- Any hands-on work evaluating AI agents, LLM systems, or complex backend workflows.
- Links to GitHub repos, portfolios, or write-ups (if available).
Job Types: Full-time, Contract
Work Location: Remote (Canada or US)
Pay: From $30.00 per hour