Cookies & analytics consent
We serve candidates globally, so we only activate Google Tag Manager and other analytics after you opt in. This keeps us aligned with GDPR/UK DPA, ePrivacy, LGPD, and similar rules. Essential features still run without analytics cookies.
Read how we use data in our Privacy Policy and Terms of Service.
🤖 15+ AI Agents working for you. Find jobs, score and update resumes, cover letter, interview questions, missing keywords, and lots more.

Billennium • Poland
Role & seniority: AI Test Engineer with mid–senior level; 5+ years software testing, at least 1 year focused on genAI
Stack/tools: Python, pytest, Allure; AI testing tools/frameworks; GitHub Actions, GitLab CI/CD; AWS; Grafana; Jira/GitLab/GitHub; familiarity with LLMs, RAG systems; prompt-security concepts
Design and implement Generative AI/LLM testing strategy, including model response evaluation and RAG pipeline accuracy
Develop and maintain QA frameworks and automated testing for AI systems (integration, performance, AI-specific tests)
Identify edge cases, bias/fairness issues, security risks, and conduct comprehensive test documentation and risk-based testing
5+ years software testing; 1+ year genAI focus
Proficient in Python; strong testing framework experience (pytest, Allure)
Experience with AI testing tools, LLM/RAG knowledge, test automation, CI/CD pipelines
Cloud testing (AWS), monitoring/observability (Grafana), and ALM tools (Jira, GitHub, GitLab)
Security testing basics for AI applications; strong requirements analysis
Prior experience with Roche testing frameworks or similar AI QA frameworks
Deep understanding of AI failure modes, prompt injection risks, data leakage mitigation
Location & work type: Location not specified; work type not stated; global company with diverse, international team
Billennium is a global technology company with over 20 years of experience, committed to innovation and empowering businesses. As an employer, we offer a supportive, growth-focused environment where collaboration and creativity thrive. Join us to shape the future of technology together!
Generative AI Testing Strategy: Design and implement comprehensive testing strategies specifically tailored for LLM-based applications, including evaluation of model responses, RAG pipelines accuracy, and overall system reliability
Quality Assurance Framework Development: Utilize Roche's testing frameworks that address both traditional software quality aspects and AI-specific concerns such as output consistency, contextual accuracy, and ethical compliance; co-create and maintain such frameworks when required
Test Automation Development: Design and implement automated testing solutions for continuous evaluation of LLM applications, including integration tests, performance tests, and specialized AI behavior tests
Edge Case Analysis: Identify and develop test scenarios for edge cases in LLM behavior, including handling of ambiguous inputs, potential biases, and unexpected response patterns
Bias and Fairness Testing: Design and execute tests to identify potential biases in model outputs and ensure fair treatment across different user groups and use cases
Security Testing: Collaborate within development teams to test for potential vulnerabilities specific to LLM applications, including prompt injection, data leakage, and other AI-specific security concerns
Test Documentation: Create and maintain comprehensive test documentation testing strategy, test cases, and testing guidelines specific to AI applications and compliant with Roche practices; document the analysis of requirements and risks and develop tests accordingly
Performance Testing: Collaborate with other engineers to conduct thorough performance testing of GenAI applications, including response time analysis, load testing, and resource utilization monitoring
Experience: 5+ years of experience in software testing, with at least 1 year focused on genAI applications
Technical Skills: Strong proficiency in Python and testing frameworks (pytest, allure); experience with AI testing tools and frameworks
AI Knowledge: Understanding of LLM architectures, RAG systems, and common failure modes in AI applications
Test Automation: Experience with test automation frameworks and GitHub Actions / GitLab CI/CD pipelines; ability to design and implement automated testing solutions for AI applications Ability to analyze requirements, identify test scenarios, and prioritize testing activities based on risk and impact
Cloud Platforms: Practical experience with testing applications on cloud platforms (AWS) and working with cloud-based AI services
Monitoring Tools: Experience with monitoring and observability tools, log analysis, and performance metrics tracking (Grafana)
ALM tools: Experienced in Roche mandatory ALM tools (Jira, GitLab, GitHub)
Security Testing: Understanding of security testing methodologies, particularly in the context of AI applications Hold B.Sc., B.Eng., M.Sc., M.Eng. in Computer Science, Software Engineering, or equivalent degree with strong background in testing methodologie