EPAM Systems • Brazil
Role & seniority: Senior Data Quality Engineer (Mid-Senior level)
Stack/tools: Python; Hadoop ecosystem (HDFS, Hive, Spark); streaming (Kafka, Flume, Kinesis); NoSQL (Cassandra, MongoDB, HBase); SQL DBs (PostgreSQL, MSSQL, MySQL, Oracle); ETL tools (Talend, Informatica); MDM tools; data visualization (Tableau, Power BI, Tibco Spotfire); cloud (AWS, Azure, GCP); CI/CD (Jenkins, GitHub Actions); VCS (Git/GitLab/SVN); testing frameworks (TDD, DDT, BDT)
Lead end-to-end data quality strategy, testing, and reliability of data products; establish governance and compliant practices
Architect, scale, and automate data quality validation pipelines; optimize testing across complex pipelines and architectures
Collaborate cross-functionally, mentor junior engineers, and guide resource prioritization and documentation
3+ years in Data Quality Engineering
Advanced Python; strong SQL with high-volume/real-time workloads
Deep experience with Hadoop ecosystem, Spark, Kafka/streaming; NoSQL (Cassandra/MongoDB/HBase)
Experience with cloud platforms (AWS/Azure/GCP) and multi-cloud architectures
ETL design and troubleshooting (Talend, Informatica, or similar); MDM tools; JMeter for performance testing
Modern testing frameworks (TDD/DDT/BDT); CI/CD pipelines; version control
Strong problem-solving, analytical mindset, and English communication
Java/Scala or advanced Bash scripting
XPath expertise for data validation or transformation workflows
We are seeking an experienced and accomplished Senior Data Quality Engineer to join our team, driving the reliability, accuracy, and efficiency of our data systems and processes at scale. If you are passionate about shaping high-impact data quality initiatives and are eager to work with advanced technologies, this role will empower you to influence the future of our data landscape.

Responsibilities
Oversee end-to-end data quality strategy, ensuring rigorous testing and reliability of data products and processes
Drive data quality initiatives while instilling best practices across multiple teams and projects
Define and enforce advanced testing methodologies and frameworks to ensure enterprise-level data quality
Prioritize and manage complex data quality tasks, optimizing efficiency under tight timelines and competing demands
Architect and maintain robust testing strategies tailored to evolving system architectures and data pipelines
Advise on resource allocation, setting priorities for testing aligned with regulatory and business standards
Establish and continuously enhance a robust data quality governance framework, overseeing compliance with industry standards
Develop, scale, and refine automated data quality validation pipelines for production systems
Collaborate at a high level with cross-functional teams to resolve infrastructure challenges and optimize performance
Mentor junior engineers while maintaining comprehensive documentation, including versions of test strategies and advanced test plans

Requirements
3+ years of professional experience in Data Quality Engineering
Advanced programming skills in Python
Deep expertise in Big Data platforms, including Hadoop ecosystem tools (HDFS, Hive, Spark) and modern streaming platforms (Kafka, Flume, Kinesis)
Strong practical knowledge of NoSQL databases such as Cassandra, MongoDB, or HBase, with a track record of handling enterprise-scale datasets
Advanced skills in data visualization and analytics tools (e.g., Tableau, Power BI, Tibco Spotfire) to support decision-making
Extensive hands-on experience with cloud ecosystems like AWS, Azure, and GCP, including a strong understanding of complex multi-cloud architectures
Demonstrated expertise with relational databases and SQL (PostgreSQL, MSSQL, MySQL, Oracle) in high-volume, real-time environments
Proven ability to implement, troubleshoot, and scale ETL processes using tools like Talend, Informatica, or similar platforms
Experience deploying and integrating MDM tools into existing workflows, with knowledge of performance testing tools such as JMeter
Advanced experience in version control systems (Git, GitLab, or SVN) and automation/scripting for large-scale systems
Comprehensive knowledge of modern testing frameworks (TDD, DDT, BDT) and their application in data-focused environments
Familiarity with CI/CD practices, including implementation of pipelines using tools like Jenkins or GitHub Actions
Highly developed problem-solving abilities and an analytical mindset capable of interpreting complex datasets into actionable business outcomes
Exceptional verbal and written communication skills in English (B2 level or higher), paired with experience guiding discussions with stakeholders

Nice to have
Extensive hands-on experience with programming languages like Java, Scala, or advanced Bash scripting for production-level data solutions
Advanced knowledge of XPath and its applications in data validation or transformation workflows
Experience designing customized data generation tools and sophisticated synthetic data techniques for testing scenarios

We offer
International projects with top brands
Work with global teams of highly skilled, diverse peers
Healthcare benefits
Employee financial programs
Paid time off and sick leave
Upskilling, reskilling and certification courses
Unlimited access to the LinkedIn Learning library and 22,000+ courses
Global career opportunities
Volunteer and community involvement opportunities
EPAM Employee Groups
Award-winning culture recognized by Glassdoor, Newsweek and LinkedIn
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology, Engineering, and Quality Assurance
Industries: Software Development; Information Technology Services; Technology, Information and Internet