Go Digital Technology Consulting LLP • Michigan, United States
Role & seniority: ETL QA Analyst (seniority not specified)
Location: Michigan, United States; work type not specified (collaborative Agile environment implied)

Core skills
- AWS data services (S3, Redshift, Glue, EMR, etc.)
- SQL for data validation
- PySpark for transformation validation
- Python for automation
- Agile/Scrum environment; modern data engineering toolchains (version control, CI/CD, orchestration)

Key responsibilities (summary)
- Perform ETL testing and data validation across source, staging, transformation, and reporting layers
- Validate PySpark-based transformations and ensure data accuracy, completeness, and consistency
- Write complex SQL queries for transformations, reconciliations, and full/incremental loads
- Log and track defects; collaborate with data engineers and stakeholders

Required
- Solid hands-on ETL testing experience
- Advanced SQL for data validation and reconciliation
- Experience validating PySpark-based transformations
- AWS-based data environment experience (S3, Redshift, Glue, EMR, etc.)
- Data warehousing concepts; Python automation; Agile delivery experience

Nice to have
- Experience designing or contributing to automated data validation frameworks on AWS
- Exposure to modern data toolchains and cloud-native workflows
- Basic data modeling knowledge
Role Overview
We are seeking an ETL QA Analyst with strong hands-on experience in validating data pipelines within AWS-based cloud environments. The ideal candidate will have expertise in SQL-based data validation, exposure to PySpark transformations, and experience ensuring data integrity across source, staging, and reporting layers.
This role focuses on maintaining data accuracy, completeness, and consistency across enterprise data platforms and collaborating closely with data engineering teams in an Agile environment.
Key Responsibilities
ETL & Data Validation
Perform ETL testing for AWS-based data pipelines. Validate data movement across source, staging, transformation, and reporting layers.
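The posting doesn't include example checks, but layer-to-layer validation of this kind typically reconciles row counts, aggregates, and key coverage between adjacent layers. A minimal sketch, using an in-memory sqlite3 database as a stand-in for the warehouse (table and column names are illustrative, not from the posting):

```python
import sqlite3

# In-memory database standing in for the staging and reporting layers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staging_orders   (order_id INTEGER, amount REAL);
    CREATE TABLE reporting_orders (order_id INTEGER, amount REAL);
    INSERT INTO staging_orders   VALUES (1, 10.0), (2, 25.5), (3, 40.0);
    INSERT INTO reporting_orders VALUES (1, 10.0), (2, 25.5), (3, 40.0);
""")

def validate_layer(conn, source, target, key, measure):
    """Row-count and sum reconciliation between two pipeline layers."""
    src_count, src_sum = conn.execute(
        f"SELECT COUNT(*), SUM({measure}) FROM {source}").fetchone()
    tgt_count, tgt_sum = conn.execute(
        f"SELECT COUNT(*), SUM({measure}) FROM {target}").fetchone()
    # Completeness: rows present in source but missing from target.
    missing = conn.execute(
        f"SELECT COUNT(*) FROM {source} s LEFT JOIN {target} t "
        f"ON s.{key} = t.{key} WHERE t.{key} IS NULL").fetchone()[0]
    return {"count_match": src_count == tgt_count,
            "sum_match": src_sum == tgt_sum,
            "missing_rows": missing}

result = validate_layer(conn, "staging_orders", "reporting_orders",
                        key="order_id", measure="amount")
print(result)
```

Against Redshift or Glue-cataloged tables the same queries would run through the warehouse's own driver; the reconciliation logic is unchanged.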
Cloud & PySpark Validation
Validate data transformations executed using PySpark. Support testing of distributed data processing workflows in AWS environments. Ensure correctness and consistency of large datasets processed in cloud platforms.
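PySpark itself isn't shown here, so the sketch below expresses the same accuracy/completeness checks on plain Python rows; against a real pipeline the equivalent assertions would run on DataFrames (row counts plus a keyed comparison of expected versus actual output). The transformation and all field names are illustrative assumptions:

```python
def transform(row):
    # Reference implementation of the transformation under test:
    # uppercase the region code and derive a line total.
    return {"id": row["id"],
            "region": row["region"].upper(),
            "total": row["qty"] * row["price"]}

source = [
    {"id": 1, "region": "us", "qty": 2, "price": 5.0},
    {"id": 2, "region": "eu", "qty": 1, "price": 9.5},
]
# Rows read back from the pipeline's output layer (illustrative).
pipeline_output = [
    {"id": 1, "region": "US", "total": 10.0},
    {"id": 2, "region": "EU", "total": 9.5},
]

expected = [transform(r) for r in source]
# Completeness: every source row made it through.
assert len(pipeline_output) == len(expected)
# Accuracy/consistency: values match row by row, keyed by id.
by_id = {r["id"]: r for r in pipeline_output}
mismatches = [e for e in expected if by_id.get(e["id"]) != e]
print(f"mismatches: {len(mismatches)}")
```

On large cloud datasets one would sample or aggregate rather than compare row by row, but the shape of the check is the same.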
Automation & Toolchain Exposure
Use Python to automate repetitive data validation and reconciliation tasks. Maintain regression test cases as pipelines evolve. Work within modern data engineering toolchains (e.g., version control, CI/CD, orchestration tools) to support automated validation workflows.
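The posting doesn't prescribe a framework, so one lightweight pattern for the automation described above is a registry of named validation checks that a CI job runs on every pipeline change, exiting non-zero on failure. The check names and rules below are illustrative:

```python
# Minimal regression-check runner: each check is a named callable over
# the output rows, returning True (pass) or False (fail).
def check_no_null_keys(rows):
    return all(r.get("order_id") is not None for r in rows)

def check_positive_amounts(rows):
    return all(r["amount"] > 0 for r in rows)

CHECKS = {
    "no_null_keys": check_no_null_keys,
    "positive_amounts": check_positive_amounts,
}

def run_regression(rows):
    """Run every registered check and report pass/fail per check."""
    return {name: fn(rows) for name, fn in CHECKS.items()}

rows = [{"order_id": 1, "amount": 10.0},
        {"order_id": 2, "amount": 25.5}]
results = run_regression(rows)
failed = [name for name, ok in results.items() if not ok]
print("FAILED:" if failed else "PASSED", failed)
```

New checks are added by registering another function, which keeps the regression suite growing alongside the pipelines it guards.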
Collaboration & Reporting
Work closely with data engineers and business stakeholders to review transformation logic and requirements. Log, track, and verify resolution of data defects. Participate in Agile/Scrum ceremonies and provide QA updates.
Required Skills & Experience
- Strong hands-on ETL testing experience
- Advanced SQL skills for data validation and reconciliation
- Experience validating PySpark-based transformations
- Working experience in AWS-based data environments (S3, Redshift, Glue, EMR, etc.)
- Solid understanding of ETL concepts and data warehousing fundamentals
- Proficiency in Python for automation
- Experience working in Agile delivery environments
Preferred (Nice To Have)
- Experience designing or contributing to automated data validation frameworks on AWS
- Exposure to modern data toolchains and cloud-native workflows
- Basic understanding of data modeling concepts