Data Quality Engineer (M/F)

Permanent contract
Lyon
Salary: Not specified
Occasional remote

ORIS


The position

Job description

As a Data Quality Engineer, you will occupy a pivotal position within ORIS Materials Intelligence, an organization distinguished by its innovative, data- and AI-driven culture. Your mandate will be to quality-assure our datasets and maintain high-quality data that supports informed decision-making and drives business success. By implementing data quality assessment strategies and identifying and fixing inaccuracies or inconsistencies in datasets, you will ensure the completeness, accuracy, consistency, integrity, conformity and timeliness of data used within the organization.

You’ll collaborate with diverse, multicultural and international teams to define project scopes and implement solutions, all while using agile principles and the Kanban framework.

This is a permanent position, with ORIS offices conveniently situated near Lyon’s Part Dieu train station, English as the working language and a fantastic team to work alongside!

Data Quality tests

  • Execute manual validation tests to identify duplicates and issues related to data accuracy, consistency or completeness, and fix these data quality issues by correcting or completing the datasets

  • Create and execute end-to-end automated tests using modern tools, ensuring accurate simulation of data pipeline behavior

  • Design and implement automated tests to validate data accuracy, timeliness, consistency, conformity, completeness and integrity

Data Preparation

  • Test data management: create, maintain, and manage test data, such as images and structured data, so that it supports testing activities effectively, with balanced, cleaned, and consistent datasets

  • Label geospatial images using dedicated tools

  • Search for and identify alternative digital data sources and accurate datasets, manually or using pipelines provided by the team

Data test strategies and test cases

  • Design and develop test cases to validate the data. This includes creating scenarios that cover all possible data conditions and edge cases

  • Ensure high-quality standards across all stages of the data lifecycle, from data collection and storage to processing and analysis

  • Define rules and requirements for datasets, as well as combined indicators to measure data quality

  • Use and propose appropriate data quality tools to monitor, manage, and improve the quality of the data, using features like data observability, automated data lineage, and key performance indicators (KPIs)

Collaboration

  • Document and report any issues or discrepancies found during the testing process. This includes providing detailed reports on the testing results and any recommended solutions

  • Support a team of engineers in adhering to quality best practices, assisting with specification clarifications

  • Collaborate effectively with cross-functional teams to ensure data-related developments are in sync with predefined data quality requirements

Proactive Incident Management

  • Use issue tracking and project management software such as JIRA or similar platforms to accurately report, track, and manage defects throughout the testing lifecycle

  • Use monitoring tools such as OpenSearch or Datadog to proactively detect and manage incidents, ensuring rapid response times and minimal disruption to service availability

  • Configure and manage alerting in observability systems such as Datadog or BI tools such as Tableau, to automate notifications of potential issues before they escalate into critical problems, enabling preemptive action

Continuous Learning

  • Keep up-to-date with the latest trends and best practices in test automation, data quality testing, observability and monitoring

  • Participate in internal training sessions, workshops, and code reviews


Preferred experience

Experience

2+ years’ experience in data quality analysis and testing, including test automation and manual data validation, would be highly appreciated

Diploma

Bachelor’s degree or equivalent in Computer Science

Required Expertise

  • Experience with the data quality pillars: accuracy, timeliness, consistency, conformity, completeness and integrity

  • Experience with automation tools like Selenium, Puppeteer, or similar

  • Familiarity with observability tools like Datadog or OpenSearch

  • Familiarity with agile development methodologies (Kanban or Scrum)

  • Proficiency in Python

  • Strong knowledge of Git, and comfortable working with branches, merging code, and resolving conflicts

  • Strong experience with GitLab CI/CD pipelines

  • Familiarity with AWS services

  • Familiarity with Docker and Kubernetes

  • Familiarity with using API endpoints

  • Familiarity with Jupyter Notebooks

Nice to have skills

  • Familiarity with database systems (relational and NoSQL) like PostgreSQL, MongoDB and Redis, and with data lakes (AWS S3)

  • Experience as a data developer with Python is a big plus

  • Experience with workflow automation leveraging AI or low-code solutions

  • Familiarity with computer vision and labeling tools

Important Soft Skills

  • Good communication skills for collaborating with developers, conveying technical concepts effectively, and discussing business requirements with non-technical colleagues

  • Ability to prioritize and delegate tasks to ensure efficient project execution

  • Willingness to learn new tools and technologies as they emerge

  • Proficient in conducting code reviews and providing constructive feedback to maintain code quality

  • Skilled in fostering a positive and collaborative team environment

  • Strong organizational skills to keep track of multiple tasks and ensure smooth coordination with external developers

  • Capacity to handle project documentation and ensure its accuracy and completeness

  • Proficient in presenting technical concepts and ideas to both technical and non-technical stakeholders
