We are seeking a skilled Data/Database Engineer to join our team and take ownership of managing, optimizing, and monitoring our data infrastructure. The ideal candidate will have experience working with both SQL and NoSQL databases, building and maintaining ETL pipelines, and ensuring seamless data operations. This role requires expertise in cloud platforms (preferably Google Cloud), containerization, and continuous integration/continuous deployment (CI/CD) practices. The engineer will also ensure data security and compliance with regulations such as GDPR/CCPA.
Your responsibilities include:
1/ Data Pipeline Management:
Design, implement, and maintain scalable ETL processes using PySpark and SparkNLP.
Manage data pipelines using GCP Workflows for scheduling and orchestrating jobs.
Ensure seamless integration and management of data systems to maintain continuous operation.
2/ Database Management:
a) SQL Databases:
Manage and optimize PostgreSQL databases for transactional and relational workloads.
Regularly optimize queries and indexes to ensure high-performance operations.
Implement automated backup and recovery solutions for PostgreSQL to prevent data loss.
b) NoSQL Databases:
Manage and optimize large-scale NoSQL datasets using Delta Lake.
Ensure NoSQL infrastructure scalability to handle increasing data volumes.
3/ Infrastructure & Deployment:
Deploy data applications on cloud platforms like Google Cloud.
Utilize Docker for containerized environments and ensure consistency across development, testing, and production environments.
Leverage GCP services for deployment, scaling, and monitoring of data applications.
Set up and manage CI/CD pipelines using GitHub Actions to automate testing, deployment, and version control.
4/ Monitoring & Performance Optimization:
Monitor data processing systems for latency, throughput, and error rates to ensure optimal performance.
Ensure data quality by regularly checking for consistency, completeness, and accuracy across databases and pipelines.
Implement centralized logging using Google Cloud Logging to aggregate logs from multiple sources.
5/ Security & Compliance:
Ensure the encryption of data both at rest and in transit.
Implement role-based access control (RBAC) to secure data and model endpoints.
Maintain compliance with regulations such as GDPR and CCPA, including detailed audit logging for model training and data access.
6/ Documentation & Communication:
Document API endpoints and data pipelines using tools like Swagger for ease of maintenance and onboarding.
Provide data flow diagrams, ETL process documentation, and data schema explanations.
Set up alerts using Google Cloud Monitoring and Slack for real-time issue notifications.
Generate and share performance reports to keep stakeholders informed and facilitate data-driven decision-making.
Required Skills & Qualifications:
Minimum 3 years of experience with both SQL (PostgreSQL) and NoSQL (Delta Lake, Firestore, MongoDB) databases.
Experienced in Python and GCP; AWS experience is a plus.
Proficient in PySpark, SparkNLP, and data pipeline orchestration tools (e.g., GCP Workflows).
Expertise in containerization (Docker) and CI/CD pipelines (GitHub Actions).
Knowledge of performance metrics (latency, throughput, error rates) and data quality checks (consistency, completeness, accuracy).
Understanding of data encryption, access control (RBAC), and compliance with GDPR/CCPA.
Experience with API development (REST/GraphQL) and ML pipeline integration.
Strong scripting skills (Python/Bash) and experience with automation tools (Terraform, Ansible).
Familiarity with monitoring tools (Prometheus, Grafana, ELK stack) and big data frameworks.
Excellent communication skills and the ability to document and report on technical processes.
D&M believes diversity drives innovation and is committed to creating an inclusive environment for all employees. We welcome candidates of all backgrounds, genders, and abilities to apply. Even if you don’t meet every requirement, if you’re excited about the role, we encourage you to go for it—you could be exactly who we need to help us create something amazing together!
Recruitment process
30 mins chat with Julie, Talent Acquisition Specialist
45 mins chat with Omar, Lead Data Engineer
45 mins chat with Harsha, Engineering Lead
What you’ll love about working at Descartes & Mauss
Flexible and hybrid work arrangements
Work from home (or wherever you perform best) up to 3 days per week
Join your colleagues in the office on Wednesdays & Thursdays at our WeWork space in Paris 9ème
Access to WeWork offices worldwide
Extended full remote periods during summer and the end of the year, allowing you to spend more time with loved ones
An international, diverse, and inclusive work environment
18+ nationalities represented (and growing!)
A 50-50 gender balance, ensuring equal opportunities for all
A welcoming culture that embraces new perspectives and backgrounds
A workplace that prioritizes well-being and a strong team spirit
Regular team-building activities, after-work drinks, and fun social events
An international company offsite every year - our last one was in Tuscany in 2024!
Benefits and perks
50% of Navigo transportation costs reimbursed
Swile card for meals, with a daily allowance of €16, 55% covered by us
An extra day off on your birthday, in addition to 25 days off + RTTs 🎊
Comprehensive mutuelle (supplementary health insurance)
Continuous learning and career growth:
A culture that encourages professional and personal development
Buddy program for all newcomers, ensuring a smooth integration with guidance from experienced team members
Access to training resources and mentorship from experienced peers and managers
3 company-wide workshops per year, some featuring expert speakers!