The Mission
As a Senior Data Engineer at ThreatMark, your primary mission will be to architect and maintain robust data infrastructure and pipelines that support our data-driven decision-making and advanced analytics. You will collaborate closely with our data analysts, data scientists, and engineering teams to ensure our data is accurate, accessible, and meaningful, ultimately enhancing the quality of our products. Your work will enable ThreatMark to achieve its business objectives with confidence, backed by reliable and insightful data analysis.
General
Seniority: Senior (5+ years of experience)
Employment Type: Full-time, Employee or Contractor
Place of work: Offices in Brno, Bratislava, or Prague; fully remote possible
Responsibilities
In this role, you will:
Data Lake Development:
Build and maintain infrastructure for storing structured and semi-structured multitenant data in the Data Lake.
Develop and maintain the configuration of Data Lake-related AWS infrastructure and services.
Develop and automate data ingestion, ETL processes, and maintenance jobs.
Create a layer of consolidated data to be used for data analysis and reporting.
Data Quality and Integrity:
Ensure high levels of data quality and integrity across all data sources and pipelines.
Implement monitoring and alerting mechanisms to detect and address data issues promptly.
Performance Optimization:
Optimize data processing workflows for performance and efficiency.
Address bottlenecks and ensure data pipelines can scale with increasing data volumes.
Data Security and Compliance:
Ensure compliance with relevant data protection regulations and standards.
Implement data security measures to safeguard sensitive information.
Collaboration and Support:
Work closely with data analysts and data scientists to understand their data needs and ensure data availability.
Provide support and guidance on best practices for writing ETL and data engineering tasks within the AWS environment and the technologies we use.
Support MLOps processes.
Automate data reports.
Participate in data analysis and research (welcome, not required).
Contribute to product solution design and development outside the Data Lake scope.
Qualifications
Must have:
Proven experience developing data storage solutions (5+ years).
Knowledge of and experience with:
SQL, Python, PySpark, Iceberg, Terraform, AWS services (IAM, S3, Glue, Lambda, EMR, …) or their equivalents in Azure or GCP, Git, GitLab CI/CD
Relational, column-oriented, and NoSQL database design
Strong proficiency in building and maintaining ETL and data pipelines.
Ability to communicate effectively in English.
What We Value
Ownership: A strong ability to take ownership and move towards shared goals without supervision.
Collaboration: A positive, can-do attitude with a no-excuses startup mindset, and clear, honest, and timely communication.
Innovation: A passion for learning new skills and technologies, seeking improvement, staying open to new ideas, and making data-driven decisions.
Adaptability: Thriving in a fast-paced and evolving environment, being flexible and ready to take on new challenges.
At ThreatMark, we value diversity and are committed to creating an inclusive environment for all employees. If you are passionate about data analysis and eager to contribute to a team that is making a significant impact in the cybersecurity landscape, we encourage you to apply. Please submit your resume and a brief cover letter explaining your interest in the role and how your skills align with our mission.