Staff Site Reliability Engineering (SRE)

Permanent contract (CDI)
Paris
Salary: Not specified
Frequent remote work

Alma


The position

Job description

About the job

Alma shapes the fintech landscape. We strive to serve and empower consumers and merchants by developing innovative solutions that redefine their purchase experience.

About the mission

  • Organize and prioritize SRE roadmaps to ensure that the infrastructure is aligned with customer needs (internal and external).
  • Lead cross-functional initiatives within the product teams.
  • Regularly interact with stakeholders and senior management, ensuring alignment and effective communication on key initiatives.
  • Promote automation and SRE best practices to optimize operational efficiency.
  • Develop and maintain backup and disaster recovery strategies to protect data and ensure business continuity.
  • Design, implement and maintain monitoring tools to track key system metrics, health indicators and our SLAs/SLOs.
  • Provide technical support and expertise to engineering teams for the resolution of application and infrastructure incidents.
  • Carry out in-depth analyses of incidents to identify the underlying causes and put corrective measures in place.
  • Maintain the platform in operational condition by implementing updates, security patches and continuous improvements.
  • Participate in the optimization of the operating costs of the platform.
  • Support and guide SREs through knowledge-sharing and collaboration, fostering continuous improvement across the team.

About you 

  • At least 8 years of experience managing cloud infrastructure.
  • Experience in project management, enabling you to oversee and drive initiatives from planning to successful delivery.
  • Strong presentation and communication skills to collaborate with different teams and share problems and solutions effectively.
  • Deep knowledge of Google Cloud Platform or other cloud providers.
  • Good networking knowledge.
  • Experience in setting up and maintaining monitoring tools, and in analyzing metrics and malfunctions.
  • Hands-on experience with Infrastructure as Code.
  • Ability to solve problems methodically and work effectively under pressure during critical incidents.
  • Working proficiency in English.

Our technical stack

  • Cloud providers: GCP, Cloudflare, AWS
  • Backend: Python + FastAPI and Flask
  • Frontend: React / TypeScript
  • Database technologies: PostgreSQL, Redis, BigQuery
  • Log and error management: Datadog, Sentry
  • CI/CD: GitHub Actions, Docker
  • Monitoring: Datadog
  • Infrastructure as Code: Terraform

About the recruitment process

  • Interview with Talent Acquisition (30-45 min)
  • Interview with Engineering Manager (45-60 min)
  • Take-home coding test, followed by a remote feedback session and a system design test (90 min)
  • Team Fit interview (30 min)

 
