Site Reliability Engineer (SRE) Internship

Work study (12 to 36 months)
Paris
Salary: Not specified
Starting date: June 30, 2025
Fully remote
Experience: < 6 months
Education: High School Diploma

Popsink

The position

Job description

Popsink is a cutting-edge data transfer solution revolutionizing how organizations handle and move their data. Our mission is to provide seamless, secure, and efficient data transfer capabilities for businesses of all sizes. As a fast-growing startup, we are seeking a passionate Site Reliability Engineer (SRE) Intern to join our fully remote team and help us build a highly reliable, scalable, and efficient infrastructure.


Preferred experience

As an SRE Intern at Popsink, you will play a critical role in ensuring the reliability, scalability, and security of our infrastructure. You will collaborate with developers, product teams, and other engineers to design and implement robust systems and processes that power our stack, which includes Google Cloud Platform (GCP), Kubernetes, ArgoCD, and Terraform. Additionally, you will drive our monitoring and tracing strategies to ensure deep visibility into system health and performance.


Recruitment process

Key Responsibilities

  • Infrastructure Management:

    • Design, build, and manage cloud infrastructure on Google Cloud Platform (GCP).

    • Automate infrastructure provisioning and deployments using Terraform.

  • Orchestration & Automation:

    • Manage and optimize Kubernetes clusters for containerized application deployment and scaling.

    • Implement GitOps workflows using ArgoCD to ensure seamless application updates.

  • Monitoring, Tracing, & Performance:

    • Develop and maintain comprehensive monitoring and tracing solutions to track system health and performance.

    • Configure and utilize tools like Prometheus, Grafana, Jaeger, or similar systems for observability.

    • Proactively identify bottlenecks and optimize system performance based on metrics and logs.

  • Reliability Engineering:

    • Define and maintain SLOs, SLAs, and SLIs to ensure system reliability.

    • Lead post-incident reviews and implement preventive measures to enhance system resilience.

  • Collaboration:

    • Partner with development teams to implement CI/CD pipelines and enforce best practices.

    • Foster a culture of operational excellence, automation, and continuous improvement across the team.


Required Qualifications

  • Problem Solving:

    • Proven ability to troubleshoot complex distributed systems in production environments.

    • Experience with incident management and root cause analysis processes.

  • Soft Skills:

    • Strong communication and collaboration skills, with a proactive mindset.

    • Comfort working in a fast-paced startup environment.


Why Join Popsink?

  • Impact: Be part of a startup revolutionizing data transfer solutions.

  • Growth: Join a fast-paced environment with ample opportunities for career development.

  • Culture: Work with a collaborative, innovative, and supportive team.

  • Flexibility: Enjoy a fully remote work environment that supports work-life balance.