Are you passionate about maintaining robust and high-performing infrastructures? Do you thrive in managing complex network environments and ensuring system reliability?
Join our infrastructure team and help us elevate operational excellence to new heights.
As a Site Reliability Engineer at Flowdesk, you will be at the heart of our infrastructure operations, ensuring that our global high-frequency trading platform runs smoothly and efficiently.
Reporting to Flowdesk's Lead of Infrastructure and collaborating closely with the Engineering, Trading, and Data teams, your role is crucial in maintaining and enhancing our systems' reliability and performance.
Your mission, together with the other members of the infrastructure team, will be to:
- Monitor and optimize our network infrastructure to ensure peak performance and minimal downtime.
- Implement and manage robust monitoring solutions (using tools like Prometheus and Grafana) that proactively detect and resolve issues.
- Maintain critical infrastructures, ensuring reliability, security, scalability, and performance of the company’s essential systems, such as NATS, the ultra-low latency networking stack, cloud infrastructures, data pipelines, and more.
- Develop the in-house MPC (Multi-party Computation) infrastructure to ensure reliability and performance.
- Collaborate with the team to design and refine disaster recovery plans that minimize risk and ensure rapid recovery during incidents.
- Engage with Flowdesk's teams to understand their needs and provide technical support and solutions that enhance their operational capabilities.
- Propose and implement system improvements and innovative solutions to enhance performance and reliability.
Have a look at our stack here: stackshare.io/flowdesk/flowdesk
Requirements
Background and experience
- Fluency in English (French is a plus).
- Proven track record in managing and securing high-availability systems in highly sensitive environments.
- Experience with network and system monitoring tools like Prometheus, Grafana, and similar technologies.
- Strong background in implementing and managing incident response strategies and disaster recovery plans.
- Skilled in system automation and scripting (Python, Bash), with a solid understanding of Kubernetes and cloud services (AWS, GCP).
- Familiarity with DevOps methodologies and tools for CI/CD (FluxCD, GitHub Actions).
- Excellent problem-solving skills, with the ability to communicate complex technical information to non-technical stakeholders.
- Organized and methodical approach to work, with the capability to prioritize urgent tasks and meet deadlines.
- A strong interest in continuous learning and applying new technologies and concepts to improve system performance and reliability.
Join our team and contribute to a resilient and cutting-edge trading infrastructure that supports Flowdesk's growth and innovation in the crypto market!
Benefits
- International environment (English is the main language)
- 50% of transportation costs & a sustainable mobility agreement
- Swile lunch voucher (€9.25 per day, 60% covered)
- 100% Alan Blue covered for you and your children
- Gymlib contribution to gym membership
- Top-of-the-range equipment: MacBook, keyboard, laptop stand, 4K monitor & headphones
- Team events and offsites
- Coming soon: international mobility & lots of other cool benefits!
Are you interested in this job but feel you haven't ticked all the boxes? Don't hesitate to apply and tell us in the cover letter section why we should meet!
Here's what you can expect if you apply:
- HR Call with the Tech Talent Acquisition (30')
- Technical round with the Lead of the Infrastructure team (60')
- Technical interview with the Head of the Infrastructure department (60')
- Culture fit with the Lead Talent Acquisition (30')
On the agenda, discussions rather than trick questions! These moments of exchange will allow you to understand how Flowdesk works and its values. But they are also (and above all) an opportunity for you to present your career path and your expectations for your next job!