© 2026 Pipeline. All rights reserved.

Aylo Health

SRE Developer at Aylo Health

Montréal, Quebec | Full-time | Engineering | Posted 29 days ago

About the Role

Established in 2004, we are a tech pioneer offering world-class adult entertainment and games on some of the internet’s safest and most popular platforms. With the support of an international team of dynamic and collaborative innovators, we are on a mission to enable safe user experiences and empower our communities by celebrating diversity, inclusion, and expression, all while maintaining robust trust-and-safety protocols.

We embrace the best of both worlds! Local talent can thrive in our collaborative office space with the flexibility of a hybrid work environment, while remote team members play an integral role in shaping our dynamic culture from afar. We have offices in Montreal (Quebec), Austin (Texas) and Nicosia (Cyprus).

*A select number of positions require full-time in-office attendance*

We are seeking a highly skilled Site Reliability Engineer (SRE) to support and enhance the reliability, scalability, and performance of our production systems. You will play a key role in incident response, root cause analysis, and continuous improvement of operational processes while leveraging cutting-edge tooling and AI-assisted solutions.

What you’ll be doing:

  • Own the reliability, availability, and performance of production systems in a containerized, microservices-based environment
  • Monitor system health using Grafana dashboards, alerts, and observability tools; proactively identify and resolve issues
  • Manage and operate Kubernetes clusters (via Rancher), including deployments, scaling, and troubleshooting
  • Lead and participate in incident management using OpsGenie, including on-call rotations, escalations, and post-incident reviews
  • Troubleshoot issues across application, infrastructure, messaging, database, and container layers
  • Build and maintain automation scripts and tools using Bash, Go, and/or Python to improve operational efficiency
  • Support and optimize CI/CD pipelines using GitLab, ensuring smooth deployment and release processes
  • Collaborate with development teams to improve application reliability, performance, and observability
  • Work with databases and data systems (MySQL, Redis) for performance monitoring and issue resolution
  • Support distributed messaging systems such as Kafka and RabbitMQ
  • Contribute to and maintain operational documentation, runbooks, and knowledge bases using Jira and Confluence
  • Perform root cause analysis (RCA) and implement preventative measures
  • Ensure systems operate in alignment with security, compliance, and data privacy standards
  • Leverage AI-powered engineering tools to accelerate troubleshooting, documentation, and workflows

What you need to be successful:

Must Haves:

  • 3+ years of experience in Site Reliability Engineering, DevOps, Production Support, or Systems Engineering
  • Bachelor’s degree in computer science or a related field
  • Hands-on experience with Grafana, Kubernetes, and Docker
  • Experience with OpsGenie for incident management and on-call coordination
  • Strong experience with GitLab/Git, including CI/CD pipelines and release processes
  • Proficiency with Atlassian tools (Jira, Confluence) for tracking and documentation
  • Solid knowledge of MySQL
  • Experience with Kafka and/or RabbitMQ
  • Familiarity with Redis for caching and performance optimization
  • Working knowledge of Temporal or similar workflow orchestration tools
  • Strong scripting skills in Bash
  • Proficiency in Go and/or Python for automation and tooling
  • Familiarity with PHP applications (Symfony, Laravel) for production support
  • Proven ability to troubleshoot complex systems across multiple layers
  • Excellent documentation habits (runbooks, playbooks, system diagrams)

Nice to Have:

  • Knowledge of FTC data protection principles
  • Understanding of NIST frameworks and security best practices
  • Familiarity with GDPR requirements (data handling, logging, retention, privacy)

As an equal opportunity employer, we celebrate diversity and are committed to creating an inclusive environment for all employees

In this role you may be exposed to adult content

Related Roles

  • Software Test Analyst

    Aylo Health

    Athens, Greece
  • Software Tester (Manual)

    Aylo Health

    Athens, Greece
  • Software Tester (Manual)

    Aylo Health

    Nicosia, Cyprus
  • Software Tester (Manual)

    Aylo Health

    Thessaloniki, Greece
  • GO - Senior Software Developer

    Aylo Health

    Montréal, Quebec
  • Software Test Analyst

    Aylo Health

    Nicosia, Cyprus