Engineering Manager, HADR at Stripe

US Remote | Full-time | Remote | 8126 Developer Infrastructure | Posted 14 days ago

About the Role

<h1><strong>Who we are</strong></h1>
<h3><strong>About Stripe</strong></h3>
<p>Stripe is a financial infrastructure platform for businesses. Millions of companies, from the world’s largest enterprises to the most ambitious startups, use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.</p>
<h3><strong>About the team</strong></h3>
<p>In this role, you will join the High Availability and Disaster Recovery (HADR) team. At Stripe, availability is a core feature of our products. This team designs and builds new solutions that allow latency-critical, stateful applications to survive any type of disaster. We build distributed systems on top of unreliable architecture to provide highly available and resilient customer solutions. This team is creating greenfield solutions that will serve as the basis for Stripe’s architecture 5, 10, or 20 years into the future.</p>
<p>This is a distributed team with many remote engineers. You are encouraged to apply if you meet the minimum requirements and are able to work from anywhere in the United States or Canada.</p>
<h2><strong>What you’ll do</strong></h2>
<p>You will help develop our global architecture by combining less-available components and data centers into a highly available and resilient whole. You will work on latency-critical solutions where every millisecond matters and data redundancy is a hard requirement. You will learn quickly and work on a broad range of problems: one day you may be investigating MongoDB write concerns, the next minimizing cross-region TLS handshakes, and after that developing new systems to automate disaster detection and failover.
Your work will enable Stripe to increase the GDP of the internet by providing uptime and data protection that have historically been impossible.</p>
<h3><strong>Responsibilities</strong></h3>
<ul>
<li>Lead and manage a team of talented engineers, providing mentorship, guidance, and support to ensure their success.</li>
<li>Drive the execution of projects, overseeing the entire development lifecycle from planning to delivery, while maintaining high standards of quality and timely completion.</li>
<li>Influence peers and managers, and build consensus while dealing with ambiguity.</li>
<li>Build your team: formalize role definitions, define the charter and ownership boundaries, and turn a newly formed team into a high-functioning one.</li>
</ul>
<h1><strong>Who you are</strong></h1>
<p>We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.</p>
<h3><strong>Minimum requirements</strong></h3>
<ul>
<li><strong>5+ years of engineering management experience</strong>, with a proven track record of managing and growing teams of 10+ engineers.</li>
<li><strong>Deep domain expertise</strong> in cloud development or a strong background in building sophisticated infrastructure systems.</li>
<li><strong>Exceptional communication skills</strong>, with the ability to distill complex technical concepts into clear strategic frameworks for peers and executives.</li>
<li><strong>Experience in rapid-growth environments</strong>, specifically a history of successful, high-volume hiring and team scaling.</li>
<li><strong>Technical proficiency</strong> to engage deeply with Staff-level engineers on system architecture, API design, and AI model integration.</li>
</ul>
<h3><strong>Preferred qualifications</strong></h3>
<ul>
<li>Understanding of distributed systems concepts (e.g.,
leader election, voting, quorum)</li> <li>Background in high-availability systems, chaos engineering, or disaster recovery design</li> <li>Experience with cloud infrastructure and multi-region deployments</li> <li>Familiarity with document databases such as MongoDB</li> </ul>
