Cricut is looking for an SRE technology leader to take charge of the SRE team's overall strategy, development, and day-to-day operations, powering Design Space and cricut.com.
The Cricut® mission is to help people lead creative lives. The company produces cutting machines for crafting, design, and DIY projects, and focuses on innovating with new machines, platforms, materials, and tools. The Cricut team is dynamic, diverse, and inclusive. Join us as we place the power of handmade into the hands of ALL.
People Leadership (20%):
- Mentor, develop, and guide your SRE team;
- Be an excellent people manager and a hands-on player/coach;
- Help establish a collaborative work environment that fosters autonomy, transparency, innovation, and personal growth;
- Evangelize DevOps principles;
- Foster a fail-fast, learn-fast environment.
Technical Leadership (40%):
- Define the vision, strategy, and roadmaps for the SRE team;
- Support the company's growth by ensuring a robust, scalable, cloud-first infrastructure;
- Own the measurement of SLAs, SLOs, SLIs, and uptime, and ensure the organization meets its targets;
- Collaborate with Engineering teams to understand deployment practices and processes and work towards iteratively improving the release pipeline to provide a highly resilient deployment strategy, ideally with zero downtime;
- Ensure best engineering practices through automation, infrastructure as code, robust system monitoring, alerting, auto-scaling, self-healing, etc.;
- A clear understanding of infrastructure, networking, configuration management, and computing services/deployment architecture;
- Experience building resiliency into system architecture;
- Experience building CI/CD pipelines;
- Collaborate and work closely with the Security teams to define and implement security compliance controls, auditing, and privacy initiatives;
- Experience in microservice management;
- Propose architecture (software and infrastructure) changes that increase application availability and performance (i.e., drive design enhancements);
- Collaborate and troubleshoot across multiple layers of a technical stack, including Applications, Database, Systems, Networking, and Development;
- Identify sources of instability in large-scale distributed systems and drive operational excellence;
- Analyze complex systems from a reliability and resilience perspective;
- Write and maintain infrastructure-as-code on AWS.
- Advanced English and strong communication skills;
- BS/MS in Computer Science or equivalent experience;
- 7+ years of experience as a software engineer and engineering manager with demonstrated technical and organizational leadership;
- Proven experience with the Microsoft development stack, including SQL Server, .NET/C#, and IIS, for enterprise software;
- 5+ years of experience shipping distributed systems, services, and highly available infrastructure;
- Experience managing Site Reliability Engineering teams;
- Experience in defining and implementing highly resilient and reliable applications with a focus on automation, availability, and performance;
- Experience architecting, building, and operating AWS environments following well-established best practices;
- AWS Professional Certifications are a plus: AWS Certified DevOps Engineer, AWS Certified Advanced Networking, AWS Certified Database, AWS Certified Security, AWS Certified Solutions Architect.