We use the following criteria to assess the level and title of Ops team members at 37signals. These criteria aren’t exhaustive, and they aren’t mere checklists. They outline the shape of what work at a given level on the Ops team looks like.
These criteria chiefly examine the scope of work someone is capable of tackling independently. In addition to that assessment, we also look at the consistency and quality of the execution itself. 37signals pays in the top 10% of the industry (based on San Francisco rates), so the quality of the work should be commensurate with that target.
Junior Site Reliability Engineer
- Needs to discuss tasks and how to complete them before starting work.
- Basic familiarity with networking, configuration management, containers, orchestration, and other major systems, and with common processes and procedures.
- Mostly carries out low-risk, isolated system maintenance tasks; passively participates in emergency problem resolution.
- Asks many questions of more senior team members.
- Does not participate in on-call rotation.
- Self-taught; may have worked at an IT help desk for 1-2 years.
Site Reliability Engineer
- Work is reviewed with the occasional need for material direction or implementation changes.
- Works on single-system, single-variable problems.
- Can handle on-call duty with the backing of a more senior team member.
- Can perform work on production systems by following existing procedures.
- Shows strength in some areas of specialization, but lacks the balance of experience in all areas.
- Lacks institutional knowledge about our systems. Not completely familiar with documentation or procedures.
- Usually at least 5 years of experience as a sysadmin or programmer with site reliability experience.
Senior Site Reliability Engineer
- Work doesn’t necessarily need to be reviewed, but the general approach may be.
- Can work independently on smaller projects and be a reliable contributor to larger projects.
- Fully participates in on-call rotation.
- Subject matter expert in at least one major system.
- Plans and performs lower-risk maintenance independently. Participates in higher-risk maintenance.
- Contributes to resolving major problems.
- Improves existing professional standards for the team.
- Usually at least 6-10 years of experience as a professional sysadmin or network engineer; typically 5 years of experience at 37signals internalizing how we work.
Lead Site Reliability Engineer
- Able to fill in and lead the team for short periods.
- Work happens completely autonomously with no regular need for review.
- Expert on multiple systems. Helps make strategic decisions around major components.
- Elevates the team's standards through new tooling, processes, procedures, and effective communication. Able to carry out research, testing, implementation, and improvement for new systems.
- Leads high-risk maintenance with limited to no customer impact.
- Significant technical contributor to problem resolution; demonstrates consistent maturity in communication and demeanor under stress.
- Performs more complex work like capacity planning, load testing, security improvements, etc.
- Sets new professional standards for the team.
- Usually at least 8-12 years of experience as a professional sysadmin.
Principal Site Reliability Engineer
- Can run large and complex Ops projects independently.
- Carries significant responsibility for many domains of infrastructure.
- Multiple areas of expertise: configuration management, containers, continuous integration / deployment, debugging, orchestration, optimization, networking, performance, reliability, security.
- Makes what is new normal; what is old reliable; evangelizes what is next.
- Pushes the whole organization forward regularly through implementing new systems and designs.
- Writes new procedures and documentation regularly; trains others throughout the company.
- Work is almost always free of mistakes; often helps others improve the quality of their work.
- Completely comfortable working with all teams at 37signals; frequently coordinates work across teams to solve complex problems.
- When a site is down, something is broken, or work is crazy, this person is ready to save the day and lead us to a successful resolution.
- Effectively delegates work to others; acts as a leader who has earned the respect of their peers.
- Capable of coordinating a company-wide response to major issues and leading problem resolution via emergency procedures.
- Usually 12-15 years of experience as a professional sysadmin.