Site Reliability Engineer
- Portland, OR
- Posted 1 week ago – Accepting applications
As a Site Reliability Engineer, you will run the daily operations of the Smarsh SaaS Platform. You are passionate about uptime metrics, automating simple tasks, and figuring out how to deploy code continuously. You’ll work closely with our Engineering, QA, and Technical Operations groups to manage our current on-premises deployments and cloud-native infrastructure. Our stack runs on .NET and Java and uses MSSQL, ActiveMQ, and ZooKeeper.
- Strong experience operating in high-traffic on-prem and cloud environments
- Experience managing CI/CD systems (Concourse, Jenkins, Travis CI)
- Strong experience working with configuration management tools (Puppet, Ansible, Terraform)
- Experience deploying and/or operating the ELK stack or Splunk
- Experience with container technologies and orchestration platforms (Docker, Kubernetes, Cloud Foundry)
- Experience working with monitoring and observability tools (We use Datadog and New Relic)
- Familiarity with MSSQL or MySQL databases
- Background working in a multi-platform environment (Linux, Windows)
- Experience running on a cloud platform, AWS preferred (S3, RDS, SQS)
- Familiarity with Agile/Scrum/Kanban methodologies
- Strong background with programming/scripting languages (e.g., Python, Bash, PowerShell, Go)
- Manage day-to-day operations of our on-prem SaaS platform, ensuring the health and performance of the platform.
- Creatively solve problems in the DevOps space, collaborating with Development, DBA, and QA team members
- Communicate and coordinate effectively with Product, Customer Success and Integration teams on operations tasks and deployments.
- Listen to our internal customers and teams, understand their pain points, and coach them to work smarter
- Execute with modern container and cloud-native best practices
- Document decisions regarding technology choices, best practices, and process flows
- Help create and manage continuous integration systems.
- Mentor and uplevel other SRE team members on how to operate effectively in the cloud.
- Automate builds and deployments across multi-platform environments
- Strong interpersonal skills
- A can-do attitude and sense of urgency for a high-growth, fast-paced environment
- Proven track record of leading implementations of build and release engineering best practices, both processes and technologies
- BS in Computer Science or equivalent experience
- A curious mind, eager to learn new technologies and share them with others.
- The ability to think outside the box to resolve issues and create solutions
- Participation on an on-call schedule