Staff SRE / AWS / Terraform / Python

Company: Motion Recruitment
Location: Los Angeles, California, United States
Type: Full-time
Posted: 24.APR.2021

Description

Today, there are more data and users outside the enterprise than inside, causing the network perimeter as we know it to dissolve. We realized a new perimeter was needed, one that is built in the cloud and follows and protects data wherever it goes, so we started our company to redefine Cloud, Network and Data Security.

Since 2012, we have built the market-leading cloud security company and an award-winning culture powered by hundreds of employees spread across offices in Santa Clara, San Francisco, Seattle, Bangalore, London, Melbourne, and Tokyo. Our core values are openness, honesty, and transparency, and we purposely designed our open desk layouts and large meeting spaces to support and promote partnerships, collaboration, and teamwork. From catered lunches and office celebrations to employee recognition events (pre- and, hopefully, post-Covid) and social and professional groups, we strive to keep work fun, supportive, and interactive.

About the role

Our Platform Production Engineering (PPE) team is seeking a self-driven, motivated Infrastructure SRE with a production-service mindset to join the team. PPE is a diverse group of engineers responsible for measuring the quality of the cloud services running in our infrastructure. The ideal candidate can analyze and troubleshoot a broad spectrum of application, system, and network problems.

As an Infrastructure SRE, you will help build out our existing infrastructure and troubleshoot problems as they arise, ensuring the highest levels of system and infrastructure availability for our production services. You will also be responsible for integrating service health metrics, identifying and measuring service health indicators, and providing creative tool sets for the frontline operations support teams.

Required Experience:

  • Minimum of 5-7 years of experience in a production data center environment with 1000+ servers
  • Troubleshooting complex issues and correlating data from multiple sources, such as service applications, Linux systems, and the network
  • Deep knowledge of metrics platforms such as Prism, Prometheus, Grafana, Graphite, and Sumo Logic. Expert in collecting metrics data, running data analysis, and building correlation metrics.
  • Ability to deep dive into network troubleshooting areas such as packet analysis, HTTP/HTTPS, tunneling protocols, and load balancer issues
  • Comprehensive understanding of computer internal architecture and experience maintaining common Linux/Unix applications and services
  • Experience with modern cloud and virtualization technologies (Docker, Kubernetes, AWS, Google Cloud Platform, KVM, OpenNebula, OpenStack, or other orchestration platforms)
  • Strong software development skills in a programming language such as Python, C, C++, or Go
  • Deep expertise with operational support systems, automation, and CI/CD tools
  • Demonstrated ability and willingness to act as a subject matter expert, tracking technology and industry trends, and to provide data-driven reasoning for recommending technology paths

Education:
  • Bachelor's degree or higher in Computer Science or Engineering, or a combination of comparable education and experience, typically obtained through 5 or more years of related work experience
- provided by Dice

 