SRE - C++/Python

Company: Perennial Resources International
Location: New York, New York, United States
Type: Full-time
Posted: 06.FEB.2021

Description

SRE - Ticker Plant
Full-time only (remote during Covid, but on-site work is required)
New York, NY
As SREs, we are tasked with applying software engineering skills to solve the problems of owning large and always-growing market data systems while ensuring that we maintain resiliency, efficiency, availability and visibility at any scale. We work across the globe, with teams that comprise experts in various specialties like software engineering, platform performance, capacity planning, systems recovery, and automation.

The Ticker Plant SRE will:
  • Get hands-on experience working on large-scale market data systems
  • Design and develop predictive data models for our system capacity
  • Build systems capable of early detection of issues through metrics and signals, and develop automated correction and remediation strategies
  • Help set standards and partner closely with other engineers to ensure that all products meet those standards

The team is currently focused on performance analysis, defining SLIs and SLOs, system-level alerting, capacity planning, and incident mitigation.

Responsibilities:

  • Develop Python/C++ services, libraries, and tools to monitor the health, availability, latency, and reliability of our services, with a focus on fault-tolerant approaches
  • Proactively scale our services to stay ahead of ever-increasing market data demands by driving capacity planning, instrumentation and performance analysis
  • Ensure service issues do not reoccur by architecting automation and remediation strategies employing signal detection and orchestration frameworks
  • Define service level objectives and drive measurable service improvement

Required:

  • Experience programming in Python, C++, or Java
  • Experience working in a Unix/Linux environment
  • Experience with developing and using tools
  • Experience analyzing, diagnosing, and solving issues in a production environment
  • Strong understanding of large-scale systems architecture
  • Strong communication skills

Preferred:

  • Experience building orchestration systems (Ansible, Salt, etc.)
  • Experience with configuration management (Chef, Puppet, CFEngine)
  • Knowledge of test frameworks (gtest, etc.)
  • Familiarity with industry-standard tools for collecting data across distributed systems, such as Splunk, Grafana, Elasticsearch, Nagios, and Zabbix
  • Experience with incident response and blameless postmortems


Adam Planica

- provided by Dice

 