Sr Engineer - SRE

Brooklyn Park, Minnesota
Jul 29, 2021
Dec 03, 2021
Employment Status
Full Time

About us:
Target is an iconic brand, a Fortune 50 company and one of America's leading retailers.

Target as a tech company? Absolutely. We're the behind-the-scenes powerhouse that fuels Target's passion and commitment to cutting-edge innovation. We anchor every aspect of being America's most loved retailer with cutting-edge technology and the smartest engineers in retail technology! Site Reliability Engineering (SRE) at Target is an engineering discipline that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.

As a Sr Engineer in SRE, you are curious and open to learning how to build highly scalable platforms and fault-tolerant systems across various technologies, including Linux, Apache, MongoDB, Python, Oracle RDBMS, Redis, Postgres, and Hadoop. We use a combination of Google Cloud Platform and our server farms operating out of Target Data Centers, so experience managing application stacks in a hybrid cloud is preferred.

Job Summary:

As a member of the SRE team, you will contribute to product prioritization, scalable automation, capacity planning, adoption of supporting technologies, and all other aspects of maintaining a world-class cloud-based service. You will work closely with software engineering teams developing infrastructure and applications to focus on driving scalability, stability, reliability, operability of services, and security for Omnichannel retail experiences.

Site Reliability Engineers are hybrid systems and software engineers who take ownership of reliability, scalability, automation, and other issues related to the availability of Target's e-commerce/Retail and Enterprise platforms. Our goal is to build, scale, and guard the systems that delight our guests.

To do so, you will:
  • Design, write and build tools to improve the reliability, latency, availability, and scalability of Target's e-commerce/Retail and Enterprise products
  • Instrument systems for reliability, performance, and efficiency of Guest experiences
  • Define service level objectives, drive their adoption, and enforce them at both the service and experience levels
  • Influence standards, methods, and best practices for large-scale enterprise systems
  • Root-cause complex problems involving multiple parties, networks, hardware, and software that relate to scaling and performance
  • Champion high availability for critical systems and systematically root out single points of failure
  • Build partnerships and work side by side with the Product team

  • Strive to eliminate downtime and improve the manageability of infrastructure and application services.
  • Set standards for deployments at scale, infrastructure reliability, and scalability. Iterate, revisit, and optimize service availability, scalability, and performance.
  • Influence engineering teams across Target through customer focus, world-class quality, effective communication, decisive and fast-moving solutions, and quick, constructive resolution of conflicts.
  • Manage service availability and scalability through process, tools, and automation. Perform post-mortems and optimize incident response processes.
  • Actively participate in incident response for production incidents. Drive investigation, analysis, and troubleshooting to resolve production incidents and systematically drive down detection and mitigation times.

  • BS or MS in Computer Science or equivalent experience
  • 4 or more years of experience building and scaling distributed systems using web-scale technologies such as Linux, Apache, MongoDB, Python, Oracle RDBMS, Redis, Postgres, and Hadoop.
  • Experience with Linux/Unix internals and system services such as DNS, DHCP, TFTP, iptables, and SMTP, and with networking protocols such as TCP, UDP, and HTTP.
  • Experience with monitoring systems, tracing, and observability to manage large-scale systems and 24x7 availability.
  • Experience building and maintaining application stacks in a hybrid cloud environment and expertise with Google Cloud Platform (GCP) are a plus.
  • Programming experience in one or more of the following languages: Go, Java, Python, Ruby, or Shell, and experience with CI/CD tools such as Travis, Drone, or Jenkins.

Americans with Disabilities Act (ADA)

Target will provide reasonable accommodations (such as a qualified sign language interpreter or other personal assistance) with the application process upon your request as required to comply with applicable laws. If you have a disability and require assistance in this application process, please visit your nearest Target store or Distribution Center or reach out to Guest Services at 1-800-440-0680 for additional information.