DaleCityRecruiter Since 2001
the smart solution for Dale City jobs

Site Reliability Engineer

Company: The Josef Group
Location: Herndon
Posted on: February 1, 2025

Job Description:

Site Reliability Engineer
Clearance: TS/SCI or TS/SCI with polygraph
Herndon, VA

We are seeking a Site Reliability Engineer (SRE) for our OpenShift PaaS organization. In this role, you will be responsible for ensuring the availability, performance, and scalability of our OpenShift environments. You will collaborate with development, operations, and product teams to automate processes, build robust monitoring systems, and enhance the overall reliability of our platforms.

Key Responsibilities:

  • System Reliability & Scalability: Design, implement, and maintain highly available OpenShift clusters to support mission-critical applications.
  • Automation & Infrastructure as Code (IaC): Develop and maintain automation scripts and tools to streamline deployment, scaling, and recovery processes using tools like Ansible, Terraform, and Helm.
  • Monitoring & Incident Management: Build and enhance monitoring and alerting systems (e.g., Prometheus, Grafana, ELK). Respond to and resolve incidents, conducting post-mortem analyses to identify root causes.
  • Performance Optimization: Analyze and optimize system performance, ensuring minimal latency and maximum throughput.
  • Collaboration: Work closely with development teams to implement DevOps best practices, CI/CD pipelines, and platform enhancements.
  • Security & Compliance: Ensure platforms meet security and compliance requirements by integrating tools for vulnerability scanning, policy enforcement, and logging.

Required Skills:

  • Bachelor's degree in Computer Science, Engineering, or equivalent experience.
  • At least 5 years of experience as an SRE, DevOps Engineer, or in a related role.
  • Expertise in OpenShift or Kubernetes platform administration.
  • Strong knowledge of Linux systems, networking, and containerization technologies (Docker).
  • Proficiency in scripting languages such as Python, Bash, or Go.
  • Experience with CI/CD pipelines (e.g., Jenkins, GitLab CI/CD).
  • Familiarity with monitoring and logging tools like Prometheus, Grafana, ELK, or Splunk.

Desired Skills (Optional):

  • OpenShift certification (e.g., Red Hat Certified Specialist in OpenShift Administration).
  • Experience with cloud platforms (AWS, Azure, or GCP).
  • Knowledge of service mesh technologies (Istio, Linkerd).
  • Strong understanding of microservices and distributed systems architecture.

