
Senior Site Reliability Engineer - Saviynt

Job Title
Senior Site Reliability Engineer
Job Location
Bengaluru
Job Description
Saviynt's AI-powered identity platform manages and governs human and non-human access to all of an organization's applications, data, and business processes. Customers trust Saviynt to safeguard their digital assets, drive operational efficiency, and reduce compliance costs. Built for the AI age, Saviynt is today helping organizations safely accelerate their deployment and usage of AI. Saviynt is recognized as the leader in identity security, with solutions that protect and empower the world’s leading brands, Fortune 500 companies and government institutions. For more information, please visit www.saviynt.com.

We’re a fast-moving AI security company building AI-native infrastructure and applications powered by LLMs and autonomous agents. Our stack is deeply integrated with AWS, Kubernetes, and OpenAI-based systems, and we’re rethinking reliability in a world where software can reason, adapt, and self-heal.

We’re hiring a Senior Site Reliability Engineer to own reliability across our cloud-native and AI-driven platform. You’ll work at the intersection of distributed systems, Kubernetes operations, and LLM-powered automation, building systems that don’t just scale but also think and fix themselves.

WHAT YOU BRING
  • 5+ years in SRE / DevOps / Platform Engineering.
  • Strong hands-on experience with:
    • AWS infrastructure at scale
    • Kubernetes (production-grade clusters)
  • Proven ability to debug complex distributed systems under pressure.
  • Strong coding skills (Python or Go); you build internal platforms and tools.
  • Experience implementing monitoring, alerting, and incident management systems.
  • Bonus (AI / LLM Focus):
    • Experience working with LLM APIs such as the OpenAI API.
    • Familiarity with agent frameworks like:
      • LangChain
      • AutoGen
    • Built or experimented with:
      • AI agents for DevOps / SRE workflows
      • Retrieval-Augmented Generation (RAG) systems
      • Vector databases (Pinecone, Weaviate, etc.)
    • Exposure to AIOps or intelligent automation systems.
WHAT YOU WILL BE DOING
  • Own uptime, reliability, and performance of services running on AWS + Kubernetes (EKS).
  • Design and implement self-healing infrastructure using automation and AI agents.
  • Build LLM-powered operational tooling using APIs such as the OpenAI API for:
    • Intelligent alert triage
    • Incident summarization
    • Root cause analysis
    • Runbook automation
  • Manage and scale Kubernetes workloads:
    • Deployments, autoscaling, resource optimization
    • Cluster reliability and cost efficiency
  • Build and evolve observability systems:
    • Metrics (Prometheus), dashboards (Grafana)
    • Logs (ELK / OpenSearch)
    • Tracing (OpenTelemetry)
  • Define and enforce SLOs, SLAs, and error budgets tied to business metrics.
  • Automate infrastructure using Terraform and CI/CD pipelines.
  • Lead incident response, postmortems, and continuous reliability improvements.
  • Introduce chaos engineering practices to proactively test system resilience.


Saviynt Headquarters Location

El Segundo, CA


Saviynt Company Size

Between 200 - 1,000 employees

Saviynt Founded Year

2010

Saviynt Total Amount Raised

$375,000,000

Saviynt Funding Rounds

  • Debt Financing: $205,000,000 USD
  • Private Equity: $130,000,000 USD
  • Series A: $40,000,000 USD