Sriram Manikanth

MTS / Salesforce

OPERATIONS STATUS / ONLINE

EMAIL msriram0803@gmail.com

PHONE +91 7010561869

LOCATION Bangalore, IN

PORTFOLIO sriram.manikanth.info

LINKEDIN linkedin.com/in/sriram


Skills Matrix

DevOps & SRE

Cloud & Infrastructure

AI & Orchestration

Collaboration

System Metrics

EXPERIENCE 5.5+ Years

AUTOMATION 80% Shift

CLOUDS AWS / AZ / GCP

Executive Summary

Member of Technical Staff at Salesforce with 5.5+ years building production ops platforms. I architected NextOps (Slack ChatOps), Cosmic AI (manifest-driven LLM agents on Temporal), SSP (governed change execution), and Ops-Sage (alert auto-remediation) — replacing ticket queues with safe, auditable self-service. Deep in Kubernetes, Terraform, Temporal, and FinOps-aware platform design.

My Production Ops Portfolio v2.4-stable

k8s-active temporal-connected gpt5-linked

Live portfolio demos

4 production systems Enterprise scale

CloudStack Ops Automation Suite

Four products I shipped — NextOps and Cosmic AI for intake, SSP for governed execution, Ops-Sage for automated alert response — on Temporal, RBAC, and a complete audit trail.

02 · ChatOps Self-service in seconds

NextOps

Slack-native runbooks for DB access, job termination, feature toggles, and on-call — with live RBAC and Temporal-backed execution.

Temporal ChatOps RBAC


02 · ChatOps Natural-language infra ops

Cosmic AI

YAML-defined agents compiled into Temporal workers — engineers ask in plain English, LLM generates safe queries with capability-level RBAC.

Azure OpenAI Kubernetes GitOps


03 · Govern JIRA + CAB wired to GitOps

SSP

Change requests flow through JIRA and CAB approval before any production mutation — approvals gate Temporal workflows and deployment pipelines.

JIRA GitOps CAB


05 · Respond On-call relief in seconds

Ops-Sage

Matches each on-call alert to a runbook, validates live context, executes remediation through SSP, and only pages a human when validation fails.

Alerting Runbooks SSP


NextOps Bot

Secure Self-Service

What it is / Why it matters

A Slack bot that lets engineers and support staff safely run production operations on the CloudStack platform — provision database users, kill stuck jobs, toggle features, and triage incidents — all from Slack, with role-based access and a full audit trail. No direct production access required.

BUSINESS IMPACT

Self-service in seconds, not tickets

Default-deny RBAC checks

Every action fully audited

Zero direct prod terminal access
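The default-deny and audit-everything behaviour listed above can be illustrated with a minimal Python sketch. This is not the NextOps implementation: the class names, role names, and action names are all hypothetical, and a real deployment would persist the audit log rather than hold it in memory.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditEntry:
    user: str
    action: str
    allowed: bool
    at: str  # ISO-8601 timestamp in UTC


@dataclass
class RbacEngine:
    # role -> actions that role may run (hypothetical names)
    grants: dict[str, set[str]]
    # user -> role
    roles: dict[str, str]
    audit_log: list[AuditEntry] = field(default_factory=list)

    def authorize(self, user: str, action: str) -> bool:
        """Default-deny: allow only if the user's role explicitly lists the action."""
        role = self.roles.get(user)
        allowed = role is not None and action in self.grants.get(role, set())
        # Record every decision, allowed or denied, so the trail has no gaps.
        self.audit_log.append(
            AuditEntry(user, action, allowed, datetime.now(timezone.utc).isoformat())
        )
        return allowed


engine = RbacEngine(
    grants={"operator": {"take_thread_dump", "release_db_lock"}, "viewer": set()},
    roles={"priya": "operator", "sam": "viewer"},
)
```

An unknown user or an unlisted action falls through to a denial without any special-case code, which is the point of default-deny: access exists only where a grant says so.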

NextOps Automation Center

Good evening, Alex Rivera · Admin

Role: Operator Admin · Workspace: demo-ops

Active Users 16 Users

Security Groups 3 Groups

Registered Services 24 Services

Temp Access Grants 2 Active

Quick Actions

Live execution feed

Run a quick action to see a simulated workflow card with validation steps, progress, and audit confirmation.

Provisioning & Diagnostics Catalog

Audit History Log

Workflow ID	Type	Status	Duration	Operator	Time
wf-8812	Take Thread Dump	Complete	3.4s	Priya Nair	4 hrs ago
wf-8794	Release DB Lock	Complete	1.8s	Sam Chen	1 day ago

Cosmic AI

Autonomous agent

What it is / Why it matters

An AI-powered "senior SRE" that lives in Slack. Talk to it in plain English about any infrastructure or incident problem and it investigates, reasons step-by-step, runs the right operations, and remembers everything — available 24/7. Powered by GPT-5.

BUSINESS IMPACT

Replaces 24/7 manual toil

Investigates and diagnoses in parallel

Learns from incident history

Mandatory human approvals for write operations
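Cosmic AI is described as YAML-defined agents with capability-level RBAC and mandatory human approval for writes. A rough Python sketch of how those three ideas could fit together follows. It is an illustration under assumptions: the manifest is a plain dict standing in for parsed YAML, every field and operation name is hypothetical, and the Temporal and LLM layers are left out.

```python
from dataclasses import dataclass

# Stand-in for a parsed YAML manifest; every field name here is hypothetical.
MANIFEST = {
    "agent": "pod-diagnostics",
    "operations": [
        {"name": "get_pod_status", "capability": "pods:read", "mutating": False},
        {"name": "restart_pod", "capability": "pods:write", "mutating": True},
    ],
}


@dataclass(frozen=True)
class Operation:
    name: str
    capability: str  # capability a caller must hold to request this operation
    mutating: bool   # True for write operations


def compile_manifest(manifest: dict) -> dict[str, Operation]:
    """Turn the manifest into a lookup table of operations, rejecting duplicates."""
    ops: dict[str, Operation] = {}
    for raw in manifest["operations"]:
        op = Operation(raw["name"], raw["capability"], raw["mutating"])
        if op.name in ops:
            raise ValueError(f"duplicate operation: {op.name}")
        ops[op.name] = op
    return ops


def plan(ops: dict[str, Operation], requested: str, user_caps: set[str]) -> str:
    """Decide what happens to an operation the LLM proposes on a user's behalf."""
    op = ops.get(requested)
    # Anything not declared in the manifest, or not covered by the
    # caller's capabilities, is denied outright.
    if op is None or op.capability not in user_caps:
        return "deny"
    # Declared write operations still wait for a human approver.
    return "needs-approval" if op.mutating else "allow"
```

Because the model can only request operations by name, anything it proposes outside the manifest is denied rather than executed.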

Cosmic AI Diagnostics Copilot

Status: Connected | Agent: Idle

Cosmic AI Just now

Welcome to the Cosmic AI diagnostics terminal. I can check pod performance, query status, run safe operations, and recall past incident notes. What should we investigate?

Cosmic AI Diagnostics Parameters

AI Queries Sent 1,248

Active Watchlists 4 Pods

Granted Scope 18 SRE Agents

Available Console CLI Commands

/sreagent help List available agents, parameters, and command controls.

/sreagent memory recall <query> Search past incident conversations and resolution details.

/sreagent watch <pod> Configure monitoring alerts and notify channels on latency.

SSP

Governed Self-Service

What it is / Why it matters

The governed automation engine behind the scenes. Every operational action is validated, gets a JIRA change ticket, waits for the right approvals (Change Advisory Board), then executes against production and notifies everyone — fully audited, every time.

BUSINESS IMPACT

No direct production logins

Mandatory Change Advisory Board (CAB) gate

Temporary, auto-expiring user parameters

Complete Slack + JIRA + Email audit ledger

SSP Governance Stepper

Submit & Validate Pending form inputs...

Change request ticket Not started

CAB Notification Not started

Apply Change Not started

Deliverable & Notify Not started
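The five stepper stages above can be modelled as a small state machine in which the apply stage is blocked until CAB approval is recorded. The Python below is a sketch under assumptions, not SSP's actual code: the class, field names, and error types are hypothetical, and the JIRA, Temporal, and notification integrations are omitted.

```python
from dataclasses import dataclass, field
from enum import Enum


class Step(Enum):
    SUBMIT_AND_VALIDATE = 1
    CHANGE_TICKET = 2
    CAB_NOTIFICATION = 3
    APPLY_CHANGE = 4
    DELIVER_AND_NOTIFY = 5


@dataclass
class ChangeRequest:
    target_pod: str
    toggle_id: str
    value: str
    completed: list[Step] = field(default_factory=list)
    cab_approved: bool = False  # set by an approver, never by the requester

    def advance(self) -> Step:
        """Run the next step in order; refuse to apply without CAB approval."""
        if len(self.completed) == len(Step):
            raise RuntimeError("change request already complete")
        nxt = Step(len(self.completed) + 1)
        if nxt is Step.SUBMIT_AND_VALIDATE and not (self.target_pod and self.toggle_id):
            raise ValueError("target_pod and toggle_id are required")
        if nxt is Step.APPLY_CHANGE and not self.cab_approved:
            raise PermissionError("CAB approval is required before applying the change")
        self.completed.append(nxt)
        return nxt
```

Steps cannot be skipped or reordered because the next step is derived from how many have completed, so the only way to reach `APPLY_CHANGE` is through validation, ticketing, and the CAB gate.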

SSP Governed Execution request

Workflow Catalog Item

Target Pod

Target Org ID

Toggle ID

Action

Value

PM Approval Ticket

Projects / CloudStack Change Advisory

CHG-447561

Summary [Cloud-SSP] Feature Toggle | Pod cloudstack-prod-use2 | Org org-acme-8842

Description

Governed Feature Toggle execution triggered via SSP. Verification required.

Labels

cloud-govportal cloud-change-governance

Feature Toggle Request AWAITING APPROVAL

From: SSP Governance Engine <ssp-notify@example.com>

To: Alex Rivera <alex.rivera@example.com>

Subject: Completed: governed workflow wf-8842 Mongo logs export

Cloud Self Service Portal

The requested workflow has been approved and executed successfully. Output files have been compiled and encrypted using standard Fernet symmetric keys. Access credentials will expire in 6 hours.

Encrypted Log File: Click here to access OneDrive folder

Fernet Decryption Key: gAAAAABmX_k9R...

Total File Size: 14.2 MB

Decryption Instructions

To decrypt the file locally on a machine with Python installed, run the following command:

python decrypt.py logs.enc <fernet_key>
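The contents of `decrypt.py` are not shown. A minimal sketch of what such a script might look like follows, assuming the third-party `cryptography` package is installed. The helper names and the `.log` output suffix are assumptions made for illustration.

```python
import base64
import sys
from pathlib import Path


def is_plausible_fernet_key(key: str) -> bool:
    """A Fernet key is 32 bytes, URL-safe base64 encoded."""
    try:
        return len(base64.urlsafe_b64decode(key.encode())) == 32
    except ValueError:  # binascii.Error (bad padding) is a ValueError subclass
        return False


def decrypt_file(enc_path: Path, key: str) -> bytes:
    # Imported lazily so the key check above works without the dependency.
    from cryptography.fernet import Fernet  # third-party: pip install cryptography

    # Raises cryptography.fernet.InvalidToken if the key or the data is wrong.
    return Fernet(key.encode()).decrypt(enc_path.read_bytes())


def main(argv: list[str]) -> int:
    if len(argv) != 3:
        print("usage: python decrypt.py <encrypted_file> <fernet_key>", file=sys.stderr)
        return 2
    enc_path, key = Path(argv[1]), argv[2]
    if not is_plausible_fernet_key(key):
        print("error: key is not a valid Fernet key", file=sys.stderr)
        return 1
    out_path = enc_path.with_suffix(".log")  # assumed output name
    out_path.write_bytes(decrypt_file(enc_path, key))
    print(f"wrote {out_path}")
    return 0
```

To use it as the command-line script shown above, call `sys.exit(main(sys.argv))` under an `if __name__ == "__main__":` guard at the bottom of the file.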

Ops-Sage

On-Call Alert Automation

What it is / Why it matters

Ops-Sage watches incoming on-call alerts and acts on them automatically. For each alert type you configure a validation checklist and an approved remediation action. When an alert fires, Ops-Sage validates the signal against live context, runs the action through the governed SSP engine if checks pass, and only escalates to a human when validation fails — removing pager fatigue for known failure patterns.

BUSINESS IMPACT

Automated response in seconds, not minutes

Reduces on-call toil for repeat alerts

Validation gate before any production change

Every auto-action fully audited

Simulated alert queue. Click Process to watch Ops-Sage validate an alert and execute the configured action without paging on-call.

Alert	Severity	Pod	Runbook	Status
JVM heap utilization above 90%	High	cloudstack-prod-use2	heap-scale-v2	Waiting
API latency p99 > 2500ms on profiling endpoints	High	cloudstack-prod-use2	conn-pool-scale	Waiting
Disk usage > 85% on worker node pool	Medium	cloudstack-prod-apse1	log-rotate-sweep	Waiting

Recent automated remediations executed by Ops-Sage (simulated audit trail).

Time	Alert	Action	Pod	Result	On-call paged?
06:42 UTC	Stale deployment lock on org org-acme-8842	release-lock	cloudstack-stg-pod1	Complete	No
05:18 UTC	Certificate expiry warning (< 14 days)	ssl-renotify	cloudstack-prod-use2	Complete	No

Each alert type maps to validation checks and an approved action. Ops-Sage only escalates when a check fails.

heap-scale-v2 JVM heap > 90%

Validate: confirm single pod spike, no active deploy, heap trend > 5 min.
Action: apply JVM heap multiplier via SSP workflow.

conn-pool-scale Latency p99 high

Validate: DB pool at capacity, no Sev-1 open.
Action: scale connection pool ConfigMap + rolling restart.

log-rotate-sweep Disk > 85%

Validate: log volume growth, not data disk.
Action: trigger log rotation job on node pool.
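The runbook mapping above (validation checks plus one approved action, escalate on any failed check) can be sketched in Python as follows. This is an illustration, not Ops-Sage's code: the alert-type key, the context field names, and the action identifier are hypothetical, and the returned string stands in for an SSP workflow call or a page.

```python
from dataclasses import dataclass
from typing import Any, Callable

Context = dict[str, Any]


@dataclass(frozen=True)
class Runbook:
    name: str
    checks: tuple[Callable[[Context], bool], ...]
    action: str  # name of the SSP workflow to run (hypothetical identifier)


RUNBOOKS = {
    "jvm_heap_high": Runbook(
        name="heap-scale-v2",
        checks=(
            lambda ctx: ctx.get("pods_affected") == 1,      # single pod spike
            lambda ctx: not ctx.get("deploy_in_progress"),  # no active deploy
            lambda ctx: ctx.get("trend_minutes", 0) > 5,    # heap trend > 5 min
        ),
        action="apply-jvm-heap-multiplier",
    ),
}


def handle_alert(alert_type: str, ctx: Context) -> str:
    """Return the remediation to run, or escalate when validation fails."""
    runbook = RUNBOOKS.get(alert_type)
    if runbook is None:
        # No configured runbook: never guess, hand the alert to a human.
        return "page-on-call"
    if all(check(ctx) for check in runbook.checks):
        return f"run:{runbook.action}"
    return "page-on-call"
```

For example, a heap alert raised while a deploy is in progress fails the second check and pages on-call instead of scaling the heap.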

Engineering Repositories

NextOps ChatOps Platform

Salesforce

A Slack-native self-service platform that automates production operations with high reliability.

Built event-driven workflows on Temporal orchestration for reliable long-running operations.
Engineered a live Slack-administered RBAC engine enabling secure access directly in conversation.
Integrated on-call alert automation (Ops-Sage) and AI-assisted diagnostics to minimize MTTR.

Temporal ChatOps AI/LLM Slack Python


Cosmic AI Agent Platform

Salesforce

A manifest-driven AI agent engine automating multi-tenant cloud infrastructure management via chat interfaces.

Auto-compiles YAML specification files into running Temporal workflow workers.
Generates queries with an LLM, constrained by fine-grained, capability-level RBAC for cloud safety.
Supports multi-tenant architectures, translating natural language into infrastructure actions.

LLMs Azure OpenAI Temporal Kubernetes GitOps


Career Timeline

Member of Technical Staff

Salesforce

Mar 2026 - Present Bangalore, IN

Architected and built NextOps, a Slack-native ChatOps/self-service platform for automating production operations, featuring Temporal-backed workflows, a live Slack-administered RBAC engine, on-call automation, and AI-driven Root Cause Analysis.
Designed and implemented Cosmic AI, a manifest-driven AI agent platform that manages multi-tenant cloud infrastructure via natural-language Slack/Teams chat, leveraging YAML specs for auto-compilation into Temporal workers with LLM query generation and capability-level RBAC.

Senior DevOps Engineer

Informatica

Apr 2024 - Mar 2026 Bengaluru, IN

Designed and developed a scalable AI Agent Orchestration Platform, incorporating dynamic task chaining, parallel execution, intelligent data flow, and a central agent knowledge layer for improved automation accuracy and accelerated multi-step workflows.
Built Ops-Sage, an on-call alert automation system that validates incoming alerts against configured runbooks and executes remediations through SSP — reducing pager load for repeat failure patterns.
Developed multiple Microsoft Teams bots to support team operations and streamline incident management processes.
Created a self-service portal using Python and Temporal, enabling engineering teams to independently run scripts and perform operational tasks, reducing reliance on DevOps/Platform support.
Automated Emergency Bug Fix (EBF) deployments, optimizing the release process and achieving a 50% increase in deployment speed by minimizing manual intervention.
Partnered with customers to diagnose and resolve critical production issues, ensuring high satisfaction and reliable service delivery.
Contributed to annual Disaster Recovery (DR) drills, driving process enhancements and reducing recovery times.
Collaborated with the FinOps team to monitor and optimize infrastructure costs, implementing cost-efficient architectures that significantly lowered cloud expenses.

DevOps Engineer

Informatica

Sep 2022 - Apr 2024 Bengaluru, IN

Led the onboarding of a multi-cloud data platform service across AWS, Azure, and GCP, optimizing data management workflows.
Collaborated with cross-functional teams to implement GitOps methodologies, enhancing version control and deployment accuracy.
Utilized Kubernetes to orchestrate containerized applications, improving scalability and resource utilization.
Engineered infrastructure as code (IaC) using Terraform to ensure consistent and repeatable environment provisioning.
Employed Chef for configuration management, automating software deployments and system configurations.
Developed Python and Bash scripts to automate routine tasks, reducing manual effort by 80%.
Played a key role in cloud migrations, ensuring minimal downtime and seamless data transition.
Swiftly addressed production incidents, performing root cause analysis and implementing corrective actions.
Contributed to the development of internal tools, streamlining team processes and enhancing productivity.

Infrastructure Consultant

Thoughtworks

Dec 2021 - Aug 2022 Bangalore, IN

Implemented automation solutions using Python to streamline infrastructure management tasks.
Managed application deployments within Kubernetes environments, ensuring smooth operations and scalability.
Designed and implemented robust CI/CD pipelines to accelerate software delivery and improve release reliability.
Conducted security vulnerability checks across infrastructure to maintain high security standards.
Performed cost-cutting research for AWS cloud resources, identifying and implementing strategies to optimize expenditure.

Assistant System Engineer

Tata Consultancy Services

Aug 2020 - Nov 2021 Chennai, IN

Integrated CI/CD pipelines to automate build, test, and deployment processes for microservices, enhancing workflow efficiency.
Configured and managed Kubernetes clusters for container orchestration and application deployment, ensuring reliable application performance.
Automated preparation and configuration of testing platforms for IBM MQ software across cloud providers (AWS, Azure, IBM Cloud, GCP) and on-premise servers using Ansible, streamlining deployment processes.
Built microservices into Docker images, facilitating containerization and deployment.

Education

BE / Mechanical Engineering

Chennai Institute of Technology

Graduated: Jan 2020

HSLC / Bio-Math

IIPE Laxmi Raman Higher Secondary School

Completed: Jan 2016

Credentials

Microsoft: Azure Fundamentals
IBM: Containers & Kubernetes Essentials
IBM: MQ Developer Essentials
Dassault: SolidWorks Associate

Activities

Volunteer / APJ Youth Club

Engaged in local community empowerment, educational drives, and youth guidance programs.

Contributor / JsonQ

Contributed updates and enhancements to JsonQ, a query-like library for JSON data structures in Python/Golang.
