Principal Observability Engineer
2026-04-04T15:05:15+00:00
SustainRecruit
https://www.greataustraliajobs.com/jsjobsdata/data/employer/comp_4986/logo/sustain%20recruit.jpeg
https://sustainrecruit.com/
FULL_TIME
Melbourne
Melbourne VIC
2000
Australia
Professional, Scientific, and Technical Services
Science & Engineering, Computer & IT
2026-04-14T17:00:00+00:00
8
Background information about the job or company
Are you a seasoned engineering leader with a passion for building world‑class observability across complex, high‑scale systems? We’re looking for a Principal Observability Engineer to drive the strategy, standards, and technical excellence behind our organisation’s next‑generation observability and AIOps capabilities.
In this role, you’ll shape the future direction of our observability platforms, partner with cross‑functional teams across engineering and operations, and enable the business to proactively detect issues, optimise performance, and deliver exceptional reliability for mission‑critical systems.
Responsibilities or duties
- Lead the design, development, and rollout of advanced observability and AIOps platforms spanning metrics, logs, traces, dashboards, and alerting.
- Own and evolve the observability technology roadmap aligned with organisational priorities.
- Define and champion standards, frameworks, and best practices across engineering teams.
- Continuously optimise system performance and ensure observability practices stay current with modern industry trends.
- Architect scalable, resilient distributed systems to support high‑traffic, complex workloads.
- Implement ML‑driven capabilities for anomaly detection, forecasting, and root‑cause analysis.
- Manage large volumes of telemetry data while maintaining security, quality, and compliance.
- Identify automation opportunities and develop intelligent auto‑remediation workflows.
Qualifications or requirements
- Extensive experience across software engineering, DevOps, SRE, or platform operations.
- Strong hands‑on experience with Dynatrace and Sumo Logic.
- Proven ability to build and manage large‑scale observability platforms using ML and LLM‑based tooling.
- Expertise in cloud monitoring, scalable telemetry pipelines, and distributed systems.
- Proficiency with Kubernetes, Docker, Harness, and microservices architectures.
- Deep understanding of enterprise‑scale cloud infrastructure, networking, and multi‑layer observability.
- Demonstrated use of ML/GenAI for predictive monitoring, incident analysis, correlation, and summarisation.
- Experience building auto‑healing workflows (e.g., using Ansible).
Any other provided details
If you'd like the opportunity to lead observability strategy at an enterprise level, please apply now.
- Distributed Systems
- Automation
- Software Development
- Dynatrace
- Sumo Logic
- ML/LLM-based tooling
- Cloud monitoring
- Scalable telemetry pipelines
- Kubernetes
- Docker
- Harness
- Microservices architectures
- Enterprise-scale cloud infrastructure
- Networking
- Multi-layer observability
- ML/GenAI
- Auto-healing workflows
- Ansible
JOB-69d128ab6f3c4
Vacancy title:
Principal Observability Engineer
[Type: FULL_TIME, Industry: Professional, Scientific, and Technical Services, Category: Science & Engineering, Computer & IT]
Jobs at:
SustainRecruit
Deadline of this Job:
Tuesday, April 14 2026
Duty Station:
Melbourne | Melbourne VIC
Summary
Date Posted: Saturday, April 4 2026, Base Salary: Not Disclosed
JOB DETAILS:
Work Hours: 8
Experience in Months: 36
Level of Education: postgraduate degree
Job application procedure
If you'd like an opportunity to lead the observability strategy at enterprise level, please apply now.