Job Requirements of Senior Observability Engineer – NS2JP00000386:
- Employment Type: Full-Time
- Location: Oak Hill, VA (Onsite)

Senior Observability Engineer – NS2JP00000386
Location: Remote
Job Summary:
We are seeking a skilled and experienced Senior Observability Engineer to join the Observability team. The ideal candidate will be responsible for improving our monitoring and alerting posture for Cloud Infrastructure. The role requires a strong understanding of observability tools and practices, with a focus on Prometheus, Grafana, Gardener Kubernetes, and Splunk. Experience with Dynatrace is a plus.
Key Responsibilities:
- Implement, manage, and improve monitoring solutions that use Prometheus, ensuring high availability and accurate alerting for our systems.
- Contribute to the development of observability strategies to improve our Cloud monitoring posture.
- Collaborate with development teams to integrate observability into the CI/CD pipeline and throughout the application lifecycle.
- Respond to and investigate incidents, providing thorough post-mortem analyses and implementing preventive measures.
- Stay current with the latest trends and best practices in site reliability and observability.
- Work with cross-functional teams to ensure system reliability, scalability, and performance.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.
- Proven experience with observability tools such as Prometheus, Grafana, and Splunk.
- Hands-on experience with Kubernetes and container orchestration, preferably with Gardener Kubernetes.
- Familiarity with logging, monitoring, and application performance management (APM) tools; experience with Dynatrace is a plus.
- Strong understanding of cloud infrastructure, networking, and distributed systems.
- Excellent problem-solving and analytical skills, with the ability to work independently and as part of a team.
- Strong communication skills and the ability to work effectively with both technical and non-technical stakeholders.
- Experience with scripting and automation tools (Python, Terraform, Ansible, etc.).