Observability Operations Engineer

bridge351

Location

Remote (Europe) with occasional travel to Germany

Language Requirements

  • German: C1 or higher (mandatory)
  • English: C1 or higher (mandatory)

About the Role

We are looking for an experienced Observability Operations Engineer to support and operate enterprise-scale platform environments. The successful candidate will be responsible for ensuring the reliability, performance, and observability of critical systems running in Kubernetes-based environments.

You will work closely with platform, infrastructure, and development teams to improve monitoring capabilities, operational excellence, and service reliability across complex enterprise environments.

Key Responsibilities

  • Operate and support Kubernetes-based production environments.
  • Manage and optimize observability platforms and monitoring solutions.
  • Configure and maintain logging, metrics, and tracing solutions.
  • Support incident, problem, and change management processes.
  • Define and monitor SLIs, SLOs, and SLAs.
  • Create and maintain operational runbooks and documentation.
  • Collaborate with engineering teams to improve platform reliability and performance.
  • Contribute to automation and continuous improvement initiatives.

Required Skills & Experience

  • Minimum 3 years of experience operating Kubernetes environments in production.
  • Strong experience with observability and monitoring platforms such as:
    • Prometheus
    • Grafana
    • Datadog
    • Loki
    • Mimir
    • OpenTelemetry
  • Strong understanding of networking concepts, load balancing, and security principles.
  • Experience with CI/CD tools and processes:
    • GitLab
    • Jenkins
    • ArgoCD
    • Tekton
    • Argo Workflows
  • Knowledge of ITSM processes:
    • Incident Management
    • Change Management
    • Problem Management
  • Understanding of Site Reliability Engineering (SRE) practices.
  • Experience documenting operational procedures and maintaining runbooks.

Nice to Have

  • Experience in enterprise-scale environments.
  • Cloud-native platform experience.
  • Infrastructure automation knowledge.
  • Experience working in regulated industries.

What We Offer

  • Fully remote work model.
  • International projects for the German market.
  • Modern cloud-native technology stack.
  • Long-term opportunities.
  • Collaborative and highly skilled engineering teams.

What can you expect from us?

Mind-blowing workplace culture. You will be integrated into a professional, dynamic, and collaborative team.

100% Remote opportunities

We want you to have the flexibility to work where you feel most comfortable and productive.

International Career

  • You can expect professional growth and connections around the world.
  • We are represented in Portugal, Belgium, Luxembourg, and Denmark.
  • We also have projects in many other countries, including the Netherlands, Luxembourg, Singapore, and the United States of America (with many more to come…).

Extra Benefits & Perks

If you wish to work with us and you are outside the European Union, good news: we are a Tech Visa company, and we will help!

As a plus, we provide Health and Life Insurance.
