System Monitoring & Observability Engineer (Prometheus / Grafana)

Cardiff
1 week ago

SRT Marine Systems plc (SRT) is a market leader in international marine surveillance technology and systems. We are a respected, established, and ambitious multi-national company headquartered in the UK with a global customer base.

The company has a worldwide impact in the marine sector by leading the next generation of maritime domain awareness (MDA) technologies, products, and systems that significantly enhance security, safety, environmental protection, and sustainability. Our customers are global and range from the largest national coast guards to individual vessel owners.

SRT is an exciting company where high-quality results are rewarded. We are ambitious and constantly seek to innovate in order to deliver better products and services to our customers. We strive to make SRT a rewarding and challenging place to work, where talented, hard-working individuals have the opportunity to make a real impact across the marine industry.

Role overview of our System Monitoring & Observability Engineer (Prometheus / Grafana)

As a System Monitoring & Observability Engineer (Prometheus / Grafana) at SRT, you will be part of a small team tasked with implementing an end-user observability visualisation. Currently, we have observability dashboards in place for our engineers, utilising Prometheus for metrics collection and Grafana for visualisation. This initiative aims to deliver a more user-friendly solution tailored for our end-users.

Our clients are located in countries around the world, each with differing WAN capabilities, and our system is geographically distributed on-premises across multiple sites. We are fortunate to have a team of highly experienced engineers, including UX designers, who can provide support and guidance. Our lead observability engineer will oversee and assist with your work throughout the project.

Key Responsibilities - System Monitoring & Observability Engineer (Prometheus / Grafana) - (not exhaustive)

Monitoring & Metrics Collection
Design, configure, and maintain Prometheus-based monitoring solutions
Develop and manage metric exporters for application and system-level data
Optimise Prometheus scraping configurations and retention policies
Alerting & Incident Response
Define and maintain alert rules based on SLIs/SLOs and performance baselines
Ensure alerts are actionable, with minimal false positives
Participate (not necessarily lead) in on-call rotations and incident postmortems
Observability Dashboards
Design and maintain Grafana dashboards for real-time operational insights
Collaborate with engineering and product teams to create tailored visualisations
Provide self-service dashboard capabilities for end users
System Performance & Reliability
Monitor infrastructure (servers, containers, databases, services) for uptime, latency, and throughput
Identify bottlenecks and recommend improvements

Required Skills & Experience - System Monitoring & Observability Engineer (Prometheus / Grafana)

Proven experience with Prometheus (including PromQL) and Grafana in production environments
Strong knowledge of Linux-based systems
Experience writing and optimising PromQL queries for alerts and dashboards
Familiarity with exporters (node_exporter, blackbox_exporter, custom exporters)
Understanding of Alertmanager configuration and routing
Proficiency with Grafana dashboard creation and templating
Strong troubleshooting skills for infrastructure and application issues
Familiarity with containers (Docker)
Scripting skills (Bash, Python, or Go) for automation

Just some of the benefits we offer

Highly Competitive Salary
Matched company pension contributions up to 5%
25 days annual leave rising to 28 days with service
Career development opportunities
Company "Get to know you" days

SRT Marine Systems plc is an equal opportunity employer. We are committed to creating an inclusive working environment for all employees and actively encourage applications from all sectors of the community.
