Incident Problem Manager

London
3 months ago
Applications closed

Incident & Problem Manager

Duration: 6 Months (Possibility for extension)

Location: London/Hybrid (3 days per week on site)

Rate: A highly competitive Umbrella Day Rate is available for suitable candidates

Role Overview

We are seeking an experienced and governance-focused Incident and Problem Manager to oversee the effective management of IT incidents and problems across the organisation's technology landscape.

In this critical role, you will ensure that incidents, including major incidents, are resolved promptly to minimise business disruption and that underlying problems are identified, analysed, and addressed to prevent recurrence.

You will provide strategic and operational oversight of incident and problem management processes, ensuring robust governance and compliance with regulatory and operational resilience frameworks, including DORA.

You will also drive continuous improvement initiatives, strengthen operational resilience, and safeguard critical business services by embedding best practices and governance standards across the technology estate.

Key Responsibilities:

Lead the end-to-end management of incidents, including major incidents, to ensure rapid restoration of services and minimal business disruption.
Collaborate on major incident bridges, coordinating cross-functional teams to drive timely resolution and maintain clear, consistent stakeholder communication during high-impact events.
Ensure escalation protocols and communication plans are executed effectively during major incidents to keep senior leadership, regulators, and impacted business units informed in real time.
Oversee incident trend analysis and reporting to senior leadership and regulators to identify systemic issues, improve response strategies, and support compliance obligations.
Ensure incident processes align with DORA requirements including impact classification, response timelines, and regulatory reporting to maintain operational resilience.
Own the problem management lifecycle from identification through resolution and closure to eliminate root causes and prevent recurrence of incidents.
Drive structured root cause analysis (RCA) using methodologies such as 5 Whys or Kepner-Tregoe to ensure accurate diagnosis and effective long-term solutions.
Maintain and govern the Known Error Database (KEDB) to provide documented workarounds and enable faster incident resolution.
Collaborate with engineering and product teams to implement permanent fixes to improve service reliability and reduce operational risk.
Embed DORA-aligned practices into incident and problem management processes including ICT risk classification and critical service mapping to strengthen resilience.
Support scenario testing and resilience assessments for critical business services to validate preparedness and compliance with regulatory standards.
Contribute to regulatory reporting and audit readiness for operational resilience and ICT incident handling to ensure transparency and adherence to governance requirements.
Partner with Risk, Compliance, and Business Continuity teams to align incident and problem management with broader resilience objectives.
Mentor and guide junior analysts and managers within the service management function to build capability and maintain high standards of performance.
Drive automation and tooling enhancements for incident/problem detection and resolution to improve efficiency and reduce mean time to restore (MTTR).
Provide insights and recommendations to improve service reliability and reduce operational risk to support continuous improvement and strategic objectives.
Lead service reviews and post-incident/post-problem retrospectives with accountable owners to capture lessons learned and implement process improvements.

Key Skills & Requirements:

Extensive experience in Incident and Problem Management within financial services or other regulated industries.
Proven track record of managing major incidents, conducting root cause analysis (RCA), and implementing permanent fixes.
Strong knowledge and practical application of ITIL principles (v4 preferred).
Demonstrated experience working with DORA compliance, operational resilience frameworks, and regulatory obligations.
Familiarity with ITSM platforms (e.g., ServiceNow) and monitoring tools.
Ability to operate under pressure and manage complex, high-impact situations.
Excellent stakeholder management, communication, and leadership skills.
Strong analytical and problem-solving capabilities.
Experience with cloud and hybrid infrastructure environments.
Understanding of DevOps and Agile delivery models.
Ability to drive continuous improvement and embed best practices across ITSM processes.

Candidates will need to show evidence of the above in their CV in order to be considered.

If you feel you have the skills and experience and want to hear more about this role, 'apply now' to declare your interest in this opportunity with our client. Your application will be reviewed by our dedicated team.

We will respond to all successful applicants as soon as possible. However, please be advised that we may contact you after this time should we need further applicants or if other opportunities arise relevant to your skillset.

Pontoon is an employment consultancy. We put expertise, energy, and enthusiasm into improving everyone's chance of being part of the workplace. We respect and appreciate people of all ethnicities, generations, religious beliefs, sexual orientations, gender identities, and more. We do this by showcasing their talents, skills, and unique experience in an inclusive environment that helps them thrive.

As part of our standard hiring process to manage risk, please note background screening checks will be conducted on all hires before commencing employment.

We use generative AI tools to support our candidate screening process. This helps us ensure a fair, consistent, and efficient experience for all applicants. Rest assured, all final decisions are made by our hiring team, and your application will be reviewed with care and attention.
