
Site Reliability Engineer - 12 Month FTC (we have office locations in Cambridge, Leeds and London)

London
2 weeks ago

Company Description

Genomics England partners with the NHS to provide whole genome sequencing diagnostics. We also equip researchers to find the causes of disease and develop new treatments – with patients and participants at the heart of it all.

Our mission is to continue refining, scaling, and evolving our ability to enable others to deliver genomic healthcare and conduct genomic research.

We are accelerating our impact and working with patients, doctors, scientists, government and industry to improve genomic testing, and help researchers access the health data and technology they need to make new medical discoveries and create more effective, targeted medicines for everybody.

Job Description

Are you driven by a deep curiosity about how complex distributed systems work and, more importantly, how they fail? Do you believe reliability is the most critical feature of any service?  

At Genomics England, we’re pushing the boundaries of science and technology to transform patient outcomes, and our platform underpins it all.  

We're looking for a Site Reliability Engineer to ensure our platform is not just running, but is sustainably reliable, scalable, and resilient. As an SRE advocate, you will actively collaborate with engineering squads to cultivate a culture of reliability. You will play a pivotal role in driving our technical evolution, influencing and shaping platform practices across the organisation.

Your responsibilities will include automating and optimising infrastructure to improve workload throughput. You will focus on implementing proactive measures to anticipate and address potential issues before they impact our users. You can’t fix what you don’t measure, so there will be a focus on developing monitoring and metrics that teams will rely on day to day.  Through this approach, you will help create a platform that is not only scalable and resilient but also ready to meet the demands of our mission. 

What You'll Be Doing Day-to-Day: 

Your work will be a balance of proactive engineering and thoughtful operational practice. You'll move between different modes, from deep project work and strategic initiatives to collaboration and incident response. Your primary mission will be to: 

Champion Reliability: Work with engineering teams to define and measure what matters to our users, establishing and monitoring SLIs, SLOs, and error budgets that drive data-informed decisions. 

Learn from Failure: Be involved in blameless post-incident reviews that focus on identifying contributing factors, ensuring we turn every failure into a valuable opportunity for systemic improvement. 

Eliminate Toil: Systematically identify and automate repetitive, manual, and tactical operational processes. You'll reduce operational load by building solutions with enduring value. 

Build Resilient Systems: Design, build, and maintain robust infrastructure across AWS and on-prem environments using Infrastructure as Code and automation. You'll also drive performance tuning, capacity planning, and cost optimisation. 

Enable Developer Velocity: Develop CI/CD pipelines, release automation, and platform tooling that help our engineering squads deploy changes safely and efficiently, without sacrificing reliability. 

Share Your Knowledge: Create clear, usable documentation and act as a consultant and advocate for SRE and DevOps best practices, helping to improve resilience across the entire organisation.

What You’ll Bring:

We're looking for someone who not only advocates for the SRE mindset but can also implement it with robust code, thoughtful automation, and scalable architecture. 

Mindset & Approach: 

Deep-Seated Curiosity: You're driven to understand how systems truly behave in production, not just how they are supposed to work. 

A Systems Thinker: You can zoom out to see the big picture and zoom in to troubleshoot the details, understanding that reliability is an emergent property of the entire system. 

Relentlessly Collaborative: You see reliability as a shared responsibility, actively seeking out different perspectives and treating SRE as a dialogue. You're open to new ideas, welcome diverse viewpoints, and thrive on teaching, learning, and driving initiatives with colleagues across various teams. 

Incident Responder: You remain calm under pressure, applying a structured approach to troubleshooting when the pager rings. You know how to take charge of an incident, coordinate a response, and mitigate issues efficiently. 

Views Failure as an Opportunity: You champion blameless post-incident reviews as a core learning mechanism, focusing on process and technology, not people. 

Customer-Focused: You understand that reliability must be measured from the customer's perspective to be meaningful. 

Technical Experience: 

Experience applying Site Reliability Engineering principles in a production environment. 

Strong hands-on experience with AWS services across compute, storage, networking, and security. 

Deep understanding of distributed systems and their common failure modes, including issues related to latency, data consistency, and fault tolerance. 

Experience with capacity planning, performance engineering, and designing systems that scale to meet traffic demands and remain fault-tolerant under pressure. 

Excellent Infrastructure as Code skills (Terraform essential). 

Solid scripting and software engineering fundamentals in languages like Python or Bash, with an ability to debug code, handle errors, and understand system architecture. 

Experience with observability and alerting tools (e.g., Datadog, CloudWatch, Opsgenie) and a passion for turning data into actionable insights.

Knowledge of CI/CD tools (e.g., GitLab CI, Jenkins) and release engineering best practices. 

Familiarity with container orchestration (ECS, Kubernetes) and running production-grade infrastructure at scale. 

A good understanding of networking fundamentals (DNS, TCP/IP, HTTP) and their practical application, including load balancing and traffic management. 

Familiarity with Relational (e.g., PostgreSQL) and NoSQL Databases. 

Nice to Haves: 

Exposure to new tech evaluation, lean experimentation, or platform tooling decisions. 

Experience mentoring or sharing knowledge across teams. 

Understanding of genomics, HPC, data-heavy workloads, or regulated environments. 

Qualifications

Formal qualifications are not mandatory. We value practical experience, a curious mind, and a passion for reliability. Relevant certifications in AWS, Terraform, or other technologies are welcome and highly beneficial.

Additional Information

Closing Date: Monday 20th October at 23:00 (UK time) 

Salary From: £71,300

Being an integral part of such a meaningful mission is extremely rewarding in itself, but to support our people we’re continually improving our benefits package. We pride ourselves on investing in our people and supporting them to achieve their career goals. Our benefits package includes:

Generous Leave: 30 days’ holiday plus bank holidays, additional leave for long service, and the option to apply for up to 30 days of remote working abroad annually (approval required).
Family-Friendly: Blended working arrangements, flexible working, enhanced maternity, paternity and shared parental leave benefits.
Pension & Financial: Defined contribution pension (Genomics England double-matches up to 10%; you can contribute more if you wish), Life Assurance (3x salary), and a Give As You Earn scheme.
Learning & Development: Individual learning budgets, support for training and certifications, and reimbursement for one annual professional subscription (approval required).
Recognition & Rewards: Employee recognition programme and referral scheme.
Health & Wellbeing: Subsidised gym membership, a free Headspace account, and access to an Employee Assistance Programme, eye tests, and flu jabs.

Equal opportunities and our commitment to a diverse and inclusive workplace

Genomics England is actively committed to providing and supporting an inclusive environment that promotes equity, diversity and inclusion best practice both within our community and in any other area where we have influence. We are proud of our diverse community where everyone can come to work and feel welcomed and treated with respect regardless of any disability, ethnicity, gender, gender identity, religion, sexual orientation, or social background. 

Genomics England’s policies of non-discrimination and equity will be applied fairly to all people, regardless of age, disability, gender identity or reassignment, marital or civil partnership status, being pregnant or recently becoming a parent, race, religion or beliefs, sex or sexual orientation, length of service, whether full or part-time or employed under a permanent or a fixed-term contract, or any other relevant factor.

Genomics England does not tolerate any form of discrimination, harassment, victimisation or bullying at work. Such behaviour is contrary to
