Principal Site Reliability Engineer at People Can Fly
Gateshead, United Kingdom
Job Description
Company Description
People Can Fly is one of the leading independent AAA games development studios with an international team of hundreds of talented individuals working from offices located in Poland, UK, US, and Canada, and from all over the world thanks to our remote work programs.
Founded in 2002, we made our mark on the shooter genre with titles such as Painkiller, Bulletstorm, Gears of War: Judgment, and Outriders. We are one of the most experienced Unreal Engine studios in the industry, and we are extending the engine with a set of in-house solutions called PCF Framework.
Our creative teams are currently working on several exciting titles: Gemini is our new project being developed with Square Enix; Maverick is a Triple-A game developed in collaboration with Microsoft Corporation; Bifrost, Victoria, and Dagger are projects we're growing in the self-publishing model. We also have one project in the concept phase, Red, as well as two VR projects: Green Hell VR and Bulletstorm VR, an exciting VR version of our cult-classic shooter.
With over 20 years of experience, PCF sets out to explore new horizons. We aim to combine our expertise with the creativity of the best and most forward-thinking talents in the industry to work together on the new generation of action games for the global gaming community.
If you decide to accompany us on this journey, you’ll have a chance to perfect your craft and expand your knowledge, working alongside leaders in the industry to bring a brand-new, unique experience to players worldwide.
Job Description
Build and deploy cloud-native infrastructure for the online services platform.
Create tools and foster a culture focused on reliability across all services.
Automate deployment of the online services platform to cloud providers, including provisioning for various stages like development, testing, and external publishers.
Advise on and implement strategies to maximize reliability, scalability, and uptime.
Deploy tools for efficient maintenance, updates, and recoveries, ensuring processes are traceable and reversible.
Establish and test disaster recovery protocols.
Implement monitoring dashboards and alerting systems for real-time service status and assist in service instrumentation for effective monitoring.
Build dashboards to monitor operational costs and provide guidance on cost minimization.
Manage communications with third-party providers and publishers, especially during outages.
Develop 24/7 on-call support protocols for live games.
Create run books for everyday operations related to online services.
Share knowledge with the rest of the studio.
Occasionally support leads on recruitment projects in collaboration with HR.
Define the best practices and technical solutions used at the company.
Qualifications
8+ years of extensive experience in infrastructure engineering, with a specialized focus on AWS Cloud Computing.
Strong in languages like Python, Go, or Java and in scripting for automation.
Good grasp of network architecture and security practices.
Familiarity with CI/CD pipelines and tools like Jenkins, Ansible, or Kubernetes.
Proficient with Source Control and Code Review tools (Swarm, Perforce, Git, etc.).
Skills in setting up monitoring systems and managing incidents.
Ability to analyze and improve system performance.
Strong troubleshooting skills across various technology layers.
Knowledge of designing and implementing disaster recovery strategies.
Strong mentoring skills.
Strong verbal and written communication skills in English.
Nice to have:
Video game-specific experience
Experience in content distribution, ad-tech, news, mobile gaming, or finance domains
Additional language proficiency
Additional project management and bug tracking software knowledge
Additional Information
A competitive salary and performance-based annual bonuses.
Private medical healthcare (Vitality) and BUPA dental insurance for PCF's employees and their families.
Access to the Gympass wellbeing platform for employees and family members.
Online Polish and English language classes.
Access to the pension scheme.
Flexible working hours.
Free virtual health and mental wellbeing sessions included in the plan for members and their dependents.
Personal development opportunities and ability to work in a global environment.
Work in a creative team with people full of passion for what they do.