DevOps / Site Reliability Engineer

Posted on WorkingNomads on Sep 16, 2021

Platform.sh is a groundbreaking hosting and development tool for web applications.


To reinforce our technical prowess, we are looking to grow our operations team. If you’re looking for an exciting, high-growth opportunity with an award-winning, cutting-edge company, this could be just the job for you.


For its PaaS solution, https://platform.sh is looking for an Operations and Service Reliability Engineer with a taste for Python and Go, great Linux system understanding, and a real hunger for the challenges of building robust, distributed systems.


Platform.sh is a PaaS shrouded in a lot of black magic (we can consistently clone a whole running cluster, with its state, databases, and indexes, in a matter of seconds). We want to get this down to the hundreds-of-milliseconds domain. Interested? There is more...


Our external API is pure Hypermedia REST + OAuth on top of Pyramid. It mechanizes the Git layer and needs more features.


From the same manifest, we can consistently generate a Docker container, an LXC one, or VM disk images (AWS, Azure, OpenStack). We want more targets.


We probably have the highest container density in the industry. We need to get it higher.


We support Python, Ruby, Node.js, PHP, Java, and .NET. Time to roll out Elixir, of course (and Rust. We need Rust).


We need to have more auto-healing on the high-availability clusters. We need more performance out of our multi-protocol SSH proxy. We need to work on our Ceph implementation. We need to get the Debian package generation streamlined and faster. We need… great ideas on how to make Platform.sh even better.


Directly reporting to our VP of Infrastructure and in close interaction with our Engineering and Customer Support teams, you will be responsible for:



  • Cloud operations: configure clusters, deploy stuff, follow up on alerts, help customer support debug issues

  • Automating all of the above so they can instead drink margaritas (or non-alcoholic beverages, of course)

  • Creating systems, tools & processes that will enhance our support and operations efficiency

  • Improving service quality, discipline, and reliability throughout the lifecycle

  • Monitoring operating objectives, and streamlining and automating intervention

  • Continuous learning from Operations experience, modeled as software


This is a fully remote position for a candidate based in the EMEA timezone.

The ideal candidate



  • Has proven successful experience in an operations role

  • Has demonstrated the ability to successfully manage cloud-based infrastructure for a fast-growing organization

  • Has experience with containerization technologies

  • Has had exposure to cloud services such as AWS, Azure, GCP, etc.

  • Understands how an OS works, how networking works, how Git works, and the constraints of a distributed system

  • Has Puppet experience

  • Is proficient in Python (Golang a plus)


Nice to have



  • Knowledge of Magento Ecommerce, Symfony, Drupal, eZ Platform, or TYPO3


Note: we don't like stress, so we build everything to be robust and resilient, but stuff does break. This is a role with on-call duties and fire drills. If this fills you with dread... well, this might not be a fit for you.


A typical month in our team would look like this



  • Development week: writing the code and automation that make our infrastructure run smoothly, in Puppet, Go, and Python. The work ranges from monitoring tasks up to self-healing and updating

  • Deploy week: everything that goes live on PSH is deployed by us, and the project managers assign those cluster updates to whoever is working that week (during off hours). We're always improving :)

  • Escalation week: whenever there's a tough problem support can't deal with, our team investigates why and helps solve it

  • On-Call week: whenever a person is on-call, we don't assign anything else to that person, so that teammate has time to learn something new while being available in case something happens



About Platform.sh


Platform.sh is an idea-to-cloud application platform that simplifies cloud infrastructures.


We give developers the tools they need to experiment, innovate, get rapid feedback and deliver better-quality features with speed and confidence thanks to our unique rapid cloning technology.


Platform.sh serves thousands of customers worldwide including The Financial Times, Gap, Magento Commerce, Orange, Hachette, Ikea, Stanford University, Harvard University, The British Council, and Lufthansa.


We want people who are passionate, open, multicultural, friendly, humble and smart to join us and help this fast-growing, award-winning company to revolutionize the tech industry.

