Lead SRE Engineer

3 weeks ago


Remote, Spain · Stuart · Full-time

Stuart (DPD Group) is a sustainable last-mile logistics company that connects retailers and e-merchants to a fleet of geolocalised couriers across several countries in Europe.

Our Mission
We are an impact-driven company that aims to build the future of logistics for a more sustainable world: shared, efficient and reliable. We are committed to creating a new standard for urban deliveries that meets today’s environmental and social challenges while offering a premium delivery experience blending speed, flexibility and convenience.

Our motto: “Make every delivery a moment all of us can truly celebrate.” More than 3,000 leading brands already partner with us across Restaurants, Grocery, Retail & Luxury, eCommerce and Professional Services to deliver all types of goods at the tap of a button. Stuart is a highly diverse and inclusive company of 700+ employees with 90+ nationalities working across France, Italy, Poland, Portugal, Spain and the U.K.

It’s the right moment and the right place for us to make an impact on millions of people, as home delivery services hit a record high. And guess what? You can help us fulfil our vision.

We are looking for a **Lead Site Reliability Engineer** who will be a technical leader for our SRE team. You will guide the team technically and help us make our platform more robust, handle failures gracefully, and detect issues early by means of automation, proper alerting, and chaos engineering.

**The SRE mission** is to make the platform as reliable as possible by reducing the number and severity of incidents affecting it. We need to make sure that all services are efficiently monitored, with the right thresholds set so that alarms are meaningful, and that most remediation work is automated rather than manual. We further improve the platform’s reliability by introducing controlled errors into it (chaos engineering principles) and testing different disaster recovery scenarios. SREs are the stewards of reliability, and they provide the technical and documentation instruments for other Engineering teams to build reliable software.

**The SRE team** is a new team at Stuart, and you will have the opportunity to see the team grow further and have a say in how it does so. You will be part of the Infrastructure department under the Reliability area, together with the Engineering Support team. Other areas of the department are Cloud Engineering, Security, and IT.

**What will I be doing?**:

- Be a technical leader for the team and the go-to person for software reliability matters.
- Take part in additional departmental efforts such as hiring, running community talks, defining team processes and other such ways to contribute to culture and growth on the team.
- Help the other engineering teams to build reliable, observable, and performant products.
- Drive and help other teams to set SLOs and SLAs and track them via SLIs.
- Lead the design of the Stuart observability stack, implement it, and guide other teams to adopt it.
- Contribute to Stuart systems reliability and performance.
- Write playbooks for alarms, and then automate them so manual intervention is not required.
- Document knowledge and practices in a clear way, so other departments can benefit from it.
- Collaborate with the Engineering Support team on incident management.
- Conduct and lead post-mortem meetings; follow up on the action items.
- Lead the way in adopting chaos engineering.

**What do we need from you?**:

- 5+ years of experience in a similar position (even if with a different title) in an always-up, always-available mission-critical service.
- Whether you come from a Systems or a Software Engineering background, we will like you exactly the same.
- Love for automation: you don’t want to repeat the same job twice.
- Proven track record of leading complex projects from start to finish.
- You are the go-to person in your team if there are difficult technical problems to solve.
- You have written programs to automate tasks, reducing toil.
- You feel comfortable doing low-level Linux and networking debugging.
- Experience working with complex Terraform codebases. Bonus points if you have written a provider.
- Very good knowledge of cloud environments and Kubernetes (we use AWS & EKS).
- Working experience with chaos engineering practices.
- You like teaching and passing best practices on to others, and you write thorough documentation.
- Proactive mindset: if you see something is not working, you start the process to fix it.
- Both written and spoken fluency in English.

Don’t worry, we don’t expect you to tick every single item here. But it should give you a feeling of what kind of experience we are looking for.

**The stuff you wanna know**:

- Family-friendly work-life balance - work from home and flexible hours
- Option to work remotely anywhere in Spain
- Ticket Restaurant by Edenred (€11 daily)
- Unlimited access to Udemy for all your learning and development needs
- Stuart Academy with regular workshops, Stu-Classes,


  • Jr. SRE Engineer

    4 weeks ago


    Remote, Spain · Dabster Systems UK Limited · Full-time

    Job Description: **What are we looking for?** We are looking for a **SRE Engineer** working close to one of our main clients. **Main Tasks And Accountabilities Will Be** - You will be a key member of a team that leverages software and system engineering practices to build and run distributed, highly reliable systems at scale within AWS. - Working with...

  • Site Reliability Engineer

    4 weeks ago


    Remote, Spain · zb.io · Full-time

    2+ years relevant industry experience in SRE, Cloud Engineering or DevOps roles Considerable experience with Linux systems administration (Ubuntu experience appreciated) - Experience with AWS and cloud architectures/services. - Familiarity with the container and container orchestration space (Docker, Kubernetes, etc.) - Experience working with...


  • Remote, Spain · Affidea BV · Full-time

    We are looking for an **IT Communications Engineer (L3)** that will provide the necessary know-how to identify, implement and maintain the Group IT Communications Infrastructure to their retirement as instructed by the Communications Lead Engineer. We are looking for talent based in any country where we have a presence and who will work at the corporate...

  • Site Reliability Engineer

    4 weeks ago


    Remote, Spain · Fortexpro · Full-time

    We are looking for SRE to work on a major international project. 100% remote work. Offer addressed to workers from any EEC country. Tasks - Implements Site Reliability Engineering and/or DevOPS practices. - Manages technology, infrastructure and software development projects in accordance with SRE and/or DevOPS principles. - Empowers development teams...

  • DevOps Engineer

    6 days ago


    Remote, Spain · Plexus · Full-time

    At Plexus we are looking for a DevOps profile to join us on an important project in the banking sector. - 4 years of experience as DevOPs / SRE with advanced knowledge and experience in operating Kubernetes (EKS preferred) - Knowledge and experience in Public Cloud (AWS preferred) - Automation development...


  • Remote, Spain · WNTD · Full-time

    **What are we looking for?** For our team specialised we are looking for a SRE - Wintel to be part of a team working close to one of our main clients. This position can be performed 100% remote from any location in Spain. **Wintel Technologies** - Microsoft Windows Operating Systems - IIS - File Systems - DNS / DHCP / DFS - Identity & Security -...


  • Remote, Spain · Booming Games · Full-time

    About the role Join our team at Booming Games as a Site Reliability Engineer and ensure the peak performance and reliability of our systems across multiple geographical locations! As a key player in troubleshooting and resolving complex issues, you will collaborate with engineers to drive automation, standardization, and optimization efforts. Your expertise...


  • Remote, Spain · Aveva · Full-time

    AVEVA is a global leader in industrial software. Our cutting-edge solutions are used by thousands of enterprises to deliver the essentials of life - such as energy, infrastructure, chemicals and minerals - safely, efficiently and more sustainably. We’re the first software business in the world to have our sustainability targets validated by the SBTi, and...


  • Remote, Spain · IT Hiring Central · Full-time

    **Benefits**: DENTAL INSURANCE, MEDICAL INSURANCE, VISION INSURANCE, LIFE INSURANCE, WORK FROM HOME, PAID TIME OFF, OTHER **Salary**: 48.000,00€ - 55.000,00€ per year **BONUS DESCRIPTION**: 2.5% quarterly bonus totaling 10% per year **Must Haves**: - Must have Automation Testing Experience in Cypress or Protractor - Must have a minimum of 2 years of API...


  • Remote, Spain · Raisin · Full-time

    Team Our SRE team builds and operates a reliable cloud infrastructure and empowers the product teams with tools and processes to deliver features as fast as possible. - Infrastructure - Process, tooling, automation - Observability of business critical systems - Security and compliance (in a highly regulated sector) Your Responsibilities - Design, build and...


  • Remote, Spain · Solera · Full-time

    **The Role** **What You Will Do** - Provide on-call support on a rotational basis and effectively diagnose / triage / fix systems during outages - Adopt DevOps practices to drive the use of automation to simplify routine tasks. - Consistently collaborate, cross-train and be secondary support on multiple areas within the team **What You’ll Bring** -...


  • Remote, Spain · Meta · Full-time

    Meta Security is looking for an Incident Response Engineer with experience in the identification, containment and mitigation of security incidents. You will be analyzing different data sources to detect, investigate and respond to internal and external threats. You will also be working with our software and production engineering teams to develop scalable...


  • Remote, Spain · Cover Genius · Full-time

    **The Company** Our team and products have been recognized with dozens of awards including by the Financial Times who ranked Cover Genius as the #1 fastest growing company in APAC. Our diverse team across 20+ countries and many language groups commits itself to diverse cultural programs, in particular “CG Gives” which makes social entrepreneurs out of...

  • Senior DevOps Engineer

    4 weeks ago


    Remote, Spain · SIA Fyst Tech · Full-time

    **WHAT YOU'LL BE WORKING ON** - Manage and enhance the stability and operation of the most critical services (including reviews, capacity planning, and performance tuning). - Develop automations/tooling for better platform reliability/availability. - Work with cutting-edge & distributed systems not limited to, but including Kubernetes, ArgoCD, PostgreSQL,...


  • Remote, Spain · System73 · Full-time

    **Job Description** Looking to be part of a team that develops leading solutions with proven innovations currently in production? Be part of System73, a company with a portfolio of disruptive products in the content delivery and media streaming industries. We are expanding internationally beyond our core team to ensure we deliver high-quality products...


  • Remote, Spain · Winning · Full-time

    **Overview / Position Summary**: The Role has a technical profile that is attracted by system administration tasks and the overall security of the entire infrastructure, with concerns and orientation towards automation. The role is able to operate **various IT systems** (databases, scripts, configurations, etc.) but who, above all, is responsible and...

  • Senior Backend Engineer

    4 weeks ago


    Remote, Spain · Anyplace · Full-time

    Our specialty lies in providing exceptional accommodations that prioritize a productive and efficient workstation setup. We understand that remote workers value the ability to seamlessly transition their work environment wherever they go. That's why our accommodations are designed with state-of-the-art workspaces, equipped with all the necessary amenities...


  • Remote, Spain · Crypto Recruit · Full-time

    **Overview** At the company, we are looking for a backend developer with experience in NodeJS that will help us build new products in the blockchain space and help us add new features for some of our current projects and some new stealth projects that we are starting to work on. **Responsibilities** - Discuss, analyze, and understand new feature...

  • Data Engineer

    6 days ago


    Remote, Spain · HEMAV Technology · Full-time

    At HEMAV we are an AgriTech StartUp founded in 2012 working passionately on Data Science in the agrifood industry with the purpose of generating more food with less environmental impact. We use a unique approach, based on Artificial Intelligence using satellite technology and other data sources (drone, meteo, soil, etc.). Users of our technology...


  • Remote, Spain · Outliant · Full-time

    **Seniority Level**:Mid-Senior Level **Responsibilities** - Lead the design, development, and implementation of data science projects - Identify business problems, define project goals, and develop solutions. - Manage data pipelines, including acquisition, cleaning, integration, and transformation. - Develop machine learning models, select appropriate...