Platform Reliability Engineer
1 day ago
Job Title: Platform Reliability Engineer
Career Level - E

Introduction to role:
Join us as a Platform Reliability Engineer in our Commercial IT – SSD, Data, Analytics and AI Platform Success Team. Your primary focus will be to ensure the stability, performance, and reliability of our Data, Analytics, and AI systems. You will bridge the gap between development and operations by generating insights into sub-optimal processes and optimization opportunities. This role offers an exciting opportunity to integrate Agile, Lean, and SAFe practices within monitoring and observability initiatives and to continuously improve delivery cycle times.

Accountabilities:
As a Platform Reliability Engineer, you will be responsible for the evaluation, selection, and deployment of monitoring & observability technologies. You will manage and maintain monitoring infrastructure, ensuring it aligns with industry best practices. You will collaborate with DevOps, CriticalOps, and IT leadership teams to understand system requirements and design effective monitoring strategies. You will also develop and implement monitoring solutions for infrastructure, applications, and services.

Responsibilities:
- Ensuring the stability, performance, and reliability of Data, Analytics, and AI systems by implementing and maintaining robust monitoring and observability solutions.
- Designing, deploying, and managing monitoring tools and practices that provide insights into the health and performance of our data infrastructure and analytics processes.
- Bridging the gap between development and operations by generating insights into sub-optimal processes and optimization opportunities.
- Maintaining working knowledge of platform architecture and business acumen.
- Integrating Agile, Lean, and SAFe practices within monitoring and observability initiatives to continuously improve delivery cycle times.
- Exploring and implementing new ways to automate systems, designing and testing automation processes, identifying quality issues, and supporting IT platform teams to eliminate defects and errors in product and platform development.
- Experience leveraging AIOps capabilities to uplift existing production operations.

Technology/Tool Management
- Responsible for the evaluation, selection, and deployment of monitoring & observability technologies suitable for the organization’s needs.
- Manage and maintain monitoring infrastructure, ensuring it aligns with industry best practices.

Monitoring & Observability Practice Management
- Collaborate with DevOps, CriticalOps, and IT leadership teams to understand system requirements and design effective monitoring strategies.
- Establish key metrics and KPIs that enable insights and analytics to achieve data-driven continuous improvement.
- Provide training and support to other teams on using monitoring tools effectively.
- Create and maintain documentation for monitoring and observability practices, including standard operating procedures and best practices.
- Stay abreast of industry trends, emerging technologies, and best practices related to monitoring and observability platforms.

Monitoring & Observability Implementation & Operations
- Develop and implement monitoring solutions for infrastructure, applications, and services.
- Design and configure alerting mechanisms to proactively respond to potential issues.
- Use monitoring tools to identify and troubleshoot issues in real time.
- Collaborate with other teams to resolve incidents promptly and prevent recurrence.
- Analyze monitoring data to identify performance bottlenecks and areas for improvement.
- Work with development and operations teams to optimize system performance based on monitoring insights.
- Implement automation scripts and workflows to streamline monitoring processes.
- Integrate monitoring solutions with existing frameworks for seamless operation.
- Identify and evaluate “self-healing” opportunities based on production issue trend analysis to inform the AIOps roadmap.

Essential Qualifications:
- Degree-level education in computer science, information technology, or a related field.
- Proven experience as a monitoring and observability engineer or in a similar role.
- Proficient in developing monitoring capabilities and configuring integration with tools such as Prometheus, Grafana, Splunk, SumoLogic, DataDog, DynaTrace, etc.
- Strong scripting skills (e.g., Python) for automation in data environments.
- Familiarity with logging, tracing, and APM (Application Performance Monitoring) solutions.

Desirable Qualifications:
- Customer engagement experience.
- Knowledge of data processing frameworks (e.g., Apache Spark) and data storage solutions (e.g., data lakes, warehouses).
- Experience with data orchestration tools (e.g., Apache Airflow).
- Understanding of data lineage and metadata management.

Ready to make a difference? Apply today and be part of a team that has the backing to innovate, disrupt an industry, and change lives.
-
Site Reliability Engineer
2 weeks ago
Spain, Antal International Network, Full-time
Senior Site Reliability Engineer (SRE) - Fintech Sector. Location: Barcelona, Spain (Hybrid Model). Company Overview: Join a leading international fintech company at the forefront of innovation, revolutionizing financial services for millions worldwide. Our client is looking for a Senior Site Reliability Engineer (SRE) to play a pivotal role in ensuring the...
-
Reliability Engineering Manager, Platform
1 week ago
Spain, Softbank Investment advisers, Full-time
Reliability Engineering Manager, Platform. Contentsquare is a digital experience analytics company dedicated to better customer understanding and making the digital world more human. We power more human experiences through understanding, action, and trust. In 2022, we raised $600M in Series F funding. In 2023, we were recognized as a certified Great Place to...
-
Site Reliability Engineer
2 weeks ago
Spain, Zartis, Full-time
The company and our mission: Zartis is a digital solutions provider working across technology strategy, software engineering and product development. We partner with firms across financial services, MedTech, media, logistics technology, renewable energy, EdTech, e-commerce, and more. Our engineering hubs in EMEA and LATAM are full of talented professionals...
-
Reliability Engineering Manager, Platform
2 weeks ago
Spain, Contentsquare, Full-time
About Contentsquare: Contentsquare is a digital experience analytics company dedicated to better customer understanding and making the digital world more human. We power more human experiences through understanding, action, and trust. Since our founding in France in 2012, we have grown to be a truly global team, representing more than 72 nationalities in...
-
Platform Engineer
2 weeks ago
Spain, KUBO International, Full-time
New opportunity for a Platform Engineer! Join a fintech company that creates new technologies to enhance and automate financial services and processes. This allows small and medium-sized businesses to trade and transact internationally by eliminating the boundaries of more traditional methods. Key responsibilities: Work within a team of SREs to ensure...
-
Platform Engineer
2 weeks ago
Spain, Exoticca, Full-time
What is Exoticca? Exoticca is a pioneering online travel agency that has revolutionized the conception, production, and e-commerce of long-distance dream trips. At the core of Exoticca's brand equity is the commitment to "creating life milestones." We believe in delivering best-value trips, exploring unique destinations, curating extraordinary travel...
-
Site Reliability Engineer
2 weeks ago
Spain, Galaxyfinx, Full-time
FinX aims to build a digital financial platform that will deliver on the promise of simplicity, technological advancement, and superior customer products & services for millions of people in Vietnam. As a Site Reliability Engineer, you’ll be part of a friendly and supportive SRE & Cloud team within a company that has a fantastic culture, adopting cloud...
-
Platform Engineer
6 days ago
Spain, Exalate, Full-time
We are looking for an experienced Platform Engineer to join our team and play a critical role in building and maintaining our scalable infrastructure. The ideal candidate will have strong expertise in cloud environments, automation, and CI/CD pipelines. You will collaborate closely with development, operations, and security teams to create a seamless,...
-
Platform Engineer
2 weeks ago
Spain, Exoticca Travel Co, Full-time
What is Exoticca? Do you like traveling? Do you enjoy the exotic? Do you like challenges? If the answer is YES, welcome to Exoticca! Exoticca is a company that was started in 2013 by professionals specialized in the creation and online distribution of trips. Our mission is to offer our clients the possibility to visit the most beautiful and stimulating places on...
-
Sr Service Reliability Engineer
3 days ago
Spain, buscojobs España, Full-time
Job Description: This is an exciting opportunity to join our Managed Sportsbook Services (MSS) tribe in Seville, Spain. We are looking for a hands-on Senior Service Reliability Engineer with a strong technical foundation to oversee our Service Reliability Engineering practice and guide the strategic direction of system reliability within our unit. As Sr SRE...
-
Reliability Engineer
2 weeks ago
Spain, Resource Group - Recruitment, Full-time
Resource Group is looking for a Reliability Engineer to join the team of a cargo and passenger airline based in Madrid. Duties and Responsibilities: Monitoring the reliability of systems, components, and aircraft of the...
-
Sr Service Reliability Engineer
2 weeks ago
Spain, Sportradar, Full-time
Company Description: We’re the world’s leading sports technology company, at the intersection between sports, media, and betting. More than 1,700 sports federations, media outlets, betting operators, and consumer platforms across 120 countries rely on our know-how and technology to boost their business. Job Description: This is an exciting opportunity to...
-
Backend Engineer
6 days ago
Spain, Spotify, Full-time
We are looking for a Backend Engineer to join our Data Protection team and be a key player in ensuring that data at Spotify is appropriately protected. You will contribute to building a scalable data protection platform that ensures data across Spotify is appropriately protected, managed, and deleted when no longer needed. This is a highly impactful role...
-
Site Reliability Engineer
2 weeks ago
Spain, ING, Full-time
At ING we are looking for a Site Reliability Engineer. Your role and work environment: We are looking for a talented and enthusiastic Site Reliability Engineer (SRE) to join our SRE Expert Unit team. The responsibility of this team is to ensure the reliability and scalability of the platform to provide the best customer experience to our clients and our...
-
Reliability Engineer
3 days ago
Spain, EDP Renewables, Full-time
Country/Region: ES. City: Madrid. Company: EDP RENOVÁVEIS S.A. EDP Renewables is a global leader in the renewable energy sector and the fourth-largest wind energy producer. With a sound development pipeline, first-class assets, and market-leading operating capacity, EDPR has undergone exceptional development in recent years and is...
-
Senior Site Reliability Engineer
2 weeks ago
Spain, zeroG - AI in Aviation, Full-time
WELCOME TO OLX. At OLX, we work together to build a more sustainable world through trade. We make it safe, smart, and convenient to buy and sell cars, find housing, get jobs, buy and sell household goods, and more. Our colleagues around the world help to serve millions of people every month through our well-loved consumer brands including OLX, Otodom, and...
-
Senior Platform Engineer
2 weeks ago
Spain, SeQura, Full-time
About seQura: SeQura provides innovative, flexible and easy-to-use payments technologies that help merchants acquire, convert and retain more customers. We make a difference in sales performance by tailoring our solutions to different sectors, addressing their unique pain points and delivering superior results in Retail, Education (EduQa), Optics (OptiQa),...
-
Site Reliability Engineer SRE 2
1 day ago
Spain, Norconsulting, Full-time
Pozuelo de Alarcón, MD, Spain. Site Reliability Engineer (SRE) 2. Job Description: Norconsulting is seeking, on behalf of one of its clients, a leading company in the security sector, a Systems Administrator to join its systems and development team at its offices in Madrid. SYSTEMS ADMINISTRATOR. Required experience: Monitoring tools and...
-
Senior Platform Engineer
2 weeks ago
Spain, Lansweeper, Full-time
With a $150 million funding round received recently from Insight Partners and impressive yearly revenue growth, Lansweeper is rapidly expanding its Global teams. We now need a Senior Platform Engineer to help us scale and to take Lansweeper to the next level. Lansweeper is an IT asset management software helping businesses better understand, manage and...
-
Senior Platform Engineer
1 week ago
Spain, Lansweeper NV, Full-time
With a $150 million funding round received recently from Insight Partners and impressive yearly revenue growth, Lansweeper is rapidly expanding its Global teams. We now need a Senior Platform Engineer to help us scale and to take Lansweeper to the next level. Lansweeper is an IT asset management software helping businesses better understand, manage and...