HPC Operations Engineer

4 weeks ago


Madrid, Spain · CoreWeave Europe · Full-time

**_We are looking for people willing to work in two shifts from 7am to 9pm. This role is fully remote within Spain._**

**About the role**:
The High Performance Computing Operations team is responsible for the day-to-day provisioning, management and uptime of CoreWeave's ever-expanding fleet of server nodes. Playing a central role in CoreWeave's growth strategy, this team is on the front line for configuration, updates and remote troubleshooting of our highest tier of supercomputing clusters and their networking, delivery platforms and tools dependencies. You will be in a daily battle with the forces of entropy to maximise the number of nodes CoreWeave can deliver to customers.

We are seeking curious, creative and persistent problem solvers to join our HPC Operations team to help us drive batches of server nodes through our provisioning and validation processes while efficiently and effectively troubleshooting node or cluster problems as they arise. This individual will join a team of committed engineers working to deploy nodes as fast as they can be racked and turned on.

**Key Responsibilities**:

- Install, configure, and maintain large-scale high-performance supercomputing clusters running state-of-the-art GPUs
- Troubleshoot hardware and software issues; escalate and coordinate as needed with data centre, network and platform teams to drive resolution
- Monitor and analyse system performance and take appropriate remediation actions for cloud health
- Approach your work with flexibility and optimism, anticipating shifting business and technical priorities
- Create and maintain documentation of team processes, knowledge and best practices for system management
- Think critically about your day-to-day work and work collaboratively to improve team processes and efficiency

**Qualifications**:

- Experience troubleshooting or administering data centre or on-prem infrastructure (servers, storage, network or a mix)
- Strong understanding of Linux system administration and networking concepts
- Ability to troubleshoot hardware and software issues and perform system maintenance tasks consistently and reliably
- Software development or scripting languages (Bash, Python, PowerShell, etc.)
- Grafana, Prometheus, PromQL queries or similar observability platforms
- Data centre environments including server racks, HVAC systems, fiber trays
- Kubernetes administration

The salary for this position ranges from 34,000€ to 38,000€ plus competitive benefits. Pay is based on a number of factors including job-related knowledge, skills, and experience.

**Why CoreWeave?**

At CoreWeave, we work hard, have fun, and move fast! We're in an exciting stage of hyper-growth that you will not want to miss out on. We're not afraid of a little chaos, and we're constantly learning. Our team cares deeply about how we build our product and how we work together, which is represented through our core values:

- Be Curious at your Core
- Act like an Owner
- Empower Employees
- Deliver Best-in-Class Client Experience
- Achieve More Together

We support and encourage an entrepreneurial outlook and independent thinking. We foster an environment that encourages collaboration and provides the opportunity to develop innovative solutions to complex problems. As we get set for take-off, the growth opportunities within the organization are constantly expanding. You will be surrounded by some of the best talent in the industry, who will want to learn from you, too. Come join us!

**Benefits**

_CoreWeave is an equal opportunity employer, committed to diversity and inclusiveness. We will consider all qualified applicants without regard to race, color, nationality, gender, gender identity or expression, sexual orientation, religion, disability or age._


