Data Engineer for Language Technologies

1 month ago


Barcelona, Spain - Somm Excellence Alliance - Full-time

Context And Mission

The Language Technologies (LT) Unit at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning. It has been entrusted by the Spanish and Catalan governments with the mission to develop essential open-source resources and technologies for Spanish and Catalan. In connection with this, the LT Unit is currently in charge of two flagship projects at the national and regional levels: the Spanish National Plan for the Advancement of Language Technology, funded by the Spanish Secretariat of Digitalisation and Artificial Intelligence, and the AINA project, aimed at developing AI resources for Catalan, funded by the Catalan Digitalisation Department. In addition, the Unit participates in various EU-funded international projects.

The Language Technologies Unit at BSC is seeking a Data Manager with experience in language technologies to lead the development of the largest curated Spanish language corpus. This corpus will be used to train reference foundational LLMs.

Key Duties
- Identification of open/public data sources: Proactively identify and evaluate open and public data sources for the creation of extensive corpora in Spanish and co-official languages. This includes scouting for datasets that are relevant to the group's research focus on language models, including translation, audio processing, and large language models (LLMs).
- Engagement with data providers: Act as the primary contact point for negotiations and communications with external data providers, including public entities, companies, and other research institutions. Establish and maintain relationships to secure access to valuable data resources.
- Data acquisition strategy design: Develop and implement strategies for the efficient acquisition of external data. This includes outlining procedures for data requests, licensing negotiations, and ensuring compliance with data privacy regulations.
- Data management and governance: Collaborate in data management protocols to ensure the integrity, confidentiality, and availability of data.
- Dissemination and engagement activities: Lead the dissemination of findings and datasets within the scientific community and beyond. This includes publishing data reports, contributing to academic papers, and presenting at conferences. Also, engage with the broader research community to foster collaborations and share best practices in data management.
- Manage corpora and language data according to the Unit’s data management requirements.
- Control the quality of collected data and metadata.
- Compliance and ethics oversight: Ensure all data management activities comply with relevant laws, ethical standards, and best practices in data handling. This includes overseeing the ethical review of data sources and uses, as well as managing any data protection implications.

Requirements

- Education

Bachelor’s Degree.
- Essential Knowledge and Professional Experience

Proficiency in data management principles and techniques.

Strong understanding of data acquisition strategies, including licensing negotiations and compliance with data privacy regulations.

Knowledge of open/public data sources relevant to language models, translation, audio processing, and large language models (LLMs).

Familiarity with data governance principles, including data integrity, confidentiality, and availability.

Excellent communication and negotiation skills for engaging with external data providers and stakeholders.

Experience in disseminating findings and datasets within the scientific community through reports, academic papers, and conference presentations.

Strong attention to detail and ability to control the quality of collected data and metadata.

Knowledge of compliance requirements and ethical standards in data management.

Excellent understanding of data administration and management functions (governance, transfer, storage, analysis, distribution, exploration, etc.).

Understanding of data privacy laws, ethical considerations in data handling, and best practices in data governance.

Experience in establishing and maintaining partnerships with data providers, research institutions, and other relevant organizations.

Fluent in written and spoken Catalan
- Competences

Ability to work effectively in a team, contributing positively to team operations and working relationships.

Willingness to stay abreast of new data sources, technologies, and methodologies in the rapidly evolving field of language technologies.

Strong organizational skills, with the ability to manage multiple tasks simultaneously and meet deadlines.

Ability to work independently and in a team to complete tasks on schedule.

Ability to work under set deadlines.

Conditions

The position will be located at BSC within the Life Sciences Department.

We offer a full-time contract (37.5h/week), a good working environme