Data Manager for Language Technologies

3 weeks ago


Barcelona, Spain | Barcelona Supercomputing Center (BSC) | Full-time

**Job Reference**: 216_24_LS_LT_RE3

**Position**: Data Manager for Language Technologies (RE3)

**Closing Date**: Friday, 31 May, 2024

**About BSC**
- The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, was a founding and hosting member of the former European HPC infrastructure PRACE (Partnership for Advanced Computing in Europe), and is now a hosting entity for EuroHPC JU, the Joint Undertaking that leads large-scale investments and HPC provision in Europe. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D in both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 900 staff from 55 countries.
**Context And Mission**
- The Language Technologies (LT) Unit at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning. It has been entrusted by the Spanish and Catalan governments with the mission to develop essential open-source resources and technologies for Spanish and Catalan. In connection with this, the LT Unit is currently in charge of two flagship projects at the national and regional levels: the Spanish National Plan for the Advancement of Language Technology, funded by the Spanish Secretariat of Digitalisation and Artificial Intelligence, and the AINA project, aimed at developing AI resources for Catalan, funded by the Catalan Digitalisation Department. In addition, the Unit participates in various EU-funded international projects.
- The Language Technologies Unit at BSC is seeking a Data Manager with experience in language technologies to lead the development of the largest curated Spanish-language corpus. This corpus will be used to train reference foundational LLMs.
**Key Duties**
- Collaboration with MLOps and Deep Learning engineers: Work closely with machine learning engineers and the MLOps team to define and understand data requirements for projects. Assist in optimizing data flow and usage within machine learning pipelines.
- Operationalization of data acquisition and integration into existing pipelines: Design and oversee the operationalization of access to external data and its integration into the internal data processing pipelines. Ensure that the data integration process is efficient, scalable, and aligned with the research group's technical infrastructure and goals.
- Data management and governance: Establish data management protocols to ensure the integrity, confidentiality, and availability of data. This involves setting up data governance practices, including data quality control, metadata management, and access controls.
- Dissemination and engagement activities: Lead the dissemination of findings and datasets within the scientific community and beyond. This includes publishing data reports, contributing to academic papers, and presenting at conferences. Also, engage with the broader research community to foster collaborations and share best practices in data management.
- Technical documentation: Write comprehensive technical reports, project documentation, and scientific papers in English, Spanish, and Catalan. Ensure documentation is clear, accurate, and accessible to stakeholders.
- Research support: Assist in preparing research proposals, including the articulation of data needs and plans for data acquisition and management. Contribute to writing scientific papers and reports on findings.
- Continuous learning and skill development: Keep abreast of the latest developments in data engineering, language processing tools, and machine learning operations. Continuously update skills to improve data processes and workflows within the research group.
- Collaboration facilitation: Facilitate collaborations between the research group and external partners to enhance the group’s data capabilities. This may involve coordinating joint research projects, data-sharing agreements, and other forms of partnership.
- Monitoring and reporting: Regularly monitor the data landscape for new trends, sources, and tools that can benefit the research group. Provide reports and insights to the leadership on the status of data acquisitions, challenges faced, and the impact of data on research outcomes.
- Compliance and ethics oversight: Ensure all data management activities comply with relevant laws, ethical standards, and best practices in data handling. This includes overseeing the ethical review of data sources and uses, as well as managing any data protection implications.
- Training and su


