Data Engineer for Language and Translation

Posted 4 weeks ago


Barcelona, Spain | Barcelona Supercomputing Center - Centro Nacional de Supercomputación | Full-time

**Context And Mission**
The Language Technologies (LT) Unit at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning. It has been entrusted by the Spanish and Catalan governments with the mission of developing essential open-source resources and technologies for Spanish and Catalan. In connection with this, the LT Unit is currently in charge of two flagship projects at the national and regional levels: the Spanish National Plan for the Advancement of Language Technology, funded by the Spanish Secretariat of Digitalisation and Artificial Intelligence, and the AINA project, aimed at developing AI resources for Catalan and funded by the Catalan Digitalisation Department. In addition, the Unit participates in various EU-funded international projects.

The LT Unit at BSC is looking for a Data Engineer with experience in natural language processing and/or machine translation.

**Key Duties**
- Collect language data as required by the projects carried out in the Unit.
- Write language data processing scripts that clean and prepare data for ingestion by neural architectures.
- Automatically annotate data using state-of-the-art language processing tools.
- Manage corpora and language data according to the requirements specified in the Unit’s data management plan.
- Control the quality of collected data and metadata.
- Coordinate with machine learning engineers to determine data requirements.
- Write technical reports and project documentation in English, Spanish and Catalan.
- Prepare research proposals and write scientific papers.
- Coordinate external teams for data collection and data annotation.
- Ensure the applicability of open licenses to data sets, and resolve related queries.

**Requirements**
- Education
  - Degree in applied linguistics, computer science or related disciplines.
- Essential Knowledge and Professional Experience
  - At least 3 years of demonstrated experience in the NLP, MT or speech processing fields.
  - Excellent understanding of data administration and management functions (transfer, storage, analysis, distribution, exploration, etc.).
  - Proven experience working with large datasets and distributed file systems, as well as SQL, databases and metadata management.
  - Proven experience in UNIX/Linux environments, scripting languages and Python.
  - Fluent in written and spoken English and Spanish.
- Additional Knowledge and Professional Experience
  - Demonstrated experience in developing open-source software and resources.
  - Fluent in written and spoken Catalan.
  - Strong understanding of linguistic concepts.
- Competences
  - Ability to work independently and in a team to complete tasks on schedule.
  - Ability to work under set deadlines.

**Conditions**
- The position will be located at BSC within the Life Sciences Department.
- We offer a full-time contract, a good working environment, a highly stimulating environment with state-of-the-art infrastructure, flexible working hours, an extensive training plan, restaurant tickets, private health insurance and full support with relocation procedures.
- Duration: Open-ended contract due to technical and scientific activities linked to the project and budget duration.
- Starting date: As soon as possible.


