Data Linguist for Language Technologies
2 weeks ago
**Reference**: 281_25_LS_LT_RE1
**Job title**: Data Linguist for Language Technologies (RE1)
**About BSC**
The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, and is now a hosting entity for EuroHPC JU, the Joint Undertaking that leads large-scale investments and HPC provision in Europe. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D into both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 1000 staff from 60 countries.
We promote Equity, Diversity and Inclusion, fostering an environment where each and every one of us is appreciated for who we are, regardless of our differences.
**Context And Mission**
The Language Technologies Unit at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning for under-resourced languages and domains. It has been entrusted by the Spanish and the Catalan governments with the mission to develop fundamental open-source resources and technologies for Spanish and Catalan. The LT Unit is currently in charge of two flagship projects at the national and regional levels: the ALIA project and the AINA project. In addition, the Unit participates in various EU-funded international projects.
**Key Duties**
- Collaborate with team members on data collection, cleaning, and preprocessing for LLM training.
- Assist in managing and organizing large-scale multilingual datasets, ensuring data integrity and accessibility.
- Support the implementation of data governance policies to ensure the legal and ethical use of language data.
- Work with other computational linguists and engineers to develop data pipelines and preprocessing workflows.
- Contribute to documentation of data processing methodologies to ensure reproducibility and transparency.
- Assist in evaluating the quality and suitability of datasets for language model development.
**Requirements**:
**Education**
- Master’s degree in Computational Linguistics, Theoretical and Applied Linguistics, or a related discipline.
**Essential Knowledge and Professional Experience**
- Knowledge of Python and experience working with NLP-related libraries such as NLTK and pandas.
- Strong analytical and problem-solving skills, particularly in data analysis and linguistic evaluation.
- Strong understanding of linguistic concepts.
- Ability to work effectively in a collaborative research environment.
- Fluency in spoken and written Spanish and English.
**Additional Knowledge and Professional Experience**
- Experience with language data preprocessing and linguistic annotation.
- Understanding of evaluation metrics for NLP models, such as accuracy, BLEU, and F1 score.
- Experience with tools for version control, such as Git and GitHub/GitLab.
- Native or good level of spoken and written Catalan.
**Competences**
- Strong organizational and documentation skills.
- Attention to detail and a proactive approach to problem-solving.
- Ability to work both independently and within a team.
- Critical thinking and adaptability in a fast-paced research setting.
- Good communication and presentation skills.
- Ability to work under set deadlines.
**Conditions**
- The position will be located at BSC within the Life Sciences Department.
- We offer a full-time contract (37.5h/week), a good working environment, flexible working hours, an extensive training plan, restaurant tickets, private health insurance, and support with relocation procedures.
- Duration: Open-ended contract due to technical and scientific activities linked to the project and budget duration.
- Holidays: 23 paid vacation days plus 24th and 31st of December per our collective agreement.
- Starting date: as soon as possible.
**Applications procedure and process**
- A full CV in English including contact details.
- A cover/motivation letter with a statement of interest in English, clearly specifying for which specific area and topics the applicant wishes to be considered. Additionally, two references for further contacts must be included. Applications without this document will not be considered.
**Development of the recruitment process**
The selection will be carried out through a competitive examination system. The recruitment process consists of two phases:
- Curriculum Analysis: Evaluation of previous experience and/or scientific history, degree, training, and other professional information relevant to the position (40 points).
The recruitment panel will be composed of at least three people, ensuring at least 25% representation of women. BSC-CNS is committed to the principles of the Code of Conduct for the Recruitment of Researchers of the European Commission and the Open, Tr