Data Linguist for Language Technologies

1 week ago


Barcelona, Spain Somma Full-time

**Reference**: 281_25_LS_LT_RE1
**Job title**: Data Linguist for Language Technologies (RE1)

**About BSC**
The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, and is now a hosting entity for the EuroHPC JU, the Joint Undertaking that leads large-scale investments and HPC provision in Europe. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D into both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 1000 staff from 60 countries.

We promote Equity, Diversity and Inclusion, fostering an environment where each and every one of us is appreciated for who we are, regardless of our differences.

**Context And Mission**
The Language Technologies Unit at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning for under-resourced languages and domains. It has been entrusted by the Spanish and Catalan governments with the mission to develop fundamental open-source resources and technologies for Spanish and Catalan. The LT Unit is currently in charge of two flagship projects at the national and regional level: the ALIA project and the AINA project. In addition, the Unit participates in various EU-funded international projects.

**Key Duties**
- Collaborate with team members on data collection, cleaning, and preprocessing for LLM training.
- Assist in managing and organizing large-scale multilingual datasets, ensuring data integrity and accessibility.
- Support the implementation of data governance policies to ensure the legal and ethical use of language data.
- Work with other computational linguists and engineers to develop data pipelines and preprocessing workflows.
- Contribute to documentation of data processing methodologies to ensure reproducibility and transparency.
- Assist in evaluating the quality and suitability of datasets for language model development.

**Requirements**:
**Education**
- Master’s degree in Computational Linguistics, Theoretical and Applied Linguistics, or a related discipline.

**Essential Knowledge and Professional Experience**
- Knowledge of Python and experience working with NLP-related libraries such as NLTK and pandas.
- Strong analytical and problem-solving skills, particularly in data analysis and linguistic evaluation.
- Strong understanding of linguistic concepts.
- Ability to work effectively in a collaborative research environment.
- Fluency in spoken and written Spanish and English.

**Additional Knowledge and Professional Experience**
- Experience with language data preprocessing and linguistic annotation.
- Understanding of evaluation metrics for NLP models, such as accuracy, BLEU, and F1 score.
- Experience with tools for version control, such as Git and GitHub/GitLab.
- Native-level or good command of spoken and written Catalan.

**Competences**
- Strong organizational and documentation skills.
- Attention to detail and a proactive approach to problem-solving.
- Ability to work both independently and within a team.
- Critical thinking and adaptability in a fast-paced research setting.
- Good communication and presentation skills.
- Ability to work under set deadlines.

**Conditions**
- The position will be located at BSC within the Life Sciences Department.
- We offer a full-time contract (37.5h/week), a good working environment, flexible working hours, an extensive training plan, restaurant tickets, private health insurance, and support with relocation procedures.
- Duration: Open-ended contract due to technical and scientific activities linked to the project and budget duration.
- Holidays: 23 paid vacation days plus 24th and 31st of December per our collective agreement.
- Starting date: ASAP.

**Applications procedure and process**
- A full CV in English including contact details.
- A cover/motivation letter in English with a statement of interest, clearly specifying the area and topics for which the applicant wishes to be considered. The letter must also include two references who may be contacted. Applications without this document will not be considered.

**Development of the recruitment process**
The selection will be carried out through a competitive examination system. The recruitment process consists of two phases:

- Curriculum Analysis: Evaluation of previous experience and/or scientific history, degree, training, and other professional information relevant to the position (40 points).

The recruitment panel will be composed of at least three people, ensuring at least 25% representation of women. BSC-CNS is committed to the principles of the Code of Conduct for the Recruitment of Researchers of the European Commission and the Open, Tr


