Data Manager for Language Technologies

8 months ago


Barcelona, Spain - Somm Excellence Alliance - Full-time

Context And Mission

The Language Technologies (LT) Unit at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning. The Spanish and Catalan governments have entrusted it with the mission of developing essential open-source resources and technologies for Spanish and Catalan. In connection with this, the LT Unit is currently in charge of two flagship projects at the national and regional levels: the Spanish National Plan for the Advancement of Language Technology, funded by the Spanish Secretariat of Digitalisation and Artificial Intelligence, and the AINA project, aimed at developing AI resources for Catalan and funded by the Catalan Digitalisation Department. In addition, the Unit participates in various EU-funded international projects.

The Language Technologies Unit at BSC is seeking a Data Manager with experience in language technologies to lead the development of the largest curated Spanish language corpus. This corpus will be used to train reference foundational LLMs.

Key Duties
- Collaboration with MLOps and Deep Learning engineers: Work closely with machine learning engineers and the MLOps team to define and understand data requirements for projects. Assist in optimizing data flow and usage within machine learning pipelines.
- Operationalization of data acquisition and integration into existing pipelines: Design and oversee the operationalization of access to external data and its integration into the internal data processing pipelines. Ensure that the data integration process is efficient, scalable, and aligned with the research group’s technical infrastructure and goals.
- Data management and governance: Establish data management protocols to ensure the integrity, confidentiality, and availability of data. This involves setting up data governance practices, including data quality control, metadata management, and access controls.
- Dissemination and engagement activities: Lead the dissemination of findings and datasets within the scientific community and beyond. This includes publishing data reports, contributing to academic papers, and presenting at conferences. Also, engage with the broader research community to foster collaborations and share best practices in data management.
- Technical documentation: Write comprehensive technical reports, project documentation, and scientific papers in English, Spanish, and Catalan. Ensure documentation is clear, accurate, and accessible to stakeholders.
- Research support: Assist in preparing research proposals, including the articulation of data needs and plans for data acquisition and management. Contribute to writing scientific papers and reports on findings.
- Continuous learning and skill development: Keep abreast of the latest developments in data engineering, language processing tools, and machine learning operations. Continuously update skills to improve data processes and workflows within the research group.
- Collaboration facilitation: Facilitate collaborations between the research group and external partners to enhance the group’s data capabilities. This may involve coordinating joint research projects, data-sharing agreements, and other forms of partnership.
- Monitoring and reporting: Regularly monitor the data landscape for new trends, sources, and tools that can benefit the research group. Provide reports and insights to the leadership on the status of data acquisitions, challenges faced, and the impact of data on research outcomes.
- Compliance and ethics oversight: Ensure all data management activities comply with relevant laws, ethical standards, and best practices in data handling. This includes overseeing the ethical review of data sources and uses, as well as managing any data protection implications.
- Training and support: Provide training and support to research team members on data-related topics, including best practices in data collection, management, and usage. Act as a resource for team members on data management tools and methodologies.

**Requirements**:

- Education: Bachelor’s Degree in Computer Science, Information Systems, Linguistics with a computational focus, or a related field. A Master’s degree or higher in these areas is highly desirable.
- Essential Knowledge and Professional Experience:
  - Demonstrable experience in managing large datasets, including acquisition, storage, processing, and dissemination of data. Experience in handling linguistic data is highly preferred.
  - Excellent understanding of data administration and management functions (governance, transfer, storage, analysis, distribution, exploration, etc.).
  - Understanding of data privacy laws, ethical considerations in data handling, and best practices in data governance.
  - Hands-on experience with database management systems (e.g., SQL, NoSQL) and data integration tools.
  - Proven experience in UNIX/Linux environments, scripting languages, and Python.


