Vector Data Engineer

1 week ago


Madrid, Madrid, Spain · Johnson & Johnson Innovative Medicine · Full-time

At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity. Learn more at

Job Function
Data Analytics & Computational Sciences

Job Sub Function
Data Science

Job Category
Scientific/Technology

All Job Posting Locations:
Cornellà de Llobregat, Barcelona, Spain; Madrid, Spain

Job Description
Johnson & Johnson Innovative Medicine (J&J IM), a pharmaceutical company of Johnson & Johnson, is recruiting a Vector Data Engineer. The primary location for this position is Barcelona, Spain; the secondary location is Madrid. This is a hybrid role.

Our expertise in Innovative Medicine is informed and inspired by patients, whose insights fuel our science-based advancements. Visionaries like you work in teams that save lives by developing the medicines of tomorrow.

Join us in developing treatments, finding cures, and pioneering the path from lab to life while championing patients every step of the way. Learn more at

Position Summary:

The Vector Data Engineer designs and implements the embedding and semantic-search infrastructure that connects discovery, translational, and clinical data into AI-ready knowledge representations.

This role bridges multi-omics data engineering and machine-learning infrastructure, enabling scientists and agentic tools to discover biological insights through vector-based search and reasoning.

Key Responsibilities:

  • Develop scalable pipelines that convert multi-omics and clinical data (e.g., proteomics, transcriptomics, spatial omics, biomarkers) into vectorized embeddings for AI and semantic retrieval.
  • Build and maintain vector databases and hybrid data stores using technologies such as TileDB, Weaviate, or Snowflake Cortex.
  • Collaborate with the Data Transformation Engineers to design standardized data formats suitable for embedding generation and cross-modality mapping.
  • Integrate metadata, ontology terms, and provenance into vector representations to ensure traceability and governance compliance.
  • Partner with the AI/ML team to deploy embeddings that support agentic reasoning, semantic similarity, and cross-dataset querying.
  • Optimize indexing, retrieval, and inference performance across large-scale multi-omics data collections.
  • Evaluate and incorporate emerging representation-learning and knowledge-graph techniques to improve data discoverability and model interoperability.
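
For illustration only, the kind of work described in the responsibilities above (embedding data, attaching metadata and provenance, and retrieving by similarity) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not J&J's pipeline: the `InMemoryVectorStore` class, the `modality` and `source` metadata keys, and the three-feature sample vectors are all hypothetical, and a plain in-memory list stands in for a real vector database such as the ones named above.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Record:
    """One embedded sample plus the metadata needed for traceability."""
    sample_id: str
    vector: list
    metadata: dict = field(default_factory=dict)


def l2_normalize(values):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in values]


class InMemoryVectorStore:
    """Toy stand-in for a vector database (hypothetical, not a real product API)."""

    def __init__(self):
        self._records = []

    def add(self, sample_id, values, **metadata):
        # Store the normalized vector together with its provenance metadata.
        self._records.append(Record(sample_id, l2_normalize(values), metadata))

    def search(self, query, k=3, **filters):
        """Return the top-k (sample_id, score) pairs among records matching all filters."""
        q = l2_normalize(query)
        scored = [
            (r.sample_id, sum(a * b for a, b in zip(q, r.vector)))
            for r in self._records
            if all(r.metadata.get(key) == val for key, val in filters.items())
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = InMemoryVectorStore()
# Made-up three-feature profiles; real embeddings would come from a trained model.
store.add("S1", [5.0, 1.0, 0.0], modality="transcriptomics", source="study_A")
store.add("S2", [4.0, 2.0, 0.5], modality="transcriptomics", source="study_B")
store.add("S3", [0.0, 1.0, 6.0], modality="proteomics", source="study_A")

# Similarity search restricted by metadata, so results stay traceable to a modality.
hits = store.search([5.0, 1.5, 0.0], k=2, modality="transcriptomics")
```

Keeping metadata next to each vector is what lets a query be filtered by modality or source before ranking; a production system would typically delegate both the filtering and the nearest-neighbor search to the database's own index.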

Qualifications

  • MS/PhD in Computer Science, Computational Biology, Data Science, or related field.
  • 3+ years of experience building or maintaining vector or semantic-retrieval infrastructure.
  • Hands-on experience with multi-omics or biomedical data integration (e.g., RNA-seq, proteomics, clinical endpoints).
  • Proficiency in Python and frameworks such as LangChain, Transformers, or sentence-embedding models.
  • Familiarity with TileDB, Snowflake, Weaviate, FAISS, or other vector/array database systems.
  • Understanding of metadata modeling, ontologies (e.g., OBO, UMLS), and FAIR data practices.
  • Strong ability to collaborate across solution architecture, data science, and AI/ML teams.

Strategic Impact:

  • Multi-omics and clinical data assets transformed into interoperable, vectorized embeddings supporting scientific AI applications.
  • AI can perform semantic queries and reasoning over governed datasets.
  • Vector database infrastructure scales efficiently and complies with governance and lineage standards.
  • Accelerated insight generation across discovery, translational, and clinical domains.

#JRDDS
Required Skills
Preferred Skills:
Advanced Analytics, Business Intelligence (BI), Coaching, Collaborating, Critical Thinking, Data Analysis, Database Management, Data Privacy Standards, Data Reporting, Data Savvy, Data Science, Data Visualization, Econometric Models, Process Improvements, Technical Credibility, Technologically Savvy, Workflow Analysis


  • Data Engineer

    3 days ago


    Madrid, Madrid, Spain · Altia · Full-time

    At Altia we have spent 30 years creating future-ready digital solutions capable of generating real value and driving significant change. We are driven by a clear purpose: to grow by helping others grow, and to do so sustainably and durably. We know that we will only matter if together we make a positive impact and we all evolve...


  • Madrid, Madrid, Spain · MPower Plus · Full-time

    Role: Data Engineer / Senior Data Engineer. Location: Madrid, Spain / Lisbon, Portugal (remote, with initially 2 days onsite per month). Contract: 6+ months. Experience developing data systems on major cloud platforms (AWS, GCP, Azure). Hands-on experience building modern data architectures (data lakes, lakehouses, hubs). Demonstrated proficiency in ingestion tools...

  • Data Engineer

    3 days ago


    Madrid, Madrid, Spain · Quanteam UK · Full-time

    Role: Data Engineer / AI Specialist. Location: Madrid, Spain. On-site, full-time. Overview: We are seeking a Data Engineer / AI Specialist to support the development, implementation, and monitoring of AI-driven and automated solutions. The role involves optimising data pipelines, ensuring data integrity and compliance, and contributing to the integration...

  • Data Engineer

    1 week ago


    Madrid, Madrid, Spain · ADEREN · Full-time

    WE ARE LOOKING FOR: Data Engineer (Java/Spark). For a major company in the ICT sector, we are looking for a professional with proven experience in the Data Engineer role. Functions & Tasks: ■ Design and develop efficient, scalable data pipelines. ■ Manage and optimize the data infrastructure (security, performance, scalability). ■...

  • Data Engineer

    2 weeks ago


    Madrid, Madrid, Spain · Axpe Consulting · Full-time

    Boost your career with AXPE Consulting. Are you a Data Engineer looking for a new challenge in the banking sector? Our Data & Analytics Department keeps growing, and we are looking for a Data Engineer to join a strategic project at one of our main clients in the banking sector. If you are passionate about working with data at scale, designing...

  • Data Engineer

    3 days ago


    Madrid, Madrid, Spain · Datamatics Technologies · Full-time

    Job Title: Data Engineer (Databricks, Teradata & Neo4j). Location: Remote (candidates must be based in Europe). Experience: 5–7 years. Employment Type: Full-time. Client Location: Sweden. Position Overview: We are looking for an experienced Data Engineer with strong hands-on expertise in Databricks, Teradata, and Neo4j to join a leading technology-driven...

  • Data Engineer

    1 week ago


    Madrid, Madrid, Spain · Senovo IT Ltd · Full-time

    We are looking for an experienced Data Engineer to join our growing data team. The ideal candidate will have a strong background in building and maintaining scalable data pipelines, managing infrastructure as code, and ensuring smooth CI/CD integration across environments. You'll work closely with cross-functional teams to design, implement, and optimize data...

  • Senior Data Engineer

    3 days ago


    Madrid, Madrid, Spain · Awin · Full-time

    Purpose of position: As a Senior Data Engineer, you will play a pivotal role in our AI/ML workstream. You'll work closely with business teams and data scientists to design, maintain, and improve machine learning applications. Your main responsibilities will include managing existing ML workloads and building new batch and on-demand pipelines to support...

  • Data Engineer

    1 week ago


    Madrid, Madrid, Spain · VASS · Full-time

    Are you a Data Engineer with experience in AWS and the banking sector? Are you looking to keep growing under the best conditions, with the greatest closeness and flexibility? This will interest you. At VASS, a leading technology solutions company, we are looking for a Data Engineer with extensive experience in PySpark and SQL, a passion for data, and a solid understanding of the banking environment. The...

  • Senior GenAI Engineer

    3 days ago


    Madrid, Madrid, Spain · Ultra Tendency · Full-time

    Our Engineering community is growing, and we're now looking for a Senior GenAI Engineer - Databricks (m/f/*) to join our team in Spain, supporting our global growth. As a Senior GenAI Engineer (m/f/*), you will lead the design and implementation of advanced data and AI solutions on the Databricks Lakehouse Platform. Your focus will lie in building robust...