Senior AI Research Engineer, Model Inference

1 week ago


Madrid, Spain Tether.io Full-time

Senior AI Research Engineer, Model Inference (Remote)

About the job

We are looking for an experienced AI Model Engineer with deep expertise in kernel development, model optimization, fine-tuning, and GPU acceleration. The engineer will extend the inference framework to support inference and fine-tuning for language models, with a strong focus on mobile and integrated GPU acceleration (Vulkan).

This role requires hands-on experience with quantization techniques, LoRA architectures, the Vulkan backend, and mobile GPU debugging. You will play a critical role in pushing the boundaries of desktop and on-device inference and fine-tuning performance for next-generation SLM/LLMs.

Responsibilities

  • Implement and optimize custom inference and fine-tuning kernels for small and large language models across multiple hardware backends.
  • Implement and optimize full and LoRA fine-tuning for small and large language models across multiple hardware backends.
  • Design and extend datatype and precision support (int, float, mixed precision, ternary QTypes, etc.).
  • Design, customize, and optimize Vulkan compute shaders for quantized operators and fine-tuning workflows.
  • Investigate and resolve GPU acceleration issues on Vulkan and integrated/mobile GPUs.
  • Architect and prepare support for advanced quantization techniques to improve efficiency and memory usage.
  • Debug and optimize GPU operators (e.g., int8, fp16, fp4, ternary).
  • Integrate and validate quantization workflows for training and inference.
  • Conduct evaluation and benchmarking (e.g., perplexity testing, fine-tuned adapter performance).
  • Conduct GPU testing across desktop and mobile devices.
  • Collaborate with research and engineering teams to prototype, benchmark, and scale new model optimization methods.
  • Deliver production-grade, efficient language model deployment for mobile and edge use cases.
  • Work closely with cross-functional teams to integrate optimized serving and inference frameworks into production pipelines designed for edge and on-device applications.
  • Define clear success metrics such as improved real-world performance, low error rates, robust scalability, and memory efficiency, with continuous monitoring and iterative refinements.

  • Proficiency in C++ and GPU kernel programming.
  • Proven expertise in GPU acceleration with the Vulkan framework.
  • Strong background in quantization and mixed-precision model optimization.
  • Experience and expertise in Vulkan compute shader development and customization.
  • Familiarity with LoRA fine-tuning and parameter-efficient training methods.
  • Ability to debug GPU-specific performance and stability issues on desktop and mobile devices.
  • Hands-on experience with mobile GPU acceleration and model inference.
  • Familiarity with large language model architectures (e.g., Qwen, Gemma, LLaMA, Falcon, etc.).
  • Experience implementing custom backward operators for fine-tuning.
  • Experience creating and curating custom datasets for style



  • Madrid, Spain Adyen Full-time

    This is Adyen. Adyen provides payments, data, and financial products in a single solution for global customers like Meta, Uber, H&M, and Microsoft, making us the financial technology platform of choice. At Adyen, everything we do is engineered for ambition. We create an environment with opportunities for our people to succeed, backed by a culture that...

  • Senior AI Engineer

    2 weeks ago


    Madrid, Spain Jordan martorell s.l. Full-time

    We are seeking a Senior AI Engineer to join our AI & Analytics team in Spain. This role is ideal for an experienced engineer who is passionate about designing and deploying AI-driven solutions, with a strong foundation in machine learning, generative AI, and cloud computing. You will work with a multidisciplinary team to develop intelligent systems for...


  • Madrid, Spain Speechify Full-time

    Senior Software Engineer, AI Model serving - Madrid, Spain...

  • AI Engineer

    2 weeks ago


    Madrid, Spain Domyn Full-time

    Overview: We're looking for a talented AI Engineer to join our team in Madrid, focused on implementing and scaling large language models (LLMs) and generative AI systems. In this role, you will bridge the gap between cutting-edge research and practical applications, turning innovative AI concepts into robust, efficient, and production-ready systems....

  • Senior AI Engineer

    2 weeks ago


    Madrid, Madrid, Spain InteractiveAI Full-time €90,000 - €110,000

    What You'll Do: As a Senior AI Engineer (GenAI) at InteractiveAI, you'll play a key role in developing our next-generation AI capabilities, advancing our LLM, SLM, and fine-tuning workflows while contributing to the core model development that powers our platform. You'll work closely with the Chief of AI and a cross-functional squad to design, experiment with,...


  • Madrid, Spain Tether Operations Limited Full-time

    Join Tether and Shape the Future of Digital Finance. At Tether, we're not just building products, we're pioneering a global financial revolution. Our cutting-edge solutions empower businesses, from exchanges and wallets to...


  • Madrid, Spain Speechify Full-time

    Senior Software Engineer, AI Model serving - Madrid, Spain...


  • Madrid, Spain BEAI Energy Full-time

    Company Description: BEAI is a specialized Artificial Intelligence company for the energy and industrial sectors that revolutionizes the application of this technology by placing human beings at the center of our solutions and applying the highest ethical standards. We have an AI laboratory with exceptional talent that our clients can leverage through an as a...


  • Madrid, Spain BEAI Energy Full-time

    Company Description: BEAI is a specialized Artificial Intelligence company for the energy and industrial sectors that revolutionizes the application of this technology by placing human beings at the center of our solutions and applying the highest ethical standards. We have an AI laboratory with exceptional talent that our clients can leverage through an as...

  • Senior AI Engineer

    3 weeks ago


    Madrid, Spain NTT DATA Europe & Latam Full-time

    NTT DATA is the 6th-largest IT services company in the world, with more than 100,000 professionals and a turnover of more than 15 billion euros. We at NTT DATA Spain make the difference by being close to our clients, exceeding expectations, proactively managing our projects and customers, focusing on quality, and selecting employees with the right mindset to...