AI Evaluation Engineer
Posted 3 hours ago
The Role
We are seeking an exceptional AI Evaluation Engineer to design, implement, and scale frameworks for assessing the performance, reliability, and trustworthiness of advanced AI systems. This individual will be responsible for developing methodologies and tools to measure model quality across diverse dimensions, such as accuracy, robustness, reasoning, safety, and efficiency.
Key Responsibilities
- Design and Develop Evaluation Frameworks: Create scalable, reproducible evaluation pipelines for large-scale AI systems, including LLMs and multi-agent architectures, covering both automated and human-in-the-loop testing strategies.
- Metric Innovation: Define and implement novel evaluation metrics that capture model capabilities beyond traditional benchmarks.
- Benchmarking & Performance Analysis: Benchmark AI models across domains, tasks, and modalities, analyzing their skills and behavior under different setups.
- Safety, Reliability & Alignment Testing: Develop tools and experiments to probe model safety, robustness, interpretability, and bias.
- Cross-functional Collaboration: Work closely with model finetuning and optimization teams to evaluate end-to-end system effectiveness and efficiency, and identify trade-offs between model performance, latency, and energy footprint.
- Continuous Improvement & Reporting: Monitor model performance over time, automate regression detection, and contribute to the continuous evaluation infrastructure that supports Openchip's AI research and product roadmap.
Qualifications
- MSc or PhD in Computer Science, Artificial Intelligence, Machine Learning, Statistics, or a related field. A publication record in ML evaluation, benchmarking, or interpretability is a plus.
- 3+ years of experience developing, evaluating, or optimizing AI systems.
- Strong programming skills in Python, with experience using PyTorch, TensorFlow, or JAX.
- Experience in designing evaluation protocols for LLMs, multi-agent systems, or reinforcement learning environments.
- Deep understanding of ML metrics, evaluation methodologies, and statistical analysis.
- Experience with data quality, annotation workflows, and benchmark dataset creation is a plus.
- Fluent in English; proficiency in additional European languages (German, Dutch, Spanish, French, or Italian) is a plus.
Soft Skills
- Analytical Rigor: An evidence-driven mindset, with an interest in designing robust experiments that quantify and uncover complex AI behaviors and in translating empirical insights into new research directions.
- Collaboration & Communication: Excellent communication and collaboration skills in a multidisciplinary environment.
- Integrity & Responsibility: Committed to building AI systems that are not only powerful but also safe, reliable, and aligned with human values.
What We Offer
- The opportunity to build a cloud AI deployment platform that will power next-generation AI systems.
- A collaborative, innovation-driven environment with significant autonomy and ownership.
- Hybrid work model with flexible scheduling.
- A chance to join one of Europe's most ambitious companies at the intersection of AI and silicon engineering.
- Position based in Barcelona.
We're looking for exceptional engineers ready to shape the future of AI infrastructure. If building scalable, cloud-native AI deployment platforms excites you, we'd love to meet you.
At Openchip & Software Technologies S.L., we believe a diverse and inclusive team is the key to groundbreaking ideas. We foster a work environment where everyone feels valued, respected, and empowered to reach their full potential—regardless of race, gender, ethnicity, sexual orientation, or gender identity.