Careers · AI Lab

Senior ML Engineer

We are looking for:

A Senior ML Engineer who has already built LLM agents and RAG systems in production and can not only “assemble a pipeline” but also improve quality through hypotheses and experimentation. We need someone with a research-minded engineering mentality: quickly test ideas, turn them into working prototypes, measure results, and then drive them to stable production solutions. Independence and ownership of outcomes are crucial: “find a lever → prove it with metrics → deploy → monitor.”

In practice, this means you’ll be doing things like:

Designing quality evaluation for agents: defining metrics (task success, tool success, latency/cost, hallucination rate), building datasets, running offline/online evals, and setting up regression tests and alerts;
Determining whether a code agent can outperform the current tool-based agent: comparing approaches (tool-based vs. code-execution/codegen), defining “better,” running A/B tests or controlled rollouts, and evaluating quality/cost/risk;
Improving RAG quality by 30%: enhancing retrieval (chunking, query rewriting, hybrid search), reranking, context composition, dedup/anti-leak, grounding - and then proving gains on benchmarks and production metrics.

What matters to us:

You’re not afraid of ambiguous problems, where there’s no ready-made solution and you must define what to measure, how to test, and what success means.
You can balance speed and quality: experiment quickly while keeping reliability, observability, and reproducibility in mind.
You can write (and vibe-code) production-friendly code - the kind that doesn’t make product engineers reach for their revolver.

Requirements:

5+ years of overall software engineering experience;
2+ years of experience building products around LLMs / agents;
You have built, and shipped to real users, at least:

One RAG / search system with a full pipeline: retrieval → rerank → generation;
One agent (tool-use / multi-step / workflows);
You are comfortable with:
Asynchronous and concurrent Python: asyncio, threads;
LLM prompting: system/user prompts, few-shot, templates, context, instructions;
Modern LLM internals: transformers, training, inference, serving;
Recent models and their differences (quality / speed / context length / cost / multimodality, etc.);
MCP (Model Context Protocol);
ML methodology: train/val/test, metrics, basic evaluation principles;

Quality control for agent responses:

Monitoring, metrics, guardrails, regressions, alerts, human labeling / feedback loops;

Frameworks and approaches for agents:

fastmcp, mcp-use, OpenAI Agents SDK and equivalents;

Tokenization:

How tokenization works;
Modern tokenizers, their impact on context length / cost / limits;

RAG pipelines:

Components (ingest / chunking / embeddings / vector store / retrieval / rerank / context composition / generation);
Typical issues and solutions (hallucinations, poor retrieval, degradation, cold start, data drift, duplicates, latency);

Cursor and similar tools (Claude Code, Codex, Aider, etc.):

How to use code agents effectively in development.

Nice-to-haves:

Experience designing APIs: REST / gRPC / GraphQL;
Understanding of the HTTP protocol;
Experience with relational databases: PostgreSQL and similar;
Knowledge of distributed and vector stores: Weaviate, Cassandra, etc.;
Experience with Python API frameworks: FastAPI, Flask, and equivalents;
Familiarity with background task and workflow systems: Celery, Taskiq, Airflow, etc.;
Containerization and orchestration skills: Docker, Kubernetes, or Nomad;
Experience working with queues and brokers: Kafka, RabbitMQ, Redis, etc.

We offer:

Flexible schedule - you choose when to start your day;
Relocation to Bilbao, Spain for you and your family, with full support at every stage (documents, housing, adaptation, even pets);
Unlimited vacation - take time off when you really need it;
A culture of trust and respect for professionals, with no micromanagement;
A modern office in a cozy district, regular team events and off-sites;
Growth and participation in architectural decisions, challenging tasks, and a strong team you can learn from.
