ArtificialSeed
Careers · AI Lab

Senior ML Engineer

We are looking for:

A Senior ML Engineer who has already built LLM agents and RAG systems in production and who can not only “assemble a pipeline” but also improve quality through hypotheses and experimentation. We need someone with a research-minded engineering mentality: quickly test ideas, turn them into working prototypes, measure results, and then drive them to stable production solutions. Independence and ownership of outcomes are crucial: “found a lever → proved it with metrics → deployed → monitored.”

In practice, this means you’ll be doing things like:

  • Designing quality evaluation for agents: defining metrics (task success, tool success, latency/cost, hallucination rate), building datasets, running offline/online evals, and setting up regressions and alerts;
  • Determining whether a code agent can outperform the current tool-based agent: comparing approaches (tool-based vs. code-execution/codegen), defining “better,” running A/B tests or controlled rollouts, and evaluating quality/cost/risk;
  • Improving RAG quality by 30%: enhancing retrieval (chunking, query rewriting, hybrid search), reranking, context composition, dedup/anti-leak, grounding - and then proving gains on benchmarks and production metrics.

What matters to us:

  • You’re not afraid of ambiguous problems, where there’s no ready-made solution and you must define what to measure, how to test, and what success means.
  • You can balance speed and quality: experiment quickly while keeping reliability, observability, and reproducibility in mind.
  • You can write (and vibe-code) production-friendly code - the kind that doesn’t make product engineers reach for their revolver.

Requirements:

  1. 5+ years of overall software engineering experience;
  2. 2+ years of experience building products around LLMs / agents;
  3. You have built at least:
     • One RAG / search system with a full pipeline: retrieval → rerank → generation;
     • One agent (tool-use / multi-step / workflows);
     • These systems have real users;
  4. You are comfortable with:
     • Asynchronous Python: asyncio, threads;
     • LLM prompting: system/user prompts, few-shot, templates, context, instructions;
     • Modern LLM internals: transformers, training, inference, serving;
     • Recent models and their differences (quality / speed / context length / cost / multimodality, etc.);
     • MCP (Model Context Protocol);
     • ML methodology: train/val/test, metrics, basic evaluation principles;
  5. Quality control for agent responses:
     • Monitoring, metrics, guardrails, regressions, alerts, human labeling / feedback loops;
  6. Frameworks and approaches for agents:
     • fastmcp, mcp-use, OpenAI Agents SDK and equivalents;
  7. Tokenization:
     • How tokenization works;
     • Modern tokenizers, their impact on context length / cost / limits;
  8. RAG pipelines:
     • Components (ingest / chunking / embeddings / vector store / retrieval / rerank / context composition / generation);
     • Typical issues and solutions (hallucinations, poor retrieval, degradation, cold start, data drift, duplicates, latency);
  9. Cursor and similar tools (Claude Code, Codex, Aider, etc.):
     • How to use code agents effectively in development.

Nice-to-haves:

  • Experience designing APIs: REST / gRPC / GraphQL;
  • Understanding of the HTTP protocol;
  • Experience with relational databases: PostgreSQL and similar;
  • Knowledge of distributed and vector stores: Weaviate, Cassandra, etc.;
  • Experience with Python API frameworks: FastAPI, Flask, and equivalents;
  • Familiarity with background task systems: Celery, Taskiq, Airflow, etc.;
  • Containerization skills: Docker, Kubernetes, or Nomad;
  • Experience working with queues and brokers: Kafka, RabbitMQ, Redis, etc.

We offer:

  • Flexible schedule - you choose when to start your day;
  • Relocation to Bilbao, Spain for you and your family, with full support at every stage (documents, housing, adaptation, even pets);
  • Unlimited vacation - take time off when you really need it;
  • A culture of trust and respect for professionals, with no micromanagement;
  • A modern office in a cozy district, regular team events and off-sites;
  • Growth and participation in architectural decisions, challenging tasks, and a strong team you can learn from.

Apply for this role

Send us your details and we'll get back to you.