Omnilex Industry · Engineering

Data Scraping Engineer

CHF 100'000 – 120'000 / year
ZÜRICH
LLMs · RAG

Description

At Omnilex, we’re on a mission to transform the way lawyers work. Our AI-native platform lets legal professionals enhance their productivity in legal research and automate workflows. We collaborate closely with our clients and iterate at a market-leading pace.

You’ll be joining a young, passionate, and dynamic team of 15, with roots at ETH Zurich.

Are you excited about turning messy, multi-jurisdiction legal content into clean, structured, AI-ready data? Do you enjoy building reliable pipelines for extraction, normalization, chunking, citation handling, tagging, structuring, summarizing, and indexing, and then measuring their quality and cost? Do you thrive in a fast-paced startup where your work directly powers search, AI answer quality, and analytics?

Responsibilities

  • As a Data Engineer focused on AI data processing & integration, you will build and own the data flows that make our AI features accurate, explainable, and scalable
  • Design and maintain ingestion for legal sources (APIs, scraping, bulk data) across jurisdictions with strong reliability and compliance
  • Normalize and model heterogeneous sources into pragmatic, typed schemas (statutes, decisions, commentaries, citations, metadata)
  • Implement citation-aware chunking, sectioning, and cross-referencing so RAG is precise, traceable, and cost-efficient
  • Build enrichment pipelines for tagging, classification, summarization, embeddings, entity extraction, and graph relationships, using AI where it helps
  • Improve search quality via better indexing strategies, analyzers, synonyms, ranking, and relevance evaluation
  • Establish data quality, lineage, and observability (QA checks, coverage metrics, regression tests, versioning)
  • Optimize performance, runtime complexity, DB query times, token usage, and overall pipeline cost
  • Collaborate closely with users and customers to translate user problems and company requirements into robust data pipelines and SLAs
  • Communicate your work and findings to the team for continuous feedback and improvement (in English)

Qualifications

MINIMUM QUALIFICATIONS

  • Degree in Computer Science, Data Science, or a related field; or equivalent practical experience
  • Strong hands-on experience in data engineering with TypeScript
  • Solid grasp of data structures, algorithms, regexes, and SQL (PostgreSQL)
  • Experience using LLMs/embeddings for practical data tasks (chunking, tagging, summarization, RAG-ready pipelines)
  • Ability to learn quickly and adapt to a dynamic startup environment, with strong ownership and product mindset
  • Available full-time, on-site in Zurich at least two days per week (hybrid)

PREFERRED QUALIFICATIONS

  • You have a Swiss work permit or EU/EFTA citizenship
  • Working proficiency in German (much of our legal data is in German), plus proficiency in English
  • Experience with Azure (incl. Azure AI/Cognitive Search), Docker, and CI/CD
  • Familiarity with modern scraping/parsing stacks (Playwright/Puppeteer, PDF tooling, OCR)
  • Experience with vector indexing, relevance evaluation, and search ranking
  • Familiarity with our stack: Azure / NestJS / Next.js
  • Knowledge of and experience with legal systems, in particular those of Switzerland, Germany, and the USA

Benefits

  • Direct impact: your pipelines immediately improve search, answers, and user trust, transforming legal research
  • Autonomy & ownership: Own the full pipeline across ingestion, processing, enrichment, and indexing
  • Team: Professional growth at the intersection of legal, data, and AI with an interdisciplinary team
  • Compensation: CHF 8'000–12'000 per month + ESOP (employee stock options), depending on experience and skills