Data Scraping Engineer
Description
At Omnilex, we’re on a mission to transform the way lawyers work. Our AI-native platform lets legal professionals enhance their productivity in legal research and automate workflows. We collaborate closely with our clients and iterate at a market-leading pace.
You’ll be joining a young, passionate, and dynamic team of 15, with roots at ETH Zurich.
Are you excited about turning messy, multi-jurisdiction legal content into clean, structured, AI-ready data? Do you enjoy building reliable pipelines for extraction, normalization, chunking, citation handling, tagging, structuring, summarizing, and indexing, and then measuring their quality and cost? Do you thrive in a fast-paced startup where your work directly powers search, AI answer quality, and analytics?
Responsibilities
- As a Data Engineer focused on AI data processing and integration, you will build and own the data flows that make our AI features accurate, explainable, and scalable
- Design and maintain ingestion for legal sources (APIs, scraping, bulk data) across jurisdictions with strong reliability and compliance
- Normalize and model heterogeneous sources into pragmatic, typed schemas (statutes, decisions, commentaries, citations, metadata)
- Implement citation-aware chunking, sectioning, and cross-referencing so RAG is precise, traceable, and cost-efficient
- Build enrichment pipelines for tagging, classification, summarization, embeddings, entity extraction, and graph relationships, using AI where it helps
- Improve search quality via better indexing strategies, analyzers, synonyms, ranking, and relevance evaluation
- Establish data quality, lineage, and observability (QA checks, coverage metrics, regression tests, versioning)
- Optimize performance, runtime complexity, DB query times, token usage, and overall pipeline cost
- Collaborate closely with users and customers to translate user problems and company requirements into robust data pipelines and SLAs
- Communicate your work and findings to the team for continuous feedback and improvement (in English)
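To give a flavor of the chunking and citation-handling work described above, here is a minimal, hypothetical TypeScript sketch (all names and the article-heading format are illustrative assumptions, not our actual pipeline): it splits a statute into article-level chunks and keeps each chunk's citation attached as metadata, so passages retrieved for RAG stay traceable to their source.

```typescript
// Hypothetical sketch, not Omnilex's actual pipeline: split a statute into
// article-level chunks and keep each chunk's citation attached as metadata,
// so passages retrieved for RAG can be traced back to their source article.

interface Chunk {
  citation: string; // e.g. "Art. 1"
  text: string;
}

// Split before each article heading like "Art. 1", "Art. 2", ...
function chunkByArticle(statute: string): Chunk[] {
  return statute
    .split(/(?=Art\.\s*\d+)/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const heading = part.match(/^Art\.\s*\d+/);
      return { citation: heading ? heading[0] : "unknown", text: part };
    });
}

const statute =
  "Art. 1 Every person must act in good faith. " +
  "Art. 2 The law applies to all legal relationships.";

const chunks = chunkByArticle(statute);
console.log(chunks.map((c) => c.citation)); // ["Art. 1", "Art. 2"]
```

In production, chunks like these would carry richer metadata (jurisdiction, source URL, version) and feed the embedding and indexing stages.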
Qualifications
MINIMUM QUALIFICATIONS
- Degree in Computer Science, Data Science, or a related field; or equivalent practical experience
- Strong hands-on experience in data engineering with TypeScript
- Solid grasp of data structures, algorithms, regexes, and SQL (PostgreSQL)
- Experience using LLMs/embeddings for practical data tasks (chunking, tagging, summarization, RAG-ready pipelines)
- Ability to learn quickly and adapt to a dynamic startup environment, with strong ownership and product mindset
- Available full-time; on-site in Zurich at least two days per week (hybrid)
PREFERRED QUALIFICATIONS
- You have a Swiss work permit or EU/EFTA citizenship
- Working proficiency in German (much of our legal data is in German) and proficiency in English
- Experience with Azure (incl. Azure AI/Cognitive Search), Docker, and CI/CD
- Familiarity with modern scraping/parsing stacks (Playwright/Puppeteer, PDF tooling, OCR)
- Experience with vector indexing, relevance evaluation, and search ranking
- Familiarity with our stack: Azure / NestJS / Next.js
- Knowledge of and experience with legal systems, in particular those of Switzerland, Germany, and the USA
Benefits
- Direct impact: your pipelines immediately improve search, answers, and user trust, transforming legal research
- Autonomy & ownership: Own the full pipeline across ingestion, processing, enrichment, and indexing
- Team: Professional growth at the intersection of legal, data, and AI with an interdisciplinary team
- Compensation: CHF 8’000–12’000 per month + ESOP (employee stock options), depending on experience and skills