TenderCodes.eu

CPV 72312100

Data preparation services

Data preparation services cover the work of getting raw data into a usable, structured state before anyone analyses or loads it: cleaning, formatting, converting, labelling, deduplicating, and otherwise shaping source material so a downstream system can consume it. Under CPV code 72312100, a contracting authority is buying that readiness step, not the data entry that captures the records and not the analysis that comes after.
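To make the readiness step concrete, here is a minimal sketch of the kind of work the definition describes: trimming, normalising and deduplicating records. The field names (`name`, `city`) and the sample rows are hypothetical, chosen only for illustration; real contracts under this code work on much larger and more varied inputs.

```python
def prepare_records(raw_rows):
    """Clean and deduplicate raw records (illustrative field names only)."""
    seen = set()
    prepared = []
    for row in raw_rows:
        # Collapse runs of whitespace and strip leading/trailing spaces.
        name = " ".join(row.get("name", "").split())
        city = " ".join(row.get("city", "").split()).title()
        if not name:
            # Drop records that are unusable without a name.
            continue
        # Case-insensitive key so "anna kowalska" matches "Anna Kowalska".
        key = (name.casefold(), city.casefold())
        if key in seen:
            continue
        seen.add(key)
        prepared.append({"name": name, "city": city})
    return prepared


# Hypothetical raw input: inconsistent spacing, casing, one empty record.
raw = [
    {"name": "  Anna  Kowalska ", "city": "gdańsk"},
    {"name": "anna kowalska", "city": "Gdańsk "},
    {"name": "", "city": "Vilnius"},
    {"name": "Jonas Petrauskas", "city": "VILNIUS"},
]
prepared = prepare_records(raw)
```

The output is the deliverable in miniature: two clean, structured records that a downstream system can load, with the duplicate and the empty row removed.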

The code is a leaf under Data entry services (72312000), one step down from the data-processing family. It is a moderate-frequency code: 110 awards carry it (TED 2009-2026), at a mean of about €1.0M across the 49 awards with disclosed values. Contracting authorities in Poland, Germany and Lithuania place the most awards.
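The figures above can be read together with a little arithmetic. The sketch below uses only the numbers stated in the text; because the mean is quoted as "about €1.0M", the implied total is a rough approximation, not a reported figure.

```python
total_awards = 110          # awards carrying the code (TED 2009-2026)
disclosed_awards = 49       # awards with a disclosed value
mean_value_eur = 1.0e6      # "about €1.0M", a rounded figure from the text

# Share of awards that disclose a value at all.
disclosed_share = disclosed_awards / total_awards

# Approximate total across disclosed awards (inherits the rounding of the mean).
implied_total_eur = disclosed_awards * mean_value_eur

print(f"Disclosed share: {disclosed_share:.1%}")
print(f"Implied disclosed total: about EUR {implied_total_eur / 1e6:.0f}M")
```

Fewer than half of the awards disclose a value, so the mean describes the disclosed subset only and says nothing certain about the other 61 awards.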

In practice the scope is broad because the inputs vary. Some contracts prepare geospatial and cartographic source material; others assemble and annotate text or speech collections for later use; others convert and reformat document sets. What ties them together is the deliverable: cleaned, structured, ready-to-use data, not the system that will eventually process it.

Public search interest in the bare term is low (volume around 30 for the exact phrase), which fits its specialist role. Most contracting authorities reach 72312100 through a parent category or an existing framework rather than searching for it directly.

When to use this code

For buyers (contracting authorities)

Pick 72312100 when the accepted deliverable is prepared data: cleaned, converted, structured or labelled source material that something downstream will then ingest. The cleanest test is the output. If what gets signed off is a ready-to-load dataset rather than captured keystrokes or an analytical result, this is the code.

The boundary that trips contracting authorities is the parent, Data entry services (72312000). Entry is the act of getting records into a system in the first place; preparation is the shaping of data that already exists into a usable form. Where a single contract does both, tag the primary CPV code by whichever dominates the statement of work.

Three neighbours matter when scope drifts. Optical character recognition services (72312200) is the sibling to reach for when machine-reading scanned text is the main job. Data capture services (72313000) covers acquiring the data rather than preparing it. Content or data standardisation and classification services (72330000) is the better fit when the deliverable is a standardised classification scheme rather than a prepared dataset.
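The output test and the "whichever dominates" rule can be sketched as a small lookup. This is an illustration, not an official classification tool: the deliverable labels and the idea of expressing the statement of work as effort shares are assumptions made for the example, and only the CPV codes come from the text.

```python
# Hypothetical deliverable labels mapped to the CPV codes named in the text.
CPV_BY_DELIVERABLE = {
    "prepared_dataset": "72312100",       # Data preparation services
    "entered_records": "72312000",        # Data entry services (parent)
    "ocr_text": "72312200",               # Optical character recognition services
    "captured_data": "72313000",          # Data capture services
    "classification_scheme": "72330000",  # Standardisation and classification
}


def suggest_primary_cpv(effort_share):
    """Return the CPV code for the deliverable that dominates the work.

    effort_share maps a deliverable label to its share of the statement
    of work (any non-negative weights; they need not sum to 1).
    """
    if not effort_share:
        raise ValueError("statement of work has no deliverables")
    unknown = set(effort_share) - set(CPV_BY_DELIVERABLE)
    if unknown:
        raise ValueError(f"unknown deliverable labels: {sorted(unknown)}")
    # Pick the label with the largest share; ties go to the first listed.
    dominant = max(effort_share, key=effort_share.get)
    return CPV_BY_DELIVERABLE[dominant]


# A mixed contract: some entry work, but mostly preparation.
mixed_contract = {"entered_records": 0.3, "prepared_dataset": 0.7}
primary = suggest_primary_cpv(mixed_contract)
```

For the mixed contract above, preparation dominates, so the sketch returns 72312100. A real tagging decision still rests on reading the statement of work, and a near-even split deserves a human judgment rather than a tie-break.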

For suppliers bidding

Winners under 72312100 are data-handling specialists rather than software builders: firms that clean, convert, annotate and structure source material at volume, often with a domain slant. The work splits along that domain line. Geospatial and cartographic preparation rewards survey and GIS credentials; text and speech preparation rewards annotation pipelines and linguistic quality control. Bid where your domain actually sits, because a generalist pitch reads thin against a specialist.

Quality assurance is the signal that carries here. Throughput numbers, error rates, and a documented validation method weigh more than headline price, because prepared data that is wrong propagates into everything downstream. If you bid, lead with the QA method and a comparable reference dataset.
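One simple way to put a number behind a validation method is an error rate measured on a checked sample. The sketch below assumes a reviewer has checked a sample of prepared records and counted the defective ones. The sample size, error count and 1% acceptance threshold are hypothetical values for illustration, not figures from any tender.

```python
def sample_error_rate(checked, errors):
    """Fraction of checked records found to be wrong."""
    if checked <= 0:
        raise ValueError("checked must be positive")
    if not 0 <= errors <= checked:
        raise ValueError("errors must be between 0 and checked")
    return errors / checked


def passes_acceptance(checked, errors, max_error_rate):
    """True when the observed sample error rate is within the agreed limit."""
    return sample_error_rate(checked, errors) <= max_error_rate


# Hypothetical QA sample: 2,000 records reviewed, 14 found defective,
# measured against an assumed 1% acceptance threshold.
rate = sample_error_rate(2000, 14)
accepted = passes_acceptance(2000, 14, max_error_rate=0.01)
print(f"Observed error rate: {rate:.2%}, accepted: {accepted}")
```

A bid that states how the sample is drawn, who checks it, and what threshold triggers rework gives the evaluator something concrete to compare across suppliers.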

The activity is geographically concentrated: Poland, Germany and Lithuania account for most of the historical awards (TED 2009-2026), so a credible footprint in those markets helps. Suppliers active under the data-entry parent (72312000) and OCR sibling (72312200) bid here too, and the work often arrives bundled inside a larger digitisation or analytics programme rather than as a standalone line. It is a quieter code than the processing codes next door, but a steady one for a firm with the right domain depth.

Commonly confused with

72312000
The parent covers entering records into a system; 72312100 shapes data that already exists into a usable, structured form.
72312200
OCR machine-reads scanned text into characters; data preparation is the broader cleaning and structuring of data for downstream use.
72313000
Data capture acquires the data; data preparation takes already-captured data and gets it ready to use.
72330000
That code delivers a standardisation or classification scheme; 72312100 delivers prepared data, not the classification rules themselves.

Example award titles

  • Rahmenvereinbarung VHR Bilddatenaufbereitung (Orthorekitifizierung) (DE)
  • Energieforschungserhebungen (AT)
  • Tekstynų anotavimo paslaugos (LT)
  • Cyfryzacja Powiatowego Zasobu Geodezyjno-Kartograficznego w ramach projektu „Projekt zintegrowanej informacji geodezyjno-kartograficznej Powiatu Giżyckiego” (PL)
  • Daugiakalbių ir vienkalbių tekstynų sukūrimo paslaugos (LT)
  • Administracinių tekstų tekstyno anotavimas (LT)

Hierarchy

Parent: 72312000 (Data entry services)
Sibling: 72312200 (Optical character recognition services)

Documented June 2026 · AI-augmented body, human-reviewed before publication.