OCR Labeling for PDFs
Use this template to perform region-level OCR directly on native PDFs.
The Pdf tag renders multi-page documents (up to 100 pages) with zoom and rotation, while OcrLabels lets you draw bounding boxes, assign labels, and capture editable OCR text per region.
Each region stores normalized coordinates, rotation, and a page index, making outputs reliable for downstream extraction tasks.
Ideal for document intelligence, QA on OCR output, and structured data capture workflows.
Enterprise
This template can only be used in Label Studio Enterprise.

Labeling Configuration
<View>
  <Header value="Select text to correct" size="4"/>
  <OcrLabels name="ocr" toName="pdf">
    <Label value="Typo" />
    <Label value="Incorrect Amount" />
    <Label value="Incorrect Name" />
  </OcrLabels>
  <Pdf name="pdf" value="$pdf"/>
</View>
<!-- {
  "data": {
    "pdf": "/static/samples/opossum-cuteness.pdf"
  }
} -->
About the labeling configuration
Pdf
This displays your PDF natively in Label Studio, allowing you to zoom in and rotate as needed.
Supports PDFs up to 100 pages.
OcrLabels
Used only with the Pdf tag. It allows you to draw bounding boxes around text. Note that the PDF must have a text overlay for this to work (to verify, check whether you can highlight text in the PDF using your cursor). Select the text under the Regions panel to correct it.
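Because OcrLabels only works when its toName points at a Pdf tag, it can help to check a configuration before saving it. The following is a minimal sketch using only the Python standard library; the function name check_ocr_targets is hypothetical and not part of Label Studio, and the check covers only the toName/name pairing described above.

```python
import xml.etree.ElementTree as ET

# The labeling configuration from this template.
CONFIG = """
<View>
  <Header value="Select text to correct" size="4"/>
  <OcrLabels name="ocr" toName="pdf">
    <Label value="Typo" />
    <Label value="Incorrect Amount" />
    <Label value="Incorrect Name" />
  </OcrLabels>
  <Pdf name="pdf" value="$pdf"/>
</View>
"""


def check_ocr_targets(config_xml):
    """Hypothetical helper: list every OcrLabels tag whose toName is not a Pdf tag.

    An empty list means each OcrLabels tag targets a Pdf tag by name.
    """
    root = ET.fromstring(config_xml)
    # Collect the name attribute of every Pdf tag in the config.
    pdf_names = {el.get("name") for el in root.iter("Pdf")}
    problems = []
    for ocr in root.iter("OcrLabels"):
        target = ocr.get("toName")
        if target not in pdf_names:
            problems.append(
                f"OcrLabels '{ocr.get('name')}' targets '{target}', "
                "which is not a Pdf tag"
            )
    return problems


print(check_ocr_targets(CONFIG))
```

An empty list indicates that the pairing is consistent. The sketch does not verify that the PDF itself has a text overlay; that still needs to be checked in a PDF viewer.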

Input data
{
  "data": {
    "pdf": "/static/samples/opossum-cuteness.pdf"
  }
}
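When several PDFs need labeling, tasks in this shape can be generated rather than written by hand. The sketch below assumes that a JSON list of such objects is an acceptable import file; the helper name build_tasks is hypothetical. The key under "data" must match the variable used in the Pdf tag's value attribute ($pdf in this template).

```python
import json


def build_tasks(pdf_urls, field="pdf"):
    """Hypothetical helper: wrap each PDF URL in the task shape shown above.

    `field` must match the variable referenced by the Pdf tag ("$pdf" -> "pdf").
    """
    return [{"data": {field: url}} for url in pdf_urls]


# The sample path from this template; replace with your own PDF locations.
tasks = build_tasks(["/static/samples/opossum-cuteness.pdf"])
print(json.dumps(tasks, indent=2))
```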