NEW Native Support for Conversational Data: 4 Use Cases to Ship Better Chatbots

PDF

The Pdf tag displays a PDF document in the labeling interface. You can use this tag to:

  • Perform document-level annotations such as classification, transcription, and summarization.
  • Perform OCR validation on supported PDFs.

Supports:

  • Zoom
  • Rotation
  • PDFs up to 100 pages

Use with the following data types: PDF.

Enterprise

You can also use the PDF tag with Prompts to perform auto-labeling work such as PDF summarization, classification, information extraction, and document intelligence.

Parameters

Param Type Description
value string Data field value containing the URL to the PDF

Example: PDF classification

Labeling configuration apply document-level classification to PDF documents:

<View>
  <Pdf name="pdf" value="$pdf" />
  <Choices name="choices" toName="pdf">
    <Choice value="Legal" />
    <Choice value="Financial" />
    <Choice value="Technical" />
  </Choices>
</View>

Example Input data:

{
  "pdf": "https://app.humansignal.com/static/samples/opossum-cuteness.pdf"
}

Example: OCR

Enterprise

Label Studio Enterprise only.

For Community and Starter Cloud users who want to apply labels for OCR tasks, you will need to convert the PDF into images first and then use a labeling configuration similar to the Multi-Page Document Annotation template.

Labeling configuration for PDFs:

<View>
  <OcrLabels name="ocr" toName="pdf">
    <Label value="Typo"/>
    <Label value="Incorrect amount"/>
    <Label value="Incorrect name"/>
  </OcrLabels>

  <Pdf name="pdf" value="$pdf"/>
</View>

Example Input data:

{
  "pdf": "https://app.humansignal.com/static/samples/opossum-cuteness.pdf"
}

OcrLabels

This tag adds bounding boxes to the PDF and allows you to assign labels to them.

This tag must have one or more Label tag children, and supports standard parameters such as maxUsages (see RectangleLabels as an example).

Supported PDFs

PDFs that work best with the new OCR labeling are those that already contain a selectable text layer (text overlay).

In these PDFs, when you draw a bounding box, the tool can read and highlight the underlying text from that layer (see the video below).

Image-only PDFs such as scans or phone photos without a text layer won’t return text. For those, you may need to use an external OCR tool to add a text layer first. If a PDF’s text layer is misaligned or low quality, captured text may be incomplete or incorrect, and this feature can help you audit and improve those overlays.

Results:

Result Type Description
x, y, width, height Number Numbers from 0 to 1 that are relative to the page dimensions.
rotation Number Number in degrees clockwise from 0–360. Rotation is calculated with the origin at (x, y) (the top-left corner of the region).
pageIndex Number Page number, 1-based.
ocrtext String Captured text. This can be edited by selecting the region and then editing the text from the Info panel.

note

When you are rotating within the UI, it appears to originate from the center of the region. However, we store the origin as (x, y) - meaning in the top left corner of the region.