Pulse

Extract text from documents using Pulse OCR

Overview

Integrate Pulse into your workflow to extract text from PDF documents, images, and Office files. Documents can be supplied by URL or uploaded directly, and chunking and layout analysis are optional.

Setup

Add the Pulse block to your workflow
Enter your Pulse API key
Upload a document or provide a URL

Configuration

Parameter	Type	Required	Description
`apiKey`	string	Yes	Pulse API key
`document`	file/URL	Yes	Document to extract text from (PDF, JPEG or PNG images, DOCX, PPTX, XLSX; max 50MB)
`pages`	string	No	Page range (e.g., `1-3,5`)
`chunking`	string	No	Chunking strategy: `semantic`, `header`, `page`, or `recursive`
`chunkSize`	number	No	Max characters per chunk
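
As a rough illustration of the parameters above, the following Python sketch checks a configuration before it is handed to the Pulse block. The helper names `parse_page_range` and `build_pulse_params` are hypothetical, and the page-range grammar is inferred only from the `1-3,5` example; the block itself may accept or reject other forms.

```python
# Hypothetical client-side helpers; not part of Pulse or the workflow block.
CHUNKING_STRATEGIES = {"semantic", "header", "page", "recursive"}


def parse_page_range(spec: str) -> list[int]:
    """Expand a spec such as "1-3,5" into sorted, unique 1-based page numbers.

    Assumption: segments are comma-separated and are either a single page
    or an inclusive "start-end" range, as the documented example suggests.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)  # raises ValueError on empty or non-numeric text
        if start < 1 or end < start:
            raise ValueError(f"invalid page segment {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


def build_pulse_params(api_key, document, pages=None, chunking=None, chunk_size=None):
    """Return a dict keyed by the documented parameter names, or raise ValueError."""
    if not api_key:
        raise ValueError("apiKey is required")
    if not document:
        raise ValueError("document is required")
    params = {"apiKey": api_key, "document": document}
    if pages is not None:
        parse_page_range(pages)  # validate only; the original string is passed through
        params["pages"] = pages
    if chunking is not None:
        if chunking not in CHUNKING_STRATEGIES:
            raise ValueError(f"unknown chunking strategy {chunking!r}")
        params["chunking"] = chunking
    if chunk_size is not None:
        if chunk_size <= 0:
            raise ValueError("chunkSize must be positive")
        params["chunkSize"] = chunk_size
    return params
```

Validating locally like this only catches obvious mistakes early; the block remains the authority on what it accepts.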

Tools

`pulse_parser`

Extracts text and structure from documents using Pulse OCR.

Output

Field	Type	Description
`markdown`	string	Extracted content in markdown format
`page_count`	number	Number of pages
`job_id`	string	Unique job identifier
`bounding_boxes`	json	Bounding box layout information
`html`	string	HTML content (only if requested)
`structured_output`	json	Structured output (only if a schema is provided)
`chunks`	json	Chunked content (only if chunking is enabled)
`figures`	json	Extracted figures (only if enabled)
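
Because several output fields are conditional, downstream steps may want to check which ones are present before using them. The sketch below is one way to do that in Python. It assumes the output arrives as a plain dictionary keyed by the field names in the table, and that fields which were not requested are either missing or null; neither assumption is confirmed by this page, and `summarize_output` is a hypothetical helper.

```python
from typing import Any

# Fields the table describes as conditional on a request option.
OPTIONAL_FIELDS = ("bounding_boxes", "html", "structured_output", "chunks", "figures")


def summarize_output(output: dict[str, Any]) -> dict[str, Any]:
    """Summarize a pulse_parser result without assuming optional fields exist."""
    markdown = output.get("markdown") or ""
    present = [name for name in OPTIONAL_FIELDS if output.get(name) is not None]
    return {
        "job_id": output.get("job_id"),
        "page_count": output.get("page_count"),
        "markdown_chars": len(markdown),
        "optional_fields_present": present,
    }


# Made-up sample data, for illustration only.
sample = {
    "markdown": "# Title\n\nBody text.",
    "page_count": 2,
    "job_id": "job-123",
    "chunks": [{"text": "Body text."}],
    "html": None,
}
summary = summarize_output(sample)
```

The internal shape of `chunks`, `figures`, and `bounding_boxes` is not specified here, so the sketch only tests for their presence and does not read into them.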

Supported Formats

PDF documents
Images (JPEG, PNG)
Microsoft Office files (DOCX, PPTX, XLSX)
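
A pre-flight check along these lines can reject unsupported files before they reach the block. This is a minimal sketch in Python: `check_document` is a hypothetical helper, the extension list is an assumed mapping of the formats above (for example, both `.jpg` and `.jpeg` for JPEG), and the 50MB limit is interpreted as 50 × 1024 × 1024 bytes, which may not match how Pulse measures it.

```python
from pathlib import Path

# Assumed file extensions for the documented formats.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".docx", ".pptx", ".xlsx"}
# Assumption: "50MB" is read as 50 MiB.
MAX_BYTES = 50 * 1024 * 1024


def check_document(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(name).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {suffix or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file is {size_bytes} bytes, above the {MAX_BYTES}-byte limit")
    return problems
```

Checking by extension is only a heuristic, since a renamed file would pass it; the block may inspect the actual content.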

Notes

Category: tools
Type: pulse
Maximum file size: 50MB
