@yarlisai/file-parsers
Extension-routed document parsing — PDF, CSV, DOCX, XLSX, PPTX, HTML, JSON, YAML, Markdown and plain text, with lazy-loaded optional vendor adapters.
Install
npm install @yarlisai/file-parsers
Vendor libraries are optional peer dependencies: install only the ones for the formats you parse (pdf-parse, csv-parse, mammoth, officeparser, exceljs, cheerio, js-yaml). Each adapter loads its vendor library on the first parse of that format; JSON, Markdown and plain text need no vendor at all.
Source: packages/file-parsers · npm · CHANGELOG
Why
@yarlisai/file-parsers follows the port/adapter contract: consumers depend on a port (the FileParser interface) and the createParserRegistry() factory routes a file extension to the right adapter at runtime. Adding a format is one new adapter file plus one registry entry.
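The routing idea described above can be sketched in a few lines. This is a minimal illustration of an extension-keyed registry with lazily loaded adapters, not the package's actual implementation: `SketchParser`, `AdapterLoader` and `createSketchRegistry` are hypothetical names, and the real `FileParser` interface and `createParserRegistry()` signature may differ.

```typescript
// Hypothetical port: the real FileParser interface may have a different shape.
interface SketchParser {
  parse(content: string): Promise<string>
}

// A loader defers adapter construction until the first parse of that format.
// A real adapter would typically do a dynamic import of its vendor library here.
type AdapterLoader = () => Promise<SketchParser>

function createSketchRegistry(loaders: Record<string, AdapterLoader>) {
  // Cache the loading promise so each adapter is created at most once.
  const cache = new Map<string, Promise<SketchParser>>()

  // Accept 'json', 'JSON' and '.json' alike.
  const normalize = (ext: string): string => ext.toLowerCase().replace(/^\./, '')
  const has = (key: string): boolean =>
    Object.prototype.hasOwnProperty.call(loaders, key)

  return {
    supports(ext: string): boolean {
      return has(normalize(ext))
    },
    async parse(content: string, ext: string): Promise<string> {
      const key = normalize(ext)
      if (!has(key)) {
        throw new Error(`Unsupported file type: ${key}`)
      }
      let adapter = cache.get(key)
      if (!adapter) {
        adapter = loaders[key]()
        cache.set(key, adapter)
      }
      return (await adapter).parse(content)
    },
  }
}

// Adding a format is one adapter plus one registry entry.
let jsonLoads = 0
const registry = createSketchRegistry({
  json: async () => {
    jsonLoads += 1 // counts how often the adapter is constructed
    return {
      parse: async (content: string) => JSON.stringify(JSON.parse(content), null, 2),
    }
  },
  txt: async () => ({ parse: async (content: string) => content }),
})
```

In this sketch, consumers only ever call `registry.parse(...)`, so swapping the JSON adapter for another implementation touches a single registry entry and no calling code.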
Usage
import { isSupportedFileType, parseBuffer, parseFile } from '@yarlisai/file-parsers'

// `buffer` is assumed to be a Buffer already held in memory (for example, an uploaded CSV).
if (isSupportedFileType('pdf')) {
  const fromDisk = await parseFile('/tmp/report.pdf')
  const fromMemory = await parseBuffer(buffer, 'csv')
}

The package's README ships a complete quickstart. mybotbox-platform itself is the reference consumer: apps/sat/lib/file-parsers/ is a thin shim re-exporting this package for the file-parse API route and the knowledge-base document processor.