0.0.0-20240804143356-fe57a0d73567
Repository: https://github.com/heussd/pdftotext-go.git
Documentation: pkg.go.dev
# README
pdftotext-go
Extract texts with their corresponding page numbers from PDF files.
Wraps the command-line tool pdftotext (poppler-utils).
Usage
- poppler-utils (version >= 22.05.0) must be installed and available on the PATH.
- Install the package: go get github.com/heussd/pdftotext-go
- See tests for code examples.
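The version requirement above can be checked before calling the library. The following is a minimal sketch, not part of this package: the helper name atLeast is hypothetical, and it assumes that pdftotext -v prints a banner containing a dotted version such as 22.05.0.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"regexp"
	"strconv"
)

// versionRe matches a dotted three-part version such as "22.05.0".
var versionRe = regexp.MustCompile(`(\d+)\.(\d+)\.(\d+)`)

// atLeast (hypothetical helper) reports whether the first version found in
// banner is greater than or equal to want (major, minor, patch).
func atLeast(banner string, want [3]int) (bool, error) {
	m := versionRe.FindStringSubmatch(banner)
	if m == nil {
		return false, fmt.Errorf("no version found in %q", banner)
	}
	for i := 0; i < 3; i++ {
		n, err := strconv.Atoi(m[i+1])
		if err != nil {
			return false, err
		}
		if n != want[i] {
			return n > want[i], nil
		}
	}
	return true, nil
}

func main() {
	// Assumption: "pdftotext -v" prints its version banner. CombinedOutput
	// captures both stdout and stderr, so either stream works.
	out, err := exec.Command("pdftotext", "-v").CombinedOutput()
	if err != nil && len(out) == 0 {
		fmt.Fprintln(os.Stderr, "pdftotext not found on the PATH:", err)
		os.Exit(1)
	}
	ok, err := atLeast(string(out), [3]int{22, 5, 0})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if !ok {
		fmt.Fprintln(os.Stderr, "poppler-utils >= 22.05.0 is required")
		os.Exit(1)
	}
	fmt.Println("pdftotext version is recent enough")
}
```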
Why poppler version >=22.05.0
Version 22.05.0 of poppler introduced a new parameter, -tsv, which extracts PDF content together with metadata as TSV. This library depends on that output, so older versions of poppler will not work.
Thanks to
- amitaifrey for finding and fixing a bug