pdftotext does not support -htmlmeta anymore (chapter 8 metadata extraction)

Hi,

This issue concerns chapter 8 of your tutorial.
It seems that pdftotext no longer supports the `-htmlmeta` flag, so the metadata extraction described in the tutorial fails with the latest xpdf version, 4.0.0 ([see the current documentation for pdftotext](https://www.xpdfreader.com/pdftotext-man.html)).
According to my tests, the same information is still accessible, but you have to use the `pdfinfo` command to get it.
Another point regarding that notebook: the [Python library reference](https://docs.python.org/3.6/library/subprocess.html) recommends using `subprocess.run` for all use cases it can handle. Is there a specific reason why you're using `subprocess.call` in the tutorial instead?
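
To illustrate both points, here is a minimal sketch of how the metadata could be read with `pdfinfo` through `subprocess.run`. It assumes that `pdfinfo` is on the PATH and prints one `Key: value` pair per line; the function names `parse_pdfinfo_output` and `extract_metadata` are my own and do not come from the tutorial.

```python
import subprocess


def parse_pdfinfo_output(text):
    """Turn pdfinfo's "Key: value" lines into a dict (hypothetical helper)."""
    metadata = {}
    for line in text.splitlines():
        # Split on the first colon only, because values such as dates
        # contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata


def extract_metadata(pdf_path):
    """Run pdfinfo on pdf_path and return its fields as a dict.

    Assumes the pdfinfo executable is installed and on the PATH.
    """
    result = subprocess.run(
        ["pdfinfo", pdf_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode the output as text
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return parse_pdfinfo_output(result.stdout)
```

Calling `extract_metadata("example.pdf")` (a placeholder file name) should then return a dict with whatever fields `pdfinfo` reports for that file. Unlike `subprocess.call`, which only returns the exit code, `subprocess.run` returns a `CompletedProcess` object that also carries the captured output.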

Thanks for these great tutorials. 
Best regards,

Florian

