FrodoSwaggins
Creating Archival CDs / DVDs
I have a huge folder of documents in JPG, HTML, PDF, EPUB, and DOC / DOCX formats. Does anyone know of a tool that can scan the files and build a nice keyword-based index of them, preferably as HTML output?
It would be great if the tool pulled keywords and metadata from the individual files. A flat index would be fine; a series of index pages linked together would be even better. I'm trying to create an indexed archive, not unlike the old warez CDs of the days of yore.
Of course, open source always appreciated.
You might look at this first: https://github.com/henrystern/PDF-Metadata-to-CSV
For Word docs, you could use the python-docx library to pull out properties and do whatever you want with them.
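Here's a rough sketch of what that could look like. It assumes python-docx is installed (`pip install python-docx`); the helper names and the choice of fields are just mine for illustration. Note that python-docx only opens .docx files, so the older .doc files would need converting first or a different tool.

```python
# Fields to copy from a document's core properties (an illustrative choice).
FIELDS = ("title", "author", "subject", "keywords", "created", "modified")


def properties_to_dict(props):
    """Copy the selected attributes into a plain dict, skipping empty ones."""
    record = {}
    for name in FIELDS:
        value = getattr(props, name, None)
        if value:
            record[name] = str(value)
    return record


def read_docx_properties(path):
    """Open a .docx file with python-docx and return its core properties."""
    # Imported here so properties_to_dict() stays usable without python-docx.
    from docx import Document

    return properties_to_dict(Document(str(path)).core_properties)
```

Calling `read_docx_properties("some_file.docx")` should give you a dict you can write to CSV or feed into whatever builds the index. Plenty of documents have empty properties, so expect gaps.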
See: https://medium.com/@HeCanThink/python-do...65cf4b4cb9
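For the HTML side, I don't know of a ready-made tool, but a flat index is not much code with just the standard library. This is a minimal sketch, not a finished indexer: the function names, the extension list, and the `index.html` output name are all assumptions of mine, and it only lists file paths. Keywords or metadata pulled by other tools would have to be added to each list item yourself.

```python
import html
from pathlib import Path
from urllib.parse import quote

# File types from the original question; adjust as needed.
EXTENSIONS = {".jpg", ".html", ".pdf", ".epub", ".doc", ".docx"}


def collect_files(root):
    """Return sorted paths, relative to root, of files with a known extension."""
    root = Path(root)
    return sorted(
        p.relative_to(root)
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in EXTENSIONS
    )


def render_index(paths, title="Archive index"):
    """Build one flat HTML page with a relative link to each file."""
    rows = []
    for rel in paths:
        name = rel.as_posix()
        # Percent-encode the link target, HTML-escape the visible text.
        rows.append('<li><a href="%s">%s</a></li>' % (quote(name), html.escape(name)))
    safe_title = html.escape(title)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>' + safe_title + "</title></head>\n"
        "<body><h1>" + safe_title + "</h1>\n<ul>\n"
        + "\n".join(rows)
        + "\n</ul></body></html>\n"
    )


def write_index(root, output_name="index.html"):
    """Write the index into root and return the list of indexed paths."""
    root = Path(root)
    # Leave out the index page itself so reruns do not list it.
    paths = [p for p in collect_files(root) if p.as_posix() != output_name]
    (root / output_name).write_text(render_index(paths), encoding="utf-8")
    return paths
```

Because the links are relative, the page should keep working after the whole folder is burned to disc. For linked index pages, one option would be to call `render_index` once per subfolder or per keyword and link those pages from a top-level one.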