Ask HN: Python package to analyze ODT or DOCX styles?
2 points by vfulco on Oct 5, 2016 | 1 comment
There are so many packages floating around; I have loosely looked through the last 10 years' worth of code snippets. As a first pass, I want to gather all the styles, formatting, fonts, and font sizes in an ODT/DOCX file and output them to a text file, as a quick check on the resumes I am writing for my professional editing service.


I'm not sure Python is the wrong tool for that particular job, but I'm not certain it's the optimal one either. To me, the starting point for dealing with DOCX is the Office SDK [1], and my impression is that Python is not a supported language. IronPython is a possibility, but it's not really mainstream, and I'm not sure how portable the code would be to the ODT side of the equation. Aside from Microsoft's main .NET languages, PowerShell is an option for quick hacking on Office documents.

Good luck.

[1] https://msdn.microsoft.com/en-us/library/office/bb448854.asp...
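For the first pass described in the question, one option that needs no SDK is to treat the DOCX as what it is on disk: a ZIP archive whose main body lives in word/document.xml. The sketch below uses only the Python standard library and counts the paragraph styles, fonts, and font sizes that appear in that file. The function names (summarize_docx_formatting, write_report) and the report layout are made up for illustration. It only sees formatting written directly into document.xml; it does not resolve fonts or sizes inherited from word/styles.xml, and ODT (also a ZIP, with content.xml and styles.xml) would need its own element names.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used by elements and attributes in document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _w(name):
    """Return the namespace-qualified form of a w: element or attribute name."""
    return "{%s}%s" % (W_NS, name)


def summarize_docx_formatting(path):
    """Count paragraph styles, fonts, and font sizes found in word/document.xml.

    Hypothetical helper: only direct references in the document body are
    counted; inherited formatting from styles.xml is not resolved.
    """
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    summary = {
        "paragraph_styles": Counter(),
        "fonts": Counter(),
        "sizes_pt": Counter(),
    }

    # <w:pStyle w:val="Heading1"/> names the style applied to a paragraph.
    for el in root.iter(_w("pStyle")):
        style_id = el.get(_w("val"))
        if style_id:
            summary["paragraph_styles"][style_id] += 1

    # <w:rFonts w:ascii="Calibri"/> names the font applied to a run.
    for el in root.iter(_w("rFonts")):
        font = el.get(_w("ascii"))
        if font:
            summary["fonts"][font] += 1

    # <w:sz w:val="24"/> stores the size in half-points, so 24 means 12 pt.
    for el in root.iter(_w("sz")):
        val = el.get(_w("val"))
        if val and val.isdigit():
            summary["sizes_pt"][int(val) / 2] += 1

    return summary


def write_report(summary, out_path):
    """Write the counts to a plain text file, one section per category."""
    with open(out_path, "w", encoding="utf-8") as fh:
        for category, counts in summary.items():
            fh.write(category + "\n")
            for value, count in counts.most_common():
                fh.write("  %s: %d\n" % (value, count))
```

Something like write_report(summarize_docx_formatting("resume.docx"), "resume_styles.txt") would then produce the text file to check against, where both file names are placeholders.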



