NPR open-sources Copytext, a library for handling spreadsheets as Python objects (readthedocs.org)
128 points by jsvine on April 22, 2014 | 12 comments



Blog post explaining what problem this is meant to solve in more detail:

http://blog.apps.npr.org/2014/04/21/introducing-copytext-py....


"The spreadsheet is your CMS"
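
To make the idea concrete, here is a rough sketch of what "a spreadsheet as Python objects" can look like. This is not copytext's actual API: `Sheet` and `sheet_from_csv` are made-up names, and the sketch assumes CSV text with a `key` column rather than an .xlsx file so that it runs on the standard library alone.

```python
import csv
import io


class Sheet:
    """Rows of one sheet, addressable by position or by the `key` column."""

    def __init__(self, rows):
        self._rows = list(rows)
        # Index rows by their `key` cell so templates can look copy up by name.
        self._by_key = {row["key"]: row for row in self._rows if row.get("key")}

    def __getitem__(self, index_or_key):
        if isinstance(index_or_key, int):
            return self._rows[index_or_key]
        return self._by_key[index_or_key]

    def __iter__(self):
        return iter(self._rows)


def sheet_from_csv(text):
    """Parse CSV text whose first row is a header into a Sheet (hypothetical helper)."""
    return Sheet(csv.DictReader(io.StringIO(text)))


# Illustrative data only: two rows of editable copy.
content = sheet_from_csv("key,value\nheadline,Hello world\nbyline,By A. Reporter\n")
print(content["headline"]["value"])
```

Editors change the cells and the page picks up the new text by key, with no code change.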


gspread is a little-known Python module that uses the Google Spreadsheet API and offers a simple interface atop it. You don't even need to download the .xlsx file -- you can stay in sync with the spreadsheet in real time.

https://github.com/burnash/gspread

I use it for some internal project management tools at Parse.ly.


Hey (I'm the original author of copytext at NPR), this looks like a really nice lib. FWIW, we specifically don't do it this way, because during development we're probably refreshing our page thousands of times and we don't want the delay of hitting the Google servers for every request.

Incidentally, that's also why we don't request the spreadsheet in JavaScript: the lag time to render the page is too high.

We've found our approach gives us a nice balance of "real time" and "practical".

Thanks for the comment.
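
For anyone curious, the trade-off described above (work from a local copy, go back to the network only when the copy is missing or stale) could be sketched roughly like this. It is not NPR's actual code: `load_local_copy`, `max_age_seconds`, and the `fetch` callable are made-up names, and `fetch` stands in for whatever downloads the spreadsheet.

```python
import os
import time


def load_local_copy(path, fetch, max_age_seconds=300.0):
    """Return the bytes of a local spreadsheet copy, re-fetching only when stale.

    `fetch` is a hypothetical zero-argument callable that returns the
    spreadsheet as bytes (for example, a download from Google).
    """
    is_missing = not os.path.exists(path)
    is_stale = (
        not is_missing
        and time.time() - os.path.getmtime(path) > max_age_seconds
    )
    if is_missing or is_stale:
        # Slow path: one network round trip, then cache the result on disk.
        data = fetch()
        with open(path, "wb") as f:
            f.write(data)
        return data
    # Fast path: serve from disk without touching the network.
    with open(path, "rb") as f:
        return f.read()
```

With something like this, thousands of page refreshes during development cost one download per `max_age_seconds` window instead of one per request.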


Thank you for open sourcing it!


Nice, thanks for sharing this!

Also, I'm exploring simpler alternatives to OAuth2 for command-line access to the Google Drive SDK, and your password approach is interesting. My current use case is extracting the HTML from a Google Doc and converting that to LaTeX with a custom template: https://github.com/dergachev/gdocs-export


Another useful tool for working with spreadsheets in JavaScript is SheetJS: https://github.com/SheetJS/js-xlsx



What are the advantages over something like pandas?

Also interesting that NPR is run on Flask...


npr.org is not run on Flask. There are several independent dev teams at NPR: one for npr.org and associated apps like the CMS (where I used to work), one for show production software, one for producing cool standalone apps (where this came from), and one for station-facing apps. I may be forgetting one. They all have their own typical languages: one uses Java, Scala, and PHP. Another uses Drupal. The team that produced this uses Python and Flask. NPR is diverse. :)


It would be nice to mention that this is a wrapper around openpyxl: http://pythonhosted.org/openpyxl/


I realize that this project is about reading Excel files. However, if you're creating spreadsheets from Python, it is worth mentioning that there are 3 different packages. In my experience, from worst to best:

xlwt [1] < openpyxl [2] < XlsxWriter [3]

[1] http://www.python-excel.org/

[2] http://pythonhosted.org/openpyxl/

[3] https://xlsxwriter.readthedocs.org/

The docs of XlsxWriter are beautiful.



