In every organization I've worked in, it's usually the exact opposite: spreadsheets are used to do analysis and create reports, while the source data comes from other systems (SAP, Salesforce, QuickBooks, etc.).
(https://github.com/sensiblecodeio/databaker for the top-level project; https://github.com/sensiblecodeio/xypath for the library I created).
Get in touch if you're interested -- email@example.com
That seems to change as an automation and IT culture starts to develop in an organization. Interestingly, IT can sometimes be a barrier to data organization (too slow, cross-org costs, etc.) and cause people to keep using CSV and Excel.
FWIW, we built a service called "Yukon Data Solutions" that solves these classes of problems for people trying to transition from CSV and spreadsheets to automated and reproducible analytics and reporting.
Also, the context is a bit different: research vs. business.
Gene name errors are widespread in the scientific literature
Genome Biology 2016 17:177
https://doi.org/10.1186/s13059-016-1044-7 (Open Access)
I work closely with social scientists, and it's common in those disciplines to embed narrative and presentation data in spreadsheets. Articles like this one and Good Enough Practices in Scientific Computing (http://journals.plos.org/ploscompbiol/article?id=10.1371/jou...) are nice summaries of good practices.
Not if you are working in Excel! It can corrupt all sorts of Unicode data when exporting to CSV.
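One common cause of that corruption is Excel guessing a legacy code page when a CSV lacks a byte-order mark. A minimal Python sketch of the usual workaround (writing UTF-8 with a BOM via the "utf-8-sig" codec so Excel detects the encoding correctly); the file name and data here are just illustrative:

```python
import csv

rows = [["name", "city"], ["Müller", "Kraków"]]

# Write with a UTF-8 BOM ("utf-8-sig") so Excel detects the encoding
# instead of falling back to a legacy code page and mangling the text.
with open("out.csv", "w", newline="", encoding="utf-8-sig") as f:
    csv.writer(f).writerows(rows)

# Reading it back, "utf-8-sig" strips the BOM transparently.
with open("out.csv", newline="", encoding="utf-8-sig") as f:
    data = list(csv.reader(f))
```

This only protects the writing side, of course; it won't repair a CSV that Excel has already re-saved through a lossy code page.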
I've gotten in the habit of saving a copy in Excel format first, then exporting to CSV. If I realize I still have a bug in my CSV copy, I reopen the Excel version so any earlier formatting edits I've made are still intact.
I might have missed something obvious though so feel free to point it out. Google failed me though.
1. What do people use (currently) for data standardization?
2. What do you do if it's not in tabular format, but is something like JSON/NoSQL?
Right now I use the Karma Data Integration tool (http://usc-isi-i2.github.io/karma/) to transform my source data to RDF triples. It can handle data in a variety of shapes, not just tabular, though it does struggle with, say, highly nested XML data. I want to try LinkedPipes ETL (https://etl.linkedpipes.com/) on my next project, whatever it might be.
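For simpler cases of the JSON-to-tabular question, you don't always need a full tool like Karma. A minimal sketch (not Karma's actual pipeline, just an illustration with a made-up record) of flattening nested JSON into dotted column names suitable for a table:

```python
def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into dotted column names."""
    row = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            row.update(flatten(value, name))
        else:
            row[name] = value
    return row

record = {"id": 1, "contact": {"email": "a@b.c", "phone": {"home": "555"}}}
flat = flatten(record)
# flat: {"id": 1, "contact.email": "a@b.c", "contact.phone.home": "555"}
```

This breaks down once you hit lists of objects (one-to-many), which is where dedicated mapping tools earn their keep.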
HN discussion: https://news.ycombinator.com/item?id=17147272