
So You Want to Write Your Own CSV code - Monkeyget
http://tburette.github.io/blog/2014/05/25/so-you-want-to-write-your-own-CSV-code/
======
slg
CSVs are a headache. Like the article says, RFC4180 doesn't necessarily
represent the real world. However, sometimes you just have to reject things
that aren't to spec.

Not too long ago I was struggling with one of these CSV issues and received
some good advice from Hans Passant [1] on a Stack Overflow question pertaining
to my problem (emphasis mine):

"It is pretty important that you don't try to fix it. That will make you
responsible for bad data for a long time. Reject the file for being improperly
formatted. If they hassle you about it then point out that it is not RFC-4180
compatible. _There's another programmer somewhere that can easily fix this._"

It makes perfect sense in hindsight. If you accept a malformed CSV file,
people will expect you to accept _any_ malformed data that has a CSV
extension. You are taking on a lot of extra responsibility to cover for the
lack of work by another programmer. Odds are they can make a change to fix the
problem that takes a fraction of the time it would take you to work around it.
You just have to raise the issue.

I realize that rejecting bad files isn't really possible in every
circumstance. But I have a feeling it is an option more times than you might
initially think.

[1] - [http://stackoverflow.com/users/17034/hans-
passant](http://stackoverflow.com/users/17034/hans-passant)

~~~
barrkel
On the other hand, the ability to handle all kinds of input can be a chief
selling point of your product.

In my current job, the most common "invalid" CSV format we get is .xlsx files.

So I wrote an .xlsx parser (way, way faster than Apache POI).

Another interesting hiccup to consider is CSV inside individual fields - i.e.
recursive CSV. There are various ways to handle this, but in my company's line
of business the usual route is to duplicate that line once per CSV element
found in the field.
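That duplication strategy can be sketched roughly like this (the column index, the embedded delimiter, and the sample row are assumptions for illustration):

```python
import csv
import io

# Explode a row once per element of an embedded CSV field.
# The embedded list is assumed to live in column `col` and to use a
# plain comma with no quoting of its own.
def explode(rows, col):
    for row in rows:
        for part in row[col].split(","):
            out = list(row)
            out[col] = part
            yield out

# Hypothetical input: the second field holds a nested CSV list.
data = list(csv.reader(io.StringIO('a,"1,2,3",x\n')))
result = list(explode(data, 1))
print(result)  # [['a', '1', 'x'], ['a', '2', 'x'], ['a', '3', 'x']]
```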

Likely the next invalid format we'll have to parse is PDFs containing
tables...

~~~
mschuster91
> Likely the next invalid format we'll have to parse is PDFs containing
> tables...

 _cough_ people doing e-invoicing with PDFs...

------
mrweasel
The most absurd structure I've seen in a CSV file relates to the "What if the
character separating fields is not a comma?" question.

We get "CSV" files from Klarna, an invoicing company, with the payments
they've processed for us. Because we're Danish and they're Swedish, it's not
really weird that they would use a comma as the decimal separator. So to
compensate for having used the comma, they for some reason pick ", " (that's
comma + space) as the field separator. Most good CSV parsers let the field
separator be any character you like, as long as it's just ONE character. By
picking a two-character separator they've just dictated that I write my own
or resort to just splitting a line on ", ".
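One workaround, sketched in Python (the sample rows are invented): map the two-character separator to an unused control character first, since most parsers, including Python's csv module, only accept a single-character delimiter. This is only safe if no field can itself contain ", ".

```python
import csv
import io

RAW = "1234,56, DKK, paid\n78,90, SEK, pending\n"  # invented sample rows

# Collapse the two-character ", " separator into one unused control char,
# then hand the result to an ordinary single-character-delimiter parser.
prepared = RAW.replace(", ", "\x1f")
rows = list(csv.reader(io.StringIO(prepared), delimiter="\x1f"))
print(rows)  # [['1234,56', 'DKK', 'paid'], ['78,90', 'SEK', 'pending']]
```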

~~~
tikumo
it can be irritating, but you can just as easily replace ", " with "|" or
something, by simple string replacement, before parsing..

~~~
sunir
Think it through. What if there is free text in the field? "How are you,
Sally?"

~~~
lignuist
You can replace all commas with a placeholder (e.g. "#COMMA#"), replace the
delimiter with a comma, parse the document and then replace all placeholders
in the data with ",".

~~~
Someone
That does not work, unless that first replacement magically ignores the commas
that are part of field separators. If you know how to write the code that does
that, your problem is solved.

~~~
lignuist
I was referring to "What if the character separating fields is not a
comma?".

And there it clearly works. I used this technique a few times with success. If
you find a CSV file that has mixed field separator types, then you probably
found a broken CSV file.

~~~
zAy0LfpBZLC8mAC
No, it doesn't. What if there is #COMMA# in one of the fields?

~~~
lignuist
You just choose a placeholder that does not appear in the data. You could even
implement it in a way that a placeholder is automatically selected upfront
that does not appear in the data.

When it comes to parsing, the thing is that you usually have to make some
assumptions about the document structure.

~~~
zAy0LfpBZLC8mAC
What if there is #COMMA, in one of the fields (but no #COMMA#)?

Yes, the assumption you have to make is called the grammar, and you better
have a parser that always does what the grammar says, and global text
replacement is a technique that is easy to get wrong, difficult to prove
correct, and completely unnecessary at that.

~~~
lignuist
> What if there is #COMMA, in one of the fields (but no #COMMA#)?

What should happen? Since #COMMA is not #COMMA#, it does not get replaced,
because it does not match.

Please keep in mind that I replied to sunir's very specific question and did
not try to start a discussion about general parser theory. In practice, we
find a lot of files that do not respect the grammar, but still need to find a
way to make the data accessible.

~~~
zAy0LfpBZLC8mAC
What would happen is that you first would replace #COMMA, with #COMMA#COMMA#
and then later replace that with ,COMMA# , thus garbling the data.
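The garbling is easy to reproduce in a couple of lines (Python here, but any language works):

```python
# A field that happens to contain the placeholder prefix: "#COMMA,"
field = "#COMMA,"
step1 = field.replace(",", "#COMMA#")   # '#COMMA#COMMA#'
step2 = step1.replace("#COMMA#", ",")   # ',COMMA#' -- not the original field
print(step2)
```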

The way to make the data accessible is to request the producer to be fixed,
it's that simple. If that is completely impossible, you'll have to figure out
the grammar of the data that you actually have and build a parser for that.
Your suggested strategy does not work.

~~~
dbro
Usually the person parsing the CSV data doesn't have control over the way the
data gets written. If he did, he would probably prefer something like protocol
buffers. CSV is the lowest common denominator, so it's a useful format for
exchanging data between different organizations that are producing and
consuming the data.

[https://github.com/dbro/csvquote](https://github.com/dbro/csvquote) is a
small and fast script that can replace ambiguous separators (commas and
newlines, for example) inside quoted fields, so that other text tools can work
with a simple grammar. After that work is done, the ambiguous commas inside
quoted fields get restored. I wrote it to use unix shell tools like cut, awk,
... with CSV files containing millions of records.
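For readers curious about the idea, here is a minimal sketch of that substitute-and-restore technique (this is the concept, not csvquote's actual code; the placeholder characters are assumptions):

```python
SUB_COMMA = "\x1f"  # assumed placeholder: ASCII unit separator
SUB_NL = "\x1e"     # assumed placeholder: ASCII record separator

def sanitize(text):
    """Replace commas and newlines inside quoted fields so that
    line-oriented tools can treat every remaining comma and newline
    as a real delimiter."""
    out, in_quotes = [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif ch == "," and in_quotes:
            out.append(SUB_COMMA)
        elif ch == "\n" and in_quotes:
            out.append(SUB_NL)
        else:
            out.append(ch)
    return "".join(out)

def restore(text):
    """Undo sanitize() after the shell pipeline has run."""
    return text.replace(SUB_COMMA, ",").replace(SUB_NL, "\n")

row = 'name,"Doe, John",note\n'
assert restore(sanitize(row)) == row  # round-trip is the identity
```

The round-trip only holds if the input never contains the placeholder characters themselves, which is exactly the kind of invariant the grandparent comment says you have to prove.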

~~~
zAy0LfpBZLC8mAC
You tend to have more control over the way the data is produced than you
think, and you should make use of it. It's idiotic to work around broken
producers over and over and over again, each time with a high risk of
introducing some bugs, instead of pushing back and getting the producer fixed
once and for all. Often the problem is simply in the perception that somehow
broken output is just "not quite right", and therefore nothing to make a fuss
about. That's not how reliable data processing works. You have a formal
grammar, and either your data conforms to it or it does not, and if it
doesn't, good software should simply reject it.

Your csvquote is something completely different, though it seems like you
yourself might be confused about what it actually is when you use the word
"ambiguous". There is nothing ambiguous about commas and newlines in CSV
fields. If it were, that would be a bug in the grammar. It just so happens
that many unix shell tools cannot handle CSV files in any meaningful way,
because that is not their input grammar. Now, what your csvquote actually does
is that it translates between CSV and a format that is compatible with that
input grammar on some level, in a reversible manner. The thing to recognize is
that that format is _not_ CSV and that you are actually parsing the input
according to CSV grammar, so that the translation is actually reversible. Such
a conversion between formats is obviously perfectly fine - as long as you can
prove that the conversion is reversible, that the round-trip is the identity
function, that the processing you do on the converted data is actually
isomorphic to what you conceptually want to do, and so on.

BTW, I suspect that that code would be quite a bit faster if you didn't use a
function pointer in that way and/or made the functions static. I haven't tried
what compilers do with it, but chances are they keep that pointer call in the
inner loop, which would be terribly slow. Also, you might want to review your
error checking, there are quite a few opportunities for errors to go
undetected, thus silently corrupting data.

------
qwerty_asdf
Garbage in? Garbage out. You give me a shitty file, you get shitty results.
Tough shit.

None of these questions are particularly daunting. CSV means "comma separated
values", so if you want to play games and use other delimiters, please fuck
off. If it's not a comma, then guess what: it's not delimited. New line
characters are well-known, and well-understood, across all platforms and easy
to detect. If you manage to fuck that up in your file, then take a look in the
mirror, because the problem is you. Enforcing the practice of enclosing the
target data in quotation marks among users is a good idea. It's something that
should be supported and encouraged, and ignored at one's own risk.

Additionally, employing an escape character (such as backslash) to allow for
the use of a quotation mark within enclosing quotation marks is a nice feature
to add in. After that, the concept of a CSV file has provided enough tools, to
tolerate [an arbitrarily large percentage] of all use cases. If you need
something more robust, XML is thataway.

------
foxhill
as the article mentions, CSV is not well defined. libraries are.. well,
different. you'd spend as much time becoming familiar with one as you would
writing a basic parser.

commas don't delimit field entries? CSV -> comma separated values.

new lines inside a field? i've never written a parser that would be foiled by
this. could be an issue if you use a built-in tokeniser (e.g. strtok, etc.). be
aware.

variable number of fields? you’re probably writing this for something with an
expected input form. throw errors if you see something you do not accept. make
sure you catch them.

ascii/unicode? yea. it’s a fucking mess. everywhere.

just do it. handle failure gracefully. learn from your mistakes. don't be
naive. consider a library if the (risk of failure):(time) ratio is skewed the
wrong way. the only time i would absolutely insist that a 3rd party library be
used is when crypto is involved. even then, be aware that they are not
perfect.

absolutely ignore people whose argument is along the lines of "you are not
smart enough to implement this standard. let someone else do it.”. fuck
_everything_ about that statement, and its false sense of superiority.

nothing comes for free. whether you use a library, or do your own thing, you’re
going to run into problems.

~~~
B-Con
> absolutely ignore people whose argument is along the lines of "you are not
> smart enough to implement this standard. let someone else do it.”. fuck
> everything about that statement, and its false sense of superiority.

In general it's not about being smart enough (although for _some_ complicated
standards maybe it's true), but rather biting off more than you realize.
Everything sounds simple before you find the edge case implementation issues
and have to rework and rethink a bunch of hard issues that a dozen people have
already thought through. Doing it yourself is on the table, but rarely the
most efficient decision.

------
gavinpc
My most popular stackoverflow answer [1] includes a CSV writer and reader.
Yeah, I'd clean it up a little if I were doing it now (return enumerator
instead of array, etc). But people keep using it.

It uses regex lookaheads to deal with quoting, so it's not 100% portable. But
it's only about one page.

As for the other things mentioned by the OP (BOM, encoding), those should be
handled by the stream, and are not the province of CSV _per se_.

[1]
[http://stackoverflow.com/a/769713/4525](http://stackoverflow.com/a/769713/4525)

~~~
EvanPlaice
Regex lookaheads are more efficient because you're copying everything between
terminal chars at once as opposed to one char at a time.

Unnecessary string copy operations are what make the parser slow.
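A lookahead-based split along those lines might be sketched like this (a generic illustration, not the code from the linked answer; like that answer, it assumes no newlines inside fields):

```python
import re

# Split on commas only when an even number of quote characters remains
# ahead, i.e. when the comma sits outside a quoted field.
FIELD_SPLIT = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

def split_row(line):
    fields = []
    for f in FIELD_SPLIT.split(line):
        if f.startswith('"') and f.endswith('"'):
            f = f[1:-1]           # drop the surrounding quotes
        fields.append(f.replace('""', '"'))  # un-escape doubled quotes
    return fields

print(split_row('a,"b,c",d'))  # ['a', 'b,c', 'd']
```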

------
seanwoods
This article makes it much more complicated than it needs to be. It tries to
be all things to all people. In practice you're going to have to sacrifice
some functionality for the sake of usability and your own sanity.

When I add a CSV import feature to a project I'm working on, I tell people
"this works with MS Excel flavor of CSV." This covers most, if not all, real
world cases because in my world the people who want to import data are non-
programmer types who all use Excel.

I'll often include the basic rules in the screen that accepts the import. If I
ever had to accept data from something that was _not_ Excel I'd probably
include a combo box on the web form that lets you pick the dialect. So far I
haven't had to do that.

The only thing I might not be totally covering is how Excel handles newlines,
but in practice I've never had to deal with that.

~~~
leni536
Does it work with Hungarian MS Excel? It uses semicolons as delimiters.

~~~
Moto7451
If all you care about is Excel compatibility you can add "sep=," on the first
line. You can also use the Text Import Wizard. Changing the extension to .txt
should cause Excel to show the Wizard upon opening the file.
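For illustration, a file using that Excel-only hint might look like this (column names invented); note that non-Excel consumers will see the `sep=,` line as an ordinary data row:

```
sep=,
invoice,customer
1001,"Doe, John"
```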

------
encoderer
Early on in my career, just a year out of school, I had, for some absurd
reason, the idea to build my own date library.

Primarily, I didn't fully understand the date objects and functions available
in the languages/libraries I was using, so simple things like formatting a
date string seemed difficult to me.

This was an awful idea. Dreadful.

I came up with all sorts of delightful helper methods to cover common use
cases like adding one month to the current date. I made the decision to
represent dates internally with a timestamp, so adding a month is easy,
right?! No. ...What's 1 month from January 31st? February 28th? Well then
what's 1 month from February 28th? The list of edge cases goes on.
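For illustration, here is roughly what that edge case looks like, with one possible "clamp to the last day" policy (date libraries make different choices here):

```python
from calendar import monthrange
from datetime import date

def add_month(d):
    """Add one month, clamping the day to the end of the target month."""
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return date(year, month, min(d.day, monthrange(year, month)[1]))

print(add_month(date(2014, 1, 31)))  # 2014-02-28 (clamped)
print(add_month(date(2014, 2, 28)))  # 2014-03-28, not March 31st
```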

Most things in life are more complicated than they, at first, seem.

~~~
dceddia
_Especially_ dates.

------
Dorian-Marie
> Ruby CSV library is 2321 lines.

If you look at lib/csv.rb [1] it's:

* 2325 Lines

* 2161 Non-blank lines

* 950 Lines of Code

[1]:
[https://github.com/ruby/ruby/blob/trunk/lib/csv.rb](https://github.com/ruby/ruby/blob/trunk/lib/csv.rb)

------
iagooar
"CSV is not a well defined file-format. The RFC4180 does not represent
reality. It seems as every program handles CSV in subtly different ways.
Please do not inflict another one onto this world. Use a solid library."

I can't help but disagree when I read stuff like this. Why shouldn't I release
a library if I think it's good enough for the community? Even the powerful and
versatile Ruby library for CSV parsing started as a gem from a person who
didn't give a s... about advice like "do not inflict another one onto this
world".

~~~
chrismcb
If your library is a solid library, then release it. What he is saying,
though, is don't roll your own if you can use a solid library. And if a good
solid library exists, why bother writing your own?

~~~
iagooar
> And if a good solid library exists, why bother writing your own?

Because, you know, learning, having fun and stuff.

------
EpicEng
> _What if the character separating fields is not a comma?_

> _Not kidding._

We'd all be better off really, but that ship has sailed. Using CSV for data
which is only ever read by a machine is a dumb decision. Use the RS (record
separator) character and many of these ambiguities disappear.

Of course, like I said, that ship has sailed. If you want your data to be read
nicely by other programs you're probably stuck with CSV, TSV, or something
similar.

~~~
brianpgordon
On the other hand, there's definitely some value to being able to directly
inspect and alter your data in a text editor. It would be nice to not have to
deal with unprintable characters.

------
jimeh
Personally I know the pain of creating a CSV parser. In late 2006 I was
working on a PHP project that required a CSV parser, and what was available at
the time did not come close to cutting it. So I created my own
parser/generator, which among many other things included automatic delimiter
character detection. It was a rather painful project to create, but I learned
a lot, and found the experience really fun.

Overall I agree with the article, there's no point in reinventing the wheel if
there are libraries out there. And CSV specifically is a horribly complex
format to deal with. But sometimes rolling your own is the best and/or only
choice you have, and you might come out the other end enjoying the experience,
and having learned a lot.

As for what happened to my old CSV parser? It ended up being quite popular,
but stuck in the dark ages as I'd mostly moved on from PHP years ago. But
thanks to a contributor, we've recently put renewed effort into bringing the
project into modern times: [https://github.com/parsecsv/parsecsv-for-
php](https://github.com/parsecsv/parsecsv-for-php)

------
huherto
CSV works for simple cases. It is trivial to parse; you shouldn't even need a
library.

If there are many "what ifs" like in the posted article, you probably need
another format like JSON (preferably) or XML.

~~~
daigoba66
Off topic, but why JSON over XML? What are the technical advantages for using
JSON instead of XML (and don't say anything about "human readable"). If you're
consuming the data with JavaScript, I'll grant you that JSON has quite an
edge. But most every language has standard libs for XML. Both are easy to
parse, but XML is easier to validate given a schema definition.

~~~
skybrian
JSON is considerably more compact, especially if you use lists instead of
maps. For a list of numbers, there is only one character of overhead per item.
For a list of strings, it's three characters per item.

Of course you can embed comma-separated lists in XML, but with JSON it will
parse them for you.

(And of course it's not as good as a protobuf, but not bad for a text format.)

------
mooreds
This goes for most complex problems. The first step of any dev problem should
be to make sure you understand the problem, the second to map out the main
pieces and the third to make sure you are leveraging every (well maintained)
library possible. There are, of course, issues with dependencies and tying
yourself to code you didn't write, but what would you rather depend on--code
that has had tens or hundreds of eyes on it, or code that you, and maybe one
or two team members, have reviewed?

------
Rabidgremlin
One of my first open source projects was a JDBC driver that read CSV files. It
started simply enough but once you started adding in support for all the
quirks things became really complicated really quickly. Just check out all the
"options" for the driver that have been added by the community over the last
14ish years
[http://csvjdbc.sourceforge.net/doc.html](http://csvjdbc.sourceforge.net/doc.html)

------
p0nce
Can we stop being liberal in what we accept from others? It only leads to an
unfixable mess.

------
Sami_Lehtinen
Parsing CSV is easier than handling XML or JSON. I do integrations as my job,
and the most common format used is CSV because it's handy, simple and reliable
compared to other formats. That is exactly the reason why ini and props files
are also preferred over a database for data which isn't too volatile or big.
Anyone can open the datafile and see what's stored and what's wrong.

~~~
JensRantil
Have a look at `jq`:
[https://stedolan.github.io/jq/](https://stedolan.github.io/jq/) It makes
working with JSON a breeze.

------
michaelmior
The best tool I've found for working with CSV files is csvkit[1]. I've run
into some of the issues mentioned in the article and it's handled them all
gracefully. It's basically a bunch of scripts mirroring sort, grep, cut, etc.
but specifically for dealing with CSV files.

[1] [http://csvkit.readthedocs.org/](http://csvkit.readthedocs.org/)

~~~
voltagex_
Hey, this looks good. I've also used csvfix [1] to get me out of trouble
before.

1: [http://neilb.bitbucket.org/csvfix/](http://neilb.bitbucket.org/csvfix/)

------
kemayo
We actually use CSV-reading as an incidental part of a hiring exercise. We
provide a really simple homemade CSV parser as part of a PHP project, with a
"could you find and fix bugs in this?" instruction. The way to get full marks
is to rip out the parser and replace it with the appropriate standard library
function.

~~~
ohwaitnvm
I like this.

Only thing that I don't like is that many candidates will assume that they
have to fix the code within the parser, given those instructions, even if they
know that a battle-tested library is how they would actually do it. I hope you
accept an off-hand comment such as, "ew, why is this hand-rolled" as a
sufficient indicator in favor of your solution.

~~~
kemayo
Such a comment would be acceptable, yeah. So long as we can tell they looked
at it and thought "wow, that might go incredibly wrong"...

------
Roboprog
CSVs were simpler back in the 80s, when there were a few products (e.g. Lotus
1-2-3, xBASE) that all wrote RFC 4180 compliant text (and I'm pretty sure
there was no RFC 4180 yet).

No alternate delimiters, no backslashes.

Now I have to put up with offshore staff trying to use apostrophes (') instead
of quotes (") :-(

Barring alternate delimiters, and disallowing newlines* in fields, I can write
the parser for 4180 in about 30 lines of perl, reading a char at a time and
flipping between about 4 states. (avoids getting root access and days of
paperwork to install from CPAN)

* disallowing newlines in the data is admittedly a big restriction, but it works for many use-case/applications, and allows the caller to pull in a line before calling the parse function.
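For illustration, here is roughly that state machine in Python rather than Perl (same restrictions: comma delimiter only, RFC 4180 quoting, no newlines in fields, so the caller feeds it one line at a time):

```python
def parse_line(line):
    """Parse one CSV line with four states: between fields, in a plain
    field, inside quotes, and just after a quote inside a quoted field."""
    fields, buf = [], []
    state = "start"
    for ch in line:
        if state == "start":
            if ch == '"':
                state = "quoted"
            elif ch == ",":
                fields.append("")       # empty field
            else:
                buf.append(ch)
                state = "plain"
        elif state == "plain":
            if ch == ",":
                fields.append("".join(buf))
                buf = []
                state = "start"
            else:
                buf.append(ch)
        elif state == "quoted":
            if ch == '"':
                state = "quote_seen"
            else:
                buf.append(ch)
        else:  # quote_seen: either an escaped "" or the end of the field
            if ch == '"':
                buf.append('"')
                state = "quoted"
            elif ch == ",":
                fields.append("".join(buf))
                buf = []
                state = "start"
    fields.append("".join(buf))
    return fields

print(parse_line('a,"b,""c""",d'))  # ['a', 'b,"c"', 'd']
```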

For Java, the "Ostermiller" library is pretty good for CSV handling, and has a
few options for dealing with freaky variants.

------
collyw
I think this example is relevant to many seemingly trivial problems. Where the
task seems simple, but once you think about the details a bit more it becomes
complex.

I was trying to get Perl tar libraries working when my colleague asked why I
don't just use backticks to do it in the shell. Basically because I don't know
that much about tar. I can use it to untar a file, or create a new archive.
Someone else who has written a library has probably taken the time to read
through the whole manual and make it work nicely. They know the errors and
warnings, and have hopefully abstracted that to a sensible level. They have
thought about these things, so hopefully I won't have to.

------
michaelfeathers
Easy to write, hard to read. Perfect illustration of an emergent case of
Postel's Law.

~~~
astrobe_
To me it's more the perfect illustration of the broken window theory.

------
joshvm
I trust Numpy a lot for CSV handling. It deals with lots of edge cases
including missing data, weird delimiters (pipes '|' are popular in astro for
some reason) and massive files. If in doubt, whack it into Excel which has
been doing this stuff for decades now. I prefer using Numpy to Python's CSV
library which I find a bit clunky.

Very little data is actually true CSV.

The code isn't particularly long (~900 lines), it's Python (hence readable)
and it's well commented:

[https://github.com/numpy/numpy/blob/v1.8.1/numpy/lib/npyio.p...](https://github.com/numpy/numpy/blob/v1.8.1/numpy/lib/npyio.py#L1172)

~~~
jasode
>weird delimiters (pipes '|' are popular in astro for some reason)

I can only guess that since it's astronomy data and constellation coordinates
have decimal places, it's best to avoid the comma character because some
countries use it as a decimal separator.

[http://en.wikipedia.org/wiki/Decimal_mark](http://en.wikipedia.org/wiki/Decimal_mark)

------
winter_blue
A good and _performant_ alternative to CSV is Google's protocol buffers:
[https://code.google.com/p/protobuf/](https://code.google.com/p/protobuf/)

------
izietto
But it is still by far the most readable text data format out there, which is
the reason for its wide adoption. I'll be downvoted, but I really believe
this.

------
jstsch
So, which library? CSV is a mess.

~~~
qnaal
perl's Text::CSV

[http://search.cpan.org/~makamaka/Text-
CSV-1.32/lib/Text/CSV....](http://search.cpan.org/~makamaka/Text-
CSV-1.32/lib/Text/CSV.pm)

~~~
draegtun
Here's some links which (will always) point to latest versions on MetaCPAN:

[http://p3rl.org/Text::CSV](http://p3rl.org/Text::CSV) |
[http://p3rl.org/Text::CSV_XS](http://p3rl.org/Text::CSV_XS)

------
NaNaN
Why is CSV not just about readability? I think the RFC is sometimes too
pedantic, in that it lets CSV handle both plain text and binary data. A comma
is not just a comma; it means different things in different contexts. Why
should the phrase CSV, or Comma Separated Values, exist only for the RFC?

CSV, or Comma Separated Values, is not only for the RFC, but also for EVERYONE
who wants to use this word or phrase. Pedantry sucks!

------
mrcozz
We switched from a CSV based delivery to Apache Avro files. These are binary
files which have the record schema embedded in the file header. We're pretty
happy with this solution for the time being and it seems to be an awesome
alternative to CSV. I wonder if anyone else is doing something similar? Good
article but I'd appreciate if the author gave some alternatives.

------
mschuster91
I usually take advantage of the fixed format of each individual exporting
tool. Everyone does it a bit differently - so what? I have a PHP parser for it
and adapt it for each of my clients. It's cheaper to have a small parser,
adapted to the customer's needs, than one 10k SLOC library to handle a
boatload of files...

------
mantis369
CSV is really slow to work with, because you have to check for
well-formedness, like you do with XML. And in the end, I always end up making
specific concessions for the files that my customers use (which must be
patched again and again) or having to take a hard stance on what can and can't
be in the "CSV" files.

------
aubergene
Mike Bostock's DSV library handles pretty much all of the cases listed for
encoding and decoding. Written in JavaScript, in 116 lines.

[https://github.com/mbostock/dsv/blob/master/dsv.js](https://github.com/mbostock/dsv/blob/master/dsv.js)

------
justifier
i recently needed to deal with a ~4G xml file.. i tried a parser but after
waiting thirty minutes for it to load i decided to parse out the bits i needed
manually with a bash script

knowing my needs i could easily account for all possible muck ups and avoid
the instances where ambiguity could play a part

i was then able to use the bits i pulled out of the ~4G file, now 16M, in the
parser with all of its assurances

sure, edge cases justify using a tried and true library for generics, but
there are also edge cases that justify mocking up your own naive
implementation.. if only, like in my case, to make the data usable in such a
library

------
minimaxir
Most likely prompted by discussion on
[https://news.ycombinator.com/item?id=7794684](https://news.ycombinator.com/item?id=7794684)

------
kabdib
CSV: Where the only way to win is not to play . . .

------
neoyagami
This article represents all my feelings when my boss says "just write a CSV
parser for this, it's just CSV, so it ain't that hard"

------
itamarhaber
A non-standard standard is always a sure way to shoot yourself in the foot.
Endianness also causes some confusion...

------
codingdave
It's a flippin' CSV.

Of course you can come up with scenarios where it doesn't work, but anyone who
considers themselves to be a competent programmer should be able to deal with
these issues, use another data format, or just talk to whomever is giving you
the data to correct their data issues.

Seriously, the overwhelming CSV-bashing in these comments really makes me
worry that coders just can't handle the basics anymore.

~~~
SoftwareMaven
It's not a question of _can_ , it's a question of _should_. If any engineer on
my team came to me and told me he was building a CSV reader/writer, I would
seriously question his judgement as an engineer[1]. My thoughts would be that
either he isn't capable of seeing _obvious_ challenges in building a "simple"
CSV feature or he isn't able to prioritize his time well, focusing on useless
toys at the expense of getting important work done.

1\. Of course there are exceptions to the rules: perhaps the CSV is malformed
or there are special considerations in the backend, but the general point
stands.

------
epeus
Never ever use CSV to export. Use tab-separated, as it takes work to type a
tab in Excel.

------
yp_all
Post a sample .csv file you believe is too difficult.

I will solve your problem with only UNIX utilities. And I'm sure others will
solve it other ways.

Usually I only need sed and tr. Sometimes lex or AWK.

Arguing about something without ever pointing to an example accomplishes
nothing; it's just whining.

Post an example.

Thank you.

~~~
josephlord
It isn't that any particular file is difficult but that the variations that
you haven't even thought about might catch you out. It is the deceptive
simplicity of the samples that you have at hand that may catch out your code
when it hits a different (also simple but different) example in the field.

------
mantrax5
Why are people using CSV when better (and less fuzzily defined) solutions
exist, such as JSON?

~~~
minimaxir
CSV is far, far more ubiquitous and much more usable in non-web settings.
(e.g. desktop data analysis programs)

~~~
jgalt212
true, and in those settings you largely don't see situations that trip up
naive parsers such as newlines or delimiters inside fields.

