
Ask HN: Where can I find free/cheap financial data? - pwhelan
I am doing a side project on investing and need financial statements: balance sheets, income statements, cash flows. The SEC's EDGAR tool is terrible, and Google & Yahoo are missing the data for a lot of companies. I would only need yearly/quarterly reports for NYSE companies.

I've scraped Google Finance for info, but I have a lot of invalid rows in my database.

Thanks all.
======
geuis
<http://code.google.com/apis/finance/>
<http://kottke.org/09/06/online-financial-data-apis-and-resources>
<http://stackoverflow.com/questions/417453/best-most-comprehensive-api-for-stocks-financial-data>

(edit: Added the link for Google's Finance API because I'm not sure whether you
are referring to their API or their finance site when you say 'scrape')

------
carucez
I've done a lot of scraping and parsing. Your best option is to fetch the
SEC's RSS, then fetch the hard-to-parse XML/Free-form and parse it. XBRL is
great in its own regard, but it's very difficult to relate XBRL fields with
non-XBRL filings. You would do well to separate the two results.
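
A minimal sketch of that parse-and-separate step, assuming an Atom-style feed. The sample feed, the `split_filings` name, and the rule that an XBRL filing carries a `category` term of "xbrl" are all made up for illustration; the real SEC feed layout may differ and should be checked before relying on any of this.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical sample feed. A real feed would be downloaded first
# and may use different elements to mark XBRL filings.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>10-Q - EXAMPLE CORP</title>
    <link href="http://example.com/filings/1.xml"/>
    <category term="xbrl"/>
  </entry>
  <entry>
    <title>10-K - SAMPLE INC</title>
    <link href="http://example.com/filings/2.txt"/>
  </entry>
</feed>"""


def split_filings(feed_xml):
    """Return (xbrl, other) lists of (title, url) tuples."""
    root = ET.fromstring(feed_xml)
    xbrl, other = [], []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="").strip()
        link = entry.find(ATOM + "link")
        url = link.get("href") if link is not None else None
        # Assumed marker: a <category term="xbrl"/> child element.
        terms = {c.get("term") for c in entry.findall(ATOM + "category")}
        (xbrl if "xbrl" in terms else other).append((title, url))
    return xbrl, other


xbrl_filings, other_filings = split_filings(SAMPLE_FEED)
```

Keeping the two lists apart from the start means each can get its own parser and its own table later.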

SEC form 4 filings are in XBRL dating back to Jan 1, 2004 for every company.
There are well over 1,000,000 forms filed between then and now... I know, I
have them all locally right now.

You can scrape Google's Financial pages, obviously, and you can even get
2-minute data from a JSON "_5d" variable.

You can get fundamentals data from nasdaq pretty easily, too. Scraping it is a
little difficult, but you can go 120 quarters back for many companies, and 5
years back for annual data.

I have a financial statements database populated with scraped nasdaq data
right now. The site updates within a week or so after a filing is published to
the SEC. You'll always be behind the curve, but the information is good,
albeit incomplete (missing things like the number of shares outstanding).

------
DevX101
The SEC has a new XML-like data system for financial information called XBRL:
<http://xbrl.sec.gov/>

Not all companies are required to report in this format at this time but I
believe over the next year most Fortune 500 companies will be required to
provide their data in this format.

~~~
d2viant
That's great information. Sounds like it's voluntary at this point but
required by January 1st, 2011.

------
d2viant
I've been working on a closely related project for about a year now. It's not
worth your time to attempt scraping. The best advice I can give you is that
any financial data source worth consuming is going to cost money, so you might
as well pay for it and focus your energy/time on building the product itself.
The free data sources are unreliable and stale, and scraping legit sites is
problematic because of throttling issues.
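
For anyone who scrapes anyway, the throttling problem mostly comes down to spacing requests out. A minimal sketch, assuming a fixed minimum gap between requests is acceptable to the site; the `Throttle` class and the 2-second interval are illustrative, not anything a particular site documents.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last = None      # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(min_interval=2.0)  # 2 seconds is an arbitrary placeholder
# for url in urls:      # urls and fetch() are hypothetical
#     throttle.wait()
#     fetch(url)
```

This only limits your own request rate; it does nothing about a site that blocks scrapers outright.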

~~~
mnemosyne51
I agree. The quality of data is always going to be an issue and even the paid
data has several problems. Be prepared to write a lot of code to catch these
data errors.
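
A minimal sketch of that kind of check, assuming each balance-sheet row is a dict. The field names, the 1% tolerance, and the sample rows are invented for illustration; a real pipeline would need many more rules than this.

```python
def check_row(row, tolerance=0.01):
    """Return a list of problems found in one balance-sheet row (a dict)."""
    problems = []
    required = ("ticker", "period", "total_assets",
                "total_liabilities", "total_equity")
    for field in required:
        if row.get(field) is None:
            problems.append("missing " + field)
    if problems:
        return problems  # cannot test the identity with missing values
    assets = row["total_assets"]
    expected = row["total_liabilities"] + row["total_equity"]
    # Accounting identity: assets = liabilities + equity, within rounding.
    if abs(assets - expected) > tolerance * max(abs(assets), 1):
        problems.append("assets != liabilities + equity")
    return problems


# Made-up rows: one clean, one that breaks the identity, one with a gap.
rows = [
    {"ticker": "AAA", "period": "2009Q4", "total_assets": 100.0,
     "total_liabilities": 60.0, "total_equity": 40.0},
    {"ticker": "BBB", "period": "2009Q4", "total_assets": 100.0,
     "total_liabilities": 60.0, "total_equity": 10.0},
    {"ticker": "CCC", "period": "2009Q4", "total_assets": None,
     "total_liabilities": 5.0, "total_equity": 5.0},
]
bad = {r["ticker"]: check_row(r) for r in rows if check_row(r)}
```

Rows that land in `bad` can be quarantined instead of silently written to the database.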

------
maqr
YQL for Yahoo Finance is a useful way of querying it. I'm not sure where you
go if they don't have the information that you need though.

------
starpilot
SECWatch has an API: <http://secwatch.com/api.jsp>

10kWizard's cheapest plan works out to about $21/month.

~~~
pwhelan
I hadn't seen the SECWatch site. It looks like there is a lot of potential
there, thank you.

------
regularfry
<http://caps.fool.com> is scrapeable, but I don't know if it's got quite what
you need.

