

Major Fail at XMLTeam Tonight - vdibart

To date I've had nothing but good things to say about XMLTeam (http://xmlteam.com/), an alternative to high priced/low value professional sports stat providers like Stats Inc.  I'm the last person who wants to see them go out of business, but sounds like today was a bad day to stop drinking if you're an employee over there.

This evening I got an email telling me that my password was reset due to security concerns, and any applications I have deployed have to use the updated password before October 5th.

There was only 1 rather large problem - the email contained the username and password for someone else's account.  And I couldn't log into my own account.  (No, I didn't try to log in with the other guy's credentials).

So, to summarize, someone at XMLTeam realized there was a security hole in their software (problem #1) and decided to reset everyone's passwords with barely any notice (problem #2).  Then they sent the usernames and passwords in clear text (problem #3) via email (problem #4) to the wrong emails (problem #5).

In his defense, the CEO responded quickly to my email and assured me that I would not be charged for any requests that are submitted to my account while they get things worked out (I'm on a pay-per-request plan, which is honestly one of the best deals out there for this kind of thing).  He sounded as rattled as you might expect.

Look, I sympathize to some degree, but this is a colossal fail.  Get it together guys!  Small companies like mine depend on you.
======
thwarted
My issue with XMLTeam has been that their salesmen don't seem very technology
oriented, and the database their interface/API accesses holds only the most
recent info, not everything that they actually have available. There was some
random cut-off date in the middle of last season where stats disappeared
without indication. If you want anything historical, it's not available via
the charge-per-document API but only via
salesman-runs-the-query-and-emails-you-a-massive-zip-file, which was error
prone: files were missing even from that, or they "forgot" to include one of
the document types. They were generally nice and took care of problems like
missing files, but when you're on a time budget to do a proof of concept, this
is very frustrating (which is why I preferred to just query for the data I
wanted automatically, but even that required numerous back and forths).

Additionally, it looks like their data is going to be really clean, but it's
not (like, what are you paying for otherwise?). A significant portion of the
data loading code I wrote did cleanup and tried to detect variations that a
human needed to look into. The data itself often isn't normalized. Positions
in basketball were sometimes single characters, other times were strings ("F"
vs "forward" or "forward center", "FC" or "CF" (but never "center forward")).
Dates had differing formats, even in the same field. Sometimes things were
specified in seconds, other times in minutes:seconds, sometimes in just
minutes (with no label, so I had to use heuristics to determine if a number
represented minutes or seconds and multiply accordingly). Some documents had a
different XML element nesting structure at the top level, almost like the XML
was generated by hand with someone typing it into an XML tool. Which is odd
because they have this whole database schema they talk about that is supposed
to be able to handle all the sports they support.
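
A minimal Python sketch of the kind of cleanup described above. This is not the actual loader code: the function names are hypothetical, the "G"/"guard" entries are assumed for completeness, and the 70-minute cutoff is just one possible heuristic for telling unlabeled minutes from seconds.

```python
# Hypothetical lookup table: "F"/"forward"/"center" and the "FC"/"CF" codes come
# from the variants described above; "G" and "guard" are assumed for completeness.
_POSITION_CODES = {
    "f": "F", "forward": "F",
    "c": "C", "center": "C",
    "g": "G", "guard": "G",
}


def normalize_position(raw):
    """Map 'F', 'forward', 'forward center', 'FC' or 'CF' to a sorted tuple of codes."""
    text = raw.strip().lower()
    if " " in text or text in _POSITION_CODES:
        tokens = text.split()   # whole words, or a single known token
    else:
        tokens = list(text)     # compact codes such as "FC"
    unknown = [t for t in tokens if t not in _POSITION_CODES]
    if unknown or not tokens:
        # Surface anything unexpected so a human can look into it.
        raise ValueError(f"unrecognized position: {raw!r}")
    return tuple(sorted({_POSITION_CODES[t] for t in tokens}))


def playing_time_seconds(raw, max_minutes=70):
    """Convert 'MM:SS', bare minutes, or bare seconds to whole seconds."""
    text = raw.strip()
    if ":" in text:
        minutes, seconds = text.split(":")
        return int(minutes) * 60 + int(seconds)
    value = float(text)
    # Assumed heuristic: an unlabeled number up to max_minutes is minutes,
    # anything larger is taken to already be seconds.
    if value <= max_minutes:
        return int(round(value * 60))
    return int(round(value))
```

The obvious weakness is that a player who really was on the floor for 45 seconds gets read as 45 minutes, so values near the cutoff are worth flagging for a manual check rather than trusting blindly.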

We also spent a lot of time hand comparing to other sources (like stats.com
data available on Yahoo Sports) because we found a lot of odd outliers that
didn't make any sense. I had stat names diverge between seasons, and they
literally provide a lot of data that is derivable, which in a few cases was
wrong (the free throw percentage should be the free throws made over the free
throws attempted, but was often different). The schema is really weird. They
know who played which position for every game, but rosters are only available
on a per-season basis. The mobility of players between teams (and even between
sports or leagues) isn't acknowledged well enough in the data (the same
physical person having different primary keys, for example).
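
The free throw check is easy to automate. A rough sketch in Python, assuming the reported percentage is on a 0-100 scale (if the feed uses 0-1, scale accordingly); the function name and tolerance are hypothetical:

```python
def free_throw_pct_mismatch(made, attempted, reported_pct, tolerance=0.1):
    """Return the derived percentage if it disagrees with the reported one, else None."""
    if attempted == 0:
        return None  # nothing to derive from zero attempts
    derived = 100.0 * made / attempted
    # The tolerance absorbs rounding in the reported value.
    if abs(derived - reported_pct) > tolerance:
        return derived
    return None
```

Running something like this over every derivable stat at load time turns hand comparison into a short list of rows worth looking at.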

They seem to have all documents pre-generated and then the API just selects
which documents to send you. That should be a good optimization, but it was slow
unless the result set was only a handful of documents (significantly fewer than
a season's worth), and the API/interface implied that data was available that
wasn't.

They seem to have a good product if your goal is only to show some sports
stats on your website in an embedded frame. That's ephemeral by nature, so if
something is goofed up, it'll be fixed in the data for the next game. But
for any kind of bulk analysis or browsable database built from their data,
there are many deficiencies. There were also some odd licensing limits we
couldn't get straight answers on (like "must state you got the data from XMLTeam
on your website", but our use of the data didn't have a website in the common
case and we would have been strictly violating the licensing terms). I was
often like "Dude, I'm throwing you money to make this work for me" and didn't
get a lot of satisfaction out of it.

They could be a _serious_ contender against the (significantly) more expensive
stats.com service (which I don't have any direct experience with because they
are so expensive), and I hope they improve, but it seems like one of those
companies/services that, at least to a tech person like me, makes money in
spite of themselves.

