Singer is an open-source standard for writing scripts that move data between databases, web APIs, files, queues, and just about anything else you can think of. Lots of companies build ETL scripts to move their data, and there's a huge amount of rework that happens from company to company. We believe that developers should spend less time moving their data and more time using it.
We're open sourcing 12 of our integrations (with more to come) so that they can be used in other applications, and we're excited to see what the community builds. Let me know if I can answer any questions.
I read through https://github.com/singer-io/getting-started/blob/master/SPE..., but I'm trying to better understand why schemas are necessary.
- JSON doesn't have a robust set of data types, and specifically lacks a datetime/timestamp type. With a schema, Taps can, for example, denote fields in the JSON that contain datetimes represented as strings, and then Targets can convert those to proper datetimes and handle them accordingly.
- Dealing with unstructured or flexibly-structured data is hard. Requiring a schema forces a Tap author to think about the structure of the data up front. By validating each data point against a schema, the Tap author should be able to more quickly identify nuances in the data set - like missing fields, nullable fields, mixed-type fields, etc. - and either decide to clean them out of the data (if appropriate), or provide the right schema to inform downstream applications about them. Identifying and handling these problems requires an understanding of the source data set, so it is best done as close to the data source as possible.
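To make those two points a bit more concrete, here's a rough sketch in Python using only the standard library. The `users` stream, its field names, and the helpers `datetime_fields`, `convert_record`, and `check_record` are all made up for illustration, and the message shapes are my reading of the spec rather than anything authoritative. The check is deliberately tiny and is not a full JSON Schema validator.

```python
import json
from datetime import datetime

# Hypothetical stream and fields, for illustration only.
SCHEMA_MESSAGE = {
    "type": "SCHEMA",
    "stream": "users",
    "key_properties": ["id"],
    "schema": {
        "type": "object",
        "properties": {
            "id": {"type": "integer"},
            "name": {"type": ["string", "null"]},  # nullable field
            "created_at": {"type": "string", "format": "date-time"},
        },
    },
}
RECORD_MESSAGE = {
    "type": "RECORD",
    "stream": "users",
    "record": {"id": 1, "name": None, "created_at": "2017-03-01T12:30:00+00:00"},
}

# Only the JSON types this sketch needs (assumption: a real validator covers more).
JSON_TYPES = {"string": str, "integer": int, "null": type(None)}


def datetime_fields(schema):
    """Names of properties the schema marks as date-time strings."""
    return [
        name
        for name, spec in schema.get("properties", {}).items()
        if spec.get("format") == "date-time"
    ]


def convert_record(record, schema):
    """Target side: turn date-time strings into datetime objects."""
    converted = dict(record)
    for name in datetime_fields(schema):
        if isinstance(converted.get(name), str):
            converted[name] = datetime.fromisoformat(converted[name])
    return converted


def check_record(record, schema):
    """Tap side: report missing fields and fields with an unexpected type."""
    problems = []
    for name, spec in schema["properties"].items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        allowed = spec["type"] if isinstance(spec["type"], list) else [spec["type"]]
        if not any(isinstance(record[name], JSON_TYPES[t]) for t in allowed):
            problems.append(f"{name}: unexpected type {type(record[name]).__name__}")
    return problems


# Tap side: one JSON message per line.
lines = [json.dumps(SCHEMA_MESSAGE), json.dumps(RECORD_MESSAGE)]

# Target side: remember the schema per stream, then convert each record.
schemas, results = {}, []
for line in lines:
    message = json.loads(line)
    if message["type"] == "SCHEMA":
        schemas[message["stream"]] = message["schema"]
    elif message["type"] == "RECORD":
        results.append(convert_record(message["record"], schemas[message["stream"]]))
```

Calling `check_record({"id": "1", "created_at": "x"}, SCHEMA_MESSAGE["schema"])` would flag the string `id` and the missing `name`, which is the kind of thing a Tap author wants to catch before the data goes downstream.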
The longer answer is that it may take us a while to get to 100% open source, but that's the direction we're moving. All of our new integration development will be open source and be part of the Singer project. Our original integrations were written in a different framework and couldn't be run independently of Stitch, and it's a nontrivial amount of work to convert them to the Singer format.
We included several of our existing integrations as part of this launch, and we'll definitely be adding more of them as well as new integrations.
