

Smarter Rails Seeding with Sprig - dce
http://viget.com/extend/smarter-rails-seeding-with-sprig

======
rpwilcox
It looks like using Rails test fixtures (reborn) to create seed data, which is
an OK win I guess, and having it separated out by environment (implicitly
mentioned in the article) is nice, but I don't see how much of a win this is
over instantiating fixtures or FactoryGirl.create in your seeds.rb file.

Yes, there are tons of problems with using db/seeds.rb for anything serious.
I'm not sure that Sprig will handle my issues much better than I currently do
myself.

For the record, my desired behaviors for a seed data solution are:

1\. environmental separation (which Sprig has). Developers have different
needs for seed data than QA does, as does the staging server, as does
production. Developers want an easy default dev@myco.com user with a simple
password, but you don't want that in production.

2\. If I rerun my seed solution (perhaps because I added some more seed data)
it shouldn't duplicate records (or throw errors because it's trying to create
a second user with the same email address).

3\. Handle bootstrap data I need in my app (example: I want a list of US
states, and every environment should get this. To reiterate my second point, I
should be able to add to this bootstrap data without getting two copies of
"California" in my US state list).
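
As a rough illustration of those three behaviors together, here is a minimal plain-Ruby sketch. It is not how Sprig or any other gem works; `SeedStore`, `SHARED_SEEDS`, `ENV_SEEDS`, and `seed!` are hypothetical names, and the in-memory hash only stands in for a real database. In a Rails app the same idea would more likely be expressed with something like `find_or_create_by` on a model.

```ruby
# Hypothetical in-memory stand-in for database tables, keyed by a natural key
# (an email address, a state abbreviation, and so on).
class SeedStore
  def initialize
    @tables = Hash.new { |hash, name| hash[name] = {} }
  end

  # Insert the row only if no row with this natural key exists yet.
  def find_or_create(table, key, attrs = {})
    @tables[table][key] ||= attrs.merge(key: key)
  end

  def count(table)
    @tables[table].size
  end
end

# Bootstrap data every environment should get (need 3).
SHARED_SEEDS = {
  states: [["CA", { name: "California" }], ["NY", { name: "New York" }]]
}.freeze

# Extra data that only some environments want (need 1).
ENV_SEEDS = {
  "development" => { users: [["dev@myco.com", { password: "password" }]] },
  "production"  => {}
}.freeze

def seed!(store, env)
  [SHARED_SEEDS, ENV_SEEDS.fetch(env, {})].each do |seeds|
    seeds.each do |table, rows|
      rows.each { |key, attrs| store.find_or_create(table, key, attrs) }
    end
  end
  store
end

store = SeedStore.new
2.times { seed!(store, "development") } # rerunning must not duplicate rows (need 2)
puts store.count(:states) # prints 2
puts store.count(:users)  # prints 1
```

Keying each row on a natural key is what makes the rerun safe: adding a new state to `SHARED_SEEDS` and seeding again would add only that state.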

It's sad that no real solution exists to handle all three of these needs. Some
projects I've been on have gotten this close, but that was years ago and
things have changed.

(If Sprig does have these things, then that's the selling point, not seed data
as fixtures which the article emphasized)

I'm also not sure about using fixture like things in Sprig. I give it 6 months
before most users remember why many people in the Rails community moved to a
Factory pattern for (test) data construction long ago.

However, I am happy that a relatively well known Rails consultancy has released
1.0 of a seed gem. Hopefully the name recognition / noise will lead developers
to the gem and it'll be a better solution with many more eyes.

~~~
bigtunacan
Not sure about Sprig, but Seedbank hits all 3 of those and has for the past
two years.

[https://github.com/james2m/seedbank](https://github.com/james2m/seedbank)

------
danso
I completely expected the OP to have had a typo in the headline and for it to
actually be about "Spring", which is an amazing and essential gem (and part of
4.1beta)...but this is pretty cool too :).

I'm biased because the OP shows off a custom parser for Google Spreadsheets,
which is neat to me because Google Spreadsheets is my go-to interface for new
prototyped apps...a much better, live-collaborative admin than anything I
could easily build myself or with Rails tools.

But I wonder if this gem is more work than it's worth? I mean, seeds don't seem
like the best place to persist intricate production-ready data in the repo.
And if you continue to use Google Spreadsheets, or whatever, as your main
admin input interface, then it seems worth it to build a more elaborate
abstraction to handle that use case.

Also, I wonder if some of the self-referencing could be done via YAML's
standard syntax? That would mean no JSON as a format, but YAML seems like it
was built for this kind of lightweight relational data storage?
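
For readers unfamiliar with that syntax, here is a small sketch of YAML anchors (`&name`) and aliases (`*name`) loaded with Ruby's standard `yaml` library. The `users`/`posts` layout and the `SEED_YAML` constant are made up for illustration and are not Sprig's seed format.

```ruby
require "yaml"

# Hypothetical seed document: the post refers back to the user record
# through an anchor (&dev) and an alias (*dev).
SEED_YAML = <<~SEEDS
  users:
    - &dev
      email: dev@myco.com
      name: Dev User
  posts:
    - title: Hello
      author: *dev
SEEDS

# Aliases are rejected by safe_load unless explicitly allowed.
data = YAML.safe_load(SEED_YAML, aliases: true)

puts data["posts"][0]["author"]["email"] # prints dev@myco.com
```

Anchors are scoped to a single YAML document, so an alias in one file cannot point at an anchor defined in another file.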

~~~
lkurtz
The Google Spreadsheet example is admittedly a reach. It's mostly just a
demonstration of how flexible Sprig can be with data formats.

That's a very interesting idea to use YAML's self-reference syntax. Although a
lot of the value of this gem comes from the seed organization across different
record-type-specific files, and I think YAML's self-referential syntax might
make cross-file referencing a bit more difficult for the user. Good thing to
keep in mind for the future though, especially once everyone is on board the
YAML train.

Do you have another place/system for persisting that you've had success with in
the past? I've always felt that seeds are designed to be the place to persist
record data in the repo.

~~~
danso
For data that I intend to share/reuse, I tend to have it in a SQL dump, but
here are my (perhaps non-informed) assumptions and needs:

1\. The values in this data _never_ change. That is, if I'm maintaining a
store of the U.S. Congress vote database, all votes (in the past) are
inextricably and forever tied to a legislator's ID.

2\. The seed data is so large that loading by ActiveRecord is too slow and
needs to be done either by plain SQL import or activerecord-import.

3\. The dataset should, when possible, imply its own opinion on
conventions...so the Congressmember table will always be "congressmembers" and
Vote will always be in "votes" and the relations and their keys will have the
same convention in any app that uses this data. Of course, a particular app
may choose to rename things, but they can do that after the seeding process.

4\. For a situation like the above, it's likely that a Rails Engine has been
made, i.e. with all the domain-specific logic.
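
To make point 2 concrete, here is a rough plain-Ruby sketch of why bulk loading is faster: it builds one multi-row `INSERT` per batch rather than one statement per record. The helper names, the column names, and the sample rows are all hypothetical, and the quoting is deliberately naive. Against a real database you would lean on the adapter's quoting, or on a gem such as activerecord-import, rather than on this code.

```ruby
# Naive SQL literal quoting, for illustration only; real code should rely on
# the database adapter's own quoting.
def sql_literal(value)
  case value
  when nil     then "NULL"
  when Numeric then value.to_s
  else "'#{value.to_s.gsub("'", "''")}'"
  end
end

# Build one multi-row INSERT per batch instead of one INSERT per record.
def batched_inserts(table, columns, rows, batch_size: 500)
  rows.each_slice(batch_size).map do |batch|
    values = batch.map { |row| "(#{row.map { |v| sql_literal(v) }.join(', ')})" }
    "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{values.join(', ')};"
  end
end

# Made-up sample rows: id, congressmember_id, position.
votes = [[1, 300_001, "Yea"], [2, 300_001, "Nay"], [3, 300_002, "Yea"]]
statements = batched_inserts("votes", %w[id congressmember_id position], votes, batch_size: 2)
puts statements
```

With a batch size of 2, the three sample rows become two statements, the first carrying two rows and the second carrying one.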

For smaller datasets, say, a list of the U.S. states and their
abbreviations...storing them as plain seed files should suffice.

So anyway, those are my past practices. I'm not saying they're best, though,
and am always looking for better ways to organize data between apps.

------
bigtunacan
This feels like just re-inventing the wheel again. The seedbank gem
[https://github.com/james2m/seedbank](https://github.com/james2m/seedbank) has
been around for a couple of years and does a great job of managing seeds on a
natural, more granular level, if that is what you need.

------
tbruffy
Seems like an oversight not to include XML support

------
hmans
No. Just no.

