Waaaay ahead of you: I'm a genealogy nerd and I built an open-source archival records management system (Apache Solr + MySQL + PHP) called LeafSeek to do just that. It won second place at the RootsTech Developer Challenge.
I currently work with two major non-profit genealogy groups to put their record collections online for free, easy searching, without handing them over to for-profit groups like Ancestry. So far we're at 700,000+ records online using my code, and the newest "big" addition is the Israel State Archives agreeing to put their British Mandate period marriage and divorce certificates online via one of the non-profits using the code.
I'm planning on launching the system as a multi-tenant SaaS one of these days, once my kids are a little older. :-)
A three- or four-tier plan, based on the number of records uploaded and the desire for additional bells and whistles (a personalized subdomain, white-labeling for better-known libraries and archives, the ability to keep some records private except to users who are affiliated with your group through offline membership, etc.).
No freemium, only paying clients. BUT a price discount if users (1) allow automatic backup of their datasets in CSV format to the Internet Archive through its S3 system, even if the data is "darked" and not visible (I'm on Archive Team and I also created their "Antecedents" collection of genealogy-related web crawls: https://archive.org/details/archiveteam_antecedents) and (2) explicitly put their data in the public domain.
Pie in the sky idea: create an API for higher-tier paying users that lets them license their data directly to Ancestry, MyHeritage, FindMyPast, etc. That would flip the pay-for-data model on its head, letting users regain control of their own datasets and making for-profit data brokers pay them for access.
Thank you! In a perfect world there could be a free plan for small-time data publishers, individual researchers without a lot of funds, etc. But you have to be very wary of recreating the RootsWeb scenario from 12 years ago, where a giant collection of genealogy records created through volunteer-run goodwill got in over its head financially and needed a white knight to come bail it out. In that case, the white knight was Ancestry, which (amazingly) hasn't shut it down in all these years. But I am aware that any new project to put archival data online MUST be fiscally responsible and prudent, because it is such a big responsibility. This is customer data unlike most other kinds.
To that end, I will also be rejecting VC and bootstrapping the venture. Ain't nobody going to be flipping this site or acqui-hiring that data. This is going to be done for the long-term.
Have you thought about incorporating as either a non-profit or a social B corp? Neither prevents you from drawing income from the project, but either might increase your chances of using grant money or Kickstarter/Indiegogo funds to further the cause.
I have thought about, and rejected, the non-profit idea, as non-profits have so much overhead to run (tons of paperwork) and the status might limit any political advocacy I might want to do in connection with the desire for more open data/records policies. But I don't know much about the social B corp. Would need to do more study...
Website: http://www.leafseek.com
Background article about how the project came to be: http://www.leafseek.com/blog/leafseek-gets-published-in-avot...