Ask HN: My school needs a data storage solution
31 points by antigen8 on Sept 13, 2015 | 60 comments
I work at a New York Public High School, and we're sick of dealing with the legacy Department of Education systems. You cannot imagine. We have to keep them up to date by hand anyway, so while we're at it, we want to build our own data store so that we can work with our data to help our students.

Our data isn't time-series, we need someplace to store our stuff, and we need advice from experts

My expert advice is, Don't do it, unless you can buy the exact system off the shelf, and allocate the budget to run and support it. If you can't, stick with the crappy legacy system you have.

Building actual software for actual use cases for actual people is extremely hard, and a lot harder than it feels like when you're just a few visionaries with an itch talking about these things. I'll say it again, slowly. It is EXTREMELY hard.

Sure, you can probably get a couple of enthusiasts together and build an infinitely better 80% solution in Ruby on Rails over a weekend, but you will learn the hard way just how important those last 20% are. That unpleasant senior teacher that never liked you? He uses a feature in those 20% and he's friendly with the principal and the union rep, and before you know it, you're up to 3am, not building awesome solutions for the entire school, but building bespoke solutions to that guy's convoluted backwards workflows that barely made sense in the 70s when he came up with them -- and you don't have the political capital to tell him to shut up and get with the program, however warranted that message would be.

If you do pull through, your reward will be a pat on the back, and a lifetime of being on 24h call.

Oh, and that's your best case scenario. There's a high risk of not actually achieving a system that is substantially better than the legacy system, even in the 80% case. And a substantial risk of introducing a subtle bug that screws up grading at the worst possible time (you know, how an administrator pulls a transcript giving an A+ student a B- average and sends it to his prospective college, screwing up his future, and the error is only caught after the deadline, and it's all literally, personally your fault).

To pull this off, you need a team, a budget AND hard executive sponsorship. That sounds unlikely for a public high school.

If you think you have the vision and the domain knowledge, and the technical chops to pull this off (but no budget and/or no executive cover), your best bet might be something like YC: fix this problem for ALL schools.

Spot on.

(Been there done that, saving your excellent comment for future "let's rewrite this" situations.)

Also summarised well here: http://www.joelonsoftware.com/articles/fog0000000069.html

I'd only add:

- nitpicking, but with Ruby on Rails, you'd probably use ActiveRecord (the ORM) to manage the data; by not enforcing constraints in the database itself, you'll get a tiny bit of orphaned data. Fine for an e-commerce website where customer service can send a voucher; sucks for the few students whose life is affected by your non-provably-correct solution.

- most non-technical users (most of the teachers I know included) confuse good and pretty UI. "Bad" legacy system usually means "not flat design/doesn't look like my iPad". This usually pairs with the idea that "it would just be a simple app, we can get it done for a grand".
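To illustrate the first point (constraints enforced by the database rather than only by the ORM), here is a minimal sketch using Python's built-in `sqlite3` module. The `students` and `grades` tables and their columns are hypothetical, made up for this example; the only claim is that a foreign key declared in the schema makes the database itself refuse an operation that would orphan rows.

```python
import sqlite3

# Hypothetical two-table schema; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """
    CREATE TABLE grades (
        id INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES students(id),
        course TEXT NOT NULL,
        mark TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO students (id, name) VALUES (1, 'Example Student')")
conn.execute("INSERT INTO grades (student_id, course, mark) VALUES (1, 'Algebra', 'A')")


def try_delete_student(student_id):
    """Return True if the delete succeeded, False if the database refused it."""
    try:
        conn.execute("DELETE FROM students WHERE id = ?", (student_id,))
        return True
    except sqlite3.IntegrityError:
        # The grade row still references this student, so the delete is rejected.
        return False


deleted = try_delete_student(1)

# Count grade rows whose student no longer exists (orphans).
orphans = conn.execute(
    "SELECT COUNT(*) FROM grades WHERE student_id NOT IN (SELECT id FROM students)"
).fetchone()[0]
```

Here the delete is refused and no orphaned grade rows exist, regardless of which application or script issued the statement. An ORM-level validation only protects writes that go through the ORM.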

"If you think you have the vision and the domain knowledge, and the technical chops to pull this off (but no budget and/or no executive cover), your best bet might be something like YC: fix this problem for ALL schools."

Also - more often than not connections and domain knowledge seem to be more important for success than pure technical chops if the solution to the problem is more about tailoring existing technologies to a domain rather than implementing technically novel things.

"No good deed goes unpunished."

Take heed. It took me years to understand this . . .

> To pull this off, you need a team, a budget AND hard executive sponsorship. That sounds unlikely for a public high school.

Also be prepared to fend off the professional magpies who will undermine you. "Why aren't you buying an Oracle solution? Their sales folks were in here just yesterday, and they can do all this and more. Most of it is already built in to PeopleSoft."

And at that point you either whip out your snickersnee, gird your loins and leap with joy into the political fracas, or (just guessing you're not the type who's a joy-leaper into political bottle battles) you back away slowly, make popcorn and watch events unfurl. It'll be ghastly, and unless you execute an evasion plan you'll wind up editing fucked up XML configuration files and hating, just hating yourself for opening your big fat mouth, and hoping that there really is a hellmouth located under the school. Right there. Under that CD of PeopleSoft. What could be worse?

I totally agree. School is a complicated environment. There is a reason that legacy systems survive so long. It is extremely difficult to introduce a new solution into a public school. There are regulations and many stakeholders. Just don't do it.

Having worked in a public educational setting before, I would recommend finding out whether you actually have an option to do something yourself before spending a lot of time on it. My experience tells me that the people running the system you're trying to get away from are likely to be territorial; it's extremely disappointing to get really invested in making something better only to have politics scuttle the effort.

There may also be regulatory things (FERPA, etc.) your solution would have to comply with.

Aside from that, your request is more ambiguous than you realize. You're going to have to be more specific about what you're trying to do.

You're right- it's a very complex process even trying to talk about bringing new systems into a public school. FERPA is not an inconsiderable force.

I've tried to give more concrete insight into what I'm trying to do, above; but I think part of the ambiguity is because it's very difficult for me to understand what we need so far. We need a technical solution to serve people who can't be expected to directly interact with it...


Thank you to everyone responding, especially those who are helping enormously by giving me real-talk. Here's some more information about my situation:

The legacy system I referred to has limitations that make it very uncomfortable to contemplate continuing to use it as our sole data source. Some issues:

- The database is updated once a day; any changes you add to it today won't show until tomorrow.

- Student absences (important so we can see cutting, lateness, etc.) can be recorded once per day; students are either "at school" or not.

- There are numerous categories of data that we have that can't be tracked or stored, for example, whether a student is involved in a sports team.

We have a large body of instructional data about how students are doing in class, but we're struggling to bring it together to get a full picture because as far as our main system is concerned, students receive a grade at the end of their course for their transcript, and that's that.

I'm not going to ignore what I'm being told and I think that building a new solution is beyond our capacity; we have to keep our focus on teaching the kids as well as we can. But we have a very young staff and an administration committed to exploring our options. I've already been looking at Tableau and it's so powerful and approachable that we can definitely use it. The difficulty I now face is what tool can we use to complement and expand the system that's causing a lot of grief.

We need to be able to store types of information including absences, student biographical info, grades; we want to be able to store student ID pictures, lesson plans, notes from teachers. Points of contact with parents. What we think we know is that we need a SaaS solution, we have regulatory compliance issues (not to mention ethical requirements) around security and privacy of our data, which many of you noticed.

I'm reading every single comment and will respond by email and here where it makes sense. We know we need help even to understand our problem. YC 2016 would be top of my priorities right now, but there are ethics rules preventing me from being a vendor to any NYC public school while I'm working at one. As much as it makes sense ethically, it really seems like that shouldn't be the department's focus when they have kids who can't read yet and they can't tell you which kids they are.

Hey - I'm a recent CS grad so this may be naive, but here's the easiest possible solution: export all the data from the old system, and throw it into a Google Docs Spreadsheet - Google Apps is free for education. [1] That's the simplest UI possible, not that easy to use though.

If you want a more custom tailored solution, you need a programmer to 1. create a dump of all your data 2. create a web frontend to display it and update it nicely. A club in my college did this kind of thing - they worked on simple websites for local companies. College students are a great option because this is an easy enough project (technically that is, I don't know about the politics) and they will work for free.

I don't know much about slicker SaaS solutions - Tableau is definitely pretty great. Sometimes SaaS companies will offer demos of the product over Skype and can answer your questions - I say take advantage of that, tell them your problem and let them answer how well their product can work for you.

[1] https://www.google.com/edu/products/productivity-tools/

Google Spreadsheets is absolutely not what a government school would consider private or secure, nor is it likely to be able to handle a database like this, as it tends to crap out a little ways into the 5-figure row region.

They also won't pay some college kids to do it. Enterprises, especially government enterprises, tend to require contractual support, which is probably outside the realm of what a college kid can or would want to do.

I was a technology administrator at a K-12 school before coming to NYU. What you describe are (unfortunately) pretty classic issues that tons of schools faced (as we did). Feel free to reach out if you want to chat about how to start tackling this. Would love to help in any way I can, and I'm local too.

I'd be thrilled to start a conversation with you, it would be great to take advantage of your insights.

I put an email in my profile

I just started doing database and training work for a school district that has been using Infinite Campus (https://www.infinitecampus.com/) for the last 5 years. Infinite Campus sounds like it is another package that would accomplish your goals.

Some positives:

- Browser based, modern browsers supported

- Permissions galore to minimize data exposure and to support external policies (FERPA, etc)

- Fulfills the brief requirements you outlined

- Iterative development, they drop releases faster than we're comfortable taking them

- Data is exportable, we're doing custom transcripts for the state college, custom exports for Clever, etc.

Some negatives:

- Proprietary

- UX doesn't seem to be a priority, it isn't horrible in most cases but it isn't great. To be fair, some of that may be due to feature creep in our installation / district ("Oh, you can develop custom fields and tabs? Great, let's make all of them"). There are already some workflows I've automated with AHK (again, our district policies are at least partially to blame).

Other notes

- Our installation is JSP / MSSQL, not sure if other databases are supported.

- Some reporting functionality built in, we've developed some external reporting around it to fill holes

- Custom development costs $$$ (not going to get this for free anywhere though)

- Old versions of IE have been left behind (I view this as a positive, IT departments might want to hold on to their IE 9 deployments)

- We used to use SASi, I don't know the reasons behind migrating away from it

I'd suggest evaluating Infinite Campus alongside some of the alternatives. I'd also see what other districts in NY are using and compare notes on the various state and district level customizations they've done. Discuss pain points alongside what just works and what shows up on the "I wish we could ..." list.

Contact me if you'd like more information.

Maybe the people who made or support this software can update the frequency of the batch jobs (e.g., every 15 minutes instead of every 24 hours)? After that, perhaps they can add a couple more fields for you to track the additional data.

I'd start with asking your administrator for that. If that person wants to go wild and bring in the million dollar software consulting team for a brand new solution, then let that be his/her idea, for your own benefit.


I really appreciate this insight, I think that would really be the ideal solution. I have satisfied myself that it won't be possible- the software we're talking about is one slice of a highly sophisticated network of services operated by the DOE- the largest and most complex school district in the country (and I know of no larger similar educational organization in the world).

I came into this believing that the DOE would address these issues if it could, that the system was simply too complex for anyone to be able to update it effectively. I hope that's true. What is certainly true is that there is an impending crisis in the management of educational data and one way to reduce the impact to our community is to begin cultivating a data management strategy which reduces our dependency and increases the robustness of our digital approach.

This might help some:


It's actually open source, you can host it yourself if you choose or they will run it for you for a fee. You can program against it... build plugins etc. There are also a large number of plugins already available. It's written in Ruby on Rails with a Cassandra DB.

Thanks for sharing this- as a school, we do already use Canvas! One of the ways we're attacking our data problem is to make more effective use of Canvas, but like all the other services we use, it's becoming clear that it is only part of the solution.

That said, it's my job to make sure we're not overlooking the ability to make our "problem" disappear by making better use of our resources; so I'll be aggressively reevaluating the chain of decisions that brought us to this moment.

Thanks for the update.

Can I ask if you guys are using PowerSchool SIS? It's pretty common in the educational sector, and it's designed specifically for medium-to-large school districts. I worked with it only briefly a very long time ago (when it was called SASi), as it started to replace the COBOL mainframe system that the school district I worked at had been using. I'd be a little surprised if it has all the problems you describe. At the time, I'm pretty sure it had support for everything you're asking for.

Are there any automated import/export options for the system you're currently using? Any way at all to minimize the amount of duplicate data entry you'd have to do? Even if you used something like AutoHotKey (http://ahkscript.org/) to do it?

I have a recommendation that's probably not going to be real popular with the rest of HN, because it's not 2015 Buzzword Compliant, but: have you considered just using FileMaker Pro?

It's cheap, they offer discounts for education, and it's fairly easy to use. You don't need to be a programmer to get going with it. It's really, really common, especially in the small business market, so finding people that can work with it is easy. They even have a pile of ready-made applications for schools (http://solutions.filemaker.com/made-for-filemaker/search.jsp...), including one that seems to hit your immediate needs (http://solutions.filemaker.com/made-for-filemaker/detail.jsp...) (caveat: looks like that version of their software might not be editable; the vendor wants you to contact them for a customizable version).

You could throw a FMP server together for $1,000 or less, software included, and be up and running with your first version in a day or two. Treat it like your minimum viable product, use it for end-user testing and figure out what you do and don't like about it, and then if you decide in one or two years that you still really need a custom system, you'll have a very detailed list of requirements that will help the build process a lot.

And FMP is pretty decent about playing nice with other software and allowing you to export data, so you don't have to worry too much about vendor lock-in.

FMP server: we just recently did a server upgrade for the Friendship Club, for really cheap. A Mac Mini works just fine as an FMP server for a small number of users, add in an external drive for Time Machine backups, then your FMP license and a license for the pre-made K12 FMP application to get you started.

Feel free to drop me a line at the email address in my user profile. I work a lot with clients with constrained budgets. I could bat around some options with you.

And if you're looking bigger-picture, pretty much the entire SIS market sucks. Somebody could come along and develop a new product that was modern and robust and it could quickly be better than all of its competitors. But, it would also be really tough to get that exponential growth curve that YC and its kin all want to see because so many school districts would be reluctant to go through the trouble of switching systems, no matter how bad their current system was.

You need two competent programmers and, more importantly, buy-in from about six people to let you hire them and substantially more once you go to implement the recommendations pulled from the data. You do not need a very complicated data storage solution, but your local IBM or Oracle rep would be more than happy to explain the benefits of their $100,000 offering over a steak dinner.

Can I recommend IBM Cloudant? The first $50 is free if I remember correctly, and you can hit the ground running with a webapp using one of the multiple solutions out there.

If you like software that has been around for decades, use Oracle MySQL, or, if you are liberal, MariaDB will do! I would personally recommend Postgres.

Now, the questions of how much they can get out of these datasets and how they are going to collect them are up for discussion.

But maybe just one competent teacher and a class full of enthusiastic kids can make for an interesting project.

I'm assuming 2 things - 1. The data is accessed by people, so 50ms of latency doesn't matter so much for most purposes. Writes/modifications are infrequent so even a global lock when writing would be ok. 2. You have about a few hundred GBs of data at most.

If that's the case, the technical problem doesn't really require anything novel. A simple SQL engine like SQLite works - it's very common, most programmers know or can pick up SQL, and it's mature/stable/battle-tested. I would go with that.
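A rough sketch of what that could look like, using Python's built-in `sqlite3` module and the per-period attendance problem mentioned in the update as the example. The `attendance` table, its columns, and the student IDs are all made up for illustration; this is not a proposed schema for any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real deployment would pass a file path instead
conn.execute(
    """
    CREATE TABLE attendance (
        student_id TEXT NOT NULL,
        day TEXT NOT NULL,          -- ISO date, e.g. '2015-09-14'
        period INTEGER NOT NULL,
        status TEXT NOT NULL CHECK (status IN ('present', 'late', 'absent')),
        PRIMARY KEY (student_id, day, period)
    )
    """
)

# Hypothetical sample rows: one record per student, per day, per period.
rows = [
    ("S001", "2015-09-14", 1, "present"),
    ("S001", "2015-09-14", 5, "absent"),
    ("S001", "2015-09-15", 5, "absent"),
    ("S002", "2015-09-14", 5, "late"),
]
with conn:  # one transaction: all rows are committed together or not at all
    conn.executemany("INSERT INTO attendance VALUES (?, ?, ?, ?)", rows)


def absences_by_period(student_id):
    """Return (period, absence_count) pairs for one student, ordered by period."""
    cur = conn.execute(
        "SELECT period, COUNT(*) FROM attendance "
        "WHERE student_id = ? AND status = 'absent' "
        "GROUP BY period ORDER BY period",
        (student_id,),
    )
    return cur.fetchall()


result = absences_by_period("S001")
```

The primary key on (student, day, period) is what lets this record more than one attendance mark per day, and the `CHECK` clause keeps free-text statuses out of the table.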

For a more detailed answer, maybe you could bring in a programmer or consultant to take a look, although they'll probably charge you way more than it's worth.

bonobo3000- I think you're right that, generally, the technical sophistication of this problem isn't super demanding. We're not storing millions of records, we're not (as far as I can tell) asking for anything "complicated". We probably don't have the technical sophistication to manage this ourselves, so we're probably looking at a hosting SaaS, as has been said in other places

What kind of data do you need to store? This is important, it will have impacts on everything from access to security to support to backups and redundancies.

What level of the architecture are you looking at? Do you need better hardware, a better filesystem, a better RDBMS, a better reporting interface for the data you already have?

What kind of budget do you have? It's a public school, so it's probably safe to assume $0, but it would be good to know what kind of administrative support you have for this project.

What, specifically, is wrong with what you have now? What isn't working? What are your frustrations with it?

You may not need two (or any) competent programmers, and I'd be reluctant to recommend Oracle or IBM unless you're looking to solve a problem that they are uniquely good at solving.

Thanks thaumaturgy-

I used your questions in writing the update post above, which hopefully provides some more clarity

Hey antigen8,

I am working on something very similar with a non-profit in NYC, for NYC schools. It's very exciting, we are building on NodeJS and MongoDB, and already have some data syncing set up with the DOE's systems. Can you shoot me an email (in my profile)? I would love to talk with you and make an introduction if it makes sense.

New projects are still using MongoDB? I thought people started to like their data still being there in 6 months..

Your data isn't relational?

I won't pretend to be an expert, but from reading the thread it sounds like a separate data store with the additional information [e.g. who is playing sports, who is not showing up in 5th period, etc.] could readily run alongside the existing system.

To put it another way, the existing system is built to solve a particular set of problems. The set of problems that you would like to solve are mostly orthogonal. The existing system is designed to include every student, the problems you want to solve only involve some students. Essentially the problems of interest are fine grained and the legacy system is inherently coarse grained. The problems have soft answers, the legacy system is for hard answers.

In reality, the duplication of effort for a parallel system is pretty much entering the name of the student of interest into the new system. Everything else...sports teams, fine grained absences, etc. has to be entered regardless...and a lot of information such as home address doesn't really need to come across because it's only relevant at the point where a specific action requires it.

This is a case where most of the problems of interest don't depend on data normalization because the problems are mostly related to a small number of individuals and are handled on a case by case basis. Document store and search are fine.
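As a sketch of the "document store and search" idea under stated assumptions: free-form records are kept as JSON text in a single SQLite table, keyed by the student ID the legacy system already uses. The `notes` table, the `add_note` and `search` helpers, and the sample records are hypothetical and only meant to show the shape of the approach.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (student_id TEXT NOT NULL, kind TEXT NOT NULL, body TEXT NOT NULL)"
)


def add_note(student_id, kind, **fields):
    """Store one free-form record as a JSON document."""
    with conn:
        conn.execute(
            "INSERT INTO notes VALUES (?, ?, ?)",
            (student_id, kind, json.dumps(fields)),
        )


def search(term):
    """Substring search over stored documents (SQLite LIKE ignores ASCII case)."""
    cur = conn.execute(
        "SELECT student_id, kind, body FROM notes WHERE body LIKE ? ORDER BY rowid",
        (f"%{term}%",),
    )
    return [(sid, kind, json.loads(body)) for sid, kind, body in cur]


# Records of different shapes live side by side; only student_id is shared.
add_note("S001", "sports", team="soccer", season="fall")
add_note("S001", "absence", day="2015-09-14", period=5)
add_note("S002", "parent_contact", summary="Called home about period 5 lateness")

hits = search("soccer")
```

Note that the search matches JSON keys as well as values, and that `%` or `_` in the search term act as wildcards. That is acceptable for a case-by-case lookup tool, less so for anything that feeds a transcript.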

Build a system that scratches the actual itches as you have them, not the system to end all systems.

Good luck.

YC W16 applications are open, sounds like you could give it a go if you think that this problem could apply to all schools and that schools would be willing to pay for it.

I think we all know that Education is one sector, not unlike legal services, that could stand for a big wake up call.

Lots of challenges to getting new or useful tech into schools, not least of all that everyone is too busy teaching to be able to pull new tools into their chain.

You're not wrong though: it would be grand to chip away at some of the entrenched, systemic inefficiencies.

I assume you are managing classical administrative data like student records and so on. For this case I advise a relational database like Postgres.

I work in ed tech and just implemented a similar system. Shoot me an email. I'll gladly chat about it privately.

> we need someplace to store our stuff

What do you mean by "stuff"?

This isn't going to be a problem you can solve with a quick question on HN. You need to sit down with someone with a list of project goals and a budget. I'm happy to chat, because I know how frustrating it can be from your side of things. My email is in my profile.

The email field on hn profiles is hidden, if you want people to see it you should post it in the about field.

Oops thought it was. Thanks!

I don't think it is...

You should update your post and give examples of what you're storing. Test scores, word docs?

yes...it could be as minimal of a need as just needing a file server/NAS. I teach at a middle school where even the simple task of turning in an assignment as a file and not a piece of paper is a huge chore with no institutional support. Everything is either sneaker nets of USB flash drives, or convoluted processes for working around online services' general expectation that everyone has an email address.

Anyways, OP has given so little info it seems a bit of a waste that they have received so many long replies without even knowing what the problem is.

You're totally right, and one of the limitations on the info I was able to include was that it's very difficult for me to know how to describe what we're trying to do, not having the expertise. We seem to be trying to find a SaaS solution that will allow us to store academic records, and is flexible enough to accommodate things like student work samples, assessment data, and other forms of information that we might not completely anticipate.

I know we can't build what we need from scratch, and I know that we need something more sophisticated than a NAS- we don't want to invest the capital and time to create an inhouse technology stack, we use google apps for our student and staff accounts and document hosting, and it seems like our data storage needs to be a similar kind of managed database product which allows our office staff to input new records through internal web forms and similar approachable tools.

It depends on what kind of data you are talking about. There are legislative / regulatory requirements for some data, as it is required to be public. Other data is required to be private. There are vendors that have systems you can buy off the shelf, specifically for schools/districts, and other vendors that have more generic solutions that you can customize yourself.

But it is hard to even start such a conversation without knowing what you are really talking about.

You might get some value from having your superintendent talk to other districts - you may be able to share solutions. Also, NYSSBA has some partnerships with software vendors, so if your district is a member, there may be some answers from them as well.

Don't buy Blackboard? Without knowing a little more about what you really want to do, I couldn't recommend anything else.

Educational software (admin or teacher-facing especially) tends to be garbage

You could check out Fieldbook: https://fieldbook.com/?rc=VJTdQhbp

It's a simple data store that feels like a spreadsheet but lets you organize like a database (disclosure: I'm a founder). Happy to help you get set up with it.

I'm an NYU student in Education and Digital Design - HIGHLY interested in your problem. Can you describe a bit more about what the issue you are facing is? (what kind of data/how it is accessed/ etc.) With this info people on here are bound to give more specific and useful advice.

Hey ChicagoBoy11-

I wrote an update above with some more information on the types of data we need to accommodate, but feel free to reach out on the email listed in my profile, I'd love to make contact with you

Thanks Alec, I'll be reading a lot about these today!!

for an open spec meant for your specific use case, check out "Tin Can API" / "Experience API" / "xAPI":


specifically the idea of the "Learning Record Store".

it is the new open standard for learning, training, and experience, etc... backed by the DoD and ADL to replace SCORM (previous standard used by things like Blackboard and MOODLE and LMSs 'learning management systems').
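for a feel of what a Learning Record Store holds: an xAPI statement is a JSON object built around an actor, a verb, and an object. the sketch below builds one in Python. the `make_statement` helper, the email address, and the activity URL are placeholders invented for this example; the verb IRI is one of the ADL-published verbs.

```python
import json


def make_statement(email, name, verb_id, verb_label, activity_id):
    """Build a minimal xAPI-style statement: who did what to which activity."""
    return {
        "actor": {"objectType": "Agent", "name": name, "mbox": f"mailto:{email}"},
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {"objectType": "Activity", "id": activity_id},
    }


statement = make_statement(
    email="student@example.org",  # placeholder identity
    name="Example Student",
    verb_id="http://adlnet.gov/expapi/verbs/completed",
    verb_label="completed",
    activity_id="http://example.org/activities/algebra-unit-3-quiz",  # placeholder
)

# Serialized form, as it would be sent to a Learning Record Store.
payload = json.dumps(statement, indent=2)
```

a real LRS adds things like a statement id and timestamps, and expects specific HTTP headers, so treat this only as the data shape, not as a working client.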

let me know if you have any questions, i work and research in this space.

Can you give a little more information on what exactly you want to do? My "home" gig is a community college and we have done data gathering and storage, but I'm not quite sure what your needs are.

Hey protomyth,

Mine is a public high school with a desire to collect information about student assessment, attendance, and other educational records from their disparate sources together into one home base where the data can then be explored using tools like Tableau.

We have limited knowledge of databases, SaaS services, and managed database products. What I think we need is a SaaS data hosting service with an aligned mission, and I'm not sure how best to evaluate the offerings or what the possible products might be.

I know that my ideal product is one which allows our teachers and office staff to send arbitrary types of data to the store using internal webforms and such so that they don't need to have a direct understanding of how to use, interact with, or manage the data itself.

Not sure what to tell you. We have a current project to do the same, but it is still in the infancy stage and has no requirements other than retrieve the data on demand (search is fine). I had an old project with a bit of structure and that was quite a lot easier. Post if you find anything and I will do the same as we move forward.

And it all makes you wonder if the NYC school system is the next to be disrupted. A budget of 25 billion ($$$) a year. That's 10X the total VC invested in NYC last year.

Expenditures of $20K per student.

My school used the products from https://iserv.eu/ They sell a complete software solution for servers at schools.

You don't want to build your own data store. You want a better solution than the legacy system you're dealing with now.

I work contract jobs, which has given me the opportunity to see many different approaches to IT infrastructure, team organization, workflow management, IT solutions, etc, in companies from some of the largest in the world to mom and pop businesses. I've only ever seen companies in the large to enormous range engaged in the type of project you're proposing.

It wouldn't just require software engineers and software licenses, you'd also need prob two business/data analysts with appropriate experience in data warehouse design and software migration to draw up requirements (if you don't want it to crap its pants 10 months into production use, thus making it the new "Legacy System") and end user / operations documentation / training material. Then you need the software engineers (prob one application engineer and one database engineer) to turn those requirements into a working product. Oh, and these people mentioned thus far will be too busy with this work for 3-9 months to be doing anything else, and then generally not stay on after because there won't be any appropriate work for them once it's up and running and they've trained the end users. Next you'll need at least one person whose primary role is maintaining the system, even if 50% of their time is free to do other stuff. They need to be able to drop what they're doing at all times to address production issues, which will inevitably occur from time to time. This is assuming your organization's needs are static and you won't be keeping a developer on to iterate new releases in the future with new / different functionality, to keep your shiny new system relevant and from becoming "legacy".

Or, you can do what most of the business world does and get a SaaS solution. Many SaaS (Software as a Service) vendors provide free / cheap licensing options for educational institutions. Even if you can't find a solution which offers a free license, for your use case it'll be cheaper in the long run to go SaaS rather than attempting to bootstrap your own Dev shop, and paying for all the mistakes that come along with that enterprise.

What you're looking for is an end user SaaS solution from a vendor that will continue to innovate and has open-source at its core (these types of companies generally don't leverage closed source code to lock in customers, and if they do they are at least easier to migrate away from). You're also looking for a vendor whose mission is in line with your organization's mission. This requirement is an intangible, but one that will pay massive dividends down the road. Finally, steer clear of any solution that includes the first step of: "Pay our consultant to come to your office to migrate your legacy system to Solution XYZ (which we didn't actually design and build, but trust us, we're certified)." I could make a living on just the contract jobs where I'm paid to come in after these companies and clean up their mess or finish implementing functionality they weren't able to implement before going over budget. Additionally, this type of company almost never cares to understand the domain specific issues or instance specific issues relevant to the customer. They may "specialize in and only do educational software / institutions", but more often than not, the result is boilerplate code from their last three jobs dropped into your servers, totally ignoring the specifics of your system, historical data, and historical workarounds.

To prevent this issue, and to do it right, you should consider either hiring or training one employee to be full time on this software, and have that employee be someone that has intimate knowledge of both your old system and your organization's operations. This employee must also show a willingness to adopt best practices and work from a place of consensus (both as relates to technical AND organizational decisions), rather than someone who will stubbornly implement what they are already comfortable with and refuse to listen to anyone that is not as technologically savvy as they consider themselves to be. This person should consider themselves a steward of the school's system and its many stakeholders (and maintain buy-in and seek the input of the other stakeholders, who should include a senior administrative "champion" of the project who also fosters consensus rather than delivering decrees), and not consider the school a user of their system. This person should aim to learn the solution inside and out, and develop an effective workflow / ticketing system to see that the system is responsive to its end users' needs. One of the first orders of business for this person should be to document all aspects of their job. This documentation should be updated whenever any aspect of the role changes. This is the so-called "bus manual". If that person is hit by a bus on the way into work any particular morning, the mysteries of operating your system do not go with them. Morbid, but if you only take two things from this post, let this be the first.

Insist on ease of use of the product and clarity of the software licensing terms (so that you aren't surprised by unforeseen costs if you require scaling up, adding new users, or making the system available via a web site, all things which will inevitably happen). To avoid vendor lock-in and avoid paying shoddy, expensive consultants to implement the system, insist, I say again, insist on the following two points. If you take away only two things from this post, let this two part point be the second thing:
2a) An open data model and unfettered / unlimited API access to your data
2b) Excellent end user documentation, both for the API and the application
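One way to check point 2a before you sign anything is to try pulling every record out through the vendor's API and writing it to an open format. Below is a minimal sketch of that check in Python. Everything vendor-specific is an assumption: `fetch_page(offset, limit)` is a hypothetical stand-in for whatever paginated call the real API offers, and the `student_id` / `grade` fields are made up for illustration.

```python
import csv
import io
from typing import Callable, Dict, Iterator, List

Record = Dict[str, str]
# Hypothetical signature: fetch_page(offset, limit) -> list of records.
FetchPage = Callable[[int, int], List[Record]]


def iter_records(fetch_page: FetchPage, page_size: int = 100) -> Iterator[Record]:
    """Yield every record by paging until the API returns a short or empty page."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:
            return
        offset += page_size


def export_csv(fetch_page: FetchPage, fields: List[str], page_size: int = 100) -> str:
    """Write all records to CSV text, an open format any other tool can read."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in iter_records(fetch_page, page_size):
        writer.writerow(record)
    return buffer.getvalue()


# Made-up rows standing in for a vendor API, so the sketch runs without a network.
_FAKE_ROWS = [{"student_id": str(i), "grade": "A"} for i in range(1, 6)]


def fake_fetch_page(offset: int, limit: int) -> List[Record]:
    return _FAKE_ROWS[offset:offset + limit]


if __name__ == "__main__":
    print(export_csv(fake_fetch_page, ["student_id", "grade"], page_size=2))
```

If a loop like this can't reach all of your data, or the vendor throttles or charges for it, that is the lock-in showing up early.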

I'd suggest starting with checking out Tableau, Socrata (http://www.socrata.com), and Informatica. You want something that is as dead easy to use and as powerful as Tableau. And you want something that is as awesome with your data as Socrata. I've worked with a Dev from Socrata at a hackathon, and I was really impressed. They've really got their heart in the right place and they have an awesome product. Informatica would be a more powerful/expensive alternative to Socrata, but likely overkill for your needs. I'm sure there are other products out there that will work for your organization with which I'm unfamiliar. Spend a whole day sitting with, and looking over the shoulder of, people who have already implemented any solution you're seriously considering. Ask them about their experience of implementing it and utilizing it. Also speak with people that may have gone with another solution, but started out with the same problem you're trying to solve. Ask them what they learned, what they would have done differently, their biggest headaches / opportunities, etc. Don't be afraid to solicit guidance through the process from someone at a meetup or university. I'm certain someone would be happy to serve as a sounding board. Just be wary of anyone eager to take an active role in the project. They are not a long term stakeholder, so they should only be fielding questions and offering wisdom, not driving the process, at all.

Best of luck! :) I have several family members who are educators; I wish you success!!!


I can't thank you enough for the depth and thoughtfulness of your response. You're 100% right about looking for a SaaS solution, and about finding one that is aligned with our mission. I've been experimenting with Tableau for the last two days and it strikes the perfect balance of incredible power and usability.

From what I can tell, Socrata is focused on open data and sharing data with outside organizations. Their product is what we're looking for, their mission is spot on with ours, but our data has to remain secure and private. We need a Socrata for internal use.

I've shared your thoughts with my administrators, thanks again

May I suggest you have a look at OpenDataSoft? This is a SaaS solution which lets you publish data both externally AND internally with fine grained access control. It has advanced data processing and data visualization capabilities. Feel free to get in touch if you would like to give it a try.


Disclaimer: I work for OpenDataSoft.

Thanks for mentioning this, I'll definitely check it out

Hello antigen8, I wanted to clarify my reasoning for including Socrata in that list of software suggestions. All-in-one solutions often don't address, and sometimes actually create, a coupling between your data and the applications you use to access it or create value from it.

Your initial post mentioned a data store. And we've talked about Tableau. A data store, data warehouse, or database will usually be separate from the applications making use of the data. Analytics software, such as Tableau, makes use of the data source, but is not necessarily meant to be your data source. (http://www.tableau.com/solutions/data-sources and http://www.tableau.com/solutions/environments)

One solution to this problem is to go with an "API-First" architecture pattern. (https://www.leaseweblabs.com/wp-content/uploads/2013/10/api_...)

The API-First pattern creates an abstraction layer between your application and data source. You can then make changes to your data source (say, switch from MS Access to MS SQL Server, or go from using a collection of Excel files to a single Oracle database) without requiring any changes to your applications. Or, conversely, you could switch out, add to, or remove some of your applications (say, switch from Tableau to Pentaho) without having to touch your data sources. http://onlinehelp.tableau.com/v8.0/pro/online/en-us/extracti... http://www.tableau.com/new-features/data-engine-api-0 http://open-source.socrata.com/architecture/
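To make the abstraction layer concrete, here's a rough sketch in Python. None of this is Socrata's or Tableau's actual API. `StudentStore`, `CsvStore`, `SqlStore`, and `roster_report` are names I made up, the CSV text stands in for a collection of spreadsheets, and SQLite stands in for a real database server. The point is only that the "application" (`roster_report`) talks to the interface and never to the storage underneath.

```python
import csv
import io
import sqlite3
from typing import List, Protocol, Tuple


class StudentStore(Protocol):
    """The API contract. Applications depend on this, not on a storage engine."""

    def names_in_grade(self, grade: int) -> List[str]: ...


class CsvStore:
    """Backend 1: data kept as CSV text (stand-in for a pile of spreadsheets)."""

    def __init__(self, csv_text: str) -> None:
        self._rows = list(csv.DictReader(io.StringIO(csv_text)))

    def names_in_grade(self, grade: int) -> List[str]:
        return sorted(r["name"] for r in self._rows if int(r["grade"]) == grade)


class SqlStore:
    """Backend 2: data kept in a SQL database (SQLite here, as a stand-in)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def names_in_grade(self, grade: int) -> List[str]:
        cur = self._conn.execute(
            "SELECT name FROM students WHERE grade = ? ORDER BY name", (grade,)
        )
        return [row[0] for row in cur]


def roster_report(store: StudentStore, grade: int) -> str:
    """The 'application': it only knows the API, so either backend works."""
    return f"Grade {grade}: " + ", ".join(store.names_in_grade(grade))


def build_demo_stores() -> Tuple[CsvStore, SqlStore]:
    """Load the same made-up sample data into both backends."""
    csv_text = "name,grade\nAda,9\nGrace,10\nAlan,9\n"
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, grade INTEGER)")
    conn.executemany(
        "INSERT INTO students VALUES (?, ?)",
        [("Ada", 9), ("Grace", 10), ("Alan", 9)],
    )
    return CsvStore(csv_text), SqlStore(conn)


if __name__ == "__main__":
    for store in build_demo_stores():
        print(type(store).__name__, "->", roster_report(store, 9))
```

Swapping `CsvStore` for `SqlStore` changes nothing in `roster_report`, which is the same property you'd want when moving from Excel files to a real database behind an API.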

So really Socrata is a super user friendly API solution, which your school can use completely privately and securely in-house, in conjunction with your data sources and applications. The CEO of Socrata actually speaks directly to this use-case here: https://www.quora.com/Are-there-any-products-like-Socrata-bu...

One scenario would be using PostgreSQL (to store your data) + Socrata (API layer) + Pentaho (analytics, visuals, and reporting). Then, if you don't like Pentaho, you can switch it out for Tableau without having to touch PostgreSQL. Or, if you decide you don't enjoy managing your PostgreSQL instance, you can switch to Google Cloud SQL (https://cloud.google.com/sql/docs/introduction).
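Here's a toy version of that three-layer stack in Python, just to show the shape of it. Big assumptions: SQLite stands in for PostgreSQL, a hand-written WSGI function stands in for the API layer (this is not how Socrata works internally), and the `/attendance` endpoint and its table are invented for the example. `call_app` plays the role of the analytics tool, which only ever sees JSON.

```python
import json
import sqlite3
from typing import Any, Callable, Dict, Iterable, Tuple


def make_app(conn: sqlite3.Connection) -> Callable:
    """Build a tiny read-only WSGI app exposing one hypothetical endpoint:
    GET /attendance -> JSON list of {"student": ..., "days_absent": ...}."""

    def app(environ: Dict, start_response: Callable) -> Iterable[bytes]:
        if environ.get("PATH_INFO") != "/attendance":
            start_response("404 Not Found", [("Content-Type", "application/json")])
            return [b'{"error": "not found"}']
        rows = conn.execute(
            "SELECT student, days_absent FROM attendance ORDER BY student"
        ).fetchall()
        body = json.dumps(
            [{"student": s, "days_absent": d} for s, d in rows]
        ).encode("utf-8")
        start_response("200 OK", [("Content-Type", "application/json")])
        return [body]

    return app


def call_app(app: Callable, path: str) -> Tuple[str, Any]:
    """Invoke the WSGI app in-process; a reporting tool would do this over HTTP."""
    captured: Dict[str, str] = {}

    def start_response(status: str, headers: list) -> None:
        captured["status"] = status

    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


def demo_connection() -> sqlite3.Connection:
    """Storage layer with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE attendance (student TEXT, days_absent INTEGER)")
    conn.executemany(
        "INSERT INTO attendance VALUES (?, ?)", [("Ada", 2), ("Alan", 0)]
    )
    return conn


if __name__ == "__main__":
    status, data = call_app(make_app(demo_connection()), "/attendance")
    print(status, data)
```

To try it over real HTTP, the same `app` can be handed to `wsgiref.simple_server.make_server` from the standard library. Replacing the storage only touches `demo_connection` and the SQL inside `make_app`, and the consumer side doesn't change as long as the JSON shape stays the same.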

Just want to say that I have zero affiliation with any of these companies, and that I'm sure there are other products out there that would accomplish the same solution, architecture wise.

Another link that may help decide if this solution might be worth looking into more: http://customer-summit.socrata.com/sessions/added-value-open...

And, since I mentioned it above, you might want to look at http://www.pentaho.com/.

Regardless, I'd spend some time on this site before making any decisions: http://apievangelist.com/. And glance through this book: http://htchttp.s3.amazonaws.com/books/apis_a_strategy_guide....

If you choose to build this, there is an opportunity cost because you could be building something more important.


It would be great if you could expand on this.
