Machine Learning Models Are Missing Contracts (gradio.app)
31 points by Aliabid94 on Jan 19, 2021 | 7 comments



I cannot agree with OP more. ML model code itself is too often seen as documentation for a paper, in the sense that authors implicitly expect users to go through the pre-processing pipeline to find the actual implementation steps.

This is because the nitty-gritty of data handling and pre-processing is not actually interesting from an academic perspective.

I have lost count of the times I looked at a paper's implementation source code for a benchmark dataset and found that it cut off annotated sequences at a fixed max length, essentially turning the benchmark into a different dataset and making comparison to previous work invalid.

Good documentation costs time, time academics don't have.
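
To make the truncation problem above concrete, here is a minimal Python sketch of a pre-processing step that reports how much data it cut instead of doing so silently. The names `truncate_with_report` and `TruncationReport` are hypothetical and not taken from the article or from any particular paper's code.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class TruncationReport:
    """Summary of what a fixed max length did to a dataset (hypothetical helper)."""
    max_len: int
    total: int
    truncated: int  # number of sequences that lost at least one element

    @property
    def fraction_truncated(self) -> float:
        return self.truncated / self.total if self.total else 0.0


def truncate_with_report(
    sequences: Sequence[Sequence[int]], max_len: int
) -> Tuple[List[List[int]], TruncationReport]:
    """Cut every sequence to max_len and return a report alongside the data."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    clipped = [list(seq[:max_len]) for seq in sequences]
    # Count the sequences that were actually shortened.
    n_truncated = sum(1 for seq in sequences if len(seq) > max_len)
    report = TruncationReport(max_len=max_len, total=len(sequences), truncated=n_truncated)
    return clipped, report


# Usage: the report can be logged or published next to benchmark numbers.
data = [[1, 2, 3], [4, 5, 6, 7, 8], [9]]
clipped, report = truncate_with_report(data, max_len=3)
print(report)
```

Publishing the report with the results would at least tell readers that the evaluated dataset differs from the original benchmark.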


I think ML as a field just needs to really mature. A lot of the work feels super hacky, sort of a mix between research, demos, and production.


More specifically, at least in industry, we need SRE / ops support to mature. Teams take people who are highly specialized in the research layer of a statistical computing problem, overload them with credential management, Kubernetes config, web service hardening, efficient data pipelining, and so on, and then treat them as immature when they struggle. That is a whiny and immature thing to see from infra / ops team leaders. It burns out ML engineers and wastes a lot of money, because the business fails to extract value from their comparative advantage, all because infra / ops leaders can’t get it together and solve ML coordination problems.


Right. I think it ignores the importance of the data when building a model. Even great data can lead to difficult modeling scenarios. The premise that a “solution” can be guaranteed (or even partially guaranteed) to be useful is misleading.


I don't think the author would disagree with you. In fact, I think this article was highlighting one specific area in which the field could improve.


What's the difference between a test and a contract? I agree that code in the ML space needs to be more rigorously tested, especially the data flowing in and out. But how are contracts different?


A contract defines how the parties should interact. A test determines how something is or behaves. Contract tests are a thing.
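
One way to picture the distinction, as a rough Python sketch: the contract is checked on every call and states what the caller must supply and what the model promises back, while the test checks one fixed example. The feature ranges, `with_contract`, and `toy_model` are all made up for illustration and are not from the article.

```python
from typing import Callable, List

Model = Callable[[List[float]], float]


def with_contract(predict: Model, n_features: int) -> Model:
    """Wrap a model so every call is checked against its stated contract."""

    def checked(features: List[float]) -> float:
        # Precondition: what the caller must provide (assumed for illustration).
        if len(features) != n_features:
            raise ValueError(f"expected {n_features} features, got {len(features)}")
        if not all(0.0 <= x <= 1.0 for x in features):
            raise ValueError("features must be scaled to [0, 1]")
        score = predict(features)
        # Postcondition: what the model promises in return.
        if not 0.0 <= score <= 1.0:
            raise RuntimeError("model broke its contract: score outside [0, 1]")
        return score

    return checked


def toy_model(features: List[float]) -> float:
    """Stand-in model: the mean of its inputs."""
    return sum(features) / len(features)


def test_toy_model_on_known_input() -> None:
    # A test: observes how the model behaves on one specific example.
    assert toy_model([0.0, 1.0]) == 0.5


model = with_contract(toy_model, n_features=2)
test_toy_model_on_known_input()
```

A contract test would then be a test that checks whether an implementation honors the contract, for example by asserting that `model([0.0, 2.0])` raises `ValueError`.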



