
Ask HN: What can I do to accelerate scientific research? - mariushn
I love science & tech and how these improve lives. As a software engineer/entrepreneur, in the last few years I thought of starting/contributing to some projects which scientists would find useful. Now I'm ready to work full time on this.

Ideas revolve around:

* indexing all open research with free unlimited access, similar to arxiv-sanity.com but better. Other projects exist though: Google Scholar, semanticscholar.org, academic.microsoft.com, https://www.chanzuckerberg.com/science/projects-meta

* generative design

* bioengineering (not sure exactly what, eg a microbiota simulator)

* materials simulator (eg how can we get a material having a given set of properties)

I don't need immediate financial returns, but I do need the work to be used & have an impact in real life projects.

What ideas do you have on how one can accelerate scientific research?
======
biophysboy
This is a pipe dream of mine, but I would love a Wikipedia of null results:
nullpedia! Nobody publishes null results bc they’re not exciting, but I think
a lot of NSF money would be saved if “failed” experiments were aggregated
somewhere in a searchable way.

There are lots of questions. How to organize it? How to encourage participation?
How to maximize usefulness while at the same time minimizing volunteer effort?
How to encourage discussion (suggesting changes for a better exp design)
rather than manipulation (stealing the seed of a bad experiment to publish at
your better funded lab)?

I don’t know how to do it, but I think if done right people would really like
it.

~~~
mariushn
Good idea, and great challenges indeed. Taking this further, I'd call it
experimentpedia, with experiments published as they are thought out: open
for comments/review, showing related experiments before the actual
experiment is done. This might prevent potential failures and let owners
tweak the planned experiment beforehand. Then do the experiment and publish
the results, whatever they are.

I guess secrecy would actually win though, and nobody would use this?

~~~
biophysboy
eLife is sorta like this. The experiment isn’t live, but the publishing
process is.

[https://elifesciences.org/about/peer-review](https://elifesciences.org/about/peer-review)

It’s pretty interesting

------
claudius
Provide funding for permanent positions. I'm currently in the last few months
of my first postdoc in condensed matter physics (think superconductors,
quantum computers etc.) but will move into industry next year for lack of
perspectives towards a permanent position in a reasonable place. As far as I
can tell, my research so far was not substandard and the software I wrote has
enabled quite a few projects which otherwise would have not been possible or
taken much longer. Most people I talk to (both inside and outside academia)
express some degree of disappointment over people like me leaving (after 4
years PhD + 2 years Post-doc) but none of them put their money where their
mouth is.

To be clear, I can understand that a PhD candidacy justifies a temporary
contract and I'm not even asking for a permanent position directly after a PhD
(as would be standard in industry), I'm only asking for a reasonably safe
perspective towards a permanent position reasonably soon after graduating.
Can't exactly start a family if you don't have any kind of job security beyond
the next couple months.

~~~
mariushn
Excellent point. I can't afford to provide funding, but I can fund myself for
3 years to work on useful software.

> the software I wrote has enabled quite a few projects which otherwise would
> have not been possible or taken much longer

What's the common practice with such software? Is that published somewhere,
open sourced? Or kept private in hopes of being monetized, with IP owned by
the author/university?

~~~
claudius
At the moment it's "available within collaborations". My former supervisor has
had some bad experiences with people using his open-sourced software without
acknowledgement etc., which is of course not quite ideal if you actually want
to build a career in academia for yourself. Monetisation is not really an
option.

My toolkit is maybe a bit non-standard in that it has attracted a few external
collaborators using it as well and I like to think I have taken better care of
upholding coding standards, documentation etc.

Normally software in my field is kept within a group and dies after one or two
PhD students have left.

~~~
fghtr
This is a very sad state of affairs, therefore I would like to bring your
attention to a petition towards open sourcing all scientific (and generally
tax-paid) software:

[https://publiccode.eu](https://publiccode.eu)

~~~
cosmodisk
This is a great initiative. It should not be limited to software: research
papers, databases and many other things that are currently either not
available at all or are behind paywalls should be released.

~~~
mariushn
For research papers, there's also this open access initiative which is gaining
support: [https://www.coalition-s.org/](https://www.coalition-s.org/)
[https://en.wikipedia.org/wiki/Plan_S](https://en.wikipedia.org/wiki/Plan_S)

------
randomsearch
The biggest problem scientists in academia face is that they can’t do actual
research. They spend most of their time doing stuff that doesn’t matter. So
probably the biggest leverage you have is to reduce the time they spend doing
stuff that doesn’t matter.

The most effective way to do that is to establish an alternative form of
institution that focuses only on research (and teaching if you like - teaching
is not the problem). A tough challenge. One line of attack would be to
contrast your costs with academia's full economic costing model.

Failing that, here’s some things you can do:

\- develop a paper reference system that actually works well. Mendeley is the
best there is and it’s (IMO) rubbish. Poorly designed.

\- build a typesetting system that’s an alternative to latex but is WYSIWYG.
Particularly important outside of CS.

\- build a free conference organising / review tool that works. EasyChair is
popular and utter garbage.

\- build tools to automate the grant writing process. A step by step system to
create a grant proposal, tailored to each grant scheme. Yes, this would
potentially damage the grant application process. But it doesn’t make any
sense anyway, so at least this would free up a few years where academics could
churn the required proposals out fast until you were somehow disallowed.

\- provide free slide materials, examples, exercises for all the main CS etc
topics. Some kind of “piece it together” kit so lecturers could save time
making slides and other materials that tick boxes. Diagrams, for instance,
would be very useful. Pseudocode too.

NB you probably won’t make any money.

~~~
andrepd
>develop a paper reference system that actually works well. Mendeley is the
best there is and it’s (IMO) rubbish. Poorly designed.

Zotero works wonderfully for me. And it's open, which is a must.

~~~
Reelin
I'll preface this by clearly stating that Zotero is absolutely indispensable;
I wouldn't be nearly as organized without it. It's very important to me that
such tools be open, and Mendeley in particular is a complete disaster in this
regard (see their history with encrypting user data).

That being said, Zotero is very much a "least worst" tool in my opinion.

* Overly rigid in how it goes about modeling document types and metadata fields.

* Doesn't handle bookmarks, browsing history, and other various data types. At first glance it's easy to dismiss such things as out of scope, but I find that my typical workflow results in reams of such unsupported data being generated and manually tracked by me. The problem is that this unsupported data is often tightly coupled to the data I'm managing with Zotero, which is frustrating to say the least.

* An incredibly heavy and inefficient piece of software.

* It's far too difficult to set up and manage my own sync server (last time I checked, at least). I don't really _want_ to share all my data with the developers, but it's very inconvenient not to do so.

More on topic with the broader discussion - knowledge and data management in
general seems to be a largely unsolved problem, particularly in science and
particularly regarding interrelations between and versioning of arbitrary
pieces of data.

~~~
gsjbjt
Why don’t data versioning tools like git lfs do the job? Is it lack of
awareness or is the problem more complex than that?

~~~
Reelin
Well I could just be unaware of some functionality they have, but all those
tools do is version things. There's no integration with reference managers
like Zotero (that I'm aware of), and no tracking of interrelations or
metadata.

In contrast, Zotero (and other reference managers) don't do any versioning at
all (at least that I'm aware of). Instead, they keep track of the metadata
that's necessary to put together a works cited section for an academic paper.

... or at least that's what they started out doing. These days they also try
to organize your papers into some sort of category structure, facilitate
tagging and notes, provide synchronization between your devices, and probably
a few other things that don't come to mind right now.

Feature creep? Sure, but all that stuff is central to the research and writing
process. It's also all tightly coupled, so splitting it between multiple tools
doesn't work very well. And that's the current problem - how to integrate, for
example, a few of your browser bookmarks with your academic literature
collection. Or how to track a list of all the papers cited by a particular
paper. Or link a specific paper tracked by your reference management software
against a specific version of a large data set, perhaps itself tracked by Git
LFS.

Generalizing a bit, what about linking experimental notes (typically pen and
paper) with data collection software (typically a binary), as well as the
collected data (perhaps Git LFS), as well as a specific version of some data
analysis scripts you wrote (perhaps Git). Now try to track everything as you
work on multiple paper revisions with collaborators, each version of which
adds (and sometimes removes) citations and could use a different (likely
newer) revision of the collection software, data set, or analysis scripts.
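
One minimal way to express that kind of linking is a plain manifest per paper revision. The sketch below only illustrates the data structure, not any existing tool; every name, version, and commit hash in it is made up:

```python
import json

# Hypothetical manifest tying one paper revision to the exact versions of
# its inputs: citations, collection software, data set, and analysis code.
manifest = {
    "paper": {"title": "Example paper", "revision": 3},
    "citations": ["doi:10.0000/example.1", "doi:10.0000/example.2"],
    "collection_software": {"name": "acquire", "version": "2.1"},
    "data_set": {"tracked_by": "git-lfs", "commit": "abc123"},
    "analysis_scripts": {"tracked_by": "git", "commit": "def456"},
}

# Keeping the manifest next to the paper source means it gets versioned
# along with the paper itself.
with open("revision-3.manifest.json", "w") as f:
    json.dump(manifest, f, indent=2)

with open("revision-3.manifest.json") as f:
    loaded = json.load(f)
print(loaded["data_set"]["commit"])  # -> abc123
```

The hard part isn't the file format, of course; it's getting every tool in the chain to read and write links like these.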

Alternatively, for a data management scenario not directly involving writing
papers consider molecular cloning using plasmids. You have a dozen semi-
related tubes in a cryogenic freezer that you need to track over many years
(ie long term inventory management), each of which has one or more pieces of
sequencing data attached to it (so a small data set), they're all interrelated
(you create a new one by physically modifying an old one), and each has the
typical meta-links to experimental protocols, notes, academic literature, and
other things.

I'm not aware of any software solutions that comprehensively address all of
this stuff, so people still use pen and paper. But pen and paper is time
consuming, it's error prone, it doesn't sync between devices, it's slow and
tedious to cross reference - all the typical problems that software is good at
addressing.

------
maxander
Slightly smaller-scale than most suggestions here, but for the average
nonscientific HNer, the best way to help scientists is to improve their
programming tools. In the Python ecosystem, for instance; numpy/scipy, scikit-
learn, and matplotlib are widely used across dozens of disciplines, and are
open-source projects relatively welcoming of new contributors. Julia is a
whole new language for scientific computing, where all the fundamental tools
are still being built and refined. Raspberry pis, 3D printers and other
“hobbyist maker” tools are appearing in research labs to help develop novel
instrumentation, so hardware-oriented people can help by contributing to open-
source efforts of that kind.

~~~
Jedi72
Somewhere along the way we computing folk lost sight of this: nobody needs
software just for its own sake (with the possible exception of games).
Everything we do is supposed to be building tools for other people, who do
the _real_ work, to make their lives easier. As opposed to how it currently
is, which is to provide something for free, then put up artificial barriers
to certain parts of it and charge for their removal.

------
drewvolpe
Come work with us at Plex Research. A huge problem our founder ran into while
doing drug discovery research was that there's tons of data in the world, but
no one is using it because it's all in thousands of different places. As a
programmer, it was crazy to me that even the most advanced organizations in
industry (Novartis, Sanofi) or academia (Harvard, Stanford, ...) are still
keeping many important datasets in Excel.

We're pulling together all of the world's biomedical research data,
structuring it as a graph, and allowing researchers to access all of it as easily as
Google does the web:

[https://www.plexresearch.com/products.html](https://www.plexresearch.com/products.html)

------
bocklund
> indexing all open research with free unlimited access, similar to arxiv-sanity.com but better

This space is pretty crowded, in my opinion.

I don’t know much about biology, but I can tell you that in materials, it’s
all about data. The materials design problem and predicting new materials
comes down to knowing properties of other materials. A lot of progress has
been made by using datasets generated by quantum mechanical calculations by
the Materials Project, OQMD, AFLOW and NOMAD, but materials design is tricky
because what we want to predict are the outliers that we haven’t seen yet:
materials with the highest strength, etc.

There’s value to be created for materials researchers by curating experimental
data in a digital, usable form, since so much is locked up in papers, but you
really need domain expertise for this and there’s another problem that the
experiments are so sparse and have so many features (chemistry,
microstructure, thermal history, etc) that people have really only been
successful when focusing on particular classes of materials.

~~~
andrei-mircea
[https://citrine.io/](https://citrine.io/)

You might find this company interesting.

~~~
bocklund
I actually know and work with several people there :)

------
sparadiso88
Note: The following is an earnest suggestion and not just a recruiting plug,
but it is also a recruiting plug. If this is not an appropriate place for this
content then comment and I'll remove (I do not frequently HackerNews)!

Personally, I believe the highest leverage thing we can do to promote
scientific research is to help connect the dots between related discoveries in
the service of bringing new ideas to commercialization/impact as quickly and
reliably as possible. I joined Citrine
([https://citrine.io/platform/](https://citrine.io/platform/)) a few years ago
thinking that we would help accelerate research by direct support - building
simulation, data, and machine learning tools for scientists. It became clear
very quickly that the community was already in a pretty strong position (smart
cookies, those research scientists) on that front. The biggest opportunity
turned out to be scaling expert knowledge - bridging the no man's land from
ideation to scale-up, integration, and manufacturing. At Citrine, we're
building infrastructure that helps researchers contribute to an organization-
wide knowledge graph capable of supporting inference in the scale-up or
manufacturing context based on relationships learned in the R&D phase. This
fundamentally changes the ROI calculus for basic research because it can
plausibly support the entire product life cycle.

If this sounds exciting to you, then consider joining us! We just raised a
Series B and are growing quickly.

If you have an applied math and software eng. background and want to help
generalize and scale our property inference infrastructure, then I think you'd
enjoy working with me and my team in SSE:
[https://citrine.io/careers/#scientific-software-engineer](https://citrine.io/careers/#scientific-software-engineer).

If you have a backend software eng background and want to build distributed
services for scientists and engineers at some of the biggest materials and
chemicals companies in the world, join us in engineering:
[https://citrine.io/careers/#sr-backend-software-engineer](https://citrine.io/careers/#sr-backend-software-engineer).

------
xwdv
If you want to help software related AI research, start compiling massive high
quality training data sets and giving it away for free. Easier said than done
though.

No _immediate_ financial return? I hope you can accept _no_ financial return,
period. In general the easiest way for an individual to accelerate general
research is through generous funding. But even then it’s not like a slider in
a game where you provide more funding and things get done faster. There’s
diminishing returns after a point. Not that I’m trying to discourage you, but
I hope you’re thinking about it the right way before you waste a lot of time
and money.

I suggest talking to actual researchers and asking _them_ what they really
need, and give them that. Basically the same as a startup going out and
talking to customers. The only research probably being done around here is
largely software related, and probably not changing the world much in ways
that actually matter.

~~~
mariushn
> I suggest talking to actual researchers and asking them what they really
> need, and give them that.

Will start to do that, thanks!

> No immediate financial return? I hope you can accept no financial return,
> period.

That would be ok for the next 3 years.

~~~
xwdv
> That would be ok for the next 3 years.

No, I don’t mean no returns for three years, I mean no returns _ever_. You
must go into this with both eyes open, don’t find yourself crippled later
because you gave all your time and money away and have nothing to show for it.

~~~
andreygrehov
What if he would sell models? A marketplace for trained models.

~~~
godelski
Some of this already happens. But you also have to realize that that is not in
the spirit of science.

~~~
andreygrehov
Got it. It's not, agreed, but nobody should work for free. In fact, I believe
that doing what OP wants to do is more harmful than building a business around
his intentions.

~~~
godelski
I agree, no one should have to work for free. But there are two competing
aspects here. Science is about seeking out knowledge and advancing humanity.
But unfortunately we need to buy food to eat.

------
khawkins
One of the big challenges for university researchers is trying to find talent
and dedication in the enormous pool of undergrads and masters students. Nearly
every professor relies heavily on the recommendation system or grabbing from
the pool of students in a class. The problem with this is that it's often hit
or miss with a net-neutral return on investment.

If I send a PhD student to spend a certain amount of time independently
training two students, then I am investing that grad student's time into
something that could be spent doing research. If one of those students is
flaky and is mostly there to pad their resume, it's largely a lost investment
(even more so if their work requires extra effort to fix). If the other one
graduates by the following semester, the gained productivity might not exceed
the loss from the flaky one by the time they leave.

The best situation is when you're working with an undergrad who is destined to
continue on to the PhD program because you know they'll be dedicated and
possibly have a head-start on their thesis work. If you could figure out how
to connect these students with advisors using metrics and private social
networking, then you will amplify their productivity significantly. You'll
improve the likelihood professors will take on undergrads and potentially push
researchers into the field earlier.

How you achieve this, I'm not entirely sure. Perhaps you can make it easy to
set up competitions which test the skills you need. The winners will be asked
to join the lab to work on some project. If some lab on the other side of the
country which does similar research also wants to do the same competition,
make it easy for them to share and run it.

~~~
blueboo
I mean, you could pay them more than approximately nothing...

------
whatshisface
I have a request. Somebody needs to make matplotlib, but with a C API and lots
of language bindings instead of only a python API. For example when I'm
writing rust, my best option is to use a Python interop library that calls
matplotlib...

~~~
H8crilA
Python is kind of the gold standard in many areas of research, no? Why would
you use Rust, or even anything but Python?

I'd only touch Rust (or C/C++) when I need to implement some fast numerical
computation that does not already exist in Numpy or Tensorflow, but still call
it from Python.

~~~
gpm
I've sped up a simulation 100-fold by porting a 20-line function to Rust
(it had a terrible access pattern for numpy), and could probably have sped it
up another 10x (a part that was closer to reasonable for numpy).
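
For readers outside numpy-land, here is a toy illustration (not the commenter's actual code) of the kind of access pattern numpy handles badly, next to the vectorized form of the same computation:

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)

def row_norms_loop(m):
    # Python-level loop over rows: one interpreter round-trip per row.
    # This is the shape of access pattern that cripples numpy.
    return np.array([np.sqrt((row * row).sum()) for row in m])

def row_norms_vec(m):
    # Same result in a single vectorized call; numpy runs the loop in C.
    return np.sqrt((m * m).sum(axis=1))

print(np.allclose(row_norms_loop(a), row_norms_vec(a)))  # -> True
```

When the computation can't be expressed vectorized at all, that's exactly the case where porting the hot function to a compiled language pays off.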

If the thing could have been written in Rust in the first place, tons of time
would have been saved: on trying to optimize the Python, on waiting for
simulations to complete before (and to a lesser extent after) I ported a
portion of it to Rust, and on dealing with language interop and build systems.

The main reason I can't suggest that for future similar problems to the person
I did this for is the lack of libraries, plotting being by far the most
important one (numpy is second, but rust comes a lot closer in that regard).

------
escot
Some interesting companies at the intersection of science/engineering that are
hiring software devs:

\- [https://www.ginkgobioworks.com/](https://www.ginkgobioworks.com/)
Synthetic biology engineering

\- [https://www.benchling.com/](https://www.benchling.com/) Online LIMS

\- [https://neuralink.com/](https://neuralink.com/) Brain Machine Interface

\- [https://strateos.com/](https://strateos.com/) Programmatic Cloud Lab
(disclaimer, I work here)

------
jchallis
Sci-Hub removes barriers to accessing scientific information. Their
infrastructure is slow and has trouble. If you are looking for real impact,
help them scale.

------
wintercarver
I don’t think this is a comprehensive answer, but if you want a nice summary
of how climate change might be impacted by ML/DL applications, this is hot off
the press: Tackling Climate Change with Machine Learning,
[https://arxiv.org/abs/1906.05433](https://arxiv.org/abs/1906.05433)

~~~
jointpdf
There was a recent NOAA conference on this topic as well, you can view the
slides here:
[https://www.star.nesdis.noaa.gov/star/meeting_2019AIWorkshop...](https://www.star.nesdis.noaa.gov/star/meeting_2019AIWorkshop_agenda.php)

------
haxiomic
I think there's a lot of room to improve tools in bioinformatics. In practice,
bioinformatics pipelines tend to be bundles of loosely organised python
scripts, and I've heard files on the order of a few GB described as Big Data
because the processing times are so slow (days for stuff that could take
milliseconds).

It would help to pair up with practicing scientists and explore what parts of
their workflow can be improved.

~~~
danielecook
Definitely. A lot of work is done in R or python and could be sped up in a
compiled language.

~~~
panta
Yes. I’m not a scientist, but I think if there were good compiled python/R
alternatives, the scientific world would benefit greatly, if only for the
reduced waiting times... Maybe a language with Go's simplicity and speed and
Python's ease of use and appeal... It should have an almost real-time compile
mode (with very little optimization) to enable interactive playgrounds, like
Jupyter. Of course it'd also need strong optimization modes for final
production code.

------
folli
On the intersection of biology and machine learning, there is one of the holy
grails of science: protein structure prediction.

I'd recommend starting reading about Google's AlphaFold, since this is
currently considered state of the art in the field:
[https://deepmind.com/blog/alphafold/](https://deepmind.com/blog/alphafold/)

------
massung
> What ideas do you have on how one can accelerate scientific research?

I work in genetics (as a software engineer).

If there was a major flaw in current scientific research (that involves
software), it's that most labs care more about getting published than they do
about the reproducibility and validation of their work. This means most of
the software written in research is ad-hoc, write-once, and often never looked
at again. It was put together for the sole purpose of producing some output
that could be put in a paper and then lost to time.

A current "holy grail" of software in research would be to fix that: empower
other labs to validate and _reuse_ the software written and reproduce the work
of other labs with different data sets. And it is actively being worked on in
a couple places (that I know of, perhaps more):

* [https://genepattern-notebook.org/](https://genepattern-notebook.org/)

* [https://app.terra.bio/](https://app.terra.bio/)

* [https://software.broadinstitute.org/wdl/](https://software.broadinstitute.org/wdl/)

Some of these are just about giving the community a common framework to use
for their software (CWL, WDL, Jupyter), others are about data storage and
making it easily accessible for others to use in the cloud for reproducing
results.

If you want to have an impact, joining one of these groups would probably put
you in a much closer position to doing that.

If you just wanted to work on something in your spare time that would be
incredibly valuable, then might I suggest this:

It's amazing how much work is done in the scientific community using CSV/TSV
files (usually gzipped). And most of that work is done via perl, sed, and awk.
And often these files are _huge_: I'm working with a VCF file (TSV) right now
that's 2 TB in size ZIPPED! It's crazy. Researchers often don't have the time,
resources, or know-how to put together a simple Spark cluster and use it.

A command line tool that allowed someone to run SQL (or SQL-like) commands on
a gzipped CSV file FAST would be invaluable. And if it could JOIN across CSV
files ... wow!
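
Until a dedicated tool exists, one stdlib-only approximation is to stage gzipped TSVs into an in-memory SQLite database, which gets you real SQL including JOINs. A minimal sketch, with made-up file and column names; note that it loads everything into memory, so it wouldn't scale to the 2 TB case described above:

```python
import csv
import gzip
import sqlite3

# Create two tiny gzipped TSV files standing in for real data
# (hypothetical file and column names).
with gzip.open("variants.tsv.gz", "wt", newline="") as f:
    f.write("id\tbeta\nrs1\t0.25\nrs2\t0.01\n")
with gzip.open("annotations.tsv.gz", "wt", newline="") as f:
    f.write("variant_id\tgene\nrs1\tBRCA1\nrs2\tTP53\n")

def load_tsv(conn, path, table):
    """Stream a gzipped TSV into a SQLite table, decompressing on the fly."""
    with gzip.open(path, "rt", newline="") as f:
        reader = csv.reader(f, delimiter="\t")
        header = next(reader)
        conn.execute(f'CREATE TABLE "{table}" ({", ".join(header)})')
        placeholders = ",".join("?" * len(header))
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})',
                         reader)

conn = sqlite3.connect(":memory:")
load_tsv(conn, "variants.tsv.gz", "variants")
load_tsv(conn, "annotations.tsv.gz", "annotations")

# A JOIN across two gzipped files, plus a filter.
rows = conn.execute(
    "SELECT v.id, a.gene FROM variants v "
    "JOIN annotations a ON v.id = a.variant_id "
    "WHERE CAST(v.beta AS REAL) > 0.1").fetchall()
print(rows)  # -> [('rs1', 'BRCA1')]
```

Pointing SQLite at an on-disk file instead of `:memory:` trades speed for capacity, but the load step itself is still the bottleneck the comment is complaining about.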

~~~
mariushn
Thanks so much, massung, for your excellent practical feedback! Both
alternatives that you listed are very tempting.

May I please ask you some followup questions? There's no email in your
profile. My email is marius.andreiana@gmail.com

> A command line tool that allowed someone to run SQL (or SQL-like) commands
> on a gzipped CSV file FAST would be invaluable. And if it could JOIN across
> CSV files ... wow!

What prevents one importing each CSV in a postgres db as tables, creating
indexes and then start running queries? Disk space availability? (My local
drive is only 1TB)

~~~
massung
mariushn, I've sent you an email.

> What prevents one importing each CSV in a postgres db as tables, creating
> indexes and then start running queries?

There are many reasons:

* Experience/knowledge. Many labs don't have anyone experienced with databases.

* Security. Without proper dev ops a local DB is often out of the question. And when dealing with PII genetic data, [cloud] security can be a major concern.

* Funding. Machines cost money. AWS RDS instance cost money. Maintaining them costs money. Dev ops costs money. etc.

* Often times the queries being done are simple. For example, you may have a giant CSV with cross-ethnic trait data, but only need samples of African descent with a beta value > 0.1. Sure, you could spin up a database, load the entire thing into a table (O(N) + disk space + time), then index it (now it's O(2N) + more disk space + more time), and then _finally_ run your query. Or you can just run over the CSV once in O(N) and output the results with no extra disk space or "wasted" (in the perception of the researcher) time.
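
The last point above, sketched concretely: a single streaming pass over a gzipped TSV with the stdlib, no database and no extra disk space. The column names and the filter are hypothetical:

```python
import csv
import gzip

# Tiny stand-in for a huge gzipped cross-ethnic trait file.
with gzip.open("traits.tsv.gz", "wt", newline="") as f:
    f.write("sample\tancestry\tbeta\n"
            "s1\tAFR\t0.25\n"
            "s2\tEUR\t0.30\n"
            "s3\tAFR\t0.05\n")

# One O(N) pass in constant memory: decompress, filter, collect.
matches = []
with gzip.open("traits.tsv.gz", "rt", newline="") as f:
    for row in csv.DictReader(f, delimiter="\t"):
        if row["ancestry"] == "AFR" and float(row["beta"]) > 0.1:
            matches.append(row["sample"])

print(matches)  # -> ['s1']
```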

Finally, don't fool yourself about the capabilities of researchers. Many are
code-savvy, but lack experience. Writing a SQL query is easy. Loading multiple
TB of data into a relational database, indexed properly, and done in a manner
that won't take _days_ of time is a level higher.

------
gvggf
This will sound harsh, but it isn’t meant to be.

Scientists code better then you do science.

This is simply a consequence Of a weeding out mechanism for those that have no
coding skills. The only ppl who get away with no coding skills are important
professor with grad students to do the coding.

This isn’t to say that our skills are great, but a generic programmers (I.e.
CS majors) science abilities are approximately zero (common, no thermo in an
“eng” undergrad???)

So what can you do?

Since you mentioned science and not engineering, I’d ignore the AI advice.
Science needs models based on mechanistic understanding of the underlying
phenomena. A model that merely predicts is useful for engineers, not
scientists.

“materials simulator (eg how can we get a material having a given set of
properties)“

This is already done, but of limited usefulness. First, the materials
simulators are far from perfect. Then there is the problem of actually
synthesizing the materials. These simulations are more typically done to weed
out bad candidates.

“No immediate financial return”

Wrong attitude. Only an attitude of “no financial return” helps science.
That’s not to say you won’t make money off of it, but that can never be a goal,
since (true) science advances freely (again, see the Gaussian jerk vs.
Einstein or Landau: who contributed more?)

Instead, focus on making the programming tools scientists use better, easier
to use and GPL. GPL is important because an MIT license by itself allows a
scientist to use others work while blocking others (see Gaussian).

For example, making python (or Julia?) better would be one of the most
important contributions you could make. The matplotlib guy was deeply mourned
in science.

The two cents of a physical sciences researcher who once flirted with the
Valley.

~~~
tntn
> Science needs models based on mechanistic understanding of the underlying
> phenomena. A model that merely predicts is useful for engineers, not
> scientists

I'm not sure I agree. I'm aware of quite a bit of supercomputing time that is
spent doing lattice QCD calculations (which apparently some scientists find
useful), and though I'm no quantum physicist I'm pretty sure there is not much
of a "mechanistic understanding" in QCD. I think your claim also doesn't apply
to a lot of social science - psychology has a lot of functional models, but I
don't think there are many mechanisms described.

I'll also state that modern science that doesn't require any engineering is
pretty rare nowadays, so if a predictive model helps engineers that can then
help scientists, the model has been helpful to scientists.

Ohm's law existed long before there was a mechanistic description behind it,
and though it is mostly used for "engineering," I feel confident that a lot of
scientists in the 19th century found it useful.

From [https://www.olcf.ornl.gov/leadership-science/physics/](https://www.olcf.ornl.gov/leadership-science/physics/):

"New Frontiers for Material Modeling via Machine Learning Techniques" \-
40,000 hours allocated on Summit

"Large scale deep neural network optimization for neutrino physics" \-
58,000,000 hours allocated on Summit.

Supercomputers typically do not allocate 58 million hours to things which are
not useful.

~~~
godelski
I work with the DOE and was at ORNL before Summit was released (I got to play
on Summit-dev). When making these models there is A LOT of exploration
happening. There's a whole class of visualization techniques called "in situ"
that visualize data as it comes off the press (memory is then dumped because
there's neither enough storage space nor can we write to disk fast enough).
I'll tell you that there will be a lot of restarting those simulations because
the scientists need to explore the data as it is going on. Going in the wrong
direction? Made a small mistake that causes cells to explode? Realize you're not
looking in the right region of interest? You restart the sim (thank god for
restart files, right?). Exploration is one of the most important things in
research and it is getting more and more difficult. I believe this is what the
gp is after. Having these understandings helps you explore the data better.
Creating these tools is hard work and takes a lot of collaboration too.

------
UglyToad
I saw this list linked yesterday from another HN article, might give some
useful jumping off points: [https://github.com/kakoni/awesome-
healthcare](https://github.com/kakoni/awesome-healthcare)

One related thing I found from there was a list of projects for magnetic
resonance imaging specifically:
[https://ismrm.github.io/mrhub/](https://ismrm.github.io/mrhub/)

I'd assume trying to contribute to those projects would hopefully give greater
ROI than building a new thing (without a very specific idea of what to build
and the market for it)?

~~~
mariushn
Thanks!

------
devit
Find any way to make a lot of money, gain a lot of political power or gain
influence over those who have money or power and use it to fund research and
make it more appealing culturally.

~~~
mike_ivanov
No, the most significant advances in science were done on a shoestring budget
(or no budget at all) - just by THINKING. It doesn't require lots of money to
support people who do that kind of work, and they are easy to find, especially
in theoretical physics and mathematics.

~~~
fghtr
Maybe it was true a hundred years ago, but it is not true today,
unfortunately. Science has become much more complicated.

[https://news.ycombinator.com/item?id=20190468](https://news.ycombinator.com/item?id=20190468)

~~~
mike_ivanov
It is still true. Before the tau neutrino was confirmed in a lab, it had been
theorized some 25 years earlier, which required nothing but pencil and
paper. Modern quantum field theory doesn't require expensive gear. Math
doesn't require any gear at all, and there is a whole pile of crucially
important unsolved problems.

The fact that science became complicated is a sign of thought stagnation, not
a sign of progress.

------
i000
As a faculty member running a lab in human genomics/genetics, I would say
either join a lab which needs your skills or work for a university in support
of their IT and high-performance computing needs. People with computational
skills, an interest in doing/enabling research, and a willingness to accept a
sub-market salary are obviously rare. IT departments at universities are
under-staffed and over red-taped, but most researchers depend on them to
actually do work.

------
Glench
I would go talk to scientists and try to understand deeply what they’re
working on, how they do it, and what systems (social/technical/political)
allow them to do it. I’ve done this a few times and it’s always illuminating
and inspiring.

~~~
mariushn
Correct. How do you connect with scientists when not working at a university
like you do? Getting an email from a nobody offering help for free does sound
like a scam.

One answer would be "Enroll into an university". Others?

~~~
Glench
I just emailed them and told them I was doing research. Lots of people,
especially students, love talking about their work.

------
godelski
My dream would be to have a place I could look at any paper (even if not on a
campus internet connection), be able to look at raw data, code, and have a
forum to facilitate discussions between researchers.

I know the first won't happen for a while, but it is a dream (open science).

The second is direly needed but in some cases not practical. But I think there
could be a lot to gain from just having a small portion of this. It could help
verify results significantly. Also imagine if people could research things
without having to have all the fancy instruments. Some of this already
happens, but I think it is harder to find and not always easy to sort through.
It isn't connected to papers and research. Just having a paper and a link to
the data would be a tremendous help.

Similarly code. I don't know a researcher that hasn't made an accidental bug
in the code that changes results (some slight, some major). I think we need to
get over that WE ALL write hacky code. Hacky code is better than a vague
description in a paper because you don't have enough room to write an accurate
description of your model. Science is supposed to be replicated! Even linking
papers with a GitHub account would be a tremendous help. Some people don't want
to share code, and I think this is a shame and anti-scientific (especially if
you are using public money), but that's a rant on its own.

Researchers email one another all the time. Some of these discussions should
be public. Papers leave a lot of gaps. An area where researchers could add
extra notes that couldn't fit within a page limit, where collaboration can
happen, or where people can just ask questions would be great. Replicating
results can
be hard and we should be learning from one another's hurdles. That's the point
of science after all, for to push the progress of humanity. Lack of ability to
communicate should not be a gate.

Along with these things it would be nice to encourage putting up null results.
Alongside a paper I would love to know what challenges a researcher fought. That's
where most of the work is. It is funny, we constantly talk about failure being
80-90% of research. Whole projects failing or just banging your head on the
desk because you can't figure out why something isn't working. Let's open this
up. Let's help one another. Let's talk about what went wrong and how we fixed
things to get to our success. I can't think of anything that would help
science more than this.

~~~
cellular
I wonder if people are reluctant to do this because they would be scrutinized
more, and perhaps their PhD, grant, etc. might be at risk?

If so, is the only solution to give "credit" towards the PhD, grant, etc. in
the form of hours worked towards pushing knowledge acquisition, rather than
strict results?

~~~
godelski
I think it is partially that (but minor). Partially embarrassment (frequently
code is rushed). Partially people view code/data as a trade secret. The latter
I find anti-scientific, especially since a lot of funding comes from public
money. I'm okay with holding on to it for a small period of time (because we
live in an unfortunate world where sharing knowledge can't come first and
people need to secure funding), but I don't think this should be a default
mode that people do. Luckily it does seem that many are turning to
GitHub/BitBucket/GitLab to make their code available.

And everyone that's in a PhD or has one knows that the majority of work you do
is failing (but you learn from that failure).

Side note: I wonder if imposter syndrome would decrease if we were aware of
one another's failures and didn't only see the accomplishments.

------
hannob
Having learned a lot about how science actually works and how flawed it is, I
came to the conclusion that the biggest boost to scientific progress you could
ever achieve is by eliminating scientific waste and doing more good science.

There are a few tiny steps in the right direction, but it's frustratingly
slow. Once you understand a phenomenon like publication bias, it's hard to
swallow that there are still empirical studies published without
preregistration. There are so many studies published with such low quality
that they're a complete waste, because the likelihood that they're some
statistical fluke is much higher than that they tell the truth.
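
The fluke problem is easy to demonstrate: run enough experiments where the
null hypothesis is true by construction, and a steady fraction of them will
still "succeed" at p < 0.05. A toy simulation, using only the Python standard
library and a made-up fair-coin null for illustration:

```python
import math
import random

random.seed(0)

def p_value_two_sided(n_heads, n_flips):
    # Normal approximation to the fair-coin (null) binomial distribution.
    mean, sd = n_flips / 2, math.sqrt(n_flips) / 2
    z = abs(n_heads - mean) / sd
    return math.erfc(z / math.sqrt(2))  # two-sided p-value

# 1,000 "experiments" of 100 coin flips each; the null is true in all of them.
false_positives = sum(
    p_value_two_sided(sum(random.random() < 0.5 for _ in range(100)), 100) < 0.05
    for _ in range(1000)
)

# Roughly 5% of pure-noise experiments still clear the p < 0.05 bar.
print(false_positives)
```

If only the "significant" runs get published, the literature fills with these
flukes, which is exactly the argument for preregistration.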

Though the problems have been known for a while, little has changed:
[https://journals.plos.org/plosmedicine/article?id=10.1371/jo...](https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124)

~~~
godelski
That is a very click-baity publication (yes, it happens). The title suggests
this phenomenon is independent of field. The content talks about high p-value
thresholds, but not every field uses this metric, so the publication is aimed
pretty much at medicine.

While I think the p-value thing needs to be rethought, it is bold to claim
"Most Published Research Findings Are False". Honestly it would be more
accurate to say "Errors matter and we can't take findings at face value and
ignore error bars". Which is pretty much something every self-respecting
scientist should know in the first place.

------
boyband6666
So as someone working in the application of medicines, I'd agree with the
comments saying that those of us doing these things have learned to make code
work, but are not software engineers (and would not claim to be). This means
that if you can improve the tools that are used (and don't care about making a
return), the leverage would be massive.

I imagine it isn't trivial (otherwise it would have been done by now), but
imagine if R parallelized automatically; seldom is the code in my field worth
figuring out how to make parallel, but if it happened automatically, a hell of
a lot of time would be saved! I know there are a couple of packages that kind
of make it work, but so far I've yet to hit a threshold making it worth it (I
just leave things running overnight).
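
To illustrate the wished-for ergonomics in Python terms: when the parallel
version is a one-line change from the serial version, the threshold for
bothering drops to zero. A minimal sketch with a made-up workload
(`slow_analysis` is a stand-in; CPU-bound Python would want a process pool
rather than threads, but the API shape is the same):

```python
from concurrent.futures import ThreadPoolExecutor

def slow_analysis(x):
    # Stand-in for one unit of per-sample work in an analysis script.
    return sum(i * i for i in range(x))

inputs = [10_000, 20_000, 30_000, 40_000]

# Serial version: results = [slow_analysis(x) for x in inputs]
# Parallel version is a one-line change:
with ThreadPoolExecutor() as pool:
    results = list(pool.map(slow_analysis, inputs))

print(len(results))
```

The point is the ergonomics, not the speedup: if swapping the list
comprehension for `pool.map` is all it takes, "not worth figuring out" stops
being a reason to leave things running overnight.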

Abstract from that to projects like Zotero and you can see how you could have
an impact on a lot of people by enabling them to do what they do.

------
tehlike
I felt the same, and decided to learn how to do that. I followed one guy I
admired a lot. The way he does it is to merge software engineering practices
into science (in this case AI and robotics) and reduce the cost of iteration.

For us it means:

1\. One-click binary deployments.

2\. Safer iterations that allow for making mistakes, so that they can run tens
at the same time instead of one really safe one.

3\. Logging, visualization, and whatnot on a unified infrastructure.

We are not scientists, but we know damn well how to scale software and
business. It applies everywhere. Think of TensorFlow: before, most people
couldn't do AI themselves; now it is damn easy, and more things will happen as
a result.

This way they can concentrate on science while we concentrate on scaling it.
We are betting on a couple of breakthroughs as a result of increased entropy.

------
lettergram
You can get a job at a company doing research. Improve their tooling and thus
help them research faster. SRI comes to mind in this regard.

Inside the company you can also push them to try and open source the tools.
Unlikely that’ll work if they don’t do it already, however this skill set will
eventually let you contribute more in the future. After some time on the job,
start a personal open source project or start a company directed at some of
the issues you saw.

It’s general advice and it’s the long game, but will likely help you have more
impact.

The advice above may be more useful for engineers earlier in their career, but
you can accomplish that in a handful of years.

------
maxaf
Forget science: as an “army of one” you’re unlikely to make a contribution of
sufficient magnitude that some needle might be moved in the right direction.
Take advantage of the skills you do have: make money by doing what startup
entrepreneurs do, then donate all proceeds to scientific research.

There may be the next Zuckerberg hiding in you. Imagine what would be possible
if you could reach that level of wealth and use all that money to propel
science forward. I’m not talking about some chickenshit foundation; real
impact is made by committing all of your financial means to helping science.
That’s how a real difference is made.

~~~
6d6b73
A lot, if not most, of the important scientific discoveries were the result of
the work of "an army of one". From biosciences to physics, you can do amazing
stuff. We don't need any new Zuckerbergs that siphon all of the great minds
out there into showing more ads.

~~~
ken
The individual scientific discovery was common in the early history of
science, but has grown less common over time. Looking at this list [1], for
example:

\- Prior to roughly WWII, it was mostly individuals, and so we named things
after them (Kelvin, Doppler, Joule, Ohm, etc).

\- From WWII until the late 20th century, it was a lot of small teams
(transistor, 3; DNA, 4; pulsars, 2; etc).

\- Since then, team sizes have grown so that individuals aren't even named
(cloning, an Institute; Top Quark, a Lab; Tau neutrino, a Collaboration; etc).

In fact, in the past 35 years, the only individually named contributors on
that list were mathematicians who constructed proofs of long-standing unsolved
problems in mathematics.

[1]
[https://en.wikipedia.org/wiki/Timeline_of_scientific_discove...](https://en.wikipedia.org/wiki/Timeline_of_scientific_discoveries)

------
guzik
Side note: I am the co-founder of Aidlab - a device and a platform that is
widely used by biomedical scientists and students in their research
([https://www.youtube.com/watch?v=wY0YPOKNk88](https://www.youtube.com/watch?v=wY0YPOKNk88))

In my free time I am working on my next project: an open-source platform
where everyone can contribute to help fight death. The platform allows
uploading anonymized, structured health records, publicly available. Why?
Dying is the #1 problem, and it should be solved together.

OP, if interested, drop me an email: jdomaszewicz (at) aidlab.com

~~~
trentlott
...why should death be solved ASAP?

We haven't got the space or resources for exponential growth of population

~~~
danieltillett
Death has very little to do with the exponential growth of the population. You
need to look at the other end.

------
abcxx99
Regarding the last point: try applying libraries like Quantum ESPRESSO [1] or
CP2K [2] to real-world problems, and apply machine learning to the solutions
they provide for a given problem. There's a tremendous amount of academic
research being done in this direction, but try to take these libraries and
make them useful for real-world applications.

[1] [https://github.com/QEF/q-e](https://github.com/QEF/q-e)

[2] [https://github.com/cp2k/cp2k](https://github.com/cp2k/cp2k)

~~~
mxcrossb
As someone working in this field, I disagree. The state of these codes is that
if you put garbage in, you get garbage out. Without a strong theoretical
background, you’re generating junk. Of course, if he’s willing to spend
several years developing his knowledge of the field, there is for sure room to
help. If not in your suggested machine learning area, instead perhaps in
improving the quality and usability of the code.

But instead I would recommend he start from something like the NOMAD database,
where the calculations have already been run by more knowledgeable people.
Then he can focus on the analysis side.

~~~
abcxx99
Of course you need to develop domain knowledge. But there's tremendous
opportunity lying in these libraries; you just have to spend the time looking
for it.

------
vikramkr
There are a lot of citizen science initiatives you can participate in,
including Folding@home and so on. Your best bet is going to be to find a field
you love and find experts in the field to work with and learn from. There's
definitely a lack of software talent in some areas where you'd be able to make
a dent, but outside of citizen science initiatives, you need to start by
understanding the problems that need to be solved, which can be much more
technical and difficult to understand than they ultimately are to solve. Good
luck!

~~~
mariushn
Thank you!

------
j7ake
Provide funding for high school students to spend a summer working in a
research lab? The amount of funding doesn't need to be high, and it may pique
early interest in the next generation.

------
ArtWomb
>>> Indexing all open research

I'd also check out analytics platforms such as VIVO. End goal is a universal
workflow for all research, discovery and collaboration. Solving this problem
will have immediate impact. For example, computational epidemiology and
containing Ebola outbreaks in "hot zones" hundreds of miles apart.

Web of Science

[https://clarivate.com/products/web-of-
science/](https://clarivate.com/products/web-of-science/)

------
arandr0x
If you have the money to work full time on a personal project you have the
time to sign up for a conference or three on the kind of scientific topic you
like. Ideally, pick one near your home/in the closest metropolis.

Once there you can apply the following no-social-skills-needed guide to making
contacts at a conference.

1\. Watch the keynote.

2\. After the keynote, walk up to pretty much anybody. Ask the following
questions: "Do you think (keynote title) is a major area for (conference
subject matter)?" and almost regardless of what they answer, you can follow up
with "oh, really? Is what you're studying related to (keynote subject)?".
Those two questions are enough to make virtually any academic launch into a
paragraph-long exposé. Usually by the 6th conversation you have along those
lines, you'll have a good summary of which subjects are considered important
right now. If the person looks like they like you or are invested in talking
about their topic, you can follow up by giving your contact info and saying
you're interested in them sending you a paper on the subject they talked about
(one of theirs if applicable).

3\. There will be designated poster sessions. The posters are giant sheets of
paper with young people in front of them. Walk up to a young person standing
in front of a poster that doesn't look that slick but where the subject matter
interests you (slick posters are from bigger labs where you are less likely to
have an impact). Ask the person what they do, how long they've been doing it
for, how big is their lab, and what's the most time-consuming step in their
research right now.

4\. If anyone asks you what you do, say you're an expert in (your computing
area, web apps, cloud, data analysis, whatever) and interested in the
intersection of (your area) and (conference name). Some of the people will say
they think (your area) would be great for (their subfield) because of (super
niche stuff you wouldn't have thought of). Grab their contact info so you can
have a 1:1 meeting with them later where they find you a research subject.

Follow up by having email conversations with a few of the people. If a grad
student looks like they can use your help, ask for an intro to the PI.
Eventually you'll walk your way into a (possibly paid) research project. It's
that simple. Thank you for caring about your world and its future.

------
prennert
Build tools to make research more efficient and reproducible. The research
community is a small market and has little money. Building tools is often not
regarded as scientific activity and can lead to dead ends if publication is
pursued.

So, if you do not have to rely on income anymore and want to contribute to
science, build tools for the community. Or better, come up with a way to help
scientists to build their own tools fast and efficiently.

~~~
mariushn
Thanks! Any other details / specifics would be welcome, since it seems you
already have some knowledge in this area.

~~~
prennert
With my background in computer science and ML, I worked with geneticists to
automate some of their observations.

A lot (almost everything) of what they do routinely is still done manually. In
genetics there are about 10 model organisms (I forget the exact number) that
most work is done on. Examples I have come in touch with are C. elegans, D.
melanogaster, and mice. A huge amount of work is done on these organisms, and
a huge amount of time is spent by grad students and postdocs repeating the
same boring tasks day in and day out. This is a big factor in deciding the
breadth of studies (you can only compare as many conditions as you can
evaluate). Some things biologists do are subjective or prone to bias (even
objective tasks like counting stuff become kind of biased after fatigue sets
in, if you have to count a lot of things and have to do it very often).

If you can automate their tasks, what will happen is that you will enable them
to increase the breadth of their studies drastically and potentially create
more consistent measurements as well.
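
As a flavor of how automatable these counting tasks can be: a deliberately
minimal, hypothetical blob counter over an already-thresholded image grid,
using an iterative flood fill. The image data is made up, and a real pipeline
would use something like scipy or OpenCV, but the core idea fits in a page:

```python
def count_blobs(grid, threshold=1):
    """Count 4-connected regions of pixels at or above `threshold`."""
    rows, cols = len(grid), len(grid[0])
    seen, blobs = set(), 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] >= threshold and (r, c) not in seen:
                blobs += 1
                stack = [(r, c)]  # iterative flood fill over the new region
                while stack:
                    y, x = stack.pop()
                    if (y, x) in seen:
                        continue
                    seen.add((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols and \
                                grid[ny][nx] >= threshold:
                            stack.append((ny, nx))
    return blobs

# A fake 3x5 "thresholded micrograph" with two separate bright regions.
image = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
print(count_blobs(image))  # 2
```

A script like this never gets fatigued and applies the same criterion to the
first plate and the thousandth, which is exactly the consistency argument.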

------
mindcrime
Software tools to help with scientific discovery / hypothesis generation. I'm
thinking specifically about Literature Based Discovery[1] (LBD), but there are
probably other useful approaches in that domain.

I have a little project[2] related to LBD, which I'm working on here and
there. There's still a long way to go, but I'm optimistic there's something to
be accomplished in this space that could help.
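
The classic LBD idea (Swanson's ABC model) fits in a few lines: if term A
co-occurs with B in some papers, and B with C in others, but A and C never
appear together, then A-C is a candidate hidden connection. A toy sketch over
a made-up mini-corpus echoing Swanson's fish-oil/Raynaud's example (each
"paper" is just the set of key terms it mentions):

```python
from collections import defaultdict

# Made-up corpus: each paper is modeled as the set of key terms it mentions.
papers = [
    {"fish oil", "blood viscosity"},
    {"blood viscosity", "raynaud's syndrome"},
    {"magnesium", "migraine"},
]

def candidate_links(papers):
    """Return A-C pairs linked via a shared B but never co-mentioned."""
    cooc = defaultdict(set)
    for terms in papers:
        for t in terms:
            cooc[t] |= terms - {t}
    candidates = set()
    for a in cooc:
        for b in cooc[a]:
            for c in cooc[b]:
                if c != a and c not in cooc[a]:
                    candidates.add(frozenset((a, c)))
    return candidates

# Links "fish oil" to "raynaud's syndrome" via "blood viscosity".
print(candidate_links(papers))
```

Real LBD systems rank millions of such candidates with term statistics and
domain filters, but the A-B-C skeleton is this simple.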

And then you've got things like Robot Scientist[3] which is crazy interesting.

If AI, in the AGI sense, interests you, I might mention the possibility of
contributing to / working with one of the popular "Cognitive Architecture"
systems like ACT-R[4], SOAR[5], or OpenCog[6].

[1]: [https://en.wikipedia.org/wiki/Literature-
based_discovery](https://en.wikipedia.org/wiki/Literature-based_discovery)

[2]:
[https://github.com/fogbeam/Valmont-F](https://github.com/fogbeam/Valmont-F)

[3]:
[https://en.wikipedia.org/wiki/Robot_Scientist](https://en.wikipedia.org/wiki/Robot_Scientist)

[4]:
[https://en.wikipedia.org/wiki/ACT-R](https://en.wikipedia.org/wiki/ACT-R)

[5]:
[https://en.wikipedia.org/wiki/Soar_(cognitive_architecture)](https://en.wikipedia.org/wiki/Soar_\(cognitive_architecture\))

[6]:
[https://en.wikipedia.org/wiki/OpenCog](https://en.wikipedia.org/wiki/OpenCog)

------
7373737373
Solve the following problem:

"I want to find all research labs concerned with topic X in this area"

=> Create an (open) crunchbase/patreon for university research groups

Right now, each university has its own website, with its own layout and
varying degrees of information. General scraping is effectively impossible.
Finding research groups with certain interests, without trawling through an
endless amount of publications in related journals or randomly meeting people
at conferences, is difficult. The only comprehensive lists are university
ranking sites, but they are not detailed down to the research-group level.

\- create a crowdsourced, public index of running projects

\- help labs find each other and collaborate on work

\- help people apply to labs with open positions

\- let donors and investors find and support projects more efficiently

\- make science journalism easier, by connecting reporters straight to the
source

The main difficulty is creating the network effect which is why, I suppose, no
one has done it yet.

~~~
mariushn
> help labs find each other and collaborate on work

Ideally, that would be awesome. But... do they want to? Will they put science
& results above all, or will they put funding, egos, and credit first?

Science is great, but from my experience, these others are even more important
to individuals. They support a career & family.

~~~
7373737373
The ideas published in journals all depend on humans.

But to an outsider, these connections are completely invisible. Conferences
cost thousands of dollars to attend. Subject matter experts are hard to find,
so many connections are simply not made.

Which is a good reason to support such a platform, because it may make such
views possible and existing deficits more visible and quantifiable :)

Small and medium-sized businesses outside academia could also use it to
improve their technology and connections.

------
xipho
Contribute to TaxonWorks [http://taxonworks.org/](http://taxonworks.org/).
It's software from a small endowed group that builds tools to support those
who describe Earth's species. The software is completely open source, with
many opportunities to improve what's going on there.

------
gpm
I've helped one-on-one a few times by working with someone who had a problem
and needed some code (sometimes getting paid, sometimes for free for a
friend). I think that's higher impact than most of the remaining low-hanging
fruit. These were all one-coder projects.

There are also larger coding projects in the sciences with teams. Running the
database (and doing whatever else needs doing in code) for a telescope for
instance. I don't have personal experience with them, but search around and
you can probably find them.

The only thing I can think of that would have helped across the projects I've
worked on would be better (more convenient, I don't care _that_ much about how
the output looks as long as it conveys the information) graphing/data
visualization tools.

------
ajot
Re: materials simulator, you should totally check out the Materials Project
[1]. I guess that they are open to getting some help in their GitHub repos.

[1] [https://materialsproject.org/about](https://materialsproject.org/about)

~~~
mariushn
Thanks!

------
na85
I think the best way to accelerate scientific research is to recreate Bell
Labs in a modern setting, free from the pressure to succeed commercially.

Get a huge endowment fund (the hard part), hire a bunch of smart and promising
people, and tell them you want them to change the world.

------
mlspector
I work on a project called OpenReview
([https://openreview.net](https://openreview.net)) that aims to improve
scientific peer review by encouraging transparency and research on the peer
review process itself. Right now most of our activity is within the machine
learning academic community, but we have plans to spread into other areas of
computer science and more.

We’re looking for developers, if you’re interested:
[https://codeforscience.org/jobs?job=OpenReview-
Developer](https://codeforscience.org/jobs?job=OpenReview-Developer)

Let me know if you want to learn more and we can find a way to connect.

------
tristanpemble
I’m the engineering director at Quartzy (YC S11). We’re focusing on making
research more efficient through the laboratory supply room. We help labs
through a group buying experience to avoid waste, and offer discounts to their
contract prices by working directly with manufacturers.

This isn’t the most obvious way to make an impact, but we’ve effectively saved
many labs thousands of dollars on equipment and consumables that can then be
reallocated toward their research. Our mission is to increase the efficiency
of scientific research — the supply room is just the beginning.

Would be happy to chat - tristan@quartzy.com

~~~
fuzzfactor
>increase the efficiency of scientific research — the supply room is just the
beginning.

Efficiency is often compromised institutionally, so I know this is an
obstacle.

I've got a lifetime of business models for scientific software, but that
wasn't the question above.

One way to accelerate many types of laboratory research would be to offer a
workflow alternative where very high-performance individuals can choose to
spend almost all their working time at the bench(es), rather than being
distracted or sitting down at desks or office & lab computers, while still
maintaining overall leadership of the organization. This requires a different
type of team structure as it scales.

That way someone who can really invent a lot doesn't get bogged down by the
bureaucracy of their earlier inventions.

------
gumby
A shoutout for asking “what can I do” as opposed to “why is research so slow?”

------
n1000
A lot of journals (including the one I am an editor at) would probably
consider escaping the fangs of Elsevier and co. and going full open-access
publishing. However, there are a few obstacles to that, such as missing open
and reliable editorial and publishing platforms, contacts with libraries, etc.

In my opinion, providing these things and actively encouraging journals to
make the jump to open access would be a huge service to academia. Also, I
wouldn't be surprised if one could find funding for such a venture.

------
zitterbewegung
Donate money to researchers. Any of the situations below may or may not have a
useful outcome, but direct investment would be the easiest one, and one you
can leverage.

------
cupotea
The least/most you can do in this space is likely to contribute actively to
something you believe in.

Industry-wide limitations are largely down to policy & regulations.

------
amirathi
Make programming accessible to scientists.

Today, running any compute- or memory-intensive experiment requires working
through so many tools and terminologies (e.g. AWS, SSH, Python, dependencies,
virtualenvs). Scientists really just need a Jupyter-notebook-like experience
that's collaborative, reproducible, and powerful (i.e. runs on any hardware
with one click).

~~~
badpun
Check out [https://www.dominodatalab.com/](https://www.dominodatalab.com/) \-
you've pretty much described their product.

------
mike_ivanov
> how one can accelerate scientific research

Easy peasy. Help (for example) this guy find funds for continuing his
research:
[https://scholar.google.com/citations?hl=en&user=Q0w_e84AAAAJ](https://scholar.google.com/citations?hl=en&user=Q0w_e84AAAAJ)

Edit: there are many like him.

------
kingryan
You mention meta.org. You know, you could work with us on it:
[https://chanzuckerberg.com/join-
us/openings/?team=engineerin...](https://chanzuckerberg.com/join-
us/openings/?team=engineering) (search "meta" on that page)

------
zbjornson
Shameless plug: come work for Primity Bio. My team builds a bioinformatics web
application that (among other things) is several orders of magnitude faster
than the other tools, which fundamentally changes the scope of analysis that
researchers can feasibly do. Email in bio.

------
billconan
To me, the biggest hurdle is the pretentiousness in the research world and the
way papers are written. If someone could rewrite papers in plain English,
removing all jargon and talking more about intuition, that would help
encourage more people to get involved in scientific research.

------
raizinho
Here's a journal for publishing null results:
[http://www.jasnh.com/](http://www.jasnh.com/)

------
fghtr
Probably not the answer you would expect, but one very simple action you could
do is to sign/share this petition:
[https://publiccode.eu](https://publiccode.eu).

------
m463
I think you might enjoy this interview with Elon Musk:

Elon Musk - How to Build the Future

[https://www.youtube.com/watch?v=tnBQmEqBCY0](https://www.youtube.com/watch?v=tnBQmEqBCY0)

------
moonbug
The job role you're looking for might be "Research Software Engineer".

~~~
mwest
Surprised this comment isn't higher.

OP does indeed sound like the discipline they're looking for is Research
Software Engineering.

Some reading:
[https://en.wikipedia.org/wiki/Research_software_engineering](https://en.wikipedia.org/wiki/Research_software_engineering)
"Research software engineering is the use of software engineering practices in
research applications. The term started to be used in United Kingdom in 2012,
when it was needed to define the type of software development needed in
research. This focuses on reproducibility, reusability, and accuracy of data
analysis and applications created for research."

[https://software.ac.uk/](https://software.ac.uk/)

[https://rse.ac.uk/](https://rse.ac.uk/)

------
roborzoid
Put yourself in survival conditions and use science to overcome them.

~~~
fuzzfactor
Well, nature did this to me, and the commitment to accelerated operation of a
wounded laboratory (before it could have been lost altogether) yielded better
progress, IMHO, on the same scientific instrumentation compared to leading
PhDs in my field.

I wouldn't recommend it for everyone seeking commercialization because of
accompanying roadblocks due to the systematic financial pressure against
individuals coming from survival conditions.

But it is good knowing what this gear can really do.

------
lykr0n
Run BOINC, Folding@home, or a similar distributed computing project. Many of
those projects are run by educational institutions.

------
benrapscallion
Sign up for a prospective health cohort such as AllOfUs in the US or UK
Biobank or equivalents in other countries.

~~~
benrapscallion
[https://allofus.nih.gov/](https://allofus.nih.gov/)
[https://www.ukbiobank.ac.uk/](https://www.ukbiobank.ac.uk/)

~~~
mariushn
Thanks! Since I'm in the EU, I'll also contact UK Biobank and ask if I can
help with anything software-related.

------
javier2
An idea: create quality libraries for working with genome data.
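
As a taste of the kind of primitive such a library has to get right (and that
gets reimplemented, often badly, in countless lab scripts): a minimal FASTA
parser sketch, with made-up example records:

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []  # new record starts
        elif line:
            chunks.append(line)  # sequences may span multiple lines
    if header is not None:
        yield header, "".join(chunks)  # emit the final record

# Made-up records; in practice you'd pass an open file object.
records = list(read_fasta([">seq1", "ACGT", "TTGA", ">seq2", "GGCC"]))
print(records)  # [('seq1', 'ACGTTTGA'), ('seq2', 'GGCC')]
```

A quality library would add the things ad-hoc scripts skip: validation,
streaming of huge files, and consistent handling of malformed input.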

------
juskrey
Start a business

~~~
boyband6666
If you can find a problem, and solve it, a business is a fantastic way to make
an impact!

