
Crystallography Open Database - nickysielicki
http://www.crystallography.net/cod/
======
mkhorton
Love COD, very nice to see them on the front page :-)

Along similar lines, we maintain an open database of crystal structures
ourselves at the Materials Project (materialsproject.org) at Lawrence Berkeley
National Lab, which includes high-quality calculations of the electronic
structure of each crystal structure so that you can do queries for the
specific properties you need for your application. We're currently working
alongside the Crystallography Open Database and several other databases to
develop an API spec so that all similar databases can be queried in a common
way: [https://github.com/Materials-Consortia/optimade](https://github.com/Materials-Consortia/optimade)
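
To give a rough idea of what "queried in a common way" could look like, here is a minimal sketch that only builds a structures query URL. The `/v1/structures` endpoint and the `elements HAS ALL` filter syntax follow the OPTIMADE specification; the base URL `https://example.org/optimade` and the helper name `build_structures_url` are placeholders made up for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_structures_url(base_url, elements, page_limit=10):
    """Build an OPTIMADE-style URL asking for structures containing all given elements.

    base_url is whatever a provider publishes; the one used below is a placeholder.
    """
    # OPTIMADE filters quote string values, e.g. elements HAS ALL "Si", "O"
    quoted = ", ".join(f'"{el}"' for el in elements)
    params = {
        "filter": f"elements HAS ALL {quoted}",
        "page_limit": page_limit,
    }
    # urlencode percent-encodes the quotes and commas in the filter string
    return f"{base_url.rstrip('/')}/v1/structures?{urlencode(params)}"


# Placeholder provider URL, not a real endpoint.
url = build_structures_url("https://example.org/optimade/", ["Si", "O"])

# Decode the query string again to confirm the filter survives the round trip.
decoded = parse_qs(urlsplit(url).query)
print(decoded["filter"][0])
```

The point of a shared spec is that the same filter string should be usable against any provider that implements it, with only the base URL changing.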

If it's ok to offer an ad here, we're actually looking for someone to join our
team. The calculations we offer at the Materials Project are the results of
millions of CPU hours of computation, so it becomes very important that we're
able to synthesize this data in a way that's approachable and understandable
and then share this data effectively with other researchers, students and
members of the public who might want to use it. We need help with this and
we're hiring a web developer at the moment:
[https://news.ycombinator.com/item?id=23381358](https://news.ycombinator.com/item?id=23381358)

~~~
ylem
The materials project has experimental and theoretical structures. On the
experimental side, how does it compare to ICSD? How up to date is it in
comparison? I love the API for the materials project!

~~~
mkhorton
> I love the API for the materials project!

Ah, so happy it's useful to you!

We're working on a new API internally too (based on FastAPI) that will
hopefully bring better documentation along with it, so stay tuned for
improvements.

> On the experimental side, how does it compare to ICSD?

We have pretty good coverage of ICSD and other experimental databases, and we
continue to process and calculate new materials as they're discovered. We also
calculate ordered approximations of disordered structures, but this is an
area where we could improve.

We also provide a capability where users can upload crystal structures we
don't have and we calculate those too (with credit going to the original
uploader).

------
ajot
I like this project a lot (and use it occasionally in my PhD). I've tried this
full profile Rietveld-esque tool [0] before, and it works pretty well for
being so simple (at least from a UX point of view).

What I would like to see (and would program myself if I was a better
programmer and knew more of the math and physics behind XRD) is a libre
search-and-match program, using a peak database (in reality, multiple
databases for different X-ray sources) constructed from the CIF files in COD.
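
As a very rough sketch of the idea above (not an existing tool, and far simpler than what commercial search-and-match software does): d-spacings derived from a CIF file can be turned into 2θ peak positions for a given X-ray source with Bragg's law, λ = 2d sin θ, and an observed pattern can then be scored against each reference peak list. The function names, the tolerance, and the example d-spacings (approximately silicon's first three reflections) are assumptions for illustration; a real program would also need peak intensities, and the CIF parsing step is left out entirely.

```python
import math

# Characteristic K-alpha1 wavelengths in angstroms.
WAVELENGTHS = {"Cu": 1.5406, "Mo": 0.7093}


def two_theta(d_spacing, wavelength):
    """Bragg's law: return 2-theta in degrees, or None if no reflection is possible."""
    s = wavelength / (2.0 * d_spacing)
    if s > 1.0:
        return None  # d-spacing too small to diffract at this wavelength
    return 2.0 * math.degrees(math.asin(s))


def peak_list(d_spacings, source):
    """Convert d-spacings to a sorted list of 2-theta positions for one X-ray source."""
    wl = WAVELENGTHS[source]
    angles = (two_theta(d, wl) for d in d_spacings)
    return sorted(a for a in angles if a is not None)


def match_score(observed, reference, tol=0.1):
    """Fraction of reference peaks with an observed peak within tol degrees."""
    if not reference:
        return 0.0
    hits = sum(any(abs(o - r) <= tol for o in observed) for r in reference)
    return hits / len(reference)


# Illustrative d-spacings in angstroms, roughly silicon (111), (220), (311).
si_d = [3.1355, 1.9201, 1.6375]
si_cu = peak_list(si_d, "Cu")  # one "database" per source, as suggested above
si_mo = peak_list(si_d, "Mo")

print(si_cu)
print(match_score(si_cu, si_cu))
```

Keeping one peak list per X-ray source mirrors the "multiple databases" suggestion: the d-spacings are source-independent, so only the conversion step has to be repeated per wavelength.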

[0] [http://cod.iutcaen.unicaen.fr/](http://cod.iutcaen.unicaen.fr/)

~~~
ylem
I think that's where commercial programs make their money--in phase
identification. JADE comes to mind

------
flobosg
> Open-access collection of crystal structures of organic, inorganic, metal-
> organics compounds and minerals, excluding biopolymers.

"biopolymers" links to the RCSB PDB
([http://www.rcsb.org/](http://www.rcsb.org/)), which stores biological
macromolecular structures.

------
btrettel
What are some other good scientific databases compiled from the open
literature?

Are there any best practices for the construction of similar databases?

~~~
mkhorton
For crystallography specifically, there's ourselves (Materials Project), OQMD,
AFLOW, Materials Cloud, JARVIS, and a number of more specific (but no less
important) specialized databases. There are also a number of commercial
offerings.

Best practices are incredibly difficult. We're trying to establish a common
API currently ([https://github.com/Materials-Consortia/optimade](https://github.com/Materials-Consortia/optimade)) that can
be adopted by all database providers. How the data is stored behind the scenes
is something that ends up being very specific to how the data is generated and
what its applications are. We're definitely better as a community than we were
ten years ago, but there's a lot of work to be done here.

In terms of scientific databases outside crystallography/materials science,
Nature's Scientific Data is a good open-access journal to peruse:
[https://www.nature.com/sdata/](https://www.nature.com/sdata/)

------
UglyToad
On the hunt for file formats to parse, I ended up implementing a parser for the
CIF format:
[https://github.com/EliotJones/BioCif](https://github.com/EliotJones/BioCif)

It's great that these projects to create open data stores exist.

~~~
natechols
As a biologist, a field where open databases are the norm and deposition is
required for publication, I was very surprised when I found out that the
standard database of small molecule crystal structures was NOT open.

------
jonbraun
The Bilbao Crystallographic Server is another great database including
properties of crystallographic symmetry groups:
[https://www.cryst.ehu.es/](https://www.cryst.ehu.es/)

~~~
mkhorton
For anyone interested in symmetry specifically, there's also the ISOTROPY
Software Suite
([https://stokes.byu.edu/iso/isotropy.php](https://stokes.byu.edu/iso/isotropy.php))
which offers some nice tools.

------
TomJansen
What we also really need is an open antibody database, with links to buying
antibodies and info like affinity, gene sequence and epitope binding. I know
measuring affinity and epitope mapping can be hard, but this is really
necessary for applications such as COVID-19 diagnostics and vaccines.

~~~
beerdoggie
Here is a decent place to start:
[http://www.bioinf.org.uk/abs/](http://www.bioinf.org.uk/abs/)

Happy to talk COVID-19 specific stuff offline if that does not suffice.

