
Tumor Specific Immune Receptors Computationally Identifiable - jostmey
http://cancerres.aacrjournals.org/content/79/7/1671.short?rss=1
======
jostmey
Author here. The immune system is known to participate in a cancer response.
With DNA sequencing, we can sequence the immune receptors in a tumor. From
each tissue sample, we get a _set_ of immune receptor sequences.

In this study, we compare immune receptor sequences from tumor tissue to
adjacent healthy tissue. Using "Multiple Instance Learning" we perform machine
learning on the sets of immune receptor sequences to identify immune receptors
that may be involved in the cancer response.

Reference:
[https://en.wikipedia.org/wiki/Multiple_instance_learning](https://en.wikipedia.org/wiki/Multiple_instance_learning)
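
To make the multiple-instance idea concrete, here is a minimal sketch. It is not the model from the paper: it assumes the standard multiple-instance setup in which a sample (a "bag") is scored by its single highest-scoring receptor, and the feature vectors, `weights`, and `bias` are made-up toy values.

```python
import math

def instance_score(features, weights, bias):
    """Logistic score for one receptor (one instance in the bag)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def bag_score(bag, weights, bias):
    """Score a whole tissue sample (bag) as the max over its receptors."""
    return max(instance_score(f, weights, bias) for f in bag)

# Hypothetical toy data: each receptor is reduced to a 2-number feature vector.
weights, bias = [2.0, -1.0], -1.0
tumor_bag = [[0.1, 0.9], [2.0, 0.0], [0.2, 0.5]]   # one receptor stands out
healthy_bag = [[0.1, 0.9], [0.2, 0.5]]             # same background, no outlier

tumor_p = bag_score(tumor_bag, weights, bias)
healthy_p = bag_score(healthy_bag, weights, bias)
print(tumor_p > healthy_p)
```

Only the sample-level label (tumor or healthy) is needed for training in this kind of setup; the receptor that wins the max is the candidate "involved" receptor.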

~~~
nielsbot
Thanks for posting. Can you summarize (for laypeople) what this potentially
means for cancer treatment, and how it would be better than current
state-of-the-art treatments?

~~~
jostmey
Right now we are focused on diagnosing cancer earlier and identifying the most
effective treatment options. Generally speaking, the earlier a cancer is
detected, the better the chances of survival.

This study demonstrates that we can distinguish tumor tissue from healthy
tissue based on the immune receptors present in the tissue. Our hope is that
we can then search for tumor-specific immune receptors to determine if a
patient has cancer, perhaps before a patient shows symptoms. In fact, we are
working on this right now.

~~~
Tharkun
How can you determine whether a patient has cancer before they show symptoms,
if you require a tissue sample? I would expect people only get tumors biopsied
after they realize they have one?

~~~
jostmey
We are currently in the process of checking if we can find disease signatures
in easier-to-access tissue like blood.

------
AtlasBarfed
I keep reading about various targeted immunotherapy papers with such amazing
potential.

It would seem our entire medical establishment is just waiting for simple
pills to come out of it, whereas the real "cure" for cancer would seem to be
something like a production version of this.

That means an organized labor pipeline: testing/sequencing of the patient,
computational analysis of the specific expression, then cancer therapy
targeted to the specific cancer case's biochemistry. It wouldn't be cheap, but
given the cost of drugs, it could still be efficiently delivered for a similar
cost, especially in managed health care systems.

Of course over time, "templates" of therapies for classes of cancer gene
expressions would converge. That would be ongoing optimization/efficiency.

But I have no confidence the American health system would deviate from its
culture of one-size-fits-all pill pushing or nothing.

------
Gatsky
Interesting work. The clonal frequency of the TCRs was included in the model.
Tumours often have more clonal TCRs, whereas normal tissues generally do not,
so I would imagine this would be an important feature (in fact including the
TCRB prevalence gave better predictions than the 4-mer prevalence for breast
cancer). Can't see in the paper that they ran the model without the frequency
feature, or even ran it with just the frequency feature. Building a very
simple model probably would have been a more salient comparator than shuffling
the labels.
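
As a rough illustration of what a frequency-only baseline could look like (an assumption on my part, not something from the paper): compute clonality as one minus normalized Shannon entropy of the clone counts, then threshold it. The `threshold` value and the counts below are hypothetical.

```python
import math

def clonality(counts):
    """1 - normalized Shannon entropy: 0 = perfectly even, near 1 = one dominant clone."""
    total = sum(counts)
    freqs = [c / total for c in counts if c > 0]
    if len(freqs) < 2:
        return 1.0  # a single clone is maximally clonal
    entropy = -sum(p * math.log(p) for p in freqs)
    return 1.0 - entropy / math.log(len(freqs))

def predict_tumor(counts, threshold=0.2):
    """Hypothetical one-feature classifier: call 'tumor' if clonality is high."""
    return clonality(counts) > threshold

even_sample = [10, 10, 10, 10]      # toy counts: no clone dominates
skewed_sample = [970, 10, 10, 10]   # toy counts: one expanded clone

print(predict_tumor(even_sample), predict_tumor(skewed_sample))
```

If something this simple got close to the reported accuracy, it would suggest the sequence features are adding little beyond clonal expansion.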

------
xiphias2
This sounds very interesting!

Can you explain why the tools/data to reproduce the research generally don't
get to github?

It looks like machine learning improvements could be made by people who don't
have lab access as well.

Is it some legal / financial reason, or is it just not common in biological
research?

------
twic
> Immune repertoire deep sequencing allows comprehensive characterization of
> antigen receptor–encoding genes in a lymphocyte population. [...] In this
> study, we developed statistical classifiers of T-cell receptor (TCR)
> repertoires that distinguish tumor tissue from patient-matched healthy
> tissue of the same organ.

It might be worth noting for those not familiar with the ins and outs of the
immune system that (and i'm remembering this from an undergraduate degree
almost twenty years ago, so E&OE):

* Lymphocytes are the white blood cells whose job is to recognise foreign molecular patterns that indicate invading pathogens, tumours, etc. B-lymphocytes recognise patterns in the shape of molecules, and T-lymphocytes (studied here) recognise patterns in the sequence of proteins.

* Each lymphocyte recognises a different pattern. Initially, at least - when a lymphocyte recognises something which is then classified as a threat, it multiplies itself so it can hunt it down and help destroy it.

* T-lymphocytes recognise things using a protein complex called the T-cell receptor (TCR) which sits on the surface of the cell, and checks out fragments of protein presented to it by other cells.

* So if they all use the same receptor, how do they recognise different patterns? Lymphocytes have this amazing mechanism where, during their initial production, they cut up and randomise key sections of the DNA for the genes encoding their receptors [1]. This is absolutely wild.

* TCRs work by having a peptide-binding cleft with a specific shape and surface properties, so that only selected fragments of protein will fit it, and it's the biophysical properties of the protein sequence making up the cleft that determines that specificity. For example, if you've got some positively-charged residues at one end of the cleft, then the cleft will only bind fragments with negatively-charged residues at that end. By randomising the sequence, you randomise the specificity.

So, if you look at the TCR genes of a population of T-lymphocytes which is
actively involved in fighting some threat, you will find that they are mostly
ones which encode recognition of patterns characteristic of that threat. What
this study does is come up with a way of fingerprinting a set of TCR genes so
that you can decide what kind of threat they are fighting.

To my naive mind, the two key ideas are (1) using some fancy machine learning
model and (2) representing the TCR genes as features using biophysical
properties of the proteins they encode, rather than their sequence, because
ultimately it's that that determines specificity.
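
A toy sketch of idea (2), under my own assumptions rather than the paper's actual encoding: chop the receptor's amino-acid sequence into overlapping 4-mers and replace each residue with a single biophysical number, here just its approximate charge at neutral pH (K and R positive, D and E negative, everything else treated as neutral). The sequences are made up.

```python
# Approximate side-chain charge at neutral pH; all other residues count as 0.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

def kmers(seq, k=4):
    """All overlapping windows of length k in the sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def charge_features(seq, k=4):
    """One feature vector per k-mer: the charge of each residue in the window."""
    return [[CHARGE.get(aa, 0) for aa in kmer] for kmer in kmers(seq, k)]

cdr3 = "CASSKDEF"  # hypothetical receptor fragment
features = charge_features(cdr3)

# Different letters, same charge pattern, so identical features.
same = charge_features("KDEF") == charge_features("REDL")
print(len(features), features[-1], same)
```

The point of the last comparison is that two receptors with different sequences but similar physical properties land on the same features, which a raw-sequence encoding would miss.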

The author is around, so hopefully he will correct me where i've gone wrong.

[1]
[https://en.wikipedia.org/wiki/T-cell_receptor#Generation_of_...](https://en.wikipedia.org/wiki/T-cell_receptor#Generation_of_the_TCR_diversity)

~~~
jostmey
You did an amazing job summing up the adaptive immune system and the research
we’ve done. I am surprised that our approach worked. There must be some common
signature shared among a subset of immune receptors between patients with the
same cancer type. This is surprising because each cancer patient's mutations
are probably unique.

