Flow cytometry with a FACS can only shed light on a limited number of preselected markers on cells. This means that it allows no more than a glimpse of the immune system’s complexity. CyTOF, the next generation of flow cytometry, promises to reveal a whole chunk of the total picture – but only if the user is not overwhelmed by the data tsunami resulting from it.
The multidisciplinary Delft-Leiden Cytosplore project combines mass cytometry with data-mining, resulting in an effective computer visualisation of data analysis. It offers a time-saving algorithm as well as a made-to-measure application. This has already led to unprecedented insight into the heterogeneity of the immune cells at play in the gut.
Cytosplore is an interactive visual analysis system for understanding how the immune system works
A FACS can generally take into account a maximum of ten markers on a cell simultaneously. “It thus offers a limited view of the cell diversity at work,” says Leiden University Medical Center immunologist Vincent van Unen. “In order to solve fundamental health issues, we need to know more about the cell interaction in health and disease,” he explains. The invention of mass cytometry (CyTOF) three years ago allows for a much better view of the entire picture in one sample, as it takes many more markers into account. This means that, potentially, the roles of more and different immune cells at play in a system come to light. “Potentially,” says Van Unen, “because at the same time, conventional analysis no longer works.”
Translating the question
A tsunami of data hits the immunologist who tries to capitalise on the wealth of detail and insight the CyTOF promises. “The problem not only lies in the amount, but also in the complexity of the data,” explains Van Unen. “We’re talking about millions of data points, each describing forty features of a cell. Typically, you don’t know what to look for in this data. As a consequence, you won’t easily find differences between health and disease. For that, you need sophisticated software and a knowledge of mathematics. To use a metaphor: what use is a Formula 1 car, if you don’t know how to drive it?”
This is a problem for the whole CyTOF community. LUMC was the first centre in the Netherlands to have such a mass cytometer. There are now about forty in use worldwide. In Holland, Utrecht and Amsterdam have also now invested in this new technology. “Clearly, the key to success with mass cytometry is solving the big data problem,” Van Unen concludes. “Luckily, I have some basic knowledge of the math necessary for analysis. Not enough to handle the problem, but just enough to know there was more to get out of it and to formulate my question to computer scientists. This knowledge put me on the track of the t-distributed stochastic neighbour embedding (t-SNE) machine-learning algorithm that allows for 2D visualisation. Columbia University had used it successfully to analyse CyTOF data. I had assumed it was developed there, but Boudewijn Lelieveldt told me it was actually developed by Laurens van der Maaten, who was working at TU Delft at the time. So why not turn to the source?”
A screenshot of the Cytosplore application. From left to right, the visualisations show (a) a force-directed layout of the results of high-dimensional clustering using [SPADE], (b) an embedding (a scatterplot visualisation of dimensionality-reduced data) computed using our approximated version of [tSNE], and (c) a heat map view showing the median expressions of the computed clustering.
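To make the heat map view concrete: each row summarises one cluster by the median expression of every marker across the cells in that cluster. The following is a minimal sketch of that summary step only, not Cytosplore's own code; the function name `median_expression` and the toy data are hypothetical.

```python
from statistics import median


def median_expression(cells, labels):
    """Return {cluster label: [median of each marker]}, one heat map row per cluster."""
    by_cluster = {}
    for cell, label in zip(cells, labels):
        by_cluster.setdefault(label, []).append(cell)
    # zip(*members) turns the list of cells into one column per marker
    return {
        label: [median(column) for column in zip(*members)]
        for label, members in by_cluster.items()
    }


# Hypothetical toy data: five cells, three markers each (real data has around forty)
cells = [
    [0.1, 2.0, 5.0],
    [0.3, 2.2, 4.0],
    [0.2, 1.8, 6.0],
    [3.0, 0.0, 0.5],
    [5.0, 1.0, 1.5],
]
labels = ["A", "A", "A", "B", "B"]  # cluster assignment per cell, e.g. from a clustering step

heatmap = median_expression(cells, labels)
```

The median is used rather than the mean because it is less sensitive to a few cells with extreme marker values.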
A hundred times faster
At the Computer Graphics & Visualisation Group of TU Delft, Nicola Pezzotti was already developing (in the context of an STW project called VANPIRE) a way to make t-SNE faster and suitable for interactive visual analysis. “The immunology application was welcome as a case,” says Dr. Anna Vilanova, associate professor at the group. “Nicola was making the t-SNE engine faster, but users were still unable to work with this engine on its own. Thomas Höllt developed a lot of extra visualisation and interaction components around it to make it into an actual application, one that would enable immunologists to use the algorithm in practice and to define immune subsets.” Pezzotti then added approximation to t-SNE, turning it into A-tSNE. “A controllable approximation allows a fast overview of results. After this, you can zoom in on the most interesting part of the data. You can reduce or remove the approximation, and go back to the exact data in the relevant data subset.” This is a procedure, he says, that allows the user to work a hundred times faster, although the result is indistinguishable from the normal result.
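The idea of a controllable approximation can be illustrated with the neighbour search that t-SNE relies on. The article does not describe how A-tSNE approximates, so the sketch below swaps in a deliberately simple stand-in: scanning only a random sample of candidate cells. The `fraction` parameter and the function name are hypothetical; setting `fraction=1.0` removes the approximation and returns the exact neighbours.

```python
import math
import random


def nearest_neighbours(points, query_index, k, fraction=1.0, seed=0):
    """Return indices of the k points closest to points[query_index].

    fraction is a hypothetical approximation knob: 1.0 scans every
    candidate (exact); smaller values scan a random sample (faster, approximate).
    """
    candidates = [i for i in range(len(points)) if i != query_index]
    if fraction < 1.0:
        rng = random.Random(seed)
        sample_size = min(len(candidates), max(k, int(len(candidates) * fraction)))
        candidates = rng.sample(candidates, sample_size)
    query = points[query_index]
    candidates.sort(key=lambda i: math.dist(query, points[i]))
    return candidates[:k]


# Hypothetical data: 500 cells with forty features each, as in mass cytometry
rng = random.Random(42)
points = [[rng.random() for _ in range(40)] for _ in range(500)]

coarse = nearest_neighbours(points, 0, k=10, fraction=0.2)  # fast overview
exact = nearest_neighbours(points, 0, k=10, fraction=1.0)   # approximation removed
```

In this toy version the coarse pass computes a fifth of the distances. The workflow described above would use such a pass for the overview and rerun the exact search only on the subset the user zooms in on.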
Discover disease drivers
Van Unen used the CyTOF together with the t-SNE application for single-cell analysis to understand mucosal intestinal immune cell heterogeneity in health and disease. “Beforehand, a maximum of around twenty immune cell subsets were distinguished. In this new approach, the results of which were published in the May 2016 issue of Immunity, an incredible 142 immune cell subsets were distinguished.” Within a month, retweets of the article have reached 127,000 people.
It brought to light that samples from the blood and from the gut looked completely different. “In samples from patients, specific subsets indicating Coeliac Disease or Crohn’s Disease could be distinguished,” says Van Unen. “In future, it should even become possible to tell what type of inflammation is at play on the basis of a biopsy. In research, it may become possible to discover what specific disease drivers are, and subsequently to develop targeted therapy. It will also have significant impact on diagnosis and monitoring. In fact, it turned out that a lot more is possible than I could have imagined.”
The results of this project were published in the June 2016 issue of Computer Graphics Forum. “The key to our success,” says Anna Vilanova, “was definitely the integrative multidisciplinary approach. It’s not always easy to match biology and mathematics; you need the ability and affinity from both sides to get somewhere. But Cytosplore proves that crossing the boundaries of your own field enables great things: Medical Delta at its best.” Van Unen adds: “It’s a great team of people; that’s very important.”
The Cytosplore immunology research prototype has now been distributed to LUMC researchers involved in diabetes, cancer and parasitology research. For use outside LUMC, research licences are under consideration. Vilanova is already thinking ahead: “The project was dedicated to solving the problems related to the amount and complexity of data generated by mass cytometry. Thomas’s application is completely made-to-measure, but we can also build applications for other scientific problems. The A-tSNE could, for instance, prove its value in satellite imaging, document-mining as well as decision-making in finance. We welcome scientific big data problems.”
Interview by: Leendert van der Ent