Communicating the Story of Protein Communities
Description: Don Kirkpatrick, PhD (VP & CTO, Interline Therapeutics) speaks on what we know about protein communities and their role in cells and human disease, with a key focus on communicating this science to people with all levels of scientific expertise.
Proteomics, Protein communities, Mass spectrometry, Drug discovery, Ubiquitin
This webinar was recorded at VISUALIZE 2021, a virtual BioRender event dedicated to advancing communication in science.
We're honored to welcome Dr. Don Kirkpatrick as our first guest speaker, who actually was a very early BioRender champion and had even invited Shiz and Ryan, two of our three co-founders, out to the Bay Area for workshops at Genentech, where he was working at the time. He really did believe in BioRender from the earliest days and gave us a ton of feedback on how to improve BioRender years ago, and so we certainly thank him for that. By way of background, Dr. Kirkpatrick has studied proteins, specifically ubiquitin, for over 20 years. His academic career through biochemistry and pharmacology led him to a postdoctoral fellowship at Harvard Medical School. Afterward, he brought his experience to Genentech to develop MS proteomics technologies for drug discovery. Currently, he is the VP and CTO of Interline Therapeutics, and today we are excited to hear Dr. Kirkpatrick speak about protein communities and his thoughts on how to communicate the importance of this research to people of all scientific backgrounds. So please join me in giving Dr. Don Kirkpatrick a warm welcome.
All right, well, thank you for the wonderful introduction. It's really a pleasure to be here to tell you the origin story of Interline Therapeutics and really the story of protein communities. I'm also really excited to be here with the BioRender group. It's been such a pleasure watching them grow from the work of Ryan, Shiz, and all the BioRender team to really build tools that can help those of us that do science to communicate our stories.
So the story I want to tell you today starts with the average human cell, and this is a scanning electron micrograph of a human macrophage. The average human cell contains 10 billion protein molecules, which is nearly equivalent to the total number of human beings on Earth. Now, it's interesting to think about the concentration of proteins within a given cell, which is really high, close to 100 mg per mL. If you wanted an analogy, that would be the equivalent of zooming down to Earth and looking at a packed stadium of people where all of the individuals were bumping into each other, where there were families and groups, people that work together, and so on and so forth.
But, of course, as we know, just as with human communities, there are forces at play in the environment, forces such as infection, that can dramatically alter the way that communities behave, their actions, their functions, and the places where they aggregate and accumulate. And that's really the case with proteins as well.
So, as I said, the average human cell contains 10 billion proteins at a very high concentration, almost like pancake syrup. Most proteins function in complex protein communities; they have a bunch of different relationships that I'm going to tell you about in the next little bit. And yet, in spite of this, drug discovery historically has focused on individual proteins in isolation. It has been performed in experiments at protein concentrations on the order of one mg per mL, as compared to nearly a hundred, and has by and large utilized simplistic single-endpoint readouts to understand the functions of otherwise very complex processes. So, one of the premises here at Interline is that we can improve the success of drug discovery by understanding protein communities, and that's what I'd like to tell you more about today.
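As a rough sanity check on how those two numbers relate, here is a minimal back-of-the-envelope sketch. It takes the 10 billion molecules and the roughly 100 mg per mL from the talk; the mean protein mass of 50 kDa and the function name are assumptions added for illustration and were not stated by the speaker.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def implied_cell_volume_pl(n_proteins, mean_mass_da, conc_mg_per_ml):
    """Volume (in picoliters) that holds n_proteins at the given concentration.

    mean_mass_da is an ASSUMED average protein mass in daltons (g/mol).
    """
    mass_g = n_proteins * mean_mass_da / AVOGADRO  # total protein mass in grams
    mass_mg = mass_g * 1e3                         # grams -> milligrams
    volume_ml = mass_mg / conc_mg_per_ml           # mg / (mg/mL) = mL
    return volume_ml * 1e9                         # 1 mL = 1e9 pL

# Figures from the talk: 10 billion proteins at ~100 mg/mL.
# Assumption (not from the talk): mean protein mass of 50 kDa.
in_cell = implied_cell_volume_pl(1e10, 50_000, 100.0)

# The same number of molecules at an assay-like 1 mg/mL needs 100x the volume.
in_assay = implied_cell_volume_pl(1e10, 50_000, 1.0)

print(f"in-cell volume: {in_cell:.1f} pL, at 1 mg/mL: {in_assay:.0f} pL")
```

Under these assumptions the implied volume comes out at about 8 picoliters, which is a single-cell scale, while the same proteins at 1 mg per mL would occupy a hundred times more space.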
By way of origin story for myself and how I got to thinking about this intersection of protein communities and drug discovery, I started with getting a bachelor's degree in Biochemistry at the University of Oklahoma. Really, what I would say is that, as with so many of us, our scientific careers are shaped by the mentors that we engage with early on in our careers. The first of those for me was Dr. Pete Smith, who I met at the Society of Toxicology, and who got me my very first internship and really encouraged me to explore science and drug discovery, that there were important and interesting discoveries to be made there. He had gone to the University of Oklahoma and encouraged me to join the Pharm/Tox Department there, where I met really the second scientific mentor that played a key role in my development, Professor Jay Gandolfi. Jay gave me the freedom to really start to explore and understand proteins. I became aware of this protein that I have spent the last 20 years thinking about. I'd argue that it's the most important protein in all human cells; it's the protein ubiquitin.
We started early on trying to use a new technology to study proteins, and this was the mass spectrometer. And it was this work that led me to a postdoctoral fellowship in Steve Gygi's lab at Harvard Medical School. And the reason that I did that had a lot to do with this paper here, published by a now longtime friend and colleague, Junmin Peng, who was at the time a postdoc in Steve's lab. They identified 1072 proteins that were modified by ubiquitin, which is in contrast to my own PhD thesis work, which had 23. When you see something like that, a technological advance that so outstrips anything you had thought possible, you can imagine how important it was for me to really go and understand the new technologies that were coming into play and how they could be used to explore the biology that we were interested in.
I went on to do that in Steve's lab, focusing on ubiquitin and the protein communities that it controlled. At the time, we really didn't appreciate that they were communities in and of themselves. You know, we studied the ubiquitination of proteins involved in the cell cycle, like cyclin D, or important drug targets such as the epidermal growth factor receptor (EGFR). That led me on to Genentech, where I spent many years again trying to elucidate protein communities that were relevant to human disease and drug discovery. I worked with a wonderful team in the discovery proteomics group there for all of that time, again building technologies to understand human cells, the processes that go wrong in disease, and the ways that we can intervene with medicine to help patients.
One of the first projects that we did during my time there was to work on a protein known as LRRK2, a kinase that's associated with Parkinson's disease. That protein really, in addition to teaching us some interesting things about the ways that disease-associated variants alter extracellular function, also brought me into contact with somebody that I'll introduce you to here in a moment, Zach Sweeney, who plays the CEO role here at Interline Therapeutics. We worked during more recent times on using mass spectrometry to study mutant proteins, specifically mutant oncogenic proteins that drive cancer, and we were able to demonstrate with the mass spectrometer that those mutant proteins could be selectively degraded by drugs and in doing so have a profound impact on the viability of cancer cells.
Ubiquitin is a really important protein that controls a vast number of different processes within cells, and one of the places where it plays a key role is in mitochondria, the powerhouse of the cell. A mitochondrion has about a thousand proteins inside of it, and all but 13 of those proteins must be imported across the mitochondrial membrane, and ubiquitin again plays a key role there. Mass spectrometry helped us to understand many of the key proteins involved in that process.
There's a ubiquitin-like process or pathway known as autophagy that plays out in many different cells, and one of the things that it does is help cells to manage when they get infected with pathogens, and we wanted to understand how the autophagy pathway and innate immune signaling were playing out there.
I call out this paper for a couple of reasons. This is really the first one where I made a BioRender figure, and we ultimately published it, and also a place where the technology demonstrated the vastness of the proteome space and our ability to understand what these proteins were doing in rich detail.
I'm going to talk more about that in a couple of minutes. And, lastly, there was some technology work that was done to really start to explore dynamic protein communities, again a story that I'm going to come back to in a few minutes. Now, this work got the attention of a number of people, including Zach Sweeney, Nick Galley, on Saha, and the group that was coming together here at Interline Therapeutics with the idea that protein communities could indeed be the center of drug discovery and that we could put technologies in place to help us to explore those and understand their function in a way that would really help to develop medicines more effectively, more efficiently, and more successfully.
So let's talk a little bit more about protein communities. What are these, and what do they mean, and how does proteomics help us in this context? Let's start with the proteome. The proteome is the full composition of all proteins that exist within cells and tissues. If you went back to that human macrophage or any other individual cell, there are about 10,000 to 11,000 different genes that are expressing 10,000 to 11,000 different proteins, some of which are present at a million or more copies per cell, some of which may be present at 1, 10, or 100 copies per cell, adding up to these 10 billion protein molecules. The proteome of a human cell is a reflection of a variety of different inputs. There's the genotype of that cell. There's the cell type: is it a human cardiomyocyte, is it a hepatocyte from the liver, is it an immune cell like a macrophage? There's the environment that cell resides within: is inflammation going on, is there tissue injury or something else? And of course, there's perturbation status: has it been exposed to drugs or other toxic chemicals? The proteome remodels itself in all of these contexts.
So that's the proteome, and proteomics is really the prism through which we try to understand the proteome. Mass spectrometry is one of many tools that give us the opportunity to understand proteins, their functions, and activities, and we started to think about these as protein communities, really relationships between groups of proteins, kind of dimensions or slices of the proteome through which you can get an understanding of what's happening.
Among the protein communities, there are physical communities, proteins that physically come into contact with one another. You might imagine members of a family, for example. Functional protein communities are those that act in a single process. You might imagine those people that you work together with in the job that you do. There are spatial protein communities. They're not physically or functionally related, but they happen to reside in the same place, maybe like groups of people who live in the same apartment building. And then there are co-regulatory communities, a little bit more of an abstract concept, but you might equate this to people who follow the same political or religious affiliation, in that they're driven to functions by some higher-order process.
So this is how we view the proteome and protein communities, and a premise of Interline Therapeutics is that the current approaches do not allow us to understand sufficiently well the molecular mechanisms at play within protein communities. There's a wide range of different technologies being used in drug discovery, such as imaging, characterizing mRNA through RNA sequencing, and metabolites through metabolomics. However, the vast majority of drug targets are proteins. By focusing at that level and understanding not just a single protein but complex communities of proteins in all of their interactions, physical, functional, and spatial, we have a better chance to understand the disease process and the way that our medicines will work. We believe that proteomics is poised to define molecular mechanisms for genetically validated therapeutic targets.
This slide shows how protein complexes or larger protein communities play an important role in the success or failure of a molecule. There are targets where there are genetic-associated variants, the complexes that they participate in, as well as the modulators of those individual proteins in the disease areas. The idea is that if we put proteomics at the center of drug discovery, we can expand from the light blue box to the dark blue box and bring more therapeutic targets into play where we can take action.
We believe that best-in-class medicines are defined by their impact on protein interactions. This is increasingly the case with a variety of successful and unsuccessful drug discovery programs across biotech and pharma. The majority of work is done to try to inhibit enzymes or the catalytic sites of proteins, and this has worked by and large for the successful drugs. But in cases where it hasn't, there are a number of reasons why. For example, the processes that are driven not by catalysis, not by the activity or the function of an enzyme, but rather by two proteins interacting with one another are often not in play. And then there's the rare case, and this plays out in the context of RAF, an important oncogene or protein in cancer, and that is that inhibition of this protein can at times lead to activation of its partner proteins in a way that's deleterious to the cell and ultimately to the individual. Understanding protein targets and their protein communities is really essential to making medicines that will work earlier, faster, and more safely.
Now, proteomics is in a unique position, and this really is a technology that allows us to explore really all aspects of proteins. We can, therefore, use it at all stages of drug discovery: whether we're looking at relatively early in vitro studies to try to discover new mechanisms and identify new lead molecules, whether we're taking those molecules into cells and trying to understand their activities, ultimately trying to develop better candidate molecules that we take to the clinic, or in later stages with in vivo and tissue studies as we ultimately try to establish clinical biomarkers that will tell us when the medicine is working and is confirmed to be safe. Proteomics is poised to improve the success of all three phases of drug discovery.
I'll call out here that this is one of the many figures that we're making inside of Interline using the BioRender platform as a way to communicate this to our new team that's coming together, as well as to the many stakeholders that are overseeing the company and are interested in the things that we're doing here.
So, these are the three elements of Interline:
We want to identify genetic variants that are associated with disease and that have a connection with protein communities. We want to determine how disease-associated variants change protein communities, and then we want to discover drugs that can correct those communities. There is a vast amount of data out there, some two million fully sequenced human genomes, and ultimately we want to translate that information into effective disease treatments for patients.
The paradigm that we've established is genes, maps, modulators. We're going to identify genetic variants, we're going to map their protein communities, and we're going to make modulators of those communities with the idea that in some cases those molecules that we make, those modulators, will be effective therapeutic medicines. Now, again, thinking about this paradigm in words is one thing, but putting it into pictures allows us to really explore it at a whole new level of detail. You know, there's the idea of a gene that expresses a single protein, but in fact, a protein participates as a part of a larger complex, maybe with three, four, five, ten other proteins, all of which are coming from their respective genes, and maybe only one of which is mutated or a variant. In other cases, maybe multiple members of that complex display genetic variants in the same disease. It often happens.
When we think about maps, this is a map of the Cullin E3 ubiquitin ligase family of proteins, and all of those other proteins that associate with the Cullin scaffolds. But you can imagine again that this is a complex community that doesn't exist in every cell, and when and how they change their function and activity and location really matters. And then of course, there's the idea of modulators, that we're going to put a molecule into a distinct place on a distinct protein and expect that it has a distinct activity, all of which we intend to read out biochemically and structurally with the technologies that are in place.
So, this is what we're doing. We've established a growing scientific and leadership team here at Interline, a world-class group that combines leadership experience with diverse biological, analytical, and computational expertise. We're continuing to grow; we actually hit 27 people this week, so we're very excited to really have a team in place that can really advance the science and ultimately the development of new medicines for common diseases together here in South San Francisco. And as that team's coming together, we're spending a good amount of time designing the right experiments and communicating how these experiments can impact drug discovery. And that is really the place where BioRender has been playing out again and again for us. So, this is one of the figures that describes a common experiment that we're going to do here at Interline, one of the many, where we're going to have human cell lines and we're going to treat them with stimuli, in this case an immune stimulus, across a time series that runs from minutes out to hours.
The complexity of these experiments is really immense because we're then going to take all of the proteins from all those samples and mix them together into a single sample, but the proteins from each sample will be barcoded first. The pooled sample will then be analyzed in the mass spectrometer, and at the end, we'll use our bioinformatics and computational software to pull that data back apart to get quantitative barcodes for tens of thousands of features within the proteome. In fact, we can readily identify 8,000 proteins and more than 25,000 post-translational modifications or features within the proteome in rich time-course and dose-response experiments even today. The expectation is that when applied systematically to understand the model systems used in drug discovery, it is going to have an impact on our ability to succeed and develop the right molecules.
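The pool-then-pull-apart logic described here can be illustrated with a small sketch. This is a toy model under stated assumptions, not the actual Interline or Genentech pipeline: the function names, the sample labels, and the abundance values are all hypothetical, and real barcoding chemistry and mass spectrometry signal processing are not modeled.

```python
from collections import defaultdict

def pool_samples(samples):
    """Mix barcoded samples into one flat list of (feature, barcode, abundance).

    samples: {barcode: {feature: abundance}}, one barcode per time point.
    """
    pooled = []
    for barcode, features in samples.items():
        for feature, abundance in features.items():
            pooled.append((feature, barcode, abundance))
    return pooled

def demultiplex(pooled):
    """Pull the pooled measurements back apart into per-feature barcode channels."""
    by_feature = defaultdict(dict)
    for feature, barcode, abundance in pooled:
        channels = by_feature[feature]
        channels[barcode] = channels.get(barcode, 0.0) + abundance
    return dict(by_feature)

def relative_profile(channels, reference):
    """Express each barcode channel relative to a reference channel."""
    ref = channels[reference]
    return {barcode: value / ref for barcode, value in channels.items()}

# Hypothetical time course: one protein feature measured at three time points.
samples = {
    "t0":     {"protein_A": 10.0},
    "t30min": {"protein_A": 20.0},
    "t3h":    {"protein_A": 5.0},
}
profiles = demultiplex(pool_samples(samples))
print(relative_profile(profiles["protein_A"], reference="t0"))
```

Each feature ends up with one quantitative value per barcode, which is the kind of per-feature time-course profile the talk refers to.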
So again, I go back to the work that I mentioned in an earlier slide talking about the autophagy pathway and this gene ATG16L1. My colleague Aditya Murthy had been interested for quite some time in its role in controlling infection by intracellular pathogens such as Shigella flexneri. So we did a global protein experiment just like I said and ended up getting back close to 40,000 distinct pieces of quantitative data about the proteome and the way that it changed between wild-type cells and cells that lacked ATG16L1, or lacked the autophagy pathway, and also between cells that were uninfected or infected for one or three hours with this important pathogen.
Through that process, we were able to identify and publish thousands of different protein profiles through these quantitative data. And again, as you start looking at data like this, visualization is absolutely critical. One cannot consume all the information in a single pass, and the ability to drive around and swim in the data is critical. Here I want to call out really some wonderful work that's been done by the folks at PerkinElmer with the TIBCO Spotfire tool. So these data are available in interactive Spotfire dashboards that are present on the PerkinElmer website in their analytics resource center. I point you to that at the link at the bottom of the talk so that you can see how these kinds of interactive visualizations complement the really beautiful illustrations that one can make to describe the experiments that we're doing. The insights into these sort of multi-scale disease maps really come to life through interactive exploration made possible by interactive visualization tools.
So, I want to close the talk in the last couple of minutes by talking about the work that was done by this really talented graduate student and now medical resident, Kurt Reichermeier. Kurt came to my lab at Genentech midway through his PhD at Caltech. He was in the lab of Ray Deshaies and joined my group together with Ingrid Wertz when Ray went on to take the leadership position at Amgen. Kurt's project was to study proteins that were responsible for taking ubiquitin and putting it onto other proteins, attaching or conjugating these proteins together. Specifically, he was interested in the CRL4, or Cullin-4 RING ubiquitin ligase, family, to understand all of the dynamic interactions of these proteins and how those ultimately play out in the context of cells and ultimately disease.
So, Cullin-4 RING ligases are very complex molecular machines. They have a scaffold down here in red, the Cullin, as well as RBX1, and then up in orange what they have is an adapter protein that's essentially like the piece that holds the bit on the end of a drill. The bits are the DCAFs; they're exchangeable proteins that are responsible for recruiting in substrates so that this molecular machine can act upon them. And indeed, that's what it does: it brings in a substrate, applies ubiquitin to it, and then at the right time releases it so that it can then act on the next protein. The question is how do these molecular machines remodel themselves in order to act on whatever protein substrates are present at any moment in time?
Kurt used a variety of different tools to dissect all of the complex components of this system. In the first layer, he was effectively counting the numbers of molecules of each of the proteins, up to 30 different proteins within human cells, and then pulling down the complexes and again counting the molecules that were present within the complexes that he was able to enrich. He showed, for example, that the scaffold, the Cullin, was approximately as abundant as RBX1, its sort of obligate partner, and demonstrated that the adapter protein DDB1 was present at just slightly lower abundance and nearly equivalent to the sum of all of the DCAFs that came down. Then he was able to dissect all of the different DCAF proteins, of which there are approximately 30, and demonstrate that cereblon, the target of the IMiD molecules in multiple myeloma, as well as of the thalidomide molecule which had such terrible effects many, many decades ago, was the most abundant of the DCAF proteins within the cell lines that he was studying. Then if you went down the list, at 100 times lower abundance were interesting and important proteins such as this DCAF15. Yet even for a protein that was 100 times less abundant than the most abundant of the DCAFs, putting a drug into the cells that engages DCAF15 and draws in its substrate, this protein RBM39, very specifically leads to the recruitment and the assembly of DCAF15 onto an active Cullin-4 RING ligase complex so that it can perform its job and ultimately degrade or destroy the RBM39 protein.
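The kind of comparison described here, ranking substrate receptors by copy number and checking whether the adapter roughly matches the sum of its receptors, can be sketched in a few lines. The copy numbers below are invented placeholders chosen only to mirror the 100-fold gap mentioned in the talk; they are not measured values, and the function names and the "other_DCAFs" entry are hypothetical.

```python
def fold_below_top(copies):
    """For each protein, how many times less abundant it is than the top one."""
    top = max(copies.values())
    return {name: top / n for name, n in copies.items()}

def adapter_to_receptor_ratio(adapter_copies, receptor_copies):
    """Ratio of adapter copies to the summed copies of all its receptors."""
    return adapter_copies / sum(receptor_copies.values())

# HYPOTHETICAL copy numbers per cell, for illustration only.
dcaf_copies = {"CRBN": 100_000, "DCAF15": 1_000, "other_DCAFs": 49_000}
ddb1_copies = 150_000  # hypothetical adapter copy number

print(fold_below_top(dcaf_copies)["DCAF15"])
print(adapter_to_receptor_ratio(ddb1_copies, dcaf_copies))
```

A ratio near one in the second calculation would correspond to the observation that the adapter is nearly equivalent in abundance to the sum of the receptors it can hold.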
So it's these kinds of experiments that we believe really can be brought to bear on project after project after project, and that's what we're going to do here at Interline. Really, we believe that these new technologies will differentiate our platform.
So we're going to start with human genetic sequencing data that will point us to the likely therapeutic targets that we'll want to be acting upon. We're going to then take advantage of sequence-to-interactome prediction and active machine learning to really go and predict protein interactions that will then be validated by experimental mass spectrometry proteomics as well as non-mass-spec proteomics tools. We'll take those data and move them into extended molecular dynamics simulations to understand the protein relationships, and then giga-docking and binding affinity prediction so we can refine and optimize the molecules that will act upon those. Now, all of these are going to be absolutely essential components of our pipeline, but we believe that the key differentiators will be these first two: sequence-to-interactome prediction with active learning, as well as the mass spectrometry proteomics that's going to be brought to bear on each and every project.
And that's what we're focusing on, bringing that technology platform into play for project after project and doing so in the context of both inflammatory disease and oncology. There are many genetic drivers of disease that have been identified, protein communities are starting to be elucidated, and we're validating many of those here. Ultimately there's a high unmet medical need and the opportunity for precision medicine that will play out as we start to explore protein communities and understand the ways that they work.
So there, I want to stop. I want to thank a number of different people, certainly our growing team here at Interline, particularly Zach Sweeney for convincing me that this was the moment and the opportunity for me to take the leap and really jump in and try to build this company together with the group that's here. I also want to thank the wonderful team at Genentech that I worked with for 13-plus years to do science and explore drug discovery. And lastly, kind of as a special shout-out, today's my mom's 70th birthday, and very rarely do parents get to see their children present scientifically, but today she jumped into the audience even on her birthday on vacation in Hawaii, and so I want to call out happy birthday and I love you to my mom. Thank you, everybody.