Computational cohort materials from March 2023 Ghana training

Inference of cell–cell communication from single-cell RNA sequencing data is a powerful technique to uncover intercellular communication pathways, yet existing methods perform this analysis at the level of the cell type or cluster, discarding single-cell-level information. Here we present Scriabin, a flexible and scalable framework for comparative analysis of cell–cell communication at single-cell resolution that is performed without cell aggregation or downsampling. We use multiple published atlas-scale datasets, genetic perturbation screens and direct experimental validation to show that Scriabin accurately recovers expected cell–cell communication edges and identifies communication networks that can be obscured by agglomerative methods. Additionally, we use spatial transcriptomic data to show that Scriabin can uncover spatial features of interaction from dissociated data alone. Finally, we demonstrate applications to longitudinal datasets to follow communication pathways operating between timepoints. Our approach represents a broadly applicable strategy to reveal the full structure of niche–phenotype relationships in health and disease.

High-throughput phenotypic screens leveraging biochemical perturbations, high-content readouts, and complex multicellular models could advance therapeutic discovery yet remain constrained by limitations of scale. To address this, we establish a method for compressing screens by pooling perturbations followed by computational deconvolution. Conducting controlled benchmarks with a highly bioactive small molecule library and a high-content imaging readout, we demonstrate increased efficiency for compressed experimental designs compared to conventional approaches. To prove generalizability, we apply compressed screening to examine transcriptional responses of patient-derived pancreatic cancer organoids to a library of tumor-microenvironment (TME)-nominated recombinant protein ligands. Using single-cell RNA-seq as a readout, we uncover reproducible phenotypic shifts induced by ligands that correlate with clinical features in larger datasets and are distinct from reference signatures available in public databases. In sum, our approach enables phenotypic screens that interrogate complex multicellular models with rich phenotypic readouts to advance translatable drug discovery as well as basic biology.

T cells have a central role in adaptive immune responses. However, no accurate assays currently exist that link measurements of ex vivo or in vitro function to effective in vivo T cell responses. Diagnostic detection of T cell function in infectious and immune-mediated diseases also lags in vitro assessments of antibody function. An improved understanding of T cell responses will help researchers and clinicians better predict immune outcomes in response to vaccines, pathogenic infections or immune-mediated diseases. To address these issues, the National Institute of Allergy and Infectious Diseases (NIAID) convened the ‘T Cell Technologies: Assays, Innovations, Challenges, and Opportunities Workshop’ on 15–16 June 2022. The goals of the workshop were to explore assays and technologic advances that could improve understanding of T cell activation and function in different immune conditions, tissues and infections, and to identify methodologies that best provide an accurate measure of T cell biological relevance.

Cynomolgus macaque (Macaca fascicularis) is an attractive animal model for the study of human disease and is extensively used in biomedical research. Cynomolgus macaques share behavioral, physiological, and genomic traits with humans and recapitulate human disease manifestations not observed in other animal species. To improve the use of the cynomolgus macaque model to investigate immune responses, we defined and characterized the T cell receptor (TCR) repertoire. We identified and analyzed the alpha (TRA), beta (TRB), gamma (TRG), and delta (TRD) TCR loci of the cynomolgus macaque. The expressed repertoire was determined using 22 unique lung samples from Mycobacterium tuberculosis infected cynomolgus macaques by single cell RNA sequencing. Expressed TCR alpha (TRAV) and beta (TRBV) variable region genes were enriched and identified using gene specific primers, which allowed their functional status to be determined. Analysis of the primers used for cynomolgus macaque TCR variable region gene enrichment showed they could also be used to amplify rhesus macaque (M. mulatta) variable region genes. The genomic organization of the cynomolgus macaque has great similarity with the rhesus macaque and they shared > 90% sequence similarity with the human TCR repertoire. The identification of the TCR repertoire facilitates analysis of T cell immunity in cynomolgus macaques.

T cell receptor (TCR) clonotype tracking is a powerful tool for interrogating T cell mediated immune processes. New methods to pair a single cell’s transcriptional program with its TCR identity allow monitoring of T cell clonotype-specific transcriptional dynamics. While these technologies have been available for human and mouse T cell studies, they have not been developed for rhesus macaques (RM), a critical translational organism for autoimmune diseases, vaccine development and transplantation. We describe a new pipeline, ‘RM-scTCR-Seq’, which, for the first time, enables RM-specific single-cell TCR amplification, reconstruction and pairing of RM TCRs with their transcriptional profiles. We apply this method to an RM model of GVHD, identify and track in vitro detected alloreactive clonotypes in GVHD target organs, and explore their GVHD-driven cytotoxic T cell signature. This novel, state-of-the-art platform fundamentally advances the utility of RM to study protective and pathogenic T cell responses.

The cellular composition of barrier epithelia is essential to organismal homoeostasis. In particular, within the small intestine, adult stem cells establish tissue cellularity, and may provide a means to control the abundance and quality of specialized epithelial cells. Yet, methods for the identification of biological targets regulating epithelial composition and function, and of small molecules modulating them, are lacking. Here we show that druggable biological targets and small-molecule regulators of intestinal stem cell differentiation can be identified via multiplexed phenotypic screening using thousands of miniaturized organoid models of intestinal stem cell differentiation into Paneth cells, and validated via longitudinal single-cell RNA-sequencing. We found that inhibitors of the nuclear exporter Exportin 1 modulate the fate of intestinal stem cells, independently of known differentiation cues, significantly increasing the abundance of Paneth cells in the organoids and in wild-type mice. Physiological organoid models of the differentiation of intestinal stem cells could find broader utility for the screening of biological targets and small molecules that can modulate the composition and function of other barrier epithelia.

Prognostically relevant RNA expression states exist in pancreatic ductal adenocarcinoma (PDAC), but our understanding of their drivers, stability, and relationship to therapeutic response is limited. To examine these attributes systematically, we profiled metastatic biopsies and matched organoid models at single-cell resolution. In vivo, we identify a new intermediate PDAC transcriptional cell state and uncover distinct site- and state-specific tumor microenvironments (TMEs). Benchmarking models against this reference map, we reveal strong culture-specific biases in cancer cell transcriptional state representation driven by altered TME signals. We restore expression state heterogeneity by adding back in vivo-relevant factors and show plasticity in culture models. Further, we prove that non-genetic modulation of cell state can strongly influence drug responses, uncovering state-specific vulnerabilities. This work provides a broadly applicable framework for aligning cell states across in vivo and ex vivo settings, identifying drivers of transcriptional plasticity and manipulating cell state to target associated vulnerabilities.

Blood samples are frequently collected in human studies of the immune system but poorly represent tissue-resident immunity. Understanding the immunopathogenesis of tissue-restricted diseases, such as chronic hepatitis B, necessitates direct investigation of local immune responses. We developed a workflow that enables frequent, minimally invasive collection of liver fine-needle aspirates in multi-site international studies and centralized single-cell RNA sequencing data generation using the Seq-Well S3 picowell-based technology. All immunological cell types were captured, including liver macrophages, and showed distinct compartmentalization and transcriptional profiles, providing a systematic assessment of the capabilities and limitations of peripheral blood samples when investigating tissue-restricted diseases. The ability to electively sample the liver of chronic viral hepatitis patients and generate high-resolution data will enable multi-site clinical studies to power fundamental and therapeutic discovery.

Crohn’s disease is an inflammatory bowel disease (IBD) which most often presents with patchy lesions in the terminal ileum and colon and requires complex clinical care. Recent advances in the targeting of cytokines and leukocyte migration have greatly advanced treatment options, but most patients still relapse and inevitably progress. Although single-cell approaches are transforming our ability to understand the barrier tissue biology of inflammatory disease, comprehensive single-cell RNA-sequencing (scRNA-seq) atlases of IBD to date have largely sampled pre-treated patients with established disease. This has limited our understanding of which cell types, subsets, and states at diagnosis are predictive of disease severity and response to treatment. Here, through a combined clinical, flow cytometric, and scRNA-seq study, we profile diagnostic human biopsies from the terminal ileum of treatment-naive pediatric patients with Crohn’s disease (pediCD; n=14) and from non-inflamed pediatric controls with functional gastrointestinal disorders (FGID; n=13). To fully resolve and annotate epithelial, stromal, and immune cell states among the 201,883 single-cell transcriptomes, we develop and deploy a principled and unbiased tiered clustering approach, ARBOL, yielding 138 FGID and 305 pediCD end cell clusters. Notably, through both flow cytometry and scRNA-seq, we observe that at the level of broad cell types, treatment-naive pediCD is not readily distinguishable from FGID in cellular composition. However, by integrating high-resolution scRNA-seq analysis, we identify significant differences in cell states that arise during pediCD relative to FGID. 
Furthermore, by closely linking our scRNA-seq analysis with clinical meta-data, we resolve a vector of lymphoid, myeloid, and epithelial cell states in treatment-naive samples which can distinguish patients with less severe disease (those not on anti-TNF therapies (NOA)), from those with more severe disease at presentation who require anti-TNF therapies. Moreover, this vector was also able to distinguish those patients that achieve a full response (FR) to anti-TNF blockade from those more treatment-resistant patients who only achieve a partial response (PR). Our study jointly leverages a treatment-naive cohort, high-resolution principled scRNA-seq data analysis, and clinical outcomes to understand which baseline cell states may predict inflammatory disease trajectory.

SARS-CoV-2 infection can cause severe respiratory COVID-19. However, many individuals present with isolated upper respiratory symptoms, suggesting potential to constrain viral pathology to the nasopharynx. Which cells SARS-CoV-2 primarily targets and how infection influences the respiratory epithelium remain incompletely understood. We performed scRNA-seq on nasopharyngeal swabs from 58 healthy and COVID-19 participants. During COVID-19, we observe expansion of secretory cells, loss of ciliated cells, and epithelial cell repopulation via deuterosomal expansion. In mild/moderate COVID-19, epithelial cells express anti-viral/interferon-responsive genes, while cells in severe COVID-19 have muted anti-viral responses despite equivalent viral loads. SARS-CoV-2 RNA+ host-target cells are highly heterogeneous, including developing ciliated, interferon-responsive ciliated, AZGP1high goblet, and KRT13+ “hillock”-like cells, and we identify genes associated with susceptibility, resistance, or infection response. Our study defines protective and detrimental responses to SARS-CoV-2, the direct viral targets of infection, and suggests that failed nasal epithelial anti-viral immunity may underlie and precede severe COVID-19.

COVID-19, caused by SARS-CoV-2, can result in acute respiratory distress syndrome and multiple-organ failure, but little is known about its pathophysiology. Here, we generated single-cell atlases of 23 lung, 16 kidney, 16 liver and 19 heart COVID-19 autopsy donor tissue samples, and spatial atlases of 14 lung donors. Integrated computational analysis uncovered substantial remodeling in the lung epithelial, immune and stromal compartments, with evidence of multiple paths of failed tissue regeneration, including defective alveolar type 2 differentiation and expansion of fibroblasts and putative TP63+ intrapulmonary basal-like progenitor cells. Viral RNAs were enriched in mononuclear phagocytic and endothelial lung cells which induced specific host programs. Spatial analysis in lung distinguished inflammatory host responses in lung regions with and without viral RNA. Analysis of the other tissue atlases showed transcriptional alterations in multiple cell types in COVID-19 donor heart tissue, and mapped cell types and genes implicated with disease severity based on COVID-19 GWAS. Our foundational dataset elucidates the biological impact of severe SARS-CoV-2 infection across the body, a key step towards new treatments.

Angiotensin-converting enzyme 2 (ACE2) and accessory proteases (TMPRSS2 and CTSL) are needed for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cellular entry, and their expression may shed light on viral tropism and impact across the body. We assessed the cell-type-specific expression of ACE2, TMPRSS2 and CTSL across 107 single-cell RNA-sequencing studies from different tissues. ACE2, TMPRSS2 and CTSL are coexpressed in specific subsets of respiratory epithelial cells in the nasal passages, airways and alveoli, and in cells from other organs associated with coronavirus disease 2019 (COVID-19) transmission or pathology. We performed a meta-analysis of 31 lung single-cell RNA-sequencing studies with 1,320,896 cells from 377 nasal, airway and lung parenchyma samples from 228 individuals. This revealed cell-type-specific associations of age, sex and smoking with expression levels of ACE2, TMPRSS2 and CTSL. Expression of entry factors increased with age and in males, including in airway secretory cells and alveolar type 2 cells. Expression programs shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues included genes that may mediate viral entry, key immune functions and epithelial–macrophage cross-talk, such as genes involved in the interleukin-6, interleukin-1, tumor necrosis factor and complement pathways. Cell-type-specific expression patterns may contribute to the pathogenesis of COVID-19, and our work highlights putative molecular pathways for therapeutic intervention.

The SARS-CoV-2 pandemic has caused over 1 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome, or direct complications resulting in multiple-organ failures. Little is known about the host tissue immune and cellular responses associated with COVID-19 infection, symptoms, and lethality. To address this, we collected tissues from 11 organs during the clinical autopsy of 17 individuals who succumbed to COVID-19, resulting in a tissue bank of approximately 420 specimens. We generated comprehensive cellular maps capturing COVID-19 biology related to patients’ demise through single-cell and single-nucleus RNA-Seq of lung, kidney, liver and heart tissues, and further contextualized our findings through spatial RNA profiling of distinct lung regions. We developed a computational framework that incorporates removal of ambient RNA and automated cell type annotation to facilitate comparison with other healthy and diseased tissue atlases. In the lung, we uncovered significantly altered transcriptional programs within the epithelial, immune, and stromal compartments and cell intrinsic changes in multiple cell types relative to lung tissue from healthy controls. We observed evidence of: alveolar type 2 (AT2) differentiation replacing depleted alveolar type 1 (AT1) lung epithelial cells, as previously seen in fibrosis; a concomitant increase in myofibroblasts reflective of defective tissue repair; and, putative TP63+ intrapulmonary basal-like progenitor (IPBLP) cells, similar to cells identified in H1N1 influenza, that may serve as an emergency cellular reserve for severely damaged alveoli. Together, these findings suggest the activation and failure of multiple avenues for regeneration of the epithelium in these terminal lungs. SARS-CoV-2 RNA reads were enriched in lung mononuclear phagocytic cells and endothelial cells, and these cells expressed distinct host response transcriptional programs. 
We corroborated the compositional and transcriptional changes in lung tissue through spatial analysis of RNA profiles in situ and distinguished unique tissue host responses between regions with and without viral RNA, and in COVID-19 donor tissues relative to healthy lung. Finally, we analyzed genetic regions implicated in COVID-19 GWAS with transcriptomic data to implicate specific cell types and genes associated with disease severity. Overall, our COVID-19 cell atlas is a foundational dataset to better understand the biological impact of SARS-CoV-2 infection across the human body and empowers the identification of new therapeutic interventions and prevention strategies.

Haplotype reconstruction of distant genetic variants remains an unsolved problem due to the short-read length of common sequencing data. Here, we introduce HapTree-X, a probabilistic framework that utilizes latent long-range information to reconstruct unspecified haplotypes in diploid and polyploid organisms. It introduces the observation that differential allele-specific expression can link genetic variants from the same physical chromosome, thus even enabling using reads that cover only individual variants. We demonstrate HapTree-X’s feasibility on in-house sequenced Genome in a Bottle RNA-seq and various whole exome, genome, and 10X Genomics datasets. HapTree-X produces more complete phases (up to 25%), even in clinically important genes, and phases more variants than other methods while maintaining similar or higher accuracy and being up to 10× faster than other tools. The advantage of HapTree-X’s ability to use multiple lines of evidence, as well as to phase polyploid genomes in a single integrative framework, substantially grows as the amount of diverse data increases.

Bulk transcriptomic studies have defined classical and basal-like gene expression subtypes in pancreatic ductal adenocarcinoma (PDAC) that correlate with survival and response to chemotherapy; however, the underlying mechanisms that govern these subtypes and their heterogeneity remain elusive. Here, we performed single-cell RNA-sequencing of 23 metastatic PDAC needle biopsies and matched organoid models to understand how tumor cell-intrinsic features and extrinsic factors in the tumor microenvironment (TME) shape PDAC cancer cell phenotypes. We identify a novel cancer cell state that co-expresses basal-like and classical signatures, demonstrates upregulation of developmental and KRAS-driven gene expression programs, and represents a transitional intermediate between the basal-like and classical poles. Further, we observe structure to the metastatic TME supporting a model whereby reciprocal intercellular signaling shapes the local microenvironment and influences cancer cell transcriptional subtypes. In organoid culture, we find that transcriptional phenotypes are plastic and strongly skew toward the classical expression state, irrespective of genotype. Moreover, we show that patient-relevant transcriptional heterogeneity can be rescued by supplementing organoid media with factors found in the TME in a subtype-specific manner. Collectively, our study demonstrates that distinct microenvironmental signals are critical regulators of clinically relevant PDAC transcriptional states and their plasticity, identifies the necessity for considering the TME in cancer modeling efforts, and provides a generalizable approach for delineating the cell-intrinsic versus -extrinsic factors that govern tumor cell phenotypes.

B cell receptors (BCRs) display a combination of variable (V)-gene-encoded complementarity determining regions (CDRs) and adaptive/hypervariable CDR3 loops to engage antigens. It has long been proposed that the former tune for recognition of pathogens or groups of pathogens. To experimentally evaluate this within the human antibody repertoire, we perform immune challenges in transgenic mice that bear diverse human CDR3 and light chains but are constrained to different human VH genes. We find that, of six commonly deployed VH sequences, only those CDRs encoded by IGHV1-2*02 enable polyclonal antibody responses against bacterial lipopolysaccharide (LPS) when introduced to the bloodstream. The LPS is from diverse strains of gram-negative bacteria, and the VH-gene-dependent responses are directed against the non-variable and universal saccharolipid substructure of this antigen. This reveals a broad-spectrum anti-LPS response in which germline-encoded CDRs naturally hardwire the human antibody repertoire for recognition of a conserved microbial target.

Despite the epidemics of chronic obstructive pulmonary disease (COPD), the cellular and molecular mechanisms of this disease are far from being understood. Here, we characterize and classify the cellular composition within the alveolar space and peripheral blood of COPD patients and control donors using a clinically applicable single-cell RNA-seq technology corroborated by advanced computational approaches for: machine learning-based cell-type classification, identification of differentially expressed genes, prediction of metabolic changes, and modeling of cellular trajectories within a patient cohort. These high-resolution approaches revealed: massive transcriptional plasticity of macrophages in the alveolar space with increased levels of invading and proliferating cells, loss of MHC expression, reduced cellular motility, altered lipid metabolism, and a metabolic shift reminiscent of mitochondrial dysfunction in COPD patients. Collectively, single-cell omics of multi-tissue samples was used to build the first cellular and molecular framework for COPD pathophysiology as a prerequisite to develop molecular biomarkers and causal therapies against this deadly disease.

Single-cell RNA sequencing (scRNA-seq) has provided a high-dimensional catalog of millions of cells across species and diseases. These data have spurred the development of hundreds of computational tools to derive novel biological insights. Here, we outline the components of scRNA-seq analytical pipelines and the computational methods that underlie these steps. We describe available methods, highlight well-executed benchmarking studies, and identify opportunities for additional benchmarking studies and computational methods. As the biochemical approaches for single-cell omics advance, we propose coupled development of robust analytical pipelines suited for the challenges that new data present and principled selection of analytical methods that are suited for the biological questions to be addressed.

The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, creates an urgent need for identifying molecular mechanisms that mediate viral entry, propagation, and tissue pathology. Cell membrane bound angiotensin-converting enzyme 2 (ACE2) and associated proteases, transmembrane protease serine 2 (TMPRSS2) and Cathepsin L (CTSL), were previously identified as mediators of SARS-CoV-2 cellular entry. Here, we assess the cell type-specific RNA expression of ACE2, TMPRSS2, and CTSL through an integrated analysis of 107 single-cell and single-nucleus RNA-Seq studies, including 22 lung and airways datasets (16 unpublished), and 85 datasets from other diverse organs. Joint expression of ACE2 and the accessory proteases identifies specific subsets of respiratory epithelial cells as putative targets of viral infection in the nasal passages, airways, and alveoli. Cells that co-express ACE2 and proteases are also identified in other organs, some of which have been associated with COVID-19 transmission or pathology, including gut enterocytes, corneal epithelial cells, cardiomyocytes, heart pericytes, olfactory sustentacular cells, and renal epithelial cells. Performing the first meta-analyses of scRNA-seq studies, we analyzed 1,176,683 cells from 282 nasal, airway, and lung parenchyma samples from 164 donors spanning fetal, childhood, adult, and elderly age groups, and associate increased levels of ACE2, TMPRSS2, and CTSL in specific cell types with increasing age, male gender, and smoking, all of which are epidemiologically linked to COVID-19 susceptibility and outcomes. Notably, there was a particularly low expression of ACE2 in the few young pediatric samples in the analysis. Further analysis reveals a gene expression program shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues, including genes that may mediate viral entry, subtend key immune functions, and mediate epithelial-macrophage cross-talk.
Amongst these are IL6, its receptor and co-receptor, IL1R, TNF response pathways, and complement genes. Cell type specificity in the lung and airways and smoking effects were conserved in mice. Our analyses suggest that differences in the cell type-specific expression of mediators of SARS-CoV-2 viral entry may be responsible for aspects of COVID-19 epidemiology and clinical course, and point to putative molecular pathways involved in disease susceptibility and pathogenesis.

Crucial transitions in cancer—including tumor initiation, local expansion, metastasis, and therapeutic resistance—involve complex interactions between cells within the dynamic tumor ecosystem. Transformative single-cell genomics technologies and spatial multiplex in situ methods now provide an opportunity to interrogate this complexity at unprecedented resolution. The Human Tumor Atlas Network (HTAN), part of the National Cancer Institute (NCI) Cancer Moonshot Initiative, will establish a clinical, experimental, computational, and organizational framework to generate informative and accessible three-dimensional atlases of cancer transitions for a diverse set of tumor types. This effort complements both ongoing efforts to map healthy organs and previous large-scale cancer genomics approaches focused on bulk sequencing at a single point in time. Generating single-cell, multiparametric, longitudinal atlases and integrating them with clinical outcomes should help identify novel predictive biomarkers and features as well as therapeutically relevant cell types, cell states, and cellular interactions across transitions. The resulting tumor atlases should have a profound impact on our understanding of cancer biology and have the potential to improve cancer detection, prevention, and therapeutic discovery for better precision-medicine treatments of cancer patients and those at risk for cancer.

The scale and capabilities of single-cell RNA-sequencing methods have expanded rapidly in recent years, enabling major discoveries and large-scale cell mapping efforts. However, these methods have not been systematically and comprehensively benchmarked. Here, we directly compare seven methods for single-cell and/or single-nucleus profiling (selecting representative methods based on their usage and our expertise and resources to prepare libraries), including two low-throughput and five high-throughput methods. We tested the methods on three types of samples: cell lines, peripheral blood mononuclear cells and brain tissue, generating 36 libraries in six separate experiments in a single center. To directly compare the methods and avoid processing differences introduced by the existing pipelines, we developed scumi, a flexible computational pipeline that can be used with any single-cell RNA-sequencing method. We evaluated the methods both for basic performance, such as the structure and alignment of reads, sensitivity and extent of multiplets, and for their ability to recover known biological information in the samples.

Cellular immunity is critical for controlling intracellular pathogens, but individual cellular dynamics and cell–cell cooperativity in evolving human immune responses remain poorly understood. Single-cell RNA-sequencing (scRNA-seq) represents a powerful tool for dissecting complex multicellular behaviors in health and disease and nominating testable therapeutic targets. Its application to longitudinal samples could afford an opportunity to uncover cellular factors associated with the evolution of disease progression without potentially confounding inter-individual variability. Here, we present an experimental and computational methodology that uses scRNA-seq to characterize dynamic cellular programs and their molecular drivers, and apply it to HIV infection. By performing scRNA-seq on peripheral blood mononuclear cells from four untreated individuals before and longitudinally during acute infection, we were powered within each to discover gene response modules that vary by time and cell subset. Beyond previously unappreciated individual- and cell-type-specific interferon-stimulated gene upregulation, we describe temporally aligned gene expression responses obscured in bulk analyses, including those involved in proinflammatory T cell differentiation, prolonged monocyte major histocompatibility complex II upregulation and persistent natural killer (NK) cell cytolytic killing. We further identify response features arising in the first weeks of infection, for example proliferating natural killer cells, which potentially may associate with future viral control. Overall, our approach provides a unified framework for characterizing multiple dynamic cellular responses and their coordination.

There is pressing urgency to better understand the pathogenesis of the severe acute respiratory syndrome (SARS) coronavirus (CoV) clade SARS-CoV-2, which causes the disease known as COVID-19. SARS-CoV-2, like SARS-CoV, utilizes ACE2 to bind host cells. While initial SARS-CoV-2 cell entry and infection depend on ACE2 in concert with the protease TMPRSS2 for spike (S) protein activation, the specific cell subsets targeted by SARS-CoV-2 in host tissues, and the factors that regulate ACE2 expression, remain unknown. Here, we leverage human and non-human primate (NHP) single-cell RNA-sequencing (scRNA-seq) datasets to uncover the tissue-resident cell subsets that may serve as the cellular targets of SARS-CoV-2. We identify ACE2 and TMPRSS2 co-expressing cells within type II pneumocytes in NHP lung, absorptive enterocytes in human and NHP terminal ileum, and human nasal goblet secretory cells. Strikingly, we discover, and extensively corroborate using publicly available data sets, that ACE2 is an interferon-stimulated gene (ISG) in human epithelial cells. We further validate this finding in primary upper airway human respiratory epithelial cells. Thus, SARS-CoV-2 may exploit IFN-driven upregulation of ACE2, a key tissue-protective mediator during lung injury, to enhance infection.

Immune responses within barrier tissues are regulated, in part, by nociceptors, specialized peripheral sensory neurons that detect noxious stimuli. Previous work has shown that nociceptor ablation not only alters local responses to immune challenge at peripheral sites, but also within draining lymph nodes (LNs). The mechanisms and significance of nociceptor-dependent modulation of LN function are unknown. Indeed, although sympathetic innervation of LNs is well documented, it has been unclear whether the LN parenchyma itself is innervated by sensory neurons. Here, using a combination of high-resolution imaging, retrograde viral tracing, single-cell transcriptomics (scRNA-seq), and optogenetics, we identified and functionally tested a sensory neuro-immune circuit that is preferentially located in the outermost cortex of skin-draining LNs. Transcriptomic profiling revealed that there are at least four discrete subsets of sensory neurons that innervate LNs with a predominance of peptidergic nociceptors, and an innervation pattern that is distinct from that in the surrounding skin. To uncover potential LN-resident communication partners for LN-innervating sensory neurons, we employed scRNA-seq to generate a draft atlas of all murine LN cells and, based on receptor-ligand expression patterns, nominated candidate target populations among stromal and immune cells. Using selective optogenetic stimulation of LN-innervating sensory axons, we directly experimentally tested our inferred connections. Acute neuronal activation triggered rapid transcriptional changes preferentially within our top-ranked putative interacting partners, principally endothelium and other nodal stroma cells, as well as several innate leukocyte populations. Thus, LNs are monitored by a unique population of sensory neurons that possesses immunomodulatory potential.

To mark the 15th anniversary of Nature Methods, we asked scientists from across diverse fields of basic biology research for their views on the most exciting and essential methodological challenges that their communities are poised to tackle in the near future.

Genomic medicine has paved the way for identifying biomarkers and therapeutically actionable targets for complex diseases, but is complicated by the involvement of thousands of variably expressed genes across multiple cell types. Single-cell RNA sequencing (scRNA-seq) allows the characterization of such complex changes in whole organs. The study is based on applying network tools to organize and analyze scRNA-seq data from a mouse model of arthritis and human rheumatoid arthritis, in order to find diagnostic biomarkers and therapeutic targets. Diagnostic validation studies were performed using expression profiling data and potential protein biomarkers from prospective clinical studies of 13 diseases. A candidate drug was examined by a treatment study of a mouse model of arthritis, using phenotypic, immunohistochemical, and cellular analyses as read-outs. We performed the first systematic analysis of pathways, potential biomarkers, and drug targets in scRNA-seq data from a complex disease, starting with inflamed joints and lymph nodes from a mouse model of arthritis. We found the involvement of hundreds of pathways, biomarkers, and drug targets that differed greatly between cell types. Analyses of scRNA-seq and GWAS data from human rheumatoid arthritis (RA) supported a similar dispersion of pathogenic mechanisms in different cell types. Thus, systems-level approaches to prioritize biomarkers and drugs are needed. Here, we present a prioritization strategy that is based on constructing network models of disease-associated cell types and interactions using scRNA-seq data from our mouse model of arthritis, as well as human RA, which we term multicellular disease models (MCDMs). We find that the network centrality of MCDM cell types correlates with the enrichment of genes harboring genetic variants associated with RA and thus could potentially be used to prioritize cell types and genes for diagnostics and therapeutics. We validated this hypothesis in a large-scale study of patients with 13 different autoimmune, allergic, infectious, malignant, endocrine, metabolic, and cardiovascular diseases, as well as a therapeutic study of the mouse arthritis model. Overall, our results support that our strategy has the potential to help prioritize diagnostic and therapeutic targets in human disease.

Circulating tumor cells (CTCs) play a fundamental role in cancer progression. However, in mice, limited blood volume and the rarity of CTCs in the bloodstream preclude longitudinal, in-depth studies of these cells using existing liquid biopsy techniques. Here, we present an optofluidic system that continuously collects fluorescently labeled CTCs from a genetically engineered mouse model (GEMM) for several hours per day over multiple days or weeks. The system is based on a microfluidic cell sorting chip connected serially to an unanesthetized mouse via an implanted arteriovenous shunt. Pneumatically controlled microfluidic valves capture CTCs as they flow through the device, and CTC-depleted blood is returned to the mouse via the shunt. To demonstrate the utility of our system, we profile CTCs isolated longitudinally from animals over 4 days of treatment with the BET inhibitor JQ1 using single-cell RNA sequencing (scRNA-seq) and show that our approach eliminates potential biases driven by intermouse heterogeneity that can occur when CTCs are collected across different mice. The CTC isolation and sorting technology presented here provides a research tool to help reveal details of how CTCs evolve over time, allowing studies to credential changes in CTCs as biomarkers of drug response and facilitating future studies to understand the role of CTCs in metastasis.

Genome-wide association studies (GWAS) have revealed risk alleles for ulcerative colitis (UC), but their cell type and pathway specificities are often unknown. Here, we generate an atlas of 115,517 cells from the colon mucosa of seven UC patients and ten healthy individuals, revealing 51 epithelial, stromal, and immune cell subsets, including a subset of BEST4+ enterocytes, which may sense and respond to pH, and IL13RA2+IL-11+ inflammatory fibroblasts, which we associate with resistance to anti-TNF therapy. Inflammatory fibroblasts, inflammatory monocytes, microfold-like cells, and CD8+IL-17+ T cells expand during disease, and form intercellular interaction hubs that mediate cross-talk between diverse cellular lineages. We identify hundreds of putative autocrine and paracrine cell-cell interactions that may explain the migration, expansion, or inhibition of cell types with disease. Surprisingly, UC risk genes are often cell type specific and co-regulated in relatively few gene modules, suggesting convergence onto limited sets of cell types and pathways. Using this observation, we nominate and infer putative functions for UC risk genes across all GWAS loci. Our atlas thus provides a framework for interrogating complex human diseases and mapping risk variants onto their cell types and pathways of activity.

Background

Human immunity relies on the coordinated responses of many cellular subsets and functional states. Inter-individual variations in cellular composition and communication could thus potentially alter host protection. Here, we explore this hypothesis by applying single-cell RNA-sequencing to examine viral responses among the dendritic cells (DCs) of three elite controllers (ECs) of HIV-1 infection.

Results

To overcome the potentially confounding effects of donor-to-donor variability, we present a generally applicable computational framework for identifying reproducible patterns in gene expression across donors who share a unifying classification. Applying it, we discover a highly functional antiviral DC state in ECs whose fractional abundance after in vitro exposure to HIV-1 correlates with higher CD4+ T cell counts and lower HIV-1 viral loads, and that effectively primes polyfunctional T cell responses in vitro. By integrating information from existing genomic databases into our reproducibility-based analysis, we identify and validate select immunomodulators that increase the fractional abundance of this state in primary peripheral blood mononuclear cells from healthy individuals in vitro.

Conclusions

Overall, our results demonstrate how single-cell approaches can reveal previously unappreciated, yet important, immune behaviors and empower rational frameworks for modulating systems-level immune responses that may prove therapeutically and prophylactically useful.

 

Complete information about the scRAD R package is available on the Shalek Lab Resources page.

The recent advent of methods for high-throughput single-cell molecular profiling has catalyzed a growing sense in the scientific community that the time is ripe to complete the 150-year-old effort to identify all cell types in the human body. The Human Cell Atlas Project is an international collaborative effort that aims to define all human cell types in terms of distinctive molecular profiles (such as gene expression profiles) and to connect this information with classical cellular descriptions (such as location and morphology). An open comprehensive reference map of the molecular state of cells in healthy human tissues would propel the systematic study of physiological states, developmental trajectories, regulatory circuitry and interactions of cells, and also provide a framework for understanding cellular dysregulation in human disease. Here we describe the idea, its potential utility, early proofs-of-concept, and some design considerations for the Human Cell Atlas, including a commitment to open data, code, and community.

We develop single-cell transcriptomic approaches to comprehensively profile human tissues and model systems. Previously, we focused on establishing, validating, scaling, and simplifying single-cell RNA-seq, often through the development of microdevices, to enable genome-wide identification of the cell types/states contained within complex biological samples. More recently, we helped both enhance the detection of phenotype-defining transcripts using these methods and simplify their on-site processing for clinical applications. In parallel, we have also worked to democratize these techniques, providing open access to resources and protocols, training thousands locally and abroad, and establishing infrastructure and on-site collaborations spanning 6 continents and 26+ countries.

As many factors define cellular phenotype and influence disease beyond mRNA, we develop complementary methods for co-assaying other cellular attributes to enrich our understanding of the drivers of cellular behaviors. Examples include the abundance of additional ‘-omes’, the sequence and amount of important transcripts, cellular history, biophysical properties, spatial position, and functional output. Recently, we have worked to: 1. detect pathogens in cells and potentially actionable associated host factors; 2. query for specific mutations to identify cancer cells; and, 3. extract T cell receptor sequences to examine clonality. We have also formulated computational methods to derive deeper insights from these data (e.g., to examine viral dynamics in infected cells, reproducible features hidden by inter-individual variability, multicellular immune dynamics, intercellular communication, or alterations in cellular ecosystems associated with pathology).

We explore how the extracellular milieu influences cellular decision-making. Here, we have employed controlled culture conditions with cells and organoids, chemical and genetic perturbations, and constant microfluidic perfusion. We also have leveraged natural microenvironmental variation within and across tissues via microdissection and by using photoactivatable probes that retain spatial information through dissociation. In each instance, we aim to understand the degree to which extracellular environments modulate, and can be used to rationally control, the responses of individual cells or the overall distribution thereof, with an eye toward engineering tissue responses.

We examine the impact of intercellular interactions on cellular function. We have used coculture, imaging and perturbation strategies, as well as matched computational methods, to reinforce findings from dissociated samples, validate inferred cell-cell communication in vivo (e.g., between sensory neurons and lymph node resident cells), and manipulate multicellular systems (e.g., organoids). We are currently working on building arrayed, synthetically-designed cellular ensembles to examine how ‘tissue’ structure impacts functional response. Our overall goal is to understand cellular co-dependencies that influence niche- and tissue-level response dynamics.

We broadly study how intra- and extracellular circuits collectively drive healthy and diseased tissue states. By leveraging the massive genomic datasets we and others have generated from complex tissues (like melanoma tumors, inflamed gut, and nasal polyps), we have begun to identify common and unique cell types/states and circuits associated with pathology that may be important for regulating biological function and stability. Our current findings suggest multiple overlaps among distinct diseases, pointing to the possibility of a finite set of evolved response strategies and thus common interventions based on adjusting specific cell states, cell frequencies, and/or cell-cell communication pathways.

We lack effective treatments and preventions for many of the most challenging infectious diseases, many of which disproportionately impact those in low- and middle-income countries or traditionally marginalized communities.

To help address this, we have established and enabled multi-group, multi-country partnerships to deploy and adapt cutting-edge genomic tools. By examining how cells dynamically alter their states, individually and collectively, during disease and/or its resolution in acute and chronic infections (e.g., tuberculosis, HIV/SHIV, hepatitis, malaria, leprosy, flu, SARS-CoV-2, and Ebola), we have uncovered cellular and molecular features of pathogen control or pathology to potentiate or counteract, respectively. Illustratively, in tuberculosis, we identified a functional role for cytotoxic CD8 and hybrid type1-type17 T cells in control of infection in the lung and links between mast, plasma, and endothelial cell abundance (type-2 immune responses) and bacterial burden. We have also built methods for examining pathogens within individual host cells to define their dynamic interdependence and identify potentially restrictive host factors.

We are currently working to identify the drivers of common host responses to distinct perturbations and their targetability, as well as the impact of different interventions (e.g., vaccines).

Immune responses play a critical role in preventing tumorigenesis. Sometimes, however, they are ineffectual and can even drive/support malignancy.

We have examined how cancer cells alter and are influenced by their tumor microenvironments (TMEs), and the impact this has on therapeutic responses. Illustratively, in Pancreatic Ductal Adenocarcinoma (PDAC), by profiling liver metastases and matched organoid models, we showed: 1. associations between TME and malignant cell state composition; 2. that autocrine and paracrine signaling can drive malignant cell state transitions, even in an isogenic background, altering the efficacy of frontline chemotherapies; and, 3. that microenvironmental manipulations can be used to control malignant state, and thereby drug responses, rationally, and to improve model fidelity for screening potential therapies. This and related work highlight the potential utility of modulating indirect target cells (T cells in the PDAC TME or basal cells in allergic inflammation) to enhance cures and preventions. 

We are now systematically expanding this work to define how additional environmental and cell-intrinsic factors influence malignant cell state plasticity in PDAC and other cancers toward enhancing treatments.

 

We are exposed to a constant flux of external biochemical and physical stimuli as we age. Despite variability in our overall experiences and exact constitutions, our individual tissues typically manage to maintain functionality, though each can differ in its resilience to distinct stressors.

We have characterized how differences in cellular composition and communication impact tissue fitness and have identified responses and subsequent adaptations that drive chronic dysfunction. For example, although aberrant immune activity can precipitate allergic inflammatory diseases, therapies targeting immune cells and signaling are only successful in some, suggesting chronicity may involve alternative mechanisms. Previously, we helped demonstrate that dysregulated type-2 immune signaling, driven by environmental allergens, can impact tissue health in the upper airway through generating dysfunctional basal epithelial stem cells. These stem cells can then contribute to persistence by serving as repositories for allergic inflammatory memories, altering the integrity and functional output of the nasal epithelium. Our work, with that of others, suggests generalizable principles for cellular memory, and informs where and how tissues should be targeted to support health or restore function. We have since further investigated how tissue-resident cellular subsets participate in, and are shaped by, environmental exposures at barrier tissues and the functional consequences of these experiences.

We are now working to develop a more holistic appreciation for how different intra- and extracellular factors (e.g., genetics and integrated exposure history, respectively) influence barrier tissue function.

Single-cell transcriptomics reveals gene expression heterogeneity but suffers from stochastic dropout and characteristic bimodal expression distributions in which expression is either strongly non-zero or non-detectable. We propose a two-part, generalized linear model for such bimodal data that parameterizes both of these features. We argue that the cellular detection rate, the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. Our model provides gene set enrichment analysis tailored to single-cell data. It provides insights into how networks of co-expressed genes evolve across an experimental treatment. MAST is available at https://github.com/RGLab/MAST.
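The two-part model described above can be illustrated with a minimal Python sketch. This is not the MAST implementation (MAST is an R package) and every function name here is hypothetical. It assumes a cells-by-genes matrix of log-scale expression in which zero means "not detected", a logistic regression for whether a gene is detected in a cell, and an ordinary least-squares fit for the expression level among cells where it is detected. The cellular detection rate enters both parts as a covariate. The small ridge term in the logistic fit is an added assumption for numerical stability only.

```python
import numpy as np


def cellular_detection_rate(expr):
    """Fraction of genes detected (expression > 0) in each cell.

    expr: array of shape (n_cells, n_genes).
    """
    return (np.asarray(expr) > 0).mean(axis=1)


def fit_logistic(X, z, n_iter=50, ridge=1e-6):
    """Newton-Raphson fit of a lightly ridge-penalized logistic regression."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None]) + ridge * np.eye(X.shape[1])
        gradient = X.T @ (z - p) - ridge * beta
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta


def fit_two_part(y, condition, cdr):
    """Fit both parts of the model for a single gene.

    y: expression of one gene across cells (0 = not detected).
    condition: 0/1 treatment indicator per cell.
    cdr: cellular detection rate per cell (nuisance covariate).
    Returns (beta_discrete, beta_continuous), each ordered as
    [intercept, condition, cdr].
    """
    y = np.asarray(y, dtype=float)
    condition = np.asarray(condition, dtype=float)
    cdr = np.asarray(cdr, dtype=float)
    X = np.column_stack([np.ones_like(cdr), condition, cdr])

    # Discrete part: is the gene detected in this cell?
    detected = y > 0
    beta_discrete = fit_logistic(X, detected.astype(float))

    # Continuous part: expression level, conditional on detection.
    beta_continuous, *_ = np.linalg.lstsq(X[detected], y[detected], rcond=None)
    return beta_discrete, beta_continuous
```

In this sketch, the condition coefficient in the discrete part describes a change in the odds of detecting the gene, while the condition coefficient in the continuous part describes a shift in expression among cells that express it. A real analysis would also need standard errors and a combined test across the two parts, which are omitted here.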