Crohn’s disease is an inflammatory bowel disease (IBD) which most often presents with patchy lesions in the terminal ileum and colon and requires complex clinical care. Recent advances in the targeting of cytokines and leukocyte migration have greatly advanced treatment options, but most patients still relapse and inevitably progress. Although single-cell approaches are transforming our ability to understand the barrier tissue biology of inflammatory disease, comprehensive single-cell RNA-sequencing (scRNA-seq) atlases of IBD to date have largely sampled pre-treated patients with established disease. This has limited our understanding of which cell types, subsets, and states at diagnosis are predictive of disease severity and response to treatment. Here, through a combined clinical, flow cytometric, and scRNA-seq study, we profile diagnostic human biopsies from the terminal ileum of treatment-naive pediatric patients with Crohn’s disease (pediCD; n=14) and from non-inflamed pediatric controls with functional gastrointestinal disorders (FGID; n=13). To fully resolve and annotate epithelial, stromal, and immune cell states among the 201,883 single-cell transcriptomes, we develop and deploy a principled and unbiased tiered clustering approach, ARBOL, yielding 138 FGID and 305 pediCD end cell clusters. Notably, through both flow cytometry and scRNA-seq, we observe that at the level of broad cell types, treatment-naive pediCD is not readily distinguishable from FGID in cellular composition. However, by integrating high-resolution scRNA-seq analysis, we identify significant differences in cell states that arise during pediCD relative to FGID. 
Furthermore, by closely linking our scRNA-seq analysis with clinical meta-data, we resolve a vector of lymphoid, myeloid, and epithelial cell states in treatment-naive samples which can distinguish patients with less severe disease (those not on anti-TNF therapies (NOA)), from those with more severe disease at presentation who require anti-TNF therapies. Moreover, this vector was also able to distinguish those patients that achieve a full response (FR) to anti-TNF blockade from those more treatment-resistant patients who only achieve a partial response (PR). Our study jointly leverages a treatment-naive cohort, high-resolution principled scRNA-seq data analysis, and clinical outcomes to understand which baseline cell states may predict inflammatory disease trajectory.

SARS-CoV-2 infection can cause severe respiratory COVID-19. However, many individuals present with isolated upper respiratory symptoms, suggesting potential to constrain viral pathology to the nasopharynx. Which cells SARS-CoV-2 primarily targets and how infection influences the respiratory epithelium remains incompletely understood. We performed scRNA-seq on nasopharyngeal swabs from 58 healthy and COVID-19 participants. During COVID-19, we observe expansion of secretory, loss of ciliated, and epithelial cell repopulation via deuterosomal expansion. In mild/moderate COVID-19, epithelial cells express anti-viral/interferon-responsive genes, while cells in severe COVID-19 have muted anti-viral responses despite equivalent viral loads. SARS-CoV-2 RNA+ host-target cells are highly heterogenous, including developing ciliated, interferon-responsive ciliated, AZGP1high goblet, and KRT13+ “hillock”-like cells, and we identify genes associated with susceptibility, resistance, or infection response. Our study defines protective and detrimental responses to SARS-CoV-2, the direct viral targets of infection, and suggests that failed nasal epithelial anti-viral immunity may underlie and precede severe COVID-19.

COVID-19, caused by SARS-CoV-2, can result in acute respiratory distress syndrome and multiple-organ failure, but little is known about its pathophysiology. Here, we generated single-cell atlases of 23 lung, 16 kidney, 16 liver and 19 heart COVID-19 autopsy donor tissue samples, and spatial atlases of 14 lung donors. Integrated computational analysis uncovered substantial remodeling in the lung epithelial, immune and stromal compartments, with evidence of multiple paths of failed tissue regeneration, including defective alveolar type 2 differentiation and expansion of fibroblasts and putative TP63+ intrapulmonary basal-like progenitor cells. Viral RNAs were enriched in mononuclear phagocytic and endothelial lung cells which induced specific host programs. Spatial analysis in lung distinguished inflammatory host responses in lung regions with and without viral RNA. Analysis of the other tissue atlases showed transcriptional alterations in multiple cell types in COVID-19 donor heart tissue, and mapped cell types and genes implicated with disease severity based on COVID-19 GWAS. Our foundational dataset elucidates the biological impact of severe SARS-CoV-2 infection across the body, a key step towards new treatments.

Angiotensin-converting enzyme 2 (ACE2) and accessory proteases (TMPRSS2 and CTSL) are needed for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cellular entry, and their expression may shed light on viral tropism and impact across the body. We assessed the cell-type-specific expression of ACE2, TMPRSS2 and CTSL across 107 single-cell RNA-sequencing studies from different tissues. ACE2, TMPRSS2 and CTSL are coexpressed in specific subsets of respiratory epithelial cells in the nasal passages, airways and alveoli, and in cells from other organs associated with coronavirus disease 2019 (COVID-19) transmission or pathology. We performed a meta-analysis of 31 lung single-cell RNA-sequencing studies with 1,320,896 cells from 377 nasal, airway and lung parenchyma samples from 228 individuals. This revealed cell-type-specific associations of age, sex and smoking with expression levels of ACE2, TMPRSS2 and CTSL. Expression of entry factors increased with age and in males, including in airway secretory cells and alveolar type 2 cells. Expression programs shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues included genes that may mediate viral entry, key immune functions and epithelial–macrophage cross-talk, such as genes involved in the interleukin-6, interleukin-1, tumor necrosis factor and complement pathways. Cell-type-specific expression patterns may contribute to the pathogenesis of COVID-19, and our work highlights putative molecular pathways for therapeutic intervention.

The SARS-CoV-2 pandemic has caused over 1 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome, or direct complications resulting in multiple-organ failures. Little is known about the host tissue immune and cellular responses associated with COVID-19 infection, symptoms, and lethality. To address this, we collected tissues from 11 organs during the clinical autopsy of 17 individuals who succumbed to COVID-19, resulting in a tissue bank of approximately 420 specimens. We generated comprehensive cellular maps capturing COVID-19 biology related to patients’ demise through single-cell and single-nucleus RNA-Seq of lung, kidney, liver and heart tissues, and further contextualized our findings through spatial RNA profiling of distinct lung regions. We developed a computational framework that incorporates removal of ambient RNA and automated cell type annotation to facilitate comparison with other healthy and diseased tissue atlases. In the lung, we uncovered significantly altered transcriptional programs within the epithelial, immune, and stromal compartments and cell intrinsic changes in multiple cell types relative to lung tissue from healthy controls. We observed evidence of: alveolar type 2 (AT2) differentiation replacing depleted alveolar type 1 (AT1) lung epithelial cells, as previously seen in fibrosis; a concomitant increase in myofibroblasts reflective of defective tissue repair; and, putative TP63+ intrapulmonary basal-like progenitor (IPBLP) cells, similar to cells identified in H1N1 influenza, that may serve as an emergency cellular reserve for severely damaged alveoli. Together, these findings suggest the activation and failure of multiple avenues for regeneration of the epithelium in these terminal lungs. SARS-CoV-2 RNA reads were enriched in lung mononuclear phagocytic cells and endothelial cells, and these cells expressed distinct host response transcriptional programs. 
We corroborated the compositional and transcriptional changes in lung tissue through spatial analysis of RNA profiles in situ and distinguished unique tissue host responses between regions with and without viral RNA, and in COVID-19 donor tissues relative to healthy lung. Finally, we analyzed genetic regions implicated in COVID-19 GWAS with transcriptomic data to implicate specific cell types and genes associated with disease severity. Overall, our COVID-19 cell atlas is a foundational dataset to better understand the biological impact of SARS-CoV-2 infection across the human body and empowers the identification of new therapeutic interventions and prevention strategies.

Organ infiltration by donor T cells is critical to the development of acute graft-versus-host disease (aGVHD) in recipients after allogeneic hematopoietic stem cell transplant (allo-HCT). However, deconvoluting the transcriptional programs of newly recruited donor T cells from those of tissue-resident T cells in aGVHD target organs remains a challenge. Here, we combined the serial intravascular staining technique with single-cell RNA sequencing to dissect the tightly connected processes by which donor T cells initially infiltrate tissues and then establish a pathogenic tissue residency program in a rhesus macaque allo-HCT model that develops aGVHD. Our results enabled creation of a spatiotemporal map of the transcriptional programs controlling donor CD8+ T cell infiltration into the primary aGVHD target organ, the gastrointestinal (GI) tract. We identified the large and small intestines as the only two sites demonstrating allo-specific, rather than lymphodepletion-driven, T cell infiltration. GI-infiltrating donor CD8+ T cells demonstrated a highly activated, cytotoxic phenotype while simultaneously developing a canonical tissue-resident memory T cell (TRM) transcriptional signature driven by interleukin-15 (IL-15)/IL-21 signaling. We found expression of a cluster of genes directly associated with tissue invasiveness, including those encoding adhesion molecules (ITGB2), specific chemokines (CCL3 and CCL4L1) and chemokine receptors (CD74), as well as multiple cytoskeletal proteins. This tissue invasion transcriptional signature was validated by its ability to discriminate the CD8+ T cell transcriptome of patients with GI aGVHD from those of GVHD-free patients. These results provide insights into the mechanisms controlling tissue occupancy of target organs by pathogenic donor CD8+ TRM cells during aGVHD in primate transplant recipients.

In late 2019 and through 2020, the COVID-19 pandemic swept the world, presenting both scientific and medical challenges associated with understanding and treating a previously unknown disease. To help address the need for greater understanding of COVID-19, the scientific community mobilized and banded together rapidly to characterize SARS-CoV-2 infection, pathogenesis and its distinct disease trajectories. The urgency of COVID-19 provided a pressing use-case for leveraging relatively new tools, technologies, and nascent collaborative networks. Single-cell biology is one such example that has emerged over the last decade as a powerful approach that provides unprecedented resolution to the cellular and molecular underpinnings of biological processes. Early foundational work within the single-cell community, including the Human Cell Atlas, utilized published and unpublished data to characterize the putative target cells of SARS-CoV-2 sampled from diverse organs based on expression of the viral receptor ACE2 and associated entry factors TMPRSS2 and CTSL (Muus et al., 2020; Sungnak et al., 2020; Ziegler et al., 2020). This initial characterization of reference data provided an important foundation for framing infection and pathology in the airway as well as other organs. However, initial community analysis was limited to samples derived from uninfected donors and other previously-sampled disease indications. This report provides an overview of a single-cell data resource derived from samples from COVID-19 patients along with initial observations and guidance on data reuse and exploration.

Recent political and social events, mainly those originating in the USA, have triggered an intense desire for equity in all facets of the human experience. More specifically, actions engendered by the Black Lives Matter movement and others have led to the scrutinizing of equity across a wide range of fields, from politics and business to academia and scientific research. In science, in particular, several major journals have published opinion pieces and editorials seeking greater equity or relating to the ‘non-white’ experience. Many of their readers have been stunned by the revelations. Indeed, the scientific community is only now coming to terms with an unsettling and uncomfortable truth: structural exclusion of non-white people permeates all levels of the scientific enterprise. That being said, with awareness comes opportunity. New frameworks for describing and addressing these issues have recently emerged, creating a structure with which groups can each consider how to best internalize and embody the lessons in their own scientific initiatives.

In the Human Cell Atlas (HCA) consortium, equity has been a point of emphasis from inception in 2016 for one simple reason: the HCA’s success depends upon it. Fundamentally, the HCA is meant to be a foundational resource, inclusive of the many cell types and states found in healthy people across the globe. That resource can then be used to address a wide range of scientific questions and, in the future, to facilitate a better understanding of disease. This mission demands, explicitly, the inclusion of representation along axes of sex, age, ethnicity, environment, socioeconomic status and, in some cases, disease susceptibility in its biospecimens. Moreover, it requires broad participation to ensure comprehensive coverage and identify barriers to success and support continuity, and necessitates reciprocal, balanced benefit from the methods, data and results to ensure global engagement.

To this end, the HCA has set ambitious and dynamic equity goals for itself. Below, we describe key lessons learned through equity activities thus far, as well as our future plans.

Granulomas are complex cellular structures comprised predominantly of macrophages and lymphocytes that function to contain and kill invading pathogens. Here, we investigated single cell phenotypes associated with antimicrobial responses in human leprosy granulomas by applying single cell and spatial sequencing to leprosy biopsy specimens. We focused on reversal reactions (RR), a dynamic process in which some patients with disseminated lepromatous leprosy (L-lep) transition towards self-limiting tuberculoid leprosy (T-lep), mounting effective antimicrobial responses. We identified a set of genes encoding proteins involved in antimicrobial responses that are differentially expressed in RR versus L-lep lesions, and regulated by IFN-γ and IL-1β. By integrating the spatial coordinates of the key cell types and antimicrobial gene expression in RR and T-lep lesions, we constructed a map revealing the organized architecture of granulomas depicting compositional and functional layers by which macrophages, T cells, keratinocytes and fibroblasts contribute to the antimicrobial response.

Ebola virus (EBOV) causes epidemics with high mortality yet remains understudied due to the challenge of experimentation in high-containment and outbreak settings. Here, we used single-cell transcriptomics and CyTOF-based single-cell protein quantification to characterize peripheral immune cells during EBOV infection in rhesus monkeys. We obtained 100,000 transcriptomes and 15,000,000 protein profiles, finding that immature, proliferative monocyte-lineage cells with reduced antigen-presentation capacity replace conventional monocyte subsets, while lymphocytes upregulate apoptosis genes and decline in abundance. By quantifying intracellular viral RNA, we identify molecular determinants of tropism among circulating immune cells and examine temporal dynamics in viral and host gene expression. Within infected cells, EBOV downregulates STAT1 mRNA and interferon signaling, and it upregulates putative pro-viral genes (e.g., DYNLL1 and HSPA5), nominating pathways the virus manipulates for its replication. This study sheds light on EBOV tropism, replication dynamics, and elicited immune response and provides a framework for characterizing host-virus interactions under maximum containment.

In humans and nonhuman primates, Mycobacterium tuberculosis lung infection yields a complex multicellular structure: the tuberculosis granuloma. All granulomas are not equivalent, however, even within the same host: in some, local immune activity promotes bacterial clearance, while in others, it allows persistence or outgrowth. Here, we used single-cell RNA-sequencing to define holistically cellular responses associated with control in cynomolgus macaques. Granulomas that facilitated bacterial killing contained significantly higher proportions of CD4+ and CD8+ T cells expressing hybrid Type1-Type17 immune responses or stem-like features and CD8-enriched T cells with specific cytotoxic functions; failure to control correlated with mast cell, plasma cell and fibroblast abundance. Co-registering these data with serial PET-CT imaging suggests that a degree of early immune control can be achieved through cytotoxic activity, but that more robust restriction only arises after the priming of specific adaptive immune responses, defining new targets for vaccination and treatment.

High-throughput single-cell RNA-sequencing (scRNA-seq) methodologies enable characterization of complex biological samples by increasing the number of cells that can be profiled contemporaneously. Nevertheless, these approaches recover less information per cell than low-throughput strategies. To accurately report the expression of key phenotypic features of cells, scRNA-seq platforms are needed that are both high fidelity and high throughput. To address this need, we created Seq-Well S3 (“Second-Strand Synthesis”), a massively parallel scRNA-seq protocol that uses a randomly primed second-strand synthesis to recover complementary DNA (cDNA) molecules that were successfully reverse transcribed but to which a second oligonucleotide handle, necessary for subsequent whole transcriptome amplification, was not appended due to inefficient template switching. Seq-Well S3 increased the efficiency of transcript capture and gene detection compared with that of previous iterations by up to 10- and 5-fold, respectively. We used Seq-Well S3 to chart the transcriptional landscape of five human inflammatory skin diseases, thus providing a resource for the further study of human skin inflammation.

Our nasal epithelial COVID-19 dataset, along with COVID-19 datasets from other genomics groups, is now publicly available. This work was sponsored by the Chan-Zuckerberg Initiative.

Bulk transcriptomic studies have defined classical and basal-like gene expression subtypes in pancreatic ductal adenocarcinoma (PDAC) that correlate with survival and response to chemotherapy; however, the underlying mechanisms that govern these subtypes and their heterogeneity remain elusive. Here, we performed single-cell RNA-sequencing of 23 metastatic PDAC needle biopsies and matched organoid models to understand how tumor cell-intrinsic features and extrinsic factors in the tumor microenvironment (TME) shape PDAC cancer cell phenotypes. We identify a novel cancer cell state that co-expresses basal-like and classical signatures, demonstrates upregulation of developmental and KRAS-driven gene expression programs, and represents a transitional intermediate between the basal-like and classical poles. Further, we observe structure to the metastatic TME supporting a model whereby reciprocal intercellular signaling shapes the local microenvironment and influences cancer cell transcriptional subtypes. In organoid culture, we find that transcriptional phenotypes are plastic and strongly skew toward the classical expression state, irrespective of genotype. Moreover, we show that patient-relevant transcriptional heterogeneity can be rescued by supplementing organoid media with factors found in the TME in a subtype-specific manner. Collectively, our study demonstrates that distinct microenvironmental signals are critical regulators of clinically relevant PDAC transcriptional states and their plasticity, identifies the necessity for considering the TME in cancer modeling efforts, and provides a generalizable approach for delineating the cell-intrinsic versus -extrinsic factors that govern tumor cell phenotypes.

There is pressing urgency to understand the pathogenesis of the severe acute respiratory syndrome coronavirus clade 2 (SARS-CoV-2), which causes the disease COVID-19. SARS-CoV-2 spike (S)-protein binds ACE2, and in concert with host proteases, principally TMPRSS2, promotes cellular entry. The cell subsets targeted by SARS-CoV-2 in host tissues, and the factors that regulate ACE2 expression, remain unknown. Here, we leverage human, non-human primate, and mouse single-cell RNA-sequencing (scRNA-seq) datasets across health and disease to uncover putative targets of SARS-CoV-2 amongst tissue-resident cell subsets. We identify ACE2 and TMPRSS2 co-expressing cells within lung type II pneumocytes, ileal absorptive enterocytes, and nasal goblet secretory cells. Strikingly, we discover that ACE2 is a human interferon-stimulated gene (ISG) in vitro using airway epithelial cells, and extend our findings to in vivo viral infections. Our data suggest that SARS-CoV-2 could exploit species-specific interferon-driven upregulation of ACE2, a tissue-protective mediator during lung injury, to enhance infection.

The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, creates an urgent need for identifying molecular mechanisms that mediate viral entry, propagation, and tissue pathology. Cell membrane bound angiotensin-converting enzyme 2 (ACE2) and associated proteases, transmembrane protease serine 2 (TMPRSS2) and Cathepsin L (CTSL), were previously identified as mediators of SARS-CoV-2 cellular entry. Here, we assess the cell type-specific RNA expression of ACE2, TMPRSS2, and CTSL through an integrated analysis of 107 single-cell and single-nucleus RNA-Seq studies, including 22 lung and airways datasets (16 unpublished), and 85 datasets from other diverse organs. Joint expression of ACE2 and the accessory proteases identifies specific subsets of respiratory epithelial cells as putative targets of viral infection in the nasal passages, airways, and alveoli. Cells that co-express ACE2 and proteases are also identified in other organs, some of which have been associated with COVID-19 transmission or pathology, including gut enterocytes, corneal epithelial cells, cardiomyocytes, heart pericytes, olfactory sustentacular cells, and renal epithelial cells. Performing the first meta-analyses of scRNA-seq studies, we analyzed 1,176,683 cells from 282 nasal, airway, and lung parenchyma samples from 164 donors spanning fetal, childhood, adult, and elderly age groups, and associated increased levels of ACE2, TMPRSS2, and CTSL in specific cell types with increasing age, male gender, and smoking, all of which are epidemiologically linked to COVID-19 susceptibility and outcomes. Notably, there was a particularly low expression of ACE2 in the few young pediatric samples in the analysis. Further analysis reveals a gene expression program shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues, including genes that may mediate viral entry, subtend key immune functions, and mediate epithelial-macrophage cross-talk.
Amongst these are IL6, its receptor and co-receptor, IL1R, TNF response pathways, and complement genes. Cell type specificity in the lung and airways and smoking effects were conserved in mice. Our analyses suggest that differences in the cell type-specific expression of mediators of SARS-CoV-2 viral entry may be responsible for aspects of COVID-19 epidemiology and clinical course, and point to putative molecular pathways involved in disease susceptibility and pathogenesis.

Crucial transitions in cancer—including tumor initiation, local expansion, metastasis, and therapeutic resistance—involve complex interactions between cells within the dynamic tumor ecosystem. Transformative single-cell genomics technologies and spatial multiplex in situ methods now provide an opportunity to interrogate this complexity at unprecedented resolution. The Human Tumor Atlas Network (HTAN), part of the National Cancer Institute (NCI) Cancer Moonshot Initiative, will establish a clinical, experimental, computational, and organizational framework to generate informative and accessible three-dimensional atlases of cancer transitions for a diverse set of tumor types. This effort complements both ongoing efforts to map healthy organs and previous large-scale cancer genomics approaches focused on bulk sequencing at a single point in time. Generating single-cell, multiparametric, longitudinal atlases and integrating them with clinical outcomes should help identify novel predictive biomarkers and features as well as therapeutically relevant cell types, cell states, and cellular interactions across transitions. The resulting tumor atlases should have a profound impact on our understanding of cancer biology and have the potential to improve cancer detection, prevention, and therapeutic discovery for better precision-medicine treatments of cancer patients and those at risk for cancer.

There is pressing urgency to better understand the pathogenesis of the severe acute respiratory syndrome (SARS) coronavirus (CoV) clade SARS-CoV-2, which causes the disease known as COVID-19. SARS-CoV-2, like SARS-CoV, utilizes ACE2 to bind host cells. While initial SARS-CoV-2 cell entry and infection depend on ACE2 in concert with the protease TMPRSS2 for spike (S) protein activation, the specific cell subsets targeted by SARS-CoV-2 in host tissues, and the factors that regulate ACE2 expression, remain unknown. Here, we leverage human and non-human primate (NHP) single-cell RNA-sequencing (scRNA-seq) datasets to uncover the tissue-resident cell subsets that may serve as the cellular targets of SARS-CoV-2. We identify ACE2 and TMPRSS2 co-expressing cells within type II pneumocytes in NHP lung, absorptive enterocytes in human and NHP terminal ileum, and human nasal goblet secretory cells. Strikingly, we discover, and extensively corroborate using publicly available data sets, that ACE2 is an interferon-stimulated gene (ISG) in human epithelial cells. We further validate this finding in primary upper airway human respiratory epithelial cells. Thus, SARS-CoV-2 may exploit IFN-driven upregulation of ACE2, a key tissue-protective mediator during lung injury, to enhance infection.


In light of the global effort to better understand the new SARS-CoV-2 virus, we and other researchers from the HCA Lung Biological Network and beyond have begun an initiative to investigate datasets from relevant tissues profiled as part of other ongoing studies. These studies represent, for example, large efforts to characterize HIV, Mtb, and influenza infection and allergy in primary human and non-human primate samples. This page serves as a guide to viewing our data interactively on and downloading datasets from our single-cell portal, the Alexandria Project. Alternatively, bulk downloading of our data is available here. For more on research initiatives in COVID-19 being undertaken by the HCA, and the HCA Lung Biological Network in particular, please visit the HCA website here.

We investigated two genes whose protein products are central to the cellular entry of SARS-CoV-2: ACE2 and TMPRSS2. Consistent with previous studies, we found that the gene encoding ACE2, the SARS-CoV-2 entry receptor, is expressed on a subset of lung epithelial cells, type 2 pneumocytes, and a subset of ileal epithelial cells, absorptive enterocytes, across several datasets. The protease TMPRSS2 primes the spike protein of SARS-CoV-2 and is also important for viral entry. Because of this, we identified cells which co-express ACE2 and TMPRSS2 in our datasets, and investigated additional genes enriched within ACE2 and TMPRSS2 co-expressing cells. As we believe this data may prove useful to other researchers investigating similar questions, we have made our datasets public through the interactive Alexandria Project. Here, you can view our annotations of these datasets and investigate which other genes are highly expressed in these cell subsets of interest.

NB None of the datasets presented here were designed to answer specific questions about COVID-19. Additional studies will be required across larger, appropriately structured cohorts. Further, we provide a note of caution when interpreting scRNA-seq data for low abundance transcripts like ACE2 and TMPRSS2 as detection inefficiencies and/or sequencing depth may result in an underestimation of the actual frequencies of ACE2+ or ACE2+/TMPRSS2+ cells in tissues. Moreover, the protein levels of each may differ from their mRNA abundances. We present each data set separately, as each study differed by method of tissue processing and collection protocols, each of which can influence the frequency of recovered cell subsets.
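To make the caution about detection concrete, here is a minimal sketch assuming a simple Poisson capture model: a cell in which a transcript would yield `mean_umis` captured molecules on average produces at least one count with probability 1 − exp(−mean_umis). The model, the function names, and the numbers below are illustrative assumptions only; they are not estimates from these datasets.

```python
import math

def detection_probability(mean_umis: float) -> float:
    """Chance of observing >= 1 count, assuming Poisson-distributed capture."""
    return 1.0 - math.exp(-mean_umis)

def observed_positive_fraction(true_fraction: float, mean_umis: float) -> float:
    """Expected fraction of cells scored positive for a single gene."""
    return true_fraction * detection_probability(mean_umis)

def observed_double_positive_fraction(true_fraction: float,
                                      mean_a: float,
                                      mean_b: float) -> float:
    """Expected double-positive fraction, assuming the two transcripts
    are captured independently within a truly co-expressing cell."""
    return (true_fraction
            * detection_probability(mean_a)
            * detection_probability(mean_b))

# Hypothetical values: 10% of cells truly co-express both genes, and each
# transcript yields 0.5 captured molecules per expressing cell on average.
true_fraction = 0.10
single = observed_positive_fraction(true_fraction, 0.5)
double = observed_double_positive_fraction(true_fraction, 0.5, 0.5)
print(f"single-gene observed fraction: {single:.3f}")
print(f"double-positive observed fraction: {double:.3f}")
```

Under these assumptions the observed double-positive fraction falls further below the true fraction than either single-gene fraction does, because both transcripts must be captured in the same cell.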


Our pre-print “SARS-CoV-2 receptor ACE2 is an interferon-stimulated gene in human airway epithelial cells and is enriched in specific cell subsets across tissues” can be found here, and the abstract is reproduced below.

There is pressing urgency to better understand the pathogenesis of the severe acute respiratory syndrome (SARS) coronavirus (CoV) clade SARS-CoV-2. SARS-CoV-2, like SARS-CoV, utilizes ACE2 to bind host cells. While initial SARS-CoV-2 cell entry and infection depend on ACE2 in concert with the protease TMPRSS2 for spike (S) protein activation, the specific cell subsets targeted by SARS-CoV-2 in host tissues, and the factors that regulate ACE2 expression, remain unknown. Here, we leverage human and non-human primate (NHP) single-cell RNA-sequencing (scRNA-seq) datasets to uncover the cell subsets that may serve as cellular targets of SARS-CoV-2. We identify ACE2/TMPRSS2 co-expressing cells within type II pneumocytes, absorptive enterocytes, and nasal goblet secretory cells. Strikingly, we discover that ACE2 is an interferon-stimulated gene (ISG) in human barrier tissue epithelial cells. Thus, SARS-CoV-2 may exploit IFN-driven upregulation of ACE2, a key tissue-protective mediator during lung injury, to enhance infection.


Atlas of ACE2 expression in healthy non-human primate lung and ileum

In this study, we collected cells from various tissues in healthy and SHIV-infected non-human primates using Seq-Well v1. Here we highlight the lung and ileum and show that ACE2 and TMPRSS2 are co-expressed most frequently in type II pneumocytes in the lung and absorptive enterocytes in the ileum.

To visualize these cells in the Alexandria Project, visit this study: Atlas of healthy non-human primate lung and ileum ACE2+ cells

This project contains two UMAP visualizations (one for lung cells and one for ileum cells): toggle between them by clicking on the ‘Explore’ tab, select ‘View Options’ in the top right hand corner, and switch between lung and ileum under the ‘Load Cluster’ dropdown.

For each of the lung and ileum cell UMAPs, we provide subsets of these visualizations which contain only the cell types enriched for double positive cells. By selecting the ‘Load cluster’ options called ‘Lung Epithelial Cells’ or ‘Ileum Absorptive Enterocytes’, you will be able to view gene expression differences between double positive cells and other cells within those cell types. In the lung epithelial cells visualization, the following additional annotations are available under the ‘Select Annotation’ dropdown:

  • ACE2+ – cells expressing ACE2
  • TMPRSS2+ – cells expressing TMPRSS2
  • ACE2_TMPRSS2_double_positive – cells expressing both genes
  • celltype_double_positive – the cell type column and double positive column combined, so that genes differentially expressed within a single cell type may be viewed
  • celltype_ACE2+ – same as above for ACE2 expression
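
As a rough illustration of how annotations like those listed above could be derived from per-cell counts, here is a minimal Python sketch. It is not the pipeline used for these studies: the function name `annotate_cell`, the label strings, and the rule that any nonzero count means a gene is expressed are all assumptions made for the example.

```python
from typing import Dict


def annotate_cell(cell_type: str, counts: Dict[str, int], threshold: int = 0) -> Dict[str, str]:
    """Build illustrative annotation labels for a single cell.

    ASSUMPTION: a gene is "expressed" when its count is above `threshold`
    (any nonzero count by default). Label strings are hypothetical.
    """
    ace2 = counts.get("ACE2", 0) > threshold
    tmprss2 = counts.get("TMPRSS2", 0) > threshold
    double = ace2 and tmprss2

    ace2_label = "ACE2+" if ace2 else "ACE2-"
    double_label = "double_positive" if double else "other"
    return {
        "ACE2+": ace2_label,
        "TMPRSS2+": "TMPRSS2+" if tmprss2 else "TMPRSS2-",
        "ACE2_TMPRSS2_double_positive": double_label,
        # Cell type combined with each label, so comparisons stay within one cell type.
        "celltype_double_positive": f"{cell_type}_{double_label}",
        "celltype_ACE2+": f"{cell_type}_{ace2_label}",
    }


labels = annotate_cell("Type II Pneumocyte", {"ACE2": 2, "TMPRSS2": 1})
print(labels["celltype_double_positive"])  # Type II Pneumocyte_double_positive
```

Combining the cell type with the positivity label is what lets a violin plot compare double positive cells against other cells of the same type rather than against the whole tissue.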

After selecting one of these annotations, you can then search for a gene of interest in the ‘Search Genes’ box in the left corner to view the expression of that gene as a violin plot split by the annotation you select. For example, to reproduce Figure 1D in the manuscript, you would select annotation “ACE2_TMPRSS2_double_positive” and search for gene ‘IFNGR2’ in the ‘Search Genes’ box.

The same options are available for the ileal absorptive enterocytes visualization except that the cell type is combined with the double positive/ACE2 column since this subset only includes one cell type.

Full expression matrices (and therefore gene expression values visible when using the ‘Search Genes’ box) are available only for epithelial cell types as this data is derived from a pre-publication study.

Atlas of ACE2 and TMPRSS2 in human ileum

In this study, samples from human ileum were collected, processed, and run on 10x 3′ v2; they show ACE2 and TMPRSS2 co-expression in absorptive enterocytes.

To visualize these cells in the Alexandria Portal, visit this study: ACE2 and TMPRSS2 most enriched within GSTA1+MGST3+ absorptive enterocytes in context of non-inflamed terminal ileum

Two tSNE visualizations are available for this study: one of all cells in the ileum samples (“non-inflammed-tsne”) and one of just the epithelial cells in the samples (“non-inflammed-epth-tsne”). To toggle between these, select the “Explore” tab under the study title, then select ‘View options’ from the upper right corner of the plot and choose the visualization of interest in the ‘Load cluster’ dropdown.

Full expression matrices (and therefore gene expression values visible when using the ‘search genes’ box) are available only for the epithelial cell types as this data is derived from a pre-publication study.

Atlas of ACE2 and TMPRSS2 expression in human HIV- and TB-infected lung

In this study, samples from lung surgeries were run with Seq-Well S^3 and contain a variety of immune and epithelial cell types. We found the majority of ACE2 and TMPRSS2 double positive cells among type II pneumocytes.

To view these cells in the Alexandria Project, visit this study: Human lung HIV-TB co-infection ACE2+ cells

To visualize the double positive cells in these samples, click the ‘Explore’ tab, then select ‘View Options’ in the top right corner of the plot and choose ‘ACE2_TMPRSS2_double_positive’ under the ‘Select Annotation’ dropdown menu.

Full expression matrices (and therefore gene expression values visible when using the ‘search genes’ box) are available only for epithelial cell types as this data is derived from a pre-publication study.

A subset of ACE2+ secretory cells in human nasal mucosa

In this study, we find a subset of secretory cells which co-express ACE2 and TMPRSS2.

To visualize these cells on the Alexandria Project, visit this study: Allergic inflammatory memory in human respiratory epithelial progenitor cells

To view the tSNE of epithelial cells, select the ‘Explore’ tab, select ‘View Options’ in the right hand corner of the plot, and choose ‘Epithelial cells’ in the ‘Load Cluster’ dropdown. To view the cell type annotations in the first panel of the above plot, choose ‘subset’ in the ‘Select Annotation’ dropdown menu. To view the ACE2/TMPRSS2 double positive cells, select ‘ACE2_TMPRSS2’ from the ‘Select Annotation’ dropdown menu, and to view the cluster subsets from the third panel of this plot, select ‘res_0_8’ from the ‘Select Annotation’ dropdown menu.

ACE2 and TMPRSS2 co-expressing cells found in non-human primate granulomas and adjacent uninvolved lung tissue

In this study, we collected lung tissue from non-human primates infected with mTB. These tissues come from both mTB granulomas and adjacent uninvolved lung in the same monkey.

These cells, profiled using Seq-Well S^3, can be investigated interactively in the Alexandria Project: Epithelial cells in NHP TB granuloma and uninvolved lung

To view the data colored by granuloma and uninvolved lung choose the “Granuloma” annotation accessible by clicking “View Options” in the top right-hand corner and selecting from the “Select Annotation” dropdown menu.

Full expression matrices (and therefore gene expression values visible when using the ‘Search Genes’ box) are available only for relevant cell types as this data is derived from a pre-publication study.

Comparison of ACE2 and TMPRSS2 expression in human duodenal and ileal tissue and organoid-derived epithelial cells

In this study, samples from adult human duodenum and ileum were collected and split for primary tissue single-cell RNA-seq and organoid culture under several conditions and profiled with Seq-Well S^3. Organoids were cultured and passaged every 6-8 days in Matrigel domes with established media conditions meant to recapitulate the broad diversity of in vivo epithelial cell types (Fujii, M., et al., Cell Stem Cell. 2018). Organoid culture media contained recombinant Noggin, Rspondin-3, FGF2, IGF1, afamin-Wnt3A, in addition to Gastrin and TGF-b inhibitor A83-01 with and without recombinant EGF (E/NR3+F2I1Gi+Af-W3+A83).  Cells co-expressing ACE2 and TMPRSS2 were identified principally within enterocyte clusters of both tissues and organoids.

Visualize these samples interactively and read more about this study on the Alexandria Project: Comparison of ACE2 and TMPRSS2 expression in human duodenal and ileal tissue and organoid-derived epithelial cells

This project contains two UMAP visualizations (one for organoid cells and one for primary tissue cells). To toggle between them, click on the ‘Explore’ tab, select ‘View Options’ in the top right hand corner, and switch between tissue and organoid under the ‘Load Cluster’ dropdown.

To visualize cells which co-express ACE2 and TMPRSS2, select the ‘ACE2_TMPRSS2’ option under the ‘Select Annotation’ dropdown. You can then search for a gene of interest in the ‘Search Genes’ box in the left corner to view the expression of that gene as a violin plot split by the annotation you select.

Full expression matrices (and therefore gene expression values visible when using the ‘Search Genes’ box) are available only for enterocyte cell types as this data is derived from a pre-publication study. Expression values for ACE2 and TMPRSS2 are available for all cells.
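
Because ACE2 and TMPRSS2 values are available for every cell, the share of co-expressing cells in each cluster can be tallied directly. The sketch below shows one way to do that in Python. The input layout (a cluster label plus two expression values per cell), the function name, and the nonzero-expression cutoff are assumptions for illustration, not the format of the downloadable files or the method used in the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def double_positive_fraction(cells: Iterable[Tuple[str, float, float]]) -> Dict[str, float]:
    """Return, per cluster, the fraction of cells expressing both genes.

    Each cell is (cluster_label, ACE2_expression, TMPRSS2_expression).
    ASSUMPTION: expression above zero counts as "expressed".
    """
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for cluster, ace2, tmprss2 in cells:
        totals[cluster] += 1
        if ace2 > 0 and tmprss2 > 0:
            hits[cluster] += 1
    return {cluster: hits[cluster] / totals[cluster] for cluster in totals}


# Toy data with hypothetical cluster names and values.
example = [
    ("enterocyte", 1.0, 2.0),
    ("enterocyte", 0.0, 3.0),
    ("goblet", 0.0, 0.0),
]
print(double_positive_fraction(example))  # {'enterocyte': 0.5, 'goblet': 0.0}
```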

Interferon regulation of ACE2 in human and murine basal cells

Analysis of these datasets and others led to the hypothesis that expression of the ACE2 receptor may be upregulated by interferon. To further interrogate this hypothesis, we cultured basal cells from two primary human donors, one human basal cell line, and one mouse trachea and stimulated them with IL4, IL17a, IFNgamma, IFNalpha, or IFNbeta for 12 hours (overnight). We performed bulk RNA sequencing and differential expression analysis to show a dose-dependent upregulation of canonical ISGs (interferon-stimulated genes). Specifically, we see that ACE2 is most significantly upregulated following IFNalpha stimulation in primary human basal cells, with a diminished effect in the BEAS-2B cell line and no upregulation seen in mouse cells.

To visualize these samples, the gene expression data may be viewed interactively in the Alexandria Project: Interferon regulation of ACE2 in human and murine basal cells

Under the ‘Explore’ tab, use the ‘Search genes’ field in the top left corner to visualize log-normalized gene expression. To visualize expression in each sample, select ‘View Options’ in the top left corner of the plot and choose the sample of interest under ‘Load cluster’. The ‘Stim_Dose’ annotation refers to the dose of each stimulation condition applied to that sample. 
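
For readers unfamiliar with the term, log-normalization is commonly done by scaling each sample's counts to a fixed total and then taking log(1 + x). The Python sketch below shows that common recipe. The scale factor of 10,000, the natural log, and the function name are assumptions for the example; the exact normalization applied to this dataset is not specified here.

```python
import math
from typing import Dict


def log_normalize(counts: Dict[str, float], scale: float = 10_000.0) -> Dict[str, float]:
    """Scale one sample's counts to `scale` total counts, then apply log(1 + x).

    ASSUMPTION: scale factor and natural log are conventional choices,
    not necessarily those used for the data shown in the portal.
    """
    total = sum(counts.values())
    if total == 0:
        # No counts at all: every gene is reported as zero expression.
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(value / total * scale) for gene, value in counts.items()}


# Toy sample with made-up counts.
sample = {"ACE2": 1, "ACTB": 9}
normalized = log_normalize(sample)
print(round(normalized["ACE2"], 3))
```

Dividing by the sample total first makes values comparable between samples sequenced to different depths, and the log transform keeps highly expressed genes from dominating the plot.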

Data for all samples in this study can be downloaded under the ‘Download’ tab.

Murine nasal mucosa after intranasal interferon exposure

We test the impact of IFNalpha stimulation in vivo by treating two mice intranasally with 200 ng of IFNalpha and two with saline. After 12 hours, the nasal mucosa of the respiratory and olfactory epithelia and underlying lamina propria were isolated and prepared for sequencing with Seq-Well S^3.

These cells may be viewed interactively in the Alexandria Project: Murine nasal mucosa after intranasal interferon exposure

Data for all samples in this study can be downloaded under the ‘Download’ tab.

Alexandria Project Details

Alexandria Documentation

Single Cell Portal Documentation

We thank the Broad Institute Single Cell Portal team for creating the platform that allows Alexandria to exist and for working tirelessly to help us share our datasets for others to access.

By affinity capture and amplification of TCR transcripts from whole-transcriptome libraries, TCR CDR3 sequences can be recovered from 3′-barcoded scRNA-seq libraries (e.g. Seq-Well, Drop-seq, etc.). This method can be applied post-hoc, allowing for the capture of additional information from archived samples. The protocol can also be found here.

High-throughput 3′ single-cell RNA-sequencing (scRNA-seq) allows cost-effective, detailed characterization of individual immune cells from tissues. Current techniques, however, are limited in their ability to elucidate essential immune cell features, including variable sequences of T cell antigen receptors (TCRs) that confer antigen specificity. Here, we present a strategy that enables simultaneous analysis of TCR sequences and corresponding full transcriptomes from 3′-barcoded scRNA-seq samples. This approach is compatible with common 3′ scRNA-seq methods, and adaptable to processed samples post hoc. We applied the technique to identify transcriptional signatures associated with T cells sharing common TCRs from immunized mice and from patients with food allergy. We observed preferential phenotypes among subsets of expanded clonotypes, including type 2 helper CD4+ T cell (TH2) states associated with food allergy. These results demonstrate the utility of our method when studying diseases in which clonotype-driven responses are critical to understanding the underlying biology. The protocol can be found here.

Immune responses within barrier tissues are regulated, in part, by nociceptors, specialized peripheral sensory neurons that detect noxious stimuli. Previous work has shown that nociceptor ablation not only alters local responses to immune challenge at peripheral sites, but also within draining lymph nodes (LNs). The mechanisms and significance of nociceptor-dependent modulation of LN function are unknown. Indeed, although sympathetic innervation of LNs is well documented, it has been unclear whether the LN parenchyma itself is innervated by sensory neurons. Here, using a combination of high-resolution imaging, retrograde viral tracing, single-cell transcriptomics (scRNA-seq), and optogenetics, we identified and functionally tested a sensory neuro-immune circuit that is preferentially located in the outermost cortex of skin-draining LNs. Transcriptomic profiling revealed that there are at least four discrete subsets of sensory neurons that innervate LNs with a predominance of peptidergic nociceptors, and an innervation pattern that is distinct from that in the surrounding skin. To uncover potential LN-resident communication partners for LN-innervating sensory neurons, we employed scRNA-seq to generate a draft atlas of all murine LN cells and, based on receptor-ligand expression patterns, nominated candidate target populations among stromal and immune cells. Using selective optogenetic stimulation of LN-innervating sensory axons, we directly experimentally tested our inferred connections. Acute neuronal activation triggered rapid transcriptional changes preferentially within our top-ranked putative interacting partners, principally endothelium and other nodal stroma cells, as well as several innate leukocyte populations. Thus, LNs are monitored by a unique population of sensory neurons that possesses immunomodulatory potential.

Genome-wide association studies (GWAS) have identified genetic variants associated with age-related macular degeneration (AMD), one of the leading causes of blindness in the elderly. However, it has been challenging to identify the cell types associated with AMD given the genetic complexity of the disease. Here we perform massively parallel single-cell RNA sequencing (scRNA-seq) of human retinas using two independent platforms, and report the first single-cell transcriptomic atlas of the human retina. Using a multi-resolution network-based analysis, we identify all major retinal cell types, and their corresponding gene expression signatures. Heterogeneity is observed within macroglia, suggesting that human retinal glia are more diverse than previously thought. Finally, GWAS-based enrichment analysis identifies glia, vascular cells, and cone photoreceptors to be associated with the risk of AMD. These data provide a detailed analysis of the human retina, and show how scRNA-seq can provide insight into cell types involved in complex, inflammatory genetic diseases.

Genome-wide association studies (GWAS) have revealed risk alleles for ulcerative colitis (UC). To understand their cell type specificities and pathways of action, we generate an atlas of 366,650 cells from the colon mucosa of 18 UC patients and 12 healthy individuals, revealing 51 epithelial, stromal, and immune cell subsets, including BEST4+ enterocytes, microfold-like cells, and IL13RA2+IL11+ inflammatory fibroblasts, which we associate with resistance to anti-TNF treatment. Inflammatory fibroblasts, inflammatory monocytes, microfold-like cells, and T cells that co-express CD8 and IL-17 expand with disease, forming intercellular interaction hubs. Many UC risk genes are cell type specific and co-regulated within relatively few gene modules, suggesting convergence onto limited sets of cell types and pathways. Using this observation, we nominate and infer functions for specific risk genes across GWAS loci. Our work provides a framework for interrogating complex human diseases and mapping risk variants to cell types and pathways.

The liver can substantially regenerate after injury, with both main epithelial cell types, hepatocytes and biliary epithelial cells (BECs), playing important roles in parenchymal regeneration. Beyond metabolic functions, BECs exhibit substantial plasticity and in some contexts can drive hepatic repopulation. Here, we performed single-cell RNA sequencing to examine BEC and hepatocyte heterogeneity during homeostasis and after injury. Instead of evidence for a transcriptionally defined progenitor-like BEC cell, we found significant homeostatic BEC heterogeneity that reflects fluctuating activation of a YAP-dependent program. This transcriptional signature defines a dynamic cellular state during homeostasis and is highly responsive to injury. YAP signaling is induced by physiological bile acids (BAs), required for BEC survival in response to BA exposure, and is necessary for hepatocyte reprogramming into biliary progenitors upon injury. Together, these findings uncover molecular heterogeneity within the ductal epithelium and reveal YAP as a protective rheostat and regenerative regulator in the mammalian liver.

Genome-wide association studies (GWAS) have revealed risk alleles for ulcerative colitis (UC), but their cell type and pathway specificities are often unknown. Here, we generate an atlas of 115,517 cells from the colon mucosa of seven UC patients and ten healthy individuals, revealing 51 epithelial, stromal, and immune cell subsets, including a subset of BEST4+ enterocytes, which may sense and respond to pH, and IL13RA2+IL-11+ inflammatory fibroblasts, which we associate with resistance to anti-TNF therapy. Inflammatory fibroblasts, inflammatory monocytes, microfold-like cells, and CD8+IL-17+ T cells expand during disease, and form intercellular interaction hubs that mediate cross-talk between diverse cellular lineages. We identify hundreds of putative autocrine and paracrine cell-cell interactions that may explain the migration, expansion, or inhibition of cell types with disease. Surprisingly, UC risk genes are often cell type specific and co-regulated in relatively few gene modules, suggesting convergence onto limited sets of cell types and pathways. Using this observation, we nominate and infer putative functions for UC risk genes across all GWAS loci. Our atlas thus provides a framework for interrogating complex human diseases and mapping risk variants onto their cell types and pathways of activity.

Barrier tissue dysfunction is a fundamental feature of chronic human inflammatory diseases1. Specialized subsets of epithelial cells—including secretory and ciliated cells—differentiate from basal stem cells to collectively protect the upper airway2,3,4. Allergic inflammation can develop from persistent activation5 of type 2 immunity6 in the upper airway, resulting in chronic rhinosinusitis, which ranges in severity from rhinitis to severe nasal polyps7. Basal cell hyperplasia is a hallmark of severe disease7,8,9, but it is not known how these progenitor cells2,10,11 contribute to clinical presentation and barrier tissue dysfunction in humans. Here we profile primary human surgical chronic rhinosinusitis samples (18,036 cells, n = 12) that span the disease spectrum using Seq-Well for massively parallel single-cell RNA sequencing12, report transcriptomes for human respiratory epithelial, immune and stromal cell types and subsets from a type 2 inflammatory disease, and map key mediators. By comparison with nasal scrapings (18,704 cells, n = 9), we define signatures of core, healthy, inflamed and polyp secretory cells. We reveal marked differences between the epithelial compartments of the non-polyp and polyp cellular ecosystems, identifying and validating a global reduction in cellular diversity of polyps characterized by basal cell hyperplasia, concomitant decreases in glandular cells, and phenotypic shifts in secretory cell antimicrobial expression. We detect an aberrant basal progenitor differentiation trajectory in polyps, and propose cell-intrinsic13, epigenetic14,15 and extrinsic factors11,16,17 that lock polyp basal cells into this uncommitted state. Finally, we functionally demonstrate that ex vivo cultured basal cells retain intrinsic memory of IL-4/IL-13 exposure, and test the potential for clinical blockade of the IL-4 receptor α-subunit to modify basal and secretory cell states in vivo.
Overall, we find that reduced epithelial diversity stemming from functional shifts in basal cells is a key characteristic of type 2 immune-mediated barrier tissue dysfunction. Our results demonstrate that epithelial stem cells may contribute to the persistence of human disease by serving as repositories for allergic memories.

The recent advent of methods for high-throughput single-cell molecular profiling has catalyzed a growing sense in the scientific community that the time is ripe to complete the 150-year-old effort to identify all cell types in the human body. The Human Cell Atlas Project is an international collaborative effort that aims to define all human cell types in terms of distinctive molecular profiles (such as gene expression profiles) and to connect this information with classical cellular descriptions (such as location and morphology). An open comprehensive reference map of the molecular state of cells in healthy human tissues would propel the systematic study of physiological states, developmental trajectories, regulatory circuitry and interactions of cells, and also provide a framework for understanding cellular dysregulation in human disease. Here we describe the idea, its potential utility, early proofs-of-concept, and some design considerations for the Human Cell Atlas, including a commitment to open data, code, and community.

Tissue barrier dysfunction is a poorly defined feature hypothesized to drive chronic human inflammatory disease. The epithelium of the upper respiratory tract represents one such barrier, responsible for separating inhaled agents, such as pathogens and allergens, from the underlying submucosa. Specialized epithelial subsets-including secretory, glandular, and ciliated cells-differentiate from basal progenitors to collectively realize this role. Allergic inflammation in the upper airway barrier can develop from persistent activation of Type 2 immunity (T2I), resulting in the disease spectrum known as chronic rhinosinusitis (CRS), ranging from rhinitis to severe nasal polyps. Whether recently identified epithelial progenitor subsets, and their differentiation trajectory, contribute to the clinical presentation and barrier dysfunction in T2I-mediated disease in humans remains unexplored. Profiling twelve primary human CRS samples spanning the range of clinical severity with the Seq-Well platform for massively-parallel single-cell RNA-sequencing (scRNA-seq), we report the first single-cell transcriptomes for human respiratory epithelial cell subsets, immune cells, and parenchymal cells (18,036 total cells) from a T2I inflammatory disease, and map key mediators. We find striking differences between non-polyp and polyp tissues within the epithelial compartments of human T2I cellular ecosystems. More specifically, across 10,383 epithelial cells, we identify a global reduction in epithelial diversity in polyps characterized by basal cell hyperplasia, a concomitant decrease in glandular and ciliated cells, and phenotypic shifts in secretory cell function. We validate these findings through flow cytometry, histology, and bulk tissue RNA-seq of an independent cohort. Furthermore, we detect an aberrant basal progenitor differentiation trajectory in polyps, and uncover cell-intrinsic and extrinsic factors that may lock polyp basal cells into an uncommitted state. 
Overall, our data define severe T2I barrier dysfunction as a reduction in epithelial diversity, characterized by profound functional shifts stemming from basal cell defects, and nominate a cellular mechanism for the persistence and chronicity of severe human respiratory disease.

Click here to read the pre-publication manuscript.

We develop single-cell genomic approaches to comprehensively profile complex biological ensembles. To date, the majority of our work has focused on establishing, validating, and scaling single-cell transcriptomics, often through the development of microdevices to enable genome-wide identification of the cell types/states that comprise functional or dysfunctional biological samples.

Most recently, we have developed Seq-Well, a portable, low-cost platform for high-throughput single-cell RNA-Seq (scRNA-Seq). By providing open access to resources and protocols, we hope to democratize access to cutting-edge approaches in single-cell genomics.

As the amount of data we have relating to cells, properties, surroundings, and interactions increases exponentially, we are motivated to develop pan-system measurements and analyses to paint comprehensive pictures of immune response in health and disease. Relying on massive transcriptomic datasets generated from complex tissues, like melanoma tumors, inflamed human gut, M. tuberculosis (MTB)-induced granulomas, and healthy or SHIV-infected monkey tissues, we have begun to construct social networks of integrated responses to physiological perturbations. The technologies outlined above uniquely enable us to generate foundational datasets (e.g., transcriptomes from interacting cell pairs) for deconvolving and interpreting the potential drivers of observed ensemble behaviors, as well as for identifying which properties we cannot explain, and thus need to study. To date, our lab has generated over 2 million single-cell transcriptomes across multiple tissues, individuals, and species; we are utilizing this data, paired with metadata and additional characteristics, to look for common cellular network motifs, such as division of labor, quorum sensing, persistence, or bet-hedging.


The immune system plays an important role in regulating homeostatic balance across tissues and individuals in the face of changing and challenging environments. Given the pivotal and outsized impact cell subsets (e.g., rare precocious DCs) can have on ensemble dynamics (e.g., global activation of an antiviral response and deactivation of inflammation), we aim to understand the functional consequences of variation in cellular composition across tissues, as well as how different immune cells adapt to changing environmental conditions.

Motivating questions in the lab include:

  1. How can we perform observational and experimental studies to understand the fundamental units of tissue structure and function?
  2. Can we derive basic principles governing homeostatic and pathogenic immune responses within tissues?
  3. What dictates the evolution of clonal antigen-specific T & B cell responses?

To this end, we are profiling multiple tissues from multiple organisms across common sources of variation. By examining consistent and unique themes that emerge across these systems, we aim to extract basic principles that govern homeostatic and pathogenic immune responses within tissues. Ultimately, we intend to leverage this information to rationally engineer immune responses (e.g., in vaccines and immunotherapies).