Computational cohort materials from March 2023 Ghana training

Hepatitis B virus (HBV) infection is restricted to the liver where it drives exhaustion of virus-specific T and B cells and pathogenesis through dysregulation of intrahepatic immunity. Our understanding of liver-specific events related to viral control and liver damage has relied almost solely on animal models and we lack usable peripheral biomarkers to quantify intrahepatic immune activation beyond cytokine measurement. Our objective was to overcome practical obstacles of liver sampling using fine-needle aspiration (FNA) and develop an optimized workflow to comprehensively compare the blood and liver compartments within chronic hepatitis B (CHB) patients using single-cell RNA sequencing (scRNAseq). We developed a workflow that enabled multi-site international studies and centralized scRNAseq. Blood and liver FNAs were collected, and cellular and molecular capture were compared between the Seq-Well S3 picowell-based and the 10x Chromium reverse-emulsion droplet-based scRNAseq technologies. Both technologies captured the cellular diversity of the liver but Seq-Well S3 effectively captured neutrophils, which were absent in the 10x dataset. CD8 T cells and neutrophils displayed distinct transcriptional profiles between blood and liver. In addition, liver FNAs captured a heterogeneous liver macrophage population. Comparison between untreated CHB patients and patients treated with nucleoside analogues showed that myeloid cells were highly sensitive to environmental changes while lymphocytes displayed minimal differences. The ability to electively sample and intensively profile the immune landscape of the liver, and generate high-resolution data, will enable multi-site clinical studies to identify biomarkers for intrahepatic immune activity in HBV and beyond.

Ovulation is an integral part of women’s menstrual cycle and fertility. Understanding the mechanisms of ovulation has broad implications for the treatment of anovulatory diseases and the development of novel contraceptives. To date, few studies have developed effective models that both faithfully recapitulate the hallmarks of ovulation and possess scalability. We established a three-dimensional encapsulated in vitro follicle growth (eIVFG) system that recapitulates folliculogenesis and produces follicles that undergo ovulation in a controlled manner. Here, we determined whether ex vivo ovulation preserves molecular signatures of ovulation and demonstrated its use in discovering novel ovulatory pathways and nonhormonal contraceptive candidates through a high-throughput ovulation screening. Mature murine follicles from eIVFG were induced to ovulate ex vivo using human chorionic gonadotropin and collected at 0, 1, 4, and 8 hours post-induction. Phenotypic analyses confirmed key ovulatory events, including cumulus expansion, oocyte maturation, follicle rupture, and luteinization. Single-follicle RNA-sequencing analysis revealed the preservation of ovulatory genes and dynamic transcriptomic profiles and signaling. Soft clustering identified distinct gene expression patterns and new pathways that may critically regulate ovulation. We further used this ex vivo ovulation system to screen 21 compounds targeting established and newly identified ovulatory pathways. We discovered that proprotein convertases activate gelatinases to sustain follicle rupture and do not regulate luteinization and progesterone secretion. Together, our ex vivo ovulation system preserves molecular signatures of ovulation, presenting a new powerful tool for studying ovulation and anovulatory diseases as well as for establishing a high-throughput ovulation screening to identify novel nonhormonal contraceptives for women.

Patients with chronic lung disease (CLD) have an increased risk for severe coronavirus disease-19 (COVID-19) and poor outcomes. Here, we analyze the transcriptomes of 611,398 single cells isolated from healthy and CLD lungs to identify molecular characteristics of lung cells that may account for worse COVID-19 outcomes in patients with chronic lung diseases. We observe a similar cellular distribution and relative expression of SARS-CoV-2 entry factors in control and CLD lungs. CLD AT2 cells express higher levels of genes linked directly to the efficiency of viral replication and the innate immune response. Additionally, we identify basal differences in inflammatory gene expression programs that highlight how CLD alters the inflammatory microenvironment encountered upon viral exposure to the peripheral lung. Our study indicates that CLD is accompanied by changes in cell-type-specific gene expression programs that prime the lung epithelium for and influence the innate and adaptive immune responses to SARS-CoV-2 infection.

Temporal resolution of cellular features associated with a severe COVID-19 disease trajectory is needed for understanding skewed immune responses and defining predictors of outcome. Here, we performed a longitudinal multi-omics study using a two-center cohort of 14 patients. We analyzed the bulk transcriptome, bulk DNA methylome, and single-cell transcriptome (>358,000 cells, including BCR profiles) of peripheral blood samples harvested from up to 5 time points. Validation was performed in two independent cohorts of COVID-19 patients. Severe COVID-19 was characterized by an increase of proliferating, metabolically hyperactive plasmablasts. Coinciding with critical illness, we also identified an expansion of interferon-activated circulating megakaryocytes and increased erythropoiesis with features of hypoxic signaling. Megakaryocyte- and erythroid-cell-derived co-expression modules were predictive of fatal disease outcome. The study demonstrates broad cellular effects of SARS-CoV-2 infection beyond adaptive immune cells and provides an entry point toward developing biomarkers and targeted treatments of patients with COVID-19.

Coronavirus disease 2019 (COVID-19) is a global pandemic caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). SARS-CoV-2 infection occurs predominantly by binding of the viral surface spike protein to the human angiotensin-converting enzyme 2 (ACE2) receptor. Hypertension and preexisting cardiovascular disease are risk factors for morbidity from COVID-19, and it remains uncertain whether the use of angiotensin-converting enzyme inhibitors (ACEis) or angiotensin receptor blockers affects infection and disease. This uncertainty has provoked public statements by the American Heart Association, the Heart Failure Society of America, and the American College of Cardiology advising continuation of these agents in the absence of compelling new data.

T cells have a central role in adaptive immune responses. However, no accurate assays currently exist that link measurements of ex vivo or in vitro function to effective in vivo T cell responses. Diagnostic detection of T cell function in infectious and immune-mediated diseases also lags in vitro assessments of antibody function. An improved understanding of T cell responses will help researchers and clinicians better predict immune outcomes in response to vaccines, pathogenic infections or immune-mediated diseases. To address these issues, the National Institute of Allergy and Infectious Diseases (NIAID) convened the ‘T Cell Technologies: Assays, Innovations, Challenges, and Opportunities Workshop’ on 15–16 June 2022. The goals of the workshop were to explore assays and technologic advances that could improve understanding of T cell activation and function in different immune conditions, tissues and infections, and to identify methodologies that best provide an accurate measure of T cell biological relevance.

Cynomolgus macaque (Macaca fascicularis) is an attractive animal model for the study of human disease and is extensively used in biomedical research. Cynomolgus macaques share behavioral, physiological, and genomic traits with humans and recapitulate human disease manifestations not observed in other animal species. To improve the use of the cynomolgus macaque model to investigate immune responses, we defined and characterized the T cell receptor (TCR) repertoire. We identified and analyzed the alpha (TRA), beta (TRB), gamma (TRG), and delta (TRD) TCR loci of the cynomolgus macaque. The expressed repertoire was determined using 22 unique lung samples from Mycobacterium tuberculosis infected cynomolgus macaques by single cell RNA sequencing. Expressed TCR alpha (TRAV) and beta (TRBV) variable region genes were enriched and identified using gene specific primers, which allowed their functional status to be determined. Analysis of the primers used for cynomolgus macaque TCR variable region gene enrichment showed they could also be used to amplify rhesus macaque (M. mulatta) variable region genes. The genomic organization of the cynomolgus macaque has great similarity with the rhesus macaque and they shared > 90% sequence similarity with the human TCR repertoire. The identification of the TCR repertoire facilitates analysis of T cell immunity in cynomolgus macaques.

The immune system represents a major barrier to cancer progression, driving the evolution of immunoregulatory interactions between malignant cells and T-cells in the tumor environment. Blastic plasmacytoid dendritic cell neoplasm (BPDCN), a rare acute leukemia with plasmacytoid dendritic cell (pDC) differentiation, provides a unique opportunity to study these interactions. pDCs are key producers of interferon alpha (IFNA) that play an important role in T-cell activation at the interface between the innate and adaptive immune system. To assess how uncontrolled proliferation of malignant BPDCN cells affects the tumor environment, we catalog immune cell heterogeneity in the bone marrow (BM) of five healthy controls and five BPDCN patients by analyzing 52,803 single-cell transcriptomes, including 18,779 T-cells. We test computational techniques for robust cell type classification and find that T-cells in BPDCN patients consistently upregulate interferon alpha (IFNA) response and downregulate tumor necrosis factor alpha (TNFA) pathways. Integrating transcriptional data with T-cell receptor sequencing via shared barcodes reveals significant T-cell exhaustion in BPDCN that is positively correlated with T-cell clonotype expansion. By highlighting new mechanisms of T-cell exhaustion and immune evasion in BPDCN, our results demonstrate the value of single-cell multiomics to understand immune cell interactions in the tumor environment.
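The barcode-based integration mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each cell has one score derived from its transcriptome and, where a TCR was recovered, one clonotype label, both keyed by the same cell barcode. All barcodes, scores, clonotype names, and function names here are hypothetical and chosen only for illustration; this is not the authors' pipeline.

```python
from collections import defaultdict

# Hypothetical per-cell tables keyed by cell barcode (values are made up).
exhaustion_score = {  # e.g. a gene-signature score from the transcriptome
    "AAAC": 0.9, "AAAG": 0.8, "AACT": 0.7,
    "ACGT": 0.2, "AGGT": 0.1, "CCTA": 0.3,
}
clonotype = {  # from TCR sequencing; "CCTA" has no recovered TCR
    "AAAC": "clone1", "AAAG": "clone1", "AACT": "clone1",
    "ACGT": "clone2", "AGGT": "clone3",
}


def join_by_barcode(scores, clonotypes):
    """Inner join: keep only cells present in both tables."""
    shared = sorted(scores.keys() & clonotypes.keys())
    return [(bc, clonotypes[bc], scores[bc]) for bc in shared]


def summarize_clones(joined):
    """Return {clonotype: (number of cells, mean score)}."""
    sizes = defaultdict(int)
    totals = defaultdict(float)
    for _, clone, score in joined:
        sizes[clone] += 1
        totals[clone] += score
    return {clone: (n, totals[clone] / n) for clone, n in sizes.items()}


joined = join_by_barcode(exhaustion_score, clonotype)
summary = summarize_clones(joined)
for clone, (n_cells, mean_score) in sorted(summary.items()):
    print(clone, n_cells, round(mean_score, 2))
```

With per-clonotype size and mean score in one table, the relationship between clonal expansion and a transcriptional state can then be examined with whatever statistic suits the study design.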

T cell receptor (TCR) clonotype tracking is a powerful tool for interrogating T cell mediated immune processes. New methods to pair a single cell’s transcriptional program with its TCR identity allow monitoring of T cell clonotype-specific transcriptional dynamics. While these technologies have been available for human and mouse T cell studies, they have not been developed for rhesus macaques (RM), a critical translational organism for autoimmune diseases, vaccine development and transplantation. We describe a new pipeline, ‘RM-scTCR-Seq’, which, for the first time, enables RM-specific single cell TCR amplification, reconstruction and pairing of RM TCRs with their transcriptional profiles. We apply this method to an RM model of GVHD, and identify and track in vitro detected alloreactive clonotypes in GVHD target organs and explore their GVHD-driven cytotoxic T cell signature. This novel, state-of-the-art platform fundamentally advances the utility of RM to study protective and pathogenic T cell responses.

Environmental enteropathy (EE) is a subclinical condition of the small intestine that is highly prevalent in low- and middle-income countries. It is thought to be a key contributing factor to childhood malnutrition, growth stunting, and diminished oral vaccine responses. Although EE has been shown to be the by-product of a recurrent enteric infection, its full pathophysiology remains unclear. Here, we mapped the cellular and molecular correlates of EE by performing high-throughput, single-cell RNA-sequencing on 33 small intestinal biopsies from 11 adults with EE in Lusaka, Zambia (eight HIV-negative and three HIV-positive), six adults without EE in Boston, United States, and two adults in Durban, South Africa, which we complemented with published data from three additional individuals from the same clinical site. We analyzed previously defined bulk-transcriptomic signatures of reduced villus height and decreased microbial translocation in EE and showed that these signatures may be driven by an increased abundance of surface mucosal cells—a gastric-like subset previously implicated in epithelial repair in the gastrointestinal tract. In addition, we determined cell subsets whose fractional abundances associate with EE severity, small intestinal region, and HIV infection. Furthermore, by comparing duodenal EE samples with those from three control cohorts, we identified dysregulated WNT and MAPK signaling in the EE epithelium and increased proinflammatory cytokine gene expression in a T cell subset highly expressing a transcriptional signature of tissue-resident memory cells in the EE cohort. Together, our work elucidates epithelial and immune correlates of EE and nominates cellular and molecular targets for intervention.

The Human Cell Atlas (HCA) is a global consortium of scientists who are compiling an exhaustive guidebook on the types and properties of all human cells. This includes best-practice recommendations for making HCA research results beneficial for everyone. The consortium strongly opposes exploitation of differences in those results for any form of discrimination or racial profiling.

Mycobacterium tuberculosis lung infection results in a complex multicellular structure: the granuloma. In some granulomas, immune activity promotes bacterial clearance, but in others, bacteria persist and grow. We identified correlates of bacterial control in cynomolgus macaque lung granulomas by co-registering longitudinal positron emission tomography and computed tomography imaging, single-cell RNA sequencing, and measures of bacterial clearance. Bacterial persistence occurred in granulomas enriched for mast, endothelial, fibroblast, and plasma cells, signaling amongst themselves via type 2 immunity and wound-healing pathways. Granulomas that drove bacterial control were characterized by cellular ecosystems enriched for type 1-type 17, stem-like, and cytotoxic T cells engaged in pro-inflammatory signaling networks involving diverse cell populations. Granulomas that arose later in infection displayed functional characteristics of restrictive granulomas and were more capable of killing Mtb. Our results define the complex multicellular ecosystems underlying (lack of) granuloma resolution and highlight host immune targets that can be leveraged to develop new vaccine and therapeutic strategies for TB.

Malaria-causing Plasmodium vivax parasites can linger in the human liver for weeks to years and reactivate to cause recurrent blood-stage infection. Although they are an important target for malaria eradication, little is known about the molecular features of replicative and non-replicative intracellular liver-stage parasites and their host cell dependence. Here, we leverage a bioengineered human microliver platform to culture patient-derived P. vivax parasites for transcriptional profiling. Coupling enrichment strategies with bulk and single-cell analyses, we capture both parasite and host transcripts in individual hepatocytes throughout the course of infection. We define host- and state-dependent transcriptional signatures and identify unappreciated populations of replicative and non-replicative parasites that share features with sexual transmissive forms. We find that infection suppresses the transcription of key hepatocyte function genes and elicits an anti-parasite innate immune response. Our work provides a foundation for understanding host-parasite interactions and reveals insights into the biology of P. vivax dormancy and transmission.

Human breast milk (hBM) is a dynamic fluid that contains millions of cells, but their identities and phenotypic properties are poorly understood. We generated and analyzed single-cell RNA-sequencing (scRNA-seq) data to characterize the transcriptomes of cells from hBM across lactational time from 3 to 632 d postpartum in 15 donors. We found that the majority of cells in hBM are lactocytes, a specialized epithelial subset, and that cell-type frequencies shift over the course of lactation, yielding greater epithelial diversity at later points. Analysis of lactocytes reveals a continuum of cell states characterized by transcriptional changes in hormone-, growth factor-, and milk production-related pathways. Generalized additive models suggest that one subcluster, LC1 epithelial cells, increases as a function of time postpartum, daycare attendance, and the use of hormonal birth control. We identify several subclusters of macrophages in hBM that are enriched for tolerogenic functions, possibly playing a role in protecting the mammary gland during lactation. Our description of the cellular components of breast milk, their association with maternal–infant dyad metadata, and our quantification of alterations at the gene and pathway levels provide a detailed longitudinal picture of hBM cells across lactational time. This work paves the way for future investigations of how a potential division of cellular labor and differential hormone regulation might be leveraged therapeutically to support healthy lactation and potentially aid in milk production.

Protocol for integrating CITE-seq with well-based scRNA-seq protocols.

Blood samples are frequently collected in human studies of the immune system but poorly represent tissue-resident immunity. Understanding the immunopathogenesis of tissue-restricted diseases, such as chronic hepatitis B, necessitates direct investigation of local immune responses. We developed a workflow that enables frequent, minimally invasive collection of liver fine-needle aspirates in multi-site international studies and centralized single-cell RNA sequencing data generation using the Seq-Well S3 picowell-based technology. All immunological cell types were captured, including liver macrophages, and showed distinct compartmentalization and transcriptional profiles, providing a systematic assessment of the capabilities and limitations of peripheral blood samples when investigating tissue-restricted diseases. The ability to electively sample the liver of chronic viral hepatitis patients and generate high-resolution data will enable multi-site clinical studies to power fundamental and therapeutic discovery.

Human breast milk is a dynamic fluid that contains millions of cells, but their identities and phenotypic properties are poorly understood. We used single-cell RNA-seq (scRNA-seq) to characterize the transcriptomes of cells from human breast milk (hBM) across lactational time from 3 to 632 days postpartum in 15 donors. We find that the majority of cells in human breast milk are lactocytes, a specialized epithelial subset, and cell type frequencies shift over the course of lactation yielding greater epithelial diversity at later points. Analysis of lactocytes reveals a continuum of cell states characterized by transcriptional changes in hormone, growth factor, and milk production related pathways. Generalized additive models suggest that one sub-cluster, LALBA-low epithelial cells, increases as a function of time postpartum, daycare attendance, and the use of hormonal birth control. We identify several sub-clusters of macrophages in hBM that are enriched for tolerogenic functions, possibly playing a role in protecting the mammary gland during lactation. Our description of the cellular components of breast milk, their association with maternal-infant dyad metadata and quantification of alterations at the gene and pathway levels provide the first detailed longitudinal picture of human breast milk cells across lactational time. This work paves the way for future investigations of how a potential division of cellular labor and differential hormone regulation might be leveraged therapeutically to support healthy lactation and potentially aid in milk production.

Single cell biology has the potential to elucidate many critical biological processes and diseases, from development and regeneration to cancer. Single cell analyses are uncovering the molecular diversity of cells, revealing a clearer picture of the variation among and between different cell types. New techniques are beginning to unravel how differences in cell state (transcriptional, epigenetic, and other characteristics) can lead to different cell fates among genetically identical cells, which underlies complex processes such as embryonic development, drug resistance, response to injury, and cellular reprogramming. Single cell technologies also pose significant challenges relating to processing and analyzing vast amounts of data collected. To realize the potential of single cell technologies, new computational approaches are needed. On March 17–19, 2021, experts in single cell biology met virtually for the Keystone eSymposium “Single Cell Biology” to discuss advances both in single cell applications and technologies.

Crohn’s disease is an inflammatory bowel disease (IBD) which most often presents with patchy lesions in the terminal ileum and colon and requires complex clinical care. Recent advances in the targeting of cytokines and leukocyte migration have greatly advanced treatment options, but most patients still relapse and inevitably progress. Although single-cell approaches are transforming our ability to understand the barrier tissue biology of inflammatory disease, comprehensive single-cell RNA-sequencing (scRNA-seq) atlases of IBD to date have largely sampled pre-treated patients with established disease. This has limited our understanding of which cell types, subsets, and states at diagnosis are predictive of disease severity and response to treatment. Here, through a combined clinical, flow cytometric, and scRNA-seq study, we profile diagnostic human biopsies from the terminal ileum of treatment-naive pediatric patients with Crohn’s disease (pediCD; n=14) and from non-inflamed pediatric controls with functional gastrointestinal disorders (FGID; n=13). To fully resolve and annotate epithelial, stromal, and immune cell states among the 201,883 single-cell transcriptomes, we develop and deploy a principled and unbiased tiered clustering approach, ARBOL, yielding 138 FGID and 305 pediCD end cell clusters. Notably, through both flow cytometry and scRNA-seq, we observe that at the level of broad cell types, treatment-naive pediCD is not readily distinguishable from FGID in cellular composition. However, by integrating high-resolution scRNA-seq analysis, we identify significant differences in cell states that arise during pediCD relative to FGID. 
Furthermore, by closely linking our scRNA-seq analysis with clinical meta-data, we resolve a vector of lymphoid, myeloid, and epithelial cell states in treatment-naive samples which can distinguish patients with less severe disease (those not on anti-TNF therapies (NOA)), from those with more severe disease at presentation who require anti-TNF therapies. Moreover, this vector was also able to distinguish those patients that achieve a full response (FR) to anti-TNF blockade from those more treatment-resistant patients who only achieve a partial response (PR). Our study jointly leverages a treatment-naive cohort, high-resolution principled scRNA-seq data analysis, and clinical outcomes to understand which baseline cell states may predict inflammatory disease trajectory.

SARS-CoV-2 infection can cause severe respiratory COVID-19. However, many individuals present with isolated upper respiratory symptoms, suggesting potential to constrain viral pathology to the nasopharynx. Which cells SARS-CoV-2 primarily targets and how infection influences the respiratory epithelium remain incompletely understood. We performed scRNA-seq on nasopharyngeal swabs from 58 healthy and COVID-19 participants. During COVID-19, we observe expansion of secretory cells, loss of ciliated cells, and epithelial cell repopulation via deuterosomal expansion. In mild/moderate COVID-19, epithelial cells express anti-viral/interferon-responsive genes, while cells in severe COVID-19 have muted anti-viral responses despite equivalent viral loads. SARS-CoV-2 RNA+ host-target cells are highly heterogeneous, including developing ciliated, interferon-responsive ciliated, AZGP1-high goblet, and KRT13+ “hillock”-like cells, and we identify genes associated with susceptibility, resistance, or infection response. Our study defines protective and detrimental responses to SARS-CoV-2, the direct viral targets of infection, and suggests that failed nasal epithelial anti-viral immunity may underlie and precede severe COVID-19.

COVID-19, caused by SARS-CoV-2, can result in acute respiratory distress syndrome and multiple-organ failure, but little is known about its pathophysiology. Here, we generated single-cell atlases of 23 lung, 16 kidney, 16 liver and 19 heart COVID-19 autopsy donor tissue samples, and spatial atlases of 14 lung donors. Integrated computational analysis uncovered substantial remodeling in the lung epithelial, immune and stromal compartments, with evidence of multiple paths of failed tissue regeneration, including defective alveolar type 2 differentiation and expansion of fibroblasts and putative TP63+ intrapulmonary basal-like progenitor cells. Viral RNAs were enriched in mononuclear phagocytic and endothelial lung cells which induced specific host programs. Spatial analysis in lung distinguished inflammatory host responses in lung regions with and without viral RNA. Analysis of the other tissue atlases showed transcriptional alterations in multiple cell types in COVID-19 donor heart tissue, and mapped cell types and genes implicated with disease severity based on COVID-19 GWAS. Our foundational dataset elucidates the biological impact of severe SARS-CoV-2 infection across the body, a key step towards new treatments.

Angiotensin-converting enzyme 2 (ACE2) and accessory proteases (TMPRSS2 and CTSL) are needed for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cellular entry, and their expression may shed light on viral tropism and impact across the body. We assessed the cell-type-specific expression of ACE2, TMPRSS2 and CTSL across 107 single-cell RNA-sequencing studies from different tissues. ACE2, TMPRSS2 and CTSL are coexpressed in specific subsets of respiratory epithelial cells in the nasal passages, airways and alveoli, and in cells from other organs associated with coronavirus disease 2019 (COVID-19) transmission or pathology. We performed a meta-analysis of 31 lung single-cell RNA-sequencing studies with 1,320,896 cells from 377 nasal, airway and lung parenchyma samples from 228 individuals. This revealed cell-type-specific associations of age, sex and smoking with expression levels of ACE2, TMPRSS2 and CTSL. Expression of entry factors increased with age and in males, including in airway secretory cells and alveolar type 2 cells. Expression programs shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues included genes that may mediate viral entry, key immune functions and epithelial–macrophage cross-talk, such as genes involved in the interleukin-6, interleukin-1, tumor necrosis factor and complement pathways. Cell-type-specific expression patterns may contribute to the pathogenesis of COVID-19, and our work highlights putative molecular pathways for therapeutic intervention.
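As a rough illustration of what "ACE2+TMPRSS2+ cells" means computationally, the sketch below counts, per cell type, the fraction of cells with at least one detected transcript of both genes. The detection threshold (count > 0), the toy cells, and the function name are assumptions made for this example. The study above used statistical modeling across many datasets, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict

# Hypothetical cells: (cell type, {gene: transcript count}); values are made up.
cells = [
    ("AT2", {"ACE2": 2, "TMPRSS2": 5}),
    ("AT2", {"ACE2": 0, "TMPRSS2": 3}),
    ("secretory", {"ACE2": 1, "TMPRSS2": 1, "CTSL": 4}),
    ("secretory", {"ACE2": 1}),
    ("T cell", {"CTSL": 1}),
    ("T cell", {}),
]


def coexpression_fraction(cells, genes=("ACE2", "TMPRSS2")):
    """Fraction of cells per cell type in which every listed gene is detected."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cell_type, counts in cells:
        totals[cell_type] += 1
        # Assumption: a gene counts as detected if its count is above zero.
        if all(counts.get(gene, 0) > 0 for gene in genes):
            positives[cell_type] += 1
    return {ct: positives[ct] / totals[ct] for ct in totals}


fractions = coexpression_fraction(cells)
for cell_type, fraction in sorted(fractions.items()):
    print(cell_type, fraction)
```

Because single-cell data are sparse, a zero count does not prove a gene is not expressed, so fractions computed this way are best read as lower bounds that depend on sequencing depth.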

The SARS-CoV-2 pandemic has caused over 1 million deaths globally, mostly due to acute lung injury and acute respiratory distress syndrome, or direct complications resulting in multiple-organ failures. Little is known about the host tissue immune and cellular responses associated with COVID-19 infection, symptoms, and lethality. To address this, we collected tissues from 11 organs during the clinical autopsy of 17 individuals who succumbed to COVID-19, resulting in a tissue bank of approximately 420 specimens. We generated comprehensive cellular maps capturing COVID-19 biology related to patients’ demise through single-cell and single-nucleus RNA-Seq of lung, kidney, liver and heart tissues, and further contextualized our findings through spatial RNA profiling of distinct lung regions. We developed a computational framework that incorporates removal of ambient RNA and automated cell type annotation to facilitate comparison with other healthy and diseased tissue atlases. In the lung, we uncovered significantly altered transcriptional programs within the epithelial, immune, and stromal compartments and cell intrinsic changes in multiple cell types relative to lung tissue from healthy controls. We observed evidence of: alveolar type 2 (AT2) differentiation replacing depleted alveolar type 1 (AT1) lung epithelial cells, as previously seen in fibrosis; a concomitant increase in myofibroblasts reflective of defective tissue repair; and, putative TP63+ intrapulmonary basal-like progenitor (IPBLP) cells, similar to cells identified in H1N1 influenza, that may serve as an emergency cellular reserve for severely damaged alveoli. Together, these findings suggest the activation and failure of multiple avenues for regeneration of the epithelium in these terminal lungs. SARS-CoV-2 RNA reads were enriched in lung mononuclear phagocytic cells and endothelial cells, and these cells expressed distinct host response transcriptional programs. 
We corroborated the compositional and transcriptional changes in lung tissue through spatial analysis of RNA profiles in situ and distinguished unique tissue host responses between regions with and without viral RNA, and in COVID-19 donor tissues relative to healthy lung. Finally, we analyzed genetic regions implicated in COVID-19 GWAS with transcriptomic data to implicate specific cell types and genes associated with disease severity. Overall, our COVID-19 cell atlas is a foundational dataset to better understand the biological impact of SARS-CoV-2 infection across the human body and empowers the identification of new therapeutic interventions and prevention strategies.

Organ infiltration by donor T cells is critical to the development of acute graft-versus-host disease (aGVHD) in recipients after allogeneic hematopoietic stem cell transplant (allo-HCT). However, deconvoluting the transcriptional programs of newly recruited donor T cells from those of tissue-resident T cells in aGVHD target organs remains a challenge. Here, we combined the serial intravascular staining technique with single-cell RNA sequencing to dissect the tightly connected processes by which donor T cells initially infiltrate tissues and then establish a pathogenic tissue residency program in a rhesus macaque allo-HCT model that develops aGVHD. Our results enabled creation of a spatiotemporal map of the transcriptional programs controlling donor CD8+ T cell infiltration into the primary aGVHD target organ, the gastrointestinal (GI) tract. We identified the large and small intestines as the only two sites demonstrating allo-specific, rather than lymphodepletion-driven, T cell infiltration. GI-infiltrating donor CD8+ T cells demonstrated a highly activated, cytotoxic phenotype while simultaneously developing a canonical tissue-resident memory T cell (TRM) transcriptional signature driven by interleukin-15 (IL-15)/IL-21 signaling. We found expression of a cluster of genes directly associated with tissue invasiveness, including those encoding adhesion molecules (ITGB2), specific chemokines (CCL3 and CCL4L1) and chemokine receptors (CD74), as well as multiple cytoskeletal proteins. This tissue invasion transcriptional signature was validated by its ability to discriminate the CD8+ T cell transcriptome of patients with GI aGVHD from those of GVHD-free patients. These results provide insights into the mechanisms controlling tissue occupancy of target organs by pathogenic donor CD8+ TRM cells during aGVHD in primate transplant recipients.

In late 2019 and through 2020, the COVID-19 pandemic swept the world, presenting both scientific and medical challenges associated with understanding and treating a previously unknown disease. To help address the need for greater understanding of COVID-19, the scientific community mobilized and banded together rapidly to characterize SARS-CoV-2 infection, pathogenesis and its distinct disease trajectories. The urgency of COVID-19 provided a pressing use-case for leveraging relatively new tools, technologies, and nascent collaborative networks. Single-cell biology is one such example that has emerged over the last decade as a powerful approach that provides unprecedented resolution to the cellular and molecular underpinnings of biological processes. Early foundational work within the single-cell community, including the Human Cell Atlas, utilized published and unpublished data to characterize the putative target cells of SARS-CoV-2 sampled from diverse organs based on expression of the viral receptor ACE2 and associated entry factors TMPRSS2 and CTSL (Muus et al., 2020; Sungnak et al., 2020; Ziegler et al., 2020). This initial characterization of reference data provided an important foundation for framing infection and pathology in the airway as well as other organs. However, initial community analysis was limited to samples derived from uninfected donors and other previously-sampled disease indications. This report provides an overview of a single-cell data resource derived from samples from COVID-19 patients along with initial observations and guidance on data reuse and exploration.

Recent political and social events, mainly those originating in the USA, have triggered an intense desire for equity in all facets of the human experience. More specifically, actions engendered by the Black Lives Matter movement and others have led to the scrutinizing of equity across a wide range of fields, from politics and business to academia and scientific research. In science, in particular, several major journals have published opinion pieces and editorials seeking greater equity or relating to the ‘non-white’ experience. Many of their readers have been stunned by the revelations. Indeed, the scientific community is only now coming to terms with an unsettling and uncomfortable truth: structural exclusion of non-white people permeates all levels of the scientific enterprise. That being said, with awareness comes opportunity. New frameworks for describing and addressing these issues have recently emerged, creating a structure with which groups can each consider how to best internalize and embody the lessons in their own scientific initiatives.

In the Human Cell Atlas (HCA) consortium, equity has been a point of emphasis from inception in 2016 for one simple reason: the HCA’s success depends upon it. Fundamentally, the HCA is meant to be a foundational resource, inclusive of the many cell types and states found in healthy people across the globe. That resource can then be used to address a wide range of scientific questions and, in the future, to facilitate a better understanding of disease. This mission demands, explicitly, the inclusion of representation along axes of sex, age, ethnicity, environment, socioeconomic status and, in some cases, disease susceptibility in its biospecimens. Moreover, it requires broad participation to ensure comprehensive coverage and identify barriers to success and support continuity, and necessitates reciprocal, balanced benefit from the methods, data and results to ensure global engagement.

To this end, the HCA has set ambitious and dynamic equity goals for itself. Below, we describe key lessons learned through equity activities thus far, as well as our future plans.

Granulomas are complex cellular structures comprised predominantly of macrophages and lymphocytes that function to contain and kill invading pathogens. Here, we investigated single cell phenotypes associated with antimicrobial responses in human leprosy granulomas by applying single cell and spatial sequencing to leprosy biopsy specimens. We focused on reversal reactions (RR), a dynamic process in which some patients with disseminated lepromatous leprosy (L-lep) transition towards self-limiting tuberculoid leprosy (T-lep), mounting effective antimicrobial responses. We identified a set of genes encoding proteins involved in antimicrobial responses that are differentially expressed in RR versus L-lep lesions, and regulated by IFN-γ and IL-1β. By integrating the spatial coordinates of the key cell types and antimicrobial gene expression in RR and T-lep lesions, we constructed a map revealing the organized architecture of granulomas depicting compositional and functional layers by which macrophages, T cells, keratinocytes and fibroblasts contribute to the antimicrobial response.

Ebola virus (EBOV) causes epidemics with high mortality yet remains understudied due to the challenge of experimentation in high-containment and outbreak settings. Here, we used single-cell transcriptomics and CyTOF-based single-cell protein quantification to characterize peripheral immune cells during EBOV infection in rhesus monkeys. We obtained 100,000 transcriptomes and 15,000,000 protein profiles, finding that immature, proliferative monocyte-lineage cells with reduced antigen-presentation capacity replace conventional monocyte subsets, while lymphocytes upregulate apoptosis genes and decline in abundance. By quantifying intracellular viral RNA, we identify molecular determinants of tropism among circulating immune cells and examine temporal dynamics in viral and host gene expression. Within infected cells, EBOV downregulates STAT1 mRNA and interferon signaling, and it upregulates putative pro-viral genes (e.g., DYNLL1 and HSPA5), nominating pathways the virus manipulates for its replication. This study sheds light on EBOV tropism, replication dynamics, and elicited immune response and provides a framework for characterizing host-virus interactions under maximum containment.

In humans and nonhuman primates, Mycobacterium tuberculosis lung infection yields a complex multicellular structure: the tuberculosis granuloma. All granulomas are not equivalent, however, even within the same host: in some, local immune activity promotes bacterial clearance, while in others, it allows persistence or outgrowth. Here, we used single-cell RNA-sequencing to define holistically cellular responses associated with control in cynomolgus macaques. Granulomas that facilitated bacterial killing contained significantly higher proportions of CD4+ and CD8+ T cells expressing hybrid Type1-Type17 immune responses or stem-like features and CD8-enriched T cells with specific cytotoxic functions; failure to control correlated with mast cell, plasma cell and fibroblast abundance. Co-registering these data with serial PET-CT imaging suggests that a degree of early immune control can be achieved through cytotoxic activity, but that more robust restriction only arises after the priming of specific adaptive immune responses, defining new targets for vaccination and treatment.

High-throughput single-cell RNA-sequencing (scRNA-seq) methodologies enable characterization of complex biological samples by increasing the number of cells that can be profiled contemporaneously. Nevertheless, these approaches recover less information per cell than low-throughput strategies. To accurately report the expression of key phenotypic features of cells, scRNA-seq platforms are needed that are both high fidelity and high throughput. To address this need, we created Seq-Well S3 (“Second-Strand Synthesis”), a massively parallel scRNA-seq protocol that uses a randomly primed second-strand synthesis to recover complementary DNA (cDNA) molecules that were successfully reverse transcribed but to which a second oligonucleotide handle, necessary for subsequent whole transcriptome amplification, was not appended due to inefficient template switching. Seq-Well S3 increased the efficiency of transcript capture and gene detection compared with that of previous iterations by up to 10- and 5-fold, respectively. We used Seq-Well S3 to chart the transcriptional landscape of five human inflammatory skin diseases, thus providing a resource for the further study of human skin inflammation.

Our nasal epithelial COVID-19 dataset, along with COVID-19 datasets from other genomics groups, can now be found at This work was sponsored by the Chan-Zuckerberg Initiative.

Bulk transcriptomic studies have defined classical and basal-like gene expression subtypes in pancreatic ductal adenocarcinoma (PDAC) that correlate with survival and response to chemotherapy; however, the underlying mechanisms that govern these subtypes and their heterogeneity remain elusive. Here, we performed single-cell RNA-sequencing of 23 metastatic PDAC needle biopsies and matched organoid models to understand how tumor cell-intrinsic features and extrinsic factors in the tumor microenvironment (TME) shape PDAC cancer cell phenotypes. We identify a novel cancer cell state that co-expresses basal-like and classical signatures, demonstrates upregulation of developmental and KRAS-driven gene expression programs, and represents a transitional intermediate between the basal-like and classical poles. Further, we observe structure to the metastatic TME supporting a model whereby reciprocal intercellular signaling shapes the local microenvironment and influences cancer cell transcriptional subtypes. In organoid culture, we find that transcriptional phenotypes are plastic and strongly skew toward the classical expression state, irrespective of genotype. Moreover, we show that patient-relevant transcriptional heterogeneity can be rescued by supplementing organoid media with factors found in the TME in a subtype-specific manner. Collectively, our study demonstrates that distinct microenvironmental signals are critical regulators of clinically relevant PDAC transcriptional states and their plasticity, identifies the necessity for considering the TME in cancer modeling efforts, and provides a generalizable approach for delineating the cell-intrinsic versus -extrinsic factors that govern tumor cell phenotypes.

There is pressing urgency to understand the pathogenesis of the severe acute respiratory syndrome coronavirus clade 2 (SARS-CoV-2) which causes the disease COVID-19. SARS-CoV-2 spike (S)-protein binds ACE2, and in concert with host proteases, principally TMPRSS2, promotes cellular entry. The cell subsets targeted by SARS-CoV-2 in host tissues, and the factors that regulate ACE2 expression, remain unknown. Here, we leverage human, non-human primate, and mouse single-cell RNA-sequencing (scRNA-seq) datasets across health and disease to uncover putative targets of SARS-CoV-2 amongst tissue-resident cell subsets. We identify ACE2 and TMPRSS2 co-expressing cells within lung type II pneumocytes, ileal absorptive enterocytes, and nasal goblet secretory cells. Strikingly, we discover that ACE2 is a human interferon-stimulated gene (ISG) in vitro using airway epithelial cells, and extend our findings to in vivo viral infections. Our data suggest that SARS-CoV-2 could exploit species-specific interferon-driven upregulation of ACE2, a tissue-protective mediator during lung injury, to enhance infection.

The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, creates an urgent need for identifying molecular mechanisms that mediate viral entry, propagation, and tissue pathology. Cell membrane bound angiotensin-converting enzyme 2 (ACE2) and associated proteases, transmembrane protease serine 2 (TMPRSS2) and Cathepsin L (CTSL), were previously identified as mediators of SARS-CoV-2 cellular entry. Here, we assess the cell type-specific RNA expression of ACE2, TMPRSS2, and CTSL through an integrated analysis of 107 single-cell and single-nucleus RNA-Seq studies, including 22 lung and airways datasets (16 unpublished), and 85 datasets from other diverse organs. Joint expression of ACE2 and the accessory proteases identifies specific subsets of respiratory epithelial cells as putative targets of viral infection in the nasal passages, airways, and alveoli. Cells that co-express ACE2 and proteases are also identified in cells from other organs, some of which have been associated with COVID-19 transmission or pathology, including gut enterocytes, corneal epithelial cells, cardiomyocytes, heart pericytes, olfactory sustentacular cells, and renal epithelial cells. Performing the first meta-analyses of scRNA-seq studies, we analyzed 1,176,683 cells from 282 nasal, airway, and lung parenchyma samples from 164 donors spanning fetal, childhood, adult, and elderly age groups, and associate increased levels of ACE2, TMPRSS2, and CTSL in specific cell types with increasing age, male gender, and smoking, all of which are epidemiologically linked to COVID-19 susceptibility and outcomes. Notably, there was a particularly low expression of ACE2 in the few young pediatric samples in the analysis. Further analysis reveals a gene expression program shared by ACE2+TMPRSS2+ cells in nasal, lung and gut tissues, including genes that may mediate viral entry, subtend key immune functions, and mediate epithelial-macrophage cross-talk. Amongst these are IL6, its receptor and co-receptor, IL1R, TNF response pathways, and complement genes. Cell type specificity in the lung and airways and smoking effects were conserved in mice. Our analyses suggest that differences in the cell type-specific expression of mediators of SARS-CoV-2 viral entry may be responsible for aspects of COVID-19 epidemiology and clinical course, and point to putative molecular pathways involved in disease susceptibility and pathogenesis.

Crucial transitions in cancer—including tumor initiation, local expansion, metastasis, and therapeutic resistance—involve complex interactions between cells within the dynamic tumor ecosystem. Transformative single-cell genomics technologies and spatial multiplex in situ methods now provide an opportunity to interrogate this complexity at unprecedented resolution. The Human Tumor Atlas Network (HTAN), part of the National Cancer Institute (NCI) Cancer Moonshot Initiative, will establish a clinical, experimental, computational, and organizational framework to generate informative and accessible three-dimensional atlases of cancer transitions for a diverse set of tumor types. This effort complements both ongoing efforts to map healthy organs and previous large-scale cancer genomics approaches focused on bulk sequencing at a single point in time. Generating single-cell, multiparametric, longitudinal atlases and integrating them with clinical outcomes should help identify novel predictive biomarkers and features as well as therapeutically relevant cell types, cell states, and cellular interactions across transitions. The resulting tumor atlases should have a profound impact on our understanding of cancer biology and have the potential to improve cancer detection, prevention, and therapeutic discovery for better precision-medicine treatments of cancer patients and those at risk for cancer.

There is pressing urgency to better understand the pathogenesis of the severe acute respiratory syndrome (SARS) coronavirus (CoV) clade SARS-CoV-2, which causes the disease known as COVID-19. SARS-CoV-2, like SARS-CoV, utilizes ACE2 to bind host cells. While initial SARS-CoV-2 cell entry and infection depend on ACE2 in concert with the protease TMPRSS2 for spike (S) protein activation, the specific cell subsets targeted by SARS-CoV-2 in host tissues, and the factors that regulate ACE2 expression, remain unknown. Here, we leverage human and non-human primate (NHP) single-cell RNA-sequencing (scRNA-seq) datasets to uncover the tissue-resident cell subsets that may serve as the cellular targets of SARS-CoV-2. We identify ACE2 and TMPRSS2 co-expressing cells within type II pneumocytes in NHP lung, absorptive enterocytes in human and NHP terminal ileum, and human nasal goblet secretory cells. Strikingly, we discover, and extensively corroborate using publicly available data sets, that ACE2 is an interferon-stimulated gene (ISG) in human epithelial cells. We further validate this finding in primary upper airway human respiratory epithelial cells. Thus, SARS-CoV-2 may exploit IFN-driven upregulation of ACE2, a key tissue-protective mediator during lung injury, to enhance infection.


In light of the global effort to better understand the new SARS-CoV-2 virus, we and other researchers from the HCA Lung Biological Network and beyond have begun an initiative to investigate datasets from relevant tissues profiled as part of other ongoing studies. These studies represent, for example, large efforts to characterize HIV, Mtb, and influenza infection and allergy in primary human and non-human primate samples. This page serves as a guide to viewing our data interactively on and downloading datasets from our single-cell portal, the Alexandria Project. Alternatively, bulk downloading of our data is available here. For more on research initiatives in COVID-19 being undertaken by the HCA, and the HCA Lung Biological Network in particular, please visit the HCA website here.

We investigated two genes whose protein products are central to the cellular entry of SARS-CoV-2: ACE2 and TMPRSS2. Consistent with previous studies, we found that the gene encoding ACE2, the SARS-CoV-2 entry receptor, is expressed on a subset of lung epithelial cells, type 2 pneumocytes, and a subset of ileal epithelial cells, absorptive enterocytes, across several datasets. The protease TMPRSS2 primes the spike protein of SARS-CoV-2 and is also important for viral entry. Because of this, we identified cells which co-express ACE2 and TMPRSS2 in our datasets, and investigated additional genes enriched within ACE2 and TMPRSS2 co-expressing cells. As we believe this data may prove useful to other researchers investigating similar questions, we have made our datasets public through the interactive Alexandria Project. Here, you can view our annotations of these datasets and investigate which other genes are highly expressed in these cell subsets of interest.

NB None of the datasets presented here were designed to answer specific questions about COVID-19. Additional studies will be required across larger, appropriately structured cohorts. Further, we provide a note of caution when interpreting scRNA-seq data for low abundance transcripts like ACE2 and TMPRSS2 as detection inefficiencies and/or sequencing depth may result in an underestimation of the actual frequencies of ACE2+ or ACE2+/TMPRSS2+ cells in tissues. Moreover, the protein levels of each may differ from their mRNA abundances. We present each data set separately, as each study differed by method of tissue processing and collection protocols, each of which can influence the frequency of recovered cell subsets.
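To make the caution about low-abundance transcripts concrete, the following minimal sketch shows how imperfect transcript capture can lower the apparent frequency of positive cells. It assumes that each transcript in a cell is captured independently with the same probability, which is a simplification. The function names, capture efficiencies, and transcript counts are hypothetical values chosen for illustration and are not measurements from any dataset described here.

```python
def detection_probability(n_transcripts: int, capture_efficiency: float) -> float:
    """Probability that at least one of n transcripts is captured.

    Assumes independent capture with equal probability per transcript
    (an illustrative simplification, not a property of any real platform).
    """
    if n_transcripts < 0:
        raise ValueError("n_transcripts must be non-negative")
    if not 0.0 <= capture_efficiency <= 1.0:
        raise ValueError("capture_efficiency must be between 0 and 1")
    return 1.0 - (1.0 - capture_efficiency) ** n_transcripts


def observed_positive_fraction(
    true_fraction: float, n_transcripts: int, capture_efficiency: float
) -> float:
    """Apparent fraction of positive cells after accounting for dropout."""
    return true_fraction * detection_probability(n_transcripts, capture_efficiency)


if __name__ == "__main__":
    # Hypothetical scenario: 20% of cells truly express the gene,
    # each with 5 transcripts, and 10% of transcripts are captured.
    apparent = observed_positive_fraction(0.20, 5, 0.10)
    print(f"Apparent positive fraction: {apparent:.3f}")
```

Under these assumed values, fewer than half of the truly expressing cells would register as positive, which is why the frequencies reported in the portal are best read as lower bounds.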


Our pre-print “SARS-CoV-2 receptor ACE2 is an interferon-stimulated gene in human airway epithelial cells and is enriched in specific cell subsets across tissues” can be found here, and the abstract is reproduced below.

There is pressing urgency to better understand the pathogenesis of the severe acute respiratory syndrome (SARS) coronavirus (CoV) clade SARS-CoV-2. SARS-CoV-2, like SARS-CoV, utilizes ACE2 to bind host cells. While initial SARS-CoV-2 cell entry and infection depend on ACE2 in concert with the protease TMPRSS2 for spike (S) protein activation, the specific cell subsets targeted by SARS-CoV-2 in host tissues, and the factors that regulate ACE2 expression, remain unknown. Here, we leverage human and non-human primate (NHP) single-cell RNA-sequencing (scRNA-seq) datasets to uncover the cell subsets that may serve as cellular targets of SARS-CoV-2. We identify ACE2/TMPRSS2 co-expressing cells within type II pneumocytes, absorptive enterocytes, and nasal goblet secretory cells. Strikingly, we discover that ACE2 is an interferon-stimulated gene (ISG) in human barrier tissue epithelial cells. Thus, SARS-CoV-2 may exploit IFN-driven upregulation of ACE2, a key tissue-protective mediator during lung injury, to enhance infection.


Atlas of ACE2 expression in healthy non-human primate lung and ileum

In this study, we collected cells from various tissues in healthy and SHIV-infected non-human primates using Seq-Well v1. Here we highlight the lung and ileum and show that ACE2 and TMPRSS2 are co-expressed most frequently in type II pneumocytes in the lung and absorptive enterocytes in the ileum.

To visualize these cells in the Alexandria Project, visit this study: Atlas of healthy non-human primate lung and ileum ACE2+ cells

This project contains two UMAP visualizations (one for lung cells and one for ileum cells): toggle between them by clicking on the ‘Explore’ tab, select ‘View Options’ in the top right hand corner, and switch between lung and ileum under the ‘Load Cluster’ dropdown.

For each of the lung and ileum cell UMAPs, we provide subsets of these visualizations which contain only the cell types enriched for double positive cells. By selecting the ‘Load cluster’ options called ‘Lung Epithelial Cells’ or ‘Ileum Absorptive Enterocytes’, you will be able to view gene expression differences between double positive cells and other cells within those cell types. In the lung epithelial cells visualization, the following additional annotations are available under the ‘Select Annotation’ dropdown:

  • ACE2+ – cells expressing ACE2
  • TMPRSS2+ – cells expressing TMPRSS2
  • ACE2_TMPRSS2_double_positive – cells expressing both genes
  • celltype_double_positive – the cell type column and double positive column combined so genes that are differentially expressed between only one cell type may be viewed
  • celltype_ACE2+ – same as above for ACE2 expression

After selecting one of these annotations, you can then search for a gene of interest in the ‘Search Genes’ box in the left corner to view the expression of that gene as a violin plot split by the annotation you select. For example, to reproduce Figure 1D in the manuscript, you would select annotation “ACE2_TMPRSS2_double_positive” and search for gene ‘IFNGR2’ in the ‘Search Genes’ box.

The same options are available for the ileal absorptive enterocytes visualization except that the cell type is combined with the double positive/ACE2 column since this subset only includes one cell type.
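For readers who download the expression matrices and want comparable labels, the annotations listed above can be sketched as follows. This is an illustration only: it assumes a cell counts as positive when it has at least one detected count for the gene, and the combined label strings and the `annotate_cell` helper are hypothetical, so they may not match the exact values stored in the portal.

```python
from typing import Dict


def annotate_cell(counts: Dict[str, int], cell_type: str) -> Dict[str, object]:
    """Derive ACE2/TMPRSS2 annotations for one cell.

    `counts` maps gene symbols to raw counts for this cell. A gene is called
    positive when its count is greater than zero (an assumed threshold).
    """
    ace2_positive = counts.get("ACE2", 0) > 0
    tmprss2_positive = counts.get("TMPRSS2", 0) > 0
    double_positive = ace2_positive and tmprss2_positive
    return {
        "ACE2+": ace2_positive,
        "TMPRSS2+": tmprss2_positive,
        "ACE2_TMPRSS2_double_positive": double_positive,
        # Combined labels let one cell type be split by expression status.
        "celltype_double_positive": (
            f"{cell_type}_{'double_positive' if double_positive else 'other'}"
        ),
        "celltype_ACE2+": f"{cell_type}_{'ACE2+' if ace2_positive else 'ACE2-'}",
    }


if __name__ == "__main__":
    example = annotate_cell({"ACE2": 2, "TMPRSS2": 1}, "Type II Pneumocyte")
    for name, value in example.items():
        print(name, "=", value)
```

Combining the cell type with the expression status in one label is what allows a violin plot to compare, for example, double positive and remaining cells within a single cell type.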

Full expression matrices (and therefore gene expression values visible when using the ‘Search Genes’ box) are available only for epithelial cell types as this data is derived from a pre-publication study.

Atlas of ACE2 and TMPRSS2 in human ileum

In this study, samples from human ileum were collected, processed, and run on 10x 3′ v2. They show ACE2 and TMPRSS2 co-expression in absorptive enterocytes.

To visualize these cells in the Alexandria Project, visit this study: ACE2 and TMPRSS2 most enriched within GSTA1+MGST3+ absorptive enterocytes in context of non-inflamed terminal ileum

Two tSNE visualizations are available for this study, one of all cells in the ileum samples “non-inflammed-tsne” and one of just the epithelial cells in the samples “non-inflammed-epth-tsne”. To toggle between these, select the “Explore” tab under the study title, then select ‘View options’ from the upper right corner of the plot and choose the visualization of interest in the ‘Load cluster’ dropdown.

Full expression matrices (and therefore gene expression values visible when using the ‘search genes’ box) are available only for the epithelial cell types as this data is derived from a pre-publication study.

Atlas of ACE2 and TMPRSS2 expression in human HIV- and TB-infected lung

In this study, samples from lung surgeries were run with Seq-Well S^3 and contain a variety of immune and epithelial cell types. We found the majority of ACE2 and TMPRSS2 double positive cells among type II pneumocytes.

To view these cells in the Alexandria Project, visit this study: Human lung HIV-TB co-infection ACE2+ cells

To visualize the double positive cells in these samples, click the ‘Explore’ tab, then select ‘View Options’ in the top right corner of the plot and choose ‘ACE2_TMPRSS2_double_positive’ under the ‘Select Annotation’ dropdown menu.

Full expression matrices (and therefore gene expression values visible when using the ‘search genes’ box) are available only for epithelial cell types as this data is derived from a pre-publication study.

A subset of ACE2+ secretory cells in human nasal mucosa

In this study, we find a subset of secretory cells which co-express ACE2 and TMPRSS2.

To visualize these cells on the Alexandria Project, visit this study: Allergic inflammatory memory in human respiratory epithelial progenitor cells

To view the tSNE of epithelial cells, select the ‘Explore’ tab, select ‘View Options’ in the right hand corner of the plot and choose ‘Epithelial cells’ in the ‘Load Cluster’ dropdown. To view the cell type annotations in the first panel of the above plot, choose ‘subset’ in the ‘Select Annotation’ dropdown menu. To view the ACE2/TMPRSS2 double positive cells, select ‘ACE2_TMPRSS2’ from the ‘Select Annotation’ dropdown menu, and to view the cluster subsets from the third panel of this plot, select ‘res_0_8’ from the ‘Select Annotation’ dropdown menu.

ACE2 and TMPRSS2 co-expressing cells found in non-human primate granulomas and adjacent uninvolved lung tissue

In this study, we collected lung tissue from non-human primates infected with Mtb. These tissues come from both Mtb granulomas and adjacent uninvolved lung in the same monkey.

These cells, profiled using Seq-Well S^3, can be investigated interactively in the Alexandria Project: Epithelial cells in NHP TB granuloma and uninvolved lung

To view the data colored by granuloma and uninvolved lung choose the “Granuloma” annotation accessible by clicking “View Options” in the top right-hand corner and selecting from the “Select Annotation” dropdown menu.

Full expression matrices (and therefore gene expression values visible when using the ‘Search Genes’ box) are available only for relevant cell types as this data is derived from a pre-publication study.

Comparison of ACE2 and TMPRSS2 expression in human duodenal and ileal tissue and organoid-derived epithelial cells

In this study, samples from adult human duodenum and ileum were collected and split for primary tissue single-cell RNA-seq and organoid culture under several conditions and profiled with Seq-Well S^3. Organoids were cultured and passaged every 6-8 days in Matrigel domes with established media conditions meant to recapitulate the broad diversity of in vivo epithelial cell types (Fujii, M., et al., Cell Stem Cell. 2018). Organoid culture media contained recombinant Noggin, Rspondin-3, FGF2, IGF1, afamin-Wnt3A, in addition to Gastrin and TGF-b inhibitor A83-01 with and without recombinant EGF (E/NR3+F2I1Gi+Af-W3+A83).  Cells co-expressing ACE2 and TMPRSS2 were identified principally within enterocyte clusters of both tissues and organoids.

Visualize these samples interactively and read more about this study on the Alexandria Project: Comparison of ACE2 and TMPRSS2 expression in human duodenal and ileal tissue and organoid-derived epithelial cells

This project contains two UMAP visualizations (one for organoid cells and one for primary tissue cells): toggle between them by clicking on the ‘Explore’ tab, select ‘View Options’ in the top right hand corner, and switch between tissue and organoid under the ‘Load Cluster’ dropdown.

To visualize cells which co-express ACE2 and TMPRSS2, select the ‘ACE2_TMPRSS2’ option under the ‘Select Annotation’ dropdown. You can then search for a gene of interest in the ‘Search Genes’ box in the left corner to view the expression of that gene as a violin plot split by the annotation you select.

Full expression matrices (and therefore gene expression values visible when using the ‘Search Genes’ box) are available only for enterocyte cell types as this data is derived from a pre-publication study. Expression of ACE2 and TMPRSS2 is available for all cells.

Interferon regulation of ACE2 in human and murine basal cells

Analysis of these datasets and others led to the hypothesis that expression of the ACE2 receptor may be upregulated by interferon. To further interrogate this hypothesis, we cultured basal cells from two primary human donors, one human basal cell line, and one mouse trachea and stimulated them with IL4, IL17a, IFNgamma, IFNalpha, or IFNbeta for 12 hours overnight. We performed bulk RNA sequencing and differential expression analysis to show a dose-dependent upregulation of canonical ISGs (interferon-stimulated genes). Specifically, we see that ACE2 is most significantly upregulated following IFNalpha stimulation in primary human basal cells; this upregulation is diminished in the BEAS-2B cell line and not seen in mouse cells.

To visualize these samples, the gene expression data may be viewed interactively in the Alexandria Project: Interferon regulation of ACE2 in human and murine basal cells

Under the ‘Explore’ tab, use the ‘Search genes’ field in the top left corner to visualize log-normalized gene expression. To visualize expression in each sample, select ‘View Options’ in the top left corner of the plot and choose the sample of interest under ‘Load cluster’. The ‘Stim_Dose’ annotation refers to the dose of each stimulation condition applied to that sample. 
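As a rough guide to what a log-normalized value means, the sketch below applies one common convention: counts are scaled to a fixed total per sample and then transformed with log(1 + x). The exact normalization used for this study is not stated on this page, so the scale factor of 10,000, the natural logarithm, and the `log_normalize` helper are assumptions made for illustration.

```python
import math
from typing import Dict


def log_normalize(counts: Dict[str, int], scale: float = 10_000.0) -> Dict[str, float]:
    """Scale raw counts to `scale` total counts per sample, then apply log(1 + x).

    The scale factor and natural log are assumed conventions; the portal's own
    processing may differ.
    """
    total = sum(counts.values())
    if total == 0:
        # A sample with no counts has zero expression for every gene.
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(count / total * scale) for gene, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical raw counts for a single sample.
    raw = {"ACE2": 1, "ACTB": 99}
    for gene, value in log_normalize(raw).items():
        print(f"{gene}: {value:.3f}")
```

Because the transform is logarithmic, equal steps on the plotted axis correspond to multiplicative, not additive, changes in normalized counts.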

Data for all samples in this study can be downloaded under the ‘Download’ tab.

Murine nasal mucosa after intranasal interferon exposure

We test the impact of IFNalpha stimulation in vivo by treating two mice intranasally with 200 ng of IFNalpha and two with saline. After 12 hours, the nasal mucosa of the respiratory and olfactory epithelia and underlying lamina propria were isolated and prepared for sequencing with Seq-Well S^3.

These cells may be viewed interactively in the Alexandria Project: Murine nasal mucosa after intranasal interferon exposure

Data for all samples in this study can be downloaded under the ‘Download’ tab.

Alexandria Project Details

Alexandria Documentation

Single Cell Portal Documentation

We thank the Broad Institute Single Cell Portal team for creating the platform that allows Alexandria to exist and for working tirelessly to help us share our datasets for others to access.

By affinity capture and amplification of TCR transcripts from whole-transcriptome libraries, TCR CDR3 sequences can be recovered from 3′-barcoded scRNA-seq libraries (e.g., Seq-Well, Drop-seq). This method can be applied post hoc, allowing for the capture of additional information from archived samples. The protocol can also be found here.

High-throughput 3′ single-cell RNA-sequencing (scRNA-seq) allows cost-effective, detailed characterization of individual immune cells from tissues. Current techniques, however, are limited in their ability to elucidate essential immune cell features, including variable sequences of T cell antigen receptors (TCRs) that confer antigen specificity. Here, we present a strategy that enables simultaneous analysis of TCR sequences and corresponding full transcriptomes from 3′-barcoded scRNA-seq samples. This approach is compatible with common 3′ scRNA-seq methods, and adaptable to processed samples post hoc. We applied the technique to identify transcriptional signatures associated with T cells sharing common TCRs from immunized mice and from patients with food allergy. We observed preferential phenotypes among subsets of expanded clonotypes, including type 2 helper CD4+ T cell (TH2) states associated with food allergy. These results demonstrate the utility of our method when studying diseases in which clonotype-driven responses are critical to understanding the underlying biology. The protocol can be found here.

Immune responses within barrier tissues are regulated, in part, by nociceptors, specialized peripheral sensory neurons that detect noxious stimuli. Previous work has shown that nociceptor ablation not only alters local responses to immune challenge at peripheral sites, but also within draining lymph nodes (LNs). The mechanisms and significance of nociceptor-dependent modulation of LN function are unknown. Indeed, although sympathetic innervation of LNs is well documented, it has been unclear whether the LN parenchyma itself is innervated by sensory neurons. Here, using a combination of high-resolution imaging, retrograde viral tracing, single-cell transcriptomics (scRNA-seq), and optogenetics, we identified and functionally tested a sensory neuro-immune circuit that is preferentially located in the outermost cortex of skin-draining LNs. Transcriptomic profiling revealed that there are at least four discrete subsets of sensory neurons that innervate LNs with a predominance of peptidergic nociceptors, and an innervation pattern that is distinct from that in the surrounding skin. To uncover potential LN-resident communication partners for LN-innervating sensory neurons, we employed scRNA-seq to generate a draft atlas of all murine LN cells and, based on receptor-ligand expression patterns, nominated candidate target populations among stromal and immune cells. Using selective optogenetic stimulation of LN-innervating sensory axons, we directly experimentally tested our inferred connections. Acute neuronal activation triggered rapid transcriptional changes preferentially within our top-ranked putative interacting partners, principally endothelium and other nodal stroma cells, as well as several innate leukocyte populations. Thus, LNs are monitored by a unique population of sensory neurons that possesses immunomodulatory potential.

Genome-wide association studies (GWAS) have identified genetic variants associated with age-related macular degeneration (AMD), one of the leading causes of blindness in the elderly. However, it has been challenging to identify the cell types associated with AMD given the genetic complexity of the disease. Here we perform massively parallel single-cell RNA sequencing (scRNA-seq) of human retinas using two independent platforms, and report the first single-cell transcriptomic atlas of the human retina. Using a multi-resolution network-based analysis, we identify all major retinal cell types, and their corresponding gene expression signatures. Heterogeneity is observed within macroglia, suggesting that human retinal glia are more diverse than previously thought. Finally, GWAS-based enrichment analysis identifies glia, vascular cells, and cone photoreceptors to be associated with the risk of AMD. These data provide a detailed analysis of the human retina, and show how scRNA-seq can provide insight into cell types involved in complex, inflammatory genetic diseases.

Genome-wide association studies (GWAS) have revealed risk alleles for ulcerative colitis (UC). To understand their cell type specificities and pathways of action, we generate an atlas of 366,650 cells from the colon mucosa of 18 UC patients and 12 healthy individuals, revealing 51 epithelial, stromal, and immune cell subsets, including BEST4+ enterocytes, microfold-like cells, and IL13RA2+IL11+ inflammatory fibroblasts, which we associate with resistance to anti-TNF treatment. Inflammatory fibroblasts, inflammatory monocytes, microfold-like cells, and T cells that co-express CD8 and IL-17 expand with disease, forming intercellular interaction hubs. Many UC risk genes are cell type specific and co-regulated within relatively few gene modules, suggesting convergence onto limited sets of cell types and pathways. Using this observation, we nominate and infer functions for specific risk genes across GWAS loci. Our work provides a framework for interrogating complex human diseases and mapping risk variants to cell types and pathways.

The liver can substantially regenerate after injury, with both main epithelial cell types, hepatocytes and biliary epithelial cells (BECs), playing important roles in parenchymal regeneration. Beyond metabolic functions, BECs exhibit substantial plasticity and in some contexts can drive hepatic repopulation. Here, we performed single-cell RNA sequencing to examine BEC and hepatocyte heterogeneity during homeostasis and after injury. Instead of evidence for a transcriptionally defined progenitor-like BEC, we found significant homeostatic BEC heterogeneity that reflects fluctuating activation of a YAP-dependent program. This transcriptional signature defines a dynamic cellular state during homeostasis and is highly responsive to injury. YAP signaling is induced by physiological bile acids (BAs), required for BEC survival in response to BA exposure, and is necessary for hepatocyte reprogramming into biliary progenitors upon injury. Together, these findings uncover molecular heterogeneity within the ductal epithelium and reveal YAP as a protective rheostat and regenerative regulator in the mammalian liver.

Genome-wide association studies (GWAS) have revealed risk alleles for ulcerative colitis (UC), but their cell type and pathway specificities are often unknown. Here, we generate an atlas of 115,517 cells from the colon mucosa of seven UC patients and ten healthy individuals, revealing 51 epithelial, stromal, and immune cell subsets, including a subset of BEST4+ enterocytes, which may sense and respond to pH, and IL13RA2+IL-11+ inflammatory fibroblasts, which we associate with resistance to anti-TNF therapy. Inflammatory fibroblasts, inflammatory monocytes, microfold-like cells, and CD8+IL-17+ T cells expand during disease, and form intercellular interaction hubs that mediate cross-talk between diverse cellular lineages. We identify hundreds of putative autocrine and paracrine cell-cell interactions that may explain the migration, expansion, or inhibition of cell types with disease. Surprisingly, UC risk genes are often cell type specific and co-regulated in relatively few gene modules, suggesting convergence onto limited sets of cell types and pathways. Using this observation, we nominate and infer putative functions for UC risk genes across all GWAS loci. Our atlas thus provides a framework for interrogating complex human diseases and mapping risk variants onto their cell types and pathways of activity.

Barrier tissue dysfunction is a fundamental feature of chronic human inflammatory diseases [1]. Specialized subsets of epithelial cells, including secretory and ciliated cells, differentiate from basal stem cells to collectively protect the upper airway [2,3,4]. Allergic inflammation can develop from persistent activation [5] of type 2 immunity [6] in the upper airway, resulting in chronic rhinosinusitis, which ranges in severity from rhinitis to severe nasal polyps [7]. Basal cell hyperplasia is a hallmark of severe disease [7,8,9], but it is not known how these progenitor cells [2,10,11] contribute to clinical presentation and barrier tissue dysfunction in humans. Here we profile primary human surgical chronic rhinosinusitis samples (18,036 cells, n = 12) that span the disease spectrum using Seq-Well for massively parallel single-cell RNA sequencing [12], report transcriptomes for human respiratory epithelial, immune and stromal cell types and subsets from a type 2 inflammatory disease, and map key mediators. By comparison with nasal scrapings (18,704 cells, n = 9), we define signatures of core, healthy, inflamed and polyp secretory cells. We reveal marked differences between the epithelial compartments of the non-polyp and polyp cellular ecosystems, identifying and validating a global reduction in cellular diversity of polyps characterized by basal cell hyperplasia, concomitant decreases in glandular cells, and phenotypic shifts in secretory cell antimicrobial expression. We detect an aberrant basal progenitor differentiation trajectory in polyps, and propose cell-intrinsic [13], epigenetic [14,15] and extrinsic factors [11,16,17] that lock polyp basal cells into this uncommitted state. Finally, we functionally demonstrate that ex vivo cultured basal cells retain intrinsic memory of IL-4/IL-13 exposure, and test the potential for clinical blockade of the IL-4 receptor α-subunit to modify basal and secretory cell states in vivo.
Overall, we find that reduced epithelial diversity stemming from functional shifts in basal cells is a key characteristic of type 2 immune-mediated barrier tissue dysfunction. Our results demonstrate that epithelial stem cells may contribute to the persistence of human disease by serving as repositories for allergic memories.
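To make "reduced epithelial diversity" concrete, the sketch below computes a Shannon diversity index over per-cell type labels. This is only one common way to quantify compositional diversity; the abstract does not say which index was used, so the choice of Shannon entropy, the function name, and the toy cell counts are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_diversity(labels):
    """Shannon index H = -sum(p_i * ln p_i) over cell-type proportions.

    `labels` is a sequence with one cell-type label per cell.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Toy compositions (hypothetical counts, not data from the study).
balanced = (["basal"] * 25 + ["secretory"] * 25
            + ["glandular"] * 25 + ["ciliated"] * 25)
skewed = (["basal"] * 70 + ["secretory"] * 20
          + ["glandular"] * 5 + ["ciliated"] * 5)

h_balanced = shannon_diversity(balanced)
h_skewed = shannon_diversity(skewed)
```

A sample dominated by one cell type, such as the basal-heavy toy composition here, scores lower than an evenly mixed one, which is the sense in which hyperplasia of a single population reduces diversity.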

The recent advent of methods for high-throughput single-cell molecular profiling has catalyzed a growing sense in the scientific community that the time is ripe to complete the 150-year-old effort to identify all cell types in the human body. The Human Cell Atlas Project is an international collaborative effort that aims to define all human cell types in terms of distinctive molecular profiles (such as gene expression profiles) and to connect this information with classical cellular descriptions (such as location and morphology). An open comprehensive reference map of the molecular state of cells in healthy human tissues would propel the systematic study of physiological states, developmental trajectories, regulatory circuitry and interactions of cells, and also provide a framework for understanding cellular dysregulation in human disease. Here we describe the idea, its potential utility, early proofs-of-concept, and some design considerations for the Human Cell Atlas, including a commitment to open data, code, and community.

Tissue barrier dysfunction is a poorly defined feature hypothesized to drive chronic human inflammatory disease. The epithelium of the upper respiratory tract represents one such barrier, responsible for separating inhaled agents, such as pathogens and allergens, from the underlying submucosa. Specialized epithelial subsets, including secretory, glandular, and ciliated cells, differentiate from basal progenitors to collectively realize this role. Allergic inflammation in the upper airway barrier can develop from persistent activation of Type 2 immunity (T2I), resulting in the disease spectrum known as chronic rhinosinusitis (CRS), ranging from rhinitis to severe nasal polyps. Whether recently identified epithelial progenitor subsets, and their differentiation trajectory, contribute to the clinical presentation and barrier dysfunction in T2I-mediated disease in humans remains unexplored. Profiling twelve primary human CRS samples spanning the range of clinical severity with the Seq-Well platform for massively parallel single-cell RNA-sequencing (scRNA-seq), we report the first single-cell transcriptomes for human respiratory epithelial cell subsets, immune cells, and parenchymal cells (18,036 total cells) from a T2I inflammatory disease, and map key mediators. We find striking differences between non-polyp and polyp tissues within the epithelial compartments of human T2I cellular ecosystems. More specifically, across 10,383 epithelial cells, we identify a global reduction in epithelial diversity in polyps characterized by basal cell hyperplasia, a concomitant decrease in glandular and ciliated cells, and phenotypic shifts in secretory cell function. We validate these findings through flow cytometry, histology, and bulk tissue RNA-seq of an independent cohort. Furthermore, we detect an aberrant basal progenitor differentiation trajectory in polyps, and uncover cell-intrinsic and extrinsic factors that may lock polyp basal cells into an uncommitted state.
Overall, our data define severe T2I barrier dysfunction as a reduction in epithelial diversity, characterized by profound functional shifts stemming from basal cell defects, and nominate a cellular mechanism for the persistence and chronicity of severe human respiratory disease.

Click here to read the pre-publication manuscript.

We develop single-cell transcriptomic approaches to comprehensively profile human tissues and model systems. Previously, we focused on establishing, validating, scaling, and simplifying single-cell RNA-seq, often through the development of microdevices, to enable genome-wide identification of the cell types/states contained within complex biological samples. More recently, we helped both enhance the detection of phenotype-defining transcripts using these methods and simplify their on-site processing for clinical applications. In parallel, we have also worked to democratize these techniques, providing open access to resources and protocols, training thousands locally and abroad, and establishing infrastructure and on-site collaborations spanning 6 continents and 26+ countries.

We broadly study how intra- and extracellular circuits collectively drive healthy and diseased tissue states. By leveraging the massive genomic datasets we and others have generated from complex tissues (like melanoma tumors, inflamed gut, and nasal polyps), we have begun to identify common and unique cell types/states and circuits associated with pathology that may be important for regulating biological function and stability. Our current findings suggest multiple overlaps among distinct diseases, pointing to the possibility of a finite set of evolved response strategies and thus common interventions based on adjusting specific cell states, cell frequencies, and/or cell-cell communication pathways.


We are exposed to a constant flux of external biochemical and physical stimuli as we age. Despite variability in our overall experiences and exact constitutions, our individual tissues typically manage to maintain functionality, though each can differ in its resilience to distinct stressors.

We have characterized how differences in cellular composition and communication impact tissue fitness and have identified responses and subsequent adaptations that drive chronic dysfunction. For example, although aberrant immune activity can precipitate allergic inflammatory diseases, therapies targeting immune cells and signaling are successful in only some patients, suggesting chronicity may involve alternative mechanisms. Previously, we helped demonstrate that dysregulated type-2 immune signaling, driven by environmental allergens, can impact tissue health in the upper airway by generating dysfunctional basal epithelial stem cells. These stem cells can then contribute to persistence by serving as repositories for allergic inflammatory memories, altering the integrity and functional output of the nasal epithelium. Our work, with that of others, suggests generalizable principles for cellular memory, and informs where and how tissues should be targeted to support health or restore function. We have since further investigated how tissue-resident cellular subsets participate in, and are shaped by, environmental exposures at barrier tissues and the functional consequences of these experiences.

We are now working to develop a more holistic appreciation for how different intra- and extracellular factors (e.g., genetics and integrated exposure history, respectively) influence barrier tissue function.