Tutorial 1: Finding Antibodies for Gating a Specific Cell Population
In this tutorial, we demonstrate how to use the ImmunoPheno library to identify commercial antibodies that can be used to gate a specific cell population. Specifically, we are going to perform a data-driven analysis to identify antibodies that can separate transitional stage B cells from other B cells. Transitional stage B cells are an intermediate stage population of cells that occur after the pro-B cell stage but before the mature B cell stage during B cell development. Due to their low abundance and transitory character, they are a relatively difficult population to isolate in cell sorting experiments.
Note that you may get slightly different results when running this tutorial, as the reference information in the ImmunoPhenoDB database is frequently updated.
[1]:
# Choose the appropriate plotly renderer for displaying plotly graphs on your system
import plotly.io as pio
pio.renderers.default = 'notebook_connected'
[23]:
import pandas as pd
import plotly.express as px
from immunopheno.connect import ImmunoPhenoDB_Connect
We first create an instance of ImmunoPhenoDB_Connect that will allow us to make queries to the ImmunoPheno database.
[3]:
cxn = ImmunoPhenoDB_Connect("http://www.immunopheno.org")
Loading necessary files...
Connecting to database...
Connected to database.
We can see a summary of what is currently in the database using the command db_stats():
[4]:
cxn.db_stats()
Database Statistics
===================
Number of experiments: 18
Number of tissues: 6
Number of cells: 115979
Number of antibodies: 646
Number of antibody targets: 294
Number of antibody clones: 390
Average number of experiments per antibody: 2.68
Let us now search for cell ontologies for which there is information in the database and whose label contains the term “B cell”:
[5]:
cxn.which_celltypes("B cell")
[5]:
| | idCL | label | idExperiment_used |
|---|---|---|---|
| 0 | CL:0000236 | B cell | 10 |
| 1 | CL:0000787 | memory B cell | 1,2,3,4,5,6,7,8,11,12,13,14,15,16 |
| 2 | CL:0000788 | naive B cell | 1,2,3,4,5,6,7,8,11,12,15,16,17,18 |
| 3 | CL:0000816 | immature B cell | 5,17,18 |
| 4 | CL:0000817 | precursor B cell | 10,12,18 |
| 5 | CL:0000818 | transitional stage B cell | 10,12 |
| 6 | CL:0000826 | pro-B cell | 12 |
| 7 | CL:0000955 | pre-B-II cell | 5 |
| 8 | CL:0000970 | unswitched memory B cell | 17,18 |
| 9 | CL:0000972 | class switched memory B cell | 17,18 |
| 10 | CL:2000006 | tonsil germinal center B cell | 4,5 |
We see that the OBO Foundry Cell Ontology ID for transitional stage B cells is CL:0000818 and there is currently information from 2 experiments in the ImmunoPheno database. We can find more information about these experiments using the command find_experiments(). For example,
[6]:
cxn.find_experiments(idCL=['CL:0000818'])
[6]:
| | idExperiment | nameExp | typeExp | pmid | doi | idBTO | tissue |
|---|---|---|---|---|---|---|---|
| 0 | 10 | An immunophenotype-coupled transcriptomic atla... | CITE | 38514887 | https://doi.org/10.1038/s41590-024-01782-4 | BTO:0000141 | bone marrow |
| 1 | 12 | Comprehensive Integration of Single-Cell Data | CITE | 31178118 | https://doi.org/10.1016/j.cell.2019.05.031 | BTO:0000141 | bone marrow |
As expected, all datasets containing information about transitional stage B cells are from the bone marrow.
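The experiment lists printed by which_celltypes() can also be compared programmatically. The following is a minimal sketch, not part of the ImmunoPheno API: it hard-codes three rows from the table above and assumes that idExperiment_used is a comma-separated string, as it is printed. The helper parse_ids is hypothetical.

```python
# Rows copied from the which_celltypes("B cell") output above.
# Assumption: idExperiment_used is a comma-separated string, as printed.
experiments_used = {
    "CL:0000818": "10,12",     # transitional stage B cell
    "CL:0000817": "10,12,18",  # precursor B cell
    "CL:0000826": "12",        # pro-B cell
}


def parse_ids(field):
    """Turn a string such as '10,12' into a set of integer experiment IDs."""
    return {int(token) for token in field.split(",") if token.strip()}


parsed = {cl_id: parse_ids(field) for cl_id, field in experiments_used.items()}

# Number of experiments annotated with transitional stage B cells.
n_transitional = len(parsed["CL:0000818"])

# Experiments in which all three populations are annotated.
shared = set.intersection(*parsed.values())

print(n_transitional)  # 2
print(sorted(shared))  # [12]
```

For these three rows, only experiment 12 contains annotations for all three populations.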
The cell ontology ID of B cells is CL:0000236. This ontology contains transitional stage B cells (CL:0000818) as a descendant ontology:
[7]:
cxn.plot_db_graph(root="CL:0000236")
In this graph, red nodes indicate cell ontologies with which data in the ImmunoPhenoDB database are explicitly annotated. Blue nodes indicate cell ontologies whose annotations are derived from those explicit annotations.
We now look for antibodies that distinguish transitional B cells from other B cells present in the bone marrow:
[8]:
result_df1, plot_dict1 = cxn.find_antibodies(id_CLs=["CL:0000818"], background_id_CLs=["CL:0000236"], idBTO=["BTO:0000141"])
result_df1
[8]:
| | target | coeff | stderr | p_val | q_val | CL:0000818 | CL:0000236 |
|---|---|---|---|---|---|---|---|
| AB_2750556 | CD38 | 7.109 | 0.977 | 0.000000e+00 | 0.000000e+00 | 1.000000 | 0.440799 |
| AB_2750381 | CD102 | 1.106 | 0.208 | 1.007000e-07 | 2.135600e-06 | 0.354232 | 0.241144 |
| AB_2734286 | CD10 | 0.790 | 0.139 | 1.410000e-08 | 4.967000e-07 | 0.952978 | 0.684974 |
| AB_2783249 | CD63 | 0.781 | 0.141 | 3.260000e-08 | 8.639000e-07 | 0.227273 | 0.089002 |
| AB_2734267 | CD45RA | 0.687 | 0.131 | 1.535000e-07 | 2.712700e-06 | 0.806886 | 0.915740 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| AB_2750357 | CD197 | -1.271 | 0.546 | 1.998851e-02 | 9.444172e-02 | 0.066667 | 0.103809 |
| AB_2749971 | CD11c | -1.280 | 0.703 | 6.844322e-02 | 2.418327e-01 | 0.000000 | 0.192551 |
| AB_2750000 | CD27 | -2.912 | 0.732 | 6.983760e-05 | 8.225319e-04 | 0.000000 | 0.677479 |
| AB_2749972 | CD34 | -3.383 | 0.800 | 2.359320e-05 | 3.126100e-04 | 0.066667 | 0.290628 |
| AB_2750347 | CD79b | -5.646 | 0.870 | 1.000000e-10 | 4.500000e-09 | 0.200000 | 0.704257 |
106 rows × 7 columns
[9]:
result_df1[result_df1["CL:0000818"]>0.8]
[9]:
| | target | coeff | stderr | p_val | q_val | CL:0000818 | CL:0000236 |
|---|---|---|---|---|---|---|---|
| AB_2750556 | CD38 | 7.109 | 0.977 | 0.000000e+00 | 0.000000e+00 | 1.000000 | 0.440799 |
| AB_2734286 | CD10 | 0.790 | 0.139 | 1.410000e-08 | 4.967000e-07 | 0.952978 | 0.684974 |
| AB_2734267 | CD45RA | 0.687 | 0.131 | 1.535000e-07 | 2.712700e-06 | 0.806886 | 0.915740 |
| AB_2832712 | HLA-DR DP DQ | 0.491 | 0.099 | 7.618000e-07 | 1.153550e-05 | 0.998433 | 0.958333 |
| AB_2734256 | CD19 | -0.622 | 0.282 | 2.749450e-02 | 1.040863e-01 | 1.000000 | 0.986301 |
| AB_2750001 | HLA-DR | -1.189 | 0.336 | 4.054752e-04 | 3.907306e-03 | 0.866667 | 0.622432 |
[10]:
plot_dict1["CL:0000818"]
find_antibodies() runs a linear mixed effects model to identify antibody levels that differ between the two populations. Positive (negative) coefficients indicate antibodies upregulated (downregulated) in the cell populations specified in id_CLs and their descendant cell ontologies, relative to the background populations specified in background_id_CLs. The optional idBTO=["BTO:0000141"] argument restricts the analysis to data from bone marrow, which corresponds to the BRENDA Tissue Ontology ID BTO:0000141 (as can be seen from the output of find_experiments() above). If a tissue or list of tissues is not specified, all tissues in the ImmunoPheno database are considered. We observe that, among the 106 antibodies that were tested, anti-CD38 AB_2750556 and anti-CD10 AB_2734286 are significantly upregulated in transitional stage B cells compared to the general B-cell population in the bone marrow, and are detected in >95% of this population.
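The selection described above can be expressed as a pandas filter. The sketch below rebuilds a few rows of the result table by hand so that it runs without a database connection; with a live connection one would filter result_df1 directly. The thresholds and the gap column are illustrative assumptions of ours, not ImmunoPheno defaults. The gap criterion is added because HLA-DR DP DQ (AB_2832712) is also significant and detected in >95% of transitional stage B cells, but it is detected almost as often in the background population.

```python
import pandas as pd

# Rows copied from the first find_antibodies() outputs shown above.
rows = {
    "AB_2750556": ("CD38", 7.109, 0.0, 1.000000, 0.440799),
    "AB_2750381": ("CD102", 1.106, 2.1356e-06, 0.354232, 0.241144),
    "AB_2734286": ("CD10", 0.790, 4.967e-07, 0.952978, 0.684974),
    "AB_2783249": ("CD63", 0.781, 8.639e-07, 0.227273, 0.089002),
    "AB_2734267": ("CD45RA", 0.687, 2.7127e-06, 0.806886, 0.915740),
    "AB_2832712": ("HLA-DR DP DQ", 0.491, 1.15355e-05, 0.998433, 0.958333),
    "AB_2734256": ("CD19", -0.622, 1.040863e-01, 1.000000, 0.986301),
    "AB_2750347": ("CD79b", -5.646, 4.5e-09, 0.200000, 0.704257),
}
cols = ["target", "coeff", "q_val", "CL:0000818", "CL:0000236"]
df = pd.DataFrame.from_dict(rows, orient="index", columns=cols)

# Illustrative thresholds (assumptions, not library defaults).
Q_MAX = 0.05         # significance cutoff on q_val
MIN_DETECTED = 0.95  # minimum detected fraction in the target population
MIN_GAP = 0.2        # minimum excess over the background detected fraction

# Difference in detected fraction between target and background populations.
df["gap"] = df["CL:0000818"] - df["CL:0000236"]

positive_markers = df[
    (df["coeff"] > 0)
    & (df["q_val"] < Q_MAX)
    & (df["CL:0000818"] > MIN_DETECTED)
    & (df["gap"] > MIN_GAP)
].sort_values("coeff", ascending=False)

print(positive_markers[["target", "coeff", "gap"]])
```

With these rows and thresholds, the filter keeps AB_2750556 (CD38) and AB_2734286 (CD10).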
We can now look at which cell populations in the bone marrow are positive for these two antibodies:
[11]:
result_dict, plot_dict_ct = cxn.find_celltypes(["AB_2750556", "AB_2734286"], idBTO=["BTO:0000141"])
plot_dict_ct["AB_2750556"]
[12]:
plot_dict_ct["AB_2734286"]
find_celltypes() uses a linear mixed effects model to identify cell populations in which a given antibody or set of antibodies is upregulated or downregulated in comparison to all the other cell populations. We can look at the results of the test in the table returned by find_celltypes():
[13]:
result_dict["AB_2750556"][result_dict["AB_2750556"]["expressed"]>0.8]
[13]:
| | cellType | coeff | stderr | p_val | q_val | expressed |
|---|---|---|---|---|---|---|
| CL:0000980 | plasmablast | 12.555 | 1.253 | 1.262152e-23 | 3.029164e-23 | 1.000000 |
| CL:0001054 | CD14-positive monocyte | 6.667 | 0.127 | 0.000000e+00 | 0.000000e+00 | 0.993635 |
| CL:0000818 | transitional stage B cell | 6.598 | 0.920 | 0.000000e+00 | 0.000000e+00 | 1.000000 |
| CL:0000817 | precursor B cell | 6.369 | 1.400 | 5.370900e-06 | 8.593500e-06 | 1.000000 |
| CL:0000557 | granulocyte monocyte progenitor cell | 5.488 | 0.346 | 1.111304e-56 | 5.334258e-56 | 1.000000 |
| CL:0000549 | basophilic erythroblast | 4.453 | 2.916 | 1.268099e-01 | 1.521719e-01 | 1.000000 |
| CL:0002032 | hematopoietic oligopotent progenitor cell | 4.384 | 1.909 | 2.166646e-02 | 3.058794e-02 | 1.000000 |
| CL:0000826 | pro-B cell | 4.141 | 1.909 | 3.011700e-02 | 4.015600e-02 | 0.857143 |
| CL:0000623 | natural killer cell | 3.553 | 0.251 | 2.337678e-45 | 8.014897e-45 | 0.824645 |
[14]:
result_dict["AB_2734286"][result_dict["AB_2734286"]["expressed"]>0.8]
[14]:
| | cellType | coeff | stderr | p_val | q_val | expressed |
|---|---|---|---|---|---|---|
| CL:0010001 | stromal cell of bone marrow | 9.564 | 0.122 | 0.0 | 0.0 | 0.968978 |
| CL:0000818 | transitional stage B cell | 8.165 | 0.115 | 0.0 | 0.0 | 0.952978 |
| CL:0000817 | precursor B cell | 7.513 | 0.106 | 0.0 | 0.0 | 0.886968 |
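To see which populations are positive for both antibodies, the two filtered tables can be intersected. The sketch below hard-codes the cell types with expressed > 0.8 from the two outputs above so that it runs offline; with a live connection, the same IDs would come from the index of each filtered result_dict entry. The variable names are ours.

```python
import pandas as pd

# Cell types with expressed > 0.8 for AB_2750556 (anti-CD38), copied from above.
cd38_pos = pd.Series({
    "CL:0000980": "plasmablast",
    "CL:0001054": "CD14-positive monocyte",
    "CL:0000818": "transitional stage B cell",
    "CL:0000817": "precursor B cell",
    "CL:0000557": "granulocyte monocyte progenitor cell",
    "CL:0000549": "basophilic erythroblast",
    "CL:0002032": "hematopoietic oligopotent progenitor cell",
    "CL:0000826": "pro-B cell",
    "CL:0000623": "natural killer cell",
})

# Cell types with expressed > 0.8 for AB_2734286 (anti-CD10), copied from above.
cd10_pos = pd.Series({
    "CL:0010001": "stromal cell of bone marrow",
    "CL:0000818": "transitional stage B cell",
    "CL:0000817": "precursor B cell",
})

# Keep the CD38-positive populations that also appear in the CD10-positive list.
double_positive = cd38_pos[cd38_pos.index.isin(cd10_pos.index)]

print(double_positive)
```

At this threshold, transitional stage B cells and precursor B cells are the only populations in both lists; pro-B cells appear only in the anti-CD38 list.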
We observe that, among B cell populations in the bone marrow, precursor B cells are also positive for both AB_2750556 and AB_2734286, and pro-B cells are positive for AB_2750556. We would therefore like to identify another antibody that we can combine with these antibodies to fully separate transitional B cells from other B cell populations in the bone marrow. We can achieve this by running find_antibodies() again, this time using pro-B cells and precursor B cells as background cell populations.
[15]:
result_df1, plot_dict1 = cxn.find_antibodies(id_CLs=["CL:0000818"], background_id_CLs=["CL:0000826", "CL:0000817"], idBTO=["BTO:0000141"])
result_df1
[15]:
| | target | coeff | stderr | p_val | q_val | CL:0000818 | CL:0000826 | CL:0000817 |
|---|---|---|---|---|---|---|---|---|
| AB_2734256 | CD19 | 3.496 | 0.817 | 1.881360e-05 | 1.552120e-04 | 1.000000 | 0.000000 | 1.000000 |
| AB_2750001 | HLA-DR | 3.292 | 1.164 | 4.679777e-03 | 2.316489e-02 | 0.866667 | 0.142857 | 0.769231 |
| AB_2734267 | CD45RA | 2.345 | 0.232 | 4.725649e-24 | 2.339196e-22 | 0.806886 | 0.428571 | 0.588235 |
| AB_2750381 | CD102 | 1.877 | 0.225 | 7.333399e-17 | 0.000000e+00 | 0.354232 | NaN | 0.162234 |
| AB_2750347 | CD79b | 1.554 | 1.015 | 1.257598e-01 | 2.895400e-01 | 0.200000 | 0.142857 | 0.000000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| AB_2800911 | CD305 | -0.940 | 0.230 | 4.528130e-05 | 3.448343e-04 | 0.683386 | NaN | 0.764628 |
| AB_2750000 | CD27 | -1.325 | 0.588 | 2.427334e-02 | 8.900226e-02 | 0.000000 | 0.142857 | 0.153846 |
| AB_2734247 | CD4 | -2.238 | 0.962 | 2.003938e-02 | 7.630377e-02 | 0.066667 | 0.428571 | 0.230769 |
| AB_2734366 | CD127 | -2.385 | 0.772 | 2.005962e-03 | 1.103279e-02 | 0.000000 | 0.714286 | 0.000000 |
| AB_2749972 | CD34 | -10.517 | 0.984 | 1.110806e-26 | 1.099698e-24 | 0.066667 | 0.857143 | 0.923077 |
99 rows × 8 columns
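This table can be scanned for a negative marker, that is, an antibody that is rarely detected in the target population but frequently detected in every background population. The following is a minimal sketch under the same assumptions as before: the rows are copied by hand from the output above, and the thresholds are illustrative rather than library defaults. Antibodies with a NaN detected fraction in a background population are excluded, because a comparison with NaN evaluates to False.

```python
import pandas as pd

NAN = float("nan")

# Rows copied from the second find_antibodies() output shown above.
rows = {
    "AB_2734256": ("CD19", 3.496, 1.552120e-04, 1.000000, 0.000000, 1.000000),
    "AB_2750381": ("CD102", 1.877, 0.0, 0.354232, NAN, 0.162234),
    "AB_2800911": ("CD305", -0.940, 3.448343e-04, 0.683386, NAN, 0.764628),
    "AB_2750000": ("CD27", -1.325, 8.900226e-02, 0.000000, 0.142857, 0.153846),
    "AB_2734247": ("CD4", -2.238, 7.630377e-02, 0.066667, 0.428571, 0.230769),
    "AB_2734366": ("CD127", -2.385, 1.103279e-02, 0.000000, 0.714286, 0.000000),
    "AB_2749972": ("CD34", -10.517, 1.099698e-24, 0.066667, 0.857143, 0.923077),
}
cols = ["target", "coeff", "q_val", "CL:0000818", "CL:0000826", "CL:0000817"]
df = pd.DataFrame.from_dict(rows, orient="index", columns=cols)

background = ["CL:0000826", "CL:0000817"]  # pro-B cells, precursor B cells

# Illustrative thresholds (assumptions, not library defaults).
negative_markers = df[
    (df["coeff"] < 0)
    & (df["q_val"] < 0.05)
    & (df["CL:0000818"] < 0.10)             # rare in the target population
    & (df[background] > 0.85).all(axis=1)   # common in every background; NaN fails
]

print(negative_markers[["target", "coeff"] + background])
```

With these rows, only AB_2749972 (CD34) passes all four conditions.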
From this analysis, we observe that the anti-CD34 antibody AB_2749972 is detected in >85% of precursor B cells and pro-B cells, but in only about 7% of transitional stage B cells. Plotting the distribution of normalized expression levels confirms this observation:
[16]:
result_dict, plot_dict_ct = cxn.find_celltypes(["AB_2749972"], idBTO=["BTO:0000141"])
plot_dict_ct["AB_2749972"]
We therefore conclude that a combination of the antibodies AB_2750556, AB_2734286, and AB_2749972 is an effective strategy to isolate transitional B cells in the bone marrow. Let us now look for some more information about these antibodies:
[17]:
cxn.which_antibodies("AB_2750556")
[17]:
| | idAntibody | abName | abTarget | clonality | citation | comments | cloneID | host | vendor | catalogNum | idExperiment_used |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AB_2750556 | TotalSeq(TM)-A0557 anti-mouse CD38 | CD38 | monoclonal | (BioLegend Cat# 102733, RRID:AB_2750556) | Applications: PG | 90 | rat | BioLegend | 102733 | 12 |
[18]:
cxn.which_antibodies("AB_2734286")
[18]:
| | idAntibody | abName | abTarget | clonality | citation | comments | cloneID | host | vendor | catalogNum | idExperiment_used |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AB_2734286 | TotalSeq(TM)-A0062 anti-human CD10 | CD10 | monoclonal | (BioLegend Cat# 312231, RRID:AB_2734286) | Applications: PG | HI10a | mouse | BioLegend | 312231 | 7,10 |
[19]:
cxn.which_antibodies("AB_2749972")
[19]:
| | idAntibody | abName | abTarget | clonality | citation | comments | cloneID | host | vendor | catalogNum | idExperiment_used |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AB_2749972 | TotalSeq(TM)-A0054 anti-human CD34 | CD34 | monoclonal | (BioLegend Cat# 343537, RRID:AB_2749972) | Applications: PG | 581 | mouse | BioLegend | 343537 | 7,12 |
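When assembling a panel from these records, it can be useful to pull the species reactivity out of the abName field. The sketch below is illustrative only: it assumes that abName follows the "anti-<species> <target>" pattern seen in the three outputs above, which may not hold for every antibody in the database.

```python
import re

# (idAntibody, abName) pairs copied from the which_antibodies() outputs above.
panel = {
    "AB_2750556": "TotalSeq(TM)-A0557 anti-mouse CD38",
    "AB_2734286": "TotalSeq(TM)-A0062 anti-human CD10",
    "AB_2749972": "TotalSeq(TM)-A0054 anti-human CD34",
}

# Assumption: names contain "anti-<species> <target>" as in the rows above.
pattern = re.compile(r"anti-(?P<species>\w+) (?P<target>.+)$")

reactivity = {}
for ab_id, name in panel.items():
    match = pattern.search(name)
    # Store None when the name does not follow the assumed pattern.
    reactivity[ab_id] = match.group("species") if match else None

for ab_id, species in reactivity.items():
    print(ab_id, species)
```

In the outputs shown here, the abName of AB_2750556 reads anti-mouse while the other two read anti-human, so it may be worth confirming species reactivity against the vendor datasheet before ordering.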
Finally, we can find in which experiments of the ImmunoPhenoDB database these antibodies have been used:
[20]:
cxn.find_experiments(ab=["AB_2750556"])
[20]:
| | idExperiment | nameExp | typeExp | pmid | doi | idBTO | tissue |
|---|---|---|---|---|---|---|---|
| 0 | 12 | Comprehensive Integration of Single-Cell Data | CITE | 31178118 | https://doi.org/10.1016/j.cell.2019.05.031 | BTO:0000141 | bone marrow |
[21]:
cxn.find_experiments(ab=["AB_2734286"])
[21]:
| | idExperiment | nameExp | typeExp | pmid | doi | idBTO | tissue |
|---|---|---|---|---|---|---|---|
| 0 | 7 | PBMC from influenza vaccination | CITE | 32094927 | https://doi.org/10.1038/s41591-020-0769-8 | BTO:0001025 | peripheral blood mononuclear cell |
| 1 | 10 | An immunophenotype-coupled transcriptomic atla... | CITE | 38514887 | https://doi.org/10.1038/s41590-024-01782-4 | BTO:0000141 | bone marrow |
[22]:
cxn.find_experiments(ab=["AB_2749972"])
[22]:
| | idExperiment | nameExp | typeExp | pmid | doi | idBTO | tissue |
|---|---|---|---|---|---|---|---|
| 0 | 7 | PBMC from influenza vaccination | CITE | 32094927 | https://doi.org/10.1038/s41591-020-0769-8 | BTO:0001025 | peripheral blood mononuclear cell |
| 1 | 12 | Comprehensive Integration of Single-Cell Data | CITE | 31178118 | https://doi.org/10.1016/j.cell.2019.05.031 | BTO:0000141 | bone marrow |
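The three find_experiments() outputs can be compared to check which antibodies were measured together. The following is a minimal sketch using plain Python sets, with the experiment IDs and tissue annotations copied from the tables above; the variable names are ours.

```python
from itertools import combinations

# Experiment IDs per antibody, copied from the find_experiments() outputs above.
experiments = {
    "AB_2750556": {12},     # anti-CD38
    "AB_2734286": {7, 10},  # anti-CD10
    "AB_2749972": {7, 12},  # anti-CD34
}

# Experiments annotated with BTO:0000141 (bone marrow) in the outputs above.
bone_marrow = {10, 12}

# Bone marrow experiments shared by each pair of antibodies.
shared_bm = {
    (a, b): experiments[a] & experiments[b] & bone_marrow
    for a, b in combinations(experiments, 2)
}
for pair, ids in shared_bm.items():
    print(pair, sorted(ids))

# Experiments, in any tissue, that measured all three antibodies.
all_three = set.intersection(*experiments.values())
print(sorted(all_three))  # []
```

In the outputs shown here, only experiment 12 measured two of the three antibodies (anti-CD38 and anti-CD34) together in bone marrow, and no experiment lists all three, so the proposed panel combines evidence from different experiments.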