Documentation for Margo

Margo is a tool that generates a YAML cell type marker file, which maps cell types to gene markers, from CSV gene expression files.

Installation

pip install margo

Usage

margo <input_csv> <output_yaml> -t/--tissue <specified_tissues> -m/--min_marker_per_celltype <min_marker_per_celltype>

Notes:

  • <input_csv> is the path to the CSV file that contains the gene expression data. A full description of the input file is in the Input .csv File section.

  • <output_yaml> is the path to which the YAML file is written. A full description of the output file is in the Output .yml File section.

  • <specified_tissues> is one or more tissues within which cell markers are searched. Multiple tissues should be separated by commas, for example: -t Blood,Breast. If -t is omitted, or given as -t all, all tissues are searched. If a tissue name contains white space, wrap the whole argument in quotation marks to avoid an error (e.g. -t 'Large intestine', -t 'Splenic red pulp,Vocal fold').
    Here’s a list of available tissues:
    'Abdominal adipose tissue', 'Adipose tissue', 'Adrenal gland', 'Adventitia',
    'Airway epithelium', 'Alveolus', 'Amniotic fluid', 'Amniotic membrane',
    'Antecubital vein', 'Anterior cruciate ligament', 'Artery', 'Ascites', 'Bladder',
    'Blood', 'Blood vessel', 'Bone', 'Bone marrow', 'Brain', 'Breast',
    'Bronchoalveolar system', 'Brown adipose tissue', 'Cartilage', 'Chorionic villus',
    'Colon', 'Colorectum', 'Cornea', 'Corneal endothelium', 'Corneal epithelium',
    'Corpus luteum', 'Decidua', 'Deciduous tooth', 'Dental pulp', 'Dermis',
    'Dorsolateral prefrontal cortex', 'Duodenum', 'Embryo', 'Embryoid body',
    'Embryonic brain', 'Embryonic prefrontal cortex', 'Embryonic stem cell', 'Endometrium',
    'Endometrium stroma', 'Epithelium', 'Esophagus', 'Eye', 'Fat pad', 'Fetal brain',
    'Fetal gonad', 'Fetal kidney', 'Fetal liver', 'Fetal pancreas', 'Foreskin',
    'Gall bladder', 'Gastric corpus', 'Gastric epithelium', 'Gastric gland',
    'Gastrointestinal tract', 'Germ', 'Gingiva', 'Gonad', 'Gut', 'Hair follicle', 'Heart',
    'Hippocampus', 'Inferior colliculus', 'Intervertebral disc', 'Intestinal crypt',
    'Intestine', 'Jejunum', 'Kidney', 'Lacrimal gland', 'Large intestine',
    'Laryngeal squamous epithelium', 'Larynx', 'Ligament', 'Limbal epithelium', 'Liver',
    'Lung', 'Lymph', 'Lymph node', 'Lymphoid tissue', 'Mammary epithelium', 'Mammary gland',
    'Meniscus', 'Midbrain', 'Molar', 'Muscle', 'Myocardium', 'Myometrium', 'Nasal concha',
    'Nasal epithelium', 'Nerve', 'Nucleus pulposus', 'Optic nerve', 'Oral cavity',
    'Oral mucosa', 'Osteoarthritic cartilage', 'Ovarian cortex', 'Ovarian follicle', 'Ovary',
    'Oviduct', 'Pancreas', 'Pancreatic acinar tissue', 'Pancreatic islet', 'Parotid gland',
    'Periodontal ligament', 'Periosteum', 'Peripheral blood', 'Placenta', 'Plasma', 'Pleura',
    'Pluripotent stem cell', 'Premolar', 'Primitive streak', 'Prostate', 'Pyloric gland',
    'Rectum', 'Renal glomerulus', 'Retina', 'Retinal pigment epithelium', 'Salivary gland',
    'Scalp', 'Sclerocorneal tissue', 'Seminal plasma', 'Serum', 'Sinonasal mucosa',
    'Skeletal muscle', 'Skin', 'Small intestinal crypt', 'Small intestine', 'Spinal cord',
    'Spleen', 'Splenic red pulp', 'Sputum', 'Stomach', 'Subcutaneous adipose tissue',
    'Submandibular gland', 'Sympathetic ganglion', 'Synovial fluid', 'Synovium', 'Tendon',
    'Testis', 'Thymus', 'Thyroid', 'Tongue', 'Tonsil', 'Tooth', 'Umbilical cord',
    'Umbilical cord blood', 'Umbilical vein', 'Undefined', 'Urine', 'Uterus', 'Vagina',
    'Venous blood', 'Visceral adipose tissue', 'Vocal fold', 'Whartons jelly',
    'White adipose tissue'
    
  • <min_marker_per_celltype> is the minimum number of markers a cell type needs in order to be included in the output. For example, with -m 3, each cell type in the output has at least 3 markers. The default is 2.
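
The invocation described above can also be assembled from Python, for instance when Margo is called from a pipeline script. The sketch below is only an illustration: build_margo_command is a hypothetical helper, not part of Margo, and the file names are placeholders. It uses only the flags documented above (-t and -m) and joins multiple tissues with commas.

```python
import shlex


def build_margo_command(input_csv, output_yaml, tissues=None, min_markers=None):
    """Build the argument list for a margo call (hypothetical helper)."""
    cmd = ["margo", input_csv, output_yaml]
    if tissues:
        # Multiple tissues are passed as one comma-separated argument.
        cmd += ["-t", ",".join(tissues)]
    if min_markers is not None:
        cmd += ["-m", str(min_markers)]
    return cmd


# Placeholder file names; the tissue names come from the list above.
cmd = build_margo_command(
    "expression.csv",
    "markers.yml",
    tissues=["Large intestine", "Blood"],
    min_markers=3,
)

# Show the equivalent shell command with quoting applied where needed.
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) would run the command, assuming margo is installed and on the PATH. Because the arguments are passed as a list, tissue names that contain spaces need no extra quoting.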

Input .csv File

The input file <input_csv> should be a CSV file that contains single-cell gene expression data. It must include the feature names (gene markers) as column names in the first row.

Here’s an example:

"","EGFR","Ruthenium_1","Ruthenium_2","Ruthenium_3","Ruthenium_4","Ruthenium_5","Ruthenium_6","Ruthenium_7","E-Cadherin","DNA1","DNA2","Rabbit IgG H L","GATA3","Histone H3 antibody 1","Ki-67","SMA","Vimentin","cleaved PARP","Cleaved Caspase3","Her2","p53","pan Cytokeratin","Cytokeratin 19","Progesterone Receptor A/B antibody 1","Progesterone Receptor A/B antibody 2","c-Myc","Fibronectin","Cytokeratin 14","Slug","CD20","vWF","CD31","Histone H3 antibody 2","Cytokeratin 5","CD44","CD45","CD68","CD3","Carbonic Anhydrase IX","Cytokeratin 8/18","Cytokeratin 7","Twist","phospho Histone","phospho mTOR","phospho S6"
"BaselTMA_SP41_126_X14Y7_1",0.281752771745428,1.31958781810241,0.597380072662003,1.78286258012158,1.75782433204907,1.99185744340211,2.58056400999575,2.28716705396776,1.8143094838686,2.26163835871481,2.85745429700996,0.153254092409689,0.218156920379724,1.67452982248683,0.0415625002251949,0.0712772918353616,0.444139725057488,0.293131435903423,0.293131435903423,0.483006805598638,0,1.18751223171639,0.0799563776315769,0.122841765135889,0.122841765135889,0.243465420557963,1.03973438831891,0.134128155146678,0.246095547312871,0.207883573332857,0,0,0.49913252131538,0.178349656238537,3.80842587638293,0.0447332863514713,0.184805194048002,0,0.928929162714277,0.025525779641982,0.0434231703349215,0.209742440797169,0.137454250163813,0.572811188966434,0.215508336047807
"BaselTMA_SP41_126_X14Y7_2",0.303016424879695,1.31958781810241,0.597380072662003,1.78286258012158,1.75782433204907,1.99185744340211,2.58056400999575,2.28716705396776,1.51768483657457,1.613059980226,1.93977298946158,0,0.104099530009894,1.30414953455425,0.258647854185918,0.179905035860049,0.270070346973278,0.269995556739188,0.269995556739188,0.513386095964132,0.124381164441986,0.749379200845966,0.0669218226368246,0.189137064182174,0.189137064182174,0.147830889471134,1.14764402398214,0.0269507329965758,0.117598975755152,0.0215064833013768,0,0,0.182321559277481,0.0811467853470927,3.37104347012115,0.0468019851802111,0.0804057172161878,0.110805851822127,0.752101441431739,0,0.032055860595542,0.108013160836267,0.0484275934259086,0.539647309997281,0.655731021108987
"BaselTMA_SP41_126_X14Y7_3",0.252373591929453,1.31958781810241,0.597380072662003,1.78286258012158,1.75782433204907,1.99185744340211,2.58056400999575,2.28716705396776,1.24643337481527,2.13874433914606,2.75314619116723,0.0612817239818973,0.0994445250565181,1.45047307552165,0.081559900454294,0.166532782140787,0.233909456892897,0.253298162954306,0.253298162954306,0.633226423035353,0.0583059596412607,1.21652113650208,0.186294085811945,0.155385304368762,0.155385304368762,0.250264917703643,0.988906212339524,0.0235147624338673,0.108889136643846,0.00887829026542762,0.00700873857806561,0.00700873857806561,0.407286238353478,0.0761116154963944,3.28244078218969,0.0284985582815762,0.203248021274544,0.0206172021971442,0.740759201163638,0.0833109946128711,0.0815033267848188,0.119058055107812,0.0630969787835357,0.409734520196029,0.437844717513601
"BaselTMA_SP41_126_X14Y7_4",0.397732141326238,1.30685199496848,0.534495793659021,1.67821741785772,1.75782433204907,1.9614302314342,2.52855073877333,2.18381402772155,1.83978462756571,1.81601501739998,2.33715322835657,0.0368180548997214,0.035844882193481,1.18364710505427,0.831303329536159,0.830879359804224,0.542361835868107,0.348377503475095,0.348377503475095,0.709272254696804,0.0861824834861172,1.35430267477242,0.346272876410815,0.241098688399882,0.241098688399882,0.165295478983948,0.842710446351632,0.114420092168108,0.0863638967806044,0.0530270737099543,0.0665261604158528,0.0665261604158528,0.258153476642932,0.164059309696648,3.72178336085638,0.0690533090513159,0.305199716511606,0.0602644884847739,1.0959675377634,0.184603159618805,0.131531313002247,0.160778092929216,0.0906664483203359,0.305717938537705,0.132236468695897
"BaselTMA_SP41_126_X14Y7_5",0.426352363450394,1.17343905015891,0.597380072662003,1.58930314723782,1.38983901810637,1.78988748334799,2.34374306012192,2.12333449194922,1.61834734172774,1.35521448777051,1.81264185807688,0.038586450615388,0.0719628676148131,0.675414098389089,0.155552925832135,0.306354251115869,0.759944254743028,0.46805827165428,0.46805827165428,0.482230235295827,0,0.629398171772225,0.12440719285688,0.135211618881543,0.135211618881543,0.260330432633911,1.07335705762628,0.0553682124940572,0.0446588354307538,0.0191269324719215,0.0710114837824813,0.0710114837824813,0.219566808403905,0.0953228955472686,3.78193608766637,0.233776903387249,0.135083909465158,0.0571945265673294,1.42798304820572,0.0353712104042453,0.0384479631255821,0.0144342602153511,0.127032488088794,0.261205097984432,0.157785714279719
"BaselTMA_SP41_126_X14Y7_6",0.609903609649237,1.31958781810241,0.372367846281158,1.78286258012158,1.50101098171655,1.91774736855616,2.57742682498001,2.28716705396776,1.9784894649437,1.74194650291225,2.20566406217482,0,0.276471473109737,1.63945084696482,0.542475932177256,1.81695533696929,1.20776421472246,0.31280033985516,0.31280033985516,0.915340793386204,0.155110899096183,1.31132500439165,0.33857250307486,0.392151184354905,0.392151184354905,0.0432483518904006,1.30384683749297,0,0.321245292642739,0.122130917971635,0.153152366032903,0.153152366032903,0.46163540261567,0.238640796201489,4.25353731506461,0.292395183840898,0.410143431345782,0.0403477664871872,1.42987285773009,0.169140917827723,0.125414606855289,0,0.0732652610050874,0.467445374083094,0.143676239430917
"BaselTMA_SP41_126_X14Y7_7",0.378273994641866,1.02309345484598,0.508287981414855,1.6678490878493,1.57199137751667,1.79316832485582,2.45316771376931,2.16937958589658,1.95202912431456,2.26163835871481,2.84924235042934,0.0367073013762609,0.143904949371807,1.93552161238719,0.0974311915849248,0.646225993108243,0.666724442699186,0.499302204133687,0.499302204133687,0.538990574677631,0.123797597755547,1.19515337905834,0.145308818386877,0.358448007124586,0.358448007124586,0.258737238762547,1.47234737844586,0.135282533579763,0.073474673245632,0.0568982011248754,0.0604383190074023,0.0604383190074023,0.775791828442325,0.195056487433125,3.86719830765511,0.225182259834438,0.236481768486859,0.00653958867991183,1.15129950325041,0.0455341977110848,0.0974200795935418,0.172891676713793,0.109492404799103,0.425186253511379,0.132594197855554
"BaselTMA_SP41_126_X14Y7_8",0.433318499070447,1.02144772702908,0.379583030368888,1.48233434475344,1.46248840039406,1.69173666753727,2.28494401869665,2.00368533559941,2.16328230764477,1.78611018785754,2.33096353152789,0.0415644016526728,0.0661214541572935,1.413064810116,0.167915238452389,0.31823853161661,0.479027926368563,0.435822538708403,0.435822538708403,0.599664499074605,0.0588073675249883,1.23269089469998,0.230692324129598,0.37271412016763,0.37271412016763,0.254613984278178,1.13562080829842,0.0465206421031809,0.161365246121161,0.106829515082629,0.0942926547603922,0.0942926547603922,0.497869574102834,0.139924037168949,3.84194937328078,0.146820805647934,0.138061178204614,0.00548440684875311,1.11870351689516,0.145760753051907,0.135387292236118,0.0623356846001655,0.0805003264295715,0.441512101894891,0.165296316182786
"BaselTMA_SP41_126_X14Y7_9",0.554613318737411,1.20464396576545,0.577642773075184,1.67304306621983,1.56311983738021,1.72451674367243,2.46620106627875,2.20203835261983,2.14503544934655,1.5654611792378,1.98795262476879,0.107617950218012,0.112150935457405,1.37935198571939,0.294009768451207,0.946665099397284,1.02147974097619,0.421806244945565,0.421806244945565,0.792418752220183,0.106627993124653,1.63971011636084,0.301515601607164,0.380333597838085,0.380333597838085,0.501687379831502,1.09628041204639,0.0983596724637716,0.130034790876899,0.207883573332857,0.00761080638330121,0.00761080638330121,0.456610750890645,0.0896210293968119,4.04298577103731,0.0630212173416578,0.30240826572314,0.0373502981646327,0.826435124479329,0.121386468226999,0.227358881054528,0.152215241717528,0.126400073551901,0.420150652151168,0.156433841543841
"BaselTMA_SP41_126_X14Y7_10",0.50913395949692,1.22058622932062,0.527537389178274,1.71558506204996,1.57943590200115,1.88985474424116,2.42839274009776,2.21243523037744,2.37355197431657,2.03232893709605,2.63087273640171,0.0327224360920273,0.325118021388983,2.38462296011013,0.265314858445117,1.18812379053559,1.05776762379102,0.387794912779111,0.387794912779111,1.13720884838478,0.0328188609439452,1.90993683854487,0.274004114584755,0.406666784475912,0.406666784475912,0.407673921593465,1.06827540001609,0.0950650365792463,0.245997127041375,0.0785149333702716,0.047644453635302,0.047644453635302,0.848886628160954,0.253882640339735,3.7418323410385,0.0490141226369128,0.237072417186584,0.0390821080535325,0.6743577041355,0.276578525276985,0.435942418245782,0.128295309976712,0.109421290304796,0.600302307273537,0.0731060951400031
"BaselTMA_SP41_126_X14Y7_11",0.608830025730392,1.25152160427507,0.344457228255887,1.61965905012165,1.48286884452225,1.85917210483041,2.37363944950833,2.06779669468505,2.36021253092643,1.44066417134664,2.06677271166277,0.0404141929581712,0.12726467468555,1.54903783373294,0.0604751369610325,0.359222746493563,0.886523169811,0.485117986598223,0.485117986598223,0.947499053795506,0.0205949432778254,1.2777115832032,0.202603895993156,0.441209489241267,0.441209489241267,0.299101119377199,0.773251382580783,0.0803546502009576,0.182464201381651,0.0690267366753256,0.0996816852290423,0.0996816852290423,0.424364356848866,0.168878213446364,3.61808628369392,0.0279386882643336,0.210603545166566,0.0169132921706009,0.946982221000932,0.0742639503865018,0.295328260059555,0.114796803504975,0.166931321669432,0.570790542230526,0.157564936689404
"BaselTMA_SP41_126_X14Y7_12",0.443653863521087,1.11521645906745,0.445161935258024,1.5231109526391,1.40970203769659,1.67320377816335,2.26292386076838,2.03608392257045,2.02076231395761,1.62376201592666,2.17061498093648,0.043196895302235,0.12520524042104,1.51594569721507,0.125256550479163,0.441748454204352,0.647734958851728,0.399569200378657,0.399569200378657,0.653227727606058,0.0882794039711532,1.33235521133485,0.231975257499933,0.375581552251365,0.375581552251365,0.254370880024891,0.690528596276582,0.10419856924114,0.149967869948951,0.0378098305449082,0.0218063018879478,0.0218063018879478,0.583722859182437,0.0881895216705521,3.38219175382825,0.0428244977856538,0.13869240665371,0.0599346395170729,0.722333100485728,0.177645906369811,0.219370995303025,0.120532691355742,0.116622398400743,0.462027722684399,0.187038336411999
"BaselTMA_SP41_126_X14Y7_13",0.552635706478001,1.20432262239729,0.378310835130766,1.63510577444156,1.41690446008722,1.78127973683651,2.36680011720726,2.11435223899551,2.32381821311352,1.59479116588725,2.26517405121178,0.112480049481033,0.239245288121319,1.62108943421845,0.330806345240921,0.334645762029237,0.786412825935814,0.442779032506085,0.442779032506085,0.882964054846778,0,1.53173118681668,0.198599000667451,0.438773757840415,0.438773757840415,0.25995743013612,0.654536032837098,0.031200957917212,0.160227078670634,0.0447750862101473,0.0905552952119225,0.0905552952119225,0.45047031162161,0.226610389074651,3.62791246277059,0.166247674839988,0.146081438430856,0.107837712233924,0.893368117120901,0.176641581400065,0.223902521302041,0.128088530757384,0.212955779303418,0.456351211042372,0.295319378680642
"BaselTMA_SP41_126_X14Y7_14",0.574417195982534,1.12990422914934,0.541641878717257,1.53591767840073,1.50188838637846,1.80739010009332,2.3196965921288,2.09829587005942,2.14607730737287,2.1405145142947,2.70459065116314,0.0506444960015956,0.193189193398403,1.51787972746613,0.724698593287406,0.310107592706018,0.65930033597213,0.393258086767271,0.393258086767271,0.711733164298637,0.103964801006653,1.57208373745252,0.339913262103744,0.326502017534756,0.326502017534756,0.244289594026535,0.908101147756714,0.148476691716738,0.146458306754174,0.0109124331535944,0.0394267490474644,0.0394267490474644,0.670860722215978,0.141035262478867,3.71785832258826,0.15741410650475,0.111172598652131,0.0263364768806063,0.984096349800148,0.106150595113424,0.185679853415902,0.115548268432496,0.117830779333529,0.490467269038208,0.112207669469491
"BaselTMA_SP41_126_X14Y7_15",0.445141373351121,0.983549809032694,0.537630323548894,1.48867791273499,1.35947910529097,1.69340884048838,2.23931092928345,2.01271251277941,2.07140504464823,1.68988351768388,2.24375827665897,0.0239659793004077,0.0961201340299801,1.48551664789288,0.013582111402314,0.330363942281392,0.721158464025108,0.42904638938744,0.42904638938744,0.711109588015052,0.0853290689952641,1.44040195243329,0.356297107828498,0.287137656814369,0.287137656814369,0.229259377531225,0.401458022810847,0.14242762842043,0.114452093006097,0.0211902227314997,0.101271569675967,0.101271569675967,0.6780266285872,0.308925481282205,3.44972173256667,0.0791919849231345,0.159892290753897,0.0433953036656595,0.954703828926171,0.150253142016165,0.29956040263481,0.120342892891634,0.123618400948608,0.612554704071968,1.53077728953142
"BaselTMA_SP41_126_X14Y7_16",0.423224124113962,0.986487058046945,0.361349919667782,1.42214013904286,1.38399461532553,1.64234047781656,2.17922874397926,1.91237774683103,2.18052005538034,1.77461076402444,2.36074129925337,0.0584691410635083,0.154197683858691,1.67281129272234,0.296603607824735,0.69280541135784,0.626312549782295,0.473788819528183,0.473788819528183,0.767828846237009,0.0510328736045171,1.4454307208539,0.302253906556529,0.282563968677143,0.282563968677143,0.28150367519358,0.606046190982351,0.0842701746540736,0.215652452975277,0.0516370026272807,0.0324127454550036,0.0324127454550036,0.729162644025991,0.289874636366653,3.44824549456649,0.0606348083608117,0.150566193834433,0.0183875461655388,1.53535223790493,0.156964149606285,0.231969500388552,0.0925539915963191,0.08672909528526,0.420317833648897,0.221002661580778
"BaselTMA_SP41_126_X14Y7_17",0.498678062102147,1.17781379532244,0.522682949700635,1.51458531530078,1.40931511169072,1.74668912260261,2.30674389457336,1.9950499197128,2.20254759244282,1.93580948875749,2.53811296009035,0.0962792182483904,0.0873258979628482,1.5202803378887,0.507469063395971,0.483955276339326,0.740435581379996,0.486125527866747,0.486125527866747,0.984527287596058,0.0885008911311593,1.59905638571433,0.157846277987796,0.179280825975571,0.179280825975571,0.244728271936502,0.660520682149438,0.0590714514463765,0.288551575565284,0.032123799210221,0.131254415692856,0.131254415692856,0.50692263890735,0.225686237019723,3.48779953460387,0.05774164794599,0.115660518792748,0.0631778616343239,1.76746974870673,0.162186983688376,0.271038520848474,0.1180840476944,0.181912414195249,0.581563621271709,0.197118834439433
"BaselTMA_SP41_126_X14Y7_18",0.551295539365493,1.10544722602193,0.323332706594479,1.64705154912145,1.54944374772256,1.91626417786334,2.39241224670681,2.10341448766267,2.13209703295856,1.91448956634526,2.54847390364756,0.0153521513906716,0.154836931864209,1.47523230693975,0.365308909023129,0.293447296150629,0.533473450322357,0.509065035505689,0.509065035505689,0.770741695920629,0.112149112021286,1.15747577746381,0.179127668579003,0.197534320541308,0.197534320541308,0.453477219832709,1.08040945487045,0.240022210168695,0.297897890813239,0.197467929723259,0.121092621027361,0.121092621027361,0.622657670813786,0.223826120537171,3.71997508326293,0.0933103259939839,0.563758929281138,0.0383864502156451,1.52655587223294,0.036484511329554,0.124137599062604,0.138446692919138,0.156746296737235,0.535765892451459,0.666997378692888
"BaselTMA_SP41_126_X14Y7_19",0.566751369369273,1.04381736994478,0.597380072662003,1.69952350465586,1.61695685736026,1.77902779771975,2.370488975039,2.0766191568681,2.39643995938127,1.02094989843374,1.44499924627171,0.124155173315278,0.0827724742166557,1.01722024284261,0.335172377899056,0.261676850385538,0.909118854078818,0.563572034094173,0.563572034094173,0.987686382296719,0.0919551040440306,1.91586952770624,0.425899072028039,0.406057368348545,0.406057368348545,0.281018087465073,0.714751505032067,0.21776186811034,0.28163326936891,0.0938841368378904,0.0749973388020032,0.0749973388020032,0.390677068825128,0.295840321093668,4.08808532517434,0.0439164726716929,0.128377803572521,0.00257318735542014,2.02313136906579,0.31693994469329,0.299870074804783,0.145659902493801,0.0762762995139594,0.419048297486753,0.214751513476061
"BaselTMA_SP41_126_X14Y7_20",0.66216447205261,1.1427611101012,0.490070333357719,1.62368417169293,1.59785740644498,1.8401527504458,2.32389271944869,2.16085075584843,2.39643995938127,1.40970366801321,1.92997590130751,0.144374527614514,0.223263999536455,1.1968632350793,0.126631145552913,0.305905072125029,1.85360377737747,0.537817112447803,0.537817112447803,0.794852570414301,0.0916352917062696,1.52470417756647,0.307877295875169,0.420095498435575,0.420095498435575,0.413007222615918,1.43541192918613,0.107706760072633,0.197430435790434,0.0825285451229568,0.0671987386726368,0.0671987386726368,0.215556196177536,0.269852830634172,4.13241517362576,0.121888885437114,0.187165095535051,0.0133313540532762,1.90631147993183,0.135198919236669,0.413270320814482,0.0903610540119895,0.118018146697121,0.554547661556131,1.53077728953142
...

Alternatively, if expression data is unavailable, the user can supply a CSV file that contains only the first row with the gene names, like this:

"EGFR","E-Cadherin","Rabbit IgG H L","GATA3","Histone H3 antibody 1","Ki-67","SMA","Vimentin","cleaved PARP","Cleaved Caspase3","Her2","p53","pan Cytokeratin","Cytokeratin 19","Progesterone Receptor A/B antibody 1","Progesterone Receptor A/B antibody 2","c-Myc","Fibronectin","Cytokeratin 14","Slug","CD20","vWF","CD31","Histone H3 antibody 2","Cytokeratin 5","CD44","CD45","CD68","CD3","Carbonic Anhydrase IX","Cytokeratin 8/18","Cytokeratin 7","Twist","phospho Histone","phospho mTOR","phospho S6"

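A header-only file like the one above can be derived from an existing expression file. The following is a minimal sketch, not part of Margo: header_only_csv is a hypothetical helper, the sample data is made up, and dropping the empty leading column name is an assumption made to mirror the header-only example.

```python
import csv
import io


def header_only_csv(source, dest):
    """Copy only the header row from source to dest (hypothetical helper)."""
    header = next(csv.reader(source))
    # Assumption: the empty name of the leading cell ID column is not needed.
    names = [name for name in header if name != ""]
    csv.writer(dest, quoting=csv.QUOTE_ALL).writerow(names)
    return names


# In-memory stand-ins for real files, so the sketch runs on its own.
full_csv = io.StringIO('"","EGFR","CD3","CD45"\n"cell_1",0.28,0.0,0.18\n')
header_csv = io.StringIO()

names = header_only_csv(full_csv, header_csv)
print(header_csv.getvalue())
```

With real files, open the source with open(path, newline="") and the destination with open(path, "w", newline=""), as the csv module documentation recommends.
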
Output .yml File

The output YAML file <output_yaml> is a marker file that maps cell types to gene markers.

Here’s an example:

cell_type:
   Angiogenic T cell:
      - CD3
      - CD31
   Basal epithelial cell:
      - Vimentin
      - Cytokeratin 14
      - Cytokeratin 5
   CD1C-CD141- dendritic cell:
      - CD45
      - CD68
   Cancer cell:
      - CD44
      - Cytokeratin 8/18
      - Her2
      - CD45
      - CD20
   Cancer stem cell:
      - CD44
      - c-Myc
   Epithelial cell:
      - Cytokeratin 19
      - Cytokeratin 8/18
      - SMA
   Hematopoietic stem cell:
      - CD44
      - CD45
   Leukocyte:
      - CD3
      - CD45
      - CD20
   Luminal epithelial cell:
      - Cytokeratin 19
      - Cytokeratin 8/18
   Myoepithelial cell:
      - CD44
      - SMA
      - Cytokeratin 14

A YAML file can be read into a Python dictionary with a few lines of code (the yaml module is provided by the PyYAML package):

import yaml

with open("your_yaml_output.yml", "r") as stream:
   marker_dict = yaml.safe_load(stream)
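
Once loaded, the mapping can be inverted to look up which cell types list a given marker. The sketch below is illustrative only: invert_markers is a hypothetical helper, and the dictionary is a hand-copied excerpt of the example output above, so that the code runs without a file or PyYAML.

```python
from collections import defaultdict

# Excerpt of the example output above, in the shape yaml.safe_load returns.
marker_dict = {
    "cell_type": {
        "Angiogenic T cell": ["CD3", "CD31"],
        "Hematopoietic stem cell": ["CD44", "CD45"],
        "Leukocyte": ["CD3", "CD45", "CD20"],
    }
}


def invert_markers(cell_type_to_markers):
    """Map each gene marker to the cell types that list it (hypothetical helper)."""
    marker_to_cell_types = defaultdict(list)
    for cell_type, markers in cell_type_to_markers.items():
        for marker in markers:
            marker_to_cell_types[marker].append(cell_type)
    return dict(marker_to_cell_types)


by_marker = invert_markers(marker_dict["cell_type"])
print(by_marker["CD3"])
```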

Notes:

  • The dictionary contains a top-level key "cell_type". To work directly with the dictionary that maps cell types to gene markers, the user can do:

    marker_dict = marker_dict["cell_type"]
    
  • During the generation of the marker file, cell types with the same markers are collapsed into a single type.
    For example,
    Megakaryocyte erythroid cell:
       - CD3
       - CD45
       - CD31
       - CD68
    
    Myeloid cell:
       - CD3
       - CD45
       - CD31
       - CD68
    

    would be collapsed into

    Megakaryocyte erythroid cell/Myeloid cell:
       - CD3
       - CD45
       - CD31
       - CD68
    
  • Besides the input, the output also depends on the CellMarker Database and on the aliases used for the gene markers. If the expected result is not generated, try different aliases for the gene markers.
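
The collapsing of cell types described in the notes above can be illustrated with a short sketch. This is not Margo's actual implementation: collapse_identical_marker_sets is a hypothetical helper, and it assumes that "the same markers" means the same set of markers regardless of order, with the merged name formed by joining the original names with "/".

```python
def collapse_identical_marker_sets(cell_type_to_markers):
    """Merge cell types whose marker sets are identical (hypothetical helper)."""
    grouped = {}
    for cell_type, markers in cell_type_to_markers.items():
        key = frozenset(markers)  # assumption: marker order is ignored
        if key not in grouped:
            # Keep the marker order of the first cell type seen with this set.
            grouped[key] = ([], list(markers))
        grouped[key][0].append(cell_type)
    # Join the names of merged cell types with "/", as in the example above.
    return {"/".join(names): markers for names, markers in grouped.values()}


before = {
    "Megakaryocyte erythroid cell": ["CD3", "CD45", "CD31", "CD68"],
    "Myeloid cell": ["CD3", "CD45", "CD31", "CD68"],
    "Leukocyte": ["CD3", "CD45", "CD20"],
}

after = collapse_identical_marker_sets(before)
for name, markers in after.items():
    print(name, markers)
```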