[{"data":1,"prerenderedAt":761},["ShallowReactive",2],{"content-query-vSJM8wIVLS":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"heading":8,"prompt":10,"tags":14,"files":17,"nav":6,"presets":18,"gallery":37,"body":39,"_type":754,"_id":755,"_source":756,"_file":757,"_stem":758,"_extension":759,"sitemap":760},"/tools/clustergram","tools",false,"","Clustergram Generator","Create clustergrams online from Excel and CSV matrices. Combine heatmaps with hierarchical clustering to reveal structure with AI.",{"prefix":11,"label":12,"placeholder":13},"Create a clustergram","Describe the clustergram you want to create","e.g. cluster rows and columns by similarity, use a red-blue diverging color scale, label top genes",[15,16],"charts","science",true,[19,25,31],{"label":20,"prompt":21,"dataset_url":22,"dataset_title":23,"dataset_citation":24},"Energy mix by country","clustergram of electricity production share by source (coal, gas, nuclear, hydro, wind, solar) across countries; cluster both rows (countries) and columns (sources); use a sequential yellow-to-blue color scale","https://ourworldindata.org/grapher/share-of-electricity-production-by-source.csv","Share of electricity production by source","Our World in Data",{"label":26,"prompt":27,"dataset_url":28,"dataset_title":29,"dataset_citation":30},"Country development indicators","clustergram comparing countries across multiple development indicators for the most recent year; cluster countries by similarity; diverging color scale centered at the mean; annotate the top cluster with a label","https://api.worldbank.org/v2/en/indicator/NY.GDP.PCAP.CD?downloadformat=excel","GDP per capita and development indicators","World Bank",{"label":32,"prompt":33,"dataset_url":34,"dataset_title":35,"dataset_citation":36},"Gene expression heatmap","clustergram of gene expression z-scores; cluster genes (rows) only, keep sample columns in their original order; use a red-white-blue diverging color scale; 
add a color bar at the top showing sample groups","https://raw.githubusercontent.com/hbctraining/DGE_workshop/master/data/DESeq2_results.csv","Differential expression results (DESeq2)","Harvard Bioinformatics Core / GitHub",[38],"/img/tools/clustergram.png",{"type":40,"children":41,"toc":744},"root",[42,51,79,91,117,123,196,202,342,348,517,523,631,637,666,672,682,704,727],{"type":43,"tag":44,"props":45,"children":47},"element","h2",{"id":46},"what-is-a-clustergram",[48],{"type":49,"value":50},"text","What Is a Clustergram?",{"type":43,"tag":52,"props":53,"children":54},"p",{},[55,57,63,65,70,72,77],{"type":49,"value":56},"A ",{"type":43,"tag":58,"props":59,"children":60},"strong",{},[61],{"type":49,"value":62},"clustergram",{"type":49,"value":64}," (also called a clustered heatmap) combines two ideas: a ",{"type":43,"tag":58,"props":66,"children":67},{},[68],{"type":49,"value":69},"heatmap",{"type":49,"value":71}," that encodes values as colors, and ",{"type":43,"tag":58,"props":73,"children":74},{},[75],{"type":49,"value":76},"hierarchical clustering",{"type":49,"value":78}," that reorders the rows and columns so that similar ones appear adjacent. The result is a visual that simultaneously shows individual values, relative magnitudes, and group structure — all in one chart.",{"type":43,"tag":52,"props":80,"children":81},{},[82,84,89],{"type":49,"value":83},"The magic is in the ",{"type":43,"tag":58,"props":85,"children":86},{},[87],{"type":49,"value":88},"dendrogram",{"type":49,"value":90},": the tree diagram drawn along the left (for rows) and top (for columns). Each branch point represents a merge of two clusters, and the height of the branch indicates how different those clusters are. Long branches mean the clusters are very different from each other; short branches mean they are nearly identical. 
By reading the dendrogram alongside the heatmap, you can instantly see which genes (or countries, or products) behave similarly, and how many natural sub-groups exist.",{"type":43,"tag":52,"props":92,"children":93},{},[94,96,101,103,108,110,115],{"type":49,"value":95},"Clustergrams are the standard visualization in ",{"type":43,"tag":58,"props":97,"children":98},{},[99],{"type":49,"value":100},"genomics",{"type":49,"value":102}," (clustering genes and samples by expression profile), but they are equally useful in ",{"type":43,"tag":58,"props":104,"children":105},{},[106],{"type":49,"value":107},"economics",{"type":49,"value":109}," (clustering countries by development indicators), ",{"type":43,"tag":58,"props":111,"children":112},{},[113],{"type":49,"value":114},"market research",{"type":49,"value":116}," (clustering customers by purchasing behavior), and any domain where you have a matrix of measurements and want to find structure without pre-specifying the number of groups.",{"type":43,"tag":44,"props":118,"children":120},{"id":119},"how-it-works",[121],{"type":49,"value":122},"How It Works",{"type":43,"tag":124,"props":125,"children":126},"ol",{},[127,145,161],{"type":43,"tag":128,"props":129,"children":130},"li",{},[131,136,138,143],{"type":43,"tag":58,"props":132,"children":133},{},[134],{"type":49,"value":135},"Upload your data",{"type":49,"value":137}," — provide a CSV or Excel file in ",{"type":43,"tag":58,"props":139,"children":140},{},[141],{"type":49,"value":142},"wide format",{"type":49,"value":144},": rows are the entities to cluster (genes, countries, products), and columns are the variables (samples, indicators, time points). Values should be numeric.",{"type":43,"tag":128,"props":146,"children":147},{},[148,153,155],{"type":43,"tag":58,"props":149,"children":150},{},[151],{"type":49,"value":152},"Describe the clustering",{"type":49,"value":154}," — e.g. 
",{"type":43,"tag":156,"props":157,"children":158},"em",{},[159],{"type":49,"value":160},"\"cluster both rows and columns, diverging red-blue color scale, z-score normalize each row, label the top 20 rows\"",{"type":43,"tag":128,"props":162,"children":163},{},[164,169,171,178,180,186,188,194],{"type":43,"tag":58,"props":165,"children":166},{},[167],{"type":49,"value":168},"Get the visualization",{"type":49,"value":170}," — the AI writes Python code using ",{"type":43,"tag":172,"props":173,"children":175},"a",{"href":174},"https://seaborn.pydata.org/generated/seaborn.clustermap.html",[176],{"type":49,"value":177},"seaborn",{"type":49,"value":179}," or ",{"type":43,"tag":172,"props":181,"children":183},{"href":182},"https://docs.scipy.org/doc/scipy/reference/cluster.hierarchy.html",[184],{"type":49,"value":185},"scipy",{"type":49,"value":187}," + ",{"type":43,"tag":172,"props":189,"children":191},{"href":190},"https://plotly.com/python/dendrogram/",[192],{"type":49,"value":193},"Plotly",{"type":49,"value":195}," to cluster and render the heatmap with dendrograms",{"type":43,"tag":44,"props":197,"children":199},{"id":198},"interpreting-the-results",[200],{"type":49,"value":201},"Interpreting the Results",{"type":43,"tag":203,"props":204,"children":205},"table",{},[206,225],{"type":43,"tag":207,"props":208,"children":209},"thead",{},[210],{"type":43,"tag":211,"props":212,"children":213},"tr",{},[214,220],{"type":43,"tag":215,"props":216,"children":217},"th",{},[218],{"type":49,"value":219},"Visual element",{"type":43,"tag":215,"props":221,"children":222},{},[223],{"type":49,"value":224},"What it means",{"type":43,"tag":226,"props":227,"children":228},"tbody",{},[229,246,262,278,294,310,326],{"type":43,"tag":211,"props":230,"children":231},{},[232,241],{"type":43,"tag":233,"props":234,"children":235},"td",{},[236],{"type":43,"tag":58,"props":237,"children":238},{},[239],{"type":49,"value":240},"Cell 
color",{"type":43,"tag":233,"props":242,"children":243},{},[244],{"type":49,"value":245},"Value of that row–column combination (after any normalization)",{"type":43,"tag":211,"props":247,"children":248},{},[249,257],{"type":43,"tag":233,"props":250,"children":251},{},[252],{"type":43,"tag":58,"props":253,"children":254},{},[255],{"type":49,"value":256},"Row dendrogram (left)",{"type":43,"tag":233,"props":258,"children":259},{},[260],{"type":49,"value":261},"Hierarchical clustering of rows; rows joined by short branches are the most similar",{"type":43,"tag":211,"props":263,"children":264},{},[265,273],{"type":43,"tag":233,"props":266,"children":267},{},[268],{"type":43,"tag":58,"props":269,"children":270},{},[271],{"type":49,"value":272},"Column dendrogram (top)",{"type":43,"tag":233,"props":274,"children":275},{},[276],{"type":49,"value":277},"Hierarchical clustering of columns; columns joined by short branches are the most similar",{"type":43,"tag":211,"props":279,"children":280},{},[281,289],{"type":43,"tag":233,"props":282,"children":283},{},[284],{"type":43,"tag":58,"props":285,"children":286},{},[287],{"type":49,"value":288},"Long branch in dendrogram",{"type":43,"tag":233,"props":290,"children":291},{},[292],{"type":49,"value":293},"The two clusters being merged are very different",{"type":43,"tag":211,"props":295,"children":296},{},[297,305],{"type":43,"tag":233,"props":298,"children":299},{},[300],{"type":43,"tag":58,"props":301,"children":302},{},[303],{"type":49,"value":304},"Short branch in dendrogram",{"type":43,"tag":233,"props":306,"children":307},{},[308],{"type":49,"value":309},"The two clusters being merged are very similar",{"type":43,"tag":211,"props":311,"children":312},{},[313,321],{"type":43,"tag":233,"props":314,"children":315},{},[316],{"type":43,"tag":58,"props":317,"children":318},{},[319],{"type":49,"value":320},"Color band above heatmap",{"type":43,"tag":233,"props":322,"children":323},{},[324],{"type":49,"value":325},"Metadata annotation (e.g. 
sample group, condition, treatment)",{"type":43,"tag":211,"props":327,"children":328},{},[329,337],{"type":43,"tag":233,"props":330,"children":331},{},[332],{"type":43,"tag":58,"props":333,"children":334},{},[335],{"type":49,"value":336},"Block of uniform color",{"type":43,"tag":233,"props":338,"children":339},{},[340],{"type":49,"value":341},"A coherent cluster — all members behave similarly across conditions",{"type":43,"tag":44,"props":343,"children":345},{"id":344},"clustering-options",[346],{"type":49,"value":347},"Clustering Options",{"type":43,"tag":203,"props":349,"children":350},{},[351,367],{"type":43,"tag":207,"props":352,"children":353},{},[354],{"type":43,"tag":211,"props":355,"children":356},{},[357,362],{"type":43,"tag":215,"props":358,"children":359},{},[360],{"type":49,"value":361},"Option",{"type":43,"tag":215,"props":363,"children":364},{},[365],{"type":49,"value":366},"What to ask for",{"type":43,"tag":226,"props":368,"children":369},{},[370,389,408,429,461,492],{"type":43,"tag":211,"props":371,"children":372},{},[373,381],{"type":43,"tag":233,"props":374,"children":375},{},[376],{"type":43,"tag":58,"props":377,"children":378},{},[379],{"type":49,"value":380},"Cluster rows only",{"type":43,"tag":233,"props":382,"children":383},{},[384],{"type":43,"tag":156,"props":385,"children":386},{},[387],{"type":49,"value":388},"\"cluster rows, keep columns in original order\"",{"type":43,"tag":211,"props":390,"children":391},{},[392,400],{"type":43,"tag":233,"props":393,"children":394},{},[395],{"type":43,"tag":58,"props":396,"children":397},{},[398],{"type":49,"value":399},"Cluster columns only",{"type":43,"tag":233,"props":401,"children":402},{},[403],{"type":43,"tag":156,"props":404,"children":405},{},[406],{"type":49,"value":407},"\"cluster columns, keep row order 
fixed\"",{"type":43,"tag":211,"props":409,"children":410},{},[411,419],{"type":43,"tag":233,"props":412,"children":413},{},[414],{"type":43,"tag":58,"props":415,"children":416},{},[417],{"type":49,"value":418},"Cluster both",{"type":43,"tag":233,"props":420,"children":421},{},[422,427],{"type":43,"tag":156,"props":423,"children":424},{},[425],{"type":49,"value":426},"\"cluster both rows and columns\"",{"type":49,"value":428}," (default)",{"type":43,"tag":211,"props":430,"children":431},{},[432,440],{"type":43,"tag":233,"props":433,"children":434},{},[435],{"type":43,"tag":58,"props":436,"children":437},{},[438],{"type":49,"value":439},"Linkage method",{"type":43,"tag":233,"props":441,"children":442},{},[443,448,450,455,456],{"type":43,"tag":156,"props":444,"children":445},{},[446],{"type":49,"value":447},"\"use Ward linkage\"",{"type":49,"value":449}," / ",{"type":43,"tag":156,"props":451,"children":452},{},[453],{"type":49,"value":454},"\"complete linkage\"",{"type":49,"value":449},{"type":43,"tag":156,"props":457,"children":458},{},[459],{"type":49,"value":460},"\"average linkage\"",{"type":43,"tag":211,"props":462,"children":463},{},[464,472],{"type":43,"tag":233,"props":465,"children":466},{},[467],{"type":43,"tag":58,"props":468,"children":469},{},[470],{"type":49,"value":471},"Distance metric",{"type":43,"tag":233,"props":473,"children":474},{},[475,480,481,486,487],{"type":43,"tag":156,"props":476,"children":477},{},[478],{"type":49,"value":479},"\"use Euclidean distance\"",{"type":49,"value":449},{"type":43,"tag":156,"props":482,"children":483},{},[484],{"type":49,"value":485},"\"correlation distance\"",{"type":49,"value":449},{"type":43,"tag":156,"props":488,"children":489},{},[490],{"type":49,"value":491},"\"cosine 
distance\"",{"type":43,"tag":211,"props":493,"children":494},{},[495,503],{"type":43,"tag":233,"props":496,"children":497},{},[498],{"type":43,"tag":58,"props":499,"children":500},{},[501],{"type":49,"value":502},"Normalization",{"type":43,"tag":233,"props":504,"children":505},{},[506,511,512],{"type":43,"tag":156,"props":507,"children":508},{},[509],{"type":49,"value":510},"\"z-score normalize each row\"",{"type":49,"value":449},{"type":43,"tag":156,"props":513,"children":514},{},[515],{"type":49,"value":516},"\"min-max scale each column\"",{"type":43,"tag":44,"props":518,"children":520},{"id":519},"example-prompts",[521],{"type":49,"value":522},"Example Prompts",{"type":43,"tag":203,"props":524,"children":525},{},[526,542],{"type":43,"tag":207,"props":527,"children":528},{},[529],{"type":43,"tag":211,"props":530,"children":531},{},[532,537],{"type":43,"tag":215,"props":533,"children":534},{},[535],{"type":49,"value":536},"Scenario",{"type":43,"tag":215,"props":538,"children":539},{},[540],{"type":49,"value":541},"What to type",{"type":43,"tag":226,"props":543,"children":544},{},[545,563,580,597,614],{"type":43,"tag":211,"props":546,"children":547},{},[548,553],{"type":43,"tag":233,"props":549,"children":550},{},[551],{"type":49,"value":552},"Gene expression",{"type":43,"tag":233,"props":554,"children":555},{},[556],{"type":43,"tag":557,"props":558,"children":560},"code",{"className":559},[],[561],{"type":49,"value":562},"clustergram, z-score rows, cluster both axes, red-white-blue scale, label top 30 genes",{"type":43,"tag":211,"props":564,"children":565},{},[566,571],{"type":43,"tag":233,"props":567,"children":568},{},[569],{"type":49,"value":570},"Country comparison",{"type":43,"tag":233,"props":572,"children":573},{},[574],{"type":43,"tag":557,"props":575,"children":577},{"className":576},[],[578],{"type":49,"value":579},"cluster countries by similarity across all indicators, sequential color 
scale",{"type":43,"tag":211,"props":581,"children":582},{},[583,588],{"type":43,"tag":233,"props":584,"children":585},{},[586],{"type":49,"value":587},"Time series patterns",{"type":43,"tag":233,"props":589,"children":590},{},[591],{"type":43,"tag":557,"props":592,"children":594},{"className":593},[],[595],{"type":49,"value":596},"cluster products by monthly sales pattern, keep months in chronological order",{"type":43,"tag":211,"props":598,"children":599},{},[600,605],{"type":43,"tag":233,"props":601,"children":602},{},[603],{"type":49,"value":604},"Survey responses",{"type":43,"tag":233,"props":606,"children":607},{},[608],{"type":43,"tag":557,"props":609,"children":611},{"className":610},[],[612],{"type":49,"value":613},"clustergram of average rating by question and department, Ward linkage",{"type":43,"tag":211,"props":615,"children":616},{},[617,622],{"type":43,"tag":233,"props":618,"children":619},{},[620],{"type":49,"value":621},"Customer segments",{"type":43,"tag":233,"props":623,"children":624},{},[625],{"type":43,"tag":557,"props":626,"children":628},{"className":627},[],[629],{"type":49,"value":630},"cluster customers by purchase frequency across product categories",{"type":43,"tag":44,"props":632,"children":634},{"id":633},"related-tools",[635],{"type":49,"value":636},"Related Tools",{"type":43,"tag":52,"props":638,"children":639},{},[640,642,648,650,656,658,664],{"type":49,"value":641},"Use the ",{"type":43,"tag":172,"props":643,"children":645},{"href":644},"/tools/ai-heatmap",[646],{"type":49,"value":647},"AI Heatmap Generator",{"type":49,"value":649}," if you want a heatmap without clustering — preserving the original row and column order. Use the ",{"type":43,"tag":172,"props":651,"children":653},{"href":652},"/tools/volcano-plot",[654],{"type":49,"value":655},"Volcano Plot Generator",{"type":49,"value":657}," to identify which genes are significantly changed before clustering a focused subset. 
Use the ",{"type":43,"tag":172,"props":659,"children":661},{"href":660},"/tools/exploratory-data-analysis-ai",[662],{"type":49,"value":663},"Exploratory Data Analysis tool",{"type":49,"value":665}," to get a correlation heatmap and basic statistics before building a full clustergram.",{"type":43,"tag":44,"props":667,"children":669},{"id":668},"frequently-asked-questions",[670],{"type":49,"value":671},"Frequently Asked Questions",{"type":43,"tag":52,"props":673,"children":674},{},[675,680],{"type":43,"tag":58,"props":676,"children":677},{},[678],{"type":49,"value":679},"What is the difference between a clustergram and a regular heatmap?",{"type":49,"value":681},"\nA regular heatmap preserves your original row and column order. A clustergram uses hierarchical clustering to reorder the rows, the columns, or both, so that similar ones sit next to each other and blocks of correlated features become visible.",{"type":43,"tag":52,"props":683,"children":684},{},[685,690,692,696,698,702],{"type":43,"tag":58,"props":686,"children":687},{},[688],{"type":49,"value":689},"Should I normalize my data before clustering?",{"type":49,"value":691},"\nAlmost always, yes. If your columns are on very different scales (e.g. GDP in trillions vs. population in millions), the features with the largest values will dominate the distance calculation. 
Ask for ",{"type":43,"tag":156,"props":693,"children":694},{},[695],{"type":49,"value":510},{"type":49,"value":697}," (standard for gene expression) or ",{"type":43,"tag":156,"props":699,"children":700},{},[701],{"type":49,"value":516},{"type":49,"value":703}," (for mixed indicator data) to put everything on a comparable scale.",{"type":43,"tag":52,"props":705,"children":706},{},[707,712,714,719,720,725],{"type":43,"tag":58,"props":708,"children":709},{},[710],{"type":49,"value":711},"How do I choose the number of clusters?",{"type":49,"value":713},"\nYou do not have to specify a number up front — hierarchical clustering produces a full tree and you can \"cut\" it at any level. Ask the AI to ",{"type":43,"tag":156,"props":715,"children":716},{},[717],{"type":49,"value":718},"\"draw colored cluster boundaries at k=3 groups\"",{"type":49,"value":179},{"type":43,"tag":156,"props":721,"children":722},{},[723],{"type":49,"value":724},"\"color the dendrogram to show 4 clusters\"",{"type":49,"value":726}," after generating the initial chart.",{"type":43,"tag":52,"props":728,"children":729},{},[730,735,737,742],{"type":43,"tag":58,"props":731,"children":732},{},[733],{"type":49,"value":734},"My data has missing values — will it still work?",{"type":49,"value":736},"\nThe AI will handle missing values by imputing with the row or column mean before clustering, or by dropping rows/columns with too many missing values. 
Mention in your prompt if you have a preference: ",{"type":43,"tag":156,"props":738,"children":739},{},[740],{"type":49,"value":741},"\"drop genes with more than 20% missing samples\"",{"type":49,"value":743},".",{"title":7,"searchDepth":745,"depth":745,"links":746},2,[747,748,749,750,751,752,753],{"id":46,"depth":745,"text":50},{"id":119,"depth":745,"text":122},{"id":198,"depth":745,"text":201},{"id":344,"depth":745,"text":347},{"id":519,"depth":745,"text":522},{"id":633,"depth":745,"text":636},{"id":668,"depth":745,"text":671},"markdown","content:tools:013.clustergram.md","content","tools/013.clustergram.md","tools/013.clustergram","md",{"loc":4},1775502471196]