[{"data":1,"prerenderedAt":619},["ShallowReactive",2],{"content-query-tEmKowKCS9":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"heading":10,"prompt":11,"tags":15,"files":18,"nav":6,"presets":19,"gallery":37,"body":39,"_type":612,"_id":613,"_source":614,"_file":615,"_stem":616,"_extension":617,"sitemap":618},"/tools/pair-plot","tools",false,"","Pair Plot Generator for Excel & CSV","Create pair plots online from Excel and CSV data. Explore pairwise relationships, distributions, and grouped patterns with AI.","Pair Plot Generator",{"prefix":12,"label":13,"placeholder":14},"Create a pair plot","Describe the pair plot you want to create","e.g. pair plot of GDP, life expectancy, CO2, and education index colored by income group",[16,17],"charts","statistics",true,[20,26,31],{"label":21,"prompt":22,"dataset_url":23,"dataset_title":24,"dataset_citation":25},"Country indicators by income group","pair plot of GDP per capita, life expectancy, CO2 emissions per capita, and education index; color points by World Bank income group; use log scale on GDP and CO2 axes; add regression lines to scatter panels","https://ourworldindata.org/grapher/life-expectancy-vs-gdp-per-capita.csv","Life expectancy vs. 
GDP per capita","Our World in Data",{"label":27,"prompt":28,"dataset_url":29,"dataset_title":30,"dataset_citation":25},"Health metrics by region","pair plot of life expectancy, fertility rate, child mortality, and adult obesity; color by world region; KDE plots on diagonal; add Pearson correlation coefficients to each scatter panel","https://ourworldindata.org/grapher/life-expectancy.csv","Life expectancy",{"label":32,"prompt":33,"dataset_url":34,"dataset_title":35,"dataset_citation":36},"Economic indicators by continent","pair plot of GDP per capita, trade openness, government expenditure, and inflation; color points by continent; log scale on GDP; show KDE on diagonal and scatter with trend lines off-diagonal","https://api.worldbank.org/v2/en/indicator/NY.GDP.PCAP.CD?downloadformat=excel","GDP per capita (current US$)","World Bank",[38],"/img/tools/pair-plot.png",{"type":40,"children":41,"toc":603},"root",[42,51,79,91,103,109,167,173,329,335,443,449,478,484,501,539,562,593],{"type":43,"tag":44,"props":45,"children":47},"element","h2",{"id":46},"what-is-a-pair-plot",[48],{"type":49,"value":50},"text","What Is a Pair Plot?",{"type":43,"tag":52,"props":53,"children":54},"p",{},[55,57,63,65,70,72,77],{"type":49,"value":56},"A ",{"type":43,"tag":58,"props":59,"children":60},"strong",{},[61],{"type":49,"value":62},"pair plot",{"type":49,"value":64}," (also called a ",{"type":43,"tag":58,"props":66,"children":67},{},[68],{"type":49,"value":69},"scatter plot matrix",{"type":49,"value":71}," or SPLOM) is a grid of plots that displays the ",{"type":43,"tag":58,"props":73,"children":74},{},[75],{"type":49,"value":76},"pairwise relationship between every combination of numeric variables",{"type":49,"value":78}," in a dataset simultaneously. For a dataset with N numeric columns, the grid is N×N: off-diagonal cells show scatter plots of one variable against another, while diagonal cells show each variable's own distribution (as a histogram or KDE curve). 
The result is a single compact view that lets you scan all pairwise correlations, distributions, and potential outliers at once.",{"type":43,"tag":52,"props":80,"children":81},{},[82,84,89],{"type":49,"value":83},"The pair plot's power lies in ",{"type":43,"tag":58,"props":85,"children":86},{},[87],{"type":49,"value":88},"pattern recognition at a glance",{"type":49,"value":90},". When points in an off-diagonal scatter panel fall along a diagonal line, the two variables are linearly correlated. When they form a shapeless cloud or a flat horizontal or vertical band, there is little or no linear relationship. When they form a curve, the relationship is non-linear. Adding a categorical color variable (e.g. species, income group, treatment arm) makes clusters and group-level differences visible across all panels at once, something that would otherwise require a separate chart for every pair of variables.",{"type":43,"tag":52,"props":92,"children":93},{},[94,96,101],{"type":49,"value":95},"Pair plots are a standard first step in ",{"type":43,"tag":58,"props":97,"children":98},{},[99],{"type":49,"value":100},"multivariate exploratory data analysis",{"type":49,"value":102}," in fields ranging from ecology (examining how plant traits co-vary across species) to economics (scanning macro indicators for correlated country clusters) to machine learning (identifying which feature pairs are most informative before model training). 
The classic example is the Iris dataset — its pair plot immediately reveals that setosa separates cleanly from the other two species on petal dimensions while versicolor and virginica overlap, which directly informs classifier design.",{"type":43,"tag":44,"props":104,"children":106},{"id":105},"how-it-works",[107],{"type":49,"value":108},"How It Works",{"type":43,"tag":110,"props":111,"children":112},"ol",{},[113,124,140],{"type":43,"tag":114,"props":115,"children":116},"li",{},[117,122],{"type":43,"tag":58,"props":118,"children":119},{},[120],{"type":49,"value":121},"Upload your data",{"type":49,"value":123}," — provide a CSV or Excel file with at least three numeric columns. An optional categorical column (species, group, country type) can be used to color the points. One row per observation.",{"type":43,"tag":114,"props":125,"children":126},{},[127,132,134],{"type":43,"tag":58,"props":128,"children":129},{},[130],{"type":49,"value":131},"Describe the plot",{"type":49,"value":133}," — e.g. 
",{"type":43,"tag":135,"props":136,"children":137},"em",{},[138],{"type":49,"value":139},"\"pair plot of temperature, humidity, wind speed, and pressure colored by weather type, KDE on diagonal\"",{"type":43,"tag":114,"props":141,"children":142},{},[143,148,150,157,159,165],{"type":43,"tag":58,"props":144,"children":145},{},[146],{"type":49,"value":147},"Get the visualization",{"type":49,"value":149}," — the AI writes Python code using ",{"type":43,"tag":151,"props":152,"children":154},"a",{"href":153},"https://seaborn.pydata.org/generated/seaborn.pairplot.html",[155],{"type":49,"value":156},"seaborn",{"type":49,"value":158}," and ",{"type":43,"tag":151,"props":160,"children":162},{"href":161},"https://matplotlib.org/",[163],{"type":49,"value":164},"matplotlib",{"type":49,"value":166}," to build the full scatter matrix with styled axes",{"type":43,"tag":44,"props":168,"children":170},{"id":169},"interpreting-the-results",[171],{"type":49,"value":172},"Interpreting the Results",{"type":43,"tag":174,"props":175,"children":176},"table",{},[177,196],{"type":43,"tag":178,"props":179,"children":180},"thead",{},[181],{"type":43,"tag":182,"props":183,"children":184},"tr",{},[185,191],{"type":43,"tag":186,"props":187,"children":188},"th",{},[189],{"type":49,"value":190},"Visual element",{"type":43,"tag":186,"props":192,"children":193},{},[194],{"type":49,"value":195},"What it means",{"type":43,"tag":197,"props":198,"children":199},"tbody",{},[200,217,233,249,265,281,297,313],{"type":43,"tag":182,"props":201,"children":202},{},[203,212],{"type":43,"tag":204,"props":205,"children":206},"td",{},[207],{"type":43,"tag":58,"props":208,"children":209},{},[210],{"type":49,"value":211},"Diagonal panel",{"type":43,"tag":204,"props":213,"children":214},{},[215],{"type":49,"value":216},"Distribution of that variable alone — histogram or 
KDE",{"type":43,"tag":182,"props":218,"children":219},{},[220,228],{"type":43,"tag":204,"props":221,"children":222},{},[223],{"type":43,"tag":58,"props":224,"children":225},{},[226],{"type":49,"value":227},"Off-diagonal scatter",{"type":43,"tag":204,"props":229,"children":230},{},[231],{"type":49,"value":232},"Relationship between the row variable and the column variable",{"type":43,"tag":182,"props":234,"children":235},{},[236,244],{"type":43,"tag":204,"props":237,"children":238},{},[239],{"type":43,"tag":58,"props":240,"children":241},{},[242],{"type":49,"value":243},"Tight diagonal cluster",{"type":43,"tag":204,"props":245,"children":246},{},[247],{"type":49,"value":248},"Strong linear correlation between the two variables (positive if the points rise from left to right, negative if they fall)",{"type":43,"tag":182,"props":250,"children":251},{},[252,260],{"type":43,"tag":204,"props":253,"children":254},{},[255],{"type":43,"tag":58,"props":256,"children":257},{},[258],{"type":49,"value":259},"Diffuse cloud",{"type":43,"tag":204,"props":261,"children":262},{},[263],{"type":49,"value":264},"Weak or no linear correlation",{"type":43,"tag":182,"props":266,"children":267},{},[268,276],{"type":43,"tag":204,"props":269,"children":270},{},[271],{"type":43,"tag":58,"props":272,"children":273},{},[274],{"type":49,"value":275},"Curved scatter pattern",{"type":43,"tag":204,"props":277,"children":278},{},[279],{"type":49,"value":280},"Non-linear relationship; consider a log transform",{"type":43,"tag":182,"props":282,"children":283},{},[284,292],{"type":43,"tag":204,"props":285,"children":286},{},[287],{"type":43,"tag":58,"props":288,"children":289},{},[290],{"type":49,"value":291},"Separated color clusters",{"type":43,"tag":204,"props":293,"children":294},{},[295],{"type":49,"value":296},"The groups differ clearly on those two 
variables",{"type":43,"tag":182,"props":298,"children":299},{},[300,308],{"type":43,"tag":204,"props":301,"children":302},{},[303],{"type":43,"tag":58,"props":304,"children":305},{},[306],{"type":49,"value":307},"Overlapping color clusters",{"type":43,"tag":204,"props":309,"children":310},{},[311],{"type":49,"value":312},"The groups cannot be told apart on those two variables",{"type":43,"tag":182,"props":314,"children":315},{},[316,324],{"type":43,"tag":204,"props":317,"children":318},{},[319],{"type":43,"tag":58,"props":320,"children":321},{},[322],{"type":49,"value":323},"Isolated point far from cluster",{"type":43,"tag":204,"props":325,"children":326},{},[327],{"type":49,"value":328},"Potential outlier — unusual combination of the two variables",{"type":43,"tag":44,"props":330,"children":332},{"id":331},"example-prompts",[333],{"type":49,"value":334},"Example Prompts",{"type":43,"tag":174,"props":336,"children":337},{},[338,354],{"type":43,"tag":178,"props":339,"children":340},{},[341],{"type":43,"tag":182,"props":342,"children":343},{},[344,349],{"type":43,"tag":186,"props":345,"children":346},{},[347],{"type":49,"value":348},"Scenario",{"type":43,"tag":186,"props":350,"children":351},{},[352],{"type":49,"value":353},"What to type",{"type":43,"tag":197,"props":355,"children":356},{},[357,375,392,409,426],{"type":43,"tag":182,"props":358,"children":359},{},[360,365],{"type":43,"tag":204,"props":361,"children":362},{},[363],{"type":49,"value":364},"Ecology",{"type":43,"tag":204,"props":366,"children":367},{},[368],{"type":43,"tag":369,"props":370,"children":372},{"className":371},[],[373],{"type":49,"value":374},"pair plot of sepal length, sepal width, petal length, petal width colored by 
species",{"type":43,"tag":182,"props":376,"children":377},{},[378,383],{"type":43,"tag":204,"props":379,"children":380},{},[381],{"type":49,"value":382},"Finance",{"type":43,"tag":204,"props":384,"children":385},{},[386],{"type":43,"tag":369,"props":387,"children":389},{"className":388},[],[390],{"type":49,"value":391},"pair plot of return, volatility, P/E ratio, and dividend yield colored by sector",{"type":43,"tag":182,"props":393,"children":394},{},[395,400],{"type":43,"tag":204,"props":396,"children":397},{},[398],{"type":49,"value":399},"Health research",{"type":43,"tag":204,"props":401,"children":402},{},[403],{"type":43,"tag":369,"props":404,"children":406},{"className":405},[],[407],{"type":49,"value":408},"pair plot of age, BMI, blood pressure, cholesterol colored by diabetes status",{"type":43,"tag":182,"props":410,"children":411},{},[412,417],{"type":43,"tag":204,"props":413,"children":414},{},[415],{"type":49,"value":416},"Climate data",{"type":43,"tag":204,"props":418,"children":419},{},[420],{"type":43,"tag":369,"props":421,"children":423},{"className":422},[],[424],{"type":49,"value":425},"pair plot of temperature, precipitation, humidity, wind speed by season",{"type":43,"tag":182,"props":427,"children":428},{},[429,434],{"type":43,"tag":204,"props":430,"children":431},{},[432],{"type":49,"value":433},"Machine learning",{"type":43,"tag":204,"props":435,"children":436},{},[437],{"type":43,"tag":369,"props":438,"children":440},{"className":439},[],[441],{"type":49,"value":442},"pair plot of all numeric features, color by target class, add correlation values",{"type":43,"tag":44,"props":444,"children":446},{"id":445},"related-tools",[447],{"type":49,"value":448},"Related Tools",{"type":43,"tag":52,"props":450,"children":451},{},[452,454,460,462,468,470,476],{"type":49,"value":453},"Use the ",{"type":43,"tag":151,"props":455,"children":457},{"href":456},"/tools/exploratory-data-analysis-ai",[458],{"type":49,"value":459},"Exploratory Data Analysis 
tool",{"type":49,"value":461}," for a complete automated analysis including pair plots, correlation matrices, summary statistics, and missing value reports — the pair plot tool is best when you want a specific, styled scatter matrix. Use the ",{"type":43,"tag":151,"props":463,"children":465},{"href":464},"/tools/ai-heatmap",[466],{"type":49,"value":467},"AI Heatmap Generator",{"type":49,"value":469}," to show a correlation matrix as a color-coded grid when you have too many variables for a pair plot (more than ~8 columns). Use the ",{"type":43,"tag":151,"props":471,"children":473},{"href":472},"/tools/ai-scatter-chart-generator",[474],{"type":49,"value":475},"AI Scatter Chart Generator",{"type":49,"value":477}," to examine a single pair of variables in detail with a larger canvas after identifying the most interesting relationship in the pair plot.",{"type":43,"tag":44,"props":479,"children":481},{"id":480},"frequently-asked-questions",[482],{"type":49,"value":483},"Frequently Asked Questions",{"type":43,"tag":52,"props":485,"children":486},{},[487,492,494,499],{"type":43,"tag":58,"props":488,"children":489},{},[490],{"type":49,"value":491},"How many variables can I include in a pair plot?",{"type":49,"value":493},"\nPair plots work best with ",{"type":43,"tag":58,"props":495,"children":496},{},[497],{"type":49,"value":498},"3–8 variables",{"type":49,"value":500},". A 4×4 grid is the sweet spot — large enough to reveal structure, small enough to read at a glance. With 8 variables you get a 64-panel grid that becomes hard to navigate. 
For more variables, ask the AI to select the most interesting subset or use the heatmap tool to show correlations only.",{"type":43,"tag":52,"props":502,"children":503},{},[504,509,511,516,518,523,525,530,532,537],{"type":43,"tag":58,"props":505,"children":506},{},[507],{"type":49,"value":508},"My scatter panels are all compressed because one variable has extreme outliers — what do I do?",{"type":49,"value":510},"\nAsk for a ",{"type":43,"tag":58,"props":512,"children":513},{},[514],{"type":49,"value":515},"log scale",{"type":49,"value":517}," on that variable: ",{"type":43,"tag":135,"props":519,"children":520},{},[521],{"type":49,"value":522},"\"log scale on the GDP axis\"",{"type":49,"value":524},". You can also ask to ",{"type":43,"tag":135,"props":526,"children":527},{},[528],{"type":49,"value":529},"\"cap outliers at the 99th percentile\"",{"type":49,"value":531}," or ",{"type":43,"tag":135,"props":533,"children":534},{},[535],{"type":49,"value":536},"\"exclude observations above X\"",{"type":49,"value":538},". Alternatively, ask for a log transform of the column before plotting.",{"type":43,"tag":52,"props":540,"children":541},{},[542,547,549,554,555,560],{"type":43,"tag":58,"props":543,"children":544},{},[545],{"type":49,"value":546},"Can I add correlation coefficients to each scatter panel?",{"type":49,"value":548},"\nYes — ask to ",{"type":43,"tag":135,"props":550,"children":551},{},[552],{"type":49,"value":553},"\"add Pearson r values to each scatter panel\"",{"type":49,"value":531},{"type":43,"tag":135,"props":556,"children":557},{},[558],{"type":49,"value":559},"\"annotate each panel with the correlation coefficient and p-value\"",{"type":49,"value":561},". 
The AI will compute these and place them as text annotations in the corner of each cell.",{"type":43,"tag":52,"props":563,"children":564},{},[565,570,572,577,578,591],{"type":43,"tag":58,"props":566,"children":567},{},[568],{"type":49,"value":569},"Can I show regression lines in the scatter panels?",{"type":49,"value":571},"\nYes — ask for ",{"type":43,"tag":135,"props":573,"children":574},{},[575],{"type":49,"value":576},"\"add regression lines to all scatter panels\"",{"type":49,"value":531},{"type":43,"tag":135,"props":579,"children":580},{},[581,583,589],{"type":49,"value":582},"\"use ",{"type":43,"tag":369,"props":584,"children":586},{"className":585},[],[587],{"type":49,"value":588},"kind='reg'",{"type":49,"value":590}," in seaborn pairplot\"",{"type":49,"value":592},". The AI will fit a linear regression line with a confidence band in each off-diagonal cell.",{"type":43,"tag":52,"props":594,"children":595},{},[596,601],{"type":43,"tag":58,"props":597,"children":598},{},[599],{"type":49,"value":600},"What's the difference between a pair plot and a correlation heatmap?",{"type":49,"value":602},"\nA correlation heatmap shows only the Pearson correlation coefficient between each pair as a single colored cell — fast to scan, but hides the actual shape of the relationship (linear vs. curved, presence of clusters, outliers). A pair plot shows the raw scatter for each pair, which reveals non-linearity, heteroscedasticity, and group structure that a single number cannot. 
Use the heatmap for a high-level overview with many variables; use the pair plot for thorough exploration with fewer variables.",{"title":7,"searchDepth":604,"depth":604,"links":605},2,[606,607,608,609,610,611],{"id":46,"depth":604,"text":50},{"id":105,"depth":604,"text":108},{"id":169,"depth":604,"text":172},{"id":331,"depth":604,"text":334},{"id":445,"depth":604,"text":448},{"id":480,"depth":604,"text":483},"markdown","content:tools:019.pair-plot.md","content","tools/019.pair-plot.md","tools/019.pair-plot","md",{"loc":4},1775502471196]