[{"data":1,"prerenderedAt":724},["ShallowReactive",2],{"content-query-uYhqYrarTb":3},{"_path":4,"_dir":5,"_draft":6,"_partial":6,"_locale":7,"title":8,"description":9,"heading":10,"prompt":11,"tags":15,"files":18,"nav":18,"presets":19,"gallery":36,"body":38,"_type":717,"_id":718,"_source":719,"_file":720,"_stem":721,"_extension":722,"sitemap":723},"/tools/empirical-cdf","tools",false,"","Empirical CDF Plot Generator for Excel & CSV","Create empirical CDF plots online from Excel and CSV data. Compare percentiles, thresholds, and full distributions with AI.","Empirical CDF Plot",{"prefix":12,"label":13,"placeholder":14},"Create an empirical CDF plot","Describe the ECDF plot you want to create","e.g. ECDF of life expectancy by income group, add quartile reference lines",[16,17],"charts","statistics",true,[20,26,31],{"label":21,"prompt":22,"dataset_url":23,"dataset_title":24,"dataset_citation":25},"Life expectancy by income group","empirical CDF of life expectancy by World Bank income group (low, lower-middle, upper-middle, high income); step-function curves; add horizontal reference lines at 25th, 50th, and 75th percentile; annotate median for each group","https://ourworldindata.org/grapher/life-expectancy.csv","Life expectancy","Our World in Data",{"label":27,"prompt":28,"dataset_url":29,"dataset_title":30,"dataset_citation":25},"GDP per capita distribution","empirical CDF of log GDP per capita by continent for the most recent year; one curve per continent; add vertical lines at $5k, $15k, $50k to show income thresholds; annotate the proportion of countries below each threshold per continent","https://ourworldindata.org/grapher/gdp-per-capita-worldbank.csv","GDP per capita (World Bank)",{"label":32,"prompt":33,"dataset_url":34,"dataset_title":35,"dataset_citation":25},"CO₂ emissions percentile ranking","empirical CDF of CO2 emissions per capita by world region; log scale on x-axis; one curve per region; annotate where each region's median falls; add vertical line at 
global median; show percentage of countries below 5 tonnes per capita","https://ourworldindata.org/grapher/co-emissions-per-capita.csv","CO₂ emissions per capita",[37],"/img/tools/empirical-cdf.png",{"type":39,"children":40,"toc":708},"root",[41,50,71,90,102,108,166,172,312,318,426,432,461,467,477,649,673,683],{"type":42,"tag":43,"props":44,"children":46},"element","h2",{"id":45},"what-is-an-empirical-cdf",[47],{"type":48,"value":49},"text","What Is an Empirical CDF?",{"type":42,"tag":51,"props":52,"children":53},"p",{},[54,56,62,64,69],{"type":48,"value":55},"An ",{"type":42,"tag":57,"props":58,"children":59},"strong",{},[60],{"type":48,"value":61},"empirical cumulative distribution function",{"type":48,"value":63}," (ECDF) is a step-function that, for each value x on the horizontal axis, shows the ",{"type":42,"tag":57,"props":65,"children":66},{},[67],{"type":48,"value":68},"proportion of observations in the dataset that are less than or equal to x",{"type":48,"value":70},". Sort your data from smallest to largest; the first observation sits at 1/n, the second at 2/n, and so on until the last observation reaches 1.0 (100%). The result is a staircase curve that rises from 0 to 1 across the range of the data, with each step corresponding to one observation.",{"type":42,"tag":51,"props":72,"children":73},{},[74,76,81,83,88],{"type":48,"value":75},"The ECDF's key advantage over a histogram or density plot is that it is ",{"type":42,"tag":57,"props":77,"children":78},{},[79],{"type":48,"value":80},"non-parametric and assumption-free",{"type":48,"value":82}," — it makes no choices about bin widths or smoothing parameters that could distort the shape. Every observation appears exactly once, as one step. This makes the ECDF a reliable reference for ",{"type":42,"tag":57,"props":84,"children":85},{},[86],{"type":48,"value":87},"percentile reading",{"type":48,"value":89},": to find the median, draw a horizontal line at 0.5 and read off where it crosses the curve. 
The 90th percentile is where the curve crosses 0.9. Comparing two ECDFs directly shows which group is stochastically larger (its curve lies to the right), and where exactly the distributions diverge — whether at the tails, the center, or uniformly.",{"type":42,"tag":51,"props":91,"children":92},{},[93,95,100],{"type":48,"value":94},"ECDFs are used for ",{"type":42,"tag":57,"props":96,"children":97},{},[98],{"type":48,"value":99},"distribution comparison",{"type":48,"value":101}," in almost every quantitative field. In survival analysis, the complement (1 − ECDF) is the empirical survival function when no observations are censored. In quality control, an ECDF shows what fraction of products fall within specification. In economics, the Lorenz curve (used to compute the Gini coefficient) is a close relative: it plots the cumulative share of income against the cumulative share of the population. In machine learning, ECDFs help check probabilistic forecasts: for a well-calibrated forecast, the ECDF of the probability integral transform values follows the diagonal.",{"type":42,"tag":43,"props":103,"children":105},{"id":104},"how-it-works",[106],{"type":48,"value":107},"How It Works",{"type":42,"tag":109,"props":110,"children":111},"ol",{},[112,123,139],{"type":42,"tag":113,"props":114,"children":115},"li",{},[116,121],{"type":42,"tag":57,"props":117,"children":118},{},[119],{"type":48,"value":120},"Upload your data",{"type":48,"value":122}," — provide a CSV or Excel file with at least one numeric column and optionally a categorical column for grouping. One row per observation.",{"type":42,"tag":113,"props":124,"children":125},{},[126,131,133],{"type":42,"tag":57,"props":127,"children":128},{},[129],{"type":48,"value":130},"Describe the plot",{"type":48,"value":132}," — e.g. 
",{"type":42,"tag":134,"props":135,"children":136},"em",{},[137],{"type":48,"value":138},"\"ECDF of salary by department, add median reference lines, log scale on x-axis\"",{"type":42,"tag":113,"props":140,"children":141},{},[142,147,149,156,158,164],{"type":42,"tag":57,"props":143,"children":144},{},[145],{"type":48,"value":146},"Get the visualization",{"type":48,"value":148}," — the AI writes Python code using ",{"type":42,"tag":150,"props":151,"children":153},"a",{"href":152},"https://numpy.org/",[154],{"type":48,"value":155},"numpy",{"type":48,"value":157}," and ",{"type":42,"tag":150,"props":159,"children":161},{"href":160},"https://plotly.com/python/",[162],{"type":48,"value":163},"Plotly",{"type":48,"value":165}," to sort the data, compute cumulative proportions, and render step-function curves for each group",{"type":42,"tag":43,"props":167,"children":169},{"id":168},"interpreting-the-results",[170],{"type":48,"value":171},"Interpreting the Results",{"type":42,"tag":173,"props":174,"children":175},"table",{},[176,195],{"type":42,"tag":177,"props":178,"children":179},"thead",{},[180],{"type":42,"tag":181,"props":182,"children":183},"tr",{},[184,190],{"type":42,"tag":185,"props":186,"children":187},"th",{},[188],{"type":48,"value":189},"Visual element",{"type":42,"tag":185,"props":191,"children":192},{},[193],{"type":48,"value":194},"What it means",{"type":42,"tag":196,"props":197,"children":198},"tbody",{},[199,216,232,248,264,280,296],{"type":42,"tag":181,"props":200,"children":201},{},[202,211],{"type":42,"tag":203,"props":204,"children":205},"td",{},[206],{"type":42,"tag":57,"props":207,"children":208},{},[209],{"type":48,"value":210},"Curve far to the right",{"type":42,"tag":203,"props":212,"children":213},{},[214],{"type":48,"value":215},"Group has higher values overall — stochastically 
dominant",{"type":42,"tag":181,"props":217,"children":218},{},[219,227],{"type":42,"tag":203,"props":220,"children":221},{},[222],{"type":42,"tag":57,"props":223,"children":224},{},[225],{"type":48,"value":226},"Steep section",{"type":42,"tag":203,"props":228,"children":229},{},[230],{"type":48,"value":231},"Many observations clustered in a narrow range — values concentrate here",{"type":42,"tag":181,"props":233,"children":234},{},[235,243],{"type":42,"tag":203,"props":236,"children":237},{},[238],{"type":42,"tag":57,"props":239,"children":240},{},[241],{"type":48,"value":242},"Flat section",{"type":42,"tag":203,"props":244,"children":245},{},[246],{"type":48,"value":247},"Gap in the data: no observations fall in that value range",{"type":42,"tag":181,"props":249,"children":250},{},[251,259],{"type":42,"tag":203,"props":252,"children":253},{},[254],{"type":42,"tag":57,"props":255,"children":256},{},[257],{"type":48,"value":258},"Crossing curves",{"type":42,"tag":203,"props":260,"children":261},{},[262],{"type":48,"value":263},"Neither group is uniformly larger: the group with the higher cumulative proportion switches at the crossing point, often because one group is more spread out",{"type":42,"tag":181,"props":265,"children":266},{},[267,275],{"type":42,"tag":203,"props":268,"children":269},{},[270],{"type":42,"tag":57,"props":271,"children":272},{},[273],{"type":48,"value":274},"Reading at y = 0.5",{"type":42,"tag":203,"props":276,"children":277},{},[278],{"type":48,"value":279},"Median of that group: 50% of observations are at or below this value",{"type":42,"tag":181,"props":281,"children":282},{},[283,291],{"type":42,"tag":203,"props":284,"children":285},{},[286],{"type":42,"tag":57,"props":287,"children":288},{},[289],{"type":48,"value":290},"Reading at y = 0.9",{"type":42,"tag":203,"props":292,"children":293},{},[294],{"type":48,"value":295},"90th percentile — only 10% of observations exceed this 
value",{"type":42,"tag":181,"props":297,"children":298},{},[299,307],{"type":42,"tag":203,"props":300,"children":301},{},[302],{"type":42,"tag":57,"props":303,"children":304},{},[305],{"type":48,"value":306},"Vertical gap between curves at the same x",{"type":42,"tag":203,"props":308,"children":309},{},[310],{"type":48,"value":311},"Proportion difference — e.g. \"30% more countries in Group A fall below $15k GDP\"",{"type":42,"tag":43,"props":313,"children":315},{"id":314},"example-prompts",[316],{"type":48,"value":317},"Example Prompts",{"type":42,"tag":173,"props":319,"children":320},{},[321,337],{"type":42,"tag":177,"props":322,"children":323},{},[324],{"type":42,"tag":181,"props":325,"children":326},{},[327,332],{"type":42,"tag":185,"props":328,"children":329},{},[330],{"type":48,"value":331},"Scenario",{"type":42,"tag":185,"props":333,"children":334},{},[335],{"type":48,"value":336},"What to type",{"type":42,"tag":196,"props":338,"children":339},{},[340,358,375,392,409],{"type":42,"tag":181,"props":341,"children":342},{},[343,348],{"type":42,"tag":203,"props":344,"children":345},{},[346],{"type":48,"value":347},"Group comparison",{"type":42,"tag":203,"props":349,"children":350},{},[351],{"type":42,"tag":352,"props":353,"children":355},"code",{"className":354},[],[356],{"type":48,"value":357},"ECDF of test scores by teaching method, add 50th and 90th percentile lines",{"type":42,"tag":181,"props":359,"children":360},{},[361,366],{"type":42,"tag":203,"props":362,"children":363},{},[364],{"type":48,"value":365},"Threshold analysis",{"type":42,"tag":203,"props":367,"children":368},{},[369],{"type":42,"tag":352,"props":370,"children":372},{"className":371},[],[373],{"type":48,"value":374},"ECDF of income, add vertical line at poverty threshold, show % 
below",{"type":42,"tag":181,"props":376,"children":377},{},[378,383],{"type":42,"tag":203,"props":379,"children":380},{},[381],{"type":48,"value":382},"Before/after",{"type":42,"tag":203,"props":384,"children":385},{},[386],{"type":42,"tag":352,"props":387,"children":389},{"className":388},[],[390],{"type":48,"value":391},"ECDF of response time before and after system upgrade, overlay both curves",{"type":42,"tag":181,"props":393,"children":394},{},[395,400],{"type":42,"tag":203,"props":396,"children":397},{},[398],{"type":48,"value":399},"Log scale",{"type":42,"tag":203,"props":401,"children":402},{},[403],{"type":42,"tag":352,"props":404,"children":406},{"className":405},[],[407],{"type":48,"value":408},"ECDF of CO2 emissions per capita, log scale on x-axis, one curve per region",{"type":42,"tag":181,"props":410,"children":411},{},[412,417],{"type":42,"tag":203,"props":413,"children":414},{},[415],{"type":48,"value":416},"Percentile lookup",{"type":42,"tag":203,"props":418,"children":419},{},[420],{"type":42,"tag":352,"props":421,"children":423},{"className":422},[],[424],{"type":48,"value":425},"ECDF of salary distribution, annotate where $80k falls as a percentile",{"type":42,"tag":43,"props":427,"children":429},{"id":428},"related-tools",[430],{"type":48,"value":431},"Related Tools",{"type":42,"tag":51,"props":433,"children":434},{},[435,437,443,445,451,453,459],{"type":48,"value":436},"Use the ",{"type":42,"tag":150,"props":438,"children":440},{"href":439},"/tools/density-plot",[441],{"type":48,"value":442},"Density Plot Generator",{"type":48,"value":444}," when you want a smoothed continuous curve showing relative likelihood rather than cumulative proportion — density plots are more intuitive for general audiences but require bandwidth choice. 
Use the ",{"type":42,"tag":150,"props":446,"children":448},{"href":447},"/tools/ai-box-plot",[449],{"type":48,"value":450},"AI Box Plot Generator",{"type":48,"value":452}," to compare groups by summary statistics (median, IQR, outliers) in a more compact form. Use the ",{"type":42,"tag":150,"props":454,"children":456},{"href":455},"/tools/ai-histogram-generator",[457],{"type":48,"value":458},"AI Histogram Generator",{"type":48,"value":460}," when you want to show raw counts in discrete bins rather than a cumulative proportion.",{"type":42,"tag":43,"props":462,"children":464},{"id":463},"frequently-asked-questions",[465],{"type":48,"value":466},"Frequently Asked Questions",{"type":42,"tag":51,"props":468,"children":469},{},[470,475],{"type":42,"tag":57,"props":471,"children":472},{},[473],{"type":48,"value":474},"What's the difference between an ECDF and a histogram?",{"type":48,"value":476},"\nA histogram groups observations into bins and counts how many fall in each — the result depends on bin width and starting position, and shows relative frequency (not cumulative). An ECDF makes no binning choices, shows every observation exactly once as a step, and reads out cumulative proportions directly. The ECDF is more precise for percentile lookups and group comparisons; the histogram is more intuitive for showing the shape of a single distribution.",{"type":42,"tag":51,"props":478,"children":479},{},[480,485,640,642,647],{"type":42,"tag":57,"props":481,"children":482},{},[483],{"type":48,"value":484},"How do I read the percentile of a specific value from the plot?",{"type":48,"value":486},"\nFind the value on the x-axis, draw a vertical line up to the curve, then read the y-axis. 
That y-value is the cumulative proportion: if the curve reaches 0.73 at x = $50,000, then 73% of the observations are at or below $",{"type":42,"tag":488,"props":489,"children":492},"span",{"className":490},[491],"katex",[493,566],{"type":42,"tag":488,"props":494,"children":497},{"className":495},[496],"katex-mathml",[498],{"type":42,"tag":499,"props":500,"children":502},"math",{"xmlns":501},"http://www.w3.org/1998/Math/MathML",[503],{"type":42,"tag":504,"props":505,"children":506},"semantics",{},[507,559],{"type":42,"tag":508,"props":509,"children":510},"mrow",{},[511,517,524,529,533,539,544,549,554],{"type":42,"tag":512,"props":513,"children":514},"mn",{},[515],{"type":48,"value":516},"50",{"type":42,"tag":518,"props":519,"children":521},"mo",{"separator":520},"true",[522],{"type":48,"value":523},",",{"type":42,"tag":512,"props":525,"children":526},{},[527],{"type":48,"value":528},"000",{"type":42,"tag":518,"props":530,"children":531},{"separator":520},[532],{"type":48,"value":523},{"type":42,"tag":534,"props":535,"children":536},"mi",{},[537],{"type":48,"value":538},"t",{"type":42,"tag":534,"props":540,"children":541},{},[542],{"type":48,"value":543},"h",{"type":42,"tag":534,"props":545,"children":546},{},[547],{"type":48,"value":548},"e",{"type":42,"tag":534,"props":550,"children":551},{},[552],{"type":48,"value":553},"n",{"type":42,"tag":512,"props":555,"children":556},{},[557],{"type":48,"value":558},"73",{"type":42,"tag":560,"props":561,"children":563},"annotation",{"encoding":562},"application/x-tex",[564],{"type":48,"value":565},"50,000, then 73% of the distribution earns less than 
",{"type":42,"tag":488,"props":567,"children":570},{"className":568,"ariaHidden":520},[569],"katex-html",[571],{"type":42,"tag":488,"props":572,"children":575},{"className":573},[574],"base",[576,582,588,594,600,605,610,614,620,625,630,635],{"type":42,"tag":488,"props":577,"children":581},{"className":578,"style":580},[579],"strut","height:0.8889em;vertical-align:-0.1944em;",[],{"type":42,"tag":488,"props":583,"children":586},{"className":584},[585],"mord",[587],{"type":48,"value":516},{"type":42,"tag":488,"props":589,"children":592},{"className":590},[591],"mpunct",[593],{"type":48,"value":523},{"type":42,"tag":488,"props":595,"children":599},{"className":596,"style":598},[597],"mspace","margin-right:0.1667em;",[],{"type":42,"tag":488,"props":601,"children":603},{"className":602},[585],[604],{"type":48,"value":528},{"type":42,"tag":488,"props":606,"children":608},{"className":607},[591],[609],{"type":48,"value":523},{"type":42,"tag":488,"props":611,"children":613},{"className":612,"style":598},[597],[],{"type":42,"tag":488,"props":615,"children":618},{"className":616},[585,617],"mathnormal",[619],{"type":48,"value":538},{"type":42,"tag":488,"props":621,"children":623},{"className":622},[585,617],[624],{"type":48,"value":543},{"type":42,"tag":488,"props":626,"children":628},{"className":627},[585,617],[629],{"type":48,"value":548},{"type":42,"tag":488,"props":631,"children":633},{"className":632},[585,617],[634],{"type":48,"value":553},{"type":42,"tag":488,"props":636,"children":638},{"className":637},[585],[639],{"type":48,"value":558},{"type":48,"value":641},"50,000. 
Ask the AI to ",{"type":42,"tag":134,"props":643,"children":644},{},[645],{"type":48,"value":646},"\"annotate where $50,000 falls as a percentile\"",{"type":48,"value":648}," and it will add the crossing point label automatically.",{"type":42,"tag":51,"props":650,"children":651},{},[652,657,659,664,666,671],{"type":42,"tag":57,"props":653,"children":654},{},[655],{"type":48,"value":656},"Can I use the ECDF to formally test whether two groups have the same distribution?",{"type":48,"value":658},"\nYes — the ",{"type":42,"tag":57,"props":660,"children":661},{},[662],{"type":48,"value":663},"Kolmogorov-Smirnov (KS) test",{"type":48,"value":665}," uses the maximum vertical distance between two ECDFs as its test statistic. Ask for ",{"type":42,"tag":134,"props":667,"children":668},{},[669],{"type":48,"value":670},"\"two-sample KS test between Group A and Group B\"",{"type":48,"value":672}," and the AI will compute the KS statistic and p-value and mark the point of maximum separation on the ECDF plot.",{"type":42,"tag":51,"props":674,"children":675},{},[676,681],{"type":42,"tag":57,"props":677,"children":678},{},[679],{"type":48,"value":680},"My data has ties — does that affect the ECDF?",{"type":48,"value":682},"\nTies produce a single taller step: if k observations share the same value, the ECDF jumps by k/n at that value instead of 1/n. This is mathematically correct and doesn't need special handling. Under a continuous theoretical distribution ties occur with probability zero, but real data is usually rounded or discretized, so some ties are common.",{"type":42,"tag":51,"props":684,"children":685},{},[686,691,693,698,700,706],{"type":42,"tag":57,"props":687,"children":688},{},[689],{"type":48,"value":690},"Can I overlay a theoretical CDF (e.g. normal) on top of the empirical one?",{"type":48,"value":692},"\nYes — ask to ",{"type":42,"tag":134,"props":694,"children":695},{},[696],{"type":48,"value":697},"\"overlay the theoretical normal CDF using the sample mean and standard deviation\"",{"type":48,"value":699},". 
If the ECDF closely follows the theoretical curve, the data is approximately normally distributed. The largest vertical distance between the two curves is the statistic used by the KS goodness-of-fit test, and the ",{"type":42,"tag":150,"props":701,"children":703},{"href":702},"/tools/qq-plot",[704],{"type":48,"value":705},"Q-Q Plot Generator",{"type":48,"value":707}," shows the same comparison quantile by quantile.",{"title":7,"searchDepth":709,"depth":709,"links":710},2,[711,712,713,714,715,716],{"id":45,"depth":709,"text":49},{"id":104,"depth":709,"text":107},{"id":168,"depth":709,"text":171},{"id":314,"depth":709,"text":317},{"id":428,"depth":709,"text":431},{"id":463,"depth":709,"text":466},"markdown","content:tools:032.empirical-cdf.md","content","tools/032.empirical-cdf.md","tools/032.empirical-cdf","md",{"loc":4},1775502468196]