<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>F-test | Dhafer Malouche</title><link>https://dhafermalouche.net/tag/f-test/</link><atom:link href="https://dhafermalouche.net/tag/f-test/index.xml" rel="self" type="application/rss+xml"/><description>F-test</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><copyright>Dhafer Malouche © 2026</copyright><lastBuildDate>Sat, 02 May 2026 00:00:00 +0000</lastBuildDate><image><url>https://dhafermalouche.net/media/icon_hu294da7f24af66942b94b8e240e33fe59_2153342_512x512_fill_lanczos_center_3.png</url><title>F-test</title><link>https://dhafermalouche.net/tag/f-test/</link></image><item><title>StatANOVA — One-way ANOVA &amp; Tukey HSD Workbench</title><link>https://dhafermalouche.net/apps/statanova/</link><pubDate>Sat, 02 May 2026 00:00:00 +0000</pubDate><guid>https://dhafermalouche.net/apps/statanova/</guid><description>&lt;p>An interactive web application that fits a one-way analysis of variance and its standard post-hoc decomposition entirely in the student&amp;rsquo;s browser. &lt;strong>StatANOVA&lt;/strong> extends the small family of teaching tools designed for undergraduate statistics at Qatar University: where &lt;strong>StatTests&lt;/strong> answers &lt;em>which test do I run on these two groups&lt;/em> and &lt;strong>StatRegress&lt;/strong> asks &lt;em>given these data, what is the model&lt;/em>, &lt;strong>StatANOVA&lt;/strong> asks &lt;em>do these $k$ groups differ on average — and if so, which ones?&lt;/em>&lt;/p>
&lt;h2 id="why-an-anova-workbench">Why an ANOVA workbench?&lt;/h2>
&lt;p>The pedagogical gap StatANOVA targets is the step from the global $F$-test to a defensible per-pair conclusion. In a typical lecture, students are taught the $F$-statistic for the equality-of-means hypothesis $H_{0}: \mu_{1} = \cdots = \mu_{k}$, but the natural next question — &lt;em>which groups are responsible for the rejection?&lt;/em> — is usually answered with a quick remark about Tukey&amp;rsquo;s honestly significant difference. StatANOVA closes that loop interactively: the student uploads a real dataset, reads the ANOVA decomposition, and immediately sees the Tukey HSD intervals, the family-wise adjusted $p$-values, and the resulting compact-letter-display (CLD) groupings on the same screen.&lt;/p>
&lt;h2 id="what-the-app-does">What the app does&lt;/h2>
&lt;p>&lt;strong>Input.&lt;/strong> Upload a CSV (UTF-8, header row, comma-separated, dot decimal, $\le 10$ MB and $\le 50{,}000$ rows after dropping NAs). The app inspects the columns and proposes:&lt;/p>
&lt;ul>
&lt;li>a &lt;strong>factor variable&lt;/strong> — any column with between 2 and 6 distinct values, with at least 3 observations per level after listwise deletion;&lt;/li>
&lt;li>one or more &lt;strong>continuous response variables&lt;/strong> — numeric columns selectable in a checklist, capped at 50 active responses.&lt;/li>
&lt;/ul>
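&lt;p>The factor-variable rule can be sketched in a few lines. This is a minimal illustration in Python, not the app&amp;rsquo;s own code: the helper name &lt;code>is_factor_candidate&lt;/code> and its list-based input are assumptions made for the example, and missing values are assumed to have been dropped already.&lt;/p>

```python
from collections import Counter


def is_factor_candidate(values, min_levels=2, max_levels=6, min_per_level=3):
    """Return True if a column could serve as the grouping factor.

    Hypothetical helper mirroring the stated rule: between 2 and 6 distinct
    values, with at least 3 observations in every level.
    """
    counts = Counter(values)  # level -> number of observations
    enough_levels = len(counts) in range(min_levels, max_levels + 1)
    # 'and' short-circuits, so min() is never called on an empty column
    return enough_levels and min(counts.values()) >= min_per_level
```

&lt;p>Under these assumptions a column with three levels of four observations each qualifies, while a column in which one level has only two observations does not.&lt;/p>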
&lt;p>The student then chooses the significance level $\alpha$ and the post-hoc method (Tukey HSD by default).&lt;/p>
&lt;p>&lt;strong>Inferential output.&lt;/strong> For each selected response, the app reports:&lt;/p>
&lt;ul>
&lt;li>the &lt;strong>ANOVA summary table&lt;/strong> (sums of squares, degrees of freedom, mean squares, the $F$-statistic, and the corresponding $p$-value);&lt;/li>
&lt;li>the &lt;strong>compact-letter-display (CLD) table&lt;/strong>: each group is annotated with letters such that two groups share at least one letter if and only if their means are not significantly different at level $\alpha$ under the chosen multiple-comparison correction;&lt;/li>
&lt;li>a &lt;strong>forest plot of pairwise mean differences&lt;/strong> with simultaneous confidence intervals, ordered for readability, with intervals that exclude zero highlighted.&lt;/li>
&lt;/ul>
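&lt;p>The letter rule in the CLD table can be made concrete with a small sketch of an insert-and-absorb style construction. It is an illustration only, written in Python rather than taken from the app: the function name &lt;code>compact_letter_display&lt;/code> is hypothetical, and the input is assumed to be a list of group names plus the pairs already judged significantly different at level $\alpha$.&lt;/p>

```python
def compact_letter_display(groups, significant_pairs):
    """Assign letters so that two groups share a letter exactly when
    their pair is not listed in significant_pairs.

    groups: list of group names (ideally ordered by mean).
    significant_pairs: iterable of (a, b) pairs judged different.
    """
    columns = [set(groups)]  # start with one letter shared by every group
    for a, b in significant_pairs:
        split = []
        for col in columns:
            if a in col and b in col:
                # insert step: two copies, one without a and one without b
                split.append(col - {a})
                split.append(col - {b})
            else:
                split.append(col)
        # absorb step: drop any column contained in a column already kept
        columns = []
        for col in sorted(split, key=len, reverse=True):
            if not any(col.issubset(kept) for kept in columns):
                columns.append(col)
    # hand out letters following the input order of the groups
    rank = {g: i for i, g in enumerate(groups)}
    columns.sort(key=lambda col: sorted(rank[g] for g in col))
    letters = "abcdefghijklmnopqrstuvwxyz"
    return {
        g: "".join(letters[i] for i, col in enumerate(columns) if g in col)
        for g in groups
    }
```

&lt;p>For three groups A, B, C in which only the pair (A, C) differs, this sketch labels A with &lt;code>a&lt;/code>, B with &lt;code>ab&lt;/code>, and C with &lt;code>b&lt;/code>: B shares a letter with both neighbours, while A and C share none.&lt;/p>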
&lt;p>Because the workflow runs across many response variables in a single pass, StatANOVA is well suited to the kind of multivariate teaching dataset (Qatar Biobank-style, biomedical, or biodiversity) where a single grouping factor is to be screened against several outcomes.&lt;/p>
&lt;h2 id="classroom-workflow">Classroom workflow&lt;/h2>
&lt;p>In lectures, the instructor mirrors the app on the projector while writing the model on the board: the algebraic decomposition $\mathrm{SS}_{\text{total}} = \mathrm{SS}_{\text{between}} + \mathrm{SS}_{\text{within}}$ is read off the same table the students see. In practice sessions, students upload their assigned CSV, copy the ANOVA table, the CLD summary, and the forest plot into their report, and explain in one paragraph (i) whether the global $F$-test rejects, (ii) which pairs of groups differ once the family-wise error is controlled, and (iii) how the CLD letters and the forest plot tell the same story in two complementary forms.&lt;/p>
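&lt;p>The sum-of-squares decomposition that students read off the table can be checked by hand with a short sketch. The code below is an illustration in plain Python, not the app&amp;rsquo;s implementation: the function name &lt;code>one_way_anova&lt;/code> and the dictionary input are assumptions, and the $p$-value is left out because it needs the $F$ distribution function, which the Python standard library does not provide.&lt;/p>

```python
from statistics import fmean


def one_way_anova(groups):
    """One-way ANOVA table entries.

    groups: dict mapping each level name to its list of observations.
    """
    all_values = [y for ys in groups.values() for y in ys]
    n, k = len(all_values), len(groups)
    grand_mean = fmean(all_values)

    # between-group part: group sizes times squared deviations of group means
    ss_between = sum(
        len(ys) * (fmean(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    # within-group part: squared deviations from each group's own mean
    ss_within = sum(
        (y - fmean(ys)) ** 2 for ys in groups.values() for y in ys
    )
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)

    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within  # pooled variance estimator
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f_statistic": ms_between / ms_within,
    }
```

&lt;p>With the toy groups &lt;code>{"a": [1, 2, 3], "b": [2, 3, 4], "c": [6, 7, 8]}&lt;/code> the sketch gives $\mathrm{SS}_{\text{between}} = 42$, $\mathrm{SS}_{\text{within}} = 6$, $\mathrm{SS}_{\text{total}} = 48$, and $F = 21$ on $(2, 6)$ degrees of freedom, so the two parts add up to the total as the decomposition requires.&lt;/p>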
&lt;h2 id="technical-notes">Technical notes&lt;/h2>
&lt;p>The app is a single-page client-side application: all computation runs in the browser, with no server round-trip and no data leaving the device. The ANOVA decomposition is computed directly from the group means and pooled variance estimator; Tukey HSD intervals use the studentised-range distribution at the chosen family-wise level; the CLD is constructed by the standard insert-and-absorb algorithm on the matrix of adjusted $p$-values. The static bundle is deployed on Netlify; like its siblings, it works offline after first load and has no external run-time dependencies.&lt;/p></description></item><item><title>StatRegress — Linear Regression Workbench</title><link>https://dhafermalouche.net/apps/statregress/</link><pubDate>Sat, 25 Apr 2026 00:00:00 +0000</pubDate><guid>https://dhafermalouche.net/apps/statregress/</guid><description>&lt;p>An interactive web application that fits and diagnoses ordinary least-squares regression models entirely in the student&amp;rsquo;s browser. &lt;strong>StatRegress&lt;/strong> completes a small family of teaching tools designed for undergraduate statistics at Qatar University: where &lt;strong>StatTables&lt;/strong> answers &lt;em>what is the critical value&lt;/em> and &lt;strong>StatTests&lt;/strong> answers &lt;em>which test do I run&lt;/em>, &lt;strong>StatRegress&lt;/strong> asks &lt;em>given these data, what is the model — and is it any good?&lt;/em>&lt;/p>
&lt;h2 id="why-a-regression-workbench">Why a regression workbench?&lt;/h2>
&lt;p>Most introductory regression instruction is split between (i) computing $\hat{\beta}$, $\mathrm{SE}(\hat{\beta})$, $t$- and $F$-statistics by hand on toy data and (ii) demonstrating the same calculations in R or Python with &lt;code>lm()&lt;/code>/&lt;code>statsmodels&lt;/code>. Both have pedagogical limits: hand calculations don&amp;rsquo;t scale beyond $n \approx 10$, while a full statistical environment hides the geometry of the fit behind a console output. StatRegress sits between the two — students paste a real dataset, see the regression line drawn directly on the scatter, and read the standard coefficient table and diagnostic plots in the same view, with no installation and no server round-trip.&lt;/p>
&lt;h2 id="what-the-app-does">What the app does&lt;/h2>
&lt;p>&lt;strong>Input.&lt;/strong> Paste a CSV (or load one of the bundled teaching datasets), choose the response and the predictor(s), and select the assumed model (simple linear regression, multiple regression with a handful of predictors, or polynomial extension).&lt;/p>
&lt;p>&lt;strong>Estimation output.&lt;/strong> A regression report formatted as in a textbook:&lt;/p>
&lt;ul>
&lt;li>the &lt;strong>coefficient table&lt;/strong> with $\hat{\beta}_{j}$, $\mathrm{SE}(\hat{\beta}_{j})$, $t_{j} = \hat{\beta}_{j}/\mathrm{SE}(\hat{\beta}_{j})$, the two-sided $p$-value, and the $95\%$ confidence interval;&lt;/li>
&lt;li>the &lt;strong>model summary&lt;/strong>: residual standard error $\hat{\sigma}$, multiple $R^{2}$, adjusted $R^{2}$, and the global $F$-test for $H_{0}: \beta_{1} = \cdots = \beta_{p} = 0$;&lt;/li>
&lt;li>the &lt;strong>ANOVA decomposition&lt;/strong> of the total sum of squares.&lt;/li>
&lt;/ul>
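&lt;p>For the simple linear regression case, the entries of the coefficient table and model summary can be reproduced with textbook closed-form formulas. The sketch below is an illustration in plain Python under that assumption: it does not use the QR decomposition, the function name &lt;code>simple_ols&lt;/code> is hypothetical, and $p$-values and confidence intervals are omitted because they need $t$ quantiles that the standard library does not provide.&lt;/p>

```python
from math import sqrt


def simple_ols(x, y):
    """Closed-form least squares for y = b0 + b1 * x (needs n > 2)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rss = sum(r * r for r in residuals)
    tss = sum((yi - y_bar) ** 2 for yi in y)

    sigma_hat = sqrt(rss / (n - 2))  # residual standard error
    se_slope = sigma_hat / sqrt(sxx)
    se_intercept = sigma_hat * sqrt(1 / n + x_bar ** 2 / sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "se_intercept": se_intercept,
        "se_slope": se_slope,
        "t_intercept": intercept / se_intercept,
        "t_slope": slope / se_slope,
        "sigma_hat": sigma_hat,
        "r_squared": 1 - rss / tss,
    }
```

&lt;p>On the toy data &lt;code>x = [1, 2, 3, 4, 5]&lt;/code>, &lt;code>y = [2, 4, 5, 4, 5]&lt;/code> this yields an intercept of $2.2$, a slope of $0.6$, and $R^{2} = 0.6$, values small enough to verify by hand.&lt;/p>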
&lt;p>&lt;strong>Diagnostic output.&lt;/strong> The four classical residual plots (residuals vs. fitted values, normal Q–Q plot of standardised residuals, scale–location ($\sqrt{|r_{i}|}$ vs. $\hat{y}_{i}$), and residuals vs. leverage with &lt;strong>Cook&amp;rsquo;s distance&lt;/strong> contours) are shown together with a flag for influential or high-leverage observations.&lt;/p>
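&lt;p>The quantities behind the residuals-vs-leverage plot can be sketched for simple linear regression, where the leverage is $h_{i} = 1/n + (x_{i} - \bar{x})^{2}/S_{xx}$ and Cook&amp;rsquo;s distance is $D_{i} = \frac{e_{i}^{2}}{p\,\hat{\sigma}^{2}} \cdot \frac{h_{i}}{(1 - h_{i})^{2}}$ with $p = 2$ coefficients. The Python code below is an illustration under those textbook formulas, not the app&amp;rsquo;s code; the function name &lt;code>leverage_and_cooks&lt;/code> is hypothetical, and the data are assumed not to lie exactly on a line (so that $\hat{\sigma}^{2}$ is positive).&lt;/p>

```python
def leverage_and_cooks(x, y):
    """Leverage h_i and Cook's distance D_i for y = b0 + b1 * x."""
    n = len(x)
    p = 2  # number of fitted coefficients: intercept and slope
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(e * e for e in residuals) / (n - p)  # residual variance

    # leverage grows with the squared distance of x_i from the mean of x
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's distance combines residual size and leverage
    cooks = [
        (e * e / (p * sigma2)) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]
    return leverage, cooks
```

&lt;p>A useful sanity check is that the leverages sum to the number of coefficients, here $2$. On &lt;code>x = [1, 2, 3, 4, 5]&lt;/code>, &lt;code>y = [2, 4, 5, 4, 5]&lt;/code> the two end points have leverage $0.6$ and the first observation has Cook&amp;rsquo;s distance $1.5$.&lt;/p>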
&lt;h2 id="drag-a-point-mode">Drag-a-point mode&lt;/h2>
&lt;p>For simple linear regression the app exposes a &lt;strong>drag-a-point&lt;/strong> interaction: students grab a single observation in the scatter, move it, and the fitted line, $R^{2}$, the coefficient table, and the residuals all update in real time. This makes intuitive what an algebraic discussion of leverage and influence usually fails to convey — that a single high-leverage point can rotate the line, that an outlier in the middle of the design space barely moves the slope, and that Cook&amp;rsquo;s distance is geometric in nature.&lt;/p>
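&lt;p>The effect of dragging a point can also be checked numerically. Moving observation $i$ vertically by $\Delta$ changes the least-squares slope by $(x_{i} - \bar{x})\,\Delta / S_{xx}$, so a point at $\bar{x}$ leaves the slope untouched while a point at the edge of the design rotates the line. The Python sketch below illustrates this; the function names are hypothetical and it is not the app&amp;rsquo;s code.&lt;/p>

```python
def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx


def slope_shift_after_drag(x, y, index, dy):
    """Change in slope when observation `index` is moved vertically by dy."""
    moved = list(y)  # copy, so the original data stay untouched
    moved[index] += dy
    return ols_slope(x, moved) - ols_slope(x, y)
```

&lt;p>With &lt;code>x = [1, 2, 3, 4, 5]&lt;/code>, dragging the middle point (at $x = 3 = \bar{x}$) up by 10 units leaves the slope unchanged, whereas dragging the last point (at $x = 5$) up by the same amount raises the slope by $2$.&lt;/p>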
&lt;h2 id="classroom-workflow">Classroom workflow&lt;/h2>
&lt;p>In lectures, the instructor mirrors the app on the projector while building the model on the board: each new term in the algebra has its counterpart in the live coefficient table. In practice sessions, students paste their assigned dataset, copy the coefficient table and diagnostic plots into their solution, and report which assumptions look satisfied, which look suspicious, and which observations they would investigate further. Because the app produces a deterministic report from a deterministic input, grading is reproducible.&lt;/p>
&lt;h2 id="technical-notes">Technical notes&lt;/h2>
&lt;p>The app is a single-page client-side application built with &lt;strong>React&lt;/strong>: all computation runs in the student&amp;rsquo;s browser, with no server round-trip and no data leaving the device. The OLS estimator is computed via the QR decomposition for numerical stability; standard errors and inference are obtained from the corresponding $(X^{\top}X)^{-1}$ block. Distributional quantiles for the $t$ and $F$ tables are computed with the &lt;a href="https://github.com/jstat/jstat" target="_blank" rel="noopener">jStat&lt;/a> numerical library (MIT-licensed). The static bundle is deployed on Netlify; like its siblings, it works offline after first load and has no external run-time dependencies.&lt;/p></description></item></channel></rss>