| Authors | Jaimie N. Davis, Adriana Pérez, Fiona M. Asigbee, Matthew J. Landry, Sarvenaz Vandyousefi, Reem Ghaddar, Amy Hoover, Matthew Jeans, Katie Nikah, Brian Fischer, Stephen J. Pont, Daphne Richards, Deanna M. Hoelscher, Alexandra E. van den Berg |
| Journal | International Journal of Behavioral Nutrition and Physical Activity |
| Year | 2021 |
| DOI | 10.1186/s12966-021-01087-x |
| Citations | 146 |
TL;DR
A one-year school-based gardening, nutrition, and cooking program in 16 low-income, predominantly Hispanic elementary schools increased children's vegetable intake by about half a serving per day but had no detectable effect on BMI, waist circumference, body fat percentage, or blood pressure.
The researchers tested a comprehensive school-based program called Texas Sprouts, which combined three components: (1) building a 0.25-acre outdoor teaching garden at each school, (2) delivering 18 one-hour student lessons on gardening, nutrition, and cooking during the school day across one academic year (9 months), and (3) offering nine monthly parent lessons. The comparator was a delayed intervention control group: schools that received the same program the following academic year but served as a no-treatment control during the study period. The outcomes measured were changes in vegetable intake, fruit intake, sugar-sweetened beverage consumption, BMI z-scores (a standardized measure of body mass index adjusted for age and sex), waist circumference, body fat percentage, and blood pressure.
The study enrolled 3,135 children in 3rd through 5th grade across 16 elementary schools in central Texas. The average age was 9.2 years. The sample was 64% Hispanic, 47% male, and 69% eligible for free and reduced-price lunch (a marker of low-income status). All schools had >50% Hispanic enrollment and >50% of students eligible for free/reduced lunch. Schools were located within 60 miles of central Austin and had no existing garden or gardening program at baseline.
Dietary intake was measured using a self-reported survey that asked children how frequently they consumed vegetables, fruits, and sugar-sweetened beverages. The survey asked about frequency per day (e.g., "How many times did you eat vegetables yesterday?"). This is a simple recall method, not a detailed food diary or 24-hour recall interview. Obesity outcomes were measured directly: height and weight were measured by trained staff to calculate BMI and BMI z-scores (using CDC growth charts); waist circumference was measured with a tape measure; body fat percentage was estimated using bioelectrical impedance analysis (a device that sends a weak electrical current through the body to estimate fat mass). Blood pressure was measured using an automated monitor. All measurements were taken at baseline (before the school year started) and at follow-up (at the end of the 9-month school year).
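As a rough illustration of the BMI step described above, the sketch below computes BMI from measured weight and height and converts it to a z-score with the LMS formula that CDC growth charts are built on. The function names are made up for this example, and the L, M, and S values are placeholders rather than real CDC reference values; an actual analysis would look them up by age in months and sex.

```python
import math


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def lms_z_score(value, L, M, S):
    """z-score from LMS parameters (Box-Cox power L, median M, coefficient of variation S)."""
    if L == 0:
        return math.log(value / M) / S
    return ((value / M) ** L - 1) / (L * S)


# Hypothetical child; the LMS values are PLACEHOLDERS, not CDC reference data.
child_bmi = bmi(weight_kg=32.0, height_m=1.35)
child_z = lms_z_score(child_bmi, L=-2.0, M=16.5, S=0.12)
print(f"BMI = {child_bmi:.1f} kg/m^2, z = {child_z:.2f} (placeholder reference)")
```

A z-score of 0 means the child sits at the reference median for their age and sex, which is why z-scores rather than raw BMI are compared across growing children.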
Study design: This was a cluster randomized controlled trial (cluster RCT). Sixteen schools were randomly assigned to either the Texas Sprouts intervention (8 schools) or a delayed intervention control (8 schools). Randomization was done by a biostatistician who was blinded to school identities, using a block randomization procedure. The intervention was implemented in three waves over three years (2016–2019), with 6 schools per wave in waves 1 and 2, and 4 schools in wave 3.
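A minimal sketch of block randomization for a design like this one is shown below. It assumes each wave forms one block and that half of every block goes to each arm. The summary does not say how blocks were defined, so that choice, the masked school IDs, and the seed are all assumptions made for illustration.

```python
import random


def block_randomize(blocks, seed=None):
    """Assign half of each block to each arm.

    blocks: dict mapping a block label to a list of masked school IDs.
    Returns a dict mapping school ID to arm name.
    """
    rng = random.Random(seed)
    assignment = {}
    for label, schools in blocks.items():
        if len(schools) % 2 != 0:
            raise ValueError(f"block {label!r} has an odd number of schools")
        shuffled = list(schools)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for school in shuffled[:half]:
            assignment[school] = "intervention"
        for school in shuffled[half:]:
            assignment[school] = "delayed_control"
    return assignment


# Hypothetical masked IDs: 6 + 6 + 4 schools across three waves.
waves = {
    "wave1": [f"S{i:02d}" for i in range(1, 7)],
    "wave2": [f"S{i:02d}" for i in range(7, 13)],
    "wave3": [f"S{i:02d}" for i in range(13, 17)],
}
assignment = block_randomize(waves, seed=2016)
for school in sorted(assignment):
    print(school, assignment[school])
```

Randomizing within blocks guarantees the 8 versus 8 split and keeps the arms balanced within each wave, which simple coin-flip randomization of 16 schools would not.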
Why cluster randomization matters: Schools, not individual children, were the unit of randomization. This is critical because the intervention was delivered at the school level: all children in a given school received the same program. Had individual children within the same school been randomized, children in the control group might have been exposed to the garden or learned from friends in the intervention group (contamination). The trade-off is that children within the same school are more similar to each other than to children in other schools (the "intra-cluster correlation"), so they cannot be treated as independent observations. The statistical analysis used generalized weighted linear mixed models to account for this clustering.
Blinding: The biostatistician who performed randomization was blinded, but participants and school staff were not. Children, parents, teachers, and the educators delivering the lessons all knew which schools were receiving the intervention. Blinding of outcome assessors (staff measuring height, weight, waist circumference, and blood pressure) was not explicitly reported, and they were likely aware of school assignment since gardens were visible. This is a significant limitation: knowledge of group assignment can influence behavior and reporting.
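To see why clustering matters numerically, the standard design-effect approximation, 1 + (m - 1) × ICC, can be applied to the trial's headline numbers (3,135 children in 16 schools). This is only a sketch: it assumes equal cluster sizes, and the ICC values are hypothetical because the summary does not report one. It is not the mixed-model analysis the authors ran.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n_total, n_clusters, icc):
    """Number of independent observations the clustered sample is worth."""
    mean_cluster_size = n_total / n_clusters
    return n_total / design_effect(mean_cluster_size, icc)


n_children = 3135  # from the study summary
n_schools = 16     # from the study summary

# Hypothetical ICC values, for illustration only.
for icc in (0.0, 0.01, 0.05):
    n_eff = effective_sample_size(n_children, n_schools, icc)
    print(f"ICC = {icc:.2f}: effective sample size is about {n_eff:.0f}")
```

Even a small ICC shrinks the effective sample size substantially when clusters are this large, which is why the number of schools matters more for precision than the number of children.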
Duration: The intervention lasted one full academic year (9 months). Measurements were taken at baseline (fall) and follow-up (spring). There was no long-term follow-up after the intervention ended, so we don't know if effects persisted.
Statistical approach: The primary analysis used complete cases (only children with both baseline and follow-up data). They also performed analyses with multiple imputation for missing data to check robustness. They reported results as adjusted mean changes from baseline, with p-values and 95% confidence intervals. They controlled for sex, age, ethnicity, and free/reduced lunch status as covariates.
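The complete-case idea can be sketched in a few lines: drop any child missing either measurement, then average the change from baseline within each arm. The records below are invented toy data. The calculation is also unadjusted, so it omits the covariates and the school-level clustering that the authors' models handled.

```python
from statistics import mean

# Toy records (hypothetical values); None marks a missing measurement.
records = [
    {"arm": "intervention", "baseline": 1.0, "followup": 1.5},
    {"arm": "intervention", "baseline": 2.0, "followup": None},
    {"arm": "intervention", "baseline": 0.5, "followup": 1.0},
    {"arm": "control", "baseline": 1.0, "followup": 1.0},
    {"arm": "control", "baseline": None, "followup": 2.0},
    {"arm": "control", "baseline": 1.5, "followup": 2.0},
]


def complete_cases(rows):
    """Keep only rows with both a baseline and a follow-up value."""
    return [
        r for r in rows
        if r["baseline"] is not None and r["followup"] is not None
    ]


def mean_change_by_arm(rows):
    """Unadjusted mean change from baseline per arm, complete cases only."""
    changes = {}
    for r in complete_cases(rows):
        changes.setdefault(r["arm"], []).append(r["followup"] - r["baseline"])
    return {arm: mean(values) for arm, values in changes.items()}


result = mean_change_by_arm(records)
print(result)
```

Comparing this kind of estimate with one from multiply imputed data is the robustness check described above: if the two agree, missingness is less likely to be driving the result.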
What this design can and cannot prove: A cluster RCT is the gold standard for establishing causality when the intervention is delivered at the group level. Because schools were randomly assigned, any differences between groups at follow-up can be attributed to the intervention (assuming no major confounds). However, this design cannot tell us which specific component of the intervention (gardening, cooking, nutrition lessons, parent lessons, or the garden itself) caused the effects — it tests the whole package. It also cannot tell us whether the effects would persist beyond the 9-month intervention period, or whether they would generalize to other populations (e.g., non-Hispanic, higher-income, or older children).
Major methodological weaknesses: (1) No blinding of participants and no reported blinding of outcome assessors, which introduces potential bias, especially for self-reported dietary intake. (2) Dietary intake was measured by a simple frequency survey rather than a 24-hour recall or food diary, so the intake estimates are less accurate. (3) Attrition: not all children completed follow-up measurements, and although the authors used imputation, missing data can still bias results. (4) The control group received the intervention the following year, which is ethical but means the control schools knew they would eventually get the program, potentially reducing motivation to change behavior on their own. (5) Only 16 schools were randomized, which is a small number for a cluster RCT. With only 8 schools per arm, a single unusual school could skew results.
Primary outcome — Vegetable intake:
Secondary outcomes — No significant effects:
Subgroup analyses (not pre-specified, so interpret cautiously):
Attrition and missing data:
The increase in vegetable intake was about half a serving per day (a change of 0.48 times/day in the intervention group vs. 0.04 in controls, a net difference of 0.44). To put this in perspective: if a child was eating vegetables once per day at baseline, they increased to about 1.5 times per day after the program, while control children stayed at about 1 time per day. This is a modest but meaningful increase, roughly equivalent to adding one additional serving of vegetables every two days. For comparison, the U.S. Dietary Guidelines recommend children aged 9–13 eat 2–3 cups of vegetables per day, so this intervention closed about 15–25% of the gap between typical intake and recommendations. That figure is a rough estimate, because the survey measured how often children ate vegetables, not how many cups they ate.
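The arithmetic behind this interpretation can be checked directly. The two change values come from the summary above; the baseline of once per day is the hypothetical child from the example, not a reported study mean.

```python
# Changes in vegetable intake frequency (times/day), from the summary above.
intervention_change = 0.48
control_change = 0.04

# Net effect: change in the intervention arm beyond the change in controls.
net_effect = intervention_change - control_change

# Hypothetical child eating vegetables once per day at baseline.
baseline = 1.0
intervention_followup = baseline + intervention_change
control_followup = baseline + control_change

# Days needed to accumulate one extra eating occasion at the net rate.
days_per_extra_occasion = 1 / net_effect

print(f"net effect: {net_effect:.2f} times/day")
print(f"intervention child: {intervention_followup:.2f} times/day")
print(f"control child: {control_followup:.2f} times/day")
print(f"one extra occasion every {days_per_extra_occasion:.1f} days")
```

Note that the survey counts eating occasions, so treating one occasion as one serving is itself an approximation.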
The lack of effect on BMI, waist circumference, body fat, and blood pressure means that even though children ate more vegetables, this did not translate into measurable changes in body composition or cardiovascular risk factors over 9 months. This could be because: (1) the increase in vegetables was too small to affect energy balance; (2) children may have compensated by eating less of other healthy foods or more of unhealthy foods; (3) 9 months is too short to see changes in body composition from dietary changes alone; or (4) the intervention did not reduce total calorie intake or increase physical activity enough to shift energy balance.
What the authors acknowledge:
What a critical reader would note:
For someone running their own n=1 experiment (or a small group experiment):
What to test:
Minimum meaningful duration:
What to measure (specific metrics):
Key confounds to control for:
What a positive result would look like: