Introduction

Trastuzumab is a well-established treatment for human epidermal growth factor receptor 2 (HER2)-positive early breast cancers. Given its high efficacy, recent efforts have concentrated on de-escalation of historic multi-agent chemotherapy protocols to safer and shorter regimens preserving previous achievements in long-term survival, while improving short and long-term quality of life (QOL)1. Progress is most evident for stage I HER2-positive breast cancers, with the single-arm APT trial showing excellent long-term outcomes with adjuvant paclitaxel plus trastuzumab (TH), omitting doxorubicin and cyclophosphamide2,3.

To further improve QOL outcomes in these patients, we conducted the ATEMPT trial, a multicenter, investigator-initiated randomized phase II study comparing the toxicity of a year of adjuvant T-DM1 (trastuzumab emtansine) with that of TH and establishing the disease-free survival for one year of adjuvant T-DM14. While T-DM1 was associated with excellent 3-year invasive disease-free survival (iDFS, 97.8% [95% confidence interval (CI), 96.3–99.3]), the co-primary outcome, a prospectively defined composite outcome including clinically relevant toxicities seen with either treatment, did not differ significantly between arms (46% T-DM1 vs. 47% TH, p = 0.83). T-DM1 was associated with a high rate of treatment discontinuation due to adverse events (17%); however, adverse event profiles, assessed by patient-reported outcomes (PROs), revealed better QOL, lower risk of neuropathy and superior work productivity with T-DM1 vs. TH.

While the ATEMPT trial supports the use of T-DM1 as a potential adjuvant systemic therapy in stage I HER2-positive breast cancers, it remains unclear which patients stand to benefit from this de-escalation5. Patient age is an acknowledged factor in breast cancer therapy decision-making, in some instances driving over-treatment of younger patients and undertreatment of older patients6. Age may also be associated with development of negative physical and emotional sequelae following breast cancer. Young survivors are consistently found to be at higher risk for adverse physical and psychological effects which may impair their QOL for years following diagnosis7,8,9,10. When comparing age groups, several studies show worse QOL and increased symptom burden in younger survivors, primarily in early years post-diagnosis11,12,13,14. However, QOL deterioration is also observed in older populations, particularly those with comorbidities less common among younger women15.

Given these considerations, the aim of this unplanned post-hoc analysis is to compare rates of treatment discontinuation, and patient-reported QOL and toxicities between younger and older women in ATEMPT at 18 months post-enrollment. This timepoint, 6 months after completion of all protocol therapy, provides important information regarding women’s experience as they transition to breast cancer survivorship care following one year of adjuvant therapy.

Results

Patient characteristics

Of 512 participants recruited, 497 initiated study treatment. Following exclusion of participants without a baseline (n = 28) or 18-month assessment (n = 99), and male participants (n = 4), 366 patients were included in this analysis (Fig. 1). Among included patients, 34% (n = 124) were ≤50 years and 66% (n = 242) were >50 with an equal distribution observed for excluded patients (34 and 66%, respectively). Additional characteristics were similar for included and excluded patients (see Supplementary Table 1 in the Supplementary File).

Fig. 1: Flow diagram of participants.

Of 512 participants randomized to receive treatment with either adjuvant T-DM1 or TH, 366 were included in the current analytic cohort.

In the analytic cohort (N = 366), overall median age was 56.69 (range 23.2–85.9), 45.37 (23.2–50.9) in women ≤50 and 61.13 (51.2–85.9) in women >50 (Table 1). Treatment distribution was similar and consistent with the 3:1 allocation, with 75% of women ≤50 and 79% of women >50 randomized to T-DM1 (p = 0.428, Fisher’s Exact test). Younger women were more commonly premenopausal at enrollment (86% vs. 11%, p < 0.001, Fisher’s Exact test). Tumor characteristics were similar; however, younger women more frequently underwent mastectomy (53% vs. 28%, p < 0.001, Fisher’s Exact test) and accordingly received less adjuvant radiotherapy (50% vs. 73%, p < 0.001, Fisher’s Exact test). Among patients with hormone receptor-positive (≥1% estrogen receptor expression) disease, endocrine therapy utilization at 18 months was similar between age groups (≤50: 87% vs. >50: 83%, p = 0.385, Fisher’s Exact test) and between arms (T-DM1: 85% vs. TH: 83%, p = 0.840, Fisher’s Exact test).

Table 1 Baseline characteristics by age group.

Treatment discontinuation and dose reduction

Rates of discontinuation of all protocol therapy were 6% and 18% for TH and T-DM1, respectively. T-DM1 discontinuation was significantly more frequent among women >50 vs. ≤50 (23% vs. 9%, p = 0.003, Fisher’s Exact test), with 4%, 8%, and 17% of older patients discontinuing treatment by 3, 6, and 9 months, respectively (Table 2). Similarly, rates differed between the extreme age groups: ≤40 years (5%, 1/19) and ≥70 years (23%, 6/26). Time to discontinuation was significantly shorter for older women vs. younger women (p = 0.002, Log-rank test, Fig. 2a) and for T-DM1 vs. TH (p = 0.007, Log-rank test, Fig. 2b). Older women receiving T-DM1 were at particular risk for discontinuation, while discontinuation for younger women receiving T-DM1 and for both age groups receiving TH was comparable (p < 0.001, Log-rank test, Fig. 2c). Following T-DM1 discontinuation, 25% (2/8) of younger and 45% (20/44) of older women switched to trastuzumab to complete a year of treatment.
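The age-group comparison of T-DM1 discontinuation can be illustrated with a two-sided Fisher's exact test. The sketch below is a minimal standard-library implementation, not the SAS procedure used in the study. The discontinuation counts (8 younger, 44 older) come from the text; the denominators (93 and 191 T-DM1 recipients) are an assumption, back-calculated from the reported percentages.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Assumed table: rows = age group (>50, <=50), columns = (discontinued, continued).
# Denominators 191 and 93 are back-calculated, not stated in the text.
p_value = fisher_exact_two_sided(44, 191 - 44, 8, 93 - 8)
print(f"two-sided p = {p_value:.4f}")
```

Under these assumed counts the test gives a small p-value, in line with the significant difference reported above. An exact match with the published value is not guaranteed, because the denominators are inferred.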

Table 2 Reasons for trastuzumab emtansine (T-DM1) discontinuation by age group.
Fig. 2: Time to discontinuation by age groups and arms.

a Shows the time to discontinuation by age group (p = 0.002); b shows the time to early discontinuation by arm (p = 0.007); and c shows the time to early discontinuation by age group and arm (p < 0.001).
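For readers unfamiliar with the log-rank comparisons reported for Fig. 2, the following is a minimal two-group sketch using only the standard library. The follow-up times are hypothetical toy data, not trial data, and the function name `logrank_test` is illustrative. The study itself used SAS.

```python
import math
from typing import Sequence, Tuple


def logrank_test(
    times_a: Sequence[float], events_a: Sequence[int],
    times_b: Sequence[float], events_b: Sequence[int],
) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    events_* are 1 for an observed discontinuation and 0 for censoring.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical months to discontinuation; group B is fully censored at 12 months.
chi2, p = logrank_test([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8, [12] * 8, [0] * 8)
print(f"chi-square = {chi2:.2f}, p = {p:.5f}")
```

Participants who complete the 12 months of protocol therapy are treated as censored at that time, which is one plausible reading of the analysis described in the Methods.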

Toxicity was the primary reason for T-DM1 discontinuation and was higher among older women (18% vs. 8%). In both age groups approximately half of discontinuations were protocol-mandated and half based on the treating physician’s decision (Table 2). Among older women, the most common toxicities for T-DM1 discontinuation were elevated liver enzymes or bilirubin (29%), neuropathy (17%), and thrombocytopenia (17%). Discontinuations due to cardiotoxicity were infrequent (n = 2) and limited to the older subgroup.

T-DM1 dose reductions occurred in 18% of women included in this analysis and were more common in older compared to younger women (20%, 38/191 vs. 14%, 13/93, respectively, p = 0.011, Fisher’s Exact test).

Patient-reported outcomes

PRO scores at baseline and 18 months are summarized by arm and age group (Table 3). In multivariable analysis, better 18-month Functional Assessment of Cancer Therapy-Breast Cancer (FACT-B)16 total score was associated with better baseline FACT-B total score (estimated mean difference 0.73, p < 0.001, linear regression), but not independently with age or treatment arm (Table 4). However, an interaction between age and arm was observed (p = 0.037, linear regression): among women ≤50, treatment with T-DM1 (vs. TH) was associated with better 18-month FACT-B total score with an estimated mean difference of 6.48 (95% CI 0.51–12.46), approaching the minimally important difference (MID) threshold of 7–8 points17. Additionally, within the T-DM1 group, younger age was associated with better adjusted 18-month FACT-B total scores than older age (estimated mean difference in score between age ≤50 and age >50, 4.12; 95% CI 0.32–7.92). We performed a sensitivity analysis replacing the dichotomized age variable with baseline menopausal status (premenopausal or postmenopausal). In contrast to the primary model, which showed a significant interaction between age group and treatment arm, in the sensitivity model the interaction between baseline menopausal status and arm was not significant (p = 0.993, linear regression).

Table 3 Unadjusted patient-reported outcomes by study arm and age group at baseline and 18 months post-enrollment.
Table 4 Multivariable linear regression model for mean difference in 18-month post-enrollment FACT-B total score.

Table 5 and Supplementary Fig. 1 (in the Supplementary File) list adjusted mean differences in 18-month PROs for age groups and treatment arms (interaction). When controlling for baseline values of the outcome measure and other covariates, higher 18-month FACT-B total scores among younger women treated with T-DM1 vs. TH were driven by differences in social/family well-being (SWB) (estimated mean difference in score between T-DM1 and TH, 2.61; 95% CI 0.64–4.58) and breast cancer subscale (BCS) (estimated mean difference in score between T-DM1 and TH, 1.92; 95% CI 0.05–3.79) sub-scores. T-DM1 was significantly associated with better physical well-being (PWB) scores vs. TH in women >50 (estimated mean difference in score between T-DM1 and TH, 1.43; 95% CI 0.26–2.60). A similar non-significant point estimate was found for women ≤50 (estimated mean difference in score between T-DM1 and TH, 1.35; 95% CI −0.15–2.86). All significant inter-group differences met or approached MID threshold (2–3 points for BCS, 1–3 points for PWB)17,18.

Table 5 Adjusted differences between age groups and treatment arms (interaction) in 18-month post-enrollment patient-reported outcomes.

Adjusted 18-month Rotterdam Symptom Checklist (RSCL)19,20 scores were comparable between age groups and arms; only activity level was significantly better in younger vs. older women treated with T-DM1 (estimated mean difference in score between age ≤50 and age >50, 1.10; 95% CI 0.28–1.93), although a similar point estimate was seen after TH (1.06; 95% CI −0.42–2.53).

Using the 18-month Work Productivity and Activity Impairment Questionnaire: Specific Health Problem (WPAI:SHP)21, among women >50 years, T-DM1 vs. TH was associated with less activity impairment due to breast cancer (estimated mean difference in score between T-DM1 and TH, −6.53; 95% CI −12.79 to −0.28). No additional differences in mean WPAI:SHP scores were observed.

Adjusted odds of alopecia at 18 months, as reported on the Alopecia Patient Assessment (APA)22, did not significantly differ by age or arm. Using the Patient Neurotoxicity Questionnaire (PNQ)23, adjusted odds of 18-month residual moderate, moderate-severe or severe neuropathy were significantly lower with T-DM1 vs. TH among women >50 (odds ratio [OR] 0.33, 95% CI 0.16–0.68) with a trend for reduction among women ≤50 (OR 0.32, 95% CI 0.10–1.03).

Discussion

In light of favorable disease-related outcomes seen in contemporary trials for HER2-positive early breast cancer, ongoing efforts increasingly emphasize treatment de-escalation as a means of optimizing health-related quality of life (HRQOL) while sustaining treatment efficacy. In ATEMPT, adjuvant T-DM1 was associated with superior overall HRQOL, lower risk of neuropathy and superior work productivity, while maintaining excellent 3-year iDFS in patients with stage I HER2-positive breast cancer4. Analogous findings were reported in the KAITLIN trial, comparing similar regimens (plus pertuzumab) for stage II-III disease though following anthracycline-based chemotherapy24. Our current analysis shows that younger women, while opting for more aggressive surgery, and more often completing protocol therapy than older women, report larger HRQOL gains at 18 months with T-DM1 vs. TH, with differences within or approaching the range of clinical relevance17.

Multiple studies have identified a distinct and often more severe impact of breast cancer on younger survivors’ HRQOL and emotional well-being7,8,9,13. In a systematic review comparing younger (≤50) to older women (>50), HRQOL was more severely compromised in younger women, with greater deterioration noted for mental health as opposed to physical functioning domains9. In a recent longitudinal report, a steeper drop in HRQOL was observed among younger (≤50) vs. older (>50) survivors during the first three years post-diagnosis, and although HRQOL improved thereafter, at 10 years it remained below the general population level14. Our findings, showing better 18-month HRQOL particularly in young women treated with T-DM1 vs. TH, suggest that a modern, de-escalated chemotherapy approach may temper these effects on young women’s HRQOL. Additionally, complementary to prior studies, this improvement was driven by better SWB and BCS sub-scores (including items focused on body image and sexuality) rather than PWB.

It is uncertain why T-DM1 was associated with superior 18-month HRQOL in young women. We previously showed that during the first 12 weeks of treatment, T-DM1 vs. TH was associated with less missed work time and work/activity impairment, and lower rates of alopecia and neuropathy4. By 18 months, these differences had attenuated, although a lower risk of neuropathy following T-DM1 persisted, regardless of age. Specific to younger women, treatment-related menopause is a toxicity with more long-term effects, which may in part explain our findings. In a preplanned sub-study, among premenopausal women enrolled to ATEMPT, 18-month chemotherapy-related amenorrhea was significantly lower with T-DM1 vs. TH (24% vs. 50%)25. Treatment-related menopause is associated with physiologic symptoms such as night sweats, hot flashes, vaginal dryness, and weight gain, which can adversely affect patients’ psychosocial QOL26,27. In younger premenopausal women, preserved ovarian function also contributes to fertility preservation, an important issue for many young patients28. In a sensitivity model, however, after replacing age with baseline menopausal status, we did not replicate the significant interaction observed between age group and treatment arm.

Over half of women ≤50 treated for stage I breast cancers in ATEMPT underwent mastectomy, nearly twice the rate observed for women >50. Additionally, half of mastectomies in younger women were bilateral. These observations conform with national trends showing increasing rates of mastectomy, and particularly bilateral mastectomy, with steeper increases in younger patients and those with node-negative tumors ≤2 cm29. Compared to breast-conserving surgery, mastectomy with implant reconstruction is associated with inferior breast satisfaction, psychosocial well-being scores, and sexual well-being scores, even when restricting to stage I cancers30. Given the extremely low rates of locoregional recurrence associated with HER2-positive disease following adequate anti-HER2 therapy, less aggressive surgery may serve as an additional means to retain QOL31.

Among older women, we did not observe a significant difference in 18-month global HRQOL between arms, although T-DM1 was associated with better physical well-being, less activity impairment and lower odds of neuropathy. This may be partially related to increased toxicity and higher rates of T-DM1 discontinuation in older women, although we corrected for these in multivariable analyses. Additionally, our study was underpowered to examine differential treatment effects on HRQOL in women at extremes of age (≥65–70) and at higher risk of developing chemotherapy toxicity6,32. The ATEMPT 2.0 trial (NCT04893109) is evaluating whether six cycles of T-DM1 followed by trastuzumab can decrease toxicity while maintaining efficacy and will compare toxicities of this regimen to TH in patients with stage I HER2-positive breast cancer.

The current study’s strengths include its prospective nature and high-quality data captured within the setting of a multicenter clinical trial. We applied an age stratification (≤50, >50) commonly used in the study of breast cancer, facilitating comparisons to prior studies, but limiting our ability to comment on women at age extremes, primarily the elderly. However, the care of elderly patients can be complicated by geriatric factors and comorbidities (not captured within our data), and thus a clinical trial population may not be representative6. Safety and efficacy of adjuvant T-DM1 in older patients (≥60 years) with stage I-III HER2-positive breast cancer is being evaluated in the ATOP trial (NCT03587740). Generalizability of our findings may also be limited by the small number of minority participants. Racial/ethnic variations in HRQOL after breast cancer have been described, although they may be less evident in younger women due to their overall worse HRQOL33. Lastly, although baseline characteristics, including distribution of age groups, were similar for patients with missing surveys (n = 127) and the study population, we cannot exclude divergent PROs.

Our findings suggest that younger breast cancer patients, a population at times overtreated and at particular risk for QOL impairment, may benefit more than older women from use of T-DM1 rather than TH with regard to HRQOL. This is notable as it was observed with a full year of T-DM1 and compared to an already “de-escalated” regimen. Although younger patients and their providers may hesitate to accept de-escalated regimens, recent data argue against an association between age and prognosis in adequately treated HER2-positive early breast cancer1,34,35,36,37. The potential for greater improvement in QOL further supports the prudent application of de-escalation strategies in the treatment of young and older breast cancer patients alike. PRO analyses, upcoming reports of longer-term outcomes from ATEMPT, and future data regarding the efficacy of a shorter course of T-DM1 from ATEMPT 2.0 will continue to shape recommendations for T-DM1 in the adjuvant setting.

Methods

Study population and procedures

ATEMPT (TBCRC033) was a randomized phase II trial that enrolled 512 participants within 90 days of their most recent surgery for stage I HER2-positive breast cancer at 24 institutions throughout the United States between 17 May 2013 and 13 December 20164. Patients were stratified by age (<55/≥55 years), planned radiotherapy (yes/no), and planned endocrine therapy (yes/no) and randomized in a 3:1 ratio to receive T-DM1 or TH, respectively. T-DM1 (3.6 mg/kg) was administered intravenously on day 1 of each 21-day cycle and continued for 17 cycles or 1 year. TH entailed intravenous weekly administration of paclitaxel (80 mg/m²) with concurrent weekly trastuzumab (4 mg/kg loading dose followed by 2 mg/kg) for 12 weeks, with trastuzumab (6 mg/kg) subsequently continued intravenously every 21 days for 13 cycles. Adjuvant radiotherapy and hormonal therapy could be initiated after 12 weeks of T-DM1 or completing paclitaxel. Female participants completing both baseline and 18-month survey assessments were included. The study (NCT01853748) was conducted in accordance with the International Conference on Harmonization Good Clinical Practice Standards and the Declaration of Helsinki. Institutional review board (IRB) approval was obtained at Dana-Farber/Harvard Cancer Center and participating sites (see the list of participating IRBs in the Supplementary Note, included in the Supplementary File). Written informed consent was obtained from each patient. The full trial protocol is available as a supplement.

Measures

Following randomized treatment allocation, English-speaking participants were surveyed at baseline (day 1 of treatment), 3 and 12 weeks, and 6, 12, and 18 months (24 months for QOL and symptom distress). To focus on posttreatment outcomes, the current analysis includes data collected at baseline and 18 months, the last timepoint at which all PRO instruments were administered. PRO surveys included FACT-B16, RSCL20, APA22, PNQ23, and WPAI:SHP21.

The FACT-B (Version 4) is designed to measure multidimensional health-related QOL (HRQOL) in breast cancer patients, providing a total score (37 items, range 0–148) and 5 subscale scores: PWB (7 items, range 0–28), SWB (7 items, range 0–28), emotional well-being (EWB; 6 items, range 0–24), functional well-being (FWB; 7 items, range 0–28), and BCS (10 items, range 0–40)16. Higher scores indicate better HRQOL. Differences of 7–8 points on the FACT-B score and 1–3 points for subscale scores are considered a minimally important difference (MID)17,18.

The RSCL measures HRQOL and symptoms in cancer patients and is not specific to breast cancer. It comprises four scales: physical distress (23 items, range 23–92), psychological distress (7 items, range 7–28), activity level (8 items, range 8–32), and a single item measuring overall HRQOL (range 1–7)19. Higher scores indicate poorer HRQOL, except on the activity level scale, for which higher scores indicate better HRQOL.

The WPAI:SHP is a 6-item instrument designed to quantitatively assess the amount of absenteeism (percent work time missed), presenteeism (percent reduced on-the-job effectiveness), productivity loss and overall activity impairment attributable to a specific health problem21. Scores are transformed into percentages (range 0–100) with higher percentages indicating greater impairment. Missing data were handled similarly for all instruments: scores were calculated only when at least 50% of items were available and were prorated for missing items. FACT-B total score was calculated only if all component subscales were valid.
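The missing-data rule described above can be sketched as follows. This is an illustrative reading of the rule, not the study's scoring code. It assumes that item responses have already been reverse-coded where required, so that each subscale is a simple sum, and that prorating means scaling the sum of answered items by the ratio of total items to answered items. The function names are hypothetical.

```python
from typing import Dict, Optional, Sequence


def prorated_score(
    items: Sequence[Optional[int]], min_fraction: float = 0.5
) -> Optional[float]:
    """Prorated sum score; None when fewer than min_fraction of items are answered.

    Missing responses are represented as None.
    """
    answered = [value for value in items if value is not None]
    if not items or len(answered) / len(items) < min_fraction:
        return None
    # Scale the observed sum up to the full number of items.
    return sum(answered) * len(items) / len(answered)


def factb_total(subscales: Dict[str, Optional[float]]) -> Optional[float]:
    """FACT-B total; None unless every component subscale is valid."""
    if any(score is None for score in subscales.values()):
        return None
    return float(sum(subscales.values()))


# Hypothetical PWB responses (7 items, each scored 0-4) with one item missing.
pwb = prorated_score([4, 3, None, 4, 2, 3, 4])
print(pwb)
```

In this hypothetical example six of seven items are answered, so the observed sum of 20 is scaled by 7/6 to about 23.3. A subscale with only three of seven items answered would be returned as missing, and the total score would then be missing as well.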

Peripheral neuropathy symptoms and alopecia were assessed using specific instruments. The PNQ asks participants to grade sensory and motor neuropathy symptoms separately on a 5-point Likert scale. Results were categorized as no/mild neuropathy or moderate/moderate-severe/severe neuropathy. For alopecia, we used a single item from the APA, “Have you had any hair loss during the past week?” for which participants responded “yes” or “no”.
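The dichotomization of PNQ responses can be sketched as below. The sketch assumes the five response levels are coded as integers 1 to 5 in increasing severity (none, mild, moderate, moderate-severe, severe). That coding and the function name are assumptions made for illustration; the source does not say how the sensory and motor items were combined, so each item is categorized on its own here.

```python
def pnq_category(grade: int) -> str:
    """Collapse a 5-point PNQ grade into the two analysis categories.

    Assumed coding: 1 = none, 2 = mild, 3 = moderate,
    4 = moderate-severe, 5 = severe.
    """
    if grade not in (1, 2, 3, 4, 5):
        raise ValueError(f"PNQ grade must be an integer from 1 to 5, got {grade!r}")
    return "no/mild" if grade <= 2 else "moderate or worse"


# Hypothetical sensory and motor grades for one participant.
sensory, motor = pnq_category(3), pnq_category(1)
print(sensory, motor)
```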

Socio-demographic and disease characteristics were collected at enrollment. Age was dichotomized as ≤50 and >50 years, a cut-off used in other studies and approximating menopausal status9,11,14. Race and ethnicity were extracted from the medical record. Race was categorized as American Indian or Alaskan Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, White, more than one, or other. Ethnicity was categorized as Hispanic or Latino, Non-Hispanic or unknown. Women reporting at least one menstrual period within 12 months prior to registration were considered premenopausal and otherwise postmenopausal. Primary breast surgery (lumpectomy/mastectomy) was defined at enrollment. Receipt of radiotherapy and endocrine therapy use were collected through follow-up.

Statistical analysis

Patient characteristics were compared between age groups using Fisher’s exact test. Discontinuation, as defined in the main study protocol, was compared between arms and age groups using Kaplan–Meier curves, with a log-rank test comparing discontinuation through 12 months, the duration of protocol treatment. FACT-B, RSCL and WPAI:SHP scores were expressed as means ± standard deviations. PNQ and APA results were categorized as described and reported as percentages. Linear (RSCL, WPAI:SHP, FACT-B) and logistic (APA, PNQ) multivariable regression models were used to compare PROs within age groups and arms at 18 months post-enrollment. Multivariable regression models included age group, treatment arm, and their interaction term (arm × age group) and were adjusted for covariates: race, early discontinuation, surgery type, receipt of radiotherapy and endocrine therapy use at 18 months and the respective PRO score at baseline. Regression models were fitted to non-missing data for the variables entered. No adjustment was made for multiple comparisons. All p-values were 2-sided and considered statistically significant if < 0.05. Analyses were conducted using SAS Software, Version 9.4 (SAS Institute, Cary, NC).
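To make the interaction model concrete, the linear model for a continuous PRO can be written as below. The indicator coding (Arm = 1 for T-DM1 and 0 for TH; Age = 1 for ≤50 and 0 for >50) and the symbols are assumptions chosen for illustration; the source does not state the reference levels used.

```latex
% Y_{18}: 18-month PRO score; Y_{0}: baseline score of the same PRO;
% Z: remaining covariates (race, early discontinuation, surgery type,
%    radiotherapy, endocrine therapy use at 18 months).
\begin{align*}
Y_{18} &= \beta_0 + \beta_1\,\mathrm{Arm} + \beta_2\,\mathrm{Age}
        + \beta_3\,(\mathrm{Arm}\times\mathrm{Age})
        + \beta_4\,Y_{0} + \boldsymbol{\gamma}^{\top}\mathbf{Z} + \varepsilon \\[4pt]
\text{T-DM1 vs. TH, age} > 50 &: \ \beta_1 \\
\text{T-DM1 vs. TH, age} \le 50 &: \ \beta_1 + \beta_3 \\
\text{Age} \le 50 \text{ vs. } > 50 \text{, within TH} &: \ \beta_2 \\
\text{Age} \le 50 \text{ vs. } > 50 \text{, within T-DM1} &: \ \beta_2 + \beta_3
\end{align*}
```

Under this coding, the test of interaction corresponds to the null hypothesis β₃ = 0, and the within-group estimated mean differences reported in the Results are linear combinations of the coefficients. For the logistic models (APA, PNQ), the same linear predictor applies on the log-odds scale, and exponentiating these combinations gives the odds ratios.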