Introduction

Interdisciplinary research (hereafter IDR) has been widely recognized as a catalyst for transformative and innovative science, promoting national competitiveness. In recent decades, interdisciplinary approaches have increasingly been applied in science, education, and policy domains to tackle emerging research problems, such as public health issues, climate change, and biodiversity loss (de Macedo et al., 2023; Puppim de Oliveira and Qian, 2023; Sentell et al., 2020). IDR is often considered a research mode that necessitates an integration of concepts, techniques, theories, and/or data from two or more disciplines or specialized areas of knowledge (National Academy of Sciences, 2004). The theory of recombinant novelty suggests that the integration of knowledge elements from distant sources is more likely to bring about innovative outcomes, which is also true at the disciplinary level (Fleming, 2001; Liu et al., 2022; Weitzman, 1998). Since the 1960s, interdisciplinary issues have become a significant topic in the discourse on knowledge creation and science policy. As important ideas often flourish beyond the confines of a single discipline, policy instruments in many countries have encouraged IDR with high expectations for its benefits. For instance, in 2020, the National Natural Science Foundation of China (NSFC) launched a new division, i.e., the Department of Interdisciplinary Science, to push forward IDR to meet national strategic needs.

Given the potential benefits of IDR and the ongoing policy push, the capacity to perform IDR is becoming increasingly critical for scientists. However, there is surprisingly little knowledge about how early-career scientists (hereafter ECSs) are adapting to this trend toward interdisciplinarity, and whether gender plays a role in this process. The “publish or perish” imperative puts pressure on ECSs to publish frequently to secure a job in a competitive market. The research strategies they adopt may yield different outcomes for their survival and success in academia. To ensure that they remain in academia, ECSs should specialize and gain recognition within specific fields (Heiberger et al., 2021), while also becoming independent scientists and providing innovative perspectives.

Performing IDR can be particularly challenging for ECSs due to institutional, cultural, and practical barriers that make it risky. Despite the potential benefits of integrating knowledge from diverse disciplines, pursuing IDR is often a difficult endeavor. Empirical evidence suggests that IDR, which requires additional commitment and effort, can be difficult to produce, and may result in research outcomes of lower quality (Kovacs and Lex, 2012), lead to lower productivity (Leahey et al., 2017), create an ambiguous identity, and receive less recognition (Hsu et al., 2009). Therefore, IDR entails high risks and uncertainties for ECSs, especially for female ECSs who are underrepresented in STEM fields and face systematic barriers that impede their careers in the research system (Huang et al., 2020). Female scientists face disadvantages including reduced access to research funding (Larivière et al., 2011), stereotypes about their roles and abilities (Eagly et al., 2020), challenges in collaboration and leadership (Liu et al., 2022), and difficulties in balancing childcare responsibilities with the demands of their careers (Fox, 2005). As a result, previous studies find that female scientists exhibit inferior research performance and a greater likelihood of leaving academia than their male counterparts.

Given the riskier path of IDR and their vulnerable positions, female scientists tend to be more cautious or conservative in their research strategies than their male colleagues. This caution originates from their fear of being criticized (Etzkowitz, Kemelgor, and Uzzi, 2000), making mistakes (Correll, 1997) or acting impulsively (Rier, 2003). Besides, literature in sociology, economics and psychology points out the gender difference in response to risk. Empirical and experimental evidence suggests that women are less likely than men to engage in risky behaviors (Charness and Gneezy, 2012; Eckel and Grossman, 2008). Therefore, we presume that female ECSs may adopt safer research strategies and be more hesitant to engage in IDR.

The investigation of the interdisciplinarity of ECSs’ research is essential, particularly concerning gender and mentorship. Despite a few exceptions (Sugimoto et al., 2011; Unger et al., 2022), there is a lack of rigorous investigations of ECSs’ preferences for IDR. Studying IDR among ECSs can provide insights into how they embrace the practices and policies that favor IDR by adopting different research strategies, and how intellectual preferences for IDR are influenced by gender and mentorship. Understanding gender differences in engaging in IDR can help reveal the driving forces and mechanisms underlying the productivity puzzle and the glass ceiling in science education and research strategies.

Unlike a few existing studies that investigate ECSs’ publications (Lee, 2019; Zhang and Yu, 2020), such as journal articles or conference papers, we focus on doctoral theses to uncover their engagement in IDR. Doctoral theses are considered a crucial requirement for independent research contribution (Donner, 2021) and the most important research output of junior scientists (Vera-Baceta et al., 2019). Doctoral theses can reflect a part of the academic culture, thinking, learning, and writing skills of students, which allows us to reflect on ECSs’ participation in IDR, and helps to provide a clearer picture of ECSs’ interdisciplinary research capacity.

The degree of interdisciplinarity can vary across scientific domains. Previous literature has suggested that humanities tend to have a lower level of interdisciplinarity, while biomedicine, physics, and chemistry exhibit higher levels, compared to other domains (Morillo et al., 2003). Given the severe underrepresentation of female scientists in hard science and the disciplinary variations in how scientists incorporate knowledge from diverse domains (Huang et al., 2020; Liu et al., 2018; Liu et al., 2017), this study specifically focuses on five representative domains within the hard sciences. These domains include behavioral sciences, biological sciences, engineering, health and medical sciences, and mathematical and physical sciences.

Three key factors drive our investigation into scientists from the United States (hereafter the U.S.). First, the U.S. maintains a position at the forefront of science and technology, making it a significant contributor to global scientific advancements (White, 2019). The high-quality doctorate education system in this country plays a crucial role in building and sustaining its national scientific capacity (Sarrico, 2022). As a result, doctoral graduates from the U.S. serve as a representative sample for investigating ECSs. Second, the U.S. serves as an example of strong interdisciplinary policies for higher education and research (National Academy of Sciences, 2004). The ECSs from the U.S. often receive training in IDR and might exhibit a strong motivation to perform IDR in their theses. Third, gender differences are notable in U.S. academia, with systematic barriers affecting female scientists within the research system (Sá et al., 2020). This allows us to capture gender disparities in the research strategies employed by ECSs, thus shedding light on potential gender-related differences regarding engagement in IDR within their doctoral theses.

This study investigates the doctoral theses of 675,135 Ph.D. students who completed their doctorates at universities in the U.S. between 1950 and 2016 across five scientific domains: behavioral sciences, biological sciences, engineering, health and medical sciences, and mathematical and physical sciences. We construct an indicator to measure the degree of interdisciplinarity embedded in the doctoral research by utilizing co-occurrence matrices of subjects assigned to doctoral theses in the ProQuest Dissertations & Theses Database. To provide a comprehensive understanding of ECSs’ engagement in IDR, we propose the following research questions:

RQ1: What is the prevalence of interdisciplinary research among early-career scientists? To address RQ1, our analysis focuses on examining the temporal evolution of the interdisciplinarity indicator and its distributions across five scientific domains, and universities of different research intensity.

RQ2: How does gender influence ECSs’ engagement in IDR? This study explores the gender differences in ECSs’ engagement in IDR with a focus on temporal changes, disciplinary disparities, and differences across universities of varying research intensity.

RQ3: Is the gender of advisors correlated with students’ participation in IDR? This study examines if the gender of advisors and the gender pairing of students and advisors are related to gender differences in this direction.

The remainder of this study is organized as follows. The next section reviews the related work and proposes the research questions. The “Data and methods” section describes the data and empirical approaches. The “Results” section presents the results. The last section discusses the findings and their implications for science policies.

Literature review and research questions

Early-career scientists and interdisciplinary research

Existing literature suggests that IDR can benefit scientists’ career advancement in the mid to long term, as it has the potential to foster scientific breakthroughs and lead to greater impact (Chen et al., 2015; Larivière et al., 2015; Okamura, 2019). Extensive empirical studies suggest that novel inventions often emerge from the combination of disparate knowledge domains (Fleming, 2001; Schilling and Green, 2011). Recent studies have shown that interdisciplinary publications tend to attract more citations (Zhang et al., 2021) and gain higher societal visibility (D’Este and Robinson-García, 2023). An increase in the effective number of disciplines is associated with a 20% increase in research impact at the article level (Okamura, 2019). Due to its potential to advance scientific frontiers, science policy, especially funding policy, and higher education systems in numerous countries actively promote IDR, and encourage interdisciplinary collaboration (James Jacob, 2015; Lyall et al., 2013; Rylance, 2015; Spelt et al., 2009). For instance, since the mid-1990s, federal research funding agencies have exerted growing pressure on the research community to embrace IDR (Rhoten and Pfirman, 2007).

However, some other researchers point out a non-linear relationship between IDR and scientific impacts, and the potential citation penalty of IDR due to a lack of recognition or perceived lower quality (Jacobs and Frickel, 2009; Leahey et al., 2017; Liu et al., 2017). Analyzing all articles that are indexed in the Web of Science (hereafter WoS) in 2000, a study finds an inverted U-shaped correlation between interdisciplinarity and papers’ scientific impacts (Larivière and Gingras, 2010). Some studies find that IDR receives fewer citations in natural and health sciences, compared to disciplinary research (Levitt and Thelwall, 2008), and different components of interdisciplinarity can have varying effects on short-term versus long-term citations (Wang et al., 2015). In addition, pursuing IDR requires significant time and effort to integrate knowledge from diverse disciplines, and such research may face difficulties in terms of publication and recognition compared to more traditional research (Blackmore and Kandiko, 2011). ECSs face the pressure of productivity evaluation through metrics like the number of papers published, citations, and journal impact factors (Nicholas et al., 2018). Therefore, pursuing IDR may lead to temporary productivity obstacles and challenges in gaining recognition, potentially impacting ECSs’ career advancement (Frodeman et al., 2017).

Involvement in IDR presents a mix of benefits and challenges for scientists. On the positive side, IDR has the potential to contribute to long-term career growth and foster innovative breakthroughs. However, pursuing an IDR path may temporarily hamper productivity and introduce uncertainties in securing academic positions of ECSs. In contrast to scientists at later career stages, ECSs might exhibit greater reluctance to engage in IDR due to the difficulties and challenges they face when pursuing this path. However, as the importance of IDR continues to grow and science policies advocate for it, we expect ECSs to increasingly involve themselves in IDR over time. Despite ongoing discussions regarding ECSs’ attitudes toward IDR, the historical evolution of IDR among ECSs remains unclear.

Gender and interdisciplinary research

Gender refers to the societal, cultural, and behavioral expectations, roles, and identities imposed on individuals based on their perceived sex (Eagly, 2013). It is important to differentiate gender from sex, which primarily concerns the biological distinctions between males and females. Given the research focus of this study on ECSs’ engagement in IDR, we prioritize examining gender as it encompasses a broader range of social, psychological, and environmental factors that influence ECSs’ decision-making and engagement in IDR.

The discourse surrounding the correlation between gender and IDR is shaped by various perspectives including gender-related biological origins and psychological characteristics, cultural influences, and gender disparities in the research system. However, the existing literature presents contradictory findings regarding gender differences in IDR.

The literature on gender-related biological and psychological origins suggests that female scientists may be more inclined to engage in IDR. The neuroanatomical structure of human brains is one potential explanatory factor, as research findings suggest that females are better at assimilating diverse forms of information and making connections between ideas (Haier et al., 2005). Feminist science studies have long theorized that women may be less constrained by the norms of science (Harding, 1986). Moreover, research in science studies suggests that there are gender differences in the selection of research problems. For instance, female scientists tend to be more oriented toward advancing knowledge that meets human needs, in contrast to males, who are more interested in conventional scientific motivations (Harding, 1992). Empirical studies, including surveys and analyses of collaboration patterns, also suggest that women scientists are more likely to engage in IDR activities and collaborations. For example, female graduate students are found to spend more time on cross-fertilization activities and participate in more cross-disciplinary knowledge creation (Rhoten and Pfirman, 2007). Women at Utrecht University are consistently found to engage more in IDR collaboration (Van Rijnsoever and Hessels, 2011). In addition, having female team members is linked to increased interdisciplinarity of collaborative research outputs (Pinheiro et al., 2022; Specht and Crowston, 2022).

Cultural factors, such as gender stereotypes, the prevailing “masculine” culture in STEM disciplines, and a lack of role models, can hinder the participation of female scientists in IDR. Societal expectations often associate traits like kindness, warmth, and helpfulness with women, while men are perceived as more analytical, competitive, and independent (Carli et al., 2016). Cultural pressures impose gender roles on female scientists, reinforcing these stereotypes. Women frequently face stereotype threats in STEM disciplines due to long-standing sociocultural biases surrounding successful cisgender white males (Corbett and Hill, 2015). It is believed that women lack achievement-oriented attributes that are necessary for success in male-dominated occupations (Noe, 1988). In addition, female scientists encounter another well-documented gender stereotype that questions their competence in mathematics and science (Bench et al., 2015). STEM fields have traditionally exhibited a strongly masculine culture, creating significant pressure on female scientists to conform to masculine norms within the scientific community. This can result in feelings of lower belonging in the field for women scientists (Blackburn, 2017). These cultural factors, along with a lack of female role models in science, contribute to identity threats among female scientists, i.e., a concern that their perceived weaknesses are attributed to themselves and women as a whole. This can significantly impact their confidence in pursuing careers in science (Gaule and Piacentini, 2018; Hirshfield, 2010). Negative social experiences further compound their insecurities, leading them to adopt safer research strategies and participate less in IDR.

In addition, the systematic barriers female scientists experience in the current research system may prevent them from engaging in IDR. The Matilda effect, which refers to the tendency for women’s abilities and contributions to be underestimated and undervalued, is particularly relevant in academia and may further discourage women scientists from pursuing IDR (Merton, 1973; Rossiter, 1993). The current modes of scientific practices and rewards can place female scientists into unequally competitive positions, particularly in some traditional and well-established disciplines (Bird, 2001; Etzkowitz and Kemelgor, 2001). The barriers female scientists have in the scientific community, such as unequal opportunities and discriminatory policies, might make them more risk-averse in their work, which could be the reason why female scientists choose to specialize in comparatively “niche” domains where they can be recognized more easily (Bird, 2001). As a result, female scientists may be discouraged from pursuing IDR or other ventures that carry greater risks and uncertainties, especially in the early stages of their careers. Empirical studies have supported this notion. For example, a survey suggests that junior female scientists feel discouraged from engaging in interdisciplinary collaboration and face additional obstacles compared to their male colleagues (Smith-Doerr and Croissant, 2016).

As discussed in the “Early-career scientists and interdisciplinary research” sub-section, involvement in IDR presents scientists with both benefits and challenges. IDR promotes long-term career growth and acts as a vital catalyst for fostering innovative breakthroughs, whereas pursuing an IDR trajectory may result in temporary productivity obstacles and introduce uncertainties about securing academic positions. In conjunction with cultural factors and the systematic barriers experienced by female scientists within the research system, female scientists are likely less inclined to engage in IDR, relative to their male peers, thereby creating a gender bias in this particular direction. The existing literature on gender differences in IDR presents inconclusive results. Many relevant studies are based on survey data with small sample sizes, which limits the generalizability of their findings. As a result, it remains ambiguous whether there are true gender differences in IDR, particularly among ECSs.

Relationship between mentorship and students’ engagement in interdisciplinary research

Doctoral advisors hold significant influence as both the “gatekeepers to the scholarly profession” and “socializing agents of the discipline.” The terms advisor, mentor, or research supervisor are frequently used interchangeably, even though the responsibilities associated with these roles may overlap or have distinct aspects. Advisors’ role is pivotal in molding students’ selection of research topics and their thesis writing (Bu et al., 2022). Typically, there are two primary approaches for students to determine their thesis topic: either it is assigned by advisors or students independently find and choose a topic after consultation with their advisors (Lei, 2009). The selection of a thesis topic depends on students’ interests, their capacity to conduct and complete the research, its manageability, and the demonstration of their independent mastery of the subject matter (I’Anson and Smith, 2004; Isaac et al., 1989; Pemberton, 2012). The principles of independence and originality form the foundation of students’ thesis writing (Isaac et al., 1992), with advisors providing crucial guidance in defining the theoretical framework, research questions, research design, and addressing any challenges encountered during the thesis writing process (Bargar and Duncan, 1982). Existing literature emphasizes the importance of both students and their advisors in the selection of thesis topics (de Kleijn et al., 2013; de Kleijn et al., 2012). For instance, Lei (2009) suggests that the decision-making process for selecting a thesis topic is complex and involves critical factors and resources taken into account by students and their advisors. Recognizing the advisors’ impactful role in shaping students’ research choices, the concept of academic genealogy has been introduced to measure intellectual inheritance propagated across generations of scientists through academic mentoring provided by advisors to their students (Rossi et al., 2017; Sugimoto et al., 2011).

The majority of existing literature has primarily concentrated on exploring the influence of advisors on their students’ thesis topic selection or thesis writing, with only a limited number of exceptions investigating the correlation between mentorship and the level of interdisciplinarity in students’ doctoral theses. Specifically, Sugimoto et al. (2011) conducted a study focusing on 3038 Ph.D. theses in the field of Library and Information Science (LIS), revealing that doctoral theses supervised by advisors from non-LIS disciplines exhibited a higher degree of interdisciplinarity compared to those guided by advisors with an LIS background. Similarly, an examination of LIS doctoral theses in North America further affirms the significance of advisors’ disciplinary backgrounds in shaping the interdisciplinarity levels of students’ theses (Mongeon et al., 2016).

The gender of advisors may play an important role in influencing students’ involvement in IDR. Extensive literature highlights the significance of advisors’ gender in shaping mentoring experiences and outcomes. Female and male advisors tend to exhibit different communication styles, with females preferring connection and intimacy and males preferring status and independence (Tannen, 1991). Studies suggest that female advisors often provide more social, nurturing, caring, and supportive forms of assistance compared to their male counterparts, which can be attributed to gender contexts, hierarchies, and socialization patterns (Canary and Dindia, 2009). In addition, research demonstrates that male advisors typically offer instrumental and career support, while female advisors often provide more emotional support (Allen and Eby, 2004). These gender differences can influence both the nature of the mentoring relationship and students’ mentoring experiences, subsequently affecting their thesis topic selection and engagement in IDR.

The pairing of students and advisors based on their gender can potentially impact students’ engagement in IDR. Extensive discussions have focused on the benefits of same-gender mentoring relationships, such as improved communication, more frequent meetings, higher satisfaction, increased psychological support, and role-model effects (Aguinis et al., 2018; Gaule and Piacentini, 2018; Milkman et al., 2015; Robinson, 2011; Rossello and Cowan, 2019). Empirical evidence suggests that same-gender mentoring relationships are often associated with higher student productivity (Canaan and Mouganie, 2023; Gaule and Piacentini, 2018; Pezzoni et al., 2016; Rossello et al., 2020), although some studies report negative effects or no effect of same-gender mentorship on students’ performance (Blake‐Beard et al., 2011; Hilmer and Hilmer, 2007). However, previous studies have not thoroughly examined whether and how the gender combination of students and advisors influences students’ research strategies and thesis topic selection. Given the varying outcomes observed in terms of students’ productivity and success resulting from different types of student-advisor gender pairings, this study anticipates that the gender combination of students and advisors is related to the quality of mentorship, the nature of the mentoring relationship, and students’ research strategies, including their engagement in IDR.

Interdisciplinarity measures

To measure IDR, a variety of methods have been proposed from both qualitative and quantitative perspectives. Qualitative measures of IDR normally rely on self-assessment by participants and peer review processes. However, these approaches may be limited in scalability and objectivity. In recent years, with the proliferation of bibliographic data, quantitative measures of interdisciplinarity based on bibliometric approaches have been developed at different levels, ranging from institutions (Rafols et al., 2012), research teams (Specht and Crowston, 2022), journals (Zhang et al., 2016), and individuals to publications (Liu et al., 2022).

Citation analysis is the most widely used bibliometric tool for measuring IDR with a focus on knowledge exchange or integration across fields. Among the citation-based methods, the percentage of citations outside of the discipline of the citing publication is commonly used as an indicator of IDR (Rafols and Meyer, 2007). In addition, the occurrence of discipline-specific citations pointing to other disciplines serves as a manifestation of knowledge exchange or integration among fields.
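As an illustration of the citation-based indicator described above, the following minimal sketch computes the share of a publication's references that fall outside its own discipline. The function name, the discipline labels, and the sample reference list are hypothetical and serve only to make the definition concrete; they are not taken from any of the cited studies.

```python
def outside_citation_share(citing_discipline, cited_disciplines):
    """Return the fraction of references assigned to a discipline other
    than that of the citing publication (hypothetical helper)."""
    if not cited_disciplines:
        raise ValueError("at least one classified reference is required")
    # Count references whose discipline differs from the citing paper's.
    outside = sum(1 for d in cited_disciplines if d != citing_discipline)
    return outside / len(cited_disciplines)


# Illustrative data: a physics paper with four classified references.
example_refs = ["physics", "chemistry", "biology", "physics"]
example_share = outside_citation_share("physics", example_refs)
```

Under this definition, a value of 0 indicates that all references stay within the citing publication's discipline, and a value of 1 indicates that none do.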

When citation data is unavailable, citation-based methods cannot be applied to measure IDR. In such cases, a top-down approach is applied using a pre-determined subject classification scheme that categorizes publications based on cognitive attributes (Bordons et al., 2004). Bibliometric databases, such as WoS, Scopus, and ProQuest Dissertations & Theses (PQDT), assign each publication to one or more disciplines or subjects based on these attributes. Co-occurrence relations of subjects can then map the collaborative network at the subject level and quantify the interdisciplinary degree of different subjects (Hu and Zhang, 2018; Karlovčec and Mladenić, 2015; Morillo et al., 2003; Porter et al., 2007).
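The co-occurrence idea can be sketched as follows, assuming each document carries a list of assigned subject labels. This is a simplified illustration, not the indicator constructed in this study: the function names and the toy records are hypothetical, and the share of multi-subject documents is used here only as a simple stand-in summary.

```python
from collections import Counter
from itertools import combinations


def subject_cooccurrence(records):
    """Count how often each unordered pair of subjects is assigned to the
    same document. `records` is a list of subject lists (hypothetical)."""
    counts = Counter()
    for subjects in records:
        # Deduplicate and sort so each pair has one canonical key.
        unique = sorted(set(subjects))
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts


def multi_subject_share(records):
    """Fraction of documents assigned to two or more distinct subjects."""
    if not records:
        raise ValueError("at least one record is required")
    multi = sum(1 for subjects in records if len(set(subjects)) >= 2)
    return multi / len(records)


# Toy records standing in for subject assignments of four theses.
toy_records = [
    ["biology", "chemistry"],
    ["biology", "chemistry", "physics"],
    ["physics"],
    ["physics", "physics"],
]
toy_counts = subject_cooccurrence(toy_records)
toy_share = multi_subject_share(toy_records)
```

The pair counts correspond to the off-diagonal entries of a symmetric subject-by-subject co-occurrence matrix; more refined indicators would weight these entries, for example by the distance between subjects.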

The current methodologies for measuring interdisciplinarity in scientific documents, such as journal articles (Wang et al., 2015), patents (Huang and Su, 2019; Leydesdorff, 2018), and research proposals (Huutoniemi et al., 2010), do not provide clear guidance on assessing interdisciplinarity within doctoral theses. Furthermore, it is uncertain if existing methods can effectively capture the integration of knowledge across scientific domains within this specific type of scientific document.

Research questions

Having examined the relevant literature, this study identifies several research gaps. First, despite extensive assessment of the relationship between interdisciplinarity and scientific impact, there remains a lack of clarity regarding the participation of ECSs in IDR over the past few decades. The research strategies adopted by ECSs need to align with evolving societal needs and be guided by science and technology policies. The promotion of IDR has been driven by societal challenges, resulting in the formulation and reinforcement of relevant policies over time (Ash, 2019). Consequently, we expect that ECSs will exhibit changes in their engagement with IDR over time, closely following the increasing policy emphasis and support for IDR.

Second, the level of ECSs’ engagement in IDR may vary across universities with differing levels of research intensity due to variations in the universities’ dedication to IDR. Previous studies have indicated the growing adoption of interdisciplinarity as an institutional strategy among research universities in recent decades (Brint, 2005). Leading universities typically demonstrate stronger commitments to IDR and are better equipped to effectively implement IDR strategies, which may not be true for other institutions (Feller, 2004). In addition, ECSs’ involvement in IDR might also be influenced by their disciplinary contexts, as the challenges and requirements for conducting IDR can differ across disciplines (Wagner et al., 2011).

However, prior literature does not provide comprehensive insights into the prevalence of IDR among ECSs from the perspectives of temporal changes, disciplinary disparities, and universities with varying research intensity. In light of these gaps, we pose the following research question:

RQ1: What is the prevalence of interdisciplinary research among early-career scientists? To address RQ1, our analysis focuses on examining the temporal evolution of the interdisciplinarity indicator and its distributions across five scientific domains, as well as universities with different levels of research intensity.

The existing literature has not extensively explored the impact of gender on ECSs’ involvement in IDR. The effect is determined by both the effects of IDR on ECSs’ career development and the preferences female scientists have toward IDR. Previous studies suggest that IDR positively affects scientists’ mid-term and long-term career advancement while potentially impacting productivity negatively in the short term. Considering the relative disadvantage of female ECSs in the research system and the cultural barriers as discussed in the sub-section of “Gender and interdisciplinary research”, it is expected that they may adopt safer research strategies, leading to a lower level of engagement in IDR compared to male peers.

The gender disparities in ECSs’ participation in IDR may have evolved over time. Research indicates an increasing prominence of gender differences in productivity and scientific impact over the past 60 years due to the worsening sustainability of female scientists’ careers in academia (Huang et al., 2020). Consequently, it can be inferred that gender differences in ECSs’ engagement in IDR might have strengthened over time.

Moreover, disciplinary variations can play a role in gender disparities concerning ECSs’ involvement in IDR due to gender differences in performance, academic positions, research grants, and leadership across different fields (Brouns, 2000; Higher Education Funding Council for England, 2006; Huang et al., 2020). Furthermore, the underrepresentation of female scientists is notably more pronounced in top-tier institutions where highly competitive faculties and discriminatory policies are prevalent (Buchmann, 2009; Nadis, 2001). Thus, it is expected that gender differences in ECSs’ engagement in IDR will be particularly significant within universities with high research intensity, relative to that within ordinary universities.

To address these considerations, the following research question is proposed with a focus on temporal changes, disciplinary disparities, and varying research intensity among universities:

RQ2: How does gender influence ECSs’ engagement in IDR?

The existing literature does not provide clear insights into the connection between advisors’ gender and ECSs’ involvement in IDR. Previous studies primarily focus on whether advisors’ gender influences their students’ research success or likelihood of remaining in academia (Canaan and Mouganie, 2023; Gaule and Piacentini, 2018; Pezzoni et al., 2016; Rossello et al., 2020), as well as the importance of advisors in students’ thesis topic selection (de Kleijn et al., 2013; de Kleijn et al., 2012). However, these investigations do not shed light on whether advisors’ gender influences students’ research strategies or topic selection, particularly with regard to IDR. Taking into account the wide range of gender differences related to communication and mentoring styles (Allen and Eby, 2004; Canary and Dindia, 2009; Eagly and Crowley, 1986), it is plausible that female and male advisors may establish different types of mentoring relationships with students. These relationships could potentially impact students’ overall mentoring experiences, as well as their choice of thesis topics and their level of engagement in IDR.

In addition, the existing literature does not explore whether the gender pairing of students and advisors is associated with gender differences in ECSs’ engagement in IDR. Advisors are important in the socialization process of Ph.D. students, and in shaping their perception of their status in science and their belonging to academia (Sallee, 2011). Given that the five scientific domains we investigate are generally male-dominated, female students working under the guidance of female advisors may be more cognizant of the unequal opportunities and negative stereotypes about women that female scientists face (Gaule and Piacentini, 2018), which increases female students’ sense of insecurity in academia. Consequently, relative to being supervised by male advisors, being supervised by female advisors might result in reduced participation of female students in IDR considering the potentially negative impact of IDR on their short-term productivity. In parallel, this reduction in participation may not be observed for male students under supervision by female advisors or female students under supervision by male advisors. Hence, the gender pairing of students and advisors may influence the gender disparities observed in ECSs’ involvement in IDR. Specifically, mentorship by female advisors could potentially amplify the gender differences in students’ engagement with IDR.

To address these research gaps, we propose a third research question:

RQ3: Is the gender of advisors correlated with students’ participation in IDR?

To investigate this research question, our study examines whether the gender of advisors is correlated with students’ participation in IDR and explores whether the interaction between advisors’ gender and students’ gender contributes to gender disparities in this regard.

Data and methods

PQDT dataset

The data for this study is obtained from the Sciences and Engineering Collection of PQDT, which contains nearly all doctoral theses from U.S. universities in science and engineering since 1861. This study analyzes doctoral theses in behavioral sciences, biological sciences, engineering, health and medical sciences, and mathematical and physical sciences, which are representative scientific domains. PQDT provides information about each doctoral thesis including the author’s name, advisor’s name, abstract, title, year of publication, university, subject, and so forth. Until 1980, PQDT did not provide information on the names of advisors. Hence, in the analyses involving predicted gender based on advisors’ first names, we exclusively concentrate on records that contain available advisor names. We obtained 1,109,491 theses from the database up to the end of 2016. Supplementary Table S1 provides basic statistics of the original dataset.

PQDT provides information on secondary subjects for each thesis.Footnote 1 A thesis author selects the field(s) of research associated with their work from a list of subject categories when submitting their thesis to PQDT. The classification system for subject categories has been relatively stable in recent decades, with only minor updates. In 2000, the National Center for Education Statistics introduced a new subject category of “interdisciplinary” to the classification system through the Classification of Instructional Programs (CIP2000).Footnote 2 This addition resulted in the inclusion of “interdisciplinary” as a subject option in PQDT.

Within PQDT, information is available regarding the names of advisors and all members comprising the thesis committee. To measure mentorship, we define mentorship specifically as the mentoring relationship established between the student and their advisor. In this context, the members of the thesis committee are not considered in the measurement of mentorship since they do not provide direct training or guidance to the students. In cases where students have multiple advisors, our focus narrows down to the major advisor, who is typically mentioned first in the list of advisors.Footnote 3 This allows us to concentrate on the primary advisor’s role in the mentorship process.

We determine the major scientific domain for each thesis by using PQDT’s classification scheme of subject categories and mapping each subject to a broader scientific domain. For cases where a thesis in PQDT is assigned more than one subject, we define the major scientific domain to be the subject listed first by the author. If a thesis is assigned only one subject, the scientific domain the subject belongs to is defined as the major scientific domain of the thesis.

The distribution of major scientific domains of theses in the original dataset is shown in Supplementary Table S2. Out of a variety of scientific domains included in PQDT, we focus on doctoral theses in behavioral sciences, biological sciences, engineering, health and medical sciences, and mathematical and physical sciences, resulting in 920,619 doctoral theses. These five scientific domains are most representative of hard science research in this dataset, accounting for almost 83% of all theses present in PQDT. Based on the Carnegie Classification of Institutions of Higher Education,Footnote 4 we categorize U.S. universities in the final dataset into three categories: R1 university (university with very high research intensity), R2 university (university with high research intensity), and other types of university.

Variables operationalization

This study aims to investigate the evolution of IDR in doctoral theses over the past six decades. This study also examines whether there are gender differences between female and male students in the interdisciplinarity level of doctoral theses and whether advisors’ gender plays a significant role in shaping the interdisciplinarity of doctoral theses. The dependent variable is the average distance between subjects assigned to a thesis. The key independent variable is students’ gender, which is inferred by the Gender-Guesser package.

Predicting gender information of students and advisors

Given that PQDT does not contain gender information for authors and advisors, we use the Gender-Guesser package,Footnote 5 an open-source Python module, to predict the gender of authors and advisors based on their first names. This tool has been developed using data from the program of “gender” by Jorg Michael and involves manual checks by native speakers of various countries for higher accuracy (Santamaría and Mihaljević, 2018). Gender-Guesser is viewed as one of the most advanced tools for predicting gender from first names and has been widely used in relevant literature (Squazzoni et al., 2021; Zhang et al., 2022). This tool classifies gender into six categories based on the first names: “male”, “female”, “mostly male”, “mostly female”, “andy” (androgynous), and “unknown” (name not found). The predicted gender statistics for students and advisors are presented in Table 1. The categories “mostly male” and “mostly female” suggest that a given name may be used by both men and women, but it is more frequently used by one gender. The “andy” category indicates that a name is used equally by both men and women, while the “unknown” category refers to a name that does not exist within the gender dataset. To ensure high accuracy in gender prediction, we only retain observations with a definite predicted gender attribute (“male” or “female”) for students.Footnote 6 The final dataset included 675,135 doctoral theses authored by students from 747 U.S. universities.
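The retention rule described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the record field `author_first_name`, the helper names, and the toy lookup table are hypothetical. The optional `gender_guesser_classifier` helper assumes the third-party gender-guesser package is installed; that package returns labels with underscores, such as `mostly_male`.

```python
DEFINITE = {"male", "female"}


def toy_classifier(first_name):
    """Hypothetical stand-in lookup so the sketch runs without external packages."""
    lookup = {"maria": "female", "john": "male", "robin": "andy", "kim": "mostly_female"}
    return lookup.get(first_name.lower(), "unknown")


def gender_guesser_classifier():
    """Return a classifier backed by the gender-guesser package (lazy import)."""
    import gender_guesser.detector as gender

    detector = gender.Detector()
    return detector.get_gender


def retain_definite(theses, classify=toy_classifier):
    """Keep only theses whose author is predicted as definitely male or female."""
    kept = []
    for thesis in theses:
        label = classify(thesis["author_first_name"])
        if label in DEFINITE:
            kept.append({**thesis, "student_gender": label})
    return kept
```

Passing `classify=gender_guesser_classifier()` would swap the toy lookup for the real detector; records labeled “mostly male”, “mostly female”, “andy”, or “unknown” are dropped either way.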

Table 1 The predicted gender information of students and advisors.

Measuring the interdisciplinarity of doctoral theses

As discussed in the sub-section of “Interdisciplinarity measures”, two commonly used approaches in measuring the interdisciplinarity of scientific documents are citation-based methods and methods based on pre-determined subject classification schemes. However, for doctoral theses within PQDT, citation data is not available, making it impossible to calculate citation-based interdisciplinarity measures. Moreover, traditional citation indexes like WoS and Scopus often exclude doctoral theses (Vera-Baceta et al., 2019), rendering citation-based indicators unreliable for reflecting interdisciplinarity levels in theses. To address these challenges, we employ co-occurrence matrices of subjects assigned to doctoral theses in PQDT. This approach allows us to measure the degree of interdisciplinarity present in doctoral research, offering a viable alternative to overcome the limitations of citation-based methods.

Doctoral theses in PQDT are classified into 552 secondary subjects, which collectively encompass 22 broader subject categories. The distributions of the number of subjects for all doctoral theses, female-authored doctoral theses, and male-authored doctoral theses are displayed in Fig. 1a–c. The distribution of the number of subjects for doctoral theses by scientific domain is presented in Supplementary Fig. S1a.

Fig. 1: The distribution of the number of subjects and the average distance between subjects for theses in five scientific domains in PQDT.

a–c depict the distributions of the number of subjects for all doctoral theses, female-authored theses and male-authored theses, respectively. d–f illustrate the distributions of the average distance between subjects for all doctoral theses, female-authored theses and male-authored theses, respectively.

We further construct a continuous variable to proxy the degree of interdisciplinarity in each thesis at a fine-grained level. Specifically, we calculate the average distance or dissimilarity between subjects assigned to each thesis based on a co-occurrence matrix of subjects that is constructed using the assigned subjects for all theses in PQDT. For example, if a thesis is assigned three subjects, electrical engineering, nanotechnology, and energy, the three subjects are considered to have co-occurred with each other once. The co-occurrence matrix of 552 subjects in PQDT allows calculation of the cosine distance, d_{i,j}, between any two subjects, denoted by i and j, based on Eq. (1).

$$d_{i,j} = 1 - C_{i,j}$$
(1)

where C_{i,j} is the cosine similarity between subjects i and j based on the co-occurrence matrix.

Following previous literature (Kim et al., 2022; Rafols and Meyer, 2007), for any thesis that is classified into more than one subject, we use Eq. (2) to calculate the average distance, i.e., Distance_t, between subjects assigned to thesis t:

$$Distance_t = \frac{1}{n(n - 1)}\sum\limits_{i \ne j} d_{i,j}$$
(2)

where i and j indicate two subjects that are assigned to thesis t; d_{i,j} refers to the cosine distance between subjects i and j, and is calculated by Eq. (1); n indicates the number of subjects assigned to thesis t (n > 1). The larger the Distance_t, the more interdisciplinary the thesis is. Eq. (2) is only defined for theses that are assigned more than one subject. If a thesis is classified into only one subject, we define the value of Distance_t as 0, which indicates that this thesis is not interdisciplinary at all. The distributions of the average distance between subjects for all doctoral theses, female-authored doctoral theses and male-authored doctoral theses are shown in Fig. 1d–f. The distribution of the average distance between subjects for doctoral theses by scientific domain is presented in Supplementary Fig. S1b. The statistics regarding the gender of students, and the average distance between subjects of theses across five scientific domains are shown in Table 2.
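Eqs. (1) and (2) can be illustrated with a small pure-Python sketch on toy data. Two choices here are assumptions that the text does not specify: the diagonal of the co-occurrence matrix is left at zero, and a subject that never co-occurs with any other is treated as maximally distant (distance 1). The function names and subject labels are hypothetical.

```python
from collections import defaultdict
from itertools import combinations, permutations
from math import sqrt


def cooccurrence(theses):
    """Count how often each pair of subjects is assigned to the same thesis."""
    counts = defaultdict(lambda: defaultdict(int))
    for subjects in theses:
        for a, b in combinations(sorted(set(subjects)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def cosine_distance(counts, i, j):
    """Eq. (1): one minus the cosine similarity of the co-occurrence rows of i and j."""
    row_i, row_j = counts.get(i, {}), counts.get(j, {})
    dot = sum(value * row_j.get(key, 0) for key, value in row_i.items())
    norm_i = sqrt(sum(value * value for value in row_i.values()))
    norm_j = sqrt(sum(value * value for value in row_j.values()))
    if norm_i == 0 or norm_j == 0:
        return 1.0  # assumption: no co-occurrence information means maximal distance
    return 1.0 - dot / (norm_i * norm_j)


def average_distance(subjects, counts):
    """Eq. (2): mean distance over ordered subject pairs; 0 for single-subject theses."""
    unique = sorted(set(subjects))
    n = len(unique)
    if n < 2:
        return 0.0
    total = sum(cosine_distance(counts, i, j) for i, j in permutations(unique, 2))
    return total / (n * (n - 1))


toy_theses = [["A", "B"], ["A", "C"], ["B", "C"], ["D"]]
toy_counts = cooccurrence(toy_theses)
```

In this toy corpus every pair among A, B and C shares exactly one of two co-occurrence partners, so each pairwise distance, and hence the average distance of a thesis assigned A and B, is 0.5.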

Table 2 Statistics of five scientific domains.

To address RQ1, we utilize descriptive analyses to examine the extent to which ECSs participated in IDR based on their doctoral theses. We investigate the temporal change of the interdisciplinarity indicator as well as its distributions across five scientific domains and universities with varying research intensity.

Regression analyses

To investigate RQs 2 and 3, we examine potential differences between female and male students in the level of involvement in IDR as reflected in their doctoral theses. We also explore the influence of advisors’ gender on shaping students’ engagement in IDR, as well as the possible relationship between the gender combination of students and advisors and the magnitude of gender differences observed in students’ participation in IDR. We use t to denote a thesis. Equation (3) is used to estimate the dependent variable, i.e., Distance_t, which refers to the average distance of subjects assigned to thesis t.

$$\begin{array}{l}Distance_t = \alpha + \beta _1female\,student_t + \beta _2female\,advisor_t\\\qquad\qquad\quad\;\,\, +\, \beta _3female\,student_t \times female\,advisor_t + Y_t + U_t + D_t + \varepsilon _t\end{array}$$
(3)

where female student_t is a dummy variable that reflects whether or not the student, i.e., the author of thesis t, is female. We introduce female advisor_t to Eq. (3) as an explanatory dummy variable that reflects whether or not the advisor is female. Building upon the discussions in the “Relationship between mentorship and students’ engagement in interdisciplinary research” and “Research questions” sub-sections, which highlight the impact of the gender combination of advisors and students on students’ engagement in IDR, we introduce an interaction term between female student_t and female advisor_t into Eq. (3) to empirically test the validity of this assumption. Year fixed effects, i.e., Y_t, are added to control for time-varying unobserved changes, such as policy changes that support or discourage IDR; university fixed effects, i.e., U_t, are included to control for possible effects of university characteristics on students’ preferences for IDR; fixed effects concerning scientific domains, i.e., D_t, are incorporated to control for the major scientific domains in which students work. Equation (3) is estimated by an ordinary least squares (OLS) regression model. The average variance inflation factor (VIF) for all explanatory variables is 2.03, which is well below the threshold value of 5. This suggests that multicollinearity is not a concern in the regression model.
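One way to read the coefficients of Eq. (3) is through the female-male gap it implies for each advisor gender, holding the fixed effects constant and assuming the error term has zero conditional mean. The contrasts below follow algebraically from the specification and are not additional estimates:

$$\begin{array}{l}E\left[ Distance_t \mid female\,student_t = 1,\ female\,advisor_t = 0 \right] - E\left[ Distance_t \mid female\,student_t = 0,\ female\,advisor_t = 0 \right] = \beta _1\\E\left[ Distance_t \mid female\,student_t = 1,\ female\,advisor_t = 1 \right] - E\left[ Distance_t \mid female\,student_t = 0,\ female\,advisor_t = 1 \right] = \beta _1 + \beta _3\end{array}$$

Under this reading, the gap among students of male advisors is β1 and the gap among students of female advisors is β1 + β3, so a negative β3 alongside a negative β1 corresponds to a larger gap under female advisors.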

Results

Evolution of interdisciplinary doctoral theses

In general, doctoral theses by U.S. Ph.D. students in the five scientific domains have demonstrated a growing trend toward IDR. Doctoral theses that are assigned multiple subjects are considered interdisciplinary doctoral theses. Based on this definition, in period 1, i.e., before 2000, interdisciplinary doctoral theses accounted for only 32.1% of the total doctoral theses (Fig. 2a). This proportion increased to 59.5% during the second period, i.e., from 2000 to 2016, suggesting that IDR has become dominant in doctoral theses. The increasing proportion of interdisciplinary doctoral theses is observed across scientific domains and universities of varying research intensity (Fig. 2b, d). From Supplementary Fig. S2, we find that since the 1990s, interdisciplinary studies have been dominant in doctoral theses.

Fig. 2: Linear regressions of the fraction of IDR and the interdisciplinarity indicator as a function of year.

To mitigate potential biases caused by the introduction of the “interdisciplinary” subject category in PQDT in 2000, data points from 2000 to 2004 have been excluded. Solid lines represent the linear increases in the fraction of IDR and the interdisciplinarity indicator from 1950 to 1999 (period 1), while dashed lines indicate the increases from 2005 to 2016 (period 2). a displays the linear increases of the fraction of IDR and the average distance between subjects in two periods. The coefficients of the year on the fraction of IDR and the average distance between subjects, along with their significance levels, are provided. b, d indicate the linear increases of the fraction of IDR by scientific domain, and by university of varying research intensity in two periods; c, e show the linear increases of the average distance between subjects by scientific domain, and by university of varying research intensity in both periods. IDR doctoral theses are defined as doctoral theses that are assigned multiple subjects in PQDT. We use BE, BI, EN, HM and MP to denote behavioral sciences, biological sciences, engineering, health and medical sciences, and mathematical and physical sciences, respectively. In (b–e), the coefficients of year and the P values are omitted for readability.

The growing trend toward IDR in doctoral theses is also reflected by the increasing average distance between subjects assigned to doctoral theses (Fig. 2a), which is found across five scientific domains, and universities of different types (Fig. 2c, e). This trend suggests that, even within interdisciplinary doctoral theses, i.e., doctoral theses that are assigned multiple subjects based on our definition, research subjects with greater cognitive distance are being combined and integrated into doctoral research. Supplementary Fig. S2 displays the year-by-year temporal evolution of the proportion of IDR and the average distance between subjects.
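The two-period linear fits reported in Fig. 2 can be sketched as below. This is a minimal illustration on synthetic yearly values, assuming one aggregated value per year (for example, the yearly fraction of IDR theses). The function names are hypothetical, and the year boundaries follow the Fig. 2 caption.

```python
def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def period_slopes(yearly):
    """Fit separate trends for 1950-1999 and 2005-2016, skipping 2000-2004."""
    period_1 = [(year, value) for year, value in yearly.items() if 1950 <= year <= 1999]
    period_2 = [(year, value) for year, value in yearly.items() if 2005 <= year <= 2016]
    return ols_slope(period_1), ols_slope(period_2)


# Synthetic series: a slow rise, a disrupted 2000-2004 window, then a faster rise.
toy_yearly = {year: 0.10 + 0.004 * (year - 1950) for year in range(1950, 2000)}
toy_yearly.update({year: 0.90 for year in range(2000, 2005)})
toy_yearly.update({year: 0.50 + 0.010 * (year - 2005) for year in range(2005, 2017)})
```

Because the 2000–2004 values are excluded from both fits, the artificial jump in the synthetic series does not affect either slope.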

To gain insights into the variations in the average distance between subjects, we conduct comparisons across different periods, scientific domains, and universities with varying research intensity. The findings confirm that there has been an increase in the average distance between subjects over time (Fig. 3a). Disciplinary disparities are also identified. Specifically, among the five scientific domains examined, mathematical and physical sciences showcase the lowest level of interdisciplinarity in doctoral theses (Fig. 3b). On the other hand, biological sciences, and health and medical sciences exhibit relatively higher levels of interdisciplinarity. Moreover, universities with high research intensity (R2 universities) demonstrate significantly higher levels of interdisciplinarity, compared to ordinary universities (Fig. 3c).

Fig. 3: The average distance between subjects by period, scientific domain and university of varying research intensity.

a–c indicate the mean of the interdisciplinarity indicator in two periods, in five scientific domains and in three types of universities. Welch’s t tests are performed in (a, c). *** represents significance at the 1% level.

Gender disparities in interdisciplinary doctoral theses

Our analyses uncover the presence of gender disparities in the interdisciplinarity level of doctoral theses. The results of the Welch’s t test indicate a significant difference in the average distance between subjects in doctoral theses authored by female and male students, with female-authored theses having an average distance ~0.09 higher than that of male-authored theses (Supplementary Fig. S3a). When examining the historical changes, the average distances between subjects in doctoral theses by female and male students exhibit almost overlapping trends. Simple comparisons of the average distance between subjects between female- and male-authored theses suggest that female students combine and integrate research subjects with slightly greater cognitive distance than their male counterparts do. However, the simple comparisons fail to consider the potential effects of factors such as universities, scientific domains, periods and advisors’ gender that may influence students’ preferences for IDR, as discussed in the section of “Literature review and research questions”. To address this issue, we use a multivariate regression model to estimate gender differences in the interdisciplinarity indicator while controlling for influential factors concerning the above aspects.
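For readers unfamiliar with the unadjusted comparison, the Welch’s t statistic and its Welch–Satterthwaite degrees of freedom can be computed as in the sketch below. The function name and the toy samples are hypothetical and unrelated to the PQDT data; the sketch stops short of computing a P value, which requires the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors, using the unbiased sample variance of each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


toy_t, toy_dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

Unlike Student’s t test, this form does not assume equal variances in the two groups, which suits groups of different sizes and spreads.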

Gender discrepancies are evidenced by the greater average distance between subjects in doctoral theses by males. Overall, the results of the multivariable regression models indicate that the average distance between subjects in male-authored doctoral theses is significantly greater than that in female-authored theses by around 0.012 (P < 0.01), which is approximately 6% of the overall sample mean (Fig. 4a and column 1 of Table 3). This finding consistently holds across various periods (Fig. 4b, c and columns 3 and 5 of Table 3), universities of different research intensity (Fig. 5), and most scientific domains (Fig. 6). This observation suggests that the gender disparities in students’ engagement observed in doctoral theses are prevalent across periods, university quality, and scientific domains.

Fig. 4: The linear prediction of the interdisciplinarity indicator by students’ gender and period.

The x axis in sub-figures (a–c) indicates students’ gender. The y axis means the predicted dependent variable, i.e., the average distance between subjects, when all other covariates are set to their means. Error bars represent the upper and lower bounds of 95% confidence intervals.

Table 3 The estimated relationship between students’ gender, advisors’ gender, and the interdisciplinarity indicator of doctoral theses.

Fig. 5: The linear prediction of the interdisciplinarity indicator by students’ gender and university type.

The x axis in sub-figures (a–c) indicates students’ gender. The y axis means the predicted dependent variable, i.e., the average distance between subjects, when all other covariates are set to their means. Error bars represent the upper and lower bounds of 95% confidence intervals.

Fig. 6: The linear prediction of the interdisciplinarity indicator by students’ gender and scientific domain.

The x axis in sub-figures (a–e) indicates students’ gender. The y axis means the predicted dependent variable, i.e., the average distance between subjects, when all other covariates are set to their means. Error bars represent the upper and lower bounds of 95% confidence intervals.

However, it should be noted that the magnitude of gender differences in the average distance between subjects varies across periods, university quality, and scientific domains. The gender disparity in the average distance between subjects is more pronounced during the second period (2000–2016) compared to the period spanning from 1950 to 1999 (Fig. 4b, c). This discovery implies an increase in gender disparities regarding the level of interdisciplinarity in doctoral theses over time. Besides, the average distance between subjects in male-authored doctoral theses exceeds that in female-authored doctoral theses by a larger margin in the top-tier universities (Fig. 5), i.e., R1 universities, relative to the remaining two types of universities. Additionally, disciplinary differences are found across scientific domains. Figure 6 indicates that significant gender differences exist in the interdisciplinarity indicator between female students and male students in behavioral sciences, biological sciences, and engineering. However, in health and medical sciences, and mathematical and physical sciences, the gender differences are weaker and not clearly significant.

The adjusted R² values in the regression models are not high, especially for the regression analyses for observations from 2000 to 2016 (Table 3). In the regression model that focuses on the period of 1950–1999, the adjusted R² is ~10%, suggesting that the model explains 10% of variations within the data. However, the adjusted R² of the regression model that investigates the period of 2000–2016 only approaches around 5%. The relatively low R² in the regression models suggests that besides the explanatory variables we use in this study, other important factors influence the interdisciplinarity level in doctoral theses, such as students’ personalities, attitudes toward IDR, and so forth. Furthermore, we conduct residual analyses to evaluate the suitability of the regression models (Supplementary Fig. S4). Overall, the findings indicate that a linear regression model is appropriate for this study.

Furthermore, we find that the advisor’s gender is not significantly associated with the interdisciplinarity level of doctoral theses (columns 1, 3, and 5 in Table 3). In other words, students who are supervised by female advisors and those who are supervised by male advisors do not differ significantly in the average distance between subjects in their doctoral theses.

Our findings reveal a significant negative interaction term between female student and female advisor (Fig. 7a–c and columns 2 and 6 in Table 3), indicating that the gender combination of students and advisors influences the average distance between subjects covered in their doctoral theses. Specifically, the gender difference in the average distance between subjects is more pronounced when students are mentored by female advisors compared to male advisors. This suggests that being supervised by female advisors might strengthen the gender difference in students’ engagement in IDR.

Fig. 7: The linear prediction of the interdisciplinarity indicator by incorporating the interaction term of students’ gender and advisors’ gender.

The x axis in sub-figures (a–c) indicates students’ gender. The y axis means the predicted dependent variable, i.e., the average distance between subjects, when all other covariates are set to their means. The blue line indicates female advisors and the red line indicates male advisors. The shaded areas represent the upper and lower bounds of 95% confidence intervals.

Robustness check

We apply four methods to investigate the robustness of the major findings in this study. First, to minimize the potential bias in gender prediction, we apply Genderize.io, another commonly used gender inference tool (Sebo, 2021), to predict the gender of students and advisors, and perform the above analyses again. Genderize.io collects data from all over the Web, assigning gender to each given name based on the proportion of people with that name who are men or women. While Genderize.io exhibits high gender prediction accuracy, it does have certain limitations. These include a bias toward English names and potential unreliability concerning the sources utilized within the tool (Santamaría and Mihaljević, 2018). We retain names with a gender probability of 0.9 and above in the Genderize.io database. The predicted gender of students and advisors in five scientific domains in PQDT is shown in Supplementary Tables S3 and S4, respectively. The regression results are shown in Supplementary Table S5. Generally, we observe a consistent gender difference in the average distance between subjects in doctoral theses.
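The 0.9 probability cutoff can be sketched as a simple filter over already-retrieved predictions. The record layout used here (keys `name`, `gender`, `probability`) is an assumption about how the responses were stored, the function name is hypothetical, and no network call is made.

```python
def retain_confident(responses, threshold=0.9):
    """Map each name to its predicted gender when the probability meets the threshold."""
    kept = {}
    for record in responses:
        gender = record.get("gender")
        probability = record.get("probability", 0.0)
        if gender in ("male", "female") and probability >= threshold:
            kept[record["name"]] = gender
    return kept


toy_responses = [
    {"name": "anna", "gender": "female", "probability": 0.98},
    {"name": "jamie", "gender": "male", "probability": 0.62},
    {"name": "xq", "gender": None, "probability": 0.0},
]
```

Names without a prediction or with a probability below the threshold are dropped, which mirrors the restriction to definite predictions in the main analysis.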

Furthermore, we incorporate doctoral thesis data where students’ gender is predicted as “mostly female” or “mostly male”. To maintain consistency, we reclassify instances labeled as “mostly female” to “female” and those labeled as “mostly male” to “male”. The regression results are shown in Supplementary Table S6, suggesting the persistently lower interdisciplinarity of female-authored doctoral theses.

We further utilize the doc2vec model to assess the interdisciplinarity of doctoral theses based on their titles. Doc2vec is an unsupervised learning algorithm and an extension of the word2vec model (Le and Mikolov, 2014). It is used to vectorize input documents by capturing contextual information and preserving semantic relations. This tool is widely used because of its performance advantages over its alternatives, such as the TF-IDF models (Kim et al., 2017). Using the “gensim” library, we leverage the doc2vec model to represent each secondary subject in PQDT as vectors for comparing semantic distances between subjects. Following previous literature (Alasehir and Acarturk, 2022; Li et al., 2023; Whalen et al., 2020), we employ the PV-DM model with vector dimensions, learning rate, and window length set to 100, 0.025, and 3, respectively. We combine the titles of all doctoral theses assigned to each subject, resulting in a collection of 552 text blocks corresponding to 552 secondary subjects. These text blocks serve as input data for training the doc2vec model, which generates 100-dimensional vector representations for each secondary subject. These vectors can be viewed as points in a multidimensional semantic space, representing the content of each subject. To measure the distance between subjects, we calculate the cosine distance between each pair of subjects using the 100-dimensional vectors. Drawing from Eq. (2), we compute the average distance between subjects in the doctoral theses as a measure of interdisciplinarity and perform regression analyses using this metric. The regression results, as presented in Supplementary Table S7, indicate a persistent pattern of lower participation in IDR among female students.
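The title-based procedure can be sketched as follows. This is a hedged illustration rather than the authors' code: whitespace tokenization, `min_count=1`, and `epochs=20` are assumptions not stated in the text, the function names are hypothetical, and `train_subject_vectors` assumes the gensim 4.x API (imported lazily so that the other helpers run without gensim).

```python
from collections import defaultdict
from math import sqrt


def build_text_blocks(theses):
    """Concatenate the tokenized titles of all theses assigned to each subject."""
    blocks = defaultdict(list)
    for title, subjects in theses:
        tokens = title.lower().split()  # assumption: simple whitespace tokenization
        for subject in subjects:
            blocks[subject].extend(tokens)
    return dict(blocks)


def train_subject_vectors(blocks, epochs=20):
    """Train a PV-DM doc2vec model with one tagged document per subject."""
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument  # third-party, gensim 4.x

    documents = [TaggedDocument(words=tokens, tags=[subject])
                 for subject, tokens in blocks.items()]
    # dm=1 selects PV-DM; vector size, learning rate and window follow the text.
    model = Doc2Vec(documents=documents, dm=1, vector_size=100, alpha=0.025,
                    window=3, min_count=1, epochs=epochs)
    return {subject: model.dv[subject] for subject in blocks}


def cosine_distance(u, v):
    """One minus the cosine similarity of two vectors (1.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 if norm == 0 else 1.0 - dot / norm


toy_blocks = build_text_blocks([
    ("Deep learning for imaging", ["CS", "Radiology"]),
    ("Learning theory", ["CS"]),
])
```

The resulting subject vectors would then replace the co-occurrence rows when computing pairwise distances and the per-thesis average of Eq. (2).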

Advisors’ characteristics, such as preferences for IDR, personalities, and mentoring styles may be related to students’ engagement in IDR. For robustness check, we examine whether the higher interdisciplinarity level of male-authored doctoral theses than that of female-authored doctoral theses still holds, even when female students and male students are supervised by the same advisor. We include advisor-fixed effects in Eq. (3) to account for advisor-specific and time-invariant characteristics. Upon accounting for the influence of advisors by adding advisor-fixed effects in regression models, Supplementary Table S8 indicates that male students consistently exhibit a higher level of interdisciplinarity in their theses, compared to female students, even when male and female students are supervised by the same advisor.
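The logic of advisor fixed effects can be illustrated with the within transformation: demeaning the outcome and the gender dummy inside each advisor's group compares only students who share an advisor. The sketch below is a simplification of Eq. (3) with a single regressor and no year, university, or domain effects; the function name and toy rows are hypothetical.

```python
from collections import defaultdict


def within_advisor_gap(rows):
    """Estimate the female-student coefficient using only within-advisor variation.

    rows: iterable of (advisor_id, female_student as 0 or 1, distance).
    """
    groups = defaultdict(list)
    for advisor, female, distance in rows:
        groups[advisor].append((female, distance))
    sxy = sxx = 0.0
    for members in groups.values():
        mean_x = sum(f for f, _ in members) / len(members)
        mean_y = sum(d for _, d in members) / len(members)
        for female, distance in members:
            sxy += (female - mean_x) * (distance - mean_y)
            sxx += (female - mean_x) ** 2
    if sxx == 0:
        raise ValueError("no advisor supervises both female and male students")
    return sxy / sxx


# Advisors A and B each supervise one male and one female student; C only a male student.
toy_rows = [("A", 0, 0.5), ("A", 1, 0.4), ("B", 0, 0.3), ("B", 1, 0.2), ("C", 0, 0.9)]
```

In the toy rows the pooled female-male difference in means is about −0.27, whereas the within-advisor estimate is −0.1, because advisor C, who supervises only a male student, contributes nothing once advisor means are removed.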

Discussion and conclusion

This study finds persistent and pervasive gender imbalance in interdisciplinary research based on doctoral theses by 675,135 U.S. Ph.D. students in five scientific domains from 1950 to 2016. We find that interdisciplinary doctoral theses have witnessed a growing trend across different scientific domains and universities of different research quality, which is in line with the rise of IDR and education. Since the 1990s, interdisciplinary studies have been dominant in doctoral theses. The average distance between subjects in doctoral theses is growing as well, which suggests that Ph.D. students are inclined to combine different types of disciplinary knowledge that are increasingly cognitively distant from each other. The finding suggests that Ph.D. students are becoming more and more adaptive to the growing complex social and economic problems, and get more engaged in IDR.

This study further finds modest but persistent gender differences in ECSs’ engagement in IDR across the five hard science disciplines. This finding is evidenced by the longer average distance between subjects of male-authored doctoral theses, relative to that of female-authored doctoral theses. Specifically, the average distance between subjects of male-authored doctoral theses is significantly longer than that of doctoral theses by female students by 0.012, namely around 6% of the sample mean.

Our findings indicate that advisors’ gender is not significantly associated with the average distance between subjects in students’ doctoral theses. While we discussed the potential influence of advisors’ gender on students’ thesis topic selection in the “Relationship between mentorship and students’ engagement in interdisciplinary research” sub-section, it appears that advisors’ gender may not directly impact their students’ involvement in IDR. Other attributes and characteristics of advisors, such as their academic background and active participation in IDR, may be more influential in this regard. Further research should explore these aspects in greater detail.

Our analysis further confirms that the gender pairing of students and advisors magnifies the gender disparity in the level of interdisciplinarity observed in doctoral theses. This finding aligns with the hypothesis presented in the sub-section of “Research questions”. Specifically, when female students are mentored by female advisors, they tend to exhibit a heightened awareness of cultural factors and systemic barriers that present challenges for female scientists. As a result, these students may adopt more cautious and conservative research strategies to mitigate potential negative impacts on their career advancement that could arise from engaging in IDR. In contrast, no similar considerations appear to influence male students under the guidance of female advisors or students of any gender supervised by male advisors.

These findings are in line with the earlier discussion of the systemic obstacles that female students encounter when engaging in IDR, which stem from gender biases prevalent in current scientific practices and rewards. Our results suggest that female students engage less in IDR, and with a lower degree of interdisciplinarity, than their male peers. This gender disparity has grown over time and is more prominent in top-tier universities and in some scientific domains, such as biological science.

Considering the high potential of IDR, being less interdisciplinary during the early-career stages may pose a risk to female scientists’ future career progression, exacerbating gender disparities in science. Moreover, existing evidence suggests that female and male scientists differ in the research topics and questions they select. For example, female-dominated teams tend to investigate questions related to women’s health and medical needs (Koning et al., 2021). Being less interdisciplinary could therefore hinder the generation of effective solutions to issues relevant to women and adversely affect their well-being.

The results of this study have broad implications for university managers, funding agencies, and policymakers. To mitigate gender gaps in IDR for ECSs, it is critical to address the systemic gender biases within current scientific practices and rewards, particularly by alleviating gender-based inequalities in the job market, promotion assessment, and funding allocation. Policymakers and funding agencies should provide funding support for female scientists, particularly those at the early-career stages, who engage in IDR or work in interdisciplinary research centers, thus fostering an environment conducive to their participation in IDR. Furthermore, our findings suggest that being supervised by female advisors exacerbates the gender disparity in ECSs’ engagement in IDR. This implies that providing additional support to female faculty members may not only directly benefit their own career development but also foster the growth and success of future generations of female scientists. It is important to acknowledge that the findings of this study are specific to the five hard science disciplines and may not apply to other domains, particularly the humanities and social sciences.

This study has several limitations. First, informal mentorship plays a crucial role in shaping ECSs’ trajectories, so whether the results of this study hold for informal mentoring relationships requires further investigation. Second, although we utilize large-scale data on doctoral theses, our analyses cover only five scientific domains; it would be useful to validate our findings in other fields, particularly the humanities and social sciences. Third, although doctoral theses reflect the culmination of students’ research training and represent a significant academic accomplishment, students may apply different research strategies in their doctoral theses than in their other publications, such as journal articles or conference papers. Future studies should therefore investigate whether the results of this study also apply to ECSs’ other publications. Fourth, other factors may shape ECSs’ preferences for IDR, including their personalities, their attitudes toward IDR, and their advisors’ mentoring styles. These factors warrant further investigation in future studies. Finally, we employ first names as a proxy for inferring the gender of students and advisors. However, first names only reflect the gender assigned at birth and may not accurately capture social gender. Survey research could serve as a valuable complement, enabling a more comprehensive measurement of the social gender of both students and advisors.