Abstract
Evaluating research activity in research departments and education programs is conventionally accomplished through measurement of research funding or bibliometrics. This limited perspective of research activity restricts a more comprehensive evaluation of a program’s actual research capacity, ultimately hindering efforts to enhance and expand it. The objective of this study was to conduct a scoping review of the existing literature pertaining to the measurement of research productivity in research institutions. Using these findings, the study aimed to create a standardized research measurement tool, the Productivity And Capacity Evaluation in Research (PACER) Tool. The evidence review identified 726 relevant articles in a literature search of PubMed, Web of Science, Embase, ERIC, CINAHL, and Google Scholar with the keywords “research capacity” and “research productivity.” Thirty-nine English-language studies applicable to research measurement were assessed in full and 20 were included in the data extraction. Capacity/productivity metrics were identified, and the relevance of each metric was data-charted according to 3 criteria: the metric was objective, organizational in scale, and applicable to varied research domains. This produced 42 research capacity/productivity metrics that fell into 7 relevant categories: bibliometrics, impact, ongoing research, collaboration activities, funding, personnel, and education/academics. With the expertise of a Delphi panel of researchers, research leaders, and organizational leadership, 31 of these 42 metrics were included in the final PACER Tool. This multifaceted tool enables research departments to benchmark research capacity and research productivity against other programs, monitor capacity development over time, and provide valuable strategic insights for decisions such as resource allocation.
- ADFM/NAPCRG Research Summit 2023
- Benchmarking
- Bibliometrics
- Capacity Building
- Efficiency
- Health Care Quality Indicators
- Health Personnel
- Leadership
- Research Personnel
- Resource Allocation
- Systematic Review
Introduction
Effective research can have a profound impact, leading to significant advancements in new technologies, medicines, and evidence-based policies. In recent years, the use of research metrics has gained considerable attention as a way to assess the quality and impact of research, supporting efforts to increase research productivity and capacity in primary care.1,2 Measuring the impact and quality of scientific research, however, remains a challenge for researchers, institutions, and funding agencies.3–6 There are no standard guidelines for which research metrics are most informative, making it difficult to assess the relative effectiveness of different research organizations.1 A standardized data set would allow for comparison between research organizations, and within organizations over time.
As a solution to this problem, the Building Research Capacity (BRC) Steering Committee commissioned a study to form a panel of research metrics. BRC comprises members from the North American Primary Care Research Group and the Association of Departments of Family Medicine. Since 2016, BRC has been engaged in offering resources to departments of family medicine to enhance and expand research, including consultations and leadership training through a research leadership fellowship.7 The development and monitoring of research capacity is a topic of significant practical interest to the committee, which has compiled a list of research metrics that have proved useful in providing consultations to clinical research departments and teaching fellows. Starting with this list as a template, the BRC Steering Committee commissioned a scoping review to investigate other metrics in the scientific literature that have been shown to be relevant and to collect a list of research assessment resources. The objective of this review was to generate a structured collection of metrics, termed the Productivity And Capacity Evaluation in Research (PACER) Tool.
Methods
We performed a scoping review using the method outlined by Arksey and O'Malley that was further developed by Levac et al.8,9 We aimed to identify previously reported metrics or tools that have been used as indicators to track, report, or develop research capacity and productivity in medicine. Arksey and O’Malley8 identified a process consisting of 6 steps: 1) identifying the research question, 2) identifying relevant studies, 3) selecting studies, 4) charting the data, 5) collating, summarizing, and reporting results, and 6) consulting (optional). The scoping review checklist described by Cooper et al10 was used to guide the process.
A medical librarian performed a literature search of PubMed, Web of Science, Embase, ERIC, CINAHL, and Google Scholar using the keywords “research capacity” and “research productivity”; further search details are given in the Supplemental Material. Forward and backward citation searching was also performed to identify additional articles. No timeline restrictions were imposed, and only peer-reviewed articles were included. The Deduplicator in the Systematic Review Accelerator package was used to remove duplicates from the results of these database searches, producing a final list of citations, which were then uploaded to Rayyan, a web and mobile app for systematic reviews.11 This article follows the PRISMA-ScR checklist.12
Results
For the study selection for the scoping review, 2 authors (SKS and PC) screened the titles and abstracts of 726 articles to determine their relevance to research capacity and/or productivity (Figure 1). Articles were selected if they met 3 criteria: 1) they developed or assessed a research tool or metric; 2) the tool or metric was objective in nature; and 3) the assessment was organizational in scope. If the primary screeners disagreed, a third screener (CM) adjudicated. Before article screening, the authors completed training to ensure consistency.
Figure 1. PRISMA flow diagram.
After the screening round, 39 articles were selected for full-text eligibility assessment. These articles were retrieved in full and underwent independent analysis by 2 authors (SKS plus MS-S, PC, JWL, CM, TTC, or PHS) to determine study inclusion. Conflicts between reviewers were resolved by discussion. Reasons for exclusion included no evaluation of research metrics (n = 4), subjective metrics only (n = 5), not a peer-reviewed article (n = 2), and not organizational in scope (n = 8). Ultimately, 20 articles were selected for data extraction (Figure 1).
For the 20 included studies, the following information was recorded on a data-charting form: article title, authors, publication year, study objective, study type, target population, sample, data collection method, study duration, location of study, and study limitations. For studies that evaluated a tool or instrument for research capacity evaluation, the following additional data were recorded: name of tool/instrument, whether the tool/instrument was original or adapted, description of the tool, how it was developed, if and how it was validated, number of metrics captured, description of metrics, and how the tool performed. Key takeaways from the data extraction are summarized in Table 1. These data were used to generate an initial list of metrics that were objective, organizational in scale, and relevant to varied research domains. From the included articles, we extracted a set of 42 separate items that formed the first draft of the PACER Tool. Through qualitative content analysis, each of the 42 metrics was grouped into 1 of 8 categories of research capacity:
Bibliometrics
Impact
Ongoing research
Collaboration activities
Funding
Personnel
Education/academics
Recognition
Table 1. Summary of Findings from Data Extraction
Using the Delphi method, we submitted the initial tool to a panel of 31 research leaders (eg, deans, administrators, department chairs) to provide feedback, content expertise, and additional perspectives on the preliminary draft.31 The panel was chosen from among experts known to the BRC Steering Committee and represented various expertise areas, including medicine (n = 21, from family medicine, internal medicine, psychiatry, pain and addiction medicine, and sports medicine), business administration (n = 2), finance (n = 1), research operations (n = 3), and population health (n = 4). The feedback from the Delphi panel was discussed by the authors. When the authors reached unanimous consensus on the necessary changes, a second draft of the PACER Tool was produced and sent to the panel for further comment. The process was repeated a third time. After consensus was achieved by incorporating panelists’ feedback, the final PACER Tool was created.
The Delphi panel reported that the initial tool was too complex and requested simplification. This resulted in the removal of 13 metrics, including internal publications, non–peer-reviewed publications, and book chapters. The “recognition” category was removed after the Delphi panel determined that each of its metrics (eg, internal awards and speaking invitations) was either infeasible or irrelevant. Panel members also noted that the tool needed more metrics reflecting the impact of research; in response, we added “number of citations” and “median h-index” to the PACER Tool. The panel also commented on how each metric was described, which led to clearer descriptions. Finally, the Delphi panel suggested we make clear that organizations should not be expected to track every metric in the PACER Tool simultaneously, as this would be infeasible for most of them.
The final PACER Tool consists of 31 numeric metrics that, when taken as a whole, shed light on domains of research capacity and productivity that are amenable to such analysis (Table 2).
Table 2. Productivity And Capacity Evaluation in Research (PACER) Tool
Discussion
Research metrics are important for academic institutions because they allow evaluation of the productivity and impact of departments, teams, and individual researchers.2,22 By tracking relevant metrics, institutions can identify strengths and weaknesses and allocate resources more effectively. Bibliometric indicators, including citation counts, h-index, and impact factor, have become widely accepted measures of scientific productivity.32,33 However, they do not reflect the quality or validity of the research, and they can be influenced by factors such as the popularity of the research topic, the size of the research community, and the publishing practices of the field.29,34,35 With enough data, each of these metrics could conceivably be normalized by discipline, career stage, and other factors, which could lead to more effective comparisons over time and between institutions.
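As a concrete illustration of two of these bibliometric quantities (not part of the PACER Tool itself), the sketch below computes an h-index and a crudely field-normalized citation count from a list of per-publication citation counts; the function names, the field-mean parameter, and the example numbers are hypothetical.

```python
# Minimal sketch with hypothetical data: h-index and a crude
# discipline-normalized citation count.

def h_index(citations):
    """Largest h such that at least h publications have >= h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

def field_normalized(citations, field_mean_citations):
    """Each paper's citations divided by an assumed mean citations per
    paper in its discipline (a simple form of discipline normalization)."""
    if field_mean_citations <= 0:
        raise ValueError("field_mean_citations must be positive")
    return [c / field_mean_citations for c in citations]

pubs = [45, 22, 10, 8, 3, 1, 0]      # hypothetical citation counts
print(h_index(pubs))                  # 4
print(field_normalized(pubs, 12.0))   # relative citation impact per paper
```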
Quantifying research capacity through measurements like bibliometrics or external funding often requires contextualization, which demands the collection of additional data.36 To assess whether any such data would be useful, we must be able to evaluate their effectiveness in measuring excellence of scientific output.25 Such an evaluation can seem circular, however, because it requires a prior definition of what constitutes excellence. Given the numerous possible metrics and the complex parameter landscape, it is worthwhile to define a priori what, at a minimum, may render a metric practical. In response to this, Kreiman and Maunsell29 posited that useful research metrics should possess the following characteristics:
Quantitative
Based on robust data
Based on data that are rapidly updated and retrospective
Presented with distributions and CIs
Normalized by number of contributors
Normalized by discipline
Normalized for career stage
Impractical to manipulate
Focused on quality over quantity
These requirements necessitate that multiple metrics be obtained simultaneously. For example, to normalize quantitative bibliometric data by number of contributors or career stage, one would need to compare the data with additional data on the quantity and demographics of researchers. What is called for, then, is not a single metric but a panel of metrics that, taken together, create a reasonably comprehensive picture of an organization’s research productivity and capacity. To normalize research data by discipline, such a panel of metrics would need to be widely adopted across disciplines. The resulting data would also need to be available to researchers so that research productivity could be compared within and across organizations to discover and track trends.
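To make the idea of combining and normalizing metrics concrete, the following sketch uses hypothetical field names and numbers (not the actual PACER data elements) to show how raw publication counts might be adjusted for workforce size and career-stage mix, which is only possible when several metrics are collected together.

```python
# Minimal sketch, assuming hypothetical inputs: total publications,
# research-dedicated FTEs, and researcher counts by career stage.
from dataclasses import dataclass

@dataclass
class DepartmentMetrics:
    publications: int          # peer-reviewed publications this year
    research_ftes: float       # full-time-equivalent research effort
    junior_researchers: int
    senior_researchers: int

def publications_per_fte(m: DepartmentMetrics) -> float:
    """Normalize raw publication output by the size of the research workforce."""
    return m.publications / m.research_ftes if m.research_ftes else 0.0

def senior_fraction(m: DepartmentMetrics) -> float:
    """Proportion of senior researchers, a rough career-stage adjustment."""
    total = m.junior_researchers + m.senior_researchers
    return m.senior_researchers / total if total else 0.0

dept = DepartmentMetrics(publications=84, research_ftes=12.5,
                         junior_researchers=9, senior_researchers=6)
print(publications_per_fte(dept))  # ~6.7 publications per research FTE
print(senior_fraction(dept))       # 0.4
```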
As the scientific landscape continues to evolve, research metrics will continue to have an increasingly important role in shaping the future of scientific research.1,2 A robust research data set could serve multiple purposes, including 1) equipping department chairs and deans with a set of practical measures to monitor research development; 2) allowing third-party organizations to compare research productivity at the organization or network level; and 3) providing researchers with a data set to evaluate the research economy (ie, how scarce resources of funding, personnel, and publications are allocated).2,37 Currently, no widely adopted set of research indicators exists that could serve these purposes.
The PACER Tool was developed to meet the need, identified by our team and supported by our scoping review, for robust and comprehensive research capacity measurement. It provides a system of metrics that can be used to benchmark, monitor, and compare research productivity and capacity in various research settings. In particular, the PACER Tool gives research programs, funders, and researchers themselves a standardized way to benchmark research capacity and productivity, allowing for comparison across programs and within programs over time.
Use of the PACER Tool will enable leaders to form a detailed evaluation of the capacity and productivity of their research enterprise and make evidence-based resourcing decisions for their own organizations. In addition, once such data become widely available, they could be used for benchmarking research enterprises across organizations. Consistent, widespread use of PACER data would allow researchers to find answers to important questions in research capacity development. For example, PACER data could be used to discover the average number of new publications an organization could expect if it were to focus resources on adding more junior researchers or having fewer senior researchers.
Although the PACER Tool provides an array of metrics, it may be infeasible for an organization to obtain all data contained within the tool. Many members of the Delphi panel agreed, with one commenting that “some [measures] might be zero or not adopted, such as patents and [institutional review board] applications.” Another mentioned that using “a select subset of metrics would be best.” In response to this, the individual metrics in the PACER Tool are grouped by category. This allows users to focus on obtaining data in the domains that are most important and/or practical to them and their organizations. For example, a department that is trying to assess whether increased funding leads to increased high-impact publications could monitor aspects of the Bibliometrics, Impact, and Funding categories of the PACER Tool. An organization that is concerned with increasing the proportion of faculty with academic rank may want to focus on the Personnel and Education/academics categories.
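As one hypothetical illustration of the funding-versus-publications example above, a department could align annual totals from the Funding and Bibliometrics/Impact categories and examine their association. The sketch below uses made-up numbers and a simple Pearson correlation (an assumed analysis choice, not part of the PACER Tool); an observed association alone would not establish causation.

```python
# Hypothetical PACER-style annual series for one department.
from statistics import correlation  # Pearson correlation, Python 3.10+

years = [2019, 2020, 2021, 2022, 2023]
funding_usd = [1.2e6, 1.5e6, 1.4e6, 2.0e6, 2.3e6]   # Funding category
high_impact_pubs = [8, 11, 10, 15, 17]              # Bibliometrics/Impact

r = correlation(funding_usd, high_impact_pubs)
print(f"Pearson r between funding and high-impact publications: {r:.2f}")
```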
One limitation of this study is that it may not be applicable to commercial entities or countries with emerging research capacity. All authors and Delphi panel members were from academic departments in the US and Canada, although we sought perspectives from experts in a range of disciplines, including nonmedical fields. In addition, the review identified no non-English studies, which suggests a need for further research to extend these results to departments in non-English-speaking countries.
A limitation of the PACER Tool itself is that it only conveys quantitative data. Many areas of research capacity building such as quality or leadership may be more amenable to qualitative analysis. In addition, the PACER Tool does not assess important indicators that may be more applicable to smaller units (eg, metrics that focus on personal or team growth) or scales larger than a single organization (eg, national policies or journal-level bibliometrics).
The ultimate goal of monitoring metrics such as those contained in the PACER Tool is to facilitate effective research. Organizations can use metrics in the PACER Tool to plot, trend, and compare data to generate a visible “research economy.” The PACER Tool represents a robust, multidimensional set of metrics, but it is important to acknowledge that research assessment is a complex and evolving field. The tool should be viewed as a starting point and may require further refinement and adaptation to specific research contexts. Further contextualization with qualitative data will continue to be important. Ongoing feedback and evaluation from colleagues in multiple disciplines and organizations, as well as ongoing validation and improvement of the metrics, will help ensure the ongoing relevance and usefulness of the PACER Tool.
Conclusion
The PACER Tool offers an adaptable, multifaceted approach for monitoring research performance. By incorporating a diverse set of metrics across multiple domains, it addresses many of the limitations of existing research metrics that focus only on bibliometrics and funding. This will enable organizations to evaluate the productivity and impact of research departments, teams, and individual researchers more effectively.
Acknowledgments
Database searching assistance was provided by a reference librarian affiliated with Louisiana State University Health Sciences Library. The Scientific Publications staff at Mayo Clinic provided editorial consultation, proofreading, and administrative and clerical support. The opinions expressed in this work are those of the author (PC) and do not represent the views of the Department of Defense or the Uniformed Services University of the Health Sciences.
Appendix
Search Strategy
Databases searched: PubMed, Web of Science, Embase, ERIC, CINAHL, and Google Scholar.
- PubMed: ((research) AND (capacity building OR productivity [MeSH Terms])) AND (tool[Title/Abstract] OR indicator[Title/Abstract] OR metric[Title/Abstract])
- Embase: ('research'/exp OR 'research') AND ('capacity building'/exp OR 'capacity building') AND ('tool':ti,ab OR 'indicator':ti,ab OR metric:ti,ab)
- ERIC: ((research) AND (("capacity building")) AND (((faculty)))) AND ((TI tool OR AB tool) OR (TI indicator OR AB indicator) OR (TI metric OR AB metric))
- CINAHL: ((MH "Research+") OR (MH "Publishing+")) AND ("capacity building") AND ((TI tool OR AB tool) OR (TI indicator OR AB indicator) OR (TI metric OR AB metric))
- Google Scholar (via Publish or Perish): research AND medical AND faculty AND capacity AND tool
Notes
This article was externally peer reviewed.
Conflict of interest: The authors report no financial conflicts of interest.
Funding: No funding was received for this study.
- Received for publication February 23, 2024.
- Revision received March 28, 2024.
- Accepted for publication April 1, 2024.