A systematic review of pedagogical interventions on the learning of historical literacy in schools

Over the past thirty years, there has been a growing body of research investigating the efficacy of pedagogical interventions to enhance the historical literacy skills of primary and secondary school students. However, there exists no systematic review or meta-analysis summarising the impact of such research or the efficacy of interventions trialled. The purpose of this systematic review is to identify pedagogies that have a demonstrable effect on students' historical literacy skills.


Introduction
Over the past thirty years, there has been a growing body of research investigating the efficacy of pedagogical interventions to enhance the historical literacy skills of primary and secondary school students. Non-systematic explorations of effective historical classroom pedagogies exist. For example, Nokes and De La Paz (2023: 350) investigated historians' reading heuristics and procedures, and noted the difficulty of fostering students' historical argumentation, especially when 'processes are contingent upon fragile and emerging understandings of the nature of history as a discipline'. Luís and Rapanta (2020) conducted a thorough review into how historical reasoning competence has been operationalised in history education empirical research. They found a clear predominance of studies focusing on content knowledge acquisition skills, together with a lack of empirical research investigating the full suite of historical reasoning competence skills (Luís and Rapanta, 2020: 10-11). However, there exists no systematic review or meta-analysis summarising the impact of such empirical research or the efficacy of interventions trialled. The purpose of this systematic review is to identify pedagogies that have a demonstrable effect on students' historical literacy skills, with a particular interest in those pedagogies that have a measurable positive effect on historical epistemological knowledge and skills. We are also interested in collating information about pedagogies that can be transferred from experimental conditions to a mixed-ability primary or secondary classroom. The research question that guided the review was: What is the relationship between pedagogical strategies and improved historical literacy in primary and secondary school children?

Historical literacy: what does it mean?
There is great diversity in the terminology used to describe historical literacy. Terminology ranges from basic historical recall and narration, to more sophisticated explanation, analysis and evaluation. We provide characteristic features of the language used to describe historical literacy, grouped into two strands: (1) historical content knowledge; and (2) historical epistemological knowledge and skills.
(1) Historical content knowledge
Historical content knowledge may be referred to as factual, historical or objective knowledge. The acquisition of historical content knowledge is typically demonstrated through description, narration, factual recount or recall in response to comprehension-style questions. This type of knowledge acquisition may be referred to as concrete or lower-order thinking, because the internalisation of historical knowledge maps to what Krathwohl's (2002) Taxonomy of Educational Objectives refers to as Remembering (recognising and recalling) and Understanding (determining the meaning of instructional messages, especially classifying and summarising).¹ Researchers refer to the attainment of historical content knowledge through processes of: memorisation (Aidinopoulou and Sampson, 2017); use of historical vocabulary, sequencing events and periods, and identifying characteristic features (de Groot-Reuvekamp et al., 2018); or, for example, as the successful acquisition of background knowledge on a given historical topic (Wissinger et al., 2021).
History Education Research Journal https://doi.org/10.14324/HERJ.20.1.09
(2) Historical epistemological knowledge and skills
Historical epistemological knowledge and skills can be divided into two main categories: deconstruction of source material, and reconstruction of historical narrative or argument. We drew on a selection of studies included in this systematic review (Ariës et al., 2015; Bertram et al., 2017; De La Paz and Felton, 2010) to define the elements of source deconstruction and historical narrative/argument reconstruction:

• Deconstruction: critical analysis of historical sources to ascertain context, audience, message; purpose of source creation and perspective represented; techniques used to communicate the message, purpose and perspective of a historical source.
• Reconstruction: interpretation, reasoning and explanation of historical evidence; analysis and synthesis of evidence and historical argument; analysis and reasoning leading to a judgement expressed as an assessment of value or an evaluation based on criteria.
The development of historical epistemological knowledge and skills is often referred to as abstract or higher-order thinking, because the demonstration of these skills and knowledge maps to what Krathwohl's (2002) Taxonomy of Educational Objectives refers to as Applying (executing or implementing), Analysing (identifying constituent parts and detecting how parts relate to one another), Evaluating (making judgements) or Creating (combining elements to make an original product).
Most pedagogical empirical studies in secondary and primary school history contexts focus on an intervention to improve some aspect of students' historical literacy in epistemological knowledge and skills, possibly because the demonstration of abstract thinking is a key feature of academic achievement at the highest levels.
Our systematic review seeks to demonstrate the efficacy of historical literacy pedagogical interventions in improving student historical content knowledge and/or historical epistemological knowledge and skills. Below, we provide a detailed overview of all steps and processes taken in this systematic review, including detailed notations on reasons for study inclusion and exclusion, and the methods of critically appraising included studies. Our review is informed by a clear theory of change regarding historical literacy education, and the synthesis of data appraises pedagogical interventions in terms of feasibility, replicability, extent of academic gain, and explicit or implicit moderating variables affecting intervention results. Implications of findings are discussed, and recommendations for further research are suggested.

Methods
A systematic literature search was conducted using electronic databases (PsycINFO, ERIC, Academic Search Premier, Education Research Complete, Humanities International Complete) from 1990 to February 2023. The search protocol guided the database search (see Appendix A). Included studies reported on at least one of two primary outcomes (PO): PO(i) historical recount or historical description or historical narrative; and PO(ii) historical explanation or historical interpretation or historical judgement, in combination with accurate historical knowledge. The search was limited to peer-reviewed journal articles written in English.
Search terms included the use of three broad categories: (1) historical content knowledge; (2) historical epistemological knowledge and skills; and (3) educational context. Searches used the following terms:

Study selection, data collection process and data items
Included studies took place with school-aged children (about 5-18 years old), were delivered in a school classroom to the whole class group (either by the classroom teacher or by a guest instructor) and in a regular school. We selected studies conducted under these conditions because we were interested in identifying historical literacy pedagogical strategies that could be scaled up for large cohorts, and potentially delivered state-wide. Studies were selected if the pedagogical intervention was conducted over a sustained period (one week or more, with a minimum of three sequential lessons), and if they were empirical, reported on at least one primary outcome and investigated the use of a historical literacy pedagogical strategy.
Studies were excluded if they were delivered at university or outside a regular school classroom (for example, in a museum or to a withdrawn group of students), or if instruction was provided as a one-off lesson or presentation. Furthermore, studies were excluded if they were not empirical, they did not report on at least one primary outcome, or there was no investigation of a historical literacy pedagogical intervention.
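The inclusion and exclusion criteria above can be read as a simple screening rule. The following is a minimal sketch only, not the procedure the reviewers used: the `StudyRecord` fields, the reason strings and the example values are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StudyRecord:
    """Hypothetical screening record; field names are illustrative assumptions."""
    school_aged_participants: bool       # about 5-18 years old
    whole_class_in_regular_school: bool  # not university, museum or withdrawn group
    duration_weeks: float                # length of the intervention
    sequential_lessons: int              # number of sequential lessons delivered
    is_empirical: bool
    reports_primary_outcome: bool        # PO(i) or PO(ii)
    tests_history_pedagogy: bool         # investigates a historical literacy pedagogy


def exclusion_reasons(study: StudyRecord) -> list[str]:
    """Return every criterion the study fails; an empty list means eligible."""
    reasons = []
    if not study.school_aged_participants:
        reasons.append("participants not school-aged")
    if not study.whole_class_in_regular_school:
        reasons.append("not delivered to a whole class in a regular school")
    # Sustained period: one week or more, with at least three sequential lessons.
    if study.duration_weeks < 1 or study.sequential_lessons < 3:
        reasons.append("not a sustained intervention")
    if not study.is_empirical:
        reasons.append("not empirical")
    if not study.reports_primary_outcome:
        reasons.append("no primary outcome reported")
    if not study.tests_history_pedagogy:
        reasons.append("no historical literacy pedagogy investigated")
    return reasons


def is_eligible(study: StudyRecord) -> bool:
    return not exclusion_reasons(study)


# Made-up example: a one-off museum presentation fails two criteria.
one_off = StudyRecord(True, False, 0.2, 1, True, True, True)
print(exclusion_reasons(one_off))
```

Returning the list of failed criteria, rather than a bare yes/no, mirrors the practice described in this review of recording a reason for each exclusion.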
Three researchers (R1, R2 and R3) worked on the selection of studies. R1 searched databases for relevant literature. Following the removal of duplicates, R1 and R2 completed a full title sweep of the 13,050 articles independently. If the researcher was unsure whether to include an article based on title, the abstract was consulted; if still unsure, the article was included. In total, 173 articles were agreed upon as meeting the inclusion criteria at this point, and proceeded for full text review. R2 and R3 completed reads of full articles independently, and mutually agreed on the exclusion of 119 articles. The remaining 54 articles were identified for possible inclusion and assessed for eligibility. R2 completed detailed reads of each, and recorded reasons for any study being excluded in this phase. This resulted in a further 33 studies being excluded, and allowed 21 studies to be included in this review. See Figure 1 for an overview of the study selection process.
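The counts reported in the selection process can be checked with simple arithmetic. The sketch below uses only the figures stated above; the function and key names are hypothetical and are not taken from the PRISMA template.

```python
def screening_flow(full_text_reviewed: int,
                   excluded_at_full_text: int,
                   excluded_at_detailed_read: int) -> dict:
    """Trace article counts through the two full-text exclusion stages."""
    assessed_for_eligibility = full_text_reviewed - excluded_at_full_text
    included = assessed_for_eligibility - excluded_at_detailed_read
    return {
        "full_text_reviewed": full_text_reviewed,
        "assessed_for_eligibility": assessed_for_eligibility,
        "included": included,
    }


# Figures reported in the text: 173 full-text articles, 119 then 33 excluded.
flow = screening_flow(173, 119, 33)
print(flow["assessed_for_eligibility"])  # 54
print(flow["included"])                  # 21
```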
R3 extracted data from the 21 included studies, and R2 checked a 20 per cent sample for accuracy; no anomalies were found. The data extraction included: inclusion criteria; evidence hierarchy; aim/objective or focus of study; research question(s); study design; recruitment; participants; study setting; pedagogical strategy/intervention; and findings.
Once this data extraction had occurred, the fourth researcher (R4) was consulted on whether a meta-analysis of the included studies was possible and/or appropriate. R4 entered the statistical data from all 21 studies into the Comprehensive Meta Analysis software (version 4.1). Heterogeneity was assessed across all the studies using a series of complementary statistical analyses, such as the Q statistic, inconsistency index (I²), Tau statistic (T²) and prediction interval. R4 determined that, given the significant heterogeneity present in the included studies and the mix of comparisons of different pedagogical interventions with different outcome variables in the primary studies, it would be inappropriate to combine all included studies in a single meta-analysis. Furthermore, a meta-analysis of studies that present risk of bias is likely to generate an effect size that is misleading.
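To illustrate what the heterogeneity statistics named above measure, the following sketch computes Cochran's Q, I² and a between-study variance estimate from per-study effect sizes and variances. It is an illustration under stated assumptions, not a reproduction of the Comprehensive Meta Analysis software: the DerSimonian-Laird estimator for T² is our assumption, the prediction interval is omitted, and the effect sizes are invented, not data from the included studies.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, I-squared (%) and DerSimonian-Laird tau-squared.

    effects:   per-study effect sizes (hypothetical values in the example)
    variances: per-study sampling variances of those effect sizes
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need matching lists with at least two studies")

    weights = [1.0 / v for v in variances]  # inverse-variance weights
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w

    # Q: weighted squared deviations of each study from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I-squared: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of between-study variance (assumed estimator).
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    return {"Q": q, "df": df, "I2": i2, "tau2": tau2}


# Invented effect sizes for three studies with equal sampling variance.
stats = heterogeneity([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
print(stats)
```

With these invented inputs, Q is about 8 on 2 degrees of freedom and I² is about 75 per cent, a level usually read as substantial heterogeneity; values of that order are the kind of evidence that argues against pooling studies into a single estimate.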

Critical appraisal
Two critical appraisal models suitable for evaluating qualitative studies were consulted. Our Critical Appraisal and Weight of Evidence (WoE) template (see Appendix B) was designed with reference to the Critical Appraisal Skills Programme (CASP) template and the Cochrane Effective Practice and Organisation of Care (EPOC) protocol and template. These templates have been used in similar qualitative systematic reviews (see Pino and Mortari, 2014; Sterman et al., 2016).
The Critical Appraisal and WoE template was completed independently by R2 and R3 for every included study. Each study was assessed for internal methodological coherence, with criteria including: clarity of research question or aim; study and sample design; setting, participants and recruitment; data collection and analysis procedures; traceability of research processes; and inclusion of researcher background or orientations. Implicit reference to sample design was accepted. Criteria were scored on a yes (1 point) or no/cannot tell (0 points) measure. The numerical score for internal methodological coherence was mapped to a 5-point value scale: high (9-10), high-medium (7-8), medium (5-6), medium-low (3-4) and low (1-2).
Each study was further assessed for its relevance to the review question, with criteria including: description of pedagogical intervention (detailed = 2 points; general = 1 point; no description = 0 points); a defined historical literacy skill (yes = 1 point; no = 0 points); and primary outcomes reported (PO(ii) = 2 points; PO(i) = 1 point). A detailed description of the pedagogical intervention was one that could be replicated by an expert teacher practitioner (R2, an experienced secondary school history teacher and university history education lecturer, made the final judgement on detailed versus general description of the pedagogical intervention). The numerical score for the relevance of the studies to the review question was mapped to a 3-point value scale: high (5), medium (3-4) and low (1-2).
Results for the internal methodological coherence and relevance to review question appraisal were combined to report a WoE 5-point value scale: high (13-15), high-medium (10-12), medium (7-9), medium-low (4-6) and low (1-3). R2 compared the completed templates for each study, and resolved any differences via direct reference to the study. Where relevant, page references were noted in support of the resolution. Results of the Critical Appraisal and WoE are provided in Table 1.
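The scoring scheme described above can be expressed compactly. The sketch below is an illustration of the point values and bands stated in the text, not the template itself: the function names, the `"none"` key and the `"unbanded"` label for a score of 0 (for which the text gives no band) are our assumptions.

```python
# (label, lowest score, highest score), as stated in the text.
COHERENCE_BANDS = [("high", 9, 10), ("high-medium", 7, 8), ("medium", 5, 6),
                   ("medium-low", 3, 4), ("low", 1, 2)]
RELEVANCE_BANDS = [("high", 5, 5), ("medium", 3, 4), ("low", 1, 2)]
WOE_BANDS = [("high", 13, 15), ("high-medium", 10, 12), ("medium", 7, 9),
             ("medium-low", 4, 6), ("low", 1, 3)]


def band(score, bands):
    """Map a numerical score to its value-scale label."""
    for label, lowest, highest in bands:
        if lowest <= score <= highest:
            return label
    return "unbanded"  # assumption: the text defines no band for a score of 0


def relevance_score(description, skill_defined, primary_outcome):
    """Relevance to the review question, scored out of 5."""
    description_points = {"detailed": 2, "general": 1, "none": 0}[description]
    outcome_points = {"PO(ii)": 2, "PO(i)": 1}[primary_outcome]
    return description_points + (1 if skill_defined else 0) + outcome_points


def weight_of_evidence(coherence_score, description, skill_defined, primary_outcome):
    """Combine coherence (out of 10) and relevance (out of 5) into WoE (out of 15)."""
    relevance = relevance_score(description, skill_defined, primary_outcome)
    total = coherence_score + relevance
    return {
        "coherence": band(coherence_score, COHERENCE_BANDS),
        "relevance": band(relevance, RELEVANCE_BANDS),
        "woe_score": total,
        "woe": band(total, WOE_BANDS),
    }


# Hypothetical study: 9 coherence criteria met, detailed description,
# a defined historical literacy skill, and PO(ii) reported.
example = weight_of_evidence(9, "detailed", True, "PO(ii)")
print(example["woe_score"], example["woe"])  # 14 high
```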

Synthesis methods
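Informing our review was a clear theory of change regarding historical literacy education. We hypothesised that when a discrete historical literacy pedagogical intervention was implemented in a primary or secondary school classroom, students would learn a discrete history thinking skill. The acquisition of the discrete history thinking skill would: (1) be apparent in students' reuse of the learnt skill in other similar circumstances; and (2) lead to improved historical thinking, evident in an academic gain demonstrated through a specified measurement tool (for example, an essay or test) with no detrimental effect on students' acquisition of required historical content knowledge. Our synthesis of data follows Weiss's (1997) theory-based evaluation method, whereby we set out to appraise the: nature of historical literacy pedagogical interventions in terms of being fit for purpose; steps taken in implementations of pedagogical interventions, and the feasibility and replicability of those interventions; nature and extent of academic gains linked to interventions; and explicit or implicit moderating variables that may affect intervention results.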

Results of systematic review
Studies are referred to in this section by number assigned according to alphabetical ranking. See Table 1 for number ranking (#) and author(s). Of the 21 studies reviewed, 4 were conducted in primary school contexts (#1, 5, 10 and 21), and 17 were conducted in high school contexts (#2-4, 6-9 and 11-20).

Nature of historical literacy pedagogical interventions in terms of being fit for purpose
Historical literacy pedagogies that are fit for purpose are those strategies that intentionally target discrete historical skills and knowledge acquisition. These strategies are typically scaffolded with explicit linkage to abstract skills required to think historically (for example, inferencing, interpretation, reasoning and judgement). All studies in this review with a high weight of evidence (WoE, 13-15) trialled pedagogies that guided students to think historically. Reisman (2012, #17) trialled a document-based approach offered by the Stanford History Education Group called Reading Like a Historian. In this approach, students used 'background knowledge to interrogate, and then reconcile, the historical accounts in multiple texts' (Reisman, 2012: 89), thus drawing on the skills of inferencing, reasoning and evaluation. Reisman reports gains in students' historical thinking. Similarly, Bertram et al. (2017, #4) found that their intervention groups demonstrated clearer understanding of the genre of oral sources, and were more likely to understand the constructed nature of historical recounts after engaging in 'oral history interviews in either active (live) or passive (video, text) ways' (Bertram et al., 2017: 453).
Studies in which Daniel R. Wissinger or Susan De La Paz have been involved (#6, 7, 8, 9, 20 and 21) typically trial discrete, often mnemonic, historical thinking and writing scaffolds.² For example: IREAD³ for historical reading and annotations (see #8); I3C⁴ for source analysis (see #21); H2W⁵, STOP, DARE⁶ or PROVE IT!⁷ for writing argumentative essays (see #8, 9, 20 and 21); a historical reasoning strategy graphic organiser⁸ (see #6 and 9); a Model of Domain Learning (MDL) framework for domain-specific content and pedagogical strategies (see #18); and heuristics⁹ (see #7) to support disciplinary approaches to reading historical documents. Disciplinary approaches specific to history include perspective recognition, contextualisation of source material, corroboration of evidence and substantiation of argument. Authors of these studies report gains in students' capacity to make enhanced claims (#6, 7, 8 and 18), compelling rebuttals (#6, 7 and 20) and persuasive arguments (#9), with greater substantiation of evidence (#20 and 21), and to identify perspective and the influence of context (#21).
Heuristics frequently feature in studies trialling pedagogical strategies to improve historical literacy (see also #13 and 16). As Sam Wineburg (1999: 491) has argued, 'historical thinking, in its deepest forms, is neither a natural process nor something that springs automatically from psychological development'; hence, historical thinking requires learned and discipline-specific strategies. Nokes et al.'s (2007, #16) study intervenes with explicit instruction on the heuristics of contextualisation, providing students with opportunities to 'infer about the social and political context' (Nokes et al., 2007: 497) of sources through practice with multiple texts. They found in post-test data that students from the experimental group scored significantly higher on sourcing than all other groups. Success with historical contextualisation was also reported by Huijgen et al. (2018, #13), whose experimental group demonstrated gains after participating in a pedagogical intervention that created cognitive incongruity to scaffold students' understanding of the importance of contextualisation and the dangers of presentism.
Not all pedagogical interventions trialled in this systematic review were fit for the purpose of improving students' historical literacy (#1, 2 and 3). The focus of Aidinopoulou and Sampson's (2017, #1) intervention was to compare the use of classroom time for student-centred activities between the flipped classroom model and the traditional classroom. While results reported gains for the flipped classroom model in historical thinking skills (HTS), the gain was rather in additional time for learning tasks 'such as collaborative activities and debates' that might 'cultivate' HTS (Aidinopoulou and Sampson, 2017: 242). Furthermore, a lack of data regarding what students did to demonstrate their understanding, analysis and interpretation of historical sources meant that we could not draw credible conclusions about the efficacy of the flipped classroom model for improvement in students' historical literacy skills. Similarly, the study by Ariës et al. (2015, #2) had insufficient information on the historical content of lesson activities and question types to enable us to make a credible judgement as to reported gains in historical reasoning. In this study (#2), identifying targeted discrete historical skills and knowledge acquisition was secondary to the meta-cognitive working memory training intervention. Historical literacy was also a secondary concern in Azor et al.'s (2020, #3) study, with their focus being firmly on the use of YouTube audiovisual documentaries for teaching history, and the effects on interest and achievement between genders. Moreover, there was insufficient detail provided on the pedagogy used with YouTube documentaries for us to make an assessment of the intervention's effectiveness (that is, fitness for purpose) for developing historical literacy.

Steps taken in implementations of pedagogical interventions, and feasibility and replicability of historical literacy pedagogical interventions
The theory of change underpinning our systematic review of research into historical literacy education hypothesised that when a historical literacy pedagogical intervention was implemented in a primary or secondary school classroom, students would learn a discrete history thinking skill. Appraising the steps taken in the implementation of the intervention allows judgement as to the feasibility and replicability of the pedagogy. Approximately half of the studies in this review provided insufficient content and procedural detail on the pedagogical intervention to enable replication (#1, 2, 3, 5, 10, 11, 12, 14 and 15). However, even if sufficient detail about the steps taken to implement the intervention were available, a few studies would still not be feasible to replicate, owing to the appropriateness or complexity of the teaching and learning material required by the intervention. Studies falling into this category include #2, 3 and 12. Azor et al.'s (2020, #3) YouTube audiovisual documentaries intervention would not be appropriate to replicate, because the use of YouTube documentaries is not a historical literacy pedagogy in and of itself. Study #2 is not feasible to replicate because the working memory training tool is highly specialised and atypical of the skillset of a history schoolteacher; the authors explicitly note this limitation to their study (Ariës et al., 2015). Finally, in Study #12, as the authors (Fontana et al., 2007) observe, the intervention requires a sophisticated understanding of language to create the trialled mnemonic teaching and learning resource.
All studies in this review with a high weight of evidence (WoE, 13-15) provided sufficient detail for an expert practitioner to replicate the pedagogical intervention trialled. Most of these pedagogies followed a pattern of: (1) familiarisation with historical content; (2) explicit instruction; (3) expert modelling; (4) scaffolded learning activity; and (5) communication of learning (see #4, 6, 7, 8, 9, 13, 16, 17, 18, 20 and 21). The scaffolded learning activities targeted the development of historical epistemological knowledge and skills, and provided strategies for students to critique historical sources of information.

Nature and extent of academic gains linked to interventions, and explicit or implicit moderating variables that may affect intervention results
Some claims of benefit to historical thinking from interventions trialled in this systematic review had insufficient data or quality controls to fully assess the credibility of conclusions (#1, 2, 3, 5, 10, 11, 12, 14, 15 and 19). Studies #1, 2, 5, 11, 14 and 15 reported improvements in students' historical epistemological knowledge and skills, with no detrimental effect on historical content knowledge acquisition. However, these studies are limited by the lack of identified historical literacy skills tested and reported on. While Studies #1, 2, 11, 14 and 15 identify reasoning or understanding historical sources, analysis or interpretation as the historical thinking skills developed during the intervention, there are no details provided as to the questions asked in the test instrument, nor is there sufficient description of the historical thinking lesson activities to enable external judgement as to the validity of their findings. Brugar (2016, #5) provides more detail in terms of historical thinking lesson activities; however, the qualitative data reported are at risk of single-coder bias. Further, statements of claim for students in the experimental condition showing the ability to draw inferences and make evaluations are not supported with reference to data items.
The extent of academic gains reported in Studies #10, 12 and 19 is also open to interrogation. Van Straaten et al. (2019, #19) reported gains in students' perceptions of subject relevance, with no underperformance in knowledge acquisition; however, their results are threatened by lack of quality controls reported for two measurement items (pedagogical questionnaire, and content knowledge post-test). De Groot-Reuvekamp et al. (2018, #10) reported gains in understanding historical time, with significant gains reported for Grade 5 participants; however, six teachers in the experimental condition spent on average an additional 24 minutes per week teaching history by providing students with both the intervention lesson and the traditional lesson. This treatment fidelity check was reported, but not factored into the analysis of intervention outcomes, therefore threatening the validity of assertions of academic gain due to the intervention pedagogy. Fontana et al. (2007, #12) reported significant gains for English as a second language students for their mnemonic strategy intervention but 'no condition-specific performance differences' (Fontana et al., 2007: 352) overall in the post-test, suggesting that the intervention strategy has limited benefit for the development of historical literacy in general or mixed-ability history classrooms.

Discussion
Our findings reported on the nature of historical literacy pedagogical interventions in terms of being fit for purpose, feasibility and replicability of pedagogy, academic gains linked to intervention, and explicit or implicit moderating variables that may affect intervention results. Findings of this review indicate that when a discrete historical skill or knowledge is targeted by a pedagogical intervention that utilises a scaffolded heuristic targeting explicit historical thinking skills, there is greater likelihood of positive outcomes for students learning historical literacy skills.
Studies in this review with a high weight of evidence (WoE, 13-15) demonstrated the most convincing and credible academic gains resulting from the pedagogy trialled. Studies in the 13-15 WoE category (#4, 6, 7, 8, 9, 13, 16, 17, 18, 20 and 21) were similar in terms of the instructional pattern adopted: teaching began with either teacher-directed or student familiarisation with historical content, sometimes combined with explicit teacher instruction; expert modelling followed; students next engaged in a scaffolded learning activity designed to improve their historical epistemological knowledge and skills; and the final step was a communication of findings drawn from the scaffolded learning activity. This finding is in line with Luís and Rapanta's (2020: 10) conclusion confirming the 'importance of a disciplinary approach to history teaching, one inspired by the use of empiricist historical thinking methods'. Our findings demonstrate how the more effective historical literacy pedagogical interventions work in terms of instructional patterning. They are consistent with a growing body of research demonstrating the effectiveness of strategies such as the cognitive apprenticeship model, which includes 'explicit instruction, teacher modeling, opportunities for whole class and small group discussions, collaborative planning, and repeated practice with faded support [to] improve students' ability to produce written evidence-based historical argumentation' (Nokes and De La Paz, 2023: 357). These more effective pedagogies assist students to interpret, reason and explain historical evidence, analyse and synthesise evidence and historical argument, and provide a judgement expressed as an assessment of value or as an evaluation based on criteria.
The nature and appearance of scaffolded heuristics employed by the high weight of evidence studies reporting credible findings are targeted towards the deconstruction of historical sources, and thus assist students to identify elements such as: the context, audience or message of the source; the purpose of source creation and perspective represented; and techniques used to communicate the message, purpose and perspective of the historical source (see #4, 6, 7, 8, 9, 13, 16, 17, 18, 20 and 21). In addition, studies reporting credible findings include a scaffolded heuristic to develop students' communication of their epistemological knowledge and skills via reconstruction of historical source material through interpretation, reasoning, explanation, analysis and/or synthesis of evidence to construct historical arguments.
Our synthesis of studies in this systematic review has demonstrated how effective historical literacy pedagogical interventions work. The common feature of pedagogies trialled in the high weight of evidence studies is a scaffolded heuristic; hence, we have concluded that these interventions work because they provide students with explicit and discipline-specific step-by-step guidance on how to deconstruct and reconstruct historical sources and communicate results of findings. Ten of the eleven high weight of evidence studies were conducted in secondary schools (#4, 6, 7, 8, 9, 13, 16, 17, 18 and 20), with only one study trialled in a primary school context, with Year 4, 5 and 6 students (#21). Based on the preponderance of studies trialled in secondary school contexts, we assume that high school provides the most appropriate context to trial pedagogical interventions designed to improve students' epistemological historical knowledge and skills; however, we acknowledge that the assumption is speculative.
A significant limitation of this systematic review is the inability to pinpoint which scaffolded heuristic is the most effective among the high weight of evidence studies. The considerable heterogeneity of the included studies, and the diversity in the comparisons being made by the primary studies, made it inappropriate to combine all included studies in a single meta-analysis. Given the mix of comparisons of different pedagogical interventions with different outcome variables, we are limited to describing key characteristics of interventions, rather than identifying the most impactful intervention for the purposes of improving historical literacy skills among primary and secondary school students. We further note that our study findings may be limited in their broader application, given that a high number of empirical studies included in this systematic review were conducted in the United States.

Figure 1. PRISMA flowchart of identified studies

Table 1. Description of studies and key findings (n = 19)
Columns: #; Authors, year/country; Participants; Method of data collection; Method of data analysis; Quality control; Evidence hierarchy (EH) and WoE
*WoE maximum score = 15
WoE: medium-high (10)*

Critical Appraisal and Weight of Evidence (WoE) tool
Quasi-experimental comparison group studies
EH.6 Case study report
EH.7 Expert reviews
EH.8 Other school-based report
A 'Can't tell' judgement must have an explanatory note provided.
10. Inclusion of enough information on researchers' orientations/background. Check for the use of quality control measures, e.g., attention to the effects of the researcher during all steps of the research process, information on the researcher's background, education, perspective or relationship to study site.

study for the Review question
Review question: What is the relationship between pedagogical interventions and improved historical literacy in primary and secondary school children?
Detailed = e.g., scaffolds identified and described, a lesson-by-lesson recount provided, lesson recount provides teaching sequences, pedagogical strategy/intervention could be replicated (by an expert teacher practitioner) based on the description provided.
General = e.g., reference to a scaffold may be made but scaffold is not clearly described, an overview of a sequence of lessons may be provided (omitting lesson-by-lesson detail), lesson recount does NOT provide teaching sequences, pedagogical strategy/intervention could NOT or most likely could not be replicated from the description provided.