Engaging domestic abuse practitioners and survivors in a review of outcome tools – reflections on differing priorities

Researchers often develop and decide upon the measurement tools for assessing outcomes related to domestic abuse interventions. However, it is known that clients, service providers and researchers have different ideas about the outcomes that should be measured as markers of success. Evidence from non-domestic abuse sectors indicates that engagement of service providers, clients and researchers contributes to more robust research, policy and practice. We reflect on what we have learnt from the engagement of practitioners and domestic abuse survivors in a review of domestic abuse measurement tools where there were clear differences in priorities between survivors, practitioners and researchers about the ideal measurement tools. The purpose of this reflective article is to support the improvement of future outcome measurement from domestic abuse interventions, while ensuring that domestic abuse survivors do not relive trauma because of measurement.


Introduction
Domestic abuse encompasses physical or sexual abuse, violent or threatening behaviour, controlling or coercive behaviour, economic abuse, and psychological and emotional abuse (legislation.gov.uk, 2021). Included within this definition are stalking, harassment, intimidation, humiliation, manipulation, financial control and entrapment (IRISi, n.d.). Domestic abuse between parents or carers is a major risk factor for children, as prolonged exposure to domestic abuse can seriously impact children's development and emotional well-being, and increase their risk of child abuse (Evans et al., 2008; IRISi, n.d.; Regional Child Protection Procedures for West Midlands, n.d.).
While there have been several calls for improvements to outcome measurement in the domestic abuse sector (Howarth et al., 2015), there are various factors that can complicate the measurement of domestic abuse interventions. Most seriously, tools and their administration can cause harm. For example, accurately measuring abuse entails asking people to recall experiences that can cause them to relive traumatic memories, which can be deeply upsetting (Pitt et al., 2020). There is also a wide range of domestic abuse interventions to reduce recurrence and to reduce associated trauma. Interventions can encompass individual-, group- or family-based approaches (Rizo et al., 2011; Stanley and Humphreys, 2017), and they can be delivered by a range of healthcare and social service professionals, who have diverse scopes of practice, training and professional regulations. This diversity of interventions and professionals can prevent consistent implementation of outcome measurement tools (see, for example, Cordis Bright, 2016).
Our research group previously developed a core outcome set (COS) for domestic violence and abuse interventions (DVA-COS) through a two-year consensus process that engaged domestic abuse survivors, practitioners and researchers (Powell et al., 2022b). (We acknowledge that the term 'victim' is in line with the statutory Domestic Abuse Act 2021; however, we chose to use the word 'survivor', as it is our survivor advisory group's preferred term.) A COS is a group of outcomes that stakeholders agree are the most important to be measured in a given healthcare or social service setting (Williamson et al., 2017). Having defined outcomes and measurement tools improves the consistency, effectiveness, relevance and safety of outcome measurement, which enhances the quality of evidence available to decision makers (Pantaleon, 2019; Tunis et al., 2016). Involving individuals with lived and practice-based experience in research is the best way to understand their experience, and to improve the validity of research results and their relevance to clients (Gargon et al., 2014; Honey et al., 2020; Wiering et al., 2017). (We chose the word 'client' over 'service user' because we felt that it better reflects voluntary engagement with services or interventions, and fits with an ethos of trauma-informed working.) If the outcome, measurement tool or intervention is not relevant or safe for survivors (that is, if it re-triggers trauma), it could prevent them from engaging in the intervention or completing the measurement tool, affecting how representative the results from an intervention are (Gargon et al., 2014).
The aim of this article is to reflect on the lessons learned from engaging survivors, practitioners and researchers of domestic abuse in a consensus-based method to identify practice-based measurement tools which measured the following COS items: child emotional health and well-being; feelings of safety; caregiver emotional health and well-being; family relationships; and freedom to go about daily life. The project was funded by the UK Home Office, and aimed to identify usable measurement tools for child-focused domestic abuse interventions from a practice perspective. The outputs of the project included an internal report for the Home Office and an executive summary published on the Open Science Framework (OSF) server (Powell et al., 2022a). The co-authors of this paper are drawn from the research team, the survivor advisory group and the steering group (which comprised both practitioners and researchers). Between us, we have aimed to present all the perspectives from our consensus process and, as a group, we represent a range of views.

Approach
Our consensus approach began with a rapid review of the literature following Tricco et al.'s (2017) methodology for rapid reviews, and the protocol was preregistered. We carried out a four-stage process to identify and assess outcome measurement tools currently used in practice relevant to the DVA-COS (Powell et al., 2022b), as shown in Figure 1.
The stages of the process were: (1) scope measurement tools currently in use; (2) review those tools that related to the DVA-COS; (3) rate usability and acceptability using four checklists, as shown in Table 1; and (4) run an adapted consensus process with separate workshops for each of the practitioner, survivor and steering advisory groups, where they recommended tools based on the checklists and their professional or survivor expertise. Our survivor advisory group was drawn from the survivor-led charity VOICES (https://voicescharity.org/), who were consulted throughout the original COS study, and who were co-applicants on the funding application for this study. For a full description of the methods, see Powell et al. (2022a).
The three advisory groups were: a survivor advisory group of individuals from the UK with lived experience of domestic abuse; a steering advisory group consisting of international researchers and practitioners within the field of domestic abuse; and a practitioner advisory group, consisting of practitioners working for domestic abuse organisations internationally (see Acknowledgements for further details).

Findings
From our scoping review during Stage 1, we found 163 tools used in domestic abuse practice. Of these, 55 measured at least one of the outcomes from the DVA-COS; we then assessed these against the four checklists (Table 1).

Table 1. Checklists used to assess the usability and acceptability of domestic abuse outcome measurement tools (for further details, see OSF Appendices 1-3; Powell et al., 2021: 3-14)

Adapted version of the COSMIN (COnsensus-based Standards for the selection of health Measurement INstruments) Risk of Bias (RoB) checklist (Mokkink et al., 2018): standardised checklists that originally focused on assessing the methodological quality of studies measuring the reliability, measurement error and feasibility of outcome measurement instruments.

Steering advisory group checklist and practitioner advisory group checklist: developed after consensus workshops with the steering advisory group and the practitioner advisory group, where the most important aspects to include in a domestic abuse outcome measurement tool were discussed.

The tools rated in the top 50 per cent of the practitioner and steering advisory group checklists were then shared in a second consensus round. The practitioner and steering advisory groups reviewed the top-rated tools and shared their recommendations. The tools that had received recommendations or partial recommendations from each group were then shared with the survivor advisory group to understand their perspectives on the tools. Our original aim was for the survivor group to rate the tools at the same time as the practitioner and steering groups. Unfortunately, due to staff illness, the survivor group had to be delayed and, given the short timescale determined by the funders, we were unable to pause the process. We acknowledge that this is a limitation of the consensus process, and we therefore took additional tools to the survivor group for review. Full details can be found in our final report (Powell et al., 2022a).

Only one tool (with two versions) was recommended by both the practitioner advisory and the steering advisory groups: the Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS; Warwick Medical School, 2021), and its short version suitable for children (SWEMWBS; CORC, n.d.). These scales were developed as a measure of mental well-being within the general population (Warwick Medical School, 2021).

In general, throughout the review and consensus process, two types of tools were identified. The first type consisted mainly of domestic abuse-specific tools, developed by domestic abuse organisations, which included service feedback in addition to an individual's outcome-related questions. These tools had not usually been psychometrically validated. The second type, such as the WEMWBS, assessed broader health and well-being, and these tools were more likely to have been psychometrically validated.
However, these latter tools were not developed specifically for domestic abuse, and, therefore, they were often not trauma-informed (Sweeney et al., 2016); that is, they were not designed to be used with individuals who have experienced trauma. These tools were more likely to use language, or to be designed, in a way that could cause participants to relive their trauma.
In summary, we reached a consensus on the applicability of the WEMWBS; however, despite this agreement, the tool was not without criticism by the survivor advisory group, which highlighted wider issues about trauma-informed outcome measurement.

Reflections on engaging survivors and practitioners
Our team's reflections on the engagement in the measurement identification process can be broadly grouped into four domains: the definition of usability criteria; the need for trauma-informed tools; the need to link outcomes to service delivery; and survivor-specific reflections on tool recommendations. What follows is a narrative overview of these domains. Further detail on the advisory group recommendations can be found in OSF Appendix 4 (Powell et al., 2021: 15-19).

Defining 'usability' criteria
Our usability checklists were developed through workshops with the steering and the practitioner advisory groups. Both groups emphasised that domestic abuse-specific tools would need to be usable with diverse families from different backgrounds, and the accessibility of measurement tools was seen as important by survivors, practitioners and the steering advisory group.
Discussions of accessibility centred on language, literacy, age, disability, cultural differences and minoritised groups. With regard to cultural differences, the steering advisory group underlined the importance of cultural validity: whether concepts translate across cultures, rather than merely into other languages; for example, how mental health or well-being is understood in one culture versus another (Huang and Wong, 2014). There is currently no consensus about how best to translate outcome measures for diverse groups (Epstein et al., 2015). The practitioner and survivor advisory groups reflected on the implications of how tools would be used by members of staff (that is, would staff use them appropriately and sensitively?), staff level of training (that is, whether tools required additional trauma-informed training, and how staff could be supported to use them), and how tools could be sensitive to individual family contexts.
The different focus on 'measurability' and use in practice highlighted the science-to-service gap between tools that are research-effective versus those that practitioners find relevant and usable in their direct work with survivors. Practitioners wanted tools to be rated on their design, in particular, the use of a narrative approach to enable the collection of more personal data that can describe individual experiences and contexts, rather than being confined to multiple choice answers. The practitioners also focused on the appropriateness of tools for children, and on the possibility of tools capturing risk of further harm related to domestic abuse. Psychometric robustness (namely, reliability and validity), along with usage of tools in previous randomised controlled trials, and financial costs, were priorities for the steering advisory group.
These key differences meant that it was difficult to find tools that scored highly on both checklists, and to reach consensus on tools that were considered feasible, acceptable and usable. On the one hand, the practitioner advisory group checklist ranked tools highly if they captured narrative data to understand the client's history, in the form of free-text options. Tools that met these criteria would be useful for practitioners to inform their clinical judgement and to better understand progress at an individual level. On the other hand, the steering advisory group checklist ranked tools highly if they had been used in randomised controlled trials, had undergone research studies to determine their validity and reliability, and if they were free to use. Tools that met these criteria would be more beneficial to researchers to use in research studies.

Trauma-informed tools
All three advisory groups discussed the importance of selecting trauma-informed tools, which had been developed with an understanding that the users of the tools may have previously experienced trauma. There was a common and sustained concern that insensitive tools (for example, those that were not developed with the clients in mind) could cause clients to relive the trauma that they had previously encountered.
Two aspects were identified as important indicators of being trauma-informed: the design of the tool, including the wording and order of the questions; and how the tool was administered, including the trauma-informed training that would be needed for practitioners. As an example, individual tool wording was discussed by survivors, and they flagged how tools can use words or phrases that could be upsetting, such as the statement 'I have been feeling useful'. The survivor advisory group explained that this statement could evoke ideas of feeling useless, which, in some cases, paralleled their previous experiences of abuse.
All advisory groups acknowledged the importance of measures capturing the full range of abusive experiences, rather than simply focusing on physical violence, which has been recognised in previous research about domestic abuse measurement tools (Evans et al., 2016). There was a common concern relating to the risk of non-domestic-abuse-specific tools that measured interparental relationships. Survivors and practitioners felt that these could cause harm, if the focus of the tool and subsequent intervention was on reconciling relationships without attending to the dynamics of abuse.
Advisory groups were consistent in their comments about the importance of measure brevity, being clear about the recall time frame (for example, in the last two weeks), and the need for tools to be colourful and visually appealing. There was some debate concerning the use of 'numbered' response scales: survivors discussed that numbers could make them feel anxious, especially when they were anchored by positive or negative response options. Some practitioners, however, felt that it was difficult to try to score and assess change without a numerical scale. Furthermore, practitioners highlighted how inexperienced or poorly trained staff could deliver the tool in an insensitive way, even if wording and numbers were appropriate. Additional design elements discussed by survivors included the preference for a space for clients to add narrative information to explain their scores and stories, and this was seen as a way of mitigating the stressful or reductive impact of numbered scales. The need for space for narrative has been identified in previous research with domestic abuse survivors (Hegarty et al., 1999). The survivor advisory group shared that these narratives could be particularly relevant in cases where a survivor was involved in an ongoing court case or felt the need to explain their scale responses to their service professional, or to better capture progress while receiving services or interventions.
Use of narrative was important to practitioners, too, as part of the therapeutic process; however, this was one of the key aspects that made tools less likely to be reliable or valid, as there were limited psychometric studies on these types of data collection tools. In addition, practitioners in low-resource settings were concerned about the additional burden of including narrative questions, both in terms of possible stress for survivors (when practitioners were unable to follow up) and in terms of the strain on practitioners of having difficult conversations in limited time.

Linking individual outcomes to service delivery
The survivor advisory group discussed the importance of having tools that capture the attribution of change to the service or intervention, for example, by using items such as 'I feel more confident because of the intervention', rather than 'I feel more confident'. The survivor advisory group reflected that it should be made clear to clients that the tools are assessing their outcomes as a result of a domestic abuse service or intervention, rather than assessing the clients themselves. This parallels the discussions among the practitioners: measurement tools may indicate that a survivor has experienced limited positive outcomes, but this may often be related to problems with the service, rather than with the individual. Practitioners indicated that intervention-specific outcome measurement tools were important, as opposed to generalised outcome tools intended to assess any and all types of interventions. Researchers and practitioners were concerned that it can be hard to attribute change to a single intervention (or intervention component) when families are often receiving support from complex, multi-component interventions in multi-agency systems.

Survivor feedback on the Warwick-Edinburgh Mental Wellbeing Scale (WEMWBS)
Although the WEMWBS was the only tool recommended by all three groups, and although survivors appreciated its use of positive language, as well as the clarity and brevity of the tool, there were two criticisms from survivors: the language used within the tool was not trauma-informed, given the use of the (previously discussed) statement 'I have been feeling useful', and the tool lacked a space for narrative responses. This feedback highlights how mainstream tools used to measure outcomes in the general population may not be appropriate for individuals who have experienced domestic abuse and, at worst, may contribute harm to survivors. Were this tool to be widely adopted as a domestic abuse outcome measurement tool, it would be important to work with survivors to change the wording, so that the tool does not cause harm. However, care would have to be taken to avoid changing the wording so much that the previously conducted psychometric tests become invalid and comparison with other studies is no longer possible. In the case of significant changes, new studies assessing the psychometric validity of the new wording should be conducted.

Discussion
Our reflections show a tension between measurement tools that are used within domestic abuse practice, and standardised measurement tools that are used more generally to measure outcomes that happen to overlap with the DVA-COS. This likely contributes to the continued science-to-service gap in outcome measurement, and to the wider challenges of practitioners applying research findings (Casey et al., 2021).
Many of the standardised measurement tools that measure interparental relationships were developed for marriage and couples counselling, rather than for the domestic abuse context, and therefore they do not attend to the dynamics of abuse. This highlights the need for the development of interparental and familial relationship measurement tools within the domestic abuse setting. Furthermore, this underlines that couple interventions such as counselling should only take place following appropriate assessment that ensures that intimate terrorism is not occurring, where one partner is afraid of the other and/or the relationship is characterised by coercive control (VEGA Project, 2016).
To our knowledge, there is no literature on how tools that are used to measure outcomes such as those listed within the DVA-COS have been assessed for how trauma-informed they are. This highlights a gap in the development of trauma-informed domestic abuse outcome measurement tools, and it underscores the need for more research investment in this field. To understand how trauma-informed tools are, survivors of trauma must be involved in the development and assessment of measurement tools, and in how the tools are implemented.

Final reflections from practitioner and survivor co-authors on the findings
Practitioners in domestic abuse services carry large caseloads and, in our experience, demand for services has increased due to the Covid-19 pandemic. Thus, it is important to ensure that core outcome measurement tools are not too time-consuming for practitioners to use and complete. Additionally, while some professionals involved in the domestic abuse sector may be part of regulatory bodies and have research training, most practitioners do not, so standardised tools that can be reliably implemented across professional disciplines are important. Costs must also be considered. Many domestic abuse services are already working with minimal budgets; thus, tools that do not require licences or extensive or expensive staff training, and which are intuitive to use and easily integrated into routine service visits, are more likely to be implemented. Finally, those coming into contact with domestic abuse survivors are not always domestic abuse experts (for example, health professionals), which reiterates the need for trauma-informed tools that minimise the risk of retraumatising survivors through insensitive usage.
A key concern for survivors is how the findings from outcome measurement tools could be used: for instance, if the narrative data collected from tools were accessed by the courts and misused by perpetrators. This could be detrimental to survivors; thus, confidentiality protocols are crucial, so that any results from outcome measurement tools are safely captured. Survivors who are connected to abusive partners through co-parenting or ongoing family court procedures may be reluctant to take part in a domestic abuse intervention, or to use the associated tools, if they fear that information they share could be used against them, especially if they have previously suffered secondary abuse via the judicial, health, education and social systems. Staff training needs to provide a full and detailed understanding of trauma-informed outcome collection and information sharing.

Conclusions
Based on our reflections from the advisory groups' input, there are implications that should be considered for the use of domestic abuse outcome measurement tools in practice. Survivors and practitioners should be included in the development of all tools to ensure that the tools are usable, not harmful to survivors, and not just developed for researcher priorities. Organisations that are currently using domestic abuse outcome measurement tools need to review their tools to determine if any, and how many, are not fit for purpose, and to amend the tools or seek better tools for practice. The reflections on the process and comments from this research project provide us with future directions for the redesign and development of outcome measurement tools that are acceptable for all interventions related to trauma survivors.