Research Aim 3: Generalisability of Population Data
Research has shown that the strength of evidence is sensitive to the delimitation of the background population in FSC cases (e.g. using New Zealand English for an SSBE speaker; Hughes in progress). However, it is not known to what extent this sensitivity persists when background populations are defined narrowly (e.g. Kirklees, Bradford, Wakefield speakers) rather than more generally (e.g. West Yorkshire speakers). Given that collecting a database is extremely time-consuming and costly, it would be ideal if background populations could be generalised and collected at a more macro level. For this reason, WYRED intends to evaluate the generalisability of populations, and to identify whether narrowly defined population groups are needed for the majority of FSC cases or whether more broadly defined populations are sufficient.
To consider the generalisability of background population data, the population statistics collected in Aim 2 will be used to test how variable the strength of evidence would be in simulated FSC cases. Likelihood ratios (LRs) will be calculated, along with confidence intervals, equal error rates (EER), and the log-likelihood-ratio cost (Cllr), for simulated cases in which the regional background population is varied (Kirklees, Bradford, Wakefield) while the regional suspect and criminal samples are held constant (one of Kirklees, Bradford, or Wakefield; the method will follow that of Hughes in progress). Analysis will then be carried out with regional groups combined into broader background populations (grouping the regions found to be most similar in Aim 2), and the sensitivity of the strength of evidence to this grouping will be assessed.
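As an illustration of two of the validity measures named above, the following is a minimal sketch of how the log-likelihood-ratio cost and an approximate equal error rate could be computed from a set of LRs. It is not the project's analysis pipeline: the function names and the threshold-sweep approximation of the EER are assumptions made for the example, and the LR values in the usage lines are invented purely to show the call pattern.

```python
import math


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost for two lists of (non-log) LRs.

    Same-speaker comparisons are penalised for low LRs and
    different-speaker comparisons for high LRs. A system that always
    outputs LR = 1 scores exactly 1; lower values are better.
    """
    ss_penalty = sum(math.log2(1 + 1 / lr) for lr in same_speaker_lrs)
    ds_penalty = sum(math.log2(1 + lr) for lr in different_speaker_lrs)
    return 0.5 * (ss_penalty / len(same_speaker_lrs)
                  + ds_penalty / len(different_speaker_lrs))


def equal_error_rate(same_speaker_lrs, different_speaker_lrs):
    """Approximate EER by sweeping a threshold over the observed LRs.

    Assumption for this sketch: the EER is taken as the mean of the miss
    rate and false-alarm rate at the threshold where the two are closest.
    """
    thresholds = sorted(set(same_speaker_lrs) | set(different_speaker_lrs))
    best_gap, best_rate = None, None
    for t in thresholds:
        # Miss: a same-speaker pair scoring below the threshold.
        miss = sum(lr < t for lr in same_speaker_lrs) / len(same_speaker_lrs)
        # False alarm: a different-speaker pair scoring at or above it.
        false_alarm = (sum(lr >= t for lr in different_speaker_lrs)
                       / len(different_speaker_lrs))
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (miss + false_alarm) / 2
    return best_rate


# Hypothetical LRs for one simulated condition (illustrative values only).
ss_lrs = [12.0, 3.5, 0.8, 40.0]
ds_lrs = [0.05, 0.3, 1.6, 0.01]
condition_cllr = cllr(ss_lrs, ds_lrs)
condition_eer = equal_error_rate(ss_lrs, ds_lrs)
```

Running these two functions once per background-population condition, with the suspect and criminal samples held fixed, would give one Cllr and one EER per condition, which could then be compared across narrowly and broadly defined populations.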
The research will also consider the DyViS data and undertake similar testing to that described above, with the DyViS speakers serving as the background population in simulated FSC cases and West Yorkshire (WY) speakers serving as the suspect and criminal samples. The strength of evidence obtained with the SSBE speakers of DyViS as the background population will be used as a measure of the effect of poorly delimited background data. The strength of evidence calculated for the different cases above can then be compared to consider whether narrowly defined background populations can be generalised.
It is expected that this work will provide a first step towards understanding how narrowly or broadly population data can be generalised. The research seeks to address the problems of heterogeneity that exist across accents, and to offer potential solutions that ease the burden of obtaining population data. Caution will need to be exercised to avoid overgeneralising the anticipated results to other accent groups or languages. However, the results are intended to serve as an example of how to test whether background population data can be generalised.