Multicentre evaluation of two multiplex PCR platforms for the rapid microbiological investigation of nosocomial pneumonia in UK ICUs: the INHALE WP1 study

Background Culture-based microbiological investigation of hospital-acquired or ventilator-associated pneumonia (HAP or VAP) is insensitive, with aetiological agents often unidentified. This can lead to excess antimicrobial treatment of patients with susceptible pathogens, while those with resistant bacteria are treated inadequately for prolonged periods. Using PCR to seek pathogens and their resistance genes directly from clinical samples may improve therapy and stewardship.
Methods Surplus routine lower respiratory tract samples were collected from intensive care unit patients about to receive new or changed antibiotics for hospital-onset lower respiratory tract infections at 15 UK hospitals. Testing was performed using the BioFire FilmArray Pneumonia Panel (bioMérieux) and Unyvero Pneumonia Panel (Curetis). Concordance analysis compared machine and routine microbiology results, while Bayesian latent class (BLC) analysis estimated the sensitivity and specificity of each test, incorporating information from both PCR panels and routine microbiology.
Findings In 652 eligible samples, PCR identified pathogens in considerably more samples than routine microbiology: 60.4% and 74.2% for Unyvero and FilmArray respectively vs 44.2% by routine microbiology. PCR tests also detected more pathogens per sample than routine microbiology. For common HAP/VAP pathogens, FilmArray had sensitivity of 91.7%–100.0% and specificity of 87.5%–99.5%; Unyvero had sensitivity of 50.0%–100.0% and specificity of 89.4%–99.0%. BLC analysis indicated that, compared with PCR, routine microbiology had low sensitivity, ranging from 27.0% to 69.4%.
Interpretation Conventional and BLC analysis demonstrated that both platforms performed similarly and were considerably more sensitive than routine microbiology, detecting potential pathogens in patient samples reported as culture negative. The increased sensitivity of detection realised by PCR offers potential for improved antimicrobial prescribing.


INTRODUCTION
Pneumonia is differentiated into its community-acquired, hospital-acquired (HAP) and ventilator-associated (VAP) forms. 1 Even pre-COVID-19, it was the most-frequently reported infection in intensive care unit (ICU) patients, [2][3][4] with crude mortality estimated at 30%-70% for nosocomial cases (ie, HAP and VAP). 2 Swift effective antimicrobial therapy after clinical onset is crucial to outcome, with increased mortality among patients receiving delayed antibiotics or antibiotics that prove inactive. 5 6 The bacteria, viruses and (rarely) fungi that cause nosocomial pneumonia cannot be distinguished from clinical symptomatology. Rather, microbiological diagnosis is needed, delivering results in 48-72 hours and meaning that the patient must be treated empirically in the interim. EU, US and UK guidelines advocate broad-spectrum empirical antibiotics owing to the diversity of bacteria that can be responsible and the need to cover the resistances these may carry. 2 4 7 8 Aetiological investigation is by microbiological culture, hereafter termed routine microbiology, which depends on cultivable bacteria being recoverable and fails to identify a pathogen in up to 50% of cases. [9][10][11] These patients nonetheless remain sick and mostly continue to receive empirical antibiotics.
The slowness and poor sensitivity of routine microbiology thus combine to promote poor stewardship and prolonged use of broad-spectrum agents, increasing the risk of side effects, including selection of resistant gut bacteria and Clostridium difficile. 12 A further hazard, particularly in high-resistance countries, is that the empirical agent proves ineffective against the pathogen, increasing the risk of a poor clinical outcome.
Rapid, accurate diagnostics provide a route to improving this situation, promoting early refinement of individual patients' therapy. Commercial 'sample-in, answer-out' PCR-based pneumonia tests are now available, specifically the Unyvero (Curetis) and BioFire FilmArray (bioMérieux) platforms, both of which have received US Food and Drug Administration (FDA) clearance for diagnosis of pneumonia. 13 Both are substantially automated, seek prevalent pathogens and critical resistances and have turnaround times of hours instead of days. [13][14][15][16] We evaluated and compared their performance in respect of pathogen and resistance detection, using lower respiratory tract samples from patients clinically diagnosed with HAP or VAP at 15 UK ICUs. As well as providing a manufacturer-independent direct comparison, we sought to choose one test to take forward into a randomised controlled trial (RCT), evaluating outcomes compared with patient management based on routine microbiology. This is now underway (Trial ID: ISRCTN16483855). 17 Note that this study and RCT are distinct from a recently published trial for nebulised amikacin with the same name. 18

MATERIALS AND METHODS
Additional details and methods are described in online supplemental data.

Patients and specimens
Between September 2016 and May 2018, surplus routine lower respiratory tract samples were collected from eligible patients with suspected HAP/VAP at the 15 participating ICUs. The sites represented a range of UK hospital types, including tertiary referral (n=6), district general (n=7), children's (n=1) and private (n=1).
Specimens were included if they had sufficient volume (>400 µL) and were from patients hospitalised >48 hours who were about to receive a new antibiotic or change in antibiotic for suspected lower respiratory tract infection. Specimens were eligible only when collected within 12 hours (before or after) of antimicrobial therapy being initiated and then tested (or frozen at −80°C) within 72 hours of collection. All lower respiratory specimen types were accepted, whereas upper respiratory tract specimens were excluded. Second specimens from the same patient were included only when collected >14 days after the first sample.

Routine microbiology
Each respiratory specimen was initially cultured locally at the laboratory serving the participating hospital. Testing was according to their standard operating procedures, all based on the Public Health England UK Standard. 19

PCR testing
Samples were transported by courier to two central research laboratories (University of East Anglia and University College London). On receipt, each was promptly tested using both the Unyvero Pneumonia Panel (Curetis, Holzgerlingen, Germany) and the BioFire FilmArray Pneumonia Panel (BioFire Diagnostics, Salt Lake City, USA) according to the manufacturers' instructions. The tests are described in table 1.

Data analysis
Analyses were carried out using Stata (V.15) and R (v 3.5 or above), and followed a pre-defined, detailed statistical plan. Results from the conventional and PCR tests were described using standard summary statistics. Agreement between results was examined by categorising each sample in terms of concordance of organisms detected by PCR and routine microbiology, then calculating overall concordance with 95% CIs. Definitions of the categories are detailed in table 2. Sensitivity, specificity, positive predictive values (PPV) and negative predictive values (NPV) initially were estimated (with exact 95% CIs) for each PCR target, taking routine microbiology and routine virology as the gold standard. Owing to concerns that routine microbiology provides a poor gold standard 20 which could result in biased estimation of the diagnostic ability of PCR, estimates (with 95% credible intervals) were also calculated using Bayesian Latent Class (BLC) models 21-23 incorporating results from both PCR tests, and routine microbiology. BLC models do not assume the infallibility of any diagnostic test or combination thereof, instead estimating their accuracies based on the actual infection status (ie, infected or not) of each patient. Models used non-informative priors for all parameters (although specificities were constrained to be above 0.15 to obtain more stable posterior distributions), and were fitted with and without assuming correlation between tests. The best-fitting models were identified based on Deviance Information Criteria.
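For orientation, the latent class idea can be written down in its simplest form. The expression below is the generic conditional-independence likelihood for three binary tests (routine microbiology and the two PCR panels) and is offered only as a sketch: it is not necessarily the exact parameterisation fitted in this study, which also considered models with correlation between tests. The symbols are notation introduced here: \pi is the prevalence of true infection with a given organism, Se_j and Sp_j are the sensitivity and specificity of test j, and y_{ij} is 1 if test j detected the organism in sample i and 0 otherwise.

```latex
P(y_{i1}, y_{i2}, y_{i3})
  = \pi \prod_{j=1}^{3} Se_j^{\,y_{ij}} (1 - Se_j)^{1 - y_{ij}}
  + (1 - \pi) \prod_{j=1}^{3} (1 - Sp_j)^{\,y_{ij}} Sp_j^{\,1 - y_{ij}}
```

The first term is the probability of the observed result pattern if the sample is truly infected, the second if it is not; no single test is treated as the reference. In a Bayesian fit, priors are placed on \pi, Se_j and Sp_j, and the posterior distributions yield the estimates and credible intervals.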

Scoring the overall performance of PCR-based diagnostic tests
At the outset of the study, through expert consensus, a scoring system was developed to assess the suitability of each 'sample-in, answer-out' test for progression to the INHALE RCT. Tests were assessed against one essential criterion (that the incidence of major discordances, meaning failures to detect pathogens found by routine microbiology, must be <5%) and 10 points-based 'Desirable Criteria', scoring a total of 150 (online supplemental table S1). Criteria i-iii were based on study results, criteria iv-viii on manufacturer's published information and criteria ix and x on a user questionnaire. The scale was weighted towards accurate detection of pathogens, with implementation-based criteria given a lower weighting.

Specimens collected
A total of 752 samples, 652 of them eligible, were collected from the 15 participating ICUs (figure 1). The range of eligible samples per site was 7-141, with 9 sites each providing >20 eligible samples. Most were from adults, with 72 from children; 260 were from patients with suspected HAP and 392 from patients with suspected VAP. Endotracheal aspirates (n=299) were the

Routine microbiology results
Routine microbiology was performed on all samples at the local laboratories. The median time to a result was 70.2 hours (IQR 51.1 hours-92.1 hours), including a median of 6.1 hours (IQR 2.5 hours-15.4 hours) transit time from the ICU to laboratory booking-in and 55.5 hours (IQR 44.8 hours-76.5 hours) from sample booking to release of results. The positivity rate was 44.2%, with 35.1% of samples recording one significant organism and 9.1% reporting two or more. The remaining 55.8% of samples were reported variously as 'normal flora', 'non-significant growth' or 'no growth' (figure 2). Staphylococcus aureus was the most-frequently found bacterium, representing 23.6% (83/352) of all organisms reported, followed by Pseudomonas aeruginosa (20.7%); Enterobacterales collectively accounted for 38.1% of isolates, with Klebsiella spp and Escherichia coli prominent (figure 3A). Occasionally routine microbiology laboratories reported Candida spp, Enterococcus spp and coagulase-negative staphylococci: these were excluded because there is no evidence base for their involvement in pneumonia. Online supplemental table S2 lists the bacteria detected by all three methods in HAP compared with VAP patients.
Results of standard-of-care diagnostic virology were recorded if performed within 24 hours of collection of the eligible bacteriology specimen. Only 113 patients, 33 of them children, had virology results meeting this criterion, and, of these, 31 (27.4%) were positive: seven had influenza A, six adenovirus and six cytomegalovirus. The study was undertaken before SARS-CoV-2 began to circulate.

PCR results
Among the 652 eligible samples, 631 had Unyvero tests and 632 had FilmArray tests within 72 hours of the sample's collection, or with a frozen sample (figure 1). Among these eligible tests, 620 generated a result on the FilmArray, while 12 failed. Defining failure on the Unyvero is more complex since targets are divided into eight chambers. We considered one sample where >2 chambers failed as a 'total failure' along with 24 samples that failed to generate any result, leaving 606 valid results. In 32 of these 606 one or two chambers nonetheless failed. Their data were retained in the analysis, with the proviso that organisms sought by the failed chambers would have been missed. We did not note any user errors for either test; neither machine requires regular service or maintenance.
The overall positivity rate for both machines exceeded routine microbiology, at 60.4% for the Unyvero and 74.2% for the FilmArray (χ² test: p<0.0001). Most specimens had multiple organisms detected (figure 2), with this proportion higher for FilmArray than Unyvero. FilmArray found only bacteria in 54.2% of samples and only viruses in 6.9%, whereas 13.1% contained both. The principal species detected by PCR, and their relative prevalence, were broadly similar to routine microbiology, although E. coli and Klebsiella spp were detected relatively more frequently by PCR, whereas S. aureus and P. aeruginosa were found less frequently (figure 3B). Among viruses detected by the FilmArray, rhinovirus was the most prominent (n=55), followed by influenza A (n=29) and B (n=25) (see online supplemental table S3); Unyvero does not seek viruses.

Performance of PCR tests
Test performance was compared in several ways to accommodate the fact that routine microbiology is an imperfect 'gold standard' and the fact that the PCR tests seek multiple targets, more than one of which may be present in any sample, confounding simple calculation of overall sensitivity and specificity.
Overall test performance was first measured as concordance with routine microbiology, taken as a gold standard (table 2). Both PCR tests deliver semiquantitative outputs: the FilmArray reports bacterial targets as 10⁴, 10⁵, 10⁶ or ≥10⁷ copies per mL, whereas the Unyvero reports as +, ++ or +++. In addition to detection at any concentration, we therefore also undertook further concordance calculations, considering only targets detected at high concentration, defined as 10⁶ or ≥10⁷ copies/mL for FilmArray and ++ or +++ for Unyvero (table 2). Around half of the PCR results by each method demonstrated full positive or negative concordance with routine microbiology. Most of the remainder were either partially concordant or had minor discordance. Major discordance was rare, totalling only 4.6% for Unyvero and 1.8% for FilmArray. Details of results that were discordant between routine microbiology and PCR are shown in online supplemental tables S4 and S5. If PCR detections at low concentrations were excluded, full concordance increased for both tests, but major discordance increased unacceptably. A comparison of negative results determined that there was no significant difference in the number of positive PCR detections between samples reported in routine microbiology as 'no growth' and 'no significant growth' compared with those reported as 'normal flora' and 'mixed growth' (data not shown). The number of organisms detected per sample did not vary significantly according to sample type (online supplemental table S6).
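The high-concentration restriction described above amounts to a simple filter on each platform's semiquantitative output. The sketch below illustrates it in Python; the function names and the string encoding of the levels ("10^6", ">=10^7", "++" and so on) are assumptions made for this example and are not the instruments' actual export formats.

```python
from typing import Dict, List, Set, Tuple

# High-concentration levels as defined in the text. The string encoding of
# each level is an assumption made for this illustration.
HIGH_LEVELS: Dict[str, Set[str]] = {
    "FilmArray": {"10^6", ">=10^7"},  # copies/mL bins
    "Unyvero": {"++", "+++"},         # semiquantitative grades
}


def is_high_concentration(platform: str, level: str) -> bool:
    """Return True if a reported level counts as 'high concentration'."""
    if platform not in HIGH_LEVELS:
        raise ValueError(f"unknown platform: {platform}")
    return level in HIGH_LEVELS[platform]


def high_concentration_targets(
    platform: str, detections: List[Tuple[str, str]]
) -> List[str]:
    """Keep only the organisms detected at high concentration."""
    return [
        organism
        for organism, level in detections
        if is_high_concentration(platform, level)
    ]


# Hypothetical sample: two FilmArray detections, one at low concentration.
sample = [("Staphylococcus aureus", ">=10^7"), ("Escherichia coli", "10^4")]
print(high_concentration_targets("FilmArray", sample))
```

Applying such a filter before the concordance calculation discards low-level detections, which explains why full concordance rose while some culture-confirmed organisms were lost.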
PCR assay sensitivity was >95% for most target bacteria, with NPVs >98% (table 3). Specificity and PPVs were lower, due to the PCR tests detecting more organisms per sample and finding more positive samples than routine microbiology. Strikingly, however, both machines often found the same organism as each other when routine microbiology failed to record any organism, casting doubt on routine microbiology as a gold standard. Accordingly, we further conducted subanalyses to investigate factors that might influence the results, such as the timing of the sample in relation to antibiotic administration, fresh versus frozen samples, or time from sample collection to testing (24 hours, 48 hours or 72 hours). None of these factors had a significant impact on the performance of the PCR tests (online supplemental tables S10 and S11 and data not shown).
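To make these per-target measures concrete, the following Python sketch computes sensitivity, specificity, PPV and NPV from a 2×2 table against a reference method, with exact (Clopper-Pearson) 95% CIs found by bisection on the binomial distribution. It uses only the standard library; the function names and the example counts are hypothetical and are not study data, and the study's own analyses were run in Stata and R.

```python
from math import exp, lgamma, log


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1.0 - p))
        total += exp(log_pmf)
    return min(total, 1.0)


def _root(func, tol: float = 1e-10) -> float:
    """Bisection on (0, 1) for a function rising from negative to positive."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, alpha: float = 0.05):
    """Exact two-sided confidence interval for k successes in n trials."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    lower = 0.0 if k == 0 else _root(
        lambda p: (1.0 - binom_cdf(k - 1, n, p)) - alpha / 2.0)
    upper = 1.0 if k == n else _root(
        lambda p: alpha / 2.0 - binom_cdf(k, n, p))
    return lower, upper


def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Each measure as (estimate, lower, upper), reference test as truth."""
    def summarise(k: int, n: int):
        lo, hi = clopper_pearson(k, n)
        return (k / n, lo, hi)
    return {
        "sensitivity": summarise(tp, tp + fn),
        "specificity": summarise(tn, tn + fp),
        "ppv": summarise(tp, tp + fp),
        "npv": summarise(tn, tn + fn),
    }


# Hypothetical counts for one PCR target (not taken from the study).
for name, (est, lo, hi) in diagnostic_accuracy(tp=45, fp=30, fn=2, tn=540).items():
    print(f"{name}: {est:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

With counts of this shape, where PCR finds many organisms that the reference missed, sensitivity and NPV stay high while PPV falls, which is the pattern described above when routine microbiology is taken as the gold standard.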
Chlamydophila pneumoniae, Legionella pneumophila and Mycoplasma pneumoniae were excluded from analysis because they are not ordinarily sought by routine microbiology. Unyvero and FilmArray

Antimicrobial resistance and comprehensive culture
All routine microbiology results for antimicrobial susceptibility testing were recorded, and online supplemental table S12 shows data for antimicrobials commonly used to treat HAP and VAP against prevalent species. The PCR tests differ from routine microbiology by seeking resistance (as genes) in a whole sample, not in particular bacteria. Assessment of the machines' performance in respect of resistance gene detection is further complicated because routine microbiology often reported no organism for PCR-positive samples. In other cases, we were unable to retrieve routine isolates for genetic investigation. These isolates were supplemented with those recovered by 'comprehensive culture' on a subset of the discrepant samples (online supplemental methods). In total, comprehensive culture detected 12 additional key resistance genes, the host bacteria of which were not isolated or reported by routine microbiology (table 5). Specific resistance gene detections are catalogued in online supplemental table S13. We performed concordance analysis for 'high-consequence' resistance genes only, encoding extended-spectrum β-lactamases (ESBLs), carbapenemases or methicillin-resistant Staphylococcus aureus (MRSA) phenotypes. Among 17 Enterobacterales with ESBL phenotypes, 12 were from specimens where Unyvero found blaCTX-M and 17 from those where FilmArray found blaCTX-M. Considered from the opposite perspective, culture found ESBL producers in 12/14 cases where Unyvero found blaCTX-M and 17/32 cases where FilmArray did so. Fifteen cultured S. aureus isolates had an MRSA phenotype; of these, 13 were from specimens where Unyvero found mecA/C and all 15 from those where FilmArray found mecA/C-MREJ. Considered from the opposite perspective, culture found MRSA in 13/25 cases where Unyvero found mecA/C in presence of S. aureus and 15/32 cases where FilmArray did so.
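The two perspectives used above can be read as two different proportions: the share of phenotypically resistant isolates for which the gene was detected, and the share of gene detections for which culture found a matching resistant organism. The short Python sketch below tabulates both for the ESBL and MRSA counts quoted in the preceding paragraph; the variable and function names are illustrative only.

```python
# Counts transcribed from the paragraph above, as (numerator, denominator).
# "gene_given_phenotype": resistant isolates whose specimen was gene-positive.
# "phenotype_given_gene": gene-positive specimens where culture found the phenotype.
counts = {
    ("ESBL / blaCTX-M", "Unyvero"): {
        "gene_given_phenotype": (12, 17), "phenotype_given_gene": (12, 14)},
    ("ESBL / blaCTX-M", "FilmArray"): {
        "gene_given_phenotype": (17, 17), "phenotype_given_gene": (17, 32)},
    ("MRSA / mecA/C", "Unyvero"): {
        "gene_given_phenotype": (13, 15), "phenotype_given_gene": (13, 25)},
    ("MRSA / mecA/C", "FilmArray"): {
        "gene_given_phenotype": (15, 15), "phenotype_given_gene": (15, 32)},
}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


for (target, platform), views in counts.items():
    gene = pct(*views["gene_given_phenotype"])
    phen = pct(*views["phenotype_given_gene"])
    print(f"{target:16s} {platform:10s} gene|phenotype {gene:5.1f}%  "
          f"phenotype|gene {phen:5.1f}%")
```

The second proportion is expected to be lower, since PCR seeks genes in the whole sample and routine culture often reported no organism for PCR-positive samples.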
There were only 11 detections of carbapenemase producers by Unyvero (including Acinetobacter OXA enzymes) and three by FilmArray, precluding review by enzyme type: culture confirmed a carbapenemase producer in 7/11 samples where Unyvero found a carbapenemase gene and 2/3 where FilmArray did so. Unyvero found a carbapenemase gene in all eight samples that grew an organism with carbapenemase phenotype, while FilmArray only found two carbapenemases in these isolates (table 5).
Overall, comprehensive culture was performed on 103 samples, from which 123 potential pathogens were grown. Routine microbiology reported 65 potential pathogens from the same samples. Of the additional pathogens grown by comprehensive culture, 86% were also identified by one or both PCR tests.

Overall comparison of PCR tests
Both PCR systems met the essential requirement of having <5% major discordances. Accordingly, we collated performance and implementability data in order to choose which to carry forward to the INHALE RCT. Our scoring (online supplemental table S1 and table 6) weighted performance, but also considered ease-of-use, footprint, turnaround time and overall user experience. FilmArray scored 105 points vs 68 for Unyvero. Unyvero was more concordant with routine microbiology, but FilmArray had better sensitivity; Unyvero had a broader target panel but more failed tests. FilmArray performed better on characteristics relating to implementation, ease-of-use, turnaround time and user experience. Accordingly, we have preferred the FilmArray Pneumonia Panel for the INHALE RCT, now being undertaken across 12 UK ICUs.

DISCUSSION
We undertook a comprehensive, independent, head-to-head comparison of the two currently available rapid tests for the microbiological investigation of pneumonia. Samples were from ICU patients for whom clinicians prescribed antimicrobials to treat pneumonia.
Both systems were considerably faster than routine microbiology and detected more organisms. This underscores the known poor sensitivity of routine microbiology in pneumonia. [9][10][11] Crucially, PCR tests tended to detect the same additional organisms in a given sample, implying that these additional detections were 'real' and that PCR may improve microbiological diagnosis of ICU pneumonia, increasing the proportion of patients who potentially could receive targeted antimicrobials. A confounder is that, unlike the molecular tests, routine microbiology was decentralised, performed across 11 different hospital laboratories, receiving specimens from the 15 ICUs. The main difference between the two PCR tests is that Unyvero seeks Stenotrophomonas maltophilia whereas FilmArray seeks respiratory viruses as well as bacteria. Early detection of S. maltophilia might lead to early tailored therapy with co-trimoxazole, whereas fast viral detection may prompt the early cessation or de-escalation of antibiotic therapy.
To analyse test performance, we initially took routine microbiology as a gold standard. Only 56.6% of Unyvero results and 50.3% of FilmArray results were fully concordant with routine microbiology, with the remaining partial concordances and minor discordances mostly due to additional organisms detected by PCR, reflecting increased sensitivity of the latter. Per-pathogen sensitivity was consistently good (91.7% to 100%) for FilmArray; Unyvero's performance was more variable, with sensitivity <90% for several pathogens. Cases where pathogens represented on the PCR panels were missed by these tests but found by routine microbiology were rare, at 4.6% for Unyvero and 1.8% for FilmArray. Sensitivity and specificity values are similar to those reported by others in evaluations of one or other of the two PCR tests. 15 16 24-27 We initially hoped that 16S rRNA analysis could act as an alternative, molecular, reference, but it proved less sensitive than PCR and was abandoned (see online supplemental methods and data). Instead, the widely acknowledged limitations of routine microbiological culture, 20 confirmed by the frequency with which both PCR tests detected the same organism that was missed by routine microbiology, led us to adopt BLC analysis. In brief, this technique uses information from all tests to infer a new, unmeasurable yet underlying (ie, latent) gold standard result, with no prior assumption about any one test being 'correct'. This method has been recommended and frequently adopted for studies evaluating diagnostics in settings where reference tests are acknowledged to be sub-optimal. 21 22 28 29 BLC analysis showed (1) the sensitivity of routine microbiology was extremely poor and (2) the specificity and PPV of the PCR tests were considerably higher than those calculated using routine microbiology as the 'gold' standard.
This suggests that both PCR tests were clearly superior to routine microbiology, and that the latter should perhaps not be considered a gold standard technique. A caveat is that it is perhaps predictable that two similar PCR tests (although with different primers and detection methods) should agree better with each other than with a dissimilar culture-based method. A potential concern in respect of PCR-based methods is that they may detect residual nucleic acids rather than viable pathogens requiring treatment. However, this argument is partly countered by the observation that comprehensive culture methodology was able to grow many viable pathogens that were not reported by routine culture. It is crucial to remember, in context, that all patients in this study were severely ill, clinically diagnosed with respiratory infection and received contingent antibiotic treatment; it therefore seems more reasonable to consider an organism found by any one method as potentially significant rather than to dismiss those methods that most often recorded a potential pathogen in favour of one that failed to do so simply because it is the 'traditional method'. If the molecular results are accepted, it becomes possible to identify groups of patients, for example, those found only to have S. aureus or Haemophilus influenzae pneumonia, in whom there is wide scope to de-escalate from typical empirical therapy for HAP/VAP with, for example, piperacillin/tazobactam or a carbapenem. This supports a potential to deliver improved antimicrobial stewardship along with better targeted, personalised, treatment of pneumonia. A countervailing risk is that the additional organisms found by PCR instead may prompt unnecessary prescribing. Both the present systems offer semiquantitative detection which might, in theory, assist assessment of the need for therapy.
In a subanalysis, excluding organisms detected at low concentration by PCR, we did observe increased concordance with routine microbiology, but at the price of discounting organisms confirmed by routine microbiology. Ultimately the best approach may be to combine rapid microbiology with measurement of patient biomarkers as a guide to the need for therapy.
The types and relative frequencies of organisms identified were similar for routine microbiology and both PCR tests, without any obvious bias for either approach to miss particular organisms. The species distribution resembled that reported in numerous HAP/VAP studies from Europe and North America, with S. aureus, P. aeruginosa and Enterobacterales predominant. 7 8 Comparison of resistance gene detection with resistance phenotypes from routine microbiology is complicated by imperfect genotype/phenotype associations and the fact that phenotypic resistance may arise from unsought mechanisms (eg, a combination of an ESBL and impermeability may confer carbapenem resistance in Enterobacterales). 30 Moreover, except for mecA on the FilmArray, PCR detection of a resistance gene in a clinical sample does not indicate which bacterial species is hosting that gene. We therefore conducted independent genotypic investigation of isolates identified as resistant by routine microbiology and of further organisms recovered by comprehensive culture. Overall, despite all these caveats, 66% of Unyvero gene detections and 51% of FilmArray detections were concordant against a combination of routine microbiology and comprehensive culture results. Crucially, PCR tests identified several key high-consequence resistance genes that had been missed by routine microbiology but which were confirmed by testing bacteria recovered by comprehensive culture. Although the PCR methods do not provide a full susceptibility profile, they do deliver a swift and sensitive predictor of critical resistance, potentially useful for early identification of patients who should be isolated or have their therapy escalated.
The run times of the machines are measured in hours rather than the days required for routine microbiology. Total turnaround will also reflect the machine's placement in the clinical pathway; this could not be measured here because the tests were run retrospectively under research conditions. However, we established that the median transport time of samples from the ICU to the laboratory was 6 hours, with longer times when laboratories were remote from the hospital site. If the advantages of speed are to be realised, the machine must be placed in, or near to, the ICU.
The decision of whether to adopt a rapid diagnostic into routine clinical practice will depend not only on its performance but also on the practicalities. Here, we evaluated diagnostic accuracy as well as potential for implementation, finding the FilmArray to be more sensitive than the Unyvero, and also faster, smaller and easier to use. Accordingly, we have taken the FilmArray Pneumonia Panel forward into INHALE's next stage, involving an RCT where patients either receive treatment guided by the results of the FilmArray test, performed in the ICU, or 'standard care', comprising empirical antibiotics, adapted once microbiology results become available. This trial will determine if the potential of PCR in ICU HAP/VAP can be realised without compromising patient safety. 31

Twitter Virve I Enne @HAP_Diagnostics
and Synairgen (all with research/products pertinent to medical and diagnostic innovation) through Enterprise Investment Schemes but has no authority to trade these shares directly. VG: Advisory boards or ad-hoc consultancy Gilead, Shionogi, bioMérieux, MSD, Vidya Diagnostics. VIE: Speaking honoraria, consultancy fees and in-kind contributions from several diagnostic companies including Curetis GmbH, bioMérieux and Oxford Nanopore. JO'G: has received speaking honoraria, consultancy fees, in-kind contributions or research funding from Oxford Nanopore, Simcere, Becton-Dickinson and Heraeus Medical.

Patient consent for publication Not applicable.
Ethics approval This study was approved by UK Health Research Authority (Reference: 16/HRA/3882, IRAS ID: 201977) and the UCL DNA Infection Bank Committee, whose operation is governed by the London Fulham Research Ethics Committee (REC Reference: 17/LO/1530).
Provenance and peer review Not commissioned; externally peer reviewed.

Data availability statement
The dataset for this study is available on request from Norwich Clinical Trials Unit.