Nursing workload in UK tertiary neonatal units
D W A Milligan¹, P Carruthers², B Mackley³, M P Ward Platt¹, Y Collingwood¹, L Wooler¹, J Gibbons¹, E Draper⁴, B N Manktelow⁴

¹ Newcastle Neonatal Service, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
² Northumberland Care Trust, Morpeth, Northumberland, UK
³ Mackley Management Consulting Services, South Shields, UK
⁴ University of Leicester, Leicester, UK

Correspondence to: David Milligan, Newcastle Neonatal Service, Newcastle Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK; d.w.a.milligan{at}


Background: Neonatal intensive care requires adequate numbers of trained neonatal nurses to provide safe, effective care, but existing research into the relationship between nurse numbers and the care needs of babies is over 10 years old. Since then, the preterm population and treatment practices have changed considerably.

Aims: To validate the dependency categories of the British Association of Perinatal Medicine (BAPM, 2001) and to revalidate the Northern Region categories (NR, 1993) in relation to contemporary nursing workload.

Setting: Three tertiary neonatal intensive care services in England.

Methods: Nursing activity around each baby was captured every 10 min by direct observations by trained observers. Time spent on each nursing activity was related to the baby’s dependency category and the nurse’s grade.

Results: Both scales detected differences between categories. Discrimination between individual categories was improved when nasal continuous positive airway pressure (nCPAP) was distinguished from ventilation and combined with BAPM2/NRB. On this revised four-point scale, babies in BAPM1/NRA occupied nursing time for a median of 56 min per hour (IQR 48–70), those on nCPAP or in BAPM2/NRB for 36 min (27–42), those in BAPM3/NRC for 20–22 min (15–33) and those in BAPM4/NRD for 31–32 min (24–36). The NR scale was easier to apply and had greater interobserver agreement (98.5%) than the BAPM scale (93%). All categories attracted more nursing time than in 1993.

Conclusions: Both scales predict average nursing workload. A revised categorisation which separates nCPAP from ventilation is more robust and practical. Nursing time attracted in all categories has increased since 1993.


Trained nurses are the single most expensive element of intensive care provision for newborn babies and are its most precious resource. Until recently there were only two studies of neonatal nursing workload in the English literature, both published in 1993. Workers in Liverpool, UK examined the relationship in their institution between three broad classes of babies and the nursing time given to them.1 They found that on average a stable ventilated baby occupied half a nurse’s time, a well baby on intravenous fluids (special care) one third to one quarter, and that some babies who were very ill or were undergoing a specialised procedure occupied 100% of a nurse’s time. They pointed out that nursing time given was not consistently related to illness severity; babies starting to take oral feeds, for instance, levied as much time as some stable babies on ventilators. A study from the Northern region of England published at the same time2 used established work study methodology to determine how much time was spent by individual nurses on caring for babies in one of four predefined categories and found that it was possible to separate the study cohort into two principal groupings: so-called “high dependency” babies who on average demanded a nurse’s time for 30 min every hour, and “low dependency” babies who required attention for only 15 min an hour. This led the authors to propose a minimum core staffing ratio for neonatal units (excluding supernumeraries, supervisors and transport nurses) of one nurse per two high dependency babies and one per four low dependency babies. A comparative study of two existing measures of neonatal nursing workload with a scale measuring workload as perceived by the nurse caring for the baby in two units in Australia3 found that perceived workload correlated poorly with workload predicted by the published tools and was highly dependent on factors such as experience, shift patterns and the organisation of the nursing manager.

Since the original UK workload studies were published, much has changed in neonatal intensive care. Babies are now more immature and survive longer; modalities of treatment have changed (notably there is a trend towards large scale provision of nasal continuous positive airway pressure (nCPAP)); neonatal abstinence syndrome now imposes a significant workload in some areas; parental expectations have changed and nurses find that not only do they need to spend more time in discussion with parents, but that they (the nurses) also need to be better informed than before and therefore need to spend time acquiring that information; and the advent of clinical governance has imposed a heavy additional workload of documentation on all staff. We felt, therefore, that it would be informative to re-examine the workload of neonatal nurses using the British Association of Perinatal Medicine (BAPM) and Northern Region (NR) scales to categorise the babies to determine whether patterns had changed over the last decade.

Methods

The study was carried out in three tertiary (level 3) neonatal units in Newcastle upon Tyne, Leeds and Leicester (all in the UK). Activity sampling analysis techniques4 were used to determine the time spent by all individual nurses working during the study period on delivering care. Teams of observers comprising experienced nurses from the Newcastle unit recorded coded tasks undertaken by each nurse under observation using a taxonomy of nursing activities. The process of measurement and recording was identical to that in the 1993 study. As the taxonomy in the earlier study was adopted directly from one used for observing nursing activity on general wards,5 6 we developed a new and more robust taxonomy (table 1) which was more relevant to the tasks carried out by nurses in a neonatal unit. The focus for each observer was the nursing activity related to each individual baby. The work of all nursing staff (including matrons, specialist nurses and educators) considered to have a role contributing to the care of the baby was recorded. In effect, a theoretical ring was drawn around each sample area and all staff activity in the area and movements into and out of it were recorded. Each team of observers was led by an experienced management services analyst who trained the observers and was present throughout the study to provide supervision and quality control. The observers recorded the time taken on each activity within a fixed interval sampling framework enabling time spent to be correlated with assigned baby dependency. The grade of each nurse was also recorded. All nurses present on the unit (including supernumeraries) during the main study periods were included. When a supernumerary was involved, the task being performed was attributed either to the supernumerary or to the trained nurse. On those occasions when both nurses were undertaking discrete tasks (but still within a training remit), both observations were recorded. 

Each baby on the unit during the study period was assigned a dependency category. Two dependency scales were used: the 2001 revision of the 1992 scale devised by the British Association of Perinatal Medicine (BAPM),7 which is a consensus statement with four categories 1, 2, special (3) and normal (we have designated this 4), and an evidence-based scale from the Northern region of England (NR) derived using methodology similar to that described in this paper, also with four categories A to D.2

The study was carried out in four phases: a pilot study in Newcastle (one 12 h day), a full study in Newcastle (three 12 h days and one 12 h night) and two validating studies in Leeds and Leicester (one 12 h day each). The pilot study was designed to establish the practicality of the proposed methodology, to capture (or reject) nursing tasks which had not been identified at the planning stage and to determine whether 10 min observations were as discriminating as observations made at 5 min intervals. Reproducibility of dependency category allocation between observers was tested using eight independent staff scoring the same 30 babies on each of the two scales on the same day.
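
The fixed-interval sampling described above can be illustrated with a minimal sketch. It assumes that each sampling point records how many nurses were working with a given baby, and that each sighting is credited with the whole 10 min interval it samples. The record layout and the function name are hypothetical and are not taken from the study; the 1 h exclusion rule is the one reported in the results.

```python
from collections import defaultdict

INTERVAL_MIN = 10  # sampling interval used in the main study


def minutes_per_hour(observations, hours_observed):
    """Estimate nursing minutes per hour for each baby.

    observations: iterable of (baby_id, nurses_attending) tuples, one per
        10 min sampling point; nurses_attending is the number of nurses
        seen working with that baby at that point (assumed layout).
    hours_observed: dict mapping baby_id to hours under observation.
    """
    nurse_minutes = defaultdict(float)
    for baby_id, nurses_attending in observations:
        # Each sighting stands in for the whole interval it samples.
        nurse_minutes[baby_id] += nurses_attending * INTERVAL_MIN
    return {
        baby: nurse_minutes[baby] / hours
        for baby, hours in hours_observed.items()
        if hours >= 1  # babies observed for under 1 h were excluded
    }
```

Under these assumptions, a baby seen with one nurse at five of six sampling points in an hour would be credited with 50 nursing minutes for that hour.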

Table 1 Taxonomy of nursing tasks recorded


As the underlying distribution of nurse time is unknown, and likely to be asymmetric, distribution-free methods were used to compare the nurse time across categories. Since each of the classification methods was hypothesised to represent decreasing morbidity, Dunn’s test was used to compare adjacent categories for any difference in their distributions, keeping the type I error rate equal to 0.05 for each set of comparisons. For this test the observed difference in the mean ranks of adjacent categories is compared to a specified cut-off value. There is evidence for a statistically significant difference in times between the categories if the absolute value of the difference is greater than the cut-off. SAS v 9.1 software was used for all analyses.
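
The comparison of mean ranks against a cut-off can be sketched as follows. This is an illustration only: the analyses in the study were run in SAS and the exact implementation is not given. The sketch assumes a Bonferroni adjustment across the set of adjacent comparisons, uses mid-ranks for tied values and omits any tie correction to the variance; the function names are hypothetical.

```python
from statistics import NormalDist


def mean_ranks(groups):
    """Return (mean rank per group, total N) using mid-ranks for ties."""
    pooled = sorted(v for g in groups for v in g)
    # Map each value to the average of the 1-based positions it occupies.
    positions = {}
    for i, v in enumerate(pooled, start=1):
        positions.setdefault(v, []).append(i)
    rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [sum(rank[v] for v in g) / len(g) for g in groups], len(pooled)


def dunn_adjacent(groups, alpha=0.05):
    """Compare each category with the next one in the ordered list.

    Returns a list of (difference in mean ranks, cut-off, significant).
    """
    ranks, n = mean_ranks(groups)
    m = len(groups) - 1  # number of adjacent comparisons in the set
    z = NormalDist().inv_cdf(1 - alpha / (2 * m))  # assumed Bonferroni
    results = []
    for i in range(m):
        n_i, n_j = len(groups[i]), len(groups[i + 1])
        se = (n * (n + 1) / 12 * (1 / n_i + 1 / n_j)) ** 0.5
        diff = ranks[i] - ranks[i + 1]
        cutoff = z * se
        results.append((diff, cutoff, abs(diff) > cutoff))
    return results
```

Each element of `groups` would hold the observed nurse minutes per hour for one dependency category, listed in order of decreasing dependency.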

Results

Pilot study

A total of 28 babies and 20 nurses were observed over a 10 h period. There were no missing observations and all babies and substantive staff were included. The choice of sampling methodology proved practical and appropriate to the needs of data capture and study aims. The pilot study allowed us to establish consistency of recording and demonstrated that sampling intervals of 5 min offered no increase in discriminating power compared to 10 min interval sampling. Inter-observer agreement was 98.5% using the NR scale and 93% using the BAPM scale.
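
How the agreement percentages were calculated is not stated. One simple possibility, shown here purely as an assumption, is the percentage of observer pairs who assigned the same category to the same baby. The function name and data layout are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(scores_by_baby):
    """Percentage of observer pairs that assigned the same category.

    scores_by_baby: list with one entry per baby, each a list of the
        categories assigned by the observers (assumed layout).
    """
    agree = total = 0
    for scores in scores_by_baby:
        for a, b in combinations(scores, 2):
            agree += a == b  # True counts as 1
            total += 1
    return 100 * agree / total
```

With eight observers and 30 babies, as in the reproducibility exercise, this would pool 28 observer pairs for each baby.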

Main study

Results on babies observed for less than 1 h (n = 3) were excluded from the analysis.

The Newcastle study period was continuous through three day shifts and one night shift. All babies (26–30 per observed study period) and all nurses (19–22 per dayshift, 12 per nightshift) present on the unit during this time were included; to these data we have added those from Leeds (12 h, 35 babies, 22 nurses) and Leicester (12 h, 21 babies, 14 nurses). The case mix and number of babies on the ward during the study were representative of average periods of activity during the preceding year. Babies spanned the full range of dependency, although numbers in NR category B were small. Separation of categories was not affected when supernumerary staff were excluded from the analysis. The observed nurse time (inclusive of supernumeraries) in minutes for each hour observed is shown in fig 1 with BAPM categories in the left hand panels and NR ones on the right. The first row illustrates groupings according to the two published scales, the second after babies on nCPAP have been extracted and grouped separately as the second of five categories, and the third when all nCPAP has been re-allocated to BAPM2 and NRB. The range of values in most categories is wide but, in all except NRA/BAPM1, most of the variability is accounted for by a few outliers with the majority of values clustered together as reflected in the interquartile ranges. In the original grouping (row 1), there is a progressive reduction in the time spent by nurses caring for babies in the first three categories of both scales and a rise again in the fourth category (see text below graphs for values). When nCPAP is extracted out as a separate fifth category (row 2), it is apparent that the nursing time attracted (median 37 min) is significantly less than by babies who are being ventilated or satisfy the other criteria for BAPM1 (median 55 min) and approximates the value in the category below; the extreme outlier is a baby observed for 4.8 h after admission. 

When nCPAP was re-allocated to BAPM2/NRB (row 3), separation between the top three categories on each of the two scales improved further and there were statistically significant differences (p<0.01) between adjacent categories on both scales (table 2). On this revised scale, babies in BAPM1 (no nCPAP) and NRAv occupied nursing time for almost 60 min in each hour (median 56, IQR 48–70); those in BAPM2+nCPAP (median 36, IQR 27–31) and NRAc+B (36, 31–42) for about two thirds of that time; those in BAPM3 (median 22, IQR 15–31) and NRC (20, 15–30) for about a third; and those in BAPM4 (median 31, IQR 24–36) and NRD (32, 26–36) for about half.

Figure 1 Nursing time by subscale of each of the three models of BAPM and NR scales. Row 1: Original groupings (BAPM, 2001 and NR, 1993). Row 2: nCPAP extracted as a separate fifth category; Ac, A (nCPAP); Av, A ventilated. Row 3: nCPAP reallocated to BAPM2/NRB. BAPM, British Association of Perinatal Medicine scale; nCPAP, nasal continuous positive airway pressure; IQR, interquartile range; NR, Northern Region scale.

Table 2 Separation between individual categories in the three models of the scales with difference in average ranks and p value for the hypothesis that difference in average ranks = 0 (Dunn’s test)

On average, two thirds of nursing time was spent on 10 tasks: observation/assessment (9–11%), general baby care such as nappies/mouthcare (7–11%), feeding (6–10%), supervision of area (5–11%), documentation and charting (6–9%), handover (6–7%), drug administration (4–8%), teaching (1–7%), parent interaction (1–7%) and paid breaks (2–4%). Low figures for teaching and parent interaction were observed at night.

There was good agreement between the median values obtained in each dependency category between the three units where observation took place and there was no systematic bias with respect to time spent in delivering care (fig 2). Median nursing time spent on all categories of babies on the NR scale was longer than reported in the 1993 study. Time absorbed was nearly twice as long in NRA (43 vs 24 min), about 1.7 times as long in NRB and NRC (34 vs 20 min and 20 vs 12 min) and about three times as long in NRD (32 vs 11 min).

Figure 2 Comparison of values obtained in the three units studied. BAPM, British Association of Perinatal Medicine scale; IQR, interquartile range; nCPAP, nasal continuous positive airway pressure; NR, Northern Region scale.

Discussion

Our findings suggest that, on average, one nurse should be able to care for one baby receiving ventilation, one and a half babies on nCPAP or in BAPM2/NRB, three babies in BAPM3/NRC and two in BAPM4/NRD. This does not include the time needed for specialist managerial roles (matron, team leader, educator) or for roles such as transport which need to be planned for separately.
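
The arithmetic behind these ratios can be sketched by dividing 60 min by the median nursing time per baby-hour in each category. The medians are those reported in the results; taking the midpoint where the two scales differ and rounding to the nearest half baby are assumptions made here for illustration, as the rounding rule is not stated.

```python
# Median nursing minutes per baby-hour on the revised four-category scale,
# taken from the results (midpoints used where the two scales differ).
MEDIAN_MINUTES = {
    "BAPM1/NRA (ventilated)": 56,
    "BAPM2/NRB + nCPAP": 36,
    "BAPM3/NRC": 21,     # midpoint of 22 (BAPM3) and 20 (NRC)
    "BAPM4/NRD": 31.5,   # midpoint of 31 (BAPM4) and 32 (NRD)
}


def babies_per_nurse(median_minutes, step=0.5):
    """Babies one nurse can cover in an hour, rounded to the nearest step."""
    raw = 60 / median_minutes
    return round(raw / step) * step  # rounding rule is an assumption


for category, minutes in MEDIAN_MINUTES.items():
    print(category, babies_per_nurse(minutes))
```

The unrounded values are roughly 1.1, 1.7, 2.9 and 1.9 babies per nurse, so the figure for the second category in particular depends on the rounding rule chosen.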

What is already known on this topic

  • Categorising babies according to their predicted dependency was a valid tool when first introduced in the 1980s.

  • Neonatal care has greatly changed in the succeeding years.

What this study adds

  • The dependency categories of the British Association of Perinatal Medicine (BAPM) and the Northern Region (NR) scales discriminate well in relation to nursing workload.

  • The BAPM and NR categories would both be enhanced by re-categorising babies on nasal continuous positive airway pressure.

There are some limitations to this study. Activity sampling allows large scale observation programmes to be mounted more economically than by continuous time recording and is an established work measurement technique. However, each set of values is derived from a limited time period which might not be representative of the whole, either as regards the case mix and number of patients or the nursing grade mix and staffing ratio (understaffing might result in tasks being completed poorly and in a hurry, taking less time; overstaffing might have the opposite effect). We have attempted to address the latter question by analysing the data with and without supernumerary staff, reasoning that, if the results were comparable, there would be no measurable effect of small alterations in staffing ratios. In addition, we have verified that occupancy, case mix and staffing ratios on the study days were typical of those in the preceding year. The nature of the technique used to capture information meant that nurses knew they were being observed, which could have affected the observations in unpredictable ways. Finally, the different nursing task classifications (taxonomies) used in this study and in the 1993 study limit our ability to infer the causes of the increased nursing time expended in the current study. In retrospect, it seems possible that the classification of tasks in the earlier study, which was designed to monitor activity on a general ward, may have failed to capture nursing activity which was recorded by the more sophisticated and tailored task list developed for this study.

Measurement of nurses’ own perception of their workload was outside the remit of the study, but Spence et al3 pointed out that this is an important element to consider in the overall management of a workforce on a day-to-day basis. In contrast, our intention was to collect objective data that would relate to average workload and staffing for a whole unit, not to the deployment of nurses shift by shift. Finally, we did not include any measure of outcome, as the intention was not to measure how much or how well nurses were doing but how much time they were spending on a variety of groupings of babies in practice. Hamilton et al have examined this question and found an inverse relationship between mortality and the proportion of neonatal nurses with specialist qualifications.8

There is close agreement between the median values on both scales obtained in the three neonatal units in the study, suggesting that either tool should be generalisable, at least in a tertiary care (level 3) setting. The close agreement between observers on coding allocation for babies receiving intensive care is reassuring. The smaller inter-observer variability on the NR scale for all babies probably reflects its simpler design and is consistent with the findings in the 1993 study.

The extraction of nCPAP gives a clearer separation in relation to nursing workload with a median value of around two thirds of that of the group on mechanical ventilation. It has been argued that babies on nCPAP in the first days of life demand more attention than those on ventilatory support. Numbers in this study are too small to answer this question clearly, but the study population in this group included babies who were just being stabilised through to older babies who were in the process of being weaned on to unsupported breathing.

At first sight the increased time demand levied by the “least dependent” babies who are well and feeding is surprising, but we think it is explained by the increased time taken to supervise breast feeding or give bottle feeds compared to tube feeds and the time spent on discharge planning and discussion with or education of families. It is not possible to directly compare the constituent elements with those in the 1993 study. Symptomatic babies born to substance using mothers are not captured separately on the NR scale. There were five such babies (six observation periods) in this study population (two were coded C and three D of whom two were in the upper quartile). In many units these babies are mostly cared for on the postnatal ward and will not be captured by conventional workload measures.

Any workable method of categorising dependency has to sacrifice some precision in relation to individual babies in favour of ease of application in practice. Because these categories are about averages, the workload arising from any individual baby at any one time cannot be predicted very precisely from the category into which that baby falls. Therefore, the use of these categories lies not in deciding the staffing for individual shifts (or more practically, deciding the safe limit for the number of babies for whom it is possible to provide care at any time) but rather for planning overall staffing using historical (or projected) data on workload as defined by these categories. While decisions on staffing for individual shifts may be informed by the objective criteria of the dependency categories, these decisions are more dependent on knowledge of the clinical condition of existing babies, the experience, skill and stress profile of nurses on the ground and the prediction of events which are known to be time consuming, such as death, transport or some procedures.
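
As an illustration of planning from historical data, the sketch below converts an average census by dependency level into a cot-side nursing requirement using the suggested minimum ratios for the modified scale. The census figures, the function name and the choice to round up to a whole nurse are hypothetical. Managerial, educational and transport roles are not included and would need to be planned for separately.

```python
import math

# Suggested minimum ratios (babies per nurse) on the modified scale.
BABIES_PER_NURSE = {1: 1.0, 2: 1.5, 3: 3.0, 4: 2.0}


def nurses_required(mean_census):
    """Nurses needed at the cot side for an average census.

    mean_census: dict mapping dependency level (1-4) to the mean number
        of babies in that level over the planning period (hypothetical).
    """
    demand = sum(n / BABIES_PER_NURSE[level] for level, n in mean_census.items())
    return math.ceil(demand)  # rounding up is an assumption


# Hypothetical unit averaging 4, 6, 9 and 4 babies in levels 1 to 4.
print(nurses_required({1: 4, 2: 6, 3: 9, 4: 4}))
```

Because the categories describe averages, a figure of this kind is suited to planning an overall establishment rather than to setting the staffing for a particular shift.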

We conclude that a modified version of both scales which separates babies on nCPAP from those on assisted ventilation provides a useful discrimination between nursing time spent on the average baby in each of the four or five categories on each scale. Re-allocation of babies on nCPAP to BAPM2/NRB in a modified scale with four categories improves precision and simplifies coding. We suggest that minimum nurse staffing ratios using this modified scale should be 1:1 (level 1), 1:1.5 (level 2), 1:3 (level 3) and 1:2 (level 4). Further modelling of the BAPM scale to refine its components may be helpful particularly in view of its central place in the future as a determinant of treatment costs under Payment by Results. Neonatal nurses spend more time caring for babies in all categories than they did 15 years ago when studied using a less sophisticated taxonomy of tasks.


We are grateful to the nursing staff at Leeds General Infirmary, Leicester Royal Infirmary and the Royal Victoria Infirmary, Newcastle upon Tyne for their assistance with this study.



  • Competing interests: None.