Study design and site selection
This research examined a method for implementing the larger Balanced Scorecard (BSC) national assessment, which was approved by the institutional review boards of Johns Hopkins University and the Afghan Ministry of Public Health. Badghis province was chosen for its range of secure and insecure areas. Badghis is a province in western Afghanistan covering 20,068 square kilometers of largely mountainous or semi-mountainous terrain, and it is divided into 7 districts [28]. A United Nations report released in October 2009 classified 1 district in Badghis as “low risk”, 2 districts as “medium risk”, 4 districts as “high risk”, and 0 districts as “very high risk” [11]. In 2010, the Afghanistan NGO Safety Office (ANSO) classified Badghis province as “moderately insecure” on a scale of “low insecurity”, “deteriorating”, “moderately insecure”, “highly insecure”, or “extremely insecure”, with a total of 356 reported attacks by armed opposition groups in Badghis in 2010 [29]. The total population of Badghis is estimated at 499,393 people, with 97% of the population living in rural areas [28].
Facilities eligible for inclusion were those covered under the BPHS package in Badghis: sub-health centers (SHC), basic health centers (BHC), or comprehensive health centers (CHC). District, provincial, and regional hospitals were excluded, since the focus of the BSC assessment is on a basic package of health services at predominantly outpatient-oriented facilities [1, 2, 4]. Of the 40 BPHS facilities in Badghis at the time, a stratified random sample of 25 was selected for assessment, the sample size used to calculate BSC scores in each province. Upon arriving in Badghis, a standard survey team of physicians and nurses, together with a pair of monitor-supervisors, met with key provincial officials from the Ministry of Public Health’s Department of Monitoring and Evaluation, Provincial Health Department, and Provincial Educational Department, and with other key local stakeholders, to determine the security status of the facilities selected for sampling. Based on this discussion, the standard team was deemed able to safely assess 11 “secure” facilities; the locally-based teams were able to assess those 11 “secure” facilities as well as 13 additional “insecure” facilities (24 total). Because the security environment in Afghanistan is highly dynamic, we used local informants as the guide to security status rather than district-level security scores, such as those used by various intergovernmental and nongovernmental organizations in Afghanistan [11, 29]. Relying on these scores might have placed surveyors at undue risk: such reports often do not reflect the most current security context and depend on the ability to report security incidents (some of the most dangerous areas had few people reporting incidents). The survey teams also placed more faith in informed, local knowledge.
Facility assessments incorporated observation of patient-provider clinical interactions with follow-up exit interviews of the patients, health worker interviews, and facility record audits. Survey instruments contained a mixture of continuous, binary, and categorical variables. Categorical variables were scored using Likert scales. Locally-based teams were trained with abridged survey instruments containing only the questions necessary for calculation of the BSC, whereas the instruments used by the standard team also included a number of research-related questions. For each facility surveyed, observation of patient care was based on a systematic sample of clinical interactions between children and adults and the main health worker, with targets of 5 adult and 5 child patients selected using a random starting point and a sampling interval determined by the average number of new patients per day. Following observation of the patient-provider clinical interaction, patients were invited to an exit interview conducted away from any local health-care providers. A target of 4 health workers per facility was also randomly selected for interview, stratified by type of health worker. One facility record audit was completed for each facility [1, 2, 4].
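The systematic patient sample described above can be illustrated with a short sketch. The function name, the rule for deriving the interval (expected new patients divided by the target, rounded down), and the example numbers are assumptions made for illustration; the text specifies only a random starting point and an interval based on the average number of new patients per day.

```python
import random


def systematic_sample(expected_patients, target, rng=None):
    """Return 1-based arrival positions of the patients to observe.

    expected_patients: average number of new patients per day (assumed known).
    target: number of patients to select (5 adults or 5 children in the text).
    The interval rule below is an assumption, not the study's documented rule.
    """
    if expected_patients < 1 or target < 1:
        raise ValueError("expected_patients and target must be positive")
    rng = rng or random.Random()
    # Sampling interval: every k-th patient, at least every patient.
    interval = max(1, expected_patients // target)
    # Random starting point within the first interval.
    start = rng.randint(1, interval)
    positions = [start + k * interval for k in range(target)]
    # Drop positions beyond the expected patient flow for the day.
    return [p for p in positions if p <= expected_patients]


# Hypothetical facility expecting 30 new child patients per day.
picked = systematic_sample(30, 5, rng=random.Random(0))
print(picked)
```

With 30 expected patients and a target of 5, the interval is 6, so the sketch selects every sixth arrival after a random start among the first six.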
Selection and training of locally-based teams
Upon arrival in Badghis, the standard team and the pair of monitor-supervisors worked with the Provincial Education Department to identify suitable, documented, and qualified teachers to form the locally-based teams. To be selected for a locally-based team, a teacher had to reside, at the time of the survey, in the catchment area of the facility that they would evaluate and had to state that they had no relationship with the workers at that facility. Teachers were primary or secondary teachers, with preference given to secondary (high school) teachers, who were felt to be more capable of completing complex tasks. Because teachers had to come from the catchment area of the facility surveyed, a different locally-based team of two teachers was used for each facility assessed by that method, whereas only one standard survey team was used for the entire province.
For each facility to be surveyed by the locally-based method, a pool of three to five teachers who were willing to participate travelled to the provincial capital, where they underwent three days of intensive training together. During the training period, the monitor-supervisors gave instruction on ensuring data quality, interviewing techniques, research ethics, and patient selection, and the teachers were familiarized with the survey tools to be used. Key medical equipment and aspects of hospital infrastructure were demonstrated. Training culminated in a field testing exercise, followed by a post-training exam to assess understanding of the study protocol. For each facility, the two teachers scoring highest on the post-training exam were retained from the original pool of three to five. This rapid training contrasted with that of the standard team, which was composed of Afghan health professionals from throughout the country, most of whom had years of experience in survey data collection. Prior to data collection, the standard team underwent an annual two-week training on survey tools and procedures in Kabul that included extensive field testing and post-training exams.
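The retention rule for each facility's pool (keep the two highest scorers on the post-training exam) can be sketched as follows. The function name, the example names and scores, and the alphabetical tie-break are hypothetical; the text does not say how ties were resolved.

```python
def select_team(exam_scores, team_size=2):
    """Pick the top scorers from one facility's pool of trained teachers.

    exam_scores: dict mapping teacher identifier -> post-training exam score.
    Ties are broken alphabetically by identifier (an assumption).
    """
    if len(exam_scores) < team_size:
        raise ValueError("pool is smaller than the required team size")
    ranked = sorted(exam_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:team_size]]


# Hypothetical pool of four teachers for a single facility.
pool = {"teacher_a": 72, "teacher_b": 88, "teacher_c": 91, "teacher_d": 65}
print(select_team(pool))  # ['teacher_c', 'teacher_b']
```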
Each of the two monitor-supervisors was paid $600 US dollars (USD) per month as part of their annual contract, in addition to a per diem of $15 USD for each day spent in the field. All four members of the standard survey team received $500 USD per month plus a per diem of $15 USD for each day in the field. Each of the 48 locally-based surveyors received a total of $80 USD for their work on this project.
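The payment structure above combines salaried staff (monthly rate plus per diem) with flat-fee local surveyors. A minimal sketch of the arithmetic is given below. The function names are hypothetical, and because the number of months and field days attributable to this province is not reported here, those values are left as inputs rather than filled in.

```python
def salaried_cost(monthly_rate, per_diem, months, field_days, staff=1):
    """Cost in USD for staff paid a monthly rate plus a per diem for field days."""
    return staff * (monthly_rate * months + per_diem * field_days)


def flat_fee_cost(fee, staff):
    """Cost in USD for staff paid a single flat fee each."""
    return fee * staff


# Locally-based surveyors: 48 people at $80 each, as stated in the text.
local_total = flat_fee_cost(80, 48)
print(local_total)  # 3840
```

Only the local-surveyor total follows directly from the figures given; the salaried totals depend on months and field days that would need to be supplied.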
Data collection
The standard survey team collected data in Badghis during March-April 2010; however, due to delays in participant selection and training, locally-based teams were not able to collect data until July-August 2010. A maximum of 2 days was allowed to complete each facility assessment. Once finished, locally-based teams returned to the provincial capital to meet with the provincial supervisor, who checked that the survey tools were complete and confirmed the local team’s visit to the facility by phone. Participants on the local teams were reimbursed for their time upon verification of survey completion. During the period of data collection, supervisors actively monitored the locally-based teams by randomly selecting 2 facilities in secure areas and accompanying the survey teams to them. Post-monitoring was conducted at 4 randomly selected facilities in secure areas, where highly trained monitors re-surveyed the facility using only the facility record audit survey tool one day after the locally-based teams finished. Across all questions administered at the 4 secure facilities selected for post-monitoring, the data generated by the supervisors and the locally-based teams showed a 91% concordance rate.
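One simple way to compute a concordance rate of this kind is the share of audit questions on which the monitor and the locally-based team recorded the same answer. The sketch below assumes that definition (simple percent agreement), which the text does not spell out, and uses made-up answers purely for illustration.

```python
def concordance_rate(monitor_answers, team_answers):
    """Fraction of paired answers that match exactly (simple percent agreement)."""
    if len(monitor_answers) != len(team_answers):
        raise ValueError("answer lists must have the same length")
    if not monitor_answers:
        raise ValueError("no answers to compare")
    matches = sum(1 for m, t in zip(monitor_answers, team_answers) if m == t)
    return matches / len(monitor_answers)


# Made-up answers to five audit questions at one facility.
monitor = ["yes", "no", 3, "yes", 12]
team = ["yes", "no", 3, "no", 12]
print(concordance_rate(monitor, team))  # 0.8
```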
Data analysis
Data were analyzed using Stata version 10 (StataCorp, College Station, TX). Scales and indices used in the calculation of BSC scores were generated from the survey data for categorical and continuous variables, respectively. Details of BSC indicator composition are discussed elsewhere [1, 2, 4]. Briefly, each of the 23 indicators was generated from 1 to 19 component variables that are included in the BSC facility survey tools. All indicator scores in this study were continuous variables that ranged from 0 (poor) to 1 (excellent).
For the primary objective of assessing the reliability between the locally-based and standard survey methods, only the 11 facilities visited by both survey methods were used to compare the 23 BSC indicators. Spearman rank-correlation coefficients were used to compare these indicators by survey method (standard versus locally-based), and chi-squared analysis was performed to assess the statistical significance of differences in aggregate demographic data. Because each of the 11 overlapping facilities was assessed once by each survey method and each facility contained multiple observations of health workers and patients, a linear regression model with generalized estimating equations (GEE) and robust variance estimation was used to account for correlations within the repeated measures of the health service indicators at each facility. P-values from the GEE regression models were used to assess the influence of survey method on each outcome. GEE regression with robust variance estimation has been validated for sample sizes smaller than 10 [30]. Kappa scores were not used, given that our analysis required comparing multiple data points paired by the individual facilities assessed, instead of a comparison of aggregate, unpaired data.
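As an illustration of the rank-correlation step, the sketch below computes a Spearman coefficient for one indicator scored at the same facilities by two methods. It is a plain-Python stand-in for the Stata routine actually used; the function names and the four facility scores are invented for the example, and ties receive average ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined when one input is constant")
    return cov / math.sqrt(vx * vy)


# Invented indicator scores (0 to 1) at four facilities, one per method.
standard_scores = [0.2, 0.5, 0.9, 0.4]
local_scores = [0.3, 0.4, 0.8, 0.6]
rho = spearman(standard_scores, local_scores)
```

A coefficient near 1 would indicate that both methods rank facilities similarly on that indicator, even if the absolute scores differ.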
For the secondary objective of comparing health service provision at secure versus insecure facilities, we compared indicators generated from the locally-based method for the 11 secure and 13 insecure facilities. This was done using multiple linear regression with GEE, controlling for facility type (SHC, BHC, CHC) to account for potential confounding.
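The GEE models described in this section can be illustrated in their simplest form: a linear model with one binary predictor (for example, survey method or security status), observations clustered by facility, and a robust (sandwich) variance. The sketch below assumes an independence working correlation, under which the coefficient estimates equal ordinary least squares; the working correlation actually used is not stated in the text, the adjustment for facility type is omitted, no small-sample correction is applied, and the data are invented.

```python
import math
from collections import defaultdict


def linear_gee_independence(y, x, groups):
    """Fit y = b0 + b1*x with cluster-robust variance for b1.

    y: outcomes; x: predictor values (e.g. 0/1); groups: cluster labels.
    Returns (b0, b1, robust_se_b1). Independence working correlation assumed.
    """
    n = len(y)
    if not (n == len(x) == len(groups)) or n < 3:
        raise ValueError("inputs must have equal length of at least 3")
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("predictor has no variation")
    # Inverse of X'X for the design matrix with columns [1, x].
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    # Per-cluster score sums: X_g' r_g, where r_g are the cluster's residuals.
    scores = defaultdict(lambda: [0.0, 0.0])
    for yi, xi, g in zip(y, x, groups):
        r = yi - b0 - b1 * xi
        scores[g][0] += r
        scores[g][1] += r * xi
    # "Meat" of the sandwich: sum over clusters of the outer products.
    m00 = sum(s[0] * s[0] for s in scores.values())
    m01 = sum(s[0] * s[1] for s in scores.values())
    m11 = sum(s[1] * s[1] for s in scores.values())
    a, b = inv[1]  # row of (X'X)^-1 corresponding to b1
    var_b1 = a * a * m00 + 2 * a * b * m01 + b * b * m11
    return b0, b1, math.sqrt(max(var_b1, 0.0))


# Invented scores: x = 0 for standard, 1 for locally-based; two facilities.
y = [0.6, 0.8, 0.7, 0.9, 0.4, 0.5, 0.7]
x = [0, 0, 1, 1, 0, 1, 1]
facility = ["A", "A", "A", "A", "B", "B", "B"]
b0, b1, se_b1 = linear_gee_independence(y, x, facility)
```

Here b1 is the estimated difference in mean score between the two groups, and the ratio of b1 to se_b1 would be compared against a normal reference distribution to obtain a p-value.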