Church Abuse Research & Education Services


The Sisu Advantage Center for Abuse Research and Education Services (sisuCARES) has recently initiated a series of research activities to systematically study sexual harassment and abuse and to estimate their multidimensional impact on individuals, families, workplaces, and societies.
While these initiatives are not yet ready for dissemination, the following research strategies and links to topic-related sites are suggestive of the efforts underway; results will be shared here soon.


Our initial efforts focus on more precisely defining the currently loose terminology surrounding sexual harassment and abuse, developing more reliable measures of their incidence across varied settings, and estimating their impact.

Following are three examples of the type of research we are conducting.

EXAMPLE 1 — Randomized Survey Responses

How prevalent is sexual abuse? Hard to say. First, we would need to define very precisely what does or does not constitute “sexual abuse”, a task best suited for applied linguists — but let’s for the moment assume there is a commonly accepted definition. Next, we need data. One could rely on public records, such as police reports, but crimes of this nature are notoriously under-reported, for a variety of reasons that include fear of retaliation, embarrassment, confusion, and denial. Instead, we are better served by sample surveys. They may at first glance seem to suffer from the same under-reporting bias as public records, but there are strategies to mitigate or outright overcome these limitations.

The general strategy is known as a “randomized response” methodology. There are many variations of this strategy, but the simplest version will suffice here for purposes of demonstration. Assume that 1,000 women between the ages of 18 and 34 agree to participate in a survey involving intimate relations, on the condition of anonymity and assured confidentiality. It is likely that some who agree to participate will nonetheless distrust the process and not be fully forthcoming with true responses. The randomized response strategy asks the women to answer ‘yes’ or ‘no’ to either a question of interest or its negation — and there is no way for researchers or others to know which question has been answered. For instance, the respondent may be asked to think of the birthdate of any person (themselves, their 3rd-best friend, Abraham Lincoln, …), identify the last digit of the day of the month on which that person was born, and:

IF that number = 7, answer “Have you EVER been sexually abused?”
IF that number ≠ 7, answer “Have you NEVER been sexually abused?”
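This device yields the p of roughly 0.10 used later in the example, which can be verified with a quick check (assuming, for simplicity, that birthdays fall uniformly over days 1-31):

```python
# Check of the randomization device (simplifying assumption: birthdays
# uniform over days 1-31). Days of the month whose last digit is 7:
days_ending_in_7 = [d for d in range(1, 32) if d % 10 == 7]
p = len(days_ending_in_7) / 31
print(days_ending_in_7, round(p, 3))   # [7, 17, 27] 0.097
```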

Although the researcher cannot discern which question any particular respondent answered, the researcher does know the probability (p) that the first question was answered and the probability (1-p) that the second question was answered.

The randomized response strategy incorporates (p) and (1-p) into the aggregate estimate of the percent responding that they, in fact, had been sexually abused. The accuracy of that estimate increases with sample size. There are mechanisms to estimate the number of respondents who nonetheless did not trust the process and adjust estimates accordingly, but for purposes of this simple example let’s assume full compliance.
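The arithmetic behind that aggregation can be sketched as follows (a minimal illustration, not the center’s actual code; here pi denotes the true abuse rate and lam the overall ‘yes’ rate):

```python
# With probability p the direct question is answered, and with probability
# 1-p its negation, so the overall 'yes' rate is lam = p*pi + (1-p)*(1-pi).
def yes_rate(pi, p):
    return p * pi + (1 - p) * (1 - pi)

# Inverting that identity recovers pi from the observed 'yes' rate.
def warner_estimate(lam, p):
    return (lam - (1 - p)) / (2 * p - 1)

lam = yes_rate(0.33, 0.10)
print(round(lam, 3))                          # 0.636
print(round(warner_estimate(lam, 0.10), 2))   # 0.33
```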

Let’s simulate the process, with p = 0.10, (1-p) = 0.90, and an unknown true proportion of women abused (π = 0.33). First, let’s visualize the problem:

Next, we generate a random sample of N = 1,000 to represent the women between the ages of 18-34 who have agreed to participate in our survey, and simulate their responses in accordance with the rules we have set.
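A minimal Monte Carlo sketch of that survey (illustrative only; the function name and seed below are our own):

```python
import random

def simulate_survey(n, pi=0.33, p=0.10, rng=random):
    """Simulate n randomized responses and return the Warner estimate of pi."""
    yes = 0
    for _ in range(n):
        abused = rng.random() < pi        # respondent's true (hidden) status
        direct = rng.random() < p         # device selects the direct question
        answer = abused if direct else not abused
        yes += answer
    lam_hat = yes / n                     # observed 'yes' proportion
    return (lam_hat - (1 - p)) / (2 * p - 1)   # invert the Warner identity

print(round(simulate_survey(1000, rng=random.Random(42)), 4))
```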

With a true underlying value of π = 0.33, a single run of the simulation yields the following result using the classical (Warner) method:

The estimated value (0.3500 = 35%) seems a reasonable approximation of the true value (0.33 = 33%), and that approximation improves with N.

Re-running the same simulation with different N yields increasingly small standard errors, but in each case the estimate remains highly statistically significant:

     N    estimate    standard error    Pr(>|z|)
   100      0.3375            0.0607     < 2e-16 ***
 1,000      0.3500            0.0192     < 2e-16 ***
10,000      0.3318            0.0060     < 2e-16 ***
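These standard errors are consistent with the standard Warner variance formula, Var(π̂) = λ(1-λ) / (N(2p-1)²); evaluating it at the expected ‘yes’ rate (rather than the observed one) approximately reproduces the tabled values:

```python
import math

# Warner variance formula, evaluated at the expected 'yes' rate
# lam = p*pi + (1-p)*(1-pi) with pi = 0.33 and p = 0.10.
def warner_se(n, pi=0.33, p=0.10):
    lam = p * pi + (1 - p) * (1 - pi)
    return math.sqrt(lam * (1 - lam) / (n * (2 * p - 1) ** 2))

for n in (100, 1000, 10000):
    print(n, round(warner_se(n), 4))   # ≈ 0.0601, 0.0190, 0.0060
```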

At first glance the smaller sample (N = 100) may seem more accurate than our base case (N = 1,000), but that is an artifact of the higher standard error with a smaller sample size. Re-running the simulation at N = 100 yields:

Run 1: 0.3375; Run 2: 0.4875; Run 3: 0.2370; Run 4: 0.2750

Clearly, all things equal, the larger the sample, the better. Regardless, the randomized response strategy has been applied extensively over the past quarter-century to attain (less biased) answers to sensitive questions.

EXAMPLE 2 — Systematic Revision of Estimates

There are many small data samples on the prevalence of sexual harassment, but no reliable meta-analytic source; as a result, our estimate of pervasiveness can be overly sensitive to the next available study. Consider the following example, based on two survey results.

You want to estimate the percentage of female workers in the hospitality industry who have been subjected to sexual harassment at work. Your initial thought might be that there is little or no difference among industries in this regard, and so you base your estimate on this first information source in framing your understanding of prevalence:

  1. “A new survey found that one in three women between the ages of 18-34 has been sexually harassed at work. Cosmopolitan surveyed 2,235 full-time and part-time female employees and found that one in three women has experienced sexual harassment at work at some point in their lives.”

Taken literally, this study indicates that 745 respondents responded “yes” and 1,490 “no”, given that one-third of the 2,235 surveyed responded in the affirmative. You further consider that: (1) this may be too high an estimate if sexual harassment is more likely within the 18-34 age bracket, and (2) this may be too low an estimate among workers in the hospitality industry (especially among lower-wage service providers financially dependent upon tips). Subjectively balancing these two competing forces, your best guess based on available evidence is that the incidence among all female workers in the hospitality industry is half again as large as indicated in the more general sample (0.33 * 1.5 ≈ 0.50), or 50%. Further, your subjective uncertainty around that estimate is such that you believe there is only a 1 in 10 chance (10%) that the true incidence rate among all female workers in the hospitality industry is more than double the one-third point estimate from the study cited above (0.33 * 2 ≈ 0.67), or 67%. A prior distribution with median = 0.50 and 90th percentile = 0.67 can be represented by a beta distribution with parameters a = b = 6.93, a perfectly symmetric curve.
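That claim about the beta parameters can be checked numerically. The sketch below (pure Python, with a deliberately simple numerical integration of our own devising) confirms that Beta(6.93, 6.93) has median 0.50 and places roughly 10% of its mass above 0.67:

```python
import math

def beta_cdf(x, a, b, steps=10000):
    """Regularized incomplete beta via simple numerical integration
    (interior-point sum; endpoints contribute negligibly for a, b > 1)."""
    norm = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    h = x / steps
    total = 0.0
    for i in range(1, steps):
        t = i * h
        total += t ** (a - 1) * (1 - t) ** (b - 1)
    return total * h / norm

# Median check (symmetry) and the ~90th percentile at 0.67:
print(round(beta_cdf(0.50, 6.93, 6.93), 3))
print(round(beta_cdf(0.67, 6.93, 6.93), 3))
```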

A few days later you discover a study specific to the hospitality industry:

  2. “A shocking 90% of women in the (hospitality) industry report being sexually harassed at work, according to a 2014 study (of 888 women) by the US-based Restaurant Opportunities Center.”

Notice above that your prior distribution gave virtually no indication that the actual incidence could be as high as 90%. Bayesian analysis provides a systematic means of updating our view to reflect the new information, weighting the prior distribution by its strength of belief (the inverse of its variance) and the data by their strength (sample size). In our simple example, the resulting posterior distribution makes clear that the new information (the “data”; Study #2) dominates the earlier information (the “prior”; Study #1 plus subjective adjustments), resulting in an updated view that essentially abandons the earlier information and embraces the latter. The beta parameters for this posterior are a = 806 and b = 96.
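Those posterior parameters follow from the standard conjugate beta-binomial update: add the affirmative responses to a and the negative responses to b. A quick check:

```python
# Conjugate beta-binomial update with Study #2's data (90% of 888 women).
prior_a = prior_b = 6.93        # the prior from Study #1 + adjustments
yes = 0.90 * 888                # 799.2 affirmative responses
no = 888 - yes                  # 88.8 negative responses
post_a = prior_a + yes          # 806.13 -> rounds to the cited a = 806
post_b = prior_b + no           # 95.73  -> rounds to the cited b = 96
post_mean = post_a / (post_a + post_b)
print(round(post_a), round(post_b), round(post_mean, 3))   # 806 96 0.894
```

Note how completely the data swamp the prior: 6.93 pseudo-observations on each side against nearly 900 real ones.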

The intent of our work at The Sisu Advantage in this regard is to develop a sufficiently robust and comprehensive database on sexual harassment and abuse that such wild fluctuations in our understanding of the scope of the problem are avoided.

EXAMPLE 3 — Incorporating “Expert” Opinions

Victims of sexual harassment or abuse are not the only ones aware of the incidents. Perpetrators are not inclined to admit to such behavior, and may not even recognize that their actions fit the definition. Even when intending to be forthcoming, victims can be unreliable sources if they have suppressed the incident or are in a state of denial regarding it. Another credible source of information can be recognized “experts”, whose “opinions” about aggregate incidence logically should be weighted heavily.

What constitutes an “expert”? The answer varies by context, but for these purposes, it is anyone who has access to relevant information regarding sexual harassment or abuse at an aggregate level and has demonstrable skill in processing such information. How do we identify candidate experts? It seems reasonable to consider professionals who presumably are knowledgeable of the subject and have proprietary access to information, such as doctors and nurses, school counselors, priests, and ministers. How do we validate that they have expertise relative to the questions we intend to ask? We ask some questions in a similar vein, but for which we have access to the truly correct answer. How do we then proceed? We weight the “experts” based on their relative accuracy in answering the (seed) questions to which we have right answers, and construct weighted estimates and confidence intervals for their collective answers to the (target) questions to which we are seeking answers. An example follows.

Assume we know the number of reported sexual abuses in a certain jurisdiction, but not the actual number. The actual number in reality likely is unknowable with certainty, but we believe “experts” are more likely to provide quality estimates. We solicit participation from three alleged experts: a minister, a nurse, and a school counselor. We ask each to independently estimate three values (their estimates of the 10th, 50th, and 90th percentiles) from their subjective notion of the distribution of plausible answers to each of three questions. The first two are seed questions and the third is our target question of interest. Let’s assume these questions:

seed_1: What percentage of females in our town are between the ages of 18 and 34?
seed_2: How many of them have reported a sexual abuse?
target: How many of them have actually been sexually abused?

Assume we receive back the following responses from our three experts:

From municipal records we know the right answer to seed_1 = 25% (0.25), in which case expert_2 comes closest, followed by the under-estimate of expert_1, with the over-estimate by expert_3 having the greatest error. Similarly, from police records we know the correct answer to seed_2 = 2,000, in which case expert_2 again comes closest, again followed by the under-estimate of expert_1, with expert_3 again trailing with an over-estimate. Given this knowledge, we will weight expert_2 most heavily in estimating the true answer to the target question, adjust the inputs from expert_1 upward to overcome his tendency to under-estimate, and weight expert_3 the least while adjusting her estimates downward. There are several variants for properly combining the weighted inputs to derive a density distribution for the unknown (nay unknowable) answer to the target question, with the “Cooke Method” perhaps the best known — resulting here in:

The density distribution is non-symmetric and non-concave. The median of the weighted and probabilistic responses = 4,819 — our best estimate of the number of women aged 18-34 in the town who have been sexually abused.
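To make the weighting logic concrete, here is a much-simplified sketch in the spirit of (but far short of) the Cooke Method: inverse-error weighting of medians stands in for Cooke’s calibration-and-information scoring, and the expert responses are hypothetical placeholders chosen only to match the qualitative pattern described above.

```python
# HYPOTHETICAL illustration: the experts' actual quantile responses are not
# reproduced in the text, so the medians below are placeholders mimicking
# the described pattern (expert_1 under-estimates, expert_2 is closest,
# expert_3 over-estimates).
seed_truths = {"seed_1": 0.25, "seed_2": 2000}   # known answers to the seeds

seed_medians = {                                  # each expert's 50th-percentile answers
    "expert_1": {"seed_1": 0.20, "seed_2": 1500},
    "expert_2": {"seed_1": 0.26, "seed_2": 2100},
    "expert_3": {"seed_1": 0.35, "seed_2": 3000},
}
target_medians = {"expert_1": 4000, "expert_2": 5000, "expert_3": 6500}

def weight(expert):
    """Crude stand-in for Cooke scoring: inverse mean relative error on seeds."""
    errors = [abs(seed_medians[expert][q] - truth) / truth
              for q, truth in seed_truths.items()]
    return 1.0 / (sum(errors) / len(errors))

weights = {e: weight(e) for e in seed_medians}
total = sum(weights.values())
estimate = sum(weights[e] / total * target_medians[e] for e in weights)
print({e: round(w, 2) for e, w in weights.items()}, round(estimate))
```

The weighted answer is pulled strongly toward the best-calibrated expert, which is the essence of performance-based aggregation; the full Cooke Method would additionally score each expert’s interval width and combine entire distributions rather than medians.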