Responses to Criticism

A number of criticisms have been leveled against the study. This page presents responses to a number of these criticisms. We’ll add more responses over time as we continue to engage in this important area of research.

The study was approved by the Institutional Review Board at Wheaton College.

Please note, we cannot answer every single specific criticism sent to us. It is very likely that your specific comment is already covered in the material below and we encourage you to read all our responses.

The material in the second and third major sections is taken as noted from our book Ex-Gays? A Longitudinal Study of Religiously Mediated Change in Sexual Orientation by Stanton L. Jones and Mark A. Yarhouse. Copyright (c) 2007 by Stanton L. Jones and Mark A. Yarhouse. Used by permission of InterVarsity Press, PO Box 1400, Downers Grove, IL 60515.

Brief reflections on the most significant legitimate (in our minds) criticisms and questions about the study

Are your definitions of success “Conversion” and “Chastity” legitimate?

Are your definitions of success legitimate? Does a “Success: Conversion” case who still experiences homosexual attraction really count as a conversion? Does a person who is merely refraining from sex (what you call “chaste”) really count as success?

Let us begin by quoting the first chapter of our conclusion of our 2007 book (pp. 364-5):

In conducting this study, we found empirical evidence that change of homosexual orientation may be possible through involvement in Exodus ministries, either a) in the form of an embrace of chastity with a reduction in prominence of homosexual desire, or b) in the form of a diminishing of homosexual attraction and an increase in heterosexual attraction with resulting satisfactory heterosexual adjustment. These latter individuals regard themselves as having changed their sexual orientation; the former regard themselves as having reestablished their sexual identities to be defined in some way other than by their homosexual attractions.

Most of the individuals who attained what we called “Success: Chastity” did not regard this outcome as a trivial thing. These individuals were often engaged in what they experienced as compulsive sexual acting out (involvement in promiscuous anonymous sexual encounters, involvement in adulterous homosexual encounters outside of marriage, addiction to pornography or compulsive masturbation, and so forth). Their experience of “healing” included bringing these behaviors under control by eliminating them, but also attaining a sense of peace and satisfaction afforded by living a life consistent with Christian moral standards. Further, these individuals experienced a certain “distancing” from their psychological identification as gay or lesbian, and experienced a recentering of their identity on their religious life. While others may trivialize this as an insignificant attainment, for these individuals it was anything but trivial.

Regarding individuals who attained “Success: Conversion,” we again quote from our 2007 book (pp. 372-3):

Second, while we found that a part of our research population experienced success to the degree that it might be called (as we have here) “conversion,” we have insufficient evidence to conclude that these changes are categorical, resulting in uncomplicated, dichotomous, and unequivocal reversal of sexual orientation from utterly homosexual to utterly heterosexual. Most of the individuals who reported that they were heterosexual at Time 3 did not report themselves to be without experience of homosexual arousal, and did not report heterosexual orientation to be unequivocal and uncomplicated. Sexual orientation for the individuals in this study (and indeed for most of us) may be considerably more complicated than commonly conceived, involving a complex interplay of what we are instinctively attracted to, what we can be attracted to with proper attention and focus, what we choose to be attracted to based on how we structure our interpersonal environments, our emotional attachments, our broader psychological functioning, (of course) our religious and moral beliefs and values, and many more factors. We believe the individuals who presented themselves as heterosexual success stories at Time 3 are heterosexual in some meaningful but complicated sense of the term. 

Some would immediately challenge us that any change short of complete conversion to uncomplicated (and caricaturish) heterosexuality with total eradication of homosexual attraction means that claims of successful change are a sham. By this standard, any admission that these participants experience any homosexual arousal means that claims of success are an illusion. Are people who still have any level of homosexual attraction still gay?

As we discussed in Chapter 6 and elsewhere, we reject the stance that any reoccurrence of homosexual attraction signifies that the person has “not really changed.” This is simply too stringent a stance to pass the test of generalization to other conditions. We would not say that couples who have experienced positive gains from relational therapy but still have distressing conflict have not changed for the better, that a person who still experiences craving for an addictive substance has not defeated the addiction, or that the person who still struggles with occasional milder dysphoria has not overcome major depression. On the other hand, the general concern here is not without merit as an argument. Take what might be regarded as the paradigm case: the phenomenon of the “ex-ex-gay,” who a) experiences clear homosexual orientation (“gay”), b) attempts change and claims healing/change (“ex-gay”), and c) then later reverts back to an embrace of homosexual identity and lifestyle (“ex-ex-gay”), while reporting that his or her earlier reports of change were coerced, insincere responses to a heterosexist climate that were made despite stable, enduring, unaltered homosexual attraction. It must be acknowledged that this person never shifted from being homosexually-orientated. This, however, begs the question, in that the fact that such a person experienced inadequate change on which to reestablish a new ground of sexual identity cannot establish that others were not able to make that shift in a way that is satisfying and meaningful.

Underlying this criticism is a common and troublesome assumption, that of a radical dichotomization of sexual orientation (“you are either totally gay or totally straight; there is no in between”). This simply does not appear to be the reality. If an individual who experiences exclusive or near exclusive homosexual orientation changes, as a result of an intervention, to experience diminished homosexual attraction and sufficiently strong heterosexual attraction to another person adequate to serve as a foundation for a satisfying relationship, significant changes have occurred. Critics often act as if the occurrence of any continuing homosexual attraction indicates that no real change occurred, but this seems to us unwarranted. Real change, even dramatic change, need not necessarily be categorical change.

We also discuss the “standard of success” extensively on page 235 and following in the book.

What of the anecdotes of failure to change and of harm caused by the attempt?

What about all the powerful anecdotes of failure to change and of profound harm caused by the attempt?

If we are going to be fair, we should listen seriously to all anecdotes, not just the anecdotes on one side or the other. There are anecdotes of remarkable transformation out there, and anecdotes of profound frustration and hurt as well.

A scientific study could be regarded as simply a carefully gathered compilation of anecdotes. At one level, any scientific study is simply a conglomeration of the experiences of a group of people, and thus would appear to be similar to anecdotes. A scientific study, however, has two characteristics that any loose collection of anecdotes lacks: First, a scientific study requires that its participants participate in assessments according to some sort of objectively-defined set of criteria so that the results are less subjective in nature. Second, a scientific study attempts to gather a representative group of subjects so that the results reported are more informative about “average outcomes” than a non-random group of anecdotes would be.

The common sense examples of this are easy to describe: As soon as you are diagnosed with a disease, those who care for you will bombard you with anecdotes, good and bad, about various horror stories of those with your disease and of treatments for the disease. Suppose you are told “I know someone who took supplement X and three months later was the picture of perfect health!” Rather than rely on such an anecdote, a reasonable person would prefer to have a scientific study that would tell you what kinds of outcomes can be expected based on objective standards from such an intervention.

We approached the prospect of conducting this study having heard both anecdotes of success and failure, and about the potential for harm. We attempted to create the best sample possible of individuals seeking change through Exodus, and then attempted to assess and report honestly the outcomes derived. The result is a complex body of data. Our study is not of sufficient quality, particularly regarding the size and representativeness of the sample, to say what proportions of individuals can expect to derive benefit from Exodus’s interventions. Nor can we say that individuals are highly unlikely to be harmed by any given Exodus intervention. Our data is suggestive, however, that the positive outcomes sought by the participants were not uncommon, and that harm is not intrinsic to the attempt to change.

Why are you not allowing subjects from your study to be identified in the media?

If your findings are really true, why are you not allowing subjects from your study to be identified in the public media? What do you have to hide?

The short answer is that we have sought to provide the highest level of protection for every individual who participated as a research subject in our study. The outcomes attained are typical, on all sides, of those who are already available for public comment, and so the media does not need exposure to our actual research subjects to find real personal examples of these outcomes. Exodus provides personal stories and speakers representing individuals who have attained satisfactory heterosexual adjustment, and those who maintain satisfactory chastity in singleness. Ex-Gay Watch similarly provides many archived stories and connections for individuals who will speak of their failure to experience significant change of any kind in sexual orientation.

Let us briefly explain our concern for our research participants. On the surface, it might seem quite benign for us to contact an individual research participant, request them to speak about their experience in the study, and then to verify to the media their participation in the study. The problem is that most of our Exodus participants were not isolated individuals going through the change process alone (whether successfully or not), but were members of groups. Thus, any one individual who was identified as an actual participant could potentially identify others who are not willing to go public with their participation. Such an identification could occur intentionally or unintentionally. The only way to completely protect the anonymity of all our participants was to determine beforehand that we would not identify any participants in our study. We will not under any circumstances reveal the identities of any research participant nor confirm the participation of anyone who self-identifies as such.

Isn't it proven that those who attempt change risk profound harm, including suicide?

Isn’t it proven that those who attempt change risk profound harm, including suicide?

The wide array of anecdotes of harm induced by the attempt to change sexual orientation should be taken seriously. They have to be taken in the context, however, of the wide array of anecdotes of benefit and success from the attempt to change.

The better question is how the potential for harm is balanced by the potential for benefit from any approach. Human beings take risks no matter what they do. How do we balance potential for both benefit and harm?

Complicating this question considerably is the different ways that people assign value to the possible outcomes of the attempt to change sexual orientation. The reason religious conservatives tend to value the attempt to change is that they accept a belief system that, in one form or another, places a high value on conformity to a moral ideal that does not include same-sex conduct. The Report of the APA Task Force on Appropriate Therapeutic Responses to Sexual Orientation (2009) acknowledged precisely this kind of conflict when it stated that “Some religions give priority to telic congruence . . . . [while in contrast] Affirmative and multicultural models of LGB psychology give priority to organismic congruence” (p. 18).

By “telic congruence” the Task Force was referring to religions, such as Christianity, that hold out a moral ideal that may be in conflict with our experience of our own reality. The moral ideals of chastity and monogamy, for instance, are in obvious conflict with the widely documented male propensity, whether hetero- or homosexual, for promiscuous sex.  In such an instance, the Christian calling is to sacrificially pursue or bring one’s life into congruence with this high ideal (telos). Thus, a conservative Christian with pervasive same-sex attractions might say that sexual orientation change is of premier importance in his life, while a person of no religious orientation might ask “why do you even care?” For the conservative Christian, the issue comes close to being a matter of life and death. For the nonreligious person, this seems crazy. The difference in outlook is profound.

The Task Force said “organismic congruence” is of greater importance for gay affirming psychologists.  By this term, the Task Force was referring to persons who place above all else fidelity to themselves and their own experience, including biological impulses and sexual drives. The ultimate referent for life is thus grounded in experienced reality, biological and psychological, in contrast to some transcendent ideal.  Such an outlook is truly alien to some religious persons.  This is a profound divide, and what you value above all else is shaped by which “side you choose” of this divide.

Further complicating the issue of harm are the questions:

  • What really produced the harm? Assume that reports of harm experienced are legitimate. Was the harm experienced the result of the attempt to change generically? Or was the harm experienced as the result of a particular abusive, superficial, intrusive, or irresponsible change method or an abusive leader?
  • Are all types of harm receiving equal attention? Religious conservatives who “come out of the gay lifestyle” report feelings of harm from their embrace of gay identity that is relieved by coming to faith. Are these stories being treated seriously as well?
  • How do we judge complaints of harm attributed to “wasted years” of unsuccessful change attempts? If every psychological intervention method has successes and failures (and they do), aren’t individuals who fail in their change attempts always going to regret the time invested in the failed attempt?

Finally, on the grievous and serious issue of suicide: We have heard a number of anecdotes of those who attempt sexual orientation change, grow frustrated and despondent at their lack of success, and eventually succumb to despair and commit suicide in response to their experience. There are specific stories of deceased individuals leaving suicide notes or telling loved ones that their suicide was directly related to their despondency over a failed change attempt.

In response, we would affirm the preciousness of human life and state forthrightly that the preservation of life should be our highest value. Faced with a suicidal client, our efforts as mental health professionals should be directed at preserving life. At the same time, one of the fundamental lessons mental-health professionals learn in their training is that individuals despondent enough to commit suicide are not thinking rationally. Such individuals, in their suicide notes or statements of suicidal intent, often lash out irrationally and blame family members, employers, loved ones, and even mental-health professionals for their condition. That a person blames a failed change attempt must be interpreted with caution, and such direct attributions taken in context.

Why publish a limited study when a more ideal study would have contributed so much more?

Why publish such a limited study when a more methodologically ideal study would have contributed so much more?

It is commonly acknowledged that research design is always a compromise between the ideal of perfect control of all variables of interest, on the one hand, and the “doable” dictated by the practical possibilities achievable by the researcher, on the other. Below, we sketch what an ideal study might look like, and discuss some practical reasons why it would be impossible to conduct such a study.

An ideal study of the question of whether sexual orientation change is possible and the attempt harmful might take the following form (here we interact with some of the ideas from the APA Task Force Report):

  • Research Sample: comprised of a representative sample of gay and lesbian persons interested in change recruited from religious and nonreligious subpopulations; screen out all co-morbid conditions such as depression or addiction
  • Sample Size: 800 subjects
  • Prospective: insist that all subjects have never attempted sexual orientation change in the past
  • Longitudinal: given the anecdotes of how long people can persist in fruitless efforts of change, follow subjects over 15 to 20 years
  • Interventions: develop manualized, highly standardized consensual intervention methods; have one “pure religious” intervention, one “pure reparative psychotherapy” intervention, and one method that combines both methods
  • Assessment Methods: utilize multiple assessment methods, including psychophysiological measurement
  • Ideologically Neutral: have the research conducted by those with no vested stake in the outcome

The proposed research design would be comprised of four groups:

1) Control Group that receives no intervention for at least five years
2) Religious Intervention only
3) Reparative Psychotherapy only
4) Combined Religious and Psychotherapy Intervention

Assign 200 subjects to each group; 800 subjects total. All subjects would be comprehensively assessed before random assignment to one of the four experimental groups. The results would be published in approximately the year 2040.
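To make the random assignment step concrete, here is a minimal sketch of how 800 assessed subjects might be split evenly across the four groups. It is an illustration only: the function name `assign_subjects`, the numeric subject identifiers, and the seed value are hypothetical and are not drawn from any actual study protocol.

```python
import random

# The four groups of the hypothetical ideal design described above.
GROUPS = [
    "Control (no intervention)",
    "Religious Intervention only",
    "Reparative Psychotherapy only",
    "Combined Religious and Psychotherapy Intervention",
]


def assign_subjects(subject_ids, groups=GROUPS, seed=None):
    """Randomly split subject_ids into equal-sized groups.

    Shuffles the identifiers once, then cuts the shuffled list into
    consecutive slices, so every subject lands in exactly one group.
    """
    ids = list(subject_ids)
    if len(ids) % len(groups) != 0:
        raise ValueError("subject count must divide evenly among the groups")
    rng = random.Random(seed)  # a seed is used here only for reproducibility
    rng.shuffle(ids)
    per_group = len(ids) // len(groups)
    return {
        group: ids[i * per_group:(i + 1) * per_group]
        for i, group in enumerate(groups)
    }


# Hypothetical identifiers 1..800 stand in for real, already-assessed subjects.
assignment = assign_subjects(range(1, 801), seed=2011)
for group, members in assignment.items():
    print(group, len(members))
```

Shuffling once and slicing guarantees equal group sizes, which simple per-subject coin flipping would not. A real trial would more likely use blocked or stratified randomization managed by someone independent of the investigators.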

What are some of the many reasons why this study would and could never be done?

  • We do not know what a representative sample of gay and lesbian persons looks like; no such sample exists. Where would one turn to construct such a sample?
  • Interest in change of sexual orientation is not evenly distributed in the gay and lesbian population; as the APA reports, it is primarily the religiously conservative seeking change today. This would mean immediately the study sample would not be representative.
  • It would be prohibitively expensive to conduct such a study, and no particular entity (government funding agencies; research foundations) appears interested in conducting such a study.
  • It would be difficult to find people motivated to change who have never attempted such change by some other method.
  • There is no intervention package that is acknowledged as the treatment standard in these areas. The various Exodus Ministries have some similarities and differences from each other, and the mental health professionals who still do some form of professional psychotherapy in this area do not practice standardized, manualized approaches.
  • There would, without a doubt, be attempts to disrupt the study. As the recent controversy surrounding the exposé of the counseling center of Marcus Bachmann, husband of presidential candidate Michele Bachmann, for offering ex-gay reparative therapy demonstrates, gay advocacy organizations like Truth Wins Out are open to infiltrating such attempts through deception to discredit the very idea of offering such intervention methods. For their own reasons, religious conservatives might also attempt to “stack the deck” in the research population with those likely to put the best face on change attempts.
  • It is very hard to maintain the research population over a long period of time; we know of no study that has gone as long as this proposed study.
  • As we discuss in our review of assessment methods in the 2007 book, there is no consensus on how to measure sexual orientation. There is wide agreement that orientation has at least four major elements: sexual attraction, emotional/romantic attraction, sexual behavior, and identity. Disputes abound about how to measure each of these, and whether there are other dimensions or better ways to define these dimensions.
  • Psychophysiological measurement, in particular, is not the “gold standard” that some claim it to be. Neither the various measures of genital erotic response nor MRI studies of brain function are perfectly valid and reliable. Further, these methods require subjects to expose themselves to pornographic stimulus materials, which religious populations reject on principle. Finally, these methods are complex and expensive.
  • Given the controversial nature of the subject matter, research tends to be driven by those who have an ideological interest in this area. People who are neutral are unlikely to be willing to conduct such a study.

For all of these reasons, we chose the quasi-experimental design of our study as the best way to get at the core questions we were interested in examining.

Our 2007 responses from the book to criticisms we had already received of the study, reproduced with permission from InterVarsity Press

Responses quoted with permission from the original manuscript version, pages 108 – 117 and 129 – 132 in the book:

Why didn't you implement a true experimental design?

Why didn’t you implement a ‘true experimental’ design using random treatment assignment to experimental and control conditions?

The “gold standard” for psychological research is the true experiment. The true experiment in psychology is defined by several characteristics, the most important being random assignment of research volunteers to multiple intervention conditions, with information kept from subjects as to whether they are receiving the experimental intervention that is the focus of the study or one of the control interventions against which the experimental condition is being measured. This is simply illustrated by the prototypical drug study: Subjects with a certain medical concern are recruited for an experimental trial of a new medication. They are randomly assigned to one of three conditions, the experimental condition and two “controls”: Group A receives the new experimental medication. Group B receives the “standard” treatment in the field which is presumed to be the “treatment to beat.” Group C receives a placebo (inert) medication, which is the equivalent of no treatment. In cases where there is a very well-established standard treatment or where it would be judged unethical to withhold treatment from the subjects, the placebo control is not used.

In the psychological parallel to the prototypical drug study design, subjects with a certain psychological concern are recruited for an experimental trial of a new psychological intervention method. They are randomly assigned to an experimental condition and to appropriate control conditions. The experimental group receives the new experimental intervention. This experimental group is compared against an appropriate control. Often, and especially when the psychological intervention is projected to work rapidly, a “waiting list control” is the chosen control condition. The randomly assigned control subjects are told that the experimental study is full, but that they will be assigned to treatment as soon as possible and that while they wait for treatment, the researchers would like to monitor their functioning. Then, the difference in the psychological functioning of the waiting list control and experimental groups becomes the test of the effectiveness of the experimental method.

If a waiting list control is not appropriate, either because it is unethical to withhold treatment (for example, with suicidal persons) or because the experimental intervention takes longer to implement than is reasonable to ask people to wait for treatment, an active control group of some kind must be utilized. One approach would be to offer a standard intervention against which to compare the experimental intervention (as when a new form of psychotherapeutic intervention with depression is pitted against the appropriate standard treatment). Another is to create a seeming “placebo.” What has been called “non-directive listening” has been used in many studies as a presumed placebo in the past.

There are several key issues, however, that would have made such a true experiment difficult to undertake. First, such a study would have been pragmatically impossible for us. By definition, such a true experiment would have either doubled the size of our study in order to match our experimental sample with a control sample, or would have required us to divide the sample we had into halves, thus diminishing the size of the sample we were able to study.

Second, the background logic for establishing a control condition against which to compare an experimental treatment contains at least two key assumptions that are invalid with respect to sexual orientation change. These assumptions are (1) that there is some level of spontaneous remission of the condition being treated, and/or (2) that the question being answered by the study is not whether change occurs but whether it occurs faster or better through the experimental treatment than through spontaneous remission or other existing treatments.[i]

Consider depression, the “common cold” of mental health. It is well known that if people who are experiencing depression go without treatment, then after a given duration of time (during which they presumably experience new life circumstances and also draw on the resources around them—caring family and friends, the wisdom of a pastor or bartender, and so forth), a certain percentage of the original sample will spontaneously (i.e., without professional treatment of any kind) experience a remission of symptoms and recover. Further, there are lots of existing professional treatments for depression, so a new treatment, to establish its worth, must be shown not just to work, but to work better than existing alternatives.

None of this logic has much hold on our study if the reality is that homosexual orientation is in fact impossible to change. As we have already belabored, the mental-health-care establishment has spoken clearly: homosexuality is immutable. Hence, our primary research hypothesis was that change would not occur at all. This greatly simplified our methodology, as we did not have to worry about spontaneous remission, nor about proving that Exodus treatment is better than existing alternatives. Sexual orientation change is supposed to be impossible, and so any demonstration of change stands against a presumed backdrop of impossibility. These logical arguments militate against the need for a control group of any kind.

Third, if we take Robert L. Spitzer’s study, which we discussed in chapter three, as any indication, the change process, when it is successful, takes a long time.[ii] Spitzer reported that it took an average of two years for participants to begin to experience change, and that it took an average of five years for 79% of participants to report a change of sexual orientation. This is far too long to ask a wait list control group to wait for professional services. A “wait list control” design works very well for interventions that target immediate symptom reduction, such as a program designed to alleviate panic attacks among those diagnosed with an anxiety disorder. Such a study looking at rapid treatment of panic symptoms has been run by David Barlow and others over the course of twelve to fourteen weeks, and the rapid treatment program has been demonstrated to be successful for 60-80% of participants.

But there is consensus among the few mental health professionals who believe that change of sexual orientation is possible that such change is a complicated clinical concern that likely requires a significant commitment on the part of participants over the course of several years. To make persons seeking change wait such a long time would be questionable ethically[iii].

Fourth, we decided to implement what is best called a “quasi-experimental design,” in that it does not solicit volunteers but rather gathers information on those naturally pursuing an existing intervention method (Exodus). In the current highly politicized cultural wars surrounding homosexuality, it would be very difficult to solicit volunteers who critics on either side of the change debate did not believe were participating to prove or disprove claims of change.

The typical controlled experiment testing a new treatment of depression might appropriately place an advertisement on TV and radio or in newspapers stating, baldly, “We are recruiting volunteer subjects to study an unvalidated treatment for depression. Subjects who volunteer may be assigned to this treatment, or to a placebo treatment. Please volunteer.”

Imagine the response, however, if a similar ad were placed stating, “We are recruiting volunteer subjects to study an unvalidated religious treatment for homosexual orientation. Subjects who volunteer may be assigned to this treatment or to a placebo treatment. Please volunteer.” Advocates against such treatment might flood the volunteer rolls merely to “prove” that it does not work, and it is quite possible that individuals who sincerely desired change would avoid such a study in order not to risk being assigned to a “placebo” treatment and thus waste months or years with methods that do not produce success [iv].

Fifth, control conditions such as placebo-attention conditions (with perhaps lots of active listening but no active treatment) or competing treatments must truly be credible alternatives to the experimental treatments being studied. Given the long time spans needed for change of sexual orientation to occur even in the most optimistic projections and the lack of credible alternative treatments (our subjects were typically aware that it was “Exodus or nothing,” so to speak), there was simply no opportunity to create a credible control or comparison condition.

Finally, we should note that in the mental health field so-called true experiments are undertaken typically in contexts where treatments are rigidly defined and controlled. Handbooks to define and govern the interventions are created, and interventions are conducted by therapists working under the direction of those conducting the research. Because of our limited funding, our commitment to studying established intervention programs in the “real world,” and because the ministries we were working with would not condone putting those who wanted their help on a “no treatment waiting list” for three to five years, we decided that use of a true experiment research design with random subject assignment was impractical.

[i]See Kendall, P. C., Holmbeck, G., & Verduin, T. (2004). Methodology, design, and evaluation in psychotherapy research (pp. 16-43). In M. J. Lambert (Ed.), Bergin and Garfield’s handbook of psychotherapy and behavior change (5th ed.). New York: John Wiley.

[ii]Spitzer, Robert L. (2003). Can some gay men and lesbians change their sexual orientation? 200 participants reporting a change from homosexual to heterosexual orientation. Archives of Sexual Behavior, 32(5), 403-417.

[iii]See Kendall, P. C., Holmbeck, G., & Verduin, T. (2004). Methodology, design, and evaluation in psychotherapy research (pp. 16-43). In M. J. Lambert (Ed.), Bergin and Garfield’s handbook of psychotherapy and behavior change (5th ed.). New York: John Wiley, p. 20.

[iv]This is no longer a hypothetical example as we finish the first draft of this chapter. In October 2005, word flashed around the Internet that “Dr. Phil” was taping a television episode in Los Angeles on gay teens seeking healing of homosexuality. Friends on both sides of this contentious issue forwarded us e-mails urging the rallying of troops to flood the studios and create the impression that all the support was on one side or another—pro-gay advocates to protest and shout down the oppressive religious conservatives, and religious conservatives to cheer on testimonies of change and silence the sinful homosexual lobby. That would be the precise fate of a publicly announced study.

Why didn't you use psychophysiological (biological) measurement of sexual orientation?

“Behavioral” or psychophysiological measurement of sexual response would appear to be a truly scientific, utterly objective measure of sexual orientation superior to putatively subjective self-report measures of sexual response. If such methods are available and respected, why didn’t we use them in this study? We will here briefly describe these methods and the reasons we had for not using them.

Those three reasons are that these methods are not the indisputable scientific measures that many imagine them to be, that the use of the methods posed insurmountable practical challenges for our study, and finally that the methods would have been morally unacceptable to our study participants. This presentation of these issues is a condensation of our original discussion of this issue; that original discussion is available on the website where we are archiving materials related to this project for public access at

A commonsense explanation of this method is in order. Both men and women respond physically when they are sexually aroused, and it makes sense that assessing how individual men and women respond to different types of erotic stimuli would be valuable information in assessing their sexual orientations. Both sexes respond with a variety of whole-body responses during sexual arousal. Men and women breathe faster, their nipples harden, their pulses quicken. And the genitals undergo a process called vasocongestion (“congestion by or with blood”) as blood flows to and enriches the genital area to facilitate sexual pleasure.

For men, the primary result of vasocongestion is erection of the penis. The penis is composed of a spongy tissue that can hold blood, and a man’s penis becomes erect simply because more blood gets pumped in than gets out when he is aroused. An erect and aroused penis, as a result, gets warmer, longer, thicker (of greater diameter and circumference) and harder when it is erect than when unaroused or flaccid. For women, vasocongestion results in a reddening or darkening of the vagina and the tissues of the labia, swelling or thickening of these tissues, a tightening or narrowing of the outer third of the vaginal opening but an opening up (or “tenting,” after the image of a tent being lifted up from the ground to create a space within) of the inner portion of the vagina, and perhaps most obvious, lubrication of the vagina.

All of these physiological reactions can be measured and quantified. Various measures have been developed and standardized to measure how sexually aroused a person is. A vaginal plethysmograph is inserted like a tampon into a woman’s vagina, where it bounces light off her vaginal walls to measure changes in their redness, a direct correlate of vasocongestion and arousal. Penile plethysmography measures the physical changes in the penis in a variety of ways, either through measuring temperature or change in size or both. In the classic paradigm for such measurement, individuals are fitted with the appropriate plethysmographs and then are exposed to a variety of sexual stimuli while being urged to relax and simply respond normally. The sexual stimuli are usually a variety of pornography (or “erotica,” as it is commonly called).

If we were attempting to assess whether a person who was a pedophiliac had overcome his sexual attraction to children, that person might be shown a series of video clips or photographs mixing sexual scenes involving children and those involving only adults of various ages and sexes, with the resulting responses carefully measured. Obviously, the expectation would be that if the person had not changed, exposure to the depictions of sex with children would elicit a response of sexual arousal.

Experimental use of this method requires a fairly sophisticated and stable lab environment with facilities for computer measurement, changing facilities where subjects can undress enough to put the plethysmograph on or in, video capabilities for the delivery of erotic stimulus materials (pornography of some kind), separate rooms where the subject can experience the stimuli without the experimenter intruding and so forth. Our subjects were spread all over the country, requiring at Time 1 that our interviewing teams travel widely to interview subjects and by Time 3 to resort to phone assessments because subjects were too dispersed for face-to-face interviews. This made it impossible to establish a central and accessible psychophysiological laboratory. Further, the substantial demands in time and energy we were already planning to make on our subject population to complete hours of interview questions and survey instruments, together with the personal intrusiveness of our questionnaires, led us to conclude that it was simply impractical to include such psychophysiological measurement among our dependent variables.

The moral unacceptability of psychophysiological measurement of sexual response to our population should also be fairly obvious. The moral and ethical acceptability of these psychophysiological measures of sexual arousal, as with all methods of intervention (whether for the purposes of research or therapy), must be assessed from the perspective of the professional and from the perspective of the recipient of the professional’s intervention (the research participant or therapeutic client). Exodus seeks to provide support for change from a biblical perspective for those who are struggling with homosexuality, lesbianism or bisexuality. The vast majority of our participants, as we shall see in chapter five, reported themselves to be born-again Christians at the start of our study, and the methods used by Exodus are all directed toward helping the individual understand who he or she is as a sexual person created by God and how to live in a way that honors the biblical principles of sexuality.

One of the most important goals for any participant in an Exodus ministry is sexual purity. Such purity includes, at the minimum, desisting from overtly immoral sexual behavioral patterns, but desired change toward sexual purity does not stop there. Participants are urged to purge their thought lives of immoral sexual images. The scriptural basis for this stance comes from the words of Jesus in Matthew 5:27-30:

You have heard that it was said, “Do not commit adultery.” But I tell you that anyone who looks at a woman lustfully has already committed adultery with her in his heart. If your right eye causes you to sin, gouge it out and throw it away. It is better for you to lose one part of your body than for your whole body to be thrown into hell. And if your right hand causes you to sin, cut it off and throw it away. It is better for you to lose one part of your body than for your whole body to go into hell.

This passage is viewed by traditionalist Christians as embodying both a morally normative judgment that lust is wrong and a moral imperative to avoid such responses and to discipline oneself to avoid the occasion of such responses. On this basis, the following sorts of practices are viewed as morally undesirable: sexual fantasy about immoral actions, masturbation (because of its typical incorporation of sexual fantasy) and consumption of erotic sexual images whether in the form of literary pornography, photographic or video pornography, Internet pornography and other forms.

Any method of assessment or change that itself embodied practices judged incongruent with biblical principles of sexuality would be unacceptable in these groups. Psychophysiological assessment of sexual response involves, at a minimum, having the research subject fantasize about a variety of sexual action possibilities that are specifically designed to be arousing to the viewer, and often utilizes the subject’s active consumption of a variety of pornographic materials as an aid to fantasy. Accessing pornography in any form would be in direct discord with the stated goals of most participants and their sponsoring groups. Had we attempted to use such methods, we would not have received referrals from participating Exodus groups and would have had high, if not total, refusal rates from participants.

That brings us to the third issue, the scientific validity of psychophysiological measurement of sexual response. Many studying sexual response scientifically would argue that the best measure of sexual arousal in the male is penile tumescence. G. S. Alford, D. Wedding and S. L. Jones stated that “clinicians and researchers have generally assumed (at least implicitly) that penile tumescence is more difficult to control subjectively or to fake than are self-reports. Hence it is considered a more valid indicator of subjects’ true levels of sexual arousal in response to clinically or experimentally administered sexual stimulus materials.”[i] There is reason to doubt this conclusion, however. We will only highlight the major confounding findings here.

Alford, Wedding and Jones reported conclusive evidence using an elegant single-case design that their homosexual research subject (who had sought treatment to eradicate unwanted homosexual arousal) could under one set of instructions “fake” being a treatment success by producing heterosexual arousal when he was “supposed to” and completely mask homosexual arousal to feign its eradication.[ii] Under different instructions the subject demonstrated (truthfully) that the treatment had actually been a complete failure. In the words of Alford, Wedding and Jones, “this patient was able to suppress or generate arousal with almost perfect reliability, parallel to instructions.”[iii] On an autobiographical note, Stan Jones was the third author in this study, which was a sobering introduction to the “scientific measurement” of sexual arousal and orientation.

D. R. Laws and M. L. Holmen[iv] established that clients were capable of producing penile responses to stimuli that were not erotic to them, and able to inhibit their erectile response to stimuli that they normally found sexually arousing.

Recent research has challenged the validity of plethysmography by demonstrating that sexual arousal can be conditioned or learned, thus potentially distorting assessment regarding which stimuli are or are not “naturally” arousing.[v]

A. Kaine, M. Crim and G. Mersereau found in a mixed group of sex offenders and nonsex offenders that subjects had “a clear ability to suppress penile response to preferred stimuli, both as determined by the subject’s choice and by measurement of tumescence.”[vi]

A study completed by R. J. Wilson confirmed that the penile tumescence test has the potential to be faked.[vii] Subjects were able to control their penile response under conditions of instructed faking, with subjects showing a pattern that it was easier to suppress a response to arousing stimuli than it is to fake an arousal response to nonarousing stimuli.

McAnulty and Adams found one-third of their study participants were capable of complete suppression of undesired arousal, while another third exhibited no significant ability to suppress genital arousal.[viii] The researchers suggested that the use of plethysmography as a “lie detector” regarding sexual arousal patterns would be invalid and unethical.

This pattern of findings has led to a reexamination of the idea of measurement of penile tumescence as the “true” index of arousal. According to L. L. Delizonna, J. P. Wincze, B. T. Litz, T. A. Brown and D. H. Barlow, the mere presence of physical arousal (as measured by the instrument) does not automatically and univocally equate to physical or mental feelings of sexual arousal.[ix] Other components, particularly the subjective psychological, emotional and relational dimensions of sexual attraction, appear to be at least as important as genital response in conceptualizing and explaining sexual arousal. As E. Koukounas and R. Over insightfully point out, no measure comparable to genital vasocongestion has been developed for assessing the level of one’s psychological sexual desire.[x] It would appear that self-report techniques are the best option to investigate this crucial realm.

So it was that we considered and rejected the use of psychophysiological measurement of sexual response as practically impossible, morally unacceptable and of insufficient value empirically to override the prior two reasons.

[i]Alford, G. S., Wedding, D., & Jones, S. L. (1983). Faking “turn-ons” and “turn-offs”: The effects of competitory covert imagery on penile tumescence response to diverse extrinsic sexual stimulus materials. Behavior Modification, 7(1), 113.

[ii]Ibid., p. 113.

[iii]Ibid., p. 123.

[iv]Laws, D. R., & Holmen, M. L. (1978). Sexual response faking by pedophiles. Criminal Justice and Behavior, 46, 1517-1518.

[v]Plaud, J. J., & Martini, J. R. (1999). The respondent conditioning of male sexual arousal. Behavior Modification, 23(2), 254-268.

[vi]Kaine, A., Crim, M., & Mersereau, G. (1988). Faking sexual preference. Canadian Journal of Psychiatry, 33, 384.

[vii]Wilson, R. J. (1998). Psychophysiological signs of faking in the phallometric test. Sexual Abuse: A Journal of Research and Treatment, 10(2), 113-126.

[viii]McAnulty & Adams (1991). Voluntary control. Journal, p. ??

[ix]Delizonna, L. L., Wincze, J. P., Litz, B. T., Brown, T. A., & Barlow, D. H. (2001). A comparison of subjective and physiological measure of mechanically produced and erotically produced erections (Or, is an erection an erection?). Journal of Sex and Marital Therapy, 27, 21-31.

[x]Koukounas, E., & Over, R. (2001). Habituation of male sexual arousal: Effects of attentional focus. Biological Psychology, 58, 49-64.

How can this research be trusted when the researchers are clearly biased against GLBT people?

How can this research be trusted when the researchers are clearly biased against GLBT people AND when Exodus funded the research?

The Question of Bias in Research and Biased Researchers

Funding for this study was provided through grants and gifts to Exodus International. This funding source raises the serious question of whether objective and credible outcomes are possible from this study. But our funding source is not the only issue. Some might allege preexisting bias on the part of the research team based on the fact that the principal investigators are evangelical Christians who have published extensively about homosexuality, espousing and defending the traditional view of the Christian church through two millennia that homosexual behavior (i.e., full homosexual erotic and physical intimacy) is immoral. How, it might be asked, can ideologically committed researchers funded by Exodus conduct a fair study?

These concerns are points of legitimate discussion. Any reasonable person would be wise to ponder the implications of lung cancer research being funded by the tobacco industry and conducted by a person who has “declared his allegiances” before the study is conducted. These concerns must be balanced, however, by the realization that there seem to be few neutral parties in the debate over homosexuality to fund the research or conduct the proposed study. Let us focus first on bias introduced by the preexisting views of the researchers.

This is a field of study in which few come with no preconceived notions or strongly held ideas. Much of the most-cited research in the area is published by gay, lesbian and bisexual individuals who write from a perspective of passionate commitment in their critiques of existing research, development of new lines of empirical inquiry and advocacy for their views.

Further, we have each expressed in print the belief that while change may be possible for some, it may occur a good deal less frequently and in a more fragmentary fashion than many conservative Christians would like to believe. We have also expressed our belief that the scientific evidence is not decisive in the moral debates about homosexuality, including arguing specifically that the moral debate does not hinge on whether or not homosexual orientation can be changed. Specifically, we have argued that in the end the Christian moral argument about homosexuality does not stand or fall on the possibility of “conversion” of sexual orientation, because in the end the Christian moral demand is not orientation change but rather “chaste behavior.”[i]

On this view the homosexual person pleases God by ceasing to engage in homosexual sex. Although we might suppose that God would “heal” some homosexual persons to experience heterosexual desire and fulfillment in the context of marriage, we would not presume this necessary to the Christian sexual ethic.

Our views have placed us in opposition to several prominent leaders of the “healing of homosexuality” movement outside of Exodus, one of whom has chided us publicly and privately for dishonoring God by not believing that all homosexual persons can and will be healed completely (to heterosexual normalcy) if they simply submit their lives to God. We have been criticized, in other words, for not believing strongly or unequivocally enough in sexual-orientation change for the homosexual.

Ultimately, the issue of a researcher’s bias must boil down to one of scientific integrity, to whether scholars are willing and able to “bracket” their beliefs and to report honestly the results they garner. We confronted this issue of our integrity with the Exodus administration and board of directors, because we were not sure that our construal of the moral situation was in perfect alignment with that of Exodus. When finalizing the agreements about the research with the Exodus International organization, Jones presented the following statement to the Exodus board of directors:

Since it [i.e., this research project] will be sponsored by Exodus and funded through Exodus contacts, the proposed study will be subject to severe criticism that it is hopelessly contaminated with bias and proprietary interests. The only possible counters to such criticisms are a clean and rigorous methodology, and absolute intellectual and academic integrity on the part of the research team. To be perfectly clear: Once a commitment is made to this study, it is my unalterable intent as Principal Investigator to publish our findings regardless of what they are. This commitment rests on a conviction that the God of the Christian faith is glorified by truth, and that more harm than good would be created by any avoidance of or suppression of whatever findings this study unearths.

What about Exodus’ funding our research? Concern about such proprietary funding is growing as governmental funds for research dwindle and corporate funding expands. Of particular concern is the growing pattern of pharmaceutical companies’ funding their own clinical trials documenting the efficacy of new drugs. Often the issue here runs deeper than the pharmaceutical companies’ providing core research funding; more broadly, it is a pattern of researchers holding vested financial interests, direct or indirect, in the profits of the company.[ii]

Ultimately, the only defense against research bias is solid methodological design executed by honest researchers with the results reported honestly and completely. Much of the research on which we depend in the Western world is conducted under circumstances similar to that of the present study, as when pharmaceutical companies fund studies of the effectiveness of experimental drugs they have produced or when an environmentalist foundation funds research on new approaches to recycling. We have been committed since the outset to publish honestly all of our findings regardless of the outcomes. Exodus International funded our research, but was not allowed to exercise any control over our methods or the reporting of outcomes. Our funding has gone solely toward the practical challenges of running this study, and no funds have gone to the personal enrichment of the researchers. Outcome studies such as this one are often funded by governmental agencies for many multiples of the types of funding supplied here; in contrast, our research has been conducted on a shoestring budget.

One way to bring these issues together is to assess whether research is done by “interested” or “disinterested” parties. Recently, the Academic Senate of the University of California system, representing the thousands of faculty of that massive university system, voted to change the academic freedom policy of the system.[iii] In the words of the resolution, the original policy “associated academic freedom with scholarship that gave ‘play to intellect rather than to passion.’ It conceived scholarship as ‘dispassionate’ and as concerned only with ‘the logic of the facts.’ ” In contrast, according to the resolution,

The revised version of [APM-010, the Academic Freedom statement] supersedes this standpoint. It holds that academic freedom depends on the quality of scholarship, which is to be assessed by the content of the scholarship, not by the motivation that led to its production. The revision of [APM-010] therefore does not distinguish between “interested” and “disinterested” scholarship; it differentiates instead between competent and incompetent scholarship. Although competent scholarship requires an open mind, this does not mean that faculty are unprofessional if they reach definite conclusions. It means rather that faculty must always stand ready to revise their conclusions in light of new evidence or further discussion. Although competent scholarship requires the exercise of reason, this does not mean that faculty are unprofessional if they are urgently committed to a definite point of view. It means rather that faculty must form their point of view by applying professional standards of inquiry rather than by succumbing to external and illegitimate incentives such as monetary gain or political coercion. Competent scholarship can and frequently does communicate definite and politically salient viewpoints about important and controversial questions.

This is an apt statement of the approach to which we aspired in conducting this study. We came to this study with convictions, but these convictions were neither blind nor naive; they were forged through engagement with our religious tradition, but also through a professional, critical and fair engagement with the empirical literature evaluated by exacting standards. Our views, though deeply held, were open to revision based on the data.

[i]Jones & Yarhouse (2000), see chap. 5, particularly pp. 148-151.

[ii]“The arrangements vary: Professors serve as consultants, are paid by companies to perform clinical trials on potential drugs, or are given stock in the companies. A professor also might have a financial stake in a patent on a product being marketed by the company.” Mangan, K. S. (1999, June 4). Medical professors see threat in corporate influence on research. Chronicle of Higher Education, p. A15.

[iii]Academic Senate, University of California. “APM-010, Proposed Revision of the Academic Freedom Statement.” Retrieved August 5, 2003, from Quotations from footnote 1 of the revision.

Our 2007 responses from the book to then-anticipated criticisms, reproduced with permission from InterVarsity Press

Responses quoted with permission from the original manuscript version, pages 382 – 387

But these findings just cannot be true! (The response of raw incredulity)

We expect the most frequent and perhaps potent negative response to this study to be raw cynicism and incredulity. Such responses are becoming increasingly common. A recent “research study” (an opinion poll, really) linked conversion therapy for sexual orientation with “angel therapy,” “orgone therapy,” pyramids, crystals, alien abduction, past and future lives therapies, and rebirthing therapies. All were declared “discredited” approaches (Norcross et al., 2006, ibid.) based on this poll. For conversion therapy for sexual orientation, with its long and significant history in many facets of the mental health community, to be linked and associated with various bizarre and aberrant change methods seems to represent a rhetorical strategy of shaming and ridicule rather than serious discourse. This may represent the kind of response our results are destined to generate. It is common today to believe that sexual orientation change is impossible, and for many this belief may be impervious to refutation by the presentation of either anecdotes of change or by the kinds of empirical evidence offered here. There is perhaps no effective response to such cynicism.

But anti-gay researchers cannot be trusted! (The accusation of researcher bias)

We expect, second, to be attacked and dismissed as biased researchers, with descriptors such as homophobic, heterosexist, fundamentalist, fanatic, extremist, and so forth applied liberally to us. We would note in response that there are few neutral researchers on such controversial issues. Note, for example, that many of the contributors to the September 2004 theme issue of the journal The Counseling Psychologist on the relationship between religion and homosexual behavior were themselves gay, lesbian or bisexual, and none of the contributors were to our knowledge traditionalist Christians. This issue of The Counseling Psychologist was built around the lead article by Lee Beckstead and Susan Morrow, and Beckstead has been public[i] that, as background to his study of the phenomenon of conversion therapy in the religiously-conservative Latter Day Saint (Mormon) Church world, he has struggled with the personal conflict of his Mormon upbringing and his same-sex attractions, a struggle resulting in his departure from the Mormon Church and full embrace of gay identity. We are grateful to have someone like Beckstead contributing to discourse on this issue, but it does point out that “neutrality” is not required for admission to the discussion. If LGBT authors and scholars are not neutral but nevertheless contribute freely to the professional discussion and research in this area, should not the same courtesy be extended to traditionalist Christian researchers or others of varying perspectives? Are only those who advocate for gay-affirming approaches to this topic to be allowed to speak and contribute to this topic?

[i]“Beckstead, who grew up as a member of The Church of Jesus Christ of Latter-day Saints, said much of his youth was spent torn between his religious upbringing and his sexual feelings. ‘My Mormon background was my Mormon foreground. My entire life was spent asking, “Am I gay or am I Mormon? Am I good or am I evil?”’ Beckstead said. That moral debate, Beckstead said, led to his eventual departure from the church and allowed him to come to terms with other parts of his identity. ‘Who I am sexually was wrapped up in my religion. I had to disassociate from the church and the things they’re trying to do against me and my partner,’ he said.” Benson, A. (4/20/04) “Panel breaks the silence about LGBT issues,” The Daily Utah Chronicle, accessed February 5, 2007 (http://media NULL.www NULL.dailyutahchronicle NULL.Breaks NULL.The NULL.Silence NULL.About NULL.Lgbt NULL.Issues-665697 NULL.shtml?sourcedomain=www NULL.dailyutahchronicle NULL.collegepublisher

But the study has flaws and therefore cannot be trusted! (The response of doubt)

We expect the limitations of the current study to be expounded as grounds for dismissing the findings. Critics will claim that our study sample was large but not large enough, was a quasi-experimental rather than experimental design, sampled a wide range of subjects but did not prove that the sample was representative, used many measures of sexual orientation but not the right ones, misused or misapplied some of the measures used,[i] failed to use psychophysiological measures of sexual arousal patterns, and on and on. Again, we explained our decision-making on each of these specific points in Chapter 6 and stand by the decisions made to conduct the most methodologically sophisticated study of sexual orientation change to date.

In the 1970s, Michael Mahoney, a well-respected psychologist regarded as a rising star in the field of cognitive psychotherapy, conducted an unauthorized experiment that nearly resulted in his expulsion from the American Psychological Association.[ii] As the editor of a scientific journal, he sent out for review facsimile manuscripts that he constructed so that the putative results reported in the manuscripts either conformed to or clashed with the pre-established professional opinions of the scientists selected as manuscript reviewers. In other words, some reviewers received manuscripts that were identical in terms of introduction, methods sections, and references cited, but either reported results that were congruent with the theoretical viewpoints and expectations of the reviewers or alternatively reported results that were incongruent with the theoretical viewpoints and expectations of the reviewers. The methodologies used in the facsimile studies, it must be emphasized, were the same. Mahoney documented that reviewers critiquing manuscripts that reported results that conformed with their expectations had many fewer complaints about the methodologies of the manuscript they received and were much more likely to regard the manuscript as meriting publication, while reviewers critiquing manuscripts that reported results that clashed with their expectations had many, many criticisms about the methodologies of the manuscript they received and tended to recommend against publication. In other words, reviewers tended to “go easy” on the manuscript reporting results in accord with their expectations, and tended to “rip apart” the manuscript that reported results that clashed with their expectations. Mahoney reported that the net result was that even though the reviewers were critiquing manuscripts reporting identical methodologies, the reviewers gave harsher recommendations against publication when the manuscripts they reviewed clashed with their expectations.

No one is immune to such tendencies. We have tried to critique even-handedly the research in all areas, noting in Chapter 3, for instance, the many limitations of the research on change of sexual orientation, and in Chapter 6 the limitations of the existing methods for conceptualizing and measuring sexual orientation. This study is not above criticism methodologically, and we have tried to be the first to articulate what we regard as fair criticisms of our study. But we do expect this study to be greeted with various claims that because the study was not rigorous enough on issue X, its findings should be dismissed. We must resist such responses. There are no formulaic rules for judging how far a study must deviate from perfection before its results no longer deserve consideration. Our study is short of perfect; so also is all scientific inquiry, particularly psychological inquiry, and particularly inquiry into such a controversial subject. This is an imperfect but strong study, one that in many respects is the most rigorous ever conducted, and its results should be taken seriously.

[i]such as using a summary score for the Klein Sexual Orientation Grid when Klein urged researchers not to use such a summary score, as discussed in Chapter 6

[ii]As recounted in Mahoney, M. (1976).  Scientist as subject.  Cambridge, MA:  Ballinger. The core of the study is described on page 93ff.

But the reports of religious nuts trying to change cannot be trusted! (The accusation of subject bias)

We expect our results to be dismissed on the basis of bias in the self-report of the individuals studied. Our research participants, it will be argued, cannot be trusted to report their experience rightly and truthfully. They will be maligned as parroting the claims of a repressive religious system and a heterosexist society. We would respond first that we have attempted to urge, over and over, the value of honest report of their experience. One of the virtues of long-term follow-up as conducted in this longitudinal study is that, if false presentation is present, we can expect individuals to become more likely over time to report their true status, and we are committed to reporting those results honestly.

It is also critical to note that almost the entire fabric of human research is built on the reliability and validity of self-report. We ingest new headache medications because we trust the self-report of subjects in clinical trials that their headaches are lessened by a new medication; we rely on new medicines or psychotherapies for depression because we trust the report of individuals that they are less depressed. This area is arguably different in some important respects, particularly in that homosexual “lifestyle” is the subject of vigorous moral, religious, and political debate today, and reporting that you are less headachy or depressed is surely less subject to intense pressure and scrutiny than those other reports. Even so, we have no indication that any of the participants in our study are political pawns or parrots, but rather that they are struggling, honest human beings grappling with personal situations and experiences of immense complexity, and that they have told us the truth about their personal experiences.

To this criticism we would note finally that it was in recognition of this potential criticism that we did not rely on simple, dichotomous reports of whether the participants are “straight or gay,” but rather inquired after the specifics of their experiences of sexual attraction, fantasy, infatuation, and so forth. We found our participants to be transparent in reporting their experience when asked in detail, for instance, about the frequency and intensity of sexual attraction to persons of the same and opposite sex.

But what if individuals who claim they were subjects in the study come forward and discredit the study? (The accusation of falsification)

It is not impossible that individuals will emerge after the publication of this study who claim to have been in the study and to have been falsely represented, pressured to present themselves as if they were completely healed, and so forth. We may even have individuals come forward saying they were subjects in the study and dropped out because the research was biased or making other accusations. We have already addressed this issue of the problematic status of anecdotal reports in Chapters 2 and 3. The recent ruling by the U.S. Food and Drug Administration on silicone breast implants stands as a compelling example of the risks of grounding judgments in anecdotes. The original rulings that led to the removal of silicone breast implants were based on the tragic anecdotes of immune system disruption that some individuals (and their lawyers) claimed to be the result of such implants, and only years of research established that there was insufficient evidence that the implants were causally related to the negative health sequelae of the patients/plaintiffs in these cases. Only by examining normative results over multiple subjects can the true connections between interventions and outcomes be charted. This is why we took strides to encourage ongoing participation in the study, so that the results could be reported in the context of the overall study itself, which in the end is much more valuable and credible in terms of empirical findings than are anecdotal reports as such.

But the confusion of chastity and true change means these results cannot be trusted! (The accusation of outcome confusion)

We will be criticized for inflating the success statistics of this study by combining two types of success cases: chastity and “conversion.” Chastity in the minds of many is not success. We have already discussed this matter at length from a Christian theological and moral perspective in Chapter 2. Our situation here is not unlike that in marital and couple therapy, where one has to judge whether an outcome of an amicable divorce is a treatment success or failure. It of course depends on what one believes about divorce and the premium one places on successful preservation of a marital relationship. The reader will remember that in Chapter 2 we argued that most traditionalist Christians do not regard miraculous and unequivocal conversion to heterosexuality as a moral necessity for living a righteous life, but rather that release from undue preoccupation with and involvement in same-sex relationships and a positive embrace of chastity (sexual purity) is a moral necessity for the righteous life approved by God. Recall that this group reported average decreases in same-sex attraction that were moderate to large in terms of effect sizes. So the picture, then, is not of individuals begrudgingly refraining from sexual behavior; rather, they are not experiencing attraction to the same sex to the same degree they had been previously, and this is making chastity presumably more attainable and less demanding in terms of emotional or psychological demands placed upon them.

Other Responses

Response to American Psychological Association Task Force on Appropriate Therapeutic Responses to Sexual Orientation

The American Psychological Association some years ago formed a Task Force to examine the thorny issue of attempts to change sexual orientation through psychotherapeutic means (or what the committee called Sexual Orientation Change Efforts, or SOCE). The Report of the APA Task Force on Appropriate Therapeutic Responses to Sexual Orientation (2009) gave a pessimistic review of such efforts.

Methodologically, it distinguished between research studies published in peer-reviewed scientific journals, on the one hand, and research studies published in other venues, on the other. The former type of literature was rigorously scrutinized in the report and generally dismissed as inadequate; the latter type of literature, deemed “grey literature,” was never given serious consideration despite the considerable value of many such studies and despite the fact that the Task Force utilized heavily all sorts of literature published outside of peer-reviewed journals in many parts of the report. (The reader may want to consult the more general critique of the limitations of the Task Force Report in The General Psychologist, Volume 45, issue # 2, pp. 7-18.)

Let’s begin our response to the APA Task Force on a positive note. The Task Force formulated five “best-practice standards for the design of efficacy research” (p. 6) on SOCE as follows:

[Credible] Research on SOCE would (a) use methods that are prospective and longitudinal; (b) employ sampling methods that allow proper generalization; (c) use appropriate, objective, and high-quality measures of sexual orientation and sexual orientation identity; (d) address preexisting and co-occurring conditions, mental health problems, other interventions, and life histories to test competing explanations for any changes; and (e) include measures capable of assessing harm.

Though the design of the Jones and Yarhouse study was set years before the 2009 Task Force Report, this study meets many of the standards proposed by the APA Task Force.  Quoting from our 2011 Journal of Sex and Marital Therapy article, and responding to their five recommendations:

(a) The present study is prospective and longitudinal. (b) Its quasi-experimental design is adequate to address (“generalize to”) the fundamental question of whether sexual orientation change is ever possible, though the design is inadequate, as the Task Force Report points out, due to the “absence of a control or comparison group” (APA, 2009, p. 90, fn 65), to allow for decisive causal attribution of the changes noted to the religious interventions. The design is adequate, however, as a test of the very possibility of change. (c) The study utilized the best validated measures of sexual orientation current in the late 1990s when the design was set. (d) The study did not address competing explanations as proposed by the Task Force, as it had an insufficient sample size to make valid inferences. (e) The study included a validated measure of psychological distress as an index of harm.

Arguably, our study embodies much of what the Task Force describes as useful research design. Now to the criticisms. The Task Force dismissed the 2007 Jones and Yarhouse book as unworthy of careful scrutiny in their Report; they dedicated footnote 65 on page 90 to justifying the dismissal of the study:

A published study that appeared in the grey literature in 2007 (Jones & Yarhouse, 2007) has been described by SOCE advocates and its authors as having successfully addressed many of the methodological problems that affect other recent studies, specifically the lack of prospective research. The study is a convenience sample of self-referred populations from religious self-help groups. The authors claim to have found a positive effect for some study respondents in different goals such as decreasing same-sex sexual attractions, increasing other-sex attractions, and maintaining celibacy. However, upon close examination, the methodological problems described in Chapter 3 (our critique of recent studies) are characteristic of this work, most notably the absence of a control or comparison group and the threats to internal, external, construct, and statistical validity. Best-practice analytical techniques were not performed in the study, and there are significant deficiencies in the analysis of longitudinal data, use of statistical measures, and choice of assessment measures. The authors’ claim of finding change in sexual orientation is unpersuasive due to their study’s methodological problems.

Let us examine in order the criticisms in footnote 65 on page 90 of the Task Force Report:

  • The study is a convenience sample of self-referred populations. . . Response: Much of the research on sexual orientation is conducted on convenience or volunteer samples; truly representative samples are rare. In fact, it is hard to know what a representative sample of gay and lesbian persons would look like given the uncertain boundaries of what constitutes homosexual orientation and the phenomenon of being “closeted.” Had we been trying to rigorously examine the question of what percentage of all homosexual persons could achieve sexual orientation change, it would have been necessary to gather a representative sample of all gay and lesbian persons. This was not our goal; our goal was to examine the question of whether sexual orientation change is ever possible at all. To answer this question, it was necessary to recruit a subject population that was highly motivated to change and actively pursuing change. The only way to study such a population was to utilize a volunteer sample.
  • the methodological problems [include] the absence of a control or comparison group. Response: We have consistently described the design of our study as “quasi-experimental.” Such designs often do not include control or comparison groups. Further, if one is testing the effectiveness of treatments of a condition that can easily change (such as the common cold or mild depression) one must have a control group, because any intervention under examination must produce change results beyond the normal baseline of change (the people who would have “gotten better” without intervention; most of us get over colds in two weeks, and many mild depressions just abate). But if homosexual orientation is either unchangeable or rarely changeable, it is not necessary to have a control group if one is examining the basic question of whether change is possible at all, because the condition does not just change spontaneously. The APA itself said for years on its website that “[H]omosexuality is not changeable;” it was this very pronouncement that motivated this study and made it unnecessary to have a control group.
  • the methodological problems [include] threats to internal, external, construct, and statistical validity. Response: It is hard to know what this very expansive criticism is saying. In order: Internal validity refers to using an experimental design that accomplishes precisely what the experimental hypothesis demands. The APA Task Force demanded precision in the attribution of causation of change to a single intervention method as the standard for internal validity. This was not, however, the focus of our study. Our study has internal validity appropriate to its hypothesis. External validity refers to representativeness to “real world” conditions. Our study has high external validity, as it studied real world change attempts over a long period of time. Construct validity refers to conceptualizing and measuring the variables of your study appropriately. We used best practice measures from peer-reviewed journals, thus rendering this criticism invalid. Further, it is standard in the field of sexual orientation study to squabble about what the best measures are. For this reason, we used multiple measures. Regarding statistical validity, see the next response.
  • “Best-practice analytical techniques were not performed in the study” and “there are significant deficiencies in the analysis of longitudinal data, use of statistical measures.” Response: Again, it is hard to know what is being claimed here given that there are no specifics. We have received multiple criticisms of the 2007 book on the basis of our use of many t-tests. We deliberated about which statistical procedures to use. Our practice in the 2007 book of using multiple t-tests was grounded in the less formal assumptions required for t-tests in comparison to ANOVA and MANOVA tests. In particular, the imprecise and uneven timing of the various assessments did not seem to fit the more rigorous assumptions/requirements of ANOVA testing. While arguably valid to use t-tests, it is a judgment call regarding which tests to use. In the 2011 report of our findings, we shifted to more sophisticated statistical measures (ANOVA and MANOVA), though this may result in criticisms of a different sort. The lay reader needs to understand that researchers squabble about such things all the time. Often, there is no one right way to analyze data.
  • “there are significant deficiencies in the . . . choice of assessment measures.” Response: As with the response immediately above, this is predictable and relatively vacuous criticism. The vigilant reader of the APA Task Force will note that they nowhere define the “right” assessment measures to use. We used recognized, professional and credible measures.

Addendum: Criticisms from APA Task Force Chair Judith Glassgold:

Glassgold was quoted expounding on the views of the APA Task Force regarding this study in the September 9, 2009 issue of Discover Magazine as follows:

“Everything was wrong with that study,” Glassgold says. “[Yarhouse and Stanton (sic)] chose the wrong statistics to evaluate, they violated statistical laws, and they didn’t have a control group-just a small sample of people recruited from religious groups. They followed the individuals over a couple of years, but didn’t specify that the subjects should only try one intervention at the time, so they tried many at the same time. So we aren’t sure which, if any, intervention was causal.”

Response:  Much of this quote repeats the criticisms of the APA Task Force, as would be expected from its Chair. In this informal quote, Glassgold embellishes a bit, but again without specific content. We chose reasonable statistical tests, and violated no “statistical laws.” The lack of control group is defensible, as we explained above. Our sample of 98 subjects was significantly larger than those of many studies that they regarded as adequate in the Task Force Report, and is adequate for the hypothesis.

The final criticism points to a significant problem with the whole APA Task Force Report. The Task Force insisted upon hyper-rigorous experimental design for the studies it considered valid because it established a ground rule that it had to be able to infer causality for a single, tightly-defined intervention method. The question “what specific method produces a specific causal effect?” is a legitimate question to ask, but it simply was not our question, and thus our design did not demand this level of control. As two distinguished researchers[i] have argued, “Research designs that do not involve total experimental control by the investigator . . . are always problematic in that the alternative explanations of the finding are invariably possible. . . .  This does not, however, rule them out as useful research designs.”

By its method, the Task Force manifested an intriguing “sleight of hand”: It conducted its analysis of intervention methods on the basis of demanding extreme rigor in order to get evidence of a specific causal effect. But after these methods allowed them to disqualify almost all relevant studies, they then declared that the lack of evidence for such a specific effect somehow justified a much broader conclusion, that being that change of sexual orientation is uncommon or impossible. They made the common mistake of confusing absence of evidence for an effect, on the one hand, with evidence of absence of an effect, on the other. (For more about this, see our The General Psychologist article mentioned above.)

[i]Bouchard, T. J., and McGue, M. (2003)  Genetic and environmental influences on human psychological differences, Journal of Neurobiology, 54 (1), 4-45. Quote p. 9.

Response to Dr. Patrick M. Chapman, whose extensive critical review of our 2007 book was posted on the website “Ex-Gay Watch”

Ex-Gay Watch describes itself as “the most comprehensive and widely read website dedicated to monitoring the ex-gay movement.” They are a source for counter-examples to change of sexual orientation and host sustained criticism of all facets of the ex-gay movement. The site is very extensive. This website hosted a three-part book review by Dr. Patrick M. Chapman of our 2007 book. The book review, which we do not reproduce here, presents his overall criticism of the methodology of the study in part one, his examination of our empirical evidence of change in part two, and his critique of the evidence about harm in part three. Our response was also posted in three parts (response part one, response part two, and response part three), but we reproduce below the entire response as submitted to Ex-Gay Watch. For the interested reader, Chapman had the last word.

Response to Dr. Patrick M. Chapman’s Review of “Ex-Gays”, posted on Ex-Gay Watch, November, 2007, by Stanton L. Jones and Mark A. Yarhouse

The greatest compliment that can be paid to any work of scholarship is for it to receive serious consideration and generate discussion.  Thus, we are pleased to see the review by Dr. Chapman of our book, Ex-gays?: A Longitudinal Study of Religiously Mediated Change in Sexual Orientation.  Chapman raises important issues, but in the end, we must conclude that his review fails to establish the serious flaws he claims in our study.

Response to “Part 1:  Introduction and Methods”

We applaud Chapman for correctly summarizing the main questions we examined in the study, for a reasonable brief summary of the study’s methodology, and particularly for granting us some credibility in saying that “They claim the ex-gay organization [Exodus] did not exert any control or power over their results and conclusions (p. 127), and there is currently no reason to believe otherwise.”  Minor points of disagreement with his summary and commentary include the following:

  • Our interest was not triggered by “the conflicting views of science [versus the claims of our] conservative Christian acquaintances;” but rather by the conflict between a) the prevailing and hardening consensus of mental health opinion that change is utterly impossible, based on a very mixed scientific record, versus b) the actual scientific record and the anecdotal claims of people we know.  Regarding the actual scientific record, note for instance the recent publication by a respected scholar of a report of some notable plasticity in “female same-sex sexuality” in a minority of women followed in a longitudinal study (Lisa Diamond, Perspectives on Psychological Science, 2(#2), 142-161. Diamond rightly concludes “the more we learn, the more we do not understand,” p. 142. She also, it must be said, would not regard her findings as providing support for change as understood in this study, but on the other hand, her results do challenge a simple “sexual orientation is utterly and always unchangeable” stance).  And Chapman in his review gives weight to the anecdotes of people he knows, and his own story, so once again we raise the question why only certain anecdotes are privileged as worthy of consideration in this debate.
  • Chapman implicitly dismisses “behavior modification” as trivial, but we see insufficient justification to take this step.  Some of our subjects experienced more than mere behavior modification, and even behavior modification can be very meaningful if it empowers a person to live in closer accord with her freely chosen core values.

The core of Chapman’s criticism of the study in Part 1 is that our study is somehow not truly prospective.  We would agree that if our study is not prospective then it is disingenuous to claim that it is, and the scientific value of the study is considerably weakened.  This charge, in other words, is truly significant.  Let’s look carefully, then, at the basis for Chapman’s claims.

First, Chapman claims that “technically the study is not prospective because 41 individuals were involved in the Exodus program for one to three years prior to the study (p. 121).”  The logic of this argument is not compelling.  We are utterly explicit that some of the subjects (the 41 “Phase 2 subjects” in the change process with their current Exodus ministry for 1 to 3 years) had been in the change process longer than others (the 57 “Phase 1 subjects” in the change process for less than 1 year).  We continue to maintain that the results for the Phase 2 subjects are worthy of inclusion and consideration, but we always report analyses of the Phase 1 population by itself for precisely the concern Chapman articulates:  If the reader insists on a tighter understanding of “prospective,” then you can narrow the focus to the Phase 1 results.  These results were not as positive as those for the population as a whole, but were still statistically significant and meaningful, with Phase 1 subjects represented in all six categories of outcomes.  Again, for Chapman to focus on the 41 Phase 2 subjects and then pronounce the whole study as not prospective makes no more sense than declaring that the results of our study are irrelevant for men because there were 26 women in the study.

Chapman’s second concern is more interesting and merits serious discussion.  He argues that our study is not prospective because “the claim that participants were at the start of their change process is misleading.”  He then cites several pieces of data indicating that subjects had previously tried to use other methods to change their sexual orientation before starting their current Exodus involvement (including through involvement in other religious ministries and professional therapy), and then concludes “Suggesting the individuals in this study are ‘starting the change process’ is incorrect. Perhaps this was their first attempt with Exodus ministries but that is not the same as ‘starting the change process.’”

Chapman seems to be arguing for an extremely literalistic understanding of “starting the change process.”  Our research question was the possibility of change through involvement in an Exodus ministry, and so we focused on persons between zero and 3 years into that change process.  Chapman is arguing for a much more rigorous standard:  that the only proper way to study change is to locate and study what we might call “change virgins,” people who had never attempted change at all.  We would argue that such a standard is unreasonable for several reasons:

  • First, such a standard is rarely applied in the study of other intervention methods with other targets of intervention.  We urge that our study be examined according to the standards applied to all psychological studies of change, and not by ad hoc standards with few parallels in the general literature.  We compare our results in the book with the pattern of results for the STAR*D treatment study of chronic depression, but the very idea that you would screen out all subjects who had previously sought help to change their depressive patterns to get a sample of “change virgins” is not credible.  If your goal is to study the effectiveness of a particular intervention method, why would you screen out of your study persons who had previously sought change by other means, especially when it is common in these ministries to work with people who have attempted to change before?
  • Second, to erect such a requirement for the validity of a study of change of sexual orientation would be to make such a study impossible to conduct.  How would you find a pure sample of “change virgins” who had never attempted change?  If people are distressed by their sexual orientation for religious, moral or other reasons, isn’t it likely that those persons would try a variety of formal and informal means to change that orientation?
  • Most importantly, if our research question is that of the possibility of change through involvement in an Exodus ministry, why would prior or even concurrent involvement in other methods of change serve as a barrier to involvement in the study?  If we are studying the effectiveness of anti-depressants in treatment of depression or of interpersonal therapy on marital relationships, what is the relevance of the subjects having previously received pastoral counseling for depression or having attended a marriage encounter weekend to enhance marital satisfaction?

So in the end, in response to Chapman’s criticism that “Perhaps this was their first attempt with Exodus ministries but that is not the same as ‘starting the change process,’” we would simply reply that by our saying that these subjects were “starting the change process,” we were implicitly and explicitly saying “starting the change process in this particular Exodus ministry.”  Hence, we believe that this study meets reasonable standards as a prospective study of individuals seeking sexual orientation change through the Exodus change process.  Chapman’s criticisms fail to establish the contrary.

Response to “Part 2:  A Focus on the Results — Examining if Change is Possible”

Here in Part 2 Dr. Chapman’s criticisms turn more severe.  First he asserts that ours is not a long term study.  Again, his logic is questionable, and the problem of incomplete citation of our argument is significant.  Chapman says “In the opening chapter Jones and Yarhouse honestly and correctly state this study cannot establish if long-term, permanent and enduring change occurs because that would require a long-term study (p. 17).”  What we actually say on page 17 is that “this study will not establish that permanent, enduring change has occurred; only a very long-term study can demonstrate that.”  Our point was not that our study was not a long-term study, nor that our study was inadequate to produce evidence suggesting that change was not impossible.  Our point instead was that if you want to show that change is permanent, then logically you have to study subjects throughout their lifespans to death to ensure the change was permanent.  So our study cannot show that change is permanent, but even so a three to four year span of time is scientifically meaningful and qualifies as “long term.”

Chapman’s subsequent criticisms share a common characteristic that must be noted:  Chapman imagines that he blunts our argument that change is possible for some by pointing out contrary pieces of isolated evidence that change did not happen for certain people or did not happen in certain ways he considers important.  Science, in contrast, operates by examining all relevant data for trends, and then applies that data to the evaluation of hypotheses.

Our hypothesis regarding change was that “change is impossible.”  The relevant data for falsification of that hypothesis is evidence that change is possible for some.  Imagine the argument that “it is impossible to sustain life through heart transplant operations.”  A scientist studies 100 heart transplants, and finds one year post-operation that 67% of transplant patients are still alive.  Does the death of 33% constitute evidence in support of the argument “it is impossible to sustain life through heart transplant operations”?  Of course not:  If heart transplants are not supposed to help people, then the relevant data is data that falsifies the hypothesis, i.e., evidence of people surviving.  Chapman’s selective citation of our data is the equivalent of focusing on the negative cases in this example.  This is explicit, as Chapman argues that our conclusion that change is possible for some “is unwarranted because . . .” and then cites a series of evidences of incomplete change.

It was very surprising for Chapman to build the core of his argument around selectively citing the 3 tables (7.4 through 7.6; pp. 239-240) that show no change (which we openly admit) while completely ignoring the other tables on the related variables that show significant change (7.1 through 7.3; pp. 238-239) AND while completely ignoring all of the other variables measured (the balance of Chapter 7) on which statistically significant change and effect sizes ranging from small to large were demonstrated.  It was in response to the broader pattern of evidence that we concluded that “change is possible for some” again and again through the book.  Chapman says that “This study is littered with biased and sloppy scholarship,” but actually provides no evidence of this.  Chapman and others who want to engage this work fairly need to respond to the overall pattern of our findings which, in contrast to the hypothesis that “change is impossible,” found many statistically significant changes and meaningful effect sizes on almost all of the measures of sexual orientation.  How can an exclusive focus on those few instances where statistically significant change was not found be justified?

Chapman then turns to a rebuttal of our qualitative categorization of outcomes, focusing first on those we termed “Success: Conversion.”  His core complaint is that some of these individuals report various forms of recurring homosexual attraction even as they also report satisfying heterosexual adjustment.  Should individuals who report any sort of continuing homosexual attraction be considered to have changed?  We discuss this matter throughout the book, but focus on it on pages 235-237 and 373-374, concluding that it is an unreasonable standard to deny that an individual has changed significantly if they experience any residual of homosexual desire.  Chapman takes the stance that any signal of homosexual attraction indicates full and enduring homosexual orientation; this strikes us as a naïve and dichotomous understanding of sexual orientation.  Further, such standards are not applied to other efforts at psychological change, and we believe they cannot and would not be so applied.  Marital couples continue to struggle with conflict; persons with addictions continue to experience cravings.  Put differently, the same sorts of standards that recognize significant change with other psychological patterns that are the subject of change attempts should hold for the area of sexual orientation as well.

Chapman then dismisses our conclusions about those who experienced a decrease in the potency of their homosexual desires and were able to embrace chastity, and who themselves considered this a successful outcome to the change process.  Chapman suggests that we “accept asexuality as a functional opposite of homosexuality. Based on the depression analogy it appears that Jones and Yarhouse would declare a person ‘healed’ from depression if they ceased to have any and all emotions, for the person would no longer be intensely and persistently sad.  I suspect the psychological community would define success in other ways.”

This is an important argument, to which we would respond in two ways.  First, these individuals found themselves neither devoid of all emotion nor utterly asexual in the sense of being emotionally dead.  Instead, their common testimony was of experiencing a diminishing of unwanted, powerful same-sex attractions, and that this decrease enhanced their experiences of satisfying emotional and relational connections with God and with other persons in non-erotic relationships.  These people typically felt themselves more emotionally alive and healthy as a result of experiencing a decrease in homosexual attraction.  Second, we must ask who has the authority to deny these individuals the opportunity to make their own choices about what they find satisfying in life.  These individuals regard their adjustment to be successful; is Chapman positioned to assert his view of their lives over theirs?  Yes, some of the subjects reported experiences discordant with their desires and hopes for complete change.  But these individuals (except for the one who retracted his claim to change) did not see these experiences as negating the reality of positive change in their lives.

Chapman’s concluding paragraph deserves careful attention.  We quote him, and then comment on each of his challenges:

  • “Despite explicitly stating that this study cannot demonstrate whether long-lasting change is possible. . .”  As stated above, this is NOT what we said.  What we said was that our study could not prove change was permanent.
  • “despite admitting that individuals in ex-gay ministries misreport their condition . . .”  This is NOT what we said.  Rather, we report in the book how some Exodus ministries urge their clients to reject the notion that their same-sex attractions mean that their identity is that of a homosexual person.
  • “despite knowing that previous testimonies of change were untrue . . .”  Rather, we recognize that some previous testimonies of change have proven to be untrue.
  • “despite knowing that one of their own ‘Success: Conversion’ participants later recanted his proclaimed ‘conversion’ to heterosexuality. . .”  As we say in the book, we report the data as it presents itself, as the experience of one person does not invalidate that of another.  The experience of change of Alan Chambers, President of Exodus, does not invalidate Dr. Chapman’s experience that he did not change, and it is for this reason that we insist that the implication of our research is that change appears possible for some, specifically that “change is not impossible” (p. 365), and that our data does not prove “that everyone (or anyone) can change” (p. 372).
  • “despite the fact that ‘Success: Conversion’ and ‘Success: Chastity’ participants retain a homosexual orientation (using Jones and Yarhouse’s own definition). . . ”  Chapman has inadequate basis for this claim.  He selectively picks counter-examples to the evidence of significant change, and ignores the direct evidence of change such as the reported changes summarized in the bar graph on page 296.

Given Chapman’s selective engagement with the data of our study (specifically, his focus only on a series of small slices of the results congruent with his skepticism about change), he responds incredulously to the fact that “the authors claim that homosexual orientation is changeable! Clearly their conclusion is not consistent with the evidence.”  Dr. Chapman, you appear to have reached your conclusion that our evidence proves that change is impossible by selective engagement with only those pieces of evidence that fit your conclusion.  We, in contrast, engaged all of the data as a whole.

Response to “Part 3:  A Focus on the Results — Examining if it is Harmful”

In this final response, Chapman raises a number of interesting questions, but again continues 1) applying a pattern of logic and argument that would, if applied broadly in the mental health field, establish self-defeating and unsustainable implications for the entire field and 2) on that basis then highlighting isolated findings and anecdotes as if they refute the broader pattern of empirical findings from the study.

In his first paragraph, Chapman chides us for imprecision and inconsistency both in how we characterize the claims about harm made by the various professional organizations, and in how we characterize our own findings and conclusions.  He provides a link to the very same American Psychological Association Public Affairs website that we cite in our book that cautions about harm from attempts to change sexual orientation.  This is one of the less forceful warnings about harm (we cite others in our book in many places; see for example pp. 330-331).  Further, public pronouncements by key professional representatives (for instance, psychiatrist Jack Drescher’s op ed piece, titled “Conversion attempts mostly lead to harm,” at [link deleted; this op-ed piece was published August 16, 2007 in the newspaper The Tennessean, but we have not found an active link that connects to this article]) have yet further heightened the perceived likelihood and severity of risk of harm.  Regarding his listing of how we describe this literature in the book, we do regret using “always” harmful (p. 19) as he points out, but the other quotes are reflective of the diverse array of characterizations of the likelihood of harm.

To address his pattern of logic, let’s begin with some simple clarification of how to think about harm.  I (Jones) recently had minor knee surgery, and both the surgery itself and the medication prescribed post-surgery had risks.  The fact that the rare person has had serious, even devastating reactions to such surgery and medication did not and cannot itself invalidate my choice to pursue this procedure or the doctor’s administration of the treatment.  The risks have to be weighed against the potential gains I expected in light of my dissatisfaction with the state of my knee prior to surgery and in light of the likelihood of such risks.

The attempt to change sexual orientation is no doubt much riskier and more challenging than knee surgery.  But just how severe are the risks and just how likely are they to obtain?  It is to answer this question that we framed our search for answers in this area in terms of harm “on average.”  Chapman would seem to want to frame the question in terms of evidence that any harm occurs for anyone, a characterization substantiated by his listing of five anecdotes from our book of some level of unhappy reaction to the change process, followed by his rhetorical question, “One wonders what would have to be the reports of the participants for Jones and Yarhouse to declare the ministry harmful?”  If only the matter were that simple.  We could ask in return, How many positive results of participants would have to be reported, and how many reports of distress and unhappiness in living in the gay community would have to be reported, to justify the continuing existence of an option for attempting change?  The type of standard used by Chapman would be completely unrealistic and paralyzing for the mental health field.  Many interventions with complicated or distressing conditions produce some negative outcomes.  When starting treatment with a depressed person, one always has some sense that if the attempt to intervene is unsuccessful, the person could plunge into despair about the possibility of change and be worse off than before.  But such outcomes are not common.

But our answer was not to make that judgment for ourselves, but rather to report changes in distress level on average for those attempting change and to argue that ultimately it is the individuals themselves seeking change—and not Chapman or us—who should make their choices about whether or not to pursue change based on their own reading of the evidence.  Chapman would urge that the professional world together declare such intervention attempts invalid based on the power of the anecdotes of harm; we would argue instead that individuals should be empowered with the best array of information available to make their best choices for themselves (see pp. 377-382).

Armed with a poorly developed rationale for how to handle harm, Chapman utterly disregards the pattern of standardized findings showing no escalating patterns of distress on average across the sample, and instead claims that the five anecdotes of distress and harm we present in our transcripts establish an unacceptable level of harm for participants.  He states, “Nonetheless, dismissing this possibility and ignoring the statements of the participants that remained in the program, Jones and Yarhouse confidently declare the change process is not harmful. Once again, their conclusion is not based on the evidence: those who declare they are hurt by the process are evidence of harm.”  This is both right and wrong.  It is right in that we do indeed handle the few anecdotes of harm under the more general umbrella of empirical findings that distress does not increase on average.  It is wrong, in that we do not “declare the change process is not harmful,” but rather declare the change process is not harmful on average.  For further evidence that this is so, the reader should read our point 9 in our Conclusion (p. 376), in which we state emphatically that “despite our finding that on average participants experienced no harm from the attempt to change, we cannot conclude that specific individuals are not harmed by an attempt to change.”  We point out there that harm may obtain because of the type of intervention or because of the emotional vulnerability of the person seeking change.  We also allude to the fact of political realities:  “It is also necessary to say that claims of harm may be ideologically based and exaggerated for the sake of foreclosing the option of the attempt to change” (p. 376).

Chapman asks, “How many lives must be broken before the authors realize the actual damage caused by these ministries outweighs any potential good?”  This is a good question, but not the only question.  A contrasting question might be “How many testimonies of significant and satisfying change, and how infrequent do the empirically documented evidences of harm have to be, before opponents of such change efforts might be willing to cede to these efforts a continuing right to exist as long as they operate with rigorous levels of informed consent?”

Chapman closes with a nod to the bigger picture:  Sexual orientation, he asserts, is determined before birth, “no scientific study has successfully identified any postnatal causal factor or factors,” and therefore sexual orientation is immutable.  We, in contrast, 1) would acknowledge that there is intriguing evidence of biological factors involved in causation of sexual orientation, but would also argue that the evidence is far from establishing complete biological determination of sexual orientation; 2) suggest in contrast to Chapman that there is intriguing evidence of postnatal factors in causation (see our pp. 122-125 as well as our previous publications); 3) argue that the establishment of partial biological causation does not in itself logically entail that orientation is utterly immutable for everyone, and 4) join with Lisa Diamond (see our response to Part 1) in concluding that “the more we learn, the more we do not understand.”

How would we present the bigger picture in contrast?  Chapman’s review adds validity to our study.  He asserts bluntly “sexual orientation cannot be changed,” and clearly feels that harm is so likely and likely so devastating (“How many lives must be broken?”) that there is no merit to the attempt to change.  It was precisely to address these questions that we performed our study.  Chapman ignores the data from our study that does not fit his conclusions.  We believe that a fair read of our study produces a more difficult, complex, challenging set of conclusions (see Chapter 10), namely that:  1) change appears possible for some but not for all, and further this change is for some ambiguous, complicated, conflicted, and incomplete; 2) while harm may occur for some, on average the participants did not experience increased distress as a result of the attempt to change; and therefore 3) we would urge that individual consumers be empowered to make the best choices for themselves based on the best evidence and on full disclosure from multiple sources of information.  

Our responses to criticisms from the gay advocacy group Truth Wins Out

Two creative, entertaining, alarming, outrageous and distorted reports about our research have been released to the general media from Truth Wins Out (TWO), a forceful gay advocacy group. The first report (http://www NULL.truthwinsout), coordinated with the release of the 2007 book, declared ours a “SHAM STUDY.” The second report (http://www NULL.truthwinsout) was more of a personal attack on author Mark Yarhouse. We here respond to a number of their criticisms:

  • TWO calls our book a “biased new politically motivated ‘ex-gay’ sham study;” TWO also claims that “The study was conducted by two supporters of ex-gay ministries.” Response: The motivations behind our study are not political, but scientific. When our scientific and professional organization, the APA, declares homosexuality “unchangeable,” it is an interesting scientific question to ask, as illustrated in the massive Task Force Report, whether or not this is true. Our motivations were no doubt complex, having religious and moral dimensions as well. As to the claim that our study is “biased,” we assume that this is a claim that we as researchers are biased. We have indeed previously defended the existence of the religious ministries of Exodus because, as scientists and as professionals, we have felt there to be insufficient evidentiary base behind many of the criticisms. TWO would likely only consider a researcher “unbiased” who approached the study of Exodus ministries with the presumption that change of sexual orientation is impossible. Is that not a bias? We felt it best to approach the subject with an open mind, and to report the findings honestly. If we have a bias, it is towards telling the truth.
  • The researchers are “controversial ‘researcher[s]’ at a right wing religious university.” Response: This is ad hominem argumentation, both towards us as persons and towards our educational institutions. Perhaps TWO considers us controversial, but we have appropriate credentials and have published credible research in respected journals. We criticize conservative religious thinkers who dismiss quality research just because it is authored by GLBT researchers, funded by gay advocacy groups like the Human Rights Campaign, or published in outlets clearly dedicated to GLBT advocacy. We would like to have the same respect shown to us.
  • “the research likely consists of calling handpicked ex-gay lobbyists and ministry leaders on the telephone and asking if they had ‘changed’”; also “[they] utilized activist research subjects who were recruited with help from Exodus and the ex-gay therapy lobby NARTH” Response: This is a shocking and libelously false description of our study, one regarding which TWO made no efforts at confirmation or disconfirmation with us before publication. NARTH played no role in our research. TWO clearly demonstrated its lack of commitment to journalistic standards and ethics with this characterization. The only grain of truth in this allegation is that we did recruit subjects from Exodus with the help of Exodus leadership.
  • Truth Wins Out’s Executive Director Wayne Besen [stated] “We challenge Dr. Stanton Jones to submit his so-called ex-gay subjects to the ‘No Lie MRI’ because we believe that ex-gay ministries are consumer fraud and his reported study may be invalid.” Response: This is a comical statement and a theatrical gesture. We discuss in both reports why we did not use psychophysiological measures to assess sexual orientation in our study. Besen’s challenge was the first time we had heard of the “No Lie MRI,” and we were interested to see its questionable validity discussed in a thoughtful article (http://www NULL.newyorker) published about the same time as Besen’s challenge.
  • According to Besen, “It is folly to suggest that telephone interviews can be considered genuine research.” Response: Actually, for many types of research, telephone interviews are a standard practice. Our method for assessment was actually a combination of paper and pencil questionnaires and telephone surveys.
  • Their work was funded by Exodus. Response: Indeed, our research was funded by grants from and through Exodus. We discuss this openly in the 2007 book. In accepting the grant, we challenged the leadership of Exodus that we were committed to publishing our actual findings regardless of how affirming or embarrassing they might be to Exodus. The validity of research depends ultimately on the soundness of the methodology and the integrity of the researchers, not on the source of funding. GLBT organizations have funded some of the research that we all depend on in this field; pharmaceutical companies fund much of the research that we depend on when we take drugs for serious medical conditions.
  • Jones and Yarhouse originally sought 300 participants, but after more than a year of seeking to round up volunteers, they had to settle on only 98 participants. Response: True. We discuss this openly in the 2007 book. This was attributable, fundamentally, 1) to the loose organization of the independent ministries that identify under the Exodus umbrella, 2) to the understaffed nature of many of these ministries, which leaves largely volunteer leaders with little time to dedicate to cooperation with researchers, and 3) to the suspicion of researchers in the minds of many Exodus leaders after their poor treatment by past researchers.
  • During the course of the study, 25 dropped out, and one participant’s answers were too incomplete to be used. Response: As we discussed in the book, our dropout rate was parallel to several “gold standard” longitudinal studies. It is impossible to conduct a study following a population over time and to retain every one of your original subjects.
  • After the study ended, but before the book was finished, one of the 11 [success cases] wrote to the authors to say that he lied — he really wanted to change, had really hoped he had changed, and answered that he had changed. But he concluded that he hadn’t, came out, and is now living as an openly gay man. Response: TWO can report this because we reported it openly first. TWO may be interested to know that as the study progressed, one other person who was declared a “gay identity failure” case at the three-year mark later repudiated gay identity and reentered the Exodus change process.
  • The study purposely declined to interview any ex-gay survivors: people who claim to have been injured by ex-gay programs. Response: True. We did purposely decline to interview “ex-gay survivors” because these were not the focus of our study. Our purpose was to conduct a prospective study of people early in the process of attempting change.  “Ex-gay survivors” are by definition not suitable candidates to be subjects in a prospective study.
  • the study’s supposed success stories were gay celibate individuals who adopted false labels to direct attention away from frequently undiminished same-sex attraction. Response: Our success cases included individuals who experienced orientation change and also those who embraced sexual chastity. We were very open about this; there was no fabrication or avoidance of these realities.
  • In short, the study design was so flawed that no mainstream, peer-reviewed, mental-health journal would publish it. Response: This is clearly not true now even if it was true years ago, because the respected Journal of Sex and Marital Therapy chose to publish the six-year report of our study. It is false that “no mainstream, peer-reviewed, mental-health journal would publish it,” because we never made any such attempt with our first report.  We deliberately chose to publish our first report of our three-year findings as a book because of the complexity of the issues and findings. We did this knowing that it would have less professional credibility because of the form of publication with a religious publisher.
Video Summary of Responses to Criticism in 2007