By JL Cederblom, an independent researcher, republished with his permission from his Medium. Of note, former WPATH president and board member and transwoman Dr. Erica Anderson mentions the problem with gender research in her recent opinion piece in the San Francisco Examiner, where she notes: “Unfortunately, we find the research on trans youth has not kept up with what is happening.”
In March 2021, a paper was published in Plastic and Reconstructive Surgery Global Open, the open-access-only (meaning pay-to-publish) version of Plastic and Reconstructive Surgery. This is the journal of the American Society of Plastic Surgeons, a fairly reputable medical organization, at least as far as plastic surgeons go.
The article, Regret after Gender-affirmation Surgery: A Systematic Review and Meta-analysis of Prevalence, was part of a larger set of four papers on the topic of gender medicine: three purported systematic reviews, covering regret, complications, and satisfaction with surgery respectively, plus a fourth, Building a Multidisciplinary Academic Surgical Gender-affirmation Program: Lessons Learned, describing how the authors set up their own gender clinic. As it is the topic I am most familiar with, I'll mainly be talking about the regret paper.
As the evidence for the efficacy of these interventions is tenuous at best — something even WPATH studies note — other ways to prove they are at least not detrimental or mistaken are explored. Looking at reported regret and detransition is one such option. It’s worth adding that historical data for this is not actually that meaningful as both the population and the criteria to undergo medical transition in the past are worlds apart from what we find today, but it is all we have to go on.
The time and effort required to put together four academic papers, including three systematic reviews, is considerable. The cost to publish an article of this format in the journal is listed as (USD) $2195, which adds up to $8780 to publish these articles. The academic publishing process involves lengthy peer review and high standards. The authors report using PRISMA and NIH quality assessment tools to aid them in their process. Their credentials are impeccable, with the nine listed authors having MDs across the board, as well as two PhDs and two FACS. Everything appears set up to go right.
So what actually happened?
I’ve been following this issue for a few years now and I’m quite familiar with the source material, so when a systematic review was published it quickly rose to the top of my to-read list. As soon as I read the paper it became clear something was very wrong here.
Small, inconsequential errors are commonplace across scientific literature, although research into gender is generally of especially poor quality. To illustrate what I expect from papers in this field: the review states that "Approximately 0.6% of adults in the United States identify themselves as transgenders," citing 2014 data. This completely ignores the explosive growth of the phenomenon since then, as well as the uneven distribution of that growth; for some areas and age groups the figure appears closer to 10% in more recent data. This kind of misinformation is expected in these papers. It's not technically lying, or even strictly erroneous, but it's still simply wrong.
What I got with this paper was on another level. Sure, the usual warning flags appeared right away: conclusions in the introduction, rampant equivocations, and completely unsupported statements. But something more fundamental was at play here. Nine authors, peer review, a six-month publication process, thousands of dollars, yet they didn't even run spell check on it? And what are these numbers? They are wrong all over the place. And these weren't even simple one-off errors; different tables disagreed with each other. The metaphor that comes to mind is drunk driving.
The most glaring example is Wiepjes et al., 2018, where Bustos et al. claim 4863 people had gonadectomies, at mean ages of 23 and 33 for adult females and males respectively, and 26 and 16 for adolescents. They were then supposedly investigated for regret via questionnaire. If we read the actual paper, we find that 2627 people had gonadectomies, no one was under 18 at the time of surgery and it was a medical records search rather than a questionnaire.
Table 2 has 10 columns, and somehow Bustos et al. managed to put erroneous data in eight of them for Wiepjes et al. — only “Authors and Year of Publication” and “Country” were correct. Normally those would go without saying, but surprisingly they managed to get them wrong elsewhere: they get the names of the authors wrong for two of the papers and the country wrong for three.
I spent some time going over the whole table in detail, finding dozens of errors. If we include inconsistencies and erroneous footnotes, the number approaches 50, for a single table. There are errors in every single column, and for 19 of the 27 papers. The most frequent error is that Bustos et al. claim data was "not specified" in a paper where it actually was. For example, in Landén et al. (whom they here call "Laden") the sex ratio is clearly provided but is listed as not specified in the review, one of dozens of errors of this type. There appears to have been no effective proofreading stage of any kind.
The erroneous sample sizes are eye catching, but more fundamental is perhaps their misreporting of the “assessment tool”, how regret was measured in these papers. It reveals that the authors did not actually understand what they were reading, and that they did not care that they did not understand. The same goes for the reviewers — for something like this to be published, we’re not talking about a single point of failure. The nine authors, several reviewers and the editorial staff all have to take a serious look at themselves.
The errors span every possible aspect of a review, or academic paper in general for that matter. The numbers are incorrect, the procedures are incorrect, the fundamental thing they aim to extract — the rates of regret — is incorrect. They don’t seem to have understood what they were reading. Although perhaps that is asking too much as they don’t even spell correctly in an article they paid $2195 to publish.
I fairly quickly heard of others who shared these concerns and planned to reach out to the journal. In a published letter to the editor, Expósito-Campos and D'Angelo point out that the search was insufficient (my own list of papers matching the search criteria includes at least another ten, and it is not exhaustive) and even plainly wrong: it included a paper that clearly should not have been included, as it measured "regret" over choosing neo-vulva construction instead of full neo-vagina construction. The letter notes some of the data errors, as well as the absurdity of suggesting any of the findings could be generalized to "non-binary" people, as the sample size for that group was a single individual.
In a reply to this letter the authors make only non-responses:
- When confronted with the fact that their systematic review missed a number of key studies, they simply state that they developed a "comprehensive search strategy" following PRISMA guidelines. Setting aside that they only partially followed PRISMA, "we performed a comprehensive search" is not an answer to "you missed a lot of papers, here are examples"; the claim has already been shown false by the examples themselves. Why would anyone respond this way?
- For the paper that they erroneously included they simply state that it “complied with the selection criteria […] clearly stated in our methods section”, which is simply inaccurate. The paper measured regret in choosing one procedure over another, which is a completely different question from regret with undergoing gender-affirming surgery in the first place. It is clear that the authors did not understand what this paper investigated, as is evident from the fact that they use the wrong rate as well.
- Regarding their Wiepjes et al. miscalculation they appear to double down, stating that they “identified only those patients who underwent gonadectomy” before detailing their exact calculation. Despite being explicitly informed of the step they skipped they repeated their error and once again arrived at the incorrect number.
To hammer that home further, this error should never have happened in the first place. A footnote to Table 1 in Wiepjes et al. makes clear that the percentages in the "Underwent gonadectomy" rows refer to those who were "treated with gender-affirming hormones," not to the entire study population. The paper also explicitly states the numbers: 1,742 males and 885 females. Even a brief moment of consideration reveals an odd situation if their calculation were accurate: it would mean that many more people had gonadectomies than were started on cross-sex hormones.
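The arithmetic can be checked in a few lines. This is only an illustration using the figures quoted above: the per-sex counts are those stated in Wiepjes et al., and the inflated total is the one Bustos et al. arrived at by applying the subgroup percentages to the wrong denominator.

```python
# Gonadectomy counts explicitly stated in Wiepjes et al., 2018.
# (Per the Table 1 footnote, the percentages in those rows apply only
# to the hormone-treated subgroup, not the entire study population.)
males_with_gonadectomy = 1742
females_with_gonadectomy = 885

correct_total = males_with_gonadectomy + females_with_gonadectomy
print(correct_total)  # 2627, the figure the source paper actually reports

# Bustos et al. instead reported 4863, the result of applying the
# subgroup percentages to the full cohort as denominator.
claimed_total = 4863
print(claimed_total - correct_total)  # 2236 patients who never existed
```

The gap of over two thousand phantom patients is why the "many more gonadectomies than hormone starts" sanity check should have caught the error immediately.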
- Regarding generalizing a non-binary sample size of 1 to an entire population — now the most common form of trans — they simply don’t respond. They do not admit fault, nor do they offer a justification. And how could they? There is no defense for this. Their response is simply to pretend it didn’t happen.
The authors end their reply by stating that in conclusion, they “strongly believe that this study is of very high-quality and adds important and relevant information to the literature despite the observations stated in the letter.”
A baffling situation, to say the least. The level of incompetence required to make these mistakes is matched only by the level of brazenness required if this was fraud. Perhaps the question should be "incompetence, fraud, or delusion". Or perhaps I am just reinventing the wheel, and the phrase "this is not just 'not right', it's not even wrong" already covers it.
Where does this leave us?
Even a casual look at the paper reveals trivial flaws. A deeper look reveals a fundamental lack of understanding of what science is. The process did not work. This would have been given a failing grade had it been written by a high schooler. I've seen much higher-quality papers be fully retracted. This one has not even been corrected, despite ample opportunity. So where exactly is the line between incompetence and fraud?
I don’t know. Perhaps the original paper was a result of incompetence, and the handling of it after this became apparent is fraud. Perhaps it’s incompetence throughout, with a healthy dose of delusion on the part of the authors. But where is the editor in all of this? Why has this mess not been cleaned up? Why has the editor allowed the situation to reach a state where the credibility of the complete staff and process is falling apart? I certainly don’t have answers, and I doubt they do either. They certainly haven’t responded to my attempts to get answers from them.
Perhaps it is the wrong question to ask in the first place — it doesn’t actually matter if it’s a result of incompetence or fraud. The reality of the matter is that this is normal in this field. You’ll find it in every journal, from every discipline. Something about this field makes people lose their heads. It’s very rare to read a paper and not have to ask yourself if it’s out of ignorance or malice.