Why Lawrence Lessig is wrong about Francesca Gino
Francesca Gino is almost certainly guilty of multiple counts of fraud
Recently Lawrence Lessig has come to the defense of embattled Harvard Business School (HBS) professor Francesca Gino. (Gino is technically still a professor at HBS, although she is on two years of unpaid leave and barred from setting foot on campus.)
Lessig is a law professor at Harvard Law School and is the former director of the Edmond J. Safra Center for Ethics at Harvard. He also ran for president in 2016.
Lessig says he believes Gino is 100% innocent. He says:
“But having known Gino for almost a decade, and (though I do not represent her) having seen some of the evidence about the anomalies, I strongly urge any fair-minded soul to reserve judgment in this case. There is, in my mind, exactly zero chance that Gino manipulated any data at all.”
“When the full evidence is revealed by her lawyers, with the benefit of third-party discovery and a serious data forensics expert, I am certain that there will be no serious doubt that Gino is innocent of the charge made against her.”
This is ridiculous for a number of reasons:
First, did you notice that he contradicted himself directly? Go back and re-read the section I highlighted in bold. [Edit: I’m getting push-back about this, believe it or not. He tells everyone to reserve judgement, then he makes a judgement. While technically not a contradiction, it is hypocritical (contradicting one’s own advice)]
Secondly, it’s not rational to say there is 0% chance or 100% chance of something, unless you’re talking about “a priori” truths like 2 + 2 = 4. So Lessig’s statement is a deviation from rational thought.
Finally, and most importantly, multiple converging lines of evidence suggest that she committed the fraud in all four cases. Of course we can’t be 100% sure. But I think a credence of 85% here is reasonable, and 0% is ridiculous.
Brief recap, for those who need it
If you don’t have time to read all four of the Data Colada posts, then I recommend watching this 20-minute summary video:
Francesca Gino was a superstar faculty member at HBS, with a named professorship and a salary of over $1 million. She used to give talks to audiences about her research on behavior change, commanding a speaking fee of $50,000 - $100,000.
For years many thought there was something fishy about Gino’s work. She was continually publishing surprising findings. The levels of statistical significance found in her studies were oddly high. There was also an increasing number of failed replications of her work (one researcher, Michael Sanders, spent $250,000 trying to replicate the 2012 study in PNAS listed below).
Three researchers who are also involved in behavioral science research run a blog called Data Colada where they investigate research misconduct. They decided to investigate her articles using what public datasets they could obtain. They sent a report of their findings to HBS in fall of 2021.
In spring 2023 the Data Integrity Office at HBS concluded a very thorough investigation with the assistance of an outside forensics firm. Their report is said to be over 1,100 pages long. On the basis of that report, Gino was put on leave and retraction requests were issued by HBS for four of Gino’s papers.
Only after that process had concluded did Data Colada publish the findings of their report. This wait was a courtesy to HBS.
If you read Data Colada’s analysis, there’s very little doubt that deliberate data manipulation occurred in all four articles. In all four cases there is smoking-gun evidence of data manipulation, and the manipulated data swings a key result from statistical non-significance to significance.
Interestingly, the first paper had already been retracted in 2021 for a separate fraud, also discovered by Data Colada. The subject of the paper, ironically, was dishonesty. That’s right: two acts of fraud were committed by two different authors in two separate experiments that were reported in a single paper. This raises some troubling questions. Did the researchers collude? If not, then the presence of two independent frauds in one paper suggests fraud is much more common than we care to think.
Anyway, that’s enough of a recap.
Here’s why I think it’s pretty clear Gino was involved in all four cases of fraud:
The one common denominator between the four cases is Gino
Here are the four papers:
Shu, Mazar, Gino, Ariely, & Bazerman, PNAS, 2012. (https://datacolada.org/109)
Gino, Kouchaki, & Galinsky, Psychological Science, 2015. (https://datacolada.org/110)
Gino & Wiltermuth, Psychological Science, 2014. (https://datacolada.org/111)
Gino, Kouchaki, & Casciaro, J. Person. Soc. Psych. 2020 (https://datacolada.org/112)
All four have been retracted at the request of Harvard Business School. Looking at the author list here, the only common author across all four papers is Gino. Either Gino was extraordinarily unlucky to have worked with multiple different fraudsters or she is the perpetrator. Occam’s Razor applies.
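For anyone who wants to check the "only common author" claim mechanically, here is a minimal sketch that intersects the four author lists given above. The short labels used as dictionary keys are my own shorthand, not official identifiers, and the check says nothing about who did what; it only confirms which surname appears on every paper.

```python
# Author surnames for each of the four retracted papers, as listed above.
# The dictionary keys are informal labels chosen for this sketch.
papers = {
    "PNAS 2012": {"Shu", "Mazar", "Gino", "Ariely", "Bazerman"},
    "Psych Science 2015": {"Gino", "Kouchaki", "Galinsky"},
    "Psych Science 2014": {"Gino", "Wiltermuth"},
    "JPSP 2020": {"Gino", "Kouchaki", "Casciaro"},
}

# Authors who appear on every one of the four papers.
common_authors = set.intersection(*papers.values())

print(common_authors)
```

Kouchaki appears on two of the four papers, but no name other than Gino appears on all of them.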
Gino was directly responsible for the specific datasets where manipulated data was found
Let’s go through the four papers and look in a bit more detail and see if we can track down who was responsible for the fraudulent data:
(Side note: in behavioral science it is common for papers to discuss multiple small experiments that were all done with the goal of testing a particular hypothesis. All of these papers followed that format.)
The first paper contains fraudulent data in experiment one (and also experiment three). The paper says the data was collected “at local universities in the southeastern United States”. Data Colada explain: “Study 1 was run at the University of North Carolina (UNC) in 2010. Gino, who was a professor at UNC prior to joining Harvard in 2010, was the only author involved in the data collection and analysis of Study 1.” Indeed, Gino is the only co-author to be located in the southeastern US. As a side note, Data Colada also suspect experiment two is fraudulent, which was also run in the southeastern US.
The second paper contains fraudulent data in experiment four. The paper states that the experiment was done on Harvard University students. The author contributions section says: “Testing and data collection were performed by F. Gino and M. Kouchaki”. Kouchaki is at Northwestern University in Illinois. Gino is the only author at Harvard. So it seems a fair inference that Gino conducted this experiment, or she at least directed it. Furthermore, Gino was the one who uploaded the fraudulent data to OSF. The retraction notice states that the OSF dataset differs from the dataset found in Gino’s lab — some data is missing in the OSF version. When the missing data is added, the key result of the experiment doesn’t replicate.
The third paper contains fraudulent data in experiment four. The Data Colada authors say “We received this dataset several years ago directly from Professor Gino.” The retraction notice states that evidence of data manipulation was discovered in Professor Gino’s lab records. The only other co-author was Scott Wiltermuth, who was far away in sunny California.
The fourth paper contains fraudulent data in experiment 3a. The retraction notice states (italics mine): “This retraction is at the request of the Research Integrity Officer at Harvard Business School after the results of a review into data for Study 3a collected and analyzed by Francesca Gino. The review identified unexplained discrepancies between (a) the data associated with Study 3a in the Open Science Framework platform and (b) the original, raw data collected in Qualtrics. These discrepancies, which involved 28% of the total data from Study 3a, biased the published results in the direction of the study hypothesis…. The report of the investigation did not concern either Maryam Kouchaki or Tiziana Casciaro, who agree with the retraction.”
Of course it’s still possible that in all these cases there was an unattributed research assistant (RA) that nobody knows about who actually did the fraud. This is the possibility Gino’s lawyers argue in their lawsuit. That possibility raises its own set of questions: why did multiple different RAs at two different universities all commit fraud over the course of 8-9 years? Why did Gino not detect at least one of the discrepancies in the datasets that Data Colada found, a few of which were rather obvious? Why were all of those RAs not appropriately credited with either a co-authorship or acknowledgement at the end, if they were doing most of the actual work?
Furthermore, what incentive would Gino’s RAs have to commit fraud? Why would they risk their academic careers if they were not even going to get credit? Were they pressured by Gino to report good results? The lawsuit also says that HBS's investigation asked the RAs if they were ever pressured to commit fraud, and they all said no.
If there was any way of shifting blame to RAs, HBS would have a very strong incentive to do so. That brings us to the next section:
Harvard couldn’t find other people to blame
Universities usually work very hard to avoid having to take serious action on these cases. (See this case at Duke University, or look at how slow Stanford was to act on Tessier-Lavigne.) Firing a professor or putting them on leave looks really bad. It’s especially painful for HBS in this case because of the superstar status of Gino.
Harvard dragged their heels for almost two years investigating the case. If they had brought in a forensic data analyst, they probably could have completed everything in a few months.
If HBS could have found anyone else to blame for the fraud, they definitely would have done so, in order to minimize the fallout from Data Colada’s posts. So the fact that they were unable to speaks volumes.
Renowned science policy expert Stuart Buck responds to Lessig thusly on Twitter:
“Why would Harvard care about a "threat" to "release data"??? Unless the data was absolutely damning, Harvard wouldn't care.
Nor would Harvard undertake a years-long investigation of a superstar faculty member, write a 1000+ page report, and suspend her, unless they had turned up overwhelming evidence. Universities bend over backwards to avoid this outcome, so the evidence must have been incontrovertible.”
A few final points
If you haven’t already heard, Gino sued both HBS and Data Colada. This is really sad and will have a chilling effect on whistleblowers. You can support Data Colada’s legal defense here.
Lessig may have a point that the investigation at HBS breached Gino’s contract by not following all the right procedures, as Gino’s lawsuit alleges. That question is independent of whether Gino is responsible for the four acts of fraud discovered by Data Colada, which is the question this post looked at.
Even if by some outlandish coincidence the fraud was done by someone else in all four cases, Gino is still the responsible party as the co-author responsible for the experiment in question. Arguably all of the co-authors bear some responsibility for everything that is published under their name. Scientific fraud is very harmful and some punishments are in order to deter others from committing it.
HBS should release their report so the scientific community can understand the full extent of the fraud. HBS should also investigate all of the research Gino did while at HBS and report their findings. (It appears they only investigated the four papers Data Colada wrote about, but Data Colada state “We believe that many more Gino-authored papers contain fake data. Perhaps dozens.”)
HBS should strip Gino of the title of Professor. It’s an embarrassment to HBS that Gino is still allowed to call herself a Professor on her LinkedIn and social media. Harsher punishments are needed to disincentivize fraud. Two years of administrative leave is weak sauce.
Appendix: a few of the comically ridiculous aspects of this case
Part of the reason this particular episode of fraud has gotten so much attention is that there are a number of absurdly comical aspects:
As noted before, at least two acts of fraud were committed in a single paper on dishonesty.
The name of the third paper is “Evil Genius? How Dishonesty Can Lead to Greater Creativity”.
Francesca Gino’s most recent book is entitled “Rebel Talent: Why It Pays to Break the Rules at Work and in Life.”
Five of the authors of the fraudulent PNAS paper published a second paper in PNAS eight years later attempting a replication of experiment one of their first paper, one of the fraudulent experiments. So Gino and others got two PNAS papers out of a single fraudulent one: first by publishing the fraud, and then by publishing a paper that failed to replicate it. Gino even had the gall to co-author a piece in Scientific American about how heroic they were being by publishing a failed replication of their own work.
Further reading
Addressing the Data Analysis in Francesca Gino’s Data Colada Lawsuit - this is a deep dive into why her lawsuit against Data Colada is unfounded.
Support Data Colada's Legal Defense - GoFundMe page.
Your October '23 post provoked 2 comments that could've been transplanted from LinkedIn, where Gino was defended by a million mini-Lessigs, who fell in love with her through her consulting and TED talking.
I'm re-reading your post now that the court threw out Gino's defamation suit against Data Colada. I didn't notice your calling out of Duke for its laxity in exposing fraud, and the $112MM settlement that required in 2019. To me, Duke is the academic home of Dan Ariely. Data Colada exposed Ariely's fraudulence first, in 2021, where his contribution to the joint 2012 paper with Gino (and Mazar & 2 others) was shown to be fabricated.
Duke has done NOTHING public since then.
Anyone in the field of Behavioral Economics recognizes DC as purely inspired by a commitment to truth. They primarily post to clear up methodological confusions. Their impeccable civility is matched only by their crystal clarity (& dry humor).
I thought there were questions about the provenance of files being analyzed for the report. It seems a bit hasty to call it fraud.
I don’t understand why DC just relied on the HBS report and didn’t give her a chance to respond.
I thought Nelson reviewed one of the papers before it was published. Why didn’t he comment on the discrepancies then?