Remember the name of Chinese researcher “Kangpeng Xiao,” because the National Institutes of Health really, really want you to forget him. Last week, the NIH filed a motion in a Virginia court to seal portions of documents that reference the Chinese researcher and an NIH official in a lawsuit filed against the agency for redacting and covering up records that might explain how the pandemic began.
“[T]he individuals have a substantial privacy interest in avoiding harassment or media scrutiny that would likely follow disclosure,” wrote a lawyer for the NIH to the judge. “Sealing is therefore necessary to protect this information from any further public dissemination.”
But what is actually being protected? The American public’s right to access public information that may reveal what kicked off the pandemic, or the purported privacy rights of a scientist who lives thousands of miles away in China? This legal ploy further highlights the NIH’s aggressive, haphazard approach to redacting documents and hiding information that might explain how the pandemic started.
Last summer Buzzfeed released an investigation of the NIH’s Anthony Fauci and reported that the documents the agency released were “just a portion of what was requested, and they are filled with redactions, making them an incomplete record of the time period and Fauci’s correspondence.” Meanwhile, the Intercept reported in February that the NIH continues to withhold critical documents that could shed light on how the epidemic began, noting that the agency sent them 292 pages of fully redacted records.
A week after The Intercept story, The Chief Records Officer for the U.S. Government sent the NIH a letter asking them to investigate allegations that agency personnel are shredding documents related to grant-making decisions and funding for research in China.
Kangpeng Xiao’s name became public in December 2020, when the nonprofit U.S. Right to Know published a report on revisions to coronavirus sequences that Chinese researchers had added to the NIH’s Sequence Read Archive. These datasets involved key studies that virologists were using at the time to promote the now discredited theory that the COVID-19 virus may have passed from pangolins to humans.
“These revisions are odd because they occurred after publication, and without any rationale, explanation or validation,” wrote Sainath Suryanarayanan, in the December 2020 report for U.S. Right to Know. The nonprofit based their report on NIH documents they received from a FOIA request.
According to these documents, several Chinese scientists asked the NIH to alter coronavirus sequences stored on the NIH database, with many of these requests coming from Kangpeng Xiao with the South China Agricultural University. In one case, Xiao asked NIH official Rick Lapoint in March 2020 to delete some coronavirus sequences.
A few months later, Xiao published a prominent paper on May 7, 2020, in the journal Nature that argued a coronavirus discovered in pangolins was closely related to COVID-19. But as U.S. Right to Know discovered, Xiao’s request to delete coronavirus sequences from the NIH’s Sequence Read Archive (SRA) was just one of many changes.
Xiao et al. made numerous changes to their SRA data, including the deletion of two datasets on March 10, the addition of a new dataset on June 19, a November 8 replacement of data first released on October 30, and a further data change on November 13 — two days after Nature added an Editor’s “note of concern” about the study.
Eventually, Nature’s “note of concern” attached to Xiao’s 2020 study changed to a very lengthy correction in late 2021 that explained that data “were mislabelled and attributed incorrectly.” That correction also thanked Alina Chan of Harvard and the Broad Institute “for bringing the errors to our attention.” Chan later tweeted that Nature refused to publish her analysis of the viruses and merely folded her study into their correction.
The next time a mysterious outbreak occurs, will we see a repeat performance of inaccurately written papers published in top journals and the dismissal of independent analyses demonstrating that these studies are not reproducible/accurate based on the available data?
This U.S. Right to Know report was largely ignored, but last summer, coronavirus sequences at the NIH SRA became national news when Jesse Bloom, a virologist at the Fred Hutchinson Cancer Center, published a preprint on NIH sequences that Chinese researchers had deleted. As Bloom explained in an email to NIH leadership at the time, “[T]his can be a good opportunity for the NIH to take the lead by using its remarkable data archives to make progress in resolving some of the important questions about the virus’s origins.”
The NIH would not disclose to reporters the names of Chinese researchers who requested sequence deletions, but the New York Times later identified one of the scientists as Ben Hu at Wuhan University.
Empower Oversight referenced Kangpeng Xiao’s identity in federal court recently on July 11, when the nonprofit charged that the NIH was improperly redacting documents when responding to their FOIA requests. In one case, the NIH had provided Empower Oversight with an October 12, 2021, email from Jesse Bloom to the NIH discussing a scientist’s request to delete NIH SRA sequences. Citing Exemption 6, which covers privacy concerns, the NIH redacted both the names of the requestor and the NIH official that Bloom cited in his email
In an 18-page declaration to the court, NIH FOIA Officer Gorka Garcia-Malene detailed how the NIH redacts documents in compliance with the law. In the case of Exempt 6 privacy concerns, Garcia-Malene declared:
Exemption 6 mandates the withholding of information that if disclosed “would constitute a clearly unwarranted invasion of personal privacy.” 5 U.S.C. § 552(b)(6). Exemption 6 was applied here due to the heightened public scrutiny with anything remotely related to COVID-19.
Mr. Garcia-Malene also claimed that information had be redacted “because of the amount of misinformation surrounding the pandemic and its origins.” Seriously, the NIH is now arguing in court that because there is so much misinformation about how the pandemic began, they can’t release facts that might clear up misinformation about how the pandemic began.
Garcia-Malene also asserted, “If released, this type of information could be used by the public to send threatening and harassing messages.”
However, in a second production of records—and going through the exact same process of complying with the law when redacting documents as explained to the court by Garcia-Malene—the NIH provided the exact same email to Empower Oversight, but did not redact the names.
In their court filing, Empower Oversight highlighted how the NIH is now citing the fear of “threatening and harassing messages” to redact documents in a haphazard manner, telling the judge, “NIH used this approach in instances where there was no legitimate threat of such harassment. NIH repeatedly redacted the official NIH email address of Dr. Francis Collins, even though he had retired months before NIH’s May 13th production of responsive records.”
Four days later, on July 15, the NIH’s lawyer with the Department of Justice cited Empower Oversight’s legal filing and asked the judge to seal portions that named Chinese researcher Kangpeng Xiao and NIH official Rick Lapoint. In a footnote, the Justice Department noted, “NIH is reviewing whether other materials leave unredacted the two names at issue.”
Other materials that leave these names unredacted include dozens of pages of documents the NIH released in late 2020 to U.S. Right to Know—around two years ago! I emailed Justice Department attorney Meghan Loftus, who is handling this case for the NIH, and asked her if the NIH’s haphazard redaction process undermines the sworn declaration by Mr. Garcia-Malene, the NIH’s FOIA Officer. I also asked her if this slapdash redaction process calls into question other redactions by the NIH.
“As the case you are referencing is still being litigated, our office is unable to comment on it at this time,” a press officer with the Department of Justice emailed me.
“This is messed up as so many levels,” says U.S. Right to Know’s Sainath Suryanarayanan. “At one level they are having a knee jerk reaction—cover up everything. Second, they are incompetent and their efforts to cover up are inconsistent.”
But Suryanarayanan points out that the NIH has been dishonest from the pandemic’s beginning, noting that the NIH’s Anthony Fauci was caught misleading Congress about funding for gain-of-function research at the Wuhan Institute of Virology. “There might be something deeper they want to hide,” he says. “We just don’t know, but it all reeks.”
ADDENDUM: In the October 12, 2021, email that the NIH now wants to censor, virologist Jesse Bloom asked the NIH’s Steve Sherry about coronavirus sequences that Kangpeng Xiao requested Rick Lapoint to remove from the NIH SRA database. Bloom pointed out that the virus sequences disappeared but then reappeared over a year later around June 16, 2021.
NIH rules state that sequences can only be restored at the submitter’s request, yet there is no evidence that Kangpeng Xiao asked the NIH to restore the sequences. Bloom then asked several questions to understand why the sequences were deleted and then restored.
The NIH continues to provide conflicting reasons for why coronavirus sequences are deleted and restored to their SRA database. Yet, this information may help us understand how the pandemic started. Over the weekend, I copied Bloom’s questions and sent them to the NIH, asking for a response by Monday:
Did Kangpeng Xiao request that the sequences be restored to the SRA? If yes, on what date was this request made?
Did the NIH restore these sequences without a request from Kangpeng Xioa? If yes, what was the rationale?
Despite repeated requests for the NIH to comment, I received no response.