Background
The President John F. Kennedy Assassination Records Collection Act of 1992, commonly known as the JFK Records Act, was enacted by the United States Congress to ensure the expeditious disclosure of records related to the assassination of President John F. Kennedy. This legislation mandated the National Archives and Records Administration (NARA) to establish a comprehensive collection of all U.S. government records concerning the 1963 assassination. The collection is housed in the NARA Archives II building in College Park, Maryland.
The act was partly inspired by public interest and the narrative popularized by Oliver Stone's 1991 film "JFK," which suggested a conspiracy involving U.S. government agencies. The Assassination Records Review Board (ARRB), established by the act, was tasked with reviewing and releasing these records to the public. The legislation required that each assassination record be disclosed in full and made available in the collection no later than 25 years after the act's enactment, unless the President certified that continued postponement was necessary.
Records from the collection have been released in batches over the years, most notably in 2017 and the years that followed. These releases have continued to fuel public interest and scholarly research into the events surrounding President Kennedy's assassination. On January 23, 2025, President Trump signed an executive order to declassify more documents related to Kennedy's assassination, as well as others concerning the assassinations of his brother Robert F. Kennedy and Martin Luther King Jr.
The 2025 Files
In accordance with President Trump's executive order, the National Archives, in partnership with agencies across the federal government, released all previously classified records from the President John F. Kennedy Assassination Records Collection on March 18, 2025.
The release contains more than 60,000 pages of documents across 1,123 PDF files. These PDFs primarily include scanned documents, images, handwritten notes, and other materials. Many of the documents are difficult to read and parse. They are available for download on the National Archives website.
Using Co AI to Analyze the JFK Files
Parsing large document dumps like the JFK files has long been a daunting task. Researchers and historians face two problems: the sheer volume of material, and formats such as scans and handwritten notes that are difficult to analyze. Traditional methods rely on manual sorting and reading, which is time-consuming and prone to human error. Add the need for cross-referencing and contextual understanding, and it becomes nearly impossible to extract meaningful insights efficiently from such vast collections.
With the advent of Large Language Models (LLMs), we can now leverage AI to analyze these documents more effectively.
Selecting an LLM
Different LLMs have different strengths. Since we prioritized precision, we selected Gemini Flash 2, which features a 1M token context window with near-perfect retrieval (> 99% accuracy). To put this in perspective, 1M tokens is roughly equivalent to:
- 50,000 lines of code (with the standard 80 characters per line)
- All the text messages you have sent in the last 5 years
- 8 average-length English novels
- Transcripts of over 200 average-length podcast episodes
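The first comparison in the list above can be checked with back-of-the-envelope arithmetic. The sketch below assumes roughly 4 characters per token, which is a common rule of thumb for English text and code rather than an exact tokenizer count; the function name is illustrative.

```python
# Rule of thumb (an assumption, not an exact tokenizer count):
# about 4 characters of English text or code per token.
CHARS_PER_TOKEN = 4


def estimate_tokens(num_chars: int, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Estimate a token count from a character count."""
    return num_chars // chars_per_token


lines_of_code = 50_000
chars_per_line = 80
total_chars = lines_of_code * chars_per_line  # 4,000,000 characters

print(estimate_tokens(total_chars))  # 1000000
```

Real token counts depend on the model's tokenizer and on the text itself, so treat the result as an order-of-magnitude estimate.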
Because we wanted to fit all the documents into the model's context, we first programmatically downloaded all 1,123 PDFs. We then used Gemini Flash 2, which also excels at Optical Character Recognition (OCR), to convert the PDFs into markdown (text) files. There are other ways to do this (and for small files, using the PDFs directly works fine, too!).
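As a rough illustration of the download step, the sketch below collects PDF links from an index page's HTML using only the Python standard library. It is not the script we used: the class and function names, the sample HTML, and the example.org base URL are all hypothetical stand-ins.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PdfLinkParser(HTMLParser):
    """Collect the absolute URL of every <a> link that points to a .pdf file."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.pdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the page's own URL.
            self.pdf_urls.append(urljoin(self.base_url, href))


def extract_pdf_urls(html: str, base_url: str) -> list:
    """Return all PDF URLs linked from the given HTML, in page order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_urls


# Hypothetical snippet standing in for the real index page.
sample_html = '<a href="/files/doc-001.pdf">doc 1</a> <a href="/about">about</a>'
print(extract_pdf_urls(sample_html, "https://example.org/release/"))
```

Each URL returned this way could then be saved locally with `urllib.request.urlretrieve`, ideally with a short pause between requests to avoid hammering the server.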
Our text files totaled just under 2 million tokens, too much for a single 1M token context window, so we combined them into two files of about 900K tokens each. That left comfortable headroom in the context window of each of two bots.
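One simple way to do this kind of split is to pack files into batches in name order, starting a new batch whenever the next file would push the running total over a token budget. The sketch below shows that approach under stated assumptions: the function name, file names, and token counts are hypothetical, and it is not necessarily how our own split was produced.

```python
def split_into_batches(token_counts: dict, budget: int) -> list:
    """Pack files, in name order, into batches that each stay within budget."""
    batches = []
    current = []
    current_tokens = 0
    for name in sorted(token_counts):
        tokens = token_counts[name]
        if tokens > budget:
            raise ValueError(f"{name} alone exceeds the budget of {budget} tokens")
        # Start a new batch if adding this file would exceed the budget.
        if current and current_tokens + tokens > budget:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(name)
        current_tokens += tokens
    if current:
        batches.append(current)
    return batches


# Hypothetical per-file token estimates, for illustration only.
counts = {"a.md": 400_000, "b.md": 450_000, "c.md": 500_000, "d.md": 300_000}
print(split_into_batches(counts, budget=900_000))
# [['a.md', 'b.md'], ['c.md', 'd.md']]
```

Packing in name order keeps related documents together but does not guarantee the smallest possible number of batches. A budget below the full context window leaves room for the prompt and the model's reply.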
Setting up the Contextualized Bots on Co
A key advantage of Co AI is its ability to support collaborative document investigation through our splits and optimizations. The platform enables multiple AI bots (and people) to work together seamlessly.
We created two new Co AI bots and assigned one of the two collated text files to each.
Recognizing that other users might want to access these contextualized bots, we organized them into a collection that anyone can easily add to their own spaces.
Now we're ready to explore, analyze, and understand the documents.
Analyzing the Documents
From here, numerous possibilities emerge. We can use the bots to answer specific questions about the documents, summarize their contents, or combine their exploration with other analytical tasks.
For example, here's how you can use a bot to extract highlights from one of the document sets:
As demonstrated, Gemini Flash 2 enables the bot to identify interesting findings from the document set with precision and provide citations for further exploration.
To illustrate how you can leverage multiple bots or LLMs together, here's an example of setting up a channel and creating interactions to investigate points of interest:
First, we ask Sonnet 3.7, a highly capable LLM, to summarize the official assassination narrative. Then we direct the JFK new disclosures bot to investigate the document set for evidence related to these claims. This demonstrates just one of many complementary interaction patterns possible with this setup.
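The two-step pattern described above can be sketched in code independently of any particular platform. In the sketch below a "bot" is simply a function from a prompt to a reply; the stub bots, function names, and prompts are hypothetical and do not represent the Co AI interface or any real LLM API.

```python
from typing import Callable

# A "bot" here is any function that maps a prompt to a reply. In practice each
# one would wrap a call to an LLM; these are hypothetical stand-ins.
Bot = Callable[[str], str]


def investigate(summarizer: Bot, document_bot: Bot, topic: str) -> list:
    """Ask one bot for a list of claims, then ask a second bot about each claim."""
    summary = summarizer(f"List the key claims of {topic}, one per line.")
    claims = [line.strip() for line in summary.splitlines() if line.strip()]
    return [
        (claim, document_bot(f"What evidence in the documents relates to: {claim}"))
        for claim in claims
    ]


# Stub bots so the sketch runs without any API access.
def stub_summarizer(prompt: str) -> str:
    return "Claim A\nClaim B"


def stub_document_bot(prompt: str) -> str:
    return f"(reply to: {prompt})"


results = investigate(stub_summarizer, stub_document_bot, "the official narrative")
for claim, reply in results:
    print(claim, "->", reply)
```

Keeping the summarizing bot and the document bot separate means the claims come from a model that has not seen the documents, while the checks come from a bot that has them in context.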