University of Michigan Sells Recordings of Study Groups and Office Hours to Train AI

The data is being offered for tens of thousands of dollars to third parties. It is unclear if the speakers provided informed consent.
University of Michigan.
Image: Unsplash/Alex Mertz.

Update: After the publication of this piece, Colleen Mastony, the university spokesperson, said in a statement that the third-party vendor (which 404 Media identified as Catalyst Research Alliance) advertising the data has “been asked to halt their work.” Catalyst later took down the advertised University of Michigan data from its website. The statement added that no transactions were made by the vendor. The original article follows below, and the university’s statement is included at the end.

The University of Michigan is selling hours of audio recordings of study groups, office hours, lectures, and more to third parties for tens of thousands of dollars for the purpose of training large language models (LLMs). 404 Media has downloaded a sample of the data, which includes a one-hour-and-20-minute audio recording of what appears to be a lecture.

The news highlights how some LLMs may ultimately be trained on data with an unclear level of consent from the source subjects. The University of Michigan did not immediately respond to a request for comment, and neither did Catalyst Research Alliance, which is part of the sale process.