Exploring a Books Data Commons for AI Training

What role do books play in training AI models, and how might digitized books be made widely accessible for AI training? What dataset of books could be constructed, and under what circumstances? A new paper investigates the concept of a responsibly designed, broadly accessible dataset of digitized books for use in training AI models.