Data and Code

PrivaSeer Corpus (ACL, 2021)

The PrivaSeer corpus is a collection of 1,005,380 privacy policies, described in the following paper:

Mukund Srinath, Shomir Wilson and C. Lee Giles. Privacy at Scale: Introducing the PrivaSeer Corpus of Web Privacy Policies. In Proc. ACL 2021.

For technical questions about this data, please contact Mukund Srinath (mukund@psu.edu). For licensing questions, please contact Prof. Shomir Wilson (shomir@psu.edu).

The corpus is available under a CC BY-NC-SA license for research, teaching, and scholarship purposes. For commercial use, please contact us.

Link to the corpus: https://drive.google.com/drive/folders/1zJFy13tqeWscvad-xUfAYGTG1Lhv74Mb?usp=sharing