iTagPDF:
Towards Finally Automating PDF Accessibility

Peya Mowar, Aaron Steinfeld, Jeffrey Bigham

Carnegie Mellon University

ACM CHI 2026

🏆 Best Paper Award

Paper / Tool (Coming Soon)

Figure: Overview of iTagPDF's three outputs: content regions, reading order, and content-specific metadata. (a) A sample page with text blocks, authors, title, and an image, each labeled with its predicted content region. (b) The same page annotated with arrows and numbered labels indicating reading order. (c) Content-specific metadata, i.e., image alt text.

Intelligent PDF Tagging (iTagPDF) automatically produces three kinds of PDF accessibility metadata: (a) content regions and their locations (tags), (b) the reading order of those regions, and (c) content-specific metadata, such as structure information for tables and alt text for figures and formulas. iTagPDF is the first tool to combine all major tasks of making a PDF accessible, and it outperforms all prior approaches by jointly modeling the PDF's input source and its rendered pixels.
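To make the three kinds of metadata concrete, here is a minimal Python sketch of how one page's annotations could be represented. It is not iTagPDF's actual output format or API, which this page does not describe: the class names `Region` and `PageAnnotation`, the field names, the tag labels, and the bounding-box convention are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Region:
    """One content region on a page (hypothetical structure, not iTagPDF's format)."""
    tag: str                                  # (a) region type, e.g. "Title" or "Figure"
    bbox: Tuple[float, float, float, float]   # (a) location as (x0, y0, x1, y1); assumed convention
    order: int                                # (b) position in the reading order, starting at 0
    alt_text: Optional[str] = None            # (c) content-specific metadata, if any


@dataclass
class PageAnnotation:
    """All regions predicted for a single page."""
    regions: List[Region] = field(default_factory=list)

    def reading_order(self) -> List[Region]:
        # Return regions sorted by their reading-order index.
        return sorted(self.regions, key=lambda r: r.order)

    def missing_alt_text(self) -> List[Region]:
        # Figures and formulas need alt text to be accessible.
        return [r for r in self.regions
                if r.tag in {"Figure", "Formula"} and not r.alt_text]


# Illustrative data only; these values are made up, not tool output.
page = PageAnnotation(regions=[
    Region("Figure", (72.0, 300.0, 540.0, 520.0), order=2),
    Region("Title", (72.0, 700.0, 540.0, 740.0), order=0),
    Region("Text", (72.0, 540.0, 540.0, 690.0), order=1),
])

ordered_tags = [r.tag for r in page.reading_order()]
needs_alt = [r.tag for r in page.missing_alt_text()]
```

In this sketch the reading order is stored as an index on each region rather than as list position, so regions can be detected in any order and sorted afterwards. A real tagged PDF encodes the same information in its structure tree instead.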