Material Classification

A Case Study in AI Consulting

Material Classification - Doug Peterson

Executive Overview

This project originated through the PA ArCHER Grant program that I conceived. This Grant generated 23 new leads (a big win in our high-LTV vertical) while providing a low-stakes way to explore AI's business value.

We awarded the grant to the Smithsonian. The resulting project demonstrates how AI can address critical metadata challenges for cultural heritage institutions, delivering immediate value while establishing reusable methodologies for future engagements.

As Head of R&D and Product Manager at Digital Transitions, I led both the Grant and the resulting AI project, with the following key results:

  • Improved Collection Access: Enhanced access to 112,000 digitized items that were previously "dark" due to lack of metadata

  • Workflow Efficiency: Reduced processing time from years to weeks, saving over 3,600 staff hours

  • Cost Reduction: Decreased post-digitization costs by more than $130,000

  • Business Compliance: Ensured 99.5% accuracy in identifying sensitive contract documents containing royalty agreements, on par with standard human review

  • Scalable Methodology: Established a repeatable process for applying AI to similar collections

By offering a grant program to fund AI projects, we generated 23 new leads (an excellent result since our vertical has a high LTV for leads).

It was also a low-stakes way to familiarize ourselves with the potential business value of AI.

Our Approach

  • Collection Analysis: Assessed needs and characteristics of the Smithsonian’s Asch Collection

  • AI Strategy: Identified material type categorization as the optimal application of AI

  • Custom Model Development: Trained specialized algorithms for seven material types

  • Risk Management: Created enhanced protocols for sensitive content detection

Client Challenge

The Smithsonian's Ralph Rinzler Folklife Archives faced critical issues with their Moses and Frances Asch Collection:

  • 112,000 digitized images with two-thirds lacking descriptive metadata ("dark" content)

  • Manual processing estimated at 3,600 hours costing $131,863, plus opportunity cost

  • Need to keep sensitive business documents containing royalty agreements private
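
As a quick sanity check on the manual-processing estimate above, the two quoted figures imply a loaded hourly rate. The sketch below only divides the numbers stated in this case study; the function name is hypothetical, and no breakdown of the estimate (salary, overhead, and so on) is given in the source.

```python
# Figures quoted in the case study.
ESTIMATED_HOURS = 3_600
ESTIMATED_COST_USD = 131_863


def implied_hourly_rate(cost_usd: float, hours: float) -> float:
    """Return the hourly rate implied by a total cost and an hour count."""
    return cost_usd / hours


rate = implied_hourly_rate(ESTIMATED_COST_USD, ESTIMATED_HOURS)
print(f"Implied rate: ${rate:.2f} per hour")  # Implied rate: $36.63 per hour
```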

Solution Development

  • Assessment: Determined AI could automate material type categorization

  • Data Preparation: Selected seven priority categories and refined training data

  • Model Development: Trained ResNet on our refined training data

  • Implementation: Applied asymmetric error handling to prioritize sensitive content protection
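
One common way to implement asymmetric error handling is to let the sensitive class win whenever its predicted probability clears a deliberately low threshold, so that a missed contract is much less likely than a false alarm. The sketch below illustrates that idea only; the label names, the threshold value, and the function name are assumptions for illustration, not details taken from the project.

```python
from typing import Dict

# Hypothetical values: the source does not state the label names or threshold.
SENSITIVE_LABEL = "contract"
SENSITIVE_THRESHOLD = 0.10  # deliberately low, to favor flagging over missing


def classify_with_asymmetric_rule(
    probs: Dict[str, float],
    sensitive_label: str = SENSITIVE_LABEL,
    threshold: float = SENSITIVE_THRESHOLD,
) -> str:
    """Return the sensitive label if its probability clears the threshold;
    otherwise return the most probable label."""
    if probs.get(sensitive_label, 0.0) >= threshold:
        return sensitive_label
    return max(probs, key=probs.get)


# The top class is "correspondence", but the contract score is high enough
# that the item is still routed to the protected category.
likely_letter = {"correspondence": 0.70, "contract": 0.20, "handwritten note": 0.10}
print(classify_with_asymmetric_rule(likely_letter))  # contract

# With a negligible contract score, the ordinary top-1 label is used.
clear_letter = {"correspondence": 0.95, "contract": 0.03, "handwritten note": 0.02}
print(classify_with_asymmetric_rule(clear_letter))  # correspondence
```

The trade-off is intentional: a low threshold sends more non-sensitive items to human review, but makes it less likely that a document containing a royalty agreement is released by mistake.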

One challenge of this project is that the distinction between archival material categories like “Handwritten Note” and “Correspondence” is sometimes fuzzy.

Key Results

  • 95% overall classification accuracy across seven material types

  • 99.5% accuracy identifying contracts containing sensitive information

  • Similar accuracy to human processing

  • $130,000+ cost reduction compared to manual processing

  • Processing timeline reduced from months to weeks

Value Delivered

  • Material Classification: Enabled filtering by document type across the collection

  • Content Protection: Secured sensitive business documents with specialized detection

  • Operational Efficiency: Saved 3,600+ staff hours and redirected talent to higher-value tasks

  • Methodology Transfer: Established framework for applying AI to additional collections

A Confusion Matrix on validation data showing high accuracy and generally sensible conflations.
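
For readers unfamiliar with the term, a confusion matrix counts how often each true category was predicted as each category, which makes both overall accuracy and per-category recall easy to read off. The sketch below uses a tiny invented data set purely for illustration; it is not the project's validation data, and the helper names are hypothetical.

```python
from collections import Counter
from typing import List


def confusion_counts(true_labels: List[str], predicted_labels: List[str]) -> Counter:
    """Count (true label, predicted label) pairs."""
    return Counter(zip(true_labels, predicted_labels))


def overall_accuracy(counts: Counter) -> float:
    """Fraction of items whose predicted label matches the true label."""
    total = sum(counts.values())
    correct = sum(n for (true, pred), n in counts.items() if true == pred)
    return correct / total


def recall_for(counts: Counter, label: str) -> float:
    """Fraction of items with the given true label that were predicted as such."""
    actual = sum(n for (true, _), n in counts.items() if true == label)
    return counts[(label, label)] / actual if actual else 0.0


# Invented toy data: one handwritten note is mistaken for correspondence.
y_true = ["contract", "contract", "correspondence",
          "handwritten note", "correspondence", "handwritten note"]
y_pred = ["contract", "contract", "correspondence",
          "correspondence", "correspondence", "handwritten note"]

counts = confusion_counts(y_true, y_pred)
print(f"Overall accuracy: {overall_accuracy(counts):.3f}")
print(f"Contract recall: {recall_for(counts, 'contract'):.3f}")
print(f"Handwritten note recall: {recall_for(counts, 'handwritten note'):.3f}")
```

In this toy example the single error is a "sensible conflation" of the kind the caption describes: two categories whose boundary is fuzzy get mixed up, while the sensitive category is recovered in full.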

Client Outcome

The project transformed the Asch Collection from partially accessible to fully searchable, with Smithsonian Folkways gaining efficient access to business-critical documents while protecting sensitive information.

Success Factors

  • Targeted a specific, high-value challenge with measurable ROI

  • Tailored implementation to unique cultural heritage requirements

  • Balanced accuracy needs with practical considerations for different content types

  • Created foundation for expanded AI applications beyond initial use case
