The New York Times and Google Cloud are digitizing five to seven million archived photos, creating an easily searchable database with cloud-based tools

Briefing

November 23, 2018

  • Cloud Photo Database – The New York Times is partnering with Google Cloud to digitize five to seven million old photos from its archive and make them searchable by reporters and editors
  • Hundreds of Cabinets – The photo catalog originally filled hundreds of file cabinets stored three stories below street level at The New York Times office
  • Stored Data – Data stored in the cloud includes high-resolution scans of the images plus metadata, such as text, handwriting, and other details captured from both the front and the back of each photo
  • Cloud Tools Used – Cloud Pub/Sub for managing the processing pipeline; ImageMagick for resizing photos; Cloud SQL for storing and tracking metadata; ExifTool, an open-source program, for modifying metadata; Cloud Vision API for detecting and reading text on the backs of images and for identifying objects, places, and images within photos; and Cloud Natural Language API, which can add further tags and information on top of the recognized text
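
The flow described in the bullets above (queue a scan, resize it, read the text on the back, store searchable metadata) can be sketched roughly as follows. This is a minimal illustration, not the pipeline the two organizations built: Python's `queue.Queue` stands in for Cloud Pub/Sub, an in-memory SQLite database stands in for Cloud SQL, and `fake_ocr` is a hypothetical placeholder for a text-detection service such as Cloud Vision. The names `ScanJob`, `resize_command`, `process`, and `search`, and the table layout, are all assumptions made for this example. The ImageMagick command is only built, never executed.

```python
import queue
import sqlite3
from dataclasses import dataclass


@dataclass
class ScanJob:
    """One archived photo to process (hypothetical message shape)."""
    photo_id: str
    front_path: str  # scan of the front of the photo
    back_path: str   # scan of the back, where captions and notes appear


def resize_command(src, dst, max_px=1024):
    """Build (but do not run) an ImageMagick command that fits src within max_px."""
    return ["convert", src, "-resize", f"{max_px}x{max_px}", dst]


def fake_ocr(path):
    """Placeholder for a real text-detection call; returns canned text."""
    return f"text read from {path}"


def process(jobs, db, ocr=fake_ocr):
    """Drain the job queue and store one metadata row per photo."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS photos "
        "(photo_id TEXT PRIMARY KEY, resize_cmd TEXT, back_text TEXT)"
    )
    done = 0
    while not jobs.empty():
        job = jobs.get()
        cmd = resize_command(job.front_path, f"thumb_{job.photo_id}.jpg")
        back_text = ocr(job.back_path)
        db.execute(
            "INSERT OR REPLACE INTO photos VALUES (?, ?, ?)",
            (job.photo_id, " ".join(cmd), back_text),
        )
        done += 1
    db.commit()
    return done


def search(db, term):
    """Return ids of photos whose back-of-photo text contains term."""
    rows = db.execute(
        "SELECT photo_id FROM photos WHERE back_text LIKE ? ORDER BY photo_id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

In a real deployment each stand-in would be replaced by the corresponding service, and the placeholder OCR function would be swapped for an actual text-detection call passed in through the `ocr` parameter.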

Accelerator

Business Model and Practices

Sector

Information Technology, Media and Entertainment

Organization

Google Inc., The New York Times

Source

Original Publication Date

November 9, 2018
