The New York Times and Google Cloud are digitizing five to seven million archived photos in the cloud, creating an easily searchable database with cloud-based tools

November 23, 2018

  • Cloud Photo Database – The New York Times is partnering with Google Cloud to digitize five to seven million old photos from its archive and make them searchable by reporters and editors
  • Hundreds of Cabinets – The photo catalog originally filled up to hundreds of file cabinets, stored three stories below street level in The New York Times office
  • Stored Data – Data stored in the cloud includes high-resolution scans of the images plus metadata, such as text, handwriting, and other details from the front of each image as well as from the back
  • Cloud Tools Used – The tools include Cloud Pub/Sub for managing the processing pipeline, ImageMagick for resizing photos, Cloud SQL for storing and tracking metadata, the open-source ExifTool program for modifying metadata, the Cloud Vision API for recognizing and reading text on the back of images and for identifying objects, places, and other visual content in the photos, and the Cloud Natural Language API, which can add tags and further information on top of the recognized text
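
The flow described in the bullets above (queue a photo, read the text on its back, store the result as searchable metadata) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual New York Times pipeline: a Python deque stands in for Cloud Pub/Sub, an in-memory SQLite database stands in for Cloud SQL, and the hypothetical ocr_back_of_photo function returns canned text where a real system would call an OCR service such as the Cloud Vision API. All function, table, and column names here are invented for the example.

```python
import sqlite3
from collections import deque


def ocr_back_of_photo(photo_id, fake_ocr):
    """Stub OCR step (hypothetical): look up canned text instead of calling a real API."""
    return fake_ocr.get(photo_id, "")


def run_pipeline(photo_ids, fake_ocr):
    """Process every queued photo and store its recognized text as metadata."""
    # In-memory SQLite is an assumed stand-in for the Cloud SQL metadata store.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE photos (id TEXT PRIMARY KEY, back_text TEXT)")

    # A deque is an assumed stand-in for the message queue (Cloud Pub/Sub).
    queue = deque(photo_ids)
    while queue:
        photo_id = queue.popleft()  # a worker pulls one message at a time
        text = ocr_back_of_photo(photo_id, fake_ocr)
        db.execute("INSERT INTO photos VALUES (?, ?)", (photo_id, text))
    db.commit()
    return db


def search(db, term):
    """Return the ids of photos whose recognized text contains the term."""
    rows = db.execute(
        "SELECT id FROM photos WHERE back_text LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]
```

Calling run_pipeline with a list of photo ids and a dictionary of canned text, then search with a keyword, returns the ids of the matching photos. In a real deployment each stand-in would be replaced by the corresponding managed service.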

The New York Times file cabinets

Image Source: Google Cloud Blog

Sectors: Information Technology, Media and Entertainment
Organizations: The New York Times, Google Inc.
