Organize and Index Your Screenshots (OCR) on macOS


I maintain a growing Screenshots folder. Screenshots contain text, and that text should be searchable: finding a screenshot later is the whole point of taking it.

My folder is stored in Dropbox, which, at the time of writing, does not index images on the “Plus” plan. OneDrive currently has this feature suspended for personal accounts because of technical issues. Depending on a cloud service to search your screenshots sucks anyway, and I don’t want the lock-in of online services.

Here’s how to organize and index your screenshots locally, using open source software, and save some money.

1. Install: Tesseract OCR Engine

Tesseract OCR is a neat OSS project, available in Homebrew:

brew install tesseract

Given a PNG image file, you can convert it into a searchable PDF (the image plus the recognized text) with:

tesseract /input/path/to/file.png /output/path/to/file -l eng pdf

This processes /input/path/to/file.png and creates /output/path/to/file.pdf (Tesseract appends the .pdf extension itself). The PDF carries the recognized text, so it can be indexed by macOS’s Spotlight or by Dropbox.
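The same conversion can be driven from Ruby, which is roughly what the sync script later in this post does. A minimal sketch, assuming tesseract is on the PATH; the helper names tesseract_command and ocr_to_pdf are made up for illustration:

```ruby
# Build the argument list for one Tesseract run.
# `output_base` is the output path WITHOUT an extension; Tesseract appends ".pdf".
def tesseract_command(input, output_base, lang: "eng")
  ["tesseract", input, output_base, "-l", lang, "pdf"]
end

# Run the conversion. Passing the arguments as a list bypasses the shell,
# so spaces in screenshot names need no quoting. Returns true on success.
def ocr_to_pdf(input, output_base, lang: "eng")
  system(*tesseract_command(input, output_base, lang: lang)) == true
end
```

Calling ocr_to_pdf("/input/path/to/file.png", "/output/path/to/file") should then produce the same PDF as the command above.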

2. Create folder structure

In Dropbox I have the following folders:

  • Screenshots
    • Processing
    • Raw
    • OCR

New screenshots go in Processing. The indexable PDF files generated by Tesseract go in OCR. After processing, the raw files are moved to Raw, possibly renamed so that they sort by timestamp.
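If you would rather not create the folders by hand, something like the following should do. It is only a sketch: create_screenshot_dirs is a hypothetical helper, and the root path in the usage line assumes the Dropbox layout shown above.

```ruby
require "fileutils"

# Create Processing, Raw and OCR under `root` (existing folders are left alone)
# and return their absolute paths keyed by folder name.
def create_screenshot_dirs(root)
  base = File.expand_path(root)
  %w[Processing Raw OCR].map do |name|
    path = File.join(base, name)
    FileUtils.mkdir_p(path)
    [name, path]
  end.to_h
end

# Usage, assuming the layout from this post:
#   create_screenshot_dirs("~/Dropbox/Screenshots")
```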

3. Change the default screenshots folder

Tell macOS to save screenshots directly in your Processing folder:

defaults write com.apple.screencapture location ~/Dropbox/Screenshots/Processing

4. Add the synchronization script

I’ve built my script with Ruby. In case you don’t have Ruby installed:

brew install ruby

Create the following script somewhere on your PATH and make it executable. I prefer ~/bin/screenshots-sync:

#!/usr/bin/env ruby

require 'optparse'
require 'pp'

USAGE = <<ENDUSAGE
Usage:
  screenshots-sync [-h] -i <path> -o <path> -r <path> [-f <filter>] [-v]
ENDUSAGE

options = {}
OptionParser.new do |opts|
  opts.banner = USAGE

  opts.on("-i", "--input-dir INPUT_DIR", "Path to the input directory.") {|v|
    raise OptionParser::InvalidArgument unless File.directory?(v)
    options[:processingDir] = File.expand_path(v)
  }
  opts.on("-o", "--output-ocr-dir OUTPUT_OCR_DIR", "Path to the output directory for OCR-ed PDF files.") {|v|
    raise OptionParser::InvalidArgument unless File.directory?(v)
    options[:ocrDir] = File.expand_path(v)
  }
  opts.on("-r", "--output-raw-dir OUTPUT_RAW_DIR", "Path to the output directory for the raw image files.") {|v|
    raise OptionParser::InvalidArgument unless File.directory?(v)
    options[:rawDir] = File.expand_path(v)
  }
  opts.on("-f", "--filter FILTER", "File name filter (LIST), defaults to *.jpg, *.jpeg, *.png") {|v|
    options[:filter] ||= []
    options[:filter].push(v)
  }
  opts.on("-v", "--[no-]verbose", "Run verbosely") {|v|
    options[:verbose] = v
  }
end.parse!

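# `which` output ends with a newline; strip it, or the checks below always fail.
options[:tesseract] = `which tesseract`.strip
# launchd runs with a minimal PATH, so fall back to the default Homebrew paths.
fallbacks = ["/opt/homebrew/bin/tesseract", "/usr/local/bin/tesseract"]
options[:tesseract] = fallbacks.find { |p| File.executable?(p) } unless File.executable?(options[:tesseract])
raise "Missing tesseract executable from PATH" unless options[:tesseract]

options[:filter] ||= ["*.png", "*.jpeg", "*.jpg"]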

raise OptionParser::MissingArgument.new("--input-dir") if options[:processingDir].nil?
raise OptionParser::MissingArgument.new("--output-ocr-dir") if options[:ocrDir].nil?
raise OptionParser::MissingArgument.new("--output-raw-dir") if options[:rawDir].nil?

if options[:verbose]
  puts "\nRunning with options:\n\n"
  pp options
  puts
end

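# Run a shell command, hiding its output unless --verbose is set.
# Aborts the whole run if the command fails.
def execute(cmd, options)
  puts cmd if options[:verbose]
  out = options[:verbose] ? "" : "1>/dev/null 2>&1"
  # system returns true on success, false or nil otherwise
  unless system("#{cmd} #{out}")
    $stderr.puts "ERROR: command failed (exit status #{$?&.exitstatus}):\n  #{cmd}"
    exit 1
  end
end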

options[:filter].each do |filter|
  Dir["#{options[:processingDir]}/#{filter}"].each do |f|
    # Filename format generated by macOS
    if f =~ /Screenshot\s+(\d{4}-\d{2}-\d{2})\s+at\s+(\d{2}\.\d{2}\.\d{2})/
      fname = "Screenshot #{$1} #{$2}#{File.extname(f)}"
    # Filename format generated by my Galaxy Tab (Android)
    elsif f =~ /Screenshot[_\s-]+(\d{4})(\d{2})(\d{2})[_\s-]+(\d{2})(\d{2})(\d{2})(?:[_\s-]+([^.]*))?/
      details = if $7 then " #{$7}" else "" end
      fname = "Screenshot #{$1}-#{$2}-#{$3} #{$4}.#{$5}.#{$6}#{details}#{File.extname(f)}"
    else
      fname = File.basename(f)
    end

    raw_output = File.join(options[:rawDir], fname)
    ocr_output = File.join(options[:ocrDir], fname)
    source = File.expand_path(f)

    if source != raw_output
      execute("mv \"#{source}\" \"#{raw_output}\"", options)
    end

    execute("#{options[:tesseract]} \"#{raw_output}\" \"#{ocr_output}\" -l eng pdf", options)
  end
end

Don’t forget to make it executable:

chmod +x ~/bin/screenshots-sync
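The renaming rules are the least obvious part of the script, so here they are pulled out into a function you can try in irb. This is a sketch rather than a drop-in replacement: normalized_name is a hypothetical name, it matches against the base name instead of the full path, and the sample file names in the comments are illustrative.

```ruby
# Return the normalized file name for a screenshot path.
# Names that match neither pattern are returned unchanged.
def normalized_name(path)
  name = File.basename(path)
  ext = File.extname(name)

  # macOS style, e.g. "Screenshot 2019-03-05 at 14.22.31.png"
  if (m = name.match(/Screenshot\s+(\d{4}-\d{2}-\d{2})\s+at\s+(\d{2}\.\d{2}\.\d{2})/))
    "Screenshot #{m[1]} #{m[2]}#{ext}"
  # Android style, e.g. "Screenshot_20190305-142231_Chrome.jpg"
  elsif (m = name.match(/Screenshot[_\s-]+(\d{4})(\d{2})(\d{2})[_\s-]+(\d{2})(\d{2})(\d{2})(?:[_\s-]+([^.]*))?/))
    details = m[7] ? " #{m[7]}" : ""
    "Screenshot #{m[1]}-#{m[2]}-#{m[3]} #{m[4]}.#{m[5]}.#{m[6]}#{details}#{ext}"
  else
    name
  end
end
```

Both formats end up as "Screenshot YYYY-MM-DD HH.MM.SS", so files from different devices sort together by timestamp.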

Note that the script can also be run standalone, for importing existing screenshot archives:

$ screenshots-sync -h

Usage:
  screenshots-sync [-h] -i <path> -o <path> -r <path> [-f <filter>] [-v]
    -i, --input-dir INPUT_DIR        Path to the input directory.
    -o OUTPUT_OCR_DIR,               Path to the output directory for OCR-ed PDF files.
        --output-ocr-dir
    -r OUTPUT_RAW_DIR,               Path to the output directory for the raw image files.
        --output-raw-dir
    -f, --filter FILTER              File name filter (LIST), defaults to *.jpg, *.jpeg, *.png
    -v, --[no-]verbose               Run verbosely
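Before importing a large archive, it may help to preview which files a given set of filters would pick up. A small sketch that mirrors the script's Dir lookup; matching_files is a hypothetical helper, not part of the script:

```ruby
# List the files in `dir` that match any of the glob `filters`,
# using the same default filters as the script.
def matching_files(dir, filters = ["*.png", "*.jpeg", "*.jpg"])
  filters.flat_map { |filter| Dir.glob(File.join(dir, filter)) }.uniq.sort
end

# Usage, with an archive path of your own:
#   puts matching_files(File.expand_path("~/Pictures/Old Screenshots"))
```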

5. Load a Launch Agent

A launch agent can run the synchronization script whenever new files appear in Processing, and installing one is easy.

Create the ~/Library/LaunchAgents/my.sync.ocr.plist file:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>my.screenshot.ocr</string>
    <key>ProgramArguments</key>
    <array>
        <!-- TODO: replace alex with your username -->
        <string>/Users/alex/bin/screenshots-sync</string>
        <string>--input-dir</string>
        <!-- TODO: replace alex with your username -->
        <string>/Users/alex/Dropbox/Screenshots/Processing</string>
        <string>--output-ocr-dir</string>
        <!-- TODO: replace alex with your username -->
        <string>/Users/alex/Dropbox/Screenshots/OCR</string>
        <string>--output-raw-dir</string>
        <!-- TODO: replace alex with your username -->
        <string>/Users/alex/Dropbox/Screenshots/Raw</string>
    </array>
    <key>WatchPaths</key>
    <array>
        <!-- TODO: replace alex with your username -->
        <string>/Users/alex/Dropbox/Screenshots/Processing</string>
    </array>
    <key>ThrottleInterval</key>
    <integer>5</integer>
    <key>StandardOutPath</key>
    <!-- TODO: replace alex with your username -->
    <string>/Users/alex/Library/Logs/screenshots-sync.log</string>
    <key>StandardErrorPath</key>
    <!-- TODO: replace alex with your username -->
    <string>/Users/alex/Library/Logs/screenshots-sync.log</string>
</dict>
</plist>

Then install it:

launchctl load -w ~/Library/LaunchAgents/my.sync.ocr.plist
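Since every path in the plist contains the username, you could also generate the file instead of editing it by hand. The following is a sketch under the assumptions of this post (script in ~/bin, folders under ~/Dropbox/Screenshots, log in ~/Library/Logs); launch_agent_plist and xml_text are hypothetical helpers, and the default label is the one from the plist above.

```ruby
XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Escape the characters that are not allowed in XML text.
def xml_text(value)
  value.to_s.gsub(/[&<>]/, XML_ESCAPES)
end

# Render the launch agent plist for the given home directory.
def launch_agent_plist(home, label: "my.screenshot.ocr")
  shots = File.join(home, "Dropbox", "Screenshots")
  watch = xml_text(File.join(shots, "Processing"))
  log = xml_text(File.join(home, "Library", "Logs", "screenshots-sync.log"))
  args = [
    File.join(home, "bin", "screenshots-sync"),
    "--input-dir", File.join(shots, "Processing"),
    "--output-ocr-dir", File.join(shots, "OCR"),
    "--output-raw-dir", File.join(shots, "Raw")
  ]
  arg_lines = args.map { |a| "        <string>#{xml_text(a)}</string>" }.join("\n")

  <<~PLIST
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
        <key>Label</key>
        <string>#{xml_text(label)}</string>
        <key>ProgramArguments</key>
        <array>
    #{arg_lines}
        </array>
        <key>WatchPaths</key>
        <array>
            <string>#{watch}</string>
        </array>
        <key>ThrottleInterval</key>
        <integer>5</integer>
        <key>StandardOutPath</key>
        <string>#{log}</string>
        <key>StandardErrorPath</key>
        <string>#{log}</string>
    </dict>
    </plist>
  PLIST
end
```

To write the file, something like File.write(File.expand_path("~/Library/LaunchAgents/my.sync.ocr.plist"), launch_agent_plist(Dir.home)) should work, followed by the launchctl command above.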

Enjoy

Your screenshots will now be indexed by Spotlight on macOS, and searchable from Spotlight and Finder.

They’ll also get indexed by Dropbox, because the Plus plan supports searching in PDF files.

Tags: Cloud | macOS | Ruby