Using OpenAI to filter cold outreach emails

AI is dominating the world of technology right now, and you're seeing little AI helpers pop up all over the place.

They're pretty good at reasoning about basic things, interpreting and modifying language, and brainstorming.

I'm already seeing the results of this in my inbox in the form of cold outreach emails. This technology makes it easier than ever for people to send emails that are decent enough to avoid spam detection while remaining just as useless from a practical standpoint.

Enter: Google Apps Script

Google has a scripting tool called Google Apps Script that sits on top of many of its services.

Here's an overview of how I use it:

Every few minutes, a Google Apps Script runs and combs through my emails.
If a series of criteria are met, the email is processed by the OpenAI API to determine whether or not it's cold outreach.
If it is, I toss a label on it and throw it into a split in Superhuman.
I give these marked emails a cursory review once per day; it generally results in me selecting all the messages and dumping them in the trash.
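
The flow above can be sketched as a small pure function. This is only an illustration of the decision order, not part of the script in the setup section: `triage` and the `checks` object with its field names are hypothetical stand-ins for the Gmail and OpenAI calls.

```javascript
// Hypothetical sketch of the triage flow. `checks` bundles stand-in
// predicates; none of these names come from Gmail or OpenAI.
function triage(thread, checks) {
  // Allow-listed addresses and known senders never reach the model.
  if (checks.isTrusted(thread)) return 'skip'
  // Threads labeled on an earlier run are not paid for twice.
  if (checks.alreadyProcessed(thread)) return 'skip'
  // Only what is left is sent to the model for a verdict.
  return checks.looksLikeColdOutreach(thread) ? 'cold-outreach' : 'processed'
}
```

Ordering the cheap checks first means the paid API call only happens for mail from unfamiliar senders.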

How to set it up

If you have a massive inbox full of unread emails, processing all of them through the OpenAI API could be expensive. Set appropriate API spending limits on your OpenAI developer dashboard, or modify the original search query used in the main function below.
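
One way to keep costs down is to narrow the search so that only recent mail is considered. The sketch below builds such a query. `buildSearchQuery` and its `maxAgeDays` option are hypothetical names introduced here, while `newer_than:` is a standard Gmail search operator.

```javascript
// Hypothetical helper: builds a Gmail search query limited to recent unread mail.
function buildSearchQuery({ maxAgeDays } = {}) {
  const terms = ['in:inbox', 'is:unread']
  // newer_than:Nd restricts results to the last N days.
  if (Number.isInteger(maxAgeDays) && maxAgeDays > 0) {
    terms.push(`newer_than:${maxAgeDays}d`)
  }
  return terms.join(' ')
}

// buildSearchQuery({ maxAgeDays: 2 }) returns 'in:inbox is:unread newer_than:2d'
```

The result would replace the literal query string passed to GmailApp.search in the main function. GmailApp.search also accepts start and max arguments, for example GmailApp.search(query, 0, 20), which caps how many threads a single run handles.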

Head to your Google Apps Script dashboard. Make sure you're logged into the Google account with the email inbox you want to scan.
Make a new project.

Paste in the following script, reviewing the setUserVariables() function to replace things like your OpenAI API key:

const userProperties = PropertiesService.getUserProperties()

function setUserVariables() {
  // Ensure that your account has access to the model you're trying to use.
  userProperties.setProperty('openAiApiKey', 'sk-your-key')
  userProperties.setProperty('openAiModel', 'gpt-3.5-turbo-16k')

  // We can use a regular expression to check for participants that should always flag a conversation as valid
  // Put your own personal email here, or a wildcard matching your company's domain like this:
  // The dot is escaped so it only matches a literal "." in the domain.
  userProperties.setProperty('allowListRegex', '.*@yourdomain\\.com[^.]*')

  // If you're using a nested label, separate the sub-labels with a slash like 'AI/Processed'
  userProperties.setProperty('processedLabel', 'AI/Processed')
  userProperties.setProperty('outreachLabel', 'AI/ColdOutreach')
}

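function haveReceivedFromSenderBefore(thread) {
  // getFrom() can return 'Name <address>', so search on the bare address only.
  const from = thread.getMessages()[0].getFrom()
  const match = from.match(/<([^>]+)>/)
  const address = match ? match[1] : from.trim()
  // The current thread is one hit, so a second hit means prior contact.
  // Two results are enough to decide, so cap the search there.
  const threads = GmailApp.search(`from:${address}`, 0, 2)
  return threads.length > 1
}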

function isAllowListedConversation(thread) {
  return thread.getMessages().some((message) => {
    return new RegExp(userProperties.getProperty('allowListRegex')).test(
      message.getFrom(),
    )
  })
}

function itsProbablyNotSpam(thread) {
  return (
    isAllowListedConversation(thread) || haveReceivedFromSenderBefore(thread)
  )
}

function alreadyProcessed(thread) {
  // Compare label names rather than GmailLabel objects: two lookups of the
  // same label are not guaranteed to return the same object, so includes()
  // on the objects themselves can miss a match.
  const labelNames = thread.getLabels().map((label) => label.getName())

  return (
    labelNames.includes(userProperties.getProperty('processedLabel')) ||
    labelNames.includes(userProperties.getProperty('outreachLabel'))
  )
}

function looksLikeColdOutreach(thread) {
  // Use the plain-text body so the excerpt sent to the model is not mostly HTML markup.
  const messageBody = thread.getMessages()[0].getPlainBody()
  const apiKey = userProperties.getProperty('openAiApiKey')
  // This saves a bit of cost by sending only the first 250 characters of the email body. You can adjust this up and down based on the token
  // limit of the model you're using and how much you want to spend.
  const prompt = `Does this email content look like cold outreach from a company I don't know? Answer with one word, "Yes" or "No"\n\n ${messageBody
    .trim()
    .substring(0, 250)}`
  const apiUrl = 'https://api.openai.com/v1/chat/completions'

  let data = {
    model: userProperties.getProperty('openAiModel'),
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 1,
    temperature: 0.5,
    n: 1,
  }

  let options = {
    method: 'post',
    headers: {
      Authorization: `Bearer ${apiKey}`,
    },
    contentType: 'application/json',
    payload: JSON.stringify(data),
  }

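  // fetch() returns an HTTPResponse object; read its body as text before parsing.
  const httpResponse = UrlFetchApp.fetch(apiUrl, options)
  const response = JSON.parse(httpResponse.getContentText())
  const aiResponse = response.choices[0].message.content

  // Tolerate differences in casing and surrounding whitespace in the one-word answer.
  return aiResponse.trim().toLowerCase().startsWith('yes')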
}

function addInvestigationLabel(thread) {
  const label = GmailApp.getUserLabelByName(
    userProperties.getProperty('processedLabel'),
  )
  thread.addLabel(label)
}

function removeInvestigationLabel(thread) {
  const label = GmailApp.getUserLabelByName(
    userProperties.getProperty('processedLabel'),
  )
  thread.removeLabel(label)
}

function addColdOutreachLabel(thread) {
  const label = GmailApp.getUserLabelByName(
    userProperties.getProperty('outreachLabel'),
  )
  thread.addLabel(label)
}

function main() {
  setUserVariables()

  const inboxThreads = GmailApp.search('in:inbox is:unread')
  inboxThreads.forEach((thread) => {
    Logger.log(`Processing thread: '${thread.getFirstMessageSubject()}'`)

    if (itsProbablyNotSpam(thread)) {
      Logger.log(`\tProbably not spam. Moving on!`)
      return
    }

    if (alreadyProcessed(thread)) {
      Logger.log(`\tAlready processed. Moving on!`)
      return
    }

    Logger.log(`\tCould be spam. Investigating!`)
    addInvestigationLabel(thread)

    if (looksLikeColdOutreach(thread)) {
      Logger.log('\t\tThis looks like cold outreach.')
      removeInvestigationLabel(thread)
      addColdOutreachLabel(thread)
    }
  })
}

In Gmail, create two labels matching the values you selected for processedLabel and outreachLabel.
Click "Run" in the top toolbar to make sure everything is working, and accept Google's permission prompts. For a quick test, first move one email that was cold outreach and one that definitely wasn't into your inbox and mark both as unread.
Click on "Triggers" on the left side of your screen, and add a new one that runs the main function however often you need your inbox scanned. I run mine every 5 minutes.
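
The label and trigger steps can also be done from the script editor instead of the UI. The following is a minimal sketch under a few assumptions: `setUp` is a hypothetical function name, and the `gmail` and `scriptApp` parameters exist only so the function can be exercised with stubs. Inside Apps Script they default to the real GmailApp and ScriptApp services.

```javascript
// Sketch: one-time setup run manually from the script editor.
// `setUp` and its parameter names are illustrative, not part of Apps Script.
function setUp(labelNames, gmail = GmailApp, scriptApp = ScriptApp) {
  labelNames.forEach((name) => {
    // getUserLabelByName returns null when the label does not exist yet.
    if (!gmail.getUserLabelByName(name)) {
      gmail.createLabel(name)
    }
  })

  // Run main every 5 minutes. everyMinutes accepts 1, 5, 10, 15, or 30.
  scriptApp.newTrigger('main').timeBased().everyMinutes(5).create()
}
```

Calling setUp(['AI/Processed', 'AI/ColdOutreach']) once should be enough. Running it again would add a second trigger, so check the "Triggers" page afterwards. Whether Gmail creates a missing parent label such as 'AI' automatically is worth verifying in your own account.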

Enjoy!

If you find a large number of false positives that you can easily categorize, try modifying the prompt above! You might include things like "If someone mentions that they're looking for an investment, please don't designate it as cold outreach" or "If the email contains a calendar invite, never designate it as cold outreach".
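
If the list of exceptions grows, it may be easier to keep the rules in an array and assemble the prompt from it. Here is a minimal sketch of that idea. `buildPrompt`, `extraRules`, and `maxChars` are hypothetical names, and the 250-character default mirrors the excerpt length used in the script.

```javascript
// Hypothetical helper: assembles the classification prompt with optional extra rules.
function buildPrompt(emailBody, extraRules = [], maxChars = 250) {
  const question = `Does this email content look like cold outreach from a company I don't know? Answer with one word, "Yes" or "No".`
  // Each rule becomes its own line so rules are easy to add or remove.
  const rules = extraRules.map((rule) => `- ${rule}`).join('\n')
  // Only the start of the body is sent, which keeps token usage low.
  const excerpt = emailBody.trim().substring(0, maxChars)
  // Empty parts (no rules, empty body) are dropped before joining.
  return [question, rules, excerpt].filter(Boolean).join('\n\n')
}

const prompt = buildPrompt('Hi there, I wanted to reach out about...', [
  'If the email contains a calendar invite, never designate it as cold outreach.',
])
```

Longer prompts cost more tokens per email, so it is worth keeping each rule short.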