<img height="1" width="1" style="display:none" src="https://www.facebook.com/tr?id=1822615684631785&amp;ev=PageView&amp;noscript=1"/>

How to Quickly Clean Data & Extract Emails with GPT for Sheets

As great as it would be for every dataset to come to you perfectly scrubbed, categorized, and formatted, that’s rarely the case.

Fortunately, AI now makes it easier than ever to clean up your cold outreach prospecting data and get it ready to use in your campaigns.

And in this article, we’ll cover exactly how to use AI to do just that.

Below are some different ways you can use the GPT for Sheets add-on to clean your data right inside of Google Sheets. (Which then prepares it perfectly for your GMass campaigns.)

Note: If you need a walkthrough of what GPT for Sheets is, how to install and activate it, and how to figure out the costs, check out our article giving a full overview of GPT for Sheets.

Check out our other guides to using GPT for Sheets: Writing unique cold email intro linesScraping prospect dataFixing grammar, typos, and more

Cleaning and Extracting Data with GPT for Sheets: Table of Contents

Using GPT for Sheets to Clean and Extract Data

GPT for Sheets does an admirable job of cleaning up messy data and getting it into usable form. It certainly can do most, if not all, of what you used to have to do manually or assign to a virtual assistant.

Here’s how.

Cleaning up a list of messy email addresses

For this example, I put 20 rows of jumbled up contact info into a Google Sheet.

You can see my data includes email addresses in different formats (including some with typos), plus some names and phone numbers as well.

Raw data of 20 jumbled email addresses

Then I used the GPT_EXTRACT formula from GPT for Sheets to extract email addresses.

The GPT_EXTRACT formula is designed to find a specific type of data inside of a specified cell. The format is =GPT_EXTRACT(data,what to find).

My formula here for row 2, for example, is:

= GPT_EXTRACT(A2, “email address”)

Here were the results for the whole column, powered by the gpt-3.5-turbo model. (Which is the least sophisticated — and therefore cheapest — model available in GPT for Sheets.)

Extracted and cleaned email addresses

The results are almost perfect. Of the 20 rows, it failed on Row 15 with the apostrophe and Row 18 with more complex decoding. (Though I was impressed it figured out the name[at]organization[dot]extension formatting in the first place.)

I tried again with the much more advanced gpt-4o model — and it made the same mistakes.

So, as always, ChatGPT isn’t infallible but does an overall good job.

And, again as always, you need to double check (or at least spot check) ChatGPT’s work before you take it live into the world.

Figuring out prospects’ first names based on their email addresses

Next, I wanted to try to figure out prospects’ first names based on their email addresses. (Sure, GMass can do this already for you with our first name detection feature, but I figured I’d see if OpenAI can do it too.)

I used the GPT_FILL function to figure out first names from email addresses.

With GPT_FILL, you give ChatGPT a few rows’ worth of examples, then it figures out what you were doing in those examples and fills in the remaining rows.

The format is =GPT_FILL(examples, inputs). The examples part should include both the rows/columns of raw data as well as the column where you cleaned up your manual examples.

So I manually typed in the correct first names for the first three rows, then used this formula:

=GPT_FILL(A2:B4, A5:A21)

These were the results:

Figuring out first names from email addresses

As you can see, everything came out correct. ChatGPT was 20-for-20 on grabbing first names out of email addresses.

Finding phone numbers inside raw data, then standardizing their format

Finally, I wanted to see how GPT for Sheets would perform trying to find, then standardizing, phone numbers.

This took two functions (I tried a few different ways to reduce this to one formula, but that was a bridge too far).

I used GPT_EXTRACT to grab phone numbers, then GPT_FORMAT to standardize the results in the international phone number format.

GPT_FORMAT is a function structured as =GPT_FORMAT(input, target format). GPT for Work doesn’t specify all the potential target formats; in their examples they include ISO dates, different cases (title case, upper case, and so on), international phone numbers, and ISO currency.

But… I also had to wrap the formatting in a Google Sheets IF statement using ISERROR, because the GPT_FORMAT would return an error if there was no phone number to be found.

So this is the formula I put in column D:

=GPT_EXTRACT(A2, "phone number")

And this is the formula I put in column E:

=IF(ISERROR(GPT_FORMAT(D2,"international phone number")), "", GPT_FORMAT(D2,"international phone number"))

Extracting and formatting phone numbers

Once again, almost perfect. It missed only on row 7, where it couldn’t figure out the country code.

Takeaways from using GPT for Sheets for data cleaning

Overall, even with a mistake here or there, using GPT for Sheets would be a big time saver during the data cleaning process — especially with a large set of contacts.

All of this work (including some trial and error) cost me 1 cent worth of GPT for Sheets tokens.

Based on the handful of little mistakes, I would recommend double checking GPT for Sheets’ results — but the results were awfully close overall.

Also, I recommend that after you get results from GPT for Sheets, copy that column and use Google Sheet’s Edit > Paste special > Values only to paste your results into a new column. Otherwise, GPT for Sheets has a tendency to keep re-running the formulas and driving up your costs.

Using Your Cleaned and Extracted Data with GMass

And once you have your cleaned and extracted data, it’s incredibly easy to get it into a cold outreach campaign in GMass.

Since GMass works inside Gmail and has a native integration with Google Sheets, you can simply connect your Google Sheet and use mail merge to bring in your data.

Here I connected my Google Sheet, then picked the sheet within that file that had the name, email, and phone information.

Connecting the Google Sheet of results to a GMass campaign

Now I can use mail merge to bring in the data I found.

(Note: I used GMass’s conditional formatting to say one thing if I had the person’s phone number and another thing if I didn’t.)

The draft of the GMass campaign

Here’s how a few of these emails look when I generate the drafts using GMass.

The results after mail merge

If you aren’t using GMass yet, you can start a free trial by downloading the Chrome extension.

It only takes a minute to get started and you won’t even have to put in a credit card for your trial.

See why 300,000+ people are using GMass (and give it a 4.8/5-star rating across 10,000+ reviews) to send their cold emails, mail merge campaigns, and more.

Email marketing, cold email, and mail merge inside Gmail

Send incredible emails & automations and avoid the spam folder — all in one powerful but easy-to-learn tool


Download Chrome extension - 30 second install!
No credit card required
Love what you're reading? Get the latest email strategy and tips & stay in touch.

Leave a Reply

Your email address will not be published. Required fields are marked *

Start your free trial of GMass now

Install in 30 seconds — no credit card or sign up form required

Try GMass for free Then check out the quickstart guide to send your first mail merge email in minutes!


Share This