
Entity extraction within Excel

Perform Named Entity Recognition (NER) on text in individual cells using GPT for Excel. This guide walks you through extracting People names from news articles in Excel and optimizing results.

While you can also use the GPT_EXTRACT function to perform this task, this guide focuses on the Extract entities bulk tool.

Step 1: Get started

Prerequisites

Open the GPT for Excel add-in and run the Extract entities bulk tool on a few rows, with People names as the entity to extract, to see how it works.

On the Home tab, click GPT for Excel Word. Then open the Bulk tools tab and select Extract entities.


If you find the answers satisfactory, go ahead and launch your bulk entity extraction! Otherwise, check how to improve your results in Step 2.

Step 2: Improve your results (optional)

To obtain the best results when extracting entities, consider the following approaches. All solutions use the Extract field and require gpt-4o, which follows instructions more accurately.

Goal: Discover unknown entities
Recommended approach: Use unsupervised extraction
Instruct the model to extract all entities. This will give you an overview of the entities in the text.
Extract field: Entities along with their type (e.g. Apple (Company))

Goal: Increase extraction precision
Recommended approach: Provide context
Define in which context the entities should be extracted, for example extract Persons only if they are CEOs or CFOs.
Extract field: Person (only when CEO or CFO)
Or add custom instructions for generic guidelines:
Custom instructions: Only extract entities that are related to the company Apple

Goal: Standardize entities
Recommended approach: Normalize the extraction
Provide a normalized form for the output entities, for example you may want to extract 'Advil' and 'Nurofen' as 'Ibuprofen', their USAN form.
Extract field: Drug name (normalized with USAN)

Goal: Eliminate duplicates
Recommended approach: Define an output format
Specify an output format to ensure each entity is extracted only once, under this format. For example 'John Doe' and 'Mr Doe' are extracted once, as 'John Doe', if they appear in the same text.
Extract field: Person (First_Name Last_Name)

Goal: Avoid extraction of generic terms
Recommended approach: Disambiguate the instructions
Make the instructions more specific to prevent the extraction of generic terms. For example, Drugs extracts both drug names and synonyms of 'drug'.
Extract field: Drug names

Goal: Extract very specific entities
Recommended approach: Define the entity form
Specify the form of the entities to be extracted so that the model can identify them. For example, provide your Product ID format.
Extract field: Product IDs (start with E, 10 characters)
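
If you copy a few result cells out of Excel, a small script can help you check whether the refined instructions did what you wanted before you run the whole sheet. The Python sketch below is not part of GPT for Excel. It assumes each result cell holds its entities as comma-separated text (check what your own output looks like), and it reads "start with E, 10 characters" as an E followed by nine letters or digits. The function names are hypothetical.

```python
import re

# Assumption: a Product ID is "E" plus 9 letters or digits (10 characters total).
# Adjust the pattern to your real ID format.
PRODUCT_ID = re.compile(r"E[A-Za-z0-9]{9}")


def split_entities(cell: str, separator: str = ",") -> list[str]:
    """Split one result cell into trimmed, non-empty entities.

    Assumes the entities in a cell are separated by `separator`.
    """
    return [part.strip() for part in cell.split(separator) if part.strip()]


def invalid_product_ids(cell: str) -> list[str]:
    """Return the entities in a cell that do not match the expected ID form."""
    return [e for e in split_entities(cell) if not PRODUCT_ID.fullmatch(e)]


def repeated_entities(cell: str) -> list[str]:
    """Return entities that occur more than once in a cell, ignoring case."""
    seen: set[str] = set()
    repeats: list[str] = []
    for entity in split_entities(cell):
        key = entity.casefold()
        if key in seen:
            repeats.append(entity)
        else:
            seen.add(key)
    return repeats


if __name__ == "__main__":
    print(invalid_product_ids("E123456789, E12345, X123456789"))
    print(repeated_entities("John Doe, Jane Roe, john doe"))
```

An empty list from both checks on your sample rows suggests the entity form and output format are being followed. Anything the checks return points to a row worth reviewing by hand.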

Once you have refined your extraction method and are satisfied with the results from the initial rows, you are ready to launch your entity extraction in bulk. Select more cells or even all cells, click Run cells, and watch GPT for Excel handle the rest of the extractions.