Entity extraction within Excel

Extract key information, such as product attributes, names, or email addresses, from unstructured text in Excel cells using GPT for Excel. This guide walks you through extracting the name, weight, and price from bike model descriptions.

Step 1: Get started

Prerequisites

Open the GPT for Excel add-in and run the Extract entities bulk tool on a few rows, with Name, Weight, and Price as the entities to extract, to see how it works.

Click the Home tab, then GPT for Excel Word. Then open the Bulk tools tab and select Extract entities.

If you find the answers satisfactory, go ahead and launch your bulk entity extraction! Otherwise, check how to improve your results in Step 2.

Step 2: Improve your results

For best results, use gpt-4o, which offers higher accuracy. If your results are still not satisfactory (e.g., incorrect extractions, lack of precision, inconsistent formatting, or missing information), you can further improve them by adding Extraction instructions to give more specific guidelines to the AI.

Click Add extraction instructions
Note: You can provide specific instructions for multiple entities simultaneously. Separate each set of instructions with a new line for clarity.
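
As an illustration of the one-instruction-per-line layout, here is a minimal Python sketch that assembles the text to paste into the Extraction instructions field. The helper name `build_instructions` is hypothetical and is not part of GPT for Excel. The instruction wording is taken from the examples in this guide.

```python
def build_instructions(instructions: dict[str, str]) -> str:
    """Return one 'Entity: instruction' line per entity, separated by new lines."""
    return "\n".join(
        f"{entity}: {text.strip()}" for entity, text in instructions.items()
    )


# Hypothetical usage: two entities, each with its own instruction line.
instructions_text = build_instructions({
    "Name": "extract only the model name, without manufacturer or year",
    "Weight": "extract the weight in lbs",
})
print(instructions_text)
```

Keeping each entity on its own line makes it easy to adjust one instruction without touching the others.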

Goal: Remove irrelevant information
Recommended approach: Instruct the model to exclude specific details
Define what should be excluded from the extraction to get cleaner results. For example, extract only the model name, without manufacturer or year.

Name: extract only the model name, without manufacturer or year

Goal: Normalize entities
Recommended approach: Request a standard form for the output entities
Get consistent output for variations of the same entities in your text. For example, standardize color names to a consistent set of terms.

Add Color as an entity in the Extract field, then instruct:

Color: normalize color names (e.g., 'midnight' to 'black')

Goal: Split numerical values and units
Recommended approach: Extract separate entities for values and units
Separate numbers from units to simplify their manipulation. For example, split a price like "$1,299.99" into separate numerical and currency entities.

Add Price value as an entity in the Extract field, then instruct:

Price value: extract only the numerical value.

Next, add Price currency as another entity, and instruct:

Price currency: extract only the currency code (e.g., USD, EUR).

Goal: Use a common unit
Recommended approach: Request conversion to specific units
Ask for values to be converted to a common unit for consistency. For example, convert all weight measurements to pounds.
Note: Currency conversion is not possible due to fluctuating exchange rates.

Weight: extract the weight in lbs

Goal: Open-ended extraction
Recommended approach: Provide open-ended instructions
Identify all relevant entities or entities of a given type in your text. For example, extract various bike specifications.

Specifications: extract all technical specs as a comma-separated list. Add the type of specification between brackets, for example: 1kg (weight)

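
To spot-check a few rows after applying the value/unit split and common-unit instructions, a small script can compute what the outputs should look like. The following Python sketch is an independent check, not a description of how GPT for Excel works internally. The function names, the symbol-to-currency mapping (in particular, treating "$" as USD), and the rounded conversion factor are all assumptions made for illustration.

```python
import re
from decimal import Decimal

# Assumed mapping for illustration only; "$" could denote other dollar currencies.
CURRENCY_CODES = {"$": "USD", "€": "EUR", "£": "GBP"}

# Approximate conversion factor from kilograms to pounds.
KG_TO_LB = Decimal("2.20462")


def split_price(text):
    """Return (numerical value, currency code) for the first price in text, or None."""
    match = re.search(r"([$€£])\s*([\d,]+(?:\.\d+)?)", text)
    if match is None:
        return None
    symbol, number = match.groups()
    # Drop thousands separators before converting to a number.
    return Decimal(number.replace(",", "")), CURRENCY_CODES[symbol]


def weight_in_lbs(text):
    """Return the first kg or lb weight in text as pounds (2 decimals), or None."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*(kg|lbs?)\b", text, re.IGNORECASE)
    if match is None:
        return None
    value = Decimal(match.group(1))
    if match.group(2).lower() == "kg":
        value *= KG_TO_LB
    return value.quantize(Decimal("0.01"))


print(split_price("$1,299.99"))   # (Decimal('1299.99'), 'USD')
print(weight_in_lbs("8.2 kg"))    # 18.08
print(weight_in_lbs("18 lbs"))    # 18.00
```

Comparing these reference values against the Price value, Price currency, and Weight columns for a handful of rows is a quick way to confirm the instructions behave as intended before running the full sheet.
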
Once you have refined your extraction method and are satisfied with the results from the initial rows, you are ready to launch your entity extraction in bulk. Select more cells or even all cells, click Run rows, and watch GPT for Excel handle the rest of the extractions.