Skip to main content

Search the web

Access fresh web search data and retrieve up-to-date information about companies, current events, facts, people, and more, in bulk directly in Google Sheets. For example, you can search for the specifications of a product to include in your catalog.
info
  • Search the web is the only bulk tool that generates responses based on recent web search data.
  • It is built on the Perplexity API and uses Perplexity Sonar model, regardless of the model you select in the model switcher. For more information, see Supported models.

Run the bulk tool

Prerequisites
  1. Select Extensions > GPT for Sheets and Docs > Open.

    Open GPT for Sheets
  2. In the sidebar, select Bulk tools, and click Search the web.

    Select Search the web
    info

    If you open the Bulk tools for the first time, you are prompted to grant additional permissions. These permissions are required for GPT for Sheets to write in your spreadsheets.

    1. Click Sign-in with Google. A Sign in with Google window opens.
    2. Select the Google account with which you have installed GPT for Sheets.
    3. Click Allow to grant GPT for Sheets the required permissions.
  3. Set up your bulk tool run.

    Setup Search the web
    FieldDescriptionExample
    1Column name row
    (optional)
    If your column names aren't in the first row, select the number of the row that contains the column names. The bulk tool will run on the rows below this one.2
    2Write a questionDescribe the task you want the AI to perform based on the target text. For example, ask a question about the text or request content based on the text. The AI generates the response based on fresh web search data.
    List 3 unique selling propositions on this company website. Just generate the 3 USPs in a short list, don't elaborate.
    3About each cell in columnSelect the column that contains the text to search the web for.A: Company
    4Put results in columnSelect the column to put the results in. Cells in this column won't be overwritten with the results if they contain text.B: Unique Selling Propositions
    5Show sources
    (optional)
    Select to show the sources of the information. The sources will be written in the column to the right of the results column. Cells in this column won't be overwritten with the sources if they contain text.Show sources in column C
    info

    The setup for the bulk tool run is stored in the History and can be reused for other runs.

    Bulk tool History
  4. Run the Search the web bulk tool starting from the first empty cell in the results column:

    1. Select a specific number of rows to run or select All rows.
    2. Click Run rows.

    Run Search the web tool

You have set up and run the Search the web bulk tool. If needed, try improving the results.

Search the web results

Improve results

Show sources

Enable the Show sources option to write the sources of the information in the column to the right of the results column. This helps you understand where the information comes from and assess its reliability.

Show sources

If you provide a URL in the question or in the input column About each cell in column, this option also helps you check if the URL you provided is actually used to generate the response.

info

Searching a specific page only works if:

  • Perplexity's web crawler has already independently indexed the content of the page. The crawler does not index new pages on demand based on URLs you provide. In addition, the crawler respects robots.txt directives and does not index disallowed content.
  • The URL points directly to the page without any redirects. To check if a URL results in redirects, use the Redirect Checker.

If a specific URL does not produce valid results, use web scraping to retrieve the information from the page.

Refine your question

Edit your question to:

  • Provide more context. For example, add the following to your question to remove ambiguity:
    The company is a clothing retailer.
  • Specify what to do when the answer cannot be found. For example, add the following to your question to avoid irrelevant information to be returned:
    If the answer cannot be found, return "N/A".

Break down complex tasks

The Perplexity model is not ideal for precise rendering tasks. For such cases, start by searching the web to gather relevant information using the Web search bulk tool. Then process the results with a model like gpt-4o-mini, which works well for tasks such as classification, extraction, or generating yes/no responses. For example, you can use the Web search bulk tool to retrieve a company's description from their website and then apply the Classify / Categorize bulk tool to determine its industry.

Use Web scraping

If you need to retrieve information from a specific web page, you can import its content into your spreadsheet using our Web scraping script:

  1. Install the script in your spreadsheet.
  2. Launch scraping:

Install the script in your spreadsheet

Install the web scraping script in your spreadsheet to enable the import of web page content:

  1. From the Extensions menu in your spreadsheet, click Apps Script.

    A new project opens on the Apps Script page.

  2. Copy and paste the script into the script editor.

    View Web Scraping Script
     /**
    * @OnlyCurrentDoc
    */

    const SHEETS_CELL_LIMIT = 50000;

    function onOpen() {
    SpreadsheetApp
    .getUi()
    .createMenu("Scrape")
    .addItem("Scrape URLs in column A", "ScrapeAB")
    .addToUi()
    }

    /**
    * @param {"https://en.wikipedia.org/wiki/ChatGPT"} url The url to scrape
    * @customfunction
    */
    function GPT_SCRAPE(url) {
    const res = UrlFetchApp.fetch(url);
    if (res.getResponseCode() >= 300) {
    return `Error: response code ${res.getResponseCode()}`;
    }
    const html = res.getContentText();
    const text = cleanup_html(html);
    // html.substring(0,SHEETS_CELL_LIMIT),
    return text.substring(0,SHEETS_CELL_LIMIT);
    }

    function cleanup_html(html) {
    // remove tags
    const tag_free = html
    .replace(/<style([\s\S]*?)<\/style>/gi, "")
    .replace(/<script([\s\S]*?)<\/script>/gi, "")
    .replace(/<[^>]+>/g, "");

    // replace html entities
    let decoded = tag_free;
    try {
    const xml = XmlService.parse('<d>' + tag_free + '</d>');
    decoded = xml.getRootElement().getText();
    } catch (e) {
    // unable to decode some HTML entity
    }

    // remove repeated spaces
    const clean = decoded
    .replace(/\n\s*\n/gi, "\n")
    .replace(/&nbsp;/gi, " ")
    .trim();

    return clean;
    }

    function ScrapeAB() {
    const range = SpreadsheetApp.getActiveSpreadsheet().getRange("A:B");
    const values = range.getValues().slice(1)

    let urls = 0
    let done = 0

    for (const row of values) {
    const url = row[0];
    const res = row[1];
    if (!url) continue;
    urls++;
    if (!res) continue;
    done++;
    }

    if (urls === done) {
    SpreadsheetApp.getUi().alert(`Found ${urls} urls, ${done} already scraped. Nothing to be done.`)
    return;
    }
    SpreadsheetApp.getUi().alert(`Found ${urls} urls, ${done} already scraped. Scraping ${urls - done}...`)

    for (let irow = 0; irow < values.length; irow++) {
    row = values[irow];
    const url = row[0];
    const res = row[1];
    if (!url || res) continue;
    const text = GPT_SCRAPE(url)
    range.getCell(irow+2, 2).setValue(text);
    }
    }
  3. Save the project and click Run.

    An Authorization required pop-up appears.

  4. Click Review permissions to accept the necessary access rights for the script.

    You are prompted to select a Google account.

  5. Choose the account you are using for the spreadsheet. This ensures the script works correctly within your document.

  6. Click Allow to grant these permissions.

info

The following permissions are necessary for the script to run:

  • View and manage spreadsheets that this application has been installed in: Required for the script to specifically interact with the spreadsheets where it's installed, whether it's adding, modifying, or removing data.
  • Connect to an external service: Required for the script to browse web pages and import the content to your Google Sheets.

You can now scrape multiple URLs at once in your spreadsheet or scrape a single URL using GPT_SCRAPE.

Scrape multiple URLs with the Scrape menu

Prerequisites
  1. From the Scrape menu, select Scrape URLs in column A.

    A confirmation pop-up appears.

  2. Click OK to launch the script.

This script processes the web page URLs in column A, and write the scraped content into column B, ready for you to edit directly. It removes complex codes and styles, leaving only text, and includes the page's title and main body text.

Scrape a single URL with GPT_SCRAPE

Prerequisites
You have installed the web scraping script in your spreadsheet.

Type a GPT_SCRAPE formula into a cell, for example =GPT_SCRAPE("https://talarian.io"), and press Enter to generate the result:

This function browses content from the specified web page and imports it into the cell. To edit a cell's content, copy the cell and use Paste special, and Values only for pasting.

info

Using the GPT_SCRAPE function does not affect your balance in GPT for Sheets.

Limitations

Note the following limitations of the web scraping script:

  • Caution is advised when scraping content from web pages to adhere to website policies and prevent server overload. Spacing out requests can prevent issues such as IP blocking or service restrictions.
  • Each import is capped at 50,000 characters of text. Content beyond this limit will be truncated.
  • Web pages heavily reliant on JavaScript might not be fully captured by this script.
  • This script is optimized for HTML content. It may not perform optimally with APIs that return structured data, such as JSON.

What's next

Try another bulk tool.