Winhelponline

How to Extract Links from a Webpage in Windows

You may need to extract the links (URLs) in a webpage for various purposes, e.g., internet research, web development, security assessments, or webpage testing. This article explains how to extract links from a webpage or HTML document in Windows.

There are several methods to extract URLs in a webpage. Let’s start with a native way — using your web browser’s developer tools section.

Using your Web browser’s Developer Tools

  1. Open Chrome or Firefox, and visit the website or webpage.
  2. Press F12 to open the Developer Tools window.
  3. Click on the Console tab in Developer Tools.
  4. Clear the console output by clicking on Clear console (in Chrome) or Clear the Web console output (in Firefox) button.
  5. Type the following code at the console prompt and press Enter:

     Array.prototype.slice.call(document.querySelectorAll('a'), 0).forEach((a,i)=>console.log(`#${i+1} - ${a.innerText} -- ${a.href}`));

This prints a numbered list of the links in that webpage, each with its link text, in the console window.
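The same numbering and formatting can be wrapped in a small reusable function, which makes the output format easier to adjust. The following is only a minimal sketch: the name `formatLinks` is hypothetical and not part of any browser API. It assumes each item has `innerText` and `href` properties, as the `<a>` elements returned by `document.querySelectorAll('a')` do.

```javascript
// Hypothetical helper: turn a list of anchor-like objects into numbered lines.
// Each item only needs `innerText` and `href`, so a NodeList of <a> elements works.
function formatLinks(anchors) {
  return Array.from(anchors).map(
    (a, i) => `#${i + 1} - ${(a.innerText || '').trim()} -- ${a.href}`
  );
}

// In the browser console, you could then run:
//   console.log(formatLinks(document.querySelectorAll('a')).join('\n'));
```

Because the function returns an array of strings instead of logging each line, the whole list is printed as a single console entry, which is easier to select and copy.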

If you only want to grab the URLs without the serial number or the title text, use this command:

$$('a').forEach(a => console.log(a.href));

Copy the output to Notepad and save it.
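If you would rather not select the console output by hand, the URLs can be de-duplicated, sorted, and sent to the clipboard in one step. This is a sketch under a few assumptions: `uniqueSortedHrefs` is a hypothetical helper name, and `copy()` is the console utility available in the Chrome and Firefox developer tools (it exists only in the console, not in regular page scripts).

```javascript
// Hypothetical helper: collect the href of each anchor, drop empty or
// non-string values and duplicates, and return the result sorted.
function uniqueSortedHrefs(anchors) {
  const hrefs = Array.from(anchors)
    .map((a) => a.href)
    .filter((href) => typeof href === 'string' && href.length > 0);
  return [...new Set(hrefs)].sort();
}

// In the browser console, you could then run:
//   copy(uniqueSortedHrefs(document.querySelectorAll('a')).join('\n'));
// and paste the result straight into Notepad.
```

The string check skips anchors whose `href` is not plain text, such as links inside SVG graphics.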

Using PowerShell

Launch PowerShell and run the following command, placing the address of the webpage between the quotes after -Uri:

(Invoke-WebRequest -Uri "").Links.Href | Sort-Object | Get-Unique | Out-GridView

This gets the list of links in the specified webpage and displays it in a grid view window.

Another advantage of this PowerShell command is that it sorts the entries and removes duplicate URLs from the collection.

The grid view window lets you filter the URLs by keyword, and copy the selected entries to the clipboard using Ctrl + C.

Grab title and URL

To view the innerText in addition to the corresponding links or URLs, run:

(Invoke-WebRequest -Uri "").Links | Sort-Object href -Unique | Format-List innerText, href

The output lists the innerText and href of each link, with duplicate URLs removed automatically.

You can also copy the output to the clipboard automatically by piping it to clip:

(Invoke-WebRequest -Uri "").Links | Sort-Object href -Unique | Format-List innerText, href | clip

Grab Image URLs only

To extract the list of image URLs, use this syntax:

(Invoke-WebRequest -Uri "").Images | Select-Object src

Using AddrView from Nirsoft

Nirsoft’s AddrView tool automatically extracts the links (including image links) from a given webpage or a local HTML file and lists them in a grid view.

You can also sort the results by Type and copy only the image URLs to the clipboard or save them to a file.

If you only want to grab the image links, you can use Firefox’s built-in View Page Info option. In Firefox, right-click on the current webpage, and click View Page Info → Media. The Media tab lists all the image URLs, including the data:image links. You can copy selected items or all items to the clipboard, or save the entries to a file.
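The image URLs can also be listed from the developer tools console, much like the links earlier. The snippet below is a minimal sketch rather than a built-in feature: `imageUrls` and its `includeDataUrls` parameter are hypothetical names, and it assumes each item has a `src` property, as the `<img>` elements in `document.images` do.

```javascript
// Hypothetical helper: return the unique `src` values of the given images.
// By default, inline data: URLs are skipped because they are embedded
// image data rather than links to image files.
function imageUrls(images, includeDataUrls = false) {
  const srcs = Array.from(images)
    .map((img) => img.src)
    .filter((src) => typeof src === 'string' && src.length > 0)
    .filter((src) => includeDataUrls || !src.startsWith('data:'));
  return [...new Set(srcs)];
}

// In the browser console, you could then run:
//   console.log(imageUrls(document.images).join('\n'));
```

Pass `true` as the second argument to include the data:image entries as well. Note that this only covers `<img>` elements, so the list may be shorter than the one in Firefox's Media tab.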

Other than the above methods, for browsers like Chrome or Firefox, there are plenty of extensions or add-ons that will grab the URL or image links from the currently active web page in your browser.
