
Learn To Build A Python GUI For Web Scraping Using BeautifulSoup Library In A Delphi Windows App

Are you looking for tools to build website scrapers that automate your data collection, with a nice GUI on top? You can build web scrapers easily by combining the BeautifulSoup and Python4Delphi libraries inside Delphi and C++Builder.

BeautifulSoup is a library that makes it easy to scrape information from web pages. It sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree.

Since 2004, BeautifulSoup has been saving programmers hours or days of work on quick-turnaround screen scraping projects.

Three features make BeautifulSoup powerful for quick-turnaround projects like screen scraping:

  1. BeautifulSoup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. It doesn’t take much code to write an application.
  2. BeautifulSoup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You don’t have to think about encodings unless the document doesn’t specify an encoding and Beautiful Soup can’t detect one. Then you just have to specify the original encoding.
  3. Beautiful Soup sits on top of popular Python parsers like lxml and html5lib, allowing you to try out different parsing strategies or trade speed for flexibility.
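
As a small illustration of features 2 and 3, the sketch below feeds Beautiful Soup a byte string with no declared encoding and names the encoding explicitly through the `from_encoding` argument. The sample markup is made up for this example. It uses `html.parser`, the parser from the Python standard library, so nothing beyond `beautifulsoup4` is required; `"lxml"` or `"html5lib"` can be passed instead if those packages are installed.

```python
from bs4 import BeautifulSoup

# Made-up sample: Latin-1 bytes with no <meta charset> declaration.
raw = "<p>Café in Zürich</p>".encode("latin-1")

# from_encoding names the original encoding of the incoming bytes.
# The second argument selects the parser; swap in "lxml" or "html5lib"
# to try a different parsing strategy (if installed).
soup = BeautifulSoup(raw, "html.parser", from_encoding="latin-1")

text = soup.p.get_text()          # incoming document is now a Unicode str
encoded = soup.p.encode("utf-8")  # outgoing markup as UTF-8 bytes

print(text)
print(encoded)
```

If the document declares its encoding, or Beautiful Soup can detect it, the `from_encoding` argument can simply be left out.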

BeautifulSoup parses anything you give it and does the tree traversal stuff for you. You can tell it “Find all the links”, or “Find all the links of class externalLink”, or “Find all the links whose URLs match ‘foo.com’”, or “Find the table heading that’s got bold text, then give me that text.”
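
To make those four requests concrete, here is a minimal sketch that runs each of them against a tiny made-up HTML fragment. The markup, the link targets, and the variable names are assumptions chosen for illustration; only `find_all`, `find`, the `class_` keyword, and regular-expression attribute matching come from the library itself.

```python
import re
from bs4 import BeautifulSoup

# Made-up fragment used only to demonstrate the searches.
html = """
<a href="/about">About</a>
<a class="externalLink" href="http://foo.com/page">Foo</a>
<a class="externalLink" href="http://example.org/">Example</a>
<table><tr><th><b>Temperature</b></th><th>Wind</th></tr></table>
"""
soup = BeautifulSoup(html, "html.parser")

# "Find all the links"
all_links = soup.find_all("a")

# "Find all the links of class externalLink"
external = soup.find_all("a", class_="externalLink")

# "Find all the links whose URLs match foo.com"
foo_links = soup.find_all("a", href=re.compile(r"foo\.com"))

# "Find the table heading that's got bold text, then give me that text"
bold_heading = next(th for th in soup.find_all("th") if th.find("b"))
heading_text = bold_heading.b.get_text()

print(len(all_links), len(external), foo_links[0]["href"], heading_text)
```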

Valuable data that was once locked up in poorly designed websites is now within your reach. Projects that would have taken hours take only minutes with Beautiful Soup.

 

Hands-On

This post shows how to use the BeautifulSoup library to scrape data from the National Weather Service and display it in a Delphi Windows GUI app.

First, open and run Demo1, the Python GUI project from Python4Delphi, in RAD Studio. Then paste the script into the lower Memo, click the Execute button, and read the result in the upper Memo. You can find the Demo1 source on GitHub. The behind-the-scenes details of how Delphi runs your Python code in this Python GUI can be found at this link.


These are the steps for scraping the Austin/San Antonio, TX weather data from the National Weather Service in the Python GUI built with Python4Delphi:

  • Import the libraries
  • Read the URL
  • Download the page and start parsing
  • Extract information from the page
  • Extract the title attribute from the img tag
  • Extract all the information from the page
  • Combine the data into a Pandas dataframe
  • Run all the steps inside the Python GUI
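
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact script from the demo: the sample HTML, the `seven-day-forecast` id, the `tombstone-container`, `period-name`, `short-desc`, and `temp` class names, and the `parse_forecast` helper are all hypothetical stand-ins, so inspect the real forecast page before relying on them. The sketch parses an inline sample so it runs without network access, and the Pandas step is skipped if Pandas is not installed.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the downloaded forecast page. The id and class names
# are assumptions for illustration; check them against the real page.
SAMPLE_HTML = """
<div id="seven-day-forecast">
  <div class="tombstone-container">
    <p class="period-name">Tonight</p>
    <p><img src="night.png" title="Tonight: Mostly clear, with a low around 55."></p>
    <p class="short-desc">Mostly Clear</p>
    <p class="temp temp-low">Low: 55 F</p>
  </div>
  <div class="tombstone-container">
    <p class="period-name">Thursday</p>
    <p><img src="day.png" title="Thursday: Sunny, with a high near 78."></p>
    <p class="short-desc">Sunny</p>
    <p class="temp temp-high">High: 78 F</p>
  </div>
</div>
"""


def parse_forecast(html):
    """Return one dict per forecast period found in the page (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    seven_day = soup.find(id="seven-day-forecast")
    rows = []
    for item in seven_day.find_all(class_="tombstone-container"):
        img = item.find("img")
        rows.append({
            "period": item.find(class_="period-name").get_text(),
            "short_desc": item.find(class_="short-desc").get_text(),
            "temp": item.find(class_="temp").get_text(),
            # The long description lives in the img tag's title attribute.
            "desc": img["title"] if img is not None else None,
        })
    return rows


# For a live run, download the page first, for example:
#   import requests
#   html = requests.get(url).content  # url: the forecast page you want to scrape
rows = parse_forecast(SAMPLE_HTML)
for row in rows:
    print(row)

# Combine the rows into a Pandas dataframe when Pandas is available.
try:
    import pandas as pd
    weather = pd.DataFrame(rows)
    print(weather)
except ImportError:
    print("pandas is not installed; skipping the DataFrame step")
```

Pasted into the lower Memo of Demo1, a script like this prints its results to the upper Memo when you click Execute, provided the required packages are installed in the Python distribution that Python4Delphi loads.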

 

Congratulations, now you have learned how to run the BeautifulSoup library for scraping the data from the National Weather Service and display it in the Delphi Windows GUI app! Now you can scrape any data you are interested in using the BeautifulSoup library and Python4Delphi.

Check out the BeautifulSoup library for Python and use it in your projects: https://pypi.org/project/beautifulsoup4/

Check out Python4Delphi, which lets you easily build Python GUIs for Windows using Delphi: https://github.com/pyscripter/python4delphi
