HTML Agility Pack or HTML Screen Scraping libraries for Java, Ruby, Python?

I found the HTML Agility Pack useful and easy to use for screen scraping web sites. What's the equivalent library for HTML screen scraping in Java, Ruby, Python?


Asked by: First Name472 | Posted: 06-12-2021






Answer 1

Found what I was looking for: Options for HTML scraping?

Answered by: Ryan124 | Posted: 07-01-2022



Answer 2

BeautifulSoup is the standard Python screen scraping tool.

Recently, however, I used the (incomplete at the moment) pyQuery, which is more or less a rewrite of jQuery into python, and found it to be very useful.

Answered by: Emily208 | Posted: 07-01-2022



Similar questions

web scraping - how to scrape video media with scrapy or other python's libraries?

Concretely I want to extract videos from this website: equidia. The first problem is when I launch scrapy shell https://www.equidia.fr/courses/2022-07-31/R1/C1 -s USER_AGENT="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36". I inspected with


web crawler - What are the best prebuilt libraries for doing Web Crawling in Python

This question already has answers here:


How to find all built in libraries in Python

I've recently started with Python, and am enjoying the "batteries included" design. I'e already found out I can import time, math, re, urllib, but don't know how to know that something is builtin rather than writing it from scratch. What's included, and where can I get other good quality libraries from?


OCSP libraries for python / java / c?

Going back to my previous question on OCSP, does anybody know of "reliable" OCSP libraries for Python, Java and C? I need "client" OCSP functionality, as I'll be checking the status of Certs against an OCSP responder, so responder functionality is not that important. Thanks


how do i use python libraries in C++?

I want to use the nltk libraries in c++. Is there a glue language/mechanism I can use to do this? Reason: I havent done any serious programming in c++ for a while and want to revise NLP concepts at the same time. Thanks


How can I use Perl libraries from Python?

I have written a bunch of Perl libraries (actually Perl classes) and I want to use some of them in my Python application. Is there a natural way to do this without using SWIG or writing Perl API for Python. I am asking for a similar way of PHP's Perl interface. If there is no such kind of work for Perl in Python. What is the easiest way to use Perl cl...


Python vs. C# Twitter API libraries

Closed. This question does not meet Stack Overflow guid...


d - Calling gdc/dmd shared libraries from Python using ctypes

I've been playing around with the rather excellent ctypes library in Python recently. What i was wondering is, is it possible to create shared D libraries and call them in the same way. I'm assuming i would compile the .so files using the -fPIC with dmd or gdc and call them the same way using the ctypes library. Has anyone tried this ? ...


plot - Python plotting libraries

Closed. This question does not meet Stack Overflow guid...


c - Building a Python shared object binding with cmake, which depends upon external libraries

We have a c file called dbookpy.c, which will provide a Python binding some C functions. Next we decided to build a proper .so with cmake, but it seems we are doing something wrong with regards to linking the external library 'libdbook' in the binding: The CMakeLists.txt is as follows: PROJECT(dbookpy) FIND_PACKAGE(PythonInterp) FIND_PACKAGE(PythonLibs) INCLUDE_DIRECTORIES(${PYTHON_INCLUDE...


shared libraries - Can two versions of the same library coexist in the same Python install?

The C libraries have a nice form of late binding, where the exact version of the library that was used during linking is recorded, and thus an executable can find the correct file, even when several versions of the same library are installed. Can the same be done in Python? To be more specific, I work on a Python project that uses some 3rd-party libraries, such as paramiko. Paramiko is now version 1.7.4, bu...






Still can't find your answer? Check out these communities...



PySlackers | Full Stack Python | NHS Python | Pythonist Cafe | Hacker Earth | Discord Python



top