Can't retrieve link from webpage

I am using bs4 to run through a bunch of websites and grab a specific link off each page but I am having an issue grabbing that link.

I have tried getting all the links using:

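 from bs4 import BeautifulSoup
 # browser is presumably a Selenium WebDriver that has already loaded the page
 soup = BeautifulSoup(browser.page_source, "lxml")
 print(soup.find_all('a'))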

I have tried many other approaches, including pointing it at the exact address of one page, but every attempt seems to return everything except the link I want.

For context, my code visits pages of this site:
https://ce.naco.org/?find=true

These are two of the many pages I am searching for the link in:
https://ce.naco.org/?county_info=06019
https://ce.naco.org/?county_info=08045

Under "COUNTY CONTACT" there is a link on most of these pages, and that is the link I want to grab. I can't find a way to return only that link; it seems to be invisible to bs4.

I think it has something to do with how the page loads its data based on what the user clicks: since bs4 isn't interacting with the site, the data never loads. But this is just a guess.


Asked by: Brad616 | Posted: 30-11-2021






Answer 1

Instead of scraping the page, just use this endpoint to grab the data:

https://ce.naco.org/get/county?fips=06019

Here's how:

import requests

data = requests.get("https://ce.naco.org/get/county?fips=06019").json()
print(f'{data["county"]["Full_Address"]}\n{data["county"]["County_Website"]}')

Output:

2281 Tulare St<br>Hall Of Records<br>Fresno, CA 93721-2105
http://www.co.fresno.ca.us

This works for both county codes:

import requests

county_codes = ["06019", "08045"]

with requests.Session() as s:
    for county_code in county_codes:
        # Use the session so the connection is reused across requests
        data = s.get(f"https://ce.naco.org/get/county?fips={county_code}").json()
        print(f'{data["county"]["Full_Address"]}\n{data["county"]["County_Website"]}')

Output:

2281 Tulare St<br>Hall Of Records<br>Fresno, CA 93721-2105
http://www.co.fresno.ca.us
108 8Th St<br>Glenwood Springs, CO 81601-3355
http://www.garfield-county.com/
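
Since the question says only most pages have the link, it may help to guard against a missing value. Here is a minimal sketch, assuming the response keeps the shape shown above (a "county" object with "Full_Address" and "County_Website"). extract_contact is a hypothetical helper name, and I haven't confirmed how the endpoint represents a county with no website, so the code treats a missing key, null, and an empty string the same way:

```python
def extract_contact(data):
    """Return (address, website) from a decoded county response.

    Assumes the shape shown in the answer: data["county"] holds
    "Full_Address" and "County_Website". How the endpoint reports a
    county without a website is an assumption; missing, null, and
    empty values all come back as None here.
    """
    county = data.get("county") or {}

    address = county.get("Full_Address") or None
    if address:
        # The address lines arrive separated by <br>; join them with commas.
        address = ", ".join(part.strip() for part in address.split("<br>"))

    website = county.get("County_Website") or None
    return address, website
```

You could call it inside the loop above as address, website = extract_contact(data) and skip the county when website is None.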

Answered by: Emily511 | Posted: 01-01-2022



Similar questions

html - How can I retrieve the page title of a webpage using Python?

How can I retrieve the page title of a webpage (title html tag) using Python?


python - Unable to retrieve code from webpage, because of query string?


python - How to retrieve an element from a set without removing it?

Suppose the following: >>> s = set([1, 2, 3]) How do I get a value (any value) out of s without doing s.pop()? I want to leave the item in the set until I am sure I can remove it - something I can only be sure of after an asynchronous call to another host. Quick and dirty: >>> elem = s.pop() >>> s.add(elem)


sql server - Python: Retrieve Image from MSSQL

I'm working on a Python project that retrieves an image from MSSQL. My code is able to retrieve the images successfully but with a fixed size of 63KB. if the image is greater than that size, it just brings the first 63KB from the image! The following is my code: #!/usr/bin/python import _mssql mssql=_mssql.connect('<ServerIP>','<UserID>','<Password>') mssql.select_db('<Database...


python - Best way to retrieve variable values from a text file?

Referring to this question, I have a similar (but not the same) problem. I'll have a text file structured like: var_a: 'home' var_b: 'car' var_c: 15.5 And I need Python to read the file and then create a variable named var_a with value 'home', and so on. Example...


python - How to retrieve the selected text from the active window

I am trying to create a simple open source utility for windows using Python that can perform user-defined actions on the selected text of the currently active window. The utility should be activated using a pre-defined keyboard shortcut. Usage is partially outlined in the following example: The user selects some text using the mouse or the keyboard (in any application window)


python - How can I retrieve last x elements in Django

I am trying to retrieve the latest 5 posts (by post time). In views.py, if I try blog_post_list = blogPosts.objects.all()[:5] it retrieves the first 5 elements of the blogPosts objects. How can I reverse this to retrieve the latest ones? Cheers


python - Retrieve module object from stack frame

Given a frame object, I need to get the corresponding module object. In other words, implement callers_module so this works: import sys from some_other_module import callers_module assert sys.modules[__name__] is callers_module() (That would be equivalent because I can generate a stack trace in the function for this test case. The imports are there simply to make that example complete an...


How do I retrieve Hotmail contacts with python

How can I retrieve contacts from hotmail with python? Is there any example?


linux - How to retrieve the process start time (or uptime) in python

How to retrieve the process start time (or uptime) in python in Linux? I only know, I can call "ps -p my_process_id -f" and then parse the output. But it is not cool.


python - Retrieve the two highest item from a list containing 100,000 integers

How can I retrieve the two highest items from a list containing 100,000 integers without having to sort the entire list first?





