Unable to retrieve code from webpage, because of query string? [closed]

I'm trying to write a Python script to retrieve the source code from a web page whose content depends on a query string. The specific page is: http://www.billboard.com/charts/hot-100#/charts/hot-100?begin=1&order=position. This is the Billboard Hot 100 page, which displays the Hot 100 in 10 pages, in groups of 10 songs. You control the chart positions retrieved via the "begin" query string item, and 10 songs including the "begin" number are returned. Basically, I want to use Windows Scheduler to automatically run a script that retrieves the 10 pages each week. I'm not wedded to doing this with Python.

I've used the following script, which retrieves what is seemingly the first page (positions 1 - 10).

import urllib
urllib.urlretrieve("http://www.billboard.com/charts/hot-100#/charts/hot-100?begin=1&order=position",r'C:\g\filename.txt')

However, when I change the "begin" value to 11, it still retrieves 1 - 10. Maybe the webpage somehow does not permit such an action by a script? If I input the URL without the query string, it returns the first 10 positions as well, so one possibility is that the query string isn't being taken into account at all, which may point to a need for a different approach.

Any help you could provide would be greatly appreciated!

Thanks! (and happy holidays!)


Asked by: Kate450 | Posted: 30-11-2021






Answer 1

Your code is working perfectly.

You are getting the same HTML data as generated by the URL http://www.billboard.com/charts/hot-100#/charts/hot-100?begin=11&order=position when given in the browser.

This means the entries from 11 to 20 are loaded by a separate asynchronous request, such as AJAX, rather than being part of the HTML you downloaded.

Try this code; it seems to work:

# Python 2; on Python 3 the function is urllib.request.urlretrieve
import urllib
urllib.urlretrieve("http://www.billboard.com/charts/hot-100?begin=11&order=position", r'C:\g\filename.txt')
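
To fetch all 10 pages in one scheduled run, the same fragment-free URL can be generated for begin = 1, 11, ..., 91. The following is only a sketch for Python 3, assuming the site still accepts this URL scheme; the names page_urls and download_chart and the output file pattern are hypothetical, not part of the original code.

```python
from urllib.request import urlretrieve

# Base URL taken from the answer above, without the "#..." fragment.
BASE_URL = "http://www.billboard.com/charts/hot-100"


def page_urls(pages=10, page_size=10):
    """Build one URL per chart page: begin=1, 11, 21, ..."""
    return [
        "{}?begin={}&order=position".format(BASE_URL, 1 + page_size * n)
        for n in range(pages)
    ]


def download_chart(out_pattern="hot100_{:02d}.html"):
    """Save each page to its own file (hypothetical file naming)."""
    for number, url in enumerate(page_urls(), start=1):
        urlretrieve(url, out_pattern.format(number))
```

Calling download_chart() from the script that the scheduler runs would then write one file per page.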

Answered by: Alissa613 | Posted: 01-01-2022



Answer 2

Have you tried going to http://www.billboard.com/charts/hot-100#/charts/hot-100?begin=11&order=position in your browser? You'll also get results 1-10. It seems likely that begin must be of the form 10n + 1.

Answered by: Marcus851 | Posted: 01-01-2022



Answer 3

Look at that URL carefully: everything after the # is the fragment, which normally isn't even sent to the server. They're probably using JavaScript (AJAX) to parse the fragment and load the relevant data.

This seems to work as expected: http://www.billboard.com/charts/hot-100?begin=21&order=position
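
The difference between the two URLs can be checked with the standard library. This is a small Python 3 sketch using urllib.parse; the variable names original and fixed are only illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# URL from the question: the "?begin=..." part sits after the "#".
original = "http://www.billboard.com/charts/hot-100#/charts/hot-100?begin=11&order=position"
# URL from this answer: a real query string and no fragment.
fixed = "http://www.billboard.com/charts/hot-100?begin=11&order=position"

for url in (original, fixed):
    parts = urlsplit(url)
    print("path:    ", parts.path)
    print("query:   ", parts.query)      # empty for the original URL
    print("fragment:", parts.fragment)   # holds "begin=11" for the original URL
    print("params:  ", parse_qs(parts.query))
    print()
```

For the original URL the query is empty and "begin=11" appears only in the fragment, so a plain HTTP client requests just /charts/hot-100 and the server never sees the parameter.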

Answered by: John822 | Posted: 01-01-2022


