Parsing
Maxim, 2018-03-15 22:53:34

Why does HttpClient.GetAsync ignore the part of the link after "#"?

I am new to programming.
Following a YouTube tutorial, I wrote a parser for article titles on the Habrahabr website.
The parser works correctly with the link https://habrahabr.ru/page1/ and with pages 2, 3 and so on: it goes through them and collects all the article titles.
I then changed the links to the Steam site so that the parser would collect the names of the items on its pages.
I do get a list of items, but from the wrong page. If I understand correctly, the link to the page "breaks" inside GetAsync.
Code fragment:

var currentUrl = url.Replace("{CurrentId}", id.ToString());
Process.Start(currentUrl);
var response = await client.GetAsync(currentUrl);

"currentUrl" is the assembled link to the page from which the item names should be taken:
steamcommunity.com/market/search?appid=232090&q=em...
The link is assembled correctly, as are the links to pages 2, 3 and beyond ("p1" in the link above is the first page).
I added Process.Start(currentUrl) to make sure: the browser opens exactly the page that is required.
But the parser takes its results from a different page:
steamcommunity.com/market/search?appid=232090&q=emote
(I checked this by opening that page manually in the browser and comparing its list of items with the parser's results; they matched.)
That is, the link in GetAsync seems to be cut off at the "#" character.
I googled this problem for a long time but did not find anything.
Besides an answer to the question itself, I would be glad to learn how I should have searched for it.


2 answers
Alexey Pavlov, 2018-03-16
@Zzombik

The part after the # symbol is the fragment (or anchor) of the page: an element within the page that the browser uses for navigation. It is also used for local (client-side) page settings.
In your case, the sort type p1_name_asc is specified after the #: sort the table by name in ascending order. To sort in descending order, for example, you can use p1_name_desc.
The problem is that the sorting happens in the browser, not on the server (the server never sees that part of the address).
If you need the list exactly as the link shows it, you must sort the result yourself after receiving the list.
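
A minimal sketch of sorting on the client after the download, assuming the parser has already collected the item names into a list. The names below are hypothetical placeholders, not real Steam data, and the case-insensitive comparison is my own choice; the site may order names differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class SortAfterFetch
{
    static void Main()
    {
        // Hypothetical item names standing in for what the parser extracted.
        var names = new List<string> { "Zeta emote", "alpha emote", "Beta emote" };

        // Rough equivalent of "name_asc": ascending by name, ignoring case.
        var ascending = names
            .OrderBy(n => n, StringComparer.OrdinalIgnoreCase)
            .ToList();

        // Rough equivalent of "name_desc": descending by name, ignoring case.
        var descending = names
            .OrderByDescending(n => n, StringComparer.OrdinalIgnoreCase)
            .ToList();

        Console.WriteLine(string.Join(", ", ascending));
        Console.WriteLine(string.Join(", ", descending));
    }
}
```

Note that sorting a single downloaded page only reorders the items on that page. To reproduce the order the browser shows across all results, you would probably have to collect every page first and sort the combined list.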

RidgeA, 2018-03-15
@RidgeA

The part of the address after the # is normally not processed by the web server and should not be sent to it at all.
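
You can see how the address is split with System.Uri. This is only an illustrative sketch: the URL is put together from the pieces mentioned in this thread (the https scheme and the exact fragment are assumptions), and the code makes no network request.

```csharp
using System;

class FragmentDemo
{
    static void Main()
    {
        // Assumed URL, assembled from the query and fragment quoted in the thread.
        var uri = new Uri(
            "https://steamcommunity.com/market/search?appid=232090&q=emote#p1_name_asc");

        // The fragment stays on the client; it is not part of the HTTP request.
        Console.WriteLine(uri.Fragment);       // #p1_name_asc

        // Path and query: the part of the address the server receives.
        Console.WriteLine(uri.PathAndQuery);   // /market/search?appid=232090&q=emote

        // The same address with the fragment removed.
        Console.WriteLine(uri.GetLeftPart(UriPartial.Query));
    }
}
```

The last line prints the address from the question without the "#..." part, which matches the page the parser actually received.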
