Java
tursumbekov, 2015-11-03 07:39:38

How to check a site for broken links?

Good afternoon.
There is a site . It needs to be checked for broken links. I tried online services and Xenu, but every one of them sees only the main page. I don't have the source code; I looked through the page markup and concluded that the site is written in Java (I could be wrong), and I think that is the problem. How can I automate the search for broken links on this site?
P.S. Once I get access to the site, I'll try analyzing it with Google Webmasters.

9 answer(s)
ummahusla, 2015-11-03
@Antonoff

Google Webmasters to the rescue.

Avrong, 2015-11-03
@Avrong

1) Open the page you need
2) Using regular expressions or some other method, extract all href values
3) Request every link, and if the response code is not 200, record the link as broken
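
The three steps above can be sketched roughly like this in Python, using only the standard library. This is a minimal sketch, not a finished tool: the names `extract_links`, `get_status` and `find_broken` are made up for illustration, an HTML parser is used instead of regular expressions, and `urlopen` follows redirects on its own, so a link that redirects to a working page is reported as 200.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urldefrag, urljoin
from urllib.request import Request, urlopen


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Step 2: return absolute http(s) URLs, without fragments or duplicates."""
    parser = HrefCollector()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")) and absolute not in links:
            links.append(absolute)
    return links


def get_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request failed."""
    # The User-Agent value is an arbitrary placeholder.
    request = Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code
    except (URLError, OSError):
        return None


def find_broken(links, status_of=get_status):
    """Step 3: return (url, status) pairs for links whose status is not 200."""
    broken = []
    for url in links:
        status = status_of(url)
        if status != 200:
            broken.append((url, status))
    return broken
```

The `status_of` parameter is there only so the check can be tried without touching the network, for example by passing the `get` method of a dict that maps URLs to status codes.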

archelon, 2015-11-03
@archelon

netpeak-spider

Kovalsky, 2015-11-03
@lazalu68

What a mess!
If you look at the HTTP response from nu.edu.kz, you will immediately see this:

/**
 * This is the loopback script to process the url before the real page loads. It introduces
 * a separate round trip. During this first roundtrip, we currently do two things: 
 * - check the url hash portion, this is for the PPR Navigation. 
 * - do the new window detection
 * the above two are both controled by parameters in web.xml

From this it is at least clear that a redirect really does occur (one or more), and the "real page loads" only after that. It is quite possible that this works only in a browser, since the redirect does not look like a standard one. Robots run into it and see nothing beyond it.
I understand that this is not really an answer, but I'll try to figure it out and come up with something.
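
One way to notice this situation from a script is to check whether the body that came back contains any links at all. The sketch below is only a heuristic under my own assumptions: the function name `looks_like_script_stub` is invented, and "has a script tag but no `<a href>`" is a guess at what such a loopback response looks like, not something taken from the site itself.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts <a href> links and <script> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = 0
        self.scripts = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" and value for name, value in attrs):
            self.links += 1
        elif tag == "script":
            self.scripts += 1


def looks_like_script_stub(html):
    """Return True when the body has scripts but no links to follow.

    Assumption: a response like this is a script-driven intermediate page,
    so a checker that does not execute JavaScript will find nothing in it.
    """
    parser = TagCounter()
    parser.feed(html)
    return parser.scripts > 0 and parser.links == 0
```

If the function returns True for the first response, a plain HTTP checker will stop at the main page, and the check probably has to be done through something that actually executes the script, such as a real browser.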

Ilya Beloborodov, 2015-11-03
@kowap

You can write a robot that follows links and records the server's response code (200, 404, 500, etc.); if the response is 200, the link is not broken.
But that option probably won't work for you.
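
For reference, such a robot could look roughly like the following Python sketch (standard library only). The names `crawl`, `fetch` and `broken` and the `max_pages` limit are my own choices for illustration. It only follows links that are present in the plain HTML, so on a site whose first response is a script it will still stop at the main page.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen


class LinkParser(HTMLParser):
    """Collects href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def fetch(url, timeout=10):
    """Return (status, body); status is None when the request fails outright."""
    request = Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except HTTPError as error:
        return error.code, ""
    except (URLError, OSError):
        return None, ""


def crawl(start_url, fetch_page=fetch, max_pages=200):
    """Breadth-first crawl of one host; returns {url: status} for every URL requested."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    statuses = {}
    while queue and len(statuses) < max_pages:
        url = queue.popleft()
        if url in statuses:
            continue
        status, body = fetch_page(url)
        statuses[url] = status
        # External links are checked once but never expanded further.
        if status != 200 or urlparse(url).netloc != host:
            continue
        parser = LinkParser()
        parser.feed(body)
        for href in parser.hrefs:
            target, _fragment = urldefrag(urljoin(url, href))
            if target.startswith(("http://", "https://")) and target not in statuses:
                queue.append(target)
    return statuses


def broken(statuses):
    """Keep only the URLs whose status is not 200."""
    return {url: status for url, status in statuses.items() if status != 200}
```

Calling `broken(crawl(start_url))` then gives the list of non-working links; `fetch_page` can be swapped for a stub function to try the logic offline.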

Vladimir, 2015-11-04
@vkushni

The simplest option is:
1) set up a local proxy
2) find a decent site-downloading program that lets you specify a proxy (Teleport Pro, for example)
The proxy is needed to record the statistics of the requests, and the site downloader is needed to visit all the available URLs for you.

Dmitry Pavlov, 2015-11-13
@dmitry_pavlov

For a quick manual check: www.brokenlinkcheck.com
There are plenty of similar services.

kolya-kuznetsov-96, 2017-03-14
@kolya-kuznetsov-96

SEO experts still recommend Xenu.

Alexnetroet, 2019-09-06
@Alexnetroet

Maybe you are doing something wrong. Here is a more detailed article: https://k-gayduk.ru/blog/tech/bitye-ssylki.html . It has always helped me.
