Sphinx
tangro, 2011-01-08 02:28:29

How do I search for text in my own comments on Habr?

I have 1200 comments and want to find the ones that contain the word "write". How can I do that?
- Searching the site for this word returns a pile of links to comments by all users, not just mine.
- Searching Google / Yandex on the Habr site for my nickname plus this word returns a pile of pages with other people's comments containing the word, and mine without it.
- Searching Google / Yandex for this word on tangro.habrahabr.ru/comments/ returns nothing.
- Paging through the comments and running the browser's search on every page is tedious (there are many pages). All the comments cannot be opened on one page (or at least I don't know how).

Are there any sensible ways to do this (other than "write a spider to collect all the comment pages and search through them")?

6 answer(s)
webscout, 2011-01-08
@tangro

tangro &&/+3 "write" site:habrahabr.ru - for Yandex.
+3 means the distance is at most three sentences in the forward direction. In my opinion, the results are mostly your comments.

Rulin, 2011-01-08
@Rulin

robots.txt (http://%username%.habrahabr.ru/robots.txt) prohibits indexing of everything on the user's subdomain, so search engines cannot find anything there:
User-agent: *
Disallow: /
Host: %username%.habrahabr.ru
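
To check a particular subdomain yourself, a minimal sketch follows. The helper name `disallows_everything` is hypothetical, and the check is deliberately simplified: it only looks for a bare "Disallow: /" line and ignores which User-agent group the line belongs to.

```shell
# Hypothetical helper: reads robots.txt text on stdin and succeeds
# if it contains a bare "Disallow: /" line.
# Simplification: User-agent grouping is ignored.
disallows_everything() {
    grep -qiE '^[[:space:]]*Disallow:[[:space:]]*/[[:space:]]*$'
}
```

Assuming the file is still served at the address above, it could be piped in with something like `curl -s http://tangro.habrahabr.ru/robots.txt | disallows_everything && echo "indexing is blocked"`.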

Ramzeska, 2011-01-08
@Ramzeska

You have already listed all the possible ways yourself. The only thing left is the fashionable option for geeks: find an SQL injection bug and search the database :)

@ntkt, 2011-01-08

Comments live on pages like %USERNAME%.habrahabr.ru/comments/page%NUMBER%/
Find the number of the last page by hand: hover the mouse over the arrow.
Options from there:
0) YQL, unfortunately, is ruled out by the indexing ban in robots.txt.
1) A shell script that calls wget with a delay. That gives N HTML files, which you can then search.
On Windows, if there is no wget or you don't feel like writing a batch file, you can use VBScript / JScript; that doesn't take long either.
2) A JavaScript one-liner in the browser's address bar that adds N iframes to the page with a delay.
Turn off images and Flash in the browser, and you get a bare text page where Ctrl+F does the job.
If this doesn't count as "write a spider", then in my opinion it is a perfectly good way out.

@ntkt, 2011-01-08

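And here are the solutions:
1) Windows / wget - it doesn't fit on one line, and delays are awkward in CMD, so wget handles the delay itself:

rem Typed at the prompt; in a .bat file, double the percent sign of the loop variable.
set MAXPSTO=30
set HABRUSER=tangro
for /L %i in (1,1,%MAXPSTO%) DO @echo http://%HABRUSER%.habrahabr.ru/comments/page%i/ >> tmp.url
rem -i reads the URL list from the file, -w 5 waits 5 seconds between requests.
wget -w 5 -i tmp.url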

2) *nix / wget - needs no explanation:
$ for i in {1..30} ; do wget http://tangro.habrahabr.ru/comments/page$i/ && sleep 5 ; done

3) Cross-platform - JavaScript in the browser's address bar. It seems to work in Opera and Chrome; the main thing is to disable images and plugins beforehand. Along the way I ran into strange setInterval behavior, and in general some code in the "javascript: code" form seems to work only from a link and not from the address bar, so the script grew considerably.
pastebin.com/EXc7DFQC
No point in kicking it: it is a quick hack anyway :)
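
Once the wget variants have downloaded the pages, the search itself can be a single grep. A minimal sketch, assuming the pages were saved into one directory (because the URLs end in a slash, wget typically names the files index.html, index.html.1, and so on); the helper name `search_pages` is hypothetical:

```shell
# Hypothetical helper: print the names of saved pages that contain a word.
# $1 = word to look for, $2 = directory with the downloaded pages (default: .)
search_pages() {
    word=$1
    dir=${2:-.}
    # -l: print file names only, -i: ignore case,
    # -r: recurse into the directory, -F: treat the word as a plain string
    grep -l -i -r -F -- "$word" "$dir"
}
```

For example, `search_pages write .` lists the pages worth opening. The match runs over the raw HTML, so a word that also occurs in the page markup will produce false positives.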
