Programming
NeonMercury, 2013-04-09 16:18:55

Sniffer: how to tell that a request was made by the user?

We are developing a network packet sniffer. It can already extract the host from the HTTP headers and record the visited site. But when the user visits, for example, habrahabr.ru, the browser also requests habrastorage.org (which is logical, since some of the images in posts are stored there) and many other hosts the user never intended to visit.
This is the whole question: how do we distinguish the host the user actually navigated to from the hosts the browser contacts to load additional content? We also need to keep statistics, so if there are 10 requests to one host (and the user has merely opened one page), only one entry should go into the log, so that the site-visit statistics are not skewed.
One option was to look at the time of the previous request and, if less than N seconds have passed, not add the new host to the log, since it is most likely an automatic request. But this is a kludge, because (a) the user may have a slow connection, and (b) it does not cover the case "open an online store, quickly open 25-100 tabs, then study the products."
I am counting on the Habr community, since a week of searching Google, Stack Overflow and similar sites turned up nothing useful.
PS: This is a corporate sniffer.
PPS: If other technologies are needed in addition to the sniffer, that is not a problem.
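
For reference, the time-threshold idea described in the question could look roughly like the sketch below. The function name `filter_by_quiet_period`, the `(timestamp, host)` tuple format and the default of 2 seconds are assumptions made for illustration; none of them come from the actual sniffer.

```python
from typing import List, Tuple

Request = Tuple[float, str]  # (timestamp in seconds, host) -- hypothetical format


def filter_by_quiet_period(requests: List[Request], n_seconds: float = 2.0) -> List[Request]:
    """Log a request only if at least n_seconds passed since the previous request."""
    logged: List[Request] = []
    last_seen = None
    for timestamp, host in requests:
        if last_seen is None or timestamp - last_seen >= n_seconds:
            logged.append((timestamp, host))
        # Every request, logged or not, resets the timer.
        last_seen = timestamp
    return logged
```

It also shows why the approach is fragile: tabs opened in quick succession fall inside the quiet period and are silently dropped.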

8 answers
Sergey Galkin, 2013-05-07
@NeonMercury

Full pages contain mandatory tags such as "head" and "body" that are absent from the pieces loaded afterwards. Maybe you can identify full pages by those tags? A user is unlikely to open a counter in its own window; at most it is loaded in a frame.
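
A minimal sketch of this tag check, assuming the sniffer already has the response body available as unencrypted, uncompressed bytes. The name `looks_like_full_page` and the 4096-byte probe size are hypothetical.

```python
import re

# Matches an opening <head> or <body> tag, but not <header> or <tbody>.
_FULL_PAGE_TAGS = re.compile(rb"<\s*(head|body)\b", re.IGNORECASE)


def looks_like_full_page(body: bytes, probe_bytes: int = 4096) -> bool:
    """Guess whether a response body is a whole HTML document rather than a fragment."""
    # Only the start of the body is inspected; the tags normally appear early.
    return bool(_FULL_PAGE_TAGS.search(body[:probe_bytes]))
```

A page loaded inside a frame would pass this check too, so it is a hint rather than proof that the user opened the page.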

Rowdy Ro, 2013-04-09
@rowdyro

The first thing that comes to mind is the HTTP Referer header: for a request to habrastorage.org it will be set to habrahabr.ru.
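
Roughly, the check could look like the sketch below. It assumes the sniffer has parsed the request headers into a dict; `has_cross_host_referer` is a made-up name. A link clicked on another site also carries a Referer from a different host, so this alone only gives a hint.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def referer_host(headers: Dict[str, str]) -> Optional[str]:
    """Return the host from the Referer header, or None if the header is absent."""
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("referer")
    if not value:
        return None
    return urlsplit(value).hostname


def has_cross_host_referer(request_host: str, headers: Dict[str, str]) -> bool:
    """True when the request was triggered from a page on a different host."""
    ref = referer_host(headers)
    return ref is not None and ref != request_host.lower()
```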

Rowdy Ro, 2013-04-09
@rowdyro

Yes, in this case you can also filter the HTML the user receives and look for links in iframe, a href ... elements; only JavaScript makes this a mess. You can also look at request timing: the user is unlikely to click faster than the browser loads the page.

egorinsk, 2013-04-09
@egorinsk

> Corporate sniffer
I hope this pointless undertaking fails harmlessly.

Alex Khayev, 2013-04-09
@hasalex

If the sniffer is corporate, I think it would be easier to write one of the following:
- on the client side:
1. a browser add-on
2. a mouse hook (a bad option)
3. your own browser for corporate use
- on the server side, if you have something like a proxy:
1. modify the HTML by adding on-click handlers to it
2. modify the HTML by rewriting the URLs in it
or a combination of these options, including those suggested above.
These are just my thoughts, options to consider, although your task is not entirely clear.

UZER2006, 2013-04-09
@UZER2006

You can combine checks of the Referer and Accept headers. I just looked at my own headers: when static files are loaded or AJAX requests are made, the Accept header sent differs from the one sent for a normal page load. Headers such as If-Modified-Since and Cache-Control also sometimes appear. But you still need to check how other browsers behave; I tested only on the latest version of Google Chrome.
Beyond that, I am not sure there are any other options.
If the volume of statistics is large and false positives are few, you can keep things simple and just ignore them.
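
A rough sketch of combining the two headers. The labels and the function name `classify_request` are invented for illustration, and the rule "page loads list text/html in Accept" is an assumption that would have to be verified per browser, as noted above.

```python
from typing import Dict


def classify_request(headers: Dict[str, str]) -> str:
    """Label a request as 'page', 'page-with-referer' or 'subresource'."""
    lowered = {name.lower(): value for name, value in headers.items()}
    accept = lowered.get("accept", "")
    # Drop parameters such as ";q=0.9" and keep only the media types.
    media_types = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if "text/html" not in media_types:
        return "subresource"  # images, scripts, most AJAX calls
    if lowered.get("referer"):
        return "page-with-referer"  # clicked link, or a frame inside another page
    return "page"  # typed address, bookmark, or a new tab
```

Only the "page" and "page-with-referer" results would be candidates for the visit log; frames still need a separate check.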

xmoonlight, 2013-04-10
@xmoonlight

The solution is this:
Work backwards: a received response with content type text/html -> its stream ID -> the first request sent to the server after that stream was opened (or the last redirect).
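
One way to sketch this backward lookup, assuming the sniffer already reassembles TCP streams and emits parsed HTTP events in capture order. The `HttpEvent` record and its fields are hypothetical, and redirect handling is left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass
class HttpEvent:
    stream_id: int          # identifier of the reassembled TCP stream
    kind: str               # "request" or "response"
    host: str = ""          # requests only
    path: str = ""          # requests only
    content_type: str = ""  # responses only


def pages_from_streams(events: Iterable[HttpEvent]) -> List[str]:
    """Return host+path of the first request on each stream that answered with text/html."""
    first_request: Dict[int, HttpEvent] = {}
    reported: Set[int] = set()
    pages: List[str] = []
    for event in events:
        if event.kind == "request":
            # Remember only the first request seen on each stream.
            first_request.setdefault(event.stream_id, event)
        elif event.kind == "response":
            media_type = event.content_type.split(";")[0].strip().lower()
            request = first_request.get(event.stream_id)
            if media_type == "text/html" and request is not None and event.stream_id not in reported:
                reported.add(event.stream_id)
                pages.append(request.host + request.path)
    return pages
```

With keep-alive connections several requests share one stream, so this version reports each stream at most once.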

mace-ftl, 2014-12-25
@mace-ftl

file.php?id=165&sid=1088c7be4e5127532d93
