JavaScript
Andrey Shamov, 2010-09-22 12:02:45

How do I save an HTML page generated by JavaScript?

Some pages have dynamic parts that are fetched via Ajax or simply built by JavaScript functions (for example, templating engines).
How do I save the finished page, i.e. the one with all the JS functions already executed? With Ajax, the content may arrive some time after the DOM has loaded.
Ideally the solution would be in Java (or, better, C#), or something I can run from the Windows/Linux console.
For example, I need to download this page from forexite: Calendar for the week.


7 answer(s)
Stasik0, 2010-09-22
@lomaster

The hackiest option is to type javascript:alert(document.documentElement.innerHTML); into the browser's address bar, then press Ctrl+A to select the output ;)
For server-side use there is HtmlUnit: htmlunit.sourceforge.net/
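
A slightly fuller variant of that one-liner, as a sketch only. It uses outerHTML instead of innerHTML so the <html> tag itself is kept, and it re-adds a simple doctype. The function name serializeDocument is made up for illustration, and legacy doctypes with public/system identifiers are not handled.

```javascript
// Returns the current (post-script) markup of a document as a string.
// `doc` is expected to look like a browser `document` object.
function serializeDocument(doc) {
  // Only the short form "<!DOCTYPE name>" is rebuilt here (assumption:
  // the page does not rely on a legacy public/system identifier).
  var doctype = doc.doctype ? '<!DOCTYPE ' + doc.doctype.name + '>\n' : '';
  // outerHTML includes the <html> element and its attributes,
  // which innerHTML would drop.
  return doctype + doc.documentElement.outerHTML;
}

// In a browser console, after the Ajax content has arrived:
//   copy(serializeDocument(document));   // copy() exists in most devtools consoles
```

Like the alert() trick, this only captures what is in the DOM at the moment you call it, so it has to be run after the dynamic parts have finished loading.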

web4_0, 2010-09-22
@web4_0

Could you clarify the question: do you just need to look at the generated HTML, or do you need to save it straight to disk?
If you only need to look, then in Firefox you can select the whole page with Ctrl+A and choose View Selection Source from the context menu. You can also install the Web Developer extension, which has a View Generated Source option.

Ajex, 2010-09-22
@Ajex

Take a look at this thread; it has several links on the topic:
habrahabr.ru/blogs/webdev/87705/
For Java, look into this: download.oracle.com/javase/6/docs/technotes/guides/scripting/programmer_guide/index.html

MT, 2010-09-24
@MTonly

Firefox → Select All (Ctrl+A) → Right Click → View Selection Source

Horse, 2010-09-22
@Horse

I have the same question, but for analyzing Google search results.
If Ajax is the only issue, you can send the additional request yourself and receive the generated HTML in the response...
P.S. I suppose I haven't really answered the question, but this topic interests me too, and I hope someone will answer it in more detail.

pietrovich, 2010-09-22
@pietrovich

In C#?
Put a WebBrowser control on a form, hook all the load handlers, and wait until some time has passed since a load handler last fired. Then pull innerHTML from the document and save it.
I can't be more precise right now because I have nothing at hand to experiment with. I do know that accessing the DOM through the control is not complicated, and I had no problems following the loading process either.
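
The "wait until nothing has fired for a while" part is language-neutral, so here is a rough sketch of just that logic in JavaScript (the thread's tag), not a C# implementation. The name createQuietTracker and the 500 ms threshold in the usage comment are assumptions for illustration. In C#, the touch calls would sit in whichever WebBrowser load handlers you hook (DocumentCompleted is one such event), and a timer would poll isQuiet.

```javascript
// Tracks when a load-related event was last seen and reports whether
// at least `quietMs` milliseconds have passed since then.
function createQuietTracker(quietMs) {
  var lastEvent = null; // timestamp of the most recent event, null if none yet

  return {
    // Call from every load handler, passing the current time in ms.
    touch: function (now) {
      lastEvent = now;
    },
    // True only if at least one event was seen and the page has been
    // quiet for quietMs or longer.
    isQuiet: function (now) {
      return lastEvent !== null && now - lastEvent >= quietMs;
    }
  };
}

// Usage sketch (500 ms is an arbitrary choice):
//   var tracker = createQuietTracker(500);
//   onAnyLoadEvent(function () { tracker.touch(Date.now()); }); // hypothetical hook
//   poll: if (tracker.isQuiet(Date.now())) { /* grab the HTML and save it */ }
```

The threshold is a trade-off: too short and slow Ajax responses are missed, too long and every page costs extra seconds.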
Alternatively, if you know the exact structure of the site you are going to scrape, you can modify the original HTML before it is rendered: insert a link to your own JS that redefines one of the original functions, one whose call can be treated as a marker that loading has finished. In the overriding function you call the original one and then, through window.external, call a method of the C# container that saves everything you need.
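
The injected script for this second approach could look roughly like the sketch below. All names are hypothetical: hookCompletion is a helper invented here, renderCalendar stands for whichever site function marks the end of loading, and SavePage stands for a method you would expose from the C# side.

```javascript
// Replaces owner[name] with a wrapper that runs the original function
// first and then calls onDone, preserving `this`, arguments and result.
function hookCompletion(owner, name, onDone) {
  var original = owner[name];
  owner[name] = function () {
    var result = original.apply(this, arguments);
    onDone(); // the original has finished, so the page should be built
    return result;
  };
}

// Inside the embedded browser (hypothetical names, see above):
//   hookCompletion(window, 'renderCalendar', function () {
//     window.external.SavePage(document.documentElement.outerHTML);
//   });
```

This only works if the marker function does its work synchronously. If it kicks off another Ajax request, the hook has to go on that request's callback instead.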

Dima, 2017-05-16
@v_m_smith

PhantomJS
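
For the console use case from the question, a PhantomJS script along these lines might do, as a minimal sketch. The file name save-page.js, the helper renderAfterDelay and the fixed 3000 ms wait are assumptions. A fixed delay is the crudest way to let Ajax finish, and a site-specific check would be more reliable.

```javascript
// Sketch only. Run as: phantomjs save-page.js <url> <output-file>

// Opens `url` in `page`, waits `delayMs` for script-driven updates, then
// passes the rendered markup (or null if the load failed) to `onHtml`.
function renderAfterDelay(page, url, delayMs, schedule, onHtml) {
  page.open(url, function (status) {
    if (status !== 'success') {
      onHtml(null);
      return;
    }
    schedule(function () {
      onHtml(page.content); // markup as it is after scripts have run
    }, delayMs);
  });
}

// This part only runs inside PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
  var args = require('system').args; // args[0] is the script name
  var page = require('webpage').create();
  var later = function (fn, ms) { setTimeout(fn, ms); };

  renderAfterDelay(page, args[1], 3000, later, function (html) {
    if (html !== null) {
      require('fs').write(args[2], html, 'w');
    }
    phantom.exit(html === null ? 1 : 0);
  });
}
```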
