Andrey Bazykin, 2010-12-22 20:44:56
Freelance

Removing unused files on hosting

Hello. I've been given several dozen sites to maintain. Over their history some 10-15 people have worked on them, so naturally a lot of rubbish has accumulated: unused HTML pages, stylesheets, images, etc. Searching for such files manually takes a very long time. Can anyone suggest how to automate the process? Thanks in advance.

7 answers
EvilX, 2010-12-22
@EvilX

As an option: find files that have not been accessed for a certain amount of time. find will help with this (as I understand it, the hosting is on Unix?), and -exec will delete them.
find ./ -atime +N -exec rm '{}' \;
(here N is the number of days since the last access)
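A safer variant of the same idea (a sketch assuming GNU find and xargs; the 90-day threshold is an example): list the candidates first and delete only after reviewing them. Note that -atime is only meaningful if the filesystem actually records access times, i.e. it is not mounted with noatime.

# List files not accessed for more than 90 days and save the list for review
find ./ -type f -atime +90 -print > candidates.txt

# After reviewing candidates.txt, delete exactly those files
xargs -d '\n' rm -- < candidates.txt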

Sergey, 2010-12-22
@butteff

A slightly off-topic answer:
I think you can safely clear the tmp folder if the hosting is on *nix.
Also clean out old logs and mail.
Sometimes that frees up a decent amount of space.
Everything beyond that is trickier and riskier, so be careful there.
I don't know how to automate this process.
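To see what is actually eating the space before deleting anything, a quick disk-usage survey helps (a sketch assuming GNU coreutils; the paths are examples):

# Summarize the usual suspects
du -sh /tmp /var/log /var/mail 2>/dev/null

# Show the largest directories under the site root
du -h --max-depth=2 /path/to/sites | sort -rh | head -20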

pietrovich, 2010-12-22
@pietrovich

The sites are crudely written: ordinary copy-pasted HTML pages, no PHP or JS scripts, just static HTML and images. In that case:

Copy everything to your computer and create one "site" in Dreamweaver for each site. Then ask Dreamweaver to search for orphans: Site > Check Links Sitewide (Ctrl+F8), and in the results filter by orphaned files.
The trial version is quite enough for this. Hotkeys in recent versions may differ; I checked in the old version 8.
This method works quite tolerably: if the filenames are not computed dynamically anywhere, it should work perfectly.

Antelle, 2010-12-22
@Antelle

If this is plain HTML and I understood the task correctly: put the site on Apache, download it entirely (for example, with Teleport), then parse the logs to see what was loaded.
Whatever was never requested gets thrown out.
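A rough sketch of that approach, using wget as the crawler instead of Teleport (the log location, document root, and URL are assumptions; adjust for your setup):

# Crawl the site so every reachable file shows up in the access log
wget --mirror --page-requisites http://localhost/ -P /tmp/crawl

# Extract the requested paths from the access log (standard Apache combined format assumed)
awk '{print $7}' /var/log/apache2/access.log | sort -u > requested.txt

# List all files under the document root as URL-style paths
cd /var/www/site && find . -type f | sed 's|^\.||' | sort > onhost.txt

# Files on disk that were never requested are the removal candidates
comm -13 requested.txt onhost.txt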

Alexander Belugin, 2010-12-22
@unkinddragon

You can try pointing a bot at the site to download it while preserving the directory structure; the bot only follows links. Then compare the two folders: everything that is in the hosting copy but not in the local crawl is garbage.
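A minimal sketch of that comparison using wget as the bot (the URL and folder names are placeholders):

# Mirror the site by following links
wget --mirror --no-parent http://example.com/ -P mirror/

# Anything present in the full hosting copy but absent from the mirror is a candidate
diff -rq hosting_copy/ mirror/example.com/ | grep '^Only in hosting_copy'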

eternals, 2010-12-22
@eternals

There is no general-case method for solving your problem; the probability of removing content you actually need tends to 100%. No Teleport will help you find sections that are not linked from other pages of the site but are linked to from outside.
On the other hand, you can download the sites to your computer, then clean them up (using the same Teleport method) and restore from the copy if something breaks.
But the most sensible option is simply to add disk space, since that is not a problem nowadays.
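If you do go the cleanup route, take a full backup first so mistakes are recoverable (a one-liner sketch; the path is an example):

# Snapshot the whole web root before deleting anything
tar czf sites-backup-$(date +%F).tar.gz /var/www/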

Artem, 2015-07-31
@spelesto

By the way, you can also check a site with Xenu's Link Sleuth (home.snafu.de/tilman/xenulink.html): crawl it, see which files are actually requested, and then compare the lists to identify unused files.
