A few months ago I was tasked with tracking down whatever it was that kept devouring the disk space on one of our servers. Not too hard, except it’s a Linux server and I did not want to shell in and run commands every time something happened, and I certainly did not want to bring this one server into our production environment, since it was mostly used by QA to store their files.
I looked around for an easy solution and ran across agedu (pronounced “age dee you”), and I set the QA team up with it so they could do their own searches. The process to clean up disk space is to track down the culprits and delete them; agedu does a full drive scan and displays reports showing how much space is being used by each directory and file. It even shows the access-time range for each directory.
The du vs agedu thing
Yes, you could just run du and get a summary of disk usage; but agedu takes things to another level by distinguishing between data that is still being used and data that has not been accessed for some time. It not only finds what is using up the most space, but also what is wasting your space by just sitting there unused.
From the agedu site:
Unix provides the standard du utility, which scans your disk and tells you which directories contain the largest amounts of data. That can help you narrow your search to the things most worth deleting.
However, that only tells you what’s big. What you really want to know is what’s too big. By itself, du won’t let you distinguish between data that’s big because you’re doing something that needs it to be big, and data that’s big because you unpacked it once and forgot about it.
Most Unix file systems, in their default mode, helpfully record when a file was last accessed. Not just when it was written or modified, but when it was even read. So if you generated a large amount of data years ago, forgot to clean it up, and have never used it since, then it ought in principle to be possible to use those last-access time stamps to tell the difference between that and a large amount of data you’re still using regularly.
agedu is a program which does this. It does basically the same sort of disk scan as du, but it also records the last-access times of everything it scans. Then it builds an index that lets it efficiently generate reports giving a summary of the results for each sub-directory, and then it produces those reports on demand.
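The last-access timestamps that agedu relies on are the same ones any program can read through a standard stat call. As a quick illustration (not part of agedu itself), this Python sketch creates a file and prints its size, modification time, and access time; note that filesystems mounted with relatime or noatime may defer or disable access-time updates.

```python
import os
import tempfile
import time

# Create a small throwaway file so we have something to stat.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"example data")
    path = f.name

st = os.stat(path)
print("size (bytes):", st.st_size)
print("last modified:", time.ctime(st.st_mtime))   # mtime: last write
print("last accessed:", time.ctime(st.st_atime))   # atime: last read

os.remove(path)
```

It is exactly this mtime/atime distinction that lets agedu separate “big because it’s in use” from “big and forgotten.”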
How do you get it?
To install agedu on your OS, run the appropriate command below.
Debian / Ubuntu
sudo apt-get install agedu
FreeBSD
sudo pkg_add -r agedu
RHEL / CentOS / Fedora / Scientific Linux
Enable the EPEL repository first, then:
yum install agedu
How to harness the power
Before you can do anything with agedu, you have to run a scan to index the data:
agedu -s /dir/name
The scan writes its index to a data file named agedu.dat. Now that you have the index, run the following command to serve the report pages:
agedu -w --address 127.0.0.1:8081
After that, you probably want to delete the data file agedu.dat, since it’s pretty large. In fact, the command agedu -R will do this for you; and you can chain agedu commands on the same command line, so that instead of the above you could have done
agedu -s /dir/name -w -R
for a single self-contained run of agedu which builds its index, serves web pages from it, and cleans it up when finished.
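To make the scan-then-report idea concrete, here is a rough Python sketch of the same kind of walk agedu performs: total up the bytes under each directory and remember the newest access time seen there, then flag directories that haven’t been touched in a while. The 180-day cutoff and the reporting format are my own choices for illustration; agedu’s real index is far more compact and efficient than this.

```python
import os
import time

def scan(root):
    """Walk root, returning {dirpath: (total_bytes, newest_atime)}.

    A toy version of agedu's scan: per-directory size plus the most
    recent access time of any file directly inside that directory.
    """
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        total, newest = 0, 0.0
        for name in filenames:
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += st.st_size
            newest = max(newest, st.st_atime)
        report[dirpath] = (total, newest)
    return report

if __name__ == "__main__":
    # Print directories whose newest access time is over ~180 days old.
    cutoff = time.time() - 180 * 86400
    for d, (size, atime) in sorted(scan(".").items()):
        if atime and atime < cutoff:
            print(f"{size:>12}  {time.ctime(atime)}  {d}")
```

For day-to-day use the real tool is what you want, of course; the sketch just shows why an atime-aware scan can answer “what’s too big?” where du alone cannot.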