home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.sources.misc
- From: jv@mh.nl (Johan Vromans)
- subject: v10i065: Watching disk usage
- Sender: allbery@uunet.UU.NET (Brandon S. Allbery - comp.sources.misc)
-
- Posting-number: Volume 10, Issue 65
- Submitted-by: jv@mh.nl (Johan Vromans)
- Archive-name: dusage.pl
-
- Guarding disk space is one of the problems of system management.
-
- Some time ago I converted an old awk/sed/sh script to keep track of
- disk usage to a new perl program, added new features, options; even
- wrote a manual page.
-
- Using a list of pathnames, this program filters the output of du(1) to
- find the amount of disk space used for each of the paths (actually, it
- collects all values in one single du run). It adds the new value to
- the list, shifting old values up. It then generates a nice report of
- the amount of disk space occupied in each of the specified paths,
- together with the amount it grew (or shrinked) since the previous run,
- and since 7 runs ago. When run daily, this gives daily and weekly
- figures.
-
- #---------------------------------- cut here ----------------------------------
- # This is a shell archive. Remove anything before this line,
- # then unpack it by saving it in a file and typing "sh file".
- #
- # Wrapped by Johan Vromans <jv@mh.nl> on Sat Feb 3 21:36:57 1990
- #
- # This archive contains:
- # README dusage.1 dusage.pl
- #
- # Error checking via wc(1) will be performed.
-
- LANG=""; export LANG
-
- echo x - README
- cat >README <<'@EOF'
- Guarding disk space is one of the problems of system management.
-
- Some time ago I converted an old awk/sed/sh script to keep track of
- disk usage to a new perl program, added new features, options; even
- wrote a manual page.
-
- Using a list of pathnames, this program filters the output of du(1) to
- find the amount of disk space used for each of the paths (actually, it
- collects all values in one single du run). It adds the new value to
- the list, shifting old values up. It then generates a nice report of
- the amount of disk space occupied in each of the specified paths,
- together with the amount it grew (or shrinked) since the previous run,
- and since 7 runs ago. When run daily, this gives daily and weekly
- figures.
-
- # This program requires perl version 3.0, patchlevel 4 or higher.
-
- # Copyright 1990 Johan Vromans, all rights reserved.
- # Peaceware. This program may be used, modified and distributed as long as
- # this copyright notice remains part of the source. It may not be sold, or
- # be used to harm any living creature including the world and the universe.
- @EOF
- set `wc -lwc <README`
- if test $1$2$3 != 211931064
- then
- echo ERROR: wc results of README are $* should be 21 193 1064
- fi
-
- chmod 644 README
-
- echo x - dusage.1
- cat >dusage.1 <<'@EOF'
- .TH DUSAGE 1
- .SH NAME
- dusage \- provide disk usage statistics
- .SH SYNOPSIS
- .B dusage
- .RB [ \-afghruD ]
- .RI "[\fB\-i\fR" " input" ]
- .RI "[\fB\-p\fR" " dir" ]
- .RI [ "control file" ]
- .SH DESCRIPTION
- .I Dusage
- is a perl script which produces disk usage statistics. These
- statistics include the number of blocks, the increment since the previous run
- (which is assumed to be yesterday if run daily), and the increment
- since 7 runs ago (which could be interpreted as a week if run daily).
- .I Dusage
- is driven by a
- .IR "control file" ,
- which describes the names of the files (directories) to be reported,
- and which also contains the results of previous runs.
- .PP
- When
- .I dusage
- is run, it reads the
- .IR "control file" ,
- [optionally] gathers new disk usage values by calling
- .IR du (1),
- prints the report, and [optionally] updates the
- .I control file
- with the new information.
- .PP
- Filenames in the control file may have wildcards. In this case, the
- wildcards are expanded, and all entries reported. Both the expanded
- names as the wildcard info are maintained in the control file. New
- files in these directories will automatically show up, deleted files
- will disappear when they have run out of data in the control file (but
- see the
- .B \-r
- option).
- .br
- Wildcard expansion only adds filenames which are not already on the list.
- .PP
- The control file may also contain filenames preceded with an
- exclamation mark ``!''; these entries are skipped. This is meaningful
- in conjunction with wildcards, to exclude entries which result from a
- wildcard expansion.
- .PP
- The control file may have lines starting with a dash ``\-'',
- which causes the report to start on a new page. Any text following the
- dash is placed in the page header, immediately following the text
- ``Disk usage statistics''.
- .PP
- The available command line options are:
- .TP 5
- .B \-D
- Turns on debugging, which yields lots of trace information.
- .TP
- .B \-a
- Reports the statistics for this and all previous runs, as opposed to
- the normal case, which is to generate the statistics for this run, and
- the differences between the previous and 7th previous run.
- .TP
- .B \-f
- Reports file statistics also. Default is to only report directories.
- .TP
- .B \-g
- Gathers new data by calling
- .IR du (1).
- .TP
- .B \-h
- Provides a help message. No work is done.
- .TP
- .BI \-i " input"
- Uses
- .I input
- as data obtained by calling
- .IR du (1).
- .TP
- .BI \-p " dir"
- All filenames in the control file are interpreted relative to this
- directory.
- .TP
- .B \-r
- Retains entries which don't have any data anymore. If this option is
- not used, entries without data are not reported, and removed from the
- control file.
- .TP
- .B \-u
- Update the control file with new values.
- .PP
- The default name for the control file is
- .BR .du.ctl ,
- optionally preceded by the name supplied with the
- .B \-p
- option.
- .SH EXAMPLES
- Given the following control file:
- .sp
- .nf
- .ne 3
- .in +.5i
- \- for manual page
- maildir
- maildir/*
- !maildir/unimportant
- src
- .in
- .fi
- .sp
- This will generate the following (example) report when running the
- command ``dusage -gu controlfile'':
- .sp
- .nf
- .ne 7
- .in +.5i
- Disk usage statistics for manual page Wed Jan 10 13:38
-
- blocks +day +week directory
- ------- ------- ------- --------------------------------
- 6518 maildir
- 2 maildir/dirent
- 498 src
- .in
- .fi
- .sp
- After updating the control file, it will contain:
- .sp
- .nf
- .ne 4
- .in +.5i
- \- for manual page
- maildir 6518::::::
- maildir/dirent 2::::::
- maildir/*
- !maildir/unimportant
- src 498::::::
- .in
- .fi
- .sp
- The names in the control file are separated by the values with a TAB;
- the values are separated with colons. Also, the entries found by
- expanding the wildcard are added. If the wildcard expansion had
- generated a name ``maildir/unimportant'' it would have been skipped.
- .br
- When the program is rerun after one day, it could print the following
- report:
- .sp
- .nf
- .ne 7
- .in +.5i
- Disk usage statistics for manual page Wed Jan 10 13:38
-
- blocks +day +week directory
- ------- ------- ------- --------------------------------
- 6524 +6 maildir
- 2 0 maildir/dirent
- 486 -12 src
- .in
- .fi
- .sp
- The control file will contain:
- .sp
- .nf
- .ne 4
- .in +.5i
- \- for manual page
- maildir 6524:6518:::::
- maildir/dirent 2:2:::::
- maildir/*
- !maildir/unimportant
- src 486:498:::::
- .in
- .fi
- .sp
- It takes very little fantasy to imagine what will happen on subsequent
- runs...
- .PP
- When the contents of the control file are to be changed, e.g. to add
- new filenames, a normal text editor can be used. Just add or remove
- lines, and they will be taken into account automatically.
- .PP
- When run without
- .B \-g
- or
- .B \-u
- options, it actually reproduces the report from the previous run.
- .PP
- When multiple runs are required, save the output of
- .IR du (1)
- in a file, and pass this file to
- .I dusage
- using the
- .BI \-i "file"
- option.
- .SH BUGS
- Running the same control file with different values of the
- .B \-f
- and
- .B \-r
- options may cause strange results.
- .SH AUTHOR
- Johan Vromans, Multihouse Research, Gouda, The Netherlands.
- .sp
- Send bugs and remarks to <jv@mh.nl> .
- @EOF
- set `wc -lwc <dusage.1`
- if test $1$2$3 != 2048505139
- then
- echo ERROR: wc results of dusage.1 are $* should be 204 850 5139
- fi
-
- chmod 444 dusage.1
-
- echo x - dusage.pl
- sed 's/^@//' >dusage.pl <<'@EOF'
- #!/usr/bin/perl
-
- # This program requires perl version 3.0, patchlevel 4 or higher.
-
- # Copyright 1990 Johan Vromans, all rights reserved.
- # Peaceware. This program may be used, modified and distributed as long as
- # this copyright notice remains part of the source. It may not be sold, or
- # be used to harm any living creature including the world and the universe.
-
- $my_name = $0;
-
- ################ usage ################
-
- sub usage {
- local ($help) = shift (@_);
- local ($usg) = "usage: $my_name [-afghruD][-i input][-p dir] ctlfile";
- die "$usg\nstopped" unless $help;
- print STDERR "$usg\n";
- print STDERR <<EndOfHelp
-
- -D - provide debugging info
- -a - provide all statis
- -f - also report file statistics
- -g - gather new data
- -h - this help message
- -i input - input data as obtained by 'du dir' [def = 'du dir']
- -p dir - path to which files in the control file are relative
- -r - do not discard entries which don't have data
- -u - update the control file with new values
- ctlfile - file which controls which dirs to report [def = dir/.du.ctl]
- EndOfHelp
- ;
- exit 1;
- }
-
- ################ main stream ################
-
- &do_get_options; # process options
- &do_parse_ctl; # read the control file
- &do_gather if $gather; # gather new info
- &do_report_and_update; # report and update
-
- ################ end of main stream ################
-
- ################ other subroutines ################
-
- sub do_get_options {
-
- # Default values for options
-
- $debug = 0;
- $noupdate = 1;
- $retain = 0;
- $gather = 0;
- $allfiles = 0;
- $allstats = 0;
-
- # Command line options. We use a modified version of getopts.pl.
-
- &usage (0) if &Getopts ("Dafghi:p:ru");
- &usage (1) if $opt_h;
- &usage (0) if $#ARGV > 0;
-
- $debug |= $opt_D if defined $opt_D; # -D -> debug
- $allstats |= $opt_a if defined $opt_a; # -a -> all stats
- $allfiles |= $opt_f if defined $opt_f; # -f -> report all files
- $gather |= $opt_g if defined $opt_g; # -g -> gather new data
- $retain |= $opt_r if defined $opt_r; # -r -> retain old entries
- $noupdate = !$opt_u if defined $opt_u; # -u -> update the control file
- $du = $opt_i if defined $opt_i; # -i input file
- if ( defined $opt_p ) { # -p path
- $root = $opt_p;
- $root = $` while ($root =~ m|/$|);
- $prefix = "$root/";
- $root = "/" if $root eq "";
- }
- else {
- $prefix = $root = "";
- }
- $table = ($#ARGV == 0) ? shift (@ARGV) : "$prefix.du.ctl";
- $runtype = $allfiles ? "file" : "directory";
- if ($debug) {
- print STDERR "@(#)@ dusage 1.7 - dusage.pl\n";
- print STDERR "Options:";
- print STDERR " debug" if $debug; # silly, isn't it...
- print STDERR $noupdate ? " no" : " ", "update";
- print STDERR $retain ? " " : " no", "retain";
- print STDERR $gather ? " " : " no", "gather";
- print STDERR $allstats ? " " : " no", "allstats";
- print STDERR "\n";
- print STDERR "Root = $root [prefix = $prefix]\n";
- print STDERR "Control file = $table\n";
- print STDERR "Input data = $du\n" if defined $du;
- print STDERR "Run type = $runtype\n";
- print STDERR "\n";
- }
- }
-
- sub do_parse_ctl {
-
- # Parsing the control file.
- #
- # This file contains the names of the (sub)directories to tally,
- # and the values dereived from previous runs.
- # The names of the directories are relative to the $root.
- # The name may contain '*' or '?' characters, and will be globbed if so.
- # An entry starting with ! is excluded.
- #
- # To add a new dir, just add the name. The special name '.' may
- # be used to denote the $root directory. If used, '-p' must be
- # specified.
- #
- # Upon completion:
- # - %oldblocks is filled with the previous values,
- # colon separated, for each directory.
- # - @targets contains a list of names to be looked for. These include
- # break indications and globs info, which will be stripped from
- # the actual search list.
-
- open (tb, "<$table") || die "Cannot open control file $table, stopped";
- @targets = ();
- %oldblocks = ();
- %newblocks = ();
-
- while ($tb = <tb>) {
- chop ($tb);
-
- # preferred syntax: <dir><TAB><size>:<size>:....
- # allowable <dir><TAB><size> <size> ...
- # possible <dir>
-
- if ( $tb =~ /^-/ ) { # break
- push (@targets, "$tb");
- printf STDERR "tb: *break* $tb\n" if $debug;
- next;
- }
-
- if ( $tb =~ /^!/ ) { # exclude
- $excl = $'; #';
- @a = grep ($_ ne $excl, @targets);
- @targets = @a;
- push (@targets, "*$tb");
- printf STDERR "tb: *excl* $tb\n" if $debug;
- next;
- }
-
- if ($tb =~ /^(.+)\t([\d: ]+)/) {
- $name = $1;
- @blocks = split (/[ :]/, $2);
- }
- else {
- $name = $tb;
- @blocks = ("","","","","","","","");
- }
-
- if ($name eq ".") {
- if ( $root eq "" ) {
- printf STDERR "Warning: \".\" in control file w/o \"-p path\" - ignored\n";
- next;
- }
- $name = $root;
- } else {
- $name = $prefix . $name unless ord($name) == ord ("/");
- }
-
- # Check for globs ...
- if ( $gather && $name =~ /\*|\?/ ) {
- print STDERR "glob: $name\n" if $debug;
- foreach $n ( <${name}> ) {
- next unless $allfiles || -d $n;
- # Globs never overwrite existing entries
- if ( !defined $oldblocks{$n} ) {
- $oldblocks{$n} = ":::::::";
- push (@targets, $n);
- }
- printf STDERR "glob: -> $n\n" if $debug;
- }
- # Put on the globs list, and terminate this entry
- push (@targets, "*$name");
- next;
- }
-
- push (@targets, "$name");
- # Entry may be rewritten (in case of globs)
- $oldblocks{$name} = join (":", @blocks[0..7]);
-
- print STDERR "tb: $name\t$oldblocks{$name}\n" if $debug;
- }
- close (tb);
- }
-
- sub do_gather {
-
- # Build a targets match string, and an optimized list of directories to
- # search.
- $targets = "//";
- @list = ();
- $last = "///";
- foreach $name (sort (@targets)) {
- next if $name =~ /^[-*]/;
- next unless $allfiles || -d $name;
- $targets .= "$name//";
- next if ($name =~ m|^$last/|);
- push (@list, $name);
- $last = $name;
- }
-
- print STDERR "targets: $targets\n" if $debug;
- print STDERR "list: @list\n" if $debug;
- print STDERR "reports: @targets\n" if $debug;
-
- $du = "du " . ($allfiles ? "-a" : "") . " @list|"
- unless defined $du; # in which case we have a data file
-
- # Process the data. If a name is found in the target list,
- # %newblocks will be set to the new blocks value.
-
- open (du, "$du") || die "Cannot get data from $du, stopped";
- while ($du = <du>) {
- chop ($du);
- ($blocks,$name) = split (/\t/, $du);
- if (($i = index ($targets, "//$name//")) >= 0) {
- # tally and remove entry from search list
- $newblocks{$name} = $blocks;
- print STDERR "du: $name $blocks\n" if $debug;
- substr ($targets, $i, length($name) + 2) = "";
- }
- }
- close (du);
- }
-
-
- # Report generation
-
- format std_hdr =
- Disk usage statistics@<<<<<<<<<<<<<<<<<<<<<@<<<<<<<<<<<<<<<
- $subtitle, $date
-
- blocks +day +week @<<<<<<<<<<<<<<<
- $runtype
- ------- ------- ------- --------------------------------
- @.
- format std_out =
- @@>>>>>> @>>>>>>> @>>>>>>> ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<..
- $blocks, $d_day, $d_week, $name
- @.
-
- format all_hdr =
- Disk usage statistics@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<
- $subtitle, $date
-
- --0-- --1-- --2-- --3-- --4-- --5-- --6-- --7-- @<<<<<<<<<<<<<<<
- $runtype
- ------- ------- ------- ------- ------- ------- ------- ------- --------------------------------
- @.
- format all_out =
- @@>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> @>>>>>>> ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<..
- $a[0], $a[1], $a[2], $a[3], $a[4], $a[5], $a[6], $a[7], $name
- @.
-
- sub do_report_and_update {
-
- # Prepare update of the control file
- if ( !$noupdate ) {
- if ( !open (tb, ">$table") ) {
- print STDERR "Warning: cannot update control file $table - continuing\n";
- $noupdate = 1;
- }
- }
-
- if ( $allstats ) {
- $^ = "all_hdr";
- $~ = "all_out";
- }
- else {
- $^ = "std_hdr";
- $~ = "std_out";
- }
- $date = `date`;
- chop ($date);
-
- # In one pass the report is generated, and the control file rewritten.
-
- foreach $name (@targets) {
- if ($name =~ /^-/ ) {
- $subtitle = $'; #';
- print tb "$name\n" unless $noupdate;
- print STDERR "tb: $name\n" if $debug;
- $- = -1;
- next;
- }
- if ($name =~ /^\*$prefix/ ) {
- print tb "$'\n" unless $noupdate; #';
- print STDERR "tb: $'\n" if $debug; #';
- next;
- }
- @a = split (/:/, $oldblocks{$name});
- unshift (@a, $newblocks{$name}) if $gather;
- $name = "." if $name eq $root;
- $name = $' if $name =~ /^$prefix/; #';
- if ($#a < 0) { # no data?
- if ($retain) {
- @a = ("","","","","","","","");
- }
- else {
- # Discard
- print STDERR "--: $name\n" if $debug;
- next;
- }
- }
- print STDERR "Warning: ", 1+$#a, " entries for $name\n"
- if ($debug && $#a != 8);
- $line = "$name\t" . join(":",@a[0..7]) . "\n";
- print tb $line unless $noupdate;
- print STDERR "tb: $line" if $debug;
-
- $blocks = $a[0];
- if ( !$allstats ) {
- $d_day = $d_week = "";
- if ($blocks ne "") {
- if ($a[1] ne "") { # dayly delta
- $d_day = $blocks - $a[1];
- $d_day = "+" . $d_day if $d_day > 0;
- }
- if ($a[7] ne "") { # weekly delta
- $d_week = $blocks - $a[7];
- $d_week = "+" . $d_week if $d_week > 0;
- }
- }
- }
- write;
- }
-
- # Close control file, if opened
- close (tb) unless $noupdate;
- }
-
- # Modified version of getopts ...
-
- sub Getopts {
- local($argumentative) = @_;
- local(@args,$_,$first,$rest);
- local($opterr) = 0;
-
- @args = split( / */, $argumentative );
- while(($_ = $ARGV[0]) =~ /^-(.)(.*)/) {
- ($first,$rest) = ($1,$2);
- $pos = index($argumentative,$first);
- if($pos >= $[) {
- if($args[$pos+1] eq ':') {
- shift(@ARGV);
- if($rest eq '') {
- $rest = shift(@ARGV);
- }
- eval "\$opt_$first = \$rest;";
- }
- else {
- eval "\$opt_$first = 1";
- if($rest eq '') {
- shift(@ARGV);
- }
- else {
- $ARGV[0] = "-$rest";
- }
- }
- }
- else {
- print stderr "Unknown option: $first\n";
- $opterr++;
- if($rest ne '') {
- $ARGV[0] = "-$rest";
- }
- else {
- shift(@ARGV);
- }
- }
- }
- return $opterr;
- }
- @EOF
- set `wc -lwc <dusage.pl`
- if test $1$2$3 != 379157810356
- then
- echo ERROR: wc results of dusage.pl are $* should be 379 1578 10356
- fi
-
- chmod 444 dusage.pl
-
- exit 0
- --
- Johan Vromans jv@mh.nl via internet backbones
- Multihouse Automatisering bv uucp: ..!{uunet,hp4nl}!mh.nl!jv
- Doesburgweg 7, 2803 PL Gouda, The Netherlands phone/fax: +31 1820 62944/62500
- ------------------------ "Arms are made for hugging" -------------------------
-
-