Plume-lib: a library of utilities for programming

Plume-lib is a library of useful abstractions for programming. It includes programs and libraries.

This file gives an overview of plume-lib's programs and libraries. It also points to more detailed documentation of each program and library.

Installation instructions:

  1. Obtain the source code:
    git clone https://github.com/mernst/plume-lib.git
  2. Build plume-lib:
    cd plume-lib
    make
  3. Set environment variables:

(As an alternative to steps 1 and 2 above, you can download and unpack file plume-lib-X.Y.Z.tar.gz. However, to get the latest and greatest version of plume-lib, we recommend cloning the repository and periodically doing git pull && make to refresh it.)

Bug reports: Please use the issue tracker.

License: Plume-lib is distributed under the MIT License, except for a few files that are governed by a different license. See file docs/LICENSE.txt for details.

Contents:

Programs

HTML and WWW

html-update
The "html-update" family of programs automatically updates text on a webpage that you maintain.
checklink
A slightly modified version of the W3C Link Checker.
checklink-args.txt
A set of command-line arguments to the checklink program, that suppress spurious warnings and thus make the output easier to scan for real problems. Run checklink like this:
   plume-lib/bin/checklink -q -r `grep -v '^#' plume-lib/bin/checklink-args.txt` MYURL
checklink-persistent-errors
This script reports errors that persist over multiple runs of the checklink program, ignoring transient errors. Documentation at top of file.
HtmlPrettyPrint
Pretty-prints an HTML file, after converting it to valid XML. To use:
  java plume.HtmlPrettyPrint file.html > filepp.html
html-canonical-urls
This file maps a textual string (such as the name of a person, institution, or event) to a canonical URL for that string. It is used by the bibtex2web program.

Images

transgif
Change a .gif file's background to transparent. Documentation at top of file.
html-add-favicon
Takes as arguments a .png file for a "favicon", and a set of .html files. It makes each HTML file use the given favicon. A favicon is a favorites icon, which is intended to appear in the address bar of your browser when you browse to the given page. Documentation at top of file.
add-favicon
Takes as arguments a directory and a .png file for a "favicon". It makes each HTML file under the given directory use the given favicon. A favicon is a favorites icon, which is intended to appear in the address bar of your browser when you browse to the given page. Documentation at top of file.

Version control

mvc or MultiVersionControl
Lets you run a version control command, such as "status" or "update", on a set of CVS/Git/Hg/SVN checkouts rather than just one. Documentation.
cvschanges
Report changes by others since my last cvs update, ignoring my changes since then.
ediff-merge-script
A script for use as a git mergetool; runs Emacs ediff as the mergetool. Documentation at top of file.
git-auto-invoke-mergetool.sh
An alias for git that runs git mergetool whenever there is a conflict. Documentation at top of file.

Quieter output from CVS

cvsdiff
Run cvs diff, but filter out empty diffs.
cvslog
Eliminates empty entries from cvs log output.
cvs-log-summarize
Summarize the output of cvs log. This script groups any sequence of CVS checkins by the same author with no more than 2 minutes separating them (but not necessarily with identical checkin messages). For each such sequence of CVS checkins, a list of files and checkin messages is presented.
cvsupdate
Run cvs update, very quietly: only inform of conflicts (and some errors).
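
The grouping rule described for cvs-log-summarize can be illustrated with a small Java sketch. This is not the script's code: the Checkin class, the field names, and the reading of "no more than 2 minutes" as the gap between consecutive checkins are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of grouping checkins by author and time gap. */
class CheckinGroupingSketch {
  /** A hypothetical checkin record: an author and a timestamp in seconds. */
  static class Checkin {
    final String author;
    final long timeSeconds;

    Checkin(String author, long timeSeconds) {
      this.author = author;
      this.timeSeconds = timeSeconds;
    }
  }

  /** Assumed maximum gap between consecutive checkins in one group. */
  static final long MAX_GAP_SECONDS = 120;

  /** Groups time-ordered checkins: same author, at most 2 minutes after the previous one. */
  static List<List<Checkin>> group(List<Checkin> sorted) {
    List<List<Checkin>> groups = new ArrayList<>();
    Checkin prev = null;
    for (Checkin c : sorted) {
      boolean sameGroup =
          prev != null
              && prev.author.equals(c.author)
              && c.timeSeconds - prev.timeSeconds <= MAX_GAP_SECONDS;
      if (!sameGroup) {
        groups.add(new ArrayList<>()); // start a new sequence
      }
      groups.get(groups.size() - 1).add(c);
      prev = c;
    }
    return groups;
  }

  public static void main(String[] args) {
    List<Checkin> log =
        Arrays.asList(new Checkin("alice", 0), new Checkin("alice", 60), new Checkin("bob", 90));
    System.out.println(group(log).size() + " groups");
  }
}
```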

Searching and replacing

Lookup
Lookup searches a set of files, much like grep does. However, Lookup searches by entry (by default, paragraphs) rather than by line, respects comments (ignores matches within them), respects \include directives (searches the named file), and has other options. Documentation.
For an example application, see the uwisdom project and its README file.
preplace
Replace all matching regular expressions in the given files (or all files under the current directory). The timestamp on each file is updated only if the replacement is performed. Documentation at top of file.
search
Jeffrey Friedl's search program combines find and grep: it more or less does a 'grep' on a whole directory tree, but it is more efficient, uses Perl regular expressions, and is much more powerful. This version fixes a tiny bug or two. For full documentation, see its manpage.
However, I recommend using ag instead of search.

Text formatting: LaTeX, PDF, PostScript, bibliographies

latex-process-inputs
Determines all files that are recursively \input by a given LaTeX file. Documentation at top of file. The program has two modes:
  1. Inline mode (the default): Create a single LaTeX file for the document, by inlining \input commands and removing comments. The result is appropriate to be sent to a publisher.
  2. List mode: List all the files that are (transitively) \input. This can be useful for getting a list of source files in a logical order, for example to be used in a Makefile or Ant buildfile.
hevea-retarget-crossrefs
Replaces HTML cross-references of the form <a href="#htoc1"> by cross-references to named labels, such as <a href="#introduction">. The former variety (which is generated, for example, by the Hevea program) is brittle, as it may change from run to run of Hevea. Documentation at top of file.
pdfinterleave
Suppose you scanned two-sided paper in two passes (doing the second side by just turning over the whole pile, so its pages are in reverse order). This script reassembles the two PDFs into one. Invoke as: pdfinterleave infile1.pdf infile2.pdf outfile.pdf
pspage
Adds page numbers to a PostScript file.
acm-dl-abstracts
This program takes as input a filename or URL for an ACM digital library proceedings table of contents. It produces, to standard output, an HTML file that augments the table of contents with abstracts for each paper. This makes it possible to read all the abstracts on one HTML page, without clicking on any links. Documentation at top of file.
BibtexClean
Clean a BibTeX file by removing text outside BibTeX entries. Documentation.
plume-bib
Not a part of plume-lib, but a companion project. plume-bib is a collection of bibliographies in BibTeX format. See its README file for an explanation of its benefits and features.

Extracting part of a file

lines-after
Print all lines after the first one that matches the pattern. Documentation at top of file.
lines-before
Print all lines before the first one that matches the pattern. Documentation at top of file.
lines-between
Print all lines that occur between the two specified regexps (inclusive). That is, print a line matching the first regexp; then print all lines up to and including one matching the second regexp; then stop printing; then repeat. The optional argument --exclusive means don't print the matching lines. Documentation at top of file.
lines-from
Print all lines after the first one that matches the pattern, inclusive. Documentation at top of file.
lines-notbetween
Print all lines that do not occur between the two specified regexps (inclusive). That is, print until the first regexp is matched; then do not print until the second regexp is matched; then repeat. Optional argument --inclusive means don't print the matching lines. Documentation at top of file.
lines-upto
Print all lines before the first one that matches the pattern, inclusive. Documentation at top of file.
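
The start/stop behavior described for lines-between can be sketched as a two-state loop. The Java below is only an illustration of that logic, not the script itself: the class and method names are hypothetical, and it assumes that a line is tested against the first regexp only while printing is off.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of the lines-between logic. */
class LinesBetweenSketch {
  /** Returns the lines between matches of startRegex and endRegex. */
  static List<String> linesBetween(
      List<String> lines, String startRegex, String endRegex, boolean exclusive) {
    Pattern start = Pattern.compile(startRegex);
    Pattern end = Pattern.compile(endRegex);
    List<String> out = new ArrayList<>();
    boolean printing = false;
    for (String line : lines) {
      if (!printing) {
        if (start.matcher(line).find()) {
          printing = true; // region opens at this line
          if (!exclusive) out.add(line);
        }
      } else if (end.matcher(line).find()) {
        printing = false; // region closes at this line
        if (!exclusive) out.add(line);
      } else {
        out.add(line); // inside the region
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("a", "BEGIN", "b", "END", "c");
    System.out.println(linesBetween(input, "^BEGIN", "^END", false));
  }
}
```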

Emacs helper programs

emacs-byte-recompile-directory
Byte-compiles each Emacs Lisp file in the given directory whose compiled .elc file is out of date. Requires one argument: the directory.
emacs-flatten-tags
Given a TAGS file, outputs (to stdout) that file with all recursively included TAGS files included. While the result is larger and depends on more files, the whole thing is searched for a best match rather than a particular subfile being exhaustively searched (even returning poor matches) before going to the next subfile.
emacs-mailto-handler
Takes a mailto link as its argument and passes it to Emacs.
For example, using the MozEX extension for Firefox, set the mailer to:
  emacs-mailto-handler %r
(you may need to specify the full pathname of emacs-mailto-handler) and add to your ~/.emacs:
  (autoload 'mailto-compose-mail "mailto-compose-mail")
emacsclient-a
If the Emacsclient daemon doesn't exist already, start it and connect to it. (The name comes from the fact that the implementation is just "emacsclient -a".)

Java helper programs

Finding Java files

JWhich
Given a Java class name, display the absolute pathname of the class file that would be loaded first by the class loader. Documentation.
find-java
Find Java source code or class files (.java or .class) on CLASSPATH. The output is in the order in which files are found on CLASSPATH. Documentation at top of file.

Javadoc

javadoc-index-to-alist
Construct a .javadoc-index.el file for use with javadoc-lookup for Emacs, which permits convenient lookup of Javadocs from Emacs. Documentation at top of file.

Compilation

javac-xlint
Wraps an invocation of javac, making 3 changes: Documentation at top of file.
javac-progress
Wraps an invocation of javac, but processes its STDERR to give progress indications. Documentation at top of file.
java-cpp
This acts like the C preprocessor (cpp), but Its name comes from the fact that it is useful for running on a source file with cpp macros, to create Java source code. Documentation at top of file.

Dependences

java-dependencies
Creates a list of the .java files used by a class or classes. Documentation at top of file.
classfile-orphans
Print names of .class files with no corresponding .java file or file containing the definition of the class. Documentation at top of file.

.class file processing

ClassFileVersion
Given a list of .class files, or a .jar file, print the class file version and also the JDK/JRE version required to run each .class file. Documentation.
classfile_check_version
Check that the version of the classfile format is ≤ the specified version. Used to ensure that classfiles are OK for a particular version of Java. Documentation at top of file.

Indentation and formatting

run-google-java-format.py

Don't use this version. Use it from the run-google-java-format repository instead.

The google-java-format program reformats Java source code, but it creates poor formatting for annotations in comments. This script runs google-java-format and then performs small changes to improve formatting of annotations in comments. If called with no arguments, it reads from standard input and writes to standard output. Documentation at top of file.

check-google-java-format.py

Don't use this version. Use it from the run-google-java-format repository instead.

Given .java file names on the command line, reports any that would be reformatted by the run-google-java-format.py program, and returns non-zero status if there were any. If called with no arguments, it reads from standard input. Documentation at top of file.

Scheduling

ICalAvailable
Given one or more calendars in iCalendar format, produces a textual summary of available times. This is useful for sending someone a list of acceptable times for a meeting. Also see the ical-available Emacs function, which inserts the output of this program. Documentation.
schedule
Given a set of scheduling constraints (times that are impossible, and times that are undesirable), this script outputs times that are possible, and times that are desirable. Documentation at top of file.

Email

mail-e
Reads standard input, and if it is not empty, calls the mail program. This feature is useful in scripts and cron jobs, but is not supported in all versions of mail. Documentation at top of file.
imap-move
This script moves all IMAP messages from one folder to another. Documentation at top of file.

Miscellaneous

striplines
Strips #line directives out of a file. The file is modified in place, but a backup is made to filename.bak.
path-remove
Cleans up a path environment variable by removing duplicates and non-existent directories. Can optionally remove certain path elements. Works for either space- or colon-delimited paths. Documentation at top of file.
touch-oldify
Give the argument files the oldest possible timestamp. This can be useful to cause "make" to re-build the file.
cronic
A small shim shell script for wrapping cron jobs so that cron only sends email when an error has occurred. Documentation at top of file and at http://habilis.net/cronic/.
diff-remove-empty
Filter out empty parts (hunks and file sections) of a diff file. This is useful after running some other program that removes some lines from a diff file.
junk
View and manipulate junk files, such as backup files and intermediate files. Without an argument, shows junk files subordinate to the current directory. Documentation in file junk.doc.
repeated-words
Reports any word that appears twice in a row. Such a word is often a typo.
sort-directory-order
Sorts the input lines by directory order: first, every file in a given directory, in sorted order; then, process subdirectories recursively, in sorted order. This is useful for users (e.g., when printing) and for making output deterministic. Documentation at top of file.
sort-reversed
Like sort, but the key is the reverse of each line. Some sort implementations have a flag -r that has this same effect.
trigger-travis.sh
Trigger a new Travis-CI job. This is useful for triggering a dependent build: invoke this in the "after_success:" block of repository A's .travis.yml file, so that if Travis job A succeeds, then Travis job B is run next.

Cygwin

cygwin-runner
Takes a command with arguments and translates those arguments from Cygwin-style filenames into Windows-style filenames. Its real advantage is the little bit of intelligence it has as far as which things are files and which are not. Documentation at top of file.
java-cygwin
A wrapper for calling Java from Cygwin, that tries to convert any arguments that are Unix-style paths into Windows-style paths. Documentation at top of file.
javac-cygwin
A wrapper for calling the Java compiler from Cygwin, that tries to convert any arguments that are Unix-style paths into Windows-style paths. Documentation at top of file.
javadoc-cygwin
A wrapper for calling Javadoc from Cygwin, that tries to convert any arguments that are Unix-style paths into Windows-style paths. Documentation at top of file.

Libraries

Emacs libraries

Documentation is forthcoming. In the meantime, each individual library is generally well-documented, so feel free to browse.

Java libraries

Command-line option argument processing

Options
The Options class: Thus, the programmer is freed from writing duplicative, boilerplate code and documentation that could get out of sync with the rest of the program. Documentation.

Collections and iterators

ArraysMDE
Utilities for manipulating arrays and collections. This complements java.util.Arrays and java.util.Collections. Documentation.
LimitedSizeSet
LimitedSizeSet stores up to some maximum number of unique values, at which point its rep is nulled, in order to save space. Documentation.
There is also LimitedSizeIntSet, which takes less memory than LimitedSizeSet<Integer>. Documentation.
WeakHasherMap
WeakHasherMap is a modified version of WeakHashMap from JDK 1.2.2, that adds a constructor that takes a Hasher argument. Documentation.
WeakIdentityHashMap
WeakIdentityHashMap is a modified version of WeakHashMap from JDK 1.5, that uses System.identityHashCode() rather than the object's hash code. Documentation.
OrderedPairIterator
Given two sequences/iterators/whatever, OrderedPairIterator returns a new sequence/iterator/whatever that pairs the matching elements of the inputs, according to their respective sort orders. (This operation is sometimes called "zipping".) Documentation.
IterableIterator
In Java, Iterators are not Iterable, so they cannot be used in new-style for loops. The IterableIterator wrapper makes an Iterator that is also Iterable — that is, it implements the iterator() method. Documentation.
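
To illustrate the IterableIterator idea, here is a minimal Java sketch of such a wrapper. It is not plume-lib's source code: the class name IterableIteratorSketch and the helper drain are hypothetical, and the sketch assumes the wrapper returns itself from iterator(), so it can be traversed only once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch: wraps an Iterator so it can drive a for-each loop. */
class IterableIteratorSketch<T> implements Iterator<T>, Iterable<T> {
  private final Iterator<T> delegate;

  IterableIteratorSketch(Iterator<T> delegate) {
    this.delegate = delegate;
  }

  @Override
  public boolean hasNext() {
    return delegate.hasNext();
  }

  @Override
  public T next() {
    return delegate.next();
  }

  @Override
  public void remove() {
    delegate.remove();
  }

  /** Returns this object itself, so a second traversal sees no elements. */
  @Override
  public Iterator<T> iterator() {
    return this;
  }

  /** Collects the remaining elements of an iterator using a for-each loop. */
  static <T> List<T> drain(Iterator<T> it) {
    List<T> result = new ArrayList<>();
    for (T element : new IterableIteratorSketch<>(it)) {
      result.add(element);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(drain(Arrays.asList("a", "b", "c").iterator()));
  }
}
```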

Text processing

StrTok
Provides a somewhat simpler interface for tokenizing strings than does StreamTokenizer. All tokenizing is done by StreamTokenizer. Documentation.
EntryReader
Class that reads "entries" from a file. In the simplest case, entries can be lines. It supports: include files, comments, and multi-line entries (paragraphs). The syntax of each of these is customizable. Documentation.
RegexUtil
Utility methods for regular expressions, most notably for testing whether a string is a regular expression. Documentation.
FileIOException
Extends IOException by also reporting a file name and line number at which the exception occurred. Documentation.
StringBuilderDelimited
Like StringBuilder, but adds a delimiter between each pair of strings that are inserted into the StringBuilder. This can simplify the logic of programs and also avoid errors. Documentation.
CountingPrintWriter
Prints formatted representations of objects to a text-output stream, counting the number of bytes and characters printed. Documentation.
Digest
Computes a message digest for a file. Documentation.
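
The delimiter handling described for StringBuilderDelimited can be sketched in a few lines of Java. This is an illustration under assumptions, not the plume-lib class: the name DelimitedBuilderSketch and the method add are hypothetical.

```java
/** Hypothetical sketch of a delimiter-inserting builder. */
class DelimitedBuilderSketch {
  private final StringBuilder builder = new StringBuilder();
  private final String delimiter;
  private boolean empty = true;

  DelimitedBuilderSketch(String delimiter) {
    this.delimiter = delimiter;
  }

  /** Appends the delimiter before every item except the first. */
  DelimitedBuilderSketch add(String item) {
    if (!empty) {
      builder.append(delimiter);
    }
    builder.append(item);
    empty = false;
    return this;
  }

  @Override
  public String toString() {
    return builder.toString();
  }

  public static void main(String[] args) {
    DelimitedBuilderSketch sb = new DelimitedBuilderSketch(", ");
    for (String s : new String[] {"x", "y", "z"}) {
      sb.add(s);
    }
    System.out.println(sb);
  }
}
```

Tracking whether anything has been added yet keeps the caller's loop free of "first element" special cases, which is the kind of error the description mentions.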

Math

MathMDE
Mathematical utilities. Documentation.
FuzzyFloat
Routines for doing approximate ('fuzzy') floating point comparisons. Those are comparisons that only require the floating point numbers to be relatively close to one another to be equal, rather than exactly equal. Documentation.
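
A relative-tolerance comparison of the kind FuzzyFloat describes might look like the following Java sketch. The method name fuzzyEquals, the relTol parameter, and the choice to scale by the larger magnitude are assumptions for illustration; FuzzyFloat's actual API and tolerance rule may differ.

```java
/** Hypothetical sketch of an approximate floating-point comparison. */
class FuzzyCompareSketch {
  /** True if a and b differ by at most relTol times the larger magnitude. */
  static boolean fuzzyEquals(double a, double b, double relTol) {
    if (Double.isNaN(a) || Double.isNaN(b)) {
      return false; // NaN is never close to anything
    }
    if (a == b) {
      return true; // exact match, including equal infinities and zeros
    }
    if (Double.isInfinite(a) || Double.isInfinite(b)) {
      return false; // an infinity is only close to itself
    }
    double scale = Math.max(Math.abs(a), Math.abs(b));
    return Math.abs(a - b) <= relTol * scale;
  }

  public static void main(String[] args) {
    System.out.println(fuzzyEquals(0.1 + 0.2, 0.3, 1e-9));
    System.out.println(fuzzyEquals(1.0, 1.1, 1e-9));
  }
}
```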

Random selection

RandomSelector
Selects k elements uniformly at random from an arbitrary iterator, using O(k) space. Documentation.
MultiRandSelector
Like RandomSelector, performs a uniform random selection over an iterator. However, the objects in the iteration may be partitioned so that the random selection chooses the same number from each group. Documentation.
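
Selecting k elements uniformly from an iterator in O(k) space is commonly done with reservoir sampling. The Java sketch below shows that standard technique; it is an assumption that RandomSelector works this way, and the class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of reservoir sampling (Algorithm R). */
class ReservoirSketch {
  /** Returns up to k items chosen uniformly at random from the iterator. */
  static <T> List<T> sample(Iterator<T> items, int k, Random rng) {
    List<T> reservoir = new ArrayList<>(k);
    int seen = 0; // assumes fewer than Integer.MAX_VALUE items
    while (items.hasNext()) {
      T item = items.next();
      seen++;
      if (reservoir.size() < k) {
        reservoir.add(item); // fill the reservoir first
      } else {
        // Keep the new item with probability k/seen, evicting a random slot.
        int j = rng.nextInt(seen);
        if (j < k) {
          reservoir.set(j, item);
        }
      }
    }
    return reservoir;
  }

  public static void main(String[] args) {
    List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
    System.out.println(sample(input.iterator(), 3, new Random()));
  }
}
```

Only the k-element reservoir and a counter are kept, so the input can be arbitrarily long and need not be known in advance.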

Processes

TimeLimitProcess
A subclass of Process such that the process is killed if it runs for more than the specified amount of wall clock time. Documentation.
FileCompiler
Defines methods that compile Java source files. Documentation.

Tuples

Pair
Mutable pair class: type-safely holds two objects of possibly-different types. Documentation.
Triple
Mutable triple class: type-safely holds three objects of possibly-different types. Documentation.

Miscellaneous

BCELUtil
Static utility methods for working with BCEL. Documentation.
DeterministicObject
A version of Object with a deterministic hashCode() method. Instantiate this instead of Object to remove a source of nondeterminism from your programs.
GraphMDE
Graph utility methods. This class does not model a graph: all methods are static. Documentation.
Intern
Utilities for interning objects. Interning is also known as canonicalization or hash-consing: it returns a single representative object that .equals() the object, and the client discards the argument and uses the result instead. Documentation.
SimpleLog
A simple logging class with timers, subtasks, backtraces, output to file or standard out, and the ability to be enabled or disabled. Documentation.
Stopwatch
A simple class for recording elapsed time. Documentation.
UtilMDE
Utility functions that do not belong elsewhere in the plume package. Documentation.
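
The interning idea described under Intern can be illustrated with a small Java sketch. It is not the plume-lib implementation: the class InternSketch is hypothetical, and it holds strong references in a HashMap for simplicity, whereas a production interner would likely use weak references so that unused values can be garbage-collected.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of interning (canonicalization) with a strong-reference pool. */
class InternSketch<T> {
  private final Map<T, T> pool = new HashMap<>();

  /** Returns the single representative object that equals(value). */
  T intern(T value) {
    T existing = pool.putIfAbsent(value, value);
    return existing == null ? value : existing;
  }

  public static void main(String[] args) {
    InternSketch<String> strings = new InternSketch<>();
    String a = new String("abc");
    String b = new String("abc");
    // Distinct objects before interning, one shared object after.
    System.out.println(a == b);
    System.out.println(strings.intern(a) == strings.intern(b));
  }
}
```

After interning, clients can compare values with == instead of equals(), provided they discard the argument and keep only the returned object.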

Perl libraries

checkargs.pm
checkargs.pm checks the number of arguments passed to a Perl function at run time, catching some common errors that could otherwise go undetected until later in the program. Documentation at top of file.