Plume-lib: a library of utilities for programming
Monolithic plume-lib is obsolete! This web page describes an
obsolete package, called monolithic plume-lib. As of August 2018,
it has been split
into many smaller packages.
Each one is smaller and more focused.
The Java
libraries are conveniently available from the Maven Central Repository.
Obtain the Java programs, and the non-Java libraries and programs,
from their repositories.
Plume-lib is a library of useful abstractions for programming. It includes:
- standalone programs
- libraries written for Java, Emacs Lisp, and Perl
This file overviews plume-lib's programs and libraries.
It also contains pointers to more detailed documentation of each program
and library.
Bug reports: Please use the
issue tracker.
License:
Plume-lib is distributed under the MIT License, except for a few files that
are governed by a different license. See file docs/LICENSE.txt for details.
Installation instructions
-
Obtain the source code:
git clone https://github.com/mernst/plume-lib.git
-
Build plume-lib:
cd plume-lib
make
- Set environment variables:
- Add java/plume.jar to $CLASSPATH
- Add the bin subdirectory to $PATH
- Add the emacs subdirectory to Emacs's load-path.
(As an alternative to steps 1 and 2 above, you can download and unpack file
plume-lib-X.Y.Z.tar.gz.
However, to get the latest and greatest version of plume-lib, we recommend
cloning the repository and periodically doing git pull && make to refresh
it.)
Programs
HTML and WWW
- html-update
-
The "html-update" family of programs automatically updates text on a
webpage that you maintain.
-
html-update-toc updates a webpage's table of contents, in place.
Documentation at top of file.
-
html-toc computes a webpage's table of contents, based on its
header tags (<h1>, <h2>, etc.).
Documentation in html-update-toc.
-
html-update-link-dates updates webpage text that refers to the date
and/or size of a linked-to file.
Documentation at top of file.
Do not use the scripts from here; use the versions in
https://github.com/plume-lib/html-tools.
- checklink
-
A
slightly modified
version of the
W3C Link
Checker.
This has moved; get it
from https://github.com/plume-lib/checklink.
- checklink-args.txt
-
A set of command-line arguments to the checklink program that suppress
spurious warnings and thus make the output easier to scan for real problems.
Run checklink like this:
plume-lib/bin/checklink -q -r `grep -v '^#' plume-lib/bin/checklink-args.txt` MYURL
This has moved; get it
from https://github.com/plume-lib/checklink.
- checklink-persistent-errors
-
This script reports errors that persist over multiple runs of the
checklink program, ignoring transient errors.
Documentation at top of file.
This has moved; get it
from https://github.com/plume-lib/checklink.
- html-canonical-urls
-
This file maps a textual string (such as the name of a person,
institution, or event) to the canonical URL for that string.
It is used by the
bibtex2web
program.
Do not use this file; use the version in
https://github.com/plume-lib/html-tools.
Images
- transgif
-
Change a .gif file's background to transparent.
Documentation at top of file.
- html-add-favicon
-
Takes as arguments a .png file for a "favicon", and a set of
.html files. It makes each HTML file use the given favicon. A
favicon is a favorites icon, which is intended to appear in the address
bar of your browser when you browse to the given page.
Documentation at top of file.
Do not use the script from here; use the version in
https://github.com/plume-lib/html-tools.
- add-favicon
-
Takes as arguments a directory and a .png file for a "favicon".
It makes each HTML file under the given directory use the given favicon. A
favicon is a favorites icon, which is intended to appear in the address
bar of your browser when you browse to the given page.
Documentation at top of file.
Do not use the script from here; use the version in
https://github.com/plume-lib/html-tools.
Version control
- mvc or MultiVersionControl
-
Lets you run a version control command, such as "status" or "update", on
a set of CVS/Git/Hg/SVN checkouts rather than just one.
Documentation.
- cvschanges
-
Report changes by others since my last cvs update, ignoring my
changes since then.
- ediff-merge-script
-
A script for use as a git mergetool; runs Emacs ediff as the mergetool.
Documentation at top of file.
- git-auto-invoke-mergetool.sh
-
An alias for git that runs git mergetool whenever there is a conflict.
Documentation at top of file.
Quieter output from CVS
- cvsdiff
-
Run cvs diff, but filter out empty diffs.
- cvslog
-
Eliminates empty entries from cvs log output.
- cvs-log-summarize
-
Summarize the output of cvs log.
This script groups any sequence of CVS checkins by the same author with
no more than 2 minutes separating them (but not necessarily with
identical checkin messages). For each such sequence of CVS checkins, a
list of files and checkin messages is presented.
- cvsupdate
-
Run cvs update, very quietly: only inform of conflicts
(and some errors).
Searching and replacing
- Lookup
-
Lookup searches a set of files, much like grep does. However, Lookup
searches by entry (by default, paragraphs) rather than by line,
respects comments (ignores matches within them), respects
\include directives (searches the named file), and has other
options.
Documentation.
For an example application, see the
uwisdom project and its
README file.
- preplace
-
Replace all matching regular expressions in the given files (or all files
under the current directory). The timestamp on each file is updated only
if the replacement is performed.
Documentation at top of file.
- search
-
Jeffrey Friedl's search program combines find and grep:
it more or less does a 'grep' on a whole directory tree, but it is more
efficient, uses Perl regular expressions, and is much more powerful.
This version fixes a tiny bug or two. For full documentation, see its
manpage.
However, I recommend using ag
instead of search.
Text formatting: LaTeX, PDF, PostScript, bibliographies
- latex-process-inputs
-
Determines all files that are recursively \input by a given
LaTeX file.
Documentation at top of file.
The program has two modes:
-
Inline mode (the default): Create a single LaTeX file for the document,
by inlining \input commands and removing comments.
The result is appropriate to be sent to a publisher.
-
List mode: List all the files that are (transitively) \input.
This can be useful for getting a list of source files in a logical order,
for example to be used in a Makefile or Ant buildfile.
- hevea-retarget-crossrefs
-
Replaces HTML cross-references of the form
<a href="#htoc1">
by cross-references to named labels, such as
<a href="#introduction">.
The former variety (which is generated, for example, by the Hevea
program) is brittle, as it may change from run to run of Hevea.
Documentation at top of file.
Do not use the script from here; use the version in
https://github.com/plume-lib/html-tools.
- pdfinterleave
-
Suppose you scanned two-sided paper in two passes (doing the second side
by just turning over the whole pile, so its pages are in reverse order).
This script reassembles the two PDFs into one.
Invoke as:
pdfinterleave infile1.pdf infile2.pdf outfile.pdf
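The page reordering described above can be sketched in a few lines. This is only an illustration of the ordering logic on plain lists, not the script's implementation; the class and method names are hypothetical, and it assumes both scans contain the same number of pages.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: models pages as list elements, not real PDF pages.
class InterleaveSketch {
  /**
   * Merges front sides (in reading order) with back sides that were scanned
   * in reverse order, and returns all pages in reading order.
   */
  static <T> List<T> interleave(List<T> fronts, List<T> backsReversed) {
    if (fronts.size() != backsReversed.size()) {
      throw new IllegalArgumentException("both scans must have the same page count");
    }
    int n = fronts.size();
    List<T> result = new ArrayList<>(2 * n);
    for (int i = 0; i < n; i++) {
      result.add(fronts.get(i));
      // The back of sheet i sits at position n-1-i of the reversed scan.
      result.add(backsReversed.get(n - 1 - i));
    }
    return result;
  }

  public static void main(String[] args) {
    // Three sheets: fronts are pages 1, 3, 5; backs were scanned as 6, 4, 2.
    System.out.println(interleave(Arrays.asList(1, 3, 5), Arrays.asList(6, 4, 2)));
    // prints [1, 2, 3, 4, 5, 6]
  }
}
```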
- pspage
-
Adds page numbers to a PostScript file.
- acm-dl-abstracts
-
This program takes as input a filename or URL for an
ACM digital library
proceedings table of contents. It produces, to standard output, an HTML
file that augments the table of contents with abstracts for each paper.
This makes it possible to read all the abstracts on one HTML page,
without clicking on any links.
Documentation at top of file.
- BibtexClean
-
Clean a BibTeX file by removing text outside BibTeX entries.
Documentation.
- plume-bib
-
Not a part of plume-lib, but a companion project.
plume-bib is a collection
of bibliographies in BibTeX format.
See its README
file for an explanation of its benefits and features.
- lines-after
-
Print all lines after the first one that matches the pattern.
Documentation at top of file.
- lines-before
-
Print all lines before the first one that matches the pattern.
Documentation at top of file.
- lines-between
-
Print all lines that occur between the two specified regexps (inclusive).
That is, print a line matching the first regexp; then print all lines
up to and including one matching the second regexp; then stop printing; then repeat.
Optional argument --exclusive means don't print the matching lines.
Documentation at top of file.
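The lines-between behavior can be pictured as a two-state scan over the input. The following is a minimal sketch of that idea, not the script itself; the class and method names are hypothetical, and it assumes a start line is not also tested against the end pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of the inclusive/exclusive range filter.
class LinesBetweenSketch {
  static List<String> linesBetween(
      List<String> lines, String startRegex, String endRegex, boolean exclusive) {
    Pattern start = Pattern.compile(startRegex);
    Pattern end = Pattern.compile(endRegex);
    List<String> out = new ArrayList<>();
    boolean inside = false; // true while between a start match and an end match
    for (String line : lines) {
      if (!inside) {
        if (start.matcher(line).find()) {
          inside = true;
          if (!exclusive) {
            out.add(line); // keep the start line unless --exclusive
          }
        }
      } else if (end.matcher(line).find()) {
        inside = false;
        if (!exclusive) {
          out.add(line); // keep the end line unless --exclusive
        }
      } else {
        out.add(line);
      }
    }
    return out;
  }
}
```

Inverting the `inside` test for ordinary lines gives the complementary lines-notbetween behavior.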
- lines-from
-
Print all lines after the first one that matches the pattern, inclusive.
Documentation at top of file.
- lines-notbetween
-
Print all lines that do not occur between the two specified regexps
(inclusive). That is, print until the first regexp is matched; then do
not print until the second regexp is matched; then repeat.
Optional argument --inclusive means don't print the matching lines.
Documentation at top of file.
- lines-upto
-
Print all lines before the first one that matches the pattern, inclusive.
Documentation at top of file.
Emacs helper programs
- emacs-byte-recompile-directory
-
Byte-compiles each Emacs Lisp file in the given directory whose compiled
.elc file is out of date. Requires an argument: the directory.
- emacs-flatten-tags
-
Given a TAGS file, outputs (to stdout) that file with all recursively
included TAGS files included. While the result is larger and depends on
more files, the whole thing is searched for a best match rather than a
particular subfile being exhaustively searched (even returning poor
matches) before going to the next subfile.
- emacs-mailto-handler
-
Takes a mailto link as its argument and passes it to Emacs.
For example, using the MozEX
extension for Firefox, set the mailer to:
emacs-mailto-handler %r
(you may need to specify the full pathname of emacs-mailto-handler)
and add to your ~/.emacs:
(autoload 'mailto-compose-mail "mailto-compose-mail")
- emacsclient-a
-
If the Emacsclient daemon doesn't exist already, start it and connect to
it. (The name comes from the fact that the implementation is just
"emacsclient -a".)
Java helper programs
Finding Java files
- JWhich
-
Given a Java class name, display the absolute pathname of the class file
that would be loaded first by the class loader.
Documentation.
- find-java
-
Find Java source code or class files (.java or .class) on CLASSPATH.
The output is in the order in which files are found on CLASSPATH.
Documentation at top of file.
Javadoc
- javadoc-index-to-alist
-
Construct a .javadoc-index.el file for use with
javadoc-lookup
for Emacs, which permits convenient lookup of Javadocs from Emacs.
Documentation at top of file.
Compilation
- javac-xlint
-
Wraps an invocation of javac, making 3 changes:
- It supplies the -Xlint option.
- It suppresses warning messages, based on a regexp or on code comments.
- It returns non-zero status if any other warnings (or errors) exist.
Ordinarily, javac returns non-zero status only if errors exist.
Documentation at top of file.
- javac-progress
-
Wraps an invocation of javac, but processes its STDERR to give progress
indications.
Documentation at top of file.
- java-cpp
-
This acts like the C preprocessor (cpp), but
- it does not remove comments, and
- it cleans up spacing in the processed file.
Its name comes from the fact that it is useful for running on a source
file with cpp macros, to create Java source code.
Documentation at top of file.
Dependences
- java-dependencies
-
Creates a list of the .java files used by a class or classes.
Documentation at top of file.
- classfile-orphans
-
Print names of .class files with no corresponding .java file or file
containing the definition of the class.
Documentation at top of file.
.class file processing
- ClassFileVersion
-
Given a list of .class files, or a .jar file, print the
class file version and also the JDK/JRE version required to run each
.class file.
Documentation.
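For readers curious where the version comes from: a class file starts with the magic number 0xCAFEBABE, followed by a two-byte minor version and a two-byte major version. The sketch below reads those fields from a byte array. It is an illustration only, not ClassFileVersion's code, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: reads the version fields from a class file header.
class ClassVersionSketch {
  /** Returns {major, minor} read from the first 8 bytes of a class file. */
  static int[] readVersion(byte[] classBytes) {
    ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
    if (buf.getInt() != 0xCAFEBABE) {
      throw new IllegalArgumentException("not a class file: bad magic number");
    }
    int minor = buf.getShort() & 0xFFFF; // minor_version is stored first
    int major = buf.getShort() & 0xFFFF;
    return new int[] {major, minor};
  }

  /** Maps a major version to the Java release number, e.g. 49 to 5 and 52 to 8. */
  static int javaRelease(int major) {
    return major - 44;
  }

  public static void main(String[] args) {
    byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
    int[] v = readVersion(header);
    System.out.println("major " + v[0] + ", minor " + v[1] + ", Java " + javaRelease(v[0]));
  }
}
```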
- classfile_check_version
-
Check that the version of the classfile format is ≤ the specified version.
Used to ensure that classfiles are OK for a particular version of Java.
Documentation at top of file.
- run-google-java-format.py
-
Don't use this version. Use it from the run-google-java-format repository instead.
The
google-java-format
program reformats Java source code, but it creates poor formatting for
annotations in comments. This script runs google-java-format and then
performs small changes to improve formatting of annotations in comments.
If called with no arguments, it reads from standard input and writes to standard output.
Documentation
at top of file.
- check-google-java-format.py
-
Don't use this version. Use it from the run-google-java-format repository instead.
Given .java
file names on the command line, reports any that
would be reformatted by the run-google-java-format.py program, and returns
non-zero status if there were any.
If called with no arguments, it reads from standard input.
Documentation
at top of file.
Scheduling
- ICalAvailable
-
Given one or more calendars in
iCalendar format,
produces a textual summary of available times.
This is useful for sending someone a list of acceptable times for a meeting.
Also see the ical-available Emacs function, which inserts the
output of this program.
Documentation.
- schedule
-
Given a set of scheduling constraints (times that are impossible, and
times that are undesirable), this script outputs times that are possible,
and times that are desirable.
Documentation at top of file.
Email
-
mail-e
-
Reads standard input, and if not empty calls the mail program.
This feature is useful in scripts and cron jobs, but is not supported
in all versions of mail.
Documentation
at top of file.
- imap-move
-
This script moves all IMAP messages from one folder to another.
Documentation at top of file.
Miscellaneous
- striplines
-
Strips #line directives out of a file. The file is modified in
place, but a backup is made to filename.bak.
-
path-remove
-
Cleans up a path environment variable by removing duplicates and
non-existent directories.
Can optionally remove certain path elements.
Works for either space- or colon-delimited paths.
Documentation at top of file.
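The cleanup that path-remove performs can be sketched as follows. This is a rough illustration under stated assumptions, not the script: the class and method names are hypothetical, the first occurrence of a duplicate is the one kept, and the existence check is passed in as a predicate so the logic can be exercised without touching the file system.

```java
import java.io.File;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical sketch of removing duplicate and non-existent path elements.
class PathCleanSketch {
  static String clean(String path, String delimiter, Predicate<String> exists) {
    // LinkedHashSet drops later duplicates while preserving the original order.
    Set<String> kept = new LinkedHashSet<>();
    for (String dir : path.split(Pattern.quote(delimiter))) {
      if (!dir.isEmpty() && exists.test(dir)) {
        kept.add(dir);
      }
    }
    return String.join(delimiter, kept);
  }

  public static void main(String[] args) {
    String path = System.getenv("PATH");
    if (path != null) {
      System.out.println(clean(path, File.pathSeparator, d -> new File(d).isDirectory()));
    }
  }
}
```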
-
touch-oldify
-
Give the argument files the oldest possible timestamp.
This can be useful to cause "make" to re-build the file.
-
cronic
-
A small shim shell script for wrapping cron jobs so that cron only sends
email when an error has occurred. Documentation
at top of file and at
http://habilis.net/cronic/.
-
diff-remove-empty
-
Filter out empty parts (hunks and file sections) of a diff file.
This is useful after running some other program that removes some lines
from a diff file.
-
junk
-
View and manipulate junk files, such as backup files and intermediate files.
Without argument, shows junk files subordinate to current directory.
Documentation in
file junk.doc.
-
repeated-words
-
Reports any word that appears twice in a row. Such a word is often a typo.
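One common way to detect such repetitions is a regular expression with a backreference. The sketch below shows that technique; it is an assumption about how one might do it, not a description of the script, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: finds a word immediately followed by itself.
class RepeatedWordsSketch {
  // A whole word, then whitespace (possibly a line break), then the same word.
  private static final Pattern REPEATED =
      Pattern.compile("\\b(\\w+)\\s+\\1\\b", Pattern.CASE_INSENSITIVE);

  static List<String> repeatedWords(String text) {
    List<String> found = new ArrayList<>();
    Matcher m = REPEATED.matcher(text);
    while (m.find()) {
      found.add(m.group(1)); // the first copy of the repeated word
    }
    return found;
  }

  public static void main(String[] args) {
    System.out.println(repeatedWords("This is is a test of the the tool."));
  }
}
```

The word boundaries keep "the theory" and "This is" from being reported.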
-
sort-directory-order
-
Sorts the input lines by directory order: first, every file in a given
directory, in sorted order; then, process subdirectories recursively, in
sorted order. This is useful for users (e.g., when printing) and for making
output deterministic.
Documentation
at top of file.
-
sort-reversed
-
Like sort, but the key is the reverse of each line.
Some sort implementations have a flag -r that has this
same effect.
-
trigger-travis.sh
-
Trigger a new Travis-CI job.
This is useful for triggering a dependent build: invoke this in the
"after-success:" block of repository A's .travis.yml file,
so that if Travis job A succeeds, then Travis job B is run next.
Cygwin
- cygwin-runner
-
Takes a command with arguments and translates those arguments from
Cygwin-style filenames into Windows-style filenames. Its real advantage
is the little bit of intelligence it has as far as which things are files
and which are not.
Documentation at top of file.
- java-cygwin
-
A wrapper for calling Java from Cygwin, that tries to convert any
arguments that are Unix-style paths into Windows-style paths.
Documentation at top of file.
- javac-cygwin
-
A wrapper for calling the Java compiler from Cygwin, that tries to convert any
arguments that are Unix-style paths into Windows-style paths.
Documentation at top of file.
- javadoc-cygwin
-
A wrapper for calling Javadoc from Cygwin, that tries to convert any
arguments that are Unix-style paths into Windows-style paths.
Documentation at top of file.
Libraries
Emacs libraries
Documentation is forthcoming. In the meantime, each individual library is
generally well-documented, so feel free to browse.
Java libraries
Command-line option argument processing
- Options
-
The Options class:
- parses command-line options and sets fields in your program accordingly,
- creates usage messages (such as printed by a --help option), and
- creates documentation suitable for a manual, webpage, or manpage.
Thus, the programmer is freed from writing duplicative, boilerplate code
and documentation that could get out of sync with the rest of the program.
If you use this, the Javadoc tool needs to be on your classpath.
This version is deprecated; the Options class has moved to https://github.com/plume-lib/options.
Collections and iterators
- ArraysMDE
-
Utilities for manipulating arrays and collections.
This complements java.util.Arrays and java.util.Collections.
Documentation.
- LimitedSizeSet
-
LimitedSizeSet stores up to some maximum number of unique
values, at which point its rep is nulled, in order to save space.
Documentation.
There is also LimitedSizeIntSet, which takes less memory than LimitedSizeSet<Integer>.
Documentation.
- WeakHasherMap
-
WeakHasherMap is a modified version of WeakHashMap from JDK 1.2.2, that
adds a constructor that takes a Hasher argument.
Documentation.
- WeakIdentityHashMap
-
WeakIdentityHashMap is a modified version of WeakHashMap from JDK 1.5,
that uses System.identityHashCode() rather than the object's hash code.
Documentation.
- OrderedPairIterator
-
Given two sequences/iterators/whatever, OrderedPairIterator returns a new
sequence/iterator/whatever that pairs the matching elements of the
inputs, according to their respective sort orders. (This operation is
sometimes called "zipping".)
Documentation.
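The pairing idea resembles a merge of two sorted sequences. The sketch below works on lists rather than iterators and assumes that an element present in only one input is paired with null; that convention, and the class and method names, are assumptions for illustration rather than the documented behavior of OrderedPairIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: pairs up equal elements of two sorted lists.
class OrderedPairSketch {
  /** A two-element list standing in for a pair; either slot may be null. */
  private static <T> List<T> pair(T a, T b) {
    return Arrays.asList(a, b);
  }

  static <T extends Comparable<T>> List<List<T>> pairUp(List<T> left, List<T> right) {
    List<List<T>> pairs = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < left.size() || j < right.size()) {
      if (j >= right.size()) {
        pairs.add(pair(left.get(i++), null)); // right side exhausted
      } else if (i >= left.size()) {
        pairs.add(pair(null, right.get(j++))); // left side exhausted
      } else {
        int cmp = left.get(i).compareTo(right.get(j));
        if (cmp == 0) {
          pairs.add(pair(left.get(i++), right.get(j++))); // matching elements
        } else if (cmp < 0) {
          pairs.add(pair(left.get(i++), null)); // only in left
        } else {
          pairs.add(pair(null, right.get(j++))); // only in right
        }
      }
    }
    return pairs;
  }

  public static void main(String[] args) {
    System.out.println(pairUp(Arrays.asList(1, 2, 4), Arrays.asList(2, 3, 4)));
  }
}
```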
- IterableIterator
-
In Java, Iterators are not Iterable, so they cannot be used in new-style
for loops. The IterableIterator wrapper makes an Iterator that is
also Iterable — that is, it implements the iterator() method.
Documentation.
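The wrapper idea fits in a few lines. The class below is a hypothetical sketch, not plume-lib's IterableIterator; note that the wrapped iterator is consumed by the first loop, so such a wrapper is single-use.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: lets an existing Iterator appear in a for-each loop.
class IterableIteratorSketch<T> implements Iterable<T> {
  private final Iterator<T> iter;

  IterableIteratorSketch(Iterator<T> iter) {
    this.iter = iter;
  }

  /** Returns the wrapped iterator itself, so it can only be traversed once. */
  @Override
  public Iterator<T> iterator() {
    return iter;
  }

  public static void main(String[] args) {
    Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
    for (String s : new IterableIteratorSketch<>(it)) {
      System.out.println(s);
    }
  }
}
```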
Text processing
- StrTok
-
Provides a somewhat simpler interface for tokenizing strings than
does StreamTokenizer. All tokenizing is done by StreamTokenizer.
Documentation.
- EntryReader
-
Class that reads "entries" from a file. In the simplest case, entries
can be lines. It supports:
include files,
comments, and
multi-line entries (paragraphs).
The syntax of each of these is customizable.
Documentation.
- RegexUtil
-
Utility methods for regular expressions, most notably for testing whether
a string is a regular expression.
Documentation.
- FileIOException
-
Extends IOException by also reporting a file name and line
number at which the exception occurred.
Documentation.
- StringBuilderDelimited
-
Like StringBuilder, but adds a delimiter between each pair of strings
that are inserted into the StringBuilder. This can simplify the logic of
programs and also avoid errors.
Documentation.
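To make the idea concrete, here is a minimal sketch of a delimiter-inserting builder. The class name and its add method are hypothetical and are not the StringBuilderDelimited API.

```java
// Hypothetical sketch: appends a delimiter before every string except the first.
class DelimitedBuilderSketch {
  private final StringBuilder sb = new StringBuilder();
  private final String delimiter;
  // Tracked separately from sb.length() so that an empty first string still counts.
  private boolean empty = true;

  DelimitedBuilderSketch(String delimiter) {
    this.delimiter = delimiter;
  }

  DelimitedBuilderSketch add(String s) {
    if (!empty) {
      sb.append(delimiter);
    }
    sb.append(s);
    empty = false;
    return this;
  }

  @Override
  public String toString() {
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(new DelimitedBuilderSketch(", ").add("a").add("b").add("c"));
  }
}
```

Callers no longer need the usual "is this the first element?" check in their loops. In Java 8 and later, the JDK class java.util.StringJoiner provides similar behavior.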
- CountingPrintWriter
-
Prints formatted representations of objects to a text-output
stream, counting the number of bytes and characters printed.
Documentation.
- Digest
-
Computes a message digest for a file.
Documentation.
Math
- MathMDE
-
Mathematical utilities.
Documentation.
- FuzzyFloat
-
Routines for doing approximate ('fuzzy') floating point comparisons.
Those are comparisons that only require the floating point numbers to be
relatively close to one another to be equal, rather than exactly
equal.
Documentation.
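One common way to define "relatively close" is a relative tolerance scaled by the larger magnitude. The sketch below uses that definition as an assumption; FuzzyFloat's actual comparison rule may differ, and the class and method names are hypothetical.

```java
// Hypothetical sketch of a relative-tolerance floating point comparison.
class FuzzyCompareSketch {
  static boolean fuzzyEquals(double a, double b, double relTolerance) {
    if (Double.isNaN(a) || Double.isNaN(b)) {
      return false; // NaN is never close to anything
    }
    if (a == b) {
      return true; // also covers 0.0 vs -0.0 and equal infinities
    }
    if (Double.isInfinite(a) || Double.isInfinite(b)) {
      return false; // an infinity is only equal to itself
    }
    double scale = Math.max(Math.abs(a), Math.abs(b));
    return Math.abs(a - b) <= relTolerance * scale;
  }

  public static void main(String[] args) {
    System.out.println((0.1 + 0.2) == 0.3); // exact comparison
    System.out.println(fuzzyEquals(0.1 + 0.2, 0.3, 1e-9)); // fuzzy comparison
  }
}
```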
Random selection
- RandomSelector
-
Selects k elements uniformly at random from
an arbitrary iterator, using O(k) space.
Documentation.
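A standard technique with these properties is reservoir sampling. The sketch below shows that technique as an illustration; it is an assumption that RandomSelector works along these lines, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of reservoir sampling: k uniform picks in O(k) space.
class ReservoirSketch {
  static <T> List<T> select(Iterator<T> items, int k, Random rng) {
    List<T> reservoir = new ArrayList<>(k);
    int seen = 0; // number of items consumed so far
    while (items.hasNext()) {
      T item = items.next();
      seen++;
      if (reservoir.size() < k) {
        reservoir.add(item); // the first k items fill the reservoir
      } else {
        // Keep the new item with probability k/seen, evicting a random slot.
        int slot = rng.nextInt(seen); // uniform in [0, seen)
        if (slot < k) {
          reservoir.set(slot, item);
        }
      }
    }
    return reservoir;
  }

  public static void main(String[] args) {
    List<Integer> numbers = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      numbers.add(i);
    }
    System.out.println(select(numbers.iterator(), 5, new Random()));
  }
}
```

The iterator is traversed once and its length need not be known in advance; if it yields fewer than k items, all of them are returned.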
- MultiRandSelector
-
Like RandomSelector, performs a uniform random selection over an
iterator. However, the objects in the iteration may be partitioned so
that the random selection chooses the same number from each group.
Documentation.
Processes
- TimeLimitProcess
-
A subclass of Process such that the process is killed if it runs for more
than the specified amount of wall clock time.
Documentation.
This class is deprecated; use Apache Commons Exec instead.
- FileCompiler
-
Defines methods that compile Java source files.
Documentation.
Tuples
- Pair
-
Mutable pair class: type-safely holds two objects of possibly-different types.
Documentation.
- Triple
-
Mutable triple class: type-safely holds three objects of possibly-different types.
Documentation.
Miscellaneous
- BCELUtil
-
Static utility methods for working with BCEL.
Documentation.
- DeterministicObject
-
A version of Object with a deterministic hashCode() method.
Instantiate this instead of Object to remove a source of
nondeterminism from your programs.
- GraphMDE
-
Graph utility methods. This class does not model a graph: all methods
are static.
Documentation.
- Intern
-
Utilities for interning objects. Interning is also known as
canonicalization or hash-consing: it returns a single representative
object that .equals() the object, and the client discards the
argument and uses the result instead.
Documentation.
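The core of interning can be sketched with a map from each value to its canonical representative. This is a simplified illustration, not the Intern class: the names are hypothetical, and it holds strong references, so interned values are never garbage-collected.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an interning pool backed by a HashMap.
class InternSketch<T> {
  private final Map<T, T> canonical = new HashMap<>();

  /** Returns the representative object that equals() the argument. */
  T intern(T value) {
    T existing = canonical.putIfAbsent(value, value);
    return existing != null ? existing : value;
  }

  public static void main(String[] args) {
    InternSketch<String> pool = new InternSketch<>();
    String a = pool.intern(new String("plume"));
    String b = pool.intern(new String("plume"));
    System.out.println(a == b); // both variables refer to the same object
  }
}
```

After interning, clients can compare values with == instead of equals(), which is the main payoff of the technique.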
- SimpleLog
-
A simple logging class with timers, subtasks, backtraces, output to
file or standard out, and the ability to be enabled or disabled.
Documentation.
- Stopwatch
-
A simple class for recording and computing elapsed time.
Documentation.
- UtilMDE
-
Utility functions that do not belong elsewhere in the plume package.
Documentation.
Perl libraries
- checkargs.pm
-
checkargs.pm checks the number of arguments passed to a Perl function at run
time, catching some common errors that could otherwise go undetected until
later in the program.
Documentation at top of file.