gov.nih.nlm.util
Class SystemToolkit

java.lang.Object
  extended by gov.nih.nlm.util.SystemToolkit

public class SystemToolkit
extends Object

Utility toolkit.

Author:
MEME Group

Constructor Summary
SystemToolkit()
           
 
Method Summary
static void copy(File in, File out)
          Copies the input file to the output file.
static void copy(File in, File out, ProgressMonitor pm)
          Copies the input file to the output file.
static String[] getIndexWords(String[] words)
          Lowercases words and returns the list sorted and deduplicated.
static String[] getZipEntries(String zip_file_name)
          Returns a list of zip entry file names.
static InputStream getZipInputStream(String zf, String ze)
          Returns an InputStream for the ZipEntry in the specified ZipFile.
static boolean isStopWord(String word)
          Checks whether the word is one of the stop words.
static void main(String[] s)
           
static String md5(File file)
          Returns the MD5 value for the given File.
static String md5(File file, ProgressMonitor pm)
          Returns the MD5 value for the given File.
static String md5(String text)
          Compute the MD5 hash of a string.
static String md5(String text, String char_encoding)
          Compute the MD5 hash of a string using the specified character encoding.
static String md5CrossPlatform(File file)
          Returns the MD5 value for the given File.
static String md5CrossPlatform(File file, ProgressMonitor pm)
          Returns the MD5 value for the given File.
static String[][] readFieldedFile(File file, String delim)
          Reads the fielded file into a two-dimensional string array.
static String[][] readFieldedReader(BufferedReader in, String delim)
          Reads the fielded file into a two-dimensional string array.
static String readLine(RandomAccessFile raf, String char_set)
          Reads and returns a line from the RandomAccessFile using the specified character set.
static String removeLinks(String html)
          Removes any links within a specified HTML document.
static String removeTags(String html)
          Returns an HTML document stripped of its tags.
static long seekstr(RandomAccessFile raf, String search_string, String char_set)
          Seek to the location in the RandomAccessFile where the first line of text starts with the search string.
static void sort(String filename)
          Sort the specified file.
static void sort(String filename, boolean unique)
          Sort the specified file (optionally uniquely).
static void sort(String filename, boolean unique, ProgressMonitor pm)
          Sort the specified file (optionally uniquely).
static void sort(String filename, Comparator comp)
          Sort the specified file using the specified Comparator.
static void sort(String filename, Comparator comp, boolean unique)
          Sort the specified file using the specified Comparator and optionally sort uniquely.
static void sort(String filename, Comparator comp, boolean unique, ProgressMonitor pm)
          Sort the specified file using the specified Comparator and optionally sort uniquely.
static void sort(String filename, ProgressMonitor pm)
          Sort the specified file.
static String toHexString(byte[] v)
          Converts a byte[] to a hex string.
static void unzip(String zip_file, String output_dir, String archive_subdir, ProgressMonitor pm)
          Unzips the specified file into the specified directory.
static void unzip(String unzip_cmd, String zip_file, String output_dir, String archive_subdir, ProgressMonitor pm)
          Unzips the specified file into the specified directory using the specified operating system command.
static void unzipInternal(String zip_file, String output_dir, String archive_subdir)
          Unzips the specified file into the specified directory using pure Java.
static void unzipInternal(String zip_file, String output_dir, String archive_subdir, ProgressMonitor pm)
          Unzips the specified file into the specified directory using pure Java.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SystemToolkit

public SystemToolkit()
Method Detail

isStopWord

public static boolean isStopWord(String word)
Checks whether the word is one of the stop words. Assumes the word is already lowercase.

Parameters:
word - lowercase string
Returns:
true if the word is a stop word, false otherwise

getIndexWords

public static String[] getIndexWords(String[] words)
Lowercases the words and returns the list sorted and deduplicated. Also removes stop words.

Parameters:
words - words from a string
Returns:
sorted, deduplicated, lowercased word list
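
The behavior described above can be sketched in plain Java. This is only an illustration, not the toolkit's source: the class name IndexWordsSketch, the method name indexWords, and the stop-word list are all assumptions, since the actual stop list is not documented on this page.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

class IndexWordsSketch {
    // Hypothetical stop list; the list used by SystemToolkit is not documented here.
    private static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "and", "of", "the"));

    static String[] indexWords(String[] words) {
        // A TreeSet keeps its entries sorted and silently drops duplicates.
        Set<String> result = new TreeSet<>();
        for (String word : words) {
            String lower = word.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                result.add(lower);
            }
        }
        return result.toArray(new String[0]);
    }
}
```

With this stop list, the input {"The", "Heart", "and", "heart", "Attack"} yields {"attack", "heart"}.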

copy

public static void copy(File in,
                        File out)
                 throws IOException
Copies the input file to the output file.

Parameters:
in - the input File
out - the output File
Throws:
IOException

copy

public static void copy(File in,
                        File out,
                        ProgressMonitor pm)
                 throws IOException
Copies the input file to the output file. Supports progress monitoring.

Parameters:
in - the input File
out - the output File
pm - the ProgressMonitor
Throws:
IOException

md5

public static String md5(String text)
                  throws NoSuchAlgorithmException,
                         UnsupportedEncodingException
Compute the MD5 hash of a string.

Parameters:
text - the value to compute a hash for
Returns:
the MD5 hash value of the string
Throws:
NoSuchAlgorithmException - if failed due to no such algorithm
UnsupportedEncodingException - if failed due to unsupported encoding

md5

public static String md5(String text,
                         String char_encoding)
                  throws NoSuchAlgorithmException,
                         UnsupportedEncodingException
Compute the MD5 hash of a string using the specified character encoding.

Parameters:
text - the value to compute a hash for
char_encoding - a character encoding (e.g. "UTF-8")
Returns:
the MD5 hash value of the string
Throws:
NoSuchAlgorithmException - if failed due to no such algorithm
UnsupportedEncodingException - if failed due to unsupported encoding
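
For readers unfamiliar with how such a hash is typically computed, here is a minimal sketch using java.security.MessageDigest. The class name Md5StringSketch is hypothetical, and the lowercase hex output format is an assumption; this page does not state how the toolkit formats the result.

```java
import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5StringSketch {
    static String md5(String text, String charEncoding)
            throws NoSuchAlgorithmException, UnsupportedEncodingException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        // The encoding matters: the same text gives different bytes,
        // and therefore different hashes, under different encodings.
        byte[] hash = digest.digest(text.getBytes(charEncoding));
        StringBuilder hex = new StringBuilder(hash.length * 2);
        for (byte b : hash) {
            // Assumption: two lowercase hex digits per byte.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

The two declared exceptions match the Throws list above: getInstance can fail with NoSuchAlgorithmException and getBytes with UnsupportedEncodingException.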

md5

public static String md5(File file)
                  throws IOException,
                         NoSuchAlgorithmException
Returns the MD5 value for the given File.

Parameters:
file - for which to determine the MD5
Returns:
MD5 string
Throws:
IOException
NoSuchAlgorithmException

md5

public static String md5(File file,
                         ProgressMonitor pm)
                  throws IOException,
                         NoSuchAlgorithmException
Returns the MD5 value for the given File.

Parameters:
file - for which to determine the MD5
pm - an optional progress monitor to track whether operation should be cancelled
Returns:
MD5 string
Throws:
IOException
NoSuchAlgorithmException
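
A file digest is usually computed by streaming the file through the digest in fixed-size chunks, which also gives a natural place to consult a progress monitor. The sketch below assumes that approach; Md5FileSketch is a hypothetical name, the buffer size is arbitrary, and the ProgressMonitor interaction is left as a comment because its interface is not documented on this page.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5FileSketch {
    static String md5(File file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        try (InputStream in = new FileInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                // A progress monitor could be updated or checked for
                // cancellation here, once per chunk.
                digest.update(buffer, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            // Assumption: lowercase hex output.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```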

md5CrossPlatform

public static String md5CrossPlatform(File file)
                               throws IOException,
                                      NoSuchAlgorithmException
Returns the MD5 value for the given File.

Parameters:
file - for which to determine the MD5
Returns:
MD5 string
Throws:
IOException
NoSuchAlgorithmException

md5CrossPlatform

public static String md5CrossPlatform(File file,
                                      ProgressMonitor pm)
                               throws IOException,
                                      NoSuchAlgorithmException
Returns the MD5 value for the given File.

Parameters:
file - for which to determine the MD5
pm - an optional progress monitor to track whether operation should be cancelled
Returns:
MD5 string
Throws:
IOException
NoSuchAlgorithmException

toHexString

public static String toHexString(byte[] v)
Converts a byte[] to a hex string.

Parameters:
v - the byte[]
Returns:
the hex string
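
One common way to perform this conversion is a nibble lookup table, sketched below. HexSketch is a hypothetical class name, and the choice of lowercase digits with no separator is an assumption about the output format.

```java
class HexSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    static String toHexString(byte[] v) {
        char[] out = new char[v.length * 2];
        for (int i = 0; i < v.length; i++) {
            int b = v[i] & 0xff;               // treat the byte as unsigned
            out[2 * i] = DIGITS[b >>> 4];      // high nibble
            out[2 * i + 1] = DIGITS[b & 0x0f]; // low nibble
        }
        return new String(out);
    }
}
```

Under these assumptions the bytes {0x00, 0x0f, 0xff, 0x10} become "000fff10".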

removeTags

public static String removeTags(String html)
Returns an HTML document stripped of its tags.

Parameters:
html - an html document
Returns:
plain text document

removeLinks

public static String removeLinks(String html)
Removes any links within a specified HTML document.

Parameters:
html - an html document
Returns:
document without any links
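
The two HTML helpers above could be approximated with regular expressions, as in the sketch below. This is an assumption about the technique, not the documented implementation: HtmlStripSketch is a hypothetical name, the sketch keeps the link text when removing anchors (the page does not say whether the toolkit does), and regex-based stripping does not handle comments, scripts, or a '>' inside an attribute value.

```java
class HtmlStripSketch {
    // Drops everything from a '<' up to the next '>'.
    static String removeTags(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    // Drops opening and closing anchor tags only, leaving the
    // text between them and all other tags in place.
    static String removeLinks(String html) {
        return html.replaceAll("(?i)</?a(\\s[^>]*)?>", "");
    }
}
```

For example, removeTags applied to "<p>Hello <b>world</b></p>" gives "Hello world", and removeLinks leaves tags such as <abbr> untouched because the pattern requires whitespace or '>' right after the letter a.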

unzip

public static void unzip(String zip_file,
                         String output_dir,
                         String archive_subdir,
                         ProgressMonitor pm)
                  throws IOException
Unzips the specified file into the specified directory. Uses the unzip.native system property to determine whether to run the unzip program named by the unzip.path system property or to use a pure Java solution.

Parameters:
zip_file - the zip file name
output_dir - the directory to unzip to
archive_subdir - the portion of the archive to extract
pm - the progress monitor
Throws:
IOException

unzip

public static void unzip(String unzip_cmd,
                         String zip_file,
                         String output_dir,
                         String archive_subdir,
                         ProgressMonitor pm)
                  throws IOException
Unzips the specified file into the specified directory using the specified operating system command. This method makes use of a progress monitor (if usingView() returns true).

Parameters:
unzip_cmd - the OS command to use to unzip
zip_file - the zip file name
output_dir - the directory to unzip to
archive_subdir - the portion of the archive to extract
pm - the progress monitor
Throws:
IOException

unzipInternal

public static void unzipInternal(String zip_file,
                                 String output_dir,
                                 String archive_subdir)
                          throws IOException
Unzips the specified file into the specified directory using pure Java.

Parameters:
zip_file - the ZipFile name to unzip
output_dir - the output directory
archive_subdir - indicates what to unzip (use * for everything)
Throws:
IOException - if anything goes wrong manipulating the files.

unzipInternal

public static void unzipInternal(String zip_file,
                                 String output_dir,
                                 String archive_subdir,
                                 ProgressMonitor pm)
                          throws IOException
Unzips the specified file into the specified directory using pure Java.

Parameters:
zip_file - the ZipFile name to unzip
output_dir - the output directory
archive_subdir - indicates what to unzip (use * for everything)
pm - the ProgressMonitor used to track progress.
Throws:
IOException - if anything goes wrong manipulating the files.
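
A pure Java extraction of this kind can be built on java.util.zip.ZipFile, as sketched below. The class name UnzipSketch is hypothetical. Treating archive_subdir as an entry-name prefix is an assumption; the page only documents that * means everything. The path check that rejects entries resolving outside the output directory is an addition of this sketch, not documented behavior.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

class UnzipSketch {
    static void unzip(String zipFile, String outputDir, String archiveSubdir)
            throws IOException {
        File root = new File(outputDir).getCanonicalFile();
        try (ZipFile zip = new ZipFile(zipFile)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                // Assumption: "*" selects everything; any other value is
                // matched as a prefix of the entry name.
                if (!"*".equals(archiveSubdir)
                        && !entry.getName().startsWith(archiveSubdir)) {
                    continue;
                }
                File target = new File(root, entry.getName()).getCanonicalFile();
                // Refuse entries that would land outside the output directory.
                if (!target.toPath().startsWith(root.toPath())) {
                    throw new IOException("Entry outside output directory: "
                        + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target.toPath());
                } else {
                    Files.createDirectories(target.getParentFile().toPath());
                    try (InputStream in = zip.getInputStream(entry)) {
                        Files.copy(in, target.toPath(),
                            StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
    }
}
```

A progress-aware variant would update its monitor once per entry inside the loop.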

seekstr

public static long seekstr(RandomAccessFile raf,
                           String search_string,
                           String char_set)
                    throws IOException
Seek to the location in the RandomAccessFile where the first line of text starts with the search string. This implements a binary search function in a file (like the UNIX look command).

Parameters:
raf - the RandomAccessFile to search
search_string - the search string
char_set - the character set to use for the search_string
Returns:
the index into the file where to start looking
Throws:
IOException - if anything goes wrong
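
The binary search described above can be sketched as follows. This is an illustration under stated assumptions, not the toolkit's code: SeekSketch and its helper names are hypothetical, the file is assumed to be sorted in byte order with '\n' line endings, and the backward scan for a line start is written for clarity rather than speed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

class SeekSketch {
    // Returns the offset of the first line that compares >= the key on its
    // leading bytes, or the file length if there is no such line.
    static long seek(RandomAccessFile raf, String searchString, String charSet)
            throws IOException {
        byte[] key = searchString.getBytes(charSet);
        long lo = 0;            // a line start; all earlier lines are < key
        long hi = raf.length(); // a line start (or EOF); lines from here are >= key
        while (lo < hi) {
            long mid = lo + (hi - lo) / 2;
            long start = lineStart(raf, mid, lo);
            raf.seek(start);
            byte[] line = readLineBytes(raf);
            if (comparePrefix(line, key) < 0) {
                lo = raf.getFilePointer(); // start of the following line
            } else {
                hi = start;
            }
        }
        raf.seek(lo);
        return lo;
    }

    // Walks backward from pos to the start of the line containing it.
    private static long lineStart(RandomAccessFile raf, long pos, long floor)
            throws IOException {
        while (pos > floor) {
            raf.seek(pos - 1);
            if (raf.read() == '\n') {
                return pos;
            }
            pos--;
        }
        return floor;
    }

    // Reads bytes up to, and consuming, the next '\n' or end of file.
    private static byte[] readLineBytes(RandomAccessFile raf) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = raf.read()) != -1 && b != '\n') {
            buf.write(b);
        }
        return buf.toByteArray();
    }

    // Unsigned byte comparison; a line that starts with the key compares as 0.
    private static int comparePrefix(byte[] line, byte[] key) {
        int n = Math.min(line.length, key.length);
        for (int i = 0; i < n; i++) {
            int diff = (line[i] & 0xff) - (key[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return line.length >= key.length ? 0 : -1;
    }
}
```

Because the returned offset only marks where to start looking, a caller would still read the line at that offset and confirm that it begins with the search string.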

readLine

public static String readLine(RandomAccessFile raf,
                              String char_set)
                       throws IOException
Reads and returns a line from the RandomAccessFile using the specified character set.

Parameters:
raf - the RandomAccessFile
char_set - the character set
Returns:
a line from the RandomAccessFile in the specified character set
Throws:
IOException

getZipInputStream

public static InputStream getZipInputStream(String zf,
                                            String ze)
                                     throws IOException
Returns an InputStream for the ZipEntry in the specified ZipFile. Uses the native unzip program found in the system "unzip.path" property.

Parameters:
zf - the ZipFile
ze - the ZipEntry
Returns:
the InputStream
Throws:
IOException - if anything goes wrong

getZipEntries

public static String[] getZipEntries(String zip_file_name)
                              throws IOException
Returns a list of zip entry file names. Uses the native unzip program found in the system "unzip.path" property.

Parameters:
zip_file_name - the zip file name
Returns:
a list of zip entry file names
Throws:
IOException

main

public static void main(String[] s)

sort

public static void sort(String filename)
                 throws IOException
Sort the specified file.

Parameters:
filename - the file to sort
Throws:
IOException - if failed to sort

sort

public static void sort(String filename,
                        ProgressMonitor pm)
                 throws IOException
Sort the specified file.

Parameters:
filename - the file to sort
pm - ProgressMonitor
Throws:
IOException - if failed to sort

sort

public static void sort(String filename,
                        boolean unique)
                 throws IOException
Sort the specified file (optionally uniquely).

Parameters:
filename - the file to sort
unique - whether to sort uniquely (drop duplicate lines)
Throws:
IOException - if failed to sort

sort

public static void sort(String filename,
                        boolean unique,
                        ProgressMonitor pm)
                 throws IOException
Sort the specified file (optionally uniquely).

Parameters:
filename - the file to sort
unique - whether to sort uniquely (drop duplicate lines)
pm - ProgressMonitor
Throws:
IOException - if failed to sort

sort

public static void sort(String filename,
                        Comparator comp)
                 throws IOException
Sort the specified file using the specified Comparator.

Parameters:
filename - the file to sort
comp - the Comparator
Throws:
IOException - if failed to sort

sort

public static void sort(String filename,
                        Comparator comp,
                        boolean unique)
                 throws IOException
Sort the specified file using the specified Comparator and optionally sort uniquely.

Parameters:
filename - the file to sort
comp - the Comparator
unique - whether to sort uniquely (drop duplicate lines)
Throws:
IOException - if failed to sort

sort

public static void sort(String filename,
                        Comparator comp,
                        boolean unique,
                        ProgressMonitor pm)
                 throws IOException
Sort the specified file using the specified Comparator and optionally sort uniquely.

Parameters:
filename - the file to sort
comp - the Comparator
unique - whether to sort uniquely (drop duplicate lines)
pm - a ProgressMonitor
Throws:
IOException - if failed to sort
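
The sort variants above combine three ideas: read the lines, order them with a Comparator, and optionally drop duplicates. A minimal in-memory sketch follows. FileSortSketch is a hypothetical name, and several choices are assumptions not stated on this page: the result is written back to the same file, the file is read as UTF-8, and two lines count as duplicates when the comparator returns 0. Very large files would need an external merge sort instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class FileSortSketch {
    static void sort(String filename, Comparator<String> comp, boolean unique)
            throws IOException {
        Path path = Paths.get(filename);
        List<String> lines =
            new ArrayList<>(Files.readAllLines(path, StandardCharsets.UTF_8));
        lines.sort(comp);
        if (unique) {
            // After sorting, equal lines are adjacent, so comparing each
            // line with the last kept line is enough to drop duplicates.
            List<String> kept = new ArrayList<>();
            for (String line : lines) {
                if (kept.isEmpty()
                        || comp.compare(kept.get(kept.size() - 1), line) != 0) {
                    kept.add(line);
                }
            }
            lines = kept;
        }
        Files.write(path, lines, StandardCharsets.UTF_8);
    }
}
```

Passing Comparator.naturalOrder() corresponds to the overloads that take no Comparator.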

readFieldedFile

public static String[][] readFieldedFile(File file,
                                         String delim)
                                  throws IOException
Reads the fielded file into a two-dimensional string array.

Parameters:
file - the file to read
delim - the field separator in the file
Returns:
a String[][] of the lines/fields
Throws:
IOException - if anything goes wrong

readFieldedReader

public static String[][] readFieldedReader(BufferedReader in,
                                           String delim)
                                    throws IOException
Reads the fielded file into a two-dimensional string array.

Parameters:
in - the BufferedReader input
delim - the field separator in the file
Returns:
a String[][] of the lines/fields
Throws:
IOException - if anything goes wrong
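
Reading a delimited stream into a String[][] can be sketched as below. FieldedReaderSketch is a hypothetical name, and two details are assumptions: the delimiter is treated as a literal string rather than a regular expression, and trailing empty fields are kept.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class FieldedReaderSketch {
    static String[][] readFielded(BufferedReader in, String delim)
            throws IOException {
        List<String[]> rows = new ArrayList<>();
        // Quote the delimiter so characters such as '|' are not
        // interpreted as regex operators by String.split.
        String literal = Pattern.quote(delim);
        String line;
        while ((line = in.readLine()) != null) {
            // A limit of -1 keeps trailing empty fields.
            rows.add(line.split(literal, -1));
        }
        return rows.toArray(new String[0][]);
    }
}
```

In the result, the first index selects the line and the second selects the field within that line.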


Copyright ©2005