EXPLANATION OF THE SOFTWARE AT: http://probability.ca/usscc/

SUMMARY:

This directory contains various computer programs used during the writing of the research paper "Detecting Multiple Authorship of United States Supreme Court Legal Decisions Using Function Words", by J.S. Rosenthal and A. Yoon (2009), available at:

    http://probability.ca/jeff/research.html

The software automatically downloads and sorts plain-text versions of the United States Supreme Court decisions from 1991-2009, using the Cornell University Law School source provided at:

    http://www.law.cornell.edu/supct/

It then performs various statistical analyses on the text of these decisions, including the frequency of various "function words" and more.

The programs are designed for use only on linux/unix/Mac machines.

All programs in this directory are Copyright (c) 2010 by Jeffrey S. Rosenthal, and are licensed for general copying, distribution and modification according to the GNU General Public License (http://www.gnu.org/copyleft/gpl.html).

TO PREPARE TO USE THIS SOFTWARE:

First, on your own linux/unix/Mac computer, make sure you have "cc" and "lynx" installed. ("cc" or "gcc" is the C compiler; on linux/unix it should be pre-loaded, but for Mac you might need to install the Xcode package from developer.apple.com/tools/xcode. "lynx" is a plain-text web browser, available from e.g. www.lynxbrowser.com.)

Then, create a new file directory in which you have write permission, e.g. "~/usscc".

Then, download and run the INSTALL file, e.g. by typing:

    lynx -source http://probability.ca/usscc/INSTALL > INSTALL
    chmod +x INSTALL
    ./INSTALL

You are now ready to begin!

USING THIS SOFTWARE:

From within that same directory, proceed as follows.

First, type "graball" followed by a justice's name to automatically download as plain text (removing extraneous header and footer text), and sort (by session and date), and perform statistical analysis on, all of the majority decisions that they authored from 1991-2009:

    ./graball Kennedy
    ./graball Scalia
    (etc.)

(This command also generates a subdirectory "info" with additional statistical information about each decision.)

You can optionally repeat the same analysis later using "textvarit":

    ./textvarit Kennedy
    (etc.)

Once a justice's decisions have been downloaded and analysed, then you can perform various bootstrap comparisons.

For example, to compare their 1990s decisions with their 2000s decisions, use "comp1920":

    ./comp1920 Kennedy
    (etc.)

To compare their decisions from the first half (Sept-March) of all sessions to those of the second half (April-Aug), use "compab":

    ./compab Kennedy
    (etc.)

To compare two different justices, use "comptwo":

    ./comptwo Kennedy Scalia
    (etc.)

To perform a cross-validation test of a naive Bayes classifier for determining which of two justices authored a decision, use "naivebayesit":

    ./naivebayesit Kennedy Scalia
    (etc.)

To perform a cross-validation test of a linear classifier for determining which of two justices authored a decision, use "lindiscit":

    ./lindiscit Kennedy Scalia
    (etc.)

You can also download a justice's DISSENTING opinions with "grabdisall":

    ./grabdisall Stevens
    (etc.)

and can then perform comparisons using that data as well:

    ./textvarit StevensDissent
    ./compab StevensDissent
    ./comptwo StevensDissent Scalia
    ./lindiscit StevensDissent Scalia
    (etc.)

You may contact me with questions.

-- Jeffrey Rosenthal, jeff@math.toronto.edu, http://probability.ca/jeff/