Text File | 1987-02-15 | 21KB | 576 lines
@ 0 18
MYSTAT, the personal version of SYSTAT
MYSTAT has commands that let you enter, transform and analyze data.
To use a command, simply type the name of the command, the options
you wish and hit the [Enter] key. For example, to begin this
introduction you typed the INTRO command name and hit the [Enter] key.
>INTRO [Enter]
You can abbreviate command names and options to just the first two characters
and use upper and lower case letters interchangeably. Thus you could just as
well have typed
>in [Enter]
to start this introductory tutorial. One exception to this rule is the
STEM command, which requires a minimum of STE to distinguish it from STATS.
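The abbreviation rule can be illustrated with a short Python sketch. It is not MYSTAT's own parser: the command list is copied from the menu reproduced in this tutorial, and the helper name `candidates` and the plain prefix matching are assumptions made for illustration.

```python
# Command names copied from the MYSTAT menu shown in this tutorial.
COMMANDS = [
    "DEMO", "HELP", "INTRO", "SYSTAT",
    "EDIT", "USE", "SAVE", "PUT", "SORT", "RANK",
    "SUBMIT", "OUTPUT", "NOTE", "FORMAT", "MENU", "CHAR",
    "NAMES", "WEIGHT",
    "PLOT", "TPLOT", "HISTOGRAM", "BOX", "STEM", "TTEST",
    "STATS", "TABULATE", "PEARSON", "SIGN", "WILCOXON", "FRIEDMAN",
    "KS", "CATEGORY", "MODEL", "ESTIMATE", "QUIT",
]


def candidates(typed, commands=COMMANDS):
    """Return every command the typed abbreviation could stand for.

    Hypothetical helper: matching is a case-insensitive prefix test,
    and fewer than two characters is assumed never to match.
    """
    key = typed.strip().upper()
    if len(key) < 2:
        return []
    return [name for name in commands if name.startswith(key)]
```

With this list, every two-character prefix except ST picks out exactly one command, which is why only STEM needs a third letter.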
@ 18 6
If a command is too long to fit on a single line, end the first line with a
comma, hit [Enter], and continue the command on the next line. For example
>stats pop rainfall / mean sd skewness kurtosis minimum maximum , [Enter]
>range variance sem sum [Enter]
@ 6 5
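The continuation rule described above can be sketched in Python. The function name `join_continued` is hypothetical, and dropping the trailing comma when the lines are joined is an assumption about how the pieces combine.

```python
def join_continued(lines):
    """Merge each line that ends in a comma with the line that follows it."""
    commands = []
    pending = ""
    for raw in lines:
        text = raw.strip()
        if text.endswith(","):
            # Assumption: the continuation comma itself is discarded.
            pending += text[:-1].rstrip() + " "
        else:
            commands.append((pending + text).strip())
            pending = ""
    if pending:
        # A dangling continuation at the end of input is kept as typed so far.
        commands.append(pending.strip())
    return commands
```

Applied to the two lines of the STATS example, the sketch yields one command listing all ten statistics.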
*** MYSTAT Menu ***
MYSTAT displays the names of all the commands you can use on a menu.
We reproduce this menu on the next screen.
@ 5 23
                MYSTAT   A Personal Version of SYSTAT
>>>>>>>>>>  DEMO        HELP        INTRO       SYSTAT      <<<<<<<<<<
----------  ----------  ----------  ----------  ----------  ----------
EDIT        USE         SAVE        PUT         SORT        RANK
SUBMIT      OUTPUT      NOTE        FORMAT      MENU        CHAR
NAMES       WEIGHT      -           -           -           -
----------  ----------  ----------  ----------  ----------  ----------
PLOT        TPLOT       HISTOGRAM   BOX         STEM        TTEST
STATS       TABULATE    PEARSON     SIGN        WILCOXON    FRIEDMAN
KS          CATEGORY    MODEL       ESTIMATE    QUIT
>
Enter HELP or other command. QUIT returns you to DOS.
@ 23 18
To find out what one of these commands does or how to specify it, use the HELP
command. Just type the word HELP and hit [Enter].
>HELP [Enter]
MYSTAT will display very short descriptions of all commands. To obtain more
information about a particular command, type HELP and the name of the command.
MYSTAT then will display a description of the command, a syntax description
of the command and one or more examples of typical uses of the command.
Comments on the examples appear in parentheses (...).
After you read the help description, hit [Enter] to return to the menu.
For example, to learn about the USE command, type
>HELP USE [Enter]
MYSTAT then would display the following screen
@ 18 10
The USE command reads the variables in a MYSTAT file.
USE <file>
Example:
USE MYDATA (reads from MYDATA.SYS in default drive/directory)
USE B:MYDATA (reads MYDATA.SYS from B: drive)
USE '\SYSTAT\NEWDATA.SYS' (fully qualified names must have .SYS extension)
@ 10 16
In syntax descriptions
<file> means any valid MYSTAT file name. Basic MYSTAT file names are one
to eight letters and/or numerals, beginning with a letter.
You can prefix a file name with a drive letter. MYSTAT assumes
a file extension of .SYS for system data files. If you specify a
fully qualified file name, it must appear in quotes and include the
.SYS extension. On a floppy-only system, it's best to leave the
working disk in drive A: and USE and SAVE all data from drive B:
MYDATA
'MYDATA.SYS'
B:NEWDATA
'B:NEWDATA.SYS'
'\DATADIR\MYDATA.SYS'
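The file name rules above can be summarized in a small Python sketch. The function name `resolve_file` is hypothetical, and the sketch only mirrors the rules as stated here (unquoted names get .SYS appended; quoted names are taken literally and must already end in .SYS). It does not check the eight-character limit.

```python
def resolve_file(spec, extension=".SYS"):
    """Turn a <file> specification into the file name it refers to."""
    spec = spec.strip()
    if len(spec) >= 2 and spec[0] == "'" and spec[-1] == "'":
        # Fully qualified name: taken as written, extension required.
        name = spec[1:-1]
        if not name.upper().endswith(extension):
            raise ValueError("a quoted file name must include " + extension)
        return name
    # Basic name, optionally with a drive letter: the extension is assumed.
    return spec + extension
```

For example, `resolve_file("B:NEWDATA")` and `resolve_file("'B:NEWDATA.SYS'")` both give `B:NEWDATA.SYS`.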
@ 16 6
Next, one could ask about the STATS command.
>HELP STATS
which would produce the following screen.
@ 6 17
The STATS command produces basic statistics. If you
choose no options, it will produce N, MINIMUM, MAXIMUM,
MEAN, SD. Otherwise, it will produce just the option(s)
you choose. If you use BY to get subgroup statistics, the
file must be sorted by the grouping variable(s) and BY must
follow all statistics options.
STATS [<var1>,<var2>,<...>]
[/MEAN,SD,SKEWNESS,KURTOSIS,MINIMUM,MAXIMUM,RANGE,VARIANCE,SEM,SUM]
[BY <var3>,<var4>,<...>]
Examples:
STATS (basic statistics for whole file)
STATS VAR1,VAR2 / SEM (standard error of the mean)
STATS / BY GROUPS (basic statistics for cases in each group)
@ 17 21
In syntax descriptions
[...] brackets enclose optional specifications
<var> means any valid MYSTAT variable name. MYSTAT variable names
are one to eight letters, numerals, and/or underscores,
beginning with a letter. If a variable contains character
instead of numeric data, follow the name with a $. Values of
character variables may contain up to 12 characters. MYSTAT also
provides singly subscripted numeric variables with subscripts
up to 99.
VAR_1 numeric variable
NAME$ character variable
MEASURE(3) subscripted numeric variable
ITEM(1-5) range for a subscripted variable,
i.e. ITEM(1)...ITEM(5)
[<var1>,<var2>,<...>] means an optional list of variable names. If you
omit this, MYSTAT will use all the numeric variables in the file.
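The variable naming rules above can be checked with a short Python sketch. The pattern and the helper name `expand` are illustrative only. The sketch assumes that the trailing $ does not count toward the eight characters, that subscripts start at 1, and that only numeric variables take subscripts.

```python
import re

# One to eight letters, numerals, or underscores, beginning with a letter;
# an optional $ marks a character variable; an optional (n) or (m-n) is a subscript.
NAME = re.compile(
    r"^([A-Za-z][A-Za-z0-9_]{0,7})(\$?)(?:\((\d{1,2})(?:-(\d{1,2}))?\))?$"
)


def expand(spec):
    """Validate a <var> specification and expand a subscript range."""
    match = NAME.match(spec.strip())
    if match is None:
        raise ValueError("not a valid variable name: " + spec)
    base, dollar, first, last = match.groups()
    if first is None:
        return [base + dollar]
    if dollar:
        raise ValueError("subscripts are assumed to apply to numeric variables only")
    low = int(first)
    high = int(last) if last is not None else low
    if not 1 <= low <= high <= 99:
        raise ValueError("subscripts must run from 1 to 99 in ascending order")
    return [f"{base}({i})" for i in range(low, high + 1)]
```

Here `expand("ITEM(1-5)")` returns the five names ITEM(1) through ITEM(5), matching the range example in the list above.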
@ 21 11
Having learned the syntax and use of the commands with HELP, we'll
employ USE and STATS to obtain some descriptive statistics on data in
the CITIES.SYS file created by DEMO.
We type
>USE CITIES [Enter]
This tells MYSTAT to read the data from a MYSTAT file named CITIES.SYS from
the default disk drive/directory. In response, MYSTAT clears the screen and
displays the names of all the variables in the file.
@ 11 20
VARIABLES IN MYSTAT FILE ARE:
CITY$ STATE$ POP RAINFALL LOGPOP
DATA IS STORED IN SINGLE PRECISION
Press ENTER <-' or RETURN
@ 20 16
From this display one can see that there are five variables in the file.
Three are numeric and CITY$ and STATE$ are character. There are no array
variables. This file stores data in single precision (approximately 7
decimal digits). This option was chosen when this file was created. The
storage option (single or double precision) does not affect computations,
which always use double precision arithmetic (at least 15 digits of precision).
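The difference between storage precision and computation precision can be seen in a short Python sketch. It assumes a 4-byte IEEE single as a stand-in for single-precision storage; the actual on-disk format of a MYSTAT file is not described in this tutorial. Python floats are doubles, so they play the role of the double precision arithmetic.

```python
import struct


def store_single(value):
    """Round-trip a value through a 4-byte IEEE single (assumed storage format)."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


# A value such as 35.52 is rounded slightly when stored in single precision,
# but any arithmetic on the stored value then proceeds in double precision.
stored = store_single(35.52)
rounding_error = abs(stored - 35.52)
```

Whole numbers of moderate size, such as the POP values in CITIES.SYS, survive the round trip unchanged under this assumption.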
After one presses [Enter], MYSTAT returns to the menu. One then can
enter the STATS command. In DEMO, we showed the output for STATS
with no options. Here, we type
>STATS RAINFALL / MEAN SD RANGE [Enter]
This command requests the mean, standard deviation, and range of one
variable, rainfall. This produces the output in the next screen
@ 16 19
TOTAL OBSERVATIONS: 8
RAINFALL
N OF CASES 8
MEAN 35.520
STANDARD DEV 18.032
RANGE 52.210
Press ENTER <-' or RETURN
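To see where these numbers come from, here is a Python sketch that recomputes them. It uses the RAINFALL values as they appear in the editor worksheet for CITIES.SYS, which are rounded to one decimal place, so the results agree with the screen above only approximately. The function name `describe` is hypothetical, and the n-1 divisor for the variance is an assumption that happens to reproduce the reported standard deviation closely.

```python
import math

# RAINFALL as displayed (one decimal place) in the editor worksheet.
rainfall = [57.0, 7.8, 34.0, 33.9, 14.9, 60.0, 37.7, 38.8]


def describe(values):
    """Compute N, mean, SD, range and standard error of the mean."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance with an n - 1 divisor (assumption, see lead-in).
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd = math.sqrt(variance)
    return {
        "n": n,
        "mean": mean,
        "sd": sd,
        "range": max(values) - min(values),
        "sem": sd / math.sqrt(n),
    }


summary = describe(rainfall)
```

The small differences from the MEAN, STANDARD DEV and RANGE shown on screen are consistent with the worksheet displaying rounded values.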
@ 19 8
After you press [Enter], MYSTAT returns to the menu. Type the command
>QUIT [Enter]
to leave MYSTAT and return to DOS. Before MYSTAT returns you to DOS, it
prints a summary of the commands used. To save this command log
in a file, issue an OUTPUT command before QUIT. (See below.)
@ 8 15
MYSTAT PROCESSING FINISHED
INPUT STATEMENTS FOR THIS JOB:
USE CITIES
STATS RAINFALL / MEAN SD RANGE
A:>
@ 15 17
*** MYSTAT Operation ***
You can operate MYSTAT in three modes.
1 Interactive analyses with a menu. This is the default.
2 Interactive analyses without a menu.
3 Batch mode, where MYSTAT reads a series of commands from a file.
When you first use MYSTAT, you should use the menu to
remind you what commands are available. After you become proficient
with MYSTAT, you can use the MENU command to turn the MYSTAT menu off.
At times you may use the SUBMIT command. This treats commands in a file as
though they were typed on the keyboard. The DEMO command you have seen
before SUBMITs a file containing a series of MYSTAT commands that create a
file named CITIES.SYS and analyze the data in it.
@ 17 13
Output Destination and Appearance
OUTPUT routes output to the console (screen), a file, or the printer.
NOTE allows you to write comments on your output.
FORMAT determines the number of digits (0-9) to the right of the decimal point
in all numerical output. The default value is 3. Use the UNDERFLOW
option to print tiny numbers in exponential form.
CHAR allows you to choose IBM screen/printer graphics characters or generic
characters that will print on any printer.
@ 13 17
*** MYSTAT Data Input, Editing, Transformations ***
The EDIT command starts the MYSTAT full screen data editor. Use the
editor to create new or edit existing MYSTAT files and to create new
or transform existing variables.
Imagine you wished to change an incorrect value in the CITIES.SYS file that
contained the square root of the rainfall for each city. Edit the file
by typing
>EDIT CITIES [Enter]
MYSTAT first reads the file, and then displays the first 15 cases and five
variables of a file in the worksheet. The other cases and variables
still are available but off the screen. The cursor resides at the first case
and first variable of the worksheet.
@ 17 18
_MYSTAT Editor____________________________________________________________
  Case |__CITY$_______STATE$________POP________RAINFALL_______LOGPOP____
     1 |XXXNew York       NY    7164742.0          57.0         15.8
     2 |Los Angeles       CA    3096721.0           7.8         14.9
     3 |    Chicago       IL    2992472.0          34.0         14.9
     4 |     Dallas       TX     974234.0          33.9         13.8
     5 |    Phoenix       AZ     853266.0          14.9         13.7
     6 |      Miami       FL     346865.0          60.0         12.8
     7 | Washington       DC     638432.0          37.7         13.4
     8 |Kansas City       MO     448159.0          38.8         13.0
     9 |
    10 |
    11 |
    12 |
    13 |
    14 |
____15_|________________________________________________________________
@ 18 22
Move around the rows and columns with either Wordstar-like keyboard
cursor commands or PC special keys:
Q            W            E            R
Esc          Home         up arrow     PgUp
(toggle)
A            S            D            F
Ins          <--          -->          Del
(left page)  left arrow   right arrow  (right page)
Z            X            C
End          down arrow   PgDn
Either D [Enter] or [-->] moves the cursor one cell to the right.
If you pass the end of the screen, the worksheet will scroll to the next
case or variable. To change a value or to enter a new value, use these
keys to go to the cell you want, then type the new value and hit [Enter]
or a special cursor key. Enclose character values or variable names in
quotes. Enter a missing value as a period.
@ 22 20
To use MYSTAT EDIT commands, type either Q [Enter] or [Esc]. This
moves the cursor to the command line. Enter the command(s) as usual. To
return to the worksheet enter either Q [Enter] or [Esc].
EDIT commands are:
Q (or Escape key) toggles between worksheet/command line
FIND <expression> moves cursor to selected case
FORMAT <#> sets number of decimals MYSTAT displays
SAVE <filename> saves new/edited data to MYSTAT file
LET <statement> transforms variables
IF <expression> THEN LET <statement> conditionally transforms variables
REPEAT <#> fill a data template with missing values
HELP help for edit commands
QUIT return to main menu
To leave the editor, toggle to the command line and type
>quit [Enter]
@ 20 9
FIND displays the values in the worksheet starting at a specified case.
>FIND RAINFALL = 14.9
would position the cursor at case 5, the first case that meets the
condition. If the first case that meets the condition is not currently
displayed, the editor displays a new set of cases on the screen, starting
with the target case.
@ 9 23
LET and IF ... THEN LET transform existing or create new variables.
LET <var> = <expression>
IF <expression> THEN LET <var> = <expression>
An expression may contain any MYSTAT variables, operators and/or functions.
+      addition                    CASE  current case number
-      subtraction                 URAN  uniform random number
*      multiplication              NRAN  normal random number
/      division
^      exponentiation              INT   integer truncation
                                   SQR   square root
<      less than                   LOG   natural log
<= =<  less than or equal to       EXP   exponential function
=      equal to                    ABS   absolute value
<>     not equal to
>= =>  greater than or equal to    SIN   sine
                                   COS   cosine
AND    logical and                 TAN   tangent
OR     logical or                  ASN   arcsine
                                   ACS   arccosine
CDF    standard normal CDF         ATN   arctangent
IDF    inverse normal CDF          ATH   hyperbolic arctangent (Fisher's Z)
@ 23 15
Transform numeric variables:
>LET X2 = X^2
>LET logit1 = 1 / (1 + EXP(A + B*X) )
>let Z = ATH(r)
Transform numeric variables conditionally:
>IF sex$ = 'male' THEN LET IQ = 0
>if group > 2 then let newgroup = 2
Transform a coded value to a missing value:
>IF a = -9 THEN LET a = .
@ 15 13
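The transformations above can be mimicked in Python to make their case-by-case behavior explicit. This is only a sketch: the helper `let`, the dictionary representation of a case, and the use of None for a missing value are all assumptions, and the sketch does not model how MYSTAT propagates missing values through an expression.

```python
import math

MISSING = None  # stand-in for MYSTAT's missing value, entered as a period


def let(cases, target, expression, condition=None):
    """Apply LET target = expression to every case, or only where condition holds."""
    for case in cases:
        if condition is None or condition(case):
            case[target] = expression(case)
    return cases


cases = [
    {"X": 3.0, "r": 0.5, "a": -9},
    {"X": 2.0, "r": -0.2, "a": 4},
]

let(cases, "X2", lambda c: c["X"] ** 2)            # LET X2 = X^2
let(cases, "Z", lambda c: math.atanh(c["r"]))      # LET Z = ATH(r)
let(cases, "a", lambda c: MISSING,                 # IF a = -9 THEN LET a = .
    condition=lambda c: c["a"] == -9)
```

After these three steps the first case has X2 = 9, Z equal to Fisher's Z of 0.5, and a missing value for a; the second case keeps a = 4.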
Use REPEAT to create a file with random data
>EDIT (invoke the full-screen data editor for a new data set)
Define variable names such as A, B, etc., remembering to put names in quotes.
Hit [Esc] to get to the command line
>REPEAT 20 (create 20 cases with missing values for each variable)
>LET A=URAN (fill the values of variable A with uniform random values)
. . . (etc. for all variables)
>SAVE RANDOM (save made up data to file RANDOM.SYS)
>QUIT (leave the editor)
@ 13 22
*** MYSTAT File Input and Output ***
When you have edited the values and performed the transformations, you
must save the data into a new MYSTAT data file. The SAVE command is
completely parallel to the USE command.
>SAVE b:newfile [Enter]
places the data in a MYSTAT file called NEWFILE.SYS on the B: drive.
You can use MYSTAT files you created in the data editor or with SAVE and
other MYSTAT commands. You can even write your data to a comma-delimited
ASCII file to use with other programs.
USE reads the values of variables in a MYSTAT (or SYSTAT) file.
SAVE saves your data into a MYSTAT file. You must use SAVE in the editor
or before the commands SORT, RANK or ESTIMATE to create a file.
PUT works like SAVE, except that MYSTAT puts data into a raw, comma-
delimited ASCII data file instead of saving it to a MYSTAT file.
@ 22 16
*** Other MYSTAT Data Manipulation ***
You can weight each observation in your file by the value of a variable.
You can also sort the cases in your datasets by as many as 10 numeric and/or
character variables. You can also rank the values of any variables. Use
SAVE before these commands to create a file with sorted or ranked data to
use in other commands.
WEIGHT allows you to specify a weighting variable. MYSTAT truncates the
value of the weighting variable and duplicates the case that many
times before reading the next case.
SORT sorts (reorders) the cases in a file in ascending order on selected
variables.
RANK converts values of specified variables to their ranks.
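The effect of WEIGHT, as described above, can be sketched in Python. The function name `apply_weight` and the dictionary representation of a case are hypothetical. Treating a weight that truncates to zero or less as "drop the case" is an assumption; the tutorial does not say how such weights are handled.

```python
def apply_weight(cases, weight_var):
    """Duplicate each case as many times as its truncated weight."""
    expanded = []
    for case in cases:
        copies = int(case[weight_var])  # int() truncates toward zero
        # Assumption: a weight below 1 contributes no copies of the case.
        expanded.extend([case] * max(copies, 0))
    return expanded


cases = [
    {"x": 1, "w": 2.9},   # truncates to 2 copies
    {"x": 2, "w": 1},     # 1 copy
    {"x": 3, "w": 0.5},   # truncates to 0 copies
]
weighted = apply_weight(cases, "w")
```

Any statistic computed on `weighted` then behaves as if the first case had been entered twice.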
@ 16 8
*** Descriptive Statistics ***
STATS provides complete descriptive statistics on numerical variables
including the sum, mean, standard error of the mean, minimum, maximum,
range, standard deviation, variance, skewness and kurtosis. Use the
BY option to obtain descriptive statistics for subgroups if you
first SORT the file by the grouping variable(s).
@ 8 14
*** Graphical Data Analysis ***
PLOT creates a two-way plot of one or more Y variables on a vertical
scale against an X variable on a horizontal scale. The plotting
symbol can represent a third variable.
HISTOGRAM displays a histogram for one or more variables.
BOX creates a boxplot for one or more variables.
STEM produces a stem-and-leaf diagram for one or more variables.
TPLOT plots a series of data values.
@ 14 6
*** Distributional Forms ***
KS Kolmogorov-Smirnov tests whether a sample came from a specified type of
distribution (such as normal) or whether two variables have the same
distribution.
@ 6 7
*** Frequencies and Contingency Table Analysis ***
TABULATE produces frequency and n-way crosstabulation tables.
For two-way tables TABULATE provides Chi-square test statistics,
association coefficients and PRE statistics with their asymptotic
standard errors.
@ 7 11
*** Independent and Dependent Group Tests ***
TTEST does either dependent (paired) or independent t-tests.
SIGN computes a sign test on all pairs of specified variables.
WILCOXON calculates a Wilcoxon signed-rank test on pairs of variables.
FRIEDMAN computes a Friedman nonparametric analysis of variance
on selected variables.
@ 11 5
*** Correlations ***
PEARSON computes a matrix of Pearson product moment correlations.
Use RANK and PEARSON to compute Spearman rank-order correlations.
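The RANK-then-PEARSON recipe works because a Spearman correlation is a Pearson correlation computed on ranks. The Python sketch below illustrates the idea. The function names are hypothetical, and giving tied values the average of their ranks is an assumption; the tutorial does not say how RANK treats ties.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank-order correlation: Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Any strictly increasing relationship, linear or not, gives a Spearman correlation of 1 in this sketch.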
@ 5 23
*** Linear Models ***
Use the CATEGORY, MODEL, SAVE and ESTIMATE commands to analyze regression,
ANOVA and ANACOVA models including those with factor by covariate interactions
and unbalanced designs. For unbalanced designs MYSTAT uses Yates' method of
weighted squares of means. You can also perform extensive residual analyses.
CATEGORY specifies the number of categories for one or more variables used
as categorical predictors (factors). A CATEGORY variable must
have integer values from 1 to k, where k is the no. of categories.
MODEL specifies a model to estimate. If you specify CONSTANT (an
intercept term), it must be first.
SAVE saves a file containing model variables, estimates, residuals,
standard error of prediction, leverage, Cook's D and studentized
residuals. MYSTAT names these ESTIMATE, RESIDUAL, SEPRED, LEVERAGE,
COOK and STUDENT. MYSTAT lists cases with extreme studentized
residuals or leverage and prints the Durbin-Watson statistic
and autocorrelation coefficient.
ESTIMATE causes MYSTAT to estimate the specified model.
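For readers unfamiliar with the serial correlation diagnostics mentioned above, here is a Python sketch of the textbook Durbin-Watson statistic and a lag-one autocorrelation of the residuals. The function names are hypothetical, and the autocorrelation formula assumes residuals with mean zero; the exact formulas MYSTAT uses are not given in this tutorial.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squares."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


def lag1_autocorrelation(residuals):
    """First-order autocorrelation, assuming the residuals have mean zero."""
    numerator = sum(a * b for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

Values of the Durbin-Watson statistic near 2 suggest little serial correlation; values near 0 or 4 suggest positive or negative serial correlation, respectively.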
@ 23 15
Regression Analysis
Simple linear regression with no constant (intercept) in the model:
>USE DATAFILE
>MODEL Y = X
>ESTIMATE
Multiple linear regression with constant and save file for residuals analysis:
>USE NEWDATA
>SAVE RESIDS
>MODEL Y = CONSTANT + X + Z
>ESTIMATE
@ 15 18
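The difference between a model with and without CONSTANT, as in the regression examples above, can be illustrated for one predictor with ordinary least squares in Python. This is a sketch of the standard formulas, not of MYSTAT's estimation code, and the function names are hypothetical.

```python
def fit_through_origin(x, y):
    """Least-squares slope for MODEL Y = X (no constant)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def fit_with_constant(x, y):
    """Least-squares (intercept, slope) for MODEL Y = CONSTANT + X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data lying exactly on y = 1 + 2x.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
intercept, slope = fit_with_constant(x, y)
slope_no_constant = fit_through_origin(x, y)
```

With these made-up data the model with a constant recovers the line exactly, while forcing the line through the origin gives a steeper slope (70/30), which shows why omitting CONSTANT changes the estimates.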
ANOVA
One-way design, factor SEX has two levels with values 1 and 2:
>CATEGORY SEX=2
>MODEL IQ=CONSTANT+SEX
>ESTIMATE
Two-way ANOVA with an A by B interaction term (A*B):
>CATEGORY A=2,B=3
>MODEL Y = CONSTANT + A + B + A*B
>ESTIMATE
A has two levels, 1 and 2, and B has three levels, 1, 2, and 3.
@ 18 12
Analysis of Covariance
A and B are factors, C is a covariate:
>CATEGORY A=2,B=3
>MODEL Y = CONSTANT + A + B + C + A*B + A*C + A*B*C
>ESTIMATE
>MODEL Y = CONSTANT+C+A+B+A*B
>ESTIMATE
The first model includes factor by covariate interactions; the second does not.
@ 12 20
Residual Analyses
If you specify SAVE with a linear model, MYSTAT automatically identifies
possible outliers and reports serial correlation diagnostics. Use the
variables in the residuals file you save to perform additional tests.
>MODEL Y = CONSTANT + X1 + X2
>SAVE RESIDS (save information in RESIDS.SYS file)
>ESTIMATE (estimate model and get automatic diagnostics)
>USE RESIDS (get file containing residual and model variables)
>PLOT RESIDUAL*ESTIMATE (scatterplot of residuals and estimated values)
>PLOT RESIDUAL*X1 (assess nonlinearity in relation of X1 to Y)
>TPLOT RESIDUAL (assess possible serial pattern to residuals)
>STEM RESIDUAL (look at shape of residual distribution)
>KS RESIDUAL /NORMAL (test normality of residuals)
>BOX LEVERAGE (see if any leverage values extreme relative to others)
Use your imagination!
@ 20 0