Wednesday, October 17, 2007

Week 16: SPSS!!

Sorry for the late posting! Here's my post for week 16. For the past few weeks I have been doing the statistical analysis for both of my gene projects using the SPSS software. SPSS stands for Statistical Package for the Social Sciences; it provides almost everything you need to perform an analysis.

Common Statistics Performed in SPSS
SPSS comprises many statistical tests, most of which were covered in the year 1 math stats module. Two of the more commonly used features are descriptive statistics and bivariate/multivariate statistics.

Descriptive statistics
This is the most fundamental and most frequently used statistical feature. It summarizes the samples' results and presents them as graphs or tables, so it is mainly used for quick, basic analysis. Here are some of the procedures used in descriptive statistics:
Frequencies
A simple procedure that counts how often each value occurs and computes the mean, median, mode and SD of the results (variables).
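
To make the Frequencies idea concrete, here is a minimal sketch in Python (not SPSS syntax) that uses only the standard library. The glucose readings are made-up numbers for illustration, not data from my projects.

```python
import statistics
from collections import Counter

# Hypothetical blood glucose readings (mmol/L), invented for illustration.
glucose = [5.1, 5.4, 5.4, 6.0, 6.3, 5.4, 7.2, 6.0]

frequency_table = Counter(glucose)       # how often each value occurs
mean = statistics.mean(glucose)
median = statistics.median(glucose)
modes = statistics.multimode(glucose)    # a list, since data can have several modes
sd = statistics.stdev(glucose)           # sample SD (n - 1 in the denominator)

# Print a small frequency table followed by the summary statistics.
for value, count in sorted(frequency_table.items()):
    print(f"{value:>5}: {count}")
print(f"mean={mean:.2f} median={median:.2f} modes={modes} sd={sd:.2f}")
```

SPSS produces the same kind of summary as a table in its output file.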

Cross tabulation
A cross tab provides information on the distribution of 2 or more variables at the same time and is usually presented in a table format. This is what makes a cross tab different from a frequency table, which looks at one variable at a time. Some of the statistical tests used together with a cross tab are chi-square (which tests whether the association between the variables is statistically significant), and the contingency coefficient and phi coefficient (which both measure how strong the association between the variables is; phi is meant for 2 x 2 tables).
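
As a rough illustration of what happens behind a cross tab, here is a sketch in Python (not SPSS syntax) that computes the Pearson chi-square, the phi coefficient and the contingency coefficient by hand. The 2 x 2 counts and the genotype/disease labels are hypothetical, and no continuity correction is applied.

```python
import math

# Hypothetical 2 x 2 cross tab (counts invented for illustration):
# rows = genotype (carrier, non-carrier), columns = disease (yes, no)
observed = [[30, 20],
            [10, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over all cells of (observed - expected)^2 / expected
chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (count - expected) ** 2 / expected

phi = math.sqrt(chi_square / n)                            # strength, 2 x 2 tables
contingency_c = math.sqrt(chi_square / (chi_square + n))   # strength, any table size

# 3.841 is the chi-square critical value for df = 1 at the 0.05 level.
significant = chi_square > 3.841
print(f"chi-square={chi_square:.3f} phi={phi:.3f} C={contingency_c:.3f} significant={significant}")
```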

Bivariate/Multivariate Statistics
As the name implies, bivariate statistics deal with exactly 2 variables. Any statistics involving more variables fall under multivariate statistics. Here is a list of some of the statistical tests used:


t-test
Commonly used when the sample size is small or when the population standard deviation is unknown. It is used to decide whether a null hypothesis should be accepted or rejected. For example, the null hypothesis could be: mean of A = mean of B (the means of the two variables are equal). After a series of calculations, if the t value is smaller than the critical value, the null hypothesis is accepted (mean of A = mean of B); otherwise it is rejected.
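
The "series of calculations" can be sketched in Python (not SPSS syntax) as below. This assumes an independent-samples t-test with equal variances; the two groups are made-up numbers, and the critical value 2.306 is the standard two-tailed table value for df = 8 at the 0.05 level.

```python
import math
import statistics

# Hypothetical measurements for two independent groups (invented numbers).
group_a = [5.0, 6.0, 7.0, 8.0, 9.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]

n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

# Pooled variance, assuming both groups share the same population variance.
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df

# t = difference in means divided by its standard error
standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_value = (mean_a - mean_b) / standard_error

# Two-tailed critical value for df = 8 at the 0.05 level (from a t table).
T_CRITICAL = 2.306
reject_null = abs(t_value) > T_CRITICAL
print(f"t={t_value:.3f} df={df} reject null hypothesis: {reject_null}")
```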


ANOVA
An ANOVA test compares group means by analysing their variances. Unlike the t-test, ANOVA uses a ratio of variances (the F value) instead of a t value, and it can assess more than two groups together.
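
Here is a small sketch of a one-way ANOVA in Python (not SPSS syntax) showing where the F value comes from. The three groups are hypothetical numbers, and 3.89 is the standard table value for F(2, 12) at the 0.05 level.

```python
import statistics

# Hypothetical measurements for three groups (invented numbers).
groups = [
    [5.0, 6.0, 7.0, 8.0, 9.0],
    [2.0, 3.0, 4.0, 5.0, 6.0],
    [4.0, 5.0, 6.0, 7.0, 8.0],
]

all_values = [x for group in groups for x in group]
grand_mean = statistics.mean(all_values)

# Between-group sum of squares: how far each group mean is from the grand mean.
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of the values around their own group mean.
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F is the ratio of the between-group variance to the within-group variance.
f_value = (ss_between / df_between) / (ss_within / df_within)

F_CRITICAL = 3.89  # F(2, 12) at the 0.05 level, from an F table
reject_null = f_value > F_CRITICAL
print(f"F({df_between}, {df_within})={f_value:.3f} reject null hypothesis: {reject_null}")
```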


So how and what statistics should be applied to different data analyses?
The answer depends on what type of "story" you wish to tell in your research. For instance, if you wish to find out the relationship between drug X and blood glucose level, then a linear regression approach would be ideal.

Linear regression is a technique used to determine the relationship between a dependent variable and one or more independent variables. In this case, blood glucose level is your dependent variable while the dosage of drug X is the independent variable.
Once you have identified your variables you can start analyzing them using both the descriptive statistics and the bivariate/multivariate statistics.
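
To show what a linear regression actually fits, here is a minimal least-squares sketch in Python (not SPSS syntax). The doses and glucose readings for drug X are invented for illustration, and `predict_glucose` is a hypothetical helper name.

```python
import statistics

# Hypothetical data: dose of drug X (mg) and blood glucose level (mmol/L).
dose = [0.0, 10.0, 20.0, 30.0, 40.0]        # independent variable
glucose = [9.0, 8.1, 7.0, 6.1, 4.8]         # dependent variable

mean_x = statistics.mean(dose)
mean_y = statistics.mean(glucose)

# Least-squares estimates: slope = Sxy / Sxx, intercept = mean_y - slope * mean_x
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dose, glucose))
s_xx = sum((x - mean_x) ** 2 for x in dose)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict_glucose(dose_mg):
    """Predicted glucose level for a given dose, using the fitted line."""
    return intercept + slope * dose_mg


print(f"glucose = {intercept:.2f} + ({slope:.3f}) * dose")
print(f"predicted glucose at 25 mg: {predict_glucose(25.0):.2f}")
```

A negative slope here would mean that glucose level falls as the dose goes up.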

Another technique used for data analysis is called factor analysis. This approach looks at the correlations among a set of measured variables and indirectly identifies a smaller number of underlying factors that explain them. For example, by looking at blood glucose level together with other measured variables, a common underlying factor behind them could be identified.

Other techniques include K-means cluster analysis, hierarchical cluster analysis and ordinal regression.

Lost in all this statistics info?? Don't worry, maybe these definitions can help.
Dependent variables
A dependent variable is an outcome, something you measure that can be influenced by the independent variable. Usually it is something which cannot be controlled directly.

Independent variables
An independent variable is something which you can control or use as a predictor in an experiment.

Mean
The mean is the average.

Null hypothesis
A null hypothesis is the statement presumed to be true before the statistical analysis, for example that there is no difference between two means.

Data analysis

For SPSS to analyze data or sample results, syntax is often used. Simply type in the code (instructions) you want SPSS to carry out and select Run. The results will then appear in the output file as graphs, tables, histograms, etc. Based on these results, interpretations can be made.

That's all for this week!! Stay tuned for next week's SIP sharing!!
If you have any questions on my post, feel free to leave a comment.

Avery (0503292E)
TG02

3 comments:

royal physicians said...

Avery, you are my hero!! Do you know, in SPSS, what should be used to check the correlation between 2 proteins in the same kind of tissue sample?? Is it the Spearman R test? Can I use chi-square??

Kangting
0503331A

Vino said...

hello

In your SPSS program, do they teach you how to perform the Kruskal-Wallis test?? It's quite similar to one-way ANOVA. Any idea?


Vino
TG02

royal physicians said...

To kang ting
Chi-square only looks at whether the association between the proteins is significant. As for the Spearman R test, all I know is that it looks at the rank-based (monotonic) relationship rather than a strictly linear one. Maybe you can try a univariate/multinomial regression analysis to see the associations.

To vino,
I am not sure how to perform a Kruskal-Wallis test, but if it is similar to the one-way ANOVA then perhaps you can try going to the Analyze menu. However, due to the differences between SPSS versions, it might not be the same in the SPSS you use.