March 24, 2008...3:54 pm

Mathematics for Data Analysis – Part 1

1. Basic Statistics

Statistics helps in collecting, classifying, and interpreting data so that useful information can be extracted from it.

Simple statistical measures used for fundamental analysis are: the mean (the arithmetic average of the data), the median (the value that splits the ordered data into two equal halves), the mode (the value that occurs most frequently), the variance (the average squared deviation of the data points from the mean, a measure of spread), skewness (the asymmetry of the distribution), kurtosis (how heavy the tails of the distribution are, often described loosely as its peakedness), and correlation (the strength of the relationship between variables).
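To make these measures concrete, here is a minimal sketch that computes them with Python's standard `statistics` module. The `describe` helper and the `sample` data are made up for illustration. Skewness and kurtosis are computed as the standardized third and fourth moments of the population, which is only one of several conventions in use.

```python
import statistics

def describe(data):
    """Return basic descriptive statistics for a list of numbers.

    Hypothetical helper for illustration; uses population formulas.
    """
    n = len(data)
    mean = statistics.fmean(data)
    # Population variance: average squared distance from the mean.
    variance = statistics.pvariance(data, mu=mean)
    std = variance ** 0.5
    # Standardized third moment (skewness) and fourth moment (kurtosis).
    skewness = sum((x - mean) ** 3 for x in data) / (n * std ** 3)
    kurtosis = sum((x - mean) ** 4 for x in data) / (n * std ** 4)
    return {
        "mean": mean,
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }

# Made-up sample with one large value on the right.
sample = [2, 3, 3, 4, 5, 5, 5, 6, 12]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With this convention a normal distribution has a kurtosis of 3, so some libraries report "excess kurtosis" (this value minus 3) instead.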

SKEWNESS:

The effect of skew on mean and median:

Here the distribution shows positive skew. The mean is larger than the median.

basicstatsimg4.gif

Here the distribution shows negative skew. The mean is smaller than the median.

basicstatsimg5.gif
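The effect of skew on the mean and median can be checked on a small example. The sketch below uses made-up data: `right_skewed` has a long tail of large values, and `left_skewed` is simply its mirror image. This is the typical pattern rather than a strict rule, since unusual distributions can break it.

```python
import statistics

# Made-up sample with a long right tail (positive skew).
right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]
# Mirroring the values flips the tail to the left (negative skew).
left_skewed = [-x for x in right_skewed]

for label, data in [("positive skew", right_skewed),
                    ("negative skew", left_skewed)]:
    mean = statistics.fmean(data)
    median = statistics.median(data)
    relation = "larger" if mean > median else "smaller"
    print(f"{label}: mean={mean}, median={median}, mean is {relation}")

right_mean = statistics.fmean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.fmean(left_skewed)
left_median = statistics.median(left_skewed)
```

The single extreme value pulls the mean toward the tail, while the median only depends on the middle of the ordered data and barely moves.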

The ultimate goal of most research or scientific analysis is to find relations between variables:

  • The Response variable
  • The Independent variables

Correlational research involves measuring such relations directly.

What is the Correlation?

basicstatsimg6.jpg

Correlation is a measure of the relation between two or more variables. Statisticians have developed many measures of the magnitude of relationships between variables. The choice of a specific measure in given circumstances depends on the number of variables involved, the measurement scales used, the nature of the relations, etc. Almost all of them, however, follow one general principle: they attempt to evaluate the observed relation by comparing it to the “maximum imaginable relation” between those specific variables.

Correlation coefficients can range from -1.00 to +1.00. A value of -1.00 represents a perfect negative correlation, while a value of +1.00 represents a perfect positive correlation. A value of 0.00 represents a lack of correlation.
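As an illustration, here is a minimal sketch of one common measure, the Pearson correlation coefficient. The post does not say which coefficient it has in mind, so this choice is an assumption. The `pearson_r` function and the data are made up for the example. The coefficient divides the observed co-variation of the two variables by the largest co-variation they could possibly have given their individual spreads, which is why it always falls between -1 and +1.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Hypothetical helper for illustration, written with plain Python.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Observed co-variation of x and y around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Spread of each variable; their product is the maximum
    # possible magnitude of the co-variation.
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return co_variation / (spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # y rises with x
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # y falls as x rises
r_none = pearson_r(xs, [2, 1, 3, 1, 2])        # no linear trend
print(r_positive, r_negative, r_none)
```

Note that a coefficient near zero only rules out a linear relation. The two variables can still be related in some other way.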

Coming up next… Chi-Squared Test
