Hi. Today we'll start to talk about statistics. What exactly do we mean when we use the word "statistics"? There are two main interpretations: the data and the method. The data interpretation means descriptive statistics: the study of how to summarize a dataset and describe it with a few of its most important numbers. The method interpretation means statistical inference: making judgments about a larger group when we have data from a smaller group. Statistical inference will be considered in later chapters; this week we will cover only descriptive statistics.

There are two fundamental notions in statistics: the population and the sample. It is crucial to know the difference between them. A population is defined as all members of some specified group. The descriptive measures of a population are called parameters. We are mainly interested in several types of parameters, such as the mean value, the range, the variance and some others.

There is a small problem with the population: it is often hard to obtain for investigation. Maybe studying it is too costly or takes too much time, or maybe some members of the population are hard to reach. This problem can be solved by using a small subset of the population, which is called a sample. All descriptive measures of a sample are called sample statistics. Here we will have the sample mean, the sample range and the sample variance. Note that the values of sample statistics will differ from sample to sample, while each population parameter has a single value for the whole population.

We have successfully solved the problem of the low availability of the population for investigation. But sample studies actually hide problems of their own. First of all, it is quite hard to obtain a good sample. There are a lot of requirements for samples, and believe me, meeting them is not easy. The second problem is that we have to be very careful when we generalize any sample result to the whole population.
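The point that sample statistics change from sample to sample, while a population parameter stays fixed, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population below is simulated with made-up numbers, and the names `population`, `population_mean` and `sample_means` are hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: simulated yearly incomes of 100,000 people
# (made-up numbers, not real data).
population = [random.lognormvariate(11, 0.5) for _ in range(100_000)]

# Population parameter: a single value for the whole population.
population_mean = statistics.fmean(population)

# Sample statistics: every sample of 100 people gives its own mean.
sample_means = [
    statistics.fmean(random.sample(population, 100)) for _ in range(5)
]

print(f"population mean: {population_mean:.0f}")
for i, sample_mean in enumerate(sample_means, start=1):
    print(f"sample {i} mean:  {sample_mean:.0f}")
```

Each run of the loop draws different people, so the five sample means differ from one another, while the population mean is computed once and does not move.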
We will discuss these problems and their solutions in later chapters.

Now let's look at an example. All customers of the company Apple, all over the world, are a population, while 100 Apple customers who participate in a survey are a sample. The average yearly income of all customers is a population parameter, and honestly, we will never know it. It is too hard to obtain. What we can find is the sample statistic, which is the average yearly income of the people in our sample. This sample average would change if we changed the sample, that is, if we took other people. But the population's average yearly income would stay the same.

Let's summarize and compare populations and samples. A population includes all members of some group, while a sample is a subset of the group. Descriptive measures of populations are called population parameters and do not depend on a sample. Descriptive measures of samples are called sample statistics and do depend on the sample. The main problem with a population is that it is hard to obtain for investigation, and the main problem with a sample is that it is hard to generalize the results we obtain.

We proceed with measurement scales. We will consider four types of scales: nominal, ordinal, interval and ratio. Nominal scales split the data into groups without ranking. This is the weakest level of measurement. For example, we can categorize stock mutual funds by their objectives: growth funds, value funds and blend funds. If we number these categories, the number 2 is greater than the number 1, but we cannot say that value funds are better than growth ones; they just have different investment goals.

Ordinal scales contain ranking. If we split funds into categories according to their yearly performance, then we can compare them: the funds in the first group have better performance than the funds in the second group. But this scale cannot compare the difference between the first two groups with the difference between groups number 3 and 4.
This means that we cannot perform any arithmetic operations on ordinal scales, because we know which category is better, but we don't know how much better.

The problem of arithmetic operations is partially solved in the interval scale. It provides both ranking and equal differences between scale values. The usual example of an interval scale is the Celsius temperature. We can say that the difference between 20 degrees and 19 degrees is the same as the difference between six degrees and five degrees. But we never say that plus 20 is four times warmer than plus five. This happens because zero degrees on the Celsius scale does not mean the absence of temperature; it is just the point at which water freezes. That is why we compare values on an interval scale only additively, not proportionally.

Finally, the most powerful type of scale is the ratio scale. It has all the benefits of the interval scale, but it also has a true zero as its point of origin. Plain examples of ratio scales are money and returns. Here, $0 means that we have no money, it is the absence of money, and thanks to this, $15 is really three times better than $5, because we can buy three times as many goods. We can do arithmetic operations with ratio scales, and this is quite important for statistical studies. Here, on this slide, you can try to determine the scales by yourself and discuss them with the other students.

Now let's summarize. Nominal scales give categories without ranking. Ordinal scales have ranking but still do not allow arithmetic operations. Interval scales provide equal differences between scale values, but no proportional comparison. Ratio scales have all the benefits of the previous scales and also have a true zero, and this is the best scale.
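The difference between interval and ratio scales can be checked with a little arithmetic. The sketch below is an illustration, not part of the lecture: it brings in the Kelvin scale (an assumption added here, because Kelvin does have a true zero) and the helper name `celsius_to_kelvin` is hypothetical. On an interval scale, differences survive a change of units but ratios do not; on a ratio scale such as money, ratios survive too.

```python
def celsius_to_kelvin(celsius):
    """Shift a Celsius reading to the Kelvin scale, whose zero is a true zero."""
    return celsius + 273.15


warm_c, cool_c = 20.0, 5.0

# Differences are meaningful on an interval scale: the 15-degree gap
# is the same whether we measure in Celsius or in Kelvin.
diff_c = warm_c - cool_c
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: "4 times" in Celsius is only an artifact of where zero sits.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Money is a ratio scale: $15 vs $5 is "3 times" in dollars and in cents.
dollars_ratio = 15 / 5
cents_ratio = 1500 / 500

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio:      {ratio_c:.2f} in Celsius vs {ratio_k:.3f} in Kelvin")
print(f"money:      {dollars_ratio:.1f} in dollars vs {cents_ratio:.1f} in cents")
```

The Celsius ratio of 4 shrinks to roughly 1.05 once the zero point is moved, which is why plus 20 is not "four times warmer" than plus five, while the money ratio stays at 3 in any currency unit.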