Descriptive Statistics, Exercises of Accounting

Typology: Exercises

2020/2021

Uploaded on 09/23/2021

therese-de-la-torre

Descriptive Statistics
Descriptive statistics are used to describe the basic features of the data in a study. They provide
simple summaries about the sample and the measures. Together with simple graphics analysis, they
form the basis of virtually every quantitative analysis of data.
Descriptive statistics are typically distinguished from inferential statistics. With descriptive statistics
you are simply describing what is or what the data shows. With inferential statistics, you are trying to
reach conclusions that extend beyond the immediate data alone. For instance, we use inferential
statistics to try to infer from the sample data what the population might think. Or, we use inferential
statistics to make judgments of the probability that an observed difference between groups is a
dependable one or one that might have happened by chance in this study. Thus, we use inferential
statistics to make inferences from our data to more general conditions; we use descriptive statistics
simply to describe what’s going on in our data.
Descriptive Statistics are used to present quantitative descriptions in a manageable form. In a
research study we may have lots of measures. Or we may measure a large number of people on
any measure. Descriptive statistics help us to simplify large amounts of data in a sensible way. Each
descriptive statistic reduces lots of data into a simpler summary. For instance, consider a simple
number used to summarize how well a batter is performing in baseball, the batting average. This
single number is simply the number of hits divided by the number of times at bat (reported to three
significant digits). A batter who is hitting .333 is getting a hit one time in every three at bats. One
batting .250 is hitting one time in four. The single number describes a large number of discrete
events. Or, consider the scourge of many students, the Grade Point Average (GPA). This single
number describes the general performance of a student across a potentially wide range of course
experiences.
Every time you try to describe a large set of observations with a single indicator you run the risk of
distorting the original data or losing important detail. The batting average doesn’t tell you whether the
batter is hitting home runs or singles. It doesn’t tell whether she’s been in a slump or on a streak.
The GPA doesn’t tell you whether the student was in difficult courses or easy ones, or whether they
were courses in their major field or in other disciplines. Even given these limitations, descriptive
statistics provide a powerful summary that may enable comparisons across people or other units.
Univariate Analysis
Univariate analysis involves the examination across cases of one variable at a time. There are three
major characteristics of a single variable that we tend to look at:
• the distribution
• the central tendency
• the dispersion
In most situations, we would describe all three of these characteristics for each of the variables in
our study.
The Distribution

The distribution is a summary of the frequency of individual values or ranges of values for a variable.

The simplest distribution would list every value of a variable and the number of persons who had

each value. For instance, a typical way to describe the distribution of college students is by year in

college, listing the number or percent of students at each of the four years. Or, we describe gender

by listing the number or percent of males and females. In these cases, the variable has few enough

values that we can list each one and summarize how many sample cases had the value. But what

do we do for a variable like income or GPA? With these variables there can be a large number of

possible values, with relatively few people having each one. In this case, we group the raw scores

into categories according to ranges of values. For instance, we might look at GPA according to the

letter grade ranges. Or, we might group income into four or five ranges of income values.

Category Percent

Under 35 years old 9%

One of the most common ways to describe a single variable is with a frequency distribution.

Depending on the particular variable, all of the data values may be represented, or you may group

the values into categories first (e.g., with age, price, or temperature variables, it would usually not be

sensible to determine the frequencies for each value. Rather, the values are grouped into ranges and

the frequencies determined.). Frequency distributions can be depicted in two ways, as a table or as

a graph. The table above shows an age frequency distribution with five categories of age ranges

defined. The same frequency distribution can be depicted in a graph as shown in Figure 1. This type

of graph is often referred to as a histogram or bar chart.

Figure 1. Frequency distribution bar chart.
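Grouping raw values into ranges and counting the cases in each range can be sketched in a few lines of Python. This is only an illustration: the ages and the five range boundaries below are hypothetical and are not the data behind the table or figure above.

```python
from collections import Counter

# Hypothetical ages, invented for illustration; not data from the text.
ages = [22, 31, 37, 41, 44, 48, 52, 55, 58, 63, 67, 71]

# Hypothetical ranges: (label, lower bound inclusive, upper bound exclusive).
bins = [
    ("Under 35", 0, 35),
    ("35-44", 35, 45),
    ("45-54", 45, 55),
    ("55-64", 55, 65),
    ("65 and over", 65, 200),
]

def categorize(age):
    """Return the label of the range that contains this age."""
    for label, low, high in bins:
        if low <= age < high:
            return label
    raise ValueError(f"age {age} falls outside every range")

# Frequency of each category.
counts = Counter(categorize(a) for a in ages)

# Percent of cases in each category, in the order the ranges were defined.
percents = {label: 100 * counts[label] / len(ages) for label, _, _ in bins}

for label, pct in percents.items():
    print(f"{label:<12} {counts[label]:>3} {pct:5.1f}%")
```

The same counts could feed a bar chart; the table and the graph are two views of one frequency distribution.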

Distributions may also be displayed using percentages. For example, you could use percentages to

describe the:

• percentage of people in different income levels

• percentage of people in different age ranges

• percentage of people in different ranges of standardized test scores

Central Tendency

To compute the standard deviation, we first find the distance between each value and the mean. We know from above that the mean is 20.875. So, the differences from the mean are:

Notice that values that are below the mean have negative discrepancies and values above it have positive ones. Next, we square each discrepancy:

Now, we take these “squares” and sum them to get the Sum of Squares (SS) value. Here, the sum is 350.875. Next, we divide this sum by the number of scores minus 1. Here, the result is 350.875 / 7 = 50.125. This value is known as the variance. To get the standard deviation, we take the square root of the variance (remember that we squared the deviations earlier). This would be SQRT(50.125) = 7.079901129253.

Although this computation may seem convoluted, it’s actually quite simple. To see this, consider the

formula for the standard deviation:

\sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}

where:

• X is each score,

• X̄ is the mean (or average),

• n is the number of values,

• Σ means we sum across the values.

In the top part of the ratio, the numerator, we see that each score has the mean subtracted from it,

the difference is squared, and the squares are summed. In the bottom part, we take the number of

scores minus 1. The ratio is the variance and the square root is the standard deviation. In English,

we can describe the standard deviation as:

the square root of the sum of the squared deviations from the mean divided by the number of scores

minus one.
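The steps in this formula can be traced in a short Python sketch. The eight scores below are an assumption: they are not listed in this text, and were chosen because they reproduce the mean (20.875), variance (50.125) and standard deviation reported here.

```python
import math

# ASSUMED scores: not given in the text, picked because they reproduce
# the reported mean, sum of squares, variance and standard deviation.
scores = [15, 20, 21, 20, 36, 15, 25, 15]

n = len(scores)
mean = sum(scores) / n

# Distance of each score from the mean (negative below, positive above).
deviations = [x - mean for x in scores]

# Square each deviation, then add them up to get the Sum of Squares (SS).
squared = [d ** 2 for d in deviations]
sum_of_squares = sum(squared)

# Divide by the number of scores minus 1 to get the variance.
variance = sum_of_squares / (n - 1)

# The standard deviation is the square root of the variance.
std_dev = math.sqrt(variance)

print("mean:", mean)
print("deviations:", deviations)
print("sum of squares:", sum_of_squares)
print("variance:", variance)
print("standard deviation:", round(std_dev, 4))
```

Dividing by n - 1 rather than n gives the sample variance, which matches the "/ 7" in the hand calculation for eight scores.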

Although we can calculate these univariate statistics by hand, it gets quite tedious when you have

more than a few values and variables. Every statistics program is capable of calculating them easily

for you. For instance, I put the eight scores into SPSS and got the following table as a result:

Metric               Value

N                    8

Mean                 20.875

Median               20

Mode                 15

Standard Deviation   7.0799

Variance             50.125

Range                21

which confirms the calculations I did by hand above.

The standard deviation allows us to reach some conclusions about specific scores in our distribution.

Assuming that the distribution of scores is normal or bell-shaped (or close to it!), the following

conclusions can be reached:

• approximately 68% of the scores in the sample fall within one standard deviation of the mean

• approximately 95% of the scores in the sample fall within two standard deviations of the mean

• approximately 99% of the scores in the sample fall within three standard deviations of the mean

For instance, since the mean in our example is 20.875 and the standard deviation is 7.0799, we can from the above statement estimate that approximately 95% of the scores will fall in the range of 20.875 - (2 × 7.0799) to 20.875 + (2 × 7.0799), or between 6.7152 and 35.0348. This kind of

information is a critical stepping stone to enabling us to compare the performance of an individual on

one variable with their performance on another, even when the variables are measured on entirely

different scales.
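The ranges implied by these rules of thumb are simple arithmetic on the mean and standard deviation. The sketch below uses the two values from the example; the helper name `interval` is just an illustrative choice, and the percentages only apply if the scores are roughly normally distributed.

```python
# Values taken from the worked example in the text.
mean = 20.875
std_dev = 7.0799

def interval(k):
    """Return (low, high) for the range within k standard deviations of the mean."""
    return (mean - k * std_dev, mean + k * std_dev)

# Approximate share of scores expected in each range under a normal distribution.
for k, share in [(1, 68), (2, 95), (3, 99)]:
    low, high = interval(k)
    print(f"about {share}% of scores between {low:.4f} and {high:.4f}")
```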

Next topic

Your data set is the collection of responses to the survey. Now you can use descriptive statistics

to find out the overall frequency of each activity (distribution), the averages for each activity

(central tendency), and the spread of responses for each activity (variability).

Frequency distribution

A data set is made up of a distribution of values, or scores. In tables or graphs, you can

summarize the frequency of every possible value of a variable in numbers or percentages.

• Simple frequency distribution table

• Grouped frequency distribution table

For the variable of gender, you list all possible answers in the left-hand column. You count the number or percentage of responses for each answer and display it in the right-hand column.

Gender Number

Man 182

Woman 235

No answer 27

From this table, you can see that more women than men took part in the study.
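The same table can also be expressed in percentages. A minimal sketch, using the three counts from the table above and rounding to one decimal place (the rounding is a choice made here, not something the text specifies):

```python
# Counts copied from the gender table in the text.
counts = {"Man": 182, "Woman": 235, "No answer": 27}

total = sum(counts.values())

# Share of all responses for each answer, as a percentage.
percents = {answer: round(100 * n / total, 1) for answer, n in counts.items()}

for answer, n in counts.items():
    print(f"{answer:<10} {n:>4} {percents[answer]:>5}%")
```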

Measures of central tendency

Measures of central tendency estimate the center, or average, of a data set.

The mean, median and mode are 3 ways of finding the average.

Here we will demonstrate how to calculate the mean, median, and mode using the first 6

responses of our survey.

• Mean

• Median

• Mode

The mean, or M, is the most commonly used method for finding the average.

Standard deviations of visits to the library in the past year

In the table below, you complete Steps 1 through 4.

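Raw data    Deviation from mean    Squared deviation

15          15 - 9.5 = 5.5         30.25

3           3 - 9.5 = -6.5         42.25

12          12 - 9.5 = 2.5         6.25

0           0 - 9.5 = -9.5         90.25

24          24 - 9.5 = 14.5        210.25

3           3 - 9.5 = -6.5         42.25

M = 9.5     Sum = 0                Sum of squares = 421.5

Step 5: 421.5/5 = 84.3

Step 6: √84.3 = 9.18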

From learning that s = 9.18, you can say that on average, each score deviates from the mean by

9.18 points.

Variance

The variance is the average of squared deviations from the mean. Variance reflects the degree of

spread in the data set. The more spread the data, the larger the variance is in relation to the mean.

To find the variance, simply square the standard deviation. The symbol for variance is s².

Variance of visits to the library in the past year

Data set: 15, 3, 12, 0, 24, 3

s = 9.18

s² = 84.3

Univariate descriptive statistics

Univariate descriptive statistics focus on only one variable at a time. It’s important to examine

data from each variable separately using multiple measures of distribution, central tendency and

spread. Programs like SPSS and Excel can be used to easily calculate these.

                     Visits to the library

N                    6

Mean                 9.5

Median               7.5

Mode                 3

Standard deviation   9.18

Variance             84.3

Range                24
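The text mentions SPSS and Excel; as one alternative, the same summary can be reproduced with Python's standard `statistics` module. This sketch uses the library-visits data set given earlier (15, 3, 12, 0, 24, 3). Note that `stdev` and `variance` are the sample versions, which divide by n - 1.

```python
import statistics

# Library visits in the past year, as listed in the text.
visits = [15, 3, 12, 0, 24, 3]

summary = {
    "N": len(visits),
    "Mean": statistics.mean(visits),
    "Median": statistics.median(visits),
    "Mode": statistics.mode(visits),
    "Standard deviation": statistics.stdev(visits),  # sample SD (n - 1)
    "Variance": statistics.variance(visits),         # sample variance (n - 1)
    "Range": max(visits) - min(visits),
}

for name, value in summary.items():
    print(f"{name:<20} {round(value, 2)}")
```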

If you were to only consider the mean as a measure of central tendency, your impression of the

“middle” of the data set can be skewed by outliers, unlike the median or mode.

Likewise, while the range is sensitive to extreme values, you should also consider the standard

deviation and variance to get easily comparable measures of spread.
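To see how a single extreme value affects these measures, the sketch below adds one hypothetical response of 200 visits to the library data. That value is invented for illustration and is not part of the survey; `describe` is likewise just a helper name chosen here.

```python
import statistics

# Library visits from the text, plus one HYPOTHETICAL extreme response.
visits = [15, 3, 12, 0, 24, 3]
with_outlier = visits + [200]

def describe(data):
    """Return the mean, median and range of a list of numbers."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
    }

for label, data in [("original", visits), ("with outlier", with_outlier)]:
    d = describe(data)
    print(f"{label:<13} mean={d['mean']:.2f} median={d['median']} range={d['range']}")
```

The mean and the range move a long way in response to the one added value, while the median shifts much less.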

Bivariate descriptive statistics

If you’ve collected data on more than one variable, you can use bivariate or multivariate

descriptive statistics to explore whether there are relationships between them.

In bivariate analysis, you simultaneously study the frequency and variability of two variables to

see if they vary together. You can also compare the central tendency of the two variables before

performing further statistical tests.

Multivariate analysis is the same as bivariate analysis but with more than two variables.

Contingency table

Scatter plot example: Library visits and movie theater visits

You investigate whether people who visit the library more tend to watch a movie at a theater less. You plot the number of times participants watched movies at a theater along the x-axis and visits to the library along the y-axis.

From your scatter plot, you see that as the number of movies seen at movie theaters decreases,

the number of visits to the library increases. Based on your visual assessment of a possible linear

relationship, you perform further tests of correlation and regression.
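One common follow-up to a visual check like this is a correlation coefficient. The text does not name a specific one; the sketch below assumes Pearson's r and uses hypothetical paired responses, invented only to illustrate a negative relationship.

```python
import math

# HYPOTHETICAL paired responses, invented for illustration.
movies = [1, 2, 4, 6, 8, 10]      # movies watched at a theater (x-axis)
library = [24, 15, 12, 9, 3, 0]   # visits to the library (y-axis)

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of the deviations from each mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(movies, library)
print(f"r = {r:.3f}")
```

A value near -1 indicates that one variable tends to fall as the other rises, which is the pattern described for the scatter plot.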