R Programming Tutoring and Solutions: Lab Activities - MAT 500

2022-11-17 17:17:17

Full text link: http://tecdat.cn/?p=30394

Complete the following exercises using the code discussed during computer lab. Save your work in an R script as well as a Word document containing the necessary output and comments. Be sure to use comments in the script to justify any computations. If you have any questions, do not hesitate to ask.

1 Probability Distributions

  1. Generate four vectors of binomial random numbers with sample sizes 10, 100, 1000, and 10000, using n = 50, p = 0.4, and seed 5. Find the mean and standard deviation of each vector and compare them to the theoretical mean and standard deviation. What do you see as the sample size increases?
  2. Generate the same four vectors as in the previous exercise. Print out four histograms to graphically represent the data. What distribution do the histograms appear to be approaching as the sample size increases?
  3. Generate a vector of 1000 random numbers from a χ² distribution with 8 degrees of freedom using seed 100. Find the five-number summary, mean, and standard deviation. Represent the vector graphically using a probability histogram with the pdf overlaid on the same graph. Assess the normality of the sample data.
  4. Clearly, the sample data from the F distribution generated earlier in this chapter was not normal. To assess the fit of a random variable to the proper distribution, one uses a Quantile-Quantile (QQ) plot. Using seed 1, generate 300 random numbers from an F distribution with 5 and 10 degrees of freedom. Create a QQ plot by finding the theoretical quantiles of F as well as the sample quantiles of the random data. Plot these vectors to see if the random number generator is indeed providing sample data from an F distribution. Hint: This problem requires the use of the quantile() function to find the sample quantiles of a data set.
  5. Find the following probabilities:
     (a) P(B = 5) where B ∼ Binom(12, 0.6)
     (b) P(B ≥ 5) where B ∼ Binom(12, 0.6)
     (c) P(Z < 1.12) where Z ∼ N(0, 1)
     (d) P(6.5 < X) where X ∼ N(7, 4)
     (e) P(−1021 < t < −664) where t ∼ t(1)
     (f) P(t > 1.96) where t ∼ t(500)
  6. Find the following quantiles:
     (a) 30th percentile for Z ∼ N(0, 1)
     (b) 30th percentile for X ∼ N(7, 4)
     (c) 95th percentile for t ∼ t(1)
     (d) 95th percentile for t ∼ t(500)
     (e) Q1, Q2, and Q3 for F(5, 10)

2 Representing Categorical Data

A rehabilitation study for cocaine users included administering two drugs and a placebo to determine effectiveness. There were 24 subjects in each group. Fourteen of the users given Desipramine relapsed, 18 of the users given Lithium relapsed, and 20 of the placebo group relapsed. Create two tables, one containing the counts and the other containing the marginal distributions for each drug. Print the tables and represent the data graphically. Use a bar graph with bars for both outcomes as well as two pie charts, one for each outcome.

3 Exploratory Data Analysis

  1. Using the ‘datasets’ library in R, save the mtcars data set as a matrix named cars. Find the summary statistics of the mpg column and create a boxplot of it. Then create a boxplot of the mpg column by the cylinder column. The output should have three plots on the same set of axes. Summarize the boxplot in words.
  2. Refer to the previous exercise. Check the normality of the mpg column. Perform pairwise hypothesis tests to determine if the average mpg differs depending on the number of cylinders. Use both methods discussed in this section. Based on the normality assessment, which testing method should be used?
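One possible way to approach the two exploratory exercises above is sketched here. The source does not say which two testing methods were covered in the lab, so the choice of pairwise t tests and pairwise Wilcoxon rank-sum tests is an assumption, as is the use of `shapiro.test()` alongside a normal quantile plot for the normality check.

```r
library(datasets)

# mtcars holds only numeric columns, so it converts cleanly to a numeric matrix
cars <- as.matrix(mtcars)
mpg  <- cars[, "mpg"]
cyl  <- factor(cars[, "cyl"])   # three groups: 4, 6, and 8 cylinders

# Summary statistics and boxplots
summary(mpg)
boxplot(mpg, main = "mpg")
boxplot(mpg ~ cyl, xlab = "Cylinders", ylab = "mpg")   # three boxes on one set of axes

# Normality check: normal quantile plot plus a formal test
qqnorm(mpg)
qqline(mpg)
shapiro.test(mpg)

# ASSUMPTION: the "two methods" are a parametric and a rank-based test
pairwise.t.test(mpg, cyl)
pairwise.wilcox.test(mpg, cyl, exact = FALSE)   # exact = FALSE because mpg has ties
```

If the normality assessment looks doubtful, the rank-based comparison is usually the more defensible of the two.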

Day 1a Lab Activities - Solutions

Probability Distributions

1.  Sample Size    µ     σ         x̄          s
    10             20    3.4641    20.5       3.6591
    100            20    3.4641    20.24      3.5337
    1000           20    3.4641    19.994     3.4912
    10000          20    3.4641    20.0253    3.4795

As the sample size increases, the sample standard deviation approaches the theoretical standard deviation. The sample mean also approaches the theoretical mean, but it gets close at a much smaller sample size than the standard deviation does.
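The comparison for exercise 1 could be produced along the following lines. This is a sketch, not the lab's own script: the source does not say whether the seed is reset before each vector is drawn, so resetting it every time is an assumption, and the simulated values need not match the figures reported above. The names `comparison`, `xbar`, and `s` are illustrative.

```r
# Binomial parameters from the exercise
size <- 50    # number of trials per draw (the exercise's n)
prob <- 0.4   # success probability

# Theoretical mean and standard deviation of Binom(size, prob)
mu    <- size * prob                      # 20
sigma <- sqrt(size * prob * (1 - prob))   # sqrt(12), about 3.4641

sample_sizes <- c(10, 100, 1000, 10000)

# ASSUMPTION: the seed is reset to 5 before each vector is generated
samples <- lapply(sample_sizes, function(k) {
  set.seed(5)
  rbinom(k, size = size, prob = prob)
})

# One row per sample size: theoretical values next to sample estimates
comparison <- data.frame(
  sample_size = sample_sizes,
  mu          = mu,
  sigma       = sigma,
  xbar        = sapply(samples, mean),
  s           = sapply(samples, sd)
)
print(comparison)
```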

2.  The histograms appear to be approaching a normal distribution with mean 20.
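A minimal sketch of the four histograms follows. Resetting the seed before each draw and overlaying a dashed normal curve with the theoretical binomial mean and standard deviation are both assumptions added for illustration; the exercise itself only asks for the histograms.

```r
size <- 50
prob <- 0.4
sample_sizes <- c(10, 100, 1000, 10000)

op <- par(mfrow = c(2, 2))   # arrange the four histograms in a 2 x 2 grid
for (k in sample_sizes) {
  set.seed(5)                # ASSUMPTION: reseed before each draw
  draws <- rbinom(k, size = size, prob = prob)
  hist(draws, freq = FALSE,
       main = paste("Sample size", k), xlab = "Number of successes")
  # Normal density with mean 20 and sd sqrt(12) for visual comparison
  curve(dnorm(x, mean = size * prob, sd = sqrt(size * prob * (1 - prob))),
        add = TRUE, lty = 2)
}
par(op)                      # restore the previous plotting layout
```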

3.  Min.     1st Qu.   Median   Mean     3rd Qu.   Max.      St. Dev.
    0.7603   4.8880    7.4250   7.9070   10.1300   26.2100   3.942555

The normal quantile plot is not linear; therefore, the data are not normal.
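The summary, histogram, and normality check for exercise 3 might be obtained roughly as follows. The number of histogram bins (`breaks = 30`) is an arbitrary choice, and the simulated values depend on the R version's random number generator, so they are not guaranteed to reproduce the figures above.

```r
set.seed(100)
x <- rchisq(1000, df = 8)

summary(x)   # minimum, quartiles, median, mean, maximum
sd(x)        # sample standard deviation

# Probability histogram (total area 1) with the chi-square(8) density overlaid
hist(x, freq = FALSE, breaks = 30,
     main = "Chi-square sample, df = 8", xlab = "x")
curve(dchisq(x, df = 8), add = TRUE, lwd = 2)

# Normal quantile plot: curvature away from the line suggests non-normality
qqnorm(x)
qqline(x)
```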

4.  The random numbers appear to follow an F distribution, with the exception of the 7 largest values.
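A QQ plot against the F(5, 10) distribution can be sketched as below. The probability points come from `ppoints()`, which is one common convention and an assumption here; the lab may have used a different grid of probabilities.

```r
set.seed(1)
x <- rf(300, df1 = 5, df2 = 10)

# Probabilities strictly between 0 and 1 (ASSUMPTION: ppoints convention)
p <- ppoints(300)

theoretical <- qf(p, df1 = 5, df2 = 10)   # quantiles of F(5, 10)
observed    <- quantile(x, probs = p)     # sample quantiles of the data

plot(theoretical, observed,
     xlab = "Theoretical F(5, 10) quantiles", ylab = "Sample quantiles",
     main = "QQ plot against F(5, 10)")
abline(0, 1)   # points close to this line indicate a good fit
```

Departures from the line in the upper right corner correspond to the largest sample values.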

5.     a) 0.1009024

        b)  0.9426901

        c)  0.8686431

        d)  0.5497382

        e)  0.0001676192

        f)  0.02527539
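The six probabilities can be computed with R's built-in distribution functions, as in the sketch below. Two points are worth stating. For (b), P(B ≥ 5) = 1 − P(B ≤ 4), which evaluates to about 0.9427; starting the upper tail at 5 instead gives P(B > 5), about 0.8418. For (d), the second parameter of N(7, 4) is treated as the standard deviation, an assumption that is consistent with the reported value 0.5497382. The variable names are illustrative.

```r
# (a) P(B = 5), B ~ Binom(12, 0.6)
p_a <- dbinom(5, size = 12, prob = 0.6)

# (b) P(B >= 5) = P(B > 4): the upper tail must start above 4, not above 5
p_b <- pbinom(4, size = 12, prob = 0.6, lower.tail = FALSE)

# (c) P(Z < 1.12), Z ~ N(0, 1)
p_c <- pnorm(1.12)

# (d) P(X > 6.5), X ~ N(7, 4); ASSUMPTION: 4 is the standard deviation
p_d <- pnorm(6.5, mean = 7, sd = 4, lower.tail = FALSE)

# (e) P(-1021 < t < -664), t ~ t(1)
p_e <- pt(-664, df = 1) - pt(-1021, df = 1)

# (f) P(t > 1.96), t ~ t(500)
p_f <- pt(1.96, df = 500, lower.tail = FALSE)

print(c(a = p_a, b = p_b, c = p_c, d = p_d, e = p_e, f = p_f))
```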

6.  a)  -0.5244005

        b)  4.902398

        c)  6.313752

        d)  1.647907

        e)  0.5291417, 0.9319332, 1.5853233
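The quantiles follow from the matching `q*` functions. As with the probabilities, the sketch assumes that the 4 in N(7, 4) is the standard deviation, which agrees with the reported value 4.902398. The variable names are illustrative.

```r
# (a) 30th percentile of N(0, 1)
q_a <- qnorm(0.30)

# (b) 30th percentile of N(7, 4); ASSUMPTION: 4 is the standard deviation
q_b <- qnorm(0.30, mean = 7, sd = 4)

# (c) and (d) 95th percentiles of t(1) and t(500)
q_c <- qt(0.95, df = 1)
q_d <- qt(0.95, df = 500)

# (e) Q1, Q2, and Q3 of F(5, 10)
q_e <- qf(c(0.25, 0.50, 0.75), df1 = 5, df2 = 10)

print(list(a = q_a, b = q_b, c = q_c, d = q_d, e = q_e))
```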

Representing Categorical Data

`1.  Count Data

                 Desipramine   Lithium   Placebo
         Yes     14            18        20
         No      10            6         4

    Marginal Distributions

                 Desipramine   Lithium   Placebo
         Yes     0.5833333     0.75      0.8333333
         No      0.4166667     0.25      0.1666667`
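The two tables and the requested graphs could be built as sketched below. The dimension names `Relapse` and `Drug` and the plot titles are illustrative labels, not taken from the source. The proportions are computed within each drug, so each column sums to 1.

```r
# Counts: rows are the relapse outcome, columns are the treatment groups
counts <- matrix(c(14, 18, 20,
                   10,  6,  4),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Relapse = c("Yes", "No"),
                                 Drug = c("Desipramine", "Lithium", "Placebo")))

# Proportion of each outcome within each drug (column proportions)
props <- prop.table(counts, margin = 2)

print(counts)
print(props)

# Bar graph with a bar for each outcome within each drug
barplot(counts, beside = TRUE, legend.text = TRUE,
        ylab = "Number of subjects")

# Two pie charts, one per outcome
op <- par(mfrow = c(1, 2))
pie(counts["Yes", ], main = "Relapsed")
pie(counts["No", ], main = "Did not relapse")
par(op)
```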
