Two Sample Z-Test

A statistical test to compare two sample means

statistics
python
Author

Possible Institute

Published

November 18, 2022

A two-sample Z-test is a statistical procedure used to determine whether the means of two populations are significantly different. This test is most appropriate when you have two independent samples and the population standard deviations are known, or the sample sizes are large enough to estimate the population standard deviations reliably.

Before applying the two-sample Z-test, you should ensure that your data meet the following assumptions:

Independence: The two samples are randomly drawn and independent of each other.

Normality: The populations from which the samples are drawn should be normally distributed, or the sample sizes should be sufficiently large (usually n > 30) for the Central Limit Theorem to apply.

Standard Deviation: The population standard deviations are known or can be approximated from the samples.
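These assumptions can be screened programmatically. The sketch below uses hypothetical data and `scipy.stats.shapiro` for a normality check; the sample values and sizes are illustrative, not from any real dataset:

```python
import numpy as np
from scipy.stats import shapiro

# Hypothetical samples, for illustration only
rng = np.random.default_rng(42)
sample1 = rng.normal(loc=80, scale=20, size=40)
sample2 = rng.normal(loc=75, scale=17, size=35)

# Sample-size rule of thumb from the Central Limit Theorem
large_enough = len(sample1) > 30 and len(sample2) > 30

# Shapiro-Wilk test: a small p-value suggests the data are not normal
_, p1 = shapiro(sample1)
_, p2 = shapiro(sample2)

print("n > 30 for both samples:", large_enough)
print("Shapiro-Wilk p-values:", round(p1, 3), round(p2, 3))
```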

two_samples_z_test.py
from numpy import sqrt
from scipy.stats import norm


def twoSampZ(X1, X2, mudiff, sd1, sd2, n1, n2):
    """Two-sample Z-test from summary statistics; returns (z, p-value)."""
    # Standard error of the difference between the two sample means
    pooledSE = sqrt(sd1**2/n1 + sd2**2/n2)
    # Z statistic for the observed difference against the hypothesized one
    z = ((X1 - X2) - mudiff)/pooledSE
    # Two-tailed p-value from the standard normal survival function
    pval = 2*norm.sf(abs(z))
    return round(z, 9), round(pval, 9)


# First pair of samples: means 79.24 vs 73.74, hypothesized difference 0
z, p = twoSampZ(79.24, 73.74, 0, 19.76, 16.86, 174, 159)
# Second pair of samples: means 109.31 vs 113.42, hypothesized difference 0
a, b = twoSampZ(109.31, 113.42, 0, 27.671, 25.464, 206, 190)

print(z, p)
print(a, b)
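Expanding the formula inline for the first call makes the computation easy to verify; this repeats it step by step with the same summary statistics:

```python
from math import sqrt
from scipy.stats import norm

# Summary statistics from the first example above
x1, x2 = 79.24, 73.74      # sample means
sd1, sd2 = 19.76, 16.86    # population standard deviations
n1, n2 = 174, 159          # sample sizes

se = sqrt(sd1**2 / n1 + sd2**2 / n2)  # standard error of the mean difference
z = (x1 - x2) / se                    # hypothesized difference is 0
p = 2 * norm.sf(abs(z))               # two-tailed p-value

print(f"z = {z:.4f}, p = {p:.6f}")
```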

Another Example

Let’s consider an example where we have test scores from two different classes and we want to see whether there is a significant difference between the means of the two classes’ test scores.

Assume that we know the standard deviations of scores in the two classes from previous years or other data.

Here is the Python code to perform a two-sample Z-test:

import numpy as np
from scipy.stats import norm

# Sample data
class1_scores = np.array([87, 89, 91, 93, 95])
class2_scores = np.array([81, 83, 85, 87, 89])

# Known population standard deviations
std_dev_class1 = 4.5
std_dev_class2 = 3.9

# Sample means
mean_class1 = np.mean(class1_scores)
mean_class2 = np.mean(class2_scores)

# Sample sizes
n_class1 = len(class1_scores)
n_class2 = len(class2_scores)

# Calculate the test statistic (Z-score)
z = (mean_class1 - mean_class2) / np.sqrt((std_dev_class1**2/n_class1) + (std_dev_class2**2/n_class2))

# Calculate the two-tailed p-value
p = (1 - norm.cdf(abs(z))) * 2

print('Z-score: ', z)
print('p-value: ', p)
Z-score:  2.253029545296665
p-value:  0.024257286140158207

Here, the Z-score is the test statistic, and it follows a standard normal distribution under the null hypothesis. The p-value is the probability of observing a test statistic at least as extreme as the one calculated, given that the null hypothesis is true.

If the p-value is less than your chosen significance level (often 0.05), you would reject the null hypothesis and conclude there is a significant difference between the means of the two classes’ test scores. If the p-value is greater than your significance level, you would not reject the null hypothesis and conclude there is not a significant difference between the means.
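That decision rule can be sketched in a few lines using the p-value from the class-scores example; alpha = 0.05 is a conventional choice, not something fixed by the test itself:

```python
alpha = 0.05                      # conventional significance level (an assumption)
p = 0.024257286140158207          # two-tailed p-value from the class-scores example

if p < alpha:
    print("Reject H0: the class means differ significantly.")
else:
    print("Fail to reject H0: no significant difference detected.")
```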

Please note that if the population standard deviations are not known, the Python scipy library provides scipy.stats.ttest_ind, which performs a two-sample t-test instead. By default it assumes equal variances in both populations; pass equal_var=False for the unequal-variance (Welch) version. With large sample sizes, the t-test becomes nearly identical to a Z-test.
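For comparison, scipy.stats.ttest_ind can be applied to the same class scores. Because it is a t-test using the sample standard deviations rather than the known population ones, its statistic differs from the Z-score computed above:

```python
import numpy as np
from scipy.stats import ttest_ind

# Same class scores as the worked example above
class1_scores = np.array([87, 89, 91, 93, 95])
class2_scores = np.array([81, 83, 85, 87, 89])

# equal_var=True (the default) pools the two sample variances
t_stat, p_val = ttest_ind(class1_scores, class2_scores, equal_var=True)

print("t-statistic:", t_stat)
print("p-value:", p_val)
```

With only five scores per class, the t-test is the appropriate choice here anyway; the small samples are kept only to match the example above.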

Also, this is a simplified example. In a real-world scenario, the appropriate test to use would depend on the nature of your data and the specific circumstances of your analysis. It’s always a good idea to consult with a statistician or knowledgeable advisor when performing these types of analyses.