2025/4/24

1. Problem Description

We are given a table showing the distribution of salaries (in thousands of FCFA) of 210 employees. The table has two columns: salary range and the number of employees in that range. The problem asks us to perform several statistical analyses on this data, including finding the modal class, quartiles, mean, variance, standard deviation, coefficient of variation, and Yule's coefficient of skewness. It also asks how the mean, variance, and standard deviation change if the salaries increase by a fixed amount or by a percentage. Finally, it asks about concentration measures such as the mediale, concentration range and Gini index, and the effects of a salary increase on them.

2. Solution Steps

Let's address the questions one by one.

1. Determining the Modal Class

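The classes have unequal widths, so the modal class is the one with the highest frequency density (number of employees divided by class width), not simply the highest frequency. The class $[150, 220[$ has the highest raw frequency (60 employees), but its density is only $60 / 70 \approx 0.86$ employees per thousand FCFA, whereas $[125, 150[$ has a density of $38 / 25 = 1.52$, the largest of the seven classes. The modal class is therefore $[125, 150[$.
Interpretation: Salaries are most densely concentrated between 125,000 FCFA and 150,000 FCFA.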
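Because the class widths differ, it can help to compare raw frequency with frequency density side by side. The snippet below is a minimal sketch under that assumption; the variable names (`bounds`, `counts`, `densities`) are illustrative, and the class limits are taken from the table used later in this solution.

```python
# Class limits (thousands of FCFA) and employee counts from the table.
bounds = [(35, 125), (125, 150), (150, 220), (220, 300),
          (300, 350), (350, 425), (425, 600)]
counts = [14, 38, 60, 27, 33, 20, 18]

# Frequency density: employees per thousand FCFA of class width.
densities = [n / (hi - lo) for (lo, hi), n in zip(bounds, counts)]

modal_by_count = bounds[counts.index(max(counts))]
modal_by_density = bounds[densities.index(max(densities))]

print(modal_by_count)    # class with the largest raw frequency
print(modal_by_density)  # class with the largest frequency density
```

When all classes have the same width the two criteria agree; they can differ only when widths are unequal, as they are here.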

2. Calculating Quartiles

We will use linear interpolation to calculate the quartiles.
* Q1 (First Quartile): 25% of the data falls below Q1. $0.25 \times 210 = 52.5$. The cumulative frequency before the interval $[125, 150[$ is 14. The cumulative frequency at the end of $[125, 150[$ is $14 + 38 = 52$, so the cumulative frequency before the interval $[150, 220[$ is 52. Since $52 < 52.5 \le 112$, Q1 lies in the interval $[150, 220[$.

$Q_1 = L + \frac{0.25 N - cf}{f} \times w$
Where:
$L$ is the lower limit of the quartile class = 150
$N$ is the total number of observations = 210
$cf$ is the cumulative frequency of the class before the quartile class = 52
$f$ is the frequency of the quartile class = 60
$w$ is the class width = $220 - 150 = 70$
$Q_1 = 150 + \frac{52.5 - 52}{60} \times 70 = 150 + \frac{0.5}{60} \times 70 = 150 + 0.583 = 150.583$
* Q2 (Second Quartile, Median): 50% of the data falls below Q2. $0.50 \times 210 = 105$. The cumulative frequency before the interval $[150, 220[$ is $14 + 38 = 52$. The cumulative frequency at the end of $[150, 220[$ is $52 + 60 = 112$. Since $52 < 105 \le 112$, Q2 lies in the interval $[150, 220[$.

$Q_2 = L + \frac{0.5 N - cf}{f} \times w$
Where:
$L$ is the lower limit of the quartile class = 150
$N$ is the total number of observations = 210
$cf$ is the cumulative frequency of the class before the quartile class = 52
$f$ is the frequency of the quartile class = 60
$w$ is the class width = $220 - 150 = 70$
$Q_2 = 150 + \frac{105 - 52}{60} \times 70 = 150 + \frac{53}{60} \times 70 = 150 + 61.833 = 211.833$
* Q3 (Third Quartile): 75% of the data falls below Q3. $0.75 \times 210 = 157.5$. The cumulative frequency before the interval $[220, 300[$ is $14 + 38 + 60 = 112$. The cumulative frequency at the end of $[220, 300[$ is $112 + 27 = 139$, so the cumulative frequency before the interval $[300, 350[$ is 139. The cumulative frequency at the end of $[300, 350[$ is $139 + 33 = 172$. Since $139 < 157.5 \le 172$, Q3 lies in the interval $[300, 350[$.

$Q_3 = L + \frac{0.75 N - cf}{f} \times w$
Where:
$L$ is the lower limit of the quartile class = 300
$N$ is the total number of observations = 210
$cf$ is the cumulative frequency of the class before the quartile class = 139
$f$ is the frequency of the quartile class = 33
$w$ is the class width = $350 - 300 = 50$
$Q_3 = 300 + \frac{157.5 - 139}{33} \times 50 = 300 + \frac{18.5}{33} \times 50 = 300 + 28.03 = 328.03$
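The three interpolations follow the same pattern, so they can be sketched as one helper. This is a minimal illustration, not a library routine: the function name `grouped_quantile` is hypothetical, and it assumes uniform spread of salaries inside each class, which is the assumption behind the interpolation formula.

```python
def grouped_quantile(bounds, counts, p):
    """Quantile of order p for grouped data, by linear interpolation."""
    target = p * sum(counts)
    cum = 0  # cumulative frequency before the current class
    for (lo, hi), n in zip(bounds, counts):
        if cum + n >= target:
            # L + (p*N - cf) / f * w
            return lo + (target - cum) / n * (hi - lo)
        cum += n
    raise ValueError("p must lie in (0, 1]")

bounds = [(35, 125), (125, 150), (150, 220), (220, 300),
          (300, 350), (350, 425), (425, 600)]
counts = [14, 38, 60, 27, 33, 20, 18]

q1, q2, q3 = (grouped_quantile(bounds, counts, p) for p in (0.25, 0.50, 0.75))
print(round(q1, 3), round(q2, 3), round(q3, 3))
```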

3. Calculate the arithmetic mean, interquartile range, variance, standard deviation, and coefficient of variation.

First calculate the midpoints of the intervals: 80, 137.5, 185, 260, 325, 387.5, 512.5.
The frequencies are: 14, 38, 60, 27, 33, 20, 18.
Mean ($\mu$):
$\mu = \frac{\sum f_i x_i}{\sum f_i}$
$\mu = \frac{14 \times 80 + 38 \times 137.5 + 60 \times 185 + 27 \times 260 + 33 \times 325 + 20 \times 387.5 + 18 \times 512.5}{210} = \frac{1120 + 5225 + 11100 + 7020 + 10725 + 7750 + 9225}{210} = \frac{52165}{210} = 248.405$
Interquartile Range (IQR):
$IQR = Q_3 - Q_1 = 328.03 - 150.583 = 177.447$
Variance ($\sigma^2$):
$\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{\sum f_i}$
$\sigma^2 = \frac{1}{210} [14(80 - 248.405)^2 + 38(137.5 - 248.405)^2 + 60(185 - 248.405)^2 + 27(260 - 248.405)^2 + 33(325 - 248.405)^2 + 20(387.5 - 248.405)^2 + 18(512.5 - 248.405)^2]$
$\sigma^2 = \frac{1}{210} [14 \times 28360.24 + 38 \times 12299.92 + 60 \times 4020.19 + 27 \times 134.44 + 33 \times 5866.79 + 20 \times 19347.42 + 18 \times 69746.17] = \frac{1}{210} [397043.4 + 467397.0 + 241211.6 + 3630.0 + 193604.2 + 386948.4 + 1255431.0] = \frac{2945265.6}{210} \approx 14025.07$
As a check, the shortcut formula gives the same result: $\sigma^2 = \frac{\sum f_i x_i^2}{\sum f_i} - \mu^2 = \frac{15903300}{210} - 248.405^2 \approx 75730 - 61704.93 = 14025.07$.
Standard Deviation ($\sigma$):
$\sigma = \sqrt{\sigma^2} = \sqrt{14025.07} \approx 118.428$
Coefficient of Variation (CV):
$CV = \frac{\sigma}{\mu} \times 100 = \frac{118.428}{248.405} \times 100 \approx 47.68\%$
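The hand arithmetic in this step is long, so a short script is a convenient cross-check. The sketch below assumes the class midpoints stand in for every salary in a class (the usual grouped-data convention) and uses the population variance, dividing by $N = 210$; the variable names are illustrative.

```python
import math

midpoints = [80, 137.5, 185, 260, 325, 387.5, 512.5]
counts = [14, 38, 60, 27, 33, 20, 18]
n = sum(counts)

# Weighted mean of the class midpoints.
mean = sum(f * x for f, x in zip(counts, midpoints)) / n

# Population variance (divide by N, not N - 1), then its square root.
variance = sum(f * (x - mean) ** 2 for f, x in zip(counts, midpoints)) / n
std = math.sqrt(variance)

# Coefficient of variation, as a percentage of the mean.
cv = 100 * std / mean

print(round(mean, 3), round(variance, 2), round(std, 3), round(cv, 2))
```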

4. If the salaries increase monthly by 25,000 FCFA (25 in thousands) or by 10%.

If the salaries increase by a fixed amount (25):
* Mean: The mean will increase by 25. New mean = $248.405 + 25 = 273.405$.
* Variance: The variance will remain unchanged, since adding the same constant to every salary shifts the distribution without changing its spread. $\sigma^2 = 14025.07$.
* Standard Deviation: The standard deviation will also remain unchanged. $\sigma = 118.428$.
If the salaries increase by 10%:
* Mean: The mean will increase by 10%. New mean = $248.405 \times 1.1 = 273.2455$.
* Variance: The variance will be multiplied by $(1.1)^2$. New variance = $14025.07 \times (1.1)^2 = 14025.07 \times 1.21 \approx 16970.33$.
* Standard Deviation: The standard deviation will increase by 10%. New standard deviation = $118.428 \times 1.1 \approx 130.27$.
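These two rules, $\mathrm{Var}(X + c) = \mathrm{Var}(X)$ and $\mathrm{Var}(aX) = a^2 \mathrm{Var}(X)$, can be checked numerically on the class midpoints. The snippet is only a sketch; `grouped_mean_var` is a hypothetical helper written for this illustration.

```python
midpoints = [80, 137.5, 185, 260, 325, 387.5, 512.5]
counts = [14, 38, 60, 27, 33, 20, 18]

def grouped_mean_var(xs, fs):
    """Weighted mean and population variance of grouped data."""
    n = sum(fs)
    mean = sum(f * x for f, x in zip(fs, xs)) / n
    var = sum(f * (x - mean) ** 2 for f, x in zip(fs, xs)) / n
    return mean, var

base_mean, base_var = grouped_mean_var(midpoints, counts)

# Fixed raise: every salary gains 25 (thousand FCFA).
shift_mean, shift_var = grouped_mean_var([x + 25 for x in midpoints], counts)

# Proportional raise: every salary is multiplied by 1.10.
scale_mean, scale_var = grouped_mean_var([x * 1.10 for x in midpoints], counts)

print(shift_mean - base_mean, shift_var - base_var)  # change under the fixed raise
print(scale_mean / base_mean, scale_var / base_var)  # ratios under the 10% raise
```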

5. Dissymmetry of the frequency polygon and Yule's coefficient of skewness.

Yule's coefficient of skewness ($Q$):
$Q = \frac{Q_3 + Q_1 - 2 Q_2}{Q_3 - Q_1}$
$Q = \frac{328.03 + 150.583 - 2 \times 211.833}{328.03 - 150.583} = \frac{478.613 - 423.666}{177.447} = \frac{54.947}{177.447} \approx 0.310$
Interpretation: Since the coefficient is positive, the distribution is positively skewed (skewed to the right).
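The coefficient depends only on the three quartiles, which a one-line function makes explicit. This is a small sketch; the name `yule_skewness` is illustrative, and the inputs are the quartiles computed in step 2.

```python
def yule_skewness(q1, q2, q3):
    """Yule's quartile skewness: (Q3 + Q1 - 2*Q2) / (Q3 - Q1), always in [-1, 1]."""
    return (q3 + q1 - 2 * q2) / (q3 - q1)

q = yule_skewness(150.583, 211.833, 328.03)
print(round(q, 3))
```

The value is 0 when the median sits midway between Q1 and Q3, and positive when the gap from the median to Q3 is wider than the gap from Q1 to the median, as it is here.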

6. Concentration of Salaries

a. Calculate the mediale.
The mediale is the salary level that splits the total income in half: the employees earning less than the mediale together receive 50% of the total income. First calculate the total income, then find the class in which the cumulative income reaches half of it.
Total income = $\sum f_i x_i = 52165$ (calculated previously)
Half of total income = $52165 / 2 = 26082.5$.
We need to determine which interval contains the mediale. We calculate cumulative income.
* $[35, 125[$: $14 \times 80 = 1120$. Cumulative income = 1120
* $[125, 150[$: $38 \times 137.5 = 5225$. Cumulative income = $1120 + 5225 = 6345$
* $[150, 220[$: $60 \times 185 = 11100$. Cumulative income = $6345 + 11100 = 17445$
* $[220, 300[$: $27 \times 260 = 7020$. Cumulative income = $17445 + 7020 = 24465$
* $[300, 350[$: $33 \times 325 = 10725$. Cumulative income = $24465 + 10725 = 35190$
Cumulative income first exceeds 26082.5 in $[300, 350[$, so the mediale lies in that interval.
We can estimate the mediale by linear interpolation on cumulative income:
$Mediale = L + \frac{0.5 \times \text{Total Income} - cf}{f x} \times w$
where $L = 300$ is the lower limit of the class, $cf = 24465$ is the cumulative income before the class, $f x = 33 \times 325 = 10725$ is the income of the class, and $w = 50$ is the class width. The denominator is the income of the class, not its frequency, because the interpolation runs along cumulative income.
$Mediale = 300 + \frac{26082.5 - 24465}{10725} \times 50 = 300 + \frac{1617.5}{10725} \times 50 = 300 + 7.54 = 307.54$
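The mediale is a median computed on cumulative income rather than cumulative headcount, so the interpolation divides by the income of the class. The sketch below follows that reading; the function name `mediale` and the use of class midpoints to estimate class income are assumptions of this illustration.

```python
def mediale(bounds, counts):
    """Salary below which half of the total income is earned (grouped data)."""
    # Income of each class, estimated as frequency times class midpoint.
    masses = [n * (lo + hi) / 2 for (lo, hi), n in zip(bounds, counts)]
    target = sum(masses) / 2
    cum = 0.0  # cumulative income before the current class
    for (lo, hi), mass in zip(bounds, masses):
        if cum + mass >= target:
            # Interpolate on income: the denominator is the class income.
            return lo + (target - cum) / mass * (hi - lo)
        cum += mass
    raise ValueError("empty distribution")

bounds = [(35, 125), (125, 150), (150, 220), (220, 300),
          (300, 350), (350, 425), (425, 600)]
counts = [14, 38, 60, 27, 33, 20, 18]

print(round(mediale(bounds, counts), 2))
```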
b. Calculate the concentration range.
The concentration range spans from the median income to the maximum income. The median was estimated above at $211.833$, and the upper limit of the last class is 600, so the range extends from 211.833 to 600 (thousands of FCFA).
c. Construct the Lorenz curve and calculate the Gini index using the triangle method.
To calculate the Gini index using the triangle method, we need to calculate the cumulative percentage of employees and the cumulative percentage of income.
| Salary Range | Employees | % Employees | Cumulative % Employees | Income Midpoint | Income | Cumulative Income | % Income | Cumulative % Income |
|--------------|-----------|---------------|--------------------------|-----------------|--------|-------------------|-----------|-----------------------|
| [35, 125[ | 14 | 6.67% | 6.67% | 80 | 1120 | 1120 | 2.15% | 2.15% |
| [125, 150[ | 38 | 18.10% | 24.76% | 137.5 | 5225 | 6345 | 10.02% | 12.16% |
| [150, 220[ | 60 | 28.57% | 53.33% | 185 | 11100 | 17445 | 21.28% | 33.44% |
| [220, 300[ | 27 | 12.86% | 66.19% | 260 | 7020 | 24465 | 13.46% | 46.90% |
| [300, 350[ | 33 | 15.71% | 81.90% | 325 | 10725 | 35190 | 20.56% | 67.46% |
| [350, 425[ | 20 | 9.52% | 91.43% | 387.5 | 7750 | 42940 | 14.86% | 82.32% |
| [425, 600[ | 18 | 8.57% | 100.00% | 512.5 | 9225 | 52165 | 17.68% | 100.00% |
The Gini index is the area between the line of perfect equality and the Lorenz curve, divided by the total area under the line of perfect equality. We approximate the area under the Lorenz curve using the trapezoidal rule. With the cumulative shares expressed as proportions, the line of perfect equality is a straight line from (0,0) to (1,1), so the area under it is 0.5.

$Gini = 1 - \sum_{i=1}^n (X_i - X_{i-1})(Y_i + Y_{i-1})$, where the $X_i$ are the cumulative proportions of employees and the $Y_i$ are the cumulative proportions of income, with $X_0 = Y_0 = 0$. Each term of the sum is twice the area of one trapezoid under the Lorenz curve, so the terms are not divided by 2 again.

$Gini = 1 - [0.0667 \times 0.0215 + (0.2476 - 0.0667)(0.1216 + 0.0215) + (0.5333 - 0.2476)(0.3344 + 0.1216) + (0.6619 - 0.5333)(0.4690 + 0.3344) + (0.8190 - 0.6619)(0.6746 + 0.4690) + (0.9143 - 0.8190)(0.8232 + 0.6746) + (1 - 0.9143)(1 + 0.8232)] = 1 - 0.7396 \approx 0.260$.
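The trapezoid sum is easy to mistype by hand, so it may be worth recomputing from the raw counts and class incomes. The following is a minimal sketch of the formula $1 - \sum (X_i - X_{i-1})(Y_i + Y_{i-1})$; the function name `gini_trapezoid` is hypothetical, and class income is again estimated as frequency times midpoint.

```python
counts = [14, 38, 60, 27, 33, 20, 18]
midpoints = [80, 137.5, 185, 260, 325, 387.5, 512.5]
incomes = [n * x for n, x in zip(counts, midpoints)]

def gini_trapezoid(counts, incomes):
    """Gini index from grouped data via the trapezoidal rule on the Lorenz curve."""
    n_total, income_total = sum(counts), sum(incomes)
    x_prev = y_prev = 0.0      # Lorenz curve starts at (0, 0)
    cum_n = cum_income = 0.0
    total = 0.0
    for n, inc in zip(counts, incomes):
        cum_n += n
        cum_income += inc
        x, y = cum_n / n_total, cum_income / income_total
        total += (x - x_prev) * (y + y_prev)  # twice the trapezoid area
        x_prev, y_prev = x, y
    return 1 - total

gini = gini_trapezoid(counts, incomes)
print(round(gini, 4))
```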

d. Influence of a 15% salary increase on the Gini index.
If all salaries increase by 15%, the Gini index is unchanged. A proportional increase multiplies every class income and the total income by the same factor 1.15, so each cumulative income share $Y_i$ keeps its value and the Lorenz curve does not move. The Gini index measures relative inequality only.
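The invariance can be confirmed numerically by scaling every class income by 1.15 and recomputing the index. This is an illustrative sketch; the compact `gini` helper below is an assumption of this example, written with `itertools.accumulate` to build the cumulative shares.

```python
from itertools import accumulate

def gini(counts, incomes):
    """Gini index: 1 - sum (X_i - X_{i-1}) (Y_i + Y_{i-1}) over cumulative shares."""
    xs = [0.0] + [c / sum(counts) for c in accumulate(counts)]
    ys = [0.0] + [c / sum(incomes) for c in accumulate(incomes)]
    return 1 - sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1])
                   for i in range(1, len(xs)))

counts = [14, 38, 60, 27, 33, 20, 18]
incomes = [1120, 5225, 11100, 7020, 10725, 7750, 9225]

before = gini(counts, incomes)
after = gini(counts, [1.15 * inc for inc in incomes])  # 15% raise for everyone

print(abs(before - after) < 1e-12)
```

A fixed raise would behave differently: adding the same amount to every salary increases low salaries proportionally more, which pulls the Lorenz curve toward the line of equality.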

3. Final Answer

1. Modal Class: $[125, 150[$ (thousands of FCFA), the class with the highest frequency density; $[150, 220[$ has the highest raw frequency

2. Q1: 150.583, Q2: 211.833, Q3: 328.03 (thousands of FCFA)

3. Mean: 248.405, IQR: 177.447, Standard Deviation: 118.428 (thousands of FCFA); Variance: 14025.07 (squared thousands of FCFA); CV: 47.68%

4. If salaries increase by 25,000 FCFA: new mean: 273.405, variance and standard deviation are unchanged. If salaries increase by 10%: new mean: 273.2455, new variance: 16970.33, new standard deviation: 130.27

5. Yule's Coefficient of Skewness: 0.310

6. a. Mediale: 307.54 (thousands of FCFA) b. Concentration range: from the median ($211.833$) to 600 c. Gini Index: 0.260 d. The Gini index is unchanged if all salaries increase by 15%.
