Salary Statistics Fields - Complete Explanation
This document provides a comprehensive explanation of all statistical fields returned by the Salary Statistics by Level query. Understanding these metrics will help you interpret salary data and make informed decisions about compensation, market analysis, and hiring strategies.
Field: min_salary
Type: Integer
Description: The lowest salary value among all applicants at this level.
What it means:
- The absolute minimum salary found in the dataset
- Represents the floor of the salary range
- May be an outlier if significantly lower than other values
When to use:
- Understand the lower bound of salaries
- Identify minimum acceptable compensation
- Set salary floors for job postings
Example:
min_salary: 500
→ At least one applicant at this level has a salary of 500
Interpretation tips:
- Compare with Q1 to identify if it's an outlier
- If min is much lower than Q1, it may be an entry-level or part-time position
- Use with max_salary to understand the full range
Field: max_salary
Type: Integer
Description: The highest salary value among all applicants at this level.
What it means:
- The absolute maximum salary found in the dataset
- Represents the ceiling of the salary range
- May be an outlier if significantly higher than other values
When to use:
- Understand the upper bound of salaries
- Identify maximum potential compensation
- Set salary caps for budget planning
Example:
max_salary: 5000
→ At least one applicant at this level has a salary of 5000
Interpretation tips:
- Compare with Q3 to identify if it's an outlier
- If max is much higher than Q3, it may be a senior specialist or leadership role
- Use with min_salary to understand the full range
Field: avg_salary
Type: Decimal (rounded to 2 places)
Description: The arithmetic mean of all salaries at this level. Calculated as the sum of all salaries divided by the number of applicants.
Formula: SUM(salary) / COUNT(*)
What it means:
- The salary each applicant would receive if the total were shared equally
- Sensitive to outliers (very high or very low values)
- Most commonly used measure of central tendency
When to use:
- Quick overview of typical compensation
- Budget planning and salary benchmarking
- Comparing levels or groups
Example:
avg_salary: 1750.25
→ The average salary across all applicants at this level is 1750.25
Interpretation tips:
- Compare with median: If avg > median, high earners pull the average up
- Compare with median: If avg < median, low earners pull the average down
- Outlier sensitivity: A few very high salaries can significantly increase the average
Limitations:
- Can be misleading if there are extreme outliers
- Doesn't show the distribution of salaries
- May not represent the "typical" salary if distribution is skewed
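A minimal Python sketch of the formula above, using invented salary values (not taken from any real dataset), shows how a single extreme value shifts the average:

```python
# Hypothetical salaries for one level (illustrative values only).
salaries = [1500, 1600, 1700, 1800, 1900]

# Mirrors SUM(salary) / COUNT(*), rounded to 2 places like avg_salary.
avg_salary = round(sum(salaries) / len(salaries), 2)

# One very high salary is enough to pull the average well above the rest.
with_outlier = salaries + [9000]
avg_with_outlier = round(sum(with_outlier) / len(with_outlier), 2)

print(avg_salary)        # 1700.0
print(avg_with_outlier)  # 2916.67
```

Five of the six salaries in the second list are below 2000, yet the average is close to 3000.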
Field: median_salary
Type: Integer
Description: The middle value when all salaries are sorted from lowest to highest. About half of the applicants earn less than this value and about half earn more (ties and even-sized samples make the split approximate).
What it means:
- The "middle" salary in the dataset
- Largely unaffected by outliers (robust statistic)
- Better representation of "typical" salary than average when data is skewed
When to use:
- Primary measure for understanding typical compensation
- Better than average when outliers are present
- Setting competitive salary offers
Example:
median_salary: 1700
→ Half of applicants earn less than 1700, half earn more
Interpretation tips:
- More reliable than average when data has outliers
- Represents the 50th percentile - middle of the distribution
- Compare with average: Large difference indicates skewed distribution
- Best single number to represent "typical" salary
Why it's important:
- Resistant to outliers (one very high salary barely moves it)
- Represents the true middle of the data
- Industry standard for salary benchmarking
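The robustness of the median can be checked with a short Python sketch. The salary values are invented for illustration. The query's Integer type suggests some rounding or a discrete percentile, which is not reproduced here.

```python
from statistics import mean, median

# Hypothetical salaries for one level (illustrative values only).
salaries = [1500, 1600, 1700, 1800, 1900]
with_outlier = salaries + [9000]

median_before = median(salaries)      # middle value of 5 salaries
median_after = median(with_outlier)   # mean of the two middle values of 6
mean_shift = mean(with_outlier) - mean(salaries)

print(median_before)         # 1700
print(median_after)          # 1750.0
print(round(mean_shift, 2))  # 1216.67
```

The extreme salary moves the median by 50 and the average by more than 1200.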
Field: q1_salary
Type: Integer
Description: The salary value below which 25% of applicants fall. Also known as the lower quartile.
What it means:
- 25% of applicants earn less than this value
- 75% of applicants earn more than this value
- Marks the boundary between the bottom quarter and the rest
When to use:
- Understand the lower end of the salary distribution
- Identify entry-level or junior compensation
- Set minimum salary expectations
Example:
q1_salary: 1200
→ 25% of applicants earn less than 1200, 75% earn more
Interpretation tips:
- Lower boundary of the middle 50% (interquartile range)
- Compare with median: A large gap between Q1 and the median means salaries in the lower half are widely spread
- Use with Q3 to understand the spread of the middle 50%
Field: q3_salary
Type: Integer
Description: The salary value below which 75% of applicants fall. Also known as the upper quartile.
What it means:
- 75% of applicants earn less than this value
- 25% of applicants earn more than this value
- Marks the boundary between the top quarter and the rest
When to use:
- Understand the higher end of the salary distribution
- Identify senior or experienced compensation
- Set competitive salary targets
Example:
q3_salary: 2200
→ 75% of applicants earn less than 2200, 25% earn more
Interpretation tips:
- Upper boundary of the middle 50% (interquartile range)
- Compare with median: A large gap between the median and Q3 means salaries in the upper half are widely spread
- Use with Q1 to understand the spread of the middle 50%
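The quartiles can be approximated with Python's standard library. This sketch assumes linear interpolation between neighbouring values (`method="inclusive"`). The actual query may use a different percentile definition, so results on small samples can differ slightly. The salary values are invented for illustration.

```python
from statistics import quantiles

# Hypothetical salaries for one level (illustrative values only).
salaries = [900, 1000, 1200, 1400, 1700, 1900, 2200, 2500, 3000]

# n=4 returns the three cut points that split the data into quarters.
q1_salary, median_salary, q3_salary = quantiles(salaries, n=4, method="inclusive")
iqr = q3_salary - q1_salary  # spread of the middle 50%

print(q1_salary, median_salary, q3_salary)  # 1200.0 1700.0 2200.0
print(iqr)                                  # 1000.0
```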
Field: salary_range
Type: Integer
Description: The difference between the maximum and minimum salaries. Calculated as max_salary - min_salary.
What it means:
- The total spread of salaries from lowest to highest
- Shows the variability in compensation
- Larger ranges indicate more diversity in salary levels
When to use:
- Understand salary variability within a level
- Identify levels with wide compensation ranges
- Assess market diversity
Example:
salary_range: 2000
→ The difference between highest and lowest salary is 2000
Interpretation tips:
- Large range: High variability, many factors affect salary
- Small range: More standardized compensation
- Compare across levels: Some levels may have wider ranges than others
- Use with quartiles: Range can be misleading if outliers exist
Limitations:
- Sensitive to outliers (one extreme value affects the entire range)
- Doesn't show where most salaries fall
- Use interquartile range (Q3 - Q1) for a more robust measure
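To illustrate the difference in robustness, the following sketch compares the range with the interquartile range before and after one extreme value is introduced. The helper name `spread`, the interpolation method, and the salary values are assumptions made for this example.

```python
from statistics import quantiles

def spread(salaries):
    """Return the full range and the interquartile range of a salary list."""
    q1, _, q3 = quantiles(salaries, n=4, method="inclusive")
    return {"salary_range": max(salaries) - min(salaries), "iqr": q3 - q1}

# Hypothetical salaries (illustrative values only).
base = [900, 1000, 1200, 1400, 1700, 1900, 2200, 2500, 3000]
with_outlier = base[:-1] + [9000]  # swap the top salary for an extreme one

print(spread(base))          # {'salary_range': 2100, 'iqr': 1000.0}
print(spread(with_outlier))  # {'salary_range': 8100, 'iqr': 1000.0}
```

The range almost quadruples while the IQR does not change.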
Field: stddev_salary
Type: Decimal (rounded to 2 places)
Description: A measure of how spread out the salaries are from the average. Indicates the typical distance of salaries from the mean.
Formula: Square root of the variance
What it means:
- Low standard deviation: Salaries are clustered close to the average
- High standard deviation: Salaries are spread out widely
- Zero standard deviation: All salaries are identical (rare)
When to use:
- Understand salary consistency within a level
- Identify levels with high or low variability
- Assess market standardization
Example:
stddev_salary: 450.25
→ Salaries typically vary by about 450 from the average
Interpretation tips:
- Compare with average: If stddev ≈ 0.2 × average, moderate spread
- Compare with average: If stddev > 0.5 × average, high variability
- Compare with average: If stddev < 0.1 × average, low variability
- NULL value: Occurs when there's only one data point (can't calculate spread)
Rule of thumb (holds only when salaries are roughly normally distributed):
- About 68% of salaries fall within 1 standard deviation of the average
- About 95% of salaries fall within 2 standard deviations of the average
- About 99.7% of salaries fall within 3 standard deviations of the average
Example calculation:
If avg_salary = 2000 and stddev_salary = 400 (assuming a roughly normal distribution):
- 68% of salaries are between 1600 and 2400
- 95% of salaries are between 1200 and 2800
- 99.7% of salaries are between 800 and 3200
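The calculation above can be sketched in Python. This example assumes the query uses the sample standard deviation, which would be consistent with the NULL result for a single data point, but the source does not state this. The helper names are hypothetical.

```python
from statistics import stdev

def stddev_salary(salaries):
    """Sample standard deviation rounded to 2 places, or None for fewer than 2 values."""
    if len(salaries) < 2:
        return None  # mirrors the NULL result for a single data point
    return round(stdev(salaries), 2)

def bands(avg, sd):
    """Intervals of 1, 2 and 3 standard deviations around the average."""
    return {k: (avg - k * sd, avg + k * sd) for k in (1, 2, 3)}

print(stddev_salary([1600, 2000, 2400]))  # 400.0
print(stddev_salary([2000]))              # None
print(bands(2000, 400))  # {1: (1600, 2400), 2: (1200, 2800), 3: (800, 3200)}
```

The 68/95/99.7 percentages attached to these bands apply only if the salaries are approximately normally distributed.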
When Average > Median:
- High earners pull the average up
- Right-skewed distribution (tail on the right)
- Indicates some very high salaries
When Average < Median:
- Low earners pull the average down
- Left-skewed distribution (tail on the left)
- Indicates some very low salaries
When Average ≈ Median:
- Symmetrical distribution
- Salaries are balanced around the center
- Average and median both represent typical salary well
Interquartile Range (IQR) = Q3 - Q1
Small IQR:
- Most salaries are clustered in a narrow range
- More standardized compensation
- Less variability
Large IQR:
- Salaries are spread across a wide range
- More diverse compensation
- Higher variability
Symmetric Distribution:
- Q1 and Q3 are equidistant from median
- (Median - Q1) ≈ (Q3 - Median)
Skewed Distribution:
- Q1 and Q3 are not equidistant from median
- (Median - Q1) ≠ (Q3 - Median)
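One possible way to turn the quartile comparison into a check is sketched below. The function name and the 10% tolerance are assumptions chosen for illustration, not thresholds defined by the query.

```python
def quartile_skew(q1, median, q3, tolerance=0.1):
    """Classify distribution shape from the distances of Q1 and Q3 to the median."""
    lower = median - q1   # width of the lower half of the box
    upper = q3 - median   # width of the upper half of the box
    iqr = q3 - q1
    # Treat the halves as equal if they differ by at most tolerance * IQR.
    if iqr == 0 or abs(upper - lower) <= tolerance * iqr:
        return "symmetric"
    return "right-skewed" if upper > lower else "left-skewed"

print(quartile_skew(1200, 1700, 2200))  # symmetric
print(quartile_skew(1200, 1500, 2200))  # right-skewed
print(quartile_skew(1200, 1900, 2200))  # left-skewed
```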
Both measure spread, but differently:
- Range: Total spread (max - min), sensitive to outliers
- Standard Deviation: Typical spread from average; still affected by outliers, but less than the range
Use Range when:
- You need to know the absolute limits
- Outliers are important to understand
Use Standard Deviation when:
- You want to understand typical variability
- You want a measure less dominated by a single extreme value than the range
Goal: Determine appropriate salary range for a job posting
Recommended approach:
- Look at median_salary as the target midpoint
- Use q1_salary as the minimum (25th percentile)
- Use q3_salary as the maximum (75th percentile)
- Consider avg_salary if distribution is symmetric
Example:
Level: MIDDLE
median_salary: 1700
q1_salary: 1200
q3_salary: 2200
Recommended range: 1200 - 2200
Target offer: ~1700 (median)
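As a small sketch, the recommended approach maps directly onto the query fields. The function name `posting_range` is hypothetical, and the input is the MIDDLE example above.

```python
def posting_range(stats):
    """Map quartile fields to a posting range: Q1 as minimum, median as target, Q3 as maximum."""
    return {
        "minimum": stats["q1_salary"],
        "target": stats["median_salary"],
        "maximum": stats["q3_salary"],
    }

middle = {"median_salary": 1700, "q1_salary": 1200, "q3_salary": 2200}
print(posting_range(middle))  # {'minimum': 1200, 'target': 1700, 'maximum': 2200}
```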
Goal: Find levels with unusual salary patterns
Recommended approach:
- Compare avg_salary vs median_salary
- Check stddev_salary for variability
- Examine salary_range for outliers
Red flags:
- Large difference between average and median (> 20%)
- Very high standard deviation (> 50% of average)
- Extremely wide salary range
- Small sample size (< 10)
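The numeric red flags can be expressed as a simple check. This is a sketch under stated assumptions: the 20% gap is measured relative to the median, `sample_size` is a hypothetical input for the number of applicants at the level, and the "extremely wide range" flag is left out because no threshold is given for it.

```python
def red_flags(avg_salary, median_salary, stddev_salary, sample_size):
    """Return a list of warnings for one level's statistics."""
    flags = []
    # Assumption: the 20% gap is measured relative to the median.
    if median_salary and abs(avg_salary - median_salary) / median_salary > 0.20:
        flags.append("average and median differ by more than 20%")
    # stddev_salary may be None (NULL) when there is only one data point.
    if stddev_salary is not None and avg_salary and stddev_salary / avg_salary > 0.50:
        flags.append("standard deviation above 50% of average")
    if sample_size < 10:
        flags.append("sample size below 10")
    return flags

print(red_flags(1750.25, 1700, 450.25, 42))  # []
print(len(red_flags(3000, 2000, 1800, 6)))   # 3
```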
Goal: Estimate total compensation costs
Recommended approach:
- Use avg_salary for total cost estimation
- Use median_salary for typical cost per hire
- Use q3_salary for high-cost ("worst-case") scenario planning; note that a quarter of salaries still exceed Q3
- Multiply by expected number of hires
Example:
Planning to hire 5 MIDDLE level developers:
avg_salary: 1750.25
Expected total: 5 × 1750.25 = 8,751.25
Worst case (using Q3):
q3_salary: 2200
Worst case total: 5 × 2200 = 11,000
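The same arithmetic as a short Python sketch. The function name `hiring_budget` is hypothetical. For real budgeting code, a decimal type may be preferable to floats.

```python
def hiring_budget(hires, avg_salary, q3_salary):
    """Expected total from the average and a high-cost total from Q3."""
    return {
        "expected": round(hires * avg_salary, 2),
        "worst_case": hires * q3_salary,
    }

print(hiring_budget(5, 1750.25, 2200))  # {'expected': 8751.25, 'worst_case': 11000}
```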
Goal: Compare compensation across levels or groups
Recommended approach:
- Compare median_salary across levels (most reliable)
- Compare avg_salary as secondary check
- Compare q1_salary and q3_salary for distribution
- Consider stddev_salary for consistency
Example comparison:
JUNIOR:
median: 800
avg: 850
q1: 600, q3: 1000
MIDDLE:
median: 1700
avg: 1750
q1: 1200, q3: 2200
Analysis:
- MIDDLE earns ~2x JUNIOR (median comparison)
- Both have quartiles placed symmetrically around the median (similar distribution shape)
- MIDDLE has a wider interquartile range (1000 vs 400), indicating more variability
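The figures behind this analysis can be reproduced from the example values. The dictionary layout is only one possible way to hold the query results.

```python
# Example values from the comparison above.
levels = {
    "JUNIOR": {"median": 800, "avg": 850, "q1": 600, "q3": 1000},
    "MIDDLE": {"median": 1700, "avg": 1750, "q1": 1200, "q3": 2200},
}

# How many times higher the MIDDLE median is than the JUNIOR median.
median_ratio = levels["MIDDLE"]["median"] / levels["JUNIOR"]["median"]

# Interquartile range per level as a measure of spread.
iqr = {name: s["q3"] - s["q1"] for name, s in levels.items()}

print(median_ratio)  # 2.125
print(iqr)           # {'JUNIOR': 400, 'MIDDLE': 1000}
```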
Reality: Median is often more representative, especially with outliers.
When to use average:
- Symmetric distribution
- No outliers
- Need for mathematical operations (sum, etc.)
When to use median:
- Skewed distribution
- Presence of outliers
- Need for typical value
Reality: Range shows extremes, not typical values. Use quartiles for typical range.
Better approach:
- Use Q1 to Q3 (interquartile range) for where most salaries fall
- Use min/max only to understand absolute limits
Reality: High variability can indicate:
- Diverse skill levels within a level
- Different specializations
- Market flexibility
- Growth opportunities
Context matters: High variability might be expected and acceptable.
Reality: More data helps, but data quality and representativeness matter more.
Consider:
- Is the sample representative?
- Are there selection biases?
- Is the data current and relevant?
| Sample Size | Reliability | Recommendation |
|---|---|---|
| < 10 | Low | Use with extreme caution, may not be representative |
| 10-30 | Moderate | Use with caution, consider confidence intervals |
| 30-100 | Good | Reliable for most purposes |
| > 100 | Excellent | Highly reliable, can perform detailed analysis |
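The table can be encoded as a lookup. Because the listed bands share their endpoints (30 appears in two rows), this sketch assumes a sample of exactly 30 counts as "Good" and exactly 100 also counts as "Good". The function name is hypothetical.

```python
def reliability(sample_size):
    """Map a sample size to the reliability tier from the table."""
    if sample_size < 10:
        return "Low"
    if sample_size < 30:
        return "Moderate"
    if sample_size <= 100:
        return "Good"
    return "Excellent"

for n in (6, 18, 42, 250):
    print(n, reliability(n))
```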
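Confidence intervals are not calculated in the query, but the concept helps when judging how precise the statistics are:
- Larger samples: Narrower confidence intervals, more precise estimates
- Smaller samples: Wider confidence intervals, less precise estimates
- Rough rule of thumb: a margin of error on the order of ±10% for samples of 30-100; the actual margin depends on how variable the salaries are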
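For a rough sense of scale, the margin of error of the average can be estimated from stddev_salary and the sample size. The sketch below uses the normal approximation with z = 1.96 for a 95% interval, which is an assumption and is only approximate for small samples. The inputs reuse the example values avg_salary = 1750.25 and stddev_salary = 450.25.

```python
from math import sqrt

def margin_of_error(stddev_salary, sample_size, z=1.96):
    """Approximate 95% margin of error for the mean (normal approximation)."""
    return z * stddev_salary / sqrt(sample_size)

for n in (30, 100):
    moe = margin_of_error(450.25, n)
    # Show the margin in salary units and as a share of the average.
    print(n, round(moe, 2), f"{moe / 1750.25:.1%}")
```

With these inputs the margin is roughly 9% of the average at 30 applicants and roughly 5% at 100.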
The statistics can be visualized as a box plot:
min_salary      q1_salary      median_salary      q3_salary      max_salary
|---------------[--------------|------------------]--------------|
Box plot elements:
- Whiskers: min to max, or limited to the range from Q1 - 1.5×IQR to Q3 + 1.5×IQR when values beyond that range are plotted separately as outliers
- Box: Q1 to Q3 (interquartile range)
- Line in box: Median
- Average: Often shown as a dot or X
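The 1.5×IQR whisker limits can be computed from the quartile fields. The function name is hypothetical. The inputs reuse the example values from this document (q1_salary = 1200, q3_salary = 2200, min_salary = 500, max_salary = 5000).

```python
def outlier_fences(q1_salary, q3_salary, k=1.5):
    """Lower and upper limits beyond which values are drawn as outliers."""
    iqr = q3_salary - q1_salary
    return q1_salary - k * iqr, q3_salary + k * iqr

low, high = outlier_fences(1200, 2200)
print(low, high)  # -300.0 3700.0

min_is_outlier = 500 < low    # False: the minimum lies inside the limits
max_is_outlier = 5000 > high  # True: the maximum would be drawn as an outlier
print(min_is_outlier, max_is_outlier)
```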
Symmetric (Normal-like):
- Average ≈ Median
- Q1 and Q3 equidistant from median
- Standard deviation moderate
Right-skewed (High outliers):
- Average > Median
- Q3 further from median than Q1
- Standard deviation high
Left-skewed (Low outliers):
- Average < Median
- Q1 further from median than Q3
- Standard deviation high
| Field | Type | Best For | Limitations |
|---|---|---|---|
| min_salary | Integer | Lower bound | May be outlier |
| max_salary | Integer | Upper bound | May be outlier |
| avg_salary | Decimal | Quick overview | Sensitive to outliers |
| median_salary | Integer | Typical value | Less intuitive |
| q1_salary | Integer | Lower quartile | None |
| q3_salary | Integer | Upper quartile | None |
| salary_range | Integer | Total spread | Sensitive to outliers |
| stddev_salary | Decimal | Variability | Requires > 1 data point |
- Target: Use median_salary
- Minimum: Use q1_salary (25th percentile)
- Maximum: Use q3_salary (75th percentile)

- Primary: Use median_salary (most reliable)
- Secondary: Use avg_salary (if symmetric distribution)
- Check: Compare both to assess distribution shape

- Expected: Use avg_salary × number of hires
- Conservative: Use q3_salary × number of hires
- Optimistic: Use q1_salary × number of hires

- Central tendency: Compare median_salary across groups
- Variability: Compare stddev_salary across groups
- Distribution: Compare quartiles (Q1, median, Q3) across groups

- Query Documentation: See salary-statistics-by-level-guide.md
- Query File: See salary-statistics-by-level.sql
- Statistical Concepts:
  - Percentiles and quartiles
  - Measures of central tendency
  - Measures of variability
  - Distribution shapes
- Distribution: Is the distribution symmetric or skewed? (Compare avg vs median)
- Variability: How much do salaries vary? (Check stddev and range)
- Outliers: Are min/max values outliers? (Compare with quartiles)
- Representativeness: Is this sample representative of the population?
- Context: How do these compare to industry standards?
- Trends: How do these compare across different levels or groups?
By understanding these statistics and asking the right questions, you can make informed decisions about compensation, market analysis, and hiring strategies.
This article was prepared by the HR Drone platform to contribute to the development of data-driven HR practices, salary analytics culture, and informed compensation decision-making.