Validate Employment Status Classifications
Source: R/status_labeling.R
Validates the consistency and logical correctness of employment status classifications, checking for impossible combinations and missing labels. Adapted for the enhanced flexible prior value system: it validates patterns rather than specific hardcoded values. Optimized for large datasets.
Usage
validate_status_classifications(
  classified_data,
  rules = get_default_status_rules()
)
Value
List with validation results and any detected issues:
is_valid: Overall validation status
validation_rate: Proportion of valid classifications
missing_labels: Count of missing status labels
impossible_combinations: Detailed breakdown of logical errors
unexpected_statuses: Status labels that don't match expected patterns
missing_critical_statuses: Critical status types that are missing
Details
**Enhanced Validation for Flexible Prior Values**:
The validation has been updated to accommodate the flexible prior value system:
- No longer validates specific prior values (since they can be any numeric code)
- Focuses on structural consistency between arco and stato patterns
- Validates that status labels follow expected patterns (disoccupato, occ_, over_)
- Checks for missing critical status types rather than exact label matches
- Supports custom prior_labels through rules parameter validation
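The pattern-based label check described above can be sketched roughly as follows. This is an illustration only, not the package's implementation: the helper name `find_unexpected_statuses` and the exact regular expressions are assumptions derived from the three prefixes listed above.

```r
# Hypothetical helper (not part of the package): returns labels that match
# none of the expected patterns. The regexes are assumed from the prefixes
# named in the documentation (disoccupato, occ_, over_).
expected_patterns <- c("^disoccupato$", "^occ_", "^over_")

find_unexpected_statuses <- function(stato, patterns = expected_patterns) {
  # A label is accepted if it matches at least one pattern
  combined <- paste(patterns, collapse = "|")
  matches <- grepl(combined, stato)
  # Missing or empty labels belong to a separate check, so exclude them here
  is_missing <- is.na(stato) | stato == ""
  unique(stato[!matches & !is_missing])
}

stato <- c("disoccupato", "occ_fixed", "over_5_ft", "occ_intern", "unknown", NA)
unexpected <- find_unexpected_statuses(stato)
print(unexpected)
```

Because only the prefix is checked, a label such as `occ_intern` built from a custom prior value passes without the validator knowing that value in advance.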
**Validation Checks**:
- Missing or empty status labels
- Impossible combinations (e.g., unemployment with arco > 0)
- Structural inconsistencies between employment counts and status patterns
- Unexpected status label formats
- Missing critical status categories
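A minimal sketch of how the first two checks in this list might be counted is shown below. It is not the package's code: the function name `check_impossible_combinations` and the result field names are hypothetical, and the second rule (an `occ_` or `over_` label with arco equal to 0) is an assumption inferred from the example data, where arco appears to count concurrent employment spells.

```r
# Hypothetical sketch (not the package implementation): counts rows whose
# label is missing or contradicts the employment count in `arco`.
check_impossible_combinations <- function(dt) {
  labelled <- !is.na(dt$stato) & dt$stato != ""
  list(
    missing_labels = sum(!labelled),
    # Documented example: unemployment label while arco > 0
    unemployed_with_jobs = sum(labelled & dt$stato == "disoccupato" & dt$arco > 0),
    # Assumed mirror case: employment label while arco == 0
    employed_without_jobs = sum(labelled & grepl("^(occ_|over_)", dt$stato) & dt$arco == 0)
  )
}

classified <- data.frame(
  arco  = c(0, 1, 2, 1, 0, 1),
  stato = c("disoccupato", "occ_fixed", "over_5_ft", "disoccupato", "occ_ft", NA),
  stringsAsFactors = FALSE
)
issues <- check_impossible_combinations(classified)
print(issues)
```

The checks are vectorized over whole columns rather than looping over rows, which is one plausible way to keep such validation cheap on large datasets.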
Examples
if (FALSE) { # not run
  library(data.table)

  # Sample classified data with flexible prior values
  classified_data <- data.table(
    cf = rep("A001", 4),
    inizio = 1:4,
    fine = 2:5,
    arco = c(0, 1, 2, 1),
    prior = c(NA, 2, 5, 1), # Includes unmapped value 5
    durata = rep(1, 4),
    stato = c("disoccupato", "occ_fixed", "over_5_ft", "occ_ft")
  )

  # Validate with default rules
  validation <- validate_status_classifications(classified_data)
  print(validation$is_valid)
  print(validation$validation_rate)

  # Check for any issues
  if (!validation$is_valid) {
    print(validation$impossible_combinations)
    print(validation$unexpected_statuses)
  }

  # Validate with custom rules that map prior value 5
  custom_rules <- create_custom_status_rules(
    prior_labels = list("1" = "ft", "2" = "fixed", "5" = "intern")
  )
  validation_custom <- validate_status_classifications(classified_data, rules = custom_rules)
  print(validation_custom$is_valid)
}