Merge Original Data Columns with Temporal Segments
Source: R/merge_columns.R
merge_original_columns() enriches vecshift output by merging additional columns from the original contract data. It lets you carry metadata and other attributes from the original employment contracts into the temporal segments created by vecshift().
Arguments
- original_data
A data.table containing the original contract records that were processed by vecshift(). Must include an 'id' column and every column named in the 'columns' parameter.
- segments
A data.table containing the output from vecshift(). Must include an 'id' column, which is used as the merge key.
- columns
Character vector specifying which columns from original_data to merge into segments. Column names must exist in original_data.
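The requirements above can be checked before calling the function. The following is a minimal sketch of such a check; check_merge_inputs() is a hypothetical helper written for illustration, not part of vecshift, and the package's own validation may differ.

```r
library(data.table)

# Hypothetical helper (not part of vecshift): verifies the documented
# preconditions on original_data, segments, and columns.
check_merge_inputs <- function(original_data, segments, columns) {
  stopifnot(
    is.data.table(original_data),
    is.data.table(segments),
    is.character(columns),
    "id" %in% names(original_data),
    "id" %in% names(segments)
  )
  # Every requested column must exist in original_data.
  missing_cols <- setdiff(columns, names(original_data))
  if (length(missing_cols) > 0) {
    stop("Columns not found in original_data: ",
         paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

original_dt <- data.table(id = 1:2, company = c("CompanyA", "CompanyB"))
segments_dt <- data.table(id = c(1L, 0L, 2L))

# Passes: 'company' exists in original_dt.
check_merge_inputs(original_dt, segments_dt, "company")

# Fails: 'salary' is not a column of original_dt; the error is caught here.
ok <- tryCatch(
  check_merge_inputs(original_dt, segments_dt, "salary"),
  error = function(e) FALSE
)
```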
Value
A data.table combining the segments data with the requested columns from original_data. The temporal ordering and structure of segments are preserved. Unemployment periods (id = 0) are included, but they carry no values in the merged columns.
Details
The function joins the original contract data onto the segments using the 'id' column as the key, keeping every segment row (a left join from segments). Since vecshift() can create multiple temporal segments from a single contract, the same original contract data may appear in multiple segment rows.
Unemployment periods (where id = 0) are preserved, but their merged columns hold no values from the original data, because these rows do not correspond to any specific contract.
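The behaviour described above can be reproduced with plain data.table calls. This is only a sketch under stated assumptions: segments_dt is a hand-built stand-in for vecshift() output, the 'seg' column is hypothetical and exists only to track temporal order, and the package's internal implementation may differ.

```r
library(data.table)

# Original contracts with one metadata column.
original_dt <- data.table(
  id = 1:2,
  company = c("CompanyA", "CompanyB")
)

# Hand-built stand-in for vecshift() output (assumption): contract 1 is
# split into two segments, followed by an unemployment period (id = 0).
segments_dt <- data.table(
  seg = 1:4,
  id  = c(1L, 1L, 0L, 2L)
)

cols <- "company"

# Left join: keep every segment row and look up metadata by 'id'.
enriched <- merge(
  segments_dt,
  original_dt[, c("id", cols), with = FALSE],
  by = "id",
  all.x = TRUE,
  sort = FALSE
)

# Restore the temporal order explicitly rather than relying on merge().
setorder(enriched, seg)
print(enriched)
```

In this sketch both segments of contract 1 receive the same company value, while the id = 0 row has NA in company because no contract matches it.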
Examples
if (FALSE) { # \dontrun{
library(data.table)
library(vecshift)

# Create sample employment data with additional metadata
original_dt <- data.table(
  id = 1:3,
  cf = c("ABC123", "ABC123", "DEF456"),
  INIZIO = as.Date(c("2023-01-01", "2023-06-01", "2023-02-01")),
  FINE = as.Date(c("2023-05-31", "2023-12-31", "2023-11-30")),
  prior = c(1, 0, 1),
  company = c("CompanyA", "CompanyB", "CompanyC"),
  salary = c(50000, 25000, 60000),
  department = c("IT", "HR", "Finance")
)

# Transform to temporal segments
segments <- vecshift(original_dt)

# Merge several additional columns at once
enriched <- merge_original_columns(
  original_data = original_dt,
  segments = segments,
  columns = c("company", "salary", "department")
)
print(enriched)

# Merge a single column
with_company <- merge_original_columns(
  original_data = original_dt,
  segments = segments,
  columns = "company"
)
} # }