CadetCareerProblem – Generated Data Corrections Methods¶

`fix_generated_data(printing=None)` ¶

Fix Generated Data Instance.

This method prepares a synthetic cadet–AFSC instance by populating missing data, generating preferences, updating qualification matrices, and ensuring valid preference and utility structures. It is intended only for generated (synthetic) data instances and should not be used on real-world datasets.

Parameters¶

printing : bool, optional
    If True, print progress updates at each step. Defaults to the instance's `self.printing`.

Notes¶

This function assumes that no valid AFSC preference or utility data currently exists. It will:

- Convert utility values to preference ranks
- Generate fake AFSC preference lists
- Create separate rated datasets for each SOC
- Update the qualification matrix based on preferences
- Remove ineligible cadets from all matrices
- Normalize preference matrices to eliminate ranking gaps
- Force first-choice utilities to 100%
- Convert AFSC preferences to percentile scores
- Update cadet preference columns
- Create new utility matrices from cadet columns
- Rebuild rated eligibility lists
- Generate random value parameters
- Perform a final parameter sanity check

Source code in afccp/main.py

def fix_generated_data(self, printing=None):
    """
    Fix Generated Data Instance.

    This method prepares a synthetic cadet–AFSC instance by populating missing data, generating preferences,
    updating qualification matrices, and ensuring valid preference and utility structures. It is intended
    **only** for generated (synthetic) data instances and should not be used on real-world datasets.

    Parameters
    ----------
    printing : bool, optional
        If True, print progress updates at each step. Defaults to the instance's `self.printing`.

    Notes
    -----
    This function assumes that no valid AFSC preference or utility data currently exists. It will:

    - Convert utility values to preference ranks
    - Generate fake AFSC preference lists
    - Create separate rated datasets for each SOC
    - Update the qualification matrix based on preferences
    - Remove ineligible cadets from all matrices
    - Normalize preference matrices to eliminate ranking gaps
    - Force first-choice utilities to 100%
    - Convert AFSC preferences to percentile scores
    - Update cadet preference columns
    - Create new utility matrices from cadet columns
    - Rebuild rated eligibility lists
    - Generate random value parameters
    - Perform a final parameter sanity check
    """

    if printing is None:
        printing = self.printing

    # Get "c_pref_matrix" from cadet preferences
    self.convert_utilities_to_preferences(cadets_as_well=True)

    # Generate AFSC preferences for this problem instance
    self.generate_fake_afsc_preferences()

    # Generate rated data (Separate datasets for each SOC)
    self.generate_rated_data()

    # Update qualification matrix from AFSC preferences (treating CFM lists as "gospel" except for Rated/USSF)
    self.update_qualification_matrix_from_afsc_preferences()

    # Removes ineligible cadets from all 3 matrices: degree qualifications, cadet preferences, AFSC preferences
    self.remove_ineligible_choices(printing=printing)

    # Take the preferences dictionaries and update the matrices from them (using cadet/AFSC indices)
    self.update_preference_matrices()  # 1, 2, 4, 6, 7 -> 1, 2, 3, 4, 5 (preference lists need to omit gaps)

    # Force first choice utility values to be 100%
    self.update_first_choice_cadet_utility_to_one(printing=printing)

    # Convert AFSC preferences to percentiles (0 to 1)
    self.convert_afsc_preferences_to_percentiles()  # 1, 2, 3, 4, 5 -> 1, 0.8, 0.6, 0.4, 0.2

    # The "cadet columns" are located in Cadets.csv and contain the utilities/preferences in order of preference
    self.update_cadet_columns_from_matrices()  # We haven't touched "c_preferences" and "c_utilities" until now

    # Update utility matrix from columns (and create final cadet utility matrix)
    self.update_cadet_utility_matrices_from_cadets_data(printing=printing)

    # Modify rated eligibility by SOC, removing cadets that are on "Rated Cadets" list...
    self.modify_rated_cadet_lists_based_on_eligibility(printing=printing)  # ...but not eligible for any rated AFSC

    # Generate fake (random) set of value parameters
    self.generate_random_value_parameters()

    # Sanity check the parameters to make sure it all looks good! (No issues found.)
    self.parameter_sanity_check()
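
The two inline comments in the listing above (`1, 2, 4, 6, 7 -> 1, 2, 3, 4, 5` and `1, 2, 3, 4, 5 -> 1, 0.8, 0.6, 0.4, 0.2`) can be illustrated with a minimal sketch. The helper names `compact_ranks` and `ranks_to_percentiles` are hypothetical and are not part of afccp. The percentile formula `(n - rank + 1) / n` is an assumption, chosen only because it reproduces the values in the comment; the library's actual computation may differ.

```python
def compact_ranks(ranks):
    """Renumber distinct ranks as 1..n, preserving their relative order."""
    # Indices of the entries, sorted from best (smallest) rank to worst.
    order = sorted(range(len(ranks)), key=lambda i: ranks[i])
    compact = [0] * len(ranks)
    for new_rank, i in enumerate(order, start=1):
        compact[i] = new_rank
    return compact


def ranks_to_percentiles(ranks):
    """Map gap-free ranks 1..n to percentiles in (0, 1]; rank 1 maps to 1.0."""
    n = len(ranks)
    return [(n - r + 1) / n for r in ranks]  # assumed formula, see lead-in


gapped = [1, 2, 4, 6, 7]            # choices 3 and 5 were removed as ineligible
dense = compact_ranks(gapped)       # [1, 2, 3, 4, 5]
percentiles = ranks_to_percentiles(dense)  # [1.0, 0.8, 0.6, 0.4, 0.2]
print(dense, percentiles)
```

Compacting first matters in this sketch: applying the percentile formula to the gapped ranks would yield values outside the intended range.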

`convert_utilities_to_preferences(cadets_as_well=False)` ¶

Converts utility matrices to ordinal preference matrices.

This method transforms the continuous utility values stored in the model's parameters into discrete ordinal preferences used in assignment algorithms. By default, it converts only the AFSC utility matrix (afsc_utility) into the a_pref_matrix. If specified, it also converts the cadet utility matrix (cadet_utility) into the c_pref_matrix.

Parameters¶

cadets_as_well : bool, optional
    If True, both cadet and AFSC utility matrices are converted to preference matrices. If False (default), only AFSC utilities are converted.

Returns¶

None
    This method updates the instance's `parameters` attribute in place.

Note

The resulting a_pref_matrix ranks cadets for each AFSC from most to least preferred.
The optional c_pref_matrix ranks AFSCs for each cadet, using 1-based indexing.

Examples¶

# Convert only AFSC utility matrix
instance.convert_utilities_to_preferences()

# Convert both AFSC and cadet utility matrices
instance.convert_utilities_to_preferences(cadets_as_well=True)
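
As a rough illustration of the utility-to-rank conversion described in this section, the sketch below ranks values from highest utility (rank 1) to lowest. It is not the afccp implementation: the helper `rank_descending`, the tie-breaking rule (earlier index wins), and the toy matrices are all assumptions. Following the note above, cadet preferences are ranked along rows (AFSCs for each cadet) and AFSC preferences along columns (cadets for each AFSC).

```python
def rank_descending(values):
    """Return 1-based ranks, where the largest value receives rank 1."""
    # sorted() is stable, so ties keep their original index order (assumed rule).
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


# Toy data: 2 cadets (rows) x 3 AFSCs (columns).
cadet_utility = [[0.2, 1.0, 0.6],
                 [0.9, 0.1, 0.5]]
afsc_utility = [[0.7, 0.3, 0.9],
                [0.4, 0.8, 0.1]]

# Each cadet ranks the AFSCs: one ranking per row.
c_pref_matrix = [rank_descending(row) for row in cadet_utility]

# Each AFSC ranks the cadets: rank down each column, then restore N x M shape.
ranked_columns = [rank_descending(col) for col in zip(*afsc_utility)]
a_pref_matrix = [list(row) for row in zip(*ranked_columns)]

print(c_pref_matrix)  # [[3, 1, 2], [1, 3, 2]]
print(a_pref_matrix)  # [[1, 2, 1], [2, 1, 2]]
```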

`generate_fake_afsc_preferences(fix_cadet_eligibility=False)` ¶

Generate Simulated AFSC Preferences using certain known parameters.

This method simulates AFSC (Air Force Specialty Code) preferences for cadets using weighted scores derived from merit, tier objectives, and other cadet/AFSC attributes. It supports scenarios where cadet eligibility should be strictly enforced before generating preferences.

Parameters¶

fix_cadet_eligibility : bool, optional
    If True, cadet preferences are regenerated to strictly respect eligibility constraints. If False (default), original eligibility is used as-is when computing preferences.

Returns¶

None
    Updates the `parameters` attribute of the current CadetCareerProblem instance with:

- afsc_utility: N x M utility matrix
- a_pref_matrix: AFSCs' ranked preferences over cadets
- c_pref_matrix: Cadets' ranked preferences over AFSCs
- afsc_preferences, cadet_preferences: Index-based ranking dictionaries
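
The idea of ranking cadets by a weighted score can be sketched as follows. Everything specific here is hypothetical: the function name `simulate_afsc_ranking`, the weights, the tier-to-score mapping, and the uniform noise term are assumptions for illustration only, and the actual afccp scoring may combine different attributes in a different way.

```python
import random


def simulate_afsc_ranking(merit, tier, w_merit=0.6, w_tier=0.4, noise=0.05, seed=0):
    """Score cadets for one AFSC and return (scores, ranking of cadet indices)."""
    rng = random.Random(seed)
    # Assumed mapping: a lower degree tier number means a better fit for the AFSC.
    tier_score = {1: 1.0, 2: 0.75, 3: 0.5, 4: 0.25}
    scores = [
        w_merit * m + w_tier * tier_score[t] + rng.uniform(-noise, noise)
        for m, t in zip(merit, tier)
    ]
    # Most preferred cadet first.
    ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
    return scores, ranking


merit = [0.9, 0.5, 0.7]   # percentile merit of three cadets
tier = [2, 1, 1]          # each cadet's degree tier for this AFSC
scores, ranking = simulate_afsc_ranking(merit, tier)
print(ranking)            # a permutation of [0, 1, 2]
```

Setting `noise=0.0` makes the ranking depend only on the weighted attributes, which is convenient when checking the scoring by hand.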

`generate_rated_data()` ¶

Generate Simulated Rated Board Data for USAFA and ROTC Cadets.

This method generates Order of Merit (OM) data and interest levels for cadets eligible for Rated AFSCs (e.g., Pilot, CSO, RPA, ABM) for each commissioning source. It only generates data if it does not already exist.

Rated OM data is essential for modeling the Air Force's Rated board process, allowing simulation and evaluation of Rated cadet assignments. The interest matrix, a holdover from legacy ROTC Rated Board data, is not essential.

Returns¶

None
    The method updates the internal `self.parameters` dictionary in place by adding:

- ROTC-rated interest matrix (rr_interest_matrix)
- OM matrices for each Source of Commission (SOC), e.g., ur_om_matrix, rr_om_matrix

Examples¶

instance = CadetCareerProblem("Random", N=30, M=6, P=6)
instance.generate_rated_data()  # Adds Rated OM and interest matrices
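
The "only generates data if it does not already exist" behavior can be pictured as a simple guard on the parameters dictionary. This is a sketch under stated assumptions, not the library's code: `ensure_om_matrix` is a hypothetical helper, and filling the matrix with uniform random values is a placeholder for whatever OM generation afccp actually performs.

```python
import random


def ensure_om_matrix(parameters, key, num_cadets, num_rated_afscs, seed=0):
    """Add a random OM matrix under `key` unless one is already present.

    Returns True if a matrix was generated, False if existing data was kept.
    """
    if key in parameters:
        return False  # existing data is left untouched
    rng = random.Random(seed)
    parameters[key] = [
        [rng.random() for _ in range(num_rated_afscs)]
        for _ in range(num_cadets)
    ]
    return True


parameters = {}
created = ensure_om_matrix(parameters, "rr_om_matrix", num_cadets=4, num_rated_afscs=2)
created_again = ensure_om_matrix(parameters, "rr_om_matrix", num_cadets=4, num_rated_afscs=2)
print(created, created_again)  # True False
```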

`generate_random_value_parameters(num_breakpoints=24, vp_weight=100, printing=None)` ¶

Generate Random Value Parameters for Assignment Problem.

This method initializes a new set of randomly generated value-focused thinking (VFT) parameters for a cadet-AFSC matching problem instance. The generated value parameters include random weights, value functions, and AFSC/cadet preferences, making this method ideal for testing and experimentation.

Parameters:¶

num_breakpoints (int, optional): Number of breakpoints to use for linearizing value functions. Defaults to 24.
vp_weight (int, optional): Scalar weight applied to the overall value parameter set. Defaults to 100.
printing (bool, optional): Whether to print progress information. Defaults to the instance attribute.

Returns:¶

dict: A complete dictionary of randomly generated value parameters for the instance.

Example:¶

# Generate and assign a new set of value parameters using 30 breakpoints
vp = instance.generate_random_value_parameters(num_breakpoints=30, vp_weight=80)
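
To show what "breakpoints for linearizing value functions" means in general, here is a minimal sketch of piecewise-linear approximation. The helpers `make_breakpoints` and `piecewise_value` and the quadratic example function are hypothetical; they illustrate the general technique and are not how afccp stores or builds its value functions.

```python
def make_breakpoints(value_fn, lo, hi, num_breakpoints):
    """Sample value_fn at evenly spaced points between lo and hi."""
    step = (hi - lo) / (num_breakpoints - 1)
    xs = [lo + k * step for k in range(num_breakpoints)]
    ys = [value_fn(x) for x in xs]
    return xs, ys


def piecewise_value(xs, ys, x):
    """Evaluate the piecewise-linear approximation at x (clamped to the ends)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for k in range(1, len(xs)):
        if x <= xs[k]:
            t = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
            return ys[k - 1] + t * (ys[k] - ys[k - 1])


# Approximate a smooth (nonlinear) value function with 24 breakpoints.
xs, ys = make_breakpoints(lambda x: x ** 2, 0.0, 1.0, 24)
approx = piecewise_value(xs, ys, 0.5)
print(len(xs), approx)  # 24 breakpoints; approx is close to 0.25
```

More breakpoints give a closer fit to the underlying curve at the cost of a larger linearized model.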

`generate_example_castle_value_curves(num_breakpoints: int = 10)` ¶

Generate and Store Example CASTLE Value Curves.

This method creates example concave value curves for each CASTLE-level AFSC and stores the breakpoint information in the instance's parameters dictionary under the key 'castle_q'.

These curves represent marginal utility of inventory over a range of potential quantities for use in CASTLE sustainment simulations.

Parameters:¶

num_breakpoints (int, optional): Number of breakpoints to use for the concave curve. Defaults to 10.

Returns:¶

None

Example:¶

instance.generate_example_castle_value_curves(num_breakpoints=15)
q = instance.parameters['castle_q']
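
A concave value curve has decreasing marginal value: each additional unit of inventory adds less than the one before. The sketch below builds such a curve from a square-root function and checks that its segment slopes decrease. The function `concave_value_curve`, the square-root shape, and the `"q"`/`"v"` dictionary keys are assumptions for illustration; the actual structure stored under `'castle_q'` is not shown in this document.

```python
import math


def concave_value_curve(q_max, num_breakpoints=10):
    """Return breakpoints (q, v) of a concave curve rising from 0 to 1."""
    q = [q_max * k / (num_breakpoints - 1) for k in range(num_breakpoints)]
    v = [math.sqrt(x / q_max) for x in q]  # assumed shape: square root
    return {"q": q, "v": v}


curve = concave_value_curve(q_max=90, num_breakpoints=10)

# Marginal value per unit of inventory on each segment.
slopes = [
    (curve["v"][k] - curve["v"][k - 1]) / (curve["q"][k] - curve["q"][k - 1])
    for k in range(1, len(curve["q"]))
]
is_concave = all(a > b for a, b in zip(slopes, slopes[1:]))
print(is_concave)  # True
```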
