TrainingSet (starclass.training_sets.TrainingSet)
- class starclass.training_sets.TrainingSet(level='L1', datalevel='corr', tf=0.0, linfit=False, random_seed=42)
- Bases: object
- Generic Training Set.
 - key
- Unique identifier for training set. - Type:
- str 
 
 - linfit
- Indicating if linfit mechanism is enabled. - Type:
- bool 
 
 - testfraction
- Test-fraction. - Type:
- float 
 
 - StellarClasses
- Enum of the classes associated with this training set. - Type:
- enum 
 
 - random_seed
- Random seed in use. - Type:
- int 
 
 - features_cache
- Path to directory where cache of extracted features is being stored. - Type:
- str 
 
 - train_idx
- Type:
- ndarray 
 
 - test_idx
- Type:
- ndarray 
 
 - crossval_folds
- Number of cross-validation folds the training set has been split into. If 0, the training set has not been split. - Type:
- int 
 
 - fold
- The current cross-validation fold. This is 0 in the original training set. - Type:
- int 
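
As a rough illustration of how testfraction, random_seed, train_idx and test_idx relate to each other, the sketch below splits a range of indices reproducibly. It is not the starclass implementation: the helper name split_indices and the plain shuffled split are assumptions made only for this example.

```python
import random


def split_indices(n_targets, testfraction=0.0, random_seed=42):
    """Hypothetical helper: split range(n_targets) into train and test indices."""
    if not 0 <= testfraction < 1:
        raise ValueError("testfraction must be in the interval [0, 1)")
    indices = list(range(n_targets))
    # A dedicated generator keeps the split reproducible for a given seed.
    random.Random(random_seed).shuffle(indices)
    n_test = int(round(n_targets * testfraction))
    test_idx = sorted(indices[:n_test])
    train_idx = sorted(indices[n_test:])
    return train_idx, test_idx


train_idx, test_idx = split_indices(10, testfraction=0.2, random_seed=42)
```

With a test-fraction of 0 the test indices are empty and every target ends up in the training indices, which matches the documented default of tf=0.0.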
 
 - __init__(level='L1', datalevel='corr', tf=0.0, linfit=False, random_seed=42)
- Initialize TrainingSet. - Parameters:
- level (str) – Level of the classification. Choices are 'L1' and 'L2'. Default is 'L1'.
- tf (float) – Test-fraction. Default=0. 
- linfit (bool) – Should linfit be enabled for the training set? If linfit is enabled, lightcurves will be detrended using a linear trend before being passed on to have frequencies extracted. See BaseClassifier.calc_features() for details.
- random_seed (int) – Random seed. Default=42. 
- datalevel (str) – Deprecated. 
 
 - Code author: Rasmus Handberg <rasmush@phys.au.dk> 
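
To make the linfit option more concrete, the following sketch removes a least-squares straight line from a lightcurve. It only illustrates what detrending with a linear trend means; the helper detrend_linear and the list-based inputs are assumptions, and the actual handling in BaseClassifier.calc_features() may differ.

```python
def detrend_linear(time, flux):
    """Hypothetical helper: subtract the least-squares straight line from flux."""
    n = len(time)
    if n < 2 or n != len(flux):
        raise ValueError("time and flux must have equal length of at least 2")
    mean_t = sum(time) / n
    mean_f = sum(flux) / n
    var_t = sum((t - mean_t) ** 2 for t in time)
    if var_t == 0:
        raise ValueError("time values must not all be identical")
    # Ordinary least-squares slope and intercept of flux against time.
    slope = sum((t - mean_t) * (f - mean_f) for t, f in zip(time, flux)) / var_t
    intercept = mean_f - slope * mean_t
    # The residuals are what would be passed on for frequency extraction.
    return [f - (slope * t + intercept) for t, f in zip(time, flux)]


time = [0.0, 1.0, 2.0, 3.0]
flux = [1.0, 3.0, 5.0, 7.0]  # exactly linear: flux = 2 * time + 1
residuals = detrend_linear(time, flux)
```

For the purely linear input above the residuals are zero to within rounding, so any remaining signal in a real lightcurve would be variability on top of the trend.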
 - clear_cache()
- Clear features cache. - This will delete the features cache directory in the training-set data directory, and delete all MOAT cache tables in the training-set. - Code author: Rasmus Handberg <rasmush@phys.au.dk> 
 - features()
- Iterator of features for training. - Returns:
- Iterator of dicts containing features to be used for training. 
- Return type:
- Iterator 
 - Code author: Rasmus Handberg <rasmush@phys.au.dk> 
 - features_test()
- Iterator of features for testing. - Returns:
- Iterator of dicts containing features to be used for testing. 
- Return type:
- Iterator 
 - Code author: Rasmus Handberg <rasmush@phys.au.dk> 
 - classmethod find_input_folder()
- Find the folder containing the data for the training set. - This is a class method, so it can be called without having to initialize the training set. 
 - folds(n_splits=5)
- Split training set object into stratified folds. - Parameters:
- n_splits (int, optional) – Number of folds to split training set into. Default=5. 
- Returns:
- Iterator of folds, which are also TrainingSet objects.
 
- Return type:
- Iterator of TrainingSet objects
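
The idea of stratified folds can be sketched independently of starclass. The example below deals the members of each class out round-robin, so every fold receives a similar share of each class. It yields plain index lists rather than TrainingSet objects, and the helper name stratified_folds and its round-robin strategy are assumptions for illustration, not the method used by folds().

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, random_seed=42):
    """Hypothetical helper: yield (train_idx, test_idx) with per-class balance."""
    if n_splits < 2:
        raise ValueError("n_splits must be at least 2")
    rng = random.Random(random_seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    fold_members = [[] for _ in range(n_splits)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal each class out round-robin so every fold gets a similar share.
        for position, idx in enumerate(members):
            fold_members[position % n_splits].append(idx)
    for k in range(n_splits):
        test_idx = sorted(fold_members[k])
        train_idx = sorted(
            i for j, fold in enumerate(fold_members) if j != k for i in fold
        )
        yield train_idx, test_idx


labels = ["A"] * 10 + ["B"] * 5
folds = list(stratified_folds(labels, n_splits=5))
```

Each of the five folds here holds two targets of class "A" and one of class "B" in its test part, mirroring the 2:1 class ratio of the full set.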
 
 - generate_todolist()
- Generate todo.sqlite file in training set directory. - Code author: Rasmus Handberg <rasmush@phys.au.dk> 
 - labels()
- Labels of training-set. - Returns:
- Tuple of labels associated with features in features(). Each element is itself a tuple of enums of StellarClasses.
- Return type:
- tuple 
 - Code author: Rasmus Handberg <rasmush@phys.au.dk> 
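
Because labels() returns one tuple of enum members per entry produced by features(), the two can be walked in step. The sketch below uses stand-ins only: the enum ExampleClasses, its members, and the feature keys "priority" and "variance" are all hypothetical and are not taken from starclass.

```python
from enum import Enum


class ExampleClasses(Enum):
    """Hypothetical stand-in for a training set's StellarClasses enum."""
    SOLARLIKE = "solar"
    ECLIPSE = "eclipse"


def example_features():
    """Hypothetical stand-in for features(): yields one dict per target."""
    yield {"priority": 1, "variance": 0.5}
    yield {"priority": 2, "variance": 1.5}


# Stand-in for labels(): a tuple with one tuple of enum members per target.
example_labels = (
    (ExampleClasses.SOLARLIKE,),
    (ExampleClasses.ECLIPSE, ExampleClasses.SOLARLIKE),
)

# Pair each feature dict with the class names assigned to the same target.
paired = [
    (feat["priority"], [cls.name for cls in lbl])
    for feat, lbl in zip(example_features(), example_labels)
]
```

Note that a single target may carry more than one class, which is why each label is a tuple rather than a single enum member.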
 - labels_test()
- Labels of test-set. - Returns:
- Tuple of labels associated with features in features_test(). Each element is itself a tuple of enums of StellarClasses.
- Return type:
- tuple 
 - Code author: Rasmus Handberg <rasmush@phys.au.dk> 
 - tset_datadir(url)
- Set up the TrainingSet data directory. If the directory doesn’t already exist, the training set is downloaded from the given URL. - Parameters:
- url (string) – URL from where to download the training-set if it doesn’t already exist. 
- Returns:
- Path to directory where training set is stored. 
- Return type:
- string 
 - Code author: Rasmus Handberg <rasmush@phys.au.dk>
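
The "download only if the directory is missing" behaviour described for the url parameter can be sketched as follows. This is not the starclass code: the helper ensure_datadir, the directory name, the placeholder URL and the injected fetch callable are all assumptions, chosen so that the example performs no real network access.

```python
import os
import tempfile


def ensure_datadir(root, name, url, fetch):
    """Hypothetical helper: return the data directory, fetching it on first use."""
    datadir = os.path.join(root, name)
    if not os.path.isdir(datadir):
        os.makedirs(datadir)
        # fetch is a caller-supplied callable, so no real download happens here.
        fetch(url, datadir)
    return datadir


calls = []
example_url = "https://example.invalid/tset.zip"  # placeholder, not a real URL


def record_fetch(url, destination):
    """Stand-in downloader that only records the requested URL."""
    calls.append(url)


with tempfile.TemporaryDirectory() as root:
    first = ensure_datadir(root, "example_tset", example_url, record_fetch)
    second = ensure_datadir(root, "example_tset", example_url, record_fetch)
    existed = os.path.isdir(first)
```

The second call finds the directory already in place and returns the same path without invoking the downloader again.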