Task Manager (starclass.TaskManager)
- class starclass.TaskManager(todo_file, cleanup=False, readonly=False, overwrite=False, classes=None, load_into_memory=False, backup_interval=10000)[source]
Bases: object
A TaskManager which keeps track of which targets to process.
Code author: Rasmus Handberg <rasmush@phys.au.dk>
- __init__(todo_file, cleanup=False, readonly=False, overwrite=False, classes=None, load_into_memory=False, backup_interval=10000)[source]
Initialize the TaskManager which keeps track of which targets to process.
- Parameters:
todo_file (str) – Path to the TODO-file.
cleanup (bool) – Perform cleanup/optimization of TODO-file before doing initialization. Default=False.
overwrite (bool) – Overwrite any previously calculated results. Default=False.
classes (Enum) – Possible stellar classes. This is only used for translating saved stellar classes in the other_classifiers table into proper enums.
load_into_memory (bool) – Create an in-memory copy of the entire TODO-file, and work off this copy to speed up queries. Will result in larger memory use. Default=False.
backup_interval (int) – Save in-memory copy of database to disk after this number of results saved by save_results(). Default=10000.
- Raises:
FileNotFoundError – If TODO-file could not be found.
Code author: Rasmus Handberg <rasmush@phys.au.dk>
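As a minimal sketch of how the constructor might be called, the helper below wraps construction and handles the documented FileNotFoundError. The helper name open_manager and its manager_cls argument are hypothetical and not part of starclass; in practice manager_cls would be starclass.TaskManager. The keyword values are illustrative only.

```python
def open_manager(manager_cls, todo_file, **kwargs):
    """Construct a task manager, or return None if the TODO-file is missing.

    `manager_cls` is assumed to accept the keyword arguments documented
    above (for example starclass.TaskManager). This helper is hypothetical.
    """
    # Illustrative options; explicit keyword arguments override them.
    options = {'load_into_memory': True, 'backup_interval': 1000}
    options.update(kwargs)
    try:
        return manager_cls(todo_file, **options)
    except FileNotFoundError:
        # Documented behaviour when the TODO-file could not be found.
        return None
```

Taking the class as an argument keeps the sketch independent of any particular installation; returning None on a missing file is one possible design, and re-raising may be more appropriate in a pipeline.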
- assign_final_class(tset, data_dir=None)[source]
Assign final classes based on all starclass results.
This will create a new column in the todolist table named “final_class”.
- Parameters:
tset (TrainingSet) – Training-set used.
data_dir (str, optional) – Data directory to load models from.
Code author: Rasmus Handberg <rasmush@phys.au.dk>
- backup()[source]
Save backup of todo-file to disk. This only has an effect when load_into_memory is enabled.
Code author: Rasmus Handberg <rasmush@phys.au.dk>
- get_number_tasks(classifier=None)[source]
Get number of tasks to be processed.
- Parameters:
classifier (str, optional) – Constrain to tasks missing from this classifier.
- Returns:
Number of tasks due to be processed.
- Return type:
int
Code author: Rasmus Handberg <rasmush@phys.au.dk>
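A short sketch of how get_number_tasks() could be used for progress reporting. The function name remaining_by_classifier is hypothetical, and the classifier names passed in are whatever the caller uses; the only call made on the manager is get_number_tasks(classifier=...) as documented above.

```python
def remaining_by_classifier(tm, classifiers):
    """Return a dict mapping each classifier name to its number of pending tasks.

    `tm` is assumed to be a TaskManager-like object exposing
    get_number_tasks(classifier=...). The key None holds the
    unconstrained count (no classifier given).
    """
    counts = {None: tm.get_number_tasks()}
    for name in classifiers:
        counts[name] = tm.get_number_tasks(classifier=name)
    return counts
```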
- get_task(priority=None, classifier=None, change_classifier=True, chunk=1, ignore_existing=False)[source]
Get next task to be processed.
- Parameters:
priority (integer)
classifier (string) – Classifier to get next task for. If no tasks are available for this classifier, and change_classifier=True, a task for another classifier will be returned.
change_classifier (boolean) – Return task for another classifier if there are no more tasks for the provided classifier. Default=True.
chunk (int, optional) – Chunk of tasks to return. Default is to not chunk (=1).
- Returns:
List of dictionaries of settings for tasks. If no tasks are found, None is returned.
- Return type:
list or None
Code author: Rasmus Handberg <rasmush@phys.au.dk>
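Because get_task() returns None rather than an empty list when nothing is left, callers need to handle both cases. The sketch below shows one way to normalise the return value; fetch_chunk is a hypothetical helper, and only the documented classifier, change_classifier and chunk arguments are used.

```python
def fetch_chunk(tm, classifier, chunk=10):
    """Fetch up to `chunk` tasks for one classifier, always returning a list.

    `tm` is assumed to be a TaskManager-like object. With
    change_classifier=False no task for another classifier is substituted,
    so an empty list means this classifier has nothing left to do.
    """
    tasks = tm.get_task(classifier=classifier, change_classifier=False, chunk=chunk)
    if tasks is None:
        return []
    return tasks
```

This sketch only fetches tasks; it does not mark them as started.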
- moat_clear()[source]
Clear Mother Of All Tables (MOAT).
Code author: Rasmus Handberg <rasmush@phys.au.dk>
- moat_query(classifier, priority)[source]
Query Mother Of All Tables (MOAT) for cached features.
- Parameters:
classifier (str)
priority (int)
- Returns:
Dictionary with features stored in MOAT.
- Return type:
dict
Code author: Rasmus Handberg <rasmush@phys.au.dk>
- save_results(results)[source]
Save results, or list of results, to TODO-file.
- Parameters:
results (list or dict) – Dictionary of results and diagnostics.
- Raises:
ValueError – If attempting to save results from multiple different training sets.
Code author: Rasmus Handberg <rasmush@phys.au.dk>
- save_settings()[source]
Save settings to TODO-file and create method-specific columns in diagnostics_corr table.
Code author: Rasmus Handberg <rasmush@phys.au.dk>
- start_task(tasks)[source]
Mark tasks as STARTED in the TODO-list.
- Parameters:
tasks (list or dict) – Task or list of tasks coming from get_task().
Code author: Rasmus Handberg <rasmush@phys.au.dk>
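The methods above suggest a fetch, mark, process, save cycle. The following is a minimal sketch of such a loop under stated assumptions: run_worker and the classify callable are hypothetical, and classify is assumed to return a result dictionary in whatever form save_results() expects, which this page does not specify.

```python
def run_worker(tm, classify, classifier=None, chunk=10):
    """Process tasks in chunks until the manager has none left.

    `tm` is assumed to be a TaskManager-like object; `classify` is a
    hypothetical callable mapping one task dict to one result dict.
    Returns the number of results saved.
    """
    n_saved = 0
    while True:
        tasks = tm.get_task(classifier=classifier, chunk=chunk)
        if tasks is None:
            # No more tasks to process.
            break
        # Mark the tasks as STARTED before doing any work on them.
        tm.start_task(tasks)
        results = [classify(task) for task in tasks]
        tm.save_results(results)
        n_saved += len(results)
    return n_saved
```

Marking tasks as started before processing them is a design choice in this sketch, intended to avoid the same tasks being handed out again while they are still being worked on.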