Task Manager (starclass.TaskManager)

class starclass.TaskManager(todo_file, cleanup=False, readonly=False, overwrite=False, classes=None, load_into_memory=False, backup_interval=10000)[source]

Bases: object

A TaskManager which keeps track of which targets to process.

Code author: Rasmus Handberg <rasmush@phys.au.dk>

__init__(todo_file, cleanup=False, readonly=False, overwrite=False, classes=None, load_into_memory=False, backup_interval=10000)[source]

Initialize the TaskManager which keeps track of which targets to process.

Parameters:
  • todo_file (str) – Path to the TODO-file.

  • cleanup (bool) – Perform cleanup/optimization of TODO-file before doing initialization. Default=False.

  • readonly (bool) – Open the TODO-file in read-only mode. Default=False.

  • overwrite (bool) – Overwrite any previously calculated results. Default=False.

  • classes (Enum) – Possible stellar classes. This is only used for translating saved stellar classes in the other_classifiers table into proper enums.

  • load_into_memory (bool) – Create an in-memory copy of the entire TODO-file, and work off this copy to speed up queries. Will result in larger memory use. Default=False.

  • backup_interval (int) – Save the in-memory copy of the database to disk after this many results have been saved by save_results(). Default=10000.

Raises:

FileNotFoundError – If TODO-file could not be found.

Code author: Rasmus Handberg <rasmush@phys.au.dk>
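
The interplay between load_into_memory and backup_interval can be illustrated with a minimal sketch. It assumes the TODO-file is a SQLite database, which this page does not state explicitly, and the class InMemoryCopy and its method record_saved are hypothetical names that are not part of starclass. It is an illustration of the general technique, not the TaskManager implementation.

```python
import os
import sqlite3
from contextlib import closing


class InMemoryCopy:
    """Hypothetical helper: work on an in-memory copy of a SQLite file."""

    def __init__(self, path, backup_interval=10000):
        # Mirror the documented behaviour of failing early on a missing file.
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
        self.path = path
        self.backup_interval = backup_interval
        self._unsaved = 0
        # Copy the on-disk database into memory so queries avoid disk I/O.
        self.conn = sqlite3.connect(":memory:")
        with closing(sqlite3.connect(path)) as disk:
            disk.backup(self.conn)

    def record_saved(self, n=1):
        """Count saved results and back up once the interval is reached."""
        self._unsaved += n
        if self._unsaved >= self.backup_interval:
            self.backup()

    def backup(self):
        """Write the in-memory database back to the file on disk."""
        self.conn.commit()
        with closing(sqlite3.connect(self.path)) as disk:
            self.conn.backup(disk)
        self._unsaved = 0

    def close(self):
        """Flush any remaining changes to disk and release the connection."""
        self.backup()
        self.conn.close()
```

A smaller interval in such a scheme loses less work if the process dies, at the cost of more frequent writes of the whole database to disk.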

assign_final_class(tset, data_dir=None)[source]

Assign final classes based on all starclass results.

This will create a new column in the todolist table named “final_class”.

Parameters:
  • tset (TrainingSet) – Training-set used.

  • data_dir (str, optional) – Data directory to load models from.

Code author: Rasmus Handberg <rasmush@phys.au.dk>

backup()[source]

Save backup of TODO-file to disk. This only has an effect when load_into_memory is enabled.

Code author: Rasmus Handberg <rasmush@phys.au.dk>

close()[source]

Close TaskManager and all associated objects.

get_number_tasks(classifier=None)[source]

Get number of tasks to be processed.

Parameters:

classifier (str, optional) – Constrain to tasks missing from this classifier.

Returns:

Number of tasks due to be processed.

Return type:

int

Code author: Rasmus Handberg <rasmush@phys.au.dk>

get_task(priority=None, classifier=None, change_classifier=True, chunk=1, ignore_existing=False)[source]

Get next task to be processed.

Parameters:
  • priority (integer)

  • classifier (string) – Classifier to get next task for. If no tasks are available for this classifier, and change_classifier=True, a task for another classifier will be returned.

  • change_classifier (boolean) – Return task for another classifier if there are no more tasks for the provided classifier. Default=True.

  • chunk (int, optional) – Number of tasks to return at once. Default is to not chunk (=1).

Returns:

List of dictionaries of settings for tasks.

If no tasks are found, None is returned.

Return type:

list or None

Code author: Rasmus Handberg <rasmush@phys.au.dk>

moat_clear()[source]

Clear Mother Of All Tables (MOAT).

Code author: Rasmus Handberg <rasmush@phys.au.dk>

moat_create(classifier, columns)[source]

moat_query(classifier, priority)[source]

Query Mother Of All Tables (MOAT) for cached features.

Parameters:
  • classifier (str)

  • priority (int)

Returns:

Dictionary with features stored in MOAT.

Return type:

dict

Code author: Rasmus Handberg <rasmush@phys.au.dk>
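
The idea of caching features per classifier and looking them up by priority can be sketched as follows. Everything here is an assumption made for illustration: the class FeatureCache, its store method, the table naming, the REAL column type, and returning None on a cache miss are hypothetical and may differ from how the MOAT is actually laid out.

```python
import sqlite3


class FeatureCache:
    """Hypothetical stand-in for a MOAT-style feature cache."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.row_factory = sqlite3.Row
        self._classifiers = set()

    def create(self, classifier, columns):
        # Identifiers cannot be bound as SQL parameters,
        # so only trusted names should be passed in here.
        cols = ", ".join(f'"{c}" REAL' for c in columns)
        self.conn.execute(
            f'CREATE TABLE IF NOT EXISTS "features_{classifier}" '
            f'(priority INTEGER PRIMARY KEY, {cols})'
        )
        self._classifiers.add(classifier)

    def store(self, classifier, priority, features):
        # Insert or overwrite the cached features for one target.
        names = ", ".join(f'"{k}"' for k in features)
        marks = ", ".join("?" for _ in features)
        self.conn.execute(
            f'INSERT OR REPLACE INTO "features_{classifier}" '
            f'(priority, {names}) VALUES (?, {marks})',
            [priority, *features.values()],
        )

    def query(self, classifier, priority):
        # Return the cached features as a dict, or None on a cache miss.
        row = self.conn.execute(
            f'SELECT * FROM "features_{classifier}" WHERE priority=?',
            [priority],
        ).fetchone()
        if row is None:
            return None
        features = dict(row)
        features.pop("priority")
        return features

    def clear(self):
        # Drop every feature table created through this cache.
        for classifier in self._classifiers:
            self.conn.execute(f'DROP TABLE IF EXISTS "features_{classifier}"')
        self._classifiers.clear()
```

Keying the cache on priority lets a classifier skip recomputing expensive features for a target it has already seen.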

save_results(results)[source]

Save results, or list of results, to TODO-file.

Parameters:

results (list or dict) – Dictionary of results and diagnostics.

Raises:

ValueError – If attempting to save results from multiple different training sets.

Code author: Rasmus Handberg <rasmush@phys.au.dk>
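
The kind of check behind the ValueError above can be sketched as follows. The function name check_single_training_set and the 'tset' key are hypothetical; the actual layout of a result dictionary is not described on this page.

```python
def check_single_training_set(results):
    """Raise ValueError if results come from more than one training set."""
    # Accept a single result as well as a list of results.
    if isinstance(results, dict):
        results = [results]
    # 'tset' is a hypothetical key naming the training set behind each result.
    tsets = {res["tset"] for res in results}
    if len(tsets) > 1:
        raise ValueError(
            f"Results from multiple training sets: {sorted(tsets)}"
        )
    return results
```

Running such a check before writing anything keeps results from different training sets from being mixed in one TODO-file.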

save_settings()[source]

Save settings to TODO-file and create method-specific columns in diagnostics_corr table.

Code author: Rasmus Handberg <rasmush@phys.au.dk>

start_task(tasks)[source]

Mark tasks as STARTED in the TODO-list.

Parameters:

tasks (list or dict) – Task or list of tasks coming from get_task().

Code author: Rasmus Handberg <rasmush@phys.au.dk>
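
The methods get_task(), start_task() and save_results() are typically combined in a processing loop. The following is a minimal sketch of such a loop: run_tasks, the classify callable and StubTaskManager are hypothetical names, and the stub exists only so the loop can be run without a real TODO-file. The keys used in the task and result dictionaries are likewise assumptions.

```python
def run_tasks(tm, classify, classifier=None, chunk=1):
    """Process tasks until the manager reports that none are left."""
    processed = 0
    while True:
        tasks = tm.get_task(classifier=classifier, chunk=chunk)
        if tasks is None:
            break  # Documented signal that no tasks were found.
        if isinstance(tasks, dict):
            tasks = [tasks]  # Defensive: tolerate a single task.
        tm.start_task(tasks)
        results = [classify(task) for task in tasks]
        tm.save_results(results)
        processed += len(results)
    return processed


class StubTaskManager:
    """Hypothetical in-memory stand-in, used only to exercise the loop."""

    def __init__(self, priorities):
        self.pending = [{"priority": p} for p in priorities]
        self.started = []
        self.saved = []

    def get_task(self, priority=None, classifier=None,
                 change_classifier=True, chunk=1, ignore_existing=False):
        if not self.pending:
            return None
        # Hand out at most 'chunk' tasks per call.
        tasks, self.pending = self.pending[:chunk], self.pending[chunk:]
        return tasks

    def start_task(self, tasks):
        self.started.extend(tasks)

    def save_results(self, results):
        self.saved.extend(results)
```

With a real TaskManager instance, the same function would be called as run_tasks(tm, classify) and followed by tm.close(). A larger chunk reduces the number of round trips to the TODO-file per processed target.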