DataLoadingJob

ppg2.DataLoadingJob

ppg2.DataLoadingJob(
    job_id,
    load_function,
    resources: Resources = Resources.SingleCore,
    depend_on_function=True,
)
  • job_id: The job's unique identifier.
  • load_function: A parameterless function that stores data somewhere.
  • resources: See Resources.
  • depend_on_function: Whether to create a FunctionInvariant for the load_function. See FunctionInvariant.

This is an ephemeral job, but the loaded data is never unloaded.

This job runs in the controlling process (see process-structure).

Prefer an AttributeLoadingJob if possible, since that can unload the data as well.

The load_function should return an object; its DeepHash representation (computed with DeepDiff) is used to calculate the tracking hash of the job.
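A minimal sketch of such a load function follows. The names `loaded`, `load_gene_lengths` and `build_job` are hypothetical and chosen only for illustration; the sketch also assumes that a pipegraph has already been created before `build_job()` is called.

```python
# Hypothetical module-level store that the load function fills.
loaded = {}


def load_gene_lengths():
    """Parameterless load function: store the data, then return it for hashing."""
    data = {"geneA": 1200, "geneB": 850}  # stand-in for an expensive load
    loaded["gene_lengths"] = data
    return data  # the returned object is what the tracking hash is based on


def build_job():
    """Register the job. Assumes pypipegraph2 is installed and a graph exists."""
    import pypipegraph2 as ppg2  # imported lazily so the sketch loads without it

    return ppg2.DataLoadingJob("load_gene_lengths", load_gene_lengths)
```

Downstream jobs would then read from `loaded["gene_lengths"]`. Returning the data itself means a change in the loaded values changes the tracking hash.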

Alternatively, you may return UseInputHashesForOutput(), which signals the job to use a hash of its input hashes as the tracking hash, making it essentially 'transparent'. This is useful for objects that DeepDiff does not currently support.
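As a hedged sketch of this pattern: the example below assumes that UseInputHashesForOutput is reachable as `ppg2.UseInputHashesForOutput`, and it uses a `threading.Lock` purely as a stand-in for an object that is assumed not to be hashable by DeepDiff. The names `handles` and `load_handle` are hypothetical.

```python
import threading

# Hypothetical module-level store for the loaded object.
handles = {}


def load_handle():
    """Load an object that cannot be hashed and defer to the input hashes."""
    import pypipegraph2 as ppg2  # imported lazily; requires pypipegraph2

    handles["lock"] = threading.Lock()  # stand-in for an unsupported object
    # Assumption: the marker class is exported at the package top level.
    return ppg2.UseInputHashesForOutput()
```

With this return value the job's tracking hash depends only on its inputs, so downstream jobs are invalidated when the inputs change, not when the loaded object does.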

Returning None is no longer supported.

You must return a value, or UseInputHashesForOutput.

Return a constant value if you really want this job to never invalidate its downstreams.

If you already have a hash handy (or your data is not pickleable), you can return a ppg2.ValuePlusHash(value, hash_hexdigest) object to circumvent the pickling requirement.
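One way this could look, as a sketch only: the hash is computed with `hashlib.sha256` over the raw bytes, which is a choice made for this example and not something the library prescribes. The names `raw_store`, `compute_digest` and `load_raw` are hypothetical.

```python
import hashlib

# Hypothetical module-level store for the loaded bytes.
raw_store = {}


def compute_digest(raw: bytes) -> str:
    """Return a hex digest of the raw bytes (SHA-256 chosen for illustration)."""
    return hashlib.sha256(raw).hexdigest()


def load_raw():
    """Load function that supplies its own hash instead of relying on pickling."""
    import pypipegraph2 as ppg2  # imported lazily; requires pypipegraph2

    raw = b"example payload"  # stand-in for data read from disk or a socket
    raw_store["payload"] = raw
    return ppg2.ValuePlusHash(raw, compute_digest(raw))
```

Any stable hex digest should work here, as long as it changes whenever the underlying data changes.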