Infinity Tools Package#

Sometimes more control is needed than is provided by the high-level workflows reviewed so far. In this case, the Infinity Tools Python package – our software that sits between direct REST API interaction and the high-level workflows – is the backbone of front-end access to the Infinity API. Whether you are using our workflows or are interested in developing your own code, being familiar with Infinity Tools is very helpful.

What is the purpose of Infinity Tools?#

Infinity Tools wraps all of the functionality available via our REST API, but in a higher-level and developer-friendly way. It also provides tools to enable more complex data science workflows.

What are the key abstractions provided by Infinity Tools?#

Two fundamental concepts introduced by Infinity Tools are the Session and the Batch. Each is implemented as a Python class, and almost all of the functionality you’ll need for interacting with the REST API is available as a method on one or the other.

  • An API Session. The fundamental workflow for light- and power-users alike is to generate one or more batches of synthetic data targeting a specific generator. An Infinity Tools Session wraps this process and provides various ergonomic facilities. Users create a new session by providing their authentication token and a target generator. A session is then minted with behavior specialized for the selected generator. The session allows for more efficient synthetic data generation with fewer errors and less boilerplate code.

  • A synthetic data Batch. Whether producing 1 preview image or 1,000 videos, or using any of our application-specific generators, there is only one way to generate data with our API – submit a batch. At its core, a batch is defined by a list of parameters for a target generator, but Infinity Tools provides many abstractions for generating batches, validating their parameters, submitting them to the API, querying their status in the cloud, downloading and analyzing results, and reconstituting old batches for further use. A minimal end-to-end sketch of both concepts follows this list.
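
To make these abstractions concrete, below is a minimal end-to-end sketch that strings together the session and batch methods covered in the rest of this page:

from infinity_tools.visionfit.api import VisionFitSession

# Create a session bound to a specific generator.
sesh = VisionFitSession(token="API_TOKEN", generator="visionfit-v0.4.0")

# Build job parameters, submit them as a preview batch, and wait for the results.
job_params = [sesh.sample_input(num_reps=1) for _ in range(3)]
batch = sesh.submit(job_params, is_preview=True, batch_name="end-to-end sketch")
batch.await_completion(timeout=60 * 60)
batch.download(path="tmp/sketch")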

Note

In the next two sections we will walk through core and advanced features of Infinity Tools with explanatory code snippets. The last section highlights and links to end-to-end example notebooks built with Infinity Tools.

Core Infinity Tools#

The primary role of Infinity Tools is to facilitate interaction with the Infinity REST API using the Session and Batch tools.

Using a Session#

The visionfit.api submodule has a Session tailored to the VisionFit generator:

from infinity_tools.visionfit.api import VisionFitSession

sesh = VisionFitSession(token="API_TOKEN", generator="visionfit-v0.4.0")

Get parameter options#

The session can print important metadata about job parameters that is helpful when constructing a batch to submit. This includes the type of each parameter, its constraints (min and max values, finite choices, etc.), and its default value.

from pprint import pprint
pprint(sesh.parameter_info)

Specify job parameters with sample_input()#

A Session’s sample_input method lets you specify the job parameters you care about while randomly sampling those you don’t. A standard way of preparing a batch to submit to the API is to call this method in a loop (or otherwise append new job parameters to a Python list), one entry for each job you would like in the batch.

import random

num_jobs = 10
light_min = sesh.parameter_info["lighting_power"]["options"]["min"]
light_max = sesh.parameter_info["lighting_power"]["options"]["max"]
job_params = [sesh.sample_input(
    exercise="UPPERCUT-LEFT",
    num_reps=1,
    lighting_power=random.uniform(light_min, light_max),
) for _ in range(num_jobs)]

Specify job parameters without sample_input()#

While sample_input is convenient, you may not always want to randomly sample unspecified parameters. You can always directly construct job parameters as a list of dictionaries without using sample_input.

You can decide how unspecified parameters are handled through the random_sample argument when submitting the batch: False fills unspecified parameters with their default values, while True fills them with values sampled uniformly at random from each parameter’s allowed range.

Note

Depending on the generator, sample_input may account for more complex joint constraints that exist between certain parameters; specifying random_sample=True with submit cannot do this.

job_params = [
    {"num_reps": 1,
     "lighting_power": random.uniform(light_min, light_max)}
    for _ in range(num_jobs)
]

batch = sesh.submit(
    job_params=job_params,
    is_preview=True,
    random_sample=False,  # Unspecified params are set to their default values
    batch_name="example with default values",
)

Recall that the default value associated with each parameter is included in parameter_info:

pprint(sesh.parameter_info)
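
For comparison, here is a sketch of the same submission with random_sample=True, where unspecified parameters are instead sampled uniformly at random from their allowed ranges:

batch = sesh.submit(
    job_params=job_params,
    is_preview=True,
    random_sample=True,  # Unspecified params are sampled uniformly from their ranges
    batch_name="example with randomly sampled values",
)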

Estimate samples#

We can estimate the number of samples (frames) a given batch would generate before submitting the batch for execution. In general, exact sample counts cannot be predicted for generators that introduce randomness at runtime; however, the accuracy of the estimate increases with the size of the batch.

samples_by_job = sesh.estimate_samples(job_params, is_preview=False)
total_frames = sum(samples_by_job)

Note that if is_preview=True in the call to estimate_samples, the number of frames will exactly equal the number of jobs in the batch, since previews are single-frame by definition.
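
As a quick sketch of that relationship:

preview_samples_by_job = sesh.estimate_samples(job_params, is_preview=True)
assert sum(preview_samples_by_job) == len(job_params)  # One frame per preview job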

Submit a batch#

We can submit a list of job parameters to the API. The submit method will first validate the job parameters and raise an exception if any errors or invalid values are encountered. If the submission is successful, a Batch object is returned. It is possible to submit single-frame previews with is_preview=True or videos with is_preview=False.

batch = sesh.submit(job_params, is_preview=True, batch_name="example preview batch")

Note that if a batch submission is estimated to exceed the number of samples remaining in the user account, the submission will fail with a descriptive error message. In this case, you can (1) purchase more samples via the API dashboard, (2) reduce the number of jobs in the batch, or (3) reduce the number of samples per job by changing the job parameters. When each job completes, the sample estimate for the job is credited back to the sample balance, and the actual number of samples rendered is deducted from the balance.
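
A minimal sketch of guarding against this failure at submission time (the specific exception class is not documented here, so a generic Exception is caught purely for illustration):

try:
    batch = sesh.submit(job_params, is_preview=False, batch_name="large video batch")
except Exception as err:
    # The error message describes why the submission was rejected, e.g. the
    # estimated sample count exceeding the remaining balance.
    print(f"Submission failed: {err}")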

Using a Batch#

The Batch object returned from submission has many utility methods associated with it.

Get static information about the batch#

View the list of UUIDs for each job in the batch:

print(batch.job_ids)

View the list of job parameters for each job in the batch:

print(batch.job_params)

Get the number of jobs in the batch:

print(batch.num_jobs)

Get dynamic information about the batch#

See how many jobs are still processing, if any:

print(batch.get_num_jobs_remaining())

Get detailed batch metadata from the API:

print(batch.get_batch_data())

Get metadata only for a specific job in the batch:

target_job = batch.job_ids[0]
print(batch.get_job_summary_data(job_id=target_job))

Check for job completion: non-blocking and blocking#

At any time, we can check whether a batch has completed processing in the cloud with a non-blocking call to get_num_jobs_remaining. You can also await completion of all jobs; this blocks the notebook cell (or, more generally, the current Python process) until all jobs have completed.

Query for batch completion in a non-blocking way:

print(f"Finished? {'Y' if batch.get_num_jobs_remaining() == 0 else 'N'}")

Block the current Python process while awaiting completion of the batch, with a 60 minute timeout:

completed_jobs = batch.await_completion(timeout=60 * 60)

Get completed job info and download results#

Get a list of the currently completed (without error) jobs:

completed_jobs = batch.get_valid_completed_jobs()

Download currently completed jobs to the specified path:

batch.download(path="../tmp")

Advanced Infinity Tools#

On top of the Session and Batch concepts, Infinity Tools provides convenience methods and functions that enable richer synthetic data generation workflows. Some of the most noteworthy are:

  • Tools to analyze and visualize the distribution of job parameters in synthetic data batches (before and after submission to the API).

  • Tools to download and unpack synthetic data from the Infinity REST API.

  • Tools to visualize and analyze downloaded data.

Tools to help with Submission#

Visualize job parameters for a batch#

Carefully controlling the statistical distribution of job parameters in a batch is a common need. Before submitting a batch, you can visualize a list of job parameters to ensure they have the properties you need.

from infinity_tools.visionfit.vis import visualize_job_params
job_params = [sesh.sample_input() for _ in range(100)]
visualize_job_params(job_params)

Use pandas dataframes#

DataFrame libraries such as pandas can easily be used to analyze, review, and update job parameters.

import pandas as pd
df = pd.DataFrame.from_records(job_params)
df.head()
# ... modify dataframe ...
final_job_params = df.to_dict("records")
batch = sesh.submit(
    job_params=final_job_params,
    is_preview=False,
    batch_name="modified batch using pandas df"
)

Manually validate the job parameters#

The submission methods check the validity of job parameters, but you can also check manually at any time.

errors_by_job = sesh.validate_job_params(final_job_params)
assert all(e is None for e in errors_by_job)
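
If the assertion fails, a sketch like the following can surface which jobs are invalid and why (assuming the returned list contains one entry per job, with None indicating a valid job):

invalid_jobs = [(i, err) for i, err in enumerate(errors_by_job) if err is not None]
for index, err in invalid_jobs:
    print(f"Job {index} failed validation: {err}")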

Re-run a batch with simple modifications#

A common task is to re-run a previous batch, or a subset of its jobs, with targeted parameter changes. For example, we can re-run a batch of previews that we like as full high-resolution videos:

seed_job_params = batch.get_job_params_seeded_for_rerun()
for jp in seed_job_params:
    jp["image_height"] = 512
    jp["image_width"] = 512
high_res_batch = sesh.submit(
    job_params=seed_job_params,
    is_preview=False,
    batch_name="high res rerun"
)

Generate parameter sweeps based on a previous seed#

Another common task is to sweep one or more previous jobs (image or video) across a range of parameters, using the state of the previous job as a seed. You can construct this manually or use some helper functions provided by Infinity Tools. For example:

import numpy as np

# Re-run the first five seeds from a previous batch of previews, each
# expanded across a range of camera heights.
seed_job_params = batch.get_job_params_seeded_for_rerun()[0:5]
seed_job_ids = [p["state"] for p in seed_job_params]
sweep_params = [{"camera_height": v} for v in np.linspace(1.0, 2.5, 10)]

# This will have 5 * 10 = 50 jobs in the expanded batch.
new_batch_params = sesh.expand_overrides_across_each_preview_state(
    seed_job_ids,
    sweep_params,
)
sweep_batch = sesh.submit(
    job_params=new_batch_params,
    is_preview=False,
    batch_name="camera height sweep"
)

Tools for Analysis of Results#

Visualize batch parameters in tabular format#

# Review submitted batch job parameters with a dataframe.
df_jp = pd.DataFrame.from_records(batch.job_params)
df_jp.head()

# Download results from a batch and summarize them as a dataframe.
download_path = "tmp/results"
batch.download(download_path)

from infinity_tools.visionfit.vis import summarize_batch_results_as_dataframe
df_results = summarize_batch_results_as_dataframe(download_path)

Visualize batch parameters in graphical format#

Similar to visualize_job_params, visualize_batch_results visualizes the parameter distributions alongside new metadata available from the completed results.

from infinity_tools.visionfit.vis import visualize_batch_results
visualize_batch_results(download_path)

Visualize the synthetic data (videos and previews)#

Infinity Tools also provides utilities to visualize resulting images and videos directly. Visualizations can include keypoint overlays and segmentation information for rich visual analysis of your synthetic data.

# For a batch of previews.
import random

from infinity_tools.common.vis.images import get_subdirectories, view_previews

preview_job_paths = get_subdirectories(download_path)
view_previews(random.choices(preview_job_paths, k=20))

# For a batch of videos, where target_job_path points to a single downloaded job directory.
from infinity_tools.visionfit.vis import visualize_all_labels

video_path = visualize_all_labels(target_job_path)

Example Notebooks#

You can find example notebooks in infinity-workflows that showcase Infinity Tools functionality. These can be used as a convenient jumping-off point for more complex and custom usage of the Infinity API.

VisionFit#

Submit and Download Batch Demo#

The submit_and_download_batch_demo.ipynb notebook walks through the basics of building, submitting, downloading, and analyzing a large batch of synthetic data.

Parameter Sweep Demo#

The parameter_sweep_demo.ipynb notebook walks through a common use case: sweeping one or more parameters across a range of values, starting from a previously submitted job or jobs.