At the core of the Infinity system is a REST API used to generate synthetic data.
Using the REST API, we can start a batch of jobs, each with specific parameters. Each job generates synthetic data using a specific generator: a program that generates data. For a computer vision generator, for example, the output of a job is an archive (zip file) containing a video, label metadata, and images with the scene objects segmented for each video frame. Outputs vary slightly from generator to generator. To view the status of batches and their jobs as they execute in the cloud, users can log in to our API User Portal or use our convenient Python tools.
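As a quick illustration of working with a downloaded job archive, the sketch below lists the members of a zip file using Python's standard `zipfile` module. The member names (`video.mp4`, `labels.json`, the `segmentation/` folder) are hypothetical placeholders; the actual contents depend on the generator.

```python
import io
import zipfile

def list_job_outputs(archive_bytes: bytes) -> list[str]:
    """Return the file names inside a job's output archive (zip file)."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.namelist()

# Build a tiny stand-in archive; real member names depend on the generator.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("video.mp4", b"")                     # rendered video
    zf.writestr("labels.json", b"{}")                 # label metadata
    zf.writestr("segmentation/frame_0000.png", b"")   # per-frame segmentation

print(list_job_outputs(buf.getvalue()))
```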
When jobs are created via the API, they are dispatched to cloud workers, and the resulting data is made available for download by the end user. Jobs vary in computational complexity depending on the generator and input parameters, so time to completion varies as well: some jobs take seconds, while a high-resolution video render may require several hours.
No matter the task at hand, the steps for producing synthetic data using the Infinity system are always the same:
1. Choose a generator (a program that generates data).
2. Inspect the parameters available for that generator.
3. Construct a batch, consisting of multiple jobs (sets of parameters) for that generator.
4. Submit the batch to the API for execution.
5. Poll your batch for completion, and download your synthetic data when ready.
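The steps above can be sketched end to end. The client class below is only an in-memory stand-in that simulates batch submission, polling, and download; its class, method, and parameter names are illustrative assumptions, not the real infinity-core or REST interface.

```python
import time

# Hypothetical stand-in for the Infinity REST API. All names here are
# illustrative, not the actual infinity-core interface.
class FakeInfinityClient:
    def __init__(self):
        self._batches = {}

    def submit_batch(self, generator: str, jobs: list[dict]) -> str:
        """Submit a batch of jobs (parameter sets) for one generator."""
        batch_id = f"batch-{len(self._batches)}"
        # Simulate work: the batch completes after a few status polls.
        self._batches[batch_id] = {"remaining": 2, "jobs": jobs}
        return batch_id

    def batch_status(self, batch_id: str) -> str:
        batch = self._batches[batch_id]
        if batch["remaining"] > 0:
            batch["remaining"] -= 1
            return "running"
        return "complete"

    def download_results(self, batch_id: str) -> list[str]:
        """Pretend to download one output archive per job."""
        jobs = self._batches[batch_id]["jobs"]
        return [f"{batch_id}-job{i}.zip" for i in range(len(jobs))]

client = FakeInfinityClient()

# Steps 1-3: choose a generator and construct a batch of jobs (parameter sets).
jobs = [
    {"resolution": "1080p", "num_frames": 30},
    {"resolution": "720p", "num_frames": 60},
]

# Step 4: submit the batch for execution.
batch_id = client.submit_batch(generator="example_cv_generator", jobs=jobs)

# Step 5: poll until complete, then download.
while client.batch_status(batch_id) != "complete":
    time.sleep(0.01)  # real jobs may take seconds to hours; poll accordingly

archives = client.download_results(batch_id)
print(archives)
```

In a real workflow the polling interval should reflect expected job duration, since (as noted above) completion times range from seconds to hours.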
We provide two helpful tools for generating data and interacting with the API:
- `infinity-workflows` - a repository of Jupyter notebooks and modules that define simple workflows for data generation. This is a great place to get started.
- `infinity-core` - a Python package that provides a convenient wrapper around the API. It provides the building blocks for custom workflows and is used extensively by `infinity-workflows` to manage batches.
We introduce these tools in detail in subsequent sections.