Getting Started

This guide covers Packflow installation and basic usage of the CLI to create a simple Packflow project.

Installing Packflow

Prerequisites

  • Python (version 3.10+)

Install from PyPI

Packflow can be installed directly from PyPI:

pip install packflow
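
To confirm the install succeeded, one option is to query the installed distribution metadata from Python. The sketch below uses only the standard library. The helper names installed_version and python_is_supported are illustrative and not part of Packflow, and the distribution name is assumed to be packflow, matching the pip command above.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(version_info=sys.version_info):
    """Check an interpreter version against the 3.10+ prerequisite."""
    return tuple(version_info[:2]) >= (3, 10)


if __name__ == "__main__":
    print("Python 3.10+:", python_is_supported())
    # Assumption: the distribution is named "packflow", as in `pip install packflow`.
    print("packflow version:", installed_version("packflow"))
```

A result of None for the version means pip did not install the package into the interpreter running the script, which commonly indicates that a different virtual environment is active.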

Install from Source

For development or to install from source:

# Clone repository and navigate to the root directory
git clone https://github.com/dow-cdao/packflow.git
cd packflow

# Install package
pip install .

# For contributors: install in editable mode
pip install -e .

Viewing Pre-built Documentation

The simplest way to view the Packflow documentation is to serve the pre-built HTML files included in the repository:

# Navigate to the pre-built docs folder
cd docs/built/html

# Start a local web server
python -m http.server 8000

# Access the documentation in a web browser at http://127.0.0.1:8000/

Important

If the browser shows a “Not Found” page when first accessing the documentation, wait a moment for the server to finish starting, then refresh the page.
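
The wait-and-refresh advice can also be scripted. The following is a minimal sketch, using only the standard library, that polls a URL until it answers with HTTP 200. The function name wait_for_server and its timeout and interval parameters are hypothetical helpers, not part of Packflow.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout=10.0, interval=0.5):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # The server is not accepting connections yet, or it returned
            # an error page such as 404 (HTTPError is a URLError subclass).
            pass
        time.sleep(interval)
    return False


# Example call for the server started above:
#   wait_for_server("http://127.0.0.1:8000/")
```

The function returns False rather than raising when the deadline passes, so a calling script can decide whether to retry or report the failure.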

Building Documentation from Source

The following are required to build documentation from source:

  • Python (version 3.10+)

  • Pip

  • Packflow (the version corresponding to the docs being served)

  • Pandoc, which must be installed separately through the system package manager (see the Pandoc installation instructions)

  • The make command (provided by xcode-select on macOS, WSL on Windows, or build-essential on Linux)
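
Before building, it can help to confirm that the external tools in the list above are on the PATH. This is a small sketch using the standard library's shutil.which. The helper name missing_tools is illustrative, and the executable names pip, pandoc, and make are assumed to match the tools listed.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumption: the executables are named exactly "pip", "pandoc", and "make".
    missing = missing_tools(["pip", "pandoc", "make"])
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All documentation build tools were found.")
```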

Steps to build and serve documentation:

# Navigate to docs folder
cd docs

# Install Python dependencies
pip install -r requirements.txt

# Serve documentation with live updates (development)
make dev

# OR serve static multi-version documentation (production)
make prod-serve

# Access the documentation in a web browser at http://127.0.0.1:8000/

Creating a Packflow Project

This section covers the initial setup process for creating a Packflow project, defining an Inference Backend, and running Packflow’s validation checks on the input/output requirements of the Inference Backend.

Step 1: Create the project structure

Initialize a new project by running packflow create hello-world. This creates a new directory named hello-world with the following structure:

hello-world/
├── packflow.yaml
├── LICENSE.txt
├── MODEL_CARD.md
├── README.md
├── requirements.txt
├── inference.py
├── validate.py
└── test_inference.py
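
As a quick sanity check after scaffolding, the generated files can be compared against the listing above. This sketch is not a Packflow feature; the helper name missing_files is hypothetical, and the file names are copied from the directory structure shown.

```python
from pathlib import Path

# File names copied from the directory listing above.
EXPECTED_FILES = [
    "packflow.yaml",
    "LICENSE.txt",
    "MODEL_CARD.md",
    "README.md",
    "requirements.txt",
    "inference.py",
    "validate.py",
    "test_inference.py",
]


def missing_files(project_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from `project_dir`."""
    root = Path(project_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    print("Missing files:", missing_files("hello-world"))
```

An empty list means every file from the listing is present.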

Step 2: Write the Inference Backend

Open inference.py in a code or text editor of your choice. The file contains templated code. Populate the execute() method with logic that doubles the value under the key "number" in each row and returns the doubled number:

inference.py
from typing import Any, List

from packflow import InferenceBackend


class Backend(InferenceBackend):
    def transform_inputs(self, inputs: List[dict]) -> Any:
        """
        Preprocessing steps or other transformations before running inference.
        ...
        """
        return inputs

    def execute(self, inputs: Any) -> Any:
        """The main execution of inference or analysis for the developed application.

        This method should remain targeted to passing data through the model/execution
        code for profiling purposes. Minimal pre- or post-processing should occur at this
        step unless completely necessary.

        Parameters
        ----------
        inputs: List[Dict]
            The output of the transform_inputs method. If the transform_inputs method is
            not overridden, the data is formatted as records (list of dictionaries)

        Returns
        -------
        Any
            Model Outputs

        Notes
        -----
        The transform_outputs() method should handle all postprocessing including calculating
        metrics, converting outputs back to Python types, and other postprocessing steps. Try
        to keep this method focused purely on inference/analysis.
        """
        outputs = []
        for row in inputs:
            outputs.append({"doubled": row["number"] * 2})

        return outputs

    def transform_outputs(self, outputs):
        """
        Postprocessing steps or other transformation steps to be executed prior to
        returning outputs.
        ...
        """
        return outputs

The Inference Backend is now ready to be loaded, validated, and shared.
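
To see the expected input and output shapes without loading Packflow, the doubling logic can be exercised on its own. The function double_numbers below is a standalone stand-in that mirrors the loop in execute(); it is an illustration, not part of the generated template.

```python
from typing import Any, Dict, List


def double_numbers(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Mirror the loop in execute(): double the value under 'number' in each row."""
    return [{"doubled": row["number"] * 2} for row in rows]


if __name__ == "__main__":
    print(double_numbers([{"number": 5}, {"number": 10}]))
    # [{'doubled': 10}, {'doubled': 20}]
```

Each input record yields exactly one output record, so the output list has the same length and order as the input list.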

Step 3: Local Validation

Now that the Inference Backend is written, use the built-in validation to ensure it will run as expected in production.

This can be done programmatically. Open the validate.py script and modify it to match the Inference Backend’s inputs:

validate.py
# Import Packflow's dev tools to run validations on the Inference Backend
from packflow.loaders import LocalLoader

# Load the backend in the current directory
# The path 'inference:Backend' can be interpreted as
#   `from inference import Backend`
backend = LocalLoader("inference:Backend").load()

# Define sample inputs that represent realistic data for the backend.
# These should exercise the expected input format(s) the backend will receive.
SAMPLE_SINGLE_ROW = {"number": 5}

SAMPLE_BATCH = [
    {"number": 5},
    {"number": 10},
    # Add more sample rows as needed
]

if __name__ == "__main__":
    print("Running validation...")

    print(f"Sample single row: {SAMPLE_SINGLE_ROW}\n")
    print(f"Sample batch: {SAMPLE_BATCH}\n")

    # backend.validate() checks that the outputs meet Packflow's format
    # requirements. Returns outputs if valid.
    outputs_single_row = backend.validate(SAMPLE_SINGLE_ROW)
    outputs_batch = backend.validate(SAMPLE_BATCH)

    print(f"Outputs single row: {outputs_single_row}")
    print(f"Outputs batch: {outputs_batch}")
    print("\nValidation passed!")

Note

Validation can be run via the validate.py file or directly from a notebook. However, the path passed to the loader must be updated if the code is not running in the same directory as inference.py.

Passing "inference:Backend" to LocalLoader is roughly equivalent to from inference import Backend. If the module is nested further, the path can be written in dot notation, such as src.mypackage.inference:Backend.
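
The module:attribute notation can be illustrated with the standard library. The sketch below shows one way such a string could be resolved with importlib. It is not Packflow's actual LocalLoader implementation, and the function name resolve is hypothetical.

```python
import importlib


def resolve(path):
    """Resolve a 'package.module:Attribute' string to the object it names."""
    module_name, sep, attr_name = path.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {path!r}")
    # Import the dotted module path, then look up the attribute on it.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


if __name__ == "__main__":
    # Standard-library stand-ins for a path like "inference:Backend".
    print(resolve("collections:OrderedDict"))
    print(resolve("os.path:join"))
```

The part before the colon is an importable module path, so it must be reachable from the working directory or the Python path, which is why the path needs updating when the script runs from elsewhere.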

If any validation fails, the resulting exception message describes the issue and what needs to be fixed.

Next Steps

Please see the Creating a Custom Backend section of this site for more detailed information on building custom Inference Backends with Packflow.