Getting Started
This guide covers Packflow installation and basic usage of the CLI to create a simple Packflow project.
Installing Packflow
Prerequisites
Python (version 3.10+)
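The interpreter version can be checked before installing. The following is a minimal sketch using only the standard library; the helper name meets_minimum is hypothetical, and the (3, 10) threshold is taken from the prerequisite above.

```python
import sys
from typing import Sequence, Tuple

# Minimum version taken from the prerequisite listed above.
MINIMUM_PYTHON: Tuple[int, int] = (3, 10)


def meets_minimum(version: Sequence[int], minimum: Tuple[int, int] = MINIMUM_PYTHON) -> bool:
    """Return True if the (major, minor) part of version is at least minimum."""
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = sys.version_info
    status = "OK" if meets_minimum(current) else "too old for Packflow"
    print(f"Python {current.major}.{current.minor}: {status}")
```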
Install from PyPI
Packflow can be installed directly from PyPI:
pip install packflow
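To confirm that the installation succeeded, the installed version can be read from the package metadata. This is a minimal sketch using only the standard library; installed_version is a hypothetical helper, not part of Packflow, and it assumes the distribution is registered under the name packflow, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("packflow")
    print(f"packflow version: {found}" if found else "packflow is not installed")
```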
Install from Source
For development or to install from source:
# Clone repository and navigate to the root directory
git clone https://github.com/dow-cdao/packflow.git
cd packflow
# Install package
pip install .
# For contributors: install in editable mode
pip install -e .
Viewing Pre-built Documentation
The simplest way to view the Packflow documentation is to serve the pre-built HTML files included in the repository:
# Navigate to the pre-built docs folder
cd docs/built/html
# Start a local web server
python -m http.server 8000
# Access the documentation in a web browser at http://127.0.0.1:8000/
Important
If a “Not Found” page appears on first access, wait a moment for the server to finish starting, then refresh the page.
Building Documentation from Source
The following are required to build documentation from source:
Python (version 3.10+)
Pip
Packflow (the version corresponding to the docs being served)
Pandoc - must be installed separately through the system package manager (see the Pandoc installation instructions)
make command (xcode-select on macOS and WSL on Windows, or build-essential on Linux)
Steps to build and serve documentation:
# Navigate to docs folder
cd docs
# Install Python dependencies
pip install -r requirements.txt
# Serve documentation with live updates (development)
make dev
# OR serve static multi-version documentation (production)
make prod-serve
# Access the documentation in a web browser at http://127.0.0.1:8000/
Creating a Packflow Project
This section covers the initial setup process for creating a Packflow project, defining an Inference Backend, and running Packflow’s validation checks on the input/output requirements of the Inference Backend.
Step 1: Create the project structure
Initialize a new project by running packflow create hello-world. This creates a directory named hello-world with the following structure:
hello-world/
├── packflow.yaml
├── LICENSE.txt
├── MODEL_CARD.md
├── README.md
├── requirements.txt
├── inference.py
├── validate.py
└── test_inference.py
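One way to confirm that the scaffold was generated completely is to compare the directory against the file list above. This is only a sketch: missing_scaffold_files is a hypothetical helper, not a Packflow command, and the expected names are copied from the listing above.

```python
from pathlib import Path
from typing import List

# File names copied from the directory listing above.
EXPECTED_FILES = [
    "packflow.yaml",
    "LICENSE.txt",
    "MODEL_CARD.md",
    "README.md",
    "requirements.txt",
    "inference.py",
    "validate.py",
    "test_inference.py",
]


def missing_scaffold_files(project_dir: str) -> List[str]:
    """Return the expected files that are not present in project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_scaffold_files("hello-world")
    print("Scaffold complete" if not missing else f"Missing files: {missing}")
```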
Step 2: Write the Inference Backend
Open inference.py in a code or text editor of your choice. The file already contains templated code. Populate the execute() method with logic that doubles the value under the key "number" and returns the result under the key "doubled":
from typing import Any, List

from packflow import InferenceBackend


class Backend(InferenceBackend):
    def transform_inputs(self, inputs: List[dict]) -> Any:
        """
        Preprocessing steps or other transformations before running inference.
        ...
        """
        return inputs

    def execute(self, inputs: Any) -> Any:
        """The main execution of inference or analysis for the developed application.

        This method should remain targeted to passing data through the model/execution
        code for profiling purposes. Minimal pre- or post-processing should occur at this
        step unless completely necessary.

        Parameters
        ----------
        inputs: List[Dict]
            The output of the transform_inputs method. If the transform_inputs method is
            not overridden, the data is formatted as records (list of dictionaries)

        Returns
        -------
        Any
            Model Outputs

        Notes
        -----
        The transform_outputs() method should handle all postprocessing including calculating
        metrics, converting outputs back to Python types, and other postprocessing steps. Try
        to keep this method focused purely on inference/analysis.
        """
        outputs = []
        for row in inputs:
            outputs.append({"doubled": row["number"] * 2})

        return outputs

    def transform_outputs(self, outputs):
        """
        Postprocessing steps or other transformation steps to be executed prior to
        returning outputs.
        ...
        """
        return outputs
The Inference Backend is now ready to be loaded, validated, and shared.
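The doubling logic can also be checked in isolation before it is loaded through Packflow. The sketch below copies the loop from execute() into a plain function (double_numbers is a hypothetical name used only for illustration) so that it runs without Packflow installed. It does not exercise transform_inputs(), transform_outputs(), or any Packflow validation.

```python
from typing import Dict, List


def double_numbers(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Mirror the body of Backend.execute(): double the value under "number"."""
    outputs = []
    for row in rows:
        outputs.append({"doubled": row["number"] * 2})
    return outputs


if __name__ == "__main__":
    sample_batch = [{"number": 5}, {"number": 10}]
    print(double_numbers(sample_batch))  # [{'doubled': 10}, {'doubled': 20}]
```

A row without a "number" key raises a KeyError in this function, which is the kind of input mismatch worth catching before the backend is shared.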
Step 3: Local Validation
Now that the Inference Backend is written, use the built-in validation to ensure it will run as expected in production.
This can be done programmatically. Open the validate.py script and modify it to match the Inference Backend’s inputs:
# Import Packflow's dev tools to run validations on the Inference Backend
from packflow.loaders import LocalLoader

# Load the backend in the current directory
# The path 'inference:Backend' can be interpreted as
# `from inference import Backend`
backend = LocalLoader("inference:Backend").load()

# Define sample inputs that represent realistic data for the backend.
# These should exercise the expected input format(s) the backend will receive.
SAMPLE_SINGLE_ROW = {"number": 5}

SAMPLE_BATCH = [
    {"number": 5},
    {"number": 10},
    # Add more sample rows as needed
]

if __name__ == "__main__":
    print("Running validation...")

    print(f"Sample single row: {SAMPLE_SINGLE_ROW}\n")
    print(f"Sample batch: {SAMPLE_BATCH}\n")

    # backend.validate() checks that the outputs meet Packflow's format
    # requirements. Returns outputs if valid.
    outputs_single_row = backend.validate(SAMPLE_SINGLE_ROW)
    outputs_batch = backend.validate(SAMPLE_BATCH)

    print(f"Outputs single row: {outputs_single_row}")
    print(f"Outputs batch: {outputs_batch}")
    print("\nValidation passed!")
Note
Validation can be run via the validate.py file or directly from a notebook. However, the path must be updated if the code is not run from the same directory as inference.py.
Passing "inference:Backend" to LocalLoader is roughly equivalent to from inference import Backend. If the script is nested further, the module path can be written in dot notation, such as src.mypackage.inference:Backend.
If any validation fails, an exception is raised with a message describing the issue and what needs to be fixed.
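The module:attribute notation passed to LocalLoader follows a convention common in Python tooling. To illustrate how such a string corresponds to an import statement, the sketch below resolves one with the standard library. resolve_object_path is a hypothetical helper; it is not Packflow's implementation, and LocalLoader may do more than this (for example, adjusting the import path).

```python
import importlib
from typing import Any


def resolve_object_path(path: str) -> Any:
    """Resolve a "package.module:attribute" string to the named object."""
    module_name, separator, attribute = path.partition(":")
    if not separator or not module_name or not attribute:
        raise ValueError(f"Expected 'module:attribute', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


if __name__ == "__main__":
    # Standard-library targets are used so the example runs anywhere.
    join = resolve_object_path("os.path:join")  # like: from os.path import join
    dumps = resolve_object_path("json:dumps")   # like: from json import dumps
    print(join("src", "mypackage"), dumps({"number": 5}))
```

Under this reading, src.mypackage.inference:Backend names the Backend attribute of the module src.mypackage.inference, so the module must be importable from the directory in which the code runs.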
Next Steps
Please see the Creating a Custom Backend section of this site for more detailed information on building custom Inference Backends with Packflow.