Inference Backends 101

This notebook is a bare-bones introduction to Inference Backend development. It does not perform any meaningful data processing; instead, it walks through a few basic operations, including:

  1. Writing an Inference Backend to show execution flow

  2. Exploring the available preprocessors

  3. Quick validation to ensure the Backend is operational

Write the Inference Backend

In this guide, we will not perform any meaningful data transformations or run models. Instead, we will explore the flow of data through an inference backend and how built-in preprocessors can facilitate your development process.

First, we can create a simple inference backend that prints the inputs it receives and returns them unchanged:

[1]:
from packflow import InferenceBackend


class Backend(InferenceBackend):
    def execute(self, inputs):
        print("Executing against data:", inputs)
        return inputs


backend = Backend()

backend
2026-01-21 14:00:33.055 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {}
2026-01-21 14:00:33.055 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=True, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
2026-01-21 14:00:33.056 | INFO     | packflow.backend.preprocessors:resolve:127 - Current config does not require preprocessing steps. Defaulting to Passthrough mode.
2026-01-21 14:00:33.056 | INFO     | packflow.backend.base:_initialize:103 - Initialized Backend in 0.0000 ms
[1]:
Backend[
  BackendConfig(verbose=True, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
]

As seen above, Packflow provides production-ready logs during initialization, including the configuration parsed from keyword arguments and JSON configuration files, and runs validation on that configuration (more on this later).

Now we can generate some sample data and pass it through the loaded backend:

[2]:
samples = [dict(number=i) for i in range(5)]

backend(samples)
2026-01-21 14:00:44.690 | INFO     | packflow.backend.base:__call__:86 - ExecutionMetrics(batch_size=5, execution_times=ExecutionTimes(preprocess=0.00558, transform_inputs=None, execute=0.05417, transform_outputs=None), total_execution_time=0.059750000000000004)
Executing against data: [{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]
[2]:
[{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]

With each call, Packflow will collect and log execution metrics for downstream analysis, which can be seen above. If you would prefer to not print these logs, you can initialize the backend with verbose=False.

We can also validate that the backend meets Packflow’s API requirements by calling .validate():

[3]:
backend = Backend(verbose=False)

backend.validate(samples)
2026-01-21 14:00:48.021 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'verbose': False}
2026-01-21 14:00:48.022 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
2026-01-21 14:00:48.022 | INFO     | packflow.backend.preprocessors:resolve:127 - Current config does not require preprocessing steps. Defaulting to Passthrough mode.
Executing against data: [{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]
[3]:
[{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]
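
The exact checks performed by .validate() are not listed in this notebook. As a rough illustration only, the sketch below shows the kind of check such a step could run, assuming the requirement is a list of JSON-serializable dicts with one output per input. The helper name check_outputs is hypothetical and is not part of Packflow.

```python
import json


def check_outputs(inputs, outputs):
    """Hypothetical output check; NOT Packflow's actual validation logic."""
    if not isinstance(outputs, list):
        raise TypeError("outputs must be a list")
    if len(outputs) != len(inputs):
        # Assumption: one output record is expected per input record.
        raise ValueError("expected one output per input")
    for row in outputs:
        if not isinstance(row, dict):
            raise TypeError("each output must be a dict")
        # json.dumps raises TypeError if a value is not JSON-serializable.
        json.dumps(row)
    return True


samples = [{"number": i} for i in range(5)]
print(check_outputs(samples, samples))
```

A check like this would reject, for example, a raw Numpy array or a set returned from execute(), which is why the later sections convert outputs before returning them.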

Preprocessors

Packflow has built-in preprocessors that assist with records parsing and transformation. The following sections will explore the main preprocessors that are available and some practical uses of the built-in functionality.

Passthrough

Starting with the most straightforward, the passthrough preprocessor does exactly what its name suggests: all configurations are ignored and the raw data is passed straight to the transform_inputs() or execute() function.

[4]:
backend = Backend(input_format="passthrough", verbose=False)

backend(samples)
2026-01-21 14:00:53.661 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'input_format': 'passthrough', 'verbose': False}
2026-01-21 14:00:53.661 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.PASSTHROUGH: 'passthrough'>, rename_fields={}, feature_names=[], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
Executing against data: [{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]
[4]:
[{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]

This is mostly useful when your use-case requires access to the raw data, or when it needs more advanced preprocessing than the built-in options provide.

Records [Default]

The records preprocessor is the default preprocessor for all Inference Backends. With the default configuration values, it acts as a passthrough. However, once any of those defaults is changed (for example feature_names, rename_fields, or flatten_nested_inputs), the corresponding preprocessing steps are applied automatically.

[5]:
samples = [dict(number=i, other_field=i + 1) for i in range(5)]

backend = Backend(input_format="records", verbose=False)

backend(samples)
2026-01-21 14:00:56.080 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'input_format': 'records', 'verbose': False}
2026-01-21 14:00:56.081 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
2026-01-21 14:00:56.081 | INFO     | packflow.backend.preprocessors:resolve:127 - Current config does not require preprocessing steps. Defaulting to Passthrough mode.
Executing against data: [{'number': 0, 'other_field': 1}, {'number': 1, 'other_field': 2}, {'number': 2, 'other_field': 3}, {'number': 3, 'other_field': 4}, {'number': 4, 'other_field': 5}]
[5]:
[{'number': 0, 'other_field': 1},
 {'number': 1, 'other_field': 2},
 {'number': 2, 'other_field': 3},
 {'number': 3, 'other_field': 4},
 {'number': 4, 'other_field': 5}]

Filtering Features

[6]:
# Filter
backend = Backend(input_format="records", feature_names=["number"], verbose=False)

backend(samples)
2026-01-21 14:01:15.616 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'input_format': 'records', 'feature_names': ['number'], 'verbose': False}
2026-01-21 14:01:15.616 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=['number'], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
Executing against data: [{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]
[6]:
[{'number': 0}, {'number': 1}, {'number': 2}, {'number': 3}, {'number': 4}]

Reorder Fields/Columns

[7]:
# Change Order
backend = Backend(
    input_format="records", feature_names=["other_field", "number"], verbose=False
)

backend(samples)
2025-07-08 17:41:15.269 | DEBUG    | packflow.backend.configuration:load_backend_configuration:59 - Loaded raw configuration: {'input_format': 'records', 'feature_names': ['other_field', 'number'], 'verbose': False}
2025-07-08 17:41:15.270 | INFO     | packflow.backend.configuration:load_backend_configuration:63 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=['other_field', 'number'], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter=':')
Executing against data: [{'other_field': 1, 'number': 0}, {'other_field': 2, 'number': 1}, {'other_field': 3, 'number': 2}, {'other_field': 4, 'number': 3}, {'other_field': 5, 'number': 4}]
[7]:
[{'other_field': 1, 'number': 0},
 {'other_field': 2, 'number': 1},
 {'other_field': 3, 'number': 2},
 {'other_field': 4, 'number': 3},
 {'other_field': 5, 'number': 4}]
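
Judging by the outputs of the filtering and reordering cells above, feature_names both selects which fields are kept and fixes their order. The following is a minimal plain-Python sketch of that behavior, not Packflow's implementation; the helper name select_fields is made up for illustration.

```python
def select_fields(records, feature_names):
    """Keep only the named fields, in the order given (illustrative helper)."""
    return [{name: row[name] for name in feature_names} for row in records]


samples = [dict(number=i, other_field=i + 1) for i in range(5)]

# A single name filters the record down to that field.
filtered = select_fields(samples, ["number"])

# Listing every field in a new order reorders the record.
reordered = select_fields(samples, ["other_field", "number"])

print(filtered[1])   # {'number': 1}
print(reordered[1])  # {'other_field': 2, 'number': 1}
```

A fixed field order matters most when records are later converted to positional structures such as arrays, where each column must always hold the same feature.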

Renaming Input fields

[7]:
# Renaming fields and Changing Order
backend = Backend(
    input_format="records",
    rename_fields={"other_field": "feature_0", "number": "feature_1"},
    feature_names=["feature_0", "feature_1"],
    verbose=False,
)

backend(samples)
2026-01-21 14:01:54.276 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'input_format': 'records', 'rename_fields': {'other_field': 'feature_0', 'number': 'feature_1'}, 'feature_names': ['feature_0', 'feature_1'], 'verbose': False}
2026-01-21 14:01:54.277 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={'other_field': 'feature_0', 'number': 'feature_1'}, feature_names=['feature_0', 'feature_1'], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
Executing against data: [{'feature_0': 1, 'feature_1': 0}, {'feature_0': 2, 'feature_1': 1}, {'feature_0': 3, 'feature_1': 2}, {'feature_0': 4, 'feature_1': 3}, {'feature_0': 5, 'feature_1': 4}]
[7]:
[{'feature_0': 1, 'feature_1': 0},
 {'feature_0': 2, 'feature_1': 1},
 {'feature_0': 3, 'feature_1': 2},
 {'feature_0': 4, 'feature_1': 3},
 {'feature_0': 5, 'feature_1': 4}]
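
In the cell above, feature_names refers to the new names (feature_0, feature_1), which suggests that renaming happens before field selection. The sketch below reproduces that ordering in plain Python under that assumption; rename_then_select is a hypothetical helper, not a Packflow function.

```python
def rename_then_select(records, rename_fields, feature_names):
    """Rename keys first, then select and order fields (illustrative helper)."""
    output = []
    for row in records:
        # Keys missing from the mapping keep their original name.
        renamed = {rename_fields.get(key, key): value for key, value in row.items()}
        output.append({name: renamed[name] for name in feature_names})
    return output


samples = [dict(number=i, other_field=i + 1) for i in range(5)]

result = rename_then_select(
    samples,
    rename_fields={"other_field": "feature_0", "number": "feature_1"},
    feature_names=["feature_0", "feature_1"],
)

print(result[0])  # {'feature_0': 1, 'feature_1': 0}
```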

Flattening Nested Fields

[8]:
samples = [{"number": {"value": [i, i + 1]}} for i in range(5)]

backend = Backend(flatten_nested_inputs=True, verbose=False)

backend(samples)
2026-01-21 14:01:56.551 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'flatten_nested_inputs': True, 'verbose': False}
2026-01-21 14:01:56.553 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=True, flatten_lists=False, nested_field_delimiter='.')
Executing against data: [{'number.value': [0, 1]}, {'number.value': [1, 2]}, {'number.value': [2, 3]}, {'number.value': [3, 4]}, {'number.value': [4, 5]}]
[8]:
[{'number.value': [0, 1]},
 {'number.value': [1, 2]},
 {'number.value': [2, 3]},
 {'number.value': [3, 4]},
 {'number.value': [4, 5]}]
[11]:
# with a custom delimiter:
backend = Backend(flatten_nested_inputs=True, nested_field_delimiter=":", verbose=False)

backend(samples)
2026-01-21 14:02:41.691 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'flatten_nested_inputs': True, 'nested_field_delimiter': ':', 'verbose': False}
2026-01-21 14:02:41.692 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=True, flatten_lists=False, nested_field_delimiter=':')
Executing against data: [{'number:value': [0, 1]}, {'number:value': [1, 2]}, {'number:value': [2, 3]}, {'number:value': [3, 4]}, {'number:value': [4, 5]}]
[11]:
[{'number:value': [0, 1]},
 {'number:value': [1, 2]},
 {'number:value': [2, 3]},
 {'number:value': [3, 4]},
 {'number:value': [4, 5]}]
[12]:
# with flattening lists:
backend = Backend(
    flatten_nested_inputs=True,
    nested_field_delimiter=".",
    flatten_lists=True,
    verbose=False,
)

backend(samples)
2026-01-21 14:02:50.230 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'flatten_nested_inputs': True, 'nested_field_delimiter': '.', 'flatten_lists': True, 'verbose': False}
2026-01-21 14:02:50.231 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=True, flatten_lists=True, nested_field_delimiter='.')
Executing against data: [{'number.value.0': 0, 'number.value.1': 1}, {'number.value.0': 1, 'number.value.1': 2}, {'number.value.0': 2, 'number.value.1': 3}, {'number.value.0': 3, 'number.value.1': 4}, {'number.value.0': 4, 'number.value.1': 5}]
[12]:
[{'number.value.0': 0, 'number.value.1': 1},
 {'number.value.0': 1, 'number.value.1': 2},
 {'number.value.0': 2, 'number.value.1': 3},
 {'number.value.0': 3, 'number.value.1': 4},
 {'number.value.0': 4, 'number.value.1': 5}]
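
The three flattening cells above differ only in the delimiter and in whether lists are expanded. A recursive plain-Python sketch that produces the same key shapes is shown below. It is an approximation for illustration, not Packflow's code, and the function name flatten and its parameters are assumptions.

```python
def flatten(value, delimiter=".", flatten_lists=False, prefix=""):
    """Flatten nested dicts (and optionally lists) into a single-level dict."""
    if isinstance(value, dict):
        items = value.items()
    elif flatten_lists and isinstance(value, list):
        # List positions become path segments: 0, 1, 2, ...
        items = enumerate(value)
    else:
        # Leaf value: store it under the path built so far.
        return {prefix: value}

    flat = {}
    for key, child in items:
        path = f"{prefix}{delimiter}{key}" if prefix else str(key)
        flat.update(flatten(child, delimiter, flatten_lists, path))
    return flat


record = {"number": {"value": [0, 1]}}

print(flatten(record))                      # {'number.value': [0, 1]}
print(flatten(record, delimiter=":"))       # {'number:value': [0, 1]}
print(flatten(record, flatten_lists=True))  # {'number.value.0': 0, 'number.value.1': 1}
```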

Numpy Preprocessor

Packflow also has built-in support for converting records to Numpy arrays.

This preprocessor requires feature_names to be set in the configuration to ensure column order.

Since a Numpy array isn’t an allowed output format, we’ll need to write a slightly more advanced Backend to handle conversion back to a JSON-serializable output.

Thankfully, Packflow has built-in support for converting outputs from Numpy, PyTorch, TensorFlow, and PIL Images to ensure they meet API requirements.

[15]:
from packflow.utils.normalize import ensure_valid_output


class NumpyBackend(InferenceBackend):
    def execute(self, inputs):
        print("Executing against data:", inputs)

        # double the values in the Numpy array
        return inputs * 2

    def transform_outputs(self, inputs):
        # Use Packflow's built-in utility to convert the array into a valid output
        return ensure_valid_output(inputs, parent_key="doubled")


# Generate Sample Data

samples = [{"feature1": i, "parent_key": {"value": i + 1}} for i in range(5)]

# Initialize and run the backend
backend = NumpyBackend(
    input_format="numpy",
    feature_names=[
        "parent_key.value"
    ],  # note that the Numpy preprocessor will work with nested fields!
    verbose=False,
)

backend(samples)
2026-01-21 14:06:04.805 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'input_format': 'numpy', 'feature_names': ['parent_key.value'], 'verbose': False}
2026-01-21 14:06:04.806 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.NUMPY: 'numpy'>, rename_fields={}, feature_names=['parent_key.value'], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
Executing against data: [[1]
 [2]
 [3]
 [4]
 [5]]
[15]:
[{'doubled': 2},
 {'doubled': 4},
 {'doubled': 6},
 {'doubled': 8},
 {'doubled': 10}]
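
The cell above shows two steps that are easy to miss: a dotted feature name is resolved into a nested record, and each record becomes one row with one column per feature name. The sketch below mimics this with plain nested lists instead of a Numpy array so that it has no dependencies. The helpers get_path and to_rows are hypothetical and only approximate what the preprocessor appears to do.

```python
def get_path(record, path, delimiter="."):
    """Follow a delimited path such as 'parent_key.value' into a nested dict."""
    value = record
    for key in path.split(delimiter):
        value = value[key]
    return value


def to_rows(records, feature_names):
    """Build one row per record, with columns in feature_names order."""
    return [[get_path(row, name) for name in feature_names] for row in records]


samples = [{"feature1": i, "parent_key": {"value": i + 1}} for i in range(5)]

rows = to_rows(samples, ["parent_key.value"])
print(rows)  # [[1], [2], [3], [4], [5]]

# Convert the doubled values back into JSON-serializable records.
doubled = [{"doubled": row[0] * 2} for row in rows]
print(doubled[0])  # {'doubled': 2}
```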

Execution Order

The Inference Backend acts as a preset DAG: it performs preprocessing steps, then executes transform_inputs(), execute(), and transform_outputs() in that order. Both transformation methods are optional, but they are highly recommended for any use-case with custom pre- or post-execution logic. Keeping that logic in separate steps lets each one be profiled on its own, which helps locate production throughput bottlenecks and other areas for improvement.

As a quick example, here is a backend that leverages all steps:

[16]:
class FullBackend(InferenceBackend):
    def initialize(self):
        self.logger.info("Hello from __init__!")

    def transform_inputs(self, inputs):
        inputs = [row["number"] for row in inputs]
        print("Transformed Inputs:", inputs)
        return inputs

    def execute(self, inputs):
        results = [i * 2 for i in inputs]
        print("Output of Execute:", results)
        return results

    def transform_outputs(self, inputs):
        # business logic
        output = []
        for num in inputs:
            output.append({"doubled": num, "is_even": num % 2 == 0})

        print("Transformed Outputs:", output)
        return output


# Create samples and run backend
samples = [{"number": i} for i in range(5)]

backend = FullBackend(verbose=False)

assert backend.validate(samples)
2026-01-21 14:06:14.991 | DEBUG    | packflow.backend.configuration:load_backend_configuration:63 - Loaded raw configuration: {'verbose': False}
2026-01-21 14:06:14.993 | INFO     | packflow.backend.configuration:load_backend_configuration:67 - Configuration: BackendConfig(verbose=False, input_format=<InputFormats.RECORDS: 'records'>, rename_fields={}, feature_names=[], flatten_nested_inputs=False, flatten_lists=False, nested_field_delimiter='.')
2026-01-21 14:06:14.995 | INFO     | packflow.backend.preprocessors:resolve:127 - Current config does not require preprocessing steps. Defaulting to Passthrough mode.
2026-01-21 14:06:14.996 | INFO     | __main__:initialize:3 - Hello from __init__!
Transformed Inputs: [0, 1, 2, 3, 4]
Output of Execute: [0, 2, 4, 6, 8]
Transformed Outputs: [{'doubled': 0, 'is_even': True}, {'doubled': 2, 'is_even': True}, {'doubled': 4, 'is_even': True}, {'doubled': 6, 'is_even': True}, {'doubled': 8, 'is_even': True}]
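
The fixed order of stages, together with per-stage timing like the ExecutionMetrics log shown earlier, can be sketched in a few lines of plain Python. This is an illustration under assumptions rather than Packflow's internals: the function run_pipeline is hypothetical, the preprocessing stage is omitted, and an undefined optional stage is recorded as None to mirror the log output.

```python
import time


def run_pipeline(inputs, execute, transform_inputs=None, transform_outputs=None):
    """Run the stages in a fixed order, timing each one (illustrative only)."""
    timings = {}
    data = inputs
    stages = [
        ("transform_inputs", transform_inputs),
        ("execute", execute),
        ("transform_outputs", transform_outputs),
    ]
    for name, stage in stages:
        if stage is None:
            timings[name] = None  # optional stage not defined, so it is skipped
            continue
        start = time.perf_counter()
        data = stage(data)
        timings[name] = time.perf_counter() - start
    return data, timings


outputs, timings = run_pipeline(
    [{"number": i} for i in range(5)],
    execute=lambda nums: [n * 2 for n in nums],
    transform_inputs=lambda rows: [row["number"] for row in rows],
    transform_outputs=lambda nums: [{"doubled": n} for n in nums],
)

print(outputs)
print(list(timings))
```

Because each stage is timed separately, a slow transform_outputs() shows up as its own number instead of being folded into the execute() time.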

Conclusion

In this example notebook, we went through the standard flow of defining an Inference Backend. See the other Example notebooks for more specific examples and usage patterns!