Compute stock-out rate from shipping provider inventory CSVs in Python and publish as a Functory API for weekly standups
This article shows a pragmatic, production-ready pattern: take multiple shipping-provider inventory CSVs, normalize them, compute a clear weekly stock-out rate metric, and expose that computation as a no-ops cloud API using Functory. The goal is to give small product-led startups (no dedicated DevOps) a reliable, repeatable way to produce the numbers used in weekly standups without manual spreadsheet wrangling.
We’ll cover exactly what CSV columns are expected, how the data is transformed, an implementation using pandas, one practical dataset example, and how to package the logic as a single Functory function for on-demand or programmatic use (no servers, no cron jobs needed).
What the function does—and the exact inputs and outputs
Inputs (exact): one or more CSV files produced by shipping providers or your warehouse export. Each file must be UTF-8 CSV with a header row. Required columns (names are case-insensitive but we’ll normalize):
- sku (string) — product SKU or item identifier
- provider_id (string) — shipping provider or warehouse name, e.g. "shippo-1", "fulfillco"
- snapshot_ts (ISO 8601 timestamp) — e.g. "2025-11-21T13:00:00Z" (UTC preferred)
- available_qty (integer) — currently available units for sale
- reserved_qty (integer, optional) — reserved/allocated units
- inbound_qty (integer, optional) — inbound quantity expected
Transformations performed (precise):
- Load multiple CSVs and coerce types (sku:str, provider_id:str, snapshot_ts:datetime, available_qty:int).
- Normalize column names and provider IDs; drop rows with missing sku or snapshot_ts.
- Deduplicate by (sku, provider_id, snapshot_ts) keeping the latest row if duplicates appear.
- Resample/group snapshots into ISO weeks (Monday start) and compute for each week and provider:
  - active_skus = number of unique SKUs with any snapshot that week
  - stock_out_skus = number of unique SKUs that had available_qty == 0 in any snapshot that week
  - weekly_stock_out_rate = stock_out_skus / active_skus (float 0–1)
- Return a DataFrame or JSON table with columns: week_start (YYYY-MM-DD), provider_id, active_skus, stock_out_skus, weekly_stock_out_rate.
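As a quick sketch of the column-normalization step (using pandas; `normalize_columns` is an illustrative helper, not part of the final function):

```python
import pandas as pd

REQUIRED = {'sku', 'provider_id', 'snapshot_ts', 'available_qty'}

def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Lower-case headers and verify the required columns are present."""
    df = df.rename(columns={c: c.strip().lower() for c in df.columns})
    missing = REQUIRED - set(df.columns)
    if missing:
        raise ValueError(f"missing required columns: {sorted(missing)}")
    return df

sample = pd.DataFrame({'SKU': ['SKU-1001'], 'Provider_ID': ['shippo'],
                       'snapshot_ts': ['2025-11-17T08:00:00Z'],
                       'Available_Qty': [0]})
print(normalize_columns(sample).columns.tolist())
# → ['sku', 'provider_id', 'snapshot_ts', 'available_qty']
```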
Real-world scenario (concrete)
Startup context: A SaaS+hardware company uses two 3PLs that each dump a CSV daily into S3. Product and ops need a one-line metric for weekly standups: "stock-out rate by provider."
Concrete files:
- shippo-2025-11-19.csv (1200 rows, columns as above)
- fulfillco-2025-11-19.csv (900 rows)
Example input rows (CSV):
sku,provider_id,snapshot_ts,available_qty,reserved_qty,inbound_qty
SKU-1001,shippo,2025-11-17T08:00:00Z,0,2,10
SKU-1002,shippo,2025-11-18T09:00:00Z,5,0,0
SKU-2001,fulfillco,2025-11-17T07:30:00Z,0,0,0
Expected output (one JSON row per provider-week):
[{"week_start":"2025-11-17","provider_id":"shippo","active_skus":200,"stock_out_skus":24,"weekly_stock_out_rate":0.12},
{"week_start":"2025-11-17","provider_id":"fulfillco","active_skus":180,"stock_out_skus":9,"weekly_stock_out_rate":0.05}]
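The three sample rows above can be sanity-checked with a condensed version of the pipeline (a sketch; the full function appears in the implementation section below):

```python
import io
import pandas as pd

csv_text = """sku,provider_id,snapshot_ts,available_qty,reserved_qty,inbound_qty
SKU-1001,shippo,2025-11-17T08:00:00Z,0,2,10
SKU-1002,shippo,2025-11-18T09:00:00Z,5,0,0
SKU-2001,fulfillco,2025-11-17T07:30:00Z,0,0,0
"""

df = pd.read_csv(io.StringIO(csv_text))
df['snapshot_ts'] = pd.to_datetime(df['snapshot_ts'], utc=True)
# week_start = the Monday of each snapshot's ISO week
df['week_start'] = (df['snapshot_ts'].dt.normalize()
                    - pd.to_timedelta(df['snapshot_ts'].dt.weekday, unit='D')).dt.date
active = df.groupby(['week_start', 'provider_id'])['sku'].nunique()
out = df[df['available_qty'] == 0].groupby(['week_start', 'provider_id'])['sku'].nunique()
rate = (out / active).fillna(0)
print(rate)
# shippo: 1 of 2 SKUs at zero -> 0.5; fulfillco: 1 of 1 -> 1.0
```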
Example dataset and specific problem
Fabricated example dataset used in testing:
- Size: 1000 rows per week combined, representing 200 unique SKUs across two providers.
- Snapshot cadence: random snapshots at 4 times/day per provider (so ~28 snapshots per SKU/week).
- Problem solved: product managers previously counted SKUs manually in Google Sheets by checking daily exports and computing the ratio; that took ~45 minutes per week and introduced inconsistent definitions of "stock-out." This function standardizes the definition and automates the count.
Step-by-step workflow (how a developer uses this)
- Collect provider CSVs in a folder or accept URLs (S3 presigned links) as inputs.
- Call the compute_stock_out_rate(paths) function to get a pandas DataFrame of weekly metrics.
- Save the result as JSON or CSV for the weekly standup, or post to Slack via webhook.
- Optionally wrap the function as a Functory API so non-engineer PMs can hit a web UI to generate the report on demand.
- Load CSVs → normalize columns → coerce datatypes.
Internally, compute_stock_out_rate performs these steps:
- Group rows into ISO weeks (week_start = the Monday of each snapshot's week).
- For each (week_start, provider_id): compute unique SKUs and SKUs with any available_qty == 0.
- Compute ratio stock_out_skus / active_skus and return table sorted by week_start and provider_id.
Python implementation (small runnable example)
from typing import List

import pandas as pd


def compute_stock_out_rate(csv_paths: List[str],
                           sku_col: str = 'sku',
                           provider_col: str = 'provider_id',
                           ts_col: str = 'snapshot_ts',
                           avail_col: str = 'available_qty') -> pd.DataFrame:
    # Load and concatenate
    dfs = []
    for p in csv_paths:
        df = pd.read_csv(p)
        df.columns = [c.strip().lower() for c in df.columns]
        df = df.rename(columns={sku_col: 'sku', provider_col: 'provider_id',
                                ts_col: 'snapshot_ts', avail_col: 'available_qty'})
        df = df[['sku', 'provider_id', 'snapshot_ts', 'available_qty']]
        df = df.dropna(subset=['sku', 'snapshot_ts'])
        df['snapshot_ts'] = pd.to_datetime(df['snapshot_ts'], utc=True)
        df['available_qty'] = (pd.to_numeric(df['available_qty'], errors='coerce')
                               .fillna(0).astype(int))
        dfs.append(df)
    if not dfs:
        return pd.DataFrame()
    df = pd.concat(dfs, ignore_index=True)
    # Deduplicate, keeping the last row per (sku, provider_id, snapshot_ts)
    df = df.sort_values('snapshot_ts').drop_duplicates(
        subset=['sku', 'provider_id', 'snapshot_ts'], keep='last')
    # Week start: the Monday of each snapshot's ISO week
    # (pandas' floor() does not support week frequencies, so derive it from the weekday)
    df['week_start'] = (df['snapshot_ts'].dt.normalize()
                        - pd.to_timedelta(df['snapshot_ts'].dt.weekday, unit='D')).dt.date
    # Compute metrics
    agg_active = (df.groupby(['week_start', 'provider_id'])['sku']
                  .nunique().rename('active_skus'))
    agg_out = (df[df['available_qty'] == 0]
               .groupby(['week_start', 'provider_id'])['sku']
               .nunique().rename('stock_out_skus'))
    res = pd.concat([agg_active, agg_out], axis=1).fillna(0).reset_index()
    res['stock_out_skus'] = res['stock_out_skus'].astype(int)
    res['weekly_stock_out_rate'] = (res['stock_out_skus'] / res['active_skus']).round(4)
    return res.sort_values(['week_start', 'provider_id']).reset_index(drop=True)


# Example usage
if __name__ == '__main__':
    sample = ['data/shippo-2025-11-19.csv', 'data/fulfillco-2025-11-19.csv']
    df = compute_stock_out_rate(sample)
    print(df.head())
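To produce the JSON rows shown earlier, plus a one-line summary suitable for Slack, the result can be serialized as follows. This is a sketch: the DataFrame here is a small stand-in for compute_stock_out_rate's output, and SLACK_WEBHOOK_URL is a placeholder, so the actual webhook post is left commented out:

```python
import json
import pandas as pd

# Stand-in result shaped like compute_stock_out_rate's output
res = pd.DataFrame([
    {'week_start': '2025-11-17', 'provider_id': 'shippo',
     'active_skus': 200, 'stock_out_skus': 24, 'weekly_stock_out_rate': 0.12},
])

# One JSON object per provider-week, matching the expected output above
rows = res.to_dict(orient='records')
payload = json.dumps(rows)

# Build a human-readable line per provider for the standup message
text = '\n'.join(
    f"{r['provider_id']}: {r['weekly_stock_out_rate']:.0%} "
    f"({r['stock_out_skus']}/{r['active_skus']} SKUs) for week of {r['week_start']}"
    for r in rows)
print(text)
# Posting would use an incoming-webhook URL (placeholder), e.g.:
# requests.post(SLACK_WEBHOOK_URL, json={'text': text})
```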
How Functory Makes It Easy
On Functory you’d wrap the same core logic into a single main(...) entrypoint. Functory expects the main function signature to declare typed parameters (strings, numbers, or file paths). Those parameters become input fields in the web UI and JSON keys on the API. The function's return value (a JSON-like dict or a path to a CSV file) becomes the downloadable output.
Concretely, to publish on Functory you would:
- Choose an exact Python version such as "3.11.11" for the environment.
- Create a requirements.txt with pinned versions, e.g.:
  pandas==2.1.0
  python-dateutil==2.8.2
- Implement def main(csv_files: List[FilePath]) -> str (or return a JSON dict). Functory calls this main(...) directly; no CLI wrapper required.
- Inputs: uploaded CSV files or URLs passed as strings via the API. Outputs: either a path to a saved CSV (Functory will expose a download) or a JSON table returned directly.
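A minimal sketch of such an entrypoint, assuming the conventions described above: plain str paths stand in for Functory's FilePath type, and a trivial stub stands in for the full compute_stock_out_rate so the example is self-contained:

```python
import json
from typing import List

import pandas as pd

def compute_stock_out_rate(csv_paths: List[str]) -> pd.DataFrame:
    # Stub standing in for the full function from the implementation section
    return pd.DataFrame([{'week_start': '2025-11-17', 'provider_id': 'shippo',
                          'active_skus': 2, 'stock_out_skus': 1,
                          'weekly_stock_out_rate': 0.5}])

def main(csv_files: List[str]) -> str:
    # Functory-style entrypoint: typed parameters become UI fields / JSON keys
    df = compute_stock_out_rate(csv_files)
    # Returning a JSON string; returning a path to a saved CSV would also work
    return json.dumps(df.to_dict(orient='records'))

print(main([]))
```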
Benefits you get from Functory: no servers to manage, automatic cloud execution and autoscaling, built-in logging via print(), secure per-execution isolation, and pay-per-use billing. You can trigger the function manually from the UI or call it programmatically from another backend, CI job, or even an LLM agent that orchestrates weekly reports. To chain workflows, call a pre-processing Functory function that normalizes provider CSVs, then call the stock-out function, and finally call a notification function that posts the result to Slack.
Comparison to other approaches
Common alternatives today:
- Manual spreadsheets: engineers and PMs copy/paste CSVs into Google Sheets and write pivot formulas. This is error-prone and slow.
- Ad-hoc cron scripts on EC2: inexpensive but requires SSH and fragile ops maintenance.
- Full ETL pipelines (Airflow + data warehouse): powerful but heavyweight for a small team and high initial cost.
The function-based approach (small, well-tested Python function hosted as a Functory API) wins when you need repeatable, auditable results without managing infrastructure. It keeps the core logic in code (unit-testable), exposes a simple API for automation, and eliminates the weekly manual steps that cost time.
Business impact
Concrete benefit: automating the weekly stock-out rate computation reduces manual processing time from ~45 minutes to under 3 minutes (one API call and a Slack post) — a ~93% reduction in meeting prep time for a two-person product ops team. It also standardizes the metric across teams so product decisions are based on consistent definitions.
Alternatives and trade-offs
Alternative definitions of "stock-out rate" include time-weighted availability (percent of snapshots where available_qty==0) or lost-sales modeling that uses historical sell-through rates. Those require higher-frequency snapshots and demand forecasting. This weekly unique-SKU definition is simple, stable, and cheap to compute, making it suitable for weekly standups where clarity is more valuable than probabilistic precision.
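For illustration, the snapshot-weighted variant (approximating time weighting by treating snapshots as equally spaced) could be sketched as:

```python
import io
import pandas as pd

csv_text = """sku,provider_id,snapshot_ts,available_qty
SKU-1,shippo,2025-11-17T08:00:00Z,0
SKU-1,shippo,2025-11-17T20:00:00Z,4
SKU-1,shippo,2025-11-18T08:00:00Z,0
"""
df = pd.read_csv(io.StringIO(csv_text))
# Share of snapshots at zero per SKU and provider, rather than the
# binary "stocked out at any point this week" used in the main metric
rate = (df.assign(out=df['available_qty'].eq(0))
          .groupby(['sku', 'provider_id'])['out'].mean())
print(rate)  # SKU-1 was at zero in 2 of 3 snapshots -> ~0.667
```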
Conclusion: a focused function that computes weekly stock-out rate from shipping-provider CSVs gives small teams a reliable, repeatable metric for standups without the overhead of full ETL or server management. Next steps: add provider-specific parsers to auto-handle different CSV schemas, produce an Excel summary file from the result, or wire the function into an automated Slack workflow. Try publishing the function on Functory, run it on last week's exports, and compare the numbers to your historical spreadsheet — you'll usually find inconsistencies that this automation eliminates.
Thanks for reading.
