Functory
functory.
6 min read
Functory

Python Stripe CSV Expansion Revenue Report for Board Meetings — generate a shareable URL

This article shows how to turn raw Stripe CSV exports (payments and invoice lines) into a concise expansion revenue report you can present at a board meeting and share with your team using a single URL. You'll get precise rules for extracting recurring revenue movements (upgrades, add-ons, and downgrades), a small reproducible Python example using pandas, and concrete guidance for deploying that script as a Functory function so non-technical stakeholders can run the report with one click.

Target readers: product managers, startup data engineers, and backend engineers who receive Stripe CSVs like payments.csv or invoiceitems.csv and need reliable month-over-month expansion revenue figures (not just gross payments) for ARR/MRR analysis.

What this function does — inputs, processing, outputs

Input data expected:

  • CSV file (UTF-8) exported from Stripe Payments or Invoices with columns similar to: id, created, customer_email, amount, currency, invoice_id, subscription_id, description, status, metadata. Timestamps in ISO 8601 or UNIX format are OK.
  • Optional parameter: lookback_months (int) — how many months to include in the report.

Transformations performed:

  • Normalize timestamps into yyyy-mm monthly buckets (UTC).
  • Group payment/invoice lines by subscription_id (or customer_email if subscription_id missing) and month to estimate monthly recurring revenue (MRR) per subscription.
  • Compute month-to-month deltas in MRR for subscriptions that existed in previous month to identify expansion (positive delta from existing customers), contraction (negative delta), new MRR, and churned MRR.
  • Aggregate expansion revenue per month and produce supporting tables (per-customer delta rows, top 10 expansions, and a CSV summary per month).

Outputs produced:

  • CSV: expansion_report_by_month.csv with columns: month, expansion_mrr_cents, contraction_mrr_cents, new_mrr_cents, churn_mrr_cents, net_new_mrr_cents.
  • CSV: expansion_by_subscription.csv with subscription_id, customer_email, month, prev_mrr_cents, curr_mrr_cents, delta_cents, reason_hint.
  • Optional: a small JSON summary for dashboards.

Real-world scenario (concrete inputs and outputs)

Imagine an early-stage SaaS startup with a Stripe payments export file named stripe_payments_2024_q1.csv (1,200 rows). Sample columns and a few rows:

id,created,customer_email,amount,currency,invoice_id,subscription_id,description,status,metadata
ch_1,2024-01-05T12:02:00Z,alice@example.com,2000,usd,in_1,sub_1,Monthly plan, succeeded,"{\"plan_tier\":\"pro\"}"
ch_2,2024-01-20T09:30:00Z,bob@example.com,5000,usd,in_2,sub_2,Annual prepaid, succeeded,"{\"plan_tier\":\"enterprise\"}"
ch_3,2024-02-05T12:15:00Z,alice@example.com,4000,usd,in_3,sub_1,Upgrade add-on, succeeded,"{\"upgrade\":\"storage\"}"

Problem solved: compute expansion revenue for Jan–Mar 2024, i.e., the additional MRR generated by existing customers who upgraded or bought add-ons, excluding new customers and refunds.

Expected outputs (example rows from expansion_report_by_month.csv):

month,expansion_mrr_cents,contraction_mrr_cents,new_mrr_cents,churn_mrr_cents,net_new_mrr_cents
2024-01,0,0,7000,0,7000
2024-02,2000,0,0,0,2000
2024-03,1500,500,0,0,1000

Example dataset and specific problem

Example dataset: 1,200 rows (12 months), columns: created (ISO), customer_email (string), subscription_id (string or empty), amount (integer cents), currency (string), invoice_id, description, status. Many records are one-time payments (setup fees) and some are refunds (negative amounts).

Specific problem: manual spreadsheet analysis mixes one-time payments and subscription MRR, producing inflated expansion numbers. The function isolates recurring items by grouping on subscription_id (or heuristics on invoice descriptions) and calculates MRR deltas per subscription month-over-month.

Step-by-step developer workflow

  1. Place the Stripe CSV (stripe_payments_2024_q1.csv) in your working directory or upload it to Functory UI.
  2. Run the Python function: it parses CSV, normalizes dates, filters recurring invoices, computes subscription-month MRR, and detects expansion deltas.
  3. Review outputs: expansion_report_by_month.csv and expansion_by_subscription.csv.
  4. Share the single Functory-generated URL to the CSV or JSON summary with your PM, CFO, and board pack repo.

Algorithm overview

  1. Read CSV; parse created → month key (YYYY-MM).
  2. Normalize amounts to integer cents; filter status == succeeded and currency == target (e.g., usd).
  3. Group by subscription_id + month → monthly_mrr = sum(amount_cents) for recurring-like records.
  4. For each subscription, join month t to month t-1 to compute delta_cents; classify delta as expansion (>0), contraction (<0), new if no prior month, churn if missing from t but present in t-1.
  5. Aggregate deltas across subscriptions per month to produce expansion_mrr_cents, contraction_mrr_cents, etc.

Python example

The snippet below is a compact, runnable example using pandas. It shows the core logic and how to call the core function on a concrete CSV.

import pandas as pd
from datetime import datetime

def parse_month(ts):
    return pd.to_datetime(ts, utc=True).dt.to_period('M').astype(str)

def compute_expansion(csv_path: str, lookback_months: int = 12, output_path: str = 'expansion_report_by_month.csv') -> str:
    df = pd.read_csv(csv_path)
    df = df[df['status'] == 'succeeded']
    df['month'] = parse_month(df['created'])
    df['amount_cents'] = (df['amount'].astype(float)).astype(int)

    # Heuristic: treat entries with subscription_id as recurring; fallback to invoice_id
    df['sub_key'] = df['subscription_id'].fillna(df['invoice_id']).fillna(df['customer_email'])

    monthly = (
        df.groupby(['sub_key', 'month'], as_index=False)
          .agg({'amount_cents': 'sum', 'customer_email': 'first'})
          .rename(columns={'amount_cents': 'curr_mrr_cents'})
    )

    # compute previous month join
    monthly['prev_month'] = (pd.to_datetime(monthly['month']) - pd.offsets.MonthBegin(1)).dt.to_period('M').astype(str)
    prev = monthly[['sub_key', 'month', 'curr_mrr_cents']].rename(columns={'month': 'prev_month', 'curr_mrr_cents': 'prev_mrr_cents'})

    joined = monthly.merge(prev, on=['sub_key', 'prev_month'], how='left')
    joined['prev_mrr_cents'] = joined['prev_mrr_cents'].fillna(0).astype(int)
    joined['delta_cents'] = joined['curr_mrr_cents'] - joined['prev_mrr_cents']

    # classify
    def cls(row):
        if row['prev_mrr_cents'] == 0:
            return 'new'
        if row['delta_cents'] > 0:
            return 'expansion'
        if row['delta_cents'] < 0:
            return 'contraction'
        return 'flat'
    joined['class'] = joined.apply(cls, axis=1)

    summary = (joined.groupby('month')
               .apply(lambda g: pd.Series({
                   'expansion_mrr_cents': g.loc[g['class']=='expansion','delta_cents'].sum(),
                   'contraction_mrr_cents': g.loc[g['class']=='contraction','delta_cents'].sum(),
                   'new_mrr_cents': g.loc[g['class']=='new','curr_mrr_cents'].sum(),
                   'churn_mrr_cents': 0  # churn needs separate detection of missing subs
               }))
               .reset_index())

    summary['net_new_mrr_cents'] = summary['expansion_mrr_cents'] - summary['contraction_mrr_cents'] + summary['new_mrr_cents'] - summary['churn_mrr_cents']
    summary.to_csv(output_path, index=False)
    joined[['sub_key','customer_email','month','prev_mrr_cents','curr_mrr_cents','delta_cents','class']].to_csv('expansion_by_subscription.csv', index=False)
    return output_path

# Example call
if __name__ == '__main__':
    print(compute_expansion('stripe_payments_2024_q1.csv'))

How Functory Makes It Easy

To publish this as a Functory function, pack the core logic into a single main(...) entrypoint such as:

def main(csv_path: str, lookback_months: int = 12, output_name: str = 'expansion_report_by_month.csv') -> str:
    return compute_expansion(csv_path, lookback_months, output_name)

On Functory you must:

  • Choose an exact Python interpreter, e.g. Python 3.11.11.
  • Provide a requirements.txt where every dependency is pinned, e.g. pandas==1.5.3, python-dateutil==2.8.2.
  • Ensure main(...) accepts only typed primitives or file path strings (these become UI fields/JSON inputs) and returns a path-like string for Functory to expose as a downloadable URL.

Inputs/outputs on Functory: the CSV can be uploaded via the web UI or provided as a URL string; main parameters become JSON fields on the HTTP API. The returned file path (e.g., /tmp/expansion_report_by_month.csv) becomes a shareable download URL in the Functory UI and via the API. Benefits: no servers to run, autoscaling on CPU tiers, built-in logging via print(), pay-per-use billing, and the ability to chain functions (e.g., pre-processing → this report → PDF board pack generator).

Comparison with other approaches

Common alternatives include manual spreadsheet work, ad-hoc SQL queries in a data warehouse, or using a BI tool (Looker/Chartio). Spreadsheets are error-prone and hard to reproduce; SQL queries can be precise but require warehouse instrumentation and costum ETL to capture invoice-line granularity; BI tools often show visuals but don't produce the exact CSV breakdowns product teams want for board packs. A small, versioned Python function gives reproducibility, testability, and an API endpoint to generate a shareable URL quickly — combining the rigor of SQL with the shareability of a BI export.

Business impact

Concrete benefit: automating this report reduces manual spreadsheet hours for a PM by ~40% (from an estimated 5 hours per month to ~3 hours including review), and shortens board-prep cycles so you can iterate pricing or upsell experiments faster. For a startup with a $2M ARR and a 20% expansion contribution, having accurate monthly expansion figures can influence hiring or pricing decisions worth tens of thousands in runway planning.

Alternatives, caveats, and next steps

Edge cases to watch: annual prepay invoices (divide by 12 to estimate MRR), refunds (negative amounts), and subscriptions that change billing cadence. For high-fidelity reporting integrate invoice_line_items API exports rather than payments.csv. If you have a warehouse, move this logic into scheduled DB jobs for historical backfill and high-scale queries.

Industry context

According to a 2024 SaaS metrics survey, expansion revenue contributes between 15–25% of ARR for scaling SaaS companies (Source: 2024 OpenView SaaS Benchmarks). Accurate monthly expansion calculations are therefore material for board-level decision making.

Conclusion

Turning Stripe CSV exports into an expansion revenue report is a high-leverage automation: it replaces brittle spreadsheets with a reproducible, auditable Python function that outputs CSVs and can be published as a single-click Functory endpoint. Next steps: extend the function to parse invoice_line_items for better fidelity, add tests that assert expected delta behavior on synthetic datasets, and chain the function with a PDF generator to create a one-click board pack. Try publishing your main(...) on Functory and share the generated URL with your stakeholders — iterate on the classification rules until the numbers match your finance team's expectations.

Thanks for reading.