Functory — Run, Publish & Monetize compute

Clean Your Data: Easy Remove Duplicates Instantly

Smartly detect and clean duplicates from your dataset (CSV or Excel). This function scans your data to find: 🔁 Exact duplicates — identical rows or repeated entries. 🤖 Fuzzy duplicates — similar rows with small differences (typos, spacing, casing, or minor text variations).

csv

$0.01 + $0.001/s

Clean Your Data: Remove Duplicates Instantly

Smartly detect and clean duplicates from your dataset (CSV or Excel). This function scans your data to find:

- 🔁 **Exact duplicates**: identical rows or repeated entries.
- 🤖 **Fuzzy duplicates**: similar rows with small differences (typos, spacing, casing, or minor text variations).

It automatically keeps the **first valid occurrence** of each duplicate and exports everything neatly organized in a single downloadable ZIP.

📦 Inside the ZIP you’ll get:

1. `deduplicated_<name>.csv`: your cleaned dataset (duplicates removed)
2. `duplicates_removed_<name>.csv`: all duplicate rows that were dropped
3. `fuzzy_pairs_<name>.csv`: pairs of rows that look alike (based on similarity)

Args:
    file (FilePath): The uploaded CSV or Excel file to analyze.
    subset (str): Optional. Comma-separated list of column names to check. If left empty, all columns are analyzed.
    similarity_threshold (int): Optional. How strict fuzzy matching should be (0–100). Higher = only very similar values are flagged. Default = 90 (good balance).

Returns:
    str: Generated ZIP archive containing the cleaned dataset and detailed duplicate reports.
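The listing does not show how the function is implemented, so the following is only a minimal sketch of the idea using the Python standard library. The names `normalize` and `find_duplicates` are hypothetical, `difflib.SequenceMatcher` stands in for whatever similarity measure the real function uses, and the sketch assumes that fuzzy matches are only reported, not removed.

```python
from difflib import SequenceMatcher


def normalize(row, columns):
    """Lowercase and collapse whitespace in the compared columns."""
    return "|".join(" ".join(str(row[c]).split()).lower() for c in columns)


def find_duplicates(rows, subset="", similarity_threshold=90):
    """Return (kept, removed, fuzzy_pairs) for a list of dict rows.

    Hypothetical helper: mirrors the documented arguments, not the real code.
    """
    if not rows:
        return [], [], []
    # An empty subset means "compare every column".
    columns = [c.strip() for c in subset.split(",") if c.strip()] or list(rows[0])

    kept, removed, seen = [], [], set()
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key in seen:
            removed.append(row)  # exact duplicate of an earlier row
        else:
            seen.add(key)
            kept.append(row)  # first occurrence is kept

    # Pairwise comparison is O(n^2): fine for a sketch, slow for large files.
    fuzzy_pairs = []
    for i, left in enumerate(kept):
        for right in kept[i + 1:]:
            score = 100 * SequenceMatcher(
                None, normalize(left, columns), normalize(right, columns)
            ).ratio()
            if score >= similarity_threshold:
                fuzzy_pairs.append((left, right, round(score, 1)))
    return kept, removed, fuzzy_pairs
```

In this sketch the three return values correspond loosely to the three CSV files in the ZIP: `kept` to the deduplicated dataset, `removed` to the dropped rows, and `fuzzy_pairs` to the look-alike pairs with their similarity score.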

duplicate

$0.01 + $0.001/s

NaN File Generator (example)

Generate a synthetic CSV file with random data and some NaN values.

Args:
    rows (int): Number of rows to generate.
    nan_ratio (float): Approximate fraction of cells to replace with NaN (0–1).

Returns:
    str: Path to the generated CSV file.
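A generator like this could be sketched as follows. This is an illustration under assumptions, not the function's actual code: the helper names, the `columns` and `seed` parameters, the `col_N` header names, and the 0–100 value range are all invented for the example. Each cell is independently replaced with `NaN` with probability `nan_ratio`, which is why the resulting fraction is only approximate.

```python
import csv
import os
import random
import tempfile


def build_rows(rows, nan_ratio, columns=3, seed=None):
    """Build random numeric rows, replacing roughly nan_ratio of cells with 'NaN'."""
    rng = random.Random(seed)
    return [
        ["NaN" if rng.random() < nan_ratio else round(rng.uniform(0, 100), 2)
         for _ in range(columns)]
        for _ in range(rows)
    ]


def generate_nan_csv(rows, nan_ratio, path=None, columns=3, seed=None):
    """Write the generated rows to a CSV file and return its path."""
    if path is None:
        # Hypothetical default location: a fresh temporary directory.
        path = os.path.join(tempfile.mkdtemp(), "example_with_nan.csv")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([f"col_{i}" for i in range(columns)])
        writer.writerows(build_rows(rows, nan_ratio, columns, seed))
    return path
```

Passing a fixed `seed` makes the output reproducible, which is convenient when the file is used to test other functions.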

example

$0.001/s

Generate Random Example Dataset

Generate and download a test file to try out the other functions.

4.5

training

$0.001/s

NaN row remover (file cleaner)

Remove rows containing NaN values from a CSV or Excel file. The cleaned file is saved in the same format (CSV or XLSX).

Args:
    file (FilePath): Input CSV or Excel file.

Returns:
    str: Path to the cleaned file (same format as input).
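As a rough illustration of the row-removal step, here is a standard-library sketch that handles CSV text only; Excel input would need a third-party library and is left out. The function names and the set of strings treated as missing (`NAN_MARKERS`) are assumptions made for this example, and the real function may recognize a different set.

```python
import csv
import io

# Assumption: which cell values count as "NaN" in this sketch.
NAN_MARKERS = {"", "nan", "na", "null"}


def is_missing(value):
    """Return True if a cell should be treated as a missing value."""
    return value is None or str(value).strip().lower() in NAN_MARKERS


def remove_nan_rows(csv_text):
    """Return CSV text with the header kept and incomplete rows dropped."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ""  # empty input, nothing to clean
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in reader:
        # Skip blank lines and any row with at least one missing cell.
        if row and not any(is_missing(cell) for cell in row):
            writer.writerow(row)
    return out.getvalue()
```

A row is dropped as soon as a single cell is missing, which matches the description above; keeping partially filled rows would require a different rule.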

NaN

$0.05 + $0.001/s

Discover amazing functions
