Detect Data File Changes Before They Corrupt Your Pipeline
A CSV or JSON export can look identical to the previous version yet differ byte-for-byte after a system migration, an ETL job, or an upstream schema change, and downstream pipelines will then load the wrong data silently. Deliteful's File Hash Checker lets data engineers generate MD5, SHA-1, SHA-256, and SHA-512 checksums for data files to catch changes before ingestion.
Data engineers working with recurring file-based ingestion pipelines face a specific integrity problem: how do you know whether the file you received today actually differs from the one you loaded last week when a row count check shows nothing? Cryptographic hashing is the reliable answer, because a single changed byte produces a completely different hash. This is especially critical when validating vendor data drops, confirming reprocessed exports, or auditing file lineage in a data warehouse.
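To see why a row count check is not enough, here is a minimal Python sketch using the standard hashlib module. The two CSV payloads are made-up sample data, not output from Deliteful: they have the same number of rows and differ by a single byte, yet their SHA-256 digests do not match.

```python
import hashlib

# Hypothetical sample payloads: same row count, one changed byte (99.50 vs 99.51).
original = b"id,amount\n1,99.50\n2,12.00\n"
modified = b"id,amount\n1,99.51\n2,12.00\n"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def row_count(data: bytes) -> int:
    """Count lines, the kind of naive check that misses this change."""
    return len(data.splitlines())


if __name__ == "__main__":
    print("row counts equal:", row_count(original) == row_count(modified))
    print("hashes equal:    ", sha256_hex(original) == sha256_hex(modified))
```

The row counts agree while the digests differ, which is the gap that a checksum closes.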
Deliteful supports hashing CSV, JSON, Excel (XLSX/XLS), and other common data file formats up to 500MB for CSVs and 200MB for Excel files. All algorithms run in a single streaming pass, and the plain-text checksum report is straightforward to incorporate into a data quality checklist or audit log. Batch processing supports up to 50 files per run.
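If you want to reproduce the idea of a single streaming pass locally, the sketch below shows one way to do it with Python's hashlib. It is an illustration only and not Deliteful's implementation; the function name hash_file, the chunk size, and the default algorithm tuple are assumptions made for this example.

```python
import hashlib

# Assumed defaults for this sketch, mirroring the four algorithms named above.
DEFAULT_ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def hash_file(path, algorithms=DEFAULT_ALGORITHMS, chunk_size=1024 * 1024):
    """Read the file once in chunks and feed every chunk to all hashers."""
    hashers = {name: hashlib.new(name) for name in algorithms}
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            # Each chunk is read from disk once and shared by all algorithms.
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}
```

Because the file is read in fixed-size chunks, memory use stays flat even for a 500MB CSV, and adding more algorithms does not add more disk reads.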
How it works
1. Sign in with Google. Create your free Deliteful account in seconds; no credit card needed.
2. Upload your data files. Drop in CSV, JSON, XLSX, or other formats. CSV files up to 500MB and Excel up to 200MB are supported.
3. Choose your algorithms. Enter md5, sha256, sha512, etc. as a comma-separated list, or leave blank to run all four algorithms by default (MD5, SHA-1, SHA-256, SHA-512).
4. Generate checksums. Deliteful reads each file in a single streaming pass and computes all requested hashes.
5. Save the report. Download the plain-text checksum report and store it alongside your data file as part of your ingestion audit trail.
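If you script the same workflow yourself, the algorithm choice in step 3 can be modeled as a small parsing function. The sketch below is a guess at reasonable behavior, not a description of how Deliteful parses its input: the name parse_algorithms, the tolerance for upper case and hyphens, and the error on unknown names are all assumptions.

```python
# Assumed default set for this sketch: the four algorithms listed in step 3.
DEFAULT_ALGORITHMS = ["md5", "sha1", "sha256", "sha512"]


def parse_algorithms(raw: str) -> list:
    """Turn 'md5, sha256' into ['md5', 'sha256']; blank input means all four."""
    names = []
    for part in raw.split(","):
        # Normalize 'SHA-256' to 'sha256' (an assumption, for convenience).
        name = part.strip().lower().replace("-", "")
        if name and name not in names:
            names.append(name)
    if not names:
        return list(DEFAULT_ALGORITHMS)
    unknown = [name for name in names if name not in DEFAULT_ALGORITHMS]
    if unknown:
        raise ValueError("unsupported algorithm(s): " + ", ".join(unknown))
    return names
```

For example, parse_algorithms("md5, sha256") yields the two requested names in order, while an empty string falls back to the full default list.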
Frequently asked questions
- Can file hashing detect if a CSV was modified after delivery?
- Yes. Any change to the file's contents — even a single character — produces a completely different hash. Compare the hash of the received file against a baseline hash from the sender to confirm the file is unmodified.
- What is the file size limit for CSV and Excel files?
- CSV files are supported up to 500MB, and Excel (XLSX/XLS) files up to 200MB. Each batch is limited to 50 files or 2GB total, whichever comes first.
- Does the tool compare hashes automatically, or do I need to compare manually?
- The tool generates hash values and outputs them in a plain-text report. Comparison against a baseline is done manually or via your own scripting — the report format is easy to parse.
- Which algorithm should I use to verify data file integrity?
- SHA-256 is the common choice for integrity verification and is included by default. MD5 is faster and still catches accidental changes, but it has known collision vulnerabilities and is generally not recommended for security-sensitive use cases.
- Can I hash JSON files from API exports?
- Yes. JSON is a supported input type. Upload your API export file directly and receive a checksum for it in the report.
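For the manual comparison mentioned in the answers above, a short script is often enough. The sketch below is one possible approach in Python and is not part of Deliteful: it recomputes a file's SHA-256 locally and compares it to a baseline hex string you already hold, for example one copied from a checksum report or supplied by the sender. The function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Stream the file in chunks and return its SHA-256 digest as hex."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_baseline(path, baseline_hex):
    """Return True when the file's SHA-256 equals the baseline hex string.

    Surrounding whitespace and letter case in the baseline are ignored,
    since hex digests are often pasted in upper case or with a newline.
    """
    return sha256_of_file(path) == baseline_hex.strip().lower()
```

An ingestion job could call matches_baseline before loading and either skip an unchanged file or raise an alert when a file that should be stable has changed.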
Create your free Deliteful account with Google and start adding cryptographic integrity checks to your data ingestion workflow.