AbraCalc

Data Format Converters: Convert, Filter, Sort, and Compare Structured Data

Modern data pipelines constantly move information between formats — a spreadsheet export becomes a JSON API payload, a YAML config needs to be verified as valid JSON, an XML feed must be parsed into a database-friendly structure. This guide explains how the most common structured data formats relate to each other, how to convert between them, and how to clean and transform data along the way.

The Four Main Formats

FormatBest forHuman-readable?Supports nesting?
CSVTabular data, spreadsheetsYesNo
JSONAPIs, configuration, nested dataSomewhatYes
YAMLConfiguration files, DevOpsVeryYes
XMLLegacy APIs, document formatsSomewhatYes

Each format has strengths. CSV is the simplest and works directly in Excel or Google Sheets. JSON is the native format of JavaScript and most REST APIs. YAML (a superset of JSON) is favored in Kubernetes, GitHub Actions, and Ansible because it is easy to read and write. XML carries more metadata via attributes and namespaces, making it common in enterprise integrations (SOAP, RSS, SVG).

Converting Between Formats

CSV ↔ JSON

Converting CSV to JSON treats each row as an object, with column headers as keys. The CSV to JSON Converter handles this automatically, including detecting numeric vs. string values. Going the other direction — a flat array of JSON objects back to a table — use the JSON to CSV Converter. Note that nested JSON objects cannot be represented directly in CSV; you must flatten them first (see the JSON Flattener below).

Example:

CSV rowJSON object
1,Alice,true (with headers id,name,active){"id": 1, "name": "Alice", "active": true}

JSON ↔ YAML

JSON and YAML are structurally equivalent — every valid JSON document is also valid YAML. The key syntactic differences: YAML uses indentation instead of braces, omits most quotes, and supports comments (#). Use the YAML to JSON Converter to produce machine-parseable JSON from a YAML config, or the JSON to YAML Converter to produce a more human-readable config from a JSON payload.

JSON → XML

XML requires a single root element, which JSON (an object or array) does not guarantee. The JSON to XML Converter wraps the output in a root element and maps JSON keys to XML tags. Array items become repeated sibling tags. This is the format required by many legacy SOAP APIs and document interchange standards.

JSON Flattening

Nested JSON objects like {"user": {"id": 1, "name": "Alice"}} cannot be directly loaded into a flat table. The JSON Flattener collapses nested keys using dot notation:

Nested JSONFlattened JSON
{"user": {"id": 1, "name": "Alice"}}{"user.id": 1, "user.name": "Alice"}

After flattening, the result can be fed into the JSON to CSV Converter to produce a flat table — a common preprocessing step before loading into a spreadsheet or SQL database.

CSV Data Cleaning and Transformation

Raw CSV exports rarely arrive in a perfectly usable state. A suite of targeted tools handles the most common cleanup tasks:

  • CSV Column Extractor — keep only the columns you need, discarding the rest. Reduces file size and simplifies downstream processing.
  • CSV Row Filter — filter rows by value (e.g., keep only rows where country = "DE"). Works like a SQL WHERE clause without a database.
  • CSV Sorter — sort rows ascending or descending by one or more columns. Equivalent to ORDER BY in SQL.
  • CSV Duplicate Row Remover — removes exact duplicate rows, ensuring each record is unique before loading into a target system.
  • CSV Transpose — swaps rows and columns (pivots the table). Useful when your data is oriented differently than what a downstream tool expects.
  • CSV Splitter (by Rows) — breaks a large CSV into smaller chunks of N rows each, useful when a target API or import tool has a row limit per request.

Worked Example: Clean and Convert a Sales Export

Suppose you have a 10,000-row CSV export from a CRM with 30 columns, and you need to load only active US customers into a JSON API.

  1. Use CSV Column Extractor to keep only id, name, email, country, active.
  2. Use CSV Row Filter to keep rows where country = US and active = true.
  3. Use CSV Duplicate Row Remover to eliminate any duplicate email entries.
  4. Use CSV to JSON Converter to produce the JSON array required by the API.
  5. If the API has a 500-record limit, use CSV Splitter before the final conversion step.

Comparing JSON: The JSON Diff Viewer

When a configuration or API response changes between versions, the JSON Diff Viewer shows exactly which keys were added, removed, or changed — displayed as a structured side-by-side or inline diff. This is far more useful than a plain text diff for nested objects, because it understands the JSON structure rather than comparing raw text lines.

CSV Statistics Summary

Before transforming data, it helps to understand it. The CSV Statistics Summary tool produces per-column statistics for numeric columns (min, max, mean, median, standard deviation) and per-column value counts for categorical columns — similar to Python's pandas.DataFrame.describe(), but with no code required.

Common Mistakes to Avoid

  • Converting nested JSON to CSV without flattening first. Nested objects become [object Object] in CSV — always flatten first.
  • Forgetting CSV quoting rules. Values that contain commas must be wrapped in double quotes. Most tools handle this, but manually edited CSVs sometimes break the convention.
  • Assuming YAML is always valid JSON. YAML supports features (anchors, aliases, block scalars, comments) that have no JSON equivalent and will cause JSON parsers to fail.

Frequently Asked Questions

What is the fastest way to convert a YAML config to JSON for an API call?

Paste the YAML into the YAML to JSON Converter. If the YAML uses anchors or aliases, the tool resolves them before outputting JSON.

How do I remove duplicate rows but keep the first occurrence?

The CSV Duplicate Row Remover retains the first occurrence of each duplicate row by default, which is the standard expected behavior for deduplication.

Can I compare two JSON files to find what changed?

Yes — paste both JSON documents into the JSON Diff Viewer. It highlights additions, deletions, and value changes at every nesting level.

Related Calculators