A Tale of Two Formats

If you've spent any time poking around APIs, config files, or developer documentation, you've almost certainly bumped into both XML and JSON. One of them is everywhere right now. The other is mostly found in legacy enterprise systems, RSS feeds, and the nightmares of developers who had to maintain SOAP services in the mid-2000s.

This isn't just a story about file formats. It's a story about what happens when software tooling gets too clever for its own good — and what developers actually reach for when they have a choice.

The Rise of XML: Enterprise's Darling

XML — the eXtensible Markup Language — was standardized by the W3C in 1998 and quickly became the go-to format for data interchange. The pitch was compelling: a human-readable, self-describing format that could represent any data structure, validated against a schema, transformable via XSLT, and processed by a rich ecosystem of tools.

For enterprise software of the late 1990s and early 2000s, XML felt like the answer to everything. Configuration files? XML. Web services? SOAP over XML. Document formats? XML (OpenDocument, OOXML). The web itself was going to become XHTML — a stricter, XML-compliant version of HTML.

SOAP (Simple Object Access Protocol) was particularly emblematic of the era. It let systems communicate over HTTP using XML envelopes, with WSDL files describing the service contract. Banks, governments, and large enterprises standardized on it. A whole industry of middleware (WebSphere, BizTalk, WebLogic) grew up around processing these XML documents.

"XML is not a language in the sense of a programming language or even a markup language in isolation — it is a meta-language: a set of rules for defining languages." — Tim Bray, XML co-author

And that was partly the problem. XML was designed to be everything to everyone. It succeeded at that — and in doing so, it became enormously complicated.

What Went Wrong With XML

XML's ambitions were its undoing. Here's what developers ran into in practice:

  • Verbosity: Every piece of data needs an opening and closing tag. Want to store the number 42? You're writing <value>42</value>. Multiply that across a large payload and your network traffic balloons.
  • Namespaces: XML namespaces were introduced to avoid naming conflicts between schemas. In theory: sensible. In practice: a maintenance nightmare that required verbose URI prefixes sprinkled throughout your documents.
  • Parsing complexity: A proper XML parser is not a small thing. It needs to handle entities, CDATA sections, processing instructions, namespace resolution, and more. This made XML slow and memory-hungry compared to simpler alternatives.
  • XSLT: XSLT, the language for transforming XML, was itself written in XML and required a dedicated transformation engine. It had its own learning curve, and debugging was painful.
  • No native browser support for data: Browsers could display XML but JavaScript had no built-in, ergonomic way to work with it. You needed the DOM API, which was verbose and inconsistent across browsers.
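
To make the namespace and parsing points concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The document, the namespace URI, and the element names are invented for illustration and do not come from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical order document: the "ord" prefix and its URI are made up.
doc = """<?xml version="1.0"?>
<ord:order xmlns:ord="http://example.com/schemas/order">
  <ord:item><![CDATA[Tea & biscuits]]></ord:item>
  <ord:quantity>42</ord:quantity>
</ord:order>"""

root = ET.fromstring(doc)

# The parser expands every prefixed tag to "{namespace-uri}localname".
print(root.tag)

# Lookups must carry the namespace too, here through a prefix map.
ns = {"ord": "http://example.com/schemas/order"}
item = root.find("ord:item", ns)
quantity = root.find("ord:quantity", ns)

# A lookup that ignores the namespace silently finds nothing.
missing = root.find("quantity")

# CDATA is unwrapped by the parser, and element text is always a string,
# so the caller converts it by hand.
quantity_value = int(quantity.text)
print(item.text, quantity_value, missing)
```

Even in this small case the caller has to know about the prefix map, the expanded tag names, and the manual type conversion, which is the kind of overhead the list above describes.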

By the mid-2000s, the cracks were showing. Developers building the new generation of web applications — dynamic, AJAX-driven, talking to servers in the background — found XML cumbersome. They needed something lighter.

Enter JSON: The Accidental Standard

Douglas Crockford didn't invent JSON so much as he discovered it. In the early 2000s, he recognized that a subset of JavaScript's object and array literal syntax was a perfectly valid, human-readable data interchange format. He registered json.org in 2002 to describe it, and in 2006 published a short RFC (RFC 4627) that specified it formally.

The insight was elegant: JavaScript was already running in every browser. If your data format is JavaScript object notation, then parsing it in the browser is nearly free. No schema. No transformation language. No namespaces. Just data.

Here's the same piece of data in both formats, so you can see what developers were choosing between:

The same data in XML

<?xml version="1.0" encoding="UTF-8"?>
<user>
  <id>1042</id>
  <name>Priya Sharma</name>
  <email>priya@example.com</email>
  <active>true</active>
  <roles>
    <role>editor</role>
    <role>viewer</role>
  </roles>
  <address>
    <city>Mumbai</city>
    <pincode>400001</pincode>
  </address>
</user>

The same data in JSON

{
  "id": 1042,
  "name": "Priya Sharma",
  "email": "priya@example.com",
  "active": true,
  "roles": ["editor", "viewer"],
  "address": {
    "city": "Mumbai",
    "pincode": "400001"
  }
}

Same data. JSON is roughly a third shorter here (the exact figure depends on whitespace and on whether you count the XML declaration), and the absolute difference grows as payloads get larger. More importantly, the JSON version is instantly familiar to any developer who has used a modern programming language. The XML version requires you to understand the XML data model just to read it.
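
If you want to check the size difference yourself, the sketch below measures the two samples above with all optional whitespace removed. It is only an illustration: the variable names are arbitrary, and the percentage will shift depending on formatting and on whether the XML declaration is included.

```python
import json
import xml.etree.ElementTree as ET

# The two samples from above, written without optional whitespace.
xml_text = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<user><id>1042</id><name>Priya Sharma</name>"
    "<email>priya@example.com</email><active>true</active>"
    "<roles><role>editor</role><role>viewer</role></roles>"
    "<address><city>Mumbai</city><pincode>400001</pincode></address></user>"
)
json_text = (
    '{"id":1042,"name":"Priya Sharma","email":"priya@example.com",'
    '"active":true,"roles":["editor","viewer"],'
    '"address":{"city":"Mumbai","pincode":"400001"}}'
)

# Confirm both strings are well-formed before measuring them.
ET.fromstring(xml_text.encode("utf-8"))
json.loads(json_text)

xml_size = len(xml_text.encode("utf-8"))
json_size = len(json_text.encode("utf-8"))
saving = 1 - json_size / xml_size

print(f"XML: {xml_size} bytes, JSON: {json_size} bytes, JSON is {saving:.0%} smaller")
```

Most of the difference comes from closing tags, which repeat every element name a second time.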

Why JSON Won

The victory wasn't just about brevity. Several forces converged:

  • Native browser parsing: JSON.parse() shipped in all major browsers around 2009. Before that, developers were using eval() on JSON strings, which worked but would also execute any code embedded in the response, and that made security folks wince.
  • The REST API boom: As SOAP-based web services gave way to REST APIs, JSON became the default response format. Twitter, GitHub, Stripe, and virtually every modern API shipped JSON. Developers followed the ecosystem.
  • Lightweight parsers in every language: Every mainstream language got a fast, idiomatic JSON library. Python's json module, Ruby's JSON, Go's encoding/json. Parsing JSON became a one-liner everywhere.
  • Data types that make sense: JSON has strings, numbers, booleans, arrays, objects, and null. That maps naturally to what most programming languages use. XML, by contrast, treats everything as a string — you have to layer schemas on top to get typed data.
  • NoSQL databases: MongoDB, CouchDB, and similar databases store documents in JSON (or BSON). Once your database is JSON-native, your whole stack tends to follow.
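
The point about data types is easy to see in practice. The following sketch parses a cut-down version of the user record with Python's standard json and xml.etree.ElementTree modules; the "manager" field is a hypothetical addition used only to show null.

```python
import json
import xml.etree.ElementTree as ET

# JSON: one call, and the values arrive as native Python types.
# ("manager" is a made-up field added to demonstrate null.)
user = json.loads(
    '{"id": 1042, "active": true, "roles": ["editor", "viewer"], "manager": null}'
)
json_types = {key: type(value).__name__ for key, value in user.items()}
print(json_types)

# XML: element text is always a string, so the caller converts each value.
root = ET.fromstring("<user><id>1042</id><active>true</active></user>")
xml_id = root.findtext("id")
xml_active = root.findtext("active")
print(type(xml_id).__name__, type(xml_active).__name__)

converted_id = int(xml_id)
converted_active = xml_active == "true"
```

With XML, the knowledge that id is a number and active is a boolean lives either in a schema or in the calling code. With JSON it travels with the data.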

Where XML Still Lives

It would be unfair to write XML's obituary. It's very much alive in specific niches where its features genuinely shine:

  • SVG: Scalable Vector Graphics is XML, and it works beautifully — the hierarchical, attribute-rich structure maps well to graphics primitives.
  • RSS and Atom feeds: Podcast feeds and blog syndication still run on XML. Your podcast app is parsing XML every time it refreshes.
  • Android manifests: Android's AndroidManifest.xml and layout files are XML. There's a reason Jetpack Compose was introduced — developers were tired of XML layouts.
  • Office formats: DOCX and XLSX are ZIP archives containing XML files. Open a Word document with a ZIP utility and you'll find XML inside.
  • SOAP legacy systems: Banks, healthcare systems, and government APIs built in the 2000s are still running SOAP. Migrating them is expensive and risky, so they persist.
  • Configuration: Maven's pom.xml, Spring's application context, Ant build files — the Java ecosystem in particular has a deep XML heritage.

JSON's Rough Edges

JSON is not perfect. Developers who've worked with it seriously run into the same frustrations eventually:

  • No comments: You cannot add comments to JSON. This was a deliberate choice: Crockford has said he removed comments because people were using them to hold parsing directives, which would have hurt interoperability. It's still deeply annoying when you need to document a configuration value.
  • No date type: JSON has no native date format. Dates are typically represented as ISO 8601 strings ("2026-04-15T10:30:00Z") by convention, but the spec doesn't enforce this.
  • No schema by default: XML had XSD schemas built into its ecosystem. JSON has JSON Schema, but it's a separate specification that isn't universally supported.
  • Trailing commas: Forget to remove a trailing comma in a JSON object or array and your parser throws an error. Every developer has lost time to this.
  • No binary support: JSON is text-only. For binary data you're base64-encoding it, which adds overhead.
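
Several of these rough edges can be reproduced in a few lines. This is a sketch using only Python's standard library; the helper is_valid_json and the field names are hypothetical, chosen for the example.

```python
import base64
import json
from datetime import datetime, timezone


def is_valid_json(text: str) -> bool:
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Trailing commas and comments are both syntax errors.
print(is_valid_json('{"retries": 3}'))
print(is_valid_json('{"retries": 3,}'))
print(is_valid_json('{"retries": 3 // max attempts\n}'))

# Dates: there is no date type, so the usual convention is an ISO 8601 string
# that both sides agree to parse.
stamp = datetime(2026, 4, 15, 10, 30, tzinfo=timezone.utc)
payload = json.dumps({"created_at": stamp.isoformat()})
restored = datetime.fromisoformat(json.loads(payload)["created_at"])
print(payload, restored == stamp)

# Binary data: base64 turns every 3 bytes into 4 characters.
raw = bytes(range(256)) * 12
encoded = base64.b64encode(raw).decode("ascii")
print(len(raw), len(encoded))
```

Nothing in the payload marks created_at as a date. The receiver has to know which strings to convert, which is exactly the gap a schema would fill.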

These limitations are why YAML became popular for configuration files: it's essentially a superset of JSON (formally so as of YAML 1.2) that adds comments, multi-line strings, and a less cluttered syntax. They are also why formats like Protocol Buffers and MessagePack exist for high-performance binary serialization.

The Bigger Lesson

The JSON vs XML story is a case study in how software ecosystems evolve. XML was designed by committee to solve every data interchange problem for all time. It succeeded at being comprehensive, and that comprehensiveness became a burden.

JSON was designed to solve one concrete problem — passing data between a web server and a browser — and it solved it elegantly. The simplicity made it easy to adopt, easy to implement, and easy to extend into other domains.

Simplicity is not the absence of features. It's the presence of the right features in the right amount.

When you're working with JSON day to day — debugging API responses, formatting payloads, validating structures — a good tool makes the difference. Our JSON Formatter lets you paste raw JSON and instantly get it pretty-printed, validated, and easy to read. Particularly useful when you're staring at a minified API response at 11pm trying to figure out why a nested key is missing.

XML had its moment, and it shaped how we think about structured data. JSON took the best of that thinking and stripped away everything that wasn't load-bearing. That's usually how progress works.
