The liberating effects of data formatting standards

We at the Knowledge Standards Foundation speak about an “encyclopedia network” and about “standards” for the entire network architecture. And it probably is a good idea to be thinking about that.

But, having explored a number of different attempts to decentralize social media, I have come around to a simple view about what we really need—and if I am right, it is surprisingly low-hanging fruit. To wit:

We need uniform data formatting standards for different kinds of distinguishable content. Then we need developers to use those standards.

That is all.

I invite you to examine that proposition.

Look at RSS and Atom. They provide little more than a method to generate a feed of blog posts. Before RSS and Atom were formulated, people had already been in the habit of posting updates on their personal websites. All that RSS and Atom did was to define a few simple features, such as title, body, and author—so they are just standards for formatting web content and associated metadata.
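To make concrete how little such a standard has to specify, here is a minimal sketch that writes a feed using a handful of RSS 2.0 elements. The function name `build_rss` and the sample post are illustrative assumptions, not part of any standard. Note that the RSS 2.0 `author` element is meant to hold an email address.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, posts):
    """Return an RSS 2.0 document (as a string) for a list of post dicts.

    Each post is assumed to be a dict with "title", "author", and "body"
    keys; that shape is a choice made for this sketch only.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a channel to have a title, a link, and a description.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = f"Posts from {channel_title}"
    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "author").text = post["author"]
        ET.SubElement(item, "description").text = post["body"]
    return ET.tostring(rss, encoding="unicode")


sample_posts = [
    {
        "title": "Hello",
        "author": "jane@example.com (Jane Doe)",
        "body": "First post.",
    }
]
print(build_rss("Example Blog", "https://example.com/", sample_posts))
```

Any software that can emit these few elements can publish a feed that any reader can consume, and that is the whole of the trick.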

Merely because these data formatting standards exist, there is no single proprietary blogging software (there are others besides WordPress, and it’s open source anyway). There is no single dominant blogging platform or corporation. That is why there is no way for a giant Silicon Valley corporation to silence a particular blogger.

Common data formatting standards, like RSS, guarantee that you can export your data in a useful way. You can grab all the content of your blog formatted according to the RSS standard, and then import it to some other software or platform. But you cannot do that with your social media data. Big Tech monopolies like Twitter and Facebook are propped up by the fact that their content format is proprietary. If it weren’t, then it would be much easier to export your tweets, your Facebook posts, and your other content elsewhere. It would not matter as much where the content was.
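The import side can be sketched just as briefly. The following assumes a small hand-written RSS 2.0 sample (`SAMPLE_FEED`) and a hypothetical `import_posts` helper; any other platform could read the same export into its own records in much the same way.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 export, standing in for a real blog archive.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>https://example.com/</link>
    <description>An example feed</description>
    <item>
      <title>First post</title>
      <link>https://example.com/first</link>
      <description>Hello, world.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/second</link>
      <description>More text.</description>
    </item>
  </channel>
</rss>"""


def import_posts(feed_xml):
    """Turn an RSS document into a list of plain dicts, one per item."""
    root = ET.fromstring(feed_xml)
    posts = []
    for item in root.iter("item"):
        posts.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "body": item.findtext("description", default=""),
            }
        )
    return posts


posts = import_posts(SAMPLE_FEED)
for post in posts:
    print(post["title"], post["link"])
```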

Now, let me be clear. Suppose Twitter and Facebook did suddenly start making your content available in a useful, common standard. For a while (possibly years), they would still have effective monopolies over discussion using their types of content, for the simple reason that you wouldn’t have as many readers or viewers outside of their networks. But as soon as they all started using common data formatting standards, then it would be possible for developers, including groups of competitors and the communities devoted to open source and free speech, to create aggregators—just like blog and news aggregators—bringing together all content of a similar kind (tweets, Facebook-type posts, annotated pictures, videos, etc.). People interested in that content would naturally gravitate to those aggregators, because they would be able to find all their old interesting content, and also some off-network content (their friends who had left Facebook and Twitter in disgust).

But, you say, we cannot force the likes of Facebook and Twitter to use a uniform data formatting standard; they’d never agree, since it would undermine their business model. To that, I have two replies.

First, yes, we can. We could legislate it. It’s even possible, although not very likely, that Democrats and other parties in other countries would be in favor of this simple, practical way of “breaking up the monopolies.” But don’t hold your breath; Democrats get a lot of money from Silicon Valley and they are doing the sort of censorship that the radical elements of the party want to be done. As to the Republicans, if the Democrats don’t want it, chances are they won’t want it either—but they’ll win votes by pretending that they do. They’ll just…never get around to it, for some weird reason.

Second, we can write tools that will automatically export archives and feeds into a common format. Twitter itself should support this, giving me an RSS feed of my most recent (say) 100 posts. But it doesn’t. However, someone could do so for them. In fact, someone did run such a free service for a while, and you can pay for such a service.

Here then is the vision that I want the KSF to champion. Imagine an encyclopedia formatting standard. Wikipedia does not have this quite yet, but much of the work has already been done. Then all we need to do is write software for different encyclopedias that generates feeds and archives, if not of all the content, then at least of the metadata. Even if nothing else happened, it would be only a matter of time before some excellent meta-search engines were developed. And after that, an equally distributed means of rating articles. The result would be a massive Internet-wide encyclopedia project, not limited just to Wikipedia.
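As a rough illustration only: no such encyclopedia standard exists yet, so every field name below (`title`, `publisher`, `url`, `language`, `license`) is a hypothetical placeholder, as are the `ArticleMetadata` class and the `search` function. The sketch shows how, once different encyclopedias publish metadata in one shared shape, a meta-search across all of them becomes a few lines of code.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ArticleMetadata:
    """Hypothetical metadata record for one encyclopedia article."""
    title: str
    publisher: str
    url: str
    language: str
    license: str


def to_feed_entry(meta):
    """Serialize one record to JSON, as an encyclopedia's feed might."""
    return json.dumps(asdict(meta), sort_keys=True)


def search(feed_entries, term):
    """Return records from any publisher whose title contains the term."""
    records = [ArticleMetadata(**json.loads(entry)) for entry in feed_entries]
    return [r for r in records if term.lower() in r.title.lower()]


# Entries as they might arrive from two unrelated encyclopedias.
feed_entries = [
    to_feed_entry(ArticleMetadata(
        "Gravity", "Encyclopedia A", "https://a.example/gravity", "en", "CC BY-SA")),
    to_feed_entry(ArticleMetadata(
        "Gravity (physics)", "Encyclopedia B", "https://b.example/gravity", "en", "CC BY")),
    to_feed_entry(ArticleMetadata(
        "Magnetism", "Encyclopedia B", "https://b.example/magnetism", "en", "CC BY")),
]

for hit in search(feed_entries, "gravity"):
    print(hit.publisher, hit.url)
```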

Imagine RSS feeds being created for every Twitter and public Facebook feed that wanted one—and also for Parler, MeWe, Gab, Minds, and the Fediverse (Mastodon etc.). Then imagine a massive aggregator: again, that’s a service that combines data from all those websites and makes it easily available in one place (at least to programmers; but in this case, let’s imagine the broader public as well). You can follow every feed you want, regardless of its location. And you can even respond from the same place—and, until such time as those other networks incorporate broader RSS network feeds into their own services, the aggregator will log in to your accounts on all those other services, and post your response. The result (with further improvements) is a massive Internet-wide social media network, not limited to Twitter and Facebook and other such tech giants. Because you can participate from software that you own, you retain control over your own data. Even if one network, like Twitter, shuts down your microposts, your friends will be able to see them via the broader network.
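The core of such an aggregator is simple once every network's posts arrive in one shape. The sketch below assumes a hypothetical `Post` record and made-up network names. It merges feeds from several sources and keeps only the accounts you follow, newest first. Posting replies back to each network is left out, since that would depend on each service's own API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Post:
    """One post in a shared format; the fields are assumptions for this sketch."""
    source: str       # the network the post came from
    author: str       # the account name on that network
    text: str
    published: datetime


def aggregate(feeds, following):
    """Merge feeds from many networks into one timeline, newest first.

    `following` is a set of (source, author) pairs, so a feed can be
    followed wherever it happens to live.
    """
    merged = [
        post
        for feed in feeds
        for post in feed
        if (post.source, post.author) in following
    ]
    return sorted(merged, key=lambda post: post.published, reverse=True)


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


blog_feed = [Post("blog.example", "alice", "Long post", utc(2021, 1, 5))]
micro_feed = [
    Post("micro.example", "alice", "Short note", utc(2021, 1, 7)),
    Post("micro.example", "bob", "Not followed", utc(2021, 1, 6)),
]

timeline = aggregate(
    [blog_feed, micro_feed],
    following={("blog.example", "alice"), ("micro.example", "alice")},
)
for post in timeline:
    print(post.published.date(), post.source, post.text)
```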

Similarly for videos. I guess it’s getting obvious by now: a service like BitChute allows you to easily export your videos from YouTube to their service (this is already the case, in fact), which puts them into an uncensorable, public cloud, accessible via “magnet” addresses. You will (or should) have exclusive control over your videos. Then it’s simply a matter of standardizing common video and video metadata formats, and making the superset of all videos available online. And voilà, we do not have to depend on censorious bullies like YouTube.

In fact, that illustrates the next point I want to make. In each case, where there is a standardized data format, it is possible to put all that content, or at least the metadata about the content, in “content-addressable storage”—basically, a method of putting the content into any of a number of interoperable public clouds, such as BitTorrent, IPFS, and blockchains. What does that mean? It means that the content exists everywhere and nowhere—in other words, anyone can pull down an encyclopedia article or video, and you can build whole systems based on the content, at least if there is a common data formatting standard. It is an excellent way to implement the idea of a “public commons.”
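The idea behind content-addressable storage can be shown in a few lines: the address of a piece of content is a hash of the content itself, so any node holding the same bytes serves them under the same address. This is a simplified sketch using SHA-256 and an in-memory dictionary. The `ContentStore` class is hypothetical, and real systems such as BitTorrent and IPFS use their own addressing schemes.

```python
import hashlib


class ContentStore:
    """A toy content-addressed store: the key is the hash of the value."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store the bytes and return their address (a SHA-256 hex digest)."""
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        """Fetch bytes by address and verify they still match that address."""
        data = self._blobs[address]
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data


# Two independent nodes that happen to hold the same article.
node_a = ContentStore()
node_b = ContentStore()
article = "An encyclopedia article.".encode("utf-8")

address = node_a.put(article)
same_address = node_b.put(article)
print(address == same_address)
```

Because the address is derived from the content, a reader does not need to trust or even know which node supplied the bytes; checking the hash is enough.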

So all of the above are nice dreams, but they all require that there be common data formatting standards. For that reason, I have two proposals.

First, I want the first unit of the upcoming seminar to be focused on data formatting standards.

Second, I want to look into publishing a draft data formatting standard for encyclopedias sooner rather than later.

By Larry Sanger
