Google, Facebook, Microsoft, and Twitter Partner on New Data Project
Facebook is participating in the Data Transfer Project, a collaboration of organizations, including Google, Microsoft and Twitter, committed to building a common way for people to transfer data into and out of online services.
Moving your data between any two services can be complicated because every service is built differently and uses different types of data that may require unique privacy controls and settings. For example, you might use an app where you share photos publicly, a social networking app where you share updates with friends, and a fitness app for tracking your workouts. People increasingly want to be able to move their data among different kinds of services like these, but they expect that the companies that help them do that will also protect their data.
These are the kinds of issues the Data Transfer Project aims to tackle.
The Data Transfer Project was formed in 2017 to create an open-source, service-to-service data portability platform so that all individuals across the web could easily move their data between online service providers whenever they want.
The DTP extends data portability beyond downloading a copy of your data from your service provider, to providing consumers the ability to directly transfer data in and out of any participating provider.
The protocols and methodology of DTP enable direct, service-to-service data transfer with streamlined engineering work.
The Project, which is in its early stages, comprises three main components:
Data Models are the canonical formats that establish a common understanding of how to transfer data.
Adapters provide a method for converting each Provider's proprietary data and authentication formats into a form that is usable by the system.
The Task Management Library provides the plumbing to power the system.
Data Models
Data Models represent the data while it is being transferred between two different companies. Ideally, each company would use interoperable APIs (e.g. ActivityPub) to allow data to flow between them. In many cases, however, they do not, so there needs to be a way to translate data from one company's representation to another company's representation.
Data Models are clustered together, typically by industry grouping, to form Verticals. A Provider could have data in one or more Verticals. Verticals could be photos, email, contacts, or music. Each Vertical has its own set of Data Models that enable seamless transfer of the relevant file types. For example, the Music vertical could have Data Models for music, playlists and videos.
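As a rough illustration of how Data Models might be grouped into a Vertical, here is a minimal Python sketch. The class names (Photo, Album), their fields, and the VERTICALS registry are hypothetical and chosen only for this example; they are not taken from the DTP codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    """Hypothetical Data Model for one photo in a 'photos' Vertical."""
    title: str
    url: str
    album_id: str


@dataclass(frozen=True)
class Album:
    """Hypothetical Data Model for an album in the same Vertical."""
    album_id: str
    name: str


# A Vertical groups the Data Models that belong together.
# The registry shape is an assumption made for illustration.
VERTICALS = {
    "photos": (Photo, Album),
}


def models_for(vertical: str):
    """Return the Data Model classes registered for a Vertical."""
    return VERTICALS.get(vertical, ())
```

A Provider that only handles photos would then need to understand just the models returned by models_for("photos").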
Ideally, a Vertical will have a small number of well-defined and widely-adopted Data Models. In such a situation, the generally accepted standard will be used as the Data Model for that Vertical across companies. This is not currently the case for most Verticals because Data Models have emerged organically in a largely disconnected ecosystem.
One goal of DTP is to encourage organizations to use common Data Models in their systems, which will happen if organizations take importing and exporting data into consideration when initially designing their systems or providing updates. Using a common Data Model will significantly reduce the need for companies to maintain and update proprietary APIs.
Adapters
There are two main kinds of Adapters: Data Adapters and Authentication Adapters. These Adapters exist outside of a Provider's core infrastructure and can be written either by the Provider itself, or by third parties that would like to enable data transfer to and from a Provider.
Data Adapters
Data Adapters are pieces of code that translate a given Provider's APIs into Data Models used by DTP. Data Adapters come in pairs: an exporter that translates from the Provider's API into the Data Model, and an importer that translates from the Data Model into the Provider's API.
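The exporter/importer pairing can be sketched in a few lines of Python. Everything here is an assumption for illustration: the Contact model, the two Provider classes, and the field names of each Provider's "API" records are invented and do not reflect any real Provider or DTP's actual interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """Hypothetical common Data Model shared by both Providers."""
    name: str
    email: str


class ProviderAExporter:
    """Exporter: Provider A's API format -> common Data Model."""

    def __init__(self, api_records):
        # api_records stands in for responses from Provider A's API.
        self._api_records = api_records

    def export(self):
        return [
            Contact(name=r["displayName"], email=r["emailAddress"])
            for r in self._api_records
        ]


class ProviderBImporter:
    """Importer: common Data Model -> Provider B's API format."""

    def __init__(self):
        # Stands in for the storage behind Provider B's API.
        self.stored = []

    def import_items(self, contacts):
        for c in contacts:
            self.stored.append({"full_name": c.name, "mail": c.email})


def transfer(exporter, importer):
    """Move data between Providers by way of the Data Model."""
    items = exporter.export()
    importer.import_items(items)
    return len(items)
```

Neither adapter knows about the other Provider's format; both only know their own API and the shared Data Model, which is what lets adapters be written independently.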
Authentication Adapters
Authentication Adapters are pieces of code that allow consumers to authenticate their accounts before transferring data out of or into another Provider. OAuth is likely to be the choice for most Providers; however, DTP is agnostic to the type of authentication.
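One way to picture authentication-agnostic design is a common interface with interchangeable implementations. The sketch below is an assumption, not DTP's API: AuthAdapter, AuthData, and both concrete adapters are hypothetical, and the token adapter merely stands in for an OAuth-style flow without performing one.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthData:
    """Opaque result an Adapter hands to the rest of the system."""
    scheme: str
    secret: str


class AuthAdapter(ABC):
    """Hypothetical interface every Authentication Adapter implements."""

    @abstractmethod
    def authenticate(self, user_input: dict) -> AuthData:
        ...


class TokenAuthAdapter(AuthAdapter):
    """Stands in for an OAuth-style flow that ends with an access token."""

    def authenticate(self, user_input: dict) -> AuthData:
        return AuthData(scheme="token", secret=user_input["access_token"])


class PasswordAuthAdapter(AuthAdapter):
    """Stands in for a Provider that uses a different scheme."""

    def authenticate(self, user_input: dict) -> AuthData:
        secret = f'{user_input["username"]}:{user_input["password"]}'
        return AuthData(scheme="password", secret=secret)


def start_transfer(adapter: AuthAdapter, user_input: dict) -> str:
    # The caller depends only on the interface, not on the scheme.
    auth = adapter.authenticate(user_input)
    return f"transfer authorized via {auth.scheme}"
```

Because start_transfer only sees the AuthAdapter interface, swapping one scheme for another requires no change to the code that drives the transfer.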
Task Management
The Task Management Libraries handle background tasks, such as calls between the two relevant Adapters, secure data storage, retry logic, rate limiting, pagination management, failure handling, and individual notifications. DTP has developed a collection of Task Management Libraries as a reference implementation for how to utilize the Adapters to transfer data between two Providers. If preferred, Providers can choose to write their own implementation of the Task Management Libraries that utilize the Data Models and Adapters of DTP.
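Two of the background tasks named above, retry logic and pagination management, can be sketched as follows. This is a simplified illustration under stated assumptions, not DTP's reference implementation: TransientError, with_retries, run_transfer, and the page-token convention are all hypothetical.

```python
import time


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying."""


def with_retries(call, max_attempts=3, base_delay=0.0):
    """Run call(), retrying on TransientError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))


def run_transfer(export_page, import_items, max_attempts=3):
    """Copy every page from an exporter to an importer.

    export_page(token) returns (items, next_token); next_token is None
    on the last page. The first call receives token=None.
    """
    token, total = None, 0
    while True:
        items, token = with_retries(lambda: export_page(token), max_attempts)
        with_retries(lambda: import_items(items), max_attempts)
        total += len(items)
        if token is None:
            return total
```

Keeping this loop outside the Adapters means each Adapter only has to fetch or store one page at a time, while retries and paging are handled in one place.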
DTP is still in very active development. The companies "welcome everyone to participate," saying that "the more expertise and viewpoints we have contributing to the project the more successful it will be."