A Personal Data Vault is a concept that tries to invert the way we store and access data. In short, instead of using applications that manage our data for us, we use applications that connect to our own personal data vaults, which lets us keep the benefits of data ownership, portability, and so on.
The technology behind this is Solid (not to be confused with the SOLID principles in programming), and in this post we’ll explain how a simple survey application can be built with it.
TL;DR
You can try out the demo application yourself! Make sure you have a Solid Pod hosted somewhere, or create a free one on solidcommunity.net or inrupt.net. Next, open the Survey Creator App and create a survey. The data will be stored in your own Pod. Once you make your survey public, you’ll be able to share a link so people can fill it in. In the current demo version, respondents also need a Solid Pod to fill in the survey.
Turtle, do you speak it?
When data is stored in your personal data vault, it’s stored in a specific format (Turtle) that should be understandable to any future application that wants to read it. This is what enables data portability, among other benefits. If you come across another survey application that’s better than our demo application (and supports Solid), you simply continue working on your surveys in the new application, right where you left off. There is no longer any need to ‘export’ and ‘import’ your data via various tools and file formats. It’s already stored and available in your own vault!
If you want to know what this data looks like, here is an example of a survey stored in Turtle format.
Don’t worry if this doesn’t make sense to you. This code is perfectly machine-readable thanks to the use of ontologies, which bring meaning 💡, order 📐 and relationships ◁▶ to various pieces of data, resulting in Linked Data. The main ontology used in the demo survey application is the Survey Ontology, which can also be visualized. Now you can recognize some of the vocabulary used in the piece of code above, like SurveyElement, SingleInputQuestion and SurveyProcedure. By using a specific and publicly available vocabulary like this one, other applications are able to parse and interpret the data that’s stored in your data vault as well.
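To get a feel for how an application might produce this kind of data, here is a minimal Python sketch. It is not the demo application’s code. The namespace and Pod URLs are placeholders, the function names are made up for illustration, and rdfs:label is just an assumed way to attach the texts. Only the class names SurveyProcedure and SingleInputQuestion come from the vocabulary mentioned above. The ontology’s properties for linking questions to a survey are left out because they are not shown in this post.

```python
# Hypothetical placeholder IRIs: swap in the real Survey Ontology namespace
# and your own Pod URL before using this for anything real.
SUR = "https://example.org/survey-ontology#"
POD = "https://alice.example/surveys/lunch#"


def literal(text: str) -> str:
    """Quote a Python string as a Turtle string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def survey_to_turtle(title: str, questions: list[str]) -> str:
    """Serialize a survey title and its question texts as a Turtle document."""
    lines = [
        f"@prefix sur: <{SUR}> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        f"@prefix : <{POD}> .",
        "",
        # The survey itself, typed with a class from the survey vocabulary.
        ":survey a sur:SurveyProcedure ;",
        f"    rdfs:label {literal(title)} .",
    ]
    # One resource per question; rdfs:label is an assumed way to hold the text.
    for number, text in enumerate(questions, start=1):
        lines += [
            "",
            f":question{number} a sur:SingleInputQuestion ;",
            f"    rdfs:label {literal(text)} .",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(survey_to_turtle("Team lunch", ["Which day suits you?"]))
```

Because the output is plain Turtle built on a shared vocabulary, any other Solid application that knows the same ontology could read it back without a custom import step.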
Databases everywhere
How is this going to change IT infrastructure and applications? As you can imagine, if everyone - and everything - owns (multiple) personal data vaults, that’s going to be a lot of databases. Spreading data across many different stores brings a host of new problems that need to be tackled. One of them is analysis of large-scale datasets: if the data is spread over many different data vaults, it becomes increasingly difficult to aggregate it and derive new insights from it. In other words, we still need ways to aggregate data.
In the demo survey application, this is done by a back-end service that collects responses on behalf of the survey creator. Below is a simplified overview of how the parts of the application interact.
Be aware that this is only one solution to the problem. In a survey application like this one, another option would be to aggregate the responses directly in the survey creator’s vault. However, this would require fine-grained access control mechanisms, which will depend heavily on the specific use case.
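The collecting role of such a back-end service could be sketched roughly as follows. This is an illustration under assumptions, not the demo’s implementation: the Response shape, the ResponseCollector class and the rule of keeping only the latest response per respondent are all hypothetical, and reading from or writing to Pods is left out entirely.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Response:
    """One respondent's answers, as a collector might receive them."""
    respondent: str          # WebID of the respondent (assumed identifier)
    answers: dict[str, str]  # question id -> chosen answer


class ResponseCollector:
    """Gathers responses for one survey on behalf of its creator."""

    def __init__(self) -> None:
        self._latest: dict[str, Response] = {}

    def submit(self, response: Response) -> None:
        # Assumed policy: a later submission replaces an earlier one
        # from the same respondent.
        self._latest[response.respondent] = response

    def tally(self) -> dict[str, Counter]:
        """Count how often each answer was given, per question."""
        totals: defaultdict[str, Counter] = defaultdict(Counter)
        for response in self._latest.values():
            for question, answer in response.answers.items():
                totals[question][answer] += 1
        return dict(totals)


if __name__ == "__main__":
    collector = ResponseCollector()
    collector.submit(Response("https://bob.example/profile/card#me", {"q1": "Tuesday"}))
    print(collector.tally())
```

The point of the sketch is the shape of the problem: individual answers can stay with their owners, while the aggregate is computed in one place.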
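As a rough idea of what such access control could involve, the sketch below generates a Web Access Control (WAC) document for a responses container: the survey creator keeps full control, while any logged-in agent may only append new resources. The function name and all URLs are hypothetical, and whether this policy is fine-grained enough is exactly the use-case-dependent question raised here.

```python
def responses_acl(owner_webid: str, container_url: str) -> str:
    """Build a WAC document (Turtle) for a container that collects responses."""
    return (
        "@prefix acl: <http://www.w3.org/ns/auth/acl#> .\n"
        "\n"
        # The survey creator can read, write and manage permissions,
        # both on the container and (via acl:default) on its contents.
        "<#owner> a acl:Authorization ;\n"
        f"    acl:agent <{owner_webid}> ;\n"
        f"    acl:accessTo <{container_url}> ;\n"
        f"    acl:default <{container_url}> ;\n"
        "    acl:mode acl:Read, acl:Write, acl:Control .\n"
        "\n"
        # Any authenticated agent may add a response, but cannot read
        # or change what others have submitted.
        "<#respondents> a acl:Authorization ;\n"
        "    acl:agentClass acl:AuthenticatedAgent ;\n"
        f"    acl:accessTo <{container_url}> ;\n"
        "    acl:mode acl:Append .\n"
    )


if __name__ == "__main__":
    print(responses_acl(
        "https://alice.example/profile/card#me",   # placeholder WebID
        "https://alice.example/surveys/lunch/responses/",  # placeholder container
    ))
```

Restricting respondents to acl:Append is one way to let people submit answers without being able to see each other’s responses. Stricter setups, for example limiting access to an invited group, would need a different authorization.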
Conclusion
As you can see, building applications on top of personal data vaults is already possible today! However, the road is still long, and a lot of technical questions regarding Solid and personal data vaults still need to be tackled. This shouldn’t hold you back from building and experimenting with this technology!