Filtering, Querying & Transforming Data
Data querying, filtering and transformation.
Data Bundling
Because the PDA stores any JSON-formatted data it is given, it was important for us to design suitable mechanisms for data retrieval. Data bundling in the PDA allows for flexible data transformation and filtering when retrieving data:
Picking specific parts (fields) of interest out of all the data available to avoid exposing data that is not required for the specific application. If you visualise data in a table, this would look like vertically slicing the table.
Filtering only the data that is required, based on values of the data stored. Using the table analogy, this would look like horizontally slicing the table.
Interleaving data from different, potentially heterogeneous endpoints – think about location data coming in from a range of different sources, when an application is only concerned with having the most recent longitude and latitude, no matter which application it has come from.
Restructuring the data to the desired JSON format on the fly, for example to unify the structure of data from different endpoints being interleaved or to reformat to something more convenient for the developer.
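The table analogy above can be sketched with plain Python dictionaries. This is only a local illustration of the ideas, not how the PDA implements them; the sample records, their field names and the helpers `pick_fields` and `filter_rows` are hypothetical.

```python
# Hypothetical sample records; each dict is one "row" of the table.
records = [
    {"source": "app-a", "timestamp": 100, "longitude": 0.10, "latitude": 51.67},
    {"source": "app-b", "timestamp": 250, "longitude": 0.08, "latitude": 51.66},
    {"source": "app-a", "timestamp": 175, "longitude": 0.06, "latitude": 51.66},
]


def pick_fields(rows, fields):
    """Vertical slice: keep only the named columns of every row."""
    return [{name: row[name] for name in fields if name in row} for row in rows]


def filter_rows(rows, predicate):
    """Horizontal slice: keep only the rows for which the predicate holds."""
    return [row for row in rows if predicate(row)]


# Rows recorded after timestamp 150, reduced to their coordinates.
recent = filter_rows(records, lambda row: row["timestamp"] > 150)
coordinates = pick_fields(recent, ["longitude", "latitude"])

# Interleaving: the most recent position, whichever application reported it.
latest = max(records, key=lambda row: row["timestamp"])
```

Here `coordinates` no longer exposes `source` or `timestamp`, which mirrors the point about not revealing data an application does not need.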
The first step in the process is to understand Data Combinators.
Data Combinators
The API supports a notion of custom data "combinators", with the key feature being data transformation. It allows for:
remapping JSON data from different streams into structures chosen by the developer, giving consistent structures across unrelated sources
combining data from multiple feeds into a single response stream
ordering of data according to underlying JSON structure fields
filtering of data according to underlying JSON values (including text-based search)
registering a datapoint with a data-mapping specification and
GETing data from the registered endpoint.
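As a rough illustration of the remapping idea, the sketch below resolves dotted JSON paths inside a stored record and places the values under developer-chosen names. The record shape, the `accuracy` field and the helper names `resolve_path` and `remap` are assumptions made for this example; the PDA performs the equivalent work server-side.

```python
def resolve_path(record, path):
    """Follow a dotted path such as 'data.locations.longitude' into nested dicts.

    Returns None when any step of the path is missing.
    """
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def remap(record, mapping):
    """Build a new flat object: target name -> value found at the source path."""
    return {target: resolve_path(record, source) for target, source in mapping.items()}


# Assumed shape of a stored record; "accuracy" is a made-up extra field.
stored = {
    "endpoint": "rumpel/locations",
    "data": {
        "locations": {
            "longitude": "0.101014673709963",
            "latitude": "51.671358277138",
            "accuracy": "10",
        }
    },
}

mapping = {
    "longitude": "data.locations.longitude",
    "latitude": "data.locations.latitude",
}

flattened = remap(stored, mapping)
```

Fields that are not named in the mapping, such as `accuracy` here, do not appear in the remapped object.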
Creating a simple combinator
One of the simplest types of data transformation is remapping the data structure. This can be done by creating a combinator:
Request: POST /api/v2.6/combinator/$COMBINATOR_NAME with header x-auth-token, where $COMBINATOR_NAME is a chosen name for your data combinator. The combinator name can be any valid URL path but must be unique; otherwise the request fails with an error.
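A request of this shape could be prepared from Python roughly as follows. The base URL, combinator name, token value and the `Content-Type` header are placeholders or assumptions rather than values confirmed by the PDA documentation, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Placeholders for illustration only; substitute your own PDA address and token.
PDA_BASE_URL = "https://pda.example.com"
COMBINATOR_NAME = "locations-flat"
AUTH_TOKEN = "REPLACE_WITH_TOKEN"

# Request body: one entry per source endpoint, each with its own mapping.
combinator_body = [
    {
        "endpoint": "rumpel/locations",
        "mapping": {
            "longitude": "data.locations.longitude",
            "latitude": "data.locations.latitude",
        },
    }
]


def build_create_request(base_url, name, token, body):
    """Prepare (but do not send) the POST request that registers a combinator."""
    return urllib.request.Request(
        url=f"{base_url}/api/v2.6/combinator/{name}",
        data=json.dumps(body).encode("utf-8"),
        # Content-Type is an assumption; x-auth-token comes from the text above.
        headers={"Content-Type": "application/json", "x-auth-token": token},
        method="POST",
    )


create_request = build_create_request(
    PDA_BASE_URL, COMBINATOR_NAME, AUTH_TOKEN, combinator_body
)

# Sending is left commented out so the sketch runs without a live PDA:
# with urllib.request.urlopen(create_request) as response:
#     print(response.status)
```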
Here's a simple example that extracts two fields, longitude and latitude, from the rumpel/locations endpoint and unwraps them into a top-level object. A second entry does the same for firstName and lastName from the rumpel/profile endpoint:
[
  {
    "endpoint": "rumpel/locations",
    "mapping": {
      "longitude": "data.locations.longitude",
      "latitude": "data.locations.latitude"
    }
  },
  {
    "endpoint": "rumpel/profile",
    "mapping": {
      "firstName": "data.firstName",
      "lastName": "data.lastName"
    }
  }
]
Fetching data from a Data Combinator
To use the created combinator, send a GET request to /api/v2.6/combinator/$COMBINATOR_NAME with header x-auth-token.
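The fetch step could look like this in Python. The base URL, combinator name and token are placeholders, `build_fetch_request` and `extract_coordinates` are hypothetical helpers, and the parsing assumes each returned record carries a `data` object with `longitude` and `latitude` strings, as in the response shown in this section.

```python
import json
import urllib.request


def build_fetch_request(base_url, name, token):
    """Prepare (but do not send) the GET request for a registered combinator."""
    return urllib.request.Request(
        url=f"{base_url}/api/v2.6/combinator/{name}",
        headers={"x-auth-token": token},
        method="GET",
    )


def extract_coordinates(response_text):
    """Turn a combinator response into (longitude, latitude) float pairs."""
    records = json.loads(response_text)
    return [
        (float(record["data"]["longitude"]), float(record["data"]["latitude"]))
        for record in records
    ]


fetch_request = build_fetch_request(
    "https://pda.example.com", "locations-flat", "REPLACE_WITH_TOKEN"
)

# Stand-in for the body a live PDA would return for this request.
sample_response = """[
  {"endpoint": "rumpel/locations",
   "recordId": "e965e022-6613-476a-a0cd-1f587a41b148",
   "data": {"longitude": "0.101014673709963", "latitude": "51.671358277138"}},
  {"endpoint": "rumpel/locations",
   "recordId": "fcf1a26b-e49f-4457-915b-156e14140f38",
   "data": {"longitude": "0.100905202634514", "latitude": "51.674001392439"}}
]"""

points = extract_coordinates(sample_response)
```

The values arrive as strings, so converting them to floats is left to the client.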
The combinator responds with the same data structure as the plain data APIs: a list of data records, each wrapped with the basic record details and containing the data remapped according to the registered combinator.
[
  {
    "endpoint": "rumpel/locations",
    "recordId": "e965e022-6613-476a-a0cd-1f587a41b148",
    "data": {
      "longitude": "0.101014673709963",
      "latitude": "51.671358277138"
    }
  },
  {
    "endpoint": "rumpel/locations",
    "recordId": "fcf1a26b-e49f-4457-915b-156e14140f38",
    "data": {
      "longitude": "0.100905202634514",
      "latitude": "51.674001392439"
    }
  },
  {
    "endpoint": "rumpel/locations",
    "recordId": "8f7afa92-39e2-48ab-8028-f5aebaa9918e",
    "data": {
      "longitude": "0.080477950927866",
      "latitude": "51.6658257133844"
    }
  },
  {
    "endpoint": "rumpel/locations",
    "recordId": "d3a6f04b-4df6-4888-a7b0-c1d5ca272de9",
    "data": {
      "longitude": "0.0641066288762133",
      "latitude": "51.6641215101037"
    }
  },
  {
    "endpoint": "rumpel/locations",
    "recordId": "6a858d87-899e-4961-b722-0738d07c755e",
    "data": {
      "longitude": "0.0961801595986785",
      "latitude": "51.6712232446779"
    }
  }
]
Data Filtering
The combinator API can also filter data according to the recorded values. As before, the combinator is created by POSTing a request to /api/v2.6/combinator/$COMBINATOR_NAME. For each source of data, however, you may also define one or more filters in addition to the endpoint and the mapping used to remap the data:
[
  {
    "endpoint": "rumpel/locations",
    "filters": [
      {
        "field": "data.locations.timestamp",
        "transformation": {
          "transformation": "datetimeExtract",
          "part": "hour"
        },
        "operator": {
          "operator": "between",
          "lower": 7,
          "upper": 9
        }
      }
    ]
  }
]
The above example extracts the hour part of the location timestamp and filters for records with the hour between 7 and 9. If you add multiple filters, they are combined with a logical AND: a data record has to match all filters to be included in the result. Every filter consists of three fields:
| Parameter | Type | Meaning |
| --- | --- | --- |
| field | String | The JSON path of the field to use for filtering – it can be a simple JSON value, an array or an object. |
| transformation | Transformation Object | Optionally transforms the field in question before applying a filter. You can find the supported transformations below. |
| operator | Operator Object | The filtering Operator. You can find the supported operators below. |
transformation – currently supported transformations:
identity – keep the value as-is; the effect is the same as if transformation was not defined
datetimeExtract with part – extract part of a date from an ISO 8601 formatted date field
timestampExtract with part – extract part of a date from a UNIX timestamp date field
searchable – convert the field to searchable text. Must be used together with the find operator below
operator – different operator types:
in together with the value field – checks if field is in (is contained by) value
contains together with the value field – checks if field contains value
between together with lower and upper values – checks if lower < field < upper
find together with the search field, set to the search string – performs text-based search. Must be used together with the searchable transformation above
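To make the filter semantics concrete, here is a small local emulation of the location filter shown in this section. It is a sketch under stated assumptions, not the PDA's implementation: the sample timestamps are invented, the `part` names are assumed to match Python `datetime` attribute names, and `between` follows the strict `lower < field < upper` form given in the operator list, so behaviour at the exact bounds should be checked against a real PDA.

```python
from datetime import datetime


def resolve_path(record, path):
    """Follow a dotted JSON path such as 'data.locations.timestamp'."""
    current = record
    for key in path.split("."):
        current = current[key]
    return current


def transform(value, spec):
    """Apply the optional transformation; only two kinds are sketched here."""
    if spec is None or spec["transformation"] == "identity":
        return value
    if spec["transformation"] == "datetimeExtract":
        # Assumes ISO 8601 input and a part such as "hour" or "day".
        return getattr(datetime.fromisoformat(value), spec["part"])
    raise ValueError(f"unsupported transformation: {spec['transformation']}")


def check_operator(value, spec):
    """Evaluate one operator object against an already transformed value."""
    kind = spec["operator"]
    if kind == "between":
        return spec["lower"] < value < spec["upper"]
    if kind == "in":
        return value in spec["value"]
    if kind == "contains":
        return spec["value"] in value
    raise ValueError(f"unsupported operator: {kind}")


def matches(record, filters):
    """A record is kept only if every filter accepts it (logical AND)."""
    return all(
        check_operator(
            transform(resolve_path(record, f["field"]), f.get("transformation")),
            f["operator"],
        )
        for f in filters
    )


filters = [
    {
        "field": "data.locations.timestamp",
        "transformation": {"transformation": "datetimeExtract", "part": "hour"},
        "operator": {"operator": "between", "lower": 7, "upper": 9},
    }
]

# Invented records: hours 8, 12 and 9 respectively.
records = [
    {"data": {"locations": {"timestamp": "2017-06-01T08:15:00+00:00"}}},
    {"data": {"locations": {"timestamp": "2017-06-01T12:00:00+00:00"}}},
    {"data": {"locations": {"timestamp": "2017-06-01T09:30:00+00:00"}}},
]

kept = [record for record in records if matches(record, filters)]
```

Under the strict reading, only the 08:15 record is kept; the 09:30 record is excluded because its hour equals the upper bound.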
Together, mappings and filters make data combinators a flexible tool for extracting exactly the data you need. The next step is to layer bundles on top of them so that a wider variety of data can be retrieved in a single request.
Data Bundles
Data Bundles add a thin layer around combinators, useful in 2 ways:
Retrieving data into explicitly named properties from different combinators
Accepting orderBy and limit parameters to control how many data points are returned for a specific bundle property
Profile and location data, the examples covered previously, are clearly very distinct, but an application may still benefit from having both at the same time. For instance, it may only care about the most recent information in the user's profile and their 5 most recent locations. This can be achieved with a POST request to https://postman.hubat.net/api/v2.6/data-bundle/localprofile with header x-auth-token and body:
{
  "profile": {
    "endpoints": [
      {
        "endpoint": "rumpel/profile"
      }
    ],
    "limit": 1
  },
  "location": {
    "endpoints": [
      {
        "endpoint": "rumpel/locations",
        "mapping": {
          "longitude": "data.locations.longitude",
          "latitude": "data.locations.latitude"
        }
      }
    ],
    "limit": 5
  }
}
The response includes the specific data requested:
{
  "profile": [
    {
      "endpoint": "rumpel/profile",
      "recordId": "9b136020-372a-4777-81f9-2c4ce6925aea",
      "data": {
        "profile": {
          "website": {
            "link": "https://example.com",
            "private": "false"
          },
          "nick": {
            "private": "true",
            "name": ""
          },
          "primary_email": {
            "value": "[email protected]",
            "private": "false"
          },
          "private": "false",
          "youtube": {
            "link": "",
            "private": "true"
          },
          "address_global": {
            "city": "London",
            "county": "",
            "country": "UK",
            "private": "true"
          },
          "age": {
            "group": "",
            "private": "true"
          },
          "personal": {
            "first_name": "",
            "private": "false",
            "preferred_name": "Test",
            "last_name": "User",
            "middle_name": "",
            "title": ""
          },
          "blog": {
            "link": "",
            "private": "false"
          },
          "facebook": {
            "link": "",
            "private": "false"
          },
          "address_details": {
            "no": "",
            "street": "",
            "private": "false",
            "postcode": ""
          },
          "emergency_contact": {
            "first_name": "",
            "private": "true",
            "relationship": "",
            "last_name": "",
            "mobile": ""
          },
          "alternative_email": {
            "private": "true",
            "value": ""
          },
          "fb_profile_photo": {
            "private": "false"
          },
          "twitter": {
            "link": "",
            "private": "false"
          },
          "about": {
            "body": "A short bio about me shown on my PHATA",
            "private": "false",
            "title": "Me the Test User"
          },
          "mobile": {
            "no": "",
            "private": "true"
          },
          "gender": {
            "type": "",
            "private": "true"
          }
        }
      }
    }
  ],
  "location": [
    {
      "endpoint": "rumpel/locations",
      "recordId": "e965e022-6613-476a-a0cd-1f587a41b148",
      "data": {
        "longitude": "0.101014673709963",
        "latitude": "51.671358277138"
      }
    },
    {
      "endpoint": "rumpel/locations",
      "recordId": "fcf1a26b-e49f-4457-915b-156e14140f38",
      "data": {
        "longitude": "0.100905202634514",
        "latitude": "51.674001392439"
      }
    },
    {
      "endpoint": "rumpel/locations",
      "recordId": "8f7afa92-39e2-48ab-8028-f5aebaa9918e",
      "data": {
        "longitude": "0.080477950927866",
        "latitude": "51.6658257133844"
      }
    },
    {
      "endpoint": "rumpel/locations",
      "recordId": "d3a6f04b-4df6-4888-a7b0-c1d5ca272de9",
      "data": {
        "longitude": "0.0641066288762133",
        "latitude": "51.6641215101037"
      }
    },
    {
      "endpoint": "rumpel/locations",
      "recordId": "6a858d87-899e-4961-b722-0738d07c755e",
      "data": {
        "longitude": "0.0961801595986785",
        "latitude": "51.6712232446779"
      }
    }
  ]
}
To keep the example simple, it does not include the more complex data combinators covered in the previous section. However, you will notice that the endpoints property has exactly the same format as the body of a request for creating a new combinator.
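Because the endpoints property reuses the combinator body format, a bundle request body can be assembled from existing combinator definitions. The sketch below does this in Python; `bundle_property` is a hypothetical helper, and placing `orderBy` alongside `limit` at the property level is an assumption, since only `limit` appears in the example body.

```python
import json


def bundle_property(endpoints, limit=None, order_by=None):
    """Wrap a combinator-style endpoint list as one named bundle property."""
    prop = {"endpoints": endpoints}
    if order_by is not None:
        # Assumed placement: next to "limit", as a JSON path to sort by.
        prop["orderBy"] = order_by
    if limit is not None:
        prop["limit"] = limit
    return prop


# The same lists could be POSTed on their own to create combinators.
profile_combinator = [{"endpoint": "rumpel/profile"}]
location_combinator = [
    {
        "endpoint": "rumpel/locations",
        "mapping": {
            "longitude": "data.locations.longitude",
            "latitude": "data.locations.latitude",
        },
    }
]

bundle_body = {
    "profile": bundle_property(profile_combinator, limit=1),
    "location": bundle_property(location_combinator, limit=5),
}

# Serialised form, ready to be used as the body of the data-bundle request.
bundle_json = json.dumps(bundle_body, indent=2)
```

Keeping combinator definitions in variables like this lets one definition serve both a standalone combinator and any bundle that needs it.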
Like Data Combinators, Data Bundles can only be used directly by privileged applications such as the personal data dashboard. This leads us to Data Debits for consented data sharing, since Bundles are the format used to specify the data requested from the user.