Developer Portal

Discover how to use the General System Platform to unlock the full value of historic and streaming spatiotemporal data.

Read the docs

  • About

    The General System platform is a cloud-based technology designed for fast and effective processing and analysis of real-time and large-scale spatiotemporal data.

    Find out more
  • Technical overview

    Thread-per-core architecture, vectorised storage, and userspace I/O scheduling combined with the ability to simultaneously ingest, index, and query across multiple dimensions at massive scale.

    Read the whitepaper
  • Try it

    Check out the speed and scale of the solution by running some of your own queries on a synthetic data set of 92.4 billion records.

    Book a test drive
  • User guide

    Learn common workflows using the API. From managing users & datasets to importing and querying data, the User Guide has you covered.

    Read the guide
  • Tutorials

    Learn about GSP through interactive Jupyter Notebooks and the Python SDK. Query basics, advanced querying, and common analysis workflows.

    Jump to tutorials
  • Reference

    HTTP API
    Python SDK

Example

This script queries all records for a single ID around the Gherkin building in London during the first week of January 2022, against a dataset of 4 billion synthetic GPS records.

query_records.py

import os

from dfi import Client
from dfi.models.filters import TimeRange
from dfi.models.filters.geometry import Polygon

token = os.getenv("API_TOKEN")
url = os.getenv("URL")
dataset_id = os.getenv("DATASET_ID")

dfi = Client(token, url)

uids = ["02386d8e-d7ec-433e-b4ab-219b10544d9a"]
time_range = TimeRange().from_strings(
    min_time="2022-01-01T00:00:00Z", max_time="2022-01-08T00:00:00Z"
)
gherkin = {
    "type": "Polygon",
    "coordinates": [
        [
            [-0.08061210118135528, 51.514749367680494],
            [-0.08085849134944374, 51.51455770247413],
            [-0.08075993528229997, 51.51428170316034],
            [-0.08039035002990011, 51.51418203633051],
            [-0.0799591672357135, 51.51428170316034],
            [-0.07982365264318725, 51.51452703596634],
            [-0.08011932084510204, 51.51477236745105],
            [-0.08061210118135528, 51.514749367680494],
        ]
    ],
}
geometry = Polygon().from_geojson(gherkin)

records = dfi.query.records(
    dataset_id=dataset_id,
    uids=uids,
    time_range=time_range,
    geometry=geometry,
)

print(records)
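GeoJSON polygon rings must be closed, i.e. the first and last coordinates must be identical, as they are in the `gherkin` geometry above. A small stand-alone check (pure Python, no SDK calls; the function name is illustrative) can catch a malformed polygon before it is sent in a query:

```python
def rings_closed(polygon: dict) -> bool:
    """Return True if every ring in a GeoJSON Polygon is closed."""
    if polygon.get("type") != "Polygon":
        raise ValueError("expected a GeoJSON Polygon")
    return all(ring[0] == ring[-1] for ring in polygon["coordinates"])

gherkin = {
    "type": "Polygon",
    "coordinates": [
        [
            [-0.08061210118135528, 51.514749367680494],
            [-0.08085849134944374, 51.51455770247413],
            [-0.08011932084510204, 51.51477236745105],
            [-0.08061210118135528, 51.514749367680494],
        ]
    ],
}

print(rings_closed(gherkin))  # → True
```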

This script groups records by ID and counts the records per ID around the Gherkin building in London during the first week of January 2022, against a dataset of 4 billion synthetic GPS records.

unique_id_counts.py

import os
from pprint import pprint

from dfi import Client
from dfi.models.filters import TimeRange
from dfi.models.filters.geometry import Polygon

token = os.getenv("API_TOKEN")
url = os.getenv("URL")
dataset_id = os.getenv("DATASET_ID")

dfi = Client(token, url)

time_range = TimeRange().from_strings(
    min_time="2022-01-01T00:00:00Z", max_time="2022-01-08T00:00:00Z"
)
gherkin = {
    "type": "Polygon",
    "coordinates": [
        [
            [-0.08061210118135528, 51.514749367680494],
            [-0.08085849134944374, 51.51455770247413],
            [-0.08075993528229997, 51.51428170316034],
            [-0.08039035002990011, 51.51418203633051],
            [-0.0799591672357135, 51.51428170316034],
            [-0.07982365264318725, 51.51452703596634],
            [-0.08011932084510204, 51.51477236745105],
            [-0.08061210118135528, 51.514749367680494],
        ]
    ],
}
geometry = Polygon().from_geojson(gherkin)

unique_id_counts = dfi.query.unique_id_counts(
    dataset_id=dataset_id,
    time_range=time_range,
    geometry=geometry,
)

pprint(unique_id_counts)
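Assuming the result comes back as a mapping of ID to record count (the exact return shape depends on the SDK version), the busiest IDs can be surfaced locally with plain Python. The sample values below are for illustration only:

```python
# Hypothetical result shape: {uid: count}; sample values for illustration only.
unique_id_counts = {
    "052b7d5a-f434-4967-8125-506b0db53385": 1204,
    "02386d8e-d7ec-433e-b4ab-219b10544d9a": 873,
    "9b1f0c2e-1111-4aaa-8bbb-cccccccccccc": 42,
}

# Sort IDs by record count, highest first.
top = sorted(unique_id_counts.items(), key=lambda kv: kv[1], reverse=True)
for uid, count in top:
    print(f"{uid}: {count}")
```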

This script counts all records around the Gherkin building in London during the first week of January 2022, against a dataset of 4 billion synthetic GPS records.

count.py

import os
from pprint import pprint

from dfi import Client
from dfi.models.filters import TimeRange
from dfi.models.filters.geometry import Polygon

token = os.getenv("API_TOKEN")
url = os.getenv("URL")
dataset_id = os.getenv("DATASET_ID")

dfi = Client(token, url)

time_range = TimeRange().from_strings(
    min_time="2022-01-01T00:00:00Z", max_time="2022-01-08T00:00:00Z"
)
gherkin = {
    "type": "Polygon",
    "coordinates": [
        [
            [-0.08061210118135528, 51.514749367680494],
            [-0.08085849134944374, 51.51455770247413],
            [-0.08075993528229997, 51.51428170316034],
            [-0.08039035002990011, 51.51418203633051],
            [-0.0799591672357135, 51.51428170316034],
            [-0.07982365264318725, 51.51452703596634],
            [-0.08011932084510204, 51.51477236745105],
            [-0.08061210118135528, 51.514749367680494],
        ]
    ],
}
geometry = Polygon().from_geojson(gherkin)

count = dfi.query.count(
    dataset_id=dataset_id,
    time_range=time_range,
    geometry=geometry,
)

print(f"{count} records")
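For local sanity checks before or after a query, a classic ray-casting test (pure Python; not part of the SDK) can verify whether a coordinate falls inside the polygon used above:

```python
def point_in_ring(lon: float, lat: float, ring: list) -> bool:
    """Ray-casting point-in-polygon test for a single closed GeoJSON ring."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Count crossings of a horizontal ray extending east from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

ring = [
    [-0.08061210118135528, 51.514749367680494],
    [-0.08085849134944374, 51.51455770247413],
    [-0.08075993528229997, 51.51428170316034],
    [-0.08039035002990011, 51.51418203633051],
    [-0.0799591672357135, 51.51428170316034],
    [-0.07982365264318725, 51.51452703596634],
    [-0.08011932084510204, 51.51477236745105],
    [-0.08061210118135528, 51.514749367680494],
]

# A point near the centre of the Gherkin polygon should be inside.
print(point_in_ring(-0.0803, 51.5145, ring))  # → True
```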
