API Documentation

The PANDA application is built on top of a REST API that can be used to power custom applications or import/export data in novel ways.

The PANDA API follows the conventions of Tastypie except in important cases where doing so would create unacceptable limitations. If this documentation seems incomplete, refer to Tastypie’s page on Interacting with the API to become familiar with the common idiom.


You will probably want to try these URLs in your browser. In order to make them work you’ll need to use the format, email, and api_key query string parameters. For example, to authenticate as the default administrative user that comes with PANDA, append the following query string to any url described on this page:
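The query string described above can also be assembled programmatically. The following is a minimal Python sketch, not part of PANDA itself: it assumes the JSON format (`format=json`) and reuses the example email and API key that appear in the curl commands on this page. The helper name `authenticated_url` is purely illustrative.

```python
from urllib.parse import urlencode

# Example credentials copied from the curl commands in this document.
EMAIL = "panda@pandaproject.net"
API_KEY = "edfe6c5ffd1be4d3bf22f69188ac6bc0fc04c84b"


def authenticated_url(url, email=EMAIL, api_key=API_KEY, **params):
    """Append format, email and api_key (plus any extra params) to a URL."""
    query = {"format": "json", "email": email, "api_key": api_key}
    query.update(params)
    # Use "&" if the URL already carries a query string, "?" otherwise.
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(query)


print(authenticated_url("http://localhost:8000/api/1.0/user/"))
```

Note that `urlencode` percent-encodes the `@` in the email address, which is the safe form for a query string.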


Unless otherwise specified, all endpoints that return lists support the limit and offset parameters for pagination. Pagination information is contained in the embedded meta object within the response.
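As an illustration of paging with limit and offset, here is a hedged Python sketch. It assumes that the list itself is returned under an `objects` key and that the meta object carries a `total_count` field, as in Tastypie's standard list responses. `fetch_page` is a hypothetical caller-supplied function standing in for the actual HTTP request.

```python
def iter_objects(fetch_page, limit=20):
    """Yield every object from a paginated list endpoint.

    fetch_page(limit, offset) is a caller-supplied (hypothetical) function
    that performs the request and returns the decoded JSON response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["objects"]
        offset += limit
        # Assumption: meta.total_count holds the total number of results.
        if not page["objects"] or offset >= page["meta"]["total_count"]:
            break


# Stand-in for a real endpoint: 45 fake objects served page by page.
def fake_fetch(limit, offset):
    items = list(range(45))
    return {
        "meta": {"limit": limit, "offset": offset, "total_count": len(items)},
        "objects": items[offset:offset + limit],
    }


print(len(list(iter_objects(fake_fetch, limit=20))))
```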


User objects can be queried to retrieve information about PANDA users. Passwords and API keys are not included in responses.


If accessing the API with normal user credentials you will only be allowed to fetch/list users and to update your own data. Superusers can update any user, as well as delete existing users and create new ones.

Example User object:

{
    date_joined: "2011-11-04T00:00:00",
    email: "panda@pandaproject.net",
    first_name: "Redd",
    id: "1",
    is_active: true,
    last_login: "2011-11-04T00:00:00",
    last_name: "",
    resource_uri: "/api/1.0/user/1/"
}








To create a new user, POST a JSON document containing at least the email property to http://localhost:8000/api/1.0/user/. Other properties such as first_name and last_name may also be set. If a password property is specified it will be set on the new user, but it will not be included in the response. If password is omitted the user will need to set a password before they can log in (not yet implemented).
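A user-creation request might be prepared in Python roughly as follows. This is a sketch under assumptions: it authenticates with the PANDA_EMAIL and PANDA_API_KEY headers that this page documents for curl, and it sends a `Content-Type: application/json` header, which Tastypie-style APIs normally expect. The function name is illustrative, and the request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_create_user_request(new_email, api_email, api_key, **extra):
    """Build (but do not send) a POST request that creates a user."""
    body = {"email": new_email}
    body.update(extra)  # optional: first_name, last_name, password
    return Request(
        "http://localhost:8000/api/1.0/user/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "PANDA_EMAIL": api_email,
            "PANDA_API_KEY": api_key,
        },
        method="POST",
    )


req = build_create_user_request(
    "new.user@example.com",  # hypothetical address for the new user
    "panda@pandaproject.net",
    "edfe6c5ffd1be4d3bf22f69188ac6bc0fc04c84b",
    first_name="New",
)
# To actually send it: urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```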


The Task API allows you to access data about import, export and reindexing processes running on PANDA. This data is read-only.

Example Task object:

{
    end: "2011-12-12T15:11:25",
    id: "1",
    message: "Import complete",
    resource_uri: "/api/1.0/task/1/",
    start: "2011-12-12T15:11:25",
    status: "SUCCESS",
    task_name: "panda.tasks.import.csv",
    traceback: null
}





List filtered by status

List tasks that are PENDING (queued, but have not yet started processing):



Possible task statuses are PENDING, STARTED, SUCCESS, and FAILURE.
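A status filter URL could be built as in the sketch below. It assumes the filter is passed as a `status` query string parameter on the task list endpoint (inferred here as /api/1.0/task/ from the example resource_uri), following Tastypie's field-filtering convention. The helper name is illustrative.

```python
from urllib.parse import urlencode

# The four statuses listed in this document.
TASK_STATUSES = ("PENDING", "STARTED", "SUCCESS", "FAILURE")


def tasks_by_status_url(status, base="http://localhost:8000/api/1.0/task/"):
    """Return a task list URL filtered by one of the known statuses."""
    if status not in TASK_STATUSES:
        raise ValueError("unknown task status: %r" % (status,))
    # Assumption: the filter parameter is named after the field, "status".
    return base + "?" + urlencode({"format": "json", "status": status})


print(tasks_by_status_url("PENDING"))
```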

List filtered by date

List tasks that ended on October 31st, 2011:
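One way to express such a date filter is sketched below. The parameter names `end__year`, `end__month` and `end__day` are an assumption based on the Django-style field lookups that Tastypie filtering generally uses; they are not confirmed by the text on this page.

```python
from datetime import date
from urllib.parse import urlencode


def tasks_ended_on_url(day, base="http://localhost:8000/api/1.0/task/"):
    """Return a task list URL filtered to tasks whose end date is `day`."""
    params = {
        "format": "json",
        # Assumed Django-style lookups on the task's "end" field.
        "end__year": day.year,
        "end__month": day.month,
        "end__day": day.day,
    }
    return base + "?" + urlencode(params)


print(tasks_ended_on_url(date(2011, 10, 31)))
```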




Data Uploads

Due to limitations in upload file-handling, it is not possible to create Uploads via the normal API. Instead, data files should be uploaded to http://localhost:8000/data_upload/ either as form data or as an AJAX request. Examples of how to upload files with curl are at the end of this section.

Example DataUpload object:

    columns: [
    creation_date: "2012-02-08T17:50:09",
    creator: {
        date_joined: "2011-11-04T00:00:00",
        email: "user@pandaproject.net",
        first_name: "User",
        id: "2",
        is_active: true,
        last_login: "2012-02-08T22:45:28",
        last_name: "",
        resource_uri: "/api/1.0/user/2/"
    data_type: "csv",
    dataset: "/api/1.0/dataset/contributors/",
    dialect: {
        delimiter: ",",
        doublequote: false,
        lineterminator: "\r\n",
        quotechar: "\"",
        quoting: 0,
        skipinitialspace: false
    encoding: "utf-8",
    filename: "contributors.csv",
    "guessed_types": ["int", "unicode", "unicode", "unicode"],
    id: "1",
    imported: true,
    original_filename: "contributors.csv",
    resource_uri: "/api/1.0/data_upload/1/",
    sample_data: [
            "Chicago Tribune"
            "Chicago Tribune"
            "The Spokesman-Review"
            "PANDA Project"
    size: 168







Download original file


Upload as form-data

When accessing PANDA via curl, your email and API key can be specified with the headers PANDA_EMAIL and PANDA_API_KEY, respectively:

curl -H "PANDA_EMAIL: panda@pandaproject.net" -H "PANDA_API_KEY: edfe6c5ffd1be4d3bf22f69188ac6bc0fc04c84b" \
-F file=@test.csv http://localhost:8000/data_upload/

Upload via AJAX

curl -H "PANDA_EMAIL: panda@pandaproject.net" -H "PANDA_API_KEY: edfe6c5ffd1be4d3bf22f69188ac6bc0fc04c84b" \
--data-binary @test.csv -H "X-Requested-With:XMLHttpRequest" http://localhost:8000/data_upload/?qqfile=test.csv


With either upload method you may specify the character encoding of the file by passing it as a query string parameter, e.g. ?encoding=latin1.
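For clients that are not using curl, the AJAX-style upload can be approximated in Python. The sketch below mirrors the curl command above (raw file bytes as the body, `qqfile` naming the file, an X-Requested-With header) and optionally adds the encoding parameter. The function name is illustrative, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_ajax_upload_request(filename, content, email, api_key, encoding=None):
    """Build (but do not send) an AJAX-style upload request.

    content is the raw bytes of the file to upload.
    """
    params = {"qqfile": filename}
    if encoding:
        params["encoding"] = encoding
    return Request(
        "http://localhost:8000/data_upload/?" + urlencode(params),
        data=content,
        headers={
            "PANDA_EMAIL": email,
            "PANDA_API_KEY": api_key,
            "X-Requested-With": "XMLHttpRequest",
        },
        method="POST",
    )


upload = build_ajax_upload_request(
    "test.csv",
    b"id,name\n1,Redd\n",
    "panda@pandaproject.net",
    "edfe6c5ffd1be4d3bf22f69188ac6bc0fc04c84b",
    encoding="latin1",
)
print(upload.full_url)
```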


Categories are referenced by slug, rather than by integer id (though they do have one).

Example Category object:

{
    dataset_count: 2,
    id: "1",
    name: "Crime",
    resource_uri: "/api/1.0/category/crime/",
    slug: "crime"
}




When queried as a list, a “fake” category named “Uncategorized” will also be returned. This category includes the count of all Datasets not in any other category. Its slug is uncategorized and its id is 0, but it can only be accessed as a part of the list.





Dataset is the core object in PANDA and by far the most complicated. It contains several embedded objects describing the columns of the dataset, the user that created it, the related uploads, etc. It also contains information about the history of the dataset and whether or not it is currently locked (unable to be modified). Datasets are referenced by slug, rather than by integer id (though they do have one).

Example Dataset object:

    categories: [ ],
    column_schema: [
            indexed: false,
            indexed_name: null,
            max: null,
            min: null,
            name: "first_name",
            type: "unicode"
            indexed: false,
            indexed_name: null,
            max: null,
            min: null,
            name: "last_name",
            type: "unicode"
            indexed: false,
            indexed_name: null,
            max: null,
            min: null,
            name: "employer",
            type: "unicode"
    creation_date: "2012-02-08T17:50:11",
    creator: {
        date_joined: "2011-11-04T00:00:00",
        email: "user@pandaproject.net",
        first_name: "User",
        id: "2",
        is_active: true,
        last_login: "2012-02-08T22:45:28",
        last_name: "",
        resource_uri: "/api/1.0/user/2/"
    current_task: {
        creator: "/api/1.0/user/2/",
        end: "2012-02-08T17:50:12",
        id: "1",
        message: "Import complete",
        resource_uri: "/api/1.0/task/1/",
        start: "2012-02-08T17:50:12",
        status: "SUCCESS",
        task_name: "panda.tasks.import.csv",
        traceback: null
    data_uploads: [
            columns: [
            creation_date: "2012-02-08T17:50:09",
            creator: {
                date_joined: "2011-11-04T00:00:00",
                email: "user@pandaproject.net",
                first_name: "User",
                id: "2",
                is_active: true,
                last_login: "2012-02-08T22:45:28",
                last_name: "",
                resource_uri: "/api/1.0/user/2/"
            data_type: "csv",
            dataset: "/api/1.0/dataset/contributors/",
            dialect: {
                delimiter: ",",
                doublequote: false,
                lineterminator: "\r\n",
                quotechar: "\"",
                quoting: 0,
                skipinitialspace: false
            encoding: "utf-8",
            filename: "contributors.csv",
            id: "1",
            imported: true,
            original_filename: "contributors.csv",
            resource_uri: "/api/1.0/data_upload/1/",
            sample_data: [
                    "Chicago Tribune"
                    "Chicago Tribune"
                    "The Spokesman-Review"
                    "PANDA Project"
            size: 168
    description: "",
    id: "1",
    initial_upload: "/api/1.0/data_upload/1/",
    last_modification: null,
    last_modified: null,
    last_modified_by: null,
    locked: false,
    locked_at: "2012-03-29T14:28:02",
    name: "contributors",
    related_uploads: [ ],
    resource_uri: "/api/1.0/dataset/contributors/",
    row_count: 4,
    sample_data: [
            "Chicago Tribune"
            "Chicago Tribune"
            "The Spokesman-Review"
            "PANDA Project"
    slug: "contributors"





List filtered by category


List filtered by user

A shortcut is provided for listing datasets created by a specific user. Simply pass the creator_email parameter. Note that this parameter cannot be combined with a search query or other filter.


Search for datasets

The Dataset list endpoint also provides full-text search over datasets’ metadata via the q parameter.


By default, search results are complete Dataset objects. However, it’s frequently useful to return simplified objects for rendering lists, etc. These simple objects do not contain the embedded task object, upload objects or sample data. To return simplified objects, add simple=true to the query.
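The list parameters described in this section can be combined into URLs as in the sketch below. It uses the dataset list endpoint /api/1.0/dataset/ on the localhost:8000 host used elsewhere on this page; the function names are illustrative. The creator_email helper deliberately takes no other filters, since that parameter cannot be combined with a search.

```python
from urllib.parse import urlencode

DATASET_LIST = "http://localhost:8000/api/1.0/dataset/"


def dataset_search_url(query, simple=False):
    """Return a URL for full-text search over dataset metadata."""
    params = {"format": "json", "q": query}
    if simple:
        # Ask for simplified objects (no task, uploads or sample data).
        params["simple"] = "true"
    return DATASET_LIST + "?" + urlencode(params)


def datasets_by_creator_url(creator_email):
    """Return a URL listing datasets created by one user (no other filters)."""
    params = {"format": "json", "creator_email": creator_email}
    return DATASET_LIST + "?" + urlencode(params)


print(dataset_search_url("crime stats", simple=True))
print(datasets_by_creator_url("user@pandaproject.net"))
```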





To create a new Dataset, POST a JSON document containing at least a name property to /api/1.0/dataset/. Other properties such as description may also be included.

If data has already been uploaded for this dataset, you may also specify the data_upload property as either an embedded Upload object, or a URI to an existing DataUpload (for example, /api/1.0/data_upload/17/).

If you are creating a Dataset specifically to be updated via the API you will want to specify columns at creation time. You can do this by providing a columns query string parameter containing a comma-separated list of column names, such as ?columns=foo,bar,baz. You may also specify a column_types parameter containing a comma-separated list of types for the columns, such as column_types=int,unicode,bool. Lastly, if you want PANDA to automatically index typed columns for data added to this dataset, you can pass a typed_columns parameter indicating which columns should be indexed, such as typed_columns=true,false,true.
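To show how the three column parameters line up, here is a hedged Python sketch that builds a dataset-creation request from a list of (name, type, indexed) tuples. The tuple layout and function name are illustrative choices, the JSON content type and header-based authentication are assumptions, and the request is built but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_create_dataset_request(name, columns, email, api_key, description=""):
    """Build (but do not send) a POST that creates a dataset with columns.

    columns is a list of (column_name, column_type, indexed) tuples.
    """
    params = {
        "columns": ",".join(c[0] for c in columns),
        "column_types": ",".join(c[1] for c in columns),
        "typed_columns": ",".join("true" if c[2] else "false" for c in columns),
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    url = "http://localhost:8000/api/1.0/dataset/?" + urlencode(params, safe=",")
    body = {"name": name, "description": description}
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "PANDA_EMAIL": email,
            "PANDA_API_KEY": api_key,
        },
        method="POST",
    )


create = build_create_dataset_request(
    "Test",
    [("foo", "int", True), ("bar", "unicode", False), ("baz", "bool", True)],
    "panda@pandaproject.net",
    "edfe6c5ffd1be4d3bf22f69188ac6bc0fc04c84b",
)
print(create.full_url)
```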


Begin an import task. Any data previously imported for this dataset will be lost. Returns the original dataset, which will include the id of the new import task:



Exporting a dataset is an asynchronous operation. To initiate an export you simply need to make a GET request. The requesting user will be emailed when the export is complete:



Reindexing allows you to add (or remove) typed columns from the dataset. You initiate a reindex with a GET request and can supply column_types and typed_columns fields in the same format as documented above in the section on creating a Dataset.



Data objects are referenced by a unicode external_id property, specified at the time they are created. This property must be unique within a given Dataset, but does not need to be unique globally. Data objects are accessible at per-dataset endpoints (e.g. /api/1.0/dataset/[slug]/data/). There is also a cross-dataset Data search endpoint at /api/1.0/data/; however, this endpoint can only be used for search, not for create, update, or delete. (See below for more.)


The external_id property of a Data object is the only way it can be accessed through the API. In order to work with Data via the API you must include this property at the time you create it. By default this property is null and the Data can not be accessed except via search.

An example Data object with an external_id:

    "data": [
        "Chicago Tribune"
    "dataset": "/api/1.0/dataset/contributors/",
    "external_id": "1",
    "resource_uri": "/api/1.0/dataset/contributors/data/1/"

An example Data object without an external_id. Note that it also has no resource_uri:

    "data": [
        "Chicago Tribune"
    "dataset": "/api/1.0/dataset/contributors/",
    "external_id": null,
    "resource_uri": null


You can not add, update or delete data in a locked dataset. An error will be returned if you attempt to do so.


There is no schema endpoint for Data.


When listing data, PANDA will return a simplified Dataset object with an embedded meta object and an embedded objects array containing Data objects. The added Dataset metadata is purely for convenience when building user interfaces.



To fetch a single Data from a given Dataset:


Create and update

Because Data is stored in Solr (rather than a SQL database), there is no functional difference between Create and Update. In either case any Data with the same external_id will be overwritten when the new Data is created. Because of this, requests may be either POSTed to the list endpoint or PUT to the detail endpoint.

An example POST:

{
    "data": [
        "column A value",
        "column B value",
        "column C value"
    ],
    "external_id": "id_value"
}

This object would be POSTed to the per-dataset list endpoint (e.g. /api/1.0/dataset/[slug]/data/).


An example PUT:

{
    "data": [
        "new column A value",
        "new column B value",
        "new column C value"
    ]
}

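This object would be PUT to the detail endpoint for its external_id, which is the object's resource_uri (such as /api/1.0/dataset/contributors/data/1/ in the example Data object above).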
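A PUT to a detail endpoint can be sketched in Python as follows. The URL layout is taken from the per-dataset endpoint and the example resource_uri shown earlier in this section; the JSON content type, header-based authentication and function name are assumptions for illustration. The request is built but not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_put_data_request(slug, external_id, values, email, api_key):
    """Build (but do not send) a PUT that creates or overwrites one Data row."""
    url = "http://localhost:8000/api/1.0/dataset/%s/data/%s/" % (
        quote(slug, safe=""),
        quote(str(external_id), safe=""),
    )
    return Request(
        url,
        # The external_id comes from the URL, so the body carries only data.
        data=json.dumps({"data": list(values)}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "PANDA_EMAIL": email,
            "PANDA_API_KEY": api_key,
        },
        method="PUT",
    )


put = build_put_data_request(
    "contributors",
    "1",
    ["new column A value", "new column B value", "new column C value"],
    "panda@pandaproject.net",
    "edfe6c5ffd1be4d3bf22f69188ac6bc0fc04c84b",
)
print(put.get_method(), put.full_url)
```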


Bulk create and update

To create or update objects in bulk you may PUT an array of objects to the list endpoint. Any object with a matching external_id will be deleted and then new objects will be created. The body of the request should be formatted like:

{
    "objects": [
        {
            "data": [
                "column A value",
                "column B value",
                "column C value"
            ],
            "external_id": "1"
        },
        {
            "data": [
                "column A value",
                "column B value",
                "column C value"
            ],
            "external_id": "2"
        }
    ]
}

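The bulk request body can be generated from in-memory rows. The sketch below is illustrative only: `bulk_payload` is a hypothetical helper that turns a mapping of external_id to column values into the structure described above. The resulting JSON would then be PUT to the per-dataset list endpoint.

```python
import json


def bulk_payload(rows):
    """Return the bulk request body for a mapping of external_id -> values."""
    return {
        "objects": [
            {"data": list(values), "external_id": str(external_id)}
            for external_id, values in rows.items()
        ]
    }


payload = bulk_payload({
    "1": ["column A value", "column B value", "column C value"],
    "2": ["column A value", "column B value", "column C value"],
})
print(json.dumps(payload, indent=4))
```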

To delete an object send a DELETE request to its detail url. The body of the request should be empty.

Delete all data from a dataset

In addition to deleting individual objects, it is possible to delete all objects within a dataset by sending a DELETE request to the root per-dataset data endpoint. The body of the request should be empty.
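Both kinds of delete can be sketched with one hypothetical Python helper: with an external_id it targets a single object's detail URL, and without one it targets the root per-dataset data endpoint. Header-based authentication is assumed, as in the curl examples, and the request is built but not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_delete_request(slug, email, api_key, external_id=None):
    """Build (but do not send) a DELETE for one object or a whole dataset's data."""
    url = "http://localhost:8000/api/1.0/dataset/%s/data/" % quote(slug, safe="")
    if external_id is not None:
        # Detail URL: delete only the object with this external_id.
        url += quote(str(external_id), safe="") + "/"
    # No data argument is passed, so the request body stays empty.
    return Request(
        url,
        headers={"PANDA_EMAIL": email, "PANDA_API_KEY": api_key},
        method="DELETE",
    )


one = build_delete_request("contributors", "panda@pandaproject.net", "key", "1")
everything = build_delete_request("contributors", "panda@pandaproject.net", "key")
print(one.full_url)
print(everything.full_url)
```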
