Technical

How to download the Australian BioNet Database

Did you know that there is a nest of endangered long-nosed bandicoots living just beside the popular Manly Beach in Sydney, Australia? Well, I didn’t, until I looked at BioNet. The NSW government in Australia created BioNet as a database of all flora and fauna species sightings in NSW. It’s absolutely fantastic. If you’re an architect and want to see how you might impact the urban ecosystem in NSW, look at BioNet. If you’re an ecologist of some kind, you probably already use it. If you’re just a good citizen who wants to remodel your backyard to improve urban ecology, BioNet is there for you.

Fortunately, BioNet comes with an online search system called Atlas. It’s simple to use, but unfortunately it limits the data it produces: it won’t show you all the fields associated with a species, won’t show meta fields, and caps the number of records returned. Thankfully, BioNet also comes with an API that can be queried programmatically. I’ve written a bit of Python which will allow you to download regions of data; but before we get to that, let’s see a graphic!

Sydney BioNet species map

I’ve plotted every species in the database close to Sydney in the map above. Marker size scales logarithmically with the number of species sighted. I haven’t done any real filtering beyond this, so it’s not very meaningful, but it shows the data and shows that it can be geolocated. It also looks like someone murdered the country, but I’ll save the interesting visualisations for a future post.
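For the curious, here’s roughly how a map like this can be drawn once you have the final bionet.csv (generated by the scripts below). This is only a sketch, assuming pandas and matplotlib are installed; the column names are the API fields we query later:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.read_csv('bionet.csv')

# Bin the sightings into a coarse grid and count distinct species per bin
df['lat'] = df['decimalLatitude'].round(2)
df['lon'] = df['decimalLongitude'].round(2)
species = df.groupby(['lon', 'lat'])['scientificName'].nunique()

# Marker size scales logarithmically with the species count
plt.scatter(species.index.get_level_values('lon'),
            species.index.get_level_values('lat'),
            s=np.log(species + 1) * 20, alpha=0.5)
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.show()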

The Python code works in two parts. The first queries the API for JSON results, dividing the region between a top-left and a bottom-right latitude/longitude coordinate into square tiles. This’ll give you a bunch of *.json files in the current working directory. Edit the coordinates and resolution as necessary, and off you go. I’ve put in a series of fields that should be good for more general uses, but you can check the BioNet Data API for all available fields.

import os

# Top-left and bottom-right corners of the region to download
start = (-33.408554, 150.326152)
end = (-34.207799, 151.408916)

# Fields to request; check the BioNet Data API for the full list
fields = ('kingdom,catalogNumber,basisOfRecord,dcterms_bibliographicCitation,'
          'dataGeneralizations,informationWithheld,dcterms_modified,'
          'dcterms_available,dcterms_rightsHolder,IBRASubregion,'
          'scientificName,vernacularName,countryConservation,'
          'stateConservation,protectedInNSW,sensitivityClass,eventDate,'
          'individualCount,observationType,status,'
          'coordinateUncertaintyInMeters,decimalLatitude,decimalLongitude,'
          'geodeticDatum')

def create_url(lat, lon, lat_next, lon_next):
    # Build an OData query selecting our fields and filtering to one tile
    return ('https://data.bionet.nsw.gov.au/biosvcapp/odata/'
            'SpeciesSightings_CoreData?$select=' + fields +
            '&$filter=((decimalLongitude ge ' + str(lon) + ') and '
            '(decimalLongitude le ' + str(lon_next) + ')) and '
            '((decimalLatitude le ' + str(lat) + ') and '
            '(decimalLatitude ge ' + str(lat_next) + '))')

lat = start[0]
lon = start[1]
i = 0
resolution = 0.05  # tile size in degrees

# Sweep south from the top-left corner, one row of tiles at a time
while lat > end[0]:
    while lon < end[1]:
        lat_next = round(lat - resolution, 6)
        lon_next = round(lon + resolution, 6)
        # Percent-encode the spaces and quotes in the OData query
        url = create_url(lat, lon, lat_next, lon_next).replace(' ', '%20').replace('\'', '%27')
        # The cookie below came from my browser session; you may need a fresh one
        os.system('curl \'' + url + "\' -H 'Host: data.bionet.nsw.gov.au' -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Cookie: NSC_EBUB_CJPOFU_443_mcwjq=ffffffff8efb154f45525d5f4f58455e445a4a423660' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' -H 'Cache-Control: max-age=0' > " + str(i) + '.json')
        i += 1
        lon = lon_next
    lon = start[1]  # back to the western edge for the next row
    lat = round(lat - resolution, 6)
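If you’d rather stay inside Python than shell out to curl, the same tile can be fetched with the requests library. This is just a sketch, assuming requests is installed and reusing create_url from above; the session cookie in my curl command may well be unnecessary for anonymous queries:

import requests

# Fetch one tile with requests instead of curl; requests percent-encodes
# the URL and decompresses the response by itself
def download_tile(lat, lon, lat_next, lon_next, filename):
    response = requests.get(create_url(lat, lon, lat_next, lon_next),
                            headers={'Accept': 'application/json'})
    response.raise_for_status()
    with open(filename, 'w') as out:
        out.write(response.text)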

Now we’ll run another little script which converts all the JSON files in the directory into a single CSV file. You can read this CSV file in programs like Excel or QGIS for further analysis.

import unicodecsv as csv  # third-party package: pip install unicodecsv
import glob
import json

f = csv.writer(open('bionet.csv', 'wb+'), encoding='utf-8')

# Count the tile files generated by the previous script (0.json, 1.json, ...)
number_of_json_files = len(glob.glob('*.json'))

f.writerow([
    'IBRASubregion',
    'basisOfRecord',
    'catalogNumber',
    'coordinateUncertaintyInMeters',
    'countryConservation',
    'dataGeneralizations',
    'dcterms_available',
    'dcterms_bibliographicCitation',
    'dcterms_modified',
    'dcterms_rightsHolder',
    'decimalLatitude',
    'decimalLongitude',
    'eventDate',
    'geodeticDatum',
    'individualCount',
    'informationWithheld',
    'observationType',
    'protectedInNSW',
    'scientificName',
    'sensitivityClass',
    'stateConservation',
    'status',
    'kingdom',
    'vernacularName',
    ])
# Read each tile file in turn and write one CSV row per record
i = 0
while i < number_of_json_files:
    data = json.load(open(str(i) + '.json'))
    print(i)
    for x in data['value']:
        f.writerow([
            x['IBRASubregion'],
            x['basisOfRecord'],
            x['catalogNumber'],
            x['coordinateUncertaintyInMeters'],
            x['countryConservation'],
            x['dataGeneralizations'],
            x['dcterms_available'],
            x['dcterms_bibliographicCitation'],
            x['dcterms_modified'],
            x['dcterms_rightsHolder'],
            x['decimalLatitude'],
            x['decimalLongitude'],
            x['eventDate'],
            x['geodeticDatum'],
            x['individualCount'],
            x['informationWithheld'],
            x['observationType'],
            x['protectedInNSW'],
            x['scientificName'],
            x['sensitivityClass'],
            x['stateConservation'],
            x['status'],
            x['kingdom'],
            x['vernacularName'],
            ])
    i += 1

That’s it! Have fun, and don’t forget to check for frogs in your backyard. If you don’t have any, build a pond. Or at least a water bath for the birds.

Technical

Building REST APIs with auto-discoverable auto-tested code

For the past few months, one of the projects I’ve been working on with SevenStrokes involves building a REST API for a service. REST APIs are tricky things to get right: they’re deceptively simple to describe, yet play host to plenty of interesting topics to delve into, such as statelessness, resource scope, authentication, and hypermedia representation.

However, I’m only going to talk about the very basics (which many people overlook), and demonstrate how the Richardson Maturity Model can help with automated testing and documentation. If you haven’t heard of RMM yet, I recommend you stop reading and go through it now (especially if you’ve built a REST-like API before).

Let’s say our REST API conforms to level 3 of the RMM: we have a set of standardised verbs, querying logical resources, receiving standardised status codes, and we can navigate the entire system via links. We’ve got a pretty good setup so far. All these items in the RMM help our REST API scale better. However, what it doesn’t yet help with is keeping our documentation up to date. This is vital, because we know that the holy grail for a REST API is auto-generated, always up-to-date, stylish documentation that promotes your site/product API. There are a bunch of tools that help you do this right now, but I think they’re all rather half-baked and used as a bolt-on rather than a core part of your application.

To solve this, I’m going to recommend one more addition: every resource must have the OPTIONS verb implemented. When invoked, it will respond with the following:

  1. An Allow header, specifying all the other verbs available on the invoked resource.
  2. A response body containing the verbs and, nested under each verb (in whatever format), a description of:
    • Their input parameters, including their type and whether they are required
    • A list of example requests and responses, detailing which headers, parameters and body are included in the request, and which headers, status code and body are included in the response.
  3. A list of assumptions that are being made for each example scenario (if applicable)
  4. A list of effects on the system for each example scenario (if applicable)
  5. A list of links to any subresources with descriptions

Let’s see a brief example:

# OPTIONS /user/

{
    "GET": {
        "title": "Get information about your user",
        "parameters": {
            "foobar": {
                "title": "A description of what foobar does",
                "type": "string",
                "required": false
            },
            [ ... snip ... ]
        },
        "examples": [
            {
                "title": "View profile information successfully",
                "request": { "headers": { "Authentication": "{usersignature}" } },
                "response": {
                    "status": 200,
                    "data": {
                        "id": "1",
                        "username": "username1",
                        [ ... snip ... ]
                    }
                }
            },
            [ ... snip ... ]
        ]
    },
    [ ... snip ... ]
    "_links": {
        "self": {
            "href": "\/makkoto-api\/user"
        },
        [ ... snip ... ]
    }
}

Sound familiar? That’s right. It’s documentation. Better than that, it’s embedded documentation. Oh, and better still, it’s auto-discoverable documentation. And if that isn’t great enough, it’s documentation identical to the format of requests and responses that API clients will be working with.

Sure, it’s pretty nifty. But that’s not all! Let’s combine this with TDD/BDD. I’ve written a quick test here:

Feature: Discover
    In order to learn how the REST API works
    As an automated, standards-based REST API client
    I can auto-discover and auto-generate tests for the API

    Scenario: Generate all tests
        Given that I have purged all previously generated tests
        Then I can generate all API tests

That’s right. This test crawls the entire REST API resource tree (starting at the top-level resource, of course), invokes OPTIONS for each resource, and generates tests based on the documentation that you’ve written.
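In our case the crawler is written in Behat, but the idea is tool-agnostic. Here’s a rough Python sketch of the discovery step, where the base URL and the generate_tests helper are hypothetical stand-ins:

import requests
from urllib.parse import urljoin

BASE_URL = 'https://example.com/makkoto-api/'  # hypothetical

# Walk the HAL _links tree, invoke OPTIONS on every resource, and feed the
# embedded documentation to whatever test generator you use
def crawl(url, seen=None):
    seen = set() if seen is None else seen
    if url in seen:
        return
    seen.add(url)
    docs = requests.options(url).json()
    generate_tests(url, docs)  # hypothetical: one test per documented example
    for link in docs.get('_links', {}).values():  # assumes single links
        crawl(urljoin(BASE_URL, link['href']), seen)

crawl(BASE_URL)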

Let’s see a quick demo in action.

Auto-documentation for REST APIs in action

It’s a really great workflow: write documentation first, generate tests from it, and then zero in on your tests in detail. This ensures that your code, tests and documentation are always in sync.

I hope someone finds this useful :) For the curious, the testing tool is Behat, and the output format is application/hal+json, using the HAL specification for linking, along with URI templates for links.
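As a footnote to that last point, here’s what a templated HAL link can look like alongside the plain self link from the example above (the "find" relation is a made-up illustration, not part of the real API):

{
    "_links": {
        "self": { "href": "/makkoto-api/user" },
        "find": { "href": "/makkoto-api/user{?username}", "templated": true }
    }
}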

Life & much, much more

A Beaglebone, a Blender, a Board, and a Swarm.

Hardware isn’t generally my thing. When it comes to software, I like to break and create, but in my opinion, hardware should just work. That’s another story altogether, but it does explain my apprehension when I greeted the UPS guy delivering a BeagleBone Black one morning.

beagleboneblack

Let’s begin with the BBB. It’s a computer the size of a credit card, which isn’t that impressive once you realise that your phone is a computer too. I find the best way to explain it is in terms of two other products, the Arduino and the Raspberry Pi. The Arduino is a similarly sized controller (though it comes in multiple sizes) to which you can upload scripts, plug in a hardware circuit (wires and a lightbulb, that sort of thing), and have it control the circuit. Despite its power in hardware control, it only has a small scripting interface for you to do your programming. The Raspberry Pi is the opposite: it’s a full Linux computer (based off Debian), but does not have proper hardware controls out of the box. The BBB provides the best of both worlds: a full Linux system (Angstrom Linux, though of course you can flash your own), and a ridiculous number of IO pins to control circuits. All this awesome power for 45 USD.

The next step upon receiving this wonderboard was obvious: let’s build a swarm of robots. Together with two university friends, Lawrence Huang and Gloria Nam, I set out planning the system.

world

The base was to be constructed out of a 1200x1200mm plywood board, cut into a circle with a hole in the middle. This would be the “world” the robot swarm would live on. This world would operate like a Lazy Susan, and would have two depots filled with some sort of resource: one at the center, and one at the perimeter. This gave the colony a purpose: it would need to collect resources. Above the board was where we would put the computer, BBB, power supply, and cables to hook up to all the bots below.

We then had to determine the behavior and movement capabilities of the swarm. It had to act as one while its members remained separate entities. It also had to disperse to discover where the rotated resource depots were, and the swarm as a whole had a set of goals and quota limitations. Five movement types (along with the math) were worked out to allow the bots smooth and flexible movement across the terrain.

rules

The overmind was next. We would use Blender's very flexible boid simulator, along with custom Python scripts built on Blender's Python API, to simulate the swarm behavior on the computer and set swarm goals. At the same time, a real-time top-down view could be generated and displayed. For budget reasons, we couldn’t build the entire swarm of robots, and instead settled on building just one bot, which would track the motions of a single simulated bot while still behaving as part of the full 32-robot swarm on the screen. Viewers could then watch the full swarm behavior on the screen, and physically see a single bot's behavior in front of them.

swarmscreenshot

The car itself was then built. It was relatively small, and barely big enough to fit the two continuous-rotation servo motors that were required to power its left and right treads. It had a little tank on its top to hold resources, a depositing mechanism at its front, and dragged along a massive conveyor belt to collect resources behind it.

car

Now the fun part: calibrating the simulated swarm against the actual physical swarm behavior, and building all the physical PWM circuits. Many sleepless nights later, it was a success. Here we see the bot doing a weird parking job into the depot, collecting resources, going back to the center, and depositing them. Apologies for the lack of video.

collect

And there we have it. A swarm of robots. Did it curb my fear of hardware? Not entirely.

frontshot

For those interested in the actual system, here’s a macro overview:

system

A few extra fun things from the project:

  • Calibration was not easy. Actually, it was very hard. No, it was stupidly hard. It was ridiculously hard. Real life has so many uncertainties.
  • Each bot is tethered to the overmind via 8 wires (3 per tread, 2 for conveyor belt). Could it be made into a wireless swarm? Yes. Did we have the money? No.
  • Could it be modified to move in 3D XYZ space like a swarm of helicopters? Yes. Would I do the math for it? No.
  • The actual simulation was done on the computer via Blender + custom Python scripts. The computer was then connected to the BBB via a persistent master SSH connection, which was reused to send simple signals to the board's pin controllers (see the sketch after this list). So all in all, the BBB didn’t actually do much work. It was just a software->hardware adapter.
  • Because the computer was doing all the work, it wasn’t hard to add network hooks. This meant we could actually control the system via our phones (which we did).
  • Weirdest bug? When (and only when) we connected the computer to the university wifi, flicking a switch 10 meters away in a completely separate circuit (seriously, completely separate) would cause the BBB to die. Still completely confused and will accept any explanation.
  • Timeframe for the project? 4 weeks along with other obligations.
  • Prior hardware and circuit experience: none. Well. Hooking up a lightbulb to a battery. Or something like that.
  • Casualties included at least three bot prototypes, a motor, and at least 50 Styrofoam rabbits (don’t ask).
  • Why are all these diagrams on weird old paper backgrounds? Why not?
  • At the time of the project, the BBB was less than a month old. This meant practically no documentation, and a lack of coherent support in their IRC channels. As expected, this was hardly a good thing.
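For the curious, the software-to-hardware hop mentioned in the list above looked roughly like this. It’s a sketch only: the pin number and socket path are made up, and I’m assuming the sysfs GPIO interface the board exposed at the time:

import subprocess

# Opened once beforehand: ssh -M -S /tmp/bbb.sock -fN root@beaglebone.local

# Reuse the persistent master SSH connection and poke a value into one of
# the BBB's sysfs GPIO files
def set_pin(pin, value):
    subprocess.call([
        'ssh', '-S', '/tmp/bbb.sock', 'root@beaglebone.local',
        'echo {} > /sys/class/gpio/gpio{}/value'.format(value, pin),
    ])

set_pin(60, 1)  # e.g. drive one of the tread control lines high

Reusing the one master connection is the whole reason the SSH link was kept persistent: each signal is a quick command over an already-open channel rather than a fresh handshake.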

Project success. I hope you enjoyed it too :)

Uncategorized

WIPUP 22.04.11b released!

WIPUP is a way for you to share your long-term projects and discover the passions of others.

Easter has started, and lots of interesting things are cropping up here and there – one of which is that WIPUP has seen a much-needed update. The last time this happened was way back in November, which is a stunning 5 months ago (yes, that’s almost half a year – doesn’t time fly?).

(Yes, it’s such a cliched and overdone splash screen – click it to read the release notes)

This release, unfortunately, isn’t a big one either. No new features were added at all; instead, it consists simply of visual polishing here and there to make it a more pleasant system to use and look at.

The reason for such a minor release after all this time is that WIPUP is maturing. WIPUP is aimed at a rather niche group: people who, firstly, are working on a moderate-to-long-term project. That already cuts out the average Joe on the street. Then, that project must be something they are able to, and want to, share. That cuts out the majority of company-funded or commercial projects, as well as every person who is uncomfortable with showing work they think is “bad” or “incomplete”. WIPUP continues to slice away at the market by aiming at those who are comfortable with using a third-party system to host their work rather than their own setup, even though WIPUP is open-source and has an API.

For this niche, it satisfies all of its needs.

A niche whose target audience is (rather selfishly) myself.

Yes, you read that right: WIPUP was created for myself. If other people find it useful, then that’s great for them too. But all in all, I created this tool because I needed it. The idea for WIPUP was born of my desire to document the ThoughtScore project – my pet movie – in a saner way than an increasingly large thread on the BlenderArtists forums. Has it succeeded? Yes. Is it still in use for that? Yes. I also use it to document my work on the KDE.org redesign, and on my localhost to organise the scraps of work I produce for my architecture course, which will eventually be compiled into my portfolio.

What is my ambition?

Despite its selfish beginnings, there is a reason WIPUP was made open-source and then given an Open Collaboration Services API. It’s because I have an ambition for WIPUP: I want it to be used by the end-users of open-source projects.

People are fascinating. The people who indulge in open-source are even more fascinating, because it takes real passion about a cause like the open-source movement to turn it into your computing life, which is a large element of our lives nowadays. On top of that, many of you are working on really interesting projects on the side – learning a language, writing a book, composing a song, making a movie. I want WIPUP to exhibit the weird and wonderful of your creations – to emphasise and expose open-source’s greatest strength: the community. When I threw myself into the wacky world of open-source, I discovered a goldmine of knowledge and passion. I want everybody to realise that too – and be proud of it.

What is your ambition?

Technical

Walkthrough of a CSS3 website design slice.

Slicing is a sign of a terrible golfer. Slicing is also the process of cutting up an image design into smaller images and writing markup code to turn it into a living, breathing website. I recently got a request from a friend to slice their portfolio website. Here is the original design he sent to me (and dumped on WIPUP as well).

It is a fixed-width, fixed-height website design. Technically speaking, it’s a rather simple design. Most website frontend coders would just launch right into slicing, but this time I wanted to have some fun. I wanted the freedom that any slicer and designer yearns for – perfect separation between presentation and content, and complete disregard for browser compatibility.

Yes, if you haven’t already guessed, I built this site with CSS3. The only images I used in the end were the green background image, and the splash screen background image (oh, and the leaf icons for the navigation, but those don’t really count).

Most of the layout was straightforward, using things like the new border-radius and box-shadow properties. However, the lump in the navigation bar posed some complications. In the end I was able to recreate it using a three-layered solution (via the z-index property). The first layer held the navigation strip with its shadow effects. The second layer (above the first) created the lump’s shape and shadow. A third layer mimicked the second, except with a slightly decreased width, a slight offset at the top, and a shadow of the same colour as the background to create a “fading” effect for the shadow on the sides. With position: relative and some offsetting to place them, I managed to recreate the effect pretty darn well, if I might say so myself.
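Here’s a rough reconstruction of the three-layer idea. The values and selectors are illustrative only, not the exact measurements or colours from the real site:

/* Layer 1: the navigation strip and its shadow */
.nav {
    position: relative;
    z-index: 1;
    box-shadow: 0 2px 6px #333;
}

/* Layer 2: the lump's shape and shadow, poking above the strip */
.nav .lump {
    position: relative;
    z-index: 2;
    top: -12px;
    border-radius: 12px 12px 0 0;
    box-shadow: 0 -2px 6px #333;
}

/* Layer 3: a slightly narrower copy, offset at the top, whose shadow matches
   the background colour to "fade out" the lump's shadow at the sides */
.nav .lump-mask {
    position: relative;
    z-index: 3;
    top: -10px;
    width: 96%;
    box-shadow: 0 0 6px #dfe8d0;  /* same colour as the page background */
}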

Finally, I used Google’s Font API to choose a more appropriate font, applied text-shadows (with a different colour in my a:hover rules to create a nice glow effect), and stuck it up online for my friend to see. Here’s the result (output via the Gecko renderer):

This multi-tab bar is a common web design element, so this trick might help other CSS3-yearning developers. Here’s the code for those who are interested. The design works in Firefox, Opera, and Safari. Chrome does not render rounded shadows correctly, but otherwise works fine. It fails in IE8 and below. I haven’t tested IE9.
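As a final footnote, the hover glow mentioned above boils down to very little code. The font and colours here are stand-ins, not the ones actually used:

/* Illustrative only: pick your font via the Google Font API */
@import url('https://fonts.googleapis.com/css?family=Lobster');

body {
    font-family: 'Lobster', cursive;
}

a:hover {
    /* A text-shadow in a lighter colour reads as a glow on hover */
    text-shadow: 0 0 6px #aaffaa;
}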