
Saving birds with technology

By Jake Edge
February 6, 2019

Two members of the Cacophony Project came to linux.conf.au 2019 to give an overview of what the project is doing to increase the amount of bird life in New Zealand. The idea is to use computer vision and machine learning to identify and eventually eliminate predators in order to help bird populations; one measure of success will be the volume and variety of bird song throughout the islands. The endemic avian species in New Zealand evolved without the presence of predatory mammals, so many of them have been decimated by introduced mammals that prey on the birds and their eggs. The Cacophony Project is looking at ways to reverse that.

Menno Finlay-Smits and Clare McLennan started their presentation with a recording of what parts of New Zealand might have sounded like before the arrival of humans and other mammals. Unfortunately, most of New Zealand does not sound like that any more, Finlay-Smits said. The Cacophony Project is a non-profit using open-source hardware and software to help restore the levels of bird song in the country. It is "very much a startup", he said. The project has a vision for where it wants to go, but does not have the solutions yet. Plans change regularly. The project works with other organizations on its aims and, of course, encourages volunteers.

New Zealand's native birds are "our national treasure", Finlay-Smits said. They are different from birds elsewhere in the world, so preserving them is important. They evolved without mammals, which is part of what led to multiple species of flightless birds in New Zealand. "They really need our help." The government is spending around NZ$70 million per year to control pests, but that only serves to suppress the numbers; it will never get them down to zero. There are also benefits to agriculture that come from eliminating these pests.

The current technology for identifying and capturing predatory mammals is fairly primitive. There are chew cards, which are plastic cards with bait (e.g. peanut butter) applied to them. Based on the different kinds of bite marks, the types of pests (e.g. rats, possums, stoats) can be determined. The chew cards work, but are not a great way of knowing what's in the area, he said. There are also wooden box traps that are somewhat effective at catching and eliminating the pests, though fewer than 1% of the animals that walk near them ever interact with them. Part of the reason is that the traps are "handicapped intentionally in their design" so that they do not capture animals other than the target type.

The project is looking for "something that is radically better": a device that can cover 100 times the area, catch four kinds of pests (the current traps target a single type), catch at least ten times as often, and that auto-resets so that it can do multiple catches without intervention. That would mean that each device was 4000 times as effective as the current technology.
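The 4000 figure follows from multiplying the three improvement factors together. A minimal sketch of that arithmetic, assuming the factors are independent and simply compound (the variable names are illustrative only):

```python
# Improvement factors quoted in the talk. Treating them as independent
# multipliers is the assumption behind the overall figure.
area_factor = 100        # covers 100 times the area
species_factor = 4       # targets four kinds of pests instead of one
catch_rate_factor = 10   # catches at least ten times as often

overall_factor = area_factor * species_factor * catch_rate_factor
print(overall_factor)  # 4000
```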

Technology ecosystem

The project has a whole "ecosystem of technologies" that it is building. There is an audio recorder project that will be used to determine how well the project is doing on its goal. There is also a thermal-video platform to record and analyze data. A "sidekick" phone app is used to manage the platform; there are various analysis and visualization tools and a "bunch of stuff in the cloud", he said.

The audio recorder, which is called the "cacophonometer", is based around an Android phone. The idea is to have a cheap and easy way to measure bird song throughout the day. It wakes up once per hour, records the bird song in its location, and sends it off to be analyzed. The project is currently in the phase of getting the devices out there to start gathering the baseline data.

[Demo]

The thermal-camera platform is in the prototype phase; it uses a Raspberry Pi as the onboard computer. The pests that the project is interested in are mainly active at night, so they show up nicely against the cool night air in thermal video. He demonstrated the device by pointing its camera at the audience (seen in a picture on the left). There is a web server in the device that displayed the camera feed for the demo; it is used in the field to ensure that the camera is pointed in the right direction before walking away from the site.

The platform has the Raspberry Pi, a 3G or 4G modem, and a camera, all housed in a waterproof enclosure. The Raspberry Pi has a "hat" (daughterboard) with a real-time clock and some other circuitry, such as a microcontroller to turn the camera device off for 18 hours so that it only runs at night and uses less power. Running on this hardware is challenging, Finlay-Smits said. Putting electronic devices in the wild is "fraught with problems". The project struggled with waterproofing early on, but the enclosure now has good seals and waterproof connectors. In addition, it has a "Gore valve" that allows the pressure to change within the enclosure without allowing water to get in; early experiments did not have that and the pressure changes due to heat wore out the seals fairly quickly.
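To get a feel for why the 18-hour shutdown matters, here is a rough duty-cycle estimate. The wattage values are hypothetical placeholders, not measurements from the project; only the 18 hours off per day comes from the talk.

```python
def average_power_watts(active_watts, off_watts, off_hours, hours_per_day=24):
    """Average draw for a device that is switched off for part of each day."""
    on_hours = hours_per_day - off_hours
    return (active_watts * on_hours + off_watts * off_hours) / hours_per_day


# Hypothetical figures: 3 W while recording, 0.05 W while only the
# microcontroller and real-time clock are running.
always_on = average_power_watts(3.0, 0.05, off_hours=0)
night_only = average_power_watts(3.0, 0.05, off_hours=18)
print(always_on, night_only)
```

With these made-up numbers, running only six hours per day cuts the average draw to roughly a quarter of the always-on figure.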

The camera being used is the Lepton 3, which is "quite reasonable for its price". It is not super-high resolution, but is good enough for the project's needs. It is somewhat tricky to use reliably, though; reading from the camera fast enough to keep up with the data stream without losing sync was a challenge. The project switched from Python to Go for reading from the camera and gave that process realtime priority in order to capture the data.

The project has struggled with battery power a bit, he said. It started by using devices on mains power, sometimes with really long extension cords, but there is a limit to that. So it switched to off-the-shelf USB batteries that the developers added weatherproofing to, but the batteries turned out to be "too smart"; when the system shut down to save power, the batteries would follow suit and require human intervention to turn back on. Now there are some custom-made battery packs that are not as smart, and thus work better for the device's needs.

Processing the video

At that point, Finlay-Smits handed off to McLennan to talk about how the video imagery is processed. She said that the first step is to do motion detection, but Cacophony does it differently than the existing crop of trail cameras that are used by game hunters and the like. Those devices have an infrared motion detector that turns on the camera, but it can take half a second for the camera to turn on. Stoats and rats are small and fast, so the project always has the camera on in order to detect motion.
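One common way to detect motion in an always-on video stream is to compare consecutive frames. The sketch below uses simple frame differencing on small grids of temperature-like values; it illustrates the general idea only and is not the project's actual detector. The function names and thresholds are assumptions.

```python
def changed_pixels(prev_frame, frame, delta):
    """Count pixels whose value changed by more than `delta` between frames."""
    return sum(
        1
        for prev_row, row in zip(prev_frame, frame)
        for before, after in zip(prev_row, row)
        if abs(after - before) > delta
    )


def motion_detected(prev_frame, frame, delta=2.0, min_pixels=3):
    """Report motion when enough pixels changed; thresholds are hypothetical."""
    return changed_pixels(prev_frame, frame, delta) >= min_pixels


# A cool, uniform background and the same scene with a warm 2x2 blob in it.
background = [[10.0] * 6 for _ in range(4)]
with_animal = [row[:] for row in background]
for r in (1, 2):
    for c in (2, 3):
        with_animal[r][c] = 30.0

print(motion_detected(background, background))   # False
print(motion_detected(background, with_animal))  # True
```

A detector this simple would also fire on anything else that changes pixel values between frames, which is why real scenes need more than a fixed threshold.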

She showed some footage of a stoat crossing the full frame in less than a second, which a trail camera would largely miss. So the camera stays on all of the time, which is working well but uses more power than the developers would like, especially now that the device is running on batteries. Some users have traps that they have trail cameras pointed at; they note that the bait is missing but the camera never got any footage, or even a still. The project's device can show them footage that gives them confirmation that there are animals in the area. For remote islands where these pests have been eliminated, it is critical to detect their return as early as possible, she said.

One problem is that distinguishing animal motion from wind is quite difficult. She showed footage of grass moving in the wind that repeatedly confused the motion-detection software. Finlay-Smits noted that, at the end of the day, trees may be warm from the day's sun and cooler leaves moving across them can confuse the detector. In addition, McLennan said, some animals don't appear all that warm; hedgehogs are warm on their bellies, but not on their "prickles", for example.

She showed more footage of creatures interacting with, and often outwitting, the traps. But if the motion detection mostly works well and you give people a way to look at the results from traps they have set, they will do so. They will get up in the morning to visit the web site to see what happened overnight; they will provide lots of feedback on what happened, as well as relating problems with the software or device design. In addition, "they will send you gruesome pictures of rats", she said with a chuckle.

The next stage is to identify animals in the footage without a human having to look at it. That is done using machine learning. In order to focus the training of the machine-learning model, they have identified and pulled out blocks with animals in them. Those blocks are linked frame to frame so that the model can also learn how the animals move. A bird and a rat have a similar size and shape at low resolution, but they move very differently.
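Linking detections from frame to frame can be sketched as a greedy nearest-neighbor match on centroids. This is an illustration of the general technique under stated assumptions (detections are given as (x, y) centroids, and `max_jump` is a hypothetical limit on movement between frames), not the project's tracker.

```python
import math


def link_tracks(frames, max_jump=5.0):
    """Greedily link per-frame centroids into tracks.

    `frames` is a list with one list of (x, y) centroids per video frame.
    Returns a list of tracks, each a list of (frame_index, (x, y)) entries.
    """
    tracks = []
    for index, detections in enumerate(frames):
        unmatched = list(detections)
        for track in tracks:
            last_index, last_pos = track[-1]
            # Only extend tracks that were seen in the previous frame.
            if last_index != index - 1 or not unmatched:
                continue
            nearest = min(unmatched, key=lambda p: math.dist(p, last_pos))
            if math.dist(nearest, last_pos) <= max_jump:
                track.append((index, nearest))
                unmatched.remove(nearest)
        # Anything left over starts a new track.
        for position in unmatched:
            tracks.append([(index, position)])
    return tracks


frames = [
    [(0, 0), (20, 20)],
    [(1, 1), (21, 19)],
    [(2, 2)],
]
tracks = link_tracks(frames)
print(len(tracks))  # 2
```

Once detections are grouped this way, the sequence of positions in each track carries the movement information that a single frame lacks.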

For the animal footage, the background is eliminated and then a mask of the pixels that are part of the animal is created. The footage she showed was fairly obviously a possum in context, but when she showed a zoomed-in, pixelated image of just the animal in the thermal false colors, it really showed the scope of the problem the project is trying to solve. In addition, it is not always as simple as following a single animal as there can be more than one at once; occlusion of parts of an animal as it moves can also complicate things substantially.
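A minimal sketch of background removal and masking, assuming the background can be estimated as the per-pixel median over several frames and that the animal is warmer than that background by some margin. Both the median estimate and the `min_delta` threshold are assumptions for illustration; the talk did not describe the method in that detail.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel median across frames, used as a static background estimate."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[r][c] for frame in frames) for c in range(cols)]
        for r in range(rows)
    ]


def animal_mask(frame, background, min_delta=3.0):
    """Mark pixels that are warmer than the background by more than min_delta."""
    return [
        [value - background[r][c] > min_delta for c, value in enumerate(row)]
        for r, row in enumerate(frame)
    ]


def make_frame(warm_col):
    """Build a 3x4 frame at 10.0 with one warm pixel in the middle row."""
    frame = [[10.0] * 4 for _ in range(3)]
    frame[1][warm_col] = 25.0
    return frame


frames = [make_frame(col) for col in range(3)]  # warm pixel moves left to right
background = estimate_background(frames)
mask = animal_mask(frames[1], background)
for row in mask:
    print(row)
```

Because the warm pixel occupies each position in only one of the three frames, the median ignores it and the mask for the middle frame contains a single marked pixel.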

Once the tracks are established in a video stream, the last stage is to classify the animal. In the end, the developers really want to be able to put them into one of two boxes: predators or non-predators. Humans, kiwi, and other birds are non-predators, while rats and mice, hedgehogs, possums, and stoats (a category that includes ferrets and weasels) are all predators. Cats are difficult, since they are predators, but, depending on context, the project may not want to treat them that way. "People are passionate about their cats", she said.
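The two-box grouping can be written down as a small lookup. The label strings and the `treat_cats_as_predators` switch are hypothetical names chosen for this sketch; only the grouping itself comes from the talk.

```python
# Grouping as described in the talk; the label spellings are assumptions.
PREDATORS = {"rat", "mouse", "hedgehog", "possum", "stoat", "ferret", "weasel"}
NON_PREDATORS = {"human", "kiwi", "bird"}


def classify(label, treat_cats_as_predators=False):
    """Map an animal label to 'predator', 'non-predator', or 'unknown'."""
    if label == "cat":
        # Cats are predators, but whether to treat them as such depends on context.
        return "predator" if treat_cats_as_predators else "non-predator"
    if label in PREDATORS:
        return "predator"
    if label in NON_PREDATORS:
        return "non-predator"
    return "unknown"


print(classify("stoat"))  # predator
print(classify("kiwi"))   # non-predator
print(classify("cat"))    # non-predator
```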

Currently, videos are labeled by animal type for training purposes; eventually, the individual tracks should be labeled. Each track is reduced to a 48x48 pixel, three-second clip. The model is trained with this data, tested with some other test data, and when it is deemed ready, it is evaluated on a different set of videos. Early on, there was a limited set of data and the only possum footage had them climbing up trees; that led the model to conclude that anything that had a tree in it was a possum. Training takes around six hours on a computer with a GPU; the project is looking at using Google Compute Engine to speed things up.
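Evaluating on a different set of videos means that no video contributes clips to both sides of the split. A minimal sketch of such a split, assuming each track is a dictionary with a `video` key (a hypothetical data layout, as are the video names):

```python
def split_by_video(tracks, held_out_videos):
    """Split tracks so that held-out videos never appear in the training set."""
    held_out_videos = set(held_out_videos)
    training = [t for t in tracks if t["video"] not in held_out_videos]
    evaluation = [t for t in tracks if t["video"] in held_out_videos]
    return training, evaluation


tracks = [
    {"video": "night-01", "label": "possum"},
    {"video": "night-01", "label": "rat"},
    {"video": "night-02", "label": "bird"},
    {"video": "night-03", "label": "possum"},
]
training, evaluation = split_by_video(tracks, ["night-03"])
print(len(training), len(evaluation))  # 3 1
```

Splitting at the video level rather than the clip level makes it harder for a model to score well by memorizing a particular scene, such as the trees in the early possum footage.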

The model is built using NumPy, OpenCV, and TensorFlow. It creates a recurrent convolutional neural network; the "recurrent" part means that it has memory, she said. The memory allows the model to take into account previous frames so that the movement of the animal is part of the decision-making process.

The main problem the project runs into is garbage in, garbage out, McLennan said. A single mislabeled track can confuse things quite a bit. In addition, the project is always looking for more video to use because more diversity of scenes and animal activity helps train better models. But 80% of the work actually goes into the processing and infrastructure to store, tag, and organize the footage.

The future

The next steps are for the project to get better at what it has been doing, she said. When there are multiple animal tracks, it would be helpful to be able to prioritize them based on the animal type, for example. In addition, there may need to be some training with dogs so that they can be recognized and avoided. The main question, eventually, is whether or not a trap should be opened based on what type of animal is present.

As time for their talk was running low, she handed off to Finlay-Smits so that he could talk about some future plans for the overall project. In the near future, the project needs to get the machine-learning model running on the camera devices; right now, everything runs in the cloud, which is not practical for remote sites. Once that is available, the project will have a device that can report on the numbers of pests it observes, which is an important starting point for many organizations, he said.

There is also an effort to create a "cacophony index" from the audio data that is being gathered by the cacophonometers. That will allow looking at the changes in bird song over time, for different seasons, and so on. Using audio to lure in these pests is another experimental technique the project is trying. By playing various types of sounds on a schedule or based on the presence of certain animals, researchers should be able to determine the effectiveness of the technique and which sounds work better or worse. If it is successful, it would mean that the range of the traps is increased, so fewer would be needed in a given area.

In five years, the project is looking at pairing the camera with a gun turret that fires paper "bullets" with toxin into the fur of these pests. The idea is that the pests will then groom themselves, ingest the toxin, and go off somewhere to die. This would allow a ten-meter radius to be covered, for example, and, unlike the traps, would not require a human to clear it. It obviously requires a lot of safety and legal review, but it has a lot of advantages, Finlay-Smits said.

Another possibility is drones, he said, not for shooting, but for scouting. They could be deployed in places that people did not need to walk to, gather some data over some number of days or weeks, then return. He has been told that the obstacle-avoidance software in drones makes it entirely possible to deploy in hard-to-reach places such as forest canopies—at least in a few years.

The project is all open source, he said. Like all such projects, Cacophony is always looking for help. As time expired, he said that there are lots of small projects that need doing using various languages and tools.

Video of the talk is available in WebM format or on YouTube.

[I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Christchurch for linux.conf.au.]

Saving birds with technology

Posted Feb 7, 2019 15:56 UTC (Thu) by PengZheng (subscriber, #108006)

Hmm, 'pairing the camera with a gun turret' sounds like a weapon system.

Saving birds with technology

Posted Feb 8, 2019 8:50 UTC (Fri) by nilsmeyer (guest, #122604)

That's what I thought too, though I believe those weapon systems already exist...

Saving birds with technology

Posted Feb 8, 2019 8:53 UTC (Fri) by nilsmeyer (guest, #122604)

Immediately when reading about the project I asked myself "what about cats?". My downstairs neighbor used to have an outdoor cat, after the cat died the noise from birds (and squirrels) significantly increased. I wonder what the overall impact is, I read somewhere that cats are responsible for 3/4 of bird deaths in some areas.

Saving birds with technology

Posted Feb 8, 2019 14:35 UTC (Fri) by anselm (subscriber, #2796)

To quote The Oatmeal: “Dogs are man's best friend. Cats are man's adorable little serial killers.”


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds