Improving image recognition for moths in Europe
Chris van Swaay explores the growing possibilities of using AI to identify butterflies and moths
A few years ago, the power of AI (artificial intelligence) was introduced to the butterfly and moth world through image recognition. Suddenly, identifying butterflies and moths no longer meant searching through books and comparing species one by one to find the right one. Interestingly, it works especially well for moths (even better than for butterflies), probably because photos of moths are nearly always taken in traps when the moths are settled, and each species sits in more or less the same way. In this blog post I want to give a short overview of the present state of moth identification by image recognition and discuss how it could be improved further.
Image recognition is working fairly well already and is available on our own ButterflyCount app for moths as well as several other apps such as ObsIdentify and iNaturalist. It works especially well in regions with many photos such as NW Europe, but there are large gaps where there are fewer photos available, especially in E and S Europe. Although there are some smaller (often national or regional, or project based) AI systems, the two main portals that work over the whole continent (and beyond) are observation.org and iNaturalist.org. Below I will discuss the main differences.
- Image recognition using AI relies on validated photos: we must be sure of the name of the species in each photo. Photo quality matters little; poor-quality smartphone photos are useful too. What really matters is the number of photos (the more the better) and the accuracy of the species identification.
- As far as we know, the two large Europe-wide platforms (observation.org and iNaturalist) only use their own validated photos to train their AI (though they might download photos from GBIF now). Both platforms upload records to GBIF, including all photos.
- These platforms validate records in fundamentally different ways. On iNaturalist, everyone with an account can validate a record, and there are rules on how a record reaches Research Grade, at which point it counts as fully validated. These records are then uploaded to GBIF and used for training the AI. On observation.org, only experts with a special admin account can validate photos. However, few experts validate moths outside the Netherlands and Belgium, so there is a huge number of unvalidated photos.
- At iNaturalist the backlog of unvalidated photos is smaller, because a few dedicated people have validated large numbers of photos (e.g. one person did 1.3 million on their own). The open structure of the iNaturalist community makes it much simpler to help with identification.
- For this reason, the AI of iNaturalist currently works better for countries outside NW Europe. Both platforms probably use more or less the same algorithms (all based on the ones Google released a few years ago), but iNaturalist simply has more validated photos to train on than observation.org.
- Observation.org makes it possible to use their AI either via the ObsIdentify app or via their API (Application Programming Interface) for identification (this is how moth photos are identified in the ButterflyCount app), though they often charge for this. iNaturalist also makes it possible to use their AI online while entering photos, via their app, or via their API.
It is important to realise that there are thousands of photos still sitting on the hard drives of moth collectors, including in places with gaps such as S and E Europe. So far these people have not shared their data and photos, and there is a risk that the opportunity to use them will disappear when they die. Simply collecting these photos, photographing the missing moths and uploading the results would be a huge help for iNaturalist and observation.org in improving their species coverage and the accuracy of identification by AI. This will create some extra work for the people who currently do the online validation in their own time. That seems to be less of a problem for iNaturalist than for observation.org, where it will remain one unless new (and active) validators are found.
How could the image recognition of moths be improved, especially for E and S Europe?
- The easiest way would be to make an appeal to existing moth experts (focusing on S and E Europe) to submit their identified photos to a central location where they could be uploaded onto one of the existing platforms.
- Another way would be to visit existing collectors in their homes to collect their images directly and ensure they are labelled correctly with species and location data.
- Once a large new set of photos has been collected, experts would need to be organised to validate them at observation.org and iNaturalist.org. Although on both platforms the basic principle is that validators are not paid, paid validation could be used for a limited period to clear the backlog of photos. Such validators could be employed by an institute, a university or one of the partners of BC Europe, or could be other experts. After a year, it should be possible to eliminate the backlog, and the AI will have been greatly improved.
- The problem remains that in regions such as S and E Europe there are simply fewer photos available, because not as many people trap moths there. A desktop analysis comparing the data for the European Moth Red List with GBIF data that include photos will reveal where the biggest gaps are. These gaps could be targeted by finding people who want to help collect as many photos as possible (e.g. by putting out many traps and photographing all the moths inside). Volunteers from the European Butterfly Group could get involved, as could the partners of BC Europe. It would be important that field data are collected via a standard protocol, such as the one in the ButterflyCount app, which guarantees that all moths are recorded as a sample of a community (and not as opportunistic records). This would not be extra work, but the data would be much more valuable from a scientific point of view.
- From the ButterflyCount app the data could be uploaded to gbif.org, but (if the recorder agrees) also to iNaturalist.org and observation.org.
- If these photos are uploaded to both platforms, they would be used directly for the improvement of the AI after the validation by experts (see point above).
- It is important to note that in several countries it is necessary to get permission to catch moths (and it would be necessary to kill some of the cryptic species for genitalia determination to be 100% sure of identification). It may also take several months to obtain the necessary permit, so there would be a lead-in time to get permissions in place.
- It would also be helpful if both platforms used the existing GBIF photos to train their AI, so they could expand the coverage and improve the accuracy of their image recognition.
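The desktop gap analysis suggested above could be sketched roughly as follows. This is a minimal illustration in Python, not an existing tool: the function names `photo_gaps` and `rank_countries`, the threshold of 20 photos and all the numbers in the example are invented for illustration. Real input would have to come from the European Moth Red List data and a GBIF export of records with photos.

```python
from collections import defaultdict


def photo_gaps(red_list, photo_counts, min_photos=20):
    """For each country, list the Red List species with too few photos.

    red_list:     dict mapping country code -> set of species names
    photo_counts: dict mapping (country code, species name) -> number of photos
    min_photos:   hypothetical threshold below which a species counts as a gap
    """
    gaps = defaultdict(list)
    for country, species_set in red_list.items():
        for species in sorted(species_set):
            # A species with no entry in photo_counts has zero photos.
            if photo_counts.get((country, species), 0) < min_photos:
                gaps[country].append(species)
    return dict(gaps)


def rank_countries(red_list, gaps):
    """Rank countries by the share of their species that lack photos."""
    shares = {
        country: len(gaps.get(country, [])) / len(species_set)
        for country, species_set in red_list.items()
        if species_set
    }
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Illustrative data only: these counts are made up, not real platform figures.
red_list = {
    "NL": {"Noctua pronuba", "Autographa gamma"},
    "GR": {"Noctua pronuba", "Autographa gamma", "Agrotis segetum"},
}
photo_counts = {
    ("NL", "Noctua pronuba"): 500,
    ("NL", "Autographa gamma"): 320,
    ("GR", "Noctua pronuba"): 35,
    ("GR", "Autographa gamma"): 4,
}

gaps = photo_gaps(red_list, photo_counts)
ranking = rank_countries(red_list, gaps)
for country, share in ranking:
    print(f"{country}: {share:.0%} of listed species below the photo threshold")
```

The countries at the top of such a ranking, and the species listed for them, would be the first candidates for targeted trapping and photographing.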
In conclusion, there are huge opportunities to improve the use of AI recognition for the identification of moths. This would enable us to greatly improve our knowledge of moth distributions and abundance so that we can better understand the pressures they face and design appropriate conservation strategies. As moths are important pollinators of wildflowers, and are vital components of the ecosystem (e.g. a food source for many birds and bats), such data would help the overall drive to conserve Europe's biodiversity and meet the targets set down in the EU Biodiversity Strategy for 2030.