What's that bird song? ID birds by sound with BirdNET

Thumbnail image Drew Weber | Macaulay Library


[Leo] Welcome everyone to all about BirdNET. Thank you all so much for joining us today. This is the second in a series of webinars, for the Cornell Lab of Ornithology’s Virtual Visitor Center Programs. This summer, we are highlighting each of the various mobile apps and other digital tools and resources created by the Cornell Lab. For each app, we post a short video demonstrating how to use the app in the field.

For example, last week I demonstrated the BirdNET app in Sapsucker Woods, and then we follow that up with a live webinar like today, where we chat with an expert to learn more about the app and about the science behind it. So let’s do introductions. My name is Leo Sack. I’m the Public Programs Assistant on the Lab’s Visitor Center Team, and my job is to help the public learn about all of these amazing resources. So I’m very happy to be able to facilitate this conversation.

Now today’s topic is bird sound recognition, that is, identifying birds by sound alone, using this amazing tool called the BirdNET app. And we are very lucky to have with us today the lead developer of BirdNET. His name is Stefan Kahl, and he is a postdoctoral researcher for the Cornell Lab of Ornithology in the Lab’s Center for Conservation Bioacoustics. Stefan, thank you so much for joining us today.

[Stefan] Thanks for having me.

[Leo] Now, Stefan, I understand you’re not actually physically here at Cornell, so where are you and why?

[Stefan] I’m in Germany actually, I’m in the city of Chemnitz. The local university is still in lockdown, so we have to work remotely, but I’m used to it because if I chat with my US colleagues, we’re always doing it on Zoom. So it’s not entirely new to me. And I had my kids paint my background, and I already saw someone posted in the chat that it looks so much better than yours, so I’m really proud of it.

[Leo] That’s fantastic. Yeah.

[Leo] And I don’t know if you said, what time is it there right now?

[Stefan] Oh, it’s 6:00 PM, so we’re six hours ahead in this country.

[Leo] Thank you so much for taking time out of your evening to join us.

[Stefan] Sure, oh sure, my pleasure.

[Leo] Excellent. Now, before we get too far into this, let me explain to everyone how today’s webinar’s going to work. In just a moment, Stefan and I will get the conversation started, because I know he has some really awesome stuff to share with us. But then we have a few questions lined up from audience members who submitted their questions ahead of time.

And then for our audience who are watching live, please type your own questions into the chat window, and we will queue those up for discussion as well. And for those of you who are here in our Zoom call, let me show you how to do this. I’m sharing my screen here, and you want to go to the bottom of the Zoom screen and find this button that says chat.

And if you click on that, it will open up a chat window on the side. And when you go to that chat window, there is this section at the bottom that says “to” with a little dropdown menu, and you want to make sure you select “to all panelists and attendees” so that we can all see your message.

Now for those folks watching the webinar being streamed live on Facebook, just use the Facebook comment section. I have colleagues behind the scenes helping gather all of those questions for us to hopefully be able to get through as many questions as we can today. All right, Stefan, would you start us off with a little bit of background about yourself? Briefly what’s your area of interest and how did you get involved in developing BirdNET?

[Stefan] I’m a computer scientist, and I did my bachelor’s and master’s degrees here in Chemnitz and also did my PhD, all in computer science. During the course of my studies, I was focusing on multimedia retrieval, which means pulling information from audio, video, text, and images, mainly using machine learning techniques, so I had this kind of background already. And in 2016, Holger Klinck, the director of the Center for Conservation Bioacoustics, who’s also German, remembered that he went to Antarctica with one of my colleagues and invited us over to the Lab.

We were actually working on a recognition system for audio events, mainly audio events that occur in the homes of elderly people. What we were trying to do was come up with a system that can detect if someone got hurt, if someone fell, if someone needs help. But it was challenging, because what does it sound like if someone is actually tripping and falling? You can always bring some actors into a studio and make some recordings, and they’re gonna be high quality, and you will do a pretty decent job with any detection system. But then you cannot apply that to the real world; it’s not working, because those are studio recordings. So we were kind of stuck.

We had this recognition technology, but we didn’t have a really good dataset, and then Holger said, “Yeah, well, can’t we apply that to birds? Wouldn’t that work? Isn’t that a complex acoustic class?” And we said, “Yeah, well, we should try it,” and we did. We participated in this European evaluation campaign called BirdCLEF in 2017, and we did a pretty decent job, and that maybe kicked off BirdNET, because, okay, we can use this technology that we have, we can apply it to birds, it’s working, but there’s so much more we want to explore. And I’ve been focusing on BirdNET ever since.

[Leo] Fantastic. Now, I want to make sure everyone’s on the same page knowing what the app does, so I want to quickly demonstrate it. You have created a really fantastic app here, so I’m sharing the screen of my phone right next to my video here. I’m gonna show what it does, and then I’m gonna ask Stefan to explain exactly what’s going on here, technically how this works. So the blue and green and yellow that we are seeing on the phone screen here is a spectrogram, right? This is showing what sounds there are. And right now you’re seeing my voice, but I’m going to play a bird song.

So you can really see on the spectrogram what that bird song looks like, and then we can just select a portion of it, hit analyze, and BirdNET is going to tell us what bird that is. And in this case, it’s saying Wood Thrush, and below that, next to the scientific name, it says “almost certain.” So now, Stefan, can you tell us, first of all, what does that “almost certain” mean? And then how does this app do this? This is amazing that it’s able to recognize any song or call, and, like, every individual bird sings a little bit differently, and there are regional variations, and, you know, every recording is a little different, but it’s able to actually recognize this. How does this work?

[Stefan] I do have a short presentation for this and I will start sharing my screen, and hopefully everyone can see this. So first of all, as you just mentioned, when you start the recognition process, it starts with a microphone. And it doesn’t necessarily have to be a smartphone; it could be any device that has a microphone. We’re trying to bring BirdNET to as many platforms as we possibly can. Typically what it’s doing is it’s recording raw audio data. And it’s also recording metadata, and metadata in this case means additional information. For birds, we know that location matters and that the time of the year matters, so the metadata that we’re collecting is the GPS latitude and longitude and the week of the year. And what we’re doing with the raw audio is that we’re actually using spectrograms, and spectrograms are visualizations of the audio signal.

And as you can see here, I mean, someone who’s proficient enough can try to guess which bird it is that you can see. And now this audio recognition problem turns into an image recognition task. The techniques for image recognition are much more advanced than they are for audio, and for us it was just the step to take and say, “Okay, we have experience in image processing, we have all these techniques; now, can’t we apply that to audio? Okay, yes, we can transform audio into images using spectrograms.” And spectrograms have a history in ornithology because they’re often used in scientific papers and in books: whenever you have to show a bird vocalization but you have to print it, you’re gonna use a spectrogram. We’re actually using three-second chunks, so we split the audio into three-second chunks, and then we pass them through BirdNET. And BirdNET is a deep artificial neural network. As the name suggests, this is some sort of what most people call AI.
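The chunking and spectrogram steps described here can be sketched roughly as follows. This is a minimal illustration in plain Python, not BirdNET's actual code: the function names, the 64-sample frame, the Hann window, and the low demo sample rate are all assumptions chosen to keep the example small and fast.

```python
import cmath
import math

def split_into_chunks(samples, sample_rate, chunk_seconds=3.0):
    """Split samples into fixed-length chunks, zero-padding the last one."""
    chunk_len = int(sample_rate * chunk_seconds)
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = list(samples[start:start + chunk_len])
        chunk.extend([0.0] * (chunk_len - len(chunk)))  # pad a short final chunk
        chunks.append(chunk)
    return chunks

def magnitude_spectrogram(chunk, frame_len=64, hop=32):
    """Naive short-time Fourier transform: one list of bin magnitudes per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(chunk) - frame_len + 1, hop):
        frame = [chunk[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):  # keep non-negative frequencies only
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

# Demo: 7 seconds of a 125 Hz tone, sampled at a deliberately low 1 kHz.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 125 * t / sample_rate) for t in range(7 * sample_rate)]
chunks = split_into_chunks(tone, sample_rate)
spectrogram = magnitude_spectrogram(chunks[0])
loudest_bin = max(range(len(spectrogram[10])), key=lambda k: spectrogram[10][k])
print(len(chunks), len(spectrogram), loudest_bin)
```

Seven seconds of audio yield three chunks, the last one padded with silence. Each chunk becomes a grid of time frames by frequency bins, which is the "image" a network can then classify. A real system would use an FFT library rather than this naive transform.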

I’m not so keen on calling it real AI, but typically that’s what everyone is referring to. It’s a really rough approximation of the signal processing in the human brain. You have neurons, and these neurons are interconnected, and they’re sending signals from one neuron to the next one. It’s a really, really simple approximation, but it has worked really well for this kind of task. So we pass each spectrogram through the BirdNET deep neural net, and it will give us species probabilities. And the species probabilities don’t have to add up to 100% because they’re individual probabilities. For each species, in this case 1,000 species, we get a certain score. And in this case BirdNET says 91% it’s a Northern Cardinal, 11% it’s an Indigo Bunting, and 9%, for some reason, it’s a European Goldfinch. So now we know, okay, location matters and time of the year matters. We collected this metadata, and now we’re using eBird checklist frequencies to build an occurrence mask.
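One common way to get per-species scores that need not add up to 100% is to pass each raw network output through its own sigmoid, instead of normalizing across all species with a softmax. Whether BirdNET does exactly this is an assumption; the logit values below are hypothetical and were picked only so the scores land near the 91%, 11%, and 9% mentioned in the talk.

```python
import math

def sigmoid(x):
    """Squash one raw score into (0, 1), independently of all other scores."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(values):
    """Normalize scores so they compete with each other and sum to 1."""
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw network outputs (logits) for three species.
logits = {
    "Northern Cardinal": 2.31,
    "Indigo Bunting": -2.09,
    "European Goldfinch": -2.31,
}

independent = {species: sigmoid(z) for species, z in logits.items()}
competing = dict(zip(logits, softmax(list(logits.values()))))

for species in logits:
    print(f"{species}: sigmoid={independent[species]:.2f} softmax={competing[species]:.2f}")
print("sigmoid total:", round(sum(independent.values()), 2))
```

With independent scores, two overlapping singers can both score high at once, which a softmax would not allow.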

And we use this occurrence mask to filter out the species that are not very likely in a certain location at a certain time of the year. In this case eBird would tell us that, let’s say, in Ithaca in early April, yeah, it’s really likely that it’s a Northern Cardinal. It’s not so likely that it’s an Indigo Bunting, which typically arrives a bit later, and it’s really not likely that it’s a European Goldfinch, because that’s a European species. So the final prediction that you’ll see on a smartphone or on your desktop will be: it’s a Northern Cardinal, and it’s highly likely. What we did is we just translated this score into a phrase like “highly likely,” “highly uncertain,” “almost certain,” and others, to make it a bit more understandable what this score actually means. But it roughly translates to the confidence of BirdNET that this is the species. We’re not just presenting the results to the user, but we’re also storing the audio and the metadata in an archive, and it is this archive that we’re really using for scientific projects. And we’re trying to do something with the submissions that we got from the BirdNET users.
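The occurrence mask and the score-to-phrase translation might look something like the sketch below. Everything numeric here is hypothetical: the checklist frequencies, the 2% cutoff, and the phrase thresholds are placeholders for illustration, not values taken from eBird or from the app.

```python
def apply_occurrence_mask(scores, checklist_frequency, min_frequency=0.02):
    """Keep only species reported often enough at this place and week."""
    return {
        species: score
        for species, score in scores.items()
        if checklist_frequency.get(species, 0.0) >= min_frequency
    }

def confidence_phrase(score):
    """Translate a numeric score into a phrase (thresholds are assumptions)."""
    if score >= 0.95:
        return "almost certain"
    if score >= 0.50:
        return "highly likely"
    return "highly uncertain"

# Scores from the talk's example.
scores = {"Northern Cardinal": 0.91, "Indigo Bunting": 0.11, "European Goldfinch": 0.09}

# Hypothetical share of local checklists reporting each species in early April.
ithaca_early_april = {"Northern Cardinal": 0.60, "Indigo Bunting": 0.01}

plausible = apply_occurrence_mask(scores, ithaca_early_april)
best = max(plausible, key=plausible.get)
print(best, "-", confidence_phrase(plausible[best]))
```

A species missing from the frequency table, like the European Goldfinch here, is treated as never reported and is filtered out.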

[Leo] Fantastic, thank you. Okay, you mentioned that you’re actually storing some of that data in your archive for science. So we’ll get back to that later, but I think that’s really fascinating that you actually are harvesting scientific data from this. Let’s circle back to that. Cause we’ve got a lot of questions coming in already and the most common one is the one we were expecting. Everyone wants to know-

[Stefan] Yeah.

[Leo] Why is BirdNET currently only available for Android devices and when, oh, when will it become available for Apple devices like iPhones and iPads?

[Stefan] Yeah, there’s a really simple answer for it: we only know how to do it on Android. And the reason for that is, in Germany, I’m at a public university that is mostly focused on teaching. What I did is I prepared some seminars and courses for students to learn mobile development. And I did that by using open source tools, because I can’t require each of the students to have a MacBook, and you need a MacBook to develop for iOS, so it’s kind of hard.

So we focused on Android, and that’s how I got to know Android. I learned it by teaching students how to do it, so for me, it’s relatively easy to do it on Android, but it is so hard to take this knowledge to a different platform and do it for iOS. It’s a completely different programming environment, it’s a different language, so it’s so hard. But we were thinking, okay, we need this iOS app because people are requesting it constantly. And we know, we hear you, right? We want to do it, we want to have it. And we’re actually doing this right now as a student project. So we’re taking BirdNET and a couple of students and saying, “You want to learn about iOS development? We want to teach you, and we want to do this by developing this iOS version of the BirdNET app.” So this is the reason that it takes a bit longer this way, but we actually have this benefit of then having some students who can maintain the iOS version and provide some future updates.

And we have a prototype; it’s working already. You can see the spectrogram, you can select a portion, you can submit it to the servers and get a detection. We’re still adding some more features and, yeah, some more UI functionality, but I’m relatively positive that we can maybe release it in the next few months. So let’s say we’re gonna have a testing phase, a beta phase, during the winter months, when there’s not so much going on, and we can figure out if it’s working or not working, get some feedback from people who are using it, and incorporate that into the app. And then, starting with the next season, hopefully we have a fully running version of the iOS app.

[Leo] Okay, so beta version this fall or winter, something like that and ready to go full blown

[Stefan] Yup, that’s it.

[Leo] Apple version.

[Stefan] That’s the roadmap, yeah.

[Leo] In time for next spring’s birding season? Ah, that’d be fantastic. I know a lot of people are excited to hear that. Now in the meantime, is there a way that people who don’t own an Android device can use BirdNET to ID birds right now? So for example, one of the questions submitted to us ahead of time was from Nancy, who said, “I did a recording using a third party recording app, is there a way to upload my audio recording to BirdNET for ID?”

[Stefan] Yeah, there actually is. So we have this website, and I think, yeah, you can show later what it looks like. We’re trying to keep it up to date and have all the demos and prototypes that we have up there. So if you come back regularly, you can check and see if there’s something new. And there’s this, yeah, this menu item which says “analysis of audio recordings.” If you click that, it will lead you to a demo page, which is not as sophisticated as the app, but you can select a file from your hard drive, upload it, and then the same recognition system that is used in the app is actually going to analyze your file.

Right, and then you see the file upload is in progress; it might take a while. If many people are using it, it might crash occasionally, but we’re doing our best to keep it online. And then when the analysis is done, you will see the spectrogram and you will get the species probabilities. It doesn’t have the fancy images and icons because it was one of the first prototypes that we actually developed. But yeah, this is the CSA-

[Leo] Can you see it if I do this?

[Stefan] Right there it is.

[Leo] Yeah, so I actually had this one preloaded. I think the reason it was taking so long for us is because we’re currently doing the Zoom chat and that’s slowing down my internet speed.

[Stefan] That’s right.

[Leo] I’ve noticed I can see the spectrogram and I can also play it, right? Right.

[Stefan] You can play it, and then you see, as I said, we’re using the three-second chunks, and every few seconds it will give you a different prediction. So right now there’s this chickadee in the background, and you can see that the probabilities just change for a brief moment. I think you should also be able to click the spectrogram and draw a bounding box. So if you’re interested in just a specific fraction of the recording, yeah, it has to be long enough, it has to be like a few seconds, and then it will resubmit this fraction of the recording and you can reanalyze it. So just in case you really want to know what that bird in the background is, you can resubmit it. So it’s not as sophisticated as the app, but still, if you have a recording and you just wanna know what it is, you can upload it here. And it doesn’t really matter if it’s 20 seconds, 30 seconds, or two seconds; you can upload it there.

[Leo] So some folks have asked, “Can it detect multiple species at once if the songs are overlapping?” And there’s your answer right here, your black-capped chickadee is fainter and in the background and overlapping, but it is detecting it, even though the probability isn’t as high.

[Stefan] Right, yeah, typically it works, but sometimes it’s just so hard to do that. If it’s just two birds, it might work. If they’re not exactly overlapping, maybe in the frequency range, it might work. But if there’s too much going on, you probably don’t have a chance. This is an active field of research and we’re really trying hard to improve this, but it’s not that trivial to do this.

[Leo] All right, and so I just want to mention for people, this website is birdnet.cornell.edu. And there’s a lot of stuff on here besides that one function. There’s this great live stream demo that is taking stuff from a microphone in Sapsucker Woods. And there are links to the Android app. And I imagine, Stefan, when you get the Apple version, the iOS version,

[Stefan] Yes.

[Leo] It will be on here as well, correct?

[Stefan] Definitely, yeah, that’s true.

[Leo] But what we were just showing was the analysis of audio recordings. Okay, fantastic, next question. So you mentioned before using BirdNET for scientific research, and I want to circle back to that. So are other researchers using it to help them analyze their own acoustic data? Like, are people, other researchers, just uploading their recordings to that website? Or are you collecting, you mentioned you were collecting data from the app itself, and so how are you using that?

[Stefan] Yeah, there are two kinds of research that we’re doing. There’s the research that we’re doing for other groups that are sending us their data: we’re running it through BirdNET and, yeah, trying to get some decent results so that they can do a density estimation, or a vocal activity index, or species diversity. But we’re also doing our own research, and in this research we’re currently focused on doing something with BirdNET submissions. I do have a few more slides for that, and I just wanna show you what we’re doing with it. So first of all, we’re only available on Android, and in North America and Europe only. And so far this year we had a really, really high increase in installs and active users in Germany, those people are from Germany, but also in France, in the US, and in the UK. And you can see that the numbers are really going up. So right now we do have around 20 million submissions from BirdNET users.

And we removed all the duplicates from it and all the ones that had a human detection in them. And we removed those with a low confidence score. We ended up with about 5 million observations, so now we want to do something with them. First of all, what we tried to do is confirm whether those observations are actually valid. We looked at some species, and we wanted to see if we can actually show some sort of migration pattern by looking at, for instance, the Common Crane in Europe.
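The three cleaning steps mentioned here (drop duplicates, drop anything with a human detection, drop low-confidence results) could be expressed as a single pass over the submissions. The record fields, the definition of a duplicate, and the 0.5 cutoff are all hypothetical; the talk does not say how the real pipeline defines them.

```python
def filter_submissions(submissions, min_confidence=0.5):
    """Return submissions that pass all three cleaning steps."""
    seen = set()
    kept = []
    for sub in submissions:
        # Assumed duplicate rule: same device, same time, same species.
        key = (sub["device_id"], sub["timestamp"], sub["species"])
        if key in seen:
            continue
        seen.add(key)
        if sub["human_detected"]:
            continue  # skip recordings flagged as containing a human
        if sub["confidence"] < min_confidence:
            continue  # skip detections the model was unsure about
        kept.append(sub)
    return kept

submissions = [
    {"device_id": "a", "timestamp": 100, "species": "Common Crane",
     "confidence": 0.92, "human_detected": False},
    {"device_id": "a", "timestamp": 100, "species": "Common Crane",
     "confidence": 0.92, "human_detected": False},   # exact duplicate
    {"device_id": "b", "timestamp": 105, "species": "Willow Warbler",
     "confidence": 0.20, "human_detected": False},   # low confidence
    {"device_id": "c", "timestamp": 110, "species": "Brown Thrasher",
     "confidence": 0.88, "human_detected": True},    # human detected
    {"device_id": "d", "timestamp": 120, "species": "Willow Warbler",
     "confidence": 0.75, "human_detected": False},
]

observations = filter_submissions(submissions)
print(len(submissions), "submissions ->", len(observations), "observations")
```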

What we did is we took a map and put all the observations as gray dots on these maps, and this is on a weekly basis. What you can see here, starting in the top left corner, is that in week one there are only a few red dots; those red dots are Common Crane observations, and the other dots are the total observations that we have. And if we move on to week five and then six and seven, you can clearly see there’s this corridor that the Common Cranes are taking over France to reach their breeding grounds in Eastern Europe and Northern Europe.

And you can clearly see on the map how they’re moving, how the red dots are moving, which is exciting to see, ’cause, okay, those are observations made by real people, made with the BirdNET app. And we can confirm this migratory pattern of the Common Crane, which is a nice thing. We did this for North America as well, and we decided to look at the vireo, and you can see, yeah, there’s not much going on in week one and two and three and so on, but then in week 11 and week 12, you can see there are a few more red dots, so a few more observations. And then in week 16 and 17, they’re really coming in, and you can see that they’re actually moving north towards Canada. So this is another migratory pattern that we were able to confirm. And that implies that the BirdNET observations that we’re receiving are actually valid from an ecological point of view. I do have two more examples. On the left, that’s a video of the Brown Thrasher detections that we get. This is on a daily basis this time. You can see that in the first weeks, early in the year, there’s not much going on, but you can see that in the south of the US, in Florida, there are a few thrashers.

And then as the year moves on, you can see we’re getting more and more observations, and you can see that the dots are actually moving north, so this is, again, a migratory pattern. This is not something which is completely new; we already knew that the birds are migrating in this way, but it’s nice to see that we can use BirdNET observations to confirm that. And on the right-hand side, we have this for the Willow Warbler, which is a common bird in Europe. Again, early in the year, there’s not much going on, not a whole lot of detections. But then in March you see there are more and more observations. And as we move on to April and then May, they really start to shift towards Scandinavia, which is really nice, and you can really nicely see this on the map.

So again, this is not completely new, and this is not something that no one had ever recorded or seen before, but we know that the observations that we collected are actually valuable and that we can do some research with them. And we’re just getting started exploring this. It’s a huge dataset, it’s gigabytes and gigabytes of data. We’re receiving like a hundred or 120 gigabytes per day of audio and metadata, and this is a huge collection. It is just so valuable to have this data and to finally be able to do research on a larger scale. And if you consider each of the devices as a recording station, you have really good coverage on a continental scale.
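The weekly maps described above boil down to grouping observations by week and separating one target species (the red dots) from everything else (the gray dots). A rough sketch of that grouping, with made-up records and hypothetical field names, and with ISO week numbers standing in for whatever week definition the project actually uses:

```python
from collections import defaultdict
from datetime import date

def weekly_counts(observations, target_species):
    """Map week number -> (target species count, total count)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for obs in observations:
        week = obs["date"].isocalendar()[1]  # ISO week of the year
        totals[week] += 1
        if obs["species"] == target_species:
            hits[week] += 1
    return {week: (hits[week], totals[week]) for week in sorted(totals)}

# Made-up observations; a real record would also carry latitude and longitude.
observations = [
    {"date": date(2020, 1, 2), "species": "Common Crane"},
    {"date": date(2020, 1, 3), "species": "European Robin"},
    {"date": date(2020, 2, 12), "species": "Common Crane"},
    {"date": date(2020, 2, 13), "species": "Common Crane"},
    {"date": date(2020, 2, 14), "species": "Great Tit"},
]

for week, (cranes, total) in weekly_counts(observations, "Common Crane").items():
    print(f"week {week}: {cranes} crane observations out of {total}")
```

Plotting each week's records by location, colored by whether they match the target species, would then give one map panel per week.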

[Leo] That is amazing. Okay, so we’re getting… That sounds like some awesome research. We’re getting lots and lots of questions about some of the limits of the current version of the app. So some people are noticing that the app can identify about a thousand species, and they’re asking about what countries that covers. And correct me if I’m wrong, but right now that covers the US, Canada, and Europe, right?

[Stefan] Right.

[Leo] So any species that you’re likely to hear calling in the US, Canada, or Europe, but if we go to Central America or Australia or other parts of the world, the app currently won’t know those species. And so people are asking, you know, when is that going to change? Also, people are asking about needing an internet connection. Currently the app does need an internet connection. It doesn’t have to be wifi; cell phone data works. And if you’re only sending small snippets of sound, that’s not a lot of data, but you do need a signal. If you’re someplace without a signal, you’re kind of stuck, right?

[Stefan] Yeah.

[Leo] And then a third thing is that you have to pause the recording and select it with your finger and submit that snippet over the internet and wait to get your ID, which means in the meantime, you might miss out on other sounds. So all of those are things that people are asking about and don’t get me wrong, the app is fantastic already, but people are really curious about these things. So as I understand it, this Apple iOS version of BirdNET will be essentially the same as the Android version, but looking further ahead into the future, do you have plans for an even more advanced version of the app?

[Stefan] Yeah, we do have those plans, and we’re tinkering around with it. Technically it’s a challenge to do that, and we wanna address all the points that you just mentioned. If you’re in the field, you don’t have an internet connection; some sort of real-time mode would be nice, so you don’t miss out on any bird calls; and then we want to have global coverage. I have another slide, and I’m going to show it to you. What we’re doing right now is collecting audio recordings from around the world. We’re using the Macaulay Library and we’re using xeno-canto for it, and you can see on the right side, in those two maps, that if you take the recording location of each of the recordings, you can clearly see the outline of all the continents. I mean, okay, we do have a dataset, we do have data collections that include all continents, that include almost all 10,000 species, and there are multiple hundreds of thousands of recordings. And we can use those for training a better classifier.

That’s what we’re trying to do right now, and the question is, can we actually train a classifier, a recognition system, which is capable of identifying more than 6,000 species? Six thousand because those are the species we have the most recordings for; for some rare species, we do not have a sufficient amount of recordings. So 6,000, that’s the coverage that we’re trying to achieve. And then the question is, if we do have such a model, does it also fit on a smartphone? Because you want to do the detection on the device. So this is active, ongoing research, and we were actually able to reduce the size of the current model while maintaining the same precision. And now we’re not just supporting 1,000 species, we’re supporting more than 6,000 species. It is amazing to see that we can push the limits of this recognition algorithm, yeah, to recognize this number of species.

But this time we’re not using eBird as a post-filter, we’re not using this occurrence mask. What we did is we trained the model on audio and metadata at the same time: we fed the audio signal and the GPS location of the recording into the model. And over time this model learned the distribution and the range of the species, so you don’t have to put eBird data on the smartphone to get a detection which is likely for the area that you’re actually in. And then we’re also able to squeeze this model into an eight-megabyte file. It is really important to have a really small model, because if you’re in a remote area and you have to download an app which is 200 megabytes in file size, that’s not going to happen. So really, yeah, there are different angles that we’re taking, and I have prepared a video which I would like to show.

[Leo] Can I interrupt real quick, Stefan?

[Stefan] Go ahead

[Leo] Can you go back to that last slide, I just have to point this out because the first time I heard you say this, I didn’t understand, you were talking about being able to cover 60% of the species of birds in the world over 6,000 species of birds, every song, every call. Go to all these countries all over the world and without even connecting to the internet,

[Stefan] Right.

[Leo] Have your phone be able to identify any bird sound you’re hearing for all these species. And when I first heard you say this, I for some reason in my head heard eight gigabytes, cause I assume that would have to be an enormous file, but you’re talking about the entire app to do all of this, no internet connection or anything, eight megabytes.

[Stefan] Right.

[Leo] That’s like one photo from my phone’s camera.

[Stefan] Yeah, there’s some really-

[Leo] That’s insane.

[Stefan] Yeah, so there are really crazy techniques nowadays, and it’s not like we came up with this quantization process, other people did, but we’re just using it. And this is really common for mobile devices. The mobile device that you’re holding right now has a lot of these models, there’s a speech recognition model, there’s an image recognition model, and they’re all just this small file. But you have to include this thought in the design process. You can’t just come up with any model and then try to shrink it down. You have to think a bit more about what it should look like, and spend some time evaluating it. But yeah, we were aiming for an app size of around 50 megabytes, including some texts and images and the interface, so that it’s reasonable to have it as a download.
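To give a feel for why quantization shrinks a model, here is a minimal sketch of one widely used variant: storing each weight as an 8-bit integer plus a shared scale and zero point, instead of a 32-bit float. The talk does not say which scheme BirdNET uses, so the 8-bit choice, the function names, and the parameter count below are assumptions for illustration only.

```python
def quantize_int8(weights):
    """Map floats to integers in [-128, 127] using one scale and zero point."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # avoid a zero scale for constant weights
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate float weights from their 8-bit codes."""
    return [(q - zero_point) * scale for q in quantized]

weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
quantized, scale, zero_point = quantize_int8(weights)
restored = dequantize(quantized, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print("largest rounding error:", max_error)

# Size arithmetic for a hypothetical model with 8 million weights.
n_params = 8_000_000
float32_megabytes = n_params * 4 / 1_000_000  # 4 bytes per float32 weight
int8_megabytes = n_params * 1 / 1_000_000     # 1 byte per int8 weight
print(float32_megabytes, "MB ->", int8_megabytes, "MB")
```

Each weight loses a little precision, at most a step or two of `scale`, in exchange for a file a quarter of the size. That trade-off is why the model design has to account for quantization from the start.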

[Leo] That’s fantastic, okay. So what does this thing look like?

[Stefan] Yeah, so we have a proof of concept. Right now it’s just trying to get this started and see, okay, what would it look like? What would it feel like to have this on a smartphone? I have this video, and it has sound, and I hope everyone can hear it. It still has this scrolling spectrogram, because it gives the best visualization, to give you a glimpse of the audio that we recorded, actually, yeah, of the quality of the audio that we’re recording. And you can see there’s a really clear spectrogram, and as soon as something pops up, I see, oh, wait a minute, what is that? It’s still pause the recording and select; we kept this process for now. But then, once you’ve selected it, we get an instant detection because we’re running on the device, and then it says it’s a Common Firecrest, which is a common bird species in Europe, and we have some information.

You can, again, confirm and submit this observation if you like, if you want to share it. And then if you’re done with it, you can restart the recording, and now, since we’re running on the device, we can actually switch to the real-time mode. Then as soon as a bird is heard, you get an instant detection, because the model is running in real time; it’s constantly analyzing the audio feed. And even if other birds are joining in, like this Black Redstart, which is also coming in.

You can see, if there are multiple birds vocalizing at the same time, you do have a chance of actually getting a detection for both of them. There is another common bird species. And yeah, so you can see the vocalizations in the spectrogram, you can see the detections, and after a while you may get a good feel for which bird is which, and this helps you learn a bit more and then identify those species.
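A real-time mode like the one demonstrated needs to keep handing the model the most recent few seconds of audio while new samples keep arriving. One simple way to do that is a rolling buffer that emits a window at a fixed interval. The sketch below assumes three-second windows (matching the chunk length mentioned earlier) and a one-second step; the step size and all names are hypothetical.

```python
from collections import deque

def stream_windows(blocks, sample_rate, window_seconds=3.0, hop_seconds=1.0):
    """Yield the latest full window of audio each time one hop of new samples arrives."""
    window_len = int(sample_rate * window_seconds)
    hop_len = int(sample_rate * hop_seconds)
    buffer = deque(maxlen=window_len)  # old samples fall off the front
    since_last = 0
    for block in blocks:
        for sample in block:
            buffer.append(sample)
            since_last += 1
            if len(buffer) == window_len and since_last >= hop_len:
                since_last = 0
                yield list(buffer)

# Demo: 6 seconds of fake audio at 10 samples per second, arriving in blocks of 5.
sample_rate = 10
samples = list(range(6 * sample_rate))
blocks = [samples[i:i + 5] for i in range(0, len(samples), 5)]

windows = list(stream_windows(blocks, sample_rate))
for window in windows:
    print("analyze samples", window[0], "to", window[-1])
```

Because consecutive windows overlap, a call that straddles a window boundary still appears whole in at least one window. Each window would be turned into a spectrogram and classified on the device.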

[Leo] Wow! That just blows my mind. I cannot wait for that version of the app. So again, you’re not talking about the iOS version, that’s gonna come out in the next few months, you’re talking about something further down the line.

[Stefan] Right.

[Leo] What is the timetable for that?

[Stefan] Well, I’m not really sure. So again, it’s the same issue: it’s been developed for Android, but we really want to do it multi-platform from the beginning. We don’t want to end up having just one version and then having to port that to a different platform. So there are a lot of new techniques that we have to explore to see if we can develop this for both platforms at the same time. And I think, so, it is working, but we need people who are helping us out, and internally we’re discussing how we’re going to proceed with this project.

And then we’ll probably have a beta version again and let some people around the globe give it a try. I was thinking about some people in Brazil or in India or in Japan, or just, like, anywhere on the planet, making some recordings and giving us feedback on how well it actually works. Is the real-time mode actually something that is useful, or is it not? Is the scrolling spectrogram still something that we need, or don’t we need that anymore? So getting that feedback, and then even, what’s the detection quality if you’re in the Amazon rainforest? I mean, does it pick up anything? Is it just frogs, and you just pick a random bird?

So yeah, that those questions we have to investigate that. And then I think in a year or year and a half, maybe there’s going to be a first version, which is available to the public, but then this time again, it’s going to be iOS and Android at the same time, so I don’t want to have the situation again where people are requesting the iOS version, and we can’t deliver.

[Leo] Yeah, wow! Okay, so an iOS version of the current app coming in a matter of months, this more advanced version, you said a year and a half, two years, something on that order, if I hear you right. But it looks definitely worth waiting for, and thank you for giving us that sneak peek.

[Stefan] Sure.

[Leo] Okay, circling back to the current version of the app, multiple audience members have asked a lot of different questions about what they can do to improve their recording quality or maximize their chances of getting an accurate ID. And they want to know whether there are settings they should adjust, either on their phone or within the app itself, increasing or decreasing volume, anything like that. And then also are there other things that they should be considering, like getting an external microphone to attach to their phone or avoiding background noise, staying away from the road? You know, what advice do you have for people?

[Stefan] So yeah, first of all, I just want to say that it’s a real challenge to get a sound detection system running on the Android platform. There’s so many different manufacturers and devices, it’s just a mess. And so we know that on some devices it is not working as expected. It’s so hard to find the reason for that, because we don’t have all the devices and we can’t give it a try. But if we get the feedback, “I’m having this and that device and this and that is not working”, we can try to investigate and improve things. And for the audio, really you have to dig through the settings menu and just tinker around with the settings to find the best possible audio source and the best possible audio settings. So yeah, if you go to the settings menu-

[Leo] Would you like me to demonstrate that?

[Stefan] Yeah, if you could do that, that would be nice.

[Leo] All right, I’m gonna ask our friends behind the scenes to spotlight my video for me while I do this. So I’m going to… I’m going to click the menu button in the top left corner here, and if I’m spotlighted, Stefan, you’ll be able to walk me through this. We’ll go to the settings-

[Stefan] And then the settings menu, right.

[Leo] Okay.

[Stefan] And then if you scroll down, so the first few settings are really about what kind of common names you are seeing; we have different translations, different localized versions. And then there’s the audio input source, and that’s probably the most important setting that you have, and I would advise everyone to use the unprocessed mode. Not every device has it, but this unprocessed mode doesn’t have post-processing, and many manufacturers are using post-processing to optimize the audio signal for speech. And oftentimes, they’re canceling out the background noise, but that’s where the birds are. So really, if this is available on your phone, you wanna use the unprocessed mode because that means really getting the raw audio data.

The default mode typically on many devices is really, really good. Many devices do have a great microphone, which I didn’t expect when we were first developing this app. We were thinking, “Yeah, people have to attach an external microphone to it”. But I would say depending on your device, if it’s a newer device, typically, the quality is so good that there’s no need for an external microphone. I think the only time that an external microphone really makes sense is if you want to make a directional recording, if you really want to point to a bird and know what it is, but I think for the casual user, just using the internal microphone of your smart device is good enough. If you scroll down, sorry, yeah. You wanted to say something?

[Leo] Absolutely, I was just going to ask, you know, what if your microphone seems to be too sensitive, or not sensitive enough? Maybe the bird is very far away. Maybe the bird is right up in your face and you want to adjust the volume, something like that.

[Stefan] Oh yeah, you should do that. As soon as there’s all yellow in the spectrogram and you can’t really make out a thing, that means there’s something wrong with the audio signal, and maybe the volume is just too loud, and you can turn down the volume. You can adjust the color map. So if you have difficulties in recognizing the bird sounds because of the colors of the smartphone, you can adjust that. You really have to find the best settings that help you see what you can hear, so if you can hear a bird, you have to see it in the spectrogram. And if you can see it there’s a high probability that BirdNET is able to detect it. If you can’t see it, you typically can’t detect it.

[Leo] So I want to see if I can demonstrate this real quick, I’m going to turn my amplitude gain way up. I was trying this before, and if I play that same-

[Stefan] Right, if it looks like this-

[Leo] Wood Thrush again, that you know, I’m right next to my computer speakers.

[Stefan] Right, if it looks like this, there’s something off, so you can see there’s a lot of clipping.

[Leo] So that’s pretty solid yellow there.

[Stefan] Right, so there’s not a lot of detail and you can’t really make out things. So that’s an indication that something is off with the gain setting or maybe with the audio input. And if that happens, what you’re seeing right there, if you see these black lines in between, that means that there’s background noise canceling active, which means that your device is actually suppressing the background noise, and that might also affect the detection quality.
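The "solid yellow" spectrogram that Stefan and Leo describe is what a clipped signal tends to look like. As a rough illustration only (this is not BirdNET code), the Python sketch below estimates how much of a recording sits at full scale. The function names `clipping_ratio` and `apply_gain`, the 0.999 margin, and the test tone are all assumptions made for the example.

```python
import math

def clipping_ratio(samples, full_scale=1.0, margin=0.999):
    """Return the fraction of samples at or near full scale (a rough sign of clipping)."""
    if not samples:
        return 0.0
    limit = full_scale * margin
    clipped = sum(1 for s in samples if abs(s) >= limit)
    return clipped / len(samples)

def apply_gain(samples, gain, full_scale=1.0):
    """Amplify the signal and hard-limit it, as an overdriven input would."""
    return [max(-full_scale, min(full_scale, s * gain)) for s in samples]

# Hypothetical input: a quiet 1 kHz tone, 0.1 s long, sampled at 48 kHz.
tone = [0.2 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(4800)]

quiet_ratio = clipping_ratio(tone)                   # nothing near full scale
loud_ratio = clipping_ratio(apply_gain(tone, 20.0))  # most samples flattened
print(quiet_ratio, loud_ratio)
```

If a large share of samples is flattened like this, turning the gain down, as suggested above, is the first thing to try.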

[Leo] It’s interesting, for me, I think that started when I hit unprocessed.

[Stefan] Right.

[Leo] Instead of my-

[Stefan] As I said, each device has the same… I mean, each device handles it differently and unprocessed means something else on each device, so it’s a mess. But typically there’s this ideal setting for each device, but you have to find it yourself.

[Leo] So the answer is play with your device and figure out what settings are right for you.

[Stefan] Right, that’s the advice, yeah. I think it’s going to be a bit easier for iOS because there’s just a limited amount of devices and they’re all using the same hardware, so it’s going to be easier. And I expect that we can find a good default setting for iOS. It might actually be easier than on Android.

[Leo] Okay, and what about stuff outside of settings in the app, user-error type of stuff? What about background noise, for example? You’re gonna have better luck if you’re not right next to a busy road?

[Stefan] Definitely, so it’s a good exercise actually going out. If you live in the city, go out and try to get a clean recording of a bird and you will notice that it’s just so hard to do it. And that gives you an understanding of how polluted your acoustic environment actually is. So there’s so much noise pollution, there’s people chatting, there’s trains going by, cars, there’s planes flying over, and all that sort of stuff is going on. And this is what gives you an understanding of, “Oh my God, yeah, this is really”.

And the birds have to fight this noise, so that’s the noise that the birds are experiencing too. And then it gives you a better understanding, and then you start looking around for the quiet places in your city, in your town, or even in the forest. And then you notice, “Oh, there’s this huge difference. I can hear so much more. I can hear so many birds when it’s quiet”.

And I think by just using BirdNET, it just gives you this feeling for the influence that human-made sounds have on the birds. And yeah, definitely you should find a quiet spot and get a clean recording, get as close to the bird as possible, which is not always possible. But yeah, so if this bird is far away, if it’s a faint vocalization, if you just recorded it once and there’s a dog barking in the background, yeah, the chances are really slim that we get an ID.

[Leo] I have to say I’ve been using the app for several months and it has done so much to help train my ears. I feel like I’ve gotten much better at learning and recognizing bird songs myself now. I hear songs that I previously had no idea what they were and I’ve quizzed myself with BirdNET enough times. I’m like, “Oh, I know that one off the top of my head now”, but I’ve also gotten so much more sensitized to, as you said, noticing that noise pollution in the background. Let’s see, I’m trying to think. I’m looking through all the many questions that have been submitted to us and trying to think what’s the best one to go to next. I did notice one person asked, “How is this amazing project funded?” Can you tell us?

[Stefan] Oh yeah, sure. So we have a couple of funders right now. The postdoc position that I have is funded by a Cornell alumnus, it’s Jake Holshuh. The Arthur Vining Davis Foundations is also funding this. We had the German Ministry of Education and Research fund the project, the European Social Fund. So there’s different funds that we’re actually accessing and they’re supporting this cause. So right now, so I’m funded through the year 2021, I think, and then I’m pretty sure we will find some funding to continue with this project. So hopefully, fingers crossed, this project stays alive, but I think, yeah, if we can convince people that this is some valuable work that we’re doing, I’m pretty sure that some funding will come in and we can continue.

[Leo] When you say, “Convince people, this is valuable work”, I think everybody on the call would agree that it’s certainly a valuable thing for the public. Are you thinking more in terms of you have to be able to justify the research results or what?

[Stefan] Yeah, so the Citizen Science side of this project is really light and nice, and this is what gives us the outreach to reach the people and to plead our cause and to say, “Okay, yeah, this is what we’re doing”, but as researchers, we’re really focused on the hard science, and right now the 80% precision which we have is just not good enough, it’s just at the verge of being, like, really valuable to biologists and ornithologists, and we have to improve that. And if we can show that this is possible, if we can show that we can push the limit of the detection quality, then I’m pretty sure that there are going to be a lot of opportunities to get data from other research groups, apply BirdNET to it, and then they can do their research and really save some time and get some valuable results out of it.
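For readers unfamiliar with the term, precision is the share of reported detections that are actually correct: true positives divided by all positives reported. The short Python sketch below shows that standard formula; the counts of 80 correct and 20 incorrect detections are made-up numbers chosen only to match the 80% figure mentioned above, not BirdNET evaluation data.

```python
def precision(true_positives, false_positives):
    """Precision = correct detections / all detections reported."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no detections to evaluate")
    return true_positives / total

# Hypothetical counts: out of 100 detections, 80 name the right species.
example = precision(80, 20)
print(example)  # 0.8
```

At that level, roughly one in five reported detections names the wrong species.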

[Leo] Absolutely, that makes sense to me. We do have a lot of questions from… Well, we’ve got tons of questions, but I’m seeing a pattern of there being several people who are doing research of their own that want to know things like, “Can I connect BirdNET to coding in Python?” Another person’s asking, “Is this an open source project?” So I don’t know if they’re thinking in terms of the algorithm or the Citizen Science data that you’re gathering. And gosh, there was another one in here somewhere, but you know, lots of people seem interested in… Oh, there was one about, “Could this type of sound recognition be adapted to apply to rodent sounds?” So what would you say to people who have those research-connected questions?

[Stefan] So yeah, it is an open source project for the most part. You have to clean up your code before you upload it, so that people can actually use it, and provide a tutorial on how to do that. I did that, and the recognition system that is actually powering the app is online on GitHub, so if you’re interested in how it works, and there’s also a trained model, so you can download it, set it up on your machine and use it for research if you like. We will try to update this using a new framework, a new back end right now, you broke in my languages. But we’re trying to publish our research, not just as a written paper but also as a code repository.

We also do have a public API, which you can query and send your recordings to. So if you want to build your own client instead of using our website you can do that. If you want to do that, just leave us a note or write us an email and then we’ll try to figure out how to do that. And yeah, so we’re also constantly getting this question: “Okay, it works for birds, will it work for insects, for rodents? Would it work for frogs? Would it work for, I don’t know, any bird, any animal sound?” And the answer is, yeah, maybe. We can’t be entirely sure because other animals are using other frequency ranges than birds and it gets more challenging for insects, who use higher frequencies.

It gets really challenging for bats, which are using the high end of the frequency spectrum. But we have other researchers at the Center for Conservation Bioacoustics, who are working exactly on that. So we have an insect division, we have people working on insect recognition. We have people working on marine mammal recognition.

We have people working on frog detections, and even gunshots. So if you think about poaching in Africa, you want to detect gunshots and know where people are that are coming in and shooting elephants, for instance. So this is another acoustic domain which helps conservation efforts by just using an audio recognition system, so there’s so much more that you can do. And even circling back to the elderly people and assisting people in their homes, now that we were able to advance this technology of audio recognition, there seems to be a chance that at some point we could really use this to assist humans in their own homes.

[Leo] Wow, okay, I’m seeing follow up questions that are specifically asking, “What about ultrasonic range?” So you mentioned bats, what about you know, bats, rodents, other animals that are too high pitch for human ears?

[Stefan] Yeah, so you need specialized recording equipment for that. You can’t do it with your smartphone. I know there are certain devices that you can attach to a smartphone that will lower the frequency, and then you can hear it. So that’s a device that’s recording at the high frequency and lowering the pitch. And you can hear it, you could use that as a detection system. But typically if you can visualize it as a spectrogram, it doesn’t matter which frequency range it is, you just have to pick the right frequency range, and for birds it’s different than it is for bats, for instance.
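Stefan does not say how such attachments lower the pitch. One common approach in bat detectors is time expansion: the call is recorded at a high sample rate and played back slower, which divides every frequency by the slow-down factor. The Python sketch below shows only that arithmetic; the 45 kHz call, the factor of 10, and the 20 Hz to 20 kHz hearing range are illustrative assumptions.

```python
def time_expanded_frequency(freq_hz, factor):
    """Frequency heard after playback is slowed down by `factor`."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return freq_hz / factor

def is_audible(freq_hz, low_hz=20.0, high_hz=20000.0):
    """Rough check against a nominal human hearing range (an assumption)."""
    return low_hz <= freq_hz <= high_hz

call_hz = 45000.0  # illustrative ultrasonic call frequency
shifted_hz = time_expanded_frequency(call_hz, 10)

print(shifted_hz)              # 4500.0
print(is_audible(call_hz))     # False
print(is_audible(shifted_hz))  # True
```

The same idea applies to the spectrogram: once the frequency axis covers the right range, the call becomes visible regardless of whether a human could hear it.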

[Leo] And then, so for those other animals, the same type of algorithm would work if you had the right recording equipment, and then I assume you would need to get a bunch of sample recordings to train the machine learning just like you did to train it with all the bird songs.

[Stefan] Yeah, in theory, that’s what should happen, but typically you really have to adapt to the acoustic domain. And if you’re recording insects, they have these really short intervals that they’re vocalizing in. And so the window that you’re looking at is not three seconds but maybe only a few milliseconds. And marine mammals, there’s a lot of noise going on under water, but the range they can hear under water is just so much worse in this case than it is for birds. So there’s a lot going on and you have to account for all of that. So you have to account for the acoustic domain that you’re applying your algorithm to. So in principle, you can use these same techniques, so everyone is using deep neural nets right now, and they’ve proven to be really powerful, but you have to adjust them to your use case, to what you’re trying to achieve.
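To make the point about window length concrete, here is a minimal Python sketch that cuts a recording into fixed-length, non-overlapping analysis windows. It is not BirdNET’s actual preprocessing: the function name `split_into_windows`, the 48 kHz sample rate, and the 5 ms insect window are assumptions for illustration. Only the three-second figure comes from the discussion above.

```python
def split_into_windows(samples, sample_rate, window_seconds):
    """Cut a signal into consecutive, non-overlapping windows of equal length.

    A trailing remainder shorter than one window is dropped.
    """
    window_len = int(round(sample_rate * window_seconds))
    if window_len <= 0:
        raise ValueError("window is shorter than one sample")
    last_start = len(samples) - window_len
    return [samples[i:i + window_len] for i in range(0, last_start + 1, window_len)]

sample_rate = 48000                    # assumed sample rate in Hz
recording = [0.0] * (sample_rate * 9)  # 9 seconds of placeholder silence

bird_windows = split_into_windows(recording, sample_rate, 3.0)      # 3 s windows
insect_windows = split_into_windows(recording, sample_rate, 0.005)  # 5 ms windows

print(len(bird_windows), len(insect_windows))  # 3 1800
```

The same nine seconds of audio yield three windows at bird scale and 1,800 at the assumed insect scale, which is one reason a model tuned for one acoustic domain does not transfer unchanged to another.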

[Leo] Okay, all right, I want to move back away from the highly technical stuff here, cause we’re running short on time and I want to make sure we get to some of the questions for those of us who are not IT experts. And so I saw an interesting question, “What about birds like mockingbirds that mimic or imitate other birds? Could a Mockingbird fool the app into thinking it’s something else?”

[Stefan] Yes, it could, and there was this actually, we had this funny example and people are trying this with the upload page that you just saw. They took a Mockingbird recording, and BirdNET would say that, okay, it’s a Mockingbird at rank one, and then the second prediction would be the bird that the Mockingbird was imitating. So it somehow fooled BirdNET. Not enough to say, okay, it’s this other bird, but it fooled it so much that it said, “Yeah, there’s a slight chance of it being another bird”.

So this is really interesting to see that BirdNET also recognizes these imitated birds, but oftentimes what happens is if you have birds that not just imitate other birds, but also environmental sounds like car alarms, maybe even human speech, cats, they’re screaming and something. Like the European Starling, for instance, there’s such diversity of bird species. It’s almost impossible to get a positive ID if you have a mocking Starling and then it’s just doing some random sounds that it just picked up. So it sometimes can be really challenging to adjust to that because there’s not a whole lot of training recordings. And typically if it’s a Mockingbird recording, it doesn’t say which bird it is mocking, so it’s really hard to adjust to that. And those birds are some of the hardest challenges that we’re facing.

[Leo] Wow, I can imagine. I’m also, Oh, I’m seeing questions about how to record sounds for later. And I know there is a spot in the app that says “save”, and somebody was asking, “Where does it save to?”

[Stefan] Yeah, so if you’re out in the field and you don’t have an internet connection and you have selected an interval, there’s these two buttons, one that says analyze and the other says save, and it will save the recording that you just selected, only the snippet that you selected, as an observation. And if you go to the main menu and then to the list of my observations, it will show you this as not analyzed. And then you know this is, right.

[Leo] So I’m sorry, I’m trying this here.

[Stefan] Sure.

[Leo] Then- So I’m gonna hit save.

[Stefan] Right.

[Leo] Okay. And now-

[Stefan] Saved as observation.

[Leo] Saved as observation, and where do I go again?

[Stefan] The main menu.

[Leo] Okay.

[Stefan] And there is Show Observations and there’s this list, and the first one right now, it says it’s not analyzed. And that’s the recording you just saved, so you can click on it.

[Leo] So this is something that I would do when I don’t have an internet connection?

[Stefan] Right.

[Leo] When I get back home, I have an internet connection.

[Stefan] Right, and then there’s the Analyze button, you can press it, and then hopefully it says it’s Wood Thrush. And then if you go back to the observations menu, it will show up as a Wood Thrush observation.

[Leo] Okay, fantastic. Taught me something new.

[Stefan] Yeah, I know it’s not… This button was added because we got so much feedback. People requesting this, “Oh my God, how do we include this in the UI?” It’s not easy, we had to, okay, just have this additional button and say, it’s an observation, and hopefully people figure it out. But yeah, if we’re doing a second version of it then we’ve got to do some things better than we did before, so.

[Leo] Excellent, I’ve got maybe one more question before we run out of time here. So a lot of people have asked about connections between BirdNET and Macaulay Library and eBird. So I know a lot of the sounds that were used to train the app came from Macaulay Library, as well as other databases like xeno-canto, right?

[Stefan] Yeah.

[Leo] And so when people submit observations to eBird, if they submit audio recording media along with their checklist that goes into Macaulay Library. And so a lot of that Citizen Science recorded media has gone into training the app.

[Stefan] Right.

[Leo] But I’ve heard… I’ve been seeing questions about, “Can people then go back and use BirdNET to check their ID on the sounds they’ve submitted, or can they submit data directly from the app to eBird?”

[Stefan] Right, this is not as easy as it might sound because we don’t really know if this detection that you just made is really the right ID, since we have only 80% precision. So there’s a good chance that it’s not the right species and you don’t want to have this in eBird. But what we’re working on right now is having a link in the app where you can click on it and then if you have the eBird or the Macaulay app installed on your device, it will start this application. And then you can submit your observation if you’re really confident.

So there’s no real link between BirdNET and eBird, because we don’t wanna push all those false detections to eBird. But if you’re really confident, and then in one of the next versions, we will have this link and you can start eBird and then submit your observation.

[Leo] Okay, fantastic. So it is right at one o’clock. I wanna be respectful of everyone’s time, because this is when we said we would end, and I know Stefan, it’s probably dinnertime for you so–

[Stefan] Oh!

[Leo] I know I want lunch, so that means you probably want dinner.

[Stefan] And the kids might too.

[Leo] Yes, so this is so interesting. We could go on forever, but we’re gonna wrap it up here. Stefan, thank you so much for taking the time to talk with us today and thanks as well for all of your hard work in developing this amazing app. It’s a fantastic tool already. I know we’re all super excited for the versions to come and all your planned future improvements, so thank you.

[Stefan] Yeah. Thank you so much for having me. It was so nice to be able to have this in the form of these webinars, so it’s a nice thing for us too.

[Leo] Awesome, I also want to thank our audience for joining us today too. It’s really great that we’ve had such a large turnout and I’m sure we’ll have even more people watch the archived version, which we will have up online.

Before we go, if we didn’t get to your question today, please email us, and we will be happy to follow up with you more directly. For general questions about the Cornell Lab of Ornithology or about our public programs, about bird ID help, pretty much any random question you have about birds in general, please email our public information team, which is the bottom link on the screen here, cornellbirds@cornell.edu. And if you have more technical questions more specifically about BirdNET, all of these things about you know, ultrasonic sounds and all these amazing technical questions you can email Stefan and his colleagues in the Center for Conservation Bioacoustics.

Their email address is ccb-birdnet@cornell.edu. And then of course, don’t forget to visit that BirdNET website, which is birdnet.cornell.edu. Okay, so that is our show. Thank you everyone, I hope you all enjoyed it and I hope you’ll all download the free BirdNET mobile app if you haven’t already, and go have fun identifying bird sounds everywhere you go. Thanks Stefan, thanks everyone. Happy Birding.

[Stefan] Thank you. Thanks.

[Leo] Take care.

End of transcript

Stefan Kahl

Postdoctoral Fellow, Center for Conservation Bioacoustics at the Cornell Lab of Ornithology

How can computers learn to recognize birds from sounds? As a postdoc within the Center for Conservation Bioacoustics, Stefan is trying to find an...

Learn how to identify birds by sound with the Lab’s BirdNet app and website. Advances in machine learning are making it easier to identify birds by their sounds.

This was a Facebook Live event and Q & A with BirdNet developer Stefan Kahl.