Wild Edible Plant Classifier

Food resources need to increase before 2050, so it is crucial to consider alternative food sources. Convolutional Neural Networks (CNNs) can help identify them!

This project focuses on a Wild Edible Plant Classifier that compares the performance of three state-of-the-art CNN architectures: MobileNet v2, GoogLeNet, and ResNet-34. The artefact was created as part of my BSc dissertation and classifies 35 classes of wild edible plants using Transfer Learning.

Wild Edible Plant Classes

Figure 1. The 35 Wild Edible Plant classes used in the project.

The 35 classes of wild edible plants are listed in Table 1, together with the number of images per class in the dataset. The dataset consists of Flickr images obtained through the Flickr API using the rudimentary scripts in the \data_gathering folder. It can be found on Kaggle here and contains 16,535 images, with between 400 and 500 images per class.
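To illustrate how images might be requested from Flickr, here is a minimal sketch that builds a search URL for the public Flickr REST API. It is not the code from the \data_gathering folder, which is not shown here. The function name `build_search_url`, the placeholder API key, and the choice of search text are assumptions made for illustration; the sketch only constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Public REST endpoint of the Flickr API.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_search_url(api_key, query, per_page=100, page=1):
    """Build a flickr.photos.search URL for a free-text query.

    Hypothetical helper: the project's own scripts may query Flickr
    differently (for example through a client library).
    """
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "text": query,            # free-text search, e.g. a plant class name
        "per_page": per_page,     # number of results per page
        "page": page,             # page of results to request
        "format": "json",         # ask for a JSON response
        "nojsoncallback": 1,      # plain JSON rather than a JSONP wrapper
    }
    return f"{FLICKR_REST_ENDPOINT}?{urlencode(params)}"


# Example with a placeholder key; a real key is needed to send the request.
url = build_search_url("YOUR_API_KEY", "cow parsley")
print(url)
```

Building one URL per class name and paging through the results is one plausible way to collect several hundred images per class.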

Class Quantity
Alfalfa 470
Allium 481
Borage 500
Burdock 460
Calendula 500
Cattail 466
Chickweed 488
Chicory 500
Chive blossoms 455
Coltsfoot 500
Common mallow 439
Common milkweed 469
Common vetch 451
Common yarrow 474
Coneflower 500
Cow parsley 500
Cowslip 442
Crimson clover 400
Crithmum maritimum 433
Daisy 490
Dandelion 500
Fennel 452
Fireweed 500
Gardenia 500
Garlic mustard 409
Geranium 500
Ground ivy 408
Harebell 500
Henbit 500
Knapweed 500
Meadowsweet 456
Mullein 500
Pickerelweed 454
Ramsons 489
Red clover 449
Total 16,535

Table 1. A detailed list of the Wild Edible Plant classes with the number of images per class within the dataset.
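The figures in Table 1 can be checked with a few lines of Python. The sketch below copies the per-class counts from the table and summarises them; the names `CLASS_COUNTS` and `summarise` are illustrative and do not come from the project's notebooks.

```python
# Image counts per class, copied from Table 1.
CLASS_COUNTS = {
    "Alfalfa": 470, "Allium": 481, "Borage": 500, "Burdock": 460,
    "Calendula": 500, "Cattail": 466, "Chickweed": 488, "Chicory": 500,
    "Chive blossoms": 455, "Coltsfoot": 500, "Common mallow": 439,
    "Common milkweed": 469, "Common vetch": 451, "Common yarrow": 474,
    "Coneflower": 500, "Cow parsley": 500, "Cowslip": 442,
    "Crimson clover": 400, "Crithmum maritimum": 433, "Daisy": 490,
    "Dandelion": 500, "Fennel": 452, "Fireweed": 500, "Gardenia": 500,
    "Garlic mustard": 409, "Geranium": 500, "Ground ivy": 408,
    "Harebell": 500, "Henbit": 500, "Knapweed": 500, "Meadowsweet": 456,
    "Mullein": 500, "Pickerelweed": 454, "Ramsons": 489, "Red clover": 449,
}


def summarise(counts):
    """Return simple summary statistics for a class -> image count mapping."""
    total = sum(counts.values())
    smallest_class = min(counts, key=counts.get)
    largest_count = max(counts.values())
    return {
        "classes": len(counts),
        "total": total,
        "mean": total / len(counts),
        "smallest": (smallest_class, counts[smallest_class]),
        "largest_count": largest_count,
        # Number of classes that hold the largest count.
        "classes_at_max": sum(1 for c in counts.values() if c == largest_count),
        # Ratio of largest to smallest class; 1.0 would be perfectly balanced.
        "imbalance_ratio": largest_count / counts[smallest_class],
    }


summary = summarise(CLASS_COUNTS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The counts sum to the 16,535 images stated in Table 1, and the ratio between the largest and smallest class is 1.25, so the dataset is only mildly imbalanced.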

The project is divided into three Jupyter Notebooks. The first visualises a sample of the plant classes, stored in a zip file inside the dataset folder, and covers steps 4 to 6 of the Machine Learning Pipeline diagram (Figure 2). The second focuses on tuning the CNN models, and the third visualises their results.

Machine Learning Pipeline

Figure 2. Machine Learning Pipeline diagram.

The dissertation report (in PDF format) is available in the GitHub repository here. Its abstract is reproduced below.

Maintaining a steady flow of food is becoming increasingly difficult. With the global population predicted to reach 9.6 billion by 2050, new food sources are required. The research conducted in this paper shows that Deep Learning (DL) applications can improve the quantity and quality of harvests produced when using disease identification and plant recognition techniques.

However, they have not yet been used to identify natural vegetation as a potential food source. This study aims to understand the role DL plays in agricultural food production, expand DL's use within horticulture, and potentially identify new food sources within natural environments for daily consumption. Throughout this paper, various toolsets, machine environments, and research methods are discussed, assisting in determining the best methodology for creating an artefact that identifies wild edible plants. While the artefact focuses on three state-of-the-art Convolutional Neural Networks (CNNs), additional information found within this report includes the components of CNNs, accompanied by the design and development of the artefact itself.

The three architectures, GoogLeNet, MobileNet v2, and ResNet-34, were built using the open-source deep learning framework PyTorch, where 36 variants of these models were created and tested using 12 different parameters. The models' performance was evaluated based on six performance metrics to classify 35 classes of wild edible plants, each trained and tested on a dataset containing 16,535 images. Overall, the models achieved classification accuracies of 74.29%, 82.85%, and 80.35% for GoogLeNet, MobileNet v2, and ResNet-34, respectively.
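The abstract does not name the six performance metrics, so the sketch below is only an illustration of how common multi-class metrics can be computed from true and predicted labels. The choice of accuracy and macro-averaged precision, recall, and F1 is an assumption, as are the function name `macro_metrics` and the toy labels, which are not results from the project.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    Illustrative helper; the dissertation's six metrics may differ.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    # Count true positives, false positives, and false negatives per class.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]   # times the class was predicted
        actual = tp[label] + fn[label]      # times the class truly occurred
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for three of the 35 classes (made up for illustration only).
y_true = ["Daisy", "Daisy", "Fennel", "Fennel", "Borage", "Borage"]
y_pred = ["Daisy", "Fennel", "Fennel", "Fennel", "Borage", "Daisy"]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

Macro averaging gives every class equal weight, which suits a dataset like this one where class sizes are similar but not identical.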