Identification of Grape Varieties via Digital Leaf Image ... - Oiv2010.ge

Identification of Grape Varieties via Digital Leaf Image Processing by Computer

1 J. ZHANG, 2 P. YANNE * and 3 H. LI

1,2 College of Information Engineering, Northwest A&F University

Yangling, Shaanxi, China

pyanne@nwsuaf.edu.cn (corresponding author) *

3 College of Oenology, Northwest A&F University

lihuawine@nwsuaf.edu.cn

ABSTRACT

Grape variety identification is of great significance for resource statistics, the detection of new species and the protection of genetic resources. Building on the classical ampelographic identification method and on machine learning and pattern recognition techniques from computer science, we propose a new cheap and fast identification method based on leaf image processing. We demonstrate its feasibility with a prototype that classified 354 leaf images belonging to 20 varieties with an accuracy of 87%. Our techniques can be applied to the computer-aided diagnosis of grape leaf diseases and to the discovery of new varieties, as well as to quantifying the classical ampelographic identification method. We propose further work to turn this new method from a prototype into a practical software product.

Keywords: Grape variety identification; Ampelography; Pattern recognition; Image processing; Hu's moment invariants

RÉSUMÉ

L'identification des cépages est très importante pour les statistiques de ressources, la détection des nouvelles espèces et la protection des ressources génétiques. Basés sur la méthode classique de l'identification ampélographique et grâce aux récents progrès en informatique, notamment la reconnaissance des formes et l'apprentissage automatique, nous proposons une nouvelle méthode efficace et automatique par ordinateur pour l'identification des cépages. Nous démontrons sa faisabilité via la réalisation d'un prototype qui a pu classifier 354 fichiers de feuilles, appartenant à 20 cépages, avec une précision de 87%. Notre technique peut s'appliquer au diagnostic des maladies de vigne, à la découverte de nouvelles espèces et à la quantification de la méthode classique de l'identification ampélographique. Nous suggérons des directions de recherche future afin de transformer notre prototype en produit logiciel.


I. Introduction

There are over 10,000 grape varieties throughout the world. About 3,000 of them are widely cultivated, and many are wine varieties [Zhai, 2001]. Grape variety identification is of great significance for resource statistics, the detection of new species and the protection of genetic resources. The OIV's Strategic Framework includes the task of recognising new viticultural varieties [OIV, 2005].

The classical identification method is based on ampelography [Galet, 1990; Tassie and Blieschke, 2008]. Newer methods have also been developed, using approaches such as DNA molecular genetic markers [Bower et al., 1993; Zhang et al., 1996; Testier et al., 1999], pollen morphology [Wang and Li, 2000] and anthocyanin analysis [Wendelin and Barna, 1994]. All these methods need expert intervention and are hence quite expensive. Some of them also need special equipment and take a long time. Today, computer technologies have a wide range of applications in many fields, including grape production. There are many successful examples of computers being used for image processing [Li et al., 2007; Barbu, 2009] and for the identification of plant species [Ye et al., 2004] based on pattern recognition. We seek a new method for identifying grape varieties that combines these computer techniques with classical ampelography. Based on the processing of digital grape leaf images, this method would be rapid, efficient and nearly automatic, with little or no human intervention. Our research objective is to develop a software product, available on the web, that can tell a user the variety of the grape leaf whose image he or she uploads.

The ampelographic identification of grape varieties is based on observing features of certain organs of the vine, such as the flower, berry, shoot and leaf. The OIV has produced 2 editions [OIV, 1983; OIV, 2009] of the document "OIV descriptor list for grape varieties and Vitis species", which defines as a standard the ampelographic characteristics for the identification of Vitis varieties and species. Using the 128 characteristics selected by [OIV, 1983], each of which is assigned a code and may take values from 1 to 9, [OIV, 2000] describes 250 wine grape varieties of its member states by assigning a value to the descriptor codes of each variety. For a given grape sample, if each of its codes has the same value as for a variety V among the 250 in [OIV, 2000], the sample is classified as V. Ampelographic experts agree that the features of the mature leaf are the most decisive for variety identification. Of the 128 codes, 35 concern the leaf and 29 the mature leaf. [OIV, 2009] adds another 18 codes, numbered 601 to 618, for the mature leaf. Of the 14 codes on the "Primary descriptor priority list", 9 concern the leaf.

In our approach, the main idea is to let the computer calculate all the code values instead of having a person measure them. The computer can then compare these values against the known ones, as in [OIV, 2000], to find the right variety. However, some code values are not easy to calculate, and not all of them are needed for identification. Furthermore, some features not selected by [OIV, 2009] may also help to distinguish or identify varieties, for example Hu's moment invariants of an image [Hu, 1962].


A digital image is a matrix of pixels f(x, y), where (x, y) is the index, or coordinate, in the matrix. Each pixel f(x, y) represents one dot of the image and is described by one or more numbers. In a binary image, like a photo in an old newspaper, a pixel f(x, y) is either 0 for white or 1 for black. In a colour image taken by a digital camera, a pixel is a combination of three basic colours with different intensities. Hu defined 7 moment invariants for a digital image. Each invariant is easily calculated as a function of the pixels f(x, y). The values of the 7 invariants are nearly independent of the rotation, position and size of the object in the image. They have been successfully used in computer pattern recognition applications such as car registration number [Liu and Lu, 2008], static hand gesture [Liu et al., 2008], tiger variety [Xu and Qi, 2009], human face [Gan and Zhang, 2002] and corn leaf disease recognition [Shen et al., 2008]. Yanhua YE and Chun CHEN of Hong Kong Polytechnic University have developed a Computer Plant Species Recognition System, CPSRS [Ye et al., 2004], which provides a convenient and efficient way to search for and identify plant species from a digital image file.
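
To make the idea of moment invariants concrete, the following is a minimal Python sketch (not the authors' Matlab code) that computes the 7 invariants from the standard definitions based on normalised central moments. The function name `hu_moments` and the toy binary "leaf" are assumptions made for illustration only.

```python
import numpy as np

def hu_moments(image):
    """Return Hu's 7 moment invariants of a 2-D intensity array f(x, y)."""
    f = np.asarray(image, dtype=float)
    y, x = np.mgrid[: f.shape[0], : f.shape[1]]
    m00 = f.sum()
    # Offsets from the centroid, so the moments do not depend on position.
    dx = x - (x * f).sum() / m00
    dy = y - (y * f).sum() / m00

    def eta(p, q):
        # Central moment mu_pq, normalised so that it does not depend on size.
        return (dx**p * dy**q * f).sum() / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03          # sums that recur in the formulas
    c, d = n30 - 3 * n12, 3 * n21 - n03
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11**2,
        c**2 + d**2,
        a**2 + b**2,
        c * a * (a**2 - 3 * b**2) + d * b * (3 * a**2 - b**2),
        (n20 - n02) * (a**2 - b**2) + 4 * n11 * a * b,
        d * a * (a**2 - 3 * b**2) - c * b * (3 * a**2 - b**2),
    ])

# Hypothetical toy "leaf": an asymmetric blob in a binary image.
leaf = np.zeros((40, 40))
leaf[5:25, 8:14] = 1
leaf[18:25, 8:30] = 1

moved = np.roll(leaf, (7, 5), axis=(0, 1))   # same shape, shifted
turned = np.rot90(leaf)                      # same shape, rotated by 90 degrees
print(np.allclose(hu_moments(leaf), hu_moments(moved)))
print(np.allclose(hu_moments(leaf), hu_moments(turned)))
```

Shifting or rotating the blob by a quarter turn leaves the 7 values unchanged up to rounding. For arbitrary angles and for resizing, the values of a pixel image agree only approximately, which is why the invariants are described as nearly independent of rotation, position and size.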

Building on the works mentioned above, which form the cornerstone of our method, we present our method in detail and test it by implementing a software prototype on an ordinary personal computer. We then analyse our experimental results and discuss some of the choices made, the remaining problems, and possible improvements and applications. We conclude on the feasibility of our new method and point out future work.

II. Materials and Methods

Our identification method consists of 4 steps: 1) collect typical mature sample leaves of the grape varieties we want to identify, 2) scan the leaves into digital image files, 3) select a set of characteristics, or features, that are useful for identification and can be computed from the images, 4) build a software classifier based on the features calculated from the sample files.

1) Collect mature sample leaves

Following the requirements of the OIV [OIV, 2009], for each variety we collected about 10 mature leaves from the middle third of different shoots, between berry set and veraison. The leaves were collected from the grape variety cultivation field of the College of Oenology, Northwest A&F University, in Yangling, Shaanxi, China. In total there are 500 leaves belonging to 3 wild local varieties and 47 cultivated ones, including Sauvignon, Riesling, Traminer, Sémillon, Chenin Blanc, Ugni Blanc, Müller-Thurgau, Cabernet Sauvignon, Carignan, Gamay, Syrah, Muscat, etc.

2) Obtain digital leaf files

We scanned both sides of each leaf on 3 ordinary A4 scanners using their default parameters. This gave a total of 1073 colour leaf image files at a resolution of 300 DPI (dots per inch). In our software prototype, we used 354 leaf files of 20 varieties.


3) Select features for identification

Naturally, we considered the 47 mature-leaf features coded by the OIV [OIV, 2009]. More research is needed before some of them can be calculated, e.g. "density of prostrate hairs between the main veins on lower side of blade", OIV code 84. We have found a way to calculate others, including the size and circumference of the blade, the length of the petiole and the length of the veins. We also select some features considered neither by the OIV nor by ampelography, which are easy to calculate and useful for identification, e.g. Hu's 7 moment invariants. To build our prototype quickly, and using the criteria of being both computable and useful, we finally selected the size and circumference of the blade and Hu's 7 moment invariants to form a feature set, or vector, of 9 dimensions.
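
As an illustration of how the two geometric features might be measured, here is a minimal Python sketch. It assumes that the blade has already been separated from the scanner background as a binary mask (the segmentation step is not described here), and it estimates the circumference by counting pixel edges between leaf and background, which is only one of several possible estimators. The function name `blade_size_and_circumference` is hypothetical.

```python
import numpy as np

CM_PER_INCH = 2.54

def blade_size_and_circumference(mask, dpi=300):
    """Return (area in cm^2, perimeter in cm) of the foreground of a binary mask."""
    mask = np.asarray(mask, dtype=bool)
    pixel_cm = CM_PER_INCH / dpi                 # side length of one pixel in cm
    area = mask.sum() * pixel_cm**2
    # Perimeter estimate: count pixel edges that separate leaf from background.
    padded = np.pad(mask, 1, constant_values=False)
    edges = (padded[:, 1:] != padded[:, :-1]).sum() \
        + (padded[1:, :] != padded[:-1, :]).sum()
    return area, edges * pixel_cm

# Hypothetical check: a 300 x 300 pixel square scanned at 300 DPI is 1 inch wide.
square = np.zeros((400, 400), dtype=bool)
square[50:350, 50:350] = True
area_cm2, perimeter_cm = blade_size_and_circumference(square)
print(area_cm2, perimeter_cm)
```

For the test square the result is about 6.45 cm² and 10.16 cm, i.e. 1 square inch and 4 inches. Edge counting overestimates the length of slanted or curved outlines, so a real leaf outline would need a finer estimator.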

4) Build a software classifier

The mathematical basis of our method is as follows. Each leaf image is represented by a feature vector $L_j = (f_{j1}, \dots, f_{j9})$, where each $f_{ji}$ is a real number. Such a vector is a point in the 9-dimensional feature space. We expect the Euclidean distance

$$D(L_1, L_2) = \sqrt{(f_{11} - f_{21})^2 + \dots + (f_{19} - f_{29})^2}$$

between 2 grape leaves $L_1$ and $L_2$ of the same variety to be smaller, on average, than that between leaves of 2 different varieties. For the variety $V_i$, we define its mass centre $C_i = (c_{i1}, \dots, c_{i9})$, where

$$c_{im} = \frac{1}{n} \sum_{j=1}^{n} f_{jm},$$

$f_{jm}$ is the m-th feature of the j-th leaf sample $L_{ji}$ of the variety $V_i$ and n is the number of samples of that variety, and its radius $R_i = \max_{j = 1, \dots, n} D(C_i, L_{ji})$. For a given leaf vector $L_j$, if exactly one variety S satisfies $D(L_j, C_S) \le R_S$, we conclude that the leaf $L_j$ belongs to the variety S.
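
The centre and radius rule above can be sketched in a few lines of Python. This is an illustrative sketch rather than the prototype's code: the function names, the 2-dimensional toy vectors and the variety labels "A" and "B" are assumptions, and returning `None` when zero or several spheres contain the leaf is a choice made here.

```python
import numpy as np

def fit_spheres(samples):
    """Map each variety to (mass centre, radius) computed from its sample vectors."""
    spheres = {}
    for variety, vectors in samples.items():
        vectors = np.asarray(vectors, dtype=float)
        centre = vectors.mean(axis=0)
        # Radius: largest Euclidean distance from the centre to a sample.
        radius = np.linalg.norm(vectors - centre, axis=1).max()
        spheres[variety] = (centre, radius)
    return spheres

def classify_by_sphere(vector, spheres):
    """Return the variety whose sphere alone contains the vector, else None."""
    vector = np.asarray(vector, dtype=float)
    hits = [variety for variety, (centre, radius) in spheres.items()
            if np.linalg.norm(vector - centre) <= radius]
    return hits[0] if len(hits) == 1 else None

# Hypothetical 2-dimensional toy data (the paper uses 9 dimensions).
spheres = fit_spheres({"A": [[0, 0], [2, 0]], "B": [[10, 10], [12, 10]]})
print(classify_by_sphere([1.0, 0.5], spheres))   # inside the sphere of A only
print(classify_by_sphere([5.0, 5.0], spheres))   # inside no sphere
```

Because the features have very different numeric ranges, a practical version would probably rescale each dimension before computing distances; the text does not say whether the prototype did so.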

Unfortunately, for the 354 vectors of 20 varieties, the mass centres are so close and the radii so large that the 9-dimensional sphere of a variety $S_i$, with centre $C_i$ and radius $R_i$, intersects the spheres of other varieties. To reduce the space occupied by each variety, we improve the above method by using the 9-dimensional cube inscribed in the sphere. This is done by finding, for every variety, the value range of each dimension of the vector. Some leaves still cannot be distinguished. For this case, assuming that the values of each dimension for each variety follow a normal distribution, we introduce the probability of a leaf L belonging to a variety S by the following formula:

$$P(L, S) = 1 - \sum_{j=1}^{9} \frac{2\, dL_j}{r(S, j)} \cdot \frac{1}{9}, \qquad dL_j = L_j - \frac{1}{2}\, r(S, j), \qquad r(S, j) = \max(f_{sj}) - \min(f_{sj})$$

where $\max(f_{sj})$ and $\min(f_{sj})$ are the maximum and minimum values of the j-th dimension over all samples of the variety S.

Finally, we build our classifier with the following algorithm:


1) Compute the feature vector $L_i = (f_{i1}, \dots, f_{i9})$ of a leaf image i and compare each $f_{ij}$ with the value range $[\min(f_{sj}), \max(f_{sj})]$ of every variety S (S = 1 to 20 and j = 1 to 9).

2) If $\min(f_{sj}) \le f_{ij} \le \max(f_{sj})$ holds for all j = 1 to 9 for only one variety S, then the leaf $L_i$ belongs to the variety S.

3) Otherwise, the leaf $L_i$ satisfies the relation in step 2) for m varieties $S_1, \dots, S_m$. We compute all the probabilities $P(L_i, S_k)$ for k = 1 to m, and assign $L_i$ to the variety $S_k$ for which $P(L_i, S_k)$ is the largest.
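
The three steps above can be sketched in Python as follows. This is a hedged illustration, not the prototype itself. Two points are assumptions made here because the text leaves them open: the deviation of a feature is taken as its absolute distance from the midpoint of the variety's value range, so that the score stays between 0 and 1 inside the range, and a leaf that falls inside no range is returned as `None`. All function names and the toy data are hypothetical.

```python
import numpy as np

def fit_ranges(samples):
    """Map each variety to the (min, max) of every feature over its samples."""
    return {variety: (np.min(vectors, axis=0).astype(float),
                      np.max(vectors, axis=0).astype(float))
            for variety, vectors in samples.items()}

def membership_score(vector, low, high):
    """Score in [0, 1] for a vector inside the box; 1 means at the box centre."""
    span = high - low
    deviation = np.abs(vector - (low + span / 2))   # assumption: |distance to midpoint|
    safe_span = np.where(span > 0, span, 1.0)       # avoid dividing by a zero range
    return 1.0 - np.mean(2 * deviation / safe_span)

def classify(vector, ranges):
    """Steps 1 to 3: range test first, then the highest score among candidates."""
    vector = np.asarray(vector, dtype=float)
    candidates = [variety for variety, (low, high) in ranges.items()
                  if np.all((low <= vector) & (vector <= high))]
    if not candidates:
        return None                                 # case not covered by the text
    if len(candidates) == 1:
        return candidates[0]
    return max(candidates, key=lambda variety: membership_score(vector, *ranges[variety]))

# Hypothetical 2-dimensional toy data with overlapping value ranges.
ranges = fit_ranges({"A": [[0, 0], [4, 4]], "B": [[3, 3], [9, 9]]})
print(classify([1.0, 1.0], ranges))   # only inside the range of A
print(classify([3.9, 3.9], ranges))   # inside both ranges, closer to the middle of B
```

In the second call the leaf lies in both ranges; its score is 0.05 for A and 0.30 for B, so it is assigned to B.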

III. Results and Discussion

We developed a software prototype in Matlab that implements the above algorithm in order to verify our method. Figure 1 shows a run of the prototype in the Matlab environment. The user selects a leaf image and then asks for its classification. The prototype displays the image in 3 modes and prints the variety name:

Figure 1. Execution of classification prototype

We tested our algorithm on the 354 image files belonging to 20 varieties. The correct classification rate is 87%. This rate was obtained by calling the classifier on all 354 files and computing the percentage of correctly classified ones.


The accuracy decreases as the number of varieties increases. We may address this problem by increasing the dimension of the feature vector, i.e. by finding more features, and by using a better classification algorithm such as SVM [Wu et al., 2008]. The latter is a well-known machine learning method for classification, in which the computer learns from examples. After being trained on many leaf samples from each grape variety, the software can decide to which variety a new leaf belongs.

Our prototype can only classify scanned images. To classify digital camera images, we have to consider factors such as shooting distance and focus. On the other hand, a camera can take photos from different angles, which may allow us to distinguish the prostrate hairs of a leaf from the erect ones. This problem, however, would be better resolved by a 3-dimensional (3D) camera or scanner. With 3D imaging, we could more easily calculate other leaf features such as the profile of the leaf (OIV code 74 [OIV, 2009]). Wavelengths outside the visible range, such as infrared, microwave and terahertz [Lu, 2002; Xing and Baerdemaeker, 2005], can also be used to obtain digital leaf images. Such images should supply complementary features useful for variety identification. By combining these techniques, we expect to be able to calculate all the 45 codes selected by the OIV and hence to identify all grape varieties by computer-based digital image processing.

This new identification method may not only simplify the identification procedure; its techniques can also improve the classical ampelographic identification method. The current OIV Descriptor List [OIV, 2009] uses the code values in a qualitative way. For example, code 65, the size of the blade, takes the values 1, 3, 5, 7 and 9, which mean very small, small, medium, large and very large respectively. Our method can calculate the size of the blade in square inches or cm² efficiently and automatically. We may do this for all quantifiable codes of all known varieties. With these quantitative values, we may define a value range for each current qualitative value and check whether the code values for the 250 varieties in [OIV, 2000] are coherent. These techniques can also be used to detect a new variety and to guess the parent varieties of a new hybrid. The feature vector of a new variety will not fall within any known variety, but that of a hybrid should be close to those of its parents. Computer software can easily find all the similar varieties and sort them by similarity.
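
Sorting known varieties by similarity to a new leaf could look like the following Python sketch. It assumes that similarity is measured by the Euclidean distance between the leaf's feature vector and each variety's mass centre, which is one plausible reading rather than a stated design. The function name and the toy centres are hypothetical.

```python
import numpy as np

def rank_similar_varieties(vector, centres):
    """Return (variety, distance) pairs, nearest mass centre first."""
    vector = np.asarray(vector, dtype=float)
    distances = {variety: float(np.linalg.norm(vector - np.asarray(centre, dtype=float)))
                 for variety, centre in centres.items()}
    return sorted(distances.items(), key=lambda item: item[1])

# Hypothetical 2-dimensional centres for three known varieties.
centres = {"A": [0.0, 0.0], "B": [3.0, 4.0], "C": [1.0, 0.0]}
ranking = rank_similar_varieties([0.0, 0.0], centres)
print(ranking)
```

For a suspected hybrid, the first entries of such a ranking would be the candidate parents to examine further.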

IV. Conclusions

Building on the classical ampelographic identification method and on machine learning and pattern recognition techniques from computer science, we proposed a new cheap and fast identification method based on leaf image processing. We demonstrated its feasibility by implementing a software prototype that classified 354 leaf images belonging to 20 varieties with an accuracy of 87%.

We are continuing our research to increase both the number of grape varieties and the accuracy, by calculating more features from digital images and by improving the classification algorithms.


Acknowledgments

We would first like to express our gratitude to Dr Jean-Claude Ruf, head of the OIV's vitiviniculture science and techniques department, for his advice during his visit to our university on the occasion of the 6th International Symposium on Viticulture and Enology in Yangling, Shaanxi, China. Mrs A. Tsioli, head of the OIV's viticulture unit, supplied us with a lot of useful information. The whole research team of the grape variety identification project at our university contributed to this work, especially Dr JF NING, who suggested adopting Hu's moment invariants, Dr C CAI and his students, who collected the leaf samples and scanned the images, and Dr Y ZHANG, for his encouragement. Finally, we would like to thank the students of our team, ZG FENG, WJ HAN, Z SONG, H ZHANG and Y ZHANG.

Bibliography

Barbu T., 2009. Content-based image retrieval using Gabor filtering. In: Proceedings - 20th International Workshop on Database and Expert Systems Applications. New York: Institute of Electrical and Electronics Engineers Inc. 2009: 236-240.

Bower J.E., Bandman E.B., Meredith C.P., 1993. DNA fingerprinting characterization of some wine grape cultivars. American Society for Enology and Viticulture, 44: 266-271.

Galet P., 1990. French grapevine varieties and vineyards. Volume 2. The French ampelography. 2nd edition. Montpellier.

Gan J.Y., Zhang Y.W., 2002. Face Recognition Based on Moment Invariants and Neural Networks. Computer Engineering and Applications, 38(7): 53-57.

Hu M.K., 1962. Visual Pattern Recognition by Moment Invariants. IRE Transactions on Information Theory, 8(2): 179-187.

Li Y., Chi Z.R., Feng D.D., 2007. Leaf vein extraction using independent component analysis. In: 2006 IEEE International Conference on Systems, Man and Cybernetics. New York: Institute of Electrical and Electronics Engineers Inc. 5: 3890-3894.

Liu J.M., Lu K., 2008. The identification of car logo based on Hu's invariant moments. Scientific and Technical Information (Academic Edition), 36: 76-81.

Liu Y., Gan Z.J., Sun Y., 2008. Static Hand Gesture Recognition and its Application based on Support Vector Machines. In: Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing. Ninth ACIS International Conference, Phuket: 2008(9): 517-521.

Lu R., 2002. Detection of bruises on apples using near-infrared hyperspectral imaging. Information & Electrical Technologies Division of ASAE, vol. 46(2): 1-8.


OIV, 1983. 1st Edition of the OIV Descriptor list for grape varieties and Vitis species. http://www.oiv.int.

OIV, 2000. Description of world vine varieties. http://www.oiv.int.

OIV, 2005. RESOLUTION AG 1/2005, OIV Strategic Framework 2005-2008. http://www.oiv.int.

OIV, 2009. 2nd Edition of the OIV Descriptor list for grape varieties and Vitis species. http://www.oiv.int.

Shen W.Z., Wu Y., Chen Z.L., Wei H.D., 2008. Grading method of leaf spot disease based on image processing. In: Proceedings - International Conference on Computer Science and Software Engineering. NJ: IEEE Computer Society. Vol. (6): 491-494.

Tassie L., Blieschke N., 2008. Ampelography: do you know what variety you are planting in your vineyard or nursery. Australian & New Zealand Grape grower & Winemaker: national journal of the grape and wine industry, No. 537.

Testier C., David J., This P., Boursiquot J.M., Charrier A., 1999. Optimization of the choice of molecular markers for varietal identification in Vitis vinifera L. Theoretical and Applied Genetics, 98(1): 171-177.

Wang X.D., Li C.L., 2000. A study on pollen morphology of the genus Vitis L. Acta Phytotaxonomica Sinica, 38(1): 43-52.

Wu D.K., Xie C.Y., Ma C.W., 2008. The SVM classification leafminer-infected leaves based on fractal dimension. In: 2008 IEEE International Conference on Cybernetics and Intelligent Systems, CIS 2008. NJ: Inst. of Elec. and Elec. Eng. Computer Society. 2008: 147-151.

Xing J., Baerdemaeker J.D., 2005. Bruise detection on 'Jonagold' apples using hyperspectral imaging. Postharvest Biology and Technology, 37(2005): 152-162.

Xu Q.J., Qi D.W., 2009. Parameters for Texture Feature of Panthera tigris altaica Based on Gray Level Co-occurrence Matrix. Journal of Northeast Forestry University, 37(7): 125-130.

Ye Y.H., Chen C., Li C.T., Fu H., Chi Z.R., 2004. A Computerized Plant Species Recognition System. In: Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing. Hong Kong: Institute of Electrical and Electronics Engineers Inc. 2004: 723-726.

Zhai H., 2001. The main grape varieties used for processing. In: Wine Grape Growing and Processing Techniques. Yang T.Q., 1st edition. Beijing: China Agriculture Press. 1: 48.

Zhang L.P., Lin B.N., Shen D.X., 1996. The extraction, purification and identification of RFLP of the chromosomal DNA of Grape. Journal of Fruit Science, 13(2): 71-74.
