RBU_JR_LIS_V23_2021-FULL_TEXT-E-Copy
The RBU Journal of Library & Information Science is a scholarly communication for education, research and development in the field of Library & Information Science. It is published annually. The first volume was published in 1997. It received its ISSN (0972-2750) with the 5th volume in 2001. From the 17th volume, published in 2015, the journal has been peer-reviewed by eminent experts from across the country. The journal was included in the UGC Approved List of Journals in 2017, with Serial No. 351 and Journal No. 45237. Since 2019, the journal has qualified, as per the analysis protocol, as a Group D journal and is listed in the UGC CARE approved list of journals.
RBU Journal of library & Information Science, V. 23, 2021
books, $v subfield in 5 books, $x subfield in 4 books, and
$z subfield in 3 books. No matches were found under
subfields $d and $y. From the comparison given in Table 3
below, it can be concluded that subfield $a is the most
popular with users, and that the popularity of the other
subfields, in decreasing order, is $v, $z, $x.
Number of books/records = 100

MARC subfields used in LCSH (N=100)                          $a           $x          $v          $z          $d
Number of titles with this subfield in LCSH descriptors      100 (100%)   21 (21%)    10 (10%)    7 (7%)      1 (1%)
Number of titles with at least one match with LCSH terms     49           4           5           3           0
Percentage                                                   49%          19.04%      50%         42.85%      0%

Table 3: Comparison of social tags with LCSH descriptors from the MARC subfield point of view
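The percentage row of Table 3 can be re-derived by dividing the number of titles with at least one match by the number of titles carrying that subfield. The following is a minimal Python sketch under that assumption; the variable names are illustrative and are not taken from the study. Rounding to two decimals gives 19.05 and 42.86 where the table prints 19.04 and 42.85, which suggests that the printed values were truncated rather than rounded.

```python
# Counts copied from Table 3; the variable names are illustrative assumptions.
subfields = ["$a", "$x", "$v", "$z", "$d"]
titles_with_subfield = [100, 21, 10, 7, 1]   # titles whose LCSH descriptors use the subfield
titles_with_match = [49, 4, 5, 3, 0]         # titles with at least one matching social tag

match_percentages = {}
for code, total, matched in zip(subfields, titles_with_subfield, titles_with_match):
    # Share of titles using this subfield that have at least one matching tag
    match_percentages[code] = 100 * matched / total

for code, pct in match_percentages.items():
    print(f"{code}: {pct:.2f}%")
```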
7 Similarity and distance measurement based on the Jaccard similarity coefficient
In this study, the most frequently used social tags and the most
frequently used LCSH descriptors were analyzed in order to identify
whether any similarities and distances exist at the level of use. For
this purpose, the Jaccard similarity index was used. “The Jaccard Index,
also known as the Jaccard similarity coefficient, is a statistic used in
understanding the similarities between sample sets. The measurement
emphasizes similarity between finite sample sets, and is formally defined
as the size of the intersection divided by the size of the union of the
sample sets”. It is a measure of similarity between two sets of data,
ranging from 0% to 100%. The higher the percentage, the more similar the
two populations (Statistics How To, n.d.).
The formula is as follows:
Jaccard Index = (the number in both sets) / (the
number in either set)
The detailed steps are:
“(1) Count the number of members which are shared between both sets.
(2) Count the total number of members in both sets (shared and un-shared).
(3) Divide the number of shared members by the total number of members.
(4) Multiply the number you found in [step 3] by 100 (This will
produce a percentage measurement of similarity between
the two sample sets)” (Statistics How To, n.d.).
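The four steps quoted above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used in the study; the function name jaccard_percentage and the toy tag sets are hypothetical.

```python
def jaccard_percentage(set_a, set_b):
    """Return the Jaccard similarity of two sets as a percentage (hypothetical helper)."""
    shared = len(set_a & set_b)   # step 1: members shared between both sets
    total = len(set_a | set_b)    # step 2: all distinct members, shared and un-shared
    if total == 0:
        raise ValueError("Jaccard index is undefined for two empty sets")
    ratio = shared / total        # step 3: shared members divided by total members
    return ratio * 100            # step 4: express the ratio as a percentage


# Toy data for illustration only; these terms are not from the study.
toy_tags = {"history", "fiction", "war", "europe"}
toy_descriptors = {"history", "europe", "biography"}

toy_result = jaccard_percentage(toy_tags, toy_descriptors)
print(f"{toy_result:.2f}%")  # 2 shared members out of 5 distinct members
```

Counting the total on the union of the two sets ensures that shared members are counted once, which is what the definition as intersection divided by union requires.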
In set notation, the same formula is:
J(X, Y) = |X∩Y| / |X∪Y|
[where X = social tags and Y = LCSH descriptors]
For this study both data sets are as follows:
X= {1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 14, 21, 22, 28, 37, 67,
72, 98}
Y= {1, 2, 3, 5, 6, 7, 8, 10, 11, 16}
So,
J(X, Y) = |X∩Y| / |X∪Y|
J(X, Y) =|{1, 2, 3, 5, 7, 10, 11}| / |{1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 14, 16, 21, 22, 28, 37, 67, 72, 98}|
J(X, Y) = 7/21 = 0.3333
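The set arithmetic can be checked mechanically. The sketch below is an illustration only; it assumes that the two sets are exactly as printed above and uses Python's built-in set operators. Run on those printed sets, the check gives an intersection of 8 members, because 6 appears in both X and Y, whereas the worked figures above count 7. Whether the printed sets or the count carries a typographical slip cannot be determined from the text, so the reported values are left unchanged.

```python
# Sets copied as printed in the text (X = social tags, Y = LCSH descriptors).
X = {1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 14, 21, 22, 28, 37, 67, 72, 98}
Y = {1, 2, 3, 5, 6, 7, 8, 10, 11, 16}

intersection = X & Y   # members present in both sets
union = X | Y          # members present in either set

jaccard_index = len(intersection) / len(union)

print(sorted(intersection))            # members shared by X and Y
print(len(intersection), len(union))   # sizes of intersection and union
print(round(jaccard_index, 4))
```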
A result closer to 100% indicates higher similarity (e.g. 90%
is more similar than 89%). A result of 0% indicates that no
similarity is present.
This study also reports the Jaccard distance between the two sets.
“The Jaccard distance, is a measure of how dissimilar two
sets are. It is the complement of the Jaccard index and can
be found by subtracting the Jaccard Index from 1”
(Statistics How To, n.d.).
The formula is as follows:
D(X, Y) = 1 – J(X,Y)
Here, the Jaccard distance is 1 – 0.3333 = 0.6667
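The complement relationship can be sketched in Python as follows. The helper name jaccard_distance and the toy sets are hypothetical; the value 0.3333 is the similarity index reported in this study.

```python
def jaccard_distance(set_a, set_b):
    """Return 1 minus the Jaccard index of two sets (hypothetical helper)."""
    union = set_a | set_b
    if not union:
        raise ValueError("Jaccard distance is undefined for two empty sets")
    return 1 - len(set_a & set_b) / len(union)


# Distance derived from the similarity index reported in the study.
reported_index = 0.3333
reported_distance = round(1 - reported_index, 4)
print(reported_distance)

# Toy sets for illustration only: 2 shared members out of 4 distinct members.
toy_distance = jaccard_distance({1, 2, 3}, {2, 3, 4})
print(toy_distance)
```

Because the distance is defined as the complement of the index, the two values always sum to 1, so either one is sufficient to report.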
In this study, the Jaccard similarity index is 0.3333, or
33.33% (0.3333 × 100), which indicates little similarity
between social tags and descriptors. The Jaccard
distance shows that the most frequent social tags used by
users and the most frequent LCSH descriptors used by domain
experts are dissimilar.
Suggestions and Conclusions
The overall comparison between social tags and LCSH
descriptors yields several results regarding the
functionality and usability of social tags in the library.
The overlap of terms makes it clear that the vocabulary of
the social tags is larger than that of the LCSH database. Out
of all LCSH descriptors and social tags, only 51 terms
overlapped, i.e. only these 51 terms were used by both experts
and general users across the whole collection. These overlapping
terms cover only 3.06% (a very small portion) of the social tags
and 27.86% of the LCSH descriptors. This means that users
mostly use controlled terms as tags to describe books, but
experts rarely use social tags as descriptors. For the
overlapping words, Spearman's rank correlation suggests
that when a word is used as a tag (here, a LibraryThing
tag), there is an 83 percent chance of it also being used as
a descriptor. However, it is clear that there are
vocabulary differences between the two datasets.