What is Mendeley Readership?

Over the past few years, more librarians, funders, and publishers have begun to look beyond citations for indications that a work is having an impact on the research community at large. In the search for more immediate and more forward-looking indicators, they look at the traffic to the article pages hosted at their journals and repositories (much as news sites track their analytics), but they are often frustrated because the traffic is so poorly characterized. Because there’s no way to determine whether a site visit comes from an author (and thus a possible future citer) or a practitioner, the correlation between site visits or downloads and highly cited content is too weak to be predictive.

Here’s where Mendeley comes in. Because Mendeley can give a fine-grained view of how content from an institution’s researchers or from a journal is being read, it sits in a useful space between the high signal value of a citation and the speed of a site visit or download. In addition, researchers value the data for a real-time view of how their work is being read by those in their field and in other disciplines.

With that in mind, it’s important to note some limitations of Mendeley’s readership data. The readership of a document reported by Mendeley depends on the underlying population of Mendeley users. While Mendeley provides data on the academic discipline and the status or title of each reader, not all documents have a reading population that is representative of the academic community as a whole. In most cases (see Figure), however, the scale of Mendeley means that the skew is minimal. Another important consideration is that readership isn’t as clearly understood a concept as citation. This post attempts to explain what a reader count means and how it is produced.

How do we count readers?

The number of readers of a document in Mendeley is the number of copies of that document in all Mendeley libraries. However, this simple concept contains many complexities. For example, two files may be the same logical document while being different at the bit level. Frequently, a document found in an open access repository will come with a cover page that makes it a different file, or a researcher may have added annotations to the PDF using an application such as Adobe Acrobat. In addition, many older documents have no embedded metadata, or only poor-quality metadata (no DOI, for example), so they can’t always be identified as the same document from metadata alone. We had to develop a method for clustering similar documents, grouping together the different files which are the same logical document. The size of this cluster is the number of readers.

In the earlier days of Mendeley, we’d just periodically run this process across all the documents uploaded by all our users and regenerate the clusters, and thus regenerate the whole catalog. As the number of documents grew into the hundreds of millions, this process began to serve us less well. It also caused the number of readers to rise or fall unexpectedly if a subsequent run changed which cluster a file was associated with. Today, we determine which cluster an uploaded PDF belongs to in real time, so the number of readers doesn’t fluctuate unexpectedly.
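To make the idea of real-time cluster assignment concrete, here is a minimal sketch in Python. It is not Mendeley's actual matching method, which the post does not describe. The `Catalog` class, the `lookup_keys` helper, and the matching rule (DOI first, otherwise normalized title plus year) are all assumptions made for illustration. The point it shows is that a file is assigned to a cluster once, on arrival, and is never reassigned, so a count can only grow as new copies are added.

```python
import re


def lookup_keys(meta):
    """Return the identifiers under which a file can be matched (hypothetical rule)."""
    keys = []
    doi = (meta.get("doi") or "").strip().lower()
    if doi:
        keys.append("doi:" + doi)
    # Normalize the title so that case and punctuation differences do not matter.
    title = re.sub(r"[^a-z0-9]+", " ", (meta.get("title") or "").lower()).strip()
    if title:
        keys.append(f"title:{title}|{meta.get('year')}")
    return keys


class Catalog:
    """Toy catalog: one cluster per logical document, sized by distinct readers."""

    def __init__(self):
        self._cluster_for_key = {}  # lookup key -> cluster id
        self._readers = []          # cluster id (list index) -> set of user ids

    def add_copy(self, user_id, meta):
        """Assign an uploaded file to a cluster immediately and return its id."""
        keys = lookup_keys(meta)
        cluster = next(
            (self._cluster_for_key[k] for k in keys if k in self._cluster_for_key),
            None,
        )
        if cluster is None:
            # No known key matched, so this file starts a new cluster.
            cluster = len(self._readers)
            self._readers.append(set())
        for key in keys:
            # Existing assignments are never overwritten, so clusters stay stable.
            self._cluster_for_key.setdefault(key, cluster)
        self._readers[cluster].add(user_id)
        return cluster

    def reader_count(self, cluster):
        return len(self._readers[cluster])


catalog = Catalog()
first = catalog.add_copy(
    "user-1",
    {
        "doi": "10.1371/journal.pone.0100285",
        "title": "Therapeutic effect of TSG-6 engineered iPSC-derived MSCs",
        "year": 2014,
    },
)
# A second copy without a DOI, with different capitalization in the title.
second = catalog.add_copy(
    "user-2",
    {"title": "Therapeutic Effect of TSG-6 Engineered iPSC-Derived MSCs", "year": 2014},
)
print(first == second, catalog.reader_count(first))
```

In this sketch the two uploads land in the same cluster even though only one carries a DOI, because the first upload registered both its DOI and its title key. A real system would need far more robust matching than this, for example to cope with cover pages and missing titles.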

This improvement in data quality has enhanced the trust the academic community places in our data, but we’re going a step further and registering our data with the DOI Event Tracker. This initiative, run by Crossref, is a neutral clearinghouse for altmetrics data from a range of platforms, providing an extra level of certainty about data quality and consistency to data consumers. One way this improves transparency and trust is that we can’t go randomly changing numbers behind the scenes (not that we would do such a thing) when the historical data rests in Crossref Event Data (CED).

The following is a technical note for metrics researchers and developers building applications which consume data from the Mendeley catalog.

As many of you already know, we have begun to transition from our old and somewhat ad hoc set of academic statuses and disciplines to the much more extensive set of Elsevier categories. As a result, you will begin to see different labels for the readership values reported by the API. As we made these changes, we became aware of an error whereby not all readers were mapping cleanly between the categories, so the subtotals for academic statuses and disciplines had incorrect values. Overall counts per document remained correct, however, because they are generated by the real-time catalog and not by the batch process which generates the category subtotals. We recently refreshed the subtotals and have fully transitioned to the new set of academic statuses (now referred to as user roles) and disciplines (now referred to as subject areas), so the subtotals should be 100% accurate again.

Here is what you can expect to see in the returned JSON responses. The new fields are reader_count_by_user_role and reader_count_by_subject_area:

{
  "title": "Therapeutic effect of TSG-6 engineered iPSC-derived MSCs on experimental periodontitis in rats: A pilot study",
  "type": "journal",
  "authors": [
    {
      "first_name": "Heng",
      "last_name": "Yang"
    },
    {
      "first_name": "Raydolfo M.",
      "last_name": "Aprecio"
    },
    {
      "first_name": "Xiaodong",
      "last_name": "Zhou"
    },
    {
      "first_name": "Qi",
      "last_name": "Wang"
    },
    {
      "first_name": "Wu",
      "last_name": "Zhang"
    },
    {
      "first_name": "Yi",
      "last_name": "Ding"
    },
    {
      "first_name": "Yiming",
      "last_name": "Li"
    }
  ],
  "year": 2014,
  "source": "PLoS ONE",
  "identifiers": {
    "scopus": "2-s2.0-84903624808",
    "pmid": "24979372",
    "doi": "10.1371/journal.pone.0100285",
    "issn": "19326203"
  },
  "id": "ff7fe569-e7eb-34de-8e98-3edbcbc1301d",
  "abstract": "BACKGROUND: We derived mesenchymal stem cells (MSCs) from rat induced pluripotent stem cells (iPSCs) and transduced them with tumor necrosis factor alpha-stimulated gene-6 (TSG-6), to test whether TSG-6 overexpression would boost the therapeutic effects of iPSC-derived MSCs in experimental periodontitis.\\n\\nMETHODS: A total of 30 female Sprague-Dawley (SD) rats were randomly divided into four groups: healthy control group (Group-N, n = 5), untreated periodontitis group (Group-P, n = 5), iPS-MSCs-treated and iPSC-MSCs/TSG-6-treated periodontitis groups (Group-P1 and P2, n = 10 per group). Experimental periodontitis was established by ligature and infection with Porphyromonas gingivalis around the maxillae first molar bilaterally. MSC-like cells were generated from rat iPSCs, and transducted with TSG-6. iPSC-MSCs or iPSC-MSCs/TSG-6 were administrated to rats in Group-P1 or P2 intravenously and topically, once a week for three weeks. Blood samples were obtained one week post-injection for the analysis of serum pro-inflammatory cytokines. All animals were killed 3 months post-treatment; maxillae were then dissected for histological analysis, tartrate-resistant acid phosphatase (TRAP) staining, and morphological analysis of alveolar bone loss.\\n\\nRESULTS: Administration of iPSC-MSC/TSG-6 significantly decreased serum levels of IL-1β and TNF-α in the Group-P2 rats (65.78 pg/ml and 0.56 pg/ml) compared with those in Group-P (168.31 pg/ml and 1.15 pg/ml respectively) (p<0.05). Both alveolar bone loss and the number of TRAP-positive osteoclasts showed a significant decrease in rats that received iPSC-MSC/TSG-6 treatment compared to untreated rats in Group-P (p<0.05).\\n\\nCONCLUSIONS: We demonstrated that overexpression of TSG-6 in rat iPSC-derived MSCs were capable of decreasing inflammation in experimental periodontitis and inhibiting alveolar bone resorption. This may potentially serve as an alternative stem-cell-based approach in the treatment and regeneration of periodontal tissues.",
  "link": "http://www.mendeley.com/research/therapeutic-effect-tsg6-engineered-ipscderived-mscs-experimental-periodontitis-rats-pilot-study",
  "reader_count": 13,
  "reader_count_by_academic_status": {
    "Student > Bachelor": 1,
    "Researcher": 3,
    "Professor > Associate Professor": 1,
    "Student > Master": 3,
    "Student > Ph. D. Student": 5
  },
  "reader_count_by_user_role": {
    "Student > Bachelor": 1,
    "Researcher": 3,
    "Professor > Associate Professor": 1,
    "Student > Master": 3,
    "Student > Ph. D. Student": 5
  },
  "reader_count_by_subject_area": {
    "Engineering": 2,
    "Medicine and Dentistry": 3,
    "Agricultural and Biological Sciences": 5,
    "Neuroscience": 1,
    "Chemistry": 1,
    "Veterinary Science and Veterinary Medicine": 1
  },
  "reader_count_by_subdiscipline": {
    "Engineering": {
      "Engineering": 2
    },
    "Medicine and Dentistry": {
      "Medicine and Dentistry": 3
    },
    "Neuroscience": {
      "Neuroscience": 1
    },
    "Chemistry": {
      "Chemistry": 1
    },
    "Agricultural and Biological Sciences": {
      "Agricultural and Biological Sciences": 5
    },
    "Veterinary Science and Veterinary Medicine": {
      "Veterinary Science and Veterinary Medicine": 1
    }
  },
  "reader_count_by_country": {
    "Malaysia": 1
  },
  "group_count": 0
}
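If you consume these responses, one simple sanity check is to compare each set of subtotals against the overall reader_count. The sketch below is only an illustration: `subtotal_mismatches` is a hypothetical helper, and it assumes that every reader is assigned exactly one user role and one subject area, which holds for the sample above but is not something the API is stated to guarantee. Country is deliberately left out, since in the sample only 1 of the 13 readers has a country recorded.

```python
def subtotal_mismatches(doc, fields=("reader_count_by_user_role",
                                     "reader_count_by_subject_area")):
    """Return {field: subtotal} for each breakdown that does not sum to reader_count."""
    total = doc.get("reader_count", 0)
    mismatches = {}
    for field in fields:
        breakdown = doc.get(field)
        if breakdown is None:
            # Field missing from this response; nothing to compare.
            continue
        subtotal = sum(breakdown.values())
        if subtotal != total:
            mismatches[field] = subtotal
    return mismatches


# Values copied from the sample response above (other fields omitted).
sample = {
    "reader_count": 13,
    "reader_count_by_user_role": {
        "Student > Bachelor": 1,
        "Researcher": 3,
        "Professor > Associate Professor": 1,
        "Student > Master": 3,
        "Student > Ph. D. Student": 5,
    },
    "reader_count_by_subject_area": {
        "Engineering": 2,
        "Medicine and Dentistry": 3,
        "Agricultural and Biological Sciences": 5,
        "Neuroscience": 1,
        "Chemistry": 1,
        "Veterinary Science and Veterinary Medicine": 1,
    },
}

print(subtotal_mismatches(sample))  # {} because both breakdowns sum to 13
```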

We encourage you to begin using the new reader_count_by_user_role and reader_count_by_subject_area fields as soon as you can, as we hope to deprecate the old fields in the future.
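One way to handle the transition in client code is to read the new fields when they are present and fall back to the old ones otherwise. The following Python sketch assumes a response shaped like the sample above. The helper names `readers_by_role` and `readers_by_subject` are hypothetical, and the fallback for subject areas assumes that summing each discipline's entries in reader_count_by_subdiscipline gives that discipline's total, which is true of the sample but is an assumption in general. Note that labels in the old fields may differ from the new category labels.

```python
def readers_by_role(doc):
    """Prefer the new user-role breakdown, falling back to academic status."""
    for field in ("reader_count_by_user_role", "reader_count_by_academic_status"):
        if doc.get(field) is not None:
            return dict(doc[field])
    return {}


def readers_by_subject(doc):
    """Prefer the new subject-area breakdown, falling back to subdisciplines."""
    if doc.get("reader_count_by_subject_area") is not None:
        return dict(doc["reader_count_by_subject_area"])
    # Assumption: a discipline's total is the sum of its subdiscipline counts.
    nested = doc.get("reader_count_by_subdiscipline") or {}
    return {
        discipline: sum(subdisciplines.values())
        for discipline, subdisciplines in nested.items()
    }


# A response that carries only the old fields (values taken from the sample).
old_style = {
    "reader_count_by_academic_status": {"Researcher": 3, "Student > Master": 3},
    "reader_count_by_subdiscipline": {
        "Engineering": {"Engineering": 2},
        "Chemistry": {"Chemistry": 1},
    },
}

print(readers_by_role(old_style))
print(readers_by_subject(old_style))
```

Keeping the fallback in one place means the rest of an application can be written against the new field names now and the fallback deleted once the old fields are gone.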

This post was written by William Gunn – Director of Scholarly Communication for Mendeley at Elsevier @mrgunn

About Joyce-Stack - Developer Outreach

Joyce Stack completed a BSc (Hons) in Computer Science at The Open University while working in a startup in Co. Cork, Ireland. She moved to London in 2005 to work as a Java Developer at what was then one of the City’s leading exponents of agile techniques. She now works in the Developer Outreach role at Mendeley, where her responsibilities are to meet and educate developers about the Mendeley API, attend conferences, meetups and hackathons, and basically be the face of the API. She is passionate about providing an excellent developer experience and APIs. She likes biking, swimming and yoga, but she hates potatoes despite being Irish.