Changes to our versioning strategy.

When we released version 1 of our API in September 2014, we decided to use HTTP’s content negotiation mechanism, as documented here.

In summary, when you request a version 1 resource, you specify the media type you accept in the Accept header, e.g.:

GET /documents?limit=125 HTTP/1.1
Accept: application/vnd.mendeley-document.1+json
Authorization: Bearer MSwxNDA5N…T1I5dnhPa0U

All clients were “very strongly encouraged” to include version information in their application’s requests so as to be able to predict the behaviour of those requests.

Why are we changing our approach to versioning?

Unfortunately, we didn’t lock down the Media Types we selected. Many clients are making calls using wildcards in the Media Types.

We should have made the implications of accepting all future API versions clearer to all clients up front. But we didn’t, and so we would have to break many clients to fix this appropriately. We also have some other internal technical reasons which led us to this decision.

To avoid breaking many clients, the technical team decided to move the versioning information into the URI. This means that when we release version 2 of the ‘Documents’ API the request above will look like this:

GET /documents/v2?limit=125 HTTP/1.1
Accept: application/vnd.mendeley-document+json
Authorization: Bearer MSwxNDA5N…T1I5dnhPa0U

What are we changing?

All new endpoints will now contain the version information in the URI, and the Media Types will no longer contain the version number.

Version 1 – Retrieve a collection of groups

curl 'https://api.mendeley.com/groups' \
 -H 'Authorization: Bearer <ACCESS_TOKEN>' \
 -H 'Accept: application/vnd.mendeley-group.1+json'

Version 2 – Retrieve a collection of groups

curl 'https://api.mendeley.com/groups/v2' \
 -H 'Authorization: Bearer <ACCESS_TOKEN>' \
 -H 'Accept: application/vnd.mendeley-group-list+json'

Version 1 – Retrieve a single group

curl 'https://api.mendeley.com/groups/9da69ed0-9d44-3d5f-b999-970f41c94c35' \
 -H 'Authorization: Bearer <ACCESS_TOKEN>' \
 -H 'Accept: application/vnd.mendeley-group.1+json'

Version 2 – Retrieve a single group

curl 'https://api.mendeley.com/groups/v2/9da69ed0-9d44-3d5f-b999-970f41c94c35' \
 -H 'Authorization: Bearer <ACCESS_TOKEN>' \
 -H 'Accept: application/vnd.mendeley-group+json'

A word on Media Types

The Media Types that we use in version 1 all follow this pattern.  

application/vnd.mendeley-<RESOURCE_TYPE>.<VERSION>+json

The Media Types that we use in version 2 will follow this pattern.

application/vnd.mendeley-<RESOURCE_TYPE><NONE or -list>+json

To request a single group by id, you would use:

application/vnd.mendeley-group+json

To request a group collection, you would use:

application/vnd.mendeley-group-list+json
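
As a rough illustration of the two patterns, here is a minimal Python sketch. The helper names media_type and resource_path are made up for this example and are not part of any Mendeley library, and the path layout is inferred from the group examples above.

```python
def media_type(resource, version, is_list=False):
    """Build the Accept media type for a resource such as "group"."""
    if version == 1:
        # Version 1 carries the version number in the media type and uses
        # the same type for single resources and collections.
        return f"application/vnd.mendeley-{resource}.1+json"
    # In version 2 the version lives in the URI; collections get "-list".
    suffix = "-list" if is_list else ""
    return f"application/vnd.mendeley-{resource}{suffix}+json"


def resource_path(collection, version, resource_id=None):
    """Build the request path for a collection such as "groups"."""
    parts = [collection]
    if version >= 2:
        # Assumption: the version segment follows the collection name,
        # as in /groups/v2 above.
        parts.append(f"v{version}")
    if resource_id is not None:
        parts.append(resource_id)
    return "/" + "/".join(parts)


print(resource_path("groups", 2), media_type("group", 2, is_list=True))
```

Keeping both decisions in one place makes it harder to update the URI and forget the Media Type, or the other way round.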

Impact on clients

You can use both version 1 and version 2 resources together. Clients must remember to switch their URIs and their Media Types appropriately.  

Errors

  • HTTP 406 (Not Acceptable) – Check that you have set the appropriate headers.
  • HTTP 415 (Unsupported Media Type) – One or more of the media types you have set in your headers is not supported.

Gotchas

You have to set the HTTP Accept header to empty when doing a delete to indicate you are not expecting a response. This is to maintain backward compatibility with version 1.  

What if you see this error message?

{"message":"The Accept header contains forbidden wildcards, it must be specified with a value from: [application/vnd.mendeley-group+json]; this is to ensure backwards compatibility"}

If you see a message similar to the one above, it means that you have used an unacceptable Media Type (for example, a wildcard) in your Accept header.
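
A client could guard against this before sending a request. The following is only a sketch: has_wildcard is a hypothetical helper, and the exact rules the server applies may differ.

```python
def has_wildcard(accept_header):
    """Return True if any media range in an Accept header value uses '*'."""
    for media_range in accept_header.split(","):
        # Drop parameters such as ";q=0.8" before inspecting the type.
        type_and_subtype = media_range.split(";")[0].strip()
        if "*" in type_and_subtype:
            return True
    return False


print(has_wildcard("*/*"))
print(has_wildcard("application/vnd.mendeley-group+json"))
```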

Development Tokens and Beta Endpoints

All Beta endpoints require an additional Development Token. These tokens live for a period of 90 days. This is to allow for a reasonable amount of testing by clients of a beta endpoint. You will find all details on the Development Token generator page.

Feedback

We would love to hear how this is impacting you or if you are having issues with understanding it – email api@mendeley.com

 

What is Mendeley Readership?

Over the past few years, more librarians, funders, and publishers have begun to look beyond citations for indications that a work is having an impact on the research community at large. In the search for more immediate and more prospective indicators, they look at the traffic to the article pages hosted at their journals and repositories (much as news sites track their analytics), but they are often frustrated because the traffic is so poorly characterized. Because there’s no way to determine if a site visit comes from an author (and thus a possible future citer) or a practitioner, the correlation between site visits/downloads and content which is highly cited is too weak to be predictive. Here’s where Mendeley comes in. Because Mendeley can give a fine-grained view of how content arising from an institution’s researchers or a journal is being read, it sits in a nice space between the high signal value of a citation and the rapid speed of a site visit or download. In addition, researchers value the data for a real-time view of how their work is being read by those in their field and in other disciplines.

With that in mind, it’s important to note some disadvantages of Mendeley’s readership data. The readership of a document reported by Mendeley depends on the underlying population of Mendeley users. While data on the academic discipline and status or title of the researcher is available from Mendeley, not all documents have a reading population which is representative of the academic community as a whole. In most cases (see Figure), the scale of Mendeley means that the skew is minimal, however. Another important consideration is that readership isn’t as clearly understood a concept as citation. This post attempts to remedy this.

How do we count readers?

The number of readers of a document in Mendeley is the number of copies of that document in all Mendeley libraries. However, this simple concept contains many complexities. For example, documents may be the same logical document while being different at the bit level. Frequently, a document found in an open access repository will come with a cover page that makes it a different file, or a researcher may have added annotations to the PDF using an application such as Adobe Acrobat. In addition, many older documents have no metadata, or only poor-quality metadata, embedded in them (no DOI, for example), so they can’t always be identified as the same file just based on metadata. We had to develop a method for clustering all the similar documents and grouping together the different files which are the same logical document. The size of this cluster is the number of readers. In the earlier days of Mendeley, we’d just periodically run this process across all the documents uploaded by all our users and regenerate the clusters, and thus regenerate the whole catalog. As the number of documents grew into the hundreds of millions, this process began to serve us less well, and it also caused the number of readers to rise or fall unexpectedly if a subsequent run changed which cluster a file was associated with. Today, we determine which cluster an uploaded PDF belongs to in real-time, so the number of readers doesn’t fluctuate unexpectedly.

This improvement in data quality has enhanced the trust the academic community places in our data, but we’re going a step further and registering our data with the DOI Event Tracker. This initiative, run by Crossref, is a neutral clearinghouse for altmetrics data from a range of platforms, providing an extra level of certainty about data quality and consistency to data consumers. One way this improves transparency and trust is that it ensures that we can’t go randomly changing numbers behind the scenes (not that we would do such a thing) when historical data rests in CED (Crossref Event Data).

The following is a technical note for metrics researchers and developers building applications which consume data from the Mendeley catalog.

As many of you already know, we have begun to transition from our old and somewhat ad-hoc set of academic statuses and disciplines to the much more extensive set of Elsevier categories. As a result, you will begin to see different labels for the readership values reported by the API. As we made some changes, we became aware of an error whereby not all the readers were mapping cleanly between the categories, and the subtotals for academic statuses and disciplines had incorrect values. Overall counts per document remained correct, however, because they are generated by the real-time catalog and not as part of the batch process which generates the category subtotals. We recently refreshed the subtotals and have fully transitioned over to the new set of academic statuses (now referred to as user roles) and disciplines (now referred to as subject areas), so the totals should be 100% accurate again.

Here is what you can expect to see in the returned JSON responses. The new categories appear in the reader_count_by_user_role and reader_count_by_subject_area fields:

 {
 "title": "Therapeutic effect of TSG-6 engineered iPSC-derived MSCs on experimental periodontitis in rats: A pilot study",
 "type": "journal",
 "authors": [
 {
 "first_name": "Heng",
 "last_name": "Yang"
 },
 {
 "first_name": "Raydolfo M.",
 "last_name": "Aprecio"
 },
 {
 "first_name": "Xiaodong",
 "last_name": "Zhou"
 },
 {
 "first_name": "Qi",
 "last_name": "Wang"
 },
 {
 "first_name": "Wu",
 "last_name": "Zhang"
 },
 {
 "first_name": "Yi",
 "last_name": "Ding"
 },
 {
 "first_name": "Yiming",
 "last_name": "Li"
 }
 ],
 "year": 2014,
 "source": "PLoS ONE",
 "identifiers": {
 "scopus": "2-s2.0-84903624808",
 "pmid": "24979372",
 "doi": "10.1371/journal.pone.0100285",
 "issn": "19326203"
 },
 "id": "ff7fe569-e7eb-34de-8e98-3edbcbc1301d",
 "abstract": "BACKGROUND: We derived mesenchymal stem cells (MSCs) from rat induced pluripotent stem cells (iPSCs) and transduced them with tumor necrosis factor alpha-stimulated gene-6 (TSG-6), to test whether TSG-6 overexpression would boost the therapeutic effects of iPSC-derived MSCs in experimental periodontitis.\\n\\nMETHODS: A total of 30 female Sprague-Dawley (SD) rats were randomly divided into four groups: healthy control group (Group-N, n = 5), untreated periodontitis group (Group-P, n = 5), iPS-MSCs-treated and iPSC-MSCs/TSG-6-treated periodontitis groups (Group-P1 and P2, n = 10 per group). Experimental periodontitis was established by ligature and infection with Porphyromonas gingivalis around the maxillae first molar bilaterally. MSC-like cells were generated from rat iPSCs, and transducted with TSG-6. iPSC-MSCs or iPSC-MSCs/TSG-6 were administrated to rats in Group-P1 or P2 intravenously and topically, once a week for three weeks. Blood samples were obtained one week post-injection for the analysis of serum pro-inflammatory cytokines. All animals were killed 3 months post-treatment; maxillae were then dissected for histological analysis, tartrate-resistant acid phosphatase (TRAP) staining, and morphological analysis of alveolar bone loss.\\n\\nRESULTS: Administration of iPSC-MSC/TSG-6 significantly decreased serum levels of IL-1β and TNF-α in the Group-P2 rats (65.78 pg/ml and 0.56 pg/ml) compared with those in Group-P (168.31 pg/ml and 1.15 pg/ml respectively) (p<0.05). Both alveolar bone loss and the number of TRAP-positive osteoclasts showed a significant decrease in rats that received iPSC-MSC/TSG-6 treatment compared to untreated rats in Group-P (p<0.05).\\n\\nCONCLUSIONS: We demonstrated that overexpression of TSG-6 in rat iPSC-derived MSCs were capable of decreasing inflammation in experimental periodontitis and inhibiting alveolar bone resorption. This may potentially serve as an alternative stem-cell-based approach in the treatment and regeneration of periodontal tissues.",
 "link": "http://www.mendeley.com/research/therapeutic-effect-tsg6-engineered-ipscderived-mscs-experimental-periodontitis-rats-pilot-study",
 "reader_count": 13,
 "reader_count_by_academic_status": {
 "Student > Bachelor": 1,
 "Researcher": 3,
 "Professor > Associate Professor": 1,
 "Student > Master": 3,
 "Student > Ph. D. Student": 5
 },
 "reader_count_by_user_role": {
 "Student > Bachelor": 1,
 "Researcher": 3,
 "Professor > Associate Professor": 1,
 "Student > Master": 3,
 "Student > Ph. D. Student": 5
 },
 "reader_count_by_subject_area": {
 "Engineering": 2,
 "Medicine and Dentistry": 3,
 "Agricultural and Biological Sciences": 5,
 "Neuroscience": 1,
 "Chemistry": 1,
 "Veterinary Science and Veterinary Medicine": 1
 },
 "reader_count_by_subdiscipline": {
 "Engineering": {
 "Engineering": 2
 },
 "Medicine and Dentistry": {
 "Medicine and Dentistry": 3
 },
 "Neuroscience": {
 "Neuroscience": 1
 },
 "Chemistry": {
 "Chemistry": 1
 },
 "Agricultural and Biological Sciences": {
 "Agricultural and Biological Sciences": 5
 },
 "Veterinary Science and Veterinary Medicine": {
 "Veterinary Science and Veterinary Medicine": 1
 }
 },
 "reader_count_by_country": {
 "Malaysia": 1
 },
 "group_count": 0
 }

We encourage you to begin using the new fields reader_count_by_user_role and reader_count_by_subject_area as soon as you can, as we hope to deprecate the old fields in the future.
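
For consumers who want to move over gradually, one option is to prefer the new field and fall back to the old one only when the new one is missing. This is a minimal sketch in Python: readers_by_role is a hypothetical helper name, and the sample dictionary simply copies values from the response above.

```python
def readers_by_role(document):
    """Return a role -> reader count mapping from a catalog document."""
    new_field = document.get("reader_count_by_user_role")
    if new_field is not None:
        return new_field
    # Fall back to the older field name for responses that lack the new one.
    return document.get("reader_count_by_academic_status", {})


sample = {
    "reader_count": 13,
    "reader_count_by_user_role": {
        "Student > Bachelor": 1,
        "Researcher": 3,
        "Professor > Associate Professor": 1,
        "Student > Master": 3,
        "Student > Ph. D. Student": 5,
    },
}

roles = readers_by_role(sample)
# The subtotals should add up to the overall reader count.
print(sum(roles.values()) == sample["reader_count"])
```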

This post was written by William Gunn – Director of Scholarly Communication for Mendeley at Elsevier @mrgunn 

 

 

Using GraphQL with Mendeley

I was snooping around on GitHub a few weeks back looking for repositories that have used the Mendeley API. I was surprised by how many I found. Unfortunately, some are obsolete and no longer work.  

I did find one gem though called GraphQL-Mendeley. GraphQL is a query language that allows client developers to easily ask for data from a server. It was built by Facebook and came about when they were rewriting their mobile clients. Primarily it allowed product developers to build things quickly and easily by providing a consistent query language. Think about how large the Facebook data lake is. Developers could now be selective in exactly what data they required. You are no longer constrained by some pre-determined server view that an API provides. You can execute one network call and be very precise about what you want, or you can be greedy and ask for multiple unrelated objects in one call.

REST APIs do have overheads: writing APIs is notoriously difficult, and writing a good one that fits your product needs, is easy to evolve and serves multiple clients is just downright hard.

GraphQL is independent of platforms and data sources so you can run queries using your favourite languages and platform rather easily. It’s not concerned with preparing data for parsing or having to write lots of code to merge multiple unrelated datasets. It’s almost like an API but for data and a lot less complicated.  

Researchers often work with data that is flat and unstructured. I could see benefits for organisations who have large datasets (like us) in providing the data via GraphQL so clients can just ask for the data they need.

Check out the GitHub example and run it for yourself. Below you will see a screenshot of a query that I ran to find the names of the people in the group identified with id ‘e3630413-abd9-3308-8937-c5f119c17a28’.

 

[Screenshot: GraphQL query]

If you had to do this from code using the APIs that we currently have, you would first make a call to Groups such as:

https://api.mendeley.com/groups/e3630413-abd9-3308-8937-c5f119c17a28/members

This returns the following JSON response.  

[
 {
   "profile_id": "75375001-76bc-3c41-a0e8-3c88a4829918",
   "joined": "2013-01-26T19:36:40.000Z",
   "role": "normal"
 },
 {
   "profile_id": "1c3ac854-1c3c-3202-9753-93b69dd1566f",
   "joined": "2012-06-19T15:11:34.000Z",
   "role": "normal"
 }
]

 

Your code would then have to parse the JSON, iterate over the response and for each of the profile ids you would make a call to Profiles to retrieve the id and name details:  

https://api.mendeley.com/profiles/75375001-76bc-3c41-a0e8-3c88a4829918
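
The two-step approach could be sketched roughly as follows in Python. Treat it as an illustration under assumptions: member_names and fetch_json are made-up helper names, the first_name and last_name fields on a profile are assumed rather than taken from a documented response, and a real request may also need an Accept header, which is omitted here.

```python
import json
from urllib.request import Request, urlopen

API_ROOT = "https://api.mendeley.com"


def fetch_json(path, token):
    """GET a path on the API and decode the JSON body."""
    request = Request(API_ROOT + path,
                      headers={"Authorization": f"Bearer {token}"})
    with urlopen(request) as response:
        return json.load(response)


def member_names(group_id, token, fetch=fetch_json):
    """Return the names of a group's members, one extra call per member."""
    members = fetch(f"/groups/{group_id}/members", token)
    names = []
    for member in members:
        # Each member record only holds a profile id, so the name needs
        # a second request to Profiles.
        profile = fetch(f"/profiles/{member['profile_id']}", token)
        # Assumed field names; adjust to the actual profile response.
        names.append(f"{profile['first_name']} {profile['last_name']}")
    return names
```

For a group with N members that is N + 1 network calls, which is the overhead a single GraphQL query avoids. The fetch parameter is only there so the function can be exercised with canned data instead of a live token.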

Writing less code is always a good thing so it’s going to be interesting to see how this gets adopted.  

On a side note, we love to see what cool things you build with the Mendeley API. Please get in touch and let me know what you have done. We are working to get external libraries and projects listed on our developer portal – email joyce.stack@mendeley.com.

 

Students try to predict potential collaborations at Hack Cambridge

Our first event of 2016 was attending Hack Cambridge, the University of Cambridge’s very own hackathon.

The popular music venue, the Corn Exchange, was repurposed to fit 400+ hackers along with 20+ sponsors.  This was the first event by this team of organisers. They did well to get so many sponsors and draw such a large crowd, including teams from Spain and Croatia. Such a large gathering of incredibly bright people from all over the UK and beyond. Seems that building apps is old school, rather a lot of the teams that I talked to were turning their hands to number crunching types of hacks. Got to admit, I didn’t understand what some people were telling me.  Felt like a bit of a wally.

Where’s Wally?

However, we fear these first-time organisers have bitten off more than they can chew with such a large event. The single biggest issue was down to poor (almost non-existent) WiFi which caused some teams to head home, almost defeating the purpose of the event in the first place. Admittedly, this was an issue with the location. The whole event felt chaotic and oversubscribed. Sorry guys. Learn and iterate.

The team that stood out the most and won the Mendeley challenge was team Leev. They used the Mendeley API to retrieve papers devoted to biomedical disciplines (e.g. genomics, drug discovery and development, chemical similarity). Data from PubMed was sourced to complement Mendeley data so they could create co-authorship networks – graphs where every node represents an author and edges show co-authorship between two individuals. By analysing the network topology, they were able to identify individuals who are important in their field or who are able to bridge fields, as well as get a glimpse of how tight-knit the field is in general. Based on link prediction, they were trying to predict potential collaborations, hoping to benefit the research community.

They applied a number of methods from network science to solve the problem:

  • Louvain method for community detection and network clustering
  • Adamic-Adar index for predicting how the network will evolve and to suggest potential collaborators
  • HITS algorithm for calculating the “importance” of each collaborator
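
To give a flavour of the link prediction step, here is a small Python sketch of the Adamic-Adar index on a toy co-authorship graph. It is not team Leev's code: the author names and papers are invented, and build_graph and adamic_adar are helper names chosen for this example. The index scores a pair of authors by summing 1 / log(degree) over the co-authors they share, so a shared co-author with few collaborators counts for more than one who works with everybody.

```python
import math
from itertools import combinations


def build_graph(papers):
    """Turn a list of author lists into an undirected co-authorship graph."""
    graph = {}
    for authors in papers:
        unique = sorted(set(authors))
        for author in unique:
            graph.setdefault(author, set())
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def adamic_adar(graph, a, b):
    """Adamic-Adar index of two authors: sum of 1/log(degree) over shared co-authors."""
    shared = graph[a] & graph[b]
    # A shared co-author has at least two neighbours (a and b), so log > 0.
    return sum(1.0 / math.log(len(graph[z])) for z in shared)


# Invented toy data: each inner list is the author list of one paper.
papers = [["Ann", "Bob"], ["Bob", "Cat"], ["Bob", "Dan"], ["Cat", "Dan"]]
graph = build_graph(papers)

# Score every pair that has not yet co-authored; higher means more likely.
candidates = [(a, b) for a, b in combinations(sorted(graph), 2)
              if b not in graph[a]]
for a, b in candidates:
    print(a, b, round(adamic_adar(graph, a, b), 3))
```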

 

 

Team Leev –  Dilyana Mincheva, Chi-Jui Wu and Aleksejs Sazonovs. 

Congratulations to team Leev and thanks to all the others who participated in this event. You can read about the other Mendeley hacks here:

 

Academic services made easy – Mendeley integrates with Peerwith

Mendeley Blog

The very nature of research means academics become experts in their fields. But what happens when they need services outside of their field of research, such as translations or artwork for their paper or book? They rely on author services, which are often delivered by other academics, for example by PhD students who edit papers as a freelance job. Performing these services can not only be a way to earn some extra money, it also allows people to gain experience and grow skills in effective scholarly communication.

But academics and service providers often have difficulties finding each other directly and often depend on middlemen to get the work done. This means that services are more expensive than needed, and that people most of the time have no idea who actually performs the work.

Peerwith wants to change this. Launched in beta in October 2015, the platform brings academics directly in…

How I prepared for moderating my first panel discussion.

I’m currently doing a ground speed of 866km/h with another 6 hours 51 minutes to my destination.

That destination is API Strategy and Practice Conference in Austin, Texas. I’m off to chair my first session – the microservices session. I’ve just penned my introduction after 3 glasses of wine after watching the Love & Mercy film about Brian Wilson (excellent film by the way).

I thought I’d write a post about how I feel pre-first-panel-discussion-session-organiser-event. I keep worrying about the level of detail I should go into in the 5-minute introduction. Do I lead with a statement of fact about microservices and APIs running the world?  “What would Mike Amundsen say?” – something more profound no doubt. I had a dream last night that it was all chaotic and nobody listened and it was the worst session of the event.

I plan on a basic introduction of who I am and what’s about to happen. Unlike Mike Amundsen I am not an industry expert but I do have some expertise in being involved in a very difficult (ongoing) migration from a monolithic application to a microservices architecture. This is one of the reasons I got asked to do this session.

The session runs for about 1 hour 20 minutes. There will be 3 speakers doing a presentation of 15-20 minutes each followed by a panel discussion with all the speakers plus some additional experts in the field to have a ‘chat’ about microservices.

Thing is that as the session organiser (with help obviously from the API Strategy Team) it feels anything but a ‘chat’. I feel immense pressure to deliver an excellent session especially given how popular microservices are. I really want the attendees to walk out of the session and be able to do one positive thing in the office the next day.

However, I’m starting to think maybe the entire audience have been doing it longer than us? Maybe they all think it’s a fad and will go nowhere? How do I pick the questions that I should ask? What if I fall off stage? What if… ?

If you are reading this and you are preparing to chair your first panel discussion then here is how I’ve prepared:

  • Read about how to run a panel discussion.
    • I like talking (understatement of the century) but I’m the one person who should be doing the least amount of talking. Understand what is expected of you from a discussion. You may feel the crowd is relaxed enough so you may be able to let go of some etiquette.
  • Do your research.
    • I bought a copy of Sam Newman’s book on Building Microservices. Why? Internal concepts do not translate to external conferences. Find the common languages so you can speak generally.
  • Research your panel and reach out
    • Reach out to the speakers and ask how they would like to be introduced. Usually, they have some bio that they like to share about their achievements. It’s also important to give them some notice of how you are planning to run the session. It’s difficult enough to be on stage with a microphone, camera and lights in your eyes so let them know what you are going to ask. Even better – get them to suggest the content that they are comfortable with.
  • Reach out to your team
    • I emailed all the devs on my team and asked them ‘What would you ask an expert panel about microservices?’ This will give you a good basis of what you should be asking.
  • Be prepared for feedback.
    • In fact, I would say go and seek it. If you want to improve then you need to accept feedback.
  • Accept that you will not please everyone
    • It’s hard at conferences, right? You have the noobs and the experts and it’s really difficult to please everybody. Just accept it.
  • Enjoy it
    • I remember the first time I was given a microphone and I was like ‘What the heck do I do with that? How do I hold it?’ Embrace the new experience. You’ll love it.

OK only 6 hours and 7 minutes to go until I land. I wonder how much of that advice will make sense in a few days’ time after the session. I’ll report back.