Requesting permission: reflections and perspectives from the University of St Andrews

Kyle Brady of the University of St Andrews expands on his popular presentation from our recent UKCoRR Members Day in the post below:

In July I attended the UKCoRR Members Day and delivered a presentation on the subject of approaching publishers for permission from the perspective of someone working in open access/repository support. The title of the presentation was ‘Requesting permission: approaching publishers, lessons learned, and the many successes!’ Here’s the presentation in the St Andrews Research Repository.

In this blog post I’ll go over some of the points from the presentation that I think struck a chord with the audience, with the overall intention of explaining the rationale behind our processes. Before I begin, I must say that I am very grateful to the other attendees on the day who shared their experiences in the Q&A, as well as after the event. It was really encouraging to hear from so many colleagues who have encountered the same stumbling blocks as we have, and it was especially useful to hear from those who do things differently from us at St Andrews.

I had noticed the issue of publisher permissions popping up on the UKCoRR email list on a number of occasions, often in relation to specific publishers who don’t have a public open access or author self-archiving policy. Additionally, a Google Doc listing publishers and their responses to requests to archive book chapters has been circulated many times, and was indeed the subject of numerous discussions on the Members Day as well. This brings me to the point from my presentation that I felt was perhaps the most illuminating: most of our permission requests are actually for articles published in journals and conference proceedings. Perhaps not the most shocking exposé on the face of it, but if you factor in REF2021 compliance it is in fact quite significant, because 60% of our permission requests are for outputs potentially in scope for the REF open access policy. So, I argued, having an effective permissions policy can potentially affect an institution’s approach to its REF return and the level of exceptions required.


Figure 1 – Item Types

Another perhaps less ‘sticky’ and more ‘carroty’ reason for all this comes down to effective curation of our research outputs. Many of the items in our repository are archived on the basis of successful permission requests for print-only publications, and so are often unique as they cannot be found online anywhere else. So, I explained my thoughts about digital preservation and the duty of care we have for this rare part of our collections. Part of this duty of care is ensuring permission requests are well thought out, and that the ensuing replies are clear and unambiguous. But, as I explained, no matter how careful you may be, I expect that risk management will always play a part in any decision to host third-party copyright material online.

So, how do we do it? From the outset I want to state that I don’t believe our process is perfect by any means. And although we have had an overwhelming amount of success, there are caveats; more on that later!


Figure 2 – Permissions workflow

When we receive a manuscript for archiving we first check SHERPA/RoMEO, an authoritative database of publishers’ and journals’ open access and author self-archiving policies that I’m sure we’re all intimately familiar with. If we come up short we then check the journal or publisher’s website for a policy (if indeed there is a website). If we are still left wanting we’ll then go to the author and ask them to check the publishing contract. This is a very important step as it includes the author in the process and, in so doing, alerts them to the work required to make things open access. It also has an important educational function, as it highlights the importance of retaining rights, including copyright, and the distinction between exclusive and non-exclusive licences, for instance. We are also conscious of the close relationship many of our authors have with publishers, so we always try to ensure that we have the author’s prior consent before any permission requests are sent.
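For anyone who wants to script the first of those checks, a policy lookup might look something like the sketch below. This is a minimal sketch only: it assumes Sherpa Romeo’s current v2 REST API, a registered API key, and the endpoint, parameters and response fields shown here; it is not part of our actual workflow, which is a manual check.

```python
# Minimal sketch: look up a journal's self-archiving policy by ISSN.
# ASSUMPTIONS: the Sherpa Romeo v2 endpoint, parameters and response
# layout shown here are illustrative; the API key is a placeholder.
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR-API-KEY"  # placeholder
BASE = "https://v2.sherpa.ac.uk/cgi/retrieve"

def lookup_policy(issn):
    """Return the policy records for a journal, or an empty list."""
    params = {
        "item-type": "publication",
        "api-key": API_KEY,
        "format": "Json",
        "filter": json.dumps([["issn", "equals", issn]]),
    }
    url = BASE + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        return json.load(response).get("items", [])

# If nothing comes back, fall back to the next steps in the workflow:
# the publisher's website, then the author's publishing contract.
if not lookup_policy("0000-0000"):  # placeholder ISSN
    print("No policy found; check the publisher, then ask the author.")
```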


Figure 3 – Permissions spreadsheet

Once we have the go-ahead to approach a publisher we record the action in a spreadsheet and assign it an ID. Then, when we receive a reply, we can easily update the spreadsheet, take any actions on the Pure record (we use Pure as our Current Research Information System, by the way!) and, importantly, save the email in a folder, renamed according to the ID. We think it is important to track and document these requests in this way as it creates a convenient audit trail, but it also gives us a way to assess the effectiveness of our process. You may also notice that we can report on item types too, so for instance we know that 60% of our permission requests relate to outputs that are potentially in scope of the REF2021 open access policy.
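For illustration only, here is a minimal sketch of how such a tracking log could be kept programmatically; the file name, column headings and ID format are all hypothetical inventions for this example, and our own log is an ordinary spreadsheet maintained by hand.

```python
# Minimal sketch: log a permission request with a sequential ID.
# The file name, columns and ID format are hypothetical.
import csv
from datetime import date
from pathlib import Path

LOG = Path("permission_requests.csv")
COLUMNS = ["id", "date_sent", "publisher", "output_type", "status"]

def log_request(publisher, output_type):
    """Append a new request row and return its ID, e.g. 'PR-0042'."""
    new_file = not LOG.exists()
    if new_file:
        next_num = 1
    else:
        with LOG.open() as f:
            next_num = sum(1 for _ in f)  # header + n rows gives n + 1
    request_id = f"PR-{next_num:04d}"
    with LOG.open("a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(COLUMNS)
        writer.writerow([request_id, date.today().isoformat(),
                         publisher, output_type, "sent"])
    return request_id

# The returned ID can then be used to rename the saved reply email.
print(log_request("Example University Press", "book chapter"))
```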

The vast majority of responses come back in the form of emails, often but not always from the editors of the journals themselves. As I said before, these are filed away and retained as proof that permission has been obtained. But a question I posed at the end of my presentation was: does this actually protect our collection? It would seem common sense to suggest that items archived on the basis of an email are less protected than items archived in response to a signed letter. But is this actually the case? Might both forms of response be equally fallible if the issuer of the permission is not vetted for authenticity (whatever that would mean)? I don’t have an answer to this, so it is the point at which I ended my presentation and opened the debate to the floor.

My enduring impression from speaking to colleagues on the day was that each institution has a clear understanding of the level of risk it is willing to take, even if this is not enshrined in policy. Generally speaking, my colleagues and I in the Open Access team at St Andrews tend to err on the side of caution and risk aversion, but from speaking to colleagues at other institutions my feeling is that we could perhaps afford to be less cautious. At any rate, the question of how we can protect these unique parts of our collections lingers on, I’m afraid, and I suppose it is ultimately always going to be a balancing act between collection growth and collection sustainability.

If you’re a UKCoRR member and would like to contribute to the blog, please get in touch with any of us on the Committee.

Jisc co-design workshop: ‘Digital Skills for Research’

On 25th April I was amongst a number of colleagues from a wide range of stakeholder organisations invited by Jisc to a co-design workshop on ‘Digital Skills for Research’.

Co-design is Jisc’s “collaborative innovation model”, through which they have engaged with the sector to identify six discrete challenges to explore; we were there to think about next generation research environments.

Our first task was to brainstorm around research support roles, which perhaps turned out to be rather more discursive than our hosts expected. Rather than a clearly defined list of discrete roles, a consensus began to emerge that the associated skill sets are extremely fluid, and that professional nomenclature and institutional structure can have the effect of artificially limiting the scope of a role. (It is perhaps instructive in this context to revisit the range of job titles posted to the UKCoRR mailing list in 2016 and, no doubt, the many similarities and differences of the associated job descriptions.)


This led on to a discussion about qualifications, training and appropriate professional accreditation, which is certainly an issue for repositories and Open Access (also see the related blog posts from Cambridge at the bottom of this post). A similar issue was raised in the context of research software development.

ARMA’s Professional Development Framework is perhaps the obvious resource for research support skills, while the Vitae Researcher Development Framework (RDF) is a valuable resource for researchers and their support services alike, and might highlight the increasingly collaborative relationship between researchers and support services – a theme that also emerged at a recent White Rose Libraries Digital Scholarship Event.

We got back on track in the next exercise, considering the range of (digital) skills required by professionals working in research support roles, although it was observed that ‘skill’ doesn’t necessarily capture the sheer range of knowledge required, such as an overview of the myriad funder policies around open access and data management.

Take bibliometrics as but one skill set that might often fall to a librarian or other information professional in lieu of a trained bibliometrician (how many institutions have one of these exotic beasts?): there are myriad proprietary data sources covering both ‘traditional’ and ‘alternative’ metrics that we must be familiar with, and which might help to inform impact assessment, yet there is no clear training offer, at least none that I’m aware of. The best resource I have found is MyRI, a collaborative project of three Irish academic libraries, freely available at http://myri.conul.ie/.

So, which organisations are responsible for fostering these myriad digital skills, now and in the future? The day’s final exercise identified the usual suspects – Jisc, ARMA, CILIP and of course our very own UKCoRR – though there are ongoing questions around our capacity as an unfunded, voluntary organisation and our positioning in relation to these other organisations. We hope to continue our consultation around the future vision and remit for UKCoRR (survey) at our members’ day at the University of Warwick on 7th July.

A useful lunchtime conversation with Helen Blanchett considered some sort of OA training provision and network support from Jisc; the discussion was obviously informal, but we hope that Helen will be in Warwick for our members’ day to discuss the idea further.

If the landscape is complex now it will only become more so, with ever more specialist roles and associated skill sets, and the final discussion was around the potential role for Jisc and, by extension, for our own purposes, for UKCoRR:


Jisc’s provision currently comprises “144 guides and case studies”, as well as a number of face-to-face and online courses on both fee-paying and free models, including their Digital Leaders programme and the Digifest conference. One suggestion was that there is in fact a gap for a conference dedicated to cutting-edge digital research practice, given that Digifest 2017 focused on teaching rather than research.

Related posts from University of Cambridge Office of Scholarly Communication:


CRIS and repositories (draft briefing)

I’ve been going through old files before I leave my current post and uncovered this, which I drafted for the Repositories Support Project (RSP) several years ago but which was never used. As the RSP is now defunct, I’m linking it here in case it might be of use to someone:

Current Research Information Systems (CRIS) and repositories


Repository Professionals: The Next Generation

The internet is basically a teleportation device for information [citation needed] and, like the original Star Trek series, where the technology may have aspired to be futuristic but was very firmly rooted in a 1960s aesthetic, repository systems are still using technologies and protocols from the early days of the web (COAR 2016).

Spock and Kirk, 1968. © Public domain. Image source: Wikimedia Commons

In April 2016 the Confederation of Open Access Repositories (COAR) launched a working group focussed on Next Generation Repositories, and as the 9th International Open Access Week rolls around it’s another chance to take stock of the repository landscape and its mission to boldly promote open access; the recent discussion around this is captured by Richard Poynder and Kathleen Shearer of COAR.

Equally important as the technology, if not more so, are those on the bridge and in the engine room, who increasingly need a professional skill set whose breadth and depth rivals anything required by Starfleet: from traditional librarianship to web science, and from a hundred and one technical protocols to an arcane realm of policy edicts from universities, research funders and government. We even have our own Borg in the form of the commercial publishing industry, ever more efficient at assimilating the infrastructure and co-opting the language of open access. As a case in point, the publishing giant Elsevier, which acquired [Atira] Pure (CRIS software) in 2012, Mendeley in 2013 and more recently SSRN in 2016, now runs a Mendeley Certification Program for Librarians as it seeks to lock researchers and their librarians, Facebook-like, into its ecosystem. A particularly jarring example of corporate hubris, even by their standards.

For this year’s Open Access Week, then, we want to know: what do you think UKCoRR’s role should be in nurturing the next generation of repository professionals?

As UKCoRR member Jennifer Bayjoo recently argued in her paper Getting New Professionals into Open Access at the Northern Collaboration Conference, OA and repositories are still not a priority in many CILIP-accredited professional library and information management qualifications. CILIP assesses courses against its Professional Skills and Knowledge Base, which has just a single reference to Open Access, buried in point 7.3 ‘Selection of materials and resources’ (and which is only accessible to paid-up members of CILIP, in stark contrast to Elsevier’s ‘freemium’ model for Mendeley).

It is also instructive to consider the types of job that have been posted to the UKCoRR list, which increasingly focus on a broader range of skills than the traditional ‘Repository Manager’ role, with a growing emphasis on research data management for example. Of the 16 roles posted to the list in 2016 only 2 explicitly mention the word ‘repository’ and just 1 ‘librarian’:

Research Repository Data Administrator
Research Publications Officer
Research Data Management Advisor
Research Data Support Manager
Copyright and Scholarly Communications Manager
Research and Scholarly Communications Consultant
Open Access & Research Data Advisor
Manager of the Institutional Repository
REF and Systems Manager
Research Data Adviser
Research Publications Manager
Research support librarian
Research Publications Officer
Research Data Officer
Research Publications Assistant
Open Access Officer

The most common perspective on the value of UKCoRR seems to be our supportive community, which is largely self-sustaining via the email list. Do we need to do anything beyond this?

What is our role in liaising with other organisations like Jisc, CILIP or ARMA?

Might you be willing to share your expertise via an informal mentorship scheme for example?

With these issues in mind, we have put together a very short survey and would like your help to identify the skills and knowledge that future Open Access professionals should have.

As Captain Jean-Luc Picard might have said to send his (much more modern) Starship Enterprise to warp speed: “Engage!”

Technology words for repository managers

Posted on behalf of Nancy Pontika, UKCoRR External Liaison Officer and Open Access Aggregation Officer for CORE 

The role of the repository manager is constantly evolving. The repository manager of today needs to be aware of, and able to interpret, not only their institution’s open access policies, but also the national and international policies that emerge from public funding agencies. The proliferation of these policies introduces technical requirements for repositories, such as the use of current research information systems (CRIS) and the installation of various plug-ins, and repository managers often serve as intermediaries between their institution’s IT department and their supervisors or library directors, communicating messages and requests between the two. A couple of months ago on the UKCoRR members list we had a discussion around the specific technological terms that repository managers hear regularly.

Even though I am not currently a repository manager, I couldn’t help but sympathise. For the past year I have been working for CORE – a global repository harvesting service – as the only non-developer in a team of four (wonderful!) developers. As a result, there are times when I feel lost in our discussions, so I decided to put together a list of frequently used technical terms that relate to repositories. As a first step, a Google Spreadsheet was created with some basic terms. Then the UKCoRR list members were asked to add more of these jargon words and to weight each term based on how often they tend to hear it. (I am going to keep the list there for future reference; do not hesitate to save a local copy if you find it useful.) In the end, with the help of two CORE developers – Samuel Pearce and Matteo Cancellieri – we tried to provide brief definitions and give simple examples where possible.

The following table contains the list of these terms and their definitions. Since this is not an exhaustive list (and it was never meant to be), feel free to add other terms in the comments area of this blog post.


Web technologies
Apache: Apache is a web server. When your browser, such as Internet Explorer or Google Chrome, requests a website (for example http://core.ac.uk), Apache is the software that returns the webpage to your browser.
Tomcat: Tomcat is a web application server. It works like a web server but serves more complex pages and operations than a plain web server. For example, your online banking system uses a web application server, while this blog uses a web server.
Java: A programming language that usually runs in web application servers (such as Tomcat). CORE uses Java to run the harvesting of the repositories.
PHP: An open source scripting language particularly suited to website development. It is commonly installed with Apache, which allows web pages to be more complex without having to run a separate web application server such as Tomcat. For example, CORE uses PHP in its web pages.
robots.txt: A text file that specifies how a web server regulates access by automatic content downloaders. For example, CORE follows the rules in a repository’s robots.txt file. The rules may limit the number of requests per second made to your web server or restrict access to certain places on your website, such as a login page. (See the sketch after this section.)
SSH (Secure Shell): A protocol that allows one computer to connect to another and send commands by typing text rather than clicking buttons.
MySQL: MySQL is an open source database management system owned by the Oracle Corporation.
Perl: A programming language usually used for scripting and text processing.
JavaScript: A programming language that usually runs in your browser to make web pages more dynamic and reactive. Web forms may use JavaScript to ensure they are filled in correctly before submission.
Crawler: A program that automatically visits web pages and processes them. A common example is Google, which crawls websites, extracts content and makes it available via its search engine.
Cron jobs: Programs that are set to run at specific times. For example, they are used for periodic tasks such as running automatic updates every day at midnight, or extracting and processing the text from the full-text outputs in your repository to make them searchable.
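As a concrete illustration of the robots.txt entry above, here is a minimal Python sketch, using only the standard library, of how a polite crawler might consult a repository’s robots.txt before fetching a page; the repository URL and user agent name are placeholders.

```python
# Minimal sketch: check robots.txt before crawling, as a polite
# harvester would. The URL and user agent are placeholders.
from urllib.robotparser import RobotFileParser

robots = RobotFileParser()
robots.set_url("https://repository.example.ac.uk/robots.txt")
robots.read()  # fetch and parse the rules

page = "https://repository.example.ac.uk/records/1234"
if robots.can_fetch("ExampleHarvester", page):
    print("Allowed to fetch", page)
else:
    print("robots.txt disallows fetching", page)

# Sites may also declare a crawl delay (None if unspecified):
print("Crawl delay:", robots.crawl_delay("ExampleHarvester"))
```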
Development
dev site: A website used for testing. This allows developers to test and process information without the risk of breaking the “live” production website.
Git: A version control system, like Subversion (SVN). It enables tracking changes in code.
SVN/Subversion: “SubVersioN” – a version control system, like Git. It enables tracking changes in code.
clone: A command in Git that copies code from a remote server to a local machine.
Other
UNIX: An operating system analogous to DOS, Windows and Mac OS. Nowadays, Unix refers to a group of operating systems that adhere to the Unix specification. Examples of Unix-based operating systems include Linux and Mac OS.
LINUX: An operating system based on Unix. The Linux code is open source, allowing anyone to modify and distribute the software and source code, which has created different variants of ‘Linux’. The most popular versions of Linux are Ubuntu, RedHat, Debian and Fedora.
HTTP proxy: A gateway for users on a network to access the internet. This allows large organisations to track internet usage, and also limits the amount of downloaded data by storing copies within the proxy: the next time the same website is requested, the local copy is sent to the user rather than being downloaded again.
External resolver: An external resolver service (such as The DOI® System or HDL.NET®) allows a digital object, such as a research output, to have a unique global identifier.
Mirrors: A mirror is a copy of another website. An organisation may mirror a website to reduce traffic and hits to the source website.
Metadata Protocols
OAI-PMH: OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is a standard for exposing metadata in a structured way, particularly so that computers can understand it. (See the sketch after this section.)
SWORD: SWORD (Simple Web-service Offering Repository Deposit) is a protocol that simplifies and standardises the way content is deposited into repositories.
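To make the OAI-PMH entry more concrete, below is a minimal sketch of a ListRecords harvesting request. The verb, metadataPrefix and XML namespace are defined by the OAI-PMH 2.0 standard; the repository base URL is a placeholder.

```python
# Minimal sketch: harvest record headers via OAI-PMH (ListRecords).
# The base URL is a placeholder; the verb and prefix are standard.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.ac.uk/oai"  # placeholder
OAI = "{http://www.openarchives.org/OAI/2.0/}"  # standard namespace

params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
url = BASE_URL + "?" + urllib.parse.urlencode(params)

with urllib.request.urlopen(url) as response:
    tree = ET.parse(response)

# Each record carries a header with a unique identifier and datestamp.
for header in tree.iter(OAI + "header"):
    print(header.findtext(OAI + "identifier"),
          header.findtext(OAI + "datestamp"))
```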
Data access
API: An API (Application Programming Interface) is a set of rules that defines how pieces of software, or two separate programs, interact with each other. For example, the CORE API allows developers to use CORE’s data from within their own applications. (See the sketch after this table.)
Widget: A small application with limited functionality that runs within a larger application or program. The CORE Similarity Widget, for example, retrieves similar articles based on metadata and runs within the larger application of a repository.
Plugin: Similar to a widget, a plugin adds extra functionality to software. It may add new features or change the way an existing feature works.
Text mining: The process by which high-quality information is automatically extracted from text by a computer.
Data dumps: One or more files that contain a large set of data.
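Finally, as flagged under the API entry above, here is a minimal sketch of calling a search API such as CORE’s. The endpoint, parameters and response fields shown are assumptions for illustration only; consult the current CORE API documentation before relying on any of them.

```python
# Minimal sketch: search an aggregator's API for articles.
# ASSUMPTIONS: the CORE v2 endpoint, parameters and response shape
# are illustrative only; the API key is a placeholder.
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR-API-KEY"  # placeholder
query = urllib.parse.quote("open access repositories")
url = (f"https://core.ac.uk/api-v2/articles/search/{query}"
       f"?page=1&pageSize=5&apiKey={API_KEY}")

with urllib.request.urlopen(url) as response:
    results = json.load(response)

# Print the title of each matching article, if any were returned.
for article in results.get("data") or []:
    print(article.get("title"))
```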