EPrints, Impact and our MiniREF Exercise

During October–December 2010, the University ran a “MiniREF Exercise” to gather output, impact and esteem data for over 1,200 research staff. The review of that work is ongoing.

Enlighten, our institutional repository service, was used as the platform for selecting and reporting this information. The Library’s Enlighten team worked closely with the Office of the Vice Principal (Research and Enterprise), Research Offices in Colleges and many academic colleagues as we updated and added publications data.

Impact and Enquire

Our JISC-funded Enquire project had impact as a core element and gave us scope to explore different options for gathering supporting information about publications as well as impact data. Ultimately, we decided to implement the functionality which Southampton had developed back in 2007 for RAE2008.

IRRA – Institutional Repositories for Research Assessment

The work of the Institutional Repositories for Research Assessment (IRRA) project, funded by JISC for RAE2008, seemed an ideal fit for the collection of output, impact and esteem data.

The IRRA project had developed an EPrints add-on for research assessment which created separate MySQL tables for recording measures of esteem, selecting publications and providing reports. This was designed to:

“Facilitate the gathering of evidence for RAE returns by allowing users to:

  • Record measures of esteem
  • Select items from the repository
  • Qualify each selected item for RAE return

And allowing unit managers/administrators to:

  • Carry out each of the above on behalf of a user (e.g. to adjudicate between two users selecting the same item)
  • Identify and resolve problems with selections
  • Produce reports in Word and Excel (RA2) format”

– From the “EPrints RAE Module Silver Release README”

The EPrints add-on has a rich set of features for capturing and reporting supporting information. We configured it [with assistance from Tim Miles-Board at EPrints] to provide an additional focus on impact and esteem data for our mini-REF pilot exercise. We created two separate means of collecting impact information: the first was a separate section to capture the impact of an individual’s research; the second was an impact option for each specific output, where multiple authors could add their own impact to the same output. In the end, however, we didn’t use the latter approach.

We trialled this with our internal REF Working Group, who were very impressed with initial demonstrations of the add-on and provided valuable feedback, enabling us to refine the language and make changes to the functionality. One example was the removal of the “Selected by” feature, which showed staff who else had selected a publication, at least in the user view. We have kept this as a feature in our REF Reporting section, which only a handful of administrators have access to.

New MiniREF user options

Using the IRRA add-on we created a new set of mini-REF options which are displayed when staff log in.

Mini-REF Options

These options included “REF Selections” – for publications and outputs associated with an individual, and “REF Impact” for impacts authored/created by individuals. A third option “REF Reporting” was only available to designated REF Administrators in the College Research offices. This was a new user role (with thanks to Patrick McSweeney at EPrints).

REF Selections

When users chose “REF Selections”, a list of outputs from 2008 onwards was displayed for selection. In the original release this was based on surname and publication year from 2001 onwards. We updated this to 2008 onwards and took advantage of our author GUID work to show available records more precisely – or, in some cases, to highlight records which didn’t yet have a GUID.

Staff could select their publications, rank them in order of preference, provide additional information and rate them. The text and guidance for this exercise was written by colleagues in the Office of the Vice-Principal (Research and Enterprise), in consultation with our REF Working Group.

REF Selected Items

Selected Items screen

Clicking on Edit Info provided users with a Selection Details screen where further details, a self-rating and a preference could be added.

REF Selection Entry Screen

Impact and esteem

In addition to the selection of four outputs we also wanted to capture impact and esteem data. The IRRA add-on provides a range of granular esteem options (which included impact), but for our MiniREF exercise we used only three fields:

  • Impact
  • Esteem
  • Other Information

The Other Information field was added so that staff could supply any additional supporting information they wished to include.

REF Reporting and Administrator Options

The IRRA add-on also included a reporting section, with Word (HTML) and CSV (for Excel) outputs. This could be extended to an appropriate XML format, potentially CERIF, for interoperability, but for our exercise we focussed on the existing reports. The Excel report was used the most, since the data could be re-used in Access or readily reviewed in Excel itself.

This was designed with the RA2 return in mind and included RA2 labels. We removed many of these and replaced output codes such as D [a journal article] with the text “journal article”.

It was possible to extract various lists either electronically or in print, for example, to help with preparation and checking of REF submissions.

We added a REF Administrator role so that authorised staff including College Research Office staff and Research and Enterprise administrators could complete information on behalf of academic colleagues. This was an invaluable feature.

REF Administrator Options

Concluding comments and key lessons

Over 1,200 academic colleagues returned data to the MiniREF, through a mix of self and proxy selection. The exercise ran for just over six weeks, and during that time more than 4,000 additional records were added to Enlighten. The Library’s Enlighten team dealt with more than 700 e-mail and telephone enquiries [and this doesn’t include the ones which went to our College Research Offices].

We learned a number of key process and development lessons, including:

  • Your publications database can never be comprehensive enough in advance of an exercise like this
  • Ensure you are ready to deal with the volume of queries, updates and additional publications which the exercise will elicit
  • Administration features including the opportunity to make changes on behalf of users (to update records on their behalf) and to run various reports are absolutely vital for managing returns and gauging progress
  • Learn lessons, take on board feedback and be flexible/nimble enough to make changes to the system and workflows

The Library’s role and the work of the Enlighten team have been greeted very positively as a result of this exercise. It has enabled us to work much more closely with academic colleagues and College Research staff. We will now build on this work to ensure Enlighten is as comprehensive as possible.

Using Google Analytics to show our Top 100 Searches

We have just updated our Top 100 searches for August 2010 from Google and other Internet search engines. This post provides an overview of how we do this and the Excel files which we use.

It’s done manually at the beginning of each month but doesn’t take long, once the initial files have been set up. We just:

  • Extract the data from Analytics
  • Use Excel to produce a Top 100 searches list in HTML
  • Copy the HTML data into a Top 100 searches page

Getting Started with Google Analytics

In Analytics, Keywords can be found under Traffic Sources in the left-hand option menu. Set the date range which you want to report on and GA will display an overview of the number of searches and the number of keywords used. It shows the first 10 by default; we change this to the top 250 to get an overview of the searches.

Keywords Traffic for Enlighten for August 2010

The keywords are exported using the Export > CSV for Excel file format. This exports daily search counts as well as the keywords and their visits.

Exporting as CSV for Excel

From Analytics to Excel

In Excel we trim the data from Analytics down to just two columns: keywords and visits. At its most basic, we could have simply made this data public, but we felt it was more useful to provide links back into Enlighten so that each search in the list can be re-run.

Excel with Keyword and Visits Columns

To use these, we add two additional columns: one for the keywords to be used by the link (Keyword_Plus) and one for the links which will be generated by our Excel formula (Link). In the new Keyword_Plus column we replace all the spaces with +; these will be used as the search terms by Google.

In the Link column, we use an Excel formula to create an HTML-formatted list item which combines the first two columns. This is set up to pass the search terms back to Google. To ensure relevance we use our local custom Google Search, but results could also be limited using site:.

We use the former because it gives us more control over the sites searched and the “look and feel” of the search and results pages. We definitely recommend applying an option like this rather than just passing results straight to Google…

Excel with Keyword Plus column

The formula which we use for links is this (the hardest part was sorting out the & and " syntax!):

="<LI><A HREF=""http://www.lib.gla.ac.uk/enlighten/search/results.html?cx=008133519044995412890%3Ai9xbikqzcrc&cof=FORID%3A9&ie=UTF-8&q="&C1 &"&sa=Search"&""">"&B1&"</A> "&"("&B2&")"&"</li>"

An alternative version just goes directly to Google and limits results by site:

="<LI><A HREF=""http://www.google.co.uk/search?q="&C2 &"+site:eprints.gla.ac.uk"&""">"&B1&"</A> "&"("&B2&")"&"</li>"

A copy of our August 2010 Excel file, including these formulas, can be found as GA_Custom_Search_Example.xls and GA_Site_Search_Example.xls on the Enlighten website.

Once the formula is in the first row of the search terms, we fill it down for all 100 searches. The column references are replaced with the keywords and visits, and each entry is wrapped with <LI> so the results can be used as an ordered list. This was initially an unordered list, but we changed it quite early on to an ordered list since that made more sense for a Top 100 listing.
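For anyone who would rather script this step than fill formulas down in Excel, the same transformation can be sketched in Python. This is a minimal sketch, not part of our actual workflow: the CSV filename and the keyword/visits column layout are assumptions based on the trimmed export described above, and the custom search URL is the same one used in the formula.

```python
import csv
import urllib.parse

# Base URL of our custom Google Search (same cx value as the Excel formula)
BASE = ("http://www.lib.gla.ac.uk/enlighten/search/results.html"
        "?cx=008133519044995412890%3Ai9xbikqzcrc&cof=FORID%3A9&ie=UTF-8&q=")

def link_for(keyword, visits):
    """Build one list entry, mirroring the Excel Link formula."""
    # quote_plus replaces spaces with +, as the Keyword_Plus column does
    query = urllib.parse.quote_plus(keyword)
    return '<LI><A HREF="%s%s&sa=Search">%s</A> (%s)</li>' % (
        BASE, query, keyword, visits)

def top100_html(csv_path):
    """Read a two-column keyword/visits CSV (hypothetical filename/layout)
    and return the list items for the Top 100 page, wrapped in <ol> tags."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))[:100]   # keep only the top 100 searches
    items = "\n".join(link_for(kw, visits) for kw, visits in rows)
    return "<ol>\n" + items + "\n</ol>"
```

The output of `top100_html` could then be pasted (or written directly) into the Top 100 Searches page in place of the copied column D.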

From Excel to the Top 100 Searches web page

The results of Column D are copied into our Top 100 Searches web page between ordered list tags to show their ranking. The introductory text is updated to provide an overview of the number of searches for that month.

This is uploaded as index.html into our /top100searches directory, and the finished page is available from the Top 100 Searches link which is part of our default left-hand navigation bar.

Enlighten - Top 100 Searches (August 2010)

Monthly and Yearly Counts

Over the last couple of months we have also started to provide monthly as well as year-to-date search counts. This gives new search terms (and papers) an opportunity to be seen.

See: 2010 to date

A word about John Wayne…

Searches for John Wayne continue to rank at the top of both our annual and monthly statistics. Using the Advanced Filter for August 2010 we can see that there were 68 variations on John Wayne keywords, sending a total of 392 searches. These accounted for 2.28% of our search traffic.

The paper “Is that you John Wayne? Is this me?”: myth and meaning in American representations of the Vietnam war, by Professor Simon Newman, is freely available from the publisher link in the Enlighten record.

Enlighten workshops for University staff

Last week we ran a couple of workshop sessions open to staff across the University who deposit in Enlighten. We have an Enlighten contacts mailing list, which we set up after our various departmental meetings, to update staff on new services, workshops and changes.

The workshop was also a good opportunity to provide a snapshot “state of the nation” showing Enlighten’s growth in deposits [with thanks to the Enlighten team and EPrints Services] and usage through a mix of our ROAR data and Google Analytics.

Enlighten: GUIDs, Glasgow Authors and Funders

It has been an exciting few months for Enlighten and our JISC-funded Enrich project, with the introduction of a wide range of updates to the service.

These include:

  • Logging in with University credentials (GUID)
  • Browsing by Glasgow author
  • Importing new records
  • Adding funder data

Logging in with University credentials (GUID)

Over 11,000 user records for staff were added to the repository from our Data Vault by IT Services in December 2009. These records enable staff to log in using their GUID (Glasgow Unique IDentifier) to deposit publications – no more need to register for a separate account.

We have also disabled “create account”; no one can now register for a separate account to deposit in, or log in to, Enlighten.

A key aim of Enrich was to lower barriers to deposit, and enabling these logins for staff has already brought a marked increase in the number of staff adding records and depositing their papers.

Browsing by Glasgow author

With these user records in place, and working with EPrints Services, we have set up browse views of Glasgow authors using the name in the user record, NOT the name in the author field of the eprint record as the default people view does.

This approach has enabled us to decouple the name used by an author in the citation of a paper from the name by which they are identified in the University’s central systems. All publications by staff who have published under both a maiden and a married name, for instance, are now grouped together.

Repository depositors can now link publications to staff records by adding their unique Glasgow Identifier (GUID) in the author field.

These Glasgow author views use the EPrints record number, not the GUID or the staff number.

Importing new records

We have been importing records from University departments into Enlighten and there are now over 22,000 records in the service. This work is ongoing and we have additional data from a range of departments which will be added over the coming weeks.

Adding funder data

A new Funding option has been added to the deposit workflow to enable project and funder data to be linked to publications.

This is done with a multi-value Funder field which includes project name, code and funder. This field can be autocompleted (using an HTML mark-up file) with publicly available projects from the Research System. This multi-value field replaces the default Funder and Project fields which are part of EPrints.
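As a purely hypothetical illustration of how the autocomplete source file might be produced from a public project export (the CSV-style input, the lowercased search term, and the tab-separated term-plus-HTML-fragment layout are all assumptions for the sketch, not the documented EPrints format, which varies by version and configuration):

```python
import html

def autocomplete_rows(projects):
    """Turn (project_name, code, funder) tuples into lookup rows:
    a plain search term, a tab, then a small HTML fragment for display.
    This layout is an assumption for illustration only."""
    rows = []
    for name, code, funder in projects:
        fragment = "<li>%s (%s) - %s</li>" % (
            html.escape(name), html.escape(code), html.escape(funder))
        # lowercase the term so matching is case-insensitive
        rows.append("%s\t%s" % (name.lower(), fragment))
    return rows
```

A nightly job along these lines could regenerate the lookup file whenever the public project list in the Research System changes.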

Repository staff will check that the funder data added is publicly available before the publication is moved into the live archive.

The funder field is displayed in the ePrints record and is used to create a Research Funder browse view. This lists funders and their associated publications – work on this is ongoing.

Open Access at the Berlin7 Conference

In December I attended the Berlin7 conference at the Sorbonne in Paris (2–4 December 2009) as part of a panel session entitled “Practical challenges in moving to Open Access: a focus on research funders and universities”.

A Research Institution’s View

The focus of my presentation was the role of institutions and funders in implementing Open Access mandates with a particular focus on our own institutional repository.

It posed five questions [with some answers] for institutions and funders:

  1. How universities can help funders implement mandates
  2. What the infrastructure implications are for universities
  3. What the policy implications are for universities
  4. How funders can help universities
  5. What the shared (and different) goals are for institutions and funders

One of the key goals of Enrich is to easily link funding data (from our Research System) to publications in the repository and to demonstrate compliance with funder mandates such as the Wellcome Trust’s.

The panel was chaired by Fred Friend and also included:

  • John Houghton (Victoria University)
  • Alma Swan (Key Perspectives Ltd)
  • Wolfram Horstmann (Bielefeld University)
  • Johannes Fournier (DFG, German Research Foundation) and Anita Eppelin (German National Library of Medicine)
  • Kurt de Belder (Leiden University)
  • Robert Kiley (Wellcome Trust)
  • Bernard Rentier (Université de Liège)

All of the presentations from Berlin7 are available online.