Using Google Analytics to show our Top 100 Searches

We have just updated our August 2010 Top 100 searches from Google and other Internet search engines. This post provides an overview of how we do this and the Excel file(s) which we use.

It’s done manually at the beginning of each month but doesn’t take long once the initial files are set up. We just:

  • Extract the data from Analytics
  • Use Excel to produce a Top 100 searches list in HTML
  • Copy the HTML data into a Top 100 searches page

Getting Started with Google Analytics

In Analytics, Keywords can be found under Traffic Sources in the left-hand option menu. Set the date range which you want to report on and GA will display an overview of the number of searches and the number of keywords used. It shows the first 10 by default; we change this to the top 250 to get an overview of the searches.

Keywords Traffic for Enlighten for August 2010


The keywords are exported using the Export > CSV for Excel file format. This exports daily search counts as well as the keywords and their visits.

Exporting as CSV for Excel


From Analytics to Excel

In Excel we trim the data from Analytics into just two columns: keywords and visits. At its most basic, we could have just made this data public, but we felt it was more useful to provide links back into Enlighten which would enable us to use this data.
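For anyone who would rather script this trimming step than do it in Excel, the following is a minimal Python sketch. It assumes the export has already been reduced to a plain CSV with Keyword and Visits headers; the sample rows and the extra Pages/Visit column are made up for illustration, and the real Analytics export contains further sections (such as the daily counts) that would need removing first.

```python
import csv
import io

# Hypothetical sample data: the real Analytics export has more columns
# and extra sections (such as daily counts) that must be stripped first.
SAMPLE_EXPORT = """Keyword,Visits,Pages/Visit
john wayne,120,1.5
open access,45,2.0
glasgow eprints,80,3.1
"""


def top_keywords(csv_text, limit=100):
    """Return (keyword, visits) pairs sorted by visits, highest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep only the two columns we care about and ignore the rest.
    rows = [(row["Keyword"], int(row["Visits"])) for row in reader]
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:limit]


top = top_keywords(SAMPLE_EXPORT)
```

The limit argument defaults to 100 to match the Top 100 listing, but any cut-off could be used.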

Excel with Keyword and Visits Columns


To use these, we add two additional columns: one for the keywords to be used by the link (Keyword_Plus) and the other for the links which will be generated by our Excel formula (Link). In the new Keyword_Plus column, we replace all the spaces with +; these will be used as the search terms by Google.
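The Keyword_Plus substitution is simple enough to sketch in Python. The first function below mirrors the plain space-to-plus replacement described above; the second is an optional stricter variant (not something we do in the spreadsheet) that uses the standard library to also percent-encode characters such as & or ? which could otherwise break a URL. Both function names are illustrative.

```python
from urllib.parse import quote_plus


def keyword_plus(keyword):
    """Mimic the Keyword_Plus column: every space becomes '+'."""
    return keyword.replace(" ", "+")


def keyword_plus_encoded(keyword):
    """Stricter variant: spaces become '+' and reserved characters
    such as '&' or '?' are percent-encoded as well."""
    return quote_plus(keyword)


plain = keyword_plus("john wayne vietnam")
encoded = keyword_plus_encoded("r&d funding")
```

For keywords made up only of letters, digits and spaces, the two functions give the same result.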

In the Link column, we use a formula in Excel to create an HTML formatted list which combines the first two columns. This is set up to pass the search terms back to Google. To ensure relevance we use our local custom Google Search, but results could instead be limited using site:.

We use the former because it gives us more control over the sites searched and the “look and feel” of the search and results pages. We do definitely recommend applying an option like this and not just passing results to Google…

Excel with Keyword Plus column


The formula which we use for links is this (the hardest part was sorting out the & and ” syntax!):

="<LI><A HREF="""&C1 &"&sa=Search"&""">"&B1&"</A> "&"("&B2&")"&"</li>"

An alternative version just going directly to Google and limiting by site:

="<LI><A HREF="""&C2 &""&""">"&B1&"</A> "&"("&B2&")"&"</li>"

Copies of our August 2010 Excel files, including these formulas, can be found as GA_Custom_Search_Example.xls and GA_Site_Search_Example.xls on the Enlighten website.

Once the formula is in the first row of the search terms, we fill it down for all 100 searches. The column references are replaced with the keywords and visits, and each row is wrapped in <LI> so it can be used in an ordered list. This was initially an unordered list, but we changed it quite early on since an ordered list made more sense for a Top 100 listing.
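The same list items could be generated outside Excel. This Python sketch is an equivalent under stated assumptions rather than a copy of our spreadsheet: SEARCH_URL is a placeholder (the post does not give the real custom search address), and the function names are illustrative. Unlike the Excel formula, it escapes the & in the link as &amp;, which is what strict HTML expects inside an attribute.

```python
from html import escape

# Hypothetical search URL prefix; replace with the real search address.
SEARCH_URL = "https://www.example.org/search?q="


def list_item(keyword, visits, base_url=SEARCH_URL):
    """Build one <li> linking the keyword to a search, with its visit count."""
    href = base_url + keyword.replace(" ", "+") + "&sa=Search"
    return '<li><a href="{}">{}</a> ({})</li>'.format(
        escape(href, quote=True),  # '&' becomes '&amp;' inside the attribute
        escape(keyword),           # guard against '<' or '&' in a keyword
        visits,
    )


def ordered_list(rows):
    """Wrap the items in <ol> tags so the browser shows the ranking."""
    items = [list_item(keyword, visits) for keyword, visits in rows]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"


html_list = ordered_list([("john wayne", 120), ("open access", 45)])
```

The rows are expected to be sorted by visits already, since the ordered list numbering is what conveys the rank.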

From Excel to the Top 100 Searches web page

The results of Column D are copied into our Top 100 Searches web page between ordered list tags to show their ranking. The introductory text is updated to provide an overview of the number of searches for that month.

This is uploaded as index.html into our /top100searches directory, and the finished page is available from the Top 100 Searches link which is part of our default left-hand navigation bar.

Enlighten - Top 100 Searches (August 2010)


Monthly and Yearly Counts

Over the last couple of months we have also started to provide monthly as well as year-to-date search counts. This gives new search terms (and papers) an opportunity to be seen.

See:  2010 to date

A word about John Wayne…

Searches for John Wayne continue to rank at the top of both our annual and monthly statistics. Using the Advanced Filter for August 2010 we can see that there were 68 keyword variations containing John Wayne, which sent a total of 392 searches. These accounted for 2.28% of our search traffic.
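The same kind of filter can be reproduced on exported keyword data. The sketch below is illustrative only: the function name and the sample rows are made up, and it simply counts the keywords containing a phrase, totals their visits and works out their share of all visits.

```python
def phrase_share(rows, phrase):
    """Return (variations, visits, percentage share) for keywords
    containing the phrase, ignoring case."""
    phrase = phrase.lower()
    matches = [(k, v) for k, v in rows if phrase in k.lower()]
    matched_visits = sum(v for _, v in matches)
    total_visits = sum(v for _, v in rows)
    share = 100.0 * matched_visits / total_visits if total_visits else 0.0
    return len(matches), matched_visits, share


# Hypothetical sample rows, not our real August 2010 figures.
sample = [
    ("john wayne", 30),
    ("is that you john wayne", 10),
    ("open access", 60),
]
variations, visits, share = phrase_share(sample, "john wayne")
```

With the sample rows, two keywords match and together account for 40 of the 100 visits.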

The paper, “Is that you John Wayne? Is this me?”: myth and meaning in American representations of the Vietnam war by Professor Simon Newman is freely available from the publisher link in the record in Enlighten.


Enlighten workshops for University staff

Last week we ran a couple of workshop sessions open to staff across the University who deposit in Enlighten. We have an Enlighten contacts mailing list, which we set up after our various department meetings, to update staff on new services, workshops and changes.

The workshop was also a good opportunity to provide a snapshot “state of the nation” showing Enlighten’s growth in deposits [with thanks to the Enlighten team and EPrints Services] and usage through a mix of our ROAR data and Google Analytics.

Enlighten: GUIDs, Glasgow Authors and Funders

It has been an exciting few months for Enlighten and our JISC-funded Enrich project with the introduction of a wide range of updates to the service.

These include:

  • Logging in with University credentials (GUID)
  • Browsing by Glasgow author
  • Importing new records
  • Adding funder data

Logging in with University credentials (GUID)

Over 11,000 user records for staff were added to the repository from our Data Vault by IT Services in December 2009. These records enable staff to log in using their GUID (Glasgow Unique IDentifier) to deposit publications – no more need to register for a separate account.

We have also disabled “create account”, so no one can now register to deposit or log in to Enlighten.

A key aim of Enrich was to lower barriers to deposit and enabling these logins for staff has already seen a marked increase in the number of staff now adding records and depositing their papers.

Browsing by Glasgow author

With these user records in place, and working with EPrints Services, we have set up browse views of Glasgow authors. These use the name in the user record, NOT the name in the author field of the ePrint as the default people view does.

This approach has enabled us to decouple the name used by an author in the citation of the paper from the name which they are identified by in the University’s central systems. All publications by staff who have published under both a maiden and a married name, for instance, are now grouped together.

Repository depositors can now link publications to staff records by adding their unique Glasgow Identifier (GUID) in the author field.

These Glasgow author views use the EPrints record number, not the GUID or the staff number.

Importing new records

We have been importing records from University departments into Enlighten and there are now over 22,000 records in the service. This work is ongoing and we have additional data from a range of departments which will be added over the coming weeks.

Adding funder data

A new Funding option has been added to the deposit workflow to enable project and funder data to be linked to publications.

This is done with a multi-value Funder field which includes project name, code and funder. This field can be autocompleted (using an HTML mark-up file) with publicly available projects from the Research System. This multi-value field replaces the default Funder and Project fields which are part of EPrints.

Repository staff will check that the funder data added is publicly available before the publication is moved into the live archive.

The funder field is displayed in the ePrints record and is used to create a Research Funder browse view. This lists funders and their associated publications – work on this is ongoing.

Open Access at the Berlin7 Conference

In December I attended the Berlin7 conference at La Sorbonne in Paris (2-4 December 2009) as part of a panel session entitled “Practical challenges in moving to Open Access: a focus on research funders and universities”.

A Research Institution’s View

The focus of my presentation was the role of institutions and funders in implementing Open Access mandates with a particular focus on our own institutional repository.

It posed 5 questions [with some answers] for institutions and funders:

  1. How universities can help funders implement mandates
  2. What the infrastructure implications are for universities
  3. What the policy implications are for universities
  4. How funders can help universities
  5. What the shared (and different) goals are for institutions and funders

One of the key goals of Enrich is to easily link funding data (from our Research System) to publications in the repository and to demonstrate compliance with funder mandates such as that of the Wellcome Trust.

The panel was chaired by Fred Friend and also included:

  • John Houghton (Victoria University)
  • Alma Swan (Key Perspectives Ltd)
  • Wolfram Horstmann (Bielefeld University)
  • Johannes Fournier (DFG, German Research Foundation) and Anita Eppelin (German National Library of Medicine)
  • Kurt de Belder (Leiden University)
  • Robert Kiley (Wellcome Trust)
  • Bernard Rentier (Université de Liège)

All of the presentations from Berlin7 are available online.

New journal fields added to Enlighten

We have added three new fields for the journal article document type for Enlighten. These are:

  • ISSN (Online) [Text]
  • Journal Abbreviation [Text]
  • Published Online [Date]

New Journal Fields in the record display


Example record: Multiple categories: the equivalence of a globular and a cubical approach

Autocompletion enabled

We have also refined journal autocompletion so that it will now fill in five fields:

  • Journal title
  • Journal abbreviation
  • ISSN [Printed]
  • ISSN (Online)
  • Publisher

This is triggered by an entry in any one of the fields except ‘publisher’.

Journal autolookup example for title


New searches

We have added ISSN, ISSN (Online) and journal abbreviation search options to our Advanced Search screen.

New journal fields in Advanced Search


A search for adv math or adv. math. in the abbreviation field will retrieve matching records.


These new fields were added using the web Admin interface and then added into the workflow default.xml. New scripts were added to lookup to implement autocompletion for the new fields. We also removed the local stage which EPrints creates when new fields are created.

We also needed to add phrases into render.xml for the fields when they were added to the record display.

Enlighten’s 1,001 Multi-disciplinary Tweets

EnlightenPapers has now reached 1,001 tweets on Twitter, that is, 1,001 new records since we added the Twitter code on the 9th of June this year. These include, among others, research in Celtic, English, History, Classics, Life Sciences, Law and the Physical Sciences.

The 1,001st Tweet

Our 1,001st tweet is a self-deposited full text paper from the National e-Science Centre (NeSC) at the University of Glasgow, “Applying formal methods to standard development: the open distributed processing experience” by Professor Richard Sinnott.

The 1,001st Tweet


Twitter Growth and Visualisations

Twitter has provided an interesting gauge for our growth which we hadn’t anticipated when we started using it, for both the cumulative count of new additions and the rise (and fall) of followers. Our followers, currently some 65 [but it has been up to 80+], seem to be fairly volatile: many join as papers match their interests and then leave, perhaps as they realise that the volume of outputs and the broader range of material is not what they want in their own Twitter feed.

Twitter visualisation apps like Visible Tweets also provide us with new opportunities to showcase latest additions, and while such apps could be dismissed as party tricks they provide a glimpse of the range of re-use and visualisation options open to us. Visible Tweets also supports Twitter’s range of operators so that views can be refined by date, sender, hashtag and more. For EnlightenPapers we can use the limit from:EnlightenPapers to display only our own tweets rather than any replies which feature us.

Example of paper displayed in Visible Tweets


Try EnlightenPapers in Visible Tweets – with rotation!

New Records, Bibliographic Services Staff and Training

Records, and freely available full text, have been added from a wide range of subjects over the summer months as we work with departments to add both their retrospective material and new publications.

Enlighten staff in our bibliographic services department have been very busy dealing with this increased influx of material, reviewing records, adding subject headings and checking copyright. They have also been involved in training sessions for departmental staff.

Since the beginning of 2009 we have run training sessions for staff in over 30 different departments about Enlighten, the University’s Publications Policy and the deposit process. These have been a mix of PowerPoint, hands-on work and coffee/tea; a formula which has proven to be a successful way to deliver the training and, more importantly, to start to build a “deposit community”.

Repository Growth – A Snapshot from ROAR

Using our entry in ROAR, we can track our growth since 2004 when Enlighten [formerly the Glasgow ePrints Service] was launched as part of the JISC-funded DAEDALUS project. We have had steady growth but this is now really starting to accelerate as the training and publications policy start to make themselves felt. We have already added some 2000+ records since the beginning of 2009 and more than half of those have been added in the last 3 months.

Growth of Enlighten since February 2004 from ROAR
