Using Google Analytics to show our Top 100 Searches

We have just updated our August 2010 Top 100 searches from Google and other Internet search engines. This post provides an overview of how we do this and the Excel file(s) which we use.

It’s done manually at the beginning of each month, but it doesn’t take long now that the initial files are set up. We just:

  • Extract the data from Analytics
  • Use Excel to produce a Top 100 searches list in HTML
  • Copy the HTML data into a Top 100 searches page

Getting Started with Google Analytics

In Analytics, Keywords can be found under Traffic Sources in the left-hand option menu. Set the date range which you want to report on and GA will display an overview of the number of searches and the number of keywords used. It shows the first 10 by default; we change this to the top 250 to get an overview of the searches.

Keywords Traffic for Enlighten for August 2010

The keywords are exported using the Export > CSV for Excel file format. This exports daily search counts as well as the keywords and their visits.

Exporting as CSV for Excel

From Analytics to Excel

In Excel we trim the data from Analytics into just two columns: keywords and visits. At its most basic, we could have just made this data public, but we felt it was more useful to provide links back into Enlighten which would enable us to use this data.
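For anyone who would rather script the trimming step than do it by hand in Excel, here is a minimal Python sketch. It assumes the export has already been reduced to a plain CSV with header columns named Keyword and Visits; those column names, and the sample rows, are illustrative assumptions rather than the exact Analytics export layout (the real file also carries daily counts which would need stripping first).

```python
import csv
import io


def top_keywords(csv_text, limit=100):
    """Return (keyword, visits) pairs sorted by visits, highest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep only the two columns we care about; any other columns are ignored.
    rows = [(row["Keyword"], int(row["Visits"])) for row in reader]
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:limit]


# Illustrative sample data, not a real export.
sample = (
    "Keyword,Visits,Pages/Visit\n"
    "enlighten,306,3.1\n"
    "pictish symbols,67,1.8\n"
    "desquamative gingivitis,302,1.2\n"
)

top = top_keywords(sample, limit=2)
print(top)
```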

Excel with Keyword and Visits Columns

To use these, we add two additional columns: one for the keywords to be used by the link (Keyword_Plus) and the other for the links which will be generated by our Excel formula (Link). In the new Keyword_Plus column, we replace all the spaces with +; these will be used as the search terms by Google.
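The Keyword_Plus step is a simple text substitution. As a rough Python equivalent (the function name keyword_plus is ours, purely for illustration):

```python
from urllib.parse import quote_plus


def keyword_plus(keyword):
    """Replace every space with '+', as we do in the Keyword_Plus column."""
    return keyword.replace(" ", "+")


print(keyword_plus("sex appeal in advertising"))

# quote_plus does the same for spaces and also escapes characters such as
# '&' or '#' that would otherwise break a URL.
print(quote_plus("sex appeal in advertising"))
```

For keywords made up of plain words and spaces the two calls give the same result; quote_plus is probably the safer choice if the keywords can contain punctuation.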

In the Link column, we use a formula in Excel to create an HTML formatted list which combines the first two columns. This is set up to pass the search terms back to Google. To ensure relevance we use our local custom Google Search, but results could instead be limited using Google’s site: operator.

We use the former because it gives us more control over the sites searched and the “look and feel” of the search and results pages. We do definitely recommend applying an option like this and not just passing results to Google…

Excel with Keyword Plus column

The formula which we use for links is this (the hardest part was sorting out the & and ” syntax!):

="<LI><A HREF="""&C1 &"&sa=Search"&""">"&B1&"</A> "&"("&B2&")"&"</li>"

An alternative version just going directly to Google and limiting by site:

="<LI><A HREF="""&C2 &""&""">"&B1&"</A> "&"("&B2&")"&"</li>"

A copy of our August 2010 Excel file, including these formulas, can be found as GA_Custom_Search_Example.xls and GA_Site_Search_Example.xls on the Enlighten website.

Once the formula is in the first row of the search terms, we copy it down for all 100 searches. The cell references are replaced with the keywords and visits and wrapped in <LI> tags so they can be used in an ordered list. This was initially an unordered list but we changed it quite early on to become ordered since that made more sense for a Top 100 listing.
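The same list items can be generated outside Excel. The sketch below is a Python approximation of what the Link formula produces, not a copy of our spreadsheet: the search URL is a hypothetical placeholder, and list_item is a name we have made up for the example.

```python
from html import escape
from urllib.parse import quote_plus

# Hypothetical search address; substitute your own custom search URL.
SEARCH_URL = "https://www.example.org/search?q="


def list_item(keyword, visits, base_url=SEARCH_URL, suffix=""):
    """Build one <li> entry: a link to the search, followed by the visit count."""
    href = base_url + quote_plus(keyword) + suffix
    # escape() turns '&' into '&amp;' so the link is valid inside HTML.
    return f'<li><a href="{escape(href)}">{escape(keyword)}</a> ({visits})</li>'


rows = [("enlighten", 306), ("pictish symbols", 67)]
items = [list_item(k, v, suffix="&sa=Search") for k, v in rows]
print("<ol>\n" + "\n".join(items) + "\n</ol>")
```

Looping over the rows does the job of copying the formula down, and wrapping the result in ordered list tags gives the ranking for free.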

From Excel to the Top 100 Searches web page

The results of Column D are copied into our Top 100 Searches web page between ordered list tags to show their ranking. The introductory text is updated to provide an overview of the number of searches for that month.

This is uploaded as index.html into our /top100searches directory, and the finished page is available from the Top 100 Searches link which is part of our default left-hand navigation bar.

Enlighten - Top 100 Searches (August 2010)

Monthly and Yearly Counts

Over the last couple of months we have also started to provide monthly as well as year-to-date search counts. This gives new search terms (and papers) an opportunity to be seen.

See:  2010 to date

A word about John Wayne…

Searches for John Wayne continue to rank at the top of both our annual and monthly statistics. Using the Advanced Filter for August 2010 we can see that there were 68 variations on keywords with John Wayne sending a total of 392 searches. These accounted for 2.28% of our search traffic.
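We did this with the Advanced Filter in Analytics, but the same kind of count can be worked out from the exported keyword list. A small Python sketch, using made-up sample rows rather than our real August figures (phrase_share is a hypothetical helper name):

```python
def phrase_share(rows, phrase):
    """Count keyword variations containing a phrase and their share of visits.

    rows is a list of (keyword, visits) pairs.
    Returns (number of variations, visits from them, percentage of all visits).
    """
    phrase = phrase.lower()
    matches = [(k, v) for k, v in rows if phrase in k.lower()]
    matched_visits = sum(v for _, v in matches)
    total = sum(v for _, v in rows)
    share = 100.0 * matched_visits / total if total else 0.0
    return len(matches), matched_visits, share


# Illustrative data only.
rows = [
    ("john wayne", 30),
    ("is that you John Wayne", 10),
    ("pictish symbols", 60),
]
variations, visits, share = phrase_share(rows, "john wayne")
print(variations, visits, round(share, 2))
```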

The paper, “Is that you John Wayne? Is this me?”: myth and meaning in American representations of the Vietnam war by Professor Simon Newman is freely available from the publisher link in the record in Enlighten.


Enlighten workshops for University staff

Last week we ran a couple of workshop sessions open to staff across the University who deposit in Enlighten. We have an Enlighten contacts mailing list, which we set up after our various department meetings, to update staff on new services, workshops and changes.

The workshop was also a good opportunity to provide a snapshot “state of the nation” showing Enlighten’s growth in deposits [with thanks to the Enlighten team and EPrints Services] and usage through a mix of our ROAR data and Google Analytics.

Twitter ye not*, EnlightenPapers tweets

Enlighten has embraced the microblogging tech du jour, Twitter, with an account for our latest additions. Launched on the 9th of June, EnlightenPapers has posted nearly 400 updates and garnered an eclectic collection of followers.



As the EPrints wiki notes, setting up Twitter for an EPrints repository is “dead easy” and the necessary code is available from

We saved this code to a file called and dropped it into the cfg.d directory.

Links to Twitter from Enlighten

EnlightenPapers Twitter Link

We have added a “Follow Us” section on Enlighten’s home page (like EPrints Files) where we have clustered the various RSS feeds which EPrints offers with a link to Twitter.

The RSS and Twitter links have been added to the default.xml template and are displayed in all of the record and browse pages.

Just because we can, should we?

Are we just hopping on the microblogging bandwagon? Just because it is so easy to drop the code into EPrints should we really be tweeting our latest additions?

My personal opinion is yes. One of the underlying principles of Enlighten (and I think EPrints) is the effective re-use of our content. We need to be able to push our content to wherever our users happen to be. If they happen to use Twitter then that’s where we should be.

Using WordPress, we have now been able to take advantage of the Twitter widget which provides a realtime reflection of the content being added to Enlighten.

A follow-up question is, can we do more?

Andrew Preater, Web Applications Manager at Durham University Library, has done some further work with the code to manage titles over 140 characters in length. Andrew’s code truncates the title, inserting […] to ensure that there is enough room for the URL, and then includes a #dro hashtag.
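To make the idea concrete, here is a short Python sketch of that kind of truncation. It is our own illustration of the approach described, not Andrew’s code and not the EPrints plugin; the function name build_tweet and the example URL are assumptions.

```python
TWEET_LIMIT = 140
ELLIPSIS = "[…]"


def build_tweet(title, url, hashtag="#dro", limit=TWEET_LIMIT):
    """Shorten the title so that title + URL + hashtag fit within the limit."""
    tail = f" {url} {hashtag}"
    room = limit - len(tail)  # characters left over for the title
    if room < len(ELLIPSIS):
        raise ValueError("URL and hashtag leave no room for a title")
    if len(title) > room:
        # Cut the title and mark the cut with [...] as described above.
        title = title[: room - len(ELLIPSIS)].rstrip() + ELLIPSIS
    return title + tail


long_title = "A" * 200
tweet = build_tweet(long_title, "http://example.org/1")
print(len(tweet))
print(build_tweet("Short title", "http://example.org/1"))
```

Reserving the space for the URL and hashtag first, and only then trimming the title, means the link is never the part that gets cut off.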

More specific hashtags, for instance by faculty e.g. #enl_educ could provide an opportunity for users to search for more appropriate content.

There has been some discussion about Twitter on the EPrints mailing list in the thread “[EP-tech] Eprints files twitter”.

Durham Research Online Twitter entry

Traffic – Early Days

According to Google Analytics, is now the 4th most popular referring site to Enlighten for the period from the 9th of June until the 14th of July.


Other Repositories

A number of other UK repositories are tweeting. This is by no means a complete list, just a few which we have looked at:

* With apologies to the memory of Frankie Howerd

Our Top 100 Searches from Google

Enlighten, like so many other repositories, uses Google Analytics (although I think we are just scratching the surface of the information which it can provide for us) and we have now updated our top 100 keyword searches from Google for the last six months.

The results provide an insight into both the breadth of research which is done at the University and the searches which users are doing in Google which bring them to us.

They also show that in the last two months searches just for “Enlighten” have overtaken the previous top search “desquamative gingivitis”! Enlighten is steadily climbing in the Google rankings and is now listed third in Google search results.

The top 10 terms from the 12th of January to the 30th of June 2009:

  1. enlighten (306)
  2. desquamative gingivitis (302)
  3. ambiguous figures (87)
  4. issn 0021-8979 (82) [Journal of Applied Physics]
  5. enlighten glasgow (68)
  6. pictish symbols (67)
  7. sex appeal in advertising (67)
  8. search b (56)
  9. world journal of engineering (54)
  10. pergamon press glasgow enlighten (52)

The full list of 100 searches >>

Google Custom Search

The searches in our top 100 are passed to our local Google custom search service, which includes material in our Glasgow Theses Service. This local service is configured to display just individual records and full-text PDFs and to exclude all of the Browse Views which a standard Google search includes.