Monday, January 16, 2012

Elementary hacking for Australian scholars who use Google Books

One of the great frustrations of Australian studies in the digital age is that some of the best resources in Google Books are only available in the US. In a cruel irony, these books, which were written, printed, bound and distributed in Australia and are long out of copyright, cannot be released to Australian ISPs. Thankfully, a colleague has shown me that there's a way around the restrictions, which I'll explain here. I've been applying this system to my lexicographic research, which explains the examples that I use.

These tips work for any browser, but I prefer Chrome because the address bar and search bar are one and the same. This makes life simpler, especially if you have multiple tabs open at a time.

How to hack Google Books
The Google Books archive offers access in four ways:
  • No Preview Available provides bibliographic details with no access to content.
  • Snippet View offers bibliographic details plus a few sentences. Sometimes there is a 'contents' section which may bear little relation to the actual contents of the book.
  • Limited Preview lets you search the content and get access to small sections of text.
  • Full View allows you to see, search and download the whole lot.
Because of territorial licensing arrangements with US libraries, many texts that come up as Snippet View or Limited Preview in Australia and elsewhere are available as Full View in the US.

Users in Australia can get around some of these territorial restrictions by using what's called a web anonymiser. This is a system that makes use of a ‘proxy’ IP address borrowed from the US. Using a US proxy is a way of tricking Google into thinking that you're in the Land of the Free (Content).

Here's how to use an anonymiser.
  1. Go ahead and search for your word in Google Books. If you’re not permitted to search the text of the book, it will probably look something like this:

  2. Don’t cry. Select and copy everything in the address bar, then open a new tab by pressing Apple + T on a Mac or CTRL + T on a PC.

  3. In this new tab, open an anonymiser. At the time of writing an effective anonymiser was MyNinjaProxy. Anonymisers seem to change fairly regularly so by the time you read this it may not work for you.

  4. Paste the address you copied into the blank field provided. In MyNinjaProxy it looks like this (next to the button that says 'Surf Now!'):

  5. Hit 'Surf Now!' (or 'Go', or equivalent). The same page you were looking at in Google Books will (fingers crossed) now turn up in the new tab as if you are in America. Voila!


If you don't notice any difference between the Australian and American versions of the page you are looking at then it means that the licensing restrictions apply equally to both countries. With any luck though, the US version will be offered as Full View. As a rule, 20th-century publications and later remain under wraps. Luckily for me, much of my research is on 19th-century texts.

Note
  • It’s not illegal to use anonymisers. As far as I know it’s Google’s responsibility, not yours, to keep this material in the US, if that’s what was in the terms of the scanning negotiations with the libraries.
  • Because the information is travelling from Australia to the US and back again, the pages will take longer to load. Sometimes you may need to reload the page entirely (press the reload button on your browser or click inside the address bar and press return).
  • You cannot physically download the PDF version in Full View texts accessed via a proxy. In my experience a PDF file will download but for some reason it’s blank. If anyone has any luck doing this, let me know.
  • You can't always go directly to the page you want and may have to click through page by page. This seems to be inconsistent from book to book.
  • You cannot link directly to a Google Books page that is routed through an anonymiser, nor save it as a bookmark. In order to avoid going through the whole performance of finding the same part of the text again, I simply make a screenshot of the page. On a Mac, make sure the full page is visible on the screen and then press Apple+Shift+4. A cross will appear. Click and drag over the page area, then release. An image of the page will be saved to your desktop.
  • To check the accuracy of Google's bibliographic metadata click on the icon showing a representation of the front cover of the book to go directly to the beginning, then click forward page by page until you get to the title page. In the image reproduced in Step 5 above, the front-cover icon is the green thing in the top left corner, under the word ‘Report’.

What to do if you still can't get into Full View using an anonymiser
If after all this you don't get a Full or Snippet view, or you don't have access to the pages that contain the bibliographic data, open another tab on your browser and go to www.archive.org. Many of Google's archived books are also archived here and can be accessed without an anonymiser. The search bar, however, will only let you search for the title of a book not the text. This is usually not a problem, because if you've got this far you will already know what book you're interested in. There is also a way around this problem: see What to do if the search function provided by an online archive is dodgy, below.

Once you have the book you are interested in, choose ‘Read online’ to look at the original page images, which are searchable in this interface. Note that the search is not always reliable, and I don't know why this is the case. It's worth going to Full Text to see the warts-and-all OCR text behind the images, and to rely on your computer's own system of text search, e.g. Apple + F.
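If you save the Full Text version to your computer, a few lines of Python can do the same job as Apple + F while also showing each hit in context. This is only a rough sketch: the function name find_in_context and the sample text are my own inventions for illustration, and real OCR text will be messier (words split across lines, misread letters and so on).

```python
import re

def find_in_context(text, word, width=40):
    """Return every occurrence of `word` (ignoring case) with up to
    `width` characters of surrounding text on either side."""
    hits = []
    for match in re.finditer(re.escape(word), text, flags=re.IGNORECASE):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        # Collapse line breaks and repeated spaces so each hit prints on one line
        hits.append(" ".join(text[start:end].split()))
    return hits

# Invented placeholder text standing in for a saved Full Text file
sample = "First line of OCR text with mangurt in it.\nA second line, where Mangurt appears again."

for hit in find_in_context(sample, "mangurt", width=15):
    print(hit)
```

To use it on a real book, read the saved file into a string first (for instance with open(...).read()) and pass that in place of the sample.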

What to do if the search function provided by an online archive is dodgy
Online archives like Trove have very reliable in-built search functions. When you type in a word, you can be reasonably confident that it’s going to dredge up everything of relevance, provided your search term matches an existing transcription. The search function in Archive.org, on the other hand, is pretty dodgy. I’m probably doing it wrong, but I can only seem to search for the titles of books, not the text. Other sites, like the University of Michigan’s digital collection, appear to give you everything when they’re actually inadvertently holding certain results back.

Here’s how to get behind the scenes of an online archive and do searches on your own terms.

1. Go to your online archive. In this case I’m opening Archive.org.




2. Ignore the word ‘Search’ and the tantalising rectangular space to its right. Instead, put your cursor in the address bar and type “site:” in front of the address. It will now look like this:


Important: if you are using a browser other than Chrome, then don’t do this in the Address bar. Instead, use the search bar with the magnifying glass.



3. Don’t press return yet. Add the word or words you are interested in prior to the address. Eg, the word ‘mangurt’, like this:



This is how the same thing will look in Safari:



Now press return. Google’s own search function will do the work of searching for ‘mangurt’ within archive.org. So you will get results that look like this:



This is effectively a Google search with results limited to what’s found within Archive.org.

4. In this case I’m not happy with the way that Google has tried to guess what I really wanted to find. It came up with “man hurt” and other rubbish. So this time I will put my word in inverted commas thus:



Now I get very useful results:



This brings me to....

How to get around Google’s automatic spell check correction

Google has recently decided that if you spell something in a non-standard way, or search for a rare or obsolete word, it will offer you ‘improved’ alternatives. This is frustrating for lexicographers, but fortunately the fix is as simple as using double quotation marks, as if you were searching for an exact phrase. Thus a search for “mangurt” will only look for words spelled exactly that way. This works in conjunction with other Boolean notation.
Take the following search:

“mangurt” “marn grook” -mango -“yoghurt” location:au

This will return results with the exact spelling of mangurt as well as the exact phrase “marn grook”. It excludes all results with the word ‘mango’ and the exact spelling ‘yoghurt’ (while still allowing alternative spellings such as ‘yogurt’), and it only returns results from sites that are based in Australia.
Meanwhile this search:

“mangurt” “marn grook” -mango -“yoghurt” site:www.archive.org

...will do the same thing but limit results to pages that appear within archive.org.
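If you run a lot of searches like this, you can assemble them with a short Python script instead of typing the notation by hand. What follows is a sketch only: the function build_query and its parameter names are hypothetical, and I'm assuming Google's usual https://www.google.com/search?q=... address format. It also uses straight quotation marks rather than the curly ones shown above, since that is what you would normally type into a search box.

```python
from urllib.parse import urlencode

def build_query(exact=(), exclude=(), exclude_exact=(), site=None):
    """Assemble a search string from exact phrases, excluded words,
    excluded exact spellings and an optional site restriction."""
    parts = [f'"{term}"' for term in exact]            # "mangurt"
    parts += [f"-{term}" for term in exclude]          # -mango
    parts += [f'-"{term}"' for term in exclude_exact]  # -"yoghurt"
    if site:
        parts.append(f"site:{site}")                   # site:www.archive.org
    return " ".join(parts)

query = build_query(
    exact=["mangurt", "marn grook"],
    exclude=["mango"],
    exclude_exact=["yoghurt"],
    site="www.archive.org",
)
print(query)

# Turn the query into an address that can be pasted into a browser
print("https://www.google.com/search?" + urlencode({"q": query}))
```

Keeping the query-building separate from the address-building means you can paste the first printed line straight into a search box, or use the second line as a ready-made link.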


Wednesday, January 11, 2012

Is Munanga a placename?


This image relates to the discussion and comments on David Nash's post on Munanga. It was prepared by Kay Dancey, who searched the National Geospatial-Intelligence Agency (NGA, formerly NIMA) GEOnet Names Server for Indonesian placenames that resemble Munanga. If the Yolngu word munaŋa is from a placename, perhaps Munangge is a candidate. The trouble is that it's not directly on a trepang trading route. Two of the routes in the image below go around the east side of Flores and end up in the Kimberley. A Madurese route goes west but also ends up in the Kimberley.

The image is taken from Morwood, MA & Hobbs, DR 1997, ‘The Asian connection: preliminary report on Indonesian trepang sites on the Kimberley coast, NW Australia’, Archaeology in Oceania 32:197–206. Reproduced in Russell, Denise 2004, ‘Aboriginal–Makassan interactions in the eighteenth and nineteenth centuries in northern Australia and contemporary sea rights claims’, Australian Aboriginal Studies 2004/1.

Any comments? Please add them to the bottom of this post: http://www.paradisec.org.au/blog/2008/10/munanga/

Monday, December 05, 2011

Best of Anthropology on iTunes-U

These are my favourite anthropology podcasts on iTunes-U. I haven't included links to the store because they frequently don't work. Enjoy! I'll be updating this list. If you find anything else worth listening to, please drop me a line.

COURSES

Geography of world cultures
Martin W Lewis, Stanford
A tour through world cultures, languages and migration patterns.

Anthropology 114
Rosemary A Joyce, Berkeley
This course covers the history of anthropological thought.

Anthropology 1
Terrence W Deacon, Berkeley
From primate evolution to cultural evolution

INDIVIDUAL LECTURES

Measurement of Bodily Transformations (1 Feb 2010)
Stanley Ulijaszek, Oxford

Indigenous capitalism in Upland Indonesia (5 Feb 2010)
Tania Murray Li, Oxford

Neither Freud nor Artemidorus (27 April 2010)
Charles Stewart, Oxford

Interview with Evans-Pritchard Lecturer Dr Charles Stewart (13 May 2010)
Charles Stewart, Oxford

What is social anthropology? (27 Oct 2010)
Marcus Banks, Oxford


Wednesday, July 06, 2011

Found poetry: forgotten

This found poem comes from the Webster's thesaurus entry for 'forgotten'.

forgotten
adjective
Vivaldi's operas are largely forgotten UNREMEMBERED, out of mind,
past recollection,
beyond/past recall,
consigned to oblivion;
left behind;
neglected,
overlooked, ignored, disregarded,
unrecognized.


Monday, February 21, 2011

Top 10 online resources for Philippine studies

I have been meaning to post a list like this for a long time now. These are resources that I absolutely can't manage without:

1. University of Michigan: The United States and its Territories
A treasure trove of searchable documents from the American period, and earlier. Note that the search algorithm is unreliable, not to mention the OCR, so you need to take a guerilla approach. The site includes all of Blair & Robertson and almost all of the reports of the Philippine Commission. Indispensable for historians.

Ok, this is kinda obvious but the fact remains that there are Philippine titles here that simply can't be found or accessed elsewhere. A few gems that I've discovered: Buzeta and Bravo's Diccionario Geográfico, Estadístico, Histórico de las Islas Filipinas, Vol I and Vol II, and rare works by Alonso de Méntrida and Juan Felix de la Encarnación. If you discover anything else that's cool, please let me know!

It's really not the most advanced dictionary but it's quick and easy. Best of all, it searches Cebuano-Visayan, Tagalog and Hiligaynon-Visayan simultaneously and arranges the output in columns.

A metasearcher that covers the major online library catalogues of the Philippines, including the National Library of the Philippines and UP.

This is another catalogue metasearcher which includes more Philippine libraries than eLib but you can only search one library at a time.

A small but well-managed library in Manila specialising in Spanish materials and not discoverable through metasearchers. If the Instituto Cervantes keeps it on their shelves, it's worth reading.

If you can't find it at Michigan (above), try the National Archives. They will bust a gut looking for what you need and will send you a report of their efforts in the mail for free.

Holds the biggest depository of Philippine materials outside the National Library of the Philippines. They even have a mimeographed copy of the H Otley Beyer collection, destroyed in the US bombing of Manila.

An excellent document-sharing site and a great way to disseminate research. I've started a collection for Bohol studies here and another for Eskaya studies here.

Lists of resources on languages of the Philippines with some wonderful maps. This is really for linguists and anthropologists, but it gives a good overall sense of the cultural diversity of the Philippines.

Biggest Time-Wasters

1. Augustinian Recollects
Apparently nobody told the Recollects that they no longer rule the islands. Believe it or not, these guys are sitting on one of the most important collections of unpublished digitised manuscripts covering the administration of Augustinian districts right up to the Revolution and after. Unfortunately you can't find them online, so you need to go here:
81 Alondras St
Mira-nila Homes Tandang Sora, Quezon City UP P.O. Box 206 1101.
Tel. +6329512861

That's inside Mira-nila Subdivision which is inside Tierra Pura subdivision. Not far from UP, and just beyond Quezon Memorial Circle. Ask for Fr Emil Quilatan. You'll have to make an appointment and you'll be kicked out after a couple of hours. Documents can be printed on request for a small fee and your bag will be officiously inspected on the way out.

Another archive with amazing material including fiction in Cebuano from the early 20th century. It's not exactly open to the public and the librarians are suspicious of visitors. And God help you if you want a copy of anything. Here's hoping the proposed Boholano Studies Center at HNU fares better.

An ambitious site that aims to gather "the most important and significant documents and artifacts on Philippine Studies and makes them available on the Internet for free". It's nowhere near achieving that goal yet and is cluttered with useless external links and summaries of texts rather than the real deal. Worth checking on from time to time, just to see what's new.


Monday, January 24, 2011

Best of Linguistics on iTunes-U

These are my favourite linguistics lectures on iTunes-U. They all happen to have been broadcast by the University of Arizona and the full list is here. Let me know if you find anything else that's worth listening to in iTunes U.


Exploring ERP Effects of Metaphor via Crowdsourcing
Vicky Lai, MPI for Psycholinguistics, Nijmegen, and Steve Bethard, University of Leuven

A Close Look at Writing Systems, or, Four Tongues Better than One
William Watt, University of California Irvine

The Comprehension Hypothesis Extended
Stephen Krashen, University of California Los Angeles

All in the family? Evaluating the role of kin selection in language evolution
Maggie Tallerman, Newcastle University

Some unresolved issues in language endangerment and revitalization
Lindsay J. Whaley, Dartmouth College

Anthropological Models and Historical Linguistics: The Story of Uto-Aztecan
Jane H. Hill, University of Arizona - 2008 ANLI Symposium

Linguistics, Anthropology, the Media, and Arabic Diglossia
Keith Walters, Portland State University

Modeling in Historical Linguistics: Trisecting Computational Methods, Speech Communities, and Language Change
Claire Bowern, Rice University

Is it any way you might could tell me how come I am not a English speaker?
Rusty Barrett, University of Kentucky

The absolute best of these are Krashen's 'The Comprehension Hypothesis Extended', Walters's 'Linguistics, Anthropology, the Media, and Arabic Diglossia' and Barrett's 'Is it any way you might could tell me how come I am not a English speaker?'

Friday, July 31, 2009

Bloggy blog

My graduate students have a blog which is part of their assessment.
Look it up, comment on it. Go on, do it. Do it now.