MLZ site translator: CiNii

CiNii banner

The CiNii aggregator service, run by Japan's National Institute of Informatics, collects most academic articles published in Japan, in both English and Japanese. [1] Access is by a simple search interface, shown to the right: it just works, and no login is required. The CiNii portal is an important resource both for Japanese researchers and for those to whom Japan itself is a topic of research.

This translator is actually the oldest component MLZ has to offer. Development of Multilingual Zotero itself was originally inspired by the Second Annual CiNii API Contest, held in 2010. At that time (as today) CiNii was one of the few aggregators leveraging RDF (a rich, machine-friendly descriptive syntax) to offer bilingual metadata on its holdings. A few of us judged that a variant of Zotero capable of handling multilingual field variants could be cast in time for the deadline, and we did make it in the door with a proof-of-concept implementation. The judges were less than totally impressed—Multilingual Zotero rated only an Honorable Mention (佳作)—but we’re pretty happy with the tool that has resulted from subsequent development, and that’s what counts.

The translator recognises both search listings and individual entries. A search listing with an active translator selection dialog is shown to the right.

It bears mention that there is some unevenness in the CiNii metadata: the maintainers presumably depend on the institutions that funnel data to the service for quality control, and the degree of care does vary, particularly with respect to English content.

Things to watch out for in fetched items include: Japanese-language items without accompanying English metadata; English titles incorrectly appended to the headline (Japanese) field; and unpredictable ordering of transliterated names (with the family name set at random in first or last position). In the CiNii data itself, first and last name are delimited by spaces, not commas, which makes the parsing of names with particles (“van”, “von”, “de”, etc.) an uncertain business. Finally, transliterated Japanese names are tagged with en (the English language). If you are fussy about language tagging, you will want to change these to something more precise such as ja-alalc97 (romanised Japanese).
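The two name-related caveats above can be illustrated with a rough sketch. This is not the translator's own code: the particle list, the function names and the dictionary of language-tagged variants are all assumptions made for the example. The first function returns both readings of a space-delimited name rather than guessing which position holds the family name; the second moves a variant from one language tag to another.

```python
from __future__ import annotations

# Hypothetical particle list, for illustration only; real data will need more.
PARTICLES = {"van", "von", "de", "der", "den", "la", "le", "di", "da"}


def candidate_splits(name: str) -> list[tuple[str, str]]:
    """Return (family, given) readings of a space-delimited name.

    The first reading assumes the family name comes last, the second
    assumes it comes first. Particles are attached to the family name
    in both readings. The caller must still choose between them.
    """
    tokens = name.split()
    if len(tokens) < 2:
        return [(name.strip(), "")]

    # Reading 1: family name last; pull preceding particles into it.
    i = len(tokens) - 1
    while i > 1 and tokens[i - 1].lower() in PARTICLES:
        i -= 1
    family_last = (" ".join(tokens[i:]), " ".join(tokens[:i]))

    # Reading 2: family name first; leading particles belong to it.
    j = 1
    while j < len(tokens) - 1 and tokens[j - 1].lower() in PARTICLES:
        j += 1
    family_first = (" ".join(tokens[:j]), " ".join(tokens[j:]))

    return [family_last, family_first]


def retag(variants: dict[str, str], old: str = "en",
          new: str = "ja-alalc97") -> dict[str, str]:
    """Return a copy of a {language tag: value} mapping with `old` renamed to `new`.

    The mapping is a stand-in for this example, not the storage format
    used by MLZ. An existing `new` entry would be overwritten.
    """
    return {(new if tag == old else tag): value for tag, value in variants.items()}


if __name__ == "__main__":
    print(candidate_splits("Yamada Taro"))
    print(candidate_splits("Ludwig van Beethoven"))
    print(retag({"ja": "山田 太郎", "en": "Yamada Taro"}))
```

Returning both readings keeps the uncertainty visible: nothing in the string itself says whether "Yamada Taro" is family-first or family-last, so the final choice is best left to a human check against the Japanese field.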

All that said, CiNii deserves a huge amount of credit here: it would be a great thing if other multilingual dissemination projects were to follow suit and leverage the potential of RDF data modelling in a similar way. Enjoy this one—it’s why we’re here.

[1] The CiNii site also offers a search service for books. Although the user interface design is the same, this area of the site is a front end to third-party archives, and does not offer the rich RDF data of the articles repository. It is not currently covered by this site translator.