AILLA's Collections

The archive is organized hierarchically: collections contain resources, and resources contain media files.



The top level is the collection: a set of resources produced by one person or team. Most of our collections result from a specific project (Chatino Language Documentation Project Collection) or from the work of a single researcher (Joel Sherzer Kuna Collection). A collection may consist of a single resource, like the Mac Chapin Kuna Collection, which contains a digital reproduction of a vinyl record produced by Dr. Chapin in 1970. Other collections contain hundreds of resources (Oscar Aguilera Chilean Languages Collection). Some collections are complete, because the project has ended or the researcher has died (Arthur Sorensen Amazonian Languages Collection). Others are still growing as the depositors add new materials.


Collections are organized by language, rather than by some other criterion such as country or date, because language is our primary focus. What constitutes a language is not always a simple question, so we generally rely on our depositors to guide us in this determination.


A resource is a set of media files related in terms of their intellectual content. Thus, a resource may contain one file or hundreds of files, depending on the nature of the content. Some examples:

  • a recording of a narrative performance made in both audio and video, with an ELAN file containing the transcription and translation (5 files: wav, mp3, mpg, mp4, eaf);
  • a 100-page notebook containing field notes from a single field trip (101 files: 100 tiff, 1 pdf);
  • a scanned article that was published in a journal (1 file: pdf).

Resource identifiers

Each AILLA resource is given a unique identifier. AILLA IDs are structured in a way that supports administrative purposes and makes it easy to see which files belong to a given resource. Many of our depositors also find this syntax helpful. Here's how it works:

Resource ID = CUK001R001
CUK   ISO 3-letter code for the Kuna language
CUK001   first deposit for this language
CUK001R001   first resource in this deposit

CUK001R002 is a bundle that contains many media files. The Myth of the White Prophet is a very long chant, recorded across several tapes, that Joel Sherzer has studied for many years, so the bundle includes different versions of transcriptions and translations as well as notes.

Item IDs (the file names) are numbered in order. So, CUK001R002I001.mp3 is the first part of the recording, CUK001R002I002.mp3 is the second part, and so on. CUK001R002I001.pdf is the oldest transcription and CUK001R002I801.pdf is the most recent; the others were made in between. Item numbers may be organized in batches, e.g., by hundreds, to make it easier to group materials within the resource bundle.
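
For depositors who handle these file names in scripts, the ID syntax described above can be taken apart mechanically. The following Python sketch is only an illustration and not an AILLA tool: the pattern is inferred from the examples on this page (three uppercase letters, a three-digit deposit number, R plus a three-digit resource number, and optionally I plus a three-digit item number and a file extension), and the names AILLA_ID, AillaId, and parse_ailla_id are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from the examples on this page rather than
# from an official grammar: LLL + DDD + "R" + DDD, optionally followed
# by "I" + DDD + "." + file extension.
AILLA_ID = re.compile(
    r"^(?P<lang>[A-Z]{3})(?P<deposit>\d{3})R(?P<resource>\d{3})"
    r"(?:I(?P<item>\d{3})\.(?P<ext>[A-Za-z0-9]+))?$"
)


class AillaId(NamedTuple):
    """Parts of an AILLA-style identifier (hypothetical helper type)."""
    language: str              # ISO 3-letter language code, e.g. "CUK"
    deposit: int               # deposit number for that language
    resource: int              # resource number within the deposit
    item: Optional[int]        # item number, None for a bare resource ID
    extension: Optional[str]   # file extension, None for a bare resource ID

    @property
    def resource_id(self) -> str:
        """Rebuild the resource ID that the item belongs to."""
        return f"{self.language}{self.deposit:03d}R{self.resource:03d}"


def parse_ailla_id(text: str) -> AillaId:
    """Split a resource ID or item file name into its parts."""
    match = AILLA_ID.match(text)
    if match is None:
        raise ValueError(f"not a recognized AILLA-style ID: {text!r}")
    item = match.group("item")
    return AillaId(
        language=match.group("lang"),
        deposit=int(match.group("deposit")),
        resource=int(match.group("resource")),
        item=int(item) if item is not None else None,
        extension=match.group("ext"),
    )


# Usage: put the files of one resource bundle in item order.
files = ["CUK001R002I002.mp3", "CUK001R002I801.pdf", "CUK001R002I001.mp3"]
ordered = sorted(files, key=lambda name: parse_ailla_id(name).item)
# ordered == ["CUK001R002I001.mp3", "CUK001R002I002.mp3", "CUK001R002I801.pdf"]
```

Because the item number is parsed as an integer, batches such as the hundreds mentioned above can be selected with a simple range check on the item field.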


AILLA is a joint effort of the LLILAS Benson Latin American Studies and Collections, the Department of Linguistics, and the Digital Library Services Division of the University Libraries at the University of Texas at Austin.
AILLA is also grateful for support from the National Endowment for the Humanities and the National Science Foundation.
