OLAC Record: Two Main Contents in a Syllabus for Language Documentation: the Learning Data Models and an Assignation of Data Conversion

OLAC Record
oai:scholarspace.manoa.hawaii.edu:10125/42045

Metadata

Title: Two Main Contents in a Syllabus for Language Documentation: the Learning Data Models and an Assignation of Data Conversion

Bibliographic Citation: Ohya, Kazushi; 2017-03-02; Kaipuleohone University of Hawai'i Digital Language Archive; http://hdl.handle.net/10125/42045.

Contributor (speaker): Ohya, Kazushi

Creator: Ohya, Kazushi

Date (W3CDTF): 2017-03-02

Description: Previously, in Author (2015a) at ICLDC4, we proposed a data format for time information and a data model for interlinear data; a strategy for language documentation along the same lines was proposed in Author (2016) at LREC 2016. Based on these experiments, we identify four fields of activity in language documentation: (a) learning to use tools, including devices and software; (b) learning the data models used in programming, grounded in computer science; (c) converting language data from one format to another when moving to the next phase of data handling; and (d) implementing software for data management systems such as ELAN, FLEx, SQL engines, and web systems. The components actually selected vary with a project's scale, duration, and members. However, if we suppose that the participants in language documentation are simply linguists and computer scientists and/or engineers, we obtain a clear picture of its structure: linguists are mainly engaged in (a), (b), and sometimes (c), and computer scientists and/or engineers in (c) and (d). Linguists seem to be aware of the danger of depending on a single application, but few give it up. As the history of computer science and archival studies has shown, there is no future in application dependence. Field (a) is therefore not appropriate as a field of language documentation. For linguists, learning data models is the only way to ensure lifelong operation of their language data. However, there are not enough guidelines or textbooks on data models for linguists. In the poster we propose a syllabus to fill that gap. For example, our syllabus covers two data models, a simple tree structure adopted in many markup languages and an interlinear data structure adopted in record-based data or in XML data formats such as those of ELAN and FLEx, together with their background theory and the practical ways of manipulating them.
In terms of project architecture, (c) is the most problematic point because a theory of data conversion is missing: there is no format for defining the pattern of a final description or of converted data. In nine years of our projects, we have not found a good solution to this problem. In our syllabus, linguists learn the ideas of systems science and a way to divide roles with computer scientists and/or engineers.
References: Durand, J. et al., eds. 2014. The Oxford Handbook of Corpus Phonology. Oxford University Press; McCawley, J. D. 1981. Everything that Linguists Have Always Wanted to Know about Logic but Were Ashamed to Ask. The University of Chicago Press; [Author 2015a]; [Author 2015b]; [Author 2016]; Thieberger, N., ed. 2012. The Oxford Handbook of Linguistic Fieldwork. Oxford University Press.

Identifier (URI): http://hdl.handle.net/10125/42045

Table Of Contents: 42045-a.pdf

42045-b.pdf

Type (DCMI): Text

OLAC Info

Archive: Language Documentation and Conservation

Description: http://www.language-archives.org/archive/ldc.scholarspace.manoa.hawaii.edu

GetRecord: OAI-PMH request for OLAC format

GetRecord: Pre-generated XML file

OAI Info

OaiIdentifier: oai:scholarspace.manoa.hawaii.edu:10125/42045

DateStamp: 2024-08-25

GetRecord: OAI-PMH request for simple DC format

Search Info
Citation: Ohya, Kazushi. 2017. Language Documentation and Conservation.
Terms: dcmi_Text

http://www.language-archives.org/item.php/oai:scholarspace.manoa.hawaii.edu:10125/42045
Up-to-date as of: Thu Sep 25 0:32:18 EDT 2025
