Guidelines for a standardized data format for use in cross

阿新 • Published: 2018-12-28

There is an increasing number of linguistic databases worldwide, raising the possibility of a vast network for potential comparative studies. However, these databases are generally created independently of each other, and often have a unique and narrow focus. As a result, the formats used for encoding the data often differ, which makes it genuinely difficult to compare data across databases.

In an effort to resolve these issues, the Cross-Linguistic Data Formats Initiative (CLDF) was created. In a paper published in Scientific Data, the CLDF sets out proposed guidelines for a standardized format for linguistic databases, and also supplies a software package, a basic ontology and usage examples of best practices. The goal of this effort is to facilitate sharing and re-use of data in comparative linguistics.

Standardizing data formats to facilitate sharing and reuse

The CLDF provides a data model underlying its recommendations that aims to be simple, yet expressive, and is based on the data model previously developed for the Cross-Linguistic Data project. This model has four main entities: (a) Languages; (b) Parameters; (c) Values; and (d) Sources. In the model, each Value is related to a Parameter and a Language, and can be based on multiple Sources. There are additionally References for Sources, and References can also have Contexts (which, for example, for printed references would be page numbers).
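As a rough illustration of how these entities relate, the model can be sketched with plain Python dataclasses. This is only a sketch under stated assumptions: the class and field names, the `describe` helper and the sample data are all hypothetical and are not taken from the CLDF specification or its software package.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Language:
    id: str
    name: str


@dataclass(frozen=True)
class Parameter:
    id: str
    name: str


@dataclass(frozen=True)
class Source:
    id: str
    citation: str


@dataclass(frozen=True)
class Reference:
    """Links a Value to a Source, optionally with a Context such as page numbers."""
    source: Source
    context: str = ""


@dataclass
class Value:
    """One data point: a Parameter measured for a Language, backed by References."""
    id: str
    language: Language
    parameter: Parameter
    value: str
    references: list = field(default_factory=list)


def describe(v: Value) -> str:
    """Render a Value with its sources; the format is purely illustrative."""
    refs = "; ".join(
        f"{r.source.id}[{r.context}]" if r.context else r.source.id
        for r in v.references
    )
    return f"{v.language.name} / {v.parameter.name} = {v.value} ({refs})"


# Hypothetical sample data; the source entry is a placeholder, not a real citation.
german = Language(id="deu", name="German")
hand = Parameter(id="hand", name="hand")
src = Source(id="src1", citation="Placeholder citation")
example = Value(
    id="1",
    language=german,
    parameter=hand,
    value="Hand",
    references=[Reference(source=src, context="12-13")],
)
print(describe(example))
```

Keeping the Reference separate from the Source mirrors the description above: the same Source can back many Values, each with its own Context.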

The CLDF data model is a package format, in which a dataset would be made up of a set of data files containing tables, and a descriptive file that defines the relationships between the tables. Each linguistic data type would have a CLDF module and additional components, which would be the aspects of the data in the module that recur across multiple data types. The CLDF modules would also contain terms from the CLDF ontology. The ontology is a list of vocabulary that represents objects and properties with well-known semantics in comparative linguistics. This makes it possible for users to reference these terms in a uniform way.
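The package idea can be sketched in a few lines: a directory of CSV tables plus one descriptive JSON file that records how the tables relate. The layout below is a deliberately simplified stand-in and an assumption on our part; the file names, column names and JSON keys are hypothetical and do not reproduce the actual CLDF metadata schema.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_package(root: Path) -> Path:
    """Write two tables and a descriptive file; return the descriptive file's path."""
    tables = {
        "languages.csv": [["ID", "Name"], ["deu", "German"], ["eng", "English"]],
        "values.csv": [
            ["ID", "Language_ID", "Parameter_ID", "Value"],
            ["1", "deu", "hand", "Hand"],
            ["2", "eng", "hand", "hand"],
        ],
    }
    for name, rows in tables.items():
        with open(root / name, "w", newline="", encoding="utf-8") as f:
            csv.writer(f).writerows(rows)
    # Hypothetical, simplified description of the tables and their relationship.
    description = {
        "tables": list(tables),
        "relations": [
            {
                "table": "values.csv",
                "column": "Language_ID",
                "references": "languages.csv",
                "key": "ID",
            }
        ],
    }
    path = root / "metadata.json"
    path.write_text(json.dumps(description, indent=2), encoding="utf-8")
    return path


def read_table(path: Path) -> list:
    """Read a CSV table into a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def dangling_references(metadata_path: Path) -> list:
    """Return (table, row ID, missing key) triples for relations that do not resolve."""
    description = json.loads(metadata_path.read_text(encoding="utf-8"))
    root = metadata_path.parent
    problems = []
    for rel in description["relations"]:
        known = {row[rel["key"]] for row in read_table(root / rel["references"])}
        for row in read_table(root / rel["table"]):
            if row[rel["column"]] not in known:
                problems.append((rel["table"], row["ID"], row[rel["column"]]))
    return problems


with tempfile.TemporaryDirectory() as tmp:
    metadata = write_package(Path(tmp))
    problems = dangling_references(metadata)
    print(problems)
```

Because the relationships live in the descriptive file rather than in code, a generic checker like `dangling_references` can validate any dataset laid out this way without knowing its contents in advance.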

A software package to enable validation and manipulation

The CLDF specifications use common file formats -- such as CSV, JSON and BibTeX -- that are widely supported, with the goal that these files can easily be read and written on many platforms. Even more importantly, the standardized format will allow researchers without programming skills to access and manipulate the data with preexisting tools, rather than this ability being limited to researchers with sufficient programming skills to create their own tools. To facilitate this, the CLDF has created a "cookbook" repository for scripts for use with the CLDF specifications.

"We want to bring access to these data and the ability to compare them to as many researchers as possible," states Johann-Mattis List of the Max Planck Institute for the Science of Human History. Robert Forkel, one of the driving forces behind the CLDF initiative, also notes that the CLDF format is not limited to linguistic data alone, but can also incorporate databases of cultural and geographic data, for example. "CLDF may drastically facilitate the testing of questions regarding the interaction between linguistic, cultural, and environmental factors in linguistic and cultural evolution."
