Proceedings of the conference on Language Technologies & Digital Humanities

Authors

Darja Fišer (ed)
Faculty of Arts, University of Ljubljana; Jožef Stefan Institute
Andrej Pančur (ed)
Institute of Contemporary History

Keywords:

Language Technologies, Digital Humanities

Synopsis

This year's conference marks the 20th anniversary of the first »Language technologies« conference, which took place in 1998 in Cankarjev dom, Ljubljana, and was organized by Tomaž Erjavec, Vojko Gorjanc, Jerneja Žganec Gros and Anica Rant. The topics of the first conference were the development and application of language technologies for Slovene and directions for the future. 26 papers were presented, dealing with speech technologies and phonology, computer-assisted translation and teaching, corpora, encoding standards for language data and searching for information on the internet. A round table discussion held after the conference led directly to the establishment of the Slovenian Language Technologies Society, which has since been the main initiator and organizer of all subsequent editions of the conference. Together with the Centre for Language Resources and Technologies of the University of Ljubljana (CJVT), the Faculty of Electrical Engineering of the University of Ljubljana and the research infrastructures CLARIN.SI and DARIAH-SI, the Society is also organizing this year's conference, held on 20-21 September 2018 at the Faculty of Electrical Engineering. In this 11th edition, following the successful expansion of the conference programme to Digital Humanities in 2016, we have retained the focus on integrating the two disciplines while aiming to position the conference as an important meeting point for researchers in the region.

This year, 47 papers will be presented: 2 talks by invited lecturers, 36 regular full papers, 5 abstracts and 4 student papers. Each paper was reviewed by 3 reviewers. 21 papers were submitted in Slovene and 26 in English. The accepted papers have 92 authors in total. Over half of them are Slovene, 10% are from Croatia and the rest come from as many as 19 different countries. This is why the conference programme was designed so that the first day is international, with talks in English, while talks on the second day will be held in Slovene. Unlike the previous edition of the conference, we have opted for a single-track programme so that all participants can attend all talks, with the aim of fostering closer collaboration among researchers in language technologies and digital humanities. In addition, we have introduced a poster session with 9 posters.

The editors would like to thank everyone who has contributed to the success of this conference, especially the invited lecturers and the authors of the papers for co-creating an inspiring conference programme, the Programme Committee for their dedicated reviews, the Organizing Committee for all their organizational efforts, the Session Chairs for their smooth and efficient management of the conference programme, the Technical Editors for preparing the online proceedings and our loyal sponsors for their selfless support of our activities.

Chapters

  • Preface
  • Too good to be true
    Current approaches to author profiling
    Malvina Nissim
  • Bringing Digital Humanities to the wider public
    libraries as incubator for DH research results
    Martijn Kleppe
  • A Comparison of Statistical and Neural Machine Translation for Slovene, Serbian and Croatian
    Mihael Arčan
  • SETimes.SR – A Reference Training Corpus of Serbian
    Vuk Batanović, Nikola Ljubešić, Tanja Samardžić
  • Artistic Visualizations and Beyond
    A Study of Materializations of a Digital Database
    Narvika Bovcon, Aleš Vaupotič
  • Opus-MontenegrinSubs 1.0
    First electronic corpus of the Montenegrin language
    Nikola Ljubešić, Petar Božović, Tomaž Erjavec, Jörg Tiedemann, Vojko Gorjanc
  • Zapis in prikaz starejših pesniških besedil ter njihovih variant v TEI
    Tomaž Erjavec, Nina Ditmajer, Matija Ogrin
  • Zakaj ne z eno poizvedbo hkrati po različnih korpusih?
    Troje korpusnih preverb pod primerjalnim drobnogledom
    Helena Dobrovoljc, Urška Vranjek Ošlak
  • Frekvenčni seznami n-gramov v korpusih slovenskega jezika
    Kaja Dobrovoljc
  • Razvoj smernic za predajo in arhiviranje kvalitativnih podatkov v Arhivu družboslovnih podatkov
    Maja Dolinar, Janez Štebe, Sonja Bezjak
  • Prehod iz statističnega strojnega prevajanja na prevajanje z nevronskimi omrežji za jezikovni par slovenščina-angleščina
    Gregor Donaj, Mirjam Sepesy Maučec
  • Analiza tvitov slovenskih korporativnih uporabnikov
    Darja Fišer, Monika Kalin Golob
  • Citiranje jezikoslovnih podatkov v slovenskih znanstvenih objavah: stanje in priporočila
    Darja Fišer, Tomaž Erjavec, Jakob Lenardič
  • Glagolske večbesedne enote v učnem korpusu ssj500k 2.1
    Polona Gantar, Špela Arhar Holdt, Jaka Čibej, Taja Kuzman, Teja Kavčič
  • Towards Semantic Role Labeling in Slovene and Croatian
    Nikola Ljubešić, Polona Gantar, Kristina Štrkalj Despot, Simon Krek
  • Zbirka primerov rabe vejice Vejica 1.3
    Peter Holozan
  • Croatian Web Dictionary Mrežnik
    One year later - What is different?
    Lana Hudeček, Milica Mihaljević
  • Portuguese Corpora of the 18th century
    Old Medicine texts for teaching and research activities
    Maria José Bocorny Finatto, Paulo Quaresma, Maria Filomena Gonçalves
  • Interaktivna karta slovenskih narečnih besedil
    Alenka Kavčič, Ivan Lovrić, Vera Smole
  • Učinkovit izračun frekvenčnih statistik za slovenske jezikovne korpuse
    Simon Krek, Aleksander Ključevšek, Marko Robnik-Šikonja
  • Kolokacijski slovar sodobne slovenščine
    Aniko Kovač, Maja Marković
  • A Rule-Based Syllabifier for Serbian
    Aniko Kovač, Maja Marković
  • Debating Evil
    Using Word Embeddings to Analyze Parliamentary Debates on War Criminals in The Netherlands
    Milan M. van Lange, Ralf D. Futselaar
  • hr500k – A Reference Training Corpus of Croatian
    Vuk Batanović, Nikola Ljubešić, Tomaž Erjavec, Željko Agić, Filip Klubička
  • The Parlameter corpus of contemporary Slovene parliamentary proceedings
    Darja Fišer, Nikola Ljubešić, Tomaž Erjavec, Filip Dobranič
  • KAS-term and KAS-biterm
    Datasets and baselines for monolingual and bilingual terminology extraction from academic writing
    Darja Fišer, Nikola Ljubešić, Tomaž Erjavec
  • Strokovno-znanstvena slovenščina: besednovrstne in oblikoskladenjske značilnosti
    Tomaž Erjavec, Nataša Logar
  • Word Selection in the Slovenian Sentence Matrix Test for Speech Audiometry
    Tatjana Marvin, Jure Derganc, Samo Beguš, Saba Battelino
  • Korpusna analiza nestandardne stave vejice po uvajalnih prislovnih zvezah
    Darja Fišer, Vojko Gorjanc, Eneja Osrajnik
  • Trajnost digitalnih izdaj
    Uporaba statičnih spletnih strani na portalu Zgodovina Slovenije - SIstory
    Andrej Pančur
  • Spregledana kulturna dediščina in uporaba digitalne raziskovalne infrastrukture za humanistiko v raziskavi Odlivanje smrti
    Andrej Pančur, Alenka Pirman, Maruša Kocjančič
  • Analiza slovničnih napak v korpusu spisov učencev japonščine na osnovni ravni
    Miha Pavlovič, Rena Ito
  • Building a corpus of the Croatian parliamentary debates using UDPipe open source NLP tools and Neo4j graph database for creation of social ontology model, text classification and extraction of semantic information
    Benedikt Perak, Filip Rodik
  • Samopromocija na Instagramu
    Primer predsednikovega profila
    Dan Podjed, Ajda Pretnar
  • Data Mining Workspace Sensors
    A New Approach to Anthropology
    Dan Podjed, Ajda Pretnar
  • Crowdsourcing terminology: harnessing the potential of translator’s glossaries
    Ivanka Rajh, Siniša Runjaić
  • Evaluation of Statistical Readability Measures on Slovene texts
    Špela Arhar Holdt, Simon Krek, Marko Robnik-Šikonja, Tadej Škvorc, Senja Pollak
  • Exploring Finno-Ugric linguistics through solving IT problems
    Tobias Weber, Jeremy Bradley
  • Teaching women writers with NEWW Virtual Research Environment
    Narvika Bovcon, Katja Mihurko Poniž, Marie Nedregotten Sørbø, Viola Parente-Čapková, Amelia Sanz, Suzan van Dijk, Aleš Vaupotič
  • Odnosi do jezika v slovenski, hrvaški in srbski računalniško posredovani komunikaciji
    Darja Fišer, Damjan Popič
  • Online database in Research of Correspondence of Franjo Ksaver Kuhač (1834-1911)
    Sara Ries
  • Distant Reading for European Literary History. A COST Action
    Katja Mihurko Poniž, Christof Schöch, Maciej Eder, Carolin Odebrecht, Mike Kestemont, Antonija Primorac, Justin Tonra, Catherine Kanellopoulou
  • Korpus in baza Gos Videolectures
    Darinka Verdonik
  • Korpus tvitov slovenskih politikov Janes TwePo
    Urška Bratoš
  • You, thou and thee
    A statistical analysis of Shakespeare’s use of pronominal address terms
    Isolde van Dorst
  • Primerjava luščilnikov terminologije Sketch Engine in CollTerm za znanstvena besedila
    Klara Eva Kukovičič
  • K-means Clustering for POS Tagger Improvement
    Gabi Rolih

Published

July 2, 2018

Details about this monograph

ISBN-13

978-961-06-0111-1

Date of first publication

2018-10-01

How to Cite

Fišer, D., & Pančur, A. (Eds.). (2018). Proceedings of the conference on Language Technologies & Digital Humanities. University of Ljubljana Press. https://doi.org/10.4312/9789610601111