A production of Ipogia Skini
Help us create an artificial intelligence that understands Cypriot Greek
"ΑΠΟαποικιοΠΟΙΗΣΗ" is concerned with the publication of a poetry collection through the collaboration of 10 Greek Cypriot poets/prose writers and the artificial intelligence Natural Language processing (NLP) model, GPT-2. According to its creators, the company Open AI, GPT-2 has the ability to create new, synthetic text, predicting the next word, based on all previous words within a text. The poems that will be developed for publication will be in Cypriot Greek and will focus on the question "What defines a human being".
To achieve that, we first have to train GPT-2 on Cypriot Greek using a Training Dataset. Machine learning models such as GPT-2 learn by analysing patterns in the large amounts of data that we feed into them. In this case, the data we need to provide is text written in Cypriot Greek. We have been building a Cypriot Greek Dataset by writing many texts in Cypriot Greek in a homogeneous way. Now, with the help of an autocorrector, we are ready to open the Dataset-creation process to the wider population in order to achieve a more diverse and more inclusive corpus. As artificial intelligence increasingly becomes part of our everyday lives, it is important for us to create a Dataset that is as inclusive as possible, so that GPT-2 represents the diverse population of our island.
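As a rough illustration of what "learning patterns from text" means, here is a toy next-word predictor that simply counts which word most often follows another. It is only a sketch: GPT-2 is a large neural network that takes all previous words into account, not just the last one, and this is not the project's code. The function names and the English sample sentences are hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(texts):
    """Count, for every word, which words follow it in the training texts."""
    follow_counts = defaultdict(Counter)
    for text in texts:
        words = text.lower().split()
        # Pair each word with the word that comes right after it.
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next_word(model, word):
    """Return the word seen most often after `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical miniature "dataset"; a real corpus would hold many texts.
sample_texts = [
    "the sea is calm today",
    "the sea is blue",
    "the sky is blue",
]

model = train_bigram_model(sample_texts)
print(predict_next_word(model, "sea"))  # is
print(predict_next_word(model, "is"))   # blue
```

Even in this tiny example, the predictions depend entirely on the texts supplied, which is why the variety of the training texts matters.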
We would therefore like to ask for your help in building an ethical and inclusive Cypriot Greek Training Dataset! By clicking on the link below, you can submit your own text to our Dataset. Please follow the instructions and make sure you use the autocorrector, so that our Dataset’s text is as homogeneous as possible and GPT-2 can learn Cypriot Greek well.
To create an inclusive Dataset that reflects the diversity of our island
To adhere to high ethical standards of dataset creation by respecting copyright when building the Dataset
To provide an open-source Dataset in Cypriot Greek after the completion of our project
To contribute to the spread of a more systematic, homogeneous way of spelling Cypriot Greek
To contribute to the decolonisation of Cyprus
Through the presentation of the final outcome of the current project, the publication of the poetry collection, we aim to open a dialogue about the role and influence of artificial intelligence in our lives.