PRETO: A High-performance Text Mining Tool for Preprocessing Turkish Texts

By Volkan TUNALI, June 26, 2012 1:59 am

For my text mining research, I often need to preprocess document collections of varying sizes. Besides English texts, I also work with Turkish texts, which require language-specific preprocessing options.

To meet these needs, I have developed a text mining tool for preprocessing texts in Turkish as well as English. I call this tool PRETO. It is now available as an open source project on Google Code under the GNU GPL v3 license. You can freely download and use it. The project address is http://code.google.com/p/preto/

If you use this tool for academic research, please cite it as follows:

Volkan Tunalı, Turgay Tugay Bilgin, “PRETO: A High-performance Text Mining Tool for Preprocessing Turkish Texts”, International Conference on Computer Systems and Technologies (CompSysTech), Ruse, Bulgaria, June 22-23, 2012, 134-140.

You can access the paper via ACM Digital Library.
