Big textual data: how to find relevant information (with low cost)?
Publication type
Original research
Authors

ABSTRACT Accumulating large amounts of information is of little use if we cannot find a particular piece of it. Extracting relevant, targeted information from textual data on large digital media, even when the data are heterogeneous and multilingual, is certainly not a new problem. However, current methods prove expensive, and their results are too often inappropriate, too numerous, and poorly presented to the user. To complement current methods, we propose an original method, Contextual Exploration, implemented in the EC3 software. EC3 needs neither syntactic analysis, nor statistical analysis, nor a "general" ontology. It uses only small ontologies, called "linguistic ontologies", that express the language of knowledge. This is why EC3 works very quickly on large corpora whose components can be both short and full-length texts, from SMS messages to books. As output, EC3 offers a dynamic visual representation of the results. EC3 has been tested on very large digitized corpora provided by the French Labex OBVIL ("Observatory of the Literary Life"), in partnership with the National Library of France.

Journal
Title
In Proceedings of the 10th International Conference on Management of Digital EcoSystems (MEDES ’18), September 25–28, 2018, Tokyo
Publisher
ACM, New York, NY, USA
Publisher country
France
Publication type
Both (Printed and Online)
Volume
1
Year
2018
Pages
6