The rapid growth of crime on the Internet has prompted law enforcement agencies to monitor online activities, which involve huge volumes of data. This creates a need to detect suspicious activities on online discussion forums by making effective use of data mining tools. This paper highlights the data mining techniques that have been prototyped and implemented for closely studying discussion forum data for suspicious activities in different domains. Numerous mining methods have been implemented to date for detecting suspicious discussions in forum datasets; with them, doubtful activities can be revealed by analyzing the interests of all users. The main obstacle researchers face is the lack of information retrieval and data analysis tools for real-time data from forum websites. The existing data is massive, so extracting the desired knowledge from such a large search space of social data requires an intelligent and interactive data mining algorithm. Moreover, the large number of parameters involved makes an exhaustive large-scale search impractical. Consequently, efficient search approaches are essential.
Discovering such information requires an understanding of data mining. Data mining is defined as the process of discovering, extracting, and analyzing meaningful patterns, structures, models, and rules from large quantities of data. It is emerging as one of the tools for crime detection, clustering of locations to find crime hot spots, criminal profiling, prediction of crime trends, and many other related applications.
Much scientific research has been done on the significance of crime data mining, and its results are reflected in new software applications for analyzing and detecting crime data.
Fabio Calefato, Filippo Lanubile, and Nicole Novielli of the University of Bari “Aldo Moro” have developed a framework, EmoTxt, for detecting emotions in online forums. EmoTxt identifies emotions in an input corpus provided as a comma-separated values (CSV) file, with one text per line, preceded by a unique identifier. The output is a CSV file containing the text id and the predicted label for each item of the input collection. Their model aims to recognize specific emotions, such as joy, love, and anger, whereas other proposed systems classify emotions as positive, negative, or neutral.
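The input and output format described above can be illustrated with a minimal sketch. It is not EmoTxt's implementation: the keyword lists, the `predict_emotion` function, the `classify_csv` helper, and the `no_emotion` fallback label are all hypothetical stand-ins for a trained classifier, and the sample rows are invented for illustration. Only the file layout (identifier followed by text on input, identifier followed by predicted label on output) follows the description.

```python
import csv
import io

# Hypothetical keyword lists: a toy stand-in for a trained classifier,
# not the model used by EmoTxt.
KEYWORDS = {
    "joy": {"happy", "glad", "great"},
    "love": {"love", "adore"},
    "anger": {"angry", "hate", "furious"},
}


def predict_emotion(text):
    """Return the first emotion whose keyword occurs in the text."""
    words = set(text.lower().split())
    for label, keys in KEYWORDS.items():
        if words & keys:
            return label
    return "no_emotion"  # assumed fallback label, not part of the source


def classify_csv(in_stream, out_stream):
    """Read (id, text) rows and write (id, predicted label) rows."""
    reader = csv.reader(in_stream)
    writer = csv.writer(out_stream, lineterminator="\n")
    for row in reader:
        if len(row) < 2:
            continue  # skip empty or malformed lines
        text_id, text = row[0], row[1]
        writer.writerow([text_id, predict_emotion(text)])


# Invented sample corpus: one text per line, preceded by a unique identifier.
sample = io.StringIO(
    '1,I am so happy today\n'
    '2,"I hate this bug, it is awful"\n'
    '3,Meeting at noon\n'
)
result = io.StringIO()
classify_csv(sample, result)
print(result.getvalue())
```

Using the `csv` module rather than splitting on commas keeps quoted texts that themselves contain commas intact, as in the second sample row.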
According to Calefato and colleagues, the framework defines a tree-structured hierarchical classification of emotions, where each level refines the granularity of the previous one and thus gives a more specific indication of the emotion's nature. At the top level, the framework includes six basic emotions: love, joy, anger, sadness, fear, and surprise.
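One way to picture such a hierarchy is as a nested mapping, sketched below under stated assumptions. The six top-level labels come from the description above; the second-level labels are illustrative placeholders chosen for this example and are not taken from the framework, and the names `EMOTION_TREE`, `find_path`, and `basic_emotion` are hypothetical.

```python
# Top level: the six basic emotions named in the text.
# Second level: illustrative placeholders only (an assumption of this sketch).
EMOTION_TREE = {
    "love": {"affection": {}, "longing": {}},
    "joy": {"cheerfulness": {}, "contentment": {}},
    "anger": {"irritation": {}, "rage": {}},
    "sadness": {"disappointment": {}},
    "fear": {"nervousness": {}},
    "surprise": {},
}


def find_path(tree, label, prefix=()):
    """Return the path from the top level down to `label`, or None if absent."""
    for name, children in tree.items():
        path = prefix + (name,)
        if name == label:
            return path
        found = find_path(children, label, path)
        if found:
            return found
    return None


def basic_emotion(label):
    """Map a label at any depth back to its top-level basic emotion."""
    path = find_path(EMOTION_TREE, label)
    return path[0] if path else None


print(find_path(EMOTION_TREE, "rage"))
print(basic_emotion("contentment"))
```

Because each level refines the one above it, a fine-grained label can always be collapsed back to its basic emotion by taking the first element of its path.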