Mining Textual Chinese News Articles
This paper implements a system for analyzing unlabeled news documents written in Chinese. In response to a specific query, the system retrieves at most 100 relevant documents from a document collection and then groups the retrieved documents by author. The system framework is implemented in two phases: phase 1 implements an Information Retrieval system, and phase 2 implements a classification system. Each phase consists of several steps, and each step uses different algorithms, tools, and techniques. The results of each phase are evaluated according to an evaluation schema.
Keywords: Chinese news document analysis, Information Retrieval, Document Classification.
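The two-phase flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the algorithms used, so the character-bigram tokenization, the cosine-similarity ranking, and the function names (bigrams, cosine, retrieve, group_by_author) are all assumptions chosen for illustration. In particular, phase 2 is shown here as simple grouping on a hypothetical "author" field, standing in for the classification system the paper describes.

```python
import math
from collections import Counter, defaultdict

MAX_RESULTS = 100  # the abstract caps retrieval at 100 documents


def bigrams(text):
    """Split text into overlapping character bigrams (assumed tokenization,
    used because written Chinese has no whitespace between words)."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, limit=MAX_RESULTS):
    """Phase 1 (sketch): rank documents against the query, keep at most `limit`."""
    q = Counter(bigrams(query))
    scored = [(cosine(q, Counter(bigrams(d["text"]))), d) for d in docs]
    relevant = [s for s in scored if s[0] > 0]  # drop documents with no overlap
    ranked = sorted(relevant, key=lambda s: s[0], reverse=True)
    return [d for _, d in ranked[:limit]]


def group_by_author(docs):
    """Phase 2 stand-in: group retrieved documents by a known author field.
    The paper uses a classifier here; this lookup is only a placeholder."""
    groups = defaultdict(list)
    for d in docs:
        groups[d["author"]].append(d["id"])
    return dict(groups)


# Hypothetical toy collection; ids, authors, and texts are made up.
docs = [
    {"id": 1, "author": "A", "text": "台北市今天下雨"},
    {"id": 2, "author": "B", "text": "股市今天上漲"},
    {"id": 3, "author": "A", "text": "台北市交通壅塞"},
]
hits = retrieve("台北市", docs)
groups = group_by_author(hits)
print([d["id"] for d in hits], groups)
```

Keeping the 100-document cap as a parameter of retrieve makes the limit from the abstract explicit while leaving the ranking function free to be swapped for whatever weighting scheme the full paper uses.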
ABOUT THE AUTHORS
Mohammad Alkhaleefah
He received the M.Sc. degree from the University of Manchester, United Kingdom, in 2008. He is a Ph.D. candidate in the Graduate Institute of Networking and Multimedia at National Taiwan University. He is a lecturer with the Faculty of Rahma College at Al-Balqa Applied University, Jordan. His research interests include e-government, e-services, Web 2.0, knowledge management, and data mining.
Shadi Alian
He received the B.Sc. degree in Computer Science from Yarmouk University, Irbid, Jordan, in 2004. He was then awarded a merit-based scholarship to pursue an M.Sc. degree in Computer Science at Northeastern Illinois University, Chicago, Illinois, USA, in 2007. He has been a lecturer at The University of Jordan, Aqaba, Jordan, since 2010. His research interests include multi-agent algorithms, mutation testing, automatic test data generation, and natural language processing.
Orabe Almanaseer
He received the M.Sc. degree from the University of Manchester, United Kingdom, in 2008. He is a lecturer with the Faculty of Information Technology and Systems at the University of Jordan, Aqaba, Jordan. His research interests include Web 2.0 and social media for e-government, ICT for development, and document interoperability. Mr. Almanaseer is an IEEE Student Member.