
Application of some Retrieved Information Method on Internet


Published in Volume 7, Issue 6, pp 351-357, November 2010


This paper compares several methods of information extraction on the Internet. The Internet has become a treasure trove of knowledge, and every year thousands of new pieces of information are posted on it. Extracting information from the Internet for many different purposes has therefore become an important problem. Users can extract information with available tools such as Lapis, Risk, Rapier, Wien, and Stalker. However, these tools share a disadvantage: the training data must be updated whenever the website changes. SVM and CRF combined with natural language processing are therefore presented as the best solution to this problem. Experiments on information extraction from online Vietnamese news websites with SVM and CRF give very promising results: accuracy reaches nearly 90% on the websites, and the processing time is less than one minute per site when the specified number of link levels is 1 within the base site.

Keywords: RI (Retrieved Information), CRFs (Conditional Random Fields), SVM (Support Vector Machine), ECT (Embedded Catalog Tree)
