Web Content Extraction Based on Maximum Continuous Sum of Text Density
- Submitted by: Kai Sun
- Last updated: 21 November 2016 - 9:34pm
- Document Type: Poster
- Document Year: 2016
- Presenters: Kai Sun
- Paper Code: IALP1601
Different websites generally use different page structures, which heavily affects extraction quality when web content is collected automatically. The maximum continuous sum of text density (MCSTD) method extracts web content from differently structured pages efficiently and effectively.
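
The abstract names the method but does not say how it is computed, so the following is only a rough sketch of one plausible reading, not the authors' implementation. It assumes that text density is the fraction of a line's characters left after stripping tags, that a fixed threshold (here 0.5, an arbitrary choice) is subtracted so tag-heavy lines score negative, and that the main content is the contiguous run of lines with the largest score sum, found with Kadane's maximum-subarray algorithm. The names `text_density`, `max_continuous_sum`, `extract_content`, and `threshold` are all hypothetical.

```python
import re

# Assumption: a simple regex is enough to strip tags for this illustration.
TAG_RE = re.compile(r"<[^>]+>")


def text_density(line: str) -> float:
    """Fraction of a line's characters that remain after removing tags."""
    stripped = line.strip()
    if not stripped:
        return 0.0
    text = TAG_RE.sub("", stripped).strip()
    return len(text) / len(stripped)


def max_continuous_sum(scores):
    """Kadane's algorithm: return (best_sum, start, end), end exclusive."""
    best_sum, best_start, best_end = 0.0, 0, 0
    cur_sum, cur_start = 0.0, 0
    for i, score in enumerate(scores):
        if cur_sum <= 0:
            # A non-positive running sum cannot help; restart the run here.
            cur_sum, cur_start = score, i
        else:
            cur_sum += score
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i + 1
    return best_sum, best_start, best_end


def extract_content(html: str, threshold: float = 0.5) -> str:
    """Keep the contiguous block of lines with the largest density sum."""
    lines = html.splitlines()
    # Assumption: subtracting a threshold makes low-density lines negative.
    scores = [text_density(line) - threshold for line in lines]
    _, start, end = max_continuous_sum(scores)
    kept = (TAG_RE.sub("", line).strip() for line in lines[start:end])
    return "\n".join(text for text in kept if text)


if __name__ == "__main__":
    sample = "\n".join([
        '<div class="nav"><a href="/">Home</a></div>',
        "<p>First paragraph of the article body text.</p>",
        "Second line with plain text only.",
        "<p>Third paragraph, still mostly text here.</p>",
        '<div class="footer"><a href="/about">About</a></div>',
    ])
    print(extract_content(sample))
```

In this sketch the navigation and footer lines are dominated by markup, so they score below zero and fall outside the selected run, while the three text-heavy lines in the middle are kept. A real page would likely need a proper HTML parser and a tuned threshold.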