Abstract
The structure of the web yields significant insights for algorithms that search, organize, discover, mine, and reveal web information. We formalize our view of web structure as an integer linear program that converts the directed web graph into an optimal hierarchical structure. The model represents a high-level structure regardless of which measure is used, such as cosine similarity and tf-idf in the vector space model or Google's PageRank. Another advantage of our approach is that the corresponding sensitivity analysis yields an allowable range for the optimal structure, so the model can still be estimated when web pages, links, and structures change dynamically.
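The abstract does not give the formulation, so the following is only an illustrative sketch of the kind of problem described, not the paper's model. It assumes one binary decision per link (keep it as a parent-child edge or not), exactly one parent for every non-root page, no cycles, and an objective that maximizes the total weight of the kept links. The weights stand in for any page measure (tf-idf similarity, PageRank, and so on). The names `best_hierarchy` and `reaches_root` and the toy data are hypothetical, and brute-force enumeration replaces an integer programming solver to keep the example self-contained.

```python
from itertools import product


def reaches_root(parents, root):
    """Return True if following parent pointers from every page ends at root."""
    for node in parents:
        seen = set()
        while node != root:
            if node in seen:  # revisited a page, so the choice contains a cycle
                return False
            seen.add(node)
            node = parents[node]
    return True


def best_hierarchy(nodes, root, weight):
    """Pick one parent per non-root page to maximize total link weight.

    weight maps a (parent, child) link to its score. Assumed formulation,
    solved by enumeration rather than by an ILP solver.
    """
    children = [n for n in nodes if n != root]
    # Candidate parents of a page are the pages that link to it.
    candidates = [[p for p in nodes if (p, c) in weight] for c in children]
    best_total, best_parents = None, None
    for choice in product(*candidates):
        parents = dict(zip(children, choice))
        if not reaches_root(parents, root):
            continue  # infeasible: not a hierarchy rooted at root
        total = sum(weight[(p, c)] for c, p in parents.items())
        if best_total is None or total > best_total:
            best_total, best_parents = total, parents
    return best_total, best_parents


# Toy web graph: link weights are made-up scores for illustration only.
nodes = ["home", "a", "b", "c"]
weight = {
    ("home", "a"): 3, ("home", "b"): 1,
    ("a", "b"): 2, ("b", "a"): 5,
    ("a", "c"): 1, ("b", "c"): 4,
}
total, parents = best_hierarchy(nodes, "home", weight)
print(total, parents)
```

In this toy graph the heaviest choice overall (a under b, b under a) is rejected because it forms a cycle, which is the role the structural constraints play in an integer programming formulation.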
Original language | English |
---|---|
Journal | CEUR Workshop Proceedings |
Volume | 140 |
State | Published - 2005 |
Externally published | Yes |
Event | ICWS 2005 2nd International Workshop on Semantic and Dynamic Web Processes, SDWP 2005 - Orlando, FL, United States. Duration: 11 Jul 2005 → 11 Jul 2005 |
Keywords
- Linear programming
- PageRank
- Tf-idf
- VSM
- Web structuring