Abstract
This paper concerns the problem of predicting the behaviour of web users from real historical data, which is an important issue in web mining.
The research reported here was conducted while the authors participated in Track 1 of the international ECML/PKDD 2007 Discovery Challenge competition.
The solution presented here won the contest.
We describe the contest tasks and the real industrial datasets on which our experiments were performed, which record the behaviour of a sample of Polish web users.
We present the entire experimental process, from data preprocessing through exploratory analysis of the data to the experimental comparison and discussion of the prediction models we examined.
As we explain, our solution has low time and space complexity, scales well with large datasets and, at the same time, produces high-quality results.
