Conference object . 2013
Data sources: NARCIS

Time-Aware Chi-squared for Document Filtering over Time

Authors: Kenter, T.; Graus, D.; Meij, E.; de Rijke, M.;

Abstract

Document filtering over time is applied in tasks such as tracking topics in online news or social media. We treat it as a classification task in which topics of interest correspond to classes and the feature space consists of the words associated with each class. In streaming settings, the set of words associated with a concept may change. In this paper we employ a multinomial Naive Bayes classifier and perform periodic feature selection to adapt to evolving topics. We propose two ways of employing Pearson's χ² test for feature selection and demonstrate their benefit on the TREC KBA 2012 data set. By incorporating a time-dependent function in our equations for χ², we provide an elegant method for applying different weighting and windowing schemes. Experiments show that our approach improves over a non-adaptive baseline in a realistic setting with limited amounts of training data.
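
The abstract does not reproduce the paper's equations, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that each document contributes to the χ² contingency table with a weight given by a time-dependent function of its age; the names `time_weighted_chi2`, `exponential_decay`, `select_top_k`, and the `half_life` parameter are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def exponential_decay(age, half_life=7.0):
    """Hypothetical time-dependent weight: halves every `half_life` time units."""
    return 0.5 ** (age / half_life)


def time_weighted_chi2(docs, now, weight=exponential_decay):
    """Score terms with chi-squared computed from time-weighted document counts.

    docs: iterable of (timestamp, set_of_terms, is_relevant) tuples.
    weight: function mapping a document's age (now - timestamp) to a weight.
    Returns a dict {term: chi-squared score}.
    """
    n_pos = n_neg = 0.0
    pos = defaultdict(float)  # weighted count of relevant docs containing the term
    neg = defaultdict(float)  # weighted count of non-relevant docs containing the term
    for timestamp, terms, relevant in docs:
        w = weight(now - timestamp)
        if relevant:
            n_pos += w
            for term in terms:
                pos[term] += w
        else:
            n_neg += w
            for term in terms:
                neg[term] += w

    n = n_pos + n_neg
    scores = {}
    for term in set(pos) | set(neg):
        a = pos[term]                  # term present, relevant
        b = neg[term]                  # term present, not relevant
        c = max(n_pos - a, 0.0)        # term absent, relevant
        d = max(n_neg - b, 0.0)        # term absent, not relevant
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # Standard 2x2 chi-squared statistic; 0.0 when a margin is empty.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom > 0 else 0.0
    return scores


def select_top_k(scores, k):
    """Return the k highest-scoring terms, for use as the classifier's features."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Under these assumptions, a constant weight (`lambda age: 1.0`) gives the ordinary χ² statistic, a step function (`lambda age: 1.0 if age <= 5 else 0.0`) gives a sliding window, and `exponential_decay` gives a smooth down-weighting of older documents. Rerunning the selection at intervals and retraining the classifier on the selected terms would correspond to the periodic feature selection the abstract describes.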

Country
Netherlands


Funded by
  • NWO | Building Rich Links to Enable Television History Research
    Funder: Netherlands Organisation for Scientific Research (NWO)
    Project Code: 640.004.802
  • EC | PROMISE: Participative Research labOratory for Multimedia and Multilingual Information Systems Evaluation
    Funder: European Commission (EC)
    Project Code: 258191
    Funding stream: FP7 | SP1 | ICT
  • NWO | Semantic Search in E-Discovery
    Funder: Netherlands Organisation for Scientific Research (NWO)
    Project Code: 727.011.005
  • EC | LIMOSINE: Linguistically Motivated Semantic aggregatIon engiNes
    Funder: European Commission (EC)
    Project Code: 288024
    Funding stream: FP7 | SP1 | ICT
Related to Research communities
Digital Humanities and Cultural Heritage DH-CH communities : CLARIN