yao8839836/COT

COT

Datasets and code for the paper:

Liang Yao, Yin Zhang, Baogang Wei, Lei Li, Fei Wu, Peng Zhang, and Yali Bian. "Concept over time: the combination of probabilistic topic model with wikipedia knowledge." Expert Systems with Applications 60 (2016): 27-38.

Dataset

3,158 TechCrunch blog posts are in data/TechCrunch 1 year (3,158 docs)/datablog/.

6,778 New York Times global news articles from 2011 are in data/NYT/.
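
As a quick sanity check after cloning, the document counts above can be compared with the number of files in each directory. The sketch below assumes one document per file and a flat directory layout, neither of which is stated in this README; the class name DatasetCheck and the method countDocuments are hypothetical and not part of the repository.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of the repository.
public class DatasetCheck {

    /** Counts the regular files directly inside dir (non-recursive). */
    static long countDocuments(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.filter(Files::isRegularFile).count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Paths as given in this README, relative to the repository root.
        Path[] dirs = {
            Paths.get("data", "TechCrunch 1 year (3,158 docs)", "datablog"),
            Paths.get("data", "NYT")
        };
        for (Path dir : dirs) {
            if (Files.isDirectory(dir)) {
                // Assumption: one document per file.
                System.out.println(dir + ": " + countDocuments(dir) + " files");
            } else {
                System.out.println(dir + ": directory not found");
            }
        }
    }
}
```

Run it from the repository root. If the counts differ from 3,158 and 6,778, the one-file-per-document assumption probably does not hold for that directory.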

Timestamp

TechCrunch: /data/TechCrunch 1 year (3,158 docs)/time.txt

NYT: /file/boc/time(NYT).txt (the timestamps can also be found in /file/doclist(NYT)part.txt)

Pre-processed files after Wikification

TechCrunch: file/boc/wikified(dense)/

NYT: file/boc/wikified(NYT)/

Monthly view statistics of Wikipedia articles

TechCrunch: file/boc/views(tech).txt

NYT: file/boc/views(NYT).txt

Implementation

/src/cot/COT.java is the implementation of the first variation (TOT + link + view).
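
For readers unfamiliar with the TOT component, here is a minimal sketch of its timestamp step. It assumes that TOT refers to the Topics over Time model (Wang and McCallum, 2006), in which each topic has a Beta distribution over normalized document timestamps, fitted by the method of moments. This is not taken from COT.java: the class TimestampBeta, its methods, and the 0.01/0.98 rescaling constants are hypothetical, and the link and view components are not covered.

```java
// Hypothetical illustration, not code from this repository.
public class TimestampBeta {

    /**
     * Rescales raw timestamps (e.g. epoch days) linearly into [0.01, 0.99],
     * so that values stay strictly inside (0, 1) as a Beta density requires.
     * The 0.01 margin is an arbitrary choice for this sketch.
     */
    static double[] normalize(long[] raw) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long t : raw) {
            min = Math.min(min, t);
            max = Math.max(max, t);
        }
        double span = Math.max(1L, max - min); // avoid division by zero
        double[] out = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = 0.01 + 0.98 * (raw[i] - min) / span;
        }
        return out;
    }

    /**
     * Method-of-moments estimate of Beta(a, b) from normalized timestamps:
     * a = m * (m(1-m)/v - 1), b = (1-m) * (m(1-m)/v - 1),
     * where m is the sample mean and v the (biased) sample variance.
     */
    static double[] fitBeta(double[] t) {
        double mean = 0.0;
        for (double x : t) {
            mean += x;
        }
        mean /= t.length;
        double var = 0.0;
        for (double x : t) {
            var += (x - mean) * (x - mean);
        }
        var = Math.max(var / t.length, 1e-6); // guard against zero variance
        double common = mean * (1.0 - mean) / var - 1.0;
        return new double[] {mean * common, (1.0 - mean) * common};
    }

    public static void main(String[] args) {
        double[] t = normalize(new long[] {0L, 25L, 50L, 75L, 100L});
        double[] ab = fitBeta(t);
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```

In such a model, fitBeta would be applied to the timestamps of the tokens currently assigned to one topic, and the resulting density would weight topic assignments during sampling.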
