ofrendo/WebDataIntegration

WebDataIntegration (Group 5)

Our goal in this project is to integrate information about companies with information about the cities in which their headquarters are located. The resulting dataset can then be analyzed from a data science point of view to find relationships, e.g. how the population of a city correlates with the size or other attributes of the companies based there. To gather more information about companies, we first combine several company datasets derived from different sources. We then integrate this result with the location data.

Development Folders

  • data: Contains the raw datasets, the gold standards, and the mapping/resolution/fusion results
  • latex: .tex files for the project report
  • lib: .jar files used as libraries in Java
  • queries: The queries we used to collect data from Freebase and DBpedia
  • RapidMinerRepo: Contains the RapidMiner processes used to learn matching rules with linear regression. Can be imported as a repository in RapidMiner
  • src: Java source code
  • usecase: Sample files provided for a movie/actors use case
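
To illustrate what a matching rule learned with linear regression could look like once applied in Java, here is a minimal sketch. It is not taken from the project's source code: the class name `MatchingRuleSketch`, the compared attributes (company name and city), the weights, and the threshold are all hypothetical. In practice the weights would come from the regression model trained in RapidMiner. The rule computes a weighted sum of per-attribute similarities and declares two records a match when the score reaches the threshold.

```java
import java.util.Locale;

public class MatchingRuleSketch {

    // Hypothetical values for illustration only; a learned model would supply these.
    static final double W_NAME = 0.7;
    static final double W_CITY = 0.3;
    static final double THRESHOLD = 0.75;

    /** Levenshtein edit distance computed with two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]: one minus the length-normalized edit distance, ignoring case. */
    static double similarity(String a, String b) {
        String x = a.trim().toLowerCase(Locale.ROOT);
        String y = b.trim().toLowerCase(Locale.ROOT);
        int maxLen = Math.max(x.length(), y.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) levenshtein(x, y) / maxLen;
    }

    /** Linear combination of the attribute similarities. */
    static double score(String nameA, String cityA, String nameB, String cityB) {
        return W_NAME * similarity(nameA, nameB) + W_CITY * similarity(cityA, cityB);
    }

    /** Two records are treated as the same company if the score reaches the threshold. */
    static boolean isMatch(String nameA, String cityA, String nameB, String cityB) {
        return score(nameA, cityA, nameB, cityB) >= THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(isMatch("Siemens AG", "Munich", "Siemens", "Munich"));
        System.out.println(isMatch("Siemens AG", "Munich", "Bayer AG", "Leverkusen"));
    }
}
```

A single linear score keeps the rule easy to inspect: each weight shows how much an attribute contributes, and the threshold can be tuned against the gold standard to trade precision for recall.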

Contributors

  • Oliver Frendo
  • Dandan Li
  • Zehui Wang
  • Yi-Ru Cheng

About

Web Data Integration project 1st semester
