DataCleaner

The premier Open Source Data Quality solution.

Powered by Neopost and Human Inference

Module structure

The main application modules are:

  • api - The public API of DataCleaner. Mostly interfaces and annotations that you should use to build your own extensions.
  • testware - Useful classes for unit testing of DataCleaner and extension code.
  • engine
      • core - The core engine piece, which executes jobs and components as per the API.
      • xml-config - Utilities for reading and writing DataCleaner job files and configuration files.
      • env - Alternative environments that DataCleaner can run in, for instance Apache Spark or webapp-cluster.
  • components
      • ... - Many submodules containing built-in as well as additional components/extensions to use with DataCleaner.
      • standard-components - A container project that depends on all components normally bundled in the DataCleaner community edition.
  • desktop
      • api - The public API for the DataCleaner desktop application.
      • ui - The Swing-based user interface for desktop users.
  • monitor
      • api - The API classes and interfaces of DataCleaner monitor.
      • services - Web services and controllers of DataCleaner monitor.
      • widgets - Reusable widgets and UI work, based on GWT.
      • ui - The actual web user interface, based primarily on GWT and JSF.
  • documentation - End-user reference documentation, published on https://datacleaner.org/docs

Code style and formatting

In the root of the project you can find 'Formatter-[IDE].xml' files which enable you to import the code formatting rules of the project into your IDE.

Continuous Integration

There's a public build of DataCleaner that can be found on Travis CI:

https://travis-ci.org/datacleaner/DataCleaner

Where to go for end-user information?

Please visit the main DataCleaner website https://datacleaner.org for downloads, news, forums etc.

Reference documentation for users is available at https://datacleaner.org/docs. The GitHub wiki and issues are used for developer and technical topics only.

License

Licensed under the GNU Lesser General Public License (LGPL), see http://www.gnu.org/licenses/lgpl.txt
