nlalak/SDARTS

Introduction

SDARTS is a protocol for metasearching over document collections. You may consider using SDARTS if:

  • You want to search one or more of your own text or XML collections from a single search interface.
  • You want to search remote document collections that export their metadata under the Open Archives protocol.
  • You want to search multiple web-based document collections from a single search interface.

SDARTS was developed as part of PERSIVAL (an NSF Digital Library Initiative--Phase 2 project) at the Computer Science Department of Columbia University.

SDARTS is a hybrid of two previously existing protocols, STARTS and SDLIP. SDARTS is essentially an instantiation of the SDLIP protocol with a richer set of metadata, which can be effectively used for building sophisticated metasearchers. SDARTS makes a wide variety of collections with heterogeneous interfaces accessible under one uniform interface.
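To illustrate what "one uniform interface" over heterogeneous collections can look like, here is a minimal Python sketch. It is not the SDARTS wrapper API: the names `Hit`, `Wrapper`, `InMemoryWrapper`, and `metasearch` are hypothetical, and the term-count scoring is a stand-in chosen only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass
class Hit:
    collection: str  # name of the collection that produced the hit
    doc_id: str
    score: float


class Wrapper(Protocol):
    """Hypothetical uniform interface; NOT the actual SDARTS wrapper API."""

    name: str

    def search(self, query: str) -> List[Hit]: ...


class InMemoryWrapper:
    """Toy wrapper over a dict of doc_id -> text, scored by raw term counts."""

    def __init__(self, name: str, docs: Dict[str, str]) -> None:
        self.name = name
        self.docs = docs

    def search(self, query: str) -> List[Hit]:
        terms = query.lower().split()
        hits = []
        for doc_id, text in self.docs.items():
            words = text.lower().split()
            score = sum(words.count(t) for t in terms)
            if score > 0:
                hits.append(Hit(self.name, doc_id, float(score)))
        return hits


def metasearch(wrappers: List[Wrapper], query: str, limit: int = 10) -> List[Hit]:
    """Send the same query to every wrapper and merge hits by score."""
    merged = [hit for w in wrappers for hit in w.search(query)]
    # sorted() is stable, so ties keep the order of the wrappers list.
    return sorted(merged, key=lambda h: h.score, reverse=True)[:limit]
```

Merging raw scores across collections is a simplification: scores from different search engines are generally not comparable, which is one reason a metasearcher benefits from the richer per-collection metadata described above.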

The SDARTS toolkit provides ready-to-use, configurable wrappers. They can be used directly for wrapping locally available text and XML collections, and for wrapping web-accessible databases.

The SDARTS toolkit also contains two optional sets of applications. The OAI SDARTS Cooperative Suite makes SDARTS OAI-compliant and enables SDARTS to access OAI-compliant collections. The SDARTS Automatic Content Summary Extraction for remote web databases extracts statistics about the vocabulary and word frequencies of web databases to which SDARTS does not have direct access.
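As a rough picture of what a content summary holds, the Python sketch below computes vocabulary and document-frequency statistics from a handful of documents. It assumes the documents have already been retrieved from the remote database in some way; the function name `content_summary`, the tokenization rule, and the output layout are illustrative assumptions, not the toolkit's actual extraction method or format.

```python
import re
from collections import Counter
from typing import Dict, List


def content_summary(sampled_docs: List[str]) -> Dict[str, object]:
    """Count, for each word, how many sampled documents contain it.

    Hypothetical helper: the tokenizer (lowercase alphanumeric runs) and
    the returned dict layout are assumptions made for this sketch.
    """
    doc_freq: Counter = Counter()
    for text in sampled_docs:
        # set() ensures each word is counted once per document.
        doc_freq.update(set(re.findall(r"[a-z0-9]+", text.lower())))
    return {"num_docs": len(sampled_docs), "doc_freq": dict(doc_freq)}


if __name__ == "__main__":
    summary = content_summary(["Heart attack", "heart surgery heart"])
    print(summary["num_docs"], summary["doc_freq"]["heart"])
```

A metasearcher can use statistics of this kind to decide which collections are worth querying for a given set of terms.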


Documentation

Installation Instructions

Source code documentation

Wrapper Configuration

SDARTS supports three types of wrappers: the text "doc" wrapper, the XML "doc" wrapper, and the "www" wrapper. They are for local plain-text documents, local XML documents, and remote web-based collections fronted by a CGI-based search engine, respectively.


Download

SDARTS Server Executables and Source

SDARTS Clients Executables and Source

Sample Local Document Collections

(Note: These are the document collections themselves; the wrapping files are in the distribution)


SDARTS API

The SDARTS API provides a way to query the collections indexed by an SDARTS server directly from within an application. The SDARTS API is a web service over SOAP, and a WSDL description of the service is provided, so developers can call the API from the language of their choice.

To use the SDARTS API, developers can either:

  • Download the WSDL description of the service (and use, for example, SOAP::Lite for Perl, Visual Studio .NET, or any other toolkit that supports web services), or

  • Download the necessary proxy files for Java from the "Download" section.
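For developers taking the WSDL route, a first step is often to check which operations the service description declares. The Python sketch below lists them using only the standard library. It assumes a WSDL 1.1 document; `SAMPLE_WSDL` and its operation names are invented for illustration and do not reflect the actual SDARTS service description.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace of WSDL 1.1 documents.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}


def list_operations(wsdl_text: str) -> Dict[str, List[str]]:
    """Map each portType name to the operation names it declares."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        names = [
            op.get("name")
            for op in port_type.findall("wsdl:operation", WSDL_NS)
        ]
        operations[port_type.get("name")] = names
    return operations


# Made-up WSDL fragment for demonstration only (not the SDARTS WSDL).
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="ExampleService">
  <portType name="ExamplePort">
    <operation name="search"/>
    <operation name="getInfo"/>
  </portType>
</definitions>"""

if __name__ == "__main__":
    print(list_operations(SAMPLE_WSDL))
```

With the real WSDL file saved locally, the same function can be applied to its contents to see the operation names before generating or writing a client.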


Publications and Presentations

Publications

Presentations


People
