    Repositories list

    • alive

      Public
      Service status checker for the BCube triple store.
      Python
      GNU General Public License v3.0
      0067 Updated Jul 6, 2022
    • nutch

      Public
      Mirror of Apache Nutch
      Java
      Apache License 2.0
      1.2k200 Updated Feb 15, 2018
    • OWSLib

      Public
      OWSLib is a Python package for client programming with Open Geospatial Consortium (OGC) web service (hence OWS) interface standards, and their related content models.
      Python
      BSD 3-Clause "New" or "Revised" License
      275080 Updated Jun 20, 2016
    • restparql

      Public
      RESTful micro service on top of the BCube triple store.
      Python
      GNU General Public License v3.0
      0000 Updated Feb 25, 2016
    • initial text preprocessors for the triplestore and feature classification
      Jupyter Notebook
      Other
      32280 Updated Jan 30, 2016
    • Python
      Other
      2110 Updated Jan 28, 2016
    • intermediate pipeline workflows for ad hoc analysis
      Python
      1000 Updated Jan 28, 2016
    • Jupyter Notebook
      Other
      1010 Updated Jan 28, 2016
    • The harvest data characterized and represented as turtle (ttl).
      Python
      Other
      0000 Updated Jan 14, 2016
    • Python
      BSD 3-Clause "New" or "Revised" License
      1010 Updated Dec 31, 2015
    • 1000 Updated Nov 19, 2015
    • frontera

      Public
      A flexible frontier for web crawlers
      Python
      BSD 3-Clause "New" or "Revised" License
      217000 Updated Oct 1, 2015
    • semantics

      Public
      OWL ontologies to describe web services found by the BCube Nutch Crawler.
      GNU General Public License v3.0
      2210 Updated Sep 22, 2015
    • Quickly build arbitrary size Hadoop Cluster based on Docker
      Shell
      860000 Updated Jun 3, 2015
    • demonstration of the workflow from harvest to triplestore through a flask API
      Python
      MIT License
      1000 Updated May 28, 2015
    • A crawler/parser for THREDDS catalogs
      Python
      GNU General Public License v3.0
      22060 Updated May 19, 2015
    • Apache Nutch fork tuned for web services and data discovery.
      Java
      Apache License 2.0
      2920 Updated May 18, 2015
    • simhash

      Public
      A Python Implementation of Simhash Algorithm
      Python
      MIT License
      222000 Updated May 11, 2015
    • quick comparison view for raw xml service responses, similarity scores on bag of words, xml bits excluded
      Python
      5000 Updated Apr 3, 2015
    • Python API for Various DB-Backed Simhash Clusters
      Python
      MIT License
      23000 Updated Mar 12, 2015
    • Simhash and near-duplicate detection
      Python
      MIT License
      116000 Updated Mar 11, 2015
    • Better interoperability between open source metadata servers and clients.
      Python
      MIT License
      21000 Updated Oct 2, 2014
    • Deploying apache-hadoop in a virtualized cluster as easy as 1-2-3.
      Shell
      80000 Updated Jun 5, 2014
    • Java
      34000 Updated Apr 30, 2014
    • Java
      Other
      0000 Updated Dec 1, 2012