CHANGE LOG

Development TODOs
Not Started:
  • Create location and geographical parsing
    • Look into geograpy library
  • Create synonym based regex parser
  • Develop annotator groups (sort of like mini pipelines)
  • Write tests for Annotators and Pipelines
  • Remove choice of SVM or NB from classification building / training (feature is never used)
In-Progress:
  • Reconfiguring Papertrail deployment
Version Change Log:
Version 0.5: Date time parsing, number parsing, forcing SSL, Regex parsing
  • Integrate SpaCy NER for numbers
  • Start prototyping datetime parsing
  • Create bot specific annotator capability
  • Regex based annotator (minimal sketch after this list)
  • Force HTTPS on all API endpoints.
  • Docker deployment
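The regex-based annotator noted above can stay small; a minimal sketch, where the class name and annotation format are illustrative assumptions rather than the project's actual API:

```python
import re

class RegexAnnotator:
    """Tags substrings of an expression that match a pattern (illustrative sketch)."""

    def __init__(self, name, pattern, flags=re.IGNORECASE):
        self.name = name
        self.pattern = re.compile(pattern, flags)

    def annotate(self, text):
        # One annotation per match, with character offsets.
        return [
            {"annotator": self.name, "value": m.group(0), "start": m.start(), "end": m.end()}
            for m in self.pattern.finditer(text)
        ]

# Usage: a bot-specific annotator for US ZIP codes.
zip_annotator = RegexAnnotator("zip_code", r"\b\d{5}(?:-\d{4})?\b")
print(zip_annotator.annotate("Ship it to 94105 or 10001-1234."))
```
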
Version 0.4: Logging, Binary Classification Annotators, Binary Regex Annotators, Update Routes, .env based schema
  • In-app logging to Papertrail, transmission of Apache syslog to Papertrail
  • COMMENT/DOCUMENTATION OVERHAUL => START STRONG, FINISH STRONG
  • Work on gazetteer specificity
  • Start prototyping 'Trait' parsing
  • Create Regex annotator capability
    • Code
    • Test
  • Build a test database that mimics the production database.
  • Restructure database to support entities table and connect to intents with OIDs.
  • Integrate trait parsing for plurality
  • Store annotator information in database to avoid hardcoding parameters and data
  • Create custom exceptions for classifiers and gazetteers
  • Create test for get-stopwords-and-entities database method
    • NOTE: Method has been deprecated and removed
  • Add m:n table for expressions and entities to link binary entities to their expressions
  • Expand the Train class's set of endpoints to train classifiers and gazetteers piecemeal.
    • Code
    • Test
  • Choose schema search path via environment variable
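A minimal sketch of the environment-driven schema choice, assuming psycopg2 and an environment variable named DB_SCHEMA (the variable names and defaults here are assumptions):

```python
import os

import psycopg2

# Schema search path chosen via environment variable, defaulting to 'nlp'.
schema = os.environ.get("DB_SCHEMA", "nlp")

conn = psycopg2.connect(
    host=os.environ.get("DB_HOST", "localhost"),
    dbname=os.environ.get("DB_NAME", "test"),
    user=os.environ.get("DB_USER", "postgres"),
    password=os.environ.get("DB_PASSWORD", ""),
    # libpq's options parameter sets the search_path for every query
    # issued on this connection.
    options="-c search_path={}".format(schema),
)
```
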
Version 0.3: Major changes for gazetteer and analysis pipeline
  • Create gazetteer
    • code
    • test
  • Create Analysis pipeline to combine classifier and gazetteer (sketch after this list)
    • code
    • test
  • Build out additional database methods to support entity types and stopwords associated with intents
    • code
    • test
  • Integrate new db methods into Classification class; return results to be used by the classification annotator
    • code
    • test
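The analysis pipeline above can be a thin wrapper that runs each annotator over the same expression and collects the results; a sketch with an assumed annotator interface (a name attribute and an annotate() method) standing in for the project's real classifier and gazetteer classes:

```python
class AnalysisPipeline:
    """Runs each annotator in order and keys the results by annotator name (illustrative sketch)."""

    def __init__(self, *annotators):
        self.annotators = list(annotators)

    def analyze(self, text):
        # e.g. {"classification": {...}, "gazetteer": [...]}
        return {a.name: a.annotate(text) for a in self.annotators}

# Usage, assuming classifier and gazetteer objects that expose the interface above:
# pipeline = AnalysisPipeline(classification_annotator, gazetteer_annotator)
# print(pipeline.analyze("add two pounds of apples to my list"))
```
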
Version 0.2
  • Create basic validation webpage for unlabeled expressions.
    • code
    • test
  • Build out archive table and unlabeled expression table
  • Look into using Flask's g object for persistent db connection
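On the g object item: g is scoped to the request/application context rather than truly persistent across requests, so the usual pattern is to open the connection lazily and close it in a teardown handler. A sketch assuming psycopg2 and placeholder connection parameters:

```python
import psycopg2
from flask import Flask, g

app = Flask(__name__)

def get_db():
    # Open the connection lazily and cache it on g for the rest of the request.
    if "db" not in g:
        g.db = psycopg2.connect(dbname="test", user="postgres")  # placeholder params
    return g.db

@app.teardown_appcontext
def close_db(exception=None):
    # Runs when the application context ends; close the cached connection.
    db = g.pop("db", None)
    if db is not None:
        db.close()
```
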
Version 0.1 - Initial Version
  • Implement unlabeled expressions table and routes to add un-validated expressions
    • code
    • test
  • Improve classification response to show scoring metrics if possible for LinearSVM
  • Implement request parsing and parameter validation middleware or library (example sketch at the end of this log)
    • Using webargs library (utilizes Marshmallow)
  • Setup External Postgres Database to handle Expressions
  • Test caching of Spacy Model to see if it works in production
  • Authentication using simple token-based WWW-Authenticate authorization (sketch at the end of this log)
    • Requires adding 'WSGIPassAuthorization On' to the wsgi.conf file on Elastic Beanstalk
  • Allow choice between Naive Bayes and LinearSVM during construction of sk-learn pipeline
  • Implement config/environment files for use in configuring app.
  • Host Server on AWS Elastic Beanstalk
  • Build out basic logging capabilities
  • Create route to delete intent and make sure all expressions are deleted too
    • code
    • test
  • Create route to delete expression/s
    • code
    • test
  • Create custom exceptions to raise up to the route level for Database Errors and Database Input Errors
  • Convert all server responses to JSON format rather than Flask's default HTML (this might be alleviated by flask-restful).
    • Implemented through the jsonify(key1=value1, key2=value2, ...) method, which creates a Flask response from the passed-in key/value pairs (sketch at the end of this log)
  • Create route to update classification model with new expressions
    • code
    • test
  • Create routes to add expressions to postgres
    • code
    • test
  • Explore potential use of Flask Restful extension rather than using smaller basic extensions (reinventing wheel?)
  • Create tests for existing routes.
  • Create database method to remove all expressions from an intent
    • code
    • test
  • Create database methods to add and remove an intent
    • code
    • test
  • Create table 'intent_expressions', in schema 'nlp', in local server database 'test'
  • NLP_Database class to pull intents and expressions from pg database
    • code
    • test
  • Update training of classifier to use intents/expressions in postgres
    • code
    • test
  • Initial push to git:
    • Basic classification module
    • Transformers
    • caching of Spacy model
    • basic Flask app
    • resources
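
Example sketches for a few of the Version 0.1 items above. First, the webargs-based request parsing; the route and field names are assumptions, not the app's actual parameters:

```python
from flask import Flask, jsonify
from webargs import fields
from webargs.flaskparser import use_kwargs

app = Flask(__name__)

@app.route("/expressions", methods=["POST"])
@use_kwargs({
    "intent": fields.Str(required=True),
    "expressions": fields.List(fields.Str(), required=True),
})
def add_expressions(intent, expressions):
    # webargs validates the request against the Marshmallow fields above and
    # rejects malformed input before this handler runs.
    return jsonify(intent=intent, added=len(expressions))
```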
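
Next, the token-based authorization; the header scheme and config key are assumptions, and the Authorization header only reaches the app on Elastic Beanstalk once mod_wsgi is configured to pass it through:

```python
from flask import Flask, Response, request

app = Flask(__name__)
app.config["ACCESS_TOKEN"] = "change-me"  # assumed config key; load from the environment in practice

@app.before_request
def require_token():
    # Expect "Authorization: Token <value>"; anything else gets a 401 with a
    # WWW-Authenticate challenge naming the expected scheme.
    supplied = request.headers.get("Authorization", "")
    if supplied != "Token {}".format(app.config["ACCESS_TOKEN"]):
        return Response("Invalid or missing token.", status=401,
                        headers={"WWW-Authenticate": "Token"})
```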
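
Finally, the jsonify pattern referenced in the JSON-responses item; the route and payload are illustrative:

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/intents/<name>")
def get_intent(name):
    # jsonify builds a response with Content-Type: application/json
    # from the passed-in key/value pairs.
    return jsonify(intent=name, expressions=[], error=None)
```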