apiary.apib

FORMAT: 1A
HOST: http://openocr.yourserver.com

# openocr
OpenOCR REST API - See http://www.openocr.net

# Group OpenOCR

If you are writing in Go, it's easier to just use the [open-ocr-client](https://github.com/tleyden/open-ocr-client) library, which uses this API.

## OCR Decode via URL [/ocr]

### OCR Decode [POST]

Decode the image given in the image url parameter and return the decoded OCR text

+ Request (application/json)

    + Body
                
            {  
                "img_url":"http://bit.ly/ocrimage",
                "engine":"tesseract",
                "engine_args":{  
                  "config_vars":{  
                    "tessedit_char_whitelist":"0123456789"
                  },
                  "psm":"3"
                }
            }
          
    + Schema 

            {  
               "$schema":"http://json-schema.org/draft-04/schema#",
               "title":"OpenOCR OCR Processing Request",
               "description":"A request to convert an image into text",
               "type":"object",
               "properties":{  
                  "img_url":{  
                     "description":"The URL of the image to process.",
                     "type":"string"
                  },
                  "engine":{  
                     "description":"The OCR engine to use",
                     "enum":[  
                        "tesseract",
                        "go_tesseract",
                        "mock"
                     ]
                  },
                  "engine_args":{  
                     "type":"object",
                     "description":"The OCR engine arguments to pass (engine-specific)",
                     "properties":{  
                        "config_vars":{  
                           "type":"object",
                           "description":"Config vars - equivalent of -c args to tesseract"
                        },
                        "psm":{  
                           "description":"Page Segment Mode, equivalent of -psm arg to tesseract.  To use default, omit this field from the JSON.",
                           "enum":[  
                              "0",
                              "1",
                              "2",
                              "3",
                              "4",
                              "5",
                              "6",
                              "7",
                              "8",
                              "9",
                              "10"
                           ]
                        },
                        "lang":{  
                           "description":"The language to use.  If omitted, will use English",
                           "enum":[  
                              "eng",
                              "ara",
                              "bel",
                              "ben",
                              "bul",
                              "ces",
                              "dan",
                              "deu",
                              "ell",
                              "fin",
                              "fra",
                              "heb",
                              "hin",
                              "ind",
                              "isl",
                              "ita",
                              "jpn",
                              "kor",
                              "nld",
                              "nor",
                              "pol",
                              "por",
                              "ron",
                              "rus",
                              "spa",
                              "swe",
                              "tha",
                              "tur",
                              "ukr",
                              "vie",
                              "chi-sim",
                              "chi-tra"
                           ]
                        }
                     },
                     "additionalProperties":false
                  },
                  "inplace_decode":{  
                     "type":"boolean",
                     "description":"If true, will attempt to do ocr decode in-place rather than queuing a message on RabbitMQ for worker processing.  Useful for local testing, not recommended for production."
                  }
               },
               "required":[  
                  "img_url",
                  "engine"
               ],
               "additionalProperties":false
            }
        

+ Response 200 (text/plain)

        Decoded OCR Text

## OCR Decode via file upload [/ocr-file-upload]

### OCR Decode [POST]

Decode the image data given directly in the multipart/related data.  The order is important, and the first part should be the JSON, which tells it which OCR engine to use.  The schema for thi JSON is documented in the `/ocr` endpoint.  The `img_url` parameter of the JSON will be ignored in this case.

The image attachment should be the _second_ part, and it should work with any image content type (eg, image/png, image/jpg, etc). 

+ Request (multipart/related; boundary=---BOUNDARY)

        -----BOUNDARY
        Content-Type: application/json

        {"engine":"tesseract"}
        -----BOUNDARY
        
        -----BOUNDARY
        Content-Disposition: attachment;
        Content-Type: image/png
        filename="attachment.txt".

        PNGDATA.........
        -----BOUNDARY
        

+ Response 200 (text/plain)

        Decoded OCR Text