Post 3 – The Fast and the Furious: Tokyo Drift

It is trivial to just throw an image into an OCR library, but a raw photo is rarely ready for that, so there is additional work that we need to do first.

Most notably, that means squaring the receipt, cropping it, and making the text itself more visible through posterising and sharpening.
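To make the posterising and sharpening step concrete, here is a minimal sketch in plain Python. It assumes the image is already a grayscale grid (a list of rows of 0-255 integers); the function names, the `levels` parameter and the 3x3 kernel are illustrative choices of mine, not settled parts of the project, and a real pipeline would most likely lean on an imaging library instead.

```python
# Hypothetical helpers: names, defaults and kernel are assumptions for illustration.

# A common 3x3 sharpening kernel (its weights sum to 1, so flat areas stay unchanged).
SHARPEN_KERNEL = [
    [0, -1, 0],
    [-1, 5, -1],
    [0, -1, 0],
]


def posterise(image, levels=4):
    """Quantise 0-255 grayscale values down to `levels` evenly spaced tones."""
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = 255 / (levels - 1)
    return [[int(round(round(px / step) * step)) for px in row] for row in image]


def sharpen(image):
    """Convolve with SHARPEN_KERNEL; border pixels are copied through unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    total += SHARPEN_KERNEL[ky][kx] * image[y + ky - 1][x + kx - 1]
            # Clamp back into the valid grayscale range.
            out[y][x] = max(0, min(255, total))
    return out
```

Posterising with `levels=2` is effectively a black-and-white threshold, which may turn out to be all the OCR step needs; higher values keep some of the shading.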

We plan to build a system for identifying and squaring the receipt. To do this, I wrote a set of rules or ‘maxims’ for how a receipt may appear, so we can factor these in when doing the identification / pre-processing:

  1. A receipt has at most 4 corners
  2. In the photo, anywhere from 0 to 4 of the receipt's corners may be visible
  3. Some, but not all, corners will be exactly 90 degrees
  4. The receipt may not be entirely flat

There will probably be more rules that appear over time, but this helps us identify and process receipts that:

  1. Aren’t fully in frame
  2. May be slightly curled and not on a flat surface
  3. Might have torn corners

While straightening a receipt is always good, it becomes incredibly complicated once we apply the rules above. The new approach will therefore be to make multiple attempts at identifying the corners and only straighten if four are found; otherwise algorithmic errors could warp the receipt incorrectly.
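The "try several times, only straighten on four corners" decision could look roughly like the sketch below. It assumes each detection attempt hands back a list of `(x, y)` points; the function names are hypothetical, and the convexity check is an extra guard I have added here rather than something the plan above requires.

```python
import math


def order_corners(points):
    """Sort points by angle around their centroid so they trace the outline."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))


def is_convex_quad(points):
    """True if four ordered points turn the same way at every corner."""
    turns = []
    for i in range(4):
        ax, ay = points[i]
        bx, by = points[(i + 1) % 4]
        cx, cy = points[(i + 2) % 4]
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross == 0:
            return False  # three points in a line: not a usable quadrilateral
        turns.append(cross > 0)
    return all(turns) or not any(turns)


def corners_to_straighten(attempts):
    """Return the first usable set of four corners, or None to skip straightening."""
    for corners in attempts:
        if len(corners) != 4:
            continue  # receipt partly out of frame, torn, or detection failed
        ordered = order_corners(corners)
        if is_convex_quad(ordered):
            return ordered
    return None


# Example: the first attempt only finds two corners, the second finds all four.
attempts = [
    [(0, 0), (10, 0)],
    [(0, 0), (100, 5), (95, 200), (3, 190)],
]
result = corners_to_straighten(attempts)
```

If `result` is `None`, we would leave the image unwarped and carry on with the remaining pre-processing instead of risking a bad transform.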

Once we pre-process the receipt, we will then do a simple text-identification pass to work out which regions of the image we need to run Tesseract on.
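One simple way to do that text-identification pass is a horizontal projection: count the "ink" pixels in each row and group consecutive inked rows into bands. The sketch below assumes the image has already been binarised into rows of 0 (background) and 1 (ink); `text_row_bands` and `min_ink` are names made up for this example, and this is only one candidate technique, not necessarily the one we will end up using.

```python
def text_row_bands(binary, min_ink=1):
    """Return (top, bottom) row ranges that contain text; bottom is exclusive."""
    bands = []
    start = None
    for y, row in enumerate(binary):
        has_text = sum(row) >= min_ink
        if has_text and start is None:
            start = y  # a new band of text begins on this row
        elif not has_text and start is not None:
            bands.append((start, y))  # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(binary)))  # text runs to the bottom edge
    return bands


# Tiny example: two lines of "text" separated by one blank row.
binary = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
]
bands = text_row_bands(binary)
```

Each band could then be cropped out and passed to Tesseract on its own, with `min_ink` raised if stray specks start producing false bands.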
