NVlabs/ocrobin

Name: ocrobin

Owner: NVIDIA Research Projects

Description: null

Created: 2018-04-11 19:28:40.0

Updated: 2018-05-08 06:11:07.0

Pushed: 2018-04-22 04:30:10.0

Homepage: null

Size: 25107

Language: Jupyter Notebook

README

ocrobin

Automatic binarization using deep learning.

This implements a pixel-for-pixel grayscale-to-binary transformation. The models typically used with it perform some denoising and deblurring, but they are small enough that they do not encode any significant shape priors. The 2D LSTMs in the binarization model allow it to capture some global noise and intensity properties.

Inference

%pylab inline
rc("image", cmap="gray", interpolation="bicubic")
Populating the interactive namespace from numpy and matplotlib
import ocrobin
bm = ocrobin.Binarizer("bin-000000046-005393.pt")
bm.model
Sequential(
  (0): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True)
  (2): ReLU()
  (3): LSTM2(
    (hlstm): RowwiseLSTM(
      (lstm): LSTM(8, 4, bidirectional=1)
    )
    (vlstm): RowwiseLSTM(
      (lstm): LSTM(8, 4, bidirectional=1)
    )
  )
  (4): Conv2d(8, 1, kernel_size=(1, 1), stride=(1, 1))
  (5): Sigmoid()
)
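To put a number on "small enough", the learnable parameters of the printed model can be counted by hand. The sketch below (Python 3) assumes PyTorch's standard parameterization for Conv2d, BatchNorm2d and LSTM, and assumes that each RowwiseLSTM holds nothing beyond the LSTM shown in the printout. The helper names are made up for this illustration and are not part of ocrobin.

```python
# Count learnable parameters for the layers in the printed model.
# Assumption: standard PyTorch layer parameterization; helper names are hypothetical.

def conv2d_params(in_ch, out_ch, kh, kw):
    # One kh x kw kernel per (input, output) channel pair, plus one bias per output channel.
    return out_ch * in_ch * kh * kw + out_ch

def batchnorm2d_params(channels):
    # affine=True: one learnable scale and one shift per channel.
    # Running mean and variance are buffers, not parameters.
    return 2 * channels

def lstm_params(input_size, hidden_size, bidirectional=True):
    # Four gates, each with input weights, recurrent weights and two bias vectors.
    per_direction = 4 * hidden_size * (input_size + hidden_size) + 2 * 4 * hidden_size
    return per_direction * (2 if bidirectional else 1)

layers = {
    "(0) Conv2d(1, 8, 3x3)": conv2d_params(1, 8, 3, 3),
    "(1) BatchNorm2d(8)": batchnorm2d_params(8),
    "(3) LSTM2.hlstm": lstm_params(8, 4),
    "(3) LSTM2.vlstm": lstm_params(8, 4),
    "(4) Conv2d(8, 1, 1x1)": conv2d_params(8, 1, 1, 1),
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name:24s} {count:5d}")
print(f"{'total':24s} {total:5d}")
```

Under these assumptions the total is 1001 parameters, most of them in the two bidirectional LSTMs, which is consistent with the claim that the model is too small to memorize character shapes.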
figsize(10, 10)
image = mean(imread("testdata/sample.png")[:, :, :3], 2)
binary = bm.binarize(image)
subplot(121); imshow(image)
subplot(122); imshow(binary)
<matplotlib.image.AxesImage at 0x7fdca6651790>

[Figure: grayscale input image (left) and binarized output (right)]

subplot(121); imshow(image[400:600, 400:600])
subplot(122); imshow(1-binary[400:600, 400:600])
<matplotlib.image.AxesImage at 0x7fdca6576910>

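[Figure: the crop image[400:600, 400:600] (left) and the inverted binarization of the same crop (right)]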

Training

Training data for ocrobin-train is stored in tar files. Each sample consists of a binary image (.bin.png) and the corresponding grayscale image (.gray.png) that share a base name.

%%bash
tar -ztvf testdata/bindata.tgz | sed 5q
drwxrwxr-x tmb/tmb           0 2018-04-17 10:27 ./
-rw-rw-r-- tmb/tmb      391766 2018-04-10 09:35 ./A001BIN.bin.png
-rw-rw-r-- tmb/tmb     6021129 2018-04-10 09:35 ./A001BIN.gray.png
-rw-rw-r-- tmb/tmb      226629 2018-04-10 09:36 ./A002BIN.bin.png
-rw-rw-r-- tmb/tmb     2685607 2018-04-10 09:36 ./A002BIN.gray.png


tar: write error

The training data is actually artificially generated; document image degradation for this kind of training works quite well at simulating real data.
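The README does not say which degradation model was used to generate the data. As a rough, dependency-free illustration of the general idea only, the sketch below turns a clean binary image into a synthetic grayscale one by mapping ink and paper to imperfect gray levels, applying a box blur, and adding Gaussian noise. The function name, the parameter values, and the convention that 1 means ink are all assumptions made for this example; it is not the procedure used to build bindata.tgz.

```python
import random

def degrade(binary, blur_radius=1, noise_sigma=0.05, ink=0.15, paper=0.85, seed=0):
    """Turn a clean binary image (rows of 0/1, 1 = ink) into a synthetic grayscale one.

    Hypothetical illustration of document image degradation, not ocrobin code.
    """
    rng = random.Random(seed)
    height, width = len(binary), len(binary[0])
    # Map ink and paper to imperfect gray levels instead of pure black and white.
    gray = [[ink if pixel else paper for pixel in row] for row in binary]
    degraded = []
    for y in range(height):
        row = []
        for x in range(width):
            # Box blur: average over the neighborhood, clipped at the image borders.
            window = [
                gray[j][i]
                for j in range(max(0, y - blur_radius), min(height, y + blur_radius + 1))
                for i in range(max(0, x - blur_radius), min(width, x + blur_radius + 1))
            ]
            value = sum(window) / len(window) + rng.gauss(0.0, noise_sigma)
            # Clamp to the valid intensity range [0, 1].
            row.append(min(1.0, max(0.0, value)))
        degraded.append(row)
    return degraded

# A small solid rectangle of ink on a paper background.
clean = [[1 if 2 <= x <= 4 and 1 <= y <= 5 else 0 for x in range(8)] for y in range(7)]
noisy = degrade(clean)
```

A pair like (noisy, clean) plays the role of one (gray.png, bin.png) training sample: the network sees the degraded image and is trained to recover the clean one.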

from dlinputs import tarrecords
sample = tarrecords.tariterator(open("testdata/bindata.tgz")).next()
sample["__key__"]
subplot(121); imshow(sample["gray.png"])
subplot(122); imshow(sample["bin.png"])
<matplotlib.image.AxesImage at 0x7fdc386ec390>
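The sample above is a dictionary with a __key__ entry plus one entry per file extension, which suggests that files sharing a base name are grouped into one record. The following standard-library sketch (Python 3) shows that grouping. It is an assumption about the layout, not the dlinputs implementation: iter_samples and make_demo_archive are hypothetical names, the values are raw bytes rather than decoded images, and files belonging to one sample are assumed to be adjacent in the archive.

```python
import io
import posixpath
import tarfile

def iter_samples(fileobj):
    """Yield one dict per sample, grouping adjacent tar members that share a base name."""
    current = None
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directory entries such as "./"
            # "./A001BIN.bin.png" -> key "A001BIN", suffix "bin.png"
            key, _, suffix = posixpath.basename(member.name).partition(".")
            if current is not None and current["__key__"] != key:
                yield current
                current = None
            if current is None:
                current = {"__key__": key}
            current[suffix] = archive.extractfile(member).read()
    if current is not None:
        yield current

def make_demo_archive():
    """Build a tiny in-memory .tgz with placeholder payloads (not real PNG data)."""
    buffer = io.BytesIO()
    names = ["./A001BIN.bin.png", "./A001BIN.gray.png",
             "./A002BIN.bin.png", "./A002BIN.gray.png"]
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in names:
            payload = name.encode()
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer

samples = list(iter_samples(make_demo_archive()))
for sample in samples:
    print(sample["__key__"], sorted(k for k in sample if k != "__key__"))
```

To try it on the real data, a file opened in binary mode, for example open("testdata/bindata.tgz", "rb"), can be passed in place of the demo archive.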

You can use the ocrobin-train binary to carry out the training.

%%bash
ocrobin-train -d testdata/bindata.tgz -o temp

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.