
Cannot specify no cropping #5

Open · gwern opened this issue Dec 20, 2015 · 0 comments

gwern (Contributor) commented Dec 20, 2015

(This is related to, but not identical to, issue #2.)

Cropping is particularly undesirable on very small images like 64x64, where it can delete a large fraction of the image (especially when the images come pre-centered and pre-cropped already). Currently, you cannot run dcgan.torch with no cropping at all, even though configurable arguments like loadSize=64 fineSize=64 suggest that it should be possible. This is not by design but due to a bug in the cropping code in data/donkey_folder.lua, it seems; as noted in issue #2:

Right now, loadSize has to be greater than fineSize (because of a bug in the cropping logic). So it's okay to have loadSize=65 fineSize=64 th main.lua
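To see why loadSize == fineSize fails: torch.uniform(1e-2, iW - oW) gets called with an upper bound of 0, below its lower bound of 1e-2, so the draw still lands in (0, 0.01] and math.ceil rounds the offset up to 1, pushing the crop window one pixel past the image edge. A Python sketch of the failing arithmetic (Python's random.uniform behaves like torch.uniform for this purpose):

```python
import math
import random

iW, oW = 64, 64  # loadSize == fineSize: no room to crop
# Upper bound iW - oW is 0, below the lower bound 1e-2; the draw still
# lands in (0, 0.01], so math.ceil rounds the offset up to 1.
w1 = math.ceil(random.uniform(1e-2, iW - oW))
print(w1)            # 1
print(w1 + oW > iW)  # True: the crop window extends past the image edge
```

So instead of a no-op, a zero-margin crop always produces an out-of-bounds crop request.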


I experimented with the responsible trainHook, and I think the bug can be fixed simply by checking whether the original height/width are greater than the fineSize value; if they are not, feed 0 into the crop function as the offset. The new version would look like this:

-- do random crop only if fineSize/sampleSize is configured to be smaller
-- than the loaded image dimensions (loadSize)
local iW = input:size(3)
local iH = input:size(2)
local oW = sampleSize[2]
local oH = sampleSize[2]
local w1, h1
if iW > oW then
   w1 = math.ceil(torch.uniform(1e-2, iW - oW))
else
   w1 = 0
end
if iH > oH then
   h1 = math.ceil(torch.uniform(1e-2, iH - oH))
else
   h1 = 0
end
local out = image.crop(input, w1, h1, w1 + oW, h1 + oH)
assert(out:size(2) == oW)
assert(out:size(3) == oH)

Or to diff it:

diff --git a/data/donkey_folder.lua b/data/donkey_folder.lua
index 3a82393..5248f4e 100644
--- a/data/donkey_folder.lua
+++ b/data/donkey_folder.lua
@@ -52,17 +52,26 @@ local mean,std
 local trainHook = function(self, path)
    collectgarbage()
    local input = loadImage(path)
+
+   -- do random crop if fineSize/sampleSize is configured to be smaller than NN's input dimensions, loadSize
    local iW = input:size(3)
    local iH = input:size(2)
-
-   -- do random crop
-   local oW = sampleSize[2];
+   local oW = sampleSize[2]
    local oH = sampleSize[2]
-   local h1 = math.ceil(torch.uniform(1e-2, iH-oH))
-   local w1 = math.ceil(torch.uniform(1e-2, iW-oW))
+   if (iW > oW) then
+    w1 = math.ceil(torch.uniform(1e-2, iW-oW))
+   else
+    w1 = 0
+   end
+   if (iH > oH) then
+    h1 = math.ceil(torch.uniform(1e-2, iH-oH))
+   else
+    h1 = 0
+   end
    local out = image.crop(input, w1, h1, w1 + oW, h1 + oH)
    assert(out:size(2) == oW)
    assert(out:size(3) == oH)
+
    -- do hflip with probability 0.5
    if torch.uniform() > 0.5 then out = image.hflip(out); end
    out:mul(2):add(-1) -- make it [0, 1] -> [-1, 1]

This seems to work in both the 64x64px default version and the 128x128px fork, e.g.:

$ nThreads=1 DATA_ROOT=myimages dataset=folder batchSize=2 loadSize=128 fineSize=128 nz=75 ngf=106 ndf=48 gpu=0 th main-128.lua
{
  ntrain : inf
  beta1 : 0.5
  name : "experiment1"
  niter : 25
  batchSize : 2
  ndf : 48
  fineSize : 128
  nz : 75
  loadSize : 128
  gpu : 0
  ngf : 106
  dataset : "folder"
  lr : 0.0002
  noise : "normal"
  nThreads : 1
  display_id : 10
  display : 1
}
Random Seed: 5143   
Starting donkey with id: 1 seed: 5144
table: 0x406af6b8
Loading train metadata from cache
Dataset: folder  Size:  442215  
Epoch: [1][       0 /   221107]  Time: 8.181  DataTime: 0.003    Err_G: 1.1998  Err_D: 1.1637   
Epoch: [1][       1 /   221107]  Time: 5.115  DataTime: 0.001    Err_G: 0.3660  Err_D: 1.5839   
Epoch: [1][       2 /   221107]  Time: 5.965  DataTime: 0.001    Err_G: 2.8597  Err_D: 1.7219   
Epoch: [1][       3 /   221107]  Time: 6.163  DataTime: 0.001    Err_G: 0.1956  Err_D: 2.2080   
Epoch: [1][       4 /   221107]  Time: 5.537  DataTime: 0.001    Err_G: 0.7360  Err_D: 1.9527   
Epoch: [1][       5 /   221107]  Time: 6.300  DataTime: 0.001    Err_G: 6.8542  Err_D: 3.6255   
...

And looking at the training sample images shown in the display server, they no longer look cropped as they did before. So although I haven't run anything to completion, I believe this fix works.
