I'm trying to fit some models with custom generators, but the fit/predict/evaluate_generator and generator_next functions don't seem to work for me.
I'm using:
tensorflow: 2.2.0-rc2 (GPU)
tensorflow R pkg: 2.2.0 (CRAN)
keras R pkg: 2.3.0.0 (CRAN)
Sample code:
library(keras)
library(tidyverse)
input1 <- layer_input(shape = 1)
input2 <- layer_input(shape = 1)
out <- layer_add(list(input1, input2))
model <- keras_model(list(input1, input2), out)
generator <- function() {
list(list(1, 2), 3)
}
model %>% compile(loss = "mse", optimizer = "sgd")
2020-06-25 13:16:44.945200: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-06-25 13:16:44.975314: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:44.975767: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.645GHz coreCount: 16 deviceMemorySize: 7.92GiB deviceMemoryBandwidth: 238.66GiB/s
2020-06-25 13:16:44.976043: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-25 13:16:44.977732: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-06-25 13:16:44.979357: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-25 13:16:44.979802: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-25 13:16:44.981302: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-25 13:16:44.982329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-06-25 13:16:44.985423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-06-25 13:16:44.985694: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:44.986424: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:44.986821: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-06-25 13:16:44.987081: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-06-25 13:16:45.018159: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2799925000 Hz
2020-06-25 13:16:45.018581: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fdfa0000b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-25 13:16:45.018598: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-06-25 13:16:45.174748: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:45.175242: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5561a8761d60 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-06-25 13:16:45.175262: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 1070, Compute Capability 6.1
2020-06-25 13:16:45.175526: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:45.175940: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.645GHz coreCount: 16 deviceMemorySize: 7.92GiB deviceMemoryBandwidth: 238.66GiB/s
2020-06-25 13:16:45.176037: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-25 13:16:45.176073: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-06-25 13:16:45.176090: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-06-25 13:16:45.176107: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-06-25 13:16:45.176137: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-06-25 13:16:45.176156: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-06-25 13:16:45.176200: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-06-25 13:16:45.176287: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:45.176720: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:45.177085: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-06-25 13:16:45.177143: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-06-25 13:16:45.177807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-06-25 13:16:45.177822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-06-25 13:16:45.177846: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-06-25 13:16:45.177944: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:45.178342: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-06-25 13:16:45.180569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7029 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
generator_next(generator)
Error in py_has_attr_impl(x, name) :
Cannot convert object to an environment: [type=closure; target=ENVSXP].
model %>% predict(list(1, 2)) # Works
[,1]
[1,] 3
model %>% predict_generator(generator, steps = 1) # Freezes
model %>% fit_generator(generator, steps_per_epoch = 10, validation_data = list(list(1, 2), 3)) # Freezes
1/10 [==>...........................] - ETA: 0s - loss: 0.0000e+00 # Frozen on this step, RStudio non responsive
When I use custom generators directly in Python, everything works fine.
With more advanced generators, and a model with custom losses and metrics, I'm getting:
Error in py_call_impl(callable, dots$args, dots$keywords) :
RuntimeError: in user code:
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:878 test_function *
outputs = self.distribute_strategy.run(
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:951 run **
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
return fn(*args, **kwargs)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:849 test_step **
y, y_pred, sample_weight, regularization_losses=self.losses)
/home/maju116/anaconda3/lib/python3.7/site-packag
26. | stop(structure(list(message = "RuntimeError: in user code:
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:878 test_function *
outputs = self.distribute_strategy.run(
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:951 run **
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
return fn(*args, **kwargs)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:849 test_step **
y, y_pred, sample_weight, regularization_losses=self.losses)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/engine/compile_utils.py:204 __call__
loss_value = loss_obj(y_t, y_p, sample_weight=sw)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/losses.py:143 __call__
losses = self.call(y_true, y_pred)
/home/maju116/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/losses.py:246 call
return self.fn(y_true, y_pred, **self._fn_kwargs)
<string>:4 fn
/home/maju116/R/x86_64-pc-linux-gnu-library/4.0/reticulate/python/rpytools/call.py:21 python_function
raise RuntimeError(res[kErrorKey])
RuntimeError: Evaluation error: ValueError: None values not supported..
", call = py_call_impl(callable, dots$args, dots$keywords), cppstack = structure(list(file = "", line = -1L, stack = c("/home/maju116/R/x86_64-pc-linux-gnu-library/4.0/reticulate/libs/reticulate.so(Rcpp::exception::exception(char const*, bool)+0x7b) [0x7f1e580a78bb]", "/home/maju116/R/x86_64-pc-linux-gnu-library/4.0/reticulate/libs/reticulate.so(Rcpp::stop(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x2b) [0x7f1e580a792b]", ...
25. | wrapper at func_graph.py#968
24. wrapped_fn at def_function.py#441
23. func_graph_from_py_func at func_graph.py#981
22. _create_graph_function at function.py#2667
21. _define_function_with_shape_relaxation at function.py#2706
20. _maybe_define_function at function.py#2774
19. __call__ at function.py#2419
18. _call at def_function.py#611
17. __call__ at def_function.py#580
16. evaluate at training.py#1018
15. _method_wrapper at training.py#66
14. evaluate_generator at training.py#1444
13. new_func at deprecation.py#324
12. (structure(function (...)
{
dots <- py_resolve_dots(list(...))
result <- py_call_impl(callable, dots$args, dots$keywords) ...
11. do.call(func, args)
10. call_generator_function(object$evaluate_generator, args)
9. evaluate_generator(., test_generator, steps = 1)
8. function_list[[k]](value)
7. withVisible(function_list[[k]](value))
6. freduce(value, `_function_list`)
5. `_fseq`(`_lhs`)
4. eval(quote(`_fseq`(`_lhs`)), env, env)
3. eval(quote(`_fseq`(`_lhs`)), env, env)
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
1. test_yolo %>% evaluate_generator(test_generator, steps = 1)
Is there another way of using generators with TensorFlow 2.2, or am I missing something?
@maju116 Thanks! This is a known issue with TF >= 2.1; see #986.
I have spent some time debugging this, and it looks like TensorFlow always evaluates the generators in a different thread, which leads to a deadlock somewhere.
The recommended workaround at the moment is to use keras::train_on_batch() and write your own training loop that reads from the generator.
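A minimal sketch of such a loop, assuming the generator is a plain R function that returns list(inputs, targets) as in the sample code above. The helper name train_from_generator and its train_step argument are hypothetical and not part of keras; train_step defaults to keras::train_on_batch() and exists only so the loop can be tried without a model.

```r
# Hypothetical helper (not part of keras): a manual training loop that calls
# the R generator directly, so batches are produced on the main R thread.
train_from_generator <- function(model, generator, steps,
                                 train_step = keras::train_on_batch) {
  losses <- vector("list", steps)
  for (i in seq_len(steps)) {
    # Each call is assumed to return list(inputs, targets).
    batch <- generator()
    # One gradient update on this batch; the step's return value is stored.
    losses[[i]] <- train_step(model, batch[[1]], batch[[2]])
  }
  losses
}
```

With the model and generator from the sample code, train_from_generator(model, generator, steps = 10) should run ten updates and return the per-step losses. Whether the scalar inputs from the sample need to be reshaped into arrays first is not verified here.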