-
Notifications
You must be signed in to change notification settings - Fork 6
/
2020-08-27_SCINet-Geospatial-WG_Workshop-Session3-Intro-to-Python-Dask_CHAT.txt
137 lines (137 loc) · 11.1 KB
/
2020-08-27_SCINet-Geospatial-WG_Workshop-Session3-Intro-to-Python-Dask_CHAT.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
00:08:39 Rowan Gaffney: Welcome everyone. Todays Tutorials is at: https://kerriegeil.github.io/SCINET-GEOSPATIAL-RESEARCH-WG/content/3-Session3-intro-to-python-dask.html
00:12:43 Amy Hudson: Green check/ red X can be accessed through ‘Participants’ tab, click a green check if you’re good to go please!
00:14:55 Rowan Gaffney: For issues logging in, vrsc email is : [email protected]
00:16:45 Alisa Coffin: My cell phone died
00:17:15 Zhanyou Xu: i still use the session from last one
00:17:23 Amy Hudson: nice screensaver alisa
00:18:00 Rowan Gaffney: Zhanyou - that is fine, but we are going to spawn a new session with different parameters
00:18:07 Rowan Gaffney: what Kerrie is doing now
00:19:59 Gerard Lazo: when I re-login I get to the previous session. how do I shutdown everything from the previous session?
00:20:44 Rowan Gaffney: Gerad, go to file --> Hub Control Panel --> Stop Server
00:21:13 Rowan Gaffney: and this should eventual bring you back to the spawning page
00:21:22 Yang Yang: What is the Container Exec Args for?
00:21:27 Zhanyou Xu: Does the $ sign vs > matter for the terminal?
00:24:04 Rowan Gaffney: Yang - did that answer your question about the exec args?
00:26:46 Elizabeth Chin: I tried to do the git clone by copying what is in the tutorial, but go the following errors: fatal: write error: Bad file descriptor
fatal: index-pack failed
00:26:48 Yang Yang: Yes, thanks Rowan! If we don’t put the specs for the container, I assume we will run issues later, right?
00:26:58 Alex Arp: I’m gettin an error
00:26:59 Alex Arp: fatal: write error: Bad file descriptor
fatal: index-pack failed
00:27:07 Nicole Kaplan: I got that error too
00:27:25 Zeya Xue: I have that error as well
00:27:36 Scott Drummond: error:
00:27:38 Scott Drummond: Singularity> git clone --single-branch https://github.com/kerriegeil/SCINET-GEOSPATIAL-RESEARCH-WG.gitCloning into 'SCINET-GEOSPATIAL-RESEARCH-WG'...remote: Enumerating objects: 18, done.remote: Counting objects: 100% (18/18), done.remote: Compressing objects: 100% (3/3), done.fatal: write error: Bad file descriptorfatal: index-pack failed
00:28:14 Gerard Lazo: i get the same error as above
00:28:38 Nicole Kaplan: yes
00:28:40 Elizabeth Chin: yes, home directory
00:28:43 Alex Arp: home
00:29:09 Alicia Foxx: home/first.last
00:29:28 Elizabeth Chin: how do you refresh?
00:29:30 Kyle Knipper: mine worked and I can see the folder
00:29:30 Danielle Lemay: Maybe check disk space in your home directory if the write is failing ?
00:29:37 Alex Arp: Not seeing the folder
00:29:53 Alicia Foxx: I also got the error
00:29:54 Jennifer Chang: I had to move to a project folder, then git clone worked...
00:30:44 Elizabeth Chin: same here- git clone in a project dir worked but not my home dir
00:30:50 Sivanandan Chudalayandi: @jennifer chang… did this not work on your home folder
00:31:12 Amy Hudson: to uncheck your green check, just click it again
00:31:23 Andrew Severin: may be space realted
00:32:03 Nicole Kaplan: me too
00:32:46 Rowan Gaffney: https://github.com/kerriegeil/SCINET-GEOSPATIAL-RESEARCH-WG
00:34:06 Gerard Lazo: it does appear space related. i changed to another project partition from home directory and it worked.
00:35:13 Yanghui Kang: Maybe Kerrie can share the file to everyone through zoom, if this works, so people don’t need unzip.
00:36:48 Andrew Severin: cd $TMPDIR
00:36:48 Sivanandan Chudalayandi: cd $TMPDIR
00:37:58 Andrew Severin: echo $TMPDIR will show you where it is
00:40:45 Nicole Kaplan: I also tried to open it from a python 3 notebook and can not open a new blank notebool=k
00:40:50 Kerrie Geil: git clone --single-branch https://github.com/kerriegeil/SCINET-GEOSPATIAL-RESEARCH-WG.git
00:41:25 Rowan Gaffney: Nicole, where you able to get the .ipynb into jupyterlab?
00:42:58 Nicole Kaplan: no. I just uploaded it from my local machine. I want to open a blank notebook and I can't do that. so perhaps I should shutdown my server session and relaunch?
00:44:58 Rowan Gaffney: Nicole, hmm, can up open the py_geo notebook?
00:45:16 Nicole Kaplan: I got it now
00:45:42 Lina Castano-Duque: can you make your font bigger
00:45:45 Rowan Gaffney: Nicole - great!
00:47:37 Nicole Kaplan: so that's the conda environment running in the container.
00:47:57 Rowan Gaffney: yes "py_geo" is a conda environment within the container
00:48:07 Alisa Coffin: What does SLURM stand for?
00:48:16 Elizabeth Chin: I don’t see the py_geo environment but I initiated Jupyter Hub with the container
00:48:43 Alisa Coffin: never mind, I found it: Simple Linux Utility for Resource Management
00:48:49 Nicole Kaplan: I actually wonder if light mode would be easier to read...
00:49:12 Rowan Gaffney: Elizabeth - hmmm.... you should see it... can you see it when you go file ---> new launcher
00:49:20 Gerard Lazo: i thought my session shuld have started in the desired kernel, but it didn't
00:49:59 Rowan Gaffney: Gerard - you should be able to manual change it and then rerun the desired code cells
00:52:09 Zhanyou Xu: I logged out, and relog in after enter the 6 digits
00:52:25 Rowan Gaffney: all - if you fall behind, a static version (html) of this notebook is here: https://kerriegeil.github.io/SCINET-GEOSPATIAL-RESEARCH-WG/html-tutorials/session3-intro-to-python-dask-on-ceres.html
00:52:27 Zhanyou Xu: it go to the Jupyter directly without asking me to enter the conyainer
00:52:47 Rowan Gaffney: Claire - good to know - will have to add this information to the documentation
00:53:22 Rowan Gaffney: Zhanyou, you should log out via: File --> Hub Control Panel --> Stop Server
00:53:33 Zhanyou Xu: trying..
00:53:35 Rowan Gaffney: and then log back it with the container options
00:54:32 Kyle Knipper: yep
00:54:35 Danielle Lemay: yes
00:54:36 Alicia Foxx: yes
00:54:53 Alex Arp: Different error. NameError: name 'daskpath' is not defined
00:55:04 Dave Fleisher: no, maybe i did something wrong
00:55:14 Lina Castano-Duque: same error, I had to delete the can be deleted file
00:55:30 Alex Arp: Same error now
00:58:56 Sivanandan Chudalayandi: I only see python 2 and python 3 as my kernel options
01:03:31 Rowan Gaffney: Zhanyou - use you should copy all the options in the instructional, including the exec agrs
01:04:36 Rowan Gaffney: Siva, not sure why you are not seeing the other kernels
01:04:49 Rowan Gaffney: Zhanyou, yes, Home should work fine
01:09:11 Gerard Lazo: i also do not see the correct kernels. only python3. i did add the path and option lines for the container on startup. maybe a way to check for kernel initially should be included in future sessions.
01:09:18 Danielle Lemay: Dask Dashboard is orange buttons, but clicking on “Workers”, the Dash Workers tab is empty
01:09:47 Rowan Gaffney: Danielle, have you launched scaled the cluster yet?
01:10:31 Danielle Lemay: yes
01:10:37 Rowan Gaffney: hmmm
01:12:09 Danielle Lemay: 3 workers, 120 cores. Just nothing under the Dash Workers tab. How would I be sure of the port?
01:12:45 Yang Yang: If we running the same issue as Kerrie did, shall we quit the whole process and restart?
01:13:17 Rowan Gaffney: oh gotcha - make sure you have the correct address: /user/user.name/proxy/XXXXX/status
01:13:37 Rowan Gaffney: where XXXXX is the dashboard port number
01:14:04 Rowan Gaffney: yes, you may need to restart and replace the "short" to "brief-low"
01:14:19 Rowan Gaffney: Where you have defined the "client" object
01:15:20 Yang Yang: Thanks Rowan, besides the two client object, there are many other options. What are the differences between them and how do we choose?
01:17:00 Rowan Gaffney: Hi Yang, sorry, before I meant the cluster object. Those options are quite similar to the sbatch options
01:18:07 Rowan Gaffney: restart the kernel, and rerun the cells from the beginning
01:19:56 Yang Yang: Got it, thanks Rowan.
01:20:31 Elizabeth Chin: what’s the benefit of using a conda environment inside of a container vs just sharing the conda envrionment as a yml file?
01:21:49 Lina Castano-Duque: I am having a hard time to understand how to translate this new skill into other coding languages like R. How to set this if I want to run models in R using interactive mode
01:23:05 Sivanandan Chudalayandi: Is JupyterLab deployed when we start Jupyterhub container?
01:24:30 Rowan Gaffney: Siva - yes jupyterlab is deployed by default from Jupyterhub
01:24:38 Rowan Gaffney: with a container
01:25:08 Sivanandan Chudalayandi: https://docs.anaconda.com/anaconda/navigator/tutorials/r-lang/#:~:text=Open%20the%20environment%20with%20the,select%20New%2C%20then%20select%20R.&text=To%20run%20the%20code%2C%20in,the%20keyboard%20shortcut%20Ctrl%2DEnter.
01:25:32 Zeya Xue: https://github.com/xiaodaigh/disk.frame
01:25:43 Zeya Xue: Disk.frame is an R’s Dask
01:26:57 Gerard Lazo: there also appears to be a package for R called parallel
01:27:07 Amy Hudson: From our Research Highlights in SCINet Newsletter July, John Humphreys uses R and writes: ‘Conveniently,
the R package used to run the model (r-INLA) included native multithreading (OpenMP) to handle parallel
processing and allowed HPC cores to run concurrently.’
01:30:24 Sivanandan Chudalayandi: My problem was with the kernel
01:32:25 Rowan Gaffney: Siva, not sure why you can't find the kernel within the container - really strange!
01:33:44 Sivanandan Chudalayandi: btw… reg multicore computing in R, another option is https://github.com/tidyverse/multidplyr
01:33:58 Rowan Gaffney: thanks Siva!
01:35:32 Yanghui Kang: I think there are some good examples on loading data in through dask in tomorrow’s Session 5
01:35:47 Amy Hudson: I have to go, thank you!
01:39:41 Alisa Coffin: thank you very much Kerrie and Rowan and everyone else. This was really great!
01:39:48 Alisa Coffin: got to go now
01:41:05 Nicole Kaplan: I have to take off, but this has been great. Thank you and see you tomorrow.
01:41:58 Rowan Gaffney: Thanks for join Nicole and Alisa!
01:43:00 Lina Castano-Duque: Dave's idea of an executable program (.py or .r) example will be great to see with parallel computing
01:44:44 Elizabeth Chin: I was unable to use the py_geo environment with the notebook today. Will you be using containers and conda environments for tomorrow’s tutorials? If so, would you be able to share the container beforehand and also the conda env as a yml file? This way we can make sure it’s working beforehand.
01:47:04 Rowan Gaffney: Hi Elizabeth, here is the link to the environment: https://github.com/rmg55/container_stacks/blob/master/data_science_im_rs/py_geo.yml
01:47:09 Elizabeth Chin: thank you!
01:47:23 Dave Fleisher: thanks guys, have to leave, really helpful.
01:47:50 Kerrie Geil: https://kerriegeil.github.io/SCINET-GEOSPATIAL-RESEARCH-WG/content/2-Session2-intro-to-ceres.html
01:48:39 Rowan Gaffney: Elizabeth - feel free to send me an email if you try the container again and still have issues!
01:48:49 srinivasa pinnamaneni: Can you be able to share the all links in one email?
01:49:11 Kerrie Geil: https://kerriegeil.github.io/SCINET-GEOSPATIAL-RESEARCH-WG/
01:49:58 Rowan Gaffney: All the tutorials on the top of the webpage
01:50:36 Max Feldman: Thanks great job!
01:50:48 Sivanandan Chudalayandi: Thanks all!!
01:51:08 Yanghui Kang: Thank you!!