Unable to locate output files from a scheduled job run #349

AravindAmazon · 2023-03-02T19:51:17Z

Hello team,

I have a scheduled job that automatically runs a basic Python script, which writes a CSV file as output. The job runs successfully. However, I am not able to locate the directory where the output file is stored.
The line used for writing is temp.to_csv('trial_output.csv'), where temp is the DataFrame variable.
When I run the same script in the regular Jupyter environment (outside JupyterLab), the CSV file is written successfully to the local Jupyter environment folder. The issue appears to happen only in the JupyterLab environment when using a scheduled job. I would appreciate it if someone could help (I use Jupyter notebooks via the AWS SageMaker interface).

Full script:
import pandas as pd
temp = pd.read_csv("s3:///")
temp.to_csv('trial_output.csv')

Overall purpose:
I need to auto-run case predictions on a daily basis (with a volume of at least 10,000 predictions per day) and share a daily CSV with business users, without any manual intervention.

Thanks,
Aravind

welcome · 2023-03-02T19:51:19Z

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.

You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

rubenvarela · 2023-03-03T14:35:50Z

When I try recreating this, the file gets saved to the root folder of jupyter lab which maps to the location from where I ran the jupyter-lab command.
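A quick way to check where a relative path such as 'trial_output.csv' ends up is to resolve it against the current working directory from inside the script. This is only a minimal sketch using the standard library; the helper name resolve_output_path is made up for illustration, and the working directory of a scheduled job may differ from that of an interactive session.

```python
from pathlib import Path


def resolve_output_path(filename: str) -> Path:
    """Return the absolute path that a relative filename would be written to.

    Hypothetical helper for illustration; it is not part of jupyter-scheduler.
    """
    return (Path.cwd() / filename).resolve()


if __name__ == "__main__":
    target = resolve_output_path("trial_output.csv")
    # Printing both values in the job's notebook shows where the file lands.
    print(f"Working directory: {Path.cwd()}")
    print(f"Relative path resolves to: {target}")
```

Passing the resolved path (or any explicit absolute path) to temp.to_csv(...) removes the dependence on whichever directory the job happens to run in.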

JasonWeill · 2023-06-23T23:49:06Z

ArchivingExecutionManager archives the output files to a .tar.gz file, but it doesn't include files created as a side effect of running the notebook, as described in this issue.

This issue might be fixed by either modifying ArchivingExecutionManager or creating an alternate execution manager that gathers all output formats and all supporting files in and under the working directory, and saves them into an archive of some kind (.zip or .tar.gz).
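The idea of gathering everything in and under the working directory could be sketched roughly as follows. This is not the ArchivingExecutionManager code or the eventual fix; archive_working_dir is a hypothetical standalone function that only illustrates the approach with the standard library tarfile module.

```python
import tarfile
from pathlib import Path


def archive_working_dir(work_dir: str, archive_path: str) -> list[str]:
    """Pack every file in and under work_dir into a .tar.gz archive.

    Hypothetical illustration, not jupyter-scheduler code.
    Returns the archived paths relative to work_dir, in sorted order.
    """
    work = Path(work_dir).resolve()
    archive = Path(archive_path).resolve()
    added = []
    with tarfile.open(archive, "w:gz") as tar:
        for path in sorted(work.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in work_dir.
            if path.is_file() and path.resolve() != archive:
                rel = path.relative_to(work)
                tar.add(path, arcname=rel.as_posix())
                added.append(rel.as_posix())
    return added
```

With this approach, a side-effect file such as trial_output.csv is captured alongside the notebook outputs because the archive is built from the directory contents rather than from a fixed list of output formats.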

JasonWeill · 2023-08-14T16:35:11Z

Closing because #388 is merged.

JasonWeill · 2023-08-14T16:36:02Z

To use the archiving scheduler, follow these instructions in the docs: https://jupyter-scheduler.readthedocs.io/en/latest/operators/index.html#example-capturing-side-effect-files

JasonWeill changed the title ~~Unable to locate output files from a JupYter job scheduler run~~ Unable to locate output files from a scheduled job run Mar 2, 2023

JasonWeill added the bug label Mar 2, 2023

JasonWeill added this to the 1.4.0 Release milestone Jun 6, 2023

JasonWeill self-assigned this Jun 22, 2023

JasonWeill modified the milestones: 1.4.0 Release, 1.5.0 Release Jun 26, 2023

JasonWeill mentioned this issue Jun 28, 2023

Archiving all-files scheduler #388

Merged

JasonWeill modified the milestones: 2.0.0 Release, 2.1.0 Jul 31, 2023

JasonWeill closed this as completed Aug 14, 2023

JasonWeill mentioned this issue Aug 14, 2023

Support for multiple output files #408

Closed
