deployed crawl4AI tool not working - exception=AttributeError('copy' is not supported.') #349
Comments
Hi @makispl, the issue you are seeing is related to how the enterprise platform tries to serialize the browser or page objects from Playwright. These objects cannot be copied or pickled, which usually happens when the scraping logic or Crawl4AI usage crosses process boundaries. To fix it, keep all Crawl4AI usage and the scraping work within the same agent or process: instead of returning browser objects or passing them around, return only the final scraped results as plain data. Adjusting your code this way should prevent the serialization attempts that cause the error. I have not tried this with CrewAI, and to be honest I haven't used CrewAI at all, so I'm not very familiar with it; it's therefore difficult for me to take this further unless someone like yourself can work on it. I can help along the way, and if we fix it we could create a wrapper and make it available for people who want to use Crawl4AI from the CrewAI library.
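To make the idea concrete, here is a minimal sketch of that pattern (not the actual custom_tool.py from this thread; it assumes the crewai-tools BaseTool interface and Crawl4AI's AsyncWebCrawler, and the class and field names are just placeholders):

```python
import asyncio

from crewai_tools import BaseTool          # assumption: crewai-tools style custom tool
from crawl4ai import AsyncWebCrawler       # Crawl4AI's async crawler


class Crawl4AIScraperTool(BaseTool):
    name: str = "Crawl4AI Crawler"
    description: str = "Scrape a URL with Crawl4AI and return the page content as markdown."

    def _run(self, url: str) -> str:
        async def _scrape() -> str:
            # The browser is opened and closed entirely inside this coroutine;
            # the crawler/page objects never leave this scope.
            async with AsyncWebCrawler() as crawler:
                result = await crawler.arun(url=url)
                return result.markdown or ""

        # Only a plain string crosses the tool boundary, so there is nothing
        # for the platform to serialize except plain data.
        # (If an event loop is already running in your environment, you may
        # need a different sync/async bridge than asyncio.run.)
        return asyncio.run(_scrape())
```

The agent then only ever sees the returned markdown string, which copies and pickles cleanly.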
Thanks for the reply, @unclecode! I implemented your suggestions (with some help from windsurf/cascade), keeping everything inside one process and returning only plain data.
However, the issue still persists. Could you please take a look at the script I just emailed you (I cannot attach a .py file here)? Your help is invaluable! Once we fix this, it could pave the way for effective scraping for many CrewAI users; a wrapper would be a piece of cake! 🍰
@makispl I see now. Okay, sure, I will check the email. To speed things up a little, please attach a single file that contains all of the code required to test it and send it to my email. If it isn't already in one file, please create one and put everything in it so that I can open it in VS Code and run it; then I will see what I can do. If there are any requirements that must be installed, please include a requirements.txt file as well. Let's see how it goes.
@unclecode In order to verify whether it works on the CrewAI platform, each time I modify the script I have to deploy it there and then, via its API (through my app), check whether any data is scraped. I don't have access to the platform's log files, except for the following snippet the CrewAI team shared with me:
So, running the tool locally as-is, it works well. That's why I wanted your experienced view on it, so you can check the things that might be responsible for this behaviour (serialization, usage that crosses process boundaries, etc.). That said, if you want to check the custom_tool.py I emailed you locally, you can either:
Please let me know if I can provide any additional information.
@makispl I will try to run this locally. If I can't, I will let you know. If I can, I will start fresh, open a CrewAI project, and use Crawl4AI as a custom tool. Perhaps that will help me get a better handle on the issue.
Exactly! But please keep in mind that the custom tool will most probably run perfectly with CrewAI locally, and you presumably won't see any issues. Those issues only appear when the crew is deployed on the CrewAI platform.
@makispl Such a challenge haha, ok we will see |
Hi everyone,
I've made a custom scraping tool which incorporates Crawl4AI and use it in a crew of agents (CrewAI). The crew runs perfectly locally and custom_tool.py scrapes efficiently. However, when the crew is deployed on the CrewAI+ enterprise platform, the tool does not work properly and returns no scraped data at all.
The respective agent's output is:
"Unfortunately, I encountered persistent issues when attempting to use the Crawl4AI Crawler tool on all provided competitor URLs. As a result, I was unable to scrape the pricing plan data…" which means that the 'custom_tool.py' does not work as it is supposed to.
From the enterprise logs, all I can get is:
After trying to resolve it with the help of windsurf/cursor, their suggestions boil down to this:
The error you're encountering on the CrewAI+ enterprise platform appears to be related to a serialization issue with the Playwright browser instance. The AttributeError('copy' is not supported.') suggests there's a problem with copying or serializing the browser state, which is likely happening because the enterprise platform handles processes differently from your local environment.
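For what it's worth, here is a minimal, self-contained illustration of that suspected failure mode (FakePage is a hypothetical stand-in, not Playwright's API): if the hosted platform deep-copies or pickles whatever a tool returns, a live browser/page object will fail, whereas a plain string or dict copies without issue.

```python
# Illustrative only: why handing a live browser/page object back to the
# framework can break, while returning plain data is safe.
import copy


class FakePage:
    """Stand-in for a Playwright page: it represents a live connection,
    so it deliberately refuses to be copied, mimicking the platform error."""
    def __deepcopy__(self, memo):
        raise AttributeError("'copy' is not supported.")


def scrape_returning_object(url: str) -> FakePage:
    return FakePage()  # the live object escapes the tool boundary


def scrape_returning_data(url: str) -> str:
    # Use the live object internally, but hand back only serializable data.
    return f"scraped markdown for {url}"


copy.deepcopy(scrape_returning_data("https://example.com"))      # fine: just a str
# copy.deepcopy(scrape_returning_object("https://example.com"))  # raises AttributeError
```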
Could someone more experienced with similar issues help?
As always, @unclecode I'd appreciate your help on that.