Pyarrow module for Wazuh agent - WARNING #7566
Update
The error mentioned in the issue was verified in the agent. The current steps in docu-v4.9.0-alpha3 were followed, the same configuration used in the E2E was added, and the necessary dependencies were installed:
root@474fb88aa286:/# pip freeze
boto3==1.17.85
botocore==1.20.112
cryptography==3.4.8
jmespath==0.10.0
numpy==2.0.1
pyarrow==14.0.1
python-dateutil==2.9.0.post0
s3transfer==0.4.2
six==1.16.0
urllib3==1.26.19
After restarting the agent, we observed the error:
2024/07/29 10:38:47 wazuh-modulesd:aws-s3: INFO: Executing Service Analysis: (Service: cloudwatchlogs, Profile: default)
2024/07/29 10:38:47 wazuh-modulesd:aws-s3: WARNING: Service: cloudwatchlogs - Returned exit code 10
2024/07/29 10:38:47 wazuh-modulesd:aws-s3: WARNING: Service: cloudwatchlogs - pyarrow module is required
2024/07/29 10:38:47 wazuh-modulesd:aws-s3: INFO: Fetching logs finished.
And from the interpreter, this was the output:
root@474fb88aa286:/# python3
Python 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
AttributeError: _ARRAY_API not found
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/pyarrow/__init__.py", line 65, in <module>
    import pyarrow.lib as _lib
  File "pyarrow/lib.pyx", line 36, in init pyarrow.lib
ImportError: numpy.core.multiarray failed to import
>>> print(pyarrow.__version__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'pyarrow' is not defined
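The agent log reports "pyarrow module is required" even though pip freeze lists pyarrow, and the interpreter session above shows the import itself failing because of the NumPy 2.0.1 mismatch. A minimal sketch of how the two cases (not installed versus installed but failing to import) could be told apart is shown below. The function name `import_status` is hypothetical and this is not the check the Wazuh module actually performs.

```python
import importlib
import importlib.util


def import_status(module_name: str) -> str:
    """Classify a top-level module as 'ok', 'missing', or 'broken: <reason>'."""
    # find_spec returns None when no module with this name can be located.
    if importlib.util.find_spec(module_name) is None:
        return "missing"
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # installed, but importing it raised an error
        return f"broken: {type(exc).__name__}: {exc}"
    return "ok"
```

Calling `import_status("pyarrow")` in the container above would be expected to report a broken import rather than a missing one, which points at the NumPy version instead of at the pyarrow installation.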
The next step was to remove both dependencies, numpy and pyarrow:
root@474fb88aa286:/# pip3 uninstall numpy pyarrow
Found existing installation: numpy 2.0.1
Uninstalling numpy-2.0.1:
Would remove:
/usr/local/bin/f2py
/usr/local/bin/numpy-config
/usr/local/lib/python3.10/dist-packages/numpy-2.0.1.dist-info/*
/usr/local/lib/python3.10/dist-packages/numpy.libs/libgfortran-040039e1-0352e75f.so.5.0.0
/usr/local/lib/python3.10/dist-packages/numpy.libs/libquadmath-96973f99-934c22de.so.0.0.0
/usr/local/lib/python3.10/dist-packages/numpy.libs/libscipy_openblas64_-99b71e71.so
/usr/local/lib/python3.10/dist-packages/numpy/*
Proceed (Y/n)? y
Successfully uninstalled numpy-2.0.1
Found existing installation: pyarrow 14.0.1
Uninstalling pyarrow-14.0.1:
Would remove:
/usr/local/lib/python3.10/dist-packages/pyarrow-14.0.1.dist-info/*
/usr/local/lib/python3.10/dist-packages/pyarrow/*
Proceed (Y/n)? y
Successfully uninstalled pyarrow-14.0.1
Then, following https://github.com/wazuh/wazuh/blob/master/framework/requirements.txt#L61, both were reinstalled with pinned versions:
root@474fb88aa286:/# pip3 install numpy==1.26.0 pyarrow==14.0.1
Collecting numpy==1.26.0
Downloading numpy-1.26.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (58 kB)
Collecting pyarrow==14.0.1
Using cached pyarrow-14.0.1-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (3.0 kB)
Downloading numpy-1.26.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 37.2 MB/s eta 0:00:00
Using cached pyarrow-14.0.1-cp310-cp310-manylinux_2_28_x86_64.whl (38.0 MB)
Installing collected packages: numpy, pyarrow
Successfully installed numpy-1.26.0 pyarrow-14.0.1
...
root@474fb88aa286:/# pip freeze
boto3==1.17.85
botocore==1.20.112
cryptography==3.4.8
jmespath==0.10.0
numpy==1.26.0
pyarrow==14.0.1
python-dateutil==2.9.0.post0
s3transfer==0.4.2
six==1.16.0
urllib3==1.26.19
And the interpreter was run again:
root@474fb88aa286:/# python3
Python 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>>
Once the agent was restarted, this was the output observed for CloudWatch Logs:
root@474fb88aa286:/# tail -f /var/ossec/logs/ossec.log | grep wazuh-modulesd
2024/07/29 11:40:56 wazuh-modulesd:aws-s3[11380] wm_aws.c:703 at wm_aws_run_service(): DEBUG: Service: cloudwatchlogs - OUTPUT: DEBUG: +++ Debug mode on - Level: 2
2024/07/29 11:41:53 wazuh-modulesd:aws-s3[11380] wm_aws.c:201 at wm_aws_main(): INFO: Fetching logs finished.
2024/07/29 11:41:53 wazuh-modulesd:aws-s3[11380] schedule_scan.c:153 at _get_next_time(): WARNING: Interval overtaken.
2024/07/29 11:41:53 wazuh-modulesd:aws-s3[11380] wm_aws.c:84 at wm_aws_main(): INFO: Starting fetching of logs.
2024/07/29 11:41:53 wazuh-modulesd:aws-s3[11380] wm_aws.c:171 at wm_aws_main(): INFO: Executing Service Analysis: (Service: cloudwatchlogs, Profile: default)
2024/07/29 11:41:53 wazuh-modulesd:aws-s3[11380] wm_aws.c:558 at wm_aws_run_service(): DEBUG: Create argument list
2024/07/29 11:41:53 wazuh-modulesd:aws-s3[11380] wm_aws.c:662 at wm_aws_run_service(): DEBUG: Launching S3 Command: wodles/aws/aws-s3 --service cloudwatchlogs --aws_profile default --regions us-east-1 --aws_log_groups /aws/lambda/ec2-instance-autodeletion --debug 2
...
DEBUG: Getting CloudWatch logs from log stream "2024/07/29/[$LATEST]xxxxxxxx" in log group "/aws/lambda/ec2-instance-autodeletion" using token "f/xxxxxxxxxxxxxx/s", start_time "1722222154858" and end_time "None"
DEBUG: +++ There are no new events in the "/aws/lambda/ec2-instance-autodeletion" group
DEBUG: Saving data for log group "/aws/lambda/ec2-instance-autodeletion" and log stream "2024/07/29/[$LATEST]xxxxxxxxxxxxxxx".
DEBUG: The saved values are "{'token': 'f/xxxxxxxxxxxxxxxxxxxxxxxxxs', 'start_time': 1722211200000, 'end_time': 1722222154858}"
DEBUG: Some data already exists on DB for that key. Updating their values...
Update
The next step will be to follow the same steps from the documentation as before, but pinning specific versions when installing the dependencies:
pip3 install boto3==1.17.85 pyarrow==14.0.1 numpy==1.26.0
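Since the fix depends on exact versions being installed, it may help to confirm them programmatically rather than by reading pip freeze. The sketch below compares installed versions against the three pins from the command above using the standard `importlib.metadata` module. The names `PINS` and `find_mismatches` are hypothetical and not part of Wazuh.

```python
from importlib import metadata
from typing import Callable, Dict

# Pins copied from the pip3 install command in this comment.
PINS = {"boto3": "1.17.85", "pyarrow": "14.0.1", "numpy": "1.26.0"}


def find_mismatches(
    pins: Dict[str, str],
    lookup: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Return {package: found_version} for every pin that is not satisfied."""
    mismatches = {}
    for package, wanted in pins.items():
        try:
            found = lookup(package)
        except metadata.PackageNotFoundError:
            found = "not installed"
        if found != wanted:
            mismatches[package] = found
    return mismatches
```

An empty result from `find_mismatches(PINS)` means all three packages match their pins. The `lookup` parameter exists only so the comparison can be exercised without touching the real environment.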
Update
I followed the entire documentation on docu-v4.9.0-alpha3 again, modifying only the last command, which installs the dependencies:
root@50211a4e9179:/# pip3 install boto3==1.17.85 pyarrow==14.0.1 numpy==1.26.0
Collecting boto3==1.17.85
Downloading boto3-1.17.85-py2.py3-none-any.whl.metadata (6.2 kB)
Collecting pyarrow==14.0.1
Downloading pyarrow-14.0.1-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (3.0 kB)
Collecting numpy==1.26.0
Downloading numpy-1.26.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (58 kB)
Collecting botocore<1.21.0,>=1.20.85 (from boto3==1.17.85)
Downloading botocore-1.20.112-py2.py3-none-any.whl.metadata (5.6 kB)
Collecting jmespath<1.0.0,>=0.7.1 (from boto3==1.17.85)
Downloading jmespath-0.10.0-py2.py3-none-any.whl.metadata (8.0 kB)
Collecting s3transfer<0.5.0,>=0.4.0 (from boto3==1.17.85)
Downloading s3transfer-0.4.2-py2.py3-none-any.whl.metadata (1.5 kB)
Collecting python-dateutil<3.0.0,>=2.1 (from botocore<1.21.0,>=1.20.85->boto3==1.17.85)
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting urllib3<1.27,>=1.25.4 (from botocore<1.21.0,>=1.20.85->boto3==1.17.85)
Downloading urllib3-1.26.19-py2.py3-none-any.whl.metadata (49 kB)
Collecting six>=1.5 (from python-dateutil<3.0.0,>=2.1->botocore<1.21.0,>=1.20.85->boto3==1.17.85)
Downloading six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Downloading boto3-1.17.85-py2.py3-none-any.whl (131 kB)
Downloading pyarrow-14.0.1-cp310-cp310-manylinux_2_28_x86_64.whl (38.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 38.0/38.0 MB 41.9 MB/s eta 0:00:00
Downloading numpy-1.26.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 41.9 MB/s eta 0:00:00
Downloading botocore-1.20.112-py2.py3-none-any.whl (7.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.7/7.7 MB 42.0 MB/s eta 0:00:00
Downloading jmespath-0.10.0-py2.py3-none-any.whl (24 kB)
Downloading s3transfer-0.4.2-py2.py3-none-any.whl (79 kB)
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
Downloading urllib3-1.26.19-py2.py3-none-any.whl (143 kB)
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: urllib3, six, numpy, jmespath, python-dateutil, pyarrow, botocore, s3transfer, boto3
Successfully installed boto3-1.17.85 botocore-1.20.112 jmespath-0.10.0 numpy-1.26.0 pyarrow-14.0.1 python-dateutil-2.9.0.post0 s3transfer-0.4.2 six-1.16.0 urllib3-1.26.19
...
root@50211a4e9179:/# pip freeze
boto3==1.17.85
botocore==1.20.112
cryptography==3.4.8
jmespath==0.10.0
numpy==1.26.0
pyarrow==14.0.1
python-dateutil==2.9.0.post0
s3transfer==0.4.2
six==1.16.0
urllib3==1.26.19 (The version
And this was the output of CloudWatch Logs on the agent:
root@50211a4e9179:/# tail -f /var/ossec/logs/ossec.log | grep wazuh-modulesd
2024/07/29 12:55:11 wazuh-modulesd:aws-s3[3768] wm_aws.c:84 at wm_aws_main(): INFO: Starting fetching of logs.
2024/07/29 12:55:11 wazuh-modulesd:aws-s3[3768] wm_aws.c:171 at wm_aws_main(): INFO: Executing Service Analysis: (Service: cloudwatchlogs, Profile: default)
2024/07/29 12:55:11 wazuh-modulesd:aws-s3[3768] wm_aws.c:558 at wm_aws_run_service(): DEBUG: Create argument list
2024/07/29 12:55:11 wazuh-modulesd:aws-s3[3768] wm_aws.c:662 at wm_aws_run_service(): DEBUG: Launching S3 Command: wodles/aws/aws-s3 --service cloudwatchlogs --aws_profile default --regions us-east-1 --aws_log_groups wazuh-cloudwatchlogs-integration-tests --debug 2
2024/07/29 12:55:14 wazuh-modulesd:aws-s3[3768] wm_aws.c:703 at wm_aws_run_service(): DEBUG: Service: cloudwatchlogs - OUTPUT: DEBUG: +++ Debug mode on - Level: 2
2024/07/29 12:55:14 wazuh-modulesd:aws-s3[3768] wm_aws.c:201 at wm_aws_main(): INFO: Fetching logs finished.
2024/07/29 12:55:14 wazuh-modulesd:aws-s3[3768] wm_aws.c:80 at wm_aws_main(): DEBUG: Sleeping until: 2024/07/29 12:56:11
...
2024/07/29 12:56:13 wazuh-modulesd:aws-s3[3768] wm_aws.c:703 at wm_aws_run_service(): DEBUG: Service: cloudwatchlogs - OUTPUT: DEBUG: +++ Debug mode on - Level: 2
DEBUG: +++ Getting alerts from "us-east-1" region.
DEBUG: Generating default configuration for retries: mode standard - max_attempts 10
DEBUG: only logs: None
DEBUG: Getting log streams for "wazuh-cloudwatchlogs-integration-tests" log group
DEBUG: Found "wazuh-cloudwatchlogs-integration-tests" log stream in wazuh-cloudwatchlogs-integration-tests
DEBUG: Getting data from DB for log stream "wazuh-cloudwatchlogs-integration-tests" in log group "wazuh-cloudwatchlogs-integration-tests"
DEBUG: Token: "f/xxxxxxxxxxxxxxxxxxxxxx/s", start_time: "1722211200000", end_time: "1722211200003"
DEBUG: Getting CloudWatch logs from log stream "wazuh-cloudwatchlogs-integration-tests" in log group "wazuh-cloudwatchlogs-integration-tests" using token "f/xxxxxxxxxxxxxxxxxxxxxx/s", start_time "1722211200004" and end_time "None"
DEBUG: +++ There are no new events in the "wazuh-cloudwatchlogs-integration-tests" group
DEBUG: Saving data for log group "wazuh-cloudwatchlogs-integration-tests" and log stream "wazuh-cloudwatchlogs-integration-tests".
DEBUG: The saved values are "{'token': 'f/xxxxxxxxxxxxxxxxxxxx/s', 'start_time': 1722211200000, 'end_time': 1722211200004}"
DEBUG: Some data already exists on DB for that key. Updating their values...
DEBUG: Purging the BD
DEBUG: Getting log streams for "wazuh-cloudwatchlogs-integration-tests" log group
DEBUG: Found "wazuh-cloudwatchlogs-integration-tests" log stream in wazuh-cloudwatchlogs-integration-tests
DEBUG: committing changes and closing the DB
And the interpreter was run again:
root@50211a4e9179:/# python3
Python 3.10.12 (main, Mar 22 2024, 16:50:05) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyarrow
>>> exit()
root@50211a4e9179:/#
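The manual interpreter check above could also be run non-interactively, for example as a smoke test after installing the dependencies. The following sketch starts a child Python process that only tries the import and reports whether it exited cleanly. The helper name `can_import` is hypothetical, and it assumes the interpreter passed in is the same one the agent uses.

```python
import subprocess
import sys


def can_import(module_name: str, python: str = sys.executable) -> bool:
    """Run `python -c "import <module>"` in a child process; True on exit code 0."""
    result = subprocess.run(
        [python, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Running the import in a separate process keeps a failing native extension from affecting the calling script, and `can_import("pyarrow")` gives a simple pass or fail signal.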
Hello,
Following the steps in the documentation to install pyarrow does not make the module work for the Wazuh agent.
This is related to the Release 4.9.0 - Alpha 3 - E2E UX tests - Amazon Cloudwatch Logs integration.
Kindly review as it generates the following error:
cc: @fdalmaup