Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid management of reference counting in open_files (OSX) #1127

Closed
1 of 2 tasks
jakub-bacic opened this issue Sep 14, 2017 · 7 comments
Closed
1 of 2 tasks

Invalid management of reference counting in open_files (OSX) #1127

jakub-bacic opened this issue Sep 14, 2017 · 7 comments

Comments

@jakub-bacic
Copy link
Contributor

jakub-bacic commented Sep 14, 2017

While iterating over file descriptors of the given process, it's possible to get EPERM error while calling proc_pidfdinfo:

psutil/psutil/_psutil_osx.c

Lines 1137 to 1141 in c4c6e22

nb = proc_pidfdinfo((pid_t)pid,
fdp_pointer->proc_fd,
PROC_PIDFDVNODEPATHINFO,
&vi,
sizeof(vi));

The errors checking part will raise exception and goto error label:

psutil/psutil/_psutil_osx.c

Lines 1151 to 1152 in c4c6e22

psutil_raise_for_pid(pid, "proc_pidinfo() syscall failed");
goto error;

If such situation happens after the first iteration of the loop (so py_path variable is not NULL), it will cause py_path reference count to be decremented even though it shouldn't be:

Py_XDECREF(py_path);

I was able to discover that issue while testing psutil on Mac OS High Sierra beta release. Issue is consistently reproducible (at least on my machine) when I try to iterate over all processes and list open files for them. Example problematic process is called lsd and EPERM error is also visible while using command-line lsof utility:

hsierras-MacBook-Pro:psutil jbacic$ lsof -p 325
COMMAND PID   USER   FD   TYPE DEVICE   SIZE/OFF       NODE NAME
lsd     325 jbacic  cwd    DIR   1,15       1024          2 /
lsd     325 jbacic  txt    REG   1,15      19136 4298624763 /usr/libexec/lsd
lsd     325 jbacic  txt    REG   1,15      32768 4298764710 /private/var/db/mds/messages/502/se_SecurityMessages
lsd     325 jbacic  txt    REG   1,15     376816 4298543466 /System/Library/Frameworks/Security.framework/Versions/A/PlugIns/csparser.bundle/Contents/MacOS/csparser
lsd     325 jbacic  txt    REG   1,15   26704160 4298629438 /usr/share/icu/icudt59l.dat
lsd     325 jbacic  txt    REG   1,15    7331840 4298779644 /private/var/folders/35/wbfsx81n0wlfd1165gphvvw00000gp/0/com.apple.LaunchServices-219-v2.csstore
lsd     325 jbacic  txt    REG   1,15     837248 4298618806 /usr/lib/dyld
lsd     325 jbacic  txt    REG   1,15 1146023936 4298642428 /private/var/db/dyld/dyld_shared_cache_x86_64h
lsd     325 jbacic    0r   CHR    3,2        0t0        332 /dev/null
lsd     325 jbacic    1u   CHR    3,2        0t0        332 /dev/null
lsd     325 jbacic    2u   CHR    3,2        0t0        332 /dev/null
lsd     325 jbacic    3                                     vnode: Operation not permitted

Note that this is not an issue for previous Mac OS/OSX releases and probably that is why this issue wasn't found earlier.

Suggested actions:

  • fix reference counting in psutil_proc_open_files (so it won't corrupt memory in case of unexpected errors)
  • perform static code analysis of other functions in _psutil_osx.c
@giampaolo
Copy link
Owner

Mmm I see what you mean. To clarify, the problem occurs in such a circumstance:

  • code makes (say) 1 loop, py_path and py_tuple are correctly DECREFed.
  • code makes another loop, bumps into an error, ends up in error: goto and both python objects get DECREFed again (twice); this causes the segfault

Correct? This is interesting... and worrying, because I believe there are other similar constructs elsewhere in the code. In this specific case it seems setting py_path and py_tuple to NULL right after being DECREFed is the way to go (because in the goto block we use Py_XDECREF which would then act as a no-op). I'm not sure about other circumstances though. It would be good to find a robust solution once and for all, and adopt it everywhere.

add EPERM to error checking part so it won't raise an exception for such process but skip a single fd instead (similar to the way lsof behaves)

No, we want it to propagate. Otherwise it would just return an incorrect result (an empty list).

@jakub-bacic
Copy link
Contributor Author

You got it right. I've already managed to find a couple of other errors related to reference counting in this file (but it's possible that they are harmless because they are never reached) e.g.:

psutil/psutil/_psutil_osx.c

Lines 144 to 149 in c4c6e22

if (! py_name) {
// Likely a decoding error. We don't want to fail the whole
// operation. The python module may retry with proc_name().
PyErr_Clear();
py_name = Py_None;
}

You should not assign Py_None value without INCREFing it. I'd replace that assignment with:

py_name = Py_BuildValue("");

Regarding the general fix, I also thought about setting pointer to NULL after being DECREFed. It's not required in most cases but it will easily prevent us from such errors (assuming that Py_XDECREF is used in error handling section). Additionally, it's not a very uncommon style in C programming (setting pointer to NULL once it's freed). Such defensive programming makes debugging dangling pointer bugs much easier - accessing NULL pointer will crash immediately on most platforms instead of reading/overwriting some random memory and leaving your app in undefined state.

No, we want it to propagate. Otherwise it would just return an incorrect result (an empty list).

That's not entirely true. You can get EPERM error just for a part of file descriptors so the result can still have some entries. That's why I suggested omitting this error instead of raising AccessDenied exception.

Example from my machine:

hsierras-MacBook-Pro:psutil jbacic$ lsof -p 396
COMMAND  PID   USER   FD      TYPE             DEVICE   SIZE/OFF       NODE NAME
routined 396 jbacic  cwd       DIR               1,15       1024          2 /
routined 396 jbacic  txt       REG               1,15      27408 4298624971 /usr/libexec/routined
routined 396 jbacic  txt      0000                0,0          0          0 /private/var/folders/35/wbfsx81n0wlfd1165gphvvw00000gp/0/com.apple.routined/Cache/Cache.sqlite-shm
routined 396 jbacic  txt       REG               1,15      56944 4298446202 /System/Library/Address Book Plug-Ins/LocalSource.sourcebundle/Contents/MacOS/LocalSource
routined 396 jbacic  txt       REG               1,15      49720 4298739750 /private/var/db/analyticsd/currentConfiguration.json
routined 396 jbacic  txt       REG               1,15   26704160 4298629438 /usr/share/icu/icudt59l.dat
routined 396 jbacic  txt       REG               1,15      32768 4298316497 /Users/jbacic/Library/Application Support/AddressBook/AddressBook-v22.abcddb-shm
routined 396 jbacic  txt       REG               1,15      94064 4298446145 /System/Library/Address Book Plug-Ins/DirectoryServices.sourcebundle/Contents/MacOS/DirectoryServices
routined 396 jbacic  txt       REG               1,15     837248 4298618806 /usr/lib/dyld
routined 396 jbacic  txt       REG               1,15 1146023936 4298642428 /private/var/db/dyld/dyld_shared_cache_x86_64h
routined 396 jbacic    0r      CHR                3,2        0t0        332 /dev/null
routined 396 jbacic    1u      CHR                3,2        0t0        332 /dev/null
routined 396 jbacic    2u      CHR                3,2     0t5447        332 /dev/null
routined 396 jbacic    3   NPOLICY                                          
routined 396 jbacic    4                                                    vnode: Operation not permitted
routined 396 jbacic    5                                                    vnode: Operation not permitted
routined 396 jbacic    6u      REG               1,15          0 4298764773 /private/var/folders/35/wbfsx81n0wlfd1165gphvvw00000gp/T/.AddressBookLocks/database.lock
routined 396 jbacic    7u    systm 0xf19f715c02095e53        0t0            [ctl com.apple.netsrc id 9 unit 9]
routined 396 jbacic    8                                                    vnode: Operation not permitted
routined 396 jbacic    9                                                    vnode: Operation not permitted
routined 396 jbacic   10                                                    vnode: Operation not permitted
routined 396 jbacic   11u      REG               1,15     393216 4298316489 /Users/jbacic/Library/Application Support/AddressBook/AddressBook-v22.abcddb
routined 396 jbacic   12                                                    vnode: Operation not permitted
routined 396 jbacic   13                                                    vnode: Operation not permitted
routined 396 jbacic   14                                                    vnode: Operation not permitted
routined 396 jbacic   15                                                    vnode: Operation not permitted
routined 396 jbacic   16u      REG               1,15     193672 4298316496 /Users/jbacic/Library/Application Support/AddressBook/AddressBook-v22.abcddb-wal
routined 396 jbacic   17                                                    vnode: Operation not permitted
routined 396 jbacic   18                                                    vnode: Operation not permitted
routined 396 jbacic   19                                                    vnode: Operation not permitted
routined 396 jbacic   20u      REG               1,15      32768 4298316497 /Users/jbacic/Library/Application Support/AddressBook/AddressBook-v22.abcddb-shm
routined 396 jbacic   21u      REG               1,15          0 4298764858 /private/var/folders/35/wbfsx81n0wlfd1165gphvvw00000gp/T/.AddressBookLocks/_Users_jbacic_Library_Application Support_AddressBook_Metadata_.MetaData.lock_lock
routined 396 jbacic   22                                                    vnode: Operation not permitted
routined 396 jbacic   23                                                    vnode: Operation not permitted
routined 396 jbacic   24                                                    vnode: Operation not permitted
routined 396 jbacic   25                                                    vnode: Operation not permitted
routined 396 jbacic   26                                                    vnode: Operation not permitted
>>> import psutil
>>> for f in psutil.Process(396).open_files():
>>>   print(f.fd, f.path)

(6, '/private/var/folders/35/wbfsx81n0wlfd1165gphvvw00000gp/T/.AddressBookLocks/database.lock')
(11, '/Users/janusz/Library/Application Support/AddressBook/AddressBook-v22.abcddb')
(16, '/Users/janusz/Library/Application Support/AddressBook/AddressBook-v22.abcddb-wal')
(20, '/Users/janusz/Library/Application Support/AddressBook/AddressBook-v22.abcddb-shm')
(21, '/private/var/folders/35/wbfsx81n0wlfd1165gphvvw00000gp/T/.AddressBookLocks/_Users_janusz_Library_Application Support_AddressBook_Metadata_.MetaData.lock_lock')

What do you think?

@giampaolo
Copy link
Owner

Sorry for being late. As for the permission error, I still think PermissionError should propagate. lsof has a way to inform the user that some fds are there but cannot be accessed but psutil doesn't. It will just return an incomplete result, the user will assume that it's all there when it's not and that is bad.

As for the reference counting issue, where are we at with it? Can you provide a PR? It looks like a high priority bug.

@jakub-bacic
Copy link
Contributor Author

I created a PR with appropriate changes to open_files function: #1144

I can fix the rest of _psutil_osx.c file later. However, if nobody reported any issues so far, it probably means they are not very likely to be reached. I just don't want to block this one as it's high priority bug due to Mac OS High Sierra release.

@giampaolo
Copy link
Owner

Did this issue cause segfaults?

@jakub-bacic
Copy link
Contributor Author

Yes. Invalid handling of reference counting (double DECREF) causes memory corruption and segfault in the end.

@giampaolo
Copy link
Owner

Fixed in #1144.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants