Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update librz/magic to the latest from OpenBSD #721

Open
XVilka opened this issue Feb 25, 2021 · 21 comments
Open

Update librz/magic to the latest from OpenBSD #721

XVilka opened this issue Feb 25, 2021 · 21 comments
Labels
buildsystem good first issue Good for newcomers refactor Refactoring requests

Comments

@XVilka
Copy link
Member

XVilka commented Feb 25, 2021

Current code in librz/magic is from 2009.
There were tons of the changes in OpenBSD since, see https://github.com/openbsd/src/tree/master/usr.bin/file

Note, that there are changes that were done in this code after the import, so they should be isolated and reapplied as appropriate.
Also, probably it could be made as subproject, see #209

Initially it was imported in this commit: radareorg/radare2@323441c

You can see the following changes by browsing the history of libr/magic and librz/magic correspondingly.

@ret2libc
Copy link
Member

It makes me wonder whether we really need rz_magic module at all...

@XVilka
Copy link
Member Author

XVilka commented Feb 25, 2021

From my point of view, the magic search feature is important. But we could also consider the possibility of using libmagic as a dependency, one of our subprojects. But it depends on how much it was customized, maybe the mainstream version is incompatible out of the box.

@ret2libc
Copy link
Member

I agree it is an important feature indeed! I'm just not sure if it makes sense to have a module which essentially just mirrors either BSD implementation or GNU implementation, in case sys_magic is used...

I guess we need to research a bit more and see if there's any relevant change in rz_magic.

@caribpa
Copy link
Contributor

caribpa commented Feb 25, 2021

I also think the magic search feature is important. Moreover, it would be better to use libmagic, though after diffing with magic.c 1.6 (I can't find magic.c 1.8 anywhere) I see there are some changes that basically only make use of Rizin types:

-private int
-info_from_stat(struct magic_set *ms, mode_t md)
-{
+static int info_from_stat(RzMagic *ms, unsigned short md) {
        /* We cannot open it, but we were able to stat it. */
-       if (md & 0222)
-               if (file_printf(ms, "writable, ") == -1)
+       if (md & 0222) {
+               if (file_printf(ms, "writable, ") == -1) {
                        return -1;
-       if (md & 0111)
-               if (file_printf(ms, "executable, ") == -1)
+               }
+       }
+       if (md & 0111) {
+               if (file_printf(ms, "executable, ") == -1) {
                        return -1;
-       if (S_ISREG(md))
-               if (file_printf(ms, "regular file, ") == -1)
+               }
+       }
+       if (S_ISREG(md)) {
+               if (file_printf(ms, "regular file, ") == -1) {
                        return -1;
-       if (file_printf(ms, "no read permission") == -1)
+               }
+       }
+       if (file_printf(ms, "no read permission") == -1) {
                return -1;

But then, there are some Rizin specific functions defined:

+RZ_API void rz_magic_free(RzMagic *ms) {
+       if (ms) {
+               free_mlist(ms->mlist);
+               free(ms->o.pbuf);
+               free(ms->o.buf);
+               free(ms->c.li);
+               free(ms);
+       }
+}
+
+RZ_API bool rz_magic_load_buffer(RzMagic *ms, const char *magicdata) {
+       if (*magicdata == '#') {
+               struct mlist *ml = file_apprentice(ms, magicdata, FILE_LOAD);
+               if (ml) {
+                       free_mlist(ms->mlist);
+                       ms->mlist = ml;
+                       return true;
+               }
+       } else {
+               eprintf("Magic buffers should start with #\n");
+       }
+       return false;
+}

@caribpa
Copy link
Contributor

caribpa commented Feb 25, 2021

There is some content removed in the Rizin version, but I cannot tell if they are rizin specific changes or version changes (1.6 -> 1.8, as magic.c disappears from the OpenBSD history in the next version)

@caribpa
Copy link
Contributor

caribpa commented Feb 25, 2021

It shall be noted that all the GNU, BSD, etc versions come from https://www.darwinsys.com/file/

@ret2libc
Copy link
Member

ret2libc commented Mar 3, 2021

It shall be noted that all the GNU, BSD, etc versions come from https://www.darwinsys.com/file/

Yes, except the version we ship is from OpenBSD, that as said in the link wrote its own code to implement libmagic :(

@XVilka
Copy link
Member Author

XVilka commented Mar 4, 2021

@caribpa @ret2libc if the patching cannot be avoided we could employ semantic patching with Coccinelle suite:

@ret2libc
Copy link
Member

ret2libc commented Mar 4, 2021

@caribpa @ret2libc if the patching cannot be avoided we could employ semantic patching with Coccinelle suite:

* https://coccinelle.gitlabpages.inria.fr/website/sp.html

* https://dewaka.com/blog/2020/01/25/semantic-patching/

I'm not sure what coccinnelle has to do with libmagiz :) Could you elaborate?

@caribpa
Copy link
Contributor

caribpa commented Mar 4, 2021

From what I understand, @XVilka is suggesting to use the latest libmagiz (still needs to be discussed which one), and smart-patch it (with the Rizin specific changes) using Semantic patching (coccinelle).

@caribpa
Copy link
Contributor

caribpa commented Mar 4, 2021

I think we should see first if it is possible to simply use the system's libmagic and replace the unwanted behavior.

@caribpa
Copy link
Contributor

caribpa commented Mar 4, 2021

@ret2libc why the thumbs down? 😭

@caribpa
Copy link
Contributor

caribpa commented Mar 4, 2021

Btw, what I was suggesting when I said simply use the system's libmagic and replace the unwanted behavior was using the weak attribute (though I realized it is not portable), or dlsym (which is POSIX but apparently not on Windows).

@caribpa
Copy link
Contributor

caribpa commented Mar 4, 2021

Windows seems to have a pragma for weak-like, see this answer

@caribpa
Copy link
Contributor

caribpa commented Mar 4, 2021

According to this answer, the dlsym equivalent in Windows is GetProcAddress

@caribpa
Copy link
Contributor

caribpa commented Mar 4, 2021

Well, apparently there's already rz_lib_dl_sym for solving the portability issues 🙂

@ret2libc
Copy link
Member

ret2libc commented Mar 4, 2021

@ret2libc why the thumbs down? sob

Just because I don't like the approach. And, again, I don't see why coccinelle would really help to do any non-trivial adjustments.

@XVilka
Copy link
Member Author

XVilka commented Mar 5, 2021

@caribpa @ret2libc I extracted all changes for files in librz/magic since the initial import into the separate repository, also combined some commits for clarity, see https://github.com/XVilka/rizin-magic

Please note that it's made through git replace --graft so GitHub does't show the right history, locally it should be fine.

git replace --graft f448ee2682c48d8cc3d2592147f96984ecbc6ced a8e896b7551107dd924e5ce8dc3b2be0ac8fe7ae

@XVilka
Copy link
Member Author

XVilka commented Mar 5, 2021

And here is the combined patch of all changes in both radare2 and rizin since the initial import:

magic_radare_rizin_changes.patch.zip

The same difference but without changes by clang-format:

magic_radare_rizin_changes_no_clang_format.patch.zip

@caribpa
Copy link
Contributor

caribpa commented Mar 5, 2021

Excellent work @XVilka 👏

I think we should first research if we could make libmagic a dependency (like capstone), compile it, and then use rz_lib_dl_sym for getting the libmagic functions we don't want to modify (and implement our own version/wrap of the things we want to work differently) for future-proofing.

@stale
Copy link

stale bot commented Sep 7, 2021

This issue has been automatically marked as stale because it has not had recent activity. Considering a lot has probably changed since its creation, we kindly ask you to check again if the issue you reported is still relevant in the current version of rizin. If it is, update this issue with a comment, otherwise it will be automatically closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Sep 7, 2021
@ret2libc ret2libc removed the stale label Sep 7, 2021
@stale stale bot added the stale label Mar 12, 2022
@rizinorg rizinorg deleted a comment from stale bot Mar 14, 2022
@stale stale bot removed the stale label Mar 14, 2022
@XVilka XVilka added this to the 0.5.0 milestone Aug 24, 2022
@ret2libc ret2libc removed this from the 0.5.0 milestone Jan 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
buildsystem good first issue Good for newcomers refactor Refactoring requests
Projects
None yet
Development

No branches or pull requests

3 participants