[Recommendation] datar: dplyr in python #23

Closed
1 of 6 tasks
pwwang opened this issue Sep 16, 2021 · 6 comments

Comments

@pwwang

pwwang commented Sep 16, 2021

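Recommendation category

  • Cover
  • Topic
  • Bioinformatics news
  • Article
  • Tool
  • Resource recommendation

Recommended content

  • Title: datar: dplyr in Python
  • Summary/Introduction:
    R is widely used in bioinformatics analysis, and its data-manipulation packages, especially those from the tidyverse such as dplyr, tidyr, and forcats, are very popular. Their APIs are simple and easy to remember, and together with ggplot they make a superb combination for data analysis and plotting. In Python, pandas is powerful, but its API is sprawling and hard to remember. datar reimplements the relevant R packages in Python, so that data analysis in Python can use dplyr syntax as well. datar not only implements the pipe operation but also follows the API design of the original packages as closely as possible, so anyone familiar with R can pick it up easily.
  • Link: https://github.com/pwwang/datar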
@pwwang pwwang added the 推荐 (Recommendation) label Sep 16, 2021
@ShixiangWang
Member

Wow! Great tool. Does it require R packages, or is it implemented entirely in Python?

@ShixiangWang
Member

[Image: datar logo]

Example:

from datar import f
from datar.dplyr import mutate, filter, if_else
from datar.tibble import tibble
# or
# from datar.all import f, mutate, filter, if_else, tibble

df = tibble(
    x=range(4),
    y=['zero', 'one', 'two', 'three']
)
df >> mutate(z=f.x)
"""# output
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       1
2       2      two       2
3       3    three       3
"""

df >> mutate(z=if_else(f.x>1, 1, 0))
"""# output:
        x        y       z
  <int64> <object> <int64>
0       0     zero       0
1       1      one       0
2       2      two       1
3       3    three       1
"""

df >> filter(f.x>1)
"""# output:
        x        y
  <int64> <object>
0       2      two
1       3    three
"""

df >> mutate(z=if_else(f.x>1, 1, 0)) >> filter(f.z==1)
"""# output:
        x        y       z
  <int64> <object> <int64>
0       2      two       1
1       3    three       1
"""

@pwwang
Author

pwwang commented Sep 17, 2021

Wow! Great tool. Does it require R packages, or is it implemented entirely in Python?

Purely in Python, backed by pandas. No R at all.
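
For comparison, here is roughly what the last pipeline in the example above looks like in plain pandas. This is only a sketch of the equivalent pandas calls, not a description of how datar works internally; the variable names `with_z` and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same data as in the datar example
df = pd.DataFrame({"x": range(4), "y": ["zero", "one", "two", "three"]})

# Rough counterpart of mutate(z=if_else(f.x > 1, 1, 0)):
# assign() adds a column, np.where() picks 1 or 0 per row
with_z = df.assign(z=lambda d: np.where(d["x"] > 1, 1, 0))

# Rough counterpart of filter(f.z == 1):
# boolean-mask row selection, then renumber the rows from 0
result = with_z[with_z["z"] == 1].reset_index(drop=True)

print(result)
```

The pandas version has to name the intermediate frame (or use a lambda) to refer to the new column, which is the kind of bookkeeping the piped `f.z` syntax avoids.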

@ShixiangWang
Member

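That's impressive. The f in the example seems to correspond to .data in the tidyverse. I'd suggest explaining in the README that it refers to the data frame coming from upstream in the pipe; otherwise some beginners may find it hard to understand.

Thanks for sharing.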

@pwwang
Author

pwwang commented Sep 17, 2021

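f doesn't refer exactly to the upstream data, which is why I didn't mention it separately in the README. There is detailed documentation here:
https://pwwang.github.io/datar/f/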
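
One way to picture a placeholder that is not the data itself is as a recorder of deferred expressions: `f.x > 1` builds a recipe, and the recipe is only evaluated once some data arrives through `>>`. The toy below is a standard-library sketch of that general idea, assuming a plain dict of column lists as the "data frame". It is not datar's actual implementation, and `Expr`, `Symbol`, `Verb`, and `keep_rows` are hypothetical names invented for this illustration.

```python
class Expr:
    """A deferred expression: a recipe for computing a column from data."""

    def __init__(self, func):
        self._func = func

    def evaluate(self, data):
        return self._func(data)

    def __gt__(self, other):
        # Build a new deferred expression instead of comparing right away.
        return Expr(lambda data: [v > other for v in self.evaluate(data)])


class Symbol:
    """Placeholder: attribute access records which column is wanted."""

    def __getattr__(self, name):
        return Expr(lambda data: data[name])


class Verb:
    """Holds a function and its arguments until data arrives via `>>`."""

    def __init__(self, func, *args):
        self._func = func
        self._args = args

    def __rrshift__(self, data):
        # Python calls this for `data >> verb` because dict has no `__rshift__`.
        resolved = [a.evaluate(data) if isinstance(a, Expr) else a
                    for a in self._args]
        return self._func(data, *resolved)


def keep_rows(condition):
    """Toy counterpart of a row-filtering verb (hypothetical name)."""
    def run(data, mask):
        return {col: [v for v, keep in zip(vals, mask) if keep]
                for col, vals in data.items()}
    return Verb(run, condition)


f = Symbol()
data = {"x": [0, 1, 2, 3], "y": ["zero", "one", "two", "three"]}

expr = f.x > 1                      # no data involved yet, just a recipe
result = data >> keep_rows(expr)    # the recipe is evaluated against `data` here
reused = expr.evaluate({"x": [5, 0]})  # the same recipe works on other data

print(result)
print(reused)
```

In this sketch `f` never holds a data frame, so the same expression can be built once and applied to different inputs. The linked documentation is the place to check how the real `f` behaves.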

@ShixiangWang
Member

OK, it will be recommended in the second edition.

Thanks.

This issue will now be closed.
