SPARK-21359 #18585

Closed
wants to merge 1 commit into from

Shanshan-IC

What changes were proposed in this pull request?

As described in https://issues.apache.org/jira/browse/SPARK-21359

Add new functions for a frequency discretizer transformation.

How was this patch tested?

Test example:
val data = Array((0, 18.0), (1, 19.0), (2, 8.0), (3, 5.0), (4, 2.2), (5, 1.0), (6, 9.1), (7, 10.1), (8, 1.1), (9, 16.0), (10, 20.0), (11, 20.0))
val df = spark.createDataFrame(data).toDF("id", "hour")
val frequency = new FrequencyDiscretizer()
  .setInputCol("hour")
  .setOutputCol("result")
  .setNumBuckets(4)

val result = frequency.fit(df).transform(df)
result.show()

You will get:
+---+----+------+
| id|hour|result|
+---+----+------+
|  0|18.0|   2.0|
|  1|19.0|   3.0|
|  2| 8.0|   1.0|
|  3| 5.0|   1.0|
|  4| 2.2|   0.0|
|  5| 1.0|   0.0|
|  6| 9.1|   1.0|
|  7|10.1|   2.0|
|  8| 1.1|   0.0|
|  9|16.0|   2.0|
| 10|20.0|   3.0|
| 11|20.0|   3.0|
+---+----+------+
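
The result column above is consistent with equal-frequency (rank-based) bucketing: with 12 rows and 4 buckets, each bucket receives 3 values in sorted order. The patch's implementation is not shown in this thread, so the following is only a minimal plain-Scala sketch of that idea, not the PR's code. The object `EqualFrequencySketch` and the method `bucketize` are hypothetical names introduced for illustration, and sending tied values to the same bucket is an assumption.

```scala
// Hypothetical sketch, NOT the implementation from this pull request.
// Assumption: a value's bucket is derived from its rank in sorted order,
// so that each bucket receives roughly the same number of values.
object EqualFrequencySketch {
  def bucketize(values: Seq[Double], numBuckets: Int): Seq[Double] = {
    require(numBuckets > 0, "numBuckets must be positive")
    val n = values.length
    // Map each distinct value to the rank of its first occurrence in sorted
    // order, so that tied values always land in the same bucket (assumption).
    val firstRank: Map[Double, Int] =
      values.sorted.zipWithIndex
        .groupBy(_._1)
        .map { case (v, ranked) => v -> ranked.map(_._2).min }
    // Scale the rank from [0, n) down to a bucket index in [0, numBuckets).
    values.map(v => (firstRank(v).toLong * numBuckets / n).toDouble)
  }

  def main(args: Array[String]): Unit = {
    // The "hour" column from the test example, in id order.
    val hours = Seq(18.0, 19.0, 8.0, 5.0, 2.2, 1.0, 9.1, 10.1, 1.1, 16.0, 20.0, 20.0)
    println(bucketize(hours, 4))
  }
}
```

Run on the `hour` values from the test example with 4 buckets, this sketch yields the same bucket indices as the `result` column in the table. With heavily tied data the buckets can become uneven, since ties are kept together.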

Please review http://spark.apache.org/contributing.html before opening a pull request.

@AmplabJenkins

Can one of the admins verify this patch?

@srowen
Member

srowen commented Jul 10, 2017

I'd ask you to close this and back up and discuss what this is about on the JIRA first. I don't understand what this is intended to do.
