Doc fixes
ahirreddy committed Apr 15, 2014
1 parent 6d658ba commit a19afe4
Showing 1 changed file with 4 additions and 2 deletions.
docs/sql-programming-guide.md
@@ -216,6 +216,8 @@ parts = lines.map(lambda l: l.split(","))
people = parts.map(lambda p: {"name": p[0], "age": int(p[1])})

# Infer the schema, and register the SchemaRDD as a table.
+# In future versions of PySpark we would like to add support for registering RDDs with other
+# datatypes as tables
peopleTable = sqlCtx.inferSchema(people)
peopleTable.registerAsTable("people")

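For context, here is a minimal end-to-end sketch of the example this hunk touches, assuming the Spark 1.0-era PySpark API (an existing SparkContext `sc`, and a `people.txt` whose path and name,age format are assumed, not taken from this commit). This is an illustration, not part of the change:

from pyspark.sql import SQLContext
sqlCtx = SQLContext(sc)

# Load a text file and convert each line to a dictionary of column name -> value.
lines = sc.textFile("examples/src/main/resources/people.txt")
parts = lines.map(lambda l: l.split(","))
people = parts.map(lambda p: {"name": p[0], "age": int(p[1])})

# Infer the schema from the dictionaries and register the SchemaRDD as a table.
peopleTable = sqlCtx.inferSchema(people)
peopleTable.registerAsTable("people")

# SQL can now be run over the registered table; the result is another SchemaRDD.
teenagers = sqlCtx.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")
teenNames = teenagers.map(lambda p: "Name: " + p.name)
teenNames.collect()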
@@ -293,11 +295,11 @@ JavaSchemaRDD teenagers = sqlCtx.sql("SELECT name FROM parquetFile WHERE age >=

peopleTable # The SchemaRDD from the previous example.

-# JavaSchemaRDDs can be saved as parquet files, maintaining the schema information.
+# SchemaRDDs can be saved as parquet files, maintaining the schema information.
peopleTable.saveAsParquetFile("people.parquet")

# Read in the parquet file created above. Parquet files are self-describing so the schema is preserved.
-# The result of loading a parquet file is also a JavaSchemaRDD.
+# The result of loading a parquet file is also a SchemaRDD.
parquetFile = sqlCtx.parquetFile("people.parquet")

# Parquet files can also be registered as tables and then used in SQL statements.
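The last comment above mentions registering Parquet files as tables, but the hunk ends before the code that does so. As a hedged illustration (not the elided text of the guide), a query against such a table might look like the following, again assuming the Spark 1.0-era PySpark API and the `sqlCtx` and `parquetFile` objects from the example above:

# Register the SchemaRDD read from the Parquet file as a table, then query it with SQL.
parquetFile.registerAsTable("parquetFile")
teenagers = sqlCtx.sql("SELECT name FROM parquetFile WHERE age >= 13 AND age <= 19")
teenNames = teenagers.map(lambda p: "Name: " + p.name)
teenNames.collect()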
