This repo is a demonstration of using Apache Spark with Python to analyse CSV data from a PV solar inverter. See pvoutput.org, jfy-monitor and monitoring my inverter.
There are two file formats in the collection. The first is the output from solarmonj, and has the following schema:
| Field name | Datatype and units |
| --- | --- |
| Timestamp | seconds since epoch |
| Temperature | float (degrees C) |
| energyNow | float (Watts) |
| energyToday | float (Watt-hours) |
| powerGenerated | float (Hertz) |
| voltageDC | float (Volts) |
| current | float (Amps) |
| energyTotal | float (Watt-hours) |
| voltageAC | float (Volts) |
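As a rough sketch (not code from this repo), that layout could be declared as a PySpark schema; the field names and types below are simply taken from the table above:

```python
from pyspark.sql.types import StructType, StructField, LongType, FloatType

# Schema for the solarmonj CSV files, mirroring the table above.
SOLARMONJ_SCHEMA = StructType([
    StructField("timestamp", LongType()),        # seconds since epoch
    StructField("temperature", FloatType()),     # degrees C
    StructField("energyNow", FloatType()),       # Watts
    StructField("energyToday", FloatType()),     # Watt-hours
    StructField("powerGenerated", FloatType()),  # Hertz, as logged
    StructField("voltageDC", FloatType()),       # Volts
    StructField("current", FloatType()),         # Amps
    StructField("energyTotal", FloatType()),     # Watt-hours
    StructField("voltageAC", FloatType()),       # Volts
])
```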
Due to bugs in solarmonj, combined with occasionally marginal hardware, some rows in the first format are invalid:
```
1370752022,1.4013e-45,-0.27184,0,-0.27184,1.4013e-45,1.3703e-40,1.36638e-40,6.43869e-41
```
Those records are dropped prior to creating RDDs.
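A minimal sketch of that clean-up, assuming the logs are read with the RDD API; the input path and the validity heuristic (rejecting rows full of denormal float junk like 1.4013e-45) are illustrative assumptions, not the repo's actual logic:

```python
from pyspark import SparkContext

sc = SparkContext.getOrCreate()

def looks_valid(fields):
    # Assumed heuristic: corrupt rows are full of denormal floats such as
    # 1.4013e-45, so reject any row with an implausibly tiny non-zero value.
    try:
        values = [float(f) for f in fields[1:]]
    except ValueError:
        return False
    return all(v == 0.0 or abs(v) > 1e-30 for v in values)

raw = sc.textFile("/path/to/solarmonj/*.csv")   # hypothetical path
rows = (raw.map(lambda line: line.split(","))
           .filter(lambda fields: len(fields) == 9)   # 9 columns per the schema
           .filter(looks_valid))
```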
The second file format comes from jfy-monitor and has the following schema:
The first schema is in effect for records starting on 2013-06-04 and ending on 2018-03-26.
The second schema takes effect with the logfiles starting on 2018-03-27. Some records from 2018-03-27 and 2018-03-28 have a different set of fields, because I was updating jfyMonitor in production and breaking things; we drop those records as well.
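Since the broken 2018-03-27/28 rows differ in their field count, one hedged way to drop them is to filter on the number of columns; `N_JFY_FIELDS` below is a placeholder, as the second schema is not reproduced here:

```python
from pyspark import SparkContext

sc = SparkContext.getOrCreate()

N_JFY_FIELDS = 9  # placeholder: the real count depends on the jfy-monitor schema

jfy_raw = sc.textFile("/path/to/jfy-monitor/*.csv")  # hypothetical path
jfy_rows = (jfy_raw.map(lambda line: line.split(","))
                   .filter(lambda fields: len(fields) == N_JFY_FIELDS))
```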
We load up all the data files and then, from within a venv which has pyspark installed, we run
```
$ spark-submit /path/to/this/file (args)
```
to generate several reports:

- for each year, which month had the day with the max and min energy outputs
- for each month, what was the average energy generated
- for each month, what was the total energy generated
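As an illustrative sketch of how those reports could be expressed with the DataFrame API (the repo may use RDD transformations instead), assuming `SOLARMONJ_SCHEMA` from the earlier sketch, a hypothetical input path, and that `energyToday` accumulates over the day so its daily maximum is that day's total:

```python
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.appName("solar-reports").getOrCreate()

# Load the (already cleaned) solarmonj data with the schema sketched earlier.
df = spark.read.csv("/path/to/solarmonj/*.csv", schema=SOLARMONJ_SCHEMA)

readings = (df.withColumn("date", F.to_date(F.from_unixtime("timestamp")))
              .withColumn("year", F.year("date"))
              .withColumn("month", F.month("date")))

# Daily total: the largest energyToday reading seen on each day.
daily = (readings.groupBy("year", "month", "date")
                 .agg(F.max("energyToday").alias("day_wh")))

# Per-month average and total energy generated.
monthly = (daily.groupBy("year", "month")
                .agg(F.avg("day_wh").alias("avg_wh"),
                     F.sum("day_wh").alias("total_wh")))

# Per year: the month containing the single best day (the worst day is
# analogous, using F.min instead of F.max).
best_day = daily.groupBy("year").agg(F.max("day_wh").alias("max_wh"))
best_month = (daily.join(best_day, "year")
                   .where(F.col("day_wh") == F.col("max_wh"))
                   .select("year", "month", "max_wh"))
```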