TMDB Movie Data Analysis

by Chinonso Okonkwo

Objectives

This is a repository for Udacity Data Analyst Project 1 (Investigate a Dataset). The dataset used in the project is also included in this repository.

Installation

The libraries used in this project are:

Pandas – For storing and manipulating structured data. Pandas functionality is built on NumPy (upgrade Pandas to version 0.25.1)
NumPy – For multi-dimensional arrays, matrix data structures, and mathematical operations
Matplotlib – For all visualizations (including maps and graphs)

Introduction

I analyzed a dataset containing information on about 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue. The analysis focuses on answering the following questions:

Which movie title had the highest budget?
Which movie title had the highest revenue?
Which movies are the most popular of all time?
Is there a correlation between vote_count and revenue?
What kinds of properties are associated with movies that have high revenues?
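
As a rough sketch of how the first three questions could be approached with Pandas, the snippet below uses a tiny made-up table in place of the real dataset. The column names `original_title`, `budget`, `revenue` and `popularity`, and the helper `top_title`, are assumptions for illustration and may not match the notebook.

```python
import pandas as pd

# Toy stand-in for the real dataset; titles and numbers are invented.
movies = pd.DataFrame({
    "original_title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "budget": [150, 20, 300, 75],
    "revenue": [900, 45, 650, 1200],
    "popularity": [8.5, 1.2, 9.9, 6.3],
})

def top_title(df, column):
    """Return the title of the row with the largest value in `column` (hypothetical helper)."""
    return df.loc[df[column].idxmax(), "original_title"]

highest_budget = top_title(movies, "budget")
highest_revenue = top_title(movies, "revenue")

# Titles of the three most popular movies, highest popularity first.
most_popular = movies.nlargest(3, "popularity")["original_title"].tolist()
```

With the real data, `movies` would instead be loaded from the CSV file in the repository, for example with `pd.read_csv`.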

Project Methodology

The main steps for this project can be summarized as follows:

Data Wrangling
- Data Assessment
- Data Cleaning
Exploratory Analysis
Conclusions/Results
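
The wrangling steps above might look roughly like the following sketch. It is not taken from the notebook: the toy rows are invented, the column names are assumed, and treating a zero budget or revenue as a missing value is an assumption about the cleaning rule rather than a documented step.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the raw dataset; one exact duplicate and two zero entries.
raw = pd.DataFrame({
    "original_title": ["Movie A", "Movie A", "Movie B", "Movie C"],
    "budget": [100, 100, 0, 50],
    "revenue": [400, 400, 30, 0],
})

# Data assessment: count missing values per column and fully duplicated rows.
missing_per_column = raw.isnull().sum()
duplicate_rows = int(raw.duplicated().sum())

def clean_movies(df):
    """Drop duplicate rows, then rows whose budget or revenue is zero or missing."""
    money = ["budget", "revenue"]
    out = df.drop_duplicates().copy()
    # Assumption: a value of 0 means "not recorded", so mark it as missing.
    out[money] = out[money].replace(0, np.nan)
    return out.dropna(subset=money).reset_index(drop=True)

# Data cleaning
clean = clean_movies(raw)
```

Keeping the cleaning in one function makes it easy to rerun on the full dataset after the assessment step shows which columns need attention.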

Results

Based on the data and the analysis carried out:

The most popular movies of all time are Jurassic World, Mad Max: Fury Road, Interstellar, Guardians of the Galaxy and Insurgent.
The scatter plot shows no correlation between vote_count and revenue.
High popularity ratings are associated with movies that generate high revenue.
The budget of a low-revenue movie is about 5 million, while that of a high-revenue movie is over 52 million. This indicates that a movie's budget is correlated with its revenue, although the result has limitations, such as the year the movie was released (release_year) and the director of the movie.
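
One way to check findings like these numerically is sketched below, assuming columns named `budget`, `revenue` and `vote_count`. The rows are invented and do not reproduce the results above, and splitting movies into low and high revenue at the median is an assumption, since the split used in the analysis is not stated here.

```python
import pandas as pd

# Toy stand-in for the cleaned dataset; all numbers are invented.
movies = pd.DataFrame({
    "budget": [5, 8, 60, 45, 3, 70],
    "revenue": [10, 20, 300, 250, 5, 400],
    "vote_count": [50, 400, 120, 900, 30, 600],
})

# Pearson correlation coefficient between vote_count and revenue.
r = movies["vote_count"].corr(movies["revenue"])

# Label each movie as low or high revenue relative to the median (assumed split).
median_revenue = movies["revenue"].median()
group = movies["revenue"].gt(median_revenue).map({False: "low", True: "high"})

# Mean budget within each revenue group.
mean_budget = movies.groupby(group)["budget"].mean()
```

A coefficient near 0 would support the scatter plot reading of no linear relationship, while a large gap between `mean_budget["low"]` and `mean_budget["high"]` would be consistent with the budget comparison.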
