Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Econometrics UX #1878

Merged
merged 36 commits into from
Jun 1, 2022
Merged

Econometrics UX #1878

merged 36 commits into from
Jun 1, 2022

Conversation

DidierRLopes
Copy link
Collaborator

@DidierRLopes DidierRLopes commented May 27, 2022

Also fixes issue with Sector and Industry Analysis.

Improves a lot of Econometrics UX. Mostly in regards to have a better error message and docs, more meaningful auto-complete and pushing the user towards using the correct convention when calling the commands.

Should be good for review @JerBouma. When you are review this, could you update documentation along the way please?

@DidierRLopes DidierRLopes added bug Fix bug enhancement Enhancement labels May 27, 2022
@JerBouma
Copy link
Contributor

JerBouma commented May 30, 2022

@DidierRLopes

I agree that there needs to be a way to process a dataset through the common menu. We need to think carefully about this though as it needs to be really intuitive how to adjust the dataset which is in essence not the case right now given your second point. Did you check the documentation though? I thought I was quite clear in that.

What is the deal with Panel and Compare? I follow the structure of a typical regression (with obviously the exclusion of the intercept):
image

E.g. panel DEPENDENT INDEPENDENT INDEPENDENT INDEPENDENT. This is similar to me as Stata, see: https://www.reed.edu/psychology/stata/analyses/parametric/Regression/regress.html

What I dislike about fitting everything under one command is that the command is going to do a lot. These are tests that are relevant to determine if your results actually make sense or need to be adjusted. Then again, I understand your point so let's give it a shot.

@DidierRLopes DidierRLopes marked this pull request as ready for review May 31, 2022 00:22
@JerBouma
Copy link
Contributor

Some comments:

Point 1
type is missing a way to see what types exists in each column. I want this to return when I simply type type. This is because I need to understand whether I need to adjust a certain column.

2022 Feb 25, 08:08 (✨) /econometrics/ $ type
           wp
┏━━━━━━━━━━━━┳━━━━━━━━━━┓
┃ columns    ┃ dtypes   ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━┩
│ nr         │ int64    │
├────────────┼──────────┤
│ year       │ category │
├────────────┼──────────┤
│ black      │ int64    │
├────────────┼──────────┤
│ exper      │ int64    │
├────────────┼──────────┤
│ hisp       │ int64    │
├────────────┼──────────┤
│ hours      │ int64    │
├────────────┼──────────┤
│ married    │ int64    │
├────────────┼──────────┤
│ educ       │ int64    │
├────────────┼──────────┤
│ union      │ int64    │
├────────────┼──────────┤
│ lwage      │ float64  │
├────────────┼──────────┤
│ expersq    │ int64    │
├────────────┼──────────┤
│ occupation │ int64    │
└────────────┴──────────┘

This command should return:

  • with just type all types of all loaded in datasets
  • with type wage_panel just the types of the wage_panel dataset
  • with type wage_panel.year show just the type of the year column of wage_panel
  • with type wage_panel.year -f category change the actual type.

Point 2
I can not do anything with index if I just call it, it actually returns an error:

2022 May 31, 04:19 (🦋) /econometrics/ $ index wp
Error: argument of type 'NoneType' is not iterable

What I am looking for is that it returns me the current index (e.g. tell me that the current index set is X with type Y and lets me know if I want to change the index I should type index wp -i INDEX1,INDEX2) when I type index wp and that setting a multi-index is also clear that this is the way to do it: index wp -i nr,year.

Point 3
I have a feeling that panel is now broken:

2022 May 31, 04:30 (🦋) /econometrics/ $ panel -i wp.lwage -d wp.nr

Error: local variable 'df' referenced before assignment

Error: not enough values to unpack (expected 4, got 0)

2022 May 31, 04:31 (🦋) /econometrics/ $ panel -d wp.lwage -i wp.black,wp.hisp,wp.exper,wp.expersq,wp.married,wp.educ,wp.union,wp.year

Error: local variable 'df' referenced before assignment

Error: not enough values to unpack (expected 4, got 0)

2022 May 31, 04:32 (🦋) /econometrics/ $

Point 4
I believe bpag is broken too displaying NaNs.

2022 May 31, 04:34 (🦋) /econometrics/ $ bpag

Breusch-Pagan heteroscedasticity test
┏━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃          ┃ Breusch-Pagan ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ lm-stat  │ 37.22         │
├──────────┼───────────────┤
│ p-value  │ nan           │
├──────────┼───────────────┤
│ f-stat   │ 58.69         │
├──────────┼───────────────┤
│ fp-value │ 0.00          │
└──────────┴───────────────┘

The result nan indicates no existence of heteroscedasticity.

Point 5
I believe coint is also broken:

2022 May 31, 04:36 (🦋) /econometrics/ $ coint -t wp.black wp.lwage
The following args couldn't be interpreted: ['wp.lwage']
More than one dataset.column must be provided.

2022 May 31, 04:36 (🦋) /econometrics/ $ coint -t wp.black,wp.lwage
More than one dataset.column must be provided.

2022 May 31, 04:36 (🦋) /econometrics/ $

@JerBouma
Copy link
Contributor

New one, when you rename columns it doesn't update correctly. Here I renamed year to date.

2022 May 31, 04:58 (🦋) /econometrics/ $ type wp.date -f category
usage: type -n {wp.nr,wp.year,wp.black,wp.exper,wp.hisp,wp.hours,wp.married,wp.educ,wp.union,wp.lwage,wp.expersq,wp.occupation} -f {int,float,str,bool,category,date} [-h]
type: error: argument -n/--name: invalid choice: 'wp.date' (choose from 'wp.nr', 'wp.year', 'wp.black', 'wp.exper', 'wp.hisp', 'wp.hours', 'wp.married', 'wp.educ', 'wp.union', 'wp.lwage', 'wp.expersq', 'wp.occupation')


2022 May 31, 04:59 (🦋) /econometrics/ $ type wp.year -f category
Error: 'year'

2022 May 31, 04:59 (🦋) /econometrics/ $

@JerBouma
Copy link
Contributor

desc also not working correctly, you would want to see summary statistics from all variables not just one and it crashes when you want to call help.

2022 May 31, 05:02 (🦋) /econometrics/ $ desc wp.nr

Statistics for dataset: 'wp
┏━━━━━━━━┳━━━━━━┓
┃        ┃ nr   ┃
┡━━━━━━━━╇━━━━━━┩
│ count  │ 4360 │
├────────┼──────┤
│ unique │ 545  │
├────────┼──────┤
│ top    │ 13   │
├────────┼──────┤
│ freq   │ 8    │
└────────┴──────┘

2022 May 31, 05:02 (🦋) /econometrics/ $ desc wp
usage: desc -n {wp.nr,wp.year,wp.black,wp.exper,wp.hisp,wp.hours,wp.married,wp.educ,wp.union,wp.lwage,wp.expersq,wp.occupation} [-h] [--export EXPORT]
desc: error: argument -n/--name: invalid choice: 'wp' (choose from 'wp.nr', 'wp.year', 'wp.black', 'wp.exper', 'wp.hisp', 'wp.hours', 'wp.married', 'wp.educ', 'wp.union', 'wp.lwage', 'wp.expersq', 'wp.occupation')

2022 May 31, 05:02 (🦋) /econometrics/ $ desc -h
usage: desc -n {wp.nr,wp.year,wp.black,wp.exper,wp.hisp,wp.hours,wp.married,wp.educ,wp.union,wp.lwage,wp.expersq,wp.occupation} [-h] [--export EXPORT]
desc: error: the following arguments are required: -n/--name

@JerBouma
Copy link
Contributor

@DidierRLopes Made updates to the docs, not fully done though since some stuff is still broken.

@JerBouma
Copy link
Contributor

JerBouma commented Jun 1, 2022

Made some improvements as well as update all the docs.

E.g, this text now tells you what to do:

2022 Jun 01, 07:01 (🦋) /econometrics/ $ load wage_panel -a wp

2022 Jun 01, 07:01 (🦋) /econometrics/ $ panel -d wp.nr -i wp.year
The column 'nr' from the dataset 'wp' is not a MultiIndex. Make sure you set the index correctly with the index (e.g. index wp -i nr,year) command where the first level is the entity (e.g. Tesla Inc.) and the second level the date (e.g. 2021-03-31)

2022 Jun 01, 07:01 (🦋) /econometrics/ $

And I wanted it to be more robust if you select two dependent variables for some reason:

2022 Jun 01, 07:02 (🦋) /econometrics/ $ panel -d wp.nr,wp.year  -i wp.year
It appears you have selected multiple variables for the dependent variable. The model only accepts one.
Did you intend to include these variables as independent variables? Use -i wp.nr,wp.year in this case.

2022 Jun 01, 07:03 (🦋) /econometrics/ $

@DidierRLopes DidierRLopes merged commit 49560ab into main Jun 1, 2022
@Chavithra Chavithra deleted the econometrics branch June 7, 2022 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Fix bug enhancement Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants