Error in check_datevar(dt_input, date_var): Date variable has duplicated dates. please clean data first [Panel data] #418

Closed
Anil-Pothala opened this issue Jun 24, 2022 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

@Anil-Pothala

I have been exploring Robyn to do Marketing Mix Modeling on my dataset. The dataset is monthly data on promotional activity for each customer, and it looks like this:

Id Date Revenue Channels…
Customer 1 Jan-2021
…..
Customer 1 Dec-2021
Customer 2 Feb-2021
…..
Customer 2 Dec-2021

We have over 1000 customers, each with monthly channel-activity data. We have been able to build linear regression models to estimate the impact of each channel. When we tried to run this data through Robyn, we got a duplicate date error. Does this mean we have to run Robyn for each customer separately? In that case we would have only 12 data points per model, and getting daily or weekly data is not possible for us. Is there any way to run this kind of data on Robyn?

Environment & Robyn version

R version (R 4.0.5)

@laresbernardo laresbernardo self-assigned this Jun 24, 2022
@laresbernardo
Collaborator

laresbernardo commented Jun 24, 2022

Hi @Anil-Pothala

I see you've shared this question as an issue here, but also in a discussion ticket, a FB post, and you emailed + LinkedIn me directly. Please, let's just create a single ticket and we'll do our best to help you.

The reason you can't have more than one row per date is that Robyn doesn't deal with panel data. Models that do handle this kind of data are usually called "mixed effects models" or "hierarchical models" (with some nuanced differences). So yes, if you want an MMM for each customer/segment, as of today you'd have to run a separate model for each one. We do plan to develop a solution for this at some point, but the math is quite complex (hierarchical + ridge) and we haven't started yet.
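To make the duplicated-date situation concrete, here is a minimal base R sketch. It assumes a made-up panel data frame (the column names `id`, `date`, `revenue` and all values are hypothetical, not taken from the original dataset), shows why the date column contains duplicates, and splits the data into one data frame per customer so that each piece has unique dates. It does not call Robyn itself.

```r
# Hypothetical panel data: two customers with monthly rows (names and values are made up).
dt_panel <- data.frame(
  id      = rep(c("Customer 1", "Customer 2"), each = 3),
  date    = rep(as.Date(c("2021-01-01", "2021-02-01", "2021-03-01")), times = 2),
  revenue = c(100, 120, 90, 80, 85, 95)
)

# Each date appears once per customer, so the date column as a whole has duplicates.
has_dup_dates <- anyDuplicated(dt_panel$date) > 0

# Split into one data frame per customer; within each piece the dates are unique.
dt_by_customer <- split(dt_panel, dt_panel$id)
dup_per_customer <- vapply(
  dt_by_customer,
  function(d) anyDuplicated(d$date) > 0,
  logical(1)
)

print(has_dup_dates)     # TRUE for the full panel
print(dup_per_customer)  # FALSE for every customer
```

Each element of `dt_by_customer` could then be passed to a separate model run, which is the one-model-per-customer approach described above.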

On the other hand, as rules of thumb, we strongly recommend 1) weekly or daily data granularity, especially if you have <4 years' worth of data, and 2) a 1:10 column:row ratio (roughly 10 data points per variable).
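As a quick illustration of the second rule of thumb, the sketch below estimates how many variables a dataset can support at roughly 10 rows per variable. The helper `max_variables` is a hypothetical name introduced only for this example, and the threshold is just the rule of thumb quoted above, not a hard limit enforced by Robyn.

```r
# Hypothetical helper: number of variables supported at ~10 rows per variable.
max_variables <- function(n_rows, rows_per_variable = 10) {
  floor(n_rows / rows_per_variable)
}

monthly_one_year <- max_variables(12)  # 12 monthly rows for a single customer
weekly_one_year  <- max_variables(52)  # one year of weekly rows

print(monthly_one_year)  # 1
print(weekly_one_year)   # 5
```

With only 12 monthly rows per customer, the rule of thumb leaves room for about one variable, which is why a per-customer model on this data is hard to support.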

@laresbernardo laresbernardo added the enhancement New feature or request label Jun 24, 2022
@laresbernardo laresbernardo changed the title from "Error in check_datevar(dt_input, date_var) : Date variable has duplicated dates. please clean data first." to "Error in check_datevar(dt_input, date_var): Date variable has duplicated dates. please clean data first [Panel data]" Jun 24, 2022