More than one join and stdin #235

Closed
aborruso opened this issue Mar 20, 2019 · 7 comments

@aborruso
Contributor

Hi, I have these 3 CSV files:

cat join_01.csv
id,name
1,andy
2,anna

cat join_02.csv
id,surname
1,torn
2,montana

cat join_03.csv
id,city
1,Paris
2,London

I would like to join 1 & 2, and then (1 & 2) with 3.

I start with

mlr --csv join -j id -f join_01.csv join_02.csv

But how can I join that result with file 3 without writing an intermediate output file?

Is there something like

mlr --csv join -j id -f join_01.csv join_02.csv then join -j id -f /stdin/ join_03.csv

Or something like

mlr --csv join -j id -f join_01.csv join_02.csv | mlr --csv join -j id -f /stdin/ join_03.csv

Or

mlr --csv join -j id -f *.csv

I know the answer to each of these is probably "no", but I think something like them would be very useful, with output like

id,name,surname,city
1,andy,torn,Paris
2,anna,montana,London

Thank you

@johnkerl
Owner

@aborruso

$ mlr --csv join -j id -f join_01.csv then join -j id -f join_02.csv join_03.csv
id,surname,name,city
1,torn,andy,Paris
2,montana,anna,London

?
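
The piped form from the question can also be sketched directly. This is a minimal sketch, assuming `mlr` is on the PATH; it relies on Miller reading standard input when no file name follows the verb, so no special `/stdin/` name is used. The scratch directory and the `joined` variable are only for illustration, and the column order may differ from the one wished for in the question.

```shell
# Recreate the three sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf 'id,name\n1,andy\n2,anna\n' > "$workdir/join_01.csv"
printf 'id,surname\n1,torn\n2,montana\n' > "$workdir/join_02.csv"
printf 'id,city\n1,Paris\n2,London\n' > "$workdir/join_03.csv"

# The first mlr joins files 1 and 2. The second mlr is given only a left
# file (-f join_03.csv) and no input file, so it reads the first join's
# output from standard input.
joined=$(
  mlr --csv join -j id -f "$workdir/join_01.csv" "$workdir/join_02.csv" |
    mlr --csv join -j id -f "$workdir/join_03.csv"
)
printf '%s\n' "$joined"

# Clean up the scratch directory.
rm -r "$workdir"
```

The single-process `then` chain above avoids the second process and the pipe, so it is probably the better default; the pipe form is mainly useful when the first stage is something other than Miller.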

@johnkerl
Owner

I'll type this up as an FAQ entry.

@aborruso
Contributor Author

Great, thank you

@aborruso
Contributor Author

I'm reopening it. You can close it once the FAQ entry is ready.

@johnkerl
Owner

http://johnkerl.org/miller-releases/miller-head/doc/cookbook.html#Doing_multiple_joins

Will be in 5.6.0

Thank you! :)

@NikosAlexandris
Contributor

I have multiple files with the same structure (i.e. the same header). For example, one of the files looks like:

→ mlr --csv cat input_1.csv
zone,label,mean,stddev
1,Barren,0.985039418507162,0.00327046755267665
2,Permanent Snow and Ice,0.990449367088603,0.00347695390530483
3,Water Bodies,0.989689587426295,0.00283130417558745
9,Urban and Built-up Lands,0.975935137657604,0.00444728462815199
10,Dense Forests,0.982209571011,0.00151525916704626
20,Open Forests,0.982498749692162,0.00156407685057156
25,Forest/Cropland Mosaics,0.983158952435782,0.00119917817740868
30,Natural Herbaceous,0.982886083655933,0.00172176084515656
35,Natural Herbaceous/Croplands Mosaics,0.983636363636364,0.000771389215405308
36,Herbaceous Croplands,0.983256814928586,0.00116699486215588
40,Shrublands,0.977095890410958,0.00439150607284168

I can repeat the above example for up to 3 input files:

→ mlr --csv join --ul --ur --lp l --rp r -j zone,label -f input_1.csv then join -j zone,label -f input_2.csv input_3.csv
zone,label,mean,stddev,lmean,lstddev,rmean,rstddev
1,Barren,0.985141452451229,0.00296063409807811,0.985039418507162,0.00327046755267665,0.984987557668154,0.00294390190031405
2,Permanent Snow and Ice,0.990172413793102,0.003303950103698,0.990449367088603,0.00347695390530483,0.989895569620253,0.0036093031143493
3,Water Bodies,0.988460091843363,0.00249696810252014,0.989689587426295,0.00283130417558745,0.988820568927725,0.00259939680954198
9,Urban and Built-up Lands,0.976210534599518,0.00436170313798978,0.975935137657604,0.00444728462815199,0.976246019422661,0.00448461857275749
10,Dense Forests,0.982076308739861,0.00148154340071296,0.982209571011,0.00151525916704626,0.982190048828062,0.00146571376545496
20,Open Forests,0.982497740034809,0.00153034810204273,0.982498749692162,0.00156407685057156,0.982466195761245,0.00150786476199308
25,Forest/Cropland Mosaics,0.983225779156313,0.00117294632959691,0.983158952435782,0.00119917817740868,0.983067448680308,0.00118920277246029
30,Natural Herbaceous,0.982983720528064,0.00166879860717559,0.982886083655933,0.00172176084515656,0.982925841727431,0.00165974257213183
35,Natural Herbaceous/Croplands Mosaics,0.983354838709678,0.00106402725807591,0.983636363636364,0.000771389215405308,0.983575757575758,0.000817620458171498
36,Herbaceous Croplands,0.983352463549144,0.00113906386917183,0.983256814928586,0.00116699486215588,0.983220069731363,0.00116521692353466
40,Shrublands,0.977145547945205,0.00442587899140026,0.977095890410958,0.00439150607284168,0.977166380789022,0.00444267104546084

How about doing this for many more input files?
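
One possible way to extend the `then` chain to an arbitrary number of files is to build the argument list in a loop. The following is only a sketch, not the answer given in #403: `join_all` is a hypothetical helper name, and the `MLR` variable is a hypothetical override (defaulting to `mlr`) that makes it easy to preview the generated command. It uses only the `join -j ... -f ...` and `then` syntax already shown in this thread, and it assumes the non-key column names differ between files; colliding columns such as `mean` and `stddev` would need to be renamed per file first, which this sketch does not attempt.

```shell
# join_all KEY FILE1 FILE2 [FILE3 ...]
# Builds: mlr --csv join -j KEY -f FILE1 then join -j KEY -f FILE2 ... FILE_LAST
# Plain POSIX sh has no local variables, so key, n, i and f stay set afterwards.
join_all() {
  if [ "$#" -lt 3 ]; then
    echo "usage: join_all KEY FILE1 FILE2 [FILE3 ...]" >&2
    return 1
  fi
  key=$1
  shift
  n=$#   # number of input files
  i=0
  for f in "$@"; do
    i=$((i + 1))
    if [ "$i" -eq 1 ]; then
      set --                  # start a fresh argument list on the first pass
    fi
    if [ "$i" -lt "$n" ]; then
      if [ "$i" -gt 1 ]; then
        set -- "$@" then      # chain verbs with "then"
      fi
      set -- "$@" join -j "$key" -f "$f"
    else
      set -- "$@" "$f"        # the last file is the main (streamed) input
    fi
  done
  "${MLR:-mlr}" --csv "$@"
}
```

Running `MLR=echo join_all id join_01.csv join_02.csv join_03.csv` prints the generated arguments instead of executing them, which is a cheap way to check the chain before pointing it at real data. A multi-field key such as `zone,label` can be passed as the first argument unchanged.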

@johnkerl
Owner

Hi @NikosAlexandris, I split this out as #403.
