New cartodbfy function (overwrites CDB_CartodbfyTable) #78
Still needs to be fully tested (partially tested now) using the existing regression tests. Does not manage the timestamp columns at this time.
with no SRID in the metadata (thanks mate).
newrelname := relationname || '_' || i;

IF i > 100 THEN
  RAISE EXCEPTION '_CDB_Unique_Relation_Name looping too far';
I guess we would have to add all the new exceptions to the calling code (importer) so they can be managed, wouldn't we?
What if they all had a common format? Then the calling code could just extract the relevant message text and pass it on... right now all the exceptions are of the "I'm dead, call an ambulance" variety.
LOL. I guess that would be pretty awesome, cc @Kartones
We already parse some outputs from ogr2ogr/sql/etc, so perfect if we format them to distinguish this kind of "naming failures" (and similar cartodbfication issues) from fatal errors
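To make the "common format" idea concrete, here is a minimal sketch of what such an exception could look like. The SQLSTATE `CDB01`, the `cartodbfy:` message prefix, and the hint text are all hypothetical conventions invented for illustration; nothing in this PR defines them.

```sql
-- Hypothetical convention: a custom SQLSTATE ('CDB01') plus a fixed message prefix.
DO $$
BEGIN
  BEGIN
    RAISE EXCEPTION 'cartodbfy: % looping too far', '_CDB_Unique_Relation_Name'
      USING ERRCODE = 'CDB01',
            HINT = 'Rename the table or drop stale copies and retry';
  EXCEPTION WHEN SQLSTATE 'CDB01' THEN
    -- A caller can branch on SQLSTATE instead of parsing free-form text.
    RAISE NOTICE 'recoverable cartodbfy error: %', SQLERRM;
  END;
END;
$$;
```

With something like this, the calling code could treat one SQLSTATE class as "naming failures" and let everything else surface as fatal.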
a valid SRID on the geometry objects themselves
geo columns are "perfect" to start w/.
keys are in fact unique.
FYI @pramsey I'll be working on integrating this with the editor CartoDB/cartodb#4962
One open question: should we add an index to the_geom? Generating an index for the_geom takes a good amount of time, and I'm not sure it's useful at all. What is your opinion, @pramsey @Kartones @rafatower @rochoa?
Gotta do it. It's not useful to cartodb mainline functions, but SQL API users swap back and forth between
INTO rec
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE c.relname = newrelname
Is it common practice to loop until a valid name is found? Put it this way: can we filter by `c.relname ilike newrelname || '%'` and decide the name based on existing names?
Two tricks: finding only names that match the established "pattern" for these things (foo_2) and finding the "next" name. Probably a query that regexp_matches the current names can do both at once and spit out an answer suitable for appending the "next" number to. This isn't a particularly deep loop, but replacing loops with SQL is always a good idea, IMO.
I tried a regexp_match approach, and it's really ugly and starts with an identity ("does this relation already exist?") query anyways, so I think the simplicity of the loops is better to stick with.
WITH raw AS (
  SELECT c.relname, n.nspname,
         (SELECT regexp_matches(c.relname, E'bar_(\\d+)'))[1]::integer + 1 AS c
  FROM pg_class c
  JOIN pg_namespace n ON c.relnamespace = n.oid
  WHERE (c.relname = 'bar' OR c.relname ~ ('bar' || E'_\\d+$'))
    AND n.nspname = 'public'
  ORDER BY c DESC
)
SELECT relname, nspname, CASE WHEN c IS NULL THEN 0 ELSE c END AS c FROM raw ORDER BY c DESC;
I'm OK with that; I just wanted to know how ugly the other approach could be. Now we know, so let's go with the loop 👍
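For readers following the thread, the loop approach agreed on here can be sketched roughly as below. This is not the PR's actual `_CDB_Unique_Relation_Name`; the function name `unique_relation_name_sketch` and its parameters are hypothetical, and only the fragments quoted in the diff above (the `pg_class`/`pg_namespace` lookup, the `_i` suffix, the 100-iteration guard) come from the source.

```sql
-- Hypothetical sketch; not the PR's actual _CDB_Unique_Relation_Name.
CREATE OR REPLACE FUNCTION unique_relation_name_sketch(schemaname text, relationname text)
RETURNS text AS $$
DECLARE
  newrelname text := relationname;
  i integer := 0;
BEGIN
  LOOP
    -- Identity check: does a relation with this name already exist in the schema?
    PERFORM 1
    FROM pg_class c
    JOIN pg_namespace n ON c.relnamespace = n.oid
    WHERE c.relname = newrelname
      AND n.nspname = schemaname;

    -- FOUND is false when PERFORM matched no rows, so the name is free.
    IF NOT FOUND THEN
      RETURN newrelname;
    END IF;

    -- Otherwise try the next suffix: bar_1, bar_2, ...
    i := i + 1;
    newrelname := relationname || '_' || i;

    IF i > 100 THEN
      RAISE EXCEPTION 'unique_relation_name_sketch looping too far';
    END IF;
  END LOOP;
END;
$$ LANGUAGE plpgsql;
```

A call such as `SELECT unique_relation_name_sketch('public', 'bar');` returns `bar` when no relation of that name exists in the schema, and the first free suffixed name otherwise.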
Important note: the current code handles raster tables, this code doesn't.
We also need to account for raster tables
Mostly the current raster support just seems to be adding triggers and assuming that all columns are already correctly named, so it's pretty brittle.
Just got this error testing everything together with the triggers enabled.
Sorry, it's not just enabling the call, it's changing one param too, so it looks like
Thanks for pointing it out so quickly :)
For raster tables, just copying the logic from the existing code looks like this
I'm not sure how happy we really are w/ this; we should probably do the full song-and-dance with rasters: ensuring that columns have the right SRS, the right raster column names, etc. There's a whole extra layer for raster though: there's a hard expectation of certain overview raster tables existing, no?
I guess it depends on roadmap. I think we can go ahead with the copied logic as you suggest, but summoning @javisantana just in case.
@pramsey could you add the raster code to the PR?
by removing references to created_at and updated_at columns
Now the cartodbfied table is a bit smaller because it does not have the timestamp columns.
When creating triggers, the expectation is to have the columns the_geom and the_geom_webmercator even if the source table does not have any geometry columns. Populate them in the rewrite with NULL values and the right types.
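As an illustration of that commit's idea, a table rewrite can add typed NULL geometry columns along these lines. This is a sketch under stated assumptions, not the PR's code: it requires PostGIS, the table names are hypothetical, and the SRIDs 4326 and 3857 are assumed to be the ones expected for the_geom and the_geom_webmercator.

```sql
-- Sketch only: assumes PostGIS is installed; table names are hypothetical.
CREATE TABLE sketch_src (id integer, name text);
INSERT INTO sketch_src VALUES (1, 'a'), (2, 'b');

-- Rewrite the table, adding typed NULL geometry columns so that later
-- triggers find the_geom and the_geom_webmercator with the types they expect.
-- SRIDs 4326 and 3857 are assumptions, not taken from the PR.
CREATE TABLE sketch_dst AS
SELECT id AS cartodb_id,
       name,
       NULL::geometry(Geometry, 4326) AS the_geom,
       NULL::geometry(Geometry, 3857) AS the_geom_webmercator
FROM sketch_src;
```

Casting NULL to a typed geometry keeps the column's type and SRID in the new table even though no row carries a value.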
Do not create timestamp columns/triggers on cartodbfy
- Delete old CDB_CartodbfyTable code
- Delete auxiliary functions no longer used
- Modify the new CDB_CartodbfyTable signature to be backwards compatible.
with Cartodbfy being invoked by schema triggers. Some issues with regclass interpretation in tests still remain. Some issues with slightly different behavior to old version remain. Some issues with error messages / notification messages changing a little still remain.
This logic SHOULD BE MOVED TO Cartodbfy internals.
Just touch expected output to adapt to NOTICEs and other stuff that don't affect functionality.
only used in tests
Comment out tests that check cartodb_id text columns. These are no longer taken into consideration as candidate primary ID (candidate columns should be numeric).
Replace CDB_CartodbfyTable by new CartodbfyTable2
Conflicts: test/CDB_QuotaTest.sql
Now cartodbfied tables take less space because they no longer have the timestamp columns.
Needed in order to be able to test and roll back. See https://github.com/CartoDB/cartodb-postgresql/blob/master/CONTRIBUTING.md#testing-changes-live
Just found cartodbfy failed for schema-names-with-dashes. This should fix it.
[wip] New cartodbfy function (overwrites CDB_CartodbfyTable)
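On the schema-names-with-dashes failure mentioned in the commits above: the PR text does not show the actual fix, but a common cause of this class of bug is building dynamic SQL by plain string concatenation. A minimal sketch of the usual remedy, quoting identifiers with `format()` and `%I`, with hypothetical schema and table names:

```sql
-- Hypothetical names; this is not the PR's actual fix.
CREATE SCHEMA IF NOT EXISTS "my-schema";
CREATE TABLE IF NOT EXISTS "my-schema".my_table (id integer);

DO $$
DECLARE
  schemaname text := 'my-schema';
  tablename  text := 'my_table';
  n bigint;
BEGIN
  -- Plain concatenation would yield my-schema.my_table, which fails to parse.
  -- %I quotes each identifier only when needed: "my-schema".my_table
  EXECUTE format('SELECT count(*) FROM %I.%I', schemaname, tablename) INTO n;
  RAISE NOTICE 'rows: %', n;
END;
$$;
```

`quote_ident()` applied to each part gives the same result where `format()` is not convenient.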