Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New cartodbfy function (overwrites CDB_CartodbfyTable) #78

Merged
merged 37 commits into from
Aug 19, 2015

Conversation

javisantana
Copy link
Contributor

Still needs to be fully tested (partially tested now) using
the existing regression tests. Does not manage the timestamp
columns at this time.

pramsey added 2 commits April 17, 2015 17:53
Still needs to be fully tested (partially tested now) using
the existing regression tests. Does not manage the timestamp
columns at this time.
with no SRID in the metadata (thanks mate).
newrelname := relationname || '_' || i;

IF i > 100 THEN
RAISE EXCEPTION '_CDB_Unique_Relation_Name looping too far';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we would have to add all the new exceptions to the calling code (importer) so they can be managed, wouldn't we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if they all had a common format? Then the calling code could just extract the relevant message text and pass it on... right now all the exceptions are of the "I'm dead, call an ambulance" variety.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL. I guess that would be pretty awesome, cc @Kartones

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already parse some outputs from ogr2ogr/sql/etc, so perfect if we format them to distinguish this kind of "naming failures" (and similar cartodbfication issues) from fatal errors

@rafatower
Copy link
Contributor

FYI @pramsey I'll be working on integrating this with the editor CartoDB/cartodb#4962

@javisantana
Copy link
Contributor Author

one question that is open:

should we add a index to the_geom?

generating index for the_geom takes a good amount of time and I'm not so sure if it's useful at all

what is your opinion @pramsey @Kartones @rafatower @rochoa ?

@pramsey
Copy link
Contributor

pramsey commented Aug 10, 2015

Gotta do it. It's not useful to cartodb mainline functions, but SQL API users swap back and forth between the_geom based queries and the_geom_webmercator queries all the time. Having a surprise "your query will suck unless you add an index" case would reduce the Magic of CartoDB.

INTO rec
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE c.relname = newrelname
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it a common practice to loop until valid name is found? Put it this way: can we filter by `c.relname ilike newrelname || '%' and decide the name based on existing names?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two tricks: finding only names that match the established "pattern" for these things (foo_2) and finding the "next" name. Probably a query that regexp_matches the current names can do both at once and spit out an answer suitable for appending the "next" number to. This isn't a particularly deep loop, but replacing loops with SQL is always a good idea, IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried a regexp_match approach, and it's really ugly and starts with an identity ("does this relation already exist?") query anyways, so I think the simplicity of the loops is better to stick with.

WITH raw AS (SELECT c.relname, n.nspname, ( select regexp_matches(c.relname, E'bar_(\\d+)') )[1]::integer + 1 as c
    FROM pg_class c
    JOIN pg_namespace n ON c.relnamespace = n.oid
    WHERE ( c.relname = 'bar' OR c.relname ~ ('bar' || E'_\\d+$') )
    AND n.nspname = 'public' order by c desc) SELECT relname, nspname, case when c is null then 0 else c end as c from raw order by c desc;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK, just wanted to know how ugly it could be the other approach. Now we know it so let's go with the loop 👍

@pramsey
Copy link
Contributor

pramsey commented Aug 10, 2015

Important note: the current code handles raster tables, this code doesn't.

@rafatower
Copy link
Contributor

We also need to account for raster tables

@pramsey
Copy link
Contributor

pramsey commented Aug 10, 2015

Mostly the current raster support just seems to be adding triggers and assuming that all columns are already correctly named, so it's pretty brittle.

@rafatower
Copy link
Contributor

Just got this error testing everything together with the triggers enabled.

2015-08-10 16:10:34 UTC LOG:  statement: 
                SELECT cartodb.CDB_CartodbfyTable2('five_countries_15', 'public');

2015-08-10 16:10:34 UTC ERROR:  syntax error at or near "8022947" at character 78
2015-08-10 16:10:34 UTC QUERY:  CREATE trigger track_updates AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE ON 8022947 FOR EACH STATEMENT EXECUTE PROCEDURE cartodb.cdb_tablemetadata_trigger()
2015-08-10 16:10:34 UTC CONTEXT:  PL/pgSQL function _cdb_create_triggers(text,regclass) line 9 at EXECUTE statement
        SQL statement "SELECT _CDB_create_triggers(destschema, reloid)"
        PL/pgSQL function cdb_cartodbfytable2(regclass,text) line 47 at PERFORM
2015-08-10 16:10:34 UTC STATEMENT:  
                SELECT cartodb.CDB_CartodbfyTable2('five_countries_15', 'public');

@pramsey
Copy link
Contributor

pramsey commented Aug 10, 2015

Sorry, it's not just enabling the call, it's changing one param too, so it looks like

  -- Add triggers to the destination table, as necessary
  PERFORM _CDB_create_triggers(destschema, destoid);

@rafatower
Copy link
Contributor

thanks for pointing out so quickly :)

@pramsey
Copy link
Contributor

pramsey commented Aug 10, 2015

For raster tables, just copying the logic from the existing code looks like this

  -- Rasters only get a cartodb_id and a limited selection of triggers
  -- underlying assumption is that they are already formed up correctly
  SELECT cartodb._CDB_is_raster_table(schema_name, reloid) INTO is_raster;
  IF is_raster THEN

    PERFORM cartodb._CDB_create_cartodb_id_column(reloid);
    PERFORM cartodb._CDB_create_raster_triggers(destschema, reloid);

  ELSE

    -- Rewrite (or rename) the table to the new location
    PERFORM _CDB_Rewrite_Table(reloid, destschema);

    -- The old regclass might not be valid anymore if we re-wrote the table...
    destoid := (destschema || '.' || destname)::regclass;

    -- Add indexes to the destination table, as necessary
    PERFORM _CDB_Add_Indexes(destoid);

    -- Add triggers to the destination table, as necessary
    PERFORM _CDB_create_triggers(destschema, destoid);

  END IF;

I'm not sure how happy we really are w/ this, probably we should do the full song-and-dance with rasters, in terms of ensuring that columns have the right SRS, right raster column names, etc, etc, etc. There's a whole extra layer for raster though, which is there's a hard expectation of certain overview raster tables existing, no?

@rafatower
Copy link
Contributor

I'm not sure how happy we really are w/ this, probably we should do the full song-and-dance with rasters...

I guess it depends on roadmap. I think we can go ahead with the copied logic as you suggest, but summoning @javisantana just in case.

@javisantana
Copy link
Contributor Author

@pramsey could you add the raster code to the PR?
could you also check that new function works with ogr2ogr? (I'm thinking here on FME)

Rafa de la Torre and others added 17 commits August 11, 2015 19:52
by removing references to created_at and updated_at columns
Now the cartodbfied table is a bit smaller because it does not have the
timestamp columns.
When creating triggers, expectation is to have the columns the_geom and
the_geom_webmercator even if the source table does not have any geometry
columns. Populate it in the rewrite with NULL values and right types.
Do not create timestamp columns/triggers on cartodbfy
- Delete old CDB_CartodbfyTable code
- Delete auxiliary functions no longer used
- Modify the new CDB_CartodbfyTable signature to be backwards
  compatible.
with Cartodbfy being invoked by schema triggers. Some
issues with regclass interpretation in tests still remain.
Some issues with slightly different behavior to old version
remain. Some issues with error messages / notification messages
changing a little still remain.
This logic SHOULD BE MOVED TO Cartodbfy internals.
Just touch expected output to adapt to NOTICEs and other stuff that
don't affect functionality.
Comment out tests that check cartodb_id text columns. These are no
longer taken into consideration as candidate primary ID (candidate
columns should be numeric).
Replace CDB_CartodbfyTable by new CartodbfyTable2
Rafa de la Torre added 3 commits August 17, 2015 15:27
Now cartodbfyied tables take less space because of the timestamp
columns.
@rafatower rafatower changed the title [wip] First draft of new cartodbfy function (named CDB_CartodbfyTable2) [wip] New cartodbfy function (overwrites CDB_CartodbfyTable) Aug 17, 2015
rafatower pushed a commit that referenced this pull request Aug 19, 2015
[wip] New cartodbfy function (overwrites CDB_CartodbfyTable)
@rafatower rafatower merged commit 74e6807 into master Aug 19, 2015
@rafatower rafatower deleted the new_cartodbfy branch August 19, 2015 13:12
@rafatower rafatower changed the title [wip] New cartodbfy function (overwrites CDB_CartodbfyTable) New cartodbfy function (overwrites CDB_CartodbfyTable) Aug 19, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants