New cartodbfy function (overwrites CDB_CartodbfyTable) #78

javisantana · 2015-04-20T11:20:36Z

Still needs to be fully tested (partially tested now) using
the existing regression tests. Does not manage the timestamp
columns at this time.

rochoa · 2015-04-21T13:53:54Z

scripts-available/CDB_CartodbfyTable.sql

+    newrelname := relationname || '_' || i;
+
+    IF i > 100 THEN
+      RAISE EXCEPTION '_CDB_Unique_Relation_Name looping too far';


I guess we would have to add all the new exceptions to the calling code (importer) so they can be managed, wouldn't we?

What if they all had a common format? Then the calling code could just extract the relevant message text and pass it on... right now all the exceptions are of the "I'm dead, call an ambulance" variety.

LOL. I guess that would be pretty awesome, cc @Kartones

We already parse some outputs from ogr2ogr/sql/etc, so perfect if we format them to distinguish this kind of "naming failures" (and similar cartodbfication issues) from fatal errors
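To make the common-format idea concrete, here is a minimal sketch of how calling code could separate these errors from fatal ones. The `CDB(<function name>): <message>` layout is only an assumption for illustration, since the thread does not settle on a format, and `parse_cdb_error` is a hypothetical helper, not something that exists in the importer.

```python
import re
from typing import NamedTuple, Optional

# Assumed (hypothetical) layout: "CDB(<function name>): <message text>"
_CDB_ERROR = re.compile(
    r"^CDB\((?P<function>[^)]+)\):\s*(?P<message>.+)$", re.DOTALL
)


class CdbError(NamedTuple):
    function: str  # name of the SQL function that raised the error
    message: str   # human-readable text that can be passed on to the user


def parse_cdb_error(raw: str) -> Optional[CdbError]:
    """Return the parsed error, or None if the text is not in the assumed format."""
    match = _CDB_ERROR.match(raw.strip())
    if match is None:
        return None  # not a cartodbfy-style error; treat it as fatal
    return CdbError(match.group("function"), match.group("message").strip())
```

Anything that does not match falls through as `None`, so the caller can keep treating unknown errors as fatal.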

rafatower · 2015-08-10T10:29:00Z

FYI @pramsey I'll be working on integrating this with the editor CartoDB/cartodb#4962

javisantana · 2015-08-10T12:39:40Z

one question that is open:

should we add an index to the_geom?

generating the index for the_geom takes a good amount of time and I'm not sure it's useful at all

what is your opinion @pramsey @Kartones @rafatower @rochoa ?

pramsey · 2015-08-10T12:41:27Z

Gotta do it. It's not useful to cartodb mainline functions, but SQL API users swap back and forth between the_geom based queries and the_geom_webmercator queries all the time. Having a surprise "your query will suck unless you add an index" case would reduce the Magic of CartoDB.

rochoa · 2015-08-10T13:08:51Z

scripts-available/CDB_CartodbfyTable.sql

+    INTO rec
+    FROM pg_class c
+    JOIN pg_namespace n ON c.relnamespace = n.oid
+    WHERE c.relname = newrelname


Is it common practice to loop until a valid name is found? Put it this way: can we filter by `c.relname ilike newrelname || '%'` and decide the name based on existing names?

Two tricks: finding only names that match the established "pattern" for these things (foo_2) and finding the "next" name. Probably a query that regexp_matches the current names can do both at once and spit out an answer suitable for appending the "next" number to. This isn't a particularly deep loop, but replacing loops with SQL is always a good idea, IMO.

I tried a regexp_match approach, and it's really ugly and starts with an identity ("does this relation already exist?") query anyways, so I think the simplicity of the loops is better to stick with.

WITH raw AS (
  SELECT c.relname, n.nspname,
         ( select regexp_matches(c.relname, E'bar_(\\d+)') )[1]::integer + 1 as c
  FROM pg_class c
  JOIN pg_namespace n ON c.relnamespace = n.oid
  WHERE ( c.relname = 'bar' OR c.relname ~ ('bar' || E'_\\d+$') )
    AND n.nspname = 'public'
  order by c desc
)
SELECT relname, nspname, case when c is null then 0 else c end as c
from raw
order by c desc;

I'm OK with that, I just wanted to know how ugly the other approach could be. Now we know, so let's go with the loop 👍
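For reference, the loop approach the thread settles on can be sketched as follows. This is an illustration only: `unique_relation_name` and the `exists` callback are hypothetical stand-ins for the PL/pgSQL function and its `pg_class` lookup, and starting the suffix at 0 is an assumption; only the `_<i>` suffix and the `i > 100` guard come from the diff.

```python
from typing import Callable

MAX_SUFFIX = 100  # mirrors the "i > 100" guard in the diff


def unique_relation_name(base: str, exists: Callable[[str], bool]) -> str:
    """Return base if it is free, otherwise the first free base_<i>."""
    if not exists(base):
        return base
    # Starting at 0 is an assumption; the diff does not show the initial value.
    for i in range(MAX_SUFFIX + 1):
        candidate = f"{base}_{i}"
        if not exists(candidate):
            return candidate
    raise RuntimeError("unique_relation_name looping too far")
```

Each iteration is a single identity check ("does this relation already exist?"), which is also the query the regexp variant has to start with.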

pramsey · 2015-08-10T15:43:33Z

Important note: the current code handles raster tables, this code doesn't.

rafatower · 2015-08-10T15:53:10Z

We also need to account for raster tables

pramsey · 2015-08-10T15:56:40Z

Mostly the current raster support just seems to be adding triggers and assuming that all columns are already correctly named, so it's pretty brittle.

rafatower · 2015-08-10T16:16:40Z

Just got this error testing everything together with the triggers enabled.

2015-08-10 16:10:34 UTC LOG:  statement: 
                SELECT cartodb.CDB_CartodbfyTable2('five_countries_15', 'public');

2015-08-10 16:10:34 UTC ERROR:  syntax error at or near "8022947" at character 78
2015-08-10 16:10:34 UTC QUERY:  CREATE trigger track_updates AFTER INSERT OR UPDATE OR DELETE OR TRUNCATE ON 8022947 FOR EACH STATEMENT EXECUTE PROCEDURE cartodb.cdb_tablemetadata_trigger()
2015-08-10 16:10:34 UTC CONTEXT:  PL/pgSQL function _cdb_create_triggers(text,regclass) line 9 at EXECUTE statement
        SQL statement "SELECT _CDB_create_triggers(destschema, reloid)"
        PL/pgSQL function cdb_cartodbfytable2(regclass,text) line 47 at PERFORM
2015-08-10 16:10:34 UTC STATEMENT:  
                SELECT cartodb.CDB_CartodbfyTable2('five_countries_15', 'public');

pramsey · 2015-08-10T16:19:07Z

Sorry, it's not just enabling the call, it's changing one param too, so it looks like

  -- Add triggers to the destination table, as necessary
  PERFORM _CDB_create_triggers(destschema, destoid);

rafatower · 2015-08-10T16:25:41Z

thanks for pointing that out so quickly :)

pramsey · 2015-08-10T16:35:04Z

For raster tables, just copying the logic from the existing code looks like this

  -- Rasters only get a cartodb_id and a limited selection of triggers
  -- underlying assumption is that they are already formed up correctly
  SELECT cartodb._CDB_is_raster_table(schema_name, reloid) INTO is_raster;
  IF is_raster THEN

    PERFORM cartodb._CDB_create_cartodb_id_column(reloid);
    PERFORM cartodb._CDB_create_raster_triggers(destschema, reloid);

  ELSE

    -- Rewrite (or rename) the table to the new location
    PERFORM _CDB_Rewrite_Table(reloid, destschema);

    -- The old regclass might not be valid anymore if we re-wrote the table...
    destoid := (destschema || '.' || destname)::regclass;

    -- Add indexes to the destination table, as necessary
    PERFORM _CDB_Add_Indexes(destoid);

    -- Add triggers to the destination table, as necessary
    PERFORM _CDB_create_triggers(destschema, destoid);

  END IF;

I'm not sure how happy we really are w/ this. Probably we should do the full song-and-dance with rasters, in terms of ensuring that columns have the right SRS, the right raster column names, etc. There's a whole extra layer for raster though: there's a hard expectation of certain overview raster tables existing, no?

rafatower · 2015-08-10T16:47:39Z

> I'm not sure how happy we really are w/ this, probably we should do the full song-and-dance with rasters...

I guess it depends on roadmap. I think we can go ahead with the copied logic as you suggest, but summoning @javisantana just in case.

javisantana · 2015-08-11T10:28:42Z

@pramsey could you add the raster code to the PR?
could you also check that the new function works with ogr2ogr? (I'm thinking of FME here)

with Cartodbfy being invoked by schema triggers. Some issues with regclass interpretation in tests still remain. Some issues with slightly different behavior to old version remain. Some issues with error messages / notification messages changing a little still remain.

pramsey added 2 commits April 17, 2015 17:53

First draft of new cartodbfy function (named CDB_CartodbfyTable2)

f3c20ac

Still needs to be fully tested (partially tested now) using the existing regression tests. Does not manage the timestamp columns at this time.

Fix Rambo's test case, of a single geometry-only table

14414c4

with no SRID in the metadata (thanks mate).

rochoa reviewed Apr 21, 2015

pramsey added 7 commits April 21, 2015 06:58

Handle geometry column with no metadata SRID (grrr) but

bb68579

a valid SRID on the geometry objects themselves

Fix bug with missing non-geo columns in case where

74b7740

geo columns are "perfect" to start w/.

Re-use columns named 'cartodb_id' if the values of the

8dc7f45

keys are in fact unique.

Document functions a bit more

614a446

Use standard error message format

dd209d0

Fix no-op case error

c1bfef2

Break routine into two halves

0899c64

javisantana mentioned this pull request May 22, 2015

167K rows file import fails sometimes because of lat/long guessing CartoDB/cartodb#3740

Closed

rafatower mentioned this pull request Aug 10, 2015

Deploy the new CDB_CartodbfyTable2 to production CartoDB/cartodb#4962

Closed

rochoa reviewed Aug 10, 2015

Enable trigger addition routine

b195aa4

Rafa de la Torre and others added 17 commits August 11, 2015 19:52

Fix CDB_CartodbfyTableTest

c11d1bb

by removing references to created_at and updated_at columns

Fix quota test

6b9ab3d

Now the cartodbfied table is a bit smaller because it does not have the timestamp columns.

Fix for the_geom does not exist

8c41203

When creating triggers, expectation is to have the columns the_geom and the_geom_webmercator even if the source table does not have any geometry columns. Populate it in the rewrite with NULL values and right types.

Recover test for cartodb_id not-null constraint

8a031f5

Merge pull request #107 from CartoDB/new_cartodbfy_rtorre

c00d607

Do not create timestamp columns/triggers on cartodbfy

Replace CDB_CartodbfyTable by new CartodbfyTable2

a5321ec

- Delete old CDB_CartodbfyTable code - Delete auxiliary functions no longer used - Modify the new CDB_CartodbfyTable signature to be backwards compatible.

Fix regclass mismatch on column alter/drop

7f55a02

This logic SHOULD BE MOVED TO Cartodbfy internals.

Tweak expected output of test_ddl_triggers

f211669

Just touch expected output to adapt to NOTICEs and other stuff that don't affect functionality.

Fix deletion of cartodb_postgresql_unpriv_user

3d89d82

only used in tests

Recover _CDB_check_prerequisites (sorry, my fault)

72ebc39

Simple fix for type cheking in test

010dd13

Disable a couple of tests

900531f

Comment out tests that check cartodb_id text columns. These are no longer taken into consideration as candidate primary ID (candidate columns should be numeric).

Add minor piece of doc

b7b5be1

Make cartodbfy return destoid

565edcb

Use return value from cartodbfy

47d8429

Merge pull request #109 from CartoDB/new_cartodbfy_bw_compat_signature

a61a92a

Replace CDB_CartodbfyTable by new CartodbfyTable2

pramsey mentioned this pull request Aug 14, 2015

Replace CDB_CartodbfyTable by new CartodbfyTable2 #109

Merged

Rafa de la Torre added 3 commits August 17, 2015 15:27

Merge remote-tracking branch 'origin/master' into new_cartodbfy

2b48f90

Conflicts: test/CDB_QuotaTest.sql

Drop function in order to change return value

3f588df

Fix for quota test after merge with master

5375423

Now cartodbfyied tables take less space because of the timestamp columns.

rafatower changed the title ~~[wip] First draft of new cartodbfy function (named CDB_CartodbfyTable2)~~ [wip] New cartodbfy function (overwrites CDB_CartodbfyTable) Aug 17, 2015

Rafa de la Torre added 4 commits August 18, 2015 15:15

Go back to version 0.8.2

ed97d87

Needed in order to be able to test and roll back. See https://github.com/CartoDB/cartodb-postgresql/blob/master/CONTRIBUTING.md#testing-changes-live

Fix for schema-with-dashes

78bf202

Review of format strings and escaping of id's

805af3b

Just found cartodbfy failed for schema-names-with-dashes. This should fix it.

Version updated to 0.9.0 plus release notes

f71a2ac

rafatower pushed a commit that referenced this pull request Aug 19, 2015

Merge pull request #78 from CartoDB/new_cartodbfy

74e6807

[wip] New cartodbfy function (overwrites CDB_CartodbfyTable)

rafatower merged commit 74e6807 into master Aug 19, 2015

rafatower deleted the new_cartodbfy branch August 19, 2015 13:12

rafatower changed the title ~~[wip] New cartodbfy function (overwrites CDB_CartodbfyTable)~~ New cartodbfy function (overwrites CDB_CartodbfyTable) Aug 19, 2015
