Release Notes
===============================================================================
3.6 Series (201x/xx/xx - )
===============================================================================
3.6.0 (subaruboshi) 201x/xx/xx
* Version 3.6.0
This is the first release of the Pgpool-II 3.6 series, that is, a
major version upgrade from the 3.5 series.
* Overview
Major enhancements in Pgpool-II 3.6 include:
- Improved fail-over behavior. In streaming replication mode, client
sessions are no longer disconnected when a fail-over occurs if the
session does not use the failed standby server. If the primary server
goes down, however, all sessions are still disconnected.
It is also now possible to connect to Pgpool-II while it is retrying
health checks. Previously, all attempts to connect to Pgpool-II
failed during health check retries.
- New PGPOOL SET command. Certain configuration parameters can now be
changed on the fly within a session.
- The watchdog is significantly enhanced and is more reliable than in
previous releases.
- Handling of the extended query protocol (used by Java applications,
among others) in streaming replication mode is faster when many rows
are returned in a result set.
- Import the SQL parser of PostgreSQL 9.6.
- In some cases pg_terminate_backend() no longer triggers a fail-over.
- Change the documentation format from raw HTML to SGML.
The above items are explained in more detail in the sections below.
* Major Enhancements
- Improve the behavior of fail-over. (Tatsuo Ishii)
In streaming replication mode, client sessions are no longer
disconnected when a fail-over occurs if the session does not use the
failed standby server. If the primary server goes down, all sessions
are still disconnected. A health check timeout also causes full
session disconnection, but other health check errors, including
exhausting the retry count, do not.
For users' convenience, the "show pool_nodes" command now shows the
session-local load balance node, since this is important information
in case of fail-over: if the load balance node is not the failed
node, the session is not affected by the fail-over.
It is also now possible to connect to Pgpool-II while it is retrying
health checks. Previously, any attempt to connect to Pgpool-II failed
while a health check was running against a failed node, even if
fail_over_on_backend_error was off, because a Pgpool-II child first
tried to connect to all backends, including the failed one, and
exited when a connection failed (which of course it did). This is a
temporary situation that resolves once pgpool executes the fail-over,
but with health check retries enabled it can persist for a long time,
depending on health_check_max_retries and health_check_retry_delay.
To mitigate this, when an attempt to connect to a backend fails in
streaming replication mode and the node is not the primary, the child
now gives up on the failed node and skips to the next one instead of
exiting, and marks the failed node as "down" in its local status.
This lets the primary node be selected as the load balance node so
that every query is sent to the primary; if there are other healthy
standby nodes, one of them is chosen as the load balance node
instead.
After the session is over, the child process exits so that the local
status is not retained.
- Add PGPOOL SHOW, PGPOOL SET and PGPOOL RESET commands. (Muhammad Usama)
These are similar to PostgreSQL's SET and SHOW commands for GUC
variables. They add the ability to set and reset the values of
configuration parameters for the current session, using a new syntax
that mirrors PostgreSQL's SET and RESET variable syntax with the
PGPOOL keyword prepended.
The configuration parameters currently supported by PGPOOL
SHOW/SET/RESET are: log_statement, log_per_node_statement,
check_temp_table, check_unlogged_table, allow_sql_comments,
client_idle_limit, log_error_verbosity, client_min_messages,
log_min_messages and client_idle_limit_in_recovery.
- Sync inconsistent status of PostgreSQL nodes across Pgpool-II
instances after restart. (Muhammad Usama)
At Pgpool-II startup, the status of each configured backend node is
loaded from the backend status file or otherwise initialized by
querying the backend nodes. This technique works fine in stand-alone
mode, and also with the watchdog enabled as long as the status of the
backend nodes remains consistent until all Pgpool-II nodes are up and
running. But because Pgpool-II did not sync the backend node status
from the watchdog cluster at startup, Pgpool-II nodes participating
in the watchdog cluster could end up with different statuses for the
same backend, especially if the nodes start at different times and an
unavailable backend PostgreSQL node becomes available again in
between.
To solve this, the commit implements a new startup procedure for
standby Pgpool-II: a standby Pgpool-II now loads the backend status
of all configured PostgreSQL nodes from the watchdog
master/coordinator node at startup.
- Enhance the performance of SELECT when many rows are involved. (Tatsuo Ishii)
Pgpool-II used to flush data to the network (calling write(2)) every
time it sent a row ("Data Row" message) to the frontend. For example,
if 10,000 rows needed to be transferred, 10,000 write() calls were
issued, which is quite expensive. Since a "Command Complete" message
is sent after the row data anyway, it is enough to issue a single
write() together with the command complete message. There were also
unnecessary flushes in the handling of the command complete message.
Quick testing showed performance improvements of 47% to 62% in some
cases. Workloads that transfer only a few rows will not benefit,
since those rows need to be flushed to the network anyway.
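The write-batching idea above can be sketched as follows. This is an
illustrative Python model, not pgpool-II's actual C code; the class and
method names are invented for the example. It counts how many
write(2)-style flushes each strategy issues for 10,000 rows:

```python
# Sketch of the batching idea: instead of flushing the socket after every
# "Data Row" message, rows are appended to a buffer and a single flush is
# issued together with the "Command Complete" message.

class BatchingSender:
    def __init__(self):
        self.buffer = bytearray()
        self.flush_count = 0          # number of write(2)-style flushes issued

    def send_row(self, row: bytes):
        self.buffer += row            # buffer the row; no flush yet

    def complete(self, tag: bytes):
        self.buffer += tag
        self.flush()                  # one flush covers all rows + completion

    def flush(self):
        self.flush_count += 1         # stand-in for write(2) on the socket
        self.buffer.clear()

class NaiveSender(BatchingSender):
    def send_row(self, row: bytes):
        super().send_row(row)
        self.flush()                  # old behavior: flush after every row

naive, batched = NaiveSender(), BatchingSender()
for sender in (naive, batched):
    for _ in range(10000):
        sender.send_row(b"DataRow")
    sender.complete(b"CommandComplete")
print(naive.flush_count, batched.flush_count)   # 10001 1
```

The batched sender issues one flush instead of 10,001, which is where
the measured speedup comes from.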
- Import PostgreSQL 9.6's SQL parser. (Bo Peng)
This allows Pgpool-II to fully understand the newly added SQL syntax
such as COPY INSERT RETURNING.
- In some cases pg_terminate_backend() now does not trigger a fail-over. (Muhammad Usama)
The pg_terminate_backend function in PostgreSQL terminates a backend
connection. When this function kills a PostgreSQL backend that is
connected to Pgpool-II, the disconnection looks like a backend node
failure to Pgpool-II, because PostgreSQL does not tell the client
program that the connection is being killed by a pg_terminate_backend
call; on the client side it is indistinguishable from a node failure.
Solving this in Pgpool-II requires two things. First, the
pg_terminate_backend call must be identified in the query, along with
the Pgpool-II child process hosting the backend connection that will
be killed, so that Pgpool-II gets advance notice of the backend
termination. Second, the routing of a query containing
pg_terminate_backend needs new logic, so that the query is sent to
the PostgreSQL node that actually hosts the backend with the PID
referred to by pg_terminate_backend().
This commit handles pg_terminate_backend() as follows. SimpleQuery(),
the workhorse of simple query processing in Pgpool-II, searches the
query parse tree for a pg_terminate_backend() call. If one is found,
the next step is to locate the Pgpool-II child process and the
backend node of the connection whose PID is given as the function's
argument. Once the connection and child process are identified, the
swallow_termination flag (added to the ConnectionInfo structure by
this commit) is set for that connection in shared memory, and the
query destination is set to the backend node hosting that connection,
without calling pool_where_to_send(), so that the query goes to the
correct backend node.
When the query is routed to the correct node and the backend is
consequently killed, the resulting communication error on the
Pgpool-II side is recognized as caused by pg_terminate_backend rather
than a node failure, because the swallow_termination flag is already
set for the connection.
Some work remains:
pg_terminate_backend is not handled in the extended query protocol.
Only pg_terminate_backend(constant number) calls are supported; an
expression or subquery in the argument is not handled, e.g.
pgpool=# select pg_terminate_backend(1025); -- Supported
pgpool=# select pg_terminate_backend( 2 +1); -- NOT SUPPORTED
pgpool=# select pg_terminate_backend((select 1)); -- NOT SUPPORTED
Only one pg_terminate_backend call per query is handled.
- HTML documents are now generated from SGML documents.
(Muhammad Usama, Tatsuo Ishii, Bo Peng)
This is intended to improve structure, content and maintainability.
However, there is still tremendous room to improve the SGML
documents. Please help us!
* Other Enhancements
- Make authentication error messages more user friendly. (Tatsuo Ishii)
When an attempt to connect to a backend fails (including during
health checking), emit the error message sent by the backend, such as
"sorry, too many clients already", instead of "invalid authentication
message response type, Expecting 'R' and received '%c'".
- Tighten up health check timer expired condition in pool_check_fd().
(Muhammad Usama)
Check whether the signal was actually the health check timer
expiring, to make sure we do not declare a timer expiration when some
other signal arrived while waiting for health check data in
pool_check_fd().
- Add new script called "watchdog_setup". (Tatsuo Ishii)
watchdog_setup is a command to create a temporary installation of
Pgpool-II clusters with watchdog, mainly for testing.
- Add "-pg" option to pgpool_setup. (Tatsuo Ishii)
This is useful when you want to assign specific port numbers to
PostgreSQL while using pgpool_setup. Also, pgpool_setup is now
installed in the standard bin directory, the same as pgpool.
- Add "replication delay" column to "show pool_nodes". (Tatsuo Ishii)
This column shows the replication delay value in bytes if operated
in streaming replication mode.
- Do not update status file if all backend nodes are in down
status. (Chris Pacejo, Tatsuo Ishii)
This commit removes a data inconsistency in replication mode,
reported in [pgpool-general: 3918], by not updating the status file
when all backend nodes are in down status. This surprisingly simple
but smart solution was provided by Chris Pacejo.
- Allow multiple SSL cipher protocols to be used. (Muhammad Usama)
By replacing TLSv1_method() with SSLv23_method() when initializing an
SSL session, protocols other than TLSv1 can be used.
- Allow an arbitrary number of items in
black_function_list/white_function_list. (Muhammad Usama)
Previously there were fixed limits on these.
- Properly process empty queries (all comments). (Tatsuo Ishii)
Pgpool-II now recognizes a query consisting entirely of comments
(for example "/* DBD::Pg ping test v3.5.3 */"; note there is no ';')
as an empty query. Previously such a query was treated as an error.
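The recognition rule can be sketched as follows. This is a minimal
Python illustration of the idea, not pgpool-II's C implementation
(which uses its SQL parser); nested block comments are not handled in
this sketch:

```python
import re

def is_empty_query(sql: str) -> bool:
    # Strip /* ... */ block comments (non-nested) and -- line comments,
    # then check whether anything meaningful remains.
    no_block = re.sub(r"/\*.*?\*/", "", sql, flags=re.S)
    no_line = re.sub(r"--[^\n]*", "", no_block)
    return no_line.strip() in ("", ";")

print(is_empty_query("/* DBD::Pg ping test v3.5.3 */"))  # True
print(is_empty_query("SELECT 1"))                        # False
```

A query that reduces to nothing (or a lone ';') after comment removal
is answered with an empty query response instead of an error.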
- Add some warning messages for wd_authkey hash calculation
failures. (Yugo Nagata)
Sometimes the wd_authkey calculation fails for reasons other than an
authkey mismatch. The additional messages make these cases
distinguishable from each other.
* Changes
- Change the default value of search_primary_node_timeout from 10 to
300. (Tatsuo Ishii)
The prior default of 10 seconds is sometimes too short for a standby
to be promoted.
- Change the Makefile under the src/sql/ directory, as proposed in
[pgpool-hackers: 1611]. (Bo Peng)
- Change the PID length of pcp_proc_count command output to 6
characters long. (Bo Peng)
Previously, if a Pgpool-II process ID was over 5 characters long, the
6th character of the process ID was cut off. This commit changes the
process ID length of the pcp_proc_count command output to 6
characters.
- Redirect all user queries to primary server. (Tatsuo Ishii)
Previously, some user queries were sent to servers other than the
primary even when load_balance_mode = off. This commit changes the
behavior: if load_balance_mode = off in streaming replication mode,
all user queries are now sent to the primary server only.
* Bug fixes
- Fix the case where all backends are down and then one node is
attached. (Tatsuo Ishii)
When all backends are down, no connections are accepted. Suppose one
PostgreSQL server then comes back up and the node is attached using
pcp_attach_node, which finishes successfully. When a new connection
arrives, it is still refused, because the Pgpool-II child process
looks at its cached status, in which the recovered node is still down
if the mode is streaming replication (native replication and other
modes are fine). The solution is to force a restart of all pgpool
children when all nodes are down.
- Fix for avoiding downtime when Pgpool-II changes require a
restart. (Muhammad Usama)
To fix this, the verification mechanism for configuration parameter
values is reversed. Previously, standby nodes verified their
parameter values against the respective values on the master
Pgpool-II node and threw a FATAL error when an inconsistency was
found. With this commit, the verification responsibility is delegated
to the master Pgpool-II node: the master now verifies the
configuration parameter values of each joining standby node against
its local values and produces a WARNING message instead of an error
when they differ. This way, nodes with different configurations are
still allowed to join the watchdog cluster, and the user must watch
for configuration inconsistency warnings in the master Pgpool-II log
to avoid surprises at Pgpool-II master switch-over.
- Fix a problem with the watchdog failover_command locking mechanism. (Muhammad Usama)
Since Pgpool-II 3.5, the watchdog used separate individual locks for
each node-failover command (failover, failback and follow-master);
each lock was acquired just before executing the respective failover
script and released as soon as the script finished. This technique
was very efficient but had a problem: if the failover_command took
very little time and finished before the lock request from another
Pgpool-II node arrived, the other node was also granted the lock,
since the first node had already released it, and consequently both
nodes ended up executing the failover script. To fix this, we revert
to the tested failover interlocking design used prior to Pgpool-II
3.5, in which all the commands are locked at failover start by the
node that becomes the lock-holder, and each command lock is released
after its execution finishes. Only the lock-holder node is allowed to
acquire or release the individual command locks. That way the
lock-holder node keeps its lock-holder status throughout the span of
the failover execution, and the system becomes less time-sensitive.
- Disable strict aliasing optimization. (Tatsuo Ishii)
flatten_set_variable_args() was imported from PostgreSQL in Pgpool-II
3.5. For the code to work, the compiler flag -fno-strict-aliasing is
necessary (PostgreSQL uses it). Unfortunately the flag was not added
when the function was imported. To fix this, configure.ac was
modified.
- Do not use random() while generating MD5 salt. (Tatsuo Ishii)
random() should not be used in security-related applications. To
replace random(), PostmasterRandom() was imported from PostgreSQL.
Also, the current time at startup of the Pgpool-II main process is
now stored for later use.
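The problem with a seeded PRNG for salts can be illustrated as
follows. This is a Python sketch of the principle only, not
pgpool-II's actual code; os.urandom() stands in for a CSPRNG source in
the spirit of the PostmasterRandom() approach:

```python
import os
import random

def weak_salt() -> bytes:
    # A PRNG salt is fully determined by its seed; an attacker who can
    # guess the seed (e.g. derived from the process start time) can
    # predict every salt.
    random.seed(12345)
    return bytes(random.randrange(256) for _ in range(4))

def strong_salt() -> bytes:
    # Kernel CSPRNG: unpredictable, no guessable seed.
    return os.urandom(4)

print(weak_salt() == weak_salt())   # True: fully predictable
print(len(strong_salt()))           # 4
```

Once the seed is known, the "weak" salt repeats exactly, which defeats
the purpose of salting MD5 authentication exchanges.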
- Don't ignore sync message from frontend when query cache is enabled. (Tatsuo Ishii)
While a cached query result was being returned, a sync message sent
by the frontend was discarded. This is harmless in itself, because a
"ready for query" message is sent to the frontend afterward. The
problem is that an AccessShareLock held by the previous parse message
processing is not released until the backend receives the sync
message. The fix is to forward the sync message to the backend and
discard the "ready for query" message returned by the backend.
- Fix bug where Pgpool-II fails to start if listen_addresses is an
empty string. (bug 237) (Muhammad Usama)
The socket descriptor array (fds[]) did not get its end marker when
TCP listen addresses were not used.
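The sentinel convention behind this bug can be sketched as follows.
This is an illustrative Python model; the function names and the
sentinel value are invented for the example, not taken from pgpool-II:

```python
SENTINEL = -1   # end marker for the descriptor array

def build_fd_array(tcp_fds, unix_fd):
    # The bug: when no TCP addresses were configured, the end marker
    # was not written. The fix is to append it unconditionally.
    fds = list(tcp_fds)          # may be empty when listen_addresses = ''
    fds.append(unix_fd)
    fds.append(SENTINEL)
    return fds

def count_sockets(fds):
    # Iteration relies on the sentinel; without it, this walks past the
    # real entries into garbage.
    n = 0
    while fds[n] != SENTINEL:
        n += 1
    return n

print(count_sockets(build_fd_array([], 5)))      # 1: unix socket only
print(count_sockets(build_fd_array([3, 4], 5)))  # 3
```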
- Create regression log directory if it does not exist yet. (Tatsuo Ishii)
- Fix the error messages emitted when a socket operation fails. (Muhammad Usama)
When a socket operation failed, we closed the socket before throwing
an error, which overwrote the errno value, so the error log did not
print the correct failure reason. The solution is to save errno
before closing the socket and use the saved value when printing the
error description.
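The save-then-close pattern looks like this. A Python sketch of the
C-level fix; the function name is invented and Python's OSError stands
in for errno handling around close(2):

```python
import errno
import os

def report_socket_failure(sock_fd: int, failed_errno: int) -> str:
    saved_errno = failed_errno      # save errno from the failed call FIRST
    try:
        os.close(sock_fd)           # close may itself fail and set errno
    except OSError:
        pass                        # ignore; we already have the real cause
    return os.strerror(saved_errno) # report the original failure reason

# Usage: pretend a recv on one end of a pipe failed with ECONNRESET.
r, w = os.pipe()
os.close(w)
print(report_socket_failure(r, errno.ECONNRESET)
      == os.strerror(errno.ECONNRESET))   # True
```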
- Fix regression failure of 003.failover. (Tatsuo Ishii)
Update the expected data to reflect the changes made to show
pool_nodes. Also fix show pool_nodes to properly use "_" instead of a
space in the column name.
- Fix hang when a portal suspend message is received. (bug 230) (Tatsuo Ishii)
When a portal suspend message is received, it is not enough to
forward it to the client. Since the backend expects to receive an
execute message, trying to read more messages from the backend will
never succeed. To fix this, turn off the query-in-progress flag,
which makes Pgpool-II poll for incoming messages from the client.
- Fix pgpool not de-escalating the virtual IP when the network is
restored. (bug 228) (Muhammad Usama)
The set_state function now de-escalates when it changes the local
node's state from the coordinator state to some other state.
- SIGUSR1 signal handler should be installed before watchdog
initialization. (Muhammad Usama)
A failover request from another watchdog node can arrive just as the
watchdog has been initialized, and if the SIGUSR1 signal handler is
installed any later, this can result in a crash.
- Fix for bug of inconsistent status of PostgreSQL nodes in
Pgpool-II instances after restart. (bug 218) (Muhammad Usama)
The watchdog did not synchronize backend status. At Pgpool-II
startup, the status of each configured backend node is loaded from
the backend status file or otherwise initialized by querying the
backend nodes. This works fine in stand-alone mode, and also with the
watchdog enabled as long as the status of the backend nodes remains
consistent until all Pgpool-II nodes are up and running. But because
Pgpool-II did not sync the backend node status from the watchdog
cluster at startup, Pgpool-II nodes participating in the watchdog
cluster could get different statuses for the same backend, especially
if the nodes start at different times and an unavailable backend
PostgreSQL node becomes available again in between.
To solve this, the commit implements a new startup procedure for
standby Pgpool-II: a standby Pgpool-II now loads the backend status
of all configured PostgreSQL nodes from the watchdog
master/coordinator node at startup.
- Fix Pgpool-II not escalating the virtual IP when another node
becomes unavailable. (bug 215) (Muhammad Usama)
The heartbeat receiver failed to identify the heartbeat sender
watchdog node when the heartbeat destination was specified as an IP
address while wd_hostname was configured as a hostname string, or
vice versa.
- Fix a coding mistake in the watchdog code. (Muhammad Usama)
wd_issue_failover_lock_command() is supposed to forward the command
type passed in as an argument to wd_send_failover_sync_command(), but
instead it was passing the NODE_FAILBACK_CMD command type. The commit
also contains some log message enhancements.
- Display human readable output for backend node status. (Muhammad Usama)
The pcp_node_info utility and show commands now display a
human-readable backend status string instead of the internal status
code.
- Replace "MAJOR" macro to prevent occasional failure. (Tatsuo Ishii)
The macro calls pool_virtual_master_db_node_id() and then accesses
backend->slots[id]->con using the returned node id. In rare cases
(when the DB node is not connected), the slot could point to 0, so
accessing con->major caused a segfault.
- Fix "kind mismatch" error message in Pgpool-II. (Muhammad Usama)
Many "kind mismatch..." errors are caused by notice/warning messages
produced by one or more of the DB nodes. In this case Pgpool-II now
forwards the messages to the frontend rather than throwing the "kind
mismatch..." error. This should reduce the frequency of "kind
mismatch..." errors.
- Fix handling of pcp_listen_addresses config parameter. (Muhammad Usama)
- Save and restore errno in each signal handler. (Tatsuo Ishii)
- Fix usage of wait(2) in pgpool main process. (Tatsuo Ishii)
When a child process dies, a SIGCHLD signal is raised and wait(2)
learns of the event. However, multiple child deaths do not
necessarily generate the same number of SIGCHLD signals as there are
dead children, so wait(2) could end up waiting for an event that
never happens. This situation was actually encountered while testing
Pgpool-II. The solution is to use waitpid(2) instead of wait(2).
- Fix confusing error messages. (Tatsuo Ishii)
pool_read() did not emit an error message when read(2) returned -1
and fail_over_on_backend_error was off. The cause of the error should
be emitted in any case. This is not back-ported because it is too
trivial an enhancement.
- Fix buffer overrun problem in "show pool_nodes". (Tatsuo Ishii)
While processing "show pool_nodes", the buffer for the hostname was
too short; it should be the same size as the buffer used for
pgpool.conf. The problem was reported by a Twitter user running
pgpool on AWS (which can have very long hostnames).
- Fix [pgpool-hackers: 1638] pgpool-II does not use default configuration. (Muhammad Usama)
A missing configuration file should just produce a WARNING message
instead of an ERROR or FATAL.
- Fix bug with load balance node id info on shmem. (Tatsuo Ishii)
In a few places the load balance node id was mistakenly stored in the
wrong place. It should be stored at:
con_info[child id, connection pool id, backend id].load_balancing_node
but it was actually stored at:
con_info[child id, connection pool id, 0].load_balancing_node
As long as the backend id in question is 0, this is fine. However,
while testing Pgpool-II 3.6's fail-over enhancement, with primary
node 1 (the load balance node) and standby node 0, a client connected
to node 1 was disconnected when a fail-over happened on node 0. This
was unexpected and revealed the bug. The bug seems to have existed
for a long time, but for the reason above it had not been found until
now.
- Fix bug where pgpool hangs connections to the database. (bug 197) (Muhammad Usama)
A client connection could get stuck when a backend node and a remote
Pgpool-II node became unavailable at the same time. The cause was
missing command timeout handling in the function that sends IPC
commands to the watchdog.
- Fix a possible hang during health checking. (bug 204) (Yugo Nagata)
Health checking hung when no data was sent from the backend after
connect(2) succeeded. To fix this, pool_check_fd() now returns 1 when
select(2) exits with EINTR due to SIGALRM while health checking is
being performed.
- Deal with the case when the primary is not node 0 in streaming
replication mode. (Tatsuo Ishii)
http://www.pgpool.net/mantisbt/view.php?id=194#c837 reported that if
the primary is not node 0, a statement timeout could occur even after
bug194-3.3.diff was applied. Investigation showed that the MASTER
macro could return a node other than the primary or the load balance
node, which was not supposed to happen, so do_query() sent queries to
the wrong node (this is not clear from the report but was confirmed
during the investigation).
pool_virtual_master_db_node_id(), which is called by the MASTER
macro, returns query_context->virtual_master_node_id if a query
context exists. This could be the wrong node if the variable had not
been set yet. To fix this, the function is modified: if the variable
is neither the load balance node nor the primary node, the primary
node id is returned.
For master and 3.5-stable, additional fixes/enhancements are made:
pool_extended_send_and_wait() now issues a flush message if the
request is 'E' (execute). Before, the flush was issued outside (in
Execute), which made the logic for determining which node the flush
message should be sent to unnecessarily complex. A debug message in
pool_write is enhanced by adding the backend node id.
- Fix kind mismatch error caused by statement timeout on the
primary. (bug 194) (Tatsuo Ishii)
If statement timeout is enabled on the backend and do_query() sends a
query to the primary node while all subsequent user queries go to a
standby, the next command (for example END) could cause a statement
timeout error on the primary, and a kind mismatch error was raised in
pgpool-II. This fix mitigates the problem by sending a sync message
instead of a flush message in do_query(), expecting the sync message
to reset the statement timeout timer when in an explicit transaction.
This technique cannot be used for the implicit transaction case,
because the sync message removes the unnamed portal, if any. As a
bonus, pg_stat_statement no longer shows the query issued by
do_query() as "running".
- Fix extended protocol handling in raw mode. (Tatsuo Ishii)
Bug152 revealed that extended protocol handling in raw mode (actually
in any mode other than stream mode) was wrong in Describe() and
Close(). Unlike in stream mode, they should wait for the backend
response.
- Fix confusing comments in pgpool.conf. (Tatsuo Ishii)
- Fix Japanese and Chinese documentation bug about raw mode. (Yugo Nagata, Bo Peng)
Connection pool is available in raw mode.
- Fix is_set_transaction_serializable() when
SET default_transaction_isolation TO 'serializable'. (bug 191) (Bo Peng)
SET default_transaction_isolation TO 'serializable' was sent not only
to the primary but also to the standby servers in streaming
replication mode, which causes an error. The fix: in streaming
replication mode, SET default_transaction_isolation TO 'serializable'
is sent only to the primary server.
- Fix extended protocol hang with empty query. (bug 190) (Tatsuo Ishii)
The extended protocol fixes in 3.5.1 broke the empty query case. For
an empty query, the backend replies with an "empty query response",
which has the same meaning as a command complete message. The problem
was that when the empty query response was received, pgpool did not
reset the query-in-progress flag and kept waiting for the backend,
while the backend would not send the ready-for-query message until it
received a sync message. The fix is to reset the in-progress flag
after receiving the empty query response and to read from the
frontend, expecting it to send a sync message.
- Fix for [pgpool-general: 4569] Pgpool-II 3.5 : segfault. (Muhammad Usama)
PostgreSQL's memory and exception manager APIs adopted by Pgpool-II
3.4 are not thread safe and were causing a segmentation fault in the
watchdog lifecheck process, which uses threads to ping the configured
trusted hosts to check upstream connections. The fix is to remove the
threads and use a child process approach instead.
- Validate the PCP packet length. (Muhammad Usama)
Without this validation check, a malformed PCP packet could crash the
PCP child and/or run the server out of memory by claiming a very
large data size.
- Fix Pgpool-II hang bug (bug 167). (Tatsuo Ishii)
Pgpool-II 3.5 or later in streaming replication mode no longer waits for
a response at each phase such as parse and bind.
However, if do_query() is called, it sends a flush message to retrieve
the result of a system catalog lookup. This was sent only to the primary
node, which could result in retrieving the results of a previous message,
for example parse complete. If a standby is assigned as the load balance
node, that node does not return the parse complete message, which caused
a problem in the bug 167 case, because the parse message for "BEGIN" was
sent to both the primary and the standby. The fix is to also send the
flush message in do_query() when the load balance node is one of the standbys.
- Fix pgpool_setup to not confuse log output. (Tatsuo Ishii)
Previously it simply redirected the stdout and stderr of the pgpool process
to a log file. This could cause log contents to be garbled or even
lost because of a race condition caused by multiple processes
writing concurrently. Usama and I found this while investigating
the regression failure of 004.watchdog. To fix this, pgpool_setup
now generates a startall script so that pgpool sends stdout/stderr
to the cat command and cat writes to the log file (the race
condition does not seem to occur when writing to a pipe).
- Fix for [pgpool-general: 4519] Worker Processes Exit and
Are Not Re-spawned. (Muhammad Usama)
The problem was due to a logical mistake in the code that checks
the exiting child process type when the watchdog is enabled.
The severity of the message emitted when a child exits because the
max connection limit is reached has also been changed from FATAL to LOG.
- Fix pgpool hung after receiving error state from backend. (bug #169) (Tatsuo Ishii)
This could happen when an extended protocol query was executed
and failed. After an error was received, the "ignore till sync flag"
was set and retained even after a sync message was actually received.
Thus any subsequent query (in the case above, a "DEALLOCATE" message)
was not processed, and pgpool waited for a message from the frontend and
the backend, sticking there because no message would arrive from either side.
To fix this, unconditionally reset the "ignore till sync flag"
in ReadyforQuery(). This is safe because at that point we have already
received the ready for query message.
- Fix query stack problems in extended protocol case. (bug 167, 168) (Tatsuo Ishii)
- Fix [pgpool-hackers: 1440] yet another reset query stuck problem. (Tatsuo Ishii)
After receiving an X message from the frontend, if Pgpool-II detects EOF
on the connection before sending the reset query, Pgpool-II could wait for
a backend that had not received the reset query. To fix this,
when EOF is received, treat it as FRONTEND_ERROR rather than ERROR.
- Fix for [pgpool-general: 4265] another reset query stuck problem. (Muhammad Usama)
The solution is to report FRONTEND_ERROR instead of a simple ERROR
when pool_flush on the frontend socket fails.
- Fixing pgpool-recovery module compilation issue with PostgreSQL 9.6. (Muhammad Usama)
Incorporates the change of the function signature of GetConfigOption()
in PostgreSQL 9.6.
- Fix compile issue on FreeBSD. (Muhammad Usama)
Add missing include files. The patch is contributed by the bug
reporter and enhanced a little by me.
- Fix regression test to check timeout of each test. (Yugo Nagata)
- Add some warning messages for wd_authkey hash calculation failure. (Yugo Nagata)
Sometimes wd_authkey calculation fails for reasons other than an
authkey mismatch. The additional messages make these cases
distinguishable from each other.
===============================================================================
3.5 Series (2016/01/29 - )
===============================================================================
3.5.4 (ekieboshi) 2016/08/31
* Version 3.5.4
This is a bugfix release against pgpool-II 3.5.3.
__________________________________________________________________
* Bug fixes
- Fix buffer over run problem in "show pool_nodes". (Tatsuo Ishii)
While processing "show pool_nodes", the buffer for the hostname was too
short. It should be the same size as the buffer used for pgpool.conf.
The problem was reported by a Twitter user running pgpool on AWS (where
hostnames can be very long).
- Fix usage of wait(2) in pgpool main process. (Tatsuo Ishii)
The usage of wait(2) in the pgpool main process could cause an infinite
wait in the system call. The solution is to use waitpid(2) instead of wait(2).
- Save and restore errno in each signal handler. (Tatsuo Ishii)
- Fix handling of pcp_listen_addresses config parameter. (Muhammad Usama)
- Fix "kind mismatch" error message in pgpool. (Muhammad Usama)
Many "kind mismatch..." errors are caused by notice/warning
messages produced by one or more of the DB nodes. In this case
Pgpool-II now forwards the messages to the frontend rather than throwing
the "kind mismatch..." error. This reduces the chance of "kind
mismatch..." errors.
See [pgpool-hackers: 1501] for more details.
- Replace "MAJOR" macro to prevent occasional failure. (Tatsuo Ishii)
The macro calls pool_virtual_master_db_node_id() and then accesses
backend->slots[id]->con using the returned node id. In rare cases, the
slot could be 0 (when the DB node is not connected), and dereferencing
con->major then causes a segfault.
See bug 225 for related info.
- Fixing a coding mistake in watchdog code. (Muhammad Usama)
The wd_issue_failover_lock_command() function is supposed to forward the
command type passed in as an argument to wd_send_failover_sync_command(),
but instead it was passing the NODE_FAILBACK_CMD command type.
The commit also contains some log message enhancements.
- doc : Fixing a typo in the English doc. (Muhammad Usama)
- Fix for bug 215 that pgpool doesn't escalate IP in case of another node unavailability.
(Muhammad Usama)
The heartbeat receiver failed to identify the heartbeat sender watchdog
node when the heartbeat destination was specified as an IP address while
wd_hostname was configured as a hostname string, or vice versa.
See bug 215 for related info.
- Fix for bug of inconsistent status of PostgreSQL nodes in Pgpool-II instances
after restart. (Muhammad Usama)
The watchdog did not synchronize the node status.
See bug 218 for related info.
- SIGUSR1 signal handler should be installed before watchdog initialization.
(Muhammad Usama)
A failover request from other watchdog nodes can arrive just as the
watchdog has been initialized, so waiting any longer to install the
SIGUSR1 signal handler could result in a crash.
- Fix for bug 228 that pgpool doesn't de-escalate IP when the network is restored.
(Muhammad Usama)
See bug 228 for related info.
- Fix hang when portal suspend received. (Tatsuo Ishii)
See bug 230 for related info.
- test : Add regression test for bug 230. (Tatsuo Ishii)
- Fixing a typo in the log message. (Muhammad Usama)
- Fixing the error messages when the socket operation fails. (Muhammad Usama)
- Tighten up health check timer expired condition in pool_check_fd(). (Muhammad Usama)
- doc : Add comment to the document about connection_cache. (Tatsuo Ishii)
- Fix Handling of pcp_socket_dir was missing from pool_get_config(). (Muhammad Usama)
- doc : Fix Japanese document typo. (Bo Peng)
- Fix "out of memory" error when using "pg_md5 -m". (Muhammad Usama)
See bug 236 for related info.
- Fix for bug 237 that Pgpool-II fails to start if listen_addresses is an empty
string. (Muhammad Usama)
The socket descriptor array (fds[]) was not given its end marker
when TCP listen addresses are not used.
See bug 237 for related info.
===============================================================================
3.5.3 (ekieboshi) 2016/06/17
* Version 3.5.3
This is a bugfix release against pgpool-II 3.5.2.
__________________________________________________________________
* New features
- Allow access to pgpool while health checking is in progress (Tatsuo Ishii)
Previously, any attempt to connect to pgpool failed while pgpool was doing
a health check against a failed node, even if fail_over_on_backend_error
was off, because a pgpool child first tries to connect to all backends,
including the failed one, and exits if any connection attempt fails
(which, of course, it does). This is a temporary situation and is
resolved before pgpool executes failover. However, if the health check
is retrying, the temporary situation lasts longer depending on the
settings of health_check_max_retries and health_check_retry_delay. This
is not good. This change mitigates the problem:
- When an attempt to connect to a backend fails, give up connecting to
the failed node and skip to another node, rather than exiting the
process, if operating in streaming replication mode and the node is
not the primary node.
- Mark the local status of the failed node as "down".
- This lets the primary node be selected as the load balance node,
so all queries are sent to the primary node. If there are other
healthy standby nodes, one of them is chosen as the load
balance node.
- After the session is over, the child process exits so as not to
retain the local status.
Per [pgpool-hackers: 1531].
* Bug fixes
- Fix is_set_transaction_serializable() when
SET default_transaction_isolation TO 'serializable'. (Bo Peng)
SET default_transaction_isolation TO 'serializable' was sent not only to the
primary but also to the standby server in streaming replication mode,
which caused an error. The fix is, in streaming replication mode, to send
SET default_transaction_isolation TO 'serializable' only to the
primary server.
See bug 191 for related info.
- Fix Chinese documentation bug about raw mode (Yugo Nagata, Bo Peng)
Connection pool is available in raw mode.
- Fix confusing comments in pgpool.conf (Tatsuo Ishii)
- Fix extended protocol handling in raw mode (Tatsuo Ishii)
Bug 152 revealed that extended protocol handling in raw mode (actually,
in any mode other than stream mode) was wrong in Describe() and Close().
Unlike in stream mode, they should wait for the backend response.
See bug 152 for related info.
- Permit pgpool to support multiple SSL cipher protocols (Muhammad Usama)
Previously TLSv1_method() was used to initialize the SSL context, which
imposed the unnecessary limitation of allowing only the TLSv1 protocol for
SSL communication, while PostgreSQL supports other protocol versions as
well. The commit changes this and initializes the SSL session using
SSLv23_method() (also used by PostgreSQL), because it can negotiate the
use of the highest mutually supported protocol version, removing the
limitation to one specific protocol version.
- If statement timeout is enabled on the backend and do_query() sends a (Tatsuo Ishii)
query to the primary node while all subsequent user queries are sent to
a standby, it is possible that the next command, for example END, causes
a statement timeout error on the primary, and a kind mismatch
error is raised on pgpool-II.
This fix mitigates the problem by sending a sync message instead
of a flush message in do_query(), expecting that the sync message resets
the statement timeout timer if we are in an explicit transaction. We
cannot use this technique in the implicit transaction case, because the
sync message removes the unnamed portal if there is one.
In addition, pg_stat_statement will no longer show the query issued by
do_query() as "running".
See bug 194 for related info.
- Deal with the case when the primary is not node 0 in streaming replication mode.
(Tatsuo Ishii)
http://www.pgpool.net/mantisbt/view.php?id=194#c837 reported that if
the primary is not node 0, a statement timeout could still occur even after
bug194-3.3.diff was applied. After some investigation, it appeared
that the MASTER macro could return a node other than the primary or load
balance node, which was not supposed to happen, so do_query() sent queries
to the wrong node (this is not clear from the report but was confirmed
during the investigation).
pool_virtual_master_db_node_id(), which is called in the MASTER macro,
returns query_context->virtual_master_node_id if a query context
exists. This could return the wrong node if the variable had not been set
yet. To fix this, the function is modified: if the variable is neither
the load balance node nor the primary node, the primary node id is
returned.
- Fix a possible hang during health checking (Yugo Nagata)
Health checking hung when no data was sent from the backend
after connect(2) succeeded. To fix this,
pool_check_fd() returns 1 when select(2) exits with
EINTR due to SIGALRM while health checking is performed.
Reported, with a patch, by harukat; some modification
by Yugo.
See bug 204 for related info.
- Change the Makefile under src/sql/, as proposed in
[pgpool-hackers: 1611]. (Bo Peng)
- Fix for 0000197: pgpool hangs connections to database. (Muhammad Usama)
The client connection could get stuck when a backend node and a remote
pgpool-II node became unavailable at the same time. The reason was missing
command timeout handling in the function that sends IPC commands to the watchdog.
- Fix bug with load balance node id info on shmem (Tatsuo Ishii)
There were a few places where the load balance node id was mistakenly put
in the wrong place. It should be placed in:
ConnectionInfo *con_info[child id, connection pool id, backend id].load_balancing_node
In fact it was placed in:
*con_info[child id, connection pool id, 0].load_balancing_node
As long as the backend id in question is 0, this is fine. However, while
testing pgpool-II 3.6's enhancement regarding failover, if the primary
node is 1 (which is the load balance node) and the standby is 0, a client
connecting to node 1 was disconnected when failover happened on node
0. This was unexpected, and the bug was revealed.
It seems the bug had been there for a long time but had not been found
until now, for the reason above.
- Fixing coverity scan reported issues. (Muhammad Usama)
===============================================================================
3.5.2 (ekieboshi) 2016/04/26
* Version 3.5.2
This is a bugfix release against pgpool-II 3.5.1.
__________________________________________________________________
* Bug fixes
- Fix for segfault during trusted_servers check (Muhammad Usama)
PostgreSQL's memory and exception manager APIs adopted by
pgpool 3.4 are not thread safe and were causing a segmentation fault
in the watchdog lifecheck process, which uses threads to ping the
configured trusted hosts to check the upstream connections.
The fix is to remove the threads and use a child process approach instead.
See [pgpool-general: 4569] for more details.
- Removing the limit on the maximum number of items in the
black_function_list and white_function_list lists (Muhammad Usama)
extract_string_tokens in pool_config used a fixed-size malloc for
the array that holds the black_function_list/white_function_list items.
This imposed a limit on the maximum number of items in these lists.
The fix is to use realloc to increase the array size when it gets full.
- Fix check "PCP Directory" in "Parameter Setting" in install
(Nozomi Anzai)
The command "pcp_system_info" was removed in 3.5, but pgpoolAdmin
still checked whether it existed.