Kong can't reach cassandra cluster when entrypoints goes down. #660

Closed
yoanisgil opened this issue Oct 27, 2015 · 29 comments

@yoanisgil

We're running Kong 0.5.1 (on Docker), and as part of our stress/load/chaos testing we abruptly stopped one Cassandra node (we're running a Cassandra cluster of 3 nodes with a replication factor of 1). It seems as though Kong was not able to recover, because all subsequent requests fail with:

2015/10/27 18:55:10 [error] 57#0: *124254 [lua] responses.lua:61: cb(): Cassandra error: Cassandra returned error (Unprepared): Prepared query with ID ecd6bc7ab3257b13b233aa5e5f091bde not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache), client: 172.17.42.1, server: _, request: "GET /crakbucks/status HTTP/1.1", host: "10.1.8.52:8000"
2015/10/27 18:55:10 [error] 57#0: *124255 [lua] responses.lua:61: cb(): Cassandra error: Cassandra returned error (Unprepared): Prepared query with ID ecd6bc7ab3257b13b233aa5e5f091bde not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache), client: 172.17.42.1, server: _, request: "GET /crakbucks/status HTTP/1.1", host: "10.1.8.52:8000"
2015/10/27 18:55:11 [error] 57#0: *124256 [lua] responses.lua:61: cb(): Cassandra error: Cassandra returned error (Unprepared): Prepared query with ID ecd6bc7ab3257b13b233aa5e5f091bde not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache), client: 172.17.42.1, server: _, request: "GET /crakbucks/status HTTP/1.1", host: "10.1.8.52:8000"

I know this was supposed to be fixed in #11; however, it seems that when one of the Cassandra entry points goes down, Kong is unable to reach Cassandra. Our Cassandra configuration in kong.yml looks like this:

databases_available:
  cassandra:
    properties:
      contact_points:
        - "10.1.8.43:9042"
        - "10.1.8.42:9042"
      timeout: 1000
      keyspace: kong
      keepalive: 60000 # in milliseconds
@shashiranjan84
Contributor

@yoanisgil A replication factor of 1 seems to be wrong: if a node goes down, so does its data, and Cassandra can't recover because the only copy of that data is gone. Any query looking for that specific data would fail.

Note: I am not a Cassandra expert, so I may be wrong.

@ahmadnassri
Contributor

indeed, we recommend at least a replication factor of 2: https://getkong.org/about/faq/#how-does-it-work

@thibaultcha
Member

This issue seems to be related to the lack of load balancing/retry policies in the Cassandra driver.

When connecting to the cluster, the driver shuffles the contact points and then tries to connect to them one by one until one replies (is up), which usually happens pretty quickly (no reason to put an invalid contact point in the configuration).

However, once connected, a socket is considered valid and is put in OpenResty's connection pool to be reused later on. On each subsequent query, Kong retrieves the socket from the connection pool and uses it to perform the query. Kong itself has no retry policy (that is the role of the driver, as implemented in the official Datastax drivers), so it will not try to perform the query on another node if the original node went down while the socket was waiting in the connection pool.

It seems to me the driver needs to implement such policies to handle this (fair) scenario.
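
To make the missing behavior concrete, here is a minimal Python sketch of retrying a query on the next contact point when one is unreachable. It is an illustration only, not the Lua driver's code: `NodeDown`, `query_with_retry` and the `run_query` callable are hypothetical names, and the nodes are simulated.

```python
import random


class NodeDown(Exception):
    """Raised by a (simulated) node that cannot be reached."""


def query_with_retry(contact_points, run_query, shuffle=True):
    """Try each contact point in turn until one answers.

    contact_points: list of "host:port" strings, as in kong.yml.
    run_query: hypothetical callable taking a contact point and returning
               a result, or raising NodeDown if that node is unreachable.
    """
    candidates = list(contact_points)
    if shuffle:
        random.shuffle(candidates)  # same idea as shuffling the contact points

    errors = {}
    for node in candidates:
        try:
            return node, run_query(node)
        except NodeDown as exc:
            errors[node] = str(exc)  # remember the failure, move to the next node
    raise NodeDown(f"all contact points failed: {errors}")


if __name__ == "__main__":
    down = {"10.1.8.43:9042"}

    def fake_query(node):
        if node in down:
            raise NodeDown("connection refused")
        return "ok"

    print(query_with_retry(["10.1.8.43:9042", "10.1.8.42:9042"], fake_query))
```

The point of the sketch is that the failure of one node is absorbed inside the query call instead of surfacing as a 500 to the client.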

@thibaultcha thibaultcha added area/DAO task/feature Requests for new features in Kong labels Oct 28, 2015
@yoanisgil
Author

@thibaultcha that definitely makes sense. We've managed to stabilize our cluster setup with 3 nodes and a replication factor of 3. Kong was able to continue working even after 2 nodes were down, but at some point after re-adding the two "failed" nodes we hit the same problem described at the top of this issue. A simple restart did the trick to get the gateway healthy.

My tests were more about performance, but tomorrow I will be more into chaos testing and I might have some more helpful/consistent/descriptive feedback.

In the meantime we will probably add a health check on http://$KONG_ENDPOINT:8001/apis, as well as on the Cassandra nodes, to make sure we can detect failures and recover as early as possible.

By the way, is there such a thing as a health endpoint within the Kong API?

Bests,

Yoanis.

@subnetmarco
Member

By the way, is there such a thing as a health endpoint within the Kong API?

@yoanisgil Not really a health endpoint, but we could extend it: https://getkong.org/docs/0.5.x/admin-api/#retrieve-node-status

@yoanisgil
Author

If calling this endpoint involves querying Cassandra then it will do. If not, I think checking on /apis will do, unless you recommend something better ;)
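
A probe along those lines could be sketched in Python as follows. It assumes the Admin API listens on port 8001 and that listing /apis goes through the datastore, as this thread supposes; `kong_is_healthy` and its injectable `opener` parameter are hypothetical names chosen so the probe can be exercised without a live Kong.

```python
import json
import urllib.error
import urllib.request


def kong_is_healthy(admin_url, timeout=2.0, opener=urllib.request.urlopen):
    """Return True if GET {admin_url}/apis answers 200 with a JSON body.

    opener is injectable so the probe can be tested without a running Kong.
    """
    try:
        with opener(f"{admin_url.rstrip('/')}/apis", timeout=timeout) as resp:
            if resp.status != 200:
                return False
            json.loads(resp.read().decode("utf-8"))  # body must be valid JSON
            return True
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or a non-JSON body.
        return False


if __name__ == "__main__":
    # Assumed address; replace with your own Admin API endpoint.
    print(kong_is_healthy("http://localhost:8001"))
```

A monitoring job could call this on an interval and alert or restart the node after a few consecutive failures.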

@subnetmarco
Member

Not in the current version, but in the next one we will be also retrieving stats from the database.

@sonicaghi
Member

open an issue?

@thibaultcha
Member

Back to the original issue here,

I opened an issue (thibaultcha/lua-cassandra#11) to discuss the retry policy implementation in the driver. I am still waiting for some feedback from the OpenResty Google group before jumping on it.

@thibaultcha thibaultcha self-assigned this Oct 29, 2015
@thibaultcha thibaultcha added this to the 0.6.0 milestone Oct 29, 2015
@yoanisgil
Author

So today I was a bit more into some "chaos testing". Below is the setup:

  • Single Kong node running on machine named aws038
  • 3 Cassandra nodes running on machines named aws034, aws038, aws07. All Cassandra nodes are running in containers.
  • Replication factor is three. This is updated after Kong runs migrations, using:
 ALTER KEYSPACE kong WITH replication = {'class': 'SimpleStrategy', 'replication_factor':3};

NOTE: aws is not related at all to Amazon Web Services.

After that I replayed a few thousand requests from our access log. During this replay test I abruptly stopped node aws074 using docker kill. Immediately afterwards I started to notice 500 errors at a rate of one per second. This is the output from /usr/local/kong/logs/error.log:

2015/10/29 17:54:23 [error] 60#0: *382446 [lua] responses.lua:61: cb(): Cassandra error: connection refused, client: 10.1.8.44, server: _, request: "GET /craktracking-track/v1/track/CR/86.190.179.577f37282f118b15f52745707ad48c06a4/1.3189.GB.15689.CRY_999520613_INH_MPOP_GB_4698_13884_ADV38839_17625_iOS_ipad_Oct2015?apikey=1232 HTTP/1.1", host: "aws038:8000"

and when I try to reload Kong this is what I get:

[INFO] Using configuration: /etc/kong/kong.yml
[INFO] Kong version.......0.5.1
       Proxy HTTP port....8000
       Proxy HTTPS port...8443
       Admin API port.....8001
       dnsmasq port.......8053
       Database...........cassandra keepalive=60000 timeout=1000 contact_points=10.1.8.43:9042,10.1.8.42:9042 keyspace=kong
[INFO] Connecting to the database...
[ERR] Cassandra error: closed

After bringing the Cassandra node back, errors were still showing at the same rate in /usr/local/kong/logs/error.log, but I was able to reload Kong.

@thibaultcha
Member

I am currently testing this and what I am experiencing so far is what I expected:

  • Single running Kong instance (my local development installation)
  • 3 Cassandra nodes running in Docker, replication factor of 2

Very small load, I am really trying to be able to read my error logs:

$ siege -t 5m -c 5 http://localhost:8000/mockbin/status/200

Everything is going fine as long as the Cassandra nodes are running. I decide to stop one of them. Errors appear:

2015/10/29 13:25:57 [error] 23643#0: *6078 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from localhost: closed, client: 127.0.0.1, server: _, request: "GET /mockbin/status/200 HTTP/1.1", host: "localhost:8000"
2015/10/29 13:25:57 [error] 23643#0: *6080 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from localhost: closed, client: 127.0.0.1, server: _, request: "GET /mockbin/status/200 HTTP/1.1", host: "localhost:8000"
2015/10/29 13:25:57 [error] 23643#0: *6082 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from localhost: closed, client: 127.0.0.1, server: _, request: "GET /mockbin/status/200 HTTP/1.1", host: "localhost:8000"

They disappear after a while: ultimately the reused socket is closed and no longer reused, and when the driver opens a new socket it checks whether the contact point is up. Since it is not, the driver skips that node every time from then on.

When I restart the Cassandra node, some errors appear:

2015/10/29 13:25:57 [error] 23643#0: *6078 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from localhost: closed, client: 127.0.0.1, server: _, request: "GET /mockbin/status/200 HTTP/1.1", host: "localhost:8000"
2015/10/29 13:25:57 [error] 23643#0: *6080 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from localhost: closed, client: 127.0.0.1, server: _, request: "GET /mockbin/status/200 HTTP/1.1", host: "localhost:8000"
2015/10/29 13:25:57 [error] 23643#0: *6082 [lua] responses.lua:61: cb(): Cassandra error: Failed to read frame header from localhost: closed, client: 127.0.0.1, server: _, request: "GET /mockbin/status/200 HTTP/1.1", host: "localhost:8000"

This is very probably due to Kong caching the prepared statements (on a per-host basis) and reusing them after the node was restarted and had forgotten about them. It is not an issue, though: those errors are logged, but Kong should handle re-preparation on the go and retry the query.
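
The re-preparation step can be sketched in Python with a simulated node. This is only an illustration of the idea, not Kong's or the driver's actual code: `FakeNode`, `Client` and `Unprepared` are hypothetical stand-ins, and statement IDs are derived from a hash for simplicity.

```python
class Unprepared(Exception):
    """Simulates Cassandra's 'Unprepared' error for an unknown statement ID."""


class FakeNode:
    """Stand-in for a Cassandra node with an in-memory prepared-statement cache."""

    def __init__(self):
        self.prepared = {}

    def prepare(self, cql):
        stmt_id = format(hash(cql) & 0xFFFFFFFF, "08x")
        self.prepared[stmt_id] = cql
        return stmt_id

    def execute(self, stmt_id):
        if stmt_id not in self.prepared:
            raise Unprepared(stmt_id)
        return f"rows for: {self.prepared[stmt_id]}"

    def restart(self):
        self.prepared.clear()  # a restarted node forgets every prepared statement


class Client:
    def __init__(self, node):
        self.node = node
        self.cache = {}  # cql -> statement ID, cached on the client side

    def run(self, cql):
        stmt_id = self.cache.get(cql)
        if stmt_id is None:
            stmt_id = self.cache[cql] = self.node.prepare(cql)
        try:
            return self.node.execute(stmt_id)
        except Unprepared:
            # The node lost the statement: prepare again, refresh the cache, retry once.
            stmt_id = self.cache[cql] = self.node.prepare(cql)
            return self.node.execute(stmt_id)


if __name__ == "__main__":
    node = FakeNode()
    client = Client(node)
    client.run("SELECT * FROM apis")
    node.restart()
    print(client.run("SELECT * FROM apis"))  # succeeds after one re-prepare
```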

After a while all the cached statements have been re-prepared and Kong is working without any error again. I am currently going through the official drivers to implement their retry policy in the Lua one for the first type of errors.


I am not sure why you encounter the "closed" error when reloading Kong, since the driver is supposed to go one by one through each contact_point and skip one if it is unreachable.

@yoanisgil
Author

@thibaultcha maybe it is the fact that you're using the local development installation? I'm running Kong using Docker; this is what my launch script looks like:

docker run -v /usr/local/etc/kong:/etc/kong --rm -p 8000:8000 -p 8001:8001 mashape/kong:0.5.1

As for the Kong configuration file, it is the default one adapted to include 2 Cassandra seed nodes.

Let me know if I can be of any help. I guess I could check out master and deploy a particular commit.

@thibaultcha
Member

maybe it is the fact that you're using the local development installation?

What do you mean? I think our results are similar?

I am using 0.5.2, the config file is the default too.

@yoanisgil
Author

You're right, our results are similar except for the connection error on Kong reload. I can give 0.5.2 a try to see if the results are 100% consistent.

@yoanisgil
Author

It seems as though both entry points need to be available before Kong can start. Below are my tests:

This is the command I'm using to launch kong:

docker run -v /usr/local/etc/kong:/etc/kong --name kong-node --rm -p 8000:8000 -p 8001:8001 mashape/kong:0.5.2

Scenario 1

Start Cassandra Node A and leave Cassandra Node B down.

Results from kong start:

[INFO] Using configuration: /etc/kong/kong.yml
[INFO] Kong version.......0.5.2
       Proxy HTTP port....8000
       Proxy HTTPS port...8443
       Admin API port.....8001
       dnsmasq port.......8053
       Database...........cassandra keepalive=60000 timeout=1000 contact_points=10.1.8.43:9042,10.1.8.42:9042 keyspace=kong
[INFO] Connecting to the database...
[ERR] Cassandra error: closed

Scenario 2

Start Cassandra Node B and leave Cassandra Node A down.

Results from kong start:

[INFO] Using configuration: /etc/kong/kong.yml
[INFO] Kong version.......0.5.2
       Proxy HTTP port....8000
       Proxy HTTPS port...8443
       Admin API port.....8001
       dnsmasq port.......8053
       Database...........cassandra keepalive=60000 timeout=1000 contact_points=10.1.8.43:9042,10.1.8.42:9042 keyspace=kong
[INFO] Connecting to the database...
[ERR] Cassandra error: closed

Scenario 3

Start Cassandra Node A and Cassandra Node B

Within this scenario Kong starts OK (though it might take two starts before it's successfully up).

I will now proceed to test Kong's behavior under an unexpected failure of 1 or 2 Cassandra nodes.

@yoanisgil
Author

@thibaultcha so I conducted the same tests as yesterday this time with Kong 0.5.2:

When I took two nodes down I experienced the same errors as yesterday. However when I brought the nodes back online, I saw this:

2015/10/30 17:47:06 [error] 61#0: *103565 [lua] responses.lua:61: cb(): Cassandra error: Cassandra returned error (Unprepared): Prepared query with ID ecd6bc7ab3257b13b233aa5e5f091bde not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache), client: 10.1.8.44, server: _, request: "GET /craktracking-track/v1/track/CR/46.136.110.16291ed881698668a5f064c6232d82961e9/2375197.1421.ES.5237?apikey=adf HTTP/1.1", host: "aws038:8000"
2015/10/30 17:47:06 [error] 61#0: *103566 [lua] responses.lua:61: cb(): Cassandra error: Cassandra returned error (Unprepared): Prepared query with ID ecd6bc7ab3257b13b233aa5e5f091bde not found (either the query was not prepared on this host (maybe the host has been restarted?) or you have prepared too many queries and it has been evicted from the internal cache), client: 10.1.8.44, server: _, request: "GET /craktracking-track/v1/track/CR/50.185.73.226351ba3be4f3674e45f388cb1e9b36625/1.1325.US.5013.TJ_689601_REDT_MPOP_US_4371_12989_ADV36318_0000_Android_mobile_Oct2015?apikey=ada HTTP/1.1", host: "aws038:8000"

Neither reload nor restart works; they both exit with this error:

[root@aws038 opscenter]# docker exec -ti kong-node kong reload
[INFO] Using configuration: /etc/kong/kong.yml
[INFO] Kong version.......0.5.2
       Proxy HTTP port....8000
       Proxy HTTPS port...8443
       Admin API port.....8001
       dnsmasq port.......8053
       Database...........cassandra keepalive=60000 timeout=1000 contact_points=10.1.8.43:9042,10.1.8.42:9042 keyspace=kong
[INFO] Connecting to the database...
[ERR] Inconsitency

even though the cluster looks healthy:

docker exec -ti cassandra-node nodetool status  kong
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.1.8.42  444.98 KB  256     100.0%            6a619057-3db4-446e-bb79-cdab2cb51bb8  rack1
UN  10.1.8.43  457.42 KB  256     100.0%            3a062e26-1d03-4aaf-989f-8b3dac7dadcd  rack1
UN  10.1.8.52  437.49 KB  256     100.0%            513b817c-4613-4619-8046-50997cd3940d  rack1

Finally, it does not seem like Kong was able to recover under any circumstances. I'm attaching a Grafana chart which shows the rate at which requests fail after taking two nodes down.

[Image: screenshot from 2015-10-30 13-56-27]

@thibaultcha
Member

Just an update on this. The new driver should be finished soon. It follows the implementation of the official Datastax drivers:

  • It will have cluster awareness, retry, reconnection, load balancing and address resolution policies (basic ones at least for now) as well as many more options to configure it.
  • It uses the contact_points simply as entry points to the cluster, then retrieves the peers and selects one for each query according to the load balancing policy. No more need to list all of your nodes as contact_points in kong.yml. If some nodes should not be hit by Kong, or hit less frequently, that is the job of the load balancing policy.
  • The only load balancing policy implemented yet is round robin, since it is the simplest to do, but it is still better than the current driver, which simply shuffles the contact_points and connects to one of them every time. The load should be slightly more balanced.
  • The only reconnection policy implemented is "exponential". If a node fails to answer, it is marked as down and not tried for a duration t. After t, the node is retried, and if it is still not available, it won't be retried for t^2, etc. If the node miraculously answers at one of the retries, we mark it as being up again.

Anybody can implement their own [retry, reconnection, load balancing, address resolution] policy and pass it to the driver as an option, which allows for much more flexibility and is future-proof (more policies should be implemented in the driver directly anyways).
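
As a rough illustration of how round robin selection and an exponential reconnection policy fit together, here is a small Python sketch. It is not the Lua driver's implementation: the class names are hypothetical, time is passed in explicitly, and the delay doubles on each failure up to a cap, which is an assumption (the schedule above is written as t, then t^2).

```python
import itertools


class ExponentialReconnection:
    """Delay before retrying a down node: base, 2*base, 4*base, ... up to a cap."""

    def __init__(self, base=1.0, cap=60.0):
        self.base, self.cap = base, cap

    def delay(self, failures):
        return min(self.cap, self.base * 2 ** (failures - 1))


class RoundRobinCluster:
    def __init__(self, nodes, reconnection=None):
        self.nodes = list(nodes)
        self.policy = reconnection or ExponentialReconnection()
        self.failures = {n: 0 for n in self.nodes}
        self.retry_at = {n: 0.0 for n in self.nodes}
        self._cycle = itertools.cycle(self.nodes)

    def next_node(self, now):
        """Return the next node in rotation that is up or due for a retry."""
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if now >= self.retry_at[node]:
                return node
        return None  # every node is down and still inside its back-off window

    def mark_down(self, node, now):
        # Each consecutive failure pushes the next attempt further away.
        self.failures[node] += 1
        self.retry_at[node] = now + self.policy.delay(self.failures[node])

    def mark_up(self, node):
        self.failures[node] = 0
        self.retry_at[node] = 0.0
```

Because both policies are plain objects handed to the cluster, swapping in a different reconnection schedule only means passing another object with a `delay` method.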

  • The driver will also be leveraging the ngx.shared.DICT API, which allows all nginx workers to have information about the cluster, which should be more efficient than a per-worker memory zone like we currently have.
  • It will also be better at handling prepared queries and when those are not prepared on some hosts.
  • And many other things not really relevant here.

My only concern is to test it (in production), since even with tests, I can't guarantee it will be perfect out of the box... Once that is done, I can start replacing the current one in Kong with it, which should be very quick.

@yoanisgil
Author

@thibaultcha thanks for the detailed update. For sure we can help with testing, since I have some decent test scenarios, which involve 3 Kong instances and 3 nodes and some monitoring as well.

When I first looked into Kong it was my expectation that contact_points were just that, a set of entry points to the cluster, and that Kong would later find out more about the cluster topology by reaching out to the configured nodes. I do believe that for a production environment there should be at least two contact_points.

Anyhow, send us an update of the Docker image and I will test as thoroughly as I can ;)

@samek

samek commented Dec 15, 2015

We just experienced this when a cassandra node got restarted.

@jtconnor

+1

We're testing a Cassandra cluster with 3 nodes and a replication factor of 2. When we take a Cassandra node down, Kong continues to run happily but when we bring the Cassandra node back up, Kong starts to return 500 errors and only recovers after restarting Kong. This is on [email protected]. In which Kong version do you anticipate including the driver fix?

@sonicaghi
Member

@jtconnor this coming one 0.6.0

@thibaultcha
Member

#803 just landed in next. It includes the new driver! As I already explained its behavior in a previous comment, I will just be brief:

  • Scenarios described here do not result in the errors they used to. If a node goes down (or multiple nodes), the driver labels them as "down" and the reconnection policy takes over, until the node is considered back up again.
  • Failed queries are retried on another coordinator (that is, the next valid node according to the load balancing policy).
  • The load balancing policy is now more effective than the basic shuffling previously implemented.
  • Assuming the replication factor is correctly set, as long as at least one node with a replica of the data is still up, everything should be smooth.
  • Kong can start even if some C* nodes are currently down.
  • The migrations should be more consistent as the driver correctly waits for a consensus upon the schema between nodes.

There are many scenarios when the new driver performs better and is more reliable, but I won't go through the details here.

Like @sinzone said, this will be released in 0.6.0, but we are hoping to get a release-candidate out by this week or the next one.

@yoanisgil
Author

@thibaultcha that's excellent news. Let me know once the release candidate is available and I will give it a try.

@jtconnor

Great! Thanks for the update!

@ahmadnassri
Contributor

0.6 is now released which includes the changes described here.

@slater-ben

Hi,

Just came across this (in doing some searching to try and figure out how Kong is actually configured to use Cassandra) and thought I could add some Cassandra info that might help clarify a few things (and that seemed to be missing):

To understand how Cassandra will respond to a failed node, you need to consider both the keyspace replication factor (RF) and the consistency level (CL) used for the query. It looks like the previous version of the driver used Quorum consistency. That means that a clear majority (i.e. > 50%) of replicas must respond to a query for it to be successful. So, with RF=3 and CL=Quorum, 2 out of 3 nodes must respond for success. With RF=2 and CL=Quorum you still need 2 nodes to respond for success (as 1 out of 2 is not > 50%), so you effectively don't have high availability. RF=3 and CL=Quorum is the most common in Cassandra usage, as it provides high availability and guaranteed strong consistency (reads guaranteed to reflect previous writes).

Looking at the code in the updated driver, it appears the consistency level has been changed to local_one. That means that only one replica needs to respond for a query to be successful (so you still get HA even with RF=2). However, it means you are in the world of eventual (not guaranteed) consistency, so Cassandra does not guarantee that all reads will reflect all previous writes.
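
The arithmetic can be checked with a few lines of Python. This is just a worked example under a single-datacenter assumption (so local_one behaves like one); the function names are made up for illustration.

```python
def replicas_required(rf, cl):
    """Number of replicas that must answer for a query to succeed."""
    cl = cl.upper()
    if cl in ("ONE", "LOCAL_ONE"):
        return 1
    if cl == "QUORUM":
        return rf // 2 + 1  # strict majority of the replicas
    if cl == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {cl}")


def tolerated_failures(rf, cl):
    """Replicas that can be down while queries at this level still succeed."""
    return rf - replicas_required(rf, cl)


if __name__ == "__main__":
    for rf, cl in [(3, "QUORUM"), (2, "QUORUM"), (2, "LOCAL_ONE"), (1, "QUORUM")]:
        print(rf, cl, replicas_required(rf, cl), tolerated_failures(rf, cl))
```

With RF=3 and Quorum one replica can be lost, with RF=2 and Quorum none can, and with RF=2 and local_one one can.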

For a fuller explanation see: https://www.instaclustr.com/blog/2015/04/03/pragmatic-availability-with-cassandra/

(Apologies for commenting on a closed issue but wasn't sure where else to put this.)

@thibaultcha
Member

The driver can receive any consistency as an option. If people find it relevant, we can add an option to kong.yml to use one other than the default.

@slater-ben

Yes, sorry - should have said default consistency changed. However, I thought some context on Cassandra consistency might be useful.

To be honest - I'm not sure how important consistency is in the Kong context - eventual consistency may well be good enough.

@thibaultcha
Member

Yeah, Cassandra is rarely queried by Kong, since Kong maintains its own cache. We mainly use it because of how easy it is to achieve distribution (assuming it is configured right) and because we use its counters to achieve distributed rate-limiting.

With Postgres support coming (and more importantly, SQL), we expect Cassandra usage to drop and only be used for distributed/HA setups.

We welcome any discussion though, so no trouble for commenting on a closed issue :)


kikito pushed a commit that referenced this issue Apr 23, 2024
### Summary

#### 2.6.0
```
Release 2.6.0 Tue February 6 2024
        Security fixes:
      #789 #814  CVE-2023-52425 -- Fix quadratic runtime issues with big tokens
                   that can cause denial of service, in partial where
                   dealing with compressed XML input.  Applications
                   that parsed a document in one go -- a single call to
                   functions XML_Parse or XML_ParseBuffer -- were not affected.
                   The smaller the chunks/buffers you use for parsing
                   previously, the bigger the problem prior to the fix.
                   Backporters should be careful to no omit parts of
                   pull request #789 and to include earlier pull request #771,
                   in order to not break the fix.
           #777  CVE-2023-52426 -- Fix billion laughs attacks for users
                   compiling *without* XML_DTD defined (which is not common).
                   Users with XML_DTD defined have been protected since
                   Expat >=2.4.0 (and that was CVE-2013-0340 back then).

        Bug fixes:
            #753  Fix parse-size-dependent "invalid token" error for
                    external entities that start with a byte order mark
            #780  Fix NULL pointer dereference in setContext via
                    XML_ExternalEntityParserCreate for compilation with
                    XML_DTD undefined
       #812 #813  Protect against closing entities out of order

        Other changes:
            #723  Improve support for arc4random/arc4random_buf
       #771 #788  Improve buffer growth in XML_GetBuffer and XML_Parse
       #761 #770  xmlwf: Support --help and --version
       #759 #770  xmlwf: Support custom buffer size for XML_GetBuffer and read
            #744  xmlwf: Improve language and URL clickability in help output
            #673  examples: Add new example "element_declarations.c"
            #764  Be stricter about macro XML_CONTEXT_BYTES at build time
            #765  Make inclusion to expat_config.h consistent
       #726 #727  Autotools: configure.ac: Support --disable-maintainer-mode
    #678 #705 ..
  #706 #733 #792  Autotools: Sync CMake templates with CMake 3.26
            #795  Autotools: Make installation of shipped man page doc/xmlwf.1
                    independent of docbook2man availability
            #815  Autotools|CMake: Add missing -DXML_STATIC to pkg-config file
                    section "Cflags.private" in order to fix compilation
                    against static libexpat using pkg-config on Windows
       #724 #751  Autotools|CMake: Require a C99 compiler
                    (a de-facto requirement already since Expat 2.2.2 of 2017)
            #793  Autotools|CMake: Fix PACKAGE_BUGREPORT variable
       #750 #786  Autotools|CMake: Make test suite require a C++11 compiler
            #749  CMake: Require CMake >=3.5.0
            #672  CMake: Lowercase off_t and size_t to help a bug in Meson
            #746  CMake: Sort xmlwf sources alphabetically
            #785  CMake|Windows: Fix generation of DLL file version info
            #790  CMake: Build tests/benchmark/benchmark.c as well for
                    a build with -DEXPAT_BUILD_TESTS=ON
       #745 #757  docs: Document the importance of isFinal + adjust tests
                    accordingly
            #736  docs: Improve use of "NULL" and "null"
            #713  docs: Be specific about version of XML (XML 1.0r4)
                    and version of C (C99); (XML 1.0r5 will need a sponsor.)
            #762  docs: reference.html: Promote function XML_ParseBuffer more
            #779  docs: reference.html: Add HTML anchors to XML_* macros
            #760  docs: reference.html: Upgrade to OK.css 1.2.0
       #763 #739  docs: Fix typos
            #696  docs|CI: Use HTTPS URLs instead of HTTP at various places
    #669 #670 ..
    #692 #703 ..
       #733 #772  Address compiler warnings
       #798 #800  Address clang-tidy warnings
       #775 #776  Version info bumped from 9:10:8 (libexpat*.so.1.8.10)
                    to 10:0:9 (libexpat*.so.1.9.0); see https://verbump.de/
                    for what these numbers do

        Infrastructure:
       #700 #701  docs: Document security policy in file SECURITY.md
            #766  docs: Improve parse buffer variables in-code documentation
    #674 #738 ..
    #740 #747 ..
  #748 #781 #782  Refactor coverage and conformance tests
       #714 #716  Refactor debug level variables to unsigned long
            #671  Improve handling of empty environment variable value
                    in function getDebugLevel (without visible user effect)
    #755 #774 ..
    #758 #783 ..
       #784 #787  tests: Improve test coverage with regard to parse chunk size
  #660 #797 #801  Fuzzing: Improve fuzzing coverage
       #367 #799  Fuzzing|CI: Start running OSS-Fuzz fuzzing regression tests
       #698 #721  CI: Resolve some Travis CI leftovers
            #669  CI: Be robust towards absence of Git tags
       #693 #694  CI: Set permissions to "contents: read" for security
            #709  CI: Pin all GitHub Actions to specific commits for security
            #739  CI: Reject spelling errors using codespell
            #798  CI: Enforce clang-tidy clean code
    #773 #808 ..
       #809 #810  CI: Upgrade Clang from 15 to 18
            #796  CI: Start using Clang's Control Flow Integrity sanitizer
  #675 #720 #722  CI: Adapt to breaking changes in GitHub Actions Ubuntu images
            #689  CI: Adapt to breaking changes in Clang/LLVM Debian packaging
            #763  CI: Adapt to breaking changes in codespell
            #803  CI: Adapt to breaking changes in Cppcheck

        Special thanks to:
            Ivan Galkin
            Joyce Brum
            Philippe Antoine
            Rhodri James
            Snild Dolkow
            spookyahell
            Steven Garske
                 and
            Clang AddressSanitizer
            Clang UndefinedBehaviorSanitizer
            codespell
            GCC Farm Project
            OSS-Fuzz
            Sony Mobile
```

#### 2.6.1
```
Release 2.6.1 Thu February 29 2024
        Bug fixes:
            #817  Make tests independent of CPU speed, and thus more robust
       #828 #836  Expose billion laughs API with XML_DTD defined and
                    XML_GE undefined, regression from 2.6.0

        Other changes:
            #829  Hide test-only code behind new internal macro
            #833  Autotools: Reject expat_config.h.in defining SIZEOF_VOID_P
            #819  Address compiler warnings
       #832 #834  Version info bumped from 10:0:9 (libexpat*.so.1.9.0)
                    to 10:1:9 (libexpat*.so.1.9.1); see https://verbump.de/
                    for what these numbers do

        Infrastructure:
            #818  CI: Adapt to breaking changes in clang-format

        Special thanks to:
            David Hall
            Snild Dolkow
```

#### 2.6.2
```
Release 2.6.2 Wed March 13 2024
        Security fixes:
       #839 #842  CVE-2024-28757 -- Prevent billion laughs attacks with
                    isolated use of external parsers.  Please see the commit
                    message of commit 1d50b80cf31de87750103656f6eb693746854aa8
                    for details.

        Bug fixes:
       #839 #841  Reject direct parameter entity recursion
                    and avoid the related undefined behavior

        Other changes:
            #847  Autotools: Fix build for DOCBOOK_TO_MAN containing spaces
            #837  Add missing #821 and #824 to 2.6.1 change log
       #838 #843  Version info bumped from 10:1:9 (libexpat*.so.1.9.1)
                    to 10:2:9 (libexpat*.so.1.9.2); see https://verbump.de/
                    for what these numbers do

        Special thanks to:
            Philippe Antoine
            Tomas Korbar
                 and
            Clang UndefinedBehaviorSanitizer
            OSS-Fuzz / ClusterFuzz
```
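
The "Version info bumped" entries above pair a libtool `current:revision:age` triple with a shared-library file suffix. As a minimal sketch, assuming the Linux-style libtool convention where the suffix is `(current - age).age.revision`, the mapping can be reproduced as follows. The helper name `libtool_to_so_suffix` is hypothetical and not part of Expat or Kong, and other platforms name libraries differently.

```python
def libtool_to_so_suffix(version_info: str) -> str:
    """Map a libtool 'current:revision:age' triple to 'major.age.revision'.

    Assumes the Linux-style naming scheme; this is an illustration only.
    """
    current, revision, age = (int(part) for part in version_info.split(":"))
    if age > current:
        raise ValueError("age must not exceed current")
    major = current - age  # oldest interface still supported
    return f"{major}.{age}.{revision}"


if __name__ == "__main__":
    # Triples quoted in the 2.6.0, 2.6.1 and 2.6.2 change log entries.
    for info in ("9:10:8", "10:0:9", "10:1:9", "10:2:9"):
        print(info, "->", "libexpat.so." + libtool_to_so_suffix(info))
```

Under that assumption, the four triples give the suffixes 1.8.10, 1.9.0, 1.9.1 and 1.9.2, matching the file names quoted in the change log.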

Signed-off-by: Aapo Talvensaari <[email protected]>
bungle added a commit that referenced this issue Apr 23, 2024
tysoekong pushed a commit that referenced this issue Apr 26, 2024
KAG-4331

Signed-off-by: Aapo Talvensaari <[email protected]>