-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
is there an out-of-the-box way to create an associative address trie that supports both IPv4 and IPv6? #103
Comments
The short answer is a qualified yes, although I don't think I have such a type in the library. I will have to check. I may have created something like that for a test. While all the subnets/addresses in a trie must have the same bit-size, at least for the tries in this library, it is fairly trivial to create a binary tree that is a wrapper of two tries, one for IPv4 and one for IPv6. So any operation just defers to one or the other, depending on whether the operation argument is IPv4 or IPv6. I think I may have done something like this before, so I will take a look. |
That sounds like a straightforward enough solution, thanks. Is there any reason why one should prefer this over just using an Also, when getting the values of all CIDRs that contain a given IP, what is the best way to convert the result to a list of values? Here's what I mean:
The only thing I could come up with was this, as
|
Converting all IPv4 addresses to IPv6 IPv4-mapped addresses is another option, and it would work pretty well. Yeah, I've had plans to change the return value of elementsContaining to a linked list. The Go library does that, in fact. You can use this code to traverse the result:
To go in the reverse direction, use getParent(). No, the nodes of such a list or of a tree are not convertible to a stream. Nor the address keys of each node. But that block of code you posted would work (trie.forEach), because it does provide an iterator, and it is "iterable", so the "forEach" method of Iterable is available. There are a number of iterators in fact. If there is a way to convert iterators to streams (I am guessing there is, maybe check stackoverflow), then you can probably use a collect method when you have the resulting stream. |
Actually, one way to get a stream is to create a new trie using asNewTrie https://seancfoley.github.io/IPAddress/IPAddress/apidocs/inet/ipaddr/format/util/AddressTrie.TrieNode.html#asNewTrie-- Then you can covert to a set or a map using asSet (or asMap for associative tries), then when you have a map or set there are methods to produce streams. But this is a bit expensive because the asNewTrie method creates a new trie (which is actually a list when it came from elementContains). |
I did not notice there is actually a spliterator method. So you can create a stream with java.util.stream.StreamSupport.stream(spliterator, false); |
Thanks a lot for the info. Could we assume that using forEach is not more expensive than spliterator? Also, a question about efficiency: How exactly does elementsContaining get all the CIDRs that contain a given IP? I assume it would exploit the trie structure (i.e. does not just iterate over all nodes linearly). |
Yes, elementsContaining is very efficient, it does a single trie lookup, and as it navigates a branch of the trie, it simply saves the branch. The trie lookup is efficient, it is a binary search. After that, you basically have a linked list encoded as a trie (it's a list because each trie node has at most one subnode). It is the branch path that was traversed on the "elementsContaining" call. At that point, you just have a list. Anything I say now is really not specific to this library, it applies to lists/streams/iterators in general, and data structures in general. An iterator will simply follow the list. A spliterator partitions the set of elements to be iterated, so that you can iterate the partitions at the same time, concurrently. In this case, the spliterator is not more efficient because it is not easy to split a linked list (you have to actually traverse the list to split into partitions). An array list is easy to partition, a linked list is not. The spliterator would be more efficient if it were a real binary tree and not just a single branch, because then it is easy to partition. Also note that the list is typically short. On average it is log2(size of trie). So for a trie with 1000 addresses, the list is usually 10 nodes long, which is short. The most efficient code for short lists is the block of code I wrote for you, or the use of an iterator. Using streams and spliterators is never efficient for small collections, they have too much overhead when setting up and cleaning up. Streams and spliterators are only more efficient when you have large collections and the spliterator is able to split up the collection well enough that you can traverse the stream in parallel with multiple threads. Also, when you have lots of individual operations on the collection, streams are useful for writing code that is concise and clear. So, in this case, while using streams and spliterators can provide concise and pretty code, they are less efficient. These lists are nowhere near long enough for streams or spliterators to be efficient, and in addition, linked lists are not good candidates for parallel streams because they are difficult to split into partitions. Using an iterator, or traversing the list with the block of code I provided, is by far the most efficient. |
This was all very useful. The solution I landed on was to use an For my method that should return the values associated with all CIDRs that contain a given IP, I used your Thus far I have not noticed any caveats with my implementation; if you can think of any caveats, I'd love to hear them! Thanks a lot for you help. |
There are no caveats, really. The only possible caveat is if you had the unusual situation in which you wanted to distinguish an IPv4-mapped IPv6 address from the associated IPv4 address. In other words, if you wanted to distinguish "::ffff:102:304" from 1.2.3.4, your implementation could not do so. However, for all intensive purposes, those two are the same address, so that is not much of a concern. An IPv4-mapped IPv6 address is the same as its associated IPv4 address, and in fact some programming languages treat them as the same address as well (like, for instance, the Go and Java languages). |
I kept this open since I intended to add a dual IPv4/v6 trie type. That is now a part of version 5.5.0 with the type |
That's awesome, thank you! |
I'm trying to create an address trie that maps CIDRs to strings, and can do an IP lookup that gets multiple results.
I want to be able to efficiently search for a given IP address (both IPv4 and IPv6 supported), and get back the values associated with ALL the CIDRs that contain the IP address. I see there's an IPv4AddressTrie and an IPv6AddressTrie, as well as an abstract AddressTrie, but I'm not sure if there's an out-of-the-box way to do this. I would imagine something like:
I know this isn't exactly how AddressTrie is instantiated, but I'm just illustrating my goal here. Please let me know if there is already a solution implemented for this. Much appreciated.
The text was updated successfully, but these errors were encountered: