Searching in string which contains json throws an exception #432

Open
goodstas opened this issue Feb 2, 2024 · 6 comments
goodstas commented Feb 2, 2024

Hi, I have a .NET application that uses Redis Stack to store data.
Each business logic object is serialized to JSON and stored inside a wrapper object, in its JsonValue property.
This is the definition of the wrapper:

[Document(StorageType = StorageType.Json, Prefixes = new string[] { nameof(REWrapperObject) }, IndexName = $"{nameof(REWrapperObject)}-idx")]
 public class REWrapperObject
 {
     [RedisIdField, Indexed]
     public string REUniqueId { get; set; }

     [Searchable]
     public string ObjectType { get; set; }

     [Searchable]
     public string ObjectId   { get; set; }

     [Searchable]
     public string JsonValue { get; set; }

     [Indexed(Aggregatable = true, Sortable = true)]
     public long TimeStamp { get; set; }
 }

The Person class is defined as follows:

public class Person
{      
    public Guid Id { get; set; }
          
    public string FirstName { get; set; }

    public string LastName { get; set; }

    public double Height { get; set; }
  
    public double Weight { get; set; }
  
    public DateTime Birthdate { get; set; }
}

I use JsonSerializer from System.Text.Json, and a serialized Person instance looks like this:
{"Id":"869fa3d7-7470-4215-bfd5-cc8b8d635fc4","FirstName":"Michael","LastName":"Johnson","Height":170,"Weight":95.07,"Birthdate":"1996-08-22T00:00:00"}

When I tried to run a query that looks for a substring inside the JsonValue property, I got an exception.
This is the query:
var michaels = _redisProvider.RedisCollection<REWrapperObject>().Where(obj => obj.JsonValue.Contains("\"FirstName\":\"Michael\"")).ToList();

The exception is:
Syntax error at offset 23 near FirstName
Failed on FT.SEARCH REWrapperObject1KBenchmark-idx (@JsonValue:"FirstName":"Michael") LIMIT 0 100

By elimination, I found that the problem is searching for the ':' character inside JsonValue.
Even if I search for only this character, it throws the same exception, just with a different offset.

I would really appreciate your help understanding where the problem is.
Thank you.

goodstas changed the title from "Search in string which contains json throw an exception" to "Searching in string which contains json throws an exception" on Feb 2, 2024
slorello89 (Member) commented

Hi @goodstas, the way you are trying to search this doesn't make a whole lot of sense in the context of a Redis full-text search. What Contains does is craft a full-text search. In full-text indexes, special characters are all stripped away and treated as whitespace, and they are actually illegal in queries. Try running this query instead:

var res = collection.Where(x => x.JsonValue.Contains("Michael Johnson")).ToList();

I'd be remiss if I didn't mention that you are sort of working against the grain: full-text search is meant more for passage-level searches, not necessarily these types of matches.

If you want to run that exact match (i.e. you need the JSON text itself to match, not the data), you can change Searchable -> Indexed and re-run your original query.

But if what you are looking for is for the Person object's first name to be Michael and last name to be Johnson, you'd be better off indexing the Person object:

using Redis.OM;
using Redis.OM.Modeling;
using StackExchange.Redis;

var muxer = ConnectionMultiplexer.Connect("localhost:6379");
var db = muxer.GetDatabase();

var provider = new RedisConnectionProvider(muxer);
provider.Connection.CreateIndex(typeof(REWrapperObject));
var collection = provider.RedisCollection<REWrapperObject>();

collection.Insert(new REWrapperObject()
{
    Person = new Person() { FirstName = "Michael", LastName = "Johnson" }
});

var res = collection.Where(x => x.Person.FirstName == "Michael" && x.Person.LastName == "Johnson").ToList();
Console.WriteLine(res.Count);


[Document(StorageType = StorageType.Json, Prefixes = new string[] { nameof(REWrapperObject) }, IndexName = $"{nameof(REWrapperObject)}-idx")]
public class REWrapperObject
{
    [RedisIdField, Indexed]
    public string REUniqueId { get; set; }

    [Searchable]
    public string ObjectType { get; set; }

    [Searchable]
    public string ObjectId   { get; set; }
    
    [Indexed(CascadeDepth = 1)] public Person Person { get; set; }

    [Indexed]
    public string JsonValue { get; set; }

    [Indexed(Aggregatable = true, Sortable = true)]
    public long TimeStamp { get; set; }
}

public class Person
{      
    public Guid Id { get; set; }
          
    [Indexed]
    public string FirstName { get; set; }

    [Indexed]
    public string LastName { get; set; }

    public double Height { get; set; }
  
    public double Weight { get; set; }
  
    public DateTime Birthdate { get; set; }
}


goodstas commented Feb 2, 2024

Thank you for the comment, @slorello89.
I use JsonValue inside the wrapper object because I want a generic solution for storing business logic objects. I can't put Person as-is as a member of the wrapper object because I also have Dog, Cat, etc.
With a regular string this search would work, but as you said, Redis treats JsonValue differently when it has the Searchable attribute.

I changed the attribute on JsonValue from Searchable to Indexed as you suggested, and now Contains works fine.

I just want to understand whether performance with the Indexed attribute is worse than with Searchable.

What I'm trying to check is that a specific property, like FirstName, has a specific value, like Michael.
Is there a better way to do this if I want to use a generic wrapper to store my objects?

Another reason I need the wrapper is that some types require special JsonSerializerOptions, and the current version of Redis.OM doesn't provide a way to supply custom JsonSerializerOptions.
So the solution I found is to serialize the object myself and store it as a string in the wrapper object.

One more thing I want to verify is the number of documents that Redis scans.
When I got the syntax-error exception, I saw that the query was limited to the first 100 documents. But what if I have 1,000 or 100,000?
Do I need to configure something to override this default (100)?
Or how can I query a large number of documents?

Thank you!

slorello89 (Member) commented

So I will say that Indexed is generally more efficient (though you are doing an infix check, so that might be somewhat less efficient than a standard tag search).

Re: sizes, 100 is the default "chunk size", which is available in the constructor. If your result set is larger than that, it will automatically paginate for you until it pulls back the entire result set, but you are generally better off with a larger chunk size in that case. The idea is to prevent very large blocking queries to Redis.

You can adjust the chunk size in the constructor for the Redis collection.
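
For readers following along, here is a minimal sketch of what a larger chunk size might look like. It assumes that "the constructor" refers to the optional integer argument of `RedisConnectionProvider.RedisCollection<T>`, that a Redis Stack instance is listening on localhost:6379, and that `REWrapperObject` is the class defined earlier in this thread. The value 1000 and the `ObjectType == "Person"` filter are illustrative choices, not recommendations from the maintainers.

```csharp
using System.Linq;
using Redis.OM;

// Assumption: Redis Stack on localhost:6379, REWrapperObject as defined above.
var provider = new RedisConnectionProvider("redis://localhost:6379");

// Ask for pages of 1000 documents instead of the default 100.
var collection = provider.RedisCollection<REWrapperObject>(1000);

// Enumerating still returns every match; the collection pages through
// the result set one chunk at a time.
var people = collection.Where(x => x.ObjectType == "Person").ToList();
```

The LIMIT 0 100 seen in the earlier error message would then correspond to the first page only, not to a cap on the total number of results.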


goodstas commented Feb 11, 2024

Thank you for the response.
There is another issue I need your help with. I added an array property to my wrapper object:

   [Indexed]
   public string[] Filters { get; set; } 

When I perform a query like:
var redisObjects = wrappers.Where(_ => _.Filters[0] == "3").ToList();
I get System.ArgumentException("Unknown separator type").

This is the call stack:
at Redis.OM.Common.ExpressionTranslator.SplitPredicateSeporators(ExpressionType type)
at Redis.OM.Common.ExpressionTranslator.TranslateBinaryExpression(BinaryExpression binExpression)
at Redis.OM.Common.ExpressionTranslator.BuildQueryFromExpression(Expression exp)
at Redis.OM.Common.ExpressionTranslator.TranslateWhereMethod(MethodCallExpression expression)
at Redis.OM.Common.ExpressionTranslator.BuildQueryFromExpression(Expression expression, Type type, Expression mainBooleanExpression, Type rootType)
at Redis.OM.Searching.RedisCollectionEnumerator1..ctor(Expression exp, IRedisConnection connection, Int32 chunkSize, RedisCollectionStateManager stateManager, Expression1 booleanExpression, Boolean saveState, Type rootType, Type type)
at Redis.OM.Searching.RedisCollection1.GetEnumerator()
at System.Collections.Generic.List1..ctor(IEnumerable1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable1 source)
at Program.<Main>$(String[] args) in ConsoleGeometryRedis\Program.cs:line 78

If I query on a regular string property, it works. I think the problem is the '[' and ']' characters.

What am I missing, @slorello89?
Thanks again.

slorello89 (Member) commented

RediSearch doesn't support these sorts of equality operations (comparing a specific value within a list/array), which is why it blows up when it tries to parse the query.


goodstas commented Feb 15, 2024

OK, I understand. On the other hand, I ran into another problem when building multiple Where conditions for a Redis query while accessing an array by index in a for loop.
I have an array of strings that I want to use as the filters of the query, as you can see below.
Filters is a property of my wrapper object of type string, and I check whether it contains the values of the filters array:
for (int i = 0; i < filters.Length; ++i) { query = query.Where(obj => obj.Filters.Contains(filters[i])); }
I got an exception when Redis.OM parsed these Where conditions on the call to var fetchedObjects = query.ToList();
The issue here is getting the value from the array by index.

When I use a different approach, building the Where conditions with a foreach loop, everything works fine.
foreach (var filter in filters) { query = query.Where(obj => obj.Filters.Contains(filter)); }

@slorello89, don't you think I should be able to access array items by index in a for loop, just as I do with the foreach loop?
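
A possibly relevant C# detail, offered as background and not as a confirmed diagnosis of the Redis.OM exception: in a for loop every lambda captures the same variable i, while a foreach loop gives each iteration its own variable. Because the query is only evaluated at ToList(), i has already reached filters.Length by then. The sketch below shows this with plain delegates and no Redis at all; the names viaFor, viaCopy and tags are made up for illustration. Redis.OM translates expression trees instead of invoking delegates, so the exact failure there may differ, but copying filters[i] into a local gives the lambda the same shape as the working foreach version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string[] filters = { "a", "b" };
string[] tags = { "a", "b", "c" };

// for loop: both lambdas close over the single shared variable i.
var viaFor = new List<Func<string[], bool>>();
for (int i = 0; i < filters.Length; ++i)
{
    viaFor.Add(t => t.Contains(filters[i]));
}

// The loop has finished, so i == filters.Length and filters[i] is out of range
// when the lambda finally runs.
bool threw = false;
try { viaFor[0](tags); }
catch (IndexOutOfRangeException) { threw = true; }
Console.WriteLine($"for loop with filters[i] threw: {threw}");

// Workaround: copy the element into a per-iteration local before capturing it.
var viaCopy = new List<Func<string[], bool>>();
for (int i = 0; i < filters.Length; ++i)
{
    string filter = filters[i];
    viaCopy.Add(t => t.Contains(filter));
}

bool allMatch = viaCopy.All(p => p(tags));
Console.WriteLine($"for loop with local copy matched all filters: {allMatch}");
```

Applied to the query above, that would mean writing var filter = filters[i]; inside the for loop and passing filter to Contains.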
