I'm trying to use faceting to look at the terms that were indexed into a field. I realize this is a bit atypical, but I'm using it as a debugging tool. The problem is that I'm not seeing any faceted terms.

I think this has something to do with later versions of Solr, perhaps v8 or v9, but I can't find any reference to that. I do have a shell script that reproduces the problem pretty easily:

#!/bin/bash

COLLECTION=test
NUM_SHARDS=1  # Solr v9 (maybe earlier) doesn't restrict to single shard, BUT doesn't matter for our small dataset
TERMS=test
TYPE=text_general
SUFFIX=_t
FIELD=test$SUFFIX
DOC_ID=doc1

echo;echo
echo Removing Any Previous collection - will give error if run for the first time
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=$COLLECTION"

echo;echo
echo Creating Collection $COLLECTION with $NUM_SHARDS shards
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=$COLLECTION&numShards=$NUM_SHARDS&replicationFactor=1&maxShardsPerNode=$NUM_SHARDS"

echo;echo
echo Adding doc id = $DOC_ID with TERMS = $TERMS
curl -X POST "http://localhost:8983/solr/$COLLECTION/update?commit=true" \
 -H "Content-Type: application/json" \
 -d "[
   {
     \"id\": \"$DOC_ID\",
     \"$FIELD\": \"this is a test of the Solr indexing system: TERMS = $TERMS\"
   }
 ]"

echo;echo
echo Query for all docs
curl "http://localhost:8983/solr/$COLLECTION/select?q=*:*"

echo;echo
echo Query for TERMS = $TERMS
curl "http://localhost:8983/solr/$COLLECTION/select?defType=edismax&qf=$FIELD&q=$TERMS&rows=1"

echo;echo
echo Testing Old Facet API
curl "http://localhost:8983/solr/$COLLECTION/select?q=*:*&rows=0&facet=true&facet.field=$FIELD"

echo;echo
echo Testing New Facet API
curl "http://localhost:8983/solr/$COLLECTION/select?q=*:*&rows=0&json.facet=%7Bwords:%7Btype:terms,field:$FIELD,limit:10%7D%7D"

echo;echo
echo Done

If I put an echo in front of the curl command that inserts the document, you can see that the variable substitution looks correct:

curl -X POST http://localhost:8983/solr/my-files/update?commit=true -H Content-Type: application/json -d [
   {
     "id": "doc1",
     "test_t": "this is a test of the Solr indexing system: TERMS = test"
   }
 ]

Then, running a match-all query, you can see that the document is indexed:

Query for all docs
{
  "responseHeader":{
  "zkConnected":true,
  "status":0,
  "QTime":0,
  "params":{
    "q":"*:*"
  }
},
"response":{
    "numFound":1,
    "start":0,
    "numFoundExact":true,
    "docs":[ ]
  }
}

Searching for the terms seems to work:

{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":0,
    "params":{
      "q":"test",
      "defType":"edismax",
      "qf":"test_t",
      "rows":"1"
    }
},
"response":{
  "numFound":1,
  "start":0,
  "numFoundExact":true,
  "docs":[{
    "id":"doc1",
    "test_t":"this is a test of the Solr indexing system: TERMS = test",
    "_version_":1847191228405252096,
    "_root_":"doc1"
  }]
}

But notice what happens when I try to facet:

echo;echo
echo Testing Old Facet API
curl "http://localhost:8983/solr/$COLLECTION/select?q=*:*&rows=0&facet=true&facet.field=$FIELD"

Gives the output:

Testing Old Facet API
{
  "responseHeader":{
  "zkConnected":true,
  "status":0,
  "QTime":0,
  "params":{
    "q":"*:*",
    "facet.field":"test_t",
    "rows":"0",
    "facet":"true"
  }
},
"response":{
  "numFound":1,
  "start":0,
  "numFoundExact":true,
  "docs":[ ]
},
"facet_counts":{
  "facet_queries":{ },
  "facet_fields":{
    "test_t":[ ]
  },
  "facet_ranges":{ },
  "facet_intervals":{ },
  "facet_heatmaps":{ }
}
}

I get similar results if I try the newer JSON Facet syntax:

echo;echo
echo Testing New Facet API
curl "http://localhost:8983/solr/$COLLECTION/select?q=*:*&rows=0&json.facet=%7Bwords:%7Btype:terms,field:$FIELD,limit:10%7D%7D"
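
Hand-encoding the braces as %7B and %7D is easy to get wrong. As a rough sketch, the same JSON Facet request can be built from a variable and encoded by curl itself through its --get and --data-urlencode options. The helper names facet_body and json_facet_query are made up for illustration, and the host and port are assumed to match the script above:

```shell
#!/bin/bash
# Sketch only: facet_body and json_facet_query are hypothetical helper names.

# Print the JSON Facet API body for a terms facet on the given field.
# $1 = field name, $2 = optional bucket limit (defaults to 10).
facet_body() {
  local field=$1 limit=${2:-10}
  printf '{"words":{"type":"terms","field":"%s","limit":%d}}' "$field" "$limit"
}

# Send the facet request. --get moves the --data-urlencode pairs into the
# query string, so the braces and quotes never need manual escaping.
# $1 = collection name, $2 = field name.
json_facet_query() {
  local collection=$1 field=$2
  curl --silent --get "http://localhost:8983/solr/$collection/select" \
    --data-urlencode "q=*:*" \
    --data-urlencode "rows=0" \
    --data-urlencode "json.facet=$(facet_body "$field")"
}
```

With the variables from the script, the call would be json_facet_query "$COLLECTION" "$FIELD".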

2 Answers

1

After emailing the Solr user mailing list, I learned there are TWO things you need to do:

  1. You need to have uninvertible=true, AND

  2. You need to explicitly specify an analyzer for fields, even though they're based on TextField.
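
Before changing anything, it can help to check what the field currently resolves to. The sketch below asks the Schema API for a field's properties using the showDefaults and includeDynamic parameters. The helper names are hypothetical, the host and port are assumed from the question's script, and whether uninvertible shows up in the response for a given Solr version is an assumption worth verifying:

```shell
#!/bin/bash
# Sketch only: field_info_url and show_field_info are hypothetical helper names.

# Build the Schema API URL for one field.
# showDefaults=true asks Solr to include properties inherited from defaults;
# includeDynamic=true lets a name such as test_t match a dynamic field rule.
# $1 = collection name, $2 = field name.
field_info_url() {
  local collection=$1 field=$2
  printf 'http://localhost:8983/solr/%s/schema/fields/%s?showDefaults=true&includeDynamic=true' \
    "$collection" "$field"
}

# Fetch and print the field definition as Solr reports it.
show_field_info() {
  curl --silent "$(field_info_url "$1" "$2")"
}
```

For the question's setup this would be show_field_info test test_t, and the properties to look for in the output are uninvertible and docValues.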

Here's what wound up working:

curl -X POST -H 'Content-type:application/json' \
  "http://localhost:8983/solr/$COLLECTION/schema" \
  -d '{
    "add-field-type": {
      "name": "multivalued_texts",
      "class": "solr.TextField",
      "stored": true,
      "multiValued": true,
      "indexed": true,
      "docValues": false,
      "uninvertible": true,
      "analyzer": {
        "type": "index",
        "tokenizer": {
          "class": "solr.StandardTokenizerFactory"
        },
        "filters": [
          {
            "class": "solr.LowerCaseFilterFactory"
          }
        ]
      }
    }
  }'
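
The request above only defines a field type, so presumably a field still has to be declared with that type before it can be faceted on. A minimal sketch using the Schema API's add-field command follows. The helper names and the field name in the usage note are invented for illustration and do not come from the answer:

```shell
#!/bin/bash
# Sketch only: add_field_payload and add_field are hypothetical helper names.

# Print the Schema API payload that declares a field of the given type.
# $1 = field name, $2 = field type name.
add_field_payload() {
  local name=$1 type=$2
  printf '{"add-field":{"name":"%s","type":"%s"}}' "$name" "$type"
}

# Post the payload to the collection's schema endpoint.
# $1 = collection name, $2 = field name, $3 = field type name.
add_field() {
  local collection=$1 name=$2 type=$3
  curl --silent -X POST -H 'Content-type:application/json' \
    "http://localhost:8983/solr/$collection/schema" \
    -d "$(add_field_payload "$name" "$type")"
}
```

For example, add_field "$COLLECTION" debug_terms multivalued_texts would declare a field named debug_terms (an assumed name) with the type defined above. Documents indexed before the schema change would need to be reindexed.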

0

I ran your example on Solr 9.1.0 and I get some values in the facets. This command:

$ curl "http://localhost:8983/solr/$COLLECTION/select?q=*:*&rows=0&facet=true&facet.field=$FIELD"

returns:


{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":1,
    "params":{
      "q":"*:*",
      "facet.field":"test_t",
      "rows":"0",
      "facet":"true"}},
  "response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "test_t":[
        "a",1,
        "indexing",1,
        "is",1,
        "of",1,
        "solr",1,
        "system",1,
        "terms",1,
        "test",1,
        "the",1,
        "this",1]},
    "facet_ranges":{},
    "facet_intervals":{},
    "facet_heatmaps":{}}}

This is a brand new, local Solr 9.1.0 that I started with solr start -c

If I include the debug parameter in the request

curl "http://localhost:8983/solr/test/select?q=*:*&rows=0&facet=true&facet.field=test_t&debug=true"

the output shows some debug information for the facets too:

"debug":{
    "rawquerystring":"*:*",
    "querystring":"*:*",
    "parsedquery":"MatchAllDocsQuery(*:*)",
    "parsedquery_toString":"*:*",
    "explain":{},
    "facet-debug":{
      "elapse":0,
      "sub-facet":[{
          "processor":"SimpleFacets",
          "elapse":0,
          "action":"field facet",
          "maxThreads":0,
          "sub-facet":[{
              "elapse":0,
              "requestedMethod":"not specified",
              "appliedMethod":"FC",
              "inputDocSetSize":1,
              "field":"test_t",
              "numBuckets":11}]}]},

2 Comments

That's so strange, thank you. Maybe I'll try with an earlier 9.x release.
OK, after more testing: text field faceting broke between Solr 9.6.0 and 9.7.0. I've sent an email to the mailing list to report it. I hope they don't say it's not supported anymore.