Query DSL is the domain-specific language that Elasticsearch uses for queries.
You can think of it as Elasticsearch's SQL,
except that it is really just a pile of JSON.
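To make the "just JSON" point concrete, here is a minimal sketch that builds a simple match query as a Python dict and serializes it to the JSON body a search request would carry. The field name title, the search text, and the size value are placeholders chosen for illustration, not taken from a real index.

```python
import json

# A Query DSL request body is plain JSON; in Python it is just a nested dict.
# "title", "hello world" and the size of 10 are illustrative placeholders.
query_body = {
    "query": {
        "match": {
            "title": "hello world"
        }
    },
    "size": 10,
}

# This string is what would be sent as the body of a _search request.
query_json = json.dumps(query_body, sort_keys=True)
print(query_json)
```

Writing these dicts by hand gets verbose quickly once queries are nested, which is the gap a wrapper library is meant to fill.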
elasticsearch-dsl
is the official Python package for working with Query DSL.
It feels a bit like Django's ORM.
$ pip install elasticsearch-dsl
ref:
https://github.com/elastic/elasticsearch-dsl-py
https://elasticsearch-dsl.readthedocs.org/en/latest/index.html

in app/documents.py

    from elasticsearch_dsl import DocType, String, Boolean
    from elasticsearch_dsl.connections import connections

    connections.create_connection(hosts=['127.0.0.1', ])

    class AlbumDoc(DocType):
        upc = String(index='not_analyzed')
        title = String(analyzer='ik', fields={'raw': String(index='not_analyzed')})
        artist = String(analyzer='ik')
        is_ready = Boolean()

        class Meta:
            index = 'dps'
            doc_type = 'album'

        @classmethod
        def sync(cls, album):
            album_doc = AlbumDoc(meta={'id': album.id})
            album_doc.upc = album.get_upcs(output_str=False)
            album_doc.title = album.name
            album_doc.artist = album.artist.name
            album_doc.is_ready = album.is_ready
            album_doc.save()

        def save(self, *args, **kwargs):
            return super(AlbumDoc, self).save(*args, **kwargs)

        def get_model_obj(self):
            from svapps.dps.models import Album
            return Album.objects.get(id=self.meta.id)

    # to create mappings
    AlbumDoc.init()
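For reference, the AlbumDoc definition above corresponds roughly to the mapping sketched below. This is a hand-written approximation, not output captured from the library, and it assumes an Elasticsearch version that still has the string field type (before 5.x).

```python
import json

# Hand-written approximation of the mapping for the "album" doc_type.
# Assumes the pre-5.x "string" field type that the String class refers to.
album_mapping = {
    "album": {
        "properties": {
            "upc": {"type": "string", "index": "not_analyzed"},
            "title": {
                "type": "string",
                "analyzer": "ik",
                # extra sub-field "title.raw" keeps the exact, unanalyzed value
                "fields": {"raw": {"type": "string", "index": "not_analyzed"}},
            },
            "artist": {"type": "string", "analyzer": "ik"},
            "is_ready": {"type": "boolean"},
        }
    }
}

print(json.dumps(album_mapping, indent=2, sort_keys=True))
```

Comparing a sketch like this with what the index actually reports is a quick way to spot a field that silently fell back to default settings.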
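Remember to call YourDocType.init() before indexing anything.
Only then will Elasticsearch create the mapping defined by your DocType.
Otherwise, Elasticsearch builds a mapping dynamically from the data types of the first document you index,
so settings such as the analyzer fall back to the default, standard.
You can check the result through the _mapping API:
http://127.0.0.1:9200/dps/_mapping/track
http://127.0.0.1:9200/dps/_mapping/album

Fields that need full-text search should be analyzed
(string fields are analyzed by default);
the alternative is not_analyzed, which indexes the value as a single exact term.
However, you cannot rely on term queries against an analyzed field,
unless you also define an extra not_analyzed raw sub-field for it, as title does in AlbumDoc.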
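The reason term and analyzed fields do not mix can be shown with a toy simulation. This is only a sketch: toy_analyze is a crude stand-in for a real analyzer (it lowercases and splits on non-word characters), and the title "Abbey Road" is an illustrative value, not data from the index above.

```python
import re

def toy_analyze(text):
    """Crude stand-in for an analyzer: lowercase and split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]

def index_terms(value, analyzed):
    """Return the set of terms that would be stored for a field value."""
    return set(toy_analyze(value)) if analyzed else {value}

def term_query(indexed_terms, term):
    """A term query looks up the exact term; it does not analyze its input."""
    return term in indexed_terms

title = "Abbey Road"  # illustrative value
analyzed_terms = index_terms(title, analyzed=True)   # {'abbey', 'road'}
raw_terms = index_terms(title, analyzed=False)       # {'Abbey Road'}

print(term_query(analyzed_terms, "Abbey Road"))  # False: no such single term
print(term_query(raw_terms, "Abbey Road"))       # True: the raw value is one term
print(term_query(analyzed_terms, "abbey"))       # True: matches one token
```

In this model the raw sub-field simply stores a second, unanalyzed copy of the value, so exact lookups and full-text search can coexist on the same source field.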
ref:
https://elasticsearch-dsl.readthedocs.org/en/latest/persistence.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html#CO59-2

    album_doc = AlbumDoc(meta={'id': 42})
    album_doc.upc = ['887375000619', '887375502069']
    album_doc.title = 'abc'
    album_doc.artist = 'xyz'
    album_doc.is_ready = True
    album_doc.save()

    # you can query as usual, regardless of whether the field holds a list
    search = AlbumDoc.search().filter('term', upc='887375000619')
    response = search.execute()
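Elasticsearch is schemaless in this respect: there is no dedicated array type, and any field can hold multiple values of the same type.
So even if you declare a String field,
you can still store a list in it.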
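A toy model of that behaviour, under the assumption that a term filter matches a document when any one of the field's values equals the term. The helper names field_values and term_matches are hypothetical and exist only for this sketch.

```python
def field_values(value):
    """Normalize a field to a list: a scalar is treated as a one-element list."""
    return list(value) if isinstance(value, (list, tuple)) else [value]

def term_matches(value, term):
    """Toy term filter: the document matches if any of its values equals the term."""
    return term in field_values(value)

single = "887375000619"
multiple = ["887375000619", "887375502069"]

print(term_matches(single, "887375000619"))    # True
print(term_matches(multiple, "887375502069"))  # True
print(term_matches(multiple, "000000000000"))  # False
```

Seen this way, the query side never needs to know whether a document stored one UPC or several.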
    from elasticsearch_dsl import Q, Search

    # TrackDoc is assumed to be a DocType defined like AlbumDoc above

    # a filter plus a single match query
    search = TrackDoc.search() \
        .filter('term', is_ready=True) \
        .query('match', title=u'沒有的啊')

    # combine several match queries with &
    search = TrackDoc.search() \
        .filter('term', is_ready=True) \
        .query(
            Q('match', title='沒有的啊') &
            Q('match', artist='那我懂你意思了') &
            Q('match', album='沒有的, 啊!?')
        )

    # an explicit bool query; track_name, album_name and artist_name are the caller's inputs
    q = Q(
        'bool',
        must=[
            Q('match', title={'query': track_name, 'fuzziness': 'AUTO'}),
        ],
        should=[
            Q('match', album={'query': album_name, 'minimum_should_match': '60%'}),
            Q('match', artist={'query': artist_name, 'minimum_should_match': '80%'}),
        ],
        minimum_should_match=1
    )
    search = TrackDoc.search().filter('term', is_ready=True).query(q)

    # search several doc types at once; keyword is the user's raw search string
    q = Q(
        'bool',
        should=[
            Q('term', isrc=keyword),
            Q('term', upc=keyword),
            Q('match', **{'title.raw': {'query': keyword}}),
            Q('multi_match', query=keyword, fields=['title', 'artist', 'album']),
        ],
    )
    search = Search(index='dps', doc_type=['track', 'album']).query(q)
    search = search[:20]

    # print the raw Query DSL
    import uniout  # makes non-ASCII strings readable when printed on Python 2
    from pprint import pprint
    pprint(search.to_dict())

    response = search.execute()
    print(response.hits.total)
    print(response[0].title)
    print(response[0].artist)
    print(response[0].album)
    print(response[0].is_ready)
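To connect the Q objects back to raw Query DSL, the sketch below builds by hand the JSON that the bool query above should correspond to. It is an approximation written without the library; build_track_query is a hypothetical helper and the sample arguments are placeholders. The filter part is left out, since how a filter is serialized may differ between elasticsearch-dsl versions.

```python
import json

def build_track_query(track_name, album_name, artist_name):
    """Build the bool query by hand, mirroring the Q('bool', ...) example."""
    return {
        "bool": {
            # the title must match, with some tolerance for typos
            "must": [
                {"match": {"title": {"query": track_name, "fuzziness": "AUTO"}}},
            ],
            # at least one of album / artist should match as well
            "should": [
                {"match": {"album": {"query": album_name, "minimum_should_match": "60%"}}},
                {"match": {"artist": {"query": artist_name, "minimum_should_match": "80%"}}},
            ],
            "minimum_should_match": 1,
        }
    }

# placeholder inputs for illustration
body = {"query": build_track_query("abc", "def", "xyz")}
print(json.dumps(body, indent=2, ensure_ascii=False))
```

Comparing this with the output of pprint(search.to_dict()) is a useful sanity check when a query does not return what you expect.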
ref:
https://elasticsearch-dsl.readthedocs.org/en/latest/search_dsl.html