Elasticsearch in Plain Language 26: Deep Dive into Search Techniques, Customizing the Relevance Score Algorithm with function_score

Published: 2025/3/21
Contents

  • Overview
  • Official documentation
  • Example

Overview

Continuing to study Elasticsearch with the instructor 中华石杉; this is part 26 of the series.

Course link: https://www.roncoo.com/view/55


Official documentation

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

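In short: you define a function_score query that takes the value of a field you choose and combines it with the relevance score Elasticsearch computes on its own, so that the chosen field boosts the final score.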


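Example

Requirement: the more people follow a post, the higher its score should be.

First add a follower count to every post. The score produced by searching the posts is then combined with follower_num, so that follower_num boosts each post's score to a certain degree.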

POST /forum/article/_bulk
{ "update": { "_id": "1"} }
{ "doc" : {"follower_num" : 5} }
{ "update": { "_id": "2"} }
{ "doc" : {"follower_num" : 10} }
{ "update": { "_id": "3"} }
{ "doc" : {"follower_num" : 25} }
{ "update": { "_id": "4"} }
{ "doc" : {"follower_num" : 3} }
{ "update": { "_id": "5"} }
{ "doc" : {"follower_num" : 60} }
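The bulk body is newline-delimited JSON: one action line followed by one partial-document line per post. Below is a minimal Python sketch that builds the same body from a mapping of post IDs to follower counts. The helper name build_bulk_update_body is hypothetical, and actually sending the body to Elasticsearch is left out.

```python
import json


def build_bulk_update_body(follower_counts):
    """Build an NDJSON _bulk body of partial updates that set follower_num.

    follower_counts maps a document _id (str) to its follower count (int).
    Hypothetical helper for illustration; not part of any client library.
    """
    lines = []
    for doc_id, count in follower_counts.items():
        # Action line, then the partial document to merge into the post.
        lines.append(json.dumps({"update": {"_id": doc_id}}))
        lines.append(json.dumps({"doc": {"follower_num": count}}))
    # The _bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_update_body({"1": 5, "2": 10, "3": 25, "4": 3, "5": 60})
print(body, end="")
```

Generating the lines in pairs keeps each action next to its document, which is the ordering the bulk format depends on.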

DSL

GET /forum/article/_search
{
  "query": {
    "function_score": {
      "query": {
        "multi_match": {
          "query": "java spark",
          "fields": ["title", "content"]
        }
      },
      "field_value_factor": {
        "field": "follower_num",
        "modifier": "log1p",
        "factor": 0.5
      },
      "boost_mode": "sum",
      "max_boost": 5
    }
  }
}
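  • With only field set (no modifier, and the default boost_mode of multiply), each doc's score is multiplied by its follower_num. If a doc's follower_num is 0, its score becomes 0, which is a poor result.

  • So a log1p modifier is usually added. The formula becomes new_score = old_score * log(1 + follower_num), where log is the base-10 logarithm, and the resulting scores are more reasonable.

  • Adding factor adjusts the score further: new_score = old_score * log(1 + factor * follower_num)

  • boost_mode decides how the query score and the value computed from the field are combined: multiply (the default), replace, sum, min, max, avg. The formulas above assume multiply; with sum, as in the query above, new_score = old_score + log(1 + factor * follower_num).

  • max_boost caps the value computed by the function at the given maximum before it is combined with the query score.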
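The parameters above can be sketched in a few lines of Python. This is an illustration of the formulas as described here, not Elasticsearch's own code: the function names field_value_factor and combine are hypothetical, only the none and log1p modifiers are covered, and the assumption is that max_boost is applied to the function value before boost_mode combines it with the query score.

```python
import math


def field_value_factor(value, factor=1.0, modifier="none"):
    """Value contributed by the field: modifier(factor * value)."""
    scaled = factor * value
    if modifier == "none":
        return scaled
    if modifier == "log1p":
        # log1p in Elasticsearch is the base-10 logarithm of (1 + x).
        return math.log10(1 + scaled)
    raise ValueError(f"modifier not covered by this sketch: {modifier}")


def combine(query_score, function_value, boost_mode="multiply",
            max_boost=float("inf")):
    """Combine the query score with the (capped) function value."""
    capped = min(function_value, max_boost)
    modes = {
        "multiply": query_score * capped,
        "replace": capped,
        "sum": query_score + capped,
        "min": min(query_score, capped),
        "max": max(query_score, capped),
        "avg": (query_score + capped) / 2,
    }
    return modes[boost_mode]


# Boost each post would receive under the settings used in this example.
for follower_num in (5, 10, 25, 3, 60):
    boost = field_value_factor(follower_num, factor=0.5, modifier="log1p")
    print(follower_num, round(boost, 4))
```

Calling combine(2.0, field_value_factor(0)) returns 0.0, which reproduces the zero-score problem from the first bullet, while the log1p variant grows slowly enough that a post with 60 followers does not swamp the text relevance.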

Response:

{
  "took": 87,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 3.8050528,
    "hits": [
      {
        "_index": "forum",
        "_type": "article",
        "_id": "5",
        "_score": 3.8050528,
        "_source": {
          "articleID": "DHJK-B-1395-#Ky5",
          "userID": 3,
          "hidden": false,
          "postDate": "2019-05-01",
          "tag": ["elasticsearch"],
          "tag_cnt": 1,
          "view_cnt": 10,
          "title": "this is spark blog",
          "content": "spark is best big data solution based on scala ,an programming language similar to java spark",
          "sub_title": "haha, hello world",
          "author_first_name": "Tonny",
          "author_last_name": "Peter Smith",
          "new_author_last_name": "Peter Smith",
          "new_author_first_name": "Tonny",
          "follower_num": 60
        }
      },
      {
        "_index": "forum",
        "_type": "article",
        "_id": "2",
        "_score": 1.7247463,
        "_source": {
          "articleID": "KDKE-B-9947-#kL5",
          "userID": 1,
          "hidden": false,
          "postDate": "2017-01-02",
          "tag": ["java"],
          "tag_cnt": 1,
          "view_cnt": 50,
          "title": "this is java blog",
          "content": "i think java is the best programming language",
          "sub_title": "learned a lot of course",
          "author_first_name": "Smith",
          "author_last_name": "Williams",
          "new_author_last_name": "Williams",
          "new_author_first_name": "Smith",
          "follower_num": 10
        }
      }
    ]
  }
}
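As a rough sanity check on these scores: with boost_mode set to sum, each _score should equal the multi_match score plus log10(1 + 0.5 * follower_num). The sketch below subtracts that boost from the two returned scores. The base scores it prints are inferred rather than taken from the response, and they assume the sum formula holds exactly; follower_boost is a hypothetical helper name.

```python
import math

# (_id, _score, follower_num) copied from the two hits in the response.
hits = [("5", 3.8050528, 60), ("2", 1.7247463, 10)]


def follower_boost(follower_num, factor=0.5):
    """log1p modifier applied to factor * follower_num (base-10 log)."""
    return math.log10(1 + factor * follower_num)


for doc_id, score, followers in hits:
    boost = follower_boost(followers)
    implied_query_score = score - boost  # inferred, assuming boost_mode "sum"
    print(f"doc {doc_id}: boost={boost:.4f}, "
          f"implied query score={implied_query_score:.4f}")
```

Both boosts are well below the max_boost of 5, so under this reading the cap never takes effect for these two posts.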
