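Full-Text Search
1. Full-text search concepts:
(1) Types of data:
· Structured: data with a fixed format or a limited length, such as database records and metadata
· Unstructured: data of variable length or without a fixed format, such as emails and Word documents
(2) Retrieving unstructured data:
· Sequential scan: suitable for small amounts of data
· Full-text search: convert the unstructured data into structured data, build an index from it, and then search the index
(3) Definition: full-text search is a retrieval method that matches the search terms against all of the text in the stored files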
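To make the contrast in (2) concrete, here is a minimal sketch of the sequential scan approach. Every document is read in full for every query, which is why it only suits small data sets. The class and method names are illustrative and do not come from any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: checks every document for the keyword, one after another.
class SequentialScan {

    // Returns the positions (in the input list) of the documents that contain the keyword.
    static List<Integer> scan(List<String> documents, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            // The cost grows with the total amount of text: each query re-reads everything.
            if (documents.get(i).toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

Calling `SequentialScan.scan(docs, "lucene")` walks the whole list each time, whereas an index pays that cost once at indexing time.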
2. How full-text search is implemented
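The "build an index, then search it" step from section 1 is commonly implemented with an inverted index, the data structure Lucene is built around. The following is only a toy sketch under simplifying assumptions: the class name is made up, and terms are split on non-word characters, which works for ASCII text but is far simpler than a real analyzer.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each term to the sorted set of document ids that contain it.
class InvertedIndex {

    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Indexing step: split the text into lowercase terms and record the document id under each term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Search step: a single map lookup instead of re-reading every document.
    Set<Integer> search(String term) {
        Set<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return Collections.<Integer>emptySet();
        }
        return Collections.unmodifiableSet(ids);
    }
}
```

The design choice to note is where the work happens: the text is tokenized once in `add`, so `search` does not depend on how long the documents are.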
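3. Full-text search technologies: the Java-based open-source implementations are Lucene, Elasticsearch (which has its own built-in distributed management), and Solr
4. Introduction to Elasticsearch:
Concept:
(1) A highly scalable open-source full-text search and analytics engine
(2) Stores, searches, and analyzes large volumes of data quickly and in near real time
(3) Used to support enterprise applications with complex data search requirements
Features:
(1) Distributed
(2) Highly available
(3) Multi-type: supports multiple data types
(4) Multiple APIs
(5) Document-oriented
(6) Asynchronous writes
(7) Near real time: newly written data becomes searchable at a refresh every n seconds and is then written to disk
(8) Built on Lucene
(9) Released under the Apache license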
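Feature (7) can be pictured with a toy model: writes land in a buffer first and only become visible to searches after a refresh. This is a simplified illustration with made-up names, not how Elasticsearch is implemented internally. In Elasticsearch itself the refresh runs periodically and the interval is controlled by the `index.refresh_interval` index setting.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of near-real-time search: documents are searchable only after refresh().
class NearRealTimeStore {

    private final List<String> buffer = new ArrayList<>();     // written, not yet searchable
    private final List<String> searchable = new ArrayList<>(); // visible to queries

    // A write returns immediately; the document only goes into the buffer.
    void index(String doc) {
        buffer.add(doc);
    }

    // In a real engine this would run on a timer (every n seconds).
    void refresh() {
        searchable.addAll(buffer);
        buffer.clear();
    }

    // Queries only see documents that were included in a refresh.
    boolean contains(String keyword) {
        for (String doc : searchable) {
            if (doc.contains(keyword)) {
                return true;
            }
        }
        return false;
    }
}
```

A document indexed just before a query may therefore be missing from that query's results until the next refresh, which is the "near" in near real time.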
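5. Integrating Elasticsearch with Spring Boot
(1) Required environment: Elasticsearch, Spring Data Elasticsearch, JNA
(2) Install Elasticsearch: download the package, unpack it, and start it directly. Two common sources of errors deserve attention here: the Elasticsearch version must match the version your client libraries expect, and the ports must be configured correctly
(3) Create a Spring Boot project
(4) Edit pom.xml and add the required dependencies
(5) Before writing any project code, Elasticsearch must be installed locally in a version compatible with the Spring Boot version in use. Also mind the ports: the Elasticsearch HTTP service listens on port 9200, while the Java client used in this integration connects to the transport port 9300
Next, start the local Elasticsearch instance and then start the project. The complete pom.xml is: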
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.dhtt.spring.boot.blog</groupId>
    <artifactId>spring.data.action</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>spring.data.action</name>
    <description>Demo project for Spring Boot</description>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.0.RELEASE</version>
        <relativePath /> <!-- lookup parent from repository -->
    </parent>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-jpa</artifactId>
        </dependency>
        <!-- Spring Boot integration with Elasticsearch -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-thymeleaf</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <!-- Hot reloading during development -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-devtools</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <!-- JNA dependency -->
        <dependency>
            <groupId>net.java.dev.jna</groupId>
            <artifactId>jna</artifactId>
            <version>4.5.1</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
        </dependency>
        <!-- H2 in-memory database -->
        <!-- <dependency> <groupId>com.h2database</groupId> <artifactId>h2</artifactId> </dependency> -->
        <!-- MySQL database driver -->
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.46</version>
        </dependency>
        <!-- Hibernate persistence framework -->
        <dependency>
            <groupId>org.hibernate</groupId>
            <artifactId>hibernate-core</artifactId>
            <version>5.3.7.Final</version>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
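Since port problems are called out in step (5), it can help to confirm that the local Elasticsearch instance is listening on both ports before starting the project. The following is a small sketch using only the JDK; the class name, the 127.0.0.1 host, and the one-second timeout are assumptions for a default local installation.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: reports whether a TCP port accepts connections.
class PortCheck {

    static boolean isOpen(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is listening, or it is unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // 9200 is the HTTP port, 9300 the transport port used by the Java client.
        System.out.println("HTTP 9200 open: " + isOpen("127.0.0.1", 9200, 1000));
        System.out.println("Transport 9300 open: " + isOpen("127.0.0.1", 9300, 1000));
    }
}
```

If either line prints `false`, the project is likely to fail at startup with a connection error, so it is worth fixing the Elasticsearch side first.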
Start the project to check that the configuration is correct and that the project starts successfully.
(6) Once the project starts successfully, configure the application.properties file:
# Thymeleaf configuration
spring.thymeleaf.encoding=UTF-8
# Disable template caching so that file changes show up immediately with hot reloading
spring.thymeleaf.cache=false
# Use the HTML5 standard
spring.thymeleaf.mode=HTML5
spring.thymeleaf.suffix=.html
spring.resources.chain.strategy.content.enabled=true

# Elasticsearch server address (transport port)
spring.data.elasticsearch.cluster-nodes=127.0.0.1:9300
# Connection timeout
spring.data.elasticsearch.properties.transport.tcp.connect_timeout=120s
# Cluster name, defaults to elasticsearch
#spring.data.elasticsearch.cluster-name=elasticsearch
#spring.data.elasticsearch.repositories.enabled=true
#spring.data.elasticsearch.properties.path.logs=./elasticsearch/log
#spring.data.elasticsearch.properties.path.data=./elasticsearch/data

# Database connection configuration
spring.datasource.url=jdbc:mysql://localhost:3306/blog_test?useUnicode=true&characterEncoding=UTF-8&serverTimezone=GMT%2B8&useSSL=false
spring.datasource.username=root
spring.datasource.password=qitao1996
spring.datasource.driver-class-name=com.mysql.jdbc.Driver

# JPA configuration
spring.jpa.show-sql=true
spring.jpa.hibernate.ddl-auto=create-drop
(7) Write the back-end code:
The document class EsBlog:
package com.dhtt.spring.boot.blog.spring.data.action.entity;

import java.io.Serializable;

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;

/**
 * EsBlog entity (document) class
 *
 * @author QiTao
 */
@Document(indexName = "blog", type = "blog") // index and type the documents are stored under
public class EsBlog implements Serializable {

    private static final long serialVersionUID = 4745983033416635193L;

    @Id // Spring Data's identifier annotation (not javax.persistence.Id): marks the document id
    private String id;
    private String title;
    private String summary;
    private String content;

    // No-argument constructor used by the framework when mapping documents back to objects
    protected EsBlog() {
        super();
    }

    public EsBlog(String title, String summary, String content) {
        super();
        this.title = title;
        this.summary = summary;
        this.content = content;
    }

    public String getId() {
        return id;
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getSummary() {
        return summary;
    }

    public void setSummary(String summary) {
        this.summary = summary;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }

    @Override
    public String toString() {
        return "EsBlog [id=" + id + ", title=" + title + ", summary=" + summary + ", content=" + content + "]";
    }
}
The repository, which defines the data query interface:
package com.dhtt.spring.boot.blog.spring.data.action.repository;

import org.springframework.data.domain.Page;
import org.springframework.data.domain.Pageable;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

import com.dhtt.spring.boot.blog.spring.data.action.entity.EsBlog;

/**
 * EsBlogRepository interface
 *
 * @author QiTao
 */
public interface EsBlogRepository extends ElasticsearchRepository<EsBlog, String> {

    /**
     * Paged, de-duplicated query: matches blogs whose title, summary, or content
     * contains the corresponding keyword.
     *
     * @param title    keyword to look for in the title
     * @param summary  keyword to look for in the summary
     * @param content  keyword to look for in the content
     * @param pageable paging information (a PageRequest can be passed here)
     * @return one page of matching blogs
     */
    Page<EsBlog> findDistinctEsBlogByTitleContainingOrSummaryContainingOrContentContaining(String title,
            String summary, String content, Pageable pageable);
}
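The repository method has no body because Spring Data derives the query from the method name: three "Containing" conditions joined with "Or". The sketch below shows only that boolean logic over an in-memory list. `Post` is a hypothetical stand-in for EsBlog, and plain substring matching is an assumption here: the real query runs inside Elasticsearch, where matching depends on how the fields are analyzed.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrates the OR-of-three-conditions semantics of the derived query name.
class DerivedQuerySketch {

    // Hypothetical stand-in for the EsBlog document.
    static class Post {
        final String title;
        final String summary;
        final String content;

        Post(String title, String summary, String content) {
            this.title = title;
            this.summary = summary;
            this.content = content;
        }
    }

    // A post matches if ANY of the three fields contains its keyword.
    static List<Post> findByTitleOrSummaryOrContentContaining(List<Post> posts, String title, String summary,
            String content) {
        List<Post> result = new ArrayList<>();
        for (Post post : posts) {
            if (post.title.contains(title) || post.summary.contains(summary) || post.content.contains(content)) {
                result.add(post);
            }
        }
        return result;
    }
}
```

Note that in this sketch an empty keyword matches every post, because every string contains the empty string.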
Finally, write the Controller class:
package com.dhtt.spring.boot.blog.spring.data.action.web.user;

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.domain.Page;
import org.springframework.data.domain.PageRequest;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

import com.dhtt.spring.boot.blog.spring.data.action.entity.EsBlog;
import com.dhtt.spring.boot.blog.spring.data.action.repository.EsBlogRepository;

@RestController
@RequestMapping("/blogs")
public class BlogController {

    @Autowired
    private EsBlogRepository esBlogRepository;

    @GetMapping
    public List<EsBlog> list(@RequestParam(value = "title") String title,
            @RequestParam(value = "summary") String summary,
            @RequestParam(value = "content") String content,
            @RequestParam(value = "pageIndex", defaultValue = "0") int pageIndex,
            @RequestParam(value = "pageSize", defaultValue = "10") int pageSize) {
        // Reset and insert test data (demo only: this runs on every request)
        esBlogRepository.deleteAll();
        esBlogRepository.save(new EsBlog("登鹳雀楼", "王之涣的登鹳雀楼", "白日依山尽,黄河入海流,欲穷千里目,更上一层楼"));
        esBlogRepository.save(new EsBlog("相思", "王维的相思", "红豆生南国,春来发几枝,愿君多采撷,此物最相思"));
        esBlogRepository.save(new EsBlog("静夜思", "李白的静夜思", "床前明月光,疑是地上霜,举头望明月,低头思故乡"));

        // Run the paged query; pageIndex is zero-based
        PageRequest pageRequest = PageRequest.of(pageIndex, pageSize);
        Page<EsBlog> page = esBlogRepository
                .findDistinctEsBlogByTitleContainingOrSummaryContainingOrContentContaining(title, summary, content,
                        pageRequest);
        return page.getContent();
    }
}
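The controller passes `PageRequest.of(pageIndex, pageSize)` to the repository, where the page index is zero-based. As a rough illustration of what such a request selects, here is a plain Java sketch that slices an in-memory list the same way. The class and method are hypothetical, and in the real application the slicing is done by Elasticsearch rather than in application code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of zero-based paging: page 0 is the first pageSize items, page 1 the next, and so on.
class PagingSketch {

    static <T> List<T> page(List<T> items, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize < 1) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize must be >= 1");
        }
        long from = (long) pageIndex * pageSize; // long avoids int overflow for large indexes
        if (from >= items.size()) {
            return Collections.emptyList();      // requested page lies past the end
        }
        int to = (int) Math.min(from + pageSize, items.size());
        return new ArrayList<>(items.subList((int) from, to));
    }
}
```

With the controller's defaults (`pageIndex=0`, `pageSize=10`), all three test documents fit on the first page.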
Start the project and access it from the front end:
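The controller requires the `title`, `summary`, and `content` query parameters, so the request URL has to carry all three. A small sketch for building such a URL follows. The base URL `http://localhost:8080` assumes Spring Boot's default port, which the configuration above does not override, and the class name is made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the GET URL for the /blogs endpoint with encoded parameters.
class BlogRequestUrl {

    static String build(String baseUrl, String title, String summary, String content, int pageIndex,
            int pageSize) {
        return baseUrl + "/blogs?title=" + encode(title)
                + "&summary=" + encode(summary)
                + "&content=" + encode(content)
                + "&pageIndex=" + pageIndex
                + "&pageSize=" + pageSize;
    }

    // Query values (for example Chinese keywords) must be URL-encoded as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available on the JVM", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8080", "相思", "相思", "相思", 0, 10));
    }
}
```

Opening the printed URL in a browser should return the matching documents as JSON, since the controller is a `@RestController`.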
The results are displayed correctly in the front end, so the Elasticsearch + Spring Boot integration works.