How Solr Automatically Imports Data from MySQL

2022-06-15 09:59:53

Notes on importing data

In Note 2, running the import may fail with an error. The cause is that the MySQL JDBC driver, mysql-connector-java-xxx.jar, also has to be placed in the solr-xxx/server/lib folder.
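
Before re-running the import, it can help to confirm that the driver JAR really is where Solr will look for it. The following is a minimal sketch, not part of Solr itself: `find_jars` is a hypothetical helper, and the directory is passed in as an argument because the real solr-xxx path depends on your installation.

```python
from pathlib import Path
import sys


def find_jars(lib_dir, pattern="mysql-connector-java-*.jar"):
    """Return the sorted file names in lib_dir that match pattern.

    lib_dir is assumed to be the server/lib folder of your Solr install;
    a missing directory simply yields an empty list.
    """
    return sorted(p.name for p in Path(lib_dir).glob(pattern))


if __name__ == "__main__":
    # Usage: python check_jar.py /path/to/solr/server/lib
    lib_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    jars = find_jars(lib_dir)
    if jars:
        print("found:", ", ".join(jars))
    else:
        print(f"no MySQL connector JAR found in {lib_dir}")
```

The same helper can be pointed at WEB-INF/lib with a different pattern to check for the scheduler JAR used in the next section.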

Automatic incremental updates

  • Put solr-dataimport-scheduler.jar into the solr-xxx/server/solr-webapp/webapp/WEB-INF/lib folder;
  • Register the listener in solr-xxx/server/solr-webapp/webapp/WEB-INF/web.xml:
<listener>
	<listener-class>org.apache.solr.handler.dataimport.scheduler.ApplicationListener</listener-class>
</listener>
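  • Create a new folder named conf under solr-xxx/server/solr/. Note that this is not the conf folder inside solr-xxx/server/solr/weibo/;
  • Extract dataimport.properties from solr-dataimport-scheduler.jar, put it into the conf folder created in the previous step, and adjust it to your needs. My configuration, for example, looks like this: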
# dataimport.properties example
#
# From this example, copy everything below "dataimport scheduler properties" to your
#   dataimport.properties file and then change params to fit your needs
#
# IMPORTANT:
# Regardless of whether you have single or multiple-core Solr,
#   use dataimport.properties located in your solr.home/conf (NOT solr.home/core/conf)
# For more info and context see here:
# http://wiki.apache.org/solr/DataImportHandler#dataimport.properties_example


#Tue Jul 21 12:10:50 CEST 2010
# metadataObject.last_index_time=2010-09-20 11:12:47
# last_index_time=2010-09-20 11:12:47


#################################################
#                                               #
#       dataimport scheduler properties         #
#                                               #
#################################################

#  to sync or not to sync
#  1 - active; anything else - inactive
syncEnabled=1

#  which cores to schedule
#  in a multi-core environment you can decide which cores you want synchronized
#  leave empty or comment it out if using single-core deployment
syncCores=weibo

#  solr server name or IP address
#  [defaults to localhost if empty]
server=localhost

#  solr server port
#  [defaults to 80 if empty]
port=8983

#  application name/context
#  [defaults to current ServletContextListener's context (app) name]
webapp=solr

#  URL params [mandatory]
#  remainder of URL
params=/dataimport?command=delta-import&clean=false&commit=true

#  schedule interval
#  number of minutes between two runs
#  [defaults to 30 if empty]
# interval between automatic incremental updates, in minutes; defaults to 30
interval=5

# interval between full index rebuilds, in minutes; defaults to 7200, i.e. 5 days
reBuildIndexInterval = 7200

# URL params for the index rebuild request
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true
 

# start time for the index rebuild schedule
reBuildIndexBeginTime=1:30:00
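
To see what these settings amount to, the sketch below reads the key=value pairs and assembles the URLs the scheduler would request. It rests on an assumption that is not stated in the file itself: that the request URL has the form http://server:port/webapp/core followed by params. The names `parse_properties` and `build_urls` are hypothetical helpers written for this illustration, not part of the scheduler JAR.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, since params values contain "=" too.
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def build_urls(props, params_key="params"):
    """Build one request URL per core (assumed URL layout, see lead-in)."""
    server = props.get("server") or "localhost"  # documented default
    port = props.get("port") or "80"             # documented default
    webapp = props.get("webapp") or "solr"       # assumption: context is "solr"
    cores = [c.strip() for c in props.get("syncCores", "").split(",") if c.strip()]
    base = f"http://{server}:{port}/{webapp}"
    params = props[params_key]
    if not cores:
        # Single-core deployment: no core segment in the path.
        return [base + params]
    return [f"{base}/{core}{params}" for core in cores]


SAMPLE = """\
syncEnabled=1
syncCores=weibo
server=localhost
port=8983
webapp=solr
params=/dataimport?command=delta-import&clean=false&commit=true
interval=5
reBuildIndexInterval = 7200
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    print("delta:", build_urls(props))
    print("full: ", build_urls(props, "reBuildIndexParams"))
    # 7200 minutes / 60 / 24 = 5 days between full rebuilds
    print("rebuild every", int(props["reBuildIndexInterval"]) / 60 / 24, "days")
```

Opening the delta URL by hand in a browser is a quick way to check that the core name and handler path are right before relying on the 5-minute schedule.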

Summary

With these steps in place, automatic incremental import from the database is set up.
