# README
document-benchmark
This is a Go application (originally written by Dvir Volk) that supports reading, indexing, and searching documents with two search engines, RediSearch and ElasticSearch, using the following datasets:
- Wikipedia Abstract Data Dumps: from English-language Wikipedia:Database page abstracts. This use case generates 3 TEXT fields per document.
- pmc: Full text benchmark with academic papers from PMC.
Getting Started
Download standalone binaries (no Golang needed)
If you don't have Go on your machine and just want to use the prebuilt binaries, you can download them from the latest release:
https://github.com/RediSearch/RediSearchBenchmark/releases/latest
OS | Arch | Link |
---|---|---|
Windows | amd64 (64-bit X86) | document-benchmark-windows-amd64.exe |
Linux | amd64 (64-bit X86) | document-benchmark-linux-amd64 |
Linux | arm64 (64-bit ARM) | document-benchmark-linux-arm64 |
Darwin | amd64 (64-bit X86) | document-benchmark-darwin-amd64 |
Darwin | arm64 (64-bit ARM) | document-benchmark-darwin-arm64 |
Here's a bash script to download it and try it out:
wget -c https://github.com/RediSearch/RediSearchBenchmark/releases/latest/download/document-benchmark-$(uname -mrs | awk '{ print tolower($1) }')-$(dpkg --print-architecture).tar.gz -O - | tar -xz
# give it a try
./document-benchmark --help
Installation in a Golang env
The easiest way to get and install the benchmark utility in a Go environment is to use `go get` and then `go install`:
# Fetch this repo
go get github.com/RediSearch/RediSearchBenchmark
cd $GOPATH/src/github.com/RediSearch/RediSearchBenchmark
make
Try it out
To try it out locally, we can use Docker to spin up both a Redis and an Elastic environment:
sudo sysctl -w vm.max_map_count=262144
docker run -d -p 9200:9200 -p 9300:9300 -e "ELASTIC_PASSWORD=password" docker.elastic.co/elasticsearch/elasticsearch:8.3.3
docker run -d -p 6379:6379 redis/redis-stack:edge
- Retrieve the Wikipedia dataset (the populate steps below load up to 100000 documents via `-maxdocs 100000`):
wget https://s3.amazonaws.com/benchmarks.redislabs/redisearch/datasets/enwiki-abstract/enwiki-latest-abstract.xml
- Populate into RediSearch:
./bin/document-benchmark -hosts "127.0.0.1:6379" -engine redis -file enwiki-latest-abstract.xml -maxdocs 100000
- Populate into ElasticSearch:
./bin/document-benchmark -hosts "https://127.0.0.1:9200" -engine elastic -password "password" -file enwiki-latest-abstract.xml -maxdocs 100000
- Run the RediSearch benchmark:
./bin/document-benchmark -hosts "127.0.0.1:6379" -engine redis -benchmark search -file enwiki-latest-abstract.xml
- Run the ElasticSearch benchmark:
./bin/document-benchmark -hosts "https://127.0.0.1:9200" -engine elastic -password "password" -file enwiki-latest-abstract.xml -benchmark search
# Functions
Benchmark runs a given function f for the given duration, and outputs the throughput and latency of the function.
SearchBenchmark returns a closure of a function for the benchmarker to run, using a given index and options, on a set of queries.
# Constants
IndexName is the name of our index on all engines.
# Type aliases
ByTimestamp implements sort.Interface based on the Timestamp field of the DataPoint.