Update search design document

close #51
Mohammad
2023-05-07 18:12:15 +00:00
committed by Vinnie Falco
parent f3f1ce364c
commit 5eb42de218

@@ -1,36 +1,47 @@
= Search Functionality
This document describes the requirements and design of search functionality across Boost site documentation generated by Antora.
This document outlines the requirements and design of search functionality for Boost site documentation generated by Antora.
== Client-Side Search
With a client-side approach, a search index is built during the process of building a static website and will be loaded along the page inside the browser. A Javascript search engine running inside the browser uses this search index to respond to the search queries so there is no server-side service involved.
With a client-side approach, a search index is built as part of the static site build and loaded with the page in the browser. A JavaScript search engine running in the browser answers search queries from this index, so no server-side service is involved.
Client-side search has some advantages over server-based search:
Advantages of this approach include:
* It is very responsive because there is no request/response involved in the search process.
* There is no need for a server-side database and keep it updated with new content.
* There is no need to run a server-side search engine and keep it updated with new content.
* Low maintenance cost because there is no load on the server to respond to search queries.
* It can work offline (considering that we can build reference documents offline)
* It can work offline (provided that the reference documentation can be built locally).
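
The client-side approach described above can be illustrated with a minimal sketch. This is not how Lunr or any particular engine works internally; it is a toy inverted index, and the function names (`buildIndex`, `search`) and the sample pages are hypothetical. The index would be produced at build time, shipped as a static file, and queried entirely in the browser.

```javascript
// Split text into lowercase word tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9_]+/g) || [];
}

// Build step (would run during the site build): map each term to the
// URLs of the pages that contain it.
function buildIndex(pages) {
  const index = Object.create(null); // no prototype, so terms like "constructor" are safe
  for (const page of pages) {
    for (const term of new Set(tokenize(page.text))) {
      if (!index[term]) index[term] = [];
      index[term].push(page.url);
    }
  }
  return index;
}

// Look up one term; works for both the built index and a JSON-parsed copy.
function postingsFor(index, term) {
  return Object.prototype.hasOwnProperty.call(index, term) ? index[term] : [];
}

// Query step (would run in the browser): return pages containing every query term.
function search(index, query) {
  const terms = tokenize(query);
  if (terms.length === 0) return [];
  let results = postingsFor(index, terms[0]);
  for (const term of terms.slice(1)) {
    const postings = new Set(postingsFor(index, term));
    results = results.filter((url) => postings.has(url));
  }
  return results;
}

// Hypothetical sample pages, for illustration only.
const samplePages = [
  { url: "/asio/socket.html", text: "A TCP socket wrapper" },
  { url: "/beast/websocket.html", text: "WebSocket stream over a TCP socket" },
  { url: "/json/parse.html", text: "Parse JSON text" },
];
const sampleIndex = buildIndex(samplePages);
console.log(search(sampleIndex, "tcp socket"));
```

Because the index is plain data, it can be serialized with `JSON.stringify` at build time and loaded by the page without any server-side component, which is also why its size grows with the amount of indexed content.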
== Server-Side Search
== research
With a server-side approach, all documents are indexed in a search engine, such as Elasticsearch, and a server-side service executes search requests on the search engine and returns results.
Advantages of this approach include:
* Wider search scopes: server-hosted search indices can be huge.
* Better results, because more powerful cloud-computing resources are available.
* The possibility of semantic search.
* Analytics: server-collected statistics can be used to better serve users.
== Research
=== Antora Lunr Extension
https://gitlab.com/antora/antora-lunr-extension[Antora Lunr Extension] indexes the content during the build and includes the index in the published site, which is then used to provide a client-side full-text search.
https://gitlab.com/antora/antora-lunr-extension[Antora Lunr Extension] indexes content during builds and includes the index in published sites to provide a client-side full-text search.
Pros:
* Easy Integration with Antora (minor changes to `playbook.yaml` and `header-content.hbs`).
* No need for a server-side service and works offline.
* Very responsive due to the fact there is no request/response and everything happens inside the browser.
* Easy integration with Antora (minor changes to `playbook.yaml` and `header-content.hbs`).
* No need for a server-side service.
* Very responsive, as no request/response round trip is involved.
* Works offline.
Cons:
* If we decide to give an option for searching in all Boost libraries this will lead to a large `search-index.js` file (in order of 100MB) which makes it impractical.
* No option for adding metadata for categorizing search results in reference documentation.
* No semantic search.
* A search option covering all Boost libraries could lead to a large `search-index.js` file (around 100 MB), making it impractical to deploy.
* No option to add metadata for categorizing search results in reference documentation.
* No semantic search functionality.
NOTE: Antora Lunr Extension is already integrated into the demo site https://docs.cppalliance.org/user-guide/index.html
@@ -40,13 +51,13 @@ NOTE: Antora Lunr Extension is already integrated into the demo site https://doc
Free Algolia search service for developer docs.
Needs investigation (http://docsearch.algolia.com/[link-to-website]).
We are currently investigating this in https://github.com/cppalliance/site-docs/pull/54[this pull request].
==== typesense
==== Typesense
Open source alternative to Algolia which can be deployed locally.
Open-source alternative to Algolia that can be deployed locally.
Needs investigation (https://typesense.org/[link-to-website]).
Requires investigation (https://typesense.org/[link-to-website]).
=== Reference Documentation in Other Languages
@@ -54,9 +65,9 @@ Needs investigation (https://typesense.org/[link-to-website]).
Open-source documentation host for crates of the Rust Programming Language.
It uses a custom client-side search solution for searching in Reference Documentation. engine and indexing tools are implemented in https://github.com/rust-lang/rust/tree/master/src/librustdoc[librustdoc].
It uses a custom client-side search solution for reference documentation. The search engine and indexing tools are implemented in https://github.com/rust-lang/rust/tree/master/src/librustdoc[librustdoc].
The search index seems to contain meta-data for identifier types:
The search index contains metadata for identifier types:
[, Javascript]
----
@@ -71,7 +82,6 @@ const itemTypes = [
];
----
The search results are being narrowed down into Names, Parameters, and Return Types. Example of the search page (searching `socket` in `Tokio` library): https://docs.rs/tokio/1.28.0/tokio/?search=socket
The `search-index.js` file for each crate is not loaded until the user clicks on the search bar to reduce the size of the page.
The search results are narrowed down into Names, Parameters, and Return Types. An example of the search page (searching `socket` in the `Tokio` library) can be found at https://docs.rs/tokio/1.28.0/tokio/?search=socket.
The `search-index.js` file for each crate is not loaded until the user clicks on the search bar, which keeps the initial page size small.
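
The deferred loading described above could be sketched as follows. This is not rustdoc's actual implementation; `createLazyIndexLoader`, the `search-input` element id, and the `search-index.json` URL are hypothetical names used only for illustration. The idea is that the index is fetched on the first interaction with the search bar and the same result is reused afterwards.

```javascript
// Returns a function that loads the index on its first call and hands back
// the same promise on later calls, so the index is fetched at most once.
// If loading fails, the cached promise is cleared so a later call can retry.
function createLazyIndexLoader(fetchIndex) {
  let pending = null;
  return function loadIndex() {
    if (pending === null) {
      pending = Promise.resolve()
        .then(fetchIndex)
        .catch((err) => {
          pending = null;
          throw err;
        });
    }
    return pending;
  };
}

// In a browser, this might be wired up roughly like this
// (hypothetical element id and index URL):
//
//   const loadIndex = createLazyIndexLoader(() =>
//     fetch("search-index.json").then((response) => response.json()));
//   document.getElementById("search-input")
//     .addEventListener("focus", loadIndex, { once: true });
```

The trade-off is a short delay on the first search in exchange for not downloading the index on pages where the user never searches.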