120+ Engineers
20+ Countries
850+ Projects
750+ Satisfied Clients
4.9 Clutch

Building a Fast, Typo-Tolerant Search Engine in Python with Typesense


Last Update: 31 Oct 2024


1. What Is Typesense?

Typesense is a fast, open-source search engine built for typo-tolerant, relevance-ranked full-text search. It is designed to be easy to set up and use without giving up robust capabilities. It's a good fit for e-commerce platforms, content-rich websites, and real-time search apps where users expect useful results even when they mistype a query.

What Makes Typesense Unique?

Elasticsearch and Algolia are popular alternatives, but Typesense stands out for being simple to set up, fast at query time, and typo-tolerant out of the box. It can be deployed in a container or used as a hosted service (Typesense Cloud), which makes it easy to get started in different environments.

2. Why Use Python with Typesense?

Python's simplicity and flexibility make it a popular choice for building applications, and Typesense's Python client makes integration straightforward. Whether you need search in a data processing script or in a Django or Flask app, Typesense provides:

  • Low-latency searches: Typesense is optimized for speed.
  • Typo tolerance: It handles misspellings and partial (prefix) word matches.
  • Relevance tuning: Configuring how results are ranked is simple.

3. Getting Started with Typesense

Let's get started by setting up Typesense and integrating it with Python. We'll start the Typesense server first, then connect to it from a Python project.

Setting Up Typesense Locally

The quickest way to run Typesense locally is via Docker. Run the following command to start a Typesense container:

docker run -p 8108:8108 -v /typesense-data:/data typesense/typesense:latest --data-dir /data --api-key=xyz --enable-cors

This command launches a Typesense server on localhost:8108 with the API key xyz. In production, replace xyz with a secure key.
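Before moving on, it can help to confirm that the container is actually accepting requests. The sketch below polls Typesense's /health endpoint using only the standard library. The function name typesense_is_healthy and the injectable opener parameter are illustrative choices, not part of any Typesense client.

```python
import json
import urllib.error
import urllib.request


def typesense_is_healthy(base_url="http://localhost:8108", timeout=2.0,
                         opener=urllib.request.urlopen):
    """Return True if GET {base_url}/health reports {"ok": true}.

    `opener` is injectable so the check can be exercised without a server.
    """
    try:
        with opener(f"{base_url}/health", timeout=timeout) as response:
            return json.load(response).get("ok") is True
    except (urllib.error.URLError, OSError, ValueError, AttributeError):
        # Connection refused, timeout, or a body that is not the expected JSON.
        return False
```

Calling typesense_is_healthy() with no arguments checks the local server started by the Docker command above; it returns False rather than raising if the server is not reachable yet.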

Installing the Typesense Python Client

Install the typesense-python client to interact with Typesense from Python:

pip install typesense

4. Connecting to Typesense with Python

With the Typesense server running, you can connect to it from your Python script or app.

import typesense

client = typesense.Client({
    'nodes': [{
        'host': 'localhost',
        'port': '8108',
        'protocol': 'http'
    }],
    'api_key': 'xyz',
    'connection_timeout_seconds': 2
})

This code initializes a Typesense client connected to your local server. Now, you are ready to create collections and index data.
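Hard-coding the host and API key is fine for a local experiment, but you will probably want to read them from the environment once the code leaves your machine. Here is a minimal sketch of that idea. The environment variable names (TYPESENSE_HOST and so on) and the helper name are assumptions made for this example; Typesense does not define them.

```python
import os


def typesense_config_from_env(env=os.environ):
    """Build the dict passed to typesense.Client from environment variables.

    The variable names below are hypothetical; pick whatever fits your project.
    """
    return {
        "nodes": [{
            "host": env.get("TYPESENSE_HOST", "localhost"),
            "port": env.get("TYPESENSE_PORT", "8108"),
            "protocol": env.get("TYPESENSE_PROTOCOL", "http"),
        }],
        # Required: raises KeyError if the key is missing instead of
        # silently falling back to a placeholder.
        "api_key": env["TYPESENSE_API_KEY"],
        "connection_timeout_seconds": float(
            env.get("TYPESENSE_TIMEOUT_SECONDS", "2")
        ),
    }
```

The returned dict has the same shape as the one shown above, so it can be passed directly as typesense.Client(typesense_config_from_env()).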

5. Creating and Managing Collections

Typesense stores data in collections. Each collection has a schema that specifies its fields and their types, which lets Typesense build the right indexes and keep searches fast.

Defining a Collection Schema

Let's create a products collection for an online store. We'll define four fields: title, description, price, and categories:

product_schema = {
    "name": "products",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "description", "type": "string"},
        {"name": "price", "type": "float"},
        {"name": "categories", "type": "string[]", "facet": True}
    ]
}

client.collections.create(product_schema)

 

This schema includes a title, description, price, and a categories field for faceting (filtering by categories).
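Because the schema is just a Python dict, you can also use it to sanity-check documents before sending them to the server. The helper below is a rough client-side approximation that only covers the three field types used in the products schema; Typesense performs its own validation on import, and the function name schema_violations is made up for this sketch.

```python
product_schema = {
    "name": "products",
    "fields": [
        {"name": "title", "type": "string"},
        {"name": "description", "type": "string"},
        {"name": "price", "type": "float"},
        {"name": "categories", "type": "string[]", "facet": True},
    ],
}

# Only the types used in this article; anything else is left unchecked.
_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string[]": lambda v: isinstance(v, list) and all(isinstance(x, str) for x in v),
}


def schema_violations(schema, document):
    """Return a list of problems; an empty list means the document fits."""
    problems = []
    for field in schema["fields"]:
        name, type_ = field["name"], field["type"]
        if name not in document:
            if not field.get("optional", False):
                problems.append(f"missing field: {name}")
            continue
        check = _TYPE_CHECKS.get(type_)
        if check is not None and not check(document[name]):
            problems.append(f"{name}: expected {type_}")
    return problems
```

Running this over a batch before import gives you readable error messages early, at the cost of duplicating a small part of what the server already checks.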

Adding and Updating Records

With our schema set, we can add records to the collection:

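products = [
    # Explicit ids let upsert match existing records instead of creating duplicates.
    {"id": "1", "title": "Wireless Earbuds", "description": "High-quality earbuds", "price": 29.99, "categories": ["electronics"]},
    {"id": "2", "title": "Smart Watch", "description": "Feature-rich smartwatch", "price": 99.99, "categories": ["wearables", "electronics"]}
]

client.collections['products'].documents.import_(products, {'action': 'upsert'})

This code imports the product data with the "upsert" action: a document whose id matches an existing record replaces it, and any other document is added. The explicit id values matter here, because Typesense assigns a fresh id to any document that lacks one, so re-running the import without them would create duplicates.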
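A bulk import can partially succeed, so it is worth checking the return value rather than assuming every record was stored. The sketch below assumes that import_, when given a list of documents, returns one result dict per document with a boolean "success" key and an "error" message on failure. Verify that shape against the client version you have installed; failed_imports is a hypothetical helper name.

```python
def failed_imports(import_results):
    """Return the per-document results that did not succeed.

    Assumes each entry looks like {"success": True} or
    {"success": False, "error": "...", ...}.
    """
    return [r for r in import_results if not r.get("success", False)]


# Example with a hand-written result list in the assumed shape:
sample_results = [
    {"success": True},
    {"success": False, "error": "Field `price` must be a float."},
]
for failure in failed_imports(sample_results):
    print("import failed:", failure.get("error", "unknown error"))
```

In a real script you would pass the value returned by documents.import_(...) instead of sample_results.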

6. Performing Searches

Now, let’s search our products collection for items related to "watch."

results = client.collections['products'].documents.search({
    'q': 'watch',
    'query_by': 'title,description',
    'facet_by': 'categories'
})

print(results)

With Typesense, you can search across several fields at once (here, title and description). The facet_by parameter adds per-category counts to the response, which you can pair with filter_by to narrow down the results.
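Printing the raw response is fine for a first look, but most code only needs a few parts of it. The sketch below pulls out the match count, the matching titles, and the facet counts. It assumes the usual response layout with "found", "hits" (each holding a "document"), and "facet_counts" keys; the sample response is hand-written for illustration, not captured from a server.

```python
def summarize_results(results):
    """Reduce a search response to its count, titles, and facet counts."""
    titles = [hit["document"]["title"] for hit in results.get("hits", [])]
    facets = {
        facet["field_name"]: {c["value"]: c["count"] for c in facet["counts"]}
        for facet in results.get("facet_counts", [])
    }
    return {"found": results.get("found", 0), "titles": titles, "facets": facets}


# Hand-written sample in the assumed response shape:
sample_response = {
    "found": 1,
    "hits": [{"document": {"title": "Smart Watch", "price": 99.99}}],
    "facet_counts": [{
        "field_name": "categories",
        "counts": [{"value": "electronics", "count": 1},
                   {"value": "wearables", "count": 1}],
    }],
}
summary = summarize_results(sample_response)
```

A summary like this is also a convenient shape to return from an API endpoint, since it hides engine-specific details from the frontend.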

7. Advanced Search Features

Typesense offers robust search customization options:

  • Typo Tolerance: Built-in typo tolerance returns relevant results even when the query contains minor spelling errors.
  • Relevance Tuning: You can control how much each searched field contributes to ranking by assigning it a weight.
  • Faceting and Filtering: Mark fields such as categories as facets so users can filter results by category.

For example, weighting title more heavily than description takes one extra parameter:

results = client.collections['products'].documents.search({
    'q': 'watch',
    'query_by': 'title,description',
    'query_by_weights': '3,1'  # title is weighted more heavily than description
})
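The three features above can be combined in a single query. The following sketch builds a parameter dict that weights title over description, caps typo tolerance with num_typos, and narrows results with filter_by. The helper name build_search_params and its arguments are illustrative; values containing spaces or special characters would need escaping, which this sketch does not handle.

```python
def build_search_params(q, category=None, max_price=None, num_typos=2):
    """Assemble search parameters for the products collection."""
    params = {
        "q": q,
        "query_by": "title,description",
        "query_by_weights": "3,1",   # title counts more than description
        "num_typos": num_typos,      # maximum typos tolerated per query token
        "facet_by": "categories",
    }
    filters = []
    if category is not None:
        filters.append(f"categories:={category}")   # exact match on a category
    if max_price is not None:
        filters.append(f"price:<={max_price}")      # numeric upper bound
    if filters:
        params["filter_by"] = " && ".join(filters)
    return params


# A misspelled query that should still find the smart watch:
params = build_search_params("wach", category="electronics", max_price=150)
```

The resulting dict can be passed to client.collections['products'].documents.search(params) just like the hand-written dicts earlier.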

 

 

8. Tips for Performance Optimization

As you build more complex applications, it’s important to consider optimizations:

  • Batch Importing: Adding documents in batches with import_ is more efficient than indexing records one at a time.
  • Index Tuning: Design your schema with care, indexing and faceting only the fields you actually search or filter on.
  • Caching: If the same queries are issued frequently, consider caching their results to reduce load on the search server.
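The batching and caching tips can be sketched in plain Python. Both helpers below take the actual Typesense call as a function argument (import_fn and search_fn), so they make no assumptions about the client beyond the calls already shown in this article. The names, the default batch size, and the TTL are arbitrary choices for illustration.

```python
import time


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_in_batches(import_fn, documents, size=100):
    """Call import_fn(chunk, params) once per batch and collect the results."""
    results = []
    for chunk in batched(documents, size):
        results.extend(import_fn(chunk, {"action": "upsert"}))
    return results


class CachedSearch:
    """Wrap a search function with a small time-based cache.

    Parameter values must be hashable (strings and numbers are fine).
    Entries are never evicted, so this suits a bounded set of queries.
    """

    def __init__(self, search_fn, ttl_seconds=30.0, clock=time.monotonic):
        self._search_fn = search_fn
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def __call__(self, params):
        key = tuple(sorted(params.items()))
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh: skip the server round trip
        result = self._search_fn(params)
        self._entries[key] = (now, result)
        return result
```

With a real client you might write search = CachedSearch(client.collections['products'].documents.search) and import_in_batches(client.collections['products'].documents.import_, products). For heavier traffic, a shared cache such as Redis would be a more realistic choice than an in-process dict.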

9. Real-World Example: Building a Search API

To illustrate Typesense in action, let’s create a basic Flask API for product search.

from flask import Flask, request, jsonify
import typesense


app = Flask(__name__)

client = typesense.Client({
    'nodes': [{'host': 'localhost', 'port': '8108', 'protocol': 'http'}],
    'api_key': 'xyz',
    'connection_timeout_seconds': 2
})


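@app.route('/search')
def search():
    # Fall back to '*' (match all documents) when no query is supplied,
    # so a request without ?q= does not send an empty query to Typesense.
    query = request.args.get('q', '*')
    search_parameters = {
        'q': query,
        'query_by': 'title,description',
        'facet_by': 'categories'
    }

    # Forward the query to Typesense and return its JSON response as-is.
    results = client.collections['products'].documents.search(search_parameters)
    return jsonify(results)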


if __name__ == '__main__':
    app.run()

 

This straightforward Flask application offers a search API that receives a query and produces Typesense results.
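A public endpoint usually needs a little input handling on top of this, for example pagination with sensible limits. The sketch below turns raw query-string values into Typesense search parameters using the page and per_page options. It works on any dict-like object, including Flask's request.args; the function name and the max_per_page cap are assumptions for this example.

```python
def parse_search_args(args, max_per_page=50):
    """Turn raw query-string values into Typesense search parameters.

    Raises ValueError if page or per_page is not an integer.
    """
    # A blank or missing query falls back to '*', which matches everything.
    q = (args.get("q") or "").strip() or "*"
    try:
        page = max(int(args.get("page", 1)), 1)
        per_page = min(max(int(args.get("per_page", 10)), 1), max_per_page)
    except (TypeError, ValueError):
        raise ValueError("'page' and 'per_page' must be integers") from None
    return {
        "q": q,
        "query_by": "title,description",
        "facet_by": "categories",
        "page": page,
        "per_page": per_page,
    }
```

Inside the route you could call parse_search_args(request.args), return a 400 response when it raises ValueError, and pass the resulting dict to documents.search otherwise.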

10. Conclusion

Typesense makes it simple to add search functionality to your project. Its typo tolerance, speed, and relevance tuning make it a good option for applications that need fast, accurate search.
