Why write my own indexing and search engine

January 30, 2007

Like any developer, I used to trust only the code I wrote myself. All that has changed with OpenSource and its widespread use.
However, I am always on the lookout to write something better than what is available in OpenSource; personally, it is a means to justify the act of writing an application 🙂
The growing volume of data in an enterprise can be a liability or an asset, depending on how you see it. Access to this data converts it to useful information.
How does one access information easily? Do we really care about the millions of hits that Google returns? I don't think we go beyond the first couple of pages.
I define “Effective search” to address the above issue – I need to get to the information of interest fast, period.
OpenSource indexing and search frameworks are far behind commercial ones such as the Google Search Appliance, Verity and other search engines.
Looking around, Lucene turned out to be a good fit for my index. The catch is I still required parsers, readers and data sources to make it complete.
This led me to write Ferret. Wherever possible, it does not reinvent the wheel.
The good news is that it can index file systems and web sites (secure intranets and public sites). The best part is that it is highly customizable: I can add a data source to index databases, for example, or add parsers for new file types.
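To make the idea of pluggable data sources and parsers concrete, here is a minimal sketch in Python. It is not Ferret's actual code: the names `Document`, `InMemorySource`, `ParserRegistry` and `crawl` are hypothetical, and a real setup would hand the resulting documents to an index such as Lucene rather than collect them in a list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple


@dataclass
class Document:
    """A unit of text handed to the indexer (hypothetical type)."""
    uri: str
    text: str


class InMemorySource:
    """Stand-in data source; a file system, web site or database
    source would expose the same items() method."""

    def __init__(self, files: Dict[str, bytes]) -> None:
        self._files = files

    def items(self) -> Iterable[Tuple[str, bytes]]:
        return self._files.items()


# A parser turns raw bytes into plain text.
Parser = Callable[[bytes], str]


class ParserRegistry:
    """Maps file extensions to parsers so new file types can be plugged in."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Parser] = {}

    def register(self, extension: str, parser: Parser) -> None:
        self._parsers[extension.lower()] = parser

    def parser_for(self, uri: str) -> Optional[Parser]:
        _, dot, ext = uri.rpartition(".")
        return self._parsers.get(ext.lower()) if dot else None


def crawl(source, registry: ParserRegistry) -> Iterator[Document]:
    """Yield a Document for every item that has a registered parser."""
    for uri, raw in source.items():
        parser = registry.parser_for(uri)
        if parser is None:
            continue  # unknown file type: skip it instead of failing
        yield Document(uri=uri, text=parser(raw))


registry = ParserRegistry()
registry.register("txt", lambda raw: raw.decode("utf-8"))

source = InMemorySource({"notes.txt": b"hello", "image.bin": b"\x00"})
docs = list(crawl(source, registry))
```

In this sketch, supporting a new file type is one more `register` call, and a new data source is any object with an `items()` method; the crawl loop itself does not change.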
The recent announcement of Omnifind's availability led me to evaluate it and, of course, compare it with Ferret. After an extensive study of its features, I have yet to decide whether I can recommend it to a client, given that I cannot customize much beyond, perhaps, the look and feel. It also beats me why I cannot schedule an indexing operation, or at least call an API to invoke the indexer! Omnifind suits “indexing for dummies” needs, but not an active deployment within, for example, a corporate portal.
For now, Ferret does all this and has found a client 🙂


4 Responses to “Why write my own indexing and search engine”

  1. Yodin Says:

    Hi, it’s good to have complete code for software development. But I am still confused about compiling, linking, and bugs. Is there any manual that only shows figures for that? I mean for all types of programming languages. To get the code is easy; to test it, I don’t know…

  2. yODIN Says:


  3. Yodin,
    I am not sure what you are looking for and whether it is even related to this post. Are you looking for an easy-to-use search engine implementation that requires minimal effort in compiling and deployment?

  4. Srikanth Says:

    Looks like someone else also has a search engine with the same name.

    I could not reach the Ruby Ferret site, as I keep getting a “site not found” error. However, I was puzzled when I saw people talking about using Ferret, and wondered whether it was the same Ferret that you wrote.

