Roman Kagan

26 posts
Roman started working as a programmer as a teenager when he was hired to hack Prolog at a Minsk artificial intelligence lab. Roman was one of the first developers using Java to create web applications. Since 1991, Roman has been consulting for companies including Hewlett-Packard, EDS, GM, Ford, Chrysler, Fanuc Robotics, Comerica and Polk.

Models for Search Ranking

Simple Approaches to Search Ranking. In the world of search, ranking models are crucial. They determine the order in which results are displayed, impacting user experience and engagement. But does ranking always have to be complex? The answer might surprise you. 1. Understanding Ranking Models: Ranking models are algorithms used by […]

Various Search Relevance Algorithms.

Various search relevance algorithms have been developed over the years to improve the quality of search results. Some of these methods are foundational, while others are cutting-edge and have arisen from advancements in machine learning and natural language processing. Here’s a list of some popular search relevance algorithms and methods: […]

What is BM25f

BM25f is an extension of the BM25 scoring function, which is a part of the family of ranking functions used in information retrieval. BM25 itself is a modern alternative to the classic TF-IDF scheme, designed to rank documents based on their relevance to a given query. Here’s a breakdown of […]
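As a rough illustration of the BM25 scoring mentioned above, here is a minimal sketch of plain BM25 over pre-tokenized documents. It does not include the per-field weighting that BM25f adds. The function name `bm25_scores`, the toy corpus, and the parameter values `k1=1.2` and `b=0.75` (commonly used defaults) are assumptions made for this example, not taken from the post.

```python
import math
from collections import Counter


def bm25_scores(query, corpus, k1=1.2, b=0.75):
    """Score every document in corpus (a list of token lists) against query.

    k1 controls term-frequency saturation and b controls length
    normalization; both values are assumed defaults for this sketch.
    """
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in corpus for term in set(doc))

    scores = []
    for doc in corpus:
        counts = Counter(doc)
        score = 0.0
        for term in query:
            tf = counts[term]
            if tf == 0:
                continue  # a term absent from the document contributes nothing
            # Smoothed IDF variant that is never negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Longer-than-average documents get a larger denominator.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, for illustration only.
corpus = [
    "solr ranks search results".split(),
    "search ranking with bm25 and bm25 variants".split(),
    "scalding runs on hadoop".split(),
]
ranking = bm25_scores(["bm25", "search"], corpus)
```

In this toy corpus the second document matches both query terms, so it scores highest. The third matches neither and scores zero.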

What is TF-IDF

TF-IDF stands for Term Frequency-Inverse Document Frequency. It’s a numerical statistic used to indicate the importance of a word in a document relative to a collection of documents, often called a corpus. TF-IDF is commonly used in the field of information retrieval and text mining. Here’s a breakdown: Why is […]
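To make the definition above concrete, here is a minimal sketch of one common TF-IDF variant: term frequency as a share of the document's tokens, multiplied by the natural log of (number of documents / number of documents containing the term). The function name `tf_idf` and the toy corpus are assumptions made for this example. Real libraries often use smoothed or normalized variants, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: weight} dict per document in corpus (a list of token lists)."""
    n_docs = len(corpus)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in corpus for term in set(doc))

    weights = []
    for doc in corpus:
        counts = Counter(doc)
        weights.append({
            # tf = share of the document's tokens; idf = log(N / df)
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, for illustration only.
docs = [
    "solr ranks search results".split(),
    "search ranking with bm25".split(),
    "scalding runs on hadoop".split(),
]
scores = tf_idf(docs)
```

Here "solr" appears in only one document, so it gets a higher weight in the first document than "search", which appears in two. A term that occurs in every document would get a weight of zero under this variant.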

About Scalding

Scalding is a Scala library that makes it easy to work with and reason about data in distributed systems like Hadoop. It presents the data as a collection and lets you perform computations on it in a manner similar to the Scala API, so it appears to the […]

Code Musing

“I am sorry I have had to write you such a long letter, but I did not have time to write you a short one.” Blaise Pascal (1623–1662), French philosopher and mathematician. At the age of 18 he invented one of the first calculating machines. So I wonder why […]

What is new in SOLR 6.x

Solr 6 builds on the innovations of Solr 5. First of all, let’s take a look at what was done in Solr 5. There were improvements to “bin/solr” and “bin/post” that made it easier to start Solr and add new documents, and more APIs were introduced. The user interface was rewritten in […]