Last week
- To evaluate search, we typically build a judgment list. We transform clickstream data into evaluation data. This labels a result...
About a month ago
- Look at this math and grasp at its majesty: P(R) = P(R | BM25) * P(R | Emb) # Prob(Relevance) = lexical * embedding OK what’s so special about that? That’s an AND. A probabilistic way of combining scores so that when BOTH “things happen”, the final result becomes true. Here when...
- Good vector search means more than embeddings. Embeddings don’t know when a result matches or doesn’t match. Similarity floors don’t work...
- I’ve been using the Irish energy provider Energia for 5 years or so (as of writing, 2026) and they used to have a useful insights dashboard that let me analyse my power usage. Well, they seem to have removed it so I built a handy dashboard that anyone can use. It’s at...
- It’s convenient to have a lexical score normalized from 0 to 1. Sadly, BM25 scores tend to be all over the...
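
The product rule quoted in the excerpt above, P(R) = P(R | BM25) * P(R | Emb), can be sketched in a few lines. This is a minimal illustration, not the original post's code: the function name `combine_and` and the sample values are hypothetical, and it assumes both inputs are already probabilities in the 0 to 1 range (raw BM25 scores would need normalizing first).

```python
def combine_and(p_lexical: float, p_embedding: float) -> float:
    """Probabilistic AND: the result is high only when both inputs are high.

    Assumes each argument is already a relevance probability in [0, 1].
    """
    for p in (p_lexical, p_embedding):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"expected a probability in [0, 1], got {p}")
    return p_lexical * p_embedding


# Hypothetical values chosen only to show the AND-like behavior.
both_strong = combine_and(0.9, 0.8)      # both signals agree
lexical_only = combine_and(0.9, 0.1)     # embedding disagrees
neither = combine_and(0.1, 0.1)          # both signals are weak

print(both_strong, lexical_only, neither)
```

Because the scores are multiplied, one weak signal pulls the combined score down no matter how strong the other is, which is what makes it behave like an AND.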
about 1 month ago
- You may know BM25 lets you tune two parameters: k1, how quickly to saturate document term frequency’s contribution; b, how...
- Rare terms have high inverse document frequency (IDF). BM25 scoring treats high-IDF terms as more relevant. Why? We assume...
- In the previous tip, we discussed how pointwise 1-5 labels fall apart. The expert rater gives only nit-picky...
4 months ago
- I have a weird relationship with statistics: on one hand, I try not to look at it too often. Maybe once or twice a year. It’s because analytics is not actionable: what difference does it make if a thousand people saw my article or ten thousand? I mean, sure, you might try to...
- A free introductory search course for anyone who wants better search without all the hard work...
5 months ago
- After the LLM judge hype curve crashes, what will come after?...
6 months ago
- Kicking the tires on an initial, naive agentic search with some thoughts on how it could be improved further...
7 months ago
- Jeff Kaufman shared some data around contra dance attendance as a function of requirements on wearing surgical masks. He compares this data to survey data, which is a useful way to validate in both directions. I found the plot compelling for a different reason – depending on how...
- I recently read You do not need “analytics” for your blog because you are neither a military surveillance unit nor a commodity trading company by Leon Paternoster. It’s a well-argued piece, and I agree with the general thrust… but I also won’t be removing analytics from my site...
8 months ago
- An analysis of DiskANN, a newer graph-based ANN index built for cheaper disk while still retaining high recall and throughput....
- Metrics can be incredibly powerful. But you have too many of them. Let’s talk about how and when to use metrics. The Golden Rule The golden rule of metrics is this: any metric you maintain should directly drive action if outside expected bounds. The reason this is an important...
9 months ago
- Say what you will about Jupyter Notebooks, but I think they are an incredible medium for learning and quick experimentation. I use Jupyter Notebooks all the time for my work and personal use. So, naturally, I was curious when I read that you could use Claude Code with Jupyter...
- After publishing my Analysis of Links From The White House’s “Wire” Website, Tina Nguyen, political correspondent at The Verge, reached out with some questions. Her questions made me realize that the numbers in my analysis weren’t quite correct (I wasn’t de-duplicating links...